Smart Agents

Human-Like Driver Agents in Simulated Urban Environments Based on Situational Understanding and Dynamic Decision-Making

submitted by

Teresa Rock, M. Sc.

to Faculty V – Mechanical Engineering and Transport Systems (Fakultät V – Verkehrs- und Maschinensysteme)

of Technische Universität Berlin

for the award of the academic degree

Doktorin der Ingenieurwissenschaften

- Dr.-Ing. -

approved dissertation

Doctoral committee:

Chair: Prof. Dr. Christina Völlmecke

Reviewer: Prof. Dr. Stefanie Marker

Reviewer: Prof. Dr. Nele Rußwinkel

Date of the scientific defense: 11 April 2024

Berlin 2024

Zusammenfassung

The need for intelligent driver models capable of handling complex traffic situations in a way that resembles human behavior arises from various application areas. In the context of autonomous driving, driver models are used to replace the human driver and to offer safe and flexible mobility solutions. Autonomous vehicles that behave in a human-like manner are assumed to make traffic interactions safer and to be more readily accepted by users [1, 2]. Driver models are also needed in driving simulation to generate the surrounding traffic, which should match real-world traffic behavior as closely as possible. Driving simulation has become a central and indispensable tool for research and development in the transportation sector. In addition, global trends such as globalization, sustainability, and the rising demand for mobility increase the need for research, particularly in the urban context [3, 4]. Understanding and modeling human driving behavior in urban environments can therefore be regarded as a research field of growing importance for future mobility.

This growing importance, however, contrasts with a currently incomplete state of research, since most work either concentrates on simple highway scenarios or presents approaches for solving isolated scenarios. Accordingly, current approaches are not suited to the diversity and complexity of the situations that occur in urban traffic. The aim of this thesis is therefore to develop transferable and practicable methods for modeling human-like driving behavior in urban environments.

To address this scientific gap, a two-stage approach is pursued: first, a detailed analysis of the interdisciplinary topic is carried out to identify the core problems of modern solutions, which are then treated with novel methods in the second part. In line with the interdisciplinary nature of the topic, a wide range of research fields is examined, including psychology, robotics, driving simulation, and autonomous driving. On this basis, clear model requirements are identified and the following four core challenges are derived, which are addressed with innovative methods and critically discussed in the methods part: representation of environment information to form a situational understanding in models, development and investigation of generalizable prediction models, dynamic decision-making for situation-adaptive behavior, and meaningful evaluation strategies for assessing the human likeness of model behavior. A comprehensive discussion of the results and limitations and an outlook on further research topics conclude the thesis.

Abstract

The demand for intelligent driver models capable of handling complex traffic situations in a way that resembles human behavior arises from various application areas. In the context of Autonomous Driving, driver models replace human drivers, aiming to provide safe and flexible mobility solutions. It is believed that autonomous vehicles exhibiting human-like behavior have the potential to enhance the safety of traffic interactions and are better accepted by users [1, 2]. In the field of Driving Simulation, driver models are required to generate surrounding traffic within the Virtual Environment to provide a realistic replication of real-world traffic scenarios. Driving Simulation has become a central and indispensable tool for research and development in the transportation sector. Moreover, global trends such as globalization, sustainability, and increased demand for mobility contribute to a growing need for research, especially in the context of urban traffic scenarios [3, 4]. Therefore, modeling and understanding human driving behavior in urban environments is becoming increasingly important for the future of mobility.

Meanwhile, current research is incomplete, as most publications either focus on simpler highway traffic or propose approaches to solve isolated scenarios or parts of the driving task. As a result, current solutions are not suitable for the diversity and complexity of urban traffic. Therefore, the objective of this thesis is to develop transferable, practicable, and reliable methods for modeling human-like driving behavior in urban environments.

To address this scientific gap, a twofold approach is taken. First, a detailed analysis of the topic in its interdisciplinary nature is conducted to identify the fundamental problems of modern solutions, which are subsequently addressed with novel methods in the second part of the thesis.

To this end, the topic is explored from the perspective of various research areas, including psychology, robotics, Driving Simulation, and Autonomous Driving. Based on this multidimensional analysis, key challenges in state-of-the-art solutions and clear requirements for modeling human-like driving behavior are determined. Four key challenges are identified that currently prevent successful modeling of human-like driving behavior in urban traffic: representation of complex traffic situations to enable situational understanding, creation and evaluation of generalizable prediction models to anticipate future scene developments, dynamic decision-making to enable situational behavior adaptation, and meaningful evaluation strategies capable of assessing human-like model behavior. Novel methods are presented to address these four main challenges, and the results are critically discussed. A comprehensive discussion of the results and limitations, together with an outlook on further research, concludes the thesis.

Acknowledgements

I would like to thank my supervisor at the Technical University of Berlin, Prof. Marker, for her consistent support and guidance throughout this project. I also want to thank my supervisors at BMW, Thomas Bleher and Dr. Mohammad Bahram, for supporting and inspiring me to think outside the box, thus enabling me to generate novelty in this densely covered research area. Besides my supervisors, I would like to thank BMW and my incredible team there for their collaborative efforts and personal support. The collaboration with the other PhD students in my team (Chantal H., Maurice K., and Robert J.) especially enriched my research time. Finally, a special thanks to my family and friends for their endless emotional support, without which I would not have been able to complete this research. In particular, the constant support in the form of coworking, feedback, and encouragement from my friends Christoph B., Claudia B., Judith P., Cosima V., and Deniz A. got me through this time.

Table of Contents

Title Page

Zusammenfassung

Abstract

List of Figures

List of Tables

Abbreviations

1 Introduction

1.1 Motivation

1.2 Objectives and Focus of the Thesis

1.3 Structure of the Thesis

PART I: Identification of the Scientific Gap and Related Research Questions

2 State-Of-The-Art

2.1 The Interdisciplinary Nature of Driver Models

2.2 Driver Models in Driving Simulation

2.2.1 Modeling Traffic in Driving Simulation

2.2.2 Categories of Traffic Simulation

2.2.3 Agent-Based Modeling

2.2.4 Agent Models in Driving Simulation

2.2.5 Modeling Vehicle Driver Behavior by Agent Models in Driving Simulation

2.2.6 The Driver Agent at BMW: TRM

2.3 Driver Models for Automated Vehicles

2.4 Further Fields of Research

2.4.1 Psychology

2.4.2 Cognitive Modeling

2.4.3 Robotics

2.4.4 Imitating and Replicating Human Behavior

2.5 Summary

3 Requirements for Modeling Human-Like Driving Behavior

3.1 Characteristics and Main Challenges of Urban Traffic

3.2 Perspectives on Defining and Quantifying Human-Like Driving Behavior

3.2.1 Objective Approaches to Quantify Human Likeness

3.2.2 Subjective Approaches to Quantify Human Likeness

3.2.3 Potentials and Limitations of Objective and Subjective Quantification Methods

3.3 Requirements From a Driver-In-The-Loop Perspective

3.3.1 What It Takes to Be Perceived as Real - The Human Driver in the Simulator as a Mirror

3.3.2 What It Takes to Cope With Urban Traffic - The Human Driver as a Role Model

3.3.3 Use Cases and Technical Requirements Originating From Driving Simulation

3.3.4 Summary of Requirements for Human-Like Driver Agents in Urban Driving Simulation

3.4 Methodological Gaps in State-Of-The-Art Solutions

3.4.1 Insufficient Representation of Context Information

3.4.2 Missing Anticipation in Agent Models

3.4.3 Missing Dynamics in Decision-Making of Driver Agents

3.4.4 Insufficient Evaluation Strategies and Misleading Metrics

PART II: Methods to Address the Identified Research Gaps

4 Representation of Complex Traffic Situations

4.1 Introduction and Motivation

4.2 State-Of-The-Art: Environment Representation

4.2.1 Static Environment Representation

4.2.2 OpenDRIVE Maps

4.2.3 Dynamic Environment Representation

4.3 Method: Dynamic Scene Representation Obtained by Prior Knowledge and Interpretation

4.3.1 The General Scene Description

4.3.2 The Data Processing Concept

4.4 Implementation

4.4.1 Implementation Step 1: Data Fusion

4.4.2 Implementation Step 2: Plausibility Check

4.4.3 Implementation Step 3: Interaction Identification

4.5 Results

4.6 Limitations and Future Work

4.7 Conclusion

5 Anticipating the Intention of Other Road Users

5.1 Introduction and Motivation

5.2 State-Of-The-Art: Anticipation of Vehicle’s Intentions

5.2.1 Prediction Methods

5.2.2 Behavioral Discretization

5.2.3 Influence Factors and Model Structures

5.2.4 Model Evaluation Strategies

5.3 Prediction Concept, Problem Formulation, and Data Preparation

5.3.1 General Concept of Intention Prediction

5.3.2 Problem Formulation and Model Architecture

5.3.3 Data Preprocessing

5.4 Method 1: Measuring the Effectiveness of the Scene Representation by Prediction Performance

5.4.1 Motivation

5.4.2 Concept

5.4.3 Implementation Details

5.4.4 Results

5.4.5 Limitations and Future Work

5.4.6 Conclusion

5.5 Method 2: Reliable Trajectory Prediction in Complex Urban Traffic

5.5.1 Motivation

5.5.2 Concept

5.5.3 Implementation Details

5.5.4 Results

5.5.4.1 Influence of Homogeneity in Training Data and Coverage of Scene Information

5.5.4.2 Influence of Individual Feature Categories

5.5.4.3 Impact of Tuning Parameters

5.5.4.4 Measuring Generalizability and Plausibility

5.5.4.5 Qualitative Prediction Results of Different Model Settings

5.5.5 Limitations and Future Work

5.5.6 Conclusion

5.6 Summary on Anticipation Methods

6 Dynamic Decision-Making

6.1 Introduction and Motivation

6.2 State-Of-The-Art: Decision-Making

6.2.1 Decision-Making Strategies for Agent Models in Simulation

6.2.2 Trajectory Planning Approaches for Automated Vehicles

6.2.2.1 Search-Based Methods

6.2.2.2 Sampling-Based Methods

6.2.2.3 Optimization-Based Methods

6.2.2.4 Decomposition Strategies

6.2.3 Potentials and Weaknesses of Heuristic Decision-Making and Trajectory Planning

6.3 Method: Dynamic Decision-Making for Agent Models

6.3.1 Concept for Dynamic Decision-Making

6.3.2 The Planning Framework

6.3.2.1 Planner Variant: PL_PVD

6.3.2.2 Planner Variant: PL_3D

6.3.2.3 Trajectory Smoothing

6.3.3 Parametrization

6.3.4 Evaluating Strategy

6.3.5 Implementation Details

6.4 Results

6.4.1 Potential of Trajectory Planning Versus Heuristic Approaches for Decision-Making

6.4.2 Sensitivity Towards Different Parameter Sets and Scenario Variations

6.5 Limitations and Future Work

6.6 Conclusion

7 Evaluation Methods

7.1 Introduction and Motivation

7.2 State-Of-The-Art: Evaluation Approaches for Human-Like Driver Models

7.3 Method 1: Objectively Evaluating the Human Likeness by Introducing a Plausibility Metric

7.3.1 Concept

7.3.1.1 Parameter Specification

7.3.1.2 Identification of Context-Based Similarity of Situations

7.3.1.3 Preparing the Database

7.3.1.4 Quality Function Formulation

7.3.1.5 Strategy for Validating the Method

7.3.2 Implementation

7.3.2.1 Used Databases and the Driver Model

7.3.2.2 Metric Formulation and Thresholds

7.3.2.3 Survey for Validation

7.3.3 Results

7.3.3.1 Objective Metric Results Versus Subjective Human Ratings

7.3.3.2 The Human Likeness of Investigated Datasets

7.3.3.3 Case Study 1: Exemplary Application of the Method for Investigating the Heuristic Model at BMW (TRM)

7.3.3.4 Case Study 2: Applying the Evaluation Metric to the Proposed Trajectory Planners

7.3.4 Limitations and Future Work

7.3.5 Conclusion

7.4 Method 2: Quantifying Realistic Behavior of Traffic Agents in Urban Driving Simulation Based on Questionnaires

7.4.1 Concept

7.4.2 Material and Setting of the Simulator Experiment

7.4.2.1 Study Design

7.4.2.2 Questionnaires

7.4.2.3 Simulator Setting

7.4.2.4 Participants

7.4.3 Results

7.4.3.1 The Effect of Surrounding Traffic on Perceived Realism

7.4.3.2 Realism of Traffic Agent Behavior

7.4.4 Limitations and Future Work

7.4.5 Conclusion

7.5 Summary on Evaluation Methods

8 Summary, Discussion and Outlook

8.1 Summary of Results

8.2 Limitations and Discussion

8.3 Final Conclusion

8.4 Summarized Outlook

References

Appendix A Appendix

A.1 Additional Material for the Context Representation Method

A.2 Additional Material for the Dynamic Decision-making Method

A.2.1 Scenario A_VEH Solved by the Two Planners and the Heuristic Agent Model

A.2.2 Scenario B_PED Solved by the Two Planners and the Heuristic Agent Model

A.2.3 Scenario C_STAT Solved by the Two Planners and the Heuristic Agent Model

A.2.4 Scenario D_BIC Solved by the Two Planners and the Heuristic Agent Model

A.2.5 Scenario D1 Solved by the PL_3D Planner

A.3 Additional Material for the Simulator Experiment

List of Figures

1.1 Structure of the thesis following a two-fold approach.

2.1 Proprietary-designed generic driver model that enables the categorization and comparison of state-of-the-art solutions for different subsets describing the driving task.

2.2 Structure of the driver agent TRM developed by BMW.

2.3 Overview of the interrelationships AI - ML - DL [70].

3.1 Evaluation of model performance based on the similarity of a predicted trajectory compared to the individual human-driven trajectory.

3.2 Key scenarios representing the main challenges of urban traffic for driver agents for simulation. Ego vehicle in yellow experiencing the different situations.

3.3 Overview of identified components required for modeling human-like driving behavior in urban traffic.

4.1 OpenDRIVE road network description including reference line, lane objects and lane features [149].

4.2 OpenDRIVE Lanes: types, neighbor relations, and driving direction by lane ID [150].

4.3 Examples of lanes and lane geometry representations in Spider.

4.4 Illustrating the idea of scenario mapping by the general scene description. The ego vehicle (ID 1, yellow) interacts with different interaction partners once in a typical turning maneuver and once in an exceptional situation resulting from an occupied lane.

4.5 Example of undefined areas within intersections. Recording location Heckstrasse, inD dataset [155].

4.6 Example of ambiguous (left) and logically incorrect (right) lane assignments.

4.7 Real world examples causing implausible lane assignments.

4.8 The concept for identifying potential interactions with VRUs. The yellow vehicle represents the ego vehicle facing a scene with two pedestrians and a cyclist. Identification based on the relevant ego lanes (red), the motion polygons (blue), and the Euclidean distance e.

4.9 Illustration of different interactive situations of the inD dataset at the four different locations. The ego vehicle is always drawn in red. The solid lines show the path driven so far, the dotted lines represent the future path, and the circles mark the current position.

5.1 Structure of the base NN used for the two anticipation methods to investigate the ability to generate generalizable predictions.

5.2 Concept for measuring the effectiveness of the proposed scene representation by training the NN with different feature compositions.

5.3 Quantitative results in unseen situations (Level 1). Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and predictions of the model trained on E in blue. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.

5.4 Quantitative results in unseen situations on unknown intersection (Level 2). Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and predictions of the model trained on E in blue. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.

5.5 The evaluation concept involving different test levels and metrics to quantify generalizability.

5.6 Illustration of the data concept including three levels of homogeneity of training data (T1 - T3) on the left-hand side and different test data (L2 - L3) on the right-hand side. Real traffic examples are provided by the inD dataset [155], while synthetic driving data is obtained by simulation. As an additional test case, an exceptional scenario is designed using the simulation framework and a human driver (L3).

5.7 Influence of variability in training data on model performance. Left: accuracy measured by ADE and FDE on different test levels with best feature setting of training data category. Right: scores for accuracy, plausibility, and overall for different training data.

5.8 Influence of different feature settings on model accuracy considering homogeneity level T3. Left: Effect measured on test level L2a. Right: Effect measured on test level L3.

5.9 Influence of interaction features on temporal plausibility (left) and influence of map features on spatial plausibility (right).

5.10 Quantitative results on unseen intersections (isec 4 - L2b top, Heckstrasse - L2a bottom). Ego GT trajectory marked in orange. Predictions of models trained on EMPI features with different training datasets (T1, T2, T3). GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.

5.11 Qualitative evaluation of predictions from different feature settings for models trained on T3 performing on unseen intersections (first row: isec 4 - L2b, second row: Heckstrasse - L2a). Ego GT trajectory marked in orange. Predictions of models trained on T3 with different feature settings: EMPI, EMI, and E. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.

6.1 The hierarchical planning framework realized as a two-layer concept for first planning a rough discretized first-guess trajectory (behavioral layer), which is subsequently smoothed into a dynamically feasible trajectory by the motion planning layer.

6.2 Illustration of the velocity profile generation using a hybrid A* algorithm applied in the s/t space adopted from [66].

6.3 The APF formed by the position of the obstacle and velocity and position of the ego vehicle [213].

6.4 Example visualization of the APF for a dynamic (right) and static (left) obstacle. Ego vehicle: green bounding box; other traffic participants: blue.

6.5 Circle method for collision free motion planning inspired by [66].

6.6 The four key scenarios for evaluation.

6.7 Behavior of the heuristic agent model and the two planners in scenario A_VEH at the same time step. Ego vehicle in green.

6.8 Behavior of the heuristic agent model and the two planners in scenario C_STAT at the same time step. Ego vehicle in green.

6.9 PL_3D driving far into oncoming lane in scenarios C_STAT and D_BIC. Ego vehicle in green.

6.10 Behavior of the heuristic agent model and the two planners in scenario D_BIC at the same time step. Ego vehicle in green.

6.11 Profiles showing velocity, acceleration and jerk of the PL_PVD planner in Scenario D_BIC for the final trajectory.

6.12 Behavior of the PL_3D planner in Variant D2 best quality parameter set (top) versus best runtime parameter set (bottom).

6.13 Velocity, acceleration and jerk of the PL_3D planner in Variant D2 best quality parameter set (left) versus best runtime parameter set (right).

7.1 The general idea of evaluating model behavior within situational context.

7.2 Illustration of the concept toolchain including metric development and validation strategy of the method.

7.3 Illustration of the calculation concepts for interaction-related parameters.

7.4 Two exemplary situations at the Bendplatz location (recording 12) were identified as similar based on the contextual parameters in Table 7.2. The ego vehicle is marked in red, the cyclist in blue and the other vehicle in green. The trajectory already traveled is marked as a solid line.

7.5 Locations for data gathering - Left: synthetic intersections for creating artificial driving behavior. Right: Locations from inD Data [155].

7.6 Exemplary screenshots of videos shown to participants for rating human likeness of a subject vehicle marked in red.

7.7 Subjective human likeness rating obtained from participants (y-axis) associated with objective human likeness scores obtained by the proposed metric (blue value above). Vehicles are sorted by the average rating assigned by participants, in descending order from left to right.

7.8 Relationship between the fine-tuned objective human-like driving behavior scores provided by the proposed methodology and subjective ratings of participants during the survey.

7.9 Results for human-like scores for real and synthetic datasets: with tuned thresholds and weights (left) and initial setting (right).

7.10 Analysis to identify which parameters mostly fail when comparing artificial and human behavior.

7.11 Exemplary trajectory of two vehicles showing significantly high jerk values when approaching the intersection.

7.12 Distribution of lonVelocity for the two planners and the TRM model outside and in the intersection area.

7.13 Exemplary illustration of the mismatch of driven trajectories along the defined intersection lanes. Trajectory in red and lane polygon in cyan.

7.14 Traffic scenarios experienced by participants during the simulator experiment. The ego vehicle is marked with a red triangle.

7.15 The driving simulator used in the present experiment.

7.16 The influence of surrounding traffic on the perceived realism and the naturalism of driving style, whereby 7 represents a high sense of presence (*: p < 0.05, **: p < 0.01).

7.17 Assessment of the used agent models regarding key requirements for realistic behavior, whereby 7 represents a positive rating of, for example, high individuality.

A.1 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario A_VEH.

A.2 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario A_VEH.

A.3 Behavior of the heuristic agent model and the two planners in scenario B_PED at the same time step. Ego vehicle in green.

A.4 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario B_PED.

A.5 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario B_PED.

A.6 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario C_STAT.

A.7 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario C_STAT.

A.8 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario D_BIC.

A.9 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario D_BIC.

A.10 Behavior of the PL_3D planner employing different parameter sets in variation D1 of scenario D_BIC.

A.11 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the PL_3D planner employing best quality and best runtime parameter sets.

A.12 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the PL_3D planner employing local and global parameter sets.

List of Tables

2.1 Summary of the most frequently used input information for driver agents in simulation.

2.2 Summary of the most commonly used input information for autonomous vehicles that is used for prediction or planning purposes.

3.1 Potentials and limitations of different perspectives to quantify human-like driving behavior.

4.1 Feature information defined according to the general scene description.

4.2 Semantic features describing scenario a and b shown in Figures 4.9a and 4.9b.

5.1 Summary and categorization of the state-of-the-art approaches to vehicle motion prediction.

5.2 Summary of state-of-the-art metrics for trajectory prediction.

5.3 Results for measuring the effectiveness of the proposed scene representation.

5.4 Tuning parameter sets for investigating the effect of hyperparameter tuning [195].

5.5 Results for all model variants across different training data and different feature settings at all test levels. The first five columns provide results summarized across all test levels. The remaining columns show the results for each individual test level. The best results within the training dataset (T1, T2, T3) are highlighted in bold, the best results per column, i.e. per test level, are highlighted in bold and underlined.

5.6 Analysis of spatial and temporal model performance measured in [%]. Best result for SO_temp and SO_spa per homogeneity level (T1, T2, T3) highlighted in bold, best overall results highlighted in bold and underlined.

5.7 Results for different learning parameters on all test levels according to Table 5.4. Best result per column highlighted in bold.

6.1 The ability of the two planners and the heuristic agent model to solve the defined key scenarios from a functional perspective.

6.2 Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent models (TRM) in scenario A_VEH and B_PED.

6.3 Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent models (TRM) in scenario C_STAT and D_BIC.

6.4 Variations D1 and D2 for investigating changes in parametrization and scenario for the cyclist scenario D_BIC using the PL_3D planner.

7.1 Overview of parameters for describing human likeness and plausibility of behavior.

7.2 Overview of context parameters to distinguish scenarios, sorted by priority order (P), to extract similar traffic situations from the comparison database.

7.3 Thresholds for measuring human likeness for different parameters extracted from the real traffic dataset. Distributions of velocities, accelerations, and jerk are compared using KS statistics. Percent ratios are assigned for maximum values and raw thresholds are assigned for context-free parameters. Parameters marked with * are calculated context-free.

7.4 Spearman correlation between the parameters and average human-like driving behavior score from the survey, with correlations converted into weights.

7.5 Analysis of the value distribution for dynamic parameters for model behavior and real humans on a macroscopic level (left) and scenario-specific under situational conditions (right).

7.6 Review of all human likeness parameters that constitute the quality function calculated for the two trajectory planners PL_3D, PL_PVD, and the heuristic model TRM. *No TET value could be calculated for the scenario since no TTC below the critical threshold of 6 s was detected.

7.7 Distribution of the meta data of the participants.

A.1 All features describing a driving scene at time t from the perspective of an individual ego vehicle on the basis of features from the inD dataset [155].

A.2 Questionnaire 1: Investigating the sense of presence. Presence (PRE) items according to M. Slater and A. Steed [239] and one additional item investigating the degree of naturalism of the driving style.

A.3 Questionnaire 2: Evaluating the degree of realism of the traffic agents.

Abbreviations

ABM Agent-Based Modelling

ACC Adaptive Cruise Control

ACT-R Adaptive Control of Thought-Rational

AD Autonomous Driving

ADE Average Displacement Error

AI Artificial Intelligence

APF Artificial Potential Field

ASAM Association for Standardization of Automation and Measuring Systems

AV Automated Vehicle

CNN Convolutional Neural Network

DAC Driveable Area Compliance

DAG Directed Acyclic Graph

DDM Dynamic Decision-Making

DiL Driver-in-the-Loop

DL Deep Learning

DOG Dynamic Occupancy Grid

DS Driving Simulation

FDE Final Displacement Error

FOT Field Operational Test

GA Genetic Algorithm

GAIL Generative Adversarial Imitation Learning

GAN Generative Adversarial Network

GCN Graph Convolutional Network

GNN Graph Neural Network

GT Ground Truth

HD High-Definition

HMI Human-Machine-Interface

HOR Hard Off-road Rate

IDM Intelligent Driver Model

IP Interior Point

IPQ Igroup Presence Questionnaire

KS Kolmogorov-Smirnov

LIDAR Light Detection and Ranging

LSTM Long Short-Term Memory

MAE Mean Absolute Error

MAS Multi-Agent-System

ML Machine Learning

MLP Multi-layer Perceptron

MOBIL Minimizing Overall Braking Induced by Lane Changes

MPC Model Predictive Control

MR Miss Rate

MSE Mean Squared Error

NDS Navigation Data Standard

NN Neural Network

ODM OpenDRIVE Map

OSM OpenStreetMap

PET Post Encroachment Time

PID Proportional-Integral-Derivative Controller

PIP Potential Interaction Partners

POC Proof of concept

PVD Path-Velocity-Decomposition

QP Quadratic Programming

RL Reinforcement Learning

RMSE Root Mean Squared Error

RNN Recurrent Neural Network

ROW right-of-way

RP Reference Point

RRT Rapidly-exploring Random Trees

SA Situation Awareness

SiL Software-in-the-Loop

SOR Soft Off-road Rate

SQP Sequential Quadratic Programming

SUS Slater-Usoh-Steed

TCC Temporal Correlation Coefficient

TET Time Exposed Time-to-Collision

TN Transformer Network

TTC Time to Collision

TTO Time-to-overlap

VAE Variational Autoencoder

VE Virtual Environment

VRU Vulnerable Road User

XML Extensible Markup Language

1 Introduction

1.1 Motivation

Global developments show the increasing significance of and demand for mobility [5]. Urban research is particularly important due to factors like urbanization and economic growth, which introduce new complexities to the investigation and advancement of vehicle and mobility concepts [4]. A key challenge in this context is comprehending, replicating, and modeling human driving behavior. This challenge engages a diverse array of research disciplines, including psychology, computer science, and engineering, as models are employed for various applications, such as AD (Autonomous Driving) or DS (Driving Simulation).

Within the realm of future mobility concepts, AD is a major area of study, striving to comprehend and model driving behavior in order to take on the responsibility of the human driver [3, 4, 6]. The significance of mimicking human-like driving behavior for an AV (Automated Vehicle) is justified by the expectation that it improves interactions between road users and enhances the comfort and acceptance of customers [1, 2].

DS represents the complementary research domain, providing an indispensable tool in present-day development for investigating various aspects under safe and reproducible conditions. Simulation allows testing components in early development stages and conducting human-centered investigations, focusing, for instance, on driver perception, distraction, or behavior in critical situations [7]. DS can be categorized into DiL (Driver-in-the-Loop) and SiL (Software-in-the-Loop) applications, investigating the ability of models and functions to interact with the driver or the environment. In both loops, creating a realistic replica of the real world within the VE (Virtual Environment) is crucial to establish valid test conditions [8]. Surrounding traffic, as a crucial part of the VE, is typically simulated in DS using agent models, which are expected to behave as closely as possible to real road users.

Meanwhile, global trends in globalization and mobility accentuate the importance of urban scenarios in mobility-related research [5]. Urban traffic is characterized by intricate interactions among multiple road users and adherence to complex traffic regulations [3], necessitating awareness and perception of the situation as well as appropriate adaptation of behavior [9].

Especially in urban scenarios, the advantages of DS exceed those of real driving tests with regard to safety and reproducibility. Due to the increased complexity of urban environments, new challenges arise for the development of driver models. Given the complexity of human behavior and the characteristics of urban traffic, driver models that are able to cope with the diversity of urban traffic in a human-like manner constitute a compelling scientific domain with as yet unresolved facets. Current solutions fall short for one of two reasons: either the subject is not considered holistically, so that only isolated situations are addressed, or the approaches originate from the highway context and are not adequately transferable to the prerequisites of urban traffic. Furthermore, the subject is highly interdisciplinary, combining research areas such as psychology, robotics, and computer science. Existing research does not sufficiently combine the findings and methods from these areas to meet the complexity of the topic. In particular, the ability to exhibit human-like driving behavior in various interactive traffic scenarios requires a high level of dynamics and intelligence within the model. Therefore, the present thesis addresses the challenge of modeling human-like driving behavior for urban traffic with a special focus on DiL applications in DS.

1.2 Objectives and Focus of the Thesis

The primary aim of this thesis is to address the challenge of accurately modeling human-like driving behavior within urban environments. Given the absence of a universally accepted definition of what constitutes human-like behavior in this context, this thesis strives to establish a more precise and pragmatic definition of human likeness, accompanied by the essential requirements for driver models, while maintaining the interdisciplinary nature of the topic.

Furthermore, a notable disparity becomes evident when the substantial volume of research publications related to driver models is compared with the limited number of practical implementations on real roads. Hence, the principal objective of this thesis is to pinpoint the methodological shortcomings and root causes hindering the progress of driver model development for urban traffic and to develop innovative and transferable approaches addressing the identified methodological gaps.

As the realm of human driving behavior modeling encompasses a wide spectrum of possibilities, this thesis strategically narrows its research focus for clarity and precision. To begin with, all methods and analyses within this thesis are conducted exclusively within the context of DS with a special focus on DiL applications. Nevertheless, it is essential to note that the methodologies developed herein are intended to have broader applications in related research domains, notably in the realm of AD. This is due to the inherent interconnectedness of the research questions and challenges of DS and AD.

This thesis places a distinct focus on urban traffic scenarios. Despite several studies addressing the so-called highway problem, urban traffic remains relatively underexplored in scientific research. This is primarily attributed to the complexity of urban traffic, which creates a significant gap in multiple dimensions encompassing modeling and evaluation purposes. Specifically, the thesis focuses on interactive situations, particularly those involving shared space conflicts introduced by ROW (right-of-way) regulations.

Since this research focuses on DS, the input interface to models is provided at the object level. Therefore, the processing of raw sensory inputs, such as images or point clouds, is not considered. Consistent with typical implementations, the output of the models is specified as the spatio-temporal motion of the vehicle, that is, the trajectory it follows.

1.3 Structure of the Thesis

The present thesis follows a bipartite approach, as illustrated in Figure 1.1, to address the multifaceted and interdisciplinary topic of modeling human-like driving behavior in a holistic and differentiated manner.

In the initial part, a comprehensive state-of-the-art analysis offers an overview of key approaches related to the modeling of surrounding traffic within DS and driver models in the context of AVs. Furthermore, this part underscores the interdisciplinary nature of driver models, revealing the diverse research domains involved in the quest for human-like driver models. Building upon these insights and considering the practical application of DS, essential requirements for human-like driver models are derived. This forms the basis for identifying persistent challenges in modeling human-like driving behavior, resulting in the formulation of four key research questions that are subsequently addressed in the methodological part of this thesis.

The ensuing part comprises four chapters, each dedicated to tackling one of the aforementioned

research questions. These chapters encompass a comprehensive review of the state-of-the-art,

the presentation of innovative methodologies, and a detailed exploration of results, limitations,

and conclusions.

In the final summary, the insights derived from this thesis are consolidated in alignment with the motivations and objectives previously outlined. Limitations are critically discussed, and recommendations for future research efforts are proposed.

INTRODUCTION

PART I: Identification of Main Challenges
• State-Of-The-Art Analysis
• Requirements on Human-Like Driver Models
• Methodological Gaps in State-Of-The-Art Solutions

PART II: Methodologies and Concepts
• Representation of Complex Situations
• Anticipating the Intention of Other Road Users
• Dynamic Decision-Making
• Meaningful Evaluation Methods

SUMMARY

Figure 1.1: Structure of the thesis following a two-fold approach.

2 State-Of-The-Art

The subject of human-like driver models is a multifaceted domain that encompasses various research areas such as computer science, engineering, and psychology, as well as several applications such as DS, AD, and robotics. Therefore, the following chapter provides a comprehensive outline of the topic of human-like driver models. This review underscores the interdisciplinary nature of the topic and serves as the basis for identifying scientific gaps and formulating the research questions explored in this thesis.

2.1 The Interdisciplinary Nature of Driver Models

Modeling driver behavior is a broad field of research encompassing various methods and perspectives. Significant parallels exist between the modeling of intelligent and autonomous agents for DS and the advancement of AD for real-world transportation applications. In both contexts, the primary objective is to achieve the independent navigation of a predetermined route while adeptly managing the intricate dynamics of traffic in a safe and efficient manner. The pursuit of emulating human problem-solving abilities also intersects with the field of robotics, as it involves the artificial replication of human-like task performance. The indispensable understanding of human behavior is at the core of modeling and replicating the human facet of driving, which is addressed by psychology and cognitive science. While psychological approaches aim to explain human driving behavior through models and experiments, cognitive modeling aims to make the cognitive processes involved in driving tangible by bridging the gap between psychology and computer science. Therefore, the following sections present state-of-the-art approaches related to driver models in simulation frameworks and AVs and provide insights into methods originating from psychology, cognitive modeling, and robotics.

However, most published research only provides solutions for one subpart of the driving task, such as finding a path, decision-making, or controlling motion. To handle this, a generic and modular driver model is defined that allows the comparison and categorization of the relevant approaches. This generic model is inspired by the fundamental model of Donges [10] and the layer model proposed by Michon and Keskinen [11]. Figure 2.1 illustrates the model, which follows a matrix structure, aiming to cluster all parts of the driving task into sub-modules with their respective interdependencies and to visualize the influences of driver-internal components, such as knowledge, experience, or personality. From the perspective of human decision-making, most state-of-the-art approaches employ a structure that follows the sense-plan-act chain [12, 13, 14, 15, 16, 17, 18]. Meanwhile, psychological approaches introduce various layer models inspired by human information processing and action-taking, distinguishing between a strategic, tactical, and control level [10, 11]. Since both perspectives provide valid categorizations, the generic driver model proposed here combines both in a matrix structure. In addition, the sense-plan-act structure is extended by an interpretation layer, whose task is to link different sources of information and anticipate future developments.

[Figure 2.1: matrix diagram. Columns: sense, interpret, plan, act. Rows: strategic level, tactical level, control level. Driver-internal components shown alongside: internal state, personality, physiological and psychological state, experience, knowledge. Labels within the matrix: map and target point; identify drivable areas; set route/path; detect dynamic environment; associate; anticipate; choose maneuver from set by heuristic criteria (heuristic); plan action, maneuver, or rough motion (optimization-based); plan motion; maneuver performance; trajectory following; explicit communication (indicator) and implicit communication (maneuver).]

Figure 2.1: Self-designed generic driver model that enables the categorization and comparison of state-of-the-art solutions for different subsets describing the driving task.

2.2 Driver Models in Driving Simulation

The following section provides an overview of the main concepts, important definitions, and the current state-of-the-art driver models used in DS.

2.2.1 Modeling Traffic in Driving Simulation

Integrating surrounding traffic into DS can be achieved through two distinct approaches: replication or simulation. One way to infuse realistic traffic into the simulation involves replicating scenarios using real traffic data. This approach entails replaying human driving patterns within the simulation. However, in this replication process, the responsiveness of other traffic participants to changing conditions is eliminated. Hence, the predominant strategies for incorporating surrounding traffic into the VE revolve mainly around modeling and simulating other road users.

2.2.2 Categories of Traffic Simulation

Traffic simulation can be classified into three primary categories: macroscopic, mesoscopic, and microscopic simulation [19, 20, 17]. These categories vary in terms of their level of detail and the specific phenomena that are observed. Macroscopic traffic simulation concerns the broader traffic situation rather than individual road users. For instance, it examines the emergence of traffic congestion and how alterations in road infrastructure impact traffic flow [21, 22, 23, 20]. Conversely, microscopic simulation focuses on modeling the interactions among road users and their individual behavior [19, 20]. Mesoscopic simulation serves as an intermediary level, bridging the gap between microscopic and macroscopic approaches to optimize computational resources [17]. Furthermore, even more refined simulation levels exist, such as nanoscopic or sub-microscopic simulations, which delve into the individual spatio-temporal behaviors of road users [24]. Typical use cases for DiL applications require microscopic simulation to assess the precise interactions unfolding between distinct road users and the ego vehicle. Microscopic simulation tools are particularly apt for urban settings because they enable vehicles to respond individually to the intricacies of road characteristics and traffic dynamics [25].

2.2.3 Agent-Based Modeling

Real traffic constitutes a dynamic system comprising autonomous individual entities, here humans, that make independent decisions and mutually influence each other. To effectively model such intricate systems, ABM (Agent-Based Modelling) is employed, enabling the representation of the multifaceted patterns that emerge through interactive dynamics. This approach facilitates the replication of interactive phenomena and the emergence of unforeseen behaviors [26]. ABM finds typical applications in diverse fields such as traffic control, ecological relationships, epidemic modeling, market analysis, and organizational decision-making [27]. For DS, each traffic participant is therefore modeled by an individual agent model. In traffic simulation, the scope of agents extends beyond road users to encompass entities like street agents, intersection agents, and traffic light agents [28]. In general, an agent model is characterized by the following facets [27]:

• acts autonomously and independently and takes its own decisions

• follows personal rules and has a personal and explicit goal

• interacts with its environment
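
The three facets listed above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not a model taken from the cited literature: the class VehicleAgent, its fields, the single-lane world, and the crude stop-when-too-close rule are all hypothetical and chosen only to make the facets visible.

```python
from dataclasses import dataclass, replace


@dataclass
class VehicleAgent:
    """Hypothetical minimal traffic agent on a single straight lane."""
    position: float          # m, longitudinal position along the lane
    velocity: float          # m/s
    goal_position: float     # m, explicit personal goal
    desired_velocity: float  # m/s, personal rule parameter
    min_gap: float = 10.0    # m, personal rule parameter

    def decide(self, others):
        """Autonomous decision based on the agent's own rules."""
        # Distances to all agents that are ahead on the lane.
        gaps = [o.position - self.position for o in others
                if o.position > self.position]
        if gaps and min(gaps) < self.min_gap:
            return 0.0  # too close to the leader: stop (crude rule)
        return self.desired_velocity

    def step(self, others, dt):
        """Sense the environment, decide, and act for one time step."""
        if self.position >= self.goal_position:
            self.velocity = 0.0  # goal reached
            return
        self.velocity = self.decide(others)
        self.position += self.velocity * dt


def simulate(agents, dt=1.0, steps=10):
    """Advance all agents; each reacts to a snapshot of the others."""
    for _ in range(steps):
        snapshot = [replace(a) for a in agents]  # copies of current states
        for agent in agents:
            agent.step(snapshot, dt)
    return agents
```

Each agent decides on its own (autonomy), pursues its own goal position with its own parameters (personal rules and goal), and reacts to the positions of the other agents (interaction with the environment). Calling simulate with a stationary leader and a faster follower shows the follower approaching and then stopping behind the leader.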

2.2.4 Agent Models in Driving Simulation

State-of-the-art simulation frameworks employ ABM to create the appearance of authentic patterns through independent agents whose actions are guided by environmental inputs [29, 19, 11]. These agents are usually assigned a predetermined route and a set of initialization parameters, such as driver-related parameters like the desired velocity or vehicle-related parameters like the maximum acceleration [19]. Throughout the simulation, traffic agents autonomously follow their assigned routes while making context-driven decisions such as applying brakes for collision avoidance [30], adhering to following distances [19], or executing lane changes [23]. These agent models aim to replicate the multifaceted driving tasks and situations handled by human drivers. Given the inherent complexity and diversity of these tasks, it is suggested that a complete driver model should combine solutions from different paradigms within a framework of sub-modules [31, 32]. This holistic approach involving multiple collaborative modules is referred to as a MAS (Multi-Agent-System), which is proposed as the optimal approach for all driver models [31].

The strategies employed to model behavior within these agents can differ significantly in terms of complexity. Prevalent multi-agent simulation frameworks are founded on distinct agent types, such as vehicles, pedestrians, or cyclists, that maintain their designated roles throughout the simulation [33, 11, 34, 35, 17]. More complex perspectives distinguish between humans and their means of transportation by using a modularized architecture to describe humans independently [17]. This allows, for example, a driver to behave more cautiously when driving a bus than when driving a passenger car, given the same personality profile [17].

2.2.5 Modeling Vehicle Driver Behavior by Agent Models in Driving Simulation

Most common strategies within DS employ a hierarchical framework to model driving behavior.

The constituents of these hierarchical structures and associated state-of-the-art methods are

discussed in categories proposed by the generic driver model, shown in Figure 2.1, to allow

comparability to relevant approaches from AV development.

• Strategic Level:

Agent models usually receive a destination point and map information or a predefined route as input. Depending on the degree of input, the desired route or a reference path is generated in the first step. For further processing, knowledge is required to identify driveable areas or preferred lanes. This strategic route information can be provided in different formats, for example, in the form of way-points [36] or a list of contiguous lanes [37]. Depending on the format, different strategies can be employed, such as an A* algorithm to find a suitable route to reach the target [36].

• Tactical Level:

The tactical level of driver models in current simulation frameworks differs greatly in complexity and strongly depends on the scenarios the model has to cope with and thus on the respective aim of the research.

Most state-of-the-art agent models in DS follow heuristic and hierarchical structures to decide from a set of possible actions [38], maneuvers [39, 13], or states [22] that discretize the driving task. The decision-making process of current agent models is mostly rule-based and aims to decide which maneuver or action to perform based on environmental information. The decision-making varies in strategy, complexity, and considered parameters. To provide some examples, the well-known open-source framework SUMO includes a car-following, an intersection, and a lane-changing model [39]. Which of these modules is active depends on topological circumstances and the dynamic movement of road users. The decision of which module to use is resolved hierarchically based on heuristic rules. Hochstaedter et al. present an approach distinguishing between lateral and longitudinal behavior [13]. Depending on the situation, different states of a following and a lane-changing model are assigned. The authors divide the following task into uninfluenced driving, approaching, braking in emergency situations, and car-following. The desired acceleration is calculated based on differential speed and relative distance. The lane-change decision is further influenced by a contentedness parameter, which triggers the wish for a lane change when, for example, the vehicle in front is driving slower.

Some commercial simulation frameworks like cogniBit [40] or AAI [41] use more

sophisticated approaches to model human-like driving behavior, including learning-based

decision-making combined with planning algorithms. Those commercial approaches

provide only very high-level information and no details of their individual solutions.

As a basis for decision-making, environmental information must be processed and associated by the simulation framework or the agent model. Table 2.1 provides an overview of the most commonly used information for driver agents in simulation together with exemplary references. Information is divided into three categories: ego vehicle related information, dynamic environment information, and static environment information. While ego vehicle related information characterizes the vehicle and the driver, the dynamic environment information category describes the traffic scenario that surrounds the ego vehicle. Please note that the ego vehicle does not refer to the driver or the function under test in the simulation but to the individual perspective of the vehicle or driver being modeled as an individual agent.

The represented information strongly depends on the addressed maneuvers (e.g., car-following or lane changes) and the respective scenarios (e.g., multi-lane highways or urban intersections). Especially in urban environments, the motion of others, such as the distance to the vehicle driving in front, has to be considered for decision-making. It is noteworthy that neighboring vehicles, as mentioned in Table 2.1, are determined by different strategies, e.g., vehicles driving in the same lane [42], vehicles driving within a certain Euclidean distance [43], or vehicles driving in adjacent lanes [13].

• Control Level:

A variety of motion models are available that can be ass igned to the control level in order

to carry out a particular maneuver or action. In this case, a maneuver-specifc model and

distinct parameter sets are used to compute spatio-temporal movement for the selected

maneuver. For example, practitioners frequently use models such as the IDM (Intelligent

Diver Model) [49, 50, 43] or the Wiedemann-Following model [13, 51] when s imulating

a car-following maneuver to represent longitudinal behavior. Methods like the MOBIL

(Minimizing Overall Braking Induced by Lane Changes) model [52] or the established

lane change model by Hoel et al. [53] are frequently used to model lateral behavior,

notably lane-changing. It is possible to include parameters in the modeling process that

refect the individuality of human behavior at both the tactical and control levels. These

parameters encompass elements like safe distances, gap acceptance thresholds, perceptual

sensitivity levels, or desired velocities, as summarized in Table 2.1. In addition, more

advanced approaches exist that interlink such parameters to create personality profles

inspired by statistical distributions of behavioral parameters obtained from actual human

driving data [36, 54, 15, 17, 46].
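
As an illustration of a parameterized car-following model at the control level, the following sketch implements the commonly published form of the IDM acceleration equation. The default parameter values are illustrative assumptions and are not taken from the cited works; clamping the dynamic part of the desired gap at zero is a common implementation choice, not part of the text above.

```python
import math


def idm_acceleration(v, gap, dv,
                     v0=13.9,    # desired velocity in m/s (assumed value)
                     T=1.5,      # desired time headway in s (assumed value)
                     s0=2.0,     # minimum standstill gap in m (assumed value)
                     a_max=1.0,  # maximum acceleration in m/s^2 (assumed value)
                     b=1.5,      # comfortable deceleration in m/s^2 (assumed)
                     delta=4.0): # acceleration exponent
    """Return the IDM acceleration of the following vehicle.

    v   : speed of the following vehicle in m/s
    gap : bumper-to-bumper distance to the leading vehicle in m (> 0)
    dv  : approach rate v - v_lead in m/s
    """
    # Desired dynamic gap; the interaction term is clamped at zero.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


# Following at 10 m/s with a 5 m gap produces strong braking.
print(idm_acceleration(v=10.0, gap=5.0, dv=0.0))
```

Parameters such as `v0`, `T`, and `s0` correspond to the desired velocity and safe-distance parameters mentioned above, so sampling them per agent is one way to express individual driving styles.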


Table 2.1: Summary of the most frequently used input information for driver agents in simulation.

Ego vehicle related information

Ego vehicle motion [20, 13]

Ego vehicle dimensions [37, 13]

Desired following distance [37, 42]

Desired velocity [44, 13, 42, 34, 43]

Desired acceleration [13, 34]

Maximal velocity [37, 45]

Maximal acceleration [45]

Emotional state or contentedness [46, 13]

Reaction time [34]

Perception range or gaze behavior [43, 47, 48, 37]

Risk tolerance factor [43]

Dynamic environment information

Distance to in-front driving vehicle [45, 20, 44]

Distance of neighbour vehicles [46, 13]

Relative velocity of neighbour vehicles [46, 13, 34]

Indicator and light status of neighbour vehicles [47, 13]

Time-gaps to neighbour vehicles [46, 34]

Direction or lane information of in-front driving vehicle [44]

State (position, velocity, acceleration) of neighbour vehicles [46, 43, 47, 45]

Static environment information

Lane information [13]

Traffic light status [13, 34]

Road topology [19, 34]

Road markings, traffic signs [47]

Conflict areas [37]

2.2.6 The Driver Agent at BMW: TRM

The driver agent used as a representative agent model within this thesis is an in-house developed

model by BMW called TRM [55]. The driver agent employs a modular and heuristic structure

that aligns with the sense-plan-act architecture, drawing inspiration from human information

processing, and is illustrated in Figure 2.2. By means of different algorithms within the driver

agent and the simulation framework, a representation of the static and dynamic environment

is calculated. This includes, among other things, identifying driveable areas from the map, the

positioning on the respective lane, and the absolute and relative positions of other road users.

This environmental information, as well as a predefined route, which is represented as a list of

contiguous lane objects and a set of parameters, is provided as input to the model. Essential

information, such as recognized traffic signs, is stored in an internal memory during runtime.

The output of the model encompasses the driver’s usual actions, such as turn signals, pedal

actions, and steering control. Based on this, the spatio-temporal motion is calculated by a

vehicle dynamic model. The decision-making follows a hierarchical and heuristic approach

whereby the driving task is separated into various maneuvers, such as lane changes, overtaking,

stopping at obstacles, stopping at red lights, slowing down due to speed limits or curves, and

more. A maneuver is defined as an active driving decision lasting several seconds. In the first

step, possible maneuvers are evaluated based on the current situation, allowing high-level

decision-making for the driving strategy. At each time step, multiple maneuver evaluators


determine their need for action. Due to the parallel evaluation, several maneuvers may request

action at the same time, but only one can be active. Therefore, a decision module applies rule-

based criteria to prioritize these potential maneuvers and finally determines the appropriate

one. The actual motion of the driver agent is then calculated using the respective parameterized

motion models. For example, longitudinal motion is based on Wiedemann’s human-inspired

distance model [56]. For lateral movement, the active maneuver provides an allowed corridor

within which the driver may move. A path planner generates a smooth trajectory within this

corridor. To reduce complexity, lateral and longitudinal motion, as well as temporal and

spatial decisions, are kept separate as far as possible within the model. For example, when

turning at an intersection, the driven velocity is influenced by the distance to the front-driving

vehicle, assumed to be a following task, and the curvature of the turning lane. Additionally, a

handling module is embedded to emulate human-like characteristics such as reaction times. For

personalized and individual driving behavior, driver-specific characteristics such as minimal

safety distances, the desired velocity, or speeding attitude can be configured for each driver

agent. Furthermore, dynamic parameters such as the contentedness in the current situation

influence evaluators, impacting decisions such as the timing of a desired lane change [57].

Figure 2.2: Structure of the driver agent TRM developed by BMW. (Diagram blocks: Environment & Route; Driver Memory; Maneuver Evaluators; Decision Making; Path Planning; Handling; Vehicle Dynamic Model. Signals: pedal position, steering wheel angle, indicator state; combined acceleration.)
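
The parallel evaluation and rule-based prioritization described for TRM can be sketched generically as follows. This is not the TRM implementation: the evaluator names, the priority ordering, the situation keys, and the contentedness threshold are all hypothetical and serve only to show the evaluate-then-arbitrate pattern.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Evaluator:
    name: str
    priority: int                         # lower value = more urgent (assumed rule)
    needs_action: Callable[[dict], bool]  # evaluates the current situation


# Hypothetical evaluators; keys and thresholds are illustrative only.
EVALUATORS = [
    Evaluator("stop_at_red_light", 0,
              lambda s: s.get("traffic_light") == "red"),
    Evaluator("slow_for_speed_limit", 1,
              lambda s: s.get("speed", 0.0) > s.get("speed_limit", float("inf"))),
    Evaluator("lane_change", 2,
              lambda s: s.get("contentedness", 1.0) < 0.5),
]


def select_maneuver(evaluators, situation) -> Optional[str]:
    """Evaluate all maneuvers, then let a fixed priority rule pick one."""
    requests = [e for e in evaluators if e.needs_action(situation)]
    if not requests:
        return None  # no maneuver requests action in this time step
    return min(requests, key=lambda e: e.priority).name


situation = {"traffic_light": "red", "contentedness": 0.2}
print(select_maneuver(EVALUATORS, situation))
```

Here both the red-light and the lane-change evaluator request action, and the priority rule resolves the conflict so that only one maneuver is active.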

2.3 Driver Models for Automated Vehicles

Since AVs ultimately aim to replace the human driver, the following section provides an

overview of the key methods used in the context of AVs to satisfy the interdisciplinary nature

of modeling human-like driving behavior. Modeling human driver behavior must be addressed,

as AVs need to employ automated driving systems capable of handling complex interactions

with humans in traffic in a safe and efficient manner [50]. Given the wide variety of challenges

related to AVs, the majority of publications only cover a part of the many tasks related to


AD, such as behavior prediction, trajectory planning, or environmental information processing.

Investigating and solving isolated components of the driving task or individual scenarios makes

sense, given the complexity of the holistic problem. However, the independent research leads

to individual solutions based on assumptions that are rarely valid for the wide variety of

traffic, are challenging to transfer, and offer few practical solutions. The primary methods

are discussed again in the context of the categories presented in the general driver model in

Figure 2.1.

• Strategic Level:

The strategic level in the context of AVs differs significantly from agent models in traffic

simulation, particularly during the early sensing phase. This distinction arises because

AVs must process raw sensory data from sources like cameras or LIDAR (Light Detection

and Ranging) sensors into meaningful environmental object information. However, as

they are outside the scope of this thesis, methods referring to object recognition or the

creation of an internal GT (Ground Truth) representation of the driving scene based on

raw sensor inputs are not further discussed. For generating a route, a process akin to

navigation or pathfinding is necessary to establish an initial path, similar to the input

required by an agent model.

• Tactical Level:

The tactical level in terms of AD aims to generate the future spatio-temporal movement

of the vehicle. This generation is mainly addressed as a planning task formulated as

an optimization or minimization problem referred to as trajectory planning. Trajectory

planning aims to generate the future spatio-temporal motion of the ego vehicle within a

given solution space while incorporating static and dynamic environmental information,

such as the behavior of other road users, the road geometry, or physical constraints

[58]. Environmental data is discretized and represented within the defined planning

space, facilitating the generation of an optimal or sub-optimal trajectory balancing

between different needs such as safety, comfort, and time efficiency [59]. Given the

intricate nature of trajectory planning in urban scenarios, the problem is often addressed

hierarchically. For example, a higher-level maneuver decision is taken first, followed

by local motion planning. Such decision-making tasks are addressed in various ways

in AV development. To give some examples, some works use unsupervised learning architectures

employing RL (Reinforcement Learning) to train an agent for reasonable decision-making

[60, 53]. Other researchers use combinations of unsupervised and supervised

learning strategies [14] or solely supervised approaches for decision-making [61, 62, 63].

Furthermore, these higher-level decisions can be approached similarly to agent models,

where the driving task is separated into independent sub-problems or maneuvers, and

methods such as state machines are applied to choose the appropriate maneuver [64,

65]. Alternatively, a higher-level decision can be planned by formulating the problem

accordingly to accomplish the diversity of scenarios encountered in urban traffic [66].

Such a planning problem is mainly addressed by search-, sampling-, or optimization-based

approaches [59]. Some researchers combine the individual methods into hierarchical

planning frameworks [66]. To reduce complexity and thus computational effort, different

strategies for decomposing the complex task into multiple sub-parts are employed. For


example, a path is first planned in the spatial dimension, and a velocity profile is

subsequently planned along this path [67].
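
To make the path-then-velocity decomposition concrete, the sketch below assigns a speed to each point of an already planned path by capping the lateral acceleration v² · |κ|. This is one simple heuristic chosen for illustration and is not the method of [67]; the function name and the limits `v_max` and `a_lat_max` are assumptions.

```python
import math


def curvature_limited_speeds(curvatures, v_max, a_lat_max):
    """Assign a speed to each path point so that v^2 * |kappa| <= a_lat_max.

    curvatures : path curvature kappa at each point in 1/m
    v_max      : upper speed bound in m/s (e.g. a speed limit)
    a_lat_max  : tolerated lateral acceleration in m/s^2 (assumed value)
    """
    speeds = []
    for kappa in curvatures:
        if kappa == 0.0:
            speeds.append(v_max)  # straight segment: only v_max applies
        else:
            speeds.append(min(v_max, math.sqrt(a_lat_max / abs(kappa))))
    return speeds


# Straight segment, gentle curve, tight turn (sign encodes turn direction).
print(curvature_limited_speeds([0.0, 0.05, -0.2], v_max=13.9, a_lat_max=2.0))
```

A complete velocity planner would additionally smooth this profile under longitudinal acceleration limits and account for other road users; the sketch only shows how the spatial and temporal sub-problems can be separated.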

In order to incorporate the evolution of the dynamic traffic scene into decision-making,

most approaches include a prediction module aiming at anticipating the future movement

of dynamic traffic participants [50]. For such anticipation tasks, Leon et al. present

a categorization into model-based and data-driven prediction models [68]. Model-

based approaches rely on knowledge, particularly physical dependencies and observable

spatio-temporal relationships. Consequently, those models perform well for short-term

predictions, as physical relationships, such as lateral acceleration for driven curvature,

serve as reliable indicators of motion dynamics [69]. However, these approaches show

weak performance when making long-term predictions due to the emergence of more

complex dependencies over extended time horizons, such as a yielding decision based on

surrounding traffic. In contrast, data-driven methods are predominantly based on black-

box models inspired by cognitive learning structures, such as an NN (Neural Network).

To provide a brief overview of learning-based and data-driven modeling approaches, the

most important keywords are explained:

The overall concept of AI (Artificial Intelligence), illustrated in Figure 2.3, aims at

making machines as intelligent as a human, involving abilities such as solving problems

or learning [70]. The ability to learn is classified as a subset of AI called ML (Machine

Learning) mainly addressed with NNs that are inspired by human cognition [70]. There

are various model approaches addressing learning, employing different levels of complexity

[70]. A highly nonlinear model structure is required to obtain dependencies when learning

complex patterns. Models incorporating multiple nonlinear layers for learning are called

DL (Deep Learning) models [70].

Figure 2.3: Overview of the interrelationships AI - ML - DL [70]. (Diagram labels: Field of Artificial Intelligence; Field of Machine Learning; Field of Deep Learning.)

DL models can be distinguished into supervised, unsupervised, and semi-supervised

[71]. Supervised models learn to predict desired output from a large amount of labeled

input-output training pairs, such as predicting a trajectory based on current and past

motion [72, 73, 74, 75, 76, 77, 78, 79]. Unsupervised models are applied less often to

predicting the future motion of other road users and more often to modeling decision-making. Some

approaches employ RL for learning tactical decision-making in maneuvers

such as lane changing [60] or stopping at intersections [53]. Semi-supervised methods

combine supervised and unsupervised methods. A commonly used method is named

GAN (Generative Adversarial Network), composed of a Generator and a Discriminator.


Roy et al. propose a trajectory prediction method that models interactions among

vehicles by embedding social context in a GAN, which helps to find the most acceptable

future trajectory candidate among a set of potential trajectories [80]. The generator

takes past trajectories as input and outputs a predicted trajectory. The past trajectories

and future prediction are provided to the discriminator, which classifies them as either

real or artificial, thereby learning interactive behavior by rejecting non-acceptable trajectories.

Data-driven models, in particular supervised learning approaches based on NNs, exhibit

a superior aptitude for long-term predictions as they have the capability to determine

highly nonlinear patterns. Nevertheless, such models are associated with issues such as

overfitting and the lack of transparency, arising from their black-box nature [81, 50].

Supplementary to Section 2.2.5, Table 2.2 lists the most commonly used environmental

information for prediction and planning models in the context of AVs with corresponding

publications. Again, the information strongly depends on the scope and scenarios the

research covers. Furthermore, it has to be noted that relevant other road users, as

mentioned in Table 2.2, are determined either by a distance measure [82] or neighboring

relationship as presented by Mo et al. using a neighboring concept employing a graph

representation [77]. More detailed insights into the individual state-of-the-art trajectory

planning and prediction solutions will be provided in Chapter 5 and Chapter 6.

Table 2.2: Summary of the most commonly used input information for autonomous vehicles that

is used for prediction or planning purposes.

Ego vehicle related information

Ego position and motion [75, 83, 84, 68, 85]

Past performed actions of the ego vehicle [86]

Maximal velocity [58]

Maximal acceleration [58]

Dynamic environment information

Movement and state of relevant road users [75, 87, 74, 86, 84, 77, 78, 68]

History of movement of relevant road users [83, 74, 77, 68]

Dimensions of relevant road users [83, 68]

Classification of relevant road users [87, 86, 78]

Constraint parameters of road user’s motion (v_max, a_max) [87]

Potential maneuvers and routes of relevant road users [88]

TTO (Time-to-overlap) to relevant vehicles [86]

Relative movement (distance, velocity) to relevant vehicles [68]

Static environment information

Road network information [75, 88, 77, 68, 85]

Road signs and speed limit [75, 68, 58]

Road geometry [86, 58, 68]

• Control Level:

To follow a predetermined trajectory, control strategies are employed to guide an AV by

computing the necessary actuating inputs, such as steering angle or throttle position.

Various controller strategies are available, ranging from classical methods employing PID

(Proportional-Integral-Derivative Controller) or model-based approaches using MPC


(Model Predictive Control), and extending to adaptive concepts that incorporate NNs or

fuzzy sets [89]. The control level is not discussed in more detail, as the present thesis

focuses on behavior modeling.
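
For orientation, the classical PID strategy mentioned above can be sketched as a small discrete-time controller. This is a textbook form under stated assumptions: the class name, the gains, and the interpretation of the output as a throttle or brake command are illustrative and not taken from [89].

```python
class PIDController:
    """Minimal discrete-time PID controller (textbook form)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # accumulated error
        self.prev_error = None   # no derivative available on the first step

    def step(self, error, dt):
        """Return the actuating command for the current tracking error."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical use: track a target speed, error = v_target - v_current.
pid = PIDController(kp=2.0, ki=0.5, kd=0.1)
print(pid.step(error=1.0, dt=0.1))
print(pid.step(error=0.5, dt=0.1))
```

In a trajectory-following setting, one such loop would typically act on the longitudinal speed error and another on the lateral deviation from the planned path.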

2.4 Further Fields of Research

2.4.1 Psychology

In the context of AV development and agent modeling for simulation, the focus tends to be on

the modeling aspect, whereas researchers in psychology strive to comprehend and elucidate

human driving behavior. Various experiments have been conducted to explore the impacts

of cultural, demographic, and other factors on driving styles [90, 91]. Some investigations

aim to establish correlations between driver characteristics and distinct driving patterns [92,

93]. Furthermore, some researchers attempt to separate behavior by conceptualizing models

influenced by cognitive information processing. For instance, Rasmussen introduces a tri-level

framework comprising knowledge-based, rule-based, and skill-based behavior, which diverge in

terms of the learning process, fundamental elements (experience or knowledge), and cognitive

load [94, 10]. Meanwhile, the driver model formulated by Donges is more focused on the

different tasks occurring while driving, such as the navigation, the control, and the stabilization

task associated with environmental influences [95, 10].

2.4.2 Cognitive Modeling

Cognitive modeling as a bridge between computer science and psychology is another essential

research discipline to mention in the context of human driving behavior, including the study

of the social and physical aspects of driving and the driver’s interaction with the vehicle

[96]. Cognitive architectures are general frameworks for specifying computational behavioral

models of human cognitive performance and are used to understand and model human

cognitive processes [32]. Cognitive modeling and the use of cognitive architectures facilitate

the understanding of human driving behavior, taking into account human capabilities such as

memory storage, recall of motor actions, and limitations such as memory decay or foveal versus

peripheral visual coding [32]. In the theory of human cognition, ACT-R (Adaptive Control

of Thought-Rational) is one of the most promising and complete cognitive architectures

widely used to model human behavior and decision-making and plays a crucial role in the

context of AVs [97]. ACT-R is identified as the most suitable platform for integrating the so

far independently investigated areas of self-driving cars into an interdisciplinary framework

[97]. Such cognitive systems are used to understand human cognition while driving, to address

specific tasks such as behavior prediction, and to study the driver’s cognitive load while driving

or interacting with the vehicle [98]. To give some examples, Salvucci et al. present a framework

based on ACT-R to model the driver components of control, monitoring, and decision-making

to handle typical highway tasks such as lane keeping and lane changing [32]. Janssen et al. use

cognitive architectures for agents moving in a new environment by adapting approaches from

cognition to RL [99]. The agent is supposed to learn a suitable plan in the new environment

based on experience and is modeled utilizing a cognitive architecture that characterizes its


perception, actions, and memory.

The use of cognitive models in conjunction with ML is also discussed as a promising approach

to improve performance by reducing complexity through the identification and delineation of

individual tasks, thus reducing the black-box character of models [98].

2.4.3 Robotics

Another pertinent research domain worth mentioning is robotics. In the context of AD, as

a vehicle transitions to an autonomous state, it transforms into a mobile robot since human

control is no longer in play. Many methodologies utilized in AV development originate from the

robotics field, including techniques for path or trajectory planning, tracking, and controlling

algorithms [10]. Moreover, within the realm of robotics, attempts to construct humanoid

robots entail similar challenges related to modeling human cognition and motion [100]. The

concept of agent models also shares a strong connection with robotics [27].

2.4.4 Imitating and Replicating Human Behavior

Imitating or replicating human behavior for various purposes, such as generating test cases for

AVs [101], is another related research area worth mentioning in the area of driver models. A

number of techniques in this area are based on generative modeling. Examples include learning

from demonstration using a GAIL (Generative Adversarial Imitation Learning) approach

[102, 101] or using a GAN [73] to mimic driver behavior. These approaches follow the idea

of learning from data rather than engineering. The resulting models follow an end-to-end

structure and are therefore less explainable.

2.5 Summary

In summary, the state-of-the-art analysis highlights the rich interdisciplinary nature of human-

like driver models, offering diverse perspectives, approaches, and technologies. However, it is

noteworthy that only a limited number of approaches address the topic with an interdisciplinary

perspective. Most of the research in this area falls into either computer science and robotics,

which focus on behavior modeling, or psychology, which seeks to explain behavior.

The state-of-the-art in the field of DS predominantly relies on heuristic and hierarchical

approaches that address individual sub-tasks of driving. A thorough literature review reveals

that the majority of published approaches are tailored for highway scenarios. In contrast,

urban scenarios involving interactions with other vehicles or VRUs (Vulnerable Road Users)

received comparatively little attention in the past.

Due to the inherently heuristic nature of these models, their applicability beyond their specific

context is limited. Consequently, new rules, maneuvers, and motion models must be developed

and integrated to accommodate the broader range of scenarios occurring in urban traffic.

Owing to the complexity of driving behavior, research maintains a strong focus on highway scenarios, both in the

context of DS and AVs. Many approaches devised for highway scenarios rely on underlying

assumptions, such as identifying interacting vehicles through neighboring relations, which are

not transferable to urban scenarios. Furthermore, while numerous solutions address individual


situations or maneuvers, most methods lack generalizability and applicability to a broader

range of more complex scenarios.


3 Requirements for Modeling Human-Like Driving Behavior

Disclaimer: The following chapter is based on research presented in [103]: Teresa Rock et al. “Quantifying Realistic

Behaviour of Traffic Agents in Urban Driving Simulation Based on Questionnaires”. In: 2022 IEEE Intelligent Vehicles

Symposium (IV). IEEE. 2022, pp. 1675–1682.

Chapter 2 presented an extensive overview of various perspectives, applications, and

research domains related to human-like driver models. To comprehend the absence of a

universal solution despite extensive research efforts, it is essential to define what constitutes

human-like driving behavior, establish quantifiable metrics, and outline the resulting requirements

for driver models applicable to urban environments. The present chapter addresses these

open questions by performing a detailed analysis of the problem formulation from various

perspectives, enabling the identification of methodological gaps in current state-of-the-art

solutions. Building upon this analysis, the chapter aims to pinpoint yet unsolved key challenges

in modeling human-like driving behavior and to formulate appropriate research questions.

These research questions are the focus of investigation and method development in this thesis.

3.1 Characteristics and Main Challenges of Urban Traffic

Before delving into the analysis of requirements for human-like driver models, it is essential to

outline the primary characteristics and challenges inherent to urban traffic. This foundational

step provides a clear basis for a comprehensive discussion regarding the suitability of current

state-of-the-art approaches.

Traffic environments, in general, can be broadly categorized into three primary contexts:

highway, rural, and urban. This research is focused on the intricacies of urban traffic, which

are marked by a distinct set of characteristics and complexities compared to highway and rural

environments. The urban traffic milieu introduces an array of challenges attributed to the

following characteristics [3, 6]:


• Shared spaces used by vehicles and VRUs: pedestrian crossings, intersections, bike lanes

• More complex road network design: tighter streets, less or no separation from oncoming

traffic, bidirectional roads

• High traffic dynamics: high relative velocities, frequent direction changes, high curvatures

• Occurrence of traffic obstacles: lanes occupied by wrongly parked cars, slower participants

like cyclists on the road

• Complex traffic regulation: ROW situations without unique regulations requiring

situational behavior adaptation and interaction

• Higher diversity of types of traffic participants, such as pedestrians, cyclists, or scooters

with varying dynamics

In summary, urban traffic exhibits high complexity across multiple dimensions, resulting

in behavior influenced by various factors. This complexity introduces new challenges for

driver models, both in terms of modeling behavior to process relevant information and make

reasonable decisions effectively and in evaluating model performance considering the impact of

situational influences.

3.2 Perspectives on Defining and Quantifying Human-Like Driving Behavior

As presented in Section 2.1, human-like driving behavior is explored across multiple research

domains. Various tasks, including modeling, analysis, and simulation of human-like or realistic

behavior, are discussed in the literature. Notably, no universally accepted definition of

human likeness or realism of behavior exists. Consequently, diverse research domains such as

psychology and computer science offer distinct approaches to quantifying, defining, or assessing

behavior. To offer a comprehensive perspective within this thesis, the following section provides

an overview of objective and subjective methodologies stemming from various research areas.

3.2.1 Objective Approaches to Quantify Human Likeness

Objective approaches measure human likeness by various indicators intended to characterize

driving behavior. In simulations, driver models are used for traffic flow analysis [104, 105] or

safety assessment of automated driving functions [47, 48]. With such multi-agent simulations, a

database of synthetic behavioral data can be created. For instance, simulation-based safety

studies require driver models controlling surrounding trafc to exhibit human-like behavior.

Fries et al. investigate the degree of human likeness of their model by comparing collision risk

due to relative velocities and distances in highway scenarios with different real traffic datasets

[47]. Other approaches, such as Lindorfer et al., evaluate model behavior in well-defined

scenarios, such as following maneuvers on highways, by analyzing the similarity of speed traces

from models compared to reference data from FOTs (Field Operational Tests) [106]. Typical

approaches to measure the similarity to arbitrary reference data include RMSE (Root Mean

Squared Error), means, and standard deviations [106, 107, 108].


In driving behavior studies, behavioral data of individuals is investigated concerning individual

aspects such as cultural or demographic influences. Typical measurement parameters aiming

to characterize driving behavior are:

• Average and maximum speed as well as frequency and extent of speeding [109]

• Distance to other vehicles, speed, steering wheel reversals, and acceleration counts [110]

• TTC (Time to Collision), longitudinal distance, and velocity in conflict situations [90]
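
As a brief illustration of one of the listed parameters, the sketch below computes TTC for a simple car-following situation under the common constant-velocity assumption. The function name and the convention of returning infinity when the gap is not closing are choices made for this example.

```python
import math


def time_to_collision(gap, v_follower, v_leader):
    """TTC in seconds for a following vehicle, assuming constant speeds.

    gap        : longitudinal distance between the vehicles in m
    v_follower : speed of the following vehicle in m/s
    v_leader   : speed of the leading vehicle in m/s
    """
    closing_speed = v_follower - v_leader
    if closing_speed <= 0.0:
        return math.inf  # the gap is not shrinking, so no collision is predicted
    return gap / closing_speed


print(time_to_collision(gap=20.0, v_follower=15.0, v_leader=10.0))
```

Evaluating such an indicator over a recorded or simulated drive yields a distribution that can then be compared with reference data.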

Other relevant methods evaluate trajectories generated by a driver model or submodules, such

as a prediction or planning module. The assessment is carried out by comparing the generated

trajectory to the human-driven trajectory in the same context. For instance, the cornering

and lane-change behavior of a driver model is assessed by comparing steering angle, lateral

acceleration, and longitudinal velocity with those exhibited by human drivers navigating the

same course [111]. In the field of AV development, human-like driving behavior is important

in quantifying trajectory prediction models. To assess the accuracy of such prediction models,

the generated trajectories are compared against those driven by humans using the same input

data, as illustrated in Figure 3.1. Typically, spatio-temporal distance measures are employed

to compare prediction and GT, encompassing metrics such as ADE (Average Displacement

Error), FDE (Final Displacement Error) [112], RMSE, or specific temporal (e.g. time to

detection) and spatial (e.g. lateral displacement) metrics [50].
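
The displacement-based measures named above can be sketched for a single predicted trajectory as follows. The sketch assumes that prediction and GT are sampled at the same time steps and given as lists of (x, y) positions; RMSE is computed here over the pointwise Euclidean displacement, which is one of several conventions in use.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE, RMSE) between two equally sampled 2D trajectories."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Pointwise Euclidean distance between prediction and ground truth.
    dists = [math.hypot(px - gx, py - gy)
             for (px, py), (gx, gy) in zip(predicted, ground_truth)]
    ade = sum(dists) / len(dists)                             # average displacement
    fde = dists[-1]                                           # displacement at horizon
    rmse = math.sqrt(sum(d * d for d in dists) / len(dists))  # penalizes outliers
    return ade, fde, rmse


human = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
model = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
print(displacement_errors(model, human))
```

In this example the prediction drifts away over time, so FDE exceeds ADE, which illustrates why both measures are usually reported together.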

In summary, objective human likeness is either defined as the statistical similarity between

macroscopic behavior parameters comparing artificial and real driving data or using distance

measures comparing spatio-temporal motion sample-wise within the same context.

Figure 3.1: Evaluation of model performance based on the similarity of a predicted trajectory compared to the individual human-driven trajectory. (Diagram labels: input data describing the driving situation; human-driven trajectory; predicted trajectory.)

3.2.2 Subjective Approaches to Quantify Human Likeness

In contrast, subjective approaches use humans as quantifers, employing questionnaires to

inquire about their perceptions. The underlying assumption is that either what is perceived as

real or cannot be distinguished from artificial behavior defines human likeness. Zhang et al.,

for example, adapted the Turing test and asked participants to classify the driving behavior of

another vehicle as either artificial or human-driven [113]. Dumbuya et al. followed a similar

concept by asking subjects to rate how realistic they perceived a drive completed by different

driver models and how likely it was that a human driver conducted the drive [114]. Since model


behavior during simulation in DiL applications can be considered a VE experience, methods

for measuring the experience and effectiveness of a VE are a further relevant research area. In

most cases, the concept of presence [115, 116, 117, 118, 119] or immersion [120, 115] is used to

quantify the quality of a VE. The underlying assumption is that a high sense of presence means

that people respond as if the sensory input were real [118]. Thus, a high degree of presence

is associated with a high quality of the VE. Those approaches were applied to investigate

pedestrian [117, 119] and driving simulators [117, 119, 116]. Due to the complexity of human

perception and cognition, several approaches divide the sense of presence into sub-classes, such

as spatial presence, physical presence, social presence, or co-presence [121]. In experiments,

participants are asked to rate their sense of presence using questionnaires, such as the IPQ

(Igroup Presence Questionnaire) [122] and the SUS (Slater-Usoh-Steed) presence questionnaire

[123].

3.2.3 Potentials and Limitations of Objective and Subjective Quantification Methods

Objective and subjective approaches offer distinct strengths and limitations, which are

summarized in Table 3.1. Objective strategies for quantifying human likeness mainly rely

on macroscopic behavioral perspectives, evaluating the resemblance of motion parameters to

those found in real reference data. These assessments are typically independent of individual

situational influences. The reliability and relevance of assessment results strongly depend on the

reference data employed and the scenarios under consideration. Such methods often necessitate

the establishment of thresholds, either based on the similarity of individual parameters or

applied statistics, to determine if the behavior is sufficiently human-like. A major challenge

in comparing parameters to reference data is ensuring that the reference dataset includes

directly comparable scenarios. These scenarios or maneuvers can be defned by factors like

traffic density or the number of parallel lanes on a highway road [47]. Objective approaches

often prioritize highway scenarios, partially due to the growing complexity of identifying

comparable scenarios within more complex urban environments. While objective approaches

offer insights into macroscopic behavioral patterns, they do not offer transparency into the

realism of driving behavior concerning individual situational contexts. In contrast to objective

approaches, subjective experiments provide insights into behavior within the situational context

but demand substantial efort in the preparation and execution of experiments. Additionally,

these methods do not facilitate the identifcation of individual model weaknesses, as they

evaluate overall behavior without discretizing driving behavior into categories, dimensions, or

parameters.
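
The threshold-based objective comparison described above can be illustrated with a minimal sketch. It assumes that a single motion parameter (here speed) is compared against reference data using the two-sample Kolmogorov-Smirnov statistic; the choice of statistic, the threshold of 0.3, the function names, and all sample values are assumptions made for illustration and are not taken from the cited approaches.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def is_human_like(model_values, reference_values, threshold=0.3):
    """Hypothetical acceptance rule: the distributions must differ by less than `threshold`."""
    return ks_statistic(model_values, reference_values) < threshold


# Illustrative speed samples in m/s (made-up numbers, not measured data).
reference_speeds = [11.8, 12.5, 13.1, 13.9, 14.2, 12.9, 13.4]
similar_agent = [12.0, 12.7, 13.0, 13.8, 14.0, 13.2, 13.5]
rigid_agent = [13.9] * 7  # an agent that always drives at exactly the same speed

print(is_human_like(similar_agent, reference_speeds))
print(is_human_like(rigid_agent, reference_speeds))
```

The sketch also makes the limitation discussed above visible: the verdict depends entirely on the reference samples and the threshold, and it says nothing about whether the speed was plausible in the individual situation.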

Table 3.1: Potentials and limitations of different perspectives to quantify human-like driving behavior.

Objective Subjective

Effort (suitability for iterative development process)

Transferability (suitability for complex urban environments)

Providing insights into situational realism (micro-level)

Potential to identify individual model weaknesses

+

-

+

-

+

-


3.3 Requirements From a Driver-In-The-Loop Perspective

Section 3.2 provided an overview of different perspectives on quantifying the human likeness
of driver models, highlighting the associated possibilities and limitations of these approaches.
However, none of them provides sufficient insight to derive clear requirements for modeling
human-like drivers. Therefore, theories from a psychological point of view will be further
investigated concerning DiL applications. The resulting findings will be combined with technical
requirements for DS to provide a basis for analyzing the state-of-the-art approaches and identifying
root causes of problems and scientific gaps.

3.3.1 What It Takes to Be Perceived as Real - The Human Driver in the Simulator as a Mirror

This thesis focuses on DS, specifically within the context of DiL applications. As mentioned
above, potential definitions of human likeness can be based on what is perceived as real by a
human experiencing the VE [118] or on what a human cannot distinguish from real human
behavior [113]. To gain deeper insight into what is required for behavior to be perceived as
real, this section examines the role of the driver in the simulator and analyzes the experience
within the VE. As explained in Section 3.2, the common method for measuring perceived
realism in VEs revolves around the concept of presence. Factors contributing to a high sense
of presence are therefore investigated to acquire a more comprehensive understanding.

The literature assumes the following conditions for presence:

• Consistent low-latency sensorimotor loop: predictable correlation between proprioception and sensory input [118]

• Statistical plausibility: visual inputs must be plausible with regard to the probability distribution of natural scenes [118]

• Behavior-response correlations: appropriate correlations between state, behavior, and responses [118]

• Interaction with the environment: appropriate response from the environment, especially in social interactions with other virtual humans [115, 118, 124, 125]

Johnson et al. argue that people not only infer from given information but also anticipate the
future. Consequently, a strong sense of presence relies on aligning past and current experiences [126].
Drawing from constructivist theory, the authors emphasize that subjects in VEs do not
perceive an exact image but rather a constructed version shaped by their cognitive processes,
which are influenced by various factors. The authors highlight the need to distinguish between
the degree of interactivity and the degree of graphic realism of a character. Additionally, the
uncanny valley theory suggests that a photorealistic character displaying jerky movements is
more likely to be perceived as uncanny than a cartoon character that moves in a more
human-like manner [125]. According to Jerald, social presence in a VE is the sensation of true
communication with other virtual characters, which is reinforced when these characters move
and behave in a manner consistent with the physical world [126]. These factors indicate what
leads the human driver in the simulator to perceive the VE as more real. To identify clear
requirements for human-like driver models, the following sections place these findings in the
context of traffic behavior and relate them to the characteristics and capabilities of human
driving behavior.

3.3.2 What It Takes to Cope With Urban Traffic - The Human Driver as a Role Model

Since the objective is to model human-like driver agents, the abilities and characteristics employed
by human drivers in urban traffic should be considered to derive required model capabilities.
Human driving style varies due to different personalities, experiences, and situational contexts,
including internal and external influences [127]. External influences result from the environment,
such as road conditions, weather, and other road users, while internal factors relate to the
individual driver, such as age, gender, and risk tolerance. Therefore, human traffic behavior is
characterized by a certain range of behavioral patterns corresponding to various combinations
of those factors.

The driver’s central responsibility is to participate safely in traffic while following
fundamental traffic rules. Decision-making in urban situations is complex
and non-deterministic, requiring interaction and communication to maintain traffic flow and
prevent accidents. Studies of communication between AVs and human drivers, such as those
conducted by B. Faerber, show the importance and characteristics of communication in traffic
[128]. The exchange of information in traffic can vary in type, such as verbal and non-verbal
communication, but can only be understood from the respective context. Gestures and
movement are pervasive means of non-verbal communication in traffic, making a road user’s
behavior predictable for others. The authors conclude that AVs must be able to
recognize and interpret other road users’ gestures and trajectories. Situational knowledge and
understanding are required to interpret such signals and put information into the right context.

Understanding signals from context is investigated in various research areas and can be

associated with the concept of SA (Situation Awareness). According to Endsley, SA can be

divided into three levels, which are not necessarily strictly forward or linear [129]:

• Perception of the current situation

• Comprehension of the current situation

• Projections of probable future developments
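
The three levels listed above can be sketched as a small processing chain. This is only an illustration under strong assumptions: the class and function names, the 50 m relevance range, the 3 s horizon, and the constant-velocity projection are hypothetical, and the strictly sequential flow is a simplification, since the levels are not necessarily linear according to Endsley.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Level 1 (perception): raw state of another road user along the ego path."""
    name: str
    gap_m: float          # longitudinal distance ahead of the ego vehicle in m
    closing_speed: float  # m/s, positive when the gap is shrinking


def comprehend(obs, relevance_range_m=50.0):
    """Level 2 (comprehension): decide whether the observed road user matters."""
    return obs.gap_m <= relevance_range_m and obs.closing_speed > 0.0


def project(obs, horizon_s=3.0):
    """Level 3 (projection): constant-velocity estimate of the gap after `horizon_s`."""
    return obs.gap_m - obs.closing_speed * horizon_s


def assess(observations):
    """Run the three levels in sequence and return the names of projected conflicts."""
    conflicts = []
    for obs in observations:
        if comprehend(obs) and project(obs) <= 0.0:
            conflicts.append(obs.name)
    return conflicts


scene = [
    Observation("cyclist", gap_m=12.0, closing_speed=5.0),
    Observation("parked car", gap_m=80.0, closing_speed=10.0),
    Observation("lead vehicle", gap_m=30.0, closing_speed=-1.0),
]
print(assess(scene))
```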

Especially in urban traffic, various situations in shared spaces involve interactions between road
users. Interaction can be defined as behavior influenced by space-sharing conflicts [130]. The
foundation of interaction, understood as implicit or explicit communication in traffic, is mutual
knowledge and shared information [131, 130]. To successfully maneuver through interactive
situations in traffic, Markkula et al. identified the following key tasks that a human must
complete [9]:

• Perceiving: recognizing others and showing awareness of them

• Moving: adapting the spatio-temporal movement to avoid collisions


Furthermore, driving is described as a highly dynamic process that requires anticipation to
consider potential future outcomes for reasonable driving decisions [132]. In summary, based on
recognizing and interpreting the situational context, humans constantly adapt their behavior
to the current situation, leading to a context-related plausibility in behavior regarding spatial
and temporal changes.

3.3.3 Use Cases and Technical Requirements Originating From Driving Simulation

There is a wide range of applications for DS, from concept testing during early
development to final safety assessment before introducing new features. In the context of DiL
applications, use cases related to the driver’s perception of HMI (Human-Machine Interface)
systems are particularly relevant. These include studies exploring the acceptance, comfort, or
understanding of driving assistance features [133] and use cases in which the driver has to take
over control of the vehicle because a function is no longer available [134]. Common assistance
systems in urban settings encompass ACC (Adaptive Cruise Control), collision warning and
avoidance systems, VRU detection, blind spot detection, distance or speed assistants, and
intersection assistants [135, 3]. All these systems share the common characteristic of being
significantly influenced by the behavior of other road users. DS offers significant advantages,
particularly in scenarios that cannot be tested in actual traffic for safety or ethical reasons.
Based on the characteristics of urban traffic and typical assistance functions, critical scenarios
were identified that driver agents should be able to cope with in order to exploit the full
capacity of DS. The following four key scenarios, illustrated in Figure 3.2, cover the
main challenges of urban traffic [3] and provide a basis for applying and testing the methods
developed in this thesis.

• Different ROW-regulated interactions with other traffic participants (A_VEH)

• Interaction with pedestrians crossing the road (B_PED)

• Avoidance of static obstacles in traffic (C_STAT)

• Handling of dynamic obstacles, such as interactions with cyclists on the road (D_BIC)

In order to provide a reliable test environment for experiments, driver agents must be able to
functionally cope with the above scenarios. This means agent models must exhibit deterministic
and reproducible behavior while avoiding collisions and deadlocks. In addition, since the
simulation in DiL applications takes place online, approaches are required to be real-time
capable. Furthermore, the plausibility of the traffic agents’ behavior affects the simulation’s
validity [46].

3.3.4 Summary of Requirements for Human-Like Driver Agents in Urban Driving Simulation

By combining the technical requirements for simulation, the psychological factors that
contribute to perceived realism, and the characteristics of urban traffic and human driving
behavior, the following higher-level requirements for human-like driver models in urban traffic
can be derived.

Figure 3.2: Key scenarios (A_VEH, B_PED, C_STAT, D_BIC) representing the main challenges of
urban traffic for driver agents in simulation. The ego vehicle, shown in yellow, experiences the
different situations.

1. Spatio-temporal consistency (SPA-TEM): Context-related behavior and SA.

Research has demonstrated the importance of spatio-temporal consistency in behavior, both
in the context of presence measurements and with regard to safe and efficient participation
in traffic. Spatio-temporal consistency in the behavior of human drivers is based on
a certain level of understanding of the current situation and the ability to anticipate
potential temporal evolutions of the situation.

Therefore, to exhibit human-like and reasonable behavior, agent models must be able to
perceive and identify relevant environmental details, associate and interpret information,
and make predictions about temporal and contextual changes.

2. Interactivity (INTERA): Behavior-response correlation and communication.

Interaction and behavior-response correlation contribute to a high sense of presence,
and the ability to interact and communicate is crucial for safe participation in traffic.
Furthermore, typical use cases in driving simulation involve interactive traffic scenarios.

According to the conducted analysis, driver agents must be able to adapt their behavior
to the dynamic situation, which requires a shared understanding, in order to interact
effectively with other road users.


3. Individuality (STAT): Statistical representation of natural patterns.

Human driving style varies due to different factors such as personality, experiences, and
situational circumstances, such as weather, traffic density, or individual behavior of other
road users. Consequently, human traffic behavior exhibits a spectrum of behavioral
patterns.

Therefore, the behavior of driver models should be parameterizable, allowing them to
exhibit a range of behavioral variety that aligns with the statistical range of natural
patterns occurring in real traffic.
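
The individuality requirement, combined with the demand for deterministic and reproducible behavior from Section 3.3.3, can be sketched as follows. The parameter set, the Gaussian distributions, and all numeric values are hypothetical placeholders and are not calibrated to real traffic data; the sketch only shows how a seeded random generator yields a varied yet reproducible driver population.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverProfile:
    desired_speed: float  # m/s
    time_headway: float   # s, preferred gap to the vehicle in front
    max_accel: float      # m/s^2


def sample_profiles(n, seed):
    """Draw `n` driver profiles; a fixed seed makes the population reproducible."""
    rng = random.Random(seed)
    profiles = []
    for _ in range(n):
        profiles.append(DriverProfile(
            # Means and spreads are placeholders, not fitted to reference data.
            desired_speed=max(5.0, rng.gauss(13.9, 1.5)),
            time_headway=max(0.5, rng.gauss(1.5, 0.4)),
            max_accel=max(0.5, rng.gauss(1.8, 0.3)),
        ))
    return profiles


population = sample_profiles(5, seed=42)
for profile in population:
    print(profile)
```

In a real setup, the distributions would have to be derived from observed traffic so that the sampled range matches the statistical range of natural patterns.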

3.4 Methodological Gaps in State-Of-The-Art Solutions

Inspired by the literature presented in Sections 2.2.5 and 2.3 and the analysis conducted
in Sections 3.3.1 and 3.3.2, which discuss different perspectives on the driver, the
main components and interrelationships required to generate human-like driving behavior
in urban traffic are consolidated in the model illustrated in Figure 3.3. To replicate human
driving behavior in urban traffic, a representation of the current situation is necessary to
make information accessible to the automated algorithms controlling behavior. Such
representations of complex environments depend on information sources beyond
raw sensory information, such as knowledge and experience. Furthermore, information
processing is distributed among multiple tasks that involve interpreting information and
anticipating future developments. Decision-making for dynamic behavior control is based on
input provided by these components. The output of the model, its behavior, is defined as
spatio-temporal motion, in this case the vehicle’s trajectory. Based on these components, their
interdependencies, and the high-level requirements presented, state-of-the-art solutions were
analyzed to derive the following methodological gaps.

Figure 3.3: Overview of identified components required for modeling human-like driving behavior
in urban traffic. Diagram labels: SITUATION, HUMAN, MODEL, REPRESENTATION, INTERPRETATION,
ANTICIPATION, DECISION MAKING, TRAJECTORY, BEHAVIOR; Knowledge, Rules, Experiences, Personality.

3.4.1 Insufficient Representation of Context Information

To exhibit interactive and plausible behavior in urban traffic, a certain degree
of SA is required. While scientists in psychology attempt to decipher the cognitive processes
happening while driving, most approaches in the technical field of modeling lack a distinct
information-processing perspective and fail to consider the comprehension part needed to generate
SA within the model. Tables 2.1 and 2.2 summarize the most commonly used information
in the context of AVs and agent models in DS. Most situational information is provided on
a raw basis without associated semantic meaning or interpretation. Dynamically changing
information is represented by the position and motion of selected neighbors of the ego vehicle.
In most cases, relevant road users are limited to other vehicles. Given the major challenges of
urban traffic, there is a gap in the representation of dynamic information at a level sufficient
to enable SA, that is, a level that includes a situational understanding of interactions,
characterized by the involved road users and their inherent relationships. Furthermore, current
research rarely considers situational context as an input for modeling behavior or evaluating
model behavior. Meanwhile, approaches that incorporate more contextual information tend to
focus on isolated situations and lack transferability. An appropriate representation of the
situation is the basis for any further behavior modeling. Accordingly, the representation of
information is identified as one of the main challenges when modeling driving behavior for urban
scenarios and is the basis for the first research question:

R1: How to represent complex situations in urban traffic in a general and transferable manner?

3.4.2 Missing Anticipation in Agent Models

Anticipation and the role of expectations are heavily discussed in the psychological perspective
of SA [129, 136]. Moreover, the basis for interactivity and spatio-temporal consistency in
behavior, as discussed in Section 3.3.4, shows the need for anticipation. Meanwhile, estimating
the intentions of other road users is a widely studied topic in developing AVs to enable safe
and reasonable decisions. However, current driver agents in simulation rarely anticipate future
developments of the situation or incorporate such information into decision-making. Although
behavior prediction of other road users is frequently addressed, it is still an unsolved challenge
for AVs in urban scenarios due to the complexity of behavior. Published approaches tend to
offer single-point solutions that are not sufficiently transferable to the diversity of urban traffic
due to overfitting and the lack of transparency of black-box models [81, 50].
Therefore, the published anticipation modules cannot simply be integrated into agent models
as a tool, and further research on anticipation in urban traffic is required. Based on this, the
second research question is formulated:

R2: How to enable general and transparent predictions in complex urban traffic?

3.4.3 Missing Dynamics in Decision-Making of Driver Agents

Current agent models are based on hierarchical and heuristic decision strategies that divide
the driving task into maneuvers and formulate rules for the selection of each maneuver. Such
approaches discretize the action space and thus limit the ability to adapt behavior to the
situation. The discussion in Section 3.3.4 demonstrated that more dynamic decision strategies
are necessary to satisfy the requirements of interactivity and spatio-temporal consistency in
behavior. Moreover, strong discretization of the action space can lead to deadlocks in complex
situations [137, 138]. Such functional errors impede the simulation and might negatively
influence the participants’ perception of reality in DS [139, 118]. As a solution, agent models
need to be enabled for DDM (Dynamic Decision-Making), which is characterized by a decision
process involving a sequence of interdependent choices that influence future actions [140],
instead of static decision-making, which is defined as a linear process that makes choices among
explicit alternatives [141]. Accordingly, the third research question is formulated:

R3: How to enable driver agents for dynamic decision-making in order to cope
with typical urban scenarios in a functional and human-like manner?

3.4.4 Insufficient Evaluation Strategies and Misleading Metrics

As discussed in Section 3.4.1, contextual information is important not only for modeling but
also for evaluating model behavior. The main approaches discussed in Section 3.2 and Table
3.1 show that the capability of currently available evaluation strategies is limited. Most applied
metrics are based on pairwise comparison of model behavior to human behavior under the
same conditions, which is limited by the available test scenarios and only provides insights into
the similarity of model behavior to that individual human behavior, not human-like behavior in
general. Meanwhile, macroscopic analyses identify comparable subsets in real-world traffic and
artificial behavior data using criteria that are not transferable to urban scenarios. As a result,
a gap has been identified in gaining transparent insights into the strengths and weaknesses
of individual models for coping with urban traffic, which would allow for future improvements.
Given the strong situational dependency of behavior in urban traffic, the context of the behavior
must be considered in the evaluation to gain meaningful insights. It is crucial to provide
sufficient transparency regarding the limits of a model and to identify the situations in which
difficulties arise to allow for reliable solutions in the future. Depending on the application and
the development state of the model, different approaches are required. Accordingly, the fourth
research question is formulated:

R4: How to identify model limitations and quantify the degree of human likeness
of holistic driver models and individual subcomponents?


4 Representation of Complex Traffic Situations

Disclaimer: The following chapter is based on a concept presented in [142]: Teresa Rock et al. “Data-Driven Prediction of
Other Road Users’ Intention for Better Scene Understanding in Traffic Agents”. In: Proceedings of the Driving Simulation
Conference 2022 Europe VR. Ed. by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation
Association. Strasbourg, France, Sept. 15, 2022, pp. 9–16.

The present chapter addresses research question R1: How to represent complex
situations in urban traffic in a general and transferable manner?

Beginning with a brief introduction, the chapter provides an overview of current methods for
representing contextual information. Subsequently, a novel approach is introduced, focusing
on the representation of complex urban traffic scenarios inspired by the concept of knowledge
transfer. This technique serves as the foundation for enhancing situational understanding and
SA within a driver model. A framework is presented that facilitates the representation of
complex urban scenarios in a general and transferable manner and is applied to a real traffic
database. Finally, results and limitations of the method are discussed, and recommendations for
future work are provided. As it is usually not feasible to verify a representation independently
due to the absence of GT, the effectiveness of the presented representation is validated through
the use of a prediction model in the ensuing chapter.

4.1 Introduction and Motivation

The foundation of any model lies in the representation of information. Unfortunately, in
the field of agent models for DS, the precise methodology of information processing is often
not discussed in publications. Consequently, the extent to which raw data is associated and
interpreted to generate situational understanding within the agent model remains unclear.
Given that these models typically adhere to heuristic methodologies, it can be inferred that
these association processes are predominantly situation- or maneuver-specific, constraining their
potential for generalization. Within the domain of AV development, contextual information
is mostly processed through the formulation of constraint and objective functions during
planning or by providing input features to a prediction model. In this context, situational
understanding is addressed only implicitly, through the optimization of cost functions or the
intricate nonlinear relationships within black-box models.

Drawing from Endsley’s theory of human SA, it is acknowledged that achieving a level of
comprehension extends beyond the mere perception of information [143]. Instead, various
streams of information are combined to derive meaning and extract relevant
information. As an analogy, the authors mention the difference between understanding the
meaning of a text and merely reading single words.

Combining different sources of information to allow a comprehensive understanding of a
situation is referred to as interpretation in the following.

From the author’s point of view, the interpretation and thus the representation of information
is one of the major weaknesses of current driver models for urban traffic. Information is usually
provided in a raw format, and the process of relevance determination and interpretation is
delegated to some optimization procedure that aims at generating meaningful results but no
longer provides transparency. The lack of the interpretation level places unnecessary demands
on such a model, as the model is deprived of essential knowledge, such as the semantic meaning
of relationships between road users. At the same time, agent models in simulation formulate
scenario-specific interpretations without further automation, and generalizability suffers as a
consequence.

The following sections propose a method for associating information in complex urban traffic
situations to address the missing interpretation level in driver models. In the first step, a
general scene description from an ego perspective is elaborated to provide a basic description
covering the majority of situations occurring in urban traffic. The subsequent method generates
model-accessible features from a raw data basis containing dynamic situational and static
environmental information. The feature generation method is a multi-step process involving
associating and interpreting information using prior knowledge. The method is implemented
for an open-source dataset of real traffic situations at German intersections captured by a
drone.

4.2 State-Of-The-Art: Environment Representation

The following section provides an overview of state-of-the-art approaches to represent the
environment. The first part addresses the question of how to represent the static environment,
e.g., the road network or the map. Subsequently, the representation of dynamic
information is discussed.

4.2.1 Static Environment Representation

There exist different formats and approaches to represent the static environment. One can
distinguish between geometrical and topological representations. Geometrical
representations are often built on raw sensor data and aim to represent lanes or lane boundaries
to describe the drivable area. Typical methods are polylines, clothoids, or
representing the 3D environment by octrees [144].
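
As a minimal illustration of the polyline variant of a geometrical representation, the following sketch stores a lane center line as a list of 2D vertices and evaluates the position at a given arc length. The function names and the example coordinates are hypothetical; clothoids and octrees would require different data structures.

```python
import math


def cumulative_lengths(points):
    """Arc length from the first vertex to each vertex of a 2D polyline."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    return lengths


def point_at(points, s):
    """Linearly interpolate the position at arc length `s` (clamped to the polyline)."""
    lengths = cumulative_lengths(points)
    s = min(max(s, 0.0), lengths[-1])
    for i in range(1, len(points)):
        if s <= lengths[i]:
            segment = lengths[i] - lengths[i - 1]
            ratio = 0.0 if segment == 0.0 else (s - lengths[i - 1]) / segment
            (x0, y0), (x1, y1) = points[i - 1], points[i]
            return (x0 + ratio * (x1 - x0), y0 + ratio * (y1 - y0))
    return points[-1]


# Made-up center line: a 5 m diagonal segment followed by a 6 m straight segment.
centerline = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(cumulative_lengths(centerline))
print(point_at(centerline, 8.0))
```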

Topological approaches involve semantic information and provide a more abstract manner to
represent the environment. Such representations vary greatly in their level of detail. OSM
(OpenStreetMap) is an open collaborative mapping project that was launched in 2004 [145]. It is
a map database that represents road networks using a hierarchical approach, providing geometrical
information of lanes associated with topological information such as relations or driving
directions [146]. Based on the OSM format, Bender et al. introduced the lanelet description,
especially for self-driving cars, which combines geometrical and topological information to
overcome the drawback of the weak local geometrical representation of OSM [144]. In some
experiments, HD (High-Definition) maps were introduced for AVs. These maps use a high-detail
format containing additional information such as road boundaries and road curvatures [146].
Owing to the high level of detail, such representations are computationally demanding, especially
in online applications [146]. Offline formats were introduced to avoid the demanding process of
dynamically building HD maps. Commonly used formats are OpenDRIVE, NDS (Navigation
Data Standard), and Lanelet2, which are usually pregenerated offline [147]. OpenDRIVE maps
are commonly used in the field of DS and have been named the most likely HD map format
for the future, based on advantages such as expressiveness and accessibility, benefiting from the
open-source and open-community character fostered by ASAM (Association for Standardization of
Automation and Measuring Systems) [147, 148]. Therefore, the OpenDRIVE format was
chosen as the representation for static environments in this thesis. The description of road
networks in this format is detailed in the following section.

4.2.2 OpenDRIVE Maps

The OpenDRIVE format was introduced by ASAM, describing either real or synthetic road
networks using a multi-level approach incorporating roads, lanes, and objects, such as traffic
signs or signals. The map description is stored in an XML (Extensible Markup Language) file
format, employing an extendable node format to allow transferability to individual applications
[149]. Due to the relevance for this thesis, the following descriptions focus on representing
urban road networks. The basic elements in an ODM (OpenDRIVE Map) are roads, junctions,
and lanes. Roads consist of one or multiple lanes, which are not allowed to overlap. Junctions
consist of multiple roads, which are allowed to overlap and connect incoming and outgoing
roads. Each road refers to a reference line that provides the basis for describing placements
of objects and lanes within this road using a local two-dimensional coordinate system called
s/t, whereby the s-axis runs along the road and the t-coordinate is perpendicular to it. Any
object in the road, such as signs or signals, can be referenced either by global coordinates or
by local s/t coordinates within the respective lane, as illustrated in Figure 4.1. Lanes of
different roads are connected by predecessor and successor relationships. Each road in the map
has a unique ID, and each lane within a road has an ID that is unique within that road. Thus,
each lane can be described individually by the combination of road and lane ID. Traffic signs or
signals are associated with lanes and roads by referring to the respective IDs. Lanes have
multiple properties, such as type, driving direction, and geometrical descriptions of lane
boundaries and the center line. Lanes can be of different types, such as driving lanes, parking
lanes, or bicycle lanes, as shown in Figure 4.2a. The geometrical representation of reference
lines is expressed by points, whereby each point carries global coordinates and s/t coordinates.
Multiple lanes within one road can be identified as neighbors by their driving direction,
characterized by the sign of the respective lane ID, as illustrated in Figure 4.2b [150].

Figure 4.1: OpenDRIVE road network description including reference line, lane objects and lane
features [149].

For reading OpenDRIVE maps, a module of the BMW proprietary simulation framework Spider is
used [151, 55]. Map representations within Spider align with the OpenDRIVE standard. However,
instead of distinguishing between roads and junctions, the road network is divided into different
types of sections: linear sections referring to roads and intersections referring to junctions.
Each section consists of one or multiple lanes. Comparable to the OpenDRIVE standard, lanes are
allowed to overlap within intersections, but not within linear sections. Each lane is uniquely
identified by lane ID, section ID, and section type. The geometry of a lane is described by the
polylines of the center line and of the left and right lane boundaries, as illustrated in
Figure 4.3. Each point of the polyline is characterized by geometrical attributes involving local
coordinates, s/t coordinates, curvature, and direction values.
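
The identification scheme described above (a lane addressed by road ID plus lane ID, and the sign of the lane ID indicating the driving direction) can be sketched with Python's standard XML parser. The embedded map is a hand-written, heavily reduced snippet that follows the OpenDRIVE element names but omits most mandatory content such as the plan view geometry; the helper names `index_lanes` and `same_direction` are hypothetical. This is not the Spider module used in this thesis.

```python
import xml.etree.ElementTree as ET

# Hand-written minimal snippet for illustration; real maps carry many more elements.
ODM_XML = """
<OpenDRIVE>
  <road id="1" junction="-1" length="50.0">
    <link>
      <successor elementType="junction" elementId="10"/>
    </link>
    <lanes>
      <laneSection s="0.0">
        <left>
          <lane id="1" type="driving"/>
        </left>
        <center>
          <lane id="0" type="none"/>
        </center>
        <right>
          <lane id="-1" type="driving"/>
          <lane id="-2" type="biking"/>
        </right>
      </laneSection>
    </lanes>
  </road>
</OpenDRIVE>
"""


def index_lanes(xml_text):
    """Map each (road ID, lane ID) pair to the lane type."""
    lanes = {}
    root = ET.fromstring(xml_text)
    for road in root.iter("road"):
        road_id = road.get("id")
        for lane in road.iter("lane"):
            lane_id = int(lane.get("id"))
            if lane_id != 0:  # lane 0 is the center lane on the reference line
                lanes[(road_id, lane_id)] = lane.get("type")
    return lanes


def same_direction(lane_id_a, lane_id_b):
    """Lanes of one road share a driving direction when their IDs have the same sign."""
    return lane_id_a * lane_id_b > 0


lanes = index_lanes(ODM_XML)
print(lanes)
print(same_direction(-1, -2), same_direction(1, -1))
```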

(a) OpenDRIVE lane types in urban areas. (b) Neighbor relations and driving direction by lane ID.

Figure 4.2: OpenDRIVE lanes: types, neighbor relations, and driving direction by lane ID [150].

4.2.3 Dynamic Environment Representation

The dynamic environment representation encompasses all trafc participants and additional

objects not included in the map, such as parked cars or temporary road obstacles. In the

context of AD, these representations emerge from sensors like LIDAR or cameras. Hence,

prevalent techniques in robotics and AD involve constructing a DOG (Dynamic Occupancy

Grid) or spatio-temporal grids using a DAG (Directed Acyclic Graph) to represent dynamically

changing scenarios [59, 152]. These representations establish efectiveness when input data is

available as point clouds. However, by discretizing the information into grids, no semantic

31

4.3 Method: Dynamic Scene Representation Obtained by Prior Knowledge and

Interpretation

(a) Linear Lanes. (b) Intersection Lanes.

Figure 4.3: Examples of lanes and lane geometry representations in Spider.

information remains. Furthermore, creating these dynamic grids at runtime incurs substantial computational cost [153, 154]. In the context of DS, all environmental data is already available in an object format as perfect GT without any sensor noise. Consequently, the following discussion focuses on representations derived from the object level. As illustrated in Tables 2.1 and 2.2, the ego vehicle and other traffic participants are typically characterized by their position and orientation within a global coordinate system, along with their dimensions and dynamic attributes such as velocity and acceleration. All traffic participants are classified, mostly following a simple distinction: passenger car, bus, truck, cyclist, motorcycle, pedestrian [155]. Some researchers additionally include supplementary details such as indicator or brake light status [75]. Based on this fundamental scene representation, information can be associated to extract additional context information, such as the distance or velocity relative to the vehicle driving in front [45, 44] or other neighboring vehicles [46, 13]. As discussed in Section 4.1, the process of associating information in the context of AVs is implicitly addressed through cost functions and extraction mechanisms when constructing embeddings using NNs. Meanwhile, the interpretation of information within agent models is either scenario-specific or not sufficiently described in published research.
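As a minimal sketch of how context information can be associated from such an object-level representation, the following example computes the distance and the relative longitudinal velocity between the ego vehicle and another vehicle. The `ObjectState` fields and the projection onto the ego heading are assumptions chosen for illustration; the cited works may define these quantities differently.

```python
import math
from dataclasses import dataclass


@dataclass
class ObjectState:
    """Object-level state in a global coordinate system (assumed field names)."""
    track_id: int
    obj_class: str
    x: float        # position [m]
    y: float        # position [m]
    heading: float  # orientation [rad]
    vx: float       # velocity components [m/s]
    vy: float
    length: float   # dimensions [m]
    width: float


def relative_features(ego: ObjectState, other: ObjectState):
    """Return (Euclidean distance, relative velocity along the ego heading)."""
    distance = math.hypot(other.x - ego.x, other.y - ego.y)
    ux, uy = math.cos(ego.heading), math.sin(ego.heading)
    rel_lon_velocity = (other.vx - ego.vx) * ux + (other.vy - ego.vy) * uy
    return distance, rel_lon_velocity


# Hypothetical values: a slower vehicle 20 m ahead of the ego vehicle.
ego = ObjectState(1, "car", 0.0, 0.0, 0.0, 10.0, 0.0, 4.5, 1.8)
lead = ObjectState(2, "car", 20.0, 0.0, 0.0, 8.0, 0.0, 4.5, 1.8)
distance, rel_v = relative_features(ego, lead)
```

A negative relative velocity here means that the gap to the other vehicle is closing.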

4.3 Method: Dynamic Scene Representation Obtained by Prior Knowledge and Interpretation

The following method addresses the question of how to represent complex situations in traffic. The challenge of scene representation lies in finding an appropriate level of abstraction to formulate universal solutions while retaining sufficient information to meet the requirements of functions reliant on environmental data. Schreier et al. emphasize the connection between abstraction level and generalizability, highlighting the difficulty in achieving a suitable balance [156]. A high level of informational content may introduce drawbacks in terms of runtime and complexity, as the subsequent model becomes burdened with unnecessary data, while a low level of informational content limits the solution potential of the ensuing model [156].

Given the example of data-driven prediction models, generalizability is a primary challenge. Meanwhile, humans are capable of anticipating and responding logically to novel situations, even if they have not encountered those exact circumstances before. The cognitive psychological theory of knowledge transfer can explain this ability. This theory encompasses three mechanisms: analogical transfer, constraint violation, and knowledge compilation. Analogical transfer refers to the capacity to establish a shared relational mapping from prior knowledge to handle new problems. Constraint violation centers on learning from errors, while knowledge compilation outlines the process of sequentially interpreting and optimizing rules from prior knowledge to achieve specific objectives [157].

To enable situational understanding and thus SA within a driver model, an approach for a general scene description inspired by analogical transfer and knowledge compilation is presented. To overcome the identified shortcomings in common methods, algorithms are developed to extract, associate, and interpret information for generating a context-rich representation of the scene in the format of a feature vector. The universal feature vector can be used with various modeling techniques and for different applications.

First, a generalized approach to describing complex urban traffic scenes is presented that allows different complex traffic scenarios to be linked. Subsequently, novel algorithms are developed to extract, identify, and associate all relevant information, resulting in a feature vector that describes not only the raw time series but also the inherent relationships between the ego vehicle and all surrounding traffic participants that could potentially affect the ego vehicle’s behavior.

As a POC (Proof of Concept), the idea of the general scene description is implemented for a real traffic dataset and later tested using a prediction model. Since this thesis focuses on complex urban traffic scenarios, a dataset has been chosen that aligns best with the key scenarios defined in Section 3.3.3. The selected dataset, known as inD [155], is an open-source dataset capturing a bird’s-eye view of interactive urban traffic by drones. It includes shared spaces, non-deterministic regulations, and interactions between vehicles and VRUs. The dataset contains tracks data describing the spatio-temporal movement of all road users, tracksMeta data outlining dimensions and classifications of road users, and the corresponding OpenDRIVE map for the recorded location.

Drone data holds significant potential for capturing contextual information due to its bird’s-eye perspective, which shows the entire scenario around an ego vehicle. However, this potential comes with trade-offs, as the bird’s-eye perspective may lack certain details that would be accessible in a dataset recorded from the ego vehicle perspective, such as the turn indicator states or brake lights of other vehicles. Since this thesis aims to provide transferable solutions, and since it can be assumed that AVs primarily receive information from the ego perspective, sophisticated processing algorithms are developed to compensate for such deficiencies.

4.3.1 The General Scene Description

In order to improve the level of understanding as part of SA within a driver model, a general representation of interactive traffic scenes is required. Considering the key scenarios outlined in Section 3.3.3, different interactive traffic situations, from an ego perspective, all share common essential aspects. The following characteristics of an interactive traffic scenario aim to enable a mapping between various complex situations in urban traffic:


1. The static environment:
Different areas (e.g. lanes) that are restricted to different types of traffic participants (e.g. sidewalk or driving lane) or to defined directions (e.g. left-turning lane or straight lane).

2. The relevant dynamic environment:
A subset of the surrounding road users influencing the ego’s behavior or directly interacting with the ego vehicle, in the following called PIP (Potential Interaction Partners). PIP are characterized by static and dynamic information.

3. Conflict zones:
From the ego’s perspective, there exists a spatial conflict zone with respect to each PIP, resulting from overlapping lanes or shared spaces.

4. Relationships:
Between the ego and each PIP, individual relationships can occur, for example the ROW situation.

5. Relative movement:
For each potential conflict, the movement of the involved road users (PIP and ego vehicle) relative to each other and relative to the conflict zone can be described.

To give an example, Figure 4.4 illustrates two typical but very different situations: a typical ROW-regulated turning scenario (bottom) and a situation in which a vehicle is forced to move into the oncoming traffic lane because of a static obstacle on the road (top). Both scenarios can be described on the same basis that characterizes the PIP, their inherent relationship to the ego vehicle, and their relative movement. In both situations, the ego vehicle (ID1, yellow) must give the ROW to a finite number of interaction partners, and a spatial conflict zone results from the overlap of the individual lanes or driving corridors. Thus, the exceptional situation illustrated in the upper part of Figure 4.4 can be described by the same characteristics as the typical turning maneuver. This enables situational understanding and allows both scenarios to be addressed by one method without having to formulate each situation separately, which in turn improves generalizability. To express the defined description and make it accessible to models, information must be mapped into features. Based on the components of the general scene description, five feature categories are introduced and summarized in Table 4.1. All individual features associated with these categories that characterize a scene at time $t$ are provided in the Appendix in Table A.1.
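To make the mapping from the general scene description to features more tangible, the following sketch stores one scene from an ego perspective. The container and field names are hypothetical and cover only a few semantic features. The example values are those listed for scenario b in Table 4.2, and the ego ID is taken from Figure 4.9b.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Relationship(Enum):
    """Relationship classes as they appear in Table 4.2."""
    NO_CONFLICT = "NO CONFLICT"
    EGO_GIVE_ROW = "EGO GIVE ROW"
    EGO_RECEIVE_ROW = "EGO RECEIVE ROW"


@dataclass
class PartnerEntry:
    """Partner and interaction features for one PIP (assumed subset)."""
    partner_id: int
    maneuver: str
    relationship: Relationship
    pos_ego_to_conflict: Optional[str]      # e.g. "before" or "inside"
    pos_partner_to_conflict: Optional[str]


@dataclass
class SceneDescription:
    """Scene at one time step, always seen from one ego vehicle."""
    ego_id: int
    ego_lon_velocity: float  # [m/s]
    ego_maneuver: str
    partners: List[PartnerEntry] = field(default_factory=list)

    def partners_with(self, relationship: Relationship) -> List[int]:
        """IDs of all PIP that share the given relationship with the ego."""
        return [p.partner_id for p in self.partners
                if p.relationship is relationship]


scene_b = SceneDescription(
    ego_id=415,
    ego_lon_velocity=4.20,
    ego_maneuver="LEFT TURN",
    partners=[PartnerEntry(414, "RIGHT TURN", Relationship.EGO_GIVE_ROW,
                           "before", "before")],
)
```

Because every scene uses the same structure, a regular turning maneuver and an exceptional situation differ only in the values stored, not in the format.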

4.3.2 The Data Processing Concept

A multi-level strategy is required to obtain the characteristics outlined in Table 4.1 from the data source. A toolchain is presented, beginning with the fusion of time series and map data to derive contextual details related to the vehicles’ positions. This context information yields knowledge of which lane each vehicle is driving on. Leveraging this newly acquired lane information in conjunction with information from the time series and map-based knowledge of lane connections, potential interactions between road users can be identified, and the


[Figure 4.4 panel labels: ego features; partner features; relationship & relative movements]

Figure 4.4: Illustrating the idea of scenario mapping by the general scene description. The ego vehicle (ID 1, yellow) interacts with different interaction partners, once in a typical turning maneuver and once in an exceptional situation resulting from an occupied lane.

Table 4.1: Feature information defined according to the general scene description.

CATEGORY | FEATURE INFORMATION
Ego vehicle features (E) | Classification, dimension, position, velocity, and acceleration of the ego vehicle
Map features (M) | Lane course description including turn direction, geometric information, curvature
Partner features (P) | Classification, dimension, position, velocity, and acceleration of all PIP
Interaction features (I) | Semantic and geometrical features describing the relationship and the relative movement of PIP and ego, and their movement relative to the conflict zone

above-mentioned interaction features can be calculated. In detail, the toolchain incorporates the following steps:

1. Data Fusion: Identification of the lane on which vehicles are traveling, resulting in a lane assignment for all vehicles at all time steps.

2. Plausibility Check: An automatic check of all lane assignments for logical plausibility. The reasons why plausibility checks are necessary are explained in the following section.

3. Interaction Identification & Feature Calculation: The identification of PIP and calculation of associated relationship features based on the data fusion from each ego perspective.

Please note that the scene description is always considered from the perspective of an ego vehicle, describing the ego vehicle itself, the surrounding road users, and their relationship. To remain transferable to AD, algorithms that process information about road users surrounding the ego vehicle only receive features available from this ego perspective.

4.4 Implementation

4.4.1 Implementation Step 1: Data Fusion

The data fusion assigns all vehicles, including passenger cars, motorcycles, trucks, and buses, to the respective lane they are currently driving on. Only vehicles are assigned to lanes since it can be assumed that VRUs do not strictly follow the lanes but move more freely in space. Algorithm 1 assigns the positions of all vehicle samples to the map and returns a lane_assignment for each time step, extending the original tracks file. A lane_assignment consists of one or multiple lanes defined by lane ID, section ID, section type, and the respective s/t position of the vehicle within this lane.

Because the map representation is composed of lanes leading across the intersection, undefined areas can result within the intersection area. An example is illustrated in Figure 4.5. As a result, in some cases, the RP (Reference Point) of a vehicle is positioned within such an undefined area, so that the assignment function returns no valid result. To compensate for such cases, the bounding box of the respective vehicle at time $t$ is calculated, and the edge points are used for lane assignment.
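The fallback described above can be sketched as follows. This is a simplified illustration under several assumptions: lanes are given as closed 2D polygons, the RP is the center of the bounding box, the edge points are taken to be its four corners, and the helper names (`point_in_polygon`, `bbox_corners`, `assign_lanes`) are hypothetical. The s/t position that the original assignment also returns is omitted.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, poly: Sequence[Point]) -> bool:
    """Ray casting test: count polygon edges crossed by a ray towards +x."""
    x, y = p
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lanes_at(p: Point, lane_polygons: Dict[str, Sequence[Point]]) -> List[str]:
    """All lanes whose polygon contains the point."""
    return [lane for lane, poly in lane_polygons.items()
            if point_in_polygon(p, poly)]


def bbox_corners(center: Point, heading: float,
                 length: float, width: float) -> List[Point]:
    """Corners of the oriented bounding box around the reference point."""
    cx, cy = center
    c, s = math.cos(heading), math.sin(heading)
    corners = []
    for fl, fw in ((0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)):
        lx, ly = fl * length, fw * width
        corners.append((cx + lx * c - ly * s, cy + lx * s + ly * c))
    return corners


def assign_lanes(center: Point, heading: float, length: float, width: float,
                 lane_polygons: Dict[str, Sequence[Point]]) -> List[str]:
    """Assign by reference point first; fall back to the bounding box corners."""
    hits = lanes_at(center, lane_polygons)
    if hits:
        return hits
    found: List[str] = []
    for corner in bbox_corners(center, heading, length, width):
        for lane in lanes_at(corner, lane_polygons):
            if lane not in found:
                found.append(lane)
    return found


# Hypothetical map: two lanes with an undefined strip between y = 3 and y = 4.
lane_polygons = {
    "A": [(0.0, 0.0), (10.0, 0.0), (10.0, 3.0), (0.0, 3.0)],
    "B": [(0.0, 4.0), (10.0, 4.0), (10.0, 7.0), (0.0, 7.0)],
}
in_lane = assign_lanes((5.0, 1.0), 0.0, 4.0, 2.0, lane_polygons)
in_gap = assign_lanes((5.0, 3.5), 0.0, 4.0, 2.0, lane_polygons)
```

In the second call, the reference point lies in the undefined strip, so the result comes from the corners and contains both neighboring lanes. Such multiple assignments are then resolved by the plausibility check.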

Figure 4.5: Example of undefined areas within intersections. Recording location Heckstrasse, inD dataset [155].


Algorithm 1 Initial Data Fusion Algorithm

Data: tracks, tracksMeta, map
Result: tracks_ex file with an additional column physical_assignment

for vehi in vehicles do
    for t in time do
        phy_ass = get_assignment(pos_vehi^t, map)
        // physical assignments are a structured numpy array containing the lanes
        // specified by lane ID, section ID, section type, and the s position of the vehicle
        if length(phy_ass) > 0 then
            tracks_ex['physical_assignment'][row] = phy_ass
            next
        else
            // if no physical assignment could be found, the bounding box of the
            // vehicle is calculated, and the edges are assigned to the map
            bbox = get_bbox(vehi, t)
            phy_ass_bbox = [ ]
            for edge_i in bbox do
                phy_ass_edge = get_assignment(pos_edge_i, map)
                phy_ass_bbox[i] = phy_ass_edge
            tracks_ex['physical_assignment'][row] = phy_ass_bbox
            next
return tracks_ex

4.4.2 Implementation Step 2: Plausibility Check

In order to obtain situational interpretations, e.g. who is interacting with whom and which relationship they share, a clear and unambiguous assignment of vehicles to a specific lane is required. For a number of reasons, the data fusion can lead to ambiguous or implausible results. Ambiguous results occur, for example, when a vehicle is located in the intersection, where multiple lanes overlap. The function get_assignment() then returns multiple lanes originating from various predecessor lanes entering the intersection and leading to different destination lanes, as illustrated by the example in Figure 4.6 (left). However, only a subset of the assigned lanes is plausible with respect to the spatio-temporal motion of the vehicle. Therefore, information from the past time series is used to check whether the motion of the vehicle matches the origin of each assigned lane. Since multiple lanes can still have the same origin, such as the yellow and orange lanes in Figure 4.6 (left), the check_plausibility() function additionally compares the direction of the lane with the heading of the vehicle. By applying a threshold when comparing heading and lane direction, one lane is identified as logically valid. When these algorithms are applied online to detect other road users from the ego perspective, the identified lane that another vehicle is following while located in the intersection area can be confirmed by the indicator status of the respective vehicle if this signal is available. When the algorithm is used offline to process data, such as preparing a database for training a prediction model or for data analysis, future information about the vehicle’s route can be used to distinguish between logical and physical lanes more precisely in such cases.
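The comparison of vehicle heading and lane direction can be sketched as below. The threshold of 45° is purely an assumption for illustration, since no value is stated in the text. The helper `select_unique_lane` is hypothetical, and the origin check against the past time series is not included.

```python
import math
from typing import Dict, Optional


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles [rad]."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)


def check_plausibility(heading: float, lane_direction: float,
                       threshold: float = math.radians(45.0)) -> bool:
    """True if heading and lane direction agree within the (assumed) threshold."""
    return angle_diff(heading, lane_direction) <= threshold


def select_unique_lane(heading: float, lane_directions: Dict[str, float],
                       threshold: float = math.radians(45.0)) -> Optional[str]:
    """Among overlapping candidate lanes, keep the best match within the threshold."""
    diffs = {lane: angle_diff(heading, direction)
             for lane, direction in lane_directions.items()}
    plausible = {lane: d for lane, d in diffs.items() if d <= threshold}
    if not plausible:
        return None
    return min(plausible, key=plausible.get)


# Hypothetical candidates at one position inside an intersection.
candidates = {
    "straight": math.radians(90.0),
    "left": math.radians(170.0),
    "right": math.radians(10.0),
}
chosen = select_unique_lane(math.radians(80.0), candidates)
```

The modulo-based difference handles the wrap-around at 0°/360°, so that headings of 350° and 10° are treated as 20° apart.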

In other situations, the lane assignment process can return a lane that is wrong from a logical point of view. The right part of Figure 4.6 shows an example of a vehicle overtaking a cyclist. The


Figure 4.6: Example of ambiguous (left) and logically incorrect (right) lane assignments.

vehicle is moving with its RP in the oncoming lane (ID2) but logically follows the neighboring lane (ID3). The vehicle would have to give way to the oncoming traffic in the orange lane (ID2) and receives the ROW from vehicles approaching the intersection in the blue lane (ID1). When using the lane that the ego vehicle is physically driving in (ID2), misleading conclusions would be drawn about the relationship to other road users, such as giving way to participants in the blue lane (ID1). Similar cases occasionally occur on bidirectional roads without lane markings to separate the lanes. There, many vehicles tend to drive closer to the center of the road, which also leads to a physical assignment to the oncoming lane. Figure 4.7 provides some real-world examples of these typical occurrences. To address such implausibilities, all lane assignments are separated into a logical and a physical level in order to derive plausible conclusions regarding interactions and relationships in various situations. For this purpose, vehicle headings and lane directions are again compared for all assignments. If the direction of the assigned lane does not match the vehicle’s heading, neighboring lanes are considered. The result of the plausibility check (Algorithm 2) is an extension of the tracks data with logical and physical lane assignments for all vehicles.
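The separation into a logical and a physical level can be illustrated with a small sketch of a combined assignment. The function name `summarize_assignment` and the three labels are hypothetical. The example mirrors the overtaking case of Figure 4.6 (right), in which the vehicle is physically in lane ID2 but logically follows lane ID3.

```python
from typing import Dict, List


def summarize_assignment(physical: List[str], logical: List[str]) -> Dict[str, str]:
    """Label each assigned lane as physical, logical, or both."""
    summary: Dict[str, str] = {}
    for lane in physical:
        summary[lane] = ("physical_and_logical" if lane in logical
                         else "physical_only")
    for lane in logical:
        summary.setdefault(lane, "logical_only")
    return summary


overtaking = summarize_assignment(physical=["ID2"], logical=["ID3"])
regular = summarize_assignment(physical=["ID1"], logical=["ID1"])
```

Keeping both levels allows relationships to be derived from the logical lane while the physical lane remains available, for instance for geometric features.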

4.4.3 Implementation Step 3: Interaction Identification

When identifying interactions between road users, a distinction is made between vehicles and VRUs, since it can be assumed that vehicles follow lanes, while pedestrians or bicyclists generally tend to move more freely in space. To identify vehicles with potential conflicts, the relationship between different lanes is used. With the knowledge from the map, it can be determined which lanes have potential conflicts with the current ego lane and what relationship these lanes share. The conflict zone can be calculated from the spatial overlap between the respective lanes. Therefore, for each ego vehicle perspective, all potential conflict lanes at time $t$ can be identified as the subset $C\_LANES = \{c\_lane_1, c\_lane_2, \ldots, c\_lane_i\}$ of other lanes that have a spatial overlap or potential conflicts with the ego lane. Potentially interacting vehicles $P^{t}_{VEH}$ can then be identified based on all vehicles present at time $t$, their lane assignment, and the subset of relevant lanes from the ego perspective, $C\_LANES$. Vehicles without any relevant relation to the ego are not further considered for the scene representation.
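A minimal sketch of this lane-based selection is given below. It assumes that the conflict lanes and their relations to the ego lane have already been derived from the map and are available as a dictionary. The function name, lane names, vehicle IDs, and relation labels are hypothetical.

```python
from typing import Dict, List, Tuple


def identify_vehicle_pips(
    ego_lane: str,
    conflict_map: Dict[str, Dict[str, str]],
    lane_assignments: Dict[int, List[str]],
) -> List[Tuple[int, str]]:
    """Return (vehicle ID, relation) for vehicles on a conflict lane of the ego lane."""
    c_lanes = conflict_map.get(ego_lane, {})
    partners = []
    for vehicle_id, lanes in lane_assignments.items():
        for lane in lanes:
            if lane in c_lanes:
                partners.append((vehicle_id, c_lanes[lane]))
                break  # one relation per vehicle is sufficient in this sketch
    return partners


# Hypothetical map knowledge: conflict lanes of "ego_lane" and their relation.
conflict_map = {
    "ego_lane": {"lane_a": "EGO GIVE ROW", "lane_b": "EGO RECEIVE ROW"},
}
# Hypothetical lane assignments of all vehicles present at time t.
lane_assignments = {11: ["lane_a"], 12: ["lane_b"], 13: ["lane_x"]}
partners = identify_vehicle_pips("ego_lane", conflict_map, lane_assignments)
```

Vehicle 13 drives on a lane without any relation to the ego lane and is therefore dropped from the scene representation.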

Since VRUs move more freely in space, other criteria for identifying potential conflicts are required. To evaluate a VRU as a potential conflict partner, its distance and


Algorithm 2 Plausibility Check Algorithm to get unique and reasonable lane assignment

Data: Existing dataframe tracks_ex, new column logical_assignment, tracksMeta, map file
Result: tracks_ex file with two additional columns: logical_assignment, summary_assignment

foreach row in tracks_ex do
    vehi = row['trackId']
    t = row['frame']
    phy_ass = row['physical_assignment']
    // physical assignments is a structured numpy array containing the lanes
    // specified by lane id, section id, section type, and the s position of the vehicle
    if length(phy_ass) == 1 then
        lane = phy_ass[0]
        lane_direction = get_lane_dir(lane)
        check = check_plausibility(heading_vehi^t, lane_direction)
        // check_plausibility() compares lane direction and vehicle heading
        if check == TRUE then
            tracks_ex['logical_assignment'][row] = lane
        else
            new_lane = check_neighbours_lanes(pos_vehi^t, lane, map)
            // check_neighbours_lanes() searches for all neighbour lanes within
            // the same section and compares vehicle heading
            tracks_ex['logical_assignment'][row] = new_lane
    else if length(phy_ass) > 1 then
        for lane in phy_ass do
            unique_lane = check_plausibility_multi(heading_vehi^t, lanes)
            // check_plausibility_multi() compares lane directions and vehicle
            // heading as well as lane origins and destinations to extract a unique
            // and logically valid lane assignment
        tracks_ex['logical_assignment'][row] = unique_lane
    sum_lane_ass = get_sum_lanes(tracks_ex['logical_assignment', 'physical_assignment'][row])
    // get_sum_lanes() creates a structured numpy array of the assigned lanes with an
    // additional feature indicating whether the lane is physically and logically valid,
    // only physically valid, or only logically valid
    tracks_ex['summary_assignment'][row] = sum_lane_ass
return tracks_ex


[Figure 4.7 panels; legend: car, bicycle, ego trajectory, ego position, partner trajectory, partner position]

(a) Bendplatz: Example of a vehicle moving into the oncoming lane to pass a cyclist.

(b) Frankenburg: Example of a vehicle driving with its RP in the oncoming traffic lane on a two-way road without lane markings.

Figure 4.7: Real-world examples causing implausible lane assignments.

heading relative to the assigned lane of the ego vehicle and the respective successor lanes following the ego route $r_{ego}$ are considered. If the VRU is situated in or moving towards one of these lanes, it is identified as a PIP. Whether the VRU is moving towards a relevant lane is determined by checking the overlap of the lane polygons and a defined motion polygon representing the VRU’s motion with some uncertainty range, as illustrated in Figure 4.8. The motion polygon is initialized by the length $s$ and the angle $\gamma$. The length $s$ is set to 7 m for pedestrians and 14 m for cyclists. The value of $s$ for pedestrians is based on the assumption that pedestrians have an average walking velocity of 1.4 m/s [158], resulting in 7 m for a time horizon of 5 s. The angle $\gamma$ is set to 20° for pedestrians and 10° for cyclists. The angle and length for cyclists and the angle for pedestrian motion polygons were determined empirically by examining the dataset. Additionally, VRUs whose Euclidean distance $e$ to the ego is smaller than a certain threshold are identified as PIP, as it is assumed that ego vehicle behavior would still be affected even without a specific relationship. The threshold for the Euclidean distance $e$ was set to 5 m. Following this approach, in the example scenario shown in Figure 4.8, the VRUs with ID1 and ID3 would be identified as PIP, but not the VRU with ID2. The result of the identification (Algorithm 3) is a list of PIP and their respective relationship to the ego vehicle. The number of PIP that can be recognized by these algorithms is not limited. However, for some applications, such as preparing data for training a NN, the number should be limited because such models require a fixed number of input features. Based on the identified PIP from an ego perspective, all information categories with associated features, introduced in Table A.1, can be calculated. In order to describe the interaction between the ego vehicle and another road user, interaction features (I) characterize the relative spatio-temporal movement between ego and partner as well as the individual motion in relation to the conflict zone. This information is augmented with the individual relationship between the road users. Table A.1 provides a summary of all features characterizing one sample describing the scene representation from one individual ego perspective, with associated descriptions, features, and categories as well as units. The presented algorithms can either be used to calculate those features across an entire dataset or online at runtime to generate a scene representation for modeling behavior, since no future information is required in any algorithm.
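Where a model requires a fixed input size, the list of PIP has to be truncated or padded. The sketch below shows one possible way to do this. Ranking the partners by their distance to the ego vehicle, the padding value, and the accompanying mask are assumptions for illustration and are not prescribed by the method.

```python
from typing import List, Sequence, Tuple


def to_fixed_size(
    partners: Sequence[Tuple[float, List[float]]],
    max_partners: int,
    n_features: int,
    pad_value: float = 0.0,
) -> Tuple[List[float], List[int]]:
    """Flatten (distance, features) entries into a fixed-length vector.

    The nearest partners are kept; empty slots are filled with pad_value.
    The mask marks which slots contain a real partner (1) or padding (0).
    """
    nearest = sorted(partners, key=lambda p: p[0])[:max_partners]
    vector: List[float] = []
    mask: List[int] = []
    for _, features in nearest:
        if len(features) != n_features:
            raise ValueError("unexpected number of features per partner")
        vector.extend(features)
        mask.append(1)
    missing = max_partners - len(nearest)
    vector.extend([pad_value] * (missing * n_features))
    mask.extend([0] * missing)
    return vector, mask


# Hypothetical partners: (distance to ego [m], partner feature list).
pips = [(12.0, [1.0, 2.0]), (3.0, [5.0, 6.0])]
vector, mask = to_fixed_size(pips, max_partners=3, n_features=2)
```

The mask lets a downstream model distinguish padded slots from partners whose features happen to be zero.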

[Figure 4.8 labels: e, γ, d]

Figure 4.8: The concept for identifying potential interactions with VRUs. The yellow vehicle represents the ego vehicle facing a scene with two pedestrians and a cyclist. Identification is based on the relevant ego lanes (red), the motion polygons (blue), and the Euclidean distance e.
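The VRU criteria shown in Figure 4.8 can be sketched as follows, using the lengths, angles, and distance threshold stated in the text. Several simplifications are assumptions made here: the motion polygon is modeled as a triangle opened by ±γ around the VRU heading, the lane polygons are convex so that a separating axis test suffices, and all function names are hypothetical. Because the triangle starts at the VRU position, a VRU located inside a relevant lane is covered by the same overlap test.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Values stated in the text: length s [m] and angle gamma [deg] per VRU class.
MOTION_PARAMS = {"pedestrian": (7.0, 20.0), "cyclist": (14.0, 10.0)}
MIN_DISTANCE = 5.0  # threshold for the Euclidean distance e [m]


def motion_polygon(pos: Point, heading: float,
                   s: float, gamma_deg: float) -> List[Point]:
    """Triangle from the VRU position, opened by +/- gamma around the heading."""
    g = math.radians(gamma_deg)
    x, y = pos
    return [
        (x, y),
        (x + s * math.cos(heading - g), y + s * math.sin(heading - g)),
        (x + s * math.cos(heading + g), y + s * math.sin(heading + g)),
    ]


def _project(poly: Sequence[Point], axis: Point) -> Tuple[float, float]:
    dots = [px * axis[0] + py * axis[1] for px, py in poly]
    return min(dots), max(dots)


def convex_overlap(a: Sequence[Point], b: Sequence[Point]) -> bool:
    """Separating axis test; only valid for convex polygons."""
    for poly in (a, b):
        for i in range(len(poly)):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
            axis = (y1 - y2, x2 - x1)  # normal of the current edge
            a_min, a_max = _project(a, axis)
            b_min, b_max = _project(b, axis)
            if a_max < b_min or b_max < a_min:
                return False  # a separating axis exists
    return True


def is_vru_pip(ego_pos: Point, vru_pos: Point, vru_heading: float,
               vru_class: str, ego_lane_polygons: Sequence[Sequence[Point]]) -> bool:
    """VRU is a PIP if it is in or moving towards an ego lane, or close to the ego."""
    s, gamma = MOTION_PARAMS[vru_class]
    polygon = motion_polygon(vru_pos, vru_heading, s, gamma)
    in_or_towards = any(convex_overlap(polygon, lane) for lane in ego_lane_polygons)
    close = math.hypot(vru_pos[0] - ego_pos[0], vru_pos[1] - ego_pos[1]) < MIN_DISTANCE
    return in_or_towards or close


# Hypothetical scene: one straight ego lane, ego vehicle at its start.
ego_lanes = [[(0.0, 0.0), (20.0, 0.0), (20.0, 3.0), (0.0, 3.0)]]
ego_pos = (0.0, 1.5)
towards = is_vru_pip(ego_pos, (10.0, 8.0), -math.pi / 2, "pedestrian", ego_lanes)
away = is_vru_pip(ego_pos, (10.0, 8.0), math.pi / 2, "pedestrian", ego_lanes)
nearby = is_vru_pip(ego_pos, (2.0, 5.0), math.pi / 2, "pedestrian", ego_lanes)
```

For non-convex lane polygons, a general polygon intersection routine would be needed instead of the separating axis test.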


Algorithm 3 Interaction Identification Algorithm

Data: id (ego vehicle id), r_ego (ego vehicle route as a series of lanes), t (time frame), tracks_ex with logical and physical assignment of all vehicles, map file
Result: C_PARTNERS (list of interaction partners with their respective relationship to the ego vehicle)

C_LANES = get_conflict_lanes(lane_ego, map)
// get_conflict_lanes() returns a series of conflict lanes that have potential
// conflicts with the lane(s) the ego vehicle is currently assigned to
C_PARTNERS = [ ]
vehicles = tracks_ex[(type == vehicle) & (frame == t)]
VRUs = tracks_ex[(type == VRU) & (frame == t)]
for vehi in vehicles do
    lanes_vehi = get_sum_assignment(vehi, t)
    if any(lanes_vehi) in C_LANES then
        lane_relation = get_lane_relation(vehi, C_LANES)
        // the relation between vehicles equals the lane relation such as receive the
        // right of way, give the right of way, or merge
        vehi, lane_relation ⇒ C_PARTNERS
for vru_i in VRUs do
    in_ego_lane, towards_ego_lane, check_min_dist = get_VRU_check(pos_ego, r_ego, pos_vru_i, γ_vru_i, r_ego)
    // get_VRU_check() checks if any present VRU is in or moves towards
    // a relevant lane from the ego perspective
    if any[in_ego_lane, towards_ego_lane, check_min_dist] then
        // for the relation to VRUs, currently only one class called 'VRU_Interaction'
        // is implemented since there are no pedestrian crossings on the intersections
        vru_i, VRU_relation ⇒ C_PARTNERS
return C_PARTNERS


4.5 Results

The proposed concept provides algorithms for describing complex and interactive traffic scenes through model-accessible features containing semantic and non-semantic information. Figure 4.9 shows some visual results of the algorithms applied to situations at different locations from the inD dataset. First of all, one can observe the benefit of interaction detection: in the scenario shown in Figure 4.9a, modern interaction identification approaches would lead to misleading results. Using a simple distance measure would only identify a subset of the relevant road users, while applying a larger radius would include a large amount of information on participants that do not affect the ego vehicle, such as the pedestrian with ID294. The typical neighborhood approach would also not provide a plausible result because interactions with vehicle ID300 or ID299 could not be identified. The presented algorithms identified the vehicle with ID293 and the pedestrians with ID294 and ID282 as not being relevant for the ego vehicle. Also, in the scenario shown in Figure 4.9b, all road users except the car with ID414 were identified as not relevant to the ego vehicle. Figures 4.9c and 4.9d show the same scenario from two different ego perspectives, showing that the interaction identification works across various scenarios and intersection typologies by detecting relevant road users.

In Figure 4.9e, the benefit of the identification algorithm compared to modern interaction identification approaches is demonstrated again. When simply employing a distance measure, bus ID46 would be identified as relevant even though it is turning left, while the presented algorithms only identify car ID50 as a potential interaction partner through a following maneuver.

Based on the knowledge of which road users are relevant from an individual ego perspective and the associated lane information, the situational context can be described by the features summarized in Table A.1. To give an example, Table 4.2 provides some of the semantic features describing the interactions shown in Figures 4.9a and 4.9b. The features describe the relationship, the current maneuver, and the positioning relative to the conflict zone of the ego vehicle and its interaction partners. In scenario a (Figure 4.9a), the ego vehicle interacts with three other vehicles. Vehicle ID284 is in front of the ego vehicle, so it affects the ego vehicle, but there is no conflict in their relationship. Meanwhile, the ego vehicle has to give the ROW to vehicles ID299 and ID300. For the interaction with vehicle ID300, ego and partner are still before the conflict zone, while partner vehicle ID299 has already entered the conflict zone.

The results show that the proposed algorithms make it possible to provide situational understanding in a model by describing the context. Such context features can be used for enhancing data-driven prediction, refining planning or heuristic decision models, or categorizing and identifying situations for data analysis.

The results show a potential for more understanding within a driver model, but the actual added value of the proposed scene representation compared to current solutions should be investigated quantitatively. In general, a representation cannot be validated independently of an application due to the absence of GT. Therefore, the extent to which the proposed representation and individual feature categories provide added value with respect to SA is investigated in the following chapter using a prediction model. The underlying assumption is that high prediction accuracy and the ability of a model to transfer learned patterns to new situations are signs of a pronounced situational understanding. The method of evaluating a representation with the help of a prediction model is inspired by research investigating the effectiveness of new representation methods using a prediction model for traffic flow prediction [159].

Table 4.2: Semantic features describing scenarios a and b shown in Figures 4.9a and 4.9b.

Feature | Scenario a | Scenario b
Lon. velocity Ego [m/s] | -0.01 | 4.20
Maneuver Ego | LEFT TURN | LEFT TURN
Partner | ID: 284 | ID: 414
Relationship | NO CONFLICT | EGO GIVE ROW
Pos. ego to conflict | inside | before
Pos. partner to conflict | - | before
Maneuver partner | STRAIGHT | RIGHT TURN
Partner | ID: 299 |
Relationship | EGO GIVE ROW |
Pos. ego to conflict | before |
Pos. partner to conflict | inside |
Maneuver partner | LEFT TURN |
Partner | ID: 300 |
Relationship | EGO RECEIVE ROW |
Pos. ego to conflict | before |
Pos. partner to conflict | before |
Maneuver partner | STRAIGHT |

4.6 Limitations and Future Work

The proposed idea of enhancing SA by a content-rich scene representation is applied to an open-source dataset as a POC. The implemented algorithms are limited to ROW-controlled intersections based on the key scenarios defined in this thesis. However, the presented heuristics can be easily extended to cover additional scenarios, such as zebra crossings or signalized intersections. For example, to include signalized intersections, the algorithms for determining relationships from the map must incorporate the traffic light status and an association of traffic lights to lanes. To cover typical following and lane-changing scenarios on highway-like roads, such as multi-lane sections, additional categories and heuristics for relationships and relative motion must be included. Therefore, the idea of the general scene description presented here can be extended both in the number of traffic participants to be considered and in the type of relationships that occur.

Furthermore, the level of detail describing the environment is limited to input information typically available to agent models in simulation or common object information provided by the sensors of AVs. As outlined in Chapter 3, communication in traffic plays a crucial role and can be divided into implicit and explicit communication, whereby the implicit level describes the spatio-temporal movement of road users, and the explicit level involves active gestures [130]. Due to data availability, the proposed algorithms only cover the implicit level of communication conveyed through motion. Therefore, to cover all relevant levels of communication in traffic, future research should address the challenge of how to capture and process signals such as gestures and gaze directions, especially of VRUs.

The POC implementation is applied to data obtained from the bird’s eye perspective and the


(a) Bendplatz (rec:14 frame:19181): Ego id: 285. Potential interaction: pedestrian id 290, pedestrian id 296, pedestrian id 297, car id 299, car id 300, car id 284. No interaction: car id 293, pedestrian id 294, pedestrian id 282.

(b) Frankenburg (rec:23 frame:21308): Ego id: 415. Potential interaction: car id 414. No interaction: pedestrian ids 392, 406, 408, 410, 412, 413.

(c) Aseag (rec:01 frame:12558): Ego id: 176. Potential interaction: car ids 175, 177, 178, 180. No interaction: truck_bus id 166, car id 179.

(d) Aseag (rec:01 frame:12558): Ego id: 180. Potential interaction: car id 175, truck_bus id 176, car id 177, car id 178. No interaction: truck_bus id 166, car id 179.

(e) Heckstrasse (rec:32 frame:2571): Ego id: 47. Potential interaction: car id 50. No interaction: truck_bus id 45, truck_bus id 46, car id 48, car id 49.

Figure 4.9: Illustration of different interactive situations of the inD dataset at the four different locations. The ego vehicle is always drawn in red. The solid lines show the path driven so far, the dotted lines represent the future path, and the circles mark the current position.

45

4.7 Conclusion

corresponding map. It can be assumed that a AV primarily collects information from an ego

perspective. However, it is expected that sensors on an AV would also b e able to provide the

input information used in the presented algorithms, such as the position or velocity of other

road users. By applying common tracking and detection algorithms, the dynamic information

of other road users would be accessible, and the use of such map information is also common

in AV development.

Heuristics are applied to the data to identify interactions between road users. Using such

heuristic algorithms induces more complexity in the data processing and is associated with

certain limitations. First, no heuristic is able to cover the entire range of potential combinations

arising from multi-participant interactions in real traffic. By iteratively developing and

extending the algorithms, the most important situations observed in real traffic data are

covered by the presented concept. However, since manual scanning of hours of data would

have exceeded the resources of this thesis, some blind spots might remain.

Lastly, semantic features that discretize individual scene information, such as the positioning

relative to the conflict zone, limit the continuity of the information, and details get lost.

However, a particular level of discretization to categorize situations is required to enable

knowledge transfer and the association of comparable situations.

4.7 Conclusion

In summary, the proposed method for generating a sophisticated and transferable scene

description applicable to various urban traffic scenarios shows high potential. Contextual

information is generated by fusing time series and map data, allowing SA in a model or

targeted data analyses.

Furthermore, state-of-the-art approaches for identifying relevant road users from the perspective

of an ego vehicle show particular weaknesses in such complex situations. The presented

algorithms for interaction detection address this problem and offer a promising solution for

distinguishing between road users that potentially influence the ego vehicle and those that can

be excluded.

Driving behavior in urban environments is strongly influenced by various external factors,

such as ROW context, road topology, and reactions of other traffic participants. Therefore,

the semantic context information and interaction identification provided by the fusion

algorithms generate an indispensable basis for meaningful and reliable modeling and evaluation

approaches.

5 Anticipating the Intention of Other Road Users

Disclaimer: The present chapter involves research presented in the following publications:

[142]: Teresa Rock et al. “Data-Driven Prediction of Other Road Users’ Intention for Better Scene Understanding

in Traffic Agents”. In: Proceedings of the Driving Simulation Conference 2022 Europe VR. ed. by Andras Kemeny,

Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation Association. Strasbourg, France, Sept. 15, 2022,

pp. 9–16.

[160]: Teresa Rock et al. “On the Way to Reliable Trajectory Prediction in Urban Traffic”. Advances in Transdisciplinary

Engineering 2023, Publication in progress.

The following chapter addresses research question R2: How to enable general and

transparent predictions in complex urban traffic?

The chapter is organized as follows. A short introduction illustrating the motivation for general

and transparent prediction models is followed by an overview of state-of-the-art solutions for

trajectory prediction in traffic. Subsequently, a two-fold methodology part is presented, first

aiming at evaluating the previously introduced scene representation by a model’s ability to

generalize to new situations. Based on insights obtained from the first method, a framework

for enabling the development of generalizable prediction models is proposed in the second

method part.

5.1 Introduction and Motivation

From a psychological point of view, anticipation is discussed in terms of SA for generating a

certain level of comprehension [129, 136] and is, therefore, the basis for enabling interactive and

spatio-temporally consistent behavior as discussed in Section 3.3.4. Furthermore, anticipating

the intention of other road users is a frequently addressed challenge in AD to provide a safe

and reasonable driving strategy [161]. However, generating reliable predictions given the

complexity and diversity of urban traffic remains an open challenge due to issues related to

transferability, overfitting of data-driven approaches, or insufficient transparency of black-box

models [81, 50].

For this reason, the challenges posed by the lack of transparency and transferability of

prediction models are addressed in the present chapter. Two methods are presented that

aim to create more transparency about learned patterns and remaining model weaknesses.

The first method evaluates the scene representation introduced in Chapter 4 with the aim

of improving the generalizability of a model by increasing situational understanding through

the more sophisticated representation of the driving scene, including semantic information of

interactions with other road users. In the second part, the effects of various conceptual modeling

choices on generalizability, starting with training data and ending with hyperparameters, are

investigated. Generalizability is measured by a novel evaluation methodology aiming at

generating transparency for efficient and reliable solutions.

5.2 State-Of-The-Art: Anticipation of Vehicle’s Intentions

In recent years, a major focus in research has been on developing prediction approaches for

highway scenarios. Due to the complexity, only a limited number of researchers have specifically

investigated urban traffic scenarios [74, 83, 162]. Especially in complex urban traffic, prediction

is difficult because several road users interact and their behavior is influenced by various factors.

In order to provide a comprehensive overview, state-of-the-art solutions are distinguished by

different factors: the prediction method, the discretization of the behavior (model output),

and the considered input information associated with respective model structures.

5.2.1 Prediction Methods

Regarding the prediction method, Leon et al. distinguish between model-based and data-driven

prediction [68].

Model-based approaches rely on knowledge, more specifically on physical dependencies and

observable spatio-temporal relations. By observing vehicle dynamics over time, possible

maneuvers can be determined and their probability computed. The identified maneuver can

then be modeled with the respective motion model to predict a trajectory [163, 79].

Model-based methods perform well in short-term predictions, as physical measurements are

good indicators for motion patterns. However, the models are not able to make reasonable long-

term predictions since the resulting behavior relies on more complex relationships, including

spatio-temporal decisions of the driver, such as passing or yielding.
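The short-term strength and long-term weakness of physics-based prediction can be illustrated with the simplest possible motion model. The following is a minimal sketch of a constant-velocity rollout; the function name and its parameters are hypothetical and are not taken from the cited works.

```python
def constant_velocity_rollout(x, y, vx, vy, horizon_s=5.0, step_s=1.0):
    """Predict future (x, y) positions assuming the velocity stays constant."""
    steps = int(round(horizon_s / step_s))
    return [(x + vx * step_s * k, y + vy * step_s * k) for k in range(1, steps + 1)]


# A vehicle at the origin driving 5 m/s along the x-axis.
prediction = constant_velocity_rollout(x=0.0, y=0.0, vx=5.0, vy=0.0)
print(prediction)
```

Such a rollout cannot represent a driver who decides to yield or to pass, which is why its error is expected to grow with the prediction horizon.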

Data-driven methods in this context are mostly based on black-box models that are inspired

by cognitive learning structures, such as NNs. Since model-based approaches are not suitable

regarding the complexity of urban traffic, the following sections focus on data-driven approaches

employing more promising methods. Table 5.1 provides an overview of the introduced categories

and related publications, which are discussed in more detail below.

5.2.2 Behavioral Discretization

Next to the prediction method, one can distinguish between different levels of behavioral

discretization into predictable variables. Some approaches predict the intention of other

road users as maneuvers [74] or actions [73]. In comparison, other researchers intend to

predict the trajectory of other traffic participants by using their position at certain time

steps in the future as a label directly [72, 75, 77, 79]. Furthermore, prediction can either

be performed deterministically by outputting one prediction result [164, 165] or probabilistically,

providing a trajectory and a range of uncertainty [79, 166].

5.2.3 Influence Factors and Model Structures

Multiple influences may affect driving behavior in urban traffic. Thus, different approaches for

incorporating contextual information into prediction exist. Since trajectory prediction is mostly

formulated as a sequential problem, several approaches utilize recurrent network structures,

such as RNN (Recurrent Neural Network) or LSTM (Long Short-Term Memory) [72, 77, 78].

These networks derive one’s future motion based on past temporal motion sequences.

Other promising concepts involve graph-based models such as a GNN (Graph Neural Network) or

a GCN (Graph Convolutional Network), as these structures offer great potential in representing

spatial dependencies between road users, offer the possibility of handling dynamic input sizes,

and are suitable to predict the entire scene development instead of predicting each road user

individually [167, 168].

Often the problem of trajectory prediction is distributed among different types of NNs that

are combined, such as an MLP (Multi-layer Perceptron), a GNN, and an LSTM, into one large

prediction system [79, 167, 78]. Incorporating contextual information into the prediction

is especially relevant due to various influence factors affecting behavior. Concepts vary in

terms of the information provided (e.g., static environment information on the map) and

the format in which this information is provided (on a semantic level, as raw data, or as

an embedding). Contextual influences, for example, describe the static environment, since

spatial movement in urban traffic can be highly dependent on the street layout. There are

different approaches to represent street maps on a feature level, including latent representations

obtained from a CNN or a VAE (Variational Autoencoder). Mo et al., for example, use a

CNN encoder to calculate an embedded representation of the map [77], whereas Schulz et al.

use directly interpretable features that describe the lane a vehicle is currently following by its width

and curvature [76]. As already mentioned, data-driven prediction models offer great potential

for such complex prediction tasks but lack explainability and generalizability. Therefore,

some researchers combine the two worlds of model-based and data-driven prediction by adding

prior knowledge to the learning process. Bahari et al. list different possibilities for injecting

knowledge, for example, by including vehicle dynamic models during learning to force the

model to predict only kinematically feasible trajectories [79].

Table 5.1: Summary and categorization of the state-of-the-art approaches to vehicle motion prediction.

Category | Approach | References
Model architecture | Convolutional Networks: CNN | [169]
Model architecture | Recurrent Network Structures: RNN, LSTM | [170, 171, 166, 172, 72, 173, 174]
Model architecture | Graph-based Structures: GNN, GCN | [164, 175, 176]
Model architecture | TN (Transformer Network) | [177, 178, 179, 180, 181, 165]
Model architecture | Autoencoders: VAE | [182, 183]
Model architecture | Multi-model frameworks: LSTM-GNN, TN-MLP, GNN-RNN, LSTM-MLP | [184, 185, 84, 186, 187]
Input information | semantic representations for static environment | [188, 176, 169, 183, 76]
Input information | raw representations of static environment or embedding | [182, 170, 166, 189, 179, 186, 180, 190, 172, 77]
Input information | raw context information (position, dynamics of other road users) | [184, 164, 185, 175, 177, 191, 178, 72, 165]
Input information | semantic context information (interaction partners, relationship) | [142, 84, 173, 174]
Model output | maneuver prediction | [175, 74]
Model output | action prediction | [73]
Model output | deterministic trajectory prediction | [164, 185, 175, 177, 178, 186, 180, 72, 173, 174, 165]
Model output | probabilistic trajectory prediction | [79, 182, 170, 188, 166, 189, 191, 179, 84, 190, 169, 172, 183]

5.2.4 Model Evaluation Strategies

The evaluation of such data-driven models is a crucial part of the development, as black-box

models face problems associated with overfitting and weak generalizability while offering a low

level of transparency.

Overfitting describes the phenomenon when a model learns the patterns presented in a training

dataset so well that it negatively affects the model’s ability to generalize. The phenomenon is

more likely when learning a loss function from complex non-parametric statistical dependencies,

such as used in NNs [192]. Overfitting, accuracy, and interpretability are related to

each other and to the level of complexity of the model [192]. More complex models tend to overfit

more and show less interpretability while providing a higher potential of accurate predictions

when facing complex problems [192]. Depending on the prediction problem to be solved, the

representativeness of the training data, and the modeling approach, an appropriate balance

between generalizability, interpretability, and accuracy must be found. Different metrics can

be applied to quantify model accuracy in terms of trajectory prediction. Table 5.2 shows a

selection of employed state-of-the-art metrics for evaluation. The most common metrics for

evaluating trajectory prediction are ADE and FDE, measured as the L2 distance between

the true and the predicted trajectory [193]. Some approaches employ variations of ADE

and FDE [184] or RMSE for measuring accuracy by the distance between predicted and GT

trajectory [171]. These metrics indicate how accurately the predicted trajectory matches the

individual human-driven trajectory and are averaged over a set of test data. However, the use of

displacement errors cannot provide information on how functional or plausible the predicted

trajectory was. Therefore, in some individual cases, more sophisticated evaluation strategies

are applied, e.g., taking into account functional errors such as road violations [81] or unrealistic

headways [175] as summarized in Table 5.2.
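The two displacement metrics can be made concrete with a small worked example. The sketch below computes ADE as the mean L2 distance over all predicted time steps and FDE as the L2 distance at the final time step; the function name and the toy trajectories are illustrative assumptions.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equally long lists of (x, y) positions."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde


gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
pred = [(1.0, 0.0), (2.0, 3.0), (3.0, 4.0)]
ade, fde = displacement_errors(pred, gt)
print(ade, fde)
```

Both values only describe geometric distance; they do not reveal whether the predicted path stays on the road or respects other road users.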

In addition to the chosen metric, another crucial aspect of the

evaluation strategy involves the selection of test scenarios or test data. Most state-of-the-art

approaches test their models on a retained test split of the data. Only a few approaches test

their models on different datasets [177]. How close the test data are to the

training data is rarely addressed in publications.

Table 5.2: Summary of state-of-the-art metrics for trajectory prediction.

Metric | Explanation | Reference
ADE & FDE | Average Displacement Error & Final Displacement Error | [184, 182, 170, 171, 188, 166, 194, 82, 81, 191, 178, 179, 186, 180, 169, 172, 72, 187, 173, 174, 165, 183]
Variations of ADE & FDE | Normalized ADE & FDE, minimum ADE & FDE | [184, 166, 176, 177, 190]
RMSE | Root Mean Square Error | [171, 194, 164, 185, 175, 84, 172]
Negative headway distance occurrence | Occurrence of unrealistic states due to poor decision-making | [175]
Jerk sign inversion | Quantifies oscillations in the model’s acceleration predictions | [175]
MR (Miss Rate) | Proportion of unacceptable trajectories measured by a region of interest | [190, 183]
Off-road rate | The ratio of predicted trajectories not lying entirely in the driveable area of the map to the total number of predicted trajectories | [190]
EMD distance | Quantifies the amount of probability mass that has to be moved from the predicted distribution to match the true distribution | [81]
HOR (Hard Off-road Rate) | The percentage of scenarios that have at least one off-road prediction in the trajectory points | [81]
SOR (Soft Off-road Rate) | The percentage of off-road prediction points over all prediction points, averaged over all scenarios | [81]
DAC (Driveable Area Compliance) | Count of future trajectories within the driveable area divided by the number of all possible trajectories | [112]
TCC (Temporal Correlation Coefficient) | A high value for TCC indicates that predictions cover the time-varying motion patterns well | [170]
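To illustrate how functional metrics differ from displacement errors, the following sketch computes HOR and SOR as they are worded in Table 5.2. It is an interpretation of that wording and not the reference implementation of [81]; the predicate is_drivable is a hypothetical stand-in for a map lookup, and trajectories are assumed to be non-empty.

```python
def off_road_rates(scenarios, is_drivable):
    """Return (HOR, SOR) in percent.

    scenarios: list of predicted trajectories, each a list of (x, y) points.
    is_drivable: callable (x, y) -> bool, standing in for a map lookup.
    """
    hard_hits = 0
    soft_ratios = []
    for trajectory in scenarios:
        off_road = [not is_drivable(x, y) for x, y in trajectory]
        hard_hits += any(off_road)                         # at least one off-road point
        soft_ratios.append(sum(off_road) / len(off_road))  # share of off-road points
    hor = 100.0 * hard_hits / len(scenarios)
    sor = 100.0 * sum(soft_ratios) / len(soft_ratios)
    return hor, sor


# Toy map (assumption): the drivable area is the strip -2 m <= y <= 2 m.
def in_strip(x, y):
    return -2.0 <= y <= 2.0


scenarios = [
    [(0, 0), (5, 0), (10, 0), (15, 0)],  # stays on the road
    [(0, 0), (5, 1), (10, 3), (15, 5)],  # leaves the road for 2 of 4 points
]
hor, sor = off_road_rates(scenarios, in_strip)
print(hor, sor)
```

In this toy case, one of two scenarios touches the off-road area, while only a quarter of all predicted points are off-road on average, which shows how the hard and soft variants weight the same violation differently.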

5.3 Prediction Concept, Problem Formulation, and Data Preparation

5.3.1 General Concept of Intention Prediction

The purpose of the following methods is to address the question of how to enable general and

transparent predictions in complex urban traffic. Since model-based approaches show less

suitability considering the complexity of predicting the future behavior of other road users, a

data-driven approach is chosen. Due to known challenges associated with black-box models,

the following methods investigate how different conceptual choices affect the generalizability

of a data-driven prediction model. In this context, the effectiveness of the proposed scene

representation introduced in Chapter 4 is evaluated.

For anticipation, an NN is created that predicts the future trajectory of one vehicle based on

situational information from this vehicle’s ego perspective. In the following, the vehicle to

predict is named ego vehicle while all other potentially relevant road users that are present

in the scene are called partners. The trained model can later be used to predict all vehicles

in the scene. The prediction of VRUs is not further addressed, but all presented methods

aim to be transferable. The following sections explain the problem formulation and the data

preprocessing strategy used for both methods. Subsequently, the two individual methods

investigating the effect of different conceptual choices on generalizability are presented with

corresponding results, critical discussion, and conclusions.

5.3.2 Problem Formulation and Model Architecture

The problem of trajectory prediction is formulated as follows. At time $t$, the model predicts the future trajectory $Y_t^i = Y_{t+1}^i, Y_{t+2}^i, \ldots, Y_{t+T_{pred}}^i$ for the next $T_{pred}$ seconds for one vehicle $i$ based on the current scene $X_t^i$, which describes the individual perspective of vehicle $i$. The future motion $Y_t^i$ is a sequence of positions in a two-dimensional space: $Y_t^i = (x_t^i, y_t^i)$. The prediction horizon $T_{pred}$ is set to 5 seconds.

For labeling, the positions of the respective vehicle after 1, 2, 3, 4, and 5 seconds in global XY coordinates are used.
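The labeling rule can be sketched as follows. The example assumes a track stored as one (x, y) position per frame and a frame rate of 25 Hz; both the frame rate and the interleaved output order [x1, y1, ..., x5, y5] are assumptions made for illustration, as is the function name.

```python
def build_label(track, t_index, frame_rate_hz=25, horizon_s=5):
    """Return [x1, y1, ..., x5, y5]: positions 1..horizon_s seconds after frame t_index.

    track: list of (x, y) positions, one per frame.
    Returns None if the track ends before the prediction horizon.
    """
    last_needed = t_index + horizon_s * frame_rate_hz
    if last_needed >= len(track):
        return None
    label = []
    for second in range(1, horizon_s + 1):
        x, y = track[t_index + second * frame_rate_hz]
        label.extend([x, y])
    return label


# Toy track: a vehicle moving 0.4 m per frame along x (10 m/s at 25 Hz).
track = [(0.4 * k, 0.0) for k in range(200)]
label = build_label(track, t_index=10)
print(label)
```

Samples whose track ends before the 5-second horizon yield no label in this sketch and would have to be dropped from the dataset.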

The current situation from an ego perspective of vehicle $i$ is represented as a concatenated vector $X_t^i$ at time $t$. The vector $X_t^i$ includes features describing the ego vehicle $X_t^{iE}$ (written $X_t^{i_{ego}}$ in Equation (5.1)), and, depending on the setting to investigate, all features describing the map $X_t^{iM}$, potential conflict partners with their state $X_t^{iP}$, and the individual interaction with the ego $X_t^{iI}$ according to Equation (5.1) and definitions in Table A.1.

$$X_t^i = (X_t^{i_{ego}}, X_t^{iP}, X_t^{iI}, X_t^{iM}) \qquad (5.1)$$

Partners include vehicles $X_t^{ip_{VEH}}$ and VRUs $X_t^{ip_{VRU}}$ according to Equation (5.2).

$$X_t^{iP} = (X_t^{ip_{VEH}}, X_t^{ip_{VRU}})$$

$$X_t^{ip_{VEH}} = \sum_{k=0}^{n} (X_t^{kP}), \quad k \in P_t^{VEH}$$

$$X_t^{ip_{VRU}} = \sum_{j=0}^{m} (X_t^{jP}), \quad j \in P_t^{VRU} \qquad (5.2)$$

The model itself can be seen as an evaluation tool for investigating the effect of different

conceptual choices on generalizability and is not intended to outperform benchmark results.

Therefore, a simple model structure is chosen, and for training, common loss functions

and optimizers are used. Since this work aims to evaluate the effectiveness of the scene

representation, the training samples do not contain any past information, and no recurrent

structures were used in the model architecture. For architecture, a simple MLP is chosen,

incorporating four hidden layers consisting of 512, 256, 128, and 64 neurons and ten neurons

representing the output layer for the respective positions in XY for the next 5 seconds. Batch

normalization is used for stabilizing model training. The model structure is illustrated in

Figure 5.1.

[Figure 5.1 diagram: Input -> 512 -> Batch Normalization Layer -> 256 -> Batch Normalization Layer -> 128 -> Batch Normalization Layer -> 64 -> Batch Normalization Layer -> Output]

Figure 5.1: Structure of the base NN used for the two anticipation methods to investigate the ability to generate generalizable predictions.

5.3.3 Data Preprocessing

The model input is provided as a concatenated feature vector composed of different feature

categories summarized in Table A.1. Regarding the data preprocessing, there are two main

aspects to mention.

First, the input has to be normalized such that an ML model can handle the data during

optimization. Therefore, all data is normalized into a value range between 0 and 1. To ensure

that no information is lost or distorted, maximal and minimal values are set manually, and

features sharing a common value base, e.g., headings, are normalized on the same basis. In

order to prevent squeezing or distorting spatial information, coordinate values are shifted into

a common value range.
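The normalization idea can be sketched as follows. The bounds, feature names, and the shared-extent treatment of coordinates are hypothetical choices for illustration; the concrete values used in the thesis are not stated in this section.

```python
def min_max_normalize(value, lower, upper):
    """Scale value from the manually chosen range [lower, upper] into [0, 1]."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    return (value - lower) / (upper - lower)


def normalize_xy(x, y, origin, extent_m):
    """Shift by origin and divide both axes by the same extent to keep the aspect ratio."""
    return (x - origin[0]) / extent_m, (y - origin[1]) / extent_m


# Hypothetical, manually set bounds; features with a common value base share one entry.
BOUNDS = {
    "heading_deg": (0.0, 360.0),
    "speed_mps": (0.0, 20.0),
}

ego_heading = min_max_normalize(90.0, *BOUNDS["heading_deg"])
partner_heading = min_max_normalize(270.0, *BOUNDS["heading_deg"])
ego_speed = min_max_normalize(5.0, *BOUNDS["speed_mps"])
ego_xy = normalize_xy(50.0, 25.0, origin=(0.0, 0.0), extent_m=100.0)
print(ego_heading, partner_heading, ego_speed, ego_xy)
```

Using one bounds entry for all headings keeps ego and partner headings comparable after scaling, and dividing both coordinate axes by the same extent avoids stretching the scene in one direction.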

In urban traffic, each situation can involve a variable number of road users, while most of the data-driven prediction models are based on a static number of input neurons. Accordingly, a concept for handling a dynamic number of inputs is required. Therefore, all potential partner vehicles $P_t^{VEH}$ and all interacting VRUs $P_t^{VRU}$ are sorted by their distance to the ego vehicle. If there are more than $n$ or $m$ interaction partners, only the closest ones are considered. When there are fewer interaction partners, empty features are set to NaN, effectively omitting them from the training data. During preprocessing, NaN values are set to -1 to ensure the model learns to ignore them since -1 is out of the normal feature range [0, 1]. A maximum of $n = 5$ interacting vehicles and $m = 4$ interacting VRUs is defined. The numbers were set empirically with the intention of finding the best trade-off between preventing a high amount of sparse information and not losing relevant information. The data preprocessing can have a high impact on model performance, and there might still be unused potential. A deeper analysis of the effects of different representation options for such sparse features should be addressed in the future.
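The handling of a variable number of partners can be sketched as follows: partners are sorted by distance to the ego vehicle, truncated to a fixed maximum, and missing slots are padded with NaN, which is then replaced by -1. The data layout (dictionaries with "xy" and "features") and all names are assumptions made for this illustration.

```python
import math

MISSING = -1.0  # sentinel outside the normalized feature range [0, 1]


def partner_slots(ego_xy, partners, max_partners, features_per_partner):
    """Return a fixed-length feature list for up to max_partners closest partners.

    partners: list of dicts with keys "xy" (position) and "features"
    (already normalized values of length features_per_partner).
    """
    by_distance = sorted(partners, key=lambda p: math.dist(ego_xy, p["xy"]))
    slots = []
    for partner in by_distance[:max_partners]:
        slots.extend(partner["features"])
    # Pad empty slots with NaN, then replace NaN by the sentinel value.
    slots.extend([math.nan] * (max_partners * features_per_partner - len(slots)))
    return [MISSING if math.isnan(v) else v for v in slots]


vehicles = [
    {"xy": (30.0, 0.0), "features": [0.9, 0.1]},
    {"xy": (5.0, 0.0), "features": [0.2, 0.4]},
]
vector = partner_slots((0.0, 0.0), vehicles, max_partners=5, features_per_partner=2)
print(vector)
```

With two vehicles and five slots, the closest vehicle occupies the first slot and the three unused slots are filled with -1, so the input length stays constant regardless of how many partners are present.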

5.4 Method 1: Measuring the Effectiveness of the Scene Representation by Prediction Performance

5.4.1 Motivation

Generating accurate predictions across the variety of possible situations in urban traffic remains

an unsolved challenge, as both data-driven and model-based approaches perform well in

isolated situations but poorly generalize to new complex and interactive situations. This is a

major drawback because agent models only benefit from anticipation when enabled for various

common urban traffic scenarios. Therefore, this section addresses the challenge of improving

data-driven prediction models’ ability to generalize by adapting the cognitive-psychological

theory of knowledge transfer proposed in the previous Chapter 4. The effectiveness of the

proposed scene representation is measured by using the introduced prediction model. The

underlying assumption is that the ability of an NN to generalize to new and unknown situations

is associated with a high level of situational understanding. Based on research question R2,

the following sub-question is formulated:

R2.1: To what extent does the proposed scene representation improve the ability of a data-

driven prediction model to generalize and thus produce reasonable predictions for the variety

of complex situations encountered in urban traffic?

5.4.2 Concept

In order to investigate the ability of the model to generalize to new situations, two different

test levels for measuring performance are established. The presented model is trained under

weak conditions only on a fragment of the available data showing only one intersection of the

utilized dataset inD [155] during training. A subset of the data of the training intersection is

retained for testing the model’s performance in unseen situations in a known location. For

investigating generalizability, in a second level, the trained model is tested on a new unseen

intersection that shows a significantly different topology.

The model is trained on different versions of the scene representation (EMPI, EMP, EMI, EM,

E) involving the proposed features, introduced in Table A.1. Model performance is compared

on these different feature spaces to identify the benefit of the individual information categories

regarding the model’s ability to generalize. In this way, the effectiveness of the proposed

scene representation is critically evaluated. The described measurement concept is illustrated

in Figure 5.2.

[Figure 5.2 diagram: TRAINING with the feature settings EMPI, EMP, EMI, EM, and E; TEST LEVEL 1: retained situations from training location; TEST LEVEL 2: unseen intersection with new topology]

Figure 5.2: Concept for measuring the effectiveness of the proposed scene representation by training the NN with different feature compositions.

5.4.3 Implementation Details

The model is trained on the location Bendplatz of the inD dataset [155]. Recordings 8 - 17 are

used for training the model, resulting in 190,000 training samples, while recording 14 is used as

a validation set during training. Recording 7, comprising 9,000 samples, is retained and used

for testing at level 1, which represents data close to the training data since the situations are

unknown to the model, but the location is familiar. For investigating the ability to generalize,

the trained model is tested at a second location Heckstrasse of the inD dataset, which shows

a completely different intersection topology. The test data is obtained from recording 30

involving 32,000 samples. Both the training and test locations are illustrated in Figure 5.2.

The model is trained with five different feature settings to investigate the effectiveness of the

individual information categories: EMPI, EMP, EMI, EM, and E. All models are trained

with a batch size of 50 for 80 epochs. The following hyperparameter setting is used for

training: Adam optimizer with a default learning rate of 0.001, MSE (Mean Squared Error) as

loss function, and ReLU as activation, with Keras used for model building [195]. For evaluating

the effectiveness of the scene representation, model performance is measured by the common

benchmark metrics ADE and FDE employing L2 distance [193].

5.4.4 Results

Table 5.3 summarizes the results for model performance on different feature spaces and both test levels, measuring accuracy by ADE, FDE, and the relative decrease in accuracy compared to the best setting. The results for unseen situations are in line with current benchmark results, covering highly interactive driving scenarios, as shown in Figure 5.3. The results on unseen situations (level 1) benefit from a smaller feature space considering only the ego vehicle (E) or ego and its static environment (EM). However, when testing the model on increasing levels of generalization (level 2), performance is significantly better on the feature spaces that include partner and relationship information (EMI, EMP, EMPI). The performance when excluding situational context information decreases by up to 82%. The relative decline in performance regarding the input variant only using ego features (E) compared to the other feature spaces shows a clear benefit of the proposed scene representation in terms of generalizability. The differences between the settings EMPI, EMI, and EMP are minor compared to the decrease in performance when excluding dynamic situational information.

Table 5.3: Results for measuring the effectiveness of the proposed scene representation.

Feature setting | UNSEEN SITUATION (Level 1: Bendplatz) ADE [m] | FDE [m] | drop ADE [%] | NEW LOCATION (Level 2: Heckstrasse) ADE [m] | FDE [m] | drop ADE [%]
EMPI (209) | 2.5273 | 4.5856 | -11.09% | 9.0252 | 14.0349 | -2.99%
EMP (144) | 2.3613 | 4.4491 | -4.84% | 8.7554 | 14.5594 | best
EMI (96) | 2.6479 | 4.5310 | -15.14% | 9.1117 | 16.7633 | -3.91%
EM (31) | 2.2850 | 3.9870 | -1.66% | 27.8287 | 17.8915 | -68.54%
E (13) | 2.2470 | 4.0779 | best | 50.3753 | 46.2537 | -82.62%

Figure 5.3: Quantitative results in unseen situations (Level 1). Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and predictions of the model trained on E in blue. GT and predictions represent positions in next 5 seconds. For all trajectories: driven path: solid, future path: dotted.
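The relative decrease reported in Table 5.3 can be reproduced from the ADE values. The sketch below uses the relation (ADE_best - ADE) / ADE, which is inferred from the tabulated numbers and is not stated explicitly in the text; the function name is hypothetical.

```python
def relative_drop_percent(ade, best_ade):
    """Relative difference between the best ADE and a setting's ADE, in percent."""
    return 100.0 * (best_ade - ade) / ade


# Level 2 (Heckstrasse) ADE values in meters, taken from Table 5.3.
level_2_ade = {"EMPI": 9.0252, "EMP": 8.7554, "EMI": 9.1117, "EM": 27.8287, "E": 50.3753}
best = min(level_2_ade.values())
drops = {
    name: round(relative_drop_percent(ade, best), 2)
    for name, ade in level_2_ade.items()
}
print(drops)
```

Under this relation, the ego-only setting (E) yields about -82.62% and the map-only extension (EM) about -68.54%, matching the level 2 column of the table.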

The differences in performance at the two levels of testing demonstrate the importance of a

deep and critical evaluation strategy when developing such black-box models. If the model is

tested only on data that is close to the training data, as presented in test level 1, misleading

conclusions may be drawn regarding which model setting would perform best in general.

Next to objective measures, the qualitative results are investigated, and some examples are

provided in Figures 5.3 and 5.4. The examples in Figure 5.3 show some predictions of the

model trained on the full feature space (EMPI) and only on ego vehicle features (E) on test

level 1. Both models show similar performance and predict the future motion of the vehicle

well, even in highly interactive situations. Meanwhile, when investigating results on the unseen

intersection Heckstrasse in Figure 5.4, a significant difference in accuracy and plausibility of

results can be observed. Since the models were trained under weak conditions, both models

show high error values compared to benchmark results. However, the predictions of the model

trained on the feature space describing other road users and the individual relations (EMPI)

provide significantly better results in terms of accuracy and plausibility.

The two exemplary predictions in the upper row show high error values but still have a certain

plausibility in the prediction. In the top row, a wrong maneuver is predicted by the EMPI

model on the left side, and on the right side, the curvature of the predicted trajectory deviates,

while the predictions of the model trained on E seem to be random.

In the bottom row on the right side of Figure 5.4, the model trained on EMPI shows a reasonable

prediction, but the predicted velocity is too low, while the model trained on E again shows

nonsensical predictions. Such effects of spatially accurate but temporally weak predictions can

be explained by the difference in mean driven velocity on the training location (Bendplatz:

17 km/h) compared to the test location (Heckstrasse: 26 km/h).

Besides the benefit of the proposed scene representation, the situations and related predictions

show that the significance of the benchmark metric of evaluating by ADE and FDE is limited.

Cases occur in which behavior deviates from the real trajectory but is still plausible, e.g., a

slower driven velocity, while other predictions with similar error values do not show any spatial

plausibility or enter non-driveable areas.

5.4.5 Limitations and Future Work

The results show a clear advantage of providing contextual information for the model’s ability

to generalize. However, the proposed scene representation exhibits a high dimensionality when

all information categories are included, and the precalculation of these features is associated

with computational effort. In addition, such high dimensionality can negatively affect learning

and demands a high number of training samples, a phenomenon known as the curse of

dimensionality [196]. Meanwhile, DL models are less affected by this phenomenon because of

their ability to compress the high-dimensional input data into embeddings of lower dimensionality.

The critical dimensionality level depends strongly on the particular data distribution and the

prediction task [197].

The selected model structure, an MLP, only takes a static number of input neurons, while

urban traffic scenarios vary in the number of relevant road users. Therefore, a

strategy for limiting the maximal number of considered interaction partners and handling

sparse features for non-present partners was established. The way to process such sparse

features for normalization and the effect of a high number of sparse features on prediction

performance were not further investigated and might be associated with additional potential.
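
The fixed-size input described above can be illustrated with a minimal sketch. The cap `MAX_PARTNERS`, the number of features per partner, the relevance ordering, and the zero fill value are illustrative assumptions and not the settings used in this work.

```python
from typing import List, Sequence

MAX_PARTNERS = 3        # assumed cap on considered interaction partners
FEATS_PER_PARTNER = 4   # assumed number of features per partner
FILL_VALUE = 0.0        # assumed placeholder for non-present partners


def build_partner_block(partners: Sequence[Sequence[float]]) -> List[float]:
    """Flatten up to MAX_PARTNERS feature vectors into a fixed-length block.

    `partners` is expected to be sorted by relevance (e.g. nearest first).
    Surplus partners are dropped, missing ones are filled with FILL_VALUE.
    """
    block: List[float] = []
    for i in range(MAX_PARTNERS):
        if i < len(partners):
            feats = list(partners[i])
            if len(feats) != FEATS_PER_PARTNER:
                raise ValueError("unexpected partner feature length")
            block.extend(feats)
        else:
            block.extend([FILL_VALUE] * FEATS_PER_PARTNER)
    return block
```

With such a layout the MLP always receives the same number of inputs, but a fill value of zero may coincide with valid feature values, which is one reason why the treatment of sparse features deserves further investigation.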

Furthermore, the output layer of the model represents the future position of the vehicle in the


Figure 5.4: Qualitative results in unseen situations on unknown intersection (Level 2).

Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and

predictions of the model trained on E in blue. GT and predictions represent positions in next 5

seconds. For all trajectories: driven path: solid, future path: dotted.

next five seconds, discretized at one-second intervals. Due to this coarse discretization,

the predicted trajectory must be resampled into a more continuous motion depending on the

subsequent application. Furthermore, the dynamic feasibility of the predicted trajectories is

neither guaranteed nor investigated at this point.
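
A minimal sketch of such a resampling step is given below. Linear interpolation and the 0.1 s output step are assumptions chosen for illustration; linear interpolation does not ensure dynamic feasibility, so a spline or a vehicle model may be preferable depending on the application.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def resample_linear(waypoints: List[Point], dt_in: float = 1.0,
                    dt_out: float = 0.1) -> List[Point]:
    """Linearly interpolate waypoints spaced dt_in seconds apart to dt_out."""
    if len(waypoints) < 2:
        return list(waypoints)
    steps = round(dt_in / dt_out)   # sub-steps per input interval
    out: List[Point] = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        for k in range(steps):
            a = k / steps
            out.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    out.append(waypoints[-1])       # keep the final predicted position
    return out
```

For a prediction consisting of the current position plus five future positions, this yields 51 points at the assumed 0.1 s step.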

In this method, a simple model structure, namely an MLP, was used. It is assumed that the

advantage of providing situational information for model training is also present when using

other model structures with alternative architectures such as GNNs or RNNs. However, the

performance of a model with a different architecture cannot be determined independently

from the input data and representation, and therefore, it cannot be guaranteed that the same

positive effect size will be observed.

The evaluation is based on a two-level approach and shows an advantage of the proposed scene

representation in terms of generalizability. Meanwhile, the extent to which the model is able

to make meaningful predictions in other situations even further away from the training data

needs to be explicitly investigated due to the lack of transparency caused by the black-box

nature of the models.

When using a black-box approach for prediction, it is important to identify in which situations

the model has learned meaningful patterns and when it is producing unreasonable output to

ensure reliability. The qualitative results demonstrated that the benchmark evaluation metric

used is limited in significance as no conclusions regarding the plausibility of the respective

prediction can be derived, which should be addressed by a more sophisticated evaluation


method. Due to the black-box character of NNs, an extended evaluation strategy is required

to investigate effects on generalizability and to provide transparency regarding the potential

and limitations of a trained model. Such evaluation strategies are mainly characterized by the

metric and the test data, which should cover a broad range of situations.

5.4.6 Conclusion

This section provided a method for evaluating the effectiveness of a scene representation

by assessing the ability of a prediction model to generate reasonable predictions in situations

far away from the training data. To this end, the model was trained under weak conditions,

and the generalizability was assumed to indicate the level of SA caused by the provided input

representation.

The evaluation results revealed a clear benefit of the extended scene representation in terms

of enabling more SA in the prediction model and thus an improved ability to generalize.

The feature settings describing the interaction with other traffic participants (EMPI, EMI,

and EMP) on level 2 outperformed the settings without any dynamic context information

significantly. The accuracy of the models trained on ego only (E) or ego and map information

(EM) decreased by up to 82% at test level 2. In particular, the qualitative results revealed

that even if the error values of the EMPI model, for example, were still high at test level 2,

the results showed significantly more plausibility and reliability compared to those of a model

trained on the E-feature space.

Furthermore, the evaluation demonstrated that by testing a model on data close to the training

data, misleading conclusions might be drawn regarding which model setting would perform

best in general. Additionally, the qualitative results revealed the limitation of

current benchmark metrics measuring model performance exclusively by the distance between

predicted and GT trajectory. To conclude, quantifying and improving the generalizability of an

NN is a complex task due to its black-box nature. Various conceptual choices, such as training

data, test cases, or training parameters, can affect model performance. Meanwhile, commonly

used evaluation metrics do not provide sufficient insight to derive reasonable decisions regarding

conceptual modeling choices. Therefore, the following methods address the challenges revealed

here.


5.5 Method 2: Reliable Trajectory Prediction in Complex Urban Traffic

5.5.1 Motivation

The common objective when developing a prediction model is to generate predictions that are

as accurate as possible across the variety of situations that occur in traffic. The number of

possibilities for addressing this challenge with commonly used data-driven models is immense,

ranging from the concept of how and which situational information is provided, through a wide

range of model architecture options, to the choice of learning parameters. This

richness of opportunities meets difficulties associated with data-driven approaches, namely

overfitting and lack of transparency. Black-box models provide little transparency so that

insights into accuracy can only be gained in explicitly tested situations. In addition, data-driven

models risk overfitting to the training data, resulting in poor results in unknown situations,

i.e., low generalizability. Furthermore, the representativeness of situations that appear in a

database is always limited compared to all potentially occurring scenarios in urban traffic.

Taking these facts together, one usually does not know what relationships the model has

actually learned, and both the evaluation and training of such models depend strongly on the

available datasets. Most state-of-the-art publications focus on problem-solving and introduce

novel concepts on how to generate accurate predictions for a given dataset. However, the

evaluation part is rarely addressed, and thus current research provides little insight into the

necessary requirements for practical use of the models, such as limitations and generalizability.

In order to address these challenges, the following section presents an advanced evaluation

method that provides insight into the ability of models to generalize and generate plausible

predictions even in exceptional situations. The multi-level evaluation method aims to provide

more transparency about learned patterns and allows for more reliable and efficient model

development. To do so, the evaluation method is applied to the previously presented model to

investigate the effects of different conceptual choices, namely differences in the coverage of

scene information, varying diversity in training data, and different learning parameters. Based

on research question R2, the following sub-questions are formulated:

R2.2: How to measure the generalizability of data-driven prediction models?

R2.3: How and to what extent is the generalizability of a data-driven prediction model affected

by differences in the coverage of scene information, homogeneity in training data, and various

learning parameters?

R2.4: Is it possible to combine real and synthetic traffic data samples to compensate for

underrepresented situations in datasets in the future?

5.5.2 Concept

The two key aspects of evaluating a data-driven prediction model involve the metric and the

choice of test data. Most publications employ spatio-temporal error measures, which indicate

how well the predicted trajectory matches the individual human-driven trajectory but do not

consider the situational context when evaluating a trajectory. Using displacement errors cannot

provide information on how functional or plausible the predicted trajectory was. According to


results from Section 5.4, cases occur in which behavior deviates from the real trajectory but is

still plausible, e.g., slower driven velocity or a longer time gap when turning without becoming

critical. At the same time, other predictions with similar error values enter non-driveable

areas. Consequently, common error measures are not able to distinguish between false-bad

and plausible-bad trajectories and vice versa. However, it would be crucial for developing or

tuning prediction models to identify the situations in which the model produces non-plausible

or non-functional results.

The Evaluation Metric:

In order to address this gap, inspired by literature [79, 81], a simple plausibility metric is

formulated consisting of two categories to evaluate the plausibility of predictions:

• spatial evaluation: maximal path deviation, road violation, maximal road violation

• temporal evaluation: collision check, minimum distance to other traffic participants

To measure plausibility, a percentage score S_P is calculated based on these five components.

Binary aspects, such as collision or road violation checks, return True or False, interpreted as 0

or 1. All distance measures are divided into bins and mapped to plausibility values between 0

and 1. The final plausibility score S_P is the average of the individual components. To further

investigate spatial accuracy, the percentage of predictions with path deviations greater than 5

meters is measured. Based on the plausibility metric in conjunction with the common metrics

ADE and FDE, model performance, involving accuracy and plausibility, can be evaluated

against test data.
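
The following sketch illustrates how such a five-component plausibility score could be computed. The bin edges, the plausibility values per bin, and the convention that a collision or road violation scores 0 are assumptions made for illustration; the actual bins are not listed here.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical bin edges [m] and plausibility values per bin.
PATH_DEV_EDGES = (1.0, 2.5, 5.0)          # larger deviation -> less plausible
PATH_DEV_SCORES = (1.0, 0.66, 0.33, 0.0)
ROAD_VIOL_EDGES = (0.25, 0.5, 1.0)        # larger violation -> less plausible
ROAD_VIOL_SCORES = (1.0, 0.66, 0.33, 0.0)
MIN_DIST_EDGES = (0.5, 1.0, 2.0)          # larger gap -> more plausible
MIN_DIST_SCORES = (0.0, 0.33, 0.66, 1.0)


def binned(value: float, edges: Sequence[float], scores: Sequence[float]) -> float:
    """Map a distance to the plausibility value of the bin it falls into."""
    return scores[bisect_right(edges, value)]


def plausibility_score(max_path_dev: float, road_violation: bool,
                       max_road_violation: float, collision: bool,
                       min_distance: float) -> float:
    """Average the five components and return a percentage."""
    components = [
        binned(max_path_dev, PATH_DEV_EDGES, PATH_DEV_SCORES),
        0.0 if road_violation else 1.0,   # assumed: a violation scores 0
        binned(max_road_violation, ROAD_VIOL_EDGES, ROAD_VIOL_SCORES),
        0.0 if collision else 1.0,        # assumed: a collision scores 0
        binned(min_distance, MIN_DIST_EDGES, MIN_DIST_SCORES),
    ]
    return 100.0 * sum(components) / len(components)
```

Under these assumptions, a prediction that stays on the road and close to the path but collides with another road user would still reach 80%, which shows that the choice of bins and weights shapes how strictly individual failures are penalized.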

The Test Levels:

Since data-driven models are based on black-box approaches, transparency in terms of model

generalizability and reliability is achieved by applying a trained model to test data. Accordingly,

the selection of test data is crucial for the significance of the evaluation. Therefore, a multi-level

method is presented for critical evaluation, involving four levels of test data, as illustrated in

Figure 5.5. The four levels present different challenges in terms of generalizability, as they

include situations that are increasingly far from the training data. The levels start with unknown

situations at locations shown during training (L1), continue with new locations from real traffic data (L2a)

and new locations from synthetic traffic data (L2b), and end with testing in an exceptional

situation (L3), in which the ego vehicle has to pass a static obstacle with oncoming traffic. For

the L2a, L2b, and L3 levels, model performance is measured by the plausibility and accuracy.

Variations to Investigate:

As the objective is to gain insight into how different settings affect model performance and

which concepts contribute to improved generalizability, the following variants are investigated:

Three variants of training data (homogeneity) are combined with six variants of provided input

feature settings (coverage of scene information). In addition, one of these models is trained with

four different variants of learning parameter sets to investigate the effect of tuning compared

to more application-specific decisions. In total, this results in 22 models to be evaluated.
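
The count of 22 models can be reproduced with a short enumeration. The tuple layout and the labels `default` and `tuning-<id>` are hypothetical; the six feature settings follow Section 5.5.3, and the tuning variants are attached to the EMPI model trained on T2, as reported in Section 5.5.4.

```python
from itertools import product

TRAINING_SETS = ("T1", "T2", "T3")
FEATURE_SETTINGS = ("EMPI", "EMP", "EMI", "EI", "EM", "E")
TUNING_IDS = (1, 2, 3, 4)

# 3 training sets x 6 feature settings, trained with the default parameters
models = [(t, f, "default") for t, f in product(TRAINING_SETS, FEATURE_SETTINGS)]
# plus four tuning variants of one selected model (EMPI trained on T2)
models += [("T2", "EMPI", f"tuning-{i}") for i in TUNING_IDS]

print(len(models))  # 22
```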

For the input feature settings, the variants (EMPI, EMP, EMI, EI, EM, E)

presented in Section 5.4 are selected. Regarding the level of variety in the training data

(homogeneity), several factors have to be considered since the performance of data-driven

prediction models is strongly dependent on the data presented during training. On the one


[Figure 5.5 panels: a model setting is tested on four levels ordered by the required ability to generalize: L1, unknown situations (same location); L2a, unknown intersection (real data); L2b, unknown intersection (synthetic data); L3, exceptional situation (driving into the oncoming traffic lane to overtake a static obstacle). Measures: GT accuracy (ADE & FDE) and plausibility.]

Figure 5.5: The evaluation concept involving different test levels and metrics to quantify

generalizability.

hand, it is crucial that the training data represent the domain of application as thoroughly as

possible, i.e., representativeness. On the other hand, the homogeneity of the training data

was identified to have a significant influence on model training and generalizability [198].

Considering representativeness, in data-driven modeling, one often faces the problem that

some exceptional situations are too underrepresented for adequate training. However, the creation

of new data, especially in exceptional situations, is either costly or not possible due to the

rarity or criticality of events in everyday traffic. As a result, it would be beneficial if it were

possible to augment existing datasets with manually defined situations that are known to be

underrepresented. The present methodology investigates how the level of variability in training

data affects the model’s ability to generalize. In addition, real traffic data is combined with

synthetic traffic data from simulation to investigate the possibility of augmenting existing real

datasets through synthetic samples from simulation. The following levels of

training data are investigated:

T1: Low variability: synthetic traffic data for training.

A simulation framework is used to create synthetic traffic data. Due to the limited possibility

of individualizing a driver model, less diversity in behavior occurs.

T2: Medium variability: real traffic data for training.

For real traffic data, an open-source dataset covering typical interactive urban situations is

selected.

T3: High variability: a combination of real and synthetic traffic data.

5.5.3 Implementation Details

Data for Training and Testing:

For training and evaluation, the open-source drone dataset inD (https://ind-dataset.com/) is used to represent

real traffic [155]. The dataset includes recordings of four German unsignalized intersections

called Aseag, Bendplatz, Frankenburg, and Heckstrasse, displayed in Figure 5.6, and is further

described in Section 4.3.

For the medium level of variability in training data (T2), models are trained only on real data,

and recordings from Aseag, Bendplatz, and Frankenburg were selected for training, resulting in

900,000 training samples. For the evaluation at level L1 (unknown situations), one recording

from each location was retained for testing and one as a validation set.

For representing a low level of variety in training data (T1), data on four synthetic intersections

was created with the help of the simulation framework Spider at BMW [151, 55]. The choice

of intersections intends to represent similar intersections compared to the ones represented in

the inD dataset involving different complex intersections and merging topologies as shown in

Figure 5.6. Since all drivers in the simulation are based on the same heuristic agent model but

use different parameters, overall behavior is more homogeneous compared to real traffic data.

For the low level of variability in training data (T1), models are trained on intersections 1, 2,

and 3. Randomly selected vehicle IDs were chosen and retained for the validation set during

training and for testing on L1 (unseen situation). It has to be noted that the synthetic data

only contains vehicles and no VRUs. The synthetic data is recorded at the same sampling rate

and shows the same characteristics as the inD dataset.

The high degree of variability (T3) in the training data is achieved by combining synthetic

and real data. For this purpose, the real traffic data of Bendplatz and Aseag are combined

with the synthetic data of intersections 1 and 3. The data is combined in such a way that

synthetic and real data have a distribution of 50:50. For comparison, all training sets were set

to have a similar number of samples. For evaluating at the L2 level, data from one real (L2a)

and one synthetic (L2b) location were used. For L2a, the models were tested on recording 30

of the inD dataset, which represents a new intersection from reality (Heckstrasse). For L2b,

data from a different four-armed intersection was created (isec 4).
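
The 50:50 combination of real and synthetic samples with a fixed total size can be sketched as follows. Uniform random subsampling without replacement and the fixed seed are assumptions for illustration; the sampling procedure actually used is not specified here.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_balanced(real: Sequence[T], synthetic: Sequence[T],
                 total: int, seed: int = 0) -> List[T]:
    """Draw total/2 samples from each source without replacement and shuffle."""
    half = total // 2
    if half > len(real) or half > len(synthetic):
        raise ValueError("not enough samples for a 50:50 split")
    rng = random.Random(seed)
    mixed = rng.sample(list(real), half) + rng.sample(list(synthetic), half)
    rng.shuffle(mixed)
    return mixed
```

Fixing `total` to the size of the other training sets keeps the number of samples comparable across T1, T2, and T3, so that differences in performance are less likely to stem from the amount of data alone.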

The simulation framework was used to create a special scenario in which the path of the ego is

occupied by an obstacle on a two-lane road with oncoming traffic to test model performance

in an exceptional situation (L3). For data collection on L3, the vehicle is controlled by a real

human in simulation. Pictures of all test and training locations are shown in Figure 5.6, and

all data is processed according to the methods presented in Chapter 4 and Section 5.3.3.

Model Input Variations (coverage of scene information):

For input variations, the same settings as presented in Section 5.4 employing different feature

spaces are used: EMPI: 209 features, EMP: 144 features, EMI: 96 features, EI: 78 features,

EM: 31 features, and E: 13 features.

Training and Model Parameters:

For the training process of all models with different feature settings or varying training

data, the following learning parameters were set: the Adam optimizer with a default learning rate

of 0.001, MSE as the loss function, and ReLU as the activation function. Keras was used for model

building [195]. In order to investigate the influence of parameter tuning relative to changes

in features and training data, some variations were investigated, namely the choice of the

loss function, activation function, optimizer, and batch normalization shown in Table 5.4. All

models were trained with a batch size of 50 for a maximum of 80 epochs using early stopping with a

minimum delta of 0.00001 and patience of 15 epochs.
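
The interplay of minimum delta, patience, and the epoch limit can be illustrated with a framework-independent sketch of the stopping rule. It assumes that the validation loss is the monitored quantity, which is not stated explicitly above, and it mirrors the common early-stopping semantics rather than reproducing the Keras implementation.

```python
from typing import Iterable, Optional


def stopping_epoch(val_losses: Iterable[float], min_delta: float = 1e-5,
                   patience: int = 15, max_epochs: int = 80) -> Optional[int]:
    """Return the 1-based epoch at which training would stop early, else None.

    An epoch counts as an improvement only if the validation loss drops
    by more than min_delta below the best value seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if epoch > max_epochs:
            break
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None
```

With a loss that stagnates after the first epoch, training would stop at epoch 16; with a loss that keeps improving by more than the minimum delta, all 80 epochs are used.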


[Figure 5.6 panels. Variety in training data: T1, training on synthetic data (isec 1, isec 2, isec 3); T2, training on real data (Bendplatz, Frankenburg, Aseag); T3, training on combined data, real & synthetic (Bendplatz, Frankenburg, isec 1, isec 3). Multi-level testing for each training set: L2a, test data real (Heckstrasse); L2b, test data syn (isec 4); L3, test edge case.]

Figure 5.6: Illustration of the data concept including three levels of homogeneity of training

data (T1 - T3) on the left-hand side and different test data (L2 - L3) on the right-hand side. Real

traffic examples are provided by the inD dataset [155], while synthetic driving data is obtained by

simulation. As an additional test case, an exceptional scenario is designed using the simulation

framework and a human driver (L3).

Table 5.4: Tuning parameter sets for investigating the effect of hyperparameter tuning [195].

ID Batch Norm. Loss Function Optimizer Activation

1

2

3

4

True

MAE (Mean Absolute Error)

MSE (Mean Squared Error)

MAE (Mean Absolute Error)

sgd

adam

sigmoid

relu

Metric Calculation:

The evaluation method presented earlier returns the following measurements for each model:

• Overall score S_O, overall accuracy score S_ACC, overall plausibility score S_P, and overall ADE & FDE calculated across all test levels

• ADE & FDE individually on test data of L1, L2a, L2b, and L3

• S_P individually on test data of L2a, L2b, and L3

In order to evaluate the accuracy of the proposed models, ADE and FDE are

calculated using the L2 distance according to the general state of the art [193]. Model performance

is measured as a combined score S_O considering accuracy and plausibility. For the model

accuracy score S_ACC, ADE and FDE are converted to a score under consideration

of benchmark results following Equation (5.3), where ADE_B = 2 m and FDE_B = 5 m. The

individual values were chosen according to results in the analysis of Bahari et al. [79].

$$S_{ACC} = \left( \frac{ADE_B}{ADE} + \frac{FDE_B}{FDE} \right) \cdot \frac{1}{2} \cdot 100 \qquad (5.3)$$
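
A minimal sketch of the displacement errors and the accuracy score of Equation (5.3) is given below. The function names and the representation of a trajectory as a list of (x, y) points are assumptions for illustration; the benchmark values are those stated above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

ADE_B = 2.0  # benchmark ADE [m], as stated in the text
FDE_B = 5.0  # benchmark FDE [m], as stated in the text


def ade_fde(pred: Sequence[Point], gt: Sequence[Point]) -> Tuple[float, float]:
    """Average and final displacement error using the L2 distance."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]


def accuracy_score(ade: float, fde: float) -> float:
    """Equation (5.3): mean of the two benchmark ratios, in percent."""
    return (ADE_B / ade + FDE_B / fde) * 0.5 * 100.0
```

Errors equal to the benchmark values give a score of 100%. With the overall errors reported in Table 5.5 for the EMPI model trained on T1 (ADE 6.55 m, FDE 9.93 m), the function returns about 40.4%, close to the tabulated 40.46%; the small gap is consistent with rounding of the tabulated errors.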


For calculation of the overall accuracy S_ACC, all displacement errors are combined, while the

displacement errors of L2a, L2b, and L3 are weighted double to assign a higher priority to

results on data further away from training. The overall ADE is calculated according to Equation

(5.4). The overall FDE is calculated accordingly.

$$ADE = \frac{ADE_{L1} + 2 \cdot (ADE_{L2a} + ADE_{L2b} + ADE_{L3})}{7} \qquad (5.4)$$

The score for the final model performance S_O is calculated as the mean of the plausibility score S_P and the

accuracy score S_ACC.
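
The weighting of Equation (5.4) and the overall score can be written as two small helper functions. The function names are hypothetical; the checks use the values reported in Table 5.5 for the EMPI model trained on T1.

```python
def weighted_error(l1: float, l2a: float, l2b: float, l3: float) -> float:
    """Equation (5.4): L2a, L2b and L3 count double, L1 once."""
    return (l1 + 2.0 * (l2a + l2b + l3)) / 7.0


def overall_score(s_p: float, s_acc: float) -> float:
    """S_O as the mean of plausibility score and accuracy score."""
    return 0.5 * (s_p + s_acc)


# EMPI trained on T1: per-level ADE and FDE in meters, scores in percent
overall_ade = weighted_error(1.34, 8.27, 4.40, 9.58)     # about 6.55 m
overall_fde = weighted_error(2.38, 12.26, 7.87, 13.43)   # about 9.93 m
s_o = overall_score(58.23, 40.46)                        # about 49.3 %
```

The divisor 7 is the sum of the weights (1 for L1 and 2 for each of the three remaining levels), so the result remains a weighted mean in meters.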

5.5.4 Results

All results for all model variants are provided in Tables 5.5 and 5.7, whereby Table 5.5 shows

the results for all different feature settings and levels of variability in training data. Table 5.7

provides the evaluation of different learning parameters for models trained on the full feature

space EMPI on the real dataset (T2).

5.5.4.1 Influence of Homogeneity in Training Data and Coverage of Scene Information

The results show a clear benefit of more variability in training, as models trained on T3

provide the best results for S_O, S_P, and S_ACC across all test levels, as illustrated in Figure 5.7

(right). In terms of plausibility and accuracy at different test levels, T3 either outperforms the

other data variation settings or shows similarly accurate results. Models trained on T1 show

significantly higher error values when testing on unseen real data (L2a) and in the edge case

(L3), as illustrated in Figure 5.7 (left) and Table 5.5. Meanwhile, the low level of variability

in the behavior data of T1 is beneficial when predicting clean synthetic behavior on test

level L2b compared to the models trained on real data (T2). However, even on the clean

synthetic behavior data of L2b, the models trained on T3 outperform models trained on T1.

This indicates potential in combining real and synthetic data for handling underrepresented

situations in the future. When considering the overall plausibility scores S_P of all models, a

model trained on T3 shows the best overall plausibility score of 73%, followed by a model

trained on T2 with 72%, while models trained on T1 only reach a maximum of 66% as

summarized in Table 5.5.

Regarding variations in the feature settings, results show that contextual features provide a

clear benefit in terms of generalizability when training on T3. The best-performing model

includes all contextual features (EMPI), as shown in Figure 5.8 and Table 5.5. Models trained

on T1 and T2 show the overall best plausibility on the feature setting EMI, but the differences

in S_O and S_ACC when comparing the feature settings do not show a clear tendency. When

considering the ability of models to generate reasonable predictions in exceptional situations

(L3), the feature setting EMI clearly outperforms the others when training on T1 or T2, while

models trained on T3 show the best results when all features are included (EMPI). The models

trained only on synthetic data (T1) show the poorest results overall. Considering different

feature settings, no clear tendency could be found. Overall plausibility S_P shows the best

results on feature setting EMI, and overall accuracy S_ACC is best on feature setting E. But

when it comes to the edge case scenario (L3), one can see a clear advantage of including context


Table 5.5: Results for all model variants across different training data and different feature

settings at all test levels.

The first five columns provide results summarized across all test levels. The remaining columns

show the results for each individual test level. The best results within the training dataset (T1,

T2, T3) are highlighted in bold, the best results per column, i.e. per test level, are highlighted in

bold and underlined.

                         Overall               |      L1       |         L2a          |         L2b          |         L3
Data Feat.     S_O  S_ACC    S_P    FDE    ADE |    FDE    ADE |    FDE    ADE    S_P |    FDE    ADE    S_P |    FDE    ADE    S_P
               [%]    [%]    [%]    [m]    [m] |    [m]    [m] |    [m]    [m]    [%] |    [m]    [m]    [%] |    [m]    [m]    [%]
T1   EMPI    49.34  40.46  58.23   9.93   6.55 |   2.38   1.34 |  12.26   8.27  51.79 |   7.87   4.40  72.92 |  13.43   9.58  49.97
T1   EMP     50.31  40.87  59.74   9.96   6.34 |   2.10   1.20 |  13.41   8.42  58.23 |   7.82   4.04  73.25 |  12.57   9.13  47.75
T1   EMI     54.57  42.56  66.57   9.72   5.94 |   2.39   1.38 |  14.59   9.70  56.57 |   8.52   4.47  71.22 |   9.70   5.93  71.92
T1   EI      50.56  34.62  66.50  11.73   7.51 |   2.40   1.26 |  20.88  16.31  60.71 |   6.86   3.40  78.27 |  12.12   5.95  60.52
T1   EM      43.96  34.63  53.30  11.93   7.31 |   2.19   1.26 |  13.60   9.48  47.58 |   9.86   5.83  64.50 |  17.22   9.65  47.81
T1   E       54.02  44.42  63.61   9.55   5.48 |   2.60   1.45 |  12.04   7.36  57.41 |   7.05   3.47  81.72 |  13.03   7.63  51.70
T2   EMPI    55.84  48.93  62.75   8.87   4.82 |   4.72   2.88 |   7.87   3.85  72.47 |  11.05   5.92  61.54 |   9.75   5.68  54.26
T2   EMP     58.72  53.68  63.75   8.62   4.05 |   3.90   1.95 |   8.79   4.17  72.42 |  10.75   4.97  64.10 |   8.70   4.06  54.73
T2   EMI     57.65  43.37  71.93  10.60   5.05 |   3.97   2.09 |   9.70   4.55  70.52 |  18.91   8.58  62.08 |   6.52   3.52  83.19
T2   EI      53.66  47.74  59.59   9.56   4.64 |   4.60   2.20 |  10.84   4.77  61.75 |  11.02   5.68  57.33 |   9.28   4.67  59.70
T2   EM      55.96  45.58  66.34   8.83   5.80 |   3.94   2.20 |   9.17   4.91  67.28 |  12.73   9.78  59.39 |   7.02   4.50  72.34
T2   E       58.10  47.42  68.78   8.64   5.41 |   4.51   2.26 |   9.10   4.30  73.12 |  11.37   9.88  60.88 |   7.51   3.63  72.35
T3   EMPI    71.83  70.63  73.03   6.50   3.11 |   2.70   1.33 |   6.58   3.16  67.40 |   7.50   3.35  74.49 |   7.32   3.71  77.21
T3   EMP     66.24  62.11  70.37   7.24   3.62 |   3.12   1.73 |   8.13   4.12  65.20 |   7.91   3.72  69.29 |   7.74   3.98  76.62
T3   EMI     61.79  52.81  70.77   8.48   4.29 |   2.81   1.54 |   8.87   4.01  66.85 |   7.18   3.38  75.72 |  12.23   6.84  69.73
T3   EI      66.75  63.05  70.46   7.08   3.60 |   2.85   1.65 |   7.74   3.90  63.73 |   5.97   2.88  82.70 |   9.67   5.00  64.96
T3   EM      61.27  52.98  69.56   8.13   4.50 |   2.91   1.64 |   9.18   5.20  68.96 |   9.94   5.21  70.21 |   7.90   4.50  69.51
T3   E       60.79  54.56  67.02   8.19   4.16 |   2.74   1.62 |  10.21   5.33  59.00 |   6.76   3.05  81.74 |  10.31   5.38  60.32

features during learning (up to 20% more accuracy and plausibility). The fact that models

trained on synthetic data show less benefit from the inclusion of contextual features can be

explained by the driver model used to create the synthetic data. The driver models are not

able to interact and rarely respond to the behavior of others but follow predefined heuristic

rules. Consequently, driver behavior in this dataset is less context-dependent compared to

real traffic data. In addition, the interaction and partner feature spaces contain features for

VRUs that are not present in the synthetic data. The empty features might hinder the training

process. In general, a high inter-dependency between the homogeneity of training data and

the coverage of scene information by the feature setting can be observed.

Figure 5.7: Influence of variability in training data on model performance. Left: accuracy

measured by ADE and FDE on different test levels with best feature setting of training data

category.

Right: scores for accuracy, plausibility, and overall for different training data.

5.5.4.2 Influence of Individual Feature Categories

The effect of map and interaction features on spatial and temporal performance is analyzed to

gain further insight into the impact of individual feature categories. In Figure 5.9 (left), it can


Table 5.6: Analysis of spatial and temporal model performance measured in [%].

Best result for SO_temp and SO_spa per homogeneity level (T1, T2, T3) highlighted in bold,

best overall results highlighted in bold and underlined.

                Collision check     | SO_temp |    SO_spa per level     |  SO_spa
Data Feat.      L2a     L2b      L3 |         |     L2a     L2b      L3 |
T1   EMPI     28.94   72.68   17.34 |   50.81 |   92.61   97.79   97.71 |   96.04
T1   EMP      37.05   71.82    4.01 |   54.44 |   96.47   97.41  100.00 |   97.96
T1   EMI      33.64   66.51   69.91 |   50.08 |   93.50   97.26   96.85 |   95.87
T1   EM       24.22   54.01    0.00 |   39.12 |   83.60   89.00   98.57 |   90.39
T1   E        44.96   80.21   22.78 |   62.58 |   84.10   99.18   91.12 |   91.47
T1   EI       41.07   85.08   39.83 |   63.07 |   94.29   98.35   98.85 |   97.16
T2   EMPI     73.29   47.92   31.66 |   60.60 |   96.15   97.82  100.00 |   97.99
T2   EMP      75.03   49.45   39.11 |   62.24 |   96.89   96.47  100.00 |   97.79
T2   EMI      70.54   53.08   86.39 |   61.81 |   95.94   96.69  100.00 |   97.54
T2   EM       69.32   34.68   71.92 |   52.00 |   89.23   97.18  100.00 |   95.47
T2   E        70.22   51.24   65.76 |   60.73 |   98.23   88.52  100.00 |   95.58
T2   EI       51.56   49.77   35.96 |   50.67 |   93.49   97.54  100.00 |   97.01
T3   EMPI     59.86   71.48   81.66 |   65.67 |   98.65   98.83   98.57 |   98.68
T3   EMP      55.53   61.98   82.23 |   58.76 |   97.80   98.10  100.00 |   98.64
T3   EMI      64.13   77.01   65.76 |   70.57 |   95.70   98.25   98.28 |   97.41
T3   EM       62.76   63.07   62.89 |   62.91 |   95.49   88.88  100.00 |   94.79
T3   E        50.49   81.79   47.28 |   66.14 |   89.58   99.55   99.71 |   96.28
T3   EI       55.90   84.27   50.00 |   70.09 |   94.64   99.43   99.71 |   97.93

be observed that interaction and partner features contribute, on average, to better temporal

plausibility of the results, measured by the frequency of collisions. In addition, the best feature

settings with and without map features across all training levels show an advantage of including

map information regarding spatial plausibility of predictions, as shown in Figure 5.9 (right).

Spatial plausibility is measured by the frequency of road violations and the percentage of path

deviations over 5 m. However, the positive effect of including map features on spatial

prediction accuracy is smaller than expected. Table 5.6 summarizes the results for spatial and

temporal plausibility of different feature settings trained on different datasets. Considering

the individual values presented in Table 5.5, it can be observed that the spatial plausibility,

measured on synthetic test data, partly shows better values without map features. This aspect

should be investigated further.
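The path-deviation part of the spatial plausibility measure can be illustrated with a short sketch. It assumes that the reference path is available as a 2D polyline and that a prediction counts as deviating when any predicted position lies more than 5 m away from it. The function names and the polyline representation are hypothetical and are not taken from the implementation used in this work.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def max_path_deviation(prediction, reference_path):
    """Largest distance of any predicted position to the reference polyline."""
    segments = list(zip(reference_path, reference_path[1:]))
    return max(
        min(point_to_segment_distance(p, a, b) for a, b in segments)
        for p in prediction
    )


def deviation_rate(predictions, reference_path, threshold_m=5.0):
    """Percentage of predictions deviating more than threshold_m (assumed 5 m)."""
    violations = sum(
        max_path_deviation(pred, reference_path) > threshold_m
        for pred in predictions
    )
    return 100.0 * violations / len(predictions)


# Hypothetical example: a straight reference path and two predictions.
reference = [(0.0, 0.0), (50.0, 0.0)]
predictions = [
    [(10.0, 1.0), (20.0, 2.0)],  # stays within 5 m of the path
    [(10.0, 1.0), (20.0, 7.0)],  # leaves the 5 m corridor
]
rate = deviation_rate(predictions, reference)
```

In this example one of the two predictions exceeds the threshold, so the rate is 50 %.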

Figure 5.8: Influence of different feature settings on model accuracy considering homogeneity
level T3. Left: Effect measured on test level L2a. Right: Effect measured on test level L3.

Semantic features are associated with a higher computational effort in data preprocessing.
Therefore, the influence of interaction features representing situational context is analyzed
in more detail. When testing on L3, a clear advantage of semantic interaction features can
be observed. The same tendency, but with a smaller effect, is observed when testing on real
(L2a) and synthetic data (L2b). In general, the inclusion of partner and interaction features
contributes to better temporal accuracy in prediction, as shown in Figure 5.9 (left). Again, a
strong inter-dependency between homogeneity in training data and the utility of each feature
category can be observed. In addition, a lower benefit of contextual features is observed for
models trained on synthetic data (T1), which, again, can be attributed to a lower context
dependency of behavior due to heuristic model strategies for artificial drivers.

Figure 5.9: Influence of interaction features on temporal plausibility (left) and influence of map
features on spatial plausibility (right).

5.5.4.3 Impact of Tuning Parameters

The results of the exemplary parameter tuning variants are provided in Table 5.7 and
show similar effect sizes on accuracy and plausibility as differences in the coverage of scene
information, varying in a range of ±10%. Looking at the effects of variability in the training
data, one can observe a much more powerful effect of up to ±25%. Results show that the
parameter tuning has a large impact on the generalizability of the model since variant ID4,
for example, shows the best results on test level L1 while providing weak generalizability.
Meanwhile, variants ID2 and ID1 provide lower accuracy on test level L1 but outperform variant
ID4 on the other test levels.

Table 5.7: Results for different learning parameters on all test levels according to Table 5.4. Best
result per column highlighted in bold.

                                          L1     L1    L2a    L2a    L2a    L2b    L2b    L2b     L3     L3     L3
ID     SO   SACC     SP    FDE    ADE    FDE    ADE    FDE    ADE     SP    FDE    ADE     SP    FDE    ADE     SP
      [%]    [%]    [%]    [m]    [m]    [m]    [m]    [m]    [m]    [%]    [m]    [m]    [%]    [m]    [m]    [%]
1   55.87  45.42  66.31   9.26   5.43   5.22   2.54   7.07   3.65  74.27  11.38   5.70  62.76  11.35   8.39  61.91
2   53.67  44.76  62.58   9.27   5.62   5.73   2.94   7.96   4.41  64.55   9.68   5.20  62.67  11.94   8.59  60.51
3   55.84  48.93  62.75   8.87   4.82   4.72   2.88   7.87   3.85  72.47  11.05   5.92  61.54   9.75   5.68  54.26
4   45.92  35.60  56.25  11.64   7.08   3.91   1.89  12.45   7.75  59.06  15.89   8.27  57.90  10.44   7.83  51.79
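The displacement errors reported in Table 5.7 follow the commonly used definitions: ADE is the mean Euclidean distance between predicted and ground-truth positions over all prediction steps, and FDE is the distance at the final step. The sketch below illustrates these standard definitions for a single trajectory; the function name and the example coordinates are hypothetical.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) in meters for two equally long lists of (x, y) points."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the prediction horizon
    return ade, fde


# Hypothetical 5-step prediction (e.g. positions after 1 s to 5 s).
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
pred = [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (4.0, 2.0), (5.0, 3.0)]
ade, fde = displacement_errors(pred, gt)
```

Here the per-step errors are 0, 0, 1, 2, and 3 m, giving an ADE of 1.2 m and an FDE of 3 m.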

5.5.4.4 Measuring Generalizability and Plausibility

Generalizability is assumed to be measurable by testing model performance on different traffic
situations and scenarios at varying distances (test levels L1 to L3) from the situations shown during
training [199]. For the various settings shown in Table 5.5 and Table 5.7, the model performing
best on test level L1 is usually not the model providing the best results on L2 or L3, emphasizing
the necessity of a critical evaluation strategy. In particular, for the parameter settings, the
model with the best results on data close to the training data (L1) showed the weakest plausibility
and accuracy overall, indicating poor generalizability. Consequently, an evaluation strategy
that considers a wide range of test data is crucial even at an early stage of development. When
considering plausibility in relation to accuracy, the results with the best accuracy do not
necessarily show the best plausibility. In particular, when testing with real data (L2a), one can
see large differences between accuracy and plausibility. Therefore, in Figures 5.10 and 5.11,
some qualitative prediction results for the L2 test locations are provided and investigated.
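The observation that the best variant on L1 is not the best variant on the other levels can be checked directly against the FDE columns of Table 5.7. The following sketch copies those FDE values and selects the variant with the lowest FDE per test level; the dictionary layout and the function name are hypothetical and serve only as an illustration.

```python
# FDE in meters per tuning variant (ID) and test level, copied from Table 5.7.
fde_by_variant = {
    1: {"L1": 5.22, "L2a": 7.07, "L2b": 11.38, "L3": 11.35},
    2: {"L1": 5.73, "L2a": 7.96, "L2b": 9.68, "L3": 11.94},
    3: {"L1": 4.72, "L2a": 7.87, "L2b": 11.05, "L3": 9.75},
    4: {"L1": 3.91, "L2a": 12.45, "L2b": 15.89, "L3": 10.44},
}


def best_variant_per_level(table):
    """Map each test level to the variant ID with the lowest error."""
    levels = next(iter(table.values())).keys()
    return {
        level: min(table, key=lambda variant: table[variant][level])
        for level in levels
    }


best = best_variant_per_level(fde_by_variant)
```

With these values, each test level is won by a different variant (ID4 on L1, ID1 on L2a, ID2 on L2b, ID3 on L3), which illustrates why a single test level is not sufficient for model selection.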

Figure 5.10: Qualitative results on unseen intersections (isec 4 - L2b top, Heckstrasse - L2a
bottom). Ego GT trajectory marked in orange. Predictions of models trained on EMPI features
with different training datasets (T1, T2, T3). GT and predictions represent positions in the next 5
seconds. For all trajectories: driven path: solid, future path: dotted.

5.5.4.5 Qualitative Prediction Results of Different Model Settings

Figure 5.10 shows situations at the L2a (Heckstrasse) and L2b (isec 4) test locations for models
trained on the feature space EMPI but on different training data variations. The top row of Figure
5.10 illustrates predictions on the synthetic test intersection for EMPI models trained on
datasets T1, T2, and T3. The improved performance of the model trained on T3 compared to T2
and T1 can also be observed in the qualitative results. The model trained only on
synthetic data (T1) shows less accuracy and plausibility compared to the model trained on a
combination of real and synthetic data, even when predicting synthetic behavior.

The importance of the plausibility measure is demonstrated by the example on the right-hand
side, whereby the model trained on T1 and the model trained on T3 show similar accuracy
measured by displacement errors (FDE ∼ 16 m). The trajectory predicted by the T3 model
shows a lower driven velocity and only a small path deviation and therefore still maintains a
certain level of plausibility. Meanwhile, the prediction of the model trained on T1 leads into
an undriveable area and does not show any reasonable behavior.

The second row in Figure 5.10 provides some examples of the real test location Heckstrasse
(L2a). Here, similar effects can be observed. The weak performance of the model trained on
the low level of variability in training data is especially significant. While predictions of the
models T3 and T2 partially still show larger error values, the predictions follow a reasonable
path, and predictions seem to show weaknesses primarily from a temporal perspective. This
might indicate that the training data is not representative enough with respect to higher driven
velocities in such turning maneuvers.

Figure 5.11: Qualitative evaluation of predictions from different feature settings for models
trained on T3 performing on unseen intersections (first row: isec 4 - L2b, second row: Heckstrasse
- L2a). Ego GT trajectory marked in orange. Predictions of models trained on T3 with different
feature settings: EMPI, EMI, and E. GT and predictions represent positions in the next 5 seconds.
For all trajectories: driven path: solid, future path: dotted.

Figure 5.11 illustrates situations with three different model variants (EMPI, EMI, E) trained
on T3 on the two unseen test locations. The predictions in the first row, representing
performance on synthetic data, reveal that predictions of the model trained with context
features are associated with high error values mostly due to weak temporal accuracy, while the
model trained only on ego features generates higher path deviations. The first column
illustrates a typical example of high displacement error but still high plausibility of the
prediction. The trajectory generated by the EMPI model leaves a smaller time gap when
interacting but without becoming critical.

Predictions on real traffic data, shown in the second row of Figure 5.11, demonstrate the
same positive effects of training with contextual features to provide reasonable results. These
examples indicate that the plausibility metric is an indispensable extension of commonly used
metrics to identify situations in which a model provides unreasonable predictions.

5.5.5 Limitations and Future Work

Besides interesting insights regarding the effect of homogeneity in training data, coverage of
scene information, and learning parameters on generalizability, the evaluation has shown that
such model aspects cannot be considered independently. There are strong inter-dependencies
between data, model structure, and learning parameters, which make it challenging to derive
generally valid conclusions. This fact highlights the necessity of a complex and critical evaluation
method to provide more transparency and reliable solutions when using black-box models.
However, as the complexity of the evaluation method increases, so does the difficulty of
interpreting the results. Therefore, scores have been introduced to allow for easy assessment
and comparison. However, these scores combine and average the individual results, which can
lead to smoothing effects.
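The smoothing effect of averaged scores can be illustrated with two rows of Table 5.6. The tabulated SO_temp values are numerically consistent with an unweighted mean of the collision-check results on L2a, L2b, and L3; this is an observation from the numbers and is treated here as an assumption, not as the stated definition of the score. The helper names are hypothetical.

```python
# Collision-check results [%] on (L2a, L2b, L3) for two T3 models, from Table 5.6.
collision_check = {
    "T3-EMPI": (98.65, 98.83, 98.57),
    "T3-EMP": (97.80, 98.10, 100.00),
}


def averaged_score(values):
    """Unweighted mean over test levels (assumed aggregation)."""
    return sum(values) / len(values)


def spread(values):
    """Difference between best and worst test level, hidden by the mean."""
    return max(values) - min(values)


summary = {
    name: (averaged_score(vals), spread(vals))
    for name, vals in collision_check.items()
}
```

Both models obtain almost the same averaged score (about 98.68 and 98.64), although the spread across test levels is roughly 0.3 percentage points for EMPI and 2.2 percentage points for EMP. A single score therefore hides how evenly a model performs across the test levels.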

The relevance of plausibility in prediction results was a major focus of the presented method.
Of course, a safe driving strategy for AVs requires an accurate prediction, and therefore,
an inaccurate but plausible prediction might still cause problems. However, those insights
into plausibility provide a valuable basis for generating more transparency regarding model
weaknesses when developing and tuning black-box models. Furthermore, in the application
for driver models in simulation, the primary objective is to generate human-like reactions,
which does not require the most accurate prediction to be able to interact with other road
users [200].

Furthermore, the plausibility metric employed is based on simple functional indicators.

The prediction methods are focused on the prediction of vehicle behavior. For enabling the full

dimension of SA within a driver model in urban scenarios, the anticipation of VRUs, including

cyclists, pedestrians, or scooters, would be required. The methods themselves are transferable

to such applications, but an adaptation of plausibility checks and features would be required

since, for example, a pedestrian is not restricted to moving within a certain lane or road area.

A simple model approach for prediction is employed, dispensing with the consideration of
temporal context or probabilistic outputs. However, such aspects are commonly addressed in
state-of-the-art approaches and should be combined with the proposed methods in the future
by, for example, using an LSTM instead of an MLP.

For data labeling, the future positions at certain time steps represented as global XY coordinates
were used. There might be potential to improve generalizability by using a local labeling
strategy, such as transforming the global coordinates into local lateral and longitudinal
displacements after 1, 2, 3, 4, and 5 seconds. However, the benefit in generalizability versus the
extended calculation effort when using such a strategy online must be investigated. Test level
L3 contains only a small number of samples (600) to exemplify what such a level of testing
could look like. In order to provide broader reliability of model performance, additional test
data for exceptional situations should be designed and included in testing at L3 to provide
comprehensive insights into potential limitations.
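The local labeling strategy mentioned above can be sketched as a rotation of the global displacement into the ego frame. The sketch assumes that the ego heading is given in radians, measured counter-clockwise from the global x-axis, and that positive lateral values point to the left of the ego vehicle. These conventions and the function name are assumptions made for illustration.

```python
import math


def to_local_displacement(ego_xy, ego_heading_rad, future_xy):
    """Convert a global future position into (longitudinal, lateral) offsets
    relative to the current ego pose."""
    dx = future_xy[0] - ego_xy[0]
    dy = future_xy[1] - ego_xy[1]
    cos_h = math.cos(ego_heading_rad)
    sin_h = math.sin(ego_heading_rad)
    longitudinal = cos_h * dx + sin_h * dy   # along the ego heading
    lateral = -sin_h * dx + cos_h * dy       # positive to the left (assumed)
    return longitudinal, lateral


# Hypothetical example: ego at (10, 5) heading along the global y-axis.
ego_position = (10.0, 5.0)
ego_heading = math.pi / 2
label = to_local_displacement(ego_position, ego_heading, (8.0, 15.0))
```

For this example the future position lies about 10 m ahead and 2 m to the left of the ego vehicle. The same label would result at any intersection with the same relative motion, which is the intended benefit for generalizability.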

For high variability in training data, real traffic data was combined with synthetic samples,
showing the high potential of using simulation to create specific situations to compensate
for those that are underrepresented in the real training data. Of course, the extent to which
human-like behavior can be generated by simulation in such situations depends strongly on
the quality of the driver models in use.

5.5.6 Conclusion

This chapter presented a novel multi-level evaluation method providing detailed insights into
the generalizability of data-driven trajectory prediction models addressing research question
R2.1. Testing at different levels highlighted the criticality of selected test data with respect to
the validity and significance of evaluation results.

Since not only the accuracy but also the plausibility of results is considered, the proposed
methodology allows for the identification of samples showing inconsistent predictions. Such
insights are crucial during the development process to develop reliable solutions.

Two phenomena were observed: firstly, the plausibility of results does not necessarily correlate
with accuracy; secondly, the best-performing setting on test data that is close to the training
data is not necessarily the best setting in terms of generalization. Taking those facts together,
a multi-dimensional evaluation involving a broad range of test data is crucial for determining
the best model setting and should be considered in early stages of development.

Regarding research question R2.2, the evaluation revealed a large impact of the homogeneity in
training data on model performance and, at the same time, the potential to augment existing
real datasets with synthetic samples. The effect of variability in training data was shown
by the performance of the model trained on a combination of synthetic and real data (T3),
outperforming the other variants under consideration. This valuable insight connects directly
to research question R2.3, demonstrating the possibility of using simulation to compensate for
situations that are underrepresented in the real training data. The investigation of different
feature categories demonstrated a positive effect of contextual features on the temporal and
spatial accuracy of predictions. High inter-dependencies between feature setting and training
data were observed, resulting in a complex task to interpret results for general conclusions.
The effect size of variations in the coverage of scene information was identified to be similar to
the effect (∼ 10%) of hyperparameter tuning, which underlines the power of model tuning.

In conclusion, the results show different potentials in the considered conceptual decision
categories. None of the presented models so far fulfills the required accuracy and plausibility
for integration into a driver model covering the entire diversity of urban traffic. However,
valuable insights could be gained, and a novel method could be elaborated, paving the way
for reliable prediction models in the future. The presented method provides a differentiated
perspective on the prediction results and provides more transparency for the use of such
black-box models for trajectory prediction. The problem of generating accurate predictions
in urban traffic is a highly complex task and requires a correspondingly complex evaluation
approach since there will not be one solution that satisfies all requirements, but the right
balance has to be found for the particular application. The multi-dimensional evaluation
showed tendencies of different strength in the investigated conceptual possibilities, which offer
valuable insights for future developments.


5.6 Summary on Anticipation Methods

In this chapter, two novel methods were presented concerning the development of a generalizable
and reliable prediction model for anticipating the future movement of vehicles in complex urban
traffic. Weak generalizability and transparency of data-driven models for trajectory prediction
were identified to be primary challenges in current research. While there are several published
approaches to modeling, these critical aspects are rarely discussed, and model evaluation
takes a secondary place compared to the sophisticated model architectures. Therefore, the
introduced methods in this chapter focused on generating insights regarding the effect of
different conceptual choices on the model’s ability to generalize and on elaborating a concept to
evaluate generalizability adequately. In summary, the purpose of this chapter is to pave the
way to reliable predictions, not to provide another solution for trajectory prediction that is not
feasible due to lack of generalizability. Referring to research question R2, it can be concluded
that there is no single perfect solution with regard to the high variability and uncertainty
in urban traffic. NNs show great potential in capturing such highly complex relationships
but are associated with a low degree of transparency. Therefore, to take full advantage of
these modeling approaches, a stronger focus on critical evaluation methods rather than the
development of more sophisticated model structures is recommended.

One of the key findings within the presented results is the demonstration of improved SA by
incorporating semantic and non-semantic representations of interactions of the ego vehicle
with surrounding road users. Furthermore, the methods have highlighted the critical role
of training and testing data in model performance and the significance of evaluation results.
High variability of training data and enrichment of real datasets with synthetic data hold the
potential to improve generalizability significantly. Moreover, the importance of meaningful
evaluation metrics was emphasized, as the inclusion of additional plausibility metrics besides
the commonly used displacement errors allows for clear identification of situations where a
model has significant weaknesses. It is recommended to use the proposed methods to refine
the prediction model further before embedding it into a holistic driver model.


6

Dynamic Decision-Making

Disclaimer: The present chapter is based on research presented in: [201]: Teresa Rock et al. “Dynamic Decision-Making

for Agent Models in Urban Driving Simulation”. In: Proceedings of the Driving Simulation Conference 2023 Europe VR.

ed. by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation Association. Antibes, France,

2023, pp. 169–179.

The following chapter addresses research question R3: How to enable driver agents

for dynamic decision-making in order to cope with typical urban scenarios in a

functional and human-like manner?

After a brief introduction outlining the motivation for more dynamic decision-making in driver
models, the state-of-the-art section provides information on decision strategies for agent models
in DS and the main concepts of trajectory planning, which is the common strategy used in AD.
Subsequently, a concept involving two different techniques for decision-making is presented to
address the proposed research question. Both approaches are compared in terms of runtime
and quality of results with each other and with a heuristic agent model to investigate potentials
and weaknesses in typical urban scenarios.

6.1 Introduction and Motivation

The requirements analysis in Section 3.3.4 demonstrated the need for dynamic decision-making
from a psychological point of view in terms of SA and as a requirement for interactive behavior
in traffic. The review of state-of-the-art approaches in Section 2.1 showed that agent models
in driving simulation typically follow heuristic and hierarchical structures. Decisions are made
by choosing between predefined options that discretize the driving task, such as maneuvers [39,
13] or states [22]. Given the variety of situations encountered in urban traffic, it is clear that
such approaches are limited in their ability to exhibit human-like behavior by dynamically
adapting to the situation since each maneuver and the corresponding rules for choosing it must
be explicitly formulated. More maneuvers and heuristic rules can be added to solve complex
interactive situations. However, such strategies do not scale, and one risks ending up in endless
if-then loops addressing single-point solutions. Furthermore, the strong discretization of the
driving task leads to a limited solution space and thus often to deadlocks in complex situations
[137, 138]. Such functional errors impede the simulation and might negatively influence the
participants’ perception of reality in a simulator experiment [139, 118]. Meanwhile, urban
scenarios are gaining importance in driving simulation [4]. Consequently, current modeling
approaches are well suited for simple urban situations but show weaknesses in handling more
complex or interactive scenarios that would require dynamic decision-making for adapting
behavior to the situation [103, 9].

DDM is an established term in psychology and cognitive research that is characterized by a
decision-making process consisting of a sequence of interdependent decisions that influence
future actions [140]. In contrast, static decision-making is referred to as a linear process
that selects between explicit alternatives [141]. In the context of driver models within this
thesis, heuristic modeling approaches that choose between explicit maneuvers are understood
as static decision-making, while urban traffic requires a method that allows the driver model
to employ DDM. In this case, DDM is interpreted as the ability to take actions dependent on
future evolutions of the situation without choosing between explicitly formulated maneuver
alternatives.

Hence, it is necessary to explore alternative strategies for decision-making in agent models. In
the domains of AD and robotics, spatio-temporal behavior is typically derived by trajectory
planning. These approaches are more in line with the way humans handle such situations,
balancing various weighted needs such as safety, compliance, or efficient target reaching.
Additionally, based on anticipation, these methods consider different developments in the
driving scenario, enabling adaptive behavior in response to dynamic changes in the situation.
To meet the challenge of dynamic decision-making, the following sections present a method for
agent models, adapted from trajectory planning approaches commonly utilized in AD. Since
such planning approaches induce increased complexity, the proposed methods are compared to
a heuristic agent model, which is in use at BMW and called TRM [55]. The objective is to
investigate when and to what extent the increased complexity of the dynamic decision-making
strategy can add value.

6.2 State-Of-The-Art: Decision-Making

6.2.1 Decision-Making Strategies for Agent Models in Simulation

The main concepts of how vehicle behavior is modeled in DS were presented in Section 2.2.5.
The decision-making process of current driver agents is mostly rule-based and aims to perform
the appropriate maneuver or action based on heuristic rules processing situational information.
For example, the well-known open-source simulation framework SUMO includes modules
for following, handling intersections, and lane changes in order to cope with the variety of
situations that occur in traffic [39, 34]. The decision on which of these modules is active is
made hierarchically based on predefined rules that take into account the topological features
of the map and the dynamic movements of other road users. Normally, each maneuver is
associated with a respective motion model. For example, the longitudinal behavior during a
following maneuver can be generated by the Wiedemann-following model [13, 51] or the IDM
[50, 43]. The driver agent used at BMW also distinguishes between lateral and longitudinal
movement and follows a hierarchical approach. Based on situational information, several
evaluators determine their need for action at each time step. A detailed description of the
decision-making process is provided in Section 2.2.6. Other commercial simulation frameworks
such as cogniBit¹ or AAI² use more sophisticated approaches to model human-like driving
behavior including learning-based concepts in combination with planning algorithms. However,
detailed information on these commercial solutions is not publicly available.
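As an illustration of such a motion model, the following sketch implements the commonly published form of the IDM acceleration equation. The parameter values are illustrative assumptions and are not taken from SUMO, from the BMW driver agent, or from this thesis. Clamping the dynamic part of the desired gap at zero is a frequently used variant and is likewise an assumption.

```python
import math


def idm_acceleration(v, v_lead, gap,
                     v0=13.9, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Longitudinal acceleration [m/s^2] of a following vehicle.

    v       ego speed [m/s]
    v_lead  speed of the leading vehicle [m/s]
    gap     bumper-to-bumper distance to the leader [m], must be > 0
    v0      desired speed, T desired time gap, a_max maximum acceleration,
    b       comfortable deceleration, s0 minimum gap, delta exponent
            (all illustrative values)
    """
    dv = v - v_lead  # approach rate, positive when closing in
    desired_gap = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (desired_gap / gap) ** 2)


# Standing start with a distant leader: close to maximum acceleration.
a_free = idm_acceleration(v=0.0, v_lead=0.0, gap=100.0)
# Approaching a stopped vehicle at desired speed with a short gap: braking.
a_brake = idm_acceleration(v=13.9, v_lead=0.0, gap=10.0)
```

The model reacts only to the direct leader along the lane, which illustrates why each maneuver in a heuristic agent needs its own motion model and selection rule.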

6.2.2 Trajectory Planning Approaches for Automated Vehicles

Planning the trajectory of an AV aims to find the best movement within a spatio-temporal
configuration space, taking into account safety, comfort, and efficiency [202]. Common
approaches for addressing the planning problem can be divided into three main categories:
sampling-based methods, search-based methods, and optimization-based methods [59]. The
following sections provide a brief overview of basic trajectory planning methods used in the
context of AD. Since trajectory planning itself is a broad field of research, more details can be
found in state-of-the-art reviews, such as the analysis by Gonzales et al. [203].

6.2.2.1 Search-Based Methods

In general, search-based approaches for trajectory planning explore the environment
to find a suitable solution. This class of methods typically constructs a directed graph by
discretizing the configuration space with a fixed number of motion primitives, building the
search graph a priori to avoid costly random exploration of the free space of the environment at
runtime. The search strategies focus on finding a globally optimal or sub-optimal path through
the graph according to the defined objectives. Please note that depending on the dimensions
of the search space, the result can either be a spatial path (x/y), a velocity profile (s/t), or a
spatio-temporal trajectory (x/y/t). The respective result and the computational effort strongly
depend on the design of the search spaces and the heuristics considered to evaluate potential
nodes in the graph. Commonly used methods are based on the A* algorithm, state lattice
methods, or RRT (Rapidly-exploring Random Trees) [59, 204, 58]. Researchers use different
strategies for incorporating kinematic limitations into such point-based search algorithms,
such as using Dubins distance elements as motion primitives [205, 204]. Search-based methods
are associated with advantages such as low calculation effort and sufficient flexibility for
unstructured environments [206]. On the other hand, A* and related algorithms
suffer from high sensitivity to the design of the search space, as such methods might provide
only sub-optimal or no solution.
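The principle of graph search can be sketched with a minimal A* implementation. For brevity, the sketch searches a 4-connected occupancy grid with unit step costs and a Manhattan-distance heuristic; in the planning methods cited above, the graph would instead be spanned by motion primitives. The grid layout and all names are hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Shortest path on a grid (0 = free, 1 = blocked); returns a list of
    (row, col) cells from start to goal or None if no path exists."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # outdated heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = astar(grid, start=(0, 0), goal=(2, 0))
```

The sensitivity to the search-space design mentioned above is visible here as well: if the only free cell in the middle row were blocked, the search would return no solution even though a continuous maneuver might exist at a finer resolution.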

¹ https://cognibit.de/drivebot/
² https://www.automotive-ai.com/

6.2.2.2 Sampling-Based Methods

Unlike environment exploration, the basic idea of sampling-based methods is to generate a
set of trajectories, which are then evaluated to determine the best trajectory among these
candidates. Various methods are available to generate these candidate trajectories. For
example, Zhang et al. proposed an approach in which lateral motions are sampled along a
reference path. Subsequently, the best trajectory is selected based on an objective function [59].
A similar approach is proposed by Lim et al. whereby the static environment is discretized by
topological lane information to derive high-level maneuver decisions [66]. As such high-level
decisions do not provide feasible paths, further smoothing is applied by employing a numerical
optimization-based method.
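The generate-and-evaluate idea can be sketched as follows. The example samples candidate paths with different lateral end offsets relative to a straight reference line, discards candidates that come too close to an obstacle, and selects the remaining candidate with the lowest cost. The blending function, the cost terms, the weights, and the safety radius are illustrative assumptions and do not reproduce the cited approaches.

```python
import math


def sample_candidates(length_m, lateral_offsets, n_points=11):
    """Paths along the x-axis that blend smoothly from offset 0 to each
    target lateral offset (cubic smoothstep blend)."""
    candidates = []
    for end_offset in lateral_offsets:
        path = []
        for i in range(n_points):
            u = i / (n_points - 1)
            blend = 3 * u**2 - 2 * u**3
            path.append((u * length_m, end_offset * blend))
        candidates.append(path)
    return candidates


def path_cost(path, obstacles, safety_radius=1.5, w_offset=1.0, w_clear=10.0):
    """Infinite cost for colliding paths, otherwise a weighted sum of the
    final lateral offset and the inverse obstacle clearance.
    Assumes at least one obstacle is given."""
    clearance = min(
        math.hypot(x - ox, y - oy) for x, y in path for ox, oy in obstacles
    )
    if clearance < safety_radius:
        return math.inf
    return w_offset * abs(path[-1][1]) + w_clear / clearance


def select_best(candidates, obstacles):
    return min(candidates, key=lambda path: path_cost(path, obstacles))


# Hypothetical scene: one obstacle slightly right of the reference line.
obstacles = [(20.0, -0.5)]
candidates = sample_candidates(40.0, [-3.0, -1.5, 0.0, 1.5, 3.0])
best_path = select_best(candidates, obstacles)
```

In this scene only the candidate ending 3 m to the left keeps the required clearance, so it is selected. The result is restricted to the sampled patterns, which is the limitation that optimization-based methods avoid.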

6.2.2.3 Optimization-Based Methods

As a third option, optimization-based methods formulate the planning problem as a
mathematical model incorporating a cost function and constraints. Optimization-based
methods employ either linear or nonlinear solving strategies to address the planning task,
depending on the individual mathematical problem formulation. These methods typically
represent the path of a vehicle using parameterized curve models, such as splines, spirals, or
polynomials [59, 66]. Individual points are then moved within a finite dimensional parameter
vector space to optimize for the specified objectives in order to obtain smooth trajectories. The
constraint function is responsible for keeping the trajectory compliant with the dynamic
constraints of the vehicle. Depending on the design, and thus the linearity, of the constraint
function, different solver algorithms are applied, such as QP (Quadratic Programming) with
the IP (Interior Point) method [67] or an SQP (Sequential Quadratic Programming)
algorithm for solving a nonlinear constraint function [207]. In contrast to the aforementioned
sampling-based method, the numerical method is not limited by predefined patterns and
allows for flexible solutions. However, the flexibility of the approach is associated with a high
computational cost to solve the spatio-temporal planning problem.
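The idea of moving individual points to minimize a cost function can be sketched with a strongly simplified example. It smooths a rough path by gradient descent on a quadratic cost consisting of a data term (stay close to the raw points) and a smoothness term (keep neighboring points close together). Fixed endpoints are the only constraint; vehicle dynamics, QP or SQP solvers, and curve parameterizations are deliberately left out. Weights, step size, and names are assumptions.

```python
def smoothness_cost(path):
    """Sum of squared distances between consecutive points."""
    return sum(
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for (x0, y0), (x1, y1) in zip(path, path[1:])
    )


def smooth_path(raw, w_data=1.0, w_smooth=2.0, step=0.1, iterations=200):
    """Gradient descent on
        0.5 * w_data * sum |p_i - raw_i|^2
      + 0.5 * w_smooth * sum |p_(i+1) - p_i|^2
    with the first and last point held fixed."""
    path = [list(p) for p in raw]
    for _ in range(iterations):
        updated = [p[:] for p in path]
        for i in range(1, len(path) - 1):
            for k in range(2):  # x and y coordinate
                gradient = (
                    w_data * (path[i][k] - raw[i][k])
                    + w_smooth * (2 * path[i][k] - path[i - 1][k] - path[i + 1][k])
                )
                updated[i][k] = path[i][k] - step * gradient
        path = updated
    return [tuple(p) for p in path]


# Hypothetical zigzag path, e.g. a rough first-guess decision.
raw_path = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
smoothed = smooth_path(raw_path)
```

The smoothed path keeps its endpoints but has a lower smoothness cost than the raw zigzag. Unlike the sampling-based sketch, the result is not restricted to a predefined set of candidates, at the price of an iterative solution process.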

6.2.2.4 Decomposition Strategies

To reduce the complexity of the planning problem, decomposition strategies use sequential or
hierarchical architectures, such as PVD (Path-Velocity-Decomposition) or lateral-longitudinal
decomposition. PVD is the most common method to reduce the complexity of the 3D problem
by separating it into sequential tasks that can be solved in two dimensions [67]. First, a
path is planned in a two-dimensional spatial space. Secondly, the velocity profile is planned
along the path [208, 209]. Different methods can be used to perform the individual parts
of the sequential chain. Another approach for reducing complexity is to decouple lateral
and longitudinal motion planning. Such approaches are often used in highway scenarios, as
presented by Werling et al. [210].
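The second PVD step can be sketched for a path that has already been planned. The example derives a speed limit at each path point from the local curvature and a lateral acceleration bound, and then applies a forward and a backward pass to respect longitudinal acceleration and deceleration bounds. This is one simple way to plan a velocity profile along a path and is not the method used in the cited works; all limits and names are assumptions, and dynamic obstacles are ignored.

```python
import math


def arc_lengths(path):
    """Cumulative travelled distance s at each path point."""
    s = [0.0]
    for p0, p1 in zip(path, path[1:]):
        s.append(s[-1] + math.dist(p0, p1))
    return s


def curvature(p0, p1, p2):
    """Menger curvature of three points (0 for collinear points)."""
    a, b, c = math.dist(p0, p1), math.dist(p1, p2), math.dist(p0, p2)
    if a * b * c == 0.0:
        return 0.0
    twice_area = abs(
        (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    )
    return 2.0 * twice_area / (a * b * c)


def plan_velocity(path, v_start, v_max=13.9, a_lat_max=2.0, a_long_max=1.5):
    """Velocity profile v(s) along a fixed path (step 2 of PVD)."""
    s = arc_lengths(path)
    v = [v_max] * len(path)
    for i in range(1, len(path) - 1):
        kappa = curvature(path[i - 1], path[i], path[i + 1])
        if kappa > 0.0:
            v[i] = min(v_max, math.sqrt(a_lat_max / kappa))
    v[0] = min(v_start, v[0])
    for i in range(1, len(v)):            # forward pass: acceleration bound
        ds = s[i] - s[i - 1]
        v[i] = min(v[i], math.sqrt(v[i - 1] ** 2 + 2.0 * a_long_max * ds))
    for i in range(len(v) - 2, -1, -1):   # backward pass: deceleration bound
        ds = s[i + 1] - s[i]
        v[i] = min(v[i], math.sqrt(v[i + 1] ** 2 + 2.0 * a_long_max * ds))
    return s, v


# Hypothetical path: straight segment followed by a left turn.
turn_path = [(0, 0), (10, 0), (20, 0), (25, 5), (25, 15), (25, 25)]
s_profile, v_profile = plan_velocity(turn_path, v_start=8.0)
```

The profile slows down before the turn entry at the third path point, because the backward pass propagates the curvature-based limit to the preceding points. Since the path is fixed beforehand, the planner cannot trade a wider path for a higher speed, which is the typical limitation of the decomposition.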

6.2.3 Potentials and Weaknesses of Heuristic Decision-Making and Trajectory Planning

Compared to heuristic modeling approaches, trajectory planning offers the possibility to react
in a more human-like way since the potential solution space is larger and less discretized, and
thus, the decision process is more dynamic. As the number of potential solutions increases, so
does the complexity, which is accompanied by a higher computational cost. Meanwhile,
real-time capability is a crucial functional requirement for agent models in DS. In conclusion, the
challenge is to find the best trade-off between the complexity of the approach and the quality
of results. Therefore, it is necessary to gain more insight into the capabilities and limitations
of such approaches to allow for appropriate usage depending on the level of complexity of the
situation.

6.3 Method: Dynamic Decision-Making for Agent Models

6.3.1 Concept for Dynamic Decision-Making

Trajectory planning for urban traffic in the context of AD remains an unsolved challenge:
although various publications are available, there is no established solution applicable
to the diversity of situations encountered in urban traffic. Consequently, it was not possible to
adapt an existing method and apply it as a tool for agent models in DS.

Therefore, the subsequent method introduces a novel planning framework inspired by promising
publications. In the following, two concepts for trajectory planning that differ in complexity
and planning strategy are presented to gain insight into the disadvantages and potentials of
different methods compared to heuristic decision-making. Both planning approaches are based
on the same framework, realizing a two-layer concept inspired by Lim et al. [66], but employ
different strategies for creating a first-guess trajectory representing a high-level decision. Based
on the hierarchical planning framework, two different planning methods for decision-making
are developed.

The first planner variant employs a decomposition strategy to reduce the complexity of the
planning problem, named PL_PVD, while the other approach employs a three-dimensional
search strategy referred to as PL_3D. For the decomposition concept, the PVD method was
selected due to its prevalence in AD [66]. To reflect the state-of-the-art in heuristic models,
the driver agent implemented at BMW, called TRM, is used. Details on the model structure
and decision strategies are provided in Section 2.2.6.

Since the focus in this chapter lies on the decision-making process, the impact of prediction

accuracy should be minimized. Therefore, GT data is taken as a prediction in the initial

concept implementation.

Based on research question R3, three sub-questions are formulated to address the challenging
topic in all dimensions.

• R3.1: To what extent can a PVD approach (PL_PVD) and the three-dimensional approach (PL_3D)
provide additional value in terms of quality compared to a heuristic state-of-the-art agent
model?

• R3.2: How do the PL_PVD and PL_3D approaches differ in terms of runtime and quality of
results in different scenarios?

• R3.3: How sensitive are runtime and solution quality of planning approaches concerning
different parameter sets and changes in the scenario?

6.3.2 The Planning Framework

A multi-layer planning framework inspired by Lim et al. [66] is developed to enable more
dynamic decision-making. To reduce the planning complexity, the original high-dimensional
problem is split into a two-layer concept of lower dimensionality with different objectives,
as illustrated in Figure 6.1:

The Behavioral Layer generates a rough first-guess trajectory reflecting a high-level maneuver
decision. Many published approaches in AD first make a maneuver decision by categorizing the
driving task and using, for example, state machines to select a maneuver, and subsequently
plan the trajectory for that maneuver [66]. However, this would result in the same problem
faced with heuristic driver models, which do not show enough flexibility to cope with the
variety of urban traffic. Therefore, the behavior planning approach was adopted from Lim et
al. [66].

The Motion Planning Layer produces a smooth and dynamically feasible trajectory on that

basis since the output of the behavioral layer is a roughly discrete spatio-temporal decision.

The entire process can be described as follows: the first layer outputs a long-term first-guess
trajectory, which is a non-smooth motion that is then further optimized in the second layer to
generate a smooth and dynamically feasible trajectory of smaller time horizons.

Since the high-level decision is the more sensitive part in terms of plausible behavior, two
different approaches for the behavioral layer were developed to gain insight into the weaknesses
and potential of different techniques.

6.3.2.1 Planner Variant: PL_PVD

In order to reduce the complexity and computational effort of generating the first-guess
trajectory, the planning problem is solved by first planning a path and subsequently planning
the velocity profile, both in two-dimensional space.

Figure 6.1: The hierarchical planning framework realized as a two-layer concept for first
planning a rough discretized first-guess trajectory (behavioral layer), which is subsequently
smoothed into a dynamically feasible trajectory by the motion planning layer.
(Diagram content: input of urban environment information and global route; behavioral layer
producing the first-guess trajectory, with Variant A: PL_PVD (1. path finding, 2. velocity
profile) and Variant B: PL_3D (three-dimensional spatio-temporal planning); motion planning
layer producing the feasible trajectory through numerical optimization by SQP and smoothing;
output of the local trajectory.)

• Spatial Planning:
In the first step, the path is generated using a search-based method. The approach employs an
adaptation of the hybrid A* algorithm applied to a spatial configuration space. The
configuration space considers the static environment composed of the road network and static
obstacles. The original concept of the A* algorithm was introduced by Hart et al. [211], and
the principal idea can be described as follows. A graph comprises a set $S$ of nodes $n_i$ and
transitions $e_{ij}$ between nodes, which are associated with respective costs $c_{ij}$. The
graph is limited by source nodes and terminal nodes. Paths are ordered node sets with a total
cost summarizing individual transition and node costs. The algorithm searches for an optimal
path from a source node $n_0$ to a terminal node $n_k$. The search loop within the algorithm
continues until an optimal path is found, assuming its existence. For this purpose, two sets
are iteratively updated: the open set and the closed set. The closed set contains all nodes
that have already been visited, whereas the open set comprises all nodes that have not been
visited but were identified as successor nodes of those already visited. A key element of the
search algorithm is the value function $f(n_i)$, which returns a cost value of a node $n_i$.
$f(n_i)$ is the sum of the total cumulative cost $g(n_i)$ of each step in the path
$(n_1, n_2, \dots, n_k)$ and the heuristic cost $h(n_i)$ as an estimate of an optimal path from
the node $n_i$ to a terminal node $n_k$ [211]. To generate plausible trajectories, it is
necessary to incorporate semantic information into the search algorithm. Therefore, the value
function $f(n_i)$ is extended by additional costs $s(n_i)$ aiming at forcing the vehicle to
stay as close as possible to the ego lane, according to Equation (6.1). The semantic cost
$s(n_i)$ is composed of the following weighted penalties:

– penalizing the deviation from the center line of the ego lane

– penalizing moving into oncoming traffic lanes

$$f(n_i) = g(n_i) + h(n_i) + s(n_i) \tag{6.1}$$

The individual semantic costs have been chosen in such a way that as many scenarios as
possible can be covered with a simple cost approach, since each additional cost factor
contributes to an increased computational effort during the search. Another key element is the
collision avoidance strategy, which is performed by checking for intersections between a
polygon representing the ego vehicle and polygons representing static obstacles in the
environment. A collision occurs if at least one obstacle polygon intersects with the ego
vehicle's motion polygon. To reduce the number of nodes to be explored and the computational
effort, a consideration window is defined, which checks if there is a static obstacle in the
upcoming road section. If there is no obstacle present, the vehicle is expected to follow the
center line of the respective lane, and node generation takes place with a controlled action
space $A_c: (v_{const}, \phi_{const}, d_{step,const})$ by employing a P-controller. If there is
an obstacle present, child nodes with different steering angles $\phi$ are sampled to find a
path avoiding the obstacle. The node generation takes place with the default action space
$A_d: (v_{const}, \phi \in [-\phi_{max}, 0, +\phi_{max}], d_{step,const})$. To obtain
feasibility, a bicycle model underlies the calculation of the positions of the next child
nodes, incorporating ego position, heading, and steering angle. Consequently, the ego moves
continuously under vehicle kinematic constraints in XY space. In summary, spatial planning
provides a kinematically feasible path $p$ based on the inputs of maximum steering angle
$\phi_{max}$, desired velocity $v_{const}$, and constant step width $d_{step,const}$.

• Temporal Planning:
Since spatial planning is conducted in XY space, the dynamic movements of other road users
cannot be taken into account during path planning. Therefore, the velocity profile is
subsequently planned along the identified reference path, again using a hybrid A* algorithm.
The algorithm searches in an $s/t$ space ($s$ along path $p$ over time $t$), incorporating the
dynamic movements of other traffic participants, inspired by Lim et al. [66]. Path $p$,
represented as a sequence of states $(x_k, \phi_k)$, is assumed to be a trajectory with
constant velocity. Given path $p$, the main concept is to reduce the velocity to 0 before the
ego vehicle reaches an area where collisions with dynamic obstacles would occur. In detail,
the objective is to decide at each state $(x_k, \phi_k)$ whether proceeding to the next state
$(x_{k+1}, \phi_{k+1})$ would cause a collision with any other road user. In order to consider
the dynamic movement of others, the time dimension is assumed to be discrete. If the
transition from state $(x_k, \phi_k)$ to the next state $(x_{k+1}, \phi_{k+1})$ at time $t$
would cause a collision, the state remains $(x_k, \phi_k)$ for $t+1$, as illustrated in
Figure 6.2. Consequently, the control action to perform a transition between two nodes
originates from a set $V = \{0, v_{const}\}$, whereby $v_{const} = d_{const}/t$. After
conducting the temporal planning, the sequence of nodes represents a trajectory ($x/y/t$), and
the result of the search algorithm is a non-smooth spatio-temporal trajectory reflecting a
high-level maneuver decision.
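The temporal planning step can be illustrated with a small sketch. It is not the implementation evaluated in this work: it uses a plain A* search on a discrete lattice instead of a hybrid A*, and the function name `plan_velocity_profile`, the `blocked` set of occupied (path index, time step) cells, and the unit step cost are assumptions made for illustration only. The sketch shows the open set (a priority queue ordered by f = g + h), the closed set, and the two control actions "advance" and "wait".

```python
import heapq


def plan_velocity_profile(num_states, horizon, blocked):
    """Search the discrete s/t lattice for the fastest collision-free profile.

    num_states: number of states along the reference path p (index k).
    horizon:    last admissible discrete time step.
    blocked:    set of (k, t) cells occupied by other road users (hypothetical input).
    Returns the path index held at t = 0, 1, 2, ... or None if no profile exists.
    """
    goal_k = num_states - 1
    start = (0, 0)
    open_heap = [(goal_k, 0, start)]  # entries are (f, g, node) with f = g + h
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue
        closed.add(node)
        k, t = node
        if k == goal_k:
            # Walk back through the parents to recover the profile.
            profile = []
            while node is not None:
                profile.append(node[0])
                node = parent[node]
            return profile[::-1]
        if t + 1 > horizon:
            continue
        for advance in (1, 0):  # control set V = {v_const, 0}: advance one state or wait
            child = (k + advance, t + 1)
            if child in blocked or child in parent:
                continue
            parent[child] = node
            g_child = g + 1              # every transition costs one time step
            h_child = goal_k - child[0]  # remaining states: admissible heuristic
            heapq.heappush(open_heap, (g_child + h_child, g_child, child))
    return None


# A road user occupies path state 2 at t = 2 and t = 3, so the ego has to wait.
blocked_cells = {(2, 2), (2, 3)}
profile = plan_velocity_profile(num_states=5, horizon=10, blocked=blocked_cells)
print(profile)
```

Because every transition costs exactly one time step, the cumulative cost of a node equals its time index, which is why each lattice cell only needs to be discovered once in this simplified setting.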

6.3.2.2 Planner Variant: PL_3D

The PVD technique decouples planning in space and time. Therefore, situations in which
spatial and temporal motion are highly interdependent, such as overtaking a cyclist, can lead to
implausible behavior. In order to address this shortcoming, an approach for directly planning a
first-guess trajectory in three-dimensional space is developed. The three-dimensional planning
strategy introduces a new challenge in terms of representing dynamic obstacle information
within the configuration space. Common approaches in the field of robotics employ DOGs
or spatio-temporal grids with DAGs to represent dynamically changing situations [59, 152].
Such representations are effective for describing dynamic scenes, especially when the input
data is in the form of point clouds. Nevertheless, when information is discretized into grids, it
loses semantic context. Additionally, generating these dynamic grids in real time consumes
significant computational resources [153, 154]. To satisfy the functional requirements of agent
models in simulation, a method is needed that has the potential, from a conceptual point of
view, to be real-time capable. Therefore, the presented method avoids the need to discretize
the entire environment at runtime by representing static and dynamic obstacles using a hybrid
ellipsoidal APF (Artificial Potential Field) inspired by promising publications in the context
of AVs [212, 213]. The APF approach is widely established for obstacle avoidance in complex
environments in robotics due to its efficiency, ability to generate smooth trajectories, and
simplicity [214]. Besides these advantages, such approaches are also associated with some
potential problems. Sun et al. identify the following as the most common problems associated
with APFs for collision avoidance [214]. By converting obstacle information into a force,
important information, such as obstacle distribution, is lost. Especially in obstacle-intensive
areas, this can lead to problems as no solution space remains. Furthermore, depending on the
design of the planning problem, local minima or jitter may occur around the obstacle. However,
the APF method has been identified as the most promising concept for the problem addressed
here, and potential drawbacks are addressed by novel techniques such as those proposed by Liu
et al. [213] to compute a situation-adaptive APF that provides various parameters for tuning
towards a satisfactory solution.

Figure 6.2: Illustration of the velocity profile generation using a hybrid A* algorithm applied
in the s/t space, adopted from [66]. (Axes: s over t; marked elements: dynamic obstacle,
terminal position.)

The APF is incorporated in the cost function for collision avoidance and integrated with
additional time-related changes into the hybrid A* implementation described in Section 6.3.2.1.
The APF is formed by relative position, ego vehicle velocity, relative velocity and acceleration,
a safety distance factor, and coefficients for calculating the formation of the field, as
illustrated in Figure 6.3. Due to its flexible nature, the APF method can handle various
interactive situations, e.g., by tuning parameters or by adding additional factors such as
increased lateral vulnerability for obstacles of type cyclist. An example of the calculated APF
costs for two exemplary scenarios is illustrated in Figure 6.4. Thus, the value function
$f(n_k)$ is extended by several weighted semantic costs $s(x_k)$ aiming at generating plausible
movements with respect to the situation:

• penalizing the deviation from the center line of the ego lane

• penalizing moving into oncoming traffic lanes

• time cost

• APF cost regarding static and dynamic obstacles

Figure 6.3: The APF formed by the position of the obstacle and velocity and position of the ego
vehicle [213]. (Labels: obstacle, v.)
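To make the idea of an ellipsoidal potential field more tangible, the following sketch evaluates a repulsive potential around a single obstacle. It is an illustrative assumption and not the formulation of Liu et al. [213] or the field used in this work: the Gaussian shape, the function names `ellipsoidal_apf` and `apf_cost`, and all parameters (`amplitude`, `sigma_lat`, `sigma_lon0`, `speed_gain`) are hypothetical. The only property it is meant to show is that the field is stretched along the obstacle's heading and grows with the closing speed.

```python
import math


def ellipsoidal_apf(px, py, obs_x, obs_y, obs_heading, closing_speed,
                    amplitude=100.0, sigma_lat=1.5, sigma_lon0=3.0, speed_gain=0.5):
    """Repulsive potential at (px, py) caused by one obstacle (hypothetical shape).

    The query point is rotated into the obstacle frame, so the ellipse is aligned
    with the obstacle heading. A higher closing speed stretches the longitudinal axis.
    """
    dx, dy = px - obs_x, py - obs_y
    lon = math.cos(obs_heading) * dx + math.sin(obs_heading) * dy
    lat = -math.sin(obs_heading) * dx + math.cos(obs_heading) * dy
    sigma_lon = sigma_lon0 + speed_gain * max(closing_speed, 0.0)
    exponent = (lon / sigma_lon) ** 2 + (lat / sigma_lat) ** 2
    return amplitude * math.exp(-0.5 * exponent)


def apf_cost(px, py, obstacles):
    """Sum the potentials of all obstacles; each obstacle is a tuple
    (x, y, heading, closing_speed) in this sketch."""
    return sum(ellipsoidal_apf(px, py, *obstacle) for obstacle in obstacles)


# Potential 3 m in front of and 3 m beside a stationary obstacle at the origin.
ahead = ellipsoidal_apf(3.0, 0.0, 0.0, 0.0, 0.0, 0.0)
beside = ellipsoidal_apf(0.0, 3.0, 0.0, 0.0, 0.0, 0.0)
print(ahead, beside)
```

In such a sketch, a larger `sigma_lat` could stand in for the increased lateral vulnerability of cyclists mentioned above.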

Similarly to the PL_PVD approach, the semantic costs were selected in such a way that as
many scenarios as possible can be covered with as few cost factors as possible. Whether the
designed cost function generates plausible behavior in typical urban scenarios is examined
during evaluation. The search algorithm for three-dimensional planning also employs the
aforementioned consideration window method to determine whether there is a static or dynamic
obstacle ahead and, therefore, specifies two strategies for node selection:

Controlled: No static or dynamic obstacle is ahead, so nodes are sampled only on the center
line employing a controlled action space $A_c$ as described in Section 6.3.2.1.

Default: Nodes with different steering angles and acceleration values are sampled to avoid
collisions and to find a feasible path to the destination employing action space
$A_d: (acc \in [0, acc_{min}, acc_{med}, acc_{max}], \phi \in [-\phi_{max}, 0, +\phi_{max}], d_{step} = f(acc))$.

Based on the simplified representation of the vehicle model and the forward search strategy of
the hybrid A*, it is assumed that the vehicle cannot reverse and only non-negative acceleration
values can be assigned to the action space. Similar to PL_PVD, the main concept is to reduce
the acceleration before a collision occurs. To allow for deceleration as well, a control system
such as MPC would be required.
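The node expansion with the default action space can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the wheelbase value, the time step `dt`, the function name `expand_node`, and the concrete choice of $d_{step} = f(acc)$ as the distance covered under constant acceleration are all hypothetical. Child poses follow the exact circular arc of a kinematic bicycle model with constant steering angle.

```python
import math

WHEELBASE = 2.8  # [m], hypothetical value


def expand_node(x, y, heading, v, accelerations, steering_angles, dt=1.0):
    """Generate child nodes for every (acc, phi) pair of the action space.

    Returns tuples (x, y, heading, v, acc, phi). The step width depends on the
    acceleration, here assumed to be d_step = v*dt + 0.5*acc*dt^2.
    """
    children = []
    for acc in accelerations:
        v_next = v + acc * dt
        d_step = v * dt + 0.5 * acc * dt ** 2
        for phi in steering_angles:
            if abs(phi) < 1e-9:
                # Straight motion along the current heading.
                nx = x + d_step * math.cos(heading)
                ny = y + d_step * math.sin(heading)
                nh = heading
            else:
                # Circular arc with turning radius R = L / tan(phi).
                radius = WHEELBASE / math.tan(phi)
                nh = heading + d_step / radius
                nx = x + radius * (math.sin(nh) - math.sin(heading))
                ny = y - radius * (math.cos(nh) - math.cos(heading))
            children.append((nx, ny, nh, v_next, acc, phi))
    return children


phi_max = math.radians(36.0)
children = expand_node(0.0, 0.0, 0.0, 5.0, [0.0, 3.75], [-phi_max, 0.0, phi_max])
for child in children:
    print(child)
```

Since only non-negative accelerations are passed in, the sketch shares the limitation described above: the search can stop accelerating but cannot actively brake.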

6.3.2.3 Trajectory Smoothing

Given that the behavioral layer produces a roughly discretized trajectory, further processing is

required to create a dynamically feasible and smooth trajectory. This task is tackled by the

optimization-based motion planning layer and subsequent post smoothing.

Figure 6.4: Example visualization of the APF for a dynamic (right) and static (left) obstacle.
Ego vehicle shown as a green bounding box, other traffic participants in blue.
(a) APF for a static obstacle and oncoming vehicle. (b) APF for a cyclist and oncoming vehicle.
(Each subfigure consists of a "Real Scenario" panel and an "APF" panel; the color scale shows
the repulsive potential from 0 to 105.)

Motion Planning Layer:

The motion planning layer employs numerical optimization to generate a smooth motion based
on the first-guess trajectory. The trajectory provided by the behavioral layer is represented as
a list of states defined by position, heading, and action. The optimization problem formulation
considers the positions $[x_i \; y_i]$ as decision variables to be rearranged to minimize the
objective function, and trajectory $t$ is represented according to Equation (6.2). The objective
function is composed of the following weighted components: $f_{ref}$, describing the distance to
the originally planned path, and $f_{acc}$ and $f_{jerk}$, aiming at maintaining comfort.

$$t = \begin{bmatrix} X_0^T \\ \vdots \\ X_{N-1}^T \end{bmatrix}, \quad \text{where } X_i = [x_i \; y_i]^T \;\; \forall i \in \{0, \dots, N-1\} \tag{6.2}$$

Inequality constraints are formulated to prevent collisions caused by the rearrangement of the
decision variables. For this purpose, collisions with static and dynamic obstacles are checked
by a strategy adapted from Zhang et al. [59]. The collision check is based on circles, whereby
the principal idea is that each car, including the ego vehicle, is represented by two circles,
and other traffic participants, such as pedestrians, are represented by one circle, as
illustrated in Figure 6.5. For collision avoidance, the distance from each circle center of the
ego vehicle to all circle centers representing other road users must be greater than or equal to
the sum of the radii of the respective circles. Depending on the parameterization of the radius,
larger or smaller safety distances can be enforced. Equation (6.3) defines the inequality
constraint for collision avoidance, with $X$ describing the center points of the ego and
obstacles and $P$ defining the number of obstacles. As the dimension of the constraint vector
$g(t)$ of any trajectory $t$ is already $(4P + 1) \cdot N$ for $N$ time steps, only crucial
constraints are included as inequality constraints to maintain a balance in complexity.

$$g_{i,coll}(t) = \begin{bmatrix} g^{0}_{i,coll}(X) \\ \vdots \\ g^{p}_{i,coll}(X) \end{bmatrix} \le 0_{4 \cdot P}, \quad 0_{4 \cdot P} \in \mathbb{R}^{4 \cdot P} \tag{6.3}$$

Figure 6.5: Circle method for collision-free motion planning inspired by [66]. (Labels: ego
vehicle with circle centers $X_{i,f}$ and $X_{i,r}$ and radius $r_{veh}$; obstacle with circle
centers $X^{obs}_{i,f}$ and $X^{obs}_{i,r}$ and radius $r_{obs}$; distance $d_{obs,r-f}$.)

For solving the optimization problem, the SLSQP solver from SciPy was chosen [215].
Commercial solvers may offer more efficient solution methods. However, SciPy was selected for
scientific comparability due to its open-source nature and because it provides the essential
features for this optimization task.
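How such a problem can be handed to SciPy's SLSQP solver is sketched below. The sketch is a strongly simplified assumption and not the formulation used in this work: the ego vehicle and a single static obstacle are each reduced to one circle, finite differences of the positions serve as proxies for acceleration and jerk, the endpoints are kept fixed, and the weights as well as the function name `smooth_trajectory` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def smooth_trajectory(first_guess, obstacle, min_dist, w_ref=1.0, w_acc=5.0, w_jerk=5.0):
    """Rearrange the interior positions of a first-guess trajectory with SLSQP.

    first_guess: array of shape (N, 2) holding the positions [x_i, y_i].
    obstacle:    center of one circular obstacle (hypothetical single-circle model).
    min_dist:    required center distance, i.e. the sum of both circle radii.
    """
    first_guess = np.asarray(first_guess, dtype=float)
    n = len(first_guess)

    def unpack(z):
        # Decision variables are the interior points; start and end stay fixed.
        traj = first_guess.copy()
        traj[1:-1] = z.reshape(n - 2, 2)
        return traj

    def objective(z):
        traj = unpack(z)
        f_ref = np.sum((traj - first_guess) ** 2)          # distance to the planned path
        f_acc = np.sum(np.diff(traj, n=2, axis=0) ** 2)    # second differences
        f_jerk = np.sum(np.diff(traj, n=3, axis=0) ** 2)   # third differences
        return w_ref * f_ref + w_acc * f_acc + w_jerk * f_jerk

    def clearance(z):
        # SciPy expects inequality constraints in the form fun(z) >= 0.
        traj = unpack(z)
        return np.sum((traj[1:-1] - obstacle) ** 2, axis=1) - min_dist ** 2

    result = minimize(objective, first_guess[1:-1].ravel(), method="SLSQP",
                      constraints=[{"type": "ineq", "fun": clearance}])
    return unpack(result.x), result


# Straight first guess that passes too close to an obstacle slightly below the path.
first_guess = np.column_stack([np.arange(8.0), np.zeros(8)])
obstacle_center = np.array([3.5, -0.3])
smoothed, result = smooth_trajectory(first_guess, obstacle_center, min_dist=1.0)
print(result.success)
print(smoothed)
```

Note the sign convention: Equation (6.3) is written as $g \le 0$, whereas SciPy's `ineq` constraints require a non-negative value, so the clearance is passed with the opposite sign.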

Post Smoothing:

No kinematic constraints were included in order to reduce the complexity of the formulated
numerical optimization problem. Investigations have shown that considering kinematic
constraints, such as limiting the maximum possible curvature, leads to poor optimization
results and high runtime [216]. As a consequence, the kinematic feasibility of the trajectory
computed by the motion planning layer is no longer guaranteed. For this reason, a method using
B-spline interpolation is employed to smooth the final spatial path and guarantee feasibility.
Among typically used methods, such as polynomials or Bezier curves, a cubic B-spline is
chosen as the interpolation method since, according to Zhu et al., it provides more flexibility
for curvature control [217]. The interpolation algorithm merely outputs a curvature-optimized
spatial component of the entire trajectory. The locations $X_i$ of the initial trajectory can
be projected onto the computed spline to complete this postprocessing method and improve
spatial smoothness. The process of interpolation and curvature control produces a smooth
and kinematically feasible trajectory as an output for the ego vehicle.
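A possible realization of the post-smoothing step with SciPy's parametric B-spline routines is sketched below. It is an assumption for illustration, not the implementation of this work: the function names `bspline_smooth` and `project_onto`, the number of samples, and the nearest-sample projection are hypothetical, and curvature is only reported so that a limit could be checked, not actively controlled.

```python
import numpy as np
from scipy.interpolate import splprep, splev


def bspline_smooth(points, num_samples=100, smoothing=0.0):
    """Fit a cubic B-spline through 2D points and return samples plus curvature."""
    points = np.asarray(points, dtype=float)
    tck, _ = splprep([points[:, 0], points[:, 1]], k=3, s=smoothing)
    u_fine = np.linspace(0.0, 1.0, num_samples)
    x, y = splev(u_fine, tck)
    dx, dy = splev(u_fine, tck, der=1)
    ddx, ddy = splev(u_fine, tck, der=2)
    # Signed curvature of a parametric planar curve.
    curvature = (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5
    return np.column_stack([x, y]), curvature


def project_onto(path, locations):
    """Map each original location X_i to its nearest sample on the smoothed path."""
    locations = np.asarray(locations, dtype=float)
    distances = np.linalg.norm(locations[:, None, :] - path[None, :, :], axis=2)
    return path[np.argmin(distances, axis=1)]


# Way points on a quarter circle with radius 10 m (curvature 0.1 1/m).
angles = np.linspace(0.0, np.pi / 2.0, 20)
waypoints = np.column_stack([10.0 * np.cos(angles), 10.0 * np.sin(angles)])
path, curvature = bspline_smooth(waypoints)
projected = project_onto(path, waypoints)
print(np.abs(curvature).max())
```

A kinematic feasibility check could then compare the maximum absolute curvature against the limit implied by the maximum steering angle.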

6.3.3 Parametrization

Compared to heuristic decision-making, planning approaches better reflect the human way
of dealing with complex traffic situations by balancing between different needs, such as
time-efficient target reaching or the need for safety. Such needs are reflected in the various
costs and constraints incorporated into the planning framework. The resulting trajectory, and
thus the behavior, strongly depends on how these cost values are weighted relative to each
other, which is part of the parameterization task. The parameterization can have a large impact
on the quality of results and the runtime. At the same time, it is challenging to find a
globally valid parameter set satisfying the diversity of situations encountered in urban traffic.


Therefore, the present method employs a GA (Genetic Algorithm) to find a set of suitable
parameters. This technique takes advantage of the principles of genetics and natural
selection, in which a population of individuals evolves towards a stronger one with respect to
a defined fitness function [218]. Using a GA offers the opportunity to find a set of parameters
for one individual scenario or across multiple scenarios that satisfies a self-defined fitness
function, allowing application-specific targets to be prioritized. In this case, the fitness
function was designed to consider runtime and quality in the sense of comfort values and
feasibility.

Initially, a population of randomly parameterized instances of the behavioral layer is generated,
which are applied to the training scenarios in order to evaluate their respective performance
regarding quality and runtime. The individual instances showing the best performance
are selected for mutual crossover by splitting the genomes and combining them with other
genome parts. In this way, a new generation of the population is created, and the process
of evaluation and crossover can be repeated. Moreover, random mutations are introduced
across individual genes to enable some stochasticity in the process. The final result provides
individual parameterizations with high fitness values. Based on empirical tests, the GA was
initialized with a random population of 50 individuals, 50 generations to be assessed, and
a mutation probability of 10%. The algorithm returns a set of ten so-called paretos, which
are individual parameter sets with similar results regarding the fitness function, reflecting the
optimum of either efficiency or best quality. Providing ten paretos allows one to choose the
extrema with the best runtime or quality, as well as various trade-offs in between.
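The loop of evaluation, selection, crossover, and mutation described above can be sketched in a few lines. The sketch is a simplification under stated assumptions and not the GA used in this work: it minimizes a single scalar fitness value instead of returning a set of paretos, it keeps two elite individuals per generation, and the toy fitness function merely stands in for running the behavioral layer on the training scenarios. Only the population size of 50, the 50 generations, and the mutation probability of 10% are taken from the text; all names are hypothetical.

```python
import random


def run_ga(fitness, bounds, pop_size=50, generations=50, mutation_prob=0.10,
           n_elite=2, seed=0):
    """Minimize `fitness` over parameter vectors within `bounds` (lower is better)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    population = [random_individual() for _ in range(pop_size)]
    history = []  # best fitness value of each evaluated generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]                    # best-performing half
        next_population = [list(ind) for ind in ranked[:n_elite]]
        while len(next_population) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, len(bounds) - 1)            # single-point crossover
            child = mother[:cut] + father[cut:]
            for gene, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_prob:             # random mutation per gene
                    child[gene] = rng.uniform(lo, hi)
            next_population.append(child)
        population = next_population
    best = min(population, key=fitness)
    return best, history


def toy_fitness(weights):
    # Hypothetical stand-in for scoring runtime and quality of a planner run.
    target = (1.0, 2.0, 3.0)
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best, history = run_ga(toy_fitness, bounds=[(0.0, 5.0)] * 3)
print(best, history[-1])
```

A multi-objective variant would keep runtime and quality as separate objectives and return the non-dominated parameter sets, which corresponds more closely to the paretos mentioned above.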


6.3.4 Evaluating Strategy

The evaluation addresses the previously defined research questions by investigating four key
scenarios inspired by the use case analysis in Section 3.3.3. The scenarios aim to represent
the main challenges in urban traffic and are illustrated in Figure 6.6. In the following, the
individual scenarios and the high-level behavior that is expected to be shown by the ego
vehicle are described. The definition of expectations enables the subjective assessment of the
plausibility of behavior. Please note: expected behavior is defined from an individual expert
perspective and has no guarantee of general validity at this point.

• (A_VEH) ROW-regulated interactions when turning left at an intersection:
The ego vehicle approaches a four-leg intersection and intends to turn left, while an
oncoming vehicle crosses the intersection going straight, and a pedestrian crosses the
road after the curve.
Expected behavior: The ego vehicle is expected to complete the turning maneuver
without colliding with or violating the ROW of any other traffic participant.

• (B_PED) Pedestrians crossing the road:
The ego vehicle faces a situation with pedestrians walking along the sidewalk and two
pedestrians crossing the road at some point.
Expected behavior: The ego vehicle is expected to follow the lane without compromising
the safety of pedestrians.

• (C_STAT) Partially occupied lane with oncoming traffic:
The ego vehicle's lane is occupied by a static obstacle. In the oncoming traffic lane,
another vehicle is driving.
Expected behavior: The ego vehicle is expected to overtake the obstacle, aiming to reach
its target behind it without colliding with or cutting off the other vehicle's ROW.

• (D_BIC) Slower cyclist in front with oncoming traffic:
The ego vehicle is driving behind a slower cyclist. In addition, a vehicle is driving in the
oncoming lane.
Expected behavior: The ego vehicle is expected to overtake the cyclist, aiming to reach its
target in time without colliding with or cutting off the other vehicle's ROW or compromising
the cyclist's safety.

To address research questions R3.1 and R3.2, both planning approaches are compared with
each other and with the heuristic TRM model in terms of runtime and quality of the result in
the above-mentioned scenarios. Runtime is reported both in absolute terms and normalized to a
planning horizon of 5 seconds for all layers. Furthermore, the number of iterations and explored
nodes for the behavioral layer are provided, as runtime strongly depends on the machine and the
level of code optimization.

No runtime comparison is made between the planning concepts and the TRM model since the

TRM model is an optimized and integrated C++ module within the simulation framework at

BMW, while the two planning approaches are conceptual implementations in Python without

further code optimization.

The following criteria are investigated to measure the quality of the generated trajectories:

• Comfort: maximal acceleration of the first-guess trajectory $acc^{FG}_{max}$ [m/s²] and of
the final trajectory $acc^{FINAL}_{max}$ [m/s²]

• Time-efficient target reaching: average driven velocity $vel_{mean}$ [km/h] and time to reach
the target point $t_{target}$

• Criticality: minimal distance to other traffic participants $d^{dyn}_{min}$ and minimal
distance to static obstacles $d^{stat}_{min}$

Furthermore, quality is assessed subjectively by checking whether the aforementioned behavioral

expectations are met.
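The objective criteria listed above can be computed from a sampled trajectory as in the following sketch. It rests on simplifying assumptions that are not taken from this work: the trajectory is given as time-stamped center positions, velocities and accelerations are obtained by finite differences, and distances are measured between center points rather than between vehicle outlines. The function name `trajectory_metrics` and the dictionary keys are hypothetical.

```python
import math


def trajectory_metrics(times, positions, dynamic_obstacles=(), static_obstacles=()):
    """Compute comfort, time-efficiency, and criticality values of one trajectory.

    times:             time stamps [s] of the ego trajectory.
    positions:         ego positions (x, y) [m], one per time stamp.
    dynamic_obstacles: trajectories of other participants sampled at the same time stamps.
    static_obstacles:  fixed positions (x, y) [m].
    """
    seg_lengths = [math.dist(positions[i], positions[i + 1])
                   for i in range(len(positions) - 1)]
    seg_speeds = [seg_lengths[i] / (times[i + 1] - times[i])
                  for i in range(len(seg_lengths))]
    # Acceleration between the midpoints of two consecutive segments.
    seg_accs = [(seg_speeds[i + 1] - seg_speeds[i]) / (0.5 * (times[i + 2] - times[i]))
                for i in range(len(seg_speeds) - 1)]
    duration = times[-1] - times[0]
    d_dyn = [math.dist(p, obstacle[i])
             for obstacle in dynamic_obstacles for i, p in enumerate(positions)]
    d_stat = [math.dist(p, obstacle)
              for obstacle in static_obstacles for p in positions]
    return {
        "acc_max": max((abs(a) for a in seg_accs), default=0.0),   # [m/s^2]
        "vel_mean_kmh": 3.6 * sum(seg_lengths) / duration,         # [km/h]
        "t_target": duration,                                      # [s]
        "d_dyn_min": min(d_dyn, default=math.inf),                 # [m]
        "d_stat_min": min(d_stat, default=math.inf),               # [m]
    }


metrics = trajectory_metrics(
    times=[0.0, 1.0, 2.0, 3.0],
    positions=[(0.0, 0.0), (10.0, 0.0), (22.0, 0.0), (30.0, 0.0)],
    dynamic_obstacles=[[(0.0, 5.0), (10.0, 5.0), (22.0, 4.0), (30.0, 5.0)]],
    static_obstacles=[(15.0, 3.0)],
)
print(metrics)
```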

Addressing research question R3.3, the evaluation aims to provide insight into the potential
and the sensitivities of such planning approaches towards different parameters and situational
changes. For this purpose, the quality and runtime of the PL_3D approach when solving
the cyclist scenario are evaluated under different conditions. For evaluating the influence of
different parameter sets, the presented GA method is used to create the following parameter
sets for the cyclist scenario:

• A best-runtime and a best-quality parameter set.

• A parameter set resulting from a global parameterization among all key scenarios and a
scenario-specific parameter set.

In order to analyze the sensitivity of such approaches to changes in the scenario, two variants
of the cyclist scenario were created. Velocities and positions were selected to significantly
change the interaction of the ego vehicle with the other traffic participants.

• Variation D1: the cyclist scenario with a different initial and desired velocity of the ego
vehicle: 35 km/h instead of 50 km/h.

• Variation D2: the same scenario with an initial and desired velocity of 35 km/h and a
different starting position of the oncoming vehicle, influencing the interaction.

6.3.5 Implementation Details

For comparison, all key scenarios are created in Spider, which is BMW's proprietary simulation
framework [151, 55]. The PL_PVD planner is initialized with a desired velocity of 50 km/h and
$d_{step} = 5$ m. The PL_3D planner is initialized with a desired velocity of 50 km/h and
$acc \in [0, 3.75, 5.75, 8.75, 13.88]$ m/s². For the variations D1 and D2, the desired velocity
is set to 35 km/h and $acc \in [0, 3.75, 5.75, 8.75, 13.88]$ m/s². These values are oriented to
typical velocities (10, 20, 30, 50 km/h) to reach them in approximately one second. For both
planners, the steering range is defined as $\phi \in [-36^\circ, 0, +36^\circ]$. Both trajectory
planning concepts are implemented in Python 3.8 on an HP Z840 Workstation using an Intel(R)
Xeon(R) CPU E5-2640 v4 @ 2.40GHz with 64GB RAM.
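The relation between the acceleration values and the typical velocities can be checked with a short calculation. The pairing of each non-zero acceleration value with one velocity in ascending order and the assumption of a start from standstill with constant acceleration are interpretations made here for illustration.

```python
TYPICAL_SPEEDS_KMH = [10, 20, 30, 50]
ACCELERATIONS = [3.75, 5.75, 8.75, 13.88]  # [m/s^2], non-zero values of the action space


def seconds_to_reach(speed_kmh, acc):
    """Time to reach a speed from standstill under constant acceleration."""
    return (speed_kmh / 3.6) / acc


times_to_reach = [seconds_to_reach(v, a)
                  for v, a in zip(TYPICAL_SPEEDS_KMH, ACCELERATIONS)]
for speed, acc, t in zip(TYPICAL_SPEEDS_KMH, ACCELERATIONS, times_to_reach):
    print(f"{speed} km/h at {acc} m/s^2: {t:.2f} s")
```

Under these assumptions, the resulting times lie between roughly 0.7 s and 1.0 s.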

Figure 6.6: The four key scenarios for evaluation. (Panels A to D; legend: ego vehicle, other
vehicle, cyclist, static obstacle, pedestrian.)

6.4 Results

Tables 6.2 and 6.3 provide the results for all four key scenarios for the two trajectory
planners and the heuristic decision model (TRM) with respect to the aforementioned evaluation
criteria. In addition to the selected illustrations that follow, pictures of the resulting
trajectories and figures of the velocity, acceleration, and jerk profiles for all scenarios can
be found in the Appendix (A.2).

6.4.1 Potential of Trajectory Planning Versus Heuristic Approaches for Decision-Making

When considering the ability of the different decision-making strategies from a functional
perspective, only the PL_3D approach is able to solve all scenarios, as summarized in Table
6.1. The PL_PVD planner is not able to solve scenario D_BIC, in which spatial and temporal
movement are strongly interdependent, and is not able to overtake the cyclist. The heuristic
model (TRM) is able to solve the ROW-regulated intersection (A_VEH) and interactions
with pedestrians (B_PED) but runs into a deadlock in scenarios C_STAT and D_BIC. In
these cases, the heuristic model does not consider the oncoming traffic lane as a driveable area,
and thus none of the predefined maneuvers match the situation.

Table 6.1: The ability of the two planners and the heuristic agent model to solve the defined key
scenarios from a functional perspective.

          Scenario A_VEH   Scenario B_PED   Scenario C_STAT   Scenario D_BIC
TRM       √                √                deadlock          deadlock
PL_PVD    √                √                √                 not solved
PL_3D     √                √                √                 √

When comparing the computational effort between the PL_PVD and the PL_3D approach, one
can observe that generating a first-guess trajectory is more costly when applying the PL_3D
approach in scenarios A_VEH, B_PED, and D_BIC. Meanwhile, the PL_PVD approach
shows significantly higher runtimes in the motion planning layer for all scenarios, as the
first-guess trajectory of the PL_3D planner is already smoother based on the third dimension
of time. When analyzing the behavior generated by the different decision-making strategies,
one can observe different levels of cautiousness in the individual interactions, influenced by
parameters and representations. The extent to which the individual behaviors of the planners
are in line with a distribution of human-driven trajectories requires further investigation and
can be influenced by the parameter setting. Regarding the criticality of results, all approaches
satisfy the requirement of not colliding with any other traffic participants or static obstacles.
However, when it comes to functionality, the approaches differ in their ability to reach the
target.

The ROW-regulated interaction in scenario A_VEH is solved in a functional and non-critical
manner by all approaches, but different behaviors are exhibited, as illustrated in Figure 6.7.
The heuristic model TRM lets the vehicle and the pedestrian pass first, while both planners
cross the conflict zone before the other traffic participants. Based on environmental information,
the heuristic model chooses the maneuver "give right of way" when approaching the intersection
and, therefore, slows down early. The parameterization of this maneuver, which defines the
conditions and distances under which the oncoming vehicle is identified as relevant for giving
ROW, results in conservative behavior. In contrast, both planners balance the needs of
reaching the target, following a comfortable trajectory, and keeping enough distance to dynamic
obstacles without explicitly formulating the ROW regulation at this point. The trajectory of the
PL_3D approach exhibits some curve-cutting behavior. This shows that the planner would require
some fine-tuning of the weights assigned to the cost penalizing deviation from the center line
and to the comfort costs for this scenario. Similar effects, with different but consistently
rule-compliant behaviors, can likewise be observed in the B_PED scenario, as illustrated in
Appendix A.2.2. In scenario C_STAT, where the ego lane is occupied by a static obstacle, only
the two planners are able to solve the scenario. Due to the solid lane marking, the oncoming
lane is not considered a driveable area by the heuristic model, resulting in a deadlock in this
situation. Both the PL_3D and the PL_PVD controlled vehicle overtake the obstacle but
show differences in their spatio-temporal motion and yielding behavior, as illustrated in Figure
6.8. The PL_3D controlled vehicle drives far into the oncoming lane in scenarios C_STAT and
D_BIC, as shown in Figure 6.9. This indicates that the weighting of the center-line deviation
cost and the APF cost relative to each other needs to be further tuned. Scenario D_BIC is
solved only by the PL_3D approach, while both the heuristic model and the PL_PVD planner
stay behind the cyclist, as shown in Figure 6.10. Based on the separation of planning in space


(a) Scenario A_VEH solved by TRM.

(b) Scenario A_VEH solved by PL_PVD.

(c) Scenario A_VEH solved by PL_3D.

Figure 6.7: Behavior of the heuristic agent model and the two planners in scenario A_VEH at

the same time step. Ego vehicle in green.

(a) Scenario C_STAT solved by TRM.

(b) Scenario C_STAT solved by PL_PVD.

(c) Scenario C_STAT solved by PL_3D.

Figure 6.8: Behavior of the heuristic agent model and the two planners in scenario C_STAT at

the same time step. Ego vehicle in green.


(a) Scenario C_STAT solved by PL_3D. (b) Scenario D_BIC solved by PL_3D.

Figure 6.9: PL_3D driving far into the oncoming lane in scenarios C_STAT and D_BIC. Ego vehicle
in green.

and time, the dynamic movement of the cyclist is not considered during path planning, and
thus, the PL_PVD planner is not able to overtake the cyclist. The profiles of velocity,
acceleration, and jerk shown in Figure 6.11 illustrate that the implementation of the PL_PVD
planner is also not suitable for a typical following maneuver, since the planner alternates
between accelerating and waiting, resulting in jerky movement. For the heuristic model, again,
the oncoming traffic lane is not considered a driveable area, preventing the model from overtaking.

Considering the balance between formulating a complex cost function and reducing the
complexity of the planning problem, the designed cost functions of both planners show great
potential: all scenarios can be solved by the PL_3D planner, and although the PL_PVD
planner fails in scenario D_BIC, the reason does not lie in the design of the cost function.
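The remark that the center-line deviation cost and the APF cost need to be tuned relative to each other can be illustrated with a minimal sketch. It is not the cost function of the planners: the quadratic deviation term, the Gaussian-shaped repulsive term, and all names and weights below are assumptions chosen only to show how the weight ratio shifts the preferred lateral offset.

```python
import math

def lateral_cost(y, w_center, w_apf, y_obstacle=0.0, sigma=1.0):
    """Weighted sum of a quadratic center-line deviation cost and a
    Gaussian-shaped repulsive cost around an obstacle (both hypothetical)."""
    center_cost = y ** 2
    apf_cost = math.exp(-((y - y_obstacle) ** 2) / (2.0 * sigma ** 2))
    return w_center * center_cost + w_apf * apf_cost

def best_offset(w_center, w_apf, candidates):
    """Return the candidate lateral offset with the lowest total cost."""
    return min(candidates, key=lambda y: lateral_cost(y, w_center, w_apf))

# Candidate offsets from 0.0 m (center line, where the obstacle sits) to 4.0 m.
candidates = [i / 10.0 for i in range(41)]

offset_balanced = best_offset(w_center=1.0, w_apf=5.0, candidates=candidates)
offset_apf_heavy = best_offset(w_center=1.0, w_apf=50.0, candidates=candidates)
print(offset_balanced, offset_apf_heavy)
```

With the obstacle on the center line, a larger repulsive weight moves the cheapest offset further away from the lane center, which is the qualitative effect described for the PL_3D trajectories.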

Table 6.2: Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent
model (TRM) in scenarios A_VEH (left turn at intersection) and B_PED (pedestrian crossing).

| Category | Measure | A_VEH: PL_3D | A_VEH: PL_PVD | A_VEH: TRM | B_PED: PL_3D | B_PED: PL_PVD | B_PED: TRM |
|---|---|---|---|---|---|---|---|
| Objective quality | acc_max^FG (m/s²) | 3.50/-0.12 | 3.59/-0.0 | - | 13.80/-0.00 | 1.96/-0.00 | - |
| | acc_max^FINAL (m/s²) | 3.64/-0.01 | 3.98/-0.0 | 3.28/-6.03 | 10.93/-0.02 | 1.89/0.00 | 3.27/-8.67 |
| | vel_mean | 49.71 km/h | 55.0 km/h | 25.36 km/h | 45.65 km/h | 55.02 km/h | 32.86 km/h |
| | t_target | 17.0s | 15.9s | 34.96s | 17.5s | 14.7s | 25.24s |
| | d_min^dyn | >20m | >20m | 1.40m to pedestrian | 2.11m to pedestrian | 1.73m to pedestrian | 1.54m to pedestrian |
| | d_min^stat | - | - | - | - | - | - |
| Subjective quality | Observed behavior | pass intersection before vehicle and pedestrian (without collision) | pass intersection before vehicle and pedestrian (without collision) | let vehicle and pedestrian pass | let pedestrian pass | pass before pedestrian (without colliding) | let pedestrian pass |
| | Expectation satisfied? | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| Runtime | Latency BH: total (per 5sec) | 0.164s (48.24ms) | path: 0.086s (27.04ms); vel. profile: 0.021s (6.60ms) | <1ms, replanning every 50ms | 0.219s (62.57ms) | path: 0.071s (24.15ms); vel. profile: 0.037s (12.59ms) | <1ms, replanning every 50ms |
| | Explored nodes | 176 | path: 332; vel. profile: 326 | | 242 | path: 304; vel. profile: 302 | |
| | Iterations | 89 | path: 167; vel. profile: 164 | | 122 | path: 153; vel. profile: 152 | |
| | Latency MP: total (per 5sec) | 0.215s (63.24ms) | 1.072s (337.11ms) | | 0.264s (75.43ms) | 1.897s (645.24ms) | |
| | Latency PP: total (per 5sec) | 0.017s (5ms) | 0.018s (5.66ms) | | 0.018s (5.14ms) | 0.017s (5.78ms) | |
| | Total latency: total (per 5sec) | 0.396s (116.47ms) | 1.197s (376.42ms) | | 0.501s (143.14ms) | 2.022s (687.76ms) | |


(a) Scenario D_BIC solved by TRM.

(b) Scenario D_BIC solved by PL_PVD.

(c) Scenario D_BIC solved by PL_3D.

Figure 6.10: Behavior of the heuristic agent model and the two planners in scenario D_BIC at

the same time step. Ego vehicle in green.

Figure 6.11: Profiles showing velocity, acceleration, and jerk of the PL_PVD planner in scenario
D_BIC for the final trajectory.


Table 6.3: Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent
model (TRM) in scenarios C_STAT (static obstacle) and D_BIC (cyclist).

| Category | Measure | C_STAT: PL_3D | C_STAT: PL_PVD | C_STAT: TRM | D_BIC: PL_3D | D_BIC: PL_PVD | D_BIC: TRM |
|---|---|---|---|---|---|---|---|
| Objective quality | acc_max^FG (m/s²) | 3.50/-0.00 | 15.27/0.0 | - | 3.26/-0.00 | 16.64/0.00 | - |
| | acc_max^FINAL (m/s²) | 1.81/-0.04 | 10.70/0.00 | 0.0/-3.8 | 2.60/-0.00 | 11.65/-0.0 | 2.78/-6.66 |
| | vel_mean | 49.55 km/h | 44.82 km/h | 32.68 km/h | 49.65 km/h | 20.59 km/h | 17.52 km/h |
| | t_target | 11.4s | 13.0s | DEADLOCK | 16.0s | 39.4s | 39.88s |
| | d_min^dyn | 5.16m to car | 0.43m to car | 1.76m to car when passing | 2.06m to cyclist | 0.57m to cyclist | 3.61m to cyclist |
| | d_min^stat | 2.09m | 1.25m | 11.56m | - | - | - |
| Subjective quality | Observed behavior | overtakes obstacle after oncoming vehicle | overtakes obstacle after oncoming vehicle | deadlock in front of obstacle | overtakes cyclist | remains behind cyclist | remains behind cyclist |
| | Expectation satisfied? | TRUE | TRUE | FALSE | TRUE | FALSE | FALSE |
| Runtime | Latency BH: total (per 5sec) | 0.837s (367.11ms) | path: 0.916s (352.31ms); vel. profile: 0.043s (16.54ms) | <1ms, replanning every 50ms | 1.564s (488.75ms) | path: 0.075s (9.52ms); vel. profile: 0.206s (26.14ms) | <1ms, replanning every 50ms |
| | Explored nodes | 1118 | path: 567; vel. profile: 388 | | 1204 | path: 304; vel. profile: 796 | |
| | Iterations | 487 | path: 220; vel. profile: 195 | | 606 | path: 153; vel. profile: 399 | |
| | Latency MP: total (per 5sec) | 0.193s (84.65ms) | 1.279s (491.92ms) | | 0.255s (79.69ms) | 6.130s (777.92ms) | |
| | Latency PP: total (per 5sec) | 0.017s (12.91ms) | 0.020s (7.46ms) | | 0.018s (5.63ms) | 0.020s (2.54ms) | |
| | Total latency: total (per 5sec) | 1.048s (459.65ms) | 2.257s (868.08ms) | | 1.837s (574.06ms) | 6.432s (816.24ms) | |

6.4.2 Sensitivity Towards Different Parameter Sets and Scenario Variations

Addressing research question R3.3, the sensitivity of trajectory planning to changes in the
scenario or the parameter set is evaluated by investigating the runtime and the quality of the
results of the PL_3D approach in the defined variants of scenario D_BIC. Table 6.4 provides all
quality and runtime measures for variants D1 and D2. First of all, it is noteworthy that all
variants satisfy the requirement to overtake the cyclist while avoiding a collision with any
other road user, demonstrating the ability of the planner to adapt its behavior dynamically to
the changed situation. The quality of the trajectories differs in terms of smoothness, especially
in variant D2 when comparing the best-runtime and best-quality Pareto solutions, as shown in the
velocity, acceleration, and jerk profiles illustrated in Figure 6.13. Meanwhile, in variant D2,
the best-runtime parameter set saves more than 50% of the total runtime (1.431s versus 3.023s in
Table 6.4), which demonstrates the great potential of parameter tuning toward a more efficient
planner. In addition to the runtime measures, Figure 6.12 shows the large difference in the
number of explored nodes needed to solve the scenario. Comparing variants D1 and D2 in terms of
runtime, one can see that the calculation effort of the approach is strongly influenced by the
interaction happening in the scenario. The shorter distance to the oncoming vehicle is beneficial
for the search algorithm since there is less waiting time for the ego vehicle,
and thus, it is easier to find a smooth trajectory overtaking the cyclist.
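The runtime claim for variant D2 can be checked against the values reported in Table 6.4. The short calculation below is only a worked example; the helper name `relative_saving` is an assumption introduced for illustration.

```python
# Values for Variant D2 copied from Table 6.4.
total_best_runtime = 1.431   # total latency in seconds, best-runtime set
total_best_quality = 3.023   # total latency in seconds, best-quality set
nodes_best_runtime = 875     # explored nodes, best-runtime set
nodes_best_quality = 1865    # explored nodes, best-quality set

def relative_saving(reference, value):
    """Fraction saved when going from `reference` down to `value`."""
    return (reference - value) / reference

runtime_saving = relative_saving(total_best_quality, total_best_runtime)
node_saving = relative_saving(nodes_best_quality, nodes_best_runtime)
print(f"runtime saving: {runtime_saving:.1%}, node saving: {node_saving:.1%}")
```

Both ratios come out slightly above one half, so the reduction in total runtime goes hand in hand with a comparable reduction in explored nodes.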

Plots and figures of behavior for the remaining variants can be found in the Appendix (A.2).

6.5 Limitations and Future Work

The proposed trajectory planning framework shows high potential for generating
reasonable behavior across the great variety of situations encountered in urban traffic. However,
certain limitations should be mentioned. Since the behavioral layer intends to plan a high-level
maneuver decision, the action space is simplified, and according to the explanation in Section


(a) Variation D2 best quality parameter set.

(b) Variation D2 best runtime parameter set.

Figure 6.12: Behavior of the PL_3D planner in Variant D2 best quality parameter set (top)

versus best runtime parameter set (bottom).

(a) Variation D2 with best quality parameter set:
first-guess (top) and final trajectory (bottom).

(b) Variation D2 with best runtime parameter set:
first-guess (top) and final trajectory (bottom).

Figure 6.13: Velocity, acceleration and jerk of the PL_3D planner in Variant D2 best quality

parameter set (left) versus best runtime parameter set (right).

6.3.2.2, no deceleration is considered. The approach can be extended, but such extensions
would also increase complexity and runtime.

During the evaluation, the runtime comparison is performed only between the two planners
and not with the heuristic model TRM, since the TRM model is an embedded and optimized
C++ module, in contrast to the conceptual implementation of the planners. To what extent


Table 6.4: Variations D1 and D2 for investigating changes in parametrization and scenario for
the cyclist scenario D_BIC using the PL_3D planner. Column groups as labeled in the source:
Variation D1: Scenario cyclist 35km/h (best runtime, best quality); Variation D1: Scenario
cyclist 35km/h (global, scenario-specific); Variation D2: Scenario cyclist 35km/h (best runtime,
best quality).

| Category | Measure | D1: best runtime | D1: best quality | D1: global | D1: scenario-specific | D2: best runtime | D2: best quality |
|---|---|---|---|---|---|---|---|
| Objective quality | acc_max^FG (m/s²) | 1.28/0.0 | 1.28/0.0 | 1.28/0.0 | 1.68/0.0 | 5.35/0.0 | 3.77/0.0 |
| | acc_max^FINAL (m/s²) | 1.35/0.0 | 1.39/0.0 | 2.30/0.0 | 2.20/0.0 | 3.04/0.0 | 3.42/0.0 |
| | vel_mean | 31.46 km/h | 31.47 km/h | 31.40 km/h | 31.38 km/h | 31.31 km/h | 31.39 km/h |
| | t_target | 25.4s | 25.4s | 25.6s | 25.6s | 23.2s | 23.2s |
| Subjective quality | Observed behavior | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist |
| | Expectation satisfied? | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| Runtime | Latency BH: total (per 5sec) | 2.376s (467.72ms) | 2.707s (532.87ms) | 2.617s (511.13ms) | 2.862s (558.98ms) | 1.00s (216.38ms) | 2.63s (566.81ms) |
| | Explored nodes | 1823 | 2022 | 1960 | 2113 | 875 | 1865 |
| | Iterations | 744 | 845 | 802 | 863 | 567 | 1030 |
| | Latency MP: total (per 5sec) | 0.382s (75.20ms) | 0.327s (64.37ms) | 0.324s (63.28ms) | 0.387s (75.59ms) | 0.410s (88.36ms) | 0.373s (80.39ms) |
| | Latency PP: total (per 5sec) | 0.018s (3.54ms) | 0.019s (3.74ms) | 0.016s (3.13ms) | 0.018s (3.54ms) | 0.017s (3.66ms) | 0.019s (4.09ms) |
| | Total latency: total (per 5sec) | 2.776s (546.46ms) | 3.053s (600.98ms) | 2.958s (577.73ms) | 3.267s (638.09ms) | 1.431s (308.41ms) | 3.023s (651.51ms) |

the runtime of the planning approaches can be tuned towards a level sufficient for
implementation in a multi-agent simulation has to be investigated.

Furthermore, due to the focus on the planning problem itself, predictions regarding the future
movement of other traffic participants were obtained from GT data. For further development,
the handling of prediction uncertainties has to be investigated. Moreover, the question arises
of how knowledge should be distributed among several modules within a holistic driver model.
Is it sufficient to provide the information regarding, for example, the ROW context to the
prediction module, assuming that the resulting prediction corresponds to this regulation, or
does the planning module require this information explicitly in order to incorporate it into the
planning strategy?

For parametrization, a GA using a simple fitness function was employed. Based on insights
provided by this thesis, such as the insufficient weighting between the APF and the lane
deviation cost of the PL_3D planner, a more sophisticated fitness function could be formulated,
aiming at more human-like motion.

The benefits of the presented planning approaches exceed those of heuristic models only in
complex situations. Therefore, such decision strategies should be used selectively, only when
required. One option could be a modular approach that selects the best
decision strategy from a set of possible modules depending on the level of complexity of the
situation.

Since the planners are more sensitive to changes in the situation or parameters compared
to heuristic approaches, an extended evaluation over a large number of scenarios is required.
Commonly used evaluation methods, which investigate to what extent a generated trajectory is
in line with human behavior under similar conditions, are limited to some simple criteria and
the visual assessment of the developer. Therefore, the development of an evaluation method
capable of objectively assessing the human likeness of trajectories is required and will be
addressed in the subsequent chapter.


6.6 Conclusion

In summary, the evaluation showed that trajectory planning has great potential for dynamic
decision-making, as the planners are able to generate plausible behavior even in highly complex
urban situations, while the heuristic model runs into deadlocks. The comparison between
the two different planners demonstrated that PL_PVD can produce high-level decisions
at lower computational cost but is more costly when smoothing them into a feasible trajectory.
Furthermore, the decomposition of planning in space and time limits the planner's ability
to produce plausible behavior in scenarios in which temporal and spatial motion interact
strongly. Therefore, limitations similar to those of the heuristic model occur, and only the
PL_3D planner was able to solve all levels of complexity. The presented planners allow more
dynamic decision-making due to the larger and finer discrete solution space. The runtime and the
quality of the resulting trajectory strongly depend on the particular configuration space and,
thus, on the individual scenario and the weighting of the different costs to be minimized.
As a consequence, such approaches have a higher sensitivity to parameters or situational
changes compared to heuristic models. However, at the same time, the planners provide the
indispensable basis for a scalable solution to cope with the variety of scenarios occurring in
urban traffic.


7 Evaluation Methods

Disclaimer: The present chapter involves research presented in the following publications:

[103]: Teresa Rock et al. “Quantifying Realistic Behaviour of Traffic Agents in Urban Driving Simulation Based on
Questionnaires”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022, pp. 1675–1682.

[219]: Teresa Rock et al. “Objectively Scoring the Human-Likeness of Artificial Driver Models”. In: Applied Sciences
13.18 (2023).

The following chapter addresses research question R4: How to identify model

limitations and quantify the degree of human likeness of holistic driver models

and individual subcomponents?

After a brief introduction pointing out the motivation of the chapter, a comprehensive overview
of state-of-the-art approaches for driver model evaluation from different research areas is
given, and associated potentials and weaknesses are discussed. Building on this foundation,
two novel methods, an objective and a subjective approach, for evaluating driver
models in terms of human likeness are presented. Both methods are applied to the holistic
driver model at BMW. To conclude, the results and the potentials of the individual methods are
critically discussed.

7.1 Introduction and Motivation

Evaluation, as a crucial part of model development, is especially challenging in the area of
human-like driver models for urban traffic. First of all, the assessment of human likeness suffers
from the absence of a unique definition of what constitutes human likeness. As discussed
in Section 3.2, different perspectives and approaches are available, originating from various
research areas and targeting individual objectives. Furthermore, human driving style varies
due to different personalities, experiences, and situational contexts [127]. The result is behavior
individuality in two dimensions, driver-individual and situation-individual [110]. Especially in


urban scenarios, behavior is influenced by various situational factors, which must, therefore,
be taken into account when evaluating the behavior of a driver model in terms of human likeness.
To give a simple example, consider a driver waiting at an intersection to give ROW to another
vehicle. Imagine the driver accepted a very small time gap because they had already been waiting
for a few minutes and had to give ROW to multiple vehicles, while the driver model decides on a
longer time gap. With a common objective metric employing a simple sample-wise comparison of
model and driver behavior in this situation, the model behavior would receive a poor rating,
as it is not in line with this individual human behavior, without any awareness of where both
behaviors would be located within a statistical behavior distribution. Evaluating the same
scenario subjectively using a contemporary questionnaire, one would possibly receive a good
human likeness rating, as the model behavior is still in line with an expected range of cautiousness.
In conclusion, the validity of evaluation methods must be handled carefully, always considering
the extent to which the respective method allows for general conclusions.

Therefore, evaluation methods used for model development should cover the key requirements
of transparency and efficient assessment, also in early development stages, and be adaptable
to different applications. The evaluation approach should enable transparency to identify
the strengths and weaknesses of the model and provide significant results on a microscopic
behavior level, as urban traffic is characterized by situational changes.

As developing human-like driver models is a highly complex task, iterative development
processes are required. Therefore, the approach should be efficient and suitable for iterative
model improvement.

Finally, since human-like driver models have various applications in different research
areas, the approach should be transferable to different application domains, allowing weighting
according to different priorities.

The previous example and an extensive state-of-the-art analysis have shown that modern
objective approaches either do not consider contextual information, focus on macroscopic
assessment, or provide assessment strategies for very isolated scenarios and are, therefore, either
not scalable or inappropriate. Subjective approaches, on the other hand, consider
behavior within context but, due to the use of simple questionnaires, are unable to provide
insights into which aspects of model behavior reveal weaknesses. Furthermore, state-of-the-art
research intensively addresses the task of developing driver models, planning modules, and
prediction modules for complex urban situations, while the challenge of adequately evaluating
the results is rarely discussed.

Therefore, this chapter aims at providing evaluation methods covering the requirements
mentioned above. A novel objective method is presented that is applicable to both a fully
developed driver model and its subcomponents and usable for models in various development stages.
Furthermore, a subjective method is proposed, which is able to provide crucial insights for
DiL applications based on fully implemented agent models.


7.2 State-Of-The-Art: Evaluation Approaches for Human-Like Driver Models

Different perspectives on the assessment of the human likeness of driver models have been
presented in Section 3.2. The following section, therefore, provides some additional details
on the topic by categorizing methods into objective and subjective approaches and providing
examples from the development of AV and DS.

In the area of AV development, most of the objective metrics rely on the direct comparison
of modeled driving data to an individual sample of real driving behavior, described as a
trajectory, using distance measures [220]. Common metrics for evaluating predictions or
prediction frameworks employ displacement errors, measured, for example, as the distance
between the actual and predicted trajectories [81]. Such metrics indicate how accurately the
predicted trajectory matches the individual human-driven trajectory. However, displacement
errors cannot provide information on how functional or plausible the trajectory
was. Therefore, in some individual cases, more sophisticated evaluation strategies are applied,
e.g., taking into account functional errors such as road violations [190]. A detailed summary of
state-of-the-art metrics that are used in the area of trajectory prediction is provided in Section
5.2 in Table 5.2.
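As an illustration of the displacement errors mentioned above, the following sketch computes two common variants, the average and the final displacement error, for two equally sampled trajectories. The function name and the sample data are assumptions made for this example and are not taken from the cited works.

```python
import math

def displacement_errors(predicted, actual):
    """Return (average, final) Euclidean displacement error for two
    equally sampled trajectories given as lists of (x, y) points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.hypot(px - ax, py - ay)
             for (px, py), (ax, ay) in zip(predicted, actual)]
    return sum(dists) / len(dists), dists[-1]

# Hypothetical data: the prediction drifts sideways by 0.5 m per step.
actual = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
predicted = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]

ade, fde = displacement_errors(predicted, actual)
print(ade, fde)
```

Both numbers only describe the geometric distance to one reference trajectory; they say nothing about collisions or road violations, which is the limitation noted in the text.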

To quantify the similarity between driver models and human traffic behavior in DS, macroscopic
analyses are performed. With the help of endurance tests, synthetic data is generated and
compared with real traffic data in typical highway scenarios, such as cut-in or following
maneuvers [47]. Typical indicators to describe human behavior in related research are average
and maximal velocity, frequency and extent of speeding [109], acceleration and headway
[110, 221], as well as TTC and longitudinal distance [90]. Based on such parameters, the relative
validity of the macroscopic behavior of driver models can be determined [107, 108]. Such
methods compare observed macroscopic parameters of artificial vehicles with a distribution of
the respective parameters among real vehicles. For example, to measure the human likeness of
modeled behavior, the distributions generated by the models' parameters can be compared
with the distributions of the respective parameters extracted from real-world driving data.
For comparing distributions, statistical approaches such as the Kolmogorov-Smirnov test in the
study conducted by Wang et al. [222] or the Kullback-Leibler divergence in research by Kuefler
et al. [73] are applied. However, since DS is primarily applied to highway use cases, most
approaches focus on highway traffic and do not consider contextual influences, which in turn
raises doubts about the applicability of such methods to more complex urban traffic.
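The two distribution comparisons named above can be sketched in a few lines. This is a minimal standard-library illustration, not the procedure used in the cited studies; the speed samples, the bin edges, and the smoothing constant `eps` are assumptions.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in sorted(set(a + b)):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

def kl_divergence(sample_p, sample_q, bin_edges, eps=1e-9):
    """Kullback-Leibler divergence D(P || Q) between two histograms on
    shared bin edges; eps keeps empty bins from producing log(0)."""
    def histogram(sample):
        counts = [0] * (len(bin_edges) - 1)
        for v in sample:
            for i in range(len(counts)):
                if bin_edges[i] <= v < bin_edges[i + 1]:
                    counts[i] += 1
                    break
        total = sum(counts)
        return [(c + eps) / (total + eps * len(counts)) for c in counts]
    p, q = histogram(sample_p), histogram(sample_q)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical mean speeds in km/h for human drivers and a driver model.
human_speeds = [28, 30, 31, 33, 35, 36, 38, 40]
model_speeds = [44, 45, 46, 47, 48, 49, 50, 51]
edges = [20, 30, 40, 50, 60]

ks = ks_statistic(human_speeds, model_speeds)
kl = kl_divergence(model_speeds, human_speeds, edges)
print(ks, kl)
```

In this constructed case the model is faster than every human sample, so the KS statistic reaches its maximum of 1 and the divergence is large; identical samples would give 0 for both measures.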

Subjective approaches, on the other hand, measure human likeness using questionnaires,
interviews, or surveys that automatically consider behavior in an individual context. The
underlying assumption of such methods is that human likeness is defined by what is perceived
as real or what cannot be distinguished from human behavior. Y. Zhang et al., for example,
adapted the Turing test and asked participants to classify the driving behavior of another
vehicle as either artificial or human-driven [113]. Similarly, Dumbuya et al. asked subjects to
rate how realistic they perceived a drive completed by different driver models and how likely it
was that the drive was conducted by a real human driver [114]. Further research investigates
the human likeness of driver models and the extent to which the perceived realism of a VE is


affected by the behavior [223, 103]. In terms of DiL applications, the objective is to create
plausible behavior of the traffic agents for valid test conditions [46]. Humans will perceive the
behavior of other road users as realistic if it corresponds to what they expect and have
experienced in real traffic. Considering the traffic agents as part of a VE, the psychological
concept of presence, as a common method for measuring the effectiveness of VEs, should be
named [115, 116, 117, 118, 119]. The underlying assumption is that a high sense of presence means
that people respond as if the sensory input was real [118]; presence, therefore, measures the
participants' perception of reality. A more detailed description of the causes of presence and
the relation to driver models is provided in Section 3.3.

7.3 Method 1: Objectively Evaluating the Human Likeness by Introducing a Plausibility Metric

In the following section, a multi-dimensional quality function is proposed, which includes
various objective parameters for the characterization of human-like driving behavior in order to
objectively evaluate the human likeness of driver models. Driving behavior is discretized into
spatio-temporal behavior, i.e., trajectories, and categorized into different driving situations by
assigning context parameters, such as the ROW situation, based on the scene representation
proposed in Section 3.4.1. The characterization of each driving situation allows for a conditional
comparison of model behavior to that of humans in similar situations by selecting a subset of
human traffic data showing the respective driving situation. Based on behavior and context
parameters, the degree of human similarity of the artificial model can be evaluated by using
statistical analysis within a quality function. Since this method attempts to objectify
human-like behavior that is difficult to attribute to any objective truth, the developed method is
validated with the help of a subjective survey inspired by the Turing Test method.

7.3.1 Concept

The core idea centers on the development of a metric that can objectively quantify the
plausibility and, thus, the human likeness of artificially generated driving behavior by evaluating
the driven trajectory. The method is required to be adaptable to various use cases and suitable
for complex urban traffic scenarios. The evaluation method receives artificial driving data in
conjunction with contextual information that describes the driving situation, such as whether
the vehicle is yielding the ROW. The metric is constructed by creating a multi-dimensional
quality function that includes various parameters to characterize behavioral plausibility and
provides a consolidated score of human likeness, as illustrated in Figure 7.1. This scoring
technique allows for the efficient assessment of a wide range of samples and scenarios, while the
underlying multi-dimensional quality function enables a detailed analysis of individual samples
and provides transparency concerning situations or characteristics in which a model exhibits
weaknesses. The multi-dimensional quality function comprises functional, dynamic, and
interaction-related parameters. Functional parameters are designed to examine the presence
of collisions or road violations; dynamic parameters assess driving behavior by considering
factors such as acceleration, speed, and jerk. Interaction-related parameters evaluate the
relative movements between the ego and other road users. These parameters are checked


[Figure 7.1, block diagram. Labels: Trajectory; Situational context; Evaluator (multi-dimensional
quality function; database of real traffic for behavior comparison); Human-likeness score.]

Figure 7.1: The general idea of evaluating model behavior within situational context.

against human behavior in analogous situations and can be weighted to accommodate a variety
of applications and specific requirements. Finally, these weighted metric components are
combined to generate a consolidated human likeness score. The value ranges that signify either
passing or failing a parameter check are extracted from real traffic data, and configurable
thresholds can be applied to fine-tune the desired value range to satisfy particular needs of the
application, such as safety. This modular design is intended to provide high scalability of the
method, allowing developers to choose which parameters to consider and how to weight them.
For example, the evaluation of a trajectory planner for real traffic might require a higher level
of safety in behavior than the evaluation of artificial road users within DS.
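One possible reading of this scoring idea is sketched below: each parameter is checked against a value range taken from human data, and the weighted share of passed checks forms the score. This is an assumption-laden illustration, not the quality function of the thesis; the percentile bounds, the parameter names, the weights, and all numbers are hypothetical.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac

def value_range(human_values, lower_q=5.0, upper_q=95.0):
    """Accepted range for one parameter; the percentile bounds play the
    role of the configurable thresholds (assumed, not from the source)."""
    s = sorted(human_values)
    return percentile(s, lower_q), percentile(s, upper_q)

def human_likeness_score(model_params, human_params, weights):
    """Weighted share of passed parameter checks, in [0, 1], plus the
    per-parameter pass/fail results for a detailed analysis."""
    passed_weight = 0.0
    details = {}
    for name, weight in weights.items():
        low, high = value_range(human_params[name])
        ok = low <= model_params[name] <= high
        details[name] = ok
        passed_weight += weight if ok else 0.0
    return passed_weight / sum(weights.values()), details

# Hypothetical human reference data for one driving situation.
human_params = {
    "max_long_acc": [1.0, 1.5, 2.0, 2.5, 3.0],   # m/s^2
    "min_distance": [1.5, 2.0, 2.5, 3.0, 4.0],   # m
}
model_params = {"max_long_acc": 4.2, "min_distance": 2.1}
weights = {"max_long_acc": 1.0, "min_distance": 3.0}  # distance weighted higher

score, details = human_likeness_score(model_params, human_params, weights)
print(score, details)
```

The single score supports a quick comparison across many samples, while the `details` dictionary keeps the per-parameter results that make weaknesses of a model visible.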

Context parameters are assigned to identify the specific situational context in which the
artificial driving data is generated in order to enable the systematic comparison of quality
parameters under analogous conditions. These context parameters serve as identifiers and
facilitate the extraction of a subset of analogous real-world traffic situations from a database
filtered by equivalent context parameters. In this way, artificial behavior can be compared with
human behavior under similar conditions, such as comparing the trajectory of a left-turning
vehicle giving ROW with those of various human drivers handling the same situation.
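The extraction of analogous situations can be pictured as a simple filter over context parameters. The record layout, the keys `maneuver` and `row`, and the sample entries below are assumptions for illustration; the actual database format is not described in this section.

```python
def select_analogous(database, context):
    """Return all human samples whose context parameters match `context`."""
    return [sample for sample in database
            if all(sample["context"].get(key) == value
                   for key, value in context.items())]

# Hypothetical database of human driving samples with context identifiers.
database = [
    {"id": "h1", "context": {"maneuver": "left_turn", "row": "yield"}, "max_acc": 1.8},
    {"id": "h2", "context": {"maneuver": "left_turn", "row": "priority"}, "max_acc": 2.4},
    {"id": "h3", "context": {"maneuver": "straight", "row": "yield"}, "max_acc": 1.2},
    {"id": "h4", "context": {"maneuver": "left_turn", "row": "yield"}, "max_acc": 2.1},
]

# Context of the artificial sample under evaluation.
model_context = {"maneuver": "left_turn", "row": "yield"}

subset = select_analogous(database, model_context)
print([sample["id"] for sample in subset])
```

Only the samples sharing every context parameter remain, so the model's left turn with yielding is compared with human left turns in the same ROW situation.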

In order to investigate the extent to which the proposed metric is able to replace the subjective
evaluation of a human assessing artificial behavior, the method is validated through a subjective
survey. During the survey, participants are asked to rate the human likeness of drivers in
different situations without knowing whether the vehicle was driven by a human or by a
model. Based on the participant ratings and the automatically computed objective human
likeness score, the effectiveness of the proposed metric can be validated. The present concept
can be divided into the following steps: specification of parameters to characterize human
driving behavior, identification of context-based similarity of situations, preparation of
databases, formulation of the quality function, and validation, as shown in Figure 7.2.


[Figure 7.2, flow diagram. Labels: Define parameters that indicate human-like behavior; Identify
required contextual information; Formulate quality function and scoring; Driver model; Synthetic
trajectory data + context; Real trajectory data + context; Data of real traffic; Evaluator; Human
likeness score; Survey; validate; "Evaluation of evaluator".]

Figure 7.2: Illustration of the concept toolchain including metric development and validation
strategy of the method.

7.3.1.1 Parameter Specification

In the first step, parameters for the evaluation of human-like driving behavior have to be
defined. Inspired by literature, Table 7.1 provides an overview of potential parameters to
characterize driving behavior, which are divided into the following categories:

• Functional: Did a collision occur, and did the trajectory violate the driveable area?

• Dynamic: Is motion, measured by acceleration, jerk, and velocity, in a human-like range?

• Interaction: Are cautiousness and criticality in interactive situations, measured by
parameters such as time gaps and distances, in a plausible range?
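For the dynamic category, the underlying quantities can be derived from a sampled trajectory by finite differences, as in the following minimal sketch. It assumes uniformly sampled (x, y) positions and only covers speed and its derivatives along the path; lateral components, filtering of measurement noise, and the function names are outside what the text specifies.

```python
import math

def finite_difference(values, dt):
    """Forward differences of a uniformly sampled signal."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]

def dynamic_parameters(positions, dt):
    """Speed, acceleration along the path, and jerk for a list of
    (x, y) positions sampled every `dt` seconds."""
    speeds = [math.hypot(x1 - x0, y1 - y0) / dt
              for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    accelerations = finite_difference(speeds, dt)
    jerks = finite_difference(accelerations, dt)
    return speeds, accelerations, jerks

# Hypothetical straight-line motion with constant acceleration of 1 m/s^2.
positions = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0), (10.0, 0.0)]
speeds, accelerations, jerks = dynamic_parameters(positions, dt=1.0)
print(speeds, accelerations, jerks)
```

Extreme values of these series, such as the maximum acceleration or jerk, are then the candidates to be compared with the human-like ranges.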

The parameters can be calculated for all samples of real and artificially generated behavioral
data, provided that the spatio-temporal motion, road user classification, and information about
the static environment, i.e., the map, are available. Algorithms to calculate interaction-related
information are based on a fusion of map and time series data according to the concept and
data formats presented in Chapter 4. In addition to the scene representation presented in
Chapter 4, the interaction category contains parameters such as PET (Post Encroachment
Time) that require knowledge of a longer time series of motions and, therefore, can only be
computed offline and would not be suitable for a situation representation for a driver model.
Therefore, the calculation of PET, TET (Time Exposed Time-to-Collision), and the critical
time gap is described in detail:


Table 7.1: Overview of parameters for describing human likeness and plausibility of behavior.

| Category | Parameter | Reference |
|---|---|---|
| Functional | Road violation | [81, 112, 190] |
| | Collision check | [170] |
| Dynamic | Lateral Velocity | [2] |
| | Longitudinal Velocity | [224, 225, 2, 226, 227] |
| | Lateral Acceleration | [224, 2, 226] |
| | Longitudinal Acceleration | [224, 2, 226, 227] |
| | Longitudinal Jerk | [226] |
| Interaction | Relative Velocity | [224, 2, 226, 227] |
| | Distance to the Partner | [2, 226] |
| | TET (Time Exposed Time-to-Collision) | [225] |
| | Max. Value for Critical Time Gap when Interacting | [228] |
| | PET (Post Encroachment Time) | [225] |

• PET: the time difference between the moment the first vehicle leaves the conflict zone and the moment the second vehicle enters it, as illustrated in Figure 7.3a. The start and end positions of the conflict zone are determined from the overlap of the multi-polygons that describe the spatial movement of each vehicle as the union of its bounding boxes over all time steps. Based on this, the times can be determined by checking for each time step whether the two bounding boxes representing the vehicles intersect. The implemented calculations are based on the work of Allen et al. [229].

• TET: an aggregate metric encompassing all time intervals during which a pair of interacting vehicles exhibits TTC values below a defined threshold. TET serves to assess the duration of a conflict, and the calculations are adapted from Minderhoud et al. [230]. Interactions are considered pair-wise between the ego vehicle and its partner. If the trajectories of the pair intersect, a motion polygon for each vehicle can be calculated from the respective bounding boxes over time. Based on the conflict zone extracted from the overlap of the motion polygons, TTC can be calculated for all time steps of the interaction until either the ego vehicle or the interaction partner leaves the conflict zone, assuming the velocity to be constant in each time step. TET is the cumulative time during which the TTC of the ego vehicle was below a critical value, as illustrated in Figure 7.3b. The critical value was set to six seconds, as this is the threshold below which interacting vehicles mutually influence each other according to real traffic analysis [231].

• Critical time gap: Gap acceptance is a concept primarily applicable to unsignalized intersections, which commonly occur at the junction of major and minor roadways; its use here was inspired by Raff et al. [228]. When an interaction takes place, the respective vehicle must yield the right of way to vehicles on the primary road. These consecutive vehicles on the main road create time gaps, as depicted in Figure 7.3c. The acceptance or rejection of such a gap depends on factors such as gap duration, clearing time, and driver behavior, as described by Dutta and Ahmed [232]. The accepted time gap can be determined at the moment a vehicle with the status waiting for gap, which is heuristically identified by velocity, acceleration, and interaction identification, starts moving. The critical time gap is then obtained from the intersection of the accepted and rejected time gaps, as shown in Figure 7.3c.
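The PET and TET definitions above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the sampling interval, and the simplification of TTC to "distance to the conflict zone divided by speed" are assumptions made only for illustration, and the extraction of the conflict zone from motion polygons is assumed to have happened beforehand.

```python
from typing import Optional, Sequence


def post_encroachment_time(t_first_leaves: float, t_second_enters: float) -> float:
    """PET: time between the first vehicle leaving and the second entering the zone."""
    return t_second_enters - t_first_leaves


def constant_velocity_ttc(distance_to_zone: float, speed: float) -> Optional[float]:
    """TTC to the conflict zone, assuming constant velocity within the time step."""
    if speed <= 0.0:
        return None  # a standing vehicle never reaches the zone
    return distance_to_zone / speed


def time_exposed_ttc(ttc_series: Sequence[Optional[float]], dt: float,
                     threshold: float = 6.0) -> float:
    """TET: cumulative time with TTC below the threshold (None = no TTC defined)."""
    return sum(dt for ttc in ttc_series if ttc is not None and ttc < threshold)


# Hypothetical samples at 2 Hz: distance to the conflict zone [m] and speed [m/s].
distances = [30.0, 24.0, 18.0, 12.0]
speeds = [4.0, 4.0, 4.0, 4.0]
ttc_values = [constant_velocity_ttc(d, v) for d, v in zip(distances, speeds)]

tet = time_exposed_ttc(ttc_values, dt=0.5)  # threshold of 6 s as in the text
pet = post_encroachment_time(t_first_leaves=12.5, t_second_enters=14.0)
```

In this toy series, only the last two time steps fall below the six-second threshold, so each contributes one sampling interval to TET.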

[Figure 7.3, panels: (a) PET calculation [233]. Left: the first arriving vehicle leaves the conflict zone (time t1). Right: the later arriving vehicle enters the conflict zone (time t2). (b) TET calculation according to [234]: TTC over time with the TTC threshold and the time marks t0 to t3. (c) Critical time gap calculation: gap between vehicles. (d) Critical time gap calculation according to Raff et al. [228]: frequency of accepted and rejected gaps over the gap [s], with the critical gap marked.]

Figure 7.3: Illustration of the calculation concepts for interaction-related parameters.
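The critical gap idea attributed to Raff et al. [228] can be sketched as follows. This is a hedged illustration rather than the implementation used here: it assumes the common reading of Raff's criterion, namely that the critical gap is the value t at which the number of accepted gaps shorter than t equals the number of rejected gaps longer than t. The function name, the grid step, and the sample gaps are hypothetical.

```python
from typing import Sequence


def raff_critical_gap(accepted: Sequence[float], rejected: Sequence[float],
                      step: float = 0.1) -> float:
    """Gap t where #accepted gaps shorter than t equals #rejected gaps longer than t."""
    if not accepted or not rejected:
        raise ValueError("need at least one accepted and one rejected gap")

    def diff(t: float) -> int:
        shorter_accepted = sum(1 for a in accepted if a < t)
        longer_rejected = sum(1 for r in rejected if r > t)
        return shorter_accepted - longer_rejected

    upper = max(max(accepted), max(rejected))
    prev_t, prev_d = 0.0, diff(0.0)
    k = 1
    while prev_t <= upper:
        t = k * step
        d = diff(t)
        if d >= 0:  # the two curves have crossed between prev_t and t
            if prev_d >= 0:
                return prev_t
            # linear interpolation between the last two grid points
            return prev_t + (t - prev_t) * (-prev_d) / (d - prev_d)
        prev_t, prev_d = t, d
        k += 1
    return upper


# Hypothetical gaps in seconds.
accepted_gaps = [4.0, 5.0, 6.0, 7.0, 8.0]
rejected_gaps = [2.0, 3.0, 4.0, 5.0, 6.0]
critical_gap = raff_critical_gap(accepted_gaps, rejected_gaps)
```

For the toy data, the two counts balance at roughly five seconds. The grid search keeps the sketch short; with real data, a finer step or an exact search over the observed gap values may be preferable.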

7.3.1.2 Identification of Context-Based Similarity of Situations

In order to compare artificial behavior with the behavior of humans in similar driving situations, contextual information must be assigned to the data. Inspired by Scholtes et al. [235] and the previous concept of representing context information (Section 3.4.1), a multi-layer approach to describe urban scenarios is developed and presented in Table 7.2. The parameters combine Scholtes' approach of categorizing environmental information into layers with the characteristics of the real traffic database, which shows ROW-controlled intersections. The context parameters allow situations such as those shown in Figure 7.4 to be characterized and identified as similar. For example, Figure 7.4 shows two situations with the following characteristics: Relation to intersection: Just Before; Lane turn direction: Straight; Vehicle state maneuver: Decelerate; Number of legs: Four; Number of interactive vehicles: One; Number of interactive VRUs: One; Intersection Density: Low; ROW relationship: Giving; Closest interacting vehicle class: Car; Closest interacting VRU class: Bicycle.

Algorithms to extract context information are based on a fusion of map and time series data according to the methodology presented in Chapter 4. The proposed contextual parameters can be calculated for each sample, both for the trajectory data to be evaluated and for the comparison data originating from real traffic.


7.3.1.3 Preparing the Database

Based on the context parameters, a subset of real data containing similar situations can be selected. This subset of the real traffic database, which forms the basis for the comparison, is required to contain sufficient samples. All time series data are aggregated into sequences of one-second time windows to determine the context. A threshold of 1000 samples is defined to determine whether the amount of ground-truth (GT) data is sufficient for comparison. If the number of remaining samples in the GT subset is insufficient, the level of abstraction is increased, and context-describing parameters are gradually removed from the filtering. Table 7.2 shows the priority order of context parameters for increasing abstraction in column P. The priority order was determined empirically by expert knowledge and by scanning the real traffic data. The number of matching parameters is stored and reported as the Jaccard Index [236], which quantifies the similarity of the subsets in each comparison and thereby provides additional information about the certainty of the comparison. Furthermore, the context information is used to select the parameters to be considered since, for example, calculating PET in the absence of interaction partners would not be reasonable.
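The stepwise abstraction described above can be sketched in Python as follows. The sketch rests on several assumptions that the text does not confirm: context is stored as one dictionary per sample, the parameter with the highest P value in Table 7.2 is removed first, and the Jaccard index is taken between the set of context parameters still used for filtering and the full set. All names and sample values are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Sample = Dict[str, str]


def select_similar(samples: Sequence[Sample], query: Sample,
                   priority: Sequence[str],
                   min_samples: int = 1000) -> Tuple[List[Sample], List[str]]:
    """Relax the context filter until enough reference samples match the query."""
    keys = list(priority)  # ordered from most to least important (assumption)
    while True:
        subset = [s for s in samples if all(s.get(k) == query.get(k) for k in keys)]
        if len(subset) >= min_samples or not keys:
            return subset, keys
        keys.pop()  # drop the least important remaining context parameter


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index |A intersect B| / |A union B| of two sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Hypothetical mini database; min_samples is lowered so the toy example works.
priority = ["relation_to_intersection", "lane_turn_direction", "row_relationship"]
database = [
    {"relation_to_intersection": "Just Before", "lane_turn_direction": "Straight",
     "row_relationship": "Giving"},
    {"relation_to_intersection": "Just Before", "lane_turn_direction": "Straight",
     "row_relationship": "Receiving"},
    {"relation_to_intersection": "Inside", "lane_turn_direction": "Left",
     "row_relationship": "Giving"},
]
query = database[0]
subset, used_keys = select_similar(database, query, priority, min_samples=2)
certainty = jaccard(used_keys, priority)
```

In the toy example, the full filter matches only one sample, so the last parameter is dropped and two of three context parameters remain in use.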

Table 7.2: Overview of context parameters to distinguish scenarios, sorted by priority order (P), to extract similar traffic situations from the comparison database.

P    Scenario                                              Cat. 1       Cat. 2            Cat. 3        Cat. 4          Cat. 5
1    Infrastructure maneuver: Relation to intersection     Before       Just Before       Inside        Just After      After
2    Infrastructure maneuver: Lane turn direction          Left         Right             Straight      -               -
3    Vehicle state maneuver                                Accelerate   Decelerate        Steady        Stop            -
4    Object related maneuver                               Following    Waiting for gap   Approaching   Waiting queue   -
5    Number of legs                                        Three        Four              -             -               -
6    Number of interactive vehicles                        Zero         One               Multiple      -               -
7    Number of interactive VRU                             Zero         One               Multiple      -               -
8    Intersection Density                                  Low          Moderate          High          -               -
9    ROW relationship                                      Giving       Receiving         -             -               -
10   Closest interacting vehicle class                     Car          Truck / Bus       -             -               -
11   Closest interacting VRU class                         Bicycle      Pedestrian        -             -               -

7.3.1.4 Quality Function Formulation

[Figure 7.4, legends: Ego vehicle ID 30, bicycle ID 29, car ID 26 at frame 2956; ego vehicle ID 25, bicycle ID 24, car ID 23 at frame 2315. Panels (a) and (b).]

Figure 7.4: Two exemplary situations at the Bendplatz location (recording 12) that were identified as similar based on the contextual parameters in Table 7.2. The ego vehicle is marked in red, the cyclist in blue, and the other vehicle in green. The trajectory already traveled is marked as a solid line.

Once all behavioral parameters have been calculated and a subset of real data has been extracted, the comparison of driving behavior can be conducted. In order to select an appropriate statistical test to measure the difference between real and artificial behavior, the underlying statistical distribution of each parameter under consideration must be examined first. For the proposed metric, the similarity between the distributions of the driving behavior parameters of the artificial vehicle and those of real vehicles is measured using the two-sample KS (Kolmogorov-Smirnov) test [237], inspired by the study conducted by Wang et al. [222]. In addition, the extremes of some driving parameters are evaluated to verify that the driving behavior exhibited by artificial vehicles lies within the limits defined by the minimum and maximum values obtained from real drivers in matching scenarios. The following parameters were selected for the extrema evaluation: maximum longitudinal velocity, maximum lateral velocity, maximum longitudinal acceleration, minimum longitudinal acceleration, minimum distance to partners, and maximum critical time gap.
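The two kinds of checks described above, a distribution comparison and an extrema comparison, can be sketched in pure Python. The sketch computes the two-sample KS statistic directly from the empirical distribution functions; how this statistic is mapped to pass or fail, and the exact form of the extrema check (the thesis expresses it as percent ratios), are not reproduced here. Function names and sample values are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    largest = 0.0
    # Both ECDFs are right-continuous step functions, so the supremum of their
    # difference is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def within_human_range(model_values: Sequence[float],
                       human_values: Sequence[float]) -> bool:
    """Simplified extrema check: model values stay inside the human min/max."""
    return (min(human_values) <= min(model_values)
            and max(model_values) <= max(human_values))


# Hypothetical longitudinal velocities [m/s] in one matched context.
human_velocity = [1.0, 2.0, 3.0, 4.0]
model_velocity = [3.0, 4.0, 5.0, 6.0]
d_stat = ks_statistic(model_velocity, human_velocity)
in_range = within_human_range(model_velocity, human_velocity)
```

A library routine such as SciPy's `ks_2samp` would additionally return a p-value; the hand-written version is used here only to keep the sketch free of dependencies.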

By assigning thresholds, each individual check of the metric yields either pass or fail with respect to human similarity. The checks are weighted by $w_i$ and combined into a human likeness score according to Equation (7.1). Inspired by the parameters identified in the literature and presented in Table 7.1, the proposed metric contains the parameter checks listed in Table 7.3. It has to be noted that not all parameters are suitable for all driving situations. For example, if no vehicle interacts when turning, no time value can be calculated for the gap acceptance. Such unavailable parameters were counted as passed when calculating the final score. Furthermore, functional parameters are not considered in the POC implementation of the metric, since the model under test is a holistic and fully integrated driver model that has already passed functional tests. The concepts for evaluating whether a trajectory meets functional requirements, such as evaluating collisions or road violations, are presented in the evaluation method for data-driven trajectory prediction in Chapter 5. Due to the modular structure of the metric, such parameters can easily be added whenever required.

$$\text{score} = \frac{\sum_i \left( w_i \cdot \text{pass check}_i \right)}{\sum_i \left( w_i \cdot \text{total check}_i \right)} \cdot 100 \qquad (7.1)$$
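Reading Equation (7.1) as a weighted pass ratio, i.e., the summed weights of passed checks divided by the summed weights of all checks, scaled to percent, a minimal Python sketch could look as follows. This reading is an assumption, as are the dictionary layout, the check names, and the weights; unavailable checks are represented by `None` and counted as passed, as stated in the text.

```python
from typing import Dict, Optional


def human_likeness_score(checks: Dict[str, Optional[bool]],
                         weights: Dict[str, float]) -> float:
    """Weighted share of passed checks in percent; None (unavailable) counts as pass."""
    total = sum(weights[name] for name in checks)
    passed = sum(weights[name] for name, ok in checks.items() if ok is None or ok)
    return 100.0 * passed / total


# Hypothetical check results and weights for a single sample.
checks = {"KS Jerk": False, "PET": None, "KS Longitudinal Vel.": True}
weights = {"KS Jerk": 0.25, "PET": 0.25, "KS Longitudinal Vel.": 0.5}
score = human_likeness_score(checks, weights)
```

Because each check enters only through its weight, further checks, such as functional ones, can be added by extending the two dictionaries, which mirrors the modular structure mentioned in the text.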


7.3.1.5 Strategy for Validating the Method

Given the absence of an official, objective standard for human likeness, the proposed metric must be validated. The validation explores whether the human likeness scores obtained by the metric correlate with people's subjective ratings of behavior. The validation strategy is premised on two key assumptions. First, human-like driving can be defined by the ability of subjects to distinguish between artificial and human-generated behavior. Second, should a correlation be found between the metric's scores and the subjective assessments, this demonstrates the capacity of the metric to quantify human likeness objectively.

For validation, a survey is prepared, inspired by the work of Zhang et al. [113], who adapted the Turing Test to quantify the human likeness of their proposed algorithm in a driving simulator. In order to provide an efficient evaluation method that can be applied to a large number of situations, the validation of this method is conducted through a survey instead of a simulator experiment. Participants are asked to rate the behavior of a target vehicle in short videos without knowing whether the vehicle is driven by a real human or an artificial driver. Each video shows an interactive driving scene involving multiple vehicles, with one vehicle marked as the target vehicle to be evaluated. Care is taken to ensure that the videos present no discernible cues indicating whether the behavior originates from human or artificial data sources. After each video, participants are asked to rate the behavior of the target vehicle on the following scale: 1: Completely artificial driving; 2: Somewhat artificial driving; 3: Not sure; 4: Somewhat human-like driving; 5: Completely human-like driving.

Based on this, it can be investigated whether the scores given by humans correlate with those of the proposed metric and to what extent participants are able to distinguish between artificial and human behavior. For further validation of the method, the metric is applied to real and artificial driving data, under the assumption that the human likeness score of real data is significantly higher than that of synthetic behavioral data. The metric includes several aspects that can be tuned, such as how narrowly the range of human similarity is defined or how individual parameters are weighted for the final score. The insights from the survey provide a basis for tuning the metric toward how people distinguish between artificial and real behavior.

7.3.2 Implementation

7.3.2.1 Used Databases and the Driver Model

For representing real human driving behavior, the same open-source dataset inD² was selected as in Chapters 4 and 6. It provides recordings of four German unsignalized intersections from a bird's-eye perspective, shown in Figure 7.5 (right) [155]. For creating the human behavior database, all recordings except recording 12 were selected; this recording (referred to as inD12) was retained for validation purposes as described in Section 7.3.1.5. For testing the proposed metric, artificial driving data was created on two synthetic intersections, which are illustrated in Figure 7.5 (left). For testing on a large scale of interactive situations, the data was created with the help of the simulation framework Spider at BMW [151, 55]. A detailed model description can be found in Section 2.2.6. Behavioral and contextual parameters are computed according to Section 7.3.1.1 and Chapter 4 for the synthetic and the real traffic databases.

² https://ind-dataset.com/

[Figure 7.5, panel labels: SYNTHETIC INTERSECTIONS (INTERSECTION 1, INTERSECTION 2); REAL DATA FOR COMPARISON (inD12).]

Figure 7.5: Locations for data gathering. Left: synthetic intersections for creating artificial driving behavior. Right: locations from the inD dataset [155].

7.3.2.2 Metric Formulation and Thresholds

Through the assignment of individual thresholds and weighting factors, the elements of the metric are combined into a human likeness score according to Equation (7.1). For PET, TET, and Max. Critical Time Gap, global thresholds for human similarity were derived from the real traffic database, since the heuristics applied to calculate these parameters did not yield enough situational samples for a contextual comparison. All other parameters could be compared considering the situational context, i.e., against the behavior of real drivers in comparable situations. Accordingly, the threshold values represent the limits for statistical similarity, as shown in Table 7.3. Based on that, the human likeness scores for the two synthetic locations and the retained real recording (inD12) were calculated. Since the driver models used in the synthetic data achieved quite high scores, the thresholds for each parameter were further narrowed to fulfill the assumption that the overall results of the real traffic data significantly exceed those of the synthetic data. The tuned metric resulted in an average score of 89.62% for the real dataset (inD12), 77.31% for intersection 1 (synthetic), and 77.87% for intersection 2 (synthetic). The initially extracted and the tuned thresholds for all parameters are provided in Table 7.3.

7.3.2.3 Survey for Validation

The survey was conducted online and involved 23 participants rating the behavior of vehicles in 12 videos. Four videos showed real driving behavior and eight showed artificial behavior. The real scenarios were extracted from recording 12 of the inD dataset, whereas the artificial behavior was created on the two synthetic intersections as described in Section 7.3.1.3. The order of the videos was randomized to avoid order effects. In each video, the vehicle to be assessed was marked red while all other vehicles were blue, as shown by way of example in Figure 7.6. The visualization was abstracted to eliminate any indicators that might help distinguish between real and artificial traffic. After each video, participants were asked to rate on a five-point scale whether they perceived the red vehicle's behavior as real or artificial, as described in Section 7.3.1.5.

Table 7.3: Thresholds for measuring human likeness for different parameters extracted from the real traffic dataset. Distributions of velocities, accelerations, and jerk are compared using KS statistics. Percent ratios are assigned for maximum values and raw thresholds are assigned for context-free parameters. Parameters marked with * are calculated context-free.

Name of Parameter Check       Initial      Fine-tuned
KS Longitudinal Vel.          >0.993648    >0.668406
KS Lateral Vel.               >0.952358    >0.624929
KS Longitudinal Acc.          >0.780503    >0.559198
KS Lateral Acceleration       >0.926266    >0.647957
KS Jerk                       >0.685024    >0.511960
Max. Longitudinal Vel.        <66.67 %     <66.67 %
Max. Lateral Vel.             <73.33 %     <83.13 %
Max. Longitudinal Acc.        <64.00 %     <71.80 %
Min. Longitudinal Acc.        <78.00 %     <72.00 %
Min. Distance to partners     <84.00 %     <93.60 %
PET*                          <0.64 s      <0.50 s
TET*                          >4.96 s      >3.90 s
Max. Critical Time Gap*       >6.98 s      >5.40 s

Figure 7.6: Exemplary screenshots of videos shown to participants for rating human likeness of a

subject vehicle marked in red.

7.3.3 Results

In the following section, the results of the survey, which provide subjective assessments of human likeness, are compared with the objective scores obtained by the proposed metric. In order to apply the proposed method to a broader range of samples, the human likeness score is additionally computed for the datasets described in Section 7.3.1.3. Finally, two case studies are presented that show by way of example how the proposed metric can be applied for different purposes. First, the method is applied to examine the driver model used, TRM, in detail for model improvements. Second, the method is applied to the results of the trajectory planners presented in Chapter 6 in the left turn scenario, as the three models under test all showed compliant but different behavior.

7.3.3.1 Objective Metric Results Versus Subjective Human Ratings

Two aspects are of interest when investigating the results of the survey. First, to what extent were subjects able to distinguish between real and artificial behavior? Second, did participants' subjective ratings correlate with the objective scores calculated by the proposed metric? The mean value of the Turing test (6.37) indicated that participants' ability to distinguish between artificial and real drivers was only slightly higher than that of random responses or the result (exactly 6.0) obtained when selecting "Not sure" for all vehicles. Figure 7.7 visualizes the subjective ratings for all test vehicles shown during the survey, together with the objective scores calculated by the proposed metric. Real vehicles are shown in green, artificial vehicles in red. The y-axis shows the rating scale presented during the survey. The blue value above each vehicle rating is the objective human likeness score obtained by the proposed metric. For the survey, artificial vehicles that exhibited high and low levels of human likeness were selected. Likewise, real vehicles with higher and lower objective human likeness scores were selected to determine whether the metric was able to detect both good and bad results. Comparing the objective scores from the metric with participants' subjective ratings from the survey, a positive correlation was observed, with a Spearman correlation coefficient of 0.62 (p-value of 0.030) and a Pearson correlation coefficient of 0.65 (p-value of 0.023), as shown in Figure 7.8. These values indicate a moderate monotonic and a moderate linear positive relationship, respectively, both at the 97% confidence level, which can be considered statistically significant. Based on the insights into which vehicles were rated as more human-like by the participants and on the multi-dimensional quality function, the individual parameters could be analyzed in terms of the extent to which they contribute to the decision for real or artificial behavior. Table 7.4 provides the results of this analysis using the Spearman correlation, which is further transformed into a weighting of the individual parameters with the aim of reflecting how people prioritize differences in the individual behavioral parameters.

Figure 7.7: Subjective human likeness rating obtained from participants (y-axis) associated with objective human likeness scores obtained by the proposed metric (blue value above). Vehicles are sorted by the average rating assigned by participants, in descending order from left to right.

Figure 7.8: Relationship between the fine-tuned objective human-like driving behavior scores provided by the proposed methodology and subjective ratings of participants during the survey.

Table 7.4: Spearman correlation between the parameters and average human-like driving behavior score from the survey, with correlations converted into weights.

Parameter                          Spearman Correlation   P-value    Conversion to weight
KS Longitudinal Vel.               0.073555               0.820285   0.016269
KS Lateral Vel.                    0.178634               0.578567   0.03951
KS Longitudinal Acc.               -0.30823               0.329698   0.068174
KS Lateral Acc.                    0.021016               0.948312   0.004648
KS Jerk                            -0.51839               0.084229   0.114656
Max. Longitudinal Acc. ratio       -0.42732               0.165877   0.094514
Min. Longitudinal Acc. ratio       -0.50794               0.0918     0.112346
Max. Longitudinal Vel. ratio       -0.12151               0.706773   0.026876
Max. Lateral Vel. ratio            0.309234               0.328046   0.068396
Min. Distance to partners ratio    -0.3632                0.245869   0.080333
Max. Critical Time Gap [s]         -0.94286               0.004805   0.208539
TET [s]                            -0.52179               0.288343   0.115409
PET [s]                            -0.22755               0.587845   0.050329
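The conversion from correlations to weights is not spelled out in the text. The values in Table 7.4 are, however, consistent with dividing the absolute value of each Spearman coefficient by the sum of all absolute coefficients, so that the weights add up to one. The following Python sketch reproduces the table under this inferred assumption; the variable names are hypothetical.

```python
# Spearman coefficients as listed in Table 7.4.
spearman = {
    "KS Longitudinal Vel.": 0.073555,
    "KS Lateral Vel.": 0.178634,
    "KS Longitudinal Acc.": -0.30823,
    "KS Lateral Acc.": 0.021016,
    "KS Jerk": -0.51839,
    "Max. Longitudinal Acc. ratio": -0.42732,
    "Min. Longitudinal Acc. ratio": -0.50794,
    "Max. Longitudinal Vel. ratio": -0.12151,
    "Max. Lateral Vel. ratio": 0.309234,
    "Min. Distance to partners ratio": -0.3632,
    "Max. Critical Time Gap [s]": -0.94286,
    "TET [s]": -0.52179,
    "PET [s]": -0.22755,
}

# Assumed conversion: normalize absolute correlations so the weights sum to one.
total_abs = sum(abs(rho) for rho in spearman.values())
weights = {name: abs(rho) / total_abs for name, rho in spearman.items()}
```

Under this reading, the sign of a correlation does not affect the weight; only its strength does, which is why the critical time gap receives the largest weight.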

7.3.3.2 The Human Likeness of Investigated Datasets

As described in Section 7.3.1.5, the metric is applied to the retained real driving data (inD12) and to artificial driving data, assuming that the human likeness score of real vehicles should be significantly higher than that of synthetic behavior. The database used is described in Section 7.3.1.3, and the objective scores obtained by the metric are shown in Figure 7.9. The scores were calculated once with the initial thresholds from Table 7.3 and without weighting (Figure 7.9, right), and once with the tuned quality function, incorporating the weights from the survey and the fine-tuned thresholds from Table 7.3 (Figure 7.9, left). First of all, the difference between the two plots demonstrates the sensitivity of the results to the weights and thresholds within the quality function. With the initial setting, only the comparison of real data and the artificial behavior on intersection 2 showed a significant difference in the human likeness scores when using the Mann-Whitney U test ($U = 82{,}727.0$, $p = 4.06 \times 10^{-13}$), while the comparison of the behavior on intersection 1 with that of real humans showed no significant difference ($U = 58{,}561.0$, $p = 0.78$). This can be explained by the rather advanced driver model that was used to create the artificial data. Therefore, without weighting or tightening the thresholds of the quality function, only harsh outliers of behavior can be detected. When using the tuned quality function (Figure 7.9, left), a clear difference between real and artificial behavior could be measured (human-like grades of inD12 vs. intersection 1: $U = 103{,}568.5$, $p = 4.34 \times 10^{-68}$; human-like grades of inD12 vs. intersection 2: $U = 113{,}910.0$, $p = 1.54 \times 10^{-60}$).
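For orientation, the Mann-Whitney U test used above can be sketched in pure Python. The version below counts pairwise wins and ties and derives a two-sided p-value from the normal approximation without tie or continuity correction; it is a simplified illustration, not the routine that produced the reported values, and which of the two possible U statistics was reported is not known. The sample scores are hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """U statistic of x versus y and a two-sided p-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0  # x wins the pairwise comparison
            elif xi == yj:
                u += 0.5  # ties count half
    mean_u = n1 * n2 / 2.0
    std_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / std_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical human likeness scores [%] for real and synthetic samples.
real_scores = [92.0, 88.0, 95.0, 90.0, 85.0, 93.0]
synthetic_scores = [78.0, 81.0, 75.0, 80.0, 77.0, 79.0]
u_stat, p_val = mann_whitney_u(real_scores, synthetic_scores)
```

The quadratic pairwise loop is acceptable for a sketch; for sample sizes like those in Figure 7.9, a rank-based library routine would be the more practical choice.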

Figure 7.9: Results for human-like scores for real and synthetic datasets: with tuned thresholds

and weights (left) and initial setting (right).

7.3.3.3 Case Study 1: Exemplary Application of the Method for Investigating the Heuristic Model at BMW (TRM)

The proposed method is characterized by two main aspects. First, by using various behavior parameters covering functional, dynamic, and interactive behavior, the multi-modality of driving behavior becomes objectively measurable in different dimensions. Second, behavior is assumed to be conditional and is compared within a situational context instead of comparing parameters on a macroscopic level. Therefore, this approach provides a high level of transparency and enables targeted model improvement. How the proposed method can be used for model improvement is presented in the following case study.


The overall scores measured for the artificial driver model in Figure 7.9 show a mean human likeness score of 78.89% with a variance of 33.33 on intersection 1. The scoring method enabled the quantification of model behavior across multiple situations, measured by various parameters. Based on the proposed method, the following questions can be addressed to enable model improvement:

• In which scenarios does the model show less human-like behavior?

• Which parameters mostly fail when comparing the model to human behavior?

• Why do those parameters fail, i.e., how do the distributions of the parameters differ between model behavior and human behavior?

Based on the correlation analysis presented in Table 7.4, the critical time gap was found to have a high negative correlation with the subjective ratings. Therefore, the critical time gap value is investigated further. In the real traffic data, the value was determined to be 6.98 seconds, while the same parameter in the synthetic data was determined to be 10.05 seconds at intersection 1 and 9.98 seconds at intersection 2. This shows a significant difference in behavior and needs to be improved in the driver model. To further investigate in which situations model behavior mostly differs from that of humans, the distribution of failing context parameters is investigated. Considering all context parameters and the distribution of failed parameter checks, the following conclusions could be drawn: vehicles are more likely to fail when they ...

• were in the maneuver states: approaching, accelerate, decelerate, or stop

• were giving ROW

• were in the object related maneuver: approaching or waiting for gap

• interacted with fewer partner vehicles

• drove in lower intersection density

The calculated behavior parameters of the synthetic data are analyzed to identify which behavior parameters mostly fail. As illustrated in Figure 7.10, the five most often failing behavior parameter checks were identified to be: KS Longitudinal Vel., KS Lateral Vel., KS Longitudinal Acc., KS Lateral Acc., and KS Jerk.

In the next step, the value distributions of the extracted parameters can be analyzed. For this purpose, the distributions of the dynamic parameters extracted from real data are compared with those of the artificial samples that show a human likeness score of less than 70%. The distributions are extracted both on a macroscopic level and situationally, i.e., under the scenario conditions identified above as causing most of the failures. All distributions are summarized in Table 7.5. Here, the jerk parameter stands out in particular: the maximum values measured in artificial behavior are much larger than those observed for human drivers. It should be noted that the jerk values observed for real humans are determined by tracking algorithms that process video data from drones. Since the open-source real traffic dataset is preprocessed, it is not known to what extent the values in this data are smoothed. However, the jerk values of the driver model show significantly high maximum values, which should be further investigated. Therefore, some trajectories were extracted and analyzed. Figure 7.11 shows example trajectories that illustrate the jerk problem that occurs when switching between driving maneuvers. Since the driver model TRM is based on heuristic decision-making, its temporal behavior and motion are more discrete than those of humans. However, from a subjective, visual point of view, the jerk problem is not perceptible, as shown by the survey, and is therefore not critical for driver models in the context of DiL traffic simulation. Further interesting insights are provided by the comparison between the macroscopic behavior parameter distributions (left) and the situational behavior (right) in Table 7.5. For longitudinal acceleration, for example, the distributions are in line with human value ranges from a macroscopic perspective but not when the parameter is compared situationally. This shows that a macroscopic comparison can lead to misleading conclusions regarding the extent of human-like behavior.

Figure 7.10: Analysis to identify which parameters mostly fail when comparing artificial and human behavior.

Table 7.5: Analysis of the value distribution for dynamic parameters for model behavior and real humans on a macroscopic level (left) and scenario-specific under situational conditions (right).

Parameter                   Stat.   real data:     syn data:      real data:     syn data:
                                    macroscopic    macroscopic    situational    situational
Longitudinal Vel. [m/s]     mean    7.18           3.38           5.71           3.48 (a)
                            std     5.24           3.92           4.68           (a)
                            min     -4.35          0.00           -0.21          0.00
                            max     27.48          19.05          20.52          12.34
Lateral Vel. [m/s]          mean    0.04           0.03           0.04           0.14
                            std     0.19           0.15           0.26           0.25
                            min     -2.87          -0.77          -0.99          -0.07
                            max     10.89          1.45           1.07           0.85
Longitudinal Acc. [m/s²]    mean    0.06           0.06           -0.08          -0.47
                            std     0.87           1.02           1.01           2.72
                            min     -6.25          -9.58          -4.34          -9.58
                            max     6.54           5.25           4.72           2.61
Lateral Acc. [m/s²]         mean    -0.04          0.13           -0.08          0.65
                            std     0.61           0.78           0.89           1.14
                            min     -5.57          -3.97          -4.62          -0.20
                            max     4.98           4.78           4.12           3.27
Jerk [m/s³]                 mean    -0.05          0.03           0.06           0.14
                            std     0.84           2.91           1.05           12.20
                            min     -20.35         -21.40         -6.96          -18.17
                            max     16.16          220.04         16.16          220.04

(a) Only three values (3.48, 0.00, 12.34) are available for this column; whether 3.48 is the mean or the standard deviation cannot be determined.

Figure 7.11: Exemplary trajectories of two vehicles showing significantly high jerk values when approaching the intersection.

In summary, the case study demonstrates the potential of the method to reveal model weaknesses and enable better model parameterization. The generalizable processing of model and real traffic data, combined with the multi-step approach employed during the case study of identifying situations and parameters in which model behavior deviates from that of humans, offers a high potential for model refinement across the variety of situations encountered in urban traffic.


7.3.3.4 Case Study 2: Applying the Evaluation Metric to the Proposed Trajectory Planners

As previously highlighted, the evaluation method presented herein is designed to be transferable

and capable of evaluating complete driver models as well as individual subcomponents.

Therefore, the results of the trajectory planners expounded in Chapter 6 are examined

as an illustrative example for one selected scenario.

For the case study, scenario A_VEH was chosen, where the ego vehicle turns left while another
vehicle is driving in the oncoming straight lane and a pedestrian crosses the road after the
intersection, as illustrated in Figure 6.6 in Section 6.3.4. Both trajectory planners and the
heuristic model TRM were able to solve the scenario from a functional perspective, meaning
no collision or deadlock occurred. However, the three models exhibited different levels of
cautiousness, and the question remains as to what extent the behavior of the individual models
was in line with human driving behavior.

Table 7.6 shows the parameter checks with associated thresholds for fine-tuned fail conditions
from the presented method for each of the behavior parameters that constitute the proposed
quality function. Among the three models studied, the TRM model is the only one that decides
to wait for the oncoming vehicle, which raises particular interest in the interaction-related
parameters to discover whether the behavior of the two trajectory planners was too aggressive.

The evaluation shows that the behavior of the TRM model is not in line with observed
interactions in real traffic, as the model decides to wait although there is a time gap of 12.2 s
to the oncoming vehicle, which exceeds the maximum critical time gap rejected in the human
database. Meanwhile, the two trajectory planners decide to cross the intersection first but
still maintain a reasonable time gap. The other interaction-related parameters, such as PET or
the distance to partners, are also in line with the observed behavior of humans. TET could not
be calculated for this scenario as no TTC value below the critical threshold was observed. The
high weighting of the critical time gap for the final human likeness score results in a poor
score of 54.82% for the TRM model.
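
The TET check can be made concrete with a short sketch. TET (time exposed TTC) is commonly
computed as the accumulated time during which TTC lies below a critical threshold; the 6 s
threshold is taken from the footnote of Table 7.6. The function name, the fixed sampling
interval dt, and the use of None for time steps without a valid TTC are assumptions made for
illustration and do not describe the implementation used in this thesis.

```python
from typing import Optional, Sequence


def time_exposed_ttc(
    ttc: Sequence[Optional[float]],
    dt: float,
    threshold: float = 6.0,  # critical TTC in seconds (footnote of Table 7.6)
) -> Optional[float]:
    """Accumulated time in seconds with 0 <= TTC < threshold.

    Assumptions (hypothetical, for illustration only):
    - `ttc` is sampled at a fixed interval `dt` in seconds,
    - None marks time steps without a valid TTC (no collision course),
    - negative TTC values are treated as invalid and ignored.
    Returns None when TTC never drops below the threshold, which
    corresponds to the "No TET" entries in Table 7.6.
    """
    exposed_steps = sum(
        1 for value in ttc if value is not None and 0.0 <= value < threshold
    )
    if exposed_steps == 0:
        return None
    return exposed_steps * dt


# Two samples (5.5 s and 4.0 s) lie below the 6 s threshold.
tet = time_exposed_ttc([None, 8.0, 5.5, 4.0, 7.0], dt=0.5)
```

Returning None instead of 0.0 keeps "no critical TTC observed" distinguishable from a measured
exposure, so that a downstream check such as TET < 3.90 s can be skipped rather than passed
by default.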

Considering the dynamic parameters of the three models, as observed in Case Study 1, the
TRM model fails the dynamic parameter checks for velocity, acceleration, and jerk, as peaks
occur during maneuver transitions. The two trajectory planners also fail the parameter checks
for KS Longitudinal Velocity and KS Lateral Velocity. This divergence can be attributed to
the fact that the velocity distribution of human drivers approaching, departing, and traversing
the intersection deviates from the velocity behavior demonstrated by the planners. This
deviation is notably evident in the higher average velocities of the planners, as shown by the
exemplary analysis in Figure 7.12 for longitudinal velocity. This indicates, first, that the
representativeness of the comparison data needs to be further investigated and, second, that
the use of GT data for anticipation within the planners leads to a high level of confidence when
traversing the intersection, which is not consistent with human behavior observed in the
database. Furthermore, the observed curve-cutting behavior of the PL_3D planner cannot be
identified by the metric. This suggests that the inclusion of additional parameters, such as
center-line deviation, could be beneficial. At the moment, these parameters are not considered
because maps of real-world locations often do not adequately represent how vehicles drive at
the intersection, as shown in Figure 7.13. Accordingly, simply calculated center-line deviations
from real data would lead to distorted values.

Figure 7.12: Distribution of lonVelocity for the two planners and the TRM model outside and in
the intersection area. Both panels plot Density over LonVelocity [m/s] for HUMANS, PL_PVD,
and PL_3D.

(a) LonVelocity before and after the intersection.

        PL_3D    PL_PVD   HUMANS
mean    13.26    11.67     9.53
max     14.53    13.7     27.48
min      9.15     7.11    -4.35
std      1.01     2.42     4.99

(b) LonVelocity within the intersection area in Vehicle state maneuver: Dcc.

        SPA_TEM  PVD      HUMANS
mean    13.27     7.31     2.73
max     13.83     7.78    17.8
min     13.01     6.86    -0.8
std      0.24     0.35     2.94
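
The KS parameter checks discussed above compare a model's parameter distribution with the
human distribution. A minimal sketch follows, assuming that the KS values in Table 7.6 denote
the two-sample Kolmogorov-Smirnov statistic, i.e. the largest vertical distance between the two
empirical cumulative distribution functions. The function name and the sample velocities are
hypothetical; only the threshold of 0.668406 is taken from Table 7.6.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(model: Sequence[float], human: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic (maximum ECDF distance)."""
    if not model or not human:
        raise ValueError("both samples must be non-empty")
    sorted_model, sorted_human = sorted(model), sorted(human)
    distance = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in sorted_model + sorted_human:
        ecdf_model = bisect_right(sorted_model, x) / len(sorted_model)
        ecdf_human = bisect_right(sorted_human, x) / len(sorted_human)
        distance = max(distance, abs(ecdf_model - ecdf_human))
    return distance


# Hypothetical longitudinal velocities in m/s, not taken from the thesis data.
model_velocity = [12.8, 13.1, 13.3, 13.9, 14.2]
human_velocity = [6.5, 8.9, 9.4, 10.2, 13.0]

KS_LON_VEL_THRESHOLD = 0.668406  # threshold for KS Longitudinal Vel. (Table 7.6)
passes_check = ks_statistic(model_velocity, human_velocity) < KS_LON_VEL_THRESHOLD
```

A statistic of 0 means the two empirical distributions coincide, and 1 means they do not
overlap at all, which is why a narrow, fast velocity profile can fail the check even when its
values stay inside the overall human value range.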

Table 7.6: Review of all human likeness parameters that constitute the quality function calculated
for the two trajectory planners PL_3D, PL_PVD, and the heuristic model TRM.

Parameter Check                    Threshold             PL_3D     PL_PVD    TRM
KS Longitudinal Vel.               < 0.668406            0.9668    0.6989    0.7382
KS Lateral Vel.                    < 0.624929            0.6666    0.7038    0.7210
KS Longitudinal Acc.               < 0.559198            0.4786    0.4766    0.6967
KS Lateral Acc.                    < 0.647957            0.4250    0.4123    0.7537
KS Jerk                            < 0.511960            0.3924    0.4448    0.6906
Max. Longitudinal Acc. ratio       > 71.80 %             87.27 %   97.60 %   95.71 %
Min. Longitudinal Acc. ratio       > 72.00 %             95.27 %   92.40 %   96.59 %
Max. Longitudinal Vel. ratio       > 66.67 %             81.82 %   100.0 %   98.83 %
Max. Lateral Vel. ratio            > 83.13 %             89.09 %   100.0 %   87.51 %
Min. Distance to partners ratio    > 93.60 %             100.0 %   100.0 %   100.0 %
Max. Critical Time Gap             accept when > 5.40 s  7.0 s     6.6 s     12.2 s
TET                                < 3.90 s              No TET*   No TET*   No TET*
PET                                > 0.50 s              6.0 s     5.92 s    2.80 s
Final hum-like grade                                     94.42     94.42     54.82

*No TET value could be calculated for the scenario since no TTC below the critical threshold of 6 s
was detected.

7.3.4 Limitations and Future Work

The proposed method presents a first attempt to objectify human-like driving behavior, taking
into account the situational context. The multimodality of human behavior is mapped into
individual parameters, which are then statistically evaluated by assigning thresholds for pass
or fail and additional weights. Weighting and threshold assignment have a significant impact
on the final metric score, resulting in high sensitivity to individual tuning, especially due to
the binary strategy of either passing or failing a check.
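
The sensitivity described above can be illustrated with a minimal sketch of a weighted binary
scoring scheme. The thresholds and the parameter values are taken from Table 7.6; the weights,
the names, and the aggregation rule (share of the total weight earned by passed checks) are
assumptions for illustration and are not the quality function of this thesis, so the resulting
numbers do not reproduce the grades in Table 7.6.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Check:
    passes: Callable[[float], bool]  # binary pass/fail rule for one parameter
    weight: float                    # hypothetical weight, not from the thesis


def human_likeness_score(values: Dict[str, float], checks: Dict[str, Check]) -> float:
    """Percentage of the total weight earned by passed checks.

    Checks without a value (e.g. when no TET could be computed) are skipped.
    """
    evaluated = [(name, check) for name, check in checks.items() if name in values]
    total = sum(check.weight for _, check in evaluated)
    if total == 0:
        raise ValueError("no check could be evaluated")
    earned = sum(check.weight for name, check in evaluated if check.passes(values[name]))
    return 100.0 * earned / total


# Thresholds from Table 7.6, weights chosen arbitrarily for this sketch.
CHECKS = {
    "ks_jerk": Check(lambda v: v < 0.511960, weight=1.0),
    "pet_s": Check(lambda v: v > 0.50, weight=2.0),
    "min_distance_ratio": Check(lambda v: v > 93.60, weight=1.0),
}

trm = {"ks_jerk": 0.6906, "pet_s": 2.80, "min_distance_ratio": 100.0}
pl_3d = {"ks_jerk": 0.3924, "pet_s": 6.0, "min_distance_ratio": 100.0}

trm_score = human_likeness_score(trm, CHECKS)      # KS Jerk check fails
pl_3d_score = human_likeness_score(pl_3d, CHECKS)  # all three checks pass
```

Because each check contributes either its full weight or nothing, a value just above a
threshold is scored the same as a gross violation, which is the tuning sensitivity noted in
the text.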

Figure 7.13: Example illustrating the mismatch between driven trajectories and the defined
intersection lanes. Trajectory in red and lane polygon in cyan.
(a) Recording 18, Frankenburg, trackId 159. (b) Recording 18, Frankenburg, trackId 140.

In the future, more parameters, such as lane deviation, can be added to better account for
the multimodality of human driving behavior, and extensive surveys could provide a basis for
fine-tuning the metric. Instead of the binary approach of either passing or failing a parameter,
it is recommended to elaborate a more sophisticated concept whereby the range of human
likeness is discretized into bins for each parameter.
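
One possible reading of the recommended binning concept is sketched below: instead of a single
threshold, each parameter receives ascending bin edges and one partial score per bin. The bin
edges, the scores, and the function name are hypothetical and only illustrate the idea; the two
KS Jerk values are taken from Table 7.6.

```python
from bisect import bisect_right
from typing import Sequence


def binned_score(value: float, edges: Sequence[float], scores: Sequence[float]) -> float:
    """Return the partial score of the bin that contains `value`.

    `edges` must be ascending and `scores` needs one entry per bin, i.e.
    len(edges) + 1 entries. A value equal to an edge falls into the upper bin.
    """
    if len(scores) != len(edges) + 1:
        raise ValueError("scores must have exactly len(edges) + 1 entries")
    if list(edges) != sorted(edges):
        raise ValueError("edges must be in ascending order")
    return scores[bisect_right(edges, value)]


# Hypothetical bins for a KS statistic, where lower values are more human-like.
KS_EDGES = (0.25, 0.50, 0.75)
KS_SCORES = (1.0, 0.67, 0.33, 0.0)

pl_3d_jerk = binned_score(0.3924, KS_EDGES, KS_SCORES)  # KS Jerk of PL_3D
trm_jerk = binned_score(0.6906, KS_EDGES, KS_SCORES)    # KS Jerk of TRM
```

With such a scheme, a parameter that narrowly misses a threshold still earns a partial score,
which would reduce the sensitivity of the final score to the exact threshold values.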

To identify matching situations in real driving data, context parameters are assigned to the
data. The algorithms for computing context parameters, such as the number of interaction
partners and related parameters, are based on heuristics. Such heuristics, of course, do not
guarantee that meaningful results are provided across the entire diversity of interpersonal
situations occurring in urban traffic. Some parameters, such as PET, were calculated only for
situations in which reliable results could be guaranteed, resulting in a significant reduction
of samples. Assumptions for parameter calculation adopted from the literature, such as the
requirement that the paths must intersect for the critical time gap to be calculated, further
limit the situations covered by the heuristics. For future research, optimization, especially
of those interaction-related parameters, is recommended to guarantee representative value
ranges.
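
To illustrate why PET can only be computed for a subset of situations, the following sketch
uses the common definition of PET (post-encroachment time) as the time between the first road
user leaving a shared conflict area and the second one entering it. It assumes that entry and
exit times for a common conflict area are already known, which requires intersecting paths; the
interval representation and the function name are hypothetical, while the 0.50 s threshold is
taken from Table 7.6.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (entry time, exit time) in seconds


def post_encroachment_time(a: Interval, b: Interval) -> float:
    """PET for two road users crossing the same conflict area.

    The argument order does not matter. A negative result means that both
    users occupied the conflict area at the same time.
    """
    for entry, exit_ in (a, b):
        if exit_ < entry:
            raise ValueError("exit time must not precede entry time")
    first, second = sorted((a, b), key=lambda interval: interval[0])
    return second[0] - first[1]


# Hypothetical occupancy intervals of the ego vehicle and an oncoming vehicle.
pet = post_encroachment_time((10.0, 12.0), (14.5, 16.0))

PET_THRESHOLD_S = 0.50  # PET check threshold (Table 7.6)
passes_check = pet > PET_THRESHOLD_S
```

If the paths of two road users never share a conflict area, no such intervals exist and the
parameter has to be left out, which is one reason for the reduced number of samples.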

Given the focus on enabling the measurement of human likeness and the development state of
the model used, not all parameters listed in Table 7.1 were included in the POC
implementation of the metric. Functional parameters evaluating road violations or collisions
were not considered. Approaches to include such parameters within a quality metric were
presented in Chapter 5. Future attempts could include more parameters and offer individual
weighting so that they can be excluded for more mature models.

When the remaining samples in the comparison dataset were below a threshold, a simple
abstraction strategy was employed to measure the similarity between situations in artificial and
real data, prioritizing contextual parameters empirically by manually examining the dataset.
Novel techniques offer alternative approaches for measuring similarity between datasets, such
as those presented by Heuer [238], which could be used in the future.

In general, the heuristics used in this method correspond to the scenarios encountered in real
traffic data sources showing unsignalized intersections. To extend the proposed metric to
other traffic scenarios, additional heuristics and context parameters have to be considered.


7.3.5 Conclusion

In this section, a novel method was introduced to objectively measure the degree of human
likeness of artificial driver models. To do so, driving behavior is characterized by various
parameters, which are then compared to the behavior of humans in real traffic. A final human
likeness score can be computed for each trajectory by using statistical analysis in the context
of a quality function. Since behavior in urban scenarios is influenced by various factors and
assumed to be conditional, the situational context for each trajectory in real and artificial
data is characterized by automatically computed context parameters that aim to distinguish
between different situations that may occur in urban traffic. Thus, a subset of real traffic
data showing human behavior under similar conditions can be used to compare behavioral
parameters.

Since there is no clear definition of human likeness, a Turing test-inspired survey was conducted
to investigate the ability of the proposed method to reflect the subjective ratings of humans.
Results of the survey showed a significant correlation between the scores calculated by the
proposed metric and the scores assigned by participants. In addition, the survey provided
interesting insights into which parameters contribute most to the distinction between artificial
and human driving behavior. These findings were used to parameterize the quality function
and provided valuable insights into specific weaknesses of the used driver model.

An evaluation of large datasets has shown that the proposed metric has the ability to evaluate
model behavior efficiently in a wide range of situations, which is crucial for developing reliable
solutions for urban traffic.

The two case studies provided serve as examples that demonstrate the usefulness of the
proposed metric in conducting a comprehensive assessment of a model and in making
targeted improvements. By applying the evaluation method to both a fully integrated, refined
driver model and two conceptual trajectory planners, the metric’s remarkable adaptability is
demonstrated. Moreover, this process provided valuable insights into the specific dimensions
in which model behavior differs from human behavior.

The modular structure of the metric allows model behavior to be evaluated according to
application-specific requirements. In DS, for example, the priority is more on human-like
behavior than on rule compliance and safety. In contrast, when evaluating a trajectory planner
for AVs in real traffic, the focus might be more on non-critical behavior, accepting a lower level
of aggressiveness during interactions. Accordingly, by weighting and narrowing the thresholds
for individual parameters, the method can be used for a broad range of applications.

7.4 Method 2: Quantifying Realistic Behavior of Traffic Agents
in Urban Driving Simulation Based on Questionnaires

Given the distinct focus of this thesis on DiL in urban DS, the perspective of the human driver
regarding model behavior is given significant importance. Therefore, a novel approach for the
subjective assessment of the human-like nature of traffic agents is introduced, which measures
the perceived realism of participants in a simulator experiment.

7.4.1 Concept

In DS, human-like driver models are relevant to provide a realistic driving scene for the DiL in
the VE. Therefore, the proposed concept considers the human driver as a mirror for quantifying
artificially modeled road users in terms of behavioral realism. This general setting allows
for evaluating agent models on a microscopic level while interacting with a human driver
over a broad range of situations. The concept of presence has been established as a common
method to measure perceived realism in VEs [118]. The concept introduced here operates
on the premise that the driver’s sense of presence within the simulator is observably affected
by the dynamics of the surrounding traffic. It is, therefore, necessary to investigate whether
and to what extent the existence of dynamic traffic participants in the simulation leads to
differences in presence compared to driving without surrounding traffic. In addition, a suitable
questionnaire must be elaborated to quantify agent behavior in different dimensions to allow for
deriving useful conclusions for future model improvement. Therefore, a simulator experiment
was designed in which human drivers experience different situations in a virtual city and are
asked to rate their sense of presence as well as other items that allow a clear analysis of the
appearance of model behavior. For the experiment, the following hypotheses are formulated:

• H1: The existence of surrounding traffic leads to an increase in participants’ sense of
presence.

• H2: The realism of traffic agents can be assessed and distinguished in different dimensions
using a questionnaire that considers the key requirements on human-like driver models
defined in Section 3.3.4.


7.4.2 Material and Setting of the Simulator Experiment

7.4.2.1 Study Design

During the experiment, the existence of surrounding traffic was manipulated in a within-subject
design. Every participant experienced the same route twice, once with surrounding traffic
(traffic) and once without (no-traffic). Order effects were avoided by balancing the order
of the two drives between the participants. The traffic drive included typical interactive
situations occurring in urban environments. In addition to route-following and lane-changing
tasks, the participants experienced the following situations, illustrated in Figure 7.14:

• Pedestrians crossing the street (1, 2, 6)

• Cyclists on the street (3, 5)

• Driving through a roundabout (4)

• Turning left with oncoming traffic (7)

• Occupied lanes with oncoming traffic (9)

• Leaving a parking space on a busy road (8)

Participants were instructed to follow a defined route indicated by the navigation system and
to comply with traffic rules as they would do in real traffic.

In the beginning, a short test drive was completed to allow familiarization with the simulator.
The subjects answered a presence questionnaire [239] after both drives and evaluated the
quality of the agent models after the traffic drive using a second questionnaire.

7.4.2.2 Questionnaires

The participants were asked to rate their sense of presence using a standardized presence
questionnaire [239], which is in common use in related experiments [119, 121, 240]. The
questionnaire was extended by one item regarding the naturalism of driving style. A second
questionnaire was created to assess the degree of realism of the agent models in the simulation.
The questionnaire intends to qualify the current status of the models with regard to the
requirements identified in Section 3.3.4: Spatio-temporal consistency (SPA-TEM), Interactivity
(INTERA), and Individuality (STAT). The items of the second questionnaire are inspired by
subjective evaluations in related experiments [241, 117]. The questionnaire is extended by
proprietary items regarding interactivity and the agreement on whether more interaction with
the traffic agents would lead to a more natural driving style. All items were rated using a
seven-point Likert-type scale. Furthermore, the participants had the opportunity to openly
comment on how the behavior of the traffic agents differed from real road users. The complete
questionnaires and related affiliations can be found in Appendix A.3.
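
As an illustration of how ratings on the seven-point scale could be condensed into one score
per dimension, the following sketch averages the items assigned to each dimension for a single
participant. The item identifiers, their assignment to the dimensions, and the use of the
arithmetic mean are assumptions for illustration; the actual items are listed in Appendix A.3
and may be aggregated differently.

```python
from statistics import mean
from typing import Dict, List


def dimension_scores(
    responses: Dict[str, int],
    items_by_dimension: Dict[str, List[str]],
    scale_max: int = 7,
) -> Dict[str, float]:
    """Mean rating per dimension for one participant.

    Raises ValueError if a rating lies outside the range 1..scale_max.
    """
    for item, rating in responses.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating for {item!r} is outside 1..{scale_max}")
    return {
        dimension: mean(responses[item] for item in items)
        for dimension, items in items_by_dimension.items()
    }


# Hypothetical item identifiers and ratings, not taken from Appendix A.3.
ITEMS = {
    "STAT": ["q1", "q2"],
    "INTERA": ["q3", "q4"],
    "SPA-TEM": ["q5"],
}
participant = {"q1": 6, "q2": 7, "q3": 4, "q4": 5, "q5": 3}
scores = dimension_scores(participant, ITEMS)
```

Keeping the mapping from items to dimensions in a separate dictionary makes it straightforward
to add further items, for example for individual types of traffic participants.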

7.4.2.3 Simulator Setting

The study was conducted at BMW using a static simulator with a half-vehicle mock-up, as
shown in Figure 7.15. For visualization, LED screens covering a horizontal field of view of
180 degrees were used. Additional monitors were installed to allow rear and side views. The
simulation is based on the BMW proprietary framework called Spider [151, 55]. For modeling
the surrounding traffic, three types of traffic agents were used: driver models, bicycle models,
and pedestrian models. All agent models are in-house developments of BMW.

Figure 7.14: Traffic scenarios experienced by participants during the simulator experiment. The
ego vehicle is marked with a red triangle.

Vehicle driver: a detailed description of the driver agent TRM is provided in Section 2.2.5.

Cyclist: the cyclist agent is an adapted version of the driver model extended by further
driveable areas and is based on a suitable dynamic model. In addition, different parameters
can be applied, such as driving on the road or indicating direction changes by hand signals.

Pedestrian: pedestrians are simulated using a simple model that receives the exact path and
a desired velocity as input. The model follows deterministic traffic rules, such as stopping at
red lights, and prevents possible collisions by stopping and waiting.

In summary, all traffic agents are modeled by parametrizable heuristic models focusing on
collision avoidance and route-following in order to reduce complexity. Situational reactions
are discretized into different maneuvers, and predefined rules are applied to decide on an
individual maneuver at a time.


Figure 7.15: The driving simulator used in the present experiment.

7.4.2.4 Participants

In total, 46 subjects participated in the experiment, including four women (8.7 %) and 42 men
(91.3 %). All participants were employees of BMW. The meta information, including mean,
standard deviation, test statistic, and significance, is summarized in Table 7.7. Based on the
data distribution, the impact of surrounding traffic, as an independent variable, was analyzed
using the significance test according to Wilcoxon [242].

Table 7.7: Distribution of the metadata of the participants.

                                            µ          σ          s       p
Age [years]                                 38.07      9.35       0.94    0.014
Mileage [km/year]                           19391.30   10558.26   0.937   0.015
Driver’s License [years owning a licence]   20.46      9.15       0.93    0.007

7.4.3 Results

7.4.3.1 The Effect of Surrounding Traffic on Perceived Realism

The effect of surrounding traffic on the sense of presence and on the rating of natural behavior
of the participants in both drives is illustrated in Figure 7.16. Participants rated their sense of
presence significantly higher after performing the drive with other traffic participants. The
clear significance compared to the level of standard deviation shows that no ceiling effect
has occurred. Furthermore, the participants rated the degree of naturalism of their own
driving style significantly higher when driving with other traffic participants. The descriptive
naturalism ratings were generally rather high in the present study, which, however, may have
been influenced by the task instructions to behave as in normal traffic regarding traffic rules.
A subsequent rating of their own driving behavior being unnatural could be understood as not
complying with the given task.
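
The paired comparison between the two drives can be illustrated with a small sketch of the
Wilcoxon signed-rank test. The version below is a simplified standard-library implementation:
zero differences are dropped, tied ranks are averaged, and the two-sided p-value comes from a
normal approximation without tie or continuity correction. The ratings are invented for
illustration and are not data from this experiment, and the function name is an assumption;
for a real analysis, an established statistics package should be preferred.

```python
import math
from typing import List, Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, p) for paired samples x and y.

    W is the smaller of the positive and negative rank sums; p is a two-sided
    p-value based on a normal approximation (a simplification for this sketch).
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(x, y) if b != a]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, assigning the average rank to ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks: List[float] = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of 1-based ranks start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(rank for rank, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    expected = n * (n + 1) / 4
    std_dev = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - expected) / std_dev
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p_value


# Invented presence ratings of six participants (1 to 7), for illustration only.
no_traffic = [4, 5, 3, 4, 5, 4]
traffic = [5, 6, 5, 4, 6, 3]
statistic, p_value = wilcoxon_signed_rank(no_traffic, traffic)
```

The signed-rank test is a natural fit for the within-subject design because it compares each
participant with themselves and does not require normally distributed ratings.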

Figure 7.16: The influence of surrounding traffic on the perceived realism and the naturalism of
driving style, whereby 7 represents a high sense of presence (*: p < 0.05, **: p < 0.01).
Presence (PRE): mean 4.47 (no traffic), mean 4.90 (traffic); N: 46, statistic: 157.5, p-value: 0.0001 (**).
Naturality (NAT): mean 6.04 (no traffic), mean 6.33 (traffic); N: 46, statistic: 70.0, p-value: 0.0466 (*).

7.4.3.2 Realism of Traffic Agent Behavior

In addition to the effect of the independent variable, the degree of realism of the used agent
models was quantified with respect to the key requirements for realistic behavior identified in
Chapter 3. The results in Figure 7.17 show clear differences in the three dimensions of realism.
The ratings for individual behavior are considerably better than the ratings for interactivity
and spatio-temporal consistency in behavior.

These findings are supported by the subjects’ comments on the extent to which the behavior of
the traffic agents differs from reality. The observations can be summarized into the following
key statements:

• the agents behave defensively and avoid contact instead of interacting or cooperating

• the agents behave as if it is not foreseeable what will happen next

• the behavior of the agents is too compliant with traffic rules

• eye contact with VRUs is missing

Furthermore, the subjects agreed with the statement that more interaction with the traffic
agents would lead to a more natural driving style. This aspect is also confirmed by the
higher-rated naturalism of driving style after the traffic drive because, without any traffic, no
interaction can take place.

Figure 7.17: Assessment of the used agent models regarding key requirements for realistic
behavior, whereby 7 represents a positive rating of, for example, high individuality.
Mean ratings: Individuality (STAT) 6.09, Interactivity (INTERA) 4.07, Spatio-temporal
consistency (SPA-TEM) 3.87.

7.4.4 Limitations and Future Work

The proposed method shows great potential for evaluating the behavioral realism of agent
models in different dimensions across a broad range of scenarios based on questionnaires.
However, to conduct a deeper analysis regarding different situations or individual types of
traffic participants, further items are required and should be considered in future experiments.

In the present study, the two conditions differed regarding the existence of other traffic
participants. It remains unclear, however, whether the observed positive effects are attributable
to the pure existence (observation) of other road users or to the associated interaction.
Since urban traffic always includes interactive situations, future studies could be aimed at
investigating the effects of presence in different scenarios.

In the presented experiment, only one generation of driver models could be evaluated because
the proposed methods for information representation, anticipation, and decision-making are
still at an early stage of development and have not yet been combined into a complete,
runtime-optimized, and integrated driver model.

It is important to note that when comparing different agent models using this method, the
significance of the effect in presence is highly dependent on the perceptibility of behavioral
differences and, therefore, cannot be guaranteed to provide such clear results as in this
experiment.

The proposed method is based on a purely subjective measurement approach, which can be
criticized at times with regard to the vulnerability of the results. Therefore, the combination
of the subjective and the previously presented objective method offers the potential for more
reliability in this context [116].

While the proposed objective method is suitable for iterative development and, thus, for
early conceptual stages, the subjective method requires fully developed agent models that are
integrated into the respective simulation framework in a real-time manner.

The findings of the experiment are not limited to driver models. Participants recognized a
weak performance in the interactive nature of VRUs. Consequently, agent models to simulate
cyclists and pedestrians also need to be developed further. Intelligent algorithms are required
not only in terms of a plausible trajectory but also on an explicit communication level to
exhibit situational gaze directions and gestures.


7.4.5 Conclusion

The present study showed clear effects of the surrounding traffic on the sense of presence
of participants and therefore confirmed hypothesis H1. Furthermore, the drive, including
observation and interaction with other traffic participants, was rated more natural. In
conclusion, surrounding traffic is an essential factor in terms of realism in DS and provides
the opportunity to achieve a higher level of validity by developing more realistic traffic agent
models.

The proposed method aimed at investigating the behavioral realism of traffic agents in three
dimensions to enable useful insights for model improvement. Results show that the used
models perform well in individual behavior but show weaknesses in interactivity and plausible
spatio-temporal behavior. This effect is in line with the main challenges identified in this
thesis in Chapter 3. The previously presented objective measurement method also showed a
main issue in spatio-temporal plausibility and interactivity through the increased weighting of
the time gap when interacting to distinguish between real and artificial drivers. The subjects’
ratings and comments underline the need for more dynamic decision-making and situational
understanding in such models. In the present traffic agents and other state-of-the-art
models, rule-based decision-making strategies limit the ability to react situationally. In
summary, the proposed evaluation method provides interesting insights into the subjective
perception of agent model behavior and allows a comprehensible model evaluation.

7.5 Summary on Evaluation Methods

Considering evaluation strategies in state-of-the-art solutions, one can observe that evaluation
is treated as a secondary topic in the majority of publications. Most common evaluation
strategies and metrics lack significance. Furthermore, the high dependency of behavior on
various external factors in urban traffic is rarely addressed. Therefore, the present chapter
provides innovative evaluation methods aimed at generating more transparency and
meaningful insights for future development. Both presented methods, the subjective and the
objective evaluation strategy, are associated with certain drawbacks and benefits. While the
proposed objective method is suitable for iterative development and, thus, for early conceptual
stages, the subjective method requires fully developed agent models that are integrated into
the respective simulation framework in a real-time manner. However, with respect to DiL
applications, the subjective method provides better insights into the severity of behavioral
differences. A good example is the jerking problem identified in Section 7.3.3.3, which was
not perceived by the participants during the simulator experiment. Furthermore, while the
subjective method requires a fully developed driver model, the objective method can be applied
to holistic driver models as well as subcomponents, as shown by the case study for trajectory
planning. Therefore, the objective method is more suitable during the development process,
while the subjective method can be used as a final validation strategy by providing solid
evidence on whether the interaction between driver agent and DiL appears realistic. The issues
in spatio-temporal plausibility and interactivity caused by inadequate parametrization of
waiting time gaps when interacting could be identified by both methods, resulting in mutually
confirming results. In summary, both methods provided promising results by revealing crucial
insights and clear actions for future developments by employing methods transferable to
multiple models and model subcomponents across various scenarios.

8 Summary, Discussion and Outlook

8.1 Summary of Results

With respect to the objectives formulated in the introduction in Section 1, the initial aim of
this thesis was to provide a clearer and more practical definition of human likeness, along with
the critical requirements for driver models. In general, one can categorize requirements into a
functional level and a human likeness level. However, both levels show a strong interrelation
since non-functional behavior contradicts human likeness. While the functional level mostly
demands collision- and deadlock-free trajectories, the high-level requirements on human
likeness are identified to be individuality, spatio-temporal consistency in behavior, and
interactivity. Since individuality is mostly achieved by different parameter sets within a model,
which is a downstream issue, this work mainly addressed the functional requirements and those
for interactivity and spatio-temporal consistency.

In particular, spatio-temporal consistency and interactivity are closely related to the functional
requirement of avoiding deadlocks. The occurrence of deadlocks is a common problem that
can originate from different model layers. For example, a deadlock can be caused by a
deficiency in the environment representation, e.g., the oncoming traffic lane is not part of
the spatial representation of driveable areas, and thus the vehicle is not able to avoid an
obstacle in the ego lane. In other cases, the decision-making strategy itself can impede the
functionality and plausibility of behavior if the resulting action space is too coarsely discretized
to find a valid solution. Spatio-temporal consistency and interactivity require a certain degree
of situational understanding and the ability to adapt behavior situationally. In addition to
these requirements, the underlying key challenge is to find general and transferable solutions
that allow a model to handle the variety of situations encountered in urban traffic without
exceeding a practical level of complexity.

Given these higher-level requirements and the state of the art in driving simulation and AV
development, four main root causes have been identified that hinder the successful modeling
of human-like driver behavior in complex urban traffic: the representation of complex
urban traffic situations to enable situational understanding, generalizable predictions in
interactive scenarios as a basis for interaction, more dynamic decision-making allowing for
situational behavior adaptation, and measuring the degree of human likeness of driver model
behavior.

The first challenge arises in the representation of urban traffic situations in a model-accessible
format since driving behavior is affected by multiple influences, such as road topology, traffic
rules, and the behavior of other road users. The human cognitive process of perceiving
information and creating situational understanding by associating and interpreting this
information is explored in several areas of research, for example, in the context of SA [129].
As computer science approaches are not yet able to compete with humans, human cognition
inspires the requirement that a driver model should achieve a certain level of situational
understanding in order to cope with interactive traffic situations. Most published research
focuses on single scenarios or isolated parts of the driving task. Thus, the challenges of finding
an appropriate scene representation are rarely discussed. The need to generate situational
understanding through an appropriate scene representation at a general level inspired research
question R1: How to represent complex situations in urban traffic in a general and
transferable manner?

Therefore, in Chapter 4, a novel method for scene representation incorporating the association
and interpretation of different information sources to enable situational understanding is
presented. The idea is to identify and provide the model with all easily accessible information
via heuristics, while more complex patterns, such as the reaction of other road users, are
captured by appropriate modeling approaches such as NNs. The proposed concept is applied
to an open-source drone dataset, providing time series and metadata of traffic participants
interacting on ROW-regulated German intersections. Information from the data is fused with
knowledge from the map to extract further context information by identifying interacting
traffic participants and their relationships. A feature vector containing raw and semantic
information describing a scene at a certain time step is generated. The effectiveness of the
scene representation is measured by means of a deep learning prediction model that anticipates
the trajectory of vehicles for the next 5 seconds. The evaluation showed that the prediction
model’s ability to make generally reliable predictions on unseen data benefits significantly from
the advanced scene representation containing associated and interpreted context information
compared to only providing raw scene information.
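To make the idea of a combined feature vector more concrete, the following minimal sketch fuses raw state values with heuristically derived semantic context for one vehicle at one time step. It is an illustration only: the field names (`has_right_of_way`, `dist_to_conflict_point`, `n_interacting_vehicles`) and the vector layout are hypothetical and are not the feature set used in Chapter 4.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RawState:
    """Raw measurements of one vehicle at one time step (hypothetical fields)."""
    x: float        # position in map coordinates [m]
    y: float
    speed: float    # [m/s]
    heading: float  # [rad]


@dataclass
class SemanticContext:
    """Context derived by heuristics from data and map (hypothetical fields)."""
    has_right_of_way: bool
    dist_to_conflict_point: float  # [m]
    n_interacting_vehicles: int


def build_feature_vector(raw: RawState, ctx: SemanticContext) -> List[float]:
    """Concatenate raw and semantic information into one flat feature vector."""
    return [
        raw.x,
        raw.y,
        raw.speed,
        raw.heading,
        1.0 if ctx.has_right_of_way else 0.0,  # encode the boolean as 0/1
        ctx.dist_to_conflict_point,
        float(ctx.n_interacting_vehicles),
    ]
```

A sequence of such vectors over consecutive time steps could then serve as input to a prediction model, so that the network receives the interpreted context directly instead of having to infer it from raw positions.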

Anticipation plays a crucial role in handling interactive traffic and becomes particularly relevant
in urban traffic, as understanding and incorporating the intentions of other road users is the
basis for safe and reasonable decision-making.

However, modern agent models do not incorporate prediction approaches, and missing
anticipation was therefore identified as the second root cause that prevents driver agents
in the simulation from exhibiting plausible and interactive behavior. As discussed in Section
3.3.4, anticipating the behavior of other road users is a frequently addressed challenge in AD,
as safe participation in traffic requires the estimation of future scene developments.

However, published approaches tend to offer limited solutions for isolated scenarios that are
not sufficiently transferable and reliable for the diversity of urban traffic. Based on this, the
second research question R2: "How to enable general and transparent predictions in
complex urban traffic?" was formulated.

In Chapter 5, a novel method is presented that aims to investigate the impact of different
conceptual choices on the performance of an NN aiming to predict the future trajectory of other
vehicles. The model was trained on different datasets and with different parameter settings
along a three-layer testing framework. Results showed that state-of-the-art evaluation metrics
do not provide sufficient transparency for identifying model weaknesses or deciding between
different model settings.

The evaluation revealed that the degree of diversity in training has a major impact on model
performance. It was found that very clean behavioral data obtained from simulation results
in the well-known problem of overfitting, which causes poor results in unknown situations.
Meanwhile, a larger variety of training data leads to reliable and accurate results even in
situations that are far from the training data. Studies on different data compositions have
demonstrated that it is possible to augment real traffic data with synthetic samples to overcome
the known problem of underrepresented situations in datasets.

In summary, the results have shown that meaningful evaluation strategies are crucial to
address the diversity of potential concepts and the shortcomings, such as overfitting and low
transparency of black-box models, to enable reliable prediction models in the future.

Besides situational understanding achieved by appropriate representations and anticipation
capabilities, dynamic adaptation of behavior to the situation has been identified as a necessary
ability and a yet unsolved challenge for human-like driver agents. Decision-making in current
driver agents is mostly done hierarchically, by dividing the driving task into maneuvers and
applying heuristic rules to decide when to choose which maneuver. Given the diversity of
situations encountered in urban traffic, the strong discretization of the solution space into
driving maneuvers limits the ability of models to exhibit situationally
plausible behavior and leads to functional errors such as deadlocks. Therefore, research
question R3: "How to enable driver agents for dynamic decision-making in order
to cope with typical urban scenarios in a functional and human-like manner?" was
formulated.

Inspired by techniques from AVs, the proposed methods in Chapter 6 aim to enable more
dynamic decision-making in driver agents by employing trajectory planning. The idea behind
these approaches, to balance different needs such as safety, rule compliance, or efficient
target reaching, is more in line with the human manner of handling traffic and provides a
less discretized solution space. Based on promising publications, a novel trajectory planning
framework is developed, incorporating two layers, one creating a high-level behavioral decision
and a second one smoothing this decision into a dynamically feasible trajectory. Two different
variants for planning the high-level decision were developed to investigate the weaknesses and
potentials of different state-of-the-art techniques.
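The balancing of competing needs can be illustrated by a weighted cost over candidate trajectories. The sketch below is a simplified assumption and not the planner developed in Chapter 6: the cost terms (`safety`, `rules`, `progress`), their weights, and the function names are hypothetical, and each candidate is reduced to a dictionary of precomputed cost values.

```python
from typing import Dict, List

Costs = Dict[str, float]


def total_cost(costs: Costs, weights: Costs) -> float:
    """Weighted sum of the individual cost terms of one candidate trajectory."""
    return sum(weights[name] * costs[name] for name in weights)


def select_candidate(candidates: List[Costs], weights: Costs) -> int:
    """Return the index of the candidate with the lowest weighted total cost."""
    return min(range(len(candidates)),
               key=lambda i: total_cost(candidates[i], weights))


# Two hypothetical candidates: one fast but close to another vehicle,
# one slower but with a comfortable safety margin.
candidates = [
    {"safety": 0.9, "rules": 0.0, "progress": 0.1},
    {"safety": 0.1, "rules": 0.0, "progress": 0.5},
]
cautious = {"safety": 1.0, "rules": 1.0, "progress": 0.5}
hurried = {"safety": 1.0, "rules": 1.0, "progress": 5.0}
```

With the `cautious` weights the second candidate is selected, whereas the `hurried` weights favor the first one. Shifting weights instead of switching between discrete maneuvers is what yields the less discretized solution space described above.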

The behavior of the two planners was compared with that of the heuristic driver agent at BMW
in typical interactive scenarios to investigate when and to what extent the higher complexity
of planning methods provides additional value. In addition, the runtime to create a trajectory
was investigated since real-time capability is a strong criterion for agent models in simulation,
and the two planning methods differ in complexity. The results demonstrated that the planners
were able to solve highly interactive scenarios in a reasonable manner, while the heuristic
model faced deadlocks.

The evaluation revealed the planners’ high potential to address urban traffic challenges but
also a high sensitivity to changes in the situation or parameters, resulting in a high variance of
trajectory quality and runtime. However, even with those shortcomings, trajectory planning
instead of heuristic maneuver distinction is identified as the expedient method to cope with
the complexity of urban traffic. Therefore, further development of the proposed framework
and investigations across a broad range of scenarios are recommended.

While analyzing state-of-the-art solutions and examining the utility of the presented methods
within this thesis, the importance of meaningful and transparent evaluation strategies has
emerged repeatedly, prompting the fourth research question of this thesis, R4: "How to
identify model limitations and quantify the degree of human likeness of holistic
driver models and individual subcomponents?"

Human driving behavior in urban traffic is subject to a high degree of complexity and is
affected by various influences. This results in new challenges not only for modeling but
also for evaluating models, as behavior always has to be assessed in the context of the
situation. Therefore, in Chapter 7, two evaluation methods for quantifying the human likeness
of driver model behavior, aiming at identifying model limitations and enabling target-oriented
improvements, are provided.

The first method provides an objective approach to quantify the human likeness of model
behavior while considering the situational context. To this end, a multi-dimensional quality
function was formulated, incorporating various parameters to characterize driving behavior.
By assigning context parameters, it is possible to compare model behavior to the behavior
of humans under similar conditions, such as behavior when approaching an intersection and
yielding ROW to another vehicle. A combination of all individual parameters into one human
likeness score allowed for efficient evaluation even of large datasets. For validation purposes,
a Turing-test-inspired survey was conducted. The comparison of scores obtained from the
quality function and scores rated by participants demonstrated the ability of the method to
reflect subjective ratings of participants in a transparent and objective manner.
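As a rough illustration of how individual behavior parameters could be combined into a single score, the sketch below compares each model parameter with human reference values recorded in the same context and averages the per-parameter scores. The Gaussian-shaped scoring, the equal weighting, and the parameter names are assumptions made for this example. They do not reproduce the quality function formulated in Chapter 7.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def parameter_score(model_value: float, human_values: List[float]) -> float:
    """Score in (0, 1]: 1.0 at the human mean, decaying with the z-score.

    Requires at least two human reference values (needed by stdev).
    """
    mu = mean(human_values)
    sigma = stdev(human_values)
    if sigma == 0.0:
        # All humans behaved identically: only an exact match counts.
        return 1.0 if model_value == mu else 0.0
    z = (model_value - mu) / sigma
    return math.exp(-0.5 * z * z)


def human_likeness(model_params: Dict[str, float],
                   human_reference: Dict[str, List[float]]) -> float:
    """Average the per-parameter scores into one human likeness score."""
    scores = [parameter_score(value, human_reference[name])
              for name, value in model_params.items()]
    return mean(scores)


# Hypothetical human reference data for the context
# "approaching an intersection while yielding ROW".
human_reference = {
    "approach_speed_mps": [8.0, 10.0, 12.0],
    "min_gap_s": [2.0, 3.0, 4.0],
}
```

Because the reference values are selected by context, the same model behavior can receive different scores in different situations, which mirrors the context-dependent comparison described above.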

Due to the modular concept, the method is applicable to evaluating holistic driver models
as well as subcomponents such as a trajectory planner. The ability of the method to reveal
explicit model weaknesses was demonstrated for both application areas by means of two case
studies.

In contrast, the second evaluation method follows a subjective approach in which the quality
of agent model behavior is assessed by measuring participants’ perceived realism via their sense
of presence in a simulator experiment, assuming that a high degree of human likeness of agent
models is associated with a high level of perceived realism.

Participants were asked to rate their sense of presence and the realism of the heuristic agent
models after observing and interacting with the agents in different urban traffic scenarios.
The presence of traffic was manipulated in a within-subjects design to investigate whether
participants’ perceived realism in a simulator experiment is an appropriate measure for
evaluating traffic agents. The results demonstrated that the participants’ sense of presence
was significantly affected by the surrounding traffic. In addition, the heuristic agent models
were found to have weaknesses in spatio-temporal consistency of behavior and in their ability
to interact with the DiL, as responses in interactive situations appeared to be avoidant rather
than interactive.

8.2 Limitations and Discussion

As described in the previous section, this thesis provides promising methods and results for
the future development of human-like driver models capable of handling complex situations in
urban traffic. However, the main limitation that should be mentioned is that this work does
not provide a complete, runtime-optimized, and integrated driver model since the proposed
methods for information representation, anticipation, and decision-making are still at an early
stage of development and need to be further refined to achieve generally reliable solutions.
Therefore, these individual pillars of a driver model require further research and development
before being combined into a holistic driver model. When connecting these subcomponents,
further challenges will arise, such as knowledge distribution, how to deal with prediction
uncertainty, and finding generally suitable parameter sets.

Key scenarios for the implementation and evaluation of the presented methods have been
defined for the representation of the most important characteristics of urban traffic. However,
given the wide variety of traffic, it is unlikely that the representations employed will cover
all the challenges potentially occurring, and some blind spots remain. This highlights the
importance of meaningful and efficient evaluation methods that are applicable to a broad
range of situations to address such unknown-unknown problems.

The methods presented in this thesis aim to combine heuristic modeling and deep learning
approaches to get the most out of both worlds. However, when applying heuristic rules to real
traffic scenes, it is not possible to cover all possible combinations that may arise from human
behavior. Due to the numerous iterative development loops during the processing of the real
traffic data, most of these implausibilities could be identified and covered by the presented
algorithms. However, the use of heuristics always faces limitations due to their categorizing
nature.

Meanwhile, DL approaches lack transparency due to the black-box nature of the model
and are associated with challenges such as overfitting. As a consequence, these approaches
also show disadvantages and do not provide one unique perfect solution to achieve reasonable
results for the variety of urban traffic situations. Combining heuristic approaches that extract
semantic knowledge about the situation with a highly nonlinear NN to handle such complex
tasks as behavior prediction holds great potential. However, extensive testing and critical
evaluation against multiple test scenarios are required to ensure reliability.

The present thesis aims to provide a new perspective by combining theories and methods from
different research areas such as psychology, robotics, and computer science. Furthermore, the
objective of this thesis consisted of finding generally applicable and transferable solutions. Due
to these requirements and the differentiated view on modeling human-like driving behavior,
this work provides insights, methods, and connecting points for future developments instead
of a detailed solution strategy for isolated scenarios. In particular, the recommendations for
future work provided by the individual methodology chapters (4, 5, 6) offer a number of as-yet
unsolved challenges.

8.3 Final Conclusion

In alignment with the motivation outlined in the introduction, the present thesis endeavored
to address the challenge of modeling human-like driving behavior in urban environments,
with a particular focus on the application in DS. To accomplish this objective, the initial
chapter categorized and evaluated state-of-the-art solutions in various
associated research areas. In addition to exploring solutions within the realm of DS, approaches
from the domain of AVs were discussed due to their profound relevance to the core issues under
discussion. Furthermore, the interdisciplinary nature of this topic, encompassing facets of
robotics, computer science, cognitive modeling, and psychology, was taken into consideration,
offering new perspectives for the exploration of this subject.

Owing to the multifaceted nature of the challenge associated with modeling human-like
behavior, various definitions of and perspectives on human-like driving behavior were
analyzed to derive essential modeling capabilities. The obtained requirements were compared
against contemporary solutions, resulting in the identification of four primary challenges that
necessitate resolution to enable the incorporation of human-like driver models within urban
DS.

Subsequently, the methodology part of this thesis introduced innovative solutions for
representing complex situations in urban traffic to enable situational understanding, methods
for developing and evaluating generalizable prediction models able to predict the future motion
of vehicles, strategies allowing situational behavior adaptation by employing more dynamic
decision-making techniques, and approaches for measuring the level of human likeness of driver
model behavior.

The proposed methods for representation, prediction, and planning provided promising results
and generated increased transparency and in-depth comprehension, narrowing the scope of the
broad scientific problem and transforming it into distinct technical challenges that can be the
focus of forthcoming research endeavors.

Moreover, the evaluation methods presented herein furnish indispensable tools for advancing
model development by identifying strengths and limitations of holistic driver models
and subcomponents across the broad variety of situations encountered in urban traffic.

In view of all the results presented and the analysis conducted, the main conclusion is that
there is no one perfect solution satisfying the various requirements of potential applications,
but the appropriate balance must be determined. Modeling such complex behavior will
always involve a trade-off between generalizability, solution quality in detail, complexity, and
calculation effort. The solution quality in specific situations will suffer under a higher level
of generalizability. The more complex solution will mostly provide better results but at the
same time be associated with higher calculation effort and thus suffer in real-time capability.
The approach prioritizing safety will face deadlocks more often since, due to safe distances or
uncertainties, no solution space remains. In contrast, the more risk-taking approach will solve
more situations but might lead to critical situations, as they appear in real traffic between
humans due to misunderstandings or misjudgments.

Despite the complexity of human driving behavior, the topic must be considered holistically in
its interdisciplinary nature since the consideration of isolated scenarios leads to assumptions that
are not transferable and, thus, to solutions that are not practicable. Since there is no perfect
solution covering all problems, it is necessary to define clear objectives and requirements for a
model in the triad of complexity, generalizability, and computational complexity. Therefore,
clear and testable requirements, dynamic and transferable modeling approaches, and meaningful
evaluation strategies, as presented within this thesis, are required.

8.4 Summarized Outlook

The present thesis has introduced novel concepts that pave the path toward the creation of
reliable driver models exhibiting human-like behavior in urban traffic scenarios. Driver models
often exhibit limited transparency, mainly due to complex modularized architectures that
aggregate different rules and scenarios, complex formulations of optimization problems, or
the utilization of DL techniques that are characterized by their black-box nature. Therefore,
despite the prevailing literature on this subject, future research endeavors should dedicate more
attention to the development of meaningful, transparency-creating, and practical evaluation
methods.

Furthermore, driving in urban traffic is a highly complex task, and various aspects influence
behavior. Considering these factors, the root problem may not necessarily rest in an ill-suited
approach to solutions but rather in the limited understanding of how the model operates in
untested scenarios or when its behavior deviates from that of human drivers. In order to
confront the multifaceted challenge of diversity within urban traffic, certain key challenges
must be systematically addressed in forthcoming research efforts.

To begin, it is essential to develop meaningful evaluation strategies that can be efficiently
applied across a wide range of situations. The precise design of the evaluation method may
vary depending on the specific model or subcomponent being examined. In Chapter 5, a
concept is presented for enhancing the transparency of data-driven prediction models through
experimentation with different datasets and model settings and the measurement of model
performance incorporating accuracy and functional plausibility. In Chapter 7, an expanded
version of such a plausibility metric was introduced with a stronger focus on human likeness.
It is recommended to combine the two plausibility metric concepts and extend the involved
heuristics to cover further traffic scenarios. Drawing from the general scene description capable
of representing diverse traffic scenarios, these methods can be easily expanded to encompass
additional traffic situations. For instance, the inclusion of traffic light signal data enables their
application to signalized intersections, while the extension of heuristics accommodates shared
spaces such as pedestrian crossings.

After an expansion of heuristics, additional data can be gathered by following the framework
outlined in Chapter 4. This newly acquired data can then serve multiple purposes, including
in-depth analysis of the prediction model, expanded testing of the trajectory planning framework,
and gaining a more comprehensive understanding of the constraints inherent to heuristic
models.

As AD and DS are not only relevant for German traffic, the development for other markets
and countries requires transferable methods and country-specific data. The proposed
data-processing algorithms can be applied to analyze geographical and cultural differences in traffic
behavior, such as comparing German and Chinese behavior to identify which aspects of
behavior differ significantly. Moreover, the prediction, planning, and evaluation methods are
designed to be flexible enough that country-specific behavior can be easily trained by the
NN presented in Chapter 5. Decision-making can be parameterized within the planning
framework proposed in Chapter 6 by defining a specific fitness function for the GA with the
parameter ranges of the country-specific data, for example, to tune the planner towards more
Chinese-like driving behavior.
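One conceivable way to express such a country-specific fitness function is to penalize behavior parameters that fall outside the ranges observed in the respective dataset. The following sketch rests on that assumption only: the parameter names, the example ranges, and the penalty shape are hypothetical and are not taken from the framework in Chapter 6.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound), with upper > lower


def range_fitness(params: Dict[str, float], ranges: Dict[str, Range]) -> float:
    """Fitness in (0, 1]: 1.0 if every parameter lies within its range.

    Violations are penalized by their distance to the range,
    normalized by the range width.
    """
    penalty = 0.0
    for name, (low, high) in ranges.items():
        value = params[name]
        width = high - low
        if value < low:
            penalty += (low - value) / width
        elif value > high:
            penalty += (value - high) / width
    return 1.0 / (1.0 + penalty)


# Hypothetical parameter ranges that would be extracted from country-specific data.
country_ranges = {
    "time_gap_s": (1.0, 2.0),
    "max_accel_mps2": (1.5, 3.0),
}
```

A GA could use a term like this as one component of its overall fitness, so that exchanging `country_ranges` shifts the planner toward the behavior observed in a different dataset without changing the planning algorithm itself.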

After further refinement of the individual subcomponents within the driver model, the next
important milestone is to combine the anticipation and planning models. As already indicated,
topics such as the management of prediction uncertainties and the investigation of knowledge
distribution must be explored in greater depth at this point.

Given that evaluations have demonstrated the heuristic driver model’s capability to
handle a wide array of less complex traffic scenarios, it is advisable to consider a modular
structure that switches between the heuristic decision strategy and the approaches introduced
within this thesis depending on the situation’s complexity. To facilitate this transition, a
thorough comprehension of when and to what extent the more complex modeling approaches
add value becomes essential. A preliminary exploration of this aspect is presented in Chapter
6, where the planning frameworks are compared with the heuristic model. Furthermore,
additional analyses are presented in Chapter 7. Based on such insights, one can either
formulate heuristic rules or devise an alternative intelligent distinction technique capable of
discerning a situation’s complexity, thus allowing for the selection of the appropriate
methodology for decision-making.
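The rule-based variant of such a switch can be sketched in a few lines. The complexity indicators used here (`n_interacting_agents`, `has_conflicting_paths`) and the threshold are hypothetical placeholders. Which indicators actually separate simple from complex situations is exactly the open question raised above.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical complexity indicators of the current traffic situation."""
    n_interacting_agents: int
    has_conflicting_paths: bool


def select_strategy(situation: Situation, max_simple_agents: int = 1) -> str:
    """Choose the decision strategy based on a simple complexity heuristic.

    Returns "planner" for complex situations and "heuristic" otherwise.
    """
    is_complex = (situation.has_conflicting_paths
                  and situation.n_interacting_agents > max_simple_agents)
    return "planner" if is_complex else "heuristic"
```

For example, `select_strategy(Situation(3, True))` selects the planner, while a free-driving situation without conflicting paths stays with the heuristic strategy and thereby avoids the higher runtime of trajectory planning.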

Finally, with the special focus on DS applied to DiL applications, the requirement of interactive
behavior and, thus, communication between road users needs to be considered in a broader
perspective. Communication in traffic can be exhibited implicitly, for example, by slowing down
in front of a crosswalk, or explicitly through a hand gesture [130]. As stated in the introduction
chapter, in this thesis, the output of a driver model was considered to be spatio-temporal
motion. However, concerning realistic behavior in interactive urban traffic, the modeling of
explicit communication cues, such as gestures, should also be considered. This poses challenges
such as the visual modeling of the gesture itself as well as the decision of when to show which
gesture. The latter requires a high degree of situational understanding and intelligence within
the model.

To stay within the context of DiL applications in DS, not only do the surrounding vehicles need
to be modeled using agent models. Besides vehicles, pedestrians and agent models covering a
variety of transportation possibilities, such as bicyclists, scooters, or motorcycles, are required.
When modeling such VRUs, new challenges arise due to the high degree of free-space motion
and because active communication, such as gaze direction, becomes even more relevant.

To conclude, numerous challenges remain for modeling and imitating human traffic behavior.
Human beings have a complex nature and various sophisticated cognitive abilities, which are
not yet fully explored or understood. To bridge this gap between computer science and human
nature, interdisciplinary perspectives, dynamic methods, and critical evaluations are required
to find the best solution satisfying individual requirements.

References

[1] Junqing Wei, John M Dolan, and Bakhtiar Litkouhi. “A learning-based autonomous
driver: emulate human driver’s intelligence in low-speed car following”. In: Unattended
ground, sea, and air sensor technologies and applications XII. Vol. 7693. SPIE. 2010,
pp. 93–104.

[2] Mysore N Sharath, Nagendra R Velaga, and Mohammed A Quddus. “2-dimensional
human-like driver model for autonomous vehicles in mixed traffic”. In: IET Intelligent
Transport Systems 14.13 (2020), pp. 1913–1922.

[3] Klaus Bengler et al. “UR:BAN human factors in traffic”. In: Approaches for Safe,
Efficient and Stress-Free Urban Traffic; Springer: Wiesbaden, Germany (2018).

[4] Zaheer Allam and Ayyoob Sharifi. “Research Structure and Trends of Smart Urban
Mobility”. In: Smart Cities 5.2 (2022), pp. 539–561.

[5] Cordelia Friesendorf and Luca Uedelhoven. “Megatrends Influencing Mobility”. In:
Mobility in germany: digital transformation, megatrends and the evolution of new
business models. SpringerBriefs in Business. Cham: Springer International Publishing,
2021, pp. 19–23.

[6] Maximilian Hübner et al. “External communication of automated vehicles in mixed
traffic: Addressing the right human interaction partner in multi-agent simulation”. In:
Transportation research part F: traffic psychology and behaviour 87 (2022), pp. 365–378.

[7] Lucas Bruck, Bruce Haycock, and Ali Emadi. “A review of driving simulation technology
and applications”. In: IEEE Open Journal of Vehicular Technology 2 (2020), pp. 1–16.

[8] Jason Jerald. “Immersion, presence and reality trade-offs”. In: 2016, pp. 45–52.

[9] Gustav Markkula et al. “Defining interactions: A conceptual framework for understanding
interactive behaviour in human and automated road traffic”. In: Theoretical Issues
in Ergonomics Science 21.6 (2020), pp. 728–752.

[10] Hermann Winner et al., eds. Handbook of driver assistance systems. Cham: Springer
International Publishing, 2016.

[11] Marco Luetzenberger and Sahin Albayrak. “Can you simulate traffic psychology? an
analysis”. In: 2013 Winter Simulations Conference (WSC). IEEE. 2013, pp. 1539–1550.

[12] Corina Apachite, Ralph Lauxmann, Robert Thiel, et al. “AI for Automated Driving”.
In: ATZelectronics worldwide 16.9 (2021), pp. 48–51.

[13] Almut Hochstaedter, Peter Zahn, and Karsten Breuer. “A comprehensive driver model
with application to traffic simulation and driving simulators”. In: Proc. Human-Centered
Transportation Simulation Conf. HCTSC, Iowa City. 2001.

[14] James Imende Obuhuma, Henry Okora Okoyo, and Sylvester Okoth McOyowo. “A
software agent for vehicle driver modeling”. In: 2019 IEEE AFRICON. IEEE. 2019,
pp. 1–8.

[15] Manuela Witt et al. “Cognitive driver behavior modeling: Influence of personality
and driver characteristics on driver behavior”. In: Advances in Human Aspects of
Transportation: Proceedings of the AHFE 2018 International Conference on Human
Factors in Transportation, July 21-25, 2018, Loews Sapphire Falls Resort at Universal
Studios, Orlando, Florida, USA 9. Springer. 2019, pp. 751–763.

[16] Jun JIANG and Jian LU. “Research of Driver-Vehicle Unit Model Framework Based
on Agent and ACT-R”. In: The National Science Foundation of China (2009). (Visited
on 11/05/2020).

[17] R. Herpers et al. Agentenbasierte Verkehrssimulation mit psychologischen
Persoenlichkeitsprofilen (AVeSi). Tech. rep. University of Applied Sciences Bonn-Rhein-Sieg,
Department of Computer Science, 2015.

[18] Mohammad Bahram et al. “Microscopic traffic simulation based evaluation of highly
automated driving on highways”. In: 17th International IEEE Conference on Intelligent
Transportation Systems (ITSC). IEEE. 2014, pp. 1752–1757.

[19] Benhamza Karima et al. “Agent-based modeling for traffic simulation”. In: Courrier du
savoir (Jan. 2012), pp. 51–56.

[20] Qianjiao Wu, Rong Lan, and Wei Zhou. “Traffic Flow Simulation Based on Adaptive
Agent”. In: 2018 3rd International Conference on Smart City and Systems Engineering
(ICSCSE). IEEE. 2018, pp. 697–700.

[21] Jie Sun, Yongping Zhang, and Jianbo Fan. “SmartAgents: a scalable infrastructure
for smart car”. In: 2011 12th International Conference on Parallel and Distributed
Computing, Applications and Technologies. IEEE. 2011, pp. 99–103.

[22] Amgad Naiem et al. “An agent based approach for modeling traffic flow”. In: 2010 The
7th international conference on informatics and systems (INFOS). IEEE. 2010, pp. 1–6.

[23] Levente Alekszejenkó and Tadeusz P Dobrowiecki. “SUMO Based Platform for
Cooperative Intelligent Automotive Agents.” In: SUMO. 2019, pp. 107–123.

[24] José LF Pereira and Rosaldo JF Rossetti. “An integrated architecture for autonomous
vehicles simulation”. In: Proceedings of the 27th annual ACM symposium on applied
computing. 2012, pp. 286–292.

[25] Ali Bazghandi. “Techniques, advantages and problems of agent based modeling for
traffic simulation”. In: International Journal of Computer Science Issues (IJCSI) 9.1
(2012), p. 115.

[26] Eric Bonabeau. “Agent-based modeling: Methods and techniques for simulating human
systems”. In: Proceedings of the national academy of sciences 99.suppl_3 (2002),
pp. 7280–7287.

[27] Charles M Macal and Michael J North. “Agent-based modeling and simulation”. In:
Proceedings of the 2009 winter simulation conference (WSC). IEEE. 2009, pp. 86–98.

[28] Li Zhou and Kai Zhao. “The design of agent-based intelligent traffic visualized simulation
system”. In: 2010 International Conference on Electrical and Control Engineering. IEEE.
2010, pp. 3066–3069.

[29] Pablo Alvarez Lopez et al. “Microscopic traffic simulation using SUMO”. In: 2018
21st international conference on intelligent transportation systems (ITSC). IEEE. 2018,
pp. 2575–2582.

[30] Tianjiao Wang, Jianping Wu, and Mike McDonald. “A micro-simulation model of
pedestrian-vehicle interaction behavior at unsignalized mid-block locations”. In: 2012
15th International IEEE Conference on Intelligent Transportation Systems. IEEE. 2012,
pp. 1827–1833.

[31] C-A Brunet, Ruben Gonzalez-Rubio, and Mario Tetreault. “A multi-agent architecture
for a driver model for autonomous road vehicles”. In: Proceedings 1995 Canadian
conference on electrical and computer engineering. Vol. 2. IEEE. 1995, pp. 772–775.

[32] Dario D Salvucci. “Modeling driver behavior in a cognitive architecture”. In: Human
factors 48.2 (2006), pp. 362–380.

[33] Hideki Fujii, Hideaki Uchida, and Shinobu Yoshimura. “Agent-based simulation
framework for mixed traffic of cars, pedestrians and trams”. In: Transportation research
part C: emerging technologies 85 (2017), pp. 234–248.

[34] Michael Behrisch et al. “SUMO–simulation of urban mobility: an overview”. In:
Proceedings of SIMUL 2011, The Third International Conference on Advances in
System Simulation. ThinkMind. 2011.

[35] Jaume Barceló et al. Fundamentals of traffic simulation. Vol. 145. Springer, 2010.

[36] Michael Kutz and Rainer Herpers. “Urban traffic simulation for games: a general
approach for simulation of urban actors”. In: Proceedings of the 2008 Conference on
Future Play: Research, Play, Share. 2008, pp. 181–184.

[37] Martin Fellendorf and Peter Vortisch. “Microscopic traffic flow simulator VISSIM”. In:
Fundamentals of traffic simulation (2010), pp. 63–93.

[38] Dominik Salles, Stefan Kaufmann, and Hans-Christian Reuss. “Extending the intelligent
driver model in SUMO and verifying the drive of trajectories with aerial measurements”.
In: 1 (2020), pp. 1–25.

[39] Jakob Erdmann. “Lane-changing model in SUMO”. In: Proceedings of the SUMO2014
modeling mobility with open data 24 (2014), pp. 77–88.

[40] cogniBIT - drivebot. WEBSITE. url: https://cognibit.de/drivebot/ (visited on
10/06/2021).

[41] AAI Intelligent Traffic. WEBSITE. url:
https://www.automotive-ai.com/aai-blog/caevrevent21 (visited on 01/10/2024).

[42]

Arpan Kusari et al. “Enhancing SUMO simulator for simulation based testing and

validation of autonomous vehicles”. In: 2022 IEEE Intelligent Vehicles Symposium (IV).

IEEE. 2022, pp. 829–835.

[43]

Philippe Mathieu and Antoine Nongaillard. “A risk-driven model for traffic simulation”.

In: Distributed Computing and Artificial Intelligence, 17th International Conference.

Springer. 2021, pp. 1–10.

[44]

Alexandre Bonhomme, Philippe Mathieu, and Sébastien Picault. “A versatile description

framework for modeling behaviors in traffic simulations”. In: 2014 IEEE 26th

International Conference on Tools with Artificial Intelligence. IEEE. 2014, pp. 937–944.

[45]

Xianyan Kuang et al. “Multi-Agent Based Microscopic Simulation Modeling for Urban

Traffic Flow”. In: Sensors & Transducers 180.10 (2014), p. 117.

[46]

Steffen Kampmann et al. “Automatic mapping of human behavior data to personality

model parameters for traffic simulations in virtual environments”. In: 2015 IEEE

conference on computational intelligence and games (CIG). IEEE. 2015, pp. 336–343.

[47]

Alexandra Fries et al. “Driver behavior model for the safety assessment of automated

driving”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022, pp. 1669–

1674.

[48]

Alexandra Fries et al. “Modeling driver behavior in critical traffic scenarios for the

safety assessment of automated driving”. In: Traffic injury prevention 24.sup1 (2023),

S105–S110.

[49]

Martin Treiber, Ansgar Hennecke, and Dirk Helbing. “Congested traffic states in

empirical observations and microscopic simulations”. In: Physical Review E 62.2 (2000),

pp. 1805–1824.

[50]

Kyle Brown, Katherine Driggs-Campbell, and Mykel J Kochenderfer. “A taxonomy and

review of algorithms for modeling and predicting human driver behavior”. In: arXiv

preprint arXiv:2006.08832 (2020).

[51]

Martin Fellendorf. “VISSIM: A microscopic simulation tool to evaluate actuated signal

control including bus priority”. In: 64th Institute of transportation engineers annual

meeting. Vol. 32. Springer. 1994, pp. 1–9.

[52]

Arne Kesting, Martin Treiber, and Dirk Helbing. “General lane-changing model MOBIL

for car-following models”. In: Transportation Research Record 1999.1 (2007), pp. 86–94.

[53]

Carl-Johan Hoel et al. “Combining planning and deep reinforcement learning in tactical

decision making for autonomous driving”. In: IEEE transactions on intelligent vehicles

5.2 (2019), pp. 294–305.

[54]

Manuela Witt et al. “Cognitive driver behavior modeling: Influence of personality

and driver characteristics on driver behavior”. In: Advances in Human Aspects of

Transportation: Proceedings of the AHFE 2018 International Conference on Human

Factors in Transportation, July 21-25, 2018, Loews Sapphire Falls Resort at Universal

Studios, Orlando, Florida, USA 9. Springer. 2019, pp. 751–763.

[55]

Teresa Rock, Thomas Bleher, and Mohammad Bahram. “Spider - the Simulation Framework at

BMW”. In: Proceedings of the Driving Simulation Conference 2024 Europe VR. Publication

in progress.

[56]

Rainer Wiedemann. “Simulation des Strassenverkehrsflusses”. In: Schriftenreihe des

Instituts für Verkehrswesen der Universität (TH) Karlsruhe Heft 8/1974 (1974).

[57]

Jiří Vokřínek et al. “A cooperative driver model for traffic simulations”. In: 2013

11th IEEE International Conference on Industrial Informatics (INDIN). IEEE. 2013,

pp. 756–761.

[58]

Xiaohui Li et al. “Real-time trajectory planning for autonomous urban driving:

Framework, algorithms, and verifications”. In: IEEE/ASME Transactions on mechatronics

21.2 (2015), pp. 740–753.

[59]

Yu Zhang et al. “Hybrid Trajectory Planning for Autonomous Driving in Highly

Constrained Environments”. In: IEEE Access 6 (2018), pp. 32800–32819.

[60]

Branka Mirchevska et al. “High-level decision making for safe and reasonable

autonomous lane changing using reinforcement learning”. In: 2018 21st International

Conference on Intelligent Transportation Systems (ITSC). IEEE. 2018, pp. 2156–2162.

[61]

Gustav Markkula et al. “Models of human decision-making as tools for estimating and

optimizing impacts of vehicle automation”. In: Transportation research record 2672.37

(2018), pp. 153–163.

[62]

David Isele. “Interactive decision making for autonomous vehicles in dense traffic”.

In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC). IEEE. 2019,

pp. 3981–3986.

[63]

Zirui Li et al. “Driver behavior modelling at the urban intersection via canonical

correlation analysis”. In: 2020 3rd International Conference on Unmanned Systems

(ICUS). IEEE. 2020, pp. 564–569.

[64]

Yonghwan Jeong and Kyongsu Yi. “Target Vehicle Motion Prediction-Based Motion

Planning Framework for Autonomous Driving in Uncontrolled Intersections”. In: IEEE

Transactions on Intelligent Transportation Systems 22.1 (2021), pp. 168–177.

[65]

Kichun Jo et al. “Development of Autonomous Car—Part II: A Case Study on the

Implementation of an Autonomous Driving System Based on Distributed Architecture”.

In: IEEE Transactions on Industrial Electronics 62.8 (2015), pp. 5119–5132.

[66]

Wonteak Lim et al. “Hierarchical Trajectory Planning of an Autonomous Car Based on

the Integration of a Sampling and an Optimization Method”. In: IEEE Transactions

on Intelligent Transportation Systems 19 (2018), pp. 613–626.

[67]

Wonteak Lim et al. “Hybrid trajectory planning for autonomous driving in on-road

dynamic scenarios”. In: IEEE Transactions on Intelligent Transportation Systems 22.1

(2019), pp. 341–355.

[68]

Florin Leon and Marius Gavrilescu. “A review of tracking and trajectory prediction

methods for autonomous driving”. In: Mathematics 9.6 (2021), p. 660.

[69]

Dongchan Kim, Hayoung Kim, and Kunsoo Huh. “Trajectory planning for autonomous

highway driving using the adaptive potential field”. In: 2018 21st International

Conference on Intelligent Transportation Systems (ITSC). IEEE. 2018, pp. 1069–1074.

[70]

Pramila P. Shinde and Seema Shah. “A Review of Machine Learning and Deep Learning

Applications”. In: 2018 Fourth International Conference on Computing Communication

Control and Automation (ICCUBEA). 2018, pp. 1–6.

[71]

Yassine Ouali, Céline Hudelot, and Myriam Tami. “An overview of deep semi-supervised

learning”. In: arXiv preprint arXiv:2006.05278 (2020).

[72]

Alexandre Alahi et al. “Social lstm: Human trajectory prediction in crowded spaces”.

In: Proceedings of the IEEE conference on computer vision and pattern recognition.

2016, pp. 961–971.

[73]

Alex Kuefler et al. “Imitating driver behavior with generative adversarial networks”.

In: 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2017, pp. 204–211.

[74]

Nachiket Deo, Akshay Rangesh, and Mohan M Trivedi. “How would surround vehicles

move? a unified framework for maneuver classification and motion prediction”. In: IEEE

Transactions on Intelligent Vehicles 3.2 (2018), pp. 129–140.

[75]

Joey Hong, Benjamin Sapp, and James Philbin. “Rules of the road: Predicting driving

behavior with a convolutional model of semantic interactions”. In: Proceedings of the

IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, pp. 8454–

8462.

[76]

Jens Schulz et al. “Learning interaction-aware probabilistic driver behavior models from

urban scenarios”. In: 2019 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2019,

pp. 1326–1333.

[77]

Xiaoyu Mo, Yang Xing, and Chen Lv. “Recog: A deep learning framework with

heterogeneous graph for interaction-aware trajectory prediction”. In: arXiv preprint

arXiv:2012.05032 (2020).

[78]

Xiaoyu Mo, Yang Xing, and Chen Lv. “Heterogeneous edge-enhanced graph attention

network for multi-agent trajectory prediction”. In: arXiv preprint arXiv:2106.07161

(2021).

[79]

Mohammadhossein Bahari, Ismail Nejjar, and Alexandre Alahi. “Injecting knowledge in

data-driven vehicle trajectory predictors”. In: Transportation research part C: emerging

technologies 128 (2021), p. 103010.

[80]

Debaditya Roy et al. “Vehicle trajectory prediction at intersections using interaction

based generative adversarial networks”. In: 2019 IEEE Intelligent transportation systems

conference (ITSC). IEEE. 2019, pp. 2318–2323.

[81]

Mohammadhossein Bahari et al. “Vehicle trajectory prediction works, but not

everywhere”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and

Pattern Recognition. 2022, pp. 17123–17133.

[82]

Iago Gomes and Denis Wolf. “A review on intention-aware and interaction-aware

trajectory prediction for autonomous vehicles”. In: Authorea Preprints (2023).

[83]

Rohan Chandra et al. “Traphic: Trajectory prediction in dense and heterogeneous

traffic using weighted interactions”. In: Proceedings of the IEEE/CVF Conference on

Computer Vision and Pattern Recognition. 2019, pp. 8483–8492.

[84]

Hyeongseok Jeon, Junwon Choi, and Dongsuk Kum. “Scale-net: Scalable vehicle

trajectory prediction network under random number of interacting vehicles via edge-

enhanced graph convolutional neural network”. In: 2020 IEEE/RSJ International

Conference on Intelligent Robots and Systems (IROS). IEEE. 2020, pp. 2095–2102.

[85]

Siyu Teng et al. “Motion planning for autonomous driving: The state of the art and

future perspectives”. In: IEEE Transactions on Intelligent Vehicles (2023).

[86]

David Lenz et al. “Deep neural networks for Markovian interactive scene prediction in

highway scenarios”. In: 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2017,

pp. 685–692.

[87]

Markus Koschi and Matthias Althoff. “Set-based prediction of traffic participants

considering occlusions and traffic rules”. In: IEEE Transactions on Intelligent Vehicles

6.2 (2020), pp. 249–265.

[88]

Jens Schulz et al. “Interaction-aware probabilistic behavior prediction in urban

environments”. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and

Systems (IROS). IEEE. 2018, pp. 3999–4006.

[89]

Noor Hafizah Amer et al. “Modelling and control strategies in path tracking control for

autonomous ground vehicles: a review of state of the art and challenges”. In: Journal

of intelligent & robotic systems 86 (2017), pp. 225–254.

[90]

Wuhong Wang et al. “A cross-cultural analysis of driving behavior under critical

situations: A driving simulator study”. In: Transportation research part F: traffic

psychology and behaviour 62 (2019), pp. 483–493.

[91]

F Freuli et al. “Cross-cultural perspective of driving style in young adults: Psychometric

evaluation through the analysis of the Multidimensional Driving Style Inventory”. In:

Transportation research part F: traffic psychology and behaviour 73 (2020), pp. 425–432.

[92]

Biying Shen et al. “The different effects of personality on prosocial and aggressive

driving behaviour in a Chinese sample”. In: Transportation research part F: traffic

psychology and behaviour 56 (2018), pp. 268–279.

[93]

Lei Zhang et al. “A quantification method of driver characteristics based on Driver

Behavior Questionnaire”. In: 2009 IEEE Intelligent Vehicles Symposium. IEEE. 2009,

pp. 616–620.

[94]

Jens Rasmussen. “Skills, rules, and knowledge; signals, signs, and symbols, and other

distinctions in human performance models”. In: IEEE Transactions on Systems, Man,

and Cybernetics SMC-13.3 (1983), pp. 257–266.

[95]

Edmund Donges. “Aspekte der aktiven Sicherheit bei der Führung von

Personenkraftwagen”. In: Automob-Ind 27.2 (1982).

[96]

Christian P Janssen et al. “Cognitive Modelling at the UCL Interaction Centre”. In:

2010.

[97]

Mehdi Cina and Ahmad B. Rad. “Categorized review of drive simulators and driver

behavior analysis focusing on ACT-R architecture in autonomous vehicles”. In:

Sustainable Energy Technologies and Assessments 56 (2023), p. 103044.

[98]

Christian P Janssen et al. “Computational Models of Human-Automated Vehicle

Interaction (Dagstuhl Seminar 22102)”. In: Dagstuhl Reports. Vol. 12. 3. Schloss

Dagstuhl-Leibniz-Zentrum für Informatik. 2022.

[99]

Christian P Janssen and Wayne D Gray. “When, what, and how much to reward in

reinforcement learning-based models of cognition”. In: Cognitive science 36.2 (2012),

pp. 333–358.

[100]

Tessa van der Heiden, Florian Mirus, and Herke van Hoof. “Social navigation with

human empowerment driven deep reinforcement learning”. In: Artificial Neural Networks

and Machine Learning–ICANN 2020: 29th International Conference on Artificial Neural

Networks, Bratislava, Slovakia, September 15–18, 2020, Proceedings, Part II 29. Springer.

2020, pp. 395–407.

[101]

Raunak P Bhattacharyya et al. “Multi-agent imitation learning for driving simulation”.

In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

IEEE. 2018, pp. 1534–1539.

[102]

Feryal Behbahani et al. “Learning from demonstration in the wild”. In: 2019

International Conference on Robotics and Automation (ICRA). IEEE. 2019, pp. 775–781.

[103]

Teresa Rock et al. “Quantifying Realistic Behaviour of Traffic Agents in Urban Driving

Simulation Based on Questionnaires”. In: 2022 IEEE Intelligent Vehicles Symposium

(IV). IEEE. 2022, pp. 1675–1682.

[104]

Liu Yang et al. “Effect of traffic density on drivers’ lane change and overtaking maneuvers

in freeway situation—A driving simulator–based study”. In: Traffic injury prevention

19.6 (2018), pp. 594–600.

[105]

Andreas Keler et al. “A bicycle simulator for experiencing microscopic traffic flow

simulation in urban environments”. In: 2018 21st International Conference on Intelligent

Transportation Systems (ITSC). IEEE. 2018, pp. 3020–3023.

[106]

Manuel Lindorfer, Christoph F Mecklenbraeuker, and Gerald Ostermayer. “Modeling

the imperfect driver: Incorporating human factors in a microscopic traffic model”. In:

IEEE Transactions on Intelligent Transportation Systems 19.9 (2017), pp. 2856–2870.

[107]

Aleksandar Kostikj, Milan Kjosevski, and Ljupcho Kocarev. “Validation of a microscopic

single lane urban traffic simulator”. In: 2014 International Conference on Connected

Vehicles and Expo (ICCVE). IEEE. 2014, pp. 850–854.

[108]

Christian Rudloff, Robert Schoenauer, and Martin Fellendorf. “Comparing Calibrated

Shared Space Simulation Model with Real-Life Data”. In: Transportation Research

Record (2013), pp. 1–14.

[109]

HyunSuk Kim et al. “Driving characteristics analysis of young and middle-aged drivers”.

In: 2016 International Conference on Information and Communication Technology

Convergence (ICTC). IEEE. 2016, pp. 864–867.

[110] Peter Mörtl et al. “Modelling driver styles based on driving data”. In: (2017).

[111]

Günther Prokop. “Modeling human vehicle driving by model predictive online

optimization”. In: Vehicle system dynamics 35.1 (2001), pp. 19–53.

[112]

Jianbang Liu et al. “A Survey on Deep-Learning Approaches for Vehicle Trajectory

Prediction in Autonomous Driving”. In: 2021 IEEE International Conference on Robotics

and Biomimetics (ROBIO) (2021), pp. 978–985.

[113]

Yiran Zhang et al. “Human-Like Interactive Behavior Generation for Autonomous

Vehicles: A Bayesian Game-Theoretic Approach with Turing Test”. In: Advanced

Intelligent Systems 4.5 (2022), p. 2100211.

[114]

Abs Dumbuya et al. “Complexity of traffic interactions: Improving behavioural

intelligence in driving simulation scenarios”. In: Complex systems and self-organization

modelling (2009), pp. 201–209.

[115]

Jari Takatalo, Göte Nyman, and Leif Laaksonen. “Components of human experience in

virtual environments”. In: Computers in Human Behavior 24.1 (2008), pp. 1–15.

[116]

Christophe Deniaud et al. “The concept of “presence” as a measure of ecological validity

in driving simulators”. In: Journal of Interaction Science 3 (2015), pp. 1–13.

[117]

Ilsun Rhiu et al. “The evaluation of user experience of a human walking and a driving

simulation in the virtual reality”. In: International journal of industrial ergonomics 79

(2020), p. 103002.

[118]

Mel Slater et al. “How we experience immersive virtual environments: the concept of

presence and its measurement”. In: Anuario de psicología 40.2 (2009), pp. 193–210.

[119]

Rajaram Bhagavathula et al. “The reality of virtual reality: A comparison of pedestrian

behavior in real and virtual environments”. In: 62.1 (2018), pp. 2056–2060.

[120]

Despina Michael et al. “Impact of immersion and realism in driving simulator studies”.

In: International Journal of Interdisciplinary Telecommunications and Networking

(IJITN) 6.1 (2014), pp. 10–25.

[121]

Dimitri Hein, Christian Mai, and Heinrich Hußmann. “The usage of presence

measurements in research: a review”. In: Proceedings of the International Society

for Presence Research Annual Conference (Presence). The International Society for

Presence Research Prague. 2018, pp. 21–22.

[122]

Martin Usoh et al. “Using presence questionnaires in reality”. In: Presence 9.5 (2000),

pp. 497–503.

[123]

Mel Slater, Martin Usoh, and Anthony Steed. “Depth of presence in virtual

environments”. In: Presence: Teleoperators & Virtual Environments 3.2 (1994), pp. 130–144.

[124]

Rosanna E Guadagno et al. “Virtual humans and persuasion: The effects of agency and

behavioral realism”. In: Media Psychology 10.1 (2007), pp. 1–22.

[125]

Xueni Pan and Antonia F de C Hamilton. “Why and how to use virtual reality to study

human social interaction: The challenges of exploring a new research landscape”. In:

British Journal of Psychology 109.3 (2018), pp. 395–417.

[126]

Jason Jerald. “Immersion, presence and reality trade-offs”. In: J. Jerald, The VR Book

Human-Centered Design for Virtual Reality (2016), pp. 45–52.

[127]

Clara Marina Martinez et al. “Driving style recognition for intelligent vehicle control

and advanced driver assistance: A survey”. In: IEEE Transactions on Intelligent

Transportation Systems 19.3 (2017), pp. 666–676.

[128]

Berthold Färber. “Communication and communication problems between autonomous

vehicles and human drivers”. In: Autonomous driving: Technical, legal and social aspects

(2016), pp. 125–144.

[129]

Mica R. Endsley. “Situation Awareness Misconceptions and Misunderstandings”. In:

Journal of Cognitive Engineering and Decision Making 9.1 (2015), pp. 4–32.

[130]

Victor Fabricius et al. “Interactions between heavy trucks and vulnerable road users—A

systematic review to inform the interactive capabilities of highly automated trucks”. In:

Frontiers in Robotics and AI 9 (2022), p. 818019.

[131] Herbert H Clark and Susan E Brennan. “Grounding in communication.” In: (1991).

[132]

Sebastian Brechtel. “Dynamic decision-making in continuous partially observable

domains: A novel method and its application for autonomous driving”. PhD thesis.

Karlsruhe, Karlsruher Institut für Technologie (KIT), Diss., 2015, 2015.

[133]

WS Lee et al. “Driving simulation for evaluation of driver assistance systems and driving

management systems”. In: sponsored by the Korea Transportation Institute under the

national project, ‘Development of National Traffic Core Technology’ (2007).

[134]

Guangquan Lu et al. “Measuring drivers’ takeover performance in varying levels of

automation: Considering the influence of cognitive secondary task”. In: Transportation

research part F: traffic psychology and behaviour 82 (2021), pp. 96–110.

[135]

Manish M Narkhede and Nilkanth B Chopade. “Review of advanced driver assistance

systems and their applications for collision avoidance in urban driving scenario”. In:

Machine Learning and Big Data Analytics (Proceedings of International Conference

on Machine Learning and Big Data Analytics (ICMLBDA) 2021). Springer. 2022,

pp. 253–267.

[136]

Leo Gugerty et al. “Situation awareness in driving”. In: Handbook for driving simulation

in engineering, medicine and psychology 1 (2011), pp. 265–272.

[137]

Florent Perronnet et al. “Deadlock prevention of self-driving vehicles in a network

of intersections”. In: IEEE Transactions on Intelligent Transportation Systems 20.11

(2019), pp. 4219–4233.

[138]

Jaskaran Grover, Changliu Liu, and Katia Sycara. “Deadlock Analysis and Resolution

in Multi-Robot Systems”. In: arXiv (2020). eprint: 1911.09146.

[139]

Steve Wright, Nicholas J Ward, and Anthony G Cohn. “Enhanced presence in

driving simulators using autonomous traffic with virtual personalities”. In: Presence:

Teleoperators & Virtual Environments 11.6 (2002), pp. 578–590.

[140]

Berndt Brehmer. “Dynamic decision making: Human control of complex systems”. In:

Acta Psychologica 81.3 (1992), pp. 211–241.

[141]

Cleotilde Gonzalez, Pegah Fakhari, and Jerome Busemeyer. “Dynamic decision making:

Learning processes and new research directions”. In: Human factors 59.5 (2017), pp. 713–

721.

[142]

Teresa Rock et al. “Data-Driven Prediction of Other Road Users’ Intention for Better

Scene Understanding in Traffic Agents”. In: Proceedings of the Driving Simulation

Conference 2022 Europe VR. Ed. by Andras Kemeny, Jean-Rémy Chardonnet, and

Florent Colombet. Driving Simulation Association. Strasbourg, France, Sept. 15, 2022,

pp. 9–16.

[143]

Mica R Endsley, Daniel J Garland, et al. “Theoretical underpinnings of situation

awareness: A critical review”. In: Situation awareness analysis and measurement 1.1

(2000), pp. 3–21.

[144]

Philipp Bender, Julius Ziegler, and Christoph Stiller. “Lanelets: Efficient map

representation for autonomous driving”. In: 2014 IEEE Intelligent Vehicles Symposium

Proceedings. IEEE. 2014, pp. 420–425.

[145]

Open Street Map. “Open street map”. In: Online: https://www.openstreetmap.org.

Search in (2014).

[146]

Claudine Badue et al. “Self-driving cars: A survey”. In: Expert Systems with Applications

165 (2021), p. 113816.

[147]

Kai-Wei Chiang et al. “Automated Modeling of Road Networks for High-Definition

Maps in OpenDRIVE Format Using Mobile Mapping Measurements”. In: Geomatics

2.2 (2022), pp. 221–235.

[148]

Alejandro Diaz-Diaz et al. “Hd maps: Exploiting opendrive potential for path planning

and map monitoring”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022,

pp. 1211–1217.

[149]

ASAM e.V. ASAM OpenDRIVE. July 2023. url: https://www.asam.net/standards/detail/opendrive/ (visited on 07/23/2023).

[150]

ASAM e.V. ASAM OpenDRIVE User Guide 1.7.0. July 2023. url: https://www.asam.net/fileadmin/Standards/OpenDRIVE/ASAM_OpenDRIVE_BS_V1-7-0.html (visited on 07/23/2023).

[151]

Martin H Strobl. “Spider-das innovative software-framework der bmw

fahrsimulation/spider-the innovative software framework of the bmw driving simulation”.

In: 1745. 2003.

[152]

Long Xin et al. “Enable faster and smoother spatio-temporal trajectory planning for

autonomous vehicles in constrained dynamic environment”. In: Proceedings of the

Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 235.4

(2021), pp. 1101–1112.

[153]

Yanjie Liu, Jiao Chen, and Xinyu Bai. “An approach for multi-objective obstacle

avoidance using dynamic occupancy grid map”. In: 2020 IEEE International Conference

on Mechatronics and Automation (ICMA). IEEE. 2020, pp. 1209–1215.

[154]

Emmanouil G. Tsardoulias et al. “A Review of Global Path Planning Methods for

Occupancy Grid Maps Regardless of Obstacle Density”. In: Journal of Intelligent &

Robotic Systems 84 (2016), pp. 829–858.

[155]

Julian Bock et al. “The ind dataset: A drone dataset of naturalistic road user trajectories

at german intersections”. In: 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE.

2020, pp. 1929–1934.

[156]

Matthias Schreier, Volker Willert, and Juergen Adamy. “Compact representation of

dynamic driving environments for ADAS by parametric free space and dynamic object

maps”. In: IEEE Transactions on Intelligent Transportation Systems 17.2 (2015),

pp. 367–384.

[157]

Timothy J.C. Nokes. “Mechanisms of knowledge transfer”. In: Thinking & Reasoning

15 (2009), pp. 1–36.

[158]

Pelin Onelcin and Yalcin Alver. “The crossing speed and safety margin of pedestrians

at signalized intersections”. In: Transportation Research Procedia 22 (2017), pp. 3–12.

[159]

Sen Zhang et al. “Representation of traffic congestion data for urban road traffic

networks based on pooling operations”. In: Algorithms 13.4 (2020), p. 84.

[160]

Teresa Rock et al. “On the Way to Reliable Trajectory Prediction in Urban Traffic”.

Advances in Transdisciplinary Engineering 2023, Publication in progress.

[161]

Sajjad Mozaffari et al. “Deep learning-based vehicle behavior prediction for autonomous

driving applications: A review”. In: IEEE Transactions on Intelligent Transportation

Systems 23.1 (2020), pp. 33–47.

[162]

Christoph Burger, Thomas Schneider, and Martin Lauer. “Interaction aware cooperative

trajectory planning for lane change maneuvers in dense traffic”. In: (2020), pp. 1–8.

[163]

Adam Houenou et al. “Vehicle trajectory prediction based on motion model and

maneuver recognition”. In: 2013 IEEE/RSJ international conference on intelligent

robots and systems. IEEE. 2013, pp. 4363–4369.

[164]

Xin Li, Xiaowen Ying, and Mooi Choo Chuah. “Grip: Graph-based interaction-aware

trajectory prediction”. In: 2019 IEEE Intelligent Transportation Systems Conference

(ITSC). IEEE. 2019, pp. 3960–3966.

[165]

Bin Zou et al. “A framework for trajectory prediction of preceding target vehicles in

urban scenario using multi-sensor fusion”. In: Sensors 22.13 (2022), p. 4808.

[166]

Beihao Xia et al. “CSCNet: Contextual semantic consistency network for trajectory

prediction in crowded spaces”. In: Pattern Recognition 126 (2022), p. 108552.

[167]

Jiachen Li et al. “Social-WaGDAT: Interaction-aware Trajectory Prediction via

Wasserstein Graph Double-Attention Network”. In: arXiv preprint arXiv:2002.06241

(2020).

[168]

Xin Li, Xiaowen Ying, and Mooi Choo Chuah. “GRIP++: Enhanced Graph-based

Interaction-aware Trajectory Prediction for Autonomous Driving”. In: Department of

Computer Science and Engineering, Lehigh University (2020).

[169]

Zhaoen Su et al. “Convolutions for spatial interaction modeling”. In: Proceedings

of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022,

pp. 6583–6592.

[170]

Chaofan Tao et al. “Dynamic and static context-aware lstm for multi-agent motion

prediction”. In: European Conference on Computer Vision. Springer. 2020, pp. 547–563.

[171]

Yu Wang and Shiwei Chen. “Multi-agent trajectory prediction with spatio-temporal

sequence fusion”. In: IEEE Transactions on Multimedia (2021).

[172]

Tianyang Zhao et al. “Multi-agent tensor fusion for contextual trajectory prediction”. In:

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

2019, pp. 12126–12134.

[173]

Yuexin Ma et al. “TrafficPredict: Trajectory Prediction for Heterogeneous

Traffic-Agents”. In: ArXiv abs/1811.02146 (2018).

[174]

Rohan Chandra et al. “Traphic: Trajectory prediction in dense and heterogeneous

traffic using weighted interactions”. In: Proceedings of the IEEE/CVF Conference on

Computer Vision and Pattern Recognition. 2019, pp. 8483–8492.

[175]

Jianyu Su et al. “Graph convolution networks for probabilistic modeling of driving

acceleration”. In: 2020 IEEE 23rd International Conference on Intelligent Transportation

Systems (ITSC). IEEE. 2020, pp. 1–8.

[176]

Jiachen Li et al. “Evolvegraph: Multi-agent trajectory prediction with dynamic relational

reasoning”. In: Advances in neural information processing systems 33 (2020), pp. 19783–

19794.

[177]

A Quintanar et al. “Predicting vehicles trajectories in urban scenarios with transformer

networks and augmented information”. In: 2021 IEEE Intelligent Vehicles Symposium

(IV). IEEE. 2021, pp. 1051–1056.

[178]

Cunjun Yu et al. “Spatio-temporal graph transformer networks for pedestrian trajectory

prediction”. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow,

UK, August 23–28, 2020, Proceedings, Part XII 16. Springer. 2020, pp. 507–523.

[179]

Jiachen Li et al. “Social-WaGDAT: Interaction-aware Trajectory Prediction via

Wasserstein Graph Double-Attention Network”. In: ArXiv abs/2002.06241 (2020).

[180]

Xiaoyu Mo et al. “Multi-agent trajectory prediction with heterogeneous edge-enhanced

graph attention network”. In: IEEE Transactions on Intelligent Transportation Systems

23.7 (2022), pp. 9554–9567.

[181]

Shengnan Guo et al. “Attention based spatial-temporal graph convolutional networks

for traffic flow forecasting”. In: 33.01 (2019), pp. 922–929.

[182]

Hao Cheng et al. “MCENET: Multi-context encoder network for homogeneous agent

trajectory prediction in mixed traffic”. In: 2020 IEEE 23rd International Conference

on Intelligent Transportation Systems (ITSC). IEEE. 2020, pp. 1–8.

[183]

Namhoon Lee et al. “Desire: Distant future prediction in dynamic scenes with interacting

agents”. In: Proceedings of the IEEE conference on computer vision and pattern

recognition. 2017, pp. 336–345.

[184]

Luca Rossi et al. “Human trajectory prediction and generation using LSTM models

and GANs”. In: Pattern Recognition 120 (2021), p. 108136.

[185]

Xiaoyu Mo, Yang Xing, and Chen Lv. “Graph and recurrent neural network-based vehicle

trajectory prediction for highway driving”. In: 2021 IEEE International Intelligent

Transportation Systems Conference (ITSC). IEEE. 2021, pp. 1934–1939.

[186]

Xiaoyu Mo, Yang Xing, and Chen Lv. “Recog: A deep learning framework with

heterogeneous graph for interaction-aware trajectory prediction”. In: arXiv preprint

arXiv:2012.05032 (2020).

[187]

Agrim Gupta et al. “Social GAN: Socially Acceptable Trajectories with Generative

Adversarial Networks”. In: 2018 IEEE/CVF Conference on Computer Vision and

Pattern Recognition (2018), pp. 2255–2264.

[188]

Arsal Syed and Brendan Tran Morris. “Semantic scene upgrades for trajectory

prediction”. In: Machine vision and applications 34.2 (2023), p. 23.

[189]

Junwei Liang, Lu Jiang, and Alexander Hauptmann. “Simaug: Learning robust

representations from simulation for trajectory prediction”. In: Computer Vision–ECCV

2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part

XIII 16. Springer. 2020, pp. 275–292.

[190]

Maximilian Schäfer et al. “Context-Aware Scene Prediction Network (CASPNet)”. In:

2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC).

IEEE. 2022, pp. 3970–3977.

[191]

Hang Zhao et al. “Tnt: Target-driven trajectory prediction”. In: Conference on Robot

Learning. PMLR. 2021, pp. 895–904.

[192]

Osval Antonio Montesinos López, Abelardo Montesinos López, and Jose Crossa.

“Overfitting, model tuning, and evaluation of prediction performance”. In: Multivariate

statistical machine learning methods for genomic prediction. Springer, 2022, pp. 109–139.

[193]

Agrim Gupta et al. “Social gan: Socially acceptable trajectories with generative

adversarial networks”. In: Proceedings of the IEEE conference on computer vision

and pattern recognition. 2018, pp. 2255–2264.

[194]

Bogdan Ilie Sighencea, Rareș Ion Stanciu, and Cătălin Daniel Căleanu. “A review of

deep learning-based methods for pedestrian trajectory prediction”. In: Sensors 21.22

(2021), p. 7543.

[195] François Chollet et al. Keras. https://keras.io. 2015.

[196]

Samy Bengio and Yoshua Bengio. “Taking on the curse of dimensionality in joint

distributions using neural networks”. In: IEEE Transactions on Neural Networks 11.3

(2000), pp. 550–557.

[197]

G. Hughes. “On the mean accuracy of statistical pattern recognizers”. In: IEEE

Transactions on Information Theory 14.1 (1968), pp. 55–63.

[198]

Jui-En Lo et al. “Data Homogeneity Effect in Deep Learning-Based Prediction of Type

1 Diabetic Retinopathy”. In: Journal of Diabetes Research 2021 (2021).

[199] Juanwu Lu et al. “Generalizability analysis of graph-based trajectory predictor with vectorized representation”. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE. 2022, pp. 13430–13437.

[200] Wenshuo Wang et al. “Social interactions for autonomous driving: A review and perspectives”. In: Foundations and Trends® in Robotics 10.3-4 (2022), pp. 198–376.

[201] Teresa Rock et al. “Dynamic Decision-Making for Agent Models in Urban Driving Simulation”. In: Proceedings of the Driving Simulation Conference 2023 Europe VR. Ed. by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation Association. Antibes, France, 2023, pp. 169–179.

[202] Christos Katrakazas et al. “Real-time motion planning methods for autonomous on-road driving: State-of-the-art and future research directions”. In: Transportation Research Part C: Emerging Technologies 60 (2015), pp. 416–442.

[203] David González et al. “A Review of Motion Planning Techniques for Automated Vehicles”. In: IEEE Transactions on Intelligent Transportation Systems 17.4 (2016), pp. 1135–1145.

[204] Stanislav A Goll et al. “Unmanned ground vehicle local trajectory planning algorithm”. In: 2016 5th Mediterranean Conference on Embedded Computing (MECO). IEEE. 2016, pp. 317–321.

[205] Yao Qi et al. “Hierarchical Motion Planning for Autonomous Vehicles in Unstructured Dynamic Environments”. In: IEEE Robotics and Automation Letters 8.2 (2022), pp. 496–503.

[206] Donghao Xu et al. “Naturalistic lane change analysis for human-like trajectory generation”. In: 2018 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2018, pp. 1393–1399.

[207] Ivo Batkovic et al. “Real-time constrained trajectory planning and vehicle control for proactive autonomous driving with road users”. In: 2019 18th European Control Conference (ECC). IEEE. 2019, pp. 256–262.

[208] Kamal Kant and Steven Zucker. “Toward Efficient Trajectory Planning: The Path-Velocity Decomposition”. In: International Journal of Robotic Research - IJRR 5 (Sept. 1986), pp. 72–89.

[209] Vasundhara Jain et al. “Collision Avoidance for Multiple Static Obstacles using Path-Velocity Decomposition”. In: IFAC-PapersOnLine 52.8 (2019), pp. 265–270.

[210] Moritz Werling et al. “Optimal trajectories for time-critical street scenarios using discretized terminal manifolds”. In: The International Journal of Robotics Research 31.3 (2012), pp. 346–359.

[211] Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. “A Formal Basis for the Heuristic Determination of Minimum Cost Paths”. In: IEEE Transactions on Systems Science and Cybernetics 4.2 (1968), pp. 100–107.

[212] Michael T Wolf and Joel W Burdick. “Artificial potential functions for highway driving with collision avoidance”. In: 2008 IEEE International Conference on Robotics and Automation. IEEE. 2008, pp. 3731–3736.


[213] Chang Liu, Li Zhai, and XueYing Zhang. “Research on local real-time obstacle avoidance path planning of unmanned vehicle based on improved artificial potential field method”. In: 2022 6th CAA International Conference on Vehicular Control and Intelligence (CVCI). IEEE. 2022, pp. 1–6.

[214] Jiayi Sun, Jun Tang, and Songyang Lao. “Collision Avoidance for Cooperative UAVs With Optimized Artificial Potential Field Algorithm”. In: IEEE Access 5 (2017), pp. 18382–18390.

[215] Pauli Virtanen et al. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python”. In: Nature Methods 17 (2020), pp. 261–272.

[216] Constantin Berger. “Software-Efficient Trajectory Planning for Autonomous Driving Agent Models in Simulated Urban Environments”. Master’s thesis. Technical University of Munich, Chair of Integrated Systems, June 2022.

[217] Sheng Zhu and Bilin Aksun-Guvenc. “Trajectory planning of autonomous vehicles based on parameterized control optimization in dynamic on-road environments”. In: Journal of Intelligent & Robotic Systems 100 (2020), pp. 1055–1067.

[218] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. 1st. USA: Addison-Wesley Longman Publishing Co., Inc., 1989.

[219] Teresa Rock et al. “Objectively Scoring the Human-Likeness of Artificial Driver Models”. In: Applied Sciences 13.18 (2023).

[220] Meixin Zhu, Xuesong Wang, and Yinhai Wang. “Human-like autonomous car-following model with deep reinforcement learning”. In: Transportation research part C: emerging technologies 97 (2018), pp. 348–368.

[221] Il Bae et al. “Self-driving like a human driver instead of a robocar: Personalized comfortable driving experience for autonomous vehicles”. In: arXiv preprint arXiv:2001.03908 (2020).

[222] Jun Wang et al. “Normal deceleration behavior of passenger vehicles at stop sign–controlled intersections evaluated with in-vehicle Global Positioning System data”. In: Transportation research record 1937.1 (2005), pp. 120–127.

[223] Talal Al-Shihabi and Ronald R Mourant. “Toward more realistic driving behavior models for autonomous vehicles in driving simulators”. In: Transportation Research Record 1843.1 (2003), pp. 41–49.

[224] Arne Kesting, Martin Treiber, and Dirk Helbing. “Enhanced intelligent driver model to access the impact of driving strategies on traffic capacity”. In: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 368.1928 (2010), pp. 4585–4605.

[225] Mysore Narasimhamurthy Sharath and Babak Mehran. “A Literature Review of Performance Metrics of Automated Driving Systems for On-Road Vehicles”. In: Frontiers in Future Transportation (2021), p. 28.

[226] Adrian Zlocki et al. “Logical Scenarios Parameterization for Automated Vehicle Safety Assessment: Comparison of Deceleration and Cut-In Scenarios From Japanese and German Highways”. In: IEEE Access 10 (2022), pp. 26817–26829.


[227] Kyle Sama et al. “Extracting human-like driving behaviors from expert driver data using deep learning”. In: IEEE transactions on vehicular technology 69.9 (2020), pp. 9315–9329.

[228] Morton S Raff et al. “A volume warrant for urban stop signs”. In: (1950).

[229] Brian L Allen, B Tom Shin, and Peter J Cooper. Analysis of traffic conflicts and collisions. Tech. rep. 1978.

[230] Michiel M Minderhoud and Piet HL Bovy. “Extended time-to-collision measures for road traffic safety assessment”. In: Accident Analysis & Prevention 33.1 (2001), pp. 89–97.

[231] Katja Vogel. “A comparison of headway and time to collision as safety indicators”. In: Accident analysis & prevention 35.3 (2003), pp. 427–433.

[232] Manish Dutta and Mokaddes Ali Ahmed. “Gap acceptance behavior of drivers at uncontrolled T-intersections under mixed traffic conditions”. In: Journal of modern transportation 26.2 (2018), pp. 119–132.

[233] Nengchao Lyu, Jiaqiang Wen, and Chaozhong Wu. “Novel time-delay side-collision warning model at non-signalized intersections based on vehicle-to-infrastructure communication”. In: International journal of environmental research and public health 18.4 (2021), p. 1520.

[234] Ye Li et al. “Exploring transition durations of rear-end collisions based on vehicle trajectory data: a survival modeling approach”. In: Accident Analysis & Prevention 159 (2021), p. 106271.

[235] Maike Scholtes et al. “6-layer model for a structured description and categorization of urban traffic and environment”. In: IEEE Access 9 (2021), pp. 59131–59147.

[236] Paul Jaccard. “Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines”. In: Bull Soc Vaudoise Sci Nat 37 (1901), pp. 241–272.

[237] Silvia Facchinetti et al. “A procedure to find exact critical values of Kolmogorov-Smirnov test”. In: Statistica Applicata–Italian Journal of Applied Statistics 21.3-4 (2009), pp. 337–359.

[238] Fin Malte Heuer. “Scenario Generation for Testing of Automated Driving Functions based on Real Data: Master’s Thesis”. PhD thesis. Braunschweig, 2022.

[239] Mel Slater and Anthony Steed. “A Virtual Presence Counter”. In: Presence: Teleoper. Virtual Environ. 9 (Oct. 2000), pp. 413–434.

[240] Valentin Schwind et al. “Using presence questionnaires in virtual reality”. In: Proceedings of the 2019 CHI conference on human factors in computing systems. 2019, pp. 1–12.

[241] Matthew Lombard, Theresa B Ditton, and Lisa Weinstein. “Measuring presence: the temple presence inventory”. In: Proceedings of the 12th annual international workshop on presence. 2009, pp. 1–15.

[242] F Wilcoxon. Individual comparisons by ranking methods. Biom. Bull., 1, 80–83. 1945.


A Appendix

A.1 Additional Material for the Context Representation Method

The following table lists all features extracted from the data sources to describe the situational context. The suffix n on partner and interaction features is the index of the interaction partner. For example, when the ego vehicle interacts with two other cars, agent_type_p1 and agent_type_p2 both have the value "car".


Table A.1: All features describing a driving scene at time t from the perspective of an individual ego vehicle, based on features from the inD dataset [155].

| Category | Feature | Description | Unit |
|---|---|---|---|
| Ego vehicle features (E) | agent_type | Classification of ego vehicle: car, motorcycle, van, truck_bus | - |
| Ego vehicle features (E) | xCenter | X-coordinate of vehicle center | [m] |
| Ego vehicle features (E) | yCenter | Y-coordinate of vehicle center | [m] |
| Ego vehicle features (E) | heading | Heading angle of vehicle (0-360 degrees) | [°] |
| Ego vehicle features (E) | width | Width of the vehicle | [m] |
| Ego vehicle features (E) | length | Length of the vehicle | [m] |
| Ego vehicle features (E) | xVelocity | X-component of vehicle velocity | [m/s] |
| Ego vehicle features (E) | yVelocity | Y-component of vehicle velocity | [m/s] |
| Ego vehicle features (E) | xAcceleration | X-component of vehicle acceleration | [m/s^2] |
| Ego vehicle features (E) | yAcceleration | Y-component of vehicle acceleration | [m/s^2] |
| Ego vehicle features (E) | lonVelocity | Longitudinal velocity of the vehicle | [m/s] |
| Ego vehicle features (E) | latVelocity | Lateral velocity of the vehicle | [m/s] |
| Ego vehicle features (E) | lonAcceleration | Longitudinal acceleration of the vehicle | [m/s^2] |
| Ego vehicle features (E) | latAcceleration | Lateral acceleration of the vehicle | [m/s^2] |
| Map information / Map representation (M) | logical lane assignment | Lane the vehicle is logically driving on, described by lane id, section id, section type and s-position within the lane | - |
| Map information / Map representation (M) | physical lane assignment | Lanes the vehicle is physically driving on, described by lane id, section id, section type and s-position within the lane | - |
| Map information / Map representation (M) | conflict_lanes | All lanes that have potential conflicts with any ego lane, with the respective semantic and spatial relationship and the ego lane from which the conflict originates | - |
| Map information / Map representation (M) | lane_start_direction | Direction value of the first point of the lane the vehicle is logically driving on | [rad] |
| Map information / Map representation (M) | lane_end_direction | Direction value of the last point of the lane the vehicle is logically driving on | [rad] |
| Map information / Map representation (M) | lane_turn_direction | Turn direction of the lane the vehicle is logically driving on (STRAIGHT, LEFT, RIGHT) | - |
| Map information / Map representation (M) | lane_curvature_in_5m, lane_curvature_in_10m, lane_curvature_in_20m | Lane curvature calculated for the lane the vehicle is logically driving on, 5, 10, and 20 meters from the current position | [1/m] |
| Map information / Map representation (M) | lane_direction_in_5m, lane_direction_in_10m, lane_direction_in_20m | Lane direction calculated for the lane the vehicle is logically driving on, 5, 10, and 20 meters from the current position | [rad] |
| Map information / Map representation (M) | lane_width_in_5m, lane_width_in_10m, lane_width_in_20m | Lane width calculated for the lane the vehicle is logically driving on, 5, 10, and 20 meters from the current position | [m] |
| Map information / Map representation (M) | center_line_x_in_5m, center_line_x_in_10m, center_line_x_in_20m | Global x coordinates of the center_line position 5, 10, and 20 meters from the current position | [m] |
| Map information / Map representation (M) | center_line_y_in_5m, center_line_y_in_10m, center_line_y_in_20m | Global y coordinates of the center_line position 5, 10, and 20 meters from the current position | [m] |
| Partner Features (P) | agent_type_pn | Type of partner: car, motorcycle, van, truck_bus, pedestrian, bicycle | - |
| Partner Features (P) | xCenter_pn | X-coordinate of partner center | [m] |
| Partner Features (P) | yCenter_pn | Y-coordinate of partner center | [m] |
| Partner Features (P) | heading_pn | Heading angle of partner (0-360 degrees) | [°] |
| Partner Features (P) | width_p_pn | Width of partner | [m] |
| Partner Features (P) | length_pn | Length of partner | [m] |
| Partner Features (P) | xVelocity_pn | X-component of partner velocity | [m/s] |
| Partner Features (P) | yVelocity_pn | Y-component of partner velocity | [m/s] |
| Partner Features (P) | xAcceleration_pn | X-component of partner acceleration | [m/s^2] |
| Partner Features (P) | yAcceleration_pn | Y-component of partner acceleration | [m/s^2] |
| Partner Features (P) | lonVelocity_pn | Longitudinal velocity of partner | [m/s] |
| Partner Features (P) | latVelocity_pn | Lateral velocity of partner | [m/s] |
| Partner Features (P) | lonAcceleration_pn | Longitudinal acceleration of partner | [m/s^2] |
| Partner Features (P) | latAcceleration_pn | Lateral acceleration of partner | [m/s^2] |
| Interaction Features (I) | relationship2ego_veh_pn | Relationship between ego and partner: SAME_LANE, MERGING, GIVING_ROW, RECEIVING_ROW, IN_WRONG_LANE, VRU_INTERACTION | - |
| Interaction Features (I) | pos2conf_veh_pn | Position of partner vehicle to conflict (before, inside, after) calculated by conflict area from the map (feature only for partner vehicles!) | - |
| Interaction Features (I) | ego_pos2conf_vehn | Position of ego to conflict (before, inside, after) calculated by conflict area from the map (feature only for partner vehicles!) | - |
| Interaction Features (I) | route_opt_veh_pn | Route options of partner: LEFT, STRAIGHT, RIGHT (feature only for partner vehicles!) | - |
| Interaction Features (I) | dis2conf_pn | Distance to center of conflict zone | [m] |
| Interaction Features (I) | dis2ego_pn | Euclidean distance between ego and partner center point | [m] |
| Interaction Features (I) | rel_speed_pn | Relative velocity between ego and partner | [m/s] |
| Interaction Features (I) | rel_heading_pn | Relative heading between ego and partner | [°] |
| Interaction Features (I) | angle2conf_veh_pn | Heading angle of partner with respect to conflict | [°] |

Column P, entries as printed: x, -, x.
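As an illustration of the suffix convention and of three interaction features from Table A.1, the following minimal sketch builds partner-indexed feature names and derives dis2ego_pn, rel_speed_pn, and rel_heading_pn for two partners. The helper names, the dictionary layout, and the sign conventions (partner minus ego, heading difference wrapped to the range from -180° to 180°) are assumptions made for illustration and are not taken from the implementation used in this work.

```python
import math


def partner_feature_names(base_names, n_partners):
    """Expand base feature names with the suffixes _p1 ... _pn (hypothetical helper)."""
    return [f"{name}_p{i}" for i in range(1, n_partners + 1) for name in base_names]


def interaction_features(ego, partner, index):
    """Derive three interaction features for the partner with the given index.

    Assumed conventions: relative speed is partner speed minus ego speed,
    relative heading is partner heading minus ego heading in degrees.
    """
    dx = partner["xCenter"] - ego["xCenter"]
    dy = partner["yCenter"] - ego["yCenter"]
    ego_speed = math.hypot(ego["xVelocity"], ego["yVelocity"])
    partner_speed = math.hypot(partner["xVelocity"], partner["yVelocity"])
    # Wrap the heading difference into the half-open interval [-180, 180).
    rel_heading = (partner["heading"] - ego["heading"] + 180.0) % 360.0 - 180.0
    return {
        f"dis2ego_p{index}": math.hypot(dx, dy),
        f"rel_speed_p{index}": partner_speed - ego_speed,
        f"rel_heading_p{index}": rel_heading,
    }


# Made-up example values: an ego vehicle interacting with two cars.
ego = {"xCenter": 0.0, "yCenter": 0.0, "heading": 350.0,
       "xVelocity": 3.0, "yVelocity": 4.0}
partners = [
    {"agent_type": "car", "xCenter": 3.0, "yCenter": 4.0, "heading": 10.0,
     "xVelocity": 6.0, "yVelocity": 8.0},
    {"agent_type": "car", "xCenter": -6.0, "yCenter": 8.0, "heading": 170.0,
     "xVelocity": 0.0, "yVelocity": 0.0},
]

scene = {}
for i, partner in enumerate(partners, start=1):
    scene[f"agent_type_p{i}"] = partner["agent_type"]
    scene.update(interaction_features(ego, partner, i))

names = partner_feature_names(["agent_type", "xCenter"], 2)
```

With two partners, the scene dictionary carries one set of suffixed entries per partner, so agent_type_p1 and agent_type_p2 both hold "car" in this example.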

A.2 Additional Material for the Dynamic Decision-making Method

Videos of the resulting trajectories and figures of the velocity, acceleration, and jerk profiles can be found under this link:

https://www.dropbox.com/scl/fi/hbxqvxjjox5nb8cx7bji5/Supplementary_material_Paper_DSC2023_Rock.pdf?rlkey=iseqj2tykudmtjizpcv7ckinp&dl=0.
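The three profile types are linked by successive time derivatives: acceleration is the rate of change of velocity, and jerk is the rate of change of acceleration. As a minimal sketch, assuming a uniformly sampled velocity signal and a hypothetical helper named finite_difference (neither the helper nor the sample values come from the planners evaluated here), the profiles can be approximated with forward differences:

```python
def finite_difference(values, dt):
    """Forward difference of a uniformly sampled signal.

    Returns len(values) - 1 samples, each approximating the time derivative
    over one sampling interval of length dt (in seconds).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(b - a) / dt for a, b in zip(values, values[1:])]


dt = 0.5                                    # sampling interval in s (made-up value)
velocity = [0.0, 1.0, 3.0, 6.0, 10.0]       # velocity samples in m/s (made-up values)
acceleration = finite_difference(velocity, dt)   # in m/s^2
jerk = finite_difference(acceleration, dt)       # in m/s^3
```

Each differentiation step shortens the signal by one sample and amplifies measurement noise, so smoothing is usually applied before computing jerk from recorded data.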

A.2.1 Scenario A_VEH Solved by the Two Planners and the Heuristic Agent Model

The behavior of the heuristic agent model and the two planners in scenario A_VEH is illustrated in Figure 6.7 in Chapter 6.

(a) Scenario A_VEH solved by PL_PVD. (b) Scenario A_VEH solved by PL_3D.

Figure A.1: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario A_VEH.


Figure A.2: Velocity, acceleration, and jerk profile of the heuristic agent model in scenario A_VEH.

A.2.2 Scenario B_PED Solved by the Two Planners and the Heuristic Agent Model

(a) Scenario B_PED solved by TRM. (b) Scenario B_PED solved by PL_PVD. (c) Scenario B_PED solved by PL_3D.

Figure A.3: Behavior of the heuristic agent model and the two planners in scenario B_PED at

the same time step. Ego vehicle in green.


(a) Scenario B_PED solved by PL_PVD (b) Scenario B_PED solved by PL_3D

Figure A.4: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario B_PED.

A.2.3 Scenario C_STAT Solved by the Two Planners and the Heuristic Agent Model

The behavior of the heuristic agent model and the two planners in scenario C_STAT is illustrated in Figure 6.8 in Chapter 6.

Figure A.5: Velocity, acceleration, and jerk profile of the heuristic agent model in scenario B_PED.


(a) Scenario C_STAT solved by PL_PVD (b) Scenario C_STAT solved by PL_3D

Figure A.6: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario C_STAT.

A.2.4 Scenario D_BIC Solved by the Two Planners and the Heuristic Agent Model

The behavior of the heuristic agent model and the two planners in scenario D_BIC is illustrated in Figure 6.10 in Chapter 6.

Figure A.7: Velocity, acceleration, and jerk profile of the heuristic agent model in scenario C_STAT.


(a) Scenario D_BIC solved by PL_PVD. (b) Scenario D_BIC solved by PL_3D.

Figure A.8: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario D_BIC.

Figure A.9: Velocity, acceleration, and jerk profile of the heuristic agent model in scenario D_BIC.


(a) Scenario D1 best quality parameter set.

(b) Scenario D1 best runtime parameter set.

(c) Scenario D1 global parameter set. (d) Scenario D1 local parameter set.

Figure A.10: Behavior of the PL_3D planner employing different parameter sets in variation D1 of scenario D_BIC.

A.2.5 Scenario D1 Solved by the PL_3D Planner

(a) Scenario D1 best quality parameter set.

(b) Scenario D1 best runtime parameter set.

Figure A.11: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the PL_3D planner employing the best quality and best runtime parameter sets.


(a) Scenario D1 global parameter set. (b) Scenario D1 local parameter set.

Figure A.12: Velocity, acceleration, and jerk profile for the first guess and final trajectory of the PL_3D planner employing the global and local parameter sets.


A.3 Additional Material for the Simulator Experiment

The following tables show the questionnaires used during the simulator experiment. They combine items from established questionnaires for measuring perceived realism in VEs with additional use-case-specific items that explore the subjective evaluation of traffic agents.

Table A.2: Questionnaire 1: Investigating the sense of presence. Presence (PRE) items according

to M. Slater and A. Steed [239] and one additional item investigating the degree of naturalism of

the driving style.

| No. | Code | Item | Left anchor | Right anchor |
|---|---|---|---|---|
| 1.1 | PRE | Please rate your sense of being in the traffic scenario, on the following scale from 1 to 7, where 7 represents your normal experience of being in a place. I had a sense of "being there" in the traffic scenario. | Not at all | Very much |
| 1.2 | PRE | To what extent were there times during the experience when the virtual traffic situation became the "reality" for you, and you almost forgot about the "real world" of the simulator room in which the whole experience was really taking place? There were times during the experience when the virtual environment became more real for me compared to the "real world". | At no time | Almost all the time |
| 1.3 | PRE | When you think back about your experience, do you think of the city more as images that you saw, or more as somewhere you drove through? "The traffic scenario seems to me to be more like" | images that I saw | somewhere you drove through |
| 1.4 | PRE | During the time of the experience, which was strongest on the whole, your sense of being in the city or of being in the real world of the simulator? I had a strong sense of being in ... | The real world of the simulator | The virtual reality of the city |
| 1.5 | PRE | During the time of the experience, did you often think to yourself that you were actually just sitting in a simulator or did the virtual environment overwhelm you? I often thought I was sitting in a simulator ... | Almost all the time | At no time |
| 1.6 | NAT | How much did you behave within the driving simulator as if the situation were real? I responded as if the situation were real. | Not at all | Very much |


Table A.3: Questionnaire 2: Evaluating the degree of realism of the traffic agents.

| No. | Code | Item | Left anchor | Right anchor |
|---|---|---|---|---|
| 2.1 | STAT | The events I saw could occur in the real world. [241] | Not at all | Very much |
| 2.2 | STAT | Overall, how much did the other traffic participants in the virtual environment behave and move like they were real? [241] | Not at all | Very much |
| 2.3 | INTERA | How natural did your interactions with the other traffic participants seem? [117] | Like reality | Not real at all |
| 2.4 | SPA-TEM | How often did you have the sensation that other traffic participants you saw could also see you? [241] | Never | Always |
| 2.5 | INTERA | To what extent did you feel you could interact with the other traffic participants you saw? [241] | Not at all | Very much |
| 2.6 | INTERA | The other traffic participants reacted to my actions. | Strongly Disagree | Strongly Agree |
| 2.7 | AGREE | I would behave more naturally in the driving simulator if the other traffic participants interacted with me more often or reacted to my actions more often. | Strongly Disagree | Strongly Agree |
| 2.8 | - | How does the behavior of other road users that you observed differ from what you experienced in real traffic? | - | - |
