Smart Agents
Human-Like Driver Agents in Simulated Urban
Environments Based on Situational Understanding
and Dynamic Decision-Making
submitted by
Teresa Rock, M. Sc.
to Faculty V – Mechanical Engineering and Transport Systems
of Technische Universität Berlin
in fulfillment of the requirements for the academic degree of
Doktorin der Ingenieurwissenschaften
- Dr.-Ing. -
approved dissertation
Doctoral committee:
Chair: Prof. Dr. Christina Völlmecke
Reviewer: Prof. Dr. Stefanie Marker
Reviewer: Prof. Dr. Nele Rußwinkel
Date of the scientific defense: 11 April 2024
Berlin 2024
Zusammenfassung
The need for intelligent driver models that can handle complex traffic situations in a way
that resembles human behavior arises from various application areas. In the context of
autonomous driving, driver models are used to replace the human driver and to offer safe and
flexible mobility solutions. Autonomous vehicles that behave in a human-like manner are
assumed to make traffic interactions safer and to be more readily accepted by users [1, 2].
Driver models are also needed in driving simulation to generate the surrounding traffic, which
should match real-world traffic behavior as closely as possible. Driving simulation has become
a central and indispensable tool for research and development in the transportation sector.
In addition, global trends such as globalization, sustainability, and the rising demand for
mobility increase the need for research, particularly in the urban context [3, 4]. Understanding
and modeling human driving behavior in urban environments can therefore be regarded as a
research field of growing importance for future mobility.
This growing importance contrasts with a currently incomplete state of research, since most
work either concentrates on simple highway scenarios or presents approaches for solving
isolated scenarios. Accordingly, current approaches are not suited to the variety and
complexity of the situations that occur in urban traffic. The aim of this thesis is therefore to
develop transferable and practicable methods for modeling human-like driving behavior in
urban environments.
To address this scientific gap, a two-stage approach is pursued: first, a detailed analysis of
the interdisciplinary topic is carried out to identify the core problems of modern solutions,
which are then addressed with novel methods in the second part. In keeping with the
interdisciplinary nature of the topic, a wide range of research areas is examined, such as
psychology, robotics, driving simulation, and autonomous driving. On this basis, clear model
requirements are identified and the following four core challenges are derived, which are
addressed with innovative methods and critically discussed in the methods part:
representation of environment information to form situational understanding in models,
development and investigation of generalizable prediction models, dynamic decision-making
for situation-adaptive behavior, and meaningful evaluation strategies for assessing the human
likeness of model behavior. A comprehensive discussion of the results, limitations, and an
outlook on further research topics conclude the thesis.
Abstract
The demand for intelligent driver models capable of handling complex traffic situations in a
way that resembles human behavior arises from various application areas. In the context of
Autonomous Driving, driver models replace human drivers, with the aim of providing safe and
flexible mobility solutions. It is believed that autonomous vehicles exhibiting human-like
behavior have the potential to enhance the safety of traffic interactions and are better accepted
by users [1, 2]. In the field of Driving Simulation, driver models are required to generate
surrounding traffic within the Virtual Environment to provide a realistic replication of
real-world traffic scenarios. Driving Simulation has become a central and indispensable tool
for research and development in the transportation sector. Moreover, global trends such as
globalization, sustainability, and increased demand for mobility contribute to a growing need
for research, especially in the context of urban traffic scenarios [3, 4]. Therefore, modeling
and understanding human driving behavior in urban environments is becoming increasingly
necessary for the future of mobility.
Meanwhile, current research is incomplete, as most publications either focus on simpler
highway traffic or propose approaches that solve only isolated scenarios or parts of the
driving task. As a result, current solutions are not suitable for the diversity and complexity
of urban traffic. Therefore, the objective of this thesis is to develop transferable, practicable,
and reliable methods for modeling human-like driving behavior in urban environments.
To address this scientific gap, a twofold approach is taken. First, a detailed analysis of the
topic in its interdisciplinary nature is conducted to identify the fundamental problems of
modern solutions, which are subsequently addressed with novel methods in the second part of
the thesis.
To this end, the topic is explored from the perspective of various research areas, including
psychology, robotics, Driving Simulation, and Autonomous Driving. Based on this
multi-dimensional analysis, key challenges in state-of-the-art solutions and clear requirements
for modeling human-like driving behavior are determined. The following four key challenges
are identified as preventing successful modeling of human-like driving behavior in urban
traffic: representation of complex traffic situations to enable situational understanding,
creation and evaluation of generalizable prediction models to anticipate future scene
developments, dynamic decision-making to enable situational behavior adaptation, and
meaningful evaluation strategies capable of assessing human-like model behavior. Novel
methods are presented to address these four main challenges, and the results are critically
discussed. A comprehensive discussion of the results, limitations, and an outlook for further
research conclude the thesis.
Acknowledgements
I would like to thank my supervisor at the Technical University of Berlin, Prof. Marker, for her
consistent support and guidance throughout this project. I also want to thank my supervisors
at BMW, Thomas Bleher and Dr. Mohammad Bahram, for supporting and inspiring me to
think outside the box, thus enabling me to generate novelty in this densely covered research
area. Besides my supervisors, I would like to thank BMW and my incredible team there for
their collaborative efforts and personal support. In particular, the collaboration with the other
PhD students in my team (Chantal H., Maurice K., and Robert J.) enriched my research time.
Finally, a special thanks to my family and friends for their endless emotional support, without
which I would not have been able to complete this research. In particular, the constant support
in the form of coworking, feedback, and encouragement from my friends Christoph B., Claudia
B., Judith P., Cosima V., and Deniz A. got me through this time.
Table of Contents
Title Page
Zusammenfassung
Abstract
List of Figures
List of Tables
Abbreviations
1 Introduction
1.1 Motivation
1.2 Objectives and Focus of the Thesis
1.3 Structure of the Thesis
PART I: Identification of the Scientific Gap and Related Research Questions
2 State-Of-The-Art
2.1 The Interdisciplinary Nature of Driver Models
2.2 Driver Models in Driving Simulation
2.2.1 Modeling Traffic in Driving Simulation
2.2.2 Categories of Traffic Simulation
2.2.3 Agent-Based Modeling
2.2.4 Agent Models in Driving Simulation
2.2.5 Modeling Vehicle Driver Behavior by Agent Models in Driving Simulation
2.2.6 The Driver Agent at BMW: TRM
2.3 Driver Models for Automated Vehicles
2.4 Further Fields of Research
2.4.1 Psychology
2.4.2 Cognitive Modeling
2.4.3 Robotics
2.4.4 Imitating and Replicating Human Behavior
2.5 Summary
3 Requirements for Modeling Human-Like Driving Behavior
3.1 Characteristics and Main Challenges of Urban Traffic
3.2 Perspectives on Defining and Quantifying Human-Like Driving Behavior
3.2.1 Objective Approaches to Quantify Human Likeness
3.2.2 Subjective Approaches to Quantify Human Likeness
3.2.3 Potentials and Limitations of Objective and Subjective Quantification Methods
3.3 Requirements From a Driver-In-The-Loop Perspective
3.3.1 What It Takes to Be Perceived as Real - The Human Driver in the Simulator as a Mirror
3.3.2 What It Takes to Cope With Urban Traffic - The Human Driver as a Role Model
3.3.3 Use Cases and Technical Requirements Originating From Driving Simulation
3.3.4 Summary of Requirements for Human-Like Driver Agents in Urban Driving Simulation
3.4 Methodological Gaps in State-Of-The-Art Solutions
3.4.1 Insufficient Representation of Context Information
3.4.2 Missing Anticipation in Agent Models
3.4.3 Missing Dynamics in Decision-Making of Driver Agents
3.4.4 Insufficient Evaluation Strategies and Misleading Metrics
PART II: Methods to Address the Identified Research Gaps
4 Representation of Complex Traffic Situations
4.1 Introduction and Motivation
4.2 State-Of-The-Art: Environment Representation
4.2.1 Static Environment Representation
4.2.2 OpenDRIVE Maps
4.2.3 Dynamic Environment Representation
4.3 Method: Dynamic Scene Representation Obtained by Prior Knowledge and Interpretation
4.3.1 The General Scene Description
4.3.2 The Data Processing Concept
4.4 Implementation
4.4.1 Implementation Step 1: Data Fusion
4.4.2 Implementation Step 2: Plausibility Check
4.4.3 Implementation Step 3: Interaction Identification
4.5 Results
4.6 Limitations and Future Work
4.7 Conclusion
5 Anticipating the Intention of Other Road Users
5.1 Introduction and Motivation
5.2 State-Of-The-Art: Anticipation of Vehicle’s Intentions
5.2.1 Prediction Methods
5.2.2 Behavioral Discretization
5.2.3 Influence Factors and Model Structures
5.2.4 Model Evaluation Strategies
5.3 Prediction Concept, Problem Formulation, and Data Preparation
5.3.1 General Concept of Intention Prediction
5.3.2 Problem Formulation and Model Architecture
5.3.3 Data Preprocessing
5.4 Method 1: Measuring the Effectiveness of the Scene Representation by Prediction Performance
5.4.1 Motivation
5.4.2 Concept
5.4.3 Implementation Details
5.4.4 Results
5.4.5 Limitations and Future Work
5.4.6 Conclusion
5.5 Method 2: Reliable Trajectory Prediction in Complex Urban Traffic
5.5.1 Motivation
5.5.2 Concept
5.5.3 Implementation Details
5.5.4 Results
5.5.4.1 Influence of Homogeneity in Training Data and Coverage of Scene Information
5.5.4.2 Influence of Individual Feature Categories
5.5.4.3 Impact of Tuning Parameters
5.5.4.4 Measuring Generalizability and Plausibility
5.5.4.5 Qualitative Prediction Results of Different Model Settings
5.5.5 Limitations and Future Work
5.5.6 Conclusion
5.6 Summary on Anticipation Methods
6 Dynamic Decision-Making
6.1 Introduction and Motivation
6.2 State-Of-The-Art: Decision-Making
6.2.1 Decision-Making Strategies for Agent Models in Simulation
6.2.2 Trajectory Planning Approaches for Automated Vehicles
6.2.2.1 Search-Based Methods
6.2.2.2 Sampling-Based Methods
6.2.2.3 Optimization-Based Methods
6.2.2.4 Decomposition Strategies
6.2.3 Potentials and Weaknesses of Heuristic Decision-Making and Trajectory Planning
6.3 Method: Dynamic Decision-Making for Agent Models
6.3.1 Concept for Dynamic Decision-Making
6.3.2 The Planning Framework
6.3.2.1 Planner Variant: PL_PVD
6.3.2.2 Planner Variant: PL_3D
6.3.2.3 Trajectory Smoothing
6.3.3 Parametrization
6.3.4 Evaluating Strategy
6.3.5 Implementation Details
6.4 Results
6.4.1 Potential of Trajectory Planning Versus Heuristic Approaches for Decision-Making
6.4.2 Sensitivity Towards Different Parameter Sets and Scenario Variations
6.5 Limitations and Future Work
6.6 Conclusion
7 Evaluation Methods
7.1 Introduction and Motivation
7.2 State-Of-The-Art: Evaluation Approaches for Human-Like Driver Models
7.3 Method 1: Objectively Evaluating the Human Likeness by Introducing a Plausibility Metric
7.3.1 Concept
7.3.1.1 Parameter Specification
7.3.1.2 Identification of Context-Based Similarity of Situations
7.3.1.3 Preparing the Database
7.3.1.4 Quality Function Formulation
7.3.1.5 Strategy for Validating the Method
7.3.2 Implementation
7.3.2.1 Used Databases and the Driver Model
7.3.2.2 Metric Formulation and Thresholds
7.3.2.3 Survey for Validation
7.3.3 Results
7.3.3.1 Objective Metric Results Versus Subjective Human Ratings
7.3.3.2 The Human Likeness of Investigated Datasets
7.3.3.3 Case Study 1: Exemplary Application of the Method for Investigating the Heuristic Model at BMW (TRM)
7.3.3.4 Case Study 2: Applying the Evaluation Metric to the Proposed Trajectory Planners
7.3.4 Limitations and Future Work
7.3.5 Conclusion
7.4 Method 2: Quantifying Realistic Behavior of Traffic Agents in Urban Driving Simulation Based on Questionnaires
7.4.1 Concept
7.4.2 Material and Setting of the Simulator Experiment
7.4.2.1 Study Design
7.4.2.2 Questionnaires
7.4.2.3 Simulator Setting
7.4.2.4 Participants
7.4.3 Results
7.4.3.1 The Effect of Surrounding Traffic on Perceived Realism
7.4.3.2 Realism of Traffic Agent Behavior
7.4.4 Limitations and Future Work
7.4.5 Conclusion
7.5 Summary on Evaluation Methods
8 Summary, Discussion and Outlook
8.1 Summary of Results
8.2 Limitations and Discussion
8.3 Final Conclusion
8.4 Summarized Outlook
References
Appendix A Appendix
A.1 Additional Material for the Context Representation Method
A.2 Additional Material for the Dynamic Decision-making Method
A.2.1 Scenario A_VEH Solved by the Two Planners and the Heuristic Agent Model
A.2.2 Scenario B_PED Solved by the Two Planners and the Heuristic Agent Model
A.2.3 Scenario C_STAT Solved by the Two Planners and the Heuristic Agent Model
A.2.4 Scenario D_BIC Solved by the Two Planners and the Heuristic Agent Model
A.2.5 Scenario D1 Solved by the PL_3D Planner
A.3 Additional Material for the Simulator Experiment
List of Figures
1.1 Structure of the thesis following a two-fold approach.
2.1 Proprietary-designed generic driver model that enables the categorization and comparison of state-of-the-art solutions for different subsets describing the driving task.
2.2 Structure of the driver agent TRM developed by BMW.
2.3 Overview of the interrelationships AI - ML - DL [70].
3.1 Evaluation of model performance based on the similarity of a predicted trajectory compared to the individual human-driven trajectory.
3.2 Key scenarios representing the main challenges of urban traffic for driver agents for simulation. Ego vehicle in yellow experiencing the different situations.
3.3 Overview of identified components required for modeling human-like driving behavior in urban traffic.
4.1 OpenDRIVE road network description including reference line, lane objects and lane features [149].
4.2 OpenDRIVE Lanes: types, neighbor relations, and driving direction by lane ID [150].
4.3 Examples of lanes and lane geometry representations in Spider.
4.4 Illustrating the idea of scenario mapping by the general scene description. The ego vehicle (ID 1, yellow) interacts with different interaction partners once in a typical turning maneuver and once in an exceptional situation resulting from an occupied lane.
4.5 Example of undefined areas within intersections. Recording location Heckstrasse inD dataset [155].
4.6 Example of ambiguous (left) and logically incorrect (right) lane assignments.
4.7 Real world examples causing implausible lane assignments.
4.8 The concept for identifying potential interactions with VRUs. The yellow vehicle represents the ego vehicle facing a scene with two pedestrians and a cyclist. Identification based on the relevant ego lanes (red), the motion polygons (blue), and the Euclidean distance e.
4.9 Illustration of different interactive situations of the inD dataset at the four different locations. The ego vehicle is always drawn in red. The solid lines show the path driven so far, the dotted lines represent the future path, and the circles mark the current position.
5.1 Structure of the base NN used for the two anticipation methods to investigate the ability to generate generalizable predictions.
5.2 Concept for measuring the effectiveness of the proposed scene representation by training the NN with different feature compositions.
5.3 Quantitative results in unseen situations (Level 1). Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and predictions of the model trained on E in blue. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.
5.4 Quantitative results in unseen situations on unknown intersection (Level 2). Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and predictions of the model trained on E in blue. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.
5.5 The evaluation concept involving different test levels and metrics to quantify generalizability.
5.6 Illustration of the data concept including three levels of homogeneity of training data (T1 - T3) on the left-hand side and different test data (L2 - L3) on the right-hand side. Real traffic examples are provided by the inD dataset [155], while synthetic driving data is obtained by simulation. As an additional test case, an exceptional scenario is designed using the simulation framework and a human driver (L3).
5.7 Influence of variability in training data on model performance. Left: accuracy measured by ADE and FDE on different test levels with best feature setting of training data category. Right: scores for accuracy, plausibility, and overall for different training data.
5.8 Influence of different feature settings on model accuracy considering homogeneity level T3. Left: Effect measured on test level L2a. Right: Effect measured on test level L3.
5.9 Influence of interaction features on temporal plausibility (left) and influence of map features on spatial plausibility (right).
5.10 Quantitative results on unseen intersections (isec 4 - L2b top, Heckstrasse - L2a bottom). Ego GT trajectory marked in orange. Predictions of models trained on EMPI features with different training datasets (T1, T2, T3). GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.
5.11 Qualitative evaluation of predictions from different feature settings for models trained on T3 performing on unseen intersections (first row: isec 4 - L2b, second row: Heckstrasse - L2a). Ego GT trajectory marked in orange. Predictions of models trained on T3 with different feature settings: EMPI, EMI, and E. GT and predictions represent positions in the next 5 seconds. For all trajectories: driven path: solid, future path: dotted.
6.1 The hierarchical planning framework realized as a two-layer concept for first planning a rough discretized first-guess trajectory (behavioral layer), which is subsequently smoothed into a dynamically feasible trajectory by the motion planning layer.
6.2 Illustration of the velocity profile generation using a hybrid A* algorithm applied in the s/t space adopted from [66].
6.3 The APF formed by the position of the obstacle and velocity and position of the ego vehicle [213].
6.4 Example visualization of the APF for a dynamic (right) and static (left) obstacle. Ego vehicle: green bounding box; other traffic participants: blue.
6.5 Circle method for collision-free motion planning inspired by [66].
6.6 The four key scenarios for evaluation.
6.7 Behavior of the heuristic agent model and the two planners in scenario A_VEH at the same time step. Ego vehicle in green.
6.8 Behavior of the heuristic agent model and the two planners in scenario C_STAT at the same time step. Ego vehicle in green.
6.9 PL_3D driving far into oncoming lane in scenario C_STAT and D_BIC. Ego vehicle in green.
6.10 Behavior of the heuristic agent model and the two planners in scenario D_BIC at the same time step. Ego vehicle in green.
6.11 Profiles showing velocity, acceleration and jerk of the PL_PVD planner in Scenario D_BIC for the final trajectory.
6.12 Behavior of the PL_3D planner in Variant D2 best quality parameter set (top) versus best runtime parameter set (bottom).
6.13 Velocity, acceleration and jerk of the PL_3D planner in Variant D2 best quality parameter set (left) versus best runtime parameter set (right).
7.1 The general idea of evaluating model behavior within situational context.
7.2 Illustration of the concept toolchain including metric development and validation strategy of the method.
7.3 Illustration of the calculation concepts for interaction-related parameters.
7.4 Two exemplary situations at the Bendplatz location (recording 12) were identified as similar based on the contextual parameters in Table 7.2. The ego vehicle is marked in red, the cyclist in blue and the other vehicle in green. The trajectory already traveled is marked as a solid line.
7.5 Locations for data gathering - Left: synthetic intersections for creating artificial driving behavior. Right: Locations from inD Data [155].
7.6 Exemplary screenshots of videos shown to participants for rating human likeness of a subject vehicle marked in red.
7.7 Subjective human likeness rating obtained from participants (y-axis) associated with objective human likeness scores obtained by the proposed metric (blue value above). Vehicles are sorted by the average rating assigned by participants, in descending order from left to right.
7.8 Relationship between the fine-tuned objective human-like driving behavior scores provided by the proposed methodology and subjective ratings of participants during the survey.
7.9 Results for human-like scores for real and synthetic datasets: with tuned thresholds and weights (left) and initial setting (right).
7.10 Analysis to identify which parameters mostly fail when comparing artificial and human behavior.
7.11 Exemplary trajectory of two vehicles showing significantly high jerk values when approaching the intersection.
7.12 Distribution of lonVelocity for the two planners and the TRM model outside and in the intersection area.
7.13 Exemplary illustration of the mismatch of driven trajectories along the defined intersection lanes. Trajectory in red and lane polygon in cyan.
7.14 Traffic scenarios experienced by participants during the simulator experiment. The ego vehicle is marked with a red triangle.
7.15 The driving simulator used in the present experiment.
7.16 The influence of surrounding traffic on the perceived realism and the naturalism of driving style, whereby 7 represents a high sense of presence (*: p < 0.05, **: p < 0.01).
7.17 Assessment of the used agent models regarding key requirements for realistic behavior, whereby 7 represents a positive rating of, for example, high individuality.
A.1 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario A_VEH.
A.2 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario A_VEH.
A.3 Behavior of the heuristic agent model and the two planners in scenario B_PED at the same time step. Ego vehicle in green.
A.4 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario B_PED.
A.5 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario B_PED.
A.6 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario C_STAT.
A.7 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario C_STAT.
A.8 Velocity, acceleration, and jerk profile for the first guess and final trajectory of the two planners in scenario D_BIC.
A.9 Velocity, acceleration, and jerk profile of the heuristic agent model in scenario D_BIC.
A.10 Behavior of the PL_3D planner employing different parameter sets in variation D1 of scenario D_BIC.
A.11 Velocity, acceleration, and jerk profile for the first guess and final trajectory of PL_3D planner employing best quality and best runtime parameter sets.
A.12 Velocity, acceleration, and jerk profile for the first guess and final trajectory of PL_3D planner employing local and global parameter sets.
List of Tables
2.1 Summary of the most frequently used input information for driver agents in simulation.
2.2 Summary of the most commonly used input information for autonomous vehicles that is used for prediction or planning purposes.
3.1 Potentials and Limitations of different perspectives to quantify human-like driving behavior.
4.1 Feature information defined according to the general scene description.
4.2 Semantic features describing scenario a and b shown in Figures 4.9a and 4.9b.
5.1 Summary and categorization of the state-of-the-art approaches to vehicle motion prediction.
5.2 Summary of state-of-the-art metrics for trajectory prediction.
5.3 Results for measuring the effectiveness of the proposed scene representation.
5.4 Tuning parameter sets for investigating the effect of hyperparameter tuning [195].
5.5 Results for all model variants across different training data and different feature settings at all test levels. The first five columns provide results summarized across all test levels. The remaining columns show the results for each individual test level. The best results within the training dataset (T1, T2, T3) are highlighted in bold, the best results per column, i.e. per test level, are highlighted in bold and underlined.
5.6 Analysis of spatial and temporal model performance measured in [%]. Best result for SO_temp and SO_spa per homogeneity level (T1, T2, T3) highlighted in bold, best overall results highlighted in bold and underlined.
5.7 Results for different learning parameters on all test levels according to Table 5.4. Best result per column highlighted in bold.
6.1 The ability of the two planners and the heuristic agent model to solve the defined key scenarios from a functional perspective.
6.2 Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent models (TRM) in scenario A_VEH and B_PED.
6.3 Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent models (TRM) in scenario C_STAT and D_BIC.
6.4 Variations D1 and D2 for investigating changes in parametrization and scenario for the cyclist scenario D_BIC using the PL_3D planner.
7.1 Overview of parameters for describing human likeness and plausibility of behavior.
7.2 Overview of context parameters to distinguish scenarios, sorted by priority order (P), to extract similar traffic situations from the comparison database.
7.3 Thresholds for measuring human likeness for different parameters extracted from the real traffic dataset. Distributions of velocities, accelerations, and jerk are compared using KS statistics. Percent ratios are assigned for maximum values and raw thresholds are assigned for context-free parameters. Parameters marked with * are calculated context-free.
7.4 Spearman correlation between the parameters and average human-like driving behavior score from the survey, with correlations converted into weights.
7.5 Analysis of the value distribution for dynamic parameters for model behavior and real humans on a macroscopic level (left) and scenario-specific under situational conditions (right).
7.6 Review of all human likeness parameters that constitute the quality function calculated for the two trajectory planners PL_3D, PL_PVD, and the heuristic model TRM. *No TET value could be calculated for the scenario since no TTC below the critical threshold of 6s was detected.
7.7 Distribution of the meta data of the participants.
A.1 All features describing a driving scene at time t from the perspective of an individual ego vehicle on the basis of features from the inD dataset [155].
A.2 Questionnaire 1: Investigating the sense of presence. Presence (PRE) items according to M. Slater and A. Steed [239] and one additional item investigating the degree of naturalism of the driving style.
A.3 Questionnaire 2: Evaluating the degree of realism of the traffic agents.
Abbreviations
ABM Agent-Based Modelling
ACC Adaptive Cruise Control
ACT-R Adaptive Control of Thought-Rational
AD Autonomous Driving
ADE Average Displacement Error
AI Artificial Intelligence
APF Artificial Potential Field
ASAM Association for Standardization of Automation and Measuring Systems
AV Automated Vehicle
CNN Convolutional Neural Network
DAC Driveable Area Compliance
DAG Directed Acyclic Graph
DDM Dynamic Decision-Making
DiL Driver-in-the-Loop
DL Deep Learning
DOG Dynamic Occupancy Grid
DS Driving Simulation
FDE Final Displacement Error
FOT Field Operational Test
GA Genetic Algorithm
GAIL Generative Adversarial Imitation Learning
GAN Generative Adversarial Network
GCN Graph Convolutional Network
GNN Graph Neural Network
GT Ground Truth
HD High-Definition
HMI Human-Machine-Interface
HOR Hard Off-road Rate
IDM Intelligent Driver Model
IP Interior Point
IPQ Igroup Presence Questionnaire
KS Kolmogorov-Smirnov
LIDAR Light Detection and Ranging
LSTM Long Short-Term Memory
MAE Mean Absolute Error
MAS Multi-Agent-System
ML Machine Learning
MLP Multi-layer Perceptron
MOBIL Minimizing Overall Braking Induced by Lane Changes
MPC Model Predictive Control
MR Miss Rate
MSE Mean Squared Error
NDS Navigation Data Standard
NN Neural Network
ODM OpenDRIVE Map
OSM OpenStreetMap
PET Post Encroachment Time
PID Proportional-Integral-Derivative Controller
PIP Potential Interaction Partners
POC Proof of concept
PVD Path-Velocity-Decomposition
QP Quadratic Programming
RL Reinforcement Learning
RMSE Root Mean Squared Error
RNN Recurrent Neural Network
ROW right-of-way
RP Reference Point
RRT Rapidly-exploring Random Trees
SA Situation Awareness
SiL Software-in-the-Loop
SOR Soft Off-road Rate
SQP Sequential Quadratic Programming
SUS Slater-Usoh-Steed
TCC Temporal Correlation Coefficient
TET Time Exposed Time-to-Collision
TN Transformer Network
TTC Time to Collision
TTO Time-to-overlap
VAE Variational Autoencoder
VE Virtual Environment
VRU Vulnerable Road User
XML Extensible Markup Language
1 Introduction
1.1 Motivation
Global developments show an increasing significance of and demand for mobility [5]. Urban research
is particularly important due to factors like urbanization and economic growth, which introduce
new complexities to the investigation and advancement of vehicle and mobility concepts [4].
A key challenge in this context is comprehending, replicating, and modeling human driving
behavior. This challenge engages a diverse array of research disciplines, including psychology,
computer science, and engineering, as models are employed for various applications, such as
AD (Autonomous Driving) or DS (Driving Simulation).
Within the realm of future mobility concepts, AD is a major area of study, striving to
comprehend and model driving behavior to take on the responsibility of the human driver [3, 6,
4]. The significance of mimicking human-like driving behavior for an AV (Automated Vehicle)
is justified by the expectation that it improves interactions between road users and enhances
the comfort and acceptance of customers [1, 2].
DS represents the complementary research domain, providing an indispensable tool in
today's development processes to investigate various aspects under safe and reproducible conditions.
Simulation allows testing components in early development stages and conducting human-centered
investigations, focusing, for instance, on driver perception, distraction, or behavior in
critical situations [7]. DS can be categorized into DiL (Driver-in-the-Loop) and SiL
(Software-in-the-Loop) applications, investigating the ability of models and functions to interact
with the driver or the environment. In both loops, creating a realistic replica of the real world within
the VE (Virtual Environment) is crucial to establish valid test conditions [8]. Surrounding
traffic, as a crucial part of the VE, is typically simulated in DS using agent models, which are
expected to behave as closely as possible to real road users.
Meanwhile, global trends in globalization and mobility accentuate the importance of urban
scenarios in mobility-related research [5]. Urban traffic is characterized by intricate interactions
among multiple road users and the adherence to complex traffic regulations [3], necessitating
awareness and perception of the situation as well as appropriate adaptation of behavior [9].
Especially in urban scenarios, the advantages of DS exceed those of real driving tests regarding
maintaining safety and reproducibility. Due to the increased complexity of urban environments,
new challenges arise for the development of driver models. Given the complexity of human
behavior and the characteristics of urban traffic, driver models that are able to cope with the
diversity of urban traffic in a human-like manner pose a compelling scientific domain with as yet
unresolved facets. Current solutions fall short for one of two reasons: either they do not consider
the subject holistically and thus address only isolated situations, or they originate from the
highway context and are not adequately transferable to the prerequisites of urban
traffic. Furthermore, the subject is highly interdisciplinary, combining research areas such
as psychology, robotics, and computer science. Existing research does not sufficiently
combine the findings and methods from these areas to meet the complexity of the topic. In
particular, the ability to exhibit human-like driving behavior in various interactive traffic
scenarios requires a high level of dynamics and intelligence within the model. Therefore, the
present thesis addresses the challenge of modeling human-like driving behavior for urban traffic
with a special focus on DiL applications in DS.
1.2 Objectives and Focus of the Thesis
The primary aim of this thesis is to address the challenge of accurately modeling human-like
driving behavior within urban environments. Given the absence of a universally accepted
definition of what constitutes human-like behavior in this context, this thesis strives to
establish a more precise and pragmatic definition of human likeness, accompanied by the
essential requirements for driver models, while maintaining the interdisciplinary nature of the
topic.
Furthermore, a notable disparity becomes evident upon examining the substantial volume
of research publications related to driver models in contrast to the limited number of
practical implementations on real roads. Hence, a further principal objective of this thesis is
to pinpoint methodological shortcomings and root causes hindering the progress of driver
model development for urban traffic and to develop innovative and transferable
approaches addressing the identified methodological gaps.
As the realm of human driving behavior modeling encompasses a wide spectrum of possibilities,
this thesis strategically narrows its research focus for clarity and precision. To begin with,
all methods and analyses within this thesis will be exclusively conducted within the context
of DS with a special focus on DiL applications. Nevertheless, it is essential to note that the
methodologies developed herein are intended to have broader applications in related research
domains, notably in the realm of AD. This is due to the inherent interconnectedness of the
research questions and challenges of DS and AD.
This thesis places a distinct focus on urban traffic scenarios. Despite several studies addressing
the so-called highway problem, urban traffic remains relatively underexplored in scientific
research. This is primarily attributed to the complexity of urban traffic, which creates
a significant gap in multiple dimensions encompassing modeling and evaluation purposes.
Specifically, the thesis focuses on interactive situations, particularly those involving shared
space conflicts introduced by ROW (right-of-way) regulations.
Since this research focuses on DS, the input interface to models is provided at the object
level. Therefore, the processing of raw sensory inputs, such as images or point clouds, is not
considered. Consistent with typical implementations, the output of the models is specified as
the spatio-temporal motion of the vehicle, i.e., the trajectory it follows.
1.3 Structure of the Thesis
The present thesis follows a bipartite approach, as illustrated in Figure 1.1, to address the
multifaceted and interdisciplinary topic of modeling human-like driving behavior in a holistic
and differentiated manner.
In the initial part, a comprehensive state-of-the-art analysis offers an overview of key approaches
related to the modeling of surrounding traffic within DS and driver models in the context
of AVs. Furthermore, this part underscores the interdisciplinary nature of driver models,
revealing the diverse research domains involved in the quest for human-like driver models.
Building upon these insights and considering the practical application of DS, essential
requirements for human-like driver models are derived. This forms the basis for identifying
persistent challenges in modeling human-like driving behavior, resulting in the formulation of
four key research questions that are subsequently addressed in the methodological part of this
thesis.
The ensuing part comprises four chapters, each dedicated to tackling one of the aforementioned
research questions. These chapters encompass a comprehensive review of the state-of-the-art,
the presentation of innovative methodologies, and a detailed exploration of results, limitations,
and conclusions.
In the final summary, the insights derived from this thesis are consolidated in alignment with
the motivations and objectives previously outlined. Limitations are critically discussed, and
recommendations for future research efforts are proposed.
[Figure 1.1 content: INTRODUCTION; PART I: Identification of Main Challenges (State-Of-The-Art
Analysis; Requirements on Human-Like Driver Models; Methodological Gaps in State-Of-The-Art
Solutions); PART II: Methodologies and Concepts (Representation of Complex Situations;
Anticipating the Intention of Other Road Users; Dynamic Decision-Making; Meaningful Evaluation
Methods); SUMMARY]
Figure 1.1: Structure of the thesis following a two-fold approach.
2 State-Of-The-Art
The subject of human-like driver models is a multifaceted domain that encompasses various
research areas such as computer science, engineering, and psychology, as well as several
applications such as DS, AD, and robotics. Therefore, the following chapter provides a
comprehensive outline of the topic of human-like driver models. This review
underscores the interdisciplinary nature of the topic, serving as the basis for identifying scientific
gaps and formulating research questions that will be explored in this thesis.
2.1 The Interdisciplinary Nature of Driver Models
Modeling driver behavior is a broad field of research encompassing various methods and
perspectives. Significant parallels exist between the modeling of intelligent and autonomous
agents for DS and the advancement of AD for real-world transportation applications. In both
contexts, the primary objective is to achieve the independent navigation of a predetermined
route while adeptly managing the intricate dynamics of traffic in a safe and efficient
manner. The pursuit of emulating human problem-solving abilities also intersects with the
field of robotics, as it involves the artificial replication of human-like task performance. The
indispensable understanding of human behavior is at the core of modeling and replicating
the human facet of driving, which is addressed by psychology and cognitive science. While
psychological approaches aim to explain human driving behavior through models and experiments,
cognitive modeling aims to render cognitive processes while driving tangible by bridging the
gap between psychology and computer science. Therefore, the following sections present
state-of-the-art approaches related to driver models in simulation frameworks and AVs and
provide insights into methods originating from psychology, cognitive modeling, and robotics.
However, most published research only provides solutions for one subpart of the driving task,
such as finding a path, decision-making, or controlling motion. To handle this, a generic and
modular driver model is defined that allows the comparison and categorization of the relevant
approaches. This generic model is inspired by the fundamental model of Donges [10] and the
layer model proposed by Michon and Keskinen [11]. Figure 2.1 illustrates the model, which
follows a matrix structure, aiming to cluster all parts of the driving task into sub-modules
with respective interdependencies and to visualize the influences of driver-internal components,
such as knowledge, experience, or personality. From the perspective of human decision-making,
most state-of-the-art approaches employ a structure that follows the chain of sense-plan-act [12,
13, 14, 15, 16, 17, 18, 16]. Meanwhile, psychological approaches introduce various layer models
inspired by human information processing and action-taking, distinguishing between a strategic,
tactical, and control level [11, 10]. Since both perspectives provide valid categorizations, the
generic driver model proposed here aims to combine both in a matrix structure. In addition,
the sense-plan-act structure is extended by an interpretation layer, whose task is to
link different sources of information and anticipate future developments.
[Figure 2.1 content: matrix of the levels (strategic, tactical, control) against the processing
steps (sense, interpret, plan, act), influenced by the driver's internal state (personality,
physiological and psychological state, experience, knowledge). Cell labels recoverable from the
figure: map and target point; identify drivable areas; set route/path; detect dynamic
environment; associate and anticipate; plan action, maneuver, or rough motion; choose maneuver
from set by heuristic criteria; plan motion (heuristic or optimization-based); maneuver
performance; trajectory following; explicit communication (indicator) and implicit
communication (maneuver).]
Figure 2.1: Proprietary-designed generic driver model that enables the categorization and
comparison of state-of-the-art solutions for different subsets describing the driving task.
2.2 Driver Models in Driving Simulation
The following section provides an overview of the main concepts, important definitions, and
the current state-of-the-art driver models used in DS.
2.2.1 Modeling Traffic in Driving Simulation
Integrating surrounding traffic into DS can be achieved through two distinct approaches:
replication or simulation. One way to infuse realistic traffic into the simulation involves
replicating scenarios using real traffic data. This approach entails replaying human driving
patterns within the simulation. However, in this replication process, the responsiveness of other
traffic participants to changing conditions is eliminated. Hence, the predominant strategies for
incorporating surrounding traffic into the VE revolve mainly around modeling and simulating
other road users.
2.2.2 Categories of Traffic Simulation
Traffic simulation can be classified into three primary categories: macroscopic, mesoscopic, and
microscopic simulation [19, 20, 17]. These categories vary in terms of their level of detail and
the specific phenomena that are observed. Macroscopic traffic simulation concerns the broader
traffic situation rather than individual road users. For instance, it examines the emergence of
traffic congestion and how alterations in road infrastructure impact traffic flow [21, 22, 23, 20].
Conversely, microscopic simulation focuses on modeling the interactions among road users
and their individual behavior [19, 20]. Mesoscopic simulation serves as an intermediary level,
bridging the gap between microscopic and macroscopic approaches to optimize computational
resources [17]. Furthermore, even more refined simulation levels exist, such as nanoscopic or
sub-microscopic simulations, which delve into the individual spatio-temporal behaviors of road
users [24]. Typical use cases for DiL applications require microscopic simulation to assess the
precise interactions unfolding between distinct road users and the ego vehicle. Microscopic
simulation tools are particularly apt for urban settings because they enable vehicles to respond
individually to the intricacies of road characteristics and traffic dynamics [25].
2.2.3 Agent-Based Modeling
Real traffic constitutes a dynamic system comprising autonomous individual entities, here
humans, that make independent decisions and mutually influence each other. To effectively
model such intricate systems, ABM (Agent-Based Modelling) is employed, enabling the
representation of the multifaceted patterns that emerge through interactive dynamics. This
approach facilitates the replication of interactive phenomena and the emergence of unforeseen
behaviors [26]. ABM finds typical applications in diverse fields such as traffic control, ecological
relationships, epidemic modeling, market analysis, and organizational decision-making [27].
For DS, each traffic participant is therefore modeled by an individual agent model. In
traffic simulation, the scope of agents extends beyond road users to encompass entities like
street agents, intersection agents, and traffic light agents [28]. In general, an agent model is
characterized by the following facets [27]:
• acts autonomously and independently and makes its own decisions
• follows personal rules and has a personal and explicit goal
• interacts with its environment
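The three facets listed above can be illustrated with a minimal sketch. It is not taken from any
of the cited frameworks: the class DriverAgent, the function simulate, the single-lane setting,
and all parameter values are hypothetical and serve only to show autonomous decisions, a personal
rule with an explicit goal, and interaction with other agents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DriverAgent:
    """Hypothetical single-lane agent; all names and values are illustrative."""
    name: str
    position: float          # longitudinal position along the lane [m]
    goal_position: float     # explicit personal goal [m]
    desired_velocity: float  # personal parameter [m/s]
    safe_gap: float = 10.0   # personal rule: keep at least this gap [m]
    velocity: float = 0.0    # current velocity [m/s]

    def perceive(self, agents: List["DriverAgent"]) -> Optional["DriverAgent"]:
        # Interaction with the environment: find the nearest agent ahead.
        ahead = [a for a in agents if a is not self and a.position > self.position]
        return min(ahead, key=lambda a: a.position) if ahead else None

    def decide(self, leader: Optional["DriverAgent"]) -> float:
        # Autonomous decision based on the personal rule and goal.
        if self.position >= self.goal_position:
            return 0.0  # goal reached, stop
        if leader is not None and leader.position - self.position < self.safe_gap:
            return min(self.desired_velocity, leader.velocity)  # follow the leader
        return self.desired_velocity  # drive freely


def simulate(agents: List[DriverAgent], steps: int, dt: float = 0.1) -> None:
    for _ in range(steps):
        # All agents decide on the same world state, then the state is updated.
        decisions = [a.decide(a.perceive(agents)) for a in agents]
        for agent, velocity in zip(agents, decisions):
            agent.velocity = velocity
            agent.position += velocity * dt
```

In this sketch, a faster follower that closes in on a slower leader adopts the leader's velocity
once the gap falls below its personal safe gap, so the following behavior emerges from the
interaction of two independent agents rather than from a global controller.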
2.2.4 Agent Models in Driving Simulation
State-of-the-art simulation frameworks employ ABM to create the appearance of authentic
patterns through independent agents, whose actions are guided by environmental inputs [29,
19, 11]. These agents are usually assigned a predetermined route and a set of initialization
parameters, such as driver-related parameters like the desired velocity or vehicle-related
parameters, for example, the maximum acceleration [19]. Throughout the simulation, traffic
agents autonomously follow their assigned routes while taking context-driven decisions such as
applying brakes for collision avoidance [30], adhering to following distances [19], or executing
lane changes [23]. These agent models aim to replicate the multifaceted driving tasks and
situations handled by human drivers. Given the inherent complexity and diversity of these
tasks, it is suggested that a complete driver model should integrate solutions from different
paradigms through a framework combining sub-modules [31, 32]. This holistic
approach involving multiple collaborative modules is referred to as a MAS (Multi-Agent-System),
which is proposed as the optimal approach for all driver models [31].
The strategies employed to model behavior within these agents can significantly differ in terms
of complexity. Prevalent multi-agent simulation frameworks are founded on distinct agent
types, such as vehicles, pedestrians, or cyclists, that maintain their designated roles throughout
the simulation [33, 11, 34, 35, 17]. More complex perspectives distinguish between humans
and their means of transportation by using a modularized architecture to describe humans
independently [17]. This allows, for example, a driver to behave more cautiously when driving
a bus compared to driving a passenger car given the same personality profile [17].
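The separation of the human from the means of transportation can be sketched as two independent
data structures that are only combined when a behavioral parameter is derived. The sketch below
is an assumption for illustration and does not reproduce the architecture of [17]: the classes
DriverProfile and Vehicle, the function desired_time_gap, the caution scale, and the scaling
factors are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverProfile:
    """Hypothetical personality profile, independent of the vehicle."""
    caution: float        # 0.0 = relaxed, 1.0 = very cautious (illustrative scale)
    base_time_gap: float  # preferred time gap in a passenger car [s]


@dataclass(frozen=True)
class Vehicle:
    """Hypothetical vehicle description, independent of the driver."""
    kind: str
    time_gap_factor: float  # illustrative scaling of the time gap per vehicle type


def desired_time_gap(driver: DriverProfile, vehicle: Vehicle) -> float:
    # The same personality yields different behavior depending on the vehicle.
    return driver.base_time_gap * (1.0 + driver.caution) * vehicle.time_gap_factor


driver = DriverProfile(caution=0.5, base_time_gap=1.2)
car = Vehicle(kind="passenger car", time_gap_factor=1.0)
bus = Vehicle(kind="bus", time_gap_factor=1.5)
```

With these illustrative values, the same driver keeps a larger time gap in the bus than in the
passenger car, although the personality profile is unchanged.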
2.2.5 Modeling Vehicle Driver Behavior by Agent Models in Driving Simulation
The most common strategies within DS employ a hierarchical framework to model driving behavior.
The constituents of these hierarchical structures and associated state-of-the-art methods are
discussed in the categories proposed by the generic driver model, shown in Figure 2.1, to allow
comparability to relevant approaches from AV development.
• Strategic Level:
Agent models usually receive a destination point and map information or a predefined
route as input. Depending on the degree of input, the desired route or a reference path
is generated in the first step. For further processing, knowledge is required to identify
driveable areas or preferred lanes. This strategic route information can be provided
in different formats, for example, in the form of way-points [36] or a list of contiguous
lanes [37]. Depending on the format, different strategies can be employed, such as an A*
algorithm to find a suitable route to reach the target [36].
• Tactical Level:
The tactical level of driver models in current simulation frameworks differs greatly in
complexity and strongly depends on the scenarios the model has to cope with and thus on the
respective aim of the research.
Most state-of-the-art agent models in DS follow heuristic and hierarchical structures
to decide from a set of possible actions [38], maneuvers [39, 13], or states [22] that
discretize the driving task. The decision-making process of current agent models is
mostly rule-based and aims to decide which maneuver or action to perform based on
environmental information. The decision-making varies in strategy, complexity, and
considered parameters. To provide some examples, the well-known open-source framework
SUMO includes a car-following, intersection, and lane-changing model [39]. Which of
these modules is active depends on topological circumstances and the dynamic movement
of road users. The decision of which module to take is solved hierarchically based on
heuristic rules. Hochstaedter et al. present an approach distinguishing between lateral
and longitudinal behavior [13]. Depending on the situation, different states of
a following and lane-changing model are assigned. The authors divide the following
task into uninfluenced driving, approaching, braking in emergency situations, and
car-following. The desired acceleration is calculated based on differential speed and relative
distance. The lane-change decision is further influenced by a contentedness parameter,
which triggers the wish for a lane change when, for example, the vehicle in front is driving
slower.
Some commercial simulation frameworks like cogniBit [40] or AAI [41] use more
sophisticated approaches to model human-like driving behavior, including learning-based
decision-making combined with planning algorithms. For those commercial approaches,
only very high-level information is available, without details of the individual solutions.
As a basis for decision-making, environmental information must be processed and
associated by the simulation framework or the agent model. Table 2.1 provides an
overview of the most commonly used information for driver agents in simulation associated
with exemplary references. Information is distinguished into three categories: ego
vehicle related information, dynamic environment information, and static environment
information. While ego vehicle related information characterizes the vehicle and the
driver, the dynamic environment information category describes the traffic scenario that
surrounds the ego vehicle. Please note that the ego vehicle does not refer to the driver or
the function under test in the simulation but refers to the individual perspective
of the vehicle or driver being modeled as an individual agent.
The represented information strongly depends on the addressed maneuvers (e.g., car-
following or lane changes) and the respective scenarios (e.g., multi-lane highways or
urban intersections). Especially in urban environments, the motion of others, such as
the distance to the vehicle driving in front, has to be considered for decision-making. It
is noteworthy that neighboring vehicles, as mentioned in Table 2.1, are determined by
different strategies, e.g., vehicles driving in the same lane [42], vehicles driving within a
certain Euclidean distance [43], or vehicles driving in adjacent lanes [13].
• Control Level:
A variety of motion models are available that can be assigned to the control level in order
to carry out a particular maneuver or action. In this case, a maneuver-specific model and
distinct parameter sets are used to compute spatio-temporal movement for the selected
maneuver. For example, practitioners frequently use models such as the IDM (Intelligent
Driver Model) [49, 50, 43] or the Wiedemann-Following model [13, 51] when simulating
a car-following maneuver to represent longitudinal behavior. Methods like the MOBIL
(Minimizing Overall Braking Induced by Lane Changes) model [52] or the established
lane change model by Hoel et al. [53] are frequently used to model lateral behavior,
notably lane-changing. It is possible to include parameters in the modeling process that
reflect the individuality of human behavior at both the tactical and control levels. These
parameters encompass elements like safe distances, gap acceptance thresholds, perceptual
sensitivity levels, or desired velocities, as summarized in Table 2.1. In addition, more
advanced approaches exist that interlink such parameters to create personality profiles
inspired by statistical distributions of behavioral parameters obtained from actual human
driving data [36, 54, 15, 17, 46].
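To make the control level more tangible, the following minimal sketch shows the commonly
published form of the IDM acceleration equation. It is not the implementation of any of the
cited frameworks. The names IDMParams and idm_acceleration are hypothetical, and the parameter
values are illustrative assumptions rather than values from this thesis. The driver-specific
parameters (desired velocity, time gap, minimum gap) correspond to the kind of individual
parameters mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IDMParams:
    """Illustrative parameter set; the values are assumptions, not calibrated."""
    desired_velocity: float = 13.9  # v0 [m/s], roughly 50 km/h
    time_gap: float = 1.5           # T [s], desired time gap to the leader
    min_gap: float = 2.0            # s0 [m], minimum standstill gap
    max_accel: float = 1.0          # a [m/s^2], maximum acceleration
    comfort_decel: float = 1.5      # b [m/s^2], comfortable deceleration
    exponent: float = 4.0           # delta, acceleration exponent


def idm_acceleration(v: float, gap: float, v_lead: float,
                     p: IDMParams = IDMParams()) -> float:
    """Longitudinal acceleration of a follower according to the IDM.

    v: own velocity [m/s], gap: bumper-to-bumper gap to the leader [m] (> 0),
    v_lead: velocity of the leader [m/s].
    """
    approach_rate = v - v_lead
    # Desired dynamic gap s*: standstill gap plus time-gap and braking terms.
    desired_gap = p.min_gap + max(
        0.0,
        v * p.time_gap
        + v * approach_rate / (2.0 * math.sqrt(p.max_accel * p.comfort_decel)),
    )
    free_road_term = 1.0 - (v / p.desired_velocity) ** p.exponent
    interaction_term = (desired_gap / gap) ** 2
    return p.max_accel * (free_road_term - interaction_term)
```

On a free road (very large gap), the acceleration approaches the maximum acceleration at
standstill and vanishes at the desired velocity. When closing in quickly on a slow leader at a
small gap, the interaction term dominates and the model brakes.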
Table 2.1: Summary of the most frequently used input information for driver agents in simulation.
Information                                                      References
Ego vehicle related information
  Ego vehicle motion                                             [20, 13]
  Ego vehicle dimensions                                         [37, 13]
  Desired following distance                                     [37, 42]
  Desired velocity                                               [44, 13, 42, 34, 43]
  Desired acceleration                                           [13, 34]
  Maximal velocity                                               [37, 45]
  Maximal acceleration                                           [45]
  Emotional state or contentedness                               [46, 13]
  Reaction time                                                  [34]
  Perception range or gaze behavior                              [43, 47, 48, 37]
  Risk tolerance factor                                          [43]
Dynamic environment information
  Distance to in-front driving vehicle                           [45, 20, 44]
  Distance of neighbour vehicles                                 [46, 13]
  Relative velocity of neighbour vehicles                        [46, 13, 34]
  Indicator and light status of neighbour vehicles               [47, 13]
  Time-gaps to neighbour vehicles                                [46, 34]
  Direction or lane information of in-front driving vehicle      [44]
  State (position, velocity, acceleration) of neighbour vehicles [46, 43, 47, 45]
Static environment information
  Lane information                                               [13]
  Traffic light status                                           [13, 34]
  Road topology                                                  [19, 34]
  Road markings, traffic signs                                   [47]
  Conflict areas                                                 [37]
2.2.6 The Driver Agent at BMW: TRM
The driver agent used as a representative agent model within this thesis is an in-house developed
model by BMW called TRM [55]. The driver agent employs a modular and heuristic structure
that aligns with the sense-plan-act architecture, drawing inspiration from human information
processing, and is illustrated in Figure 2.2. By means of different algorithms within the driver
agent and the simulation framework, a representation of the static and dynamic environment
is calculated. This includes, among other things, identifying driveable areas from the map, the
positioning on the respective lane, and the absolute and relative positions of other road users.
This environmental information is provided as input to the model, together with a set of
parameters and a predefined route, which is represented as a list of contiguous lane objects.
Essential information, such as recognized traffic signs, is stored in an internal memory during
runtime. The output of the model encompasses the driver’s usual actions, such as turn signals,
pedal actions, and steering control. Based on this, the spatio-temporal motion is calculated by a
vehicle dynamic model. The decision-making follows a hierarchical and heuristic approach
whereby the driving task is separated into various maneuvers, such as lane changes, overtaking,
stopping at obstacles, stopping at red lights, slowing down due to speed limits or curves, and
more. A maneuver is defined as an active driving decision lasting several seconds. In the first
step, possible maneuvers are evaluated based on the current situation, allowing high-level
decision-making for the driving strategy. At each time step, multiple maneuver evaluators
determine their need for action. Due to the parallel evaluation, several maneuvers may request
action at the same time, but only one can be active. Therefore, a decision module applies
rule-based criteria to prioritize these potential maneuvers and finally determines the appropriate
one. The actual motion of the driver agent is then calculated using the respective parameterized
motion models. For example, longitudinal motion is based on Wiedemann’s human-inspired
distance model [56]. For lateral movement, the active maneuver provides an allowed corridor
within which the driver may move. A path planner generates a smooth trajectory within this
corridor. To reduce complexity, lateral and longitudinal motion, as well as temporal and
spatial decisions, are kept separate as far as possible within the model. For example, when
turning at an intersection, the velocity is influenced by the distance to the front-driving
vehicle, which is treated as a following task, and by the curvature of the turning lane.
Additionally, a handling module is embedded to emulate human-like characteristics such as
reaction times. For personalized and individual driving behavior, driver-specific characteristics
such as minimal safety distances, the desired velocity, or speeding attitude can be configured
for each driver agent. Furthermore, dynamic parameters such as the contentedness in the current
situation influence evaluators, impacting decisions such as the timing of a desired lane change [57].
[Figure 2.2, block diagram. Labels: Environment & Route; Driver Memory; Maneuver Evaluators;
Decision Making; Path Planning; Handling; Vehicle Dynamic Model; Pedal Position, Steering Wheel
Angle, Indicator State; Combined Acceleration; Driver]
Figure 2.2: Structure of the driver agent TRM developed by BMW.
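The evaluate-then-prioritize pattern described for TRM can be illustrated with a small sketch. It is not the TRM implementation: the class names, the evaluator conditions, the thresholds, and the priority order are hypothetical and only show how parallel maneuver evaluators and a rule-based decision module might interact.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Situation:
    """Hypothetical snapshot of the inputs an evaluator might look at."""
    gap_to_leader_m: float
    red_light_distance_m: Optional[float]  # None if no red light is ahead
    contentedness: float                   # 0 (dissatisfied) .. 1 (content)


@dataclass(frozen=True)
class Evaluator:
    name: str
    priority: int                          # lower value wins (assumed rule)
    needs_action: Callable[[Situation], bool]


# Illustrative evaluators; all thresholds are assumptions of this sketch.
EVALUATORS = [
    Evaluator("stop_at_red_light", 0,
              lambda s: s.red_light_distance_m is not None
              and s.red_light_distance_m < 50.0),
    Evaluator("stop_at_obstacle", 1, lambda s: s.gap_to_leader_m < 5.0),
    Evaluator("lane_change", 2, lambda s: s.contentedness < 0.3),
]


def decide(situation, evaluators=EVALUATORS, default="follow_lane"):
    """Evaluate all maneuvers, then activate exactly one of them."""
    # Step 1: every evaluator independently reports whether it requests action.
    requests = [e for e in evaluators if e.needs_action(situation)]
    if not requests:
        return default
    # Step 2: the decision module resolves conflicts by fixed priority.
    return min(requests, key=lambda e: e.priority).name
```

A fixed priority list is the simplest rule-based arbitration; the source does not state which criteria the TRM decision module actually applies.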
2.3 Driver Models for Automated Vehicles
Since AVs ultimately aim to replace the human driver, the following section provides an
overview of the key methods used in the context of AVs to satisfy the interdisciplinary nature
of modeling human-like driving behavior. Modeling human driver behavior must be addressed,
as AVs need to employ automated driving systems capable of handling complex interactions
with humans in traffic in a safe and efficient manner [50]. Given the wide variety of challenges
related to AVs, the majority of publications only cover a part of the many tasks related to
AD, such as behavior prediction, trajectory planning, or environmental information processing.
Investigating and solving isolated components of the driving task or individual scenarios makes
sense, given the complexity of the holistic problem. However, this fragmented research leads
to isolated solutions that rest on assumptions rarely valid across the wide variety of traffic
situations, are challenging to transfer, and offer few practical solutions. The primary methods
are discussed again in the context of the categories presented in the general driver model in
Figure 2.1.
• Strategic Level:
The strategic level in the context of AVs differs significantly from agent models in traffic
simulation, particularly during the early sensing phase. This distinction arises because
AVs must process raw sensory data from sources like cameras or LIDAR (Light Detection
and Ranging) sensors into meaningful environmental object information. However, as
they are outside the scope of this thesis, methods referring to object recognition or the
creation of an internal GT (Ground Truth) representation of the driving scene based on
raw sensor inputs are not further discussed. For generating a route, a process akin to
navigation or pathfinding is necessary to establish an initial path, similar to the input
required by an agent model.
• Tactical Level:
The tactical level in terms of AD aims to generate the future spatio-temporal movement
of the vehicle. This generation is mainly addressed as a planning task formulated as
an optimization or minimization problem referred to as trajectory planning. Trajectory
planning aims to generate the future spatio-temporal motion of the ego vehicle within a
given solution space while incorporating static and dynamic environmental information,
such as the behavior of other road users, the road geometry, or physical constraints
[58]. Environmental data is discretized and represented within the defined planning
space, facilitating the generation of an optimal or sub-optimal trajectory balancing
between different needs such as safety, comfort, and time efficiency [59]. Given the
intricate nature of trajectory planning in urban scenarios, the problem is often addressed
hierarchically: first, a higher-level maneuver decision is taken, followed by local motion
planning. Such decision-making tasks are addressed in various ways in AV development.
For example, some approaches employ unsupervised learning architectures based on RL
(Reinforcement Learning) to train an agent for reasonable decision-making [60, 53]. Other
researchers use combinations of unsupervised and supervised learning strategies [14] or
solely supervised approaches for decision-making [61, 62, 63].
Furthermore, these higher-level decisions can be approached similarly to agent models,
where the driving task is separated into independent sub-problems or maneuvers, and
methods such as state machines are applied to choose the appropriate maneuver [64,
65]. Alternatively, a higher-level decision can be planned by formulating the problem
so that it covers the diversity of scenarios encountered in urban traffic [66].
Such a planning problem is mainly addressed by search-, sampling-, or optimization-based
approaches [59]. Some researchers combine the individual methods into hierarchical
planning frameworks [66]. To reduce complexity and thus computational effort, different
strategies for decomposing the complex task into multiple sub-parts are employed. For
example, a path is first planned in the spatial dimension, and a velocity profile is
subsequently planned along this path [67].
In order to incorporate the evolution of the dynamic traffic scene into decision-making,
most approaches include a prediction module aiming at anticipating the future movement
of dynamic traffic participants [50]. For such anticipation tasks, Leon et al. present
a categorization into model-based and data-driven prediction models [68]. Model-based
approaches rely on knowledge, particularly physical dependencies and observable
spatio-temporal relationships. Consequently, those models perform well for short-term
predictions, as physical relationships, such as lateral acceleration for driven curvature,
serve as reliable indicators of motion dynamics [69]. However, these approaches show
weak performance when making long-term predictions due to the emergence of more
complex dependencies over extended time horizons, such as a yielding decision based on
surrounding traffic. In contrast, data-driven methods are predominantly based on
black-box models inspired by cognitive learning structures, such as a NN (Neural Network).
To provide a brief overview of learning-based and data-driven modeling approaches, the
most important keywords are explained:
The overall concept of AI (Artificial Intelligence), illustrated in Figure 2.3, aims at
making machines as intelligent as a human, involving abilities such as solving problems
or learning [70]. The ability to learn is classified as a subset of AI called ML (Machine
Learning) mainly addressed with NNs that are inspired by human cognition [70]. There
are various model approaches addressing learning, employing different levels of complexity
[70]. A highly nonlinear model structure is required to obtain dependencies when learning
complex patterns. Models incorporating multiple nonlinear layers for learning are called
DL (Deep Learning) models [70].
[Figure 2.3. Labels: Field of Artificial Intelligence; Field of Machine Learning; Field of Deep Learning]
Figure 2.3: Overview of the interrelationships AI - ML - DL [70].
DL models can be distinguished into supervised, unsupervised, and semi-supervised
[71]. Supervised models learn to predict desired output from a large amount of labeled
input-output training pairs, such as predicting a trajectory based on current and past
motion [72, 73, 74, 75, 76, 77, 78, 79]. Unsupervised models are applied less often to
predicting the future motion of other road users and more often to modeling decision-making.
Some approaches employ RL for learning tactical decision-making in maneuvers
such as lane changing [60] or stopping at intersections [53]. Semi-supervised methods
combine supervised and unsupervised methods. A commonly used method is the
GAN (Generative Adversarial Network), composed of a generator and a discriminator.
Roy et al. propose a trajectory prediction method that models interactions among
vehicles by embedding social context in a GAN, which helps to find the most acceptable
future trajectory candidate among a set of potential trajectories [80]. The generator
takes past trajectories as input and outputs a predicted trajectory. The past trajectories
and the future prediction are provided to the discriminator, which classifies them as either
real or artificial and thereby learns interactive behavior by rejecting non-acceptable
trajectories. Data-driven models, in particular supervised learning approaches based on NNs,
exhibit a superior aptitude for long-term predictions as they are able to capture
highly nonlinear patterns. Nevertheless, such models are associated with issues such as
overfitting and the lack of transparency, arising from their black-box nature [81, 50].
Supplementary to Section 2.2.5, Table 2.2 lists the most commonly used environmental
information for prediction and planning models in the context of AVs with corresponding
publications. Again, the information strongly depends on the scope and scenarios the
research covers. Furthermore, it has to be noted that relevant other road users, as
mentioned in Table 2.2, are determined either by a distance measure [82] or neighboring
relationship as presented by Mo et al. using a neighboring concept employing a graph
representation [77]. More detailed insights into the individual state-of-the-art trajectory
planning and prediction solutions will be provided in Chapter 5 and Chapter 6.
Table 2.2: Summary of the input information most commonly used by autonomous vehicles for
prediction or planning purposes.

Ego vehicle related information
  Ego position and motion                                          [75, 83, 84, 68, 85]
  Past performed actions of the ego vehicle                        [86]
  Maximal velocity                                                 [58]
  Maximal acceleration                                             [58]

Dynamic environment information
  Movement and state of relevant road users                        [75, 87, 74, 86, 84, 77, 78, 68]
  History of movement of relevant road users                       [83, 74, 77, 68]
  Dimensions of relevant road users                                [83, 68]
  Classification of relevant road users                            [87, 86, 78]
  Constraint parameters of road user’s motion (v_max, a_max)       [87]
  Potential maneuvers and routes of relevant road users            [88]
  TTO (Time-to-overlap) to relevant vehicles                       [86]
  Relative movement (distance, velocity) to relevant vehicles      [68]

Static environment information
  Road network information                                         [75, 88, 77, 68, 85]
  Road signs and speed limit                                       [75, 68, 58]
  Road geometry                                                    [86, 58, 68]
• Control Level:
To follow a predetermined trajectory, control strategies are employed to guide an AV by
computing the necessary actuating inputs, such as steering angle or throttle position.
Various controller strategies are available, ranging from classical methods employing PID
(Proportional-Integral-Derivative Controller) or model-based approaches using MPC
(Model Predictive Control), and extending to adaptive concepts that incorporate NNs or
fuzzy sets [89]. The control level is not discussed in more detail, as the present thesis
focuses on behavior modeling.
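As a brief illustration of the classical end of the controller spectrum named under the control level, the following sketch shows a discrete PID controller tracking a target speed. The gains, the first-order longitudinal plant with linear drag, and all names are assumptions made for this example; none of them are taken from the cited work.

```python
class PIDController:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        # Accumulate the integral term and approximate the derivative
        # by a backward difference (zero on the very first call).
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_speed_tracking(target=10.0, steps=400, dt=0.05, drag=0.1):
    """Track a target speed on a hypothetical plant: dv/dt = u - drag * v."""
    pid = PIDController(kp=1.0, ki=0.5, kd=0.05)  # illustrative gains
    v = 0.0
    for _ in range(steps):
        u = pid.step(target - v, dt)   # acceleration command
        v += (u - drag * v) * dt       # explicit Euler integration
    return v
```

The integral term is what removes the steady-state error caused by the drag term; a purely proportional controller would settle below the target on this plant.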
2.4 Further Fields of Research
2.4.1 Psychology
In the context of AV development and agent modeling for simulation, the focus tends to be on
the modeling aspect, whereas researchers in psychology strive to comprehend and elucidate
human driving behavior. Various experiments have been conducted to explore the impacts
of cultural, demographic, and other factors on driving styles [90, 91]. Some investigations
aim to establish correlations between driver characteristics and distinct driving patterns [92,
93]. Furthermore, some researchers attempt to separate behavior by conceptualizing models
influenced by cognitive information processing. For instance, Rasmussen introduces a tri-level
framework comprising knowledge-based, rule-based, and skill-based behavior, which diverge in
terms of the learning process, fundamental elements (experience or knowledge), and cognitive
load [94, 10]. Meanwhile, the driver model formulated by Donges is more focused on the
different tasks occurring while driving, such as the navigation, the control, and the stabilization
task associated with environmental influences [95, 10].
2.4.2 Cognitive Modeling
Cognitive modeling as a bridge between computer science and psychology is another essential
research discipline to mention in the context of human driving behavior, including the study
of the social and physical aspects of driving and the driver’s interaction with the vehicle
[96]. Cognitive architectures are general frameworks for specifying computational behavioral
models of human cognitive performance and are used to understand and model human
cognitive processes [32]. Cognitive modeling and the use of cognitive architectures facilitate
the understanding of human driving behavior, taking into account human capabilities such as
memory storage, recall of motor actions, and limitations such as memory decay or foveal versus
peripheral visual coding [32]. In the theory of human cognition, ACT-R (Adaptive Control
of Thought-Rational) is one of the most promising and complete cognitive architectures
widely used to model human behavior and decision-making and plays a crucial role in the
context of AVs [97]. ACT-R is identified as the most suitable platform for integrating the so
far independently investigated areas of self-driving cars into an interdisciplinary framework
[97]. Such cognitive systems are used to understand human cognition while driving, to address
specific tasks such as behavior prediction, and to study the driver’s cognitive load while driving
or interacting with the vehicle [98]. To give some examples, Salvucci et al. present a framework
based on ACT-R to model the driver components of control, monitoring, and decision-making
to handle typical highway tasks such as lane keeping and lane changing [32]. Janssen et al. use
cognitive architectures for agents moving in a new environment by adapting approaches from
cognition to RL [99]. The agent is supposed to learn a suitable plan in the new environment
based on experience and is modeled utilizing a cognitive architecture that characterizes its
perception, actions, and memory.
The use of cognitive models in conjunction with ML is also discussed as a promising approach
to improve performance by reducing complexity through the identification and delineation of
individual tasks, thus reducing the black-box character of models [98].
2.4.3 Robotics
Another pertinent research domain worth mentioning is robotics. In the context of AD, as
a vehicle transitions to an autonomous state, it transforms into a mobile robot since human
control is no longer in play. Many methodologies utilized in AV development originate from the
robotics field, including techniques for path or trajectory planning, tracking, and controlling
algorithms [10]. Moreover, within the realm of robotics, attempts to construct humanoid
robots entail similar challenges related to modeling human cognition and motion [100]. The
concept of agent models also shares a strong connection with robotics [27].
2.4.4 Imitating and Replicating Human Behavior
Imitating or replicating human behavior for various purposes, such as generating test cases for
AVs [101], is another related research area worth mentioning in the area of driver models. A
number of techniques in this area are based on generative modeling. Examples include learning
from demonstration using a GAIL (Generative Adversarial Imitation Learning) approach
[102, 101] and using a GAN [73] to mimic driver behavior. These approaches follow the idea
of learning from data rather than engineering. The resulting models follow an end-to-end
structure and are therefore less explainable.
2.5 Summary
In summary, the state-of-the-art analysis highlights the rich interdisciplinary nature of
human-like driver models, offering diverse perspectives, approaches, and technologies. However,
only a limited number of approaches address the topic from an interdisciplinary perspective.
Most research in this area belongs either to computer science or robotics, which focus on
behavior modeling, or to psychology, which seeks to explain behavior.
The state-of-the-art in the field of DS predominantly relies on heuristic and hierarchical
approaches that address individual sub-tasks of driving. A thorough literature review reveals
that the majority of published approaches are tailored for highway scenarios. In contrast,
urban scenarios involving interactions with other vehicles or VRUs (Vulnerable Road Users)
have received comparatively little attention in the past.
Due to the inherently heuristic nature of these models, their applicability beyond their specific
context is limited. Consequently, new rules, maneuvers, and motion models must be developed
and integrated to accommodate the broader range of scenarios occurring in urban traffic.
Owing to the complexity of driving behavior, research remains strongly focused on highway
scenarios, both in the context of DS and AVs. Many approaches devised for highway scenarios
rely on underlying assumptions, such as identifying interacting vehicles through neighboring
relations, which are not transferable to urban scenarios. Furthermore, while numerous
solutions address individual
situations or maneuvers, most methods lack generalizability and applicability to a broader
range of more complex scenarios.
3 Requirements for Modeling Human-Like Driving Behavior
Disclaimer: The following chapter is based on research presented in [103]: Teresa Rock et al. “Quantifying Realistic
Behaviour of Traffic Agents in Urban Driving Simulation Based on Questionnaires”. In: 2022 IEEE Intelligent Vehicles
Symposium (IV). IEEE. 2022, pp. 1675–1682.
Chapter 2 presented an extensive overview of various perspectives, applications, and
research domains related to human-like driver models. To comprehend the absence of a
universal solution despite extensive research efforts, it is essential to define what constitutes
human-like driving behavior, define quantifiable metrics, and outline the resulting necessities
for driver models applicable to urban environments. The present chapter addresses these
open questions by performing a detailed analysis of the problem formulation from various
perspectives, enabling the identification of methodological gaps in current state-of-the-art
solutions. Building upon this analysis, the chapter aims to pinpoint yet unsolved key challenges
in modeling human-like driving behavior and to formulate appropriate research questions.
These research questions are the focus of investigation and method development in this thesis.
3.1 Characteristics and Main Challenges of Urban Traffic
Before delving into the analysis of requirements for human-like driver models, it is essential to
outline the primary characteristics and challenges inherent to urban traffic. This foundational
step provides a clear basis for a comprehensive discussion regarding the suitability of current
state-of-the-art approaches.
Traffic environments, in general, can be broadly categorized into three primary contexts:
highway, rural, and urban. This research is focused on the intricacies of urban traffic, which
are marked by a distinct set of characteristics and complexities compared to highway and rural
environments. The urban traffic milieu introduces an array of challenges attributed to the
following characteristics [3, 6]:
• Shared spaces used by vehicles and VRUs: pedestrian crossings, intersections, bike lanes
• More complex road network design: tighter streets, less or no separation from oncoming
traffic, bidirectional roads
• High traffic dynamics: high relative velocities, frequent direction changes, high curvatures
• Occurrence of traffic obstacles: lanes occupied by wrongly parked cars, slower participants
like cyclists on the road
• Complex traffic regulation: ROW situations without unique regulations requiring
situational behavior adaptation and interaction
• Higher diversity of types of traffic participants, such as pedestrians, cyclists, or scooters
with varying dynamics
In summary, urban traffic exhibits high complexity across multiple dimensions, resulting
in behavior influenced by various factors. This complexity introduces new challenges for
driver models, both in terms of modeling behavior to process relevant information and make
reasonable decisions effectively and in evaluating model performance considering the impact of
situational influences.
3.2 Perspectives on Defining and Quantifying Human-Like Driving Behavior
As presented in Section 2.1, human-like driving behavior is explored across multiple research
domains. Various tasks, including modeling, analysis, and simulation of human-like or realistic
behavior, are discussed in the literature. Notably, no universally accepted definition of
human likeness or realism of behavior exists. Consequently, diverse research domains such as
psychology and computer science offer distinct approaches to quantifying, defining, or assessing
behavior. To offer a comprehensive perspective within this thesis, the following section provides
an overview of objective and subjective methodologies stemming from various research areas.
3.2.1 Objective Approaches to Quantify Human Likeness
Objective approaches measure human likeness by various indicators intended to characterize
driving behavior. In simulations, driver models are used for traffic flow analysis [104, 105] or
safety assessment of automated driving functions [47, 48]. Such multi-agent simulations can
create a database of synthetic behavioral data. For instance, simulation-based safety
studies require driver models controlling surrounding traffic to exhibit human-like behavior.
Fries et al. investigate the degree of human likeness of their model by comparing collision risk
due to relative velocities and distances in highway scenarios with different real traffic datasets
[47]. Other approaches, such as Lindorfer et al., evaluate model behavior in well-defined
scenarios, such as following maneuvers on highways, by analyzing the similarity of speed traces
from models compared to reference data from FOTs (Field Operational Tests) [106]. Typical
approaches to measure the similarity to arbitrary reference data include RMSE (Root Mean
Squared Error), means, and standard deviations [106, 107, 108].
In driving behavior studies, behavioral data of individuals is investigated concerning individual
aspects such as cultural or demographic influences. Typical measurement parameters aiming
to characterize driving behavior are:
• Average and maximum speed as well as frequency and extent of speeding [109]
• Distance to other vehicles, speed, steering wheel reversals, and acceleration counts [110]
• TTC (Time to Collision), longitudinal distance, and velocity in conflict situations [90]
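Of the indicators listed above, TTC is straightforward to compute from longitudinal data. The sketch below uses the common definition for a car-following situation, the gap divided by the closing speed; the function name and the convention of returning infinity when the gap is not closing are assumptions of this example, and the cited studies may define the measure differently.

```python
import math


def time_to_collision(gap_m, v_follower, v_leader):
    """Return the TTC in seconds for a follower approaching a leader.

    gap_m      : longitudinal net distance between the vehicles [m]
    v_follower : speed of the following vehicle [m/s]
    v_leader   : speed of the leading vehicle [m/s]
    """
    closing_speed = v_follower - v_leader
    if closing_speed <= 0.0:
        # The gap is constant or growing, so no collision is predicted.
        return math.inf
    return gap_m / closing_speed
```

For example, a follower at 15 m/s that is 20 m behind a leader at 10 m/s yields a TTC of 4 s.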
Other relevant methods evaluate trajectories generated by a driver model or submodules, such
as a prediction or planning module. The assessment is carried out by comparing the generated
trajectory to the human-driven trajectory in the same context. For instance, the cornering
and lane-change behavior of a driver model is assessed by comparing steering angle, lateral
acceleration, and longitudinal velocity with those exhibited by human drivers navigating the
same course [111]. In the field of AV development, human-like driving behavior is important
for evaluating trajectory prediction models. To assess the accuracy of such prediction models,
the generated trajectories are compared against those driven by humans using the same input
data, as illustrated in Figure 3.1. Typically, spatio-temporal distance measures are employed
to compare prediction and GT, encompassing metrics such as ADE (Average Displacement
Error), FDE (Final Displacement Error) [112], RMSE, or specific temporal (e.g. time to
detection) and spatial (e.g. lateral displacement) metrics [50].
In summary, objective human likeness is defined either as the statistical similarity between
macroscopic behavior parameters of artificial and real driving data or by distance
measures that compare spatio-temporal motion sample-wise within the same context.
[Figure 3.1. Labels: Input data describing the driving situation; Human driven trajectory; Predicted trajectory]
Figure 3.1: Evaluation of model performance based on the similarity of a predicted trajectory
compared to the individual human-driven trajectory.
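The sample-wise comparison of a predicted and a human-driven trajectory can be sketched as follows. The functions use the usual definitions of ADE and FDE over Euclidean displacements between time-aligned 2D points; the RMSE variant over the same displacements, the function names, and the toy trajectories in the usage note are assumptions for illustration, since the cited works may compute these metrics over different quantities.

```python
import math


def displacement_errors(pred, gt):
    """Euclidean distance between time-aligned (x, y) points of two trajectories."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def ade(pred, gt):
    """Average Displacement Error: mean displacement over all time steps."""
    errors = displacement_errors(pred, gt)
    return sum(errors) / len(errors)


def fde(pred, gt):
    """Final Displacement Error: displacement at the last time step."""
    return displacement_errors(pred, gt)[-1]


def rmse(pred, gt):
    """Root mean squared displacement over all time steps."""
    errors = displacement_errors(pred, gt)
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

With a hypothetical ground truth `[(0, 0), (1, 0), (2, 0), (3, 0)]` and prediction `[(0, 0), (1, 0), (2, 1), (3, 2)]`, the displacements are 0, 0, 1, and 2 m, giving an ADE of 0.75 m and an FDE of 2 m.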
3.2.2 Subjective Approaches to Quantify Human Likeness
In contrast, subjective approaches use humans as quantifiers, employing questionnaires to
inquire about their perceptions. The underlying assumption is that human likeness is defined
by what is perceived as real or by what cannot be distinguished from human behavior. Zhang
et al., for example, adapted the Turing test and asked participants to classify the driving
behavior of another vehicle as either artificial or human-driven [113]. Dumbuya et al. followed
a similar concept by asking subjects to rate how realistic they perceived a drive completed by
different driver models and how likely it was that a human driver conducted the drive [114].
Since model
behavior during simulation in DiL applications can be considered a VE experience, methods
for measuring the experience and effectiveness of a VE are a further relevant research area. In
most cases, the concept of presence [115, 116, 117, 118, 119] or immersion [120, 115] is used to
quantify the quality of a VE. The underlying assumption is that a high sense of presence means
that people respond as if the sensory input were real [118]. Thus, a high degree of presence
is associated with a high quality of the VE. Those approaches were applied to investigate
pedestrian [117, 119] and driving simulators [117, 119, 116]. Due to the complexity of human
perception and cognition, several approaches divide the sense of presence into sub-classes, such
as spatial presence, physical presence, social presence, or co-presence [121]. In experiments,
participants are asked to rate their sense of presence using questionnaires, such as the IPQ
(Igroup Presence Questionnaire) [122] and the SUS (Slater-Usoh-Steed) presence questionnaire
[123].
3.2.3 Potentials and Limitations of Objective and Subjective Quantification Methods
Objective and subjective approaches offer distinct strengths and limitations, which are
summarized in Table 3.1. Objective strategies for quantifying human likeness mainly rely
on macroscopic behavioral perspectives, evaluating the resemblance of motion parameters to
those found in real reference data. These assessments are typically independent of individual
situational influences. The reliability and relevance of assessment results strongly depend on the
reference data employed and the scenarios under consideration. Such methods often necessitate
the establishment of thresholds, either based on the similarity of individual parameters or
applied statistics, to determine if the behavior is sufficiently human-like. A major challenge
in comparing parameters to reference data is ensuring that the reference dataset includes
directly comparable scenarios. These scenarios or maneuvers can be defined by factors like
traffic density or the number of parallel lanes on a highway road [47]. Objective approaches
often prioritize highway scenarios, partially due to the growing complexity of identifying
comparable scenarios within more complex urban environments. While objective approaches
offer insights into macroscopic behavioral patterns, they do not offer transparency into the
realism of driving behavior concerning individual situational contexts. In contrast to objective
approaches, subjective experiments provide insights into behavior within the situational context
but demand substantial effort in the preparation and execution of experiments. Additionally,
these methods do not facilitate the identification of individual model weaknesses, as they
evaluate overall behavior without discretizing driving behavior into categories, dimensions, or
parameters.
Table 3.1: Potentials and limitations of different perspectives to quantify human-like driving
behavior.

                                                                  Objective   Subjective
Effort (suitability for iterative development process)                +           -
Transferability (suitability for complex urban environments)          -           +
Providing insights into situational realism (micro-level)             -           +
Potential to identify individual model weaknesses                     +           -
3.3 Requirements From a Driver-In-The-Loop Perspective
Section 3.2 provided an overview of different perspectives on quantifying the human likeness
of driver models, highlighting the associated possibilities and limitations of these approaches.
However, none of them provides sufficient insight to derive clear requirements for modeling
human-like drivers. Therefore, theories from a psychological point of view will be further
investigated concerning DiL applications. The resulting findings will be combined with technical
requirements for DS to provide a basis for analyzing the state-of-the-art approaches and identify
root causes of problems and scientific gaps.
3.3.1 What It Takes to Be Perceived as Real - The Human Driver in the Simulator as a Mirror
The primary focus of this thesis is DS, specifically within the context of DiL
applications. As mentioned above, potential definitions of human likeness can be based on what
is perceived as real by a human experiencing the VE [118] or what is indistinguishable
from artificial behavior for a human [113]. In order to gain a more profound insight into
what is required for behavior to be perceived as real, this section examines the role of
the driver in the simulator and analyzes the experience within the VE. As explained in Section
3.2, the common method for measuring perceived realism in VEs revolves around the
concept of presence. Factors contributing to a high sense of presence are therefore investigated
to acquire a more comprehensive understanding.
In the literature, presence is assumed to depend on the following conditions:
• Consistent low-latency sensorimotor loop: predictable correlation between proprioception
and sensory input [118]
• Statistical plausibility: visual inputs must be plausible with regard to the probability
distribution of natural scenes [118]
• Behavior-response correlations: appropriate correlations between state, behavior, and
responses [118]
• Interaction with the environment: appropriate response from the environment, especially
in social interactions with other virtual humans [115, 118, 124, 125]
Johnson et al. argue that people not only infer from given information but also anticipate the
future. Consequently, a strong sense of presence relies on aligning past and current experiences [126].
Drawing on constructivist theory, the authors emphasize that subjects in VEs do not
perceive an exact image but rather a constructed version shaped by their cognitive processes,
which are influenced by various factors. The authors highlight the need to distinguish between
the degree of interactivity and the degree of graphic realism of a character. Additionally,
research on the uncanny valley indicates that a photorealistic character
displaying jerky movements is more likely to be perceived as uncanny than a cartoon
character that moves in a more human-like manner [125]. According to Jerald, social presence in
a VE is the sensation of true communication with other virtual characters, which is reinforced
when these characters move and behave in a manner consistent with the physical world
[126]. These factors show what leads the human driver in
the simulator to perceive the VE as more real. In order to identify clear requirements for
human-like driver models, the following sections place these findings in the context of traffic
behavior and relate them to the characteristics and capabilities of human driving behavior.
3.3.2 What It Takes to Cope With Urban Traffic - The Human Driver as a Role Model
Since the objective is to model human-like driver agents, the abilities and characteristics employed
by human drivers in urban traffic should be considered to derive the required model capabilities.
Human driving style varies due to different personalities, experiences, and situational contexts,
including internal and external influences [127]. External influences result from the environment,
such as road conditions, weather, and other road users, while internal factors relate to the
individual driver, such as age, gender, and risk tolerance. Therefore, human traffic behavior is
characterized by a range of behavioral patterns corresponding to various combinations
of those factors.
The driver’s central responsibility is to participate safely in traffic while following the
applicable traffic rules. Decision-making in urban situations is complex
and non-deterministic, requiring interaction and communication to maintain traffic flow and
prevent accidents. Studies of communication between AVs and human drivers, such as those
conducted by B. Faerber, show the importance and characteristics of communication in traffic
[128]. The exchange of information in traffic can vary in type, such as verbal and non-verbal
communication, but can only be understood from the respective context. Gestures and
movement are pervasive means of non-verbal communication in traffic, making someone’s
behavior predictable for other road users. The authors conclude that AVs must be able to
recognize and interpret other road users’ gestures and trajectories. Situational knowledge and
understanding are required to interpret such signals and put information into the right context.
Understanding signals from context is investigated in various research areas and can be
associated with the concept of SA (Situation Awareness). According to Endsley, SA can be
divided into three levels, which are not necessarily strictly forward or linear [129]:
• Perception of the current situation
• Comprehension of the current situation
• Projections of probable future developments
Especially in urban traffic, various situations in shared spaces involve interactions between road
users. Interaction can be defined as behavior influenced by space-sharing conflicts [130]. The
foundation for interaction, understood as implicit or explicit communication in traffic, is mutual
knowledge and shared information [130, 131]. To successfully maneuver through interactive
situations in traffic, Markkula et al. identified the following key tasks that a human must
complete [9]:
• Perceiving: recognizing and showing awareness of others
• Moving: adapting the spatio-temporal movement to avoid collisions
Furthermore, driving is described as a highly dynamic process that requires anticipation to
consider potential future outcomes for reasonable driving decisions [132]. In summary, based on
recognizing and interpreting the situational context, humans constantly adapt their behavior
to the current situation, leading to context-related plausibility in behavior with regard to spatial
and temporal changes.
3.3.3 Use Cases and Technical Requirements Originating From Driving Simulation
There is a wide range of applications for DS, ranging from concept testing during early
development to final safety assessment before introducing new features. In the context of DiL
applications, use cases related to the driver’s perception of HMI (Human-Machine Interface)
systems are particularly relevant. These include studies exploring the acceptance, comfort, or
understanding of driving assistance features [133] or use cases in which the driver has to take
over control of the vehicle because a function is no longer available [134]. Common assistance
systems in urban settings encompass ACC (Adaptive Cruise Control), collision warning and
avoidance systems, VRU detection, blind spot detection, distance or speed assistants, and
intersection assistants [135, 3]. All these systems share the common characteristic of being
significantly influenced by the behavior of other road users. DS offers significant advantages,
particularly in scenarios that cannot be tested in actual traffic for safety or ethical reasons.
Based on the characteristics of urban traffic and typical assistance functions, the following
critical scenarios were identified that driver agents should be able to cope with in order to
exploit the full capacity of DS. The four key scenarios, illustrated in Figure 3.2, cover the
main challenges of urban traffic [3] and provide a basis for applying and testing the methods
developed in this thesis.
• Interactions with other traffic participants under different ROW regulations (A_VEH)
• Interaction with pedestrians crossing the road (B_PED)
• Avoidance of static obstacles in traffic (C_STAT)
• Handling of dynamic obstacles, such as interactions with cyclists on the road (D_BIC)
In order to provide a reliable test environment for experiments, driver agents must be able to
cope functionally with the above scenarios. This means agent models must exhibit deterministic
and reproducible behavior while avoiding collisions and deadlocks. In addition, since the
simulation in DiL applications takes place online, approaches are required to be real-time
capable. Furthermore, the plausibility of the traffic agents’ behavior affects the simulation’s
validity [46].
3.3.4 Summary of Requirements for Human-Like Driver Agents in Urban Driving Simulation
By combining the technical requirements for simulation, the psychological factors that
contribute to perceived realism, and the characteristics of urban traffic and human driving
behavior, the following higher-level requirements for human-like driver models in urban traffic
can be derived.
Figure 3.2: Key scenarios (panels A_VEH, B_PED, C_STAT, D_BIC) representing the main
challenges of urban traffic for driver agents in simulation. The ego vehicle, shown in yellow,
experiences the different situations.
1. Spatio-temporal consistency (SPA-TEM): Context-related behavior and SA.
Research demonstrated the importance of spatio-temporal consistency in behavior, both
in the context of presence measurements and with regard to safe and efficient participation
in traffic. Spatio-temporal consistency in the behavior of human drivers is based on
a certain level of understanding of the current situation and the ability to anticipate
potential temporal evolutions of the situation.
Therefore, to exhibit human-like and reasonable behavior, agent models must be able to
perceive and identify relevant environmental details, associate and interpret information,
and make predictions about temporal and contextual changes.
2. Interactivity (INTERA): Behavior-response correlation and communication.
Interaction and behavior-response correlation contribute to a high sense of presence,
and the ability to interact and communicate is crucial for safe participation in traffic.
Furthermore, typical use cases in driving simulation involve interactive traffic scenarios.
According to the conducted analysis, driver agents must be able to adapt their behavior
to the dynamic situation, based on a shared understanding, in order to interact with
other road users effectively.
3. Individuality (STAT): Statistical representation of natural patterns.
Human driving style varies due to different factors such as personality, experiences, and
situational circumstances, such as weather, traffic density, or individual behavior of other
road users. Consequently, human traffic behavior exhibits a spectrum of behavioral
patterns.
Therefore, the behavior of driver models should be parameterizable, allowing them to
exhibit a range of behavioral variety that aligns with the statistical range of natural
patterns occurring in real traffic.
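The parameterizability demanded by the third requirement can be illustrated with a minimal sketch. It is not the parameter set used in this thesis: the class `DriverProfile`, its three parameters, and all means, spreads, and bounds are hypothetical placeholders. The sketch only shows how a population of agents could be drawn from distributions while remaining reproducible through a fixed seed, in line with the demand for deterministic behavior in Section 3.3.3.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverProfile:
    """Hypothetical parameter set describing one driver agent."""
    desired_speed_mps: float  # preferred free-flow speed
    time_headway_s: float     # preferred time gap to the vehicle ahead
    max_accel_mps2: float     # comfortable acceleration limit


def _clipped_gauss(rng, mean, std, low, high):
    """Draw from a normal distribution and clip to a plausible range."""
    return min(max(rng.gauss(mean, std), low), high)


def sample_profile(rng):
    # All means, spreads, and bounds are placeholders, not measured data.
    return DriverProfile(
        desired_speed_mps=_clipped_gauss(rng, 13.0, 1.5, 8.0, 16.0),
        time_headway_s=_clipped_gauss(rng, 1.5, 0.4, 0.8, 3.0),
        max_accel_mps2=_clipped_gauss(rng, 1.5, 0.3, 0.8, 2.5),
    )


def sample_population(n, seed):
    """A seeded generator keeps the sampled population reproducible."""
    rng = random.Random(seed)
    return [sample_profile(rng) for _ in range(n)]


population = sample_population(50, seed=42)
```

In practice, the distributions would have to be fitted to real traffic data so that the sampled variety matches the statistical range of natural behavior.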
3.4 Methodological Gaps in State-Of-The-Art Solutions
Inspired by the literature presented in Sections 2.2.5 and 2.3 and the analysis conducted
in Sections 3.3.1 and 3.3.2, which discuss different perspectives on the driver, the
main components and interrelationships required to generate human-like driving behavior
in urban traffic are consolidated in a model illustrated in Figure 3.3. To replicate human
driving behavior in urban traffic, a representation of
the current situation is necessary to make information accessible to the automated algorithms
controlling behavior. These
representations of complex environments depend on information sources beyond
raw sensory information, such as knowledge and experience. Furthermore, information
processing is distributed among multiple tasks that involve interpreting information and
anticipating future developments. Decision-making for dynamic behavior control is based on
input provided by these components. The output of the model, its behavior, is defined as
spatio-temporal motion, in this case the vehicle’s trajectory. Based on these components, their
interdependencies, and the high-level requirements presented, state-of-the-art solutions were
analyzed to derive the following methodological gaps.
Figure 3.3: Overview of identified components required for modeling human-like driving behavior
in urban traffic. Diagram labels: Situation, Representation, Interpretation, Anticipation,
Decision Making, Trajectory (Behavior), Human, Model, and Knowledge, Rules, Experiences,
Personality.
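The flow of information between the components named in Figure 3.3 can be sketched as a simple processing chain. This is only an illustration of the structure under strong assumptions: the class `DriverModel`, the toy car-following functions, and all thresholds are hypothetical stand-ins and not the models developed in this thesis.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DriverModel:
    """Chains the components: representation, interpretation, anticipation, decision."""
    represent: Callable
    interpret: Callable
    anticipate: Callable
    decide: Callable

    def step(self, raw_situation, knowledge):
        representation = self.represent(raw_situation, knowledge)
        interpretation = self.interpret(representation)
        prediction = self.anticipate(interpretation)
        # The model output (behavior) is a trajectory of (t, x, y) samples.
        return self.decide(interpretation, prediction)


# Toy stand-ins for a one-dimensional car-following situation.
def represent(raw, knowledge):
    return {
        "ego_x": raw["ego_x"],
        "ego_v": min(raw["ego_v"], knowledge["speed_limit"]),
        "gap_m": raw["lead_x"] - raw["ego_x"],
        "closing_speed": raw["ego_v"] - raw["lead_v"],
    }


def interpret(rep):
    # Attach semantic meaning to the raw values.
    return {**rep, "approaching_lead": rep["closing_speed"] > 0.0}


def anticipate(interp, horizon_s=2.0):
    return {"gap_at_horizon": interp["gap_m"] - interp["closing_speed"] * horizon_s}


def decide(interp, pred, min_gap_m=5.0, dt=1.0, steps=3):
    # Halve the speed if the anticipated gap becomes too small.
    v = interp["ego_v"] if pred["gap_at_horizon"] > min_gap_m else 0.5 * interp["ego_v"]
    return [(k * dt, interp["ego_x"] + v * k * dt, 0.0) for k in range(1, steps + 1)]


model = DriverModel(represent, interpret, anticipate, decide)
trajectory = model.step(
    {"ego_x": 0.0, "ego_v": 10.0, "lead_x": 20.0, "lead_v": 4.0},
    {"speed_limit": 13.9},
)
```

Keeping the components separate makes each stage inspectable on its own, which is the property the following gap analysis relies on.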
3.4.1 Insufficient Representation of Context Information
To exhibit interactive and plausible behavior in urban traffic, a certain degree
of SA is required. While scientists in psychology attempt to decipher the cognitive processes
that take place while driving, most approaches in the technical field of modeling lack a distinct
perspective on information processing and fail to consider the comprehension part needed to
generate SA within the model. Tables 2.1 and 2.2 summarize the most commonly used information
in the context of AVs and agent models in DS. Most situational information is provided in
raw form without associated semantic meaning or interpretation. Dynamically changing
information is represented by the position and motion of selected neighbors of the ego vehicle.
In most cases, relevant road users are limited to other vehicles. Given the major challenges of
urban traffic, there is a gap in the representation of dynamic information at a sufficient level
that includes situational understanding of interactions characterized by involved road users
and their inherent relationships to enable SA. Furthermore, current research rarely considers
situational context as an input for modeling behavior or evaluating model behavior. Meanwhile,
approaches that incorporate more contextual information tend to focus on isolated situations
and lack transferability. An appropriate representation of the situation is the basis for any
further behavior modeling. Accordingly, the representation of information is identified as
one of the main challenges when modeling driving behavior for urban scenarios and is the basis
for the first research question:
R1: How to represent complex situations in urban traffic in a general and
transferable manner?
3.4.2 Missing Anticipation in Agent Models
Anticipation and the role of expectations are heavily discussed in the psychological perspective
of SA [129, 136]. Moreover, the basis for interactivity and spatio-temporal consistency in
behavior, as discussed in Section 3.3.4, shows the need for anticipation. Meanwhile, estimating
the intentions of other road users is a widely studied topic in developing AVs to enable safe
and reasonable decisions. However, current driver agents in simulation rarely anticipate future
developments of the situation or incorporate such information into decision-making. Although
behavior prediction of other road users is frequently addressed, it is still an unsolved challenge
for AVs in urban scenarios due to the complexity of behavior. Published approaches tend to
offer single-point solutions that are not sufficiently transferable to the diversity of urban traffic
due to overfitting and the lack of transparency of black-box models [81, 50].
Therefore, the published anticipation modules cannot simply be integrated into agent models
as a tool, and further research on anticipation in urban traffic is required. Based on this, the
second research question is formulated:
R2: How to enable general and transparent predictions in complex urban traffic?
3.4.3 Missing Dynamics in Decision-Making of Driver Agents
Current agent models are based on hierarchical and heuristic decision strategies that divide
the driving task into maneuvers and formulate rules for the selection of each maneuver. Such
approaches discretize the action space and thus limit the ability to adapt behavior situationally.
The discussion in Section 3.3.4 demonstrated that more dynamic decision strategies are necessary to
satisfy the requirements of interactivity and spatio-temporal consistency in behavior. Moreover,
strong discretization of the action space can lead to deadlocks in complex situations [137, 138].
Such functional errors impede the simulation and might negatively influence the participants’
perception of reality in DS [139, 118]. As a solution, agent models need to be enabled for DDM
(Dynamic Decision-Making), which is characterized by a decision process involving a sequence
of interdependent choices that influence future actions [140], instead of static decision-making,
which is defined as a linear process that makes choices among explicit alternatives [141].
Accordingly, the third research question is formulated:
R3: How to enable driver agents for dynamic decision-making in order to cope
with typical urban scenarios in a functional and human-like manner?
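How a strongly discretized, rule-based decision strategy can end in a deadlock may be illustrated with a deliberately simple toy example. It assumes an uncontrolled four-way intersection with a priority-to-the-right rule and only two maneuvers (WAIT, GO). The rule table and function names are hypothetical and not taken from the cited works.

```python
# Arm of the intersection whose vehicle has priority over a driver, keyed by
# the arm the driver approaches from (priority-to-the-right, assumed here).
RIGHT_OF = {"S": "E", "E": "N", "N": "W", "W": "S"}


def select_maneuvers(waiting):
    """Static rule: WAIT if a vehicle is waiting on the right, otherwise GO."""
    return {arm: ("WAIT" if RIGHT_OF[arm] in waiting else "GO") for arm in waiting}


def simulate(waiting, max_steps=10):
    """Apply the rule until the intersection is clear or nobody can move."""
    waiting = set(waiting)
    for step in range(max_steps):
        if not waiting:
            return "cleared", step
        decisions = select_maneuvers(waiting)
        movers = {arm for arm, decision in decisions.items() if decision == "GO"}
        if not movers:
            return "deadlock", step
        waiting -= movers
    return "timeout", max_steps


three_cars = simulate({"S", "E", "N"})
four_cars = simulate({"S", "E", "N", "W"})
```

With three vehicles the rule resolves the situation one vehicle at a time. With four vehicles every driver yields to the vehicle on the right and no one moves, a situation human drivers typically resolve by communicating and adapting continuously rather than by choosing among fixed maneuvers.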
3.4.4 Insufficient Evaluation Strategies and Misleading Metrics
As discussed in Section 3.4.1, contextual information is important not only for modeling but
also for evaluating model behavior. The main approaches discussed in Section 3.2 and Table
3.1 show that the capability of currently available evaluation strategies is limited. Most applied
metrics are based on a pairwise comparison of model behavior with human behavior under the
same conditions. This is limited by the available test scenarios and only provides insights into
the similarity of model behavior to that individual human behavior, not to human-like behavior in
general. Meanwhile, macroscopic analyses identify comparable subsets in real-world traffic and
artificial behavior data using criteria that are not transferable to urban scenarios. As a result,
a gap has been identified in gaining transparent insights into the strengths and weaknesses
of individual models in coping with urban traffic, which would allow for future improvements. Given
the strong situational dependency of behavior in urban traffic, the context of the behavior
must be considered in the evaluation to gain meaningful insights. It is crucial to provide
sufficient transparency regarding the limits of a model and to identify the situations in which
difficulties arise to allow for reliable solutions in the future. Depending on the application and
the development state of the model, different approaches are required. Accordingly, the fourth
research question is formulated:
R4: How to identify model limitations and quantify the degree of human likeness
of holistic driver models and individual subcomponents?
4 Representation of Complex Traffic Situations
Disclaimer: The following chapter is based on a concept presented in [142]: Teresa Rock et al. “Data-Driven Prediction of
Other Road Users’ Intention for Better Scene Understanding in Traffic Agents”. In: Proceedings of the Driving Simulation
Conference 2022 Europe VR. ed. by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation
Association. Strasbourg, France, Sept. 15, 2022, pp. 9–16.
The present chapter addresses research question R1: How to represent complex
situations in urban traffic in a general and transferable manner?
Beginning with a brief introduction, the chapter provides an overview of current methods for
representing contextual information. Subsequently, a novel approach is introduced, focusing
on the representation of complex urban traffic scenarios inspired by the concept of knowledge
transfer. This technique serves as the foundation for enhancing situational understanding and
SA within a driver model. A framework is presented that facilitates the representation of
complex urban scenarios in a general and transferable manner and is applied to a real traffic
database. Finally, results and limitations of the method are discussed, and recommendations for
future work are provided. Since it is usually not feasible to independently verify a representation
due to the absence of GT, the effectiveness of the presented representation is validated through
the use of a prediction model in the ensuing chapter.
4.1 Introduction and Motivation
The foundation of any model lies in the representation of information. Unfortunately, in
the field of agent models for DS, the precise methodology of information processing is often
not discussed in publications. Consequently, the extent to which raw data is associated and
interpreted to generate situational understanding within the agent model remains unclear.
Given that these models typically adhere to heuristic methodologies, it can be inferred that
these association processes are predominantly situation- or maneuver-specific, constraining their
potential for generalization. Within the domain of AV development, contextual information
is mostly processed through the formulation of constraint and objective functions during
planning or by providing input features to a prediction model. In this context, situational
understanding is only handled implicitly, either through the optimization of cost functions or
through the intricate nonlinear relationships within black-box models.
Drawing from Endsley’s theory of human SA, it is acknowledged that achieving a level of
comprehension extends beyond merely perceiving information [143]. Instead,
various streams of information are combined to derive meaning and extract relevant
information. As an analogy, this process is compared to reading and understanding the meaning
of a text rather than just reading single words.
Combining different sources of information to allow a comprehensive understanding of a
situation is referred to as interpretation in the following.
From the author’s point of view, the interpretation and thus the representation of information
is one of the major weaknesses of current driver models for urban traffic. Information is usually
provided in a raw format, and the process of relevance determination and interpretation is
delegated to some optimization procedure that aims at generating meaningful results but no
longer provides transparency. The lack of the interpretation level places unnecessary demands
on such a model, as the model is deprived of essential knowledge, such as the semantic meaning
of relationships between road users. At the same time, agent models in simulation formulate
scenario-specific interpretations without further automation, and generalizability suffers as a
consequence.
The following sections propose a method for associating information in complex urban traffic
situations to address the missing interpretation level in driver models. In the first step, a
general scene description from an ego perspective is elaborated to provide a basic description
covering the majority of situations occurring in urban traffic. The subsequent method generates
model-accessible features from a raw data basis containing dynamic situational and static
environmental information. The feature generation method is a multi-step process involving
associating and interpreting information using prior knowledge. The method is implemented
for an open-source dataset of real traffic situations at German intersections captured by a
drone.
4.2 State-Of-The-Art: Environment Representation
The following section provides an overview of state-of-the-art approaches to representing the
environment. The first part addresses how to represent the static environment, e.g.,
the road network or the map. Subsequently, the representation of dynamic
information is discussed.
4.2.1 Static Environment Representation
Different formats and approaches exist to represent the static environment. One can
distinguish between geometric and topological representations. Geometric
representations are often built on raw sensor data, aiming to represent lanes or lane boundaries
to describe the drivable area. Typical methods are polylines, clothoids, or
representing the 3D environment by octrees [144].
Topological approaches involve semantic information and provide a more abstract way to
represent the environment. Such representations vary greatly in their level of detail. In 2011,
OSM (OpenStreetMap), an open collaborative project, was introduced [145]. OSM is a map
database representing road networks using a hierarchical approach to provide geometrical
information on lanes associated with topological information such as relations or driving
directions [146]. Based on the OSM format, Bender et al. introduced the lanelet description,
especially for self-driving cars, which combines geometrical and topological information to
overcome the drawback of the weak local geometrical representation of OSM [144]. In some
experiments, HD (High-Definition) maps were introduced for AVs, a high-detail format
containing additional information such as road boundaries and road curvatures [146]. Due to
the high level of detail, such representations are computationally demanding, especially in online
applications [146]. Offline formats were introduced to overcome the demanding process of
dynamically building HD maps. Commonly used formats are OpenDRIVE, NDS (Navigation
Data Standard), and lanelet2, which are usually pregenerated offline [147]. OpenDRIVE maps
are commonly used in the field of DS and are named as the most likely HD map format
for the future based on advantages like expressiveness and accessibility, benefiting from the
open-source and open-community character fostered by ASAM (Association for Standardization of
Automation and Measuring Systems) [147, 148]. Therefore, the OpenDRIVE format was
chosen as the representation for static environments in this thesis. The
description of road networks is detailed in the following section.
4.2.2 OpenDRIVE Maps
The OpenDRIVE format was introduced by ASAM and describes either real or synthetic road
networks using a multi-level approach incorporating roads, lanes, and objects, such as traffic
signs or signals. The map description is stored in an XML (Extensible Markup Language) file
format, employing an extendable node format to allow transferability to individual applications
[149]. Due to the relevance for this thesis, the following descriptions focus on representing
urban road networks. The basic elements in an ODM (OpenDRIVE Map) are roads, junctions,
and lanes. Roads consist of one or multiple lanes, which are not allowed to overlap. Junctions
consist of multiple roads, which are allowed to overlap, and connect incoming and outgoing
roads. Each road refers to a reference line that provides the basis for describing the placement
of objects and lanes within this road using a local two-dimensional coordinate system called
s/t, whereby the s-axis runs along the road and the t-axis is perpendicular to it. Any object
in the road, such as signs or signals, can be referenced either by global coordinates or by local s/t
coordinates within the respective lane, as illustrated in Figure 4.1. Lanes of different roads are
connected by predecessor and successor relationships. Each road in the map has a unique ID,
and each lane within a road has an ID that is unique within that road. Thus, each lane can be
identified individually by the combination of road and lane ID. Traffic signs or signals are
associated with lanes and
roads by referring to the respective IDs. Lanes have multiple properties, such as type, driving
direction, and geometrical descriptions of lane boundaries and the center line. Lanes can be of
different types, such as driving lanes, parking lanes, or bicycle lanes, as shown in Figure 4.2a.
The geometrical representation of reference lines is expressed by points, whereby each point
Figure 4.1: OpenDRIVE road network description including reference line, lane objects and lane
features [149].
carries global coordinates and s/t coordinates. Multiple lanes within one road can be identified
as neighbors by their driving direction, characterized by the sign of the respective lane ID,
as illustrated in Figure 4.2b [150]. For reading OpenDRIVE maps, a module of the BMW
proprietary simulation framework Spider is used [151, 55]. Map representations within Spider
align with the OpenDRIVE standard. However, instead of distinguishing between roads and
junctions, the road network is divided into different types of sections: linear sections referring
to roads and intersections referring to junctions. Each section consists of one or multiple lanes.
Comparable to the OpenDRIVE standard, lanes are allowed to overlap within intersections,
but not within linear sections. Each lane is uniquely identified by lane ID, section ID, and
section type. The geometry of a lane is described by the polylines of the center line, left lane
boundary, and right lane boundary, as illustrated in Figure 4.3. Each point of the polyline is
characterized by geometrical attributes involving local coordinates, s/t coordinates, curvature,
and direction values.
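The relation between global coordinates and the local s/t system can be made concrete with a short sketch that projects a point onto a reference line given as a polyline. It is an illustration only and does not reproduce the Spider implementation: the function name `project_to_st` is hypothetical, and the reference line is assumed to consist of straight segments. Following the OpenDRIVE convention, t is positive to the left of the reference line direction.

```python
import math


def project_to_st(polyline, point):
    """Return (s, t) of `point` relative to a polyline reference line.

    s is the arc length along the polyline to the closest point on it,
    t is the signed lateral offset (positive to the left).
    """
    px, py = point
    best = None  # (squared distance, s, t)
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicated points
        # Fraction along the segment of the closest point, clamped to [0, 1].
        u = ((px - x0) * dx + (py - y0) * dy) / seg_len**2
        u = min(max(u, 0.0), 1.0)
        cx, cy = x0 + u * dx, y0 + u * dy
        dist_sq = (px - cx) ** 2 + (py - cy) ** 2
        # The sign of the 2D cross product tells left (+) from right (-).
        cross = dx * (py - y0) - dy * (px - x0)
        t = math.copysign(math.sqrt(dist_sq), cross)
        if best is None or dist_sq < best[0]:
            best = (dist_sq, s_start + u * seg_len, t)
        s_start += seg_len
    if best is None:
        raise ValueError("polyline needs at least one non-degenerate segment")
    return best[1], best[2]


reference_line = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
s_left, t_left = project_to_st(reference_line, (4.0, 2.0))
s_right, t_right = project_to_st(reference_line, (11.0, 5.0))
```

The first point lies 2 m to the left of the first segment, the second 1 m to the right of the second segment. Real reference lines also contain arcs and spirals, which would need a finer sampling or an analytic projection.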
Figure 4.2: OpenDRIVE lanes: (a) lane types in urban areas; (b) neighbor relations and driving
direction by lane ID [150].
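As a minimal sketch of how lanes can be addressed by road ID and lane ID, and how the sign of the lane ID groups lanes by driving direction, the following example reads a hand-written and heavily reduced OpenDRIVE fragment with the Python standard library. The fragment and the helper names are illustrative assumptions. The sketch does not represent the Spider map module, and it ignores that a road may contain several lane sections.

```python
import xml.etree.ElementTree as ET

# Reduced, hand-written fragment: real maps also contain geometry, links,
# lane widths, and further attributes.
ODR_SNIPPET = """
<OpenDRIVE>
  <road id="7" junction="-1" length="50.0">
    <lanes>
      <laneSection s="0.0">
        <left>
          <lane id="2" type="sidewalk"/>
          <lane id="1" type="driving"/>
        </left>
        <center>
          <lane id="0" type="none"/>
        </center>
        <right>
          <lane id="-1" type="driving"/>
          <lane id="-2" type="biking"/>
        </right>
      </laneSection>
    </lanes>
  </road>
</OpenDRIVE>
"""


def read_lanes(xml_text):
    """Map (road ID, lane ID) to the lane type, skipping the center lane."""
    root = ET.fromstring(xml_text.strip())
    lanes = {}
    for road in root.iter("road"):
        for lane in road.iter("lane"):
            lane_id = int(lane.get("id"))
            if lane_id != 0:  # lane 0 only marks the reference line
                lanes[(road.get("id"), lane_id)] = lane.get("type")
    return lanes


def same_direction_neighbors(lanes, key):
    """Lanes of the same road whose ID has the same sign as the given lane."""
    road_id, lane_id = key
    return sorted(
        other for other in lanes
        if other[0] == road_id and other != key and (other[1] > 0) == (lane_id > 0)
    )


lanes = read_lanes(ODR_SNIPPET)
neighbors = same_direction_neighbors(lanes, ("7", -1))
```

Whether a lane is actually usable by a given road user still depends on its type, so a full neighbor search would combine the sign check with a type filter.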
4.2.3 Dynamic Environment Representation
The dynamic environment representation encompasses all traffic participants and additional
objects not included in the map, such as parked cars or temporary road obstacles. In the
context of AD, these representations are derived from sensors like LIDAR or cameras. Hence,
prevalent techniques in robotics and AD involve constructing a DOG (Dynamic Occupancy
Grid) or spatio-temporal grids using a DAG (Directed Acyclic Graph) to represent dynamically
changing scenarios [59, 152]. These representations are effective when input data is
available as point clouds. However, by discretizing the information into grids, no semantic
Figure 4.3: Examples of lanes and lane geometry representations in Spider: (a) linear lanes;
(b) intersection lanes.
information remains. Furthermore, creating these dynamic grids during runtime incurs
substantial computational expenses [153, 154]. In the context of DS, all environmental data is
already available in an object format as perfect GT without any sensor noise. Consequently,
the following discussion primarily centers on representations derived from the object
level. As illustrated in Tables 2.1 and 2.2, the ego vehicle and other traffic participants are
typically characterized by their position and orientation within a global coordinate system,
along with their dimensions and dynamic attributes such as velocity and acceleration. All traffic
participants are classified, mostly following a simple distinction: passenger car, bus, truck,
cyclist, motorcycle, pedestrian [155]. Some researchers additionally include supplementary details
like indicator or brake light status [75]. Based on this fundamental scene representation,
information can be associated to extract additional context information, such as the distance
or velocity relative to the vehicle driving in front [45, 44] or other neighboring vehicles [46,
13]. As discussed in Section 4.1, the process of associating information in the context of AVs
is implicitly addressed through cost functions and extraction mechanisms when constructing
embeddings using NNs. Meanwhile, interpreting information within agent models is either
scenario-specific or not sufficiently described in published research.
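To make the object-level representation and the association step more tangible, the following sketch derives the gap and relative speed to the vehicle driving in front from global object states. It is a simplified assumption, not the feature generation of this thesis: the class `TrafficObject`, the function `lead_vehicle_features`, and the straight corridor used instead of a map-based lane assignment are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TrafficObject:
    """Object-level state in a global coordinate system."""
    obj_id: int
    obj_class: str   # e.g. "car", "cyclist", "pedestrian"
    x: float         # m
    y: float         # m
    heading: float   # rad
    speed: float     # m/s
    length: float    # m
    width: float     # m


def lead_vehicle_features(ego, others, corridor_half_width=1.75):
    """Associate ego and object states into gap and relative speed to the leader.

    Assumption: the ego lane is approximated by a straight corridor along the
    ego heading; a map-based lane assignment would replace this check.
    """
    cos_h, sin_h = math.cos(ego.heading), math.sin(ego.heading)
    best = None
    for obj in others:
        dx, dy = obj.x - ego.x, obj.y - ego.y
        lon = dx * cos_h + dy * sin_h    # offset along the ego heading
        lat = -dx * sin_h + dy * cos_h   # lateral offset, positive to the left
        if lon <= 0.0 or abs(lat) > corridor_half_width:
            continue  # behind the ego vehicle or outside the corridor
        gap = lon - 0.5 * (ego.length + obj.length)  # bumper-to-bumper distance
        if best is None or gap < best["gap_m"]:
            rel_speed = obj.speed * math.cos(obj.heading - ego.heading) - ego.speed
            best = {"lead_id": obj.obj_id, "gap_m": gap, "rel_speed_mps": rel_speed}
    return best


ego = TrafficObject(0, "car", 0.0, 0.0, 0.0, 10.0, 4.5, 1.8)
others = [
    TrafficObject(1, "car", 20.0, 0.5, 0.0, 8.0, 4.5, 1.8),      # leader
    TrafficObject(2, "car", 10.0, 5.0, 0.0, 12.0, 4.5, 1.8),     # other lane
    TrafficObject(3, "cyclist", -10.0, 0.0, 0.0, 5.0, 1.8, 0.6), # behind
]
features = lead_vehicle_features(ego, others)
```

Such hand-crafted associations are exactly the scenario-specific step criticized above: they work for car following but do not transfer to intersections without additional rules.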
4.3 Method: Dynamic Scene Representation Obtained by Prior Knowledge and Interpretation
The following method addresses the question of how to represent complex situations in traffic.
The challenge of scene representation concerns finding an appropriate level of abstraction to
formulate universal solutions while retaining sufficient information to meet the requirements
of functions reliant on environmental data. Schreier et al. emphasize the connection between
abstraction level and generalizability, highlighting the difficulty in achieving a suitable balance
[156]. A high level of informational content may introduce drawbacks in terms of runtime and
complexity, as the subsequent model becomes burdened with unnecessary data, while a low
level of informational content impedes the solution potential of the ensuing model [156].
Given the example of data-driven prediction models, generalizability is a primary challenge.
Meanwhile, humans are capable of anticipating and responding logically to novel situations, even
if they haven’t encountered those exact circumstances before. The cognitive psychological theory
of knowledge transfer can explain this ability. This theory encompasses three mechanisms:
analogical transfer, constraint violation, and knowledge compilation. Analogical transfer refers
to the capacity to establish a shared relational mapping from prior knowledge to handle new
problems. Constraint violation centers on learning from errors, while knowledge compilation
outlines the process of sequentially interpreting and optimizing rules from prior knowledge to
achieve specific objectives [157].
To enable situational understanding and thus SA within a driver model, an approach for a
general scene description inspired by analogical transfer and knowledge compilation is presented.
To overcome the identified shortcomings in common methods, algorithms are developed to
extract, associate, and interpret information for generating a context-rich representation of the
scene in the format of a feature vector. The universal feature vector can be used with various
modeling techniques and for different applications.
First, a generalized approach to describe complex urban traffic scenes that allows for linking
between different complex traffic scenarios is presented. Subsequently, novel algorithms are
developed to extract, identify, and associate all relevant information, resulting in a feature
vector that describes not only the raw time series but also the inherent relationships between
the ego vehicle and all surrounding traffic participants that could potentially affect the ego
vehicle’s behavior.
As a POC (proof of concept), the idea of the general scene description is implemented for a
real traffic dataset and later tested using a prediction model. With a focus on complex urban
traffic scenarios in this thesis, a dataset has been chosen that aligns best with the defined key
scenarios in Section 3.3.3. The selected dataset, known as inD [155], is an open-source dataset
capturing a bird’s-eye view of interactive urban traffic by drones. It includes shared spaces,
non-deterministic regulations, and interactions between vehicles and VRUs. This dataset
contains tracks data describing the spatio-temporal movement of all road users, tracksMeta
data outlining dimensions and classifications of road users, and the corresponding OpenDRIVE
map for the recorded location.
Drone data holds significant potential for capturing contextual information due to its bird’s-eye
perspective that shows the entire scenario around an ego vehicle. However, this potential
comes with trade-offs, as the bird’s-eye perspective may lack certain details that would be
accessible in an ego vehicle-perspective dataset, such as turning indicator states or brake lights
of other vehicles. Since this thesis aims to provide transferable solutions, and since it can
be assumed that AVs primarily receive information from the ego perspective, sophisticated
processing algorithms are developed to compensate for such deficiencies.
4.3.1 The General Scene Description
In order to improve the level of understanding as part of SA within a driver model, a general
representation of interactive traffic scenes is required. Considering the key scenarios outlined in
Section 3.3.3, different interactive traffic situations, from an ego perspective, all share common
essential aspects. The following characteristics, describing an interactive traffic scenario, aim
to enable a mapping between various complex situations in urban traffic:
1. The static environment:
Different areas (e.g. lanes) that are restricted for different types of traffic participants
(e.g. sidewalk or driving lane) or defined directions (e.g. left-turning lane or straight
lane).
2. The relevant dynamic environment:
A subset of the surrounding road users influencing the ego’s behavior or directly
interacting with the ego vehicle, in the following called PIP (Potential Interaction
Partners). PIP are characterized by static and dynamic information.
3. Conflict zones:
From the ego’s perspective regarding each PIP, there exists a spatial conflict zone
resulting from overlapping lanes or shared spaces.
4. Relationships:
Between the ego and each PIP, individual relationships can occur, for example the ROW
situation.
5. Relative movement:
For each potential conflict, the movement of the involved road users (PIP and
ego vehicle) relative to each other and relative to the conflict zone can be described.
To give an example, Figure 4.4 illustrates two typical but very different situations: a typical
ROW-regulated turning scenario (bottom) and a situation in which a vehicle is forced to move
to the oncoming traffic lane because of a static obstacle on the road (top). Both scenarios can
be described on the same basis that characterizes the PIP, their inherent relationship to the
ego vehicle, and their relative movement. In both situations, the ego vehicle (ID1, yellow) must
give the ROW to a finite number of interaction partners, and a spatial conflict zone results
from the overlap of the individual lanes or driving corridors. Thus, the exceptional situation
illustrated in the upper part of Figure 4.4 can be described by the same characteristics as the
typical turning maneuver. This enables situational understanding and allows both scenarios
to be addressed in one method without having to formulate each situation separately, which in
turn improves generalizability. To express the defined description and make it accessible for
models, information must be mapped into features. Based on the components of the general
scene description, five feature categories are introduced and summarized in Table 4.1. All
individual features associated with these categories that characterize a scene at time t are
provided in the Appendix in Table A.1.
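To illustrate how the components of the general scene description could be held in one structure, the following Python sketch groups them into small data classes. All class names, field names, and example values are assumptions made for illustration; the actual feature set is the one listed in Table A.1.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RoadUser:
    """Ego or partner features; field names are illustrative."""
    track_id: int
    classification: str              # e.g. "car", "pedestrian"
    dimension: Tuple[float, float]   # (length, width) [m]
    position: Tuple[float, float]    # (x, y) in a global frame [m]
    velocity: float                  # [m/s]
    acceleration: float              # [m/s^2]


@dataclass
class Interaction:
    """Relationship and conflict-zone features between the ego and one PIP."""
    partner: RoadUser
    relationship: str                # e.g. "EGO GIVE ROW"
    ego_pos_to_conflict: str         # e.g. "before", "inside"
    partner_pos_to_conflict: str


@dataclass
class Scene:
    """One scene at time t from a single ego perspective."""
    ego: RoadUser
    ego_maneuver: str                # turn direction of the ego lane course
    interactions: List[Interaction] = field(default_factory=list)

    def partners_with(self, relationship: str) -> List[int]:
        """Track IDs of all PIP that share the given relationship with the ego."""
        return [i.partner.track_id for i in self.interactions
                if i.relationship == relationship]


# Made-up values, only to show that one structure can describe any scenario.
ego = RoadUser(1, "car", (4.5, 1.8), (0.0, 0.0), 4.2, 0.0)
partner = RoadUser(2, "car", (4.5, 1.8), (12.0, 6.0), 8.0, 0.0)
scene = Scene(ego, "LEFT TURN",
              [Interaction(partner, "EGO GIVE ROW", "before", "before")])
yield_to = scene.partners_with("EGO GIVE ROW")
```

Because both the turning scenario and the obstacle scenario reduce to a list of such interactions, a downstream model would not need to know which of the two it is looking at.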
4.3.2 The Data Processing Concept
A multi-level strategy is required to obtain the characteristics outlined in Table 4.1 from
the data source. A toolchain is presented, beginning with the fusion of time series and map
data to derive contextual details related to the vehicles’ positions. This context information
provides knowledge about the lane on which each vehicle is driving. Leveraging this newly acquired lane
information in conjunction with information from the time series and map-based knowledge
of lane connections, potential interactions between road users can be identified, and the
[Figure 4.4 image; labels: ego features, partner features, relationship & relative movements]
Figure 4.4: Illustrating the idea of scenario mapping by the general scene description. The
ego vehicle (ID 1, yellow) interacts with different interaction partners once in a typical turning
maneuver and once in an exceptional situation resulting from an occupied lane.
Table 4.1: Feature information defined according to the general scene description.
CATEGORY | FEATURE INFORMATION
Ego vehicle features (E) | Classification, dimension, position, velocity, and acceleration of the ego vehicle
Map features (M) | Lane course description including turn direction, geometric information, curvature
Partner features (P) | Classification, dimension, position, velocity, and acceleration of all PIP
Interaction features (I) | Semantic and geometrical features describing the relationship and the relative movement of PIP and ego, and their movement relative to the conflict zone
above-mentioned interaction features can be calculated. In detail, the toolchain incorporates
the following steps:
1. Data Fusion: Identification of the lane on which vehicles are traveling, resulting in a lane
assignment for all vehicles at all time steps.
2. Plausibility Check: An automatic check of all lane assignments for logical plausibility.
The reasons why plausibility checks are necessary are explained in the following section.
3. Interaction Identification & Feature Calculation: The identification of PIP and calculation
of associated relationship features based on the data fusion from each ego perspective.
Please note that the scene description is always considered from the perspective of an ego
vehicle, describing the ego vehicle itself, the surrounding road users, and their relationship.
Algorithms processing information of road users surrounding the ego vehicle only receive
features available from this ego perspective to be transferable to AD.
4.4 Implementation
4.4.1 Implementation Step 1: Data Fusion
The data fusion assigns all vehicles, including passenger cars, motorcycles, trucks,
and buses, to the respective lane they are currently driving on. Only vehicles are assigned
to lanes since it can be assumed that VRUs do not strictly follow the lanes but move in
a more free-space manner. Algorithm 1 assigns the positions of all vehicle samples to the
map and returns a lane_assignment for each time step, extending the original tracks file. A
lane_assignment consists of one or multiple lanes defined by lane ID, section ID, section
type, and the respective s/t position of the vehicle within this lane.
Because the map representation is composed of lanes leading across the intersection, undefined
areas within the intersection area can result. An exemplary occurrence is illustrated in Figure
4.5. As a result, in some cases, the RP (Reference Point) of a vehicle is positioned within
such an undefined area, leading to no valid result from the assignment function. In order to
compensate for such cases, the bounding box of the respective vehicle at time t is calculated,
and the edge points are used for lane assignment.
Figure 4.5: Example of undefined areas within intersections. Recording location Heckstrasse, inD
dataset [155].
Algorithm 1 Initial Data Fusion Algorithm
Data: tracks, tracksMeta, map
Result: tracks_ex file with an additional column physical_assignment
for vehi in vehicles do
    for t in time do
        // row denotes the entry of tracks_ex belonging to vehi at time t
        // physical assignments are a structured numpy array containing the lanes
        // specified by lane ID, section ID, section type, and the s position of the vehicle
        phy_ass = get_assignment(pos_vehi^t, map)
        if length(phy_ass) > 0 then
            tracks_ex['physical_assignment'][row] = phy_ass
        else
            // if no physical assignment could be found, the bounding box of the
            // vehicle is calculated, and the edges are assigned to the map
            bbox = get_bbox(vehi, t)
            phy_ass_bbox = [ ]
            for edge in bbox do
                phy_ass_edge = get_assignment(pos_edge, map)
                append phy_ass_edge to phy_ass_bbox
            tracks_ex['physical_assignment'][row] = phy_ass_bbox
return tracks_ex
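The fallback logic of Algorithm 1 can be sketched in a few lines of Python. This is a simplified illustration under several assumptions: lanes are given as plain polygons keyed by a hypothetical lane ID, the assignment returns only lane IDs instead of the structured array with section ID, section type, and s position, and the point-in-polygon test uses ray casting, which the thesis does not specify.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, poly: Sequence[Point]) -> bool:
    """Ray casting: count how often a ray to the right of p crosses the polygon boundary."""
    x, y = p
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def get_assignment(p: Point, lanes: Dict[str, Sequence[Point]]) -> List[str]:
    """Return the IDs of all lanes whose polygon contains p."""
    return [lane_id for lane_id, poly in lanes.items() if point_in_polygon(p, poly)]


def bbox_corners(center: Point, heading: float, length: float, width: float) -> List[Point]:
    """Corner points of the oriented bounding box around the reference point."""
    cx, cy = center
    c, s = math.cos(heading), math.sin(heading)
    offsets = [(length / 2, width / 2), (length / 2, -width / 2),
               (-length / 2, -width / 2), (-length / 2, width / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def assign_with_fallback(center, heading, length, width, lanes) -> List[str]:
    """Assign by reference point; fall back to the bounding-box corners if that fails."""
    hits = get_assignment(center, lanes)
    if hits:
        return hits
    for corner in bbox_corners(center, heading, length, width):
        for lane_id in get_assignment(corner, lanes):
            if lane_id not in hits:
                hits.append(lane_id)
    return hits


# Two made-up lanes with an undefined gap between y = 3 and y = 4.
lanes = {"lane_a": [(0, 0), (10, 0), (10, 3), (0, 3)],
         "lane_b": [(0, 4), (10, 4), (10, 7), (0, 7)]}
direct = assign_with_fallback((5.0, 1.5), 0.0, 4.0, 1.8, lanes)
fallback = assign_with_fallback((5.0, 3.5), 0.0, 4.0, 1.8, lanes)
```

In the second call the reference point lies in the gap, so the corner points decide the assignment. This typically yields more than one lane, which is one reason why a subsequent plausibility check is needed.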
4.4.2 Implementation Step 2: Plausibility Check
In order to obtain situational interpretations, e.g. who is interacting with whom and which
relationship they share, a clear and unambiguous assignment of vehicles to a specific lane is
required. For a number of reasons, the data fusion can lead to ambiguous or non-plausible
results. Ambiguous results occur, for example, when a vehicle is located on the intersection,
where multiple lanes overlap. The function get_assignment() then returns multiple lanes
originating from various predecessor lanes entering the intersection and leading to different
destination lanes, as illustrated by way of example in Figure 4.6 (left). However, only a subset of the
assigned lanes is plausible with respect to the spatio-temporal motion of the vehicle. Therefore,
information from the past time series is used to check if the motion of the vehicle matches
the origin of each assigned lane. Since multiple lanes can still have the same origin, as the
yellow and orange lanes in Figure 4.6 (left) do, the check_plausibility() function additionally
compares the direction of the lane and the heading of the vehicle. By applying a threshold when
comparing heading and lane direction, one lane is identified as logically valid. When applying
these algorithms online to detect other road users from the ego perspective, the identified
lane another vehicle is following while located in the intersection area can be confirmed by
the indicator status of the respective vehicle if this signal is available. When the algorithm is
used offline to process data, such as preparing a database for training a prediction model or
data analysis, future information about the vehicle’s route can be used to distinguish between
logical and physical lanes more precisely in such cases.
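The comparison of vehicle heading and lane direction can be sketched as follows. The text does not state the threshold value or how angles are compared, so the 45° default and the wrap-around handling below are assumptions for illustration, and `pick_unique_lane` is a hypothetical helper that simply selects the lane whose direction is closest to the heading.

```python
import math
from typing import Dict, Optional


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in radians, in [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def check_plausibility(heading: float, lane_direction: float,
                       threshold: float = math.radians(45.0)) -> bool:
    """A lane is plausible if its direction deviates from the heading by at most the threshold."""
    return angle_diff(heading, lane_direction) <= threshold


def pick_unique_lane(heading: float, lane_directions: Dict[str, float],
                     threshold: float = math.radians(45.0)) -> Optional[str]:
    """Return the candidate lane best aligned with the heading, or None if none is plausible."""
    if not lane_directions:
        return None
    best = min(lane_directions, key=lambda k: angle_diff(heading, lane_directions[k]))
    return best if check_plausibility(heading, lane_directions[best], threshold) else None


# Made-up candidates sharing one origin: going straight (90 deg) or turning right (0 deg).
candidates = {"straight": math.radians(90.0), "right_turn": math.radians(0.0)}
chosen = pick_unique_lane(math.radians(80.0), candidates)
oncoming_ok = check_plausibility(math.radians(5.0), math.radians(185.0))
```

Handling the wrap-around explicitly matters because headings of 350° and 10° differ by only 20°, which a naive subtraction would report as 340°.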
In other situations, the lane assignment process can return a wrong lane from a logical point
of view. The right part of Figure 4.6 shows an example of a vehicle overtaking a cyclist. The
Figure 4.6: Example of ambiguous (left) and logically incorrect (right) lane assignments.
vehicle is moving with its RP in the oncoming lane (ID2) but logically follows the neighboring
lane (ID3). The vehicle would have to give way to the oncoming traffic in the orange lane
(ID2) and receives the ROW from vehicles approaching the intersection in the blue lane (ID1).
When using the lane that the ego vehicle is physically driving in (ID2), misleading conclusions
would be drawn about the relationship to other road users, such as giving way to participants
in the blue lane (ID1). Similar cases occasionally occur on bidirectional roads without lane
markings to separate the lanes. There, many vehicles tend to drive more in the center of the
road, which also leads to a physical assignment to the oncoming lane. Figure 4.7 provides
some real-world examples showing these typical occurrences. To address such implausibilities,
all lane assignments are separated into logical and physical levels in order to derive plausible
conclusions regarding interactions and relationships in various situations. For this purpose,
vehicle headings and lane directions are again compared for all assignments. In case the direction of
the assigned lane does not match the vehicle’s heading, neighbor lanes are considered. The
result of the plausibility check (Algorithm 2) is an extension of the tracks data with logical and
physical lane assignments for all vehicles.
4.4.3 Implementation Step 3: Interaction Identification
When identifying interactions between road users, a distinction is made between vehicles
and VRUs, since it can be assumed that vehicles follow lanes, while pedestrians or bicyclists
generally tend to move more freely in space. To identify vehicles with potential conflicts,
the relationship between different lanes is used. With the knowledge from the map, it
can be determined which lanes have potential conflicts with the current ego lane and what
relationship these lanes share. The conflict zone can be calculated by the spatial overlap
between the respective lanes. Therefore, to identify vehicles with potential conflicts, for
each ego vehicle perspective, all potential conflict lanes at time t can be identified by the
subset C_LANES = {c_lane_1, c_lane_2, ..., c_lane_i} of other lanes that have a spatial overlap
or potential conflicts with the ego lane. Potentially interacting vehicles P_VEH^t can then be
identified based on all vehicles present at time t, their lane assignment, and the subset of
relevant lanes from the ego perspective C_LANES. Vehicles without any relevant relation to
the ego are not further considered for the scene representation.
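The lane-based identification of potentially interacting vehicles amounts to a set intersection between each vehicle's assigned lanes and C_LANES. The following Python sketch shows this under the assumption that the map knowledge is available as a lookup from the ego lane to its conflict lanes and their relationship; the lane names, the lookup structure, and the function name are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def identify_vehicle_pips(ego_lane: str,
                          lanes_by_vehicle: Dict[int, Set[str]],
                          conflict_map: Dict[str, Dict[str, str]]) -> List[Tuple[int, str]]:
    """Return (vehicle ID, relationship) for every vehicle assigned to a conflict lane of the ego."""
    c_lanes = conflict_map.get(ego_lane, {})
    partners = []
    for veh_id in sorted(lanes_by_vehicle):
        # Intersect the vehicle's assigned lanes with the conflict lanes of the ego lane.
        for lane in sorted(lanes_by_vehicle[veh_id] & set(c_lanes)):
            partners.append((veh_id, c_lanes[lane]))
    return partners


# Hypothetical map knowledge: conflict lanes of one ego lane and their relationship.
conflict_map = {"ego_left_turn": {"oncoming_straight": "EGO GIVE ROW",
                                  "crossing_from_left": "EGO RECEIVE ROW"}}
# Hypothetical lane assignments of the vehicles present at time t.
lanes_by_vehicle = {1: {"oncoming_straight"},
                    2: {"unrelated_lane"},
                    3: {"crossing_from_left"}}
partners = identify_vehicle_pips("ego_left_turn", lanes_by_vehicle, conflict_map)
```

Vehicle 2 is dropped because none of its lanes conflicts with the ego lane, regardless of how close it is to the ego vehicle.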
Since VRUs are moving in a more free-space manner, other criteria for identifying potential
conflicts are required. To evaluate a VRU as a potential conflict partner, its distance and
Algorithm 2 Plausibility Check Algorithm to get unique and reasonable lane assignment
Data: Existing dataframe tracks_ex, new column logical_assignment, tracksMeta, map file
Result: tracks_ex file with two additional columns: logical_assignment, summary_assignment
foreach row in tracks_ex do
    vehi = row['trackId']
    t = row['frame']
    // physical assignments is a structured numpy array containing the lanes
    // specified by lane id, section id, section type, and the s position of the vehicle
    phy_ass = row['physical_assignment']
    if length(phy_ass) == 1 then
        lane = phy_ass[0]
        lane_direction = get_lane_dir(lane)
        // check_plausibility() compares lane direction and vehicle heading
        check = check_plausibility(heading_vehi^t, lane_direction)
        if check == TRUE then
            tracks_ex['logical_assignment'][row] = lane
        else
            // check_neighbours_lanes() searches for all neighbour lanes within
            // the same section and compares vehicle heading
            new_lane = check_neighbours_lanes(pos_vehi^t, lane, map)
            tracks_ex['logical_assignment'][row] = new_lane
    else if length(phy_ass) > 1 then
        // check_plausibility_multi() compares lane directions and vehicle
        // heading as well as lane origins and destinations to extract a unique
        // and logically valid lane assignment
        unique_lane = check_plausibility_multi(heading_vehi^t, phy_ass)
        tracks_ex['logical_assignment'][row] = unique_lane
    // get_sum_lanes() creates a structured numpy array of the assigned lanes with an
    // additional feature stating whether the lane is physically and logically valid,
    // only physically valid, or only logically valid
    sum_lane_ass = get_sum_lanes(tracks_ex['logical_assignment', 'physical_assignment'][row])
    tracks_ex['summary_assignment'][row] = sum_lane_ass
return tracks_ex
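The final step of Algorithm 2 merges the two assignment levels into one summary. A minimal Python sketch of this idea is given below; it uses a plain dictionary and string labels instead of the structured numpy array of the implementation, and both the labels and the signature are assumptions.

```python
from typing import Dict, Iterable


def get_sum_lanes(physical: Iterable[int], logical: Iterable[int]) -> Dict[int, str]:
    """Label every assigned lane as physically valid, logically valid, or both."""
    physical, logical = set(physical), set(logical)
    summary = {}
    for lane in sorted(physical | logical):
        if lane in physical and lane in logical:
            summary[lane] = "physical_and_logical"
        elif lane in physical:
            summary[lane] = "physical_only"
        else:
            summary[lane] = "logical_only"
    return summary


# Overtaking case from Figure 4.6 (right): RP in lane ID2, vehicle logically follows lane ID3.
overtaking = get_sum_lanes(physical=[2], logical=[3])
# Regular case: the vehicle drives in the lane it logically follows.
regular = get_sum_lanes(physical=[3], logical=[3])
```

Keeping both levels allows relationships such as the ROW situation to be derived from the logical lane, while the physical lane still records where the vehicle actually is.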
[Figure 4.7 image; legend: car, bicycle, ego trajectory, ego position, partner trajectory, partner position]
(a) Bendplatz: Example of a vehicle moving into the oncoming lane to pass a cyclist.
(b) Frankenburg: Example of a vehicle driving with its RP in the oncoming traffic lane on a
two-way road without lane markings.
Figure 4.7: Real-world examples causing implausible lane assignments.
heading relative to the assigned lane of the ego vehicle and respective successor lanes following
the ego route r_ego are considered. If the VRU is situated in or moving towards one of these lanes,
it is identified as a PIP. Whether the VRU is moving towards a relevant lane is determined
by checking the overlap of the lane polygons and a defined motion polygon representing the
VRU’s motion with some uncertainty range, as illustrated in Figure 4.8. The motion polygon
is initialized by the length s and the angle γ. The length s is set to 7 m for pedestrians and
14 m for cyclists. The value of s for pedestrians is based on the assumption that pedestrians
have an average walking velocity of 1.4 m/s [158], resulting in 7 meters for a time horizon of 5
seconds. The angle γ is set to 20° for pedestrians and 10° for cyclists. The angle and length for cyclists
and the angle for pedestrian motion polygons were determined empirically by examining the
dataset. Additionally, VRUs that have a Euclidean distance e smaller than a certain threshold
to the ego are identified as PIP, as it is assumed that ego vehicle behavior would still be
affected even without a specific relationship. The threshold for the Euclidean distance e was
set to 5 m. Following this approach, in the example scenario shown in Figure 4.8, the VRUs
with ID1 and ID3 would be identified as PIP, but not the VRU with ID2. The result of the
identification algorithm (Algorithm 3) is a list of PIP and their respective relationship to the ego vehicle.
The number of PIP that can be recognized by these algorithms is not limited. However, for
some applications, such as preparing data for training an NN, the number should be limited
because such models require a fixed number of input features. Based on the identified PIP
from an ego perspective, all information categories with associated features, introduced in
Table A.1, can be calculated. In order to describe the interaction between the ego vehicle and
another road user, interaction features (I) characterize the relative spatio-temporal
movement between ego and partner as well as the individual motion in relation to the conflict
zone. This information is augmented with the individual relationship between the road users.
Table A.1 summarizes all features characterizing one sample of the
scene representation from one individual ego perspective, with associated descriptions,
categories, and units. The presented algorithms can either be used to calculate those
features across an entire dataset or online at runtime to generate a scene representation for
modeling behavior since no future information is required in any algorithm.
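Limiting the number of PIP for models with a fixed input size can be sketched as follows. The text does not specify how partners are selected or how missing partners are encoded, so ranking by Euclidean distance to the ego, zero padding, and the validity mask are assumptions made for this illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fixed_size_features(ego_xy: Point,
                        partners: Sequence[Tuple[Point, Sequence[float]]],
                        max_partners: int,
                        n_features: int,
                        pad_value: float = 0.0) -> Tuple[List[float], List[int]]:
    """Keep the nearest partners, pad the rest, and return (flat features, validity mask)."""
    ranked = sorted(partners, key=lambda p: math.dist(ego_xy, p[0]))
    kept = ranked[:max_partners]
    flat, mask = [], []
    for _, feats in kept:
        if len(feats) != n_features:
            raise ValueError("every partner needs exactly n_features values")
        flat.extend(feats)
        mask.append(1)
    for _ in range(max_partners - len(kept)):
        flat.extend([pad_value] * n_features)
        mask.append(0)
    return flat, mask


# Two made-up partners given as (position, feature values); three input slots.
partners = [((10.0, 0.0), [1.0, 2.0]), ((3.0, 4.0), [3.0, 4.0])]
flat, mask = fixed_size_features((0.0, 0.0), partners, max_partners=3, n_features=2)
```

The mask lets a downstream model distinguish a padded slot from a real partner whose feature values happen to be zero.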
[Figure 4.8 image; labels: e, γ, d]
Figure 4.8: The concept for identifying potential interactions with VRUs. The yellow vehicle
represents the ego vehicle facing a scene with two pedestrians and a cyclist. Identification is based
on the relevant ego lanes (red), the motion polygons (blue), and the Euclidean distance e.
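The VRU check illustrated in Figure 4.8 can be sketched in Python as follows. The values for s, γ, and the distance threshold are the ones stated in the text. Everything else is an assumption made for illustration: the motion polygon is modeled as a triangle that opens by ±γ around the VRU heading, lane polygons are assumed to be convex so that a separating axis test suffices, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Values stated in the text: length s [m] and angle gamma per VRU class.
# s for pedestrians follows from 1.4 m/s over a 5 s horizon.
MOTION_PARAMS = {"pedestrian": (7.0, math.radians(20.0)),
                 "cyclist": (14.0, math.radians(10.0))}
E_THRESHOLD = 5.0  # Euclidean distance threshold e [m]


def motion_polygon(pos: Point, heading: float, s: float, gamma: float) -> List[Point]:
    """Triangle starting at the VRU position, opening by +/- gamma around the heading (assumed shape)."""
    x, y = pos
    right = (x + s * math.cos(heading - gamma), y + s * math.sin(heading - gamma))
    left = (x + s * math.cos(heading + gamma), y + s * math.sin(heading + gamma))
    return [(x, y), right, left]


def convex_overlap(a: Sequence[Point], b: Sequence[Point]) -> bool:
    """Separating axis test; only valid for convex polygons."""
    for poly in (a, b):
        for i in range(len(poly)):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
            axis = (y1 - y2, x2 - x1)  # normal of the current edge
            proj_a = [px * axis[0] + py * axis[1] for px, py in a]
            proj_b = [px * axis[0] + py * axis[1] for px, py in b]
            if max(proj_a) < min(proj_b) or max(proj_b) < min(proj_a):
                return False  # a separating axis exists, so there is no overlap
    return True


def is_vru_pip(kind: str, vru_pos: Point, vru_heading: float,
               ego_pos: Point, relevant_lanes: Sequence[Sequence[Point]]) -> bool:
    """A VRU is a PIP if it is close to the ego or its motion polygon overlaps a relevant lane."""
    if math.dist(vru_pos, ego_pos) < E_THRESHOLD:
        return True
    s, gamma = MOTION_PARAMS[kind]
    polygon = motion_polygon(vru_pos, vru_heading, s, gamma)
    return any(convex_overlap(polygon, lane) for lane in relevant_lanes)


# Made-up straight ego lane, 3.5 m wide, with the ego at the origin.
ego_lane = [(0.0, -1.75), (30.0, -1.75), (30.0, 1.75), (0.0, 1.75)]
approaching = is_vru_pip("pedestrian", (15.0, 6.0), math.radians(-90.0), (0.0, 0.0), [ego_lane])
leaving = is_vru_pip("pedestrian", (15.0, 6.0), math.radians(90.0), (0.0, 0.0), [ego_lane])
nearby = is_vru_pip("pedestrian", (3.0, 3.0), math.radians(90.0), (0.0, 0.0), [ego_lane])
```

A VRU standing inside a relevant lane is covered by the same overlap test, because the first vertex of its motion polygon is its own position.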
Algorithm 3 Interaction Identification Algorithm
Data: id (ego vehicle id), r_ego (ego vehicle route as a series of lanes), t (time frame), tracks_ex
with logical and physical assignment of all vehicles, map file
Result: C_PARTNERS (list of interaction partners with their respective relationship to
the ego vehicle)
// get_conflict_lanes() returns a series of conflict lanes that have potential conflicts
// with the lane(s) the ego vehicle is currently assigned to
C_LANES = get_conflict_lanes(lane_ego, map)
C_PARTNERS = [ ]
vehicles = tracks_ex[(type == vehicle) & (frame == t)]
VRUs = tracks_ex[(type == VRU) & (frame == t)]
for vehi in vehicles do
    lanes_vehi = get_sum_assignment(vehi, t)
    if any(lanes_vehi) in C_LANES then
        // the relation between vehicles equals the lane relation such as receive the
        // right of way, give the right of way, or merge
        lane_relation = get_lane_relation(vehi, C_LANES)
        vehi, lane_relation ⇒ C_PARTNERS
for vru_i in VRUs do
    // get_VRU_check() checks if any present VRU is in or moves towards
    // a relevant lane from the ego perspective
    in_ego_lane, towards_ego_lane, check_min_dist = get_VRU_check(pos_ego, r_ego, pos_vru_i, γ_vru_i, r_ego)
    if any[in_ego_lane, towards_ego_lane, check_min_dist] then
        // for the relation to VRUs, currently only one class called 'VRU_Interaction' is
        // implemented since there are no pedestrian crossings on the intersections
        vru_i, VRU_relation ⇒ C_PARTNERS
return C_PARTNERS
4.5 Results
The proposed concept provides algorithms for describing complex and interactive traffic scenes
through model-accessible features containing semantic and non-semantic information. Figure
4.9 shows some visual results of the algorithms applied to situations at different locations
from the inD dataset. First of all, one can observe the benefit of interaction detection: in
the scenario shown in Figure 4.9a, modern interaction identification approaches would lead
to misleading results. Using a simple distance measure would only identify a subset of the
relevant road users, while applying a larger radius would include a large amount of participant
information that does not affect the ego vehicle, such as the pedestrian with ID 294. The typical
neighborhood approach would also not provide a plausible result because interactions with
vehicle ID300 or ID299 could not be identified. The presented algorithms identified the vehicle
with ID293 and the pedestrians with ID294 and ID282 as not being relevant for the ego vehicle.
Also, in the scenario shown in Figure 4.9b, all road users except the car with ID414 could be identified
as not relevant to the ego vehicle. Figures 4.9c and 4.9d show the same scenario from
two different ego perspectives, demonstrating that the interaction identification works for various
scenarios and intersection typologies by detecting relevant road users.
In Figure 4.9e, again, the benefit of the identification algorithm compared to modern interaction
identification approaches is demonstrated. When simply employing a distance measurement,
bus ID46 would be identified as relevant even though it is turning left, while the presented
algorithms only identify car ID50 as a potential interaction partner through a following
maneuver.
Based on the knowledge of which road users are relevant from an individual ego perspective
and the associated lane information, the situational context can be described by the features
summarized in Table A.1. To give an example, Table 4.2 provides some of the semantic
features describing the interactions shown in Figures 4.9a and 4.9b. The features describe
the relationship, the current maneuver, and the positioning relative to the conflict of the ego
vehicle and its interaction partners. In scenario a (4.9a), the ego vehicle interacts with three
other vehicles. Vehicle ID284 is in front of the ego vehicle, so it affects the ego vehicle, but
there is no conflict in their relationship. Meanwhile, the ego vehicle has to give the ROW to
vehicles ID299 and ID300. For the interaction with vehicle ID300, ego and partner are still
before the conflict zone, while partner vehicle ID299 already entered the conflict zone.
The results show that through the proposed algorithms, it is possible to provide situational
understanding in a model by describing the context. Such context features can be used for
enhancing data-driven prediction, refining planning or heuristic decision models, or categorizing
and identifying situations for data analysis.
The results showed a potential for more understanding within a driver model, but the actual
added value of the proposed scene representation compared to current solutions should be
investigated quantitatively. In general, there is no possibility of validating a representation
independently without an application due to the absence of GT. Therefore, the extent to
which the proposed representation and individual feature categories provide added value with
respect to SA is investigated in the following chapter using a prediction model. The underlying
assumption is that high prediction accuracy and the ability of a model to transfer learned
patterns to new situations is a sign of a pronounced situational understanding. The method
of evaluating a representation with the help of a prediction model is inspired by a research
work investigating the effectiveness of new representation methods using a prediction model in
terms of traffic flow prediction [159].
Table 4.2: Semantic features describing scenarios a and b shown in Figures 4.9a and 4.9b.
FEATURE | Scenario a | Scenario b
Lon. velocity ego [m/s] | -0.01 | 4.20
Maneuver ego | LEFT TURN | LEFT TURN
Partner | ID: 284 | ID: 414
Relationship | NO CONFLICT | EGO GIVE ROW
Pos. ego to conflict | inside | before
Pos. partner to conflict | - | before
Maneuver partner | STRAIGHT | RIGHT TURN
Partner | ID: 299 |
Relationship | EGO GIVE ROW |
Pos. ego to conflict | before |
Pos. partner to conflict | inside |
Maneuver partner | LEFT TURN |
Partner | ID: 300 |
Relationship | EGO RECEIVE ROW |
Pos. ego to conflict | before |
Pos. partner to conflict | before |
Maneuver partner | STRAIGHT |
4.6 Limitations and Future Work
The proposed idea of enhancing SA by a content-rich scene representation is applied to an
open-source dataset as a POC. The implemented algorithms are limited to ROW-controlled
intersections based on the key scenarios defined in this thesis. However, the presented heuristics
can be easily extended to cover additional scenarios, such as zebra crossings or signalized
intersections. For example, to include signalized intersections, the algorithms for determining
relationships from the map must incorporate the traffic light status and an association of traffic
lights to lanes. To cover typical following and lane-changing scenarios on highway-like roads,
such as multi-lane sections, additional categories and heuristics for relationships and relative
motion must be included. Therefore, the idea of the general scene description presented here
can be extended both in the number of traffic participants to be considered and in the type of
relationships that occur.
Furthermore, the level of detail describing the environment is limited to input information
typically available to agent models in simulation or common object information provided by
sensors of AVs. As outlined in Chapter 3, communication in traffic plays a crucial role and can
be distinguished into implicit and explicit communication, whereby the implicit level describes
the spatio-temporal movement of road users, and the explicit level involves active gestures
[130]. The proposed algorithms only cover the implicit level of communication expressed through
motion due to data availability. Therefore, to cover all relevant levels of communication in
traffic, future research should address the challenge of how to capture and process signals such
as gestures and gaze directions, especially of VRUs.
The POC implementation is applied to data obtained from the bird’s-eye perspective and the
[Figure 4.9 image; panel legends:]
(a) Bendplatz (rec:14 frame:19181), ego id 285. Potential interaction: pedestrian 290, pedestrian 296, pedestrian 297, car 299, car 300, car 284. No interaction: car 293, pedestrian 294, pedestrian 282.
(b) Frankenburg (rec:23 frame:21308), ego id 415. Potential interaction: car 414. No interaction: pedestrians 392, 406, 408, 410, 412, 413.
(c) Aseag (rec:01 frame:12558), ego id 176. Potential interaction: car 175, car 177, car 178, car 180. No interaction: truck_bus 166, car 179.
(d) Aseag (rec:01 frame:12558), ego id 180. Potential interaction: car 175, truck_bus 176, car 177, car 178. No interaction: truck_bus 166, car 179.
(e) Heckstrasse (rec:32 frame:2571), ego id 47. Potential interaction: car 50. No interaction: truck_bus 45, truck_bus 46, car 48, car 49.
Figure 4.9: Illustration of different interactive situations of the inD dataset at the four different
locations. The ego vehicle is always drawn in red. The solid lines show the path driven so far, the
dotted lines represent the future path, and the circles mark the current position.
corresponding map. It can be assumed that an AV primarily collects information from an ego
perspective. However, it is expected that sensors on an AV would also be able to provide the
input information used in the presented algorithms, such as the position or velocity of other
road users. By applying common tracking and detection algorithms, the dynamic information
of other road users would be accessible, and the use of such map information is also common
in AV development.
Heuristics are applied to the data to identify interactions between road users. Using such
heuristic algorithms introduces more complexity into the data processing and is associated with
certain limitations. First, no heuristic is able to cover the entire range of potential combinations
arising from multi-participant interactions in real traffic. By iteratively developing and
extending the algorithms, the most important situations observed in real traffic data are
covered by the presented concept. However, since manual scanning of hours of data would
have exceeded the resources of this thesis, some blind spots might remain.
Lastly, semantic features that discretize individual scene information, such as the positioning
relative to the conflict zone, limit the continuity of the information, and details get lost.
However, a particular level of discretization to categorize situations is required to enable
knowledge transfer and the association of comparable situations.
4.7 Conclusion
In summary, the proposed method for generating a sophisticated and transferable scene
description applicable to various urban traffic scenarios shows high potential. Contextual
information is generated by fusing time series and map data, allowing SA in a model or
targeted data analyses.
Furthermore, state-of-the-art approaches for identifying relevant road users from the perspective
of an ego vehicle show particular weaknesses in such complex situations. The presented
algorithms for interaction detection address this problem and offer a promising solution for
distinguishing between road users that potentially influence the ego vehicle and those that can
be excluded.
Driving behavior in urban environments is strongly influenced by various external factors,
such as ROW context, road topology, and reactions of other traffic participants. Therefore,
the semantic context information and interaction identification provided by the fusion
algorithms form an indispensable basis for meaningful and reliable modeling and evaluation
approaches.
5 Anticipating the Intention of Other Road Users
Disclaimer: The present chapter involves research presented in the following publications:
[142]: Teresa Rock et al. “Data-Driven Prediction of Other Road Users’ Intention for Better Scene Understanding
in Traffic Agents”. In: Proceedings of the Driving Simulation Conference 2022 Europe VR. Ed. by Andras Kemeny,
Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation Association. Strasbourg, France, Sept. 15, 2022,
pp. 9–16.
[160]: Teresa Rock et al. “On the Way to Reliable Trajectory Prediction in Urban Traffic”. Advances in Transdisciplinary
Engineering 2023, Publication in progress.
The following chapter addresses research question R2: How to enable general and
transparent predictions in complex urban traffic?
The chapter is organized as follows. A short introduction illustrating the motivation for general
and transparent prediction models is followed by an overview of state-of-the-art solutions for
trajectory prediction in traffic. Subsequently, a two-fold methodology part is presented, first
aiming at evaluating the previously introduced scene representation by a model’s ability to
generalize to new situations. Based on insights obtained from the first method, a framework
for enabling the development of generalizable prediction models is proposed in the second
method part.
5.1 Introduction and Motivation
From a psychological point of view, anticipation is discussed in terms of SA for generating a
certain level of comprehension [129, 136] and is, therefore, the basis for enabling interactive and
spatio-temporally consistent behavior as discussed in Section 3.3.4. Furthermore, anticipating
the intention of other road users is a frequently addressed challenge in AD to provide a safe
and reasonable driving strategy [161]. However, generating reliable predictions given the
complexity and diversity of urban traffic remains an open challenge due to issues related to
transferability, overfitting of data-driven approaches, or insufficient transparency of black-box
models [81, 50].
For this reason, the challenges posed by the lack of transparency and transferability of
prediction models are addressed in the present chapter. Two methods are presented that
aim to create more transparency about learned patterns and remaining model weaknesses.
The first method evaluates the scene representation introduced in Chapter 4 with the aim
of improving the generalizability of a model by increasing situational understanding through
the more sophisticated representation of the driving scene, including semantic information of
interactions with other road users. In the second part, the effects of various conceptual modeling
choices on generalizability, starting with training data and ending with hyperparameters, are
investigated. Generalizability is measured by a novel evaluation methodology aiming at
generating transparency for efficient and reliable solutions.
5.2 State-Of-The-Art: Anticipation of Vehicle’s Intentions
In recent years, a major focus in research has been on developing prediction approaches for
highway scenarios. Owing to the higher complexity, only a limited number of researchers have
specifically investigated urban traffic scenarios [74, 83, 162]. Especially in complex urban traffic,
prediction is difficult because several road users interact and their behavior is influenced by various factors.
In order to provide a comprehensive overview, state-of-the-art solutions are distinguished by
different factors: the prediction method, the discretization of the behavior (model output),
and the considered input information associated with respective model structures.
5.2.1 Prediction Methods
Regarding the prediction method, Leon et al. distinguish between model-based and data-driven
prediction [68].
Model-based approaches rely on knowledge, more specifically on physical dependencies and
observable spatio-temporal relations. By observing vehicle dynamics over time, possible
maneuvers can be determined and their probability computed. The identified maneuver can
then be modeled with the respective motion model to predict a trajectory [163, 79].
Model-based methods perform well in short-term predictions, as physical measurements are
good indicators for motion patterns. However, the models are not able to make reasonable long-
term predictions since the resulting behavior relies on more complex relationships, including
spatio-temporal decisions of the driver, such as passing or yielding.
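As a minimal illustration of a model-based predictor, the following sketch extrapolates a vehicle position with a constant-velocity motion model. The function name, the state layout, and the one-second step are assumptions made for this example and are not taken from the cited approaches.

```python
def predict_constant_velocity(x, y, vx, vy, horizon_s=5, step_s=1.0):
    """Extrapolate the position (x, y) [m] with constant velocity (vx, vy) [m/s].

    Returns one (x, y) tuple per future time step up to horizon_s seconds.
    """
    steps = int(round(horizon_s / step_s))
    return [(x + vx * step_s * k, y + vy * step_s * k) for k in range(1, steps + 1)]


# A vehicle at the origin driving 2 m/s along the x axis.
future = predict_constant_velocity(0.0, 0.0, 2.0, 0.0)
```

Such an extrapolation only reflects the currently measured dynamics. It cannot represent a later decision to yield or to turn, which is why its error typically grows with the prediction horizon.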
Data-driven methods in this context are mostly based on black-box models that are inspired
by cognitive learning structures, such as NNs. Since model-based approaches are not suitable
given the complexity of urban traffic, the following sections focus on data-driven approaches
employing more promising methods. Table 5.1 provides an overview of the introduced categories
and related publications, which are discussed in more detail below.
5.2.2 Behavioral Discretization
Besides the prediction method, one can distinguish between different levels of behavioral
discretization into predictable variables. Some approaches predict the intention of other
road users as maneuvers [74] or actions [73]. In comparison, other researchers predict
the trajectory of other traffic participants directly by using their positions at certain future
time steps as labels [72, 75, 77, 79]. Furthermore, prediction can be performed either
deterministically, by outputting a single prediction result [164, 165], or probabilistically, by
providing a trajectory and a range of uncertainty [79, 166].
5.2.3 Influence Factors and Model Structures
Multiple influences may affect driving behavior in urban traffic. Thus, different approaches for
incorporating contextual information into prediction exist. Since trajectory prediction is mostly
formulated as a sequential problem, several approaches utilize recurrent network structures,
such as RNN (Recurrent Neural Network) or LSTM (Long Short-Term Memory) [72, 77, 78].
These networks derive a road user’s future motion from past temporal motion sequences.
Other promising concepts involve graph-based models such as a GNN (Graph Neural Network) or
a GCN (Graph Convolutional Network), as these structures offer great potential in representing
spatial dependencies between road users, can handle dynamic input sizes,
and are suitable for predicting the entire scene development instead of predicting each road user
individually [167, 168].
Often the problem of trajectory prediction is distributed among different types of NNs, such as
an MLP (Multi-layer Perceptron), a GNN, and an LSTM, that are combined into one large
prediction system [79, 167, 78]. Incorporating contextual information into the prediction
is especially relevant due to the various influence factors affecting behavior. Concepts vary in
terms of the information provided (e.g., static environment information on the map) and
the format in which this information is provided (on a semantic level, as raw data, or as
an embedding). Contextual influences, for example, describe the static environment, since
spatial movement in urban traffic can be highly dependent on the street layout. There are
different approaches to represent street maps on a feature level, including latent representations
obtained from a CNN or a VAE (Variational Autoencoder). Mo et al., for example, use a
CNN encoder to calculate an embedded representation of the map [77], whereas Schulz et al.
use directly interpretable features that describe the lane a vehicle is currently following by its
width and curvature [76]. As already mentioned, data-driven prediction models offer great potential
for such complex prediction tasks but lack explainability and generalizability. Therefore,
some researchers combine the two worlds of model-based and data-driven prediction by adding
prior knowledge to the learning process. Bahari et al. list different possibilities for injecting
knowledge, for example, by including vehicle dynamic models during learning to force the
model to predict only kinematically feasible trajectories [79].
Table 5.1: Summary and categorization of the state-of-the-art approaches to vehicle motion
prediction.
Category | Approach | References
Model architecture | Convolutional Networks: CNN | [169]
Model architecture | Recurrent Network Structures: RNN, LSTM | [170, 171, 166, 172, 72, 173, 174]
Model architecture | Graph-based Structures: GNN, GCN | [164, 175, 176]
Model architecture | TN (Transformer Network) | [177, 178, 179, 180, 181, 165]
Model architecture | Autoencoders: VAE | [182, 183]
Model architecture | Multi-model frameworks: LSTM-GNN, TN-MLP, GNN-RNN, LSTM-MLP | [184, 185, 84, 186, 187]
Input information | semantic representations for static environment | [188, 176, 169, 183, 76]
Input information | raw representations of static environment or embedding | [182, 170, 166, 189, 179, 186, 180, 190, 172, 77]
Input information | raw context information (position, dynamics of other road users) | [184, 164, 185, 175, 177, 191, 178, 72, 165]
Input information | semantic context information (interaction partners, relationship) | [142, 84, 173, 174]
Model output | maneuver prediction | [175, 74]
Model output | action prediction | [73]
Model output | deterministic trajectory prediction | [164, 185, 175, 177, 178, 186, 180, 72, 173, 174, 165]
Model output | probabilistic trajectory prediction | [79, 182, 170, 188, 166, 189, 191, 179, 84, 190, 169, 172, 183]
5.2.4 Model Evaluation Strategies
The evaluation of such data-driven models is a crucial part of the development, as black-box
models face problems associated with overfitting and weak generalizability while offering a low
level of transparency.
Overfitting describes the phenomenon in which a model learns the patterns presented in a training
dataset so well that it negatively affects the model’s ability to generalize. The phenomenon is
more likely when learning a loss function from complex non-parametric statistical dependencies,
such as used in NNs [192]. Overfitting, accuracy, and interpretability are related to
each other and to the level of complexity of the model [192]. More complex models tend to overfit
more and show less interpretability while providing a higher potential for accurate predictions
when facing complex problems [192]. Depending on the prediction problem to be solved, the
representativeness of the training data, and the modeling approach, an appropriate balance
between generalizability, interpretability, and accuracy must be found. Different metrics can
be applied to quantify model accuracy in terms of trajectory prediction. Table 5.2 shows a
selection of employed state-of-the-art metrics for evaluation. The most common metrics for
evaluating trajectory prediction are ADE and FDE, measured as the L2 distance between
the true and the predicted trajectory [193]. Some approaches employ variations of ADE
and FDE [184] or RMSE for measuring accuracy by the distance between predicted and GT
trajectory [171]. These metrics indicate how accurately the predicted trajectory matches the
individual human-driven trajectory and are averaged over a set of test data. However,
displacement errors cannot provide information on how functional or plausible the predicted
trajectory was. Therefore, in some individual cases, more sophisticated evaluation strategies
are applied, e.g., taking into account functional errors such as road violations [81] or unrealistic
headways [175] as summarized in Table 5.2.
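The two displacement metrics can be sketched in a few lines. The example assumes that the predicted and the GT trajectory are given as equally long lists of (x, y) positions at matching time steps; the function name is chosen for illustration only.

```python
import math


def displacement_errors(pred, gt):
    """Return (ADE, FDE) in the unit of the input coordinates.

    ADE is the mean L2 distance over all time steps,
    FDE is the L2 distance at the last time step.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(dists) / len(dists), dists[-1]


gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
pred = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 3.0), (5.0, 4.0)]
ade, fde = displacement_errors(pred, gt)
```

For a test set, both values would then be averaged over all samples. The example also shows the limitation named above: the two numbers say nothing about whether the deviating points lie on the road.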
Besides the chosen metric, another crucial aspect of the evaluation strategy is the selection
of test scenarios or test data. Most state-of-the-art
approaches test their models on a retained test split of the data. Only a few approaches test
their models on different datasets [177]. How close the test data are to the
training data is rarely addressed in publications.
Table 5.2: Summary of state-of-the-art metrics for trajectory prediction.
Metric | Explanation | Reference
ADE & FDE | Average Displacement Error & Final Displacement Error | [184, 182, 170, 171, 188, 166, 194, 82, 81, 191, 178, 179, 186, 180, 169, 172, 72, 187, 173, 174, 165, 183]
Variations of ADE & FDE | Normalized ADE & FDE, minimum ADE & FDE | [184, 166, 176, 177, 190]
RMSE | Root Mean Square Error | [171, 194, 164, 185, 175, 84, 172]
Negative headway distance occurrence | Occurrence of unrealistic states due to poor decision-making | [175]
Jerk sign inversion | Quantifies oscillations in the model’s acceleration predictions | [175]
MR (Miss Rate) | Proportion of unacceptable trajectories measured by a region of interest | [190, 183]
Off-road rate | Ratio of predicted trajectories not lying entirely in the driveable area of the map to the total number of predicted trajectories | [190]
EMD distance | Quantifies the amount of probability mass that has to be moved from the predicted distribution to match the true distribution | [81]
HOR (Hard Off-road Rate) | Percentage of scenarios that have at least one off-road prediction in the trajectory points | [81]
SOR (Soft Off-road Rate) | Percentage of off-road prediction points over all prediction points, averaged over all scenarios | [81]
DAC (Driveable Area Compliance) | Count of future trajectories within the driveable area divided by the number of all possible trajectories | [112]
TCC (Temporal Correlation Coefficient) | A high TCC value indicates that predictions cover the time-varying motion patterns well | [170]
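To make the two off-road metrics from Table 5.2 concrete, the sketch below computes HOR and SOR from a list of predicted trajectories. It follows the wording of the table rather than the implementation in [81], and the driveable-area test is passed in as a hypothetical callable, since the map representation is not specified here.

```python
def off_road_rates(scenarios, is_driveable):
    """Return (HOR, SOR) in percent.

    scenarios:    list of predicted trajectories, each a list of (x, y) points
    is_driveable: callable (x, y) -> bool, a stand-in for a real map lookup
    """
    hard = 0      # scenarios with at least one off-road point
    soft = 0.0    # sum of per-scenario off-road point fractions
    for trajectory in scenarios:
        off = sum(1 for x, y in trajectory if not is_driveable(x, y))
        if off > 0:
            hard += 1
        soft += off / len(trajectory)
    n = len(scenarios)
    return 100.0 * hard / n, 100.0 * soft / n


# Toy map: a straight lane of 3.5 m width along the x axis (assumption).
def in_lane(x, y):
    return 0.0 <= y <= 3.5


hor, sor = off_road_rates([[(0.0, 1.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 5.0)]], in_lane)
```

In this toy case one of two scenarios leaves the lane (HOR of 50 %), while only one of four points is off-road (SOR of 25 %), which illustrates why the two rates are reported separately.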
5.3 Prediction Concept, Problem Formulation, and Data Preparation
5.3.1 General Concept of Intention Prediction
The purpose of the following methods is to address the question of how to enable general and
transparent predictions in complex urban traffic. Since model-based approaches are less
suitable given the complexity of predicting the future behavior of other road users, a
data-driven approach is chosen. Due to known challenges associated with black-box models,
the following methods investigate how different conceptual choices affect the generalizability
of a data-driven prediction model. In this context, the effectiveness of the proposed scene
representation introduced in Chapter 4 is evaluated.
For anticipation, an NN is created that predicts the future trajectory of one vehicle based on
situational information from this vehicle’s ego perspective. In the following, the vehicle to
predict is named ego vehicle while all other potentially relevant road users that are present
in the scene are called partners. The trained model can later be used to predict all vehicles
in the scene. The prediction of VRUs is not further addressed, but all presented methods
aim to be transferable. The following sections explain the problem formulation and the data
preprocessing strategy used for both methods. Subsequently, the two individual methods
investigating the effect of different conceptual choices on generalizability are presented with
corresponding results, critical discussion, and conclusions.
5.3.2 Problem Formulation and Model Architecture
The problem of trajectory prediction is formulated as follows. At time $t$, the model predicts
the future trajectory $Y_t^i = Y_{t+1}^i, Y_{t+2}^i, \ldots, Y_{t+T_{pred}}^i$ for the next $T_{pred}$ seconds for one vehicle $i$
based on the current scene $X_t^i$, which describes the individual perspective of vehicle $i$. The
future motion $Y_t^i$ is a sequence of positions in a two-dimensional space: $Y_t^i = (x_t^i, y_t^i)$. The
prediction horizon $T_{pred}$ is set to 5 seconds.
For labeling, the positions of the respective vehicle after 1, 2, 3, 4, and 5 seconds in global XY
coordinates are used.
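The labeling step can be sketched as follows. The sketch assumes that a vehicle track is available as a list of global (x, y) positions per frame, that the recording runs at 25 frames per second, and that the ten label values are stored as interleaved x and y coordinates. All three points, as well as the function name, are assumptions for illustration.

```python
def build_label(track, t_index, frame_rate_hz=25, horizon_s=5):
    """Collect the positions 1..horizon_s seconds after frame t_index.

    track: list of (x, y) positions in global coordinates, one per frame.
    Returns [x1, y1, ..., x5, y5], or None if the track ends before the horizon.
    """
    label = []
    for second in range(1, horizon_s + 1):
        index = t_index + second * frame_rate_hz
        if index >= len(track):
            return None  # vehicle leaves the recording too early for a full label
        x, y = track[index]
        label.extend([x, y])
    return label


# Toy track: the vehicle moves 1 m per frame along the x axis.
track = [(float(k), 0.0) for k in range(200)]
label = build_label(track, t_index=10)
```

Samples for which no complete five-second future exists are dropped here by returning None; how the thesis treats such samples is not stated in this section.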
The current situation from an ego perspective of vehicle $i$ is represented as a concatenated
vector $X_t^i$ at time $t$. The vector $X_t^i$ includes features describing the ego vehicle $X_t^{iE}$
(written $X_t^{i_{ego}}$ in Equation (5.1)) and, depending on the setting to investigate, all features
describing the map $X_t^{iM}$, potential conflict partners with their state $X_t^{iP}$, and the individual
interaction with the ego $X_t^{iI}$ according to Equation (5.1) and definitions in Table A.1.
$$X_t^i = (X_t^{i_{ego}}, X_t^{iP}, X_t^{iI}, X_t^{iM}) \tag{5.1}$$
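The concatenation in Equation (5.1) can be sketched as below. The block lengths are an assumption: they are derived from the bracketed numbers next to the feature settings in Table 5.3 (E 13, EM 31, EMP 144, EMI 96, EMPI 209), read here as input vector lengths. The names `BLOCK_LENGTHS`, `EQ_ORDER`, `compose_input`, and `dummy_blocks` are hypothetical.

```python
# Assumed block lengths, derived from the bracketed sizes in Table 5.3.
BLOCK_LENGTHS = {"E": 13, "M": 31 - 13, "P": 144 - 31, "I": 96 - 31}
EQ_ORDER = ("E", "P", "I", "M")  # block order used in Equation (5.1)


def compose_input(blocks, setting):
    """Concatenate the feature blocks named in `setting`, e.g. "EMPI" or "EM"."""
    return [value for key in EQ_ORDER if key in setting for value in blocks[key]]


def dummy_blocks():
    """Zero-filled placeholder blocks with the assumed lengths."""
    return {key: [0.0] * length for key, length in BLOCK_LENGTHS.items()}


blocks = dummy_blocks()
sizes = {s: len(compose_input(blocks, s)) for s in ("EMPI", "EMP", "EMI", "EM", "E")}
```

Under this reading the partner block holds 113 values and the interaction block 65 values, so the full setting adds up to 13 + 18 + 113 + 65 = 209 inputs.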
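Partners include vehicles $X_t^{ip_{VEH}}$ and VRUs $X_t^{ip_{VRU}}$ according to Equation (5.2).
$$\begin{aligned}
X_t^{iP} &= (X_t^{ip_{VEH}}, X_t^{ip_{VRU}}) \\
X_t^{ip_{VEH}} &= \sum_{k=0}^{n} (X_t^{kP_{VEH}}), \quad k \in P_t^{VEH} \\
X_t^{ip_{VRU}} &= \sum_{j=0}^{m} (X_t^{jP_{VRU}}), \quad j \in P_t^{VRU}
\end{aligned} \tag{5.2}$$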
The model itself can be seen as an evaluation tool for investigating the effect of different
conceptual choices on generalizability and is not intended to outperform benchmark results.
Therefore, a simple model structure is chosen, and for training, common loss functions
and optimizers are used. Since this work aims to evaluate the effectiveness of the scene
representation, the training samples do not contain any past information, and no recurrent
structures were used in the model architecture. For architecture, a simple MLP is chosen,
incorporating four hidden layers consisting of 512, 256, 128, and 64 neurons and ten neurons
representing the output layer for the respective positions in XY for the next 5 seconds. Batch
normalization is used for stabilizing model training. The model structure is illustrated in
Figure 5.1.
[Figure 5.1: Input -> 512 -> Batch Normalization Layer -> 256 -> Batch Normalization Layer -> 128 -> Batch Normalization Layer -> 64 -> Batch Normalization Layer -> Output]
Figure 5.1: Structure of the base NN used for the two anticipation methods to investigate the
ability to generate generalizable predictions.
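To give a feeling for the size of this network, the following sketch counts its parameters. It assumes fully connected layers with bias terms, one batch-normalization layer after each hidden layer as drawn in Figure 5.1, four values per unit in each batch-normalization layer (scale, shift, running mean, running variance), and 209 inputs for the full feature setting. The input size and the function name are assumptions.

```python
def mlp_parameter_count(n_inputs, hidden=(512, 256, 128, 64), n_outputs=10):
    """Return (dense_parameters, batch_norm_parameters) for the described MLP."""
    dense = 0
    batch_norm = 0
    previous = n_inputs
    for units in hidden:
        dense += previous * units + units  # weights plus biases
        batch_norm += 4 * units            # scale, shift, running mean, running variance
        previous = units
    dense += previous * n_outputs + n_outputs  # output layer: 5 positions with x and y
    return dense, batch_norm


dense_params, bn_params = mlp_parameter_count(n_inputs=209)
```

With these assumptions the dense layers hold 280,650 parameters and the batch-normalization layers 3,840 values, of which only the scale and shift terms are trained.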
5.3.3 Data Preprocessing
The model input is provided as a concatenated feature vector composed of different feature
categories summarized in Table A.1. Regarding the data preprocessing, there are two main
aspects to mention.
First, the input has to be normalized such that an ML model can handle the data during
optimization. Therefore, all data is normalized into a value range between 0 and 1. To ensure
that no information is lost or distorted, maximal and minimal values are set manually, and
features sharing a common value base, e.g., headings, are normalized on the same basis. In
order to prevent squeezing or distorting spatial information, coordinate values are shifted into
a common value range.
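A possible reading of this normalization scheme is sketched below. The bounds in `BOUNDS` are made-up example values, since the manually chosen limits are not listed in this section, and dividing both coordinate axes by one common extent is an interpretation of "shifted into a common value range".

```python
# Hypothetical, manually fixed bounds; every heading feature shares one entry.
BOUNDS = {"heading_deg": (0.0, 360.0), "speed_mps": (0.0, 20.0)}


def normalize(feature, value):
    """Min-max scale `value` to [0, 1] using the fixed bounds of its feature type."""
    lo, hi = BOUNDS[feature]
    return (value - lo) / (hi - lo)


def normalize_xy(x, y, x_min, y_min, extent):
    """Shift coordinates to a local origin and scale both axes by the same extent.

    Using one extent for x and y keeps distances and angles undistorted.
    """
    return (x - x_min) / extent, (y - y_min) / extent


ego_heading = normalize("heading_deg", 90.0)
partner_heading = normalize("heading_deg", 270.0)
position = normalize_xy(50.0, 25.0, x_min=0.0, y_min=0.0, extent=100.0)
```

Because ego and partner headings use the same bounds, equal normalized values always correspond to equal physical headings.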
In urban traffic, each situation can involve a variable number of road users, while most
data-driven prediction models are based on a static number of input neurons. Accordingly, a
concept for handling a dynamic number of inputs is required. Therefore, all potential partner
vehicles $P_t^{VEH}$ and all interacting VRUs $P_t^{VRU}$ are sorted by their distance to the ego vehicle.
If there are more than $n$ or $m$ interaction partners, only the closest ones are considered. When
there are fewer interaction partners, empty features are set to NaN, effectively omitting them
from the training data. During preprocessing, NaN values are set to -1 to ensure the model
learns to ignore them since -1 is outside the normal feature range [0, 1]. A maximum of $n = 5$
interacting vehicles and $m = 4$ interacting VRUs is defined. The numbers were set empirically
with the intention of finding the best trade-off between avoiding a large amount of sparse
information and not losing relevant information. The data preprocessing can have a high
impact on model performance, and there might still be unused potential. A deeper analysis of
the effects of different representation options for such sparse features should be addressed in
the future.
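The handling of a variable number of partners can be sketched as follows. The data layout (a dictionary with a position and an already normalized feature list per partner) and the function name are assumptions made for this example.

```python
import math


def select_partners(ego_xy, partners, max_count, n_features, fill=-1.0):
    """Build a fixed-length partner block.

    partners: list of dicts with keys "xy" (position) and "features"
              (list of n_features values in [0, 1]).
    Returns a flat list of max_count * n_features values.
    """
    # Sort by distance to the ego vehicle and keep only the closest partners.
    ranked = sorted(partners, key=lambda p: math.dist(ego_xy, p["xy"]))
    block = []
    for partner in ranked[:max_count]:
        block.extend(partner["features"])
    # Empty slots are first marked as NaN ...
    missing = max_count - min(len(ranked), max_count)
    block.extend([float("nan")] * (missing * n_features))
    # ... and then replaced by a value outside the normalized range [0, 1].
    return [fill if math.isnan(value) else value for value in block]


partners = [
    {"xy": (10.0, 0.0), "features": [0.5, 0.5]},
    {"xy": (1.0, 0.0), "features": [0.1, 0.2]},
]
block = select_partners((0.0, 0.0), partners, max_count=3, n_features=2)
```

With `max_count=3` and two partners present, the closest partner comes first and the third slot is filled with -1, so the input length stays constant regardless of how many road users are in the scene.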
5.4 Method 1: Measuring the Effectiveness of the Scene Representation by Prediction Performance
5.4.1 Motivation
Generating accurate predictions in the variability of possible situations in urban traffic remains
an unsolved challenge, as both data-driven and model-based approaches perform well in
isolated situations but generalize poorly to new complex and interactive situations. This is a
major drawback because agent models only benefit from anticipation when enabled for various
common urban traffic scenarios. Therefore, this section addresses the challenge of improving
data-driven prediction models’ ability to generalize by adapting the cognitive-psychological
theory of knowledge transfer proposed in the previous Chapter 4. The effectiveness of the
proposed scene representation is measured by using the introduced prediction model. The
underlying assumption is that the ability of an NN to generalize to new and unknown situations
is associated with a high level of situational understanding. Based on research question R2,
the following sub-question is formulated:
R2.1: To what extent does the proposed scene representation improve the ability of a data-
driven prediction model to generalize and thus produce reasonable predictions for the variety
of complex situations encountered in urban traffic?
5.4.2 Concept
In order to investigate the ability of the model to generalize to new situations, two different
test levels for measuring performance are established. The presented model is trained under
weak conditions, i.e., only on a fragment of the available data that covers a single intersection
of the utilized dataset inD [155]. A subset of the data of the training intersection is
retained for testing the model’s performance in unseen situations in a known location. For
investigating generalizability, in a second level, the trained model is tested on a new, unseen
intersection that shows a significantly different topology.
The model is trained on different versions of the scene representation (EMPI, EMP, EMI, EM,
E) involving the proposed features introduced in Table A.1. Model performance is compared
on these different feature spaces to identify the benefit of the individual information categories
regarding the model’s ability to generalize. In this way, the effectiveness of the proposed
scene representation is critically evaluated. The described measurement concept is illustrated
in Figure 5.2.
[Figure 5.2: TRAINING with the feature settings EMPI, EMP, EMI, EM, and E, followed by TEST LEVEL 1 (retained situations from training location) and TEST LEVEL 2 (unseen intersection with new topology).]
Figure 5.2: Concept for measuring the effectiveness of the proposed scene representation by
training the NN with different feature compositions.
5.4.3 Implementation Details
The model is trained on the location Bendplatz of the inD dataset [155]. Recordings 8 - 17 are
used for training the model, resulting in 190,000 training samples, while recording 14 is used as
a validation set during training. Recording 7, comprising 9,000 samples, is retained and used
for testing at level 1, which represents data close to the training data since the situations are
unknown to the model, but the location is familiar. For investigating the ability to generalize,
the trained model is tested at a second location Heckstrasse of the inD dataset, which shows
a completely different intersection topology. The test data is obtained from recording 30
involving 32,000 samples. Both the training and test locations are illustrated in Figure 5.2.
The model is trained with five different feature settings to investigate the effectiveness of the
individual information categories: EMPI, EMP, EMI, EM, and E. All models are trained
with a batch size of 50 for 80 epochs. The following hyperparameters are used for training:
Adam optimizer with a default learning rate of 0.001, MSE (Mean Squared Error) as
loss function, and ReLU as activation function. The model is built with Keras [195]. For evaluating
the effectiveness of the scene representation, model performance is measured by the common
benchmark metrics ADE and FDE employing L2 distance [193].
5.4.4 Results
Table 5.3 summarizes the results for model performance on different feature spaces and both
test levels, measuring accuracy by ADE, FDE, and the relative decrease in accuracy compared
to the best setting. The results for unseen situations are in line with current benchmark results,
covering highly interactive driving scenarios, as shown in Figure 5.3. The results on unseen
situations (level 1) benefit from a smaller feature space considering only the ego vehicle (E) or
ego and its static environment (EM). However, when testing the model on increasing levels of
generalization (level 2), performance is significantly better on the feature spaces including
partner and relationship information (EMI, EMP, EMPI). The performance when excluding
situational context information decreases by up to 82%. The relative decline in performance
regarding the input variant only using ego features (E) compared to the other feature spaces
shows a clear benefit of the proposed scene representation in terms of generalizability. The
differences between the settings EMPI, EMI, and EMP are minor compared to the decrease in
performance when excluding dynamic situational information.
Table 5.3: Results for measuring the effectiveness of the proposed scene representation.
Feature setting | UNSEEN SITUATION (Level 1: Bendplatz) ADE [m] | FDE [m] | drop ADE [%] | NEW LOCATION (Level 2: Heckstrasse) ADE [m] | FDE [m] | drop ADE [%]
EMPI (209) | 2.5273 | 4.5856 | -11.09 | 9.0252 | 14.0349 | -2.99
EMP (144) | 2.3613 | 4.4491 | -4.84 | 8.7554 | 14.5594 | best
EMI (96) | 2.6479 | 4.5310 | -15.14 | 9.1117 | 16.7633 | -3.91
EM (31) | 2.2850 | 3.9870 | -1.66 | 27.8287 | 17.8915 | -68.54
E (13) | 2.2470 | 4.0779 | best | 50.3753 | 46.2537 | -82.62
Figure 5.3: Quantitative results in unseen situations (Level 1).
Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and
predictions of the model trained on E in blue. GT and predictions represent positions in next 5
seconds. For all trajectories: driven path: solid, future path: dotted.
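The "drop ADE" column of Table 5.3 is not defined explicitly in the text. The reported values are reproduced when the drop is computed as the difference between the best ADE and the ADE of a setting, divided by the ADE of that setting. The sketch below uses this inferred formula on the level 2 values; the function name is chosen for illustration.

```python
def relative_drop(ade_by_setting):
    """Return {setting: drop in percent} relative to the setting with the lowest ADE.

    Inferred formula: 100 * (best_ade - ade) / ade, so the best setting maps to 0.0.
    """
    best = min(ade_by_setting.values())
    return {name: 100.0 * (best - ade) / ade for name, ade in ade_by_setting.items()}


# ADE values in metres for test level 2 (Heckstrasse), taken from Table 5.3.
LEVEL_2_ADE = {"EMPI": 9.0252, "EMP": 8.7554, "EMI": 9.1117, "EM": 27.8287, "E": 50.3753}
drops = relative_drop(LEVEL_2_ADE)
```

Note that with this definition the error itself grows far more than the percentage suggests: the ADE of the ego-only setting is more than five times the ADE of the best setting.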
The differences in performance at the two levels of testing demonstrate the importance of a
deep and critical evaluation strategy when developing such black-box models. If the model is
tested only on data that is close to the training data, as presented in test level 1, misleading
conclusions may be drawn regarding which model setting would perform best in general.
In addition to objective measures, the qualitative results are investigated, and some examples are
provided in Figures 5.3 and 5.4. The examples in Figure 5.3 show some predictions of the
model trained on the full feature space (EMPI) and only on ego vehicle features (E) on test
level 1. Both models show similar performance and predict the future motion of the vehicle
well, even in highly interactive situations. Meanwhile, when investigating results on the unseen
intersection Heckstrasse in Figure 5.4, a significant difference in accuracy and plausibility of
results can be observed. Since the models were trained under weak conditions, both models
show high error values compared to benchmark results. However, the predictions of the model
trained on the feature space describing other road users and the individual relations (EMPI)
provide significantly better results in terms of accuracy and plausibility.
The two exemplary predictions in the upper row show high error values but still have a certain
plausibility in the prediction. In the top row, a wrong maneuver is predicted by the EMPI
model on the left side, and on the right side, the curvature of the predicted trajectory deviates,
while the predictions of the model trained on E seem to be random.
In the bottom row on the right side of Figure 5.4, the model trained on EMPI shows a reasonable
prediction, but the expected velocity is too low, while the model trained on E again shows
nonsensical predictions. Such effects of spatially accurate but temporally weak predictions can
be explained by the difference in mean driven velocity on the training location (Bendplatz:
17 km/h) compared to the test location (Heckstrasse: 26 km/h).
In addition to the benefit of the proposed scene representation, the situations and related predictions
show that the significance of the benchmark metrics ADE and FDE is limited.
Cases occur in which behavior deviates from the real trajectory but is still plausible, e.g., a
slower driven velocity, while other predictions with similar error values do not show any spatial
plausibility or enter non-driveable areas.
5.4.5 Limitations and Future Work
The results show a clear advantage of providing contextual information for the model’s ability
to generalize. However, the proposed scene representation exhibits a high dimensionality when
all information categories are included, and the precalculation of these features is associated
with computational effort. In addition, such high dimensionality can negatively affect learning
and demands a high number of training samples, a phenomenon known as the curse of
dimensionality [196]. Meanwhile, DL models are less affected by this phenomenon because of
the ability to compress the high dimensional input data into embeddings of lower dimensionality.
The critical dimensionality level depends strongly on the particular data distribution and the
prediction task [197].
The selected model structure, an MLP, only takes a static number of input neurons, while
trafc scenarios in urban trafc vary in the number of relevant road users. Therefore, a
strategy for limiting the maximal number of considered interaction partners and handling
sparse features for non-present partners was established. The way to process such sparse
features for normalization and the efect of a high number of sparse features on prediction
performance was not further investigated and might be associated with additional potential.
Furthermore, the output layer of the model represents the future position of the vehicle in the
57
5.4 Method 1: Measuring the Efectiveness of the Scene Representation by
Prediction Performance
Figure 5.4: Quantitative results in unseen situations on unknown intersection (Level 2).
Ego GT trajectory marked in orange. Predictions of the model trained on EMPI in cyan, and
predictions of the model trained on E in blue. GT and predictions represent positions in next 5
seconds. For all trajectories: driven path: solid, future path: dotted.
next five seconds, discretized at a one-second time interval. Due to this coarse discretization,
the predicted trajectory must be resampled into a more continuous motion depending on the
subsequent application. Furthermore, the dynamic feasibility of the predicted trajectories is
neither guaranteed nor investigated at this point.
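The resampling step mentioned above is not specified further in this work. As a minimal sketch, assuming plain linear interpolation between the one-second positions (the function name, the 0.1 s target step, and the example coordinates are hypothetical), it could look as follows:

```python
def resample_linear(waypoints, dt_in=1.0, dt_out=0.1):
    """Linearly interpolate (x, y) waypoints that are spaced dt_in seconds apart.

    Assumption: linear interpolation is used only for illustration; it does not
    make the resulting motion dynamically feasible.
    """
    steps = round(dt_in / dt_out)  # number of sub-steps per original interval
    dense = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        for k in range(steps):
            a = k / steps  # interpolation weight in [0, 1)
            dense.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    dense.append(waypoints[-1])  # keep the final predicted position
    return dense


# Hypothetical example: current position followed by five predicted positions.
coarse = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.5), (14.0, 2.0), (17.0, 4.5), (19.0, 8.0)]
dense = resample_linear(coarse)
```

A spline or a vehicle-model-based tracker may be more suitable when the subsequent application requires smooth velocities and accelerations.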
In this method, a simple model structure, namely an MLP, was used. It is assumed that the
advantage of providing situational information for model training is also present when using
other model structures with alternative architectures such as GNNs or RNNs. However, the
performance of a model with a different architecture cannot be determined independently
of the input data and representation, and therefore, it cannot be guaranteed that the same
positive effect size will be observed.
The evaluation is based on a two-level approach and shows an advantage of the proposed scene
representation in terms of generalizability. Meanwhile, the extent to which the model is able
to make meaningful predictions in other situations even further away from the training data
needs to be explicitly investigated due to the lack of transparency caused by the black-box
nature of the models.
When using a black-box approach for prediction, it is important to identify in which situations
the model has learned meaningful patterns and when it is producing unreasonable output to
ensure reliability. The qualitative results demonstrated that the benchmark evaluation metric
used is limited in significance as no conclusions regarding the plausibility of the respective
prediction can be derived, which should be addressed by a more sophisticated evaluation
method. Due to the black-box character of NNs, an extended evaluation strategy is required
to investigate effects on generalizability and to provide transparency regarding the potentials
and limitations of a trained model. Such evaluation strategies are mainly characterized by the
metric and the test data, which should cover a broad range of situations.
5.4.6 Conclusion
The present chapter provided a method for evaluating the effectiveness of a scene representation
by assessing the ability of a prediction model to generate reasonable predictions in situations
far away from the training data. To this end, the model was trained under weak conditions,
and the generalizability was assumed to indicate the level of SA caused by the provided input
representation.
The evaluation results revealed a clear benefit of the extended scene representation in terms
of enabling more SA in the prediction model and thus improving the ability to generalize.
The feature settings describing the interaction with other traffic participants (EMPI, EMI,
and EMP) significantly outperformed the settings without any dynamic context information
on level 2. The accuracy of the models trained on ego only (E) or ego and map information
(EM) decreased by up to 82% at test level 2. In particular, the qualitative results revealed
that even if the error values of the EMPI model, for example, were still high at test level 2,
the results showed significantly more plausibility and reliability compared to those of a model
trained on the E-feature space.
Furthermore, the evaluation demonstrated that by testing a model on data close to the training
data, misleading conclusions might be drawn regarding which model setting would perform
best in general. In addition, the qualitative results in particular revealed the limitation of
current benchmark metrics measuring model performance exclusively by the distance between
predicted and GT trajectory. To conclude, quantifying and improving the generalizability of an
NN is a complex task due to its black-box nature. Various conceptual choices, such as training
data, test cases, or training parameters, can affect model performance. Meanwhile, commonly
used evaluation metrics do not provide sufficient insight to derive reasonable decisions regarding
conceptual modeling choices. Therefore, the following methods address the challenges revealed
here.
5.5 Method 2: Reliable Trajectory Prediction in Complex Urban Traffic
5.5.1 Motivation
The common objective when developing a prediction model is to generate predictions that are
as accurate as possible across the variety of situations that occur in traffic. The number of
possibilities for addressing this challenge with commonly used data-driven models is immense,
starting with the concept of how and which situational information is provided across a wide
range of model architecture options and ending with the choice of learning parameters. This
richness of opportunities meets difficulties associated with data-driven approaches, namely
overfitting and lack of transparency. Black-box models provide little transparency so that
insights into accuracy can only be gained in explicitly tested situations. In addition, data-driven
models risk overfitting to the training data, resulting in poor results in unknown situations,
i.e., low generalizability. Furthermore, the representativeness of situations that appear in a
database is always limited compared to all potentially occurring scenarios in urban traffic.
Taking these facts together, one usually does not know what relationships the model has
actually learned, and both the evaluation and training of such models depend strongly on the
available datasets. Most state-of-the-art publications focus on problem-solving and introduce
novel concepts on how to generate accurate predictions for a given dataset. However, the
evaluation part is rarely addressed, and thus current research provides little insight into the
necessary requirements for practical use of the models, such as limitations and generalizability.
In order to address these challenges, the following section presents an advanced evaluation
method that provides insight into the ability of models to generalize and generate plausible
predictions even in exceptional situations. The multi-level evaluation method aims to provide
more transparency about learned patterns and allows for more reliable and efficient model
development. To do so, the evaluation method is applied to the previously presented model to
investigate the effects of different conceptual choices, namely differences in the coverage of
scene information, varying diversity in training data, and different learning parameters. Based
on research question R2, the following sub-questions are formulated:
R2.2: How to measure the generalizability of data-driven prediction models?
R2.3: How and to what extent is the generalizability of a data-driven prediction model affected
by differences in the coverage of scene information, homogeneity in training data, and various
learning parameters?
R2.4: Is it possible to combine real and synthetic traffic data samples to compensate for
underrepresented situations in datasets in the future?
5.5.2 Concept
The two key aspects of evaluating a data-driven prediction model involve the metric and the
choice of test data. Most publications employ spatio-temporal error measures, which indicate
how well the predicted trajectory matches the individual human-driven trajectory but do not
consider the situational context when evaluating a trajectory. Using displacement errors cannot
provide information on how functional or plausible the predicted trajectory was. According to
results from Section 5.4, cases occur in which behavior deviates from the real trajectory but is
still plausible, e.g., slower driven velocity or a longer time gap when turning without becoming
critical. At the same time, other predictions with similar error values enter non-driveable
areas. Consequently, common error measures are not able to distinguish between false-bad
and plausible-bad trajectories and vice versa. However, it would be crucial for developing or
tuning prediction models to identify the situations in which the model produces non-plausible
or non-functional results.
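The limitation described above can be illustrated with a small sketch. The coordinates are invented for illustration only: one hypothetical prediction stays on the path but drives slower, another keeps the speed but drifts sideways, which on a lane only a few meters wide would leave the drivable area. Both receive identical displacement errors.

```python
import math


def displacement_errors(pred, gt):
    """Return (ADE, FDE) in meters for two equally long lists of (x, y) points."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]  # L2 distance per time step
    return sum(dists) / len(dists), dists[-1]


# Hypothetical ground truth: straight driving at 5 m/s, one point per second.
gt = [(5.0, 0.0), (10.0, 0.0), (15.0, 0.0), (20.0, 0.0), (25.0, 0.0)]
# Hypothetical prediction A: same path, but slower (4 m/s).
slower = [(4.0, 0.0), (8.0, 0.0), (12.0, 0.0), (16.0, 0.0), (20.0, 0.0)]
# Hypothetical prediction B: same speed, but drifting 1 m sideways per second.
drifting = [(5.0, 1.0), (10.0, 2.0), (15.0, 3.0), (20.0, 4.0), (25.0, 5.0)]

errors_slower = displacement_errors(slower, gt)
errors_drifting = displacement_errors(drifting, gt)
```

Both calls return an ADE of 3 m and an FDE of 5 m, so the displacement errors alone cannot separate the two cases.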
The Evaluation Metric:
In order to address this gap, inspired by literature [79, 81], a simple plausibility metric is
formulated consisting of two categories to evaluate the plausibility of predictions:
• spatial evaluation: maximal path deviation, road violation, maximal road violation
• temporal evaluation: collision check, minimum distance to other traffic participants
To measure plausibility, a percentage score S_P is calculated based on these five components.
Binary aspects, such as collision or road violation checks, return True or False, interpreted as 0
or 1. All distance measures are divided into bins and mapped to plausibility values between 0
and 1. The final plausibility score S_P is the average of the individual components. To further
investigate spatial accuracy, the percentage of predictions with path deviations greater than 5
meters is measured. Based on the plausibility metric in conjunction with the common metrics
ADE and FDE, model performance, involving accuracy and plausibility, can be evaluated
against test data.
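The plausibility score described above can be sketched as follows. This is only an illustration under stated assumptions: the bin edges, the plausibility value assigned to each bin, and all function names are hypothetical, since the concrete binning is not listed here. It is further assumed that a detected collision or road violation maps to a component value of 0.

```python
from bisect import bisect_right

# Hypothetical bins: edges in meters and the plausibility value of each bin.
DEVIATION_BINS = ([0.5, 1.0, 2.0, 5.0], [1.0, 0.75, 0.5, 0.25, 0.0])  # smaller is better
CLEARANCE_BINS = ([0.5, 1.0, 2.0], [0.0, 0.25, 0.5, 1.0])            # larger is better


def binned(value, bins):
    """Map a distance to the plausibility value of the bin it falls into."""
    edges, scores = bins
    return scores[bisect_right(edges, value)]


def plausibility_score(max_path_dev, road_violation, max_road_violation,
                       collision, min_distance):
    """Average the five components and return a percentage score S_P."""
    components = [
        binned(max_path_dev, DEVIATION_BINS),        # spatial: maximal path deviation
        0.0 if road_violation else 1.0,              # spatial: road violation (binary)
        binned(max_road_violation, DEVIATION_BINS),  # spatial: maximal road violation
        0.0 if collision else 1.0,                   # temporal: collision check (binary)
        binned(min_distance, CLEARANCE_BINS),        # temporal: minimum distance to others
    ]
    return 100.0 * sum(components) / len(components)


good = plausibility_score(0.3, False, 0.0, False, 3.0)
bad = plausibility_score(6.0, True, 1.5, True, 0.2)
```

With these hypothetical bins, the first call describes a prediction that stays on the road with sufficient clearance, and the second one a prediction that leaves the road and collides.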
The Test Levels:
Since data-driven models are based on black-box approaches, transparency in terms of model
generalizability and reliability is achieved by applying a trained model to test data. Accordingly,
the selection of test data is crucial for the significance of the evaluation. Therefore, a multi-level
method is presented for critical evaluation, involving four levels of test data, as illustrated in
Figure 5.5. The four levels present different challenges in terms of generalizability, as they
include situations that are increasingly far from the training data. The levels start with unknown
situations at locations shown during training (L1), continue with new locations from real traffic
data (L2a) and new locations from synthetic traffic data (L2b), and end with an exceptional
situation (L3), in which the ego vehicle has to pass a static obstacle with oncoming traffic. For
the L2a, L2b, and L3 levels, model performance is measured by plausibility and accuracy.
Variations to Investigate:
As the objective is to gain insight into how different settings affect model performance and
which concepts contribute to improved generalizability, the following variants are investigated:
Three variants of training data (homogeneity) are combined with six variants of input
feature settings (coverage of scene information). In addition, one of these models is trained with
four different learning parameter sets to investigate the effect of tuning compared
to more application-specific decisions. In total, this results in 22 models to be evaluated.
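The count of 22 models follows from 3 × 6 + 4. A short sketch makes the enumeration explicit. The label "default" is a hypothetical placeholder for the standard learning parameters, and the tuning variants are attached to the EMPI model trained on T2, as stated in the results section.

```python
from itertools import product

training_sets = ["T1", "T2", "T3"]
feature_settings = ["EMPI", "EMP", "EMI", "EI", "EM", "E"]
tuning_ids = [1, 2, 3, 4]

# 3 training sets x 6 feature settings, all with the standard learning parameters.
grid = [(t, f, "default") for t, f in product(training_sets, feature_settings)]
# Four learning parameter sets applied to one model (EMPI trained on T2).
tuned = [("T2", "EMPI", f"ID{i}") for i in tuning_ids]

models = grid + tuned
```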
For the input feature settings, the variations (EMPI, EMP, EMI, EI, EM, E)
presented in Section 5.4 are selected. Regarding the level of variety in the training data
(homogeneity), several factors have to be considered since the performance of data-driven
prediction models is strongly dependent on the data presented during training. On the one
Figure 5.5: The evaluation concept involving different test levels and metrics to quantify
generalizability. Test levels ordered by the required ability to generalize: L1, unknown
situations (same location); L2a, unknown intersection (real data); L2b, unknown intersection
(synthetic data); L3, exceptional situation (drive into the oncoming traffic lane to overtake a
static obstacle). Metrics per model setting: GT accuracy (ADE & FDE) and plausibility.
hand, it is crucial that the training data represent the domain of application as thoroughly as
possible, i.e., representativeness. On the other hand, the homogeneity of the training data
was identified to have a significant influence on model training and generalizability [198].
Considering representativeness, in data-driven modeling, one often faces the problem that
some exceptional situations are underrepresented for adequate training. However, the creation
of new data, especially in exceptional situations, is either costly or not possible due to the
rarity or criticality of events in everyday traffic. As a result, it would be beneficial if it were
possible to augment existing datasets with manually defined situations that are known to be
underrepresented. The present methodology investigates how the level of variability in training
data affects the model’s ability to generalize. In addition, real traffic data is combined with
synthetic traffic data from simulation to investigate the possibility of augmenting existing real
datasets through synthetic samples from simulation. This results in the following levels of
training data:
T1: Low variability: synthetic traffic data for training.
A simulation framework is used to create synthetic traffic data. Due to the limited possibility
of individualizing a driver model, less diversity in behavior occurs.
T2: Medium variability: real traffic data for training.
For real traffic data, an open-source dataset covering typical interactive urban situations is
selected.
T3: High variability: a combination of real and synthetic traffic data.
5.5.3 Implementation Details
Data for Training and Testing:
For training and evaluation, the open-source drone dataset inD (https://ind-dataset.com/) is
used to represent real traffic [155]. The dataset includes recordings of four German
unsignalized intersections
called Aseag, Bendplatz, Frankenburg, and Heckstrasse, displayed in Figure 5.6, and is further
described in Section 4.3.
For the medium level of variability in training data (T2), models are trained only on real data,
and recordings from Aseag, Bendplatz, and Frankenburg were selected for training, resulting in
900,000 training samples. For the evaluation at level L1 (unknown situations), one recording
from each location was retained for testing and one as a validation set.
To represent a low level of variety in training data (T1), data on four synthetic intersections
was created with the help of the simulation framework Spider at BMW [151, 55]. The
intersections were chosen to resemble those represented in
the inD dataset, involving different complex intersections and merging topologies, as shown in
Figure 5.6. Since all drivers in the simulation are based on the same heuristic agent model but
use different parameters, overall behavior is more homogeneous compared to real traffic data.
For the low level of variability in training data (T1), models are trained on intersections 1, 2,
and 3. Randomly selected vehicle IDs were retained for the validation set during
training and for testing on L1 (unseen situation). It has to be noted that the synthetic data
only contains vehicles and no VRUs. The synthetic data is recorded at the same sampling rate
and shows the same characteristics as the inD dataset.
The high degree of variability (T3) in the training data is achieved by combining synthetic
and real data. For this purpose, the real traffic data of Bendplatz and Aseag are combined
with the synthetic data of intersections 1 and 3. The data is combined in such a way that
synthetic and real data have a distribution of 50:50. For comparison, all training sets were set
to have a similar number of samples. For evaluating at the L2 level, data from one real (L2a)
and one synthetic (L2b) location were used. For L2a, the models were tested on recording 30
of the inD dataset, which represents a new intersection from reality (Heckstrasse). For L2b,
data from a different four-armed intersection was created (isec 4).
The simulation framework was used to create a special scenario in which the path of the ego is
occupied by an obstacle on a two-lane road with oncoming traffic to test model performance
in an exceptional situation (L3). For data collection on L3, the vehicle is controlled by a real
human in simulation. Pictures of all test and training locations are shown in Figure 5.6, and
all data is processed according to the methods presented in Chapter 4 and Section 5.3.3.
Model Input Variations (coverage of scene information):
For input variations, the same settings as presented in Section 5.4 employing different feature
spaces are used: EMPI: 209 features, EMP: 144 features, EMI: 96 features, EI: 78 features,
EM: 31 features, and E: 13 features.
Training and Model Parameters:
For the training process of all models with different feature settings or varying training
data, the following learning parameters were set: the Adam optimizer with a default learning
rate of 0.001, MSE as the loss function, and ReLU as the activation function, using Keras for
model building [195]. In order to investigate the influence of parameter tuning relative to
changes in features and training data, some variations were investigated, namely the choice of
the loss function, activation function, optimizer, and batch normalization shown in Table 5.4.
All models were trained with a batch size of 50 for a maximum of 80 epochs using early stopping
with a minimum delta of 0.00001 and a patience of 15 epochs.
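To illustrate how the minimum delta and the patience interact with the epoch limit, the following sketch re-implements the stopping rule in plain Python. It is a simplified stand-in, not the Keras callback itself, which offers further options such as the monitored quantity or restoring the best weights. The function name and the validation loss curves are hypothetical.

```python
def stopping_epoch(val_losses, min_delta=1e-5, patience=15, max_epochs=80):
    """Return the epoch (1-based) after which training stops.

    An epoch counts as an improvement only if the validation loss falls below
    the best value so far by more than min_delta. Training stops once
    `patience` consecutive epochs bring no improvement, or at max_epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


# Hypothetical loss curve: improves for 10 epochs, then stagnates.
plateau = [1.0 - 0.05 * i for i in range(10)] + [0.55] * 70
# Hypothetical loss curve: keeps improving, so the epoch limit applies.
improving = [1.0 / (i + 1) for i in range(100)]

stop_plateau = stopping_epoch(plateau)
stop_improving = stopping_epoch(improving)
```

For the first curve, training stops 15 epochs after the last improvement, whereas the second curve runs until the limit of 80 epochs.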
Figure 5.6: Illustration of the data concept including three levels of homogeneity of training
data (T1 - T3) on the left-hand side and different test data (L2 - L3) on the right-hand side. Real
traffic examples are provided by the inD dataset [155], while synthetic driving data is obtained by
simulation. As an additional test case, an exceptional scenario is designed using the simulation
framework and a human driver (L3).
Panels, variety in training data: T1, training on synthetic data (isec 1, isec 2, isec 3); T2,
training on real data (Bendplatz, Frankenburg, Aseag); T3, training on combined data, real and
synthetic (Bendplatz, Frankenburg, isec 1, isec 3). Panels, multi-level testing (identical for
T1 to T3): L2a, real test data (Heckstrasse); L2b, synthetic test data (isec 4); L3, edge case.
Table 5.4: Tuning parameter sets for investigating the effect of hyperparameter tuning [195].
ID  Batch Norm.  Loss Function              Optimizer  Activation
1   True         MAE (Mean Absolute Error)  sgd        sigmoid
2   True         MSE (Mean Squared Error)   sgd        sigmoid
3   True         MSE (Mean Squared Error)   adam       relu
4   True         MAE (Mean Absolute Error)  adam       relu
Metric Calculation:
The evaluation method presented earlier returns the following measurements for each model:
• Overall score S_O, overall accuracy score S_ACC, overall plausibility score S_P, and overall
ADE & FDE calculated across all test levels
• ADE & FDE individually on test data of L1, L2a, L2b, and L3
• S_P individually on test data of L2a, L2b, and L3
In order to evaluate the accuracy of the proposed models, ADE and FDE are
calculated using the L2 distance according to the general state-of-the-art [193]. Model
performance is measured as a combined score considering accuracy and plausibility, resulting in
S_O. For the model accuracy evaluation S_ACC, ADE and FDE are converted to a score under
consideration of benchmark results following Equation (5.3), where ADE_B = 2 m and
FDE_B = 5 m. The individual values were chosen according to results in the analysis of Bahari
et al. [79].
\[ S_{ACC} = \left( \frac{ADE_B}{ADE} + \frac{FDE_B}{FDE} \right) \cdot \frac{1}{2} \cdot 100 \tag{5.3} \]
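As a worked example of Equation (5.3), the following sketch converts the overall errors reported in Table 5.5 for the EMPI model trained on T3 (ADE = 3.11 m, FDE = 6.50 m) into an accuracy score. The function name is hypothetical, and the result matches the tabulated 70.63% only up to rounding of the inputs.

```python
ADE_B = 2.0  # benchmark ADE in meters
FDE_B = 5.0  # benchmark FDE in meters


def accuracy_score(ade, fde, ade_b=ADE_B, fde_b=FDE_B):
    """Accuracy score S_ACC in percent following Equation (5.3)."""
    return (ade_b / ade + fde_b / fde) / 2 * 100


# Overall ADE and FDE of the EMPI model trained on T3 (Table 5.5).
s_acc = accuracy_score(ade=3.11, fde=6.50)
```

Since the benchmark values appear in the numerator, a model that reaches exactly the benchmark errors scores 100%, and the formula as written is not capped for errors below the benchmark.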
For the calculation of the overall accuracy S_ACC, all displacement errors are combined, while
the displacement errors of L2a, L2b, and L3 are weighted double to assign a higher priority to
results on data further away from training. The total ADE is calculated according to Equation
(5.4). Total FDE is calculated accordingly.
\[ ADE = \frac{ADE_{L1} + 2 \cdot (ADE_{L2a} + ADE_{L2b} + ADE_{L3})}{7} \tag{5.4} \]
The score for the final model performance S_O is the calculated mean of plausibility S_P and
accuracy score S_ACC.
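The aggregation can be checked against the EMPI model trained on T3 in Table 5.5. The sketch below, with hypothetical function names, applies the weighting of Equation (5.4) to the per-level errors and averages the tabulated plausibility and accuracy scores.

```python
def weighted_error(l1, l2a, l2b, l3):
    """Total displacement error following Equation (5.4): L2a, L2b, L3 count double."""
    return (l1 + 2 * (l2a + l2b + l3)) / 7


def overall_score(s_p, s_acc):
    """Overall score S_O as the mean of plausibility and accuracy score."""
    return (s_p + s_acc) / 2


# Per-level values of the EMPI model trained on T3 (Table 5.5).
total_ade = weighted_error(1.33, 3.16, 3.35, 3.71)
total_fde = weighted_error(2.70, 6.58, 7.50, 7.32)
s_o = overall_score(s_p=73.03, s_acc=70.63)
```

The three results reproduce the tabulated overall values of approximately 3.11 m, 6.50 m, and 71.83%.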
5.5.4 Results
All results for all model variants are provided in Tables 5.5 and 5.7, whereby Table 5.5 shows
the results for all different feature settings and levels of variability in training data. Table 5.7
provides the evaluation of different learning parameters for models trained on the full feature
space EMPI on the real dataset (T2).
5.5.4.1 Influence of Homogeneity in Training Data and Coverage of Scene Information
The results show a clear benefit of more variability in training, as models trained on T3
provide the best results for S_O, S_P, and S_ACC across all test levels, as illustrated in
Figure 5.7 (right). In terms of plausibility and accuracy at different test levels, T3 either
outperforms the other data variation settings or shows similarly accurate results. Models
trained on T1 show significantly higher error values when testing on unseen real data (L2a) and
in the edge case (L3), as illustrated in Figure 5.7 (left) and Table 5.5. Meanwhile, the low
level of variability in the behavior data of T1 is beneficial when predicting clean synthetic
behavior on test level L2b compared to the models trained on real data (T2). However, even on
the clean synthetic behavior data of L2b, the models trained on T3 outperform models trained
on T1. This indicates potential in combining real and synthetic data for handling
underrepresented situations in the future. When considering the overall plausibility scores S_P
of all models, a model trained on T3 shows the best overall plausibility score of 73%, followed
by a model trained on T2 with 72%, while models trained on T1 only reach a maximum of 66% as
summarized in Table 5.5.
Regarding variations in the feature settings, results show that contextual features provide a
clear benefit in terms of generalizability when training on T3. The best-performing model
includes all contextual features (EMPI), as shown in Figure 5.8 and Table 5.5. Models trained
on T1 and T2 show the overall best plausibility on the feature setting EMI, but the differences
in S_O and S_ACC when comparing the feature settings do not show a clear tendency. When
considering the ability of models to generate reasonable predictions in exceptional situations
(L3), the feature setting EMI clearly outperforms the others when training on T1 or T2, while
models trained on T3 show the best results when all features are included (EMPI). The models
trained only on synthetic data (T1) show the poorest results overall. Across their different
feature settings, no clear tendency could be found: overall plausibility S_P shows the best
results on feature setting EMI, and overall accuracy S_ACC is best on feature setting E. But
when it comes to the edge case scenario (L3), one can see a clear advantage of including context
Table 5.5: Results for all model variants across different training data and different feature
settings at all test levels.
The first five columns provide results summarized across all test levels. The remaining columns
show the results for each individual test level. The best results within the training dataset (T1,
T2, T3) are highlighted in bold, the best results per column, i.e. per test level, are highlighted in
bold and underlined.
                 all    all    all    all    all     L1     L1    L2a    L2a    L2a    L2b    L2b    L2b     L3     L3     L3
Data Setting     S_O  S_ACC    S_P    FDE    ADE    FDE    ADE    FDE    ADE    S_P    FDE    ADE    S_P    FDE    ADE    S_P
                 [%]    [%]    [%]    [m]    [m]    [m]    [m]    [m]    [m]    [%]    [m]    [m]    [%]    [m]    [m]    [%]
T1   EMPI      49.34  40.46  58.23   9.93   6.55   2.38   1.34  12.26   8.27  51.79   7.87   4.40  72.92  13.43   9.58  49.97
T1   EMP       50.31  40.87  59.74   9.96   6.34   2.10   1.20  13.41   8.42  58.23   7.82   4.04  73.25  12.57   9.13  47.75
T1   EMI       54.57  42.56  66.57   9.72   5.94   2.39   1.38  14.59   9.70  56.57   8.52   4.47  71.22   9.70   5.93  71.92
T1   EI        50.56  34.62  66.50  11.73   7.51   2.40   1.26  20.88  16.31  60.71   6.86   3.40  78.27  12.12   5.95  60.52
T1   EM        43.96  34.63  53.30  11.93   7.31   2.19   1.26  13.60   9.48  47.58   9.86   5.83  64.50  17.22   9.65  47.81
T1   E         54.02  44.42  63.61   9.55   5.48   2.60   1.45  12.04   7.36  57.41   7.05   3.47  81.72  13.03   7.63  51.70
T2   EMPI      55.84  48.93  62.75   8.87   4.82   4.72   2.88   7.87   3.85  72.47  11.05   5.92  61.54   9.75   5.68  54.26
T2   EMP       58.72  53.68  63.75   8.62   4.05   3.90   1.95   8.79   4.17  72.42  10.75   4.97  64.10   8.70   4.06  54.73
T2   EMI       57.65  43.37  71.93  10.60   5.05   3.97   2.09   9.70   4.55  70.52  18.91   8.58  62.08   6.52   3.52  83.19
T2   EI        53.66  47.74  59.59   9.56   4.64   4.60   2.20  10.84   4.77  61.75  11.02   5.68  57.33   9.28   4.67  59.70
T2   EM        55.96  45.58  66.34   8.83   5.80   3.94   2.20   9.17   4.91  67.28  12.73   9.78  59.39   7.02   4.50  72.34
T2   E         58.10  47.42  68.78   8.64   5.41   4.51   2.26   9.10   4.30  73.12  11.37   9.88  60.88   7.51   3.63  72.35
T3   EMPI      71.83  70.63  73.03   6.50   3.11   2.70   1.33   6.58   3.16  67.40   7.50   3.35  74.49   7.32   3.71  77.21
T3   EMP       66.24  62.11  70.37   7.24   3.62   3.12   1.73   8.13   4.12  65.20   7.91   3.72  69.29   7.74   3.98  76.62
T3   EMI       61.79  52.81  70.77   8.48   4.29   2.81   1.54   8.87   4.01  66.85   7.18   3.38  75.72  12.23   6.84  69.73
T3   EI        66.75  63.05  70.46   7.08   3.60   2.85   1.65   7.74   3.90  63.73   5.97   2.88  82.70   9.67   5.00  64.96
T3   EM        61.27  52.98  69.56   8.13   4.50   2.91   1.64   9.18   5.20  68.96   9.94   5.21  70.21   7.90   4.50  69.51
T3   E         60.79  54.56  67.02   8.19   4.16   2.74   1.62  10.21   5.33  59.00   6.76   3.05  81.74  10.31   5.38  60.32
features during learning (up to 20% more accuracy and plausibility). The fact that models
trained on synthetic data show less benefit from the inclusion of contextual features can be
explained by the driver model used to create the synthetic data. The driver models are not
able to interact and rarely respond to the behavior of others but follow predefined heuristic
rules. Consequently, driver behavior in this dataset is less context-dependent compared to
real traffic data. In addition, the interaction and partner feature spaces contain features for
VRUs that are not present in the synthetic data. The empty features might hinder the training
process. In general, a high inter-dependency between the homogeneity of training data and
the coverage of scene information by the feature setting can be observed.
Figure 5.7: Influence of variability in training data on model performance. Left: accuracy
measured by ADE and FDE on different test levels with best feature setting of training data
category. Right: scores for accuracy, plausibility, and overall for different training data.
5.5.4.2 Influence of Individual Feature Categories
The effect of map and interaction features on spatial and temporal performance is analyzed to
gain further insight into the impact of individual feature categories. In Figure 5.9 (left), it can
Table 5.6: Analysis of spatial and temporal model performance measured in [%].
Best result for S_O_temp and S_O_spa per homogeneity level (T1, T2, T3) highlighted in bold,
best overall results highlighted in bold and underlined.
Data Setting   Collision  Collision  Collision   S_O_temp    S_O_spa    S_O_spa    S_O_spa    S_O_spa
               Check L2a  Check L2b   Check L3                   L2a        L2b         L3
T1   EMPI          28.94      72.68      17.34      50.81      92.61      97.79      97.71      96.04
T1   EMP           37.05      71.82       4.01      54.44      96.47      97.41     100.00      97.96
T1   EMI           33.64      66.51      69.91      50.08      93.50      97.26      96.85      95.87
T1   EM            24.22      54.01       0.00      39.12      83.60      89.00      98.57      90.39
T1   E             44.96      80.21      22.78      62.58      84.10      99.18      91.12      91.47
T1   EI            41.07      85.08      39.83      63.07      94.29      98.35      98.85      97.16
T2   EMPI          73.29      47.92      31.66      60.60      96.15      97.82     100.00      97.99
T2   EMP           75.03      49.45      39.11      62.24      96.89      96.47     100.00      97.79
T2   EMI           70.54      53.08      86.39      61.81      95.94      96.69     100.00      97.54
T2   EM            69.32      34.68      71.92      52.00      89.23      97.18     100.00      95.47
T2   E             70.22      51.24      65.76      60.73      98.23      88.52     100.00      95.58
T2   EI            51.56      49.77      35.96      50.67      93.49      97.54     100.00      97.01
T3   EMPI          59.86      71.48      81.66      65.67      98.65      98.83      98.57      98.68
T3   EMP           55.53      61.98      82.23      58.76      97.80      98.10     100.00      98.64
T3   EMI           64.13      77.01      65.76      70.57      95.70      98.25      98.28      97.41
T3   EM            62.76      63.07      62.89      62.91      95.49      88.88     100.00      94.79
T3   E             50.49      81.79      47.28      66.14      89.58      99.55      99.71      96.28
T3   EI            55.90      84.27      50.00      70.09      94.64      99.43      99.71      97.93
be observed that interaction and partner features contribute, on average, to better temporal
plausibility of the results, measured by the frequency of collisions. In addition, the best feature
settings with and without map features across all training levels show an advantage of including
map information regarding spatial plausibility of predictions, as shown in Figure 5.9 (right).
Spatial plausibility is measured by the frequency of road violations and the percentage of path
deviations over 5 m. However, the positive effect of including map features on better spatial
prediction accuracy is smaller than expected. Table 5.6 summarizes the results for spatial and
temporal plausibility of different feature settings trained on different datasets. Considering
the individual values presented in Table 5.5, it can be observed that the spatial plausibility,
measured on synthetic test data, partly shows better values without map features. This aspect
should be investigated further.
Semantic features are associated with a higher computational effort in data preprocessing.
Figure 5.8: Influence of different feature settings on model accuracy considering homogeneity
level T3. Left: Effect measured on test level L2a. Right: Effect measured on test level L3.
Therefore, the influence of interaction features representing situational context is analyzed
in more detail. When testing on L3, a clear advantage of semantic interaction features can
be observed. The same tendency, but with a smaller effect, is observed when testing on real
(L2a) and synthetic data (L2b). In general, the inclusion of partner and interaction features
contributes to better temporal accuracy in prediction, as shown in Figure 5.9 (left). Again, a
strong inter-dependency between homogeneity in training data and the utility of each feature
category can be observed. In addition, a lower benefit of contextual features is observed for
models trained on synthetic data (T1), which, again, can be attributed to a lower context
dependency of behavior due to heuristic model strategies for artificial drivers.
Figure 5.9: Influence of interaction features on temporal plausibility (left) and influence of map
features on spatial plausibility (right).
5.5.4.3 Impact of Tuning Parameters
The results of the exemplary parameter tuning variants are provided in Table 5.7 and
show similar effect sizes on accuracy and plausibility as differences in the coverage of scene
information, varying in a range of ±10%. Looking at the effects of variability in the training
data, one can observe a much more powerful effect of up to ±25%. Results show that the
parameter tuning has a large impact on the generalizability of the model since variant ID4,
for example, shows the best results on test level L1 while providing weak generalizability.
Meanwhile, variants ID1 and ID2 provide lower accuracy on test level L1 but outperform variant
ID4 on the other test levels.
Table 5.7: Results for different learning parameters on all test levels according to Table 5.4. Best
result per column highlighted in bold.
        all    all    all    all    all     L1     L1    L2a    L2a    L2a    L2b    L2b    L2b     L3     L3     L3
ID      S_O  S_ACC    S_P    FDE    ADE    FDE    ADE    FDE    ADE    S_P    FDE    ADE    S_P    FDE    ADE    S_P
        [%]    [%]    [%]    [m]    [m]    [m]    [m]    [m]    [m]    [%]    [m]    [m]    [%]    [m]    [m]    [%]
1     55.87  45.42  66.31   9.26   5.43   5.22   2.54   7.07   3.65  74.27  11.38   5.70  62.76  11.35   8.39  61.91
2     53.67  44.76  62.58   9.27   5.62   5.73   2.94   7.96   4.41  64.55   9.68   5.20  62.67  11.94   8.59  60.51
3     55.84  48.93  62.75   8.87   4.82   4.72   2.88   7.87   3.85  72.47  11.05   5.92  61.54   9.75   5.68  54.26
4     45.92  35.60  56.25  11.64   7.08   3.91   1.89  12.45   7.75  59.06  15.89   8.27  57.90  10.44   7.83  51.79
5.5.4.4 Measuring Generalizability and Plausibility
Generalizability is assumed to be measurable by testing model performance on different traffic
situations and scenarios at varying distances (test level L1 - L3) from situations shown during
training [199]. For the various settings shown in Table 5.5 and Table 5.7, the model performing
best on test level L1 is usually not the model providing the best results on L2 or L3, emphasizing
the necessity of a critical evaluation strategy. In particular, for the parameter settings, the
model with the best results on data close to the training (L1) showed the weakest plausibility
and accuracy overall, indicating poor generalizability. Consequently, an evaluation strategy
that considers a wide range of test data is crucial even in an early stage of development. When
considering plausibility in relation to accuracy, the results with the best accuracy do not
necessarily show the best plausibility. In particular, when testing with real data (L2a), one can
see large diferences between accuracy and plausibility. Therefore, in Figures 5.10 and 5.11,
some qualitative prediction results for the L2 test locations are provided and investigated.
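The displacement errors reported as ADE and FDE can be illustrated with a short sketch. It assumes the common definitions (ADE as the mean Euclidean distance over all predicted time steps, FDE as the distance at the final time step); the function name and the toy trajectories are hypothetical and not taken from the evaluation pipeline of this work.

```python
import math

def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) in the unit of the input coordinates (assumed: meters)."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between prediction and ground truth at each time step.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde

# Toy example: the prediction drifts sideways while the ground truth goes straight.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, gt)
```

Both values are purely geometric, so two predictions with the same FDE can differ strongly in plausibility, which is why an additional plausibility measure is considered alongside them.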
Figure 5.10: Qualitative results on unseen intersections (isec 4 - L2b top, Heckstrasse - L2a
bottom). Ego GT trajectory marked in orange. Predictions of models trained on EMPI features
with different training datasets (T1, T2, T3). GT and predictions represent positions in the next 5
seconds. For all trajectories: driven path: solid, future path: dotted.
5.5.4.5 Qualitative Prediction Results of Different Model Settings
Figure 5.10 shows situations at the L2a (Heckstrasse) and L2b (isec 4) test locations for models trained
on the feature space EMPI but on different training data variations. The top row of Figure
5.10 illustrates predictions on the synthetic test intersection for EMPI models trained on
datasets T1, T2, and T3. The qualitative results also show the improved performance of the
model trained on T3 compared to T2 and T1. The model trained only on
synthetic data (T1) shows lower accuracy and plausibility than the model trained on a
combination of real and synthetic data, even when predicting synthetic behavior.
The importance of the plausibility measure is demonstrated by the example on the right-hand
side, where the model trained on T1 and the model trained on T3 show similar accuracy
measured by displacement errors (FDE ∼ 16 m). The trajectory predicted by the T3 model
shows a lower driven velocity and only a small path deviation and therefore still maintains a
certain level of plausibility. Meanwhile, the prediction of the model trained on T1 leads into
an undrivable area and does not show any reasonable behavior.
The second row in Figure 5.10 provides some examples from the real test location Heckstrasse
(L2a). Here, similar effects can be observed. The weak performance of the model trained on
the low level of variability in training data is especially significant. While the predictions of the
models T3 and T2 partially still show larger error values, they follow a reasonable
path and seem to show weaknesses primarily from a temporal perspective. This
might indicate that the training data is not representative enough with respect to higher driven
velocities in such turning maneuvers.
Figure 5.11: Qualitative evaluation of predictions from different feature settings for models
trained on T3 performing on unseen intersections (first row: isec 4 - L2b, second row: Heckstrasse
- L2a). Ego GT trajectory marked in orange. Predictions of models trained on T3 with different
feature settings: EMPI, EMI, and E. GT and predictions represent positions in the next 5 seconds.
For all trajectories: driven path: solid, future path: dotted.
Figure 5.11 illustrates situations with three different
model variants (EMPI, EMI, EM) trained on T3 at the two unseen test locations. The
predictions in the first row, representing performance on synthetic data, reveal that predictions
of the model trained with context features are associated with high error values mostly due to
weak temporal accuracy, while the model trained only on ego features generates higher path
deviations. The first column shows a typical example of a high displacement
error combined with a still high plausibility of the prediction. The trajectory generated by the EMPI model
leaves a smaller time gap when interacting but without becoming critical.
Predictions on real traffic data, shown in the second row of Figure 5.11, demonstrate the
same positive effects of training with contextual features to provide reasonable results. These
examples indicate that the plausibility metric is an indispensable extension of commonly used
metrics to identify situations in which a model provides unreasonable predictions.
5.5.5 Limitations and Future Work
Besides interesting insights regarding the effect of homogeneity in training data, coverage of
scene information, and learning parameters on generalizability, the evaluation has shown that
such model aspects cannot be considered independently. There are strong inter-dependencies
between data, model structure, and learning parameters, which make it challenging to derive
generally valid conclusions. This fact highlights the necessity of a complex and critical evaluation
method to provide more transparency and reliable solutions when using black-box models.
However, as the complexity of the evaluation method increases, so does the difficulty of interpreting the
results. Therefore, scores have been introduced to allow for easy assessment and comparison.
These scores, however, combine and average the individual results, which can lead to smoothing
effects.
The relevance of plausibility in prediction results was a major focus of the presented method.
Of course, a safe driving strategy for AVs requires an accurate prediction, and therefore,
an inaccurate but plausible prediction might still cause problems. However, those insights
into plausibility provide a valuable basis for generating more transparency regarding model
weaknesses when developing and tuning black-box models. Furthermore, in the application
for driver models in simulation, the primary objective is to generate human-like reactions, which does not
require the most accurate prediction to be able to interact with other road users [200].
Furthermore, the plausibility metric employed is based on simple functional indicators.
The prediction methods are focused on the prediction of vehicle behavior. For enabling the full
dimension of SA within a driver model in urban scenarios, the anticipation of VRUs, including
cyclists, pedestrians, or scooters, would be required. The methods themselves are transferable
to such applications, but an adaptation of plausibility checks and features would be required
since, for example, a pedestrian is not restricted to moving within a certain lane or road area.
A simple model approach for prediction is employed, dispensing with the consideration of
temporal context or probabilistic outputs. However, such aspects are commonly addressed in
state-of-the-art approaches and should be combined with the proposed methods in the future
by, for example, using an LSTM instead of an MLP.
For data labeling, the future positions at certain time steps represented as global XY coordinates
were used. There might be potential to improve generalizability by using a local labeling
strategy, such as transforming the global coordinates into local lateral and longitudinal
displacements after 1, 2, 3, 4, and 5 seconds. However, the benefit in generalizability versus the
additional computational effort when using such a strategy online must be investigated. Test level
L3 contains only a small number of samples (600) to exemplify what such a level of testing
could look like. In order to provide broader reliability of model performance, additional test
data for exceptional situations should be designed and included in testing at L3 to provide
comprehensive insights into potential limitations.
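The local labeling strategy mentioned above could look roughly like the following sketch. It assumes that the ego pose (position and heading in radians) is known at labeling time and that positive lateral values point to the left of the heading; the function name, the sign convention, and the sample values are illustrative assumptions, not part of the implemented method.

```python
import math

def to_local_frame(ego_x, ego_y, ego_heading, future_xy):
    """Convert global XY points to (longitudinal, lateral) offsets in the ego frame."""
    cos_h, sin_h = math.cos(ego_heading), math.sin(ego_heading)
    local = []
    for x, y in future_xy:
        dx, dy = x - ego_x, y - ego_y
        longitudinal = cos_h * dx + sin_h * dy  # displacement along the heading
        lateral = -sin_h * dx + cos_h * dy      # displacement to the left of the heading
        local.append((longitudinal, lateral))
    return local

# Ego at (10, 5) heading along +y; two hypothetical future positions in global coordinates.
labels = to_local_frame(10.0, 5.0, math.pi / 2, [(10.0, 7.0), (9.0, 9.0)])
```

The transformation itself is only a rotation and a translation per label, so the additional online effort mentioned above would mainly stem from applying the inverse transformation to every prediction.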
For high variability in training data, real traffic data was combined with synthetic samples,
showing the high potential of using simulation to create specific situations to compensate
for those that are underrepresented in the real training data. Of course, the extent to which
human-like behavior can be generated by simulation in such situations depends strongly on
the quality of the driver models in use.
5.5.6 Conclusion
This chapter presented a novel multi-level evaluation method providing detailed insights into
the generalizability of data-driven trajectory prediction models, addressing research question
R2.1. Testing at different levels highlighted the criticality of selected test data with respect to
the validity and significance of evaluation results.
Since not only the accuracy but also the plausibility of results is considered, the proposed
methodology allows for the identification of samples showing inconsistent predictions. Such
insights are crucial during the development process to develop reliable solutions.
Two phenomena were observed: firstly, the plausibility of results does not necessarily correlate
with accuracy; secondly, the best-performing setting on test data that is close to the training
data is not necessarily the best setting in terms of generalization. Taking those facts together,
a multi-dimensional evaluation involving a broad range of test data is crucial for determining
the best model setting and should be considered in early stages of development.
Regarding research question R2.2, the evaluation revealed a large impact of the homogeneity in
training data on model performance and, at the same time, the potential to augment existing
real datasets with synthetic samples. The effect of variability in training data was shown
by the performance of the model trained on a combination of synthetic and real data (T3),
outperforming the other variants under consideration. This valuable insight connects directly
to research question R2.3, demonstrating the possibility of using simulation to compensate for
situations that are underrepresented in the real training data. The investigation of different
feature categories demonstrated a positive effect of contextual features on the temporal and
spatial accuracy of predictions. High inter-dependencies between feature setting and training
data were observed, making it difficult to interpret the results for general conclusions.
The effect size of variations in the coverage of scene information was identified to be similar to
the effect (∼ 10%) of hyperparameter tuning, which underlines the power of model tuning.
In conclusion, the results show different potentials in the considered conceptual decision
categories. None of the presented models so far fulfills the required accuracy and plausibility
for integration into a driver model covering the entire diversity of urban traffic. However,
valuable insights could be gained, and a novel method could be elaborated, paving the way
for reliable prediction models in the future. The presented method provides a differentiated
perspective on the prediction results and provides more transparency for the use of such
black-box models for trajectory prediction. Generating accurate predictions
in urban traffic is a highly complex task and requires a correspondingly complex evaluation
approach since there will not be one solution that satisfies all requirements, but the right
balance has to be found for the particular application. The multi-dimensional evaluation
showed tendencies of different strength in the investigated conceptual possibilities, which offer
valuable insights for future developments.
5.6 Summary on Anticipation Methods
In this chapter, two novel methods were presented concerning the development of a generalizable
and reliable prediction model for anticipating the future movement of vehicles in complex urban
traffic. Weak generalizability and transparency of data-driven models for trajectory prediction
were identified to be primary challenges in current research. While there are several published
approaches to modeling, these critical aspects are rarely discussed, and model evaluation
takes a secondary place compared to the sophisticated model architectures. Therefore, the
methods introduced in this chapter focused on generating insights regarding the effect of
different conceptual choices on the model’s ability to generalize and on elaborating a concept to
evaluate generalizability adequately. In summary, the purpose of this chapter is to pave the
way to reliable predictions, not to provide another solution for trajectory prediction that is not
feasible due to a lack of generalizability. Referring to research question R2, it can be concluded
that there is no single perfect solution with regard to the high variability and uncertainty
in urban traffic. NNs show great potential in capturing such highly complex relationships
but are associated with a low degree of transparency. Therefore, to take full advantage of
these modeling approaches, a stronger focus on critical evaluation methods rather than the
development of more sophisticated model structures is recommended.
One of the key findings within the presented results is the demonstration of improved SA by
incorporating semantic and non-semantic representations of interactions of the ego vehicle
with surrounding road users. Furthermore, the methods have highlighted the critical role
of training and testing data in model performance and the significance of evaluation results.
High variability of training data and enrichment of real datasets with synthetic data hold the
potential to improve generalizability significantly. Moreover, the importance of meaningful
evaluation metrics was emphasized, as the inclusion of additional plausibility metrics besides
the commonly used displacement errors allows for clear identification of situations where a
model has significant weaknesses. It is recommended to use the proposed methods to refine
the prediction model further before embedding it into a holistic driver model.
6 Dynamic Decision-Making
Disclaimer: The present chapter is based on research presented in: [201]: Teresa Rock et al. “Dynamic Decision-Making
for Agent Models in Urban Driving Simulation”. In: Proceedings of the Driving Simulation Conference 2023 Europe VR.
ed. by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation Association. Antibes, France,
2023, pp. 169–179.
The following chapter addresses research question R3: How to enable driver agents
for dynamic decision-making in order to cope with typical urban scenarios in a
functional and human-like manner?
After a brief introduction outlining the motivation for more dynamic decision-making in driver
models, the state-of-the-art section provides information on decision strategies for agent models
in DS and the main concepts of trajectory planning, which is the common strategy used in AD.
Subsequently, a concept involving two different techniques for decision-making is presented to
address the proposed research question. Both approaches are compared in terms of runtime
and quality of results with each other and with a heuristic agent model to investigate potentials
and weaknesses in typical urban scenarios.
6.1 Introduction and Motivation
The requirements analysis in Section 3.3.4 demonstrated the need for dynamic decision-making
from a psychological point of view in terms of SA and as a requirement for interactive behavior
in traffic. The review of state-of-the-art approaches in Section 2.1 showed that agent models
in driving simulation typically follow heuristic and hierarchical structures. Decisions are made
by choosing between predefined options that discretize the driving task, such as maneuvers [39,
13] or states [22]. Given the variety of situations encountered in urban traffic, it is clear that
such approaches are limited in their ability to exhibit human-like behavior by dynamically
adapting to the situation since each maneuver and the corresponding rules for choosing it must
be explicitly formulated. More maneuvers and heuristic rules can be added to solve complex
interactive situations. However, such strategies do not scale, and one risks ending up in endless
if-then loops addressing single-point solutions. Furthermore, the strong discretization of the
driving task leads to a limited solution space and thus often to deadlocks in complex situations
[137, 138]. Such functional errors impede the simulation and might negatively influence the
participants’ perception of reality in a simulator experiment [139, 118]. Meanwhile, urban
scenarios are gaining importance in driving simulation [4]. Consequently, current modeling
approaches are well suited for simple urban situations but show weaknesses in handling more
complex or interactive scenarios that would require dynamic decision-making for adapting
behavior to the situation [103, 9].
DDM is an established term in psychology and cognitive research that is characterized by a
decision-making process consisting of a sequence of interdependent decisions that influence
future actions [140]. In contrast, static decision-making is referred to as a linear process
that selects between explicit alternatives [141]. In the context of driver models within this
thesis, the heuristic modeling approach of choosing between explicit maneuvers is understood
as static decision-making, while urban traffic requires a method that allows the driver model
to employ DDM. In this case, DDM is interpreted as the ability to take actions dependent on
future evolutions of the situation without choosing between explicitly formulated maneuver
alternatives.
Hence, it is necessary to explore alternative strategies for decision-making in agent models. In
the domains of AD and robotics, spatio-temporal behavior is typically derived by trajectory
planning. These approaches are more in line with the way humans handle such situations,
balancing various weighted needs such as safety, compliance, or efficient target reaching.
Additionally, based on anticipation, these methods consider different developments in the
driving scenario, enabling adaptive behavior in response to dynamic changes in the situation.
To meet the challenge of dynamic decision-making, the following sections present a method for
agent models, adapted from trajectory planning approaches commonly utilized in AD. Since
such planning approaches induce increased complexity, the proposed methods are compared to
a heuristic agent model, which is in use at BMW and called TRM [55]. The objective is to
investigate when and to what extent the increased complexity of the dynamic decision-making
strategy can add value.
6.2 State-Of-The-Art: Decision-Making
6.2.1 Decision-Making Strategies for Agent Models in Simulation
The main concepts of how vehicle behavior is modeled in DS were presented in Section 2.2.5.
The decision-making process of current driver agents is mostly rule-based and aims to perform
the appropriate maneuver or action based on heuristic rules processing situational information.
For example, the well-known open-source simulation framework SUMO includes modules
for following, handling intersections, and lane changes in order to cope with the variety of
situations that occur in traffic [39, 34]. The decision on which of these modules is active is
made hierarchically based on predefined rules that take into account the topological features
of the map and the dynamic movements of other road users. Normally, each maneuver is
associated with a respective motion model. For example, the longitudinal behavior during a
following maneuver can be generated by the Wiedemann-following model [13, 51] or the IDM
[50, 43]. The driver agent used at BMW also distinguishes between lateral and longitudinal
movement and follows a hierarchical approach. Based on situational information, several
evaluators determine their need for action at each time step. A detailed description of the
decision-making process is provided in Section 2.2.6. Other commercial simulation frameworks
such as cogniBit¹ or AAI² use more sophisticated approaches to model human-like driving
behavior, including learning-based concepts in combination with planning algorithms. However,
detailed information on these commercial solutions is not publicly available.
6.2.2 Trajectory Planning Approaches for Automated Vehicles
Planning the trajectory of an AV aims to find the best movement within a spatio-temporal
configuration space, taking into account safety, comfort, and efficiency [202]. Common
approaches for addressing the planning problem can be divided into three main categories:
sampling-based methods, search-based methods, and optimization-based methods [59]. The
following sections provide a brief overview of basic trajectory planning methods used in the
context of AD. Since trajectory planning itself is a broad field of research, more details can be
found in state-of-the-art reviews, such as the analysis by Gonzales et al. [203].
6.2.2.1 Search-Based Methods
In general, search-based approaches for trajectory planning explore the environment
to find a suitable solution. This class of methods typically constructs a directed graph by
discretizing the configuration space with a fixed number of motion primitives, building the
search graph a priori to avoid costly random exploration of the free space of the environment at
runtime. The search strategies focus on finding a globally optimal or sub-optimal path through
the graph according to the defined objectives. Please note that depending on the dimensions
of the search space, the result can either be a spatial path (x/y), a velocity profile (s/t), or a
spatio-temporal trajectory (x/y/t). The respective result and the computational effort strongly
depend on the design of the search space and the heuristics considered to evaluate potential
nodes in the graph. Commonly used methods are based on the A* algorithm, state lattice
methods, or RRT (Rapidly-exploring Random Trees) [59, 204, 58]. Researchers use different
strategies for incorporating kinematic limitations into such point-based search algorithms,
such as using Dubins distance elements as motion primitives [205, 204]. Search-based methods
are associated with advantages such as low calculation effort and sufficient flexibility for
unstructured environments [206]. On the other hand, A* and related algorithms
suffer from high sensitivity to the design of the search space, as such methods might provide
only sub-optimal solutions or no solution at all.
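To make the search principle concrete, the following minimal sketch runs a plain A* search on a small occupancy grid. It is a generic textbook variant under simplifying assumptions (4-connected unit-cost moves, Manhattan distance as heuristic, no motion primitives or kinematic constraints) and not one of the cited planners; the grid and the function name are hypothetical.

```python
import heapq

def a_star(grid, start, goal):
    """Return the cheapest 4-connected path on a grid of 0 (free) / 1 (blocked) cells."""
    def h(node):
        # Manhattan distance: never overestimates for unit-cost 4-connected moves.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, node) with f = g + h
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry of an already expanded node
        if node == goal:
            path = []
            while node is not None:  # walk back along the parent links
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or (nr, nc) in closed:
                continue
            new_g = g + 1
            if new_g < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = new_g
                parent[(nr, nc)] = node
                heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None  # the search space contains no path to the goal

grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))
blocked = a_star([[0, 1], [1, 0]], (0, 0), (1, 1))
```

The second call illustrates the sensitivity to the search space design: if the discretization leaves no connection between start and goal, the search returns no solution at all.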
6.2.2.2 Sampling-Based Methods
Unlike environment exploration, the basic idea of sampling-based methods is to generate a
set of trajectories, which are then evaluated to determine the best trajectory among these
candidates. Various methods are available to generate these candidate trajectories. For
example, Zhang et al. proposed an approach in which lateral motions are sampled along a
reference path. Subsequently, the best trajectory is selected based on an objective function [59].
A similar approach is proposed by Lim et al., whereby the static environment is discretized by
topological lane information to derive high-level maneuver decisions [66]. As such high-level
decisions do not provide feasible paths, further smoothing is applied by employing a numerical
optimization-based method.
¹ https://cognibit.de/drivebot/
² https://www.automotive-ai.com/
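The generate-and-evaluate idea behind sampling-based methods can be sketched as follows. The example is a deliberately simplified illustration and does not reproduce the cited approaches: the reference path is assumed to be a straight line, candidates differ only in their terminal lateral offset, obstacles are circles, and the cost terms, weights, and names are hypothetical.

```python
import math

def sample_candidates(length, lateral_offsets, n_points=20):
    """Create one candidate path per terminal lateral offset along a straight reference."""
    candidates = []
    for d_end in lateral_offsets:
        path = []
        for i in range(n_points + 1):
            u = i / n_points
            blend = 3 * u**2 - 2 * u**3  # smooth transition from 0 to 1
            path.append((u * length, blend * d_end))
        candidates.append((d_end, path))
    return candidates

def cost(path, obstacles, safety_radius, w_offset=1.0):
    """Infinite cost on collision, otherwise the weighted mean squared lateral deviation."""
    for x, y in path:
        for ox, oy in obstacles:
            if math.hypot(x - ox, y - oy) < safety_radius:
                return math.inf
    return w_offset * sum(y * y for _, y in path) / len(path)

def select_best(candidates, obstacles, safety_radius=1.5):
    """Evaluate all candidates and return (cost, offset, path) of the cheapest one."""
    scored = [(cost(path, obstacles, safety_radius), d_end, path) for d_end, path in candidates]
    best = min(scored, key=lambda item: item[0])
    return None if math.isinf(best[0]) else best

candidates = sample_candidates(length=30.0, lateral_offsets=[-3.0, -1.5, 0.0, 1.5, 3.0])
best_cost, best_offset, best_path = select_best(candidates, obstacles=[(30.0, -0.5)])
```

In this toy setting, the candidates ending at offsets 0.0 and -1.5 collide with the obstacle, so the selection falls on the collision-free candidate with the smallest deviation from the reference. The result is limited to the sampled patterns, which is the restriction that numerical optimization avoids.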
6.2.2.3 Optimization-Based Methods
As a third option, optimization-based methods formulate the planning problem as a
mathematical model incorporating a cost function and constraints. Optimization-based
methods employ either linear or nonlinear solving strategies to address the planning task,
depending on the individual mathematical problem formulation. These methods typically
represent the path of a vehicle using parameterized curve models, such as splines, spirals, or
polynomials [59, 66]. Individual points are then moved within a finite-dimensional parameter
vector space to optimize for the specified objectives in order to obtain smooth trajectories. The
constraint function is responsible for keeping the trajectory compliant with the dynamic
constraints of the vehicle. Depending on the design and thus the linearity of the constraint
function, different solver algorithms are applied, such as QP (Quadratic Programming) with
the IP (Interior Point) method [67] or an SQP (Sequential Quadratic Programming)
algorithm for solving a nonlinear constraint function [207]. In contrast to the aforementioned
sampling-based method, the numerical method is not limited by predefined patterns and
allows for flexible solutions. However, the flexibility of the approach is associated with a high
computational cost to solve the spatio-temporal planning problem.
6.2.2.4 Decomposition Strategies
To reduce the complexity of the planning problem, decomposition strategies use sequential or
hierarchical architectures, such as PVD (Path-Velocity-Decomposition) or lateral-longitudinal
decomposition. PVD is the most common method to reduce the complexity of the 3D problem
by separating it into sequential tasks that can be solved in two dimensions [67]. First, a
path is planned in a two-dimensional spatial space. Secondly, the velocity profile is planned
along the path [208, 209]. Different methods can be used to perform the individual parts
of the sequential chain. Another approach for reducing complexity is to decouple lateral
and longitudinal motion planning. Such approaches are often used in highway scenarios, as
presented by Werling et al. [210].
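The sequential structure of PVD can be illustrated with a minimal sketch. It assumes that the first step has already produced a polyline path and that the second step only applies an acceleration and a speed limit; dynamic obstacles, which a real velocity planner would have to consider, are ignored here. All names and parameter values are hypothetical.

```python
import math

def arc_lengths(path):
    """Cumulative arc length s for each vertex of a polyline path."""
    s = [0.0]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        s.append(s[-1] + math.hypot(x1 - x0, y1 - y0))
    return s

def velocity_profile(s, v_start, v_max, a_max):
    """Forward pass over s: accelerate with at most a_max until v_max is reached."""
    v = [v_start]
    for s0, s1 in zip(s, s[1:]):
        v.append(min(v_max, math.sqrt(v[-1] ** 2 + 2.0 * a_max * (s1 - s0))))
    return v

def combine(path, s, v):
    """Attach time stamps, turning path and velocity profile into (x, y, t) samples."""
    t = [0.0]
    for i in range(1, len(s)):
        mean_v = 0.5 * (v[i - 1] + v[i])  # assumes a_max > 0 or v_start > 0
        t.append(t[-1] + (s[i] - s[i - 1]) / mean_v)
    return [(x, y, ti) for (x, y), ti in zip(path, t)]

# Step 1 (assumed given): a spatial path. Step 2: a velocity profile along that path.
path = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 18.0)]
s = arc_lengths(path)
v = velocity_profile(s, v_start=0.0, v_max=10.0, a_max=2.0)
trajectory = combine(path, s, v)
```

Each step works in two dimensions only (x/y and s/t), which is the source of the reduced complexity. The price is that the path is fixed before any temporal aspect is considered.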
6.2.3 Potentials and Weaknesses of Heuristic Decision-Making and Trajectory Planning
Compared to heuristic modeling approaches, trajectory planning offers the possibility to react
in a more human-like way since the potential solution space is larger and less discretized, and
thus, the decision process is more dynamic. As the number of potential solutions increases, so
does the complexity, which is accompanied by a higher computational cost. Meanwhile,
real-time capability is a crucial functional requirement for agent models in DS. In conclusion, the
challenge is to find the best trade-off between the complexity of the approach and the quality
of results. Therefore, it is necessary to gain more insight into the capabilities and limitations
of such approaches to allow for appropriate usage depending on the level of complexity of the
situation.
6.3 Method: Dynamic Decision-Making for Agent Models
6.3.1 Concept for Dynamic Decision-Making
Trajectory planning for urban traffic in the context of AD represents an unsolved challenge:
although various publications are available, no established solution yet exists that is applicable
to the diversity of situations encountered in urban traffic. Consequently, it was not possible to
adapt an existing method and apply it as a tool for agent models in DS.
Therefore, the subsequent method introduces a novel planning framework inspired by promising
publications. In the following, two concepts for trajectory planning that differ in complexity
and planning strategy are presented to gain insight into the disadvantages and potentials of
different methods compared to heuristic decision-making. Both planning approaches are based
on the same framework, realizing a two-layer concept inspired by Lim et al. [66], but employ
different strategies for creating a first-guess trajectory representing a high-level decision. Based
on the hierarchical planning framework, two different planning methods for decision-making
are developed.
The first planner variant, named PL_PVD, employs a decomposition strategy to reduce the complexity of the
planning problem, while the other approach employs a three-dimensional
search strategy and is referred to as PL_3D. For the decomposition concept, the PVD method was
selected due to its prevalence in AD [66]. To reflect the state-of-the-art in heuristic models,
the driver agent implemented at BMW, called TRM, is used. Details on the model structure
and decision strategies are provided in Section 2.2.6.
Since the focus in this chapter lies on the decision-making process, the impact of prediction
accuracy should be minimized. Therefore, GT data is taken as a prediction in the initial
concept implementation.
Based on the research question R3, three sub-questions are formulated to address the challenging
topic in all dimensions.
• R3.1: To what extent can a PVD approach (PL_PVD) and the three-dimensional approach
(PL_3D) provide additional value in terms of quality compared to a heuristic state-of-the-art
agent model?
• R3.2: How do the PL_PVD and PL_3D approaches differ in terms of runtime and
quality of results in different scenarios?
• R3.3: How sensitive are runtime and solution quality of planning approaches concerning
different parameter sets and changes in the scenario?
6.3.2 The Planning Framework
A multi-layer planning framework inspired by Lim et al. [66] is developed to enable more
dynamic decision-making. To reduce the planning complexity, the original problem with a
high dimensionality was split up into a two-layer concept of lower dimensionality with different
objectives as illustrated in Figure 6.1:
The Behavioral Layer generates a rough first-guess trajectory reflecting a high-level maneuver
decision. Many published approaches in AD first make a maneuver decision by categorizing
the driving task and using, for example, state machines to select a maneuver, and subsequently
plan the trajectory for that maneuver [66]. However, this would result in the same problem
faced with heuristic driver models, which do not show enough flexibility to cope with the
variety of urban traffic. Therefore, the behavior planning approach was adopted from Lim et
al. [66].
The Motion Planning Layer produces a smooth and dynamically feasible trajectory on that
basis since the output of the behavioral layer is a roughly discretized spatio-temporal decision.
The entire process can be described as follows: the first layer outputs a long-term first-guess
trajectory, which is a non-smooth motion that is then further optimized in the second layer to
generate a smooth and dynamically feasible trajectory over smaller time horizons.
Since the high-level decision is the more sensitive part in terms of plausible behavior, two
different approaches for the behavioral layer were developed to gain insight into the weaknesses
and potential of different techniques.
6.3.2.1 Planner Variant: PL_PVD
In order to reduce the complexity and computational effort of generating the first-guess
trajectory, the planning problem is solved by first planning a path and subsequently planning
the velocity profile, both in two-dimensional space.
Figure 6.1: The hierarchical planning framework realized as a two-layer concept for first planning
a rough discretized first-guess trajectory (behavioral layer), which is subsequently smoothed into a
dynamically feasible trajectory by the motion planning layer.
[Diagram labels: Urban Environment Information; Input: Global Route; Behavioral Layer with Variant A: PL_PVD (1. path finding, 2. velocity profile) and Variant B: PL_3D (3-dim spatio-temporal planning); First Guess Trajectory; Motion Planning Layer: Smoothing, Numerical Optimization by SQP; Feasible Trajectory; Output: Local Trajectory]
• Spatial Planning:
In the first step, the path is generated using a search-based method. The approach
employs an adaptation of the hybrid A* algorithm applied to a spatial configuration space.
The configuration space considers the static environment composed of the road network
and static obstacles. The original concept of the A* algorithm was introduced by Hart et
al. [211], and the principal idea can be described as follows. A graph comprises a set $S$
of nodes $n_i$ and transitions $e_{ij}$ between nodes, which are associated with respective costs
$c_{ij}$. The graph is limited by source nodes and terminal nodes. Paths are ordered node
sets with a total cost summing up the individual transition and node costs. The algorithm
searches for an optimal path from a source node $n_0$ to a terminal node $n_k$. The search loop
within the algorithm continues until an optimal path is found, assuming its existence.
For this purpose, two sets are iteratively updated: the open set and the closed set. The closed
set contains all nodes that have already been visited, whereas the open set comprises all
nodes that have not been visited but were identified as successors of already
visited nodes. A key element of the search algorithm is the value function $f(n_i)$, which
returns a cost value of a node $n_i$. $f(n_i)$ is the sum of the total cumulative cost $g(n_i)$
of each step in the path $(n_1, n_2, \ldots, n_k)$ and the heuristic cost $h(n_i)$ as an estimate of
an optimal path from the node $n_i$ to a terminal node $n_k$ [211]. To generate plausible
trajectories, it is necessary to incorporate semantic information into the search algorithm.
Therefore, the value function $f(n_i)$ is extended by additional costs $s(n_i)$ aiming at forcing
the vehicle to stay as close as possible to the ego lane, according to Equation (6.1). The
semantic cost $s(n_i)$ is composed of the following weighted penalties:
– penalizing the deviation from the center line of the ego lane
– penalizing moving into oncoming traffic lanes
$$f(n_i) = g(n_i) + h(n_i) + s(n_i) \qquad (6.1)$$
The individual semantic costs have been chosen in such a way that as many scenarios as
possible can be covered with a simple cost approach, since each additional cost factor
contributes to an increased computational effort during the search. Another key element
is the collision avoidance strategy, which is performed by checking for intersections
between a polygon representing the ego vehicle and polygons representing static obstacles
in the environment. A collision applies if at least one polygon intersects with the
ego vehicle's motion polygon. To reduce the number of nodes to be explored and the
computational effort, a consideration window is defined, which checks if there is a static
obstacle in the upcoming road section. If there is no obstacle present, the vehicle is
expected to follow the center-line of the respective lane, and node generation takes place
with a controlled action space $A_c: (v_{const}, \varphi_{const}, dstep_{const})$ by employing a
P-Controller. If there is an obstacle present, child nodes with different steering angles
$\varphi$ are sampled to find a path avoiding the obstacle. The node generation takes place with
the default action space $A_d: (v_{const}, \varphi \in [-\varphi_{max}, 0, +\varphi_{max}], dstep_{const})$.
To obtain feasibility, a bicycle model underlies the calculation of the positions of the next
child nodes, incorporating ego position, heading, and steering angle. Consequently, the ego
vehicle moves continuously under vehicle kinematic constraints in an XY space. In summary,
spatial planning provides a kinematically feasible path $p$ based on the inputs of maximum
steering angle $\varphi_{max}$, desired velocity $v_{const}$, and constant step width $dstep_{const}$.
• Temporal Planning:
Since spatial planning is conducted in an XY space, the dynamic movements of other
road users cannot be taken into account during path planning. Therefore, the
velocity profile is subsequently planned along the identified reference path, again using a
hybrid A* algorithm. The algorithm searches in an s/t space ($s$ along path $p$ over time $t$),
incorporating the dynamic movements of other traffic participants, inspired by Lim et
al. [66]. Path $p$, represented as a sequence of states $(x_k, \varphi_k)$, is assumed to be a
trajectory with constant velocity. Given path $p$, the main concept is to reduce the velocity
to 0 before the ego vehicle reaches an area where collisions with dynamic obstacles would
occur. In detail, the objective is to decide at each state $(x_k, \varphi_k)$ whether proceeding to
the next state $(x_{k+1}, \varphi_{k+1})$ would cause a collision with any other road user. In order to
consider the dynamic movement of others, the time dimension is assumed to be discrete.
If the transition from state $(x_k, \varphi_k)$ to the next state $(x_{k+1}, \varphi_{k+1})$ at time $t$ would
cause a collision, the state remains $(x_k, \varphi_k)$ for $t + 1$, as illustrated in Figure 6.2.
Consequently, the control action to perform a transition between two nodes originates from a
set $V = \{0, v_{const}\}$, whereby $v_{const} = d_{const}/t$. After conducting the temporal planning, the
sequence of nodes represents a trajectory (x/y/t), and the result of the search algorithm
is a non-smooth spatio-temporal trajectory reflecting a high-level maneuver decision.
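The wait-or-advance decision of the temporal planning step can be sketched as a small A* search over discrete nodes (k, t), where k indexes the states along path p and t is the discrete time step. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `plan_velocity_profile`, the set `blocked` of occupied (k, t) cells, and the unit time cost per action are hypothetical choices.

```python
import heapq


def plan_velocity_profile(n_states, blocked, max_steps):
    """Search a wait-or-advance schedule along a fixed path.

    A node (k, t) means: the ego vehicle is at path state k at time step t.
    `blocked` holds the (k, t) pairs occupied by dynamic obstacles
    (hypothetical input format). Each action takes one time step and is
    either "advance" (velocity v_const) or "wait" (velocity 0).
    """
    start, goal_k = (0, 0), n_states - 1
    # Heap entries: (f, g, node) with g = elapsed steps, h = remaining states.
    open_heap = [(goal_k, 0, start)]
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue
        closed.add(node)
        k, t = node
        if k == goal_k:
            profile = []
            while node is not None:  # walk back to the start node
                profile.append(node)
                node = parent[node]
            return profile[::-1]
        if t >= max_steps:
            continue
        for dk in (1, 0):  # try to advance first, then to wait
            child = (k + dk, t + 1)
            if child in blocked or child in parent:
                continue
            parent[child] = node
            heapq.heappush(open_heap, (g + 1 + goal_k - child[0], g + 1, child))
    return None  # no collision-free schedule within max_steps


# A dynamic obstacle occupies path state 2 during time steps 2 and 3.
profile = plan_velocity_profile(n_states=5, blocked={(2, 2), (2, 3)}, max_steps=20)
```

In this toy setting the ego vehicle has to hold its position until the obstacle has cleared state 2, which corresponds to the horizontal segments of the s/t diagram. The heuristic (remaining number of states) never overestimates the remaining time, so the returned schedule reaches the last state as early as possible.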
6.3.2.2 Planner Variant: PL_3D
The PVD technique decouples planning in space and time. Therefore, situations in which
spatial and temporal motion are highly interdependent, such as overtaking a cyclist, can lead to
implausible behavior. In order to address this shortcoming, an approach for directly planning a
first-guess trajectory in three-dimensional space is developed. The three-dimensional planning
strategy introduces a new challenge in terms of representing dynamic obstacle information
within the configuration space. Common approaches in the field of robotics employ DOGs
or spatio-temporal grids with DAGs to represent dynamically changing situations [59, 152].
Such representations are effective for describing dynamic scenes, especially when the input
data is in the form of point clouds. Nevertheless, when information is discretized into grids, it
loses semantic context. Additionally, generating these dynamic grids in real-time consumes
significant computational resources [153, 154]. It is necessary to find a method that has the
potential to be real-time capable, from a conceptual point of view, to satisfy the functional
requirements of agent models in simulation. Therefore, the presented method overcomes
the need to discretize the entire environment at runtime by representing static and dynamic
obstacles using a hybrid ellipsoidal APF (Artificial Potential Field) inspired by promising
publications in the context of AVs [212, 213]. The APF approach is widely established
for obstacle avoidance in complex environments in robotics due to its efficiency,
ability to generate smooth trajectories, and simplicity [214]. Besides these advantages, such
approaches are also associated with some potential problems. Sun et al. identify
the most common problems associated with APFs for collision avoidance as follows
[214]. By converting obstacle information into a force, important information, such as obstacle
distribution, is lost. Especially in obstacle-intensive areas, this can lead to problems as no
solution space remains. Furthermore, depending on the design of the planning problem, local
minima or jitter may occur around the obstacle. However, the APF method has been identified
as the most promising concept for the problem addressed here, and potential drawbacks are
addressed by novel techniques such as those proposed by Liu et al. [213] to compute a
situation-adaptive APF that provides various parameters for tuning towards a satisfying solution.
[Figure 6.2, diagram labels: axes s and t; Dynamic Obstacle; Terminal Position]
Figure 6.2: Illustration of the velocity profile generation using a hybrid A* algorithm applied in
the s/t space, adopted from [66].
The APF is incorporated in the cost function for collision avoidance and integrated with
additional time-related changes into the hybrid A* implementation described in Section 6.3.2.1.
The APF is formed by relative position, ego vehicle velocity, relative velocity, acceleration,
a safety distance factor, and coefficients for calculating the formation of the field, as illustrated
in Figure 6.3. Due to its flexible nature, the APF method can handle various interactive
situations, e.g., by tuning parameters or by adding additional factors such as increased lateral
vulnerability for obstacles of type cyclist. An example of the calculated APF costs for two
exemplary scenarios is illustrated in Figure 6.4. Thus, the value function $f(n_k)$ is extended by
several weighted semantic costs $s(n_k)$ aiming at generating plausible movements with respect
to the situation:
• penalizing the deviation from the center line of the ego lane
• penalizing moving into oncoming traffic lanes
• time cost
• APF cost regarding static and dynamic obstacles
[Figure 6.3, diagram labels: Obstacle; velocity vectors v]
Figure 6.3: The APF formed by the position of the obstacle and velocity and position of the ego
vehicle [213].
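The following sketch illustrates how an elliptical repulsive potential and the four weighted cost terms listed above could be combined. It is a strongly simplified stand-in and not the situation-adaptive formulation of Liu et al. [213]: the Gaussian shape, the function names, and all parameter values (`base_long`, `base_lat`, `time_gap`, `lateral_factor`, `amplitude`, and the weights) are assumptions chosen for illustration only.

```python
import math


def semi_axes(rel_speed, base_long=4.0, base_lat=1.5, time_gap=1.0, lateral_factor=1.0):
    """Semi-axes of the elliptical field (all parameters are illustrative).

    The longitudinal axis grows with the closing speed; the lateral axis can be
    widened, e.g. for cyclists, via `lateral_factor`.
    """
    a_long = base_long + time_gap * max(rel_speed, 0.0)
    b_lat = base_lat * lateral_factor
    return a_long, b_lat


def repulsive_cost(dx, dy, a_long, b_lat, amplitude=100.0):
    """Gaussian-shaped repulsive potential at offset (dx, dy) from the obstacle.

    dx is measured along and dy across the obstacle's direction of travel.
    """
    d2 = (dx / a_long) ** 2 + (dy / b_lat) ** 2
    return amplitude * math.exp(-d2)


def semantic_cost(center_offset, in_oncoming_lane, elapsed_time, apf_cost,
                  w_center=1.0, w_oncoming=10.0, w_time=0.5, w_apf=1.0):
    """Weighted sum of the four semantic cost terms (weights are placeholders)."""
    return (w_center * abs(center_offset)
            + w_oncoming * float(in_oncoming_lane)
            + w_time * elapsed_time
            + w_apf * apf_cost)


car_axes = semi_axes(rel_speed=5.0)
cyclist_axes = semi_axes(rel_speed=5.0, lateral_factor=1.5)
cost_car = repulsive_cost(0.0, 2.0, *car_axes)          # 2 m lateral offset to a car
cost_cyclist = repulsive_cost(0.0, 2.0, *cyclist_axes)  # same offset to a cyclist
```

With the widened lateral axis, the same lateral offset is penalized more strongly for the cyclist than for the car, which mirrors the idea of an increased lateral vulnerability.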
Similarly to the PL_PVD approach, the semantic costs were selected in such a way that as
many scenarios as possible can be covered with as few cost factors as possible. Whether the
designed cost function generates plausible behavior in typical urban scenarios is examined
during evaluation. The search algorithm for three-dimensional planning also employs the
aforementioned consideration window method to determine whether there is a static or dynamic
obstacle ahead and, therefore, specifies two strategies for node selection:
Controlled: No static or dynamic obstacle is ahead, so nodes are sampled only on the center-line
employing a controlled action space $A_c$ as described in Section 6.3.2.1.
Default: Nodes with different steering angles and acceleration values are sampled to avoid
collisions and to find a feasible path to the destination employing the action space
$A_d: (acc \in [0, acc_{min}, acc_{med}, acc_{max}], \varphi \in [-\varphi_{max}, 0, +\varphi_{max}], dstep = f(acc))$.
Based on the simplified representation of the vehicle model and the forward search strategy of
the hybrid A*, it is assumed that the vehicle cannot reverse and only positive acceleration
values can be assigned to the action space. Similar to PL_PVD, the main concept is to reduce
the acceleration before a collision occurs. To allow for deceleration as well, a control system
such as MPC would be required.
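Node expansion with the default action space can be sketched as follows. The sketch assumes a forward-Euler kinematic bicycle model, a step duration `DT` of one second, a wheelbase of 2.8 m, and the relation dstep = v·Δt + ½·acc·Δt² as one possible choice for f(acc); none of these are specified in the text. The acceleration values and the steering limit are those listed in Section 6.3.5.

```python
import math
from itertools import product

WHEELBASE = 2.8  # m, assumed value
DT = 1.0         # s, assumed duration of one search step


def step_length(v, acc, dt=DT):
    """Distance covered in one step; one possible choice for dstep = f(acc)."""
    return v * dt + 0.5 * acc * dt ** 2


def expand(x, y, heading, v, accs, steering_angles, dt=DT):
    """Generate one child state per (acc, phi) pair of the default action space."""
    children = []
    for acc, phi in product(accs, steering_angles):
        d = step_length(v, acc, dt)
        children.append({
            "x": x + d * math.cos(heading),
            "y": y + d * math.sin(heading),
            "heading": heading + d / WHEELBASE * math.tan(phi),
            "v": v + acc * dt,
            "acc": acc,
            "phi": phi,
        })
    return children


phi_max = math.radians(36.0)
children = expand(0.0, 0.0, 0.0, 10.0,
                  accs=[0.0, 3.75, 5.75, 8.75, 13.88],
                  steering_angles=[-phi_max, 0.0, phi_max])
```

Each parent node yields 5 × 3 = 15 candidate children in this configuration, compared with three in the PL_PVD action space, which gives a rough impression of why the three-dimensional search explores more nodes per step.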
6.3.2.3 Trajectory Smoothing
Given that the behavioral layer produces a roughly discretized trajectory, further processing is
required to create a dynamically feasible and smooth trajectory. This task is tackled by the
optimization-based motion planning layer and subsequent post smoothing.
[Figure 6.4, two panels, each showing the real scenario and the corresponding APF; color scale:
Repulsive Potential]
(a) APF for a static obstacle and oncoming vehicle.
(b) APF for a cyclist and oncoming vehicle.
Figure 6.4: Example visualization of the APF for a dynamic (right) and static (left) obstacle.
Ego vehicle: green bounding box; other traffic participants: blue.
Motion Planning Layer:
The motion planning layer employs numerical optimization to generate a smooth motion based
on the first-guess trajectory. The trajectory provided by the behavioral layer is represented as
a list of states defined by position, heading, and action. The optimization problem formulation
considers the positions $[x_i \; y_i]$ as decision variables to be rearranged to minimize the objective
function, and trajectory $t$ is represented according to Equation (6.2). The objective function is
composed of the following weighted components: $f_{ref}$, describing the distance to the originally
planned path, and $f_{acc}$ and $f_{jerk}$, aiming at maintaining comfort.
$t = \begin{bmatrix} X_0^T \\ \vdots \\ X_{N-1}^T \end{bmatrix}$, where $X_i = [x_i \; y_i]^T \; \forall i \in \{0, \dots, N-1\}$ (6.2)
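A weighted objective of this kind can be sketched with finite differences over the positions $X_i$. The exact definitions of $f_{ref}$, $f_{acc}$, and $f_{jerk}$ are not given in the text; the squared-norm forms, the unit weights, and the constant time step `dt` below are assumptions for illustration.

```python
def finite_differences(seq, dt):
    """First-order differences of a list of (x, y) tuples."""
    return [((b[0] - a[0]) / dt, (b[1] - a[1]) / dt) for a, b in zip(seq, seq[1:])]


def squared_norm_sum(seq):
    """Sum of squared Euclidean norms of a list of (x, y) tuples."""
    return sum(x * x + y * y for x, y in seq)


def objective(points, reference, w_ref=1.0, w_acc=1.0, w_jerk=1.0, dt=1.0):
    """Weighted sum of reference deviation, acceleration, and jerk terms."""
    f_ref = sum((px - rx) ** 2 + (py - ry) ** 2
                for (px, py), (rx, ry) in zip(points, reference))
    vel = finite_differences(points, dt)
    acc = finite_differences(vel, dt)
    jerk = finite_differences(acc, dt)
    return w_ref * f_ref + w_acc * squared_norm_sum(acc) + w_jerk * squared_norm_sum(jerk)


straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
kinked = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
cost_straight = objective(straight, straight)  # constant velocity, no deviation
cost_kinked = objective(kinked, kinked)        # no deviation, but abrupt turns
```

The kinked first-guess trajectory is penalized by the comfort terms even though it matches its own reference exactly, so the optimizer has an incentive to round off the corners while $f_{ref}$ keeps the result close to the planned path.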
Inequality constraints are formulated to prevent collisions caused by the rearrangement of the
decision variables. Therefore, collisions with static and dynamic obstacles are checked by a
strategy adapted from Zhang et al. [59]. The collision check is based on circles, whereby the
principal idea is that each car, including the ego vehicle, is represented by two circles, and
other traffic participants, such as pedestrians, are represented by one circle, as illustrated in
Figure 6.5. For collision avoidance, the distance from each circle center of the ego vehicle to
all circle centers representing other road users must be greater than or equal to the sum of
the radii of the respective circles. Depending on the parameterization of the radius, larger
or smaller safety distances can be enforced. Equation (6.3) defines the inequality constraint for
collision avoidance, with $X$ describing the center points of the ego vehicle and obstacles and $P$
defining the number of obstacles. As the dimension of the constraint vector $g(t)$ of any
trajectory $t$ containing $N$ time steps is already $(4P + 1) \cdot N$, only crucial constraints are
included as inequality constraints to maintain a balance in complexity.
$g_{i,coll}(t) = \begin{bmatrix} g^{0}_{i,coll}(X) \\ \vdots \\ g^{p}_{i,coll}(X) \end{bmatrix} \le 0_{4 \cdot P}, \quad 0_{4 \cdot P} \in \mathbb{R}^{4 \cdot P}$ (6.3)
[Figure 6.5, diagram labels: Ego Vehicle with circle centers $X_{i,f}$ and $X_{i,r}$ and radius $r_{veh}$;
Obstacle with circle centers $X^{obs}_{i,f}$ and $X^{obs}_{i,r}$ and radius $r_{obs}$; distance $d_{obs,r-f}$]
Figure 6.5: Circle method for collision-free motion planning inspired by [66].
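The circle-based constraint can be sketched for a single time step as follows. The helper names `car_circles` and `collision_constraints`, the circle offset, and the radii are hypothetical; the sign convention follows Equation (6.3), so a value g ≤ 0 means that the corresponding pair of circles does not overlap.

```python
import math


def car_circles(x, y, heading, offset):
    """Front and rear circle centers of a car-like road user."""
    dx, dy = offset * math.cos(heading), offset * math.sin(heading)
    return [(x + dx, y + dy), (x - dx, y - dy)]


def collision_constraints(ego_centers, r_veh, obstacles):
    """Return the g values for one time step; collision-free if all g <= 0.

    `obstacles` is a list of (centers, radius) tuples: two centers for a car,
    one center for a pedestrian.
    """
    g = []
    for centers, r_obs in obstacles:
        for ex, ey in ego_centers:
            for ox, oy in centers:
                g.append((r_veh + r_obs) - math.hypot(ex - ox, ey - oy))
    return g


ego = car_circles(0.0, 0.0, 0.0, offset=1.0)
far_car = (car_circles(10.0, 0.0, 0.0, offset=1.0), 1.2)
near_pedestrian = ([(1.5, 0.0)], 0.5)
g_free = collision_constraints(ego, 1.2, [far_car])          # four values, all negative
g_hit = collision_constraints(ego, 1.2, [near_pedestrian])   # front circle overlaps
```

Two ego circles against two obstacle circles yield the four constraint values per car and time step that appear as the factor 4P in Equation (6.3). When such values are handed to SciPy's SLSQP solver, the sign has to be flipped, because SciPy expects inequality constraints in the form fun(x) ≥ 0.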
For solving the optimization problem, the SLSQP solver from SciPy was chosen [215]. It is
noteworthy that commercial solvers may offer more efficient solution methods. However, SciPy
was selected for scientific comparability due to its open-source nature and its provision of the
essential features for this optimization task.
Post Smoothing:
No kinematic constraints were included in order to reduce the complexity of the formulated
numerical optimization problem. Investigations have shown that considering kinematic
constraints, such as limiting the maximal possible curvature, leads to poor optimization
results and high runtime [216]. Therefore, the kinematic feasibility of the trajectory computed
by the motion planning layer is no longer guaranteed. For this reason, a method using B-spline
interpolation is employed to smooth the final spatial path and guarantee feasibility. Rather than
other typically used methods, such as polynomials or Bezier curves, a cubic B-spline is
chosen as the interpolation method since, according to Zhu et al., it provides more flexibility
for curvature control [217]. The interpolation algorithm merely outputs a curvature-optimized
spatial component of the entire trajectory. The locations $X_i$ of the initial trajectory can
be projected onto the computed spline to complete this postprocessing method and improve
spatial smoothness. The process of interpolation and curvature control produces a smooth
and kinematically feasible trajectory as an output for the ego vehicle.
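The smoothing effect of a cubic B-spline can be illustrated with the uniform basis functions. Note the assumption: this sketch uses the trajectory points directly as control points, so the curve approximates rather than interpolates them, and it contains no explicit curvature control. It is therefore a simplified stand-in for the method described above, and the function names are hypothetical.

```python
def cubic_bspline_basis(u):
    """Uniform cubic B-spline basis functions for a local parameter u in [0, 1]."""
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0,
        u ** 3 / 6.0,
    )


def sample_bspline(control_points, samples_per_segment=10):
    """Sample a uniform cubic B-spline defined by a list of (x, y) control points."""
    curve = []
    for i in range(len(control_points) - 3):
        window = control_points[i:i + 4]  # four control points per segment
        for j in range(samples_per_segment):
            basis = cubic_bspline_basis(j / samples_per_segment)
            curve.append((
                sum(b * p[0] for b, p in zip(basis, window)),
                sum(b * p[1] for b, p in zip(basis, window)),
            ))
    return curve


# A path with a sharp 90 degree corner at (10, 0).
rough_path = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (10.0, 5.0), (10.0, 10.0)]
smooth_path = sample_bspline(rough_path)
```

A cubic B-spline is continuous up to its second derivative, so the sampled curve rounds off the corner with continuous curvature. This is the property that makes this curve family attractive for restoring kinematic feasibility.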
6.3.3 Parametrization
Compared to heuristic decision-making, planning approaches better reflect the human way
of dealing with complex traffic situations by balancing between different needs, such as
time-efficient target reaching or the need for safety. Such needs are reflected in the various costs
and constraints incorporated into the planning framework. The resulting trajectory, and thus the
behavior, strongly depends on how these cost values are weighted relative to each other,
which is part of the parameterization task. The parameterization can have a large impact on
the quality of results and the runtime. At the same time, it is challenging to find a globally valid
parameter set satisfying the diversity of situations encountered in urban traffic.
Therefore, the present method establishes a GA (Genetic Algorithm) to find a set of suitable
parameters. This technique takes advantage of the principles of genetics and natural
selection, in which a population of individuals evolves towards a stronger one with respect to
a defined fitness function [218]. The approach of using a GA offers the opportunity to find
a set of parameters for one individual scenario or across multiple scenarios satisfying a
custom-specified fitness function, which allows application-specific targets to be prioritized.
In this case, the fitness function was designed to consider runtime and quality in the sense of
comfort values and feasibility.
Initially, a population of randomly parameterized instances of the behavioral layer is generated,
which are applied to the training scenarios in order to evaluate the respective performance
regarding quality and runtime. The individual instances showing the best performance
are selected for mutual crossover by splitting the genomes and combining them with other
genome parts. In this way, a new generation of a population is created, and the process
of evaluation and crossover can be repeated. Moreover, random mutations are introduced
across individual genes to enable some stochasticity in the process. The final result provides
individual parameterizations with high fitness values. Based on empirical tests, the GA was
initialized with a random population of 50 individuals, 50 generations to be assessed, and
a mutation probability of 10%. The algorithm returns a set of ten so-called Paretos, which
are individual parameter sets with similar results regarding the fitness function, reflecting the
optimum of either efficiency or quality. Providing ten Paretos allows one to choose the
extrema with the best runtime or quality, as well as various trade-offs in between.
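The loop of evaluation, selection, crossover, and mutation can be sketched as follows. The sketch uses the population size, number of generations, and mutation probability stated above, but everything else is an assumption: it optimizes a single scalar toy fitness instead of the runtime and quality objectives, keeps the best individual unchanged in each generation, and returns one parameter set rather than ten Paretos.

```python
import random


def evolve(fitness, n_genes, pop_size=50, generations=50, p_mut=0.1, seed=0):
    """Minimal single-objective genetic algorithm over genomes in [0, 1)^n_genes."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # best-performing half
        next_generation = [parents[0][:]]       # keep the best individual (assumption)
        while len(next_generation) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)     # crossover point
            child = mother[:cut] + father[cut:]
            # Mutate each gene independently with probability p_mut.
            child = [rng.random() if rng.random() < p_mut else gene for gene in child]
            next_generation.append(child)
        population = next_generation
    return max(population, key=fitness)


def toy_fitness(genome):
    """Placeholder fitness: highest when every gene equals 0.5."""
    return -sum((gene - 0.5) ** 2 for gene in genome)


best = evolve(toy_fitness, n_genes=4)
```

In the setting described above, each genome would encode the cost weights of the behavioral layer, and evaluating the fitness would require running the planner on the training scenarios, which dominates the cost of the parameterization.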
6.3.4 Evaluating Strategy
The evaluation addresses the previously defined research questions by investigating four key
scenarios inspired by the use case analysis in Section 3.3.3. The scenarios aim to represent
the main challenges in urban traffic and are illustrated in Figure 6.6. In the following, the
individual scenarios and the high-level behavior that is expected to be shown by the ego
vehicle are described. The definition of expectations enables the subjective assessment of the
plausibility of behavior. Please note: expected behavior is defined by an individual expert
perspective and has no guarantee of generic validity at this point.
• (A_VEH) ROW-regulated interactions when turning left at an intersection:
The ego vehicle approaches a four-leg intersection and intends to turn left, while an
oncoming vehicle crosses the intersection going straight, and a pedestrian crosses the
road after the curve.
Expected behavior: The ego vehicle is expected to complete the turning maneuver
without colliding with or violating the ROW of any other traffic participant.
• (B_PED) Pedestrians crossing the road:
The ego vehicle faces a situation with pedestrians walking along the sidewalk and two
pedestrians crossing the road at some point.
Expected behavior: The ego vehicle is expected to follow the lane without compromising
the safety of pedestrians.
• (C_STAT) Partially occupied lane with oncoming traffic:
The ego vehicle's lane is occupied by a static obstacle. In the oncoming traffic lane,
another vehicle is driving.
Expected behavior: The ego vehicle is expected to overtake the obstacle, aiming to reach
its target behind it without colliding with the other vehicle or violating its ROW.
• (D_BIC) Slower cyclist in front with oncoming traffic:
The ego vehicle is driving behind a slower cyclist. In addition,
a vehicle is driving in the oncoming lane.
Expected behavior: The ego vehicle is expected to overtake the cyclist, aiming to reach its
target in time without colliding with the other vehicle, violating its ROW, or compromising
the cyclist's safety.
To address research questions R3.1 and R3.2, both planning approaches are compared with
each other and with the heuristic TRM model in terms of runtime and quality of the result in
the above-mentioned scenarios. Runtime is reported both in absolute terms and normalized to a
planning horizon of 5 seconds for all layers. Furthermore, the number of iterations and explored
nodes for the behavioral layer are provided, as runtime strongly depends on the machine and the
level of code optimization.
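The normalization formula is not stated explicitly. A linear scaling of the absolute runtime by the ratio of the 5-second horizon to the trajectory duration $t_{target}$ is an assumption, but it reproduces the values reported for scenario A_VEH in Table 6.2, as the following sketch shows (the function name is hypothetical).

```python
def normalize_runtime(runtime_s, trajectory_duration_s, horizon_s=5.0):
    """Scale a measured runtime linearly to a fixed planning horizon (assumed relation)."""
    return runtime_s * horizon_s / trajectory_duration_s


# Total latency of PL_3D in scenario A_VEH (Table 6.2): 0.396 s for a 17.0 s trajectory.
normalized_ms = 1000.0 * normalize_runtime(0.396, 17.0)
print(f"{normalized_ms:.2f} ms")  # 116.47 ms
```

The same relation yields 376.42 ms for the PL_PVD total latency of 1.197 s with a target time of 15.9 s, matching the tabulated value.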
No runtime comparison is made between the planning concepts and the TRM model since the
TRM model is an optimized and integrated C++ module within the simulation framework at
BMW, while the two planning approaches are conceptual implementations in Python without
further code optimization.
The following criteria are investigated to measure the quality of the generated trajectories:
• Comfort: measuring maximal acceleration of the first-guess trajectory $acc^{FG}_{max}$ [m/s²] and
of the final trajectory $acc^{FINAL}_{max}$ [m/s²]
• Time-efficient target reaching: average driven velocity $vel_{mean}$ [km/h] and time to reach
the target point $t_{target}$
• Criticality: minimal distance to other traffic participants $d^{dyn}_{min}$ and minimal distance to
static obstacles $d^{stat}_{min}$
Furthermore, quality is assessed subjectively by checking whether the aforementioned behavioral
expectations are met.
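For a trajectory sampled at a fixed time step, the objective criteria can be computed along the following lines. The function name, the input format, and the use of center-to-center distances (ignoring vehicle dimensions) are simplifying assumptions for illustration.

```python
import math


def quality_metrics(positions, others, dt):
    """Compute comfort, efficiency, and criticality measures.

    positions: ego (x, y) per time step; others: list of tracks with one
    (x, y) per time step each; dt: sampling interval in seconds.
    """
    speeds = [math.hypot(b[0] - a[0], b[1] - a[1]) / dt
              for a, b in zip(positions, positions[1:])]
    accs = [(v1 - v0) / dt for v0, v1 in zip(speeds, speeds[1:])]
    duration = dt * (len(positions) - 1)
    distance = sum(speeds) * dt
    d_min = min(math.hypot(p[0] - q[0], p[1] - q[1])
                for track in others for p, q in zip(positions, track))
    return {
        "acc_max": max(accs),                       # m/s^2
        "acc_min": min(accs),                       # m/s^2
        "vel_mean_kmh": 3.6 * distance / duration,  # km/h
        "t_target": duration,                       # s
        "d_min": d_min,                             # m
    }


ego = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (35.0, 0.0)]
pedestrian = [(0.0, 5.0), (10.0, 4.0), (20.0, 3.0), (35.0, 6.0)]
metrics = quality_metrics(ego, [pedestrian], dt=1.0)
```

In this toy example the ego vehicle covers 35 m in 3 s, which corresponds to a mean velocity of 42 km/h, and comes within 3 m of the pedestrian.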
Addressing research question R3.3, the evaluation aims to provide insight into the potentials
and the sensitivities of such planning approaches towards different parameters and situational
changes. For this purpose, the quality and runtime of the PL_3D approach when solving
the cyclist scenario are evaluated under different conditions. For evaluating the influence of
different parameter sets, the presented GA method is used to create the following parameter
sets for the cyclist scenario:
• A best runtime and a best quality parameter set.
• A parameter set resulting from a global parameterization among all key scenarios and a
scenario-specific parameter set.
In order to analyze the sensitivity of such approaches to changes in the scenario, two variants
of the cyclist scenario were created. Velocities and positions were selected to significantly
change the interaction of the ego vehicle with the other traffic participants.
• Variation D1: the cyclist scenario with a different initial and desired velocity of the ego
vehicle: 35 km/h instead of 50 km/h.
• Variation D2: initial and desired velocity of 35 km/h and the same scenario with a
different starting position of the oncoming vehicle, influencing the interaction.
6.3.5 Implementation Details
For comparison, all key scenarios are created in Spider, which is BMW's proprietary simulation
framework [151, 55]. The PL_PVD planner is initialized with a desired velocity of 50 km/h
and $dstep = 5$ m. The PL_3D planner is initialized with a desired velocity of 50 km/h and
$acc \in [0, 3.75, 5.75, 8.75, 13.88]$ m/s². For the variations D1 and D2, the desired velocity is
set to 35 km/h and $acc \in [0, 3.75, 5.75, 8.75, 13.88]$ m/s². These acceleration values are oriented
to typical velocities (10, 20, 30, 50 km/h) so that they are reached in approximately one second.
For both planners, the steering range is defined as $\varphi \in [-36^\circ, 0, +36^\circ]$. Both trajectory
planning concepts are implemented in Python 3.8 on an HP Z840 workstation with an Intel(R)
Xeon(R) CPU E5-2640 v4 @ 2.40 GHz and 64 GB RAM.
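The relation between the typical velocities and the acceleration values is a unit conversion: reaching a velocity v from standstill within one second requires a constant acceleration of v/3.6 m/s² when v is given in km/h. The sketch below lists these quotients next to the configured values; the function name is hypothetical.

```python
TARGET_SPEEDS_KMH = (10, 20, 30, 50)
CONFIGURED_ACCS = (3.75, 5.75, 8.75, 13.88)  # m/s^2, non-zero values from the text


def required_acceleration(speed_kmh, duration_s=1.0):
    """Constant acceleration in m/s^2 needed to reach speed_kmh from standstill."""
    return speed_kmh / 3.6 / duration_s


for speed_kmh, configured in zip(TARGET_SPEEDS_KMH, CONFIGURED_ACCS):
    exact = required_acceleration(speed_kmh)
    print(f"{speed_kmh} km/h in 1 s: {exact:.2f} m/s^2 (configured: {configured} m/s^2)")
```

The configured values are of the same order as, but not identical to, the exact quotients, which is in line with the word "approximately" in the text.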
[Figure 6.6, panels A, B, C, D; legend: Ego Vehicle, Other Vehicle, Cyclist, Static Obstacle,
Pedestrian]
Figure 6.6: The four key scenarios for evaluation.
6.4 Results
Tables 6.2 and 6.3 provide the results for all four key scenarios for the two trajectory
planners and the heuristic decision model (TRM) with respect to the aforementioned evaluation
criteria. In addition to the following selected illustrations, pictures of the resulting
trajectories and figures of the velocity, acceleration, and jerk profiles for all scenarios can be
found in Appendix A.2.
6.4.1 Potential of Trajectory Planning Versus Heuristic Approaches for Decision-Making
When considering the ability of the different decision-making strategies from a functional
perspective, only the PL_3D approach is able to solve all scenarios, as summarized in Table
6.1. The PL_PVD planner is not able to solve scenario D_BIC, in which spatial and temporal
movement are strongly interdependent, and therefore cannot overtake the cyclist. The heuristic
model (TRM) is able to solve the ROW-regulated intersection (A_VEH) and interactions
with pedestrians (B_PED) but runs into a deadlock in scenarios C_STAT and D_BIC. In
these cases, the heuristic model does not consider the oncoming traffic lane as a driveable area,
and thus, none of the predefined maneuvers match the situation.
Table 6.1: The ability of the two planners and the heuristic agent model to solve the defined key
scenarios from a functional perspective.
| | Scenario A_VEH | Scenario B_PED | Scenario C_STAT | Scenario D_BIC |
|---|---|---|---|---|
| TRM | √ | √ | deadlock | deadlock |
| PL_PVD | √ | √ | √ | deadlock |
| PL_3D | √ | √ | √ | √ |
When comparing the calculation effort between the PL_PVD and the PL_3D approach, one
can observe that generating a first-guess trajectory is more costly when applying the PL_3D
approach in scenarios A_VEH, B_PED, and D_BIC. Meanwhile, the PL_PVD approach
shows significantly higher runtimes in the motion planning layer for all scenarios, as the
first-guess trajectory of the PL_3D planner is already smoother based on the third dimension
of time. When analyzing the behavior generated by the different decision-making strategies,
one can observe different levels of cautiousness in the individual interactions, influenced by
parameters and representations. The extent to which the individual behaviors of the planners
are in line with a distribution of human-driven trajectories requires further investigation and
can be influenced by the parameter setting. Regarding the criticality of results, all approaches
satisfy the requirement of not colliding with any other traffic participants or static obstacles.
However, when it comes to functionality, the approaches differ in their ability to reach the
target.
The ROW-regulated interaction in scenario A_VEH is solved in a functional and non-critical
manner by all approaches, but different behaviors are exhibited, as illustrated in Figure 6.7.
The heuristic model TRM lets the vehicle and the pedestrian pass first, while both planners
cross the conflict zone before the other traffic participants. Based on environmental information,
the heuristic model chooses the maneuver "give right of way" when approaching the intersection
and, therefore, slows down early. The parameterization of the maneuver, which determines under
which conditions and at which distances the oncoming vehicle is identified as relevant for giving
ROW, results in conservative behavior. In contrast, both planners balance between the needs of
target reaching, a comfortable trajectory, and keeping enough distance to dynamic obstacles
without explicitly formulating the ROW regulation at this point. The trajectory of the PL_3D
approach exhibits some curve-cutting behavior. This shows that the planner would require
some fine-tuning of the weights assigned to the costs penalizing deviation from the center-line
and to the comfort costs for this scenario. Similar effects of showing different but all
rule-compliant behaviors can likewise be observed in the B_PED scenario, as illustrated in
Appendix A.2.2.
In scenario C_STAT, where the ego lane is occupied by a static obstacle, only the
two planners are able to solve the scenario. Due to the solid lane marking, the oncoming
lane is not considered a driveable area by the heuristic model, resulting in a deadlock in this
situation. Both the PL_3D and the PL_PVD controlled vehicle overtake the obstacle but
show differences in their spatio-temporal motion and yielding behavior, as illustrated in Figure
6.8. The PL_3D controlled vehicle drives far into the oncoming lane in scenarios C_STAT and
D_BIC, as shown in Figure 6.9. This indicates that the weighting of the center-line deviation
cost and the APF cost relative to each other needs to be further tuned.
Scenario D_BIC is solved only by the PL_3D approach, while both the heuristic model and the
PL_PVD planner stay behind the cyclist, as shown in Figure 6.10. Based on the separation of
planning in space and time, the dynamic movement of the cyclist is not considered during the
path planning, and thus, the planner is not able to overtake the cyclist. The profiles of velocity,
acceleration, and jerk shown in Figure 6.11 illustrate that the implementation of the PL_PVD
planner is also not suitable for a typical following maneuver, since the planner alternates between
accelerating and waiting, resulting in jerky movement. For the heuristic model, again, the
oncoming traffic lane is not considered a driveable area, preventing the model from overtaking.
[Figure 6.7, panels: (a) Scenario A_VEH solved by TRM. (b) Scenario A_VEH solved by PL_PVD.
(c) Scenario A_VEH solved by PL_3D.]
Figure 6.7: Behavior of the heuristic agent model and the two planners in scenario A_VEH at
the same time step. Ego vehicle in green.
[Figure 6.8, panels: (a) Scenario C_STAT solved by TRM. (b) Scenario C_STAT solved by PL_PVD.
(c) Scenario C_STAT solved by PL_3D.]
Figure 6.8: Behavior of the heuristic agent model and the two planners in scenario C_STAT at
the same time step. Ego vehicle in green.
[Figure 6.9, panels: (a) Scenario C_STAT solved by PL_3D. (b) Scenario D_BIC solved by PL_3D.]
Figure 6.9: PL_3D driving far into the oncoming lane in scenarios C_STAT and D_BIC. Ego vehicle
in green.
Considering the balance between formulating a complex cost function and reducing the
complexity of the planning problem, the designed cost functions of both planners show great
potential, as all scenarios can be solved by the PL_3D planner, and even if the PL_PVD
planner fails in scenario D_BIC, the reason does not lie in the design of the cost function.
Table 6.2: Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent
model (TRM) in scenarios A_VEH (left turn at intersection) and B_PED (pedestrian crossing).
| Criterion | A_VEH: PL_3D | A_VEH: PL_PVD | A_VEH: TRM | B_PED: PL_3D | B_PED: PL_PVD | B_PED: TRM |
|---|---|---|---|---|---|---|
| Objective quality: $acc^{FG}_{max}$ | 3.50/-0.12 m/s² | 3.59/-0.0 m/s² | - | 13.80/-0.00 m/s² | 1.96/-0.00 m/s² | - |
| Objective quality: $acc^{FINAL}_{max}$ | 3.64/-0.01 m/s² | 3.98/-0.0 m/s² | 3.28/-6.03 m/s² | 10.93/-0.02 m/s² | 1.89/0.00 m/s² | 3.27/-8.67 m/s² |
| Objective quality: $vel_{mean}$ | 49.71 km/h | 55.0 km/h | 25.36 km/h | 45.65 km/h | 55.02 km/h | 32.86 km/h |
| Objective quality: $t_{target}$ | 17.0 s | 15.9 s | 34.96 s | 17.5 s | 14.7 s | 25.24 s |
| Objective quality: $d^{dyn}_{min}$ | >20 m | >20 m | 1.40 m to pedestrian | 2.11 m to pedestrian | 1.73 m to pedestrian | 1.54 m to pedestrian |
| Objective quality: $d^{stat}_{min}$ | - | - | - | - | - | - |
| Subjective quality: observed behavior | pass intersection before vehicle and pedestrian (without collision) | pass intersection before vehicle and pedestrian (without collision) | let vehicle and pedestrian pass | let pedestrian pass | pass before pedestrian (without colliding) | let pedestrian pass |
| Subjective quality: expectation satisfied? | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| Runtime: latency BH, total (per 5 s) | 0.164 s (48.24 ms) | path: 0.086 s (27.04 ms); vel. profile: 0.021 s (6.60 ms) | <1 ms, replanning every 50 ms | 0.219 s (62.57 ms) | path: 0.071 s (24.15 ms); vel. profile: 0.037 s (12.59 ms) | <1 ms, replanning every 50 ms |
| Runtime: explored nodes | 176 | path: 332; vel. profile: 326 | - | 242 | path: 304; vel. profile: 302 | - |
| Runtime: iterations | 89 | path: 167; vel. profile: 164 | - | 122 | path: 153; vel. profile: 152 | - |
| Runtime: latency MP, total (per 5 s) | 0.215 s (63.24 ms) | 1.072 s (337.11 ms) | - | 0.264 s (75.43 ms) | 1.897 s (645.24 ms) | - |
| Runtime: latency PP, total (per 5 s) | 0.017 s (5 ms) | 0.018 s (5.66 ms) | - | 0.018 s (5.14 ms) | 0.017 s (5.78 ms) | - |
| Runtime: total latency, total (per 5 s) | 0.396 s (116.47 ms) | 1.197 s (376.42 ms) | - | 0.501 s (143.14 ms) | 2.022 s (687.76 ms) | - |
[Figure 6.10, panels: (a) Scenario D_BIC solved by TRM. (b) Scenario D_BIC solved by PL_PVD.
(c) Scenario D_BIC solved by PL_3D.]
Figure 6.10: Behavior of the heuristic agent model and the two planners in scenario D_BIC at
the same time step. Ego vehicle in green.
Figure 6.11: Profiles showing velocity, acceleration, and jerk of the PL_PVD planner in scenario
D_BIC for the final trajectory.
Table 6.3: Comparing the results of the planners PL_PVD and PL_3D with the heuristic agent
model (TRM) in scenarios C_STAT (static obstacle) and D_BIC (cyclist).
| Criterion | C_STAT: PL_3D | C_STAT: PL_PVD | C_STAT: TRM | D_BIC: PL_3D | D_BIC: PL_PVD | D_BIC: TRM |
|---|---|---|---|---|---|---|
| Objective quality: $acc^{FG}_{max}$ | 3.50/-0.00 m/s² | 15.27/0.0 m/s² | - | 3.26/-0.00 m/s² | 16.64/0.00 m/s² | - |
| Objective quality: $acc^{FINAL}_{max}$ | 1.81/-0.04 m/s² | 10.70/0.00 m/s² | 0.0/-3.8 m/s² | 2.60/-0.00 m/s² | 11.65/-0.0 m/s² | 2.78/-6.66 m/s² |
| Objective quality: $vel_{mean}$ | 49.55 km/h | 44.82 km/h | 32.68 km/h | 49.65 km/h | 20.59 km/h | 17.52 km/h |
| Objective quality: $t_{target}$ | 11.4 s | 13.0 s | DEADLOCK | 16.0 s | 39.4 s | 39.88 s |
| Objective quality: $d^{dyn}_{min}$ | 5.16 m to car | 0.43 m to car | 1.76 m to car when passing | 2.06 m to cyclist | 0.57 m to cyclist | 3.61 m to cyclist |
| Objective quality: $d^{stat}_{min}$ | 2.09 m | 1.25 m | 11.56 m | - | - | - |
| Subjective quality: observed behavior | overtakes obstacle after oncoming vehicle | overtakes obstacle after oncoming vehicle | deadlock in front of obstacle | overtakes cyclist | remains behind cyclist | remains behind cyclist |
| Subjective quality: expectation satisfied? | TRUE | TRUE | FALSE | TRUE | FALSE | FALSE |
| Runtime: latency BH, total (per 5 s) | 0.837 s (367.11 ms) | path: 0.916 s (352.31 ms); vel. profile: 0.043 s (16.54 ms) | <1 ms, replanning every 50 ms | 1.564 s (488.75 ms) | path: 0.075 s (9.52 ms); vel. profile: 0.206 s (26.14 ms) | <1 ms, replanning every 50 ms |
| Runtime: explored nodes | 1118 | path: 567; vel. profile: 388 | - | 1204 | path: 304; vel. profile: 796 | - |
| Runtime: iterations | 487 | path: 220; vel. profile: 195 | - | 606 | path: 153; vel. profile: 399 | - |
| Runtime: latency MP, total (per 5 s) | 0.193 s (84.65 ms) | 1.279 s (491.92 ms) | - | 0.255 s (79.69 ms) | 6.130 s (777.92 ms) | - |
| Runtime: latency PP, total (per 5 s) | 0.017 s (12.91 ms) | 0.020 s (7.46 ms) | - | 0.018 s (5.63 ms) | 0.020 s (2.54 ms) | - |
| Runtime: total latency, total (per 5 s) | 1.048 s (459.65 ms) | 2.257 s (868.08 ms) | - | 1.837 s (574.06 ms) | 6.432 s (816.24 ms) | - |
6.4.2 Sensitivity Towards Different Parameter Sets and Scenario Variations
Addressing research question R3.3, the sensitivity of trajectory planning to changes in the scenario or the parameter set is evaluated by investigating runtime and quality of results of the PL_3D approach in the defined variants of scenario D_BIC. Table 6.4 provides all quality and runtime measures for variants D1 and D2. First of all, it is noteworthy that all variants satisfy the requirements to overtake the bicyclist while avoiding a collision with any other road user, demonstrating the ability of the planner to adapt behavior dynamically to the changed situation. The quality of the trajectories differs in terms of smoothness, especially in Variant D2 when comparing the best-runtime and best-quality Pareto parameter sets, as shown in the velocity, acceleration, and jerk profiles illustrated in Figure 6.13. Meanwhile, the best runtime parameter set saves more than 50% of the total runtime, which demonstrates the great potential of parameter tuning toward a more efficient planner. In addition to the objective runtime measures, Figure 6.12 shows the large difference in the number of nodes explored to solve the scenario. Comparing variants D1 and D2 in terms of runtime, one can see that the calculation effort of the approach is strongly influenced by the interaction happening in the scenario. The shorter distance to the oncoming vehicle is beneficial for the search algorithm since there is less waiting time for the ego vehicle, and thus, it is easier to find a smooth trajectory overtaking the cyclist.
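The stated runtime saving can be checked against the total latencies reported in Table 6.4. The following is only a small arithmetic sketch; it assumes that the claim refers to the total latency of Variant D2, and the helper name relative_saving is hypothetical.

```python
# Total latencies for Variant D2 in seconds, copied from Table 6.4.
total_latency_s = {"best runtime": 1.431, "best quality": 3.023}


def relative_saving(fast: float, slow: float) -> float:
    """Fraction of runtime saved by the faster parameter set."""
    return 1.0 - fast / slow


saving = relative_saving(
    total_latency_s["best runtime"], total_latency_s["best quality"]
)
print(f"Runtime saving in Variant D2: {saving:.1%}")
```

Under this assumption, the saving is slightly above one half, which is consistent with the statement in the text.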
Plots and figures of behavior for the remaining variants can be found in the Appendix (A.2).
(a) Variation D2 best quality parameter set.
(b) Variation D2 best runtime parameter set.
Figure 6.12: Behavior of the PL_3D planner in Variant D2 best quality parameter set (top) versus best runtime parameter set (bottom).
(a) Variation D2 with best quality parameter set: first-guess (top) and final trajectory (bottom).
(b) Variation D2 with best runtime parameter set: first-guess (top) and final trajectory (bottom).
Figure 6.13: Velocity, acceleration and jerk of the PL_3D planner in Variant D2 best quality parameter set (left) versus best runtime parameter set (right).
6.5 Limitations and Future Work
The proposed trajectory planning framework shows high potential in being able to generate reasonable behavior across the great variety of situations encountered in urban traffic. However, certain limitations should be mentioned. Since the behavioral layer intends to plan a high-level maneuver decision, the action space is simplified, and according to the explanation in Section 6.3.2.2, no deceleration is considered. The approach can be extended, but such extensions would also increase complexity and runtime.
During the evaluation, the runtime comparison is only performed between the two planners and not with the heuristic model TRM, since the TRM model is an embedded and optimized C++ module in contrast to the conceptual implementation of the planners. To what extent the runtime of the planning approaches can be tuned towards a level sufficient for the implementation in a multi-agent simulation has to be investigated.

Table 6.4: Variations D1 and D2 for investigating changes in parametrization and scenario for the cyclist scenario D_BIC using the PL_3D planner.

| Measure | Variation D1: Scenario cyclist 35km/h, best runtime | Variation D1: Scenario cyclist 35km/h, best quality | Variation D1: Scenario cyclist 35km/h, global | Variation D1: Scenario cyclist 35km/h, scenario-specific | Variation D2: Scenario cyclist 35km/h, best runtime | Variation D2: Scenario cyclist 35km/h, best quality |
|---|---|---|---|---|---|---|
| Objective quality | | | | | | |
| acc_max (FG) | 1.28/0.0 m/s² | 1.28/0.0 m/s² | 1.28/0.0 m/s² | 1.68/0.0 m/s² | 5.35/0.0 m/s² | 3.77/0.0 m/s² |
| acc_max (FINAL) | 1.35/0.0 m/s² | 1.39/0.0 m/s² | 2.30/0.0 m/s² | 2.20/0.0 m/s² | 3.04/0.0 m/s² | 3.42/0.0 m/s² |
| vel_mean | 31.46 km/h | 31.47 km/h | 31.40 km/h | 31.38 km/h | 31.31 km/h | 31.39 km/h |
| t_target | 25.4s | 25.4s | 25.6s | 25.6s | 23.2s | 23.2s |
| Subjective quality | | | | | | |
| Observed behavior | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist | Ego overtakes cyclist |
| Expectation satisfied? | TRUE | TRUE | TRUE | TRUE | TRUE | TRUE |
| Runtime | | | | | | |
| Latency BH: total (per 5sec) | 2.376s (467.72ms) | 2.707s (532.87ms) | 2.617s (511.13ms) | 2.862s (558.98ms) | 1.00s (216.38ms) | 2.63s (566.81ms) |
| Explored nodes | 1823 | 2022 | 1960 | 2113 | 875 | 1865 |
| Iterations | 744 | 845 | 802 | 863 | 567 | 1030 |
| Latency MP: total (per 5sec) | 0.382s (75.20ms) | 0.327s (64.37ms) | 0.324s (63.28ms) | 0.387s (75.59ms) | 0.410s (88.36ms) | 0.373s (80.39ms) |
| Latency PP: total (per 5sec) | 0.018s (3.54ms) | 0.019s (3.74ms) | 0.016s (3.13ms) | 0.018s (3.54ms) | 0.017s (3.66ms) | 0.019s (4.09ms) |
| Total latency: total (per 5sec) | 2.776s (546.46ms) | 3.053s (600.98ms) | 2.958s (577.73ms) | 3.267s (638.09ms) | 1.431s (308.41ms) | 3.023s (651.51ms) |
Furthermore, due to the focus on the planning problem itself, predictions regarding the future movement of other traffic participants were obtained from GT data. For further development, the handling of prediction uncertainties has to be investigated. Moreover, the question arises of how knowledge should be distributed among several modules within a holistic driver model. Is it sufficient to provide the information regarding, for example, the ROW context to the prediction module, assuming that the resulting prediction corresponds to this regulation, or does the planning module require this information explicitly in order to incorporate it into the planning strategy?
For parametrization, a GA using a simple fitness function was employed. Based on insights provided by this thesis, such as the insufficient weighting between the APF and the lane deviation cost of the PL_3D planner, a more sophisticated fitness function could be formulated, aiming at more human-like motion.
The benefits of the presented planning approaches exceed those of heuristic models only in complex situations. Therefore, such decision strategies should be applied selectively, only when required. One option could be a modular approach that selects the best decision strategy from a set of possible modules depending on the level of complexity of the situation.
Since the planners are more sensitive to changes in the situation or parameters compared to heuristic approaches, an extended evaluation over a large number of scenarios is required. Commonly used evaluation methods, investigating to what extent any generated trajectory is in line with human behavior under similar conditions, are limited to some simple criteria and the visual assessment of the developer. Therefore, the development of an evaluation method capable of objectively assessing the human likeness of trajectories is required and will be addressed in the subsequent chapter.
6.6 Conclusion
In summary, the evaluation showed that trajectory planning has great potential for dynamic decision-making, as the planners are able to generate plausible behavior even in highly complex urban situations while the heuristic model runs into deadlocks. The comparison between the two different planners demonstrated that PL_PVD can produce high-level decisions with lower computational cost but is more costly when smoothing into a feasible trajectory. Furthermore, the decomposition of planning in space and time limits the planner’s ability to produce plausible behavior in scenarios in which temporal and spatial motion interact strongly. Therefore, limitations similar to the heuristic model occur, and only the PL_3D planner was able to solve all levels of complexity. The presented planners allow more dynamic decision-making due to the larger and finer discrete solution space. The runtime and the quality of the resulting trajectory strongly depend on the particular configuration space and, thus, on the individual scenario and the weighting of the different costs to be minimized. As a consequence, such approaches have a higher sensitivity to parameters or situational changes compared to heuristic models. However, at the same time, the planners provide the indispensable basis for a scalable solution to cope with the variety of scenarios occurring in urban traffic.
7 Evaluation Methods
Disclaimer: The present chapter involves research presented in the following publications:
[103]: Teresa Rock et al. “Quantifying Realistic Behaviour of Traffic Agents in Urban Driving Simulation Based on Questionnaires”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022, pp. 1675–1682.
[219]: Teresa Rock et al. “Objectively Scoring the Human-Likeness of Artificial Driver Models”. In: Applied Sciences 13.18 (2023).
The following chapter addresses research question R4: How to identify model
limitations and quantify the degree of human likeness of holistic driver models
and individual subcomponents?
After a brief introduction pointing out the motivation of the chapter, a comprehensive overview of state-of-the-art approaches for driver model evaluation from different research areas is given, and associated potentials and weaknesses are discussed. Following this foundational understanding, two novel methods, an objective and a subjective approach, for evaluating driver models in terms of human likeness are presented. Both methods are applied to the holistic driver model at BMW. To conclude, results and the potentials of the individual methods are critically discussed.
7.1 Introduction and Motivation
Evaluation, as a crucial part of model development, is especially challenging in the area of human-like driver models for urban traffic. First of all, the assessment of human likeness suffers from the absence of a unique definition of what constitutes human likeness. As discussed in Section 3.2, different perspectives and approaches are available, originating from various research areas and targeting individual objectives. Furthermore, human driving style varies due to different personalities, experiences, and situational contexts [127]. The result is behavior individuality in two dimensions, driver-individual and situation-individual [110]. Especially in urban scenarios, behavior is affected by various situational influences, which must, therefore, be taken into account when evaluating the behavior of a driver model in terms of human likeness.
To give a simple example, consider a driver waiting at an intersection to give ROW to another vehicle. Imagine the driver accepted a very small time gap because he had already been waiting for a few minutes and had to give ROW to multiple vehicles, while the driver model decides for a longer time gap. With a common objective metric employing a simple sample-wise comparison of model and driver behavior in this situation, the model behavior would receive a poor rating because it is not in line with this individual human behavior, without any awareness of where both behaviors would be located within a statistical behavior distribution. Evaluating the same scenario subjectively using a contemporary questionnaire, one would possibly receive a good human likeness rating, as the model behavior is still in line with an expected range of cautiousness. In conclusion, the validity of evaluation methods must be handled carefully, always considering the extent to which the respective method allows for general conclusions.
Therefore, evaluation methods used for model development should cover the key requirements of transparency and efficient assessment, also in early development stages, and be adaptable to different applications. The evaluation approach should enable transparency to identify the strengths and weaknesses of the model and provide significant results on a microscopic behavior level, as urban traffic is characterized by situational changes.
As developing human-like driver models is a highly complex task, iterative development processes are required. Therefore, the approach should be efficient and suitable for iterative model improvement.
Finally, since developing human-like driver models has various applications in different research areas, the approach should be transferable to different application domains, allowing weighting according to different priorities.
The previous example and an extensive state-of-the-art analysis have shown that modern objective approaches either do not consider contextual information, focus on macroscopic assessment, or provide assessment strategies for very isolated scenarios and are, therefore, either not scalable or inappropriate. Subjective approaches, on the other hand, consider behavior within context but, due to the use of simple questionnaires, are unable to provide insights into which aspects of model behavior reveal weaknesses. Furthermore, state-of-the-art research intensively addresses the task of developing driver models, planning, and prediction modules for complex urban situations, while the challenge of adequately evaluating the results is rarely discussed.
Therefore, this chapter aims at providing evaluation methods covering the requirements mentioned above. Thus, a novel objective method applicable to both a fully developed driver model and subcomponents and usable for models in various development stages is presented. Furthermore, a subjective method is proposed, which is able to provide crucial insights for DiL applications based on fully implemented agent models.
7.2 State-Of-The-Art: Evaluation Approaches for Human-Like Driver Models
Different perspectives on the assessment of the human likeness of driver models have been presented in Section 3.2. The following section, therefore, provides some additional details on the topic by categorizing methods into objective and subjective approaches and providing examples from the development of AV and DS.
In the area of AV development, most of the objective metrics rely on the direct comparison of modeled driving data to an individual sample of real driving behavior, described as a trajectory, using distance measures [220]. Common metrics for evaluating prediction or prediction frameworks employ displacement errors, measured, for example, as the distance between the actual and predicted trajectories [81]. Such metrics indicate how accurately the predicted trajectory matches the individual human-driven trajectory. However, the use of displacement errors cannot provide information on how functional or plausible the trajectory was. Therefore, in some individual cases, more sophisticated evaluation strategies are applied, e.g., taking into account functional errors such as road violations [190]. A detailed summary of state-of-the-art metrics that are used in the area of trajectory prediction is provided in Section 5.2 in Table 5.2.
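To illustrate what a displacement error measures, the following minimal sketch computes the average and final displacement error between two trajectories sampled at identical time steps. The function name displacement_errors and the example coordinates are illustrative assumptions and are not taken from the cited works.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(
    predicted: Sequence[Point], actual: Sequence[Point]
) -> Tuple[float, float]:
    """Return (average, final) displacement error in the units of the input.

    Both trajectories must be sampled at the same time steps.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between corresponding samples.
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances), distances[-1]


# Hypothetical example: the model drifts sideways relative to the human driver.
human = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
model = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(model, human)
print(ade, fde)
```

The resulting numbers only express the geometric deviation from one individual human trajectory. A model trajectory that is equally plausible but differs from this sample would still receive a high error, which is the limitation described above.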
To quantify the similarity between driver models and human traffic behavior in DS, macroscopic analyses are performed. With the help of endurance tests, synthetic data is generated and compared with real traffic data in typical highway scenarios, such as cut-in or following maneuvers [47]. Typical indicators to describe human behavior in related research are average and maximal velocity, frequency and exceeding of speeding [109], acceleration and headway [110, 221] as well as TTC and longitudinal distance [90]. Based on such parameters, the relative validity of the macroscopic behavior of driver models can be determined [107, 108]. Such methods compare observed macroscopic parameters of artificial vehicles with a distribution of respective parameters among real vehicles. For example, to measure the human likeness of modeled behavior, the distributions generated by the models’ parameters can be compared with the distributions of respective parameters extracted from the real-world driving data. For comparing distributions, statistical approaches such as the Kolmogorov-Smirnov test in the study conducted by Wang et al. [222], or the Kullback-Leibler divergence as in research by Kuefler et al. [73] are applied. However, since DS is primarily applied to highway use cases, most approaches focus on highway traffic and do not consider contextual influences, which in turn raises doubts about the applicability of such methods to more complex urban traffic.
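The two distribution measures named above can be sketched in a few lines. The example below is a simplified illustration using only the standard library; it is not the implementation of the cited studies, and the function names, the bin edges, and the smoothing constant eps are assumptions made for this sketch.

```python
import bisect
import math
from typing import List, Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the largest distance
    between the empirical distribution functions of both samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    largest = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def histogram(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count values per bin; values outside the edges go to the outer bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        index = bisect.bisect_right(edges, v) - 1
        counts[min(max(index, 0), len(counts) - 1)] += 1
    return counts


def kl_divergence(p_counts: Sequence[int], q_counts: Sequence[int],
                  eps: float = 1e-9) -> float:
    """Kullback-Leibler divergence D(P || Q) of two histograms over the
    same bins. eps avoids division by zero for empty bins."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    divergence = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical headway samples in seconds for model and human drivers.
model_headway = [1.2, 1.4, 1.5, 1.6, 1.8]
human_headway = [0.9, 1.3, 1.5, 2.0, 2.4]
edges = [0.0, 1.0, 2.0, 3.0]
d_ks = ks_statistic(model_headway, human_headway)
d_kl = kl_divergence(histogram(model_headway, edges),
                     histogram(human_headway, edges))
print(d_ks, d_kl)
```

Both measures compare whole distributions of a single parameter and therefore contain no information about the situation in which each sample was recorded.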
Subjective approaches, on the other hand, measure human likeness using questionnaires, interviews, or surveys that automatically consider behavior in an individual context. The underlying assumption of such methods is that human likeness is defined by behavior that is either perceived as real or cannot be identified as artificial. Y. Zhang et al., for example, adapted the Turing test and asked participants to classify the driving behavior of another vehicle into either artificial or human-driven [113]. Similarly, Dumbuya et al. asked subjects to rate how realistic they perceived a drive completed by different driver models and how likely it was that the drive was conducted by a real human driver [114]. Further research investigates the human likeness of driver models and the extent to which the perceived realism of a VE is affected by the behavior [223, 103]. In terms of DiL applications, the objective is to create plausible behavior of the traffic agents for valid test conditions [46]. Humans will perceive the behavior of other road users as realistic if it corresponds to what they expect and have experienced in real traffic. Considering the traffic agents as part of a VE, the psychological concept of presence, as a common method for measuring the effectiveness of VEs, should be named [115, 116, 117, 118, 119]. The underlying assumption is that a high sense of presence means that people respond as if the sensory input was real [118], and it therefore measures the participants’ perception of reality. A more detailed description of causes of presence and the relation to driver models is provided in Section 3.3.
7.3 Method 1: Objectively Evaluating the Human Likeness by Introducing a Plausibility Metric
In the following section, a multi-dimensional quality function is proposed, which includes various objective parameters for the characterization of human-like driving behavior in order to objectively evaluate the human likeness of driver models. Driving behavior is discretized into spatio-temporal behavior, i.e., trajectories, and categorized into different driving situations by assigning context parameters, such as the ROW situation, based on the scene representation proposed in Section 3.4.1. The characterization of each driving situation allows for a conditional comparison of model behavior to that of humans in similar situations by selecting a subset of human traffic data showing the respective driving situation. Based on behavior and context parameters, the degree of human similarity of the artificial model can be evaluated by using statistical analysis within a quality function. Since this method attempts to objectify human-like behavior that is difficult to attribute to any objective truth, the developed method is validated with the help of a subjective survey inspired by the Turing Test method.
7.3.1 Concept
The core idea centers on the development of a metric that can objectively quantify the plausibility and, thus, the human likeness of artificially generated driving behavior by evaluating the driven trajectory. The method is required to be adaptable to various use cases and suitable for complex urban traffic scenarios. The evaluation method receives artificial driving data in conjunction with contextual information that describes the driving situation, such as whether the vehicle is yielding the ROW. The metric is constructed by creating a multi-dimensional quality function that includes various parameters to characterize behavioral plausibility and provides a consolidated score of human likeness, as illustrated in Figure 7.1. This scoring technique allows for the efficient assessment of a wide range of samples and scenarios, while the underlying multi-dimensional quality function enables a detailed analysis of individual samples and provides transparency concerning situations or characteristics in which a model exhibits weaknesses. The multi-dimensional quality function comprises functional, dynamic, and interaction-related parameters. Functional parameters are designed to examine the presence of collisions or road violations, while dynamic parameters assess driving behavior by considering factors such as acceleration, speed, and jerk. Interaction-related parameters evaluate the relative movements between the ego and other road users. These parameters are checked against human behavior in analogous situations and can be weighted to accommodate a variety of applications and specific requirements. Finally, these weighted metric components are combined to generate a consolidated human likeness score. The value ranges that signify either passing or failing a parameter check are extracted from real traffic data, and configurable thresholds can be applied to fine-tune the desired value range to satisfy particular needs of the application, such as safety. This modular design is intended to provide high scalability of the method, allowing developers to choose which parameters to consider and how to weight them. For example, the evaluation of a trajectory planner for real traffic might require a higher level of safety in behavior in contrast to the context of artificial road users within DS.
[Diagram: the evaluator receives a trajectory and its situational context, applies the multi-dimensional quality function together with a database of real traffic for behavior comparison, and outputs a human-likeness score.]
Figure 7.1: The general idea of evaluating model behavior within situational context.
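The combination of range checks and weights described above can be sketched as follows. This is only an illustrative assumption of how such a score could be assembled and not the quality function of this thesis, which is formulated in Section 7.3.1.4. The percentile bounds, the function names, and the parameter names max_acc and min_gap are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile of sorted data, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("no samples available")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def human_range(samples: Sequence[float], lower_q: float = 5.0,
                upper_q: float = 95.0) -> Tuple[float, float]:
    """Value range regarded as human-like, extracted from real traffic data.

    The percentile bounds play the role of the configurable thresholds."""
    ordered = sorted(samples)
    return percentile(ordered, lower_q), percentile(ordered, upper_q)


def human_likeness_score(model_values: Dict[str, float],
                         human_samples: Dict[str, Sequence[float]],
                         weights: Dict[str, float]) -> float:
    """Weighted share of parameter checks passed, between 0 and 1."""
    total_weight = sum(weights[name] for name in model_values)
    passed_weight = 0.0
    for name, value in model_values.items():
        low, high = human_range(human_samples[name])
        if low <= value <= high:
            passed_weight += weights[name]
    return passed_weight / total_weight


# Hypothetical data: one parameter inside, one outside the human range.
human_samples = {
    "max_acc": [float(i) for i in range(101)],
    "min_gap": [float(i) for i in range(101)],
}
model_values = {"max_acc": 50.0, "min_gap": 99.0}
weights = {"max_acc": 2.0, "min_gap": 1.0}
score = human_likeness_score(model_values, human_samples, weights)
print(round(score, 3))
```

Raising the weight of a safety-related parameter or narrowing its percentile bounds would make the score stricter for applications in real traffic, while the per-parameter results remain available for a detailed analysis.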
Context parameters are assigned to identify the specific situational context in which the artificial driving data is generated in order to enable the systematic comparison of quality parameters under analogous conditions. These context parameters serve as identifiers and facilitate the extraction of a subset of analogous real-world traffic situations from a database filtered by equivalent context parameters. In this way, artificial behavior can be compared with human behavior under similar conditions, such as comparing the trajectory of a left-turning vehicle giving ROW with those of various human drivers handling the same situation.
In order to investigate the extent to which the proposed metric is able to replace the subjective evaluation of a human assessing artificial behavior, the method is validated through a subjective survey. During the survey, participants are asked to rate the human likeness of drivers in different situations, without knowing whether the vehicle was driven by a human or by a model. Based on the participant ratings and the automatically computed objective human likeness score, the effectiveness of the proposed metric can be validated. The present concept can be divided into the following steps: specification of parameters to characterize human driving behavior, identification of context-based similarity of situations, preparing databases, formulation of the quality function, and validation, as shown in Figure 7.2.
[Diagram blocks: define parameters that indicate human-like behavior; identify required contextual information; formulate quality function and scoring; evaluator; driver model; synthetic trajectory data + context; data of real traffic; real trajectory data + context; human likeness score; survey (validate, "evaluation of evaluator").]
Figure 7.2: Illustration of the concept toolchain including metric development and validation strategy of the method.
7.3.1.1 Parameter Specification
In the first step, parameters for the evaluation of human-like driving behavior have to be defined. Inspired by literature, Table 7.1 provides an overview of potential parameters to characterize driving behavior, which are distinguished by the following categories:
• Functional: Did a collision occur, and did the trajectory violate the driveable area?
• Dynamic: Is motion, measured by acceleration, jerk, and velocity, in a human-like range?
• Interaction: Are cautiousness and criticality in interactive situations, measured by parameters such as time gaps and distances, in a plausible range?
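As an illustration of the dynamic category, velocity, acceleration, and jerk can be approximated from a sampled trajectory by finite differences. The sketch below assumes longitudinal positions along the path at a constant sampling interval; the function names are hypothetical, and real data would typically require smoothing before differentiation.

```python
from typing import List, Sequence, Tuple


def finite_difference(values: Sequence[float], dt: float) -> List[float]:
    """Forward difference of a uniformly sampled signal."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def longitudinal_profiles(
    positions_m: Sequence[float], dt: float
) -> Tuple[List[float], List[float], List[float]]:
    """Velocity [m/s], acceleration [m/s^2] and jerk [m/s^3] derived from
    positions along the path, sampled every dt seconds."""
    velocity = finite_difference(positions_m, dt)
    acceleration = finite_difference(velocity, dt)
    jerk = finite_difference(acceleration, dt)
    return velocity, acceleration, jerk


# Hypothetical example: constant acceleration of 2 m/s^2 from standstill.
dt = 0.1
positions = [(dt * i) ** 2 for i in range(6)]
velocity, acceleration, jerk = longitudinal_profiles(positions, dt)
print(max(acceleration), max(abs(j) for j in jerk))
```

Each differentiation step shortens the series by one sample and amplifies measurement noise, which is why jerk is the most sensitive of the three quantities.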
Table 7.1: Overview of parameters for describing human likeness and plausibility of behavior.

| Category | Parameter | Reference |
|---|---|---|
| Functional | Road violation | [81, 112, 190] |
| Functional | Collision check | [170] |
| Dynamic | Lateral Velocity | [2] |
| Dynamic | Longitudinal Velocity | [224, 225, 2, 226, 227] |
| Dynamic | Lateral Acceleration | [224, 2, 226] |
| Dynamic | Longitudinal Acceleration | [224, 2, 226, 227] |
| Dynamic | Longitudinal Jerk | [226] |
| Interaction | Relative Velocity | [224, 2, 226, 227] |
| Interaction | Distance to the Partner | [2, 226] |
| Interaction | TET (Time Exposed Time-to-Collision) | [225] |
| Interaction | Max. Value for Critical Time Gap when Interacting | [228] |
| Interaction | PET (Post Encroachment Time) | [225] |

The parameters can be calculated for all samples of real and artificially generated behavioral data, provided that the spatio-temporal motion, road user classification, and information about the static environment, i.e., the map, are available. Algorithms to calculate interaction-related information are based on a fusion of map and time series data according to the concept and data formats presented in Chapter 4. In addition to the scene representation presented in Chapter 4, the interaction category contains parameters such as PET (Post Encroachment Time) that require knowledge of a longer time series of motions and, therefore, can only be computed offline and would not be suitable for a situation representation for a driver model. Therefore, the calculation of PET, TET (Time Exposed Time-to-Collision), and the critical time gap is described in detail:
• PET: is the time difference between the time when the first vehicle leaves and when the second vehicle enters the conflict zone, as illustrated in Figure 7.3a. The start and end positions of the conflict zone can be determined by the overlapping multi-polygons, each describing the spatial movement of a vehicle as the union of its bounding boxes over all time steps. Based on this, the time can be calculated by checking for each time step whether the two bounding boxes representing the vehicles intersect. Implemented calculations are based on the work of Allen et al. [229].
• TET: is an aggregate metric encompassing all time intervals during which a pair of interacting vehicles exhibited TTC values lower than a defined threshold. TET serves the purpose of assessing the duration of a conflict, and calculations are adapted from Minderhoud et al. [230]. Interactions are considered pair-wise between the ego vehicle and its partner. If the trajectories of the pair show an intersection, a motion polygon for each vehicle can be calculated by the respective bounding boxes over time. Based on a conflict zone extracted from the overlap of motion polygons, TTC can be calculated for all time steps of the interaction until either the ego vehicle or the interaction partner leaves the conflict zone by assuming velocity to be constant in each time step. TET is the cumulative time duration, during which the TTC of the ego vehicle was below a critical value, as illustrated in Figure 7.3b. The critical value was set to six seconds, as this is the threshold of interacting vehicles being mutually influenced according to real traffic analysis [231].
• Critical time gap: Gap acceptance is a concept primarily applicable to unsignalized intersections, which commonly occur at the junction of major and minor roadways. The calculation is inspired by Raff et al. [228]. When interaction takes place, the respective vehicle must yield the right of way to vehicles on the primary road. These consecutive vehicles on the main road create time gaps, as depicted in Figure 7.3c. The acceptance or rejection of such a gap is contingent upon factors such as gap duration, clearing time, and driver behavior, as elucidated by Dutta and Ahmed [232]. The value for the time gap being accepted can be determined by the moment a vehicle with status waiting for gap, which is heuristically identified by velocity, acceleration, and interaction identification, starts moving. By calculating the intersection of time gaps being accepted and rejected, the critical time gap can be calculated as shown in Figure 7.3c.
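The PET and TET definitions above can be sketched as follows, assuming that the geometric part has already been solved, i.e., that for each time step it is known whether a vehicle occupies the conflict zone and how far it is from the zone. All function names and the example values are hypothetical and only illustrate the definitions, not the implementation used in this thesis.

```python
from typing import Optional, Sequence


def post_encroachment_time(
    times: Sequence[float],
    first_in_zone: Sequence[bool],
    second_in_zone: Sequence[bool],
) -> Optional[float]:
    """PET: time between the first vehicle leaving and the second vehicle
    entering the conflict zone. None if a vehicle never occupies the zone."""
    first_times = [t for t, inside in zip(times, first_in_zone) if inside]
    second_times = [t for t, inside in zip(times, second_in_zone) if inside]
    if not first_times or not second_times:
        return None
    # Approximation: the last occupied time step is taken as the exit time.
    return min(second_times) - max(first_times)


def constant_velocity_ttc(
    distance_m: float, closing_speed_mps: float
) -> Optional[float]:
    """TTC assuming constant velocity within the time step.

    None if the vehicle is not approaching."""
    if closing_speed_mps <= 0.0:
        return None
    return distance_m / closing_speed_mps


def time_exposed_ttc(
    ttc_values: Sequence[Optional[float]], dt: float, threshold_s: float = 6.0
) -> float:
    """TET: cumulative duration during which TTC is below the threshold."""
    return sum(dt for ttc in ttc_values if ttc is not None and ttc < threshold_s)


# Hypothetical interaction sampled every 0.5 s.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
first = [False, True, True, False, False, False, False]
second = [False, False, False, False, True, True, False]
pet = post_encroachment_time(times, first, second)

ttc_series = [constant_velocity_ttc(d, 5.0) for d in (40.0, 25.0, 20.0)]
ttc_series.append(None)  # interaction partner has left the conflict zone
tet = time_exposed_ttc(ttc_series, dt=0.5)
print(pet, tet)
```

In this example, the first vehicle is last seen in the conflict zone at 1.0 s and the second vehicle enters at 2.0 s, so the PET is 1.0 s. Two of the TTC samples are below the six-second threshold, which gives a TET of 1.0 s.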
[Panel (a): two snapshots at time t1 and time t2. Panel (b): TTC over time (t0 to t3) with the TTC threshold. Panels (c) and (d): gap between consecutive vehicles and frequency of accepted and rejected gaps over gap [s], with the critical gap marked.]
(a) PET calculation [233]. Left: First arriving vehicle leaves the conflict zone. Right: Later arriving vehicle enters the conflict zone.
(b) TET calculation according to [234].
(c) Critical time gap calculation.
(d) Critical time gap calculation according to Raff et al. [228].
Figure 7.3: Illustration of the calculation concepts for interaction-related parameters.
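The intersection of accepted and rejected gaps described for the critical time gap can be sketched in the spirit of Raff's method, in which the critical gap is the gap length at which the number of accepted gaps shorter than it equals the number of rejected gaps longer than it. The following example is a simplified illustration under this assumption; the function name, the search step, and the sample values are hypothetical.

```python
import bisect
from typing import Sequence


def critical_gap_raff(
    accepted_s: Sequence[float], rejected_s: Sequence[float], step_s: float = 0.1
) -> float:
    """Smallest gap length t (on a grid of step_s) at which the number of
    accepted gaps shorter than t reaches the number of rejected gaps
    longer than t."""
    accepted = sorted(accepted_s)
    rejected = sorted(rejected_s)
    upper = max(accepted + rejected)
    n_steps = int(round(upper / step_s))
    for i in range(n_steps + 1):
        t = i * step_s
        accepted_shorter = bisect.bisect_left(accepted, t)
        rejected_longer = len(rejected) - bisect.bisect_right(rejected, t)
        if accepted_shorter >= rejected_longer:
            return t
    return upper


# Hypothetical observations of accepted and rejected gaps in seconds.
accepted_gaps = [3.0, 4.0, 5.0, 6.0, 7.0]
rejected_gaps = [1.0, 2.0, 3.0, 4.0, 5.0]
critical_gap = critical_gap_raff(accepted_gaps, rejected_gaps, step_s=0.5)
print(critical_gap)
```

With these values, one accepted gap is shorter than 4.0 s and one rejected gap is longer than 4.0 s, so the two cumulative curves intersect at a critical gap of 4.0 s.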
7.3.1.2 Identification of Context-Based Similarity of Situations
In order to compare artificial behavior with the behavior of humans in similar driving situations, contextual information must be assigned to the data. Inspired by Scholtes et al. [235] and the previous concept of representing context information (Section 3.4.1), a multi-layer approach to describe urban scenarios is developed and presented in Table 7.2. The parameters are inspired by the approach of Scholtes et al. of categorizing environmental information into layers in combination with the real traffic database, which shows ROW-controlled intersections. The context parameters allow situations such as those shown in Figure 7.4 to be characterized and identified as similar. For example, Figure 7.4 shows two situations with the following characteristics: Relation to intersection: Just Before; Lane turn direction: Straight; Vehicle state maneuver: Decelerate; Number of legs: Four; Number of interactive vehicles: One; Number of interactive VRUs: One; Intersection Density: Low; ROW relationship: Giving; Closest interacting vehicle class: Car; Closest interacting VRU class: Bicycle.
Algorithms to extract context information are based on a fusion of map and time series data according to the methodology presented in Chapter 4. The proposed contextual parameters can be calculated for each sample for both the trajectory data to evaluate as well as the comparison data originating from real traffic.
7.3.1.3 Preparing the Database
Based on context parameters, a subset of real data containing similar situations can be selected.
This subset of the real traffic database, which forms the basis for the comparison, must contain
sufficient samples. All time series data are aggregated into sequences of one-second time
windows to determine the context. A threshold of 1000 samples is defined to determine whether the
amount of GT data is sufficient for comparison. If the number of remaining samples in the GT
subset is insufficient, the level of abstraction is increased, and context-describing parameters are
gradually removed from the filtering. Table 7.2 shows the priority order of context parameters
for increasing abstraction in column P. The priority order was determined empirically by
expert knowledge and by scanning the real traffic data. The number of matching parameters is
saved to provide additional information about the certainty of the comparison, using the
Jaccard Index to quantify the similarity of subsets in each comparison [236].
Furthermore, the context information is used to select the parameters to be considered since,
for example, calculating PET in the absence of interaction partners would not be reasonable.
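The filtering with gradually increasing abstraction can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the key names, the function `select_comparison_subset`, and the choice to drop the lowest-priority parameter first are assumptions made for illustration, as is the reading of the Jaccard Index as the overlap between the context parameters actually used and the full parameter set.

```python
from typing import Dict, List, Tuple

# Context parameters in priority order P = 1..11 (Table 7.2).
# The key names are illustrative, not taken from the original code base.
PRIORITY = [
    "relation_to_intersection",
    "lane_turn_direction",
    "vehicle_state_maneuver",
    "object_related_maneuver",
    "number_of_legs",
    "n_interactive_vehicles",
    "n_interactive_vrus",
    "intersection_density",
    "row_relationship",
    "closest_vehicle_class",
    "closest_vru_class",
]

Sample = Dict[str, str]


def select_comparison_subset(
    query: Sample, database: List[Sample], min_samples: int = 1000
) -> Tuple[List[Sample], List[str], float]:
    """Filter real-traffic samples by context, relaxing the filter if needed."""
    active = list(PRIORITY)
    while True:
        subset = [
            s for s in database
            if all(s.get(key) == query.get(key) for key in active)
        ]
        if len(subset) >= min_samples or not active:
            break
        # Assumption: the lowest-priority parameter is removed first.
        active.pop()
    # Jaccard index between the parameters used and the full parameter set.
    used, full = set(active), set(PRIORITY)
    jaccard = len(used & full) / len(used | full)
    return subset, active, jaccard
```

With all eleven parameters matched the returned index is 1.0; each removed parameter lowers it by 1/11, which gives a simple certainty indicator for the comparison.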
Table 7.2: Overview of context parameters to distinguish scenarios, sorted by priority order (P),
to extract similar traffic situations from the comparison database.

| P | Scenario | Cat. 1 | Cat. 2 | Cat. 3 | Cat. 4 | Cat. 5 |
|---|---|---|---|---|---|---|
| 1 | Infrastructure maneuver: Relation to intersection | Before | Just Before | Inside | Just After | After |
| 2 | Infrastructure maneuver: Lane turn direction | Left | Right | Straight | - | - |
| 3 | Vehicle state maneuver | Accelerate | Decelerate | Steady | Stop | - |
| 4 | Object related maneuver | Following | Waiting for gap | Approaching | Waiting queue | - |
| 5 | Number of legs | Three | Four | - | - | - |
| 6 | Number of interactive vehicles | Zero | One | Multiple | - | - |
| 7 | Number of interactive VRU | Zero | One | Multiple | - | - |
| 8 | Intersection Density | Low | Moderate | High | - | - |
| 9 | ROW relationship | Giving | Receiving | - | - | - |
| 10 | Closest interacting vehicle class | Car | Truck / Bus | - | - | - |
| 11 | Closest interacting VRU class | Bicycle | Pedestrian | - | - | - |
7.3.1.4 Quality Function Formulation
Once all behavioral parameters have been calculated and a subset of real data is extracted, the
comparison of driving behavior can be conducted. In order to select an appropriate statistical
test to measure the difference between real and artificial behavior, the underlying statistical
distribution for each parameter under consideration must be examined first. For the proposed
metric, the similarity of the distribution of driving behavior parameters of the artificial vehicle
to the driving behavior of real vehicles is measured using the KS (Kolmogorov-Smirnov)
two-sample test method [237], inspired by the study conducted by Wang et al. [222]. In
addition, the extremes of some driving parameters are evaluated to verify that driving behavior
exhibited by artificial vehicles lies within the limits defined by the minimum and maximum
values obtained from real drivers in matching scenarios. The following parameters were selected
for extrema evaluation: maximum longitudinal velocity, maximum lateral velocity, maximum
longitudinal acceleration, minimum longitudinal acceleration, minimum distance to partners,
and maximal critical time gap.

[Figure 7.4 labels: (a) ego vehicle ID 30, bicycle ID 29, car ID 26 at frame 2956; (b) ego vehicle
ID 25, bicycle ID 24, car ID 23 at frame 2315.]
Figure 7.4: Two exemplary situations at the Bendplatz location (recording 12) that were identified
as similar based on the contextual parameters in Table 7.2. The ego vehicle is marked in red, the
cyclist in blue, and the other vehicle in green. The trajectory already traveled is marked as a
solid line.
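The two building blocks described in this section, the two-sample KS comparison and the extrema check, can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the KS statistic is computed directly from the empirical distribution functions rather than with a statistics library, and the extrema check is reduced to a plain range test instead of the ratio-based thresholds used by the metric.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    largest_gap = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def within_human_extrema(
    model_values: Sequence[float], human_values: Sequence[float]
) -> bool:
    """Simplified extrema check: model values stay inside the human range."""
    return (min(human_values) <= min(model_values)
            and max(model_values) <= max(human_values))


# Hypothetical velocity samples [m/s] from matching situations.
human = [4.1, 5.0, 5.6, 6.3, 7.2, 8.0]
model = [5.5, 5.9, 6.1, 6.4, 6.8, 7.0]
print(ks_statistic(model, human), within_human_extrema(model, human))
```

A statistic of 0 means identical empirical distributions and 1 means no overlap, so a smaller value indicates behavior closer to the human reference.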
By assigning thresholds, each individual part of the metric either passes or fails the check for
human similarity. The parts are weighted by $w_i$ and finally combined into a human likeness
score according to Equation (7.1). Inspired by the parameters identified in literature, presented
in Table 7.1, the proposed metric contains the parameter checks listed in Table 7.3. It has to be
noted that not all parameters are suitable for all driving situations. For example, if no vehicle
interacts when turning, no time value can be calculated for the gap acceptance. Unavailable
parameters were counted as passed when calculating the final score. Furthermore, functional
parameters are not considered in the POC implementation of the metric since the model under
test is a holistic and fully integrated driver model that has already passed functional tests. The
concepts for evaluating whether a trajectory meets functional requirements, such as evaluating
collisions or road violations, are presented in the evaluation method for data-driven trajectory
prediction in Chapter 5. Due to the modular structure of the metric, such parameters can be
easily added whenever required.
$$\text{score} = \frac{\sum_i \left(w_i \cdot \text{pass check}_i\right)}{\sum_i \left(w_i \cdot \text{total check}_i\right)} \cdot 100 \tag{7.1}$$
7.3.1.5 Strategy for Validating the Method
Given the absence of an official, objective standard defined as human likeness, the proposed
metric must be validated. The validation of the method explores whether the human-like
driving scores obtained by the metric correlate with people’s subjective ratings of behavior.
The validation strategy is premised on two key assumptions. First, human-like driving can
be defined by the ability of subjects to distinguish between artificial and human-generated
behavior. Second, should a correlation be found between the metric’s scores and subjective
assessments, this demonstrates the capacity of the metric to quantify human likeness objectively.
For validation, a survey is prepared, which is inspired by the work of Zhang et al. [113], who
adjusted the Turing Test to quantify the human likeness of their proposed algorithm in a
driving simulator. In order to provide an efficient evaluation method that can be applied to
a large number of situations, the validation for this method is conducted through
a survey instead of a simulator experiment. To validate the presented metric, participants
are asked to rate the behavior of a target vehicle in short videos without knowing whether
the vehicle is driven by a real human or an artificial driver. In each video, an interactive
driving scene involving multiple vehicles is shown, with one vehicle marked as the target
vehicle to be evaluated. Care is taken to ensure that the videos present no discernible cues
indicating whether the behavior emanates from human or artificial data sources. After each
video, participants are asked to rate the behavior of the target vehicle on the following scale:
1: Completely artificial driving; 2: Somewhat artificial driving; 3: Not sure; 4: Somewhat
human-like driving; 5: Completely human-like driving.
Based on this, it can be investigated whether the scores given by humans correlate with those
of the proposed metric and to what extent participants are able to distinguish between artificial
and human behavior. For further validation of the method, the metric is applied to some
real driving and artificial driving data, assuming that the human likeness score of real data is
significantly higher compared to synthetic behavioral data. The metric includes several aspects
that can be tuned, such as how narrowly the range of human similarity is defined or how
individual parameters are weighted for the final score. The insights from the survey provide a
basis for tuning the metric toward how people would distinguish between artificial and real
behavior.
7.3.2 Implementation
7.3.2.1 Used Databases and the Driver Model
For representing real human driving behavior, the same open-source dataset inD² was selected
as in Chapters 4 and 6, which provides recordings of four German unsignalized intersections
from a bird’s-eye perspective, shown in Figure 7.5 (right) [155]. For creating the human behavior
database, all recordings except recording 12 were selected; this recording (referred to as inD12)
was retained for validation purposes as described in Section 7.3.1.5. For testing the
proposed metric, artificial driving data was created on two synthetic intersections, which are
illustrated in Figure 7.5 (left). For testing on a large scale of interactive situations, data was
created with the help of the simulation framework Spider at BMW [151, 55]. A detailed model
description can be found in Section 2.2.6. Behavioral and contextual parameters are computed
according to Section 7.3.1.1 and Chapter 4 for the synthetic and the real traffic databases.

² https://ind-dataset.com/
[Figure 7.5 panel labels: SYNTHETIC INTERSECTIONS (Intersection 1, Intersection 2); REAL DATA FOR
COMPARISON (inD12).]
Figure 7.5: Locations for data gathering. Left: synthetic intersections for creating artificial
driving behavior. Right: locations from the inD data [155].
7.3.2.2 Metric Formulation and Thresholds
Through the assignment of individual thresholds and weighting factors, the elements of the
metric can be combined into a human likeness score according to Equation (7.1). For
PET, TET, and Max. Critical Gap, global thresholds for human similarity were derived from
the real traffic database since not enough situational samples could be extracted for
contextual comparison due to the heuristics applied to calculate these parameters. All other
parameters could be compared considering the situational context, i.e., in comparison to the
behavior of real drivers in comparable situations. Accordingly, threshold values represent the
limits for statistical similarity, as shown in Table 7.3. Based on that, the human likeness
scores for the two synthetic locations and the retained real recording (inD12) were calculated.
Since the driver models used in the synthetic data showed quite high scores, the thresholds for
each parameter were further narrowed to fulfill the assumption that the overall results of the
real traffic data would significantly exceed those of the synthetic data. The tuned metric
resulted in an averaged score of 89.62% for the real dataset (inD12), 77.31% for intersection 1
(synthetic), and 77.87% for intersection 2 (synthetic). Initially extracted and tuned thresholds
for all parameters are provided in Table 7.3.
7.3.2.3 Survey for Validation
The survey was conducted online and involved 23 participants rating the behavior of vehicles
in 12 videos. Four videos showed real driving behavior and eight showed artificial behavior. Real
scenarios were extracted from recording 12 of the inD dataset, whereas artificial behavior
was created on the two synthetic intersections as described in Section 7.3.1.3. The order of
the videos was randomized to eliminate the possibility of order effect bias. In each video,
the vehicle to assess was marked red while all other vehicles were blue, as shown exemplarily
in Figure 7.6. The visualization was abstracted to eliminate any indicators that might help
distinguish between real and artificial traffic. After each video, participants were asked to rate
on a five-point scale whether they perceived the red vehicle’s behavior as real or artificial, as
described in Section 7.3.1.5.

Table 7.3: Thresholds for measuring human likeness for different parameters extracted from the
real traffic dataset. Distributions of velocities, accelerations, and jerk are compared using KS
statistics. Percent ratios are assigned for maximum values and raw thresholds are assigned for
context-free parameters. Parameters marked with * are calculated context-free.

| Name of Parameter Check | Initial | Fine-tuned |
|---|---|---|
| KS Longitudinal Vel. | >0.993648 | >0.668406 |
| KS Lateral Vel. | >0.952358 | >0.624929 |
| KS Longitudinal Acc. | >0.780503 | >0.559198 |
| KS Lateral Acc. | >0.926266 | >0.647957 |
| KS Jerk | >0.685024 | >0.511960 |
| Max. Longitudinal Vel. | <66.67 % | <66.67 % |
| Max. Lateral Vel. | <73.33 % | <83.13 % |
| Max. Longitudinal Acc. | <64.00 % | <71.80 % |
| Min. Longitudinal Acc. | <78.00 % | <72.00 % |
| Min. Distance to partners | <84.00 % | <93.60 % |
| PET* | <0.64 s | <0.50 s |
| TET* | >4.96 s | >3.90 s |
| Max. Critical Time Gap* | >6.98 s | >5.40 s |
Figure 7.6: Exemplary screenshots of videos shown to participants for rating human likeness of a
subject vehicle marked in red.
7.3.3 Results
In the following section, the results of the survey, providing subjective assessments of human
likeness, are compared with the objective scores obtained by the proposed metric. In order
to apply the proposed method to a broader range of samples, the human likeness score is
additionally computed for the datasets described in Section 7.3.1.3. Finally, two case studies
are presented that exemplify how to apply the proposed metric for different purposes.
First, the method is applied to examine the driver model TRM in detail for model
improvements, and second, the method is applied to the results of the trajectory planners
presented in Chapter 6 in the left turn scenario, as the three models under test all showed
compliant but different behavior.
7.3.3.1 Objective Metric Results Versus Subjective Human Ratings
Two aspects are of interest when investigating the results of the survey. First, to what
extent were subjects able to distinguish between real and artificial behavior, and second, did
participants’ subjective ratings correlate with objective scores calculated by the proposed
metric? The mean value of the Turing test (6.37) indicated that participants’ ability to
distinguish between artificial and real drivers is only slightly higher than that of random
responses or the result (exactly 6.0) obtained when selecting "Not sure" for all vehicles.
Figure 7.7 visualizes the subjective ratings for all test vehicles shown during the survey together
with the objective scores calculated by the proposed metric. Real vehicles are green, while
artificial vehicles are colored red. The y-axis shows the rating scale presented during the survey.
The blue value above each vehicle rating is the objective human likeness score obtained
by the proposed metric. For the survey, artificial vehicles that exhibited high and low
levels of human likeness were selected. Likewise, for the real vehicles, samples with higher and
lower objective human likeness scores were selected to determine whether the metric was able to
detect both good and bad results.

Figure 7.7: Subjective human likeness rating obtained from participants (y-axis) associated with
objective human likeness scores obtained by the proposed metric (blue value above). Vehicles are
sorted by the average rating assigned by participants, in descending order from left to right.

Comparing the objective scores from the metric to participants’ subjective ratings from the
survey, a positive correlation could be observed, with a Spearman correlation coefficient of 0.62
(p-value of 0.030) and a Pearson correlation coefficient of 0.65 (p-value of 0.023), as shown in
Figure 7.8. The p-values indicate a moderate monotonic positive relationship and a moderate
linear positive relationship, both at the 97% confidence level, which can be considered
statistically significant. Based on the insights into which vehicles were rated as more human-like
by the participants and the multi-dimensional quality function, the individual parameters could be
analyzed in terms of the extent to which they contribute to the decision for real or artificial
behavior. Table 7.4 provides the results of this analysis using the Spearman correlation, which
is further transformed into a weighting of the individual parameters with the aim of reflecting
how people prioritize differences in the individual behavioral parameters.

Figure 7.8: Relationship between the fine-tuned objective human-like driving behavior scores
provided by the proposed methodology and subjective ratings of participants during the survey.
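For readers who want to retrace the correlation analysis, the following Python sketch computes Pearson and Spearman coefficients without external libraries. The helper names are hypothetical, and the twelve score and rating values are made up for illustration; they are not the survey data.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (undefined for constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: metric score [%] and mean survey rating for 12 videos.
metric_scores = [95, 91, 88, 86, 84, 80, 79, 75, 72, 70, 66, 60]
mean_ratings = [3.9, 3.4, 3.8, 3.1, 3.5, 3.0, 3.3, 2.7, 3.2, 2.6, 2.9, 2.4]
print(spearman(metric_scores, mean_ratings), pearson(metric_scores, mean_ratings))
```

The p-values reported above additionally require a significance test on the coefficients, which is omitted here.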
Table 7.4: Spearman correlation between the parameters and average human-like driving behavior
score from the survey, with correlations converted into weights.

| Parameter | Spearman Correlation | P-value | Conversion to weight |
|---|---|---|---|
| KS Longitudinal Vel. | 0.073555 | 0.820285 | 0.016269 |
| KS Lateral Vel. | 0.178634 | 0.578567 | 0.03951 |
| KS Longitudinal Acc. | -0.30823 | 0.329698 | 0.068174 |
| KS Lateral Acc. | 0.021016 | 0.948312 | 0.004648 |
| KS Jerk | -0.51839 | 0.084229 | 0.114656 |
| Max. Longitudinal Acc. ratio | -0.42732 | 0.165877 | 0.094514 |
| Min. Longitudinal Acc. ratio | -0.50794 | 0.0918 | 0.112346 |
| Max. Longitudinal Vel. ratio | -0.12151 | 0.706773 | 0.026876 |
| Max. Lateral Vel. ratio | 0.309234 | 0.328046 | 0.068396 |
| Min. Distance to partners ratio | -0.3632 | 0.245869 | 0.080333 |
| Max. Critical Time Gap [s] | -0.94286 | 0.004805 | 0.208539 |
| TET [s] | -0.52179 | 0.288343 | 0.115409 |
| PET [s] | -0.22755 | 0.587845 | 0.050329 |
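The conversion rule is not spelled out in the text, but the weights in Table 7.4 are numerically consistent with normalizing the absolute Spearman coefficients so that they sum to one. The sketch below reproduces this inferred rule; the function name `correlations_to_weights` is hypothetical, and small deviations in the sixth decimal place stem from the rounding of the printed coefficients.

```python
from typing import Dict

# Spearman coefficients as printed in Table 7.4.
SPEARMAN_RHO: Dict[str, float] = {
    "KS Longitudinal Vel.": 0.073555,
    "KS Lateral Vel.": 0.178634,
    "KS Longitudinal Acc.": -0.30823,
    "KS Lateral Acc.": 0.021016,
    "KS Jerk": -0.51839,
    "Max. Longitudinal Acc. ratio": -0.42732,
    "Min. Longitudinal Acc. ratio": -0.50794,
    "Max. Longitudinal Vel. ratio": -0.12151,
    "Max. Lateral Vel. ratio": 0.309234,
    "Min. Distance to partners ratio": -0.3632,
    "Max. Critical Time Gap [s]": -0.94286,
    "TET [s]": -0.52179,
    "PET [s]": -0.22755,
}


def correlations_to_weights(rho: Dict[str, float]) -> Dict[str, float]:
    """Inferred rule: weight_i = |rho_i| / sum_j |rho_j|."""
    total = sum(abs(r) for r in rho.values())
    return {name: abs(r) / total for name, r in rho.items()}


weights = correlations_to_weights(SPEARMAN_RHO)
for name, weight in weights.items():
    print(f"{name}: {weight:.4f}")
```

Using the absolute value means that a parameter contributes according to the strength of its relationship with the subjective ratings, regardless of its sign.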
7.3.3.2 The Human Likeness of Investigated Datasets
As described in Section 7.3.1.5, the metric is applied to the retained real driving data (inD12)
and to artificial driving data, assuming that the human likeness score of real vehicles should be
significantly higher compared to synthetic behavior. The database used is described in Section
7.3.1.3, and the objective scores obtained by the metric are shown in Figure 7.9.
The scores were calculated once with the initial thresholds from Table 7.3 and without weighting
(Figure 7.9, right), and once with the tuned quality function, incorporating weights from
the survey and adjusted thresholds from Table 7.3 (Figure 7.9, left). First of all, the difference
between those two figures demonstrates the sensitivity of the results to weights and thresholds
within the quality function. Regarding the initial setting, only the comparison of real data and
artificial behavior at intersection 2 showed significant differences in the human likeness scores
when using the Mann-Whitney U test ($U = 82{,}727.0$, $p_{\text{value}} = 4.06 \times 10^{-13}$),
while the comparison of behavior at intersection 1 with real humans showed no significant
difference ($U = 58{,}561.0$, $p_{\text{value}} = 0.78$). This can be explained by the quite
far-developed driver model, which was used to create the artificial data. Therefore, without
weighting or tightening the thresholds of the quality function, only harsh outliers of behavior
can be detected. When using the tuned quality function (Figure 7.9, left), a clear difference
between real and artificial behavior could be measured (human-like grades of inD12 vs.
intersection 1: $U = 103{,}568.5$, $p_{\text{value}} = 4.34 \times 10^{-68}$; human-like grades
of inD12 vs. intersection 2: $U = 113{,}910.0$, $p_{\text{value}} = 1.54 \times 10^{-60}$).
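As a rough illustration of the test used here, the following Python sketch computes a Mann-Whitney U statistic and a two-sided p-value. It is a simplified version under stated assumptions: the p-value uses the normal approximation without tie or continuity correction, the function name is hypothetical, and the scores are made up. Which of the two possible U values is reported in this section is not stated, so the sketch returns the U value of the first sample.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U of sample x, two-sided p-value from a normal approximation)."""
    n1, n2 = len(x), len(y)
    # U counts the pairs in which x exceeds y; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    std_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean_u) / std_u
    p_value = erfc(abs(z) / sqrt(2))
    return u, p_value


# Made-up human likeness scores [%] for a real and a synthetic dataset.
real_scores = [92.0, 88.5, 95.1, 90.3, 85.7, 91.2]
synthetic_scores = [78.0, 81.4, 74.9, 83.2, 77.5, 79.8]
print(mann_whitney_u(real_scores, synthetic_scores))
```

The pairwise loop is quadratic in the sample size, which is acceptable for a sketch; a rank-based formulation would be preferable for datasets of the size evaluated here.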
Figure 7.9: Results for human-like scores for real and synthetic datasets: with tuned thresholds
and weights (left) and initial setting (right).
7.3.3.3 Case Study 1: Exemplary Application of the Method for Investigating the Heuristic Model at BMW (TRM)
The proposed method is characterized by two main aspects. First, by using various behavior
parameters involving functional, dynamic, and interactive behavior, the multi-modality of
driving behavior is objectively measurable in different dimensions. Second, behavior is
assumed to be conditional and compared within a situational context instead of comparing
parameters on a macroscopic level. Therefore, this approach provides a high level of
transparency and enables targeted model improvement. How the proposed method can
be used for model improvement is presented in the following case study.
The overall scores measured for the artificial driver model in Figure 7.9 show a mean human
likeness score of 78.89% with a variance of 33.33 on intersection 1. The scoring method
enabled the quantification of model behavior among multiple situations measured by various
parameters. Based on the proposed method, the following questions can be addressed to enable
model improvement:
• In which scenarios does the model show less human-like behavior?
• Which parameters mostly fail when comparing the model to human behavior?
• Why do those parameters fail - how do the parameter distributions differ between model
behavior and human behavior?
Based on the correlation analysis presented in Table 7.4, the critical time gap was found to
have a high negative correlation with the subjective ratings. Therefore, the critical time gap
value is further investigated. In the real traffic data, the value was determined to be 6.98
seconds, while the same parameter in the synthetic data was determined to be 10.05 seconds at
intersection 1 and 9.98 seconds at intersection 2. This shows a significant difference in behavior
that needs to be addressed in the driver model. To further investigate in which situations model
behavior mostly differs from that of humans, the distribution of failing context parameters
is investigated. Considering all context parameters and the distribution of failed parameter
checks, the following conclusions could be drawn. Vehicles are more likely to fail when they...
• were in the maneuver states: approaching, accelerate, decelerate, or stop
• were giving ROW
• were in the object related maneuver: approaching or waiting for gap
• interacted with fewer partner vehicles
• drove in lower intersection density
The calculated behavior parameters of the synthetic data are analyzed to identify which
behavior parameters mostly fail. As illustrated in Figure 7.10, the five most often failing behavior
parameter checks were identified to be: KS Longitudinal Vel., KS Lateral Vel., KS Longitudinal
Acc., KS Lateral Acc., KS Jerk.
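The ranking of failing checks can be obtained with a simple tally over the per-sample results. The following Python sketch shows one possible way to do this; the data layout (one dictionary of check outcomes per sample, with `None` for checks that could not be evaluated) and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# One entry per evaluated sample: check name -> True (pass), False (fail),
# or None (check not available in this situation).
CheckResults = Dict[str, Optional[bool]]


def failure_rates(results: List[CheckResults]) -> List[Tuple[str, float]]:
    """Share of failed evaluations per check, most frequently failing first."""
    failed: Counter = Counter()
    evaluated: Counter = Counter()
    for sample in results:
        for check, outcome in sample.items():
            if outcome is None:
                continue  # unavailable checks are excluded from the tally
            evaluated[check] += 1
            if not outcome:
                failed[check] += 1
    rates = [(check, failed[check] / evaluated[check]) for check in evaluated]
    return sorted(rates, key=lambda item: item[1], reverse=True)


# Made-up results for three samples.
example = [
    {"KS Jerk": False, "KS Longitudinal Vel.": False, "PET": True},
    {"KS Jerk": False, "KS Longitudinal Vel.": True, "PET": None},
    {"KS Jerk": True, "KS Longitudinal Vel.": True, "PET": True},
]
print(failure_rates(example))
```

Grouping the same tally by a context parameter, for example the ROW relationship, would yield the situational view of failures discussed above.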
Figure 7.10: Analysis to identify which parameters mostly fail when comparing artificial and
human behavior.

In the next step, the value distributions of the extracted parameters can be analyzed. For this
purpose, the distribution of the dynamic parameters extracted from real data is compared
with the dynamic parameters of the artificial samples that show a human likeness score of less
than 70%. The distributions are extracted on a macroscopic level and situationally under
the scenario conditions identified above as causing most of the failures. All distributions are
summarized in Table 7.5. Here, the jerk parameter stands out in particular. The maximal
values measured in artificial behavior are much larger than those observed for human
drivers. It should be noted that the jerk values observed for real humans are determined by
tracking algorithms that process video data from drones. Since the open-source real traffic
dataset is preprocessed, it is not known to what extent the values in this data are smoothed.
However, the jerk values of the driver model show significantly high maximum values, which
should be further investigated. Therefore, some trajectories were extracted and analyzed.
Figure 7.11 shows some example trajectories that illustrate the jerking problem that occurs
when switching between driving maneuvers. Since the driver model TRM is based on heuristic
decision-making, the temporal behavior and motion are more discrete compared to humans.
However, from a subjective-visual point of view, the jerking problem is not perceptible, as
shown by the survey, and is therefore not critical for driver models in the context of DiL traffic
simulation. Further interesting insights are provided by the comparison between the macroscopic
behavior parameter distribution (left) and situational behavior (right) in Table 7.5. When
considering longitudinal acceleration, for example, one can observe that the distributions are in
line with human value ranges from a macroscopic perspective but not when comparing the
parameter situationally. This shows that macroscopic comparison can result in misleading
conclusions regarding the extent of human-like behavior.

Table 7.5: Analysis of the value distribution for dynamic parameters for model behavior and real
humans on a macroscopic level (left) and scenario-specific under situational conditions (right).

| Parameter | Statistic | Real data: macroscopic | Syn. data: macroscopic | Real data: situational | Syn. data: situational |
|---|---|---|---|---|---|
| Longitudinal Vel. [m/s] | mean | 7.18 | 3.38 | 5.71 | 3.48 |
| | std | 5.24 | 3.92 | 4.68 | 3.48 |
| | min | -4.35 | 0.00 | -0.21 | 0.00 |
| | max | 27.48 | 19.05 | 20.52 | 12.34 |
| Lateral Vel. [m/s] | mean | 0.04 | 0.03 | 0.04 | 0.14 |
| | std | 0.19 | 0.15 | 0.26 | 0.25 |
| | min | -2.87 | -0.77 | -0.99 | -0.07 |
| | max | 10.89 | 1.45 | 1.07 | 0.85 |
| Longitudinal Acc. [m/s²] | mean | 0.06 | 0.06 | -0.08 | -0.47 |
| | std | 0.87 | 1.02 | 1.01 | 2.72 |
| | min | -6.25 | -9.58 | -4.34 | -9.58 |
| | max | 6.54 | 5.25 | 4.72 | 2.61 |
| Lateral Acc. [m/s²] | mean | -0.04 | 0.13 | -0.08 | 0.65 |
| | std | 0.61 | 0.78 | 0.89 | 1.14 |
| | min | -5.57 | -3.97 | -4.62 | -0.20 |
| | max | 4.98 | 4.78 | 4.12 | 3.27 |
| Jerk [m/s³] | mean | -0.05 | 0.03 | 0.06 | 0.14 |
| | std | 0.84 | 2.91 | 1.05 | 12.20 |
| | min | -20.35 | -21.40 | -6.96 | -18.17 |
| | max | 16.16 | 220.04 | 16.16 | 220.04 |

Figure 7.11: Exemplary trajectory of two vehicles showing significantly high jerk values when
approaching the intersection.
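Why an abrupt maneuver switch produces a jerk peak can be seen from a finite-difference calculation. The following Python sketch derives acceleration and jerk from a sampled velocity profile; the helper names, the sampling interval of 0.04 s, and the velocity values are assumptions for illustration and do not describe how the jerk values in Table 7.5 were obtained.

```python
from typing import List, Sequence


def finite_difference(values: Sequence[float], dt: float) -> List[float]:
    """Forward difference of a uniformly sampled signal."""
    return [(later - earlier) / dt for earlier, later in zip(values, values[1:])]


def jerk_from_velocity(velocity: Sequence[float], dt: float) -> List[float]:
    """Jerk [m/s^3] as the second finite difference of velocity [m/s]."""
    acceleration = finite_difference(velocity, dt)
    return finite_difference(acceleration, dt)


# Made-up profile: constant speed, then an instantaneous switch to braking
# at 2 m/s^2, sampled with an assumed interval of dt = 0.04 s.
dt = 0.04
velocity = [10.0, 10.0, 10.0, 9.92, 9.84, 9.76]
jerk = jerk_from_velocity(velocity, dt)
print(min(jerk))
```

In this example the acceleration jumps from 0 to -2 m/s² within a single sample, so the jerk reaches a magnitude of about 50 m/s³ for one time step, whereas a gradual transition over one second would stay near 2 m/s³.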
In summary, the case study demonstrates the potential of the method to reveal model
weaknesses and enable better model parameterization. The generalizable processing of model
and real traffic data combined with the multi-step approach employed during the case study by
identifying situations and parameters in which model behavior deviates from that of humans
offers a high potential for model refinement across the variety of situations encountered in
urban traffic.
7.3.3.4 Case Study 2: Applying the Evaluation Metric to the Proposed Trajectory Planners
As previously highlighted, the evaluation method presented herein is designed to be transferable
and capable of evaluating complete driver models as well as individual subcomponents.
Therefore, the results of the trajectory planners expounded in Chapter 6 are examined
as an illustrative example for one selected scenario.
For the case study, scenario A_VEH was chosen, where the ego vehicle turns left while another
vehicle is driving in the oncoming straight lane and a pedestrian crosses the road after the
intersection, as illustrated in Figure 6.6 in Section 6.3.4. Both trajectory planners and the
heuristic model TRM were able to solve the scenario from a functional perspective, meaning
no collision or deadlock occurred. However, the three models exhibited different levels of
cautiousness, and the question remains as to what extent the behavior of the individual models
was in line with human driving behavior.
Table 7.6 shows the parameter checks with associated thresholds for fine-tuned fail conditions
from the presented method for each of the behavior parameters that constitute the proposed
quality function. Among the three models studied, the TRM model is the only one that decides
to wait for the oncoming vehicle, which raises a particular interest in the interaction-related
parameters to discover whether the behavior of the two trajectory planners was too aggressive.
The evaluation shows that the behavior of the TRM model is not in line with observed
interactions in real traffic, as the model decides to wait although there is a time gap of 12.2 s
to the oncoming vehicle, which exceeds the maximum critical time gap rejected by drivers
in the human database. Meanwhile, the two trajectory planners decide to cross the intersection
first but still maintain a reasonable time gap. Also, the other interaction-related parameters,
such as PET or the distance to partners, are in line with the observed behavior of humans.
TET could not be calculated for this scenario as no TTC value below the critical threshold
was observed. The high weighting of the critical time gap for the final human likeness score
results in a poor result for the TRM model of 54.82%.
Considering the dynamic parameters of the three models, as observed in Case Study 1, the
TRM model fails the dynamic parameter checks for velocity, acceleration, and jerk, as peaks
occur during maneuver transitions. The two trajectory planners also fail the parameter checks
for KS Longitudinal Velocity and KS Lateral Velocity. This divergence can be attributed to
the fact that the velocity distribution of human drivers approaching, departing, and traversing
the intersection deviates from the velocity behavior demonstrated by the planners. This deviation
is notably evident in the higher average velocities, as shown by the exemplary analysis in
Figure 7.12 for longitudinal velocity. This indicates, first, that the representativeness of the
comparison data needs to be further investigated and, second, that the use of GT data for
anticipation within the planners leads to a high level of confidence when traversing the
intersection, which is not consistent with human behavior observed in the database. Furthermore,
the observed curve-cutting behavior of the PL_3D planner cannot be identified by the metric. This
suggests that the inclusion of additional parameters, such as center-line deviation, could be
beneficial. At the moment, these parameters are not considered because maps of real-world
locations often do not adequately represent how vehicles drive at the intersection, as shown in
Figure 7.13. Accordingly, center-line deviations calculated naively from real data would lead to
distorted values.

[Figure 7.12: density plots of LonVelocity [m/s] for HUMANS, PL_PVD, and PL_3D, with the following
inset statistics.]

(a) LonVelocity before and after the intersection.

| | PL_3D | PL_PVD | HUMANS |
|---|---|---|---|
| mean | 13.26 | 11.67 | 9.53 |
| max | 14.53 | 13.7 | 27.48 |
| min | 9.15 | 7.11 | -4.35 |
| std | 1.01 | 2.42 | 4.99 |

(b) LonVelocity within the intersection area in Vehicle state maneuver: Dcc.

| | SPA_TEM | PVD | HUMANS |
|---|---|---|---|
| mean | 13.27 | 7.31 | 2.73 |
| max | 13.83 | 7.78 | 17.8 |
| min | 13.01 | 6.86 | -0.8 |
| std | 0.24 | 0.35 | 2.94 |

Figure 7.12: Distribution of lonVelocity for the two planners and the TRM model outside and in
the intersection area.
Table 7.6: Review of all human likeness parameters that constitute the quality function, calculated
for the two trajectory planners PL_3D and PL_PVD and the heuristic model TRM.
*No TET value could be calculated for the scenario since no TTC below the critical threshold of 6 s
was detected.

| Parameter Check | Threshold | PL_3D | PL_PVD | TRM |
|---|---|---|---|---|
| KS Longitudinal Vel. | < 0.668406 | 0.9668 | 0.6989 | 0.7382 |
| KS Lateral Vel. | < 0.624929 | 0.6666 | 0.7038 | 0.7210 |
| KS Longitudinal Acc. | < 0.559198 | 0.4786 | 0.4766 | 0.6967 |
| KS Lateral Acc. | < 0.647957 | 0.4250 | 0.4123 | 0.7537 |
| KS Jerk | < 0.511960 | 0.3924 | 0.4448 | 0.6906 |
| Max. Longitudinal Acc. ratio | > 71.80 % | 87.27 % | 97.60 % | 95.71 % |
| Min. Longitudinal Acc. ratio | > 72.00 % | 95.27 % | 92.40 % | 96.59 % |
| Max. Longitudinal Vel. ratio | > 66.67 % | 81.82 % | 100.0 % | 98.83 % |
| Max. Lateral Vel. ratio | > 83.13 % | 89.09 % | 100.0 % | 87.51 % |
| Min. Distance to partners ratio | > 93.60 % | 100.0 % | 100.0 % | 100.0 % |
| Max. Critical Time Gap | accept when > 5.40 s | 7.0 s | 6.6 s | 12.2 s |
| TET | < 3.90 s | No TET* | No TET* | No TET* |
| PET | > 0.50 s | 6.0 s | 5.92 s | 2.80 s |
| Final human-like grade | | 94.42 | 94.42 | 54.82 |
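The final grades in Table 7.6 can be retraced from Equation (7.1) with the weights of Table 7.4. The following Python sketch is a worked check rather than the original implementation: the sets of failed checks are read off Table 7.6 by comparing each value with its threshold (all KS values above the threshold fail, and the TRM model additionally fails the critical time gap check), and the unavailable TET check is counted as passed, as described in Section 7.3.1.4. The function and variable names are hypothetical.

```python
from typing import Dict, Set

# Weights as listed in Table 7.4 (column "Conversion to weight").
WEIGHTS: Dict[str, float] = {
    "KS Longitudinal Vel.": 0.016269,
    "KS Lateral Vel.": 0.039510,
    "KS Longitudinal Acc.": 0.068174,
    "KS Lateral Acc.": 0.004648,
    "KS Jerk": 0.114656,
    "Max. Longitudinal Acc. ratio": 0.094514,
    "Min. Longitudinal Acc. ratio": 0.112346,
    "Max. Longitudinal Vel. ratio": 0.026876,
    "Max. Lateral Vel. ratio": 0.068396,
    "Min. Distance to partners ratio": 0.080333,
    "Max. Critical Time Gap": 0.208539,
    "TET": 0.115409,
    "PET": 0.050329,
}

KS_VELOCITY = {"KS Longitudinal Vel.", "KS Lateral Vel."}
KS_ALL = KS_VELOCITY | {"KS Longitudinal Acc.", "KS Lateral Acc.", "KS Jerk"}

# Failed checks per model, read off Table 7.6.
FAILED: Dict[str, Set[str]] = {
    "PL_3D": KS_VELOCITY,
    "PL_PVD": KS_VELOCITY,
    "TRM": KS_ALL | {"Max. Critical Time Gap"},
}


def human_likeness_score(failed: Set[str], weights: Dict[str, float]) -> float:
    """Equation (7.1): weighted share of passed checks, in percent.

    Checks that could not be evaluated (here: TET) are not listed in
    `failed` and are therefore counted as passed.
    """
    passed_weight = sum(w for name, w in weights.items() if name not in failed)
    return 100.0 * passed_weight / sum(weights.values())


for model, failed_checks in FAILED.items():
    print(f"{model}: {human_likeness_score(failed_checks, WEIGHTS):.2f}")
# PL_3D: 94.42
# PL_PVD: 94.42
# TRM: 54.82
```

The calculation makes the influence of the weighting visible: the single critical time gap check accounts for about 21 of the roughly 45 percentage points lost by the TRM model.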
7.3.4 Limitations and Future Work
The proposed method presents a frst attempt to objectify human-like driving behavior, taking
into account the situational context. The multimodality of human behavior is mapped into
individual parameters, which are then statistically evaluated by assigning thresholds for pass
or fail and additional weights. Weighting and threshold assignment have a signifcant impact
on the fnal metric score, resulting in high sensitivity to individual tuning, especially due to
the binary strategy of either passing or failing a check.
Figure 7.13: Exemplary illustration of the mismatch between driven trajectories and the defined
intersection lanes. Trajectory in red and lane polygon in cyan.
(a) Recording 18, Frankenburg, trackId 159. (b) Recording 18, Frankenburg, trackId 140.
In the future, more parameters, such as lane deviation, can be added to better account for
the multimodality of human driving behavior, and extensive surveys could provide a basis for
fine-tuning the metric. Instead of the binary approach of either passing or failing a parameter,
it is recommended to elaborate a more sophisticated concept whereby the range of human
likeness is discretized into bins for each parameter.
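One possible reading of the recommended binning concept is sketched below: each parameter value is mapped to a graded credit instead of a hard pass or fail. The bin edges and credits are purely hypothetical placeholders. In practice, they would have to be derived from the distribution of human reference data or from surveys.

```python
from bisect import bisect_right

# Hypothetical bins for a KS-type parameter where smaller values are better.
KS_EDGES = [0.25, 0.50, 0.75]
KS_CREDITS = [1.0, 0.5, 0.25, 0.0]


def binned_credit(value, bin_edges, credits):
    """Return the credit of the bin that `value` falls into.

    bin_edges must be ascending; credits holds one entry per bin, i.e.
    len(bin_edges) + 1 entries. A value equal to an edge belongs to the
    upper bin.
    """
    if len(credits) != len(bin_edges) + 1:
        raise ValueError("need exactly one credit per bin")
    return credits[bisect_right(bin_edges, value)]
```

The graded credits can replace the binary outcome in a weighted sum, so that a value slightly beyond a limit loses only part of its weight.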
In order to determine similar matching situations from real driving behavior, context parameters
are assigned to the data. The algorithms for computing context parameters, such as the
number of interaction partners and related parameters, are based on heuristics. Such heuristics,
of course, do not guarantee that meaningful results are provided across the entire diversity
of interpersonal situations occurring in urban traffic. Some parameters, such as PET, were
calculated only for situations in which reliable results could be guaranteed, resulting in a
significant reduction of samples. Assumptions for parameter calculation inspired by the
literature, such as the requirement that paths must intersect for the critical time gap
calculation, further limit the situations covered by the heuristic. For future research, optimization,
especially of those interaction-related parameters, is recommended to guarantee representative
value ranges.
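As an illustration of why such parameters are only available for a subset of situations, the following sketch computes the post-encroachment time (PET) in its common textbook form: the time between the first road user leaving a shared conflict area and the second one entering it. The input format (time-stamped flags stating whether a road user is inside the conflict area) is an assumption and does not reflect the heuristics implemented in this thesis.

```python
def occupancy_interval(samples):
    """samples: list of (time_s, inside_conflict_area).

    Returns (entry_time, exit_time) or None if the area is never occupied.
    The exit time is approximated by the last sample inside the area.
    """
    times = [t for t, inside in samples if inside]
    if not times:
        return None
    return min(times), max(times)


def post_encroachment_time(samples_a, samples_b):
    """PET in seconds, or None when no reliable value can be given."""
    a = occupancy_interval(samples_a)
    b = occupancy_interval(samples_b)
    if a is None or b is None:
        return None  # the paths do not share the conflict area
    first, second = (a, b) if a[0] <= b[0] else (b, a)
    # Negative values mean both road users occupied the area at the same time.
    return second[0] - first[1]
```

Situations without a shared conflict area return `None` and therefore drop out of the sample, which mirrors the reduction of samples described above.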
Based on the focus of enabling the measurement of human likeness and the development
state of the used model, not all parameters listed in Table 7.1 were included in the POC
implementation of the metric. Functional parameters evaluating road violations or collisions
were not considered. Approaches to include such parameters within a quality metric were
presented in Chapter 5. Future attempts could include more parameters and offer an individual
weighting to exclude those for further developed models.
When the remaining samples in the comparison dataset were below a threshold, a simple
abstraction strategy was employed to measure the similarity between situations in artificial and
real data, prioritizing contextual parameters empirically by manually examining the dataset.
Novel techniques offer alternative approaches for measuring similarity between datasets, such
as those presented by Heuer [238], which could be used in the future.
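The abstraction strategy can be pictured as a stepwise relaxation of the context filter. The sketch below drops the least important context parameter until enough human samples remain. The parameter names, the priority order, and the exact-match comparison are assumptions for illustration, since the concrete prioritization was determined empirically.

```python
def matching_samples(dataset, context, priority, min_samples):
    """Select human samples recorded under a similar context.

    dataset:  list of dicts holding the context parameters of each sample
    context:  context parameters of the trajectory to be evaluated
    priority: parameter names ordered from most to least important
    Returns the matching samples and the parameters that were still enforced.
    """
    active = list(priority)
    while True:
        matches = [
            row for row in dataset
            if all(row.get(name) == context[name] for name in active)
        ]
        if len(matches) >= min_samples or not active:
            return matches, active
        active.pop()  # relax the least important remaining parameter
```

Returning the enforced parameters alongside the samples keeps it transparent how far the comparison had to be abstracted for a given trajectory.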
In general, the heuristics used in this method correspond to the scenarios encountered in real
traffic data sources showing unsignalized intersections. To extend the proposed metric for
other traffic scenarios, additional heuristics and context parameters have to be considered.
7.3.5 Conclusion
In this section, a novel method was introduced to objectively measure the degree of human
likeness of artificial driver models. To do so, driving behavior is characterized by various
parameters, which are then compared to the behavior of humans in real traffic. A final human
likeness score can be computed for each trajectory by using statistical analysis in the context
of a quality function. Since behavior in urban scenarios is influenced by various factors and
assumed to be conditional, the situational context for each trajectory in real and artificial
data is characterized by automatically computed context parameters that aim to distinguish
between different situations that may occur in urban traffic. Thus, a subset of real traffic
data showing human behavior under similar conditions can be used to compare behavioral
parameters.
Since there is no clear definition of human likeness, a Turing test-inspired survey was conducted
to investigate the ability of the proposed method to reflect the subjective ratings of humans.
Results of the survey showed a significant correlation between the scores calculated by the
proposed metric and the scores assigned by participants. In addition, the survey provided
interesting insights into which parameters contribute most to the distinction between artificial
and human driving behavior. These findings were used to parameterize the quality function
and provided valuable insights into specific weaknesses of the used driver model.
An evaluation of large datasets has shown that the proposed metric has the ability to evaluate
model behavior efficiently in a wide range of situations, which is crucial for developing reliable
solutions for urban traffic.
The two case studies provided serve as examples that demonstrate the usefulness of the
proposed metric in conducting a comprehensive assessment of a model and in making
targeted improvements. By applying the evaluation method to both a fully integrated, refined
driver model and two conceptual trajectory planners, the metric’s remarkable adaptability is
demonstrated. Moreover, this process provided valuable insights into the specific dimensions
in which model behavior differs from human behavior.
The modular structure of the metric allows model behavior to be evaluated according to
application-specific requirements. In DS, for example, the priority is more on human-like
behavior than on rule compliance and safety. In contrast, when evaluating a trajectory planner
for AVs in real traffic, the focus might be more on non-critical behavior, accepting a lower level
of aggressiveness during interactions. Accordingly, by weighting and narrowing the thresholds
for individual parameters, the method can be used for a broad range of applications.
7.4 Method 2: Quantifying Realistic Behavior of Traffic Agents in Urban Driving Simulation Based on Questionnaires
Given the distinct focus of this thesis on DiL in urban DS, the perspective of the human driver
regarding model behavior received significant importance. Therefore, a novel approach for the
subjective assessment of the human-like nature of traffic agents is introduced, which measures
the perceived realism of participants in a simulator experiment.
7.4.1 Concept
In DS, human-like driver models are relevant to provide a realistic driving scene for the DiL in
the VE. Therefore, the proposed concept considers the human driver as a mirror for quantifying
artificially modeled road users in terms of behavioral realism. This general setting allows
for evaluating agent models on a microscopic level while interacting with a human driver
over a broad range of situations. The concept of presence has been established as a common
method to measure perceived realism in VEs [118]. The concept introduced here operates
on the premise that the driver’s sense of presence within the simulator is observably affected
by the dynamics of the surrounding traffic. It is, therefore, necessary to investigate whether
and to what extent the existence of dynamic traffic participants in the simulation leads to
differences in presence compared to driving without surrounding traffic. In addition, a suitable
questionnaire must be elaborated to quantify agent behavior in different dimensions to allow for
deriving useful conclusions for future model improvement. Therefore, a simulator experiment
was designed in which human drivers experience different situations in a virtual city and are
asked to rate their sense of presence as well as other items that allow a clear analysis of the
appearance of model behavior. For the experiment, the following hypotheses are formulated:
• H1: The existence of surrounding traffic leads to an increase in participants’ sense of
presence.
• H2: The realism of traffic agents can be assessed and distinguished in different dimensions
using a questionnaire that considers the key requirements on human-like driver models
defined in Section 3.3.4.
7.4.2 Material and Setting of the Simulator Experiment
7.4.2.1 Study Design
During the experiment, the existence of surrounding traffic was manipulated in a within-subject
design. Every participant experienced the same route twice, once with surrounding traffic
(traffic) and once without (no-traffic). Order effects were avoided by balancing the order
of the two drives between the participants. The traffic drive included typical interactive
situations occurring in urban environments. Next to route-following and lane-changing tasks,
the participants experienced the following situations, illustrated in Figure 7.14:
• Pedestrians crossing the street (1, 2, 6)
• Cyclists on the street (3, 5)
• Driving through a roundabout (4)
• Turning left with oncoming traffic (7)
• Occupied lanes with oncoming traffic (9)
• Leaving a parking space on a busy road (8)
Participants were instructed to follow a defined route indicated by the navigation system and
to comply with traffic rules as they would do in real traffic.
In the beginning, a short test drive was completed to allow familiarization with the simulator.
The subjects answered a presence questionnaire [239] after both drives and evaluated the
quality of the agent models after the traffic drive using a second questionnaire.
7.4.2.2 Questionnaires
The participants were asked to rate their sense of presence using a standardized presence
questionnaire [239], which is in common use in related experiments [119, 121, 240]. The
questionnaire was extended by one item regarding the naturalism of driving style. A second
questionnaire was created to assess the degree of realism of the agent models in the simulation.
The questionnaire intends to qualify the current status of the models with regard to the
requirements identified in Section 3.3.4: Spatio-temporal consistency (SPA-TEM), Interactivity
(INTERA), and Individuality (STAT). The items of the second questionnaire are inspired by
subjective evaluations in related experiments [241, 117]. The questionnaire is extended by
proprietary items regarding interactivity and the agreement on whether more interaction with
the traffic agents would lead to a more natural driving style. All items were rated using a
seven-point Likert-type scale. Furthermore, the participants had the opportunity to openly
comment on how the behavior of the traffic agents differed from real road users. The complete
questionnaires and related affiliations can be found in Appendix A.3.
Figure 7.14: Traffic scenarios experienced by participants during the simulator experiment. The
ego vehicle is marked with a red triangle.
7.4.2.3 Simulator Setting
The study was conducted at BMW using a static simulator with a half-vehicle mock-up, as
shown in Figure 7.15. For visualization, LED screens covering a horizontal field of view of
180 degrees were used. Additional monitors were installed to allow rear and side views. The
simulation is based on the BMW proprietary framework called Spider [151, 55]. For modeling
the surrounding traffic, three types of traffic agents were used: driver models, bicycle models,
and pedestrian models. All agent models are in-house developments of BMW.
Vehicle driver: a detailed description of the driver agent TRM is provided in Section 2.2.5.
Cyclist: the cyclist agent is an adapted version of the driver model extended by further
driveable areas and is based on a suitable dynamic model. In addition, different parameters
can be applied, such as driving on the road or indicating direction changes by hand signals.
Pedestrian: pedestrians are simulated using a simple model that receives the exact path and
a desired velocity as input. The model follows deterministic traffic rules, such as stopping at
red lights, and prevents possible collisions by stopping and waiting.
In summary, all traffic agents are modeled by parametrizable heuristic models focusing on
collision avoidance and route-following in order to reduce complexity. Situational reactions
are discretized into different maneuvers, and predefined rules are applied to decide on an
individual maneuver at a time.
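The character of such rule-based agents can be illustrated with a minimal sketch of the pedestrian behavior described above. It is not the BMW in-house model: the function name, the arc-length representation of the path, and the boolean inputs are assumptions chosen to show how predefined rules select exactly one maneuver per time step.

```python
def pedestrian_step(position_m, desired_speed_mps, dt_s,
                    red_light_ahead, collision_ahead):
    """Advance a pedestrian along its given path by one time step.

    position_m is the arc length travelled along the predefined path.
    Returns the new position and the maneuver chosen for this step.
    """
    # Deterministic rules: any triggered rule leads to stopping and waiting.
    if red_light_ahead or collision_ahead:
        return position_m, "wait"
    # Otherwise follow the path with the desired velocity.
    return position_m + desired_speed_mps * dt_s, "walk"
```

The sketch makes the limitation visible: the agent can only switch between two discrete maneuvers and has no means to negotiate, for example by adjusting its speed to an approaching vehicle.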
Figure 7.15: The driving simulator used in the present experiment.
7.4.2.4 Participants
In total, 46 subjects participated in the experiment, including four women (8.7 %) and 42 men
(91.3 %). All participants were employees of BMW. The meta information, including mean,
standard deviation, test statistic, and significance, is summarized in Table 7.7. Based on the
data distribution, the impact of surrounding traffic, as an independent variable, was analyzed
using the significance test according to Wilcoxon [242].
Table 7.7: Distribution of the meta data of the participants.
µ σ s p
Age [years] 38.07 9.35 0.94 0.014
Mileage [km/year] 19391.30 10558.26 0.937 0.015
Driver’s License [years owning a licence] 20.46 9.15 0.93 0.007
7.4.3 Results
7.4.3.1 The Effect of Surrounding Traffic on Perceived Realism
The effect of surrounding traffic on the sense of presence and the rating of natural behavior of
the participants in both drives is illustrated in Figure 7.16. Participants rated their sense of
presence significantly higher after performing the drive with other traffic participants. The
clear significance compared to the level of standard deviation shows that no ceiling effect
has occurred. Furthermore, the participants rated the degree of naturalism of their own
driving style significantly higher when driving with other traffic participants. The descriptive
naturalism ratings were generally rather high in the present study, which, however, may have
been influenced by the task instructions to behave as in normal traffic regarding traffic rules.
A subsequent rating of their own driving behavior being unnatural could be understood as not
complying with the given task.
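For readers unfamiliar with the paired test used here, the following sketch computes the Wilcoxon signed-rank statistic for two drives rated by the same participants. It follows the standard textbook procedure (zero differences are discarded, tied absolute differences receive average ranks) and omits the p-value. Whether the statistics reported in Figure 7.16 were obtained with exactly these conventions is an assumption.

```python
def wilcoxon_signed_rank(no_traffic, traffic):
    """Return (w_plus, w_minus, statistic) for paired ratings."""
    if len(no_traffic) != len(traffic):
        raise ValueError("paired samples must have equal length")
    # Differences per participant; zero differences carry no information.
    diffs = [b - a for a, b in zip(no_traffic, traffic) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend the group while the absolute differences are tied.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)
```

A small statistic relative to the number of non-zero differences indicates that the ratings shift consistently in one direction, here towards higher ratings in the traffic drive.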
Figure 7.16: The influence of surrounding traffic on the perceived realism and the naturalism of
driving style, whereby 7 represents a high sense of presence (*: p < 0.05, **: p < 0.01).
(Panel Presence (PRE): mean 4.47 for no traffic and 4.90 for traffic; N: 46, statistic: 157.5,
p-value: 0.0001 **. Panel Naturality (NAT): mean 6.04 for no traffic and 6.33 for traffic;
N: 46, statistic: 70.0, p-value: 0.0466 *.)
7.4.3.2 Realism of Traffic Agent Behavior
Besides the independent variable, the degree of realism of the used agent models was quantified
with respect to the key requirements for realistic behavior identified in Chapter 3. The
results in Figure 7.17 show clear differences in the three dimensions of realism. The ratings
for individual behavior are remarkably better compared to the ratings for interactivity and
spatio-temporal consistency in behavior.
These findings are supported by the subjects’ comments on the extent to which the behavior of
the traffic agents differs from reality. The observations can be summarized into the following
key statements:
• the agents behave defensively and avoid contact instead of interacting or cooperating
• the agents behave as if it is not foreseeable what will happen next
• the behavior of the agents is too compliant with traffic rules
• missing eye contact with VRUs
Furthermore, the subjects agreed with the statement that more interaction with the traffic
agents would lead to a more natural driving style. This aspect is also confirmed by the
higher-rated naturalism of driving style after the traffic drive because, without any traffic, no
interaction can take place.
Figure 7.17: Assessment of the used agent models regarding key requirements for realistic
behavior, whereby 7 represents a positive rating of, for example, high individuality.
(Mean ratings: Individuality (STAT) 6.09, Interactivity (INTERA) 4.07, Spatio-temporal
consistency (SPA-TEM) 3.87.)
7.4.4 Limitations and Future Work
The proposed method shows great potential for evaluating the behavioral realism of agent
models in different dimensions among a broad range of scenarios based on questionnaires.
However, to conduct a deeper analysis regarding different situations or individual types of
traffic participants, further items are required and should be considered in future experiments.
In the present study, the two conditions differed regarding the existence of other traffic
participants. It remains unclear, however, whether the observed positive effects are attributable
to the pure existence (observation) of other road users or to the associated interaction.
Since urban traffic always includes interactive situations, future studies could be aimed at
investigating the effects of presence in different scenarios.
In the presented experiment, only one generation of driver models could be evaluated because
the proposed methods for information representation, anticipation, and decision-making are
still at an early stage of development and have not yet been combined into a complete,
runtime-optimized, and integrated driver model.
It is important to note that when comparing different agent models using this method, the
significance of the effect in presence is highly dependent on the perceptibility of behavioral
differences and, therefore, cannot be guaranteed to provide such clear results as in this
experiment.
The proposed method is based on a purely subjective measurement approach, which can be
criticized at times with regard to the vulnerability of the results. Therefore, the combination
of the subjective and the previously presented objective method offers the potential for more
reliability in this context [116].
While the proposed objective method is suitable for iterative development and, thus, for
early conceptual stages, the subjective method requires fully developed agent models that are
integrated into the respective simulation framework in a real-time manner.
The findings of the experiment are not limited to driver models. Participants recognized a
weak performance in the interactive nature of VRUs. Consequently, agent models to simulate
cyclists and pedestrians also need to be developed further. Intelligent algorithms are required
not only in terms of a plausible trajectory but also on an explicit communication level to
exhibit situational gaze directions and gestures.
7.4.5 Conclusion
The present study showed clear effects of the surrounding traffic on the sense of presence
of participants and therefore confirmed hypothesis H1. Furthermore, the drive, including
observation and interaction with other traffic participants, was rated more natural. In
conclusion, surrounding traffic is an essential factor in terms of realism in DS and provides
the opportunity to achieve a higher level of validity by developing more realistic traffic agent
models.
The proposed method aimed at investigating the behavioral realism of traffic agents in three
dimensions to enable useful insights for model improvement. Results show that the used
models perform well in individual behavior but show weaknesses in interactivity and plausible
spatio-temporal behavior. This effect is in line with the main challenges identified in this
thesis in Chapter 3. The previously presented objective measurement method also showed a
main issue in spatio-temporal plausibility and interactivity through the increased weighting of
the time gap when interacting to distinguish between real and artificial drivers. The subjects’
ratings and comments underline the need for more dynamic decision-making and situational
understanding in such models. Regarding the present traffic agents and other state-of-the-art
models, such rule-based decision-making strategies limit the ability of situational reactions. In
summary, the proposed evaluation method provides interesting insights into the subjective
perception of agent model behavior and allows a comprehensible model evaluation.
7.5 Summary on Evaluation Methods
Considering evaluation strategies in state-of-the-art solutions, one can observe that evaluation
is considered a secondary topic in the majority of publications. Most common evaluation
strategies and metrics suffer in significance. Furthermore, the high dependency of behavior on
various external factors in urban traffic is rarely addressed. Therefore, the present chapter
aims to provide innovative evaluation methods that generate more transparency and
meaningful insights for future development. Both presented methods, the subjective and the
objective evaluation strategy, are associated with certain drawbacks and benefits. While the
proposed objective method is suitable for iterative development and, thus, for early conceptual
stages, the subjective method requires fully developed agent models that are integrated into
the respective simulation framework in a real-time manner. However, with respect to DiL
applications, the subjective method provides better insights into the severity of behavioral
differences. A good example is the jerking problem identified in Section 7.3.3.3, which was
not perceived by the participants during the simulator experiment. Furthermore, while the
latter requires a fully developed driver model, the objective method can be applied to holistic
driver models as well as subcomponents, as shown by the case study for trajectory planning.
Therefore, the objective method is more suitable during the development process, while the
subjective method can be used as a final validation strategy by providing solid evidence on
whether the interaction between driver agent and DiL appears realistic. The issues in
spatio-temporal plausibility and interactivity through inadequate parametrization of waiting time
gaps when interacting could be identified by both methods, resulting in mutually confirming
results. In summary, both methods provided promising results by revealing crucial insights and
clear actions for future developments by employing methods transferable to multiple models
and model subcomponents across various scenarios.
8 Summary, Discussion and Outlook
8.1 Summary of Results
With respect to the objectives formulated in the introduction in Section 1, the initial aim of
this thesis was to provide a clearer and more practical definition of human likeness, along with
the critical requirements for driver models. In general, one can categorize requirements into a
functional level and a human likeness level. However, both levels show a strong interrelation
since non-functional behavior contradicts human likeness. While the functional level mostly
demands collision- and deadlock-free trajectories, high-level requirements on human likeness
are identified to be: individuality, spatio-temporal consistency in behavior, and interactivity.
Since individuality is mostly achieved by different parameter sets within a model, which is
a downstream issue, this work mainly addressed the functional requirements and those for
interactivity and spatio-temporal consistency.
In particular, spatio-temporal consistency and interactivity are closely related to the functional
requirement of avoiding deadlocks. The occurrence of deadlocks is a common problem that
can originate from different model layers. For example, a deadlock can be caused by a
deficiency in the environment representation, e.g. the oncoming traffic lane is not part of
the spatial representation of driveable areas, and thus the vehicle is not able to avoid an
obstacle in the ego lane. In other cases, the decision-making strategy itself can impede the
functionality and plausibility of behavior if the resulting action space is too discretized to
find a valid solution. Spatio-temporal consistency and interactivity require a certain degree
of situational understanding and the ability to adapt behavior situationally. In addition to
these requirements, the underlying key challenge is to find general and transferable solutions
that allow a model to handle the variety of situations encountered in urban traffic without
exceeding a practical level of complexity.
Given these higher-level requirements and the state-of-the-art in driving simulation and AV
development, four main root causes have been identified that hinder the successful modeling
of human-like driver behavior in complex urban traffic. These concern the representation of
complex urban traffic situations to enable situational understanding, generalizable predictions in
interactive scenarios as a basis for interaction, more dynamic decision-making allowing for
situational behavior adaptation, and measuring the degree of human likeness of driver model
behavior.
The first challenge arises in the representation of urban traffic situations in a model-accessible
format since driving behavior is affected by multiple influences, such as road topology, traffic
rules, and the behavior of other road users. The human cognitive process of perceiving
information and creating situational understanding by associating and interpreting this
information is explored in several areas of research, for example, in the context of SA [129].
As computer science approaches are not yet able to compete with humans, human cognition
inspires the requirement that a driver model should achieve a certain level of situational
understanding in order to cope with interactive traffic situations. Most published research
focuses on single scenarios or isolated parts of the driving task. Thus, the challenges of finding
an appropriate scene representation are rarely discussed. The need to generate situational
understanding through an appropriate scene representation at a general level inspired research
question R1: "How to represent complex situations in urban traffic in a general and
transferable manner?"
Therefore, in Chapter 4, a novel method for scene representation incorporating the association
and interpretation of different information sources to enable situational understanding is
presented. The idea is to identify and provide the model with all easily accessible information
via heuristics, while more complex patterns, such as the reaction of other road users, are
captured by appropriate modeling approaches such as NNs. The proposed concept is applied
to an open-source drone dataset, providing time series and metadata of traffic participants
interacting on ROW-regulated German intersections. Information from the data is fused with
knowledge from the map to extract further context information by identifying interacting
traffic participants and their relationships. A feature vector containing raw and semantic
information describing a scene at a certain time step is generated. The effectiveness of the
scene representation is measured by means of a deep learning prediction model that anticipates
the trajectory of vehicles for the next 5 seconds. The evaluation showed that the prediction
model’s ability to make general reliable predictions on unseen data benefits significantly from
the advanced scene representation containing associated and interpreted context information
compared to only providing raw scene information.
Anticipation plays a crucial role in handling interactive traffic and becomes particularly relevant
in urban traffic, as understanding and incorporating the intentions of other road users is the
basis for safe and reasonable decision-making.
However, modern agent models do not incorporate prediction approaches, and missing
anticipation was therefore identified as the second root cause that prevents driver agents
in the simulation from exhibiting plausible and interactive behavior. As discussed in Section
3.3.4, anticipating the behavior of other road users is a frequently addressed challenge in AD,
as safe participation in traffic requires the estimation of future scene developments.
However, published approaches tend to offer limited solutions for isolated scenarios that are
not sufficiently transferable and reliable for the diversity of urban traffic. Based on this, the
second research question R2: "How to enable general and transparent predictions in
complex urban traffic?" was formulated.
In Chapter 5, a novel method is presented that aims to investigate the impact of different
conceptual choices on the performance of a NN aiming to predict the future trajectory of other
vehicles. The model was trained on different datasets and with different parameter settings
along a three-layer testing framework. Results showed that state-of-the-art evaluation metrics
do not provide sufficient transparency for identifying model weaknesses or deciding between
different model settings.
The evaluation revealed that the degree of diversity in training has a major impact on model
performance. It was found that very clean behavioral data obtained from simulation results
in the well-known problem of overfitting, which causes poor results in unknown situations.
Meanwhile, a larger variety of training data leads to reliable and accurate results even in
situations that are far from the training data. Studies on different data compositions have
demonstrated that it is possible to augment real traffic data with synthetic samples to overcome
the known problem of underrepresented situations in datasets.
In summary, the results have shown that meaningful evaluation strategies are crucial to
address the diversity of potential concepts and the shortcomings, such as overfitting and low
transparency of black-box models, to enable reliable prediction models in the future.
Besides situational understanding achieved by appropriate representations and anticipation
capabilities, dynamic adaptation of behavior to the situation has been identified as a necessary
ability and yet unsolved challenge for human-like driver agents. Decision-making in current
driver agents is mostly done hierarchically, by dividing the driving task into maneuvers and
applying heuristic rules to decide when to choose which maneuver. Given the diversity of
situations encountered in urban traffic, the strong discretization of the solution space by
distinguishing between driving maneuvers limits the ability of models to exhibit situationally
plausible behavior and leads to functional errors such as deadlocks. Therefore, research
question R3: "How to enable driver agents for dynamic decision-making in order
to cope with typical urban scenarios in a functional and human-like manner?" was
formulated.
Inspired by techniques from AVs, the proposed methods in Chapter 6 aim to enable more
dynamic decision-making in driver agents by employing trajectory planning. The idea behind
these approaches, to balance between different needs, such as safety, rule compliance, or efficient
target reaching, is more in line with the human manner of handling trafc and provides a
less discretized solution space. Based on promising publications, a novel trajectory planning
framework is developed, incorporating two layers, one creating a high-level behavioral decision
and a second one smoothing this decision into a dynamically feasible trajectory. Two different
variants for planning the high-level decision were developed to investigate the weaknesses and
potentials of different state-of-the-art techniques.
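The idea of balancing competing needs instead of choosing among a few fixed maneuvers can be sketched as a weighted cost over candidate trajectories. The snippet covers only a simplified version of the high-level decision layer. The feature names, weights, and candidate values are hypothetical, and the subsequent smoothing into a dynamically feasible trajectory is not shown.

```python
def trajectory_cost(features, weights):
    """Weighted sum of cost terms; lower is better."""
    return sum(weights[name] * features[name] for name in weights)


def select_trajectory(candidates, weights):
    """candidates: dict mapping a label to its cost features."""
    return min(candidates, key=lambda label: trajectory_cost(candidates[label], weights))


# Hypothetical weights and candidates for an unprotected turn.
WEIGHTS = {"collision_risk": 10.0, "rule_violation": 5.0, "delay": 1.0}
CANDIDATES = {
    "wait": {"collision_risk": 0.0, "rule_violation": 0.0, "delay": 4.0},
    "creep_forward": {"collision_risk": 0.1, "rule_violation": 0.0, "delay": 2.0},
    "go": {"collision_risk": 0.8, "rule_violation": 0.0, "delay": 0.5},
}
```

With these assumed values, the intermediate candidate `creep_forward` has the lowest cost. A maneuver-based agent restricted to waiting or going has no such intermediate option available.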
The behavior of the two planners was compared with that of the heuristic driver agent at BMW
in typical interactive scenarios to investigate when and to what extent the higher complexity
of planning methods provides additional value. In addition, the runtime to create a trajectory
was investigated since real-time capability is a strong criterion for agent models in simulation,
and the two planning methods differ in complexity. The results demonstrated that the planners
were able to solve highly interactive scenarios in a reasonable manner, while the heuristic
model faced deadlocks.
The evaluation revealed the planners’ high potential to address urban traffic challenges but
also a high sensitivity to changes in the situation or parameters, resulting in a high variance of
trajectory quality and runtime. However, even with those shortcomings, trajectory planning
instead of heuristic maneuver distinction is identified as the expedient method to cope with
the complexity of urban traffic. Therefore, further development of the proposed framework
and investigations across a broad range of scenarios is recommended.
While analyzing state-of-the-art solutions and examining the utility of the presented methods
within this thesis, the importance of meaningful and transparent evaluation strategies has
emerged repeatedly, prompting the fourth research question of this thesis: R4: "How to
identify model limitations and quantify the degree of human likeness of holistic
driver models and individual subcomponents?"
Human driving behavior in urban traffic is subject to a high degree of complexity and is
affected by various influences. This results in new challenges not only for modeling but
also for evaluating models, as behavior always has to be assessed in the context of the
situation. Therefore, in Chapter 7, two evaluation methods for quantifying the human likeness
of driver model behavior, aiming at identifying model limitations and enabling target-oriented
improvements, are provided.
The first method provides an objective approach to quantify the human likeness of model
behavior while considering the situational context. To this end, a multi-dimensional quality
function was formulated, incorporating various parameters to characterize driving behavior.
By assigning context parameters, it is possible to compare model behavior to the behavior
of humans under similar conditions, such as behavior when approaching an intersection and
yielding ROW to another vehicle. A combination of all individual parameters into one human
likeness score allowed for efficient evaluation even of large datasets. For validation purposes,
a Turing-test-inspired survey was conducted. The comparison of scores obtained from the
quality function and scores rated by participants demonstrated the ability of the method to
reflect subjective ratings of participants in a transparent and objective manner.
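The scoring idea can be sketched as follows. This is a minimal illustration and not the quality function of Chapter 7: the parameter names, the Gaussian-shaped similarity measure, the weights, and the reference values are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def parameter_likeness(model_value, human_values):
    """Similarity in [0, 1]: 1.0 at the human mean, decaying with the
    distance measured in human standard deviations (assumed form)."""
    mu = mean(human_values)
    sigma = stdev(human_values)
    if sigma == 0:
        return 1.0 if model_value == mu else 0.0
    z = (model_value - mu) / sigma
    return math.exp(-0.5 * z * z)


def human_likeness_score(model_params, human_reference, weights):
    """Weighted mean of per-parameter similarities within one context."""
    total_weight = sum(weights[name] for name in model_params)
    weighted = sum(
        weights[name] * parameter_likeness(value, human_reference[name])
        for name, value in model_params.items()
    )
    return weighted / total_weight


# Hypothetical human reference data for one context,
# e.g. "approaching an intersection and yielding ROW"
human_reference = {
    "approach_speed_mps": [6.0, 7.0, 8.0, 7.5, 6.5],
    "max_deceleration_mps2": [1.5, 2.0, 2.5, 2.2, 1.8],
}
weights = {"approach_speed_mps": 1.0, "max_deceleration_mps2": 2.0}

human_like = human_likeness_score(
    {"approach_speed_mps": 7.0, "max_deceleration_mps2": 2.0},
    human_reference, weights)
abrupt = human_likeness_score(
    {"approach_speed_mps": 12.0, "max_deceleration_mps2": 5.0},
    human_reference, weights)
```

A model run whose parameters lie close to the human mean for this context receives a score near 1, while a fast approach with hard braking receives a score near 0. Keeping the per-parameter similarities available alongside the aggregate score is what would allow individual weaknesses to be traced.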
Due to the modular concept, the method is applicable to evaluating holistic driver models
as well as subcomponents such as a trajectory planner. The ability of the method to reveal
explicit model weaknesses was demonstrated for both application areas by means of two case
studies.
In contrast, the second evaluation method follows a subjective approach in which the quality
of agent model behavior is assessed by measuring participants’ perceived realism by the sense
of presence in a simulator experiment, assuming that a high degree of human likeness of agent
models is associated with a high level of perceived realism.
Participants were asked to rate their sense of presence and the realism of the heuristic agent
models after observing and interacting with the agents in different urban traffic scenarios.
The existence of traffic was manipulated in a within-subjects design to investigate whether
participants’ perceived realism in a simulator experiment is an appropriate measure for
assessing the quality of traffic agents. The results demonstrated that the participants’ sense of presence
was significantly affected by the surrounding traffic. In addition, the heuristic agent models
were found to have weaknesses in spatio-temporal consistency of behavior and in their ability
to interact with the DiL, as responses in interactive situations appeared to be avoidant rather
than interactive.
8.2 Limitations and Discussion
As described in the previous section, this thesis provides promising methods and results for
the future development of human-like driver models capable of handling complex situations in
urban traffic. However, the main limitation is that this work does
not provide a complete, runtime-optimized, and integrated driver model since the proposed
methods for information representation, anticipation, and decision-making are still at an early
stage of development and need to be further refined to achieve generally reliable solutions.
Therefore, these individual pillars of a driver model require further research and development
before being combined into a holistic driver model. When connecting these subcomponents,
further challenges will arise, such as knowledge distribution, how to deal with prediction
uncertainty, and finding generally suitable parameter sets.
Key scenarios for the implementation and evaluation of the presented methods have been
defined for the representation of the most important characteristics of urban traffic. However,
given the wide variety of traffic, it is unlikely that the representations employed will cover
all the challenges potentially occurring, and some blind spots remain. This highlights the
importance of meaningful and efficient evaluation methods that are applicable to a broad
range of situations to address such unknown-unknown problems.
The methods presented in this thesis aim to combine heuristic modeling and deep learning
approaches to get the most out of both worlds. However, when applying heuristic rules to real
traffic scenes, it is not possible to cover all possible combinations that may arise from human
behavior. Due to the numerous iterative development loops during the processing of the real
traffic data, most of these implausibilities could be identified and covered by the presented
algorithms. However, the use of heuristics always faces limitations due to their categorizing
nature.
Meanwhile, DL approaches lack transparency due to the black-box nature of the model
and are associated with challenges such as overfitting. As a consequence, these approaches
also show disadvantages and do not provide one unique perfect solution to achieve reasonable
results for the variety of urban traffic situations. Combining heuristic approaches that extract
semantic knowledge about the situation with a highly nonlinear NN to handle such complex
tasks as behavior prediction holds great potential. However, extensive testing and critical
evaluation against multiple test scenarios are required to ensure reliability.
The present thesis aims to provide a new perspective by combining theories and methods from
different research areas such as psychology, robotics, and computer science. Furthermore, the
objective of this thesis consisted of finding generally applicable and transferable solutions. Due
to these requirements and the differentiated view on modeling human-like driving behavior,
this work provides insights, methods, and connecting points for future developments instead
of a detailed solution strategy for isolated scenarios. In particular, the recommendations for
future work provided by the individual methodology chapters (4, 5, 6) point to a number of as-yet
unsolved challenges.
8.3 Final Conclusion
In alignment with the motivation outlined in the introduction, the present thesis endeavored
to address the challenge of modeling human-like driving behavior in urban environments,
with a particular focus on the application in DS. To accomplish this objective, the initial
chapter undertook the task of categorizing and evaluating state-of-the-art solutions in various
associated research areas. In addition to exploring solutions within the realm of DS, approaches
from the domain of AVs were discussed due to their profound relevance to the core issues under
discussion. Furthermore, the interdisciplinary nature of this topic, encompassing facets of
robotics, computer science, cognitive modeling, and psychology, was taken into consideration,
offering new perspectives for the exploration of this subject.
Owing to the multifaceted nature of the challenge associated with modeling human-like
behavior, various definitions and perspectives on defining human-like driving behavior were
analyzed to derive essential modeling capabilities. The obtained requirements were compared
against contemporary solutions, resulting in the identification of four primary challenges that
necessitate resolution to enable the incorporation of human-like driver models within urban
DS.
Subsequently, the methodology part of this thesis introduced innovative solutions for
representing complex situations in urban traffic to enable situational understanding, methods
for developing and evaluating generalizable prediction models able to predict the future motion
of vehicles, strategies allowing situational behavior adaptation by employing more dynamic
decision-making techniques, and approaches for measuring the level of human likeness of driver
model behavior.
The proposed methods for representation, prediction, and planning provided promising results
and generated increased transparency and in-depth comprehension, narrowing the scope of the
broad scientific problem and transforming it into distinct technical challenges that can be the
focus of forthcoming research endeavors.
Moreover, the evaluation methods presented herein furnish indispensable tools for advancing
model development by the identification of strengths and limitations of holistic driver models
and subcomponents across the broad variety of situations encountered in urban traffic.
In view of all the results presented and the analysis conducted, the main conclusion is that
there is no one perfect solution satisfying the various requirements of potential applications;
instead, the appropriate balance must be determined. Modeling such complex behavior will
always involve a trade-off between generalizability, solution quality in detail, complexity, and
calculation effort. The solution quality in specific situations will suffer under a higher level
of generalizability. The more complex solution will mostly provide better results but at the
same time be associated with higher calculation effort and thus suffer in real-time capability.
The approach prioritizing safety will face deadlocks more often since, due to safe distances or
uncertainties, no solution space remains. In contrast, the more risk-taking approach will solve
more situations but might lead to critical situations, as they appear in real traffic between
humans due to misunderstandings or misjudgments.
Despite the complexity of human driving behavior, the topic must be considered holistically in
its interdisciplinary nature since the consideration of isolated scenarios leads to assumptions that
are not transferable and, thus, to solutions that are not practicable. Since there is no perfect
solution covering all problems, it is necessary to define clear objectives and requirements for a
model in the triad of complexity, generalizability, and computational complexity. Therefore,
clear and testable requirements, dynamic and transferable modeling approaches, and meaningful
evaluation strategies, as presented within this thesis, are required.
8.4 Summarized Outlook
The present thesis has introduced novel concepts that pave the path toward the creation of
reliable driver models exhibiting human-like behavior in urban traffic scenarios. Driver models
often exhibit limited transparency, mainly due to complex modularized architectures that
aggregate different rules and scenarios, complex formulations of optimization problems, or
the utilization of DL techniques that are characterized by their black-box nature. Therefore,
despite the prevailing literature on this subject, future research endeavors should dedicate more
attention to the development of meaningful, transparency-creating, and practical evaluation
methods.
Furthermore, driving in urban traffic is a highly complex task, and various aspects influence
behavior. Considering these factors, the root problem may not necessarily rest in an ill-suited
approach to solutions but rather in the limited understanding of how the model operates in
untested scenarios or when its behavior deviates from that of human drivers. In order to
confront the multifaceted challenge of diversity within urban traffic, certain key challenges
must be systematically addressed in forthcoming research efforts.
To begin, it is essential to develop meaningful evaluation strategies that can be efficiently
applied across a wide range of situations. The precise design of the evaluation method may
vary depending on the specific model or subcomponent being examined. In Chapter 5, a
concept is presented for enhancing the transparency of data-driven prediction models through
experimentation with different datasets and model settings and the measurement of model
performance incorporating accuracy and functional plausibility. In Chapter 7, an expanded
version of such a plausibility metric was introduced with a stronger focus on human likeness.
It is recommended to combine the two plausibility metric concepts and extend the involved
heuristics to cover further traffic scenarios. Drawing from the general scene description capable
of representing diverse traffic scenarios, these methods can be easily expanded to encompass
additional traffic situations. For instance, the inclusion of traffic light signal data enables their
application to signalized intersections, while the extension of heuristics accommodates shared
spaces such as pedestrian crossings.
After an expansion of heuristics, additional data can be gathered by following the framework
outlined in Chapter 4. This newly acquired data can then serve multiple purposes, including in-
depth analysis of the prediction model, expanded testing of the trajectory planning framework,
and gaining a more comprehensive understanding of the constraints inherent to heuristic
models.
As AD and DS are not only relevant for German traffic, the development for other markets
and countries requires transferable methods and country-specific data. The proposed
data-processing algorithms can be applied to analyze geographical and cultural differences in traffic
behavior, such as comparing German and Chinese behavior to identify which aspects of
behavior differ significantly. Moreover, the prediction, planning, and evaluation methods are
designed to be flexible: country-specific behavior can be easily learned by the
NN presented in Chapter 5, and decision-making can be parameterized within the planning
framework proposed in Chapter 6 by defining a specific fitness function for the GA with the
parameter ranges of the country-specific data, for example, to tune the planner towards more
Chinese-like driving behavior.
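How a fitness function could be parameterized with data-derived ranges is sketched below. The sketch is an assumption-laden illustration, not the fitness function used in Chapter 6: the feature names (`time_gap_s`, `max_accel_mps2`), the linear out-of-range penalty, and the numeric ranges are hypothetical, and real ranges would have to be extracted from the country-specific datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterRange:
    """Range of a behavior parameter as observed in one dataset."""
    low: float
    high: float

    def penalty(self, value: float) -> float:
        """Zero inside the range, growing linearly with the relative
        distance outside it."""
        width = self.high - self.low
        if value < self.low:
            return (self.low - value) / width
        if value > self.high:
            return (value - self.high) / width
        return 0.0


def make_fitness(ranges):
    """Build a fitness function (higher is better) from parameter ranges."""
    def fitness(candidate):
        total_penalty = sum(
            ranges[name].penalty(candidate[name]) for name in ranges)
        return 1.0 / (1.0 + total_penalty)
    return fitness


# Hypothetical ranges for two markets; the values are placeholders
ranges_a = {"time_gap_s": ParameterRange(1.2, 2.5),
            "max_accel_mps2": ParameterRange(0.5, 2.0)}
ranges_b = {"time_gap_s": ParameterRange(0.6, 1.5),
            "max_accel_mps2": ParameterRange(0.8, 3.0)}

# Features of one candidate trajectory produced by the GA
candidate = {"time_gap_s": 0.8, "max_accel_mps2": 2.5}
fitness_a = make_fitness(ranges_a)(candidate)
fitness_b = make_fitness(ranges_b)(candidate)
```

The same candidate is rated differently depending on which ranges parameterize the fitness function, so exchanging the ranges is enough to shift the search toward another driving style without changing the planner itself.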
After further refinement of the individual subcomponents within the driver model, the next
important milestone is to combine the anticipation and planning models. As already indicated,
topics such as the management of prediction uncertainties and the investigation of knowledge
distribution must be explored in greater depth at this point.
Given that evaluations have demonstrated the heuristic driver model’s capability to
handle a wide array of less complex traffic scenarios, it is advisable to consider a modular
structure that switches between the heuristic decision strategy and the approaches introduced
within this thesis contingent upon the situation’s complexity. To facilitate this transition, a
thorough comprehension of when and to what extent the more complex modeling approaches
add value becomes essential. A preliminary exploration of this aspect is presented in Chapter
6, where the planning frameworks are compared with the heuristic model. Furthermore,
additional analyses are presented in Chapter 7. Based on such insights, one can either
formulate heuristic rules or devise an alternative intelligent distinction technique capable of
discerning a situation’s complexity, and thus allowing for the wise selection of the appropriate
methodology for decision-making.
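A rule-based variant of such a distinction could look like the following sketch. The situation features, the scoring rule, and the threshold are purely hypothetical; a real selector would be derived from the comparisons in Chapters 6 and 7.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical summary of the current traffic situation."""
    interacting_agents: int
    conflict_points: int
    right_of_way_clear: bool


def complexity_score(situation: Situation) -> int:
    """Assumed heuristic: more interaction partners, more conflict points,
    and an unclear right of way all increase complexity."""
    score = situation.interacting_agents + 2 * situation.conflict_points
    if not situation.right_of_way_clear:
        score += 3
    return score


def select_strategy(situation: Situation, threshold: int = 5) -> str:
    """Use the cheap heuristic model unless the situation is complex."""
    if complexity_score(situation) >= threshold:
        return "trajectory_planner"
    return "heuristic"


car_following = Situation(interacting_agents=1, conflict_points=0,
                          right_of_way_clear=True)
busy_intersection = Situation(interacting_agents=3, conflict_points=2,
                              right_of_way_clear=False)

strategy_simple = select_strategy(car_following)
strategy_complex = select_strategy(busy_intersection)
```

In this sketch the car-following case stays with the heuristic model, while the intersection case is handed to the planner. The threshold then becomes the single tuning parameter that trades runtime against decision quality.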
Finally, with the special focus on DS applied to DiL applications, the requirement of interactive
behavior and, thus, communication between road users needs to be considered from a broader
perspective. Communication in traffic can be exhibited implicitly, for example, by slowing down
in front of a crosswalk, or explicitly through a hand gesture [130]. As stated in the introduction
chapter, in this thesis, the output of a driver model was considered to be spatio-temporal
motion. However, concerning realistic behavior in interactive urban traffic, the modeling of
explicit communication cues, such as gestures, should also be considered. This poses challenges
such as the visual modeling of the gesture itself as well as the decision of when to show which
gesture. The latter requires a high degree of situational understanding and intelligence within
the model.
To stay within the context of DiL applications in DS, not only the surrounding vehicles need
to be modeled using agent models. Besides vehicles, pedestrians and agent models covering a
variety of transportation possibilities, such as bicyclists, scooters, or motorcycles, are required.
When modeling such VRUs, new challenges arise due to the high degree of free-space motion
and because active communication, such as gaze direction, becomes even more relevant.
To conclude, numerous challenges remain for modeling and imitating human traffic behavior.
Human beings have a complex nature and various sophisticated cognitive abilities, which are
not yet fully explored or understood. To bridge this gap between computer science and human
nature, interdisciplinary perspectives, dynamic methods, and critical evaluations are required
to find the best solution satisfying individual requirements.
References
[1] Junqing Wei, John M Dolan, and Bakhtiar Litkouhi. “A learning-based autonomous
driver: emulate human driver’s intelligence in low-speed car following”. In: Unattended
ground, sea, and air sensor technologies and applications XII. Vol. 7693. SPIE. 2010,
pp. 93–104.
[2] Mysore N Sharath, Nagendra R Velaga, and Mohammed A Quddus. “2-dimensional
human-like driver model for autonomous vehicles in mixed traffic”. In: IET Intelligent
Transport Systems 14.13 (2020), pp. 1913–1922.
[3] Klaus Bengler et al. “UR:BAN human factors in traffic”. In: Approaches for Safe,
Efficient and Stress-Free Urban Traffic; Springer: Wiesbaden, Germany (2018).
[4] Zaheer Allam and Ayyoob Sharifi. “Research Structure and Trends of Smart Urban
Mobility”. In: Smart Cities 5.2 (2022), pp. 539–561.
[5] Cordelia Friesendorf and Luca Uedelhoven. “Megatrends Influencing Mobility”. In:
Mobility in Germany: digital transformation, megatrends and the evolution of new
business models. SpringerBriefs in Business. Cham: Springer International Publishing,
2021, pp. 19–23.
[6] Maximilian Hübner et al. “External communication of automated vehicles in mixed
traffic: Addressing the right human interaction partner in multi-agent simulation”. In:
Transportation research part F: traffic psychology and behaviour 87 (2022), pp. 365–378.
[7] Lucas Bruck, Bruce Haycock, and Ali Emadi. “A review of driving simulation technology
and applications”. In: IEEE Open Journal of Vehicular Technology 2 (2020), pp. 1–16.
[8] Jason Jerald. “Immersion, presence and reality trade-offs”. In: 2016, pp. 45–52.
[9] Gustav Markkula et al. “Defining interactions: A conceptual framework for understanding
interactive behaviour in human and automated road traffic”. In: Theoretical Issues
in Ergonomics Science 21.6 (2020), pp. 728–752.
[10] Hermann Winner et al., eds. Handbook of driver assistance systems. Cham: Springer
International Publishing, 2016.
[11] Marco Luetzenberger and Sahin Albayrak. “Can you simulate traffic psychology? an
analysis”. In: 2013 Winter Simulations Conference (WSC). IEEE. 2013, pp. 1539–1550.
[12] Corina Apachite, Ralph Lauxmann, Robert Thiel, et al. “AI for Automated Driving”.
In: ATZelectronics worldwide 16.9 (2021), pp. 48–51.
[13] Almut Hochstaedter, Peter Zahn, and Karsten Breuer. “A comprehensive driver model
with application to traffic simulation and driving simulators”. In: Proc. Human-Centered
Transportation Simulation Conf. HCTSC, Iowa City. 2001.
[14] James Imende Obuhuma, Henry Okora Okoyo, and Sylvester Okoth McOyowo. “A
software agent for vehicle driver modeling”. In: 2019 IEEE AFRICON. IEEE. 2019,
pp. 1–8.
[15] Manuela Witt et al. “Cognitive driver behavior modeling: Influence of personality
and driver characteristics on driver behavior”. In: Advances in Human Aspects of
Transportation: Proceedings of the AHFE 2018 International Conference on Human
Factors in Transportation, July 21-25, 2018, Loews Sapphire Falls Resort at Universal
Studios, Orlando, Florida, USA 9. Springer. 2019, pp. 751–763.
[16] Jun JIANG and Jian LU. “Research of Driver-Vehicle Unit Model Framework Based
on Agent and ACT-R”. In: The National Science Foundation of China (2009). (Visited
on 11/05/2020).
[17] R. Herpers et al. Agentenbasierte Verkehrssimulation mit psychologischen
Persönlichkeitsprofilen (AVeSi). Tech. rep. University of Applied Sciences Bonn-Rhein-Sieg,
Department of Computer Science, 2015.
[18] Mohammad Bahram et al. “Microscopic traffic simulation based evaluation of highly
automated driving on highways”. In: 17th International IEEE Conference on Intelligent
Transportation Systems (ITSC). IEEE. 2014, pp. 1752–1757.
[19] Benhamza Karima et al. “Agent-based modeling for traffic simulation”. In: Courrier du
savoir (Jan. 2012), pp. 51–56.
[20] Qianjiao Wu, Rong Lan, and Wei Zhou. “Traffic Flow Simulation Based on Adaptive
Agent”. In: 2018 3rd International Conference on Smart City and Systems Engineering
(ICSCSE). IEEE. 2018, pp. 697–700.
[21] Jie Sun, Yongping Zhang, and Jianbo Fan. “SmartAgents: a scalable infrastructure
for smart car”. In: 2011 12th International Conference on Parallel and Distributed
Computing, Applications and Technologies. IEEE. 2011, pp. 99–103.
[22] Amgad Naiem et al. “An agent based approach for modeling traffic flow”. In: 2010 The
7th international conference on informatics and systems (INFOS). IEEE. 2010, pp. 1–6.
[23] Levente Alekszejenkó and Tadeusz P Dobrowiecki. “SUMO Based Platform for
Cooperative Intelligent Automotive Agents.” In: SUMO. 2019, pp. 107–123.
[24] José LF Pereira and Rosaldo JF Rossetti. “An integrated architecture for autonomous
vehicles simulation”. In: Proceedings of the 27th annual ACM symposium on applied
computing. 2012, pp. 286–292.
[25] Ali Bazghandi. “Techniques, advantages and problems of agent based modeling for
traffic simulation”. In: International Journal of Computer Science Issues (IJCSI) 9.1
(2012), p. 115.
[26] Eric Bonabeau. “Agent-based modeling: Methods and techniques for simulating human
systems”. In: Proceedings of the national academy of sciences 99.suppl_3 (2002),
pp. 7280–7287.
[27] Charles M Macal and Michael J North. “Agent-based modeling and simulation”. In:
Proceedings of the 2009 winter simulation conference (WSC). IEEE. 2009, pp. 86–98.
[28] Li Zhou and Kai Zhao. “The design of agent-based intelligent traffic visualized simulation
system”. In: 2010 International Conference on Electrical and Control Engineering. IEEE.
2010, pp. 3066–3069.
[29] Pablo Alvarez Lopez et al. “Microscopic traffic simulation using SUMO”. In: 2018
21st international conference on intelligent transportation systems (ITSC). IEEE. 2018,
pp. 2575–2582.
[30] Tianjiao Wang, Jianping Wu, and Mike McDonald. “A micro-simulation model of
pedestrian-vehicle interaction behavior at unsignalized mid-block locations”. In: 2012
15th International IEEE Conference on Intelligent Transportation Systems. IEEE. 2012,
pp. 1827–1833.
[31] C-A Brunet, Ruben Gonzalez-Rubio, and Mario Tetreault. “A multi-agent architecture
for a driver model for autonomous road vehicles”. In: Proceedings 1995 Canadian
conference on electrical and computer engineering. Vol. 2. IEEE. 1995, pp. 772–775.
[32] Dario D Salvucci. “Modeling driver behavior in a cognitive architecture”. In: Human
factors 48.2 (2006), pp. 362–380.
[33] Hideki Fujii, Hideaki Uchida, and Shinobu Yoshimura. “Agent-based simulation
framework for mixed traffic of cars, pedestrians and trams”. In: Transportation research
part C: emerging technologies 85 (2017), pp. 234–248.
[34] Michael Behrisch et al. “SUMO–simulation of urban mobility: an overview”. In:
Proceedings of SIMUL 2011, The Third International Conference on Advances in
System Simulation. ThinkMind. 2011.
[35] Jaume Barceló et al. Fundamentals of traffic simulation. Vol. 145. Springer, 2010.
[36] Michael Kutz and Rainer Herpers. “Urban traffic simulation for games: a general
approach for simulation of urban actors”. In: Proceedings of the 2008 Conference on
Future Play: Research, Play, Share. 2008, pp. 181–184.
[37] Martin Fellendorf and Peter Vortisch. “Microscopic traffic flow simulator VISSIM”. In:
Fundamentals of traffic simulation (2010), pp. 63–93.
[38] Dominik Salles, Stefan Kaufmann, and Hans-Christian Reuss. “Extending the intelligent
driver model in SUMO and verifying the drive of trajectories with aerial measurements”.
In: 1 (2020), pp. 1–25.
[39] Jakob Erdmann. “Lane-changing model in SUMO”. In: Proceedings of the SUMO2014
modeling mobility with open data 24 (2014), pp. 77–88.
[40] cogniBIT - drivebot. WEBSITE. url: https://cognibit.de/drivebot/ (visited on
10/06/2021).
[41] AAI Intelligent Traffic. WEBSITE. url:
https://www.automotive-ai.com/aai-blog/caevrevent21 (visited on 01/10/2024).
[42] Arpan Kusari et al. “Enhancing SUMO simulator for simulation based testing and
validation of autonomous vehicles”. In: 2022 IEEE Intelligent Vehicles Symposium (IV).
IEEE. 2022, pp. 829–835.
[43] Philippe Mathieu and Antoine Nongaillard. “A risk-driven model for traffic simulation”.
In: Distributed Computing and Artificial Intelligence, 17th International Conference.
Springer. 2021, pp. 1–10.
[44] Alexandre Bonhomme, Philippe Mathieu, and Sébastien Picault. “A versatile description
framework for modeling behaviors in traffic simulations”. In: 2014 IEEE 26th
International Conference on Tools with Artificial Intelligence. IEEE. 2014, pp. 937–944.
[45] Xianyan Kuang et al. “Multi-Agent Based Microscopic Simulation Modeling for Urban
Traffic Flow”. In: Sensors & Transducers 180.10 (2014), p. 117.
[46] Steffen Kampmann et al. “Automatic mapping of human behavior data to personality
model parameters for traffic simulations in virtual environments”. In: 2015 IEEE
conference on computational intelligence and games (CIG). IEEE. 2015, pp. 336–343.
[47] Alexandra Fries et al. “Driver behavior model for the safety assessment of automated
driving”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022,
pp. 1669–1674.
[48] Alexandra Fries et al. “Modeling driver behavior in critical traffic scenarios for the
safety assessment of automated driving”. In: Traffic injury prevention 24.sup1 (2023),
S105–S110.
[49] Martin Treiber, Ansgar Hennecke, and Dirk Helbing. “Congested traffic states in
empirical observations and microscopic simulations”. In: Physical Review E 62.2 (2000),
pp. 1805–1824.
[50] Kyle Brown, Katherine Driggs-Campbell, and Mykel J Kochenderfer. “A taxonomy and
review of algorithms for modeling and predicting human driver behavior”. In: arXiv
preprint arXiv:2006.08832 (2020).
[51] Martin Fellendorf. “VISSIM: A microscopic simulation tool to evaluate actuated signal
control including bus priority”. In: 64th Institute of transportation engineers annual
meeting. Vol. 32. Springer. 1994, pp. 1–9.
[52] Arne Kesting, Martin Treiber, and Dirk Helbing. “General lane-changing model MOBIL
for car-following models”. In: Transportation Research Record 1999.1 (2007), pp. 86–94.
[53] Carl-Johan Hoel et al. “Combining planning and deep reinforcement learning in tactical
decision making for autonomous driving”. In: IEEE transactions on intelligent vehicles
5.2 (2019), pp. 294–305.
[54] Manuela Witt et al. “Cognitive driver behavior modeling: Influence of personality
and driver characteristics on driver behavior”. In: Advances in Human Aspects of
Transportation: Proceedings of the AHFE 2018 International Conference on Human
Factors in Transportation, July 21-25, 2018, Loews Sapphire Falls Resort at Universal
Studios, Orlando, Florida, USA 9. Springer. 2019, pp. 751–763.
[55] Teresa Rock, Thomas Bleher, and Mohammad Bahram. “Spider - the Simulation Framework at
BMW”. Proceedings of the Driving Simulation Conference 2024 Europe VR. Publication
in progress.
[56] Rainer Wiedemann. “Simulation des Strassenverkehrsflusses”. In: Schriftenreihe des
Instituts für Verkehrswesen der Universität (TH) Karlsruhe Heft 8/1974 (1974).
[57] Jiří Vokřínek et al. “A cooperative driver model for traffic simulations”. In: 2013
11th IEEE International Conference on Industrial Informatics (INDIN). IEEE. 2013,
pp. 756–761.
[58] Xiaohui Li et al. “Real-time trajectory planning for autonomous urban driving:
Framework, algorithms, and verifications”. In: IEEE/ASME Transactions on mechatronics
21.2 (2015), pp. 740–753.
[59] Yu Zhang et al. “Hybrid Trajectory Planning for Autonomous Driving in Highly
Constrained Environments”. In: IEEE Access 6 (2018), pp. 32800–32819.
[60] Branka Mirchevska et al. “High-level decision making for safe and reasonable
autonomous lane changing using reinforcement learning”. In: 2018 21st International
Conference on Intelligent Transportation Systems (ITSC). IEEE. 2018, pp. 2156–2162.
[61] Gustav Markkula et al. “Models of human decision-making as tools for estimating and
optimizing impacts of vehicle automation”. In: Transportation research record 2672.37
(2018), pp. 153–163.
[62] David Isele. “Interactive decision making for autonomous vehicles in dense traffic”.
In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC). IEEE. 2019,
pp. 3981–3986.
[63] Zirui Li et al. “Driver behavior modelling at the urban intersection via canonical
correlation analysis”. In: 2020 3rd International Conference on Unmanned Systems
(ICUS). IEEE. 2020, pp. 564–569.
[64] Yonghwan Jeong and Kyongsu Yi. “Target Vehicle Motion Prediction-Based Motion
Planning Framework for Autonomous Driving in Uncontrolled Intersections”. In: IEEE
Transactions on Intelligent Transportation Systems 22.1 (2021), pp. 168–177.
[65] Kichun Jo et al. “Development of Autonomous Car—Part II: A Case Study on the
Implementation of an Autonomous Driving System Based on Distributed Architecture”.
In: IEEE Transactions on Industrial Electronics 62.8 (2015), pp. 5119–5132.
[66] Wonteak Lim et al. “Hierarchical Trajectory Planning of an Autonomous Car Based on
the Integration of a Sampling and an Optimization Method”. In: IEEE Transactions
on Intelligent Transportation Systems 19 (2018), pp. 613–626.
[67] Wonteak Lim et al. “Hybrid trajectory planning for autonomous driving in on-road
dynamic scenarios”. In: IEEE Transactions on Intelligent Transportation Systems 22.1
(2019), pp. 341–355.
[68] Florin Leon and Marius Gavrilescu. “A review of tracking and trajectory prediction
methods for autonomous driving”. In: Mathematics 9.6 (2021), p. 660.
[69]
Dongchan Kim, Hayoung Kim, and Kunsoo Huh. “Trajectory planning for autonomous
highway driving using the adaptive potential feld”. In: 2018 21st International
Conference on Intelligent Transportation Systems (ITSC). IEEE. 2018, pp. 1069–1074.
[70]
Pramila P. Shinde and Seema Shah. “A Review of Machine Learning and Deep Learning
Applications”. In: 2018 Fourth International Conference on Computing Communication
Control and Automation (ICCUBEA). 2018, pp. 1–6.
[71]
Yassine Ouali, Céline Hudelot, and Myriam Tami. “An overview of deep semi-supervised
learning”. In: arXiv preprint arXiv:2006.05278 (2020).
[72]
Alexandre Alahi et al. “Social lstm: Human trajectory prediction in crowded spaces”.
In: Proceedings of the IEEE conference on computer vision and pattern recognition.
2016, pp. 961–971.
[73]
Alex Kuefer et al. “Imitating driver behavior with generative adversarial networks”.
In: 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2017, pp. 204–211.
[74]
Nachiket Deo, Akshay Rangesh, and Mohan M Trivedi. “How would surround vehicles
move? a unifed framework for maneuver classifcation and motion prediction”. In: IEEE
Transactions on Intelligent Vehicles 3.2 (2018), pp. 129–140.
[75]
Joey Hong, Benjamin Sapp, and James Philbin. “Rules of the road: Predicting driving
behavior with a convolutional model of semantic interactions”. In: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, pp. 8454–8462.
[76]
Jens Schulz et al. “Learning interaction-aware probabilistic driver behavior models from
urban scenarios”. In: 2019 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2019,
pp. 1326–1333.
[77]
Xiaoyu Mo, Yang Xing, and Chen Lv. “Recog: A deep learning framework with
heterogeneous graph for interaction-aware trajectory prediction”. In: arXiv preprint
arXiv:2012.05032 (2020).
[78]
Xiaoyu Mo, Yang Xing, and Chen Lv. “Heterogeneous edge-enhanced graph attention
network for multi-agent trajectory prediction”. In: arXiv preprint arXiv:2106.07161
(2021).
[79]
Mohammadhossein Bahari, Ismail Nejjar, and Alexandre Alahi. “Injecting knowledge in
data-driven vehicle trajectory predictors”. In: Transportation research part C: emerging
technologies 128 (2021), p. 103010.
[80]
Debaditya Roy et al. “Vehicle trajectory prediction at intersections using interaction
based generative adversarial networks”. In: 2019 IEEE Intelligent transportation systems
conference (ITSC). IEEE. 2019, pp. 2318–2323.
[81]
Mohammadhossein Bahari et al. “Vehicle trajectory prediction works, but not
everywhere”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition. 2022, pp. 17123–17133.
[82]
Iago Gomes and Denis Wolf. “A review on intention-aware and interaction-aware
trajectory prediction for autonomous vehicles”. In: Authorea Preprints (2023).
[83]
Rohan Chandra et al. “Traphic: Trajectory prediction in dense and heterogeneous
traffic using weighted interactions”. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2019, pp. 8483–8492.
[84]
Hyeongseok Jeon, Junwon Choi, and Dongsuk Kum. “Scale-net: Scalable vehicle
trajectory prediction network under random number of interacting vehicles via edge-
enhanced graph convolutional neural network”. In: 2020 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS). IEEE. 2020, pp. 2095–2102.
[85]
Siyu Teng et al. “Motion planning for autonomous driving: The state of the art and
future perspectives”. In: IEEE Transactions on Intelligent Vehicles (2023).
[86]
David Lenz et al. “Deep neural networks for Markovian interactive scene prediction in
highway scenarios”. In: 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2017,
pp. 685–692.
[87]
Markus Koschi and Matthias Althoff. “Set-based prediction of traffic participants
considering occlusions and traffic rules”. In: IEEE Transactions on Intelligent Vehicles
6.2 (2020), pp. 249–265.
[88]
Jens Schulz et al. “Interaction-aware probabilistic behavior prediction in urban
environments”. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS). IEEE. 2018, pp. 3999–4006.
[89]
Noor Hafizah Amer et al. “Modelling and control strategies in path tracking control for
autonomous ground vehicles: a review of state of the art and challenges”. In: Journal
of intelligent & robotic systems 86 (2017), pp. 225–254.
[90]
Wuhong Wang et al. “A cross-cultural analysis of driving behavior under critical
situations: A driving simulator study”. In: Transportation research part F: traffic
psychology and behaviour 62 (2019), pp. 483–493.
[91]
F Freuli et al. “Cross-cultural perspective of driving style in young adults: Psychometric
evaluation through the analysis of the Multidimensional Driving Style Inventory”. In:
Transportation research part F: traffic psychology and behaviour 73 (2020), pp. 425–432.
[92]
Biying Shen et al. “The different effects of personality on prosocial and aggressive
driving behaviour in a Chinese sample”. In: Transportation research part F: traffic
psychology and behaviour 56 (2018), pp. 268–279.
[93]
Lei Zhang et al. “A quantification method of driver characteristics based on Driver
Behavior Questionnaire”. In: 2009 IEEE Intelligent Vehicles Symposium. IEEE. 2009,
pp. 616–620.
[94]
Jens Rasmussen. “Skills, rules, and knowledge; signals, signs, and symbols, and other
distinctions in human performance models”. In: IEEE Transactions on Systems, Man,
and Cybernetics SMC-13.3 (1983), pp. 257–266.
[95]
Edmund Donges. “Aspekte der aktiven Sicherheit bei der Führung von Personenkraftwagen”.
In: Automob-Ind 27.2 (1982).
[96]
Christian P Janssen et al. “Cognitive Modelling at the UCL Interaction Centre”. In:
2010.
[97]
Mehdi Cina and Ahmad B. Rad. “Categorized review of drive simulators and driver
behavior analysis focusing on ACT-R architecture in autonomous vehicles”. In:
Sustainable Energy Technologies and Assessments 56 (2023), p. 103044.
[98]
Christian P Janssen et al. “Computational Models of Human-Automated Vehicle
Interaction (Dagstuhl Seminar 22102)”. In: Dagstuhl Reports. Vol. 12. 3. Schloss
Dagstuhl-Leibniz-Zentrum für Informatik. 2022.
[99]
Christian P Janssen and Wayne D Gray. “When, what, and how much to reward in
reinforcement learning-based models of cognition”. In: Cognitive science 36.2 (2012),
pp. 333–358.
[100]
Tessa van der Heiden, Florian Mirus, and Herke van Hoof. “Social navigation with
human empowerment driven deep reinforcement learning”. In: Artificial Neural Networks
and Machine Learning–ICANN 2020: 29th International Conference on Artificial Neural
Networks, Bratislava, Slovakia, September 15–18, 2020, Proceedings, Part II 29. Springer.
2020, pp. 395–407.
[101]
Raunak P Bhattacharyya et al. “Multi-agent imitation learning for driving simulation”.
In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
IEEE. 2018, pp. 1534–1539.
[102]
Feryal Behbahani et al. “Learning from demonstration in the wild”. In: 2019
International Conference on Robotics and Automation (ICRA). IEEE. 2019, pp. 775–781.
[103]
Teresa Rock et al. “Quantifying Realistic Behaviour of Traffic Agents in Urban Driving
Simulation Based on Questionnaires”. In: 2022 IEEE Intelligent Vehicles Symposium
(IV). IEEE. 2022, pp. 1675–1682.
[104]
Liu Yang et al. “Effect of traffic density on drivers’ lane change and overtaking maneuvers
in freeway situation—A driving simulator–based study”. In: Traffic injury prevention
19.6 (2018), pp. 594–600.
[105]
Andreas Keler et al. “A bicycle simulator for experiencing microscopic traffic flow
simulation in urban environments”. In: 2018 21st International Conference on Intelligent
Transportation Systems (ITSC). IEEE. 2018, pp. 3020–3023.
[106]
Manuel Lindorfer, Christoph F Mecklenbraeuker, and Gerald Ostermayer. “Modeling
the imperfect driver: Incorporating human factors in a microscopic traffic model”. In:
IEEE Transactions on Intelligent Transportation Systems 19.9 (2017), pp. 2856–2870.
[107]
Aleksandar Kostikj, Milan Kjosevski, and Ljupcho Kocarev. “Validation of a microscopic
single lane urban traffic simulator”. In: 2014 International Conference on Connected
Vehicles and Expo (ICCVE). IEEE. 2014, pp. 850–854.
[108]
Christian Rudloff, Robert Schoenauer, and Martin Fellendorf. “Comparing Calibrated
Shared Space Simulation Model with Real-Life Data”. In: Transportation Research
Record (2013), pp. 1–14.
[109]
HyunSuk Kim et al. “Driving characteristics analysis of young and middle-aged drivers”.
In: 2016 International Conference on Information and Communication Technology
Convergence (ICTC). IEEE. 2016, pp. 864–867.
[110] Peter Mörtl et al. “Modelling driver styles based on driving data”. In: (2017).
[111]
Günther Prokop. “Modeling human vehicle driving by model predictive online
optimization”. In: Vehicle system dynamics 35.1 (2001), pp. 19–53.
[112]
Jianbang Liu et al. “A Survey on Deep-Learning Approaches for Vehicle Trajectory
Prediction in Autonomous Driving”. In: 2021 IEEE International Conference on Robotics
and Biomimetics (ROBIO) (2021), pp. 978–985.
[113]
Yiran Zhang et al. “Human-Like Interactive Behavior Generation for Autonomous
Vehicles: A Bayesian Game-Theoretic Approach with Turing Test”. In: Advanced
Intelligent Systems 4.5 (2022), p. 2100211.
[114]
Abs Dumbuya et al. “Complexity of traffic interactions: Improving behavioural
intelligence in driving simulation scenarios”. In: Complex systems and self-organization
modelling (2009), pp. 201–209.
[115]
Jari Takatalo, Göte Nyman, and Leif Laaksonen. “Components of human experience in
virtual environments”. In: Computers in Human Behavior 24.1 (2008), pp. 1–15.
[116]
Christophe Deniaud et al. “The concept of “presence” as a measure of ecological validity
in driving simulators”. In: Journal of Interaction Science 3 (2015), pp. 1–13.
[117]
Ilsun Rhiu et al. “The evaluation of user experience of a human walking and a driving
simulation in the virtual reality”. In: International journal of industrial ergonomics 79
(2020), p. 103002.
[118]
Mel Slater et al. “How we experience immersive virtual environments: the concept of
presence and its measurement”. In: Anuario de psicología 40.2 (2009), pp. 193–210.
[119]
Rajaram Bhagavathula et al. “The reality of virtual reality: A comparison of pedestrian
behavior in real and virtual environments”. In: 62.1 (2018), pp. 2056–2060.
[120]
Despina Michael et al. “Impact of immersion and realism in driving simulator studies”.
In: International Journal of Interdisciplinary Telecommunications and Networking
(IJITN) 6.1 (2014), pp. 10–25.
[121]
Dimitri Hein, Christian Mai, and Heinrich Hußmann. “The usage of presence
measurements in research: a review”. In: Proceedings of the International Society
for Presence Research Annual Conference (Presence). The International Society for
Presence Research Prague. 2018, pp. 21–22.
[122]
Martin Usoh et al. “Using presence questionnaires in reality”. In: Presence 9.5 (2000),
pp. 497–503.
[123]
Mel Slater, Martin Usoh, and Anthony Steed. “Depth of presence in virtual environments”.
In: Presence: Teleoperators & Virtual Environments 3.2 (1994), pp. 130–144.
[124]
Rosanna E Guadagno et al. “Virtual humans and persuasion: The effects of agency and
behavioral realism”. In: Media Psychology 10.1 (2007), pp. 1–22.
[125]
Xueni Pan and Antonia F de C Hamilton. “Why and how to use virtual reality to study
human social interaction: The challenges of exploring a new research landscape”. In:
British Journal of Psychology 109.3 (2018), pp. 395–417.
[126]
Jason Jerald. “Immersion, presence and reality trade-offs”. In: J. Jerald, The VR Book
Human-Centered Design for Virtual Reality (2016), pp. 45–52.
[127]
Clara Marina Martinez et al. “Driving style recognition for intelligent vehicle control
and advanced driver assistance: A survey”. In: IEEE Transactions on Intelligent
Transportation Systems 19.3 (2017), pp. 666–676.
[128]
Berthold Färber. “Communication and communication problems between autonomous
vehicles and human drivers”. In: Autonomous driving: Technical, legal and social aspects
(2016), pp. 125–144.
[129]
Mica R. Endsley. “Situation Awareness Misconceptions and Misunderstandings”. In:
Journal of Cognitive Engineering and Decision Making 9.1 (2015), pp. 4–32.
[130]
Victor Fabricius et al. “Interactions between heavy trucks and vulnerable road users—A
systematic review to inform the interactive capabilities of highly automated trucks”. In:
Frontiers in Robotics and AI 9 (2022), p. 818019.
[131] Herbert H Clark and Susan E Brennan. “Grounding in communication.” In: (1991).
[132]
Sebastian Brechtel. “Dynamic decision-making in continuous partially observable
domains: A novel method and its application for autonomous driving”. PhD thesis.
Karlsruhe, Karlsruher Institut für Technologie (KIT), Diss., 2015, 2015.
[133]
WS Lee et al. “Driving simulation for evaluation of driver assistance systems and driving
management systems”. In: sponsored by the Korea Transportation Institute under the
national project, ‘Development of National Traffic Core Technology’ (2007).
[134]
Guangquan Lu et al. “Measuring drivers’ takeover performance in varying levels of
automation: Considering the influence of cognitive secondary task”. In: Transportation
research part F: traffic psychology and behaviour 82 (2021), pp. 96–110.
[135]
Manish M Narkhede and Nilkanth B Chopade. “Review of advanced driver assistance
systems and their applications for collision avoidance in urban driving scenario”. In:
Machine Learning and Big Data Analytics (Proceedings of International Conference
on Machine Learning and Big Data Analytics (ICMLBDA) 2021). Springer. 2022,
pp. 253–267.
[136]
Leo Gugerty et al. “Situation awareness in driving”. In: Handbook for driving simulation
in engineering, medicine and psychology 1 (2011), pp. 265–272.
[137]
Florent Perronnet et al. “Deadlock prevention of self-driving vehicles in a network
of intersections”. In: IEEE Transactions on Intelligent Transportation Systems 20.11
(2019), pp. 4219–4233.
[138]
Jaskaran Grover, Changliu Liu, and Katia Sycara. “Deadlock Analysis and Resolution
in Multi-Robot Systems”. In: arXiv (2020). eprint: 1911.09146.
[139]
Steve Wright, Nicholas J Ward, and Anthony G Cohn. “Enhanced presence in
driving simulators using autonomous traffic with virtual personalities”. In: Presence:
Teleoperators & Virtual Environments 11.6 (2002), pp. 578–590.
[140]
Berndt Brehmer. “Dynamic decision making: Human control of complex systems”. In:
Acta Psychologica 81.3 (1992), pp. 211–241.
[141]
Cleotilde Gonzalez, Pegah Fakhari, and Jerome Busemeyer. “Dynamic decision making:
Learning processes and new research directions”. In: Human factors 59.5 (2017), pp. 713–721.
[142]
Teresa Rock et al. “Data-Driven Prediction of Other Road Users’ Intention for Better
Scene Understanding in Traffic Agents”. In: Proceedings of the Driving Simulation
Conference 2022 Europe VR. Ed. by Andras Kemeny, Jean-Rémy Chardonnet, and
Florent Colombet. Driving Simulation Association. Strasbourg, France, Sept. 15, 2022,
pp. 9–16.
[143]
Mica R Endsley, Daniel J Garland, et al. “Theoretical underpinnings of situation
awareness: A critical review”. In: Situation awareness analysis and measurement 1.1
(2000), pp. 3–21.
[144]
Philipp Bender, Julius Ziegler, and Christoph Stiller. “Lanelets: Efficient map
representation for autonomous driving”. In: 2014 IEEE Intelligent Vehicles Symposium
Proceedings. IEEE. 2014, pp. 420–425.
[145]
Open Street Map. “Open street map”. In: Online: https://www.openstreetmap.org.
Search in (2014).
[146]
Claudine Badue et al. “Self-driving cars: A survey”. In: Expert Systems with Applications
165 (2021), p. 113816.
[147]
Kai-Wei Chiang et al. “Automated Modeling of Road Networks for High-Definition
Maps in OpenDRIVE Format Using Mobile Mapping Measurements”. In: Geomatics
2.2 (2022), pp. 221–235.
[148]
Alejandro Diaz-Diaz et al. “Hd maps: Exploiting opendrive potential for path planning
and map monitoring”. In: 2022 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2022,
pp. 1211–1217.
[149]
ASAM e.V. ASAM OpenDRIVE. July 2023. url: https://www.asam.net/standards/detail/opendrive/
(visited on 07/23/2023).
[150]
ASAM e.V. ASAM OpenDRIVE User Guide 1.7.0. July 2023. url:
https://www.asam.net/fileadmin/Standards/OpenDRIVE/ASAM_OpenDRIVE_BS_V1-7-0.html
(visited on 07/23/2023).
[151]
Martin H Strobl. “Spider-das innovative software-framework der bmw
fahrsimulation/spider-the innovative software framework of the bmw driving simulation”.
In: 1745. 2003.
[152]
Long Xin et al. “Enable faster and smoother spatio-temporal trajectory planning for
autonomous vehicles in constrained dynamic environment”. In: Proceedings of the
Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 235.4
(2021), pp. 1101–1112.
[153]
Yanjie Liu, Jiao Chen, and Xinyu Bai. “An approach for multi-objective obstacle
avoidance using dynamic occupancy grid map”. In: 2020 IEEE International Conference
on Mechatronics and Automation (ICMA). IEEE. 2020, pp. 1209–1215.
[154]
Emmanouil G. Tsardoulias et al. “A Review of Global Path Planning Methods for
Occupancy Grid Maps Regardless of Obstacle Density”. In: Journal of Intelligent &
Robotic Systems 84 (2016), pp. 829–858.
[155]
Julian Bock et al. “The ind dataset: A drone dataset of naturalistic road user trajectories
at german intersections”. In: 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE.
2020, pp. 1929–1934.
[156]
Matthias Schreier, Volker Willert, and Juergen Adamy. “Compact representation of
dynamic driving environments for ADAS by parametric free space and dynamic object
maps”. In: IEEE Transactions on Intelligent Transportation Systems 17.2 (2015),
pp. 367–384.
[157]
Timothy J.C. Nokes. “Mechanisms of knowledge transfer”. In: Thinking & Reasoning
15 (2009), pp. 1–36.
[158]
Pelin Onelcin and Yalcin Alver. “The crossing speed and safety margin of pedestrians
at signalized intersections”. In: Transportation Research Procedia 22 (2017), pp. 3–12.
[159]
Sen Zhang et al. “Representation of traffic congestion data for urban road traffic
networks based on pooling operations”. In: Algorithms 13.4 (2020), p. 84.
[160]
Teresa Rock et al. “On the Way to Reliable Trajectory Prediction in Urban Traffic”.
Advances in Transdisciplinary Engineering 2023, Publication in progress.
[161]
Sajjad Mozaffari et al. “Deep learning-based vehicle behavior prediction for autonomous
driving applications: A review”. In: IEEE Transactions on Intelligent Transportation
Systems 23.1 (2020), pp. 33–47.
[162]
Christoph Burger, Thomas Schneider, and Martin Lauer. “Interaction aware cooperative
trajectory planning for lane change maneuvers in dense traffic”. In: (2020), pp. 1–8.
[163]
Adam Houenou et al. “Vehicle trajectory prediction based on motion model and
maneuver recognition”. In: 2013 IEEE/RSJ international conference on intelligent
robots and systems. IEEE. 2013, pp. 4363–4369.
[164]
Xin Li, Xiaowen Ying, and Mooi Choo Chuah. “Grip: Graph-based interaction-aware
trajectory prediction”. In: 2019 IEEE Intelligent Transportation Systems Conference
(ITSC). IEEE. 2019, pp. 3960–3966.
[165]
Bin Zou et al. “A framework for trajectory prediction of preceding target vehicles in
urban scenario using multi-sensor fusion”. In: Sensors 22.13 (2022), p. 4808.
[166]
Beihao Xia et al. “CSCNet: Contextual semantic consistency network for trajectory
prediction in crowded spaces”. In: Pattern Recognition 126 (2022), p. 108552.
[167]
Jiachen Li et al. “Social-WaGDAT: Interaction-aware Trajectory Prediction via
Wasserstein Graph Double-Attention Network”. In: arXiv preprint arXiv:2002.06241
(2020).
[168]
Xin Li, Xiaowen Ying, and Mooi Choo Chuah. “GRIP++: Enhanced Graph-based Interaction-aware
Trajectory Prediction for Autonomous Driving”. In: Department of
Computer Science and Engineering, Lehigh University (2020).
[169]
Zhaoen Su et al. “Convolutions for spatial interaction modeling”. In: Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022,
pp. 6583–6592.
[170]
Chaofan Tao et al. “Dynamic and static context-aware lstm for multi-agent motion
prediction”. In: European Conference on Computer Vision. Springer. 2020, pp. 547–563.
[171]
Yu Wang and Shiwei Chen. “Multi-agent trajectory prediction with spatio-temporal
sequence fusion”. In: IEEE Transactions on Multimedia (2021).
[172]
Tianyang Zhao et al. “Multi-agent tensor fusion for contextual trajectory prediction”. In:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
2019, pp. 12126–12134.
[173]
Yuexin Ma et al. “TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents”.
In: ArXiv abs/1811.02146 (2018).
[174]
Rohan Chandra et al. “Traphic: Trajectory prediction in dense and heterogeneous
traffic using weighted interactions”. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2019, pp. 8483–8492.
[175]
Jianyu Su et al. “Graph convolution networks for probabilistic modeling of driving acceleration”.
In: 2020 IEEE 23rd International Conference on Intelligent Transportation
Systems (ITSC). IEEE. 2020, pp. 1–8.
[176]
Jiachen Li et al. “Evolvegraph: Multi-agent trajectory prediction with dynamic relational
reasoning”. In: Advances in neural information processing systems 33 (2020), pp. 19783–19794.
[177]
A Quintanar et al. “Predicting vehicles trajectories in urban scenarios with transformer
networks and augmented information”. In: 2021 IEEE Intelligent Vehicles Symposium
(IV). IEEE. 2021, pp. 1051–1056.
[178]
Cunjun Yu et al. “Spatio-temporal graph transformer networks for pedestrian trajectory
prediction”. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow,
UK, August 23–28, 2020, Proceedings, Part XII 16. Springer. 2020, pp. 507–523.
[179]
Jiachen Li et al. “Social-WaGDAT: Interaction-aware Trajectory Prediction via
Wasserstein Graph Double-Attention Network”. In: ArXiv abs/2002.06241 (2020).
[180]
Xiaoyu Mo et al. “Multi-agent trajectory prediction with heterogeneous edge-enhanced
graph attention network”. In: IEEE Transactions on Intelligent Transportation Systems
23.7 (2022), pp. 9554–9567.
[181]
Shengnan Guo et al. “Attention based spatial-temporal graph convolutional networks
for traffic flow forecasting”. In: 33.01 (2019), pp. 922–929.
[182]
Hao Cheng et al. “MCENET: Multi-context encoder network for homogeneous agent
trajectory prediction in mixed traffic”. In: 2020 IEEE 23rd International Conference
on Intelligent Transportation Systems (ITSC). IEEE. 2020, pp. 1–8.
[183]
Namhoon Lee et al. “Desire: Distant future prediction in dynamic scenes with interacting
agents”. In: Proceedings of the IEEE conference on computer vision and pattern
recognition. 2017, pp. 336–345.
[184]
Luca Rossi et al. “Human trajectory prediction and generation using LSTM models
and GANs”. In: Pattern Recognition 120 (2021), p. 108136.
[185]
Xiaoyu Mo, Yang Xing, and Chen Lv. “Graph and recurrent neural network-based vehicle
trajectory prediction for highway driving”. In: 2021 IEEE International Intelligent
Transportation Systems Conference (ITSC). IEEE. 2021, pp. 1934–1939.
[186]
Xiaoyu Mo, Yang Xing, and Chen Lv. “Recog: A deep learning framework with
heterogeneous graph for interaction-aware trajectory prediction”. In: arXiv preprint
arXiv:2012.05032 (2020).
[187]
Agrim Gupta et al. “Social GAN: Socially Acceptable Trajectories with Generative
Adversarial Networks”. In: 2018 IEEE/CVF Conference on Computer Vision and
Pattern Recognition (2018), pp. 2255–2264.
[188]
Arsal Syed and Brendan Tran Morris. “Semantic scene upgrades for trajectory
prediction”. In: Machine vision and applications 34.2 (2023), p. 23.
[189]
Junwei Liang, Lu Jiang, and Alexander Hauptmann. “Simaug: Learning robust
representations from simulation for trajectory prediction”. In: Computer Vision–ECCV
2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part
XIII 16. Springer. 2020, pp. 275–292.
[190]
Maximilian Schäfer et al. “Context-Aware Scene Prediction Network (CASPNet)”. In:
2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC).
IEEE. 2022, pp. 3970–3977.
[191]
Hang Zhao et al. “Tnt: Target-driven trajectory prediction”. In: Conference on Robot
Learning. PMLR. 2021, pp. 895–904.
[192]
Osval Antonio Montesinos López, Abelardo Montesinos López, and Jose Crossa.
“Overfitting, model tuning, and evaluation of prediction performance”. In: Multivariate
statistical machine learning methods for genomic prediction. Springer, 2022, pp. 109–139.
[193]
Agrim Gupta et al. “Social gan: Socially acceptable trajectories with generative
adversarial networks”. In: Proceedings of the IEEE conference on computer vision
and pattern recognition. 2018, pp. 2255–2264.
[194]
Bogdan Ilie Sighencea, Rareș Ion Stanciu, and Cătălin Daniel Căleanu. “A review of
deep learning-based methods for pedestrian trajectory prediction”. In: Sensors 21.22
(2021), p. 7543.
[195] François Chollet et al. Keras. https://keras.io. 2015.
[196]
Samy Bengio and Yoshua Bengio. “Taking on the curse of dimensionality in joint
distributions using neural networks”. In: IEEE Transactions on Neural Networks 11.3
(2000), pp. 550–557.
[197]
G. Hughes. “On the mean accuracy of statistical pattern recognizers”. In: IEEE
Transactions on Information Theory 14.1 (1968), pp. 55–63.
[198]
Jui-En Lo et al. “Data Homogeneity Effect in Deep Learning-Based Prediction of Type
1 Diabetic Retinopathy”. In: Journal of Diabetes Research 2021 (2021).
[199]
Juanwu Lu et al. “Generalizability analysis of graph-based trajectory predictor with
vectorized representation”. In: 2022 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS). IEEE. 2022, pp. 13430–13437.
[200]
Wenshuo Wang et al. “Social interactions for autonomous driving: A review and
perspectives”. In: Foundations and Trends® in Robotics 10.3-4 (2022), pp. 198–376.
[201]
Teresa Rock et al. “Dynamic Decision-Making for Agent Models in Urban Driving
Simulation”. In: Proceedings of the Driving Simulation Conference 2023 Europe VR. Ed.
by Andras Kemeny, Jean-Rémy Chardonnet, and Florent Colombet. Driving Simulation
Association. Antibes, France, 2023, pp. 169–179.
[202]
Christos Katrakazas et al. “Real-time motion planning methods for autonomous on-road
driving: State-of-the-art and future research directions”. In: Transportation Research
Part C: Emerging Technologies 60 (2015), pp. 416–442.
[203]
David González et al. “A Review of Motion Planning Techniques for Automated
Vehicles”. In: IEEE Transactions on Intelligent Transportation Systems 17.4 (2016),
pp. 1135–1145.
[204]
Stanislav A Goll et al. “Unmanned ground vehicle local trajectory planning algorithm”.
In: 2016 5th Mediterranean Conference on Embedded Computing (MECO). IEEE. 2016,
pp. 317–321.
[205]
Yao Qi et al. “Hierarchical Motion Planning for Autonomous Vehicles in Unstructured
Dynamic Environments”. In: IEEE Robotics and Automation Letters 8.2 (2022), pp. 496–503.
[206]
Donghao Xu et al. “Naturalistic lane change analysis for human-like trajectory generation”.
In: 2018 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2018, pp. 1393–1399.
[207]
Ivo Batkovic et al. “Real-time constrained trajectory planning and vehicle control
for proactive autonomous driving with road users”. In: 2019 18th European Control
Conference (ECC). IEEE. 2019, pp. 256–262.
[208]
Kamal Kant and Steven Zucker. “Toward Efficient Trajectory Planning: The Path-Velocity
Decomposition”. In: International Journal of Robotic Research - IJRR 5 (Sept.
1986), pp. 72–89.
[209]
Vasundhara Jain et al. “Collision Avoidance for Multiple Static Obstacles using Path-
Velocity Decomposition”. In: IFAC-PapersOnLine 52.8 (2019), pp. 265–270.
[210]
Moritz Werling et al. “Optimal trajectories for time-critical street scenarios using
discretized terminal manifolds”. In: The International Journal of Robotics Research
31.3 (2012), pp. 346–359.
[211]
Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. “A Formal Basis for the Heuristic
Determination of Minimum Cost Paths”. In: IEEE Transactions on Systems Science
and Cybernetics 4.2 (1968), pp. 100–107.
[212]
Michael T Wolf and Joel W Burdick. “Artificial potential functions for highway driving
with collision avoidance”. In: 2008 IEEE International Conference on Robotics and
Automation. IEEE. 2008, pp. 3731–3736.
[213]
Chang Liu, Li Zhai, and XueYing Zhang. “Research on local real-time obstacle avoidance
path planning of unmanned vehicle based on improved artificial potential field method”.
In: 2022 6th CAA International Conference on Vehicular Control and Intelligence
(CVCI). IEEE. 2022, pp. 1–6.
[214]
Jiayi Sun, Jun Tang, and Songyang Lao. “Collision Avoidance for Cooperative UAVs
With Optimized Artificial Potential Field Algorithm”. In: IEEE Access 5 (2017),
pp. 18382–18390.
[215]
Pauli Virtanen et al. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in
Python”. In: Nature Methods 17 (2020), pp. 261–272.
[216]
Constantin Berger. “Software-Efficient Trajectory Planning for Autonomous Driving
Agent Models in Simulated Urban Environments”. Master’s thesis. Technical University
of Munich, Chair of Integrated Systems, June 2022.
[217]
Sheng Zhu and Bilin Aksun-Guvenc. “Trajectory planning of autonomous vehicles based
on parameterized control optimization in dynamic on-road environments”. In: Journal
of Intelligent & Robotic Systems 100 (2020), pp. 1055–1067.
[218]
David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning.
1st. USA: Addison-Wesley Longman Publishing Co., Inc., 1989.
[219]
Teresa Rock et al. “Objectively Scoring the Human-Likeness of Artificial Driver Models”.
In: Applied Sciences 13.18 (2023).
[220]
Meixin Zhu, Xuesong Wang, and Yinhai Wang. “Human-like autonomous car-following
model with deep reinforcement learning”. In: Transportation research part C: emerging
technologies 97 (2018), pp. 348–368.
[221]
Il Bae et al. “Self-driving like a human driver instead of a robocar: Personalized comfortable
driving experience for autonomous vehicles”. In: arXiv preprint arXiv:2001.03908
(2020).
[222]
Jun Wang et al. “Normal deceleration behavior of passenger vehicles at stop sign–controlled
intersections evaluated with in-vehicle Global Positioning System data”. In:
Transportation research record 1937.1 (2005), pp. 120–127.
[223]
Talal Al-Shihabi and Ronald R Mourant. “Toward more realistic driving behavior
models for autonomous vehicles in driving simulators”. In: Transportation Research
Record 1843.1 (2003), pp. 41–49.
[224]
Arne Kesting, Martin Treiber, and Dirk Helbing. “Enhanced intelligent driver model to
access the impact of driving strategies on traffic capacity”. In: Philosophical Transactions
of the Royal Society A: Mathematical, Physical and Engineering Sciences 368.1928
(2010), pp. 4585–4605.
[225]
Mysore Narasimhamurthy Sharath and Babak Mehran. “A Literature Review of
Performance Metrics of Automated Driving Systems for On-Road Vehicles”. In: Frontiers
in Future Transportation (2021), p. 28.
[226]
Adrian Zlocki et al. “Logical Scenarios Parameterization for Automated Vehicle Safety
Assessment: Comparison of Deceleration and Cut-In Scenarios From Japanese and
German Highways”. In: IEEE Access 10 (2022), pp. 26817–26829.
[227]
Kyle Sama et al. “Extracting human-like driving behaviors from expert driver data using
deep learning”. In: IEEE transactions on vehicular technology 69.9 (2020), pp. 9315–9329.
[228] Morton S Raff et al. “A volume warrant for urban stop signs”. In: (1950).
[229]
Brian L Allen, B Tom Shin, and Peter J Cooper. Analysis of traffic conflicts and
collisions. Tech. rep. 1978.
[230]
Michiel M Minderhoud and Piet HL Bovy. “Extended time-to-collision measures for road
traffic safety assessment”. In: Accident Analysis & Prevention 33.1 (2001), pp. 89–97.
[231]
Katja Vogel. “A comparison of headway and time to collision as safety indicators”. In:
Accident analysis & prevention 35.3 (2003), pp. 427–433.
[232]
Manish Dutta and Mokaddes Ali Ahmed. “Gap acceptance behavior of drivers at
uncontrolled T-intersections under mixed traffic conditions”. In: Journal of modern
transportation 26.2 (2018), pp. 119–132.
[233]
Nengchao Lyu, Jiaqiang Wen, and Chaozhong Wu. “Novel time-delay side-collision
warning model at non-signalized intersections based on vehicle-to-infrastructure
communication”. In: International journal of environmental research and public health
18.4 (2021), p. 1520.
[234]
Ye Li et al. “Exploring transition durations of rear-end collisions based on vehicle
trajectory data: a survival modeling approach”. In: Accident Analysis & Prevention
159 (2021), p. 106271.
[235]
Maike Scholtes et al. “6-layer model for a structured description and categorization of
urban traffic and environment”. In: IEEE Access 9 (2021), pp. 59131–59147.
[236]
Paul Jaccard. “Distribution de la flore alpine dans le bassin des Dranses et dans quelques
régions voisines”. In: Bull Soc Vaudoise Sci Nat 37 (1901), pp. 241–272.
[237]
Silvia Facchinetti et al. “A procedure to find exact critical values of Kolmogorov-Smirnov
test”. In: Statistica Applicata–Italian Journal of Applied Statistics 21.3-4
(2009), pp. 337–359.
[238]
Fin Malte Heuer. “Scenario Generation for Testing of Automated Driving Functions
based on Real Data: Master’s Thesis”. PhD thesis. Braunschweig, 2022.
[239]
Mel S later and Anthony Steed. “A Virtual Presence Counter”. In: Presence: Teleoper.
Virtual Environ. 9 (Oct. 2000), pp. 413–434.
[240]
Valentin Schwind et al. “Using presence questionnaires in virtual reality”. In: Proceedings
of the 2019 CHI conference on human factors in computing systems. 2019, pp. 1–12.
[241]
Matthew Lombard, Theresa B Ditton, and Lisa Weinstein. “Measuring presence: the
temple presence inventory”. In: Proceedings of the 12th annual international workshop
on presence. 2009, pp. 1–15.
[242] F Wilcoxon. Individual comparisons by ranking methods. Biom. Bull., 1, 80–83. 1945.
A Appendix
A.1 Additional Material for the Context Representation Method
The following table lists all features extracted from the data sources to describe the situational
context. For partner and interaction features, the suffix n is the index of the interaction
partner. For example, when the ego vehicle interacts with two other cars, agent_type_p1 and
agent_type_p2 both have the value "car".
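As an illustration of this naming scheme, the following minimal sketch expands base feature names into per-partner names. The helper name `partner_feature_names` and the example values are hypothetical; only the `_p1`, `_p2`, ... suffix pattern is taken from the text above.

```python
def partner_feature_names(base_names, n_partners):
    """Expand base names such as 'agent_type' into 'agent_type_p1', 'agent_type_p2', ..."""
    return [f"{base}_p{i}" for i in range(1, n_partners + 1) for base in base_names]


# Two interaction partners, both cars (illustrative values only).
names = partner_feature_names(["agent_type", "dis2ego"], n_partners=2)
scene = dict(zip(names, ["car", 12.4, "car", 30.1]))
print(scene)
```

Keeping one flat key per partner and feature mirrors the column layout suggested by the feature names in Table A.1; a nested structure per partner would be an equally valid choice.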
Table A.1: All features describing a driving scene at time t from the perspective of an individual
ego vehicle on the basis of features from the inD dataset. [155]

| Category | Feature | Description | Unit |
|---|---|---|---|
| Ego vehicle features (E) | agent_type | Classification of ego vehicle: car, motorcycle, van, truck_bus | - |
| | xCenter | X-coordinate of vehicle center | [m] |
| | yCenter | Y-coordinate of vehicle center | [m] |
| | heading | Heading angle of vehicle (0-360 degrees) | [°] |
| | width | Width of the vehicle | [m] |
| | length | Length of the vehicle | [m] |
| | xVelocity | X-component of vehicle velocity | [m/s] |
| | yVelocity | Y-component of vehicle velocity | [m/s] |
| | xAcceleration | X-component of vehicle acceleration | [m/s^2] |
| | yAcceleration | Y-component of vehicle acceleration | [m/s^2] |
| | lonVelocity | Longitudinal velocity of the vehicle | [m/s] |
| | latVelocity | Lateral velocity of the vehicle | [m/s] |
| | lonAcceleration | Longitudinal acceleration of the vehicle | [m/s^2] |
| | latAcceleration | Lateral acceleration of the vehicle | [m/s^2] |
| Map information, Map representation (M) | logical lane assignment | Lane the vehicle is logically driving on, described by lane id, section id, section type and s-position within the lane | - |
| | physical lane assignment | Lanes the vehicle is physically driving on, described by lane id, section id, section type and s-position within the lane | - |
| | conflict_lanes | All lanes that have potential conflicts with any ego lane, with the respective semantic and spatial relationship as well as the ego lane from which the conflict originates | |
| | lane_start_direction | Direction value of the first point of the lane the vehicle is logically driving on | [rad] |
| | lane_end_direction | Direction value of the last point of the lane the vehicle is logically driving on | [rad] |
| | lane_turn_direction | Turn direction of the lane the vehicle is logically driving on (STRAIGHT, LEFT, RIGHT) | |
| | lane_curvature_in_5m, lane_curvature_in_10m, lane_curvature_in_20m | Lane curvature calculated for the lane the vehicle is logically driving on 5, 10, and 20 meters from the current position | [1/m] |
| | lane_direction_in_5m, lane_direction_in_10m, lane_direction_in_20m | Lane direction calculated for the lane the vehicle is logically driving on 5, 10, and 20 meters from the current position | [1/m] |
| | lane_width_in_5m, lane_width_in_10m, lane_width_in_20m | Lane width calculated for the lane the vehicle is logically driving on 5, 10, and 20 meters from the current position | [m] |
| | center_line_x_in_5m, center_line_x_in_10m, center_line_x_in_20m | Global x coordinates of the center_line position 5, 10, and 20 meters from the current position | [m] |
| | center_line_y_in_5m, center_line_y_in_10m, center_line_y_in_20m | Global y coordinates of the center_line position 5, 10, and 20 meters from the current position | [m] |
| Partner Features (P) | agent_type_pn | Type of partner: car, motorcycle, van, truck_bus, pedestrian, bicycle | - |
| | xCenter_pn | X-coordinate of partner center | [m] |
| | yCenter_pn | Y-coordinate of partner center | [m] |
| | heading_pn | Heading angle of partner (0-360 degrees) | [°] |
| | width_p_pn | Width of partner | [m] |
| | length_pn | Length of partner | [m] |
| | xVelocity_pn | X-component of partner velocity | [m/s] |
| | yVelocity_pn | Y-component of partner velocity | [m/s] |
| | xAcceleration_pn | X-component of partner acceleration | [m/s^2] |
| | yAcceleration_pn | Y-component of partner acceleration | [m/s^2] |
| | lonVelocity_pn | Longitudinal velocity of partner | [m/s] |
| | latVelocity_pn | Lateral velocity of partner | [m/s] |
| | lonAcceleration_pn | Longitudinal acceleration of partner | [m/s^2] |
| | latAcceleration_pn | Lateral acceleration of partner | [m/s^2] |
| Interaction Features (I) | relationship2ego_veh_pn | Relationship between ego and partner: SAME_LANE, MERGING, GIVING_ROW, RECEIVING_ROW, IN_WRONG_LANE, VRU_INTERACTION | - |
| | pos2conf_veh_pn | Position of partner vehicle to conflict (before, inside, after) calculated by conflict area from the map (feature only for partner vehicles!) | - |
| | ego_pos2conf_vehn | Position of ego to conflict (before, inside, after) calculated by conflict area from the map (feature only for partner vehicles!) | - |
| | route_opt_veh_pn | Route options of partner: LEFT, STRAIGHT, RIGHT (feature only for partner vehicles!) | - |
| | dis2conf_pn | Distance to center of conflict zone | [m] |
| | dis2ego_pn | Euclidean distance between ego and partner center point | [m] |
| | rel_speed_pn | Relative velocity between ego and partner | [m/s] |
| | rel_heading_pn | Relative heading between ego and partner | [°] |
| | angle2conf_veh_pn | Heading angle of partner with respect to conflict | [°] |

Column P, entries in source order (44 entries for 48 rows; the assignment to rows is not
recoverable from the source): x, x, x, x, -, followed by 39 entries x.
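To make the interaction features more concrete, the following sketch computes dis2ego, rel_speed, and rel_heading for one ego/partner pair from the state features listed in Table A.1. Only the Euclidean distance is defined by the table; the definitions of relative speed (magnitude of the velocity difference) and relative heading (signed difference wrapped to [-180°, 180°)) are assumptions made for this example, as is the function name `interaction_features`.

```python
import math


def interaction_features(ego, partner):
    """Compute dis2ego, rel_speed and rel_heading for one ego/partner pair.

    Both arguments are dicts with xCenter, yCenter [m],
    xVelocity, yVelocity [m/s] and heading [deg].
    """
    # Euclidean distance between the two center points (as described in Table A.1).
    dis2ego = math.hypot(partner["xCenter"] - ego["xCenter"],
                         partner["yCenter"] - ego["yCenter"])
    # Assumption: relative speed is the magnitude of the velocity difference vector.
    rel_speed = math.hypot(partner["xVelocity"] - ego["xVelocity"],
                           partner["yVelocity"] - ego["yVelocity"])
    # Assumption: relative heading is the signed difference wrapped to [-180, 180).
    rel_heading = (partner["heading"] - ego["heading"] + 180.0) % 360.0 - 180.0
    return {"dis2ego": dis2ego, "rel_speed": rel_speed, "rel_heading": rel_heading}


# Illustrative values only.
ego = {"xCenter": 0.0, "yCenter": 0.0, "xVelocity": 10.0, "yVelocity": 0.0, "heading": 350.0}
partner = {"xCenter": 3.0, "yCenter": 4.0, "xVelocity": 10.0, "yVelocity": 5.0, "heading": 10.0}
features = interaction_features(ego, partner)
print(features)
```

Wrapping the heading difference avoids a jump of almost 360° when the two headings lie on either side of 0°, as in the example values.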
A.2 Additional Material for the Dynamic Decision-making Method
Videos of the resulting trajectories and figures of the velocity, acceleration, and jerk profiles
are available at:
https://www.dropbox.com/scl/fi/hbxqvxjjox5nb8cx7bji5/Supplementary_material_Paper_DSC2023_Rock.pdf?rlkey=iseqj2tykudmtjizpcv7ckinp&dl=0.
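For readers who want to reproduce such profiles from a sampled velocity trace, the following minimal sketch derives acceleration and jerk by forward finite differences. It is not the procedure used to create the figures in this appendix; the sampling interval and the velocity values are hypothetical.

```python
def differentiate(values, dt):
    """Forward finite difference; the result has one sample fewer than the input."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


dt = 0.5  # hypothetical sampling interval [s]
velocity = [0.0, 1.0, 3.0, 6.0, 10.0]  # illustrative values [m/s]
acceleration = differentiate(velocity, dt)  # [m/s^2]
jerk = differentiate(acceleration, dt)  # [m/s^3]
print(acceleration, jerk)
```

Each differentiation shortens the series by one sample and amplifies measurement noise, so recorded trajectories are usually smoothed before jerk is evaluated.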
A.2.1 Scenario A_VEH Solved by the Two Planners and the Heuristic Agent Model
The behavior of the heuristic agent model and the two planners in scenario A_VEH is illustrated
in Figure 6.7 in Chapter 6.
(a) Scenario A_VEH solved by PL_PVD. (b) Scenario A_VEH solved by PL_3D.
Figure A.1: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the two planners in scenario A_VEH.
Figure A.2: Velocity, acceleration, and jerk profiles of the heuristic agent model in scenario
A_VEH.
A.2.2 Scenario B_PED Solved by the Two Planners and the Heuristic Agent Model
(a) Scenario B_PED solved by TRM. (b) Scenario B_PED solved by PL_PVD. (c) Scenario B_PED
solved by PL_3D.
Figure A.3: Behavior of the heuristic agent model and the two planners in scenario B_PED at
the same time step. The ego vehicle is shown in green.
(a) Scenario B_PED solved by PL_PVD. (b) Scenario B_PED solved by PL_3D.
Figure A.4: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the two planners in scenario B_PED.
A.2.3 Scenario C_STAT Solved by the Two Planners and the Heuristic Agent Model
The behavior of the heuristic agent model and the two planners in scenario C_STAT is illustrated
in Figure 6.8 in Chapter 6.
Figure A.5: Velocity, acceleration, and jerk profiles of the heuristic agent model in scenario
B_PED.
(a) Scenario C_STAT solved by PL_PVD. (b) Scenario C_STAT solved by PL_3D.
Figure A.6: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the two planners in scenario C_STAT.
A.2.4 Scenario D_BIC Solved by the Two Planners and the Heuristic Agent Model
The behavior of the heuristic agent model and the two planners in scenario D_BIC is illustrated
in Figure 6.10 in Chapter 6.
Figure A.7: Velocity, acceleration, and jerk profiles of the heuristic agent model in scenario
C_STAT.
(a) Scenario D_BIC solved by PL_PVD. (b) Scenario D_BIC solved by PL_3D.
Figure A.8: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the two planners in scenario D_BIC.
Figure A.9: Velocity, acceleration, and jerk profiles of the heuristic agent model in scenario
D_BIC.
(a) Scenario D1 best quality parameter set. (b) Scenario D1 best runtime parameter set.
(c) Scenario D1 global parameter set. (d) Scenario D1 local parameter set.
Figure A.10: Behavior of the PL_3D planner employing different parameter sets in variation D1
of scenario D_BIC.
A.2.5 Scenario D1 Solved by the PL_3D Planner
(a) Scenario D1 best quality parameter set. (b) Scenario D1 best runtime parameter set.
Figure A.11: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the PL_3D planner employing the best quality and best runtime parameter sets.
(a) Scenario D1 global parameter set. (b) Scenario D1 local parameter set.
Figure A.12: Velocity, acceleration, and jerk profiles for the first guess and the final trajectory
of the PL_3D planner employing the global and local parameter sets.
A.3 Additional Material for the Simulator Experiment
The following tables show the questionnaires used in the simulator experiment. They combine
items from established questionnaires for measuring perceived realism in virtual environments
(VEs) with additional use-case-specific items that explore the subjective evaluation of the
traffic agents.
Table A.2: Questionnaire 1: Investigating the sense of presence. Presence (PRE) items according
to M. Slater and A. Steed [239] and one additional item investigating the degree of naturalism of
the driving style.

| No. | Category | Item | Left anchor | Right anchor |
|---|---|---|---|---|
| 1.1 | PRE | Please rate your sense of being in the traffic scenario, on the following scale from 1 to 7, where 7 represents your normal experience of being in a place. I had a sense of "being there" in the traffic scenario | Not at all | Very much |
| 1.2 | PRE | To what extent were there times during the experience when the virtual traffic situation became the "reality" for you, and you almost forgot about the "real world" of the simulator room in which the whole experience was really taking place? There were times during the experience when the virtual environment became more real for me compared to the "real world". | At no time | Almost all the time |
| 1.3 | PRE | When you think back about your experience, do you think of the city more as images that you saw, or more as somewhere you drove through? "The traffic scenario seems to me to be more like" | images that I saw | somewhere you drove through |
| 1.4 | PRE | During the time of the experience, which was strongest on the whole, your sense of being in the city or of being in the real world of the simulator? I had a strong sense of being in ... | The real world of the simulator | The virtual reality of the city |
| 1.5 | PRE | During the time of the experience, did you often think to yourself that you were actually just sitting in a simulator or did the virtual environment overwhelm you? I often thought I was sitting in a simulator ... | Almost all the time | At no time |
| 1.6 | NAT | How much did you behave within the driving simulator as if the situation were real? I responded as if the situation were real. | Not at all | Very much |
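The following sketch shows one way the five PRE ratings of a participant could be aggregated. It is an assumption for illustration and not the evaluation procedure of the experiment, which is not described in this appendix: besides the mean rating, it counts the ratings of 6 or 7, a convention often used with Slater-type presence items. The function name `presence_scores` and the answers are hypothetical.

```python
from statistics import mean


def presence_scores(ratings):
    """Return (mean rating, number of ratings >= 6) for 7-point presence items."""
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("ratings must lie on the 1-7 scale")
    return mean(ratings), sum(r >= 6 for r in ratings)


# Hypothetical answers of one participant to items 1.1 to 1.5.
pre_mean, pre_count = presence_scores([6, 4, 7, 5, 6])
print(pre_mean, pre_count)
```

With the anchors as printed in Table A.2, a higher rating indicates a stronger sense of presence for all five PRE items, so no item needs to be reverse-coded before aggregation.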
Table A.3: Questionnaire 2: Evaluating the degree of realism of the traffic agents.

| No. | Category | Item | Left anchor | Right anchor |
|---|---|---|---|---|
| 2.1 | STAT | The events I saw could occur in the real world. [241] | Not at all | Very much |
| 2.2 | STAT | Overall, how much did the other traffic participants in the virtual environment behave and move like they were real? [241] | Not at all | Very much |
| 2.3 | INTERA | How natural did your interactions with the other traffic participants seem? [117] | Like reality | Not real at all |
| 2.4 | SPA-TEM | How often did you have the sensation that other traffic participants you saw could also see you? [241] | Never | Always |
| 2.5 | INTERA | To what extent did you feel you could interact with the other traffic participants you saw? [241] | Not at all | Very much |
| 2.6 | INTERA | The other traffic participants reacted to my actions. | Strongly Disagree | Strongly Agree |
| 2.7 | AGREE | I would behave more naturally in the driving simulator if the other traffic participants interacted with me more often or reacted to my actions more often. | Strongly Disagree | Strongly Agree |
| 2.8 | | How does the behavior of other road users that you observed differ from what you experienced in real traffic? | | |
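Item 2.3 lists "Like reality" as its left anchor, so its scale runs in the opposite direction to the other rated items. The sketch below shows how such an item could be reverse-coded before items are compared. It assumes a 7-point scale with the left anchor coded as 1, which is stated in Table A.2 only for item 1.1; the function name and the answers are hypothetical.

```python
def reverse_code(rating, scale_min=1, scale_max=7):
    """Mirror a rating so that the low and high anchors swap places."""
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating outside the scale")
    return scale_min + scale_max - rating


# Assumption: only item 2.3 has its positive anchor on the left.
REVERSED_ITEMS = {"2.3"}

answers = {"2.1": 6, "2.2": 5, "2.3": 2, "2.5": 4}  # hypothetical answers
aligned = {item: reverse_code(r) if item in REVERSED_ITEMS else r
           for item, r in answers.items()}
print(aligned)
```

After this step, a higher value means a more realistic impression for every rated item in the example.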