scieee Science in your language
[en] (orig)
Game Theoretic Approaches
to Motion Planning
in Robot Soccer
von der Fakultät für Elektrotechnik,
Informatik und Mathematik
der Universität Paderborn
zur Erlangung des akademischen Grades
Doktor der Naturwissenschaften
(Dr. rer. nat.)
genehmigte Dissertation
von
Marcus Post
Paderborn, 2008
Referees: Prof. Dr. Oliver Junge
Prof. Dr. Burkhard Monien
Committee: Prof. Dr. Michael Dellnitz (chairman)
Prof. Dr. Peter Bürgisser
Prof. Dr. Oliver Junge
Prof. Dr. Burkhard Monien
Dr. Elke Wolf
Date of PhD Defense: 17.04.2008
You can not learn anything
until you already almost know it.
Unknown
To Berthild, Karl-Friedrich, Sebastian, and Ping
iii
Advertisement
Acknowledgements
I would like to start by thanking my advisors Prof. Dr. Michael Dellnitz and Prof. Dr.
Oliver Junge for their guidance, support, motivation, and for the great freedom which was
given to me. Prof. Dellnitz’ group at the University of Paderborn always provided a very
good research environment for me. I also want to thank Prof. Dr. Burkhard Monien for
helpful comments and for reviewing this PhD thesis.
Moreover, I am very grateful to Prof. Dr. Oliver Junge, Prof. Dr. Michael Dellnitz, Dr. Sina
Ober-Blöbaum, Dr. Kathrin Padberg, Dr. Oliver Schütze, Stefan Sertl, and Bianca Thiere
for interesting discussions and exciting joint work. For fruitful discussions and comments
I would also like to thank Mirko Hessel-von Molo, Oliver Kramer, Prof. Dr. Michael G.
Lagoudakis, Tim Laue, Dr. Martin Lauer, Henning Meyerhenke, and Willi Richert.
In general, I am indebted to my colleagues in Paderborn including Alessandro Dell’ Aere,
Sebastian Hage-Packhäuser, Mirko Hessel-von Molo, Stefan Klus, Dr. Arvind Krishna-
murthy, Anna-Lena Meier, Dr. Sina Ober-Blöbaum, Dr. Kathrin Padberg, Michael Petry,
Dr. Robert Preis, Dr. Oliver Schütze, Stefan Sertl, Bianca Thiere, Dr. Fang Wang, and
Katrin Witting for discussions, technical support, social events and many other things. I
am further indebted to the collegiates of the PaSCo graduate school.
I am grateful to Laurel Frick-Wright for proofreading my thesis to improve the quality of
my English and to Mirko Hessel-von Molo, Anna-Lena Meier, and Stefan Klus for proof-
reading excerpts.
I always received valuable administrative support from the secretaries Marianne Kalle and
Tanja rger, and from Anne Belkner. For enabling my work in a different way, namely
by keeping my office clean, I thank the non-scientific staff of the University of Paderborn.
I would like to thank Alessandro Dell’ Aere for helping me to build a physical soccer envi-
ronment for the AIBO robots and some students for supporting me with the AIBO robots’
technical issues: Johannes Berg, Raphael Golombek, and Nicolai Hähnle are to be men-
tioned here.
Last but not least I am very indebted to my parents Berthild and Karl-Friedrich Post who
supported me not only during my studies but during my whole life in all ways imaginable. I
want to thank my brother Sebastian, Janina, my friends from the University of Paderborn,
especially Ping, the “Mensa-Kreis”, and the musicians I played music with, and, of course,
all other friends of mine.
For the development of great software tools I want to especially thank all L
A
T
EX develop-
ers and the developers of Dia, Kate, and Kile many of whom work voluntarily, and the
developers of Matlab which is a commercial software tool. All of them made my work
technically possible or at least substantially simpler.
For financial support I am very grateful to the Paderborn Institute for Scientific Compu-
tation (PaSCo)(1) and to the University of Paderborn.
Marcus Post, February 2008
(1)The research is (partly) supported by the DFG Research Training Group GK-693 of the Paderborn
Institute for Scientific Computation (PaSCo).
iv
Abstract
Robotics is, from a scientific point of view, a very broad topic with many applications.
While highly specialised robots have been widely used in production lines, the next big
scientific steps are towards autonomy of robots and interaction with other robots and hu-
mans. For achieving these long-term goals catenations of physical and mental abilities
which are interdisciplinary and scientifically challenging have to be carried out. At its cur-
rent state, robot soccer is an appropriate environment for demonstrating and developing
robotic skills as several areas are addressed such as image processing and analysis in the
widest sense (including e. g. object matching and directing the camera), control and opti-
misation of physical movement (walking, ball handling), and the strategic planning which
may be considered as being close to high level reasoning. In this thesis, game theoretic and
reinforcement learning approaches are utilised to contribute to strategic planning in robot
soccer which serves as a motivating example. The aim of strategic planning is to obtain
an optimal strategy which also takes the possibly unknown strategies of other players into
account. A natural further goal of this thesis is the development and analysis of algorithms
by means of which such optimal strategies can be approximately computed.
More specifically, the following steps are undertaken: first, a game theoretic model of
multi-player robot soccer is developed which is independent from the robot hardware. The
occuring challenge to determine an optimal strategy with respect to this model for as
many robots as possible is met by exact model reduction, i. e. by finding equivalent smaller
models. For this, a theoretical framework of symmetries is developed which bases on
homomorphisms between two-player zero-sum Markov games. It lays a formal foundation
for practitioners who already implicitly used results proven within that framework. A
special result which is important for model reduction is that the reduction can be performed
in several separate steps and be combined afterwards which is expressed by a composition
of homomorphisms. Finally, a qualitatively new symmetry which interchanges the two
players of the Markov game, i. e. the two teams in robot soccer, is proven to be part of
the homomorphism framework. Particularly, this means that it can be combined with all
symmetries which occur in Markov decision processes.
The theoretical results about Markov game symmetries are algorithmically exploited for
Dynamic Programming (DP) and Reinforcement Learning (RL) methods which are also
compared. Such comparisons ought to be standard but seem unusual for large parts of the
RL literature. Unsurprisingly, DP methods are more efficient and thus the following general
procedure seems recommendable: firstly, to design an approximate model for the task at
hand and solve this by DP methods to an appropriate level of precision and, secondly, to
use the DP solution of the rough model as an initialisation for an RL method to let the RL
method adapt to the unknown real model. In this spirit, the developed soccer model and
the computation of its optimal solution can be seen as the completion of the first of the
above two steps. Ideas of dynamical systems and graph theory are additionally integrated
v
Advertisement
Loading more pages...