Game theoretic approaches to motion planning in robot soccer / Marcus Post [original]

Game Theoretic Approaches

to Motion Planning

in Robot Soccer

von der Fakultät für Elektrotechnik,

Informatik und Mathematik

der Universität Paderborn

zur Erlangung des akademischen Grades

Doktor der Naturwissenschaften

(Dr. rer. nat.)

genehmigte Dissertation

von

Marcus Post

Paderborn, 2008

Referees: Prof. Dr. Oliver Junge

Prof. Dr. Burkhard Monien

Committee: Prof. Dr. Michael Dellnitz (chairman)

Prof. Dr. Peter Bürgisser

Prof. Dr. Oliver Junge

Prof. Dr. Burkhard Monien

Dr. Elke Wolf

Date of PhD Defense: 17.04.2008

You can not learn anything

until you already almost know it.

Unknown

To Berthild, Karl-Friedrich, Sebastian, and Ping

iii

Acknowledgements

I would like to start by thanking my advisors Prof. Dr. Michael Dellnitz and Prof. Dr.

Oliver Junge for their guidance, support, motivation, and for the great freedom which was

given to me. Prof. Dellnitz’ group at the University of Paderborn always provided a very

good research environment for me. I also want to thank Prof. Dr. Burkhard Monien for

helpful comments and for reviewing this PhD thesis.

Moreover, I am very grateful to Prof. Dr. Oliver Junge, Prof. Dr. Michael Dellnitz, Dr. Sina

Ober-Blöbaum, Dr. Kathrin Padberg, Dr. Oliver Schütze, Stefan Sertl, and Bianca Thiere

for interesting discussions and exciting joint work. For fruitful discussions and comments

I would also like to thank Mirko Hessel-von Molo, Oliver Kramer, Prof. Dr. Michael G.

Lagoudakis, Tim Laue, Dr. Martin Lauer, Henning Meyerhenke, and Willi Richert.

In general, I am indebted to my colleagues in Paderborn including Alessandro Dell’ Aere,

Sebastian Hage-Packhäuser, Mirko Hessel-von Molo, Stefan Klus, Dr. Arvind Krishna-

murthy, Anna-Lena Meier, Dr. Sina Ober-Blöbaum, Dr. Kathrin Padberg, Michael Petry,

Dr. Robert Preis, Dr. Oliver Schütze, Stefan Sertl, Bianca Thiere, Dr. Fang Wang, and

Katrin Witting for discussions, technical support, social events and many other things. I

am further indebted to the collegiates of the PaSCo graduate school.

I am grateful to Laurel Frick-Wright for proofreading my thesis to improve the quality of

my English and to Mirko Hessel-von Molo, Anna-Lena Meier, and Stefan Klus for proof-

reading excerpts.

I always received valuable administrative support from the secretaries Marianne Kalle and

Tanja Bürger, and from Anne Belkner. For enabling my work in a different way, namely

by keeping my office clean, I thank the non-scientific staff of the University of Paderborn.

I would like to thank Alessandro Dell’ Aere for helping me to build a physical soccer envi-

ronment for the AIBO robots and some students for supporting me with the AIBO robots’

technical issues: Johannes Berg, Raphael Golombek, and Nicolai Hähnle are to be men-

tioned here.

Last but not least I am very indebted to my parents Berthild and Karl-Friedrich Post who

supported me not only during my studies but during my whole life in all ways imaginable. I

want to thank my brother Sebastian, Janina, my friends from the University of Paderborn,

especially Ping, the “Mensa-Kreis”, and the musicians I played music with, and, of course,

all other friends of mine.

For the development of great software tools I want to especially thank all L

EX develop-

ers and the developers of Dia, Kate, and Kile many of whom work voluntarily, and the

developers of Matlab which is a commercial software tool. All of them made my work

technically possible or at least substantially simpler.

For financial support I am very grateful to the Paderborn Institute for Scientific Compu-

tation (PaSCo)(1) and to the University of Paderborn.

Marcus Post, February 2008

(1)The research is (partly) supported by the DFG Research Training Group GK-693 of the Paderborn

Institute for Scientific Computation (PaSCo).

Abstract

Robotics is, from a scientific point of view, a very broad topic with many applications.

While highly specialised robots have been widely used in production lines, the next big

scientific steps are towards autonomy of robots and interaction with other robots and hu-

mans. For achieving these long-term goals catenations of physical and mental abilities

which are interdisciplinary and scientifically challenging have to be carried out. At its cur-

rent state, robot soccer is an appropriate environment for demonstrating and developing

robotic skills as several areas are addressed such as image processing and analysis in the

widest sense (including e. g. object matching and directing the camera), control and opti-

misation of physical movement (walking, ball handling), and the strategic planning which

may be considered as being close to high level reasoning. In this thesis, game theoretic and

reinforcement learning approaches are utilised to contribute to strategic planning in robot

soccer which serves as a motivating example. The aim of strategic planning is to obtain

an optimal strategy which also takes the possibly unknown strategies of other players into

account. A natural further goal of this thesis is the development and analysis of algorithms

by means of which such optimal strategies can be approximately computed.

More specifically, the following steps are undertaken: first, a game theoretic model of

multi-player robot soccer is developed which is independent from the robot hardware. The

occuring challenge to determine an optimal strategy with respect to this model for as

many robots as possible is met by exact model reduction, i. e. by finding equivalent smaller

models. For this, a theoretical framework of symmetries is developed which bases on

homomorphisms between two-player zero-sum Markov games. It lays a formal foundation

for practitioners who already implicitly used results proven within that framework. A

special result which is important for model reduction is that the reduction can be performed

in several separate steps and be combined afterwards which is expressed by a composition

of homomorphisms. Finally, a qualitatively new symmetry which interchanges the two

players of the Markov game, i. e. the two teams in robot soccer, is proven to be part of

the homomorphism framework. Particularly, this means that it can be combined with all

symmetries which occur in Markov decision processes.

The theoretical results about Markov game symmetries are algorithmically exploited for

Dynamic Programming (DP) and Reinforcement Learning (RL) methods which are also

compared. Such comparisons ought to be standard but seem unusual for large parts of the

RL literature. Unsurprisingly, DP methods are more efficient and thus the following general

procedure seems recommendable: firstly, to design an approximate model for the task at

hand and solve this by DP methods to an appropriate level of precision and, secondly, to

use the DP solution of the rough model as an initialisation for an RL method to let the RL

method adapt to the unknown real model. In this spirit, the developed soccer model and

the computation of its optimal solution can be seen as the completion of the first of the

above two steps. Ideas of dynamical systems and graph theory are additionally integrated

Loading more pages...