Universität Paderborn
Department of Mathematics and Computer Science
Reduction Techniques
in Constraint Programming
and Combinatorial Optimization
Dissertation
by
Meinolf Sellmann
Thesis submitted for the degree of
Doctor of Natural Sciences
Paderborn, August 2002.
For Olga.
Acknowledgments
This dissertation was written over the past four years while I was working as a research
assistant in the Monien research group at the Department of Mathematics and Computer Science
of the Universität Paderborn. In its present form, it would not have been possible without the
support and help of a large number of scientists, colleagues, friends, and relatives. I would
like to take this opportunity to thank them.
First of all, my special and heartfelt thanks go to Prof. Dr. Burkhard Monien, who supervised
this work with great scientific competence and much goodwill, both in substance and on a
personal level. The manifold topics and projects that are actively pursued at his chair offer
an extraordinary insight into current research, for which I am very grateful to him. Beyond
that, over the past years I have been carried by the certainty of his support, which demands
and fosters achievement as much as modesty.
For her love, her advice in difficult situations, her patience, and her faith in me, I
especially thank my girlfriend Olga. Without her backing and her help, this work would not
have succeeded. The same holds for my family, my parents and brothers, who always support me
to the best of their ability and foster my development.
I further thank my colleagues for the good scientific collaboration in the research group,
in the EU project PARROT, in the DFG Collaborative Research Center Massive Parallelität,
in the EU project UP-TV, and in the DFG Priority Program Algorithmik großer und
komplexer Netzwerke. Many of the results presented in this thesis emerged from an intensive
exchange of ideas within these research groups. I would like to explicitly mention
Torsten Fahle, Dr. Warwick Harvey, Georg Kliewer, Norbert Sensen, and Kyriakos Zervoudakis.
Finally, I warmly thank Dr. Michael Laska for his support as project manager of the research
group, and Ulrich Ahlers, Sigrid Gundelach, Marion Rohloff, and Thomas Thissen for their
tireless help with organizational and technical problems.
Thank you very much!
Paderborn, August 2002. Meinolf Sellmann
Contents
1 Introduction
1.1 A World of Optimization
1.2 Modeling and Efficiency
1.3 Contribution
1.3.1 Outline and Major Results
1.3.2 Background
1.3.3 Publications
Part I Methods
2 Optimization Constraints
2.1 Definitions and General Observations
2.1.1 On the Complexity of Cost-based Domain Filtering Problems
2.1.2 Degrees of Consistency
2.2 Shortest Path Constraints
2.2.1 Definition
2.2.2 Shortest Path Problems on DAGs
2.2.3 Shortest Path Problems on Undirected Graphs
2.2.4 Shortest Path Problems on Directed Graphs
2.2.5 Summary
2.3 Weighted Stable Set Constraints on Interval Graphs
2.3.1 The Weighted Stable Set Constraint
2.3.2 A Mathematical Programming Approach
2.3.3 Cost-based Filtering
2.4 Weighted All Different Constraints
2.4.1 The Minimum Weight All-Different Constraint
2.4.2 An Arc-Consistency Algorithm
2.5 Knapsack Constraints
2.5.1 Definition and Applications
2.5.2 Knapsack Relaxations
2.5.3 Cost-based Filtering for Knapsack Constraints
2.5.4 Experiments
2.5.5 Cost-based Filtering for Knapsack Related Problems
2.5.6 Summary
3 Cost-based Filtering and Problem Decomposition
3.1 CP-based Column Generation
3.2 CP-based Lagrangian Relaxation
3.3 Remarks and Generalizations
3.3.1 Solving the Lagrangian Dual and Impotence
3.3.2 Redundant Constraint Generation
3.3.3 Linking more than Two Optimization Constraints
3.3.4 Linear Relaxations and Cuts
3.3.5 Binary IPs
3.3.6 Column Generation vs. Lagrangian Relaxation
4 Symmetry Breaking
4.1 Symmetry Breaking by Dominance Detection
4.1.1 Efficient Realization in a Depth First Search
4.1.2 Arbitrary Search Strategies
4.1.3 A Different Representation of Choice Points
4.2 DeBruijn Graph Bisection
4.2.1 Bisection Width of the DeBruijn Graph
4.2.2 Symmetry Breaking for the Bisection of DeBruijn Graphs
4.3 The Social Golfer Problem
4.3.1 Symmetries in the Social Golfer Problem
4.3.2 Symmetry Breaking for the Social Golfer Problem
4.3.3 Numerical Results
4.4 The n-Queens Problem
4.4.1 Symmetry Breaking for the n-Queens Problem
4.4.2 Numerical Results
4.5 Summary
Part II Applications
5 Airline Crew Assignment
5.1 The Airline Crew Assignment Problem
5.2 Two Approaches for the Crew Assignment Problem
5.2.1 CP-based Column Generation Approach
5.2.2 Heuristic Tree Search Approach
5.3 Integration
5.3.1 The Airline Test Cases
5.3.2 Transforming a Set Covering into a Set Partitioning Solution
5.3.3 Generating Combinable Columns and Exploiting Dual Values
5.4 Numerical Results
5.5 Summary
6 Automatic Recording
6.1 The Automatic Recording Problem
6.1.1 On the Complexity of the Automatic Recording Problem
6.1.2 A Mathematical Programming Formulation
6.1.3 Solving the Resulting Integer Linear Program
6.2 CP-based Lagrangian Relaxation for the ARP
6.2.1 Implementation Details
6.3 Numerical Results
6.3.1 Test Instance Generation
6.3.2 Experimental Evaluation
6.4 Summary and Future Work
7 Capacitated Network Design
7.1 The Capacitated Network Design Problem
7.1.1 State of the Art
7.2 Lagrangian Relaxation Bounds
7.2.1 Shortest Path Relaxation
7.2.2 Knapsack Relaxation
7.2.3 Subgradient Optimization
7.2.4 Variable Fixing
7.2.5 Lagrangian Cardinality Cuts
7.3 A Branch & Bound Algorithm
7.4 Numerical Results
7.4.1 Benchmark Data
7.4.2 Algorithm Variants Considered in the Experiments
7.4.3 Evaluation
7.5 Summary and Future Work
8 The Social Golfer Problem
8.1 Definition
8.2 Another SBDD-Approach for the Social Golfer Problem
8.3 Heuristic Constraint Propagation
8.3.1 Literature on the Integration of CP and Local Search
8.3.2 Horizontal Constraints
8.3.3 Vertical Constraints
8.4 Numerical Results
8.5 Summary
9 Graph Bisection
9.1 The Graph Bisection Problem
9.2 Bounds on Graph Bisection
9.3 Approximation of the VarMC-bound
9.3.1 The FPTAS
9.3.2 Implementation Details
9.4 Cost-Decomposition Approach
9.4.1 Column Generation
9.4.2 Lagrangian Relaxation Based Column Generation
9.5 A Branch & Bound Algorithm
9.6 Numerical Results
9.6.1 Approximating Lower Bounds
9.6.2 Lower Bounds using Cost-Decomposition
9.6.3 Comparison of Lower Bound Algorithms
9.7 Summary and Future Work
10 Conclusion
List of Figures
List of Tables
Bibliography
Chapter 1
Introduction
1.1 A World of Optimization
On all levels, the world we live in is full of optimization problems and processes. Many macroscopic
physical and chemical phenomena can be explained by the theory that nature tries to reach
a state of minimum enthalpy. Also, every competition is inherently associated with a measure of
success that can be optimized. Biological life itself is based on competition, and the law of evolution
favors efficiency and adaptiveness. The same principle determines our economic system that
is based on the egoistic striving of every economic subject to maximize its wealth. Therefore, to
forecast natural processes and to be economically competitive, optimization problems have to be
solved.
The progress that is made in algorithmic computer science can help to take up this chal-
lenge. Nowadays, most computers that are sold are used to edit texts, to manage large amounts
of data, and to provide access to the Internet. On the other hand, provided with state-of-the-art
optimization software, even a simple personal computer has the potential to become a valu-
able tool for the simulation of natural processes, efficient production and decision support in a
progressive, up-to-date company. However, the technology push is dragging and the solutions
that algorithmic computer science offers are only laboriously transferred into practice. Espe-
cially small and medium sized companies often cannot afford the risk of investing a considerable
amount of money into the development of company-specific software that may soon turn out to
be over-specialized and too rigid to ensure a return on investment in the perpetually changing
environments of a globalized economy.
Of course, commercial optimization software that efficiently supports a stable, non-malleable
process by solving a general optimization problem (a Job-Shop Problem, for example)
has a chance to successfully find its way into the industry. Due to highly dynamic production
conditions, the number of such applications is rather limited, though, and there is certainly a
need for flexible and also company specific optimization software. For this purpose, there are
general solvers and software libraries available. Outstanding examples are ILOG SOLVER [121],
ECLIPSE [58], ABACUS [127], LEDA [153], ILOG CPLEX [118], and MINTO [159]. However,
with respect to the wide range of applications that they potentially address, even these compa-
rably successful tools could reach a much larger market than the one that is currently serviced.
A major obstacle for a broader use of standard optimization software is a lack of expertise out-
side the scientific world. In the current situation, we observe that the more powerful a solver
is, the more knowledge is necessary to use it successfully. Therefore, to ease the handling of
optimization software, the modeling of real-world problems has to become more intuitive, and
its influence on the efficiency of the solution process must be reduced.
1.2 Modeling and Efficiency
Consider the situation in mathematical programming. The user is forced to crush a possibly
well-structured problem into a set of very basic linear and integer constraints. If this modeling
process is carried out carefully, standard mixed integer program solvers like CPLEX can tackle
many problems with an astounding efficiency. However, setting up an efficiently solvable integer
program is an art: it requires much experience and is not at all an easy task [215]. In constraint
programming, the situation with respect to problem modeling is much more comfortable. Nowadays,
constraint programming solvers offer sets of predefined, so-called symbolic constraints that
reflect the user’s intuition much better than linear programs. However, even though the modeling
is easier, due to the loose connection of constraints, the optimization abilities of constraint pro-
gramming solvers are much more limited than those based on mathematical programming. To
a large extent, the lack of efficiency is caused by the fact that unfavorable regions of the search
space are being explored unnecessarily, which could be avoided by using a tight global bound on
the objective.
In order to overcome the obstacle of complicated problem modeling in mathematical programming,
work is in progress that tries to provide the user with higher-level building blocks
that can be used to describe a problem. The SCIL-library [186] for branch-and-cut-and-price
approaches is an excellent example of this attempt. Using a description language that provides
constructs in the style of symbolic constraints, the user can set up a problem model. Not only
does this make the modeling process easier, but on top of that, the solver is no longer provided
with only a set of dis-aggregated linear constraints, but is made aware of the basic substruc-
tures of a problem. Therefore, it is able to exploit specific knowledge about these substructures,
for example by adding global cuts to a problem that are valid for one of the polytopes that are
intersected. This may improve the quality of problem relaxations.
On the other hand, to improve the optimization abilities of constraint programming solvers,
there is a remarkable effort visible that tries to incorporate the merits of mathematical program-
ming into constraint programming machineries. The ILOG CONCERT technology, for exam-
ple, combines ILOG CPLEX and ILOG SOLVER. On a broader scale, a considerable number
of researchers work on the integration of methods from operations research into constraint pro-
gramming. Particularly, since 1999, the International Workshop on Integration of AI and OR
Techniques in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR)
has become a major forum for the exchange of ideas between the two research areas. Even
though combinatorial optimization and constraint programming cover a widespread area of re-
search topics and go far beyond simple enumeration techniques, tree search has been identified
and used as a key juncture between the two fields.
In combinatorial optimization, following the branch & bound paradigm, tree search is used
to solve hard optimization problems exactly, a task that consists of computing a feasible solution
and proving its optimality. Upper and lower bounding routines form the beating heart of every
branch & bound algorithm: primal heuristics or approximation algorithms are used to find near-
optimal solutions quickly, while relaxations overestimate the best performance of any solution
that can be found in a given sub-tree. Of course, if that estimate is worse than the performance
of the best known, so-called incumbent solution, the corresponding sub-tree does not need to be
explored any further and can be pruned.
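The pruning rule described above can be sketched on a small 0/1 knapsack instance. The following Python fragment is a minimal illustration and not code from this thesis: the function names, the choice of the fractional (linear) relaxation as the bounding routine, and the depth-first search order are all assumptions made for the example. Item weights are assumed to be positive.

```python
from typing import List, Tuple

Item = Tuple[int, int]  # (profit, weight); hypothetical representation


def fractional_bound(items: List[Item], capacity: int, start: int) -> float:
    """Optimistic estimate: value of the fractional relaxation on items[start:].

    Assumes items are sorted by decreasing profit/weight ratio.
    """
    bound = 0.0
    remaining = capacity
    for profit, weight in items[start:]:
        if weight <= remaining:
            bound += profit
            remaining -= weight
        else:
            # Take the fitting fraction of the first item that overflows.
            bound += profit * remaining / weight
            break
    return bound


def branch_and_bound(items: List[Item], capacity: int) -> Tuple[int, int]:
    """Return (best profit, number of search nodes visited)."""
    # Sorting by profit density makes the greedy fill above the relaxation optimum.
    items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    best = 0   # value of the incumbent solution
    nodes = 0

    def search(index: int, profit: int, remaining: int) -> None:
        nonlocal best, nodes
        nodes += 1
        if profit > best:
            best = profit  # new incumbent found
        if index == len(items):
            return
        # Prune: the sub-tree cannot contain an improving solution.
        if profit + fractional_bound(items, remaining, index) <= best:
            return
        item_profit, item_weight = items[index]
        if item_weight <= remaining:
            search(index + 1, profit + item_profit, remaining - item_weight)
        search(index + 1, profit, remaining)

    search(0, 0, capacity)
    return best, nodes
```

For example, `branch_and_bound([(60, 10), (100, 20), (120, 30)], 50)` returns a best profit of 220; the sub-tree that excludes the first two items is cut off because its optimistic estimate cannot beat the incumbent.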
On the other hand, in constraint programming, tree search is used to overcome the incom-
pleteness of a pure inference calculus that achieves a state of local consistency only. In every
search node, the finite domains of the variables of the CP-model are reduced with respect to the
model’s constraints; a process called constraint propagation. Sequentially, constraints are propa-
gated until a state of the domains is reached that achieves the desired degree of consistency. Only
then, a case distinction takes place and a branching step is carried out. This way, constraints in-
teract only via the domains of their variables, which makes the approach extremely flexible with
respect to the addition or removal of constraints.
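The propagation loop just described can be illustrated with a small sketch for binary constraints. It is only an assumed, simplified realization (a naive fixpoint iteration over all constraints, with hypothetical names `revise` and `propagate`), not the propagation engine of any particular solver. Note that the constraints communicate exclusively through the shared `domains` dictionary.

```python
from typing import Callable, Dict, List, Set, Tuple

Domains = Dict[str, Set[int]]
# Hypothetical encoding: (x, y, pred) requires pred(value of x, value of y).
Constraint = Tuple[str, str, Callable[[int, int], bool]]


def revise(domains: Domains, x: str, y: str,
           pred: Callable[[int, int], bool]) -> bool:
    """Remove values of x that have no supporting value in the domain of y.

    Returns True if the domain of x was reduced.
    """
    unsupported = {a for a in domains[x]
                   if not any(pred(a, b) for b in domains[y])}
    domains[x] -= unsupported
    return bool(unsupported)


def propagate(domains: Domains, constraints: List[Constraint]) -> bool:
    """Propagate all constraints until no domain changes any more.

    Returns False as soon as some domain becomes empty (inconsistency),
    True once a fixpoint with non-empty domains is reached.
    """
    changed = True
    while changed:
        changed = False
        for x, y, pred in constraints:
            if revise(domains, x, y, pred):
                changed = True
            # Same constraint, seen from the side of y.
            if revise(domains, y, x, lambda b, a: pred(a, b)):
                changed = True
            if not domains[x] or not domains[y]:
                return False
    return True
```

With domains {1, 2, 3} for x, y, and z and the two constraints x < y and y < z, the loop reduces the domains to {1}, {2}, and {3}. Adding or removing a constraint only changes the list passed to `propagate`, which reflects the flexibility mentioned above. In general, a branching step would follow whenever the fixpoint leaves some domain with more than one value.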
1.3 Contribution
In this thesis, we develop reduction techniques for combinatorial optimization and constraint
satisfaction problems that can be embedded in a tree search approach. In combinatorial opti-
mization, bound estimates and variable fixing algorithms are commonly used for that purpose,
whereas in constraint programming filtering algorithms undertake the task of shrinking the search
space by eliminating values from variable domains. The algorithms that we develop are meant
to be used as symbolic constraints in constraint programming solvers and also in optimization
software that provides high level description constructs like the SCIL-library. Therefore, we con-
sider the work done in this thesis as a contribution toward the development of efficient and easy
to use optimization software.
Note that, when solving discrete optimization problems to optimality, there are really two
tasks to be considered. First, an optimal solution must be constructed, and second, its optimality
must be proven. Optimal or at least near-optimal solutions can often be found quickly by heuris-
tics or by approximation algorithms, both specially tailored for the given problem. In contrast to
the construction of a high quality solution, the algorithmic optimality proof requires the investi-
gation of the entire search space, which in general is much harder than to partly explore the most
promising regions only. By eliminating parts of the search space that do not contain improving
solutions, problem reduction can help with respect to both aspects of discrete optimization.
1.3.1 Outline and Major Results
The thesis is organized in two parts. Part I consists of the Chapters 2–4 and is method oriented.
This means that the task of achieving a certain degree of consistency for some special filtering
problems is studied theoretically. The efficiency of the algorithms developed is measured in
terms of worst case complexity and the degree of consistency that they achieve.
The first type of reduction algorithm that we develop involves a special kind of symbolic
constraint: In Chapter 2, our goal is to provide a set of so-called optimization constraints. By
linking the objective function with the constraint structure of a problem, such constraints can be
used for pruning and problem reduction with respect to cost considerations, a process called cost-
based filtering. That way, optimization constraints naturally combine the optimization abilities
of operations research algorithms and the efficient modeling and filtering concepts of constraint
programming.
Particularly, we study the problem of achieving different degrees of consistency for opti-
mization constraints. Since achieving a state of hyper-arc-consistency may turn out to be an
NP-hard problem itself, we introduce a new type of consistency for optimization constraints,
so-called relaxed consistency. Based on the two concepts, we develop efficient cost-based fil-
tering algorithms for shortest path constraints (on directed acyclic graphs, undirected graphs
with non-negative edge weights, and directed graphs without negative cycles), weighted stable
set constraints in interval graphs, weighted all-different constraints, and knapsack constraints.
These constraints are supposed to be used as basic building blocks when modeling real-life dis-
crete optimization problems. By exploiting the knowledge of the given constraint structures, the
corresponding reduction algorithms make use of previously developed bounds and the efficient
ways known to compute them.
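To give a flavor of cost-based filtering, consider a shortest path constraint on a directed acyclic graph with an upper bound on the path length. The sketch below is an assumption made for illustration and is not taken from Chapter 2: it removes every edge (u, v) for which the shortest source-to-sink path forced through that edge, dist(source, u) + w(u, v) + dist(v, sink), exceeds the bound. The names `topological_order` and `filter_edges` and the edge-list representation are hypothetical.

```python
from collections import defaultdict
from math import inf
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (tail, head, weight); hypothetical encoding


def topological_order(nodes: List[str], edges: List[Edge]) -> List[str]:
    """Kahn's algorithm; assumes the graph is acyclic."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for u, v, _ in edges:
        successors[u].append(v)
        indegree[v] += 1
    order = [n for n in nodes if indegree[n] == 0]
    for n in order:  # the list grows while it is being traversed
        for v in successors[n]:
            indegree[v] -= 1
            if indegree[v] == 0:
                order.append(v)
    return order


def filter_edges(nodes: List[str], edges: List[Edge],
                 source: str, sink: str, bound: float) -> List[Edge]:
    """Keep only edges that lie on some source-sink path of length <= bound."""
    position = {n: i for i, n in enumerate(topological_order(nodes, edges))}
    from_source: Dict[str, float] = {n: inf for n in nodes}
    to_sink: Dict[str, float] = {n: inf for n in nodes}
    from_source[source] = 0.0
    to_sink[sink] = 0.0
    # Forward pass: relax edges in topological order of their tails.
    for u, v, w in sorted(edges, key=lambda e: position[e[0]]):
        from_source[v] = min(from_source[v], from_source[u] + w)
    # Backward pass: relax edges in reverse topological order of their heads.
    for u, v, w in sorted(edges, key=lambda e: position[e[1]], reverse=True):
        to_sink[u] = min(to_sink[u], w + to_sink[v])
    return [(u, v, w) for u, v, w in edges
            if from_source[u] + w + to_sink[v] <= bound]
```

Both distance labelings are computed by a single pass over the edges after sorting, so the filtering step costs little more than solving the underlying shortest path problem once in each direction.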
As we shall see, the loose connection of optimization constraints via variable domains results
in less effective and thus also less efficient problem reduction. Therefore, in Chapter 3, we
present a theory that motivates the linking of optimization constraints via the standard operations
research decomposition techniques column generation and Lagrangian relaxation.
Then, a second type of reduction algorithm is developed that bases its decisions on the con-
straint structure of a problem rather than on cost considerations. Obviously, a search node does
not need to be expanded if it represents a previously considered configuration. However, this
situation occurs frequently when tackling problems that contain symmetry. In Chapter 4, we
present a general symmetry breaking method called SBDD that is based on dominance detection
between choice points. An experimental evaluation shows that the method is better suited to
tackle highly symmetric problems than previously developed symmetry breaking techniques.
Part II of the thesis covers the Chapters 5–9 and is application oriented. Several combinato-
rial optimization and constraint satisfaction problems are investigated. The approaches that we
develop are based on the algorithms and methods from Part I. This allows an empirical evaluation
of the previously developed reduction algorithms on top of the theoretical work done in the first
part.
In particular, we consider the Airline Crew Assignment Problem in Chapter 5. The approach
presented is based on the concept of CP-based column generation in combination with shortest
path constraints. By exploiting CP and OR specific advantages, we are able to speed up the
computation of real-world airline crew schedules considerably. The ideas that we present have
been integrated in an industrial airline crew assignment software system and have yielded drastic
savings in running time.
In Chapter 6, we study the Automatic Recording Problem, which arises in the context of modern
multimedia applications. After giving an approximation scheme for the NP-hard problem, an
exact algorithmic approach is presented that links knapsack constraints and weighted stable set
constraints on interval graphs following the idea of CP-based Lagrangian relaxation. Numerical
results show that our implementation is efficient enough to tackle real-size problem instances in
an amount of time that is well affordable in practice.
The Capacitated Network Design Problem is tackled in Chapter 7. Lower bounds can be
computed by decomposing the problem. We review previously developed reduction techniques
and use CP-based Lagrangian relaxation to link them together. Moreover, a new technique is
presented that adds locally valid cuts based on a Lagrangian relaxation of the problem. In our
experiments, we show that a heuristic version of our potentially exact solver is able to provide
solutions of higher quality in less time than the best heuristic techniques known so far.
A new approach for the Social Golfer Problem is developed in Chapter 8. Using SBDD for
symmetry breaking and the new idea of heuristic constraint propagation, we are able to solve
problems that were previously out of reach for solvers based on constraint programming.
Finally, in Chapter 9, we develop a solver for the Graph Bisection Problem. The core of the
algorithm is a lower bounding procedure that approximates maximum multicommodity flows
with multiple sinks. Comparisons with a previously developed bound based on semi-definite
programming show the gains in quality and computation time on sparse, structured graphs. In
particular, our implementation is the first to compute the bisection widths of DeBruijn 9,
Shuffle-Exchange 9, and Shuffle-Exchange 10.
1.3.2 Background
To a large extent, the thesis is self-contained. However, we assume that the reader is famil-
iar with the basic concepts of algorithm and complexity theory (dynamic programming, NP-
completeness, approximation schemes, etc.), operations research (linear programming, Lagran-
gian relaxation, column generation, etc.), and constraint programming (logic programming,
hyper-arc-consistency, constraint propagation, etc.). For introductions, we refer the reader to:
Algorithm and Complexity Theory.
Cormen, Leiserson, and Rivest: Introduction to Algorithms [43].
Garey and Johnson: Computers and Intractability [88].
Hochbaum: Approximation Algorithms for NP-hard Problems [110].
Operations Research.
Nemhauser and Wolsey: Integer and Combinatorial Optimization [158].
Ahuja, Magnanti, and Orlin: Network Flows [1].
Jünger and Naddef: Computational Combinatorial Optimization [126].
Constraint Programming.
Marriott and Stuckey: Programming with Constraints: An Introduction [145].
Kumar: Algorithms for Constraint-Satisfaction Problems: A Survey [138].
Apt: The Rough Guide to Constraint Propagation [6].
1.3.3 Publications
Many parts of the work presented have been published at several workshops and conferences. In
case of multiple authors, the results have been achieved in a joint effort of the collaborating
researchers. An alphabetical ordering of the list of authors indicates that all researchers contributed
equally (this applies to Section 2.5 and Chapters 4 and 9), whereas a deviation from the alphabetical
ordering denotes that the first mentioned authors contributed significantly more than the other
authors (Chapters 3 and 5–8).
Unreviewed Workshops
U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework for
Constraint Programming Based Column Generation. 16th International Joint Conference
on Artificial Intelligence (IJCAI), Workshop on Non-Binary Constraints, 1999.
T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint Prop-
agation for Complex Column Generation Subproblems. 17th International Symposium on
Mathematical Programming (ISMP), 2000.
G. Kliewer, M. Sellmann, and A. Koberstein. Solving the capacitated network design
problem in parallel. 3rd meeting of the PAREO Euro working group on Parallel Processing
in Operations Research (PAREO), 2002.
Reviewed Workshops
T. Fahle and M. Sellmann. Constraint Programming Based Column Generation with Knapsack
Subproblems. 2nd International Workshop on Integration of AI and OR Techniques
in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Paderborn
Center for Parallel Computing, Technical Report tr-001-2000:33–44, 2000.
M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Integrating Direct CP
Search and CP-based Column Generation for the Airline Crew Assignment Problem. 2nd
International Workshop on Integration of AI and OR Techniques in Constraint Programming
for Combinatorial Optimization Problems (CP-AI-OR), Paderborn Center for Parallel
Computing, Technical Report tr-001-2000:163–170, 2000.
M. Sellmann and T. Fahle. CP-Based Lagrangian Relaxation for a Multimedia Applica-
tion. 3rd International Workshop on Integration of AI and OR Techniques in Constraint
Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 1–14, 2001.
M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 4th International Work-
shop on Integration of AI and OR Techniques in Constraint Programming for Combinato-
rial Optimization Problems (CP-AI-OR), pp. 191–204, 2002.
International Conferences
U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework
for Constraint Programming Based Column Generation. 5th International Conference on
Principles and Practice of Constraint Programming (CP), LNCS 1713:261–274, 1999.
M. Sellmann and T. Fahle. Coupling Variable Fixing Algorithms for the Automatic Recording
Problem. 9th Annual European Symposium on Algorithms (ESA), LNCS 2161:134–145,
2001.
T. Fahle, S. Schamberger, and M. Sellmann. Symmetry Breaking. 7th International Conference
on Principles and Practice of Constraint Programming (CP), LNCS 2239:93–107,
2001.
M. Sellmann, G. Kliewer, and A. Koberstein. Lagrangian Cardinality Cuts and Variable
Fixing for Capacitated Network Design. 10th Annual European Symposium on Algorithms
(ESA), LNCS 2461:845–858, 2002.
M. Sellmann. An Arc-Consistency Algorithm for the Weighted All Different Constraint.
8th International Conference on Principles and Practice of Constraint Programming (CP),
LNCS 2470:744–749, 2002.
M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 8th International Confer-
ence on Principles and Practice of Constraint Programming (CP), LNCS 2470:738–743,
2002.
Journals
T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint pro-
gramming based column generation for crew assignment. Journal of Heuristics, 8(1):59–
81, 2002.
M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Crew Assignment via Con-
straint Programming: Integrating Column Generation and Heuristic Tree Search. Annals
of Operations Research, 115:207–225, 2002.
T. Fahle and M. Sellmann. Cost-Based Filtering for the Constrained Knapsack Problem.
Annals of Operations Research, 115:73–93, 2002.
M. Sellmann and T. Fahle. Constraint Programming Based Lagrangian Relaxation for the
Automatic Recording Problem. Annals of Operations Research, to appear.
PART I
Methods
In the first part of this thesis, we introduce general purpose methods for pruning and filtering
with respect to cost considerations and symmetry.
In particular, we develop a tool box of cost-based filtering algorithms in Chapter 2. We
introduce the notion of relaxed consistency and study the complexity of achieving different levels
of consistency for shortest path constraints, weighted stable set constraints on interval graphs,
weighted all-different constraints, and knapsack constraints.
Then, in Chapter 3, we investigate how the interplay of optimization constraints can be im-
proved with the help of the standard problem decomposition techniques column generation and
Lagrangian relaxation.
Finally, in Chapter 4, we develop a straightforward symmetry breaking method that is based
on the detection of dominance relations between choice points. The method is applied to three
different problems and is shown to be particularly suited for highly symmetric problems.
Chapter 2
Optimization Constraints
In this chapter, we develop a “tool box” of domain filtering algorithms that can be used for
solving combinatorial optimization problems. While concentrating on some specific problems,
it is important to keep in mind that the domain filtering algorithms we develop are to be used as
building blocks for tackling more complex optimization problems. As a matter of fact, for the
problems considered here, domain filtering usually only makes sense in the presence of additional
constraints. Different methods of how cost-based filtering algorithms can be exploited in the
context of more general optimization problems will be presented in Chapter 3.
2.1 Definitions and General Observations
Within a tree search, during the course of optimization, we compute a sequence of feasible solutions.
We refer to the best known feasible solution as the incumbent solution. Obviously, once
we have found a solution of a certain quality, we are searching for better solutions only. Thus, we
impose a restriction on the objective. That restriction, in combination with other side-constraints
of the original problem, forms an optimization constraint [65, 79, 80, 128, 162, 189], which is
the core concept that we will be using throughout this chapter. It was developed by a community
that has been working on the integration of constraint programming (CP) and operations research
(OR) in recent years. In the OR world, though never explicitly stated as constraints, optimization
constraints are frequently used for bound computations and variable fixing. From a CP perspective,
they can be viewed as global constraints that link the objective with some other constraints
of the problem:
Given $n \in \mathbb{N}$, let $X_1, \dots, X_n$ denote some variables with finite domains
$D_1 := D(X_1), \dots, D_n := D(X_n)$. Furthermore, given a constraint
$\zeta : D_1 \times \dots \times D_n \to \{0, 1\}$, and an objective function
$Z : D_1 \times \dots \times D_n \to \mathbb{Q}$, let $x_i \in D_i$ for all $1 \le i \le n$.
Definition 2.1 Let $B \in \mathbb{Q}$ denote an upper (lower) bound on the objective $Z$ to be minimized
(maximized).
$\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$ with $\vartheta_\zeta[Z, B](x_1, \dots, x_n) = 1$ iff
$\zeta(x_1, \dots, x_n) = 1$ and $Z(x_1, \dots, x_n) < B$ is called minimization constraint.
$\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$ with $\vartheta_\zeta[Z, B](x_1, \dots, x_n) = 1$ iff
$\zeta(x_1, \dots, x_n) = 1$ and $Z(x_1, \dots, x_n) > B$ is called maximization constraint.
A minimization or maximization constraint is also called an optimization constraint.
The purpose of optimization constraints is twofold: first, they can be used for pruning by
computing an upper/lower bound on the objective, which is the common idea in branch & bound
algorithms. Second, they may also be used to remove those values from variable domains that
cannot be part of any improving solution, which may be viewed as a generalization of the variable
fixing technique: for binary problems, variable fixing and domain filtering are essentially the
same.
2.1.1 On the Complexity of Cost-based Domain Filtering Problems
In order to achieve a state of (hyper-)arc-consistency [6, 138]¹ of an optimization constraint, we
have to find and remove all assignments that cannot be extended to an improving solution that
is feasible with respect to ζ. That is, if ζ is the only constraint of a combinatorial optimization
problem (we call that optimization problem and the optimization constraint corresponding to
or associated with each other), an arc-consistency algorithm allows us to compute improving
solutions in a backtrack-free search. Consequently, if the original problem is NP-hard, so is
the problem of achieving arc-consistency of the corresponding optimization constraint. The
Knapsack Problem is an example of such a situation.
If the optimization problem associated with an optimization constraint is polynomial, then the
arc-consistency problem may also be polynomial. The Weighted Bipartite Matching Problem is
an example, because there exists a polynomial time algorithm for the problem and the removal
of an edge or two nodes (when the edge between the nodes is chosen to be part of the matching)
does not change the structure of the problem.
The situation may change, however, if the problem structure is not preserved when a variable
is forced to take a specific value. Consider a Shortest Path Problem in an arbitrary network,
where we use a binary variable for each edge. The problem of finding a shortest path is of
course solvable in polynomial time. However, if we are to compute the set of edges that must or
cannot be part of any simple path that does not exceed a certain length, we are facing an NP-hard
problem: it is easy to see that the Two Vertex Disjoint Paths Problem [83] can be reduced to this
problem.
¹ To be precise: here and in the remainder of this thesis, we consider the problem of achieving a state of
hyper-arc-consistency. However, for historical reasons and also to improve the readability, we simply write
arc-consistency when we refer to a state of local consistency with respect to one constraint only.
2.1.2 Degrees of Consistency
The discussion shows that we cannot always hope for a cost-based domain filtering algorithm that
achieves arc-consistency. Therefore, we may consider developing less effective but polynomial
time bounded filtering algorithms that may only achieve a weaker degree of consistency. Note
that, in a different context, the idea of weaker forms of consistency gives rise to the notion of
bound consistency, which can also be achieved more easily than general arc-consistency and
has proven valuable for many applications.
Regarding cost-based filtering, an idea that has been developed in OR to perform variable
fixing on integer linear problems is the reduced-cost filtering method: when computing the continuous
relaxation bound of a linear combinatorial optimization problem with the help of a general
LP solver (such as the simplex algorithm or interior point methods), we get dual information
and reduced-cost data for free. That data can be used to compute a lower bound on the loss of
performance that we have to accept when adding a new constraint of the form $X = x$ (usually this
is done by performing one dual simplex re-optimization step). Of course, if the loss is too large,
we can deduce that $x$ must be removed from the domain of $X$.
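The reduced-cost argument can be illustrated by a small Python sketch. It assumes a minimization problem over binary variables, an LP bound that has already been computed, and non-negative reduced costs for the variables that are non-basic at value 0. The function name `reduced_cost_filter` and the sample data are hypothetical and serve only as an illustration.

```python
from typing import Dict, List


def reduced_cost_filter(lp_bound: float,
                        reduced_costs: Dict[str, float],
                        incumbent: float) -> List[str]:
    """Return the binary variables whose value 1 can be removed.

    Assumption: every variable listed in `reduced_costs` is non-basic at 0
    in the LP optimum of a minimization problem. Forcing such a variable
    to 1 raises the LP bound by at least its reduced cost, so the value 1
    cannot lead to a solution better than the incumbent if the raised
    bound reaches the incumbent value.
    """
    removable = []
    for name, rc in reduced_costs.items():
        if lp_bound + rc >= incumbent:
            removable.append(name)
    return removable
```

With an LP bound of 10 and an incumbent of value 13, a variable with reduced cost 3.0 is fixed to 0 (10 + 3 is not smaller than 13), whereas a variable with reduced cost 0.5 stays undecided.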
We strengthen and generalize the basic idea by coupling optimization constraints and relax-
ations:
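Definition 2.2 Given an optimization constraint $\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$, let
$\mathcal{D} := D_1 \cup \dots \cup D_n$. Furthermore, denote the set of all subsets of $\mathcal{D}$ by $2^{\mathcal{D}}$.
Let $\vartheta_\zeta[Z, B]$ be a minimization constraint, and let $L : (2^{\mathcal{D}})^n \to \mathbb{Q} \cup \{-\infty, \infty\}$ such that
for all $M_i \subseteq D_i$, $1 \le i \le n$,
$$L(M_1, \dots, M_n) \le \min\{ Z(x_1, \dots, x_n) \mid \zeta(x_1, \dots, x_n) = 1,\ x_i \in M_i \ \forall\, 1 \le i \le n \},$$
where $\min \emptyset = \infty$. We call $L$ a relaxation of $\vartheta_\zeta[Z]$ and say that $\vartheta_\zeta[Z, B]$ is relaxed $L$-consistent,
iff for any given $1 \le i \le n$ and $x_i \in D_i$, $L(D_1, \dots, D_{i-1}, \{x_i\}, D_{i+1}, \dots, D_n) < B$.
Analogously, let $\vartheta_\zeta[Z, B]$ be a maximization constraint, and let $U : (2^{\mathcal{D}})^n \to \mathbb{Q} \cup \{-\infty, \infty\}$ such that
for all $M_i \subseteq D_i$, $1 \le i \le n$,
$$U(M_1, \dots, M_n) \ge \max\{ Z(x_1, \dots, x_n) \mid \zeta(x_1, \dots, x_n) = 1,\ x_i \in M_i \ \forall\, 1 \le i \le n \},$$
where $\max \emptyset = -\infty$. We call $U$ a relaxation of $\vartheta_\zeta[Z]$ and say that $\vartheta_\zeta[Z, B]$ is relaxed $U$-consistent,
iff for any given $1 \le i \le n$ and $x_i \in D_i$, $U(D_1, \dots, D_{i-1}, \{x_i\}, D_{i+1}, \dots, D_n) > B$.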
As one would expect, the definition states that relaxed L-consistency (relaxed U-consistency
follows analogously) can be achieved the more easily the weaker the relaxation $L$ is. For $L \equiv -\infty$,
there is no work to do to achieve relaxed L-consistency, whereas arc-consistency is enforced
when $L(M_1, \dots, M_n) = \min\{ Z(x_1, \dots, x_n) \mid \zeta(x_1, \dots, x_n) = 1,\ x_i \in M_i \ \forall\, 1 \le i \le n \}$. That is, the
choice of $L$ determines the degree of domain filtering.
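To make the two extreme cases concrete, the following Python sketch filters the domains of a minimization constraint with respect to an arbitrary relaxation. The names `exact_bound` and `filter_relaxed` are hypothetical, and the exact relaxation is computed by plain enumeration, so the sketch is only meant for tiny instances.

```python
from itertools import product
from typing import Callable, List, Sequence, Set

Domains = List[Set[int]]
Relaxation = Callable[[Sequence[Set[int]]], float]


def exact_bound(zeta, Z) -> Relaxation:
    """Tightest relaxation: the true minimum over the restricted domains."""
    def L(domains):
        costs = [Z(x) for x in product(*domains) if zeta(x)]
        return min(costs) if costs else float("inf")  # min of empty set is infinity
    return L


def filter_relaxed(domains: Domains, L: Relaxation, B: float) -> Domains:
    """Remove every value x_i with L(D_1, ..., {x_i}, ..., D_n) >= B until stable."""
    domains = [set(d) for d in domains]
    changed = True
    while changed:
        changed = False
        for i, d in enumerate(domains):
            for x in sorted(d):
                probe = domains[:i] + [{x}] + domains[i + 1:]
                if L(probe) >= B:
                    d.discard(x)
                    changed = True
    return domains
```

Called with `exact_bound(zeta, Z)`, the routine enforces arc-consistency by brute force; called with a relaxation that always returns minus infinity, it leaves all domains untouched.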
In practice, $L$ is usually chosen as a fairly tight bound that can still be computed quickly.
Generally, within a tree search there is a trade-off between the time spent per search node and the
total number of search nodes. Thus, the favorable choice of the accuracy of the relaxation
always depends on the optimization problem at hand. Note that the definition of relaxed consistency
allows us to compare domain filtering algorithms with respect to the running time and the degree of
consistency they achieve.
In the following, we develop cost-based domain filtering algorithms for shortest path con-
straints, weighted stable set constraints on interval graphs, weighted all-different constraints,
and knapsack constraints.
We have already seen that achieving arc-consistency for knapsack or shortest path constraints
is an NP-hard task. Therefore, the challenge is to develop efficient filtering algorithms for the two
constraints that achieve relaxed consistency for favorably strong relaxations.
On the other hand, for weighted stable set constraints on interval graphs and weighted
all-different constraints, arc-consistency can be achieved in polynomial time. Thus, we aim at
developing filtering methods for these constraints that achieve arc-consistency and run faster than in
time $O(nT)$, where $n$ denotes the number of (binary) variables, and $T$ is the time needed to solve
the corresponding optimization problem. That time bound can obviously be achieved by probing
all variable values in a brute force manner. As we will see, we can achieve arc-consistency for
both constraints in time $O(T)$, i.e. the same time that is needed to compute the bound on the
objective.
2.2 Shortest Path Constraints
Many real-world problems, e.g. in personnel scheduling and transportation planning, can be
modeled naturally as Constrained Shortest Path Problems (CSPs), i.e., as Shortest Path Problems
with additional constraints. A well-studied problem in this class is the Resource Constrained
Shortest Path Problem. Reduction techniques are vital ingredients of solvers for the CSP, which is
frequently NP-hard, depending on the nature of the additional constraints. Viewed as heuristics,
these techniques have so far not been studied theoretically with respect to their efficiency,
i.e., with respect to the relation of filtering power and running time. Using core concepts of
constraint programming and the notion of relaxed consistency, we provide a sound theoretical
study of cost-based filtering for shortest path constraints on acyclic, on undirected, and on directed
graphs that must not contain negative cycles.
Real-world problems can frequently be modeled as Shortest Path Problems with additional
constraints. The best known Constrained Shortest Path Problem (CSP) is probably the Resource
Constrained Shortest Path Problem [4, 17, 57, 103, 125] that consists in the combination of a
Shortest Path Problem and capacity constraints on a set of resources. Even on DAGs, for non-negative
objective functions and for only one resource, that problem is known to be NP-hard [88].
Standard applications for the Resource Constrained Shortest Path Problem are route planning
in traffic networks and quality of service routing [161, 216]. The Crew Scheduling Problem is
another example of a real-world problem where CSPs are used in many successful approaches:
In a column generation process, CSPs have to be solved to generate columns, which correspond
to individual lines of work in this context [219]. In Chapter 5, we present an example of such a
crew scheduling approach based on column generation with shortest path sub-problems.
Generally, CSPs appear very often as sub-problems in column generation approaches. Exam-
ples range from route guidance [123] and duty scheduling in public transit [25] up to the schedul-
ing of switching engines [142]. In Section 3.1, a general framework for constraint programming
based column generation is developed that formalizes the use of optimization constraints in this
context.
To solve Constrained Shortest Path Problems, state of the art solvers compute lower and
upper bounds on the problem and then close the duality gap. The latter task is carried out by an
enumeration procedure such as a tree search [17], dynamic programming [154] or a k-shortest
path algorithm [103]. Particularly in a tree search, but also in the other approaches the tightening
of (sub-)problems is vital for an effective gap closing procedure. Therefore, it is essential for the
overall performance and the practical success of the entire approach.
The first tightening strategy that was proposed goes back to a work done by Aneja, Aggarwal,
and Nair [4] for problem reduction of the Resource Constrained Shortest Path Problem. The basic
idea consists in identifying nodes and arcs that cannot be visited by any path that obeys the given
resource restrictions. The same method can also be used to identify nodes and arcs that cannot be
visited by any improving path, which gives a first cost-based filtering algorithm for the problem.
Dumitrescu and Boland [57] proposed a repeated problem reduction procedure that has shown
to be very successful for hard constrained problems. Beasley and Christofides [17] have shown
how a tighter global, Lagrangian relaxation based bound can be used for the elimination of nodes
and arcs.
Apparently, none of these heuristics has been classified with respect to its filtering abilities.
Moreover, the reduction techniques used all focus on the removal of nodes and arcs, but those
arcs and nodes that must be visited by all paths of a certain quality remain undetected. However,
with respect to the additional constraints of the CSP, this information can be very valuable as it
may prove useful for an additional reduction of the problem.
Constraint programming theory provides means to characterize the level of consistency that a constraint
filtering algorithm achieves. Using the extended notion of relaxed consistency for optimization
constraints, we are able to measure a non-exact filtering algorithm not only with respect to its
running time, but also with respect to its filtering power, which is determined by the quality of the
relaxation used. With respect to shortest path constraints, we study the complexity of achieving a state of
(hyper-)arc-consistency. Since the problem is NP-hard in the general case, we introduce shortest
path relaxations and develop and compare different filtering algorithms for different graph
classes.
In Section 2.2.2, we develop an efficient linear time filtering algorithm for shortest path
constraints on DAGs. In Section 2.2.3, we investigate the problem on undirected graphs, where
it is shown to be NP-hard. We introduce a shortest path relaxation $L_1$ and formulate a linear time
algorithm that achieves a state of relaxed $L_1$-consistency. Finally, in Section 2.2.4, we develop
cost-based filtering algorithms for shortest path constraints on general directed networks with
non-negative costs or graphs that at least do not contain negative weight cycles.
2.2.1 Definition
Definition 2.3 Denote a weighted (directed or undirected) graph by $G = (V, E, c)$, and let $h \in \mathbb{N}$.
A sequence of nodes $P = (i_1, \dots, i_h) \in V^h$ with $(i_f, i_{f+1}) \in E$ for all $1 \le f < h$ is called a
path from $i_1$ to $i_h$ in $G$.
A path $P$ is called simple iff $P$ visits every node at most once. For all $i, j \in V$, denote the
set of all simple paths from $i$ to $j$ by $\pi(i, j)$.
For all paths $P$, nodes $i \in V$ and edges $(i, j) \in E$, we write $i \in P$ or $(i, j) \in P$ iff $P$ visits
node $i$ or the edge $(i, j)$, respectively. For a set of nodes or edges $S$, we write $S \subseteq P$, iff
$s \in P$ for all $s \in S$. Correspondingly, we write $P \subseteq S$ iff $s \in S$ for all $s \in P$.
The cost of a path $P = (i_1, \dots, i_h)$ is defined as $\mathrm{cost}(P) := \sum_{1 \le j < h} c_{i_j i_{j+1}}$. Accordingly, for
any set $S \subseteq E$, we define $\mathrm{cost}(S) := \sum_{(i, j) \in S} c_{ij}$.
Definition 2.4 Let $G = (V, E, c)$ denote a (directed or undirected) graph with $n = |V|$
and $m = |E|$, a designated source $v_1 \in V$ and sink $v_n \in V$, and arc costs $c_{ij} \in \mathbb{Z}$. Furthermore, assume we
are given binary variables $X_1, \dots, X_m$, an integer variable $Z$, and an objective bound $B \in \mathbb{Z}$.
A shortest path constraint has binary variables $X_1, \dots, X_m$ and an integer variable $Z$, and
an instantiation $X_i = x_i$, for all $1 \le i \le m$, and $Z = z$ is consistent iff the following holds:
1. The set $\{e_i \mid X_i = 1\} \subseteq E$ determines a simple path in the graph $G$ from the source $v_1$
to the sink $v_n$.
2. $z$ is the cost of the path represented by the value of $X$, and $z < B$.
Every simple path from source to sink with costs less than $B$ is called admissible.
To ease the notation, for the remainder of this section we assume that a shortest path
constraint is associated with a set variable $Y \subseteq E$ that represents the set of edges $e_i$ for which $X_i = 1$.
The (current) domains of the variables $X$ will be represented by two sets: the set of possible
members $\mathrm{pos}(Y)$, and the set of required members $\mathrm{req}(Y)$ of $Y$. In the sub-tree of the search
rooted at the current choice point we require $\mathrm{req}(Y) \subseteq Y \subseteq \mathrm{pos}(Y)$. That is, $\mathrm{req}(Y)$ represents
the set of variables that have already been set to $X_i = 1$, and the set $\mathrm{pos}(Y)$ represents the set of all
variables that have not yet been set to $X_i = 0$. Then, in the current choice
point, we have to search for admissible paths $P$ such that $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Note that we use
the set variable $Y$ only to ease the presentation. It has no impact on the implementation that is
assumed to use only the variables $X$. Especially, the didactic use of a set variable has no impact
on the definition of arc-consistency. To achieve arc-consistency of a shortest path constraint, we
must ensure:
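For all $e \in \mathrm{pos}(Y)$, there exists an admissible path $P$ with $\mathrm{req}(Y) \cup \{e\} \subseteq P \subseteq \mathrm{pos}(Y)$, and
for all $e \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists an admissible path $P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y) \setminus \{e\}$.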
Obviously, the existence of an admissible path can be decided by applying a shortest path
algorithm. However, deciding whether there exists a simple path that visits a given set of edges is
an NP-hard task, which can be shown by a simple reduction from the Two Vertex Disjoint Paths
Problem [83]. Consequently, the arc-consistency problem for the general shortest path constraint
is also NP-hard.
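The two arc-consistency conditions can be checked literally on very small directed graphs by enumerating all simple paths. The following Python sketch does exactly that; it takes exponential time in general, which is in line with the hardness result above. The names `simple_paths` and `arc_consistent_domains` as well as the dictionary-based graph encoding are assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Arc = Tuple[str, str]


def simple_paths(arcs: Dict[Arc, int], source: str, sink: str) -> List[List[Arc]]:
    """Enumerate all simple source-sink paths of a directed graph (exponential)."""
    paths, stack = [], [(source, [], {source})]
    while stack:
        node, path, seen = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        for (u, v) in arcs:
            if u == node and v not in seen:
                stack.append((v, path + [(u, v)], seen | {v}))
    return paths


def arc_consistent_domains(arcs: Dict[Arc, int], source: str, sink: str,
                           req: Set[Arc], pos: Set[Arc],
                           B: int) -> Optional[Tuple[Set[Arc], Set[Arc]]]:
    """Return (req', pos') after enforcing both conditions, or None if inconsistent."""
    admissible = [set(p) for p in simple_paths(arcs, source, sink)
                  if sum(arcs[a] for a in p) < B and req <= set(p) <= pos]
    if not admissible:
        return None
    new_pos = set().union(*admissible)        # arcs on at least one admissible path
    new_req = set.intersection(*admissible)   # arcs on every admissible path
    return new_req, new_pos
```

An arc stays possible iff it lies on some admissible path that respects the current domains, and it becomes required iff no such path avoids it.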
2.2.2 Shortest Path Problems on DAGs
The reduction from the Two Vertex Disjoint Paths Problem does not prove NP-hardness for acyclic
graphs. As a matter of fact, on DAGs the problem is solvable in polynomial time. In the following,
we develop a cost-based domain filtering algorithm for shortest path constraints that achieves
arc-consistency in time $O(n + m)$.
As stated above, to perform domain reduction for a shortest path constraint, we are facing
two tasks: first, in order to shrink the set $\mathrm{pos}(Y)$ as much as possible, we need to identify all arcs
in $E$ that cannot be visited by any admissible path $P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Second, to increase
$\mathrm{req}(Y)$ as much as possible, we must compute all arcs that must be visited by all admissible paths
$P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$.
First, to ensure that all paths computed are subsets of $\mathrm{pos}(Y)$, we remove all arcs from $G$
that are not in $\mathrm{pos}(Y)$. Then we want to make sure that all admissible paths are super-sets of
$\mathrm{req}(Y)$. To do so, we set $M := m \cdot \max\{ c_{ij} \mid c_{ij} > 0 \}$, we decrease the arc weight $c_{ij}$ of every edge
$(i, j) \in \mathrm{req}(Y)$ by $M$ and adapt the upper bound $B$ by subtracting $M \cdot |\mathrm{req}(Y)|$. In the following, we
assume that $G = (V, E, c)$ has been updated accordingly. Note that the removal of nodes and arcs
can be performed in time $O(n + m)$ when an adjacency list representation (such as the forward
or backward star representations [1]) of $G$ is used.
2.2.2.1 Removing Arcs from the Possible Set
Without loss of generality, we may assume that the nodes in $V$ are ordered topologically. If a
node $i \in V$ precedes a node $j \in V$ in the topological ordering, we write $i \le_{top} j$. We write $i <_{top} j$
iff $i \le_{top} j$ and $i \ne j$. Note that, for all arcs $(i, j) \in E$, it holds that $i <_{top} j$. Furthermore, we
may assume that $v_1$ and $v_n$ are the first and last nodes in this ordering, respectively. To find out
for a given arc $(i, j) \in E$ whether there still exists an admissible path $P$ with $(i, j) \in P$, we use a
method that was originally developed for the Resource Constrained Shortest Path Problem [4]:
First, we compute the shortest path distances $c(v_1, i)$ from the source $v_1$ to all nodes $i \in V$.
Once a topological ordering of the nodes is known (which takes time $O(n + m)$), this can be done
in time $O(n + m)$ even in the presence of negative arc weights (see [43]). Next, we compute the
shortest path distances $c(i, v_n)$ from all nodes $i \in V$ to the sink $v_n$, which can also be done in
linear time by reversing all arcs and using the same procedure as before with $v_n$ as the starting
node. Finally, for every arc $(i, j) \in \mathrm{pos}(Y)$ we check whether the shortest simple path from $v_1$ to
$v_n$ via $(i, j)$ has costs lower than $B$, i.e., we remove $(i, j)$ from $\mathrm{pos}(Y)$ iff
$$c(v_1, i) + c_{ij} + c(j, v_n) \ge B \qquad (2.1)$$
Using the same idea we can also identify nodes that can be removed from the graph, because
they can never be part of any path from source to sink that is short enough to beat the current
upper bound. Figure 2.1 illustrates the situation.
Fig. 2.1: The figure shows arcs on shortest paths from $v_1$ and to $v_{11}$ in a DAG. Dashed lines mark
shortest-path arcs from $v_1$, dotted lines those to $v_{11}$. Solid lines represent arcs that are in both sets. Consider for
example node 7: the shortest path from $v_1$ to 7 is $(v_1, 3, 7)$, and the shortest path from node 7 to $v_{11}$ is
$(7, 9, v_{11})$. Therefore, a shortest path from $v_1$ to $v_{11}$ via node 7 is $(v_1, 3, 7, 9, v_{11})$.
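The test (2.1) can be sketched in a few lines of Python. The sketch assumes that the nodes are given in a topological order whose first and last entries are the source and the sink, that arcs are stored in a dictionary keyed by node pairs (so parallel arcs are not supported), and that arcs outside the possible set have already been deleted. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Arc = Tuple[int, int]
INF = float("inf")


def dag_distances(order: List[int], arcs: Dict[Arc, float], start: int) -> Dict[int, float]:
    """Single-source shortest paths on a DAG; `order` must be a topological order."""
    out = defaultdict(list)
    for (i, j), c in arcs.items():
        out[i].append((j, c))
    dist = {v: INF for v in order}
    dist[start] = 0.0
    for i in order:                      # relax the arcs in topological order
        if dist[i] < INF:
            for j, c in out[i]:
                dist[j] = min(dist[j], dist[i] + c)
    return dist


def filter_possible_arcs(order: List[int], arcs: Dict[Arc, float], B: float) -> Set[Arc]:
    """Keep exactly the arcs (i, j) with c(v1, i) + c_ij + c(j, vn) < B."""
    source, sink = order[0], order[-1]
    from_source = dag_distances(order, arcs, source)
    reversed_arcs = {(j, i): c for (i, j), c in arcs.items()}
    to_sink = dag_distances(order[::-1], reversed_arcs, sink)
    return {(i, j) for (i, j), c in arcs.items()
            if from_source[i] + c + to_sink[j] < B}
```

Both distance computations touch every arc once, so the whole filter runs in time linear in the size of the graph, in line with the bound stated above.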
2.2.2.2 Adding Arcs to the Required Set
After having shrunk the possible set, we still need to identify all arcs that must be visited by
all admissible paths to achieve a state of arc-consistency for a shortest path constraint. If we
perform the algorithm for the removal of members from the possible set first, we may assume
that the graph $G$ only contains arcs and nodes that are part of at least one admissible path. Then
the following lemma characterizes the arcs that are required.
Lemma 2.1 Denote the graph that is obtained by un-directing all arcs in $G$ by $G_u$. If, for all
arcs $e \in E$, there exists an admissible path $P$ in $G$ such that $e \in P$, then the following statements
are equivalent:
1. Every admissible path $P$ in $G$ contains the arc $(i, j) \in E$.
2. For all arcs $(k, l) \in E$ with $(k, l) \ne (i, j)$, it holds that $k, l \le_{top} i$ or $j \le_{top} k, l$.
3. The edge $(i, j)$ is a bridge in $G_u$.
Proof:
$1 \Rightarrow 2$. Assume that there exist nodes $k, l \in V$ such that $(k, l) \in E$, $(k, l) \ne (i, j)$. Then
there exists an admissible path $P$ such that $(k, l) \in P$ and, by statement (1), $(i, j) \in P$. Thus, $k, l \le_{top} i$ or
$j \le_{top} k, l$.
$2 \Rightarrow 3$. Statement (2) implies that the arc $(i, j)$ is the only arc that leaves node $i$. Also,
there exists no arc that by-passes the node $i$ in $G$. Thus the removal of $(i, j)$ disconnects
$v_1$ and $v_n$ in $G_u$, i.e., $(i, j)$ is a bridge in $G_u$.
$3 \Rightarrow 1$. If $(i, j)$ is a bridge in $G_u$, then every path from $v_1$ to $v_n$ in $G$ must contain the arc
$(i, j)$. Therefore, for every admissible path $P$, it also holds that $(i, j) \in P$.
Lemma 2.1 allows us to compute the arcs to be required by searching for bridges in the
undirected version of $G$, and the bridges of an undirected graph can be computed in time $O(n + m)$
[43].
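A linear-time bridge search can be sketched as follows. The code is a standard depth-first search with low-link values, written iteratively to avoid recursion limits; it is not taken from [43], and the function name `bridges` as well as the encoding of nodes as integers 0, ..., n-1 are assumptions of this illustration. Edge identifiers are used so that parallel edges are not mistaken for bridges.

```python
from typing import List, Tuple


def bridges(n: int, edges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return all bridges of an undirected graph with nodes 0..n-1."""
    adj = [[] for _ in range(n)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))
    disc = [-1] * n          # discovery time of each node, -1 means unvisited
    low = [0] * n            # lowest discovery time reachable from the subtree
    found, timer = [], 0
    for root in range(n):
        if disc[root] != -1:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            node, parent_edge, it = stack[-1]
            advanced = False
            for nxt, eid in it:
                if eid == parent_edge:
                    continue                 # do not walk back over the tree edge
                if disc[nxt] == -1:
                    disc[nxt] = low[nxt] = timer
                    timer += 1
                    stack.append((nxt, eid, iter(adj[nxt])))
                    advanced = True
                    break
                low[node] = min(low[node], disc[nxt])
            if not advanced:
                stack.pop()
                if stack:
                    parent = stack[-1][0]
                    low[parent] = min(low[parent], low[node])
                    if low[node] > disc[parent]:
                        found.append(edges[parent_edge])
    return found
```

Applied to the undirected version of the filtered DAG, the returned edges correspond to the arcs that have to be added to the required set according to Lemma 2.1.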
The following theorem summarizes the results in Sections 2.2.2.1 and 2.2.2.2.
Theorem 2.1 On DAGs, arc-consistency for the shortest path constraint can be achieved in
linear time.
2.2.2.3 Incremental Shortest Path Constraint Propagation
During the search, the domain of $Y$ changes frequently, which means that many and often similar
single-source shortest path problems (SSSPs) have to be solved. Therefore, we develop an incremental
version of the algorithm. Instead of restarting the computation from scratch, it makes use of
previously computed shortest path information. Moreover, it uses the information on the differences
in the domains of the current and the last call: the required set may have grown, and the possible
set may have shrunk.
For an efficient implementation of the algorithm, we use both the forward and the backward
star representation of $G$. We choose this redundant data structure, first because we need to
compute shortest paths in $G$ and the reverse of $G$, and second because we are able to perform the
incremental shortest path update more efficiently.
In order to compute the arcs and nodes that have to be removed from the graph, a support
idea in the style of AC-6 [21] can be used to reduce the computational effort required. If a node
$i$ leaves the possible set, we mark all its adjacent nodes $j$ as affected by the removal or distance
change of node $i$. If the node $i$ was even the direct shortest path predecessor of node $j$ in the
preceding call to the propagation routine, we refer to $i$ as the support node of $j$.
We iterate over all nodes in their topological order. If the node $j$ is affected, we check whether
its support still exists or is replaceable by another node without a change in the shortest path
distance $c(v_1, j)$. Since this requires iterating over all in-going arcs, a backward star representation
is used. Only if the support is lost and cannot be replaced do we need to propagate the distance
update and mark the successors of node $j$ as affected. To do this efficiently, the forward star
representation of $G$ is used. That way, we perform a continuing update on all affected nodes in
only one pass.
If a new arc $(i, j)$ becomes required, we do not need to re-compute the shortest path distances
of all nodes in the graph, because nodes that precede $i$ in the topological ordering are not affected
by that change. Therefore, it is sufficient to restart the SSSP-algorithm for DAGs at node $i$. The
distance $c(v_1, i)$ can be reused from the previous call to the constraint propagation algorithm.
Moreover, we can stop examining all outgoing arcs when we run over the first node $k$ for which
$(k, l)$ was already formerly required: the shortest path tree structure “behind” a required arc
remains intact, and the difference in the distance $c(v_1, k)$ before and after the change simply
applies to all following nodes as well.
In the worst case, the incremental variant of the propagation algorithm may still require time
$O(n + m)$. However, in practice the ideas sketched above can reduce the computational
effort considerably, as we shall see in Chapter 5.
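The restart at node $i$ can be illustrated by the following simplified Python sketch. It assumes that the weight of the newly required arc has already been decreased in the backward star lists `in_arcs`, and it recomputes the source distances only for the nodes that follow $i$ in the topological order. The early stop at formerly required arcs and the support bookkeeping described above are deliberately left out; the function name `restart_from` is hypothetical.

```python
from typing import Dict, List, Tuple

INF = float("inf")


def restart_from(order: List[int],
                 in_arcs: Dict[int, List[Tuple[int, float]]],
                 dist: Dict[int, float],
                 i: int) -> Dict[int, float]:
    """Recompute c(v1, v) for all nodes v that follow node i in `order`.

    `in_arcs[v]` is the backward star of v, a list of (predecessor, weight)
    pairs. Distances of i and of all nodes preceding i are reused as they are.
    """
    start = order.index(i) + 1
    for v in order[start:]:
        candidates = [dist[u] + c for u, c in in_arcs.get(v, [])]
        dist[v] = min(candidates) if candidates else INF
    return dist
```

Because every predecessor of a node appears earlier in the topological order, each recomputed label only reads distances that are already up to date.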
2.2.3 Shortest Path Problems on Undirected Graphs
Next we consider shortest path constraints on undirected graphs with non-negative edge weights.
Unlike in the previous section, achieving arc-consistency for a shortest path constraint on
undirected graphs is an NP-hard task, as the following observation shows.
Lemma 2.2 Given an undirected graph $G = (V, E)$, $n := |V|$, $m := |E|$, two designated nodes
$v_1, v_n \in V$ and a set of edges $S \subseteq E$, it is NP-hard to decide whether there exists a simple path
$P \in \pi(v_1, v_n)$ with $S \subseteq P$.
Proof: We reduce the Hamiltonian Path Problem to this problem: Given an undirected graph
$G = (V, E)$, do there exist two nodes $s, t \in V$ and a simple path $P \in \pi(s, t)$ with $V \subseteq P$? We
transform $G$ into an instance $(G', v_1, v_n, S)$ such that there exists a simple path $P' \in \pi(v_1, v_n)$ with
$S \subseteq P'$ iff there exists a Hamiltonian path in the original graph $G$.
Fig. 2.2: The structure replacing a node in G.
First, we add two new nodes $v_1, v_n$, and all edges in $V \times \{v_1, v_n\}$. Then every node $v \in V$
is replaced by the structure given in Figure 2.2: The ellipse sketches a former node $v \in V$. For
all edges $e_1, \dots, e_d \in (V \cup \{v_1, v_n\}) \times (V \cup \{v_1, v_n\})$ incident to $v$, we add new nodes $1_v, \dots, d_v$
that connect the new structure with their corresponding edges. Moreover, we add two
new nodes $a_v$ and $b_v$ and edges $\{1_v, \dots, d_v\} \times \{a_v, b_v\}$, together with the edge $(a_v, b_v)$. Finally, we set
$S := \{ (a_v, b_v) \mid v \in V \}$, the set of edges that must be visited.
Then there exists a simple path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$ iff there exists a Hamiltonian
path in the original graph $G$: Assume we are given a path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$. $P'$ must visit all
structures sketched in Figure 2.2 at least once, because it must visit all edges $(a_v, b_v)$. On the
other hand, after $P'$ has visited the edge $(a_v, b_v)$ it can never return to the current structure
because all paths that pass through it must visit either node $a_v$ or $b_v$ again. Therefore, $P'$ visits all
structures corresponding to the original nodes in $V$ exactly once, and thus defines a Hamiltonian
path in $G$.
On the other hand, assume there exists a Hamiltonian path $P \in \pi(s, t)$ in $G$ for some nodes
$s, t \in V$. Then we construct a path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$ in the following manner: We start at
$v_1$ and go to node $s$ first. Now, for each $v \in V$ that the Hamiltonian path visits we enter via some
edge $e_i$, go to node $a_v$ from there, we visit the edge $(a_v, b_v)$, and find our way out via the node
incident to $e_j$ that is visited by $P$ next. Since $V \subseteq P$, we visit all edges in $S$ like that. Finally, we
end at node $t$ and proceed to $v_n$ from there.
As a simple consequence of Lemma 2.2, we get the following
Corollary 2.1 To check the arc-consistency of a shortest path constraint on an undirected graph
is NP-hard.
Due to this result, in the following, we develop a cost-based filtering algorithm that achieves
relaxed consistency rather than arc-consistency. In order to introduce the relaxation we want to
use, we first start with the following
Definition 2.5 Denote a weighted (directed or undirected) graph by $G = (V, E, c)$.
A path $P$ is called a k-simple path in $G$ iff, for all $j \in V$, the path $P$ visits $j$ at most $k$ times.
Note that a 1-simple path is a simple path in $G$.
With $P(i, j) \in \pi(i, j)$ we refer to a shortest path from $i$ to $j$ (with respect to $c$). Then, to
ease the notation, we set $c(i, j) := \mathrm{cost}(P(i, j))$.
Given a shortest path constraint, a k-simple path $P$ from $v_1$ to $v_n$ is called a k-admissible
path iff $\mathrm{cost}(P) < B$.
Note that in a graph with non-negative edge weights, a shortest admissible path is also a shortest
2-admissible path. Now, instead of checking for admissible paths only, we consider the following
shortest path relaxation: Let $D(Y)$ denote the domain of $Y$ represented as the pair of sets
$(\mathrm{req}(Y), \mathrm{pos}(Y))$. We set $H := \{ P \mid P \in \pi(v_1, v_n) \text{ with } P \subseteq \mathrm{pos}(Y) \}$ and
$F_f := \{ P \mid P \text{ is a 2-simple path from } v_1 \text{ to } v_n \text{ with } f \in P \}$ for all $f \in E$. Then we define
$$L_1(D(Y)) := \max\{ \min\{ \mathrm{cost}(P) \mid P \in H \},\ \max_{f \in \mathrm{req}(Y)} \min\{ \mathrm{cost}(P) \mid P \in F_f \} \}.$$
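Lemma 2.3 $L_1$ is a shortest path relaxation, i.e., it holds that
$$L_1(D(Y)) \le \min\{ \mathrm{cost}(P) \mid P \in \pi(v_1, v_n),\ \mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y) \}.$$
Proof: Let $P \in \pi(v_1, v_n)$ denote the shortest path in $G$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Obviously, it
holds that $P \in H$ and $P \in F_f$ for all $f \in \mathrm{req}(Y)$. Therefore, $L_1(D(Y)) \le \mathrm{cost}(P)$.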
The big advantage of the relaxation above is that it can be checked for consistency very
easily, as we shall see below. Note, however, that $L_1$ does not require that the 2-admissible paths
must visit all edges in $\mathrm{req}(Y)$ simultaneously. Of course, this weakens the relaxation. We can
reduce the negative effects by improving the probability that a 2-admissible path visits the edges
in $\mathrm{req}(Y)$: we set $c_{ij} := 0$ for all $(i, j) \in \mathrm{req}(Y)$ and subtract $\mathrm{cost}(\mathrm{req}(Y))$ from $B$.
According to the definition, a shortest path constraint is relaxed $L_1$-consistent, iff
1. for all $f \in \mathrm{pos}(Y)$, there exists a 2-admissible path $P \in F_f$, and
2. for all $f \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists an admissible path $P \in H$ with $f \notin P$.
In the following two sections, we show how relaxed L1-consistency can be achieved efficiently.
2.2.3.1 Removing Edges from the Possible Set
In order to check whether there exists a 2-admissible path in $G$ that visits an edge $(i,j) \in E$, we can use the same idea as in the previous section on shortest paths in DAGs. Obviously, the shortest 2-simple path from $v_1$ to $v_n$ that visits $(i,j)$ is either $(P(v_1,i), P(j,v_n))$ with costs $c(v_1,i) + c_{ij} + c(j,v_n)$ or $(P(v_1,j), P(i,v_n))$ with costs $c(v_1,j) + c_{ij} + c(i,v_n)$. Therefore, to check whether an edge has to be removed from $\mathrm{pos}(Y)$ with respect to the relaxation $L_1$, it is sufficient to know the shortest path distances from the source and to the sink of all nodes. Both values can be computed for all nodes by only two shortest path computations in $G$ in time $O(m + n\log n)$ by using Dijkstra's algorithm in combination with Fibonacci heaps [86]. In a RAM model, shortest paths on undirected graphs can be computed in time $O(m + n)$ when using the algorithm of Thorup (see [207] and the recent extension of Pettie and Ramachandran in [166]). Thus, the set of edges that has to be removed from $\mathrm{pos}(Y)$ to achieve relaxed $L_1$-consistency can be computed in time $O(m + n\log n)$, and in time $O(m + n)$ on a RAM.
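The edge-removal test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `shortest_distances` and `filter_possible_edges` are hypothetical, nodes are numbered `0..n-1`, edges are given as `(i, j, weight)` triples, and a binary heap (`heapq`) replaces the Fibonacci heap, so the running time is $O(m \log n)$ rather than $O(m + n\log n)$.

```python
import heapq

def shortest_distances(n, adj, source):
    """Dijkstra with a binary heap: distance from `source` to every node."""
    dist = [float("inf")] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def filter_possible_edges(n, edges, v1, vn, B):
    """Keep the undirected edges that lie on some 2-simple path of cost < B."""
    adj = [[] for _ in range(n)]
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    from_source = shortest_distances(n, adj, v1)  # c(v1, i) for all i
    to_sink = shortest_distances(n, adj, vn)      # c(i, vn) for all i (graph is undirected)
    kept = []
    for i, j, w in edges:
        # The edge may be traversed from i to j or from j to i.
        best = min(from_source[i] + w + to_sink[j],
                   from_source[j] + w + to_sink[i])
        if best < B:
            kept.append((i, j, w))
    return kept
```

For example, on the edges `[(0, 1, 1), (1, 2, 1), (1, 3, 5)]` with source 0 and sink 2, the dead-end edge `(1, 3, 5)` lies only on the 2-simple path 0-1-3-1-2 of cost 12, so it survives for $B = 13$ but is removed for $B = 4$.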
2.2.3.2 Adding Edges to the Required Set
After having removed all edges from $G$ that cannot be part of any 2-admissible path, the edges that must be visited by all such paths can be characterized by the following
Theorem 2.2 Assume that all edges in $G$ are part of at least one 2-admissible path. Then an edge $(r,s) \in E$ must be visited by all admissible paths, iff $(r,s) \in P(v_1,v_n)$, and $(r,s)$ is a bridge in $G$.
Before we can prove the above theorem, we need to show the following two lemmas first:
24
Chapter 2. Optimization Constraints
Fig. 2.3: The figure schematically shows an edge $(k,l) \in E$ that must exist according to Lemma 2.4. Solid lines mark edges in $E$, and dashed lines mark parts of the shortest path between $v_1$ and $v_n$. The dotted line between $l$ and $v_n$ indicates that there exists a path between the two nodes that does not visit the edge $(r,s)$. The alternating lines and dots between $l$ and $r$ indicate that the shortest path from $l$ to $v_n$ visits node $r$. The numbers on top of the nodes give their corresponding DFS numbers, and triangles mark DFS sub-trees.
Lemma 2.4 (Compare with Figure 2.3.) Assume that all edges in $G$ are part of at least one 2-admissible path. Let $(r,s) \in E$ denote an edge that must be visited by all admissible paths and that can be removed from $G$ without disconnecting $v_1$ and $v_n$. Then there exists an edge $(k,l) \in E$ such that
1. $\exists P \in \pi(v_1,v_n)$: $(k,l) \in P$ and $(r,s) \notin P$,
2. $k$ is a shortest path predecessor of $r$, and
3. $(r,s) \in P(l,v_n)$.
Proof: Assume we compute a shortest path $P = (i_1, \dots, i_h) \in \pi(v_1,v_n)$. Then $i_1 = v_1$, $i_h = v_n$ and $i_f = r$, $i_{f+1} = s$ for some $1 \le f < h$. Next, we change the graph representation of $G$ such that $(i_g, i_{g+1})$ is the first outgoing edge of node $i_g$ for all $1 \le g < h$. For all nodes $j \in V$, let $d_j \in \{1, \dots, n\}$ denote the ordering in which the nodes are first visited by a depth first search using the modified graph representation of $G$. Then $d_{i_g} = g$ for all $1 \le g \le h$. Since the removal of $(r,s)$ does not disconnect $v_1$ and $v_n$, there exists a forward edge $(k,l) \in E$ with $d_k \le f$ and $d_l \ge f+1$. This implies Statements 1 and 2.
It remains to show that $(r,s) \in P(l,v_n)$. By assumption, there exists a 2-admissible path $R$ through edge $(k,l)$. There are two possibilities: either $R$ visits node $k$ or node $l$ first, which corresponds to:
a) $c(v_1,k) + c_{kl} + c(l,v_n) < B$, or
b) $R$ visits $l$ before $k$ and $c(v_1,l) + c_{kl} + c(k,v_n) < B$.
In the first case, since $(r,s) \notin P(v_1,k)$ and $(r,s)$ must be visited by all admissible paths, it holds that $(r,s) \in P(l,v_n)$, and we are done.
Fig. 2.4: The figure schematically shows an edge $(i,j) \in E$ that must exist according to Lemma 2.5. Solid lines mark edges in $E$, and dashed lines mark parts of the shortest path between $v_1$ and $v_n$. Alternating lines and dots indicate parts of the shortest path from $v_1$ to a node, and dotted lines indicate parts of the shortest path from a node to $v_n$. The proof of Theorem 2.2 shows that the path $(P(v_1,r), P(r,i), P(j,s), P(s,v_n))$ is 2-admissible and does not visit the edge $(r,s)$.
Let us consider the second case. Let $Q \in \pi(v_1,l)$ denote a shortest path from $v_1$ to $l$ with $(r,s) \notin Q$. Without loss of generality, we may assume that $k$ and $l$ are chosen such that $(k,l) \in Q$. We observe that $(r,s) \in P(v_1,l)$, because otherwise this implies that $(k,l) \in Q = P(v_1,l)$ and the 2-admissible path visits node $k$ before node $l$. Now, since $k$ is a shortest path predecessor of $r$ and $(r,s) \in P(v_1,l)$, it holds that $k \in P(v_1,l)$. Then,
$$c(v_1,k) + c_{kl} + c(l,v_n) \le c(v_1,k) + c_{kl} + c(l,k) + c(k,v_n) = c(v_1,k) + c(k,l) + c_{kl} + c(k,v_n) = c(v_1,l) + c_{kl} + c(k,v_n) < B,$$
which reduces this case to (a).
Lemma 2.5 (Compare with Figure 2.4.) Assume that all edges in $G$ are part of at least one 2-admissible path. Let $(r,s) \in E$ denote an edge that must be visited by all admissible paths and that can be removed from $G$ without disconnecting $v_1$ and $v_n$. Then there exists an edge $(i,j) \in E$ such that
1. $(r,s) \in P(i,v_n)$ and $(r,s) \notin P(j,v_n)$, and
2. $(r,s) \notin P(v_1,i)$ and $(r,s) \in P(v_1,j)$.
Proof: Denote an edge as in Lemma 2.4 by $(k,l) \in E$. Then there exists a path $P \in \pi(l,v_n)$ with $(r,s) \notin P$, and we may assume $(r,s) \in P(l,v_n)$.
1. Since $(r,s) \notin P(v_n,v_n)$, there exists an edge $(i,j) \in P$ such that $(r,s) \in P(i,v_n)$ and $(r,s) \notin P(j,v_n)$.
2. By assumption, there exists a 2-admissible path that visits $j$. Since $(r,s) \notin P(j,v_n)$, it follows that $(r,s) \in P(v_1,j)$, because $(r,s)$ must be visited by all admissible paths. Finally, assume that $(r,s) \in P(v_1,i)$. Then the shortest path visiting node $i$ has costs
$$c(v_1,r) + c_{rs} + c(s,i) + c(i,r) + c_{rs} + c(s,v_n).$$
However, the path from $v_1$ via $r$, $i$ and $s$ to $v_n$ has costs
$$c(v_1,r) + c(r,i) + c(i,s) + c(s,v_n),$$
which is lower or equal to the cost of the shortest path visiting $i$. This implies that it is also a shortest path visiting node $i$. It does not, however, visit some edges with zero costs. Particularly, it does not visit the edge $(r,s)$. Therefore, we may assume that $(r,s) \notin P(v_1,i)$.
Now, we have everything at hand to give the previously postponed
Proof of Theorem 2.2:
Let $(r,s)$ be a bridge on the shortest path $P \in \pi(v_1,v_n)$. Then the removal of $(r,s)$ disconnects the graph $G$. Since the node pairs $(v_1,r)$ and $(s,v_n)$ are still connected, the removal of $(r,s)$ also disconnects $v_1$ and $v_n$. Thus, for all $P \in \pi(v_1,v_n)$, it holds that $(r,s) \in P$. Therefore, also all admissible paths must visit $(r,s)$.
Obviously, if there exists any admissible path, then $P(v_1,v_n)$ is admissible, too. Thus, $(r,s) \in P(v_1,v_n)$. Now assume that the removal of $(r,s)$ does not disconnect $v_1$ and $v_n$. Then, according to Lemma 2.5, there exists an edge $(i,j) \in E$ such that $(r,s) \in P(i,v_n)$, $(r,s) \notin P(j,v_n)$, $(r,s) \notin P(v_1,i)$ and $(r,s) \in P(v_1,j)$. By assumption, there exists a 2-admissible path $R$ visiting $(i,j)$. Without loss of generality, we may assume that $R$ visits node $i$ before node $j$, because
$$c(v_1,j) + c_{ij} + c(i,v_n) = c(v_1,r) + c_{rs} + c(s,j) + c_{ij} + c(i,r) + c_{rs} + c(s,v_n) \ge c(v_1,r) + c(r,i) + c_{ij} + c(j,s) + c(s,v_n) \ge c(v_1,i) + c_{ij} + c(j,v_n).$$
However, this implies that $(r,s) \notin R$, which is a contradiction to the assumption that every admissible path must visit $(r,s)$.
Using Theorem 2.2, after having removed all edges that cannot be part of any 2-admissible path, we can compute all edges that must be visited by all admissible paths in time $O(m + n)$: first, we compute a shortest path $P \in \pi(v_1,v_n)$ and mark all edges on this path. Then we compute all bridges in $G$ and check which ones are visited by $P$.
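The two steps just described (mark a shortest path, then intersect it with the bridges of $G$) might look as follows in Python. This is a sketch under assumptions: the names `find_bridges` and `required_edges` are hypothetical, edges are `(i, j, weight)` triples identified by their list index, the graph is assumed to have been filtered already (every edge lies on a 2-admissible path, as Theorem 2.2 requires), and $v_n$ is assumed reachable from $v_1$. Bridges are found with the standard low-link depth first search.

```python
import heapq

def find_bridges(n, adj):
    """Return the set of edge ids that are bridges; adj[u] = [(v, edge_id), ...]."""
    disc = [-1] * n   # DFS discovery times
    low = [0] * n     # lowest discovery time reachable via one back edge
    bridges = set()
    timer = 0
    for root in range(n):
        if disc[root] != -1:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            u, parent_edge, neighbours = stack[-1]
            descended = False
            for v, eid in neighbours:
                if eid == parent_edge:
                    continue  # do not walk back over the tree edge
                if disc[v] == -1:
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, eid, iter(adj[v])))
                    descended = True
                    break
                low[u] = min(low[u], disc[v])
            if not descended:
                stack.pop()
                if stack:
                    p = stack[-1][0]
                    low[p] = min(low[p], low[u])
                    if low[u] > disc[p]:
                        bridges.add(parent_edge)
    return bridges

def required_edges(n, edges, v1, vn):
    """Edge ids on a shortest v1-vn path that are also bridges of the graph."""
    adj = [[] for _ in range(n)]
    for eid, (i, j, _) in enumerate(edges):
        adj[i].append((j, eid))
        adj[j].append((i, eid))
    dist = [float("inf")] * n
    pred_edge = [None] * n
    dist[v1] = 0
    heap = [(0, v1)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, eid in adj[u]:
            w = edges[eid][2]
            if d + w < dist[v]:
                dist[v] = d + w
                pred_edge[v] = eid
                heapq.heappush(heap, (d + w, v))
    on_path = set()
    v = vn
    while v != v1:  # walk the predecessor edges back to the source
        eid = pred_edge[v]
        on_path.add(eid)
        i, j, _ = edges[eid]
        v = i if j == v else j
    return sorted(on_path & find_bridges(n, adj))
```

On the graph with edges `[(0,1,1), (1,2,1), (1,3,1), (3,2,1), (2,4,1)]`, source 0 and sink 4, the shortest path uses edges 0, 1 and 4, but edge 1 lies on the cycle 1-2-3 and is therefore not a bridge.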
Fig. 2.5: A directed graph with non-negative arc weights. Assume we are given an upper bound $B = 8$. All arcs in the graph are part of an admissible path with costs lower than $B$, and every admissible path with costs lower than $B$ must visit the arc $(1,2)$. However, there exists a path $(v_1, 3, v_4)$ that does not visit this arc.
The following theorem summarizes the results in the previous two sub-sections:
Theorem 2.3 On undirected graphs with non-negative edge weights, relaxed $L_1$-consistency of a shortest path constraint can be achieved in time $O(m + n\log n)$, and in time $O(m + n)$ on a RAM.
2.2.4 Shortest Path Problems on Directed Graphs
To complete our discussion on cost-based filtering for shortest path constraints, we finish with
some results on shortest paths in general directed networks. We start by considering directed
graphs with non-negative arc weights. In the end of this section, we will show how these results
can be exploited to cope with negative arc weights as well.
As has been stated in the introduction to this section, achieving arc-consistency for shortest
path constraints in general networks is NP-hard. Regarding the removal of arcs from the possible
set, relaxed L1-consistency on directed graphs with non-negative arc weights can be achieved in
the same way as on undirected graphs. However, with respect to arcs that must be visited by all
admissible paths, the situation is more complicated. Recall the result from Section 2.2.3: After
having removed the infeasible edges, in undirected graphs, the edges that are required are exactly the ones on the shortest path that must be visited by all paths from $v_1$ to $v_n$.
Unfortunately, this classification does not hold for directed graphs as the example in Figure 2.5 shows. Thus, for all arcs $(i,j) \in P(v_1,v_n)$, we have to re-compute the shortest path value when removing $(i,j)$ from $E$, which may require $n - 1$ shortest path computations in the worst case.
Theorem 2.4 On directed graphs with non-negative arc weights, relaxed $L_1$-consistency can be achieved in time $O(n(m + n\log n))$.
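The re-computation argument behind Theorem 2.4 can be illustrated with a short Python sketch. The function names `shortest_path` and `required_arcs_brute_force` are hypothetical, arcs are `(i, j, weight)` triples identified by their list index, arc weights are assumed non-negative, and a binary heap is used instead of a Fibonacci heap. An arc on the shortest path is reported as required iff the shortest path distance without that arc is at least $B$.

```python
import heapq

def shortest_path(n, arcs, v1, vn, skip=None):
    """Dijkstra on the digraph without arc index `skip`.

    Returns (distance, list of arc indices on a shortest v1-vn path)."""
    out = [[] for _ in range(n)]
    for a, (i, j, w) in enumerate(arcs):
        if a != skip:
            out[i].append((j, w, a))
    dist = [float("inf")] * n
    pred_arc = [None] * n
    dist[v1] = 0
    heap = [(0, v1)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w, a in out[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                pred_arc[v] = a
                heapq.heappush(heap, (d + w, v))
    if dist[vn] == float("inf"):
        return dist[vn], []
    path = []
    v = vn
    while v != v1:
        path.append(pred_arc[v])
        v = arcs[pred_arc[v]][0]
    return dist[vn], path[::-1]

def required_arcs_brute_force(n, arcs, v1, vn, B):
    """Arc indices that every path of cost < B must use; None if no such path exists."""
    base, path = shortest_path(n, arcs, v1, vn)
    if base >= B:
        return None  # the constraint is inconsistent
    # Only arcs on the shortest path can be required; re-solve once per such arc.
    return [a for a in path if shortest_path(n, arcs, v1, vn, skip=a)[0] >= B]
```

With at most $n - 1$ arcs on the shortest path, this performs at most $n$ shortest path computations, which matches the bound stated in the theorem up to the choice of heap.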
Since the computation time of the algorithm sketched previously may not be efficient enough to be of use when being applied in a tree search, in the following we consider another shortest path relaxation. Let $T \subseteq E$ denote a shortest path tree in $G$ rooted at $v_1$. Without loss of generality, we may assume that every node in $G$ can be reached from $v_1$, and thus that $T$ spans all nodes in $V$. Obviously, when $e \in E$ is removed from $T$, the nodes in $V$ are partitioned into two sets: the set $S_e \subseteq V$, $v_1 \in S_e$, of nodes that are still connected with $v_1$ in $T \setminus \{e\}$, and the complement of $S_e$ in $V$, $S_e^C$. Obviously, $S_e^C \neq \emptyset$ iff $e \in T$. We set
$$J := \{P \mid P \text{ is a 2-simple path from } v_1 \text{ to } v_n \text{ with } P \subseteq \mathrm{pos}(Y) \text{ or, if } e \in P \setminus \mathrm{pos}(Y), \text{ then there exists an arc } (i,j) \in P \setminus T \text{ such that } i \in S_e \text{ and } j \in S_e^C\}.$$
Moreover, we define
$$L_2(D(Y)) := \max\Big\{ \min\{\mathrm{cost}(P) \mid P \in J\},\ \max_{f \in \mathrm{req}(Y)} \min\{\mathrm{cost}(P) \mid P \in F_f\} \Big\}.$$
To understand the above shortest path relaxation better, we make the following observations:
Obviously, since $H \subseteq J$, $L_2$ is dominated by $L_1$, i.e., $L_2 \le L_1$. Therefore, $L_2$ is also a shortest path relaxation.
The difference between relaxations $L_1$ and $L_2$ only consists in the set $J$ that is used instead of $H$ to determine the arcs that have to be required to achieve a state of relaxed consistency. In contrast to $H$, the set $J$ also contains paths $P$ that are not simple (i.e., paths that may visit some nodes more than just once) and that may visit arcs $e \notin \mathrm{pos}(Y)$. However, if $e \in P \setminus \mathrm{pos}(Y)$, then we enforce that $P$ must also visit another arc $(i,j) \notin T$ that connects $S_e$ with $S_e^C$. This implies $e \in T$, as otherwise $S_e^C = \emptyset$. Moreover, it holds that $\mathrm{cost}(P) \ge \min\{c(v_1,i) + c_{ij} + c(j,v_n) \mid (i,j) \in (S_e \times S_e^C) \setminus T\}$.
Like $L_1$, $L_2$ also does not force the 2-admissible paths to visit the arcs in $\mathrm{req}(Y)$ simultaneously. Again we can improve the effectiveness of the filtering algorithm by setting $c_{ij} := 0$ for all $(i,j) \in \mathrm{req}(Y)$ and by subtracting $\mathrm{cost}(\mathrm{req}(Y))$ from $B$.
A shortest path constraint is relaxed $L_2$-consistent, iff
1. for all $f \in \mathrm{pos}(Y)$, there exists a 2-admissible path $P \in F_f$, and
2. for all $f \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists a 2-admissible path $P \in J$ with $f \notin P$, or there exists an arc $e \in P \setminus T$ such that $e \in S_f \times S_f^C$.
We have seen that the relaxation $L_2$ is dominated by $L_1$. Nevertheless, cost-based filtering that achieves relaxed $L_2$-consistency is still stronger than ordinary reduced-cost filtering (see Section 2.1.2):
Lemma 2.6 If a shortest path constraint is relaxed $L_2$-consistent, reduced-cost filtering is ineffective.
Proof: Let $(\mathrm{req}(Y), \mathrm{pos}(Y))$ such that the shortest path constraint is $L_2$-consistent. Furthermore, denote the reduced costs of $(i,j) \in \mathrm{pos}(Y)$ by $\bar{c}_{ij} \ge 0$.
By assumption, for all $(i,j) \in \mathrm{pos}(Y)$, it holds that there exists a 2-admissible path that visits $(i,j)$. Particularly, the shortest path in $F_{(i,j)}$ is 2-admissible, i.e., $c(v_1,i) + c_{ij} + c(j,v_n) < B$. Reduced-cost filtering removes an arc $(i,j) \in \mathrm{pos}(Y)$ from the possible set iff
$$c(v_1,v_n) + c_{ij} + c(v_1,i) - c(v_1,j) = c(v_1,v_n) + \bar{c}_{ij} \ge B.$$
However, since $c(v_1,j) + c(j,v_n) \ge c(v_1,v_n)$, it holds that
$$c(v_1,v_n) + c_{ij} + c(v_1,i) - c(v_1,j) \le c_{ij} + c(v_1,i) + c(j,v_n) < B.$$
Reduced-cost filtering adds $f = (i,j) \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$ to $\mathrm{req}(Y)$ iff
$$c(v_1,v_n) + \min\{\bar{c}_{gh} \mid (g,h) \in S_f \times S_f^C\} \ge B.$$
By assumption, for all $f = (i,j) \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists a 2-admissible path $P$ in $G$ such that either $(i,j) \notin P$ or there exists an arc $(r,s) \in P \setminus T$ such that $(r,s) \in S_f \times S_f^C$:
a) Let $f \notin P$. Since $P$ is 2-admissible, it implies that there exists an admissible path (that can be constructed by removing all loops in $P$) that does not visit $f$. Thus, $f$ need not be required.
b) Now, let $(r,s) \in P \setminus T$ such that $(r,s) \in S_f \times S_f^C$, and $(g,h) \in (S_f \times S_f^C) \setminus T$ the arc with minimal reduced costs $\bar{c}_{gh} \ge 0$. Then,
$$c(v_1,v_n) + \bar{c}_{gh} \le c(v_1,v_n) + \bar{c}_{rs} = c(v_1,v_n) + c_{rs} + c(v_1,r) - c(v_1,s) \le c_{rs} + c(v_1,r) + c(s,v_n) \le \mathrm{cost}(P) < B,$$
because $c(v_1,s) + c(s,v_n) \ge c(v_1,v_n)$.
Fig. 2.6: The figure schematically shows a shortest path tree $T$ rooted at $v_1$. Solid lines denote arcs in $G$, dashed lines mark parts of the shortest path $P(v_1,v_n)$ from $v_1$ to $v_n$. The triangles symbolize shortest path sub-trees. For an edge $e = (r,s) \in P(v_1,v_n)$, the nodes in $V$ are partitioned into two non-empty sets $S_e$ and $S_e^C$. If $e$ is removed from the graph, the shortest path from $v_1$ to $v_n$ must visit an edge $(i,j) \in (S_e \times S_e^C) \setminus T$.
2.2.4.1 Relaxed L2-Consistency
As relaxations $L_1$ and $L_2$ do not differ with respect to the definition of $F_f$, $f \in E$, to remove arcs from $\mathrm{pos}(Y)$ we can simply follow the procedure sketched in Section 2.2.3.
Regarding the identification of arcs that have to be added to $\mathrm{req}(Y)$ to achieve relaxed $L_2$-consistency, for all $e \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, we have to compute the cost of the shortest 2-simple path $P$ from $v_1$ to $v_n$ such that $e \notin P$ or such that there exists an edge $(i,j) \in P \setminus T$ with $(i,j) \in S_e \times S_e^C$, where $T$ is a shortest path tree in $G$ rooted at $v_1$.
First, we compute the shortest paths from $v_1$ in $G$ and from $v_n$ in the reverse of $G$ in time $O(m + n\log n)$. As a byproduct, we get $T$ and shortest path distances $c(v_1,i)$, $c(i,v_n)$ for all $i \in V$. If $c(v_1,v_n) \ge B$, the current choice point is inconsistent, and we can backtrack. Otherwise, candidates to be added to $\mathrm{req}(Y)$ are only the arcs $e \in P(v_1,v_n)$. Since $v_1 \in S_e$ and $v_n \in S_e^C$, the shortest 2-simple path $P$ from $v_1$ to $v_n$ with $e \notin P$ must contain an arc $(i,j) \in S_e \times S_e^C$. Moreover, since $T \cap (S_e \times S_e^C) = \{e\}$, we have that $(i,j) \notin T$ (see Figure 2.6). Therefore, it is sufficient to compute, for all $e \in P(v_1,v_n)$, the costs of the shortest 2-simple path $P$ from $v_1$ to $v_n$ that contains some $(i,j) \in (S_e \times S_e^C) \setminus T$.
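Let $P(v_1,v_n) = (r_1, r_2, \dots, r_h, r_{h+1})$, $h \in \mathbb{N}$, $r_1 = v_1$ and $r_{h+1} = v_n$, and denote the sequence of arcs that $P(v_1,v_n)$ visits by $(e_1, \dots, e_h)$, whereby $e_k = (r_k, r_{k+1})$ for all $1 \le k \le h$. Furthermore, for all $1 \le k \le h$, let $Q_k$ denote a shortest 2-simple path from $v_1$ to $v_n$ with $(i,j) \in Q_k$ for some $(i,j) \in (S_{e_k} \times S_{e_k}^C) \setminus T$. Then,
$$\mathrm{cost}(Q_k) = \min\{c(v_1,i) + c_{ij} + c(j,v_n) \mid (i,j) \in (S_{e_k} \times S_{e_k}^C) \setminus T\}.$$
A brute force approach requires time $\Theta(nm)$ to determine these values. However, we can do better when we compute the values $\mathrm{cost}(Q_k)$ for all $1 \le k \le h$ sequentially. Note that $S_{e_h}^C \subseteq \dots \subseteq S_{e_1}^C$.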
We keep the nodes $j$ in the current set $S_{e_k}^C$ in a min-heap, whereby the associated value of $j$ in the heap is defined as
$$x_j := \min\{c(v_1,i) + c_{ij} + c(j,v_n) \mid i \in S_{e_k} \text{ and } (i,j) \in E \setminus T\}.$$
Obviously, the smallest $x_j$ in the heap determines $\mathrm{cost}(Q_k)$. In the transition from one shortest-path arc $e_k$ to the next $e_{k+1}$, the nodes $i \in S_{e_k}^C \setminus S_{e_{k+1}}^C$ have to be removed from the heap, and the values $x_j$ must be updated. For each node $i \in S_{e_k}^C \setminus S_{e_{k+1}}^C$, we iterate over all outgoing arcs and perform a decrease-key on the adjacent nodes if necessary. Then $i$ is removed from the heap. Since every node in $V$ leaves the heap at most once and never re-enters it, for all $1 \le k \le h$, this procedure requires at most $m$ decrease-key operations and $n$ delete-min operations. Therefore, when using a Fibonacci heap, the values $\mathrm{cost}(Q_k)$ for all $1 \le k \le h$ can be determined in time $O(m + n\log n)$. Then $e_k$ is added to $\mathrm{req}(Y)$ iff $\mathrm{cost}(Q_k) \ge B$. It follows:
Theorem 2.5 On directed graphs with non-negative arc weights, relaxed $L_2$-consistency of a shortest path constraint can be achieved in time $O(m + n\log n)$.
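To make the quantities $\mathrm{cost}(Q_k)$ concrete, the following Python sketch evaluates the defining formula directly. It is deliberately the simple variant, not the heap-based procedure described above: every non-tree arc updates all cuts it crosses, which costs $O(mh)$ in the worst case. The names `dijkstra`, `l2_required_arcs` and `level_of` are hypothetical. The sketch assumes non-negative weights, no parallel arcs (so an arc $(i,j)$ is a tree arc iff $i$ is the predecessor of $j$), and nodes numbered `0..n-1`. It uses the observation that a node lies in $S_{e_k}^C$ iff its deepest ancestor on $P(v_1,v_n)$ in $T$ comes after $r_k$.

```python
import heapq

def dijkstra(n, adj, source):
    """Return (distances, predecessor nodes) from `source`."""
    dist = [float("inf")] * n
    pred = [None] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, pred

def l2_required_arcs(n, arcs, v1, vn, B):
    """Arcs e_k on the shortest path with cost(Q_k) >= B; None if inconsistent."""
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for i, j, w in arcs:
        fwd[i].append((j, w))
        rev[j].append((i, w))
    d_from, pred = dijkstra(n, fwd, v1)   # c(v1, i); pred encodes the tree T
    d_to, _ = dijkstra(n, rev, vn)        # c(i, vn), computed in the reverse graph
    if d_from[vn] >= B:
        return None
    path = [vn]
    while path[-1] != v1:
        path.append(pred[path[-1]])
    path.reverse()
    # level[v] = position on the path of the deepest path node above v in T.
    level = [None] * n
    for idx, r in enumerate(path):
        level[r] = idx

    def level_of(v):
        chain = []
        while level[v] is None:
            chain.append(v)
            v = pred[v]
        for u in chain:
            level[u] = level[v]
        return level[v]

    h = len(path) - 1
    cost = [float("inf")] * h  # cost[k] corresponds to the arc (path[k], path[k+1])
    for i, j, w in arcs:
        if d_from[i] == float("inf") or pred[j] == i:
            continue  # unreachable tail, or tree arc
        value = d_from[i] + w + d_to[j]
        # The arc crosses the cut of path arc k iff level(i) <= k < level(j).
        for k in range(level_of(i), level_of(j)):
            cost[k] = min(cost[k], value)
    return [(path[k], path[k + 1]) for k in range(h) if cost[k] >= B]
```

On the arcs `[(0,1,1), (1,2,1), (2,3,1), (0,2,4), (1,3,3)]` with source 0 and sink 3, the values are $\mathrm{cost}(Q_1) = 5$ and $\mathrm{cost}(Q_2) = \mathrm{cost}(Q_3) = 4$, so for $B = 5$ only the first path arc is reported.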
Finally, we consider the general case of directed graphs with integer arc weights that do not
contain negative weight cycles. On such graphs, the Bellman-Ford algorithm computes a single
source shortest path in time $O(nm)$. The shortest path distance from source to sink can be used to prune the search if that value exceeds the given bound $B$. However, for the purpose of cost-based filtering with respect to the relaxations $L_1$ or $L_2$, we need to compute the shortest path distances from the source and to the sink for all nodes.
Of course, we could apply the Bellman-Ford algorithm with $v_1$ as root in $G$ and $v_n$ in the reverse of $G$ to obtain these values. To achieve relaxed $L_1$-consistency, this procedure would require time $\Theta(n^2 m)$ in the worst case. We can do much better though, especially when taking into account that within a tree search, many similar Shortest Path Problems have to be solved.
We can speed up the computation by using node potentials $h_v$ for all $v \in V$. It is a well-known fact that the shortest path structure of a graph is maintained when the arc weights are changed to $\bar{c}_{ij} := c_{ij} + h_i - h_j$ [1]. We aim at finding node potentials $h$ such that $\bar{c} \ge 0$. Then, even after arcs have been removed from the graph or the shortest path is required to visit certain arcs, we can simply apply the algorithms that we developed for directed graphs with non-negative arc weights. The only necessary modification is to compute $c(i,j) = \bar{c}(i,j) - h_i + h_j$.
In order to compute the desired node potentials, we use a method that has been developed
for the computation of all pairs shortest paths by Johnson [43]: we add an artificial source node
$s$ and arcs $(s,i)$ for all $i \in V$, and we set $c_{si} := 0$. If the given graph does not contain negative weight cycles, the Bellman-Ford algorithm produces shortest path distances $c(s,i)$. For all arcs $(i,j) \in E$, we have that $c(s,j) \le c(s,i) + c_{ij}$. Thus, when setting $h_i := c(s,i)$ we get
$$\bar{c}_{ij} = c_{ij} + h_i - h_j = c_{ij} + c(s,i) - c(s,j) \ge 0.$$
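The computation of the potentials can be sketched as follows. The function name `johnson_potentials` is hypothetical, and arcs are assumed to be `(i, j, weight)` triples over nodes `0..n-1`. The artificial source is not materialized: starting every label at 0 is equivalent to having relaxed the zero-cost arcs $(s,i)$ once.

```python
def johnson_potentials(n, arcs):
    """Bellman-Ford from an implicit artificial source; returns h with h[i] = c(s, i).

    Raises ValueError if the graph contains a negative weight cycle."""
    h = [0] * n  # distance labels after relaxing the arcs (s, i) of cost 0
    for _ in range(n):
        changed = False
        for i, j, w in arcs:
            if h[i] + w < h[j]:
                h[j] = h[i] + w
                changed = True
        if not changed:
            return h
    # Labels still change after n rounds: some cycle has negative weight.
    raise ValueError("graph contains a negative weight cycle")

def reduced_costs(arcs, h):
    """Re-weight every arc to c_ij + h_i - h_j, which is non-negative."""
    return [(i, j, w + h[i] - h[j]) for i, j, w in arcs]
```

For the arcs `[(0,1,4), (1,2,-2), (0,2,3), (2,3,1)]` the potentials come out as `[0, 0, -2, -1]`, and the re-weighted arc costs are 4, 0, 5 and 0. A shortest path distance computed on the re-weighted graph is converted back with $c(i,j) = \bar{c}(i,j) - h_i + h_j$.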
Degree of Consistency
Graph Type | ArcCon | L1 | L2 | RedCost
DAG | $O(m + n)$ | | |
undirected, $c \ge 0$ | NP-hard | $O(m + n\log n)$, RAM: $O(m + n)$ | |
directed, $c \ge 0$ | NP-hard | $O(n(m + n\log n))$ | $O(m + n\log n)$ |
directed, no negative cycles | NP-hard | $O(n(m + n\log n))$ | $O(nm)$, amort. $\Omega(n)$: $O(m + n\log n)$ |
Tab. 2.1: The table gives an overview of the findings in this section.
The following two theorems follow directly from the discussion:
Theorem 2.6 On directed graphs, relaxed $L_1$-consistency of a shortest path constraint can be achieved in time $O(n(m + n\log n))$.
Theorem 2.7 On directed graphs, relaxed $L_2$-consistency of a shortest path constraint can be achieved in time $O(nm)$. Within a tree search, relaxed $L_2$-consistency can be achieved in amortized time $O(m + n\log n)$ for $\Omega(n)$ calls of the filtering procedure.
2.2.5 Summary
Before we proceed, we summarize the results that we achieved in this section (see Table 2.1): On
DAGs, arc-consistency for a shortest path constraint can be achieved in linear time by exploiting
topological orderings. On general directed and on undirected graphs, achieving arc-consistency
is an NP-hard task. We developed two shortest path relaxations $L_1$ and $L_2$, both based on the class of 2-simple paths. We showed that $L_1$ dominates $L_2$, and that cost-based filtering based on $L_2$ is superior to reduced-cost filtering. On undirected graphs with non-negative edge weights, relaxed $L_1$-consistency (and therefore also relaxed $L_2$-consistency) can be achieved in time $O(m + n\log n)$ and in time $O(m + n)$ on a RAM. On directed graphs with non-negative arc weights, relaxed $L_1$-consistency can be obtained in time $O(n(m + n\log n))$, and a state of relaxed $L_2$-consistency can be achieved in time $O(m + n\log n)$. Finally, in the presence of negative arc weights, we use the Bellman-Ford algorithm just once for the computation of node potentials that allow us to solve the Shortest Path Problems on graphs with non-negative arc weights. Therefore, we achieve relaxed $L_1$-consistency in time $O(n(m + n\log n))$, and $L_2$-consistency in time $O(nm)$ or, amortized, in time $O(m + n\log n)$ for $\Omega(n)$ calls of the filtering algorithm.
2.3 Weighted Stable Set Constraints on Interval Graphs
Real-world scheduling problems often require the selection of temporally non-overlapping tasks,
as one machine, processor or person can only work on one job at a time. E.g., if one wants to
record movies from TV, no two temporally overlapping broadcasts can be taped. Thus, when
given a set of weighted tasks with starting and ending times, we try to find a selection of non-overlapping tasks such that their weighted sum is minimized [43] (see footnote 2).
Frequently, the problem arises only as a relaxation or sub-problem of a real-world application. For instance, in a realistic model of the above TV recording example, the storage capacity of the recording device is limited (see Chapter 6). The problem can then be viewed as an augmented Knapsack Problem, which is NP-hard. Exact algorithms to compute and prove an optimal solution for such problems are often based on enumeration approaches. The tightening of sub-problems can help greatly to improve the performance of a tree search approach. Therefore, in this section we develop an efficient cost-based filtering algorithm that exploits the special structure of non-overlapping constraints.
During a tree search, many similar problem instances have to be solved, whereby the instances from one iteration to the other only differ with respect to necessarily included and excluded tasks and, as we shall see later, possibly changes in the objective. Therefore, the development of an incremental algorithm is desirable [8, 54, 177, 178]. The algorithm we develop works in two phases: a preprocessing phase using time $\Theta(n\log n)$, and an optimization and filtering phase using linear time. The data structure established in the first phase is independent of the objective function and can be adapted in linear time to reflect decisions on necessarily included and excluded tasks. Thus, we achieve an amortized linear time algorithm for $\Omega(\log n)$ incremental calls with changing variable domains and different objectives.
Repeated computations with changing objective functions are important when solving Lagrangian relaxations, for example. In Section 3.2, we develop a method to link linear optimization constraints that is based on Lagrangian relaxation, and that makes use of dual and reduced-cost
information while solving the Lagrangian dual. Therefore, as a major objective in this section,
we develop an efficient algorithm that, on top of an optimal selection of non-overlapping tasks,
provides dual and reduced-cost data as a byproduct.
The work presented in this section was published in [189]. It is structured as follows: In
Section 2.3.1, we define the weighted stable set constraint formally. Then, in Section 2.3.2, we
develop an algorithm based on mathematical programming that computes minimum weighted
stable sets on interval graphs and provides dual and reduced-costs information as a byproduct.
Finally, in Section 2.3.3, we give a cost-based filtering algorithm for weighted stable set constraints on interval graphs.
Footnote 2: In contrast to the common definition of weighted stable set problems, we state the problem not as a maximization but as a minimization problem. We do this because the latter problem has a one-to-one correspondence with a shortest-path problem in the complement graph that we will use for filtering purposes later. Note that we allow negative node weights such that a maximization problem can easily be transformed into a minimization instance.
2.3.1 The Weighted Stable Set Constraint
A natural way of modeling the problem of finding a selection of non-overlapping tasks is to
consider an interval graph [99]: the tasks are represented by the nodes, and an edge connects
two nodes iff the corresponding tasks are in conflict, i.e., iff the corresponding intervals are
overlapping.
Definition 2.6 A graph $G = (V, E)$ is called an interval graph iff there exist intervals $I_1, \dots, I_{|V|}$ such that
$$\forall v_i, v_j \in V: (v_i, v_j) \in E \Leftrightarrow I_i \cap I_j \neq \emptyset.$$
Then the problem consists in finding a minimum weighted stable set (WSSP) in an interval graph.
We generalize the use of conflict graphs and define:
Definition 2.7 Given an undirected graph $G = (V, E)$, $n = |V|$, $V = \{1, \dots, n\}$, denote the node weights in $G$ by $c = (c_1, \dots, c_n)$, and let $B$ denote a bound on the objective. Let $X_i \in \{0, 1\}$ denote binary variables for all $1 \le i \le n$. Then a weighted stable set constraint has variables $X_1, \dots, X_n$, and an instantiation $X_i = x_i$ for all $1 \le i \le n$ is true, iff
for all $1 \le i < j \le n$, it holds that $x_i = 1 = x_j$ implies $(i,j) \notin E$, and
$\sum_{i:\, x_i = 1} c_i < B$.
Obviously, the weighted stable set constraint is a minimization constraint. On general graphs, the computation of a minimum stable set is an NP-hard task. Therefore, achieving arc-consistency for the weighted stable set constraint on general graphs is also NP-hard. However, on interval graphs minimum weighted stable sets can be computed in time $O(n\log n)$ [115]. The existing algorithms for the WSSP on interval graphs are based on sweep line or dynamic programming approaches and neither provide dual values and reduced-cost information, nor do they suggest how cost-based filtering could be performed efficiently.
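For orientation, a dynamic programming approach of the kind mentioned above can be sketched as follows. This is the classical recurrence over intervals sorted by ending time, not the mathematical programming method developed in this section, and it yields neither dual values nor reduced costs. The function name `min_weight_stable_set` is hypothetical, and the intervals are assumed to be closed, so two intervals that merely touch are in conflict.

```python
from bisect import bisect_left

def min_weight_stable_set(intervals, weights):
    """Minimum total weight of pairwise non-overlapping closed intervals.

    intervals: list of (start, end) pairs; weights may be negative.
    The empty selection is allowed, so the result is never positive."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    ends = [intervals[i][1] for i in order]
    # opt[k] = best value using only the first k intervals in end-time order
    opt = [0]
    for i in order:
        start = intervals[i][0]
        # number of intervals ending strictly before this one starts
        compatible = bisect_left(ends, start)
        opt.append(min(opt[-1], weights[i] + opt[compatible]))
    return opt[-1]
```

For the intervals `[(0, 3), (2, 5), (4, 7)]` with weights `[-2, -3, -2]`, the first and third interval are compatible and together give the value -4, which beats taking the middle interval alone.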
We will show that a state of arc-consistency can be achieved in amortized linear time for $\Omega(\log n)$ incremental calls for weighted stable set constraints on interval graphs. In the following, we assume that we are given a number $n \in \mathbb{N}$, intervals $I_i = [\mathrm{start}(i), \mathrm{end}(i)]$ for all $1 \le i \le n$, task weights $c = (c_1, \dots, c_n)$ and an upper bound $B$ on the objective. We refer to the corresponding interval graph with $G = (V, E)$ whereby $V = \{1, \dots, n\}$ and $E = \{(i,j) \mid I_i \cap I_j \neq \emptyset\}$.
2.3.2 A Mathematical Programming Approach
We present an algorithm based on mathematical programming for the Minimum Weighted Stable
Set Problem on interval graphs that provides us with dual and reduced-cost information as a byproduct, and that will be extended to an efficient cost-based filtering for the problem later.
Obviously, the following Integer Program solves the WSSP on interval graphs:
$$\begin{array}{lll} \text{Minimize} & (IP_1) & \sum_{1 \le i \le n} c_i x_i \\ \text{subject to} & & x_i + x_j \le 1 \quad \forall\, 1 \le i < j \le n,\ I_i \cap I_j \neq \emptyset \\ & & x \in \{0,1\}^n \end{array}$$
In this formulation, an LP relaxation does not necessarily yield an integer solution. However, we can tighten the problem formulation such that every LP-solution is already integer. To achieve that formulation, we introduce a few more definitions:
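Definition 2.8 A set $C \subseteq V$ is called a conflict clique, iff $I_i \cap I_j \neq \emptyset$ $\forall i, j \in C$. A conflict clique $C$ is called maximal, iff $\forall D \subseteq V$, $D$ conflict clique: $C \subseteq D \Rightarrow C = D$. Let $M := \{C_1, \dots, C_m\} \subseteq 2^V$ denote the set of maximal conflict cliques in $G$. We define the function max_start on $M$ by $\text{max\_start}(C) := \max_{i \in C} \mathrm{start}(i)$.
Remark 2.1 $\bigcup_{1 \le p \le m} C_p = V$, because $\{i\}$ is a conflict clique $\forall i \in V$. Thus, there exists a maximal conflict clique $C_p$, $1 \le p \le m$, such that $i \in C_p$.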
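Lemma 2.7 The function max_start is injective.
Proof: Assume $\text{max\_start}(C_p) = \text{max\_start}(C_q)$ for some $1 \le p, q \le m$. Then there exist nodes $s_p \in C_p$ and $s_q \in C_q$ such that
$$\mathrm{start}(s_p) = \text{max\_start}(C_p) = \text{max\_start}(C_q) = \mathrm{start}(s_q),$$
and $\forall i \in C_p$, $j \in C_q$:
$$\mathrm{end}(i) \ge \mathrm{start}(s_p) = \mathrm{start}(s_q) \ge \mathrm{start}(j) \quad\text{and}\quad \mathrm{start}(i) \le \mathrm{start}(s_p) = \mathrm{start}(s_q) \le \mathrm{end}(j).$$
Thus, all nodes in $C_p$ and $C_q$ are pairwise overlapping. Therefore, $C_p \cup C_q$ is a conflict clique. As $C_p$ and $C_q$ are maximal, we have $C_p = C_p \cup C_q = C_q$. However, as $p \neq q$ implies $C_p \neq C_q$, we have $p = q$.
Thus, without loss of generality, in the following we assume that the conflict cliques are ordered with respect to max_start, i.e.
$$\text{max\_start}(C_p) < \text{max\_start}(C_q) \quad \forall\, 1 \le p < q \le m.$$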
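Lemma 2.8 Let $1 \le p < r \le m$ and $i \in C_p \cap C_r$. Then $i \in C_q$ $\forall\, p \le q \le r$.
Proof: Let $s_p \in C_p$, $s_q \in C_q$, $s_r \in C_r$, such that $\mathrm{start}(s_t) = \text{max\_start}(C_t)$ $\forall t \in \{p, q, r\}$. Furthermore, let $j \in C_q$. Then,
$$\mathrm{end}(i) \ge \mathrm{start}(s_r) \ge \mathrm{start}(s_q) \ge \mathrm{start}(j) \quad\text{and}\quad \mathrm{start}(i) \le \mathrm{start}(s_p) \le \mathrm{start}(s_q) \le \mathrm{end}(j).$$
Therefore, $C_q \cup \{i\}$ is a conflict clique. As $C_q$ is maximal, $C_q = C_q \cup \{i\}$, i.e. $i \in C_q$.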
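Corollary 2.2 $m \le n$.
Proof: Let $1 \le p < m$. $\exists i \in C_p$ such that $i \notin C_{p+1}$, as otherwise $C_p \subseteq C_{p+1}$, which contradicts the maximality of $C_p$ or $C_p \neq C_{p+1}$. Thus, with Lemma 2.8, we have $i \notin C_q$ $\forall\, p < q \le m$. Therefore, $|C_q| \le n + 1 - q$ $\forall\, 1 \le q \le m$, and thus $m \le n$.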
Definition 2.9 We set $R_p := C_p \setminus C_{p+1}$ $\forall\, 1 \le p < m$ and $R_m := C_m$, and call every such $R_p$ a (max_start) rest clique.
Remark 2.2 The rest cliques form a partition of $V$: Let $1 \le i \le n$. Remark 2.1 states that $i \in C_p$ for some $1 \le p \le m$. Let $q := \max\{p \mid 1 \le p \le m,\ i \in C_p\}$. Then, $i \in R_q$.
On the other hand, let $i \in R_p \cap R_q$ with $1 \le p < q \le m$. Then, $i \in C_p \cap C_q$. Therefore, with Lemma 2.8, we have $i \in C_{p+1}$, which is a contradiction to $i \in R_p$.
Let $C_1, \dots, C_m$ denote the maximal conflict cliques of $G$ ordered according to max_start, and consider the following integer program:
$$\begin{array}{lll} \text{Minimize} & (IP_2) & \sum_{1 \le i \le n} c_i x_i \\ \text{subject to} & & \sum_{i \in C_p} x_i \le 1 \quad \forall\, 1 \le p \le m \\ & & x \in \{0,1\}^n \end{array}$$
The maximal conflict clique restrictions imply that $x_i + x_j \le 1$ for all nodes $i, j \in V$ whose corresponding intervals overlap. On the other hand, if $x_i + x_j \le 1$ for all overlapping intervals $I_i$ and $I_j$, it is also true that $\sum_{i \in C_p} x_i \le 1$ $\forall\, 1 \le p \le m$. Thus, the above IP solves the WSSP on interval graphs.
In the following, by $A \in \{0,1\}^{m \times n}$ we denote the corresponding matrix to $IP_2$, i.e. $A = (a_{pi})_{1 \le p \le m,\ 1 \le i \le n}$ with $a_{pi} = 1$ iff $i \in C_p$.
Theorem 2.8 The corresponding matrix $A$ of $IP_2$ is an interval matrix.
Proof: We have to show that $a_{pi} = a_{ri} = 1$ implies that $a_{qi} = 1$ $\forall\, p \le q \le r$, $1 \le i \le n$. By the construction of $A$, this is equivalent to showing that $i \in C_p \cap C_r$ implies $i \in C_q$ $\forall\, p \le q \le r$. However, this is true according to Lemma 2.8.
Corollary 2.3 $IP_2$ is totally unimodular.
Proof: Interval matrices are totally unimodular [158].
Corollary 2.3 now allows us to solve the WSSP on interval graphs as a linear program:
$$\begin{array}{lll} \text{Minimize} & (LP_3) & \sum_{1 \le i \le n} c_i x_i \\ \text{subject to} & & \sum_{i \in C_p} x_i \le 1 \quad \forall\, 1 \le p \le m \\ & & x \ge 0 \end{array}$$
Notice that, with Remark 2.1, the maximal conflict clique restrictions imply that $x \le 1$.
2.3.2.1 A Pivot Selection Strategy
We use the simplex method to solve $LP_3$. Let $R_1, \dots, R_m$ denote the (max_start) rest cliques of $G$. In iteration $1 \le t \le m$, we choose $q := t$ as pivot row and $j \in R_q$ with the smallest reduced costs as pivot column. If the reduced costs of $j$ are less than 0, we perform a pivot step. Otherwise we proceed with the next iteration immediately.
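The pivot rule can be traced on small instances with the following Python sketch. It is an illustration only: the function name `solve_lp3_by_pivoting` is hypothetical, the cliques are assumed to be given as sets of node indices `0..n-1`, already maximal and ordered by max_start, and the slack columns of the tableau are left implicit because the rule never selects them as pivot columns.

```python
def solve_lp3_by_pivoting(cliques, weights):
    """Run the m pivot iterations described above; returns (x, reduced costs)."""
    m, n = len(cliques), len(weights)
    A = [[1 if i in C else 0 for i in range(n)] for C in cliques]
    b = [1] * m
    c = list(weights)  # reduced costs; initially the objective coefficients
    rest = [set(cliques[p]) - set(cliques[p + 1]) for p in range(m - 1)]
    rest.append(set(cliques[-1]))
    basis = [None] * m  # structural column that is basic in each row, if any
    for q in range(m):
        j = min(rest[q], key=lambda i: c[i])
        if c[j] >= 0:
            continue  # no pivot step in this iteration
        assert A[q][j] == 1  # the pivot element is 1, so row q needs no scaling
        for p in range(m):
            factor = A[p][j]
            if p != q and factor != 0:
                A[p] = [a - factor * aq for a, aq in zip(A[p], A[q])]
                b[p] -= factor * b[q]
        factor = c[j]
        c = [ci - factor * aq for ci, aq in zip(c, A[q])]
        basis[q] = j
    x = [0] * n
    for q, j in enumerate(basis):
        if j is not None:
            x[j] = b[q]
    return x, c
```

For the cliques `[{0, 1}, {0, 2}]` with weights `[-5, -3, -1]`, the first iteration brings node 1 into the basis, and the second pivots on node 0 in row 2, which drives the right hand side of row 1 to 0. The final solution is `x = [1, 0, 0]` with non-negative reduced costs `[0, 0, 1]`.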
Theorem 2.9 After $m$ such iterations, the simplex tableau is primal and dual feasible.
The proof of Theorem 2.9 will be given later in this section. In the following, we refer to the simplex tableau with the following identifiers: elements of the matrix $A^t \in \{-1, 0, 1\}^{m \times n}$ after $0 \le t \le m$ simplex iterations are denoted by $(a^t_{pi})_{1 \le p \le m,\ 1 \le i \le n}$, entries in the right hand side $b^t$ are referred to by $(b^t_p)_{1 \le p \le m}$, and the reduced costs $c^t$ are denoted by $(c^t_i)_{1 \le i \le n}$.
First, we prove that our pivot selection preserves primal feasibility. We observe that $x = 0$ is primal feasible as $b^0_p = 1 \ge 0$ $\forall\, 1 \le p \le m$. To assure the maintenance of primal feasibility, we must show that $b^t \ge 0$ $\forall\, 0 \le t \le m$. To do so, we prove the following
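Lemma 2.9 Let $0 \le t \le m$, $1 \le p \le m$, $1 \le i \le n$. Then,
(a) $p > t$ implies that $a^t_{pi} = a^0_{pi}$ and $b^t_p = b^0_p = 1$.
(b) $b^t_p = 0$, $i \in \bigcup_{r > t} R_r$ implies $a^t_{pi} \in \{-1, 0\}$.
(c) $b^t_p = 1$, $i \in \bigcup_{r > t} R_r$ implies $a^t_{pi} \in \{0, 1\}$.
(d) $b^t_p \in \{0, 1\}$.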
Proof: We induce over $t$:
$t = 0$: $b^0_p = 1$ and $a^0_{pi} \in \{0, 1\}$ $\forall\, 1 \le p \le m$ and $1 \le i \le n$.
$t \to t+1$: Let $t < m$, and denote the pivot column in iteration $t+1$ by $j \in R_{t+1}$. If the reduced costs of column $j$ are greater or equal zero, then we are done since $b^{t+1} = b^t$ and $A^{t+1} = A^t$. Otherwise, we set $q := t+1$ and choose $a^t_{qj}$ as pivot element. By induction hypothesis (a), we know that $b^t_q = b^0_q = 1$, and that $a^t_{qj} = a^0_{qj}$. Now, since $j \in R_q \subseteq C_q$, $a^0_{qj} = 1$. Thus our pivot element is equal to 1.
(a) Therefore, $a^{t+1}_{qi} = a^t_{qi} = a^0_{qi}$ and $b^{t+1}_q = b^t_q = b^0_q = 1$ for all $1 \le i \le n$. Now let $t+1 < p \le m$. According to our pivot selection strategy,
$$j \in R_q = C_q \setminus C_{q+1} \Rightarrow j \notin C_{q+1}.$$
This, and the interval property of the matrix $A$ imply
$$a^t_{pj} = a^0_{pj} = 0 \text{ for } p = t+2, \text{ and thus for all } p > t+1.$$
Thus, in iteration $t+1$ the rows $p > t+1$ do not change, i.e.
$$a^{t+1}_{pj} = a^t_{pj} = a^0_{pj} \text{ and } b^{t+1}_p = b^t_p = b^0_p.$$
For $p \ge t+1$, (a) implies (b)–(d). Thus, in the following we assume $p \le t$. Since, for all $p \le t$ with $a^t_{pj} = 0$, row $p$ does not change in iteration $t+1$, we only need to consider $p \le t$ with $a^t_{pj} \neq 0$. Then, as the matrix $A$ is totally unimodular, it holds that $a^t_{pj} \in \{-1, 1\}$.
Finally, let i
r
tRrwith at
qi
a0
qi
0. Due to the interval matrix property of Aand a0
ri
1
for some r
t
1
q, we know then that a0
ri
0
r
q. Moreover, as all pivot elements up to
step twere chosen from rows lower than q, it holds that at
ri
a0
ri
0
1
, and bt
r
b0
r
1 for
all q
r
m. Therefore, we only need to consider at
qi
a0
qi
1.
(b–d) Let p
t,i
r
tRr,at
qi
1 and atpj
1
1
. Using induction hypothesis (d), we
know that btp
0
1
. First, assume that btp
0. By induction hypothesis (b), we know
that atpj
1 and atpi
1
0
. Thus,
bt
1
p
btp
1
1
0
1
and at
1
pi
atpi
1
0
1
.
Now let us assume btp
1. Then, by induction hypothesis (c), we know that atpj
1, and
atpi
0
1
. Thus,
bt
1
p
btp
1
0
0
1
and at
1
pi
atpi
1
1
0
.
Now we show that after $m$ iterations we achieve dual feasibility.
Lemma 2.10 Let $1 \le t \le m$. Then,
(a) $c^t_i \ge 0$ for all $i \in R_t$.
(b) If $t < m$, then $i \in \bigcup_{1 \le p \le t} R_p$ implies $c^{t+1}_i = c^t_i$.
Proof: (a) Let $j \in R_t$ denote the pivot column in iteration $t$. If $c^{t-1}_j \ge 0$ we are done, as then $c^t_i = c^{t-1}_i \ge c^{t-1}_j \ge 0$ for all $i \in R_t$. So let us assume $c^{t-1}_j < 0$. In Lemma 2.9, we have already shown that the pivot element is $a^{t-1}_{tj} = 1$, and
\[ a^{t-1}_{ti} = a^0_{ti} \quad \forall\, 1 \le i \le n. \]
In particular, we know that
\[ a^{t-1}_{ti} = a^0_{ti} = 1 \quad \forall\, i \in R_t \subseteq C_t. \]
We conclude that
\[ c^t_i = c^{t-1}_i - c^{t-1}_j \ge 0. \]
(b) Let $1 \le p \le t < m$ and $i \in R_p$. Lemma 2.9 states that $a^t_{ri} = a^0_{ri}$ for all $t < r \le m$. Then, due to the interval matrix property of $A$, and $i \in R_p \subseteq C_p \setminus C_{p+1}$, we know that $a^t_{t+1,i} = a^0_{t+1,i} = 0$. Therefore, $c^{t+1}_i = c^t_i$.
Corollary 2.4 After m iterations we achieve dual feasibility.
Proof: Let $1 \le i \le n$, and $1 \le t \le m$ such that $i \in R_t$. With Lemma 2.10, it holds that:
\[ 0 \le c^t_i = c^{t+1}_i = \dots = c^m_i. \]
Now, we give the previously postponed
Proof of Theorem 2.9: In Lemma 2.9 and Corollary 2.4, we have shown that after $m \le n$ iterations the simplex tableau is primal and dual feasible.
2.3.2.2 An Efficient Simplex Realization
We have shown how the WSSP on interval graphs can be stated as a totally unimodular LP.
Moreover, we have proven a feasible pivot selection strategy that yields an optimal tableau after
at most $n$ simplex iterations.
In the following, we develop an efficient $\Theta(n \log n)$-time algorithm to compute a set $Q \subseteq \{1, \dots, n\}$ with $I_i \cap I_j = \emptyset$ for all $i, j \in Q$, $i \ne j$, such that $\mathrm{cost}(Q) := \sum_{i \in Q} c_i$ is minimal. Most importantly, the algorithm also provides us with dual and reduced-cost information as a byproduct. To establish that algorithm, we show how the simplex computations according to the pivot strategy developed in the previous section can be performed efficiently.
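Theorem 2.10 Let $(j_1, \dots, j_m) \in R_1 \times \dots \times R_m$ denote the sequence of pivot columns according to Section 2.3.2.1, and let $d : \{1, \dots, m\} \to \{0, 1\}$ with $d(t) := 1$ iff $c^{t-1}_{j_t} < 0$ for all $1 \le t \le m$. Then the set
\[ Q := \{\, j_t \mid 1 \le t \le m \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_r} = \emptyset \ \ \forall\, t < r \le m \,\} \]
is a stable set in $V$ with minimal costs.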
Proof: If no pivoting is taking place ($d(t) = 0$ for all $1 \le t \le m$), the initial tableau is optimal with $x = 0$. Therefore, $Q = \emptyset$ is an optimal solution to the problem. So let us assume that $D := \{\, t \mid 1 \le t \le m \text{ and } d(t) = 1 \,\} \ne \emptyset$.
We proceed by induction on $m$:
$m = 1$: If $D \ne \emptyset$ and there exists only one row, exactly one pivot step is being carried out. Then $x_{j_1}$ is the only basic variable, and according to Lemma 2.9 it holds that: $x_{j_1} = b^1_1 = b^0_1 = 1$. Thus, $Q = \{j_1\}$ is an optimal solution.
$m \to m+1$: Now assume that there are $m+1$ maximal conflict cliques. We set $l := \max_{t \in D} t$. Again, by applying Lemma 2.9, we know that $b^{m+1}_l = b^l_l = b^0_l = 1$. Therefore, there exists an optimal solution
\[ Q \subseteq \{\, j_t \mid 1 \le t \le l \text{ with } d(t) = 1 \,\}, \text{ with } j_l \in Q. \]
Let $k := \min\{\, t \mid 1 \le t \le l,\ j_l \in C_t \text{ and } d(t) = 1 \,\}$. When setting $N := \bigcup_{k \le t \le m+1} C_t$ we know that $N \cap Q = \{j_l\}$. Thus, there exists an optimal solution $Q = \{j_l\} \cup S$, where $S \subseteq V \setminus N = \bigcup_{1 \le t < k} R_t$, and it holds that $I_i \cap I_j = \emptyset$ for all $i, j \in S$, $i \ne j$.
Since $\mathrm{cost}(Q) = c_{j_l} + \mathrm{cost}(S)$, we can construct an optimal solution by finding such a set $S$ with a minimal value $\mathrm{cost}(S)$. If $k = 1$, it holds that $S = \emptyset$, and we are done. Otherwise, we have to solve a WSSP on an interval graph with $k - 1 \le m$ maximal conflict cliques to find such a set $S$. By setting up the corresponding LP, we find that the sequence of pivot elements to solve this problem is exactly $(j_1, \dots, j_{k-1})$, and pivot steps are carried out for all $1 \le t < k$ with $d(t) = 1$.
We apply our induction hypothesis and achieve
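\[ S = \{\, j_t \mid 1 \le t < k \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_s} = \emptyset \ \ \forall\, t < s < k \,\}. \]
Thus, $Q = \{j_l\} \cup S = \{\, j_t \mid 1 \le t \le m+1 \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_s} = \emptyset \ \ \forall\, t < s \le m+1 \,\}$ is an optimal solution to the WSSP on interval graphs.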
The above Theorem 2.10 allows us to construct an optimal solution if we know the sequence of pivot elements: All we have to do is start with $Q = \emptyset$. Then we visit all pivot elements in the reverse order. An element $j_t$ is added to $Q$ iff $d(t) = 1$ and its corresponding interval does not overlap with any corresponding interval of an element in $Q$. This last check can be performed efficiently by maintaining the value $\min_{j \in Q} f_j$, whereby $f_i := \min\{\, p \mid 1 \le p \le m,\ i \in C_p \,\}$, $1 \le i \le n$, denotes the index of the first maximal conflict clique that node $i$ belongs to.
It remains to compute the sequence of pivot columns for which a pivot step is being carried
out. However, according to Section 2.3.2.1, this is an easy task if we can only determine the
reduced costs quickly:
Lemma 2.11 Let $1 \le t \le m$, $(j_1, \dots, j_m) \in R_1 \times \dots \times R_m$ be the sequence of pivot columns according to our pivot selection strategy, and let $d : \{1, \dots, m\} \to \{0, 1\}$ with $d(t) := 1$ iff $c^{t-1}_{j_t} < 0$. Furthermore, we set $q_t := t$. Let $z^t$ denote the objective function value after iteration $t$ ($z^0 := 0$), and $a^{t-1}_{q_t j_t} = 1$ be the pivot element in iteration $t$. Finally, for all $1 \le i \le n$, we set $g^t_i := z^{f_i - 1} - z^t$ if $f_i \le t$, and $g^t_i := 0$ otherwise. Then,
(a) $z^t = z^{t-1} + c^{t-1}_{j_t}\, d(t) \le z^{t-1}$.
(b) $z^t = \sum_{1 \le r \le t} c^{r-1}_{j_r}\, d(r)$.
(c) $c^t_i = c_i + g^t_i$ for all $i \in \bigcup_{t < p \le m} R_p$.
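Proof: (a) If no pivoting takes place in iteration $t$, then $c^{t-1}_{j_t} \ge 0$ and $d(t) = 0$, and $z^t = z^{t-1}$. Otherwise, $c^{t-1}_{j_t} < 0$ and $d(t) = 1$. According to Lemma 2.9, $a^{t-1}_{q_t j_t} = a^0_{q_t j_t} = 1$, and $b^{t-1}_{q_t} = b^0_{q_t} = 1$. Thus,
\[ z^t = z^{t-1} + c^{t-1}_{j_t}\, \frac{b^{t-1}_{q_t}}{a^{t-1}_{q_t j_t}} = z^{t-1} + c^{t-1}_{j_t} < z^{t-1}. \]
(b) With $z^0 = 0$, (b) is a simple implication of (a).
(c) Let $t < p \le m$ and $i \in R_p$. First, assume that $f_i > t$. Then, $a^0_{ri} = 0$ for all $1 \le r \le t$. As all pivot elements up to step $t$ were chosen from rows $1 \le r \le t$, we have that $c^t_i = c^0_i = c_i + g^t_i$.
So let $f_i \le t$. Using (b), we see that $g^t_i = z^{f_i - 1} - z^t = -\sum_{f_i \le r \le t} c^{r-1}_{j_r}\, d(r)$. As $a^0_{ri} = 0$ for all $1 \le r < f_i$, we know that $c^{f_i - 1}_i = c_i$. Now let $f_i \le r \le t < p$. With Lemma 2.9, it holds that $a^{r-1}_{q_r j_r} = a^0_{q_r j_r} = 1$, and also $a^{r-1}_{q_r i} = a^0_{q_r i} = 1$. Thus, $c^r_i = c^{r-1}_i - c^{r-1}_{j_r}\, d(r)$, and hence
\[ c^t_i = c^{f_i - 1}_i - \sum_{f_i \le r \le t} c^{r-1}_{j_r}\, d(r) = c_i + g^t_i. \]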
2.3.2.3 An Algorithm providing Dual Information
With Theorem 2.10 and Lemma 2.11, we can formulate an efficient algorithm solving the WSSP
on interval graphs that provides us with dual values as a byproduct. In phase 1, we determine the
(max_start) rest cliques $R_p$, with $1 \le p \le m$, and the corresponding values $f_i$, $1 \le i \le n$. This can be done in time $\Theta(n \log n)$.
Phase 2 consists of $m$ iterations: First, we set $z^0 := 0$. In each iteration $1 \le t \le m$ we compute $c^{t-1}_i = c_i + z^{f_i - 1} - z^{t-1}$ for all $i \in R_t$, and $j_t \in R_t$ with $c^{t-1}_{j_t} = \min_{i \in R_t} c^{t-1}_i$. If $c^{t-1}_{j_t} \ge 0$, we set $z^t := z^{t-1}$, otherwise $z^t := z^{t-1} + c^{t-1}_{j_t}$. Finally, we set $t := t + 1$ and proceed to the next iteration.
With Remark 2.2, we know that the sets $R_p$ form a partition of $V$. Thus, all nodes $1 \le i \le n$ are being looked at exactly once to compute the reduced costs. Also, in all computations of the pivot columns, each node is incorporated only once. Therefore, phase 2 takes time $\Theta(n)$.
After $m$ iterations, we know the value $z^m$, as well as the sequence $(j_1, \dots, j_m) \in R_1 \times \dots \times R_m$ of pivot columns and the function $d$. By applying Theorem 2.10, we can construct a stable set out of this information in linear time. Since the rest cliques of the underlying interval graph are independent of the objective function, we achieve an incremental linear time algorithm for $\Omega(\log n)$ calls with different objectives.
Most importantly, we get dual values as a byproduct. By looking at the optimal tableau, we find that the optimal dual variable for each maximal clique constraint $1 \le t \le m$ has the value $c^m_{j_t}\, d(t)$.
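The two phases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_wssp` and its argument layout are hypothetical, the first and last clique indices of every node are assumed to be given (1-based), and the rest clique $R_t$ is taken to be the set of nodes whose last maximal conflict clique is $t$. The forward loop uses the reduced-cost formula of Lemma 2.11, the reverse loop follows the construction described after Theorem 2.10.

```python
def solve_wssp(costs, first, last, m):
    """Sketch of the forward pass (phase 2) and the reverse construction.

    costs[i]  : cost c_i of node i (nodes are numbered from 0 here)
    first[i]  : index f_i of the first maximal clique containing i (1-based)
    last[i]   : index of the last maximal clique containing i (1-based);
                ASSUMPTION: the rest clique R_t is {i : last[i] == t}
    m         : number of maximal conflict cliques
    """
    rest = [[] for _ in range(m + 1)]
    for i in range(len(costs)):
        rest[last[i]].append(i)

    z = [0] * (m + 1)          # z[t]: objective value after iteration t
    pivot = [None] * (m + 1)   # pivot[t]: pivot column j_t
    d = [0] * (m + 1)          # d[t] = 1 iff a pivot step is carried out

    for t in range(1, m + 1):
        best, best_rc = None, None
        for i in rest[t]:
            # reduced cost c_i^{t-1} = c_i + z^{f_i - 1} - z^{t-1}
            rc = costs[i] + z[first[i] - 1] - z[t - 1]
            if best_rc is None or rc < best_rc:
                best, best_rc = i, rc
        pivot[t] = best
        if best is not None and best_rc < 0:
            d[t] = 1
            z[t] = z[t - 1] + best_rc
        else:
            z[t] = z[t - 1]

    # Visit the pivot columns in reverse order; j_t ends in clique t, so it
    # is compatible with the chosen nodes iff t < min first-clique index.
    chosen, min_first = [], m + 1
    for t in range(m, 0, -1):
        if d[t] == 1 and t < min_first:
            chosen.append(pivot[t])
            min_first = first[pivot[t]]
    return z, sorted(chosen)


# Four intervals [0,2], [1,4], [3,6], [5,7] with cliques {0,1}, {1,2}, {2,3}.
z, q = solve_wssp([-3, -4, -2, -5], [1, 1, 2, 3], [1, 2, 3, 3], 3)
print(z, q)  # [0, -3, -4, -9] [1, 3]
```

In the small instance the cheapest stable set consists of the second and fourth interval with total cost $-9$, which is also the final value $z^m$.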
2.3.3 Cost-based Filtering
After having developed an algorithm that solves the WSSP on interval graphs, we now give an
efficient filtering algorithm for the corresponding constraint. Unlike the Shortest Path Problem,
that can also be solved in polynomial time, but for which achieving a state of arc-consistency
is NP-hard, the Weighted Stable Set Problem exhibits a stable substructure: when any node is
removed from or added to the stable set, the remaining problem is again a Weighted Stable Set
Problem. In our case, the sub-problem can even be represented as a WSSP on a (modified)
interval graph. Therefore, a simple arc-consistency algorithm can be obtained by probing all
variable values using the previously developed algorithm, which requires time $\Theta(n^2)$.
In the following, we develop a cost-based filtering algorithm for the WSSP on interval graphs that achieves a state of arc-consistency in amortized linear time for $\Omega(\log n)$ calls to the routine with changing objectives and/or variable domains. To develop that algorithm, we re-interpret the problem as finding a shortest path in a directed, acyclic and node-weighted co-interval graph: We introduce an artificial source $\sigma$ and an artificial sink $\tau$ with corresponding intervals before and after all other nodes, and with $c_\sigma := 0$ and $c_\tau := 0$. Set $G = (N, A)$ with $N := V \cup \{\sigma, \tau\}$ and $A := \{\, (i, j) \mid i, j \in N,\ \mathrm{end}(i) < \mathrm{start}(j) \,\}$. We then define $\pi(\sigma, \tau)$ as the set of simple paths from
$\sigma$ to $\tau$ in $G$. The cost of a path $P \in \pi(\sigma, \tau)$ is defined as $\mathrm{cost}(P) := \sum_{i \in P} c_i$.
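Remark 2.3 There is a one-to-one correspondence between stable sets in $G$ and paths $P \in \pi(\sigma, \tau)$ in $G$:
Let $Q = \{i_1, \dots, i_l\} \subseteq V$ denote a stable set in $G$. Without loss of generality, we may assume that $\mathrm{start}(i_j) < \mathrm{start}(i_k)$ for all $1 \le j < k \le l$. Then, since $Q$ is a stable set, it even holds that $\mathrm{end}(i_j) < \mathrm{start}(i_k)$ for all $1 \le j < k \le l$. Therefore, $P := (\sigma, i_1, \dots, i_l, \tau)$ is a simple path from $\sigma$ to $\tau$ in $G$ with $\mathrm{cost}(P) = \sum_{h \in P} c_h = 0 + \sum_{1 \le j \le l} c_{i_j} + 0 = \mathrm{cost}(Q)$.
On the other hand, if $P = (\sigma, i_1, \dots, i_l, \tau) \in \pi(\sigma, \tau)$, then $\mathrm{end}(i_j) < \mathrm{start}(i_k)$ for all $1 \le j < k \le l$. Therefore, the set $Q := \{i_1, \dots, i_l\}$ is a stable set in $G$, and $\mathrm{cost}(Q) = \sum_{1 \le j \le l} c_{i_j} = \sum_{h \in P} c_h = \mathrm{cost}(P)$.
Therefore, a minimum weighted stable set in $G$ corresponds to a shortest path from $\sigma$ to $\tau$ in $G$, and both have the same costs.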
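Given an upper bound $B$, we define
\[ \mathrm{Rem}(B) := \{\, 1 \le i \le n \mid \forall\, P \in \pi(\sigma, \tau),\ i \in P : \mathrm{cost}(P) \ge B \,\}, \text{ and} \]
\[ \mathrm{Req}(B) := \{\, 1 \le i \le n \mid \forall\, P \in \pi(\sigma, \tau),\ i \notin P : \mathrm{cost}(P) \ge B \,\}. \]
Then, with Remark 2.3, to achieve a state of arc-consistency for a weighted stable set constraint on an interval graph, we need to remove the value 1 from the domain of $X_i$ iff $i \in \mathrm{Rem}(B)$, and we have to remove the value 0 from the domain of $X_i$ iff $i \in \mathrm{Req}(B)$. Since the variables are all binary, this corresponds to setting $X_i = 0$ iff $i \in \mathrm{Rem}(B)$, and $X_i = 1$ iff $i \in \mathrm{Req}(B)$.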
2.3.3.1 Removing Nodes
To compute $\mathrm{Rem}(B)$, it is sufficient to determine the values of the shortest paths from the source $\sigma$ via node $j$ to the sink $\tau$ for all $j \in \{1, \dots, n\}$. This can be done by computing the shortest-path distances from the source and to the sink (compare with Section 2.2.2). With Remark 2.3, the shortest path from $\sigma$ to a node $j \in \{1, \dots, n\}$ can be determined by solving a WSSP on the reduced interval graph with node set $\bigcup_{1 \le p < f_j} C_p$, i.e., by solving the following LP:
\[ \begin{array}{lll} \text{Minimize} & LP_4^j = \sum_{i \in C_p,\ p < f_j} c_i x_i & \\ \text{subject to} & \sum_{i \in C_p} x_i \le 1 & \forall\, 1 \le p < f_j \\ & x \ge 0 & \end{array} \]
Then the shortest-path value from $\sigma$ to $j$ is $c(\sigma, j) = LP_4^j + c_j$. According to the previously developed theory, the minimal objective for the above $LP_4^j$ is exactly $z^{f_j - 1}$. Thus, the shortest-path distance of node $j$ is $c(\sigma, j) = z^{f_j - 1} + c_j$.
A similar theory as presented before shows that the shortest-path distances to the sink can be determined by applying the algorithm of Section 2.3.2.3 using the last clique belongings
\[ l_i := \max\{\, p \mid 1 \le p \le m,\ i \in C_p \,\}, \quad 1 \le i \le n, \]
and the min_end rest cliques, where, for a maximal clique $C \in M$,
\[ \mathrm{min\_end}(C) := \min_{i \in C} \mathrm{end}(i). \]
Solving $LP_4^j$ in this inverse manner yields objective function values ${}^{\tau}z^t$ for all iterations $0 \le t \le m$. Then the shortest-path distance to the sink is $c(j, \tau) = {}^{\tau}z^{m - l_j} + c_j$. With those values at hand, we can determine the shortest-path value through node $1 \le j \le n$ by $c(\sigma, j) + c(j, \tau) - c_j = z^{f_j - 1} + c_j + {}^{\tau}z^{m - l_j}$. Then,
\[ \mathrm{Rem}(B) = \{\, 1 \le j \le n \mid z^{f_j - 1} + c_j + {}^{\tau}z^{m - l_j} \ge B \,\}. \]
The algorithm sketched above determines the set of nodes that have to be removed from the
graph to achieve a state of arc-consistency. Of course, other constraints may also remove nodes.
In both cases, we must be able to handle these changes efficiently for the next call to our routine.
Without going into implementation details, we note that the removal of nodes does not affect the
interval structure of the graph, and that the data structures storing the max_start and min_end
rest cliques as well as the first and last clique belongings can be compressed in linear time to
delete any number of nodes from the graph.
We conclude that the members of $\mathrm{Rem}(B)$ can be computed and deleted in time $\Theta(n \log n)$ and in amortized linear time for $\Omega(\log n)$ calls of the filtering algorithm.
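The computation of $\mathrm{Rem}(B)$ can be illustrated with a short Python sketch. It is a simplified stand-in rather than the data structures used in this work: the names `through_values` and `rem_set` are hypothetical, the first and last clique indices are assumed to be given (1-based), and the backward values are stored in an array `zb` indexed by clique, so that `zb[l_j + 1]` plays the role of the value obtained after $m - l_j$ iterations of the inverse pass.

```python
def through_values(costs, first, last, m):
    """Cheapest sigma-tau path value through every node (sketch).

    costs[i] : node cost c_i (nodes numbered from 0)
    first[i] : first clique index f_i, last[i] : last clique index l_i
    """
    n = len(costs)
    by_last = [[] for _ in range(m + 2)]
    by_first = [[] for _ in range(m + 2)]
    for i in range(n):
        by_last[last[i]].append(i)
        by_first[first[i]].append(i)

    # Forward values: z[t] is the cheapest stable set among nodes whose
    # cliques all lie in 1..t.
    z = [0] * (m + 1)
    for t in range(1, m + 1):
        z[t] = min([z[t - 1]] +
                   [costs[i] + z[first[i] - 1] for i in by_last[t]])

    # Backward values: zb[p] is the cheapest stable set among nodes whose
    # cliques all lie in p..m.
    zb = [0] * (m + 2)
    for p in range(m, 0, -1):
        zb[p] = min([zb[p + 1]] +
                    [costs[i] + zb[last[i] + 1] for i in by_first[p]])

    return [z[first[j] - 1] + costs[j] + zb[last[j] + 1] for j in range(n)]


def rem_set(costs, first, last, m, bound):
    """Nodes for which every path through them costs at least `bound`."""
    values = through_values(costs, first, last, m)
    return [j for j, v in enumerate(values) if v >= bound]


costs, first, last = [-3, -4, -2, -5], [1, 1, 2, 3], [1, 2, 3, 3]
print(through_values(costs, first, last, 3))  # [-8, -9, -5, -9]
print(rem_set(costs, first, last, 3, -6))     # [2]
```

With the bound $B = -6$, only the third interval cannot be part of any stable set that is cheaper than $B$, so it is the only node that would be removed.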
2.3.3.2 Requiring Nodes
To compute $\mathrm{Req}(B)$, we need to identify all nodes that must be an element of any path having a value lower than $B$. Obviously, only nodes on the shortest path $S$ can have this property. Thus, for every node $j \in S$ we need to find out whether the value of a shortest path $P$ with $j \notin P$ is still lower than $B$.
Remark 2.4 Let $1 \le f_j \le l_j \le m$ denote the first and the last clique belongings of $j$. Furthermore, let $P$ be the shortest path with $j \notin P$. Obviously, it either holds that $I_j \cap I_i = \emptyset$ for all $i \in P$, or there exists a node $i \in P$ such that $I_j \cap I_i \ne \emptyset$. In the first case, we know that the value of the shortest path not using the time interval $I_j$ has the value
\[ \mathrm{cost}(P) = z^m - c_j. \]
In the second case, after having determined and deleted $\mathrm{Rem}(B)$ from the graph, we only have to check if there exists any node $i \in \{1, \dots, n\}$, $i \ne j$, with $I_j \cap I_i \ne \emptyset$. For when such a node $i$ exists, there also exists a path $\bar{P}$ with $i \in \bar{P}$ and $\mathrm{cost}(\bar{P}) < B$ (otherwise $i$ would have been deleted before). Since $i$ and $j$ are overlapping, we also know that $j \notin \bar{P}$. Thus, in $\bar{P}$ we have found a path not covering $j$ with a value lower than $B$. Therefore, $j \notin \mathrm{Req}(B)$. On the other hand, if such a node $i$ does not exist, the second case is obsolete, and we only need to consider the first case.
By making that observation, we can determine $\mathrm{Req}(B)$ in amortized linear time: First, for all $j \in P$, we check whether there exists a node in the shrunken graph that overlaps with $j$. Without specifying the implementation details here, we just note that this can be done in linear time for all nodes $j$. If no overlapping node exists, we compute $z^m - c_j$ and check whether this value is lower than $B$. If not, we add $j$ to $\mathrm{Req}(B)$, otherwise we do not add it to $\mathrm{Req}(B)$.
Now, we have an efficient algorithm at hand to compute $\mathrm{Req}(B)$. Obviously, other constraints
and branching decisions must be taken into account when our procedure is being called next.
Thus, we have to be able to transform our graph in such a way that from now on, every path must
visit the new required nodes. At first glance this sounds problematic, as a naive approach would
delete all arcs going around the required nodes, but this procedure would cause the resulting
graph to not have the co-interval property anymore.
We can force the admissible paths (that is, paths with costs lower than $B$) to visit the required nodes by making them extremely cheap: Let $\mathrm{Req} \subseteq \{1, \dots, n\}$ be the set of (currently) required nodes. Furthermore, let $T > 0$ be sufficiently large$^3$. Then we set $\hat{c}_j := c_j - T$ for all $j \in \mathrm{Req}$, and $\hat{c}_j := c_j$ for all $j \notin \mathrm{Req}$. We use $\hat{c}$ instead of $c$ as our objective and check whether the shortest-path value is lower than $B - |\mathrm{Req}|\, T$. If not, either two required nodes overlap, or the shortest-path value in the original graph exceeds $B$. Moreover, by determining $\mathrm{Rem}(B - |\mathrm{Req}|\, T)$, we find all nodes that overlap with some required node plus all nodes that would cause the shortest path in the original graph to exceed the threshold $B$.
We summarize the results from the previous sections in
Theorem 2.11 Arc-consistency for a weighted stable set constraint on an interval graph can be achieved in time $\Theta(n \log n)$ or in amortized linear time for $\Omega(\log n)$ incremental calls of the filtering algorithm.
$^3$ Assuming that $\min_{i \in V} c_i < 0$, a valid setting for $T$ is for example $T := (n + 1)\,(\max_{i \in V} c_i - \min_{i \in V} c_i)$.
2.4 Weighted All Different Constraints
The constraint structure of many discrete optimization problems can be modeled efficiently using
all-different constraints. As a matter of fact, the all-different constraint was one of the first
global constraints that were considered in the literature [179]. Regarding the combination of
the all-different constraint and a linear objective, in [34] Caseau and Laburthe introduced the
MinWeightAllDiff constraint. In first applications [35], it was used for pruning purposes only.
In [77, 78], Focacci et al. showed how the constraint (the authors refer to it as the IlcAllDiffCost
constraint) can also be used for domain filtering by exploiting reduced-cost information.
In this section, we present an arc-consistency algorithm for the minimum weight all-different
constraint. It is based on standard operations research algorithms for the computation of mini-
mum weight bipartite matchings and shortest paths with non-negative edge weights. We show
that arc-consistency can be achieved in time $O(n(d + m \log m))$, where $n$ denotes the number of variables, $m$ is the cardinality of the union of all variable domains, and $d$ denotes the sum of the cardinalities of the variable domains.
The work presented in this section was published in [187]. It is structured as follows: In Sec-
tion 2.4.1, we formally define the minimum weight all-different constraint. The arc-consistency
algorithm for the constraint is presented in Section 2.4.2.
2.4.1 The Minimum Weight All-Different Constraint
Given a natural number $n \in \mathbb{N}$ and variables $X_1, \dots, X_n$, we denote the domains of the variables by $D_1 := D(X_1), \dots, D_n := D(X_n)$, and let $D := \{x_1, \dots, x_m\} = \bigcup_i D_i$ denote the union of all domains, whereby $m = |D|$. Furthermore, given costs $c_{ij} \ge 0$ for assigning value $x_j$ to variable $X_i$ (whereby $c_{ij}$ may be undefined if $x_j \notin D_i$), we add a variable for the objective $Z = Z(X, c) = \sum_{i,\ X_i = x_j} c_{ij}$ to be minimized. Note that the non-negativity restriction on $c$ can always be achieved by setting $\hat{c}_{ij} := c_{ij} - \min_{i,j} c_{ij}$, which will change the objective by the constant $n \min_{i,j} c_{ij}$.
In the course of optimization, once we have found a feasible solution with associated objective value $B$, we are then only searching for improving solutions, thus requiring $Z < B$. Then, we define:
Definition 2.10 The minimum weight all-different constraint is the conjunction of an all-different constraint on variables $X_1, \dots, X_n$ and a bound constraint on the objective $Z$, i.e.:
\[ \mathrm{MinWeightAllDiff}(X_1, \dots, X_n, c, B) \;:\Leftrightarrow\; \vartheta\mathrm{AllDiff}(X_1, \dots, X_n, Z, B) \;\Leftrightarrow\; \mathrm{AllDiff}(X_1, \dots, X_n) \wedge (Z < B). \]
Consider the following example: Given variables $X_1, \dots, X_6$ with domains $D_1 = \{A, B, C\}$, $D_2 = \{A, B, D\}$, $D_3 = \{C, D, E\}$, $D_4 = \{C, D, E\}$, $D_5 = \{B, C, E\}$, and $D_6 = \{E, F\}$. In Figure 2.7, we complete the example by specifying a cost matrix $c$.
     A    B    C    D    E    F
1    5    7    2
2    2    3         4
3              4    8    5
4              3    6    2
5         8    7         2
6                        3    6
(a)
(b) [Bipartite graph linking the variables 1, ..., 6 to the values A, ..., F; drawing not reproduced.]
Fig. 2.7: (a) The table gives the costs $c_{ij}$ of assigning a value $x_j$ to a variable $X_i$. (b) A bipartite graph links variables to values that they can take. Bold numbers and lines mark the optimal solution with objective value 26.
In the following, we will assume $m \ge n$, since otherwise there exists no feasible assignment. Figure 2.7 shows that there is a tight correlation between the minimum weight all-different constraint and the Weighted Bipartite Perfect Matching Problem that can be formalized by setting $G := G(X, D, c) := (V_1, V_2, E, c)$ where $V_1 := \{X_1, \dots, X_n\}$, $V_2 := \{x_1, \dots, x_m\}$ and $E := \{\, \{X_i, x_j\} \mid x_j \in D_i \,\}$. It is easy to see that any perfect matching (that is, a subset of pairwise
non-adjacent edges of cardinality n) in Gdefines a feasible assignment of all-different values
to the variables. Therefore, there is also a one-to-one correspondence of cost-optimal variable
assignments and minimum weight perfect matchings in G.
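For an instance as small as the example of Figure 2.7, the correspondence can be checked by plain enumeration. The following Python sketch is only a brute-force reference (it is not one of the matching algorithms discussed in this section); the dictionary `COSTS` transcribes the cost table of Figure 2.7 and the function name `min_weight_assignment` is hypothetical.

```python
from itertools import permutations

# Cost table of Figure 2.7: COSTS[i][v] is the cost of assigning value v
# to variable X_i; values outside the domain are simply absent.
COSTS = {
    1: {"A": 5, "B": 7, "C": 2},
    2: {"A": 2, "B": 3, "D": 4},
    3: {"C": 4, "D": 8, "E": 5},
    4: {"C": 3, "D": 6, "E": 2},
    5: {"B": 8, "C": 7, "E": 2},
    6: {"E": 3, "F": 6},
}


def min_weight_assignment(costs):
    """Enumerate all all-different assignments and keep the cheapest one."""
    variables = sorted(costs)
    values = sorted({v for row in costs.values() for v in row})
    best_cost, best = None, None
    for perm in permutations(values, len(variables)):
        # Skip tuples that use a value outside some variable's domain.
        if any(v not in costs[i] for i, v in zip(variables, perm)):
            continue
        total = sum(costs[i][v] for i, v in zip(variables, perm))
        if best_cost is None or total < best_cost:
            best_cost, best = total, dict(zip(variables, perm))
    return best_cost, best


print(min_weight_assignment(COSTS))
# (26, {1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E', 6: 'F'})
```

The enumeration reproduces the objective value 26 stated in the caption of Figure 2.7; it is exponential in general and only serves as a check on small instances.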
For the latter problem, a series of efficient algorithms have been developed. Using the Hungarian method or the successive shortest path algorithm, it can be solved in time $O(n(d + m \log m))$, where $d := \sum_i |D_i|$ denotes the number of edges in the given bipartite graph. For a
detailed presentation of approaches for the Weighted Bipartite Matching Problem, we refer the
reader to [1].
Since there are efficient algorithms available, there is no need to apply a tree search to com-
pute an optimal variable assignment if the minimum weight all-different constraint is the only
constraint of a discrete optimization problem. However, the situation changes when the problem
consists of more than one minimum weight all-different constraint or a combination with other
constraints. Then a tree search may very well be the favorable algorithmic approach to tackle the
problem [34].
In such a scenario, we can exploit the algorithms developed in the OR community to compute
a bound on the best possible variable assignment that can still be reached in the sub-tree rooted at
the current choice point. Also, it has been suggested to use reduced-cost information to perform
cost-based filtering at essentially no additional computational cost [78].
In the following, we describe an algorithm that achieves arc-consistency in the same worst case running time as is needed to compute a minimum weight perfect matching when using the Hungarian method or the successive shortest path algorithm.
     A    B    C    D    E    F
1   -5    7    2
2    2   -3         4
3             -4    8    5
4              3   -6    2
5         8    7        -2
6                        3   -6
(a)
(b) [Directed network on the variables 1, ..., 6 and the values A, ..., F; drawing not reproduced.]
Fig. 2.8: (a) The new cost matrix $c^M$, and (b) the network $N_M$ for the optimal matching from Figure 2.7.
2.4.2 An Arc-Consistency Algorithm
To achieve arc-consistency of the minimum weight all-different constraint, we need to remove
all values from variable domains that cannot be part of any feasible assignment of values to
variables with associated costs $Z < B$. That is, in the graph interpretation of the problem, we
need to compute and remove the set of edges that cannot be part of any perfect matching with
costs less than B.
For any perfect matching $M$, we set $\mathrm{cost}(M) := \sum_{\{X_i, x_j\} \in M} c_{ij}$. Furthermore, we define the corresponding network $N_M := (V_1, V_2, A, c^M)$ whereby
\[ A := \{\, (X_i, x_j) \mid \{X_i, x_j\} \in M \,\} \cup \{\, (x_j, X_i) \mid \{X_i, x_j\} \notin M \,\}, \]
and $c^M_{ij} := -c_{ij}$ if $\{X_i, x_j\} \in M$, and $c^M_{ij} := c_{ij}$ otherwise. That is, we transform the graph $G$ into a directed network by directing matching edges from $V_1$ to $V_2$ and all other edges from $V_2$ to $V_1$. Furthermore, the cost of arcs going from $V_1$ to $V_2$ is multiplied by $-1$. Figure 2.8 shows the directed network $N_M$ for our example.
In the following, we will make some key observations that we will use later to develop an efficient arc-consistency algorithm. For a cycle $C$ in $N_M$, we set $\mathrm{cost}(C) := \sum_{e \in C} c^M_e$. Let $M$ denote a perfect matching in $G$.
Lemma 2.12 Given an edge $e \notin M$, assume that there exists a minimum-cost cycle $C_e$ in $N_M$ that contains $e$.$^4$
a) There is a perfect matching $M_e$ in $G$ that contains $e$, and it holds that $\mathrm{cost}(M_e) = \mathrm{cost}(M) + \mathrm{cost}(C_e)$.
b) The set $M$ is a minimum weight perfect matching in $G$, iff there is no negative cycle in $N_M$.
c) If $M$ is of minimum weight, then for every perfect matching $M_e$ that contains $e$, it holds that $\mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e)$.
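Proof:
a) Let $C_e^+$ and $C_e^-$ denote the edges in $E$ that correspond to arcs in $C_e$ that go from $V_2$ to $V_1$, or from $V_1$ to $V_2$, respectively. We define $M_e := (M \setminus C_e^-) \cup C_e^+$. Obviously, $e \in M_e$, and since $|C_e^+| = |C_e^-|$, $M_e$ is a perfect matching in $G$. It holds that:
\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) + \mathrm{cost}(C_e^+) - \mathrm{cost}(C_e^-) = \mathrm{cost}(M) + \mathrm{cost}(C_e). \]
b) Follows directly from (a).
c) It is easy to see that the symmetric difference $M \,\triangle\, M_e = (M \setminus M_e) \cup (M_e \setminus M)$ forms a set of cycles $C_1, \dots, C_r$ in $G$ that also correspond to cycles in $N_M$. Moreover, it holds that
\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) - \mathrm{cost}(M \setminus M_e) + \mathrm{cost}(M_e \setminus M), \]
and thus
\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) + \sum_i \mathrm{cost}(C_i). \]
Without loss of generality, we may assume that $e \in C_1$. Then, due to (b) and $\mathrm{cost}(C_e) \le \mathrm{cost}(C_1)$, we have that
\[ \mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_1) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e). \]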
Theorem 2.12 Let $M$ denote a minimum weight perfect matching in $G$, and $e \in E \setminus M$. There exists a perfect matching $M_e$ with $e \in M_e$ and $\mathrm{cost}(M_e) < B$, iff there exists a cycle $C_e$ in $N_M$ that contains $e$ with $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$.
$^4$ Here and in the following we identify an edge $e \in G$ and its corresponding arc in the directed network $N_M$.
Proof: Let $C_e$ denote the cycle in $N_M$ with $e \in C_e$ and minimal costs.
($\Rightarrow$) Assume that there is no such cycle. Then either there is no cycle in $N_M$ that contains $e$, or $\mathrm{cost}(C_e) \ge B - \mathrm{cost}(M)$. In the first case, there exists no matching $M_e$ that contains $e$.$^5$ In the latter case, with Lemma 2.12(c), we have that $\mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e) \ge B$, which is a contradiction.
($\Leftarrow$) We have that $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$. With Lemma 2.12(a) this implies that there exists a perfect matching $M_e$ that contains $e$, and for which it holds that $\mathrm{cost}(M_e) = \mathrm{cost}(M) + \mathrm{cost}(C_e) < B$.
With Theorem 2.12, we can now characterize values that have to be removed from variable domains in order to achieve arc-consistency. Given a minimum weight perfect matching $M$ in $G$, infeasible assignments simply correspond to arcs $e$ in $N_M$ that are not contained in any cycle $C_e$ with $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$.
Of course, if $\mathrm{cost}(M) \ge B$ we know from Lemma 2.12(b) that the current choice point is inconsistent, and we can backtrack right away. So let us assume that $\mathrm{cost}(M) < B$. Then, using empty cycles $C_e$ with $\mathrm{cost}(C_e) = 0 < B - \mathrm{cost}(M)$, we can show that all edges $e \in M$ are valid assignments. Thus, we only need to consider $e \notin M$. By construction, we know that the corresponding edge in $N_M$ is directed from $V_2$ to $V_1$, i.e. $e = (x_j, X_i)$. Denote the shortest-path distance from $X_i$ to $x_j$ in $N_M$ by $\mathrm{dist}(X_i, x_j, c^M)$. Then, for the minimum weight cycle $C_e$ with $e \in C_e$, it holds that: $\mathrm{cost}(C_e) = c_{ij} + \mathrm{dist}(X_i, x_j, c^M)$. Thus, it is sufficient to compute the shortest-path distances from $V_1$ to $V_2$ in $N_M$.
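The characterization above can be tried out directly on the running example. The Python sketch below is illustrative only: it builds the network $N_M$ for the matching of Figure 2.7 and uses a plain Bellman-Ford pass per variable (a stand-in chosen here because the arc costs $c^M$ may be negative, not the method developed in the remainder of this section). All function names are hypothetical, and the matching is taken as given.

```python
import math

COSTS = {
    1: {"A": 5, "B": 7, "C": 2},
    2: {"A": 2, "B": 3, "D": 4},
    3: {"C": 4, "D": 8, "E": 5},
    4: {"C": 3, "D": 6, "E": 2},
    5: {"B": 8, "C": 7, "E": 2},
    6: {"E": 3, "F": 6},
}
MATCHING = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F"}


def build_arcs(costs, matching):
    """Arcs of N_M: matching edges X_i -> x_j with cost -c_ij,
    all other edges x_j -> X_i with cost c_ij."""
    arcs = []
    for i, row in costs.items():
        for v, c in row.items():
            if matching[i] == v:
                arcs.append((("X", i), ("x", v), -c))
            else:
                arcs.append((("x", v), ("X", i), c))
    return arcs


def bellman_ford(nodes, arcs, source):
    """Shortest-path distances from source; assumes no negative cycle,
    which holds when the matching has minimum weight."""
    dist = {u: math.inf for u in nodes}
    dist[source] = 0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in arcs:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    return dist


def filter_domains(costs, matching, bound):
    """Keep value v for X_i iff the cheapest cycle through the edge costs
    less than bound - cost(M); assumes cost(M) < bound."""
    arcs = build_arcs(costs, matching)
    nodes = {u for u, _, _ in arcs} | {v for _, v, _ in arcs}
    slack = bound - sum(costs[i][v] for i, v in matching.items())
    kept = {}
    for i, row in costs.items():
        dist = bellman_ford(nodes, arcs, ("X", i))
        kept[i] = sorted(v for v, c in row.items()
                         if matching[i] == v or c + dist[("x", v)] < slack)
    return kept


print(filter_domains(COSTS, MATCHING, 28))
# {1: ['A', 'B'], 2: ['A', 'B'], 3: ['C', 'D'], 4: ['C', 'D'], 5: ['E'], 6: ['F']}
```

For $B = 28$ the assignment $X_6 = E$ disappears because no cycle in $N_M$ contains the corresponding arc, while for instance $X_1 = C$ is dropped because its cheapest cycle has cost $2 = B - \mathrm{cost}(M)$.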
We can ease this work by eliminating negative edge weights in $N_M$. Consider node potential functions $\pi^1 : V_1 \to \mathbb{R}$ and $\pi^2 : V_2 \to \mathbb{R}$. It is a well-known fact that the shortest-path structure of the network remains intact if we change the cost function by setting $\bar{c}^M_{ij} := c^M_{ij} + \pi^1_i - \pi^2_j$ for all $(i, j) \in M$, and $\bar{c}^M_{ij} := c^M_{ij} - \pi^1_i + \pi^2_j$ for all $(i, j) \notin M$. Then,
\[ \mathrm{dist}(X_i, x_j, c^M) = \mathrm{dist}(X_i, x_j, \bar{c}^M) - \pi^1_i + \pi^2_j. \]
If the network does not contain negative weight cycles (which is true because $M$ is a perfect matching of minimum weight, see Lemma 2.12(b)), we can choose node potentials such that $\bar{c}^M \ge 0$. This idea has been used before in the all-pairs shortest path algorithm by Johnson [43].
In our context, after having computed a minimum weight perfect matching, we get the node potential functions $\pi^1$ and $\pi^2$ for free by using the dual and negative dual values corresponding to the nodes in $V_1$ and $V_2$, respectively. As a matter of fact, the resulting cost vector $\bar{c}^M$ is exactly the vector of reduced costs $\bar{c}$: If $e = (i, j) \in V_1 \times V_2$, then $0 = \bar{c}_{ij} = c_{ij} - \pi^1_i + \pi^2_j$, and thus $\bar{c}^M_{ij} = -c_{ij} + \pi^1_i - \pi^2_j = -\bar{c}_{ij} = 0 = \bar{c}_{ij}$. Otherwise, $\bar{c}^M_{ij} = c^M_{ij} - \pi^1_i + \pi^2_j = c_{ij} - \pi^1_i + \pi^2_j = \bar{c}_{ij}$ (see Figure 2.9).
$^5$ Note that this observation is commonly used in domain filtering algorithms for the all-different constraint [179].
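     A    B    C    D    E    F
1    0    1    0
2    0    0         1
3              0    0    1
4              1    0    0
5         2    5         0
6                        1    0
(a)
(b) [Network $N_M$ annotated with the node potentials $\pi^1 = (6, 3, 8, 6, 6, 6)$ for the variables 1, ..., 6 and $\pi^2 = (1, 0, 4, 0, 4, 0)$ for the values A, ..., F; drawing not reproduced.]
Fig. 2.9: (a) The changed cost vector $\bar{c} = \bar{c}^M$, and (b) the network $N_M$ with node potentials $\pi^1$ and $\pi^2$. Bold numbers show those assignments that can be eliminated by simple reduced-cost propagation in the presence of a solution with value $B = 28$.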
We summarize: To achieve arc-consistency, we first compute a minimum weight perfect matching in a bipartite graph in time $O(n(d + m \log m))$. We obtain an optimal matching $M$, dual values $\pi^1$, $\pi^2$, and reduced costs $\bar{c}$. If $\mathrm{cost}(M) \ge B$, we can backtrack. Otherwise, we set up a network $N = (V_1, V_2, A, \bar{c})$ and compute $n$ single source shortest paths with non-negative edge weights, each of them requiring time $O(d + m \log m)$ when using Dijkstra’s algorithm in combination with Fibonacci heaps [43]. We obtain distances $\mathrm{dist}(X_i, x_j, \bar{c})$ for all variables and values. Finally, we remove value $x_j$ from the domain of $X_i$, iff
\[ \bar{c}_{ij} + \mathrm{dist}(X_i, x_j, \bar{c}) = c_{ij} + \mathrm{dist}(X_i, x_j, \bar{c}) - \pi^1_i + \pi^2_j = c_{ij} + \mathrm{dist}(X_i, x_j, c^M) = \mathrm{cost}(C_{(i,j)}) \ge B - \mathrm{cost}(M), \]
where $C_{(i,j)}$ is the shortest cycle in $N_M$ that contains $(i, j)$. Obviously, this entire procedure runs in time $O(n(d + m \log m))$. The entire domain filtering process is visualized for our example in Figure 2.10.
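The summarized procedure can be sketched for the running example as follows. The code is a minimal illustration under explicit assumptions: the matching and the potentials $\pi^1 = (6, 3, 8, 6, 6, 6)$, $\pi^2 = (1, 0, 4, 0, 4, 0)$ are hard-coded instead of being produced by a matching algorithm, Python's `heapq` binary heap replaces the Fibonacci heap, and the function names are hypothetical.

```python
import heapq
import math

COSTS = {
    1: {"A": 5, "B": 7, "C": 2},
    2: {"A": 2, "B": 3, "D": 4},
    3: {"C": 4, "D": 8, "E": 5},
    4: {"C": 3, "D": 6, "E": 2},
    5: {"B": 8, "C": 7, "E": 2},
    6: {"E": 3, "F": 6},
}
MATCHING = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F"}
# ASSUMPTION: potentials taken as given for this example.
PI1 = {1: 6, 2: 3, 3: 8, 4: 6, 5: 6, 6: 6}
PI2 = {"A": 1, "B": 0, "C": 4, "D": 0, "E": 4, "F": 0}


def reduced_cost(i, v):
    """Reduced cost of the edge {X_i, v}: c_ij - pi1_i + pi2_j."""
    return COSTS[i][v] - PI1[i] + PI2[v]


def build_graph():
    """Adjacency lists of the network with non-negative reduced costs."""
    graph = {}
    for i, row in COSTS.items():
        for v in row:
            if MATCHING[i] == v:
                # matching arc X_i -> x_j, reduced cost 0
                graph.setdefault(("X", i), []).append((("x", v), 0))
            else:
                # non-matching arc x_j -> X_i
                graph.setdefault(("x", v), []).append(
                    (("X", i), reduced_cost(i, v)))
    return graph


def dijkstra(graph, source):
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            if du + w < dist.get(v, math.inf):
                dist[v] = du + w
                heapq.heappush(heap, (du + w, v))
    return dist


def filter_domains(bound):
    """Remove v from D_i iff reduced cost + reduced distance >= bound - cost(M)."""
    graph = build_graph()
    slack = bound - sum(COSTS[i][v] for i, v in MATCHING.items())
    kept = {}
    for i, row in COSTS.items():
        dist = dijkstra(graph, ("X", i))
        kept[i] = sorted(
            v for v in row
            if reduced_cost(i, v) + dist.get(("x", v), math.inf) < slack)
    return kept


print(filter_domains(28))
# {1: ['A', 'B'], 2: ['A', 'B'], 3: ['C', 'D'], 4: ['C', 'D'], 5: ['E'], 6: ['F']}
```

Since all reduced costs are non-negative, Dijkstra's algorithm is applicable; the filtered domains agree with the assignments whose additional cost in Figure 2.10(c) stays below $B - \mathrm{cost}(M) = 2$.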
Interestingly, the idea of using reduced-cost shortest-path distances has been considered be-
fore to strengthen reduced-cost propagation [78]. For an experimental evaluation of this idea, we
refer the reader to that paper. Now we have shown that this enhanced reduced-cost propagation
is powerful enough to guarantee arc-consistency for the minimum weight all-different constraint.
The algorithm we introduced achieves arc-consistency in time $O(n(d + m \log m))$. At first
sight this sounds optimal, because it is the same time that is needed by algorithms for the
Weighted Bipartite Perfect Matching Problem such as the Hungarian method or the successive
shortest path algorithm. However, two questions remain open: First, can we derive a cost-based
     A    B    C    D    E    F
1    0    0    2    2    2    3
2    1    0    2    2    2    3
3    0    0    0    1    2    3
4    0    0    0    0    2    3
5    0    0    0    0    0    1
6                             0
(a)
     A    B    C    D    E    F
1   -5   -6    0   -4    0   -3
2   -1   -3    3   -1    3    0
3   -7   -8   -4   -7   -2   -5
4   -5   -6   -2   -6    0   -3
5   -5   -6   -2   -6   -2   -5
6                            -6
(b)
     A    B    C    D    E    F
1    0    1    2
2    1    0         3
3              0    1    3
4              1    0    2
5         2    5         0
6                             0
(c)
Fig. 2.10: (a) Shortest paths from nodes in $V_1$ to nodes in $V_2$ with respect to the reduced costs $\bar{c}$. (b) The same shortest paths using the original cost vector $c^M$. (c) The additional costs imposed by an assignment $X_i = x_j$. Bold numbers show those assignments that can be eliminated in the presence of a solution with value $B = 28$.
filtering algorithm from the cost-scaling algorithm that gives the best known time bound for
Assignment Problems that satisfy the similarity assumption? Second, can the above filtering
method be implemented to run incrementally faster?
2.5 Knapsack Constraints
Based on reduction techniques for the Knapsack Problem (KP), we develop cost-based filtering
routines for knapsack constraints. We present several algorithms using bounds of different qual-
ity. The method that we consider the most interesting in theory and practice is based on a bound
proposed by Martello and Toth in [147]. By reusing information gained in an initial preprocessing step taking time $\Theta(n \log n)$, the actual reduction per choice point only requires linear time.
We compare two of the new algorithms numerically with two other reduction algorithms that
have been proposed earlier in the KP literature.
The work presented in this section was published in [61, 64, 65]. It is organized as follows:
First, we motivate the development of an efficient cost-based filtering algorithm for knapsack
constraints in Section 2.5.1. In Section 2.5.2, we present existing upper bounds and reduction
techniques for the KP. Then, in Section 2.5.3, we develop algorithms for the quick propagation of
knapsack constraints. An experimental evaluation of these algorithms, as well as a comparison
with alternative approaches is presented in Section 2.5.4. Finally, we discuss generalizations for
knapsack related problems in Section 2.5.5.
2.5.1 Definition and Applications
Cost-based domain filtering algorithms for knapsack optimization constraints are relevant in var-
ious application areas. First of all, capacity constraints are the basic building blocks for linear
programs. Therefore, knapsack constraints may very well be viewed as a standard modeling el-
ement when tackling problems in this large class of problems. Consider the following example:
Automatic Recording The Automatic Recording Problem (ARP) consists in finding an optimal selection of items, each of them associated with a weight, an interval, and a profit value, such that
• the total weight of the selection does not exceed a given capacity,
• all intervals associated with all selected items are pairwise non-overlapping, and
• the profit is maximized.
Thus, the problem consists of a knapsack constraint accompanied by an independent set con-
straint. It models the automatic selection of TV broadcasts for video recording. The independent
set constraint ensures that only non-overlapping broadcasts can be recorded, whereas the knap-
sack constraint models the limited storage capacity of the recording device. The objective is to
maximize the user's satisfaction. In Chapter 6, we develop an algorithm that solves the ARP by
Lagrangian relaxation using the filtering algorithm presented in Section 2.5.3.1.
Quadratic Knapsack Problems Knapsack constraints can also be used profitably when tack-
ling the Quadratic Knapsack Problem (QKP). It calls for maximizing a quadratic boolean objec-
tive function subject to a linear capacity constraint. Filtering algorithms for KP are often used to
reduce the size of the given QKP [31]. Consider the relax and cut algorithm of Porto, de Moraes,
and Lucena [170] as an example. It computes bounds of the QKP by linearizing the problem to
KP, then tightening the problem by adding three families of valid inequalities, and finally solving
the resulting linear program (LP) by Lagrangian relaxation. To solve the Lagrangian dual, a se-
ries of KPs has to be solved in every search node. The authors stress that knapsack variable fixing
algorithms are vital ingredients of their approach. The algorithms proposed in Section 2.5.3 may
help to increase the overall performance.
In our last example, we show that knapsack constraints may also be relevant for the solution
of sub-problems when using decomposition techniques on eligible optimization problems:
CP-based Column Generation In Section 3.1, we develop a method called constraint pro-
gramming based column generation. It implies that a constraint satisfaction problem is set up
to generate columns in a column generation framework. When applying that approach to ap-
propriate optimization problems, augmented Knapsack Problems emerge as sub-problems. As
an example, consider the Constrained Cutting Stock Problem that is a Cutting Stock Problem
with additional constraints on the cutting patterns. Using a column generation approach, the
sub-problem is a Constrained Bounded Knapsack Problem: the length of the rolls determines the
capacity for the cutting patterns, and the objective is used to search for columns with negative
reduced costs only. Each cutting pattern has cost 1 since we try to minimize the number of rolls
needed to cover the specified demand. Thus, the objective in the sub-problem is to minimize
$1 - \pi^T X$ (i.e. to minimize the reduced costs of the cutting pattern), where $\pi$ is the vector of dual
values corresponding to the current optimal solution of the continuous relaxation of the master
problem. The KP objective then is to maximize $\pi^T X$ with an initial lower bound of 1. Additional
constraints usually stem from real-world applications and may be non-linear. Some examples for
real-world constraints are given in [38].
2.5.1.1 Constrained Knapsack Problems
The examples given above show that knapsack constraints are often accompanied by other con-
straints when modeling real-life problems. Therefore, we introduce the definition of Constrained
Knapsack Problems (CKPs), which are Knapsack Problems with additional constraints, whereby
the objective function and the capacity constraint have to be linear.
Definition 2.11 Let $C, n, w_1, \dots, w_n; p_1, \dots, p_n$ be given. $C$ is the capacity of the knapsack, $n$
the number of items, and $w_i$ the weight of item $i$ with profit $p_i$, $1 \le i \le n$. Moreover, let
$w := (w_1, \dots, w_n)^T$, and $p := (p_1, \dots, p_n)^T$.
1. Let $\mathbb{B} := \{0, 1\}$, and $G := \{x \in \mathbb{B}^n \mid w^T x \le C\}$.
2. Let $k \in \mathbb{N}$, and $R := \{r_1, \dots, r_k \mid r_j : \mathbb{B}^n \to \mathbb{B},\ 1 \le j \le k\}$. Every $r \in R$ is called a
(knapsack) rule and $R$ is called a (knapsack) rule set.
3. Every $x \in G$ is called feasible (with respect to a given rule set $R$), iff $r(x) = 1$ for all $r \in R$.
$F(R) := \{x \in G \mid x \text{ is feasible}\}$ is called the set of feasible constrained knapsacks (with
respect to rule set $R$). To simplify the notation, we often write $F$ instead of $F(R)$ if $R$ is
known from the context.
4. The Constrained Knapsack Problem is then to
maximize $p^T X$ subject to $X \in F$.
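To make Definition 2.11 concrete, the following minimal Python sketch enumerates the feasible set by brute force on a tiny instance. It only illustrates the definition and is not a solution method proposed here; the function name `solve_ckp_brute_force`, the example data, and the rule `not_both_1_and_3` are hypothetical.

```python
from itertools import product


def solve_ckp_brute_force(C, w, p, rules=()):
    """Maximize p^T x over all x in {0,1}^n with w^T x <= C and r(x) == 1 for every rule r.

    Exponential in n; intended for illustration on tiny instances only.
    Returns (best value, best x), or (None, None) if no x is feasible.
    """
    best_value, best_x = None, None
    for x in product((0, 1), repeat=len(w)):
        in_G = sum(wi * xi for wi, xi in zip(w, x)) <= C      # capacity constraint
        if in_G and all(r(x) == 1 for r in rules):            # x is feasible w.r.t. R
            value = sum(pi * xi for pi, xi in zip(p, x))
            if best_value is None or value > best_value:
                best_value, best_x = value, x
    return best_value, best_x


def not_both_1_and_3(x):
    """Hypothetical knapsack rule: items 1 and 3 (0-based) exclude each other."""
    return 0 if x[1] == 1 and x[3] == 1 else 1


C, w, p = 10, [5, 4, 6, 3], [10, 40, 30, 50]
pure_kp = solve_ckp_brute_force(C, w, p)                       # F = G
constrained = solve_ckp_brute_force(C, w, p, [not_both_1_and_3])
```

On this example the additional rule removes the pure KP optimum from the feasible set, so the constrained optimum is strictly smaller.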
Note that, for the unconstrained KP, it holds that $F = G$. For such pure Knapsack Problems
without additional constraints, the state-of-the-art solving techniques would focus on a so-called
core problem, which may be extended during the optimization process [146, 167]. For these
algorithms, it is not straightforward to see how the reduction algorithms we present in the following
could be integrated efficiently.
However, algorithms tailored for the special case of pure KP are usually not able to solve
general CKPs, because they do not allow additional constraints to be incorporated. One reason is
that algorithms designed to solve pure KPs make certain assumptions that do not hold for CKPs.
For example, it is not clear for the CKP that we can require the profits to be non-negative (as
is the case for KP), because the strategy of omitting items with positive weight and negative
profit [149] may not yield feasible solutions at all. Thus, in general a tree search will be necessary
to solve CKPs, and cost-based domain filtering algorithms for knapsack constraints may help to
improve the performance of such an approach.
For the remainder of this section, with identifiers $C$, $n$, $w$, $p$, $G$, $R$, and $F$ we refer to
Definition 2.11. We will sometimes need to refer to reduced (C)KPs where an item $i \in \{1, \dots, n\}$ is
either included or excluded in any feasible solution. We refer to those problems with (C)KP[$X_i = 1$]
or (C)KP[$X_i = 0$].
In a canonical IP formulation of the Knapsack Problem, there is one variable $X_i$ for each item
$i \in \{1, \dots, n\}$. The domain of each variable is defined as $D(X_i) := \{0, 1\}$. Furthermore, the capacity
constraint is modeled by a function $\omega: \{0,1\}^n \to \{0,1\}$ with $\omega(X) = \omega(X_1, \dots, X_n) = 1$ iff $w^T X \le C$.
Finally, the objective function $Z$ is defined on $\{0,1\}^n$ by $Z(X) = Z(X_1, \dots, X_n) := p^T X$.
Definition 2.12 Given any lower bound $B \ge 0$, we call the maximization constraint $\vartheta_\omega[P, B]$ a
knapsack constraint.
Items of a CKP fall into either one of the following classes:
• items $i$ that can be excluded from further investigation as they cannot be part of any
improving solution, i.e.
$Z(x) > B \wedge w^T x \le C \Rightarrow x_i \ne 1$ (2.2)
• items $i$ that can be included into the knapsack as they must be part of any improving
solution, i.e.
$Z(x) > B \wedge w^T x \le C \Rightarrow x_i \ne 0$ (2.3)
• items that cannot be decided at the moment.
A filtering algorithm that achieves (hyper-)arc-consistency for the knapsack constraint has to
include or remove all items that do not fall into the last class. Since showing that either (2.2) or
(2.3) holds for an item $i$ (i.e. checking the arc-consistency of $\vartheta_\omega[P, B]$) generally requires solving
a KP itself, complete propagation here is an NP-hard task. One way to cope with the situation is
to develop pseudo-polynomial filtering algorithms. For example, in [209] a reduction algorithm
for subset-sum knapsack constraints is developed that has pseudo-polynomial run-time.
We propose another way by checking if the inequality holds for an upper bound $U$ on
CKP[$X_i = b$], $b = 0$ or $b = 1$, i.e., we check $U(\{0,1\} \times \dots \times \{b\} \times \dots \times \{0,1\}) \le B$. Then we write
$U(\mathrm{CKP}[X_i = b]) \le B$.⁶
2.5.2 Knapsack Relaxations
2.5.2.1 Upper Bounds for Knapsack Problems
The effectiveness of a domain filtering algorithm that achieves relaxed consistency is determined
by the relaxation quality, i.e. the bounds used for cost-based filtering. Following the presentation
given in chapter 2 of [149], we present some upper bounds that have been originally developed
for the maximization problem KP. They also apply to the CKP by relaxing it to a KP first.
Obviously, ignoring all additional constraints often does not yield tight bounds on the objective.
However, if the additional constraints satisfy certain properties, they can be incorporated in the
objective function of a pure KP using Lagrangian relaxation. For additional linear constraints,
there are effective ways of doing this (see [80, 188] and Chapter 3). Notice that
dropping all additional constraints allows us to set $X_i := 0$ for all $i$ with $p_i \le 0$, $1 \le i \le n$. We therefore
require all items to have positive profits.
Without loss of generality, we may assume that the items are ordered according to decreasing
efficiency, i.e. $p_1/w_1 \ge p_2/w_2 \ge \dots \ge p_n/w_n$. We define the critical item $s$ of a Knapsack Problem as the
first item that overloads the knapsack, that is $s = \min\{j \mid \sum_{i=1}^{j} w_i > C\}$ (we omit the trivial case
here where no such $s$ exists). Dantzig [50] showed that the linear relaxation of the 0-1 knapsack
has the optimal value $\sum_{j=1}^{s-1} p_j + c\,p_s/w_s$, where $c$ is defined as the remaining capacity of the knapsack
after filling in the first $s-1$ items: $c = C - \sum_{j=1}^{s-1} w_j$.
⁶ To improve the readability, here and in the following we write CKP or KP instead of
$\{0,1\}^n$, and identify CKP[$X_i = b$] as well as KP[$X_i = b$] with $\{0,1\} \times \dots \times \{b\} \times \dots \times \{0,1\}$, where
$\{b\}$ is the $i$-th factor.
Fig. 2.11: The width of each element is proportional to its weight. The elements are ordered with respect
to the efficiencies $p_i/w_i$. The leftmost element has the biggest efficiency, and the rightmost the smallest
one. $s$ marks the critical item in $U_1$.
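Let $\emptyset \ne M_1 \times \dots \times M_n \subseteq \{0,1\}^n$. Let $l_i := \min M_i$ denote the minimum, and $r_i := \max M_i$ denote
the maximum of $M_i$, $1 \le i \le n$. The first upper bound on KP is defined as $U_1: 2^{\{0,1\}^n} \to \mathbb{R}$ with
$U_1(M_1 \times \dots \times M_n) := \max\{Z(X_1, \dots, X_n) \mid \omega(X_1, \dots, X_n) = 1,\ X_i \in [l_i, r_i],\ 1 \le i \le n\}$.
It holds that
$U_1 := U_1(\mathrm{KP}) = \sum_{j=1}^{s-1} p_j + c\,\frac{p_s}{w_s}$ (2.4)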
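As an illustration of (2.4), the following Python sketch sorts the items by decreasing efficiency, locates the critical item, and returns the bound. It assumes positive integer weights and profits so that exact rational arithmetic can be used; the function name `dantzig_bound` and the example data are hypothetical.

```python
from fractions import Fraction


def dantzig_bound(C, w, p):
    """Return (U_1, index of the critical item in the input numbering).

    If all items fit, no critical item exists and (sum of profits, None) is returned.
    Assumes positive integer weights and profits.
    """
    # Efficiency ordering: Theta(n log n), needed only once per objective.
    order = sorted(range(len(w)), key=lambda i: Fraction(p[i], w[i]), reverse=True)
    profit, c = 0, C            # c is the remaining capacity
    for i in order:
        if w[i] > c:            # first item that overloads the knapsack
            return profit + Fraction(c * p[i], w[i]), i
        profit += p[i]
        c -= w[i]
    return Fraction(profit), None


bound, s = dantzig_bound(10, [5, 4, 6, 3], [10, 40, 30, 50])
```

Here the two most efficient items are packed entirely, and the third one (index 2 in the input) is included fractionally with the remaining capacity $c = 3$.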
A second bound $U_2$ was introduced by Martello and Toth in [147]. It imposes the integrality of
the critical item $s$. Either item $s$ belongs to the optimal solution (leading to a value $U^1$) or not
(leading to a value $U^0$):
$U^0 := \sum_{j=1}^{s-1} p_j + c\,\frac{p_{s+1}}{w_{s+1}}$ (2.5)
$U^1 := \sum_{j=1}^{s-1} p_j + p_s - (w_s - c)\,\frac{p_{s-1}}{w_{s-1}}$ (2.6)
Defining $U_2$ as the maximum of $U^0$ and $U^1$ results in a bound dominating $U_1$. Formally, let
$\emptyset \ne M_1 \times \dots \times M_n \subseteq \{0,1\}^n$, and let $s$ denote the critical item with respect to necessarily included and
excluded items implicitly defined by the $M_i$. We set $U_2: 2^{\{0,1\}^n} \to \mathbb{R} \cup \{-\infty\}$ with $U_2(\emptyset) := -\infty$, and
$U_2(M_1 \times \dots \times M_n) := \max\{U^0, U^1\} - \sum_{i < s,\ M_i = \{0\}} p_i + \sum_{i > s,\ M_i = \{1\}} p_i$.
Fig. 2.12: $U_3$ requires the integrality of item $s$. The figures show $U_1(\mathrm{KP}[X_s = 0])$, and $U_1(\mathrm{KP}[X_s = 1])$.
It holds that
$U_2 := U_2(\mathrm{KP}) = \max\{U^0, U^1\} \le U_1$ (2.7)
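The relation (2.7) can be checked numerically with a small Python sketch. It assumes a pure KP whose items are already sorted by decreasing efficiency, uses 0-based indices, and omits any rounding of the fractional terms. The function name `bounds_u1_u2` is hypothetical, and the fallback to $U_1$ when the critical item has no neighbor on one side is a simplification of this sketch.

```python
from fractions import Fraction


def bounds_u1_u2(C, w, p):
    """Return (U_1, U_2) for a pure KP with items sorted by decreasing p[j]/w[j]."""
    n = len(w)
    profit, c, s = 0, C, None
    for j in range(n):
        if w[j] > c:                      # critical item found
            s = j
            break
        profit += p[j]
        c -= w[j]
    if s is None:                         # all items fit: both bounds are exact
        return Fraction(profit), Fraction(profit)
    u1 = profit + Fraction(c * p[s], w[s])
    if s == 0 or s == n - 1:
        # A neighbor of s is missing; this sketch then simply falls back to U_1.
        return u1, u1
    # Item s excluded: fill the remaining capacity c at the efficiency of item s+1.
    u_excl = profit + Fraction(c * p[s + 1], w[s + 1])
    # Item s included: give back the overload (w_s - c) at the efficiency of item s-1.
    u_incl = profit + p[s] - Fraction((w[s] - c) * p[s - 1], w[s - 1])
    return u1, max(u_excl, u_incl)


u1, u2 = bounds_u1_u2(10, [3, 4, 6, 5], [50, 40, 30, 10])
```

For this instance the optimal integer value is 90, so both values are valid upper bounds and the second is the tighter one.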
Instead of estimating the loss caused by the integrality of item $s$ using the efficiency of the
neighboring items of $s$, an even tighter bound can be obtained by calculating bounds $U_1$ on
KP[$X_s = 0$], and KP[$X_s = 1$] [68, 114, 212]. Let $U^0 := U_1(\mathrm{KP}[X_s = 0])$, and $U^1 := U_1(\mathrm{KP}[X_s = 1])$.
Then $U_3 := \max\{U^0, U^1\}$ dominates $U_1$ and $U_2$. An even tighter bound could be obtained by
using $U_2$ instead of $U_1$ in the definition of $U^0$ and $U^1$ and so on.
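A hedged Python sketch of this construction follows. It evaluates the linear relaxation bound on the two sub-problems in which the critical item is fixed, again assuming items sorted by decreasing efficiency and 0-based indices. The helper names `u1_fixed`, `critical_item`, and `u3` are hypothetical.

```python
from fractions import Fraction


def u1_fixed(C, w, p, fixed=None):
    """Linear relaxation bound when some variables are fixed (dict: index -> 0 or 1).

    Returns float('-inf') if the items fixed to 1 already exceed the capacity.
    """
    fixed = fixed or {}
    cap = C - sum(w[i] for i, b in fixed.items() if b == 1)
    if cap < 0:
        return float("-inf")
    value = Fraction(sum(p[i] for i, b in fixed.items() if b == 1))
    for j in range(len(w)):               # free items in efficiency order
        if j in fixed:
            continue
        if w[j] > cap:                    # critical item of the reduced problem
            return value + Fraction(cap * p[j], w[j])
        value += p[j]
        cap -= w[j]
    return value


def critical_item(C, w):
    """Index of the first item that overloads the knapsack, or None."""
    cap = C
    for j, wj in enumerate(w):
        if wj > cap:
            return j
        cap -= wj
    return None


def u3(C, w, p):
    s = critical_item(C, w)
    if s is None:
        return u1_fixed(C, w, p)
    return max(u1_fixed(C, w, p, {s: 0}), u1_fixed(C, w, p, {s: 1}))


w, p = [3, 4, 6, 5], [50, 40, 30, 10]
value_u1, value_u3 = u1_fixed(10, w, p), u3(10, w, p)
```

Each of the two sub-problem bounds takes linear time here, which matches the statement that the bound can be computed in $O(n)$ once the efficiency ordering is available.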
Figures 2.11 and 2.12 give graphical interpretations of the bounds $U_1$ and $U_3$. Obviously,
all three bounds $U_1$, $U_2$, $U_3$ can be computed in time $O(n)$ after a preprocessing step of sorting the
items according to decreasing efficiencies. This requires time $\Theta(n \log n)$. Balas and Zemel [10]
developed an algorithm for the calculation of $s$ using linear time without any preprocessing.
However, for the reduction algorithm that we present in the following, just as in former reduction
algorithms for the KP, the efficiency ordering is needed anyway. On top of that, we use an
ordering of the items with respect to increasing weights.
In a tree search, both orderings can be calculated in an initial preprocessing step. After that,
they can be reused in every search node. Within a column generation context, the weight ordering
only has to be calculated once, but the efficiency ordering has to be re-computed every time new
dual values of the master problem lead to a change of the objective in the successive CKPs.
2.5.2.2 Reduction Techniques for Knapsack Problems
A first reduction algorithm for KPs based on upper bound $U_1$ has been proposed by Ingargiola
and Korsh [122]. In a loop over all items $i \in \{1, \dots, n\}$, the algorithm determines
$U_1(\mathrm{KP}[X_i = b])$, $b \in \{0, 1\}$. Since each bound calculation takes linear time, the worst case complexity of this
algorithm is $\Theta(n^2)$.
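The quadratic loop just described can be sketched in Python as follows. This is not the original IKR code but a minimal illustration under stated assumptions: items are sorted by decreasing efficiency, indices are 0-based, and a value is removed from a domain whenever the bound of the corresponding reduced problem does not exceed the lower bound $B$. The names `u1_fixed` and `quadratic_u1_filter` are hypothetical.

```python
from fractions import Fraction


def u1_fixed(C, w, p, fixed):
    """Linear relaxation bound with some variables fixed (dict: index -> 0 or 1)."""
    cap = C - sum(w[i] for i, b in fixed.items() if b == 1)
    if cap < 0:
        return float("-inf")
    value = Fraction(sum(p[i] for i, b in fixed.items() if b == 1))
    for j in range(len(w)):
        if j in fixed:
            continue
        if w[j] > cap:
            return value + Fraction(cap * p[j], w[j])
        value += p[j]
        cap -= w[j]
    return value


def quadratic_u1_filter(C, w, p, B):
    """Return {item: forced value}, or None if no improving solution can exist.

    One linear-time bound computation per item and value, i.e. Theta(n^2) overall.
    """
    fixings = {}
    for i in range(len(w)):
        zero_possible = u1_fixed(C, w, p, {i: 0}) > B
        one_possible = u1_fixed(C, w, p, {i: 1}) > B
        if not zero_possible and not one_possible:
            return None                   # both values are ruled out
        if not zero_possible:
            fixings[i] = 1
        elif not one_possible:
            fixings[i] = 0
    return fixings


w, p = [3, 4, 6, 5], [50, 40, 30, 10]
fix_89 = quadratic_u1_filter(10, w, p, B=89)
fix_90 = quadratic_u1_filter(10, w, p, B=90)
```

Raising the lower bound from 89 to 90 lets the filter decide one more item on this instance, which illustrates how the quality of $B$ affects the reduction.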
If bound $U_2$ is used instead of $U_1$, more effective filtering can be achieved in the same
asymptotic running time. Martello and Toth [148] showed that the running time can be reduced to
$O(n \log n)$ while keeping the solution quality of bound $U_2$. The key idea of their algorithm is to
compute the critical item $s$ by binary search. We refer to the methods of Ingargiola and Korsh,
and Martello and Toth as IKR, and MTR, respectively.
Dembo and Hammer [53] proposed a reduction algorithm (DHR) that runs in linear time
$\Theta(n)$. They calculate the critical item $s$ only once for the original problem. Within a loop they
estimate the loss when removing/including item $i \in \{1, \dots, n\}$ by extrapolating the efficiency of
item $s$, which allows this step to be performed in constant time. As this extrapolation is less accurate
than $U_1$, their method is not as effective as IKR or MTR.
Though having been developed more than a decade ago, the methods DHR and MTR are still
vital ingredients in state-of-the-art solvers for the pure KP and the QKP [167, 168, 170].
The algorithm we present in the following cannot improve the running time of reduction
techniques based on the more efficient bounds $U_1$, $U_2$ if the reduction algorithm is only called
once. For such an application, the new method presented and the one developed by Martello and
Toth both require the same asymptotic running time in $\Theta(n \log n)$.
The situation changes, however, if a reduction method is called many times for similar
knapsack instances, as is the case when applying a tree search: in every search node, we try to prune
the search or at least to tighten the problem formulation by applying domain filtering. When
using unary branching constraints, the subsequent instances only differ with respect to the sets of
variables that have already been fixed. As we will see, such a situation allows us to hide parts of the
work in a preprocessing step that takes time $\Theta(n \log n)$. Provided with the information gathered
in that preprocessing, every call to the reduction routine requires linear time only.
2.5.3 Cost-based Filtering for Knapsack Constraints
2.5.3.1 A Fast Propagation Algorithm based on Bound $U_1$ and $U_2$
Now, we show how the running time of IKR and MTR can be reduced to $\Theta(n)$ by making use of
information generated in a preprocessing step requiring time $\Theta(n \log n)$. The bounds obtained are
of the same quality as in the original algorithms. Again, let KP[$X_j = b$] denote
$\{0,1\} \times \dots \times \{b\} \times \dots \times \{0,1\}$, $b \in \{0, 1\}$, and let
$s(M_1 \times \dots \times M_n) := \min\{j \mid \sum_{i \le j,\ M_i = \{0,1\}} w_i > C - \sum_{i:\ M_i = \{1\}} w_i\}$ denote the
critical item of KP[$X_j = b$]. The key idea of the routine is to calculate the bounds of the reduced
problems $U(\mathrm{KP}[X_j = b])$ in an order of increasing weight of the items $j$. Thereby, we obtain a
sequence of critical items that is monotonically increasing. Thus, the critical item and the upper
bound for the $j$-th item (with respect to the weight ordering) can be transformed into the critical
item and upper bound for the $(j+1)$-th item by starting the calculation of $s(\mathrm{KP}[X_{j+1} = b])$ at
$s(\mathrm{KP}[X_j = b])$.
The time consuming step in reduction algorithms using bounds $U_1$, $U_2$ is to determine the
critical items $s(\mathrm{KP}[X_i = b])$, $1 \le i \le n$, and $b \in \{0, 1\}$. Once these values are known, the
calculation of the upper bounds and the reduction itself only require linear time. (In fact, in the
following algorithm the bounds can be computed at the same time as the critical items. To clarify
the argumentation, however, we just show how to calculate the latter.)⁷
Although calculating $s(\mathrm{KP}[X_i = b])$ for each single $i \in \{1, \dots, n\}$, $b \in \{0, 1\}$ generally takes
linear time, the calculation of all these values also only requires time $\Theta(n)$ once we know an
ordering $\sigma = (\sigma_1, \dots, \sigma_n)$ of the items according to their weight, i.e. $w_{\sigma_i} \le w_{\sigma_j}$ iff $i \le j$. The
efficiency ordering of the items as well as the permutation $\sigma$ can be obtained in a sorting step
prior to any reduction and requiring time $\Theta(n \log n)$.
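Given $s = s(\mathrm{KP})$, we know that $U(\mathrm{KP}[X_i = 1]) = U(\mathrm{KP})$ for all $i < s$, and
$U(\mathrm{KP}[X_i = 0]) = U(\mathrm{KP})$ for all $i > s$. Thus, we only need to calculate the arrays
$S_1 := (s(\mathrm{KP}[X_i = 1]))_{i \ge s}$, and $S_0 := (s(\mathrm{KP}[X_i = 0]))_{i \le s}$. We describe how to determine $S_0$ in the
following. The calculation of $S_1$ is done analogously.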
We iterate over all items $i < s$ in increasing order of weight. That way, we can be sure that
$s(\mathrm{KP}[X_i = 0])$ increases monotonically with growing $i \in \{1, \dots, s-1\}$. Thus, we can start the
search for the next critical item at the position of the last one.
The following bookkeeping argument shows that this procedure only takes linear time. We
estimate the computational effort of the reduction algorithm by assigning a unit cost (say, 1 €)
to the items causing it:
• Every item $j \ge s$ that is being passed is charged 1 €. By “passed” we mean that the item
is being included entirely when iterating from one critical item to the other.
• Every item is charged 1 € each time it is being included fractionally.
The first group of items causes costs of at most $n$ € as the critical items are monotonically
increasing: every item is being passed at most once. It remains to calculate the effort for all items that
are being included fractionally. Obviously, there are at most as many fractionally included items
as critical items. Therefore, this group of items also costs not more than $n$ €. Thus, the costs for
the entire computation are in $O(n)$.
Finally, the calculation of $s(\mathrm{KP}[X_s = 0])$ can be performed in time that is linear in the number
of items as well. Another possibility to calculate this value is to insert item $s$ at the position
corresponding to $c$ in the weight ordering of items and to calculate $s(\mathrm{KP}[X_s = 0])$ just like the
critical items for the exclusion of the other items.
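The monotone pointer argument can be illustrated with the following Python sketch for the exclusion case and bound $U_1$ on a pure KP. It is a simplified illustration and not the original C++ implementation: items are assumed to be sorted by decreasing efficiency with 0-based indices, the weight ordering `sigma` is assumed to be precomputed once, and the name `exclusion_bounds` is hypothetical.

```python
from fractions import Fraction


def exclusion_bounds(C, w, p, sigma):
    """Critical items and U_1 bounds of KP[X_i = 0] for all i < s in O(n).

    sigma lists all item indices by increasing weight (precomputed once).
    Returns (s, crit, bound); crit[i] is None if no item overloads the knapsack.
    """
    n = len(w)
    s, c, base = 0, C, 0
    while s < n and w[s] <= c:            # critical item of the unreduced problem
        c -= w[s]
        base += p[s]
        s += 1
    crit, bound = {}, {}
    j, extra_w, extra_p = s, 0, 0         # items s..j-1 have been passed so far
    for i in sigma:                       # increasing weight => j never moves left
        if i >= s:
            continue
        cap = c + w[i]                    # capacity available from item s onwards
        while j < n and extra_w + w[j] <= cap:
            extra_w += w[j]               # item j is passed (included entirely)
            extra_p += p[j]
            j += 1
        frac = Fraction((cap - extra_w) * p[j], w[j]) if j < n else 0
        crit[i] = j if j < n else None
        bound[i] = base - p[i] + extra_p + frac
    return s, crit, bound


w, p = [1, 2, 3, 2, 2, 4], [10, 18, 24, 14, 12, 20]
sigma = sorted(range(len(w)), key=lambda i: w[i])   # preprocessing step
s, crit, bound = exclusion_bounds(6, w, p, sigma)
```

The inner `while` loop advances `j` at most $n$ times over the whole call, which corresponds to the charge for passed items, and each iteration of the outer loop computes one fractional part.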
⁷ Note that, by omitting the fractional parts, it is also possible to calculate lower bounds for the pure KP. For the
general CKP, the necessary feasibility checking with respect to additional constraints makes the generation of lower
bounds more complicated. Thus, more elaborate and problem dependent primal heuristics have to be developed here.
In any case, reduction should only take place after all lower bounds have been calculated [148].
Fig. 2.13: The figure illustrates the process of the reduction algorithm presented for KP[$X_i = 0$]. The
weight ordering in which the items are tested ensures that the critical item moves monotonically to the
right.
Obviously, the above algorithm can be applied with bounds $U_1$ and $U_2$. As a consequence,
we have shown the following
Theorem 2.13 After a $\Theta(n \log n)$ preprocessing step, relaxed $U_2$-consistency for a knapsack
constraint can be obtained in time $O(n)$ per choice point.
It is easy to see that, for a constant number of choice points, MTR and the algorithm given
above need the same running time of $\Theta(n \log n)$. If $\Omega(\log n)$ choice points have to be investigated,
however, the time spent in the preprocessing is dominated by the accumulated time needed in the
choice points. In that case, Theorem 2.13 implies
Corollary 2.5 If propagation is triggered in $\Omega(\log n)$ search nodes, relaxed $U_2$-consistency for
a knapsack constraint can be obtained in amortized time $O(n)$ per choice point.
Thus, in a typical search tree with $\Omega(\log n)$ search nodes, the method presented here is
asymptotically optimal and superior to the algorithms proposed before.
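The amortization behind Corollary 2.5 can be written out in one line. In the following display, $k$ denotes the number of search nodes in which propagation is triggered and $T(k)$ the total propagation time; both symbols are introduced here only for illustration.

```latex
\[
T(k) \;=\; \underbrace{\Theta(n \log n)}_{\text{preprocessing}} \;+\; k \cdot O(n)
\quad\Longrightarrow\quad
\frac{T(k)}{k} \;=\; O\!\left(\frac{n \log n}{k} + n\right) \;=\; O(n)
\quad \text{for } k \in \Omega(\log n).
\]
```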
2.5.3.2 More Effective Cost-based Filtering using Bound $U_3$
To strengthen the filtering abilities of the optimization constraint, we can also use the stronger
bound $U_3$: $U_3(\mathrm{KP})$ is obtained by calculating bound $U_1$ on KP[$X_s = 0$], and KP[$X_s = 1$]. When we
want to use that bound for cost-based domain filtering, we need to compute $s^b_i$, $b \in \{0, 1\}$, the
critical items of those restricted KPs if, additionally, $X_i = b$: Let $1 \le i \le n$, $b \in \{0, 1\}$. Then
$s^0_i := s(\mathrm{KP}[X_i = b, X_{s(\mathrm{KP}[X_i = b])} = 0])$, and $s^1_i := s(\mathrm{KP}[X_i = b, X_{s(\mathrm{KP}[X_i = b])} = 1])$.
To compute these values efficiently, first we determine the values $s(\mathrm{KP}[X_i = b])$ using the
algorithm in Section 2.5.3.1. Then we apply a binary search to determine $s^0_i$ and $s^1_i$ for all
$1 \le i \le n$. This leads to a running time of $\Theta(n \log n)$. A similar idea has been introduced in [148].
Corollary 2.6 With the previous procedure, relaxed $U_3$-consistency for a knapsack constraint
can be obtained in time $O(n \log n)$ per choice point.
For real-life instances, using a binary search to determine the critical item of
KP[$X_s = b_2, X_i = b_1$] for $b_1, b_2 \in \{0, 1\}$ usually does not pay off as it is likely to be “close” to $s$. Thus, we consider
this result to be of theoretical interest only. However, the algorithm above leads to another
filtering algorithm that is asymptotically as efficient as the one presented in Section 2.5.3.1 (that
runs in amortized linear time), but that is even more effective. In fact, the bound it uses to perform
cost-based filtering is at least as good as $U_2$, but for some items it is even $U_3$:
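Let $1 \le i \le n$, $b \in \{0, 1\}$, $s := s(\mathrm{KP})$, $s^0_i := s(\mathrm{KP}[X_i = b, X_s = 0])$, and
$s^1_i := s(\mathrm{KP}[X_i = b, X_s = 1])$. In contrast to the sequence of critical items that is computed for $U_3$, the second
variable $X_s$ that is being fixed remains the same for all $s^0_i$, and $s^1_i$. Again, by using the algorithm
in Section 2.5.3.1, we determine $U_2(\mathrm{KP}[X_s = 0, X_i = b])$ for all $1 \le i \le n$, and then
$U_2(\mathrm{KP}[X_s = 1, X_i = b])$ for all $1 \le i \le n$. For any given $1 \le i \le n$, we check whether
$\max\{U_2(\mathrm{KP}[X_s = 0, X_i = b]),\ U_2(\mathrm{KP}[X_s = 1, X_i = b])\} \le B$. If so, we fix the value of $X_i$ to $1 - b$.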
It is easy to see that the bound calculated is at least as good as $U_2$. For items $i < s$ with
$s(\mathrm{KP}[X_i = 0]) = s$ and items $i > s$ with $s(\mathrm{KP}[X_i = 1]) = s$, however, domain filtering is just as
effective as for bound $U_3$. Hence, we achieve an amortized linear time algorithm based on a
'mix' of $U_2$ and $U_3$ bounds.
2.5.4 Experiments
After having analyzed the new algorithms theoretically, now we compare them numerically
with different methods that were derived from KP reduction techniques presented in the lit-
erature. All experiments were run on a SUN Enterprise 450 Model 4300 (296 MHz) with
1 GB RAM, under Solaris 2.6. The reduction algorithms were implemented in C++ on top
of ILOG SOLVER 5.0 [121].
2.5.4.1 Test Environment
To show the potential of the new propagation algorithms, and to avoid cross-talking with other
constraints, we decided to base the experiments on pure Knapsack Problems only. That way, we
get a clear view on the performance of each filtering algorithm without disturbing interferences
that can easily arise when using more complex settings that incorporate additional constraints.
For an example of a combination of the algorithms presented here and a shortest path constraint,
we refer the reader to Chapter 6. Likewise, we omit specially tailored tree search or branching
strategies for pure KPs. Instead, we used the default settings of the underlying CP library.
A word of caution is necessary here: even though our experiments are based on pure KP
data, the filtering algorithms we developed are not suited for state-of-the-art KP solvers. Also,
we do not claim that the solvers we implemented are competitive to the best KP solvers (see
Section 2.5.1.1). Our focus here is clearly on Constrained Knapsack Problems.
A weak propagation algorithm, if started from scratch, will obviously have to visit more
choice points to find an optimal or near-optimal solution of the problem than a good one.
Therefore, to make the comparison fair, we initialize the lower bound with the optimal objective value
$B$ and just measure the time and the number of choice points that each approach takes to
prove optimality.
The generator code of David Pisinger [167] was used to produce random instances of two
different classes of Knapsack Problems where the weights $w_j$ are randomly distributed in [1,1000],
and the profits $p_j$ are chosen as given below:
• uncorrelated: $p_j$ randomly distributed in [1,1000],
• weakly correlated: $p_j$ randomly distributed in $[w_j - 100, w_j + 100] \cap [1, 1100]$.
In all cases, the knapsack capacity is chosen as $C = \frac{1}{2} \sum_{j=1}^{n} w_j$. The problem sizes range from
10 to 20000 items, and 100 Knapsack Problems were generated for each size and class.
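For readers who want to reproduce instances of this kind, the following Python sketch draws data with the stated distributions. It is not Pisinger's generator and will not reproduce the exact instances used in the experiments; the function name `generate_instance`, the seeding scheme, and the rounding of the capacity to an integer are assumptions of this sketch.

```python
import random


def generate_instance(n, kind, R=1000, seed=None):
    """Draw a random KP instance (C, w, p) of the given class.

    kind is "uncorrelated" or "weakly correlated"; weights lie in [1, R].
    """
    rng = random.Random(seed)
    w = [rng.randint(1, R) for _ in range(n)]
    if kind == "uncorrelated":
        p = [rng.randint(1, R) for _ in range(n)]
    elif kind == "weakly correlated":
        # Profits within R/10 of the weight, but never below 1.
        p = [rng.randint(max(1, wj - R // 10), wj + R // 10) for wj in w]
    else:
        raise ValueError(f"unknown instance class: {kind}")
    C = sum(w) // 2                       # half of the total weight, rounded down
    return C, w, p


C, w, p = generate_instance(50, "weakly correlated", seed=1)
```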
We omit the classes of strongly correlated data ($p_j = w_j + 10$) and subset-sum data ($p_j = w_j$).
It is known that the bounds described in Section 2.5.2.1 are not suited for these classes (which is
easy to see as $\forall k: p_k / w_k \approx 1$). For them, bounds based on cardinality constraints have been shown
to be effective [146, 150]. In the application area that we focus on (see Section 2.5.1), however,
it is justified to assume that the evolving KPs are more likely to fall into one of the classes we
used for our tests.
2.5.4.2 The Opponents
The algorithms referred to as linU1 and linU2 are based on the amortized linear time reduction
method described in Section 2.5.3.1, and use bounds $U_1$ and $U_2$, respectively. Methods DHR,
and MTR have been described in Section 2.5.2.2. We implemented all algorithms in the same
CP environment. Table 2.2 summarizes the major characteristics for the candidates used in the
experiments. All methods need $O(n)$ memory for the propagation stack and for the different
orderings used. Within a choice point, only $O(1)$ additional memory is required.
Notice that, in our experiments, we do not evaluate the filtering algorithm based on a mixture
of bound $U_2$ and $U_3$ that was sketched in Section 2.5.3.2. The propagation algorithm based on this
mixed bound visits only slightly fewer choice points than linU2, but requires more computation
time. Recall from Section 2.5.3 that the work that has to be done to perform domain filtering
using bound $U_2$ is almost the same as using bound $U_1$. When using the mixed bound, however,
the workload is twice as much as that for bound $U_1$.
Name    see             Bound       pre-proc. time        time per node
DHR     Sect. 2.5.2.2   D/H bound                         $\Theta(n)$
MTR     Sect. 2.5.2.2   $U_2$       $\Theta(n \log n)$    $\Theta(n \log n)$
linU1   Sect. 2.5.3.1   $U_1$       $\Theta(n \log n)$    $\Theta(n)$
linU2   Sect. 2.5.3.1   $U_2$       $\Theta(n \log n)$    $\Theta(n)$
Tab. 2.2: Characteristics of the four algorithms used in the experiments.
As we will show in this section, we are facing a trade-off between the time needed per choice
point and the reduction of choice points that can be achieved by using tighter bounds. Within
the test environment that we have chosen for our experiments, a slight reduction of choice points
does not justify a much higher effort undertaken in every choice point. Therefore, the filtering
algorithm based on the mixed bound is of interest only in the context of more complex CKPs
incorporating additional and possibly hard side constraints that would make even small reduc-
tions of choice points more favorable. However, in the KP setting that we consider here, to avoid
cross-talking with additional constraints and to evaluate the pure performance of the different
propagation algorithms, the algorithm developed in Section 2.5.3.2 is not competitive.
2.5.4.3 Numerical Results
The simple approach for solving a CKP in a CP context would be to introduce a sum-constraint
(i.e. $\sum_j w_j X_j \le C$) plus a constraint stating that we are only looking for improving solutions (i.e.
$\sum_j p_j X_j > B$). However, as shown in Table 2.3, that approach cannot compete at all with the other
propagation methods. Both the number of choice points and the CPU time grow exponentially
when the problem size increases. A dash means that the average calculation for a test instance
takes more than two hours. For both classes, only small problems with not more than 40 items
can be solved within that time limit. The poor performance of the pure CP approach shows the
need for sophisticated filtering techniques when knapsack constraints occur in a CP model. As
will be shown in the following, more elaborate techniques are able to tackle problems of several
1000 items in a few seconds, generating only relatively few choice points.
Small Instances Tables 2.4 and 2.5 show the average results of 100 different instances of the
same data size n. We present the running time in seconds, and the number of choice points cp
that the method visits. Table 2.6 shows a comparison of the different methods regarding the time
per choice point for uncorrelated and weakly correlated data.
Size   uncorrelated             weakly correlated
n      cp            time       cp            time
10     37.77         0.01       73.74         0.01
20     1455.80       0.16       28736.07      2.91
30     141338.82     15.50      16771406.92   1641.94
40     10311820.44   1410.07    -             -
Tab. 2.3: The pure CP approach for both problem classes. cp is the average number of choice points, time
the average time in seconds for 100 instances of the given size.
Size DHR linU1 linU2 MTR
n cp time cp time cp time cp time
10 2.43 0.00 0.87 0.00 0.67 0.00 0.67 0.00
20 5.47 0.00 2.68 0.00 2.35 0.00 2.35 0.00
40 7.20 0.00 3.61 0.00 3.22 0.00 3.22 0.00
60 10.18 0.00 6.07 0.00 5.26 0.00 5.26 0.00
80 13.96 0.01 8.43 0.00 7.04 0.00 7.04 0.00
100 14.21 0.01 8.20 0.00 6.75 0.00 6.75 0.00
200 24.85 0.02 17.16 0.02 14.47 0.01 14.47 0.01
300 32.47 0.04 22.57 0.03 18.76 0.02 18.76 0.02
400 38.19 0.05 27.69 0.04 23.28 0.04 23.28 0.04
500 46.50 0.08 33.64 0.06 28.68 0.05 28.68 0.05
600 63.61 0.11 48.67 0.09 40.95 0.08 40.95 0.08
700 54.67 0.11 41.16 0.09 34.53 0.08 34.53 0.08
800 69.92 0.16 51.76 0.13 42.38 0.11 42.38 0.11
900 68.89 0.17 51.76 0.14 42.35 0.13 42.35 0.12
1000 97.83 0.26 72.38 0.21 59.73 0.17 59.73 0.18
Tab. 2.4: Uncorrelated data instances. We give the average numbers for 100 test sets per size. time is the
time in seconds, cp the number of choice points.
Size DHR linU1 linU2 MTR
n cp time cp time cp time cp time
10 10.42 0.00 6.31 0.00 5.42 0.00 5.42 0.00
20 20.41 0.00 13.82 0.00 11.35 0.00 11.35 0.00
40 33.26 0.01 23.42 0.01 19.87 0.01 19.87 0.00
60 37.69 0.01 26.69 0.01 22.52 0.01 22.52 0.01
80 56.07 0.02 40.10 0.01 33.21 0.01 33.21 0.01
100 61.60 0.02 45.49 0.02 37.94 0.02 37.94 0.02
200 103.85 0.06 77.05 0.05 64.33 0.05 64.33 0.04
300 162.20 0.13 123.11 0.11 99.67 0.10 99.67 0.09
400 202.23 0.21 151.50 0.17 118.71 0.15 118.71 0.14
500 226.36 0.29 161.80 0.23 122.57 0.19 122.57 0.18
600 286.40 0.42 207.56 0.33 158.92 0.27 158.92 0.26
700 345.28 0.58 252.25 0.45 185.42 0.36 185.42 0.35
800 314.00 0.61 214.64 0.44 151.34 0.34 151.34 0.33
900 428.16 0.89 300.34 0.67 210.06 0.51 210.06 0.49
1000 451.74 1.04 313.50 0.78 220.33 0.60 220.33 0.57
Tab. 2.5: Weakly correlated data instances. We give the average numbers for 100 test sets per size. time
is the time in seconds, cp the number of choice points.
Size Type DHR linU1 linU2 MTR
n time/cp time/cp time/cp time/cp
500 uncorrelated 1.72 1.78 1.74 1.74
500 correlated 1.28 1.42 1.55 1.47
1000 uncorrelated 2.66 2.90 2.85 3.01
1000 correlated 2.30 2.49 2.72 2.59
Tab. 2.6: Uncorrelated and weakly correlated data instances. We give the average time per choice point
in milliseconds for 100 test sets per size.
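The entries of Tab. 2.6 follow directly from the averages in Tab. 2.4 and Tab. 2.5: the time per choice point is the average running time divided by the average number of choice points. The following sketch recomputes the eight rows for n = 500 and n = 1000; the names `RESULTS` and `ms_per_choice_point` are ours and only serve this illustration. Since the tabulated times are rounded to two decimals, the recomputed values can only be expected to agree with Tab. 2.6 up to rounding.

```python
# Average (choice points, seconds) per algorithm, copied from Tab. 2.4 and Tab. 2.5.
RESULTS = {
    (500, "uncorrelated"): {"DHR": (46.50, 0.08), "linU1": (33.64, 0.06),
                            "linU2": (28.68, 0.05), "MTR": (28.68, 0.05)},
    (500, "correlated"): {"DHR": (226.36, 0.29), "linU1": (161.80, 0.23),
                          "linU2": (122.57, 0.19), "MTR": (122.57, 0.18)},
    (1000, "uncorrelated"): {"DHR": (97.83, 0.26), "linU1": (72.38, 0.21),
                             "linU2": (59.73, 0.17), "MTR": (59.73, 0.18)},
    (1000, "correlated"): {"DHR": (451.74, 1.04), "linU1": (313.50, 0.78),
                           "linU2": (220.33, 0.60), "MTR": (220.33, 0.57)},
}


def ms_per_choice_point(choice_points, seconds):
    """Average time per choice point in milliseconds."""
    return 1000.0 * seconds / choice_points


# Recompute the time/cp columns, rounded to two decimals as in Tab. 2.6.
time_per_cp = {
    key: {alg: round(ms_per_choice_point(cp, t), 2) for alg, (cp, t) in row.items()}
    for key, row in RESULTS.items()
}

for (n, kind), row in time_per_cp.items():
    print(n, kind, row)
```

For the rows shown here, the recomputed values coincide with the tabulated ones, for example 0.08 s / 46.50 choice points, which is about 1.72 ms for DHR on uncorrelated instances with n = 500.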
Size n linU2 (time per cp) MTR (time per cp)
500 1.74 1.74
1000 2.85 3.01
2000 5.08 5.58
4000 11.80 12.42
8000 28.71 32.36
16000 71.71 75.42
Tab. 2.7: Uncorrelated data. Comparison of running times per choice point for the new amortized linear
time propagation algorithm based on bound U2 and the implementation of MTR. We give the average time
per choice point in milliseconds for 100 test sets per size.
The Dembo/Hammer-based filtering algorithm needs to visit the largest number of choice
points among the four propagation algorithms tested. This matches the expected behavior of a
method that prunes with respect to weaker bounds. Due to the short time per choice point, though,
it is only slightly slower than the other methods on uncorrelated data. Thus, the numerical results
reflect the expected trade-off between effective filtering and the time needed to achieve a
higher level of consistency. In the presence of additional constraints (which increase the time
spent on constraint propagation per choice point), it is likely that a smaller number
of choice points will result in a faster overall computation. Algorithm linU1 uses fewer choice
points than DHR, but is not as effective as the U2-based algorithms, MTR and linU2. For the
larger instances of this test set, these two only visit between 50% and 65.6% of the choice points
needed by DHR.
For weakly correlated data, linU2 visits at most 69.7% of the choice points of the DHR
routine. Moreover, linU2 slightly outperforms DHR with respect to the total running time. Notice
that the time per choice point spent by linU2 for weakly correlated instances is smaller than that
for uncorrelated data. The reason for this is that the preprocessing time for initializing the more
complex data structures for linU2 and for sorting the items according to weight and efficiency is
spread over a much larger number of choice points.
Large Instances To get a clearer insight into the characteristics of the different algorithms, we
performed some tests on larger instances. Going up to 10000 items, the disadvantages of the
poor bounds used by linU1 and especially DHR become obvious. Due to the much larger number
of choice points that have to be visited, the total running times exceed those of linU2 and MTR
(see Table 2.8).
Still, on average, the algorithms based on MTR and linU2 need about the same running
time. We assume that, for smaller test instances, the binary search performed by MTR is faster
because it causes less overhead than linU2. As the problem size increases, however, the difference
Size DHR linU1 linU2 MTR
n cp time cp time cp time cp time
1000 97.83 0.26 72.38 0.21 59.73 0.17 59.73 0.18
2000 161.48 0.79 120.64 0.65 100.38 0.51 100.38 0.56
3000 202.34 1.59 148.43 1.31 118.90 1.00 118.90 1.06
4000 291.00 3.17 205.16 2.43 146.58 1.73 146.58 1.82
5000 360.47 4.82 245.32 3.79 184.83 2.65 184.83 2.98
6000 534.61 9.46 376.69 7.81 197.43 3.84 197.43 4.30
7000 620.48 12.90 431.55 10.11 294.18 6.78 294.18 7.57
8000 823.34 21.08 567.43 16.47 285.22 8.19 285.22 9.23
9000 1051.72 31.76 712.51 23.74 435.65 14.50 435.65 15.46
10000 1143.54 38.39 797.58 30.21 620.35 22.71 620.35 24.99
Tab. 2.8: Uncorrelated data. Comparison of running times for the new amortized linear time propagation
algorithms and implementations of DHR, and MTR. We give the average time in seconds as well as the
number of choice points for 100 test sets per size.
Size uncorrelated weakly correlated
n cp time(linU2) time(MTR) cp time(linU2) time(MTR)
10000 620.35 22.71 24.99 1626.78 60.98 66.58
11000 629.43 26.38 28.76 2572.45 110.47 121.08
12000 604.87 28.04 32.31 2590.45 125.40 137.21
13000 1341.42 69.30 77.31 2694.07 142.13 156.26
14000 875.71 50.42 56.96 3520.18 206.68 228.54
15000 1041.80 64.60 70.74 2818.97 185.33 204.80
16000 1256.73 90.12 94.78 2164.99 154.56 172.14
17000 1670.81 124.53 139.63 3145.36 250.59 276.93
18000 2580.28 205.81 227.81 2980.91 251.43 279.63
19000 2870.68 243.05 274.93 4871.67 435.33 476.97
20000 2750.36 256.88 288.15 4319.27 405.56 452.50
Tab. 2.9: Comparison of running times of linU2 and MTR on uncorrelated and weakly correlated data. cp
is the number of choice points, time the running time in seconds.
in efficiency becomes more noticeable, and linU2 slightly outperforms MTR (see Tables 2.7
and 2.9).
A drawback of the new methods is the need for an initial sorting step in the preprocessing in
which a profit and a weight ordering of all items are calculated. However, timing experiments
show that this initial step costs about 0.06 seconds for 10000 items and takes less than 0.01
seconds for 1000 items. According to Table 2.8, the total running time for these problem sizes is
much higher. Hence, the preprocessing time can be neglected in practice.
2.5.5 Cost-based Filtering for Knapsack Related Problems
Before summarizing our results on the filtering algorithms of knapsack constraints, we would
like to discuss their applicability to two special variants of the Knapsack Problem that have been
introduced in the literature.
Multidimensional Knapsack Problems The Multidimensional Knapsack Problem consists in
the maximization of a given profit function with respect to two or more given capacity constraints.
The problem can be viewed as a collection of $m$ Knapsack Problems sharing one objective:
\[
\begin{array}{lll}
\max & \sum_j p_j X_j & \\
\text{s.t.} & \sum_j w^i_j X_j \le C_i & i = 1, \dots, m \\
 & X_j \in \{0,1\} &
\end{array}
\tag{2.8}
\]
Thus, for each of the capacity constraints, we can define an optimization constraint and per-
form cost-based filtering using the propagation algorithms we just presented. This approach,
however, suffers a setback from the fact that the bounds computed in each optimization con-
straint ignore all constraints except one. Therefore, the bounds are not tight, and filtering is less
effective than it could and should be.
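The weakness of treating each capacity constraint in isolation can be seen on a tiny instance. The sketch below is ours and not part of the algorithms of this chapter: `dantzig_bound` computes the continuous-relaxation bound of a single 0-1 knapsack (greedy by efficiency, assuming positive integer profits and weights), and `brute_force_optimum` enumerates all assignments to obtain the true optimum under all constraints. The instance data are invented for illustration.

```python
from fractions import Fraction
from itertools import product


def dantzig_bound(profits, weights, capacity):
    """Continuous-relaxation upper bound for one 0-1 knapsack constraint."""
    order = sorted(range(len(profits)),
                   key=lambda j: Fraction(profits[j], weights[j]), reverse=True)
    bound, remaining = Fraction(0), capacity
    for j in order:
        if weights[j] <= remaining:
            bound += profits[j]
            remaining -= weights[j]
        else:
            # Critical item: only a fraction of it fits.
            bound += Fraction(profits[j] * remaining, weights[j])
            break
    return bound


def brute_force_optimum(profits, weight_rows, capacities):
    """Exact optimum under all capacity constraints (exponential, for tiny n only)."""
    best = 0
    for x in product((0, 1), repeat=len(profits)):
        feasible = all(sum(w * xj for w, xj in zip(row, x)) <= cap
                       for row, cap in zip(weight_rows, capacities))
        if feasible:
            best = max(best, sum(p * xj for p, xj in zip(profits, x)))
    return best


profits = [10, 8, 6]
weight_rows = [[5, 4, 1], [1, 4, 5]]   # one row of weights per capacity constraint
capacities = [5, 5]

single_bounds = [dantzig_bound(profits, row, cap)
                 for row, cap in zip(weight_rows, capacities)]
optimum = brute_force_optimum(profits, weight_rows, capacities)
print(single_bounds, optimum)
```

On this instance the two isolated bounds are 14 and 18, whereas the optimum under both constraints is 10, so even the better of the two single-constraint bounds leaves a considerable gap.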
In Chapter 3, we develop a generic method for linking filtering algorithms of linear opti-
mization constraints, the CP-based Lagrangian relaxation. When applied to Multidimensional
Knapsack Problems, problem reduction is based on the filtering routines of the individual knap-
sack constraints incorporating the other constraints in a Lagrangian objective. We will see that
this approach is clearly favorable compared to the loose connection of optimization constraints
that interact via domain reduction only.
Note, however, that the asymptotic complexity improvements that we introduced are lost
when applying the knapsack filtering algorithm in the context of CP-based Lagrangian relaxation,
because for each Lagrangian sub-problem, the objective changes. Thus, the efficiency ordering
has to be re-computed, which then dominates the algorithmic complexity. It is worth noting that
this problem does not occur when the filtering algorithms presented here are applied to column
generation sub-problems (as in CP-based column generation), because the objective remains
fixed for the entire tree search that is applied to compute a new column. Thus, the efficiency
ordering of the knapsack items has to be re-computed only when a new sub-problem is set up.
Bounded Knapsack Problems Bounded Knapsack Problems generalize the 0-1 KP by defin-
ing individual bounds on the solution vector:
\[
\begin{array}{ll}
\max & \sum_j p_j X_j \\
\text{s.t.} & \sum_j w_j X_j \le C \\
 & X_j \in \{0, 1, 2, \dots, u_j\}
\end{array}
\tag{2.9}
\]
The discussion in Section 2.5.1 on the Constrained Cutting Stock Problem has shown an
application of Bounded Knapsack Problems. Obviously, (2.9) can be transformed into a CKP by
replacing each original variable $X_j$ by $u_j$ new variables $X_j^k \in \{0,1\}$, $k = 1, \dots, u_j$. (Note that a
finite $u_j$ always exists, as $X_j \le C / w_j$.) Then the algorithms presented before could be applied.
That approach, however, artificially enlarges the number of variables and ignores the additional
structure of (2.9) completely.
We can do better by extending $U_1$ and $U_2$ to general integer bounds for KP. That is, we
choose the critical item as $s := \min\{ j \mid \sum_{i=1}^{j} u_i w_i > C \}$. Then $U_1$ can be re-written as
\[
U_1(KP) = \sum_{j=1}^{s-1} u_j p_j + \left\lfloor \frac{c\, p_s}{w_s} \right\rfloor,
\quad \text{where } c := C - \sum_{j=1}^{s-1} u_j w_j .
\]
For a detailed discussion of such generalizations, and an extension of $U_2$, we refer the reader
to [149, pp. 84ff.]. Using these extended bounds, efficient propagation for the Bounded Knapsack
Problem is then easily achieved by the algorithms proposed in Sections 2.5.3.1 and 2.5.3.2.
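As a small illustration of the extended bound, the following sketch computes $U_1$ for a Bounded Knapsack Problem directly on the bounded items and compares it with the detour via the binary expansion. It assumes positive integer profits, weights and bounds, and it rounds the contribution of the critical item down, as is usual for integer data. The function names and the instance are ours and purely illustrative.

```python
from fractions import Fraction
from itertools import product


def bounded_u1(profits, weights, bounds, capacity):
    """U1 for the Bounded Knapsack Problem: take items greedily by efficiency,
    each with its full multiplicity u_j, until the critical item s is reached."""
    order = sorted(range(len(profits)),
                   key=lambda j: Fraction(profits[j], weights[j]), reverse=True)
    value, remaining = 0, capacity
    for j in order:
        if bounds[j] * weights[j] <= remaining:
            value += bounds[j] * profits[j]
            remaining -= bounds[j] * weights[j]
        else:
            # j is the critical item s; add the rounded-down fractional profit.
            return value + (remaining * profits[j]) // weights[j]
    return value


def expand_to_binary(profits, weights, bounds):
    """Replace each item j by u_j identical 0-1 items."""
    items = [(p, w) for p, w, u in zip(profits, weights, bounds) for _ in range(u)]
    return [p for p, _ in items], [w for _, w in items]


def brute_force_bounded(profits, weights, bounds, capacity):
    """Exact optimum by enumeration (for tiny instances only)."""
    best = 0
    for x in product(*(range(u + 1) for u in bounds)):
        if sum(w * xj for w, xj in zip(weights, x)) <= capacity:
            best = max(best, sum(p * xj for p, xj in zip(profits, x)))
    return best


profits, weights, bounds, capacity = [10, 7, 3], [4, 3, 2], [2, 1, 3], 12

u1 = bounded_u1(profits, weights, bounds, capacity)
bin_profits, bin_weights = expand_to_binary(profits, weights, bounds)
u1_binary = bounded_u1(bin_profits, bin_weights, [1] * len(bin_profits), capacity)
optimum = brute_force_bounded(profits, weights, bounds, capacity)
print(u1, u1_binary, len(bin_profits), optimum)
```

Here the critical item is the third one, the bound is 20 + 7 + ⌊1 · 3 / 2⌋ = 28, and the true optimum is 27. The binary expansion yields the same bound but works on six variables instead of three.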
2.5.6 Summary
Based on relaxation bounds for KP, we introduced a reduction algorithm that runs in amortized
time $\Theta(n)$ for $\Omega(\log n)$ calls. The algorithm can be used efficiently as a propagation routine when
solving a combinatorial optimization problem that contains one or more knapsack constraints.
In a CP search, the efficiency of the algorithm developed depends on the number of choice
points and the time needed per choice point: The more choice points are investigated during
the search, the less dominant are the preprocessing times for initialization and sorting. Also, the
more time per choice point is spent by other routines (that, for instance, propagate additional
constraints of a CKP or calculate more expensive bounds on the objective), the more important
is an effective filtering behavior that justifies a higher effort spent per choice point.
Experiments show that the algorithms presented are as effective as another method based on
a reduction technique previously proposed by Martello and Toth for KP. The theoretical analysis
and numerical comparison show that the new filtering algorithm is asymptotically more efficient.
Chapter 3
Cost-based Filtering and Problem
Decomposition
In the previous chapter, we developed a tool box of efficient cost-based filtering algorithms for a
whole variety of important optimization constraints. None of these filtering algorithms is actually
useful when the problem corresponding to the constraint has to be solved: For the Shortest Path
Problem, the Weighted Stable Set Problem on interval graphs, and the Weighted Bipartite Match-
ing Problem there exist efficient polynomial time algorithms. Therefore, there is no need to apply
a tree search and domain filtering to solve these problems. As a matter of fact, the contrary is the
case: the filtering algorithms that we developed are based on the efficient algorithms available
to solve the corresponding optimization problems. The only NP-hard optimization problem that
we considered was the Knapsack Problem. However, even for this problem, it is not clear how
the state of the art algorithms for its solution could benefit from the filtering algorithms that we
developed.
On the other hand, real-life problems often consist in a combination of various constraints.
Frequently, the resulting problem is NP-hard, and the special composition even of well-known
optimization problems has not been studied before. Of course, for every such combination at
hand, the complexity of the resulting problem could be studied; questions regarding the approx-
imability of the problem may be answered; and algorithms for the predominant constraints may
successfully be adapted to efficiently handle the additional constraints. In general, sound theoretical
work will usually establish an understanding of the special augmented constraint structure,
and this approach will most likely yield the most efficient algorithms. Therefore, for composed
problems that are of great practical relevance or that occur very frequently, this way of construct-
ing an efficient algorithm for a specific problem is favorable and necessary.
In industrial practice, however, the efficiency of the resulting algorithm is not the only crite-
rion. Of course, faster algorithms providing solutions of very good or even provably high quality
are clearly favorable. On the other hand, there is an obvious need to develop stable software
solutions quickly, and rapid prototyping is of great importance, for example when a company
wants to enter a new market more quickly than its competitors, or when the problems that
have to be solved vary because the types of constraints to be obeyed change frequently in
flexible environments.
In constraint programming, a problem is represented as a set of constraints on variables with
finite domains. The standard algorithmic approach is to apply a tree search where, in every choice
point, constraints are propagated, i.e., they are used to shrink the domains of the corresponding
variables, if possible. This process of constraint propagation is repeated until a stable state is
reached where no values can be removed from variable domains anymore. Then a new branching
decision is made and the search continues.
This way, constraints interact only via the domains of their variables, which makes the approach
extremely flexible with respect to the addition or removal of constraints. Moreover, and in
contrast to linear programming, the types of constraints that can be used are not really predefined.
All that is needed is a domain filtering or at least a checking algorithm for every constraint that is
used in the problem model. These algorithms can be tailored for a given constraint. Standard CP
solvers like ILOG SOLVER [121] even offer the possibility to compose constraints out of a set of
basic logic and algorithmic constraints, which facilitates the software development process and
gives less room for mistakes in the implementation.
In this setting, when given a discrete optimization problem, we may be able to identify sub-
structures that match one of the optimization constraints considered in the previous chapter.
Then we can simply plug in the constraint and use its corresponding filtering algorithm. Prob-
lem tightening with respect to cost considerations that used to be highly problem-dependent can
be handed over to standard libraries that take this task over. This results in a faster and safer
software development.
There is a price to pay, however. The way in which constraint programming decomposes a problem
is very weak, because only one constraint is considered at a time. This allows local inconsistencies
to be resolved very efficiently, but on the other hand the approach lacks a global view
on a problem, which is particularly bad with respect to the computation of meaningful bounds
on the objective.
In the following, we show how to improve upon this situation by making use of two standard
decomposition methods in operations research: column generation and Lagrangian relaxation.
The results of Section 3.1 were published in [128, 129], and parts of Sections 3.2 and 3.3 were
published in [188, 189, 190].
3.1 CP-based Column Generation
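Given a natural number $n \in \mathbb{N}$ and a finite set$^1$ $X \subset \mathbb{R}^n$, we consider the following discrete
optimization problem that consists of two constraint families $A$: $Ax \le b$, and $B$: $Bx \le e$, $x \in X$:$^2$
\[
\begin{array}{ll}
\text{Minimize} & LP_P = c^T x \\
\text{subject to} & Ax \le b \\
 & Bx \le e \\
 & x \in X
\end{array}
\]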
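The convex hull of solutions to $B$ defines a compact polytope in $\mathbb{R}^n$. Let $D = (d_1, \dots, d_L)$ denote
the matrix that consists of one column for each corner of this polytope. Then, each solution of
the system $B$ can be written as a convex combination of the columns of $D$, i.e., for all $x \in X$ with
$Bx \le e$ there exist $\lambda_1, \dots, \lambda_L \ge 0$ such that $\sum_i \lambda_i = 1$ and $x = D\lambda$. Therefore, $LP_P$ can be rewritten
as
\[
\begin{array}{ll}
\text{Minimize} & LP_C = c^T D \lambda \\
\text{subject to} & A D \lambda \le b \\
 & \sum_{i \le L} \lambda_i = 1 \\
 & \lambda \ge 0 \\
 & D\lambda \in X
\end{array}
\]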
We achieve a linear continuous relaxation by omitting the discrete constraint $D\lambda \in X$. The advantage
of the above re-formulation is that there is no cross-talking between the constraints in $A$ and
$B$ anymore. However, in general the matrix $AD$ will contain far too many columns to allow an
explicit representation. Fortunately, such a representation is not needed to solve the corresponding
LP, because the simplex algorithm considers only one column at a time. Therefore, columns
can simply be generated when needed. This idea gave rise to the concept of column generation,
and it is one of the most frequently used techniques in linear programming practice.
The origins of column generation date back to the works of Dantzig and Wolfe [51] and
Gilmore and Gomory [96]. The latter paper applies column generation to the classical Cutting
Stock Problem where the sub-problem is a Knapsack Problem. More recent applications include
specially structured integer programs such as the Generalized Assignment Problem, Time Con-
strained Vehicle Routing, Crew Pairing, Crew Assignment and related problems. We refer the
reader to [56] for a survey.
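The procedure works as follows: We start with a sub-matrix $\bar{D} = (d_1, \dots, d_k)$ and solve the
reduced system
\[
\begin{array}{ll}
\text{Minimize} & LP_R(\bar{D}) = c^T \bar{D} \lambda \\
\text{subject to} & A \bar{D} \lambda \le b \\
 & \sum_{i \le k} \lambda_i = 1 \\
 & \lambda \ge 0
\end{array}
\]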
$^1$ A typical example is $X = \{0,1\}^n$.
$^2$ Here and in the following we identify the name of an LP (here $LP_P$) and its optimal objective value.
that is called the master problem. Denote the dual of the convex combination constraint by $\pi$,
and let $\mu$ denote the vector of duals of the constraints $A\bar{D}\lambda \le b$. We want to use the dual data to
generate a new column that has the potential to reduce the costs in the master problem. In the
simplex algorithm, those columns are determined with the help of reduced costs, which must be
negative. Thus, we consider the sub-problem
\[
\begin{array}{ll}
\text{Minimize} & LP_S(\mu) = (c^T - \mu^T A)\, x \\
\text{subject to} & Bx \le e \\
 & x \in X
\end{array}
\]
If $LP_S(\mu) < \pi$, we add the solution $d_{k+1} := x$ to the master matrix $\bar{D}$ and start over with the
next iteration by re-optimizing the enlarged master problem $LP_R(d_1, \dots, d_{k+1})$. Otherwise, the
process stops, and we achieve a valid lower bound on $LP_P$. Since the solution computed will in
general not fulfill $D\lambda \in X$, the remaining gap between upper and lower bound has to be closed in
a branch & price approach. We refer the reader to Barnhart et al. [13] for further information on
this topic.
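The pricing step of this procedure can be sketched as follows. The sketch is ours and makes several simplifying assumptions: $X = \{0,1\}^n$, the family $B$ is written as $Bx \le e$, the duals $\mu$ and $\pi$ are simply given (in practice they come from an LP solver that re-optimizes the master problem), and plain enumeration stands in for the tree search with cost-based filtering that would be used for a realistic sub-problem. The function name `price_column` and all data are hypothetical.

```python
from itertools import product


def price_column(c, A, mu, pi, B, e):
    """Solve the sub-problem min (c^T - mu^T A) x  s.t.  Bx <= e, x in {0,1}^n
    by enumeration and report whether the best column has negative reduced cost."""
    n = len(c)
    # Modified objective coefficients c_j - sum_i mu_i * A[i][j].
    modified = [c[j] - sum(mu[i] * A[i][j] for i in range(len(A))) for j in range(n)]
    best_x, best_value = None, None
    for x in product((0, 1), repeat=n):
        if all(sum(B[r][j] * x[j] for j in range(n)) <= e[r] for r in range(len(B))):
            value = sum(modified[j] * x[j] for j in range(n))
            if best_value is None or value < best_value:
                best_x, best_value = x, value
    # The column enters the master problem only if its value is below pi.
    improving = best_value is not None and best_value < pi
    return best_x, best_value, improving


c = [4, 3, 5]
A = [[1, 1, 0], [0, 1, 1]]     # linking constraints (family A)
mu = [3, 2]                    # assumed duals of the linking constraints
B = [[1, 1, 1]]                # sub-problem constraints (family B)
e = [2]

column, value, improving = price_column(c, A, mu, pi=-1, B=B, e=e)
print(column, value, improving)
_, _, improving_again = price_column(c, A, mu, pi=-3, B=B, e=e)
print(improving_again)
```

With the duals above, the modified costs are (1, -2, 3), so the best column is (0, 1, 0) with value -2. It is added to the master problem for $\pi = -1$, while for $\pi = -3$ no improving column exists and the process stops.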
If a discrete optimization problem can be decomposed in the way we just described, the sub-problem
may be viewed as a constraint satisfaction problem where the set $X$ is defined by a set
of additional constraints. A typical example of such a sub-problem is the Constrained Knapsack
Problem that arises, e.g., when solving the Constrained Cutting Stock Problem with the help
of column generation. Another important class of sub-problems are Constrained Shortest Path
Problems that arise in many contexts that range from route guidance [123] and duty scheduling
in public transit [25] up to the scheduling of switching engines [142]. The crew scheduling
application that we consider in Chapter 5 is another example where the sub-problem exhibits the
structure of a Constrained Shortest Path Problem.
For real-life applications, the additional constraints defining $X$ can exhibit very complicated
structures, such as gliding time window constraints in Airline Crew Assignment.
Also, the additional constraints may vary from case to case. A constraint propagation approach
can easily cope with that situation.
In that context, the advantage of the problem decomposition consists in the fact that the
constrained sub-problem does not contain the restrictions of $A$ anymore. Standard CP modeling
would also separate the families $A$ and $B$. However, when considering $B$, the constraints in $A$
are simply ignored, which can have a severely negative impact on the bounds used for cost-based
filtering. Using the above decomposition, we also consider the constraint family $B$ only, but in
combination with changing objectives that reflect the constraints in $A$. Therefore, we achieve a
global view on the problem and tighter bounds that are used for a much more effective domain
filtering.
Of course, our hope is to find a decomposition such that we can identify a predominant
optimization constraint in the sub-problem that can be used for an efficient cost-based filtering in
the column generation process. Provided with such a decomposition, the algorithms developed
in the previous chapter can help to solve these problems efficiently.
3.2 CP-based Lagrangian Relaxation
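Given a natural number $n$ and vectors $l \le u \in \mathbb{N}^n$, we consider an integer linear optimization
problem (IP) consisting of the two constraint families $A$: $Ax \le b$, $x_i \in \{l_i, \dots, u_i\}$, and $B$: $Bx \le d$,
$x_i \in \{l_i, \dots, u_i\}$:
\[
\begin{array}{ll}
\text{Minimize} & L = c^T x \\
\text{subject to} & Ax \le b \\
 & Bx \le d \\
 & x_i \in \{l_i, \dots, u_i\}
\end{array}
\]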
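A common way to achieve a lower bound $\bar{L}$ on such a problem is to drop the integrality constraints
$x_i \in \{l_i, \dots, u_i\}$ and to replace them by $l_i \le x_i \le u_i$ instead. We get
\[
\begin{array}{ll}
\text{Minimize} & \bar{L} = c^T x \\
\text{subject to} & Ax \le b \\
 & Bx \le d \\
 & l \le x \le u
\end{array}
\]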
Now, to achieve a state of relaxed $\bar{L}$-consistency we could of course solve a series of LPs
$\bar{L}[x_i = v]$ where we set some variable $x_i$, $1 \le i \le n$, to some value $v \in \{l_i, \dots, u_i\}$. Then, given an upper
bound $B$, we can eliminate $v$ from the domain of $x_i$ if $\bar{L}[x_i = v] > B$. Note that, due to
$\bar{L}[x_i \le v] \ge \bar{L}[x_i \le w]$ for all $w \ge v$ (the lower bound constraints follow analogously), this procedure will not
split the domains of the variables $x$. That is, after the filtering the domains of the variables $x_i$ can
again be represented as $x_i \in \{\hat{l}_i, \dots, \hat{u}_i\}$ for some $\hat{l}_i \ge l_i$ and $\hat{u}_i \le u_i$, $1 \le i \le n$.
The problem with the previous probing procedure is that it requires re-optimizing a dual
feasible LP many times, and this is usually unattractive with respect to the required computation
time. Therefore, it has been suggested to estimate the loss in performance by carrying out exactly
one dual re-optimization step. This method is known as reduced-cost filtering. It is computation-
ally cheap, but since it only indirectly exploits the structure of the problem it has a tendency to
be rather ineffective.
To improve the inherent trade-off between computational effort and effectiveness, we try to
decompose the problem. Assume that efficient filtering algorithms Prop(A) and Prop(B) exist
that achieve a state of relaxed consistency for the constraint families $A$ and $B$, respectively. The
obvious approach to solve problem $L$ exactly is to apply a branch-and-bound algorithm using
linear relaxation bounds for pruning and the existing filtering algorithms Prop(A) and Prop(B)
to tighten the problem formulation in every choice point.
However, even though Prop(A) and Prop(B) may be effective for the substructures they have
been designed for, their application for the combined problem is usually not. This is because
tight bounds on the objective cannot be obtained by taking only a subset of the restrictions into
account. An accurate bound on the overall problem can only be computed by looking at the
entire problem, i.e., it cannot be achieved by looking at either one constraint family only.
Lagrangian relaxation allows us to bring together the advantages of a tight global bound and
the existing filtering algorithms that exploit the special structure of their respective constraint
families. The idea of Lagrangian relaxation was first presented in [59] for Resource Allocation
Problems. Held and Karp used it for the TSP [107, 108], and it has been applied in many different
areas since then. For a general introduction we refer the reader to [1]. The method that we present
in the following is somewhat related to that in [80] where Focacci et al. introduce a method to
strengthen cost-based filtering by using Lagrangian multipliers to incorporate additional cuts to
tighten the bound used for propagation.
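For our abstract composed problem, we introduce a vector of Lagrange multipliers $\lambda \ge 0$ and
define the Lagrangian sub-problem
\[
\begin{array}{ll}
\text{Minimize} & L_B(\lambda) = c^T x + \lambda^T (Ax - b) \\
\text{subject to} & Bx \le d \\
 & x_i \in \{l_i, \dots, u_i\}
\end{array}
\]
For every choice of $\lambda \ge 0$, $L_B(\lambda)$ is a lower bound on $L$. Then the Lagrange multiplier problem
or Lagrangian dual consists in finding the maximum lower bound that can be achieved:
\[
\begin{array}{ll}
\text{Maximize} & G = L_B(\lambda) \\
\text{subject to} & \lambda \ge 0
\end{array}
\]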
Lemma 3.1 Given $1 \le j \le n$ and a value $v \in \{l_j, \dots, u_j\}$, let $L_B(\lambda)[x_j = v]$ denote the IP that evolves
when adding the constraint $x_j = v$ to $L_B(\lambda)$. Furthermore, let $B$ denote an upper bound on
the objective of $L$ such that $\bar{L}[x_j = v] > B$. Finally, denote the continuous relaxation of $L_B(\lambda)$ by
$\bar{L}_B(\lambda)$. Then there exists a vector $\lambda \ge 0$ such that $L_B(\lambda)[x_j = v] \ge \bar{L}_B(\lambda)[x_j = v] > B$.
Proof: Let $\lambda \ge 0$ denote a vector of optimal dual values of the constraint family $A$ in $\bar{L}[x_j = v]$.
The theory of Lagrangian relaxation shows that the vector $\lambda$ defines optimal Lagrange multipliers
for $\bar{L}_B(\lambda)[x_j = v]$. Therefore, $L_B(\lambda)[x_j = v] \ge \bar{L}_B(\lambda)[x_j = v] = \bar{L}[x_j = v] > B$.
To put the result into words: Lemma 3.1 shows that for every variable $x_j$ and value
$v \in \{l_j, \dots, u_j\}$ that can be filtered with respect to the relaxation $\bar{L}$, there exists a vector of Lagrange
multipliers that allows us to filter this value with respect to the constraint family $B$ only. Of course,
due to symmetry, the same result holds when we relax $B$ and keep the constraints in $A$ as hard
constraints only.
This observation motivates the following procedure: We compute $G$ with the help of an iterative
algorithm for the maximization of a piece-wise linear, concave function. Standard algorithms
used in the literature are subgradient algorithms or bundle methods [1]. For every selection of
multipliers $\lambda \ge 0$, $L_B(\lambda)$ is a valid lower bound on $L$. Thus, we can apply Prop(B) on the constraint
family $B$ every time we solve the Lagrangian sub-problem $L_B(\lambda)$. Of course, our
hope is that, while solving the Lagrangian dual, we traverse most relevant selections
of Lagrange multipliers, which will result in a filtering that almost achieves a state of relaxed
$\bar{L}$-consistency.
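The procedure can be illustrated on a deliberately small instance. The following sketch is ours and rests on simplifying assumptions: the family $A$ consists of a single relaxed row $a^T x \le b$, the family $B$ contains nothing but the 0-1 domains (so that the Lagrangian sub-problem is solved by inspection and the bound for a fixed assignment is exact), the step size in iteration $k$ is $1/k$, and an upper bound is given. Values are only marked as removable and the domains are left untouched while the dual is being optimized. Exact rational arithmetic avoids rounding issues in the comparison with the upper bound.

```python
from fractions import Fraction


def lagrangian_filter(c, a, b, upper_bound, iterations=10):
    """Subgradient search for  min c^T x  s.t.  a^T x <= b (relaxed), x in {0,1}^n.
    Returns the best lower bound seen and the set of assignments (j, v) whose
    Lagrangian bound exceeds upper_bound for at least one multiplier visited."""
    n = len(c)
    lam = Fraction(0)
    best = None
    removable = set()
    for k in range(1, iterations + 1):
        r = [c[j] + lam * a[j] for j in range(n)]         # Lagrangian costs
        x = [1 if rj < 0 else 0 for rj in r]              # sub-problem solution
        value = sum(min(0, rj) for rj in r) - lam * b     # L_B(lambda)
        best = value if best is None else max(best, value)
        for j in range(n):
            base = value - min(0, r[j])                   # bound with x_j = 0
            if base > upper_bound:
                removable.add((j, 0))
            if base + r[j] > upper_bound:                 # bound with x_j = 1
                removable.add((j, 1))
        g = sum(a[j] * x[j] for j in range(n)) - b        # subgradient a^T x - b
        lam = max(Fraction(0), lam + Fraction(g, k))      # projected step
    return best, removable


# Maximize 6 x0 + 5 x1 + 4 x2 with 3 x0 + 2 x1 + 2 x2 <= 4, written as a
# minimization problem; the incumbent (0, 1, 1) has objective value -9.
c, a, b = [-6, -5, -4], [3, 2, 2], 4
best_bound, removable = lagrangian_filter(c, a, b, upper_bound=-9)
print(best_bound, sorted(removable))
```

In this run the multipliers visited are 0, 3, 1, 2, 3/2, 21/10, and so on. The assignment $x_1 = 0$ is marked at $\lambda = 2$ and the assignment $x_0 = 1$ at $\lambda = 21/10$, so different values are detected at different multipliers, which is exactly why filtering in every iteration of the dual optimization pays off.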
If we still find that this filtering procedure does not sufficiently reduce the variables' domains,
we can do even more. Consider the other possible decomposition
\[
\begin{array}{ll}
\text{Minimize} & L_A(\pi) = c^T x + \pi^T (Bx - d) \\
\text{subject to} & Ax \le b \\
 & x_i \in \{l_i, \dots, u_i\}
\end{array}
\]
Given a current selection of Lagrangian multipliers $\lambda \ge 0$, denote the optimal dual values of
$Bx \le d$ in the continuous relaxation of $L_B(\lambda)$ by $\pi_\lambda \ge 0$.
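Lemma 3.2 Denote the continuous relaxation of $L_A(\pi_\lambda)$ by $\bar{L}_A(\pi_\lambda)$. Then, $\bar{L}_A(\pi_\lambda) \ge \bar{L}_B(\lambda)$.
Proof: Denote the dual of $\bar{L}_A(\pi_\lambda)$ by $D_A(\pi_\lambda)$. By assumption, the vector $\pi_\lambda$ is dual optimal
for $\bar{L}_B(\lambda)$. Let $\mu_\lambda \ge 0$ and $\nu_\lambda \ge 0$ denote the optimal duals for the constraints $x \le u$ and $x \ge l$ in
$\bar{L}_B(\lambda)$, respectively. Then, due to strong LP duality, it holds that
\[
\bar{L}_B(\lambda) = -d^T \pi_\lambda - \mu_\lambda^T u + \nu_\lambda^T l - \lambda^T b \quad \text{(optimality)},
\]
and
\[
c^T + \lambda^T A + \pi_\lambda^T B + \mu_\lambda^T - \nu_\lambda^T = 0 \quad \text{(feasibility)}.
\]
Therefore, $\lambda$, $\mu_\lambda$ and $\nu_\lambda$ are feasible solutions to $D_A(\pi_\lambda)$ with the objective value $\bar{L}_B(\lambda)$. Thus,
$\bar{L}_A(\pi_\lambda) = D_A(\pi_\lambda) \ge \bar{L}_B(\lambda)$.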
As a simple consequence, we get the following
Corollary 3.1 If the Lagrangian relaxation $L_B(\lambda)$ exhibits the integrality property, it holds that:
$L_A(\pi_\lambda) \ge L_B(\lambda)$.
Lemma 3.2 and Corollary 3.1 show that the duals $\pi_\lambda$ of $\bar{L}_B(\lambda)$ are a good candidate to achieve
an improved lower bound $L_A(\pi_\lambda)$ on $L$. This observation motivates the idea to improve the
effectiveness of the filtering algorithm by applying Prop(A) to $L_A(\pi_\lambda)$ in every or at least some
iterations of the algorithm that maximizes the Lagrangian dual.
We put the ideas together. Two linear optimization constraint families $A$ and $B$ for which
efficient filtering algorithms Prop(A) and Prop(B) are known can be combined effectively: we
compute Lagrangian multipliers for $A$ and use Prop(B) for filtering in each Lagrangian sub-problem
$L_B(\lambda)$. Then, in selected Lagrangian iterations, we hand back optimal dual information
$\pi_\lambda$ of $\bar{L}_B(\lambda)$ to propagate $A$, i.e., we apply Prop(A) on $L_A(\pi_\lambda)$.
3.3 Remarks and Generalizations
3.3.1 Solving the Lagrangian Dual and Idempotence
When using CP-based Lagrangian relaxation, after having shrunk the domains of the variables, the
immediate re-application of the filtering algorithm may yield a further reduction of the domains.
This effect is caused by the algorithms used for the maximization of the Lagrangian dual (such as
subgradient algorithms, bundle methods, or the volume algorithm [11]), which will in
general proceed differently when the domains of the variables are changed. As a result, different
Lagrangian multipliers and sub-problems are investigated, which also leads to a different
filtering behavior. As a consequence, the filtering procedure as described is not idempotent [6].
Moreover, it is not clear whether domain reduction should actually take place during the
optimization of the Lagrangian dual. We are safe if we just mark those values that can be deleted
from variable domains and postpone the actual reduction until the Lagrangian dual is solved. On
the other hand, it may also be favorable to incorporate the new knowledge as early as possible. It
is subject to further research to investigate how, e.g., a subgradient search can cope with changing
problems, and whether convergence can still be proven in such a scenario. A practical application
of this procedure will be evaluated in Chapter 7.
3.3.2 Redundant Constraint Generation
Since the filtering behavior of the reduction algorithm based on Lagrangian relaxation relies on
the sub-problems investigated during the optimization of the Lagrangian dual, we cannot be sure
that our cost-based filtering algorithm exhibits a property that we call continuity:
Let $B$ denote an upper bound on the minimization problem $L$, let $C$ denote the current choice
point and $L_C[x = v]$ the best bound achieved regarding the removal of $v$ from the domain of $x$ in
$C$. Now assume that we have $\delta := B - L_C[x = v] \ge 0$ for some variable $x$ and $v$ in the domain
of $x$. Assume further that a primal heuristic finds a new upper bound $\bar{B} < B - \delta$ next. We call
a cost-based filtering algorithm continuous, if it is guaranteed that in every child node $D$ of the
current choice point $C$ it is detected that $v$ can be removed from the domain of $x$.
When using Lagrangian decomposition, this is not the case. Let $\lambda \ge 0$ denote Lagrangian
multipliers such that $L_B(\lambda)[x = v] = L_C[x = v]$. Then we cannot be sure that, when performing
problem reduction in $D$, the algorithm optimizing the Lagrangian dual will investigate the Lagrangian
multipliers $\lambda$. Thus, it may very well be the case that $L_B(\lambda)[x = v] > L_D[x = v]$ and
$\bar{B} \ge L_D[x = v]$.
To overcome this problem, we suggest storing, for each variable-value assignment, the value
$L_C[x = v]$ of the largest lower bound achieved so far. This procedure may be viewed as a generation
of redundant local constraints of the form: $L \ge L_C[x = v]$ or $x \ne v$.
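A minimal sketch of such a store of redundant local constraints is given below. The class name `BoundStore` and its methods are ours and purely illustrative. The sketch relies on the fact that a lower bound obtained for an assignment in a choice point remains valid in the entire subtree below it, since additional branching decisions can only restrict the problem further.

```python
class BoundStore:
    """Keeps, for every assignment (variable, value), the largest lower bound
    on the objective that has been established so far."""

    def __init__(self):
        self.best = {}

    def record(self, variable, value, lower_bound):
        """Remember lower_bound for variable = value if it improves the stored one."""
        key = (variable, value)
        if key not in self.best or lower_bound > self.best[key]:
            self.best[key] = lower_bound

    def removable(self, upper_bound):
        """Assignments whose stored bound exceeds the current upper bound."""
        return {key for key, bound in self.best.items() if bound > upper_bound}


store = BoundStore()
# Bounds observed for x = 3 at different Lagrange multipliers, and one for y = 1.
store.record("x", 3, 95)
store.record("x", 3, 97)
store.record("x", 3, 96)
store.record("y", 1, 80)

before = store.removable(upper_bound=100)   # nothing can be removed yet
after = store.removable(upper_bound=96)     # a primal heuristic improved the bound
print(before, after)
```

With the upper bound 100 no value can be removed. As soon as the upper bound drops to 96, the stored bound 97 suffices to remove the value 3 from the domain of x, regardless of which multipliers the dual optimization visits in the child nodes.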
3.3.3 Linking more than Two Optimization Constraints
The procedure sketched can easily be generalized if the linking of more than two constraints
is desired. All we need to do is to select the substructure that determines the Lagrangian sub-
problem, i.e., the one that is used to guide the algorithm for the solution of the Lagrangian dual.
In selected iterations, we apply the filtering algorithm for the other substructures with a modified
objective function. That modification is determined by the dual values of the family of constraints
in the Lagrangian sub-problem and the Lagrange multipliers for the remaining substructures.
3.3.4 Linear Relaxations and Cuts
If continuous bounds are preferred to bounds based on Lagrangian relaxations, it is also possible
to use dual values instead of Lagrange multipliers to modify the objective functions for the respective
sub-problems we want to apply a filtering algorithm on. We still use the terminology of
a linking method based on Lagrangian relaxation, as we use Lagrangian objectives for cost-based
filtering.
Of course, the method can also be used in combination with tightening algorithms such as
cut generators. We simply incorporate all additional cuts as a new family of constraints we have
to find Lagrange multipliers (or dual values) for.
3.3.5 Binary IPs
Interestingly, as a special case we achieve a propagation algorithm for binary IPs. Given
$A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $p \in \mathbb{R}^n$, we consider the
following binary program:
\[
\begin{array}{ll}
\text{Maximize} & p^T x \\
\text{subject to} & Ax \leq b \\
 & x \in \{0,1\}^n
\end{array}
\]
The problem can be viewed as a combination of $m$ Knapsack Problems. Assuming that we solve
the continuous relaxation to compute an upper bound, let $\pi \in \mathbb{R}^m$ and
$\mu \in \mathbb{R}^n$ denote the optimal solution to the dual problem, i.e., $\pi$ and $\mu$
solve the following linear problem:
\[
\begin{array}{ll}
\text{Minimize} & b^T \pi + 1^T \mu \\
\text{subject to} & A^T \pi + \mu \geq p \\
 & \pi, \mu \geq 0
\end{array}
\]
Let $1 \leq i \leq m$, and let ${}_iA \in \mathbb{R}^{(m-1) \times n}$ denote the matrix that
evolves from $A$ by erasing row $i$, and ${}_ib, {}_i\pi \in \mathbb{R}^{m-1}$ the vectors that
evolve by erasing component $i$ from $b$ and $\pi$, respectively. Furthermore, let $a_i$ denote
the $i$-th row of matrix $A$. Then, for every $1 \leq i \leq m$, we perform domain reduction
with respect to the following Knapsack Problem:
\[
\begin{array}{ll}
\text{Maximize} & (p^T - {}_i\pi^T \, {}_iA)\, x + {}_i\pi^T \, {}_ib \\
\text{subject to} & a_i x \leq b_i \\
 & x \in \{0,1\}^n
\end{array}
\]
Thus, as a special application of CP-based Lagrangian relaxation, we achieve an effective
filtering algorithm for binary IPs that runs in $\Theta(mn \log n)$ (using one of the knapsack
filtering algorithms described in Section 2.5) after we have found optimal dual values of the
continuous relaxation.
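To make the construction of the $m$ knapsack sub-problems concrete, the following Python
sketch builds the data of the sub-problem for one row. It is a minimal illustration only: the
function name knapsack_subproblem is hypothetical, rows are indexed from 0 rather than 1, and
the dual values pi are assumed to be given (for example by an LP solver). The knapsack
filtering step itself is not shown.

```python
from typing import List, Sequence, Tuple


def knapsack_subproblem(
    A: Sequence[Sequence[float]],
    b: Sequence[float],
    p: Sequence[float],
    pi: Sequence[float],
    i: int,
) -> Tuple[List[float], float, List[float], float]:
    """Return (profits, constant, weights, capacity) for row i (0-based).

    All rows except i are moved into the objective, weighted by their
    dual values; row i is kept as the knapsack constraint.
    """
    m, n = len(A), len(p)
    others = [k for k in range(m) if k != i]
    # Modified profits: p_j minus the dual-weighted column entries of the other rows.
    profits = [p[j] - sum(pi[k] * A[k][j] for k in others) for j in range(n)]
    # Constant term contributed by the right-hand sides of the other rows.
    constant = sum(pi[k] * b[k] for k in others)
    return profits, constant, list(A[i]), b[i]


A = [[2, 3], [4, 1]]
b = [4, 5]
p = [5, 4]
pi = [1.0, 0.5]          # assumed dual values, for illustration only
sub0 = knapsack_subproblem(A, b, p, pi, 0)
sub1 = knapsack_subproblem(A, b, p, pi, 1)
```

Each of the resulting sub-problems can then be handed to a knapsack filtering routine, with
the constant added to the bound it computes.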
3.3.6 Column Generation vs. Lagrangian Relaxation
Finally, we compare the two reduction methods developed in the previous two sections. Column
generation and Lagrangian relaxation are identical with respect to the structure of the
sub-problems they investigate. The only difference lies in the way the sub-problems are
obtained. In column generation, the penalties are determined by the duals of a linear program,
the master problem, whereas in Lagrangian relaxation the penalties are updated with respect to
subgradients or variants of them. In practice, an average master iteration in column generation is
far more costly than in the Lagrangian relaxation setting, but in return far fewer iterations are
necessary to obtain a satisfactory solution value.
The filtering methods we described take this important difference into account: In column
generation, we suggest considering a CP-based tree search to solve the sub-problem, and using
optimization constraints like the ones developed in the previous chapter to ensure the generation
of columns with negative reduced costs. The application of a tree search is affordable, because
the re-optimization of the master problem is rather costly itself, and because the total number of
master iterations is usually low.
On the other hand, we suggest using Lagrangian relaxation within a tree search to obtain
lower bounds on the objective. Then, in every Lagrange iteration, we can use optimization
constraints for cost-based filtering that determine substructures of the optimization problem at
hand.
Chapter 4
Symmetry Breaking
In the previous chapters, we have studied the interplay between the objective functions and the
constraints of discrete optimization problems. Now we want to focus on a special aspect of the
constraint structure of many constraint satisfaction or optimization problems: Symmetry.
Symmetries can give rise to severe problems for exact and heuristic algorithms as equivalent
search regions are unnecessarily explored more than once. Generally, there are two ways
of handling symmetries. The first one is to model the problem in such a way that no or at
least fewer symmetries remain. This may also imply adding constraints that are satisfied by
only one assignment in each equivalence class. The major disadvantage of this approach is
that it requires the user to have a certain level of experience, and sometimes it is not even possible
to remove symmetries from a problem formulation as they are inherent to the given problem. The
second way is to break symmetries while searching for a solution. This can be done by adding
new constraints on backtracking. Those constraints can be used for domain filtering or, if the
detection of symmetries appears to be rather expensive, only for pruning.
The standard approach for breaking symmetries is to model a given discrete optimization or
constraint satisfaction problem in some clever and often non-intuitive way. These re-formulations
are usually highly problem specific and not generic. In recent years, symmetry breaking has been
studied more systematically. In [181], Rothberg presents ways to remove symmetries from mixed
integer problems by using cuts. Sherali and J.C. Smith discuss the effectiveness of adding
constraints to a basic model in a number of case studies [201]. In [93], Gent and B. Smith develop
a generic approach called Symmetry Breaking During Search (SBDS). In every choice point,
SBDS may extend the model dynamically by adding symmetry breaking constraints. For the
Social Golfer Problem, this approach has been shown to be efficient in combination with refined
problem formulations which are used to remove some symmetry already in the model [202]. As
the number of symmetries in the given problem is enormous, the approach presented is not able
to detect all of them and thus also gives non-unique solutions. In [155], Meseguer and Torras
introduce a symmetry avoiding approach that works by adapting the search strategy.
We introduce a method that also detects symmetries within the search procedure. Every time
the search algorithm generates a new choice point, we check if it is equivalent to or dominated
by a node that has been expanded earlier. If so, the current choice point can be pruned. If not, it
is processed normally. By checking whether a value assignment to a variable yields a symmetric
search node, we can also use symmetries to shrink the domains of variables. However, that
propagation can be very costly, and therefore it is not suited in all cases. As the method is based
on the detection of dominance relations between sub-trees, we call it Symmetry Breaking via
Dominance Detection (SBDD).
The method that we present in the following was also developed independently by Focacci
and Milano [81] who presented their work at the same conference. In a later work, the idea of
dominance detection between choice points was extended to achieve a method for the heuristic
pruning of search nodes when solving discrete optimization problems [82].
The work presented in this chapter was published in [63]. It is structured as follows: In
Section 4.1, we formally introduce the SBDD approach. In Sections 4.2, 4.3, and 4.4, it is
applied to three different examples from combinatorial optimization and combinatorial design.
Numerical results are given that illustrate the effectiveness of the approach.
4.1 Symmetry Breaking by Dominance Detection
The goal of breaking symmetries is to avoid the exploration of a search subspace that can be
mapped into a previously considered part of the search space via a symmetry function. For if the
previously considered part does not contain any solution, then neither does the symmetric
subspace. Otherwise, all solutions in the subspace are symmetric to those already computed in
the previously considered part. Thus, symmetries can be used to prune the search tree, and also
to remove values from variable domains that would lead the search to a symmetric part of the
search space. Before we outline the concept formally, we introduce some helpful definitions.
Definition 4.1 Let $X = \{x_1, \dots, x_n\}$ denote the set of variables of the model to solve,
and let $D(x)$ denote the domain of a variable $x \in X$. The tuple
$P_c := (D_c(x_1), \dots, D_c(x_n))$ denotes the current state in choice point $c$. We refer to
the representation $P_c$ as a pattern.
Definition 4.2 Let $P_c = (D_c(x_1), \dots, D_c(x_n))$, $P_{c'} = (D_{c'}(x_1), \dots, D_{c'}(x_n))$
denote two patterns.
We say that $P_{c'}$ includes $P_c$ and write $P_c \subseteq P_{c'}$, iff
$\forall x \in X: D_c(x) \subseteq D_{c'}(x)$.
We set $M(D_c) := D_c(x_1) \times \dots \times D_c(x_n)$.
Given a symmetry mapping function $\varphi: M(D_c) \to M(D_c)$, we say that $P_{c'}$ dominates
$P_c$ (under the symmetry $\varphi$), iff $\varphi(P_c) \subseteq P_{c'}$. Then, we write
$P_c \preceq P_{c'}$.
Fig. 4.1: The concept of SBDD, shown in panels (a), (b), and (c).
Due to the monotonicity of filtering algorithms [6], we have the following
Property 4.1 Given two choice points $c$ and $c'$, where $c'$ is a successor of $c$ in the
search tree. Then, it holds that: $P_{c'} \subseteq P_c$.
The approach that we suggest for pruning symmetric parts of the search space is very simple
and straightforward, but to the best of our knowledge it had not been considered
before. The method is based on the following ingredients:
- A database $T$ that stores information on the search space already explored.
- A problem specific function $\Phi: (P, P') \mapsto \{false, true\}$ that yields true iff the
  pattern $P$ is dominated by $P'$ under some symmetry function $\varphi$.
- If symmetries shall also be used for propagation, a similar function is needed that, for all
  variables $x$, removes all values $b$ from the domain of $x$ for which
  $\Phi(P[x = b], P') = true$.
In every choice point, we check whether the current pattern $P$ is dominated by some
previously considered pattern in $T$. If so, the current node is pruned. Otherwise, we can use the
function $\Phi$ for propagation. Thus, we perform Symmetry Breaking via Dominance Detection
(SBDD). Figure 4.1 visualizes the general procedure. White nodes are still active, black nodes
have been fully expanded already. Boxes represent patterns in $T$, circles are patterns not or no
longer contained in $T$. Finally, the current node is marked separately. Originally, the pattern
of the current node must be checked against all fully expanded nodes (see Figure 4.1(a)).
Obviously, it is problematic if we are to store all expanded nodes in $T$. In the next section,
we describe how to handle $T$ efficiently for depth first search (DFS). Later we will generalize
the result to arbitrary search strategies.
4.1.1 Efficient Realization in a Depth First Search
The key for an efficient realization of the general SBDD concept as described above is the
observation that, within a DFS, we do not need to keep the information of all previously expanded
nodes in the search tree. Instead, we can merge sibling entries in $T$ on backtracking, thus
summarizing and compressing the information gathered.
Lemma 4.1 Let $c$ be a choice point with state $P_c = (D_c(x_1), \dots, D_c(x_n))$, and denote
the states of the children $c_1, \dots, c_l$ of $c$ by
$P_{c_k} = (D_{c_k}(x_1), \dots, D_{c_k}(x_n))$, $1 \leq k \leq l$. Finally, let $P_{c'}$ denote
the state in choice point $c'$ with $P_{c'} \preceq P_{c_k}$ for some $1 \leq k \leq l$. Then, it
holds that $P_{c'} \preceq P_c$.
Proof: Denote the symmetry function by $\varphi$ with $\varphi(P_{c'}) \subseteq P_{c_k}$. Then,
with Property 4.1, we have that $\varphi(P_{c'}) \subseteq P_{c_k} \subseteq P_c$. Thus,
$P_{c'} \preceq P_c$.
Using Lemma 4.1, SBDD in combination with DFS can be realized efficiently: We start with
$T = \emptyset$ and process each choice point as follows:
1. Check the pattern $P_c$ of the current choice point $c$ against all patterns in $T$. If there
   exists a pattern $P \in T$ with $\Phi(P_c, P)$ then fail. (Alternatively encapsulate this
   function in a constraint and use it also for domain filtering.)
2. Process the current choice point.
3. On backtracking: If there are more sibling nodes to be expanded, then add the current
   pattern to $T$, else delete all patterns of the sibling nodes from $T$.
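The three steps above can be sketched as a small generic Python routine. This is an
illustrative sketch, not the implementation used for the experiments: the function sbdd_dfs
and its callback parameters are hypothetical names, patterns are simplified to tuples of fixed
values with None for unfixed variables, and "processing" a choice point is reduced to
recording leaves and branching. The toy problem (binary strings that are symmetric under
flipping all bits) is likewise only an assumption made for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pattern = Tuple[Optional[int], ...]  # None marks a variable that is not fixed yet


def sbdd_dfs(
    root: Pattern,
    children: Callable[[Pattern], Sequence[Pattern]],
    is_leaf: Callable[[Pattern], bool],
    dominated: Callable[[Pattern, Pattern], bool],
) -> List[Pattern]:
    """Depth first search that prunes every node dominated by a pattern in T."""
    T: List[Pattern] = []          # database of explored sibling patterns
    solutions: List[Pattern] = []

    def visit(p: Pattern) -> None:
        # Step 1: fail if some stored pattern dominates the current one.
        if any(dominated(p, q) for q in T):
            return
        # Step 2: normal processing (here: record leaves, branch otherwise).
        if is_leaf(p):
            solutions.append(p)
            return
        kids = children(p)
        for k, child in enumerate(kids):
            visit(child)
            # Step 3: on backtracking, store the child while siblings remain,
            # otherwise drop the patterns of all its siblings from T.
            if k + 1 < len(kids):
                T.append(child)
            else:
                del T[len(T) - (len(kids) - 1):]

    visit(root)
    return solutions


# Toy usage: binary strings of length 3, symmetric under flipping all bits.
def toy_children(p: Pattern) -> List[Pattern]:
    i = p.index(None)
    return [p[:i] + (v,) + p[i + 1:] for v in (0, 1)]


def toy_is_leaf(p: Pattern) -> bool:
    return None not in p


def toy_dominated(p: Pattern, q: Pattern) -> bool:
    flipped = tuple(None if v is None else 1 - v for v in q)
    return any(
        all(b is None or a == b for a, b in zip(p, image))
        for image in (q, flipped)
    )


unique = sbdd_dfs((None,) * 3, toy_children, toy_is_leaf, toy_dominated)
```

In the toy run, the whole subtree below the pattern (1, None, None) is pruned because it is
the bit-flipped image of the already explored pattern (0, None, None), so only one
representative of each pair of symmetric strings is reported.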
When using DFS, the current pattern needs only be compared with patterns left-adjacent to
the path from the root to the current node (see Figure 4.1(b)). Notice that step 2 refers to the
normal processing of a choice point that also takes place when no additional symmetry breaking
framework is utilized, including the choice of a branching constraint and the exploration of the
children.
The efficiency of the approach depends on two parameters: the time needed to evaluate the
function $\Phi$, and the number of such evaluations needed. Using the previous procedure, the
number of patterns in $T$ is at most as large as the depth of the search tree times the cardinality of
the largest domain.
4.1.2 Arbitrary Search Strategies
With respect to the importance of the size of T, at first it seems to be impractical to combine
SBDD with search strategies other than DFS, because the number of previously expanded nodes,
and thus the size of T, may be enormous. Another possibility is that the symmetry breaking
method becomes ineffective, because many nodes are closed late, which is the case for breadth
first search, for instance.
Nevertheless, with a slight modification, it is possible to cope with general search strategies.
Let $c$ denote the current choice point, and $P_c$ the corresponding pattern. The idea now is to
check whether a symmetry function maps $P_c$ to a pattern of a choice point $c'$ that would have
been processed before $c$ if DFS had been applied on a static ordering of the branching
constraints (see Figure 4.1(c)). If so, $c$ is rejected, otherwise we proceed normally. That way,
we prune the tree because we detect that the work has either been carried out already or because
we decide to do it later. Notice that the current path in the search tree contains all information
necessary to identify the patterns that are relevant for checking. The assumption of a static
branching constraint ordering defines a virtual ordering of all choice points. The approach rejects
the current choice point iff a dominating pattern exists left of it in a virtual DFS tree, i.e. iff the
current choice point has a later virtual DFS closing time stamp. As an exhaustive search will
eventually consider the leftmost nodes as well, we can be sure not to miss a solution.
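The remark that the current path contains all information needed to identify the relevant
patterns can be illustrated with a short Python sketch. The assumptions are ours: branching
is unary (a variable is set to a value), every variable has a static value ordering, and a
choice point is identified by its branching decisions rather than by its domains (compare
Section 4.1.3). The function name virtual_dfs_patterns is hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Decision = Tuple[Hashable, Hashable]  # (variable, value)


def virtual_dfs_patterns(
    path: Sequence[Decision],
    order: Dict[Hashable, Sequence[Hashable]],
) -> List[Tuple[Decision, ...]]:
    """Collect the choice points left-adjacent to `path` in the virtual DFS tree.

    For every decision on the path, each value that precedes the chosen one
    in the static ordering yields one sibling that a DFS would have closed
    before reaching the current choice point.
    """
    patterns: List[Tuple[Decision, ...]] = []
    for depth, (var, val) in enumerate(path):
        prefix = tuple(path[:depth])
        for earlier in order[var]:
            if earlier == val:
                break
            patterns.append(prefix + ((var, earlier),))
    return patterns


order = {"x": [0, 1, 2], "y": [0, 1, 2], "z": [0, 1, 2]}
path = [("x", 1), ("y", 0), ("z", 2)]
left = virtual_dfs_patterns(path, order)
```

The current choice point is then rejected if the dominance function reports that one of the
collected patterns dominates it, regardless of whether that pattern has actually been
expanded yet.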
Note that the search strategy is slightly affected by this procedure, because the exploration of
choice points can be postponed by the symmetry breaking algorithm. This side-effect is clearly
not desirable. However, one might expect that a reasonable search strategy rates symmetric parts
of the search tree as equally important. In that case, the expanding of the current choice point is
only postponed formally, but in fact is carried out next in a symmetric version.
4.1.3 A Different Representation of Choice Points
In Definition 4.1, we have defined a pattern with respect to the current state of the domains. We
could also have used the current set of constraints including the branching decisions taken to
identify a choice point. When defining symmetry detection functions on pairs of sets of con-
straints, we can again detect symmetries between choice points. Note that Property 4.1 and
Lemma 4.1 are still valid in this setting. Therefore, the idea of SBDD can also be realized
efficiently when using a constraint representation of a choice point.
Such a representation may be favorable with respect to non-unary branching constraints, and
also with respect to the efficiency of the evaluation of the symmetry detection function [174].
However, functions based on this representation tend to be less intuitive. Therefore, in the exam-
ples that we consider in the following, we will use the definition of a pattern as in Definition 4.1
and in combination with unary branching constraints only.
After having outlined the general approach, in the following sections we apply it to three
different applications in the field of combinatorial optimization and constraint satisfaction.
Fig. 4.2: DeBruijn networks of dimension 3 (left) and 4 (right). A node is marked by the binary string
corresponding to its number. The dashed lines mark the symmetries of the DeBruijn network.
4.2 DeBruijn Graph Bisection
The first application of the method described in Section 4.1 that we present is the Graph Bisection
Problem. Given an undirected graph $G = (V, E)$, the Graph Bisection Problem asks for a set
$S \subset V$ such that the cardinalities of $S$ and $S^C$ differ at most by one, and the number of
edges between both sets is minimal. This optimal number is often referred to as the bisection
width of the graph. Graph Bisection is known to be NP-hard, and exact solutions can only be
computed for small graphs, typically $|V| \leq 200$. Interestingly, Graph Bisection alone already
induces a symmetry as the sets $S$ and $S^C$ can be exchanged.
An obvious symmetry breaking strategy in this case is the initial assignment of a node to the
set $S$. However, if the graph $G$ to be partitioned is itself symmetric, such an assignment does not
break the resulting combined symmetries.
In parallel computing, connection networks are typically nicely structured and their symme-
tries are known. Graphs of the hypercube family have been studied intensively (see [19, 140]).
One popular network is the so-called DeBruijn network which is defined as follows:
Definition 4.3 The DeBruijn Network of dimension $k$ is a directed graph $DB(k) = (V_k, E_k)$
with $V_k = \{0, \dots, 2^k - 1\}$. The edge set can be described best by associating the nodes
with their corresponding binary representation, i.e.
$V_k = \{(b_0, \dots, b_{k-1}) \in \{0,1\}^k\}$. Then,
\[
E_k = \{ (b\alpha, \alpha b),\ (b\alpha, \alpha\bar{b}) \mid \alpha \in \{0,1\}^{k-1},\ b \in \{0,1\} \},
\]
where $\bar{b}$ denotes inverting bit $b$, i.e. $\bar{b} = 1 - b$.
In the following, for the Graph Bisection Problem, we will interpret any directed arc of
DB(k) as an undirected edge. Then DB(k) contains $2^k$ nodes, each having degree 4, and
$2^{k+1}$ edges. Furthermore, DB(k) contains 3 symmetries described by the following
automorphisms:
\[
\begin{array}{ll}
\sigma_1: V \to V, & (b_0, b_1, \dots, b_{k-1}) \mapsto (b_{k-1}, b_{k-2}, \dots, b_0) \\
\sigma_2: V \to V, & (b_0, b_1, \dots, b_{k-1}) \mapsto (\bar{b}_0, \bar{b}_1, \dots, \bar{b}_{k-1}) \\
\sigma_3: V \to V, & (b_0, b_1, \dots, b_{k-1}) \mapsto (\bar{b}_{k-1}, \bar{b}_{k-2}, \dots, \bar{b}_0)
\end{array}
\]
Symmetries $\sigma_1$, $\sigma_2$ and $\sigma_3$ are visualized in Figure 4.2, where DB(3) and DB(4) are shown.
4.2.1 Bisection Width of the DeBruijn Graph
It can be shown that the bisection width of DB(k) is in $\Theta(2^k / k)$, but there are only few
results known for specific dimensions. In [70], an optimal bisection width of 30 for DB(7) has been
computed. At the time that paper was written, the algorithm based on LP bounds ran for about
two weeks. To our knowledge, no exact bisection widths for bigger DeBruijn graphs were known
at that time.
In [197], Sensen improved the well-known bound based on clique embeddings by introducing
variable multicommodity flows. Using interior point methods for the resulting linear programs,
he was able to prove an exact bisection width of 54 for DB(8). SBDD was used to prevent the
consideration of symmetric parts of the search space. We refer the reader to [197] for details on
the overall approach. Here, we concentrate on the breaking of symmetries. We use this example
to show an easy application of SBDD rather than to underline its efficiency. For comparisons
with SBDS, we refer the reader to Sections 4.3 and 4.4.
4.2.2 Symmetry Breaking for the Bisection of DeBruijn Graphs
When bisectioning DeBruijn graphs, seven symmetries have to be encoded in Φ. They stem from
the three automorphisms of the graph itself, the exchange of Sand SC, and the combination of
these symmetries.
For the Graph Bisection Problem, a pattern is implemented as an $n$-tuple
$P \in \{0, 1, *\}^n$. $P_i = 0$ ($P_i = 1$) means that node $i \in S$ ($i \in S^C$). $P_i = *$
means that node $i$ has not been assigned yet. The symmetry functions
$\varphi_1, \dots, \varphi_7$ permute the nodes according to $\sigma_1$, $\sigma_2$ or $\sigma_3$
and/or invert the entries. A pattern $P$ is dominated by $P'$ iff there is a symmetry function
$\varphi_k$, $1 \leq k \leq 7$, such that, for all $0 \leq i < n$, it holds that
$\varphi_k(P')_i = *$ or $P_i = \varphi_k(P')_i$.
It is also possible to use pattern information for propagation. Assume that there is a symmetry
function $\varphi_k$ and an index $j$, $0 \leq j < n$, such that $\varphi_k(P')_i = *$ or
$P_i = \varphi_k(P')_i$ for all $0 \leq i < n$, $i \neq j$, and $P_j = *$. Let
$\varphi_k(P')_j = 0$ (or $\varphi_k(P')_j = 1$). Then we can enforce that node $j$ is in $S^C$
(or $S$, respectively).
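The dominance test for bisection patterns can be sketched in Python as follows. The sketch
rests on assumptions that are not taken from the text: unassigned entries are represented by
None, bit $b_0$ is taken to be the most significant bit of a node number, the three
automorphisms are read as bit reversal, bit complement and their combination, and all helper
names are hypothetical. For simplicity the identity is included among the images, which adds
one harmless check to the seven symmetries.

```python
from typing import Callable, List, Optional, Tuple

Pattern = Tuple[Optional[int], ...]  # 0: node in S, 1: node in S^C, None: unassigned
Bits = Tuple[int, ...]


def bits(v: int, k: int) -> Bits:
    """Binary representation (b_0, ..., b_{k-1}) of node v, b_0 most significant."""
    return tuple((v >> (k - 1 - i)) & 1 for i in range(k))


def number(bs: Bits) -> int:
    v = 0
    for bit in bs:
        v = (v << 1) | bit
    return v


def automorphisms(k: int) -> List[List[int]]:
    """Node permutations for the identity, sigma_1, sigma_2 and sigma_3."""
    def perm(f: Callable[[Bits], Bits]) -> List[int]:
        return [number(f(bits(v, k))) for v in range(2 ** k)]
    return [
        perm(lambda b: b),                            # identity
        perm(lambda b: b[::-1]),                      # reversal
        perm(lambda b: tuple(1 - x for x in b)),      # complement
        perm(lambda b: tuple(1 - x for x in b[::-1])),  # reversal and complement
    ]


def symmetric_images(p: Pattern, k: int) -> List[Pattern]:
    """All images of p under node permutation and/or exchange of S and S^C."""
    images = []
    for perm in automorphisms(k):
        moved: List[Optional[int]] = [None] * len(p)
        for v, side in enumerate(p):
            moved[perm[v]] = side
        images.append(tuple(moved))
        images.append(tuple(None if s is None else 1 - s for s in moved))
    return images


def dominated(p: Pattern, q: Pattern, k: int) -> bool:
    """True iff some image of q is unassigned or equal to p in every entry."""
    return any(
        all(qi is None or qi == pi for pi, qi in zip(p, image))
        for image in symmetric_images(q, k)
    )


q = (0, None, None, None)   # DB(2): node 00 has been placed in S
```

For example, the pattern (1, 0, None, None) is dominated by q through the exchange of $S$ and
$S^C$, whereas (None, 1, None, None) is not dominated, because every image of q fixes node 00
or node 11.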
Fig. 4.3: The search tree when bisectioning DB(8) without breaking any symmetries.
Fig. 4.4: The search tree for the bisection of DB(8) when breaking all possible symmetries. Chains of
choice points with only one successor result from symmetry-based domain filtering.
Figures 4.3 and 4.4 show the different branching trees resulting from a computation of DB(8)
without and with breaking symmetries. Huge parts of the solution space are cut off by lower
bound information. Thus, many symmetric sub-trees are pruned early, thereby diminishing the
effect of symmetry breaking. However, since the effort per choice point in this approach is very
high due to expensive bound computations (about 14 minutes per choice point), even small
reductions of the tree size improve the overall performance significantly. Thus, for the
computation of the bisection width of DB(8), the breaking of symmetries reduced the running time
by roughly 2 days, leaving an overall computation time of 37.5 hours.
In Chapter 9, we consider the Graph Bisection Problem in more detail and develop an ap-
proximation scheme for the efficient computation of Sensen’s lower bound. The approximation
of the bound is fast, but less accurate, which results in far larger search trees. In this environment,
when bisectioning DeBruijn graphs, the reduction of choice points is even more visible.
4.3 The Social Golfer Problem
We also applied SBDD to find solutions for the Social Golfer Problem. We study that problem
in detail in Chapter 8. Therefore, here we introduce it only very briefly. The original question
was posed as follows (Problem 10 in CSPLib [47]):
32 golfers want to play in 8 groups of 4 each week, in such way that any two golfers
play in the same group at most once. How many weeks can they do this for?
The problem can be generalized by parameterizing it to $w$ weeks and $g$ groups of $s$ players
each, written as $g$-$s$-$w$ from now on¹. In case of $(s - 1)w = gs - 1$, we achieve a
specification where every player must play with every other exactly once. This problem is also
known as the Schoolgirl Problem (see Chapter 8).
4.3.1 Symmetries in the Social Golfer Problem
Obviously, there is a lot of symmetry in the problem. First, players can be placed at any position
within a group ($\varphi_P$), groups can be exchanged within their week ($\varphi_G$), and also the
weeks can be ordered arbitrarily ($\varphi_W$). Furthermore, the players can be permuted
($\varphi_X$).
Following the idea that symmetry detection should also work well in combination with simple
models, we have chosen a straightforward one that can be implemented with little effort using the
ILOG SOLVER [121] environment. The groups are modeled as sets of players with the cardinality
of each set fixed to $s$. Each week contains $g$ such sets, and the full pattern covers $w$ weeks. To
shrink the search space, we fix all players in the first week in increasing order. Additionally, we
insert the first $s$ players into the first $s$ groups for all weeks thereafter. Finally, the first group of
the second week is filled with the smallest players possible. All these assignments can be made
without increasing the complexity of the model or losing unique solutions.
4.3.2 Symmetry Breaking for the Social Golfer Problem
By using set variables for each group, the model does not contain symmetry $\varphi_P$ anymore. To
detect the domination of patterns with respect to the other symmetries, we describe three
symmetry detection functions $\Phi_G$, $\Phi_{W,G}$ and $\Phi_{W,G,X}$ that are used during the
search. Function $\Phi_{W,G}$ includes checks performed by $\Phi_G$, and $\Phi_{W,G,X}$ includes
those done by $\Phi_{W,G}$.
$\Phi_G$: Given two week indices $1 \leq i, j \leq w$, $\Phi_G$ is used to check if week $i$ of
pattern $P'$ dominates week $j$ of pattern $P$ with respect to symmetry $\varphi_G$. This is done
by checking whether all players of week $i$ of pattern $P'$ can be mapped to week $j$ of pattern
$P$. In the example shown in Figure 4.5, week 3 of pattern $P'$ cannot be mapped to week 2 of
pattern $P$, because players 2 and 3 are in the same group in pattern $P'$, but are in different
groups in pattern $P$. Week 1 of pattern $P'$ also cannot be mapped to week 2 of pattern $P$,
because player 8 has no matching partner. However, week 2 of pattern $P'$ can be mapped to week
2 of pattern $P$.
¹ In the original problem, it is clear that the golfers cannot play for more than 10 weeks. On the other hand, a
solution for 5 weeks can be found easily without backtracking by always choosing the first possible player for a
group in each week. Meanwhile, a 9 week solution has been found, but it remains unclear whether there exists a
10 week solution or not.
Fig. 4.5: The left hand side shows two patterns $P$ and $P'$. Each pattern consists of three weeks
(horizontal) of three groups of three players. Unfixed variables are left empty. On the right hand
side, the corresponding bipartite graph is shown, containing a node for each week of both patterns.
Since a matching of cardinality 3 exists (bold edges), $P$ is dominated by $P'$.
$\Phi_{W,G}$: To break symmetries $\varphi_W$ and $\varphi_G$, function $\Phi_{W,G}$ constructs a
bipartite graph $G$ containing a node for each week of $P$ and $P'$. An edge is inserted, iff a
week of $P'$ dominates a week of $P$, which is determined using $\Phi_G$. If $G$ contains a
matching of cardinality $w$, i.e., a perfect matching, $P'$ dominates $P$. Again, Figure 4.5 shows
an example.
$\Phi_{W,G,X}$: Incorporating also the last symmetry $\varphi_X$ results in a huge computational
effort, as $\Phi_{W,G}$ has to be applied for $(g \cdot s)!$ different permutations. To reduce the
cost of this check, we use the fact that the first week of a pattern is always complete due to the
fixed entries. Since it has to be matched to some other week, "only" $w \cdot (s!)^g \cdot g!$
possibilities are left. However, the test remains expensive. Therefore, we tried some variations
reducing the frequency with which $\Phi_{W,G,X}$ is applied. A parameter $q$ can be set to restrict
full symmetry checks to every $q$-th level of the search tree. Optionally, it can be limited to be
performed on full patterns, i.e. leaves, only, which is the default.
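The week-wise check and the matching step can be sketched in Python. This is one possible
reading of the two functions, stated under assumptions: a week is a list of $g$ sets holding
the players fixed so far, a week dominates another if each of its non-empty groups is
contained in a distinct group of the other week, and the perfect matching is found with a
simple augmenting path search. The names week_dominates and pattern_dominates are
hypothetical, and the player symmetry $\varphi_X$ is not handled.

```python
from typing import List, Optional, Sequence, Set

Week = Sequence[Set[int]]   # players fixed so far in each group of one week
Pattern = Sequence[Week]


def week_dominates(upper: Week, lower: Week) -> bool:
    """True iff every non-empty group of `upper` lies in its own group of `lower`."""
    used: Set[int] = set()
    for group in upper:
        if not group:
            continue  # an empty group can be mapped to any remaining group
        hosts = [h for h, target in enumerate(lower) if group <= target]
        # Groups of a week are disjoint, so there is at most one host.
        if not hosts or hosts[0] in used:
            return False
        used.add(hosts[0])
    return True


def pattern_dominates(upper: Pattern, lower: Pattern) -> bool:
    """True iff the weeks of `upper` can be perfectly matched to dominated weeks."""
    w = len(upper)
    adj = [[j for j in range(w) if week_dominates(upper[i], lower[j])]
           for i in range(w)]
    match: List[Optional[int]] = [None] * w  # match[j]: week of `upper` using week j

    def augment(i: int, seen: Set[int]) -> bool:
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match[j] is None or augment(match[j], seen):
                match[j] = i
                return True
        return False

    return all(augment(i, set()) for i in range(w))


first_week = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
upper = [first_week, [{1, 4}, {2}, {3}], [set(), set(), set()]]
lower = [first_week,
         [{1, 5, 9}, {2, 6, 7}, {3, 4, 8}],
         [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}]]
```

In this made-up example, the second week of upper can only be mapped to the third week of
lower, and the still empty third week of upper takes the remaining week, so a perfect
matching exists and upper dominates lower.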
4.3.3 Numerical Results
The model described has been implemented in ILOG SOLVER 5.0 [121] and run for different
configurations on a Sun Enterprise 450 (400 MHz UltraSparc-II). Tables 4.1 and 4.2 show the
results of the experiments. Apart from the time (in seconds) needed to find the first solution
($t_1$) and the time to find all solutions ($t_{all}$), the number of calls to the symmetry
detection functions $\Phi_{W,G}$ and $\Phi_{W,G,X}$ is given. In the sym-section, $\Phi_{W,G}$ is
applied to check for symmetries $\varphi_W$ and $\varphi_G$ in each node of the search tree. Since
symmetries $\varphi_X$ are not detected, there are many non-unique solutions found. In the
nosym-section, $\Phi_{W,G}$ is also applied in every node of the search tree, and additionally
$\Phi_{W,G,X}$ is applied in leaves preventing symmetric solutions from being written out. The
tables continue with the number of detected symmetries (symmetries), the number of choice points
(cp), and the number of fails.

problem  solutions  t_1   t_all  Φ_{W,G}  Φ_{W,G,X}  symmetries  cp      fails
sym
4-3-2    48         0.00  0.03   226      0          0           195     148
4-3-3    2688       0.02  6.09   99454    0          0           28299   25612
4-3-4    1968       0.05  26.70  382120   0          2808        94845   92878
4-3-5    0          0.00  36.34  412456   0          3120        100389  200390
nosym
4-3-2    1          0.00  0.04   226      47         47          195     194
4-3-3    4          0.01  10.00  99454    2687       2684        28299   28296
4-3-4    3          0.04  29.18  382120   1967       4773        94845   94843
4-3-5    0          0.00  36.28  412456   0          3120        100389  200390
Tab. 4.1: Results of the golfer 4-3-X problem.
Since invoking the symmetry detection function $\Phi_{W,G,X}$ is computationally very expensive,
applying it in every search node does not improve the overall run-time, although the number of
choice points is reduced. Clearly, there is a trade-off between the reduction of choice points and
the effort spent for the detection of symmetries. We have tested a scheme that applies
$\Phi_{W,G,X}$ not only in leaves but also performs additional checks for all symmetries in every
node in the $q$-th level of the search tree. Table 4.3 shows that invoking $\Phi_{W,G,X}$ too often
rather increases the overall run-time, but applying it too rarely (e.g., only in leaves) is not the
best choice either. For the 4-4-4 instance, an invocation in about every 8-th level has shown to be
the best. Similar observations have been made for other instances as well. Table 4.4 shows the
improved running times for the 4-4-X instance.
4.3.3.1 SBDS versus SBDD
In [202], an SBDS approach is developed for the Social Golfer Problem. To break symmetries,
SBDS inserts additional constraints into the model during the search, and hands them over to
the solver. Due to the large number of symmetries in the Social Golfer Problem, the approach
presented is not able to add all constraints necessary to break all symmetries.
Therefore, different models for the Social Golfer Problem are discussed. In combination
with more complex models that break several symmetries themselves, SBDS performs well and
is able to reduce the number of choice points significantly. However, the approach presented
in [202] is not able to compute unique solutions only. Moreover, the general approach for the
Social Golfer Problem is not able to tackle larger instances like the golfers 5-3-7 efficiently. Only
in combination with a model designed for the specific case of the Schoolgirl Problem, a solution
is found.

problem  solutions  t_1   t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries  cp     fails
sym
4-4-2    216        0.00  0.09    735      0          0           555    340
4-4-3    5184       0.01  8.71    74175    0          0           43755  38572
4-4-4    1296       0.01  20.53   140595   0          1296        82635  81340
4-4-5    432        0.01  25.90   132531   0          2160        75723  75292
4-4-6    0          0.00  30.76   114027   0          0           72267  72268
nosym
4-4-2    1          0.01  0.17    735      215        215         555    555
4-4-3    2          0.01  136.31  74175    5183       5182        43755  43754
4-4-4    1          0.01  22.09   140595   1295       2591        82635  82634
4-4-5    1          0.02  26.51   132531   431        2591        75723  75723
4-4-6    0          0.00  30.71   114027   0          0           72267  72268
Tab. 4.2: Results of the golfer 4-4-X instance.

level of Φ_{W,G,X}  solutions  t_1   t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries  cp     fails
nosym
1                   1          0.01  698.51  0        26         18          82     82
2                   1          0.02  271.35  29       27         24          123    123
4                   1          0.02  101.26  156      79         79          339    339
8                   1          0.01  14.51   5292     1296       1296        4730   4730
leaves              1          0.01  22.09   140595   1295       2591        82635  82634
Tab. 4.3: Results of the golfer 4-4-4 instance performing additional checks for symmetry $\varphi_X$ in search
nodes of every q-th depth.

problem  solutions  t_1   t_all   Φ_{W,G}  Φ_{W,G,X}  symmetries  cp    fails
nosym, level of Φ_{W,G,X} = 8
4-4-2    1          0.00  0.17    735      215        215         555   555
4-4-3    2          0.01  134.10  5283     1298       1297        6492  2891
4-4-4    1          0.01  14.51   5292     1296       1296        4730  4730
4-4-5    1          0.02  15.68   5291     1295       1296        4722  4722
4-4-6    0          0.00  17.16   5290     1295       1295        4714  4715
Tab. 4.4: Improved results of the golfer 4-4-X performing additional checks for symmetry $\varphi_X$ in search
tree nodes of every 8-th depth.
Using SBDD for the Social Golfer Problem, it is possible to find unique solutions only.
Additionally, it also works in combination with very simple models. Obviously, the performance
of the approach that we presented for the Social Golfer Problem can be further improved by
using more sophisticated problem formulations (see Chapter 8). However, here we wanted to
demonstrate that SBDD can also be used efficiently by inexperienced users and in combination
with simple models. We believe that the symmetry breaking method that we developed is so
easy to use because all it requires is the definition of the pattern structure and of the function
that checks whether a pattern dominates another or not. Thus, the user can think of symmetries
algorithmically rather than in terms of constraints.
4.4 The n-Queens Problem
Finally, we consider the classical n-Queens Problem. It consists in placing $n$ queens on an
$n \times n$ chessboard such that no two queens can capture each other. That is, no two queens
are allowed to be placed on the same row, the same column, or the same diagonal.
Nowadays constraint programming approaches are able to find one solution for 1000-Queens
in a few seconds. Asking for all non-symmetric solutions of n-Queens requires more effort. In the
following, we describe the SBDS approach of Gent and B. Smith [93] on the n-Queens Problem
and compare it with SBDD.
4.4.1 Symmetry Breaking for the n-Queens Problem
It is easy to see that the n-Queens Problem incorporates seven symmetries, namely reflections
in the horizontal and vertical axis, reflections in the main diagonals, and rotations through
$90^\circ$, $180^\circ$, and $270^\circ$.
We use the following standard model for n-Queens:
Each row $i \in \{0,\dots,n-1\}$ is represented by an integer variable $x_i$. Assigning $x_i = j$
corresponds to placing a queen in row $i$ and column $j$.
Additional integer variables $y_i$ and $w_i$, $i \in \{0,\dots,n-1\}$, are used to check the diagonals of
the chessboard. We post the constraints $y_i = x_i + i$, $w_i = x_i - i$.
The domains are $x_i \in \{0,\dots,n-1\}$, $y_i \in \{0,\dots,2n\}$, and $w_i \in \{-n,\dots,n\}$.
AllDiff constraints on $x$, $y$, and $w$ ensure that no two queens can capture each other.
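The model above can be tried out with a small sketch. The following Python code is not the ILOG SOLVER model used in the thesis; it is a plain backtracking search, written under the assumption that the three AllDiff constraints are simply checked with sets of already used values. The function name `queens_solutions` is chosen for illustration only.

```python
def queens_solutions(n):
    """Enumerate all solutions of the row-variable model by backtracking."""
    solutions = []
    x = []                                   # x[i] = column of the queen in row i
    used_x, used_y, used_w = set(), set(), set()

    def place(i):
        if i == n:
            solutions.append(tuple(x))
            return
        for j in range(n):
            y, w = j + i, j - i              # y_i = x_i + i, w_i = x_i - i
            # AllDiff on x, y and w: every value may be taken at most once.
            if j in used_x or y in used_y or w in used_w:
                continue
            x.append(j)
            used_x.add(j)
            used_y.add(y)
            used_w.add(w)
            place(i + 1)
            x.pop()
            used_x.remove(j)
            used_y.remove(y)
            used_w.remove(w)

    place(0)
    return solutions


print(len(queens_solutions(7)))   # 40
print(len(queens_solutions(8)))   # 92
```

The counts agree with the "sym" column of Table 4.5, where no symmetry is broken.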
Fig. 4.6: Six out of 40 solutions of 7-queens are unique.
4.4.1.1 SBDS
In [93], SBDS is introduced first and tested on a variety of problems. The approach is general
and compatible with different search strategies. A user of the concept only needs to provide
symmetry functions mapping a single assignment to its symmetric version.
In a choice point where we set $x = v$ on the left and $x \neq v$ on the right branch, SBDS adds
all constraints that are necessary to prevent the solver from exploring a sub-tree symmetric to an
already investigated one. By keeping track of all previously broken symmetries, only necessary
constraints are posted, thus keeping the overhead small.
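To make the idea concrete, here is a deliberately simplified sketch in Python. It is not the implementation of [93]: instead of posting conditional constraints to a solver, it stores each symmetric image of an explored branch as a "nogood" (a set of assignments that must not hold together) and checks nogoods against the current partial assignment. It also omits the bookkeeping of already broken symmetries. All function names are hypothetical.

```python
def symmetries(n):
    """The identity plus the seven board symmetries, acting on (row, column)."""
    m = n - 1
    return [
        lambda i, j: (i, j),          # identity
        lambda i, j: (m - i, j),      # reflection in the horizontal axis
        lambda i, j: (i, m - j),      # reflection in the vertical axis
        lambda i, j: (j, i),          # reflection in the main diagonal
        lambda i, j: (m - j, m - i),  # reflection in the other diagonal
        lambda i, j: (j, m - i),      # rotation through 90 degrees
        lambda i, j: (m - i, m - j),  # rotation through 180 degrees
        lambda i, j: (m - j, i),      # rotation through 270 degrees
    ]


def sbds_unique_solutions(n):
    syms = symmetries(n)
    board = {}       # current partial assignment: row -> column
    nogoods = []     # sets of (row, column) pairs that must not all hold
    found = []

    def violated():
        return any(all(board.get(r) == c for r, c in ng) for ng in nogoods)

    def attacked(i, j):
        return any(c == j or abs(c - j) == abs(r - i) for r, c in board.items())

    def search(i):
        if i == n:
            found.append(tuple(board[r] for r in range(n)))
            return
        mark = len(nogoods)
        for j in range(n):
            if attacked(i, j):
                continue
            board[i] = j                      # left branch: x_i = j
            if not violated():
                search(i + 1)
            explored = list(board.items())
            del board[i]                      # right branch: x_i != j
            # Forbid every symmetric image of the branch just explored.
            for g in syms:
                nogoods.append({g(r, c) for r, c in explored})
        del nogoods[mark:]                    # constraints are local to this sub-tree

    search(0)
    return found


print(len(sbds_unique_solutions(7)))   # 6
print(len(sbds_unique_solutions(8)))   # 12
```

With a fixed row and value order, this sketch keeps exactly the lexicographically smallest member of each symmetry class, which is why the counts match the SBDS solution column of Table 4.5.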
4.4.1.2 SBDD
For the n-Queens Problem, a pattern $p$ is an $n$-tuple where $p_i$ is the column number in which
the queen covering row $i$ is placed, or, in case the position of the queen in row $i$ has not been
set yet, $p_i$ takes a special value that marks the row as unset. E.g., the pattern corresponding
to the first chessboard in Figure 4.6 is $p = (0, 4, 1, 5, 2, 6, 3)$.
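A minimal sketch of the two ingredients, the pattern structure and the dominance check, might look as follows in Python. It assumes that unset rows are encoded as `None` and that a stored pattern dominates the current one if some symmetric image of it is contained in the current pattern. The pattern store is handled in the simplest possible way for depth first search; the names `symmetric_images`, `dominates`, and `sbdd_unique_solutions` are illustrative and do not come from the thesis.

```python
def symmetric_images(pattern):
    """All eight symmetric images of a pattern, as sets of (row, column) queens."""
    m = len(pattern) - 1
    queens = [(i, j) for i, j in enumerate(pattern) if j is not None]
    maps = [
        lambda i, j: (i, j), lambda i, j: (m - i, j),
        lambda i, j: (i, m - j), lambda i, j: (j, i),
        lambda i, j: (m - j, m - i), lambda i, j: (j, m - i),
        lambda i, j: (m - i, m - j), lambda i, j: (m - j, i),
    ]
    return [frozenset(g(i, j) for i, j in queens) for g in maps]


def dominates(q, p):
    """q dominates p if some symmetric image of q is contained in p."""
    return any(all(p[i] == j for i, j in image) for image in symmetric_images(q))


def sbdd_unique_solutions(n):
    store, found = [], []      # store: patterns of fully expanded choice points

    def expand(p, i):
        if any(dominates(q, p) for q in store):
            return                           # symmetric to an explored sub-tree
        if i == n:
            found.append(p)
            return
        mark = len(store)
        for j in range(n):
            if any(c == j or abs(c - j) == i - r for r, c in enumerate(p[:i])):
                continue                     # the new queen would be attacked
            child = p[:i] + (j,) + p[i + 1:]
            expand(child, i + 1)
            store.append(child)              # child is now fully expanded
        del store[mark:]                     # the parent will replace its children

    expand((None,) * n, 0)
    return found


p = (0, 4, 1, 5, 2, 6, 3)
print(dominates((0, 2) + (None,) * 5, p))   # True: reflect in the main diagonal
print(len(sbdd_unique_solutions(7)))        # 6
```

Note that the check iterates over the stored patterns at every choice point, whereas the SBDS sketch iterates over the symmetries on backtracking; this difference is what the comparison in Section 4.4.2 is about.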
4.4.2 Numerical Results
In contrast to the algorithm we developed for the Social Golfer Problem, here we also use sym-
metry for domain filtering. A constraint is posted to the model that keeps track of the current
situation in the search. As propagation turned out to be rather expensive, we limited the number
of calls to the propagation routine to one.
We also implemented a version of SBDS and tested it on the model described above. Both
codes were running on the same Sun Enterprise as the program for the Social Golfer Problem in
Section 4.3.
Table 4.5 compares the number of solutions, the number of fails, and the computation time for
calculating all solutions (sym), calculating only unique solutions via SBDS, and unique solutions
using SBDD. We omit the number of solutions for SBDD as it is identical to SBDS. The results
given for SBDS are similar to those given in [93]. Only the number of fails slightly differs, which
we believe to be caused by small variations in the implementation and the different CP engines
used (ILOG SOLVER 4.3 vs. ILOG SOLVER 5.0).
   | sym                          | SBDS                       | SBDD
 n | solutions     fails     time | solutions    fails    time |    fails    time
 4 |         2         4     0.01 |         1        3    0.00 |        6    0.00
 5 |        10         4     0.00 |         2        4    0.00 |       13    0.00
 6 |         4        35     0.01 |         1       11    0.02 |       31    0.01
 7 |        40        69     0.02 |         6       19    0.01 |       56    0.02
 8 |        92       289     0.04 |        12       63    0.01 |      130    0.03
 9 |       352      1111     0.16 |        46      216    0.04 |      397    0.08
10 |       724      5072     0.57 |        92      851    0.13 |     1464    0.29
11 |      2680     22124     2.49 |       341     3808    0.53 |     5991    1.26
12 |     14200    103956    11.88 |      1787    17673    2.52 |    27731    6.27
13 |     73712    531401    61.56 |      9233    89534   12.55 |   140348   33.11
14 |    365596   2932626   337.00 |     45752   483214   69.62 |   746530  189.07
15 |   2279184  16920396  1946.07 |    285053  2784876  403.16 |  4391877 1213.36
16 |  14772512 105445065 12154.60 |   1846955 17277508 2608.51 | 27153758 7463.62
Tab. 4.5: Solving n-Queens without breaking symmetries (sym), with breaking symmetries via SBDS, and
by avoiding them via SBDD. Computing times are given in seconds.
Obviously, SBDD does not perform as well as SBDS on the n-Queens Problem. The reason
for this is that the number of symmetries is fairly small. The difference between SBDS and
SBDD can be viewed as follows: SBDS iterates through the symmetries of a problem and adds
symmetry breaking constraints if necessary. SBDD on the other hand iterates through the choice
points expanded earlier. The latter approach is clearly favorable if the number of symmetries
is very high (like for the Social Golfer Problem, for example). However, when the number of
symmetries is very limited (as is the case for the n-Queens Problem), it is much more efficient to
add some few additional symmetry breaking constraints on backtracking.
4.5 Summary
We have suggested an approach for breaking symmetries that is based on the detection of dom-
inance relations between choice points. The method is generally applicable and works in com-
bination with all exhaustive search strategies while it may overrule strategies other than DFS.
Moreover, it removes symmetric parts of the search tree efficiently in combination with any
model. Thus, it can also be easily used by inexperienced users on straightforward models that do
not break symmetries themselves.
The ease of use mainly results from the fact that it is only necessary to define the pattern
structure and a function that checks if one pattern dominates another. This algorithmic approach
allows somewhat more flexibility than a model that breaks symmetries itself, as has been
demonstrated for the Social Golfer Problem when adapting the frequency of constraint propagation for
certain symmetries.
The method has proven to be easily applicable without causing a big implementation overhead
on three very different applications from combinatorial optimization and constraint satisfaction.
Moreover, it worked efficiently even in combination with easy models and also on highly sym-
metric problems such as the Social Golfer Problem.
As a disadvantage, the use of patterns is less efficient on problems that contain only very few
symmetries such as the n-Queens Problem. There, the dynamic adding of constraints in an SBDS
fashion is clearly favorable.
PART II
Applications
In Part I of this thesis, we have introduced general purpose methods for pruning and filtering
with respect to cost considerations and symmetry. In the following Part II, we consider some
specific combinatorial optimization and constraint satisfaction problems. The applications that
we study are used to provide a practical evaluation of the previously developed methods.
In particular, we consider the Airline Crew Assignment Problem in Chapter 5. The approach
presented is based on the concept of CP-based column generation in combination with shortest
path constraints.
In Chapter 6, we study the Automatic Recording Problem, which arises in the context of
modern multimedia applications. An algorithmic approach is presented that links knapsack con-
straints and weighted stable set constraints on interval graphs following the idea of CP-based
Lagrangian relaxation.
The Capacitated Network Design Problem is tackled in Chapter 7. Lower bounds can be
computed by decomposing the problem. We review previously developed reduction techniques
and use CP-based Lagrangian relaxation to link them together. Moreover, a new technique is
presented that adds locally valid cuts based on Lagrangian relaxation to the problem.
A new approach for the Social Golfer Problem is developed in Chapter 8. Using SBDD for
symmetry breaking and the new idea of heuristic constraint propagation, we are able to solve
problems that were previously out of reach for solvers based on constraint programming.
Finally, in Chapter 9, we develop a solver for the Graph Bisection Problem. The core of the
algorithm is a lower bounding procedure that approximates maximum multicommodity flows.
Chapter 5
Airline Crew Assignment
The Airline Crew Assignment Problem (CAP) consists in assigning lines of work to a set of crew
members such that a set of activities is partitioned and the costs for that assignment are mini-
mized. Especially for European airline companies, complex constraints defining the feasibility
of a line of work have to be respected. We present two different algorithms to tackle the large-
scale optimization problem of Airline Crew Assignment. The first is an application of CP-based
column generation that we introduced in Section 3.1. The approach incorporates shortest path
sub-problems and uses algorithms from Section 2.2. The second approach performs a CP-based
heuristic tree search. We show how both algorithms can be linked to overcome their inherent
weaknesses by integrating methods from constraint programming and operations research. Nu-
merical results show the superiority of the hybrid algorithm in comparison to CP-based tree
search and column generation alone.
Scheduling flying crews of airline companies is a hard combinatorial problem, given the
complexity of the constraints that have to be satisfied and the huge search space that has to be
explored. The problem is often tackled by breaking it down into the Crew Pairing and the Crew
Assignment (or Rostering) Problem. In the crew pairing part, basic activities such as flight legs
(flights without stopover) are grouped into pairings. The latter ones are lines of work for one
or more days starting and ending at a home base. Then, in the crew assignment phase, these
pairings are assigned to crew members.
Although easier in practice than the original problem, both sub-problems are still hard to
solve. The Airline Crew Assignment Problem that we consider here is NP-hard,
which is easy to see by reduction from the Set Partitioning Problem [88]. Both operations
research (OR) and constraint programming (CP) techniques are available to solve the CAP, since it
has drawn the interest of both scientific communities for many years. Most industrial
software is based on OR techniques. However, especially for European airlines, there are strict
rules enforced by legislation, unions, etc. that define the feasibility of schedules. Thus, since a
huge amount of computational effort is put into the generation of infeasible lines of work,
common OR-based generate and test approaches are not efficient enough. We show how constraint
programming can be incorporated to overcome typical weaknesses of OR approaches. For a re-
cent overview on optimization problems and solution techniques in the airline industry, we refer
the reader to [182, 218].
During the last decade, some work was done on the Crew Assignment Problem. Column
generation methods have proven to be quite successful [52, 87, 183]. For solving the Railway
Crew Rostering Problem, which is similar, but not identical to the Airline Crew Assignment
Problem, Caprara et al. developed both an OR- and a CP-based approach [30, 32]. For the latter,
a lower bound from the OR field was used to improve the efficiency.
By construction, OR methods view a problem globally, taking into account all variables and
usually more than one or even most constraints at a time. By calculating upper and lower bounds
on the costs, they show a good ability to identify promising parts of the search space. However,
they often suffer from minor local conflicts, which might prevent a feasible solution from being
found. On the other hand, CP methods can efficiently handle feasibility problems by resolving
local conflicts using advanced search techniques and reduction algorithms based on concepts like
arc-consistency. Conversely, CP methods lack the ability to view the variables and constraints
of a problem globally. Therefore, they often struggle to escape from local optima.
We present two different approaches to tackle the Airline Crew Assignment Problem: a CP-
based heuristic tree search approach (HTS) [205], and one following the CP-based column gener-
ation framework (CGA) (see Section 3.1). We show how these two approaches can be combined
to overcome their inherent limitations.
The work presented in this chapter was published in [62, 194, 195]. It is structured as follows:
In Section 5.1, we formally define the Airline Crew Assignment Problem. In Section 5.2, we
discuss two autonomous approaches to solve the CAP. We give the characteristics of two real-
world airline test cases and present detailed ways of how the two approaches developed can be
combined to form an efficient hybrid algorithm in Section 5.3. Finally, in Section 5.4, numerical
results show the superiority of the hybrid algorithm compared to the individual approaches.
5.1 The Airline Crew Assignment Problem
Given a set of crew members, a set of pairings, a set of rules and a cost function, a roster is an
assignment of a subset of pairings to one specific crew member. A schedule is a set of rosters
such that all rules are obeyed and every pairing is assigned to exactly one crew member. Rules
may concern a single crew member or multiple crew members. Single crew member rules regard
each individual crew member’s roster, stating for example that no two temporally overlapping
pairings can be assigned to the same person. Multiple crew member rules aim at more than one
crew member, stating for example that two given pairings must be assigned to two crew members
out of which at least one must have a certain level of experience. The cost function associates a
cost with every legal schedule, and its minimization is desired.
In our case, every rule in the rule set only deals with just one single crew member, and the
objective function is linear over the rosters. That means that only single crew member rules can
be modeled and that the cost of the entire solution to the CAP is defined as the sum of the costs
of the selected rosters. More formally:
Definition 5.1 Given $k, m, n \in \mathbb{N}$, we denote the set of crew members by $C := \{1,\dots,m\}$ and the
set of pairings by $T := \{1,\dots,n\}$. Furthermore, denote the set of all subsets of a set $S$ by $2^S$.
1. Let $R := C \times 2^T$. Every $r \in R$ is called a roster and $R$ is called the set of all possible rosters.
2. Let $B := \{0,1\}$ and $H := \{h_1,\dots,h_k\}$ with $h_i : R \to B$, $1 \le i \le k$. Every $h \in H$ is called a
(single crew member) rule and $H$ is called a rule set.
3. A roster $r \in R$ is called legal (with respect to a rule set $H$) iff $h(r) = 1 \ \forall h \in H$.
$L_H := \{r \in R;\ r \text{ is legal}\}$ is the set of legal rosters (with respect to the rule set $H$).
4. $f : R \to \mathbb{R}$ is called a cost function.
5. The (Airline) Crew Assignment Problem (CAP) is to minimize $\sum_{1 \le i \le m} f((c_i, t_i))$, whereby
$(c_i, t_i) \in L_H \ \forall 1 \le i \le m$ such that:
(a) $\{c_1,\dots,c_m\} = C$
(b) $\bigcup_{1 \le i \le m} t_i = T$ whereby $t_i \cap t_j = \emptyset \ \forall i \neq j,\ 1 \le i, j \le m$.
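As a small illustration of Definition 5.1, the following Python sketch checks conditions (a) and (b) and the legality of every roster, and then sums up the roster costs. The rule (at most two pairings per roster) and the cost function are invented for the example; rosters are represented as pairs of a crew member and a frozenset of pairings.

```python
def is_legal(roster, rules):
    """A roster is legal iff every rule h evaluates to 1 (True) on it."""
    return all(h(roster) for h in rules)


def schedule_cost(schedule, crew, pairings, rules, cost):
    """Return the cost of a schedule, or None if it violates Definition 5.1."""
    members = sorted(c for c, _ in schedule)
    covered = sorted(s for _, t in schedule for s in t)
    if members != sorted(crew):          # (a): exactly one roster per crew member
        return None
    if covered != sorted(pairings):      # (b): the rosters partition the pairings
        return None
    if not all(is_legal(r, rules) for r in schedule):
        return None
    return sum(cost(r) for r in schedule)


crew, pairings = [1, 2], [1, 2, 3]
rules = [lambda r: len(r[1]) <= 2]       # hypothetical rule: at most two pairings


def cost(roster):                        # hypothetical cost function
    return 10 * len(roster[1]) ** 2


good = [(1, frozenset({1, 2})), (2, frozenset({3}))]
bad = [(1, frozenset({1, 2, 3})), (2, frozenset())]
print(schedule_cost(good, crew, pairings, rules, cost))   # 50
print(schedule_cost(bad, crew, pairings, rules, cost))    # None
```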
The model as stated above neither allows non-linear objectives when combining rosters, nor
permits restricting the combination of rosters by additional multiple crew member rules one might
be interested in for real-life applications. Nevertheless, both methods that we present for solving
the problem can treat linear multiple crew member rules as well.
5.2 Two Approaches for the Crew Assignment Problem
In this section, we introduce two approaches for the CAP that we want to combine later. As a
major objective, we aim at developing a generic tool that is able to treat different rules and regu-
lations that typically arise in airline companies. Particularly for European airlines, these rules are
very complex and often non-linear. It was therefore decided to model the rules and regulations
as a constraint program. Hence, both approaches and the resulting integrated approach are based
on a CP core. For further details on this core, we refer the reader to [163].
[Figure 5.1: pairings 1 to 9 drawn as nodes along a time axis]
Fig. 5.1: Constructing a legal reduced-cost optimal roster is equivalent to finding a constrained shortest
path in a weighted DAG.
5.2.1 CP-based Column Generation Approach
The definition of the CAP as stated above allows us to decompose it naturally into the sub-problem
of generating legal rosters and the set partitioning (SPP) master problem. Therefore, we can
apply the idea of CP-based column generation that was introduced in Section 3.1. The master
problem is an integer program (IP) that ensures restrictions (5a) and (5b) in Definition 5.1:
\[
\begin{array}{lll}
\min & \sum_{i=1,\dots,k} f(c_{\varphi(i)}, t_i)\, x_i & \\
\text{s.t.} & \sum_{i=1,\dots,k \text{ and } \varphi(i)=j} x_i = 1 & \forall j = 1,\dots,m \quad (5.1) \\
 & \sum_{i=1,\dots,k \text{ and } s \text{ belongs to } t_i} x_i = 1 & \forall s = 1,\dots,n \quad (5.2) \\
 & x_i \in \{0,1\} &
\end{array}
\]
whereby $\varphi : \{1,\dots,k\} \to \{1,\dots,m\}$ maps a column number to a crew member. The $m$ constraints
in (5.1) assign exactly one line of work to each crew member. The $n$ constraints in (5.2) ensure
that all activities are covered exactly once. In this model, every (legal) roster corresponds to a
0-1 column.
The sub-problem consists in finding rosters respecting all rules and improving the objective.
From linear programming (LP) duality theory, it is known that columns with negative reduced
costs are candidates for such an improvement. Notice that duality theory is only valid for the
LP-relaxation of the original IP. Thus, pure column generation must be viewed as a heuristic
only. To prove optimality of the IP model, column generation has to be extended to a branch and
price approach [13].
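For the master problem above, the reduced cost of a roster column follows the usual LP rule: the column cost minus the dual values of the rows it covers, that is, the row (5.1) of its crew member and the rows (5.2) of its pairings. A minimal sketch in Python, where the dual values and the dictionary names `crew_duals` and `pairing_duals` are made up for illustration:

```python
def reduced_cost(cost, crew_member, pairings, crew_duals, pairing_duals):
    """Column cost minus the duals of the crew row and of all covered pairing rows."""
    return (cost
            - crew_duals[crew_member]
            - sum(pairing_duals[s] for s in pairings))


crew_duals = {1: 2.0, 2: 0.0}              # duals of constraints (5.1), assumed values
pairing_duals = {1: 5.0, 2: 4.0, 3: 1.0}   # duals of constraints (5.2), assumed values

print(reduced_cost(10.0, 1, {1, 2}, crew_duals, pairing_duals))   # -1.0
print(reduced_cost(4.0, 1, {3}, crew_duals, pairing_duals))       # 1.0
```

In this toy setting only the first roster has negative reduced cost and would be a candidate for entering the master problem.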
We first generate a set of individual lines of work and then try to combine them to partition
the entire work. When solving the LP-relaxation of the master problem, we get dual information
that allows us to search for potentially improving columns. That is, in the sub-problem we try to
generate new rosters that have negative reduced costs. Those rosters are added to the master
problem, which is solved again, and so on until no more rosters with negative reduced costs can
be computed or until a certain iteration limit is reached.
[Figure 5.2, components: Initialization, Constraint Model for Airline Rules, CP-based Column Generator,
LP Solver, SPP Matrix, SPP Solver; data exchanged: Rosters, LP Duals, Solution/Duals]
Fig. 5.2: The entire approach: The inner loop generates columns using dual information, the outer loop
solves the master problem.
Selecting an optimal set of non-overlapping activities respecting the rule set can be inter-
preted as the problem of finding a constrained shortest path in a weighted directed acyclic graph
(DAG) $G$ (see Figure 5.1). However, due to complex and possibly non-linear single crew member
rules, generating legal rosters with negative reduced costs can be very difficult, and it is doubtful
whether the shortest path substructure is really dominant in the constraint satisfaction problem
that arises. Moreover, rule sets vary from airline to airline and have no common structure that
could easily be exploited to design a generic efficient constrained shortest path algorithm that
can cope with any rule set. Therefore, we apply a CP search to generate legal rosters. As we
are only searching for individual lines of work with associated negative reduced costs, we add
an optimization constraint that is used for problem reduction. That is, instead of searching for
constrained shortest paths, we rather introduce a shortest path constraint.
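The filtering idea behind such a constraint can be sketched as follows. This is a generic illustration, not necessarily the algorithm of Section 2.2.2: in a DAG whose nodes are numbered in topological order, an arc can be discarded if even the shortest source-sink path through it does not stay below a given bound (here 0, for negative reduced costs). The function name, the arc representation, and the example weights are assumptions.

```python
import math


def filter_arcs(num_nodes, arcs, bound):
    """Keep only arcs that lie on some source-sink path of length < bound.

    arcs: dict {(u, v): weight} with u < v, i.e. nodes are numbered in
    topological order; node 0 is the source, node num_nodes - 1 the sink.
    """
    from_source = [math.inf] * num_nodes
    to_sink = [math.inf] * num_nodes
    from_source[0] = 0.0
    to_sink[-1] = 0.0
    for (u, v), w in sorted(arcs.items()):                 # tails in increasing order
        from_source[v] = min(from_source[v], from_source[u] + w)
    for (u, v), w in sorted(arcs.items(), reverse=True):   # tails in decreasing order
        to_sink[u] = min(to_sink[u], w + to_sink[v])
    return {(u, v): w for (u, v), w in arcs.items()
            if from_source[u] + w + to_sink[v] < bound}


arcs = {(0, 1): -2.0, (0, 2): 3.0, (1, 2): 1.0, (1, 3): -1.0, (2, 3): -1.0}
remaining = filter_arcs(4, arcs, bound=0.0)
print(sorted(remaining))   # [(0, 1), (1, 2), (1, 3), (2, 3)]
```

Arc (0, 2) is removed because the best path using it has length 2, so no roster with negative reduced cost can contain it.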
The entire approach is sketched in Figure 5.2. In the initialization phase we start by
setting up the desired airline rules and regulations and generate an initial SPP matrix that may
consist of dummy columns only (see Section 5.2.1.2). In the outer loop of master iterations,
we solve the current SPP integer program and get a first current solution and new dual values
of the LP-relaxation. The column generator then defines the next sub-problem to be solved by
picking a crew member. A specified number of rosters is generated in the inner loop: we add
the corresponding columns to the (continuous) master problem, solve it by means of linear
programming, and obtain new dual values. After rosters have been generated for all crew members,
we obtain an enhanced SPP matrix and the next master iteration begins. This process is
either interrupted after a given time limit has been exceeded or when no more rosters have been
generated that yield an improvement of the LP-relaxation in any of the sub-problems.
[Figure 5.3, two panels over master iterations 1 to 6: choice points (left) and time [sec] (right);
series: NRC, SPC, total enum]
Fig. 5.3: Number of choice points versus master iterations (left), and running time versus master iterations
(right) for SPC, NRC, and total enumeration. The tests were run with a data instance of type 10-00-20
that was solved to optimality.
In the following, we investigate some properties of the sketched procedure in more detail and
show that CP-based column generation is able to solve non-trivial Crew Assignment Problems.
In particular, we demonstrate the effect obtained by the propagation of the path constraint.
The problem instances that we consider here stem from a major European airline. The rules,
regulations, and objective function have directly been abstracted from the real-world case and
preserve the essential characteristics of this case. The data sets are sufficiently large to measure
the effects of constraint propagation, but they are small enough to run experiments in a reasonable
time frame.
To characterize an instance, we specify the number of crew members, the number of pre-
assigned activities, and the number of activities to be assigned. For example, an instance of type
67-165-280 consists of 67 crew members, 165 pre-assignments, and 280 tasks. The experiments
were run on a SUN UltraSparc-IV with 296 MHz CPU and 1024 MB main memory. For the
constraint model of the CAP, the ROSTER LIBRARY [163] based on ILOG SOLVER 4.4 [120]
was used. The LPs and IPs were solved with ILOG PLANNER 3.3 [119].
5.2.1.1 Shortest Path Filtering
In the experiment for Figure 5.3, we compare three models for the generation of legal rosters
with negative reduced costs. The first performs total enumeration, the second (NRC) uses a
simple arithmetic constraint to ensure the generation of columns with negative reduced costs,
and the third (SPC) uses a DAG shortest path constraint (see Section 2.2.2) for this purpose. The
left picture shows the reduction of choice points when using cost-based filtering techniques for
problem reduction. In the sixth master iteration, SPC uses fewer than half as many choice
points as NRC. This gain is not consumed by a significant increase in computation time per
choice point: As shown in the right picture, the decrease in running time is quite similar to the
decrease in the number of choice points. As expected, total enumeration is not competitive at all.
[Figure 5.4: choice points per master iteration, values in order of master iteration]
SPC 574 609 504 392 280 931 210 119 119 147 63 77 0
NRC 574 1918 1197 2037 4599 5117 6118 12446 14077 13433 18340 21532 32095
Fig. 5.4: Number of choice points versus master iterations using SPC and NRC with a data set of type 7-0-30.
To demonstrate the superiority of a shortest path constraint when compared with a simple
arithmetic constraint in more detail, we run a test on a small instance where, in each master
iteration, the number of choice points is noted. Figure 5.4 shows that SPC uses far fewer
choice points than NRC. Furthermore, in the last master iteration, the shortest path constraint
is able to prove optimality for the continuous relaxation of the master problem very quickly by
showing that no more columns with negative reduced costs exist. The negative reduced cost
constraint, however, still visits an increasing number of choice points per iteration.
One reason for the efficiency of the shortest path constraint, and the reason why there is almost
no gap between the reduction of choice points and the reduction in time, is the use of the
incremental version as mentioned in Section 2.2.2.3. In Figure 5.5, we compare a non-incremental
version of the shortest path constraint with an incremental one. For a fixed time of 10000 seconds
for the entire optimization, the faster incremental version needs only 2000 seconds for propagation,
whereas, in the non-incremental version, almost 60% of total calculation time is consumed
by that part of the algorithm. Thus, the incremental version performs nearly 3 times
as many propagations as the non-incremental version and hence helps to improve the solution
quality.
[Figure 5.5, left panel: propagations versus time [sec], series: incremental, non-incremental;
right panel: cost versus time [sec]]
Fig. 5.5: The left picture shows time versus the number of calls of the propagation routine using the
incremental and the non-incremental implementation of the shortest path constraint. Both versions were
stopped after 10000 seconds total CPU time. The experiment was run with a data instance of type 10-00-70.
The right picture shows a comparison of NRC (upper curve) and SPC (lower curve) in a time versus
quality diagram on a data instance of type 67-165-280.
The right picture in Figure 5.5 shows a time versus quality comparison of NRC and SPC.
After a first big drop in the objective, NRC dives deeply into huge search trees that only consist
of rosters with non-negative reduced costs. SPC can prune those search trees much earlier and
therefore continuously reduces the objective without stalling.
We have shown that CP-based column generation works reasonably well for the CAP. However,
we observe two major drawbacks. We will see in Section 5.3 how these problems can be
overcome by combining the column generation approach (CGA) with a direct CP approach.
5.2.1.2 Set Partitioning Set Covering
The major obstacle for CGA is the set partitioning (SPP) structure of the master problem. Finding
a feasible solution to the SPP is NP-hard already [88]. Moreover, the dual information gained
from equation constraints is more difficult to exploit than that of cover or packing constraints.
Therefore, we would actually like to relax the master problem to a set covering formulation (that
remains an NP-hard problem but can be solved much more easily in our case) by only requiring
the pairings to be flown by one or more crew members, i.e., we suggest relaxing (5.2) to $\sum_i x_i \ge 1$.
Then, however, to compute a legal schedule, we have to decide which crew member finally gets
an over-covered pairing assigned.
5.2.1.3 Feasible Solutions for Set Partitioning
To obtain a formulation that guarantees that we can always find a feasible solution, we add two
types of dummy columns: The first type of column covers exactly crew member $i$, the second
exactly one activity $j$, for all $i = 1,\dots,m$ and $j = 1,\dots,n$. That is, we allow empty rosters and
unassigned activities. By setting the costs for choosing a dummy column to an arbitrarily high
value, we make sure that they only become part of an optimal solution if the original master
problem is infeasible.
Although this procedure works, to achieve meaningful dual information of the master prob-
lem, the solution must not be spoiled by dummy costs. Thus, it is preferable to generate an initial
set of rosters that contains an entire work partitioning schedule.
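The construction of the dummy columns can be written down in a few lines. The sketch below is illustrative only: columns are represented as triples (cost, crew member or None, set of activities), and the penalty `big_cost` is an assumed value, since the thesis only requires it to be arbitrarily high.

```python
def dummy_columns(m, n, big_cost=10**6):
    """One empty roster per crew member and one 'unassigned' column per activity."""
    crew_dummies = [(big_cost, i, frozenset()) for i in range(1, m + 1)]
    activity_dummies = [(big_cost, None, frozenset({j})) for j in range(1, n + 1)]
    return crew_dummies + activity_dummies


def is_feasible(columns, m, n):
    """Check rows (5.1) and (5.2): each crew member and each activity covered once."""
    crew_rows = sorted(c for _, c, _ in columns if c is not None)
    activity_rows = sorted(s for _, _, t in columns for s in t)
    return (crew_rows == list(range(1, m + 1))
            and activity_rows == list(range(1, n + 1)))


initial = dummy_columns(m=2, n=3)
print(len(initial), is_feasible(initial, 2, 3))   # 5 True
```

Selecting all dummy columns is always feasible, which is exactly what makes the initial SPP matrix solvable before any real roster has been generated.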
5.2.2 Heuristic Tree Search Approach
The other algorithm developed to tackle the CAP is the heuristic tree search approach (HTS)
based on constraint programming. In HTS, each complete feasible solution of the CAP is
constructed by solving the corresponding constraint satisfaction problem [205]. The problem is
modeled by a set of variables which correspond to assignable pairings. For each pairing, there
is a variable whose domain represents the crew members that can possibly be assigned to
the pairing.¹ The initial domain of each such variable comprises all crew members available
for this pairing. Posting the appropriate
constraints reduces the domains of these variables by removing crew members that cannot be
allocated to the corresponding pairings. Such removals result, for example, from pre-assigned
activities or from regulation violations caused by the crew member's history. The search tree
of the problem is created by iterating over pairings in a heuristically determined dynamic order
and assigning each pairing to a crew member.
As usual, every branch in the search tree corresponds to assigning a pairing to a crew member.
Every non-leaf node corresponds to a partial assignment, identified by the path from the
root to the node. Leaf nodes correspond to infeasible partial assignments or complete and legal
schedules, i.e. (not necessarily optimal) feasible solutions of the problem. Each allocation of a
crew member to a pairing activates the constraint propagation mechanism. We try to avoid
total enumeration by removing values which are inconsistent with the posted constraints from
the variables' domains. For example, the assignment of a pairing to a crew member causes the
removal of this crew member from the domains of all pairings that overlap in time with the one that
has just been assigned. When a node is proven to be a dead-end, which means that one or more
pairings cannot be carried out by any crew member in the given partial assignment, backtracking
occurs, and decisions taken before are reconsidered.
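The overlap example can be sketched as a tiny propagation step. This is an illustration under simplifying assumptions, not the ROSTER LIBRARY model: pairings are half-open time intervals, the only rule is that a crew member cannot work two overlapping pairings, and the function names are hypothetical.

```python
def overlaps(a, b):
    """Two half-open time intervals (start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def assign(pairing, member, times, domains):
    """Assign a pairing and propagate; return new domains, or None at a dead-end."""
    new = {p: set(d) for p, d in domains.items()}
    new[pairing] = {member}
    for other, interval in times.items():
        if other != pairing and overlaps(times[pairing], interval):
            new[other].discard(member)     # member is busy during 'other'
            if not new[other]:
                return None                # nobody left for 'other': backtrack
    return new


times = {"A": (0, 5), "B": (3, 8), "C": (6, 9)}
domains = {p: {1, 2} for p in times}

after_a = assign("A", 1, times, domains)
print(after_a["B"], after_a["C"])          # {2} {1, 2}
print(assign("C", 2, times, after_a))      # None
```

After A is given to crew member 1, only member 2 can take B; assigning C to member 2 as well then empties the domain of B, so this branch is a dead-end.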
The constraints of the problem are the regulations of the airline at hand that dictate which
rosters are acceptable and which ones are in violation of the airline rules. A solution to a
constraint satisfaction problem is any assignment of values to variables that respects all constraints.
A feasible solution to the CAP, formulated as a constraint satisfaction problem, is any assignment
of crew members to pairings such that all airline rules and regulations are respected. Then the
objective function is optimized by searching for improving solutions only. Regarding the way
the search tree is traversed, a variety of search methods were developed and tested.
¹ It is assumed that every pairing can only be assigned to one crew member. In case more than one crew
member is necessary to staff a pairing, copies of the pairing are created, and each copy can again only be
assigned to one single crew member.
5.2.2.1 Tree Traversal
A variety of search methods for traversing the search tree exists in the literature. The oldest,
most popular and, by far, most widely used search method is Depth First Search (DFS). The
main drawback of DFS is that, even for instances of moderate size, it only explores a very small
portion of the search tree at the lower left.² DFS was implemented and tested for the CAP, and,
unsurprisingly, it was found that it does not perform very well, because the first decisions taken
are never reconsidered.
Innovation in the field came from the notion of discrepancy. At a given node, a heuristic
function suggests which branch the search should follow, namely the one that is assumed most
likely to contain solutions (or solutions of good quality in the case of optimization). Always
following the heuristic’s advice defines a unique path that is said to contain no discrepancies.
Following the heuristic’s advice except for one case defines paths of discrepancy 1, for two cases
discrepancy 2, and so on.
Limited Discrepancy Search (LDS) [106] is an iterative search method. In the i-th iteration,
it explores all paths with i discrepancies. In the original LDS method, paths with discrepancies
higher up the tree are explored before the ones where the discrepancies occur further towards the
bottom. The intuitive justification for that approach is that a heuristic is more likely to fail higher
up the tree, where information is limited. We implemented a variant of LDS. Our variant searches
paths with discrepancies lower down the tree before the ones with “higher” discrepancies. The
advantage is that time-consuming descents from near the root towards the leaves are avoided.
Also, our variant is not iterative. It searches those paths having i or fewer discrepancies and then
exits. Thus, it is not complete. In practice, however, the parameter i can be chosen so that a big
enough portion of the tree is explored. In our experiments, this portion of the tree is much bigger
than a modern computer could explore in a reasonable amount of time. We refer to this variant
as modified Exact Discrepancy Search (mEDS).
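The visiting order of such a non-iterative, discrepancy-bounded search can be illustrated with a small Python sketch. It is not the implementation used in our experiments: it assumes a binary tree of fixed depth, encodes following the heuristic as 0 and a deviation as 1, and the name bounded_discrepancy_paths is hypothetical.

```python
from typing import Iterator, Tuple

Path = Tuple[int, ...]


def bounded_discrepancy_paths(depth: int, max_disc: int) -> Iterator[Path]:
    """Yield all root-to-leaf paths with at most max_disc discrepancies.

    A path is a tuple of branch choices: 0 follows the heuristic ("left"),
    1 deviates from it (one discrepancy). Binary branching is an assumption.
    """

    def descend(prefix: Path, budget: int) -> Iterator[Path]:
        if len(prefix) == depth:
            yield prefix
            return
        # Follow the heuristic first; the discrepancy budget is untouched.
        yield from descend(prefix + (0,), budget)
        # Deviate only while discrepancies are still allowed.
        if budget > 0:
            yield from descend(prefix + (1,), budget - 1)

    yield from descend((), max_disc)


paths = list(bounded_discrepancy_paths(depth=3, max_disc=1))
print(paths)  # [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```

Because the sketch relies on plain backtracking, the path with its single discrepancy at the deepest level is visited before the one that deviates at the root, and no descent is restarted from the root.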
Depth-Bounded Discrepancy Search (DDS) [213] is also an iterative method. In the i-th
iteration, it explores all paths where discrepancies occur before depth i. In contrast to LDS, a path
with many discrepancies high in the tree is explored before a path with very few discrepancies
low in the tree. This is also justified by the assumption that heuristics tend to fail with a higher
probability at the top of the tree.
² It is common practice to regard the branches under a node as ordered according to a heuristic function.
Following the advice of a heuristic means to go “left” down the search tree.
Finally, we also implemented Large Neighborhood Search (LNS), which was introduced in [200]
and incorporates local search techniques within the CP framework. The idea is to restrict the
search to a fragment of the search space. In this way, local improvements can be made
that would remain unnoticed by most incomplete search methods. A reduced search space for
a problem with a set of variables $V$ and a known feasible assignment $A$ can be created as
follows: A large subset $V_1$ of $V$ is selected. All assignments in $A$ for variables in $V_1$ are fixed, and
thus a partial solution is created. Search is performed over the remaining variables with any of the
above search methods. After this search is finished (either because the search sub-space has been
exhausted or because any other termination criterion is met), another sub-space is selected and
the process is repeated. The advantage of LNS is that local improvements are discovered easily,
and the objective value may improve quickly. The disadvantage is that the search space cannot
be viewed globally. Thus, it is likely that important improvements are missed. A reasonable
strategy when using LNS is to use one of the search methods above in the beginning to guide the
search towards a promising area of the search space and to use LNS afterwards.
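The LNS loop described above can be sketched in Python as follows. This is a toy illustration under several assumptions that are not taken from our implementation: the problem has no feasibility constraints, the sub-space search is a plain enumeration standing in for any of the tree search methods, and the names large_neighborhood_search and toy_cost as well as the objective are hypothetical.

```python
import itertools
import random
from typing import Callable, Dict, Sequence

Assignment = Dict[str, int]


def large_neighborhood_search(
    incumbent: Assignment,
    domains: Dict[str, Sequence[int]],
    cost: Callable[[Assignment], float],
    free_size: int,
    iterations: int,
    rng: random.Random,
) -> Assignment:
    """Repeatedly fix most variables and re-optimize the released ones."""
    best = dict(incumbent)
    variables = sorted(best)
    for _ in range(iterations):
        # Release a small set of variables; all others (the large subset V1)
        # keep their values from the current assignment.
        free = rng.sample(variables, free_size)
        # Stand-in for the sub-tree search: enumerate the released variables.
        for values in itertools.product(*(domains[v] for v in free)):
            candidate = dict(best)
            candidate.update(zip(free, values))
            if cost(candidate) < cost(best):
                best = candidate
    return best


def toy_cost(x: Assignment) -> float:
    # Hypothetical objective: neighbouring variables should differ by exactly one.
    names = sorted(x)
    return sum((abs(x[u] - x[v]) - 1) ** 2 for u, v in zip(names, names[1:]))


domains = {name: range(4) for name in ("a", "b", "c", "d", "e")}
start = {name: 0 for name in domains}
result = large_neighborhood_search(start, domains, toy_cost, 2, 30, random.Random(0))
print(toy_cost(start), toy_cost(result))
```

Each iteration only accepts improving candidates, so the objective value never gets worse, but nothing forces the loop to leave a local optimum. This mirrors the disadvantage noted above.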
5.3 Integration
We present two ways of integrating both methods, each motivated by one of the following two
problem cases:
5.3.1 The Airline Test Cases
We consider real-world test cases stemming from two European airline companies. The instances
of company A consist of 50–65 crew members and 766–959 pairings. Company B has 7–30 crew
members and 129–279 pairings. Case A covers a planning period of one calendar month, while
data sets for B cover two weeks. While case B incorporates mainly 1–2 day pairings, A considers
pairings of duration less than 24 hours.
The objective of company B is to achieve a fair distribution of activities over all crew
members, whereas in A we aim at satisfying as many preferences expressed by the crew members as
possible by minimizing dissatisfaction. Importantly, the rule sets in both cases are distinct. In
A, typical rules such as succession rules and rest time rules are incorporated, but also more
complicated ones like rules ensuring a minimum number of days off within sliding windows of
variable lengths. Also, rules guaranteeing minimum and maximum flight time are enforced. All rules in A
are hard constraints, meaning that if they are violated, the solution is considered infeasible. In
B, we consider flight time rules that limit the time actually flown by the crew within certain time
periods. These rules are also strict.
The main difference between the two test cases regarding the algorithms we developed is
caused by the fact that company B does not insist on a partitioning of the work, i.e.,
restriction (5b) is relaxed to $\bigcup_{1 \le i \le m} t_i \subseteq T$. Obviously, this difference
requires that our column generation approach is able to incorporate two different types of master problems.
In the first problem case, the construction of a feasible schedule is difficult due to very strict
rules called for by the airline company. We observe that CGA eventually gets close to solutions
of good quality, but minor inconsistencies delay it disproportionately long. We show that this
can be overcome effectively by letting the CGA approach solve a relaxed (that is, set covering)
version of the problem and then handing possibly over-covered (and thus infeasible) solutions to
the HTS approach for fixing.
In the second problem case, the rule set is not that strict. The CGA approach alone proceeds
as expected. However, the initial time spent on driving dummy columns out of the basis is
considerable. In this phase, dual values are not very meaningful, because penalties dominate the
objective. We show how the HTS method can help to attack the problem.
5.3.2 Transforming a Set Covering into a Set Partitioning Solution
The first method is applied to case A. In this company, no pairing can be left unassigned.
Moreover, there is a relatively large number of pairings with respect to the number of crew members
and the number of pairings that a single crew member is able to service (for example, 959 daily
pairings and 65 crew members on a typical monthly instance). These conditions make finding a
feasible solution difficult for the CGA approach. On the other hand, the HTS approach is able to
construct feasible solutions by using sophisticated search methods and heuristics tailored to the
specific problem. However, after a short while no improving solutions can be found.
We overcome the problems of both methods by letting the CGA approach find set covering
instead of set partitioning solutions. That is, we relax the pairing partitioning constraints (5.2) by
only requiring that every pairing is assigned to at least one crew member. The columns generated
by the CGA approach can much more easily be combined to SCP solutions. Then the conversion
of SCP to SPP solutions is performed by the HTS approach, which can resolve local conflicts
efficiently by using sophisticated propagation algorithms. An outline of the procedure is shown in
Algorithm 1. Here, $V$ is the set of all variables, $A_X$ is a tuple of assignments $v \leftarrow x_v$ of values $x_v$
to variables $v$ generated by approach $X$, $a(A, v)$ is a function which returns the value of variable
$v$ in assignment $A$, DEFAULTSVAR and DEFAULTSVAL are the variable and value selection
functions normally used by the HTS approach, respectively, REPAIRSVAR and REPAIRSVAL
are the corresponding heuristics used for repairing set covering solutions, and HTSOPTIMIZE
and CGAOPTIMIZE are the HTS and CGA optimization functions. PARTITION is a function
which will be explained shortly. LNSOPTIMIZE performs optimization using the LNS method
that forms search sub-spaces by dividing the planning horizon into time windows.
Algorithm 1 Top level algorithm for the first method
1: $A_{HTS} \leftarrow$ HTSOPTIMIZE($V$, DEFAULTSVAR, DEFAULTSVAL)
2: repeat
3:   $A_{CGA} \leftarrow$ CGAOPTIMIZE
4:   $(V_1, V_2, V_3) \leftarrow$ PARTITION($A_{HTS}$, $A_{CGA}$)
5:   for all $v \in V_3$ do
6:     $v \leftarrow a(A_{CGA}, v)$
7:   $A_{HTS} \leftarrow$ HTSOPTIMIZE($V_1 \cup V_2$, REPAIRSVAR($V_1 \cup V_2$, $A_{CGA}$, $V_2$),
       REPAIRSVAL($V_1 \cup V_2$, $A_{CGA}$, $V_2$))
8:   $A_{HTS} \leftarrow$ LNSOPTIMIZE($V$, REPAIRSVAR($V$, $A_{CGA}$, $V_1 \cup V_2 \cup V_3$),
       REPAIRSVAL($V$, $A_{CGA}$, $V_1 \cup V_2 \cup V_3$), $A_{HTS}$)
9: until stopping condition
We now explain this algorithm in greater detail. In the first line, one or more initial solutions
are found by the HTS approach. This initialization step provides the algorithm with a set of
columns which can be combined to feasible solutions. Not much time is devoted to this phase.
The variable and value selection heuristics that would normally be used by the HTS approach are
applied here. Any of the methods presented in the previous sections can be plugged in. However,
we found mEDS to perform best in our case. The columns constituting these solutions are handed
to the CGA approach for optimization in Line 3. The solution produced in this step is correct
except for the fact that some pairings may be assigned to more than one crew member, which is not
legal.
The next task is to use the information found in $A_{CGA}$ to construct a feasible solution. Let
$V_1$ be the set of variables which correspond to over-covered pairings. An optimistic approach
would be to assign the values of the assignment $A_{CGA}$ to all the variables in $V \setminus V_1$ and let the
HTS approach perform a search in the space of the variables in $V_1$. This, however, could lead to a
failure, since it is not known whether the partial solution obtained is extensible to a feasible solution.
There are other scheduling problems, such as the Vehicle Routing Problem With Time
Windows [185], for which a set covering solution can be repaired easily by removing
entries for over-covered rows from all but one of the corresponding columns. However, in our
case, this procedure is likely to fail, as certain rules may cause the resulting rosters to be
infeasible. For example, a minimum flight time rule might be violated if a pairing is removed from an
otherwise feasible roster. We say that such a rule destroys the legal sub-roster property of a rule
set.
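A small Python sketch illustrates why the simple repair can fail. The data, the minimum flight time of 8 hours and all function names are made up for illustration. The naive repair keeps each over-covered pairing only in the first roster that contains it.

```python
from collections import Counter
from typing import Dict, List, Set

Rosters = Dict[str, List[str]]  # crew member -> pairings in the roster


def over_covered(rosters: Rosters) -> Set[str]:
    """Pairings that appear in more than one roster."""
    counts = Counter(p for pairings in rosters.values() for p in pairings)
    return {p for p, c in counts.items() if c > 1}


def naive_repair(rosters: Rosters) -> Rosters:
    """Keep every pairing only in the first roster that contains it."""
    seen: Set[str] = set()
    repaired: Rosters = {}
    for crew, pairings in rosters.items():
        kept = [p for p in pairings if p not in seen]
        seen.update(kept)
        repaired[crew] = kept
    return repaired


def violates_min_flight_time(
    pairings: List[str], flight_time: Dict[str, int], minimum: int
) -> bool:
    return sum(flight_time[p] for p in pairings) < minimum


# Hypothetical set covering solution: pairing p2 is covered twice.
rosters = {"c1": ["p1", "p2"], "c2": ["p2", "p3"]}
flight_time = {"p1": 6, "p2": 5, "p3": 4}  # hours, made-up values

repaired = naive_repair(rosters)
print(over_covered(rosters))  # {'p2'}
print(repaired)  # {'c1': ['p1', 'p2'], 'c2': ['p3']}
print(violates_min_flight_time(repaired["c2"], flight_time, minimum=8))  # True
```

After the repair, every pairing is covered exactly once, but the roster of c2 drops from 9 to 4 flight hours and no longer satisfies the assumed minimum.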
We can distinguish three subsets of variables in $V$: The set $V_1$ that consists of variables that
correspond to over-covered pairings in $A_{CGA}$, the set $V_2$ that consists of variables which have
different values in $A_{CGA}$ and $A_{HTS}$, and the set $V_3$ which corresponds to variables having the
same value in both assignments.
Algorithm 2 Heuristics for the first method
REPAIRSVAR($S$, $A$, $V$)
1: $v \leftarrow$ NIL
2: for all unbound variables $v \in V$ do
3:   if $a(A, v) \in D_v$ then
4:     return $v$
5: return DEFAULTSVAR($S$)
REPAIRSVAL($S$, $A$, $V$, $v$)
1: if $v \in V$ and $a(A, v) \in D_v$ then
2:   return $a(A, v)$
3: else
4:   return DEFAULTSVAL($S$, $v$)
The function PARTITION partitions $V$ in exactly this manner. Assignments of variables in
$V_3$ are known to be extensible to a full solution, since one has already been found. Thus, since
there is no information which suggests the contrary, they are realized as soon as possible in each
iteration (Lines 5 and 6 in Algorithm 1). Assignments in set $V_2$ may be considered as almost
certain. However, in Line 7 of Algorithm 1, they are realized in a way that allows them to be
reconsidered in case there exists no feasible solution that extends the assignments of variables in $V_2$ and
$V_3$. Finally, CGA does not provide meaningful information for variables in $V_1$. Therefore, HTS
performs the search for assignments to these variables using the default heuristics.
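As an illustration, the three-way split can be sketched in Python. The sketch makes assumptions that go beyond the text: assignments are dictionaries from pairing variables to crew members, the over-covered variables are passed in explicitly (a set covering solution does not give them a unique value), and membership in $V_1$ takes precedence so that the three sets are disjoint. The function name partition is hypothetical.

```python
from typing import Dict, Set, Tuple

Assignment = Dict[str, str]  # variable (pairing) -> value (crew member)


def partition(
    a_hts: Assignment, a_cga: Assignment, over_covered_vars: Set[str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split the variables into (V1, V2, V3) as described in the text."""
    v1 = set(over_covered_vars)
    rest = [v for v in a_cga if v not in v1]
    # V2: both approaches disagree on the value.
    v2 = {v for v in rest if a_hts.get(v) != a_cga[v]}
    # V3: both approaches agree on the value.
    v3 = {v for v in rest if a_hts.get(v) == a_cga[v]}
    return v1, v2, v3


a_hts = {"p1": "c1", "p2": "c2", "p3": "c1", "p4": "c3"}
a_cga = {"p1": "c1", "p2": "c3", "p3": "c2", "p4": "c3"}
v1, v2, v3 = partition(a_hts, a_cga, over_covered_vars={"p3"})
print(sorted(v1), sorted(v2), sorted(v3))  # ['p3'] ['p2'] ['p1', 'p4']
```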
The variable and value selection functions are modified as shown in Algorithm 2. There, the
variables that are not fixed yet are given in the set $S$. $V$ is a subset of $S$ for which assignments
exist in $A$. For example, when the variable selection rule is invoked in Line 7 of Algorithm 1, $S$
is $V_1 \cup V_2$, $V$ is $V_2$ and $A$ is $A_{CGA}$. In this case, the variable to be assigned next is any variable
in $V_2$ whose suggested value exists in its domain. In other words, all possible assignments
in $A_{CGA}$ are realized as soon as possible, in accordance with the intuitive belief that they will most
probably lead to an area that contains improving solutions. If this is not possible, then a variable
in $V_1$ is selected, and the default heuristic is used.
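The two heuristics of Algorithm 2 translate almost literally into Python. The following sketch is only an illustration: the explicit domains and bound arguments, the dictionary representation of the suggested assignment, and the stand-in default heuristics (min) are assumptions, since in the actual system this state lives inside the constraint solver.

```python
from typing import Callable, Dict, Sequence, Set


def repair_select_var(
    unfixed: Sequence[str],
    suggestion: Dict[str, str],
    hinted: Sequence[str],
    domains: Dict[str, Set[str]],
    bound: Set[str],
    default_var: Callable[[Sequence[str]], str],
) -> str:
    """Prefer an unbound hinted variable whose suggested value is still in its domain."""
    for v in hinted:
        if v not in bound and v in suggestion and suggestion[v] in domains[v]:
            return v
    return default_var(unfixed)


def repair_select_val(
    unfixed: Sequence[str],
    suggestion: Dict[str, str],
    hinted: Sequence[str],
    v: str,
    domains: Dict[str, Set[str]],
    default_val: Callable[[Sequence[str], str], str],
) -> str:
    """Use the suggested value if it is still available, else the default heuristic."""
    if v in hinted and v in suggestion and suggestion[v] in domains[v]:
        return suggestion[v]
    return default_val(unfixed, v)


# Hypothetical state: p1's suggested value c3 has already been filtered out.
domains = {"p1": {"c1", "c2"}, "p2": {"c2"}, "p3": {"c1", "c3"}}
suggestion = {"p1": "c3", "p2": "c2"}
hinted = ["p1", "p2"]
smallest_value = lambda unfixed, v: min(domains[v])  # stand-in default heuristic

first_var = repair_select_var(["p1", "p2", "p3"], suggestion, hinted, domains, set(), min)
first_val = repair_select_val(["p1", "p2", "p3"], suggestion, hinted, first_var, domains, smallest_value)
next_var = repair_select_var(["p1", "p3"], suggestion, hinted, domains, {"p2"}, min)
next_val = repair_select_val(["p1", "p3"], suggestion, hinted, next_var, domains, smallest_value)
print(first_var, first_val, next_var, next_val)  # p2 c2 p1 c1
```

In the example, p2 is selected first because its suggested value is still available. Once p2 is bound, no hinted variable qualifies any more, so both selection functions fall back to the default heuristics.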
Whenever possible, the value selection heuristic assigns the value suggested by CGA. Two
important details are worth noting:
1. The variable selection heuristic is consulted every time a new assignment has to be
made in the HTS search. That is, if a variable $v$ is selected (because $a(A_{CGA}, v) \in D_v$) and then,
for any reason, the search backtracks beyond that point (removing $a(A_{CGA}, v)$ from $D_v$), then
another variable might be selected instead of $v$. That way, assignments and not just variables are
dynamically ordered throughout the search process in such a way that those decisions contained
in $A_{CGA}$ will always be taken as early as possible.
2. Discrepancy-based search methods are used, motivated by the belief that the assignments in
$A_{CGA}$ are probably good ones. That is, we try to stick to the decisions made by CGA, and we
would like to make only a few deviations. In our implementation, this issue is handled by using a
variant of the LDS search method. In the original LDS proposal, based on the assumption that
heuristic decisions are less accurate high up in the search tree, early decisions are reconsidered
first. In our case, though, the assignments for variables in $V_2$ are realized in the beginning, and
we want to stick to them. Therefore, we prefer to use mEDS in this phase, too.
We further note that the function HTSOPTIMIZE in Line 1 of Algorithm 1 may or may not use
LNS. Whether LNS can help to improve the efficiency is problem dependent. Our experiments
show that, as a stand-alone method, it is not preferable because it is likely to get stuck in a
local optimum soon. However, we found that it can be useful to apply LNS after having found
the first solution with the help of a global tree search method. Locality is not a major problem
when using LNS in combination with CP-based column generation: the latter carries the major
burden of optimization, whereas the CP approach is used to resolve minor local inconsistencies,
hopefully without losing much of the relaxed set covering solution quality. We show the effects
of using LNS in our experimental results.
We also use LNS in Line 8 of Algorithm 1 to overcome a problem that might arise when
fixing variables in $V_3$. Recall that they are determined by assignments which have the same
values in both $A_{HTS}$ and $A_{CGA}$, and they are bound to their values as proposed by CGA to
explore promising regions of the search space. We give the search more freedom by allowing
these assignments to be reconsidered and use LNS on $V_1 \cup V_2 \cup V_3$ instead of only $V_1 \cup V_2$.
To be more precise, in our experiments, we used LNS with mEDS as the sub-tree search method.
5.3.3 Generating Combinable Columns and Exploiting Dual Values
We propose a second integration strategy that is applied to company B. In this case, the
convergence of the CGA approach towards an optimal solution is assisted by HTS, first by constructing
a set of initial columns that are combinable to complete partitioning solutions in a start-up phase,
and second by constructing columns with negative reduced costs during the main optimization
phase. These columns are guaranteed to be extensible to a feasible solution, since they are
extracted from one. A top level sketch of this method is shown in Algorithm 3. $C$ is a set of rosters,
$A$ is an assignment, and duals are the dual values corresponding to this assignment (obtained by
the CGA). The function HTSPOSTNRC transfers the dual values to HTS, which is forced to search
for columns with negative reduced costs.
Algorithm 3 Top level algorithm for the second method
1: $C \leftarrow$ HTSTREESEARCH($V$, DEFAULTSVAR, DIVERSESVAL)
2: repeat
3:   $(A, duals) \leftarrow$ CGAOPTIMIZE($C$)
4:   HTSPOSTNRC($duals$)
5:   $C \leftarrow$ HTSLNSTREESEARCH($V$, MAXDUALVAR, MAXDUALVAL, $A$)
6: until stopping condition
5.3.3.1 Start-up Heuristic
In the CGA, columns are generated for each crew member sequentially. By using dual
information, columns with negative reduced costs are generated. Thus, when the problem is
non-degenerate, they lead to a decrease in the continuous relaxation of the master problem.
Therefore, to find high quality rosters, “good” dual values are needed. Especially in the beginning, the
information contained in the dual values is very poor. This is because usually no feasible solution
is known at this point, and penalties stemming from dummy columns (that have to be introduced
in the master problem to guarantee the existence of a solution) have a great impact on the dual
values. We need to find a set of rosters that can be combined legally to form a set partitioning
solution to the CAP. However, the column generator of the CGA is hardly able to produce such
a solution, as it computes one roster at a time and is only indirectly aware of colliding pairings
in different rosters.
HTS can help here. In an integrated approach, it is used to generate a number of complete
feasible solutions in the beginning, thereby providing one column for each crew member with
every schedule found. Thus, a first set of columns that can be combined feasibly to a complete set
partitioning solution provides the CGA with the necessary “grip” to accelerate towards promising
parts of the search space with respect to the real objective without disturbing penalties.
Line 1 of Algorithm 3 realizes this idea. HTS searches for an initial number of solutions
without performing optimization. The number of solutions to be found is a parameter that has to
be tuned with respect to the time spent in this phase and the quality of the initial dual values.
Another parameter that has to be taken into account is the diversity of the columns that
are generated. It may be desirable to have many diverse rosters at hand that allow more and
more profitable combinations in the master problem. One rule of thumb used in practice is
that no crew-pairing assignment should appear more than a certain number of times in these
columns. The idea is realized in the slightly modified value selection heuristic DIVERSESVAL,
which is shown in Algorithm 4. It works exactly as the value selection heuristic that is normally
used, but it also records the assignments made and limits the number of times a crew member
can be assigned to a pairing. This heuristic, for example in combination with depth-bounded
discrepancy search [213], guarantees that columns will be adequately different from each other
to make the CGA method even more efficient.
Algorithm 4 Modified value selection heuristic for the second method
DIVERSESVAL($V$, $v$, $A$, $k$)
1: $val \leftarrow$ NIL
2: repeat
3:   $val \leftarrow$ DEFAULTSVAL($V$, $v$)
4:   if the assignment $v \leftarrow val$ appears more than $k$ times in $A$ then
5:     remove $val$ from $D_v$
6:   else
7:     return $val$
8: until $val =$ NIL or $D_v$ is empty
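A direct Python rendering of Algorithm 4 may make the control flow easier to follow. It is a sketch under assumptions: the record of earlier assignments is a Counter keyed by (variable, value) pairs that the caller maintains, the domain is a plain set that the function shrinks in place, and the default heuristic is a stand-in.

```python
from collections import Counter
from typing import Callable, Optional, Set, Tuple


def diverse_select_val(
    v: str,
    domain: Set[str],
    usage: "Counter[Tuple[str, str]]",
    k: int,
    default_val: Callable[[str, Set[str]], Optional[str]],
) -> Optional[str]:
    """Return the default value unless (v, value) was already used more than k times."""
    while domain:
        val = default_val(v, domain)
        if val is None:
            return None
        if usage[(v, val)] > k:
            # Overused assignment: filter the value and ask the heuristic again.
            domain.discard(val)
        else:
            return val
    return None  # the domain ran empty


# Hypothetical record: pairing p1 was already given to crew member c1 three times.
usage = Counter({("p1", "c1"): 3})
smallest = lambda v, dom: min(dom)  # stand-in for the default value heuristic

domain = {"c1", "c2"}
choice = diverse_select_val("p1", domain, usage, k=2, default_val=smallest)
exhausted = diverse_select_val("p1", {"c1"}, usage, k=2, default_val=smallest)
print(choice, domain, exhausted)  # c2 {'c2'} None
```

The caller is expected to increment usage[(v, val)] whenever a solution containing that assignment is recorded; the sketch leaves this bookkeeping out.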
Especially for large data sets, we find that many initial solutions are needed. To speed up
their computation, we try to shrink the search space: First, only one solution is computed. Then
the LNS search procedure is applied to obtain solutions that satisfy the diversity conditions in
locally bounded areas of the search space.
5.3.3.2 Main Optimization Loop
As shown in Line 3 of Algorithm 3, CGA performs an optimization run taking the columns
produced by HTS as input. It returns an assignment $A$ as well as the corresponding dual values
for the crew members and pairings. The solution returned is feasible with respect to all the
company’s rules and regulations. Then, starting from this point, HTS performs a locally limited
search for columns with negative reduced costs.
The constraint posted in Line 4 of the algorithm ensures that a certain number of the columns
corresponding to each solution found will have negative reduced costs. This number is defined
empirically. Finding a schedule that consists of columns with negative reduced costs only is
rather unlikely. On the other hand, producing only few such columns is a wasted effort. Our
experiments show that schedules that contain 30% columns with associated negative reduced
costs can be achieved for our test set. Of course, this does not imply that 70% of the columns
produced are useless. Instead, those columns guarantee that all newly generated columns can be
extended to a feasible solution. Thus, all columns that are produced are important with respect
to integer feasibility, whereas the columns with negative reduced costs reflect our search for
improving solutions with respect to a linear continuous objective.
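To make the notion concrete, the following Python sketch computes the reduced cost of each roster in a schedule and the share of rosters with negative reduced cost. It assumes the standard master problem form for a minimization problem, with one covering row per pairing and one convexity row per crew member, so that the reduced cost of a roster is its cost minus the duals of the pairings it contains minus the dual of its crew member. All numbers and names are made up.

```python
from typing import Dict, List, Tuple


def reduced_cost(
    cost: float,
    pairings: List[str],
    crew: str,
    pairing_duals: Dict[str, float],
    crew_duals: Dict[str, float],
) -> float:
    """Cost of the roster minus the dual values of the rows it covers."""
    return cost - sum(pairing_duals[p] for p in pairings) - crew_duals[crew]


# Hypothetical schedule: crew member -> (roster cost, pairings in the roster).
schedule: Dict[str, Tuple[float, List[str]]] = {
    "c1": (5.0, ["p1", "p2"]),
    "c2": (3.0, ["p3"]),
}
pairing_duals = {"p1": 4.0, "p2": 1.0, "p3": 2.5}
crew_duals = {"c1": 1.0, "c2": 0.0}

rc = {
    crew: reduced_cost(cost, pairings, crew, pairing_duals, crew_duals)
    for crew, (cost, pairings) in schedule.items()
}
share_negative = sum(1 for value in rc.values() if value < 0) / len(rc)
print(rc, share_negative)  # {'c1': -1.0, 'c2': 0.5} 0.5
```

Here only the roster of c1 can improve the continuous relaxation, while the roster of c2 is kept because it completes the schedule to a feasible solution.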
Line 5 of Algorithm 3 performs an LNS search with few deviations regarding the solution
provided by CGA. The pairing with the maximum dual is assigned to the crew member with the maximum
dual as long as this crew member’s reduced costs are not guaranteed to be negative already.
Again, our search method of choice is mEDS.
Fig. 5.6: Data set with 65 crew members and 959 pairings. [Plot of cost versus time in seconds; curves: LNS-HTS, HTS, Hybrid.]
5.4 Numerical Results
To demonstrate the superiority of combined approaches integrating CP and OR techniques, we
applied the hybrid algorithms as presented to real-world Crew Assignment Problems (see
Section 5.3.1). We applied each method integrating HTS and CGA to the airline case that motivated
its development. All algorithms were implemented in C++ on top of ILOG SOLVER [120] and
ILOG CPLEX [116]. The first integration strategy was applied to two monthly data sets from
company A. Experiments for this case were performed on a 640 MB, 296 MHz SUN UltraSparc-II,
with a time limit of 120000 seconds.³
Our algorithm is more efficient than the production system that company A used at
the time when this work was done. Figure 5.6 is a cost (i.e., dissatisfaction) versus time graph
showing the performance of the hybrid and the pure HTS methods applied to a monthly data set
containing 959 pairings and 65 crew members. The problem is stated as a minimization problem.
The curve marked “LNS-HTS” corresponds to a hasty strategy in which, after one solution is
obtained, LNS is used to achieve some good solutions quickly. The “HTS” curve shows a more
mature strategy, where the search finds several good solutions before LNS is applied to locally
optimize them. The curve marked “hybrid” shows the performance of the hybrid approach, which
clearly outperforms both. Interestingly, the pure CGA cannot detect any feasible solution at all.
Within 120000 seconds, it is not able to remove all dummy columns from the solution, i.e., the
original master problem without dummy columns still is infeasible.
In these specific experiments, for exhibition purposes only, we call the HTS strategy in Line 1
of Algorithm 1 in order to show that the hybrid has the best performance regardless of the
start-up phase. That is the reason why “LNS-HTS” outperforms “hybrid” in the beginning. Of course,
we repeat that a reasonable choice for the start-up phase of Algorithm 1 would be a strategy
more like “LNS-HTS”. This strategy is used in the experiments of Figure 5.7, which shows the
performance of the same methods on another monthly data set of company A containing 766
pairings and 50 crew members.
³ Curves stopping before this threshold indicate that no better solution was found from the moment corresponding
to the end of the curve until the time limit was reached.
Fig. 5.7: Data set with 50 crew members and 766 pairings. [Plot of cost versus time in seconds; curves: LNS-HTS, HTS, Hybrid.]
The following set of experiments is carried out in order to investigate the second way of
integration. Experiments for this case were performed on a 128 MB, 143 MHz SUN UltraSparc,
with a time limit of 20000 or 70000 seconds depending on the problem size. Figures 5.8 and 5.9
show the costs versus time plot for CGA, HTS and the second, so-called, consolidated approach
for data sets with 7 crew members and 129 pairings, and 30 crew members and 279 pairings,
respectively.
The plots depict the expected behavior of CGA and HTS. CGA steadily optimizes the ob-
jective, but the quality of the initial solution is poor. Moreover, the time needed to find a first
solution grows with the problem size. On the other hand, HTS finds relatively good solutions
quickly by using heuristic information, but soon gets stuck. The consolidated approach ben-
efits from both approaches: it finds good solutions quickly because of HTS and then steadily
continues to refine the solutions with the help of CGA.
It can also be seen that the integrated approach is slower than HTS early in the experiments.
During that time, the hybrid approach is using the HTS module to create an initial set of columns
according to the start-up heuristic. The reason why HTS is slower in the consolidated case is
that the goal is not to find better and better solutions, since the main optimization burden lies on
the CGA side. Instead, HTS rather tries to find diverse rosters, which help CGA to find better
solutions in the following.
Fig. 5.8: Data set with 7 crew members and 129 pairings. [Plot of cost versus time in seconds; curves: CGA, HTS, Hybrid.]
Fig. 5.9: Data set with 30 crew members and 279 pairings. [Plot of cost versus time in seconds; curves: CGA, HTS, Hybrid.]
The experiments regarding the second way of integration show that it is always useful to
assign the task of finding a set of initial solutions to the HTS approach. The best number of solu-
tions computed initially depends on the rule set as well as on the characteristics of the instance.
Assigning the main optimization burden to CGA is the default choice, as it views the problem
globally taking into account all variables and constraints at a time. If minor local adjustments can
lead to quality improvements, then having HTS perform LNS searches throughout the process is
cost-effective. Moreover, if the column generation process gets stuck, i.e., if a significant number
of columns with negative reduced costs proves not to be combinable to an IP solution, then having
HTS generate solutions incorporating columns with negative reduced costs is cost-effective, too.
The numerical results clearly show that each hybrid approach is successful on the airline case
on which it is applied in our experiments. The question that arises is whether the two hybrids
can generally be combined or not.
We believe that orthogonality generally holds: A meta-hybrid could start off by having the
HTS construct a set of solutions out of which diverse and feasibly combinable columns can be
extracted. Then the CGA approach can be used to improve a relaxed version of the problem,
which is repaired by the HTS approach.
We found that whether or not the use of one of the hybrid approaches we presented can speed
up the computation of a good solution is problem dependent:
• Of course, the first hybrid can only be applied profitably if the master problem is hard
  enough to justify the use of a relaxation that must be repaired at some point. Regarding
  airline case B, this precondition is not fulfilled, which is why we cannot apply hybrid 1 to
  this case.
• Using initial solutions provided by the HTS approach in order to speed up the starting
  phase of CGA only pays off when the CGA approach alone has difficulties in driving
  dummy columns out of the basis or if it spends too much time on this phase of the process.
  This is not the case in airline case A, so hybrid 2 cannot be used profitably
  here.
We conclude that generally the two hybrids can be combined, but the usefulness of a meta-
hybrid is problem dependent. Its tuning heavily relies on inherent problem properties, which
might not be known a priori.
5.5 Summary
For the CAP, we have shown how the concept of CP-based column generation that we presented
in Section 3.1 works in practice. The sub-problem of roster generation can be viewed as a
Constrained Shortest Path Problem. We applied the filtering algorithms that were developed
in Section 2.2 and gave a real-world empirical evaluation that showed the positive influence of
cost-based filtering, especially when using an efficient, incremental implementation.
Although the column generation approach works reasonably, we found that it suffers from
two major drawbacks: dummy costs and in-combinable rosters. Therefore, we presented a direct
CP approach and merged the two together. We showed how methods from CP and OR can help
each other to overcome their fundamental weak points.
While OR methods view a problem globally and show a good ability to detect promising
regions of the search space, CP methods can efficiently handle feasibility problems and are well
suited to resolve local conflicts. The first way of integration that we proposed uses the CP-
based column generation approach (CGA) to compute cost-efficient yet relaxed solutions to the
problem, and then resolves conflicts of overcovered pairings by applying a heuristic CP tree
search (HTS). The synergy effects are particularly visible if a lot of work has to be grouped
in relatively few partitions. Then column generation alone often fails to generate combinable
rosters, and the use of HTS as a repairing module helps a lot to increase the overall performance.
The second way of integration that we introduced concerns the use of dual values. We showed
how column generation approaches can profit from CP via the computation of diverse combinable
initial columns. On the other hand, the use of dual information in a CP-based heuristic tree
search has proven to be very efficient. It allows the optimization burden to be shifted onto the OR part
and away from CP, which can then focus on what it was designed for originally, namely to solve
constraint satisfaction problems.
We believe that the ideas discussed in this chapter can be generalized for other problems
as well, especially in connection with (CP-based) column generation. We presented results on
large-scale real-world CAP data, which show clearly visible improvements in performance of the
hybrid approaches compared to the solitary methods.
Chapter 6
Automatic Recording
In Chapter 3, we have seen that the invocation of an optimization constraint is likely to become
inefficient when it only represents a partial view on the entire problem. As a result, the
bounds used for domain filtering are no longer accurate, which renders the propagation
algorithm ineffective. We have shown how problem decomposition can help to overcome this
problem by linking optimization constraints via the objective rather than by the common
interplay via variable domains only.
Implicitly we assumed that many real-world problems can actually be decomposed naturally
into two or more basic substructures. In this chapter, we introduce the Automatic Recording
Problem (ARP) [139] that is an example for such a composed problem. The ARP can be viewed
as a combination of a Knapsack Problem (see Section 2.5) and a Maximum Weighted Stable Set
Problem (see Section 2.3) on an interval graph. For this example, we show the benefits of linking
a knapsack and a weighted stable set constraint via CP-based Lagrangian relaxation.
The work presented in this chapter was published in [188, 189, 190]. It is structured as
follows: In Section 6.1, we formally introduce the ARP. Then, in Section 6.2, the concept of CP-
based Lagrangian relaxation is applied to the problem. Finally, in Section 6.3, we give numerical
results by evaluating the practical performance of different combined filtering algorithms for the
ARP.
6.1 The Automatic Recording Problem
The technology of digital television offers new possibilities for individualized services that can-
not be provided by current analog broadcasts. Additional information like classification of con-
tent, or starting and ending times can be submitted within the digital broadcast stream. With
this information at hand, new services can be provided that make use of individual profiles and
maximize customer satisfaction.
Fig. 6.1: The automatic recording scenario. [Schematic: programs such as Football, News and Star Trek
shown per channel over time, for 20-200 digital TV channels; content metadata is encoded in the broadcast
stream and the user profile is known; the recording capacity is 40 h, where 10 h of MPEG-2 require 18 GB.]
One service which is available already today [9, 208] is an "intelligent" digital video recorder
that is aware of its user’s preferences and records automatically. The recorder tries to match a
given user profile with the information submitted by the different TV channels. E.g., a user
may be interested in thrillers, the more recent the better. The digital video recorder is supposed
to record programs such that the user’s satisfaction is maximized. As the number of channels
may be enormous (more than 100 digital channels are possible), a service that automatically
provides an individual selection is highly appreciated and subject of current research activities
(for example within projects like UP-TV [210] funded by the European Union or the TV-Anytime
Forum).
In this context, two restrictions have to be met. First, the storage capacity is limited (10 hours
of MPEG-2 video needs about 18 GB). Second, only one program can be recorded at a time (see
Figure 6.1).
More formally, we define the problem as follows:
Definition 6.1 Let $n \in \mathbb{N}$, $V = \{1, \dots, n\}$ the set of programs,
$\mathrm{start}_i < \mathrm{end}_i$ $\forall\, i \in V$ the corresponding starting and ending times,
$w = (w_i)_{1 \le i \le n} \in \mathbb{N}^n$ the storage requirements, $K \in \mathbb{N}$ the
storage capacity, and $p = (p_i)_{1 \le i \le n} \in \mathbb{N}^n$ the profit vector.
We say that the interval $I_i := [\mathrm{start}_i, \mathrm{end}_i]$ corresponds to program
$i \in V$, and call two programs $i \ne j \in V$ overlapping if their corresponding intervals
overlap, i.e. $I_i \cap I_j \ne \emptyset$. For $X \subseteq V$ we call
$p_X := \sum_{i \in X} p_i$ the user satisfaction (with respect to X).
The Automatic Recording Problem (ARP) then is to find a subset $X \subseteq V$ such that
(a) X can be stored within the given disc size, i.e. $\sum_{i \in X} w_i \le K$.
(b) At most one program is recorded at a time, i.e. $I_i \cap I_j = \emptyset$ $\forall\, i \ne j \in X$.
(c) X maximizes the user satisfaction, i.e. $p_X \ge p_Y$ $\forall\, Y \subseteq V$, Y respecting (a) and (b).
6.1.1 On the Complexity of the Automatic Recording Problem
Obviously, even if all programs are pairwise non-overlapping (i.e., if restriction (b) is obsolete),
it remains to solve a Knapsack Problem. Thus, the ARP is NP-hard. Let
$p_{\max} := \max\{p_i \mid 1 \le i \le n\}$. We develop a pseudo-polynomial algorithm running in
time $\Theta(n^2 p_{\max})$ that will be used later to derive a fully polynomial time approximation
scheme (FPTAS) for the ARP.
6.1.1.1 A Dynamic Programming Algorithm
The algorithm we develop in the following is similar to the textbook dynamic programming
algorithm for Knapsack Problems. Setting $\bar{\mathbb{N}} := \mathbb{N} \cup \{\infty\}$ and
$\psi := n\,p_{\max} + 1$, we compute a matrix $M = (m_{kl}) \in \bar{\mathbb{N}}^{\psi \times n}$,
$0 \le k < \psi$, $1 \le l \le n$. In $m_{kl}$, we store the minimum knapsack capacity that is needed
to achieve a profit greater than or equal to $k$ when using only pairwise non-overlapping items
with index at most $l$ ($m_{kl} = \infty$ if no such selection exists, in particular whenever
$\sum_{1 \le i \le l} p_i < k$).
We write $s_i := \mathrm{start}_i$ and $e_i := \mathrm{end}_i$ and assume that V is ordered with
respect to increasing ending times, i.e., $1 \le i < j \le n$ implies $e_i \le e_j$. Furthermore,
let $last_j \in V \cup \{0\}$ denote the last non-overlapping node lower than $j$, i.e.,
$e_{last_j} < s_j$ and $e_i \ge s_j$ $\forall\, last_j < i < j$.
We set $last_j := 0$ iff no such node exists, i.e., iff $e_1 \ge s_j$. To simplify the notation,
let us assume that $m_{k,0} = \infty$ for all $0 < k < \psi$, and $m_{k,l} = 0$ for all $k \le 0$,
$0 \le l \le n$. Then,
$m_{kl} = \min\{\, m_{k,l-1},\; m_{k - p_l,\, last_l} + w_l \,\}$.
The previous recursion equation yields a dynamic programming algorithm: First, we sort the
items with respect to their ending times and determine $last_i$ for all $1 \le i \le n$. Both can
be done in time $\Theta(n \log n)$. Then we build up the matrix row by row. Finally, we compute
$\max\{k \mid m_{kn} \le K\}$. The total running time of this procedure and the memory needed are
obviously in $\Theta(n^2 p_{\max})$.
6.1.1.2 A Fully Polynomial Time Approximation Scheme
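As for Knapsack Problems, we can use the dynamic programming algorithm to derive an FPTAS
by scaling the profit vector. Given $\varepsilon > 0$, we set $S := \varepsilon\, p_{\max} / n$,
and $\bar{p}_i := \lfloor p_i / S \rfloor$. Then, $\bar{p}_{\max} = \lfloor n / \varepsilon \rfloor$.
Thus, the running time of our dynamic programming algorithm applied with the scaled profit vector
is in $\Theta(n^3 / \varepsilon)$.
Now let us study the error that we make by using $\bar{p}$ instead of $p$. Let $x \in \{0,1\}^n$
denote an optimal solution with respect to $p$, and $\bar{x} \in \{0,1\}^n$ an optimal solution
with respect to $\bar{p}$. Then,
$p^T \bar{x} = S\,(p/S)^T \bar{x} \ge S\,\bar{p}^T \bar{x} \ge S\,\bar{p}^T x \ge S\,(p^T x / S - n) = p^T x - S n$.   (6.1)
Therefore, assuming without loss of generality that every single program fits on the disc (so
that $p^T x \ge p_{\max}$),
$(p^T x - p^T \bar{x}) / (p^T x) \le S n / p_{\max} = \varepsilon$,
i.e., the relative error is at most ε. Thus, we have found an FPTAS for the ARP.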
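The scaling step can be illustrated as follows. The sketch below is an assumption-laden illustration rather than the thesis implementation: the tuple format (start, end, w, p), the helper solve_by_profit_dp, and the function arp_fptas are hypothetical names, the table width uses the sum of the scaled profits instead of $n\,\bar{p}_{\max}$, and programs that do not fit on the disc on their own are discarded first so that $p^T x \ge p_{\max}$ holds.

```python
from bisect import bisect_left
from math import floor, inf

def solve_by_profit_dp(programs, profits, K):
    """Exact DP over the integer vector `profits`; returns chosen indices."""
    order = sorted(range(len(programs)), key=lambda i: programs[i][1])
    n = len(order)
    ends = [programs[i][1] for i in order]
    last = [0] * (n + 1)
    for l in range(1, n + 1):
        last[l] = min(bisect_left(ends, programs[order[l - 1]][0]), l - 1)
    psi = sum(profits) + 1
    m = [[0] + [inf] * (psi - 1)]
    for l in range(1, n + 1):
        i = order[l - 1]
        w, p = programs[i][2], profits[i]
        m.append([min(m[l - 1][k], m[last[l]][max(k - p, 0)] + w)
                  for k in range(psi)])
    k = max(k for k in range(psi) if m[n][k] <= K)
    chosen, l = [], n
    while l > 0 and k > 0:                 # walk back through the table
        i = order[l - 1]
        if m[l][k] == m[l - 1][k]:
            l -= 1                         # item l is not needed for profit k
        else:
            chosen.append(i)
            k, l = max(k - profits[i], 0), last[l]
    return chosen

def arp_fptas(programs, K, eps):
    """Return (profit, indices) with profit >= (1 - eps) * optimum."""
    fits = [i for i, t in enumerate(programs) if t[2] <= K]
    if not fits or max(programs[i][3] for i in fits) == 0:
        return 0, []
    sub = [programs[i] for i in fits]
    S = eps * max(t[3] for t in sub) / len(sub)    # scaling factor S
    scaled = [floor(t[3] / S) for t in sub]        # scaled profit vector
    chosen = [fits[j] for j in solve_by_profit_dp(sub, scaled, K)]
    return sum(programs[i][3] for i in chosen), sorted(chosen)
```

The returned profit is measured with the original vector $p$, which is exactly the quantity $p^T \bar{x}$ bounded in (6.1).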
6.1.2 A Mathematical Programming Formulation
Since the problem of finding and proving optimal solutions is of interest in its own right, and also
since the FPTAS we developed requires far too much memory to be applicable in practice, we
focus on exact approaches for solving the ARP. Using mathematical programming, the problem
can be stated as an integer linear program (IP):
(IP1)  Maximize    $p^T x$
       subject to  $x_i + x_j \le 1 \quad \forall\, 1 \le i < j \le n,\ I_i \cap I_j \ne \emptyset$
                   $\sum_{1 \le i \le n} w_i x_i \le K$
                   $x \in \{0,1\}^n$
The objective function maximizes the user satisfaction. Constraints of the form $x_i + x_j \le 1$
ensure that for overlapping intervals $I_i$, $I_j$, at most one of the programs $i$ and $j$ can be
selected. Memory restrictions are enforced by the knapsack row $\sum_{1 \le i \le n} w_i x_i \le K$.
The formulation can be tightened by replacing the non-overlapping constraints by maximal clique
constraints (see Section 2.3):
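Denote the set of maximal conflict cliques by $M := \{C_1, \dots, C_m\} \subseteq 2^V$. Then
restrictions of the form $\sum_{i \in C_p} x_i \le 1$ $\forall\, 1 \le p \le m$ imply that
$x_i + x_j \le 1$ for all nodes $i \ne j \in V$ whose corresponding intervals overlap. On the other
hand, if $x \in \{0,1\}^n$ and $x_i + x_j \le 1$ for all overlapping intervals, it is also true that
$\sum_{i \in C_p} x_i \le 1$ $\forall\, 1 \le p \le m$. Thus, IP1 is equivalent to
(IP2)  Maximize    $p^T x$
       subject to  $\sum_{i \in C_p} x_i \le 1 \quad \forall\, 1 \le p \le m$
                   $\sum_{1 \le i \le n} w_i x_i \le K$
                   $x \in \{0,1\}^n$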
Though clique problems are NP-hard on general graphs, finding the maximal cliques of the interval
graphs that naturally model the non-overlapping constraints on the programs is simple. It can be
performed in time $\Theta(n \log n)$ [99], and hence, IP2 can be obtained in polynomial time.
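One way to see why this is simple is a sweep over the sorted interval endpoints. The sketch below is our own illustration and not necessarily the procedure of [99]; the function name maximal_cliques and the (start, end) input format are assumptions. Sorting dominates the sweep itself, while writing out the cliques additionally costs time proportional to their total size.

```python
def maximal_cliques(intervals):
    """Return the maximal cliques of the interval graph of closed intervals.

    intervals: list of (start, end) pairs with start <= end.
    Each clique is a sorted list of interval indices.
    """
    events = []
    for i, (s, e) in enumerate(intervals):
        events.append((s, 0, i))   # start events sort before end events,
        events.append((e, 1, i))   # so [a, b] and [b, c] count as overlapping
    events.sort()

    active = set()      # intervals containing the current sweep position
    grown = False       # has an interval been opened since the last report?
    cliques = []
    for _, kind, i in events:
        if kind == 0:
            active.add(i)
            grown = True
        else:
            if grown:   # the active set cannot be extended any further
                cliques.append(sorted(active))
                grown = False
            active.remove(i)
    return cliques
```

Each reported set is the set of programs running at one common point in time, so one clique constraint $\sum_{i \in C_p} x_i \le 1$ replaces all pairwise constraints among its members.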
6.1.3 Solving the Resulting Integer Linear Program
Although methods exist that solve a (mixed) integer linear program without splitting the search
space (cutting plane algorithms, for example), branch-and-bound approaches have proven to
be efficient and widely applicable, and thus are most commonly used. At every choice point, a
bound based on some (often continuous) relaxation is computed. If that bound is worse than the
objective value $B$ of the incumbent solution, then backtracking occurs. A successful application
of the branch-and-bound paradigm relies heavily on tight bounds that can be computed quickly.
Problem reduction can help to improve the performance of a branch-and-bound search if the
filtering algorithm is both effective and efficient. Effective means that it must have an impact,
i.e., it has to be able to filter many values, whereas efficiency measures how quickly the routine
works.
The effectiveness of a filtering algorithm mainly depends on the quality of bounds it uses to
estimate the impact of fixing a variable to one of its values. For the ARP, our experiments show
that the continuous relaxation bound yields a good estimate on the solution quality that can be
reached. Thus, it can be used for pruning purposes in a branch-and-bound approach. However,
it is not straightforward to see how this bound could be used for filtering purposes effectively,
that is, other than by probing via full re-optimization, which is inefficient. On the other hand,
domain reductions with respect to reduced-cost information can be done quickly, but are not very
effective. In the following, we will show how CP-based Lagrangian relaxation can help here.
6.2 CP-based Lagrangian Relaxation for the ARP
Using the refined model IP2, the ARP can be viewed as a combination of two simpler optimiza-
tion constraints: a knapsack constraint, and a maximum weighted stable set constraint on an
interval graph. For the knapsack constraint, a filtering algorithm was developed in Section 2.5
that runs in time $\Theta(n \log n)$. Likewise, in Section 2.3, we developed a filtering algorithm
for the maximum weighted stable set substructure (WSSP) of the ARP. The algorithm runs in time
$\Theta(n \log n)$ or in amortized linear time for $\Omega(\log n)$ incremental propagation calls.
Provided with the two filtering algorithms, we are able to perform domain reduction for the
two natural substructures of the ARP. According to the abstract description in Section 3.2, we
will now tie the two filtering algorithms together:
As the filtering algorithm for the WSSP allows us to incorporate changing objectives at a
low computational cost, we decide to relax the capacity constraint. We introduce a non-negative
Lagrange multiplier $\lambda \ge 0$ and define the Lagrangian sub-problem
Maximize    $L(\lambda) = \sum_{1 \le i \le n} (p_i - \lambda w_i)\, x_i + \lambda K$
subject to  $\sum_{i \in C_p} x_i \le 1 \quad \forall\, 1 \le p \le m$
            $x \in \{0,1\}^n$
The Lagrange multiplier problem then is to minimize $L(\lambda)$, such that $\lambda \ge 0$. For
every $\lambda \ge 0$, $L(\lambda)$ is a valid upper bound on the objective. Therefore, we can apply
cost-based filtering for the weighted stable set constraint on interval graphs every time we solve
the Lagrangian sub-problem. For given Lagrangian multipliers λ, we use dual information
$\pi = \pi(\lambda) \in \mathbb{R}^m$ from the corresponding stable set sub-problem to perform
variable fixing with respect to the knapsack substructure next. Note that the algorithm developed
in Section 2.3 provides us with those values at essentially no additional cost. By Lagrange relaxing
the maximal clique constraints with multipliers $\pi \ge 0$, we obtain a Knapsack Problem. Let
$\mu_i := \sum_{j:\, i \in C_j} \pi_j$ $\forall\, 1 \le i \le n$ and
$\bar{\pi} := \sum_{1 \le j \le m} \pi_j$. Then the problem is to
Maximize    $\sum_{1 \le i \le n} (p_i - \mu_i)\, x_i + \bar{\pi}$
subject to  $\sum_{1 \le i \le n} w_i x_i \le K$
            $x \in \{0,1\}^n$
Relaxations of this problem again yield a valid upper bound, and we can propagate the knapsack
optimization constraint on the modified objective.
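To make the bound $L(\lambda)$ concrete, the following sketch evaluates it for a fixed multiplier. It is only an illustration under our own assumptions: instead of the shortest-path based filtering algorithm of Section 2.3, it uses the classic weighted interval scheduling recursion to solve the stable set sub-problem on the interval graph, and the tuple format (start, end, w, p) as well as the name lagrangian_bound are hypothetical.

```python
from bisect import bisect_left

def lagrangian_bound(programs, K, lam):
    """Evaluate L(lam): the best reduced profit of a set of pairwise
    non-overlapping programs, plus lam * K.

    programs: list of (start, end, w, p) tuples, closed intervals,
    start <= end. lam: Lagrange multiplier of the capacity constraint.
    """
    items = sorted(programs, key=lambda t: t[1])     # increasing ending times
    ends = [t[1] for t in items]
    # best[l]: maximum reduced profit of a stable set among the first l items.
    best = [0.0] * (len(items) + 1)
    for l, (s, _, w, p) in enumerate(items, start=1):
        prev = bisect_left(ends, s)                  # items ending before s
        best[l] = max(best[l - 1], best[prev] + (p - lam * w))
    return best[-1] + lam * K
```

Programs with non-positive reduced profit $p_i - \lambda w_i$ are never forced into the solution, because skipping an item is always allowed in the recursion. For any $\lambda \ge 0$ the returned value is an upper bound on the optimal ARP objective.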
6.2.1 Implementation Details
We have so far left out some implementation details concerning the choice of the branching
variable and the computation of optimal Lagrangian multipliers $\lambda^*$. In this section, we
describe the implementation that the tests are performed with.
We use four different approaches for our experiments: the first is a pure branch-and-bound
algorithm without any problem tightening (referred to as P-0). The second uses the filtering
algorithms for Knapsack and Maximum Weighted Stable Set Problems on the original objec-
tive (P-1). The third and the fourth approach (P-2 and P-3) realize the idea of linking filtering
algorithms for linear optimization constraints via Lagrangian relaxation. P-2 calls for domain
reduction with respect to both substructures just once after the Lagrangian dual has been solved,
whereas P-3 also propagates the maximum weighted stable set constraint during the search for
optimal Lagrange multipliers.
6.2.1.1 Continuous Bound Computation
For pruning, the computation of a linear bound on the objective is needed. P-2 and P-3 obviously
use the objective value corresponding to $L(\lambda^*)$ for this purpose. As the computation via
Lagrangian relaxation with stable set sub-problems turned out to be very efficient, we use that
algorithm for all four approaches.
6.2.1.2 Computation of $\lambda^*$
To determine $\lambda^*$, we use a golden-section method for maximizing one-dimensional concave
functions, applied to $-L$ (as a maximum of functions that are linear in λ, $L$ is convex). We
obtain a sequence of $\lambda_k$, $k \in \mathbb{N}$. Let
$e_{\max} := \max\{p_i / w_i \mid 1 \le i \le n\}$. Then, for all $\varepsilon > 0$, there exists a
constant $c > 0$ such that
$|\lambda_k - \lambda^*| \le \varepsilon \quad \forall\, k \ge c \log e_{\max}$.
Thus, after $O(\log e_{\max})$ iterations we can numerically approximate the optimal Lagrange
multiplier $\lambda^*$. Each iteration costs amortized linear time for a total of
$\Omega(\log n)$ iterations in all search nodes. Finally, in every choice point we add
$O(n \log n)$ for the succeeding knapsack filtering algorithm. Therefore, the integrated filtering
algorithm for the tight global Lagrangian relaxation bound runs in time
$O(n \log e_{\max} + n \log n)$.
Notice that the constraint matrix of the Lagrangian sub-problem is totally unimodular, i.e., the
sub-problem exhibits the integrality property. Thus, the Lagrangian relaxation bound has the same
value as the bound that is determined by a linear continuous relaxation.
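A golden-section search over the multiplier can be sketched as follows. The code is a generic textbook variant and not the thesis implementation; the toy function L below is written out by hand for a hypothetical four-program instance by listing the (profit, weight) pairs of all its stable sets. The search interval $[0, e_{\max}]$ is justified by the observation that for $\lambda \ge e_{\max}$ all reduced profits are non-positive, so $L(\lambda) = \lambda K$ increases from there on.

```python
from math import sqrt

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a convex function f on [lo, hi] by golden-section search."""
    r = (sqrt(5) - 1) / 2                 # inverse golden ratio, about 0.618
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc <= fd:                      # a minimizer lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:                             # a minimizer lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

# Toy data (hypothetical): (profit, weight) of every stable set, capacity K.
K = 8
STABLE_SETS = [(0, 0), (10, 5), (7, 4), (6, 3), (9, 6),
               (16, 8), (19, 11), (16, 10)]

def L(lam):
    """Lagrangian bound of the toy instance for multiplier lam."""
    return max(p - lam * w for p, w in STABLE_SETS) + lam * K

# Entries 1..4 are the single programs; their efficiencies give e_max.
e_max = max(p / w for p, w in STABLE_SETS[1:5])
lam_star = golden_section_min(L, 0.0, e_max)
```

Every iteration shrinks the interval by the constant factor 0.618 and needs only one new evaluation of $L$, so the number of sub-problem solves grows logarithmically with the ratio of interval length to tolerance.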
6.2.1.3 Branching Variable Selection
Using the shortest path interpretation of the weighted stable set constraint on interval graphs (see
Section 2.3.3), all algorithms choose the first node on the shortest path¹ with maximal efficiency
$p_i / w_i$ as branching variable.
6.3 Numerical Results
All experiments are performed on a PC with an AMD-Athlon 600 MHz processor and 256 MB
RAM running Linux 2.2. The implementation was done in C++ and compiled by gcc 2.95 with
maximal optimization (-O3). The algorithms are built on top of ILOG SOLVER 5.0 [121].
6.3.1 Test Instance Generation
The experiments are conducted on several sets of randomly generated test instances. To achieve
scenarios which we believe to be of relevance for the real-life application, each set of instances is
generated by specifying the time horizon (half a day to 3 days) and the number of channels (20 to
100). The generator sequentially fills the channels by starting each new program one minute after
the last. For each new program a class is chosen randomly. That class then determines
the interval from which the length is chosen randomly. We consider either 3, 5, or 7 different
classes. The lengths of programs in the classes vary from 5 ± 2 minutes to 150 ± 50 minutes. The
disc space necessary to store each program equals its length, and the storage capacity is randomly
chosen as 45%–55% of the entire time horizon.
To achieve a complete instance, it remains to choose the associated profits of programs. For
the experiments, we use four different strategies for the computation of an objective function:
• For the class usefulness (CU) instances, the associated profit values are determined with
  respect to the chosen class, where the associated profit values of a class can vary between
  zero and 600 ± 200.
• In the time correlated (TC) instances, each 15 minute time interval is assigned a random
  value between 0 and 10. Then the profit of a program is determined as the sum of the values of
  all intervals that program has a non-empty intersection with.
• For the weakly correlated (TWC) instances, that value is perturbed by a noise of ±20%.
• Finally, in the subset sum (SSS) data, the profit of a program simply equals its length.
¹ According to the optimal reduced-cost objective
$\sum_{1 \le i \le n} (p_i - \lambda^* w_i)\, x_i + \lambda^* K$.
The different objectives try to emulate some effects that we believe to hold for real-life instances.
In the CU instances, for example, programs of the same class are similarly attractive. On the
other hand, the TC and TWC instances cause many conflicts regarding the choice of programs
that are broadcast at the same time. The assumption that programs overlapping in time
are similarly attractive is justified by the fact that TV channels plan their broadcasts
according to the behavior of target groups. To a large extent, these target groups are determined
by and vary with the time of the day. However, the different strategies that we consider are only
intuitively justified. The feasibility of our approach for real-life instances can only be concluded
from the fact that we achieve similar results for all choices of the objective.
We identify a test set by giving the parameters the generator is started with. According to the
previous description those parameters are: The time horizon in minutes, the number of channels,
the number of different classes [3, 5, or 7], and the objective type [CU, TC, TWC, or SSS].
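A generator along the lines of this description could look as follows. The sketch fills in several details that the description leaves open, and all of these are assumptions of ours: class length intervals are interpolated linearly between 5 ± 2 and 150 ± 50 minutes, each class receives one fixed CU profit drawn uniformly from [0, 800], all random choices are uniform, and the function name generate_instance and the tuple format (start, end, w, p) are hypothetical. It is therefore not expected to reproduce the instance sizes reported in the tables.

```python
import random

def generate_instance(horizon, channels, classes, objective, seed=0):
    """Return (programs, K); programs are (start, end, w, p) tuples in minutes.

    horizon: time horizon in minutes; objective: "CU", "TC", "TWC" or "SSS".
    """
    rng = random.Random(seed)

    def spread(a, b, c):          # assumption: linear interpolation over classes
        return a if classes == 1 else a + (b - a) * c / (classes - 1)

    # Class c draws its length from mid - dev .. mid + dev (5 ± 2 up to 150 ± 50).
    length_range = [(round(spread(5, 150, c) - spread(2, 50, c)),
                     round(spread(5, 150, c) + spread(2, 50, c)))
                    for c in range(classes)]
    class_profit = [rng.randint(0, 800) for _ in range(classes)]   # assumption (CU)
    slot_value = [rng.randint(0, 10) for _ in range(horizon // 15 + 1)]

    programs = []
    for _ in range(channels):
        t = 0
        while True:
            c = rng.randrange(classes)
            length = rng.randint(*length_range[c])
            start, end = t, t + length
            if end > horizon:
                break
            # Sum of the values of all 15-minute slots the program intersects.
            tc = sum(slot_value[start // 15: end // 15 + 1])
            if objective == "CU":
                p = class_profit[c]
            elif objective == "TC":
                p = tc
            elif objective == "TWC":
                p = round(tc * rng.uniform(0.8, 1.2))   # noise of ±20%
            elif objective == "SSS":
                p = length
            else:
                raise ValueError(objective)
            programs.append((start, end, length, p))
            t = end + 1               # next program starts one minute later
    K = round(rng.uniform(0.45, 0.55) * horizon)
    return programs, K
```

A test set in the sense of the text then corresponds to 50 calls with different seeds and fixed values for the time horizon, the number of channels, the number of classes, and the objective type.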
6.3.2 Experimental Evaluation
In the following, we present our numerical results. The experiments consist of 50 random
instances per test set. For each instance, the approaches P-0 to P-3 are run to find and prove an
optimal solution. We give running times and the number of choice points needed for an exhaus-
tive search. All approaches find a first solution rather early in the search. Therefore, the main
work consists in the proof of optimality rather than in the construction of the solution. We con-
clude that the branching variable selection that we use efficiently supports finding near-optimal
solutions in a non-exhaustive search.
Table 6.1 shows the performance (time and choice points) of all four approaches on test sets
generated with a time horizon of 12 hours and 20 channels using 5 different program classes and
CU, TC, and TWC to determine the objective function.
When comparing the different types of objectives, we find that, for all four approaches, the
TC instances are much harder than CU and TWC, which are comparably easy to solve. This is
a general observation we made for all kinds of different test sets using 3 or 7 classes as well as
different time horizons and numbers of channels.
test set        P-0                 P-1                P-2               P-3
5 12h 20 ch     time    nodes       time    nodes      time    nodes     time    nodes
CU              9.5     519.4       5.2     295.5      7.0     198.5     8.6     184.0
TC              441.8   40155.7     67.2    4525.8     18.0    696.5     28.4    575.6
TWC             15.5    1136.1      12.0    802.6      6.2     339.9     9.5     321.2
Tab. 6.1: The table compares the different approaches on three different test sets with 5 classes, 12 hours,
20 channels and different objectives. The time (in seconds) and the number of choice points are averages
for 50 randomly generated instances for each objective. The average number of programs per instance is
between 607.6 and 612.6.
We further observe that a higher degree of integration between the two optimization constraints
yields, in part, drastic reductions in the number of choice points, of up to a factor of
about 70 on the difficult time correlated instances. Regarding the computation time, there is a
trade-off between the reduction of choice points and the time spent per choice point (TpCP). The
TC and TWC instances show that P-2 can outperform P-3 because of the shorter TpCP that is
needed for that degree of integration. When comparing P-1 and P-3 on the CU instances, the
reduction of choice points is not big enough to justify the longer TpCP needed, and P-1 is the
approach that takes the least computation time.
Generally, a bigger reduction of choice points is more likely to pay off when the absolute
TpCP needed is rather high. Particularly, this holds for applications where additional constraints
are propagated on top of the objective constraint itself. For the ARP, the optimization constraint is
the only active constraint. Therefore, to justify the worse TpCP caused by the more complicated
propagation algorithm, a substantial reduction of the number of choice points must be achieved.
The P-2 and P-3 approaches obtain a sufficient reduction of choice points on the more difficult
TC test sets, and also for larger test instances:
Table 6.2 shows the performance of all approaches on test instances that are generated using
5 different program classes with different time horizons and numbers of channels. The objective
is computed according to the chosen classes, i.e., according to CU. As expected, for the larger
instances with a time horizon of 72 hours (3 days) and 20 channels, the two linking approaches
P-2 and P-3 outperform P-1 roughly by a factor of 4 regarding the number of choice points
and almost a factor of 2 with respect to the computation time needed. The minima, maxima
and standard deviations indicate that the average numbers we present are not biased by very few
outliers, but represent meaningful values for the evaluation of the algorithms' performance.
In Table 6.3, we compare the different approaches on a variety of very different test in-
stances that are generated using different parameters and objective functions. Again, relevant
and partially substantial reductions in the number of choice points can be obtained by CP-based
Lagrangian relaxation realized in P-2 and P-3.
test set           P-0                    P-1                    P-2                    P-3
5 CU               time      nodes        time      nodes        time      nodes        time      nodes
12h 20ch   avg     2.4       238.3        1.3       129.9        1.4       108.6        2.1       89.9
           min     0.1       5.0          0.1       5.0          0.1       5.0          0.1       5.0
           max     25.4      2216.0       15.1      1531.0       12.9      1269.0       24.2      1045.0
           std     4.2       396.8        2.3       243.6        2.5       206.5        4.0       173.5
12h 50ch   avg     16.5      741.9        8.2       370.1        9.9       272.0        14.2      250.5
           min     0.1       3.0          0.2       3.0          0.2       3.0          0.2       3.0
           max     167.3     7058.0       82.6      3615.0       154.1     2664.0       156.5     2664.0
           std     30.8      1377.6       14.5      658.9        22.9      506.9        26.8      465.0
24h 20ch   avg     9.5       519.4        5.2       295.5        7.0       198.5        8.6       184.0
           min     0.5       21.0         0.4       13.0         0.3       10.0         0.5       10.0
           max     87.5      4416.0       59.6      3762.0       93.4      2094.0       98.1      2067.0
           std     15.2      829.9        9.3       587.6        17.2      377.8        17.7      374.9
24h 50ch   avg     1104.9    24301.4      585.2     14219.3      883.3     8371.9       921.5     8286.8
           min     0.8       12.0         1.0       12.0         0.7       9.0          1.1       9.0
           max     31045.5   675235.0     15625.2   368440.0     33281.3   292753.0     31573.5   292753.0
           std     4448.4    97121.0      2272.6    54288.6      4662.0    41139.4      4441.2    41121.2
72h 20ch   avg     2627.7    40901.5      1786.7    27662.0      920.4     6674.7       990.9     6514.7
           min     2.0       29.0         2.4       29.0         3.0       29.0         5.5       29.0
           max     32751.9   460350.0     30520.3   412421.0     11766.0   90397.0      13724.7   89589.0
           std     5514.7    85325.8      4543.1    65188.4      1996.8    14515.9      2189.7    14379.4
Tab. 6.2: The table shows a comparison of the performance of the different approaches on 5 test sets with
5 classes and objective CU for various time horizons (in hours) and channel numbers (ch). The avg rows
give the average time (in seconds) and the average number of nodes of 50 randomly generated instances
in each test set. The rows below give the minimum (min), maximum (max), and standard deviation
(std) for these 50 instances. The average number of programs per instance is 315.2 for (12h/20ch), 793.5
(12h/50ch), 607.6 (24h/20ch), 1512.1 (24h/50ch), and 1782.6 (72h/20ch), respectively.
test set            P-0                    P-1                   P-2                P-3
                    time      nodes        time     nodes        time    nodes      time    nodes
3 CU   120h 20ch    5210.3    60839.2      1734.4   30676.4      455.9   3433.9     490.1   2945.1
5 TWC  72h 20ch     11600.1   293386.8     1526.8   35718.4      261.0   3683.6     411.7   3134.5
7 TC   24h 50ch     8349.0    250367.1     4066.3   105572.6     403.4   6235.4     533.0   4219.1
Tab. 6.3: The table illustrates the performance of the different approaches on very different benchmark
classes. Each test set contains 50 randomly generated problem instances. There is an average of 1956.7
programs in the 120h/20ch test set, 1782.6 programs in test set 72h/20ch, and 1423.3 programs in test set
24h/50ch.
test set     p         P-0             P-1             P-2             P-3
5 SSS                  time   nodes    time   nodes    time   nodes    time   nodes
12h 20ch     316.4     0.2    23.1     0.2    15.2     0.2    15.2     0.3    15.2
12h 50ch     792.4     0.5    18.8     0.5    13.9     0.5    13.9     0.8    13.9
24h 20ch     611.2     0.5    26.9     0.6    21.9     0.6    21.9     1.0    21.9
24h 50ch     1527.3    1.6    29.1     1.8    23.2     1.7    23.2     3.0    23.2
72h 20ch     1778.3    3.5    53.4     4.6    51.7     5.0    51.7     8.7    51.7
72h 50ch     4464.3    11.0   54.4     14.4   52.8     15.4   52.8     27.2   52.8
Tab. 6.4: The table shows the performance of the different approaches on subset sum data sets ranging
from 12 hours and 20 channels up to 72 hours and 50 channels. The average number of programs in the
50 randomly generated instances per test set is given as parameter p.
So far, we have left out comparisons regarding the choice of the objective according to SSS.
Table 6.4 shows the results obtained for a collection of very different test sets generated with
SSS. Two facts stand out: first, a comparison with Table 6.1 shows that the SSS instances are
much easier to solve than instances with other choices of the objective. Second, P-1 achieves only
a slight reduction of choice points compared to P-0 that cannot be improved by P-2 and P-3 at all.
The effect is not surprising: We considered the somewhat artificial SSS test sets because
of their obvious relation to subset sum benchmarks for Knapsack Problems. Due to the equal
efficiency $p_i / w_i$ of all programs, the knapsack optimization constraint has great difficulty
including or excluding programs. Therefore, the knapsack constraint is not effective, and the burden
of domain reduction lies on the WSSP optimization constraint only. In total, using the optimization
constraint for pruning purposes only is most time efficient here.
test set      avg. no         P-0                P-1                P-2               P-3
3 CU          of programs     time    nodes      time    nodes      time    nodes     time    nodes
12h 100ch     1048.3          18.8    680.8      9.1     326.3      5.1     108.6     7.6     95.5
24h 50ch      1013.4          37.8    1461.1     19.5    734.4      11.1    187.9     13.8    178.5
72h 20ch      1175.0          177.4   3003.1     111.7   1897.2     36.6    468.4     42.9    401.2
Tab. 6.5: The table compares the performance of the different algorithms on benchmark sets with 3 classes
and objective CU, each containing 50 randomly generated problem instances with roughly 1000 programs
on average.
Finally, we investigate the impact of the number of channels. Table 6.5 shows a comparison
of three different test sets that are generated using 3 different program classes and CU
objectives. All instances have a similar size and contain roughly 1000 programs. We observe that
the instances become more difficult to solve for all approaches as the number of channels
decreases. This surprising result may be caused by the fact that a smaller number of channels
increases the relative importance of the knapsack optimization constraint that is inherently more
difficult than the WSSP constraint. However, a sound answer to that question can only be given
by further investigation. It remains to note that we observe a converse behavior for TC data sets:
the instances become more difficult the more channels are involved which is obviously caused
by many temporally conflicting programs of similar value.
6.4 Summary and Future Work
We have shown that the Automatic Recording Problem can be viewed as a composition of a
Knapsack Problem and a Weighted Stable Set Problem on an interval graph. By introducing an
FPTAS, we showed that the problem, though being NP-hard, can be approximated with arbitrary
approximation quality. Then, using an exact tree search approach, we showed the benefits of
linking filtering algorithms via Lagrangian relaxation. The numerical results we obtained show
a significant improvement achieved by CP-based Lagrangian relaxation with respect to the com-
putation time and the number of choice points.
There are several natural extensions for the ARP. For example, a digital video recorder could
have more than one recording unit which allows the recording of a limited number of channels
simultaneously. In an IP context, this modification can be introduced easily. For the exact ap-
proach presented, a fast and efficient filtering algorithm for this type of relaxed non-overlapping
constraint is subject to further research.
Chapter 7
Capacitated Network Design
In the last chapter, we presented the Automatic Recording Problem (ARP) as an example of a
discrete optimization problem that can be naturally decomposed into two basic substructures. Now,
we want to develop a branch & bound approach for the Capacitated Network Design Problem,
which can also be viewed as a conglomerate of two simple constraint families defined by the
mass-balance and the bundle constraints. We investigate both induced Lagrangian sub-problems and
develop problem reduction algorithms for them. As we shall see, applying the strongest version
of CP-based Lagrangian relaxation is inefficient for this problem. Therefore, we generalize the
idea of cost-based domain filtering, which may be viewed as adding unary local constraints.
An algorithm is presented that adds local cardinality cuts based on Lagrangian relaxation. Their
usefulness is evaluated empirically by numerous tests.
The work presented in this chapter was published in [136, 193]. It is structured as follows: In
Section 7.1, we introduce the Capacitated Network Design Problem (CNDP). To solve the prob-
lem, we use bounds, variable fixing algorithms and local cardinality cuts based on Lagrangian
relaxation as described in Section 7.2. The entire branch & bound-approach is described in
Section 7.3. Finally, in Section 7.4, we give numerical results.
7.1 The Capacitated Network Design Problem
The Capacitated Network Design Problem was first defined in [91] and is relevant for a wide
area of applications, ranging from telecommunications to transportation problems [97, 144]. The
problem consists in finding an optimal subset of arcs in a network $G = (V, E)$ such that we can
transport a given demand $d^k \in \mathbb{R}^V$ of goods $1 \le k \le K$ (so-called commodities) at
optimal total cost. The latter consists of two components: the flow costs and the design costs. The
flow cost is the sum of costs for the routing of each commodity, whereby, for each arc $(i,j)$ and
commodity $k$, a scalar $c^k_{ij} \ge 0$ determines the cost of routing one unit of commodity $k$
via $(i,j)$. The design costs are determined by the costs of installing the chosen arcs, whereby,
for each arc $(i,j)$, we are given a fixed arc-installation cost $f_{ij}$. Additionally, for each
arc, there is a capacity $u_{ij}$ given that limits the total amount of flow that can be routed via
$(i,j)$.
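For all arcs $(i,j) \in E$ and commodities $1 \le l \le K$, let
$b^l_{ij} := \min\{d^l, u_{ij}\}$, where $d^l$ is read here as the demand quantity of commodity $l$.
Using variables $x^l \in \mathbb{R}^E$ for the flows and $y \in \{0,1\}^E$ for the design decisions,
a mixed integer program formulation for the Capacitated Network Design Problem can be stated as
follows:
Minimize    $L_{CNDP} = \sum_l (c^l)^T x^l + f^T y$
subject to  $N x^l = d^l \quad \forall\, 1 \le l \le K$                                  (1)
            $\sum_l x^l_{ij} \le u_{ij}\, y_{ij} \quad \forall\, (i,j) \in E$                  (2)
            $x^l_{ij} \le b^l_{ij}\, y_{ij} \quad \forall\, (i,j) \in E,\ 1 \le l \le K$       (3)
            $x \ge 0$                                                                   (4)
            $y \in \{0,1\}^E$                                                           (5)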
For ease of notation, we refer to the previous program as $L_{CNDP}$, which is also used to denote
the optimal objective value. The network flow constraints (also called mass-balance constraints) (1)
are defined by the node-arc-incidence matrix $N = (n_{ia})_{i \in V,\, a \in E}$ and a demand vector
$d^k \in \mathbb{R}^V$ for all commodities $k$, whereby $n_{ia} = 1$ iff $a = (h,i)$,
$n_{ia} = -1$ iff $a = (i,h)$, and $n_{ia} = 0$ otherwise, and $d^k_i > 0$ iff node $i \in V$ is a
demand node and $d^k_i < 0$ iff node $i$ is a supply node for commodity $k$. Without loss of
generality, we may assume that there is exactly one demand node and one supply node for each
commodity [112].
The total flow on an arc $(i,j)$ is constrained by the capacity $u_{ij}$ (so-called capacity or
bundle constraints (2)). The set of upper bound constraints (3) is redundant to the problem
formulation. It is introduced to strengthen the linear continuous relaxation of the mixed integer
problem.
7.1.1 State of the Art
The CNDP is an NP-hard problem, which is easy to see by reduction from the Steiner Tree Problem.
Since the latter is MAXSNP-hard [20], we cannot even hope for a fully polynomial time
approximation scheme (FPTAS) for the CNDP. Here, we focus on exact and heuristic solution approaches.
Regarding exact solution approaches, Crainic, Frangioni, and Gendron develop lower bounding
procedures for the CNDP [44]. The main insights are the following: Tight approximations
of the so-called strong LP-relaxation (see $L_{CNDP}$ including the redundant constraints (3)) can
be found much faster by Lagrangian relaxation than by optimizing the LP using standard LP-solvers.
The authors investigate so-called shortest path and knapsack relaxations (see Section 7.2). When
solving the Lagrangian dual, bundle methods converge faster than ordinary subgradient methods
and are more robust. Motivated by this successful work, we evaluate several Lagrangian
relaxations in the context of branch & bound.
In [112], Holmberg and Yuan present a method to compute exact or heuristic solutions for the
CNDP. They use the Lagrangian knapsack relaxation in each node of the branch & bound tree
to efficiently compute lower bounds. Special penalty tests were developed which correspond to
variable fixing strategies presented in the following. An evaluation of the following components
is given: subgradient search procedure for solving the Lagrangian dual, primal heuristic for
finding feasible solutions, and interplay between branch & bound and the subgradient search.
On top of that work, a heuristic is developed that is embedded in the tree search procedure.
That heuristic is able to provide near-optimal solutions for large CNDP instances which are out
of reach of exact methods like Lagrangian relaxation based branch & bound or branch-and-cut
approaches (represented by the CPLEX implementation, for example).
In [22], Bienstock et al. describe two cutting-plane algorithms for a variant of the CNDP
with multi-arcs (i.e., an arc can be inserted multiple times). One of them is based on the multi-
commodity formulation of CNDP and uses cutset and three-partition inequalities. The other one
adds total capacity, partition, and rounded metric inequalities. In a branch-and-cut framework,
both variants provide sound results on a benchmark of realistic data. A substantial improve-
ment of this procedure is achieved by Bienstock in [23]: a branch-and-cut algorithm based on
ε-approximations of linear programs performs better on the same benchmark data.
Further branch-and-cut algorithms for the Multi-Arc CNDP are investigated in research papers
by Günlük, and Atamtürk et al. [7, 102]. Several valid inequalities are considered, and it is
found that branch-and-cut is substantially superior to branch & bound code on the investigated
benchmark data. After adding the cuts, the integrality gap at the root node is reduced significantly
and the total number of evaluated nodes diminishes. These results emphasize the importance of
tight lower bounds for the CNDP.
Sridhar and Park report on an implementation of a Benders-and-cut algorithm for the
CNDP [204]. The algorithm consists of three parts: a cutting plane algorithm for the compu-
tation of tight lower bounds, a heuristic to generate feasible solutions, and the Benders-and-cut
algorithm itself. The computational results provided in the paper are based on a wide range of
instances with varying traffic demand. The complexity of the problem instances depends heavily
on the capacity provided in the network. CNDP instances with low transportation demand are
easy to solve, whereas for problem instances with higher demand the authors suggest using the
Benders-and-cut algorithm. In cases when the traffic becomes extremely dense, the integrality
gaps increase. To strengthen the LP-relaxation, flow inequalities are very effective.
A branch-and-price algorithm for the path-based formulation of the CNDP is compared with
the traditional branch & bound for the link-based formulation by Clarke and Gong in [40]. The
path-based formulation is computationally more efficient. Adding SOS-branching on each origin
and destination node increases this advantage. The efficient solution of the pricing problem
(which is a simple Shortest Path Problem) enables a faster solving of the LP-relaxations in the
branch & bound tree. Computational results on instances with 6, 10, and 15 nodes are reported.
7.2 Lagrangian Relaxation Bounds
The CNDP can be viewed as a mixture of a continuous and a discrete optimization problem.
The latter is obviously constituted by the design variables, whereas the first is a Min-Cost Mul-
ticommodity Flow Problem (MMCF) that evolves when the design variables are fixed. For the
MMCF, besides linear programming solvers, especially cost-decomposition approaches based
on Lagrangian relaxation have been applied successfully [85]. The bounds we will use for the
CNDP will be based on those cost-decomposition approaches for the MMCF.
Regarding the MMCF and also for the CNDP, we are left with two promising choices of
which hard constraints should be softened:
the bundle constraints (“shortest path relaxation”), or
the mass-balance constraints (“knapsack relaxation”).
7.2.1 Shortest Path Relaxation
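Assume, in $L_{CNDP}$, for each arc $(i,j) \in E$ we introduce a Lagrangian multiplier $\lambda_{ij} \ge 0$ and
transfer the corresponding bundle constraint into the objective function. At the same time, we
also relax the upper bound constraints $x^l_{ij} \le b^l_{ij}$ using penalty costs $\nu^l_{ij} \ge 0$. Let
$\circ : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ denote the component-wise product of two vectors, i.e. $z = x \circ y$ iff
$z_s = x_s y_s$ for all $x, y \in \mathbb{R}^n$ and $1 \le s \le n$. Then we get the following linear program:

$$
\begin{aligned}
\text{Minimize} \quad & L_{SP}(\lambda,\nu) = \sum_{l} (c^l + \lambda + \nu^l)^T x^l + \Bigl(f - \lambda \circ u - \sum_{l} \nu^l \circ b^l\Bigr)^T y \\
\text{subject to} \quad & N x^l = d^l \quad \forall\, 1 \le l \le K \\
& x \ge 0, \quad y \in \{0,1\}^{|E|}
\end{aligned}
$$

The theory of Lagrangian relaxation shows that, for every choice of the Lagrangian multipliers
$(\lambda,\nu) \ge 0$, $L_{SP}(\lambda,\nu)$ is a lower bound on the CNDP. Notice that this value can be obtained easily,
because there is no cross-talking between variables $y$ and $x$ and among the variables $x^l$ anymore.
Thus, we can compute $L_{SP}(\lambda,\nu)$ by solving $K$ problems of the form

$$
\begin{aligned}
\text{Minimize} \quad & L^l_{SP}(\lambda,\nu) = (c^l + \lambda + \nu^l)^T x^l \\
\text{subject to} \quad & N x^l = d^l \\
& x^l \ge 0
\end{aligned}
$$

and by setting $y_{ij} = 1$ iff $f_{ij} - \lambda_{ij} u_{ij} - \sum_{l} \nu^l_{ij} b^l_{ij} < 0$, and $y_{ij} = 0$ otherwise. That is, we can solve
the Lagrangian sub-problem mainly by computing $K$ Single Source Shortest Path Problems with
positive arc weights. Therefore, the shortest path sub-problem can be solved in time $O(K(m + n \log n))$.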
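As a rough illustration of the computation just described, the following Python sketch evaluates the shortest path bound for given multipliers. It assumes, beyond what the text states, that every commodity has a single origin-destination pair with a scalar demand and that arcs are stored as index pairs; the function names and the data layout are hypothetical.

```python
import heapq


def dijkstra(n, adj, source):
    """Shortest path distances from source; adj[i] is a list of (j, weight) pairs."""
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # outdated heap entry
        for j, w in adj[i]:
            if d + w < dist[j]:
                dist[j] = d + w
                heapq.heappush(heap, (d + w, j))
    return dist


def shortest_path_bound(n, arcs, c, f, u, b, lam, nu, commodities):
    """Evaluate the shortest path relaxation for fixed multipliers.

    arcs: list of (i, j); c[l][a], b[l][a], nu[l][a]: data per commodity l and arc a;
    f[a], u[a], lam[a]: data per arc; commodities: list of (origin, destination, demand)
    (single origin-destination pairs are an assumption of this sketch).
    Returns the bound, the design vector y, and the distance labels per commodity.
    """
    bound = 0.0
    dists = []
    for l, (origin, dest, demand) in enumerate(commodities):
        adj = [[] for _ in range(n)]
        for a, (i, j) in enumerate(arcs):
            adj[i].append((j, c[l][a] + lam[a] + nu[l][a]))  # modified routing cost
        dist = dijkstra(n, adj, origin)
        dists.append(dist)
        bound += demand * dist[dest]
    y = []
    for a in range(len(arcs)):
        f_hat = f[a] - lam[a] * u[a] - sum(nu[l][a] * b[l][a] for l in range(len(commodities)))
        y.append(1 if f_hat < 0 else 0)  # open the arc iff its modified fixed cost is negative
        bound += min(0.0, f_hat)
    return bound, y, dists
```

The distance labels are returned as well because they are exactly the values that the linking method of Section 7.2.4 reuses as multipliers.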
7.2.2 Knapsack Relaxation
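The other promising alternative is to relax the mass-balance constraints in $L_{CNDP}$. For the
constraints to be relaxed, we introduce Lagrangian multipliers $\mu^l_i$ for all $1 \le l \le K$ and $i \in V$. We get
the following linear program:

$$
\begin{aligned}
\text{Minimize} \quad & L_{KP}(\mu) = \sum_{l,(i,j)} (c^l_{ij} + \mu^l_i - \mu^l_j)\, x^l_{ij} + f^T y + \mu^T d \\
\text{subject to} \quad & \sum_{l} x^l_{ij} \le u_{ij} y_{ij} \quad \forall\, (i,j) \in E \\
& x^l_{ij} \le b^l_{ij} y_{ij} \quad \forall\, (i,j) \in E \\
& x \ge 0, \quad y \in \{0,1\}^{|E|}
\end{aligned}
$$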
Whereas the shortest path relaxation decomposes the Lagrangian sub-problem by the different
commodities, here we achieve an arc-wise decomposition. To solve the previous LP, for each
$(i,j) \in E$ we consider the following linear program, that is similar to the linear continuous
relaxation of a Knapsack Problem:

$$
\begin{aligned}
\text{Minimize} \quad & L^{(i,j)}_{KP}(\mu) = \sum_{l} \bar c^l_{ij} x^l_{ij} \\
\text{subject to} \quad & \sum_{l} x^l_{ij} \le u_{ij} \\
& x^l_{ij} \le b^l_{ij} \quad \forall\, 1 \le l \le K \\
& x \ge 0
\end{aligned}
$$
where $\bar c^l_{ij} = c^l_{ij} + \mu^l_i - \mu^l_j$. For each $(i,j) \in E$, we set $x^l_{ij} = \bar x^l_{ij}$ for all $1 \le l \le K$, and $y_{ij} = 1$, iff
$f_{ij} + L^{(i,j)}_{KP}(\mu) < 0$, where $\bar x_{ij}$ denotes an optimal solution of $L^{(i,j)}_{KP}(\mu)$. Otherwise, we set $x^l_{ij} = 0$ for all
$1 \le l \le K$, and $y_{ij} = 0$. Obviously, this setting provides us with an optimal solution for $L_{KP}(\mu)$. Thus,
the main effort is to solve the problems $L^{(i,j)}_{KP}(\mu)$. However, this is an easy task (compare with [147, 149]):
first, we can eliminate all variables with non-negative cost coefficients, i.e., we set $x^l_{ij} = 0$ for all
$1 \le l \le K$ with $\bar c^l_{ij} \ge 0$. Next, we sort the $x^l_{ij}$ according to increasing cost coefficients $\bar c^l_{ij}$, that is, from
now on we may assume that $\bar c^l_{ij} \le \bar c^{l+1}_{ij} < 0$ for all $1 \le l < s \le K$, whereby $s$ is the number of negative
objective coefficients. Let $k$ denote the critical item with

$$ k = \min \Bigl( \Bigl\{ l \le s \;\Big|\; \sum_{h \le l} b^h_{ij} > u_{ij} \Bigr\} \cup \{ s + 1 \} \Bigr). $$

We obtain $L^{(i,j)}_{KP}(\mu)$ by setting $x^h_{ij} = b^h_{ij}$ for all $h < k$, $x^h_{ij} = 0$ for all $h > \min\{k, s\}$, and, in case of
$k < s + 1$, $x^k_{ij} = u_{ij} - \sum_{h < k} b^h_{ij}$. Thus, the knapsack sub-problem can be solved in time $O(|E| K \log K)$.
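The greedy procedure above can be sketched in a few lines of Python. This is only an illustration under the stated model: `c_bar`, `b`, and `u` stand for the reduced costs, the commodity bounds, and the capacity of a single arc, and the function names are hypothetical.

```python
def solve_arc_knapsack(c_bar, b, u):
    """Minimize sum(c_bar[l] * x[l]) s.t. sum(x) <= u and 0 <= x[l] <= b[l].

    Returns the optimal value and a solution vector x.
    """
    x = [0.0] * len(c_bar)
    # Only items with negative reduced cost can improve the objective;
    # process them from the most negative coefficient upwards.
    order = sorted((l for l in range(len(c_bar)) if c_bar[l] < 0), key=lambda l: c_bar[l])
    remaining = u
    value = 0.0
    for l in order:
        if remaining <= 0:
            break
        x[l] = min(b[l], remaining)  # the critical item is filled only partially
        remaining -= x[l]
        value += c_bar[l] * x[l]
    return value, x


def decide_arc(f, c_bar, b, u):
    """Open the arc iff fixed cost plus knapsack value is negative.

    Returns (y, x, contribution of the arc to the Lagrangian bound).
    """
    value, x = solve_arc_knapsack(c_bar, b, u)
    if f + value < 0:
        return 1, x, f + value
    return 0, [0.0] * len(c_bar), 0.0
```

Sorting dominates the effort per arc, which matches the $O(|E| K \log K)$ bound stated above.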
Note that both relaxations exhibit the integrality property. Thus, the bound we achieve in
both settings equals the linear continuous relaxation bound of the CNDP [44].
7.2.3 Subgradient Optimization
For both relaxation types, the Lagrangian dual consists in maximizing the lower bound. That is,
for the shortest path relaxation we have to maximize $L(\lambda,\nu)$ subject to $(\lambda,\nu) \ge 0$. For the knapsack
relaxation, our task is simply to maximize $L(\mu)$. A popular way of solving these concave, piecewise
linear maximization problems over a convex region is to apply a subgradient search (see [1]
for a general introduction). Several proposals regarding the specification of the general method
have been made in the CNDP literature. Let $s^t$ denote a subgradient in iteration $t$. We compute
the new search direction by setting $d^t = (s^t + \alpha d^{t-1}) / (1 + \alpha)$ [112]. We also experimented with
other approaches to solve the Lagrangian dual, such as the modified Camerini-Fratta-Maffioli
rule [28] or the volume algorithm [11]. Without going into details here, we just note that none of
these modifications yielded visible improvements in the overall performance.
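A single iteration of such a search could look as follows. The direction formula is the one given above; the step length `step` is treated as an input because the text does not fix a step-length rule here, and the projection onto non-negative values is only needed for multipliers that are sign-restricted. The function name is hypothetical.

```python
def subgradient_step(multipliers, subgrad, prev_dir, alpha, step, nonneg=True):
    """One update of the Lagrangian multipliers in a maximization setting.

    direction = (subgrad + alpha * prev_dir) / (1 + alpha), as in the text;
    `step` is an externally chosen step length (an assumption of this sketch).
    """
    direction = [(s + alpha * d) / (1.0 + alpha) for s, d in zip(subgrad, prev_dir)]
    updated = [m + step * d for m, d in zip(multipliers, direction)]
    if nonneg:
        # Project back onto the feasible region for sign-restricted multipliers.
        updated = [max(0.0, v) for v in updated]
    return updated, direction
```

For the free multipliers of the knapsack relaxation, the same routine would be called with `nonneg=False`.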
7.2.4 Variable Fixing
A big advantage of Lagrangian relaxation based bound computations is that they can be used for
variable fixing in a very efficient way. In the presence of an optimal or at least high quality upper
bound B
for the CNDP, it is an easy task to check whether a variable yij can still be set
to either of its bounds without worsening the lower bound too much. More formally, given the
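Lagrangian multipliers $\lambda$ and $\nu$ in the current shortest path sub-problem, a value $l \in \{0,1\}$ and
any arc $(i,j) \in E$, we can set

$$ y_{ij} = l \quad \text{if} \quad (2l - 1) \Bigl( f_{ij} - \lambda_{ij} u_{ij} - \sum_{h} \nu^h_{ij} b^h_{ij} \Bigr) < B - L_{SP}(\lambda,\nu). \tag{7.1} $$

Analogously, given the Lagrangian multipliers $\mu$ in the current knapsack sub-problem, a value
$l \in \{0,1\}$ and any arc $(i,j) \in E$, we can set

$$ y_{ij} = l \quad \text{if} \quad (2l - 1) \Bigl( f_{ij} + L^{(i,j)}_{KP}(\mu) \Bigr) < B - L_{KP}(\mu). \tag{7.2} $$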
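Both implications have the same shape once the modified fixed cost of an arc is known, so a filtering routine can be sketched generically. The sketch below assumes that `f_hat` is the bracketed term of (7.1) or (7.2) and that `lower_bound` is the value of the current Lagrangian sub-problem; the function name is hypothetical.

```python
def remaining_values(f_hat, lower_bound, upper_bound):
    """Return the values l in {0, 1} that the design variable may still take.

    A value l survives iff (2l - 1) * f_hat < upper_bound - lower_bound,
    i.e. forcing y = l does not push the bound up to or beyond the incumbent.
    """
    slack = upper_bound - lower_bound
    return {l for l in (0, 1) if (2 * l - 1) * f_hat < slack}
```

If only one value remains, the variable is fixed to it; if the set is empty, the current choice point cannot contain an improving solution.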
Now, we have two variable fixing algorithms at hand for two different Lagrangian sub-problems.
Therefore, we are able to apply the concept of CP-based Lagrangian relaxation (see Section 3.2).
First, we can choose one of the two alternatives (for example the one for which the Lagrangian
dual can be solved more quickly) and apply the corresponding variable fixing algorithm in every
Lagrangian sub-problem. If we find that this procedure does not filter enough values, we can
do even more: with the help of dual values gained in the solution process of the Lagrangian
sub-problem, in every Lagrange iteration we can apply both variable fixing algorithms.
Let us assume we decide to use the shortest path relaxation. Given the current Lagrangian
multipliers $\lambda$ and $\nu$, in every iteration we need to solve $K$ Shortest Path Problems. Using the
well-known Dijkstra algorithm for that purpose, we do not only get the optimal objective of each
problem, we also get the shortest path distances $\mu^l_i$ of each node $i$ for free. It has long been
known that those distances are optimal dual values in the corresponding LP. The idea of the linking
method consists in using these duals as Lagrangian multipliers for the knapsack sub-problem
next. That is, we consider $L_{KP}(\mu)$. That way, we can apply the variable fixing algorithms for
both sub-problems. Note that, for optimal Lagrangian multipliers $\lambda$ and $\nu$, the shortest path
distances $\mu$ in combination with the multipliers are also optimal for the dual of $L_{CNDP}$.
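To make the linking step concrete, the following sketch turns the distance labels of one commodity into the reduced costs used by the knapsack sub-problem of Section 7.2.2. It assumes that all nodes are reachable (finite labels) and uses a hypothetical function name and data layout.

```python
def linked_reduced_costs(arcs, cost, dist):
    """Reduced costs c_ij + mu_i - mu_j for one commodity.

    arcs: list of (i, j); cost[a]: original routing cost of arc a;
    dist[i]: shortest path distance label of node i, used as multiplier mu_i.
    """
    return [cost[a] + dist[i] - dist[j] for a, (i, j) in enumerate(arcs)]
```

Because the labels were computed with respect to the modified costs of the shortest path sub-problem, some reduced costs with respect to the original costs can be negative, and exactly those arcs are the ones that matter in the knapsack sub-problems.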
When using the knapsack relaxation, the situation is slightly more complicated, because we
need to provide dual values for the bundle as well as the upper bound constraints. Given the
current Lagrangian multipliers $\mu$, we solve $|E|$ knapsack sub-problems as described in 7.2.2.
Again, when given any arc $(i,j) \in E$, we assume that $s$ denotes the number of negative cost
coefficients in $L^{(i,j)}_{KP}(\mu)$, that the remaining variables $x^l_{ij}$ are ordered with respect to increasing
cost coefficients, and that $k \le s + 1$ is the critical item.
In case of $k < s + 1$, we set $\lambda_{ij} = \bar c^k_{ij}$, $\nu^l_{ij} = \bar c^l_{ij} - \bar c^k_{ij}$ for all $l < k$ and $\nu^l_{ij} = 0$ for all $l \ge k$. For
$k = s + 1$, we set $\lambda_{ij} = 0$, $\nu^l_{ij} = \bar c^l_{ij}$ for all $l < k$ and $\nu^l_{ij} = 0$ for all $l \ge k$.
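Theorem 7.1 The vectors $\lambda$ and $\nu$ define optimal dual values for $L_{KP}(\mu)$.
Proof: It is sufficient to show that the value $\lambda_{ij}$ and the vector $\nu_{ij} = (\nu^1_{ij}, \dots, \nu^K_{ij})$ give an optimal
dual solution for $L^{(i,j)}_{KP}(\mu)$ for all $(i,j) \in E$. The dual of $L^{(i,j)}_{KP}(\mu)$ is the following linear program:

$$
\begin{aligned}
\text{Maximize} \quad & D^{(i,j)}_{KP}(\mu) = u_{ij} \lambda_{ij} + \sum_{l} b^l_{ij} \nu^l_{ij} \\
\text{subject to} \quad & \lambda_{ij} + \nu^l_{ij} \le \bar c^l_{ij} \quad \forall\, 1 \le l \le K \\
& \lambda, \nu \le 0
\end{aligned}
$$

First, we show that our solution is dual feasible:
$k < s + 1$: It holds that $\lambda_{ij} = \bar c^k_{ij} < 0$, because $k \le s$. Also, if $l < k$, then $\nu^l_{ij} = \bar c^l_{ij} - \bar c^k_{ij} \le 0$,
because $\bar c^l_{ij} \le \bar c^k_{ij}$. If $l \ge k$, then $\nu^l_{ij} = 0$. Thus, all values are non-positive.
Furthermore, if $l < k$, then it holds that $\lambda_{ij} + \nu^l_{ij} = \bar c^k_{ij} + \bar c^l_{ij} - \bar c^k_{ij} = \bar c^l_{ij}$. For $l \ge k$, it holds
that $\lambda_{ij} + \nu^l_{ij} = \bar c^k_{ij} \le \bar c^l_{ij}$.
$k = s + 1$: It holds that $\lambda_{ij} = 0$ and $\nu^l_{ij} = \bar c^l_{ij} < 0$ for all $l < k$, because this implies $l \le s$.
Furthermore, $\nu^l_{ij} = 0$ for all $l \ge k$. Thus, all values are non-positive.
For $k = s + 1$, we have that $\lambda_{ij} + \nu^l_{ij} \le \bar c^l_{ij}$ for all $1 \le l \le K$.
To prove the optimality of the dual solution (and, at the same time, also for the primal solution
we gave in Section 7.2.2), we show that the two objective values for $x_{ij}$ in $L^{(i,j)}_{KP}(\mu)$ and $\lambda_{ij}, \nu_{ij}$ in
$D^{(i,j)}_{KP}(\mu)$ match each other:

$$ k < s + 1: \quad \sum_{l} \bar c^l_{ij} x^l_{ij} = \sum_{l < k} \bar c^l_{ij} b^l_{ij} + \bar c^k_{ij} \Bigl( u_{ij} - \sum_{l < k} b^l_{ij} \Bigr) = \sum_{l < k} b^l_{ij} (\bar c^l_{ij} - \bar c^k_{ij}) + u_{ij} \bar c^k_{ij} = \sum_{l} b^l_{ij} \nu^l_{ij} + u_{ij} \lambda_{ij}. $$

$$ k = s + 1: \quad \sum_{l} \bar c^l_{ij} x^l_{ij} = \sum_{l \le s} \bar c^l_{ij} b^l_{ij} = \sum_{l} b^l_{ij} \nu^l_{ij} + u_{ij} \lambda_{ij}. $$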
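The construction of the dual values and the equality of objective values in the proof can be checked numerically on a single arc. The sketch below follows the case distinction stated above; the function names and the list-based data layout are hypothetical.

```python
def arc_duals(c_bar, b, u):
    """Dual values (lam, nu) of one arc-wise knapsack sub-problem.

    Follows the rule of the text: with critical item k among the negative
    coefficients, lam = c_bar[k] and nu[l] = c_bar[l] - c_bar[k] for items
    before k; without a critical item (k = s + 1), lam = 0 and nu[l] = c_bar[l].
    """
    order = sorted((l for l in range(len(c_bar)) if c_bar[l] < 0), key=lambda l: c_bar[l])
    nu = [0.0] * len(c_bar)
    filled = 0.0
    for pos, l in enumerate(order):
        filled += b[l]
        if filled > u:  # item l is the critical item
            lam = c_bar[l]
            for h in order[:pos]:
                nu[h] = c_bar[h] - lam
            return lam, nu
    for l in order:  # capacity suffices for all negative items
        nu[l] = c_bar[l]
    return 0.0, nu


def dual_objective(lam, nu, b, u):
    """Objective value u * lam + sum_l b[l] * nu[l] of the dual program."""
    return u * lam + sum(b_l * nu_l for b_l, nu_l in zip(b, nu))
```

For the data `c_bar = [-3, 2, -1, -2]`, `b = [4, 5, 3, 2]`, `u = 5`, the greedy primal solution sets the first item to 4 and the last item to 1, so the primal value is -14, and the dual objective computed from `arc_duals` has the same value.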
To summarize: if we choose the shortest path relaxation, in every Lagrangian sub-problem
we solve $K$ shortest path sub-problems and achieve a lower bound on the CNDP. If that bound
is worse than the best known or even optimal objective value $B$, we can prune the current choice
point. Otherwise, we fix variables according to Implication 7.1. Then we set up $L_{KP}(\mu)$, where $\mu$
are the shortest path distances, and fix variables according to Implication 7.2.
If, instead, we choose to use the knapsack relaxation, in every Lagrangian sub-problem we
solve $|E|$ linear continuous Knapsack Problems and achieve a lower bound for the CNDP. Again,
if that bound is worse than $B$, we can prune the current choice point. Otherwise, we fix variables
according to Implication 7.2. Then we set up $L_{SP}(\lambda,\nu)$, where $\lambda$ and $\nu$ are the dual values of
$L_{KP}(\mu)$ in Theorem 7.1, and fix variables according to Implication 7.1.
Of course, many variants of the algorithm sketched above can be thought of. For example, we
may find that applying both variable fixing algorithms in every Lagrangian sub-problem is too
time consuming and does not pay off. Then we can introduce frequency parameters that control
the percentage of Lagrangian sub-problems for which either of the two variable fixing algorithms
is applied. As an extreme choice, we may for example decide to apply variable fixing for the
optimal Lagrangian multipliers only. As our experiments show, it is favorable to use the knapsack
relaxation to solve the Lagrangian dual quickly. As one would expect, solving $K$ Shortest Path
Problems in every Lagrangian sub-problem in addition to the $|E|$ knapsack sub-problems is rather
costly and slows down the solution process considerably. The following theorem helps to cope
with this situation more efficiently.
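Theorem 7.2 Given Lagrangian multipliers $\mu$ in the knapsack relaxation, denote some optimal
dual values for $L_{KP}(\mu)$ by $\lambda \ge 0$ and $\nu \ge 0$. Then,

$$ L_{SP}(\lambda,\nu) \ge L_{KP}(\mu). \tag{7.3} $$

Proof: Let $\kappa \le 0$ denote optimal dual values for the upper bound constraints of $y$ in $L_{KP}(\mu)$. The
vectors $\mu$ and $\kappa$ are dual feasible for $L_{SP}(\lambda,\nu)$, because $\lambda$, $\nu$ and $\kappa$ are primal optimal for $L_{KP}(\mu)$,
and thus

$$ \mu^k_j - \mu^k_i \le c^k_{ij} + \lambda_{ij} + \nu^k_{ij} \quad \forall\, (i,j) \in E,\; 1 \le k \le K, \quad \text{and} \tag{7.4} $$

$$ \kappa_{ij} \le f_{ij} - u_{ij} \lambda_{ij} - \sum_{k} b^k_{ij} \nu^k_{ij} \quad \forall\, (i,j) \in E. \tag{7.5} $$

Any dual feasible solution for a minimization problem determines a valid lower bound on that
problem. Denote the dual of an optimization problem $L$ by $D$. Then,

$$ L_{SP}(\lambda,\nu) = D_{SP}(\lambda,\nu) \ge \mu^T d + 1^T \kappa = D_{KP}(\mu) = L_{KP}(\mu). \tag{7.6} $$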
We can use this theorem to relax Implication 7.1, which improves the running time of our
variable fixing algorithm, but makes it also less effective. Unfortunately, as we shall see in
Section 7.4, even in its strong version the shortest path variable fixing algorithm is already too
ineffective, and therefore this idea cannot be used to improve the running time of a CNDP solver.
Another important choice is to decide when the effects of the variable fixing algorithms
should be made visible to the subgradient algorithm. Obviously, when we fix variables during the
process of finding optimal Lagrangian multipliers, we are actually changing the problem. Thus,
with respect to the convergence of the iterative procedure, we have to reset. The most defensive
strategy is to start from scratch (with the current Lagrangian multipliers as starting point) and
to set the step-length coefficient to its initial setting. However, we may consider allowing only
a little more flexibility by scaling the factor by a fixed percentage. A totally different approach
would be not to incorporate the changes of the bounds of the variables during the optimization
process, but to restart the whole procedure (if any variable fixing has taken place) after optimal
Lagrangian multipliers have been found. We evaluate different parameter settings in the numeric
section.
7.2.5 Lagrangian Cardinality Cuts
Since CP-based Lagrangian relaxation is partly inefficient for the CNDP, we generalize the idea
of domain filtering. Variable fixing may be viewed as an addition of unary constraints that force
a variable to take a specific value. Domain filtering that achieves a state of bound consistency
may be viewed as a procedure that adds unary interval constraints. Even general domain filtering
can be viewed as an addition of unary constraints that enforce a variable not to take some specific
values. In general, all these additional constraints are only valid locally. Now, we do not want to
restrict ourselves to unary constraints anymore. Instead, we consider the generation of additional
local constraints with respect to cost considerations.
We will see that, in the presence of a near-optimal solution to the CNDP with associated
objective value B, in each Lagrangian sub-problem we can infer restrictions on the number of
arcs that need to be installed in any improving solution. Before we can state the idea more
formally, to ease the notation, we introduce some identifiers. For the shortest path relaxation, we
define

$$ \hat f_{ij} = f_{ij} - \lambda_{ij} u_{ij} - \sum_{l} \nu^l_{ij} b^l_{ij} \quad \forall\, (i,j) \in E, \tag{7.7} $$

$$ \hat B = B - \sum_{l} L^l_{SP}(\lambda,\nu) \tag{7.8} $$

for Lagrangian multipliers $(\lambda,\nu) \ge 0$. Likewise, for the knapsack relaxation, we set

$$ \hat f_{ij} = f_{ij} + L^{(i,j)}_{KP}(\mu) \quad \forall\, (i,j) \in E, \tag{7.9} $$

$$ \hat B = B - \mu^T d \tag{7.10} $$

for Lagrangian multipliers $\mu$. Furthermore, denote the current Lagrangian sub-problem $L_{SP}(\lambda,\nu)$
or $L_{KP}(\mu)$ by $L_R$.
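Theorem 7.3 Denote an ordering of the arcs in $E$ by $e_1, \dots, e_{|E|}$ such that $i \le j$ implies $\hat f_{e_i} \le \hat f_{e_j}$.
It holds: if $L_R < B$, then
a) there exist values

$$ F = \min \Bigl\{ 0 \le u \le |E| \;\Big|\; \sum_{h \le u} \hat f_{e_h} < \hat B \Bigr\} \quad \text{and} \tag{7.11} $$

$$ U = \max \Bigl\{ 0 \le u \le |E| \;\Big|\; \sum_{h \le u} \hat f_{e_h} < \hat B \Bigr\}. \tag{7.12} $$

b) In any improving solution $(x, y)$, it holds that

$$ F \le \sum_{(i,j) \in E} y_{ij} \le U. \tag{7.13} $$

Proof:
a) It is sufficient to show that there exists a value $0 \le l \le |E|$ such that $\sum_{h \le l} \hat f_{e_h} < \hat B$. Let
$l \in \arg\min_{0 \le u \le |E|} \sum_{h \le u} \hat f_{e_h}$. It holds that:

$$ \sum_{h \le l} \hat f_{e_h} = L_{SP}(\lambda,\nu) - \sum_{h \le K} L^h_{SP}(\lambda,\nu) < B - \sum_{h \le K} L^h_{SP}(\lambda,\nu) = \hat B. \tag{7.14} $$

Or, respectively,

$$ \sum_{h \le l} \hat f_{e_h} = L_{KP}(\mu) - \mu^T d < B - \mu^T d = \hat B. \tag{7.15} $$

b) Let $L^{>U} := L \wedge \bigl( \sum_{(i,j) \in E} y_{ij} > U \bigr)$ denote the optimization problem that evolves by adding
the constraint $\sum_{(i,j) \in E} y_{ij} > U$ to a problem $L$. By the definition of $U$, we know that
$L^{>U}_R \ge B$. We remind the reader that we identify the name of an optimization problem
with its optimal value. Now, assume that there exists a feasible solution $(x, y)$ for $L_{CNDP}$
with $|E| \ge \sum_{(i,j) \in E} y_{ij} > U$. Then it holds that

$$ c^T x + f^T y \ge L^{>U}_{CNDP} \ge L^{>U}_R \ge B. \tag{7.16} $$

The case $\sum_{(i,j) \in E} y_{ij} < F$ follows analogously.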
Theorem 7.3 allows us to add cardinality cuts on the number of arcs to be installed without
losing improving solutions. We will evaluate the effect of local cardinality cuts on the solution
process in Section 7.4.
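Computing the two cardinality bounds amounts to sorting the modified fixed costs and scanning their prefix sums. The following sketch assumes that `f_hat` holds the values $\hat f_{ij}$ of all arcs and `b_hat` the value $\hat B$ as defined in (7.7) to (7.10); the function name is hypothetical.

```python
def cardinality_bounds(f_hat, b_hat):
    """Return (F, U), the smallest and largest number u of arcs such that the
    sum of the u smallest modified fixed costs stays strictly below b_hat.

    Returns None if no such u exists, i.e. the relaxation value already
    reaches the incumbent and the current node can be pruned.
    """
    costs = sorted(f_hat)
    feasible = []
    prefix = 0.0
    if prefix < b_hat:
        feasible.append(0)  # opening no arc at all is still improving
    for count, cost in enumerate(costs, start=1):
        prefix += cost
        if prefix < b_hat:
            feasible.append(count)
    if not feasible:
        return None
    return feasible[0], feasible[-1]
```

Since the costs are sorted, the prefix sums first decrease and then increase, so the admissible counts form one contiguous interval between `F` and `U`.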
7.3 A Branch & Bound Algorithm
After having described the bound computation and possible reduction strategies based on La-
grangian relaxation, now we sketch the decisions taken in the tree search.
Dominance Cut-Off Rule Apart from the lower bound exceeding the upper bound, the search
in the current node can be pruned if the min-cost routing of all commodities only uses arcs
that have already been decided to be installed. Thus, in every choice point we use a column
generation approach to solve the Min-Cost Multicommodity Flow Problem on the subset of arcs
with associated $y_{ij}$ that have a current upper bound of 1. If that routing only uses arcs $(i,j)$
whose variable $y_{ij}$ has lower bound 1, we can prune the search and backtrack.
Branching Variable Selection The previous discussion also induces a rule for the selection of
the branching variable: it is clearly favorable to choose a variable for branching that is being
used by the current optimal min-cost multicommodity flow. Of course, there may be more than
just one such variable. Then we can choose the one with minimal or maximal reduced costs $\hat f_{ij}$
in the Lagrangian sub-problem with the best associated multipliers. The different choices will be
evaluated in Section 7.4.
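The selection rule can be written down compactly. The sketch assumes that `candidates` contains the still undecided arcs used by the current min-cost multicommodity flow and that `f_hat` maps each arc to its reduced cost; the rule names BR0 and BR1 are those used in Section 7.4.2, while the function name is hypothetical.

```python
def select_branching_arc(candidates, f_hat, rule="BR0"):
    """Pick the branching arc among the candidates.

    BR0 selects the arc with minimal absolute reduced cost,
    BR1 the arc with maximal absolute reduced cost.
    """
    if rule not in ("BR0", "BR1"):
        raise ValueError("rule must be 'BR0' or 'BR1'")
    chooser = min if rule == "BR0" else max
    return chooser(candidates, key=lambda arc: abs(f_hat[arc]))
```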
Tree Traversal A simple depth first search procedure is used to choose the next search node.
This allows us to find feasible solutions quickly and eases the reuse of Lagrangian multipliers.
Primal Heuristic To find reasonably good and near-optimal solutions quickly, in every search
node we apply a Lagrangian heuristic that was suggested by Holmberg and Yuan. It works by
computing multicommodity flow solutions on a subset of the arcs and de-assigning all arcs that
carry no flow. For further details, we refer the reader to [112].
Variable Fixing Heuristic Since the Capacitated Network Design Problem is very hard to
solve exactly, we may decide to search for relatively good solutions quickly. The exact approach
can be transformed into a heuristic for the problem by fixing variables more optimistically.
Holmberg and Yuan [112] developed the so-called α-heuristic for this purpose: While
solving the Lagrangian dual, we record how often a variable is set to zero or one. If one of the
values is dominant with respect to a given parameter, the variable is simply set to this value.
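One possible reading of this rule is sketched below. The exact dominance criterion is defined in [112] and not repeated here, so the threshold test (a value is dominant if its share of the recorded iterations is at least `alpha`) is an assumption of this sketch, as are the function name and the data layout.

```python
def alpha_fixing(history, alpha):
    """Fix design variables whose value was dominant during the dual search.

    history: dict mapping an arc to the list of 0/1 values its design variable
    took in the Lagrangian sub-problems; alpha: dominance threshold, assumed
    to lie in (0.5, 1]. Returns a dict of arcs to their fixed values.
    """
    fixed = {}
    for arc, values in history.items():
        if not values:
            continue
        share_one = sum(values) / len(values)
        if share_one >= alpha:
            fixed[arc] = 1
        elif 1.0 - share_one >= alpha:
            fixed[arc] = 0
    return fixed
```

Arcs without a dominant value stay undecided and are left to branching.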
7.4 Numerical Results
We report on our computational experience with the previously developed algorithms. The sec-
tion is structured as follows: first, we introduce the benchmark data used in the experiments.
Then we define the possible parameter settings that activate and deactivate different algorith-
mic components. Finally, we compare the variants when solving the CNDP from scratch, in the
optimality proof, and when using the approach as a heuristic.
All tests were carried out on systems with AMD Athlon, 600 MHz processors, and 256 MB
main memory. The code was compiled with the GNU g++ 2.95 compiler using optimization
level O3.
7.4.1 Benchmark Data
Surprisingly, in spite of the theoretical interest the CNDP has drawn and the large number of
research groups that have dealt with the problem, apparently there has been no benchmark set
established on which researchers can compare algorithms that solve the CNDP exactly. Much
work has been done with respect to the computation of lower bounds and the heuristic solution of
the problem. Benchmarks used for this purpose (to be found in [44, 112], for example) are still
too large to allow the computation of optimal solutions. For variations of the problem (such as
the Multi-Arc CNDP, Network Loading, etc.), benchmark data exists, but it is not straightforward
to see how it could be converted into meaningful instances for the pure CNDP as we consider it
here.
Thus, we decided to base our comparison on a benchmark of 48 instances generated by
a CNDP generator developed by Crainic et al. and described in [44]. The generator is used
by different research groups after it was enhanced with a stable random number generator by
A. Frangioni. We generated graphs with 12, 18, and 24 nodes with 50 to 440 arcs and 50 to 160
commodities. For the heuristic comparison we use the benchmark set from [44, 45].
7.4.2 Algorithm Variants Considered in the Experiments
The optimization system developed consists of several parts. We list the components that are
compared and evaluated in the experiments: different Lagrangian relaxation algorithms based on
the shortest path (SP) or the knapsack relaxation (KP), respectively; a branch & bound algorithm
using bounds based on those relaxations, where the branching variable is chosen according to
minimal (BR0) or maximal (BR1) absolute reduced-cost values $\hat f_{ij}$; two different filtering
algorithms based on the shortest path relaxation (SF) and the knapsack relaxation (KF) that fix
variables statically (STAT) after optimal Lagrangian multipliers have been computed, or dynam-
ically (DYN) during the optimization of the Lagrangian dual; and finally, the cardinality interval
tightening algorithm that adds Lagrangian cardinality cuts to the problem (CIT).
7.4.3 Evaluation
              BR0 vs. BR0-CIT    BR1 vs. BR1-CIT
time          93.7%              25.3%
  min         4.72%              0.28%
  max         353%               131.5%
  variance    62.1%              11.5%
nodes         38.6%              10.1%
  min         0.73%              0.02%
  max         120.7%             78.4%
  variance    14.5%              2.9%

Tab. 7.1: The impact of cardinality interval tightening when using the knapsack relaxation for
pruning and problem reduction. Mean, minimum, maximum, and variance of running time and
number of nodes in the branch-and-bound tree are given.
With the first experiments we performed we wanted to find out which type of Lagrangian
relaxation was preferable. In accordance with the results reported in [112], we found that the
knapsack relaxation is clearly superior both with respect to the number of subgradient iterations
needed to solve the Lagrangian dual as well as the time needed to solve the Lagrangian
sub-problems. Therefore, we start right away with an evaluation of the impact of Lagrangian
cardinality cuts when solving the CNDP using the knapsack relaxation. Table 7.1 shows
a comparison of lower bound routines using the Lagrangian knapsack relaxation with and without
cardinality cuts. Table 7.2 shows a comparison of two different strategies for the selection
of the branching variable. Comparing two variants, the tables give the average percentage of the
second variant when compared to the first (that is always set to 100%) with respect to running
times and the number of search nodes visited in the branch & bound trees. Moreover, we specify
minima, maxima, and the variance of those percentages.
              BR0 vs. BR1        BR0-CIT vs. BR1-CIT
time          1817.7%            235.1%
  min         68.03%             30.91%
  max         7445.5%            1221.7%
  variance    37944.6%           604.7%
nodes         2750.4%            311.1%
  min         89.832%            15.636%
  max         19415.3%           1427%
  variance    172454%            1163.3%

Tab. 7.2: The impact of the branching variable selection when pruning and filtering is done with
the help of the knapsack relaxation.
Clearly, choosing a branching variable with minimal reduced costs is favorable, no matter if
cardinality cuts are introduced or not. This contradicts the recommendation given in [112].
Actually, this result is not very surprising. Intuitively, the variable with the minimal absolute
reduced costs is least likely to be set by variable fixing. It is the variable we have the least
knowledge about, and therefore it is a good choice to base a case distinction on it. In contrast,
the variable with the largest absolute reduced costs is most likely to be set by variable fixing, and
therefore it is not a good idea to double the effort by using this variable for branching.
Regarding the introduction of Lagrangian cardinality cuts, Table 7.1 shows that they have a
great impact on the number of search nodes that have to be investigated. Cardinality cuts are also
favorable with respect to the total running time, but the gains are not as large as with respect to
the size of the search tree. The trade-off is caused by the additional effort that is necessary to sort
the arcs with respect to the current reduced costs $\hat f_{ij}$.
When looking at the data more precisely, we find that the primal heuristic works much better
in the presence of cardinality cuts. The result of this positive effect is clear: high quality upper
bounds are found much earlier in the search, pruning and variable fixing work much better, and
the number of search nodes is greatly reduced, which explains the numbers in Table 7.1.
We conjecture that the primal heuristic works so well in the presence of cardinality cuts,
because they provide a good estimate on the number of arcs that need to be installed in order to
improve the current solution. Thus, the right number of arcs is opened for the heuristic, and it is
able to compute near-optimal solutions at a higher rate.
Next, we evaluate the use of CP-based Lagrangian relaxation for the CNDP. Table 7.3 shows
a comparison of runs when using shortest path variable fixing in addition to the knapsack variable
fixing algorithm. The results are very disappointing: the integrated approach is inferior
with respect to the total running time, and on top of that, the reduction of choice points is
negligible, so the additional effort taken is almost worthless.
              SOLVE: KF vs. KF-SF    OPT: KF vs. KF-SF
time          148.6%                 144.1%
  min         96.59%                 51.87%
  max         466%                   271.3%
  variance    46.1%                  13.5%
nodes         133.8%                 94.9%
  min         71.42%                 20%
  max         677.1%                 180.3%
  variance    166.3%                 7.5%

Tab. 7.3: The impact of additional shortest-path filtering when using the knapsack relaxation for
pruning and problem reduction. Branching strategy BR0 and cardinality interval tightening are used.
Note that, when using the linking method, the number of search nodes sometimes even exceeds
the value that is achieved when using knapsack variable fixing only. This is caused
by differences when building up the search tree: the Lagrangian dual usually stops
with different Lagrangian multipliers that have a severe impact on the variable selection.
Moreover, the generation of primal solutions differs, which makes the comparison particularly
difficult, because variable fixing is highly sensitive to the quality of upper bounds. Thus, to
eliminate the last perturbation, we repeated the experiment on the algorithmic optimality proof.
That is, we provide the algorithm with an optimal solution and let it prove its optimality only.
Table 7.3 shows the results that still reveal the poor performance of the additional applica-
tion of shortest path variable fixing. The reason for this is that the shortest path variable fixing
algorithm is much less effective than the one based on the knapsack relaxation: Recall from Sec-
tion 7.2.4 that the shortest path relaxation value consists of two values, one for the shortest-path
routings, and the other one for the design variables. However, the variable fixing algorithm only
incorporates the latter costs and does not take into account that the removal of arcs may generally
increase the routing costs as well. Therefore, using shortest path variable fixing as described in
Section 7.2.4 is comparatively ineffective and does not pay off.
We tried to improve the effectiveness of the algorithm by adding node-capacity constraints.
If a node is a source for some commodities, its out-capacity must be large enough to push the
corresponding supply into the network. Similarly, if a node is a sink node for some commodities,
its in-capacity must be large enough to let the required demand in. In contrast to the knapsack
relaxation, where the x- and y-variables are not independent, the shortest path relaxation allows
us to incorporate those constraints very easily. Moreover, we tried to improve the shortest path
reduction algorithm further by computing a lower bound on the additional costs of routing a given
commodity when a certain arc is not installed. The algorithm presented in Section 2.2.4.1 was
implemented for this purpose. However, all these efforts did not result in a filtering algorithm that
was effective enough to be worth applying. Therefore, we cannot recommend using the shortest
path relaxation, either for the computation of Lagrangian relaxation bounds or for problem
reduction. Note that this recommendation stands in contradiction to the one given in [44].
              DYN vs. STAT
time          115.9%
  min         36.95%
  max         229.2%
  variance    14.1%
nodes         123.2%
  min         29.72%
  max         233.8%
  variance    20.8%

Fig. 7.1: Optimality proofs: comparison of two strategies when using reduction based on
knapsack relaxation: on-the-fly fixing vs. fixing after Lagrange.
In Figure 7.1, we compare the strategy to fix variables during the optimization of the Lagrangian
dual with the fixing of variables after optimal multipliers have been found only. We see that
filtering “on the fly” is slightly favorable. Our experience also shows that the subgradient method
is very robust with respect to problem reduction, and we can afford not to reset the step length
after variables have been fixed in the Lagrangian sub-problem without losing convergence in
practice.
Next, in Table 7.4, we compare the performance of the algorithm we developed and the standard
solver ILOG CPLEX 7.5 [118] when solving the CNDP and when proving the optimality of a given
solution. Clearly, using LP-bounds improved by several kinds of cuts that CPLEX adds to the
problem results in a huge reduction of search nodes. However, Lagrangian relaxation allows us to
compute lower bounds much faster, so that the approach presented here is still competitive when
solving the CNDP. It even achieves a considerable improvement upon the running time in the
optimality proof. Most important, however, is the fact that the approach we developed is able to
tackle much larger instances using the variable fixing heuristic presented in Section 7.3.

              OPT: CPLEX vs. KP-BR0-KF-CIT    SOLVE: CPLEX vs. KP-BR0-KF-CIT
time          73.5%                           229.2%
  min         9.63%                           22.48%
  max         259%                            753.5%
  variance    36.5%                           356.5%
nodes         1148.1%                         3014.6%
  min         196.666%                        100%
  max         5250%                           10279.5%
  variance    10762.4%                        73762.5%

Tab. 7.4: Comparison of the CPLEX branch-and-cut algorithm and Lagrangian relaxation (pruning and
reduction based on the knapsack relaxation plus cardinality interval tightening).
We compare the non-exact version of our approach (using the α-fixing heuristic) with other
heuristic approaches [45, 94, 95] that have been developed for the CNDP. A comparison with
CPLEX is left out because it is not at all competitive for this benchmark set containing larger
CNDP instances. In Figure 7.2, we give the percentage of instances in a benchmark set (set C
in [44, 45], containing 31 instances) that have been solved within a given solution quality (in
percent, compared with the best known solution). The α-fixing variants with and without
cardinality cuts are clearly superior with respect to the achieved solution quality. Moreover,
the heuristic variable fixing approach was stopped after at most 300 seconds of CPU time. On this
benchmark set, heuristic variable fixing is on average about 6 times faster than TABU-PATH and
23 times faster than PATH-RELINKING (using SPECint values to make different architectures
comparable).
7.5 Summary and Future Work
We have presented an approach for the solution of the Capacitated Network Design Problem. It is
based on a tree search where lower bounds based on Lagrangian relaxation are used for pruning.
Two kinds of relaxation are considered, the shortest path and the knapsack relaxation. The latter
is clearly favorable with respect to the convergence of the subgradient algorithm that optimizes
the Lagrangian dual.
[Figure 7.2: histogram for the benchmark set C. Horizontal axis: solution quality (%), from 0 to "10 and larger"; vertical axis: cumulated probability, from 0% to 100%. Series: ALPHA 300, ALPHA CIT 300, TABU-PATH, TABU-ARC (400), TABU-CYCLE, SS/PL/ID (400), PATH-RELINKING.]
Fig. 7.2: Comparison of different heuristic solvers for the CNDP.
Two different variable fixing algorithms have been proposed in the literature that correspond to
the kind of relaxation that is chosen. When using the knapsack relaxation, we have shown how
variables can also be fixed with respect to shortest-path considerations by using dual values in
the Lagrangian knapsack sub-problem. However, even in a stronger version and in combination
with node-capacity constraints, the shortest path variable fixing algorithm is too ineffective to
justify the additional effort that is necessary for its application.
To tighten the problem formulation in a search node, we introduced the idea of local Lagrangian
cardinality cuts. Experimental results show that their application improves the overall
running time, even though the time per search node increases considerably when they are applied.
Finally, we compared the heuristic variable fixing approach with other heuristic approaches
developed for the CNDP. The results show that the tree-search approach that we implemented
clearly outperforms other heuristics both with respect to the CPU time needed and the solution
quality that is achieved.
Considering that we set up our system for the evaluation of variable fixing algorithms
and local Lagrangian cardinality cuts, and taking into account that no sophisticated methods (like
bundle methods for example) for the optimization of the Lagrangian dual are used, we consider
these results very encouraging. Most importantly, note that no global cuts have been introduced yet to
strengthen the lower bounds computed. With the help of additional cuts that provably strengthen
the LP relaxation (see [33] for example), we expect that the performance of the approach
presented can be further improved.
Chapter 8
The Social Golfer Problem
In Chapter 4, we introduced the Social Golfer Problem as an example for a highly symmetric
constraint satisfaction problem. We revise the symmetry detection function for the symmetry
breaking method that was presented in Chapter 4. Then, with the help of heuristic constraint
propagation, we develop a state-of-the-art algorithm for the Social Golfer Problem.
The Social Golfer Problem has attracted much interest in recent years. That interest is mainly
caused by its highly symmetric structure, which has made it a favorite playground for research
on the systematic breaking of symmetries.
In [202], Barbara Smith applied an approach that breaks symmetries during the search, i.e.,
she uses SBDS [93] to tackle the Social Golfer Problem. In combination with careful model
selection she was able to efficiently break most of the symmetries, but still found non-unique
solutions for the instances studied. Note that work is in progress which removes SBDS’s need
for an explicit list of symmetries [92, 152]; eventually this should allow SBDS to be used to
eliminate all symmetries from the problem.
In [81], Filippo Focacci and Michela Milano presented another generic method for breaking
symmetries, based on global cut seeds, generating symmetry removal cuts. With this approach it
should be possible to eliminate all the symmetries of the Social Golfer Problem, but at the time
of writing this has not been done.
The very similar SBDD method that we presented in Chapter 4 was developed independently.
It is based on the detection of dominance relations between choice points and works particularly
well for highly symmetric problems. At the time of writing, this is the only technique which has
been used to completely eliminate all symmetries from non-trivial instances of the Social Golfer
Problem.
In [104], Warwick Harvey compares SBDS and SBDD, and also gives some numerical results
on the Social Golfer Problem.
The approach that we present in the following was the best one known for the Social Golfer
Problem at the time of publication. Meanwhile, our results have been assimilated and extended
by several other research groups.
In [16], Nicolas Barnier and Pascal Brisset developed an approach for the Social Golfer
Problem that extends the concept of SBDD by incorporating the branching variable selection.
In [174], Francois Puget also refines symmetry breaking for the Social Golfer Problem.
The work presented in this chapter was published in [191, 192]. It is structured as follows:
To keep the chapter self-contained, we review the definition of the Social Golfer Problem in
Section 8.1. Then, in Section 8.2, we present a refined symmetry breaking function. Incorporating
heuristic constraint propagation of some redundant constraints that are introduced in Section 8.3,
we present numerical results of our approach in Section 8.4.
8.1 Definition
Given natural numbers $w, g, s$, the Social Golfer Problem consists in finding $w$ partitionings
of the set $\{1,\dots,gs\}$ into $g$ sets of size $s$ such that no two such sets have more than one member
in common. More formally, the problem is to compute $wg$ sets
$X^1_1,\dots,X^1_g,X^2_1,\dots,X^w_g \subseteq \{1,\dots,gs\}$ such that
$|X^i_k| = s$ for all $1 \le i \le w$, $1 \le k \le g$,
$\bigcup_{1 \le k \le g} X^i_k = \{1,\dots,gs\}$ for all $1 \le i \le w$, and
$|X^i_k \cap X^j_l| \le 1$ for all $(i,k) \ne (j,l)$.
An instance of the Social Golfer Problem is written as a triple $g$-$s$-$w$ from now on. When
$(s-1)w = gs-1$, we have a configuration where every player must play with every other exactly
once. This corresponds to a resolvable Balanced Incomplete Block Design. Perhaps the most
well-known of these is Kirkman’s Schoolgirl Problem, posed (and solved) by Thomas Kirkman
in 1850. This instance, which is equivalent to the golfer 5-3-7 problem, was stated as follows:
How can 15 schoolgirls walk in 5 rows of 3 each for 7 days so that no girl walks
with any other girl in the same triplet more than once?
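The three conditions of the definition can be checked mechanically for a complete schedule. The following is only a minimal sketch: the nested-list representation of a schedule and the function name `is_social_golfer_schedule` are assumptions made for illustration, not part of the model used in this chapter.

```python
from itertools import combinations

def is_social_golfer_schedule(schedule, g, s):
    """Check a complete schedule against the definition of instance g-s-w.

    schedule: list of weeks, each week a list of groups, each group a list
    of player numbers from 1..g*s (hypothetical representation).
    """
    players = set(range(1, g * s + 1))
    for week in schedule:
        # Every week needs exactly g groups, each with s distinct players.
        if len(week) != g or any(len(set(grp)) != s for grp in week):
            return False
        # The groups of a week must cover all players.
        if set().union(*map(set, week)) != players:
            return False
    groups = [frozenset(grp) for week in schedule for grp in week]
    # No two distinct groups may share more than one player.
    return all(len(a & b) <= 1 for a, b in combinations(groups, 2))

# Instance 2-2-3: four players, two groups of two, three weeks.
tiny = [[[1, 2], [3, 4]], [[1, 3], [2, 4]], [[1, 4], [2, 3]]]
print(is_social_golfer_schedule(tiny, g=2, s=2))  # True
```

In this tiny instance $(s-1)w = 3 = gs-1$, so every player meets every other player exactly once.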
To the best of our knowledge, the computational complexity of the Social Golfer Problem is
not known yet. In the combinatorial design area, solutions for $s = 3$ are known as Kirkman Triple
Systems or Resolvable Steiner Systems [41]. It can be shown that an instance $x$-$x$-4 is equivalent
to finding two orthogonal latin squares of size $x$. Even more so, an instance $x$-$x$-$y$ is equivalent
to finding a set of $y-2$ mutually orthogonal latin squares, a problem that has been studied for
many years now [42].
Despite its apparently simple definition, computational approaches have great difficulties
solving even small instances of the Social Golfer Problem in a reasonable amount of time. In our
view, there are two main aspects of the problem that cause its high complexity for enumeration
approaches:
• The Social Golfer Problem is highly symmetric.
• The clique structure of the constraints ensuring that no two golfers play together
  more than once makes it hard to judge the feasibility of a partial assignment.
In the following, we will address these two points by introducing a refined symmetry detection
function for SBDD and the idea of heuristic constraint propagation.
8.2 Another SBDD-Approach for the Social Golfer Problem
The Social Golfer Problem contains a remarkable number of symmetries. Players can be placed
at any position within a group, groups can be rearranged within their week, and the weeks can
be ordered arbitrarily. Moreover, the player names can be permuted in any way desired. To give
an example: even the best models (in terms of symmetry reduction) for the original schoolgirl
instance still contain more than $10^{12}$ symmetries.
We have chosen to apply symmetry breaking during search (SBDD) in combination with a
straightforward model for the Social Golfer Problem that can be implemented with very little
effort using the ILOG SOLVER environment [121]. The groups are modeled as sets of players
with the cardinality of each set fixed to $s$. Each week contains $g$ such sets, and the full pattern
covers $w$ weeks. Initially, we fix all the players in the first week in increasing order. Additionally,
we insert the first $s$ players into the first $s$ groups for all weeks thereafter. Finally, the first group
of the second week can be filled with the smallest indexed players possible. None of these initial
labelings exclude any unique solutions.
We build up the schedule week by week, choosing as branching variable the group with the
smallest domain or as branching value the player with the fewest possible groups he or she can
be assigned to, depending on what leaves us with fewer choices.
To apply SBDD, we need to define a symmetry detection function ϕ that, for two given choice
points $c_1$ and $c_2$ represented as patterns reflecting the branching decisions taken so far, returns
true if and only if there exists a symmetry showing that $c_1$ defines a sub-tree of $c_2$ under that
symmetry. Then, at every choice point we check whether it is dominated in this fashion by some
previously expanded choice point, and if so, we prune the search.
In Section 4.3, to find a symmetry that proves a dominance relation between choice points,
we presented a procedure that, when given a specific player permutation, checks whether there
exists a week permutation that shows that one pattern dominates the other. To remove all
symmetries, it was then necessary to iterate over all player permutations, a test that turned out to be
extremely expensive. Now, instead of iterating over all player permutations and computing week
permutations, we suggest proceeding the other way round: given a permutation of the weeks, we check
whether there exists a permutation of the players such that one pattern dominates the other. That
is, we iterate over all week permutations in $c_2$ and search for a player permutation that is feasible
with respect to the currently required matching of the weeks. Thus, again we set up a nested
constraint satisfaction problem to find a suitable symmetry or prove that none exists.
As this full dominance check is still very expensive for the Social Golfer Problem, we only
perform it when a week is being filled completely. This idea is motivated by the experiments that
we presented earlier in Section 4.3. There, the application of a full dominance check turned out to
be most efficient when being applied in selected levels of the search tree only. For the remaining
nodes, we fix the player permutation to the identity and apply $\Phi^W_G$ (see Section 4.3.2), that is, we
search for symmetries according to feasible orderings of the weeks, the groups and the players
within the groups. This less expensive check is implemented as a pairwise dominance check
between weeks, followed by the computation of a maximum cardinality matching on a bipartite
graph $(V_1, V_2, E)$ where the weeks in $c_1$ and $c_2$ define the nodes in $V_1$ and $V_2$, respectively, and
$(v^1_i, v^2_j) \in E$ iff week $i$ in $c_1$ is dominated by week $j$ in $c_2$.
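The matching step of this less expensive check can be sketched as follows. This is an illustration only: the pairwise week dominance test is taken as given in the form of a Boolean matrix, the function name `weeks_dominated` is hypothetical, and it is an assumption here that dominance is reported exactly when the maximum cardinality matching covers every week of $c_1$. The matching itself is computed with the standard augmenting path method.

```python
def weeks_dominated(dominated):
    """dominated[i][j] is True iff week i of c1 is dominated by week j of c2.

    Returns True iff a maximum cardinality matching in the bipartite graph
    (V1, V2, E) covers every week of c1 (assumed acceptance criterion).
    """
    n1 = len(dominated)
    n2 = len(dominated[0]) if dominated else 0
    match_of = [None] * n2      # match_of[j]: week of c1 matched to week j of c2

    def augment(i, seen):
        # Try to match week i of c1, re-routing earlier matches if necessary.
        for j in range(n2):
            if dominated[i][j] and j not in seen:
                seen.add(j)
                if match_of[j] is None or augment(match_of[j], seen):
                    match_of[j] = i
                    return True
        return False

    # If some week of c1 cannot be matched, no matching covers all of V1.
    return all(augment(i, set()) for i in range(n1))

print(weeks_dominated([[True, True], [True, False]]))   # True
print(weeks_dominated([[True, False], [True, False]]))  # False
```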
8.3 Heuristic Constraint Propagation
Having introduced the Social Golfer Problem, and having presented an efficient way of handling
the many symmetries in the problem, we now introduce a new idea, the heuristic propagation of
additional redundant constraints by means of local search.
Assume we are given an NP-hard constraint satisfaction problem. Even though there is no
proof that we cannot solve the problem efficiently, there is strong empirical evidence that we
cannot compute a solution in polynomial time. The common approach then is to explore the
search space in some sophisticated manner that tries to consider huge parts implicitly. For constraint
satisfaction problems, that means that we try to cut off preferably large regions that do not
contain any feasible solutions. In constraint programming, particularly when performing some
kind of tree search, domain filtering algorithms are used for that purpose. Basically, the model
and the degree of propagation determine how the work is partitioned between the choice points
and the search tree as a whole. That is, we can aim at reducing the number of choice points by
spending additional effort in each of the choice points, or we can choose to keep the work done
per choice point small, resulting in a bigger search tree.
Thus, we face a trade-off between the time spent per choice point and the total number of
choice points. To take the alternatives to extremes, on one hand we can explore the entire domain
space, and on the other hand we can compute a solution or prove that none exists in the first node
visited, without making any choices which might need to be backtracked. For most applications,
the optimal balance will lie somewhere between the two extremes.
If we find that we expand too many choice points we may want to give more burden to the
individual choice points. Revising the model by adding redundant constraints is a common way
to achieve that goal. In general, we expect the redundant constraints to detect inconsistencies
higher up in the search tree. However, since checking whether a given partial assignment is
extensible to a full solution is usually of the same computational complexity as the original
problem, redundant constraints typically still only enforce a relaxation of the actual problem.
We propose adding tight redundant constraints that may be hard to verify exactly, but that
can be checked by applying some heuristic. That is, when formulating additional constraints,
we do not wish to restrict ourselves to considering only those constraints which are (relatively)
easy to propagate completely. Instead, we perform an incomplete check of complex redundant
constraints using the rich heuristic machinery that was developed in the operations research
community.
8.3.1 Literature on the Integration of CP and Local Search
In recent years, a substantial number of different approaches to the integration of constraint
programming (CP) and local search (LS) have been developed. The main problem when constructing
CP-LS hybrids is caused by the fact that CP uses monotonic reasoning whereas LS does
not.
Fairly balanced hybrids result from sequential applications of the two methods [36, 173].
Other balanced hybrids can also be achieved by applying decomposition methods (like Lagrangian
relaxation, column generation or Benders decomposition), where sub- and master problem
can be solved by different solution methods.
On the other hand, there also exist many developments that favor one of the methods and
just use the other one to overcome certain weaknesses. Constrained local search, for example,
uses LS as the predominant approach, with CP used to find neighbors in a sparse and/or large
neighborhood [5, 165].
Focusing on constraint programming instead has yielded hybrids that use local search to adapt
the variable and/or value ordering in the search tree. Only recently, an approach was presented
that uses local search to find dominating partial assignments that prove the sub-optimality of
the current search node, information that can be used for pruning [82]. For a more complete
overview on the field we refer the reader to the recent tutorial by Focacci et al. [76].
As with other methods, constraint programming forms the basis of our approach. However,
our use of local search is quite different to any of the methods mentioned above. For a given
model for a problem, we suggest considering the addition of complex redundant constraints, and
then using local search to perform (incomplete) propagation of these constraints.
The idea of using a stochastic search method to prove unsatisfiability is not new; it was
Challenge 5 in [196]. Also, heuristic search has been used when tackling certain optimization
problems like maximum clique or graph coloring [15, 60]. However, it appears that the idea
has never been introduced systematically. In the few examples where they are used, complex
redundant constraints are only used for pruning, but not for domain filtering. To the best of our
knowledge, the work presented here is the first to do so, albeit on a fairly specific set of constraints
that comprise sub-problems of the real problem we are trying to solve.
We already mentioned that the constraints requiring that no golfer plays with any
other golfer more than once make it very hard to judge the extensibility of a partial assignment.
That is, the sub-tree rooted by the current choice point may not contain any feasible solution, but
due to the fact that the different constraints in the problem are propagated independently from
each other and only interact via domain reductions, we cannot detect infeasibilities high up in
the search tree. As a matter of fact, when using the model we described above, many searches
will only backtrack when the assignments for an entire week are almost complete or even after
having started to do assignments in the last week only.
For most constraint satisfaction problems, to check whether a partial assignment can still be
extended to a feasible solution is of the same computational complexity as the original problem
itself. Therefore, in general the pruning and filtering algorithm applied in the search nodes
cannot be expected to be exact. Rather, it must be looked at as a heuristic to tighten the problem
formulation and to shrink the search space. Of course, there is a trade-off between the time
needed to apply that heuristic and the time needed to explore the remaining sub-tree.
Now, to improve the situation for the Social Golfer Problem as well as for many other constraint
satisfaction problems, we can try to formulate necessary constraints for partial assignments
to be extensible to complete, feasible solutions. For the Social Golfer Problem (and, we
believe, for many other problems as well), we are left with the decision to choose a weak redundant
constraint that can be propagated efficiently, or to pick a condition that is more accurate but
may be much harder to verify. For the latter, we may consider applying a heuristic to perform
incomplete domain filtering or pruning. Note that since the added constraints are redundant, the
incompleteness of this filtering does not affect either the soundness or the completeness of the
search; if some opportunity for pruning is missed, it just means that the tree searched may be
larger than strictly necessary.
In the following, we describe two different types of additional constraints that we would like
to add to our model to be able to detect inconsistencies quickly and as early as possible. The first
type of redundant constraint is defined with respect to the possibility of completing the assignments
in a given week. Since we represent weeks by rows in our schedule under construction,
we use the term horizontal constraints to refer to these constraints. Correspondingly, by vertical
constraints we mean those used to check necessary conditions for a partial assignment to be
extensible to a w-week solution.
          group 1   group 2   group 3   group 4    group 5
week 1    1 2 3     4 5 6     7 8 9     10 11 12   13 14 15
week 2    1 4 7     2 5 8     3 * *     * * *      * * *
Tab. 8.1: A partial instantiation of the 5-3-2 Social Golfer Problem.
          group 1   group 2   group 3   group 4    group 5
week 1    1 2 3     4 5 6     7 8 9     10 11 12   13 14 15
week 2    1 4 7     2 5 8     3 6 10    11 * *     12 * *
Tab. 8.2: A more complete partial instantiation of the 5-3-2 Social Golfer Problem.
8.3.2 Horizontal Constraints
Consider the example in Table 8.1. We are searching for a two-week schedule for 15 golfers, to
be arranged in 5 groups of 3 in each week.
For the given partial assignment, suppose we add player 6 to group 3. Next observe that
players 10, 11 and 12 must be separated in week 2. As there are only three more groups that have
not yet been filled completely, the third player in group three must be one of players 10, 11 and
12. A possible continuation is given in Table 8.2.
Now there are only two groups left that have not been completed yet, but players 13, 14 and
15 must still be separated. Therefore, the current partial assignment is inconsistent, and we can
backtrack.
We can generalize this observation. For a given incomplete week, we define a residual graph
$R$ that consists of a node for each unassigned player and an edge for each pair of such players that
have already been assigned to play together in some other week. An example of such a graph,
corresponding to week two from Table 8.1, is shown in Figure 8.1. Then, for the given week we
count the number of groups that are not closed yet and compare that number with the size of the
biggest clique in R. If the first number is smaller than the latter, then there is no way to extend
the current assignment to the rest of the week, and the assignment is inconsistent.
[Graph drawing on the nodes 6, 9, 10, 11, 12, 13, 14 and 15.]
Fig. 8.1: The residual graph of week 2 from Table 8.1.
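The residual graph and the pruning test described above can be sketched in a few lines. The data layout (a week as a list of groups holding only the players assigned so far) and all function names are assumptions made for illustration. To keep the sketch short, the clique number is computed exactly by brute force, which is only sensible for graphs as tiny as the ones in Tables 8.1 and 8.2; it stands in for the heuristic clique search used in this chapter.

```python
from itertools import combinations

def played_pairs(weeks):
    """All pairs of players that share a group in the given weeks."""
    return {frozenset(p) for week in weeks for grp in week
            for p in combinations(grp, 2)}

def max_clique_size(nodes, edges):
    """Exact clique number by brute force (tiny graphs only)."""
    for k in range(len(nodes), 0, -1):
        for cand in combinations(sorted(nodes), k):
            if all(frozenset(p) in edges for p in combinations(cand, 2)):
                return k
    return 0

def week_is_inconsistent(other_weeks, week, s, n):
    """True iff fewer groups are open than the largest clique of R requires."""
    assigned = {p for grp in week for p in grp}
    unassigned = set(range(1, n + 1)) - assigned
    edges = {e for e in played_pairs(other_weeks) if e <= unassigned}
    open_groups = sum(1 for grp in week if len(grp) < s)
    return open_groups < max_clique_size(unassigned, edges)

week1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
table_8_1 = [[1, 4, 7], [2, 5, 8], [3], [], []]
table_8_2 = [[1, 4, 7], [2, 5, 8], [3, 6, 10], [11], [12]]
print(week_is_inconsistent([week1], table_8_1, s=3, n=15))  # False
print(week_is_inconsistent([week1], table_8_2, s=3, n=15))  # True
```

For Table 8.2 only two groups are still open, while players 13, 14 and 15 form a clique of size three in the residual graph, so the test reports an inconsistency.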
In this way we can define a sufficient condition for a witness which proves that the current
partial assignment is inconsistent: a clique exceeding a certain cardinality. If we can find such
a witness, we can backtrack immediately. Finding a clique of size $k$ is known to be NP-hard for
arbitrary graphs, and while the residual graphs we are dealing with have special structure that
may allow the efficient computation of such a clique, we chose not to try to find a polynomial-
time complete method. Instead we apply a heuristic search to find a sufficiently large clique,
an approach which, as we shall see, has advantages over one which simply returns the largest
possible clique.
Reconsider the situation given in Table 8.1. We can add neither player 6 nor player 9 to
group 3 for the same reason: the members of groups 4 and 5 of week 1 must be separated, and
to do so we require the two open positions in group 3 of week 2.
When checking the redundant constraint described above, assume we have set up the residual
graph and suppose the heuristic we apply finds the two disjoint cliques of size three
($\{10,11,12\}$ and $\{13,14,15\}$). Since the sizes of these cliques are equal to the number of incomplete groups,
we have not found any witnesses showing that the current partial assignment cannot be extended
to a full schedule; indeed, the schedule can still be completed. However, since group 3 has only
two open positions left, we can conclude that group 3 must be a subset of
$\{3,10,11,12,13,14,15\}$. That is, we can use heuristic information for domain filtering.
We conclude that finding a witness for unsatisfiability is a rather complex task, but it can be
looked for by applying a heuristic. Moreover, even if we do not find such a witness, we may find
other “good” witnesses (namely some fairly large cliques) and their information can be combined
and used for domain filtering.
Therefore, it is advantageous to use a heuristic that not only provides us with good solutions
quickly, but that also gives us several solutions achieving almost optimal objective values. Local
search heuristics seem perfectly suited for this purpose.
8.3.2.1 Heuristic Clique Search
To find large cliques in the residual graph, we perform a randomized local search that works in
the following way: We initialize our current clique $C$ with a random node and set $C_{best} := C$.
Next we intensify $C$ by repeatedly searching for a random node that is adjacent to all nodes in $C$.
If no such node exists anymore, we compare the cardinalities $|C|$ and $|C_{best}|$ and update $C_{best}$ if
necessary. We then move on with a diversification step by adding a random node $v \in V \setminus C$ to $C$
and removing all nodes in $C$ that are not adjacent to $v$. These nodes shall not be considered in the
next diversification step. Now, the loop is complete and we return to the intensification phase.
The process stops after having found a clique that exceeds the crucial cardinality or after a given
iteration limit.
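The procedure just described can be sketched as follows. This is a minimal illustration rather than the implementation used for the experiments: the adjacency-dictionary layout, the function name `local_clique_search`, the parameter names and the fallback that ignores the excluded nodes when they would block every diversification move are all assumptions.

```python
import random

def local_clique_search(adj, bound, max_iter=1000, rng=None):
    """Randomized local search for cliques.

    adj maps each node to the set of its neighbours (hypothetical layout).
    Returns (best, found): the largest clique seen and all maximal cliques
    visited. Stops early once a clique with more than `bound` nodes is found.
    """
    rng = rng or random.Random()
    nodes = list(adj)
    clique = {rng.choice(nodes)}
    best, found, excluded = set(clique), [], set()
    for _ in range(max_iter):
        # Intensification: add random nodes adjacent to the whole clique.
        while True:
            cands = [v for v in nodes if v not in clique and clique <= adj[v]]
            if not cands:
                break
            clique.add(rng.choice(cands))
        if frozenset(clique) not in found:
            found.append(frozenset(clique))
        if len(clique) > len(best):
            best = set(clique)
        if len(best) > bound:
            break                       # witness found
        # Diversification: force in an outside node, drop its non-neighbours.
        outside = [v for v in nodes if v not in clique and v not in excluded]
        if not outside:                 # assumption: lift the exclusion if stuck
            outside = [v for v in nodes if v not in clique]
        if not outside:
            break                       # the clique already contains every node
        v = rng.choice(outside)
        excluded = clique - adj[v]      # not considered in the next diversification
        clique = (clique & adj[v]) | {v}
    return best, found

# Residual graph of week 2 from Table 8.1.
adj = {6: set(), 9: set(),
       10: {11, 12}, 11: {10, 12}, 12: {10, 11},
       13: {14, 15}, 14: {13, 15}, 15: {13, 14}}
best, found = local_clique_search(adj, bound=3, max_iter=200, rng=random.Random(0))
print(len(best), [sorted(c) for c in found])
```

Keeping the list `found` rather than only `best` reflects the point made above: the sequence of cliques, not just the largest one, is what is later used for pruning and domain filtering.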
          group 1     group 2     group 3      group 4       group 5
week 1    1 2 3 4     5 6 7 8     9 10 11 12   13 14 15 16   17 18 19 20
week 2    1 5 9 13    2 6 10 *    3 7 * *      4 8 * *       * * * *
Tab. 8.3: A partial instantiation of the 5-4-2 Social Golfer Problem.
[Graph drawing on the nodes 11, 12, 14, 15, 16, 17, 18, 19 and 20.]
Fig. 8.2: The residual graph of week 2 from Table 8.3.
Obviously, the approach as sketched above produces a sequence of cliques that we can use
for pruning and domain filtering. We do not claim that the clique search procedure we use is
very sophisticated for finding maximum cardinality cliques in a given graph. In fact, many other
heuristic and exact approaches can be thought of, and there has certainly been a lot of relevant
research done. However, we did not aim at developing a special method that produces one clique
of large cardinality, but rather that finds a large number of fairly big cliques. In any event, while
the algorithm could no doubt be improved on, for the graphs we are dealing with it works well
enough to prove the point.
Before we continue by presenting another type of redundant constraint developed for the
Social Golfer Problem, we give a more complete example of how horizontal constraints can be
used for pruning and domain filtering.
Consider the partial assignment for the social golfer instance 5-4-2 given in Table 8.3. The
associated residual graph for week 2 is given in Figure 8.2.
The local search procedure that we apply returns three disjoint cliques, namely $\{11,12\}$,
$\{14,15,16\}$, and $\{17,18,19,20\}$. Since there are still four groups that have not been filled up
yet and the largest clique is of size four, we cannot prune right away. However, we do know
that the final element in group 2 must come from the clique of size four, and so we can shrink
the set of possible elements appearing in this group to $\{2,6,10,17,18,19,20\}$. This leaves just
three open groups left, and we have a remaining clique of size 3 at hand. Again we cannot
prune here, but we know that the remaining two elements of groups 3 and 4 must come from the
cliques of size three and four, so we can shrink the set of possible elements of those groups to
be $\{3,7,14,15,16,17,18,19,20\}$ and $\{4,8,14,15,16,17,18,19,20\}$, respectively. Now there is
only one open group left, but we still have to separate players 11 and 12; as a result, the current
assignment is inconsistent, and we can backtrack.
          group 1   group 2   group 3   group 4    group 5
week 1    1 2 3     4 5 6     7 8 9     10 11 12   13 14 15
week 2    1 6 7     2 5 10    3 4 13    8 11 14    9 12 15
week 3    1 5 14    2 4 12    3 7 11    9 10 13    6 8 15
week 4    1 4 *     2 * *     3 5 *     * * *      * * 15
week 5    * * *     * * *     * * *     * * *      * * *
week 6    * * *     * * *     * * *     * * *      * * *
week 7    * * *     * * *     * * *     * * *      * * *
Tab. 8.4: A partial instantiation of the 5-3-7 Social Golfer Problem.
The example shows that it can even be advantageous to have several cliques at hand rather
than one big clique only: all cliques together allowed us to prune the search at the current choice
point. If we had known the largest clique of size four only, though, we would have had to expand
the sub-tree below.
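The reasoning of the two examples can be turned into a small filtering routine. What follows is one possible formalization, not necessarily the procedure that was implemented: it assumes the cliques are pairwise disjoint, and the function name and data layout are hypothetical. It relies on the observation that a clique with as many unplaced members as there are open groups must place exactly one member in every open group.

```python
def filter_with_cliques(free_slots, cliques):
    """free_slots: dict group -> number of open positions (all > 0).
    cliques: pairwise disjoint cliques of the residual graph (sets of players).

    Returns (consistent, restricted), where restricted maps a group to the
    set of players from which all of its open positions must be filled.
    """
    open_groups = dict(free_slots)
    remaining = [len(c) for c in cliques]   # members still to be spread over open groups
    restricted = {}
    while True:
        m = len(open_groups)
        if any(r > m for r in remaining):
            return False, restricted        # a clique needs more groups than are open
        # Tight cliques must place exactly one member in every open group.
        tight = [i for i, r in enumerate(remaining) if m > 0 and r == m]
        if any(f < len(tight) for f in open_groups.values()):
            return False, restricted        # some group cannot host all tight cliques
        closed = [grp for grp, f in open_groups.items()
                  if tight and f == len(tight)]
        if not closed:
            return True, restricted
        for grp in closed:                  # these groups are filled by tight cliques only
            restricted[grp] = set().union(*(cliques[i] for i in tight))
            del open_groups[grp]
        for i in tight:
            remaining[i] = len(open_groups)

# Week 2 of Table 8.3 (instance 5-4-2).
ok, dom = filter_with_cliques({2: 1, 3: 2, 4: 2, 5: 4},
                              [{11, 12}, {14, 15, 16}, {17, 18, 19, 20}])
print(ok, {grp: sorted(players) for grp, players in dom.items()})
```

For Table 8.3 the routine restricts the open position of group 2 to players 17 to 20 and the open positions of groups 3 and 4 to players 14 to 20, and then reports an inconsistency because players 11 and 12 would have to share the single remaining open group. The returned sets list candidates for the open positions only, so they do not contain the players already placed in a group.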
8.3.3 Vertical Constraints
Horizontal constraints are very helpful for judging the extensibility of the week currently under
construction. However, they do not help much in getting a clearer view of whether a partial
assignment can still be extended to a full w-week solution. Therefore, we added another type of
redundant constraint to our model, so-called vertical constraints.
Again, we start our discussion with an example (see Table 8.4). In the current partial
assignment, week 4 can still be completed consistently. Is there still a continuation of the given
schedule to a full 7-week solution, though?
[Graph drawing on the nodes 1, 2, 3, 4, 5, 7, 10 and 11.]
Fig. 8.3: The residual graph of player 15 from Table 8.4.
Looking at player 15 (let us assume she is female), we find that she has played with all players in
$\{6,8,9,12,13,14\}$ already. As there are 4 weeks left that she has not yet been assigned partners for,
she must still play with all players in $\{1,2,3,4,5,7,10,11\}$. To be able to do so, there must be
four independent pairs of players in this set that still have not been assigned to play with each
other. To check this, we define a residual graph again (this time for each player, in contrast to
each week for horizontal constraints) that consists of one node for each player that player 15 has
not been assigned to play with yet, and an edge between all such players that have already played
with each other (see Figure 8.3).
          group 1   group 2   group 3   group 4
week 1    1 2 3     4 5 6     7 8 9     10 11 12
week 2    1 4 7     2 8 10    3 5 11    6 9 12
week 3    1 8 11    2 4 9     3 6 10    5 7 12
week 4    * * *     * * *     * * *     * * *
week 5    * * *     * * *     * * *     * * *
Tab. 8.5: A partial instantiation of the 4-3-5 Social Golfer Problem.
[Graph drawing on the nodes 1, 2, 3, 4 and 8.]
Fig. 8.4: The residual graph of player 12 from Table 8.5.
When trying to find four disjoint stable sets (the term independent set is also used in the
literature) of size two heuristically, we find that we do not succeed. Of course, this does not
prove that there is none. That is, for the vertical constraints we need a witness that not enough
disjoint stable sets of a given size exist. The given example already gives an idea of what such
a witness might look like. Looking at Figure 8.3, we find that the clique $\{1,2,3,4,5\}$ prevents
us from finding four disjoint stable sets of size two. That is, a clique that exceeds a certain
cardinality is a witness that the current assignment is inconsistent.
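For the example of Table 8.4, the residual graph of player 15 and the clique witness can be computed as in the following sketch. The schedule layout and the function names are assumptions, and the brute-force clique test is only a stand-in for the local search of Section 8.3.2.1. The threshold used here follows from the fact that each stable set can contain at most one member of a clique: if the remaining players must be split into as many stable sets as there are weeks left, any clique with more nodes than that number is a witness.

```python
from itertools import combinations

def player_residual_graph(schedule, player, n):
    """Nodes: players that `player` has not met yet; edges: pairs of such
    players that have already been grouped together."""
    met = {q for week in schedule for grp in week if player in grp for q in grp}
    nodes = set(range(1, n + 1)) - met - {player}
    edges = {frozenset(p) for week in schedule for grp in week
             for p in combinations(grp, 2) if set(p) <= nodes}
    return nodes, edges

def find_clique(nodes, edges, k):
    """Return some clique with k nodes, or None (brute force, tiny graphs only)."""
    for cand in combinations(sorted(nodes), k):
        if all(frozenset(p) in edges for p in combinations(cand, 2)):
            return set(cand)
    return None

# Table 8.4; only the players assigned so far are listed in each group.
table_8_4 = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]],
    [[1, 6, 7], [2, 5, 10], [3, 4, 13], [8, 11, 14], [9, 12, 15]],
    [[1, 5, 14], [2, 4, 12], [3, 7, 11], [9, 10, 13], [6, 8, 15]],
    [[1, 4], [2], [3, 5], [], [15]],
]
nodes, edges = player_residual_graph(table_8_4, player=15, n=15)
weeks_left = 4                      # weeks 4 to 7 for player 15
witness = find_clique(nodes, edges, weeks_left + 1)
print(sorted(nodes))                # [1, 2, 3, 4, 5, 7, 10, 11]
print(sorted(witness))              # [1, 2, 3, 4, 5]
```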
To find large cliques in the residual graph of a player, we can apply the local search procedure
developed in Section 8.3.2.1. However, we are facing a slight drawback when using cliques as
witnesses. For the Schoolgirl Problem, i.e. when we know that every player must play with every
other player exactly once, the bound on the maximum cardinality clique in the residual graph can
be chosen to be rather tight. However, if we look at the Social Golfer Problem instance 4-3-5 for
example, not every golfer plays every other golfer; each plays every other golfer except one,
and a priori we cannot say which one that is. As a result, if we look for a clique large enough to
guarantee that there will be too few disjoint sets of the appropriate size, that condition is stronger
than we would like. This means that it does not become satisfied until later in the computation,
and we are not able to prune as early as would be desirable.
To see an example of this, consider the partial schedule in Table 8.5. There are five players
which player 12 has so far not been assigned to play with, and to complete the schedule we need
two disjoint stable sets of size two. To prove that this is not possible with a single clique, we
          group 1   group 2   group 3   group 4
week 1    1 2 3     4 5 6     7 8 9     10 11 12
week 2    1 4 7     2 5 12    3 8 11    6 9 10
week 3    1 6 11    2 4 9     3 7 12    5 8 10
week 4    * * *     * * *     * * *     * * *
week 5    * * *     * * *     * * *     * * *
Tab. 8.6: A partial instantiation of the 4-3-5 Social Golfer Problem.
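[Three graph drawings: left on the nodes 3, 8, 10, 11 and 12; middle on the nodes 2, 3, 7, 8 and 12; right on the nodes 1, 4, 6, 8 and 9.]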
Fig. 8.5: The residual graphs of players 4 (left), 6 (middle) and 12 (right) from Table 8.6.
would need a clique of size four. Looking at the residual graph for player 12 (see Figure 8.4),
there are three cliques of size three, namely
1
2
3
,
1
2
4
and
1
2
8
, but no cliques of size
four. However, there are no pairs of disjoint stable sets of size two either, so the schedule cannot
be completed but this is not detected. (Note that the residual graphs for the other players all have
exactly the same structure.)
As with the horizontal constraints, we can obtain better results by considering more than one
clique at once. To illustrate this, consider the slightly different example in Table 8.6. Looking
at the residual graph of player 12 (see Figure 8.5), we find two cliques of size three, namely
$\{1, 4, 6\}$ and $\{4, 6, 9\}$. As before, there are no cliques of size four, but this time there is a pair of
disjoint stable sets of size two: $\{1, 9\}$ and $\{4, 8\}$. So we cannot tell that the schedule cannot be
extended simply by looking at this residual graph. However, we can draw inferences about which
player is the one that player 12 will not be playing with: it must be an element of the intersection
of all cliques of size three. This is because all cliques of size three have to be broken by the
removal of a node; otherwise we are guaranteed that there is no way to partition the remaining
four nodes into a pair of disjoint stable sets. Thus we can deduce that the player that player 12
must not play with is either player 4 or player 6.
We now look at the residual graphs for player 4 and player 6 (see Figure 8.5). For player 4
we find that the intersection of cliques $\{3, 8, 11\}$ and $\{10, 11, 12\}$ requires that player 4 will
certainly not play with player 11. Similarly, player 6 must not play with player 3, since, for
example, $\{3, 7, 8\} \cap \{2, 3, 12\} = \{3\}$. This means both of these players must play with player 12,
which contradicts the fact that player 12 must not play with one of them. Hence we can prune
the search.
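The inference chain above can be replayed mechanically. The following is a minimal sketch, not the thesis implementation: it enumerates all cliques of size three by brute force (the thesis finds cliques by local search instead), and the names `SCHEDULE`, `residual_nodes`, `triangles` and `excluded_candidates` are illustrative. It assumes, as in this example, that every player has exactly five residual partners of which four are still needed, so that exactly one partner is excluded.

```python
from itertools import combinations

# Partial schedule from Table 8.6 (weeks 1-3 of the 4-3-5 instance).
SCHEDULE = [
    [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)],
    [(1, 4, 7), (2, 5, 12), (3, 8, 11), (6, 9, 10)],
    [(1, 6, 11), (2, 4, 9), (3, 7, 12), (5, 8, 10)],
]
PLAYERS = range(1, 13)

# Pairs of players that have already met in some group.
met = {frozenset(pair)
       for week in SCHEDULE
       for group in week
       for pair in combinations(group, 2)}


def residual_nodes(player):
    """Players that `player` has not yet been grouped with."""
    return [q for q in PLAYERS
            if q != player and frozenset((player, q)) not in met]


def triangles(player):
    """All cliques of size three in the residual graph of `player`.

    Two residual nodes are adjacent if they have already played together.
    """
    return [set(t) for t in combinations(residual_nodes(player), 3)
            if all(frozenset(p) in met for p in combinations(t, 2))]


def excluded_candidates(player):
    """Players contained in every size-three clique of the residual graph."""
    return set.intersection(*triangles(player))


# Player 12 must skip one of these players ...
cand12 = excluded_candidates(12)
# ... but each of them is forced to skip somebody other than player 12.
must_meet_12 = {p for p in cand12 if 12 not in excluded_candidates(p)}
prune = must_meet_12 == cand12
```

Brute-force enumeration finds a few more triangles for players 4 and 6 than the two named in the text, but the intersections are unchanged, so `prune` signals the same contradiction.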
                   4-3-3     4-3-4     4-3-5     5-4-3     5-4-4     5-4-5     5-4-6     5-3-6     5-3-7
PI  time            0.32      0.71      0.98     46.63     83.24       132       143    5 days    5 days
    (cp)           (102)     (182)     (227)    (7430)    (5392)    (6409)   (17129)
    ms/cp            3.1       3.9       4.3       6.3      15.4      20.6      83.2
H   time            0.23      0.43      0.31     40.19      43.2     43.74     33.80     13933     13311
    (cp)            (76)      (87)      (90)    (6205)    (2829)    (2823)    (2818)  (266140)  (266268)
    ms/cp            3.0       4.9       3.4       6.5      15.3      15.5      12.0      52.4      50.0
V   time            0.36      0.62      0.47     57.30        90     98.25     46.69     85855      1815
    (cp)           (100)     (173)     (116)    (7415)    (5282)    (5574)    (3300) (1075165)  (153697)
    ms/cp            3.6       3.6       4.1       7.7      17.0      17.6      14.1      79.9      11.8
HV  time            0.25       0.4      0.26     47.78     47.45     44.18     23.21      6771       394
    (cp)            (76)      (75)      (74)    (6205)    (2750)    (2791)    (1770)  (165238)   (85790)
    ms/cp            3.3       5.3       3.5       7.7      17.3      15.8      13.1      41.0      46.0
Tab. 8.7: The CPU time needed to compute all unique solutions (in seconds), in brackets the number of
choice points visited, and the time per choice point (in milliseconds) when computing all unique solutions.
8.4 Numerical Results
To confirm our theoretical discussion, we implemented the model as described in Section 8.2 in
C++, compiled with gcc 2.95 at maximal optimization (-O3). All experiments were performed
on a PC with an Intel Pentium III 933 MHz processor and 512 MB RAM running Linux 2.4.
Regarding the additional redundant constraints, we present a comparison of four different
parameter settings:
1. The plain implementation without redundant constraints (PI),
2. PI plus horizontal constraints only (H),
3. PI plus vertical constraints only (V), and
4. PI plus horizontal and vertical constraints (HV).
In Table 8.7, the variants are evaluated on several social golfer instances. To make the com-
parison fair and reduce the impact of other choices such as the variable and value orderings used,
we compute all unique solutions of an instance (or prove that there are none, for the 4-3-5 and
5-4-6 instances), counting the number of choice points and measuring the CPU time required.
Clearly, using both types of additional constraints results in the biggest reduction of choice
points, though for small numbers of weeks the vertical constraints do not give any benefit when
horizontal constraints are used (and little benefit even when they are not). It is not surprising
that the vertical constraints are of little use when the number of weeks is small, since for such
instances the constraints are very weak: there is a lot of flexibility available in deciding which
players any given player should play with in the remaining weeks since the number of players
they will never play with is quite large. However, as the number of weeks increases, each player
must play with a greater number of the other players, so the amount of flexibility is reduced and
the constraints become steadily stronger.

                   4-3-3     4-3-4     4-3-5     5-4-3     5-4-4     5-4-5     5-4-6     5-3-6     5-3-7
PI  simple           111       285       358     27057     19804     22295     22673
    complete          36       115       137      3025      4515      5747      5814
    % time            28        45        48        31        71        81        80
H   simple            68       113       114     19264     10589     10690     10652   2013018   2012597
    complete          36        64        58      3018      1805      1934      1870    463737    463984
    % time            35        58        26        35        66        71        60        94        96
V   simple           105       270       123     26970     19280     20209     10896   7969706    724678
    complete          36       110        10      3025      4515      4757      2310   2684251    143631
    % time            22        40         2        25        62        63        35        96        80
HV  simple            68        89        70     19264     10262     10476      5993   1161383    386390
    complete          36        51         0      3018      1805      1785      1074    242709     18424
    % time            48        40         0        30        59        58        23        87        44
Tab. 8.8: The number of simple and complete symmetry checks, and the percentage of time spent in these
checks when computing all unique solutions.
A similar trend can be seen in the effectiveness of the horizontal constraints: while they
are useful for small numbers of weeks, their effectiveness improves as the number of weeks
increases. Again, this is due to the fact that as the number of weeks increases, each player
must play with a greater number of the other players, so that there is less flexibility available
in deciding who should play together in a given week. This means that it is more likely that
a partial week assignment cannot be completed, and hence the horizontal constraints can prune
more often.
Whether or not a reduction in the number of choice points also results in a reduction of CPU
time is of course determined by the trade-off between the time needed to apply the incomplete
propagation algorithms and the time saved by the reduction of choice points. Adding vertical
constraints when the number of weeks is small can be expected to worsen the CPU time due to
the ineffectiveness of these constraints on these instances, and that is indeed what happens. For
(almost) all other instances, the reduction in the number of nodes is sufficiently large to offset the
extra cost of the constraints. The only real surprise here is that adding the extra constraints can
result in the average amount of time spent at each node decreasing, sometimes quite markedly.

                      Number of Groups - Number of Players per Group
Number of Weeks     2-2    3-2    3-3    4-2    4-3    4-4    5-2    5-3    5-4    5-5
       2              1      1      1      2      1      1      2      2      1      1
       3              1      2      1      8      4      2     23    251     40      2
       4              0      1      1     16      3      1    310  13933     20      1
       5                     1      0     19      0      1   3468   9719     10      1
       6                     0            13             0  13277     49      0      1
       7                                   6                14241      7             0
       8                                   0                 3192      0
       9                                                      396
      10                                                        0
Tab. 8.9: The number of unique solutions for several social golfer instances.

This is due to some synergy between the extra constraints and the symmetry breaking method
used: roughly speaking, the more pruning, the easier the average dominance check becomes,
presumably because the nodes pruned tend to have more expensive checks (see Table 8.8). Since
the dominance checks form a significant part of the run time, this can be quite a noticeable effect.
Using the (HV) setting, we have been able to compute all unique solutions for all instances
with at most 5 groups and 5 players per group (see Table 8.9). To the best of our knowledge, this
is the first computational approach that has been able to compute these numbers for instances
of this size. Moreover, we found solutions for many previously unsolved (at least by constraint
programming) larger instances, such as the 10-6-6 and the 9-7-4 instances.¹ Finally, we were
able to solve the formerly unknown 6-5-6 and 6-5-7 instances. We computed a six week solution
for the six groups of five instance, proved that there are exactly two unique solutions for that
instance, and showed that these solutions are optimal by proving that no seven week solution
exists.
¹ An overview of solutions found by constraint programming can be found at [105].

8.5 Summary
For the Social Golfer Problem, we developed an algorithm that efficiently computes unique
solutions only. Symmetry breaking is based on the concept of SBDD. We have introduced the idea of
heuristic constraint propagation for complex redundant constraints. We proposed two different
types of additional constraints, so-called horizontal and vertical constraints. Propagating both
types of constraints exactly would require the computation of all the maximal cliques in residual
graphs with certain structural properties. Instead, we perform an incomplete propagation using
a local search method to find a number of maximal cliques (but perhaps not all of them and
perhaps not the largest possible). We have shown how such sets of cliques can be used for domain
filtering and pruning. The experiments clearly show that adding tight redundant constraints to
the problem can be of benefit, even when they are only propagated incompletely.
Our ability to use local search for domain filtering and proving unsatisfiability relies on the
fact that we have identified sufficient conditions for proving that a partial schedule cannot be
extended to a complete one, and that a local search procedure can be defined such that a solution
returned is a proof that the conditions have been met. It would be interesting to see what other
kinds of constraints this can be done for, and whether it can be generalized far enough to pass
Challenge 5 from [196].
Chapter 9
Graph Bisection
The last problem that we consider in this thesis is the NP-hard Graph Bisection Problem. We
develop an algorithm that, given a graph, determines its exact bisection width. Due to its inherent
hardness, we cannot hope for an efficient algorithm to tackle the problem (at least in terms
of worst case running times and under the common assumption that $\mathrm{NP} \neq \mathrm{P}$), but we aim at
increasing the size of instances that can still be solved in an affordable amount of time.
As is the case for any combinatorial optimization problem, the task of computing the
bisection width of a graph is twofold. First, an optimal solution to the problem must be computed,
and second, its optimality must be proven. Regarding the first task, efficient heuristics have been
developed in the literature to compute high quality and, for small graphs, often even optimal
solutions very quickly. To prove the optimality of a given solution, we use a total enumeration
branch & bound approach. The key issue for the development of a good tree search algorithm is
to compute tight lower bounds on the bisection width.
In [197], Sensen presented a new lower bound for the bisection width of a graph that consists
in solving a generalized Maximum Multicommodity Flow Problem: We need to solve a Max-
imum Multicommodity Flow Problem on an undirected graph, whereby the commodities have
exactly one source, but may have many sinks.
In contrast to the previous chapters, problem reduction is a minor subject of our research for
the Graph Bisection Problem. However, we have seen that filtering based on cost considerations
relies on a tight bound on the objective. Our contribution here is the development of time and
memory efficient algorithms that approximate Sensen’s lower bound. Building up on existing
techniques for the solution of Multicommodity Flow Problems, we develop two bound computa-
tion routines: the first is a fully polynomial time approximation scheme (FPTAS), and the second
is a Lagrangian relaxation based cost-decomposition approach. Both routines are embedded in
a branch & bound-framework and compared with a barrier LP-solver on various test instances.
The algorithms presented allow us to compute the bisection width of large structured graphs,
such as DeBruijn 9 and Shuffle-Exchange 10, which were unknown and out of the reach of exact
graph bisection algorithms before.
The work presented has not been published yet. It is joint work with Norbert Sensen and
Larissa Timajev. The chapter is structured as follows: In Section 9.1, we introduce the Graph
Bisection Problem. In Section 9.2, we review the lower bounds on the bisection width that have
been proposed in [197], especially the so-called VarMC-bound. In Sections 9.3 and 9.4, we
develop two algorithms for the computation/approximation of that VarMC-bound. The complete
branch & bound-approach is sketched in Section 9.5. Finally, in Section 9.6, we compare the
different algorithms on a set of test instances.
9.1 The Graph Bisection Problem
The Graph Bisection Problem is defined as follows:
Definition 9.1 Let $G = (V, E, u)$ denote an undirected, edge-weighted graph, whereby $u_{vw}$ is the
weight of the edge $(v, w) \in E$. Since $G$ is undirected, $u_{vw} = u_{wv}$ for any $(v, w) \in E$.
Let $2 \le k \in \mathbb{N}$ and $V_1, \dots, V_k \subseteq V$ such that, for all $1 \le i < j \le k$, $V_i \cap V_j = \emptyset$ and
$V = V_1 \cup \dots \cup V_k$. Then we call $(V_1, \dots, V_k)$ a k-partition (of $G$). A balanced k-partition of a
graph $G$ additionally satisfies the condition $\bigl| |V_i| - |V_j| \bigr| \le 1$ for any $1 \le i < j \le k$.
Given a k-partition $(V_1, \dots, V_k)$, we set $U := \{ (v, w) \in V_i \times V_j \mid 1 \le i < j \le k \} \cap E$. Then
the value $\sum_{(v,w) \in U} u_{vw}$ is called the cut size of the partition. The minimal cut size among
all balanced k-partitions of a graph $G$ is called the k-section width of $G$.
For $k = 2$, a k-partition is also called a bisection (of $G$). The minimum cut size over all
balanced bisections of $G$ is called the bisection width of $G$.
Given an edge-weighted, undirected graph $G$, the Graph Bisection Problem consists in the
computation of the bisection width of $G$.
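To make the definition concrete, the following sketch computes the bisection width of a very small graph by enumerating all balanced bisections. It is a brute-force illustration only, exponential in the number of nodes, and not one of the algorithms developed in this chapter; the function names and the edge-dictionary representation are assumptions made for the example.

```python
from itertools import combinations


def cut_size(weights, part):
    """Total weight of the edges with exactly one endpoint in `part`.

    `weights` maps each undirected edge (v, w) to its weight u_vw.
    """
    return sum(u for (v, w), u in weights.items()
               if (v in part) != (w in part))


def bisection_width(nodes, weights):
    """Minimum cut size over all balanced bisections (brute force).

    One side gets floor(n/2) nodes, the other the rest, so the two
    sides differ in size by at most one.
    """
    nodes = list(nodes)
    half = len(nodes) // 2
    return min(cut_size(weights, set(part))
               for part in combinations(nodes, half))


# A cycle on four nodes with unit weights, and a weighted path.
cycle4 = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
path4 = {(0, 1): 5, (1, 2): 1, (2, 3): 5}
bw_cycle = bisection_width(range(4), cycle4)
bw_path = bisection_width(range(4), path4)
```

For the cycle, every balanced bisection cuts at least two edges; for the path, splitting at the light middle edge is optimal.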
For many special graph classes (such as grids, tori, hypercubes, butterflies, cube-connected
cycles etc.), the bisection width has been determined by theoretical analysis. However, in
general the Graph Bisection Problem is NP-hard [89]. The best known approximation algorithm for
the problem is presented in [69]; it achieves a poly-logarithmic approximation quality
in $O(\log^2 n)$.
In practice, the size of instances that can still be solved efficiently is rather small. To date,
optimal bisections can be computed only for small graphs with a few hundred vertices. On the
other hand, there are very efficient heuristics available for the problem [109, 134, 164, 172, 214].
Therefore, the main problem is the proof of optimality, and thus, the computation of tight lower
bounds on the bisection width of a graph.
In recent years, a number of approaches have been presented that solve the Graph Partition-
ing Problem exactly. In [72], a branch & cut algorithm for the problem is presented. The same
approach is followed in [27], whereas in [124], the Graph Partitioning Problem is tackled using
a column generation approach. The most recent and apparently also most successful approach is
presented in [133] where semi-definite programming relaxations are used as lower bounds. An
experimental comparison of our developments and the semi-definite bound in [133] is given in
Section 9.6.
9.2 Bounds on Graph Bisection
Our work is based on the lower bounds on the Graph Partitioning Problem introduced in [197].
In the following, we give a small survey on the main ideas of these bounds:
A well-known lower bound on the bisection width can be achieved by embedding a clique
with the same number of nodes $n$ into the given graph $G$ [140]. If that complete graph can
be embedded with a congestion $C$, we know that $G$ has a bisection width of at least $\frac{n^2}{4C}$. A
similar lower bound on the bisection width can be computed by solving a Multicommodity Flow
Problem: For every node in $G$, we introduce a commodity that originates at its corresponding
node. Every other node requires exactly one unit of that commodity. That way, every node has
to send one unit of its corresponding commodity to every other node. Note that we do not need
to enforce integrality constraints on the flows; relaxing them can only strengthen the bound computed.
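The reasoning behind the embedding bound can be written out in two lines. The following is a short derivation sketch for even $n$, where $\mathrm{bw}(G)$ is shorthand (introduced here) for the bisection width of $G$ and unit edge weights are assumed:

```latex
% Any balanced bisection (V_1, V_2) of G is also a bisection of the embedded
% clique K_n, which has |V_1| * |V_2| = n^2 / 4 edges across the cut (n even).
% Each of these clique edges is routed along a path in G that must use at
% least one cut edge of G, and every edge of G carries at most C such paths:
\[
  C \cdot \mathrm{cut}_G(V_1, V_2) \;\ge\; \frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4}
  \qquad\Longrightarrow\qquad
  \mathrm{bw}(G) \;\ge\; \frac{n^2}{4C}.
\]
```

The same counting argument carries over to fractional routings, which is why the flow-based bounds discussed next are valid as well.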
We can improve this bound by taking two steps of generalization: First, it can be observed
that every single-source multicommodity flow instance (with arbitrary demands and destinations)
can be used to compute a lower bound on the bisection width. The critical point is the CutFlow,
i.e. the amount of flow which can be ensured to cross every possible bisection of the graph. It is
easy to show that $\mathit{CutFlow}/C$ is always a valid lower bound on the bisection width of a graph.
Second, we do not have to select an appropriate multicommodity flow instance by ourselves:
Typically, linear programming techniques are used to solve Multicommodity Flow Problems.
In [197], it is proposed to leave the selection of an appropriate multicommodity flow instance to
the linear program by adding some variables and constraints. Two different possibilities with a
different degree of freedom for the selection of the Multicommodity Flow Problem have been
introduced: the VarMC and the MVarMC formulations. In the MVarMC formulation, every node
has the freedom to send a commodity of arbitrary size to each other node. On the other hand, in
the VarMC formulation, every node has to send a commodity of arbitrary size to every other node,
whereby each destination gets the same share. Experiments have shown that the VarMC formu-
lation gives equally good bounds on connected graphs as the MVarMC formulation. Therefore,
we develop different algorithms for the computation of the VarMC-bound.
9.3 Approximation of the VarMC-bound
Even though the continuous Multicommodity Flow Problem (MMCF) is solvable in polynomial
time, for more than a decade now researchers have tried to develop approximation schemes
for the problem. The reason for this at first surprising fact is that large multicommodity flow
instances have to be solved in many application areas and in combination with a whole variety of
discrete optimization problems such as network design or graph bisection. Standard simplex or
interior point LP-solvers, even specialized software based on primal basis partitioning [66, 67]
or Lagrangian relaxation based resource- or cost-decompositions [44, 84] are simply not fast
enough to tackle real-size instances of these problems in a reasonable amount of time. Therefore,
there is a big interest in the development of algorithms that provide near-optimal solutions more
quickly.
The first FPTAS’s for maximum multicommodity flow were based on Lagrangian relaxations
and linear programming. They have been improved and adapted to different models in a series
of papers [100, 101, 131, 135, 141, 169, 175, 198]. While all this research was built on the idea
of rerouting existing flows, the idea of augmenting flow has led to a couple of new algorithms
with improved running times [74, 75, 90, 130, 217]. Several publications also report about the
usability of approximation algorithms for Multicommodity Flow Problems in practice [2, 98,
101, 176, 184].
In this section, we develop an ε-approximation scheme for the VarMC-bound: Let $G = (V, E, u)$
denote an undirected, edge-weighted graph with $|V| = n$, $|E| = m$, with associated node-arc
incidence matrix $N \in \{-1, 0, 1\}^{n \times 2m}$, and capacities $u \in \mathbb{R}^m_{\ge 0}$. Furthermore, let $K \in \mathbb{N}$ denote
the number of commodities with source nodes $s_k \in V$ and demands $d^k \in \mathbb{R}^n$, $1 \le k \le K$, whereby
$d^k_i \ge 0$ for all $i \ne s_k$, and $\sum_i d^k_i = 0$. Finally, denote the cost coefficient of commodity $k$ by $p_k$.
Then the problem is to
\[
\begin{array}{lll}
\text{Maximize}   & \sum_k p_k \lambda_k & \\
\text{subject to} & N x^k = \lambda_k d^k & \forall\, 1 \le k \le K \qquad (1) \\
                  & \sum_k \left( x^k_{ij} + x^k_{ji} \right) \le u_{ij} & \forall\, (i,j) \in E \qquad (2) \\
                  & \lambda, x \ge 0 &
\end{array}
\]
This way to state the problem allows us directly to compute a lower bound on the bisection width
of a given graph (compare with Section 9.2): the objective maximizes the cutflow, whereby the
variables $\lambda_k$ determine the sender volume of commodity $k$, and the $x^k$ give the corresponding flow
in the network. The constraints (1) ensure that $x^k$ is a feasible flow of commodity $k$ with volume
$\lambda_k$, and restrictions (2) enforce that the capacity of the undirected edges $(i,j)$ is not violated.
Note that, to solve the problem, it is sufficient to look at the special case with $p_k = 1$ for all
$1 \le k \le K$, because for $p_k = 0$ we set $\lambda_k = 0$ and $x^k = 0$, and otherwise we can scale the demand
$d^k$ by $\frac{1}{p_k}$.
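As a small illustration of constraints (1) and (2), the sketch below builds a node-arc incidence matrix with two arcs per undirected edge and checks a candidate flow for a single commodity. The sign convention (+1 where an arc enters a node, -1 where it leaves), the dictionary-based data layout and the function names are assumptions made for this example; with several commodities, constraint (2) would sum over all of them.

```python
def incidence_matrix(n, edges):
    """Node-arc incidence matrix with arcs (i, j) and (j, i) per edge.

    Assumed convention: +1 where an arc enters a node, -1 where it leaves.
    """
    arcs = [a for (i, j) in edges for a in ((i, j), (j, i))]
    N = [[0] * len(arcs) for _ in range(n)]
    for col, (tail, head) in enumerate(arcs):
        N[tail][col] = -1
        N[head][col] = 1
    return N, arcs


def is_feasible(n, edges, u, d, x, lam, tol=1e-9):
    """Check N x = lam * d, the edge capacities, and non-negativity."""
    N, arcs = incidence_matrix(n, edges)
    flow = [x.get(a, 0.0) for a in arcs]
    balance_ok = all(
        abs(sum(N[v][c] * flow[c] for c in range(len(arcs))) - lam * d[v]) <= tol
        for v in range(n))
    capacity_ok = all(
        x.get((i, j), 0.0) + x.get((j, i), 0.0) <= u[(i, j)] + tol
        for (i, j) in edges)
    nonneg_ok = lam >= 0 and all(f >= 0 for f in flow)
    return balance_ok and capacity_ok and nonneg_ok


# Path 0 - 1 - 2 with unit capacities; node 0 sends one share to each
# other node, so d = (-2, 1, 1) sums to zero.
edges = [(0, 1), (1, 2)]
u = {(0, 1): 1.0, (1, 2): 1.0}
d = [-2.0, 1.0, 1.0]
ok = is_feasible(3, edges, u, d, {(0, 1): 1.0, (1, 2): 0.5}, lam=0.5)
too_much = is_feasible(3, edges, u, d, {(0, 1): 1.2, (1, 2): 0.6}, lam=0.6)
```

In this toy instance the edge next to the source is the bottleneck, so a volume of 0.5 is feasible while 0.6 violates the capacity of edge (0, 1).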
9.3.1 The FPTAS
We try to keep the chapter self-contained. Note, however, that the following description is
analogous to the “maximum multicommodity flow”-section in [74], which itself builds directly on
the analysis given in [90]. Our contribution here consists in the generalization of the existing
algorithms for the Maximum Multicommodity Flow Problem to an FPTAS that can handle com-
modities with more than one sink. This, of course, is essential for the computation of a lower
bound for the bisection width of a graph (see Section 9.2). At the same time, since we focus on
graph bisection here, we enable the FPTAS to handle undirected edges. However, the changes
that are necessary for that purpose are not of fundamental nature, so that the theory we present
can also be used for the development of an approximation scheme for generalized Maximum
Multicommodity Flow Problems on directed graphs with multi-sink commodities.
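For all $1 \le k \le K$, denote the set of all paths from the source $s_k$ to the sink $i \in V \setminus \{s_k\}$ by
$\pi^k_i$. Furthermore, set $\pi^k := \bigcup_{i \ne s_k} \pi^k_i$. Using variables $y^k_P$ representing the flow of commodity
$k$ along some path $P \in \pi^k$, we achieve a path-based formulation of the problem
\[
\begin{array}{lll}
\text{Maximize}   & \sum_k \lambda_k & \\
\text{subject to} & \sum_{P \in \pi^k_i} y^k_P = \lambda_k d^k_i & \forall\, 1 \le k \le K,\ i \in V \setminus \{s_k\} \\
                  & \sum_k \sum_{P \in \pi^k :\, (i,j) \in P} y^k_P \le u_{ij} & \forall\, (i,j) \in E \\
                  & \lambda, y \ge 0 &
\end{array}
\]
The dual of the previous LP can be written as
\[
\begin{array}{lll}
\text{Minimize}   & \sum_{(i,j) \in E} u_{ij} l_{ij} & \\
\text{subject to} & \sum_{(i,j) \in P} l_{ij} \ge z^k_h & \forall\, 1 \le k \le K,\ h \in V \setminus \{s_k\},\ P \in \pi^k_h \\
                  & \sum_{i \ne s_k} d^k_i z^k_i \ge 1 & \forall\, 1 \le k \le K \\
                  & l \ge 0 &
\end{array}
\]
That is, we have to assign lengths $l_{ij} \ge 0$ to each edge $(i,j) \in E$, such that $\sum_{i \ne s_k} d^k_i \,\mathrm{dist}^k_i(l) \ge 1$
for all $1 \le k \le K$, and $\sum_{(i,j) \in E} u_{ij} l_{ij}$ is minimized, whereby $\mathrm{dist}^k_i(l)$ denotes the shortest path
distance from $s_k$ to the sink $i \in V$ in $G$ under length function $l$.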
The approximation scheme that we investigate in the following is sketched in Algorithm 5.
We start with length function $l \equiv \delta$ for an appropriately defined $\delta = \delta(\varepsilon, G, d)$, and the primal
solution $x \equiv 0$. The length function $l$ is defined on the undirected edge set $E$, whereas we maintain
flow values $x^k_{ij}$ and $x^k_{ji}$ for each edge $(i,j) \in E$ and all $1 \le k \le K$. While there is still a tree $T$
along which we can route the demand of a commodity $k$ with costs less than 1, the algorithm
selects such a tree and augments flow along this tree. More precisely, the algorithm selects a
tree with approximately minimal costs up to an approximation factor of $1 + \varepsilon$. This property is
achieved by maintaining a lower bound $\hat{\alpha}$ on the current minimal routing costs of any commodity.

Algorithm 5 APPROXIMATION SCHEME
 1: $x := 0$, $l := \delta$
 2: $\hat{\alpha} := \min_k \sum_{i \ne s_k} d^k_i \delta$, $\mathit{oldSource} := -1$
 3: repeat
 4:   for all $1 \le k \le K$ do
 5:     if $\mathit{oldSource} \ne s_k$ then
 6:       $T :=$ SHORTEST_PATH_TREE($s_k$, $l$)
 7:       $\mathit{oldSource} := s_k$
 8:     while $\sum_i d^k_i \,\mathrm{dist}^k_i(l) < \min\{1, (1+\varepsilon)\hat{\alpha}\}$ do
 9:       for all $(i,j) \in T$ do
10:         $c^k_{ij} := \sum_{h \in T_j} d^k_h$
11:       $\phi := \min_{(i,j) \in T} \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}}$, $\Delta := 0$
12:       for all $(i,j) \in T$ do
13:         $\Delta_{ji} := -\min\{x^k_{ji}, \phi c^k_{ij}\}$
14:         $\Delta_{ij} := \phi c^k_{ij} + \Delta_{ji}$
15:         if $\Delta_{ij} + \Delta_{ji} > 0$ then
16:           $l_{ij} := l_{ij} \left(1 + \varepsilon \frac{\Delta_{ij} + \Delta_{ji}}{u_{ij}}\right)$
17:       $T :=$ SHORTEST_PATH_TREE($s_k$, $l$)
18:       $x := x + \Delta$
19:   $\hat{\alpha} := \hat{\alpha}(1 + \varepsilon)$
20: until $\hat{\alpha} \ge 1$
The amount of flow sent along tree $T$ is determined in the following way: Let $T_j$ denote all
nodes in the sub-tree rooted at node $j \in V$ (including $j$). For each edge $(i,j) \in T$, we
compute the congestion $c^k_{ij}$ when routing the demand of commodity $k$ along that tree, i.e., we set
$c^k_{ij} := \sum_{h \in T_j} d^k_h$. Basically, we achieve a feasible routing by scaling the flow by $\min_{(i,j) \in T} \frac{u_{ij}}{c^k_{ij}}$.
However, since we are working on an undirected network here, we would like to consider only
flows with $\min\{x^k_{ij}, x^k_{ji}\} = 0$ for all $1 \le k \le K$ and $(i,j) \in E$. When we also incorporate and
change the current flow $x^k_{ji}$ of commodity $k$ in the opposite direction, we achieve an even bigger
scaling factor of $\min_{(i,j) \in T} \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}}$. Formally, we can prove Lemma 9.1 regarding the
change $\Delta_{ij} + \Delta_{ji}$ of the current flow on edge $(i,j) \in E$ of commodity $k$.
In case of a positive flow change on an edge $(i,j) \in E$ (i.e., when $\Delta_{ij} + \Delta_{ji} > 0$), we update
the dual variables by setting $l_{ij} := \left(1 + \varepsilon \frac{\Delta_{ij} + \Delta_{ji}}{u_{ij}}\right) l_{ij}$, i.e., we increase the length of an
edge with respect to the congestion of that edge.
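One augmentation step (Lines 9 to 16 and 18 of Algorithm 5) can be sketched as follows for a single commodity. This is an illustrative reading of the pseudocode rather than the thesis implementation: the data layout (`parent` map, root-first `order`, edges keyed by `frozenset`) and the function name are assumptions, and tree edges whose subtree carries no demand are skipped here to avoid dividing by zero.

```python
def augment_along_tree(parent, order, d, x, u, l, eps):
    """One flow augmentation for a single commodity along a routing tree.

    parent[j] : predecessor i of node j in the tree (the root has no entry)
    order     : tree nodes listed root first (e.g. Dijkstra labelling order)
    d[j]      : demand of node j (>= 0 for all non-root nodes)
    x[(i, j)] : current flow of this commodity on arc (i, j)
    u, l      : capacity and length per edge, keyed by frozenset((i, j))
    Returns (phi, delta); x and l are updated in place.
    """
    # Congestion c_ij = demand in the subtree below j, computed bottom-up.
    sub = {j: d[j] for j in parent}
    for j in reversed(order):
        if j in parent and parent[j] in parent:
            sub[parent[j]] += sub[j]
    tree = [(parent[j], j) for j in parent if sub[j] > 0]

    # Scaling factor that also cancels flow in the opposite direction.
    phi = min((2 * x.get((j, i), 0.0) + u[frozenset((i, j))]) / sub[j]
              for i, j in tree)

    delta = {}
    for i, j in tree:
        e = frozenset((i, j))
        delta[(j, i)] = -min(x.get((j, i), 0.0), phi * sub[j])
        delta[(i, j)] = phi * sub[j] + delta[(j, i)]
        change = delta[(i, j)] + delta[(j, i)]
        if change > 0:
            l[e] *= 1 + eps * change / u[e]  # dual update
    for arc, dx in delta.items():
        x[arc] = x.get(arc, 0.0) + dx        # primal update
    return phi, delta


# Path tree 0 -> 1 -> 2, unit demand at nodes 1 and 2.
parent = {1: 0, 2: 1}
d = {1: 1.0, 2: 1.0}
u = {frozenset((0, 1)): 4.0, frozenset((1, 2)): 1.0}
l = {frozenset((0, 1)): 0.01, frozenset((1, 2)): 0.01}
x = {}
phi, delta = augment_along_tree(parent, [0, 1, 2], d, x, u, l, eps=0.1)
```

In this example the edge between nodes 1 and 2 is saturated exactly, while the edge between 0 and 1 receives a change of 2 against a capacity of 4, so its length grows by the smaller factor 1.05.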
Finally, the primal solution is updated by $x^k_{ij} := x^k_{ij} + \Delta_{ij}$. This setting may yield an infeasible
solution, since it may violate some capacity constraint $\sum_k \left( x^k_{ij} + x^k_{ji} \right) \le u_{ij}$ for some edge
$(i,j) \in E$. However, the mass balance constraints are still valid. This allows us, at the end of the
algorithm, to scale the final flows $x^k$ so that they build a feasible solution to the problem.
For the algorithm sketched above, we are able to prove
Theorem 9.1 Let $T_{SP} := m + n \log n$, and $S := \{ d^k_i \mid d^k_i > 0,\ 1 \le k \le K,\ i \in V \}$. An ε-approximate
VarMC-bound on the bisection width of a graph can be computed in time $O\left( \frac{m T_{SP} + |S|}{\varepsilon^2} \right)$.
Following the analysis in [74], we prove the previous theorem with the help of a sequence of
lemmata. We set $\rho := \min_{k,i} \{ d^k_i \mid d^k_i > 0 \}$, and $\sigma := \log_{1+\varepsilon} \left( \frac{1+\varepsilon}{\delta\rho} \right)$. Then,
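Lemma 9.1 In every iteration in which the current flow is changed, it holds that:
a) $\Delta_{ij} + \Delta_{ji} \le u_{ij}$ for all $(i,j) \in E$, and
b) there exists an edge $(i,j) \in E$ such that $\Delta_{ij} + \Delta_{ji} = u_{ij}$.
Proof: Let $\phi = \min_{(i,j) \in T} \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}}$.
a) For $(i,j), (j,i) \notin T$ there is no change in the flow. Let $(i,j) \in T$. In case of $x^k_{ji} = 0$, we have
$\Delta_{ij} + \Delta_{ji} = \Delta_{ij} = \phi c^k_{ij} \le \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = u_{ij}$. In case of $x^k_{ji} \ge \phi c^k_{ij}$, we have $\Delta_{ji} = -\phi c^k_{ij}$
and $\Delta_{ij} = 0$. Thus, $\Delta_{ij} + \Delta_{ji} = \Delta_{ji} = -\phi c^k_{ij} \le 0 \le u_{ij}$. In case of $0 < x^k_{ji} < \phi c^k_{ij}$, we have
$\Delta_{ij} + \Delta_{ji} = \phi c^k_{ij} - x^k_{ji} - x^k_{ji} \le \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} - 2x^k_{ji} = u_{ij}$. Finally, the case $(j,i) \in T$
is analogous to $(i,j) \in T$.
b) Consider the edge $(i,j) \in T$ with $\phi = \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}}$. In case of $x^k_{ji} = 0$, we have
$\Delta_{ij} + \Delta_{ji} = \Delta_{ij} = \phi c^k_{ij} = \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = u_{ij}$. In case of $x^k_{ji} > 0$, it holds that
$x^k_{ji} < 2x^k_{ji} + u_{ij} = \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = \phi c^k_{ij}$. Thus,
$\Delta_{ij} + \Delta_{ji} = \phi c^k_{ij} - x^k_{ji} - x^k_{ji} = \frac{2x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} - 2x^k_{ji} = u_{ij}$.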
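Lemma 9.2 The flow obtained by scaling the final flow by $\frac{1}{\sigma}$ is primal feasible.
Proof: Let $\Delta_{ij}(t)$ denote the change of the flow on edge $(i,j) \in T$ in iteration $t$. Then,
$\sum_{t \le I} \left( \Delta_{ij}(t) + \Delta_{ji}(t) \right)$ equals the flow on edge $(i,j) \in E$ after iteration $I$. It is sufficient to show
that, at the end of the algorithm, it holds that $\sum_{t \le I} \left( \Delta_{ij}(t) + \Delta_{ji}(t) \right) \le \sigma u_{ij}$.
In each iteration with $\Delta_{ij}(t) + \Delta_{ji}(t) > 0$, the length $l_{ij}$ of edge $(i,j) \in E$ increases by a factor
of $1 + \varepsilon \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}$. Denote the set of all iterations $t$ in which $\Delta_{ij}(t) + \Delta_{ji}(t) > 0$ by $\mathcal{I}$.
Then we have that
\[
  l_{ij} = \delta \prod_{t \in \mathcal{I}} \left( 1 + \varepsilon \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}} \right) \tag{9.1}
\]
With Lemma 9.1 and $1 + \varepsilon x \ge (1 + \varepsilon)^x$ for all $0 \le x \le 1$, we have that
\[
  l_{ij} \ge \delta \prod_{t \in \mathcal{I}} (1 + \varepsilon)^{\frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}}
         = \delta \, (1 + \varepsilon)^{\sum_{t \in \mathcal{I}} \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}} \tag{9.2}
\]
Thus, at the end of the algorithm we have
\[
  \sum_{t \in \mathcal{I}} \left( \Delta_{ij}(t) + \Delta_{ji}(t) \right) \le u_{ij} \log_{1+\varepsilon} \frac{l_{ij}}{\delta} \tag{9.3}
\]
Since the left hand side is a valid upper bound for the final congestion on edge $(i,j) \in E$, it
remains to show that $l_{ij} \le \frac{1+\varepsilon}{\rho}$. Assume the opposite. With Lemma 9.1, $l_{ij}$ always increases
by a factor of at most $1 + \varepsilon$. Thus, before the last change it must hold that $l_{ij} > \frac{1}{\rho}$. Consider
the iteration in which $l_{ij}$ increases the last time during the increase of the flow regarding some
commodity $1 \le k \le K$. We know that then $\sum_{h \ne s_k} d^k_h \,\mathrm{dist}^k_h(l) < 1$. However, since $l_{ij} > \frac{1}{\rho}$ and
$(i,j)$ is an edge on the shortest path from $s_k$ to one of its sinks, we have that
$\sum_{h :\, d^k_h \in S} \mathrm{dist}^k_h(l) > \frac{1}{\rho}$. Thus,
\[
  \sum_{h \ne s_k} d^k_h \,\mathrm{dist}^k_h(l) \;\ge\; \rho \sum_{h :\, d^k_h \in S} \mathrm{dist}^k_h(l) \;>\; \frac{\rho}{\rho} = 1 \tag{9.4}
\]
which is a contradiction.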
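The elementary inequality used between (9.1) and (9.2) follows from convexity. A short justification, stated here for completeness and using only standard facts:

```latex
% f(x) = (1 + \varepsilon)^x is convex in x, with f(0) = 1 and f(1) = 1 + \varepsilon.
% On [0, 1] a convex function lies below the chord through its endpoints:
\[
  (1+\varepsilon)^x \;\le\; (1 - x)\, f(0) + x\, f(1) \;=\; 1 + \varepsilon x
  \qquad \text{for all } 0 \le x \le 1 .
\]
% Lemma 9.1 a) guarantees that x = (\Delta_{ij}(t) + \Delta_{ji}(t)) / u_{ij} lies in (0, 1]
% for every iteration in which the length of the edge is increased, so the
% inequality can be applied to each factor of the product.
```

Taking logarithms to the base $1 + \varepsilon$ on both sides of the resulting product bound then gives the sum bound on the accumulated flow changes.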
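Lemma 9.3 Let $\tau := \frac{1+\varepsilon}{\rho}$, and denote the maximum number of edges in a simple path from
a source $s_k$ to one of its sinks $i \in V$ by $L$. When setting $\delta := \tau \big/ \left( \tau L \max_k \sum_{i \ne s_k} d^k_i \right)^{1/\varepsilon}$, the final flow
scaled by $\frac{1}{\sigma}$ is optimal with a relative error of at most $3\varepsilon$.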
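Proof: We prove the desired accuracy of the solution computed by comparing it against the
objective value $D$ of a dual feasible solution, which our algorithm produces as a byproduct, and
which gives us a valid upper bound on the primal optimal solution value $Z_{opt}$.
For any given length function $l : E \to \mathbb{R}_{\ge 0}$, let $\alpha$ denote the minimal routing costs of all
commodities, i.e., $\alpha := \alpha(l) := \min_k \sum_{i \ne s_k} d^k_i \,\mathrm{dist}^k_i(l)$. Furthermore, denote the dual objective value
corresponding to the current choice of $l$ by $D(l) := \sum_{(i,j) \in E} u_{ij} l_{ij}$. Then the optimal dual objective
value is $Z_{opt} = \min_l \frac{D(l)}{\alpha(l)}$.
Consider iteration $t$, and let $l(t)$, $k$, $\phi$, $c$ and $T$ denote the current choice of the length function,
commodity, scaling factor, congestion vector and routing tree, respectively. For ease of notation,
we set $\alpha(t) := \alpha(l(t))$, $D(t) := D(l(t))$, and $\mathrm{dist}^k_i(t) := \mathrm{dist}^k_i(l(t))$. For any edge $(i,j) \in E$, define
$\Gamma_{ij} := \max\{ \Delta_{ij}(t) + \Delta_{ji}(t), 0 \}$. Note that, in every iteration $t$, it holds that $\Gamma_{ij} \le \phi c^k_{ij}$. Then,
\[
\begin{aligned}
  D(t) &= D(t-1) + \varepsilon \sum_{(i,j) \in T} \Gamma_{ij} \, l_{ij}(t-1) \\
       &\le D(t-1) + \varepsilon \sum_{(i,j) \in T} \phi c^k_{ij} \, l_{ij}(t-1) \\
       &= D(t-1) + \varepsilon \phi \sum_{h \ne s_k} d^k_h \sum_{(i,j) \in T;\ h \in T_j} l_{ij}(t-1) \\
       &= D(t-1) + \varepsilon \phi \sum_{h \ne s_k} d^k_h \,\mathrm{dist}^k_h(t-1) \\
       &\le D(t-1) + \varepsilon \phi (1 + \varepsilon) \, \alpha(t-1)
\end{aligned}
\]
Denote the primal objective value $\sum_k \lambda_k$ (corresponding to the possibly infeasible flows
$x^k$) after the iteration $t$ by $Z(t)$ ($Z(0) = 0$). Obviously, it holds that $Z(t) = Z(t-1) + \phi$. Thus,
$D(t) \le D(t-1) + \varepsilon (1 + \varepsilon) \left( Z(t) - Z(t-1) \right) \alpha(t-1)$, and therefore
\[
  D(t) \le D(0) + \varepsilon (1 + \varepsilon) \sum_{1 \le h \le t} \left( Z(h) - Z(h-1) \right) \alpha(h-1) \tag{9.5}
\]
Consider the length function $l(t) - l(0) = l(t) - \delta$. Then, for the dual objective, we have
$D(l(t) - l(0)) = D(t) - D(0)$, and for the minimal routing costs, we get
$\alpha(l(t) - l(0)) \ge \alpha(t) - L\delta \max_k \sum_{i \ne s_k} d^k_i$,
where $L$ denotes the maximum number of edges on a simple path from a source $s_k$ to one of its
sinks $i \in V$. Hence,
\[
  Z_{opt} \le \frac{D(l(t) - l(0))}{\alpha(l(t) - l(0))} \le \frac{D(t) - D(0)}{\alpha(t) - L\delta \max_k \sum_{i \ne s_k} d^k_i} \tag{9.6}
\]
Thus,
\[
  \alpha(t) \le L\delta \max_k \sum_{i \ne s_k} d^k_i + \frac{\varepsilon (1 + \varepsilon)}{Z_{opt}} \sum_{1 \le h \le t} \left( Z(h) - Z(h-1) \right) \alpha(h-1) \tag{9.7}
\]
Denote the right hand side of Inequality 9.7 by $A(t)$. Then,
\[
\begin{aligned}
  A(t) &= A(t-1) + \frac{\varepsilon (1 + \varepsilon)}{Z_{opt}} \left( Z(t) - Z(t-1) \right) \alpha(t-1) \\
       &\le A(t-1) \left( 1 + \frac{\varepsilon (1 + \varepsilon)}{Z_{opt}} \left( Z(t) - Z(t-1) \right) \right) \\
       &\le A(t-1) \, e^{\frac{\varepsilon (1 + \varepsilon)}{Z_{opt}} \left( Z(t) - Z(t-1) \right)} \\
       &\le A(0) \, e^{\varepsilon (1 + \varepsilon) \frac{Z(t)}{Z_{opt}}}
\end{aligned}
\]
because of $1 + x \le e^x$ for all $x \in \mathbb{R}$ and $Z(0) = 0$. Now consider the last iteration $t$. Then, $\alpha(t) \ge 1$.
With $A(0) = L\delta \max_k \sum_{i \ne s_k} d^k_i$, we get
\[
  1 \le \alpha(t) \le A(t) \le L\delta \max_k \sum_{i \ne s_k} d^k_i \; e^{\varepsilon (1 + \varepsilon) \frac{Z(t)}{Z_{opt}}} \tag{9.8}
\]
Thus,
\[
  \frac{Z(t)}{Z_{opt}} \, \varepsilon (1 + \varepsilon) \ge \ln \frac{1}{L\delta \max_k \sum_{i \ne s_k} d^k_i} \tag{9.9}
\]
When setting $\delta$ according to Lemma 9.3, a simple calculation shows that for the scaled objective
value we get
\[
  \frac{Z(t)}{\sigma Z_{opt}} \ge \frac{(1 - \varepsilon) \ln(1 + \varepsilon)}{\varepsilon (1 + \varepsilon)} \ge 1 - 3\varepsilon \tag{9.10}
\]
for all $\varepsilon < 1$.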
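The "simple calculation" at the end of the proof can be spelled out. The sketch below abbreviates $M := \max_k \sum_{i \ne s_k} d^k_i$ (a shorthand introduced here), assumes $0 < \varepsilon < 1$ and $\tau L M > 1$, and takes the bound preceding (9.10) in the form $Z(t)/Z_{opt} \ge \ln\bigl(1/(L\delta M)\bigr) / \bigl(\varepsilon(1+\varepsilon)\bigr)$:

```latex
% With \delta = \tau / (\tau L M)^{1/\varepsilon} and \tau = (1+\varepsilon)/\rho:
\[
  \sigma = \log_{1+\varepsilon} \frac{\tau}{\delta}
         = \frac{\ln (\tau L M)}{\varepsilon \ln(1+\varepsilon)},
  \qquad
  \ln \frac{1}{L \delta M}
         = \ln \frac{(\tau L M)^{1/\varepsilon}}{\tau L M}
         = \frac{1-\varepsilon}{\varepsilon} \ln (\tau L M).
\]
% Dividing the bound on Z(t)/Z_opt by \sigma, the factor \ln(\tau L M) cancels:
\[
  \frac{Z(t)}{\sigma Z_{opt}}
  \;\ge\; \frac{(1-\varepsilon) \ln(\tau L M)}{\varepsilon^2 (1+\varepsilon)}
          \cdot \frac{\varepsilon \ln(1+\varepsilon)}{\ln(\tau L M)}
  \;=\; \frac{(1-\varepsilon) \ln(1+\varepsilon)}{\varepsilon (1+\varepsilon)} .
\]
% Using \ln(1+\varepsilon) \ge \varepsilon - \varepsilon^2/2 and 1/(1+\varepsilon) \ge 1 - \varepsilon:
\[
  \frac{(1-\varepsilon) \ln(1+\varepsilon)}{\varepsilon (1+\varepsilon)}
  \;\ge\; (1-\varepsilon)^2 \left( 1 - \tfrac{\varepsilon}{2} \right)
  \;=\; 1 - \tfrac{5}{2}\varepsilon + 2\varepsilon^2 - \tfrac{1}{2}\varepsilon^3
  \;\ge\; 1 - 3\varepsilon .
\]
```

The last step holds because $\tfrac{1}{2}\varepsilon + 2\varepsilon^2 - \tfrac{1}{2}\varepsilon^3 \ge 0$ for $0 \le \varepsilon \le 1$.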
Proof of Theorem 9.1: In Lemmas 9.2 and 9.3, we have shown that the algorithm returns a
feasible flow of the desired accuracy. It remains to show that the running time is polynomial.
The running time is dominated by the operations in Lines 6, 8, 10 and 17. Setting ν
minki
skdk
iand µ
log1
ε

1
ε
νδ

σ, the following holds:
6) Since ˆ
αis initially set to νδ, and ˆ
α
1
εat the end of the computation, we know that
the outer while loop is executed at most log1
ε

1
ε
νδ

µtimes. If we order the
commodities according to the corresponding source nodes sk, we have to compute Line 6
at most times. Thus, this line adds a workload of O
nσTSP
.
8) Obviously, every execution of Line 8 may require time O
n
. However, the simple estima-
tion of a cost of kn per outer while iteration can be strengthened by observing that in each
such iteration all flow destinations will be investigated exactly once. Thus, for every outer
while iteration this line adds a workload of O
S
. For every inner while iteration the work
to be done in Line 8 is dominated by Line 15. Thus, it is sufficient to add a workload of
O
S
σ
for Line 8.
9,10) At first, the computation of the current congestion ck
ij on each edge
i
j
Tseems to
require a running time quadratic in n. However, when computing the shortest path tree
T, we get a topological ordering of Tfor free (simply by using the ordering in which
Dijkstra’s algorithm labels the nodes). Using this topological ordering, we can compute
the values h
Tjdk
hbottom up, requiring a total running time in O
n
. Thus, Lines 9 and 10
are dominated by Line 17.
17) Whenever a shortest-path tree computation results in a flow change along the computed tree, we know from Lemma 9.1 that the length of at least one edge increases by a factor of $1+\varepsilon$. At the beginning, all lengths are equal to $\delta$, and the proof of Lemma 9.2 shows that, at the end of the algorithm, it holds that $l_{ij} \le \frac{1+\varepsilon}{\rho}$. Therefore, Line 17 is executed at most $m \log_{1+\varepsilon} \frac{1+\varepsilon}{\delta\rho} \le m\sigma$ times. Thus, Line 17 adds a workload of $O(m \sigma T_{SP})$.
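Putting the results together and assuming $m \ge n$, we get a running time in

$$O(n \sigma T_{SP} + |S| \sigma + m \sigma T_{SP}) = O\bigl((m T_{SP} + |S|)\, \sigma\bigr). \tag{9.11}$$

Thus, when we set $\delta$ according to Lemma 9.3, we achieve a running time in

$$O\Bigl((m T_{SP} + |S|)\, \frac{\log n + \log\bigl(\max_{k,\, i \neq s_k} d^k_i / \rho\bigr)}{\varepsilon^2}\Bigr) = O\Bigl(\frac{m T_{SP} + |S|}{\varepsilon^2}\Bigr). \tag{9.12}$$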
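The bottom-up computation described for Lines 9 and 10 can be sketched as follows. This is only an illustration under assumptions: the names `parent`, `label_order` and `demand` are hypothetical, and the tree is assumed to be given by predecessor pointers together with the order in which Dijkstra's algorithm labelled the nodes.

```python
def subtree_demands(parent, label_order, demand):
    """Sum the demands of all nodes in the subtree below each node.

    parent[j]   -- predecessor of node j in the tree (None for the root)
    label_order -- nodes in the order in which Dijkstra labelled them
    demand[j]   -- demand of the current commodity at node j
    The returned load[j] is the amount routed over the tree edge (parent[j], j).
    """
    load = dict(demand)
    # A node is labelled after its predecessor, so the reversed order visits
    # every node only after all nodes of its subtree have been processed.
    for j in reversed(label_order):
        if parent[j] is not None:
            load[parent[j]] += load[j]
    return load


# Small tree rooted at node 0 with edges 0 -> 1, 0 -> 2, 1 -> 3.
parent = {0: None, 1: 0, 2: 0, 3: 1}
label_order = [0, 1, 2, 3]
demand = {0: 0.0, 1: 1.0, 2: 2.0, 3: 4.0}
load = subtree_demands(parent, label_order, demand)
```

Each node is touched once, which is the linear-time behaviour the proof relies on.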
9.3.2 Implementation Details
The preceding theoretical work gives us an approximation scheme for a lower bound on the bisection width of a graph. Note that, with respect to Section 9.2, we know that $|S| \in O(n^2)$.¹ Therefore, for any connected graph we can achieve an $\varepsilon$-approximation of the VarMC-bound on the bisection width in time $O(m^2 / \varepsilon^2)$.
Even though our theoretical work does not give any further guarantees, the approximation scheme presented can be improved in practice. Most importantly, the final scaling factor $\sigma$ used to make the final flow primal feasible should not be determined by the formula given in Lemma 9.2. Instead, we can easily determine a scaling factor $\hat{\sigma} \le \sigma$ by computing the congestion $c_{ij}$ of the final flow on each edge $(i,j) \in E$ and setting $\hat{\sigma} := \max_{(i,j) \in E} c_{ij} / u_{ij}$. This improvement does not affect the overall running time of the algorithm.
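A minimal sketch of this scaling step is given below. It assumes that the total flow per edge is stored in a dictionary and that at least one edge carries flow; the names `flow`, `capacity` and `scale_to_feasibility` are illustrative and not taken from the thesis.

```python
def scale_to_feasibility(flow, capacity):
    """Scale a flow so that no edge capacity is exceeded.

    flow[e]     -- total flow (congestion c_ij) on edge e
    capacity[e] -- capacity u_ij of edge e
    Assumes that at least one edge carries a positive amount of flow.
    """
    # Largest relative overload over all edges.
    sigma_hat = max(flow[e] / capacity[e] for e in capacity)
    scaled = {e: flow[e] / sigma_hat for e in flow}
    return sigma_hat, scaled


flow = {(0, 1): 6.0, (1, 2): 3.0}
capacity = {(0, 1): 2.0, (1, 2): 3.0}
sigma_hat, scaled = scale_to_feasibility(flow, capacity)
```

After scaling, the most overloaded edge is exactly saturated, so no smaller uniform factor would make the flow feasible.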
In practice, we may consider doing even more and applying an idea that we call enhanced scaling: before scaling the final flow, at the end of the algorithm, for each commodity $k$ we obtain a flow $x^k$ and possibly also a scalar $\lambda_k \ge 0$ (determined by $x^k$ and the source demand $d^k_{s_k}$). In order to construct a feasible flow, instead of scaling all $x^k$ equally, we could also set up another optimization problem to find scalars $\xi_k$ that solve the following LP:

$$\begin{aligned}
\text{Maximize } & \sum_k \lambda_k \xi_k \\
\text{subject to } & \sum_k \xi_k \bigl(x^k_{ij} + x^k_{ji}\bigr) \le u_{ij} && \forall\, (i,j) \in E \\
& \xi \ge 0
\end{aligned}$$

That way, the final bound obtained can be improved in practice. However, this gain has to be paid for by an additional computational effort that, in theory, dominates the overall running time. Nevertheless, as we shall see in Section 9.6, when solving an instance of the Graph Bisection Problem, the effort is worthwhile.
9.4 Cost-Decomposition Approach
Another way of computing the VarMC-bound on the bisection width is to use cost-based decomposition techniques that were originally developed for the Min-Cost Multicommodity Flow Problem. The idea is to relax the capacity constraints (another term used in the literature is bundle constraints) in order to decompose the problem into a set of Shortest Path Problems.
Recall from Section 9.2 that we need to compute flows that guarantee a certain cutflow $\hat{F}$ while keeping the maximum congestion on any edge within a given limit $\hat{C}$. More formally, we need to find flows $x^k$ such that

$$\begin{aligned}
& N x^k = \lambda_k d^k && \forall\, 1 \le k \le K \\
& \sum_k \bigl(x^k_{ij} + x^k_{ji}\bigr) \le u_{ij}\, \hat{C} && \forall\, (i,j) \in E \\
& \sum_k \lambda_k \ge \hat{F} \\
& \lambda, x \ge 0
\end{aligned} \tag{LP 1}$$

¹ Without going into details here, we would like to add that it even holds that $|S| \le n^2$, and this remains true even when new commodities are added when branching.
We will investigate a transformation of this problem into an optimization problem, namely by trying to minimize the congestion while guaranteeing that the required cutflow is achieved. We develop an approach to that optimization problem that integrates column generation and Lagrangian relaxation.
9.4.1 Column Generation
The path-based formulation in Section 9.3.1 could be used as the basis of a column generation approach. However, given the large number of source-sink pairs that we are facing when computing the VarMC-bound on the bisection width of a given graph (it is in $\Theta(n^2)$), we would have to cope with an LP with extremely many constraints. For example, for the DeBruijn 9, the simplex tableau would have more than $2^{18}$ rows. In contrast to the number of columns, which can be controlled first by generating only columns with negative reduced costs and second by successive matrix compressions, such a huge number of rows cannot be handled efficiently. Thus, in practice we cannot afford to generate columns that represent paths in the graph, even though this would be advantageous with respect to the total number of columns that have to be generated in order to achieve a near-optimal solution. For the same reason, a master problem consisting of key paths and cycles as proposed in [12] cannot be handled efficiently for our application. Thus, we will use a master problem that is based on trees.
Let $M_k := \{T^k_g \mid 1 \le g \le G_k\}$ denote the set of all trees rooted at $s_k$ and routing the demand of commodity $k$ to its sinks (where $|M_k| = G_k$ denotes the number of all such tree routings). Define $M := \bigcup_k M_k$, and let $c^k_{g,j} := \sum_{h \in T^k_{g,j}} d^k_h$ denote the congestion on an edge $(i,j) \in T^k_g$ (i.e., $i$ is the unique predecessor of $j$ in $T^k_g$) for all $1 \le k \le K$ and $T^k_g \in M_k$. Then the problem can be written as

$$\begin{aligned}
\text{Minimize } & LP_C = C \\
\text{subject to } & \sum_{k \le K} \sum_{g \le G_k} \xi_g\, c^k_{g,j} \le u_{ij}\, C && \forall\, (i,j) \in E \\
& \sum_{k \le K} \sum_{g \le G_k} \xi_g \ge \hat{F} \\
& \xi \ge 0
\end{aligned}$$

Starting with a subset $\hat{M} \subseteq M$, we solve the reduced master problem. Using dual variables $r_{ij} \ge 0$ for the capacity constraints and $f \ge 0$ for the cut-flow restriction, we generate new columns with negative reduced costs by solving the sub-problem

$$\begin{aligned}
\text{Minimize } & \sum_{(i,j) \in E} \bigl(x^k_{ij} + x^k_{ji}\bigr)\, r_{ij} - f \\
\text{subject to } & N x^k = d^k && \forall\, 1 \le k \le K \\
& x \ge 0
\end{aligned}$$

Note that the sub-problem decomposes into $K$ single-source Shortest Path Problems with non-negative edge costs. We can prune the search if $LP_C < \hat{C}$.
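The pricing step for a single commodity can be sketched as follows. This is an illustration under assumptions rather than the thesis implementation: the graph is undirected and stored as adjacency lists, the dual lengths `r` are keyed by `frozenset` edges, and the function name `price_tree` is hypothetical.

```python
import heapq


def price_tree(adj, r, source, demand, f):
    """Shortest path tree under dual lengths r and its reduced cost.

    adj[i]   -- neighbours of node i
    r[{i,j}] -- non-negative dual value of the capacity constraint of edge {i,j}
    demand   -- demand of the commodity at each sink
    f        -- dual value of the cut-flow restriction
    """
    dist, parent, order, done = {source: 0.0}, {source: None}, [], set()
    heap = [(0.0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if i in done:
            continue  # outdated heap entry
        done.add(i)
        order.append(i)
        for j in adj[i]:
            nd = d + r[frozenset((i, j))]
            if j not in dist or nd < dist[j]:
                dist[j], parent[j] = nd, i
                heapq.heappush(heap, (nd, j))
    # Aggregate the demands bottom up along the labelling order.
    load = {j: demand.get(j, 0.0) for j in order}
    for j in reversed(order):
        if parent[j] is not None:
            load[parent[j]] += load[j]
    cost = sum(r[frozenset((parent[j], j))] * load[j]
               for j in order if parent[j] is not None)
    return cost - f, parent, load


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
r = {frozenset((0, 1)): 1.0, frozenset((1, 2)): 1.0, frozenset((0, 2)): 3.0}
reduced_cost, parent, load = price_tree(adj, r, 0, {1: 1.0, 2: 2.0}, f=6.0)
```

In this toy instance the tree routes both demands via node 1, and the resulting column would enter the master problem because its reduced cost is negative.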
The process of generating columns and solving the master problem is iterated until we cannot
compute trees with associated negative reduced costs anymore or until a master iteration limit is
reached. The number of columns in the master matrix is controlled by a frequent compressing
step that reduces the number of columns with respect to the current associated reduced costs.
However, our experiments showed that the compression must not be carried out too aggressively,
because we need a fairly large number of columns in the master matrix in order to keep the total
number of master iterations within reasonable limits. This is a clear drawback of the tree-based
formulation of the master problem that results in a relatively high solution time for the master
problem.
Thus, we try to keep the number of master iterations small by adding a whole set of columns
between two master problem solutions. The only question that must be answered is how to
generate meaningful new columns without the help of new dual variables. We use Lagrangian
relaxation for this purpose, first on a min-congestion formulation of the problem, and second, on
a max-cutflow representation. The idea to use Lagrangian relaxation to generate new columns is
motivated by the fact that the optimal Lagrangian multipliers in the min-congestion formulation
are also optimal dual values for the column generation procedure and vice versa.
The whole procedure works as follows: we start a subgradient optimization of the Lagrangian dual and achieve an upper bound for the cutflow or a lower bound on the congestion, respectively. At the same time, we feed the tree-flows computed in the successive Lagrangian sub-problems into the matrix of the master problem. If the upper bound on the maximum cutflow is smaller than $\hat{F}$, or if the lower bound on the minimum congestion is greater than $\hat{C}$, respectively, we have proven that the VarMC-bound does not allow us to prune the current search node, and we can branch right away. Otherwise, we solve the master problem and achieve a feasible flow that yields an upper bound on the congestion. If we achieve a flow with an associated congestion smaller than $\hat{C}$, we can prune the current search node. Otherwise, we restart the Lagrangian sub-routine with the current dual values again. This process is iterated until we find optimal flows or an iteration limit is reached.
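The control flow of this procedure can be sketched as below. The two callables `run_lagrangian` and `solve_master` are hypothetical placeholders for the subgradient phase and the master LP, and returning "branch" once the iteration limit is hit is an assumption about how an undecided node is treated.

```python
def bound_node(run_lagrangian, solve_master, F_hat, C_hat,
               use_cutflow=True, max_rounds=10):
    """Decide whether a search node can be pruned or must be branched.

    run_lagrangian(duals) -> (bound, trees): upper bound on the cutflow
        (or lower bound on the congestion) plus the trees found on the way.
    solve_master(trees)   -> (congestion, duals): congestion of a feasible
        flow and the dual values of the reduced master problem.
    """
    duals = None
    for _ in range(max_rounds):
        bound, trees = run_lagrangian(duals)
        if use_cutflow and bound < F_hat:
            return "branch"  # required cutflow cannot be reached
        if not use_cutflow and bound > C_hat:
            return "branch"  # congestion limit cannot be met
        congestion, duals = solve_master(trees)
        if congestion < C_hat:
            return "prune"
    return "branch"  # undecided within the iteration limit


# Stub sub-routines that stand in for the real optimization steps.
decision_a = bound_node(lambda d: (5.0, ["tree"]), lambda t: (0.9, [1.0]),
                        F_hat=4.0, C_hat=1.0)
decision_b = bound_node(lambda d: (3.0, ["tree"]), lambda t: (0.9, [1.0]),
                        F_hat=4.0, C_hat=1.0)
```

Passing the dual values back into the next Lagrangian phase mirrors the restart described above.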
9.4.2 Lagrangian Relaxation Based Column Generation
To complete the description, we briefly present in more detail the two Lagrangian relaxations that we consider for generating columns in a column generation framework.

By relaxing the capacity constraints in LP 1 using Lagrangian multipliers $r_{ij} \ge 0$, we get a max-cutflow formulation

$$\begin{aligned}
\text{Maximize } & \sum_k \Bigl(\lambda_k - \sum_{(i,j) \in E} \bigl(x^k_{ij} + x^k_{ji}\bigr)\, r_{ij}\Bigr) + \hat{C}\, r^T u \\
\text{subject to } & N x^k = d^k && \forall\, 1 \le k \le K \\
& x \ge 0
\end{aligned}$$

Or, we can aim at a min-congestion formulation

$$\begin{aligned}
\text{Minimize } & \sum_k \sum_{(i,j) \in E} \bigl(x^k_{ij} + x^k_{ji}\bigr)\, r_{ij} + C\, \bigl(1 - r^T u\bigr) \\
\text{subject to } & N x^k = d^k && \forall\, 1 \le k \le K \\
& \sum_k \lambda_k \ge \hat{F} \\
& x \ge 0
\end{aligned}$$

In both cases, we need to solve $K$ Single-Source Shortest Path Problems again. Thus, both formulations allow us to use the shortest path trees computed in the Lagrangian sub-problems to generate columns for the master problem. Note that both Lagrangian sub-problems may be unbounded. This problem is overcome by setting upper bounds on $\lambda_k$ (for example the out-capacity of $s_k$) or on $C$ (for example $1.1\, \hat{C}$), respectively.
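To see where the terms $\hat{C}\, r^T u$ and $C\,(1 - r^T u)$ come from, the relaxation step can be written out. The following is a sketch that only tracks the objective; it assumes that the capacity constraints of LP 1 are moved into the objective with multipliers $r_{ij} \ge 0$ and that all remaining constraints stay in place.

```latex
\begin{align*}
\text{max-cutflow:}\quad
  & \sum_k \lambda_k
    + \sum_{(i,j) \in E} r_{ij} \Bigl( u_{ij}\,\hat{C} - \sum_k \bigl(x^k_{ij} + x^k_{ji}\bigr) \Bigr) \\
  &= \sum_k \Bigl( \lambda_k - \sum_{(i,j) \in E} \bigl(x^k_{ij} + x^k_{ji}\bigr)\, r_{ij} \Bigr)
    + \hat{C}\, r^T u, \\[1ex]
\text{min-congestion:}\quad
  & C + \sum_{(i,j) \in E} r_{ij} \Bigl( \sum_k \bigl(x^k_{ij} + x^k_{ji}\bigr) - u_{ij}\, C \Bigr) \\
  &= \sum_k \sum_{(i,j) \in E} \bigl(x^k_{ij} + x^k_{ji}\bigr)\, r_{ij}
    + C\, \bigl(1 - r^T u\bigr).
\end{align*}
```

In the second case the coefficient of $C$ is negative whenever $r^T u > 1$, which is one way to see why an upper bound on $C$ is needed to keep the sub-problem bounded.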
The update of the Lagrangian multipliers can be done by different subgradient algorithms using different formulas for the computation of the new search direction $d^t$. Let $s^t$ denote a subgradient in iteration $t$. We can set $d^t := s^t$ (pure subgradient); or $d^t := \alpha d^{t-1} + s^t$, where $\alpha$ is a fixed value between 0 and 1 (so-called Crowder rule [46]); or we may set $d^t := \alpha_t d^{t-1} + s^t$, with $\alpha_t := \|s^t\| / \|d^{t-1}\|$ if $(s^t)^T d^{t-1} < 0$, and $\alpha_t := 0$ otherwise (modified Camerini-Fratta-Maffioli rule [28]); another possibility is to set $d^t := \alpha d^{t-1} + (1 - \alpha)\, s^t$, where again $\alpha$ is a fixed value between 0 and 1 (so-called volume algorithm [11]). All variants will be evaluated in the numerical section.
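The four direction rules can be put side by side in a few lines. This is a sketch under assumptions: vectors are plain Python lists, the rule names are illustrative labels, and the step ratio used for the modified Camerini-Fratta-Maffioli rule, $\|s^t\| / \|d^{t-1}\|$, is one reading of the formula above.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def search_direction(rule, s, d_prev, alpha=0.5):
    """New search direction from subgradient s and previous direction d_prev."""
    if rule == "pure" or d_prev is None:
        return list(s)
    if rule == "crowder":
        return [alpha * dp + sp for dp, sp in zip(d_prev, s)]
    if rule == "cfm":
        # Deflect only if the subgradient points against the old direction;
        # in that case d_prev is non-zero, so the division is safe.
        a = norm(s) / norm(d_prev) if dot(s, d_prev) < 0 else 0.0
        return [a * dp + sp for dp, sp in zip(d_prev, s)]
    if rule == "volume":
        return [alpha * dp + (1 - alpha) * sp for dp, sp in zip(d_prev, s)]
    raise ValueError("unknown rule: " + rule)


s, d_prev = [3.0, 4.0], [-5.0, 0.0]
directions = {rule: search_direction(rule, s, d_prev)
              for rule in ("pure", "crowder", "cfm", "volume")}
```

All rules except the pure subgradient keep part of the search history, which is the difference the comparison in the numerical section is about.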
Interestingly, there are some similarities to the approximation scheme presented in Sec-
tion 9.3. In both cases, we compute a sequence of upper and lower bounds. In the approxi-
mation scheme as well as in column generation and Lagrangian relaxation we compute shortest
path trees with respect to some changing length function. The only difference is how we change
that length function. In the approximation scheme, it is increased exponentially with respect to
the current congestion on an edge. In the subgradient algorithm for Lagrangian relaxation, it is
changed with respect to search directions that reflect a current subgradient and possibly parts of
the search history. In column generation, new lengths are simply set to the current dual values in
the master problem. Thus, we consider an experimental comparison of these different strategies
of interest.
9.5 A Branch & Bound Algorithm
Our main goal is the computation of exact solutions for Graph Bisection Problems. Therefore, we construct a branch & bound algorithm using the described VarMC-bound as lower bound for the problems. A detailed description of the branch & bound implementation can be found in [197]. In the following, we give a brief survey of the main ideas:
First, we heuristically compute a graph bisection using PARTY [172]. Since the initial solution obtained is optimal in most cases, usually we only need to prove optimality. A pure depth-first search tree traversal is sufficient for that purpose. The branching is done on the decision whether two specific vertices $(v, w)$ stay in the same partition (join) or are separated (split). A join is performed by adapting the graph, merging these two vertices into one vertex. A split is performed by introducing an additional commodity from vertex $v$ to vertex $w$ whose entire scaled amount is known to cross the cut. Thus, it can be added to the CutFlow completely. The selection of the pair $(v, w)$ for the next branching is done with the help of an upper bound on the lower bound. In addition to this idea described in [197], the selection is restricted to pairs $(v, w)$ (if any) where one node (say, $v$) has been split from some other node $x \neq w$ before. Then the split of $(v, w)$ implies a join of $(x, w)$, of course.
Problem reduction is done by improving the so-called "Forcing Moves" strategy described in [197], Lemma 2. In the original version, only adjacent vertices $(v, w)$ were considered. Now, we look at a residual graph with edge capacities that equal the amount of capacity which is not used by a given VarMC solution. Two vertices $v$ and $w$ can be joined if the maximal flow in the residual graph exceeds a specific value.
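The join operation used in branching can be illustrated with a small sketch. It assumes an undirected graph stored as a dictionary from `frozenset` edges to weights; the function name `join` and this representation are illustrative choices, not the data structures of the implementation in [197].

```python
def join(weights, v, w):
    """Merge vertex w into vertex v in an undirected, edge-weighted graph.

    weights maps frozenset({a, b}) to the weight of edge {a, b}.
    Parallel edges created by the merge are combined by adding their weights.
    """
    merged = {}
    for edge, weight in weights.items():
        a, b = tuple(edge)
        a = v if a == w else a
        b = v if b == w else b
        if a == b:
            continue  # the edge {v, w} lies inside the merged vertex
        key = frozenset((a, b))
        merged[key] = merged.get(key, 0) + weight
    return merged


weights = {frozenset((0, 1)): 1, frozenset((0, 2)): 2,
           frozenset((1, 2)): 3, frozenset((2, 3)): 4}
joined = join(weights, 0, 1)
```

The edge between the two joined vertices disappears because it can no longer be cut once both vertices are forced into the same partition.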
9.6 Numerical Results
In this section, we present the results of our computational experiments. Before comparing the
algorithms with each other and with the semi-definite bound developed in [133], we first present
some experiments regarding the effects of different parameter settings for the FPTAS in Sec-
tion 9.3 and the cost-decomposition approach in Section 9.4. The following experiments were
executed on systems with INTEL Pentium-III, 850 MHz, and 512 MB memory. To show the
performance on different kinds of graphs, we use four different sets of 20 randomly generated
graphs: The set RandPlan contains random maximal planar graphs with 100 vertices; the graphs
are generated using the same principle as it is used in LEDA [153]. Benchmark set RandReg con-
sists of random regular graphs with 100 vertices and degree four; for its generation, the algorithm
from Steger and Wormald [206] is used. The set Random contains graphs with 44 vertices where every pair $(v, w)$ is adjacent with probability 0.2. Finally, the set RandW consists of complete graphs with 24 vertices where every edge has a random weight in the interval $[0, 99]$.
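Instances in the style of the Random and RandW sets can be generated with a few lines. This is only a sketch: the seeds, the function names, and the choice of integer weights drawn uniformly from 0 to 99 are assumptions, and the generators actually used for the experiments may differ.

```python
import itertools
import random


def random_graph(n=44, p=0.2, seed=0):
    """Edge list of a graph in which every pair is adjacent with probability p."""
    rng = random.Random(seed)
    return [(v, w) for v, w in itertools.combinations(range(n), 2)
            if rng.random() < p]


def random_weighted_complete_graph(n=24, max_weight=99, seed=0):
    """Complete graph with a random integer weight on every edge."""
    rng = random.Random(seed)
    return {(v, w): rng.randint(0, max_weight)
            for v, w in itertools.combinations(range(n), 2)}


edges = random_graph()
weighted = random_weighted_complete_graph()
```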
In most works on exact graph partitioning (see e.g. [27, 133]), sets like Random and RandW are used for the experiments. We added the sets of random regular and random planar graphs here because we believe that more structured graph classes should also be considered, given their greater relevance for practical applications.

[Figure: absolute difference to the exact value (log-scaled) plotted against the iteration (log-scaled); curves for the primal value, the enhanced primal value, and the dual value, each for ε = 0.025 and ε = 0.25.]
Fig. 9.1: Progression of the bounds with $\varepsilon = 0.025$ and $\varepsilon = 0.25$
9.6.1 Approximating Lower Bounds
First, we show the results regarding the FPTAS developed in Section 9.3. To illustrate the behavior of the algorithm, Figure 9.1 shows the progression of the primal and dual value, and the enhanced primal value for $\varepsilon = 0.025$ and $\varepsilon = 0.25$. Recall from Section 9.3.2 that the enhanced primal value is achieved by solving a linear program to compute the optimal scaling factors of the final flows. The run was made on one specific RandReg graph, but similar results are obtained by using any of the other graphs we considered in our experiments. Thus, we consider this example as representative.
Interestingly, it appears that the improvement caused by enhanced scaling is a specific factor that is almost independent of the number of iterations. As one might have expected, a closer look at the data shows that the improvement usually gets slightly smaller with more iterations, whereby the effect is most visible for graphs in the Random set. Only for planar graphs did we find that the gain by enhanced scaling becomes clearly greater with increasing iterations. Regarding the dependency on the chosen $\varepsilon$, we find that for the sets RandReg and RandPlan the improvement achieved by enhanced scaling becomes smaller the greater $\varepsilon$ is, and for (weighted) random graphs it is almost constant.
In Figure 9.1, for both settings of $\varepsilon$, the dual value converges to its final error fairly quickly and then remains nearly unchanged. When comparing the primal values with $\varepsilon = 0.25$ and $\varepsilon = 0.025$, we find that the first has a smaller error at the beginning, but is not improved anymore after a few iterations. Using $\varepsilon = 0.025$, the convergence of the primal value is slower, but reaches a clearly better result at the end.
The best choice of $\varepsilon$ for use in a branch & bound environment is a trade-off. If $\varepsilon$ is too big, the bound approximation computed is too bad, and the number of sub-problems in the search tree explodes. On the other hand, if $\varepsilon$ is chosen too small, the bound approximation for each sub-problem is too time consuming. Table 9.1 shows this trade-off for the same graph as in Figure 9.1. Again, we consider this example as representative. We denote the approximation parameter the algorithm is started with by $\varepsilon$; $\hat{\varepsilon}$ is the final error relative to the real VarMC-bound value, $\hat{\varepsilon}_s$ denotes the final error when enhanced scaling is used, and $\hat{\varepsilon}_d$ gives the final error of the dual value. Furthermore, the number of iterations and the running time in seconds are given.

$\varepsilon$   $\hat{\varepsilon}$   $\hat{\varepsilon}_s$   $\hat{\varepsilon}_d$   num its.   time
0.0125   0.0008   0.00035   0.0022   361,014   3360.0
0.025    0.0019   0.00091   0.0042    89,026    888.3
0.05     0.0043   0.0021    0.0084    21,642    232.2
0.1      0.011    0.0062    0.016      5,110     56.0
0.2      0.029    0.016     0.033      1,130     12.9
0.4      0.069    0.040     0.057        213      2.4
0.6      0.12     0.067     0.057         66      0.8
0.8      0.20     0.11      0.057         23      0.3
Tab. 9.1: Real errors and computational effort depending on the given $\varepsilon$.
It becomes clear that the error that is actually reached is much better than the approximation guarantee $\varepsilon$ which the algorithm is started with. The table also shows that the theoretical factor of $1/\varepsilon^2$ in the running time of the algorithm is confirmed by the experiments.
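The quadratic dependence on $1/\varepsilon$ can be checked directly against the iteration counts reported in Table 9.1. The short calculation below is only a sanity check on those published numbers for the five smallest values of $\varepsilon$; the variable names are illustrative.

```python
import math

# Approximation parameter and number of iterations from Tab. 9.1.
eps = [0.0125, 0.025, 0.05, 0.1, 0.2]
its = [361014, 89026, 21642, 5110, 1130]

# Factor by which the iteration count grows each time eps is halved;
# a pure 1/eps^2 law would give a factor of 4.
ratios = [its[i] / its[i + 1] for i in range(len(its) - 1)]

# Exponent p of an assumed law "iterations ~ eps^(-p)" between the
# smallest and the largest eps in the list.
exponent = math.log(its[0] / its[-1]) / math.log(eps[-1] / eps[0])
```

Every halving of $\varepsilon$ increases the iteration count by a factor slightly above 4, and the fitted exponent is close to 2.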
Note that the running time for the enhanced scaling version is not explicitly stated in the table. It increases the running time by only a hundredth of a second, and is therefore negligible in our comparison. In this experiment, however, the linear program for the computation of enhanced scaling factors was only solved once at the end of the approximation. When using the FPTAS for lower bound computations in a branch & bound, we are also interested in intermediate primal values that may allow us to prune the current choice point even before the approximation of the current VarMC-bound is finished. For that case, we found that it is a good choice to compute enhanced primal values only every hundredth iteration, which hardly increases the overall running time but improves the approximation quality significantly.
In Table 9.2, we give the resulting running times and the number of sub-problems of the
branch & bound algorithm using approximated bounds. The results given are averages over all
20 instances for every set of graphs.
The results without forcing moves show the expected behavior for the choice of $\varepsilon$. The smaller $\varepsilon$ is, the smaller is the number of search nodes. This rule is not strict when using forcing moves: looking at the random planar graphs, we see that less accurate solutions of the VarMC-bound can result in stronger forcing moves so that the number of sub-problems may even decrease. The table also shows that the effects of enhanced scaling and forcing moves are different for the different classes of graphs and also for changing values of $\varepsilon$. Altogether, the experiments show that setting $\varepsilon$ to 50% is favorable, which is a surprisingly high value.
                ε = 0.025     ε = 0.05      ε = 0.1       ε = 0.25      ε = 0.5       ε = 0.75
     graph      time  subp.   time  subp.   time  subp.   time  subp.   time  subp.   time  subp.
(1)  RandPlan   6513    462   2350    463   1058    468    632    582    438   1536    524   5740
(1)  RandReg    3728     24   1902     29   1001     37    522    101    551    552   1194   4063
(1)  Random     2857    114    727    119    334    129    175    212    230   1124    781  15038
(1)  RandW      1487     47    534     60    217     89    108    297    102   2034    118  11498
(2)  RandPlan   6395    461   2158    461    962    463    587    557    450   1466    562   5273
(2)  RandReg    2192     22   1107     23    561     26    196     37    126     90    163    489
(2)  Random     2620    113    525    114    228    118     98    139     80    280    186   2188
(2)  RandW      1383     37    472     46    176     62     76    150     85    788   1289   3859
(3)  RandPlan   3013    412   1009    406    587    381    133    239     54    117     35     78
(3)  RandReg    2186     22   1083     23    622     25    181     34    117     67    160    173
(3)  Random     2249    107    465    108    209    111     83    121     49    164     47    328
(3)  RandW       737     14    283     17    122     25     44     54     35    188     41    582
Tab. 9.2: Times and sizes of the search trees of the branch & bound algorithm using the approximation algorithm with different values of $\varepsilon$. (1): without enhanced scaling, without forcing moves, (2): with enhanced scaling, without forcing moves, (3): with enhanced scaling and forcing moves.
9.6.2 Lower Bounds using Cost-Decomposition
Now, we evaluate the cost-decomposition approach developed in Section 9.4. Analogously to Table 9.2, Table 9.3 shows the results of the branch & bound algorithm when using the cost-decomposition technique for the computation of the VarMC-bound.
Both the running times and the sizes of the search trees produced by the various subgradient algorithms and Lagrangian formulations differ considerably on the different benchmark sets. Thus, it is not an easy task to draw valid conclusions from these experiments. As a tendency, the max-cutflow formulation looks better than the min-congestion formulation (except for the random planar graphs). When using the max-cutflow formulation, the Crowder rule gives a good overall performance.
9.6.3 Comparison of Lower Bound Algorithms
After having investigated the developed algorithms individually, we now want to compare them with each other and with the semi-definite bound presented in [133]. We start with a presentation of time and quality of the four different lower bound algorithms:
- The VarMC-bound using the ILOG CPLEX 7.0 standard barrier solver [117].
- The VarMC-bound using the max-cutflow decomposition with the Crowder rule.
- The VarMC-bound using the FPTAS with a desired approximation guarantee of 50% and enhanced scaling.
- The semi-definite bound presented in [133] and available as CUTSDP-package (program bis0) at [132]. The program uses parameters maxlarge and maxsmall. According to the setting in [133], we set maxlarge := 1, and maxsmall := 10.

                pure subgr.     Crowder         mod. CFM        Volume
     graph      time   subp.    time   subp.    time   subp.    time   subp.
(1)  RandPlan   11318     85    6626     85    24986     85    7752     85
(1)  RandReg     1192     38     589     28      737     29     568     32
(1)  Random       748    117     694    119      722    116     849    127
(1)  RandW        151     11     128     11      134     10     166     17
(2)  RandPlan    2032     85    1276     85     1831     85    1307     85
(2)  RandReg     3768     64   17320    299     4374     72   17633    292
(2)  Random      1776    160    3366    239     1941    169    6211    418
(2)  RandW       1710    183    1394    139     1591    166    3661    588
Tab. 9.3: Average running times in seconds and average number of search nodes using cost-decomposition without forcing moves. (1): max-cutflow formulation, (2): min-congestion formulation.
In Tables 9.4, 9.5, and 9.6, we apply the different algorithms to grids, tori, shuffle-exchange
graphs, DeBruijn graphs, and graphs stemming from a real-world finite elements application.
The experiments were performed on a SUN Enterprise 450 Model 4400 machine with 1 GB main
memory and a SUN UltraSparc-II 400 MHz processor. The tables give the number of nodes, the
number of edges, the exact bisection width, and, for each algorithm, the bound computed and the
time needed for its computation (in seconds).
First, by comparing the VarMC-bound achieved by CPLEX and the semi-definite bound, we see that the VarMC-bound is indeed superior with respect to its quality on sparse and structured graphs like the ones that we consider here. As a slight drawback, the bound is not suited for disconnected graphs like the BCR graphs ma, m1, and m4. For them, the MVarMC-bound discussed in Section 9.2 yields much better results [197]. Work is in progress to extend the work presented here to the MVarMC-bound as well. However, here we focus on the VarMC-bound only.
For the remaining connected, and especially the larger graphs, we see that the VarMC-bound dominates the semi-definite bound, sometimes quite remarkably. Moreover, we see that the CPLEX barrier solver² always computes the VarMC-bound faster than CUTSDP obtains the semi-definite bound, except for large shuffle-exchange and DeBruijn graphs. For these graphs, the quality of the VarMC-bound is much better, though. Therefore, even when using a standard barrier solver for its computation, the VarMC-bound is clearly preferable to the semi-definite bound on sparse, structured graphs.

² We also experimented with the primal and dual simplex, but, while the dual simplex outperforms the primal simplex algorithm, neither algorithm could compete at all with the interior point solver.

Graph         |V|   |E|   bw    CPLEX          Decomp.        Approx (50%)   CUTSDP
                                Bound   Time   Bound   Time   Bound   Time   Bound   Time
Grid 9x4       36    59    5     4.50     <1    4.50      2    4.32     <1    5.00      1
Grid 10x5      50    85    5     5.00      1    5.00      4    5.00     <1    5.00      3
Grid 10x9      90   161    9     9.00     15    9.00     21    9.00      1    8.12     17
Grid 10x10    100   180   10    10.00     26   10.00     32   10.00      1    8.28     24
Grid 10x11    110   199   11    11.00     16   11.00     54   10.59      2    8.48     30
Grid 11x11    121   220   12    11.48     19   11.48     67   11.11      2    8.98     42
Torus 9x4      36    72   10     8.10     <1    8.10      3    7.78     <1    7.60      1
Torus 10x5     50   100   10    10.00      3   10.00      4    9.84     <1    9.88      3
Torus 10x9     90   180   18    18.00     19   18.00     32   17.56      1   17.22     19
Torus 10x10   100   200   20    20.00     11   20.00     38   18.95      2   17.98     27
Torus 10x11   110   220   22    20.17     15   20.17     48   19.66      2   17.63     30
Torus 11x11   121   242   24    22.18     15   22.00     65   20.62      3   18.50     39
Tab. 9.4: Comparison of bounds on Grids and Tori.

Graph   |V|   |E|   bw    CPLEX          Decomp.        Approx (50%)   CUTSDP
                          Bound   Time   Bound   Time   Bound   Time   Bound   Time
SE 2      4     3    2     2.00     <1    2.00     <1    2.00     <1    2.00     <1
SE 3      8    10    2     2.00     <1    2.00     <1    2.00     <1    2.00     <1
SE 4     16    21    4     3.10     <1    3.10     <1    3.08     <1    3.49     <1
SE 5     32    46    6     5.23     <1    5.20      2    5.10     <1    5.08      2
SE 6     64    93   10     8.91      8    8.87     11    8.69     <1    7.40     21
SE 7    128   190   16    15.12     43   15.01     53   14.73      3   11.31    120
SE 8    256   381   28    26.15    484   25.69    244   24.94     16   18.14    453
DB 2      4     5    4     4.00     <1    4.00     <1    4.00     <1    4.00     <1
DB 3      8    13    4     4.00     <1    4.00     <1    4.00     <1    4.00     <1
DB 4     16    29    6     6.00     <1    6.00     <1    5.89     <1    6.00     <1
DB 5     32    61   10    10.00     <1   10.00      3    9.82     <1    9.70      2
DB 6     64   125   18    16.96      9   16.90     14   16.26     <1   14.94     20
DB 7    128   253   30    28.98     62   28.80     65   27.45      4   22.58     49
DB 8    256   509   54    49.54    652   49.05    322   46.95     21   35.08    443
Tab. 9.5: Comparison of bounds on shuffle-exchange (SE) and DeBruijn (DB) graphs.

Graph    |V|   |E|   bw   CPLEX          Decomp.        Approx (50%)   CUTSDP
                          Bound   Time   Bound   Time   Bound   Time   Bound   Time
BCR ma    54    72    2    0              0              0              1.93      4
BCR mb    74   120    4    3.08      8    3.08      9    3.08      4    3.12     10
BCR mc    74   125    6    5.10      8    5.10     14    5.10      4    5.45     10
BCR md    80   129    4    3.16      9    3.16     11    3.15      4    3.20     13
BCR me    60    96    3    3.00      3    3.00      6    3.00      3    3.00      5
BCR mf    90   146    4    3.21     14    3.21     18    3.21      4    2.86     18
BCR m1   100   155    4    0              0              0              2.46     26
BCR m4    32    50    6    0              0              0              5.68      1
BCR m6    70   120    7    6.36      7    6.36     14    6.36      4    6.03      9
BCR m8   148   265    7    7.00     22    7.00     64    6.96      5    5.98     80
Tab. 9.6: Comparison of bounds on real-world graphs stemming from a finite elements application.
However, for larger graphs containing 500 nodes or more, the memory consumption of CPLEX becomes critical. For example, the bound on DeBruijn 9 could not be computed within 2 GB main memory. As stated in Section 9.4, cost-decomposition can help to cope with that situation. We see that the Lagrangian relaxation based column generation approach yields slightly worse lower bounds, and more time is needed as well, except for large shuffle-exchange and DeBruijn graphs. In exchange, the memory requirements are much lower, and the cost-decomposition approach allows us to tackle large graphs like DeBruijn 9 and Shuffle-Exchange 10.
When using the approximation scheme with an approximation guarantee of 50%, we lose even more of the bound quality. However, in most cases, the bounds obtained are still better than those computed by CUTSDP, and the computation time is drastically reduced. At the same time, the memory consumption is comparable to the cost-decomposition approach. Therefore, the FPTAS developed in Section 9.3 is the algorithm of choice for the bisection of sparse and structured graphs.
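The trade-off between bound quality and computation time can be made concrete with the DeBruijn 8 row of Table 9.5. The snippet below merely recomputes relative gaps from the published figures; the dictionary layout and names are illustrative.

```python
# Exact bisection width and (bound, time in seconds) for DeBruijn 8, Tab. 9.5.
bw = 54
results = {
    "CPLEX": (49.54, 652),
    "Decomp.": (49.05, 322),
    "Approx (50%)": (46.95, 21),
    "CUTSDP": (35.08, 443),
}

# Relative gap between the exact bisection width and each lower bound.
gaps = {name: (bw - bound) / bw for name, (bound, _) in results.items()}

# Method with the smallest reported computation time.
fastest = min(results, key=lambda name: results[name][1])
```

On this instance the approximation scheme is by far the fastest method, while its gap stays well below that of the semi-definite bound.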
Next, we embed the three different algorithms for the VarMC-bound into the branch & bound approach with problem reduction as sketched in Section 9.5. Again, the experiments were executed on systems with 850 MHz INTEL Pentium-III processors. Table 9.7 shows the results on the four previously described benchmark sets.
We see that, also within a branch & bound approach, the enhanced approximation algorithm gives the best running times, even though the bounds are worst and the search trees are largest. Recall from the formulation of the master problem in Section 9.4 that the cost-decomposition approach was especially designed to be memory efficient, which makes it less competitive with respect to the running time.

            CPLEX          Decomp.        Approx. (50%)
graph       time   subp.   time   subp.   time   subp.
RandPlan      74       7    889      26     54     117
RandReg      993      21    557      28    117      67
Random       350      99    612     106     49     164
RandW         30       9    120      11     35     188
Tab. 9.7: Average running times (seconds) and sizes of the search trees using the different methods for computing the VarMC-bound.

Finally, we note that, using the FPTAS in Section 9.3, we are able to compute the bisection widths of DeBruijn 9 (92), Shuffle-Exchange 9 (48), and Shuffle-Exchange 10 (82) with the additional help of the symmetry breaking method SBDD developed in Chapter 4. In our view, these results impressively show the efficiency of the VarMC-bound as well as the FPTAS for its approximation.
9.7 Summary and Future Work
We developed two specialized algorithms for the computation/approximation of the VarMC-bound on the bisection width of an undirected, edge-weighted graph. The first algorithm is based on an approximation scheme (FPTAS) for maximum multicommodity flows and yields an $\varepsilon$-approximation in time $O(m^2 / \varepsilon^2)$. We could show empirically that the real error obtained is usually much better than the approximation guarantee, especially when using an enhanced scaling method at the end of the algorithm.
The second algorithm that we developed uses the idea of cost-decomposition in an integration of Lagrangian relaxation and column generation. We compared two different Lagrangian formulations for the generation of columns and four different rules to determine the search direction in a subgradient algorithm. The performance of the different algorithms varies a lot on different graph classes, and it is a hard task to find a set of robust parameter settings that guarantee a stable performance.
When comparing the two algorithms with a barrier LP-solver and a semi-definite program, we found that it is clearly favorable to use the approximation scheme, which yields very good bounds on sparse, structured graphs in very little time. It allowed us to compute the bisection widths of large graphs, such as DeBruijn 9, Shuffle-Exchange 9, and Shuffle-Exchange 10, which were unknown and out of the reach of exact graph bisection algorithms before.
As a subject of future work, we investigate the possibility to adapt the FPTAS that we devel-
oped for the VarMC-bound for an approximation of the MVarMC-bound. Then, we hope to be
able to tackle also disconnected graphs for which the VarMC-bound is inefficient.
Chapter 10
Conclusion
In the introduction, we observed that there exists an incongruity between the needs for optimization software on the one hand and, on the other hand, the solutions that algorithmic computer science is able to offer. From an algorithmic point of view, the optimization abilities of today's software libraries are often more than satisfactory for many real-life applications. However, the need for complex problem modeling appears as a major obstacle to a broader use of optimization software.
We tried to improve upon this situation by providing filtering algorithms for higher level symbolic constraints. They make it possible to model real-life problems more intuitively while preserving the strong optimization abilities of mathematical programming. We believe that the set of optimization constraints that we considered covers very important substructures that arise frequently in real-life applications. However, our presentation is not exhaustive, and more work has to be done to provide practitioners with a more complete set of higher level building blocks for problem modeling.
Whenever a problem is decomposed into substructures (even when they are larger than usual), the question arises of how a global view on the entire problem can be achieved. Especially with respect to tight bounds on the objective, the answer to this question is crucial. We have proposed to link optimization constraints via well-known decomposition techniques from operations research. When the user is able to provide a solver with information about the substructures of a problem, we believe that the linking of optimization constraints via CP-based column generation and CP-based Lagrangian relaxation can also be automated and hidden from the user.
Regarding symmetry breaking, the method that we proposed does not by itself detect the
symmetries in a problem model. Instead, the representation of a search node and the symmetry
detection function must still be provided by the user, which requires a deeper knowledge of the
problem. However, thinking of symmetries algorithmically by asking "When are two choice
points symmetric?" is much more intuitive than developing a problem model that contains no
symmetries. Therefore, we believe that our method can help inexperienced users to cope with
symmetry more efficiently.
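As an illustration of what answering the question "When are two choice points symmetric?" might
look like in code, the following Python sketch checks whether a search node is dominated by an
already explored one. It is a minimal sketch under stated assumptions and not the implementation
used in this thesis: a search node is represented as a partial assignment (a dictionary from
variable names to values), a symmetry is a function on such assignments, and the names
is_dominated and value_permutations are hypothetical.

```python
from itertools import permutations
from typing import Callable, Dict, Iterable, List

# Assumed node representation: a partial assignment variable -> value.
Assignment = Dict[str, int]
# Assumed symmetry representation: a function mapping assignments to assignments.
Symmetry = Callable[[Assignment], Assignment]


def is_dominated(current: Assignment,
                 explored: Iterable[Assignment],
                 symmetries: Iterable[Symmetry]) -> bool:
    """Return True if a symmetric image of some explored node is contained in `current`."""
    syms = list(symmetries)
    for old in explored:
        for sym in syms:
            image = sym(old)
            # `current` extends the image if it agrees on every assigned variable.
            if all(current.get(var) == val for var, val in image.items()):
                return True
    return False


def value_permutations(values: List[int]) -> List[Symmetry]:
    """Build the symmetries that rename interchangeable values (hypothetical helper)."""
    syms: List[Symmetry] = []
    for perm in permutations(values):
        mapping = dict(zip(values, perm))
        # Bind the mapping as a default argument so each lambda keeps its own copy.
        syms.append(lambda a, m=mapping: {var: m[val] for var, val in a.items()})
    return syms


if __name__ == "__main__":
    syms = value_permutations([0, 1, 2])
    # The subtree below x = 0 is assumed to be fully explored already.
    explored = [{"x": 0}]
    print(is_dominated({"x": 1, "y": 2}, explored, syms))
```

In this toy setting the three values are interchangeable, so a node with x = 1 is a symmetric
variant of the explored node with x = 0 and could be pruned. The check enumerates all symmetries
explicitly, which is only feasible for very small symmetry groups.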
When tackling the optimization problems that we considered in the second part of this thesis,
the methods and reduction algorithms developed in Part I accelerated the software development
process considerably. Moreover, as we have seen, they can yield competitive algorithms.
However, note that, especially for the Capacitated Network Design Problem, the Social Golfer
Problem, and the Airline Crew Assignment Problem, we added problem-specific knowledge to
improve the efficiency. In our view, as a subject of future work, it would be desirable to
generalize the ideas of local Lagrangian cuts, heuristic constraint propagation, and the repair
techniques for column generation. Finally, the work on the Graph Bisection Problem motivates
the question whether approximation algorithms can be exploited for problem reduction as well,
which may give rise to a notion of relaxed ε-consistency for optimization constraints.
To conclude, a brief note that goes beyond the scientific scope of this thesis. We aimed at
giving broader access to optimization power. However, we strongly believe that efficiency is
not a value in itself. Optimization is to be seen as a tool that can be used and misused. We
therefore ask that it be exploited with care and responsibility for the good of the people.
List of Figures
2.1 The figure shows arcs on shortest paths from $v_1$ and to $v_{11}$ in a DAG. Dashed lines
mark shortest-path arcs from $v_1$, dotted lines those to $v_{11}$. Solid lines represent
arcs that are in both sets. Consider for example node 7: the shortest path from $v_1$
to 7 is $(v_1, 3, 7)$, and the shortest path from node 7 to $v_{11}$ is $(7, 9, v_{11})$.
Therefore, a shortest path from $v_1$ to $v_{11}$ via node 7 is $(v_1, 3, 7, 9, v_{11})$.
2.2 The structure replacing a node in $G$.
2.3 The figure schematically shows an edge $\{k, l\} \in E$ that must exist according to
Lemma 2.4. Solid lines mark edges in $E$, and dashed lines mark parts of the
shortest path between $v_1$ and $v_n$. The dotted line between $l$ and $v_n$ indicates that
there exists a path between the two nodes that does not visit the edge $\{r, s\}$. The
alternating lines and dots between $l$ and $r$ indicate that the shortest path from $l$ to
$v_n$ visits node $r$. The numbers on top of the nodes give their corresponding DFS
numbers, and triangles mark DFS sub-trees.
2.4 The figure schematically shows an edge $\{i, j\} \in E$ that must exist according
to Lemma 2.5. Solid lines mark edges in $E$, and dashed lines mark parts of
the shortest path between $v_1$ and $v_n$. Alternating lines and dots indicate parts
of the shortest path from $v_1$ to a node, and dotted lines indicate parts of the
shortest path from a node to $v_n$. The proof of Theorem 2.2 shows that the path
$P(v_1, r) \circ P(r, i) \circ P(j, s) \circ P(s, v_n)$ is 2-admissible and does not visit
the edge $\{r, s\}$.
2.5 A directed graph with non-negative arc weights. Assume we are given an upper
bound $B = 8$. All arcs in the graph are part of an admissible path with costs
lower than $B$, and every admissible path with costs lower than $B$ must visit the
arc $(1, 2)$. However, there exists a path $(v_1, 3, v_4)$ that does not visit this arc.
2.6 The figure schematically shows a shortest path tree $T$ rooted at $v_1$. Solid lines
denote arcs in $G$, dashed lines mark parts of the shortest path $P(v_1, v_n)$ from $v_1$
to $v_n$. The triangles symbolize shortest path sub-trees. For an edge
$e = \{r, s\} \in P(v_1, v_n)$, the nodes in $V$ are partitioned into two non-empty sets
$S_e$ and $S_e^C$. If $e$ is removed from the graph, the shortest path from $v_1$ to $v_n$
must visit an edge $\{i, j\} \in (S_e \times S_e^C) \setminus T$.
2.7 (a) The table gives the costs $c_{ij}$ of assigning a value $x_j$ to a variable $X_i$. (b) A
bipartite graph links variables to values that they can take. Bold numbers and
lines mark the optimal solution with objective value 26.
2.8 (a) The new cost matrix $c^M$, and (b) the network $N_M$ for the optimal matching
from Figure 2.7.
2.9 (a) The changed cost vector c
$c^M$, and (b) the network $N_M$ with node potentials $\pi_1$ and $\pi_2$. Bold numbers show
those assignments that can be eliminated by simple reduced-cost propagation in the
presence of a solution with value $B = 28$.
2.10 (a) Shortest paths from nodes in $V_1$ to nodes in $V_2$ with respect to the reduced
costs $c$. (b) The same shortest paths using the original cost vector $c^M$. (c) The
additional costs imposed by an assignment $X_i = x_j$. Bold numbers show those
assignments that can be eliminated in the presence of a solution with value $B = 28$.
2.11 The width of each element is proportional to its weight. The elements are ordered
with respect to the efficiencies $p_i / w_i$. The leftmost element has the biggest
efficiency, and the rightmost the smallest one. $s$ marks the critical item in $U_1$.
2.12 $U_3$ requires the integrality of item $s$. The figures show $U_1(KP \mid X_s = 0)$ and
$U_1(KP \mid X_s = 1)$.
2.13 The figure illustrates the process of the reduction algorithm presented for
$KP \mid X_i = 0$. The weight ordering in which the items are tested ensures that the
critical item moves monotonically to the right.
4.1 The concept of SBDD.
4.2 DeBruijn networks of dimension 3 (left) and 4 (right). A node is marked by the
binary string corresponding to its number. The dashed lines mark the symmetries
of the DeBruijn network.
4.3 The search tree when bisecting DB(8) without breaking any symmetries.
4.4 The search tree for the bisection of DB(8) when breaking all possible symmetries.
Chains of choice points with only one successor result from symmetry-based
domain filtering.
4.5 The left hand side shows two patterns. Each pattern consists of
three weeks (horizontal) of three groups of three players. Unfixed variables are
left empty. On the right hand side, the corresponding bipartite graph is shown,
containing a node for each week of both patterns. Since a matching of cardinality
3 exists (bold edges), one of the patterns is dominated by the other.
4.6 Six out of 40 solutions of 7-queens are unique.
5.1 Constructing a legal reduced-cost optimal roster is equivalent to finding a
constrained shortest path in a weighted DAG.
5.2 The entire approach: The inner loop generates columns using dual information,
the outer loop solves the master problem.
5.3 Number of choice points versus master iterations (left), and running time versus
master iterations (right) for SPC, NRC, and total enumeration. The tests were
run with a data instance of type 10-00-20 that was solved to optimality.
5.4 Number of choice points versus master iterations using SPC and NRC with a data
set of type 7-0-30.
5.5 The left picture shows time versus the number of calls of the propagation routine
using the incremental and the non-incremental implementation of the shortest
path constraint. Both versions were stopped after 10000 seconds total CPU time.
The experiment was run with a data instance of type 10-00-70. The right picture
shows a comparison of NRC (upper curve) and SPC (lower curve) in a time
versus quality diagram on a data instance of type 67-165-280.
5.6 Data set with 65 crew members and 959 pairings.
5.7 Data set with 50 crew members and 766 pairings.
5.8 Data set with 7 crew members and 129 pairings.
5.9 Data set with 30 crew members and 279 pairings.
6.1 The automatic recording scenario.
7.1 Optimality proofs: comparison of two strategies when using reduction based on
knapsack relaxation: on-the-fly fixing vs. fixing after Lagrange.
7.2 Comparison of different heuristic solvers for the CNDP.
8.1 The residual graph of week 2 from Table 8.1.
8.2 The residual graph of week 2 from Table 8.3.
8.3 The residual graph of player 15 from Table 8.4.
8.4 The residual graph of player 12 from Table 8.5.
8.5 The residual graphs of players 4 (left), 6 (middle) and 12 (right) from Table 8.6.
9.1 Progression of the bounds with $\varepsilon = 0.025$ and $\varepsilon = 0.25$.
List of Tables
2.1 The table gives an overview of the findings in this section.
2.2 Characteristics of the four algorithms used in the experiments.
2.3 The pure CP approach for both problem classes. cp is the average number of
choice points, time the average time in seconds for 100 instances of the given size.
2.4 Uncorrelated data instances. We give the average numbers for 100 test sets per
size. time is the time in seconds, cp the number of choice points.
2.5 Weakly correlated data instances. We give the average numbers for 100 test sets
per size. time is the time in seconds, cp the number of choice points.
2.6 Uncorrelated and weakly correlated data instances. We give the average time per
choice point in milliseconds for 100 test sets per size.
2.7 Uncorrelated data. Comparison of running times per choice point for the new
amortized linear time propagation algorithm based on bound $U_2$ and the
implementation of MTR. We give the average time per choice point in milliseconds
for 100 test sets per size.
2.8 Uncorrelated data. Comparison of running times for the new amortized linear
time propagation algorithms and implementations of DHR and MTR. We give
the average time in seconds as well as the number of choice points for 100 test
sets per size.
2.9 Comparison of running times of linU2 and MTR on uncorrelated and weakly
correlated data. cp is the number of choice points, time the running time in
seconds.
4.1 Results of the golfer 4-3-X problem.
4.2 Results of the golfer 4-4-X instance.
4.3 Results of the golfer 4-4-4 instance performing additional checks for symmetry
$\phi_X$ in search nodes of every q-th depth.
4.4 Improved results of the golfer 4-4-X performing additional checks for symmetry
$\phi_X$ in search tree nodes of every 8-th depth.
4.5 Solving n-Queens without breaking symmetries (sym), with breaking symmetries
via SBDS, and by avoiding them via SBDD. Computing times are given in seconds.
6.1 The table compares the different approaches on three different test sets with 5
classes, 12 hours, 20 channels and different objectives. The time (in seconds) and
the number of choice points are averages for 50 randomly generated instances for
each objective. The average number of programs per instance is between 607.6
and 612.6.
6.2 The table shows a comparison of the performance of the different approaches on
5 test sets with 5 classes and objective CU for various time horizons (in hours)
and channel numbers (ch). Italic numbers give the average time (in seconds) and
the average number of nodes of 50 randomly generated instances in each test set
(avg). Numbers below are: minimum (min), maximum (max), and standard deviation
(std) for these 50 instances. The average number of programs per instance
is 315.2 for (12h/20ch), 793.5 (12h/50ch), 607.6 (24h/20ch), 1512.1 (24h/50ch),
and 1782.6 (72h/20ch), respectively.
6.3 The table illustrates the performance of the different approaches on very different
benchmark classes. Each test set contains 50 randomly generated problem
instances. There is an average of 1956.7 programs in the 120h/20ch test set,
1782.6 programs in test set 72h/20ch, and 1423.3 programs in test set 24h/50ch.
6.4 The table shows the performance of the different approaches on subset sum data
sets ranging from 12 hours and 20 channels up to 72 hours and 50 channels. The
average number of programs in the 50 randomly generated instances per test set
is given as parameter p.
6.5 The table compares the performance of the different algorithms on benchmark
sets with 3 classes and objective CU, each containing 50 randomly generated
problem instances with roughly 1000 programs on average.
7.1 The impact of cardinality interval tightening when using the knapsack relaxation
for pruning and problem reduction. Mean, minimum, maximum, and variance of
running time and number of nodes in the branch-and-bound tree are given.
7.2 The impact of the branching variable selection when pruning and filtering is done
with the help of the knapsack relaxation.
7.3 The impact of additional shortest-path filtering when using the knapsack relaxation
for pruning and problem reduction. Branching strategy BR0 and cardinality
interval tightening are used.
7.4 Comparison of the CPLEX branch-and-cut algorithm and Lagrangian relaxation
(pruning and reduction based on the knapsack relaxation plus cardinality interval
tightening).
8.1 A partial instantiation of the 5-3-2 Social Golfer Problem.
8.2 A more complete partial instantiation of the 5-3-2 Social Golfer Problem.
8.3 A partial instantiation of the 5-4-2 Social Golfer Problem.
8.4 A partial instantiation of the 5-3-7 Social Golfer Problem.
8.5 A partial instantiation of the 4-3-5 Social Golfer Problem.
8.6 A partial instantiation of the 4-3-5 Social Golfer Problem.
8.7 The CPU time needed to compute all unique solutions (in seconds), in brackets
the number of choice points visited, and the time per choice point (in milliseconds)
when computing all unique solutions.
8.8 The number of simple and complete symmetry checks, and the percentage of
time spent in these checks when computing all unique solutions.
8.9 The number of unique solutions for several social golfer instances.
9.1 Real errors and computational effort depending on the given $\varepsilon$.
9.2 Times and sizes of the search trees of the branch & bound algorithm using the
approximation algorithm with different values of $\varepsilon$. (1): without enhanced
scaling, without forcing moves, (2): with enhanced scaling, without forcing moves,
(3): with enhanced scaling and forcing moves.
9.3 Average running times in seconds and average number of search nodes using
cost-decomposition without forcing moves. (1): max-cutflow formulation, (2):
min-congestion formulation.
9.4 Comparison of bounds on Grids and Tori.
9.5 Comparison of bounds on shuffle-exchange (SE) and DeBruijn (DB) graphs.
9.6 Comparison of bounds on real-world graphs stemming from a finite elements
application.
9.7 Average running times (seconds) and sizes of the search trees using the different
methods for computing the VarMC-bound.
Bibliography
[1] R.K. Ahuja, T.L. Magnati, and J.B. Orlin. Network Flows. Prentice Hall, 1993.
[2] C. Albrecht. Provably good global routing by a new approximation algorithm for multi-
commodity flow. International Conference on Physical Design, pp. 19–25, 2000.
[3] E. Andersson, E. Housos, N. Kohl, and D. Wedelin. Crew pairing optimization. Interna-
tional Series in Operations Research and Management Science, 9:228–258, Kluwer Aca-
demic Publishers, 1998.
[4] Y. Aneja, V. Aggarwal, and K. Nair. Shortest chain subject to side conditions. Networks,
13:295-302, 1983.
[5] D. Applegate and W. Cook. A computational study of the job-shop scheduling problem.
ORSA Journal on Computing, 3:149–156, 1991.
[6] K. R. Apt. The Rough Guide to Constraint Propagation. 5th International Conference on
Principles and Practice of Constraint Programming (CP), LNCS 1713:1–23, 1999.
[7] A. Atamtürk, D. Rajan. On Splittable and Unsplittable Capacitated Network Design Arc-
Set Polyhedra. To appear in Mathematical Programming, 2001.
[8] G. Ausiello, G.F. Italiano, A.M. Spaccamela, and U. Nanni. Incremental Algorithms for
Minimal Length Paths. Journal of Algorithms, 12(4): 615–638, 1990.
[9]
mediaTV
, Technical Description, Axcent AG. http://www.axcent.de.
[10] E. Balas and E. Zemel. An algorithm for large-scale zero-one knapsack problems. Opera-
tions Research, 28:119–148, 1980.
[11] F. Barahona and R. Anbil. The Volume Algorithm: producing primal solutions with a
subgradient algorithm. Mathematical Programming, 87:385–399, 2000.
[12] C. Barnhart, C.A. Hane, E.L. Johnson, and G. Sigismondi. A column generation and parti-
tioning approach for multi-commodtiy flow problems. Telecommunication Systems, 3:239–
258, 1995.
199
200
Bibliography
[13] C. Barnhart, E.L. Johnson, G.L. Nemhauser, M.W.P. Savelsbergh, and P.H. Vance. Branch-
and-price: Column generation for solving huge integer programs. Operations Research,
46(3):316–329, 1998.
[14] C. Barnhart and R.G. Shenoi. An approximate model and solution approach for the long-
haul crew pairing problem. Transportation Science, 32(3):221–231, 1998.
[15] N. Barnier and P. Brisset. Graph Coloring for Air Traffic Flow Management. 4th Interna-
tional Workshop on Integration of AI and OR Techniques in Constraint Programming for
Combinatorial Optimization Problems (CP-AI-OR), pp. 133–147, 2002.
[16] N. Barnier and P. Brisset. Solving the Kirkman’s Schoolgirl Problem in a Few Seconds.
8th International Conference on Principles and Practice of Constraint Programming (CP),
LNCS 2470:477–491, 2002.
[17] J. Beasley and N. Christofides. An Algorithm for the Resource Constrained Shortest Path
Problem. Networks, 19:379-394, 1989.
[18] H. Beringer and B. De Backer. Combinatorial problem solving in constraint logic pro-
gramming with cooperative solvers. Logic Programming: Formal Methods and Practical
Applications, pp. 245–272, Elsevier, 1995.
[19] J.C. Bermond and C. Peyrat. De Bruijn and Kautz networks: a competitor for the hyper-
cube? 1st European Workshop on Hypercubes and Distributed Computers, pp. 279–293,
North-Holland, 1989.
[20] M. Bern and P. Plassmann. The Steiner problem with edge lengths 1 and 2. Information
Processing Letters (IPL), 32:171–176, 1989.
[21] C. Bessière. Arc-consistency and arc-consistency again. Artificial Intelligence, 65:179–
190, 1994.
[22] D. Bienstock, O. Günlük, S. Chopra, and C.Y. Tsai. Mininum cost capacity installation for
multicommodity flows. Mathematical Programming, 81:177-199, 1998.
[23] D. Bienstock. Experiments with a network design algorithm using epsilon-approximate
linear programs. Technical Report, CORC Report 1999-4, 1999.
[24] A. Bockmayr and T. Kasper. Branch and infer: A unifying framework for integer and finite
domain constraint programming. INFORMS Journal on Computing, 10(3):287–300, 1998.
[25] R. Borndörfer and A. Löbel. Scheduling duties by adaptive column generation. Technical
Report, Konrad-Zuse-Zentrum für Informationstechink Berlin, ZIB-01-02, 2001.
Bibliography
201
[26] C. Bornstein, A. Litman, B. Maggs, R. Sitaraman, and T. Yatzkar. On the Bisection Width
andExpansionof ButterflyNetworks. 1stMerged InternationalParallelProcessingSympo-
sium and Symposium on Parallel and Distributed Processing (IPPS/SPDP), IEEE, pp. 144–
150, 1998.
[27] L. Brunetta, M. Conforti, and G. Rinaldi. A branch-and-cut algorithm for the equicut
problem. Mathematical Programming, 78:243–263, 1997.
[28] P. Camerini, L. Fratta, and F. Maffioli. On Improving Relaxation methods by Modified
Gradient Techniques. Mathematical Programming Studies, 3:26–34, 1975.
[29] A. Caprara, M. Fischetti, and P. Toth. A heuristic algorithm for the set covering problem.
5th International Conference on Integer Programming and Combinatorial Optimization
(IPCO), LNCS 1084:72–84, 1996.
[30] A. Caprara, F. Focacci, E. Lamma, P. Mello, M. Milano, P. Toth, and D. Vigo. Integrating
constraint logic programming and operations research techniques for the crew rostering
problem. Software Practice and Experience, 28(1): 49–76, 1998.
[31] A. Caprara, D. Pisinger, and P. Toth. Exact Solution of the Quadratic Knapsack Problem.
INFORMS Journal on Computing, 11:125–137, 1999.
[32] A. Caprara, P. Toth, D. Vigo, and M. Fischetti. Modeling and solving the crew rostering
problem. Operations Research, 46(6):820–830, 1998.
[33] R.D. Carr, L.K. Fleischer, V.J. Leung, and C.A. Phillips. Strengthening Integrality Gaps
for Capacitated Network Design and Covering Problems. 11th Symposium on Discrete
Algorithms (SODA), 2000.
[34] Y. Caseau and F. Laburthe. Solving Various Weighted Matching Problems with Constraints.
3rd International Conference on Principles and Practice of Constraint Programming (CP),
LNCS 1330:17–31, 1997.
[35] Y. Caseau and F. Laburthe. Solving Small TSPs with Constraints. 14th International Con-
ference on Logic Programming (ICLP), pp. 316–330, The MIT Press, 1997.
[36] Y. Caseau and F. Laburthe. Heuristics for large constrained routing problems. Journal of
Heuristics, 5:281–303, 1999.
[37] L. Cavique, C. Rego, and I. Themido. Subgraph ejection chains and tabu search for the
crew scheduling problem. Journal of the Operational Research Society, 50:608–616, 1999.
[38] C. Chu and J. Antonio. Approximation algorithm to solve real-life multicriteria cutting
stock problems. Operations Research, 47(4):495–508, 1999.
202
Bibliography
[39] H.D. Chu, E. Gelman, and E.L. Johnson. Solving large scale crew scheduling problems.
European Journal of Operational Research, 97:260–268, 1997
[40] L.W. Clarke and P. Gong. Capacitated Network Design with Column Generation. Research
Report, Georgia Institute of Technology, 1998.
[41] M.B. Cohen, C.J. Colbourn, L.A. Ives, and A.C.H. Ling. Kirkman triple systems of order
21 with non-trivial automorphism group. Mathematics of Computation, 71:873–881, 2001.
[42] C. Colbourn and J. Dinitz. The CRC Handbook of Combinatorial Designs. CRC Press,
1996.
[43] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. The MIT Press,
1990.
[44] T.G. Crainic, A. Frangioni, and B. Gendron. Bundle-based relaxation methods for mul-
ticommodity capacitated fixed charge network design. Discrete Applied Mathematics,
112:73–99, 2001.
[45] T.G. Crainic, M. Gendreau, and J.M. Farvolden. A simplex-based tabu search method for
capacitated network design. INFORMS Journal on Computing, 12(3):223–236, 2000.
[46] H. Crowder. Computational improvements for subgradient optimization. Symposia Mathe-
matica, XIX:357–372, 1976.
[47] CSPLib: a problem library for constraints, maintained by I.P. Gent, T. Walsh, and B. Sel-
man, http://www-users.cs.york.ac.uk/˜tw/csplib/
[48] J. Czyzyk, S. Mehrotra, M. Wagner, and S.J. Wright. PCx user guide (Version 1.1). Tech-
nical Report, Optimization Technology Center, Aragone National Laboratory and North-
western University, 1996.
[49] J. Czyzyk, S. Mehrotra, M. Wagner, and S.J. Wright. PCx: An interior-point code for linear
programming. Optimization Methods and Software, 11(2):397–430, 1999.
[50] G.B. Dantzig. Discrete variable extremum problems. Operations Research, 5:266–277,
1957.
[51] G.B. Dantzig and P. Wolfe. The decomposition algorithm for linear programs. Economet-
rica, 29(4):767–778, 1961.
[52] P.R. Day and D.M. Ryan. Flight attendant rostering for short-haul airline operations. Op-
erations Research, 45(5):649–661, 1997.
Bibliography
203
[53] R.S. Dembo and P.L. Hammer. A reduction algorithm for knapsack problems. Methods of
Operations Research, 36:49–60, 1980.
[54] C. Demetrescu and G.F. Italiano. Fully Dynamic All Pairs Shortest Paths with Real Edge
Weights. 42nd Annual Symposium on Foundations of Computer Science (FOCS), IEEE,
pp. 260–267, 2001.
[55] G. Desaulniers, J. Desrosiers, Y. Dumas, S. Marc, B. Rioux, M.M. Solomon, and F. Soumis.
Crew pairing at Air France. European Journal of OperationalResearch, 97:245–259, 1997.
[56] J. Desrosiers, Y. Dumas, M.M. Solomon, and F. Soumis. Time constrained routing and
scheduling. Network Routing Handbooks in Operations Research and Management
Science, 8:35–139, North-Holland, 1995.
[57] I. Dumitrescu and N. Boland. The weight-constrained shortest path problem: preprocess-
ing, scaling and dynamic programming algorithms with numerical comparisons. 17th In-
ternational Symposium on Mathematical Programming (ISMP), 2000.
[58] ECLIPSE. ParcTechnologiesLimited.http://www.icparc.ic.ac.uk/eclipse/.
[59] H. Everett. Generalized lagrange multiplier method for solving problems of optimum allo-
cation of resource. Operations Research, 11:399–417, 1963.
[60] T. Fahle. Cost Based Filtering vs. Upper Bounds for Maximum Clique. 4th International
Workshop on Integration of AI and OR Techniques in Constraint Programming for Combi-
natorial Optimization Problems (CP-AI-OR), pp. 93–107, 2002.
[61] T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint Prop-
agation for Complex Column Generation Subproblems. 17th International Symposium on
Mathematical Programming (ISMP), 2000.
[62] T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint pro-
gramming based column generation for crew assignment. Journal of Heuristics, 8(1):59–
81, 2002.
[63] T. Fahle, S. Schamberger, and M. Sellmann. Symmetry Breaking. 7th International Con-
ference on Principles and Practice of Constraint Programming (CP), LNCS 2239:93–107,
2001.
[64] T. Fahle and M. Sellmann. Constraint Programming Based Column Generation with Knap-
sack Subproblems. 2nd International Workshop on Integration of AI and OR Techniques
in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Pader-
born Center for Parallel Computing, Technical Report tr-001-2000:33–44, 2000.
204
Bibliography
[65] T. Fahle and M. Sellmann. Cost-Based Filtering for the Constrained Knapsack Problem.
Annals of Operations Research, 115:73–93, 2002.
[66] J. Farvolden, K. Jones, I. Lustig, and W. Powell. Multicommodity Network Flows The
Impact Of FormulationOn Decomposition. MathematicalProgramming, 62:95–117, 1993.
[67] J. Farvolden, W. Powell, and I. Lustig. A primal partitioning solution for the arc-chain
formulation of a multicommoditynetwork flow problem. Operations Research, 41(4):669–
693, 1993.
[68] D. Fayard and G. Plateau. An algorithm for the solution of the 0-1 knapsack problem.
Computing, 28:269–287, 1982.
[69] U. Feige and R. Krauthgamer. A Polylogarithmic Approximation of the Minimum Bisec-
tion. Journal on Computing, 31(4):1090–1118, 2002.
[70] R. Feldmann, B. Monien, P. Mysliwietz, and S. Tschöke. A Better Upper Bound on the
Bisection Width of de Bruijn Networks. 14th International Symposium on Theoretical
Aspects of Computer Science (STACS), LNCS 1200:511–522, 1997.
[71] T. Feo and M. Resende. Greedy randomized adaptive search procedures. Journal of Global
Optimization, 6:109–133, 1995.
[72] C.E. Ferreira, A. Martin, C.C. de Souza, R. Weismantel, and L.A. Wolsey. The node ca-
pacitated graph partitioning problem: a computational study. Journal of Mathematical
Programming, 81:229–256, 1998.
[73] P.O. Fjällström. Algorithms for graph partitioning: A survey. Linköping Electronic Ar-
ticles in Computer and Information Science,http://www.ep.liu.se/ea/cis/-
1998/010/, 1998.
[74] L.K. Fleischer. Approximating Fractional Multicommodity Flow Independent of the Num-
ber of Commodities. SIAM Journal on Discrete Mathematics, 13(4):505–520, 2000.
[75] L. Fleischer and K.D. Wayne. Fast and simpleapproximation schemes for generalized flow.
Mathematical Programming, 91(2):215–238, 2002.
[76] F. Focacci, F. Laburthe, and A. Lodi. Local Search and Constraint Programming. Handbook
of Metaheuristic, Kluwer Academic Publishers, to appear.
[77] F. Focacci, A. Lodi, and M. Milano. Solving TSP through the Integration of OR and CP
Techniques. Workshop on Large Scale Combinatorial Optimization and Constraints, Elec-
tronic Notes in Discrete Mathematics, 1998.
Bibliography
205
[78] F. Focacci, A. Lodi, and M. Milano. Integration of CP and OR methods
for Matching Problems. 1st International Workshop on Integration of AI and
OR Techniques in Constraint Programming for Combinatorial Optimization Prob-
lems (CP-AI-OR),http://www.deis.unibo.it/Events/Deis/Workshops/-
Proceedings.html, 1999.
[79] F. Focacci, A. Lodi, and M. Milano. Cost-Based Domain Filtering. 5th International
Conference on Principlesand Practice of Constraint Programming(CP), LNCS 1713:189–
203, 1999.
[80] F. Focacci, A. Lodi, and M. Milano. Cutting Planes in Constraint Programming: An Hy-
brid Approach. 2nd International Workshop on Integration of AI and OR Techniques in
Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Pader-
born Center for Parallel Computing, Technical Report tr-001-2000:45–51, 2000.
[81] F. Focacci and M. Milano. Global Cut Framework for Removing Symmetries. 7th Inter-
national Conference on Principles and Practice of Constraint Programming (CP), LNCS
2239:77–92, 2001.
[82] F. Focacci and P. Shaw. Pruning sub-optimal search branches using local search. 4th Inter-
national Workshop on Integration of AI and OR Techniques in Constraint Programming for
Combinatorial Optimization Problems (CP-AI-OR), pp. 181–189, 2002.
[83] S. Fortune, J. Hopcroft, and J. Wyllie. The directed subgraph homeomorphism problem.
Theoretical Computer Science, 10(2):111–121, 1980.
[84] A. Frangioni. A Bundle type Dual-ascent Aproach to Linear Multi-Commodity Min Cost
Flow Problems. Technical Report, Dipartimento di Informatica, Universita di Pisa, TR-96-
01, 1996.
[85] A. Frangioni. Dual Ascent Methods and MulticommodityFlow Problems. DoctoralThesis,
Dipartimento di Informatica, Universita di Pisa, TD-97-05, 1997.
[86] M.L. Fredmann and R.E. Tarjan. Fibonacci heaps and their uses in improved network
optimization algorithms. Journal of the ACM, 34:596–615, 1987.
[87] M. Gamache, F. Soumis, D. Villeneuve, J. Desrosiers, and E. Gélinas. The preferential
bidding system at Air Canada. Transportation Science, 32(3):246–255, 1998.
[88] M.R. Garey and D.S. Johnson. Computers and Intractability, A Guide to the Theory of
NP-Completeness. Freeman, San Francisco, 1979.
[89] M.R. Garey, D.S. Johnson, and L. Stockmeyer. Some simplified NP-complete graph
problems. Theoretical Computer Science, 1:237–267, 1976.
[90] N. Garg and J. Könemann. Faster and simpler algorithms for multicommodity flow and
other fractional packing problems. 39th Annual Symposium on Foundations of Computer
Science (FOCS), IEEE, pp. 300–309, 1998.
[91] B. Gendron and T.G. Crainic. Relaxations for multicommodity capacitated network design
problems. Technical Report, Centre de recherche sur les transports, Université de Montréal,
CRT-96-05, 1994.
[92] I.P. Gent, W. Harvey, and T. Kelsey. Groups and Constraints: Symmetry Breaking During
Search. 8th International Conference on Principles and Practice of Constraint Program-
ming (CP), LNCS 2470:415–430, 2002.
[93] I.P. Gent and B.M. Smith. Symmetry Breaking During Search in Constraint Programming.
14th European Conference on Artificial Intelligence (ECAI), pp. 599–603, 2000.
[94] I. Ghamlouche, T.G. Crainic, and M. Gendreau. Cycle-based neighbourhoods for
fixed-charge capacitated multicommodity network design. Technical Report, Centre de recherche
sur les transports, Université de Montréal, CRT-2001-01, 2001.
[95] I. Ghamlouche, T.G. Crainic, and M. Gendreau. Path relinking, cycle-based neighbour-
hoods and capacitated multicommodity network design. Technical Report, Centre de
recherche sur les transports, Université de Montréal, CRT-2002-01, 2002.
[96] P.C. Gilmore and R.E. Gomory. A linear programming approach to the cutting stock prob-
lem. Operations Research, 9:849–859, 1961.
[97] F. Glover, D. Klingman, and N.V. Phillips. Network Models in Optimization and Their
Applications in Practice. Wiley, 1992.
[98] A.V. Goldberg, J.D. Oldham, S.A. Plotkin, and C. Stein. An Implementation of a Com-
binatorial Approximation Algorithm for Minimum-Cost Multicommodity Flow. 6th In-
ternational Conference on Integer Programming and Combinatorial Optimization (IPCO),
LNCS 1412:338–352, 1998.
[99] M.C. Golumbic. Algorithmic Graph Theory and Perfect Graphs. Academic Press, New
York, 1991.
[100] M.D. Grigoriadis and L.G. Khachiyan. Fast approximation schemes for convex programs
with many blocks and coupling constraints. SIAM Journal on Optimization, 4:86–107,
1994.
[101] M.D. Grigoriadis and L.G. Khachiyan. Approximate minimum-cost multicommodity
flows. Mathematical Programming, 75:477–482, 1996.
[102] O. Günlük. A branch-and-cut algorithm for capacitated network design problems. Math-
ematical Programming, 86(1):17–39, 1999.
[103] G. Handler and I. Zang. A Dual Algorithm for the Restricted Shortest Path Problem.
Networks, 10:293–310, 1980.
[104] W. Harvey. Symmetry Breaking and the Social Golfer Problem. Workshop on Symmetry
in Constraints (SymCon), 2001.
[105] W. Harvey. Warwick’s Results Page for the Social Golfer Problem.
http://www.icparc.ic.ac.uk/~wh/golf/.
[106] W.D. Harvey and M.L. Ginsberg. Limited discrepancy search. 14th International Joint
Conference on Artificial Intelligence (IJCAI), pp. 607–613, 1997.
[107] M. Held and R.M. Karp. The traveling-salesman problem and minimum spanning trees.
Operations Research, 18:1138–1162, 1970.
[108] M. Held and R.M. Karp. The traveling-salesman problem and minimum spanning trees:
Part II. Mathematical Programming, 1:6–25, 1971.
[109] B. Hendrickson and B. Leland. The chaco user’s guide: Version 2.0. Technical Report,
Sandia National Laboratories, Albuquerque, SAND94-2692, 1994.
[110] D.S. Hochbaum. Approximation Algorithms for NP-hard Problems. PWS Publishing
Company, 1997.
[111] K.L. Hoffman and M. Padberg. Solving airline crew scheduling problems by branch-and-
cut. Management Science, 39(6):657–682, 1993.
[112] K. Holmberg and D. Yuan. A Lagrangean Heuristic Based Branch-and-Bound Approach
for the Capacitated Network Design Problem. Operations Research, 48:461–481, 2000.
[113] J. Hooker. Unifying optimization and constraint satisfaction. Invited talk at the 16th
International Joint Conference on Artificial Intelligence (IJCAI). Slides available at
http://ba.gsia.cmu.edu/jnh/ijcai.ppt.
[114] P.D. Hudson. Improving the branch and bound algorithm for the knapsack problem.
Queen’s University Research Report, Belfast, 1977.
[115] J.Y. Hsiao, C.Y. Tang, and R.S. Chang. An efficient algorithm for finding a maximum
weight 2-independent set on interval graphs. Information Processing Letters, 43(5):229–
235, 1992.
[116] ILOG CPLEX 6.5. Reference manual and user manual. ILOG, 1999.
[117] ILOG CPLEX 7.0. Reference manual and user manual. ILOG, 2000.
[118] ILOG CPLEX 7.5. Reference manual and user manual. ILOG, 2001.
[119] ILOG Planner 3.3. Reference manual and user manual. ILOG, 1999.
[120] ILOG Solver 4.4. Reference manual and user manual. ILOG, 1999.
[121] ILOG Solver 5.0. Reference manual and user manual. ILOG, 2000.
[122] G.P. Ingargiola and J.F. Korsh. A reduction algorithm for zero-one single knapsack prob-
lems. Management Science, 20:460–463, 1973.
[123] O. Jahn, R. Möhring, and A. Schulz. Optimal routing of traffic flows with length restric-
tions in networks with congestion. Technical Report, TU Berlin, 658-1999, 1999.
[124] E. Johnson, A. Mehrotra, and G. Nemhauser. Min-cut clustering. Mathematical Program-
ming, 62:133–151, 1993.
[125] H. Joksch. The Shortest Route Problem with Constraints. Journal of Mathematical
Analysis and Applications, 14:191–197, 1966.
[126] M. Jünger and D. Naddef (Editors). Computational Combinatorial Optimization. LNCS
2241, 2001.
[127] M. Jünger and S. Thienel. The ABACUS system for Branch and Cut and Price
Algorithms in Integer Programming and Combinatorial Optimization. Software: Practice and
Experience, 30:1325–1352, 2000.
[128] U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework
for Constraint Programming Based Column Generation. 5th International Conference on
Principles and Practice of Constraint Programming (CP), LNCS 1713:261–274, 1999.
[129] U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework for
Constraint Programming Based Column Generation. 16th International Joint Conference
on Artificial Intelligence (IJCAI), Workshop on Non-Binary Constraints, 1999.
[130] G. Karakostas. Fast Approximation Schemes for Fractional Multicommodity Flow Prob-
lems. 13th Symposium on Discrete Algorithms (SODA), 2002.
[131] D. Karger and S. Plotkin. Adding multiple cost constraints to combinatorial optimiza-
tion problems, with applications to multicommodity flows. 27th Symposium on Theory of
Computing, pp. 18–25, 1995.
[132] S. Karisch. CUTSDP - A toolbox for a cutting-plane approach based on semidefinite
programming. User’s guide, Version 1.0, Department of Mathematical Modeling, Technical
University of Denmark, 10/98, 1998.
[133] S.E. Karisch, F. Rendl, and J. Clausen. Solving graph bisection problems with semidefinite
programming. INFORMS Journal on Computing, 12(3):177–191, 2000.
[134] G. Karypis and V. Kumar. Multilevel algorithms for multi-constraint graph partitioning.
Technical Report, Department of Computer Science, University of Minnesota, TR 98-019,
1998.
[135] P. Klein, S. Plotkin, C. Stein, and E. Tardos. Faster approximation algorithms for the
unit capacity concurrent flow problem with applications to routing and finding sparse cuts.
SIAM Journal on Computing, 23:466–487, 1994.
[136] G. Kliewer, M. Sellmann, and A. Koberstein. Solving the capacitated network design
problem in parallel. 3rd meeting of the PAREO Euro working group on Parallel Processing
in Operations Research (PAREO), 2002.
[137] N. Kohl and S.E. Karisch. Airline Crew Rostering: Problem Types, Modeling and Opti-
mization. Carmen Research and Technology Report, CRTR-2001-1, 2001.
[138] V. Kumar. Algorithms for Constraint-Satisfaction Problems: A Survey. The AI Magazine,
AAAI, 13:32–44, 1992.
[139] M. Lehradt. Basisalgorithmen für ein TV Anytime System. Diploma Thesis, University
of Paderborn, 2000.
[140] F.T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hy-
percubes. Morgan Kaufmann Publishers, 1992.
[141] T. Leighton, F. Makedon, S. Plotkin, C. Stein, E. Tardos, and S. Tragoudas. Fast Approx-
imation Algorithms for Multicommodity Flow Problems. Journal of Computer and System
Sciences, 50(2):228–243, 1995.
[142] M. Lübbecke and U. Zimmermann. Computer aided scheduling of switching engines. 8th
International Conference on Computer-Aided Scheduling of Public Transport (CASPT),
2000.
[143] A.K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8(1):99–
118, 1977.
[144] T.L. Magnanti and R.T. Wong. Network design and transportation planning: Models and
algorithms. Transportation Science, 18:1–55, 1984.
[145] K. Marriott and P.J. Stuckey. Programming with Constraints: An Introduction. The MIT
Press, 1998.
[146] S. Martello, D. Pisinger, and P. Toth. Dynamic programming and tight bounds for the 0-1
knapsack problem. Management Science, 45:414–424, 1999.
[147] S. Martello and P. Toth. An upper bound for the zero-one knapsack problem and a branch
and bound algorithm. European Journal of Operational Research, 1:169–175, 1977.
[148] S. Martello and P. Toth. A new algorithm for the 0-1 knapsack problem. Management
Science, 34:633–644, 1988.
[149] S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer
Implementations. Wiley, 1990.
[150] S. Martello and P. Toth. Upper Bounds and Algorithms for hard 0-1 knapsack problems.
Operations Research, 45(5):768–778, 1997.
[151] R.D. McBride. Progress made in solving the multicommodity flow problem. SIAM Jour-
nal on Optimization, 8:947–955, 1998.
[152] I. McDonald. Unique Symmetry Breaking in CSPs Using Group Theory. Workshop on
Symmetry in Constraints (SymCon), 2001.
[153] K. Mehlhorn and S. Näher. LEDA: A Platform for Combinatorial and Geometric Com-
puting. Communications of the ACM, 38(1):96–102, 1995.
[154] K. Mehlhorn and M. Ziegelmann. Resource Constrained Shortest Paths. 8th Annual
European Symposium on Algorithms (ESA), LNCS 1879:326–337, 2000.
[155] P. Meseguer and C. Torras. Exploiting symmetries within constraint satisfaction search.
Artificial Intelligence, 129(1–2):133–163, 2001.
[156] M. Milano. Integration of Mathematical Programming and Constraint Programming for
Combinatorial Optimization Problems. Tutorial at the 6th International Conference on
Principles and Practice of Constraint Programming (CP), 2000.
[157] U. Montanari. Networks of constraints: fundamental properties and applications.
Information Sciences, 7(2):95–132, 1974.
[158] G.L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. Wiley, 1988.
[159] G.L. Nemhauser, M.W.P. Savelsbergh, and G.C. Sigismondi. MINTO, a Mixed INTeger
Optimizer. Operations Research Letters, 15:47–58, 1994.
[160] W.P.M. Nuijten and E.H.L. Aarts. A computational study of constraint satisfaction for
multiple capacitated job shop scheduling. European Journal of Operational Research,
90(2):269–284, 1996.
[161] A. Orda. Routing with end to end QoS guarantees in broadband networks. Conference on
Computer Communications (Infocom), IEEE, pp. 27–34, 1998.
[162] G. Ottosson and E.S. Thorsteinsson. Linear Relaxation and Reduced-Cost Based Propa-
gation of Continuous Variable Subscripts. 2nd International Workshop on Integration of AI
and OR Techniques in Constraint Programming for Combinatorial Optimization Problems
(CP-AI-OR), Paderborn Center for Parallel Computing, Technical Report tr-001-2000:129–
138, 2000.
[163] PARROT. Executive Summary. ESPRIT 24960, 1997.
[164] F. Pellegrini and J. Roman. SCOTCH: A software package for static mapping by dual
recursive bipartitioning of process and architecture graphs. 4th European Conference on
High Performance Computing and Networking (HPCN), pp. 493–498, 1996.
[165] G. Pesant and M. Gendreau. A view of local search in constraint programming. 2nd
International Conference on Principles and Practice of Constraint Programming (CP),
LNCS 1118:353–366, 1996.
[166] S. Pettie and V. Ramachandran. Computing undirected shortest paths using comparisons
and additions. 13th Symposium on Discrete Algorithms (SODA), 2002.
[167] D. Pisinger. An expanding-core algorithm for the exact 0-1 knapsack problem. European
Journal of Operational Research, 87:175–187, 1995.
[168] D. Pisinger. An exact algorithm for large multiple knapsack problems. European Journal
of Operational Research, 114:528–541, 1999.
[169] S.A. Plotkin, D. Shmoys, and E. Tardos. Fast approximation algorithms for fractional
packing and covering problems. Mathematics of Operations Research, 20:257–301, 1995.
[170] O. Porto, M. de Moraes, and A. Lucena. A relax and cut algorithm for the quadratic
knapsack problem. 17th International Symposium on Mathematical Programming (ISMP),
2000.
[171] R. Preis and R. Diekmann. The PARTY Partitioning Library User Guide Version 1.1.
Technical Report, University of Paderborn, tr-rsfb-96-024, 1996.
[172] R. Preis and R. Diekmann. PARTY - A Software Library for Graph Partitioning. Advances
in Computational Mechanics with Parallel and Distributed Processing, Civil-Comp Press,
pp. 63–71, 1997.
[173] S. Prestwich. A hybrid search architecture applied to hard random 3-SAT and low-
autocorrelation binary sequences. 6th International Conference on Principles and Practice
of Constraint Programming (CP), LNCS 1894:337–352, 2000.
[174] J.-F. Puget. Symmetry Breaking Revisited. 8th International Conference on Principles
and Practice of Constraint Programming (CP), LNCS 2470:446–461, 2002.
[175] T. Radzik. Fast deterministic approximation for the multicommodity flow problem. 6th
Symposium on Discrete Algorithms (SODA), pp. 486–492, 1995.
[176] T. Radzik. Experimental study of a solution method for multicommodity flow problems.
2nd Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 79–102, 2000.
[177] G. Ramalingam and T. Reps. An Incremental Algorithm for a Generalization of the
Shortest-Path Problem. Journal of Algorithms, 21(2):267–305, 1992.
[178] G. Ramalingam and T. Reps. On the computational complexity of dynamic graph prob-
lems. Theoretical Computer Science, 158(1–2):233–277, 1995.
[179] J.-C. Régin. A filtering algorithm for constraints of difference in CSPs. 12th National
Conference on Artificial Intelligence, AAAI, pp. 362–367, 1994.
[180] R. Rodosek, M. Wallace, and M.T. Hajian. A new approach to integrating mixed integer
programming and constraint logic programming. Annals of Operations Research, 86:63–
87, 1999.
[181] E. Rothberg. Using Cuts to Remove Symmetry. 17th International Symposium on Math-
ematical Programming (ISMP), 2000.
[182] R.A. Rushmeier, K.L. Hoffman, and M. Padberg. Recent advances in exact optimization
of airline scheduling problems. Technical Report, George Mason University, 1995.
[183] D.M. Ryan. The solution of massive generalized set partitioning problems in aircrew
rostering. Journal of the Operational Research Society, 43(5):459–467, 1992.
[184] M. Sato. Efficient implementation of an approximation algorithm for multicommodity
flows. Master Thesis, Graduate School of Engineering Science, Osaka University, 2000.
[185] J. Schulze and T. Fahle. A parallel algorithm for the vehicle routing problem with time
window constraints. Annals of Operations Research, 86:585–607, 1999.
[186] SCIL. Symbolic Constraints in Integer Linear Programming.
http://www.mpi-sb.mpg.de/SCIL/.
[187] M. Sellmann. An Arc-Consistency Algorithm for the Weighted All Different Constraint.
8th International Conference on Principles and Practice of Constraint Programming (CP),
LNCS 2470:744–749, 2002.
[188] M. Sellmann and T. Fahle. CP-Based Lagrangian Relaxation for a Multimedia Applica-
tion. 3rd International Workshop on Integration of AI and OR Techniques in Constraint
Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 1–14, 2001.
[189] M. Sellmann and T. Fahle. Coupling Variable Fixing Algorithms for the Automatic
Recording Problem. 9th Annual European Symposium on Algorithms (ESA), LNCS
2161:134–145, 2001.
[190] M. Sellmann and T. Fahle. Constraint Programming Based Lagrangian Relaxation for the
Automatic Recording Problem. Annals of Operations Research, to appear.
[191] M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 4th International Work-
shop on Integration of AI and OR Techniques in Constraint Programming for Combinato-
rial Optimization Problems (CP-AI-OR), pp. 191–204, 2002.
[192] M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 8th International Confer-
ence on Principles and Practice of Constraint Programming (CP), LNCS 2470:738–743,
2002.
[193] M. Sellmann, G. Kliewer, and A. Koberstein. Lagrangian Cardinality Cuts and Variable
Fixing for Capacitated Network Design. 10th Annual European Symposium on Algorithms
(ESA), LNCS 2461:845–858, 2002.
[194] M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Integrating Direct CP
Search and CP-based Column Generation for the Airline Crew Assignment Problem. 2nd
International Workshop on Integration of AI and OR Techniques in Constraint Program-
ming for Combinatorial Optimization Problems (CP-AI-OR), Paderborn Center for Parallel
Computing, Technical Report tr-001-2000:163–170, 2000.
[195] M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Crew Assignment via Con-
straint Programming: Integrating Column Generation and Heuristic Tree Search. Annals of
Operations Research, 115:207–225, 2002.
[196] B. Selman, H. Kautz, and D. McAllester. Ten Challenges in Propositional Reasoning and
Search. 14th International Joint Conference on Artificial Intelligence (IJCAI), pp. 50–54,
1997.
[197] N. Sensen. Lower Bounds and Exact Algorithms for the Graph Partitioning Problem using
Multicommodity Flows. 9th Annual European Symposium on Algorithms (ESA), LNCS
2161:391–403, 2001.
[198] F. Shahrokhi and D.W. Matula. The maximum concurrent flow problem. Journal of the
ACM, 37:318–334, 1990.
[199] F. Shahrokhi and L. Szekely. On Canonical Concurrent Flows, Crossing Number and
Graph Expansion. Combinatorics, Probability and Computing, 3:523–543, 1994.
[200] P. Shaw. Using constraint programming and local search methods to solve vehicle routing
problems. 4th International Conference on Principles and Practice of Constraint
Programming (CP), LNCS 1520:417–431, 1998.
[201] H.D. Sherali and J. Cole Smith. Improving Discrete Model Representation Via Symmetry
Considerations. 17th International Symposium on Mathematical Programming (ISMP),
2000.
[202] B. Smith. Reducing Symmetry in a Combinatorial Design Problem. 3rd International
Workshop on Integration of AI and OR Techniques in Constraint Programming for Combi-
natorial Optimization Problems (CP-AI-OR), pp. 351–360, 2001.
[203] C. Souza, R. Keunings, L.A. Wolsey, and O. Zone. A new approach to minimising the
frontwidth in finite element calculations. Computer Methods in Applied Mechanics and
Engineering, 111:323–334, 1994.
[204] V. Sridhar and J.S. Park. Benders-and-cut algorithm for fixed-charge capacitated network
design problems. European Journal of Operational Research, 125(3):622–632, 2000.
[205] P. Stamatopoulos, G. Boukeas, K. Zervoudakis, V. Stoumpos, and C. Halatsis. Parallel
CP-based direct crew rostering. ESPRIT 24960 (PARROT), University of Athens and
University of Paderborn, Deliverable D-TEC2.1, 1999.
[206] A. Steger and N.C. Wormald. Generating random regular graphs quickly. Combinatorics,
Probability and Computing, 8:377–396, 1999.
[207] M. Thorup. Undirected single source shortest paths in linear time. 38th Annual Symposium
on Foundations of Computer Science (FOCS), IEEE, pp. 12–21, 1997.
[208] TIVO. TV your way. TIVO, Inc., http://www.tivo.com/home.asp.
[209] M. Trick. A Dynamic Programming Approach for Consistency and Propagation for Knap-
sack Constraints. 3rd International Workshop on Integration of AI and OR Techniques in
Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 113–
124, 2001.
[210] UP-TV. European Research Project, IST-1999-20751. http://www.up-tv.de/.
[211] P. Van Hentenryck, Y. Deville, and C.M. Teng. A generic arc-consistency algorithm and
its specializations. Artificial Intelligence, 57:291–321, 1992.
[212] P.R.C. Villela and C.T. Bornstein. An improved bound for the 0-1 knapsack problem.
Technical Report, COPPE-Federal University of Rio de Janeiro, ES31-83, 1983.
[213] T. Walsh. Depth-bounded discrepancy search. 14th International Joint Conference on
Artificial Intelligence (IJCAI), pp. 1388–1393, 1997.
[214] C. Walshaw, M. Cross, and M. Everett. A localised algorithm for optimising unstructured
mesh partitions. International Journal of Supercomputer Applications and High Perfor-
mance Computing, 9(4):280–295, 1995.
[215] H.P. Williams. Model Building in Mathematical Programming. Wiley, 1978.
[216] G. Xue. Primal-dual algorithms for computing weight-constrained shortest paths and
weight-constrained minimum spanning trees. International Performance, Computing, and
Communications Conference (IPCCC), IEEE, pp. 271–277, 2000.
[217] N. Young. Randomized rounding without solving the linear program. 6th Symposium on
Discrete Algorithms (SODA), pp. 170–178, 1995.
[218] G. Yu (Editor). Operations Research in the Airline Industry. International Series in Oper-
ations Research and Management Science, Kluwer Academic Publishers, 1998.
[219] T.H. Yunes, A.V. Moura, and C.C. Souza. A hybrid approach for solving large crew
scheduling problems. International Workshop on Practical Aspects of Declarative Lan-
guages (PADL), LNCS 1753:293–307, 2000.