Universität Paderborn

Department of Mathematics and Computer Science

Reduction Techniques

in Constraint Programming

and Combinatorial Optimization

Dissertation

by

Meinolf Sellmann

Thesis submitted for the degree of

Doctor of Natural Sciences

Paderborn, August 2002.

For Olga.

Acknowledgments

This dissertation was written over the past four years while I was working as a research assistant in the Monien research group in the Department of Mathematics and Computer Science at the Universität Paderborn. It would not have been possible in this form without the support and help of a great many scientists, colleagues, friends, and relatives. I would like to take this opportunity to thank them.

First of all, my special and heartfelt thanks go to Prof. Dr. Burkhard Monien, who supervised this work with great scientific competence and much goodwill, both in substance and on a personal level. The wide range of topics and projects that are actively pursued at his chair offers an extraordinary insight into current research, for which I am very grateful to him. Beyond that, I have been carried through the past years by the certainty of his support, which both demands and fosters achievement as well as modesty.

For her love, her advice in difficult situations, her patience, and her faith in me, I especially thank my girlfriend Olga. Without her backing and her help, this work would not have succeeded. The same holds for my family, my parents and brothers, who always support me to the best of their ability and encourage my development.

My further thanks for the good scientific collaboration go to my colleagues in the research group, in the EU project PARROT, in the DFG Collaborative Research Center (Sonderforschungsbereich) Massive Parallelität, in the EU project UP-TV, and in the DFG priority program (Schwerpunktprogramm) Algorithmik großer und komplexer Netzwerke. Many of the results presented in this thesis arose from an intensive exchange of ideas within these research groups. I would like to mention explicitly Torsten Fahle, Dr. Warwick Harvey, Georg Kliewer, Norbert Sensen, and Kyriakos Zervoudakis.

Finally, I sincerely thank Dr. Michael Laska for his support as the project manager of the research group, and Ulrich Ahlers, Sigrid Gundelach, Marion Rohloff, and Thomas Thissen for their tireless help with organizational and technical problems.

Thank you very much!

Paderborn, August 2002. Meinolf Sellmann

Contents

1 Introduction
1.1 A World of Optimization
1.2 Modeling and Efficiency
1.3 Contribution
1.3.1 Outline and Major Results
1.3.2 Background
1.3.3 Publications

Part I — Methods

2 Optimization Constraints
2.1 Definitions and General Observations
2.1.1 On the Complexity of Cost-based Domain Filtering Problems
2.1.2 Degrees of Consistency
2.2 Shortest Path Constraints
2.2.1 Definition
2.2.2 Shortest Path Problems on DAGs
2.2.3 Shortest Path Problems on Undirected Graphs
2.2.4 Shortest Path Problems on Directed Graphs
2.2.5 Summary
2.3 Weighted Stable Set Constraints on Interval Graphs
2.3.1 The Weighted Stable Set Constraint
2.3.2 A Mathematical Programming Approach
2.3.3 Cost-based Filtering
2.4 Weighted All Different Constraints
2.4.1 The Minimum Weight All-Different Constraint
2.4.2 An Arc-Consistency Algorithm
2.5 Knapsack Constraints
2.5.1 Definition and Applications
2.5.2 Knapsack Relaxations
2.5.3 Cost-based Filtering for Knapsack Constraints
2.5.4 Experiments
2.5.5 Cost-based Filtering for Knapsack Related Problems
2.5.6 Summary

3 Cost-based Filtering and Problem Decomposition
3.1 CP-based Column Generation
3.2 CP-based Lagrangian Relaxation
3.3 Remarks and Generalizations
3.3.1 Solving the Lagrangian Dual and Impotence
3.3.2 Redundant Constraint Generation
3.3.3 Linking more than Two Optimization Constraints
3.3.4 Linear Relaxations and Cuts
3.3.5 Binary IPs
3.3.6 Column Generation vs. Lagrangian Relaxation

4 Symmetry Breaking
4.1 Symmetry Breaking by Dominance Detection
4.1.1 Efficient Realization in a Depth First Search
4.1.2 Arbitrary Search Strategies
4.1.3 A Different Representation of Choice Points
4.2 DeBruijn Graph Bisection
4.2.1 Bisection Width of the DeBruijn Graph
4.2.2 Symmetry Breaking for the Bisection of DeBruijn Graphs
4.3 The Social Golfer Problem
4.3.1 Symmetries in the Social Golfer Problem
4.3.2 Symmetry Breaking for the Social Golfer Problem
4.3.3 Numerical Results
4.4 The n-Queens Problem
4.4.1 Symmetry Breaking for the n-Queens Problem
4.4.2 Numerical Results
4.5 Summary

Part II — Applications

5 Airline Crew Assignment
5.1 The Airline Crew Assignment Problem
5.2 Two Approaches for the Crew Assignment Problem
5.2.1 CP-based Column Generation Approach
5.2.2 Heuristic Tree Search Approach
5.3 Integration
5.3.1 The Airline Test Cases
5.3.2 Transforming a Set Covering into a Set Partitioning Solution
5.3.3 Generating Combinable Columns and Exploiting Dual Values
5.4 Numerical Results
5.5 Summary

6 Automatic Recording
6.1 The Automatic Recording Problem
6.1.1 On the Complexity of the Automatic Recording Problem
6.1.2 A Mathematical Programming Formulation
6.1.3 Solving the Resulting Integer Linear Program
6.2 CP-based Lagrangian Relaxation for the ARP
6.2.1 Implementation Details
6.3 Numerical Results
6.3.1 Test Instance Generation
6.3.2 Experimental Evaluation
6.4 Summary and Future Work

7 Capacitated Network Design
7.1 The Capacitated Network Design Problem
7.1.1 State of the Art
7.2 Lagrangian Relaxation Bounds
7.2.1 Shortest Path Relaxation
7.2.2 Knapsack Relaxation
7.2.3 Subgradient Optimization
7.2.4 Variable Fixing
7.2.5 Lagrangian Cardinality Cuts
7.3 A Branch & Bound Algorithm
7.4 Numerical Results
7.4.1 Benchmark Data
7.4.2 Algorithm Variants Considered in the Experiments
7.4.3 Evaluation
7.5 Summary and Future Work

8 The Social Golfer Problem
8.1 Definition
8.2 Another SBDD-Approach for the Social Golfer Problem
8.3 Heuristic Constraint Propagation
8.3.1 Literature on the Integration of CP and Local Search
8.3.2 Horizontal Constraints
8.3.3 Vertical Constraints
8.4 Numerical Results
8.5 Summary

9 Graph Bisection
9.1 The Graph Bisection Problem
9.2 Bounds on Graph Bisection
9.3 Approximation of the VarMC-bound
9.3.1 The FPTAS
9.3.2 Implementation Details
9.4 Cost-Decomposition Approach
9.4.1 Column Generation
9.4.2 Lagrangian Relaxation Based Column Generation
9.5 A Branch & Bound Algorithm
9.6 Numerical Results
9.6.1 Approximating Lower Bounds
9.6.2 Lower Bounds using Cost-Decomposition
9.6.3 Comparison of Lower Bound Algorithms
9.7 Summary and Future Work

10 Conclusion

List of Figures
List of Tables
Bibliography

Chapter 1

Introduction

1.1 A World of Optimization

On all levels, the world we live in is full of optimization problems and processes. Many macroscopic physical and chemical phenomena can be explained by the theory that nature tries to reach a state of minimum enthalpy. Also, every competition is inherently associated with a measure of success that can be optimized. Biological life itself is based on competition, and the law of evolution favors efficiency and adaptiveness. The same principle determines our economic system, which is based on the egoistic striving of every economic subject to maximize its wealth. Therefore, to forecast natural processes and to be economically competitive, optimization problems have to be solved.

The progress that is made in algorithmic computer science can help to take up this chal-

lenge. Nowadays, most computers that are sold are used to edit texts, to manage large amounts

of data, and to provide access to the Internet. On the other hand, provided with state-of-the-art

optimization software, even a simple personal computer has the potential to become a valu-

able tool for the simulation of natural processes, efficient production and decision support in a

progressive, up-to-date company. However, the technology push is sluggish, and the solutions

that algorithmic computer science offers are only laboriously transferred into practice. Espe-

cially small and medium sized companies often cannot afford the risk of investing a considerable

amount of money into the development of company-specific software that may soon turn out to

be over-specialized and too rigid to ensure a return on investment in the perpetually changing

environments of a globalized economy.

Of course, commercial optimization software that efficiently supports a stable, non-malleable

process by solving a general optimization problem — like Job-Shop Problems, for example —

has a chance to successfully find its way into the industry. Due to highly dynamic production

conditions, the number of such applications is rather limited, though, and there is certainly a

need for flexible and also company specific optimization software. For this purpose, there are

general solvers and software libraries available. Outstanding examples are ILOG SOLVER [121],

ECLIPSE [58], ABACUS [127], LEDA [153], ILOG CPLEX [118], and MINTO [159]. However,

with respect to the wide range of applications that they potentially address, even these compa-

rably successful tools could reach a much larger market than the one that is currently serviced.

A major obstacle for a broader use of standard optimization software is a lack of expertise out-

side the scientific world. In the current situation, we observe that the more powerful a solver

is, the more knowledge is necessary to use it successfully. Therefore, to ease the handling of

optimization software, the modeling of real-world problems has to become more intuitive, and

its influence on the efficiency of the solution process must be reduced.

1.2 Modeling and Efficiency

Consider the situation in mathematical programming. The user is forced to crush a possibly

well-structured problem into a set of very basic linear and integer constraints. If this modeling

process is carried out carefully, standard mixed integer program solvers like CPLEX can tackle

many problems with an astounding efficiency. However, to set up an efficiently solvable integer

program is an art: it requires much experience and is not at all an easy task [215]. In constraint

programming, the situation with respect to problem modeling is much more comfortable. Nowa-

days constraint programming solvers offer sets of predefined, so-called symbolic constraints that

reflect the user’s intuition much better than linear programs. However, even though the modeling

is easier, due to the loose connection of constraints, the optimization abilities of constraint pro-

gramming solvers are much more limited than those based on mathematical programming. To

a large extent, the lack of efficiency is caused by the fact that unfavorable regions of the search

space are being explored unnecessarily, which could be avoided by using a tight global bound on

the objective.

In order to overcome the obstacle of complicated problem modeling in mathematical programming, work is in progress that tries to provide the user with higher-level building blocks

that can be used to describe a problem. The SCIL-library [186] for branch-and-cut-and-price

approaches is an excellent example of this attempt. Using a description language that provides

constructs in the style of symbolic constraints, the user can set up a problem model. Not only

does this make the modeling process easier, but on top of that, the solver is no longer provided

with only a set of dis-aggregated linear constraints, but is made aware of the basic substruc-

tures of a problem. Therefore, it is able to exploit specific knowledge about these substructures,

for example by adding global cuts to a problem that are valid for one of the polytopes that are

intersected. This may improve the quality of problem relaxations.

On the other hand, to improve the optimization abilities of constraint programming solvers,

there is a remarkable effort visible that tries to incorporate the merits of mathematical program-

ming into constraint programming machineries. The ILOG CONCERT technology, for exam-

ple, combines ILOG CPLEX and ILOG SOLVER. On a broader scale, a considerable number

of researchers work on the integration of methods from operations research into constraint pro-

gramming. Particularly, since 1999, the International Workshop on Integration of AI and OR

Techniques in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR)

has become a major forum for the exchange of ideas between the two research areas. Even

though combinatorial optimization and constraint programming cover a widespread area of re-

search topics and go far beyond simple enumeration techniques, tree search has been identified

and used as a key juncture between the two fields.

In combinatorial optimization, following the branch & bound paradigm, tree search is used

to solve hard optimization problems exactly, a task that consists of computing a feasible solution

and proving its optimality. Upper and lower bounding routines form the beating heart of every

branch & bound algorithm: primal heuristics or approximation algorithms are used to find near-

optimal solutions quickly, while relaxations overestimate the best performance of any solution

that can be found in a given sub-tree. Of course, if that estimate is worse than the performance

of the best known, so-called incumbent solution, the corresponding sub-tree does not need to be

explored any further and can be pruned.
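The pruning rule described above can be illustrated with a minimal sketch. The choice of the 0/1 knapsack problem, the function names `fractional_bound` and `knapsack_branch_and_bound`, and the data layout are assumptions made for illustration only; they are not taken from the thesis. The sketch assumes non-negative integer profits and positive integer weights, and it uses the fractional (linear) relaxation as the overestimating bound of a maximization problem.

```python
from fractions import Fraction
from typing import List, Tuple

Item = Tuple[int, int]  # (profit, weight) with weight > 0; hypothetical layout


def fractional_bound(items: List[Item], start: int, capacity: int, value: int) -> Fraction:
    """Relaxation: overestimate the best value reachable from a search node.

    Items from position `start` on are packed greedily in ratio order; the
    first item that does not fit is added fractionally.
    """
    bound = Fraction(value)
    for profit, weight in items[start:]:
        if weight <= capacity:
            capacity -= weight
            bound += profit
        else:
            bound += Fraction(profit * capacity, weight)
            break
    return bound


def knapsack_branch_and_bound(items: List[Item], capacity: int) -> Tuple[int, int]:
    """Return (optimal value, number of search nodes visited)."""
    # The greedy bound is only valid if items are sorted by profit/weight ratio.
    items = sorted(items, key=lambda it: Fraction(it[0], it[1]), reverse=True)
    best = 0   # value of the incumbent solution
    nodes = 0

    def search(index: int, remaining: int, value: int) -> None:
        nonlocal best, nodes
        nodes += 1
        best = max(best, value)  # every partial selection is itself feasible
        if index == len(items):
            return
        # Prune: this sub-tree cannot contain a solution better than the incumbent.
        if fractional_bound(items, index, remaining, value) <= best:
            return
        profit, weight = items[index]
        if weight <= remaining:
            search(index + 1, remaining - weight, value + profit)  # take the item
        search(index + 1, remaining, value)                        # skip the item

    search(0, capacity, 0)
    return best, nodes
```

For the items (60, 10), (100, 20), (120, 30) and capacity 50, the sketch returns the optimal value 220. The exact rational arithmetic is a design choice that keeps the pruning test free of rounding errors.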

On the other hand, in constraint programming, tree search is used to overcome the incom-

pleteness of a pure inference calculus that achieves a state of local consistency only. In every

search node, the finite domains of the variables of the CP-model are reduced with respect to the

model’s constraints; a process called constraint propagation. Sequentially, constraints are propa-

gated until a state of the domains is reached that achieves the desired degree of consistency. Only

then, a case distinction takes place and a branching step is carried out. This way, constraints in-

teract only via the domains of their variables, which makes the approach extremely flexible with

respect to the addition or removal of constraints.
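The propagation loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: domains are plain Python sets, the only constraint type is a hypothetical `less_than` propagator, and constraints are re-run in a fixed order until no domain changes. Real solvers use more refined scheduling; none of the names below come from the thesis or from a particular solver.

```python
from typing import Callable, Dict, List, Set

Domains = Dict[str, Set[int]]
Propagator = Callable[[Domains], bool]  # returns True if it removed a value


def less_than(a: str, b: str) -> Propagator:
    """Build a filtering routine for the constraint a < b (illustrative)."""
    def propagate(domains: Domains) -> bool:
        if not domains[a] or not domains[b]:
            return False
        # Keep only values of a that lie below some value of b, and vice versa.
        new_a = {v for v in domains[a] if v < max(domains[b])}
        new_b = {v for v in domains[b] if v > min(new_a)} if new_a else set()
        changed = new_a != domains[a] or new_b != domains[b]
        domains[a], domains[b] = new_a, new_b
        return changed
    return propagate


def propagate_to_fixpoint(domains: Domains, propagators: List[Propagator]) -> bool:
    """Run all propagators sequentially until no domain changes.

    Returns False as soon as a domain becomes empty (the node is infeasible),
    and True once a fixpoint with non-empty domains is reached.
    """
    changed = True
    while changed:
        changed = False
        for propagate in propagators:
            if propagate(domains):
                changed = True
            if any(not values for values in domains.values()):
                return False
    return True
```

With the domains x, y, z = {1, 2, 3} and the constraints x < y and y < z, the loop reduces the domains to x = {1}, y = {2}, z = {3}. Note that the two constraints never refer to each other; they communicate only through the shared domains, which is what makes adding or removing a constraint a local change.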

1.3 Contribution

In this thesis, we develop reduction techniques for combinatorial optimization and constraint

satisfaction problems that can be embedded in a tree search approach. In combinatorial opti-

mization, bound estimates and variable fixing algorithms are commonly used for that purpose,

whereas in constraint programming, filtering algorithms undertake the task of shrinking the search

space by eliminating values from variable domains. The algorithms that we develop are meant

to be used as symbolic constraints in constraint programming solvers and also in optimization

software that provides high level description constructs like the SCIL-library. Therefore, we con-

sider the work done in this thesis as a contribution toward the development of efficient and easy

to use optimization software.

Note that, when solving discrete optimization problems to optimality, there are really two

tasks to be considered. First, an optimal solution must be constructed, and second, its optimality

must be proven. Optimal or at least near-optimal solutions can often be found quickly by heuris-

tics or by approximation algorithms, both specially tailored for the given problem. In contrast to

the construction of a high quality solution, the algorithmic optimality proof requires the investi-

gation of the entire search space, which in general is much harder than to partly explore the most

promising regions only. By eliminating parts of the search space that do not contain improving

solutions, problem reduction can help with respect to both aspects of discrete optimization.

1.3.1 Outline and Major Results

The thesis is organized in two parts. Part I consists of the Chapters 2–4 and is method oriented.

This means that the task of achieving a certain degree of consistency for some special filtering

problems is studied theoretically. The efficiency of the algorithms developed is measured in

terms of worst case complexity and the degree of consistency that they achieve.

The first type of reduction algorithm that we develop involves a special kind of symbolic

constraint: In Chapter 2, our goal is to provide a set of so-called optimization constraints. By

linking the objective function with the constraint structure of a problem, such constraints can be

used for pruning and problem reduction with respect to cost considerations, a process called cost-

based filtering. That way, optimization constraints naturally combine the optimization abilities

of operations research algorithms and the efficient modeling and filtering concepts of constraint

programming.

Particularly, we study the problem of achieving different degrees of consistency for opti-

mization constraints. Since achieving a state of hyper-arc-consistency may turn out to be an

NP-hard problem itself, we introduce a new type of consistency for optimization constraints,

so-called relaxed consistency. Based on the two concepts, we develop efficient cost-based fil-

tering algorithms for shortest path constraints (on directed acyclic graphs, undirected graphs

with non-negative edge weights, and directed graphs without negative cycles), weighted stable

set constraints in interval graphs, weighted all-different constraints, and knapsack constraints.

These constraints are supposed to be used as basic building blocks when modeling real-life dis-

crete optimization problems. By exploiting the knowledge of the given constraint structures, the

corresponding reduction algorithms make use of previously developed bounds and the efficient

ways known to compute them.
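To make the idea of cost-based filtering concrete, the following minimal sketch treats the simplest of the cases listed above, a shortest path constraint on a directed acyclic graph. It is the textbook forward and backward distance computation and should not be read as the algorithm developed in Chapter 2. The function name `filter_dag_edges`, the edge representation, and the assumption that nodes are numbered in topological order (every edge goes from a smaller to a larger index) are illustrative. An edge is kept only if some source-to-sink path through it has cost at most the given bound.

```python
import math
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (tail, head, weight), assuming tail < head


def filter_dag_edges(n: int, edges: List[Edge], source: int, sink: int,
                     bound: float) -> List[Edge]:
    """Remove every edge that cannot lie on a source-sink path of cost <= bound."""
    from_source = [math.inf] * n
    to_sink = [math.inf] * n
    from_source[source] = 0.0
    to_sink[sink] = 0.0

    # Forward pass: edges sorted by tail, so the distance of a tail is final
    # before any of its outgoing edges is relaxed.
    for tail, head, weight in sorted(edges):
        from_source[head] = min(from_source[head], from_source[tail] + weight)

    # Backward pass: edges sorted by decreasing head, so the distance of a
    # head to the sink is final before any of its incoming edges is relaxed.
    for tail, head, weight in sorted(edges, key=lambda e: e[1], reverse=True):
        to_sink[tail] = min(to_sink[tail], weight + to_sink[head])

    # The cheapest path through edge (tail, head) costs
    # from_source[tail] + weight + to_sink[head].
    return [e for e in edges
            if from_source[e[0]] + e[2] + to_sink[e[1]] <= bound]
```

For example, with the edges (0, 1, 1), (1, 3, 1), (0, 2, 2), (2, 3, 2), (1, 2, 5), source 0, sink 3, and bound 4, only the edge (1, 2, 5) is removed, because the cheapest path through it costs 8. Both passes are linear in the number of edges after sorting.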

As we shall see, the loose connection of optimization constraints via variable domains results

in less effective and thus also less efficient problem reduction. Therefore, in Chapter 3, we

present a theory that motivates the linking of optimization constraints via the standard operations

research decomposition techniques column generation and Lagrangian relaxation.

Then, a second type of reduction algorithm is developed that bases its decisions on the con-

straint structure of a problem rather than on cost considerations. Obviously, a search node does

not need to be expanded if it represents a previously considered configuration. However, this

situation occurs frequently when tackling problems that contain symmetry. In Chapter 4, we

present a general symmetry breaking method called SBDD that is based on dominance detection

between choice points. An experimental evaluation shows that the method is better suited to

tackle highly symmetric problems than previously developed symmetry breaking techniques.

Part II of the thesis covers the Chapters 5–9 and is application oriented. Several combinato-

rial optimization and constraint satisfaction problems are investigated. The approaches that we

develop are based on the algorithms and methods from Part I. This allows an empirical evaluation

of the previously developed reduction algorithms on top of the theoretical work done in the first

part.

In particular, we consider the Airline Crew Assignment Problem in Chapter 5. The approach

presented is based on the concept of CP-based column generation in combination with shortest

path constraints. By exploiting CP and OR specific advantages, we are able to speed up the

computation of real-world airline crew schedules considerably. The ideas that we present have

been integrated in an industrial airline crew assignment software system and have yielded drastic

savings in running time.

In Chapter 6, we study the Automatic Recording Problem, which arises in the context of modern multimedia applications. After giving an approximation scheme for the NP-hard problem, an

exact algorithmic approach is presented that links knapsack constraints and weighted stable set

constraints on interval graphs following the idea of CP-based Lagrangian relaxation. Numerical

results show that our implementation is efficient enough to tackle real-size problem instances in

an amount of time that is well affordable in practice.

The Capacitated Network Design Problem is tackled in Chapter 7. Lower bounds can be

computed by decomposing the problem. We review previously developed reduction techniques

and use CP-based Lagrangian relaxation to link them together. Moreover, a new technique is

presented that adds locally valid cuts based on a Lagrangian relaxation of the problem. In our

experiments, we show that a heuristic version of our potentially exact solver is able to provide

solutions of higher quality in less time than the best heuristic techniques known so far.

A new approach for the Social Golfer Problem is developed in Chapter 8. Using SBDD for

symmetry breaking and the new idea of heuristic constraint propagation, we are able to solve

problems that were previously out of reach for solvers based on constraint programming.

Finally, in Chapter 9, we develop a solver for the Graph Bisection Problem. The core of the

algorithm is a lower bounding procedure that approximates maximum multicommodity flows

with multiple sinks. Comparisons with a previously developed bound based on semi-definite

programming show the gains in quality and computation time on sparse, structured graphs. In particular, our implementation is the first to compute the bisection widths of DeBruijn 9, Shuffle-Exchange 9, and Shuffle-Exchange 10.

1.3.2 Background

To a large extent, the thesis is self-contained. However, we assume that the reader is famil-

iar with the basic concepts of algorithm and complexity theory (dynamic programming, NP-

completeness, approximation schemes, etc.), operations research (linear programming, Lagran-

gian relaxation, column generation, etc.), and constraint programming (logic programming,

hyper-arc-consistency, constraint propagation, etc.). For introductions, we refer the reader to:



Algorithm and Complexity Theory
– Cormen, Leiserson, and Rivest: Introduction to Algorithms [43].
– Garey and Johnson: Computers and Intractability [88].
– Hochbaum: Approximation Algorithms for NP-hard Problems [110].

Operations Research
– Nemhauser and Wolsey: Integer and Combinatorial Optimization [158].
– Ahuja, Magnanti, and Orlin: Network Flows [1].
– Jünger and Naddef: Computational Combinatorial Optimization [126].

Constraint Programming
– Marriott and Stuckey: Programming with Constraints: An Introduction [145].
– Kumar: Algorithms for Constraint-Satisfaction Problems: A Survey [138].
– Apt: The Rough Guide to Constraint Propagation [6].

1.3.3 Publications

Many parts of the work presented here have been published at several workshops and conferences. In case of multiple authors, the results were achieved in a joint effort of the collaborating researchers. An alphabetical ordering of the list of authors indicates that all researchers contributed equally (this applies to Section 2.5 and Chapters 4 and 9), whereas a deviation from the alphabetical ordering denotes that the first-mentioned authors contributed significantly more than the other authors (Chapters 3 and 5–8).

Unreviewed Workshops



U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework for

Constraint Programming Based Column Generation. 16th International Joint Conference

on Artificial Intelligence (IJCAI), Workshop on Non-Binary Constraints, 1999.



T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint Prop-

agation for Complex Column Generation Subproblems. 17th International Symposium on

Mathematical Programming (ISMP), 2000.



G. Kliewer, M. Sellmann, and A. Koberstein. Solving the capacitated network design

problem in parallel. 3rd meeting of the PAREO Euro working group on Parallel Processing

in Operations Research (PAREO), 2002.

Reviewed Workshops



T. Fahle and M. Sellmann. Constraint Programming Based Column Generation with Knapsack Subproblems. 2nd International Workshop on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Paderborn Center for Parallel Computing, Technical Report tr-001-2000:33–44, 2000.



M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Integrating Direct CP

Search and CP-based Column Generation for the Airline Crew Assignment Problem. 2nd

International Workshop on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Paderborn Center for Parallel

Computing, Technical Report tr-001-2000:163–170, 2000.



M. Sellmann and T. Fahle. CP-Based Lagrangian Relaxation for a Multimedia Applica-

tion. 3rd International Workshop on Integration of AI and OR Techniques in Constraint

Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 1–14, 2001.



M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 4th International Work-

shop on Integration of AI and OR Techniques in Constraint Programming for Combinato-

rial Optimization Problems (CP-AI-OR), pp. 191–204, 2002.

International Conferences



U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework

for Constraint Programming Based Column Generation. 5th International Conference on

Principles and Practice of Constraint Programming (CP), LNCS 1713:261–274, 1999.



M. Sellmann and T. Fahle. Coupling Variable Fixing Algorithms for the Automatic Recording Problem. 9th Annual European Symposium on Algorithms (ESA), LNCS 2161:134–145, 2001.



T. Fahle, S. Schamberger, and M. Sellmann. Symmetry Breaking. 7th International Conference on Principles and Practice of Constraint Programming (CP), LNCS 2239:93–107, 2001.



M. Sellmann, G. Kliewer, and A. Koberstein. Lagrangian Cardinality Cuts and Variable

Fixing for Capacitated Network Design. 10th Annual European Symposium on Algorithms

(ESA), LNCS 2461:845–858, 2002.



M. Sellmann. An Arc-Consistency Algorithm for the Weighted All Different Constraint.

8th International Conference on Principles and Practice of Constraint Programming (CP),

LNCS 2470:744–749, 2002.



M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 8th International Confer-

ence on Principles and Practice of Constraint Programming (CP), LNCS 2470:738–743,

2002.

Journals



T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint pro-

gramming based column generation for crew assignment. Journal of Heuristics, 8(1):59–

81, 2002.



M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Crew Assignment via Con-

straint Programming: Integrating Column Generation and Heuristic Tree Search. Annals

of Operations Research, 115:207–225, 2002.



T. Fahle and M. Sellmann. Cost-Based Filtering for the Constrained Knapsack Problem.

Annals of Operations Research, 115:73–93, 2002.



M. Sellmann and T. Fahle. Constraint Programming Based Lagrangian Relaxation for the

Automatic Recording Problem. Annals of Operations Research, to appear.

PART I

— Methods —

In the first part of this thesis, we introduce general purpose methods for pruning and filtering

with respect to cost considerations and symmetry.

In particular, we develop a tool box of cost-based filtering algorithms in Chapter 2. We

introduce the notion of relaxed consistency and study the complexity of achieving different levels

of consistency for shortest path constraints, weighted stable set constraints on interval graphs,

weighted all-different constraints, and knapsack constraints.

Then, in Chapter 3, we investigate how the interplay of optimization constraints can be im-

proved with the help of the standard problem decomposition techniques column generation and

Lagrangian relaxation.

Finally, in Chapter 4, we develop a straightforward symmetry breaking method that is based

on the detection of dominance relations between choice points. The method is applied to three

different problems and is shown to be particularly suited for highly symmetric problems.

Chapter 2

Optimization Constraints

In this chapter, we develop a “tool box” of domain filtering algorithms that can be used for

solving combinatorial optimization problems. While concentrating on some specific problems,

it is important to keep in mind that the domain filtering algorithms we develop are to be used as

building blocks for tackling more complex optimization problems. As a matter of fact, for the

problems considered here, domain filtering usually only makes sense in the presence of additional

constraints. Different methods of how cost-based filtering algorithms can be exploited in the

context of more general optimization problems will be presented in Chapter 3.

2.1 Definitions and General Observations

Within a tree search, during the course of optimization, we compute a sequence of feasible solu-

tions. We refer to the best known feasible solution as the incumbent solution. Obviously, once

we have found a solution of a certain quality, we are searching for better solutions only. Thus, we

impose a restriction on the objective. That restriction, in combination with other side-constraints

of the original problem, forms an optimization constraint [65, 79, 80, 128, 162, 189], which is

the core concept that we will be using throughout this chapter. It was developed by a community

that has been working on the integration of constraint programming (CP) and operations research

(OR) in recent years. In the OR world, though never explicitly stated as constraints, optimization

constraints are frequently used for bound computations and variable fixing. From a CP perspec-

tive, they can be viewed as global constraints that link the objective with some other constraints

of the problem:

Given $n \in \mathbb{N}$, let $X_1, \dots, X_n$ denote some variables with finite domains $D_1 := D(X_1), \dots, D_n := D(X_n)$. Furthermore, given a constraint $\zeta : D_1 \times \dots \times D_n \to \{0, 1\}$ and an objective function $Z : D_1 \times \dots \times D_n \to \mathbb{Q}$, let $x_i \in D_i$ for all $1 \le i \le n$.

Chapter 2. Optimization Constraints

Definition 2.1 Let $B \in \mathbb{Q}$ denote an upper (lower) bound on the objective $Z$ to be minimized (maximized).

• $\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$ with $\vartheta_\zeta[Z, B](x_1, \dots, x_n) = 1$ iff $\zeta(x_1, \dots, x_n) = 1$ and $Z(x_1, \dots, x_n) < B$ is called minimization constraint.

• $\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$ with $\vartheta_\zeta[Z, B](x_1, \dots, x_n) = 1$ iff $\zeta(x_1, \dots, x_n) = 1$ and $Z(x_1, \dots, x_n) > B$ is called maximization constraint.



A minimization or maximization constraint is also called an optimization constraint.

The purpose of optimization constraints is twofold: first, they can be used for pruning by

computing an upper/lower bound on the objective, which is the common idea in branch & bound

algorithms. Second, they may also be used to remove those values from variable domains that

cannot be part of any improving solution, which may be viewed as a generalization of the variable

fixing technique: for binary problems, variable fixing and domain filtering are essentially the

same.

2.1.1 On the Complexity of Cost-based Domain Filtering Problems

In order to achieve a state of (hyper-)arc-consistency [6, 138]¹ of an optimization constraint, we
have to find and remove all assignments that cannot be extended to an improving solution that
is feasible with respect to ζ. That is, if ζ is the only constraint of a combinatorial optimization

problem (we call that optimization problem and the optimization constraint corresponding to

or associated with each other), an arc-consistency algorithm allows us to compute improving

solutions in a backtrack-free search. Consequently, if the original problem is NP-hard, so is

the problem of achieving arc-consistency of the corresponding optimization constraint. The

Knapsack Problem is an example for such a situation.

If the optimization problem associated with an optimization constraint is polynomial, then the

arc-consistency problem may also be polynomial. The Weighted Bipartite Matching Problem is

an example, because there exists a polynomial time algorithm for the problem and the removal

of an edge or two nodes (when the edge between the nodes is chosen to be part of the matching)

does not change the structure of the problem.

The situation may change, however, if the problem structure is not preserved when a variable

is forced to take a specific value. Consider a Shortest Path Problem in an arbitrary network,

where we use a binary variable for each edge. The problem of finding a shortest path is of

course solvable in polynomial time. However, if we are to compute the set of edges that must or
cannot be part of any simple path that does not exceed a certain length, we are facing an NP-hard
problem: it is easy to see that the Two Vertex Disjoint Paths Problem [83] can be reduced to this
problem.

¹ To be precise: here and in the remainder of this thesis, we consider the problem of achieving a state of
hyper-arc-consistency. However, for historical reasons and also to improve readability, we simply write
arc-consistency when we refer to a state of local consistency with respect to one constraint only.

2.1.2 Degrees of Consistency

The discussion shows that we cannot always hope for a cost-based domain filtering algorithm that
achieves arc-consistency. Therefore, we may consider developing less effective but polynomial
time bounded filtering algorithms that may only achieve a weaker degree of consistency. Note
that, in a different context, the idea of weaker forms of consistency gives rise to the notion of
bound consistency, which can also be achieved more easily than general arc-consistency and
has proven valuable for many applications.

Regarding cost-based filtering, an idea that has been developed in OR to perform variable

fixing on integer linear problems is the reduced-cost filtering method: when solving the contin-

uous relaxation bound on a linear combinatorial optimization problem with the help of a general

LP solver (such as the simplex algorithm or interior point methods), we get dual information

and reduced-cost data for free. That data can be used to compute a lower bound on the loss of

performance that we have to accept when adding a new constraint of the form $X = x$ (usually this
is done by performing one dual simplex re-optimization step). Of course, if the loss is too large,
we can deduce that $x$ must be removed from the domain of $X$.
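As a minimal illustration of reduced-cost filtering, the following Python sketch treats a minimization problem over binary variables. It assumes that an LP solver has already supplied the relaxation value, the LP solution, and the reduced costs; the function name `reduced_cost_filter` and its parameters are hypothetical and not part of any solver API.

```python
def reduced_cost_filter(lp_bound, reduced_costs, lp_values, incumbent):
    """Return {variable index: value to remove} for a 0/1 minimization problem.

    lp_bound      -- optimal value of the continuous relaxation
    reduced_costs -- reduced cost of each variable at the LP optimum
    lp_values     -- LP value of each variable
    incumbent     -- bound B that every improving solution must beat
    """
    removals = {}
    for i, (rc, val) in enumerate(zip(reduced_costs, lp_values)):
        # Variable sits at 0: forcing it to 1 costs at least rc.
        if val == 0 and rc > 0 and lp_bound + rc >= incumbent:
            removals[i] = 1
        # Variable sits at 1: forcing it to 0 costs at least -rc.
        elif val == 1 and rc < 0 and lp_bound - rc >= incumbent:
            removals[i] = 0
    return removals
```

Variables with a fractional LP value are left untouched here, since their reduced cost carries no bound information in this simple form.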

We strengthen and generalize the basic idea by coupling optimization constraints and relax-

ations:

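Definition 2.2 Given an optimization constraint $\vartheta_\zeta[Z, B] : D_1 \times \dots \times D_n \to \{0, 1\}$, let $\Delta := D_1 \times \dots \times D_n$. Furthermore, denote the set of all subsets of $\Delta$ by $2^\Delta$.

• Let $\vartheta_\zeta[Z, B]$ be a minimization constraint, and let $L : 2^\Delta \to \mathbb{Q} \cup \{-\infty, \infty\}$ such that for all $M_i \subseteq D_i$, $1 \le i \le n$,

$$L(M_1 \times \dots \times M_n) \le \min\{Z(x) \mid x \in M_1 \times \dots \times M_n,\ \zeta(x) = 1\},$$

where $\min \emptyset = \infty$. We call $L$ a relaxation of $\vartheta_\zeta[Z, B]$ and say that $\vartheta_\zeta[Z, B]$ is relaxed $L$-consistent, iff for any given $1 \le i \le n$ and $x_i \in D_i$, $L(D_1 \times \dots \times D_{i-1} \times \{x_i\} \times D_{i+1} \times \dots \times D_n) < B$.

• Analogously, let $\vartheta_\zeta[Z, B]$ be a maximization constraint, and let $U : 2^\Delta \to \mathbb{Q} \cup \{-\infty, \infty\}$ such that for all $M_i \subseteq D_i$, $1 \le i \le n$,

$$U(M_1 \times \dots \times M_n) \ge \max\{Z(x) \mid x \in M_1 \times \dots \times M_n,\ \zeta(x) = 1\},$$

where $\max \emptyset = -\infty$. We call $U$ a relaxation of $\vartheta_\zeta[Z, B]$ and say that $\vartheta_\zeta[Z, B]$ is relaxed $U$-consistent, iff for any given $1 \le i \le n$ and $x_i \in D_i$, $U(D_1 \times \dots \times D_{i-1} \times \{x_i\} \times D_{i+1} \times \dots \times D_n) > B$.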


As one would expect, the definition states that relaxed $L$-consistency (relaxed $U$-consistency
follows analogously) is the easier to achieve the weaker the relaxation $L$ is. For $L \equiv -\infty$,
there is no work to do to achieve relaxed $L$-consistency, whereas arc-consistency is enforced
when $L(M_1 \times \dots \times M_n) = \min\{Z(x) \mid x \in M_1 \times \dots \times M_n,\ \zeta(x) = 1\}$. That is, the
choice of $L$ determines the degree of domain filtering.

In practice, $L$ is usually chosen as a fairly tight bound that can still be computed quickly.
Generally, within a tree search there is a trade-off between the time spent per search node and the
total number of search nodes. Thus, the favorable choice of the accuracy of the relaxation always
depends on the optimization problem at hand. Note that the definition of relaxed consistency
allows us to compare domain filtering algorithms with respect to the running time and the degree of
consistency they achieve.
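To make the role of the relaxation concrete, the following Python sketch filters domains for a minimization constraint with respect to an arbitrary bound function. It is only an illustration under stated assumptions: the names `filter_relaxed` and `exact_bound` are hypothetical, domains are small finite sets, and the exact relaxation is computed by exhaustive enumeration, which is exponential in general.

```python
from itertools import product


def filter_relaxed(domains, lower_bound, B):
    """Shrink the domains until every remaining value x_i satisfies
    lower_bound(D_1, ..., {x_i}, ..., D_n) < B."""
    domains = [set(d) for d in domains]
    changed = True
    while changed:
        changed = False
        for i in range(len(domains)):
            for x in sorted(domains[i]):  # sorted() copies, so discarding is safe
                restricted = domains[:i] + [{x}] + domains[i + 1:]
                if lower_bound(restricted) >= B:
                    domains[i].discard(x)
                    changed = True
    return domains


def exact_bound(feasible, objective):
    """Tightest possible relaxation: the true minimum, found by brute force."""
    def bound(doms):
        values = [objective(x) for x in product(*doms) if feasible(x)]
        return min(values, default=float("inf"))  # min over the empty set is infinity
    return bound


# Example: minimize 5*x1 + 4*x2 + 3*x3 subject to 3*x1 + 4*x2 + 2*x3 >= 5, with B = 8.
cover = lambda x: 3 * x[0] + 4 * x[1] + 2 * x[2] >= 5
value = lambda x: 5 * x[0] + 4 * x[1] + 3 * x[2]
tight = filter_relaxed([{0, 1}] * 3, exact_bound(cover, value), 8)
loose = filter_relaxed([{0, 1}] * 3, lambda doms: float("-inf"), 8)
```

With the exact bound, the only improving solution (0, 1, 1) fixes all three variables, whereas the trivial relaxation that always returns minus infinity removes nothing.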

In the following, we develop cost-based domain filtering algorithms for shortest path con-

straints, weighted stable set constraints on interval graphs, weighted all-different constraints,

and knapsack constraints.

We have already seen that achieving arc-consistency for knapsack or shortest path constraints
is an NP-hard task. Therefore, the challenge is to develop efficient filtering algorithms for the two
constraints that achieve relaxed consistency for favorably strong relaxations.

On the other hand, for weighted stable set constraints on interval graphs and weighted all-

different constraints, arc-consistency can be achieved in polynomial time. Thus, we aim at developing
filtering methods for these constraints that achieve arc-consistency and run faster than in
time $O(nT)$, where $n$ denotes the number of (binary) variables, and $T$ is the time needed to solve
the corresponding optimization problem. That time bound can obviously be achieved by probing
all variable values in a brute force manner. As we will see, we can achieve arc-consistency for
both constraints in time $O(T)$, i.e. the same time that is needed to compute the bound on the
objective.


2.2 Shortest Path Constraints

Many real-world problems, e.g. in personnel scheduling and transportation planning, can be

modeled naturally as Constrained Shortest Path Problems (CSPs), i.e., as Shortest Path Problems

with additional constraints. A well-studied problem in this class is the Resource Constrained

Shortest Path Problem. Reduction techniques are vital ingredients of solvers for the CSP, which is

frequently NP-hard, depending on the nature of the additional constraints. Viewed as heuristics,

until today these techniques have not been studied theoretically with respect to their efficiency,

i.e., with respect to the relation of filtering power and running time. Using core concepts of

constraint programming and the notion of relaxed consistency, we provide a sound theoretical

study of cost-based filtering for shortest path constraints on acyclic, on undirected and on directed

graphs that must not contain negative cycles.

Real-world problems can frequently be modeled as Shortest Path Problems with additional

constraints. The best known Constrained Shortest Path Problem (CSP) is probably the Resource

Constrained Shortest Path Problem [4, 17, 57, 103, 125] that consists in the combination of a

Shortest Path Problem and capacity constraints on a set of resources. Even on DAGs, for non-negative
objective functions and for only one resource that problem is known to be NP-hard [88].

Standard applications for the Resource Constrained Shortest Path Problem are route planning

in traffic networks and quality of service routing [161, 216]. The Crew Scheduling Problem is

another example of a real-world problem where CSPs are used in many successful approaches:

In a column generation process, CSPs have to be solved to generate columns, which correspond

to individual lines of work in this context [219]. In Chapter 5, we present an example of such a

crew scheduling approach based on column generation with shortest path sub-problems.

Generally, CSPs appear very often as sub-problems in column generation approaches. Exam-

ples range from route guidance [123] and duty scheduling in public transit [25] up to the schedul-

ing of switching engines [142]. In Section 3.1, a general framework for constraint programming

based column generation is developed that formalizes the use of optimization constraints in this

context.

To solve Constrained Shortest Path Problems, state of the art solvers compute lower and

upper bounds on the problem and then close the duality gap. The latter task is carried out by an

enumeration procedure such as a tree search [17], dynamic programming [154] or a k-shortest

path algorithm [103]. Particularly in a tree search, but also in the other approaches the tightening

of (sub-)problems is vital for an effective gap closing procedure. Therefore, it is essential for the

overall performance and the practical success of the entire approach.

The first tightening strategy that was proposed goes back to a work done by Aneja, Aggarwal,

and Nair [4] for problem reduction of the Resource Constrained Shortest Path Problem. The basic

idea consists in identifying nodes and arcs that cannot be visited by any path that obeys the given

resource restrictions. The same method can also be used to identify nodes and arcs that cannot be


visited by any improving path, which gives a first cost-based filtering algorithm for the problem.

Dumitrescu and Boland [57] proposed a repeated problem reduction procedure that has shown

to be very successful for hard constrained problems. Beasley and Christofides [17] have shown

how a tighter global, Lagrangian relaxation based bound can be used for the elimination of nodes

and arcs.

Apparently, none of these heuristics has been classified with respect to its filtering abilities.

Moreover, the reduction techniques used all focus on the removal of nodes and arcs, but those

arcs and nodes that must be visited by all paths of a certain quality remain undetected. However,

with respect to the additional constraints of the CSP this information can be very valuable as it

may prove useful for an additional reduction of the problem.

Constraint programming theory provides means for characterizing the level of consistency that a constraint

filtering algorithm achieves. Using the extended notion of relaxed consistency for optimization

constraints, we are able to measure a non-exact filtering algorithm not only with respect to its

running time, but also to its filtering power that is determined by the quality of the relaxation

used. With respect to shortest path constraints, we study the complexity of achieving a state of

(hyper-)arc-consistency. Since the problem is NP-hard in the general case, we introduce short-

est path relaxations and develop and compare different filtering algorithms for different graph

classes.

In Section 2.2.2, we develop an efficient linear time filtering algorithm for shortest path

constraints on DAGs. In Section 2.2.3, we investigate the problem on undirected graphs, where

it is shown to be NP-hard. We introduce a shortest path relaxation $L_1$ and formulate a linear time
algorithm that achieves a state of relaxed $L_1$-consistency. Finally, in Section 2.2.4, we develop

cost-based filtering algorithms for shortest path constraints on general directed networks with

non-negative costs or graphs that at least do not contain negative weight cycles.

2.2.1 Definition

Definition 2.3 Denote a weighted (directed or undirected) graph by $G = (V, E, c)$, and let $h \in \mathbb{N}$.

• A sequence of nodes $P = (i_1, \dots, i_h) \in V^h$ with $(i_j, i_{j+1}) \in E$ for all $1 \le j < h$ is called a path from $i_1$ to $i_h$ in $G$.

• A path $P$ is called simple iff $P$ visits every node at most once. For all $i, j \in V$, denote the set of all simple paths from $i$ to $j$ by $\pi(i, j)$.

• For all paths $P$, nodes $i \in V$ and edges $(r, s) \in E$, we write $i \in P$ or $(r, s) \in P$ iff $P$ visits node $i$ or the edge $(r, s)$, respectively. For a set of nodes or edges $S$, we write $S \subseteq P$ iff $s \in P$ for all $s \in S$. Correspondingly, we write $P \subseteq S$ iff $s \in S$ for all $s \in P$.

• The cost of a path $P = (i_1, \dots, i_h)$ is defined as $\mathrm{cost}(P) := \sum_{1 \le j < h} c_{i_j i_{j+1}}$. Accordingly, for any set $S \subseteq E$, we define $\mathrm{cost}(S) := \sum_{(i,j) \in S} c_{ij}$.

Definition 2.4 Let $G = (V, E, c)$ denote a (directed or undirected) graph with $n = |V|$ and $m = |E|$, a designated source $v_1 \in V$ and sink $v_n \in V$, and arc costs $c_{ij}$. Furthermore, assume we are given binary variables $X_1, \dots, X_m$, an integer variable $Z$, and an objective bound $B$.

• A shortest path constraint has binary variables $X_1, \dots, X_m$ and an integer variable $Z$, and an instantiation $X_i = x_i$, for all $1 \le i \le m$, and $Z = z$ is consistent iff the following holds:

1. The set $\{e_i \mid x_i = 1\} \subseteq E$ determines a simple path in the graph $G$ from the source $v_1$ to the sink $v_n$.

2. $z$ is the cost of the path represented by the value of $X$, and $z < B$.

• Every simple path from source to sink with costs less than $B$ is called admissible.

To ease the notation, for the remainder of this section we assume that a shortest path constraint
is associated with a set variable $Y \subseteq E$ that represents the set of edges $e_i$ for which $X_i = 1$.
The (current) domains of the variables $X$ will be represented by two sets: the set of possible
members $\mathrm{pos}(Y)$, and the set of required members $\mathrm{req}(Y)$ of $Y$. In the sub-tree of the search
rooted at the current choice point we require $\mathrm{req}(Y) \subseteq Y \subseteq \mathrm{pos}(Y)$. That is, $\mathrm{req}(Y)$ represents
the set of variables for which it has been set $X_i = 1$, and the set $\mathrm{pos}(Y)$ represents the set of all
variables for which it has not been decided to set $X_i = 0$ already. Then, in the current choice
point, we have to search for admissible paths $P$ such that $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Note that we use
the set variable $Y$ only to ease the presentation. It has no impact on the implementation, which is
assumed to use only the variables $X$. In particular, the didactic use of a set variable has no impact
on the definition of arc-consistency. To achieve arc-consistency of a shortest path constraint, we
must ensure:

• For all $e \in \mathrm{pos}(Y)$, there exists an admissible path $P$ with $\mathrm{req}(Y) \cup \{e\} \subseteq P \subseteq \mathrm{pos}(Y)$, and

• for all $e \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists an admissible path $P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y) \setminus \{e\}$.

Obviously, the existence of an admissible path can be decided by applying a shortest path
algorithm. However, deciding whether there exists a simple path that visits a given set of edges is
an NP-hard task, which can be shown by a simple reduction from the Two Vertex Disjoint Paths
Problem [83]. Consequently, the arc-consistency problem for the general shortest path constraint
is also NP-hard.


2.2.2 Shortest Path Problems on DAGs

The reduction from the Two Vertex Disjoint Paths Problem does not prove NP-hardness for acyclic
graphs. As a matter of fact, on DAGs the problem is solvable in polynomial time. In the following,
we develop a cost-based domain filtering algorithm for shortest path constraints that achieves
arc-consistency in time $O(n + m)$.

As stated above, to perform domain reduction for a shortest path constraint, we are facing
two tasks: first, in order to shrink the set $\mathrm{pos}(Y)$ as much as possible, we need to identify all arcs
in $E$ that cannot be visited by any admissible path $P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Second, to increase
$\mathrm{req}(Y)$ as much as possible, we must compute all arcs that must be visited by all admissible paths
$P$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$.

First, to ensure that all paths computed are subsets of $\mathrm{pos}(Y)$, we remove all arcs from $G$
that are not in $\mathrm{pos}(Y)$. Then we want to make sure that all admissible paths are super-sets of
$\mathrm{req}(Y)$. To do so, we set $M$ to a sufficiently large value derived from $m$ and the maximum arc
cost, decrease the arc weight $c_{ij}$ of every edge $(i, j) \in \mathrm{req}(Y)$ by $M$, and adapt the upper
bound $B$ by subtracting $M\,|\mathrm{req}(Y)|$. In the following, we assume that $G = (V, E, c)$ has been
updated accordingly. Note that the removal of nodes and arcs can be performed in time $O(n + m)$
when an adjacency list representation (such as the forward or backward star representations [1])
of $G$ is used.
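The cost shift for required arcs can be sketched in Python as follows. The function name `force_required_arcs` is hypothetical, and the particular choice of M (the bound B minus the sum of all negative arc costs, plus one) is an assumption of this sketch that merely guarantees M is large enough; it is not necessarily the value used above.

```python
def force_required_arcs(cost, required, B):
    """Shift costs so that every simple path below the new bound uses all required arcs.

    cost     -- dict mapping arc (i, j) to its cost
    required -- set of arcs that every admissible path must contain
    B        -- current upper bound on the path cost
    Returns (shifted costs, adapted bound, M).
    """
    # Assumption of this sketch: no simple path can cost less than the sum
    # of all negative arc costs, so this M exceeds B minus any path cost.
    most_negative = sum(min(c, 0) for c in cost.values())
    M = max(B - most_negative, 0) + 1
    shifted = {arc: c - M if arc in required else c for arc, c in cost.items()}
    return shifted, B - M * len(required), M


costs = {("s", "a"): 2, ("a", "t"): 2, ("s", "t"): 1}
shifted, new_B, M = force_required_arcs(costs, {("s", "a")}, 5)
via_a = shifted[("s", "a")] + shifted[("a", "t")]  # path s-a-t uses the required arc
direct = shifted[("s", "t")]                        # path s-t misses it
```

In this example the path via the required arc stays below the adapted bound, while the cheaper direct path no longer does, so an ordinary shortest path computation on the shifted costs respects the required set.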

2.2.2.1 Removing Arcs from the Possible Set

Without loss of generality, we may assume that the nodes in $V$ are ordered topologically. If a
node $i \in V$ precedes a node $j \in V$ in the topological ordering, we write $i \le_{\mathrm{top}} j$. We write $i <_{\mathrm{top}} j$
iff $i \le_{\mathrm{top}} j$ and $i \ne j$. Note that, for all arcs $(i, j) \in E$, it holds that $i <_{\mathrm{top}} j$. Furthermore, we
may assume that $v_1$ and $v_n$ are the first and last nodes in this ordering, respectively. To find out
for a given arc $(i, j) \in E$ whether there still exists an admissible path $P$ with $(i, j) \in P$, we use a
method that was originally developed for the Resource Constrained Shortest Path Problem [4]:

First, we compute the shortest path distances $c(v_1, i)$ from the source $v_1$ to all nodes $i \in V$.
Once a topological ordering of the nodes is known (which takes time $O(n + m)$), this can be done
in time $O(n + m)$ even in the presence of negative arc weights (see [43]). Next, we compute the
shortest path distances $c(i, v_n)$ from all nodes $i \in V$ to the sink $v_n$, which can also be done in
linear time by reversing all arcs and using the same procedure as before with $v_n$ as the starting
node. Finally, for every arc $(i, j) \in \mathrm{pos}(Y)$ we check whether the shortest simple path from $v_1$ to
$v_n$ via $(i, j)$ has costs lower than $B$, i.e., we remove $(i, j)$ from $\mathrm{pos}(Y)$ iff

$$c(v_1, i) + c_{ij} + c(j, v_n) \ge B. \qquad (2.1)$$

Using the same idea we can also identify nodes that can be removed from the graph, because
they can never be part of any path from source to sink that is short enough to beat the current
upper bound. Figure 2.1 illustrates the situation.
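The removal test (2.1) can be sketched in a few lines of Python. The sketch assumes that the nodes are already numbered 0..n-1 in topological order with source 0 and sink n-1, that arcs outside the possible set have been deleted beforehand, and that required arcs are not treated; the function name `filter_possible_arcs` is hypothetical.

```python
import math


def filter_possible_arcs(n, arcs, B):
    """Keep exactly those arcs of a DAG that lie on a source-sink path of cost < B.

    n    -- number of nodes, numbered topologically (source 0, sink n-1)
    arcs -- dict mapping (i, j) with i < j to the arc cost (may be negative)
    B    -- upper bound that admissible paths must beat
    """
    dist_from = [math.inf] * n  # shortest distance from the source to each node
    dist_to = [math.inf] * n    # shortest distance from each node to the sink
    dist_from[0] = 0
    dist_to[n - 1] = 0
    ordered = sorted(arcs)      # by tail node; a bucket sort would keep this linear
    for i, j in ordered:        # forward pass in topological order
        dist_from[j] = min(dist_from[j], dist_from[i] + arcs[(i, j)])
    for i, j in reversed(ordered):  # backward pass on the reversed graph
        dist_to[i] = min(dist_to[i], dist_to[j] + arcs[(i, j)])
    return {(i, j): c for (i, j), c in arcs.items()
            if dist_from[i] + c + dist_to[j] < B}


dag = {(0, 1): 1, (1, 3): 1, (0, 2): 2, (2, 3): 5, (1, 2): 1}
kept = filter_possible_arcs(4, dag, 5)
```

With the bound 5, only the arcs of the path 0, 1, 3 survive, because every path through node 2 costs 7.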

Fig. 2.1: The figure shows arcs on shortest paths from $v_1$ and to $v_{11}$ in a DAG. Dashed lines mark shortest-path
arcs from $v_1$, dotted lines those to $v_{11}$. Solid lines represent arcs that are in both sets. Consider for
example node 7: a shortest path from $v_1$ to $v_{11}$ via node 7 is obtained by concatenating the shortest path
from $v_1$ to node 7 with the shortest path from node 7 to $v_{11}$.

2.2.2.2 Adding Arcs to the Required Set

After having shrunk the possible set, we still need to identify all arcs that must be visited by

all admissible paths to achieve a state of arc-consistency for a shortest path constraint. If we

perform the algorithm for the removal of members from the possible set first, we may assume

that the graph $G$ only contains arcs and nodes that are part of at least one admissible path. Then

the following lemma characterizes the arcs that are required.

Lemma 2.1 Denote the graph that is obtained by un-directing all arcs in $G$ by $G_u$. If, for all
arcs $e \in E$, there exists an admissible path $P$ in $G$ such that $e \in P$, then the following statements
are equivalent:

1. Every admissible path $P$ in $G$ contains the arc $(i, j)$.

2. For all arcs $(k, l) \in E$ with $(k, l) \ne (i, j)$, it holds that $l \le_{\mathrm{top}} i$ or $j \le_{\mathrm{top}} k$.

3. The edge $\{i, j\}$ is a bridge in $G_u$.

Proof:

1 ⇒ 2. Assume that there exist nodes $k, l \in V$ such that $(k, l) \in E$ and $(k, l) \ne (i, j)$. Then
there exists an admissible path $P$ such that $(k, l) \in P$ and $(i, j) \in P$. Thus, $l \le_{\mathrm{top}} i$ or $j \le_{\mathrm{top}} k$.

2 ⇒ 3. Statement (2) implies that the arc $(i, j)$ is the only arc that leaves node $i$. Also,
there exists no arc that by-passes the node $i$ in $G$. Thus the removal of $\{i, j\}$ disconnects
$v_1$ and $v_n$ in $G_u$, i.e., $\{i, j\}$ is a bridge in $G_u$.

3 ⇒ 1. If $\{i, j\}$ is a bridge in $G_u$, then every path from $v_1$ to $v_n$ in $G$ must contain the arc
$(i, j)$. Therefore, for every admissible path $P$, it also holds that $(i, j) \in P$.


Lemma 2.1 allows us to compute the arcs to be required by searching for bridges in the

undirected version of $G$, and the bridges of an undirected graph can be computed in time $O(n + m)$ [43].

The following theorem summarizes the results in Sections 2.2.2.1 and 2.2.2.2.

Theorem 2.1 On DAGs, arc-consistency for the shortest path constraint can be achieved in

linear time.
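The bridge computation that Lemma 2.1 relies on can be sketched with a standard depth-first search using discovery times and low-link values. The Python code below is a minimal illustration, not the implementation used in this thesis: the function name `required_arcs` is hypothetical, and the input is assumed to have been filtered already, so that every arc lies on at least one admissible path.

```python
def required_arcs(n, arcs):
    """Return the arcs whose undirected versions are bridges.

    n    -- number of nodes, numbered 0..n-1
    arcs -- list of arcs (i, j) of the filtered graph
    """
    adjacency = [[] for _ in range(n)]
    for idx, (i, j) in enumerate(arcs):
        adjacency[i].append((j, idx))  # un-direct every arc
        adjacency[j].append((i, idx))
    disc = [None] * n  # DFS discovery time of each node
    low = [0] * n      # lowest discovery time reachable from the DFS subtree
    bridges = set()
    counter = 0

    def dfs(u, parent_edge):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        for v, idx in adjacency[u]:
            if idx == parent_edge:
                continue  # do not walk back along the tree edge itself
            if disc[v] is None:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:  # nothing below v reaches u or above
                    bridges.add(arcs[idx])
            else:
                low[u] = min(low[u], disc[v])

    for start in range(n):
        if disc[start] is None:
            dfs(start, None)
    return bridges


diamond = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
must_use = required_arcs(5, diamond)
```

In the example, the two alternative routes from node 0 to node 3 are not required, while the final arc (3, 4) is. The recursion is kept for brevity; for large graphs an explicit stack would avoid Python's recursion limit.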

2.2.2.3 Incremental Shortest Path Constraint Propagation

During the search, the domain of $Y$ changes frequently, which means that many and often similar
single-source shortest path problems (SSSPs) have to be solved. Therefore, we develop an incremental version of the algorithm. Instead

of restarting the computation from scratch, it makes use of previously computed shortest path

information. Moreover, it uses the information on the differences in the domains of the current

and the last call: the required set may have grown, and the possible set may have shrunk.

For an efficient implementation of the algorithm, we use both the forward and the backward

star representation of G. We choose this redundant data structure, first because we need to

compute shortest paths in $G$ and the reverse of $G$, and second because we are able to perform the

incremental shortest path update more efficiently.

In order to compute the arcs and nodes that have to be removed from the graph, a support

idea in the style of AC-6 [21] can be used to reduce the computational effort required. If a node

$i$ leaves the possible set, we mark all its adjacent nodes $j$ as affected by the removal or distance
change of node $i$. If the node $i$ was even the direct shortest path predecessor of node $j$ in the
preceding call to the propagation routine, we refer to $i$ as the support node of $j$.

We iterate over all nodes in their topological order. If the node $j$ is affected, we check whether
its support still exists or is replaceable by another node without a change in the shortest path
distance $c(v_1, j)$. Since this requires iterating over all in-going arcs, a backward star representation
is used. Only if the support is lost and cannot be replaced do we need to propagate the distance
update and mark the successors of node $j$ as affected. To do this efficiently, the forward star
representation of $G$ is used. That way, we perform a continuing update on all affected nodes in
only one pass.

If a new arc $(i, j)$ becomes required, we do not need to re-compute the shortest path distances
of all nodes in the graph, because nodes that precede $i$ in the topological ordering are not affected
by that change. Therefore, it is sufficient to restart the SSSP-algorithm for DAGs at node $i$. The
distance $c(v_1, i)$ can be reused from the previous call to the constraint propagation algorithm.
Moreover, we can stop examining all outgoing arcs when we run over the first node $k$ for which
an arc $(k, l)$ was already formerly required: the shortest path tree structure “behind” a required arc
remains intact, and the difference in the distance $c(v_1, k)$ before and after the change simply
applies to all following nodes as well.

In the worst case, the incremental variant of the propagation algorithm may still require time
$O(n + m)$. However, in practice the ideas sketched above can reduce the computational
effort considerably, as we shall see in Chapter 5.

2.2.3 Shortest Path Problems on Undirected Graphs

Next we consider shortest path constraints on undirected graphs with non-negative edge weights.

Unlike in the previous section, achieving arc-consistency for a shortest path constraint on undi-

rected graphs is an NP-hard task, as the following observation shows.

Lemma 2.2 Given an undirected graph $G = (V, E)$, $n := |V|$, $m := |E|$, two designated nodes
$v_1, v_n \in V$ and a set of edges $S \subseteq E$, it is NP-hard to decide whether there exists a simple path
$P \in \pi(v_1, v_n)$ with $S \subseteq P$.

Proof: We reduce the Hamiltonian Path Problem to this problem: Given an undirected graph
$G = (V, E)$, do there exist two nodes $s, t \in V$ and a simple path $P \in \pi(s, t)$ with $V \subseteq P$? We
transform $G$ into an instance $(G', v_1, v_n, S)$ such that there exists a simple path $P' \in \pi(v_1, v_n)$ with
$S \subseteq P'$ iff there exists a Hamiltonian path in the original graph $G$.

First, we add two new nodes $v_1, v_n$, and all edges in $V \times \{v_1, v_n\}$. Then every node $v \in V$
is replaced by the structure given in Figure 2.2: The ellipse sketches a former node $v \in V$. For
all edges $e_1, \dots, e_d$ incident to $v$, we add new nodes $1_v, \dots, d_v$ that connect the new structure
with their corresponding edges. Moreover, we add two new nodes $a_v$ and $b_v$ and edges
$\{a_v, b_v\}$, $\{i_v, a_v\}$ and $\{i_v, b_v\}$ for all $1 \le i \le d$. Finally, we set $S := \{\{a_v, b_v\} \mid v \in V\}$,
the set of edges that must be visited.

Fig. 2.2: The structure replacing a node in G.

Then there exists a simple path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$ iff there exists a Hamiltonian
path in the original graph $G$: Given a path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$, $P'$ must visit all
structures sketched in Figure 2.2 at least once, because it must visit all edges $\{a_v, b_v\}$. On the
other hand, after $P'$ has visited the edge $\{a_v, b_v\}$ it can never return to the current structure
because all paths that pass through it must visit either node $a_v$ or $b_v$ again. Therefore, $P'$ visits all
structures corresponding to the original nodes in $V$ exactly once, and thus defines a Hamiltonian
path in $G$.


Conversely, assume there exists a Hamiltonian path $P \in \pi(s, t)$ in $G$ for some nodes
$s, t \in V$. Then we construct a path $P' \in \pi(v_1, v_n)$ with $S \subseteq P'$ in the following manner: We start at
$v_1$ and go to node $s$ first. Now, for each $v \in V$ that the Hamiltonian path visits we enter via some
edge $e_i$, go to node $a_v$ from there, we visit the edge $\{a_v, b_v\}$, and find our way out via the node
incident to $e_j$ that is visited by $P$ next. Since $V \subseteq P$, we visit all edges in $S$ like that. Finally, we
end at node $t$ and proceed to $v_n$ from there.

As a simple consequence of Lemma 2.2, we get the following

Corollary 2.1 Checking the arc-consistency of a shortest path constraint on an undirected graph is NP-hard.

Due to this result, in the following, we develop a cost-based filtering algorithm that achieves

relaxed consistency rather than arc-consistency. In order to introduce the relaxation we want to

use, we first start with the following

Definition 2.5 Denote a weighted (directed or undirected) graph by $G = (V, E, c)$.

• A path $P$ is called a $k$-simple path in $G$ iff, for all $j \in V$, the path $P$ visits $j$ at most $k$ times. Note that a 1-simple path is a simple path in $G$.

• With $P(i,j)$ we refer to a shortest path from $i$ to $j$ (with respect to $c$). Then, to ease the notation, we set $c(i,j) := \mathrm{cost}(P(i,j))$.

• Given a shortest path constraint, a $k$-simple path $P$ from $v_1$ to $v_n$ is called a $k$-admissible path iff $\mathrm{cost}(P) < B$.
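The notions of Definition 2.5 can be made concrete with a minimal sketch. It assumes that a path is given as its sequence of visited nodes, that the hypothetical dictionary `weight` maps node pairs `(i, j)` to $c_{ij}$, and that admissibility means a cost strictly below the bound $B$; the function names are illustrative only.

```python
from collections import Counter


def path_cost(path_nodes, weight):
    """Sum the weights c_ij over consecutive node pairs of the path."""
    return sum(weight[(i, j)] for i, j in zip(path_nodes, path_nodes[1:]))


def is_k_simple(path_nodes, k):
    """True iff no node is visited more than k times."""
    return all(count <= k for count in Counter(path_nodes).values())


def is_k_admissible(path_nodes, weight, k, bound):
    """True iff the path is k-simple and its cost is strictly below the bound
    (strictness is an assumption of this sketch)."""
    return is_k_simple(path_nodes, k) and path_cost(path_nodes, weight) < bound


# Illustrative data: the path visits node "a" twice, so it is 2-simple only.
weight = {("v1", "a"): 1, ("a", "b"): 1, ("b", "a"): 1, ("a", "vn"): 1}
loop_path = ["v1", "a", "b", "a", "vn"]
```

With this data, `loop_path` is 2-admissible for a bound of 5, but it is not 1-simple and therefore not an admissible path in the strict sense.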







Note that in a graph with non-negative edge weights, a shortest admissible path is also a shortest 2-admissible path. Now, instead of checking for admissible paths only, we consider the following shortest path relaxation: Let $D(Y)$ denote the domain of $Y$ represented as the pair of sets $(\mathrm{req}(Y), \mathrm{pos}(Y))$. We set $H := \{P \mid P$ is a simple path from $v_1$ to $v_n$ with $P \subseteq \mathrm{pos}(Y)\}$ and $F_f := \{P \mid P$ is a 2-simple path from $v_1$ to $v_n$ with $f \in P\}$ for all $f \in E$. Then we define

$$L_1 := \max\Big\{ \min_{P \in H} \mathrm{cost}(P),\ \max_{f \in \mathrm{req}(Y)} \min_{P \in F_f} \mathrm{cost}(P) \Big\}.$$











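Lemma 2.3 $L_1$ is a shortest path relaxation, i.e., it holds that

$$L_1 \le \min\{\mathrm{cost}(P) \mid P \text{ is a simple path from } v_1 \text{ to } v_n \text{ with } \mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)\}.$$

Proof: Let $P$ denote the shortest simple path from $v_1$ to $v_n$ in $G$ with $\mathrm{req}(Y) \subseteq P \subseteq \mathrm{pos}(Y)$. Obviously, it holds that $P \in H$ and $P \in F_f$ for all $f \in \mathrm{req}(Y)$. Therefore, $L_1 \le \mathrm{cost}(P)$.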






The big advantage of the relaxation above is that it can be checked for consistency very easily, as we shall see below. Note, however, that $L_1$ does not require that the 2-admissible paths visit all edges in $\mathrm{req}(Y)$ simultaneously. Of course, this weakens the relaxation. We can reduce the negative effects by improving the probability that a 2-admissible path visits the edges in $\mathrm{req}(Y)$: we set $c_{ij} := 0$ for all $\{i,j\} \in \mathrm{req}(Y)$ and subtract $\mathrm{cost}(\mathrm{req}(Y))$ from $B$.

According to the definition, a shortest path constraint is relaxed $L_1$-consistent, iff

1. for all $f \in \mathrm{pos}(Y)$, there exists a 2-admissible path $P \in F_f$, and

2. for all $f \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, there exists an admissible path $P \in H$ with $f \notin P$.

In the following two sections, we show how relaxed $L_1$-consistency can be achieved efficiently.

2.2.3.1 Removing Edges from the Possible Set

In order to check whether there exists a 2-admissible path in $G$ that visits an edge $\{i,j\} \in \mathrm{pos}(Y)$, we can use the same idea as in the previous section on shortest paths in DAGs. Obviously, the shortest 2-simple path from $v_1$ to $v_n$ that visits $\{i,j\}$ is either $(P(v_1,i), \{i,j\}, P(j,v_n))$ with costs $c(v_1,i) + c_{ij} + c(j,v_n)$, or $(P(v_1,j), \{i,j\}, P(i,v_n))$ with costs $c(v_1,j) + c_{ij} + c(i,v_n)$. Therefore, to check whether an edge has to be removed from $\mathrm{pos}(Y)$ with respect to the relaxation $L_1$, it is sufficient to know the shortest path distances from the source and to the sink of all nodes. Both values can be computed for all nodes by only two shortest path computations in $G$ in time $O(m + n \log n)$ using Dijkstra's algorithm in combination with Fibonacci heaps [86]. In a RAM model, shortest paths on undirected graphs can be computed in time $O(m + n)$ when using the algorithm of Thorup (see [207] and the recent extension of Pettie and Ramachandran in [166]). Thus, the set of edges that has to be removed from $\mathrm{pos}(Y)$ to achieve relaxed $L_1$-consistency can be computed in time $O(m + n \log n)$, and in time $O(m + n)$ on a RAM.
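The removal step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the graph is given as a list of hypothetical triples `(i, j, c_ij)`, node names are strings, Dijkstra's algorithm uses Python's binary heap `heapq` (so the running time is $O(m \log n)$ rather than the Fibonacci heap bound), and an edge is removed when the cheapest 2-simple path through it costs at least the bound.

```python
import heapq
from math import inf


def dijkstra(adj, source):
    """Shortest path distances from source for non-negative weights."""
    dist = {v: inf for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def removable_edges(nodes, edges, v1, vn, bound):
    """Edges {i, j} that lie on no 2-simple path of cost < bound."""
    adj = {v: [] for v in nodes}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    from_source = dijkstra(adj, v1)
    to_sink = dijkstra(adj, vn)  # undirected graph: distance to vn = distance from vn
    removed = []
    for i, j, w in edges:
        best = min(from_source[i] + w + to_sink[j],
                   from_source[j] + w + to_sink[i])
        if best >= bound:
            removed.append((i, j))
    return removed


nodes = ["v1", "a", "b", "vn"]
edges = [("v1", "a", 1), ("a", "vn", 1), ("v1", "b", 5), ("b", "vn", 5), ("a", "b", 1)]
```

For a bound of 6, the two expensive edges incident to `b` are filtered, whereas the cheap edge between `a` and `b` stays because the 2-simple path through it has cost 4.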

2.2.3.2 Adding Edges to the Required Set

After having removed all edges from $G$ that cannot be part of any 2-admissible path, the edges that must be visited by all such paths can be characterized by the following

Theorem 2.2 Assume that all edges in $G$ are part of at least one 2-admissible path. Then an edge $\{r,s\} \in E$ must be visited by all admissible paths, iff $\{r,s\} \in P(v_1,v_n)$, and $\{r,s\}$ is a bridge in $G$.

Before we can prove the above theorem, we need to show the following two lemmas first:


Fig. 2.3: The figure schematically shows an edge $\{k,l\} \in E$ that must exist according to Lemma 2.4. Solid lines mark edges in $E$, and dashed lines mark parts of the shortest path between $v_1$ and $v_n$. The dotted line between $l$ and $v_n$ indicates that there exists a path between the two nodes that does not visit the edge $\{r,s\}$. The alternating lines and dots between $l$ and $r$ indicate that the shortest path from $l$ to $v_n$ visits node $r$. The numbers on top of the nodes give their corresponding DFS numbers, and triangles mark DFS sub-trees.

Lemma 2.4 (Compare with Figure 2.3.) Assume that all edges in G are part of at least one

2-admissible path. Let









E denote an edge that must be visited by all admissible paths and

that can be removed from G without disconnecting v1and vn. Then there exists an edge









such that



















P and









2. k is a shortest path predecessor of r, and















Proof: Assume we compute a shortest path P

 

 











. Then i1



v1,ih



and if



r,if





sfor some 1





h. Next, we change the graph representation of Gsuch

that









is the first outgoing edge of node igfor all 1





h. For all nodes j



V, let









denote the ordering in which the nodes are first visited by a depth first search

using the modified graph representation of G. Then dig



gfor all 1



h. Since the removal







does not disconnect v1and vn, there exists a forward edge









Ewith dk



fand





1. This implies Statements 1 and 2.

It remains to show that















. By assumption, there exists a 2-admissible path R

through edge







. There are two possibilities: either Rvisits node kor node lfirst, which

corresponds to:

a) c









ckl











B, or

b) Rvisits lbefore kand c









ckl











In the first case, since















and







must be visited by all admissible paths, it holds

that















, and we are done.


Fig. 2.4: The figure schematically shows an edge





 

Ethat must exist according to Lemma 2.5. Solid

lines mark edges in E, and dashed lines mark parts of the shortest path between v1and vn. Alternating lines

and dots indicate parts of the shortest path from v1to a node, and dotted lines indicate parts of the shortest

path from a node to vn. The proof of Theorem 2.2 shows that the path





























 

2-admissible and does not visit the edge





Let us consider the second case. Let Q









denote a shortest path from v1to lwith









$Q$. Without loss of generality, we may assume that $k$ and $l$ are chosen such that









We observe that















, because otherwise this implies that

















and

the 2-admissible path visits node kbefore node l. Now, since kis a shortest path predecessor of

rand















, it holds that k









. Then,









ckl



















ckl



































ckl



















ckl











which reduces this case to (a).

Lemma 2.5 (Compare with Figure 2.4.) Assume that all edges in G are part of at least one

2-admissible path. Let









E denote an edge that must be visited by all admissible paths and

that can be removed from G without disconnecting v1and vn. Then there exists an edge









such that















and















, and















and















Proof: Denote an edge as in Lemma 2.4 by









E. Then there exists a path P









with









P, and we may assume















1. Since















, there exists an edge









Psuch that















and















2. By assumption, there exists a 2-admissible path that visits j. Since















, it

follows that















, because







must be visited by all admissible paths. Finally,

assume that















. Then the shortest path visiting node ihas costs










crs



















crs









However, the path from v1via r,iand sto vnhas costs































which is lower or equal to the cost of the shortest path visiting i. This implies that it

is also a shortest path visiting node i. It does not, however, visit some edges with zero

costs. Particularly, it does not visit the edge







. Therefore, we may assume that















Now, we have everything at hand to give the previously postponed

Proof of Theorem 2.2:



Let







be a bridge on the shortest path P









. Then the removal of







discon-

nects the graph G. Since the node pairs







and







are still connected, the removal







also disconnects v1and vn. Thus, for all P









, it holds that









Therefore, also all admissible paths must visit









Obviously, if there exists any admissible path, then P







is admissible, too. Thus,















. Now assume that the removal of







does not disconnect v1and vn.

Then, according to Lemma 2.5, there exists an edge









Esuch that











































and















. By assumption, there exists a 2-

admissible path Rvisiting







. Without loss of generality, we may assume that Rvisits

node ibefore node j, because









cij



















crs











cij











crs



























cij



























cij









However, this implies that









R, which is a contradiction to the assumption that every

admissible path must visit







Using Theorem 2.2, after having removed all edges that cannot be part of any 2-admissible path, we can compute all edges that must be visited by all admissible paths in time $O(m + n)$: first, we compute a shortest path $P(v_1,v_n)$ and mark all edges on this path. Then we compute all bridges in $G$ and check which ones are visited by $P$.
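The procedure above can be sketched in a few lines. The sketch assumes a simple undirected graph (no parallel edges) given as hypothetical triples `(i, j, c_ij)` from which all infeasible edges have already been removed, and it assumes that $v_n$ is reachable from $v_1$. Bridges are found with a standard recursive low-link depth first search; the shortest path is recomputed here with a binary heap only to keep the example self-contained.

```python
import heapq
from math import inf


def shortest_path_edges(adj, v1, vn):
    """Edges of one shortest v1-vn path (vn must be reachable)."""
    dist = {v: inf for v in adj}
    pred = {}
    dist[v1] = 0
    heap = [(0, v1)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path_edges, v = set(), vn
    while v != v1:
        path_edges.add(frozenset((pred[v], v)))
        v = pred[v]
    return path_edges


def bridges(adj):
    """All bridges of a simple undirected graph via DFS low-link values."""
    index, low, found = {}, {}, set()

    def dfs(u, parent):
        index[u] = low[u] = len(index)
        for v, _ in adj[u]:
            if v == parent:
                continue
            if v in index:
                low[u] = min(low[u], index[v])
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > index[u]:
                    found.add(frozenset((u, v)))

    for u in adj:
        if u not in index:
            dfs(u, None)
    return found


def required_edges(nodes, edges, v1, vn):
    """Bridges that lie on the shortest path, as characterized in Theorem 2.2."""
    adj = {v: [] for v in nodes}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    return shortest_path_edges(adj, v1, vn) & bridges(adj)


nodes = ["v1", "a", "b", "c", "d", "vn"]
edges = [("v1", "a", 1), ("a", "b", 1), ("a", "c", 2),
         ("b", "d", 1), ("c", "d", 1), ("d", "vn", 1)]
```

In this example the cycle through `a`, `b`, `d` and `c` contains no bridge, so only the first and the last edge of the shortest path are required.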


Fig. 2.5: A directed graph with non-negative arc weights. Assume we are given an upper bound B



All arcs in the graph are part of an admissible path with costs lower than B, and every admissible path

with costs lower than $B$ must visit the arc







. However, there exists a path







that does not visit

this arc.

The following theorem summarizes the results in the previous two sub-sections:

Theorem 2.3 On undirected graphs with non-negative edge weights, relaxed $L_1$-consistency of a shortest path constraint can be achieved in time $O(m + n \log n)$, and in time $O(m + n)$ on a RAM.

2.2.4 Shortest Path Problems on Directed Graphs

To complete our discussion on cost-based filtering for shortest path constraints, we finish with

some results on shortest paths in general directed networks. We start by considering directed

graphs with non-negative arc weights. In the end of this section, we will show how these results

can be exploited to cope with negative arc weights as well.

As has been stated in the introduction to this section, achieving arc-consistency for shortest

path constraints in general networks is NP-hard. Regarding the removal of arcs from the possible

set, relaxed L1-consistency on directed graphs with non-negative arc weights can be achieved in

the same way as on undirected graphs. However, with respect to arcs that must be visited by all

admissible paths, the situation is more complicated. Recall the result from Section 2.2.3: After having removed the infeasible edges, in undirected graphs, the edges that are required are exactly the ones on the shortest path that must be visited by all paths from $v_1$ to $v_n$.

Unfortunately, this classification does not hold for directed graphs as the example in Figure 2.5 shows. Thus, for all arcs $(i,j) \in P(v_1,v_n)$, we have to re-compute the shortest path value when removing $(i,j)$ from $E$, which may require $n - 1$ shortest path computations in the worst case.

Theorem 2.4 On directed graphs with non-negative arc weights, relaxed $L_1$-consistency can be achieved in time $O(n(m + n \log n))$.



Since the computation time of the algorithm sketched previously may not be efficient enough to be of use when being applied in a tree search, in the following we consider another shortest path relaxation. Let $T \subseteq E$ denote a shortest path tree in $G$ rooted at $v_1$. Without loss of generality, we may assume that every node in $G$ can be reached from $v_1$, and thus that $T$ spans $V$. Obviously, when $e \in E$ is removed from $T$, the nodes in $V$ are partitioned into two sets: the set $S_e \subseteq V$ of nodes that are still connected with $v_1$ in $T \setminus \{e\}$, and the complement of $S_e$ in $V$, $S_e^C$. Obviously, $S_e^C = \emptyset$ iff $e \notin T$. We set







Pis a 2-simple path from v1to vnwith P



pos





or, if e





pos





, then there exists an arc











Tsuch that i



Seand j





Moreover, we define







max



min



cost











, maxf



req







min



cost











To understand the above shortest path relaxation better, we make the following observations:



• Obviously, since $H \subseteq J$, $L_2$ is dominated by $L_1$, i.e., $L_2 \le L_1$. Therefore, $L_2$ is also a shortest path relaxation.



• The difference between relaxations $L_1$ and $L_2$ only consists in the set $J$ that is used instead of $H$ to determine the arcs that have to be required to achieve a state of relaxed consistency. In contrast to $H$, the set $J$ also contains paths $P$ that are not simple (i.e., paths that may

visit some nodes more than just once) and that may visit arcs e



pos





. However, if





pos





, then we enforce that Pmust also visit another arc









Tthat connects

Sewith SC

e. This implies e



T, as otherwise SC



0. Moreover, it holds that cost







min











cij



























• Like $L_1$, $L_2$ also does not force the 2-admissible paths to visit the arcs in $\mathrm{req}(Y)$ simultaneously. Again we can improve the effectiveness of the filtering algorithm by setting $c_{ij} := 0$ for all $(i,j) \in \mathrm{req}(Y)$ and by subtracting $\mathrm{cost}(\mathrm{req}(Y))$ from $B$.



A shortest path constraint is relaxed L2-consistent, iff

1. for all f



pos





, there exists a 2-admissible path P



Ff, and

2. for all f



pos







req





, there exists a 2-admissible path P



Jwith f



P, or there

exists an arc e





Tsuch that e





We have seen that the relaxation $L_2$ is dominated by $L_1$. Nevertheless, cost-based filtering that achieves relaxed $L_2$-consistency is still stronger than ordinary reduced-cost filtering (see Section 2.1.2):


Lemma 2.6 If a shortest path constraint is relaxed $L_2$-consistent, reduced-cost filtering is ineffective.

Proof: Let



reqY



pos





such that the shortest path constraint is L2-consistent. Furthermore,

denote the reduced costs of









pos





by cij





By assumption, for all









pos





, it holds that there exists a 2-admissible path that

visits







. Particularly, the shortest path in F







is 2-admissible, i.e., c









cij











B. Reduced-cost filtering removes an arc









pos





from the possible set iff









cij



























cij



However, since c























, it holds that









cij



















cij





















Reduced-cost filtering adds f

 







pos







req





to req





iff









min



cgh

















By assumption, for all f

 







pos







req





, there exists a 2-admissible path Pin G

such that either









Por there exists an arc











Tsuch that











a) Let f



P. Since Pis 2-admissible, it implies that there exists an admissible path (that

can be constructed by removing all loops in P) that does not visit f. Thus, fmust not

be required.

b) Now, let











Tsuch that











f, and













Tthe arc with

minimal reduced costs cgh



0. Then,









cgh











crs











crs



















crs



















cost







because c
























Fig. 2.6: The figure schematically shows a shortest path tree $T$ rooted at $v_1$. Solid lines denote arcs in $G$, dashed lines mark parts of the shortest path $P(v_1,v_n)$ from $v_1$ to $v_n$. The triangles symbolize shortest path sub-trees. For an edge $e \in P(v_1,v_n)$, the nodes in $V$ are partitioned into two non-empty sets $S_e$ and $S_e^C$. If $e$ is removed from the graph, the shortest path from $v_1$ to $v_n$ must visit an edge $(i,j) \in (S_e \times S_e^C) \setminus T$.











2.2.4.1 Relaxed L2-Consistency

As relaxations $L_1$ and $L_2$ do not differ with respect to the definition of $F_f$, $f \in E$, to remove arcs from $\mathrm{pos}(Y)$ we can simply follow the procedure sketched in Section 2.2.3.

Regarding the identification of arcs that have to be added to $\mathrm{req}(Y)$ to achieve relaxed $L_2$-consistency, for all $e \in \mathrm{pos}(Y) \setminus \mathrm{req}(Y)$, we have to compute the cost of the shortest 2-simple path $P$ from $v_1$ to $v_n$ such that $e \notin P$ or such that there exists an edge $(i,j) \in P \setminus T$ with $(i,j) \in S_e \times S_e^C$, where $T$ is a shortest path tree in $G$ rooted at $v_1$.

First, we compute the shortest paths from $v_1$ to $v_n$ in $G$ and from $v_n$ to $v_1$ in the reverse of $G$ in time $O(m + n \log n)$. As a byproduct, we get $T$ and shortest path distances $c(v_1,i)$ and $c(i,v_n)$ for all $i \in V$. If $c(v_1,v_n) \ge B$, the current choice point is inconsistent, and we can backtrack. Otherwise, candidates to be added to $\mathrm{req}(Y)$ are only the arcs $e \in P(v_1,v_n)$. Since $v_1 \in S_e$ and $v_n \in S_e^C$, the shortest 2-simple path $P$ from $v_1$ to $v_n$ with $e \notin P$ must contain an arc $(i,j) \in S_e \times S_e^C$. Moreover, since $T \cap (S_e \times S_e^C) = \{e\}$, we have that $(i,j) \notin T$ (see Figure 2.6). Therefore, it is sufficient to compute, for all $e \in P(v_1,v_n)$, the costs of the shortest 2-simple path $P$ from $v_1$ to $v_n$ that contains some $(i,j) \in (S_e \times S_e^C) \setminus T$.













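Let $P(v_1,v_n) = (r_1, \dots, r_h)$, $r_1 = v_1$ and $r_h = v_n$, and denote the sequence of arcs that $P(v_1,v_n)$ visits by $(e_1, \dots, e_{h-1})$, whereby $e_k = (r_k, r_{k+1})$ for all $1 \le k < h$. Furthermore, for all $1 \le k < h$, let $Q_k$ denote a shortest 2-simple path from $v_1$ to $v_n$ with $(i,j) \in Q_k$ for some $(i,j) \in (S_{e_k} \times S_{e_k}^C) \setminus T$. Then,

$$\mathrm{cost}(Q_k) = \min\{ c(v_1,i) + c_{ij} + c(j,v_n) \mid (i,j) \in (S_{e_k} \times S_{e_k}^C) \setminus T \}.$$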







A brute force approach requires time $\Theta(nm)$ to determine these values. However, we can do better when we compute the values $\mathrm{cost}(Q_k)$ for all $1 \le k < h$ sequentially. Note that $S_{e_k} \subseteq S_{e_{k+1}}$.

We keep the nodes $j$ in the current set $S_{e_k}^C$ in a min-heap, whereby the associated value of $j$ in the heap is defined as

$$x_j := \min\{ c(v_1,i) + c_{ij} + c(j,v_n) \mid i \in S_{e_k} \text{ and } (i,j) \in E \setminus T \}.$$

Obviously, the smallest $x_j$ in the heap determines $\mathrm{cost}(Q_k)$. In the transition from one shortest-path arc $e_k$ to the next $e_{k+1}$, the nodes $i \in S_{e_{k+1}} \setminus S_{e_k}$ have to be removed from the heap, and the values $x_j$ must be updated. For each node $i \in S_{e_{k+1}} \setminus S_{e_k}$, we iterate over all outgoing arcs and perform a decrease-key on the adjacent nodes if necessary. Then $i$ is removed from the heap. Since every node in $V$ leaves the heap at most once and never re-enters it, for all $1 \le k < h$, this procedure requires at most $m$ decrease-key operations and $n$ delete-min operations. Therefore, when using a Fibonacci heap, the values $\mathrm{cost}(Q_k)$ for all $1 \le k < h$ can be determined in time $O(m + n \log n)$. Then $e_k$ is added to $\mathrm{req}(Y)$ iff $\mathrm{cost}(Q_k) \ge B$. It follows:

Theorem 2.5 On directed graphs with non-negative arc weights, relaxed $L_2$-consistency of a shortest path constraint can be achieved in time $O(m + n \log n)$.
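The sequential computation of the values $\mathrm{cost}(Q_k)$ can be illustrated with the following sketch. It deviates from the description above in one labeled respect: instead of a Fibonacci heap with decrease-key, it pushes one entry per crossing arc into Python's binary heap `heapq` and discards stale entries lazily, which gives $O(m \log m)$ rather than $O(m + n \log n)$. The helper `level` (an assumption of this sketch) stores, for every node, the index of the last shortest-path node above it in $T$, so that $S_{e_k}$ is exactly the set of nodes with level at most $k$. Arcs are hypothetical triples `(i, j, c_ij)`, and all nodes are assumed reachable from $v_1$.

```python
import heapq
from math import inf


def dijkstra(adj, source):
    """Distances, predecessor (node, arc id) per node, and settle order."""
    dist = {v: inf for v in adj}
    pred, order, done = {}, [], set()
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        order.append(u)
        for v, w, arc_id in adj[u]:
            if d + w < dist[v]:
                dist[v], pred[v] = d + w, (u, arc_id)
                heapq.heappush(heap, (d + w, v))
    return dist, pred, order


def required_arcs_l2(nodes, arcs, v1, vn, bound):
    """Shortest-path arcs e_k with cost(Q_k) >= bound; None if inconsistent."""
    out = {v: [] for v in nodes}
    rev = {v: [] for v in nodes}
    for a, (i, j, w) in enumerate(arcs):
        out[i].append((j, w, a))
        rev[j].append((i, w, a))
    d_from, pred, order = dijkstra(out, v1)
    d_to, _, _ = dijkstra(rev, vn)
    if d_from[vn] >= bound:
        return None  # no admissible path at all: backtrack
    path = [vn]
    while path[-1] != v1:
        path.append(pred[path[-1]][0])
    path.reverse()
    # level[v] = k  means  v is in S_{e_k} but not in S_{e_{k-1}} (0-indexed)
    level = {r: k for k, r in enumerate(path)}
    for v in order:  # parents are settled before their children
        if v not in level:
            level[v] = level[pred[v][0]]
    tree_arcs = {arc_id for _, arc_id in pred.values()}
    by_level = {}
    for v in order:
        by_level.setdefault(level[v], []).append(v)
    heap, required = [], []
    for k in range(len(path) - 1):
        for i in by_level.get(k, []):  # nodes that newly join S_{e_k}
            for j, w, a in out[i]:
                if a not in tree_arcs and level[j] > k:
                    heapq.heappush(heap, (d_from[i] + w + d_to[j], level[j]))
        while heap and heap[0][1] <= k:  # head node j has moved into S_{e_k}
            heapq.heappop(heap)
        cost_qk = heap[0][0] if heap else inf
        if cost_qk >= bound:
            required.append((path[k], path[k + 1]))
    return required


nodes = ["v1", "a", "b", "vn"]
arcs = [("v1", "a", 1), ("a", "b", 1), ("b", "vn", 1), ("v1", "b", 3), ("a", "vn", 5)]
```

In this example the shortest path has cost 3. Bypassing its first or second arc costs 4, and bypassing its last arc costs 6, so for a bound of 5 only the last arc becomes required.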



Finally, we consider the general case of directed graphs with integer arc weights that do not contain negative weight cycles. On such graphs, the Bellman-Ford algorithm computes single source shortest paths in time $O(nm)$. The shortest path distance from source to sink can be used to prune the search if that value exceeds the given bound $B$. However, for the purpose of cost-based filtering with respect to the relaxations $L_1$ or $L_2$, we need to compute the shortest path distances from the source and to the sink for all nodes.

Of course, we could apply the Bellman-Ford algorithm with $v_1$ as root in $G$ and $v_n$ in the reverse of $G$ to obtain these values. To achieve relaxed $L_1$-consistency, this procedure would require time $\Theta(n^2 m)$ in the worst case. We can do much better though, especially when taking into account that within a tree search, many similar Shortest Path Problems have to be solved.

We can speed up the computation by using node potentials $h_v$ for all $v \in V$. It is a well-known fact that the shortest path structure of a graph is maintained when the arc weights are changed to $\bar{c}_{ij} := c_{ij} + h_i - h_j$ [1]. We aim at finding node potentials $h$ such that $\bar{c} \ge 0$. Then, even after arcs have been removed from the graph or the shortest path is required to visit certain arcs, we can simply apply the algorithms that we developed for directed graphs with non-negative arc weights. The only necessary modification is to compute $c(i,j) = \bar{c}(i,j) - h_i + h_j$.

In order to compute the desired node potentials, we use a method that has been developed for the computation of all pairs shortest paths by Johnson [43]: we add an artificial source node $s$ and arcs $(s,i)$ for all $i \in V$, and we set $c_{si} := 0$. If the given graph does not contain negative weight cycles, the Bellman-Ford algorithm produces shortest path distances $c(s,i)$. For all arcs $(i,j) \in E$, we have that $c(s,j) \le c(s,i) + c_{ij}$. Thus, when setting $h_i := c(s,i)$ we get

$$\bar{c}_{ij} = c_{ij} + h_i - h_j = c_{ij} + c(s,i) - c(s,j) \ge 0.$$
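The computation of the node potentials can be sketched as follows. The artificial source $s$ is not materialized: initializing every potential with 0 corresponds to having relaxed the zero-cost arcs $(s,i)$ once. The function names and the arc representation as triples `(i, j, c_ij)` are assumptions of this sketch.

```python
def johnson_potentials(nodes, arcs):
    """Potentials h_i = c(s, i) via Bellman-Ford from an implicit source s.

    Raises ValueError if the graph contains a negative weight cycle.
    """
    h = {v: 0 for v in nodes}  # effect of the zero-cost arcs (s, v)
    for _ in range(len(nodes)):
        changed = False
        for i, j, w in arcs:
            if h[i] + w < h[j]:
                h[j] = h[i] + w
                changed = True
        if not changed:
            return h
    raise ValueError("graph contains a negative weight cycle")


def reduced_arcs(arcs, h):
    """Arc weights c_ij + h_i - h_j, which are non-negative by construction."""
    return [(i, j, w + h[i] - h[j]) for i, j, w in arcs]


nodes = ["a", "b", "c"]
arcs = [("a", "b", -2), ("b", "c", 3), ("a", "c", 2), ("c", "a", 1)]
h = johnson_potentials(nodes, arcs)
```

After this single Bellman-Ford run, the reduced weights returned by `reduced_arcs` can be handed to any of the Dijkstra-based procedures for non-negative arc weights, and a reduced distance $\bar{c}(i,j)$ is translated back with $c(i,j) = \bar{c}(i,j) - h_i + h_j$.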




















Degree of Consistency

Graph Type ArcCon L1L2RedCost

DAG O







undirected, c



0 NP-hard O





nlogn





RAM









directed, c



0 NP-hard O





nlogn







nlogn



directed, NP-hard O





nlogn



O(nm)

no negative cycles amort.



Ω











nlogn



Tab. 2.1: The table gives an overview of the findings in this section.

The following two theorems follow directly from the discussion:

Theorem 2.6 On directed graphs, relaxed $L_1$-consistency of a shortest path constraint can be achieved in time $O(n(m + n \log n))$.

Theorem 2.7 On directed graphs, relaxed $L_2$-consistency of a shortest path constraint can be achieved in time $O(nm)$. Within a tree search, relaxed $L_2$-consistency can be achieved in amortized time $O(m + n \log n)$ for $\Omega(n)$ calls of the filtering procedure.

2.2.5 Summary

Before we proceed, we summarize the results that we achieved in this section (see Table 2.1): On DAGs, arc-consistency for a shortest path constraint can be achieved in linear time by exploiting topological orderings. On general directed and on undirected graphs, achieving arc-consistency is an NP-hard task. We developed two shortest path relaxations $L_1$ and $L_2$, both based on the class of 2-simple paths. We showed that $L_1$ dominates $L_2$, and that cost-based filtering based on $L_2$ is superior to reduced-cost filtering. On undirected graphs with non-negative edge weights, relaxed $L_1$-consistency (and therefore also relaxed $L_2$-consistency) can be achieved in time $O(m + n \log n)$ and in time $O(m + n)$ on a RAM. On directed graphs with non-negative arc weights, relaxed $L_1$-consistency can be obtained in time $O(n(m + n \log n))$, and a state of relaxed $L_2$-consistency can be achieved in time $O(m + n \log n)$. Finally, in the presence of negative arc weights, we use the Bellman-Ford algorithm just once for the computation of node potentials that allow us to solve the Shortest Path Problems on graphs with non-negative arc weights. Therefore, we achieve relaxed $L_1$-consistency in time $O(n(m + n \log n))$, and $L_2$-consistency in amortized time $O(m + n \log n)$ for $\Omega(n)$ calls of the filtering algorithm.


2.3 Weighted Stable Set Constraints on Interval Graphs

Real-world scheduling problems often require the selection of temporally non-overlapping tasks,

as one machine, processor or person can only work on one job at a time. E.g., if one wants to

record movies from TV, no two temporally overlapping broadcasts can be taped. Thus, when

given a set of weighted tasks with starting and ending times, we try to find a selection of non-overlapping tasks such that their weighted sum is minimized [43] (see footnote 2).

Frequently, the problem evolves only as a relaxation or sub-problem of a real-world application. For instance, in a realistic model of the above TV recording example, the storage capacity of the recording device is limited (see Chapter 6). The problem can be viewed as an augmented Knapsack Problem, which is NP-hard. Exact algorithms to compute and prove an optimal solution for such problems are often based on enumeration approaches. The tightening of sub-problems can help greatly to improve the performance of a tree search approach. Therefore, in this section we develop an efficient cost-based filtering algorithm that exploits the special structure of non-overlapping constraints.

During a tree search, many similar problem instances have to be solved, whereby the instances from one iteration to the other only differ with respect to necessarily included and excluded tasks and, as we shall see later, possibly changes in the objective. Therefore, the development of an incremental algorithm is desirable [8, 54, 177, 178]. The algorithm we develop works in two phases: a preprocessing phase using time $\Theta(n \log n)$, and an optimization and filtering phase using linear time. The data structure established in the first phase is independent of the objective function and can be adapted in linear time to reflect decisions on necessarily included and excluded tasks. Thus, we achieve an amortized linear time algorithm for $\Omega(\log n)$ incremental calls with changing variable domains and different objectives.

Repeated computations with changing objective functions are important when solving Lagrangian relaxations, for example. In Section 3.2, we develop a method to link linear optimization constraints that is based on Lagrangian relaxation, and that makes use of dual and reduced-cost information while solving the Lagrangian dual. Therefore, as a major objective in this section, we develop an efficient algorithm that, on top of an optimal selection of non-overlapping tasks, provides dual and reduced-cost data as a byproduct.

The work presented in this section was published in [189]. It is structured as follows: In

Section 2.3.1, we define the weighted stable set constraint formally. Then, in Section 2.3.2, we

develop an algorithm based on mathematical programming that computes minimum weighted

stable sets on interval graphs and provides dual and reduced-costs information as a byproduct.

Footnote 2: In contrast to the common definition of weighted stable set problems, we state the problem not as a maximization but as a minimization problem. We do this because the latter problem has a one-to-one correspondence with a shortest-path problem in the complement graph that we will use for filtering purposes later. Note that we allow negative node weights such that a maximization problem can easily be transformed into a minimization instance.


Finally, in Section 2.3.3, we give a cost-based filtering algorithm for weighted stable set con-

straints on interval graphs.

2.3.1 The Weighted Stable Set Constraint

A natural way of modeling the problem of finding a selection of non-overlapping tasks is to

consider an interval graph [99]: the tasks are represented by the nodes, and an edge connects

two nodes iff the corresponding tasks are in conflict, i.e., iff the corresponding intervals are

overlapping.

Definition 2.6 A graph $G = (V, E)$ with $V = \{1, \dots, n\}$ is called an interval graph iff there exist intervals $I_1, \dots, I_n \subset \mathbb{R}$ such that $\{i,j\} \in E \iff I_i \cap I_j \neq \emptyset$ for all $i \neq j$.

Then the problem consists in finding a minimum weighted stable set (WSSP) in an interval graph.

We generalize the use of conflict graphs and define:

Definition 2.7 Given an undirected graph $G = (V, E)$, $n \in \mathbb{N}$, $V = \{1, \dots, n\}$, denote the node weights in $G$ by $c_1, \dots, c_n$, and let $B$ denote an upper bound on the objective. Let $X_i \in \{0,1\}$ denote binary variables for all $1 \le i \le n$. Then a weighted stable set constraint has variables $X_1, \dots, X_n$, and an instantiation $X_i = x_i$ for all $1 \le i \le n$ is true, iff

• for all $1 \le i < j \le n$, it holds that $x_i = 1 = x_j$ implies $\{i,j\} \notin E$, and

• $\sum_{i \mid x_i = 1} c_i < B$.



Obviously, the weighted stable set constraint is a minimization constraint. On general graphs, the computation of a minimum stable set is an NP-hard task. Therefore, achieving arc-consistency for the weighted stable set constraint on general graphs is also NP-hard. However, on interval graphs minimum weighted stable sets can be computed in time $O(n \log n)$ [115]. The existing algorithms for the WSSP on interval graphs are based on sweep line or dynamic programming approaches and neither provide dual values and reduced-cost information, nor do they suggest how cost-based filtering could be performed efficiently.
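To illustrate the kind of dynamic programming approach mentioned above, the following is a minimal sketch of the textbook recursion for weighted interval selection, adapted to minimization. It is not the algorithm developed in this section and provides no dual or reduced-cost information. The sketch assumes half-open intervals `[start, end)` given as pairs, so that tasks that merely touch do not conflict, and it numbers tasks from 0.

```python
from bisect import bisect_right


def min_weight_stable_set(intervals, weights):
    """Minimum total weight of pairwise non-overlapping tasks and one optimal selection."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    ends = [intervals[i][1] for i in order]
    best = [0]    # best[k]: optimum over the first k tasks in end-time order
    choice = []   # (task taken?, number of compatible earlier tasks)
    for k, i in enumerate(order):
        # number of tasks among the first k that end no later than task i starts
        p = bisect_right(ends, intervals[i][0], 0, k)
        take = weights[i] + best[p]
        choice.append((take < best[k], p))
        best.append(min(best[k], take))
    selected, k = [], len(order)
    while k > 0:  # backtrack through the recorded decisions
        taken, p = choice[k - 1]
        if taken:
            selected.append(order[k - 1])
            k = p
        else:
            k -= 1
    return best[-1], sorted(selected)


intervals = [(0, 3), (2, 5), (4, 7), (6, 9)]
weights = [-3, -5, -4, -1]
```

Since the empty selection is stable and has weight 0, only tasks with negative weight can ever improve the objective; with non-negative weights the sketch returns the empty selection.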

We will show that a state of arc-consistency can be achieved in amortized linear time for $\Omega(\log n)$ incremental calls for weighted stable set constraints on interval graphs. In the following, we assume that we are given a number $n \in \mathbb{N}$, intervals $I_i$ from $\mathrm{start}(i)$ to $\mathrm{end}(i)$ for all $1 \le i \le n$, task weights $c_1, \dots, c_n$ and an upper bound $B$ on the objective. We refer to the corresponding interval graph with $G = (V, E)$ whereby $V = \{1, \dots, n\}$ and $E = \{\{i,j\} \mid i \neq j,\ I_i \cap I_j \neq \emptyset\}$.
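The conflict graph and the truth value of a weighted stable set constraint can be sketched directly from the definitions. The sketch makes three labeled assumptions: intervals are half-open pairs `[start, end)`, tasks are numbered from 0, and the objective has to be strictly below the bound.

```python
from itertools import combinations


def interval_graph_edges(intervals):
    """Conflict edges {i, j} of all pairs of overlapping intervals."""
    return {
        frozenset((i, j))
        for (i, (s1, e1)), (j, (s2, e2)) in combinations(enumerate(intervals), 2)
        if s1 < e2 and s2 < e1
    }


def satisfies_wss_constraint(x, edges, weights, bound):
    """True iff the selected tasks are conflict free and their weight is below the bound."""
    chosen = [i for i, xi in enumerate(x) if xi == 1]
    stable = all(frozenset((i, j)) not in edges for i, j in combinations(chosen, 2))
    return stable and sum(weights[i] for i in chosen) < bound


intervals = [(0, 3), (2, 5), (4, 7)]
weights = [-3, -5, -4]
edges = interval_graph_edges(intervals)
```

Here the first and the last task do not overlap, so selecting both is feasible for a bound of -6, whereas selecting the middle task alone is stable but not cheap enough.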



















2.3.2 A Mathematical Programming Approach

We present an algorithm based on mathematical programming for the Minimum Weighted Stable Set Problem on interval graphs that provides us with dual and reduced-cost information as a byproduct, and that will be extended to an efficient cost-based filtering for the problem later. Obviously, the following Integer Program solves the WSSP on interval graphs:

Minimize $IP_1 = \sum_{1 \le i \le n} c_i x_i$
subject to $x_i + x_j \le 1$ for all $\{i,j\} \in E$,
$x \in \{0,1\}^n$.





























In this formulation, an LP relaxation does not necessarily yield an integer solution. However, we can tighten the problem formulation such that every LP-solution is already integer. To achieve that formulation, we introduce a few more definitions:

Definition 2.8 A set $C \subseteq V$ is called a conflict clique, iff $I_i \cap I_j \neq \emptyset$ for all $i, j \in C$. A conflict clique $C$ is called maximal, iff for all $D \subseteq V$, $D$ conflict clique: $C \subseteq D \Rightarrow C = D$. Let $M := \{C_1, \dots, C_m\}$ denote the set of maximal conflict cliques in $G$. We set max_start $: M \to \mathbb{R}$, max_start$(C) := \max_{i \in C} \mathrm{start}(i)$.







Remark 2.1 $\bigcup_{1 \le p \le m} C_p = V$, because $\{i\}$ is a conflict clique for all $i \in V$. Thus, there exists a maximal conflict clique $C_p$, $1 \le p \le m$, such that $i \in C_p$.
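The maximal conflict cliques and their max_start values can be enumerated with a simple sweep over the starting times. The following quadratic sketch is for illustration only and assumes non-empty half-open intervals `[start, end)` and tasks numbered from 0. It uses the observation that the tasks active at a starting time $t$ form a maximal conflict clique iff no further task starts before the first of them ends.

```python
def maximal_conflict_cliques(intervals):
    """Maximal conflict cliques, returned in increasing order of max_start."""
    starts = sorted({s for s, _ in intervals})
    cliques = []
    for idx, t in enumerate(starts):
        clique = [i for i, (s, e) in enumerate(intervals) if s <= t < e]
        first_end = min(intervals[i][1] for i in clique)
        next_start = starts[idx + 1] if idx + 1 < len(starts) else None
        # if another task starts before any clique member ends, the clique can be extended
        if next_start is None or next_start >= first_end:
            cliques.append(clique)
    return cliques


def max_start(clique, intervals):
    """Latest starting time among the members of a conflict clique."""
    return max(intervals[i][0] for i in clique)


intervals = [(0, 3), (2, 5), (4, 7), (6, 9)]
cliques = maximal_conflict_cliques(intervals)
```

For these four tasks the sweep finds three maximal conflict cliques whose max_start values 2, 4 and 6 are pairwise different, and every task is contained in at least one of them.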

Lemma 2.7 The function max_start is injective.

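Proof: Assume $\mathrm{max\_start}(C_p) = \mathrm{max\_start}(C_q)$ for some $1 \le p, q \le m$. Then there exist nodes $s_p \in C_p$ and $s_q \in C_q$ such that

\[ \mathrm{start}(s_p) = \mathrm{max\_start}(C_p) = \mathrm{max\_start}(C_q) = \mathrm{start}(s_q), \]

and for all $i \in C_p$, $j \in C_q$:

\[ \mathrm{end}(i) \ge \mathrm{start}(s_p) = \mathrm{start}(s_q) \ge \mathrm{start}(j) \]

and

\[ \mathrm{start}(i) \le \mathrm{start}(s_p) = \mathrm{start}(s_q) \le \mathrm{end}(j). \]

Thus, all nodes in $C_p$ and $C_q$ are pairwise overlapping. Therefore, $C_p \cup C_q$ is a conflict clique. As $C_p$ and $C_q$ are maximal, we have $C_p = C_p \cup C_q = C_q$. However, as $p \ne q$ implies $C_p \ne C_q$, we have $p = q$.
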

Thus, without loss of generality, in the following we assume that the conflict cliques are ordered with respect to max_start, i.e. $\mathrm{max\_start}(C_p) < \mathrm{max\_start}(C_q)$ for all $1 \le p < q \le m$.

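Lemma 2.8 Let $1 \le p \le q \le r \le m$ and $i \in C_p \cap C_r$. Then $i \in C_q$.

Proof: Let $s_p \in C_p$, $s_q \in C_q$, $s_r \in C_r$, such that $\mathrm{start}(s_x) = \mathrm{max\_start}(C_x)$ for all $x \in \{p, q, r\}$. Furthermore, let $j \in C_q$. Then,

\[ \mathrm{end}(i) \ge \mathrm{start}(s_r) \ge \mathrm{start}(s_q) \ge \mathrm{start}(j) \]

and

\[ \mathrm{start}(i) \le \mathrm{start}(s_p) \le \mathrm{start}(s_q) \le \mathrm{end}(j). \]

Therefore, $C_q \cup \{i\}$ is a conflict clique. As $C_q$ is maximal, $C_q = C_q \cup \{i\}$, i.e. $i \in C_q$.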

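Corollary 2.2 $m \le n$.

Proof: Let $1 \le p < m$. There exists $i \in C_p$ such that $i \notin C_{p+1}$, as otherwise $C_p \subseteq C_{p+1}$, which contradicts the maximality of $C_p$ or $C_p \ne C_{p+1}$. Thus, with Lemma 2.8, we have $i \notin C_q$ for all $p < q \le m$. Therefore, $\left| \bigcup_{p < q \le m} C_q \right| < \left| \bigcup_{p \le q \le m} C_q \right|$ for all $1 \le p < m$, and thus $m \le n$.
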

Definition 2.9 We set $R_p := C_p \setminus C_{p+1}$ for all $1 \le p < m$ and $R_m := C_m$, and call every such $R_p$ a (max_start) rest clique.

Remark 2.2 The rest cliques form a partition of $V$: Let $1 \le i \le n$. Remark 2.1 states that $i \in C_p$ for some $1 \le p \le m$. Let $q := \max\{1 \le p \le m \mid i \in C_p\}$. Then, $i \in R_q$. On the other hand, let $i \in R_p \cap R_q$ with $1 \le p < q \le m$. Then, $i \in C_p \cap C_q$. Therefore, with Lemma 2.8, we have $i \in C_{p+1}$, which is a contradiction to $i \in R_p$.
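
The definitions above can be made concrete with a small sketch. It is only an illustration under assumptions that are not part of the text: intervals are closed pairs `(start, end)`, nodes are numbered from 0, clique indices `p` are 1-based as in the text, and the function names are hypothetical. The clique enumeration is a simple quadratic scan over the distinct start points (each maximal conflict clique is the set of intervals covering its max_start), not the sweep-line realization one would use to reach $\Theta(n \log n)$.

```python
def maximal_conflict_cliques(intervals):
    """Maximal conflict cliques of closed intervals, ordered by max_start.

    Quadratic illustration: every maximal clique equals the set of intervals
    that contain one of the start points.
    """
    starts = sorted({s for s, _ in intervals})
    cliques = []
    for k, s in enumerate(starts):
        members = [i for i, (a, b) in enumerate(intervals) if a <= s <= b]
        earliest_end = min(intervals[i][1] for i in members)
        # The candidate is dominated iff all members also cover the next start.
        if k + 1 == len(starts) or earliest_end < starts[k + 1]:
            cliques.append(members)
    return cliques


def rest_cliques(cliques):
    """R_p = C_p without C_{p+1}; the last rest clique is the last clique itself."""
    rest = []
    for p, clique in enumerate(cliques):
        following = set(cliques[p + 1]) if p + 1 < len(cliques) else set()
        rest.append([i for i in clique if i not in following])
    return rest


def clique_belongings(cliques, n):
    """First and last (1-based) clique index of every node."""
    first, last = [None] * n, [None] * n
    for p, clique in enumerate(cliques, start=1):
        for i in clique:
            if first[i] is None:
                first[i] = p
            last[i] = p
    return first, last


intervals = [(0, 3), (2, 5), (4, 7), (6, 8), (1, 2)]
cliques = maximal_conflict_cliques(intervals)
rest = rest_cliques(cliques)
first, last = clique_belongings(cliques, len(intervals))
print(cliques)  # [[0, 1, 4], [1, 2], [2, 3]]
print(rest)     # [[0, 4], [1], [2, 3]]
```

In the example, every node lies in a consecutive run of cliques (from `first[i]` to `last[i]`), which is the property stated in Lemma 2.8, and the rest cliques partition the node set as stated in Remark 2.2.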

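Let $C_1, \dots, C_m$ denote the maximal conflict cliques of $G$ ordered according to max_start, and consider the following integer program:

\[
\begin{array}{lll}
\text{Minimize} & IP_2 = \sum_{1 \le i \le n} c_i x_i & \\
\text{subject to} & \sum_{i \in C_p} x_i \le 1 & \forall\, 1 \le p \le m \\
 & x \in \{0,1\}^n &
\end{array}
\]
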

The maximal conflict clique restrictions imply that $x_i + x_j \le 1$ for all nodes $i < j \in V$ whose corresponding intervals overlap. On the other hand, if $x_i + x_j \le 1$ for all overlapping intervals $I_i$ and $I_j$, it is also true that $\sum_{i \in C_p} x_i \le 1$ for all $1 \le p \le m$. Thus, the above IP solves the WSSP on interval graphs.

In the following, by $A \in \{0,1\}^{m \times n}$ we denote the corresponding matrix to $IP_2$, i.e. $A = (a_{pi})_{1 \le p \le m,\ 1 \le i \le n}$ with $a_{pi} = 1$ iff $i \in C_p$.


Theorem 2.8 The corresponding matrix $A$ of $IP_2$ is an interval matrix.

Proof: We have to show that $a_{pi} = a_{ri} = 1$ implies that $a_{qi} = 1$ for all $1 \le p \le q \le r \le m$, $1 \le i \le n$. By the construction of $A$, this is equivalent to showing that $i \in C_p \cap C_r$ implies $i \in C_q$ for all $p \le q \le r$. However, this is true according to Lemma 2.8.

Corollary 2.3 $IP_2$ is totally unimodular.

Proof: Interval matrices are totally unimodular [158].

Corollary 2.3 now allows us to solve the WSSP on interval graphs as a linear program:

\[
\begin{array}{lll}
\text{Minimize} & LP_3 = \sum_{1 \le i \le n} c_i x_i & \\
\text{subject to} & \sum_{i \in C_p} x_i \le 1 & \forall\, 1 \le p \le m \\
 & x \ge 0 &
\end{array}
\]

Notice that, with Remark 2.1, the maximal conflict clique restrictions imply that $x \le 1$.


2.3.2.1 A Pivot Selection Strategy

We use the simplex method to solve $LP_3$. Let $R_1, \dots, R_m$ denote the (max_start) rest cliques of $G$. In iteration $1 \le t \le m$, we choose $q := t$ as pivot row and $j \in R_q$ with the smallest reduced costs as pivot column. If the reduced costs of $j$ are less than 0, we perform a pivot step. Otherwise we proceed with the next iteration immediately.

Theorem 2.9 After m such iterations, the simplex tableau is primal and dual feasible.

The proof of Theorem 2.9 will be given later in this section. In the following, we refer to the simplex tableau with the following identifiers: elements of the $m \times n$ matrix $A^t$ after $0 \le t \le m$ simplex iterations are denoted by $a^t_{pi}$, $1 \le p \le m$, $1 \le i \le n$, entries in the right hand side $b^t$ are referred to by $b^t_p$, $1 \le p \le m$, and the reduced costs $c^t$ are denoted by $c^t_i$, $1 \le i \le n$.

First, we prove that our pivot selection preserves primal feasibility. We observe that $x = 0$ is primal feasible as $b^0_p = 1 \ge 0$ for all $1 \le p \le m$. To assure the maintenance of primal feasibility, we must show that $b^t_p \ge 0$ for all $1 \le p \le m$. To do so, we prove the following

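Lemma 2.9 Let $0 \le t \le m$, $1 \le p \le m$, $1 \le i \le n$. Then,

(a) $p > t$ implies that $a^t_{pi} = a^0_{pi}$ and $b^t_p = 1$.

(b) $b^t_p = 0$, $i \in \bigcup_{r > t} R_r$ implies $a^t_{pi} \in \{-1, 0\}$.

(c) $b^t_p = 1$, $i \in \bigcup_{r > t} R_r$ implies $a^t_{pi} \in \{0, 1\}$.

(d) $b^t_p \in \{0, 1\}$.
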

Proof: We induce over $t$:

$t = 0$: $b^0_p = 1$ and $a^0_{pi} \in \{0, 1\}$ for all $1 \le p \le m$ and $1 \le i \le n$.

$t \to t + 1$: Let $t < m$, and denote the pivot column in iteration $t + 1$ by $j \in R_{t+1}$. If the reduced costs of column $j$ are greater or equal zero, then we are done since $b^{t+1} = b^t$ and $A^{t+1} = A^t$. Otherwise, we set $q := t + 1$ and choose $a^t_{qj}$ as pivot element. By induction hypothesis (a), we know that $b^t_q = 1$, and that $a^t_{qj} = a^0_{qj}$. Now, since $j \in R_q \subseteq C_q$, $a^0_{qj} = 1$. Thus our pivot element is equal to 1.

(a) Therefore, $a^{t+1}_{qi} = a^t_{qi}$ and $b^{t+1}_q = b^t_q = 1$ for all $1 \le i \le n$. Now let $t + 1 < p \le m$. According to our pivot selection strategy, $j \in R_{t+1} = C_{t+1} \setminus C_{t+2}$. This, and the interval property of the matrix $A$ imply $a^t_{pj} = a^0_{pj} = 0$ for $p = t + 2$, and thus for all $p > t + 1$. Thus, in iteration $t + 1$ the rows $p > t + 1$ do not change, i.e. $a^{t+1}_{pi} = a^t_{pi} = a^0_{pi}$ and $b^{t+1}_p = b^t_p = 1$.

For $p \ge t + 1$, (a) implies (b)–(d). Thus, in the following we assume $p \le t$. Since, for all $p$ with $a^t_{pj} = 0$, row $p$ does not change in iteration $t + 1$, we only need to consider $p \le t$ with $a^t_{pj} \ne 0$. Then, as the matrix $A$ is totally unimodular, it holds that $a^t_{pj} \in \{-1, 1\}$.

Finally, let $i \in \bigcup_{r > t+1} R_r$ with $a^t_{qi} = 0$. Due to the interval matrix property of $A$ and $a^0_{ri} = 1$ for some $r > q$, we know then that $a^0_{pi} = 0$ for all $p \le q$. Moreover, as all pivot elements up to step $t$ were chosen from rows lower than $q$, it holds that $a^t_{pi} = a^0_{pi} = 0$ for all $p \le q$, and $b^t_p = 1$ for all $q \le p \le m$. Therefore, we only need to consider $a^t_{qi} = 1$.

(b–d) Let $p \le t$, $i \in \bigcup_{r > t+1} R_r$, $a^t_{qi} = 1$ and $a^t_{pj} \in \{-1, 1\}$. Using induction hypothesis (d), we know that $b^t_p \in \{0, 1\}$. First, assume that $b^t_p = 0$. By induction hypothesis (b), we know that $a^t_{pj} = -1$ and $a^t_{pi} \in \{-1, 0\}$. Thus,

\[ b^{t+1}_p = b^t_p - a^t_{pj}\, b^t_q = 1 \quad\text{and}\quad a^{t+1}_{pi} = a^t_{pi} - a^t_{pj}\, a^t_{qi} = a^t_{pi} + 1 \in \{0, 1\}. \]

Now let us assume $b^t_p = 1$. Then, by induction hypothesis (c), we know that $a^t_{pj} = 1$, and $a^t_{pi} \in \{0, 1\}$. Thus,

\[ b^{t+1}_p = b^t_p - a^t_{pj}\, b^t_q = 0 \quad\text{and}\quad a^{t+1}_{pi} = a^t_{pi} - a^t_{pj}\, a^t_{qi} = a^t_{pi} - 1 \in \{-1, 0\}. \]


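Now we show that after $m$ iterations we achieve dual feasibility.

Lemma 2.10 Let $1 \le t \le m$. Then,

(a) $c^t_i \ge 0$ for all $i \in R_t$.

(b) If $t < m$, then $i \in \bigcup_{p \le t} R_p$ implies $c^{t+1}_i = c^t_i$.

Proof: (a) Let $j \in R_t$ denote the pivot column in iteration $t$. If $c^{t-1}_j \ge 0$ we are done, as then $c^t_i = c^{t-1}_i \ge c^{t-1}_j \ge 0$ for all $i \in R_t$. So let us assume $c^{t-1}_j < 0$. In Lemma 2.9, we have already shown that the pivot element is $a^{t-1}_{tj} = 1$, and $a^{t-1}_{ti} = a^0_{ti}$ for all $1 \le i \le n$. In particular, we know that $a^{t-1}_{ti} = 1$ for all $i \in R_t \subseteq C_t$. We conclude that $c^t_i = c^{t-1}_i - c^{t-1}_j\, a^{t-1}_{ti} = c^{t-1}_i - c^{t-1}_j \ge 0$ for all $i \in R_t$.

(b) Let $1 \le p \le t < m$ and $i \in R_p$. Lemma 2.9 states that $a^t_{ri} = a^0_{ri}$ for all $t < r \le m$. Then, due to the interval matrix property of $A$, and $i \notin C_{p+1}$, we know that $a^t_{t+1,i} = a^0_{t+1,i} = 0$. Therefore, $c^{t+1}_i = c^t_i$.
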

Corollary 2.4 After $m$ iterations we achieve dual feasibility.

Proof: Let $1 \le i \le n$, and $1 \le t \le m$ such that $i \in R_t$. With Lemma 2.10, it holds that:

\[ c^m_i = c^{m-1}_i = \dots = c^t_i \ge 0. \]

Now, we give the previously postponed

Proof of Theorem 2.9: In Lemma 2.9 and Corollary 2.4, we have shown that after $m$ iterations the simplex tableau is primal and dual feasible.

2.3.2.2 An Efficient Simplex Realization

We have shown how the WSSP on interval graphs can be stated as a totally unimodular LP. Moreover, we have proven a feasible pivot selection strategy that yields an optimal tableau after at most $n$ simplex iterations.

In the following, we develop an efficient $\Theta(n \log n)$-time algorithm to compute a set $Q \subseteq V$ with $I_i \cap I_j = \emptyset$ for all $i \ne j \in Q$, such that $\mathrm{cost}(Q) := \sum_{i \in Q} c_i$ is minimal. Most importantly, the algorithm also provides us with dual and reduced-cost information as a byproduct. To establish that algorithm, we show how the simplex computations according to the pivot strategy developed in the previous section can be performed efficiently.

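Theorem 2.10 Let $(j_1, \dots, j_m) \in R_1 \times \dots \times R_m$ denote the sequence of pivot columns according to Section 2.3.2.1, and let $d: \{1, \dots, m\} \to \{0, 1\}$ with $d(t) = 1$ iff $c^{t-1}_{j_t} < 0$ for all $1 \le t \le m$. Then the set

\[ Q := \{ j_t \mid 1 \le t \le m \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_r} = \emptyset \text{ for all } r > t \text{ with } j_r \in Q \} \]

is a stable set in $V$ with minimal costs.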

Proof: If no pivoting is taking place ($d(t) = 0$ for all $1 \le t \le m$), the initial tableau is optimal with $x = 0$. Therefore, $Q = \emptyset$ is an optimal solution to the problem. So let us assume that $D := \{ t \mid 1 \le t \le m \text{ and } d(t) = 1 \} \ne \emptyset$. We induce over $m$:

$m = 1$: If $D \ne \emptyset$ and there exists only one row, exactly one pivot step is being carried out. Then $x_{j_1}$ is the only basic variable, and according to Lemma 2.9 it holds that: $x_{j_1} = 1$. Thus, $Q = \{j_1\}$ is an optimal solution.

$m \to m + 1$: Now assume that there are $m + 1$ maximal conflict cliques. We set $l := \max_{t \in D} t$. Again, by applying Lemma 2.9, we know that $b^{m+1}_l = 1$. Therefore, there exists an optimal solution $Q \subseteq \{ j_t \mid 1 \le t \le l \text{ with } d(t) = 1 \}$, with $j_l \in Q$. Let $k := \min\{ 1 \le t \le m + 1 \mid j_l \in C_t \}$. When setting $N := \bigcup_{k \le t \le m+1} C_t \setminus \{j_l\}$ we know that $x_N = 0$. Thus, there exists an optimal solution $Q = \{j_l\} \cup S$, where $S \subseteq \bigcup_{t < k} R_t$, and it holds that $I_i \cap I_{j_l} = \emptyset$ for all $i \in S$.

Since $\mathrm{cost}(Q) = c_{j_l} + \mathrm{cost}(S)$, we can construct an optimal solution by finding such a set $S$ with a minimal value $\mathrm{cost}(S)$. If $k = 1$, it holds that $S = \emptyset$, and we are done. Otherwise, we have to solve a WSSP on an interval graph with $k - 1 \le m$ maximal conflict cliques to find such a set $S$. By setting up the corresponding LP, we find that the sequence of pivot elements to solve this problem is exactly $(j_1, \dots, j_{k-1})$, and pivot steps are carried out for all $1 \le t < k$ with $d(t) = 1$. We apply our induction hypothesis and achieve

\[ S = \{ j_t \mid 1 \le t < k \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_s} = \emptyset \text{ for all } t < s < k \text{ with } j_s \in S \}. \]

Thus,

\[ Q = \{j_l\} \cup S = \{ j_t \mid 1 \le t \le m + 1 \text{ with } d(t) = 1 \text{ and } I_{j_t} \cap I_{j_s} = \emptyset \text{ for all } s > t \text{ with } j_s \in Q \} \]

is an optimal solution to the WSSP on interval graphs.

The above Theorem 2.10 allows us to construct an optimal solution if we know the sequence of pivot elements: All we have to do is start with $Q = \emptyset$. Then we visit all pivot elements in the reverse order. An element $j_t$ is added to $Q$ iff $d(t) = 1$ and its corresponding interval does not overlap with any corresponding interval of an element in $Q$. This last check can be performed efficiently by maintaining the value $\min_{j \in Q} f_j$, whereby $f_i := \min\{ 1 \le p \le m \mid i \in C_p \}$ for all $1 \le i \le n$ denotes the index of the first maximal conflict clique that node $i$ belongs to.
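
The backward pass just described can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name and the list-based encoding are assumptions, with `pivots[t-1]` standing for $j_t$, `d[t-1]` for $d(t)$, and `first[i]` for the 1-based index $f_i$. Since $j_t$ lies in the rest clique $R_t$, its last clique is $t$, so it overlaps an already chosen node $j$ exactly when $f_j \le t$.

```python
def reconstruct_stable_set(pivots, d, first):
    """Visit the pivot columns in reverse order and collect the stable set Q."""
    m = len(pivots)
    chosen = []
    min_first = m + 1  # min of f_j over the chosen nodes; m + 1 while Q is empty
    for t in range(m, 0, -1):
        j = pivots[t - 1]
        # j_t is compatible with Q iff every chosen node starts in a later clique.
        if d[t - 1] == 1 and t < min_first:
            chosen.append(j)
            min_first = min(min_first, first[j])
    return chosen


# Three cliques; node 1 (pivot of iteration 2) overlaps node 2 and is skipped.
print(reconstruct_stable_set([0, 1, 2], [1, 1, 1], [1, 1, 2, 3, 1]))  # [2, 0]
```

Each pivot column is inspected once and the overlap test is a single comparison, so the pass is linear in $m$.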

It remains to compute the sequence of pivot columns for which a pivot step is being carried

out. However, according to Section 2.3.2.1, this is an easy task if we can only determine the

reduced costs quickly:

Lemma 2.11 Let $1 \le t \le m$, let $(j_1, \dots, j_m) \in R_1 \times \dots \times R_m$ be the sequence of pivot columns according to our pivot selection strategy, and let $d: \{1, \dots, m\} \to \{0, 1\}$ with $d(t) = 1$ iff $c^{t-1}_{j_t} < 0$. Furthermore, we set $q_t := t$. Let $z^t$ denote the objective function value after iteration $t$ ($z^0 := 0$), and $a^{t-1}_{q_t j_t} = 1$ be the pivot element in iteration $t$. Finally, for all $1 \le i \le n$, we set $g^t_i := z^{f_i - 1} - z^t$ if $f_i \le t$, and $g^t_i := 0$ otherwise. Then,

(a) $z^t = z^{t-1} + c^{t-1}_{j_t}\, d(t)$.

(b) $z^t = \sum_{1 \le r \le t} c^{r-1}_{j_r}\, d(r)$.

(c) $c^t_i = c_i + g^t_i$ for all $i \in \bigcup_{t < p \le m} R_p$.

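Proof: (a) If no pivoting takes place in iteration $t$, then $c^{t-1}_{j_t} \ge 0$ and $d(t) = 0$, and $z^t = z^{t-1}$. Otherwise, $c^{t-1}_{j_t} < 0$ and $d(t) = 1$. According to Lemma 2.9, $a^{t-1}_{q_t j_t} = a^0_{q_t j_t} = 1$, and $b^{t-1}_{q_t} = 1$. Thus, $z^t = z^{t-1} + c^{t-1}_{j_t}\, b^{t-1}_{q_t} / a^{t-1}_{q_t j_t} = z^{t-1} + c^{t-1}_{j_t}$.

(b) With $z^0 = 0$, (b) is a simple implication of (a).

(c) Let $1 \le t < p \le m$ and $i \in R_p$. First, assume that $f_i > t$. Then, $a^0_{ri} = 0$ for all $r \le t$. As all pivot elements up to step $t$ were chosen from rows $1, \dots, t$, we have that $c^t_i = c_i = c_i + g^t_i$.

So let $f_i \le t$. Using (b), we see that $g^t_i = z^{f_i - 1} - z^t = -\sum_{f_i \le r \le t} c^{r-1}_{j_r}\, d(r)$. As $a^0_{ri} = 0$ for all $r < f_i$, we know that $c^{f_i - 1}_i = c_i$. Now let $f_i \le r \le t < p$. With Lemma 2.9, it holds that $a^{r-1}_{q_r j_r} = a^0_{q_r j_r} = 1$, and also $a^{r-1}_{q_r i} = a^0_{q_r i} = 1$. Thus, $c^r_i = c^{r-1}_i - c^{r-1}_{j_r}\, d(r)$, and hence $c^t_i = c^{f_i - 1}_i - \sum_{f_i \le r \le t} c^{r-1}_{j_r}\, d(r) = c_i + g^t_i$.
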

2.3.2.3 An Algorithm providing Dual Information

With Theorem 2.10 and Lemma 2.11, we can formulate an efficient algorithm solving the WSSP on interval graphs that provides us with dual values as a byproduct. In phase 1, we determine the (max_start) rest cliques $R_p$, with $1 \le p \le m$, and the corresponding values $f_i$, $1 \le i \le n$. This can be done in time $\Theta(n \log n)$.

Phase 2 consists of $m$ iterations: First, we set $z^0 := 0$. In each iteration $1 \le t \le m$ we compute $c^{t-1}_i = c_i + z^{f_i - 1} - z^{t-1}$ for all $i \in R_t$, and $j_t \in R_t$ with $c^{t-1}_{j_t} = \min_{i \in R_t} c^{t-1}_i$. If $c^{t-1}_{j_t} \ge 0$, we set $z^t := z^{t-1}$, otherwise $z^t := z^{t-1} + c^{t-1}_{j_t}$. Finally, we set $t := t + 1$ and proceed to the next iteration.

With Remark 2.2, we know that the sets $R_p$ form a partition of $V$. Thus, all nodes $1 \le i \le n$ are being looked at exactly once to compute the reduced costs. Also, in all computations of the pivot columns, each node is incorporated only once. Therefore, phase 2 takes time $\Theta(n)$.

After $m$ iterations, we know the value $z^m$, as well as the sequence $(j_1, \dots, j_m)$ of pivot columns and the function $d$. By applying Theorem 2.10, we can construct a stable set out of this information in linear time. Since the rest cliques of the underlying interval graph are independent of the objective function, we achieve an incremental linear time algorithm for $\Omega(\log n)$ calls with different objectives.
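
The two phases and the final backward pass can be put together in a short sketch. It is an illustration under assumptions, not the thesis code: intervals are closed `(start, end)` pairs, nodes are 0-based, clique indices are 1-based, the function name `solve_wssp` is hypothetical, and phase 1 uses a simple quadratic scan instead of a $\Theta(n \log n)$ sweep. The returned list `duals` holds the differences $z^t - z^{t-1}$, which serve as clique duals under the sign convention that the constraints $\sum_{i \in C_p} x_i \le 1$ of a minimization problem receive non-positive multipliers (an assumption of this sketch).

```python
def solve_wssp(intervals, cost):
    """Minimum weighted stable set on an interval graph via the pivot rule."""
    n = len(intervals)
    # Phase 1: maximal conflict cliques ordered by max_start, f_i, rest cliques.
    starts = sorted({s for s, _ in intervals})
    cliques = []
    for k, s in enumerate(starts):
        members = [i for i, (a, b) in enumerate(intervals) if a <= s <= b]
        if k + 1 == len(starts) or min(intervals[i][1] for i in members) < starts[k + 1]:
            cliques.append(members)
    m = len(cliques)
    first, last = [0] * n, [0] * n
    for p, clique in enumerate(cliques, start=1):
        for i in clique:
            first[i] = first[i] or p  # keep the first clique index once set
            last[i] = p
    rest = [[] for _ in range(m + 1)]  # rest[t] = R_t, the nodes whose last clique is t
    for i in range(n):
        rest[last[i]].append(i)
    # Phase 2: one iteration per clique; z[t] is the objective after iteration t.
    z = [0] * (m + 1)
    pivot, d = [None] * (m + 1), [0] * (m + 1)
    for t in range(1, m + 1):
        reduced = {i: cost[i] + z[first[i] - 1] - z[t - 1] for i in rest[t]}
        pivot[t] = min(reduced, key=reduced.get)
        if reduced[pivot[t]] < 0:
            d[t] = 1
            z[t] = z[t - 1] + reduced[pivot[t]]
        else:
            z[t] = z[t - 1]
    # Backward pass of Theorem 2.10.
    stable, min_first = [], m + 1
    for t in range(m, 0, -1):
        if d[t] == 1 and t < min_first:
            stable.append(pivot[t])
            min_first = min(min_first, first[pivot[t]])
    duals = [z[t] - z[t - 1] for t in range(1, m + 1)]
    return z[m], sorted(stable), duals


value, stable, duals = solve_wssp([(0, 3), (2, 5), (4, 7), (6, 8), (1, 2)],
                                  [-3, -5, -4, -1, -2])
print(value, stable, duals)  # -7 [0, 2] [-3, -2, -2]
```

In the example, subtracting the duals of all cliques containing a node from its weight gives the reduced costs 0, 0, 0, 1, 1 for nodes 0 to 4, all non-negative, and the duals sum to the optimal value $-7$.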

Most importantly, we get dual values as a byproduct. By looking at the optimal tableau, we find that the optimal dual variable for each maximal clique constraint $1 \le t \le m$ has the value $z^t - z^{t-1} = c^{t-1}_{j_t}\, d(t)$.


2.3.3 Cost-based Filtering

After having developed an algorithm that solves the WSSP on interval graphs, we now give an

efficient filtering algorithm for the corresponding constraint. Unlike the Shortest Path Problem,

that can also be solved in polynomial time, but for which achieving a state of arc-consistency

is NP-hard, the Weighted Stable Set Problem exhibits a stable substructure: when any node is

removed from or added to the stable set, the remaining problem is again a Weighted Stable Set

Problem. In our case, the sub-problem can even be represented as a WSSP on a (modified)

interval graph. Therefore, a simple arc-consistency algorithm can be obtained by probing all

variable values using the previously developed algorithm, which requires time $\Theta(n^2)$.


In the following, we develop a cost-based filtering algorithm for the WSSP on interval graphs that achieves a state of arc-consistency in amortized linear time for $\Omega(\log n)$ calls to the routine with changing objectives and/or variable domains. To develop that algorithm, we re-interpret the problem as finding a shortest path in a directed, acyclic and node-weighted co-interval graph: We introduce an artificial source $\sigma$ and an artificial sink $\tau$ with corresponding intervals before and after all other nodes, and with $c_\sigma := 0$ and $c_\tau := 0$. Set $G' := (N, E')$ with $N := V \cup \{\sigma, \tau\}$ and $E' := \{ (i, j) \in N \times N \mid \mathrm{end}(i) < \mathrm{start}(j) \}$. We then define $\pi$ as the set of simple paths from $\sigma$ to $\tau$ in $G'$. The cost of a path $P \in \pi$ is defined as $\mathrm{cost}(P) := \sum_{i \in P} c_i$.

Remark 2.3 There is a one-to-one correspondence between stable sets in $G$ and paths $P \in \pi$ in $G'$: Let $Q = \{i_1, \dots, i_l\} \subseteq V$ denote a stable set in $G$. Without loss of generality, we may assume that $\mathrm{start}(i_k) < \mathrm{start}(i_{k+1})$ for all $1 \le k < l$. Then, since $Q$ is a stable set, it even holds that $\mathrm{end}(i_k) < \mathrm{start}(i_{k+1})$ for all $1 \le k < l$. Therefore, $P := (\sigma, i_1, \dots, i_l, \tau)$ is a simple path from $\sigma$ to $\tau$ in $G'$ with $\mathrm{cost}(P) = \sum_{h \in P} c_h = \sum_{1 \le k \le l} c_{i_k} = \mathrm{cost}(Q)$.

On the other hand, if $P = (\sigma, i_1, \dots, i_l, \tau) \in \pi$, then $\mathrm{end}(i_k) < \mathrm{start}(i_{k+1})$ for all $1 \le k < l$. Therefore, the set $Q := \{i_1, \dots, i_l\}$ is a stable set in $G$, and $\mathrm{cost}(Q) = \sum_{1 \le k \le l} c_{i_k} = \sum_{h \in P} c_h = \mathrm{cost}(P)$.

Therefore, a minimum weighted stable set in $G$ corresponds to a shortest path from $\sigma$ to $\tau$ in $G'$, and both have the same costs.

Given an upper bound $B$, we define

\[ \mathrm{Rem} := \{ i \in V \mid \forall\, P \in \pi \text{ with } i \in P: \mathrm{cost}(P) \ge B \}, \text{ and} \]

\[ \mathrm{Req} := \{ i \in V \mid \forall\, P \in \pi \text{ with } i \notin P: \mathrm{cost}(P) \ge B \}. \]

Then, with Remark 2.3, to achieve a state of arc-consistency for a weighted stable set constraint on an interval graph, we need to remove the value 1 from the domain of $X_i$ iff $i \in \mathrm{Rem}$, and we have to remove the value 0 from the domain of $X_i$ iff $i \in \mathrm{Req}$. Since the variables are all binary, this corresponds to setting $X_i = 0$ iff $i \in \mathrm{Rem}$, and $X_i = 1$ iff $i \in \mathrm{Req}$.


2.3.3.1 Removing Nodes

To compute $\mathrm{Rem}$, it is sufficient to determine the values of the shortest paths from the source $\sigma$ via node $j$ to the sink $\tau$ for all $j \in V$. This can be done by computing the shortest-path distances from the source and to the sink (compare with Section 2.2.2). With Remark 2.3, the shortest path from $\sigma$ to a node $j \in V$ can be determined by solving a WSSP on the reduced interval graph with node set $\bigcup_{p < f_j} R_p$, i.e., by solving the following LP:

\[
\begin{array}{lll}
\text{Minimize} & LP_4^j = \sum_{i \in \bigcup_{p < f_j} R_p} c_i x_i & \\
\text{subject to} & \sum_{i \in C_p} x_i \le 1 & \forall\, 1 \le p < f_j \\
 & x \ge 0 &
\end{array}
\]

Then the shortest-path value from $\sigma$ to $j$ is $LP_4^j + c_j$. According to the previously developed theory, the minimal objective for the above $LP_4^j$ is exactly $z^{f_j - 1}$. Thus, the shortest-path distance of node $j$ is $z^{f_j - 1} + c_j$.


A similar theory as presented before shows that the shortest-path distances to the sink can be determined by applying the algorithm of Section 2.3.2.3 using the last clique belongings $l_i := \max\{ 1 \le p \le m \mid i \in C_p \}$ for all $1 \le i \le n$ and the min_end rest cliques, where $\mathrm{min\_end}: M \to \mathbb{R}$, $\mathrm{min\_end}(C) := \min_{i \in C} \mathrm{end}(i)$.

Solving $LP_4^j$ in this inverse manner yields objective function values ${}^{\tau}z^t$ for all iterations $0 \le t \le m$. Then the shortest-path distance to the sink is ${}^{\tau}z^{m - l_j} + c_j$. With those values at hand, we can determine the shortest-path value through node $1 \le j \le n$ by $z^{f_j - 1} + c_j + {}^{\tau}z^{m - l_j}$. Then,

\[ \mathrm{Rem} = \{ j \in V \mid z^{f_j - 1} + c_j + {}^{\tau}z^{m - l_j} \ge B \}. \]


The algorithm sketched above determines the set of nodes that have to be removed from the

graph to achieve a state of arc-consistency. Of course, other constraints may also remove nodes.

In both cases, we must be able to handle these changes efficiently for the next call to our routine.

Without going into implementation details, we note that the removal of nodes does not affect the

interval structure of the graph, and that the data structures storing the max_start and min_end

rest cliques as well as the first and last clique belongings can be compressed in linear time to

delete any number of nodes from the graph.

We conclude that the members of $\mathrm{Rem}$ can be computed and deleted in time $\Theta(n \log n)$ and in amortized linear time for $\Omega(\log n)$ calls of the filtering algorithm.
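
The quantity that decides membership in Rem, the cost of a cheapest stable set through a given node, can be illustrated with a short sketch. Note that this swaps in a different technique: instead of the simplex-based values $z^{f_j - 1}$ and ${}^{\tau}z^{m - l_j}$, it uses a plain dynamic program over the intervals sorted by end point (in the style of weighted interval scheduling). The function names are hypothetical, intervals are closed `(start, end)` pairs, and a path may skip all nodes, so the empty stable set with cost 0 is allowed.

```python
from bisect import bisect_left


def best_before(intervals, cost):
    """For each j: cheapest stable set among intervals ending before start(j)."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    ends = [intervals[i][1] for i in order]
    prefix = [0]  # prefix[k]: optimum over the k intervals with the smallest ends
    for i in order:
        k = bisect_left(ends, intervals[i][0])  # intervals with end < start(i)
        prefix.append(min(prefix[-1], cost[i] + prefix[k]))
    return [prefix[bisect_left(ends, s)] for s, _ in intervals]


def nodes_to_remove(intervals, cost, bound):
    """Nodes j for which every stable set containing j costs at least `bound`."""
    before = best_before(intervals, cost)
    mirrored = [(-e, -s) for s, e in intervals]  # reverse time for the sink side
    after = best_before(mirrored, cost)
    through = [before[j] + cost[j] + after[j] for j in range(len(intervals))]
    return [j for j, value in enumerate(through) if value >= bound], through


intervals = [(0, 3), (2, 5), (4, 7), (6, 8), (1, 2)]
cost = [-3, -5, -4, -1, -2]
rem, through = nodes_to_remove(intervals, cost, -6)
print(through)  # [-7, -6, -7, -6, -6]
print(rem)      # [1, 3, 4]
```

Here `before[j]` plays the role of $z^{f_j - 1}$ and `after[j]` the role of the corresponding value from the reverse run; with the bound $-6$, only nodes 0 and 2 can still belong to a stable set of cost below the bound.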

2.3.3.2 Requiring Nodes

To compute $\mathrm{Req}$, we need to identify all nodes that must be an element of any path having a value lower than $B$. Obviously, only nodes on the shortest path $S$ can have this property. Thus, for every node $j \in S$ we need to find out whether the value of a shortest path $P$ with $j \notin P$ is still lower than $B$.

Remark 2.4 Let $1 \le f_j \le l_j \le m$ denote the first and the last clique belongings of $j$. Furthermore, let $P$ be the shortest path with $j \notin P$. Obviously, it either holds that $I_j \cap I_i = \emptyset$ for all $i \in P$, or there exists a node $i \in P$ such that $I_j \cap I_i \ne \emptyset$. In the first case, we know that the value of the shortest path not using the time interval $I_j$ has the value $\mathrm{cost}(S) - c_j$.

In the second case, after having determined and deleted $\mathrm{Rem}$ from the graph, we only have to check if there exists any node $i \in V$, $i \ne j$, with $I_j \cap I_i \ne \emptyset$. For when such a node $i$ exists, there also exists a path $\bar{P}$ with $i \in \bar{P}$ and $\mathrm{cost}(\bar{P}) < B$ (otherwise $i$ would have been deleted before). Since $i$ and $j$ are overlapping, we also know that $j \notin \bar{P}$. Thus, in $\bar{P}$ we have found a path not covering $j$ with a value lower than $B$. Therefore, $j \notin \mathrm{Req}$. On the other hand, if such a node $i$ does not exist, the second case is obsolete, and we only need to consider the first case.

By making that observation, we can determine $\mathrm{Req}$ in amortized linear time: First, for all $j \in S$, we check whether there exists a node in the shrunken graph that overlaps with $j$. Without specifying the implementation details here, we just note that this can be done in linear time for all nodes $j$. If no overlapping node exists, we compute $z^m - c_j$ and check whether this value is lower than $B$. If not, we add $j$ to $\mathrm{Req}$, otherwise we do not add it to $\mathrm{Req}$.


Now, we have an efficient algorithm at hand to compute $\mathrm{Req}$. Obviously, other constraints

and branching decisions must be taken into account when our procedure is being called next.

Thus, we have to be able to transform our graph in such a way that from now on, every path must

visit the new required nodes. At first glance this sounds problematic, as a naive approach would

delete all arcs going around the required nodes, but this procedure would cause the resulting

graph to not have the co-interval property anymore.

We can force the admissible paths (that is, paths with costs lower than $B$) to visit the required nodes by making them extremely cheap: Let $\mathrm{Req} \subseteq V$ be the set of (currently) required nodes. Furthermore, let $T > 0$ be sufficiently large³. Then we set $\hat{c}_j := c_j - T$ for all $j \in \mathrm{Req}$, and $\hat{c}_j := c_j$ for all $j \notin \mathrm{Req}$. We use $\hat{c}$ instead of $c$ as our objective and check whether the shortest-path value is lower than $B - |\mathrm{Req}|\, T$. If not, either two required nodes overlap, or the shortest-path value in the original graph exceeds $B$. Moreover, by determining $\mathrm{Rem}$ for the objective $\hat{c}$ and the bound $B - |\mathrm{Req}|\, T$, we find all nodes that overlap with some required node plus all nodes that would cause the shortest path in the original graph to exceed the threshold $B$.

We summarize the results from the previous sections in

Theorem 2.11 Arc-consistency for a weighted stable set constraint on an interval graph can be achieved in time $\Theta(n \log n)$ or in amortized linear time for $\Omega(\log n)$ incremental calls of the filtering algorithm.

3Assuming that mini







0, a valid setting for Tis for example T:







maxi







mini








2.4 Weighted All Different Constraints

The constraint structure of many discrete optimization problems can be modeled efficiently using

all-different constraints. As a matter of fact, the all-different constraint was one of the first

global constraints that were considered in the literature [179]. Regarding the combination of

the all-different constraint and a linear objective, in [34] Caseau and Laburthe introduced the

MinWeightAllDiff constraint. In first applications [35], it was used for pruning purposes only.

In [77, 78], Focacci et al. showed how the constraint (the authors refer to it as the IlcAllDiffCost

constraint) can also be used for domain filtering by exploiting reduced-cost information.

In this section, we present an arc-consistency algorithm for the minimum weight all-different

constraint. It is based on standard operations research algorithms for the computation of mini-

mum weight bipartite matchings and shortest paths with non-negative edge weights. We show

that arc-consistency can be achieved in time $O(n(d + m \log m))$, where $n$ denotes the number of variables, $m$ is the cardinality of the union of all variable domains, and $d$ denotes the sum of the cardinalities of the variable domains.

The work presented in this section was published in [187]. It is structured as follows: In Sec-

tion 2.4.1, we formally define the minimum weight all-different constraint. The arc-consistency

algorithm for the constraint is presented in Section 2.4.2.

2.4.1 The Minimum Weight All-Different Constraint

Given a natural number $n \in \mathbb{N}$ and variables $X_1, \dots, X_n$, we denote the domains of the variables by $D_1 := D(X_1), \dots, D_n := D(X_n)$, and let $D := \{x_1, \dots, x_m\} = \bigcup_i D_i$ denote the union of all domains, whereby $m = |D|$. Furthermore, given costs $c_{ij} \ge 0$ for assigning value $x_j$ to variable $X_i$ (whereby $c_{ij}$ may be undefined if $x_j \notin D_i$), we add a variable for the objective $Z = \sum_{i,\, X_i = x_j} c_{ij}$ to be minimized. Note that the non-negativity restriction on $c$ can always be achieved by setting $\hat{c}_{ij} := c_{ij} - \min_{i,j} c_{ij}$, which will change the objective by the constant $n \min_{i,j} c_{ij}$.

In the course of optimization, once we have found a feasible solution with associated objective value $B$, we are then only searching for improving solutions, thus requiring $Z < B$. Then, we define:

Definition 2.10 The minimum weight all-different constraint is the conjunction of an all-different constraint on variables $X_1, \dots, X_n$ and a bound constraint on the objective $Z$, i.e.:

MinWeightAllDiff











ϑAllDiff



















AllDiff















Consider the following example: Given variables $X_1, \dots, X_6$ with domains $D_1, \dots, D_6 \subseteq \{A, B, C, D, E, F\}$. In Figure 2.7, we complete the example by specifying a cost matrix $c$.


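Fig. 2.7: (a) The table gives the costs $c_{ij}$ of assigning a value $x_j$ to a variable $X_i$. (b) A bipartite graph links variables to values that they can take. Bold numbers and lines mark the optimal solution with objective value 26.

(a) Values: A B C D E F. Defined costs per variable, listed from left to right:
X1: 5, 7, 2
X2: 2, 3, 4
X3: 4, 8, 5
X4: 3, 6, 2
X5: 8, 7, 2
X6: 3, 6

(b) [bipartite graph of variables 1 to 6 and values A to F]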

In the following, we will assume $m \ge n$, since otherwise there exists no feasible assignment. Figure 2.7 shows that there is a tight correlation between the minimum weight all-different constraint and the Weighted Bipartite Perfect Matching Problem that can be formalized by setting $G := (V_1, V_2, E)$ where $V_1 := \{X_1, \dots, X_n\}$, $V_2 := \{x_1, \dots, x_m\}$ and $E := \{ \{X_i, x_j\} \mid x_j \in D_i \}$. It is easy to see that any perfect matching (that is, a subset of pairwise non-adjacent edges of cardinality $n$) in $G$ defines a feasible assignment of all-different values to the variables. Therefore, there is also a one-to-one correspondence of cost-optimal variable assignments and minimum weight perfect matchings in $G$.

For the latter problem, a series of efficient algorithms have been developed. Using the Hungarian method or the successive shortest path algorithm, it can be solved in time $O(n(d + m \log m))$, where $d := \sum_i |D_i|$ denotes the number of edges in the given bipartite graph. For a detailed presentation of approaches for the Weighted Bipartite Matching Problem, we refer the reader to [1].

Since there are efficient algorithms available, there is no need to apply a tree search to com-

pute an optimal variable assignment if the minimum weight all-different constraint is the only

constraint of a discrete optimization problem. However, the situation changes when the problem

consists of more than one minimum weight all-different constraint or a combination with other

constraints. Then a tree search may very well be the favorable algorithmic approach to tackle the

problem [34].

In such a scenario, we can exploit the algorithms developed in the OR community to compute a bound on the best possible variable assignment that can still be reached in the sub-tree rooted at the current choice point. Also, it has been suggested to use reduced-cost information to perform cost-based filtering at essentially no additional computational cost [78].

In the following, we describe an algorithm that achieves arc-consistency in the same worst case running time as is needed to compute a minimum weight perfect matching when using the Hungarian method or the successive shortest path algorithm.

Fig. 2.8: (a) The new cost matrix $c^M$, and (b) the network $N_M$ for the optimal matching from Figure 2.7.

(a) Values: A B C D E F. Defined costs per variable, listed from left to right:
X1: -5, 7, 2
X2: 2, -3, 4
X3: -4, 8, 5
X4: 3, -6, 2
X5: 8, 7, -2
X6: 4, -6

(b) [directed network $N_M$ on variables 1 to 6 and values A to F]

2.4.2 An Arc-Consistency Algorithm

To achieve arc-consistency of the minimum weight all-different constraint, we need to remove all values from variable domains that cannot be part of any feasible assignment of values to variables with associated costs $Z < B$. That is, in the graph interpretation of the problem, we need to compute and remove the set of edges that cannot be part of any perfect matching with costs less than $B$.

For any perfect matching $M$, we set $\mathrm{cost}(M) := \sum_{\{X_i, x_j\} \in M} c_{ij}$. Furthermore, we define the corresponding network $N_M := (V_1, V_2, A)$ whereby

\[ A := \{ (X_i, x_j) \mid \{X_i, x_j\} \in M \} \cup \{ (x_j, X_i) \mid \{X_i, x_j\} \in E \setminus M \}, \]

and $c^M_{ij} := -c_{ij}$ if $\{X_i, x_j\} \in M$, and $c^M_{ij} := c_{ij}$ otherwise. That is, we transform the graph $G$ into a directed network by directing matching edges from $V_1$ to $V_2$ and all other edges from $V_2$ to $V_1$. Furthermore, the cost of arcs going from $V_1$ to $V_2$ is multiplied by $-1$. Figure 2.8 shows the directed network $N_M$ for our example.

In the following, we will make some key observations that we will use later to develop an efficient arc-consistency algorithm. For a cycle $C$ in $N_M$, we set $\mathrm{cost}(C) := \sum_{e \in C} c^M_e$. Let $M$ denote a perfect matching in $G$.


Lemma 2.12 Given an edge $e \notin M$, assume that there exists a minimum-cost cycle $C_e$ in $N_M$ that contains $e$.⁴

a) There is a perfect matching $M_e$ in $G$ that contains $e$, and it holds that $\mathrm{cost}(M_e) = \mathrm{cost}(M) + \mathrm{cost}(C_e)$.

b) The set $M$ is a minimum weight perfect matching in $G$, iff there is no negative cycle in $N_M$.

c) If $M$ is of minimum weight, then for every perfect matching $M_e$ that contains $e$, it holds that $\mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e)$.


Proof:

a) Let $C_e^+$ and $C_e^-$ denote the edges in $E$ that correspond to arcs in $C_e$ that go from $V_2$ to $V_1$, or from $V_1$ to $V_2$, respectively. We define $M_e := (M \setminus C_e^-) \cup C_e^+$. Obviously, $e \in M_e$, and since $|C_e^+| = |C_e^-|$, $M_e$ is a perfect matching in $G$. It holds that:

\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) - \mathrm{cost}(C_e^-) + \mathrm{cost}(C_e^+) = \mathrm{cost}(M) + \mathrm{cost}(C_e). \]

b) Follows directly from (a).

c) It is easy to see that the symmetric difference $M \oplus M_e = (M \setminus M_e) \cup (M_e \setminus M)$ forms a set of cycles $C_1, \dots, C_r$ in $G$ that also correspond to cycles in $N_M$. Moreover, it holds that

\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) - \mathrm{cost}(M \setminus M_e) + \mathrm{cost}(M_e \setminus M), \]

and thus

\[ \mathrm{cost}(M_e) = \mathrm{cost}(M) + \sum_i \mathrm{cost}(C_i). \]

Without loss of generality, we may assume that $e \in C_1$. Then, due to (b) and $\mathrm{cost}(C_1) \ge \mathrm{cost}(C_e)$, we have that

\[ \mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_1) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e). \]


Theorem 2.12 Let $M$ denote a minimum weight perfect matching in $G$, and $e \in E \setminus M$. There exists a perfect matching $M_e$ with $e \in M_e$ and $\mathrm{cost}(M_e) < B$, iff there exists a cycle $C_e$ in $N_M$ that contains $e$ with $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$.

⁴ Here and in the following we identify an edge $e \in G$ and its corresponding arc in the directed network $N_M$.


Proof: Let $C_e$ denote the cycle in $N_M$ with $e \in C_e$ and minimal costs.

($\Rightarrow$) Assume that there is no such cycle. Then either there is no cycle in $N_M$ that contains $e$, or $\mathrm{cost}(C_e) \ge B - \mathrm{cost}(M)$. In the first case, there exists no matching $M_e$ that contains $e$.⁵ In the latter case, with Lemma 2.12(c), we have that $\mathrm{cost}(M_e) \ge \mathrm{cost}(M) + \mathrm{cost}(C_e) \ge B$, which is a contradiction.

($\Leftarrow$) We have that $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$. With Lemma 2.12(a) this implies that there exists a perfect matching $M_e$ that contains $e$, and for which it holds that $\mathrm{cost}(M_e) = \mathrm{cost}(M) + \mathrm{cost}(C_e) < B$.

With Theorem 2.12, we can now characterize the values that have to be removed from variable domains in order to achieve arc-consistency. Given a minimum weight perfect matching $M$ in $G$, infeasible assignments simply correspond to arcs $e$ in $N_M$ that are not contained in any cycle $C_e$ with $\mathrm{cost}(C_e) < B - \mathrm{cost}(M)$.

Of course, if $\mathrm{cost}(M) \ge B$, we know from Lemma 2.12(b) that the current choice point is inconsistent, and we can backtrack right away. So let us assume that $\mathrm{cost}(M) < B$. Then, using empty cycles $C_e$ with $\mathrm{cost}(C_e) = 0 < B - \mathrm{cost}(M)$, we can show that all edges $e \in M$ are valid assignments. Thus, we only need to consider $e \notin M$. By construction, we know that the corresponding edge in $N_M$ is directed from $V_2$ to $V_1$, i.e. $e = (x_j, X_i)$. Denote the shortest-path distance from $X_i$ to $x_j$ in $N_M$ by $\mathrm{dist}(X_i, x_j, c^M)$. Then, for the minimum weight cycle $C_e$ with $e \in C_e$, it holds that: $\mathrm{cost}(C_e) = c_{ij} + \mathrm{dist}(X_i, x_j, c^M)$. Thus, it is sufficient to compute the shortest-path distances from $V_1$ to $V_2$ in $N_M$.

We can ease this work by eliminating negative edge weights in $N_M$. Consider node potential functions $\pi^1 : V_1 \to \mathbb{Q}$ and $\pi^2 : V_2 \to \mathbb{Q}$. It is a well-known fact that the shortest-path structure of the network remains intact if we change the cost function by setting $\bar c^M_{ij} := c^M_{ij} + \pi^1_i - \pi^2_j$ for all $(i,j) \in M$, and $\bar c^M_{ij} := c^M_{ij} - \pi^1_i + \pi^2_j$ for all $(i,j) \notin M$. Then,

$\mathrm{dist}(X_i, x_j, \bar c^M) = \mathrm{dist}(X_i, x_j, c^M) + \pi^1_i - \pi^2_j$.

If the network does not contain negative weight cycles (which is true because $M$ is a perfect matching of minimum weight, see Lemma 2.12(b)), we can choose node potentials such that $\bar c^M \ge 0$. This idea has been used before in the all-pairs shortest path algorithm by Johnson [43].

In our context, after having computed a minimum weight perfect matching, we get the node potential functions $\pi^1$ and $\pi^2$ for free by using the dual and negative dual values corresponding to the nodes in $V_1$ and $V_2$, respectively. As a matter of fact, the resulting cost vector $\bar c^M$ is exactly the vector of reduced costs $\bar c$: If $e = (X_i, x_j) \in M \subseteq V_1 \times V_2$, then $0 = \bar c_{ij} = c_{ij} - \pi^1_i + \pi^2_j$, and thus $\bar c^M_{ij} = -c_{ij} + \pi^1_i - \pi^2_j = -\bar c_{ij} = \bar c_{ij}$. Otherwise, $\bar c^M_{ij} = c^M_{ij} - \pi^1_i + \pi^2_j = c_{ij} - \pi^1_i + \pi^2_j = \bar c_{ij}$ (see Figure 2.9).
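The effect of the node potentials on path lengths can be checked with a short telescoping argument. The following derivation is a sketch: the unified potential $p$ (with $p(X_i) = \pi^1_i$ and $p(x_j) = \pi^2_j$) and the path notation $P = (u_0, \dots, u_k)$ are introduced here only for illustration.

```latex
% Assumption: every arc (u, v) of N_M receives the changed cost
%   \bar c^M(u, v) = c^M(u, v) + p(u) - p(v),
% which covers both cases (arcs from V_1 to V_2 and arcs from V_2 to V_1).
\begin{align*}
\bar c^M(P)
  &= \sum_{t=1}^{k} \bigl( c^M(u_{t-1}, u_t) + p(u_{t-1}) - p(u_t) \bigr) \\
  &= c^M(P) + p(u_0) - p(u_k)
   = c^M(P) + \pi^1_i - \pi^2_j
   \qquad \text{for } u_0 = X_i,\ u_k = x_j .
\end{align*}
% The correction term depends only on the end nodes, so every path from
% X_i to x_j is shifted by the same amount and shortest paths stay shortest.
```

Since the shift is the same for all paths between two fixed nodes, it can be undone after the shortest-path computation by subtracting $\pi^1_i - \pi^2_j$ again.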

⁵ Note that this observation is commonly used in domain filtering algorithms for the all-different constraint [179].

[Figure 2.9, panel (a): table with columns A to F and rows 1 to 6. Only the entries of existing edges are printed; their column positions are not recoverable. Row 1: 0 1 0. Row 2: 0 0 1. Row 3: 0 0 1. Row 4: 1 0 0. Row 5: 2 5 0. Row 6: 1 0. Panel (b): drawing of the network with node potentials.]

Fig. 2.9: (a) The changed cost vector $\bar c = \bar c^M$, and (b) the network $N_M$ with node potentials $\pi^1$ and $\pi^2$. Bold numbers show those assignments that can be eliminated by simple reduced-cost propagation in the presence of a solution with value $B = 28$.

We summarize: To achieve arc-consistency, we first compute a minimum weight perfect matching in a bipartite graph in time $O(n(d + m \log m))$. We obtain an optimal matching $M$, dual values $\pi^1, \pi^2$, and reduced costs $\bar c$. If $\mathrm{cost}(M) \ge B$, we can backtrack. Otherwise, we set up the network $N_M$ with the cost vector $\bar c^M$ and compute $n$ single source shortest paths with non-negative edge weights, each of them requiring time $O(d + m \log m)$ when using Dijkstra's algorithm in combination with Fibonacci heaps [43]. We obtain distances $\mathrm{dist}(X_i, x_j, \bar c^M)$ for all variables and values. Finally, we remove value $x_j$ from the domain of $X_i$, iff

$\bar c_{ij} + \mathrm{dist}(X_i, x_j, \bar c^M) = c_{ij} + \mathrm{dist}(X_i, x_j, \bar c^M) - \pi^1_i + \pi^2_j = c_{ij} + \mathrm{dist}(X_i, x_j, c^M) = \mathrm{cost}(C_{(i,j)}) \ge B - \mathrm{cost}(M)$,

where $C_{(i,j)}$ is the shortest cycle in $N_M$ that contains $(x_j, X_i)$. Obviously, this entire procedure runs in time $O(n(d + m \log m))$. The entire domain filtering process is visualized for our example in Figure 2.10.
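The filtering rule summarized above can be tried out on small instances with the following sketch. It is not the implementation discussed in the text: the minimum weight matching is found by brute force, and the shortest paths are computed with Bellman-Ford on the original costs $c^M$ instead of Dijkstra's algorithm on reduced costs, so no dual values are needed. The function names and the square cost matrix (with `INF` marking values outside a domain) are assumptions made for this illustration.

```python
import itertools
import math

INF = math.inf


def min_weight_matching(cost):
    """Brute-force minimum weight perfect matching; for tiny examples only."""
    n = len(cost)
    best_value, best_match = INF, None
    for perm in itertools.permutations(range(n)):
        value = sum(cost[i][perm[i]] for i in range(n))
        if value < best_value:
            best_value, best_match = value, perm
    return best_value, best_match


def distances_from(i, cost, match):
    """Bellman-Ford distances from variable node X_i to all value nodes in N_M.

    Matching arcs run X_a -> x_b with cost -c_ab, all other arcs run
    x_b -> X_a with cost c_ab. Nodes 0..n-1 are variables, n..2n-1 values.
    """
    n = len(cost)
    arcs = []
    for a in range(n):
        for b in range(n):
            if cost[a][b] == INF:
                continue
            if match[a] == b:
                arcs.append((a, n + b, -cost[a][b]))
            else:
                arcs.append((n + b, a, cost[a][b]))
    dist = [INF] * (2 * n)
    dist[i] = 0
    for _ in range(2 * n - 1):
        for u, v, c in arcs:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
    return dist[n:]


def filter_assignments(cost, bound):
    """Return all assignments (i, j) contained in a matching of cost < bound,
    or None if even the optimal matching does not improve on the bound."""
    n = len(cost)
    optimum, match = min_weight_matching(cost)
    if optimum >= bound:
        return None
    supported = set()
    for i in range(n):
        dist = distances_from(i, cost, match)
        for j in range(n):
            if cost[i][j] == INF:
                continue
            # cost(C_e) = c_ij + dist(X_i, x_j); matching edges use the empty cycle.
            cycle = 0 if match[i] == j else cost[i][j] + dist[j]
            if optimum + cycle < bound:
                supported.add((i, j))
    return supported
```

Every pair that is missing from the returned set corresponds to a value that can be removed from the respective domain. Replacing Bellman-Ford by Dijkstra's algorithm on the reduced costs would give the running time stated in the text.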

Interestingly, the idea of using reduced-cost shortest-path distances has been considered before to strengthen reduced-cost propagation [78]. For an experimental evaluation of this idea, we refer the reader to that paper. Now we have shown that this enhanced reduced-cost propagation is powerful enough to guarantee arc-consistency for the minimum weight all-different constraint.

The algorithm we introduced achieves arc-consistency in time $O(n(d + m \log m))$. At first sight this sounds optimal, because it is the same time that is needed by algorithms for the Weighted Bipartite Perfect Matching Problem such as the Hungarian method or the successive shortest path algorithm. However, two questions remain open: First, can we derive a cost-based filtering algorithm from the cost-scaling algorithm that gives the best known time bound for Assignment Problems that satisfy the similarity assumption? Second, can the above filtering method be implemented to run incrementally faster?

(a)
      A    B    C    D    E    F
 1    0    0    2    2    2    3
 2    1    0    2    2    2    3
 3    0    0    0    1    2    3
 4    0    0    0    0    2    3
 5    0    0    0    0    0    1
 6    ∞    ∞    ∞    ∞    ∞    0

(b)
      A    B    C    D    E    F
 1   -5   -6    0   -4    0   -3
 2   -1   -3    3   -1    3    0
 3   -7   -8   -4   -7   -2   -5
 4   -5   -6   -2   -6    0   -3
 5   -5   -6   -2   -6   -2   -5
 6    ∞    ∞    ∞    ∞    ∞   -6

(c) [Table with columns A to F and rows 1 to 6. Only the entries of existing edges are printed; their column positions are not recoverable. Row 1: 0 1 2. Row 2: 1 0 3. Row 3: 0 1 3. Row 4: 1 0 2. Row 5: 2 5 0. Row 6: ∞ 0.]

Fig. 2.10: (a) Shortest paths from nodes in $V_1$ to nodes in $V_2$ with respect to the reduced costs $\bar c$. (b) The same shortest paths using the original cost vector $c^M$. (c) The additional costs imposed by an assignment $X_i = x_j$. Bold numbers show those assignments that can be eliminated in the presence of a solution with value $B = 28$.


2.5 Knapsack Constraints

Based on reduction techniques for the Knapsack Problem (KP), we develop cost-based filtering routines for knapsack constraints. We present several algorithms using bounds of different quality. The method that we consider the most interesting in theory and practice is based on a bound proposed by Martello and Toth in [147]. By reusing information gained in an initial preprocessing step taking time $\Theta(n \log n)$, the actual reduction per choice point only requires linear time. We compare two of the new algorithms numerically with two other reduction algorithms that have been proposed earlier in the KP literature.

The work presented in this section was published in [61, 64, 65]. It is organized as follows: First, we motivate the development of an efficient cost-based filtering algorithm for knapsack constraints in Section 2.5.1. In Section 2.5.2, we present existing upper bounds and reduction techniques for the KP. Then, in Section 2.5.3, we develop algorithms for the quick propagation of knapsack constraints. An experimental evaluation of these algorithms, as well as a comparison with alternative approaches, is presented in Section 2.5.4. Finally, we discuss generalizations for knapsack related problems in Section 2.5.5.

2.5.1 Definition and Applications

Cost-based domain filtering algorithms for knapsack optimization constraints are relevant in various application areas. First of all, capacity constraints are the basic building blocks for linear programs. Therefore, knapsack constraints may very well be viewed as a standard modeling element when tackling problems in this large class of problems. Consider the following example:

Automatic Recording The Automatic Recording Problem (ARP) consists in finding an optimal selection of items, each of them associated with a weight, an interval, and a profit value, such that

– the total weight of the selection does not exceed a given capacity,

– all intervals associated with all selected items are pairwise non-overlapping, and

– the profit is maximized.

Thus, the problem consists of a knapsack constraint accompanied by an independent set constraint. It models the automatic selection of TV broadcasts for video recording. The independent set constraint ensures that only non-overlapping broadcasts can be recorded, whereas the knapsack constraint models the limited storage capacity of the recording device. The objective is to maximize the user's satisfaction. In Chapter 6, we develop an algorithm that solves the ARP by Lagrangian relaxation using the filtering algorithm presented in Section 2.5.3.1.


Quadratic Knapsack Problems Knapsack constraints can also be used profitably when tack-

ling the Quadratic Knapsack Problem (QKP). It calls for maximizing a quadratic boolean objec-

tive function subject to a linear capacity constraint. Filtering algorithms for KP are often used to

reduce the size of the given QKP [31]. Consider the relax and cut algorithm of Porto, de Moraes,

and Lucena [170] as an example. It computes bounds of the QKP by linearizing the problem to

KP, then tightening the problem by adding three families of valid inequalities, and finally solving

the resulting linear program (LP) by Lagrangian relaxation. To solve the Lagrangian dual, a se-

ries of KPs has to be solved in every search node. The authors stress that knapsack variable fixing

algorithms are vital ingredients of their approach. The algorithms proposed in Section 2.5.3 may

help to increase the overall performance.

In our last example, we show that knapsack constraints may also be relevant for the solution

of sub-problems when using decomposition techniques on eligible optimization problems:

CP-based Column Generation In Section 3.1, we develop a method called constraint programming based column generation. It implies that a constraint satisfaction problem is set up to generate columns in a column generation framework. When applying that approach to appropriate optimization problems, augmented Knapsack Problems emerge as sub-problems. As an example, consider the Constrained Cutting Stock Problem that is a Cutting Stock Problem with additional constraints on the cutting patterns. Using a column generation approach, the sub-problem is a Constrained Bounded Knapsack Problem: the length of the rolls determines the capacity for the cutting patterns, and the objective is used to search for columns with negative reduced costs only. Each cutting pattern has cost 1 since we try to minimize the number of rolls needed to cover the specified demand. Thus, the objective in the sub-problem is to minimize $1 - \pi^T X$ (i.e. to minimize the reduced costs of the cutting pattern), where $\pi$ is the vector of dual values corresponding to the current optimal solution of the continuous relaxation of the master problem. The KP objective then is to maximize $\pi^T X$ with an initial lower bound of 1. Additional constraints usually stem from real-world applications and may be non-linear. Some examples for real-world constraints are given in [38].

2.5.1.1 Constrained Knapsack Problems

The examples given above show that knapsack constraints are often accompanied by other constraints when modeling real-life problems. Therefore, we introduce the definition of Constrained Knapsack Problems (CKPs), which are Knapsack Problems with additional constraints, whereby the objective function and the capacity constraint have to be linear.

Definition 2.11 Let $C, n \in \mathbb{N}$, $w_1, \dots, w_n \in \mathbb{N}$; $p_1, \dots, p_n \in \mathbb{Q}$. $C$ is the capacity of the knapsack, $n$ the number of items, and $w_i$ the weight of item $i$ with profit $p_i$, $1 \le i \le n$. Moreover, let $w := (w_1, \dots, w_n)^T$, and $p := (p_1, \dots, p_n)^T$.

1. Let $\mathbb{B} := \{0, 1\}$, and $G := \{x \in \mathbb{B}^n \mid w^T x \le C\}$.

2. Let $k \in \mathbb{N}$, and $R := \{r_j : \mathbb{B}^n \to \mathbb{B} \mid 1 \le j \le k\}$. Every $r \in R$ is called a (knapsack) rule and $R$ is called a (knapsack) rule set.

3. Every $x \in G$ is called feasible (with respect to a given rule set $R$), iff $r(x) = 1$ for all $r \in R$. $F(R) := \{x \in G \mid x \text{ is feasible}\}$ is called the set of feasible constrained knapsacks (with respect to rule set $R$). To simplify the notation, we often write $F$ instead of $F(R)$ if $R$ is known from the context.

4. The Constrained Knapsack Problem is then to maximize $p^T x$ such that $x \in F$.

Note that, for the unconstrained KP, it holds that $F = G$. For such pure Knapsack Problems without additional constraints, the state-of-the-art solving techniques would focus on a so-called core problem, which may be extended during the optimization process [146, 167]. For these algorithms, it is not straightforward to see how the reduction algorithms we present in the following could be integrated efficiently.

However, algorithms tailored for the special case of pure KP are usually not able to solve general CKPs, because they do not allow additional constraints to be incorporated. One reason is that algorithms designed to solve pure KPs make certain assumptions that do not hold for CKPs. For example, it is not clear for the CKP that we can require the profits to be non-negative (as is the case for KP), because the strategy of omitting items with positive weight and negative profit [149] may not yield feasible solutions at all. Thus, in general a tree search will be necessary to solve CKPs, and cost-based domain filtering algorithms for knapsack constraints may help to improve the performance of such an approach.

For the remainder of this section, with identifiers $C$, $n$, $w$, $p$, $G$, $R$, and $F$ we refer to Definition 2.11. We will sometimes need to refer to reduced (C)KPs where an item $i \in \{1, \dots, n\}$ is either included or excluded in any feasible solution. We refer to those problems with (C)KP$[X_i = 1]$ or (C)KP$[X_i = 0]$.

In a canonical IP formulation of the Knapsack Problem, there is one variable $X_i$ for each item $i \in \{1, \dots, n\}$. The domain of each variable is defined as $D_i := \{0, 1\}$. Furthermore, the capacity constraint is modeled by a function $\omega : \{0,1\}^n \to \{0,1\}$ with $\omega(X_1, \dots, X_n) = 1$ iff $w^T X \le C$. Finally, the objective function is $Z : \{0,1\}^n \to \mathbb{Q}$ with $Z(X_1, \dots, X_n) := p^T X$.

Definition 2.12 Given any lower bound $B \ge 0$, we call the maximization constraint $\vartheta_\omega$ (with objective $Z$ and bound $B$) knapsack constraint.

Items of a CKP fall into either one of the following classes:

– items $i$ that can be excluded from further investigation as they cannot be part of any improving solution, i.e.

$\max\{p^T x \mid w^T x \le C,\ x \in F,\ x_i = 1\} \le B$ (2.2)

– items $i$ that can be included into the knapsack as they must be part of any improving solution, i.e.

$\max\{p^T x \mid w^T x \le C,\ x \in F,\ x_i = 0\} \le B$ (2.3)

– items that cannot be decided at the moment.

A filtering algorithm that achieves (hyper-)arc-consistency for the knapsack constraint has to include and to remove items that do not fall into the last class. Since showing that either (2.2) or (2.3) holds for an item $i$ (i.e. to check the arc-consistency of $\vartheta_\omega$) generally requires to solve a KP itself, complete propagation here is an NP-hard task. One way to cope with the situation is to develop pseudo-polynomial filtering algorithms. For example, in [209] a reduction algorithm for subset-sum knapsack constraints is developed that has pseudo-polynomial run-time.

We propose another way by checking if the inequality holds for an upper bound $U$ on CKP$[X_i = b]$, $b = 0$ or $b = 1$, i.e., we check whether the bound $U$ for the problem with $X_i$ fixed to $b$ is at most $B$. Then we write $U(\mathrm{CKP}[X_i = b]) \le B$.⁶

2.5.2 Knapsack Relaxations

2.5.2.1 Upper Bounds for Knapsack Problems

The effectiveness of a domain filtering algorithm that achieves relaxed consistency is determined by the relaxation quality, i.e. the bounds used for cost-based filtering. Following the presentation given in chapter 2 of [149], we present some upper bounds that were originally developed for the maximization problem KP. They also apply to the CKP by relaxing it to a KP first. Obviously, ignoring all additional constraints often does not yield tight bounds on the objective. However, if the additional constraints satisfy certain properties, they can be incorporated in the objective function of a pure KP using Lagrangian relaxation. For additional linear constraints, there are ways to do this effectively (see [80, 188] and Chapter 3). Notice that dropping all additional constraints allows us to set $X_i := 0$ for all $i$ with $p_i \le 0$ and $1 \le i \le n$. We therefore require all items to have positive profits.

⁶ To improve the readability, here and in the following we write CKP or KP instead of $M_1 \times \dots \times M_n$, and identify CKP$[X_i = b]$ as well as KP$[X_i = b]$ with $M_1 \times \dots \times \{b\} \times \dots \times M_n$, where $\{b\}$ is the $i$-th factor.

Fig. 2.11: The width of each element is proportional to its weight. The elements are ordered with respect to the efficiencies $p_i / w_i$. The leftmost element has the biggest efficiency, and the rightmost the smallest one. $s$ marks the critical item in $U_1$.

Without loss of generality, we may assume that the items are ordered according to decreasing efficiency, i.e. $p_1 / w_1 \ge \dots \ge p_n / w_n$. We define the critical item $s$ of a Knapsack Problem as the first item that overloads the knapsack, that is $s = \min\{j \mid \sum_{i=1}^{j} w_i > C\}$ (we omit the trivial case here where no such $s$ exists). Dantzig [50] showed that the linear relaxation of the 0-1 knapsack has the optimal value $\sum_{j=1}^{s-1} p_j + \bar c\, p_s / w_s$, where $\bar c$ is defined as the remaining capacity of the knapsack after filling in the first $s - 1$ items: $\bar c = C - \sum_{j=1}^{s-1} w_j$.
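The critical item and Dantzig's bound can be computed in one pass over the efficiency ordering. The sketch below assumes integer profits and weights and items that are already sorted by non-increasing efficiency; the function name and the 0-based positions are choices made for this illustration only.

```python
from fractions import Fraction


def dantzig_bound(profits, weights, capacity):
    """Return (critical position, bound) for items sorted by non-increasing p/w.

    The critical position is the 0-based index of the first item that
    overloads the knapsack, or None if all items fit.
    """
    packed_profit = 0
    remaining = capacity
    for s, (p, w) in enumerate(zip(profits, weights)):
        if w > remaining:
            # Item s is critical: add the fraction of it that still fits.
            return s, packed_profit + Fraction(remaining * p, w)
        packed_profit += p
        remaining -= w
    return None, Fraction(packed_profit)
```

For profits (60, 100, 120), weights (10, 20, 30) and capacity 50, the first two items are packed completely, the third item is critical, and two thirds of it fill the remaining capacity of 20.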

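Let $\emptyset \ne M_i \subseteq \{0, 1\}$. Let $l_i := \min M_i$ denote the minimum, and $r_i := \max M_i$ denote the maximum of $M_i$, $1 \le i \le n$. The first upper bound on KP is defined as $U_1 : (2^{\{0,1\}})^n \to \mathbb{Q}$ with

$U_1(M_1, \dots, M_n) := \max\{p^T x \mid w^T x \le C,\ l_i \le x_i \le r_i\ (1 \le i \le n),\ x \in \mathbb{Q}^n\}$.

It holds that,

$U_1(\mathrm{KP}) = \sum_{j=1}^{s-1} p_j + \bar c\, \frac{p_s}{w_s}$ (2.4)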

A second bound $U_2$ was introduced by Martello and Toth in [147]. It imposes the integrality of the critical item $s$. Either item $s$ belongs to the optimal solution (leading to a value $U^1$) or not (leading to a value $U^0$):

$U^0 := \sum_{j=1}^{s-1} p_j + \bar c\, \frac{p_{s+1}}{w_{s+1}}$ (2.5)

$U^1 := \sum_{j=1}^{s-1} p_j + p_s - (w_s - \bar c)\, \frac{p_{s-1}}{w_{s-1}}$ (2.6)

Defining $U_2$ as the maximum of $U^0$ and $U^1$ results in a bound dominating $U_1$. Formally, let $\emptyset \ne M_i \subseteq \{0, 1\}$, and let $s$ denote the critical item with respect to necessarily included and excluded items implicitly defined by the $M_i$. We set $U_2 : (2^{\{0,1\}})^n \to \mathbb{Q} \cup \{-\infty\}$, where $U_2(M_1, \dots, M_n) := -\infty$ if the necessarily included items alone exceed the capacity, and otherwise $U_2(M_1, \dots, M_n)$ is the maximum of the two values (2.5) and (2.6), taken with respect to the items that are not fixed by the $M_i$ and including the profits $p_i$ of the necessarily included items.

Fig. 2.12: $U_3$ requires the integrality of item $s$. The figures show $U_1(\mathrm{KP}[X_s = 0])$, and $U_1(\mathrm{KP}[X_s = 1])$.

It holds that,

$U_2(\mathrm{KP}) = \max\{U^0, U^1\}$ (2.7)
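As an illustration of the case distinction on the critical item, the following sketch evaluates both cases and returns their maximum. It assumes integer data, items sorted by non-increasing efficiency, and no items fixed in advance; no rounding is applied. The handling of the boundary cases (no successor or no predecessor of the critical item) is an assumption of this sketch.

```python
from fractions import Fraction


def martello_toth_bound(profits, weights, capacity):
    """Bound U2 for items sorted by non-increasing efficiency p/w (integer data)."""
    n = len(profits)
    packed, remaining, s = 0, capacity, 0
    while s < n and weights[s] <= remaining:
        packed += profits[s]
        remaining -= weights[s]
        s += 1
    if s == n:
        return Fraction(packed)  # no critical item: everything fits
    # Case X_s = 0: fill the residual capacity at the efficiency of item s + 1.
    u0 = Fraction(packed)
    if s + 1 < n:
        u0 += Fraction(remaining * profits[s + 1], weights[s + 1])
    # Case X_s = 1: make room for item s at the efficiency of item s - 1.
    if s == 0:
        return u0  # item s alone exceeds the capacity, so only X_s = 0 remains
    u1 = packed + profits[s] - Fraction(
        (weights[s] - remaining) * profits[s - 1], weights[s - 1]
    )
    return max(u0, u1)
```

On the three-item instance with profits (60, 100, 120), weights (10, 20, 30) and capacity 50, this yields 230, compared to 240 for the linear relaxation bound and an optimal value of 220.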

Instead of estimating the loss caused by the integrality of item $s$ using the efficiency of the neighboring items of $s$, an even tighter bound can be obtained by calculating bounds $U_1$ on KP$[X_s = 0]$, and KP$[X_s = 1]$ [68, 114, 212]. Let $U^0 := U_1(\mathrm{KP}[X_s = 0])$, and $U^1 := U_1(\mathrm{KP}[X_s = 1])$. Then $U_3 := \max\{U^0, U^1\}$ dominates $U_1$ and $U_2$. An even tighter bound could be obtained by using $U_2$ instead of $U_1$ in the definition of $U^0$ and $U^1$ and so on.
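The bound obtained by branching on the critical item and applying the linear relaxation bound to both sub-problems can be sketched as follows. The sketch assumes integer data and a list of (profit, weight) pairs sorted by non-increasing efficiency; the function names are hypothetical.

```python
from fractions import Fraction


def dantzig(items, capacity):
    """Linear relaxation bound for (profit, weight) pairs sorted by p/w."""
    total = Fraction(0)
    for p, w in items:
        if w > capacity:
            return total + Fraction(capacity * p, w)
        total += p
        capacity -= w
    return total


def bound_u3(items, capacity):
    """Maximum of the bounds on KP[X_s = 0] and KP[X_s = 1], s the critical item."""
    used = 0
    for s, (p, w) in enumerate(items):
        if used + w > capacity:
            break
        used += w
    else:
        return dantzig(items, capacity)  # no critical item exists
    rest = items[:s] + items[s + 1:]
    p_s, w_s = items[s]
    without_s = dantzig(rest, capacity)
    if w_s > capacity:
        return without_s  # item s can never be packed
    with_s = p_s + dantzig(rest, capacity - w_s)
    return max(without_s, with_s)
```

Each call evaluates the linear relaxation twice, once with the critical item removed and once with the critical item packed and the capacity reduced accordingly.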

Figures 2.11 and 2.12 give graphical interpretations of the bounds $U_1$ and $U_3$. Obviously, all three bounds $U_1$ to $U_3$ can be computed in time $O(n)$ after a preprocessing step of sorting the items according to decreasing efficiencies. This requires time $\Theta(n \log n)$. Balas and Zemel [10] developed an algorithm for the calculation of $s$ using linear time without any preprocessing. However, for the reduction algorithm that we present in the following (just as in former reduction algorithms for the KP) the efficiency ordering is needed anyway. On top of that, we use an ordering of the items with respect to increasing weights.

In a tree search, both orderings can be calculated in an initial preprocessing step. After that, they can be reused in every search node. Within a column generation context, the weight ordering only has to be calculated once, but the efficiency ordering has to be re-computed every time new dual values of the master problem lead to a change of the objective in the successive CKPs.

2.5.2.2 Reduction Techniques for Knapsack Problems

A first reduction algorithm for KPs based on upper bound $U_1$ has been proposed by Ingargiola and Korsh [122]. In a loop over all items $i = 1, \dots, n$, the algorithm determines $U_1(\mathrm{KP}[X_i = b])$, $b \in \{0, 1\}$. Since each bound calculation takes linear time, the worst case complexity of this algorithm is $\Theta(n^2)$.

If bound $U_2$ is used instead of $U_1$, more effective filtering can be achieved in the same asymptotic running time. Martello and Toth [148] showed that the running time can be reduced to $O(n \log n)$ while keeping the solution quality of bound $U_2$. The key idea of their algorithm is to compute the critical item $s$ by binary search. We refer to the methods of Ingargiola and Korsh, and Martello and Toth as IKR, and MTR, respectively.

Dembo and Hammer [53] proposed a reduction algorithm (DHR) that runs in linear time $\Theta(n)$. They calculate the critical item $s$ only once for the original problem. Within a loop they estimate the loss when removing/including item $i = 1, \dots, n$ by extrapolating the efficiency of item $s$, which allows this step to be performed in constant time. As this extrapolation is less accurate than $U_1$, their method is not as effective as IKR or MTR.
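The quadratic loop attributed to Ingargiola and Korsh above can be sketched as follows: for every item, the linear relaxation bound is recomputed from scratch for the problem without the item and for the problem with the item packed. This is a minimal illustration under assumptions (integer data, items sorted by non-increasing efficiency, hypothetical function names), not the original authors' code.

```python
from fractions import Fraction


def dantzig(items, capacity):
    """Linear relaxation bound for (profit, weight) pairs sorted by p/w."""
    total = Fraction(0)
    for p, w in items:
        if w > capacity:
            return total + Fraction(capacity * p, w)
        total += p
        capacity -= w
    return total


def ikr_reduce(items, capacity, lower_bound):
    """Return {index: forced value} with respect to solutions better than lower_bound."""
    fixed = {}
    for i, (p, w) in enumerate(items):
        rest = items[:i] + items[i + 1:]
        if dantzig(rest, capacity) <= lower_bound:
            # No improving solution without item i: it must be included.
            fixed[i] = 1
        elif w > capacity or p + dantzig(rest, capacity - w) <= lower_bound:
            # No improving solution with item i: it can be removed.
            fixed[i] = 0
    return fixed
```

Each of the $n$ iterations performs up to two linear-time bound computations, which gives the quadratic worst case mentioned above. If both tests succeed for some item, no improving solution exists at all, and the search can backtrack.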

Though having been developed more than a decade ago, the methods DHR and MTR are still

vital ingredients in state-of-the-art solvers for the pure KP and the QKP [167, 168, 170].

The algorithm we present in the following cannot improve the running time of reduction techniques based on the more efficient bounds $U_1$, $U_2$ if the reduction algorithm is only called once. For such an application, the new method presented and the one developed by Martello and Toth both require the same asymptotic running time in $\Theta(n \log n)$.

The situation changes, however, if a reduction method is called many times for similar knapsack instances, as is the case when applying a tree search: in every search node, we try to prune the search or at least to tighten the problem formulation by applying domain filtering. When using unary branching constraints, the subsequent instances only differ with respect to the sets of variables that have already been fixed. As we will see, such a situation allows us to hide parts of the work in a preprocessing step that takes time $\Theta(n \log n)$. Provided with the information gathered in that preprocessing, every call to the reduction routine requires linear time only.

2.5.3 Cost-based Filtering for Knapsack Constraints

2.5.3.1 A Fast Propagation Algorithm based on Bound U1 and U2

Now, we show how the running time of IKR and MTR can be reduced to $\Theta(n)$ by making use of information generated in a preprocessing step requiring time $\Theta(n \log n)$. The bounds obtained are of the same quality as in the original algorithms. Again, let KP$[X_j = b]$ denote $M_1 \times \dots \times \{b\} \times \dots \times M_n$, and let $s^b_j$ denote the critical item of KP$[X_j = b]$, i.e. the first item in the efficiency ordering that overloads the knapsack once $X_j$ is fixed to $b$. The key idea of the routine is to calculate the bounds of the reduced problems $U(\mathrm{KP}[X_j = 0])$ in an order of increasing weight of the items $j$. Thereby, we obtain a sequence of critical items that is monotonically increasing. Thus, the critical item and the upper bound for the $j$-th item (with respect to the weight ordering) can be transformed into the critical item and upper bound for the $(j+1)$-th item by starting the calculation of the new critical item at the position of the previous one.

The time consuming step in reduction algorithms using bound $U_1$, $U_2$ is to determine the critical items $s^b_i$, $i = 1, \dots, n$, and $b \in \{0, 1\}$. Once these values are known, the calculation of the upper bounds and the reduction itself only require linear time. (In fact, in the following algorithm the bounds can be computed at the same time as the critical items. To clarify the argumentation, however, we just show how to calculate the latter.)⁷

Although calculating $s^b_i$ for each single $i \in \{1, \dots, n\}$, $b \in \{0, 1\}$ generally takes linear time, the calculation of all these values also only requires time $\Theta(n)$ once we know an ordering $\sigma = (\sigma_1, \dots, \sigma_n)$ of the items according to their weight, i.e. $w_{\sigma_i} \le w_{\sigma_j}$ iff $i \le j$. The efficiency ordering of the items as well as the permutation $\sigma$ can be obtained in a sorting step prior to any reduction and requiring time $\Theta(n \log n)$.

Given the critical item $s$ of KP, we know that $U(\mathrm{KP}[X_i = 1]) = U(\mathrm{KP})$ for all $i < s$, and $U(\mathrm{KP}[X_i = 0]) = U(\mathrm{KP})$ for all $i > s$. Thus, we only need to calculate the arrays $S^1 := (s^1_i)_{i > s}$, and $S^0 := (s^0_i)_{i < s}$. We describe how to determine $S^0$ in the following. The calculation of $S^1$ is done analogously.

We iterate over all items $i < s$ in increasing order of weight. That way, we can be sure that the critical item $s^0_i$ increases monotonically with growing weight $w_i$. Thus, we can start the search for the next critical item at the position of the last one.

The following bookkeeping argument shows that this procedure only takes linear time. We estimate the computational effort of the reduction algorithm by assigning a unit cost (say, 1 €) to the items causing it:

– Every item $j \ge s$ that is being passed is charged 1 €. By "passed" we mean that the item is being included entirely when iterating from one critical item to the other.

– Every item is charged 1 € each time it is being included fractionally.

The first group of items causes at most $n$ € costs as the critical items are monotonically increasing: every item is being passed at most once. It remains to calculate the effort for all items that are being included fractionally. Obviously, there are at most as many fractionally included items as critical items. Therefore, this group of items also costs not more than $n$ €. Thus, the costs for the entire computation are in $O(n)$.

Finally, the calculation of $s^0_s$ can be performed in time that is linear in the number of items as well. Another possibility to calculate this value is to insert item $s$ at the position corresponding to $\bar c$ in the weight ordering of items and to calculate $s^0_s$ just like the critical items for the exclusion of the other items.
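The monotone sweep for the exclusion of items in front of the critical item can be sketched as follows. The sketch computes the linear relaxation bounds together with the critical items, assumes integer data, and takes the weight ordering as a precomputed argument (`by_weight`, a name introduced here); the included `dantzig` helper only serves as a reference for checking the results.

```python
from fractions import Fraction


def dantzig(items, capacity):
    """Reference: linear relaxation bound for (profit, weight) pairs sorted by p/w."""
    total = Fraction(0)
    for p, w in items:
        if w > capacity:
            return total + Fraction(capacity * p, w)
        total += p
        capacity -= w
    return total


def exclusion_bounds(items, capacity, by_weight):
    """Bounds on KP[X_i = 0] for all items i in front of the critical item s.

    items: (profit, weight) pairs sorted by non-increasing efficiency;
    by_weight: item indices sorted by non-decreasing weight (computed once).
    """
    n = len(items)
    s, used, gained = 0, 0, 0
    while s < n and used + items[s][1] <= capacity:
        gained += items[s][0]
        used += items[s][1]
        s += 1
    bounds = {}
    t, extra_w, extra_p = s, 0, 0  # items s..t-1 are packed in addition
    for i in by_weight:
        if i >= s:
            continue
        room = capacity - used + items[i][1]  # free capacity once i is removed
        while t < n and extra_w + items[t][1] <= room:
            extra_p += items[t][0]  # the pointer t only ever moves to the right
            extra_w += items[t][1]
            t += 1
        frac = Fraction((room - extra_w) * items[t][0], items[t][1]) if t < n else 0
        bounds[i] = gained - items[i][0] + extra_p + frac
    return s, bounds
```

Because the free capacity grows with the weight of the excluded item, the pointer `t` never has to move back, so the loop over all items takes linear time in total once `by_weight` is available.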

⁷ Note that, by omitting the fractional parts, it is also possible to calculate lower bounds for the pure KP. For the general CKP, the necessary feasibility checking with respect to additional constraints makes the generation of lower bounds more complicated. Thus, more elaborate and problem dependent primal heuristics have to be developed here. In any case, reduction should only take place after all lower bounds have been calculated [148].

Fig. 2.13: The figure illustrates the process of the reduction algorithm presented for KP$[X_i = 0]$. The weight ordering in which the items are tested ensures that the critical item moves monotonically to the right.

Obviously, the above algorithm can be applied with bounds $U_1$ and $U_2$. As a consequence, we have shown the following

Theorem 2.13 After a $\Theta(n \log n)$ preprocessing step, relaxed $U_2$-consistency for a knapsack constraint can be obtained in time $O(n)$ per choice point.

It is easy to see that, for a constant number of choice points, MTR and the algorithm given above need the same running time of $\Theta(n \log n)$. If $\Omega(\log n)$ choice points have to be investigated, however, the time spent in the preprocessing is dominated by the accumulated time needed in the choice points. In that case, Theorem 2.13 implies

Corollary 2.5 If propagation is triggered in $\Omega(\log n)$ search nodes, relaxed $U_2$-consistency for a knapsack constraint can be obtained in amortized time $O(n)$ per choice point.
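The amortization behind this statement can be written out in one line. In the following sketch, $k$ (the number of choice points in which propagation is triggered) and $T(k)$ (the total time spent on preprocessing and filtering) are symbols introduced here for illustration.

```latex
% Assumption: one preprocessing step, then k filtering calls of linear time each.
\begin{align*}
T(k) &= \Theta(n \log n) + k \cdot O(n), \\
\frac{T(k)}{k} &= O\!\left(\frac{n \log n}{k} + n\right) = O(n)
\qquad \text{whenever } k = \Omega(\log n).
\end{align*}
```

For a constant $k$, the first term dominates and the amortized cost per choice point stays at $\Theta(n \log n)$.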

Thus, in a typical search tree with $\Omega(\log n)$ search nodes, the method presented here is asymptotically optimal and superior to the algorithms proposed before.

2.5.3.2 More Effective Cost-based Filtering using Bound U3

To strengthen the filtering abilities of the optimization constraint, we can also use the stronger bound $U_3$: $U_3(\mathrm{KP})$ is obtained by calculating bound $U_1$ on KP$[X_s = 0]$, and KP$[X_s = 1]$. When we want to use that bound for cost-based domain filtering, we need to compute $s^{b_1}_{i,b}$, $b_1 \in \{0, 1\}$, the critical items of those restricted KPs if, additionally, $X_i = b$: Let $1 \le i \le n$, $b \in \{0, 1\}$. Then $s^0_{i,b}$ denotes the critical item of KP$[X_i = b, X_{s^b_i} = 0]$, and $s^1_{i,b}$ the critical item of KP$[X_i = b, X_{s^b_i} = 1]$.

To compute these values efficiently, first we determine the values $s^b_i$ using the algorithm in Section 2.5.3.1. Then we apply a binary search to determine $s^0_{i,b}$ and $s^1_{i,b}$ for all $1 \le i \le n$. This leads to a running time of $\Theta(n \log n)$. A similar idea has been introduced in [148].

Corollary 2.6 With the previous procedure, relaxed $U_3$-consistency for a knapsack constraint can be obtained in time $O(n \log n)$ per choice point.

For real-life instances, using a binary search to determine the critical item of KP$[X_i = b, X_{s^b_i} = b_1]$ for $b_1 \in \{0, 1\}$ usually does not pay off as it is likely to be "close" to $s$. Thus, we consider this result to be of theoretical interest only. However, the algorithm above leads to another filtering algorithm that is asymptotically as efficient as the one presented in Section 2.5.3.1 (that runs in amortized linear time), but that is even more effective. In fact, the bound it uses to perform cost-based filtering is at least as good as $U_2$, but for some items it is even $U_3$:

Let $1 \le i \le n$, $b \in \{0, 1\}$, let $s$ denote the critical item of KP, and let $s^0_i$ and $s^1_i$ denote the critical items of KP$[X_i = b, X_s = 0]$ and KP$[X_i = b, X_s = 1]$, respectively. In contrast to the sequence of critical items that is computed for $U_3$, the second variable $X_s$ that is being fixed remains the same for all $s^0_i$, and $s^1_i$. Again, by using the algorithm in Section 2.5.3.1, we determine $U_2(\mathrm{KP}[X_s = 0, X_i = b])$, $i = 1, \dots, n$, and then $U_2(\mathrm{KP}[X_s = 1, X_i = b])$, $i = 1, \dots, n$. For any given $1 \le i \le n$, we check whether $\max\{U_2(\mathrm{KP}[X_s = 0, X_i = b]),\ U_2(\mathrm{KP}[X_s = 1, X_i = b])\} \le B$. If so, we fix the value of $X_i$ to $1 - b$.

It is easy to see that the bound calculated is at least as good as $U_2$. For items $i < s$ whose exclusion leaves the critical item at $s$, and for items $i > s$ whose inclusion leaves the critical item at $s$, however, domain filtering is just as effective as for bound $U_3$. Hence, we achieve an amortized linear time algorithm based on a 'mix' of $U_2$ and $U_3$ bounds.

2.5.4 Experiments

After having analyzed the new algorithms theoretically, now we compare them numerically

with different methods that were derived from KP reduction techniques presented in the lit-

erature. All experiments were run on a SUN Enterprise 450 Model 4300 (296 MHz) with

1 GB RAM, under Solaris 2.6. The reduction algorithms were implemented in C++ on top

of ILOG SOLVER 5.0 [121].

2.5.4.1 Test Environment

To show the potential of the new propagation algorithms, and to avoid cross-talking with other constraints, we decided to base the experiments on pure Knapsack Problems only. That way, we get a clear view on the performance of each filtering algorithm without the disturbing interferences that can easily arise when using more complex settings that incorporate additional constraints. For an example of a combination of the algorithms presented here and a shortest path constraint, we refer the reader to Chapter 6. Likewise, we omit specially tailored tree search or branching strategies for pure KPs. Instead, we used the default settings of the underlying CP library.

A word of caution is necessary here: even though our experiments are based on pure KP

data, the filtering algorithms we developed are not suited for state-of-the-art KP solvers. Also,

we do not claim that the solvers we implemented are competitive with the best KP solvers (see

Section 2.5.1.1). Our focus here is clearly on Constrained Knapsack Problems.

A weak propagation algorithm, if started from scratch, will obviously have to visit more

choice points to find an optimal or near-optimal solution of the problem than a good one. Therefore, to make the comparison fair, we initialize the lower bound with the optimal objective value and just measure the time and the number of choice points that each approach takes to prove optimality.

The generator code of David Pisinger [167] was used to produce random instances of two different classes of Knapsack Problems where the weights $w_j$ are randomly distributed in [1,1000], and the profits $p_j$ are chosen as given below:

– uncorrelated: $p_j$ randomly distributed in [1,1000],

– weakly correlated: $p_j$ randomly distributed in $[w_j - 100,\, w_j + 100]$ (hence $p_j \le 1100$).



In all cases, the knapsack capacity is chosen as $C = \frac{1}{2} \sum_{j=1}^{n} w_j$. The problem sizes range from

10 to 20000 items, and 100 Knapsack Problems were generated for each size and class.
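The two instance classes could be produced along the following lines. This is only a minimal sketch and not Pisinger's generator [167]: the function name `generate_instance`, the seed handling, the clamping of weakly correlated profits to at least 1, and the rounding down of the capacity are assumptions made for illustration.

```python
import random

def generate_instance(n, kind, r=1000, seed=None):
    """Return (weights, profits, capacity) for a random 0-1 knapsack instance.

    Hypothetical helper: weights are uniform in [1, r]; 'uncorrelated'
    profits are uniform in [1, r]; 'weakly correlated' profits are uniform
    in [w - r/10, w + r/10] and clamped to be at least 1 (an assumption).
    """
    rng = random.Random(seed)
    weights = [rng.randint(1, r) for _ in range(n)]
    if kind == "uncorrelated":
        profits = [rng.randint(1, r) for _ in range(n)]
    elif kind == "weakly correlated":
        delta = r // 10
        profits = [max(1, rng.randint(w - delta, w + delta)) for w in weights]
    else:
        raise ValueError(f"unknown instance class: {kind}")
    capacity = sum(weights) // 2  # half of the total weight, rounded down
    return weights, profits, capacity
```

Fixing the seed makes a test set reproducible, which matters when several filtering algorithms are compared on the same 100 instances per size.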

We omit the classes of strongly correlated data ($p_j = w_j + 10$) and subset-sum data ($p_j = w_j$).

It is known that the bounds described in Section 2.5.2.1 are not suited for these classes (which is easy to see as $p_k / w_k \approx 1$ for all $k$). For them, bounds based on cardinality constraints have been shown

to be effective [146, 150]. In the application area that we focus on (see Section 2.5.1), however,

it is justified to assume that the evolving KPs are more likely to fall into one of the classes we

used for our tests.

2.5.4.2 The Opponents

The algorithms referred to as linU1 and linU2 are based on the amortized linear time reduction method described in Section 2.5.3.1, and use bounds U1 and U2, respectively. Methods DHR and MTR have been described in Section 2.5.2.2. We implemented all algorithms in the same

CP environment. Table 2.2 summarizes the major characteristics for the candidates used in the

experiments. All methods need O





memory for the propagation stack and for the different

orderings used. Within a choice point, only O





additional memory is required.

Notice that, in our experiments, we do not evaluate the filtering algorithm based on a mixture of bounds U2 and U3 that was sketched in Section 2.5.3.2. The propagation algorithm based on this

Name    see             Bound                pre-proc. time   time per node
DHR     Sect. 2.5.2.2   Dembo/Hammer bound   –                Θ(n)
MTR     Sect. 2.5.2.2   U2                   Θ(n log n)       Θ(n log n)
linU1   Sect. 2.5.3.1   U1                   Θ(n log n)       Θ(n) amortized
linU2   Sect. 2.5.3.1   U2                   Θ(n log n)       Θ(n) amortized

Tab. 2.2: Characteristics of the four algorithms used in the experiments.

mixed bound visits only slightly fewer choice points than linU2, but requires more computation

time. Recall from Section 2.5.3 that the work that has to be done to perform domain filtering

using bound U2 is almost the same as using bound U1. When using the mixed bound, however,

the workload is twice as much as that for bound U1.

As we will show in this section, we are facing a trade-off between the time needed per choice

point and the reduction of choice points that can be achieved by using tighter bounds. Within

the test environment that we have chosen for our experiments, a slight reduction of choice points

does not justify a much higher effort undertaken in every choice point. Therefore, the filtering

algorithm based on the mixed bound is of interest only in the context of more complex CKPs

incorporating additional and possibly hard side constraints that would make even small reduc-

tions of choice points more favorable. However, in the KP setting that we consider here, to avoid

cross-talking with additional constraints and to evaluate the pure performance of the different

propagation algorithms, the algorithm developed in Section 2.5.3.2 is not competitive.

2.5.4.3 Numerical Results

The simple approach for solving a CKP in a CP context would be to introduce a sum-constraint (i.e. $\sum_j w_j X_j \le C$) plus a constraint stating that we are only looking for improving solutions (i.e. $\sum_j p_j X_j > B$). However, as shown in Table 2.3, that approach cannot compete at all with the other

propagation methods. Both the number of choice points and the CPU time grow exponentially

when the problem size increases. A dash means that the average calculation for a test instance

takes more than two hours. For both classes, only small problems with not more than 40 items

can be solved within that time limit. The poor performance of the pure CP approach shows the

need for sophisticated filtering techniques when knapsack constraints occur in a CP model. As

will be shown in the following, more elaborate techniques are able to tackle problems of several

1000 items in a few seconds, generating only relatively few choice points.
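To illustrate why the pure CP approach scales so badly, the following sketch counts the choice points of a depth-first search that checks nothing but the two sum constraints. It is an illustrative stand-in, not the ILOG SOLVER model used in the experiments; the function name `count_choice_points` and the static branching order (item by item, value 1 first) are assumptions.

```python
def count_choice_points(weights, profits, capacity, lower_bound):
    """Count the branching nodes of a depth-first search that only checks
    the weight sum (<= capacity) and the profit sum (> lower_bound)."""
    n = len(weights)
    # remaining[j] = total profit of items j..n-1, used for the profit check
    remaining = [0] * (n + 1)
    for j in range(n - 1, -1, -1):
        remaining[j] = remaining[j + 1] + profits[j]

    choice_points = 0

    def search(j, weight, profit):
        nonlocal choice_points
        if weight > capacity:                      # weight sum violated
            return
        if profit + remaining[j] <= lower_bound:   # no improving solution left
            return
        if j == n:                                 # improving solution found
            return
        choice_points += 1
        search(j + 1, weight + weights[j], profit + profits[j])  # X_j = 1
        search(j + 1, weight, profit)                            # X_j = 0

    search(0, 0, 0)
    return choice_points
```

When `lower_bound` is set to the optimal objective value, as in the experimental setup described above, the search visits exactly the nodes needed to prove optimality with this weak form of pruning.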

Small Instances Tables 2.4 and 2.5 show the average results of 100 different instances of the

same data size n. We present the running time in seconds, and the number of choice points cp

that the method visits. Table 2.6 shows a comparison of the different methods regarding the time

per choice point for uncorrelated and weakly correlated data.


Size    uncorrelated              weakly correlated
n       cp            time        cp            time
10      37.77         0.01        73.74         0.01
20      1455.80       0.16        28736.07      2.91
30      141338.82     15.50       16771406.92   1641.94
40      10311820.44   1410.07     -             -

Tab. 2.3: The pure CP approach for both problem classes. cp is the average number of choice points, time the average time in seconds for 100 instances of the given size.

Size    DHR             linU1           linU2           MTR
n       cp      time    cp      time    cp      time    cp      time
10      2.43    0.00    0.87    0.00    0.67    0.00    0.67    0.00
20      5.47    0.00    2.68    0.00    2.35    0.00    2.35    0.00
40      7.20    0.00    3.61    0.00    3.22    0.00    3.22    0.00
60      10.18   0.00    6.07    0.00    5.26    0.00    5.26    0.00
80      13.96   0.01    8.43    0.00    7.04    0.00    7.04    0.00
100     14.21   0.01    8.20    0.00    6.75    0.00    6.75    0.00
200     24.85   0.02    17.16   0.02    14.47   0.01    14.47   0.01
300     32.47   0.04    22.57   0.03    18.76   0.02    18.76   0.02
400     38.19   0.05    27.69   0.04    23.28   0.04    23.28   0.04
500     46.50   0.08    33.64   0.06    28.68   0.05    28.68   0.05
600     63.61   0.11    48.67   0.09    40.95   0.08    40.95   0.08
700     54.67   0.11    41.16   0.09    34.53   0.08    34.53   0.08
800     69.92   0.16    51.76   0.13    42.38   0.11    42.38   0.11
900     68.89   0.17    51.76   0.14    42.35   0.13    42.35   0.12
1000    97.83   0.26    72.38   0.21    59.73   0.17    59.73   0.18

Tab. 2.4: Uncorrelated data instances. We give the average numbers for 100 test sets per size. time is the time in seconds, cp the number of choice points.


Size    DHR              linU1            linU2            MTR
n       cp       time    cp       time    cp       time    cp       time
10      10.42    0.00    6.31     0.00    5.42     0.00    5.42     0.00
20      20.41    0.00    13.82    0.00    11.35    0.00    11.35    0.00
40      33.26    0.01    23.42    0.01    19.87    0.01    19.87    0.00
60      37.69    0.01    26.69    0.01    22.52    0.01    22.52    0.01
80      56.07    0.02    40.10    0.01    33.21    0.01    33.21    0.01
100     61.60    0.02    45.49    0.02    37.94    0.02    37.94    0.02
200     103.85   0.06    77.05    0.05    64.33    0.05    64.33    0.04
300     162.20   0.13    123.11   0.11    99.67    0.10    99.67    0.09
400     202.23   0.21    151.50   0.17    118.71   0.15    118.71   0.14
500     226.36   0.29    161.80   0.23    122.57   0.19    122.57   0.18
600     286.40   0.42    207.56   0.33    158.92   0.27    158.92   0.26
700     345.28   0.58    252.25   0.45    185.42   0.36    185.42   0.35
800     314.00   0.61    214.64   0.44    151.34   0.34    151.34   0.33
900     428.16   0.89    300.34   0.67    210.06   0.51    210.06   0.49
1000    451.74   1.04    313.50   0.78    220.33   0.60    220.33   0.57

Tab. 2.5: Weakly correlated data instances. We give the average numbers for 100 test sets per size. time is the time in seconds, cp the number of choice points.

Size    Type           DHR       linU1     linU2     MTR
n                      time/cp   time/cp   time/cp   time/cp
500     uncorrelated   1.72      1.78      1.74      1.74
500     correlated     1.28      1.42      1.55      1.47
1000    uncorrelated   2.66      2.90      2.85      3.01
1000    correlated     2.30      2.49      2.72      2.59

Tab. 2.6: Uncorrelated and weakly correlated data instances. We give the average time per choice point in milliseconds for 100 test sets per size.


Size n   linU2 (time per cp)   MTR (time per cp)
500      1.74                  1.74
1000     2.85                  3.01
2000     5.08                  5.58
4000     11.80                 12.42
8000     28.71                 32.36
16000    71.71                 75.42

Tab. 2.7: Uncorrelated data. Comparison of running times per choice point for the new amortized linear time propagation algorithm based on bound U2 and the implementation of MTR. We give the average time per choice point in milliseconds for 100 test sets per size.

The Dembo/Hammer-based filtering algorithm needs to visit the largest number of choice points among the four propagation algorithms tested. This matches the expected behavior of a method that prunes with respect to weaker bounds. Due to the short time per choice point, though, it is only slightly slower than the other methods on uncorrelated data. Thus, the numerical results

reflect the expected trade-off between an effective filtering and the time needed to achieve a

higher level of consistency. In the presence of additional constraints (causing a longer time spent

per choice point that is needed for constraint propagation), it is likely that a smaller number

of choice points will result in a faster overall computation. Algorithm linU1 uses fewer choice points than DHR, but is not as effective as the U2-based algorithms, MTR and linU2. For the

larger instances of this test set, these two only visit between 50% and 65.6% of the choice points

needed by DHR.

For weakly correlated data, linU2 visits at most 69.7% of the choice points of the DHR routine. Moreover, linU2 slightly outperforms DHR with respect to the total running time. Notice that the time per choice point spent by linU2 for weakly correlated instances is smaller than that for uncorrelated data. The reason for this is that the preprocessing time for initializing the more complex data structures for linU2 and for sorting the items according to weight and efficiency is spread over a much larger number of choice points.

Large Instances To get a clearer insight into the characteristics of the different algorithms, we

performed some tests on larger instances. Going up to 10000 items, the disadvantages of the

poor bounds used by linU1 and especially DHR become obvious. Due to the much larger number of choice points that have to be visited, the total running times exceed those of linU2 and MTR (see Table 2.8).

Still, on average, the algorithms based on MTR and linU2 need about the same running time. We assume that, for smaller test instances, the binary search performed by MTR is faster because it causes less overhead than linU2. As the problem size increases, however, the difference

Size     DHR               linU1             linU2             MTR
n        cp        time    cp        time    cp        time    cp        time
1000     97.83     0.26    72.38     0.21    59.73     0.17    59.73     0.18
2000     161.48    0.79    120.64    0.65    100.38    0.51    100.38    0.56
3000     202.34    1.59    148.43    1.31    118.90    1.00    118.90    1.06
4000     291.00    3.17    205.16    2.43    146.58    1.73    146.58    1.82
5000     360.47    4.82    245.32    3.79    184.83    2.65    184.83    2.98
6000     534.61    9.46    376.69    7.81    197.43    3.84    197.43    4.30
7000     620.48    12.90   431.55    10.11   294.18    6.78    294.18    7.57
8000     823.34    21.08   567.43    16.47   285.22    8.19    285.22    9.23
9000     1051.72   31.76   712.51    23.74   435.65    14.50   435.65    15.46
10000    1143.54   38.39   797.58    30.21   620.35    22.71   620.35    24.99

Tab. 2.8: Uncorrelated data. Comparison of running times for the new amortized linear time propagation algorithms and implementations of DHR and MTR. We give the average time in seconds as well as the number of choice points for 100 test sets per size.

         uncorrelated                  weakly correlated
Size     linU2               MTR       linU2               MTR
n        cp        time      time      cp        time      time
10000    620.35    22.71     24.99     1626.78   60.98     66.58
11000    629.43    26.38     28.76     2572.45   110.47    121.08
12000    604.87    28.04     32.31     2590.45   125.40    137.21
13000    1341.42   69.30     77.31     2694.07   142.13    156.26
14000    875.71    50.42     56.96     3520.18   206.68    228.54
15000    1041.80   64.60     70.74     2818.97   185.33    204.80
16000    1256.73   90.12     94.78     2164.99   154.56    172.14
17000    1670.81   124.53    139.63    3145.36   250.59    276.93
18000    2580.28   205.81    227.81    2980.91   251.43    279.63
19000    2870.68   243.05    274.93    4871.67   435.33    476.97
20000    2750.36   256.88    288.15    4319.27   405.56    452.50

Tab. 2.9: Comparison of running times of linU2 and MTR on uncorrelated and weakly correlated data. cp is the number of choice points, time the running time in seconds.

in efficiency becomes more noticeable, and linU2 slightly outperforms MTR (see Tables 2.7

and 2.9).

A drawback of the new methods is the need for an initial sorting step in the preprocessing in

which a profit and a weight ordering of all items are calculated. However, timing experiments

show that this initial step costs about 0.06 seconds for 10000 items and takes less than 0.01

seconds for 1000 items. According to Table 2.8, the total running time for these problem sizes is

much higher. Hence, the preprocessing time can be neglected in practice.
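The initial sorting step can be pictured as follows. This is a hedged sketch only: the function name `preprocess_orderings` is hypothetical, and it assumes that the profit-related ordering is the ordering by efficiency $p_j / w_j$ mentioned earlier in this section.

```python
def preprocess_orderings(weights, profits):
    """Return item indices sorted by non-increasing efficiency p_j / w_j
    and by non-decreasing weight w_j (both are O(n log n) sorts)."""
    items = range(len(weights))
    by_efficiency = sorted(items, key=lambda j: profits[j] / weights[j], reverse=True)
    by_weight = sorted(items, key=lambda j: weights[j])
    return by_efficiency, by_weight
```

Both orderings are computed once before the search starts, so their cost is spread over all choice points that follow.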

2.5.5 Cost-based Filtering for Knapsack Related Problems

Before summarizing our results on the filtering algorithms of knapsack constraints, we would

like to discuss their applicability to two special variants of the Knapsack Problem that have been

introduced in the literature.

Multidimensional Knapsack Problems The Multidimensional Knapsack Problem consists in the maximization of a given profit function with respect to two or more given capacity constraints. The problem can be viewed as a collection of $m$ Knapsack Problems sharing one objective:

$$\max \sum_j p_j X_j \quad \text{s.t.} \quad \sum_j w^i_j X_j \le C_i \;\; \forall\, 1 \le i \le m, \quad X_j \in \{0,1\} \;\; \forall\, j \qquad (2.8)$$

Thus, for each of the capacity constraints, we can define an optimization constraint and per-

form cost-based filtering using the propagation algorithms we just presented. This approach,

however, suffers a setback from the fact that the bounds computed in each optimization con-

straint ignore all constraints except one. Therefore, the bounds are not tight, and filtering is less

effective than it could and should be.

In Chapter 3, we develop a generic method for linking filtering algorithms of linear opti-

mization constraints, the CP-based Lagrangian relaxation. When applied to Multidimensional

Knapsack Problems, problem reduction is based on the filtering routines of the individual knap-

sack constraints incorporating the other constraints in a Lagrangian objective. We will see that

this approach is clearly favorable compared to the loose connection of optimization constraints

that interact via domain reduction only.

Note, however, that the asymptotic complexity improvements that we introduced are lost when applying the knapsack filtering algorithm in the context of CP-based Lagrangian relaxation, because for each Lagrangian sub-problem, the objective changes. Thus, the efficiency ordering has to be re-computed, which then dominates the algorithmic complexity. It is worth noting that this problem does not occur when the filtering algorithms presented here are applied to column generation sub-problems (as in CP-based column generation), because the objective remains

fixed for the entire tree search that is applied to compute a new column. Thus, the efficiency

ordering of the knapsack items has to be re-computed only when a new sub-problem is set up.

Bounded Knapsack Problems Bounded Knapsack Problems generalize the 0-1 KP by defining individual bounds on the solution vector:

$$\max \sum_j p_j X_j \quad \text{s.t.} \quad \sum_j w_j X_j \le C, \quad X_j \in \{0, \dots, u_j\} \;\; \forall\, j \qquad (2.9)$$

The discussion in Section 2.5.1 on the Constrained Cutting Stock Problem has shown an application of Bounded Knapsack Problems. Obviously, (2.9) can be transformed into a CKP by replacing each original variable $X_j$ by $u_j$ new variables $X_j^1, \dots, X_j^{u_j} \in \{0,1\}$. (Note that a finite $u_j$ always exists, as $X_j \le \lfloor C / w_j \rfloor$.) Then the algorithms presented before could be applied. That approach, however, artificially enlarges the number of variables and ignores the additional structure of (2.9) completely.

We can do better by extending U1 and U2 to general integer bounds for KP. That is, we choose the critical item as $s := \min\{\, j \mid \sum_{i=1}^{j} u_i w_i > C \,\}$. Then U1 can be re-written as $U_1 := \sum_{j=1}^{s-1} u_j p_j + \lfloor c\, p_s / w_s \rfloor$, where $c := C - \sum_{j=1}^{s-1} u_j w_j$. For a detailed discussion of such generalizations, and an extension of U2, we refer the reader to [149, pp. 84ff.]. Using these extended bounds, efficient propagation for the Bounded Knapsack Problem is then easily achieved by the algorithms proposed in Sections 2.5.3.1 and 2.5.3.2.
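A bound of this kind for the Bounded Knapsack Problem can be computed in a single pass over the items. The sketch below is an illustration under stated assumptions, not the implementation used in the experiments: the function name `bounded_dantzig_bound` is hypothetical, items are assumed to be sorted by non-increasing efficiency $p_j / w_j$, all data are assumed to be positive integers, and the fractional contribution of the critical item is rounded down.

```python
def bounded_dantzig_bound(weights, profits, bounds, capacity):
    """Upper bound for the Bounded Knapsack Problem (sketch).

    Assumes items are sorted by non-increasing efficiency p_j / w_j and
    that all data are positive integers.
    """
    value = 0
    residual = capacity              # capacity left before the current item
    for w, p, u in zip(weights, profits, bounds):
        if u * w > residual:         # critical item s reached
            # rounded-down profit of filling the residual capacity with s
            return value + (residual * p) // w
        value += u * p               # item taken at its upper bound u_j
        residual -= u * w
    return value                     # everything fits: no critical item
```

For weights (2, 3, 4), profits (6, 6, 4), bounds (2, 1, 3) and capacity 9, the first two items are taken at their upper bounds (profit 18, residual capacity 2), the third item is critical, and the sketch returns 18 + ⌊2 · 4 / 4⌋ = 20.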

2.5.6 Summary

Based on relaxation bounds for KP, we introduced a reduction algorithm that runs in amortized time $\Theta(n)$ for $\Omega(\log n)$ calls. The algorithm can be used efficiently as a propagation routine when

solving a combinatorial optimization problem that contains one or more knapsack constraints.

In a CP search, the efficiency of the algorithm developed depends on the number of choice

points and the time needed per choice point: The more choice points are investigated during

the search, the less dominant are the preprocessing times for initialization and sorting. Also, if

more time per choice point is spent by other routines – that, for instance, propagate additional

constraints of a CKP or calculate more expensive bounds on the objective – the more important

is an effective filtering behavior that justifies a higher effort spent per choice point.

Experiments show that the algorithms presented are as effective as another method based on

a reduction technique previously proposed by Martello and Toth for KP. The theoretical analysis

and numerical comparison show that the new filtering algorithm is asymptotically more efficient.

Chapter 3

Cost-based Filtering and Problem

Decomposition

In the previous chapter, we developed a tool box of efficient cost-based filtering algorithms for a

whole variety of important optimization constraints. None of these filtering algorithms is actually

useful when the problem corresponding to the constraint has to be solved: For the Shortest Path

Problem, the Weighted Stable Set Problem on interval graphs, and the Weighted Bipartite Match-

ing Problem there exist efficient polynomial time algorithms. Therefore, there is no need to apply

a tree search and domain filtering to solve these problems. As a matter of fact, the contrary is the

case: the filtering algorithms that we developed are based on the efficient algorithms available

to solve the corresponding optimization problems. The only NP-hard optimization problem that

we considered was the Knapsack Problem. However, even for this problem, it is not clear how

the state of the art algorithms for its solution could benefit from the filtering algorithms that we

developed.

On the other hand, real-life problems often consist in a combination of various constraints.

Frequently, the resulting problem is NP-hard, and the special composition even of well-known

optimization problems has not been studied before. Of course, for every such combination at

hand, the complexity of the resulting problem could be studied; questions regarding the approx-

imability of the problem may be answered; and algorithms for the predominant constraints may

successfully be adapted to efficiently handle the additional constraints. In general, sound theoretical work will usually establish an understanding of the special augmented constraint structure, and this approach will most likely yield the most efficient algorithms. Therefore, for composed

problems that are of great practical relevance or that occur very frequently, this way of construct-

ing an efficient algorithm for a specific problem is favorable and necessary.

In industrial practice, however, the efficiency of the resulting algorithm is not the only crite-

rion. Of course, faster algorithms providing solutions of very good or even provably high quality

are clearly favorable. On the other hand, there is an obvious need to develop stable software solutions quickly, and rapid prototyping is of great importance, for example when a company wants to enter a new market more quickly than its competitors, or when the problems that have to be solved vary, which may be caused by flexible environments in which the types of constraints to be obeyed change frequently.

In constraint programming, a problem is represented as a set of constraints on variables with

finite domains. The standard algorithmic approach is to apply a tree search where, in every choice

point, constraints are propagated, i.e., they are used to shrink the domains of the corresponding

variables, if possible. This process of constraint propagation is repeated until a stable state is

reached where no values can be removed from variable domains anymore. Then a new branching

decision is made and the search continues.
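The propagation loop described above can be sketched in a few lines. The names `propagate_to_fixpoint` and `less_than` are hypothetical and serve only as an illustration; real CP solvers such as ILOG SOLVER use event-driven propagation queues rather than this simple round-robin scheme.

```python
def propagate_to_fixpoint(domains, propagators):
    """Run all propagators until no domain changes anymore (sketch).

    domains: dict mapping a variable name to the set of remaining values.
    propagators: callables that take the domains, may shrink them, and
    return True if they removed at least one value.
    Returns False if some domain became empty (failure), True otherwise.
    """
    changed = True
    while changed:
        changed = False
        for propagate in propagators:
            if propagate(domains):
                changed = True
            if any(not values for values in domains.values()):
                return False
    return True


def less_than(x, y):
    """Hypothetical propagator factory for the constraint x < y."""
    def propagate(domains):
        dx, dy = domains[x], domains[y]
        if not dx or not dy:
            return False
        new_dx = {v for v in dx if v < max(dy)}   # x needs a larger y value
        new_dy = {v for v in dy if v > min(dx)}   # y needs a smaller x value
        changed = new_dx != dx or new_dy != dy
        domains[x], domains[y] = new_dx, new_dy
        return changed
    return propagate
```

With three variables a, b, c over {1, 2, 3} and the constraints a < b and b < c, the loop needs several rounds before the stable state a = 1, b = 2, c = 3 is reached, which shows how constraints interact only through the variable domains.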

This way, constraints interact only via the domains of their variables, which makes the approach extremely flexible with respect to the addition or removal of constraints. Moreover, and in contrast to linear programming, the types of constraints that can be used are not really predefined.

All that is needed is a domain filtering or at least a checking algorithm for every constraint that is

used in the problem model. These algorithms can be tailored for a given constraint. Standard CP

solvers like ILOG SOLVER [121] even offer the possibility to compose constraints out of a set of

basic logic and algorithmic constraints, which facilitates the software development process and

gives less room for mistakes in the implementation.

In this setting, when given a discrete optimization problem, we may be able to identify sub-

structures that match one of the optimization constraints considered in the previous chapter.

Then we can simply plug in the constraint and use its corresponding filtering algorithm. Prob-

lem tightening with respect to cost considerations that used to be highly problem-dependent can

be handed over to standard libraries that take this task over. This results in a faster and safer

software development.

There is a price to pay, however. The way constraint programming decomposes a problem is very weak, because only one constraint is considered at a time. This allows local inconsistencies to be resolved very efficiently, but on the other hand the approach lacks a global view

on a problem, which is particularly bad with respect to the computation of meaningful bounds

on the objective.

In the following, we show how to improve upon this situation by making use of two standard

decomposition methods in operations research: column generation and Lagrangian relaxation.

The results of Section 3.1 were published in [128, 129], and parts of the Sections 3.2, 3.3 were

published in [188, 189, 190].


3.1 CP-based Column Generation

Given a natural number n

 

and a finite set1X





n, we consider the following discrete

optimization problem that consists of two constraint families A:Ax



b, and B:Bx



e,x



X:2

Minimize LPP



cTx

subject to Ax







The convex hull of solutions to B defines a compact polytope in $\mathbb{R}^n$. Let $D \in \mathbb{R}^{n \times L}$ denote the matrix that consists of one column for each corner of this polytope. Then, each solution of the system B can be written as a convex combination of the columns of $D$, i.e., for all $x \in X$ satisfying the constraints in B there exist $\lambda_1, \dots, \lambda_L \ge 0$ such that $\sum_i \lambda_i = 1$ and $x = D\lambda$. Therefore, $LP_P$ can be rewritten

as Minimize LPC



cTDλ

subject to ADλ



∑i



Lλi





Dλ





We obtain a linear continuous relaxation by omitting the discrete constraint $D\lambda \in X$. The advantage of the above reformulation is that there is no cross-talk between the constraints in A and B anymore. However, in general the matrix $AD$ will contain far too many columns to allow an explicit representation. Fortunately, such a representation is not needed to solve the corresponding LP, because the simplex algorithm considers only one column at a time. Therefore, columns can simply be generated when needed. This idea gave rise to the concept of column generation, and it is one of the most frequently used techniques in linear programming practice.

The origins of column generation date back to the works of Dantzig and Wolfe [51] and

Gilmore and Gomory [96]. The latter paper applies column generation to the classical Cutting

Stock Problem where the sub-problem is a Knapsack Problem. More recent applications include

specially structured integer programs such as the Generalized Assignment Problem, Time Con-

strained Vehicle Routing, Crew Pairing, Crew Assignment and related problems. We refer the

reader to [56] for a survey.

The procedure works as follows: We start with a sub-matrix $\bar{D}$ of $D$ and solve the

reduced system Minimize LPR







cT¯

Dλ

subject to A¯

Dλ



∑i



kλi





1A typical example is X









2Here and in the following we identify the name of an LP (here LPP) and its optimal objective value.


that is called the master problem. Denote the dual of the convex combination constraint by $\pi$, and let $\mu$ denote the vector of duals of the constraints that relate $A\bar{D}\lambda$ to $b$. We want to use the dual data to generate a new column that has the potential to reduce the costs in the master problem. In the simplex algorithm, those columns are determined with the help of reduced costs, which must be negative. Thus, we consider the sub-problem

Minimize LPS





 



µTA



subject to Bx







If $LP_S < \pi$, we add the solution $d_{k+1} := x$ to the master matrix $\bar{D}$ and start over with the next iteration by re-optimizing the increased master problem $LP_R$. Otherwise, the process stops, and we obtain a valid lower bound on $LP_P$. Since the solution computed will in general not fulfill $D\lambda \in X$, the remaining gap between upper and lower bound has to be closed in a branch & price approach. We refer the reader to Barnhart et al. [13] for further information on this topic.
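The pricing step can be illustrated by the following sketch, which searches for a column whose reduced cost is negative, i.e., a sub-problem solution $x$ with $(c^T - \mu^T A)\,x < \pi$. It is a toy illustration only: the function name `price_out` is hypothetical, the sub-problem is assumed to range over 0/1 vectors, its constraints are passed in as a predicate `feasible`, and plain enumeration stands in for the CP tree search with cost-based filtering that the text proposes.

```python
from itertools import product

def price_out(c, A, mu, pi, feasible):
    """Search {0,1}^n for a column with negative reduced cost (sketch).

    c: cost vector, A: rows of the coupling constraints, mu: their duals,
    pi: dual of the convexity constraint, feasible: predicate encoding the
    sub-problem constraints. Returns (x, value) for a minimizer of the
    modified objective if value < pi, and None otherwise.
    """
    n = len(c)
    # modified costs c_j - sum_i mu_i * A[i][j]
    reduced = [c[j] - sum(mu[i] * A[i][j] for i in range(len(A))) for j in range(n)]
    best_x, best_value = None, None
    for x in product((0, 1), repeat=n):
        if not feasible(x):
            continue
        value = sum(rj * xj for rj, xj in zip(reduced, x))
        if best_value is None or value < best_value:
            best_x, best_value = x, value
    if best_value is not None and best_value < pi:
        return best_x, best_value    # column enters the master problem
    return None                      # no improving column: process stops
```

Because the modified costs depend on the duals only, the feasible region of the sub-problem stays the same from one iteration to the next; only the objective changes.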

If a discrete optimization problem can be decomposed in the way we just described, the sub-problem may be viewed as a constraint satisfaction problem where the set X is defined by a set

of additional constraints. A typical example of such a sub-problem is the Constrained Knapsack

Problem that evolves e.g. when solving the Constrained Cutting Stock Problem with the help

of column generation. Another important class of sub-problems are Constrained Shortest Path

Problems that evolve in many contexts that range from route guidance [123] and duty scheduling

in public transit [25] up to the scheduling of switching engines [142]. The crew scheduling

application that we consider in Chapter 5 is another example where the sub-problem exhibits the

structure of a Constrained Shortest Path Problem.

For real-life applications, the additional constraints defining X can exhibit very complicated

structures, such as gliding time window constraints in the Airline Crew Assignment for example.

Also, the additional constraints may vary from case to case. A constraint propagation approach

can easily cope with that situation.

In that context, the advantage of the problem decomposition consists in the fact that the constrained sub-problem does not contain the restrictions of A anymore. Standard CP modeling would also separate the families A and B. However, when considering B, the constraints in A are simply ignored, which can have a severe negative impact on the bounds used for cost-based filtering. Using the above decomposition, we also consider the constraint family B only, but in combination with changing objectives that reflect the constraints in A. Therefore, we achieve a

global view on the problem and tighter bounds that are used for a much more effective domain

filtering.

Of course, our hope is to find a decomposition such that we can identify a predominant

optimization constraint in the sub-problem that can be used for an efficient cost-based filtering in the column generation process. Provided with such a decomposition, the algorithms developed

in the previous chapter can help to solve these problems efficiently.

3.2 CP-based Lagrangian Relaxation

Given a natural number nand vectors l



 

n, we consider an integer linear optimization prob-

lem (IP) consisting of the two constraint families A:Ax



b,xi









, and B:Bx











Minimize L



cTx

subject to Ax













A common way to achieve a lower bound $\bar{L}$ on such a problem is to drop the integrality constraints $x_i \in \{l_i, \dots, u_i\}$ and to replace them by $l_i \le x_i \le u_i$ instead. We get

Minimize ¯



cTx

subject to Ax





Now, to achieve a state of relaxed ¯

L-consistencywe could of course solve a series of LPs ¯







where we set some variable xi, 1



n, to some value v





 



. Then, given an upper

bound B, we can eliminate vfrom the domain of xiif ¯









B. Note that, due to ¯















for all w



v(the lower bound constraints follow analogously), this procedure will not

split the domains of the variables x. That is, after the filtering the domains of the variables xican

again be represented as xi







ˆui



for some ˆ



liand ˆui



ui, 1



The problem with the previous probing procedure is that it requires re-optimizing a dual

feasible LP many times, and this is usually unattractive with respect to the required computation

time. Therefore, it has been suggested to estimate the loss in performance by carrying out exactly

one dual re-optimization step. This method is known as reduced-cost filtering. It is computation-

ally cheap, but since it only indirectly exploits the structure of the problem it has a tendency to

be rather ineffective.
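For 0/1 variables, reduced-cost filtering amounts to a single pass over the reduced costs of the optimal LP solution. The sketch below illustrates the idea under stated assumptions: the function name `reduced_cost_filter` is hypothetical, the problem is a minimization problem, reduced costs are taken to be non-negative for variables at their lower bound and non-positive for variables at their upper bound, and a value is removed only if the estimated bound strictly exceeds the upper bound.

```python
def reduced_cost_filter(lp_value, lp_solution, reduced_costs, upper_bound):
    """Return {index: value_to_remove} for 0/1 variables (sketch).

    lp_value: optimal value of the continuous relaxation (a lower bound),
    lp_solution: values of the variables in the LP optimum,
    reduced_costs: reduced cost of each variable in the LP optimum,
    upper_bound: value of the best known solution.
    """
    removals = {}
    for i, (x, r) in enumerate(zip(lp_solution, reduced_costs)):
        if x == 0 and r > 0 and lp_value + r > upper_bound:
            removals[i] = 1    # forcing x_i = 1 costs at least r more
        elif x == 1 and r < 0 and lp_value - r > upper_bound:
            removals[i] = 0    # forcing x_i = 0 costs at least -r more
    return removals
```

The estimate `lp_value + r` is exactly what one dual simplex step would yield, which is why the method is cheap but weaker than fully re-optimizing the LP for every variable and value.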

To improve the inherent trade-off between computational effort and effectiveness, we try to decompose the problem. Assume that efficient filtering algorithms Prop(A) and Prop(B) exist that achieve a state of relaxed consistency for the constraint families A and B, respectively. The obvious approach to solve problem L exactly is to apply a branch-and-bound algorithm using

linear relaxation bounds for pruning and the existing filtering algorithms Prop(A) and Prop(B)

to tighten the problem formulation in every choice point.


However, even though Prop(A) and Prop(B) may be effective for the substructures they have

been designed for, their application for the combined problem is usually not. This is because

tight bounds on the objective cannot be obtained by taking only a subset of the restrictions into

account. An accurate bound on the overall problem can only be computed by looking at the

entire problem, i.e., it cannot be achieved by looking at either one constraint family only.

Lagrangian relaxation allows us to bring together the advantages of a tight global bound and

the existing filtering algorithms that exploit the special structure of their respective constraint

families. The idea of Lagrangian relaxation was first presented in [59] for Resource Allocation

Problems. Held and Karp used it for the TSP [107, 108], and it has been applied in many different

areas since then. For a general introduction we refer the reader to [1]. The method that we present

in the following is somewhat related to that in [80] where Focacci et al. introduce a method to

strengthen cost-based filtering by using Lagrangian multipliers to incorporate additional cuts to

tighten the bound used for propagation.

For our abstract composed problem, we introduce a vector of Lagrange multipliers $\lambda \ge 0$ and define the Lagrangian sub-problem

$$\text{Minimize } L_B(\lambda) = c^T x + \lambda^T (b - Ax) \quad \text{subject to } Bx \le d,\ l \le x \le u,\ x \text{ integer}.$$

For every choice of $\lambda \ge 0$, $L_B(\lambda)$ is a lower bound on $L$. Then the Lagrange multiplier problem or Lagrangian dual consists in finding the maximum lower bound that can be achieved:

$$\text{Maximize } G = L_B(\lambda) \quad \text{subject to } \lambda \ge 0.$$

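Lemma 3.1 Given $1 \le j \le n$ and a value $v \in \{l_j, \dots, u_j\}$, let $L_B(\lambda)[x_j = v]$ denote the IP that evolves when adding the constraint $x_j = v$ to $L_B(\lambda)$. Furthermore, let $B$ denote an upper bound on the objective of $L$ such that $\bar{L}[x_j = v] > B$, where $\bar{L}[x_j = v]$ is the continuous relaxation of $L$ with the additional constraint $x_j = v$. Finally, denote the continuous relaxation of $L_B(\lambda)[x_j = v]$ by $\bar{L}_B(\lambda)[x_j = v]$. Then there exists a vector $\lambda \ge 0$ such that

$$L_B(\lambda)[x_j = v] \ge \bar{L}_B(\lambda)[x_j = v] = \bar{L}[x_j = v] > B.$$

Proof: Let $\lambda \ge 0$ denote a vector of optimal dual values of the constraint family A in $\bar{L}[x_j = v]$. The theory of Lagrangian relaxation shows that the vector $\lambda$ defines optimal Lagrange multipliers for $\bar{L}[x_j = v]$. Therefore,

$$L_B(\lambda)[x_j = v] \ge \bar{L}_B(\lambda)[x_j = v] = \bar{L}[x_j = v] > B.$$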

To put the result into words: Lemma 3.1 shows that for every variable $x_j$ and value $v \in \{l_j, \dots, u_j\}$ that can be filtered with respect to the relaxation $\bar{L}$, there exists a vector of Lagrange multipliers that allows us to filter this value with respect to the constraint family B only. Of course, due to symmetry, the same result holds when we relax B and keep only the constraints in A as hard constraints.

3.2. CP-based Lagrangian Relaxation

This observation motivates the following procedure: We compute $G$ with the help of an iterative algorithm for the maximization of a piece-wise linear, concave function. Standard algorithms used in the literature are subgradient algorithms or bundle methods [1]. For every selection of multipliers $\lambda \ge 0$, $L_B(\lambda)$ is a valid lower bound on $L$. Thus, we can apply Prop(B) to the constraint family B every time we solve the Lagrangian sub-problem $L_B(\lambda)$. Of course, our hope is that, while solving the Lagrangian dual, we traverse most relevant selections of Lagrange multipliers, which will result in a filtering that almost achieves a state of relaxed L-consistency.
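The loop described above can be sketched on a toy problem. The sketch makes strong simplifying assumptions that are not part of the text: the relaxed family consists of constraints $Ax \ge b$, the variables are binary, and the remaining family B is empty, so that filtering on the sub-problem reduces to a reduced-cost argument. The function name `lagrangian_filter` and the step size $1/t$ are illustrative choices only.

```python
def lagrangian_filter(c, A, b, upper_bound, iterations=50, eps=1e-9):
    """Subgradient search on the Lagrangian dual of
         min c^T x  s.t.  A x >= b (relaxed),  x in {0,1}^n,
    with filtering in every Lagrangian sub-problem.

    Returns the best lower bound found and the set of (j, v) pairs meaning
    'value v can be removed from the domain of x_j'.
    """
    n, m = len(c), len(b)
    lam = [0.0] * m
    best_bound = float("-inf")
    marked = set()  # values are only marked, not removed, while the dual is solved
    for t in range(1, iterations + 1):
        # Lagrangian costs c - A^T lam; the sub-problem is solved by inspection
        r = [c[j] - sum(lam[i] * A[i][j] for i in range(m)) for j in range(n)]
        x = [1 if rj < 0 else 0 for rj in r]
        bound = sum(rj * xj for rj, xj in zip(r, x)) + sum(l * bi for l, bi in zip(lam, b))
        best_bound = max(best_bound, bound)
        # Filtering: forcing x_j to the opposite value raises the bound by |r_j|
        for j in range(n):
            if bound + abs(r[j]) > upper_bound + eps:
                marked.add((j, 1 - x[j]))
        # Subgradient step, projected onto lam >= 0
        g = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        lam = [max(0.0, lam[i] + g[i] / t) for i in range(m)]
    return best_bound, marked


if __name__ == "__main__":
    print(lagrangian_filter([1, 1, 10], [[1, 1, 1]], [1], upper_bound=1))
```

In the example, one of three items must be chosen and a solution of cost 1 is known, so the expensive third item is ruled out in the very first iteration.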

If we still find that this filtering procedure does not sufficiently reduce the variable domains, we can do even more. Consider the other possible decomposition

$$\text{Minimize } L_A(\pi) = c^T x + \pi^T (d - Bx) \quad \text{subject to } Ax \ge b,\ l \le x \le u,\ x \text{ integer}.$$

Given a current selection of Lagrangian multipliers $\lambda \ge 0$, denote the optimal dual values of $Bx \le d$ in the continuous relaxation of $L_B(\lambda)$ by $\pi_\lambda \le 0$.



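Lemma 3.2 Denote the continuous relaxation of $L_A(\pi_\lambda)$ by $\bar{L}_A(\pi_\lambda)$. Then, $\bar{L}_A(\pi_\lambda) \ge \bar{L}_B(\lambda)$.

Proof: Denote the dual of $\bar{L}_A(\pi_\lambda)$ by $D_A(\pi_\lambda)$. By assumption, the vector $\pi_\lambda$ is dual optimal for $\bar{L}_B(\lambda)$. Let $\mu_\lambda \ge 0$ and $\nu_\lambda \ge 0$ denote the optimal duals for the constraints $x \le u$ and $x \ge l$ in $\bar{L}_B(\lambda)$, respectively. Then, due to strong LP duality, it holds that

$$\bar{L}_B(\lambda) = d^T \pi_\lambda - \mu_\lambda^T u + \nu_\lambda^T l + \lambda^T b \quad \text{(optimality)},$$

and

$$c^T - \lambda^T A - \pi_\lambda^T B + \mu_\lambda^T - \nu_\lambda^T = 0 \quad \text{(feasibility)}.$$

Therefore, $\lambda$, $\mu_\lambda$ and $\nu_\lambda$ form a feasible solution to $D_A(\pi_\lambda)$ with the objective value $\bar{L}_B(\lambda)$. Thus,

$$\bar{L}_A(\pi_\lambda) = D_A(\pi_\lambda) \ge \bar{L}_B(\lambda).$$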

As a simple consequence, we get the following

Corollary 3.1 If the Lagrangian relaxation $L_B(\lambda)$ exhibits the integrality property, it holds that: $L_A(\pi_\lambda) \ge L_B(\lambda)$.

Lemma 3.2 and Corollary 3.1 show that the duals $\pi_\lambda$ of $\bar{L}_B(\lambda)$ are a good candidate to achieve an improved lower bound $L_A(\pi_\lambda)$ on $L$. This observation motivates the idea to improve the effectiveness of the filtering algorithm by applying Prop(A) to $L_A(\pi_\lambda)$ in every or at least some iterations of the algorithm that maximizes the Lagrangian dual.


We put the ideas together. Two linear optimization constraint families A and B for which efficient filtering algorithms Prop(A) and Prop(B) are known can be combined effectively: we compute Lagrangian multipliers for A and use Prop(B) for filtering in each Lagrangian sub-problem $L_B(\lambda)$. Then, in selected Lagrangian iterations, we hand back optimal dual information $\pi_\lambda$ of $\bar{L}_B(\lambda)$ to propagate A, i.e. we apply Prop(A) on $L_A(\pi_\lambda)$.



3.3 Remarks and Generalizations

3.3.1 Solving the Lagrangian Dual and Idempotence

When using CP-based Lagrangian relaxation, after having shrunk the domains of the variables, the immediate re-application of the filtering algorithm may yield a further reduction of the domains. This effect is caused by the algorithms used for the maximization of the Lagrangian dual (such as subgradient algorithms, bundle methods or the volume algorithm [11]), which will in general proceed differently when the domains of the variables are changed. As a result, different Lagrangian multipliers and sub-problems are investigated, which also yields a different filtering behavior. As a consequence, the filtering procedure as described is not idempotent [6].

Moreover, it is not clear whether domain reduction should actually take place during the optimization of the Lagrangian dual. We are safe if we just mark those values that can be deleted from variable domains and postpone the actual reduction until the Lagrangian dual is solved. On the other hand, it may also be favorable to incorporate the new knowledge as early as possible. It is subject to further research to investigate how, e.g., a subgradient search can cope with changing problems, and whether convergence can still be proven in such a scenario. A practical application of this procedure will be evaluated in Chapter 7.

3.3.2 Redundant Constraint Generation

Since the filtering behavior of the reduction algorithm based on Lagrangian relaxation relies on

the sub-problems investigated during the optimization of the Lagrangian dual, we cannot be sure

that our cost-based filtering algorithm exhibits a property that we call continuity:

Let $B$ denote an upper bound on the minimization problem $L$, let $C$ denote the current choice point and $L^C[x = v]$ the best bound achieved regarding the removal of $v$ from the domain of $x$ in $C$. Now assume that we have $\delta := B - L^C[x = v] \ge 0$ for some variable $x$ and $v$ in the domain of $x$. Assume further that a primal heuristic finds a new upper bound $\bar{B} < B - \delta$ next. We call a cost-based filtering algorithm continuous, if it is guaranteed that in every child node $D$ of the current choice point $C$ it is detected that $v$ can be removed from the domain of $x$.

When using Lagrangian decomposition, this is not the case. Let $\lambda \ge 0$ denote Lagrangian multipliers such that $L_B^C(\lambda)[x = v] = L^C[x = v]$. Then we cannot be sure that, when performing problem reduction in $D$, the algorithm optimizing the Lagrangian dual will investigate the Lagrangian multipliers $\lambda$. Thus, it may very well be the case that $L_B^D(\lambda')[x = v] < L_B^C(\lambda)[x = v]$ for all multipliers $\lambda'$ investigated in $D$, and $L^D[x = v] \le \bar{B}$.

To overcome this problem, we suggest to store, for each variable-value assignment, the value $L[x = v]$ of the largest lower bound achieved so far. This procedure may be viewed as a generation of redundant local constraints of the form: $L \ge L[x = v]$ or $x \ne v$.
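A minimal sketch of such a bound store is given below. The class name `BoundStore` and its methods are hypothetical. Since the stored bounds are local to the current choice point, the sketch assumes that each child node works on its own copy of the store.

```python
class BoundStore:
    """Keeps, for every (variable, value) pair, the largest lower bound seen so far."""

    def __init__(self):
        self._best = {}  # (var, val) -> largest lower bound on L given var = val

    def update(self, var, val, bound):
        key = (var, val)
        if bound > self._best.get(key, float("-inf")):
            self._best[key] = bound

    def removable(self, upper_bound):
        """All (var, val) pairs whose stored bound exceeds the given upper bound."""
        return sorted(k for k, lb in self._best.items() if lb > upper_bound)


if __name__ == "__main__":
    store = BoundStore()
    store.update("x", 3, 9.0)   # strong bound found with some multipliers
    store.update("x", 3, 7.5)   # weaker bound found later does not overwrite it
    print(store.removable(10.0), store.removable(8.5))
```

When a primal heuristic improves the upper bound, the stored values allow the removal of a value even if the multipliers that produced the strong bound are never visited again.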





3.3.3 Linking more than Two Optimization Constraints

The procedure sketched can easily be generalized if the linking of more than two constraints is desired. All we need to do is to select the substructure that determines the Lagrangian sub-problem, i.e., the one that is used to guide the algorithm for the solution of the Lagrangian dual. In selected iterations, we apply the filtering algorithm for the other substructures with a modified objective function. That modification is determined by the dual values of the family of constraints in the Lagrangian sub-problem and the Lagrange multipliers for the remaining substructures.

3.3.4 Linear Relaxations and Cuts

If continuous bounds are preferred to bounds based on Lagrangian relaxations, it is also possible to use dual values instead of Lagrange multipliers to modify the objective functions for the respective sub-problems to which we want to apply a filtering algorithm. We still use the terminology of a linking method based on Lagrangian relaxation, as we use Lagrangian objectives for cost-based filtering.

Of course, the method can also be used in combination with tightening algorithms such as

cut generators. We simply incorporate all additional cuts as a new family of constraints we have

to find Lagrange multipliers (or dual values) for.

3.3.5 Binary IPs

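Interestingly, as a special case we achieve a propagation algorithm for binary IPs. Given an $m \times n$ matrix $A$, a vector $b$ with $m$ components, and a vector $p$ with $n$ components, we consider the following binary program:

$$\text{Maximize } p^T x \quad \text{subject to } Ax \le b,\ x \in \{0, 1\}^n.$$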

The problem can be viewed as a combination of $m$ Knapsack Problems. Assuming that we solve the continuous relaxation to compute an upper bound, let $\pi \ge 0$ (with $m$ components) and $\mu \ge 0$ (with $n$ components) denote the optimal solution to the dual problem, i.e., $\pi$ and $\mu$ solve the following linear problem:

$$\text{Minimize } b^T \pi + 1^T \mu \quad \text{subject to } A^T \pi + \mu \ge p,\ \pi \ge 0,\ \mu \ge 0.$$

Let $1 \le i \le m$, and let ${}^iA$ denote the $(m-1) \times n$ matrix that evolves from $A$ by erasing row $i$, and ${}^ib$, ${}^i\pi$ the vectors with $m - 1$ components that evolve by erasing component $i$ from $b$ and $\pi$, respectively. Furthermore, let $a_i$ denote the $i$-th row of matrix $A$. Then, for every $1 \le i \le m$, we perform domain reduction with respect to the following Knapsack Problem:

$$\text{Maximize } (p^T - {}^i\pi^T\, {}^iA)\, x + {}^i\pi^T\, {}^ib \quad \text{subject to } a_i x \le b_i,\ x \in \{0, 1\}^n.$$

Thus, as a special application of CP-based Lagrangian relaxation, we achieve an effective filtering algorithm for binary IPs that runs in $\Theta(mn \log n)$ (using one of the knapsack filtering algorithms described in Section 2.5) after we have found optimal dual values of the continuous relaxation.
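The construction of the $m$ knapsack sub-problems can be sketched as follows. The function name `knapsack_subproblems` and the dictionary layout are hypothetical, and the dual values `pi` are assumed to be given, e.g. by an LP solver. The knapsack filtering itself is not shown.

```python
def knapsack_subproblems(A, b, p, pi):
    """For every row i of A x <= b, build the knapsack problem that keeps
    row i as a hard constraint and moves all other rows into the objective,
    weighted by their dual values pi."""
    m, n = len(A), len(p)
    problems = []
    for i in range(m):
        others = [k for k in range(m) if k != i]
        # modified profits: p - (pi without component i)^T (A without row i)
        profits = [p[j] - sum(pi[k] * A[k][j] for k in others) for j in range(n)]
        # constant term: (pi without component i)^T (b without component i)
        constant = sum(pi[k] * b[k] for k in others)
        problems.append({
            "profits": profits,
            "constant": constant,
            "weights": list(A[i]),
            "capacity": b[i],
        })
    return problems


if __name__ == "__main__":
    for prob in knapsack_subproblems([[2, 1], [1, 3]], [3, 4], [5, 4], [1, 2]):
        print(prob)
```

Each returned sub-problem can then be handed to a knapsack filtering routine together with the current bound on the objective.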

3.3.6 Column Generation vs. Lagrangian Relaxation

Finally, we compare the two reduction methods developed in the previous two sections. Column generation and Lagrangian relaxation are identical with respect to the structure of the sub-problems they investigate. The only difference lies in the way the sub-problems are obtained. In column generation, the penalties are determined by the duals of a linear program, the master problem, whereas in Lagrangian relaxation the penalties are updated with respect to subgradients or variants of them. In practice, an average master iteration in column generation is far more costly than in the Lagrangian relaxation setting, but in return far fewer iterations are necessary to obtain a satisfactory solution value.

The filtering methods we described take this important difference into account: In column

generation, we suggest to consider a CP-based tree search to solve the sub-problem, and to use

optimization constraints like the ones developed in the previous chapter to ensure the generation

of columns with negative reduced costs. The application of a tree search is affordable, because

the re-optimization of the master problem is rather costly itself, and because the total number of

master iterations is usually low.

On the other hand, we suggest to use Lagrangian relaxation within a tree search to obtain

lower bounds on the objective. Then, in every Lagrange iteration, we can use optimization

constraints for cost-based filtering that determine substructures of the optimization problem at

hand.

Chapter 4

Symmetry Breaking

In the previous chapters, we have studied the interplay between the objective functions and the

constraints of discrete optimization problems. Now we want to focus on a special aspect of the

constraint structure of many constraint satisfaction or optimization problems: Symmetry.

Symmetries can give rise to severe problems for exact and heuristic algorithms, as equivalent search regions are unnecessarily explored more than once. Generally, there are two ways of handling symmetries. The first one is to model the problem in such a way that no or at least fewer symmetries remain. This may also imply adding constraints that will only be satisfied by one assignment in each equivalence class. The major disadvantage of this approach is that it requires the user to have a certain level of experience, and sometimes it is not even possible to remove symmetries from a problem formulation as they are inherent to the given problem. The second way is to break symmetries while searching for a solution. This can be done by adding new constraints on backtracking. Those constraints can be used for domain filtering or, if the detection of symmetries appears to be rather expensive, only for pruning.

The standard approach for breaking symmetries is to model a given discrete optimization or constraint satisfaction problem in some clever and often non-intuitive way. These re-formulations are usually highly problem specific and not generic. In recent years, symmetry breaking was studied more systematically. In [181], Rothberg presents ways to remove symmetries from mixed integer problems by using cuts. Sherali and J.C. Smith discuss the effectiveness of adding constraints to a basic model in a number of case studies [201]. In [93], Gent and B. Smith develop a generic approach called Symmetry Breaking During Search (SBDS). In every choice point, SBDS may extend the model dynamically by adding symmetry breaking constraints. For the Social Golfer Problem, this approach has been shown to be efficient in combination with refined problem formulations which are used to remove some symmetry already in the model [202]. As the number of symmetries in the given problem is enormous, the approach presented is not able to detect all of them and thus also gives non-unique solutions. In [155], Meseguer and Torras introduce a symmetry avoiding approach that works by adapting the search strategy.


We introduce a method that also detects symmetries within the search procedure. Every time

the search algorithm generates a new choice point, we check if it is equivalent to or dominated

by a node that has been expanded earlier. If so, the current choice point can be pruned. If not, it

is processed normally. By checking whether a value assignment to a variable yields a symmetric

search node, we can also use symmetries to shrink the domains of variables. However, that

propagation can be very costly, and therefore it is not suited in all cases. As the method is based

on the detection of dominance relations between sub-trees, we call it Symmetry Breaking via

Dominance Detection (SBDD).

The method that we present in the following was also developed independently by Focacci

and Milano [81] who presented their work at the same conference. In a later work, the idea of

dominance detection between choice points was extended to achieve a method for the heuristic

pruning of search nodes when solving discrete optimization problems [82].

The work presented in this chapter was published in [63]. It is structured as follows: In

Section 4.1, we formally introduce the SBDD approach. In Sections 4.2, 4.3, and 4.4, it is

applied to three different examples from combinatorial optimization and combinatorial design.

Numerical results are given that illustrate the effectiveness of the approach.

4.1 Symmetry Breaking by Dominance Detection

The goal of breaking symmetries is to avoid the exploration of a search subspace that can be mapped into a previously considered part of the search space via a symmetry function. For if the previously considered part does not contain any solution, then neither does the symmetric subspace. Otherwise, all solutions in the symmetric subspace are symmetric to those already computed in the previously considered part. Thus, symmetries can be used to prune the search tree, and also to remove values from variable domains that would lead the search into a symmetric part of the search space. Before we outline the concept formally, we introduce some helpful definitions.

Definition 4.1 Let $X = \{x_1, \dots, x_n\}$ denote the set of variables of the model to solve, and let $D(x)$ denote the domain of a variable $x \in X$. The tuple $P_c = (D_c(x_1), \dots, D_c(x_n))$ denotes the current state in choice point $c$. We refer to the representation $P_c$ as a pattern.

Definition 4.2 Let $P_c = (D_c(x_1), \dots, D_c(x_n))$, $P_{c'} = (D_{c'}(x_1), \dots, D_{c'}(x_n))$ denote two patterns.

We say that $P_{c'}$ includes $P_c$ and write $P_c \subseteq P_{c'}$, iff $\forall x \in X: D_c(x) \subseteq D_{c'}(x)$.

We set M Dc:





   







Given a symmetry mapping function $\varphi: MD_c \to MD_c$, we say that $P_{c'}$ dominates $P_c$ (under the symmetry $\varphi$), iff $\varphi(P_c) \subseteq P_{c'}$. Then, we write $P_c \preceq P_{c'}$.
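Inclusion and dominance of patterns can be illustrated with a few lines of code. The sketch assumes that a pattern is represented as a tuple of frozensets, one per variable, and that a symmetry is given as a function on patterns. The names `includes`, `dominates` and `swap_values` are hypothetical, and the value-swapping symmetry is just an example.

```python
def includes(outer, inner):
    """True iff every domain of `inner` is a subset of the matching domain of `outer`."""
    return all(d_in <= d_out for d_in, d_out in zip(inner, outer))


def dominates(outer, inner, phi):
    """True iff `outer` dominates `inner` under the symmetry phi,
    i.e. the image of `inner` under phi is included in `outer`."""
    return includes(outer, phi(inner))


def swap_values(pattern):
    """Example symmetry for 0/1 domains: exchange the values 0 and 1."""
    return tuple(frozenset(1 - v for v in dom) for dom in pattern)


if __name__ == "__main__":
    explored = (frozenset({0}), frozenset({0, 1}))
    current = (frozenset({1}), frozenset({1}))
    print(includes(explored, current), dominates(explored, current, swap_values))
```

The current pattern is not included in the explored one, but it is dominated by it once the two values are exchanged.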






(a) (b) (c)

Fig. 4.1: The concept of SBDD.

Due to the monotonicity of filtering algorithms [6], we have the following

Property 4.1 Given two choice points $c$ and $c'$, where $c'$ is a successor of $c$ in the search tree. Then, it holds that: $P_{c'} \subseteq P_c$.

The approach that we suggest for pruning symmetric parts of the search space is very simple and straightforward, but to the best of our knowledge it has not been considered before. The method is based on the following ingredients:



• A database $T$ that stores information on the search space already explored.

• A problem specific function $\Phi: (P', P) \mapsto \{\text{false}, \text{true}\}$ that yields true iff the pattern $P'$ is dominated by $P$ under some symmetry function $\varphi$.

• If symmetries shall also be used for propagation, a similar function is needed that, for all variables $x$, removes all values $b$ from the domain of $x$ for which $\Phi(P'_{x = b}, P) = \text{true}$, where $P'_{x = b}$ denotes the pattern that evolves from $P'$ by assigning $b$ to $x$.

In every choice point, we check whether the current pattern is dominated by some previously considered pattern in $T$. If so, the current node is pruned. Otherwise, we can use the function $\Phi$ for propagation. Thus, we perform Symmetry Breaking via Dominance Detection (SBDD). Figure 4.1 visualizes the general procedure. White nodes are still active, black nodes have been fully expanded already. Boxes represent patterns in $T$, circles are patterns not or no longer contained in $T$. The current node carries a separate mark. Originally, the current pattern must be checked against all fully expanded nodes (see Figure 4.1(a)).

Obviously, it is problematic if we are to store all expanded nodes in $T$. In the next section, we describe how to handle $T$ efficiently for depth first search (DFS). Later we will generalize the result to arbitrary search strategies.


4.1.1 Efficient Realization in a Depth First Search

The key for an efficient realization of the general SBDD concept as described above is the observation that, within a DFS, we do not need to keep the information of all previously expanded nodes in the search tree. Instead, we can merge sibling entries in $T$ on backtracking, thus summarizing and compressing the information gathered.

Lemma 4.1 Let $c$ be a choice point with state $P_c = (D_c(x_1), \dots, D_c(x_n))$, and denote the states of the children $c_1, \dots, c_l$ of $c$ by $P_{c_k} = (D_{c_k}(x_1), \dots, D_{c_k}(x_n))$, $1 \le k \le l$. Finally, let $P_{c'}$ denote the state in choice point $c'$ with $P_{c'} \preceq P_{c_k}$ for some $1 \le k \le l$. Then, it holds that $P_{c'} \preceq P_c$.

Proof: Denote the symmetry function by $\varphi$ with $\varphi(P_{c'}) \subseteq P_{c_k}$. Then, with Property 4.1, we have that $\varphi(P_{c'}) \subseteq P_{c_k} \subseteq P_c$. Thus, $P_{c'} \preceq P_c$.

Using Lemma 4.1, SBDD in combination with DFS can be realized efficiently: We start with $T = \emptyset$ and process each choice point as follows:

1. Check the pattern $P_c$ of the current choice point $c$ against all patterns in $T$. If there exists a pattern $P \in T$ with $\Phi(P_c, P) = \text{true}$, then fail. (Alternatively encapsulate this function in a constraint and use it also for domain filtering.)

2. Process the current choice point.

3. On backtracking: If there are more sibling nodes to be expanded, then add the current

pattern to T, else delete all patterns of the sibling nodes from T.

When using DFS, the current pattern needs only be compared with patterns left-adjacent to the path from the root to the current node (see Figure 4.1(b)). Notice that step 2 refers to the normal processing of a choice point that also takes place when no additional symmetry breaking framework is utilized, including the choice of a branching constraint and the exploration of the children.

The efficiency of the approach depends on two parameters: the time needed to evaluate the function $\Phi$, and the number of such evaluations needed. Using the previous procedure, the number of patterns in $T$ is at most as large as the depth of the search tree times the cardinality of the largest domain.
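The three steps of the DFS realization can be sketched on a toy problem. The sketch assumes $n$ binary variables without further constraints, unary branching in a fixed variable order, and symmetries given as variable permutations. The names `dominated` and `sbdd_dfs` are hypothetical, and the explicit enumeration of permutations only serves as a stand-in for a problem specific function $\Phi$.

```python
from itertools import permutations


def dominated(p_new, p_old, symmetries):
    """Phi: True iff some variable permutation maps p_new into p_old."""
    return any(
        all(p_new[perm[i]] <= p_old[i] for i in range(len(p_old)))
        for perm in symmetries
    )


def sbdd_dfs(n, symmetries):
    """Enumerate 0/1 assignments of n variables, pruning dominated choice points."""
    T = []          # database of patterns left-adjacent to the current path
    solutions = []

    def process(pattern, depth):
        # Step 1: dominance check against all patterns in T
        if any(dominated(pattern, p, symmetries) for p in T):
            return
        # Step 2: normal processing of the choice point
        if depth == n:
            solutions.append(tuple(min(d) for d in pattern))
            return
        added = 0
        for value in (0, 1):
            child = pattern[:depth] + (frozenset({value}),) + pattern[depth + 1:]
            process(child, depth + 1)
            # Step 3: on backtracking, store the child while siblings remain
            if value == 0:
                T.append(child)
                added += 1
        # last sibling finished: delete the sibling patterns again
        del T[len(T) - added:]

    process(tuple(frozenset({0, 1}) for _ in range(n)), 0)
    return solutions


if __name__ == "__main__":
    print(sbdd_dfs(3, list(permutations(range(3)))))
```

With all variable permutations as symmetries, exactly one assignment per number of ones survives. With the identity as the only symmetry, nothing is pruned.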

4.1.2 Arbitrary Search Strategies

With respect to the importance of the size of $T$, at first it seems to be impractical to combine SBDD with search strategies other than DFS, because the number of previously expanded nodes, and thus the size of $T$, may be enormous. Another possibility is that the symmetry breaking method becomes ineffective, because many nodes are closed late, which is the case for breadth first search, for instance.

Nevertheless, with a slight modification, it is possible to cope with general search strategies. Let $c$ denote the current choice point, and $P_c$ the corresponding pattern. The idea now is to check whether a symmetry function maps $P_c$ to a pattern of a choice point $c'$ that would have been processed before $c$ if DFS had been applied on a static ordering of the branching constraints (see Figure 4.1(c)). If so, $c$ is rejected, otherwise we proceed normally. That way, we prune the tree because we detect that the work has either been carried out already or because we decide to do it later. Notice that the current path in the search tree contains all information necessary to identify the patterns that are relevant for checking. The assumption of a static branching constraint ordering defines a virtual ordering of all choice points. The approach rejects the current choice point iff a dominating pattern exists left of it in a virtual DFS tree, i.e. iff the current choice point has a later virtual DFS closing time stamp. As an exhaustive search will eventually consider the leftmost nodes as well, we can be sure not to miss a solution.
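That the current path suffices to identify the relevant patterns can be sketched as follows. The sketch assumes unary branching decisions of the form "variable = value", a static ordering that tries smaller values first, and no propagation, so that domains change only through branching. The function name `left_adjacent_patterns` is hypothetical.

```python
def left_adjacent_patterns(domains, path):
    """Patterns of the virtual left siblings of all nodes on the current path.

    domains: initial domains, one iterable of values per variable
    path:    branching decisions (variable index, value) from the root downwards
    """
    current = [frozenset(d) for d in domains]
    result = []
    for var, value in path:
        for other in sorted(current[var]):
            if other >= value:
                break
            # a virtual DFS would have explored var = other before var = value
            sibling = list(current)
            sibling[var] = frozenset({other})
            result.append(tuple(sibling))
        current[var] = frozenset({value})
    return result


if __name__ == "__main__":
    for pattern in left_adjacent_patterns([[0, 1], [0, 1, 2]], [(0, 1), (1, 2)]):
        print(pattern)
```

The current choice point is rejected if one of the returned patterns dominates it, regardless of whether the corresponding node has actually been expanded yet.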

Note that the search strategy is slightly affected by this procedure, because the exploration of

choice points can be postponed by the symmetry breaking algorithm. This side-effect is clearly

not desirable. However, one might expect that a reasonable search strategy rates symmetric parts

of the search tree as equally important. In that case, the expanding of the current choice point is

only postponed formally, but in fact is carried out next in a symmetric version.

4.1.3 A Different Representation of Choice Points

In Definition 4.1, we have defined a pattern with respect to the current state of the domains. We

could also have used the current set of constraints including the branching decisions taken to

identify a choice point. When defining symmetry detection functions on pairs of sets of con-

straints, we can again detect symmetries between choice points. Note that Property 4.1 and

Lemma 4.1 are still valid in this setting. Therefore, the idea of SBDD can also be realized

efficiently when using a constraint representation of a choice point.

Such a representation may be favorable with respect to non-unary branching constraints, and

also with respect to the efficiency of the evaluation of the symmetry detection function [174].

However, functions based on this representation tend to be less intuitive. Therefore, in the exam-

ples that we consider in the following, we will use the definition of a pattern as in Definition 4.1

and in combination with unary branching constraints only.

After having outlined the general approach, in the following sections we apply it to three

different applications in the field of combinatorial optimization and constraint satisfaction.


Fig. 4.2: DeBruijn networks of dimension 3 (left) and 4 (right). A node is marked by the binary string

corresponding to its number. The dashed lines mark the symmetries of the DeBruijn network.

4.2 DeBruijn Graph Bisection

The first application of the method described in Section 4.1 that we present is the Graph Bisection Problem. Given an undirected graph $G = (V, E)$, the Graph Bisection Problem asks for a set $S \subseteq V$ such that the cardinalities of $S$ and $S^C$ differ at most by one, and the number of edges between both sets is minimal. This optimal number is often referred to as the bisection width of the graph. Graph Bisection is known to be NP-hard; exact solutions can only be computed for small graphs, typically $|V| \le 200$. Interestingly, Graph Bisection alone already induces a symmetry as the sets $S$ and $S^C$ can be exchanged.

An obvious symmetry breaking strategy in this case is the initial assignment of a node to the set $S$. However, if the graph $G$ to be partitioned is itself symmetric, such an assignment does not break the resulting combined symmetries.

In parallel computing, connection networks are typically nicely structured and their symme-

tries are known. Graphs of the hypercube family have been studied intensively (see [19, 140]).

One popular network is the so-called DeBruijn network which is defined as follows:

Definition 4.3 The DeBruijn Network of dimension $k$ is a directed graph $DB(k) = (V_k, E_k)$ with $V_k = \{0, \dots, 2^k - 1\}$. The edge set can be described best by associating the nodes with their corresponding binary representation, i.e. $V_k = \{0, 1\}^k$. Then,

$$E_k = \{ (b\alpha, \alpha b),\ (b\alpha, \alpha \bar{b}) \mid \alpha \in \{0, 1\}^{k-1},\ b \in \{0, 1\} \},$$

where $\bar{b}$ denotes inverting bit $b$, i.e. $\bar{b} = 1 - b$.






In the following, for the Graph Bisection Problem, we will interpret any directed arc of DB(k) as an undirected edge. Then DB(k) contains $2^k$ nodes, each having degree 4, and $2^{k+1}$ edges. Furthermore, DB(k) contains 3 symmetries described by the following automorphisms:

$$\sigma_1: V_k \to V_k,\quad (b_{k-1}, \dots, b_0) \mapsto (b_0, \dots, b_{k-1}),$$

$$\sigma_2: V_k \to V_k,\quad (b_{k-1}, \dots, b_0) \mapsto (\bar{b}_{k-1}, \dots, \bar{b}_0),$$

$$\sigma_3: V_k \to V_k,\quad (b_{k-1}, \dots, b_0) \mapsto (\bar{b}_0, \dots, \bar{b}_{k-1}).$$

Symmetries $\sigma_1$, $\sigma_2$ and $\sigma_3$ are visualized in Figure 4.2, where DB(3) and DB(4) are shown.
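The symmetries of the undirected DeBruijn graph can be checked mechanically for small dimensions. The sketch below builds the edge multiset of DB(k) from the shift structure of the arcs and tests whether a node mapping preserves it. It uses bit reversal, bitwise complement and their composition as candidate mappings, without assuming which of them carries which index in the text. All function names are hypothetical.

```python
from collections import Counter


def debruijn_edges(k):
    """Multiset of undirected edges of DB(k); nodes are the integers 0 .. 2^k - 1."""
    mask = (1 << k) - 1
    edges = Counter()
    for u in range(1 << k):
        for bit in (0, 1):
            v = ((u << 1) & mask) | bit   # drop the leading bit, append a new one
            edges[(min(u, v), max(u, v))] += 1
    return edges


def complement(u, k):
    return u ^ ((1 << k) - 1)


def reverse(u, k):
    return int(format(u, f"0{k}b")[::-1], 2)


def is_automorphism(sigma, k):
    """True iff the node mapping sigma maps the edge multiset onto itself."""
    edges = debruijn_edges(k)
    image = Counter()
    for (u, v), mult in edges.items():
        a, b = sigma(u, k), sigma(v, k)
        image[(min(a, b), max(a, b))] += mult
    return image == edges


if __name__ == "__main__":
    both = lambda u, k: complement(reverse(u, k), k)
    print(sum(debruijn_edges(3).values()),
          [is_automorphism(s, 3) for s in (reverse, complement, both)])
```

Self-loops and parallel edges are kept in the multiset, which is why the edge count equals $2^{k+1}$.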

4.2.1 Bisection Width of the DeBruijn Graph

It can be shown that the bisection width of DB(k) is in $\Theta(2^k / k)$, but there are only few results known for specific dimensions. In [70], an optimal bisection width of 30 for DB(7) has been computed. At the time that paper was written, the algorithm based on LP bounds ran for about two weeks. To our knowledge, no exact bisection widths for bigger DeBruijn graphs were known at that time.

In [197], Sensen improved the well-known bound based on clique embeddings by introducing variable multicommodity flows. Using interior point methods for the resulting linear programs, he was able to prove an exact bisection width of 54 for DB(8). SBDD was used to prevent the consideration of symmetric parts of the search space. We refer the reader to [197] for details on the overall approach. Here, we concentrate on the breaking of symmetries. We use this example to show an easy application of SBDD rather than to underline its efficiency. For comparisons with SBDS, we refer the reader to Sections 4.3 and 4.4.

4.2.2 Symmetry Breaking for the Bisection of DeBruijn Graphs

When bisecting DeBruijn graphs, seven symmetries have to be encoded in $\Phi$. They stem from the three automorphisms of the graph itself, the exchange of $S$ and $S^C$, and the combination of these symmetries.

For the Graph Bisection Problem, a pattern is implemented as an $n$-tuple $p \in \{0, 1, *\}^n$. $p_i = 0$ ($p_i = 1$) means that node $i \in S$ ($i \in S^C$). $p_i = *$ means that node $i$ has not been assigned yet. The symmetry functions $\varphi_1, \dots, \varphi_7$ permute the nodes according to $\sigma_1$, $\sigma_2$ or $\sigma_3$ and/or invert the entries. A pattern $P' = (p'_0, \dots, p'_{n-1})$ is dominated by $P = (p_0, \dots, p_{n-1})$ iff there is a symmetry function $\varphi_k$, $1 \le k \le 7$, such that, for all $0 \le i < n$, it holds that $(\varphi_k(P'))_i = p_i$ or $p_i = *$.

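It is also possible to use pattern information for propagation. Assume that there is a symmetry function $\varphi_k$ and an index $j$, $0 \le j < n$, with $p'_j = *$, such that the condition $(\varphi_k(P'))_i = p_i$ or $p_i = *$ holds for all entries except the one that node $j$ is mapped to, and that this condition would also hold for the remaining entry if node $j$ were assigned the value 0 (or 1). Then we can enforce that node $j$ is in $S^C$ (or $S$, respectively).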
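The dominance test for bisection patterns can be sketched as follows. Unassigned entries are represented by `None`, and a symmetry is a pair of a node permutation and a flag that states whether the entries are inverted. The four-node example and its permutation are hypothetical and only stand in for the automorphisms of a concrete graph. The function names are illustrative.

```python
def apply_symmetry(pattern, perm, invert):
    """Entry i of the result is entry perm[i] of `pattern`, inverted if requested."""
    out = []
    for i in range(len(pattern)):
        v = pattern[perm[i]]
        out.append(v if v is None or not invert else 1 - v)
    return tuple(out)


def dominated(p_new, p_old, symmetries):
    """True iff some symmetry maps p_new onto a pattern that agrees with p_old
    in every entry that is assigned in p_old."""
    for perm, invert in symmetries:
        image = apply_symmetry(p_new, perm, invert)
        if all(o is None or o == m for m, o in zip(image, p_old)):
            return True
    return False


if __name__ == "__main__":
    ident, rev = (0, 1, 2, 3), (3, 2, 1, 0)       # hypothetical node permutations
    symmetries = [(ident, True), (rev, False), (rev, True)]
    explored = (0, None, None, None)              # node 0 in S, fully explored
    print(dominated((1, None, None, None), explored, symmetries),
          dominated((None, 0, None, None), explored, symmetries))
```

Putting node 0 into the complement set is recognized as symmetric to the explored pattern, whereas assigning node 1 is not.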


Fig. 4.3: The search tree when bisectioning DB(8) without breaking any symmetries.

Fig. 4.4: The search tree for the bisection of DB(8) when breaking all possible symmetries. Chains of

choice points with only one successor result from symmetry-based domain filtering.

Figures 4.3 and 4.4 show the different branching trees resulting from a computation of the bisection width of DB(8) without and with breaking symmetries. Huge parts of the solution space are cut off by lower bound information. Thus, many symmetric sub-trees are pruned early, thereby diminishing the effect of symmetry breaking. However, since the effort per choice point in this approach is very high due to expensive bound computations (about 14 minutes per choice point), even small reductions of the tree size improve the overall performance significantly. Thus, for the computation of the bisection width of DB(8), the breaking of symmetries was able to reduce the running time by roughly 2 days, leaving an overall computation time of 37.5 hours.

In Chapter 9, we consider the Graph Bisection Problem in more detail and develop an ap-

proximation scheme for the efficient computation of Sensen’s lower bound. The approximation

of the bound is fast, but less accurate, which results in far larger search trees. In this environment,

when bisectioning DeBruijn graphs, the reduction of choice points is even more visible.

4.3 The Social Golfer Problem

We also applied SBDD to find solutions for the Social Golfer Problem. We study that problem

in detail in Chapter 8. Therefore, here we introduce it only very briefly. The original question

was posed as follows (Problem 10 in CSPLib [47]):

32 golfers want to play in 8 groups of 4 each week, in such way that any two golfers

play in the same group at most once. How many weeks can they do this for?


The problem can be generalized by parameterizing it to $w$ weeks and $g$ groups of $s$ players each, written as $g$-$s$-$w$ from now on.¹ In case of $w(s - 1) = gs - 1$, we achieve a specification where every player must play with every other exactly once. This problem is also known as the Schoolgirl Problem (see Chapter 8).

4.3.1 Symmetries in the Social Golfer Problem

Obviously, there is a lot of symmetry in the problem. First, players can be placed at any position within a group ($\varphi_P$), groups can be exchanged within their week ($\varphi_G$), and also the weeks can be ordered arbitrarily ($\varphi_W$). Furthermore, the players can be permuted ($\varphi_X$).

Following the idea that symmetry detection should also work well in combination with simple models, we have chosen a straightforward one that can be implemented with little effort using the ILOG SOLVER [121] environment. The groups are modeled as sets of players with the cardinality of each set fixed to $s$. Each week contains $g$ such sets, and the full pattern covers $w$ weeks. To shrink the search space, we fix all players in the first week in increasing order. Additionally, we insert the first $s$ players into the first $s$ groups for all weeks thereafter. Finally, the first group of the second week is filled with the smallest players possible. All these assignments can be made without increasing the complexity of the model or losing unique solutions.

4.3.2 Symmetry Breaking for the Social Golfer Problem

By using set variables for each group, the model does not contain symmetry $\phi_P$ anymore. To

detect the domination of patterns with respect to the other symmetries, we describe three symmetry

detection functions $\Phi_G$, $\Phi^W_G$, and $\Phi^W_X$ that are used during the search. Function $\Phi^W_G$

includes the checks performed by $\Phi_G$, and $\Phi^W_X$ includes those done by $\Phi^W_G$.



$\Phi_G$: Given two week indices $1 \le i, j \le w$, $\Phi_G$ is used to check if week $i$ of pattern $P^1$ dominates

week $j$ of pattern $P^2$ with respect to symmetry $\phi_G$. This is done by checking whether all

players of week $i$ of pattern $P^1$ can be mapped to week $j$ of pattern $P^2$. In the example

shown in Figure 4.5, week 3 of pattern $P^1$ cannot be mapped to week 2 of pattern $P^2$

because players 2 and 3 are in the same group in pattern $P^1$, but are in different groups in

pattern $P^2$. Week 1 of pattern $P^1$ also cannot be mapped to week 2 of pattern $P^2$, because

player 8 has no matching partner. However, week 2 of pattern $P^1$ can be mapped to week

2 of pattern $P^2$.



¹ In the original problem, it is clear that the golfers cannot play for more than 10 weeks. On the other hand, a

solution for 5 weeks can be found easily without backtracking by always choosing the first possible player for a

group in each week. Meanwhile, a 9 week solution has been found, but it remains unclear whether there exists a

10 week solution or not.


Fig. 4.5: The left hand side shows two patterns $P^1$ and $P^2$. Each pattern consists of three weeks

(horizontal) of three groups of three players. Unfixed variables are left empty. On the right hand side, the

corresponding bipartite graph is shown, containing a node for each week of both patterns. Since a

matching of cardinality 3 exists (bold edges), $P^2$ is dominated by $P^1$.



$\Phi^W_G$: To break symmetries $\phi_W$ and $\phi_G$, function $\Phi^W_G$ constructs a bipartite graph $G$ containing

a node for each week of $P^1$ and $P^2$. An edge is inserted iff a week of $P^1$ dominates a

week of $P^2$, which is determined using $\Phi_G$. If $G$ contains a matching of cardinality $w$, i.e.,

a perfect matching, $P^1$ dominates $P^2$. Again, Figure 4.5 shows an example.
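The week-wise check and the matching step can be sketched in Python as follows. This is only an illustration under our own assumptions: a pattern is a list of weeks, a week is a list of groups, a partially filled group contains just its fixed players, and "can be mapped" is read as "every non-empty group of the dominating week is contained in a group of its own in the other week". The names `week_dominates` and `pattern_dominates` are hypothetical, and the matching is found with a plain augmenting-path search rather than whatever routine the original implementation used. Symmetry $\phi_X$ is not handled.

```python
from typing import Dict, FrozenSet, List, Set

Week = List[FrozenSet[int]]   # groups of one week (partial groups hold only fixed players)
Pattern = List[Week]          # weeks of one pattern


def week_dominates(week_a: Week, week_b: Week) -> bool:
    """Assumed reading of the group check: each non-empty group of week_a
    fits into a distinct group of week_b (group order is irrelevant)."""
    used: Set[int] = set()
    for group in week_a:
        if not group:
            continue  # nothing fixed yet, so no requirement
        # Groups of a week are disjoint, so at most one group can contain `group`.
        target = next((k for k, other in enumerate(week_b) if group <= other), None)
        if target is None or target in used:
            return False
        used.add(target)
    return True


def pattern_dominates(pat_a: Pattern, pat_b: Pattern) -> bool:
    """Assumed reading of the week check: the bipartite 'week dominates week'
    graph must contain a matching that covers every week of pat_a."""
    adj = [[j for j, wb in enumerate(pat_b) if week_dominates(wa, wb)] for wa in pat_a]
    match_of_b: Dict[int, int] = {}  # week of pat_b -> matched week of pat_a

    def augment(i: int, seen: Set[int]) -> bool:
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in match_of_b or augment(match_of_b[j], seen):
                match_of_b[j] = i
                return True
        return False

    return all(augment(i, set()) for i in range(len(pat_a)))


def fs(*players: int) -> FrozenSet[int]:
    return frozenset(players)


# Two small made-up patterns with two weeks of three groups each.
pat_a = [[fs(0, 1, 2), fs(3, 4, 5), fs(6, 7, 8)], [fs(0, 3), fs(1), fs(2)]]
pat_b = [[fs(0, 3, 6), fs(1, 4, 7), fs(2, 5, 8)], [fs(0, 1, 2), fs(3, 4, 5), fs(6, 7, 8)]]
print(pattern_dominates(pat_a, pat_b))
print(pattern_dominates(pat_b, pat_a))
```

In this made-up example the weeks of `pat_a` can be matched crosswise to the weeks of `pat_b`, whereas the first week of `pat_b` fits into no week of `pat_a`.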

$\Phi^W_X$: Incorporating also the last symmetry $\phi_X$ results in a huge computational effort, as $\Phi^W_G$

has to be applied for $(g \cdot s)!$ different permutations. To reduce the cost of this check, we use

the fact that the first week of a pattern is always complete due to the fixed entries. Since it

has to be matched to some other week, "only" $w \cdot (s!)^g \cdot g!$ possibilities are left. However,

the test remains expensive. Therefore, we tried some variations reducing the frequency

with which $\Phi^W_X$ is applied. A parameter $q$ can be set to restrict full symmetry checks to every

$q$-th level of the search tree. Optionally, it can be limited to be performed on full patterns,

i.e. leaves, only, which is the default.
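To get a feeling for the saving, the two counts can be compared numerically. The sketch below assumes the following reading: without further knowledge all (g·s)! player permutations have to be tried, whereas fixing the image of the complete first week leaves a choice of the target week (w ways), of the order of the groups (g! ways), and of the order of the players inside each group (s! ways per group). The function names are ours.

```python
from math import factorial


def all_player_permutations(g: int, s: int) -> int:
    """Number of permutations of all g*s players."""
    return factorial(g * s)


def first_week_mappings(g: int, s: int, w: int) -> int:
    """Number of ways to map the complete first week onto some week."""
    return w * factorial(s) ** g * factorial(g)


print(all_player_permutations(4, 4))  # 16 players
print(first_week_mappings(4, 4, 4))   # instance 4-4-4
```

For the 4-4-4 instance this reduces the number of candidate permutations from about 2·10^13 to about 3·10^7, which is still large and explains why the check is applied sparingly.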

4.3.3 Numerical Results

The model described has been implemented in ILOG SOLVER 5.0 [121] and run for different

configurations on a Sun Enterprise 450 (400 MHz UltraSparc-II). Tables 4.1 and 4.2 show the

results of the experiments. Apart from the time (in seconds) needed to find the first solution ($t_1$)

and the time to find all solutions ($t_{all}$), the number of calls to the symmetry detection functions

$\Phi^W_G$ and $\Phi^W_X$ is given. In the sym-section, $\Phi^W_G$ is applied to check for symmetries $\phi_W$

and $\phi_G$ in each node of the search tree. Since symmetries $\phi_X$ are not detected, there are many

non-unique solutions found. In the nosym-section, $\Phi^W_G$ is also applied in every node of the


problem  solutions  $t_1$  $t_{all}$  $\Phi^W_G$  $\Phi^W_X$  symmetries      cp   fails

sym
4-3-2           48   0.00       0.03         226           0           0     195     148
4-3-3         2688   0.02       6.09       99454           0           0   28299   25612
4-3-4         1968   0.05      26.70      382120           0        2808   94845   92878
4-3-5            0   0.00      36.34      412456           0        3120  100389  200390

nosym
4-3-2            1   0.00       0.04         226          47          47     195     194
4-3-3            4   0.01      10.00       99454        2687        2684   28299   28296
4-3-4            3   0.04      29.18      382120        1967        4773   94845   94843
4-3-5            0   0.00      36.28      412456           0        3120  100389  200390

Tab. 4.1: Results of the golfer 4-3-X problem.

search tree, and additionally $\Phi^W_X$ is applied in leaves, preventing symmetric solutions from

being written out. The tables continue with the number of detected symmetries (symmetries), the

number of choice points (cp), and the number of fails.

Since invoking the symmetry detection function $\Phi^W_X$ is computationally very expensive,

applying it in every search node does not improve the overall run-time, although the number of

choice points is reduced. Clearly, there is a trade-off between the reduction of choice points and

the effort spent for the detection of symmetries. We have tested a scheme that applies $\Phi^W_X$

not only in leaves but also performs additional checks for all symmetries in every node of every

$q$-th level of the search tree. Table 4.3 shows that invoking $\Phi^W_X$ too often rather increases

the overall run-time, but applying it too rarely (e.g., only in leaves) is not the best choice either.

For the 4-4-4 instance, an invocation in about every 8-th level has proven to be best. Similar

observations have been made for other instances as well. Table 4.4 shows the improved running

times for the 4-4-X instances.

4.3.3.1 SBDS versus SBDD

In [202], an SBDS approach is developed for the Social Golfer Problem. To break symmetries,

SBDS inserts additional constraints to the model during the search, and hands them over to

the solver. Due to the large amount of symmetries in the Social Golfer Problem, the approach

presented is not able to add all constraints necessary to break all symmetries.

Therefore, different models for the Social Golfer Problem are discussed. In combination

with more complex models that break several symmetries themselves, SBDS performs well and

is able to reduce the number of choice points significantly. However, the approach presented

in [202] is not able to only compute unique solutions. Moreover, the general approach for the


problem  solutions  $t_1$  $t_{all}$  $\Phi^W_G$  $\Phi^W_X$  symmetries      cp   fails

sym
4-4-2          216   0.00       0.09         735           0           0     555     340
4-4-3         5184   0.01       8.71       74175           0           0   43755   38572
4-4-4         1296   0.01      20.53      140595           0        1296   82635   81340
4-4-5          432   0.01      25.90      132531           0        2160   75723   75292
4-4-6            0   0.00      30.76      114027           0           0   72267   72268

nosym
4-4-2            1   0.01       0.17         735         215         215     555     555
4-4-3            2   0.01     136.31       74175        5183        5182   43755   43754
4-4-4            1   0.01      22.09      140595        1295        2591   82635   82634
4-4-5            1   0.02      26.51      132531         431        2591   75723   75723
4-4-6            0   0.00      30.71      114027           0           0   72267   72268

Tab. 4.2: Results of the golfer 4-4-X instance.

level of $\Phi^W_X$  solutions  $t_1$  $t_{all}$  $\Phi^W_G$  $\Phi^W_X$  symmetries      cp   fails

nosym
1                            1   0.01     698.51           0          26          18      82      82
2                            1   0.02     271.35          29          27          24     123     123
4                            1   0.02     101.26         156          79          79     339     339
8                            1   0.01      14.51        5292        1296        1296    4730    4730
leaves                       1   0.01      22.09      140595        1295        2591   82635   82634

Tab. 4.3: Results of the golfer 4-4-4 instance performing additional checks for symmetry $\phi_X$ in search

nodes of every q-th depth.

problem  solutions  $t_1$  $t_{all}$  $\Phi^W_G$  $\Phi^W_X$  symmetries      cp   fails

nosym, level of $\Phi^W_X$ = 8
4-4-2            1   0.00       0.17         735         215         215     555     555
4-4-3            2   0.01     134.10        5283        1298        1297    6492    2891
4-4-4            1   0.01      14.51        5292        1296        1296    4730    4730
4-4-5            1   0.02      15.68        5291        1295        1296    4722    4722
4-4-6            0   0.00      17.16        5290        1295        1295    4714    4715

Tab. 4.4: Improved results of the golfer 4-4-X performing additional checks for symmetry $\phi_X$ in search

tree nodes of every 8-th depth.


Social Golfer Problem is not able to tackle larger instances like the golfers 5-3-7 efficiently. Only

in combination with a model designed for the specific case of the Schoolgirl Problem, a solution

is found.

Using SBDD for the Social Golfer Problem, it is possible to find unique solutions only.

Additionally, it also works in combination with very simple models. Obviously, the performance

of the approach that we presented for the Social Golfer Problem can be further improved by

using more sophisticated problem formulations (see Chapter 8). However, here we wanted to

demonstrate that SBDD can also be used efficiently by inexperienced users and in combination

with simple models. We believe that the symmetry breaking method that we developed is so

easy to use because all it requires is the definition of the pattern structure and of the function

that checks whether a pattern dominates another or not. Thus, the user can think of symmetries

algorithmically rather than in terms of constraints.

4.4 The n-Queens Problem

Finally, we consider the classical n-Queens Problem. It consists in placing n queens on an n × n

chessboard such that no two queens can capture each other. That is, no two queens are allowed

to be placed on the same row, the same column, or the same diagonal.

Nowadays constraint programming approaches are able to find one solution for 1000-Queens

in a few seconds. Asking for all non-symmetric solutions of n-Queens requires more effort. In the

following, we describe the SBDS approach of Gent and B. Smith [93] on the n-Queens Problem

and compare it with SBDD.

4.4.1 Symmetry Breaking for the n-Queens Problem

It is easy to see that the n-Queens Problem incorporates seven symmetries, namely reflections

in the horizontal and vertical axis, reflections in the main diagonals, and rotations through

90°, 180°, and 270°.
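The seven symmetries can be written down as maps on board cells and used to count solutions up to symmetry. The following Python sketch does this by brute force for small n; it is meant as an illustration only and is unrelated to the SBDS and SBDD implementations compared below. A solution is represented as a tuple whose entry r is the column of the queen in row r (rows and columns numbered from 0), and all function names are our own.

```python
from itertools import permutations


def board_symmetries(n):
    """Identity plus the seven symmetries, as maps on (row, column) cells."""
    m = n - 1
    return [
        lambda r, c: (r, c),          # identity
        lambda r, c: (m - r, c),      # reflection in the horizontal axis
        lambda r, c: (r, m - c),      # reflection in the vertical axis
        lambda r, c: (c, r),          # reflection in the main diagonal
        lambda r, c: (m - c, m - r),  # reflection in the other diagonal
        lambda r, c: (c, m - r),      # rotation through 90 degrees
        lambda r, c: (m - r, m - c),  # rotation through 180 degrees
        lambda r, c: (m - c, r),      # rotation through 270 degrees
    ]


def canonical(p):
    """Lexicographically smallest image of solution p under all symmetries."""
    images = []
    for f in board_symmetries(len(p)):
        cells = sorted(f(r, c) for r, c in enumerate(p))
        images.append(tuple(c for _, c in cells))
    return min(images)


def solutions(n):
    """All placements with distinct columns and distinct diagonals."""
    for p in permutations(range(n)):
        if (len({c + r for r, c in enumerate(p)}) == n
                and len({c - r for r, c in enumerate(p)}) == n):
            yield p


def count_all_and_unique(n):
    sols = list(solutions(n))
    return len(sols), len({canonical(p) for p in sols})


print(count_all_and_unique(7))
```

For n = 7 this yields 40 solutions that fall into 6 symmetry classes, in line with Figure 4.6.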



We use the following standard model for n-Queens:

• Each row $i$, $0 \le i \le n-1$, is represented by an integer variable $x_i$. Assigning $x_i = j$ corresponds to placing a queen in row $i$ and column $j$.

• Additional integer variables $y_i$ and $w_i$, $0 \le i \le n-1$, are used to check the diagonals of the chessboard. We post the constraints $y_i = x_i + i$, $w_i = x_i - i$.

• The domains are $x_i \in \{0, \dots, n-1\}$, $y_i \in \{0, \dots, 2n-2\}$, $w_i \in \{-n+1, \dots, n-1\}$.

• AllDiff constraints on $x$, $y$, and $w$ ensure that no two queens can capture each other.
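A model of this kind can be mimicked in a few lines of plain Python. The sketch below is not the ILOG SOLVER model: it assumes the usual diagonal encoding (the sum of column and row index for one diagonal direction, their difference for the other), replaces each AllDiff constraint by a set of already used values, and enumerates rows in a fixed order by simple backtracking.

```python
def count_queens(n: int) -> int:
    """Count all placements of n queens with pairwise distinct x, y and w values."""
    used_x, used_y, used_w = set(), set(), set()

    def place(i: int) -> int:
        if i == n:
            return 1
        total = 0
        for j in range(n):
            y, w = j + i, j - i  # assumed diagonal encoding for row i, column j
            if j in used_x or y in used_y or w in used_w:
                continue  # one of the three all-different conditions would be violated
            used_x.add(j); used_y.add(y); used_w.add(w)
            total += place(i + 1)
            used_x.remove(j); used_y.remove(y); used_w.remove(w)
        return total

    return place(0)


print([count_queens(n) for n in range(4, 9)])
```

The counts for n = 4, ..., 8 are 2, 10, 4, 40 and 92, which agree with the number of solutions reported without symmetry breaking in Table 4.5.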


Fig. 4.6: Six out of 40 solutions of 7-queens are unique.

4.4.1.1 SBDS

In [93], SBDS is introduced first and tested on a variety of problems. The approach is general

and compatible with different search strategies. A user of the concept only needs to provide

symmetry functions mapping a single assignment to its symmetric version.

In a choice point where we set $x = v$ on the left and $x \neq v$ on the right branch, SBDS adds

all constraints that are necessary to prevent the solver from exploring a sub-tree symmetric to an

already investigated one. By keeping track of all previously broken symmetries, only necessary

constraints are posted, thus keeping the overhead small.

4.4.1.2 SBDD

For the n-Queens Problem, a pattern $p$ is an n-tuple where $p_i$ is the column number in which

the queen covering row $i$ is placed, or, in case the position of the queen in row $i$ has not been

set yet, $p_i$ = […]. E.g., the pattern corresponding to the first chessboard in Figure 4.6 is $p$ = […].









4.4.2 Numerical Results

In contrast to the algorithm we developed for the Social Golfer Problem, here we also use sym-

metry for domain filtering. A constraint is posted to the model that keeps track of the current

situation in the search. As propagation turned out to be rather expensive, we limited the number

of calls to the propagation routine to one.

We also implemented a version of SBDS and tested it on the model described above. Both

codes were running on the same Sun Enterprise as the program for the Social Golfer Problem in

Section 4.3.

Table 4.5 compares the number of solutions, the number of fails, and the computation time for

calculating all solutions (sym), calculating only unique solutions via SBDS, and unique solutions

using SBDD. We omit the number of solutions for SBDD as it is identical to SBDS. The results

given for SBDS are similar to those given in [93]. Only the number of fails slightly differs, which

we believe to be caused by small variations in the implementation and the different CP engines

used (ILOG SOLVER 4.3 vs. ILOG SOLVER 5.0).


      sym                                 SBDS                              SBDD
 n    solutions      fails      time     solutions     fails     time        fails     time

 4            2          4      0.01             1         3     0.00            6     0.00
 5           10          4      0.00             2         4     0.00           13     0.00
 6            4         35      0.01             1        11     0.02           31     0.01
 7           40         69      0.02             6        19     0.01           56     0.02
 8           92        289      0.04            12        63     0.01          130     0.03
 9          352       1111      0.16            46       216     0.04          397     0.08
10          724       5072      0.57            92       851     0.13         1464     0.29
11         2680      22124      2.49           341      3808     0.53         5991     1.26
12        14200     103956     11.88          1787     17673     2.52        27731     6.27
13        73712     531401     61.56          9233     89534    12.55       140348    33.11
14       365596    2932626    337.00         45752    483214    69.62       746530   189.07
15      2279184   16920396   1946.07        285053   2784876   403.16      4391877  1213.36
16     14772512  105445065  12154.60       1846955  17277508  2608.51     27153758  7463.62

Tab. 4.5: Solving n-Queens without breaking symmetries (sym), with breaking symmetries via SBDS, and

by avoiding them via SBDD. Computing times are given in seconds.

Obviously, SBDD does not perform as well as SBDS on the n-Queens Problem. The reason

for this is that the number of symmetries is fairly small. The difference between SBDS and

SBDD can be viewed as follows: SBDS iterates through the symmetries of a problem and adds

symmetry breaking constraints if necessary. SBDD on the other hand iterates through the choice

points expanded earlier. The latter approach is clearly favorable if the number of symmetries

is very high (like for the Social Golfer Problem, for example). However, when the number of

symmetries is very limited (as is the case for the n-Queens Problem), it is much more efficient to

add some few additional symmetry breaking constraints on backtracking.

4.5 Summary

We have suggested an approach for breaking symmetries that is based on the detection of dom-

inance relations between choice points. The method is generally applicable and works in com-

bination with all exhaustive search strategies while it may overrule strategies other than DFS.

Moreover, it removes symmetric parts of the search tree efficiently in combination with any

model. Thus, it can also be easily used by inexperienced users on straightforward models that do

not break symmetries themselves.

The ease of use mainly results from the fact that it is only necessary to define the pattern

structure and a function that checks if one pattern dominates another. This algorithmic approach

allows somewhat more flexibility than a model that breaks symmetries itself, as has been demonstrated

for the Social Golfer Problem when adapting the frequency of constraint propagation for

certain symmetries.

The method has shown to be easily applicable without causing a big implementation overhead

on three very different applications from combinatorial optimization and constraint satisfaction.

Moreover, it worked efficiently even in combination with easy models and also on highly sym-

metric problems such as the Social Golfer Problem.

As a disadvantage, the use of patterns is less efficient on problems that contain only very few

symmetries such as the n-Queens Problem. There, the dynamic adding of constraints in an SBDS

fashion is clearly favorable.

PART II

— Applications —

In Part I of this thesis, we have introduced general purpose methods for pruning and filtering

with respect to cost considerations and symmetry. In the following Part II, we consider some

specific combinatorial optimization and constraint satisfaction problems. The applications that

we study are used to provide a practical evaluation of the previously developed methods.

In particular, we consider the Airline Crew Assignment Problem in Chapter 5. The approach

presented is based on the concept of CP-based column generation in combination with shortest

path constraints.

In Chapter 6, we study the Automatic Recording Problem, which arises in the context of

modern multimedia applications. An algorithmic approach is presented that links knapsack con-

straints and weighted stable set constraints on interval graphs following the idea of CP-based

Lagrangian relaxation.

The Capacitated Network Design Problem is tackled in Chapter 7. Lower bounds can be

computed by decomposing the problem. We review previously developed reduction techniques

and use CP-based Lagrangian relaxation to link them together. Moreover, a new technique is

presented that adds locally valid cuts based on Lagrangian relaxation to the problem.

A new approach for the Social Golfer Problem is developed in Chapter 8. Using SBDD for

symmetry breaking and the new idea of heuristic constraint propagation, we are able to solve

problems that were previously out of reach for solvers based on constraint programming.

Finally, in Chapter 9, we develop a solver for the Graph Bisection Problem. The core of the

algorithm is a lower bounding procedure that approximates maximum multicommodity flows.

Chapter 5

Airline Crew Assignment

The Airline Crew Assignment Problem (CAP) consists in assigning lines of work to a set of crew

members such that a set of activities is partitioned and the costs for that assignment are mini-

mized. Especially for European airline companies, complex constraints defining the feasibility

of a line of work have to be respected. We present two different algorithms to tackle the large-

scale optimization problem of Airline Crew Assignment. The first is an application of CP-based

column generation that we introduced in Section 3.1. The approach incorporates shortest path

sub-problems and uses algorithms from Section 2.2. The second approach performs a CP-based

heuristic tree search. We show how both algorithms can be linked to overcome their inherent

weaknesses by integrating methods from constraint programming and operations research. Nu-

merical results show the superiority of the hybrid algorithm in comparison to CP-based tree

search and column generation alone.

Scheduling flying crews of airline companies is a hard combinatorial problem, given the

complexity of the constraints that have to be satisfied and the huge search space that has to be

explored. The problem is often tackled by breaking it down into the Crew Pairing and the Crew

Assignment (or Rostering) Problem. In the crew pairing part, basic activities such as flight legs

(flights without stopover) are grouped into pairings. The latter ones are lines of work for one

or more days starting and ending at a home base. Then, in the crew assignment phase, these

pairings are assigned to crew members.

Although easier in practice than the original problem, both sub-problems are still hard to

solve. Obviously, the Airline Crew Assignment Problem that we consider here is NP-hard,

which is easy to see by reduction from the Set Partitioning Problem [88]. Generally, operations research

(OR) and constraint programming (CP) techniques are available to solve the CAP, since it

has drawn the interest of both scientific communities for many years until today. Most industrial

software is based on OR techniques. However, especially for European airlines, there are strict

rules enforced by legislation, unions, etc. that define the feasibility of schedules. Thus, since a

huge amount of computational effort is put into the generation of infeasible lines of work, common

OR-based generate and test approaches are not efficient enough. We show how constraint

programming can be incorporated to overcome typical weaknesses of OR approaches. For a re-

cent overview on optimization problems and solution techniques in the airline industry, we refer

the reader to [182, 218].

During the last decade, some work was done on the Crew Assignment Problem. Column

generation methods have proven to be quite successful [52, 87, 183]. For solving the Railway

Crew Rostering Problem, which is similar, but not identical to the Airline Crew Assignment

Problem, Caprara et al. developed both an OR- and a CP-based approach [30, 32]. For the latter,

a lower bound from the OR field was used to improve the efficiency.

By construction, OR methods view a problem globally, taking into account all variables and

usually more than one or even most constraints at a time. By calculating upper and lower bounds

on the costs, they show a good ability to identify promising parts of the search space. However,

they often suffer from minor local conflicts, which might prevent a feasible solution from being

found. On the other hand, CP methods can efficiently handle feasibility problems by resolving

local conflicts using advanced search techniques and reduction algorithms based on concepts like

arc-consistency. Conversely, CP methods lack the ability to view the variables and constraints

of a problem globally. Therefore, they often have problems when stuck in local optima.

We present two different approaches to tackle the Airline Crew Assignment Problem: a CP-

based heuristic tree search approach (HTS) [205], and one following the CP-based column gener-

ation framework (CGA) (see Section 3.1). We show how these two approaches can be combined

to overcome their inherent limitations.

The work presented in this chapter was published in [62, 194, 195]. It is structured as follows:

In Section 5.1, we formally define the Airline Crew Assignment Problem. In Section 5.2, we

discuss two autonomous approaches to solve the CAP. We give the characteristics of two real-

world airline test cases and present detailed ways of how the two approaches developed can be

combined to form an efficient hybrid algorithm in Section 5.3. Finally, in Section 5.4, numerical

results show the superiority of the hybrid algorithm compared to the individual approaches.

5.1 The Airline Crew Assignment Problem

Given a set of crew members, a set of pairings, a set of rules and a cost function, a roster is an

assignment of a subset of pairings to one specific crew member. A schedule is a set of rosters

such that all rules are obeyed and every pairing is assigned to exactly one crew member. Rules

may concern a single crew member or multiple crew members. Single crew member rules regard

each individual crew member’s roster, stating for example that no two temporally overlapping

pairings can be assigned to the same person. Multiple crew member rules aim at more than one

crew member, stating for example that two given pairings must be assigned to two crew members


out of which at least one must have a certain level of experience. The cost function associates a

cost with every legal schedule, and its minimization is desired.

In our case, every rule in the rule set deals with just one single crew member, and the

objective function is linear over the rosters. That means that only single crew member rules can

be modeled and that the cost of the entire solution to the CAP is defined as the sum of the costs

of the selected rosters. More formally:

Definition 5.1 Given $k, m, n \in \mathbb{N}$, we denote the set of crew members by $C := \{1, \dots, m\}$ and the

set of pairings by $T := \{1, \dots, n\}$. Furthermore, denote the set of all subsets of a set $S$ by $2^S$.

1. Let $R := C \times 2^T$. Every $r \in R$ is called a roster and $R$ is called the set of all possible rosters.

2. Let $B := \{\text{true}, \text{false}\}$ and $H := \{h_i : R \to B \mid 1 \le i \le k\}$. Every $h \in H$ is called a

(single crew member) rule and $H$ is called a rule set.

3. A roster $r \in R$ is called legal (with respect to a rule set $H$) iff $h(r) = \text{true}$ for all $h \in H$.

$L := \{r \in R \,;\, r \text{ is legal}\}$ is the set of legal rosters (with respect to the rule set $H$).

4. $f : R \to \mathbb{R}$ is called a cost function.

5. The (Airline) Crew Assignment Problem (CAP) is to minimize $\sum_{1 \le i \le m} f(r_i)$, whereby

$r_i = (i, t_i) \in R$ for all $1 \le i \le m$ such that:

(a) $r_i \in L$ for all $1 \le i \le m$,

(b) $\bigcup_{1 \le i \le m} t_i = T$ whereby $t_i \cap t_j = \emptyset$ for all $1 \le i < j \le m$.
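For very small instances, the problem just defined can be solved by plain enumeration, which may help to make the roles of rules and cost function concrete. The Python sketch below is our own illustration and not one of the algorithms of this chapter: crew members and pairings are numbered from 0, every pairing is given to exactly one crew member (so the pairings are partitioned by construction), a rule is a predicate on a crew member and a set of pairings, and the cost of a schedule is the sum of the roster costs. The pairing times, the overlap rule and the price table are made-up example data.

```python
from itertools import product


def solve_cap(m, n, rules, cost):
    """Enumerate all ways to give each of n pairings to one of m crew members."""
    best_cost, best_rosters = None, None
    for owner in product(range(m), repeat=n):
        rosters = [frozenset(t for t in range(n) if owner[t] == c) for c in range(m)]
        # A schedule is legal if every roster satisfies every single crew member rule.
        if not all(h(c, r) for c, r in enumerate(rosters) for h in rules):
            continue
        total = sum(cost(c, r) for c, r in enumerate(rosters))
        if best_cost is None or total < best_cost:
            best_cost, best_rosters = total, rosters
    return best_cost, best_rosters


# Made-up data: four pairings given as (start, end) and two crew members.
times = [(0, 2), (1, 3), (2, 4), (3, 5)]
price = [[1, 5, 1, 5], [5, 1, 5, 1]]  # price[c][t]: cost if crew member c flies pairing t


def no_overlap(crew, roster):
    """Example rule: no two temporally overlapping pairings in one roster."""
    return all(times[a][1] <= times[b][0] or times[b][1] <= times[a][0]
               for a in roster for b in roster if a < b)


def roster_cost(crew, roster):
    return sum(price[crew][t] for t in roster)


best_cost, best_rosters = solve_cap(2, 4, [no_overlap], roster_cost)
print(best_cost, [sorted(r) for r in best_rosters])
```

The enumeration visits m^n assignments and is therefore only usable for toy data; it is meant to clarify the definition, not to compete with the approaches discussed next.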



















The model as stated above neither allows non-linear objectives when combining rosters, nor

permits restricting the combination of rosters by additional multiple crew member rules one might

be interested in for real-life applications. Nevertheless, both methods that we present for solving

the previous problem allow treating linear multiple crew member rules as well.

5.2 Two Approaches for the Crew Assignment Problem

In this section, we introduce two approaches for the CAP that we want to combine later. As a

major objective, we aim at developing a generic tool that is able to treat different rules and regu-

lations that typically arise in airline companies. Particularly for European airlines, these rules are

very complex and often non-linear. It was therefore decided to model the rules and regulations

as a constraint program. Hence, both approaches and the resulting integrated approach are based

on a CP core. For further details on this core, we refer the reader to [163].


Fig. 5.1: Constructing a legal reduced-cost optimal roster is equivalent to finding a constrained shortest

path in a weighted DAG. (Figure labels: time; pairings; 1, 5, 9.)

5.2.1 CP-based Column Generation Approach

The definition of the CAP as stated above allows us to decompose it naturally into the sub-problem

of generating legal rosters and the set partitioning (SPP) master problem. Therefore, we can

apply the idea of CP-based column generation that was introduced in Section 3.1. The master

problem is an integer program (IP) that ensures restrictions (5a) and (5b) in Definition 5.1:

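$$\min \sum_{s \in S} c_s x_s$$

$$\sum_{s \in S \text{ and } \varphi(s) = i} x_s = 1 \qquad \forall\, 1 \le i \le m \qquad (5.1)$$

$$\sum_{s \in S \text{ and pairing } i \text{ belongs to } s} x_s = 1 \qquad \forall\, 1 \le i \le n \qquad (5.2)$$

$$x_s \in \{0, 1\} \qquad \forall\, s \in S$$

where $S$ denotes the set of columns, $c_s$ the cost of column $s$, and $x_s$ the variable that selects it,

whereby $\varphi : S \to C$ maps a column number to a crew member. The m constraints

in (5.1) assign exactly one line of work to each crew member. The n constraints in (5.2) ensure

that all activities are covered exactly once. In this model, every (legal) roster corresponds to a

0-1 column.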

The sub-problem consists in finding rosters respecting all rules and improving the objective.

From linear programming (LP) duality theory, it is known that columns with negative reduced

costs are candidates for such an improvement. Notice that duality theory is only valid for the

LP-relaxation of the original IP. Thus, pure column generation must be viewed as a heuristic

only. To prove optimality of the IP model, column generation has to be extended to a branch and

price approach [13].
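For a set partitioning master problem of this kind, the reduced cost of a roster column follows directly from LP duality: it is the roster cost minus the dual values of all rows in which the column has a one, i.e. the row of its crew member and the rows of the pairings it covers. A minimal Python sketch, with hypothetical argument names and made-up dual values, could look as follows.

```python
def reduced_cost(cost, crew, pairings, crew_dual, pairing_dual):
    """Reduced cost of a 0-1 roster column, given the LP duals of the master rows."""
    return cost - crew_dual[crew] - sum(pairing_dual[t] for t in pairings)


# Made-up duals: one crew member row and three pairing rows.
crew_dual = {0: 3.0}
pairing_dual = {1: 4.0, 5: 2.5, 9: 1.5}

# A roster for crew member 0 that covers pairings 1, 5 and 9 at cost 10.
print(reduced_cost(10.0, 0, [1, 5, 9], crew_dual, pairing_dual))
```

A negative value, as in this example, marks the roster as a candidate for entering the master problem.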

We first generate a bunch of individual lines of work and then try to combine them to partition

the entire work. When solving the LP-relaxation of the master problem, we get dual information


[Diagram components: Constraint Model for Airline Rules, Initialization, CP-based Column Generator, LP Solver, SPP Matrix, SPP Solver; exchanged data: Rosters, LP Duals, Solution/Duals.]

Fig. 5.2: The entire approach: The inner loop generates columns using dual information, the outer loop

solves the master problem.

that allows us to search for potentially improving columns. That is, in the sub-problem we try to

generate new rosters that have negative reduced costs. Those rosters are added to the master

problem, which is solved again, and so on until no more rosters with negative reduced costs can

be computed or until a certain iteration limit is reached.

Selecting an optimal set of non-overlapping activities respecting the rule set can be inter-

preted as the problem of finding a constrained shortest path in a weighted directed acyclic graph

(DAG) G (see Figure 5.1). However, due to complex and possibly non-linear single crew member

rules, generating legal rosters with negative reduced costs can be very difficult, and it is doubtful

whether the shortest path substructure is really dominant in the constraint satisfaction problem

that arises. Moreover, rule sets vary from airline to airline and have no common structure that

could easily be exploited to design a generic efficient constrained shortest path algorithm that

can cope with any rule set. Therefore, we apply a CP search to generate legal rosters. As we

are only searching for individual lines of work with associated negative reduced costs, we add

an optimization constraint that is used for problem reduction. That is, instead of searching for

constrained shortest paths, we rather introduce a shortest path constraint.
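The shortest path substructure itself, without any airline rules, is easy to compute. As an illustration, the Python sketch below treats every pairing as a node with a weight (for instance, its cost share minus its dual value), connects two pairings if the first ends before the second starts, and finds the cheapest chain by dynamic programming over the pairings sorted by start time. The data layout and the function name are assumptions of ours, pairings are assumed to have positive duration, and the rule set, which is the hard part in practice, is deliberately ignored.

```python
def cheapest_chain(pairings):
    """pairings: list of (start, end, weight). Returns (weight, indices) of the
    cheapest non-empty chain of pairwise non-overlapping pairings."""
    order = sorted(range(len(pairings)), key=lambda k: pairings[k][0])
    best = {}  # best[k]: cheapest chain that ends with pairing k
    for k in order:
        start, _, weight = pairings[k]
        candidate = (weight, [k])
        # Every possible predecessor starts earlier and has been processed already.
        for j, (w_j, path_j) in best.items():
            if pairings[j][1] <= start and w_j + weight < candidate[0]:
                candidate = (w_j + weight, path_j + [k])
        best[k] = candidate
    return min(best.values(), key=lambda v: v[0], default=(0.0, []))


# Made-up example: four pairings, negative weights are attractive.
pairings = [(0, 2, -3.0), (1, 3, -4.0), (2, 4, -2.0), (3, 5, 1.0)]
print(cheapest_chain(pairings))
```

Here the chain consisting of the first and the third pairing is the cheapest one. Adding rules that depend on the whole roster destroys this simple recursion, which is the motivation for keeping the path structure as a constraint inside a CP search.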

The entire approach is sketched in Figure 5.2. In the initialization phase we start by setting

up the desired airline rules and regulations and generate an initial SPP matrix that may

consist of dummy columns only (see Section 5.2.1.2). In the outer loop of master iterations,

we solve the current SPP integer program and get a first current solution and new dual values

of the LP-relaxation. The column generator then defines the next sub-problem to be solved by

picking a crew member. A specified number of rosters is generated in the inner loop: we add


Fig. 5.3: Number of choice points versus master iterations (left), and running time versus master iterations

(right) for SPC, NRC, and total enumeration. The tests were run with a data instance of type 10-00-20

that was solved to optimality. (Axes: master iterations; choice points, respectively time [sec].)

the corresponding columns to the (continuous) master problem, solve it by means of linear

programming, and obtain new dual values. After rosters have been generated for all crew members,

we obtain an enhanced SPP matrix and the next master iteration begins. This process is

either interrupted after a given time limit has been exceeded or when no more rosters have been

generated that yield an improvement of the LP-relaxation in any of the sub-problems.

In the following, we investigate some properties of the sketched procedure in more detail and

show that CP-based column generation is able to solve non-trivial Crew Assignment Problems.

In particular, we demonstrate the effect obtained by the propagation of the path constraint.

The problem instances that we consider here stem from a major European airline. The rules,

regulations, and objective function have directly been abstracted from the real-world case and

preserve the essential characteristics of this case. The data sets are sufficiently large to measure

the effects of constraint propagation, but they are small enough to run experiments in a reasonable

time frame.

To characterize an instance, we specify the number of crew members, the number of pre-

assigned activities, and the number of activities to be assigned. For example, an instance of type

67-165-280 consists of 67 crew members, 165 pre-assignments, and 280 tasks. The experiments

were run on a SUN UltraSparc-IV with 296 MHz CPU and 1024 MB main memory. For the

constraint model of the CAP, the ROSTER LIBRARY [163] based on ILOG SOLVER 4.4 [120]

was used. The LPs and IPs were solved with ILOG PLANNER 3.3 [119].

5.2.1.1 Shortest Path Filtering

In the experiment for Figure 5.3, we compare three models for the generation of legal rosters

with negative reduced costs. The first performs total enumeration, the second (NRC) uses a

simple arithmetic constraint to ensure the generation of columns with negative reduced costs,

and the third (SPC) uses a DAG shortest path constraint (see Section 2.2.2) for this purpose. The
left picture shows the reduction of choice points when using cost-based filtering techniques for
problem reduction. In the sixth master iteration, SPC uses fewer than half as many choice
points as NRC. This gain is not consumed by a significant increase in computation time per
choice point: As shown in the right picture, the decrease in running time is quite similar to the
decrease in the number of choice points. As expected, total enumeration is not competitive at all.

Fig. 5.4: Number of choice points versus master iterations using SPC and NRC with a data set of type 7-0-30.
[Plot: x-axis master iteration, y-axis choice points. Values per master iteration, in order:]
SPC 574 609 504 392 280 931 210 119 119 147 63 77 0
NRC 574 1918 1197 2037 4599 5117 6118 12446 14077 13433 18340 21532 32095

To demonstrate the superiority of a shortest path constraint when compared with a simple

arithmetic constraint in more detail, we run a test on a small instance where, in each master

iteration, the number of choice points is noted. Figure 5.4 shows that SPC uses much fewer

choice points than NRC. Furthermore, in the last master iteration, the shortest path constraint

is able to prove optimality for the continuous relaxation of the master problem very quickly by

showing that no more columns with negative reduced costs exist. The negative reduced cost

constraint, however, still visits an increasing number of choice points per iteration.
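The per-iteration counts reported with Figure 5.4 can be aggregated with a few lines of arithmetic. The following is only a small sketch that sums the published values; the variable names are illustrative and not part of the original implementation.

```python
# Choice points per master iteration, copied from the data of Figure 5.4.
spc = [574, 609, 504, 392, 280, 931, 210, 119, 119, 147, 63, 77, 0]
nrc = [574, 1918, 1197, 2037, 4599, 5117, 6118, 12446, 14077, 13433,
       18340, 21532, 32095]

total_spc = sum(spc)
total_nrc = sum(nrc)

# Number of iterations in which NRC visits strictly more choice points.
nrc_worse = sum(1 for s, n in zip(spc, nrc) if n > s)

print(total_spc, total_nrc, total_nrc / total_spc, nrc_worse)
```

Over the whole run, NRC visits roughly 33 times as many choice points as SPC, and it is worse in every iteration except the first, where both models start from the same count.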

One reason for the efficiency of the shortest path constraint, and the reason why there is almost
no gap between the reduction of choice points and the reduction in time, is the use of the incremental
version as mentioned in Section 2.2.2.3. In Figure 5.5, we compare a non-incremental
version of the shortest path constraint with an incremental one. For a fixed time of 10000 seconds
for the entire optimization, the faster incremental version needs only 2000 seconds for propagation,
whereas, in the non-incremental version, almost 60% of the total calculation time is consumed
by that part of the algorithm. Thus, the incremental version allows us to perform nearly 3 times
as many propagations as the non-incremental version and hence helps to improve the solution
quality.

Fig. 5.5: The left picture shows time versus the number of calls of the propagation routine using the
incremental and the non-incremental implementation of the shortest path constraint. Both versions were
stopped after 10000 seconds total CPU time. The experiment was run with a data instance of type 10-00-70.
The right picture shows a comparison of NRC (upper curve) and SPC (lower curve) in a time versus
quality diagram on a data instance of type 67-165-280.
[Left plot: x-axis time [sec], y-axis propagations; curves: incremental, non-incremental. Right plot: x-axis time [sec], y-axis cost.]

The right picture in Figure 5.5 shows a time versus quality comparison of NRC and SPC.

After a first big drop in the objective, NRC dives deeply into huge search trees that only consist

of rosters with non-negative reduced costs. SPC can prune those search trees much earlier and

therefore continuously reduces the objective without stalling.

We have shown that CP-based column generation works reasonably well for the CAP. However,
we observe two major drawbacks. We will see in Section 5.3 how these problems can be
overcome by combining the column generation approach (CGA) with a direct CP approach.

5.2.1.2 Set Partitioning — Set Covering

The major obstacle for CGA is the set partitioning (SPP) structure of the master problem. Finding
a feasible solution to the SPP is NP-hard already [88]. Moreover, the dual information gained
from equation constraints is more difficult to exploit than that of cover or packing constraints.
Therefore, we would actually like to relax the master problem to a set covering formulation (that
remains an NP-hard problem but can be solved much more easily in our case) by only requiring
the pairings to be flown by one or more crew members, i.e., we suggest relaxing the equalities (5.2)
to covering inequalities of the form $\sum_i x_i \geq 1$.
Then, however, to compute a legal schedule, we have to decide which crew member finally gets
an over-covered pairing assigned.

5.2.1.3 Feasible Solutions for Set Partitioning

To obtain a formulation that guarantees that we can always find a feasible solution, we add two
types of dummy columns: The first type of column covers exactly crew member $i$, the second
exactly one activity $j$, for all $1 \leq i \leq m$ and $1 \leq j \leq n$. That is, we allow empty rosters and
unassigned activities. By setting the costs for choosing a dummy column to an arbitrarily high
value, we make sure that they only become part of an optimal solution if the original master
problem is infeasible.

Although this procedure works, to achieve meaningful dual information of the master prob-

lem, the solution must not be spoiled by dummy costs. Thus, it is preferable to generate an initial

set of rosters that contains an entire work partitioning schedule.
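The effect of the dummy columns can be illustrated on a toy instance. The sketch below is not the author's implementation: the column representation `(cost, crew, activities)`, the penalty value, and the brute-force solver are assumptions chosen only to make the idea executable on a handful of columns.

```python
from itertools import combinations

def add_dummy_columns(columns, num_crew, num_activities, penalty):
    """Append one empty-roster dummy per crew member and one
    'unassigned' dummy per activity, each at the given penalty cost."""
    dummies = [(penalty, i, frozenset()) for i in range(num_crew)]
    dummies += [(penalty, None, frozenset({j})) for j in range(num_activities)]
    return list(columns) + dummies

def is_partition(selection, num_crew, num_activities):
    """Every crew member gets exactly one roster, every activity is covered once."""
    crew_count = [0] * num_crew
    act_count = [0] * num_activities
    for _cost, crew, activities in selection:
        if crew is not None:
            crew_count[crew] += 1
        for j in activities:
            act_count[j] += 1
    return all(c == 1 for c in crew_count) and all(c == 1 for c in act_count)

def best_partition(columns, num_crew, num_activities):
    """Brute-force the cheapest set partitioning solution (tiny instances only)."""
    best = None
    for r in range(1, len(columns) + 1):
        for selection in combinations(columns, r):
            if is_partition(selection, num_crew, num_activities):
                cost = sum(c for c, _, _ in selection)
                if best is None or cost < best[0]:
                    best = (cost, selection)
    return best

# Two real rosters that both contain activity 1, so they cannot be combined.
rosters = [(10, 0, frozenset({0, 1})), (12, 1, frozenset({1, 2}))]
print(best_partition(rosters, 2, 3))                      # no partition exists
extended = add_dummy_columns(rosters, 2, 3, penalty=1000)
print(best_partition(extended, 2, 3)[0])                  # feasible, but penalized
```

In this toy case the cheapest solution keeps one real roster and pays two penalties, which also shows why the resulting objective (and hence the dual values) is dominated by dummy costs until combinable rosters are available.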

5.2.2 Heuristic Tree Search Approach

The other algorithm developed to tackle the CAP is the heuristic tree search approach (HTS)

based on constraint programming. In that HTS, each complete feasible solution of the CAP is

constructed by solving the corresponding constraint satisfaction problem [205]. The problem is

modeled by a set of variables which correspond to assignable pairings. For each pairing, there

is a variable the domain of which represents the crew members that can possibly be assigned to

the pairing.¹ For each constrained variable representing the assignment of a pairing, its initial
domain comprises all available crew members for this pairing. The posting of the appropriate

constraints reduces the domains of these variables by removing crew members that cannot be

allocated to the corresponding pairings. This is possible, for example, due to pre-assigned activ-

ities, or due to regulation violations because of the crew member’s history, etc. The search tree

of the problem is created by iterating over pairings in a heuristic dynamical order and assigning

each pairing to a crew member.

As usual, every branch in the search tree corresponds to assigning a pairing to a crew mem-

ber. Every non-leaf node corresponds to a partial assignment, identified by the path from the

root to the node. Leaf nodes correspond to infeasible partial assignments or complete and legal

schedules, i.e. (not necessarily optimal) feasible solutions of the problem. Each allocation of a

crew member to a pairing activates the constraint propagation mechanism. We try to avoid total
enumeration by removing values that are inconsistent with the posted constraints from the
variables’ domains. For example, the assignment of a pairing to a crew member causes the removal
of this crew member from the domains of all pairings that overlap in time with the one that
has just been assigned. When a node is proven to be a dead end, which means that one or more
pairings cannot be carried out by any crew member in the given partial assignment, backtracking
occurs, and decisions taken before are reconsidered.
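The overlap example above can be sketched as a small propagation step. This is a minimal illustration, assuming pairings are given as `(start, end)` intervals and domains as sets of crew names; the real model is built on the ROSTER LIBRARY and handles many more rules.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share some time."""
    return a[0] < b[1] and b[0] < a[1]

def assign(domains, times, pairing, crew):
    """Assign `crew` to `pairing` and propagate the no-overlap rule.

    Returns the reduced domains, or None if some pairing is left
    without any candidate crew member (a dead end)."""
    if crew not in domains[pairing]:
        return None
    reduced = {p: set(d) for p, d in domains.items()}  # copy, so backtracking is trivial
    reduced[pairing] = {crew}
    for other in reduced:
        if other != pairing and overlaps(times[pairing], times[other]):
            reduced[other].discard(crew)
            if not reduced[other]:
                return None  # dead end: `other` cannot be staffed any more
    return reduced

times = {"P1": (0, 10), "P2": (5, 15), "P3": (20, 30)}
domains = {"P1": {"ann", "bob"}, "P2": {"ann"}, "P3": {"ann", "bob"}}

print(assign(domains, times, "P1", "ann"))  # dead end, P2 loses its only candidate
print(assign(domains, times, "P1", "bob"))  # P2 and P3 keep their domains
```

Because the function returns a fresh copy of the domains, the caller can simply discard the result and try another value when a dead end is reported.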

The constraints of the problem are the regulations of the airline at hand that dictate which
rosters are acceptable and which ones are in violation of the airline rules. A solution to a constraint
satisfaction problem is any assignment of values to variables that respects all constraints.
A feasible solution to the CAP, formulated as a constraint satisfaction problem, is any assignment
of crew members to pairings such that all airline rules and regulations are respected. Then the
objective function is optimized by searching for improving solutions only. Regarding the way
the search tree is traversed, a variety of search methods were developed and tested.

¹ It is assumed that every pairing can only be assigned to one crew member. In case more than one crew
member is necessary to staff a pairing, copies of the pairing are created, and each copy can again only be assigned to
one single crew member.

5.2.2.1 Tree Traversal

A variety of search methods for traversing the search tree exists in the literature. The oldest,

most popular and, by far, most widely used search method is Depth First Search (DFS). The

main drawback of DFS is that, even for instances of moderate size, it only explores a very small

portion of the search tree at the lower left.² DFS was implemented and tested for the CAP, and,
unsurprisingly, it was found not to perform very well, because the first decisions taken
are never reconsidered.

Innovation in the field came from the notion of discrepancy. At a given node, a heuristic
function suggests which branch the search should follow, namely the one that is assumed most
likely to contain solutions (or solutions of good quality in the case of optimization). Always
following the heuristic’s advice defines a unique path that is said to contain no discrepancies.
Following the heuristic’s advice except for one case defines paths of discrepancy 1, for two cases
discrepancy 2, and so on.

Limited Discrepancy Search (LDS) [106] is an iterative search method. In the i-th iteration,
it explores all paths with i discrepancies. In the original LDS method, paths with discrepancies
higher up the tree are explored before the ones where the discrepancies occur further to the
bottom. The intuitive justification for that approach is that a heuristic is more likely to fail higher
up the tree, where information is limited. We implemented a variant of LDS. Our variant searches
paths with discrepancies lower down the tree before the ones with “higher” discrepancies. The
advantage is that time-consuming descents from near the root towards the leaves are avoided.
Also, our variant is not iterative. It searches those paths having i or fewer discrepancies and then
exits. Thus, it is not complete. Practically, however, the parameter i can be chosen so that a big
enough portion of the tree is explored. In our experiments, this portion of the tree is much bigger
than a modern computer could explore in a reasonable amount of time. We refer to this variant
as modified Exact Discrepancy Search (mEDS).
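The exploration order described for this variant can be sketched as a depth-first search with a discrepancy budget. The code below is an illustration under simplifying assumptions (a complete binary tree, where 0 means following the heuristic and 1 means deviating from it); it is not the implementation used in the experiments.

```python
def paths_within_discrepancy(depth, max_discrepancies):
    """Yield all root-to-leaf paths of a binary tree that deviate from the
    heuristic (branch 0) at most `max_discrepancies` times.

    Plain depth-first order means that deviations deep in the tree are
    tried before deviations near the root, and no path is visited twice."""
    def dfs(prefix, budget):
        if len(prefix) == depth:
            yield tuple(prefix)
            return
        prefix.append(0)                      # follow the heuristic first
        yield from dfs(prefix, budget)
        prefix.pop()
        if budget > 0:                        # deviate only while budget remains
            prefix.append(1)
            yield from dfs(prefix, budget - 1)
            prefix.pop()
    yield from dfs([], max_discrepancies)

for path in paths_within_discrepancy(3, 1):
    print(path)
```

With depth 3 and a budget of one discrepancy, the heuristic path comes first, followed by the paths whose single deviation occurs at the bottom, the middle, and finally the top of the tree. The search is incomplete by construction, since all paths with more deviations than the budget are skipped.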

Depth-Bounded Discrepancy Search (DDS) [213] is also an iterative method. In the i-th
iteration, it explores all paths where discrepancies occur before depth i. In contrast to LDS, a path
with many discrepancies high in the tree is explored before a path with very few discrepancies
low in the tree. This is also justified by the assumption that heuristics tend to fail with a higher
probability at the top of the tree.

² It is common practice to regard the branches under a node as ordered according to a heuristic function. Following
the advice of a heuristic means to go “left” down the search tree.

Finally, we also implemented Large Neighborhood Search (LNS), which was introduced in [200]
and incorporates local search techniques within the CP framework. The idea is to restrict the
search to a fragment of the search space. In this way, local improvements can be made
that would remain unnoticed by most incomplete search methods. A reduced search space for
a problem with a set of variables $V$ and a known feasible assignment $A$ can be created as follows:
A large subset $V_1$ of $V$ is selected. All assignments in $A$ for variables in $V_1$ are fixed and
thus a partial solution is created. Search is performed in the remaining variables with any of the
above search methods. After this search is finished (either because the search sub-space has been
exhausted or because any other termination criterion is met), another sub-space is selected and
the process is repeated. The advantage of LNS is that local improvements are discovered easily,
and the objective value may improve quickly. The disadvantage is that the search space cannot
be viewed globally. Thus, it is likely that important improvements are missed. A reasonable
strategy when using LNS is to use one of the search methods above in the beginning to guide the
search towards a promising area of the search space and to use LNS afterwards.
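The LNS scheme can be sketched as a loop over neighborhoods in which only a few variables are free. In the sketch below, the exhaustive enumeration of the free variables stands in for the tree search methods named above, and the toy objective (balancing workload between two crew members) as well as all names are assumptions made for illustration.

```python
from itertools import product

def lns(assignment, domains, cost, feasible, neighborhoods):
    """Improve `assignment` by re-searching one neighborhood at a time.

    Each neighborhood is the set of variables that are freed; all other
    variables keep their current values (they play the role of V1)."""
    best = dict(assignment)
    for free in neighborhoods:
        free = sorted(free)
        for values in product(*(domains[v] for v in free)):
            candidate = dict(best)
            candidate.update(zip(free, values))
            if feasible(candidate) and cost(candidate) < cost(best):
                best = candidate
    return best

# Toy instance: four pairings with workloads, two crew members (0 and 1).
weights = {"x0": 4, "x1": 3, "x2": 2, "x3": 1}
domains = {v: [0, 1] for v in weights}

def imbalance(assign):
    load = [0, 0]
    for var, crew in assign.items():
        load[crew] += weights[var]
    return abs(load[0] - load[1])

start = {v: 0 for v in weights}               # everything assigned to crew 0
result = lns(start, domains, imbalance, lambda a: True,
             [{"x0", "x1"}, {"x2", "x3"}])
print(result, imbalance(result))
```

Each neighborhood here contains only two free variables, so every sub-search is tiny. Whether the final assignment is globally optimal depends on the choice of neighborhoods, which mirrors the locality drawback noted above.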

5.3 Integration

We present two ways of integrating both methods, each one motivated by one of the two following
problem cases:

5.3.1 The Airline Test Cases

We consider real-world test cases stemming from two European airline companies. The instances
of company A consist of 50–65 crew members and 766–959 pairings. Company B has 7–30 crew
members and 129–279 pairings. Case A covers a planning period of one calendar month, while
data sets for B cover two weeks. While case B incorporates mainly 1–2 day pairings, A considers
pairings of duration less than 24 hours.

The objective of company B is to achieve a fair distribution of activities over all crew mem-

bers, whereas in A we aim at satisfying as many preferences expressed by the crew members as

possible by minimizing dissatisfaction. Importantly, the rule sets in both cases are distinct. In

A, typical rules such as succession rules and rest time rules, but also more complicated ones like
rules ensuring a minimum number of days off within sliding windows of variable lengths are incorporated.
Also, rules guaranteeing minimum and maximum flight time are enforced. All rules in A

are hard constraints, meaning that if they are violated, the solution is considered infeasible. In

B, we consider flight time rules that limit the time actually flown by the crew within certain time

periods. These rules are also strict.


The main difference between the two test cases regarding the algorithms we developed is

caused by the fact that company B does not insist on a partitioning of the work, i.e., restric-

tion (5b) is relaxed to



mti



T. Obviously, this difference requires that our column gener-

ation approach is able to incorporate two different types of master problems.

In the first problem case, the construction of a feasible schedule is difficult due to very strict

rules called for by the airline company. We observe that CGA eventually gets close to solutions

of good quality, but minor inconsistencies delay it disproportionately long. We show that this
can be overcome effectively by letting the CGA approach solve a relaxed (that is, set covering)
version of the problem and then handing possibly over-covered (and thus infeasible) solutions to
the HTS approach for fixing.

In the second problem case, the rule set is not that strict. The CGA approach alone proceeds

as expected. However, the initial time spent for driving dummy columns out of the basis is

considerable. In this phase, dual values are not very meaningful, because penalties dominate the

objective. We show how the HTS method can help attacking the problem.

5.3.2 Transforming a Set Covering into a Set Partitioning Solution

The first method is applied on case A. In this company, no pairing can be left unassigned. More-

over, there is a relatively large number of pairings with respect to the number of crew members

and the number of pairings that a single crew member is able to service (for example 959 daily

pairings and 65 crew members on a typical monthly instance). These conditions make finding a

feasible solution difficult for the CGA approach. On the other hand, the HTS approach is able to

construct feasible solutions by using sophisticated search methods and heuristics tailored for the

specific problem. However, after a short while no improving solutions can be found.

We overcome the problems of both methods by letting the CGA approach find set covering

instead of set partitioning solutions. That is, we relax the pairing partitioning constraints (5.2) by

only requiring that every pairing is assigned to at least one crew member. The columns generated

by the CGA approach can much more easily be combined into SCP solutions. Then the conversion
of SCP to SPP solutions is performed by the HTS approach, which can resolve local conflicts
efficiently by using sophisticated propagation algorithms. An outline of the procedure is shown in
Algorithm 1. Here, $V$ is the set of all variables, $A_X$ is a tuple of assignments $(v, x_v)$ of values $x_v$
to variables $v$ generated by approach $X$, $a(v, A)$ is a function which returns the value of variable
$v$ in assignment $A$, DEFAULTSVAR and DEFAULTSVAL are the variable and value selection
functions normally used by the HTS approach, respectively, REPAIRSVAR and REPAIRSVAL
are the corresponding heuristics used for repairing set covering solutions, and HTSOPTIMIZE
and CGAOPTIMIZE are the HTS and CGA optimization functions. PARTITION is a function
which will be explained shortly. LNSOPTIMIZE performs optimization using the LNS method
that forms search sub-spaces by dividing the planning horizon into time windows.

Algorithm 1 Top level algorithm for the first method
1: $A_{HTS} \leftarrow$ HTSOPTIMIZE($V$, DEFAULTSVAR, DEFAULTSVAL)
2: repeat
3:   $A_{CGA} \leftarrow$ CGAOPTIMIZE($A_{HTS}$)
4:   $(V_1, V_2, V_3) \leftarrow$ PARTITION($A_{HTS}$, $A_{CGA}$)
5:   for all $v \in V_3$ do
6:     $v \leftarrow a(v, A_{CGA})$
7:   $A_{HTS} \leftarrow$ HTSOPTIMIZE($V_1 \cup V_2$, REPAIRSVAR($V_1 \cup V_2$, $A_{CGA}$, $V_2$), REPAIRSVAL($V_1 \cup V_2$, $A_{CGA}$, $V_2$))
8:   $A_{HTS} \leftarrow$ LNSOPTIMIZE($V$, REPAIRSVAR($V$, $A_{CGA}$, $V_1 \cup V_3$), REPAIRSVAL($V$, $A_{CGA}$, $V_1 \cup V_3$), $A_{HTS}$)
9: until stopping condition

We now explain this algorithm in greater detail. In the first line, one or more initial solutions

are found by the HTS approach. This initialization step provides the algorithm with a set of

columns, which can be combined to feasible solutions. Not much time is devoted to this phase.

The variable and value selection heuristics that would normally be used by the HTS approach are

applied here. Any of the methods presented in the previous sections can be plugged in. However,

we found mEDS to perform best in our case. The columns constituting these solutions are handed

to the CGA approach for optimization in Line 3. The solution produced in this step is correct

except for the fact that some pairings are assigned to more than one crew member, which is not

legal.

The next task is to use the information found in $A_{CGA}$ to construct a feasible solution. Let
$V_1$ be the set of variables which correspond to over-covered pairings. An optimistic approach
would be to assign the values of the assignment $A_{CGA}$ to all the variables in $V \setminus V_1$ and let the
HTS approach perform a search in the space of the variables in $V_1$. This, however, could lead to a
failure, since it is not known whether the partial solution obtained is extensible to a feasible solution.

There are other scheduling problems, such as the Vehicle Routing Problem With Time Windows
[185] for example, for which a set covering solution can be repaired easily by removing

entries for over-covered rows from all but one of the corresponding columns. However, in our

case, this procedure is likely to fail as certain rules may cause the resulting rosters to be infeasi-

ble. For example, a minimum flight time rule might be violated if a pairing is removed from an

otherwise feasible roster. We say that such a rule destroys the legal sub-roster property of a rule

set.

We can distinguish three subsets of variables in $V$: The set $V_1$ that consists of variables that
correspond to over-covered pairings in $A_{CGA}$, the set $V_2$ that consists of variables which have
different values in $A_{CGA}$ and $A_{HTS}$, and the set $V_3$ which corresponds to variables having the
same value in both assignments.

Algorithm 2 Heuristics for the first method
REPAIRSVAR($S$, $A$, $V$)
1: $v \leftarrow$ NIL
2: for all unbound variables $v \in V$ do
3:   if $a(v, A) \in D_v$ then
4:     return $v$
5: return DEFAULTSVAR($S$)
REPAIRSVAL($S$, $A$, $V$)
1: if $v \in V$ and $a(v, A) \in D_v$ then
2:   return $a(v, A)$
3: else
4:   return DEFAULTSVAL($v$)

The function PARTITION partitions $V$ in exactly this manner. Assignments of variables in
$V_3$ are known to be extensible to a full solution, since one has already been found. Thus, since
there is no information which suggests the contrary, they are realized as soon as possible in each
iteration (Lines 5 and 6 in Algorithm 1). Assignments in set $V_2$ may be considered as almost
certain. However, in Line 7 of Algorithm 1, they are realized in a way that allows us to reconsider
them in case there exists no feasible solution that extends the assignments of variables in $V_2$ and
$V_3$. Finally, CGA does not provide meaningful information for variables in $V_1$. Therefore, HTS
performs the search for assignments to these variables using the default heuristics.
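The three-way split performed by PARTITION can be sketched as follows. The data layout is an assumption made for illustration: the HTS assignment maps each pairing to one crew member, while the set covering solution maps each pairing to the set of crew members whose selected rosters contain it.

```python
def partition(a_hts, a_cga):
    """Split the pairing variables into (V1, V2, V3).

    a_hts: dict pairing -> crew member (a feasible partitioning solution)
    a_cga: dict pairing -> set of crew members (a covering solution,
           assumed to cover every pairing at least once)"""
    v1, v2, v3 = set(), set(), set()
    for pairing, crews in a_cga.items():
        if len(crews) > 1:
            v1.add(pairing)            # over-covered: no usable suggestion
        elif crews == {a_hts[pairing]}:
            v3.add(pairing)            # both approaches agree
        else:
            v2.add(pairing)            # CGA suggests a different crew member
    return v1, v2, v3

a_hts = {"P1": "ann", "P2": "bob", "P3": "ann"}
a_cga = {"P1": {"ann"}, "P2": {"ann", "bob"}, "P3": {"bob"}}
print(partition(a_hts, a_cga))
```

In this reading, an over-covered pairing is placed in V1 regardless of whether one of its covering crew members matches the HTS assignment, so the three sets are disjoint.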

The variable and value selection functions are modified as shown in Algorithm 2. There, the
variables that are not fixed yet are given in the set $S$. $V$ is a subset of $S$ for which assignments
exist in $A$. For example, when the variable selection rule is invoked in Line 7 of Algorithm 1, $S$
is $V_1 \cup V_2$, $V$ is $V_2$, and $A$ is $A_{CGA}$. In this case, the variable to be assigned next is any variable
in $V_2$ for which its suggested value exists in its domain. In other words, all possible assignments
in $A_{CGA}$ are realized as soon as possible, in accordance with the intuitive belief that they will most
probably lead to an area that contains improving solutions. If this is not possible, then a variable
in $V_1$ is selected, and the default heuristic is used.

Whenever possible, the value selection heuristic assigns the value suggested by CGA. Two
important details are worth noting:
1. The variable selection heuristic is consulted every time a new assignment has to be
made in the HTS search. That is, if a variable $v$ is selected (because $a(v, A_{CGA}) \in D_v$) and then,
for any reason, the search backtracks beyond that point (removing $a(v, A_{CGA})$ from $D_v$), then
another variable might be selected instead of $v$. That way, assignments and not just variables are
dynamically ordered throughout the search process in such a way that those decisions contained
in $A_{CGA}$ will always be taken as early as possible.
2. Discrepancy-based search methods are used, motivated by the belief that the assignments in
$A_{CGA}$ are probably good ones. That is, we try to stick to the decisions made by CGA, and we
would like to make only few deviations. In our implementation, this issue is handled by using a
variant of the LDS search method. In the original LDS proposal, based on the assumption that
heuristic decisions are less accurate high up in the search tree, early decisions are reconsidered
first. In our case, though, the assignments for variables in $V_2$ are realized in the beginning, and
we want to stick to them. Therefore, we prefer to use mEDS in this phase, too.
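The behaviour of the two repair heuristics, including the re-selection after backtracking described in the first item, can be sketched in a few lines. Function names, the dictionary-based data layout, and the two default heuristics (smallest domain first, smallest value first) are assumptions for illustration and do not reproduce the original C++ code.

```python
def repair_select_var(unbound, domains, suggestion, hinted, default_select_var):
    """Prefer an unbound hinted variable whose suggested value is still possible."""
    for v in unbound:
        if v in hinted and suggestion[v] in domains[v]:
            return v
    return default_select_var(unbound, domains)

def repair_select_val(v, domains, suggestion, hinted, default_select_val):
    """Use the suggested value if it is still in the domain, else the default."""
    if v in hinted and suggestion[v] in domains[v]:
        return suggestion[v]
    return default_select_val(v, domains)

# Hypothetical default heuristics.
def smallest_domain_first(unbound, domains):
    return min(unbound, key=lambda v: len(domains[v]))

def smallest_value(v, domains):
    return min(domains[v])

domains = {"P1": {"ann", "bob"}, "P2": {"bob"}, "P3": {"ann", "bob"}}
a_cga = {"P1": "ann", "P3": "carl"}      # suggestions for the hinted variables
hinted = {"P1", "P3"}
unbound = ["P2", "P1", "P3"]

v = repair_select_var(unbound, domains, a_cga, hinted, smallest_domain_first)
print(v, repair_select_val(v, domains, a_cga, hinted, smallest_value))

# After backtracking over P1 = ann, the suggestion is gone from the domain
# and the heuristic falls back to the default variable order.
domains["P1"].discard("ann")
print(repair_select_var(unbound, domains, a_cga, hinted, smallest_domain_first))
```

Because the selection function is re-evaluated at every node, a variable is preferred only as long as its suggested value is still available, which is what orders assignments rather than variables.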

We further note that the function HTSOPTIMIZE in Line 1 of Algorithm 1 may or may not use
LNS. Whether LNS can help to improve the efficiency is problem dependent. Our experiments
show that, as a stand-alone method, it is not preferable because it is likely to get stuck in a
local optimum soon. However, we found that it can be useful to apply LNS after having found
the first solution with the help of a global tree search method. Locality is not a major problem
when using LNS in combination with CP-based column generation: the latter carries the major
burden of optimization, whereas the CP-approach is used to resolve minor local inconsistencies,
hopefully without losing much of the relaxed set covering solution quality. We show the effects
of using LNS in our experimental results.

We also use LNS in Line 8 of Algorithm 1 to overcome a problem that might arise when
fixing variables in $V_3$. Recall that they are determined by assignments which have the same
values in both $A_{HTS}$ and $A_{CGA}$, and they are bound to their values as proposed by CGA to
explore promising regions of the search space. We give the search more freedom by allowing
these assignments to be reconsidered and use LNS on $V_1 \cup V_3$ instead of only $V_1 \cup V_2$.
To be more precise, in our experiments, we used LNS with mEDS as the sub-tree search method.

5.3.3 Generating Combinable Columns and Exploiting Dual Values

We propose a second integration strategy that is applied to company B. In this case, the convergence
of the CGA approach towards an optimal solution is assisted by HTS, first by constructing
a set of initial columns that are combinable to complete partitioning solutions in a start-up phase,
and second by constructing columns with negative reduced costs during the main optimization
phase. These columns are guaranteed to be extensible to a feasible solution, since they are extracted
from one. A top level sketch of this method is shown in Algorithm 3. $C$ is a set of rosters,
$A$ is an assignment, and duals are the dual values corresponding to this assignment (obtained by
the CGA). The function HTSPOSTNRC transfers the dual values to HTS, which is then forced to search
for columns with negative reduced costs.

Algorithm 3 Top level algorithm for the second method
1: $C \leftarrow$ HTSTREESEARCH(DEFAULTSVAR, DIVERSESVAL)
2: repeat
3:   $(A, duals) \leftarrow$ CGAOPTIMIZE($C$)
4:   HTSPOSTNRC($duals$)
5:   $C \leftarrow$ HTSLNSTREESEARCH(MAXDUALVAR, MAXDUALVAL)
6: until stopping condition

5.3.3.1 Start-up Heuristic

In the CGA, columns are generated for each crew member sequentially. By using dual information,
columns with negative reduced costs are generated. Thus, when the problem is non-degenerate,
they lead to a decrease in the continuous relaxation of the master problem. Therefore,
to find high quality rosters, “good” dual values are needed. Especially in the beginning, the
information contained in the dual values is very poor. This is because usually no feasible solution
is known at this point, and penalties stemming from dummy columns (that have to be introduced
in the master problem to guarantee the existence of a solution) have a great impact on the dual
values. We need to find a set of rosters that can be combined legally to form a set partitioning
solution to the CAP. However, the column generator of the CGA is hardly able to produce such
a solution, as it computes one roster at a time and is only indirectly aware of colliding pairings
in different rosters.

HTS can help here. In an integrated approach, it is used to generate a bunch of complete

feasible solutions in the beginning, thereby providing one column for each crew member with

every schedule found. Thus, a first set of columns that can be combined feasibly to a complete set
partitioning solution provides the CGA with the necessary “grip” to accelerate towards promising
parts of the search space with respect to the real objective without disturbing penalties.

Line 1 of Algorithm 3 realizes this idea. HTS searches for an initial number of solutions

without performing optimization. The number of solutions to be found is a parameter that has to

be tuned with respect to the time spent in this phase and the quality of the initial dual values.

Another parameter that has to be taken into account is the diversity of the columns that

are generated. It may be desirable to have many diverse rosters at hand that allow more and

more profitable combinations in the master problem. One rule of thumb used in practice is

that no crew-pairing assignment should appear more than a certain number of times in these

columns. The idea is realized in the slightly modified value selection heuristic DIVERSESVAL,

which is shown in Algorithm 4. It works exactly as the value selection heuristic that is normally

used, but it also records the assignments made and limits the number of times a crew member

can be assigned to a pairing. This heuristic, for example in combination with depth-bounded

discrepancy search [213], guarantees that columns will be adequately different from each other

to make the CGA method even more efficient.

Algorithm 4 Modified value selection heuristic for the second method
DIVERSESVAL($v$)
1: $val \leftarrow$ NIL
2: repeat
3:   $val \leftarrow$ DEFAULTSVAL($v$)
4:   if the assignment $(v, val)$ appears more than $k$ times in $A$ then
5:     remove $val$ from $D_v$
6:   else
7:     return $val$
8: until $val =$ NIL or $D_v$ is empty

Especially for large data sets, we find that many initial solutions are needed. To speed up

their computation, we try to shrink the search space: First, only one solution is computed. Then

the LNS search procedure is applied to obtain solutions that satisfy the diversity conditions in

locally bounded areas of the search space.
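The diversity rule behind DIVERSESVAL can be sketched with a usage counter. The sketch assumes that the caller increments the counter for every crew-pairing assignment of each solution found; the names, the counter-based bookkeeping, and the default heuristic are illustrative choices and not the original implementation.

```python
from collections import Counter

def diverse_select_val(v, domains, usage, k, default_select_val):
    """Return a value for variable v whose assignment (v, value) has not yet
    been used more than k times; over-used values are removed from the
    domain. Returns None if no admissible value is left."""
    while domains[v]:
        val = default_select_val(v, domains)
        if val is None:
            return None
        if usage[(v, val)] > k:
            domains[v].discard(val)    # too frequent in earlier solutions
        else:
            return val
    return None

def smallest_value(v, domains):
    # Hypothetical default heuristic: alphabetically first crew member.
    return min(domains[v]) if domains[v] else None

usage = Counter({("P1", "ann"): 3})    # ann already flew P1 in three solutions
domains = {"P1": {"ann", "bob"}}
print(diverse_select_val("P1", domains, usage, 2, smallest_value))
print(domains)
```

With k = 2, the preferred crew member is rejected because the assignment already appears three times, and the heuristic falls through to the next candidate. Removing the value from the domain, as in Algorithm 4, prevents the default heuristic from proposing it again at this node.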

5.3.3.2 Main Optimization Loop

As shown in Line 3 of Algorithm 3, CGA performs an optimization run taking the columns

produced by HTS as input. It returns an assignment $A$ as well as the corresponding dual values

for the crew members and pairings. The solution returned is feasible with respect to all the

company’s rules and regulations. Then, starting from this point, HTS performs a locally limited

search for columns with negative reduced costs.

The constraint posted in Line 4 of the algorithm ensures that a certain number of the columns

corresponding to each solution found will have negative reduced costs. This number is defined

empirically. Finding a schedule that consists of columns with negative reduced costs only is

rather unlikely. On the other hand, producing only few such columns is a wasted effort. Our

experiments show that schedules that contain 30% columns with associated negative reduced

costs can be achieved for our test set. Of course, this does not imply that 70% of the columns

produced are useless. Instead, those columns guarantee that all newly generated columns can be

extended to a feasible solution. Thus, all columns that are produced are important with respect

to integer feasibility, whereas the columns with negative reduced costs reflect our search for

improving solutions with respect to a linear continuous objective.
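The share of columns with negative reduced costs in a schedule can be checked as sketched below. The reduced cost formula used here (roster cost minus the dual of the crew member's row minus the duals of the covered pairings) is the usual one for a set partitioning master problem and is an assumption in this sketch, as are the data layout and the numbers.

```python
def reduced_cost(cost, crew, pairings, crew_dual, pairing_dual):
    """Reduced cost of a roster column under the assumed master structure:
    one row per crew member and one row per pairing."""
    return cost - crew_dual[crew] - sum(pairing_dual[p] for p in pairings)

def negative_share(schedule, crew_dual, pairing_dual, eps=1e-9):
    """Fraction of the rosters in a schedule that have negative reduced cost."""
    negative = sum(
        1 for cost, crew, pairings in schedule
        if reduced_cost(cost, crew, pairings, crew_dual, pairing_dual) < -eps
    )
    return negative / len(schedule)

crew_dual = {"ann": 2.0, "bob": 1.0, "cat": 0.0}
pairing_dual = {"P1": 5.0, "P2": 3.0, "P3": 4.0}
schedule = [
    (6.0, "ann", ["P1"]),   # 6 - 2 - 5 = -1, improving column
    (5.0, "bob", ["P2"]),   # 5 - 1 - 3 = +1
    (4.0, "cat", ["P3"]),   # 4 - 0 - 4 =  0
]

share = negative_share(schedule, crew_dual, pairing_dual)
print(share, share >= 0.3)
```

A constraint in the spirit of Line 4 of Algorithm 3 would then require this share to reach an empirically chosen threshold, while the remaining rosters only need to complete the schedule feasibly.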

Line 5 of Algorithm 3 performs an LNS search with few deviations regarding the solution
provided by CGA. The pairing with the maximum dual is assigned to the crew member with the maximum
dual as long as this crew member’s reduced costs are not guaranteed to be negative already.
Again, our search method of choice is mEDS.

Fig. 5.6: Data set with 65 crew members and 959 pairings.
[Plot: cost versus time [sec]; curves: LNS-HTS, HTS, Hybrid.]

5.4 Numerical Results

To demonstrate the superiority of combined approaches integrating CP and OR techniques, we
applied the hybrid algorithms as presented to real-world Crew Assignment Problems (see Section 5.3.1).
We applied each method integrating HTS and CGA to the airline cases that motivated
its development. All algorithms were implemented in C++ on top of ILOG SOLVER [120] and
ILOG CPLEX [116]. The first integration strategy was applied to two monthly data sets from
company A. Experiments for this case were performed on a 640 MB, 296 MHz SUN UltraSparc-II,
with a time limit of 120000 seconds.³

The efficiency of our algorithm improves the production system which company A used at

the time when this work was done. Figure 5.6 is a cost (i.e., dissatisfaction) versus time graph

showing the performance of the hybrid and the pure HTS methods applied on a monthly data set

containing 959 pairings and 65 crew members. The problem is stated as minimization problem.

The curve marked “LNS-HTS” corresponds to a hasty strategy in which, after one solution is

obtained, LNS is used to achieve some good solutions quickly. The “HTS” curve shows a more

mature strategy, where the search finds several good solutions before LNS is applied to locally

optimize them. The curve marked “hybrid” shows the performance of the hybrid approach, which

clearly outperforms both. Interestingly, the pure CGA cannot detect any feasible solution at all.

Within 120000 seconds, it is not able to remove all dummy columns from the solution, i.e., the

original master problem without dummy columns still is infeasible.

In these specific experiments, for exhibition purposes only, we call the HTS strategy in Line 1

of Algorithm 1 in order to show that the hybrid has the best performance regardless of the start-

³ Curves stopping before this threshold indicate that no better solution was found from the moment corresponding

to the end of the curve until the time limit was reached.


[Plot: cost (dissatisfaction), 600000 to 1500000, versus time, 0 to 120000 seconds; curves: LNS-HTS, HTS, Hybrid.]

Fig. 5.7: Data set with 50 crew members and 766 pairings.

up phase. That is the reason why “LNS-HTS” outperforms “hybrid” in the beginning. Of course,

we repeat that a reasonable choice for the start-up phase of Algorithm 1 would be a strategy

more like “LNS-HTS”. This strategy is used in the experiments of Figure 5.7, which shows the

performance of the same methods on another monthly data set of company A containing 766

pairings and 50 crew members.

The following set of experiments is carried out in order to investigate the second way of

integration. Experiments for this case were performed on a 128 MB, 143 MHz SUN UltraSparc,

with a time limit of 20000 or 70000 seconds depending on the problem size. Figures 5.8 and 5.9

show the cost versus time plots for CGA, HTS, and the second, so-called consolidated, approach

for data sets with 7 crew members and 129 pairings, and 30 crew members and 279 pairings,

respectively.

The plots depict the expected behavior of CGA and HTS. CGA steadily optimizes the ob-

jective, but the quality of the initial solution is poor. Moreover, the time needed to find a first

solution grows with the problem size. On the other hand, HTS finds relatively good solutions

quickly by using heuristic information, but soon gets stuck. The consolidated approach ben-

efits from both approaches: it finds good solutions quickly because of HTS and then steadily

continues to refine the solutions with the help of CGA.

It can also be seen that the integrated approach is slower than HTS early in the experiments.

During that time, the hybrid approach is using the HTS module to create an initial set of columns

according to the start-up heuristic. The reason why HTS is slower in the consolidated case is

that the goal is not to find better and better solutions, since the main optimization burden lies on

the CGA side. Instead, HTS tries to find diverse rosters, which help CGA to find better

solutions subsequently.


[Plot: cost, 200 to 1000, versus time, 0 to 20000 seconds; curves: CGA, HTS, Hybrid.]

Fig. 5.8: Data set with 7 crew members and 129 pairings.

[Plot: cost, 50000 to 250000, versus time, 0 to 70000 seconds; curves: CGA, HTS, Hybrid.]

Fig. 5.9: Data set with 30 crew members and 279 pairings.


The experiments regarding the second way of integration show that it is always useful to

assign the task of finding a set of initial solutions to the HTS approach. The best number of solu-

tions computed initially depends on the rule set as well as on the characteristics of the instance.

Assigning the main optimization burden to CGA is the default choice, as it views the problem

globally taking into account all variables and constraints at a time. If minor local adjustments can

lead to quality improvements, then having HTS perform LNS searches throughout the process is

cost-effective. Moreover, if the column generation process gets stuck, i.e., if a significant number

of columns with negative reduced costs proves not to be combinable into an IP solution, then having

HTS generate solutions incorporating columns with negative reduced costs is cost-effective, too.

The numerical results clearly show that each hybrid approach is successful on the airline case

on which it is applied in our experiments. The question that arises is whether the two hybrids

can generally be combined or not.

We believe that orthogonality generally holds: A meta-hybrid could start off by having the

HTS construct a set of solutions out of which diverse and feasibly combinable columns can be

extracted. Then the CGA approach can be used to improve a relaxed version of the problem,

which is repaired by the HTS approach.

We found that whether or not the use of one of the hybrid approaches we presented can speed

up the computation of a good solution is problem dependent:



• Of course, the first hybrid can only be applied profitably if the master problem is hard

enough to justify the use of a relaxation that must be repaired at some point. Regarding

airline case B, this precondition is not fulfilled, which is why we cannot apply hybrid 1 on

this case.



• Using initial solutions provided by the HTS approach in order to speed up the starting

phase of CGA only pays off when the CGA approach alone has difficulties in driving

dummy columns out of the basis or if it spends too much time on this phase of the process.

This is not the case for airline case A, which is why hybrid 2 cannot be used profitably

there.

We conclude that generally the two hybrids can be combined, but the usefulness of a meta-

hybrid is problem dependent. Its tuning heavily relies on inherent problem properties, which

might not be known a priori.

5.5 Summary

For the CAP, we have shown how the concept of CP-based column generation that we presented

in Section 3.1 works in practice. The sub-problem of roster generation can be viewed as a


Constrained Shortest Path Problem. We applied the filtering algorithms that were developed

in Section 2.2 and gave a real-world empirical evaluation that showed the positive influence of

cost-based filtering, especially when using an efficient, incremental implementation.

Although the column generation approach works reasonably well, we found that it suffers from

two major drawbacks: dummy costs and in-combinable rosters. Therefore, we presented a direct

CP approach and merged the two together. We showed how methods from CP and OR can help

each other to overcome their fundamental weak points.

While OR methods view a problem globally and show a good ability to detect promising

regions of the search space, CP methods can efficiently handle feasibility problems and are well

suited to resolve local conflicts. The first way of integration that we proposed uses the CP-

based column generation approach (CGA) to compute cost-efficient yet relaxed solutions to the

problem, and then resolves conflicts of overcovered pairings by applying a heuristic CP tree

search (HTS). The synergy effects are particularly visible if a lot of work has to be grouped

in relatively few partitions. Then column generation alone often fails to generate combinable

rosters, and the use of HTS as a repairing module helps a lot to increase the overall performance.

The second way of integration that we introduced concerns the use of dual values. We showed

how column generation approaches can profit from CP via the computation of diverse combin-

able initial columns. On the other hand, the use of dual information in a CP-based heuristic tree

search has proven to be very efficient. It allows us to shift the optimization burden onto the OR part

and away from CP, which then can focus on what it was designed for originally, namely to solve

constraint satisfaction problems.

We believe that the ideas discussed in this chapter can be generalized for other problems

as well, especially in connection with (CP-based) column generation. We presented results on

large-scale real-world CAP data, which show clearly visible improvements in performance of the

hybrid approaches compared to the solitary methods.

Chapter 6

Automatic Recording

In Chapter 3, we have seen that the invocation of an optimization constraint is likely to become

inefficient when it only represents a partial view of the entire problem. As a consequence, the

bounds used for domain filtering are no longer accurate, which renders the propagation

algorithm ineffective. We have shown how problem decomposition can help to overcome this

problem by linking optimization constraints via the objective rather than by the common inter-

play via variable domains only.

Implicitly we assumed that many real-world problems can actually be decomposed naturally

into two or more basic substructures. In this chapter, we introduce the Automatic Recording

Problem (ARP) [139], which is an example of such a composed problem. The ARP can be viewed

as a combination of a Knapsack Problem (see Section 2.5) and a Maximum Weighted Stable Set

Problem (see Section 2.3) on an interval graph. For this example, we show the benefits of linking

a knapsack and a weighted stable set constraint via CP-based Lagrangian relaxation.

The work presented in this chapter was published in [188, 189, 190]. It is structured as

follows: In Section 6.1, we formally introduce the ARP. Then, in Section 6.2, the concept of CP-

based Lagrangian relaxation is applied to the problem. Finally, in Section 6.3, we give numerical

results by evaluating the practical performance of different combined filtering algorithms for the

ARP.

6.1 The Automatic Recording Problem

The technology of digital television offers new possibilities for individualized services that can-

not be provided by current analog broadcasts. Additional information like classification of con-

tent, or starting and ending times can be submitted within the digital broadcast stream. With

this information at hand, new services can be provided that make use of individual profiles and

maximize customer satisfaction.


[Figure: schematic of a recorder with a capacity of 40h recording that is filled over time with programs (Football, James Bond, Formula 1, Mickey Mouse, News) selected from channels broadcasting, among others, News, US Open, Music, Teletubbies, Star Trek, Star Wars, and High Noon. Annotations: 10h of MPEG-2 require 18GB; 20-200 digital TV channels; content metadata encoded in broadcast stream; user profile known.]

Fig. 6.1: The automatic recording scenario.

One service which is available already today [9, 208] is an "intelligent" digital video recorder

that is aware of its user’s preferences and records automatically. The recorder tries to match a

given user profile with the information submitted by the different TV channels. E.g., a user

may be interested in thrillers, the more recent the better. The digital video recorder is supposed

to record programs such that the user’s satisfaction is maximized. As the number of channels

may be enormous (more than 100 digital channels are possible), a service that automatically

provides an individual selection is highly appreciated and subject of current research activities

(for example within projects like UP-TV [210] funded by the European Union or the TV-Anytime

Forum).

In this context, two restrictions have to be met. First, the storage capacity is limited (10 hours

of MPEG-2 video need about 18 GB). Second, only one program can be recorded at a time (see

Figure 6.1).

More formally, we define the problem as follows:

Definition 6.1 Let $n \in \mathbb{N}$, $V = \{1, \dots, n\}$ the set of programs, $\mathrm{start}(i) < \mathrm{end}(i)$ for all $i \in V$ the corresponding starting and ending times, $w = (w_i)_{1 \le i \le n} \in \mathbb{N}^n$ the storage requirements, $K \in \mathbb{N}$ the storage capacity, and $p = (p_i)_{1 \le i \le n} \in \mathbb{N}^n$ the profit vector.

We say that the interval $I_i := [\mathrm{start}(i), \mathrm{end}(i)]$ corresponds to program $i \in V$, and call two programs $i, j \in V$ overlapping if their corresponding intervals overlap, i.e., $I_i \cap I_j \neq \emptyset$. For $X \subseteq V$ we call $p_X := \sum_{i \in X} p_i$ the user satisfaction (with respect to $X$).

The Automatic Recording Problem (ARP) then is to find a subset $X \subseteq V$ such that

(a) $X$ can be stored within the given disc size, i.e., $\sum_{i \in X} w_i \le K$.

(b) At most one program is allowed to be recorded at a time, i.e., $I_i \cap I_j = \emptyset$ for all $i, j \in X$, $i \neq j$.

(c) $X$ maximizes the user satisfaction, i.e., $p_X \ge p_Y$ for all $Y \subseteq V$, $Y$ respecting (a) and (b).
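To make the definition concrete, the following minimal Python sketch checks restrictions (a) and (b) for a candidate set and finds an optimal set by plain enumeration. The function names and the four-program toy instance are illustrative assumptions, not part of the original work; Python indices are 0-based, and intervals are treated as closed.

```python
from itertools import combinations

def overlaps(a, b):
    """Closed intervals (start, end) overlap iff neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]

def is_feasible(X, start, end, w, K):
    """Check restrictions (a) and (b) for a collection X of program indices."""
    if sum(w[i] for i in X) > K:                       # (a) storage capacity
        return False
    return not any(overlaps((start[i], end[i]), (start[j], end[j]))
                   for i, j in combinations(X, 2))     # (b) one program at a time

def brute_force_arp(start, end, w, p, K):
    """Enumerate all subsets (exponential; only sensible for tiny instances)."""
    n = len(p)
    best, best_set = 0, ()
    for r in range(n + 1):
        for X in combinations(range(n), r):
            if is_feasible(X, start, end, w, K):
                profit = sum(p[i] for i in X)          # user satisfaction p_X
                if profit > best:
                    best, best_set = profit, X
    return best, best_set

# Hypothetical toy instance: four programs, disc capacity 15.
start, end = [0, 5, 10, 12], [4, 11, 15, 20]
w, p, K = [5, 7, 6, 9], [3, 8, 4, 6], 15
print(brute_force_arp(start, end, w, p, K))
```

On this toy instance, programs 1 and 3 do not overlap but together exceed the capacity, so the best feasible choice combines programs 0 and 1.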


6.1.1 On the Complexity of the Automatic Recording Problem

Obviously, even if all programs are pairwise non-overlapping (i.e., if restriction (b) is obsolete),

it remains to solve a Knapsack Problem. Thus, the ARP is NP-hard. Let $p_{\max} := \max\{p_i \mid 1 \le i \le n\}$. We develop a pseudo-polynomial algorithm running in time $\Theta(n^2 p_{\max})$ that will be used later to derive a fully polynomial time approximation scheme (FPTAS) for the ARP.

6.1.1.1 A Dynamic Programming Algorithm

The algorithm we develop in the following is similar to the textbook dynamic programming algorithm for Knapsack Problems. Setting $\bar{\mathbb{N}} := \mathbb{N} \cup \{\infty\}$ and $\psi := n p_{\max} + 1$, we compute a matrix $M = (m_{kl})$ with entries in $\bar{\mathbb{N}}$, $0 \le k < \psi$, $1 \le l \le n$. In $m_{kl}$, we store the minimum knapsack capacity that is needed to achieve a profit of at least $k$ when using pairwise non-overlapping items with index at most $l$ only ($m_{kl} = \infty$ if no such selection exists, in particular if $\sum_{1 \le i \le l} p_i < k$).

Writing $s_i := \mathrm{start}(i)$ and $e_i := \mathrm{end}(i)$ for short, we assume that $V$ is ordered with respect to increasing ending times, i.e., $1 \le i < j \le n$ implies $e_i \le e_j$. Furthermore, let $last_j \in V \cup \{0\}$ denote the last non-overlapping node lower than $j$, i.e.,

\[ e_{last_j} < s_j \quad \text{and} \quad e_i \ge s_j \quad \forall\, last_j < i < j. \]

We set $last_j := 0$ iff no such node exists, i.e., iff $e_1 \ge s_j$. To simplify the notation, let us assume that $m_{k,0} = \infty$ for all $0 < k < \psi$, and $m_{k,l} = 0$ for all $k \le 0$. Then,

\[ m_{kl} = \min\{\, m_{k,l-1},\; m_{k - p_l,\, last_l} + w_l \,\}. \]

The previous recursion equation yields a dynamic programming algorithm: First, we sort the items with respect to their ending times and determine $last_i$ for all $1 \le i \le n$. Both can be done in time $\Theta(n \log n)$. Then we build up the matrix row by row. Finally, we compute $\max\{k \mid m_{kn} \le K\}$. The total running time of this procedure and the memory needed are obviously in $\Theta(n^2 p_{\max})$.
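The recursion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: intervals are closed, column 0 of the table serves as the boundary column, the table is filled column by column, and the function name and toy instance are hypothetical.

```python
import bisect
import math

def arp_dynamic_program(start, end, w, p, K):
    """Maximum profit of pairwise non-overlapping programs with total weight <= K."""
    n = len(p)
    # Sort by increasing ending time; position l (1-based) refers to item order[l - 1].
    order = sorted(range(n), key=lambda i: end[i])
    s = [start[i] for i in order]
    e = [end[i] for i in order]
    ww = [w[i] for i in order]
    pp = [p[i] for i in order]

    # last[l]: number of items ending strictly before s_l, i.e. the largest
    # non-overlapping position below l, or 0 if there is none.
    last = [0] * (n + 1)
    for l in range(1, n + 1):
        last[l] = bisect.bisect_left(e, s[l - 1])

    psi = n * max(pp) + 1 if n else 1
    # m[k][l]: minimum capacity needed to reach profit >= k with items 1..l.
    m = [[math.inf] * (n + 1) for _ in range(psi)]
    for l in range(n + 1):
        m[0][l] = 0                                   # profit >= 0 needs no capacity
    for l in range(1, n + 1):
        for k in range(1, psi):
            rest = max(k - pp[l - 1], 0)              # entries with k <= 0 count as 0
            take = m[rest][last[l]] + ww[l - 1]       # record item l
            m[k][l] = min(m[k][l - 1], take)          # or skip it
    return max(k for k in range(psi) if m[k][n] <= K)

# Hypothetical toy instance: four programs, disc capacity 15.
start, end = [0, 5, 10, 12], [4, 11, 15, 20]
w, p = [5, 7, 6, 9], [3, 8, 4, 6]
print(arp_dynamic_program(start, end, w, p, 15))
```

The binary search for the last non-overlapping item relies on the items being sorted by ending time, which is why the sort comes first.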



6.1.1.2 A Fully Polynomial Time Approximation Scheme

As for Knapsack Problems, we can use the dynamic programming algorithm to derive an FPTAS

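by scaling the profit vector. Given $\varepsilon > 0$, we set $S := \varepsilon p_{\max} / n$, and $\tilde{p}_i := \lfloor p_i / S \rfloor$. Then, $\tilde{p}_{\max} = \lfloor n / \varepsilon \rfloor$. Thus, the running time of our dynamic programming algorithm applied with the scaled profit vector is in $\Theta(n^3 / \varepsilon)$.

Now let us study the error that we make by using $\tilde{p}$ instead of $p$. Let $x \in \{0,1\}^n$ denote an optimal solution with respect to $p$, and $\tilde{x} \in \{0,1\}^n$ an optimal solution with respect to $\tilde{p}$. Then,

\[ p^T \tilde{x} \;\ge\; S\, \tilde{p}^T \tilde{x} \;\ge\; S\, \tilde{p}^T x \;\ge\; p^T x - S \sum_{1 \le i \le n} x_i \;\ge\; p^T x - nS. \tag{6.1} \]

Therefore, assuming that every single program fits on the disc, so that $p^T x \ge p_{\max}$,

\[ \frac{p^T x - p^T \tilde{x}}{p^T x} \;\le\; \frac{nS}{p_{\max}} \;=\; \varepsilon, \]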

i.e., the relative error is at most ε. Thus, we have found an FPTAS for the ARP.
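The scaling argument can be tried out with the following hedged Python sketch. It uses a dictionary-based variant of the dynamic program (exact profit value mapped to minimum weight) so that the chosen set can be returned; this variant, the function names, and the toy instance are assumptions made for illustration only. The guarantee checked at the end presumes that every single program fits on the disc.

```python
import bisect
import math

def solve_arp(start, end, w, p, K):
    """Exact DP over profit values; returns (profit, chosen indices) w.r.t. p."""
    n = len(p)
    order = sorted(range(n), key=lambda i: end[i])
    ends = [end[i] for i in order]
    # tables[l]: exact profit -> (min weight, chosen items) using the first l items
    tables = [{0: (0, ())}]
    for l, i in enumerate(order, start=1):
        last = bisect.bisect_left(ends, start[i])     # items ending strictly before start[i]
        table = dict(tables[l - 1])                   # option: skip item i
        for k, (wt, chosen) in tables[last].items():  # option: record item i
            cand = (wt + w[i], chosen + (i,))
            if cand[0] <= K and cand[0] < table.get(k + p[i], (math.inf,))[0]:
                table[k + p[i]] = cand
        tables.append(table)
    best = max(tables[n])
    return best, tables[n][best][1]

def fptas_arp(start, end, w, p, K, eps):
    """Solve with scaled profits floor(p_i / S), S = eps * p_max / n; evaluate with p."""
    n, p_max = len(p), max(p)
    S = eps * p_max / n
    scaled = [math.floor(pi / S) for pi in p]
    _, chosen = solve_arp(start, end, w, scaled, K)
    return sum(p[i] for i in chosen), chosen

# Hypothetical toy instance: four programs, disc capacity 15, optimum 11.
start, end = [0, 5, 10, 12], [4, 11, 15, 20]
w, p, K = [5, 7, 6, 9], [3, 8, 4, 6], 15
optimum, _ = solve_arp(start, end, w, p, K)
for eps in (0.1, 0.5, 0.9):
    value, chosen = fptas_arp(start, end, w, p, K, eps)
    print(eps, value, chosen, value >= (1 - eps) * optimum)
```

Larger values of eps shrink the profit range and hence the table size, at the price of a weaker guarantee.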

6.1.2 A Mathematical Programming Formulation

Since the problem of finding and proving optimal solutions is of interest in its own right, and also

since the FPTAS we developed requires far too much memory to be applicable in practice, we

focus on exact approaches for solving the ARP. Using mathematical programming, the problem

can be stated as an integer linear program (IP):

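\[
\begin{array}{lll}
\text{Maximize} & p^T x & \text{(IP1)} \\
\text{subject to} & x_i + x_j \le 1 & \forall\, 1 \le i < j \le n,\ I_i \cap I_j \neq \emptyset \\
& \sum_{1 \le i \le n} w_i x_i \le K & \\
& x \in \{0,1\}^n &
\end{array}
\]

The objective function maximizes the user satisfaction. Constraints of the form $x_i + x_j \le 1$ ensure that for overlapping intervals $I_i \cap I_j \neq \emptyset$, at most one program $i$ or $j$ can be selected. Memory restrictions are enforced by the knapsack constraint. The formulation can be tightened when replacing the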

non-overlapping constraints by maximal clique constraints (see Section 2.3):

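Denote the set of maximal conflict cliques by $M := \{C_1, \dots, C_m\} \subseteq 2^V$. Then restrictions of the form $\sum_{i \in C_p} x_i \le 1$ for all $1 \le p \le m$ imply that $x_i + x_j \le 1$ for all nodes $i, j \in V$ whose corresponding intervals overlap. On the other hand, if $x_i + x_j \le 1$ for all overlapping intervals, it is also true that $\sum_{i \in C_p} x_i \le 1$ for all $1 \le p \le m$. Thus, IP1 is equivalent to

\[
\begin{array}{lll}
\text{Maximize} & p^T x & \text{(IP2)} \\
\text{subject to} & \sum_{i \in C_p} x_i \le 1 & \forall\, 1 \le p \le m \\
& \sum_{1 \le i \le n} w_i x_i \le K & \\
& x \in \{0,1\}^n &
\end{array}
\]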

Although clique problems are NP-hard on general graphs, finding the maximal cliques of the interval graph that naturally models the non-overlapping constraints on the programs is simple. It can be performed in time $\Theta(n \log n)$ [99], and hence, IP2 can be obtained in polynomial time.
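One common way to obtain the maximal cliques of an interval graph is a sweep over the sorted interval endpoints; the sketch below illustrates that idea and is not necessarily the procedure of [99]. Intervals are assumed to be closed, the function name is hypothetical, and while sorting takes $O(n \log n)$ time, writing out the cliques explicitly costs additional time proportional to their total size.

```python
def maximal_cliques(start, end):
    """Maximal cliques of the interval graph on closed intervals [start[i], end[i]]."""
    events = []
    for i in range(len(start)):
        events.append((start[i], 0, i))   # kind 0: start events sort before end events
        events.append((end[i], 1, i))     # kind 1: end event
    events.sort()

    active, cliques, grown = set(), [], False
    for _, kind, i in events:
        if kind == 0:
            active.add(i)
            grown = True                  # the active set gained a member
        else:
            if grown:                     # maximal right before the first removal
                cliques.append(sorted(active))
                grown = False
            active.remove(i)
    return cliques

# Hypothetical toy instance: programs 1 and 2 overlap, as do programs 2 and 3.
print(maximal_cliques([0, 5, 10, 12], [4, 11, 15, 20]))
```

Each reported clique yields one constraint of IP2, and there are at most $n$ of them because every clique is reported at the first end event after a start event.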

6.1.3 Solving the Resulting Integer Linear Program

Although methods exist that do not split the search space – like cutting plane algorithms, for ex-

ample – to solve a (mixed) integer linear program, branch-and-bound approaches have proven to

be efficient, widely applicable and thus are most commonly used. In every choice point, a bound

based on some (often continuous) relaxation is computed. If that bound is worse than the


objective value $B$ of the incumbent solution, then backtracking occurs. A successful application

of the branch-and-bound paradigm relies heavily on tight bounds that can be computed quickly.

Problem reduction can help to improve the performance of a branch-and-bound search if the fil-

tering algorithm is both effective and efficient. Effective means that it must have an impact, i.e.,

it has to be able to filter many values, whereas efficient means that the routine works

quickly.

The effectiveness of a filtering algorithm mainly depends on the quality of bounds it uses to

estimate the impact of fixing a variable to one of its values. For the ARP, our experiments show

that the continuous relaxation bound yields a good estimate on the solution quality that can be

reached. Thus, it can be used for pruning purposes in a branch-and-bound approach. However,

it is not straightforward to see how this bound could be used for filtering purposes effectively,

that is, other than by probing via full re-optimization, which is inefficient. On the other hand,

domain reductions with respect to reduced-cost information can be done quickly, but are not very

effective. In the following, we will show how CP-based Lagrangian relaxation can help here.

6.2 CP-based Lagrangian Relaxation for the ARP

Using the refined model IP2, the ARP can be viewed as a combination of two simpler optimiza-

tion constraints: a knapsack constraint, and a maximum weighted stable set constraint on an

interval graph. For the knapsack constraint, a filtering algorithm was developed in Section 2.5 that runs in time $\Theta(n \log n)$. Likewise, in Section 2.3, we developed a filtering algorithm for the maximum weighted stable set substructure (WSSP) of the ARP. The algorithm runs in time $\Theta(n \log n)$ or in amortized linear time for $\Omega(\log n)$ incremental propagation calls.

Provided with the two filtering algorithms, we are able to perform domain reduction for the

two natural substructures of the ARP. According to the abstract description in Section 3.2, we

will now tie the two filtering algorithms together:

As the filtering algorithm for the WSSP allows us to incorporate changing objectives at a

low computational cost, we decide to relax the capacity constraint. We introduce a non-negative

Lagrange multiplier $\lambda \ge 0$ and define the Lagrangian sub-problem

\[
\begin{array}{ll}
\text{Maximize} & L(\lambda) = \sum_{1 \le i \le n} (p_i - \lambda w_i)\, x_i + \lambda K \\
\text{subject to} & \sum_{i \in C_p} x_i \le 1 \quad \forall\, 1 \le p \le m \\
& x \in \{0,1\}^n
\end{array}
\]

The Lagrange multiplier problem then is to minimize $L(\lambda)$, such that $\lambda \ge 0$. For every $\lambda \ge 0$, $L(\lambda)$ is a valid upper bound on the objective. Therefore, we can apply cost-based filtering for the weighted stable set constraint on interval graphs every time we solve the Lagrangian sub-problem. For given Lagrangian multipliers $\lambda$, we use dual information $\pi = (\pi_j)_{1 \le j \le m}$ from the

corresponding stable set sub-problem to perform variable fixing with respect to the knapsack


substructure next. Note that the algorithm developed in Section 2.3 provides us with those values

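at essentially no additional cost. By Lagrange relaxing the maximal clique constraints with multipliers $\pi \ge 0$, we obtain a Knapsack Problem. Let $\mu_i := \sum_{j : i \in C_j} \pi_j$ for all $1 \le i \le n$ and $\bar{\pi} := \sum_{1 \le j \le m} \pi_j$. Then the problem is to

\[
\begin{array}{ll}
\text{Maximize} & \sum_{1 \le i \le n} (p_i - \mu_i)\, x_i + \bar{\pi} \\
\text{subject to} & \sum_{1 \le i \le n} w_i x_i \le K \\
& x \in \{0,1\}^n
\end{array}
\]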

Relaxations of this problem again yield a valid upper bound, and we can propagate the knapsack

optimization constraint on the modified objective.
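As an illustration of the bound $L(\lambda)$, the following Python sketch evaluates the Lagrangian sub-problem for a fixed multiplier. It deliberately swaps in the standard weighted interval scheduling recursion instead of the shortest-path based filtering algorithm of Section 2.3, computes the bound only (no filtering, no dual values), and uses a hypothetical function name and toy instance with closed intervals.

```python
import bisect

def lagrangian_bound(start, end, w, p, K, lam):
    """L(lam): maximum weight stable set w.r.t. p_i - lam * w_i, plus lam * K."""
    n = len(p)
    order = sorted(range(n), key=lambda i: end[i])
    ends = [end[i] for i in order]
    opt = [0.0] * (n + 1)                 # opt[l]: best value using the first l items
    for l, i in enumerate(order, start=1):
        last = bisect.bisect_left(ends, start[i])    # items ending strictly before start[i]
        reduced = p[i] - lam * w[i]                  # Lagrangian reduced profit
        opt[l] = max(opt[l - 1], opt[last] + reduced)
    return opt[n] + lam * K

# Hypothetical toy instance whose optimal ARP value is 11.
start, end = [0, 5, 10, 12], [4, 11, 15, 20]
w, p, K = [5, 7, 6, 9], [3, 8, 4, 6], 15
for lam in (0.0, 0.5, 2 / 3, 1.0):
    print(lam, lagrangian_bound(start, end, w, p, K, lam))
```

Every printed value is an upper bound on the optimum of the toy instance; the multiplier 2/3 gives the tightest of the four.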

6.2.1 Implementation Details

We have so far left out some implementation details concerning the choice of the branching

variable and the computation of optimal Lagrangian multipliers $\lambda^*$. In this section, we give some insight into the implementation used for the tests.

We use four different approaches for our experiments: the first is a pure branch-and-bound

algorithm without any problem tightening (referred to as P-0). The second uses the filtering

algorithms for Knapsack and Maximum Weighted Stable Set Problems on the original objec-

tive (P-1). The third and the fourth approach (P-2 and P-3) realize the idea of linking filtering

algorithms for linear optimization constraints via Lagrangian relaxation. P-2 calls for domain

reduction with respect to both substructures just once after the Lagrangian dual has been solved,

whereas P-3 also propagates the maximum weighted stable set constraint during the search for

optimal Lagrange multipliers.

6.2.1.1 Continuous Bound Computation

For pruning, the computation of a linear bound on the objective is needed. P-2 and P-3 obvi-

ously use the objective value corresponding to LB







for this purpose. As the computation via

Lagrangian relaxation with stable set sub-problems turned out to be very efficient, we use that

algorithm for all four approaches.

6.2.1.2 Computation of $\lambda^*$

To determine $\lambda^*$, we use a method to maximize one-dimensional concave functions based on the golden section (applied to $-L$, since $L$ is convex and has to be minimized). We obtain a sequence of $\lambda_k$, $k \in \mathbb{N}$. Let $e_{\max} := \max\{p_i / w_i \mid 1 \le i \le n\}$. Then, for all $\varepsilon > 0$, there exists a constant $c > 0$ such that

\[ |\lambda_k - \lambda^*| < \varepsilon \quad \forall\, k \ge c \log e_{\max}. \]

Thus, after $O(\log e_{\max})$ iterations we can numerically approximate the optimal Lagrange multiplier $\lambda^*$. Each iteration costs amortized linear time for a total of at least $\Omega(\log n)$ iterations in all search nodes. Finally, in every choice point we add $O(n \log n)$ for the succeeding knapsack filtering algorithm. Therefore, the integrated filtering algorithm for the tight global Lagrangian relaxation bound runs in time $O(n \log e_{\max} + n \log n)$.

Notice that the Lagrangian sub-problem is totally unimodular, i.e., it exhibits the integral-

ity property. Thus, the Lagrangian relaxation bound has the same value as the bound that is

determined by a linear continuous relaxation.
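A golden-section search of the kind referred to above can be sketched as follows. The sketch minimizes a convex function directly, which is equivalent to maximizing its concave negative; the function name, the tolerance, and the piecewise linear stand-in for $L(\lambda)$ are illustrative assumptions.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi]; one new evaluation per iteration."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                            # a minimizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                                  # a minimizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

# Hypothetical stand-in for L(lambda): a maximum of linear functions, hence convex.
def toy_dual(lam):
    return max(17 - 6 * lam, 14 - lam, 11 + 3 * lam, 8 + 8 * lam)

lam_star = golden_section_minimize(toy_dual, 0.0, 8 / 7)
print(lam_star, toy_dual(lam_star))
```

The search interval shrinks by a constant factor per iteration, so the number of iterations grows only logarithmically with the ratio of the initial interval length to the tolerance.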

6.2.1.3 Branching Variable Selection

Using the shortest path interpretation of the weighted stable set constraint on interval graphs (see

Section 2.3.3), all algorithms choose the first node on the shortest path¹ with maximal efficiency

$p_i / w_i$ as branching variable.

6.3 Numerical Results

All experiments are performed on a PC with an AMD-Athlon 600 MHz processor and 256 MB

RAM running Linux 2.2. The implementation was done in C++ and compiled by gcc 2.95 with

maximal optimization (-O3). The algorithms are built on top of ILOG SOLVER 5.0 [121].

6.3.1 Test Instance Generation

The experiments are conducted on several sets of randomly generated test instances. To achieve

scenarios which we believe to be of relevance for the real-life application, each set of instances is

generated by specifying the time horizon (half a day to 3 days) and the number of channels (20 –

100). The generator sequentially fills the channels by starting each new program one minute after

the last. For each new program a class is chosen randomly. That class then determines

the interval from which the length is chosen randomly. We consider either 3, 5, or 7 different

classes. The lengths of programs in the classes vary from $5 \pm 2$ minutes to $150 \pm 50$ minutes. The

disc space necessary to store each program equals its length, and the storage capacity is randomly

chosen as 45%–55% of the entire time horizon.

To achieve a complete instance, it remains to choose the associated profits of programs. For

the experiments, we use four different strategies for the computation of an objective function:



• For the class usefulness (CU) instances, the associated profit values are determined with

respect to the chosen class, where the associated profit values of a class can vary between

zero and $600 \pm 200$.

¹ According to the optimal reduced-costs objective $\sum_{1 \le i \le n} (p_i - \lambda^* w_i)\, x_i$.




• In the time correlated (TC) instances, each 15 minute time interval is assigned a random

value between 0 and 10. Then the profit of a program is determined as the sum of all

intervals that program has a non-empty intersection with.

• For the weakly correlated (TWC) instances, that value is perturbed by a noise of $\pm 20\%$.

• Finally, in the subset sum (SSS) data, the profit of a program simply equals its length.

The different objectives try to emulate some effects that we believe to hold for real-life instances.

In the CU instances for example, programs of the same class cause similar attractions. On the

other hand, the TC and TWC instances cause many conflicts regarding the choice of programs

that are broadcast at the same time. The assumption that programs overlapping in time

cause similar attractions is justified by the fact that TV channels are planning their broadcasts

according to the behavior of target groups. To a large extent, these target groups are determined

by and vary with the time of the day. However, the different strategies that we consider are only

intuitively justified. The feasibility of our approach for real-life instances can only be concluded

from the fact that we achieve similar results for all choices of the objective.

We identify a test set by giving the parameters the generator is started with. According to the

previous description those parameters are: The time horizon in minutes, the number of channels,

the number of different classes [3, 5, or 7], and the objective type [CU, TC, TWC, or SSS].
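The generation scheme can be sketched roughly as follows. This is a hedged reconstruction for illustration, not the generator used for the experiments: the class parameters, the uniform choice of lengths, the handling of programs that would exceed the time horizon, and all function names are assumptions.

```python
import random

def generate_instance(horizon, channels, classes, rng):
    """classes: list of (mean_length, deviation) pairs in minutes (hypothetical values)."""
    start, end, cls = [], [], []
    for _ in range(channels):
        t = 0
        while True:
            c = rng.randrange(len(classes))          # random class for the next program
            mean, dev = classes[c]
            length = rng.randint(mean - dev, mean + dev)
            if t + length > horizon:                 # assumption: stop at the horizon
                break
            start.append(t)
            end.append(t + length)
            cls.append(c)
            t += length + 1                          # next program starts one minute later
    w = [e - s for s, e in zip(start, end)]          # disc space equals program length
    K = round(rng.uniform(0.45, 0.55) * horizon)     # capacity: 45%-55% of the horizon
    return start, end, cls, w, K

def tc_profits(start, end, horizon, rng, slot=15):
    """Time correlated (TC) profits: sum of the slot values a program intersects."""
    values = [rng.randint(0, 10) for _ in range(horizon // slot + 1)]
    return [sum(values[s // slot : e // slot + 1]) for s, e in zip(start, end)]

rng = random.Random(0)                               # fixed seed for reproducibility
start, end, cls, w, K = generate_instance(720, 20, [(5, 2), (30, 10), (150, 50)], rng)
profits = tc_profits(start, end, 720, rng)
print(len(start), K)
```

For the subset sum (SSS) variant, the profit vector would simply be a copy of `w`.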

6.3.2 Experimental Evaluation

In the following, we present our numerical results. The experiments consist of 50 random instances

per test set. For each instance, the approaches P-0 to P-3 are run to find and prove an

optimal solution. We give running times and the number of choice points needed for an exhaus-

tive search. All approaches find a first solution rather early in the search. Therefore, the main

work consists in the proof of optimality rather than in the construction of the solution. We con-

clude that the branching variable selection that we use efficiently supports finding near-optimal

solutions in a non-exhaustive search.

Table 6.1 shows the performance (time and choice points) of all four approaches on test sets

generated with a time horizon of 12 hours and 20 channels using 5 different program classes and

CU, TC, and TWC to determine the objective function.

When comparing the different types of objectives, we find that, for all four approaches, the

TC instances are much harder than CU and TWC, which are comparably easy to solve. This is

a general observation we made for all kinds of different test sets using 3 or 7 classes as well as

different time horizons and numbers of channels.

We further observe that a higher degree of integration between the two optimization con-

straints yields partially drastic reductions in the number of choice points of up to a factor of


test set              P-0                P-1                P-2                P-3
5, 12h, 20 ch      time    nodes      time    nodes      time    nodes      time    nodes
CU                  9.5    519.4       5.2    295.5       7.0    198.5       8.6    184.0
TC                441.8  40155.7      67.2   4525.8      18.0    696.5      28.4    575.6
TWC                15.5   1136.1      12.0    802.6       6.2    339.9       9.5    321.2

Tab. 6.1: The table compares the different approaches on three different test sets with 5 classes, 12 hours, 20 channels and different objectives. The time (in seconds) and the number of choice points are averages for 50 randomly generated instances for each objective. The average number of programs per instance is between 607.6 and 612.6.

about 70 on the difficult time correlated instances. Regarding the computation time, there is a

trade-off between the reduction of choice points and the time spent per choice point (TpCP). The

TC and TWC instances show that P-2 can outperform P-3 because of the shorter TpCP that is

needed for that degree of integration. When comparing P-1 and P-3 on the CU instances, the

reduction of choice points is not big enough to justify the longer TpCP needed, and P-1 is the

approach that takes the least computation time.
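The trade-off can be made explicit by dividing the average times by the average numbers of choice points. The short Python sketch below does this for the values of Table 6.1; note that a ratio of averages is only a rough proxy for the per-instance TpCP, and the variable and function names are illustrative.

```python
# (average time in seconds, average number of choice points), copied from Table 6.1
table_6_1 = {
    "CU":  {"P-0": (9.5, 519.4),     "P-1": (5.2, 295.5),
            "P-2": (7.0, 198.5),     "P-3": (8.6, 184.0)},
    "TC":  {"P-0": (441.8, 40155.7), "P-1": (67.2, 4525.8),
            "P-2": (18.0, 696.5),    "P-3": (28.4, 575.6)},
    "TWC": {"P-0": (15.5, 1136.1),   "P-1": (12.0, 802.6),
            "P-2": (6.2, 339.9),     "P-3": (9.5, 321.2)},
}

def time_per_choice_point(table):
    """Seconds per choice point for every objective type and approach."""
    return {obj: {approach: t / nodes for approach, (t, nodes) in row.items()}
            for obj, row in table.items()}

tpcp = time_per_choice_point(table_6_1)
for obj, row in tpcp.items():
    print(obj, {approach: f"{1000 * v:.1f} ms" for approach, v in row.items()})
```

On the TC instances, for example, P-3 spends roughly twice as long per choice point as P-2, which its smaller number of choice points does not make up for.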

Generally, a bigger reduction of choice points is more likely to pay off when the absolute

TpCP needed is rather high. Particularly, this holds for applications where additional constraints

are propagated on top of the objective constraint itself. For the ARP, the optimization constraint is

the only active constraint. Therefore, to justify the worse TpCP caused by the more complicated

propagation algorithm, a substantial reduction of the number of choice points must be achieved.

The P-2 and P-3 approaches obtain a sufficient reduction of choice points on the more difficult

TC test sets, and also for larger test instances:

Table 6.2 shows the performances of all approaches on test instances that are generated using

5 different program classes with different time horizons and numbers of channels. The objective

is computed according to the chosen classes, i.e., according to CU. As expected, for the larger

instances with a time horizon of 72 hours (3 days) and 20 channels, the two linking approaches

P-2 and P-3 outperform P-1 roughly by a factor of 4 regarding the number of choice points

and almost a factor of 2 with respect to the computation time needed. The minima, maxima

and standard deviations prove that the average numbers we present are not biased by very few

outliers, but represent meaningful values for the evaluation of the algorithms' performance.

In Table 6.3, we compare the different approaches on a variety of very different test in-

stances that are generated using different parameters and objective functions. Again, relevant

and partially substantial reductions in the number of choice points can be obtained by CP-based

Lagrangian relaxation realized in P-2 and P-3.


test set                 P-0                    P-1                    P-2                    P-3
5 CU                 time      nodes        time      nodes        time      nodes        time      nodes
12h 20ch  avg         2.4      238.3         1.3      129.9         1.4      108.6         2.1       89.9
          min         0.1        5.0         0.1        5.0         0.1        5.0         0.1        5.0
          max        25.4     2216.0        15.1     1531.0        12.9     1269.0        24.2     1045.0
          std         4.2      396.8         2.3      243.6         2.5      206.5         4.0      173.5
12h 50ch  avg        16.5      741.9         8.2      370.1         9.9      272.0        14.2      250.5
          min         0.1        3.0         0.2        3.0         0.2        3.0         0.2        3.0
          max       167.3     7058.0        82.6     3615.0       154.1     2664.0       156.5     2664.0
          std        30.8     1377.6        14.5      658.9        22.9      506.9        26.8      465.0
24h 20ch  avg         9.5      519.4         5.2      295.5         7.0      198.5         8.6      184.0
          min         0.5       21.0         0.4       13.0         0.3       10.0         0.5       10.0
          max        87.5     4416.0        59.6     3762.0        93.4     2094.0        98.1     2067.0
          std        15.2      829.9         9.3      587.6        17.2      377.8        17.7      374.9
24h 50ch  avg      1104.9    24301.4       585.2    14219.3       883.3     8371.9       921.5     8286.8
          min         0.8       12.0         1.0       12.0         0.7        9.0         1.1        9.0
          max     31045.5   675235.0     15625.2   368440.0     33281.3   292753.0     31573.5   292753.0
          std      4448.4    97121.0      2272.6    54288.6      4662.0    41139.4      4441.2    41121.2
72h 20ch  avg      2627.7    40901.5      1786.7    27662.0       920.4     6674.7       990.9     6514.7
          min         2.0       29.0         2.4       29.0         3.0       29.0         5.5       29.0
          max     32751.9   460350.0     30520.3   412421.0     11766.0    90397.0     13724.7    89589.0
          std      5514.7    85325.8      4543.1    65188.4      1996.8    14515.9      2189.7    14379.4

Tab. 6.2: The table shows a comparison of the performance of the different approaches on 5 test sets with 5 classes and objective CU for various time horizons (in hours) and channel numbers (ch). The rows labeled avg give the average time (in seconds) and the average number of nodes of 50 randomly generated instances in each test set. The rows below give the minimum (min), maximum (max), and standard deviation (std) for these 50 instances. The average number of programs per instance is 315.2 for (12h/20ch), 793.5 (12h/50ch), 607.6 (24h/20ch), 1512.1 (24h/50ch), and 1782.6 (72h/20ch), respectively.

6.3. Numerical Results


test set                  P-0                P-1                P-2                P-3
                       time     nodes     time     nodes     time     nodes     time     nodes
3 CU   120h 20ch     5210.3   60839.2   1734.4   30676.4    455.9    3433.9    490.1    2945.1
5 TWC   72h 20ch    11600.1  293386.8   1526.8   35718.4    261.0    3683.6    411.7    3134.5
7 TC    24h 50ch     8349.0  250367.1   4066.3  105572.6    403.4    6235.4    533.0    4219.1

Tab. 6.3: The table illustrates the performance of the different approaches on very different benchmark

classes. Each test set contains 50 randomly generated problem instances. There is an average of 1956.7

programs in the 120h/20ch test set, 1782.6 programs in test set 72h/20ch, and 1423.3 programs in test set

24h/50ch.

test set                 P-0            P-1            P-2            P-3
5 SSS           p     time  nodes    time  nodes    time  nodes    time  nodes
12h 20ch    316.4      0.2   23.1     0.2   15.2     0.2   15.2     0.3   15.2
12h 50ch    792.4      0.5   18.8     0.5   13.9     0.5   13.9     0.8   13.9
24h 20ch    611.2      0.5   26.9     0.6   21.9     0.6   21.9     1.0   21.9
24h 50ch   1527.3      1.6   29.1     1.8   23.2     1.7   23.2     3.0   23.2
72h 20ch   1778.3      3.5   53.4     4.6   51.7     5.0   51.7     8.7   51.7
72h 50ch   4464.3     11.0   54.4    14.4   52.8    15.4   52.8    27.2   52.8

Tab. 6.4: The table shows the performance of the different approaches on subset sum data sets ranging

from 12 hours and 20 channels up to 72 hours and 50 channels. The average number of programs in the

50 randomly generated instances per test set is given as parameter p.

So far, we have left out comparisons regarding the choice of the objective according to SSS.

Table 6.4 shows the results obtained for a collection of very different test sets generated with

SSS. Two facts stand out: first, a comparison with Table 6.1 shows that the SSS instances are

much easier to solve than for other choices of the objective. Second, P-1 achieves only a slight

reduction of choice points compared to P-0 that cannot be improved by P-2 and P-3 at all.

The effect is not surprising: We considered the somewhat artificial SSS test sets because

of their obvious relation to subset sum benchmarks for Knapsack Problems. Due to the equal
efficiency $p_i / w_i$ of all programs, the knapsack optimization constraint has great difficulty
including or excluding programs. Therefore, the knapsack constraint is not effective, and the burden of
domain reduction lies on the WSSP optimization constraint only. In total, using the optimization
constraint for pruning purposes only is most time efficient here.


test set    avg. no of         P-0              P-1              P-2              P-3
3 CU        programs       time   nodes     time   nodes     time   nodes     time   nodes
12h 100ch     1048.3       18.8   680.8      9.1   326.3      5.1   108.6      7.6    95.5
24h 50ch      1013.4       37.8  1461.1     19.5   734.4     11.1   187.9     13.8   178.5
72h 20ch      1175.0      177.4  3003.1    111.7  1897.2     36.6   468.4     42.9   401.2

Tab. 6.5: The table compares the performance of the different algorithms on benchmark sets with 3 classes

and objective CU, each containing 50 randomly generated problem instances with roughly 1000 programs

on average.

Finally, we investigate the impact of the number of channels. Table 6.5 shows a comparison

of three different test sets that are generated using 3 different program classes and CU objec-

tives. All instances have a similar size and roughly contain 1000 programs. We observe that

the instances become more difficult to solve for all approaches when the number of channels is

decreasing. This surprising result may be caused by the fact that a smaller number of channels

increases the relative importance of the knapsack optimization constraint that is inherently more

difficult than the WSSP constraint. However, a sound answer to that question can only be given

by further investigation. It remains to note that we observe a converse behavior for TC data sets:

the instances become more difficult the more channels are involved which is obviously caused

by many temporally conflicting programs of similar value.

6.4 Summary and Future Work

We have shown that the Automatic Recording Problem can be viewed as a composition of a

Knapsack Problem and a Weighted Stable Set Problem on an interval graph. By introducing an

FPTAS, we showed that the problem, though being NP-hard, can be approximated with arbitrary

approximation quality. Then, using an exact tree search approach, we showed the benefits of

linking filtering algorithms via Lagrangian relaxation. The numerical results we obtained show

a significant improvement achieved by CP-based Lagrangian relaxation with respect to the com-

putation time and the number of choice points.

There are several natural extensions for the ARP. For example, a digital video recorder could

have more than one recording unit which allows the recording of a limited number of channels

simultaneously. In an IP context, this modification can be introduced easily. For the exact ap-

proach presented, a fast and efficient filtering algorithm for this type of relaxed non-overlapping

constraint is subject to further research.

Chapter 7

Capacitated Network Design

In the last chapter, we presented the Automatic Recording Problem (ARP) as an example of a discrete
optimization problem that can be naturally decomposed into two basic substructures. Now,
we want to develop a branch & bound approach for the Capacitated Network Design Problem,
which can also be viewed as a conglomerate of two simple constraint families defined by the mass-balance
and the bundle constraints. We investigate both induced Lagrangian sub-problems and
develop problem reduction algorithms for them. As we shall see, applying the strongest version
of CP-based Lagrangian relaxation is inefficient for this problem. Therefore, we generalize the
idea of cost-based domain filtering, which may be viewed as adding unary local constraints.
An algorithm is presented that adds local cardinality cuts based on Lagrangian relaxation. Their
usefulness is evaluated empirically in numerous tests.

The work presented in this chapter was published in [136, 193]. It is structured as follows: In

Section 7.1, we introduce the Capacitated Network Design Problem (CNDP). To solve the prob-

lem, we use bounds, variable fixing algorithms and local cardinality cuts based on Lagrangian

relaxation as described in Section 7.2. The entire branch & bound-approach is described in

Section 7.3. Finally, in Section 7.4, we give numerical results.

7.1 The Capacitated Network Design Problem

The Capacitated Network Design Problem was first defined in [91] and is relevant for a wide

area of applications, ranging from telecommunications to transportation problems [97, 144]. The

problem consists in finding an optimal subset of arcs in a network $G = (V, E)$ such that we can
transport a given demand $d^k \in \mathbb{R}^{|V|}$ of goods $1 \le k \le K$ (so-called commodities) at optimal total
cost. The latter consists of two components: the flow costs and the design costs. The flow cost is
the sum of costs for the routing of each commodity, whereby, for each arc $(i,j) \in E$ and commodity
$k$, a scalar $c^k_{ij} \ge 0$ determines the cost of routing one unit of commodity $k$ via $(i,j)$. The design
costs are determined by the costs of installing the chosen arcs, whereby, for each arc $(i,j) \in E$, we
are given a fixed arc-installation cost $f_{ij}$. Additionally, for each arc, there is a capacity $u_{ij}$ given
that limits the total amount of flow that can be routed via $(i,j)$.

For all arcs $(i,j) \in E$ and commodities $1 \le l \le K$, let $b^l_{ij}$ denote the minimum of the demand of
commodity $l$ and the capacity $u_{ij}$. Using variables $x^l \in \mathbb{R}^{|E|}_{+}$
for the flows and $y \in \{0,1\}^{|E|}$ for the design decisions, a mixed integer program formulation
for the Capacitated Network Design Problem can be stated as follows:

\[
\begin{array}{llll}
\text{Minimize} & L_{CNDP} = \sum_{l} (c^l)^T x^l + f^T y & & \\
\text{subject to} & N x^l = d^l & \forall\, 1 \le l \le K & (1) \\
 & \sum_{l} x^l_{ij} \le u_{ij}\, y_{ij} & \forall\, (i,j) \in E & (2) \\
 & x^l_{ij} \le b^l_{ij}\, y_{ij} & \forall\, (i,j) \in E,\ 1 \le l \le K & (3) \\
 & x^l \ge 0 & \forall\, 1 \le l \le K & \\
 & y \in \{0,1\}^{|E|} & &
\end{array}
\]

For ease of notation, we refer to the previous program as $L_{CNDP}$, which is also used to denote the
optimal objective value. The network flow constraints (also called mass-balance constraints) (1)
are defined by the node-arc-incidence matrix $N = (n_{ia})_{i \in V,\, a \in E}$ and a demand vector $d^k \in \mathbb{R}^{|V|}$ for
all commodities $k$, whereby $n_{ia} = 1$ iff $a = (h,i)$, $n_{ia} = -1$ iff $a = (i,h)$ for some $h \in V$, and $n_{ia} = 0$ otherwise,
and $d^k_i > 0$ iff node $i \in V$ is a demand node and $d^k_i < 0$ iff node $i$ is a supply node for commodity
$k$. Without loss of generality, we may assume that there is exactly one demand node and one
supply node for each commodity [112].

The total flow on an arc $(i,j)$ is constrained by the capacity $u_{ij}$ (so-called capacity or bundle

constraints (2)). The set of upper bound constraints (3) is redundant to the problem formulation.

It is introduced to strengthen the linear continuous relaxation of the mixed integer problem.
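To make the roles of constraints (1) to (3) concrete, the following minimal Python sketch evaluates a candidate solution of the formulation above. It is an illustration only: the function name `evaluate_cndp`, the dictionary-based data layout, the sign convention (inflow minus outflow equals the demand entry, which is positive at the demand node), and the choice of the commodity's demand as the first argument of the minimum defining b are assumptions made for this example.

```python
def evaluate_cndp(nodes, arcs, demand, c, f, u, x, y, tol=1e-9):
    """Return (objective value, feasibility flag) of a candidate solution (x, y).

    demand[l][i] is the demand entry of commodity l at node i (assumed > 0 at the
    demand node, < 0 at the supply node); x[l][a] is the flow of commodity l on arc a.
    """
    K = len(demand)
    objective = sum(f[a] * y[a] for a in arcs)
    objective += sum(c[l][a] * x[l][a] for l in range(K) for a in arcs)

    feasible = all(y[a] in (0, 1) for a in arcs)
    for l in range(K):
        # (1) mass balance: inflow minus outflow equals the demand entry of node i
        for i in nodes:
            inflow = sum(x[l][a] for a in arcs if a[1] == i)
            outflow = sum(x[l][a] for a in arcs if a[0] == i)
            if abs(inflow - outflow - demand[l].get(i, 0.0)) > tol:
                feasible = False
    for a in arcs:
        # (2) bundle constraint: total flow is limited by the installed capacity
        if sum(x[l][a] for l in range(K)) > u[a] * y[a] + tol:
            feasible = False
        for l in range(K):
            # (3) redundant upper bound (assumed: b = min(demand of l, capacity))
            b = min(max(demand[l].values()), u[a])
            if x[l][a] < -tol or x[l][a] > b * y[a] + tol:
                feasible = False
    return objective, feasible


# Hypothetical toy instance: one commodity, 5 units from node 0 to node 2.
nodes = [0, 1, 2]
arcs = [(0, 1), (1, 2), (0, 2)]
demand = [{0: -5.0, 2: 5.0}]
c = [{(0, 1): 1.0, (1, 2): 1.0, (0, 2): 3.0}]
f = {(0, 1): 10.0, (1, 2): 10.0, (0, 2): 4.0}
u = {a: 10.0 for a in arcs}
x = [{(0, 1): 0.0, (1, 2): 0.0, (0, 2): 5.0}]
y = {(0, 1): 0, (1, 2): 0, (0, 2): 1}
print(evaluate_cndp(nodes, arcs, demand, c, f, u, x, y))
```

In this toy instance, routing everything over the direct arc costs 4 for the installation plus 15 for the flow.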

7.1.1 State of the Art

The CNDP is an NP-hard problem, which is easy to see by reduction from the Steiner Tree Problem.
Since the latter is MAXSNP-hard [20], we cannot even hope for a fully polynomial approximation
scheme (FPTAS) for the CNDP unless P = NP. Here, we focus on exact and heuristic solution approaches.

Regarding exact solution approaches, Crainic, Frangioni, and Gendron develop lower bounding
procedures for the CNDP [44]. The main insights are the following: Tight approximations
of the so-called strong LP-relaxation (see $L_{CNDP}$ including the redundant constraints (3)) can be
found much faster by Lagrangian relaxation than by optimizing the LP using standard LP-solvers.
The authors investigate so-called shortest path and knapsack relaxations (see Section 7.2). When
solving the Lagrangian dual, bundle methods converge faster than ordinary subgradient methods
and are more robust. Motivated by this successful work, we evaluate several Lagrangian relaxations
in the context of branch & bound.


In [112], Holmberg and Yuan present a method to compute exact or heuristic solutions for the

CNDP. They use the Lagrangian knapsack relaxation in each node of the branch & bound tree

to efficiently compute lower bounds. Special penalty tests were developed which correspond to

variable fixing strategies presented in the following. An evaluation of the following components

is given: subgradient search procedure for solving the Lagrangian dual, primal heuristic for

finding feasible solutions, and interplay between branch & bound and the subgradient search.

On top of that work, a heuristic is developed that is embedded in the tree search procedure.

That heuristic is able to provide near-optimal solutions for large CNDP instances which are out

of reach of exact methods like Lagrangian relaxation based branch & bound or branch-and-cut

approaches (represented by the CPLEX implementation, for example).

In [22], Bienstock et al. describe two cutting-plane algorithms for a variant of the CNDP

with multi-arcs (i.e., an arc can be inserted multiple times). One of them is based on the multi-

commodity formulation of CNDP and uses cutset and three-partition inequalities. The other one

adds total capacity, partition, and rounded metric inequalities. In a branch-and-cut framework,

both variants provide sound results on a benchmark of realistic data. A substantial improve-

ment of this procedure is achieved by Bienstock in [23]: a branch-and-cut algorithm based on

ε-approximations of linear programs performs better on the same benchmark data.

Further branch-and-cut algorithms for the Multi-Arc CNDP are investigated in research papers
by Günlük, and Atamtürk et al. [7, 102]. Several valid inequalities are considered, and it is
found that branch-and-cut is substantially superior to branch & bound code on the investigated
benchmark data. After adding the cuts, the integrality gap at the root node is reduced significantly
and the total number of evaluated nodes diminishes. These results emphasize the importance of
tight lower bounds for the CNDP.

Sridhar and Park report on an implementation of a Benders-and-cut algorithm for the
CNDP [204]. The algorithm consists of three parts: a cutting plane algorithm for the computation
of tight lower bounds, a heuristic to generate feasible solutions, and the Benders-and-cut
algorithm itself. The computational results provided in the paper are based on a wide range of
instances with varying traffic demand. The complexity of the problem instances depends heavily
on the capacity provided in the network. CNDP instances with low transportation demand are
easy to solve, whereas for problem instances with higher demand the authors suggest using the
Benders-and-cut algorithm. In cases when the traffic becomes extremely dense, the integrality
gaps increase. To strengthen the LP-relaxation, flow inequalities are very effective.

A branch-and-price algorithm for the path-based formulation of the CNDP is compared with

the traditional branch & bound for the link-based formulation by Clarke and Gong in [40]. The

path-based formulation is computationally more efficient. Adding SOS-branching on each origin

and destination node increases this advantage. The efficient solution of the pricing problem

(which is a simple Shortest Path Problem) enables a faster solving of the LP-relaxations in the

branch & bound tree. Computational results on instances with 6, 10, and 15 nodes are reported.


7.2 Lagrangian Relaxation Bounds

The CNDP can be viewed as a mixture of a continuous and a discrete optimization problem.
The latter is obviously constituted by the design variables, whereas the former is a Min-Cost Multicommodity
Flow Problem (MMCF) that evolves when the design variables are fixed. For the
MMCF, besides linear programming solvers, especially cost-decomposition approaches based
on Lagrangian relaxation have been applied successfully [85]. The bounds we will use for the
CNDP will be based on those cost-decomposition approaches for the MMCF.

Regarding the MMCF and also for the CNDP, we are left with two promising choices of

which hard constraints should be softened:



• the bundle constraints (“shortest path relaxation”), or
• the mass-balance constraints (“knapsack relaxation”).

7.2.1 Shortest Path Relaxation

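Assume, in $L_{CNDP}$, for each arc $(i,j) \in E$ we introduce a Lagrangian multiplier $\lambda_{ij} \ge 0$ and
transfer the corresponding bundle constraint into the objective function. At the same time, we
also relax the upper bound constraints $x^l_{ij} \le b^l_{ij} y_{ij}$ using penalty costs $\nu^l_{ij} \ge 0$. Let
$* : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$
denote the component-wise product of two vectors, i.e. $z = x * y$ iff $z_s = x_s y_s$ for all $1 \le s \le n$ and
$x, y, z \in \mathbb{R}^n$. Then we get the following linear program:

\[
\begin{array}{lll}
\text{Minimize} & L_{SP}(\lambda,\nu) = \sum_{l} (c^l + \lambda + \nu^l)^T x^l + \bigl(f - u * \lambda - \sum_{l} \nu^l * b^l\bigr)^T y & \\
\text{subject to} & N x^l = d^l & \forall\, 1 \le l \le K \\
 & x^l \ge 0 & \forall\, 1 \le l \le K \\
 & y \in \{0,1\}^{|E|} &
\end{array}
\]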

The theory of Lagrangian relaxation shows that, for every choice of the Lagrangian multipliers
$\lambda, \nu \ge 0$, $L_{SP}(\lambda,\nu)$ is a lower bound on the CNDP. Notice that this value can be obtained easily,
because there is no cross-talking between variables $y$ and $x$ and among the variables $x^l$ anymore.
Thus, we can compute $L_{SP}(\lambda,\nu)$ by solving $K$ problems of the form

\[
\begin{array}{ll}
\text{Minimize} & L^l_{SP}(\lambda,\nu) = (c^l + \lambda + \nu^l)^T x^l \\
\text{subject to} & N x^l = d^l, \quad x^l \ge 0
\end{array}
\]

and by setting $y_{ij} = 1$ iff $f_{ij} - \lambda_{ij} u_{ij} - \sum_{l} \nu^l_{ij} b^l_{ij} < 0$, and $y_{ij} = 0$ otherwise. That is, we can solve
the Lagrangian sub-problem mainly by computing $K$ Single Source Shortest Path Problems with
positive arc weights. Therefore, the shortest path sub-problem can be solved in time $O(K(m + n \log n))$, where $n = |V|$ and $m = |E|$.
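The computation of the shortest path bound described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the names `dijkstra` and `shortest_path_bound`, the dictionary layout, and the representation of each commodity as a (source, sink, amount) triple are assumptions. The sketch relies on the fact that, with a single supply and a single demand node and no capacities left in the relaxed problem, each commodity is routed entirely along one shortest path with respect to the modified arc costs.

```python
import heapq


def dijkstra(n, adj, source):
    """Shortest path distances from source; adj[v] lists (neighbour, weight >= 0)."""
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # outdated heap entry
        for w, weight in adj[v]:
            if d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))
    return dist


def shortest_path_bound(n, arcs, commodities, c, f, u, b, lam, nu):
    """Lower bound for multipliers lam[a] >= 0 and nu[l][a] >= 0 (assumed layout)."""
    total = 0.0
    for l, (source, sink, amount) in enumerate(commodities):
        adj = [[] for _ in range(n)]
        for (i, j) in arcs:
            # modified arc cost: original cost plus both penalty terms
            adj[i].append((j, c[l][(i, j)] + lam[(i, j)] + nu[l][(i, j)]))
        total += amount * dijkstra(n, adj, source)[sink]
    for a in arcs:
        # modified design cost; the arc is opened iff this value is negative
        f_hat = f[a] - lam[a] * u[a] - sum(nu[l][a] * b[l][a] for l in range(len(commodities)))
        if f_hat < 0:
            total += f_hat
    return total


# Hypothetical toy instance: one commodity, 5 units from node 0 to node 2.
arcs = [(0, 1), (1, 2), (0, 2)]
commodities = [(0, 2, 5.0)]
c = [{(0, 1): 1.0, (1, 2): 1.0, (0, 2): 3.0}]
f = {(0, 1): 10.0, (1, 2): 10.0, (0, 2): 4.0}
u = {a: 10.0 for a in arcs}
b = [{a: 5.0 for a in arcs}]
zero = {a: 0.0 for a in arcs}
one = {a: 1.0 for a in arcs}
print(shortest_path_bound(3, arcs, commodities, c, f, u, b, zero, [zero]))
print(shortest_path_bound(3, arcs, commodities, c, f, u, b, one, [zero]))
```

Both values stay below the optimal cost of the toy instance, and the second choice of multipliers gives the tighter bound, which is what the Lagrangian dual tries to exploit.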




7.2.2 Knapsack Relaxation

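The other promising alternative is to relax the mass-balance constraints in $L_{CNDP}$. For the constraints
to be relaxed, we introduce Lagrangian multipliers $\mu^l_i$ for all $1 \le l \le K$ and $i \in V$. We get
the following linear program:

\[
\begin{array}{lll}
\text{Minimize} & L_{KP}(\mu) = \sum_{l} \sum_{(i,j) \in E} (c^l_{ij} + \mu^l_i - \mu^l_j)\, x^l_{ij} + f^T y + \mu^T d & \\
\text{subject to} & \sum_{l} x^l_{ij} \le u_{ij}\, y_{ij} & \forall\, (i,j) \in E \\
 & x^l_{ij} \le b^l_{ij}\, y_{ij} & \forall\, (i,j) \in E,\ 1 \le l \le K \\
 & x^l \ge 0 & \forall\, 1 \le l \le K \\
 & y \in \{0,1\}^{|E|} &
\end{array}
\]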

Whereas the shortest path relaxation decomposes the Lagrangian sub-problem by the different
commodities, here we achieve an arc-wise decomposition. To solve the previous LP, for each
$(i,j) \in E$ we consider the following linear program, which is similar to the linear continuous relaxation
of a Knapsack Problem:

\[
\begin{array}{lll}
\text{Minimize} & L^{ij}_{KP}(\mu) = \sum_{l} \bar{c}^l_{ij}\, x^l_{ij} & \\
\text{subject to} & \sum_{l} x^l_{ij} \le u_{ij} & \\
 & 0 \le x^l_{ij} \le b^l_{ij} & \forall\, 1 \le l \le K
\end{array}
\]

where $\bar{c}^l_{ij} = c^l_{ij} + \mu^l_i - \mu^l_j$. For each $(i,j) \in E$, we set $x^l_{ij}$ to its optimal value in $L^{ij}_{KP}(\mu)$ for all $1 \le l \le K$, and $y_{ij} = 1$, iff
$f_{ij} + L^{ij}_{KP}(\mu) < 0$. Otherwise, we set $x^l_{ij} = 0$ for all $1 \le l \le K$, and $y_{ij} = 0$. Obviously, this setting
provides us with an optimal solution for $L_{KP}(\mu)$. Thus, the main effort is to solve the problems
$L^{ij}_{KP}(\mu)$. However, this is an easy task (compare with [147, 149]): first, we can eliminate all
variables with non-negative cost coefficients, i.e., we set $x^l_{ij} = 0$ for all $1 \le l \le K$ with $\bar{c}^l_{ij} \ge 0$. Next, we
sort the $x^l_{ij}$ according to increasing cost coefficients $\bar{c}^l_{ij}$, that is, from now on we may assume that
$\bar{c}^h_{ij} \le \bar{c}^{h+1}_{ij} < 0$ for all $1 \le h < s \le K$, whereby $s$ is the number of negative objective coefficients.
Let $k$ denote the critical item with $k = \min\bigl(\{\, l \le s \mid \sum_{h \le l} b^h_{ij} > u_{ij} \,\} \cup \{ s+1 \}\bigr)$. We obtain
$L^{ij}_{KP}(\mu)$ by setting $x^h_{ij} = b^h_{ij}$ for all $h < k$, $x^h_{ij} = 0$ for all $h > \min\{k, s\}$, and, in case of $k \le s$,
$x^k_{ij} = u_{ij} - \sum_{h < k} b^h_{ij}$. Thus, the knapsack sub-problem can be solved in time $O(|E|\, K \log K)$.
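The greedy solution of a single arc sub-problem can be sketched in a few lines of Python. The function name `solve_arc_knapsack` and the list-based inputs are assumptions made for this illustration; the input `c_bar` holds the modified cost coefficients of one arc, `b` the individual upper bounds, and `u` the arc capacity.

```python
def solve_arc_knapsack(c_bar, b, u):
    """Minimize sum(c_bar[l] * x[l]) s.t. sum(x) <= u and 0 <= x[l] <= b[l].

    Returns (optimal value, x, index of the critical item or None).
    """
    x = [0.0] * len(c_bar)
    # only items with negative cost can improve the objective; cheapest first
    order = sorted((l for l in range(len(c_bar)) if c_bar[l] < 0), key=lambda l: c_bar[l])
    remaining = u
    critical = None
    for l in order:
        if b[l] > remaining:
            # critical item: it does not fit completely and receives the rest
            x[l] = remaining
            critical = l
            break
        x[l] = b[l]
        remaining -= b[l]
    value = sum(c_bar[l] * x[l] for l in range(len(c_bar)))
    return value, x, critical


print(solve_arc_knapsack([-3.0, 1.0, -1.0, -2.0], [4.0, 4.0, 4.0, 4.0], 10.0))
```

In the example call, the items with costs -3 and -2 are taken completely, the item with cost -1 is critical and receives the remaining two units, and the item with positive cost is dropped. Sorting dominates the running time, which matches the $K \log K$ effort per arc stated above.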



Note that both relaxations exhibit the integrality property. Thus, the bound we achieve in

both settings equals the linear continuous relaxation bound of the CNDP [44].

7.2.3 Subgradient Optimization

For both relaxation types, the Lagrangian dual consists in maximizing the lower bound. That is,
for the shortest path relaxation we have to maximize $L_{SP}(\lambda,\nu)$ subject to $\lambda, \nu \ge 0$. For the knapsack
relaxation, our task is simply to maximize $L_{KP}(\mu)$. A popular way of solving these concave, piecewise
linear maximization problems over a convex region is to apply a subgradient search (see [1]
for a general introduction). Several proposals regarding the specification of the general method
have been made in the CNDP literature. Let $s^t$ denote a subgradient in iteration $t$. We compute
the new search direction $d^t$ from the subgradient $s^t$ and the scaled previous direction $\alpha d^{t-1}$, following [112]. We also experimented with
other approaches to solve the Lagrangian dual, such as the modified Camerini-Fratta-Maffioli
rule [28] or the volume algorithm [11]. Without going into details here, we just note that none of
these modifications yields visible improvements in the overall performance.
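As a rough illustration of a subgradient search with a deflected direction, consider the following Python sketch on a one-dimensional toy problem. Everything specific in it is an assumption made for the example: the direction update is taken to be the convex combination of the previous direction and the current subgradient with weight `alpha`, the step length is the diminishing sequence 1/t, and the function name `deflected_subgradient` is hypothetical. The rule used in [112] may differ in these details.

```python
def deflected_subgradient(pieces, iterations=200, alpha=0.5):
    """Maximize q(lam) = min over (a, b) in pieces of a * lam + b, subject to lam >= 0.

    q is concave and piecewise linear; the slope of an active piece is a subgradient.
    Returns the best multiplier found and its function value.
    """
    lam, direction = 0.0, 0.0
    best_value, best_lam = float("-inf"), lam
    for t in range(1, iterations + 1):
        # evaluate q and pick the slope of a piece attaining the minimum
        value, slope = min((a * lam + b, a) for a, b in pieces)
        if value > best_value:
            best_value, best_lam = value, lam
        # assumed deflection rule: mix the old direction with the new subgradient
        direction = alpha * direction + (1.0 - alpha) * slope
        # diminishing step length 1/t, projected onto lam >= 0
        lam = max(0.0, lam + direction / t)
    return best_lam, best_value


# q(lam) = min(lam, 4 - lam) has its maximum value 2 at lam = 2
print(deflected_subgradient([(1.0, 0.0), (-1.0, 4.0)]))
```

The deflection damps the zig-zagging that a pure subgradient step shows once the iterates start to alternate between the two linear pieces.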

7.2.4 Variable Fixing

A big advantage of Lagrangian relaxation based bound computations is that they can be used for
variable fixing in a very efficient way. In the presence of an optimal or at least high quality upper
bound $B$ for the CNDP, it is an easy task to check whether a variable $y_{ij}$ can still be set
to either of its bounds without worsening the lower bound too much. More formally, given the
Lagrangian multipliers $\lambda$ and $\nu$ in the current shortest path sub-problem, a value $l \in \{0,1\}$ and
any arc $(i,j) \in E$, we can set

\[
y_{ij} = l \quad \text{if} \quad (-1)^{l} \Bigl( f_{ij} - \lambda_{ij} u_{ij} - \sum_{k=1}^{K} \nu^k_{ij} b^k_{ij} \Bigr) \ge B - L_{SP}(\lambda,\nu). \tag{7.1}
\]

Analogously, given the Lagrangian multipliers $\mu$ in the current knapsack sub-problem, a value
$l \in \{0,1\}$ and any arc $(i,j) \in E$, we can set

\[
y_{ij} = l \quad \text{if} \quad (-1)^{l} \bigl( f_{ij} + L^{ij}_{KP}(\mu) \bigr) \ge B - L_{KP}(\mu). \tag{7.2}
\]
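Both fixing rules share the same pattern: compare the reduced design cost of an arc with the gap between the upper bound and the current Lagrangian bound. The following Python sketch illustrates this pattern under stated assumptions: `f_hat` maps each arc to its reduced design cost in the current sub-problem, a non-strict comparison is used, and the function name `fix_design_variables` is hypothetical.

```python
def fix_design_variables(f_hat, lower_bound, B):
    """Return {arc: forced value} for all arcs that can be fixed, or None to prune.

    An arc with negative reduced cost is open in the relaxed solution; closing it
    raises the bound by -f_hat. An arc with non-negative reduced cost is closed;
    opening it raises the bound by f_hat. If the raised bound reaches B, the
    opposite value cannot lead to an improving solution.
    """
    gap = B - lower_bound
    if gap <= 0:
        return None  # the node itself cannot contain an improving solution
    fixed = {}
    for arc, reduced_cost in f_hat.items():
        for value in (0, 1):
            if (-1) ** value * reduced_cost >= gap:
                fixed[arc] = value
    return fixed


print(fix_design_variables({"a": -6.0, "b": 0.5, "c": 8.0}, lower_bound=14.0, B=19.0))
```

With a gap of 5, arc `a` must stay open, arc `c` must stay closed, and arc `b` remains undecided.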

Now, we have two variable fixing algorithms at hand for two different Lagrangian sub-problems.

Therefore, we are able to apply the concept of CP-based Lagrangian relaxation (see Section 3.2).

First, we can choose one of the two alternatives (for example the one for which the Lagrangian

dual can be solved more quickly) and apply the corresponding variable fixing algorithm in every

Lagrangian sub-problem. If we find that this procedure does not filter enough values, we can

do even more: with the help of dual values gained in the solution process of the Lagrangian

sub-problem, in every Lagrange iteration we can apply both variable fixing algorithms.

Let us assume we decide to use the shortest path relaxation. Given the current Lagrangian
multipliers $\lambda$ and $\nu$, in every iteration we need to solve $K$ Shortest Path Problems. Using the
well-known Dijkstra algorithm for that purpose, we do not only get the optimal objective of each
problem, we also get the shortest path distances $\mu^l_i$ of each node $i$ for free. It has long been known
that those distances are optimal dual values in the corresponding LP. The idea of the linking
method consists in using these duals as Lagrangian multipliers for the knapsack sub-problem
next. That is, we consider $L_{KP}(\mu)$. That way, we can apply the variable fixing algorithms for
both sub-problems. Note that, for optimal Lagrangian multipliers $\lambda$ and $\nu$, the shortest path
distances $\mu$ in combination with the multipliers are also optimal for the dual of $L_{CNDP}$.

When using the knapsack relaxation, the situation is slightly more complicated, because we
need to provide dual values for the bundle as well as the upper bound constraints. Given the
current Lagrangian multipliers $\mu$, we solve $|E|$ knapsack sub-problems as described in 7.2.2.
Again, when given any arc $(i,j) \in E$, we assume that $s$ denotes the number of negative cost
coefficients in $L^{ij}_{KP}(\mu)$, that the remaining variables $x^l_{ij}$ are ordered with respect to increasing
cost coefficients, and that $k \le s+1$ is the critical item.
In case of $k < s+1$, we set $\lambda_{ij} = \bar{c}^k_{ij}$, $\nu^l_{ij} = \bar{c}^l_{ij} - \bar{c}^k_{ij}$ for all $l < k$ and $\nu^l_{ij} = 0$ for all $l \ge k$. For
$k = s+1$, we set $\lambda_{ij} = 0$, $\nu^l_{ij} = \bar{c}^l_{ij}$ for all $l < k$ and $\nu^l_{ij} = 0$ for all $l \ge k$.

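Theorem 7.1 The vectors $\lambda$ and $\nu$ define optimal dual values for $L_{KP}(\mu)$.

Proof: It is sufficient to show that the value $\lambda_{ij}$ and the vector $\nu_{ij} = (\nu^1_{ij}, \dots, \nu^K_{ij})$ give an optimal
dual solution for $L^{ij}_{KP}(\mu)$ for all $(i,j) \in E$. The dual of $L^{ij}_{KP}(\mu)$ is the following linear program:

\[
\begin{array}{lll}
\text{Maximize} & D^{ij}_{KP}(\mu) = u_{ij} \lambda_{ij} + \sum_{l} b^l_{ij} \nu^l_{ij} & \\
\text{subject to} & \lambda_{ij} + \nu^l_{ij} \le \bar{c}^l_{ij} & \forall\, 1 \le l \le K \\
 & \lambda, \nu \le 0 &
\end{array}
\]

First, we show that our solution is dual feasible:

• $k < s+1$: It holds that $\lambda_{ij} = \bar{c}^k_{ij} < 0$, because $k \le s$. Also, if $l < k$, then $\nu^l_{ij} = \bar{c}^l_{ij} - \bar{c}^k_{ij} \le 0$
because $\bar{c}^l_{ij} \le \bar{c}^k_{ij}$. If $l \ge k$, then $\nu^l_{ij} = 0$. Thus, all values are non-positive.
Furthermore, if $l < k$, then it holds that $\lambda_{ij} + \nu^l_{ij} = \bar{c}^k_{ij} + \bar{c}^l_{ij} - \bar{c}^k_{ij} = \bar{c}^l_{ij}$. For $l \ge k$, it holds
that $\lambda_{ij} + \nu^l_{ij} = \bar{c}^k_{ij} \le \bar{c}^l_{ij}$.

• $k = s+1$: It holds that $\lambda_{ij} = 0$ and $\nu^l_{ij} = \bar{c}^l_{ij} \le 0$ for all $l < k$, because this implies $l \le s$.
Furthermore, $\nu^l_{ij} = 0$ for all $l \ge k$. Thus, all values are non-positive.
For $k = s+1$, we have that $\lambda_{ij} + \nu^l_{ij} \le \bar{c}^l_{ij}$ for all $1 \le l \le K$.

To prove the optimality of the dual solution (and, at the same time, also for the primal solution
we gave in Section 7.2.2), we show that the two objective values for $x_{ij}$ in $L^{ij}_{KP}(\mu)$ and $\lambda_{ij}, \nu_{ij}$ in
$D^{ij}_{KP}(\mu)$ match each other:

• $k < s+1$:
\[
\sum_{l} \bar{c}^l_{ij} x^l_{ij} = \sum_{l < k} \bar{c}^l_{ij} b^l_{ij} + \bar{c}^k_{ij} \Bigl( u_{ij} - \sum_{l < k} b^l_{ij} \Bigr) = \sum_{l < k} b^l_{ij} \bigl( \bar{c}^l_{ij} - \bar{c}^k_{ij} \bigr) + u_{ij} \bar{c}^k_{ij} = \sum_{l} b^l_{ij} \nu^l_{ij} + u_{ij} \lambda_{ij}.
\]

• $k = s+1$:
\[
\sum_{l} \bar{c}^l_{ij} x^l_{ij} = \sum_{l \le s} \bar{c}^l_{ij} b^l_{ij} = \sum_{l} b^l_{ij} \nu^l_{ij} + u_{ij} \lambda_{ij}.
\]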
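The statement of Theorem 7.1 can be checked numerically on small examples. The following Python sketch solves one arc sub-problem greedily, constructs dual values in the way described before the theorem, and returns both objective values. The function name `knapsack_duals` and the list-based data layout are assumptions for this illustration, and the duals follow the non-positive sign convention used in the proof.

```python
def knapsack_duals(c_bar, b, u):
    """Return (lam, nu, primal value, dual value) for one arc sub-problem.

    Primal: minimize sum(c_bar[l] * x[l]) s.t. sum(x) <= u and 0 <= x[l] <= b[l].
    Dual:   maximize u * lam + sum(b[l] * nu[l]) s.t. lam + nu[l] <= c_bar[l], lam, nu <= 0.
    """
    K = len(c_bar)
    order = sorted((l for l in range(K) if c_bar[l] < 0), key=lambda l: c_bar[l])
    x, nu = [0.0] * K, [0.0] * K
    lam, remaining, filled = 0.0, u, []
    for l in order:
        if b[l] > remaining:
            # critical item: takes the remaining capacity and determines lam
            x[l] = remaining
            lam = c_bar[l]
            break
        x[l] = b[l]
        remaining -= b[l]
        filled.append(l)
    for l in filled:
        # items that enter completely carry the cost difference to the critical item
        nu[l] = c_bar[l] - lam
    primal = sum(c_bar[l] * x[l] for l in range(K))
    dual = u * lam + sum(b[l] * nu[l] for l in range(K))
    return lam, nu, primal, dual


c_bar, b, u = [-3.0, 1.0, -1.0, -2.0], [4.0, 4.0, 4.0, 4.0], 10.0
lam, nu, primal, dual = knapsack_duals(c_bar, b, u)
print(lam, nu, primal, dual)
```

In the example, the primal and the dual objective coincide, and every dual constraint is satisfied, as the theorem claims.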


To summarize: if we choose the shortest path relaxation, in every Lagrangian sub-problem
we solve $K$ shortest path sub-problems and achieve a lower bound on the CNDP. If that bound
is worse than the best known or even optimal objective value $B$, we can prune the current choice
point. Otherwise, we fix variables according to Implication 7.1. Then we set up $L_{KP}(\mu)$, where $\mu$
are the shortest path distances, and fix variables according to Implication 7.2.

If, instead, we choose to use the knapsack relaxation, in every Lagrangian sub-problem we
solve $|E|$ linear continuous Knapsack Problems and achieve a lower bound for the CNDP. Again,
if that bound is worse than $B$, we can prune the current choice point. Otherwise, we fix variables
according to Implication 7.2. Then we set up $L_{SP}(\lambda,\nu)$, where $\lambda$ and $\nu$ are the dual values of
$L_{KP}(\mu)$ in Theorem 7.1, and fix variables according to Implication 7.1.

Of course, many variants of the algorithm sketched above can be thought of. For example, we
may find that applying both variable fixing algorithms in every Lagrangian sub-problem is too
time consuming and does not pay off. Then we can introduce frequency parameters that control
the percentage of Lagrangian sub-problems for which either of the two variable fixing algorithms
is applied. As an extreme choice, we may for example decide to apply variable fixing for the
optimal Lagrangian multipliers only. As our experiments show, it is favorable to use the knapsack
relaxation to solve the Lagrangian dual quickly. As one would expect, solving $K$ Shortest Path
Problems in every Lagrangian sub-problem in addition to the $|E|$ knapsack sub-problems is rather
costly and slows down the solution process considerably. The following theorem helps to cope
with this situation more efficiently.

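Theorem 7.2 Given Lagrangian multipliers $\mu$ in the knapsack relaxation, denote some optimal
dual values for $L_{KP}(\mu)$ by $\lambda \ge 0$ and $\nu \ge 0$. Then,

\[
L_{SP}(\lambda,\nu) \ge L_{KP}(\mu). \tag{7.3}
\]

Proof: Let $\kappa \ge 0$ denote optimal dual values for the upper bound constraints of $y$ in $L_{KP}(\mu)$. The
vectors $\mu$ and $\kappa$ are dual feasible for $L_{SP}(\lambda,\nu)$, because $\lambda$, $\nu$ and $\kappa$ are optimal dual values for $L_{KP}(\mu)$ and thus

\[
\mu^k_j - \mu^k_i - \lambda_{ij} - \nu^k_{ij} \le c^k_{ij} \quad \forall\, (i,j) \in E,\ 1 \le k \le K, \quad \text{and} \tag{7.4}
\]

\[
-\kappa_{ij} \le f_{ij} - u_{ij} \lambda_{ij} - \sum_{k} b^k_{ij} \nu^k_{ij} \quad \forall\, (i,j) \in E. \tag{7.5}
\]

Any dual feasible solution for a minimization problem determines a valid lower bound on that
problem. Denote the dual of an optimization problem $L$ by $D$. Then,

\[
L_{SP}(\lambda,\nu) \ge D_{SP}(\lambda,\nu)[\mu,\kappa] = \mu^T d - 1^T \kappa = D_{KP}(\mu)[\lambda,\nu,\kappa] = L_{KP}(\mu). \tag{7.6}
\]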


We can use this theorem to relax Implication 7.1, which improves the running time of our

variable fixing algorithm, but makes it also less effective. Unfortunately, as we shall see in

Section 7.4, even in its strong version the shortest path variable fixing algorithm is already too

ineffective, and therefore this idea cannot be used to improve the running time of a CNDP solver.

Another important choice is to decide when the effects of the variable fixing algorithms
should be made visible to the subgradient algorithm. Obviously, when we fix variables during the
process of finding optimal Lagrangian multipliers, we are actually changing the problem. Thus,
with respect to the convergence of the iterative procedure, we have to reset. The most defensive
strategy is to start from scratch (with the current Lagrangian multipliers as starting point) and
to set the step-length coefficient to its initial setting. However, we may consider allowing only
a little more flexibility by scaling the factor by a fixed percentage. A totally different approach
would be not to incorporate the changes of the bounds of the variables during the optimization
process, but to restart the whole procedure (if any variable fixing has taken place) after optimal
Lagrangian multipliers have been found. We evaluate different parameter settings in the numeric
section.

7.2.5 Lagrangian Cardinality Cuts

Since CP-based Lagrangian relaxation is partly inefficient for the CNDP, we generalize the idea

of domain filtering. Variable fixing may be viewed as an addition of unary constraints that force

a variable to take a specific value. Domain filtering that achieves a state of bound consistency

may be viewed as a procedure that adds unary interval constraints. Even general domain filtering

can be viewed as an addition of unary constraints that enforce a variable not to take some specific

values. In general, all these additional constraints are only valid locally. Now, we do not want to

restrict ourselves to unary constraints anymore. Instead, we consider the generation of additional

local constraints with respect to cost considerations.

We will see that, in the presence of a near-optimal solution to the CNDP with associated
objective value $B$, in each Lagrangian sub-problem we can infer restrictions on the number of
arcs that need to be installed in any improving solution. Before we can state the idea more
formally, to ease the notation, we introduce some identifiers. For the shortest path relaxation, we
define

\[
\hat{f}_{ij} = f_{ij} - \lambda_{ij} u_{ij} - \sum_{l} \nu^l_{ij} b^l_{ij} \quad \forall\, (i,j) \in E, \tag{7.7}
\]

\[
\hat{L} = \sum_{l} L^l_{SP}(\lambda,\nu) \tag{7.8}
\]

for Lagrangian multipliers $\lambda, \nu \ge 0$. Likewise, for the knapsack relaxation, we set

\[
\hat{f}_{ij} = f_{ij} + L^{ij}_{KP}(\mu) \quad \forall\, (i,j) \in E, \tag{7.9}
\]

\[
\hat{L} = \mu^T d \tag{7.10}
\]

for Lagrangian multipliers $\mu$. Furthermore, denote the current Lagrangian sub-problem $L_{SP}(\lambda,\nu)$
or $L_{KP}(\mu)$ by $LR$.

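Theorem 7.3 Denote an ordering of the arcs in $E$ by $e_1, \dots, e_{|E|}$ such that $i < j$ implies $\hat{f}_{e_i} \le \hat{f}_{e_j}$.
It holds: if $LR < B$, then

a) there exist values

\[
F = \min \Bigl\{\, 0 \le l \le |E| \;\Big|\; \hat{L} + \sum_{h \le l} \hat{f}_{e_h} < B \,\Bigr\} \quad \text{and} \tag{7.11}
\]

\[
U = \max \Bigl\{\, 0 \le l \le |E| \;\Big|\; \hat{L} + \sum_{h \le l} \hat{f}_{e_h} < B \,\Bigr\}. \tag{7.12}
\]

b) In any improving solution $(x,y)$, it holds that

\[
F \le \sum_{(i,j) \in E} y_{ij} \le U. \tag{7.13}
\]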

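Proof:

a) It is sufficient to show that there exists a value $0 \le l \le |E|$ such that $\hat{L} + \sum_{h \le l} \hat{f}_{e_h} < B$. Let
$l^* = \operatorname{argmin}_{0 \le l \le |E|} \sum_{h \le l} \hat{f}_{e_h}$. It holds that:

\[
\sum_{h \le l^*} \hat{f}_{e_h} = L_{SP}(\lambda,\nu) - \sum_{l} L^l_{SP}(\lambda,\nu) < B - \sum_{l} L^l_{SP}(\lambda,\nu). \tag{7.14}
\]

Or, respectively,

\[
\sum_{h \le l^*} \hat{f}_{e_h} = L_{KP}(\mu) - \mu^T d < B - \mu^T d. \tag{7.15}
\]

b) Let $L\bigl[\sum_{(i,j) \in E} y_{ij} > U\bigr]$ denote the optimization problem that evolves by adding
the constraint $\sum_{(i,j) \in E} y_{ij} > U$ to a problem $L$. By the definition of $U$, we know that
$LR\bigl[\sum_{(i,j) \in E} y_{ij} > U\bigr] \ge B$. We remind the reader that we identify the name of an optimization problem
with its optimal value. Now, assume that there exists a feasible solution $(x,y)$ for $L_{CNDP}$
with $\sum_{(i,j) \in E} y_{ij} > U$. Then it holds that

\[
c^T x + f^T y \ge L_{CNDP}\Bigl[\sum_{(i,j) \in E} y_{ij} > U\Bigr] \ge LR\Bigl[\sum_{(i,j) \in E} y_{ij} > U\Bigr] \ge B. \tag{7.16}
\]

The case $\sum_{(i,j) \in E} y_{ij} < F$ follows analogously.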

Theorem 7.3 allows us to add cardinality cuts on the number of arcs to be installed without
losing improving solutions. We will evaluate the effect of local cardinality cuts on the solution

process in Section 7.4.
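The bounds of Theorem 7.3 are cheap to compute once the reduced design costs are sorted. The following Python sketch illustrates the idea; the function name `cardinality_bounds` and the plain list input are assumptions for this example. Here `L_hat` stands for the part of the Lagrangian bound that does not depend on the design variables, and `f_hat` holds the reduced design costs of all arcs.

```python
def cardinality_bounds(L_hat, f_hat, B):
    """Return (F, U) such that F <= number of installed arcs <= U in any improving
    solution, or None if the Lagrangian bound already reaches B (prune the node).
    """
    prefix = L_hat  # cheapest relaxed cost when exactly l arcs are installed
    admissible = [0] if prefix < B else []
    for l, cost in enumerate(sorted(f_hat), start=1):
        prefix += cost  # installing the l cheapest arcs is optimal for cardinality l
        if prefix < B:
            admissible.append(l)
    if not admissible:
        return None
    return admissible[0], admissible[-1]


print(cardinality_bounds(20.0, [-6.0, -3.0, 1.0, 4.0, 9.0], B=15.0))
```

In the example, installing no arc at all or more than three arcs cannot lead to an improving solution. Because the sorted prefix sums first decrease and then increase, the admissible cardinalities always form an interval, so reporting only its two end points loses no information.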


7.3 A Branch & Bound Algorithm

After having described the bound computation and possible reduction strategies based on La-

grangian relaxation, now we sketch the decisions taken in the tree search.

Dominance Cut-Off Rule Apart from the lower bound exceeding the upper bound, the search

in the current node can be pruned if the min-cost routing of all commodities only uses arcs

that have already been decided to be installed. Thus, in every choice point we use a column

generation approach to solve the Min-Cost Multicommodity Flow Problem on the subset of arcs

with associated $y_{ij}$ that have a current upper bound of 1. If that routing only uses arcs $(i,j)$ with
$y_{ij}$ with lower bound 1, we can prune the search and backtrack.

Branching Variable Selection The previous discussion also induces a rule for the selection of the branching variable: it is clearly favorable to choose a variable for branching that is being used by the current optimal min-cost multicommodity flow. Of course, there may be more than just one such variable. Then we can choose the one with minimal or maximal reduced costs $\hat f_{ij}$ in the Lagrangian sub-problem with the best associated multipliers. The different choices will be evaluated in Section 7.4.
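As a minimal sketch of this selection rule, assuming the flow, the variable bounds, and the reduced costs are available as dictionaries keyed by arc (all names are hypothetical), the two variants can be written as follows. The labels BR0 and BR1 follow the naming used in Section 7.4.2 for minimal and maximal absolute reduced costs.

```python
def select_branching_arc(flow, lower_bound, upper_bound, reduced_cost,
                         rule="BR0", eps=1e-9):
    """Pick an undecided arc that is used by the current min-cost flow.

    rule "BR0" prefers the minimal, rule "BR1" the maximal absolute reduced
    cost. Returns None if the flow uses no undecided arc.
    """
    candidates = [arc for arc, value in flow.items()
                  if value > eps and lower_bound[arc] < upper_bound[arc]]
    if not candidates:
        return None

    def absolute_reduced_cost(arc):
        return abs(reduced_cost[arc])

    if rule == "BR0":
        return min(candidates, key=absolute_reduced_cost)
    return max(candidates, key=absolute_reduced_cost)


# Hypothetical data: arc e3 carries flow but is already fixed to 1.
flow = {"e1": 3.0, "e2": 1.0, "e3": 2.0, "e4": 0.0}
lb = {"e1": 0, "e2": 0, "e3": 1, "e4": 0}
ub = {"e1": 1, "e2": 1, "e3": 1, "e4": 1}
rc = {"e1": -5.0, "e2": 0.5, "e3": -9.0, "e4": 0.1}
print(select_branching_arc(flow, lb, ub, rc, "BR0"))
print(select_branching_arc(flow, lb, ub, rc, "BR1"))
```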

Tree Traversal A simple depth-first search procedure is used to choose the next search node. This allows feasible solutions to be found quickly and eases the reuse of Lagrangian multipliers.

Primal Heuristic To find reasonably good and near-optimal solutions quickly, in every search

node we apply a Lagrangian heuristic that was suggested by Holmberg and Yuan. It works by

computing multicommodity flow solutions on a subset of the arcs and de-assigning all arcs that

carry no flow. For further details, we refer the reader to [112].

Variable Fixing Heuristic Since the Capacitated Network Design Problem is very hard to solve exactly, we may decide to search for relatively good solutions quickly. The exact approach can be transformed into a heuristic for the problem by fixing variables more optimistically. Holmberg and Yuan [112] developed the so-called α-heuristic for this purpose: while solving the Lagrangian dual, we record how often a variable is set to zero or one. If one of the values is dominant with respect to a given parameter, the variable is simply set to this value.
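The bookkeeping behind such a fixing rule might look as follows. This is only a sketch: the text does not spell out the exact dominance criterion of [112], so the relative-frequency threshold `alpha` used here is an assumption, as are the names `alpha_fixing` and `history`.

```python
def alpha_fixing(history, alpha):
    """Fix design variables whose sub-problem value was dominant.

    history: dict arc -> list of 0/1 values the variable took in the
             Lagrangian sub-problems solved so far
    alpha:   assumed dominance threshold, 0.5 < alpha <= 1
    Returns a dict arc -> value for the variables that get fixed.
    """
    fixed = {}
    for arc, values in history.items():
        if not values:
            continue
        share_of_ones = sum(values) / len(values)
        if share_of_ones >= alpha:
            fixed[arc] = 1
        elif 1.0 - share_of_ones >= alpha:
            fixed[arc] = 0
    return fixed


# Hypothetical record of four subgradient iterations for three arcs.
history = {"e1": [1, 1, 1, 0], "e2": [0, 0, 0, 0], "e3": [1, 0, 1, 0]}
print(alpha_fixing(history, alpha=0.75))
```

In this example, `e1` and `e2` are fixed to 1 and 0, respectively, while `e3` stays undecided because neither value dominates.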

7.4 Numerical Results

We report on our computational experience with the previously developed algorithms. The sec-

tion is structured as follows: first, we introduce the benchmark data used in the experiments.

Then we define the possible parameter settings that activate and deactivate different algorith-

mic components. Finally, we compare the variants when solving the CNDP from scratch, in the

optimality proof, and when using the approach as a heuristic.

All tests were carried out on systems with AMD Athlon, 600 MHz processors, and 256 MB

main memory. The code was compiled with the GNU g++ 2.95 compiler using optimization

level O3.

7.4.1 Benchmark Data

Surprisingly, in spite of the theoretical interest the CNDP has drawn and the large number of research groups that have dealt with the problem, apparently there has been no benchmark set established on which researchers can compare algorithms that solve the CNDP exactly. Much work has been done with respect to the computation of lower bounds and the heuristic solution of the problem. Benchmarks used for this purpose (to be found in [44, 112], for example) are still too large to allow the computation of optimal solutions. For variations of the problem (such as the Multi-Arc CNDP, Network Loading, etc.), benchmark data exists, but it is not straightforward to see how it could be converted into meaningful instances for the pure CNDP as we consider it here.

Thus, we decided to base our comparison on a benchmark of 48 instances generated by

a CNDP generator developed by Crainic et al. and described in [44]. The generator is used

by different research groups after it was enhanced with a stable random number generator by

A. Frangioni. We generated graphs with 12, 18, and 24 nodes with 50 to 440 arcs and 50 to 160

commodities. For the heuristic comparison we use the benchmark set from [44, 45].

7.4.2 Algorithm Variants Considered in the Experiments

The optimization system developed consists of several parts. We list the components that are compared and evaluated in the experiments: different Lagrangian relaxation algorithms based on the shortest path (SP) or the knapsack relaxation (KP), respectively; a branch & bound algorithm using bounds based on those relaxations, where the branching variable is chosen according to minimal (BR0) or maximal (BR1) absolute reduced-cost values $\hat f_{ij}$; two different filtering algorithms based on the shortest path relaxation (SF) and the knapsack relaxation (KF) that fix variables statically (STAT) after optimal Lagrangian multipliers have been computed, or dynamically (DYN) during the optimization of the Lagrangian dual; and finally, the cardinality interval tightening algorithm that adds Lagrangian cardinality cuts to the problem (CIT).

7.4.3 Evaluation

                      BR0 vs. BR0-CIT    BR1 vs. BR1-CIT
time     mean              93.7%              25.3%
         min               4.72%              0.28%
         max                353%             131.5%
         variance          62.1%              11.5%
nodes    mean              38.6%              10.1%
         min               0.73%              0.02%
         max              120.7%              78.4%
         variance          14.5%               2.9%

Tab. 7.1: The impact of cardinality interval tightening when using the knapsack relaxation for pruning and problem reduction. Mean, minimum, maximum, and variance of running time and number of nodes in the branch-and-bound tree are given.

With the first experiments we performed, we wanted to find out which type of Lagrangian relaxation was preferable. In accordance with the results reported in [112], we found that the knapsack relaxation is clearly superior both with respect to the number of subgradient iterations needed to solve the Lagrangian dual and with respect to the time needed to solve the Lagrangian sub-problems. Therefore, we start right away with an evaluation of the impact of Lagrangian cardinality cuts when solving the CNDP using the knapsack relaxation. Table 7.1 shows a comparison of lower bound routines using the Lagrangian knapsack relaxation with and without cardinality cuts. Table 7.2 shows a comparison of two different strategies for the selection of the branching variable. Comparing two variants, the tables give the average percentage of the second variant when compared to the first (that is always set to 100%) with respect to running times and the number of search nodes visited in the branch & bound trees. Moreover, we specify minima, maxima, and the variance of those percentages.

                      BR0 vs. BR1    BR0-CIT vs. BR1-CIT
time     mean            1817.7%            235.1%
         min              68.03%            30.91%
         max             7445.5%           1221.7%
         variance       37944.6%            604.7%
nodes    mean            2750.4%            311.1%
         min             89.832%           15.636%
         max            19415.3%             1427%
         variance        172454%           1163.3%

Tab. 7.2: The impact of the branching variable selection when pruning and filtering is done with the help of the knapsack relaxation.

Clearly, choosing a branching variable with minimal reduced costs is favorable, no matter whether cardinality cuts are introduced or not. This contradicts the recommendation given in [112]. Actually, this result is not very surprising. Intuitively, the variable with the minimal absolute reduced costs is least likely to be set by variable fixing. It is the variable we have the least knowledge about, and therefore it is a good choice to base a case distinction on it. In contrast, the variable with the largest absolute reduced costs is most likely to be set by variable fixing, and therefore it is not a good idea to double the effort by using this variable for branching.

Regarding the introduction of Lagrangian cardinality cuts, Table 7.1 shows that they have a great impact on the number of search nodes that have to be investigated. Cardinality cuts are also favorable with respect to the total running time, but the gains are not as large as with respect to the size of the search tree. The trade-off is caused by the additional effort that is necessary to sort the arcs with respect to the current reduced costs $\hat f_{ij}$.

When looking at the data more precisely, we find that the primal heuristic works much better

in the presence of cardinality cuts. The result of this positive effect is clear: high quality upper

bounds are found much earlier in the search, pruning and variable fixing work much better, and

the number of search nodes is greatly reduced, which explains the numbers in Table 7.1.

We conjecture that the primal heuristic works so well in the presence of cardinality cuts,

because they provide a good estimate on the number of arcs that need to be installed in order to

improve the current solution. Thus, the right amount of arcs is opened for the heuristic, and it is

able to compute near-optimal solutions at a higher rate.

Next, we evaluate the use of CP-based Lagrangian relaxation for the CNDP. Table 7.3 shows a comparison of runs when using shortest path variable fixing in addition to the knapsack variable fixing algorithm. The results are very disappointing: not only is the integrated approach inferior with respect to the total running time, but the reduction of choice points is also negligible, so the additional effort is almost worthless.

                      SOLVE: KF vs. KF-SF    OPT: KF vs. KF-SF
time     mean               148.6%                144.1%
         min                96.59%                51.87%
         max                  466%                271.3%
         variance            46.1%                 13.5%
nodes    mean               133.8%                 94.9%
         min                71.42%                   20%
         max                677.1%                180.3%
         variance           166.3%                  7.5%

Tab. 7.3: The impact of additional shortest-path filtering when using the knapsack relaxation for pruning and problem reduction. Branching strategy BR0 and cardinality interval tightening are used.

Note that, when using the linking method, the number of search nodes sometimes even exceeds the value that is achieved when using knapsack variable fixing only. This is caused by differences when building up the search tree: the Lagrangian dual usually stops with different Lagrangian multipliers that have a severe impact on the variable selection. Moreover, the generation of primal solutions differs, which makes the comparison particularly difficult, because variable fixing is highly sensitive to the quality of upper bounds. Thus, to eliminate the last perturbation, we repeated the experiment on the algorithmic optimality proof. That is, we provide the algorithm with an optimal solution and let it prove its optimality only.

Table 7.3 shows the results, which still reveal the poor performance of the additional application of shortest path variable fixing. The reason for this is that the shortest path variable fixing algorithm is much less effective than the one based on the knapsack relaxation: recall from Section 7.2.4 that the shortest path relaxation value consists of two values, one for the shortest-path routings, and the other one for the design variables. However, the variable fixing algorithm only incorporates the latter costs and does not take into account that the removal of arcs may generally increase the routing costs as well. Therefore, using shortest path variable fixing as described in Section 7.2.4 is comparatively ineffective and does not pay off.

We tried to improve the effectiveness of the algorithm by adding node-capacity constraints. If a node is a source for some commodities, its out-capacity must be large enough to push the corresponding supply into the network. Similarly, if a node is a sink node for some commodities, its in-capacity must be large enough to let the required demand in. In contrast to the knapsack relaxation, where the x- and y-variables are not independent, the shortest path relaxation allows those constraints to be incorporated very easily. Moreover, we tried to improve the shortest path reduction algorithm further by computing a lower bound on the additional costs of routing a given commodity when a certain arc is not installed. The algorithm presented in Section 2.2.4.1 was implemented for this purpose. However, all these efforts did not result in a filtering algorithm that was effective enough to be worth applying. Therefore, we cannot recommend using the shortest path relaxation, neither for the computation of Lagrangian relaxation bounds nor for problem reduction. Note that this recommendation stands in contradiction to the one given in [44].

                      DYN vs. STAT
time     mean            115.9%
         min             36.95%
         max             229.2%
         variance         14.1%
nodes    mean            123.2%
         min             29.72%
         max             233.8%
         variance         20.8%

Fig. 7.1: Optimality proofs: comparison of two strategies when using reduction based on knapsack relaxation: on-the-fly fixing vs. fixing after Lagrange.

In Figure 7.1, we compare the strategy of fixing variables during the optimization of the Lagrangian dual with the fixing of variables only after optimal multipliers have been found. We see that filtering “on the fly” is slightly favorable. Our experience also shows that the subgradient method is very robust with respect to problem reduction, and we can afford not to reset the step length after variables have been fixed in the Lagrangian sub-problem without losing convergence in practice.

Next, in Table 7.4, we compare the performance of the algorithm we developed and the standard solver ILOG CPLEX 7.5 [118] when solving the CNDP and when proving the optimality of a given solution. Clearly, using LP-bounds improved by several kinds of cuts that CPLEX adds to the problem results in a huge reduction of search nodes. However, Lagrangian relaxation allows lower bounds to be computed much faster, so that the approach presented here is still competitive when solving the CNDP. It even achieves a considerable improvement upon the running time in the optimality proof. Most important, however, is the fact that the approach we developed is able to tackle much larger instances using the variable fixing heuristic presented in Section 7.3.

                      OPT: CPLEX vs. KP-BR0-KF-CIT    SOLVE: CPLEX vs. KP-BR0-KF-CIT
time     mean                   73.5%                           229.2%
         min                    9.63%                           22.48%
         max                     259%                           753.5%
         variance               36.5%                           356.5%
nodes    mean                 1148.1%                          3014.6%
         min                 196.666%                             100%
         max                    5250%                         10279.5%
         variance            10762.4%                         73762.5%

Tab. 7.4: Comparison of the CPLEX branch-and-cut algorithm and Lagrangian relaxation (pruning and reduction based on the knapsack relaxation plus cardinality interval tightening).

We compare the non-exact version of our approach (using the α-fixing heuristic) with other heuristic approaches [45, 94, 95] that have been developed for the CNDP. A comparison with CPLEX is left out because it is not at all competitive for this benchmark set containing larger CNDP instances. In Figure 7.2, we give the percentage of instances in a benchmark set (set C in [44, 45], containing 31 instances) that have been solved within a given solution quality (in percent, compared with the best known solution). Not only are the α-fixing variants with and without cardinality cuts clearly superior with respect to the achieved solution quality. On top of that, the heuristic variable fixing approach was stopped after at most 300 seconds of CPU time. On this benchmark set, heuristic variable fixing is on average about 6 times faster than TABU-PATH and 23 times faster than PATH-RELINKING (using SPECint values to make different architectures comparable).

7.5 Summary and Future Work

We have presented an approach for the solution of the Capacitated Network Design Problem. It is

based on a tree search where lower bounds based on Lagrangian relaxation are used for pruning.

Two kinds of relaxation are considered, the shortest path and the knapsack relaxation. The latter

is clearly favorable with respect to the convergence of the subgradient algorithm that optimizes

the Lagrangian dual.

Fig. 7.2: Comparison of different heuristic solvers for the CNDP. [Histogram for the benchmark set C; x-axis: solution quality (%), from 0 to "10 and larger"; y-axis: cumulated probability, from 0% to 100%; series: ALPHA 300, ALPHA CIT 300, TABU-PATH, TABU-ARC (400), SS/PL/ID (400), TABU-CYCLE, PATH-RELINKING.]

Two different variable fixing algorithms have been proposed in the literature that belong to the kind of relaxation that is chosen. When using the knapsack relaxation, we have shown how variables can also be fixed with respect to shortest-path considerations by using dual values in the Lagrangian knapsack sub-problem. However, even in a stronger version and in combination with node-capacity constraints, the shortest path variable fixing algorithm is too ineffective to justify the additional effort that is necessary for its application.

To tighten the problem formulation in a search node, we introduced the idea of local Lagrangian cardinality cuts. Experimental results show that their application improves the overall running time, even though the time per search node increases considerably when they are applied.

Finally, we compared the heuristic variable fixing approach with other heuristic approaches

developed for the CNDP. The results show that the tree-search approach that we implemented

clearly outperforms other heuristics both with respect to the CPU time needed and the solution

quality that is achieved.

Given that we set up our system for the evaluation of variable fixing algorithms and local Lagrangian cardinality cuts, and taking into account that no sophisticated methods (like bundle methods, for example) for the optimization of the Lagrangian dual are used, we consider these results very encouraging. Most importantly, note that no global cuts are introduced yet to strengthen the lower bounds computed. With the help of additional cuts that provably strengthen the LP relaxation (see [33] for example), we expect that the performance of the approach presented can be further improved.

Chapter 8

The Social Golfer Problem

In Chapter 4, we introduced the Social Golfer Problem as an example for a highly symmetric

constraint satisfaction problem. We revise the symmetry detection function for the symmetry

breaking method that was presented in Chapter 4. Then, with the help of heuristic constraint

propagation, we develop a state-of-the-art algorithm for the Social Golfer Problem.

The Social Golfer Problem has attracted much interest in recent years. That interest is mainly caused by its highly symmetric structure, which has made it a favorite playground for research on the systematic breaking of symmetries.

In [202], Barbara Smith applied an approach that breaks symmetries during the search, i.e.,

she uses SBDS [93] to tackle the Social Golfer Problem. In combination with careful model

selection she was able to efficiently break most of the symmetries, but still found non-unique

solutions for the instances studied. Note that work is in progress which removes SBDS’s need

for an explicit list of symmetries [92, 152]; eventually this should allow SBDS to be used to

eliminate all symmetries from the problem.

In [81], Filippo Focacci and Michela Milano presented another generic method for breaking

symmetries, based on global cut seeds, generating symmetry removal cuts. With this approach it

should be possible to eliminate all the symmetries of the Social Golfer Problem, but at the time

of writing this has not been done.

The very similar SBDD method that we presented in Chapter 4 was developed independently. It is based on the detection of dominance relations between choice points and works particularly well for highly symmetric problems. At the time of writing, this is the only technique which has been used to completely eliminate all symmetries from non-trivial instances of the Social Golfer Problem.

In [104], Warwick Harvey compares SBDS and SBDD, and also gives some numerical results on the Social Golfer Problem.

The approach that we present in the following was the best one known for the Social Golfer Problem at the time of publication. Since then, our results have been assimilated and extended by several other research groups.

In [16], Nicolas Barnier and Pascal Brisset developed an approach for the Social Golfer

Problem that extends the concept of SBDD by incorporating the branching variable selection.

In [174], Francois Puget also refines symmetry breaking for the Social Golfer Problem.

The work presented in this chapter was published in [191, 192]. It is structured as follows:

To keep the chapter self-contained, we review the definition of the Social Golfer Problem in Sec-

tion 8.1. Then, in Section 8.2, we present a refined symmetry breaking function. Incorporating

heuristic constraint propagation of some redundant constraints that are introduced in Section 8.3,

we present numerical results of our approach in Section 8.4.

8.1 Definition

Given natural numbers $w, g, s$, the Social Golfer Problem consists in finding $w$ partitionings of the set $\{1, \ldots, gs\}$ into $g$ sets of size $s$ such that no two such sets have more than one member in common. More formally, the problem is to compute $wg$ sets $X_{1,1}, \ldots, X_{w,g} \subseteq \{1, \ldots, gs\}$ such that $|X_{i,j}| = s$ for all $1 \le i \le w$, $1 \le j \le g$, $\bigcup_{j=1}^{g} X_{i,j} = \{1, \ldots, gs\}$ for all $1 \le i \le w$, and $|X_{i,j} \cap X_{k,l}| \le 1$ for all $(i,j) \ne (k,l)$.

An instance of the Social Golfer Problem is written as a triple $g$-$s$-$w$ from now on. When $w(s-1) = gs-1$, we have a configuration where every player must play with every other exactly once. This corresponds to a resolvable Balanced Incomplete Block Design. Perhaps the most well-known of these is Kirkman’s Schoolgirl Problem, posed (and solved) by Thomas Kirkman in 1850. This instance, which is equivalent to the golfer 5-3-7 problem, was stated as follows:

How can 15 schoolgirls walk in 5 rows of 3 each for 7 days so that no girl walks

with any other girl in the same triplet more than once?

To the best of our knowledge, the computational complexity of the Social Golfer Problem is not known yet. In the combinatorial design area, solutions for $s = 3$ are known as Kirkman Triple Systems or Resolvable Steiner Systems [41]. It can be shown that an instance $x$-$x$-4 is equivalent to finding two orthogonal latin squares of size $x$. Even more so, an instance $x$-$x$-$y$ is equivalent to finding a set of $y - 2$ mutually orthogonal latin squares, a problem that has been studied for many years now [42].
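To make the definition concrete, the following Python sketch checks whether a given schedule is a feasible solution of a $g$-$s$-$w$ instance. The representation (a list of weeks, each a list of sets of 1-based player numbers) and the function name `is_valid_schedule` are choices made for this illustration only.

```python
from itertools import combinations


def is_valid_schedule(schedule, g, s):
    """Check the Social Golfer conditions for a list of weeks.

    Every week must partition the players 1..g*s into g groups of size s,
    and no pair of players may share a group more than once.
    """
    players = set(range(1, g * s + 1))
    met = set()
    for week in schedule:
        if len(week) != g or any(len(group) != s for group in week):
            return False
        if set().union(*week) != players:
            return False
        for group in week:
            for pair in combinations(sorted(group), 2):
                if pair in met:
                    return False
                met.add(pair)
    return True


# The small instance 2-2-3: four players, two groups of two, three weeks.
# Here w(s-1) = gs-1 holds, so every pair of players meets exactly once.
schedule = [
    [{1, 2}, {3, 4}],
    [{1, 3}, {2, 4}],
    [{1, 4}, {2, 3}],
]
print(is_valid_schedule(schedule, g=2, s=2))
```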

Despite its apparently simple definition, computational approaches have great difficulties solving even small instances of the Social Golfer Problem in a reasonable amount of time. In our view, there are two main aspects of the problem that make it so hard for enumeration approaches:

• The Social Golfer Problem is highly symmetric.

• The clique structure of the constraints ensuring that any two golfers do not play together more than once makes it hard to judge the feasibility of a partial assignment.

In the following, we will address these two points by introducing a refined symmetry detec-

tion function for SBDD and the idea of heuristic constraint propagation.

8.2 Another SBDD-Approach for the Social Golfer Problem

The Social Golfer Problem contains a remarkable number of symmetries. Players can be placed at any position within a group, groups can be rearranged within their week, and the weeks can be ordered arbitrarily. Moreover, the player names can be permuted in any way desired. To give an example: even the best models (in terms of symmetry reduction) for the original schoolgirl instance still contain more than $10^{12}$ symmetries.

We have chosen to apply symmetry breaking during search (SBDD) in combination with a straightforward model for the Social Golfer Problem that can be implemented with very little effort using the ILOG SOLVER environment [121]. The groups are modeled as sets of players with the cardinality of each set fixed to $s$. Each week contains $g$ such sets, and the full pattern covers $w$ weeks. Initially, we fix all the players in the first week in increasing order. Additionally, we insert the first $s$ players into the first $s$ groups for all weeks thereafter. Finally, the first group of the second week can be filled with the smallest indexed players possible. None of these initial labelings exclude any unique solutions.
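The initial labelings can be sketched as follows. The code is an illustration, not the ILOG SOLVER model: the function name and data layout are hypothetical, and reading "the smallest indexed players possible" as players $1, s+1, 2s+1, \ldots$ (the first player of each of the first $s$ first-week groups) is an interpretation rather than a statement from the text.

```python
def initial_labeling(g, s, w):
    """Pre-assigned players as schedule[week][group] (players are 1-based)."""
    if w < 1:
        raise ValueError("at least one week is required")
    if w >= 2 and g < s:
        raise ValueError("with more than one week, at least s groups are needed")
    schedule = [[set() for _ in range(g)] for _ in range(w)]
    # Week 1: all players in increasing order, s per group.
    for group in range(g):
        schedule[0][group] = set(range(group * s + 1, (group + 1) * s + 1))
    # All later weeks: player p (1 <= p <= s) is placed into group p.
    for week in range(1, w):
        for p in range(1, s + 1):
            schedule[week][p - 1].add(p)
    # Week 2, group 1 (assumed reading): player 1 plus the first player of
    # each of the next s - 1 first-week groups.
    if w >= 2:
        schedule[1][0] = {k * s + 1 for k in range(s)}
    return schedule


# Kirkman's schoolgirl instance 5-3-7.
labeling = initial_labeling(g=5, s=3, w=7)
print(labeling[1])
```

For the 5-3-7 instance, this fixes the first group of the second week to players 1, 4, and 7, since players 2 and 3 have already played with player 1, and players 5 and 6 have already played with player 4.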

We build up the schedule week by week, choosing as branching variable the group with the

smallest domain or as branching value the player with the fewest possible groups he or she can

be assigned to, depending on what leaves us with fewer choices.

To apply SBDD, we need to define a symmetry detection function ϕ that, for two given choice points $c_1$ and $c_2$ represented as patterns reflecting the branching decisions taken so far, returns true if and only if there exists a symmetry showing that $c_1$ defines a sub-tree of $c_2$ under that symmetry. Then, at every choice point we check whether it is dominated in this fashion by some previously expanded choice point, and if so, we prune the search.

In Section 4.3, to find a symmetry that proves a dominance relation between choice points, we presented a procedure that, when given a specific player permutation, checks whether there exists a week permutation that shows that one pattern dominates the other. To remove all symmetries, it was then necessary to iterate over all player permutations, a test that turned out to be extremely expensive. Now, instead of iterating over all player permutations and computing week permutations, we suggest proceeding the other way round: given a permutation of the weeks, we check whether there exists a permutation of the players such that one pattern dominates the other. That is, we iterate over all week permutations in $c_2$ and search for a player permutation that is feasible with respect to the currently required matching of the weeks. Thus, again we set up a nested constraint satisfaction problem to find a suitable symmetry or prove that none exists.

As this full dominance check is still very expensive for the Social Golfer Problem, we only perform it when a week is being filled completely. This idea is motivated by the experiments that we presented earlier in Section 4.3. There, the application of a full dominance check turned out to be most efficient when being applied in selected levels of the search tree only. For the remaining nodes, we fix the player permutation to the identity and apply ΦW[…]G (see Section 4.3.2), that is, we search for symmetries according to feasible orderings of the weeks, the groups and the players within the groups. This less expensive check is implemented as a pairwise dominance check between weeks, followed by the computation of a maximum cardinality matching on a bipartite graph with node sets $V_1$ and $V_2$ and edge set $E$, where the weeks in $c_1$ and $c_2$ define the nodes in $V_1$ and $V_2$, respectively, and $(i,j) \in E$ iff week $i$ in $c_1$ is dominated by week $j$ in $c_2$.
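The matching step of this cheaper check can be sketched with a standard augmenting-path algorithm. The sketch assumes that the pairwise week dominance test has already produced a Boolean matrix `dominated`, and that dominance between the two patterns is reported when every week of $c_1$ can be matched; both the matrix representation and this acceptance criterion are assumptions made for the illustration.

```python
def max_bipartite_matching(dominated):
    """Maximum cardinality matching via augmenting paths.

    dominated[i][j] is True iff week i of c1 is dominated by week j of c2.
    """
    n1 = len(dominated)
    n2 = len(dominated[0]) if dominated else 0
    match_of_right = [None] * n2  # week of c1 matched to each week of c2

    def try_augment(i, visited):
        for j in range(n2):
            if dominated[i][j] and j not in visited:
                visited.add(j)
                if match_of_right[j] is None or try_augment(match_of_right[j], visited):
                    match_of_right[j] = i
                    return True
        return False

    return sum(try_augment(i, set()) for i in range(n1))


def weeks_can_be_matched(dominated):
    """True iff every week of c1 is matched to a distinct week of c2."""
    return max_bipartite_matching(dominated) == len(dominated)


# Hypothetical dominance matrices for patterns with three and two weeks.
matchable = [[True, True, False],
             [True, False, False],
             [False, True, True]]
not_matchable = [[True, False],
                 [True, False]]
print(weeks_can_be_matched(matchable), weeks_can_be_matched(not_matchable))
```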

8.3 Heuristic Constraint Propagation

Having introduced the Social Golfer Problem, and having presented an efficient way of handling

the many symmetries in the problem, we now introduce a new idea, the heuristic propagation of

additional redundant constraints by means of local search.

Assume we are given an NP-hard constraint satisfaction problem. Even though there is no

proof that we cannot solve the problem efficiently, there is strong empirical evidence that we

cannot compute a solution in polynomial time. The common approach then is to explore the

search space in some sophisticated manner that tries to consider huge parts implicitly. For con-

straint satisfaction problems, that means that we try to cut off preferably large regions that do not

contain any feasible solutions. In constraint programming, particularly when performing some

kind of tree search, domain filtering algorithms are used for that purpose. Basically, the model

and the degree of propagation determine how the work is partitioned between the choice points

and the search tree as a whole. That is, we can aim at reducing the number of choice points by

spending additional effort in each of the choice points, or we can choose to keep the work done

per choice point small, resulting in a bigger search tree.

Thus, we face a trade-off between the time spent per choice point and the total number of

choice points. To take the alternatives to extremes, on one hand we can explore the entire domain

space, and on the other hand we can compute a solution or prove that none exists in the first node

visited, without making any choices which might need to be backtracked. For most applications,

the optimal balance will lie somewhere between the two extremes.

If we find that we expand too many choice points we may want to give more burden to the

individual choice points. Revising the model by adding redundant constraints is a common way

to achieve that goal. In general, we expect the redundant constraints to detect inconsistencies

higher up in the search tree. However, since checking whether a given partial assignment is

extensible to a full solution is usually of the same computational complexity as the original

problem, redundant constraints typically still only enforce a relaxation of the actual problem.

We propose adding tight redundant constraints that may be hard to verify exactly, but that can be checked by applying some heuristic. That is, when formulating additional constraints, we do not wish to restrict ourselves to considering only those constraints which are (relatively) easy to propagate completely. Instead, we perform an incomplete check of complex redundant constraints using the rich heuristic machinery that was developed in the operations research community.

8.3.1 Literature on the Integration of CP and Local Search

In recent years, a substantial number of different approaches to the integration of constraint

programming (CP) and local search (LS) have been developed. The main problem when con-

structing CP-LS hybrids is caused by the fact that CP uses monotonic reasoning whereas LS does

not.

Fairly balanced hybrids result from sequential applications of the two methods [36, 173].

Other balanced hybrids can also be achieved by applying decomposition methods (like La-

grangian relaxation, column generation or Benders decomposition), where sub- and master prob-

lem can be solved by different solution methods.

On the other hand, there also exist many developments that favor one of the methods and

just use the other one to overcome certain weaknesses. Constrained local search, for example,

uses LS as the predominant approach, with CP used to find neighbors in a sparse and/or large

neighborhood [5, 165].

Focusing on constraint programming instead has yielded hybrids that use local search to adapt the variable and/or value ordering in the search tree. Only recently, an approach was presented that uses local search to find dominating partial assignments that prove the sub-optimality of the current search node, information that can be used for pruning [82]. For a more complete overview on the field we refer the reader to the recent tutorial by Focacci et al. [76].

As with other methods, constraint programming forms the basis of our approach. However,

our use of local search is quite different to any of the methods mentioned above. For a given

model for a problem, we suggest considering the addition of complex redundant constraints, and

then using local search to perform (incomplete) propagation of these constraints.

The idea of using a stochastic search method to prove unsatisfiability is not new; it was Challenge 5 in [196]. Also, heuristic search has been used when tackling certain optimization problems like maximum clique or graph coloring [15, 60]. However, it appears that the idea has never been introduced systematically. In the few examples where they are used, complex redundant constraints are only used for pruning, but not for domain filtering. To the best of our knowledge, the work presented here is the first to do so, albeit on a fairly specific set of constraints that comprise sub-problems of the real problem we are trying to solve.

We already mentioned that the constraints requiring that every golfer must not play with any other golfer more than once make it very hard to judge the extensibility of a partial assignment. That is, the sub-tree rooted at the current choice point may not contain any feasible solution, but due to the fact that the different constraints in the problem are propagated independently from each other and only interact via domain reductions, we cannot detect infeasibilities high up in the search tree. As a matter of fact, when using the model we described above, many searches will only backtrack when the assignments for an entire week are almost complete or even after having started to do assignments in the last week only.

For most constraint satisfaction problems, to check whether a partial assignment can still be

extended to a feasible solution is of the same computational complexity as the original problem

itself. Therefore, in general the pruning and filtering algorithm applied in the search nodes

cannot be expected to be exact. Rather, it must be looked at as a heuristic to tighten the problem

formulation and to shrink the search space. Of course, there is a trade-off between the time

needed to apply that heuristic and the time needed to explore the remaining sub-tree.

Now, to improve the situation for the Social Golfer Problem as well as for many other con-

straint satisfaction problems, we can try to formulate necessary constraints for partial assign-

ments to be extensible to complete, feasible solutions. For the Social Golfer Problem (and, we

believe, for many other problems as well), we are left with the decision to choose a weak redun-

dant constraint that can be propagated efficiently, or to pick a condition that is more accurate but

maybe much harder to verify. For the latter, we may consider applying a heuristic to perform

incomplete domain filtering or pruning. Note that since the added constraints are redundant, the

incompleteness of this filtering does not affect either the soundness or the completeness of the

search; if some opportunity for pruning is missed, it just means that the tree searched may be

larger than strictly necessary.

In the following, we describe two different types of additional constraints that we would like

to add to our model to be able to detect inconsistencies quickly and as early as possible. The first

type of redundant constraint is defined with respect to the possibility of completing the assign-

ments in a given week. Since we represent weeks by rows in our schedule under construction,

we use the term horizontal constraints to refer to these constraints. Correspondingly, by verti-

cal constraints we mean those used to check necessary conditions for a partial assignment to be

extensible to a w-week solution.


       | group 1 | group 2 | group 3 | group 4  | group 5
week 1 | 1 2 3   | 4 5 6   | 7 8 9   | 10 11 12 | 13 14 15
week 2 | 1 4 7   | 2 5 8   | 3 * *   | * * *    | * * *

Tab. 8.1: A partial instantiation of the 5-3-2 Social Golfer Problem.

       | group 1 | group 2 | group 3 | group 4  | group 5
week 1 | 1 2 3   | 4 5 6   | 7 8 9   | 10 11 12 | 13 14 15
week 2 | 1 4 7   | 2 5 8   | 3 6 10  | 11 * *   | 12 * *

Tab. 8.2: A more complete partial instantiation of the 5-3-2 Social Golfer Problem.

8.3.2 Horizontal Constraints

Consider the example in Table 8.1. We are searching for a two-week schedule for 15 golfers, to

be arranged in 5 groups of 3 in each week.

For the given partial assignment, suppose we add player 6 to group 3. Next observe that

players 10, 11 and 12 must be separated in week 2. As there are only three more groups that have

not yet been filled completely, the third player in group three must be one of players 10, 11 and

12. A possible continuation is given in Table 8.2.

Now there are only two groups left that have not been completed yet, but players 13, 14 and

15 must still be separated. Therefore, the current partial assignment is inconsistent, and we can

backtrack.

We can generalize this observation. For a given incomplete week, we define a residual graph

R that consists of a node for each unassigned player and an edge for each pair of such players that

have already been assigned to play together in some other week. An example of such a graph,

corresponding to week two from Table 8.1, is shown in Figure 8.1. Then, for the given week we

count the number of groups that are not closed yet and compare that number with the size of the

biggest clique in R. If the first number is smaller than the latter, then there is no way to extend

the current assignment to the rest of the week, and the assignment is inconsistent.

Fig. 8.1: The residual graph of week 2 from Table 8.1.
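The pruning test just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names are hypothetical, open positions are marked with `None`, and the largest clique is found by brute force, which is adequate only for graphs as small as the ones in Tables 8.1 and 8.2.

```python
from itertools import combinations

def met_pairs(weeks):
    """Unordered pairs of players that share a group in any of the given weeks."""
    pairs = set()
    for week in weeks:
        for group in week:
            members = [p for p in group if p is not None]
            pairs.update(frozenset(pair) for pair in combinations(members, 2))
    return pairs

def residual_graph(other_weeks, week, players):
    """Nodes: players still unassigned in `week`; edges: pairs that already met."""
    assigned = {p for group in week for p in group if p is not None}
    nodes = sorted(set(players) - assigned)
    met = met_pairs(other_weeks)
    edges = {frozenset(pair) for pair in combinations(nodes, 2)
             if frozenset(pair) in met}
    return nodes, edges

def max_clique_size(nodes, edges):
    """Brute-force clique number; for tiny illustrative graphs only."""
    for k in range(len(nodes), 0, -1):
        for cand in combinations(nodes, k):
            if all(frozenset(pair) in edges for pair in combinations(cand, 2)):
                return k
    return 0

def must_prune(other_weeks, week, players):
    """True if some clique is larger than the number of groups still open."""
    nodes, edges = residual_graph(other_weeks, week, players)
    open_groups = sum(1 for group in week if None in group)
    return max_clique_size(nodes, edges) > open_groups

week1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
table_8_1 = [[1, 4, 7], [2, 5, 8], [3, None, None],
             [None, None, None], [None, None, None]]
table_8_2 = [[1, 4, 7], [2, 5, 8], [3, 6, 10],
             [11, None, None], [12, None, None]]
players = range(1, 16)
print(must_prune([week1], table_8_1, players))  # three open groups, cliques of size 3
print(must_prune([week1], table_8_2, players))  # two open groups, clique {13, 14, 15}
```

For Table 8.1 the test does not fire, whereas for Table 8.2 the clique formed by players 13, 14 and 15 exceeds the two open groups.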


In this way we can define a sufficient condition for a witness which proves that the current

partial assignment is inconsistent: a clique exceeding a certain cardinality. If we can find such

a witness, we can backtrack immediately. Finding a clique of size k is known to be NP-hard for

arbitrary graphs, and while the residual graphs we are dealing with have special structure that

may allow the efficient computation of such a clique, we chose not to try to find a polynomial-

time complete method. Instead we apply a heuristic search to find a sufficiently large clique,

an approach which, as we shall see, has advantages over one which simply returns the largest

possible clique.

Reconsider the situation given in Table 8.1. We can add neither player 6 nor player 9 to

group 3 for the same reason: the members of groups 4 and 5 of week 1 must be separated, and

to do so we require the two open positions in group 3 of week 2.

When checking the redundant constraint described above, assume we have set up the residual

graph and suppose the heuristic we apply finds the two disjoint cliques of size three

($\{10, 11, 12\}$ and $\{13, 14, 15\}$). Since the sizes of these cliques are equal to the number of incomplete groups,

we have not found any witnesses showing that the current partial assignment cannot be extended

to a full schedule; indeed, the schedule can still be completed. However, since group 3 has only

two open positions left, we can conclude that group 3 must be a subset of

$\{3, 10, 11, 12, 13, 14, 15\}$. That is, we can use heuristic information for domain filtering.

We conclude that finding a witness for unsatisfiability is a rather complex task, but it can be

looked for by applying a heuristic. Moreover, even if we do not find such a witness, we may find

other “good” witnesses (namely some fairly large cliques) and their information can be combined

and used for domain filtering.

Therefore, it is advantageous to use a heuristic that not only provides us with good solutions

quickly, but that also gives us several solutions achieving almost optimal objective values. Local

search heuristics seem perfectly suited for this purpose.

8.3.2.1 Heuristic Clique Search

To find large cliques in the residual graph, we perform a randomized local search that works in

the following way: We initialize our current clique $C$ with a random node and set $C_{best} := C$.

Next we intensify $C$ by repeatedly searching for a random node that is adjacent to all nodes in $C$.

If no such node exists anymore, we compare the cardinalities $|C|$ and $|C_{best}|$ and update $C_{best}$ if

necessary. We then move on with a diversification step by adding a random node $v \notin C$ to $C$

and removing all nodes in $C$ that are not adjacent to $v$. These nodes shall not be considered in the

next diversification step. Now, the loop is complete and we return to the intensification phase.

The process stops after having found a clique that exceeds the crucial cardinality or after a given

iteration limit.
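A minimal Python sketch of this local search follows. It is not the C++ implementation used for the experiments; the function name, the adjacency-dictionary representation, and the seeded random generator are our own assumptions. We also read "shall not be considered in the next diversification step" as excluding the removed nodes from the choice of the next random node v, which is one possible interpretation.

```python
import random

def clique_local_search(adj, threshold, max_iters=1000, rng=None):
    """adj maps each node to the set of its neighbours (no self-loops).

    Returns the best clique found and the list of all maximal cliques visited.
    The search stops early once a clique with more than `threshold` nodes appears.
    """
    rng = rng or random.Random(0)
    nodes = list(adj)
    if not nodes:
        return set(), []
    clique = {rng.choice(nodes)}
    best = set(clique)
    visited = []
    excluded = set()  # nodes removed in the previous diversification step
    for _ in range(max_iters):
        # Intensification: add random nodes adjacent to every node in the clique.
        while True:
            cands = [v for v in nodes if v not in clique and clique <= adj[v]]
            if not cands:
                break
            clique.add(rng.choice(cands))
        if frozenset(clique) not in visited:
            visited.append(frozenset(clique))
        if len(clique) > len(best):
            best = set(clique)
        if len(best) > threshold:
            break
        # Diversification: force a random outside node v into the clique.
        outside = [v for v in nodes if v not in clique and v not in excluded]
        if not outside:
            outside = [v for v in nodes if v not in clique]
        if not outside:
            break
        v = rng.choice(outside)
        removed = {u for u in clique if u not in adj[v]}
        clique = (clique - removed) | {v}
        excluded = removed
    return best, visited

def complete_groups(groups):
    """Helper for the example: players in the same group are pairwise adjacent."""
    adj = {}
    for group in groups:
        for p in group:
            adj[p] = set(group) - {p}
    return adj

# Residual graph of week 2 in Table 8.3: three disjoint complete components.
adj = complete_groups([[11, 12], [14, 15, 16], [17, 18, 19, 20]])
best, visited = clique_local_search(adj, threshold=4, max_iters=10)
print(sorted(best), [sorted(c) for c in visited])
```

Returning every maximal clique visited, rather than only the best one, reflects the point made above: the collection of fairly large cliques is what domain filtering works with.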


       | group 1  | group 2  | group 3    | group 4     | group 5
week 1 | 1 2 3 4  | 5 6 7 8  | 9 10 11 12 | 13 14 15 16 | 17 18 19 20
week 2 | 1 5 9 13 | 2 6 10 * | 3 7 * *    | 4 8 * *     | * * * *

Tab. 8.3: A partial instantiation of the 5-4-2 Social Golfer Problem.

Fig. 8.2: The residual graph of week 2 from Table 8.3.

Obviously, the approach as sketched above produces a sequence of cliques that we can use

for pruning and domain filtering. We do not claim that the clique search procedure we use is

very sophisticated for finding maximum cardinality cliques in a given graph. In fact, many other

heuristic and exact approaches can be thought of, and there has certainly been a lot of relevant

research done. However, we did not aim at developing a special method that produces one clique

of large cardinality, but rather that finds a large number of fairly big cliques. In any event, while

the algorithm could no doubt be improved on, for the graphs we are dealing with it works well

enough to prove the point.

Before we continue by presenting another type of redundant constraint developed for the

Social Golfer Problem, we give a more complete example of how horizontal constraints can be

used for pruning and domain filtering.

Consider the partial assignment for the social golfer instance 5-4-2 given in Table 8.3. The

associated residual graph for week 2 is given in Figure 8.2.

The local search procedure that we apply returns three disjoint cliques, namely

$\{11, 12\}$, $\{14, 15, 16\}$, and $\{17, 18, 19, 20\}$. Since there are still four groups that have not been filled up

yet and the largest clique is of size four, we cannot prune right away. However, we do know

that the final element in group 2 must come from the clique of size four, and so we can shrink

the set of possible elements appearing in this group to $\{2, 6, 10, 17, 18, 19, 20\}$. This leaves just

three open groups, and we have a remaining clique of size 3 at hand. Again we cannot

prune here, but we know that the remaining two elements of groups 3 and 4 must come from the

cliques of size three and four, so we can shrink the set of possible elements of those groups to

$\{3, 7, 14, 15, \dots, 20\}$ and $\{4, 8, 14, 15, \dots, 20\}$, respectively. Now there is

only one open group left, but we still have to separate players 11 and 12; as a result, the current

assignment is inconsistent, and we can backtrack.


       | group 1 | group 2 | group 3 | group 4  | group 5
week 1 | 1 2 3   | 4 5 6   | 7 8 9   | 10 11 12 | 13 14 15
week 2 | 1 6 7   | 2 5 10  | 3 4 13  | 8 11 14  | 9 12 15
week 3 | 1 5 14  | 2 4 12  | 3 7 11  | 9 10 13  | 6 8 15
week 4 | 1 4 *   | 2 * *   | 3 5 *   | * * *    | * * 15
week 5 | * * *   | * * *   | * * *   | * * *    | * * *
week 6 | * * *   | * * *   | * * *   | * * *    | * * *
week 7 | * * *   | * * *   | * * *   | * * *    | * * *

Tab. 8.4: A partial instantiation of the 5-3-7 Social Golfer Problem.

The example shows that it can even be advantageous to have several cliques at hand rather

than one big clique only: all cliques together allowed us to prune the search at the current choice

point. If we had known the largest clique of size four only, though, we would have had to expand

the sub-tree below.
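The chain of deductions in this example can be condensed into a simple necessary condition: the members of each clique must go to pairwise different groups, and no group can take more players than it has open positions. The sketch below checks this greedily, always sending a clique to the groups with the most free positions. The function name and the greedy formulation are our own illustration, not the propagation routine of the implementation, and the check ignores all other unassigned players, so it can only prove inconsistency, never consistency.

```python
def cliques_fit(open_slots, clique_sizes):
    """Necessary condition for completing a week.

    open_slots:   number of free positions in each group of the week
    clique_sizes: sizes of pairwise disjoint cliques of the residual graph
    Returns False if the clique members cannot be spread over distinct groups.
    """
    slots = [s for s in open_slots if s > 0]
    for size in sorted(clique_sizes, reverse=True):
        slots.sort(reverse=True)
        if size > len(slots):
            return False          # fewer open groups than clique members
        for i in range(size):     # one member into each of the emptiest groups
            slots[i] -= 1
        slots = [s for s in slots if s > 0]
    return True

# Table 8.3, week 2: groups 2-5 have 1, 2, 2 and 4 open positions.
print(cliques_fit([1, 2, 2, 4], [4]))        # only the largest clique is known
print(cliques_fit([1, 2, 2, 4], [4, 3, 2]))  # all three cliques are known
```

With the clique of size four alone the condition is satisfied, whereas the three cliques together violate it, which mirrors the observation that several cliques can prune where the largest one cannot.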

8.3.3 Vertical Constraints

Horizontal constraints are very helpful for judging the extensibility of the week currently under

construction. However, they do not help much in getting a clearer view of whether a partial

assignment can still be extended to a full w-week solution. Therefore, we added another type of

redundant constraint to our model, so-called vertical constraints.

Again, we start our discussion with an example (see Table 8.4). In the current partial as-

signment, week 4 can still be completed consistently. Is there still a continuation of the given

schedule to a full 7-week solution, though?

Looking at player 15 (let us assume she is female), we find that she has played with all players in

$\{6, 8, 9, 12, 13, 14\}$ already. As there are 4 weeks left that she has not yet been assigned partners for,

she must still play with all players in $\{1, 2, 3, 4, 5, 7, 10, 11\}$. To be able to do so, there must be

four independent pairs of players in this set that still have not been assigned to play with each

other. To check this, we define a residual graph again (this time for each player, in contrast to

each week for horizontal constraints) that consists of one node for each player that player 15 has

not been assigned to play with yet, and an edge between all such players that have already played

with each other (see Figure 8.3).

Fig. 8.3: The residual graph of player 15 from Table 8.4.


       | group 1 | group 2 | group 3 | group 4
week 1 | 1 2 3   | 4 5 6   | 7 8 9   | 10 11 12
week 2 | 1 4 7   | 2 8 10  | 3 5 11  | 6 9 12
week 3 | 1 8 11  | 2 4 9   | 3 6 10  | 5 7 12
week 4 | * * *   | * * *   | * * *   | * * *
week 5 | * * *   | * * *   | * * *   | * * *

Tab. 8.5: A partial instantiation of the 4-3-5 Social Golfer Problem.

Fig. 8.4: The residual graph of player 12 from Table 8.5.

When trying to find four disjoint stable sets (the term independent set is also used in the

literature) of size two heuristically, we find that we do not succeed. Of course, this does not

prove that there is none. That is, for the vertical constraints we need a witness that not enough

disjoint stable sets of a given size exist. The given example already gives an idea of what such

a witness might look like. Looking at Figure 8.3, we find that the clique $\{1, 2, 3, 4, 5\}$ prevents

us from finding four disjoint stable sets of size two. That is, a clique that exceeds a certain

cardinality is a witness that the current assignment is inconsistent.
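For the instance in Table 8.4 the witness can be recomputed directly from the schedule. The following sketch is an illustration only: the helper names are hypothetical, the partially filled week 4 is encoded by listing just the players already placed in each group, and the largest clique is again found by brute force.

```python
from itertools import combinations

# Table 8.4; week 4 lists only the players already placed in each group.
schedule = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]],
    [[1, 6, 7], [2, 5, 10], [3, 4, 13], [8, 11, 14], [9, 12, 15]],
    [[1, 5, 14], [2, 4, 12], [3, 7, 11], [9, 10, 13], [6, 8, 15]],
    [[1, 4], [2], [3, 5], [], [15]],
]

def met_pairs(weeks):
    """All unordered pairs of players that already share a group."""
    return {frozenset(pair) for week in weeks for group in week
            for pair in combinations(group, 2)}

def player_residual_graph(weeks, player, players):
    """Nodes: players that `player` has not met; edges: pairs of them that met."""
    met = met_pairs(weeks)
    nodes = [q for q in players
             if q != player and frozenset((player, q)) not in met]
    edges = {frozenset(pair) for pair in combinations(nodes, 2)
             if frozenset(pair) in met}
    return nodes, edges

def largest_clique(nodes, edges):
    """Brute force; acceptable for the handful of nodes in this example."""
    for k in range(len(nodes), 0, -1):
        for cand in combinations(nodes, k):
            if all(frozenset(pair) in edges for pair in combinations(cand, 2)):
                return set(cand)
    return set()

nodes, edges = player_residual_graph(schedule, 15, range(1, 16))
clique = largest_clique(nodes, edges)
weeks_left = 4  # weeks 4 to 7: player 15 still needs one pair of partners per week
# Each pair of partners must be a stable set, so it holds at most one clique member.
inconsistent = len(clique) > weeks_left
print(nodes, sorted(clique), inconsistent)
```

Since player 15 must still meet all eight remaining players in four pairs, a clique of five nodes cannot be distributed over the four pairs, and the partial assignment can be rejected.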

To find large cliques in the residual graph of a player, we can apply the local search procedure

developed in Section 8.3.2.1. However, we are facing a slight drawback when using cliques as

witnesses. For the Schoolgirl Problem, i.e. when we know that every player must play with every

other player exactly once, the bound on the maximum cardinality clique in the residual graph can

be chosen to be rather tight. However, if we look at the Social Golfer Problem instance 4-3-5 for

example, no golfer plays every other golfer; each plays every other golfer except one,

and a priori we cannot say which one that is. As a result, if we look for a clique large enough to

guarantee that there will be too few disjoint sets of the appropriate size, that condition is stronger

than we would like. This means that it does not become satisfied until later in the computation,

and we are not able to prune as early as would be desirable.

To see an example of this, consider the partial schedule in Table 8.5. There are five players

which player 12 has so far not been assigned to play with, and to complete the schedule we need

two disjoint stable sets of size two. To prove that this is not possible with a single clique, we

would need a clique of size four. Looking at the residual graph for player 12 (see Figure 8.4),

there are three cliques of size three, namely $\{1, 2, 3\}$, $\{1, 2, 4\}$ and $\{1, 2, 8\}$, but no cliques of size

four. However, there are no pairs of disjoint stable sets of size two either, so the schedule cannot

be completed but this is not detected. (Note that the residual graphs for the other players all have

exactly the same structure.)

       | group 1 | group 2 | group 3 | group 4
week 1 | 1 2 3   | 4 5 6   | 7 8 9   | 10 11 12
week 2 | 1 4 7   | 2 5 12  | 3 8 11  | 6 9 10
week 3 | 1 6 11  | 2 4 9   | 3 7 12  | 5 8 10
week 4 | * * *   | * * *   | * * *   | * * *
week 5 | * * *   | * * *   | * * *   | * * *

Tab. 8.6: A partial instantiation of the 4-3-5 Social Golfer Problem.

Fig. 8.5: The residual graphs of players 4 (left), 6 (middle) and 12 (right) from Table 8.6.

As with the horizontal constraints, we can obtain better results by considering more than one

clique at once. To illustrate this, consider the slightly different example in Table 8.6. Looking

at the residual graph of player 12 (see Figure 8.5), we find two cliques of size three, namely

$\{1, 4, 6\}$ and $\{4, 6, 9\}$. As before, there are no cliques of size four, but this time there is a pair of

disjoint stable sets of size two, for instance $\{1, 9\}$ and $\{4, 8\}$. So we cannot tell that the schedule cannot be

extended simply by looking at this residual graph. However, we can draw inferences about which

player is the one that player 12 will not be playing with: it must be an element of the intersection

of all cliques of size three. This is because all cliques of size three have to be broken by the

removal of a node; otherwise we are guaranteed that there is no way to partition the remaining

four nodes into a pair of disjoint stable sets. Thus we can deduce that the player that player 12

must not play with is either player 4 or player 6.

We now look at the residual graphs for player 4 and player 6 (see Figure 8.5). For player 4

we find that the intersection of cliques such as $\{3, 8, 11\}$ and $\{10, 11, 12\}$ requires that player 4 will

certainly not play with player 11. Similarly, player 6 must not play with player 3, since, for

example, $\{2, 3, 12\} \cap \{3, 7, 8\} = \{3\}$. This means both of these players must play with player 12,

which contradicts the fact that player 12 must not play with one of them. Hence we can prune

the search.
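The inference chain for Table 8.6 can be replayed mechanically. The sketch below is specific to this situation (instance 4-3-5 after three complete weeks, where every player has exactly five unmet players and will miss exactly one of them). The function names are hypothetical and the cliques of size three are enumerated exhaustively rather than found by local search.

```python
from itertools import combinations

# The three completed weeks of Table 8.6.
weeks = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    [[1, 4, 7], [2, 5, 12], [3, 8, 11], [6, 9, 10]],
    [[1, 6, 11], [2, 4, 9], [3, 7, 12], [5, 8, 10]],
]
met = {frozenset(pair) for week in weeks for group in week
       for pair in combinations(group, 2)}

def unmet(player):
    """Players that `player` has not been grouped with so far."""
    return [q for q in range(1, 13)
            if q != player and frozenset((player, q)) not in met]

def triangles(player):
    """All cliques of size three in the residual graph of `player`."""
    return [set(t) for t in combinations(unmet(player), 3)
            if all(frozenset(pair) in met for pair in combinations(t, 2))]

def skip_candidates(player):
    """The one player that `player` never meets must lie in every triangle."""
    tris = triangles(player)
    return set.intersection(*tris) if tris else set(unmet(player))

candidates_12 = skip_candidates(12)
# Prune if every candidate is itself forced to skip somebody other than player 12,
# because then all candidates would have to play with player 12 after all.
contradiction = all(12 not in skip_candidates(p) for p in candidates_12)
print(sorted(candidates_12), skip_candidates(4), skip_candidates(6), contradiction)
```

Player 12 must skip player 4 or player 6, yet player 4 is forced to skip player 11 and player 6 is forced to skip player 3, so the sketch reports the same contradiction as the argument above.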


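                    | 4-3-3 | 4-3-4 | 4-3-5 | 5-4-3  | 5-4-4  | 5-4-5  | 5-4-6   | 5-3-6     | 5-3-7
PI  time (s)        | 0.32  | 0.71  | 0.98  | 46.63  | 83.24  | 132    | 143     | > 5 days  | > 5 days
PI  (choice points) | (102) | (182) | (227) | (7430) | (5392) | (6409) | (17129) | -         | -
PI  ms/choice point | 3.1   | 3.9   | 4.3   | 6.3    | 15.4   | 20.6   | 83.2    | -         | -
H   time (s)        | 0.23  | 0.43  | 0.31  | 40.19  | 43.2   | 43.74  | 33.80   | 13933     | 13311
H   (choice points) | (76)  | (87)  | (90)  | (6205) | (2829) | (2823) | (2818)  | (266140)  | (266268)
H   ms/choice point | 3.0   | 4.9   | 3.4   | 6.5    | 15.3   | 15.5   | 12.0    | 52.4      | 50.0
V   time (s)        | 0.36  | 0.62  | 0.47  | 57.30  | 90     | 98.25  | 46.69   | 85855     | 1815
V   (choice points) | (100) | (173) | (116) | (7415) | (5282) | (5574) | (3300)  | (1075165) | (153697)
V   ms/choice point | 3.6   | 3.6   | 4.1   | 7.7    | 17.0   | 17.6   | 14.1    | 79.9      | 11.8
HV  time (s)        | 0.25  | 0.4   | 0.26  | 47.78  | 47.45  | 44.18  | 23.21   | 6771      | 394
HV  (choice points) | (76)  | (75)  | (74)  | (6205) | (2750) | (2791) | (1770)  | (165238)  | (85790)
HV  ms/choice point | 3.3   | 5.3   | 3.5   | 7.7    | 17.3   | 15.8   | 13.1    | 41.0      | 46.0

Tab. 8.7: The CPU time needed to compute all unique solutions (in seconds), in brackets the number of

choice points visited, and the time per choice point (in milliseconds) when computing all unique solutions.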

8.4 Numerical Results

To confirm our theoretical discussion, we implemented the model as described in Section 8.2 in

C++, compiled by gcc 2.95 with maximal optimization (O3). All experiments were performed

on a PC with an Intel Pentium III/933 MHz processor and 512 MB RAM running Linux 2.4.

Regarding the additional redundant constraints, we present a comparison of four different

parameter settings:

1. The plain implementation without redundant constraints (PI),

2. PI plus horizontal constraints only (H),

3. PI plus vertical constraints only (V), and

4. PI plus horizontal and vertical constraints (HV).

In Table 8.7, the variants are evaluated on several social golfer instances. To make the com-

parison fair and reduce the impact of other choices such as the variable and value orderings used,

we compute all unique solutions of an instance (or prove that there are none, for the 4-3-5 and

5-4-6 instances), counting the number of choice points and measuring the CPU time required.

Clearly, using both types of additional constraints results in the biggest reduction of choice

points, though for small numbers of weeks the vertical constraints do not give any benefit when

horizontal constraints are used (and little benefit even when they are not). It is not surprising

that the vertical constraints are of little use when the number of weeks is small, since for such

instances the constraints are very weak: there is a lot of flexibility available in deciding which

players any given player should play with in the remaining weeks since the number of players

they will never play with is quite large. However, as the number of weeks increases, each player

must play with a greater number of the other players, so the amount of flexibility is reduced and

the constraints become steadily stronger.

                     | 4-3-3 | 4-3-4 | 4-3-5 | 5-4-3 | 5-4-4 | 5-4-5 | 5-4-6 | 5-3-6   | 5-3-7
PI  simple checks    | 111   | 285   | 358   | 27057 | 19804 | 22295 | 22673 | -       | -
PI  complete checks  | 36    | 115   | 137   | 3025  | 4515  | 5747  | 5814  | -       | -
PI  % time in checks | 28    | 45    | 48    | 31    | 71    | 81    | 80    | -       | -
H   simple checks    | 68    | 113   | 114   | 19264 | 10589 | 10690 | 10652 | 2013018 | 2012597
H   complete checks  | 36    | 64    | 58    | 3018  | 1805  | 1934  | 1870  | 463737  | 463984
H   % time in checks | 35    | 58    | 26    | 35    | 66    | 71    | 60    | 94      | 96
V   simple checks    | 105   | 270   | 123   | 26970 | 19280 | 20209 | 10896 | 7969706 | 724678
V   complete checks  | 36    | 110   | 10    | 3025  | 4515  | 4757  | 2310  | 2684251 | 143631
V   % time in checks | 22    | 40    | 2     | 25    | 62    | 63    | 35    | 96      | 80
HV  simple checks    | 68    | 89    | 70    | 19264 | 10262 | 10476 | 5993  | 1161383 | 386390
HV  complete checks  | 36    | 51    | 0     | 3018  | 1805  | 1785  | 1074  | 242709  | 18424
HV  % time in checks | 48    | 40    | 0     | 30    | 59    | 58    | 23    | 87      | 44

Tab. 8.8: The number of simple and complete symmetry checks, and the percentage of time spent in these

checks when computing all unique solutions.

A similar trend can be seen in the effectiveness of the horizontal constraints: while they

are useful for small numbers of weeks, their effectiveness improves as the number of weeks

increases. Again, this is due to the fact that as the number of weeks increases, each player

must play with a greater number of the other players, so that there is less flexibility available

in deciding who should play together in a given week. This means that it is more likely that

a partial week assignment cannot be completed, and hence the horizontal constraints can prune

more often.

Whether or not a reduction in the number of choice points also results in a reduction of CPU

time is of course determined by the trade-off between the time needed to apply the incomplete

propagation algorithms and the time saved by the reduction of choice points. Adding vertical

constraints when the number of weeks is small can be expected to worsen the CPU time due to

the ineffectiveness of these constraints on these instances, and that is indeed what happens. For

(almost) all other instances, the reduction in the number of nodes is sufficiently large to offset the

extra cost of the constraints. The only real surprise here is that adding the extra constraints can

result in the average amount of time spent at each node decreasing, sometimes quite markedly.

This is due to some synergy between the extra constraints and the symmetry breaking method

used: roughly speaking, the more pruning, the easier the average dominance check becomes,

presumably because the nodes pruned tend to have more expensive checks (see Table 8.8). Since

the dominance checks form a significant part of the run time, this can be quite a noticeable effect.

Columns: Number of Groups - Number of Players per Group

Number of Weeks | 2-2 | 3-2 | 3-3 | 4-2 | 4-3 | 4-4 | 5-2   | 5-3   | 5-4 | 5-5
2               | 1   | 1   | 1   | 2   | 1   | 1   | 2     | 2     | 1   | 1
3               | 1   | 2   | 1   | 8   | 4   | 2   | 23    | 251   | 40  | 2
4               | 0   | 1   | 1   | 16  | 3   | 1   | 310   | 13933 | 20  | 1
5               |     | 1   | 0   | 19  | 0   | 1   | 3468  | 9719  | 10  | 1
6               |     | 0   |     | 13  |     | 0   | 13277 | 49    | 0   | 1
7               |     |     |     | 6   |     |     | 14241 | 7     |     | 0
8               |     |     |     | 0   |     |     | 3192  | 0     |     |
9               |     |     |     |     |     |     | 396   |       |     |
10              |     |     |     |     |     |     | 0     |       |     |

Tab. 8.9: The number of unique solutions for several social golfer instances.

Using the (HV) setting, we have been able to compute all unique solutions for all instances

with at most 5 groups and 5 players per group (see Table 8.9). To the best of our knowledge, this

is the first computational approach that has been able to compute these numbers for instances

of this size. Moreover, we found solutions for many previously unsolved (at least by constraint

programming) larger instances, such as the 10-6-6 and the 9-7-4 instances.¹ Finally, we were

able to solve the formerly unknown 6-5-6 and 6-5-7 instances. We computed a six week solution

for the six groups of five instance, proved that there are exactly two unique solutions for that

instance, and showed that these solutions are optimal by proving that no seven week solution

exists.

8.5 Summary

For the Social Golfer Problem, we developed an algorithm that efficiently computes unique solu-

tions only. Symmetry breaking is based on the concept of SBDD. We have introduced the idea of

heuristic constraint propagation for complex redundant constraints. We proposed two different

types of additional constraints, so-called horizontal and vertical constraints. Propagating both

types of constraints exactly would require the computation of all the maximal cliques in residual

graphs with certain structural properties. Instead, we perform an incomplete propagation using

a local search method to find a number of maximal cliques (but perhaps not all of them and perhaps

not the largest possible). We have shown how such sets of cliques can be used for domain

filtering and pruning. The experiments clearly show that adding tight redundant constraints to

the problem can be of benefit, even when they are only propagated incompletely.

¹ An overview of solutions found by constraint programming can be found at [105].

Our ability to use local search for domain filtering and proving unsatisfiability relies on the

fact that we have identified sufficient conditions for proving that a partial schedule cannot be

extended to a complete one, and that a local search procedure can be defined such that a solution

returned is a proof that the conditions have been met. It would be interesting to see what other

kinds of constraints this can be done for, and whether it can be generalized far enough to pass

Challenge 5 from [196].

Chapter 9

Graph Bisection

The last problem that we consider in this thesis is the NP-hard Graph Bisection Problem. We

develop an algorithm that, given a graph, determines its exact bisection width. Due to its inherent

hardness, we cannot hope for an efficient algorithm to tackle the problem (at least in terms

of worst case running times and under the common assumption that $NP \neq P$), but we aim at

increasing the size of instances that can still be solved in an affordable amount of time.

As is the case for any combinatorial optimization problem, the task of computing the bi-

section width of a graph is twofold. First, an optimal solution to the problem must be computed,

and second, its optimality must be proven. Regarding the first task, efficient heuristics have been

developed in the literature to compute high quality and, for small graphs, often even optimal

solutions very quickly. To prove the optimality of a given solution, we use a total enumeration

branch & bound approach. The key issue for the development of a good tree search algorithm is

to compute tight lower bounds on the bisection width.

In [197], Sensen presented a new lower bound for the bisection width of a graph that consists

in solving a generalized Maximum Multicommodity Flow Problem: We need to solve a Max-

imum Multicommodity Flow Problem on an undirected graph, whereby the commodities have

exactly one source, but may have many sinks.

In contrast to the previous chapters, problem reduction is a minor subject of our research for

the Graph Bisection Problem. However, we have seen that filtering based on cost considerations

relies on a tight bound on the objective. Our contribution here is the development of time and

memory efficient algorithms that approximate Sensen’s lower bound. Building up on existing

techniques for the solution of Multicommodity Flow Problems, we develop two bound computa-

tion routines: the first is a fully polynomial time approximation scheme (FPTAS), and the second

is a Lagrangian relaxation based cost-decomposition approach. Both routines are embedded in

a branch & bound-framework and compared with a barrier LP-solver on various test instances.

The algorithms presented allow us to compute the bisection width of large structured graphs,

such as DeBruijn 9 and Shuffle-Exchange 10, which were unknown and out of the reach of exact

graph bisection algorithms before.

The work presented has not been published yet. It is joint work with Norbert Sensen and

Larissa Timajev. The chapter is structured as follows: In Section 9.1, we introduce the Graph

Bisection Problem. In Section 9.2, we review the lower bounds on the bisection width that have

been proposed in [197], especially the so-called VarMC-bound. In Sections 9.3 and 9.4, we

develop two algorithms for the computation/approximation of that VarMC-bound. The complete

branch & bound-approach is sketched in Section 9.5. Finally, in Section 9.6, we compare the

different algorithms on a set of test instances.

9.1 The Graph Bisection Problem

The Graph Bisection Problem is defined as follows:

Definition 9.1 Let $G = (V, E, u)$ denote an undirected, edge-weighted graph, whereby $u_{vw}$ is the

weight of the edge $\{v, w\} \in E$. Since $G$ is undirected, $u_{vw} = u_{wv}$ for any $\{v, w\} \in E$.

Let $2 \le k \in \mathbb{N}$ and $V_1, \dots, V_k \subseteq V$ such that, for all $1 \le i < j \le k$, $V_i \cap V_j = \emptyset$ and

$V = V_1 \cup \dots \cup V_k$. Then we call $\{V_1, \dots, V_k\}$ a k-partition (of G). A balanced k-partition of a

graph G additionally satisfies the condition $\bigl| |V_i| - |V_j| \bigr| \le 1$ for any $1 \le i < j \le k$.

Given a k-partition $\{V_1, \dots, V_k\}$, we set $U := \{\{v, w\} \mid v \in V_i,\ w \in V_j,\ i \neq j\} \cap E$. Then

the value $\sum_{\{v, w\} \in U} u_{vw}$ is called the cut size of the partition. The minimal cut size among

all balanced k-partitions of a graph G is called the k-section width of G.

For $k = 2$, a k-partition is also called a bisection (of G). The minimum cut size over all

balanced bisections of G is called the bisection width of G.



Given an edge-weighted, undirected graph G, the Graph Bisection Problem consists in the

computation of the bisection width of G.
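To make the definition concrete, the following sketch computes the cut size of a partition and determines the bisection width of a very small graph by enumerating all balanced bisections. The function names and the dictionary representation of the edge weights are our own assumptions, and the enumeration is exponential in the number of vertices, so it serves purely as an illustration of the definition, not as a solution method.

```python
from itertools import combinations

def cut_size(weights, parts):
    """Sum of u_vw over all edges whose endpoints lie in different blocks.

    weights: dict mapping frozenset({v, w}) to the edge weight u_vw
    parts:   list of pairwise disjoint vertex sets covering all vertices
    """
    block = {v: i for i, part in enumerate(parts) for v in part}
    return sum(u for edge, u in weights.items()
               if len({block[v] for v in edge}) == 2)

def bisection_width(vertices, weights):
    """Minimum cut size over all balanced bisections, by total enumeration."""
    vertices = list(vertices)
    best = None
    # Blocks of sizes floor(n/2) and ceil(n/2) differ in size by at most one.
    for half in combinations(vertices, len(vertices) // 2):
        v1 = set(half)
        v2 = set(vertices) - v1
        value = cut_size(weights, [v1, v2])
        if best is None or value < best:
            best = value
    return best

# A cycle on four vertices with unit weights.
cycle = {frozenset(e): 1 for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}
print(cut_size(cycle, [{1, 2}, {3, 4}]))     # edges {2,3} and {4,1} are cut
print(bisection_width([1, 2, 3, 4], cycle))
```

Every balanced bisection of the cycle cuts at least two edges, so both calls print 2.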

For many special graph classes (such as grids, tori, hyper-cubes, butterflies, cube-connected-

cycles etc.), the bisection width has been determined by theoretical analysis. However, in general

the Graph Bisection Problem is NP-hard [89]. The best known approximation algorithm for

the problem is given in [69]; it achieves a poly-logarithmic approximation quality in $O(\log^2 n)$.

In practice, the size of instances that can still be solved efficiently is rather small. Until today,

optimal bisections can be computed for small graphs with a few hundred vertices only. On the

other hand, there are very efficient heuristics available for the problem [109, 134, 164, 172, 214].

Therefore, the main problem is the proof of optimality, and thus, the computation of tight lower

bounds on the bisection width of a graph.


In recent years, a number of approaches have been presented that solve the Graph Partition-

ing Problem exactly. In [72], a branch & cut algorithm for the problem is presented. The same

approach is followed in [27], whereas in [124], the Graph Partitioning Problem is tackled using

a column generation approach. The most recent and apparently also most successful approach is

presented in [133] where semi-definite programming relaxations are used as lower bounds. An

experimental comparison of our developments and the semi-definite bound in [133] is given in

Section 9.6.

9.2 Bounds on Graph Bisection

Our work is based on the lower bounds on the Graph Partitioning Problem introduced in [197].

In the following, we give a small survey on the main ideas of these bounds:

A well-known lower bound on the bisection width can be achieved by embedding a clique

with the same number of nodes $n$ into the given graph $G$ [140]. If that complete graph can

be embedded with a congestion $C$, we know that $G$ has a bisection width of at least $\frac{n^2}{4C}$. A

similar lower bound on the bisection width can be computed by solving a Multicommodity Flow

Problem: For every node in G, we introduce a commodity that originates at its corresponding

node. Every other node requires exactly one unit of that commodity. That way, every node has

to send one unit of its corresponding commodity to every other node. Note that we do not need

to enforce integrality constraints on the flows while strengthening the bound computed.
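The counting argument behind the clique embedding bound can be sketched as follows. For simplicity we assume, and this restriction is ours, that $n$ is even and that all edge weights are one. Let $(V_1, V_2)$ be a balanced bisection of $G$ with cut edge set $U$. Every clique edge with one endpoint in each block is routed along a path that must use at least one edge of $U$, and each edge of $U$ carries at most $C$ such paths:

```latex
\begin{align*}
  \#\{\text{clique edges between } V_1 \text{ and } V_2\}
      &= |V_1| \cdot |V_2| = \frac{n}{2} \cdot \frac{n}{2} = \frac{n^2}{4}, \\
  \frac{n^2}{4} &\le C \cdot |U|
      \quad\Longrightarrow\quad |U| \ge \frac{n^2}{4C}.
\end{align*}
```

Since this holds for every balanced bisection, it holds in particular for one of minimum cut size.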

We can improve this bound by taking two steps of generalization: First, it can be observed

that every single-source multicommodity flow instance (with arbitrary demands and destinations)

can be used to compute a lower bound on the bisection width. The critical point is the CutFlow,

i.e. the amount of flow which can be ensured to cross every possible bisection of the graph. It is

easy to show that $\frac{CutFlow}{C}$ is always a valid lower bound on the bisection width of a graph.

Second, we do not have to select an appropriate multicommodity flow instance by ourselves:

Typically, linear programming techniques are used to solve Multicommodity Flow Problems.

In [197], it is proposed to leave the selection of an appropriate multicommodity flow instance to

the linear program by adding some variables and constraints. Two different possibilities with a

different degree of freedom for the selection of the Multicommodity Flow Problem have been

introduced: the VarMC and the MVarMC formulations. In the MVarMC formulation, every node

has the freedom to send a commodity of arbitrary size to each other node. On the other hand, in

the VarMC formulation, every node has to send a commodity of arbitrary size to every other node,

whereby each destination gets the same share. Experiments have shown that the VarMC formulation

gives equally good bounds on connected graphs as the MVarMC formulation. Therefore,

we develop different algorithms for the computation of the VarMC-bound.


Chapter 9. Graph Bisection

9.3 Approximation of the VarMC-bound

Even though the continuous Multicommodity Flow Problem (MMCF) is solvable in polynomial

time, for more than a decade now researchers have tried to develop approximation schemes

for the problem. The reason for this at first surprising fact is that large multicommodity flow

instances have to be solved in many application areas and in combination with a whole variety of

discrete optimization problems such as network design or graph bisection. Standard simplex or

interior point LP-solvers, even specialized software based on primal basis partitioning [66, 67]

or Lagrangian relaxation based resource- or cost-decompositions [44, 84] are simply not fast

enough to tackle real-size instances of these problems in a reasonable amount of time. Therefore,

there is a big interest in the development of algorithms that provide near-optimal solutions more

quickly.

The first FPTAS’s for maximum multicommodity flow were based on Lagrangian relaxations

and linear programming. They have been improved and adapted to different models in a series

of papers [100, 101, 131, 135, 141, 169, 175, 198]. While all this research was built on the idea

of rerouting existing flows, the idea of augmenting flow has led to a couple of new algorithms

with improved running times [74, 75, 90, 130, 217]. Several publications also report on the

usability of approximation algorithms for Multicommodity Flow Problems in practice [2, 98,

101, 176, 184].

In this section, we develop an ε-approximation scheme for the VarMC-bound: Let $G = (V, E, u)$ denote an undirected, edge-weighted graph with $|V| = n$ and $|E| = m$, with associated node-arc incidence matrix $N \in \{-1, 0, 1\}^{n \times 2m}$, and capacities $u \in \mathbb{R}^m$, $u > 0$. Furthermore, let $K \in \mathbb{N}$ denote the number of commodities with source nodes $s_k \in V$ and demands $d^k \in \mathbb{R}^n$, $1 \le k \le K$, whereby $d^k_i \ge 0$ for all $i \ne s_k$, and $\sum_i d^k_i = 0$. Finally, denote the cost coefficient of commodity $k$ by $p_k$. Then the problem is to

Maximize $\sum_k p_k \lambda_k$
subject to $N x^k = \lambda_k d^k$ for all $1 \le k \le K$
$\sum_k (x^k_{ij} + x^k_{ji}) \le u_{ij}$ for all $\{i,j\} \in E$
$x \ge 0$, $\lambda \ge 0$

This way to state the problem allows us directly to compute a lower bound on the bisection width of a given graph (compare with Section 9.2): the objective maximizes the cutflow, whereby the variables $\lambda_k$ determine the sender volume of commodity $k$, and the $x^k$ give the corresponding flow in the network. The constraints $N x^k = \lambda_k d^k$ ensure that $x^k$ is a feasible flow of commodity $k$ with volume $\lambda_k$, and the restrictions $\sum_k (x^k_{ij} + x^k_{ji}) \le u_{ij}$ enforce that the capacity of the undirected edges $\{i,j\} \in E$ is not violated.

Note that, to solve the problem, it is sufficient to look at the special case with $p_k = 1$ for all $1 \le k \le K$, because for $p_k = 0$ we set $\lambda_k = 0$ and $x^k = 0$, and otherwise we can scale the demand $d^k$ by $\frac{1}{p_k}$.


9.3.1 The FPTAS

We try to keep the chapter self-contained. Note, however, that the following description is analogous

to the “maximum multicommodity flow” section in [74], which itself builds directly on

the analysis given in [90]. Our contribution here consists in the generalization of the existing

algorithms for the Maximum Multicommodity Flow Problem to an FPTAS that can handle com-

modities with more than one sink. This, of course, is essential for the computation of a lower

bound for the bisection width of a graph (see Section 9.2). At the same time, since we focus on

graph bisection here, we enable the FPTAS to handle undirected edges. However, the changes

that are necessary for that purpose are not of fundamental nature, so that the theory we present

can also be used for the development of an approximation scheme for generalized Maximum

Multicommodity Flow Problems on directed graphs with multi-sink commodities.

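For all $1 \le k \le K$, denote the set of all paths from the source $s_k$ to the sink $i \in V \setminus \{s_k\}$ by $\pi^k_i$. Furthermore, set $\pi^k := \bigcup_{i \ne s_k} \pi^k_i$. Using variables $y^k_P$ representing the flow of commodity $k$ along some path $P \in \pi^k$, we achieve a path-based formulation of the problem

Maximize $\sum_k \lambda_k$
subject to $\sum_{P \in \pi^k_i} y^k_P = \lambda_k d^k_i$ for all $1 \le k \le K$, $i \ne s_k$
$\sum_k \sum_{P \in \pi^k : \{i,j\} \in P} y^k_P \le u_{ij}$ for all $\{i,j\} \in E$
$y \ge 0$, $\lambda \ge 0$

The dual of the previous LP can be written as

Minimize $\sum_{\{i,j\} \in E} u_{ij} l_{ij}$
subject to $\sum_{\{i,j\} \in P} l_{ij} \ge z^k_i$ for all $1 \le k \le K$, $i \ne s_k$, $P \in \pi^k_i$
$\sum_{i \ne s_k} d^k_i z^k_i \ge 1$ for all $1 \le k \le K$
$l \ge 0$

That is, we have to assign lengths $l_{ij} \ge 0$ to each edge $\{i,j\} \in E$, such that $\sum_{i \ne s_k} d^k_i \, \mathrm{dist}^k_i(l) \ge 1$ for all $1 \le k \le K$, and $\sum_{\{i,j\} \in E} u_{ij} l_{ij}$ is minimized, whereby $\mathrm{dist}^k_i(l)$ denotes the shortest path distance from $s_k$ to the sink $i \in V$ in $G$ under length function $l$.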

The approximation scheme that we investigate in the following is sketched in Algorithm 5. We start with length function $l \equiv \delta$ for an appropriately defined $\delta > 0$, and the primal solution $x \equiv 0$. The length function $l$ is defined on the undirected edge set $E$, whereas we maintain flow values $x^k_{ij}$ and $x^k_{ji}$ for each edge $\{i,j\} \in E$ and all $1 \le k \le K$. While there is still a tree $T$ along which we can route the demand of a commodity $k$ with costs less than 1, the algorithm selects such a tree and augments flow along this tree. More precisely, the algorithm selects a tree with approximately minimal costs up to an approximation factor of $1 + \varepsilon$. This property is achieved by maintaining a lower bound $\hat{\alpha}$ on the current minimal routing costs of any commodity.

Algorithm 5 APPROXIMATION SCHEME
 1: $x :\equiv 0$, $l :\equiv \delta$
 2: $\hat{\alpha} := \min_k \sum_{i \ne s_k} d^k_i \delta$, oldSource := nil
 3: repeat
 4:   for all $1 \le k \le K$ do
 5:     if oldSource $\ne s_k$ then
 6:       $T$ := SHORTEST_PATH_TREE($s_k$, $l$)
 7:       oldSource := $s_k$
 8:     while $\sum_{i \ne s_k} d^k_i \, \mathrm{dist}^k_i(l) < \min\{1, (1+\varepsilon)\hat{\alpha}\}$ do
 9:       for all $(i,j) \in T$ do
10:         $c^k_{ij} := \sum_{h \in T_j} d^k_h$
11:       $\varphi := \min_{(i,j) \in T} \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}}$, $\Delta :\equiv 0$
12:       for all $(i,j) \in T$ do
13:         $\Delta_{ji} := -\min\{x^k_{ji}, \varphi c^k_{ij}\}$
14:         $\Delta_{ij} := \varphi c^k_{ij} + \Delta_{ji}$
15:         if $\Delta_{ij} + \Delta_{ji} > 0$ then
16:           $l_{ij} := l_{ij} \left(1 + \varepsilon \frac{\Delta_{ij} + \Delta_{ji}}{u_{ij}}\right)$
17:       $T$ := SHORTEST_PATH_TREE($s_k$, $l$)
18:       $x^k := x^k + \Delta$
19:   $\hat{\alpha} := (1+\varepsilon)\hat{\alpha}$
20: until $\hat{\alpha} \ge 1$

The amount of flow sent along tree $T$ is determined in the following way: Let $T_j$ denote all nodes in the sub-tree rooted at node $j \in V$ (including $j$). For each edge $(i,j) \in T$, we compute the congestion $c^k_{ij}$ when routing the demand of commodity $k$ along that tree, i.e., we set $c^k_{ij} := \sum_{h \in T_j} d^k_h$. Basically, we achieve a feasible routing by scaling the flow by $\min_{(i,j) \in T} \frac{u_{ij}}{c^k_{ij}}$. However, since we are working on an undirected network here, we would like to consider only flows with $\min\{x^k_{ij}, x^k_{ji}\} = 0$ for all $1 \le k \le K$ and $\{i,j\} \in E$. When we also incorporate and change the current flow $x^k_{ji}$ of commodity $k$ in the opposite direction, we achieve an even bigger scaling factor of $\min_{(i,j) \in T} \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}}$. Formally, we can prove Lemma 9.1 regarding the change $\Delta_{ij} + \Delta_{ji}$ of the current flow on edge $\{i,j\} \in E$ of commodity $k$.

In case of a positive flow change on an edge $\{i,j\} \in E$ (i.e., when $\Delta_{ij} + \Delta_{ji} > 0$), we update the dual variables by setting $l_{ij} := \left(1 + \varepsilon \frac{\Delta_{ij} + \Delta_{ji}}{u_{ij}}\right) l_{ij}$, i.e., we increase the length of an edge with respect to the congestion of that edge.

Finally, the primal solution is updated by $x^k_{ij} := x^k_{ij} + \Delta_{ij}$. This setting may yield an infeasible solution, since it may violate some capacity constraint $\sum_k (x^k_{ij} + x^k_{ji}) \le u_{ij}$ for some edge $\{i,j\} \in E$. However, the mass balance constraints are still valid. This allows us, at the end of the algorithm, to scale the final flows $x^k$ so that they build a feasible solution to the problem.
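A single augmentation step of this kind can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the implementation used for the experiments: the function name `augment_along_tree` and the dictionary-based data layout are hypothetical, demands are given as non-negative values at the sinks only, and tree edges whose sub-tree carries no demand are skipped to avoid a division by zero.

```python
def augment_along_tree(parent, order, demand, x, u, length, eps):
    """Augment one commodity along a routing tree and update the edge lengths.

    parent: dict node -> predecessor in the tree (the root maps to None)
    order:  tree nodes in root-first order (parents before children)
    demand: dict node -> non-negative sink demand of this commodity
    x:      dict (i, j) -> current flow of this commodity on arc i -> j
    u:      dict frozenset({i, j}) -> capacity of the undirected edge
    length: dict frozenset({i, j}) -> dual length, updated in place
    Returns the scaling factor phi (0.0 if nothing can be routed).
    """
    # c[j] = total demand in the sub-tree rooted at j, accumulated bottom up.
    c = {v: demand.get(v, 0.0) for v in order}
    for j in reversed(order):
        if parent[j] is not None:
            c[parent[j]] += c[j]
    arcs = [(parent[j], j) for j in order
            if parent[j] is not None and c[j] > 0]
    if not arcs:
        return 0.0
    # Scaling factor that also cancels flow running in the opposite direction.
    phi = min((2 * x.get((j, i), 0.0) + u[frozenset((i, j))]) / c[j]
              for i, j in arcs)
    for i, j in arcs:
        edge = frozenset((i, j))
        back = x.get((j, i), 0.0)
        delta_ji = -min(back, phi * c[j])   # cancel opposite flow first
        delta_ij = phi * c[j] + delta_ji    # remainder goes forward
        change = delta_ij + delta_ji
        if change > 0:
            length[edge] *= 1 + eps * change / u[edge]
        x[(i, j)] = x.get((i, j), 0.0) + delta_ij
        x[(j, i)] = back + delta_ji
    return phi
```

On a path 0-1-2 with unit capacities and unit demands at nodes 1 and 2, the edge {0,1} is the bottleneck: it receives a flow change equal to its capacity, so its length grows by the full factor $1 + \varepsilon$.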

For the algorithm sketched above, we are able to prove

Theorem 9.1 Let TSP





nlogn, and S



















. An ε-approximate

VarMC-bound on the bisection width of a graph can be computed in time O





mTSP









ε2



Following the analysis in [74], we prove the previous theorem with the help of a sequence of lemmata. We set $\rho := \min_k \min\{ d^k_i \mid i \ne s_k,\ d^k_i > 0 \}$, and $\sigma := \log_{1+\varepsilon} \frac{1+\varepsilon}{\delta\rho}$. Then,

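Lemma 9.1 In every iteration in which the current flow is changed, it holds that:

a) $\Delta_{ij} + \Delta_{ji} \le u_{ij}$ for all $\{i,j\} \in E$, and

b) there exists an edge $\{i,j\} \in E$ such that $\Delta_{ij} + \Delta_{ji} = u_{ij}$.

Proof: Let $\varphi = \min_{(i,j) \in T} \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}}$.

a) For $(i,j), (j,i) \notin T$ there is no change in the flow. Let $(i,j) \in T$. In case of $x^k_{ji} = 0$, we have $\Delta_{ij} + \Delta_{ji} = \Delta_{ij} = \varphi c^k_{ij} \le \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = u_{ij}$. In case of $x^k_{ji} \ge \varphi c^k_{ij}$, we have $\Delta_{ji} = -\varphi c^k_{ij}$ and $\Delta_{ij} = 0$. Thus, $\Delta_{ij} + \Delta_{ji} = \Delta_{ji} = -\varphi c^k_{ij} \le u_{ij}$. In case of $0 < x^k_{ji} < \varphi c^k_{ij}$, we have $\Delta_{ij} + \Delta_{ji} = \varphi c^k_{ij} - 2 x^k_{ji} \le \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} - 2 x^k_{ji} = u_{ij}$. Finally, the case $(j,i) \in T$ is analogous to $(i,j) \in T$.

b) Consider the edge $(i,j) \in T$ with $\varphi = \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}}$. In case of $x^k_{ji} = 0$, we have $\Delta_{ij} + \Delta_{ji} = \Delta_{ij} = \varphi c^k_{ij} = \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = u_{ij}$. In case of $x^k_{ji} > 0$, it holds that $x^k_{ji} < 2 x^k_{ji} + u_{ij} = \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} = \varphi c^k_{ij}$. Thus, $\Delta_{ij} + \Delta_{ji} = \varphi c^k_{ij} - 2 x^k_{ji} = \frac{2 x^k_{ji} + u_{ij}}{c^k_{ij}} c^k_{ij} - 2 x^k_{ji} = u_{ij}$.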

Lemma 9.2 The flow obtained by scaling the final flow by $\frac{1}{\sigma}$ is primal feasible.

Proof: Let $\Delta_{ij}(t)$ denote the change of the flow on edge $(i,j) \in T$ in iteration $t$. Then, $\sum_{t \le I} (\Delta_{ij}(t) + \Delta_{ji}(t))$ equals the flow on edge $\{i,j\} \in E$ after iteration $I$. It is sufficient to show that, at the end of the algorithm, it holds that $\sum_t (\Delta_{ij}(t) + \Delta_{ji}(t)) \le \sigma u_{ij}$.

In each iteration with $\Delta_{ij}(t) + \Delta_{ji}(t) > 0$, the length $l_{ij}$ of edge $\{i,j\} \in E$ increases by a factor of $1 + \varepsilon \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}$. Denote the set of all iterations $t$ in which $\Delta_{ij}(t) + \Delta_{ji}(t) > 0$ by $I$. Then we have that

$l_{ij} = \delta \prod_{t \in I} \left(1 + \varepsilon \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}\right)$ (9.1)

With Lemma 9.1 and $1 + \varepsilon x \ge (1+\varepsilon)^x$ for all $0 \le x \le 1$, we have that

$l_{ij} \ge \delta \prod_{t \in I} (1+\varepsilon)^{\frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}} = \delta (1+\varepsilon)^{\sum_{t \in I} \frac{\Delta_{ij}(t) + \Delta_{ji}(t)}{u_{ij}}}$ (9.2)

Thus, at the end of the algorithm we have

$\sum_{t \in I} (\Delta_{ij}(t) + \Delta_{ji}(t)) \le u_{ij} \log_{1+\varepsilon} \frac{l_{ij}}{\delta}$ (9.3)

Since the left hand side is a valid upper bound for the final congestion on edge $\{i,j\} \in E$, it remains to show that $l_{ij} \le \frac{1+\varepsilon}{\rho}$. Assume the opposite. With Lemma 9.1, $l_{ij}$ always increases by a factor of at most $1 + \varepsilon$. Thus, before the last change it must hold $l_{ij} > \frac{1}{\rho}$. Consider the iteration in which $l_{ij}$ increases the last time during the increase of the flow regarding some commodity $1 \le k \le K$. We know that then $\sum_{h \ne s_k} d^k_h \, \mathrm{dist}^k_h(l) < 1$. However, since $l_{ij} > \frac{1}{\rho}$ and $(i,j)$ is an edge on the shortest path from $s_k$ to one of its sinks, we have that $\sum_{h \ne s_k :\, d^k_h > 0} \mathrm{dist}^k_h(l) > \frac{1}{\rho}$. Thus,

$\sum_{h \ne s_k} d^k_h \, \mathrm{dist}^k_h(l) \ge \rho \sum_{h \ne s_k :\, d^k_h > 0} \mathrm{dist}^k_h(l) > 1$ (9.4)

which is a contradiction.

Lemma 9.3 Let $\tau := \frac{1+\varepsilon}{\rho}$, and denote the maximum number of edges in a simple path from a source $s_k$ to one of its sinks $i \in V$ by $L$. When setting $\delta := \tau \left( \tau L \max_k \sum_{i \ne s_k} d^k_i \right)^{-1/\varepsilon}$, the final flow scaled by $\frac{1}{\sigma}$ is optimal with a relative error of at most $3\varepsilon$.

Proof: We prove the desired accuracy of the solution computed by comparing it against the objective value $D$ of a dual feasible solution, which our algorithm produces as a byproduct, and which gives us a valid upper bound on the primal optimal solution value $Z_{opt}$.

For any given length function $l : E \to \mathbb{R}_{\ge 0}$, let $\alpha(l)$ denote the minimal routing costs of all commodities, i.e., $\alpha(l) := \min_k \sum_{i \ne s_k} d^k_i \, \mathrm{dist}^k_i(l)$. Furthermore, denote the dual objective value corresponding to the current choice of $l$ by $D(l) := \sum_{\{i,j\} \in E} u_{ij} l_{ij}$. Then the optimal dual objective value is $Z_{opt} = \min_l \frac{D(l)}{\alpha(l)}$.

Consider iteration $t$, and let $l(t-1)$, $k$, $\varphi$, $c$ and $T$ denote the current choice of the length function, commodity, scaling factor, congestion vector and routing tree, respectively. For ease of notation, we set $\alpha(t) := \alpha(l(t))$, $D(t) := D(l(t))$, and $\mathrm{dist}^k_h(t) := \mathrm{dist}^k_h(l(t))$. For any edge $\{i,j\} \in E$, define $\Gamma_{ij} := \max\{\Delta_{ij}(t) + \Delta_{ji}(t), 0\}$. Note that, in every iteration $t$, it holds that $\Gamma_{ij} \le \varphi c^k_{ij}$. Then,

$D(t) = D(t-1) + \varepsilon \sum_{\{i,j\} \in E} \Gamma_{ij} \, l_{ij}(t-1)$
$\le D(t-1) + \varepsilon \sum_{(i,j) \in T} \varphi c^k_{ij} \, l_{ij}(t-1)$
$= D(t-1) + \varepsilon \varphi \sum_{h \ne s_k} d^k_h \sum_{(i,j) \in T;\, h \in T_j} l_{ij}(t-1)$
$= D(t-1) + \varepsilon \varphi \sum_{h \ne s_k} d^k_h \, \mathrm{dist}^k_h(t-1)$
$\le D(t-1) + \varepsilon \varphi (1+\varepsilon) \alpha(t-1)$














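Denote the primal objective value $\sum_k \lambda_k$ (corresponding to the possibly infeasible flows $x^k$) after the iteration $t$ by $Z(t)$ (with $Z(0) = 0$). Obviously, it holds that $Z(t) - Z(t-1) = \varphi$. Thus, $D(t) \le D(t-1) + \varepsilon(1+\varepsilon)(Z(t) - Z(t-1))\,\alpha(t-1)$, and therefore

$D(t) \le D(0) + \varepsilon(1+\varepsilon) \sum_{j=1}^{t} (Z(j) - Z(j-1))\,\alpha(j-1)$ (9.5)

Consider the length function $l(t) - l(0) = l(t) - \delta$. Then, for the dual objective, we have $D(l(t) - l(0)) = D(t) - D(0)$, and for the primal objective, we get $\alpha(l(t) - l(0)) \ge \alpha(t) - L\delta \max_k \sum_{i \ne s_k} d^k_i$, where $L$ denotes the maximum number of edges on a simple path from a source $s_k$ to one of its sinks $i \in V$. Hence,

$Z_{opt} \le \frac{D(l(t) - l(0))}{\alpha(l(t) - l(0))} \le \frac{D(t) - D(0)}{\alpha(t) - L\delta \max_k \sum_{i \ne s_k} d^k_i}$ (9.6)

Thus,

$\alpha(t) \le L\delta \max_k \sum_{i \ne s_k} d^k_i + \frac{\varepsilon(1+\varepsilon)}{Z_{opt}} \sum_{j=1}^{t} (Z(j) - Z(j-1))\,\alpha(j-1)$ (9.7)

Denote the right hand side of Inequality 9.7 by $A(t)$. Then,

$A(t) = A(t-1) + \frac{\varepsilon(1+\varepsilon)}{Z_{opt}} (Z(t) - Z(t-1))\,\alpha(t-1)$
$\le A(t-1) \left(1 + \frac{\varepsilon(1+\varepsilon)}{Z_{opt}} (Z(t) - Z(t-1))\right)$
$\le A(t-1)\, e^{\varepsilon(1+\varepsilon) \frac{Z(t) - Z(t-1)}{Z_{opt}}}$
$\le A(0)\, e^{\varepsilon(1+\varepsilon) \frac{Z(t)}{Z_{opt}}}$

because of $1 + x \le e^x$ for all $x \ge 0$ and $Z(0) = 0$. Now consider the last iteration $t$. Then, $\alpha(t) \ge 1$. With $A(0) = L\delta \max_k \sum_{i \ne s_k} d^k_i$, we get

$1 \le \alpha(t) \le L\delta \max_k \sum_{i \ne s_k} d^k_i \; e^{\varepsilon(1+\varepsilon) \frac{Z(t)}{Z_{opt}}}$ (9.8)

Thus,

$\frac{Z(t)}{Z_{opt}} \ge \frac{1}{\varepsilon(1+\varepsilon)} \ln \frac{1}{L\delta \max_k \sum_{i \ne s_k} d^k_i}$ (9.9)

When setting $\delta$ according to Lemma 9.3, a simple calculation shows that for the scaled objective value we get

$\frac{Z(t)}{\sigma} \ge Z_{opt} \frac{(1-\varepsilon) \ln(1+\varepsilon)}{\varepsilon(1+\varepsilon)} \ge Z_{opt} (1 - 3\varepsilon)$ (9.10)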

for all ε
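The parameter choice of Lemma 9.3 can be evaluated numerically as in the following sketch. It assumes the reading $\tau = (1+\varepsilon)/\rho$, $\delta = \tau (\tau L \max_k \sum_{i \ne s_k} d^k_i)^{-1/\varepsilon}$ and $\sigma = \log_{1+\varepsilon}(\tau/\delta)$, with $\rho$ the smallest positive demand; the function name `fptas_parameters` is hypothetical. Since $\delta$ becomes extremely small for small ε, the sketch returns its logarithm.

```python
import math


def fptas_parameters(eps, demands, max_path_edges):
    """Return (rho, tau, log_delta, sigma) for the approximation scheme.

    demands:        one list of sink demands per commodity
    max_path_edges: L, the maximum number of edges on a simple
                    source-to-sink path
    """
    positive = [[d for d in dk if d > 0] for dk in demands]
    rho = min(min(dk) for dk in positive)        # smallest positive demand
    total = max(sum(dk) for dk in positive)      # max_k sum_i d^k_i
    tau = (1 + eps) / rho
    log_base = math.log(tau * max_path_edges * total)
    # delta = tau * (tau * L * total) ** (-1 / eps), kept in log-space
    log_delta = math.log(tau) - log_base / eps
    # sigma = log_{1+eps}(tau / delta) = log_base / (eps * ln(1 + eps))
    sigma = log_base / (eps * math.log1p(eps))
    return rho, tau, log_delta, sigma
```

Under these assumptions $\sigma$ grows like $\varepsilon^{-2}$, which is where the $\varepsilon^{-2}$ factor in the running time originates.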




Proof of Theorem 9.1: In Lemmas 9.2 and 9.3, we have shown that the algorithm returns a

feasible flow of the desired accuracy. It remains to show that the running time is polynomial.

The running time is dominated by the operations in Lines 6, 8, 10 and 17. Setting $\nu := \min_k \sum_{i \ne s_k} d^k_i$ and $\mu := \log_{1+\varepsilon} \frac{1+\varepsilon}{\nu\delta} \le \sigma$, the following holds:

6) Since $\hat{\alpha}$ is initially set to $\nu\delta$, and $\hat{\alpha} < 1 + \varepsilon$ at the end of the computation, we know that the outer while loop is executed at most $\log_{1+\varepsilon} \frac{1+\varepsilon}{\nu\delta} = \mu$ times. If we order the commodities according to the corresponding source nodes $s_k$, we have to compute Line 6 at most $n\mu$ times. Thus, this line adds a workload of $O(n \sigma T_{SP})$.



8) Obviously, every execution of Line 8 may require time O





. However, the simple estima-

tion of a cost of kn per outer while iteration can be strengthened by observing that in each

such iteration all flow destinations will be investigated exactly once. Thus, for every outer

while iteration this line adds a workload of O







. For every inner while iteration the work

to be done in Line 8 is dominated by Line 15. Thus, it is sufficient to add a workload of







for Line 8.

9,10) At first, the computation of the current congestion $c^k_{ij}$ on each edge $(i,j) \in T$ seems to require a running time quadratic in $n$. However, when computing the shortest path tree $T$, we get a topological ordering of $T$ for free (simply by using the ordering in which Dijkstra’s algorithm labels the nodes). Using this topological ordering, we can compute the values $\sum_{h \in T_j} d^k_h$ bottom up, requiring a total running time in $O(n)$. Thus, Lines 9 and 10 are dominated by Line 17.

17) Whenever a shortest-tree computation results in a flow change along the computed tree, we know from Lemma 9.1 that at least for one edge the length increases by a factor of $1 + \varepsilon$. At the beginning, all lengths are equal to $\delta$, and the proof of Lemma 9.2 shows that, at the end of the algorithm, it holds that $l_{ij} \le \frac{1+\varepsilon}{\rho}$. Therefore, Line 17 is executed at most $m \log_{1+\varepsilon} \frac{1+\varepsilon}{\delta\rho} = m\sigma$ times. Thus, Line 17 adds a workload of $O(m \sigma T_{SP})$.
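The observation used for Lines 9 and 10 can be illustrated with a short Python sketch. It is not the original implementation: the function names and the adjacency-list layout are our own assumptions, and node labels are assumed to be comparable (e.g. integers) so that ties in the heap are resolved.

```python
import heapq


def shortest_path_tree(adj, source):
    """Dijkstra's algorithm on adj: node -> list of (neighbor, length).

    Returns (dist, parent, order), where `order` lists the nodes in the
    sequence in which they are settled. Every node is settled after its tree
    predecessor, so `order` is a topological ordering of the tree.
    """
    dist = {source: 0.0}
    parent = {source: None}
    order, settled = [], set()
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        order.append(v)
        for w, edge_length in adj[v]:
            candidate = d + edge_length
            if w not in dist or candidate < dist[w]:
                dist[w] = candidate
                parent[w] = v
                heapq.heappush(heap, (candidate, w))
    return dist, parent, order


def subtree_demands(parent, order, demand):
    """Sum the demands of each sub-tree in one bottom-up pass, i.e. in O(n)."""
    c = {v: demand.get(v, 0.0) for v in order}
    for v in reversed(order):
        if parent[v] is not None:
            c[parent[v]] += c[v]
    return c
```

The value `c[j]` is the congestion that the tree routing puts on the tree edge entering `j`.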



Putting the results together and assuming m



n, we get a running time in



nσTSP







mσTSP







mTSP









(9.11)

Thus, when we set δaccording to Lemma 9.3, we achieve a running time in



mTSP











logn



log



max

k∑









ε2









mTSP









ε2



(9.12)


9.3.2 Implementation Details

The previous theoretical work gives us an approximation scheme for a lower bound on the bisection

width of a graph. Note that, with respect to Section 9.2, we know that









.1Therefore, for

any connected graph we can achieve an ε-approximation of the VarMC-bound on the bisection

width in time O







ε2



Even though our theoretical work does not give any further guarantees, in practice, the approximation scheme presented can be improved. Most importantly, the final scaling factor $\sigma$ used to make the final flow primal feasible should not be determined by the formula given in Lemma 9.2. Instead, we can easily determine a scaling factor $\hat{\sigma} \le \sigma$ by computing the congestion $c_{ij}$ of the final flow on each edge $\{i,j\} \in E$ and setting $\hat{\sigma} := \max_{\{i,j\} \in E} \frac{c_{ij}}{u_{ij}}$. This improvement does not affect the overall running time of the algorithm.
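Computing this practical scaling factor takes a single pass over the final flows, as the following sketch shows. The function name `feasible_scaling` and the data layout (one dictionary of arc flows per commodity) are hypothetical.

```python
def feasible_scaling(flows, capacity):
    """Scale possibly infeasible flows so that every capacity is respected.

    flows:    list with one dict (i, j) -> flow value per commodity
    capacity: dict frozenset({i, j}) -> capacity of the undirected edge
    Returns (sigma_hat, scaled_flows).
    """
    load = {}
    for x in flows:
        for (i, j), value in x.items():
            edge = frozenset((i, j))
            load[edge] = load.get(edge, 0.0) + value
    # Largest ratio of edge load to edge capacity over all used edges.
    sigma_hat = max((load[e] / capacity[e] for e in load), default=0.0)
    if sigma_hat == 0.0:
        return 0.0, [dict(x) for x in flows]  # nothing was routed
    scaled = [{arc: value / sigma_hat for arc, value in x.items()}
              for x in flows]
    return sigma_hat, scaled
```

The sender volumes $\lambda_k$ have to be divided by the same factor to keep the mass balance constraints valid.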

In practice, we may consider to do even more and to apply an idea that we call enhanced

scaling: before scaling the final flow, at the end of the algorithm, for each commodity kwe

obtain a flow xkand possibly also a scalar λk











0. In order to construct a feasible

flow, instead of scaling all $x^k$ equally, we could also set up another optimization problem to find scalars $\xi_k$ that solve the following LP:

Maximize $\sum_k \lambda_k \xi_k$
subject to $\sum_k \xi_k (x^k_{ij} + x^k_{ji}) \le u_{ij}$ for all $\{i,j\} \in E$
$\xi \ge 0$

That way, the final bound obtained can be improved in practice. However, this gain has to be paid for by an additional computational effort that, in theory, dominates the overall running time. Nevertheless, as we shall see in Section 9.6, when solving an instance of the Graph Bisection Problem, the effort is worthwhile.

9.4 Cost-Decomposition Approach

Another way of computing the VarMC-bound on the bisection width is to use cost-based de-

composition techniques, that were originally developed for the Min-Cost Multicommodity Flow

Problem. The idea is to relax the capacity constraints (another term used in the literature is

bundle constraints) in order to decompose the problem into a set of Shortest Path Problems.

Recall from Section 9.2 that we need to compute flows that guarantee a certain cutflow $\hat{F}$

while keeping the maximum congestion on any edge within a given limit $\hat{C}$. More formally, we

1Without going into details here, we would like to add that it even holds that





n2, and this remains true even

when new commodities are added when branching.


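need to find flows $x^k$ such that

(LP 1)
$N x^k = \lambda_k d^k$ for all $1 \le k \le K$
$\sum_k (x^k_{ij} + x^k_{ji}) \le u_{ij} \hat{C}$ for all $\{i,j\} \in E$
$\sum_k \lambda_k \ge \hat{F}$
$x \ge 0$, $\lambda \ge 0$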



We will investigate a transformation of this problem into an optimization problem, namely by

trying to minimize the congestion while guaranteeing that the required cutflow is achieved. We

develop an approach on that optimization problem using an integration of column generation and

Lagrangian relaxation.

9.4.1 Column Generation

The path-based formulation in Section 9.3.1 could serve as the basis of a column generation approach. However, taking into account the large number of source-sink pairs that we are facing when computing the VarMC-bound on the bisection width of a given graph (it is in $\Theta(n^2)$), we would have to cope with an LP with extremely many constraints. For example, for the DeBruijn 9, the simplex tableau would have more than $2^{18}$ rows. In contrast to the number of

columns that can be controlled first by generating only columns with negative reduced costs and

second by successive matrix compressions, such a huge number of rows cannot be handled effi-

ciently. Thus, in practice we cannot afford to generate columns that represent paths in the graph,

even though this would be advantageous with respect to the total number of columns that have to

be generated in order to achieve a near-optimal solution. For the same reason, a master problem

consisting of key paths and cycles as it was proposed in [12] cannot be handled efficiently for our

application. Thus, we will use a master problem that is based on trees.

Let Mk











denote the set of all trees rooted at skand routing the demand of

commodity kto its sinks (where





Gkdenotes the number of all such tree routings). Define





kMk, and let ck







∑h







hdenote the congestion on an edge









g(i.e., iis the

unique predecessor of jin Tk

g) for all 1



Kand Tk



Mk. Then the problem can be written

as Minimize LPC



subject to ∑



K∑



Gkξgck







uijC

 







∑



K∑



Gkξg



Starting with a subset ˆ



M, we solve the reduced master problem. Using dual variables rij



for the capacity constraints and f



0 for the cut-flow restriction, we generate new columns with


negative reduced costs by solving the sub-problem

Minimize ∑

















rij





subject to Nxk









Note that the sub-problem decomposes into Ksingle-source Shortest Path Problems with non-

negative edge costs. We can prune the search, if LPC



The process of generating columns and solving the master problem is iterated until we cannot

compute trees with associated negative reduced costs anymore or until a master iteration limit is

reached. The number of columns in the master matrix is controlled by a frequent compressing

step that reduces the number of columns with respect to the current associated reduced costs.

However, our experiments showed that the compression must not be carried out too aggressively,

because we need a fairly large number of columns in the master matrix in order to keep the total

number of master iterations within reasonable limits. This is a clear drawback of the tree-based

formulation of the master problem that results in a relatively high solution time for the master

problem.

Thus, we try to keep the number of master iterations small by adding a whole set of columns

between two master problem solutions. The only question that must be answered is how to

generate meaningful new columns without the help of new dual variables. We use Lagrangian

relaxation for this purpose, first on a min-congestion formulation of the problem, and second, on

a max-cutflow representation. The idea to use Lagrangian relaxation to generate new columns is

motivated by the fact that the optimal Lagrangian multipliers in the min-congestion formulation

are also optimal dual values for the column generation procedure and vice versa.

The whole procedure works as follows: we start a subgradient optimization of the Lagrangian dual and achieve an upper bound for the cutflow or a lower bound on the congestion, respectively. At the same time, we feed the tree-flows computed in the successive Lagrangian sub-problems into the matrix of the master problem. If the upper bound on the maximum cutflow is smaller than $\hat{F}$, or if the lower bound on the minimum congestion is greater than $\hat{C}$, respectively, we have proven that the VarMC-bound does not allow us to prune the current search node, and we can branch right away. Otherwise, we solve the master problem and achieve a feasible flow that yields an upper bound on the congestion. If we achieve a flow with an associated congestion smaller than $\hat{C}$, we can prune the current search node. Otherwise, we restart the Lagrangian sub-routine with the current dual values again. This process is iterated until we find optimal flows or an iteration limit is reached.

9.4.2 Lagrangian Relaxation Based Column Generation

To complete the description, we briefly present the two Lagrangian relaxations that we consider

to generate columns in a column generation framework in more detail.


By relaxing the capacity constraints in LP 1 using Lagrangian multipliers $r_{ij} \ge 0$, we get a max-cutflow formulation

Maximize $\sum_k \left( \lambda_k - \sum_{\{i,j\} \in E} r_{ij} (x^k_{ij} + x^k_{ji}) \right) + \hat{C} r^T u$
subject to $N x^k = \lambda_k d^k$ for all $1 \le k \le K$
$x \ge 0$, $\lambda \ge 0$

Or, we can aim at a min-congestion formulation

Minimize $\sum_k \sum_{\{i,j\} \in E} r_{ij} (x^k_{ij} + x^k_{ji}) + C (1 - r^T u)$
subject to $N x^k = \lambda_k d^k$ for all $1 \le k \le K$
$\sum_k \lambda_k \ge \hat{F}$
$x \ge 0$, $\lambda \ge 0$, $C \ge 0$

In both cases, we need to solve $K$ Single-Source Shortest Path Problems again. Thus, both formulations allow us to use the shortest path trees computed in the Lagrangian sub-problems to generate columns for the master problem. Note that both Lagrangian sub-problems may be unbounded. This problem is overcome by setting upper bounds on $\lambda_k$ (for example the out-capacity of $s_k$) or on $C$ (for example $1.1\,\hat{C}$), respectively.

The update of the Lagrangian multipliers can be done by different subgradient algorithms using different formulas for the computation of the new search direction $d_t$. Let $s_t$ denote a subgradient in iteration $t$. We can set $d_t := s_t$ (pure subgradient); or $d_t := \alpha d_{t-1} + s_t$, whereby $0 < \alpha < 1$ is fixed (so-called Crowder rule [46]); or we may set $d_t := \alpha_t d_{t-1} + s_t$, with $\alpha_t$ set to a positive value if $s_t^T d_{t-1} < 0$, and $\alpha_t := 0$ otherwise (modified Camerini-Fratta-Maffioli rule [28]); another possibility is to set $d_t := \alpha d_{t-1} + (1 - \alpha) s_t$, whereby $0 < \alpha < 1$ is fixed (so-called volume algorithm [11]). All variants will be evaluated in the numeric section.
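The four direction rules can be written down compactly as in the following sketch. The function names are hypothetical. For the modified Camerini-Fratta-Maffioli rule we assume the commonly used coefficient $\alpha_t = \lVert s_t \rVert / \lVert d_{t-1} \rVert$ whenever $s_t^T d_{t-1} < 0$; this choice is our assumption and may differ from the variant used in the experiments.

```python
import math


def _norm(v):
    return math.sqrt(sum(a * a for a in v))


def pure_direction(s, d_prev):
    """Pure subgradient: ignore the search history."""
    return list(s)


def crowder_direction(s, d_prev, alpha):
    """Crowder rule: fixed weight 0 < alpha < 1 on the previous direction."""
    return [alpha * d + g for g, d in zip(s, d_prev)]


def modified_cfm_direction(s, d_prev):
    """Deflect only if the subgradient points against the last direction.

    Assumption: alpha_t = ||s|| / ||d_prev|| in that case, 0 otherwise.
    """
    dot = sum(g * d for g, d in zip(s, d_prev))
    norm_prev = _norm(d_prev)
    alpha_t = _norm(s) / norm_prev if dot < 0 and norm_prev > 0 else 0.0
    return [alpha_t * d + g for g, d in zip(s, d_prev)]


def volume_direction(s, d_prev, alpha):
    """Volume algorithm: convex combination of history and subgradient."""
    return [alpha * d + (1 - alpha) * g for g, d in zip(s, d_prev)]
```

All four functions return a new list, so the caller keeps the previous direction unchanged until the multipliers have been updated.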

Interestingly, there are some similarities to the approximation scheme presented in Sec-

tion 9.3. In both cases, we compute a sequence of upper and lower bounds. In the approxi-

mation scheme as well as in column generation and Lagrangian relaxation we compute shortest

path trees with respect to some changing length function. The only difference is how we change

that length function. In the approximation scheme, it is increased exponentially with respect to

the current congestion on an edge. In the subgradient algorithm for Lagrangian relaxation, it is

changed with respect to search directions that reflect a current subgradient and possibly parts of

the search history. In column generation, new lengths are simply set to the current dual values in

the master problem. Thus, we consider an experimental comparison of these different strategies

of interest.


9.5 A Branch & Bound Algorithm

Our main goal is the computation of exact solutions for Graph Bisection Problems. Therefore, we construct a branch & bound algorithm using the described VarMC-bound as lower bound for the problems. A detailed description of the branch & bound implementation can be found in [197]. In the following, we give a brief survey on the main ideas:

First, we heuristically compute a graph bisection using PARTY [172]. Since the initial solution obtained is optimal in most cases, usually we only need to prove optimality. A pure depth first search tree traversal is sufficient for that purpose. The branching is done on the decision whether two specific vertices $\{v,w\}$ stay in the same partition (join) or if they are separated (split). A join is performed by adapting the graph by merging these two vertices into one vertex. A split is performed by introducing an additional commodity from vertex $v$ to vertex $w$ whose entire scaled amount is known to cross the cut. Thus, it can be added to the CutFlow completely.

The selection of the pair $\{v,w\}$ for the next branching is done with the help of an upper bound on the lower bound. In addition to this idea described in [197], the selection is restricted to pairs $\{v,w\}$ (if any) where one node (say, $v$) has been split with some other node $x \ne w$ before. Then the split of $\{v,w\}$ implies a join of $\{x,w\}$, of course.

Problem reduction is done by improving the so-called “Forcing Moves” strategy described in [197], Lemma 2. In the original version, only adjacent vertices $\{v,w\}$ were considered. Now, we look at a residual graph with edge-capacities that equal the amount of capacity which is not used by a given VarMC solution. Two vertices $v$ and $w$ can be joined if the maximal flow in the residual graph exceeds a specific value.
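The join operation, merging two vertices into one, can be sketched on a weighted adjacency map as follows. The function name `join` and the nested-dictionary graph representation are hypothetical, and the sketch only covers the edge weights; the number of original vertices represented by a merged vertex would have to be tracked separately to keep the two sides of the bisection balanced.

```python
def join(graph, v, w):
    """Return a copy of `graph` in which vertex w is merged into vertex v.

    graph: dict node -> dict neighbor -> edge weight (symmetric).
    Parallel edges that arise from the merge are combined by adding their
    weights; an edge between v and w vanishes inside the merged vertex.
    """
    merged = {a: dict(nbrs) for a, nbrs in graph.items() if a != w}
    for nb, weight in graph[w].items():
        if nb == v:
            continue
        merged[v][nb] = merged[v].get(nb, 0) + weight
        merged[nb][v] = merged[nb].get(v, 0) + weight
    for nbrs in merged.values():
        nbrs.pop(w, None)  # drop all references to the removed vertex
    return merged
```

Because the input graph is copied, the original graph is still available when the search backtracks to the alternative split branch.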

9.6 Numerical Results

In this section, we present the results of our computational experiments. Before comparing the

algorithms with each other and with the semi-definite bound developed in [133], we first present

some experiments regarding the effects of different parameter settings for the FPTAS in Sec-

tion 9.3 and the cost-decomposition approach in Section 9.4. The following experiments were

executed on systems with INTEL Pentium-III, 850 MHz, and 512 MB memory. To show the

performance on different kinds of graphs, we use four different sets of 20 randomly generated

graphs: The set RandPlan contains random maximal planar graphs with 100 vertices; the graphs

are generated using the same principle as it is used in LEDA [153]. Benchmark set RandReg con-

sists of random regular graphs with 100 vertices and degree four; for its generation, the algorithm

from Steger and Wormald [206] is used. The set Random contains graphs with 44 vertices where

every pair $\{v,w\}$ of vertices is adjacent with probability 0.2. Finally, the set RandW consists of complete

graphs with 24 vertices where every edge has a random weight in the interval







In most works on exact graph partitioning (see e.g. [27, 133]), sets like Random and RandW are used for the experiments. We added the sets of random regular and random planar graphs here, because we believe that more structured graph classes should also be considered taking into account their bigger relevance for practical applications.

[Figure: absolute difference to the exact value (log-scaled) over the iteration number (log-scaled), showing the primal value, the enhanced primal value and the dual value for ε = 0.025 and ε = 0.25.]

Fig. 9.1: Progression of the bounds with ε = 0.025 and ε = 0.25.
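A benchmark graph in the style of the set Random can be generated as in the following sketch, which draws every pair of vertices independently with the given probability. The function name `random_graph` and the optional `seed` parameter are our own additions for reproducibility; the original generator is not specified beyond the edge probability.

```python
import random
from itertools import combinations


def random_graph(n, p, seed=None):
    """Return the edge list of a graph on n vertices.

    Every pair {v, w} with v < w becomes an edge independently with
    probability p.
    """
    rng = random.Random(seed)
    return [(v, w) for v, w in combinations(range(n), 2) if rng.random() < p]
```

With `n = 44` and `p = 0.2`, the expected number of edges is $0.2 \cdot \binom{44}{2} = 189.2$.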

9.6.1 Approximating Lower Bounds

First, we show the results regarding the FPTAS developed in Section 9.3. To illustrate the behavior of the algorithm, Figure 9.1 shows the progression of the primal and dual value, and the enhanced primal value for ε = 0.025 and ε = 0.25. Recall from Section 9.3.2 that the enhanced primal value is achieved by solving a linear program to compute the optimal scaling factors of the final flows. The run was made on one specific RandReg graph, but similar results are obtained by using any of the other graphs we considered in our experiments. Thus, we consider this example as representative.

Interestingly, it appears that the improvement caused by enhanced scaling is a specific factor that is almost independent of the number of iterations. As one might have expected, a closer look at the data shows that the improvement usually gets slightly smaller with more iterations, an effect that is most visible for graphs in the Random set. Only for planar graphs did we find that the gain from enhanced scaling becomes clearly greater with increasing iterations. Regarding the dependence on the chosen ε, we find that for the sets RandReg and RandPlan the improvement achieved by enhanced scaling becomes smaller the greater ε is, and for (weighted) random graphs it is almost constant.

In Figure 9.1, for both settings of ε, the dual value converges to its final error fairly quickly and then remains nearly unchanged. When comparing the primal values with ε = 0.25 and ε = 0.025, we find that the first has a smaller error at the beginning, but is not improved anymore after a few iterations. Using ε = 0.025, the convergence of the primal value is slower, but it reaches a clearly better result at the end.


ε        ε̂        ε̂_s       ε̂_d      num its.     time
0.0125   0.0008   0.00035   0.0022    361,014   3360.0
0.025    0.0019   0.00091   0.0042     89,026    888.3
0.05     0.0043   0.0021    0.0084     21,642    232.2
0.1      0.011    0.0062    0.016       5,110     56.0
0.2      0.029    0.016     0.033       1,130     12.9
0.4      0.069    0.040     0.057         213      2.4
0.6      0.12     0.067     0.057          66      0.8
0.8      0.20     0.11      0.057          23      0.3

Tab. 9.1: Real errors and computational effort depending on the given ε.

The best choice of ε for use in a branch & bound environment is a trade-off. If ε is too big, the bound approximation computed is too bad, and the number of sub-problems in the search tree explodes. On the other hand, if ε is chosen too small, the bound approximation for each sub-problem is too time consuming. Table 9.1 shows this trade-off for the same graph as in Figure 9.1. Again, we consider this example as representative. We denote the approximation parameter the algorithm is started with by ε; ε̂ is the final error relative to the real VarMC-bound value, ε̂_s denotes the final error when enhanced scaling is used, and ε̂_d gives the final error of the dual value. Furthermore, the number of iterations and the running time in seconds are given.

It becomes clear that the error that is actually reached is much better than the approximation guarantee ε which the algorithm is started with. The table also shows that the theoretical factor of 1/ε² in the running time of the algorithm is confirmed by the experiments.
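The 1/ε² behavior can be checked directly on the iteration counts of Tab. 9.1. The following sketch (the variable and function names are our own, not part of the original implementation) estimates, for each pair of consecutive runs in which ε doubles, the exponent k in the relation iterations ~ ε^(-k).

```python
import math

# (ε, number of iterations) taken from Tab. 9.1; each ε is twice the previous one.
runs = [(0.0125, 361014), (0.025, 89026), (0.05, 21642),
        (0.1, 5110), (0.2, 1130), (0.4, 213)]

def doubling_exponents(data):
    """For consecutive runs, estimate k in the relation iterations ~ ε^(-k)."""
    return [math.log(its_a / its_b) / math.log(eps_b / eps_a)
            for (eps_a, its_a), (eps_b, its_b) in zip(data, data[1:])]

exponents = doubling_exponents(runs)
for (eps, _), k in zip(runs, exponents):
    print(f"ε = {eps:<6} -> 2ε: estimated exponent {k:.2f}")
```

For the smaller values of ε the estimated exponent stays close to 2, in line with the theoretical factor; for the largest values it drifts somewhat above 2.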

Note that the running time for the enhanced scaling version is not explicitly stated in the table. It increases the running time by only a hundredth of a second, and is therefore negligible in our comparison. In this experiment, however, the linear program for the computation of enhanced scaling factors was only solved once at the end of the approximation. When using the FPTAS for lower bound computations in a branch & bound, we are also interested in intermediate primal values that may allow us to prune the current choice point even before the approximation of the current VarMC-bound is finished. For this setting, we found that it is a good choice to compute enhanced primal values only every hundredth iteration, which hardly increases the overall running time but improves the approximation quality significantly.
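The interplay of cheap intermediate bounds and occasional enhanced bounds can be sketched as follows. This is only a schematic illustration under stated assumptions: `step` and `enhance` are hypothetical callables standing in for one FPTAS iteration and for the LP-based enhanced scaling, every value they return is assumed to be a valid lower bound for the current sub-problem, and pruning is assumed to be allowed as soon as the lower bound reaches the incumbent value.

```python
from typing import Callable, Tuple

def approximate_with_early_pruning(
    step: Callable[[int], float],       # hypothetical: lower bound after iteration it
    enhance: Callable[[float], float],  # hypothetical: enhanced (improved) lower bound
    incumbent: float,                   # value of the best known solution
    max_iterations: int,
    check_every: int = 100,             # enhanced values only every hundredth iteration
) -> Tuple[bool, float, int]:
    """Return (pruned, best lower bound found, iterations used)."""
    best = float("-inf")
    for it in range(1, max_iterations + 1):
        best = max(best, step(it))            # cheap intermediate bound
        if it % check_every == 0:
            best = max(best, enhance(best))   # occasional, more expensive improvement
        if best >= incumbent:                 # sub-problem cannot beat the incumbent
            return True, best, it
    return False, best, max_iterations

# Toy usage with made-up bound functions.
result = approximate_with_early_pruning(
    step=lambda it: it / 100, enhance=lambda b: 1.5 * b,
    incumbent=1.4, max_iterations=1000)
```

In the toy run the plain bound alone is still below the incumbent at iteration 100, whereas the enhanced value computed at that iteration already allows the sub-problem to be pruned.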

In Table 9.2, we give the resulting running times and the number of sub-problems of the

branch & bound algorithm using approximated bounds. The results given are averages over all

20 instances for every set of graphs.

The results without forcing moves show the expected behavior for the choice of ε: the smaller ε is, the smaller is the number of search nodes. This rule is not strict when using forcing moves: looking at the random planar graphs, we see that less accurate solutions of the VarMC-bound can result in stronger forcing moves so that the number of sub-problems may even decrease. The table also shows that the effects of enhanced scaling and forcing moves differ between the classes of graphs and also for changing values of ε. Altogether, the experiments show that setting ε to 50% is favorable, which is a surprisingly high value.




                  ε = 0.025     ε = 0.05      ε = 0.1     ε = 0.25      ε = 0.5        ε = …
     graph      time  subp.  time  subp.  time  subp.  time  subp.  time  subp.  time  subp.
(1)  RandPlan   6513    462  2350    463  1058    468   632    582   438   1536   524   5740
     RandReg    3728     24  1902     29  1001     37   522    101   551    552  1194   4063
     Random     2857    114   727    119   334    129   175    212   230   1124   781  15038
     RandW      1487     47   534     60   217     89   108    297   102   2034   118  11498
(2)  RandPlan   6395    461  2158    461   962    463   587    557   450   1466   562   5273
     RandReg    2192     22  1107     23   561     26   196     37   126     90   163    489
     Random     2620    113   525    114   228    118    98    139    80    280   186   2188
     RandW      1383     37   472     46   176     62    76    150    85    788  1289   3859
(3)  RandPlan   3013    412  1009    406   587    381   133    239    54    117    35     78
     RandReg    2186     22  1083     23   622     25   181     34   117     67   160    173
     Random     2249    107   465    108   209    111    83    121    49    164    47    328
     RandW       737     14   283     17   122     25    44     54    35    188    41    582

Tab. 9.2: Times and sizes of the search trees of the branch & bound algorithm using the approximation algorithm with different values of ε. (1): without enhanced scaling, without forcing moves, (2): with enhanced scaling, without forcing moves, (3): with enhanced scaling and forcing moves.

9.6.2 Lower Bounds using Cost-Decomposition

Now, we evaluate the cost-decomposition approach developed in Section 9.4. Analogously to Table 9.2, Table 9.3 shows the results of the branch & bound algorithm when using the cost-decomposition technique for the computation of the VarMC-bound.

Both the running times and the sizes of the search trees produced by the various subgradient algorithms and Lagrangian formulations differ considerably on the different benchmark sets. Thus, it is not an easy task to draw valid conclusions from these experiments. As a tendency, the max-cutflow formulation looks better than the min-congestion formulation (except for the random planar graphs). When using the max-cutflow formulation, the Crowder rule gives a good overall performance.

9.6.3 Comparison of Lower Bound Algorithms

Having investigated the developed algorithms individually, we now want to compare them with each other and with the semi-definite bound presented in [133]. We start with a presentation of time and quality of the four different lower bound algorithms:



• The VarMC-bound using the ILOG CPLEX 7.0 standard barrier solver [117].

• The VarMC-bound using the max-cutflow decomposition with the Crowder rule.

• The VarMC-bound using the FPTAS with a desired approximation guarantee of 50% and enhanced scaling.

• The semi-definite bound presented in [133] and available as CUTSDP-package (program bis0) at [132]. The program uses parameters maxlarge and maxsmall. According to the setting in [133], we set maxlarge = 1, and maxsmall = 10.

                pure subgr.     Crowder       mod. CFM       Volume
     graph      time  subp.    time  subp.    time  subp.    time  subp.
(1)  RandPlan  11318     85    6626     85   24986     85    7752     85
     RandReg    1192     38     589     28     737     29     568     32
     Random      748    117     694    119     722    116     849    127
     RandW       151     11     128     11     134     10     166     17
(2)  RandPlan   2032     85    1276     85    1831     85    1307     85
     RandReg    3768     64   17320    299    4374     72   17633    292
     Random     1776    160    3366    239    1941    169    6211    418
     RandW      1710    183    1394    139    1591    166    3661    588

Tab. 9.3: Average running times in seconds and average number of search nodes using cost-decomposition without forcing moves. (1): max-cutflow formulation, (2): min-congestion formulation.

In Tables 9.4, 9.5, and 9.6, we apply the different algorithms to grids, tori, shuffle-exchange

graphs, DeBruijn graphs, and graphs stemming from a real-world finite elements application.

The experiments were performed on a SUN Enterprise 450 Model 4400 machine with 1 GB main

memory and a SUN UltraSparc-II 400 MHz processor. The tables give the number of nodes, the

number of edges, the exact bisection width, and, for each algorithm, the bound computed and the

time needed for its computation (in seconds).

First, by comparing the VarMC-bound achieved by CPLEX and the semi-definite bound, we see that the VarMC-bound is indeed superior with respect to its quality on sparse and structured graphs like the ones that we consider here. As a slight drawback, the bound is not suited for disconnected graphs like the BCR graphs ma, m1, and m4. For them, the MVarMC-bound discussed in Section 9.2 yields much better results [197]. Work is in progress that tries to extend the work presented here to the MVarMC-bound as well. However, here we focus on the VarMC-bound only.

For the remaining connected, and especially the larger graphs, we see that the VarMC-bound dominates the semi-definite bound, sometimes quite remarkably. Moreover, we see that the CPLEX barrier solver² always computes the VarMC-bound faster than CUTSDP obtains the semi-definite bound, except for large shuffle-exchange and DeBruijn graphs. For these graphs, the quality of the VarMC-bound is much better, though. Therefore, even when using a standard barrier solver for its computation, the VarMC-bound is clearly preferable to the semi-definite bound on sparse, structured graphs.

² We also experimented with the primal and dual simplex, but, while the dual simplex outperforms the primal simplex algorithm, neither algorithm could compete at all with the interior point solver.

Graph        |V|  |E|  bw         CPLEX       Decomp.  Approx (50%)        CUTSDP
                           Bound  Time   Bound  Time   Bound  Time   Bound  Time
Grid 9x4      36   59   5    4.50    <1    4.50     2    4.32    <1    5.00     1
Grid 10x5     50   85   5    5.00     1    5.00     4    5.00    <1    5.00     3
Grid 10x9     90  161   9    9.00    15    9.00    21    9.00     1    8.12    17
Grid 10x10   100  180  10   10.00    26   10.00    32   10.00     1    8.28    24
Grid 10x11   110  199  11   11.00    16   11.00    54   10.59     2    8.48    30
Grid 11x11   121  220  12   11.48    19   11.48    67   11.11     2    8.98    42
Torus 9x4     36   72  10    8.10    <1    8.10     3    7.78    <1    7.60     1
Torus 10x5    50  100  10   10.00     3   10.00     4    9.84    <1    9.88     3
Torus 10x9    90  180  18   18.00    19   18.00    32   17.56     1   17.22    19
Torus 10x10  100  200  20   20.00    11   20.00    38   18.95     2   17.98    27
Torus 10x11  110  220  22   20.17    15   20.17    48   19.66     2   17.63    30
Torus 11x11  121  242  24   22.18    15   22.00    65   20.62     3   18.50    39

Tab. 9.4: Comparison of bounds on Grids and Tori.

Graph        |V|  |E|  bw         CPLEX       Decomp.  Approx (50%)        CUTSDP
                           Bound  Time   Bound  Time   Bound  Time   Bound  Time
SE 2           4    3   2    2.00    <1    2.00    <1    2.00    <1    2.00    <1
SE 3           8   10   2    2.00    <1    2.00    <1    2.00    <1    2.00    <1
SE 4          16   21   4    3.10    <1    3.10    <1    3.08    <1    3.49    <1
SE 5          32   46   6    5.23    <1    5.20     2    5.10    <1    5.08     2
SE 6          64   93  10    8.91     8    8.87    11    8.69    <1    7.40    21
SE 7         128  190  16   15.12    43   15.01    53   14.73     3   11.31   120
SE 8         256  381  28   26.15   484   25.69   244   24.94    16   18.14   453
DB 2           4    5   4    4.00    <1    4.00    <1    4.00    <1    4.00    <1
DB 3           8   13   4    4.00    <1    4.00    <1    4.00    <1    4.00    <1
DB 4          16   29   6    6.00    <1    6.00    <1    5.89    <1    6.00    <1
DB 5          32   61  10   10.00    <1   10.00     3    9.82    <1    9.70     2
DB 6          64  125  18   16.96     9   16.90    14   16.26    <1   14.94    20
DB 7         128  253  30   28.98    62   28.80    65   27.45     4   22.58    49
DB 8         256  509  54   49.54   652   49.05   322   46.95    21   35.08   443

Tab. 9.5: Comparison of bounds on shuffle-exchange (SE) and DeBruijn (DB) graphs.

Graph        |V|  |E|  bw         CPLEX       Decomp.  Approx (50%)        CUTSDP
                           Bound  Time   Bound  Time   Bound  Time   Bound  Time
BCR ma        54   72   2       0     –       0     –       0     –    1.93     4
BCR mb        74  120   4    3.08     8    3.08     9    3.08     4    3.12    10
BCR mc        74  125   6    5.10     8    5.10    14    5.10     4    5.45    10
BCR md        80  129   4    3.16     9    3.16    11    3.15     4    3.20    13
BCR me        60   96   3    3.00     3    3.00     6    3.00     3    3.00     5
BCR mf        90  146   4    3.21    14    3.21    18    3.21     4    2.86    18
BCR m1       100  155   4       0     –       0     –       0     –    2.46    26
BCR m4        32   50   6       0     –       0     –       0     –    5.68     1
BCR m6        70  120   7    6.36     7    6.36    14    6.36     4    6.03     9
BCR m8       148  265   7    7.00    22    7.00    64    6.96     5    5.98    80

Tab. 9.6: Comparison of bounds on real-world graphs stemming from a finite elements application.

However, for larger graphs containing 500 nodes or more, the memory consumption of CPLEX becomes critical. For example, the bound on DeBruijn 9 could not be computed within 2 GB main memory. As stated in Section 9.4, cost-decomposition can help to cope with that situation. We see that the Lagrangian relaxation based column generation approach yields slightly worse lower bounds, and it also needs more time, except for large shuffle-exchange and DeBruijn graphs. In exchange, the memory requirements are much lower, and the cost-decomposition approach allows us to tackle large graphs like DeBruijn 9 and Shuffle-Exchange 10.

When using the approximation scheme with an approximation guarantee of 50%, we lose

even more of the bound quality. However, in most cases, the bounds obtained are still better

than those computed by CUTSDP, and the computation time is drastically reduced. At the same

time, the memory consumption is comparable to the cost-decomposition approach. Therefore,

the FPTAS developed in Section 9.3 is the algorithm of choice for the bisection of sparse and

structured graphs.
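To make the comparison concrete, the following sketch computes relative gaps for the two largest instances of Tab. 9.5. The helper name `relative_gap` and the choice of rows are ours; the numbers are copied from the table.

```python
# (graph, exact bisection width, CPLEX bound, FPTAS bound with ε = 50%, CUTSDP bound)
# All values are taken from Tab. 9.5.
rows = [("SE 8", 28, 26.15, 24.94, 18.14),
        ("DB 8", 54, 49.54, 46.95, 35.08)]

def relative_gap(reference, bound):
    """Fraction by which a lower bound falls short of a reference value."""
    return (reference - bound) / reference

for name, bw, cplex, fptas, sdp in rows:
    print(f"{name}: FPTAS loses {relative_gap(cplex, fptas):.1%} against the CPLEX value; "
          f"gap to bw: FPTAS {relative_gap(bw, fptas):.1%}, CUTSDP {relative_gap(bw, sdp):.1%}")
```

On both instances the FPTAS bound is only about 5% below the value computed by CPLEX, far less than the guarantee of 50% would permit, and its gap to the exact bisection width is less than half of the gap of the semi-definite bound.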

Next, we embed the three different algorithms for the VarMC-bound into the branch & bound approach with problem reduction as sketched in Section 9.5. Again, the experiments were executed on systems with 850 MHz INTEL Pentium-III processors. Table 9.7 shows the results on the four previously described benchmark sets.

We see that, also within a branch & bound approach, the enhanced approximation algorithm gives the best running times, even though the bounds are worst and the search trees are largest. Recall from the formulation of the master problem in Section 9.4 that the cost-decomposition approach was especially designed to be memory efficient, which makes it less competitive with respect to the running time.

                 CPLEX         Decomp.     Approx. (50%)
graph        time  subp.    time  subp.    time  subp.
RandPlan       74      7     889     26      54    117
RandReg       993     21     557     28     117     67
Random        350     99     612    106      49    164
RandW          30      9     120     11      35    188

Tab. 9.7: Average running times (seconds) and sizes of the search trees using the different methods for computing the VarMC-bound.

Finally, we note that, using the FPTAS from Section 9.3, we are able to compute the bisection widths of DeBruijn 9 (92), Shuffle-Exchange 9 (48), and Shuffle-Exchange 10 (82) with the additional help of the symmetry breaking method SBDD developed in Chapter 4. In our view, these results impressively show the efficiency of the VarMC-bound as well as the FPTAS for its approximation.

9.7 Summary and Future Work

We developed two specialized algorithms for the computation/approximation of the VarMC-

bound on the bisection width of an undirected, edge-weighted graph. The first algorithm is

based on an approximation scheme (FPTAS) for maximum multicommodity flows and yields

an ε-approximation in time O







ε2



. We could show empirically that the real error obtained

is usually much better than the approximation guarantee, especially when using an enhanced

scaling method at the end of the algorithm.

The second algorithm that we developed uses the idea of cost-decomposition in an integration of Lagrangian relaxation and column generation. We compared two different Lagrangian formulations for the generation of columns and four different rules to determine the search direction in a subgradient algorithm. The performance of the different algorithms varies a lot on different graph classes, and it is a hard task to find a set of robust parameter settings that guarantee a stable performance.

When comparing the two algorithms with a barrier LP-solver and a semi-definite program, we found that it is clearly favorable to use the approximation scheme, which yields very good bounds on sparse, structured graphs in very little time. It allowed us to compute the bisection widths of large graphs, such as DeBruijn 9, Shuffle-Exchange 9, and Shuffle-Exchange 10, which were unknown and out of the reach of exact graph bisection algorithms before.

As a subject of future work, we are investigating the possibility of adapting the FPTAS that we developed for the VarMC-bound to an approximation of the MVarMC-bound. We then hope to be able to also tackle disconnected graphs, for which the VarMC-bound is inefficient.

Chapter 10

Conclusion

In the introduction, we observed that there exists an incongruity of the needs for optimization

software on one hand and, on the other hand, the solutions that algorithmic computer science is

able to offer. From an algorithmic point of view, the optimization abilities of today's software

libraries are often more than satisfactory for many real-life applications. However, the need

for complex problem modeling appears as a major obstacle for a broader use of optimization

software.

We tried to improve upon this situation by providing filtering algorithms for higher level

symbolic constraints. They allow real-life problems to be modeled more intuitively while preserving the strong optimization abilities of mathematical programming. We believe that the set of

optimization constraints that we considered covers very important substructures that arise fre-

quently in real-life applications. However, our presentation is not exhaustive, and more work has

to be done to provide practitioners with a more complete set of higher level building blocks for

problem modeling.

Whenever a problem is decomposed into substructures — even when they are larger than

usual — the question arises of how a global view on the entire problem can be achieved. Espe-

cially with respect to tight bounds on the objective, the answer to this question is crucial. We

have proposed to link optimization constraints via well-known decomposition techniques from

operations research. When the user is able to provide a solver with information about the sub-

structures of a problem, we believe that the linking of optimization constraints via CP-based

column generation and CP-based Lagrangian relaxation can also be automated and hidden from

the user.

Regarding symmetry breaking, the method that we proposed does not by itself detect the

symmetries in a problem model. Instead, the representation of a search node and the symmetry

detection function still must be provided. Therefore, a deeper knowledge is required from the

user. However, the idea of thinking about symmetries algorithmically by asking "When are two choice points symmetric?" is much more intuitive than developing a problem model that contains no symmetries. Therefore, we believe that our method can help inexperienced users to cope with symmetry more efficiently.

When tackling the optimization problems that we considered in the second part of this thesis,

the methods and reduction algorithms developed in Part I accelerated the software development

process considerably. Moreover, as we have seen, they can lead to competitive algorithms. However, note that, especially for the Capacitated Network Design Problem, the Social Golfer Problem, and the Airline Crew Assignment Problem, we added problem specific knowledge to improve the efficiency. In our view, as a subject of future work, it would be desirable to generalize the ideas of local Lagrangian cuts, heuristic constraint propagation and the repair techniques for column generation. Finally, the work on the Graph Bisection Problem motivates the question whether approximation algorithms can be exploited for problem reduction as well, which may give rise to a notion of relaxed ε-consistency for optimization constraints.

In the end, a brief note that goes beyond the scientific scope of this thesis. We aimed at giving

a broader access to optimization power. However, we strongly believe that efficiency is no value

by itself. Optimization is to be seen as a tool that can be used and misused. We therefore ask to

exploit it with care and responsibility for the good of the people.

List of Figures

2.1 The figure shows arcs on shortest paths from v1 and to v11 in a DAG. Dashed lines

mark shortest-path arcs from v1, dotted lines those to v11. Solid lines represent

arcs that are in both sets. Consider for example node 7: the shortest path from v1

to 7 is







, and the shortest path from node 7 to v11 is





v11



. Therefore,

a shortest path from v1to v11 via node 7 is





v11



.

2.2 The structure replacing a node in G.

2.3 The figure schematically shows an edge









Ethat must exist according to

Lemma 2.4. Solid lines mark edges in E, and dashed lines mark parts of the

shortest path between v1 and vn. The dotted line between l and vn indicates that

there exists a path between the two nodes that does not visit the edge







. The

alternating lines and dots between l and r indicate that the shortest path from l to vn visits node r. The numbers on top of the nodes give their corresponding DFS numbers, and triangles mark DFS sub-trees.

2.4 The figure schematically shows an edge









Ethat must exist according

to Lemma 2.5. Solid lines mark edges in E, and dashed lines mark parts of the shortest path between v1 and vn. Alternating lines and dots indicate parts of the shortest path from v1 to a node, and dotted lines indicate parts of the shortest path from a node to vn. The proof of Theorem 2.2 shows that the path

























is 2-admissible and does not visit the edge







.

2.5 A directed graph with non-negative arc weights. Assume we are given an upper

bound B = 8. All arcs in the graph are part of an admissible path with costs lower than B, and every admissible path with costs lower than B must visit the arc







. However, there exists a path







that does not visit this arc.

2.6 The figure schematically shows a shortest path tree T rooted at v1. Solid lines

denote arcs in G, dashed lines mark parts of the shortest path P







from v1

to vn. The triangles symbolize shortest path sub-trees. For an edge e

 













, the nodes in V are partitioned into two non-empty sets S_e and S_e^C. If e is removed from the graph, the shortest path from v1 to vn must visit an edge













T.

2.7 (a) The table gives the costs cij of assigning a value xj to a variable Xi. (b) A bipartite graph links variables to values that they can take. Bold numbers and lines mark the optimal solution with objective value 26.

2.8 (a) The new cost matrix cM, and (b) the network NM for the optimal matching from Figure 2.7.

2.9 (a) The changed cost vector c



cM, and (b) the network NM with node potentials π1 and π2. Bold numbers show those assignments that can be eliminated by simple reduced-cost propagation in the presence of a solution with value B = 28.

2.10 (a) Shortest paths from nodes in V1 to nodes in V2 with respect to the reduced costs c. (b) The same shortest paths using the original cost vector cM. (c) The additional costs imposed by an assignment Xi = xj. Bold numbers show those assignments that can be eliminated in the presence of a solution with value B = 28.

2.11 The width of each element is proportional to its weight. The elements are ordered with respect to the efficiencies pi/wi. The leftmost element has the biggest efficiency, and the rightmost the smallest one. s marks the critical item in U1.

2.12 U3 requires the integrality of item s. The figures show U1











, and











.

2.13 The figure illustrates the process of the reduction algorithm presented for KP







. The weight ordering in which the items are tested ensures that the critical item moves monotonically to the right.

4.1 The concept of SBDD.

4.2 DeBruijn networks of dimension 3 (left) and 4 (right). A node is marked by the binary string corresponding to its number. The dashed lines mark the symmetries of the DeBruijn network.

4.3 The search tree when bisectioning DB(8) without breaking any symmetries.

4.4 The search tree for the bisection of DB(8) when breaking all possible symmetries. Chains of choice points with only one successor result from symmetry-based domain filtering.

4.5 The left hand side shows two patterns P



and P



. Each pattern consists of

three weeks (horizontal) of three groups of three players. Unfixed variables are left empty. On the right hand side, the corresponding bipartite graph is shown, containing a node for each week of both patterns. Since a matching of cardinality 3 exists (bold edges), P



is dominated by P



.

4.6 Six out of 40 solutions of 7-queens are unique.

5.1 Constructing a legal reduced-cost optimal roster is equivalent to finding a constrained shortest path in a weighted DAG.

5.2 The entire approach: The inner loop generates columns using dual information, the outer loop solves the master problem.

5.3 Number of choice points versus master iterations (left), and running time versus master iterations (right) for SPC, NRC, and total enumeration. The tests were run with a data instance of type 10-00-20 that was solved to optimality.

5.4 Number of choice points versus master iterations using SPC, NRC with a data set of type 7-0-30.

5.5 The left picture shows time versus the number of calls of the propagation routine using the incremental and the non-incremental implementation of the shortest path constraint. Both versions were stopped after 10000 seconds total CPU time. The experiment was run with a data instance of type 10-00-70. The right picture shows a comparison of NRC (upper curve) and SPC (lower curve) in a time versus quality diagram on a data instance of type 67-165-280.

5.6 Data set with 65 crew members and 959 pairings.

5.7 Data set with 50 crew members and 766 pairings.

5.8 Data set with 7 crew members and 129 pairings.

5.9 Data set with 30 crew members and 279 pairings.

6.1 The automatic recording scenario.

7.1 Optimality proofs: comparison of two strategies when using reduction based on knapsack relaxation: on-the-fly fixing vs. fixing after Lagrange.

7.2 Comparison of different heuristic solvers for the CNDP.

8.1 The residual graph of week 2 from Table 8.1.

8.2 The residual graph of week 2 from Table 8.3.

8.3 The residual graph of player 15 from Table 8.4.

8.4 The residual graph of player 12 from Table 8.5.

8.5 The residual graphs of players 4 (left), 6 (middle) and 12 (right) from Table 8.6.

9.1 Progression of the bounds with ε = 0.025 and ε = 0.25.

List of Tables

2.1 The table gives an overview of the findings in this section. . . . . . . . . . . . . 32

2.2 Characteristics of the four algorithms used in the experiments. . . . . . . . . . . 64

2.3 The pure CP approach for both problem classes. cp is the average number of

choice points, time the average time in seconds for 100 instances of the given size. 65

2.4 Uncorrelated data instances. We give the average numbers for 100 test sets per

size. time is the time in seconds, cp the number of choice points. . . . . . . . . . 65

2.5 Weakly correlated data instances. We give the average numbers for 100 test sets

per size. time is the time in seconds, cp the number of choice points. . . . . . . . 66

2.6 Uncorrelated and weakly correlated data instances. We give the average time per

choice point in milliseconds for 100 test sets per size. . . . . . . . . . . . . . . . 66

2.7 Uncorrelated data. Comparison of running times per choice point for the new

amortized linear time propagation algorithm based on bound U2and the imple-

mentation of MTR. We give the average time per choice point in milliseconds

for 100 test sets per size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

2.8 Uncorrelated data. Comparison of running times for the new amortized linear

time propagation algorithms and implementations of DHR, and MTR. We give

the average time in seconds as well as the number of choice points for 100 test

sets per size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

2.9 Comparison of running times of linU2and MTR on uncorrelated and weakly

correlated data. cp is the number of choice points, time the running time in

seconds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.1 Results of the golfer 4-3-Xproblem. . . . . . . . . . . . . . . . . . . . . . . . . 91

4.2 Results of the golfer 4-4-Xinstance. . . . . . . . . . . . . . . . . . . . . . . . . 92

4.3 Results of the golfer 4-4-4 instance performing additional checks for symmetry

ϕXin search nodes of every q-th depth. . . . . . . . . . . . . . . . . . . . . . . . 92

4.4 Improved results of the golfer 4-4-Xperforming additional checks for symmetry

ϕXin search tree nodes of every 8-th depth. . . . . . . . . . . . . . . . . . . . . 92

195

196

List of Tables

4.5 Solvingn-Queenswithoutbreaking symmetries(sym), withbreaking symmetries

via SBDS, and by avoiding them via SBDD. Computing times are given in seconds. 95

6.1 The table compares the different approaches on three different test sets with 5

classes, 12 hours, 20 channels and different objectives. The time (in seconds) and

the number of choice points are averages for 50 randomly generated instances for

each objective. The average number of programs per instance are between 607.6

and 612.6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

6.2 The table shows a comparison of the performance of the different approaches on

5 test sets with 5 classes and objective CU for various time horizons (in hours)

and channel numbers (ch). Italic numbers give the average time (in seconds) and

the average number of nodes of 50 randomly generated instances in each test set

(avg). Numbers below are: minimum(min), maximum(max), and standard devi-

ation (std) for these 50 instances. The average number of programs per instance

is 315.2 for (12h/20ch), 793.5 (12h/50ch), 607.6 (24h/20ch), 1512.1 (24h/50ch),

and 1782.6 (72h/20ch), respectively. . . . . . . . . . . . . . . . . . . . . . . . . 130

6.3 The table illustrates the performance of the different approaches on very differ-

ent benchmark classes. Each test set contains 50 randomly generated problem

instances. There is an average of 1956.7 programs in the 120h/20ch test set,

1782.6 programs in test set 72h/20ch, and 1423.3 programs in test set 24h/50ch. . 131

6.4 The table shows the performance of the different approaches on subset sum data

sets ranging from 12 hours and 20 channels up to 72 hours and 50 channels. The

average number of programs in the 50 randomly generated instances per test set

is given as parameter p. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

6.5 The table compares the performance of the different algorithms on benchmark

sets with 3 classes and objective CU, each containing 50 randomly generated

problem instances with roughly 1000 programs on average. . . . . . . . . . . . . 132

7.1 The impact of cardinality interval tightening when using the knapsack relaxation

for pruning and problem reduction. Mean, minimum, maximum, and variance of

running time and number of nodes in the branch-and-bound tree are given. . . . . 145

7.2 The impact of the branching variable selection when pruning and filtering is done

with the help of the knapsack relaxation. . . . . . . . . . . . . . . . . . . . . . . 145

7.3 The impact of additional shortest-path filtering when using the knapsack relaxation

for pruning and problem reduction. Branching strategy BR0 and cardinality

interval tightening are used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

7.4 Comparison of the CPLEX branch-and-cut algorithm and Lagrangian relaxation

(pruning and reduction based on the knapsack relaxation plus cardinality interval

tightening). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

8.1 A partial instantiation of the 5-3-2 Social Golfer Problem. . . . . . . . . . . . . . 157

8.2 A more complete partial instantiation of the 5-3-2 Social Golfer Problem. . . . . 157

8.3 A partial instantiation of the 5-4-2 Social Golfer Problem. . . . . . . . . . . . . . 159

8.4 A partial instantiation of the 5-3-7 Social Golfer Problem. . . . . . . . . . . . . . 160

8.5 A partial instantiation of the 4-3-5 Social Golfer Problem. . . . . . . . . . . . . . 161

8.6 A partial instantiation of the 4-3-5 Social Golfer Problem. . . . . . . . . . . . . . 162

8.7 The CPU time needed to compute all unique solutions (in seconds), in brackets

the number of choice points visited, and the time per choice point (in millisec-

onds) when computing all unique solutions. . . . . . . . . . . . . . . . . . . . . 163

8.8 The number of simple and complete symmetry checks, and the percentage of

time spent in these checks when computing all unique solutions. . . . . . . . . . 164

8.9 The number of unique solutions for several social golfer instances. . . . . . . . . 165

9.1 Real errors and computational effort depending on the given ε. . . . . . . . . . . 183

9.2 Times and sizes of the search trees of the branch & bound algorithm using the

approximation algorithm with different ε’s. (1): without enhanced scaling, with-

out forcing moves, (2): with enhanced scaling, without forcing moves, (3): with

enhanced scaling and forcing moves. . . . . . . . . . . . . . . . . . . . . . . . . 184

9.3 Average running times in seconds and average number of search nodes using

cost-decomposition without forcing moves. (1): max-cutflow formulation, (2):

min-congestion formulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

9.4 Comparison of bounds on Grids and Tori. . . . . . . . . . . . . . . . . . . . . . 186

9.5 Comparison of bounds on shuffle-exchange (SE) and DeBruijn (DB) graphs. . . . 186

9.6 Comparison of bounds on real-world graphs stemming from a finite elements

application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

9.7 Average running times (seconds) and sizes of the search trees using the different

methods for computing the VarMC-bound. . . . . . . . . . . . . . . . . . . . . . 188

Bibliography

[1] R.K. Ahuja, T.L. Magnanti, and J.B. Orlin. Network Flows. Prentice Hall, 1993.

[2] C. Albrecht. Provably good global routing by a new approximation algorithm for multi-

commodity flow. International Conference on Physical Design, pp. 19–25, 2000.

[3] E. Andersson, E. Housos, N. Kohl, and D. Wedelin. Crew pairing optimization. Interna-

tional Series in Operations Research and Management Science, 9:228–258, Kluwer Aca-

demic Publishers, 1998.

[4] Y. Aneja, V. Aggarwal, and K. Nair. Shortest chain subject to side conditions. Networks,

13:295–302, 1983.

[5] D. Applegate and W. Cook. A computational study of the job-shop scheduling problem.

ORSA Journal on Computing, 3:149–156, 1991.

[6] K. R. Apt. The Rough Guide to Constraint Propagation. 5th International Conference on

Principles and Practice of Constraint Programming (CP), LNCS 1713:1–23, 1999.

[7] A. Atamtürk, D. Rajan. On Splittable and Unsplittable Capacitated Network Design Arc-

Set Polyhedra. To appear in Mathematical Programming, 2001.

[8] G. Ausiello, G.F. Italiano, A.M. Spaccamela, and U. Nanni. Incremental Algorithms for

Minimal Length Paths. Journal of Algorithms, 12(4): 615–638, 1990.

[9] mediaTV, Technical Description, Axcent AG. http://www.axcent.de.

[10] E. Balas and E. Zemel. An algorithm for large-scale zero-one knapsack problems. Opera-

tions Research, 28:119–148, 1980.

[11] F. Barahona and R. Anbil. The Volume Algorithm: producing primal solutions with a

subgradient algorithm. Mathematical Programming, 87:385–399, 2000.

[12] C. Barnhart, C.A. Hane, E.L. Johnson, and G. Sigismondi. A column generation and

partitioning approach for multi-commodity flow problems. Telecommunication Systems,

3:239–258, 1995.

[13] C. Barnhart, E.L. Johnson, G.L. Nemhauser, M.W.P. Savelsbergh, and P.H. Vance. Branch-

and-price: Column generation for solving huge integer programs. Operations Research,

46(3):316–329, 1998.

[14] C. Barnhart and R.G. Shenoi. An approximate model and solution approach for the long-

haul crew pairing problem. Transportation Science, 32(3):221–231, 1998.

[15] N. Barnier and P. Brisset. Graph Coloring for Air Traffic Flow Management. 4th Interna-

tional Workshop on Integration of AI and OR Techniques in Constraint Programming for

Combinatorial Optimization Problems (CP-AI-OR), pp. 133–147, 2002.

[16] N. Barnier and P. Brisset. Solving the Kirkman’s Schoolgirl Problem in a Few Seconds.

8th International Conference on Principles and Practice of Constraint Programming (CP),

LNCS 2470:477–491, 2002.

[17] J. Beasley and N. Christofides. An Algorithm for the Resource Constrained Shortest Path

Problem. Networks, 19:379–394, 1989.

[18] H. Beringer and B. De Backer. Combinatorial problem solving in constraint logic pro-

gramming with cooperative solvers. Logic Programming: Formal Methods and Practical

Applications, pp. 245–272, Elsevier, 1995.

[19] J.C. Bermond and C. Peyrat. De Bruijn and Kautz networks: a competitor for the hyper-

cube? 1st European Workshop on Hypercubes and Distributed Computers, pp. 279–293,

North-Holland, 1989.

[20] M. Bern and P. Plassmann. The Steiner problem with edge lengths 1 and 2. Information

Processing Letters (IPL), 32:171–176, 1989.

[21] C. Bessière. Arc-consistency and arc-consistency again. Artificial Intelligence, 65:179–

190, 1994.

[22] D. Bienstock, O. Günlük, S. Chopra, and C.Y. Tsai. Minimum cost capacity installation for

multicommodity flows. Mathematical Programming, 81:177–199, 1998.

[23] D. Bienstock. Experiments with a network design algorithm using epsilon-approximate

linear programs. Technical Report, CORC Report 1999-4, 1999.

[24] A. Bockmayr and T. Kasper. Branch and infer: A unifying framework for integer and finite

domain constraint programming. INFORMS Journal on Computing, 10(3):287–300, 1998.

[25] R. Borndörfer and A. Löbel. Scheduling duties by adaptive column generation. Technical

Report, Konrad-Zuse-Zentrum für Informationstechnik Berlin, ZIB-01-02, 2001.

[26] C. Bornstein, A. Litman, B. Maggs, R. Sitaraman, and T. Yatzkar. On the Bisection Width

and Expansion of Butterfly Networks. 1st Merged International Parallel Processing Symposium

and Symposium on Parallel and Distributed Processing (IPPS/SPDP), IEEE,

pp. 144–150, 1998.

[27] L. Brunetta, M. Conforti, and G. Rinaldi. A branch-and-cut algorithm for the equicut

problem. Mathematical Programming, 78:243–263, 1997.

[28] P. Camerini, L. Fratta, and F. Maffioli. On Improving Relaxation methods by Modified

Gradient Techniques. Mathematical Programming Studies, 3:26–34, 1975.

[29] A. Caprara, M. Fischetti, and P. Toth. A heuristic algorithm for the set covering problem.

5th International Conference on Integer Programming and Combinatorial Optimization

(IPCO), LNCS 1084:72–84, 1996.

[30] A. Caprara, F. Focacci, E. Lamma, P. Mello, M. Milano, P. Toth, and D. Vigo. Integrating

constraint logic programming and operations research techniques for the crew rostering

problem. Software – Practice and Experience, 28(1): 49–76, 1998.

[31] A. Caprara, D. Pisinger, and P. Toth. Exact Solution of the Quadratic Knapsack Problem.

INFORMS Journal on Computing, 11:125–137, 1999.

[32] A. Caprara, P. Toth, D. Vigo, and M. Fischetti. Modeling and solving the crew rostering

problem. Operations Research, 46(6):820–830, 1998.

[33] R.D. Carr, L.K. Fleischer, V.J. Leung, and C.A. Phillips. Strengthening Integrality Gaps

for Capacitated Network Design and Covering Problems. 11th Symposium on Discrete

Algorithms (SODA), 2000.

[34] Y. Caseau and F. Laburthe. Solving Various Weighted Matching Problems with Constraints.

3rd International Conference on Principles and Practice of Constraint Programming (CP),

LNCS 1330:17–31, 1997.

[35] Y. Caseau and F. Laburthe. Solving Small TSPs with Constraints. 14th International Con-

ference on Logic Programming (ICLP), pp. 316–330, The MIT Press, 1997.

[36] Y. Caseau and F. Laburthe. Heuristics for large constrained routing problems. Journal of

Heuristics, 5:281–303, 1999.

[37] L. Cavique, C. Rego, and I. Themido. Subgraph ejection chains and tabu search for the

crew scheduling problem. Journal of the Operational Research Society, 50:608–616, 1999.

[38] C. Chu and J. Antonio. Approximation algorithm to solve real-life multicriteria cutting

stock problems. Operations Research, 47(4):495–508, 1999.

[39] H.D. Chu, E. Gelman, and E.L. Johnson. Solving large scale crew scheduling problems.

European Journal of Operational Research, 97:260–268, 1997.

[40] L.W. Clarke and P. Gong. Capacitated Network Design with Column Generation. Research

Report, Georgia Institute of Technology, 1998.

[41] M.B. Cohen, C.J. Colbourn, L.A. Ives, and A.C.H. Ling. Kirkman triple systems of order

21 with non-trivial automorphism group. Mathematics of Computation, 71:873–881, 2001.

[42] C. Colbourn and J. Dinitz. The CRC Handbook of Combinatorial Designs. CRC Press,

1996.

[43] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. The MIT Press,

1990.

[44] T.G. Crainic, A. Frangioni, and B. Gendron. Bundle-based relaxation methods for mul-

ticommodity capacitated fixed charge network design. Discrete Applied Mathematics,

112:73–99, 2001.

[45] T.G. Crainic, M. Gendreau, and J.M. Farvolden. A simplex-based tabu search method for

capacitated network design. INFORMS Journal on Computing, 12(3):223–236, 2000.

[46] H. Crowder. Computational improvements for subgradient optimization. Symposia Mathe-

matica, XIX:357–372, 1976.

[47] CSPLib: a problem library for constraints, maintained by I.P. Gent, T. Walsh, and B. Selman,

http://www-users.cs.york.ac.uk/~tw/csplib/

[48] J. Czyzyk, S. Mehrotra, M. Wagner, and S.J. Wright. PCx user guide (Version 1.1). Technical

Report, Optimization Technology Center, Argonne National Laboratory and Northwestern

University, 1996.

[49] J. Czyzyk, S. Mehrotra, M. Wagner, and S.J. Wright. PCx: An interior-point code for linear

programming. Optimization Methods and Software, 11(2):397–430, 1999.

[50] G.B. Dantzig. Discrete variable extremum problems. Operations Research, 5:266–277,

1957.

[51] G.B. Dantzig and P. Wolfe. The decomposition algorithm for linear programs. Economet-

rica, 29(4):767–778, 1961.

[52] P.R. Day and D.M. Ryan. Flight attendant rostering for short-haul airline operations. Op-

erations Research, 45(5):649–661, 1997.

[53] R.S. Dembo and P.L. Hammer. A reduction algorithm for knapsack problems. Methods of

Operations Research, 36:49–60, 1980.

[54] C. Demetrescu and G.F. Italiano. Fully Dynamic All Pairs Shortest Paths with Real Edge

Weights. 42nd Annual Symposium on Foundations of Computer Science (FOCS), IEEE,

pp. 260–267, 2001.

[55] G. Desaulniers, J. Desrosiers, Y. Dumas, S. Marc, B. Rioux, M.M. Solomon, and F. Soumis.

Crew pairing at Air France. European Journal of Operational Research, 97:245–259, 1997.

[56] J. Desrosiers, Y. Dumas, M.M. Solomon, and F. Soumis. Time constrained routing and

scheduling. Network Routing — Handbooks in Operations Research and Management

Science, 8:35–139, North-Holland, 1995.

[57] I. Dumitrescu and N. Boland. The weight-constrained shortest path problem: preprocess-

ing, scaling and dynamic programming algorithms with numerical comparisons. 17th In-

ternational Symposium on Mathematical Programming (ISMP), 2000.

[58] ECLIPSE. Parc Technologies Limited. http://www.icparc.ic.ac.uk/eclipse/.

[59] H. Everett. Generalized Lagrange multiplier method for solving problems of optimum

allocation of resources. Operations Research, 11:399–417, 1963.

[60] T. Fahle. Cost Based Filtering vs. Upper Bounds for Maximum Clique. 4th International

Workshop on Integration of AI and OR Techniques in Constraint Programming for Combi-

natorial Optimization Problems (CP-AI-OR), pp. 93–107, 2002.

[61] T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint Prop-

agation for Complex Column Generation Subproblems. 17th International Symposium on

Mathematical Programming (ISMP), 2000.

[62] T. Fahle, U. Junker, S.E. Karisch, N. Kohl, M. Sellmann, and B. Vaaben. Constraint pro-

gramming based column generation for crew assignment. Journal of Heuristics, 8(1):59–

81, 2002.

[63] T. Fahle, S. Schamberger, and M. Sellmann. Symmetry Breaking. 7th International Con-

ference on Principles and Practice of Constraint Programming (CP), LNCS 2239:93–107,

2001.

[64] T. Fahle and M. Sellmann. Constraint Programming Based Column Generation with Knap-

sack Subproblems. 2nd International Workshop on Integration of AI and OR Techniques

in Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Pader-

born Center for Parallel Computing, Technical Report tr-001-2000:33–44, 2000.

[65] T. Fahle and M. Sellmann. Cost-Based Filtering for the Constrained Knapsack Problem.

Annals of Operations Research, 115:73–93, 2002.

[66] J. Farvolden, K. Jones, I. Lustig, and W. Powell. Multicommodity Network Flows — The

Impact Of Formulation On Decomposition. Mathematical Programming, 62:95–117, 1993.

[67] J. Farvolden, W. Powell, and I. Lustig. A primal partitioning solution for the arc-chain

formulation of a multicommodity network flow problem. Operations Research,

41(4):669–693, 1993.

[68] D. Fayard and G. Plateau. An algorithm for the solution of the 0-1 knapsack problem.

Computing, 28:269–287, 1982.

[69] U. Feige and R. Krauthgamer. A Polylogarithmic Approximation of the Minimum Bisection.

SIAM Journal on Computing, 31(4):1090–1118, 2002.

[70] R. Feldmann, B. Monien, P. Mysliwietz, and S. Tschöke. A Better Upper Bound on the

Bisection Width of de Bruijn Networks. 14th International Symposium on Theoretical

Aspects of Computer Science (STACS), LNCS 1200:511–522, 1997.

[71] T. Feo and M. Resende. Greedy randomized adaptive search procedures. Journal of Global

Optimization, 6:109–133, 1995.

[72] C.E. Ferreira, A. Martin, C.C. de Souza, R. Weismantel, and L.A. Wolsey. The node

capacitated graph partitioning problem: a computational study. Mathematical

Programming, 81:229–256, 1998.

[73] P.O. Fjällström. Algorithms for graph partitioning: A survey. Linköping Electronic

Articles in Computer and Information Science, http://www.ep.liu.se/ea/cis/1998/010/,

1998.

[74] L.K. Fleischer. Approximating Fractional Multicommodity Flow Independent of the Num-

ber of Commodities. SIAM Journal on Discrete Mathematics, 13(4):505–520, 2000.

[75] L. Fleischer and K.D. Wayne. Fast and simple approximation schemes for generalized flow.

Mathematical Programming, 91(2):215–238, 2002.

[76] F. Focacci, F. Laburthe, and A. Lodi. Local Search and Constraint Programming. Handbook

of Metaheuristics, Kluwer Academic Publishers, to appear.

[77] F. Focacci, A. Lodi, and M. Milano. Solving TSP through the Integration of OR and CP

Techniques. Workshop on Large Scale Combinatorial Optimization and Constraints, Elec-

tronic Notes in Discrete Mathematics, 1998.

[78] F. Focacci, A. Lodi, and M. Milano. Integration of CP and OR methods for Matching

Problems. 1st International Workshop on Integration of AI and OR Techniques in Constraint

Programming for Combinatorial Optimization Problems (CP-AI-OR),

http://www.deis.unibo.it/Events/Deis/Workshops/Proceedings.html, 1999.

[79] F. Focacci, A. Lodi, and M. Milano. Cost-Based Domain Filtering. 5th International

Conference on Principles and Practice of Constraint Programming (CP), LNCS

1713:189–203, 1999.

[80] F. Focacci, A. Lodi, and M. Milano. Cutting Planes in Constraint Programming: An Hy-

brid Approach. 2nd International Workshop on Integration of AI and OR Techniques in

Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), Pader-

born Center for Parallel Computing, Technical Report tr-001-2000:45–51, 2000.

[81] F. Focacci and M. Milano. Global Cut Framework for Removing Symmetries. 7th Inter-

national Conference on Principles and Practice of Constraint Programming (CP), LNCS

2239:77–92, 2001.

[82] F. Focacci and P. Shaw. Pruning sub-optimal search branches using local search. 4th Inter-

national Workshop on Integration of AI and OR Techniques in Constraint Programming for

Combinatorial Optimization Problems (CP-AI-OR), pp. 181–189, 2002.

[83] S. Fortune, J. Hopcroft, and J. Wyllie. The directed subgraph homeomorphism problem.

Theoretical Computer Science, 10(2):111–121, 1980.

[84] A. Frangioni. A Bundle type Dual-ascent Approach to Linear Multi-Commodity Min Cost

Flow Problems. Technical Report, Dipartimento di Informatica, Universita di Pisa, TR-96-

01, 1996.

[85] A. Frangioni. Dual Ascent Methods and Multicommodity Flow Problems. Doctoral Thesis,

Dipartimento di Informatica, Universita di Pisa, TD-97-05, 1997.

[86] M.L. Fredman and R.E. Tarjan. Fibonacci heaps and their uses in improved network

optimization algorithms. Journal of the ACM, 34:596–615, 1987.

[87] M. Gamache, F. Soumis, D. Villeneuve, J. Desrosiers, and E. Gélinas. The preferential

bidding system at Air Canada. Transportation Science, 32(3):246–255, 1998.

[88] M.R. Garey and D.S. Johnson. Computers and Intractability, A Guide to the Theory of

NP-Completeness. Freeman, San Francisco, 1979.

[89] M.R. Garey, D.S. Johnson, and L. Stockmeyer. Some simplified NP-complete graph

problems. Theoretical Computer Science, 1:237–267, 1976.

[90] N. Garg and J. Könemann. Faster and simpler algorithms for multicommodity flow and

other fractional packing problems. 39th Annual Symposium on Foundations of Computer

Science (FOCS), IEEE, pp. 300–309, 1998.

[91] B. Gendron and T.G. Crainic. Relaxations for multicommodity capacitated network design

problems. Technical Report, Centre de recherche sur les transports, Université de Montréal,

CRT-96-05, 1994.

[92] I.P. Gent, W. Harvey, and T. Kelsey. Groups and Constraints: Symmetry Breaking During

Search. 8th International Conference on Principles and Practice of Constraint Program-

ming (CP), LNCS 2470:415–430, 2002.

[93] I.P. Gent and B.M. Smith. Symmetry Breaking During Search in Constraint Programming.

14th European Conference on Artificial Intelligence (ECAI), pp. 599–603, 2000.

[94] I. Ghamlouche, T.G. Crainic, and M. Gendreau. Cycle-based neighbourhoods for

fixed-charge capacitated multicommodity network design. Technical Report, Centre de recherche

sur les transports, Université de Montréal, CRT-2001-01, 2001.

[95] I. Ghamlouche, T.G. Crainic, and M. Gendreau. Path relinking, cycle-based neighbour-

hoods and capacitated multicommodity network design. Technical Report, Centre de

recherche sur les transports, Université de Montréal, CRT-2002-01, 2002.

[96] P.C. Gilmore and R.E. Gomory. A linear programming approach to the cutting stock prob-

lem. Operations Research, 9:849–859, 1961.

[97] F. Glover, D. Klingman, and N.V. Phillips. Network Models in Optimization and Their

Applications in Practice. Wiley, 1992.

[98] A.V. Goldberg, J.D. Oldham, S.A. Plotkin, and C. Stein. An Implementation of a Com-

binatorial Approximation Algorithm for Minimum-Cost Multicommodity Flow. 6th In-

ternational Conference on Integer Programming and Combinatorial Optimization (IPCO),

LNCS 1412:338–352, 1998.

[99] M.C. Golumbic. Algorithmic Graph Theory and Perfect Graphs. Academic Press, New

York, 1991.

[100] M.D. Grigoriadis and L.G. Khachiyan. Fast approximation schemes for convex programs

with many blocks and coupling constraints. SIAM Journal on Optimization, 4:86–107,

1994.

[101] M.D. Grigoriadis and L.G. Khachiyan. Approximate minimum-cost multicommodity

flows. Mathematical Programming, 75:477–482, 1996.

[102] O. Günlük. A branch-and-cut algorithm for capacitated network design problems. Math-

ematical Programming, 86(1):17–39, 1999.

[103] G. Handler and I. Zang. A Dual Algorithm for the Restricted Shortest Path Problem.

Networks, 10:293–310, 1980.

[104] W. Harvey. Symmetry Breaking and the Social Golfer Problem. Workshop on Symmetry

in Constraints (SymCon), 2001.

[105] W. Harvey. Warwick’s Results Page for the Social Golfer Problem.

http://www.icparc.ic.ac.uk/~wh/golf/.

[106] W.D. Harvey and M.L. Ginsberg. Limited discrepancy search. 14th International Joint

Conference on Artificial Intelligence (IJCAI), pp. 607–613, 1997.

[107] M. Held and R.M. Karp. The traveling-salesman problem and minimum spanning trees.

Operations Research, 18:1138–1162, 1970.

[108] M. Held and R.M. Karp. The traveling-salesman problem and minimum spanning trees:

Part II. Mathematical Programming, 1:6–25, 1971.

[109] B. Hendrickson and B. Leland. The Chaco user’s guide: Version 2.0. Technical Report,

Sandia National Laboratories, Albuquerque, SAND94-2692, 1994.

[110] D.S. Hochbaum. Approximation Algorithms for NP-hard Problems. PWS Publishing

Company, 1997.

[111] K.L. Hoffman and M. Padberg. Solving airline crew scheduling problems by branch-and-

cut. Management Science, 39(6):657–682, 1993.

[112] K. Holmberg and D. Yuan. A Lagrangean Heuristic Based Branch-and-Bound Approach

for the Capacitated Network Design Problem. Operations Research, 48:461–481, 2000.

[113] J. Hooker. Unifying optimization and constraint satisfaction. Invited talk at the 16th

International Joint Conference on Artificial Intelligence (IJCAI). Slides available at

http://ba.gsia.cmu.edu/jnh/ijcai.ppt.

[114] P.D. Hudson. Improving the branch and bound algorithm for the knapsack problem.

Queen’s University Research Report, Belfast, 1977.

[115] J.Y. Hsiao, C.Y. Tang, and R.S. Chang. An efficient algorithm for finding a maximum

weight 2-independent set on interval graphs. Information Processing Letters, 43(5):229–

235, 1992.

[116] ILOG CPLEX 6.5. Reference manual and user manual. ILOG, 1999.

[117] ILOG CPLEX 7.0. Reference manual and user manual. ILOG, 2000.

[118] ILOG CPLEX 7.5. Reference manual and user manual. ILOG, 2001.

[119] ILOG Planner 3.3. Reference manual and user manual. ILOG, 1999.

[120] ILOG Solver 4.4. Reference manual and user manual. ILOG, 1999.

[121] ILOG Solver 5.0. Reference manual and user manual. ILOG, 2000.

[122] G.P. Ingargiola and J.F. Korsh. A reduction algorithm for zero-one single knapsack prob-

lems. Management Science, 20:460–463, 1973.

[123] O. Jahn, R. Möhring, and A. Schulz. Optimal routing of traffic flows with length restric-

tions in networks with congestion. Technical Report, TU Berlin, 658-1999, 1999.

[124] E. Johnson, A. Mehrotra, and G. Nemhauser. Min-cut clustering. Mathematical Program-

ming, 62:133–151, 1993.

[125] H. Joksch. The Shortest Route Problem with Constraints. Journal of Mathematical

Analysis and Applications, 14:191–197, 1966.

[126] M. Jünger and D. Naddef (Editors). Computational Combinatorial Optimization. LNCS

2241, 2001.

[127] M. Jünger and S. Thienel. The ABACUS system for Branch and Cut and Price Algorithms

in Integer Programming and Combinatorial Optimization. Software – Practice and

Experience, 30:1325–1352, 2000.

[128] U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework

for Constraint programming based column generation. 5th International Conference on

Principles and Practice of Constraint Programming (CP), LNCS 1713:261–274, 1999.

[129] U. Junker, S.E. Karisch, N. Kohl, B. Vaaben, T. Fahle, and M. Sellmann. A Framework for

Constraint Programming Based Column Generation. 16th International Joint Conference

on Artificial Intelligence (IJCAI), Workshop on Non-Binary Constraints, 1999.

[130] G. Karakostas. Fast Approximation Schemes for Fractional Multicommodity Flow Prob-

lems. 13th Symposium on Discrete Algorithms (SODA), 2002.

[131] D. Karger and S. Plotkin. Adding multiple cost constraints to combinatorial optimiza-

tion problems, with applications to multicommodity flows. 27th Symposium on Theory of

Computing, pp. 18–25, 1995.

[132] S. Karisch. CUTSDP — A toolbox for a cutting-plane approach based on semidefinite

programming. User’s guide, Version 1.0, Department of Mathematical Modeling, Technical

University of Denmark, 10/98, 1998.

[133] S.E. Karisch, F. Rendl, and J. Clausen. Solving graph bisection problems with semidefinite

programming. INFORMS Journal on Computing, 12(3):177–191, 2000.

[134] G. Karypis and V. Kumar. Multilevel algorithms for multi-constraint graph partitioning.

Technical Report, Department of Computer Science, University of Minnesota, TR 98-019,

1998.

[135] P. Klein, S. Plotkin, C. Stein, and E. Tardos. Faster approximation algorithms for the

unit capacity concurrent flow problem with applications to routing and finding sparse cuts.

SIAM Journal on Computing, 23:466–487, 1994.

[136] G. Kliewer, M. Sellmann, and A. Koberstein. Solving the capacitated network design

problem in parallel. 3rd meeting of the PAREO Euro working group on Parallel Processing

in Operations Research (PAREO), 2002.

[137] N. Kohl and S.E. Karisch. Airline Crew Rostering: Problem Types, Modeling and Opti-

mization. Carmen Research and Technology Report, CRTR-2001-1, 2001.

[138] V. Kumar. Algorithms for Constraint-Satisfaction Problems: A Survey. The AI Magazine,

AAAI, 13:32–44, 1992.

[139] M. Lehradt. Basisalgorithmen für ein TV Anytime System. Diploma Thesis, University

of Paderborn, 2000.

[140] F.T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hy-

percubes. Morgan Kaufmann Publishers, 1992.

[141] T. Leighton, F. Makedon, S. Plotkin, C. Stein, E. Tardos, and S. Tragoudas. Fast Approx-

imation Algorithms for Multicommodity Flow Problems. Journal of Computer and System

Sciences, 50(2):228–243, 1995.

[142] M. Lübbecke and U. Zimmermann. Computer aided scheduling of switching engines. 8th

International Conference on Computer-Aided Scheduling of Public Transport (CASPT),

2000.

[143] A.K. Mackworth. Consistency in networks of relations. Artificial Intelligence, 8(1):99–

118, 1977.

[144] T.L. Magnanti and R.T. Wong. Network design and transportation planning: Models and

algorithms. Transportation Science, 18:1–55, 1984.

[145] K. Marriott and P.J. Stuckey. Programming with Constraints: An Introduction. The MIT

Press, 1998.

[146] S. Martello, D. Pisinger, and P. Toth. Dynamic programming and tight bounds for the 0-1

knapsack problem. Management Science, 45:414–424, 1999.

[147] S. Martello and P. Toth. An upper bound for the zero-one knapsack problem and a branch

and bound algorithm. European Journal of Operational Research, 1:169–175, 1977.

[148] S. Martello and P. Toth. A new algorithm for the 0-1 knapsack problem. Management

Science, 34:633–644, 1988.

[149] S. Martello and P. Toth. Knapsack Problems — Algorithms and Computer Implementa-

tions. Wiley, 1990.

[150] S. Martello and P. Toth. Upper Bounds and Algorithms for hard 0-1 knapsack problems.

Operations Research, 45(5):768–778, 1997.

[151] R.D. McBride. Progress made in solving the multicommodity flow problem. SIAM Jour-

nal on Optimization, 8:947–955, 1998.

[152] I. McDonald. Unique Symmetry Breaking in CSPs Using Group Theory. Workshop on

Symmetry in Constraints (SymCon), 2001.

[153] K. Mehlhorn and S. Näher. LEDA: A Platform for Combinatorial and Geometric

Computing. Communications of the ACM, 38(1):96–102, 1995.

[154] K. Mehlhorn and M. Ziegelmann. Resource Constrained Shortest Paths. 8th Annual

European Symposium on Algorithms (ESA), LNCS 1879:326–337, 2000.

[155] P. Meseguer and C. Torras. Exploiting symmetries within constraint satisfaction search.

Artificial Intelligence, 129(1–2):133–163, 2001.

[156] M. Milano. Integration of Mathematical Programming and Constraint Programming for

Combinatorial Optimization Problems. Tutorial at the 6th International Conference on

Principles and Practice of Constraint Programming (CP), 2000.

[157] U. Montanari. Networks of constraints: fundamental properties and applications.

Information Sciences, 7(2):95–132, 1974.

[158] G.L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. Wiley, 1988.

[159] G.L. Nemhauser, M.W.P. Savelsbergh, and G.C. Sigismondi. MINTO, a Mixed INTeger

Optimizer. Operations Research Letters, 15:47–58, 1994.

[160] W.P.M. Nuijten and E.H.L. Aarts. A computational study of constraint satisfaction for

multiple capacitated job shop scheduling. European Journal of Operational Research,

90(2):269–284, 1996.

[161] A. Orda. Routing with end to end QoS guarantees in broadband networks. Conference on

Computer Communications (Infocom), IEEE, pp. 27–34, 1998.

[162] G. Ottosson and E.S. Thorsteinsson. Linear Relaxation and Reduced-Cost Based Propa-

gation of Continuous Variable Subscripts. 2nd International Workshop on Integration of AI

and OR Techniques in Constraint Programming for Combinatorial Optimization Problems

(CP-AI-OR), Paderborn Center for Parallel Computing, Technical Report tr-001-2000:129–

138, 2000.

[163] PARROT. Executive Summary. ESPRIT 24960, 1997.

[164] F. Pellegrini and J. Roman. SCOTCH: A software package for static mapping by dual

recursive bipartitioning of process and architecture graphs. 4th European Conference on

High Performance Computing and Networking (HPCN), pp. 493–498, 1996.

[165] G. Pesant and M. Gendreau. A view of local search in constraint programming. 2nd

International Conference on Principles and Practice of Constraint Programming (CP),

LNCS 1118:353–366, 1996.

[166] S. Pettie and V. Ramachandran. Computing undirected shortest paths using comparisons

and additions. 13th Symposium on Discrete Algorithms (SODA), 2002.

[167] D. Pisinger. An expanding-core algorithm for the exact 0-1 knapsack problem. European

Journal of Operational Research, 87:175–187, 1995.

[168] D. Pisinger. An exact algorithm for large multiple knapsack problems. European Journal

of Operational Research, 114:528–541, 1999.

[169] S.A. Plotkin, D. Shmoys, and E. Tardos. Fast approximation algorithms for fractional

packing and covering problems. Mathematics of Operations Research, 20:257–301, 1995.

[170] O. Porto, M. de Moraes, and A. Lucena. A relax and cut algorithm for the quadratic

knapsack problem. 17th International Symposium on Mathematical Programming (ISMP),

2000.

[171] R. Preis and R. Diekmann. The PARTY Partitioning Library – User Guide – Version 1.1.

Technical Report, University of Paderborn, tr-rsfb-96-024, 1996.

[172] R. Preis and R. Diekmann. PARTY - A Software Library for Graph Partitioning. Advances

in Computational Mechanics with Parallel and Distributed Processing, Civil-Comp Press,

pp. 63–71, 1997.

[173] S. Prestwich. A hybrid search architecture applied to hard random 3-SAT and low-

autocorrelation binary sequences. 6th International Conference on Principles and Practice

of Constraint Programming (CP), LNCS 1894:337–352, 2000.

[174] J.-F. Puget. Symmetry Breaking Revisited. 8th International Conference on Principles

and Practice of Constraint Programming (CP), LNCS 2470:446–461, 2002.

[175] T. Radzik. Fast deterministic approximation for the multicommodity flow problem. 6th

Symposium on Discrete Algorithms (SODA), pp. 486–492, 1995.

[176] T. Radzik. Experimental study of a solution method for multicommodity flow problems.

2nd Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 79–102, 2000.

[177] G. Ramalingam and T. Reps. An Incremental Algorithm for a Generalization of the

Shortest-Path Problem. Journal of Algorithms, 21(2):267–305, 1992.

[178] G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems.

Theoretical Computer Science, 158(1–2):233–277, 1995.

[179] J.-C. Régin. A filtering algorithm for constraints of difference in CSPs. 12th National

Conference on Artificial Intelligence, AAAI, pp. 362–367, 1994.

[180] R. Rodosek, M. Wallace, and M.T. Hajian. A new approach to integrating mixed integer

programming and constraint logic programming. Annals of Operations Research, 86:63–

87, 1999.

[181] E. Rothberg. Using Cuts to Remove Symmetry. 17th International Symposium on Math-

ematical Programming (ISMP), 2000.

[182] R.A. Rushmeier, K.L. Hoffman, and M. Padberg. Recent advances in exact optimization

of airline scheduling problems. Technical Report, George Mason University, 1995.

[183] D.M. Ryan. The solution of massive generalized set partitioning problems in aircrew

rostering. Journal of the Operational Research Society, 43(5):459–467, 1992.

[184] M. Sato. Efficient implementation of an approximation algorithm for multicommodity

flows. Master's Thesis, Graduate School of Engineering Science, Osaka University, 2000.

[185] J. Schulze and T. Fahle. A parallel algorithm for the vehicle routing problem with time

window constraints. Annals of Operations Research, 86:585–607, 1999.

[186] SCIL. Symbolic Constraints in Integer Linear Programming. http://www.mpi-sb.mpg.de/SCIL/.

[187] M. Sellmann. An Arc-Consistency Algorithm for the Weighted All Different Constraint.

8th International Conference on Principles and Practice of Constraint Programming (CP),

LNCS 2470:744–749, 2002.

[188] M. Sellmann and T. Fahle. CP-Based Lagrangian Relaxation for a Multimedia Applica-

tion. 3rd International Workshop on Integration of AI and OR Techniques in Constraint

Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 1–14, 2001.

[189] M. Sellmann and T. Fahle. Coupling Variable Fixing Algorithms for the Automatic

Recording Problem. 9th Annual European Symposium on Algorithms (ESA), LNCS

2161:134–145, 2001.

[190] M. Sellmann and T. Fahle. Constraint Programming Based Lagrangian Relaxation for the

Automatic Recording Problem. Annals of Operations Research, to appear.

[191] M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 4th International Work-

shop on Integration of AI and OR Techniques in Constraint Programming for Combinato-

rial Optimization Problems (CP-AI-OR), pp. 191–204, 2002.

[192] M. Sellmann and W. Harvey. Heuristic Constraint Propagation. 8th International Confer-

ence on Principles and Practice of Constraint Programming (CP), LNCS 2470:738–743,

2002.

[193] M. Sellmann, G. Kliewer, and A. Koberstein. Lagrangian Cardinality Cuts and Variable

Fixing for Capacitated Network Design. 10th Annual European Symposium on Algorithms

(ESA), LNCS 2461:845–858, 2002.

[194] M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Integrating Direct CP

Search and CP-based Column Generation for the Airline Crew Assignment Problem. 2nd

International Workshop on Integration of AI and OR Techniques in Constraint Program-

ming for Combinatorial Optimization Problems (CP-AI-OR), Paderborn Center for Parallel

Computing, Technical Report tr-001-2000:163–170, 2000.

[195] M. Sellmann, K. Zervoudakis, P. Stamatopoulos, and T. Fahle. Crew Assignment via Con-

straint Programming: Integrating Column Generation and Heuristic Tree Search. Annals of

Operations Research, 115:207–225, 2002.

[196] B. Selman, H. Kautz, and D. McAllester. Ten Challenges in Propositional Reasoning and

Search. 14th International Joint Conference on Artificial Intelligence (IJCAI), pp. 50–54,

1997.

[197] N. Sensen. Lower Bounds and Exact Algorithms for the Graph Partitioning Problem using

Multicommodity Flows. 9th Annual European Symposium on Algorithms (ESA), LNCS

2161:391–403, 2001.

[198] F. Shahrokhi and D.W. Matula. The maximum concurrent flow problem. Journal of the

ACM, 37:318–334, 1990.

[199] F. Shahrokhi and L. Szekely. On Canonical Concurrent Flows, Crossing Number and

Graph Expansion. Combinatorics, Probability and Computing, 3:523–543, 1994.

[200] P. Shaw. Using constraint programming and local search methods to solve vehicle routing

problems. 4th International Conference on Principles and Practice of Constraint Programming

(CP), LNCS 1520:417–431, 1998.

[201] H.D. Sherali and J. Cole Smith. Improving Discrete Model Representation Via Symmetry

Considerations. 17th International Symposium on Mathematical Programming (ISMP),

2000.

[202] B. Smith. Reducing Symmetry in a Combinatorial Design Problem. 3rd International

Workshop on Integration of AI and OR Techniques in Constraint Programming for Combi-

natorial Optimization Problems (CP-AI-OR), pp. 351–360, 2001.

[203] C. Souza, R. Keunings, L.A. Wolsey, and O. Zone. A new approach to minimising the

frontwidth in finite element calculations. Computer Methods in Applied Mechanics and

Engineering, 111:323–334, 1994.

[204] V. Sridhar and J.S. Park. Benders-and-cut algorithm for fixed-charge capacitated network

design problems. European Journal of Operational Research, 125(3):622–632, 2000.

[205] P. Stamatopoulos, G. Boukeas, K. Zervoudakis, V. Stoumpos, and C. Halatsis. Parallel

CP-based direct crew rostering. ESPRIT 24960 (PARROT), University of Athens and

University of Paderborn, Deliverable D-TEC2.1, 1999.

[206] A. Steger and N.C. Wormald. Generating random regular graphs quickly. Combinatorics,

Probability and Computing, 8:377–396, 1999.

[207] M. Thorup. Undirected single source shortest paths in linear time. 38th Annual Symposium

on Foundations of Computer Science (FOCS), IEEE, pp. 12–21, 1997.

[208] TIVO. TV your way. TIVO, Inc., http://www.tivo.com/home.asp.

[209] M. Trick. A Dynamic Programming Approach for Consistency and Propagation for Knap-

sack Constraints. 3rd International Workshop on Integration of AI and OR Techniques in

Constraint Programming for Combinatorial Optimization Problems (CP-AI-OR), pp. 113–

124, 2001.

[210] UP-TV. European Research Project, IST-1999-20751. http://www.up-tv.de/.

[211] P. Van Hentenryck, Y. Deville, and C.M. Teng. A generic arc-consistency algorithm and

its specializations. Artificial Intelligence, 57:291–321, 1992.

[212] P.R.C. Villela and C.T. Bornstein. An improved bound for the 0-1 knapsack problem.

Technical Report, COPPE-Federal University of Rio de Janeiro, ES31-83, 1983.

[213] T. Walsh. Depth-bounded discrepancy search. 14th International Joint Conference on

Artificial Intelligence (IJCAI), pp. 1388–1393, 1997.

[214] C. Walshaw, M. Cross, and M. Everett. A localised algorithm for optimising unstructured

mesh partitions. International Journal of Supercomputer Applications and High Perfor-

mance Computing, 9(4):280–295, 1995.

[215] H.P. Williams. Model Building in Mathematical Programming. Wiley, 1978.

[216] G. Xue. Primal-dual algorithms for computing weight-constrained shortest paths and

weight-constrained minimum spanning trees. International Performance, Computing, and

Communications Conference (IPCCC), IEEE, pp. 271–277, 2000.

[217] N. Young. Randomized rounding without solving the linear program. 6th Symposium on

Discrete Algorithms (SODA), pp. 170–178, 1995.

[218] G. Yu (Editor). Operations Research in the Airline Industry. International Series in Oper-

ations Research and Management Science, Kluwer Academic Publishers, 1998.

[219] T.H. Yunes, A.V. Moura, and C.C. Souza. A hybrid approach for solving large crew

scheduling problems. International Workshop on Practical Aspects of Declarative Lan-

guages (PADL), LNCS 1753:293–307, 2000.