Faster exact solution of sparse MaxCut and QUBO problems [original]

Mathematical Programming Computation (2023) 15:445–470

https://doi.org/10.1007/s12532-023-00236-6

FULL LENGTH PAPER

Faster exact solution of sparse MaxCut and QUBO problems

Daniel Rehfeldt1·Thorsten Koch1,2 ·Yuji Shinano1

Received: 31 January 2022 / Accepted: 3 February 2023 / Published online: 15 April 2023

Abstract

The maximum-cut problem is one of the fundamental problems in combinatorial opti-

mization. With the advent of quantum computers, both the maximum-cut and the

equivalent quadratic unconstrained binary optimization problem have experienced

much interest in recent years. This article aims to advance the state of the art in the

exact solution of both problems—by using mathematical programming techniques.

The main focus lies on sparse problem instances, although also dense ones can be

solved. We enhance several algorithmic components such as reduction techniques and

cutting-plane separation algorithms, and combine them in an exact branch-and-cut

solver. Furthermore, we provide a parallel implementation. The new solver is shown

to significantly outperform existing state-of-the-art software for sparse maximum-cut

and quadratic unconstrained binary optimization instances. Furthermore, we improve

the best known bounds for several instances from the 7th DIMACS Challenge and the

QPLIB, and solve some of them (for the first time) to optimality.

Keywords Maximum-cut ·Quadratic unconstrained binary optimization ·

Branch-and-cut

Mathematics Subject Classification 90C27 ·90C10

BDaniel Rehfeldt

rehfeldt@zib.de

Thorsten Koch

koch@zib.de

Yuji Shinano

shinano@zib.de

1Applied Algorithmic Intelligence Methods departement, Zuse Institute Berlin, Takustr. 7,

14195 Berlin, Germany

2TU Berlin, Chair of Software and Algorithms for Discrete Optimization, Str. des 17. Juni 135,

10623 Berlin, Germany

123

446 D. Rehfeldt et al.

1 Introduction

Given an undirected graph G=(V,E), and edge weights w:E→Q,themaximum-

cut (MaxCut) problem is to find a partition (V1,V2)of Vsuch that the summed weight

of the edges between V1and V2is maximized. MaxCut is one of the fundamental NP-

hard optimization problems [28] and has applications for example in VLSI design [3]

and the theory of spin glasses in physics [33].1The latter application is particularly

interesting, because it requires an exact solution of the MaxCut problem.

A problem that is equivalent to MaxCut is the quadratic unconstrained binary

optimization (QUBO) problem. Given a matrix Q∈Qn×n, the corresponding QUBO

problem can be formulated as

min xTQx

x∈{0,1}n.

Any QUBO instance can be formulated as a MaxCut instance in a graph with n+1

vertices, and any MaxCut instance on a graph (V,E)can be formulated as a QUBO

instance with n=|V|−1, see e.g. [4]. The focus of this article is mostly on MaxCut

algorithms, but due to the just mentioned equivalence, all results can be (and indeed

are) applied to QUBO as well.

The huge recent interest in quantum computing has also put MaxCut and QUBO

in the spotlight: Both of them can be heuristically solved by current quantum anneal-

ers. However, Jünger et al. [25] demonstrate on a wide range of test-sets that digital

computing methods prevail against state-of-the-art quantum annealers.

For digital computers, many heuristics have been proposed both for MaxCut and

QUBO. See Dunning et. al. [13] for a recent overview. There have also been various

articles on exact solution. See Barahona et al. [4] for an early, Rendl et al. [37]for

a more recent, and Jünger et al. [25] for an up-to-date overview. In the last years,

more focus has been put on the development of methods that are best suited for dense

instances, see for example [20,23,31] for state-of-the-art methods. However, the

maximum number of nodes for MaxCut (or number of variables for QUBO) instances

that can be handled by these methods is roughly 300. In contrast, this article aims

to advance the state of the art in the practical exact solution of sparse MaxCut and

QUBO instances. The largest (sparse) instance solved in this article has more than

10,000 nodes.

1.1 Contribution and structure

This article describes the design and implementation of a branch-and-cut based Max-

Cut and QUBO solver. In particular, we suggest several algorithmic improvements of

key components of a branch-and-cut framework.

Section 2shows how to efficiently solve a well-known linear programming (LP)

relaxation for the MaxCut problem by using cutting planes. Among other things,

we demonstrate how the separation of maximally violated constraints, which was

1As a side note, the 2021 Nobel prize in Physics was awarded for work on spin glasses.

123

Faster exact solution of sparse MaxCut and QUBO problems 447

described by many authors as being too slow for practical use, can be realized with

quite moderate run times.

Section 3is concerned with another vital component within branch-and-cut: reduc-

tion techniques. We review methods from the literature and propose new ones. The

reduction methods can be applied for preprocessing and domain propagation.

Section 4shows how to integrate the techniques from the previous two sections as

well as several additional methods in a branch-and-cut algorithm. Parallelization is

also discussed.

Section 5provides computational results of the newly implemented MaxCut and

QUBO solver on a large collection of test-sets from the literature. It is shown that

the new solver outperforms the previous state of the art. Furthermore, the best known

solutions of several benchmark instances can be improved and one is even solved (for

the first time) to optimality.

1.2 Preliminaries and notation

In the remainder of this article, we assume that a MaxCut instance IMC =(G,w)

with graph G=(V,E)and edge weights wis given. Graphs are always assumed

to be undirected and simple, i.e., without parallel edges or self-loops. Given a graph

G=(V,E), we refer to the vertices and edges of any subgraph G⊆Gas V(G)

and E(G)respectively, An edge between vertices u,v ∈Vis denoted by {u,v}.An

edge set C={{v1,v

2},{v2,v

3}, ..., {vk−1,v

k},{v1,v

k}}is called a cycle. A cycle C

is called simple if all its vertices have degree 2 in C. An edge {u,w}∈E\Cis called

achord of Cif both uand ware contained in (an edge of) C. If no such {u,w}exists,

we say that Cis chordless. Given a graph G=(V,E)and a U⊆V, we define the

induced edge cut as δ(U):= {{u,v}∈E|u∈U,v ∈V\U}.

Finally, for any function x:M→ Rwith Mfinite, and any M⊆M, we define

x(M):= i∈Mx(i).

2 Solving the relaxation: eﬃcient separation of odd-cycle cuts

This section is concerned with an integer programming (IP) formulation for MaxCut

due to Barahona and Mahjoub [5], given below.

max wTx(1)

s.t. 

e∈F

x(e)−

e∈C\F

x(e)≤|F|−1 for all cycles C,F⊆C,|F|odd (2)

x(e)∈{0,1}for all e∈E.(3)

The formulation is based on the observation that for any edge cut δ(U)and any cycle

Cthe number of their common edges, namely |C∩δ(U)|, is even. This property is

enforced by the constraints (2). These constraints are called cycle inequalities.

123

448 D. Rehfeldt et al.

Fig. 1 MaxCut graph and corresponding auxiliary graph for cycle cut separation

2.1 Cutting plane separation

Barahona and Mahjoub [5] show that the LP-relaxation of Formulation 1 can be solved

in polynomial time. More precisely, they describe how to separate the constraints (2)

in polynomial time, as demonstrated in the following. First, rewrite the constraints (2)

as 

e∈F

(1−x(e))+

e∈C\F

x(e)≥1 for all cycles C,F⊆C,|F|odd.(4)

Next, construct a new graph Hfrom the MaxCut graph G=(V,E). This graph H

consists of two copies G=(V,E)and G =(V,E)of G, connected by the

following additional edges. For each v∈Vlet vand v be the corresponding vertices

in Gand G, respectively. For each edge {v, w}∈Elet {v,w

}and {v,w

}be

in H. Finally, for any (LP-relaxation vector) x∈[0,1]Edefine the following edge

weights pon H: For each e={v,w}∈E,setp({v,w

}):= p({v,w

}):= x(e)

and p({v,w

}):= p({v,w

}):= 1−x(e). The construction is exemplified in Fig.1.

Consider, for example, the edge {v,w}in Fig.1a. The weight pof the corresponding

(dashed) edges {v,w

}and {v,w

}in Fig.1bis1−x({v,w}). The weight pof the

corresponding (bold) edges {v,w

}and {v,w

}is x({v, w}).

Given an LP-relaxation vector x∈[0,1]E, we can find violated inequalities (2)as

follows. For each v∈Vcompute a shortest path between vand v in the weighted

graph (H,p). By construction of H, such a path contains an odd number of edges

which are neither in Enor E.LetFbe the corresponding set of edges in E; i.e. for

each edge {v,w

}or {v,w

}that is in the shortest path, let {v,w}be in F. Further-

more, the edges of the shortest path correspond to a closed walk Cin G. The length of

the shortest path in (H,p)is equal to e∈F(1−x(e))+e∈C\Fx(e). Thus, if for

each v∈Vthe corresponding shortest path between vand v in (H,p)has length

123

Faster exact solution of sparse MaxCut and QUBO problems 449

at least 1, the vector xis an optimal solution to the LP-relaxation of Formulation 1.

Otherwise, we have found at least one violated constraint.

Although shortest paths can be computed in polynomial time, the literature has so

far considered the above separation procedure as too time-consuming to be directly

used in practical exact MaxCut or QUBO solution. Instead, heuristics are employed

and exact cycle separation is only used if no more cuts can be found otherwise, see,

e.g., [3,4,9,25,33]. However, as we will show in the following, the exact separation

can actually be realized in a practically quite efficient way.

2.2 Fast computation of maximally-violated constraints

Initially, we observe that it is usually possible to considerably reduce the size of

the auxiliary graph Hdescribed above. First, all edges eof Hwith p(e)=1(or

practically, with p(e)being sufficiently close to 1) can be removed. Because all edge

weights are non-negative, no such edges can be contained in a path of weight smaller

than 1. Second, one can contract edges ewith p(e)=0. Both of these operations can

be done implicitly while creating the auxiliary graph (e.g., edges with weight 1 are

never added). In this way, one can use cache-efficient, static data structures, such as

the compressed-sparse-row format, see e.g. [29], for representing the auxiliary graph.

For computing a shortest path, we use a modified version of Dijkstra’s algorithm.

For any vertex vin the auxiliary graph let d(v) denote the distance of vto the start

vertex, as computed by the algorithm. We use the following modifications. First, we

stop the execution of the algorithm as soon as we scan a vertex vwith d(v) ≥1.

Second, as already observed in Jünger and Mallach [26], one does not need to proceed

the shortest path computation from any vertex, say v, in the auxiliary graph where

the twin vertex, v, has already been scanned and the following condition holds:

d(v)+d(v)≥1.

Finally, we use an optimized implementation of Dijkstra’s algorithm together with

a specialized binary heap. For the latter, we exploit the fact that the values (i.e. vertex

indices) of the key, value pairs inserted into the heap are natural numbers bounded by

the number of vertices of the auxiliary graph.

2.3 Post-processing

As already mentioned above, the edges of the shortest path computed in the auxiliary

graph correspond to a closed walk in G—but not necessarily to a simple cycle. Thus,

Jünger and Mallach [26] suggest to extract all simple cycles from such a closed walk

and separate the corresponding inequalities. We follow this suggestion (although we

note that this modification is performance neutral in our implementation).

Barahona and Mahjoub [5] observe that a cycle inequality is only facet-defining if

the corresponding cycle is chordless. If a cycle Chas a chord e, one readily obtains

two smaller cycles C1and C2with C1∪C2=C∪{e}and C1∩C2={e}.One

verifies that any cycle inequality defined on Ccan be written as the sum of two cycle

inequalities defined on C1and C2, where eis in the odd edges set Fof exactly one of

the two cycle inequalities. Jünger and Mallach [27] suggest a procedure to extract from

123

450 D. Rehfeldt et al.

any simple cycle Cwith corresponding violated cycle-inequality a chordless cycle C

whose cycle-inequality is also violated. This procedure runs in O(|E|). However, a

disadvantage of this approach is that it finds only one such chordless cycle, which

might not be the smallest or most violated one. Additionally, there can be several

such chordless cycles. In the following, we suggest a procedure to find several non-

overlapping chordless cycles with corresponding violated cycle inequality from a given

cycle Cwith violated cycle inequality.

Consider a simple cycle C={{v1,v

2},{v2,v

3}, ..., {vk−1,v

k},{v1,v

k}}and let

F⊆Cwith |F|odd. Assume there is a vector x∈[0,1]Esuch that the cycle

inequality corresponding to Cand Fis violated, that is:



e∈F

(1−x(e))+

e∈C\F

x(e)<1.

For each i=2, ..., kdefine Pi:= {{v1,v

2},{v2,v

3}, ..., {vi−1,v

i}}and store the

following information.

–f(i):= |F∩Pi|,

–q(i):= e∈F∩Pi(1−x(e))+e∈(C∩Pi)\Fx(e).

This information can be computed in total time O(|C|): Traverse the nodes vi,i=

2,3, .., kof Cin this order and compute the above two values for ifrom i−1.

With the above information at hand, traverse for each i=2, ..., kthe incident

edges of vi. Whenever a chord {vi,vj}with j<iis found, check whether the cycle

inequality of one or both of the corresponding cycles is violated. This check can be

performed in constant time by using the precomputed information for the indices iand

j. For example, if f(i)−f(j)is even, one of the corresponding two cycle inequalities

q(i)−q(j)+1−x({vi,vj})≥1.

If a violated cycle inequality is found, add the corresponding chord together with

a flag that indicates which of the two possible cycles is to be used to some (ini-

tially empty) queue R. Once the incident edges of all nodes vifor i=2, ..., k

have been traversed, sort the elements of Raccording to the size of the corre-

sponding cycles in non-decreasing order. Consider all indices of the original cycle

as unmarked. Check the (implicit) cycles in Rin non-decreasing order. Let {vi,vj}

with i<jbe the corresponding chord. If both iand jare unmarked, mark the

indices i+1,i+2, ..., j−1. Otherwise, discard the (implicit) cycle. Finally, add all

cycle inequalities corresponding to non-discarded cycles to the cut pool. The overall

procedure runs in O(|E|log(|E|)). In practice, its run time is completely neglectable.

Finally, we suggest a procedure to obtain additional cycle cuts from the auxiliary

graph. This approach is particularly useful for MaxCut instances with few vertices,

because the number of generated cycle inequalities separated in each round is limited

by the number of vertices of the MaxCut instance (if we ignore additional cycle-

inequalities that are possibly found by the above post-processing). The procedure

makes use of the symmetry of the auxiliary graph. Assume that we have computed a

shortest path between a pair of vertices, say vand v, as described above. Recall that

123

Faster exact solution of sparse MaxCut and QUBO problems 451

d(w) denotes the distance of any vertex wto the start vertex v. If there is a twin pair

of vertices u,u such that none of them are part of the shortest path between vand

v, and d(u)+d(u)<1, we can get another violated cycle inequality as follow:

First, we take the v-upath computed by the algorithm. Second, we consider the v-u

path, and transform it to an u-v path (of same length) by exploiting the symmetry

of the auxiliary graph. By combining the two paths, we obtain an v-v path of length

d(u)+d(u).

3 Simplifying the problem: reduction techniques

Reduction techniques are a key ingredient for the exact solution of many NP-hard

optimization problems, such as Steiner tree [36] or vertex coloring [34]. For QUBO,

several reductions methods have been suggested in the literature. Basic techniques

can already be found in Hammer et al. [22]. The perhaps most extensive reduction

framework is given in Tavares et. al. [10]. Recently, Glover et al. [18] provided efficient

realizations and extensions of the classic first and second order derivative and co-

derivative techniques [21]. We have implemented the methods from Glover et al. [18]

for this article. However, we do not provide details, but rather concentrate on MaxCut

reduction techniques in the following.

For MaxCut, there are several articles that discuss reduction techniques for

unweighted MaxCut. Ferizovic et al. [14] provide the practically most powerful col-

lection of such techniques. Lange et al. [32] provide techniques for general (weighted)

MaxCut instances. In the following, we will describe some of their methods. Further-

more, we suggest new MaxCut reduction methods. Their practical strength will be

demonstrated in Sect.5.

Initially, we note that any edge with weight 0 can be removed from IMC.Any

solution to this reduced version of IMC can be extended to a solution of same weight

to the original instance (in linear time). Thus, in the following we assume no edges

have weight 0. We also note that for the incidence vector x∈{0,1}Eof any graph

cut one obtains a corresponding (but not unique) vertex assignment y∈{0,1}Vthat

satisfies for all {u,v}∈Ethe relation y(u)= y(v) ⇐⇒ x({u,v})=1. This

correspondence will be used repeatedly in the following.

3.1 Cut-based reduction techniques

The first reduction technique from Lange et al. [32] is based on the following propo-

sition.

Proposition 1 [32]Let e ∈E and U ⊂V such that e ∈δ(U).If

|w(e)|≥ 

a∈δ(U)\{e}

|w(a)|,

then there is an optimal solution x ∈{0,1}Eto IMC with x(e)=β, where β=1if

w(e)>0, and β=0if w(e)<0.

123

452 D. Rehfeldt et al.

Note that in the case of x(e)=0, one can simply contract e. In the case of x(e)=1,

one needs to multiply the weights of the incident edges of one of the endpoints of e

by −1 before the contraction.

One way to check for all e∈Ewhether an U⊂Vexists such that the conditions of

Proposition 1are satisfied is by using Gomory-Hu trees. We have only implemented a

simpler check that considers for an edge e={v,u}∈Ethe sets {v}and {u}as U,as

already suggested in Lange et al. [32]. A combined check for all edges can be made

in O(|E|). We note that this test corresponds to the first order derivative reduction

method (mentioned above) for QUBO. This relation can be readily verified by means

of the standard transformations between MaxCut and QUBO.

The next reduction technique from Lange et al. [32] is based on triangles, and is

given below.

Proposition 2 [32]Assume there is a triangle in G with edges {v1,v

2},{v1,v

3},

{v2,v

3}. Let V1⊂V such that {v1,v

2},{v1,v

3}⊂δ(V1), and V2⊂V such that

{v1,v

2},{v2,v

3}⊂δ(V2).If

−w({v1,v

3})−w({v1,v

2})≥

e∈δ(V1)\{{v1,v3},{v1,v2}}

|w(e)|

and

−w({v1,v

2})−w({v2,v

3})≥

e∈δ(V2)\{{v1,v2},{v2,v3}}

|w(e)|,

then there is an optimal solution x ∈{0,1}Eto IMC with x({v1,v

2})=0.

Similarly to the previous proposition, we only implemented tests for the simple cases

of {v1},{v2},{v1,v

3}, and {v2,v

3}for V1and V2, respectively.

In the following, we propose a new reduction test based on triangles, which com-

plements the above one from Lange et al. [32].

Proposition 3 Assume there is a triangle in G with edges {v1,v

2},{v1,v

3},{v2,v

3}∈

E such that w({v1,v

2})>0,w({v1,v

3})>0, and w({v2,v

3})<0. Let V1⊂Vsuch

that {v1,v

2},{v1,v

3}∈δ(V1)and let V2⊂V such that {v1,v

2},{v2,v

3}∈δ(V2).If

w({v1,v

2})+w({v1,v

3})≥

e∈δ(V1)\{{v1,v2},{v1,v3}}

|w(e)|,(5)

and

w({v1,v

2})−w({v2,v

3})≥

e∈δ(V2)\{{v1,v2},{v1,v3}}

|w(e)|,(6)

then there is an optimal solution x ∈{0,1}Eto IMC such that x({v1,v

2})=1.

Proof Let x∈{0,1}Ebe a feasible solution to IMC with x({v1,v

2})=0. We will

construct a feasible solution x∈{0,1}Ewith x({v1,v

2})=1 such that wTx≥wTx.

Thus, there exists at least one optimal solution x∈{0,1}Ewith x({v1,v

2})=1.

123

Faster exact solution of sparse MaxCut and QUBO problems 453

Because x({v1,v

2})=0, it needs to hold that either

x({v1,v

3})=x({v2,v

3})=0(7)

x({v1,v

3})=x({v2,v

3})=1.(8)

We just consider the case (7); the second one can be handled in an analogeous way. Let

y∈{0,1}Vbe a vertex assignment corresponding to x; i.e., for all {u,v}∈Eit holds

that y(u)= y(v) ⇐⇒ x({u,v})=1. Define a new vertex assignment y∈{0,1}V

as follows

y(v) := 1−y(v) if v∈V1

y(v) otherwise.

Let x∈{0,1}Ebe the cut corresponding to y; i.e., for all {u,v}∈Eit holds

x({u,v})=1ify(u)= y(v), and x({u,v})=0 otherwise. Note that for all

e∈E\δ(V1)it holds that x(e)=x(e). For all e∈δ(V1)it holds that x(e)=1−x(e).

In particular,

x({v1,v

2})=x({v1,v

3})=1,(9)

because of x({v1,v

2})=x({v1,v

3})=0. Thus, we obtain



e∈E

w(e)x(e)=

e∈E\δ(V1)

w(e)x(e)+

e∈δ(V1)

w(e)x(e)

=

e∈E\δ(V1)

w(e)x(e)+

e∈δ(V1)

w(e)x(e)

(9)

=

e∈E\δ(V1)

w(e)x(e)

+

e∈δ(V1)\{{v1,v2},{v1,v3}}

w(e)x(e)+w({v1,v

2})+w({v1,v

3})

(5)

≥

e∈E\δ(V1)

w(e)x(e)

+

e∈δ(V1)\{{v1,v2},{v1,v3}}

(w(e)x(e)+|w(e)|)

≥

e∈E\δ(V1)

w(e)x(e)

+

e∈δ(V1)\{{v1,v2},{v1,v3}}

w(e)(1−x(e))

=

e∈E\δ(V1)

w(e)x(e)

+

e∈δ(V1)\{{v1,v2},{v1,v3}}

w(e)x(e)

123

454 D. Rehfeldt et al.

=

e∈E\δ(V1)

w(e)x(e)+

e∈δ(V1)

w(e)x(e)

=

e∈E

w(e)x(e),

which concludes the proof.

As for the previous triangle test, we only consider the simple cases of {v1},{v2},

{v1,v

3}, and {v2,v

3}for V1and V2in our implementation.

Note that Lange et al. [32] furthermore propose a generalization of Proposition 2to

more general connected subgraphs. Also Proposition 3could be generalized in a similar

way. However, since we only implemented reductions tests for the triangle conditions,

we do not provide details on this generalization here. We also note that exploiting this

more general condition for effective practical reductions is not straight-forward and

seems computationally considerably more expensive than the triangle tests.

3.2 Further reduction techniques

In the following, we propose two additional reduction methods, based on different

techniques. One uses the reduced-costs of the LP-relaxation of Formulation 1, and

one exploits simple symmetries in MaxCut instances.

We start with the latter. If successful, the test based on the following proposition

allows one to contract two (possibly non-adjacent) vertices.

Proposition 4 Assume there are two distinct vertices u,v ∈V such that N(u)\{v}=

N(v)\{u}. If there exists a non-zero αsuch that w(e)=αw(e)for all pairs e,ewith

e∈δ(u)\{u,v},e∈δ(v)\{u,v},e∩e=∅, and moreover

–{u,v}/∈E∨w({u,v})<0in case of α>0

–{u,v}/∈E∨w({u,v})>0in case of α<0,

then there is an optimal vertex solution y ∈{0,1}Vto IMC such that y(u)=y(v) if

α>0, and y(u)=(1−y(v)) if α<0.

Proof We consider only the case α>0; the case α<0 can be shown in a similar way.

Let y∈{0,1}Vwith y(v) = y(u). We will construct a y∈{0,1}Vwith y(v) =y(u)

such that the weight of the induced cut of yis not lower than the weight of the induced

cut of y. In this way, the proof is complete, because we can apply this construction

also for any optimal vertex assignment

Let x∈{0,1}Ebe the induced cut of y. Assume that



e∈δ(u)\{u,v}

w(e)x(e)≥α

e∈δ(v)\{u,v}

w(e)x(e). (10)

Otherwise, switch the roles of uand vin the following.

123

Faster exact solution of sparse MaxCut and QUBO problems 455

Let f:δ(v)\{{u,v}}→δ(u)\{{u,v}}such that e∩f(e)=∅for all e∈

δ(v)\{u,v}. Note that fis well-defined. Define a new cut x∈{0,1}Eas follows

x(e):= ⎧

⎨

⎩

x(e)if e∈E\δ(v)

x(f(e)if e∈δ(v) \{{u,v}}

0ife={u,v}

Because of (10) and {u,v}/∈E∨w({u,v})<0 it holds that wTx≥wTx.

The condition of Proposition 4can be checked efficiently in practice by using hash-

ing techniques, similar to the ones used for the parallel rows test for mixed-integer

programs [2].

A well-known reduction method for binary integer programs, which was already

used for MaxCut [4], is as follows. Consider a feasible solution ˜xto the LP-relaxation

of Formulation 1, with reduced-costs ˜w, and with objective value ˜

U. Further, let Lbe

the weight of a graph cut. If for an e∈Eit holds that ˜x(e)=0 and ˜

U−˜w(e)<L, one

can fix x(e):= 0. If for a e∈Eit holds that ˜x(e)=1 and ˜

U+˜w(e)<L, one can fix

x(e):= 1. This method can also be used for LP-solutions (obtained during separation)

that satisfy only a subset of the cycle inequalities (2). In the following, we will only

consider optimal LP-solutions ˜x(possibly for a subset of the cycle inequalities). Since

we furthermore consider only LP-solutions obtained by the Simplex algorithm, all

non-zero variables have reduced-cost 0.

From incident fixed edges one obtains a (non-unique) partial vertex assignment

y:V→{0,1}. This assignment can be used to obtain additional fixings, as detailed

in the following proposition.

Proposition 5 Let ˜x be an optimal solution to the LP-relaxation of Formulation 1,

with reduced-costs ˜w, and objective value ˜

U. Let L be an upper bound on the weight

of a maximum-cut. Let V ⊂V and y:V→{0,1}such that for any optimal

vertex assignment y ∈{0,1}Vit holds that y(v) =y(v) for all v∈V. Further, let

u∈V\Vand define

0:= 

{u,v}∈δ(u)|v∈V,y(v)=0

˜w({u,v})

and

1:= 

{u,v}∈δ(u)|v∈V,y(v)=1

˜w({u,v}).

For any optimal vertex assignment y ∈{0,1}Vthe following conditions hold. If

L+˜

0>˜

U, then y(u)=0.IfL+˜

1>˜

U, then y(u)=1.

The proposition follows from standard linear programming results. If one of the con-

ditions of the proposition is satisfied, one can fix all edges between uand V.

123

Related document tools

Prepare academic work with more confidence

Plag helps identify passages that may need closer source checking. Identific can help confirm that document-related processes are easier to manage. Together, they help make academic review more transparent.

plag.ai

456 D. Rehfeldt et al.

4 Solving to optimality: branch-and-cut

This section describes how to incorporate the methods introduced so far together with

additional components in an exact branch-and-cut algorithm. This branch-and-cut

algorithm has been implemented based on the academic MIP solver SCIP [7]. Besides

being a stand-alone MIP solver, SCIP provides a general branch-and-cut framework.

Most importantly, we rely on SCIP for organizing the branch-and-bound search, and

the cutting plane management. Most native, general-purpose algorithms of SCIP such

as primal heuristics, conflict analysis, or generic cutting planes are deactivated by our

solver for performance reasons.

4.1 Key components

In the following, we list the main components of the branch-and-cut framework that

was implemented for this article.

Presolving For presolving, the reduction methods described in this article are executed

iteratively within a loop. This loop is reiterated as long as at least one edge has been

contracted during the previous round, and the predefined maximum number of loop

passes has not been reached yet.

Domain propagation For domain propagation we use the reduced-cost criteria

described in Sect.3.2. The simple single-edge fixing is done by the generic reduced-

costs propagator plug-in of SCIP. For the new implication based method we have

implemented an additional propagator.

A classic propagation method, e.g. [4], is as follows: Consider the connected com-

ponents induced by edges that have been fixed to 0 or 1. All additional edges in these

connected components can be readily fixed. However, this technique brought no ben-

efits in our experiments, since the variable values of such edges are implied by the

cycle inequalities (2).

Decomposition It is well-known that connected components of the graph underly-

ing a MaxCut instance can be solved separately, see e.g. [25]. More generally, one

can solve biconnected components separately (this simple observation does not seem

to have been mentioned in the MaxCut literature so far). Since several benchmark

instances used in this article contain many very small biconnected components, we

solve components with a limited number of vertices by enumeration. In this way, we

avoid the overhead associated with creating and solving a new MaxCut instance for

each subproblem.

Primal heuristics Primal heuristics are an important component of practical branch-

and-bound algorithms: First, to find an optimal solution (verified by the dual-bound),

and second to find strong primal bounds that allow the algorithm to cut off many

branch-and-bound nodes. For computing an initial primal solution, we have imple-

mented the MaxCut heuristic by Burer et. al. [11]. We further use the Kernighan-Lin

algorithm [30] to improve any (intermediary) solution found by the algorithm of Burer

123

Faster exact solution of sparse MaxCut and QUBO problems 457

et. al. Additionally, we use this combined algorithm as a local search heuristic when-

ever a new best primal solution has been found during the branch-and-bound search.

In this case, we initiate the heuristic with this new best solution (which can be done

by translating the solution into the two-dimensional angle vectors required by the

heuristic).

We also implemented the spanning-tree heuristic from Barahona et al. [4], which

uses a given (not necessarily optimal) LP-solution to find graph cuts. We execute this

heuristic after each separation round.

Separation In each separation round, we initially try to find violated cycle inequalities

on triangles of the underlying graph (by enumerating some triangles). Moreover, we

use a heuristic strategy based on a minimum spanning tree computation to obtain

additional odd-cycle cuts, see [4] for details. Next, we use shortest-path computations

to find (locally) maximally violated cuts, as described in Sects.2.2 and 2.3. Among the

speed-up techniques, we have not (yet) implemented the contraction of edges, since

the separation routine is already quite fast and other implementations seemed more

promising. Finally, we also use the odd-clique-cuts introduced in [5]. For separating

these cuts, we initially enumerate a (bounded) number of cliques of size 5, 7, and 9

at the beginning of the solving process. In each separation round, we check whether

some of the corresponding cuts are violated, and if so, add them to the LP. Overall,

we bound the number of cuts that can be added per separation round.

Branching We simply branch on the edge variables and use the pseudo-costs branching

strategy of SCIP, see Gamrath [16] for more details. Initial experiments showed that

the default branching strategy of SCIP,reliable pseudo-costs branching, spends too

much time on strong-branching to be competitive.

4.2 Parallelization

For parallelizing our solver, we use the Ubiquity Generator Framework (UG)[38], a

software package to parallelize branch-and-bound based solvers—for both shared- and

distributed-memory environments. UG implements a Supervisor-Worker load coordi-

nation scheme [35]. Importantly, Supervisor functions make decisions about the load

balancing without actually storing the data associated with the branch-and-bound

(B&B) search tree.

A major problem of parallelizing the B&B search lies in the simple fact that parallel

resources can only be used efficiently once the number of open B&B nodes is suffi-

ciently large. Thus, we employ so-called racing ramp-up [35]: Initially, each thread

(or process) starts the solving process of the given problem instance, but each with

different (customized) parameters and random seeds. Additionally, we reserve some

threads to exclusively run primal heuristics. During the racing, information such as

improved primal solutions or global variable fixings is being exchanged among the

threads. We terminate the racing once a predefined number of open B&B nodes has

been created by one thread, or the problem has been solved. Once the racing has been

terminated and the problem is still unsolved, the open nodes are distributed among the

threads and the actual parallel solving phase starts.

123

458 D. Rehfeldt et al.

Possible further paralleizations would be of the cutting plane generation or the

solution of the linear programs. However, as demonstrated in Sect.5.1, the cutting

plane generation only requires a small portion of the entire solving time, so paral-

lizing it would have little overall impact. The parallelization of the linear program

(re-)optimizations is possible, but since a Simplex algorithm is used (which is notori-

ously hard to parallelize), using more than one thread also has little impact.

5 Computational results

This section provides computational results on a large collection of MaxCut and QUBO

instances from the literature. We look at the impact of individual components, and

furthermore compare the new solver with the state of the art for the exact solution of

MaxCut and QUBO instances. An overview of the test-sets used in the following is

given in Table 1. The second column gives the number of instances per test-sets. The

third and fourth columns give the range of nodes and edges in the case of MaxCut, or

the range of variables and non-zero coefficients in the case of QUBO.

Only a few exact MaxCut or QUBO solvers are publicly available, and some, such

as BiqMac [37], only as web services. Still, the state-of-the-art solvers BiqBin [20],

BiqCrunch [31] and MADAM [23] are freely available. However, we have observed

that all of these solvers are outperformed on most instances listed in Table 1by the

recent release 9.5 of the state-of-the-art commercial solver Gurobi [19]. Gurobi

solves mixed-integer quadratic programs, which are a superclass of QUBO. In fact,

the standard benchmark library for quadratic programs, QPLIB [15], contains vari-

ous QUBO instances. Compared to the previous release 9.1, Gurobi 9.5 has hugely

improved on QUBO (and thereby also MaxCut) instances. For example, while Gurobi

9.1 could not solve any of the IsingChain instances from Table 1in one hour (with one

thread), Gurobi 9.5 solves all of them in less than one minute. Thus, in the following,

we will use Gurobi 9.5 as a reference for our new solver. We will also provide results

from the literature, but the comparison with Gurobi 9.5 allows us to obtain results

in the same computational environment. Very recently, an article describing a new

solver, called McSparse, specialized to sparse MaxCut and QUBO instances was

published [12]. The computational experiments in [12] demonstrate that McSparse

outperforms previous MaxCut and QUBO solvers on sparse instances. Like Biq-

Mac [37] and BiqBin [20], this solver is only available via a web interface. However,

we will still provide some comparison with our solver in the following by using the

results published in [12].

The computational experiments were performed on a cluster of Intel Xeon Gold

5122 CPUs with 3.60 GHz, and 96 GB RAM per compute node. We ran only one

job per compute node at a time, to avoid a distortion of the run time measures—

originating for example from shared (L3) cache. For our solver, we use the commerical

CPLEX 12.10 [24] as LP-solver, although our solver also allows for the use of the

non-commercial (but slower) SoPlex [7] instead. For Gurobi we set the parame-

ter MipGap to 0. Otherwise, we would obtain suboptimal solutions even for many

instances with integer weights.

123

Faster exact solution of sparse MaxCut and QUBO problems 459

Table 1 Details of MaxCut (upper part) and QUBO (lower part) test-sets used in this article

Name # |V||E|Description

DIMACS 4 512–3375 1536–10125 Instances introduced at the

7th DIMACS Challenge

Mannino 4 48–487 1128–8511 Instances from a radio frequency assignment problem

introduced in [9]

PM1s100 10 100 495 Instances generated with the rudy framework

from the BiqMac Lib [40]

W01100 10 100 495 Instances generated with the rudy framework

from the BiqMac Lib [40]

K64-chimera 80 2049 8064 Instances on D-Wave Chimera graphs

introduced in [25]

Kernel 14 33–2888 91–2981 Instances from various sources

collected by [14]

IsingChain 30 100–300 4950–44850 Instances from an application in statistical physics

introduced in [33]

Torus 18 100–343 200–1029 2D and 3D torus instances from an application in

statistical physics, introduced in [33]

Name # nnnzDescription

QPLIB 22 120–1225 602–34876 QUBO instances from the standard benchmark software

for quadratic programs, see [15]

BQP100 10 100 471–528 Randomly generated instances

introduced in [6]

123

460 D. Rehfeldt et al.

Table 1 continued

Name # nnnz Description

BQP250 10 250 3039–3208 Randomly generated instances

introduced in [6]

BE120.3 10 120 2176–2253 Randomly generated instances

introduced in [8]

BE250 10 250 3268–3388 Randomly generated instances

introduced in [8]

GKAa-d35 20–125 204–7788 Randomly generated instances with different densities

introduced in [17]

123

Faster exact solution of sparse MaxCut and QUBO problems 461

Table 2 Average times spent in

separation and (re-) optimization

of the LP for MaxCut and

QUBO test-sets

Name # Sepa-time [%] LP-time [%]

BE120.3 10 2.579.3

BE250 10 2.884.3

BQP100 10 3.032.0

BQP250 10 2.686.2

GKAa-d35 6.749.1

IsingChain 30 0.00.0

K64-chimera 80 10.158.3

Kernel 14 1.34.8

PM1s100 10 13.178.3

QPLIB 22 18.165.5

Torus 18 16.715.5

W01100 10 12.059.3

5.1 Individual components

This section takes a look at individual algorithmic components introduced in Sects. 2

and 3.

First, we show the run time required for our improved separation of cycle inequali-

ties. Table 2reports per test-set the average (arithmetic mean) percentual time required

for the separation procedure (column four), as well as for solving the LP-relaxations

(column five). Recall that the latter is done by CPLEX 12.10, one of the leading

commercial LP-solvers. For more than half of the test-sets the average time required

for the separation is less than 10 %. Also for the remaining test-sets it is always less

than 20 %. Notably, this time also includes adding the cuts (including the triangle

inequalities) to the cut pool, which requires additional computations. The time could

be further reduced by contracting 0-weight edges in the auxiliary graph, as described

in Sect.2. Notably, both the separation time and LP-solution time are very small for

the IsingChain and Kernel instances. This behavior is due to the fact that many of

these instances are already solved during presolving, as detailed in the following,

Next, we demonstrate the strength of the reduction methods implemented for

this article. Only results for the MaxCut test-sets are reported. We show the impact

of the MaxCut reduction techniques from [32] described in Sect. 3as well as the

QUBO reduction techniques from [18]—by using the standard problem transforma-

tions between QUBO and MaxCut. We refer to the combination of these two as base

preprocessing. Additionally, the methods described in Propositions 3and 4are referred

to as new techniques. Note that Proposition 5cannot be applied, because no reduced-

costs are available.

Table 3shows in the first column the name of the test-set, followed by its number of

instances. The next columns show the percentual average number of nodes and edges

of the instances after the preprocessing without (column three and four), and with

(columns five and six) the new methods. The last two columns report the percentual

123

462 D. Rehfeldt et al.

Table 3 Average remaining size of MaxCut instances after preprocessing

Test-set # Base preprocessing +new techniques Relative change

|V|[%] |E|[%] |V|[%] |E|[%] |V|[%] |E|[%]

IsingChain 30 6.10.81.1<0.05 −82.0<−93.8

K64-chimera 80 3.14.63.14.60.00.0

Kernel 14 24.130.116.420.6−32.0−31.6

Mannino 4 64.169.363.268.7−1.4−0.9

Torus 18 80.687.578.585.2−2.6−2.6

W01100 10 99.194.899.194.80.00.0

DIMACS 4 97.098.996.998.9−0.10.0

PM1s100 10 99.799.999.799.90.00.0

Bold numbers signify a favorable impact of the new algorithms

relative change between the previous results. The run time is not reported, because it

is on all instances below 0.05s.

The new reduction techniques have an impact on five of the eight test-sets. The

strongest reductions occur on Kernel and IsingChain. We remark that the symmetry-

based reductions from Proposition 4have a very small impact, and only allow for

contracting a few dozen additional edges on Kernel. We also note that while the

IsingChain instances are already drastically reduced by the base preprocessing, the

new methods still have an important impact, as they reduce the number of edges

of several instances from more than a thousand to less than 300. The IsingChain

instances were already completely solved by reduction techniques in Tavares [39], by

using maximum-flow based methods. However the run time was up to three orders

of magnitudes larger than in our case. The machine used by Tavares had a Pentium

4 CPU at 3.60 GHz, thus being significantly slower than the machines used for this

article. Still, also when taking the different computing environments into account, the

run time difference is huge.

5.2 Exact solution

This section compares our new solver with state-of-the-art exact solvers with respect

to the mean time, the maximum time, and the number of solved instances. For the

mean time we use the shifted geometric mean [1] with a shift of 1 second. In this

section, we use only single-thread mode.

First, we provide a comparison with Gurobi 9.5. Table 4provides the results for

a time-limit of one hour. The second column shows the number of instances in the

test-set. Columns three gives the number of instances solved by Gurobi, column four

the number of instances solved by our solver. Column five and six show the mean

time taken by Gurobi and our solver. The next column gives the relative speedup of

our solver. The last three columns provide the same information for the maximum run

time, Speedups that signify an improved performance of the new solver are marked in

bold.

123

Faster exact solution of sparse MaxCut and QUBO problems 463

It can be seen that our solver consistently outperforms Gurobi 9.5—both with

respect to mean and maximum time. Also, it solves on each test-set at least as many

instances as Gurobi. The only test-set where Gurobi performs better is BQP100,

which, however, can be solved by both solvers in far less than a second.

On the other test-sets, the mean time of the new solver is better, often by large

factors (up to 60.07). On the instance sets that can both be fully solved, the maximum

time taken by the new solver is in most cases also much smaller. On five of the test-sets,

the new solver can solve more instances to optimality than Gurobi 9.5.

Next, we compare our solver with the very recent QUBO and MaxCut solver McSparse,

specialized on sparse instances. In Table 5we provide an instance-wise comparison

of our solver and McSparse. We provide the number of branch-and-bound nodes

(columns three and four) and the run times (columns five and six) of McSparse and

our solver per problem instance. We use the 14 instances that were selected in the arti-

cle by Charfreitag et al. [12] as being representative of their test-bed. The first seven

instances are MaxCut and the last seven QUBO problems. Charfreitag et al. [12] only

use one thread per run. Their results were obtained on a system with AMD EPYC

7543P CPUs at 2.8 GHz, and with 256 GB RAM. CPU benchmarks 2consider this

system to be faster than the one used in this article, already in single-thread mode.

Furthermore, McSparse is embedded into Gurobi (version 9.1), which is widely

regarded as the fastest commercial MIP-solver, whereas our solver is based on the

non-commercial SCIP, although we also use a commercial LP-solver.

As to the number of branch-and-bound nodes, the pictures is somewhat mixed—

with McSparse requiring fewer nodes on three, and more nodes on four instances.

Notably, McSparse also includes a specialized branching strategy, while we use a

simple generic one. This feature might explain the smaller number of nodes on some

instances. As to the run time, five instances can be solved in less than a second by both

solvers (with the new solver being slightly faster). On the remaining nine instances,

the new solver is always faster—for all but one instances by a factor of more than 3.

On one instance (mannino_k487b), it is even by a factor of more than 40 faster.

Finally, we also provide a few remarks concerning dense instances, although these

are not the focus of this article. It is often reported, see e.g. [37], that LP-based

approaches using odd-cycle cuts do not work well on dense instances. However,

dense instances from the literature are typically randomly generated. We have indeed

observed that our solver is not at all competitive with semidefinite-based solvers for

randomly generated dense instances with more than around 80 vertices (although

these solver typically cannot handle instances with more than 250 vertices). However,

the picture can be different on real-world instances. We exemplarily demonstrate this

behavior on two test-sets from the literature. First, we selected the test-set PW05 [40],

which consists of nine randomly generated instances with 100 vertices and 50 % den-

sity each. Second, we selected the IsingChain test-set from Table 1, which consists

of instances with up to 50 % density. We compare our solver with the state-of-the-art

semidefinite solver BiqBin. On the (randomly generated) PW05 instances, BiqBin

performs vastly better, solving all instances with a mean time of 64 s. In contrast,

our solver cannot solve any of them within one hour, and the primal-dual gaps are

2https://www.cpubenchmark.net/singleThread.html#server-thread.

123

464 D. Rehfeldt et al.

Table 4 Comparison of Gurobi 9.5 (Grb) and new solver (new)

Test-set # solved Mean time (sh. geo. mean) Maximum time

# Grb New Grb[s] New[s] Speedup Grb[s] New[s] Speedup

PM1s100 10 10 10 192.3 20.9 9.20 303.3 48.4 6.27

W01100 10 10 10 44.1 3.1 14.23 97.1 21.5 4.52

Kernel 14 14 14 0.7 0.1 7.00 14.3 1.1 13.00

IsingChain 30 30 30 1.3 <0.05 >26.00 41.0 <0.05 >820.00

Torus 18 18 18 3.8 0.4 9.50 628.0 7.6 82.63

K64-chimera 80 80 80 90.1 1.5 60.07 195.4 6.0 32.57

QPLIB 22 8 13 687.4 165.5 4.15 3600 3600 1.00

BQP100 10 10 10 0.1 0.1 1.00 0.2 0.3 0.67

BQP250 10 0 73600 610.6 5.90 3600 3600 1.00

BE120.3 10 9 10 265.6 50.1 5.30 3600 525.1 >6.86

BE250 10 0 83600 571.8 6.30 3600 3600 1.00

GKAa-d35 29 31 6.5 6.1 1.07 3600 3600 1.00

Bold numbers signify a favorable impact of the new algorithms

123

Faster exact solution of sparse MaxCut and QUBO problems 465

Table 5 Comparison of McSparse (MS) and our solver (new) on seven MaxCut and seven QUBO instances

(considered to be representative [12])

Name |V||E|# B&B nodes Run time

MS New MS[s] New[s]

pm1s_100.3 100 495 341 741 48.248.0

pw01_100.0 100 495 171 179 20.08.5

mannino_k487b 487 5391 1 15 167.34.3

bio-diseasome 516 1188 1 1 9.50.6

ca-netscience 379 914 1 1 0.10.0

g000981 110 188 1 1 0.00.0

imgseg_138032 12,736 23,664 1 1 30.53.9

Name n nnz MS NewMS[s]New[s]

bqp250-3 250 3092 25 17 414.184.1

gka2c 50 813 1 1 0.50.3

gka4d 100 2010 129 9 219.643.7

gka5c 80 721 1 1 0.10.1

gka7a 30 241 1 1 0.00.0

be120.3.5 120 2248 111 15 257.746.6

be250.3 250 3277 107 47 841.0 150.7

up to 4.4%. It should be mentioned that our solver does not perform any better on

most randomly-generated instances with 100 or more vertices from the literature. The

(real-world) IsingChain, instances, on the other hand, are all solved within 0.1s by

our solver. In contrast, the mean time of BiqBin is 144 s on these instances and four

of the instances cannot be solved within one hour. These four include instances with

density of around 50 %.

5.3 Parallelization

Although parallelization is not the main topic of this article, we still provide some

corresponding results in the following. To give insights into the strengths and weak-

nesses of our racing-based parallelization, we provide instance-wise results. We use

the test-sets Mannino and DIMACS, which both contain instances that cannot be solved

within one hour by Gurobi and our new solver in single-thread mode. The sizes of

the instances are given in Table 6.

Table 7provides results of Gurobi and the new solver on the DIMACS and Mannino

instances. Both solvers are run once with one thread and once with eight threads. As

before, a time-limit of one hour is used. The table provides the percentual primal-

dual gap, as well as the run time. The results reveal for both solvers a performance

degradation on easy instances with increased number of threads. Most notably on

mannino_k487b, where Gurobi takes almost 10 times longer with eight threads. On

the other hand, the new solver shows a strong speedup on two hard instances that cannot

123

466 D. Rehfeldt et al.

Table 6 Details on DIMACS (left) and Mannino (right) instances

Name |V||E|Name |V||E|

torusg3-8 512 1536 mannino_k487a.mc 487 1435

toruspm3-8-50 512 1536 mannino_k487b.mc 487 5391

torusg3-15 3375 10125 mannino_k487c.mc 487 8511

toruspm3-15-50 3375 10125 mannino_k48.mc 48 1128

be solved in one hour singke-threaded, namely toruspm3-8-50 and mannino_k487c.

On the latter, one even observes a super-linear speedup. This speedup can be at least

partly attributed to the exclusive use of primal heuristics on one thread during racing,

which finds an optimal solution quickly in both cases. On the other hand, in single-

thread mode the best primal solution is sub-optimal even at the time-limit.

Finally, Table 8provides results for several previously unsolved MaxCut and QUBO

benchmark instances from the QPLIB and the 7th DIMACS Challenge. We also report

the previous best known solution values (previous primal) from the literature, which

were taken from the QPLIB and the MQLib [13]. For the QPLIB instances we report

the results from the one hour single-thread run in Sect.5.2. However, for the DIMACS

instances, torusg3-15 and toruspm3-15-50, we performed additional runs. Note that

the DIMACS instances were originally intended to be solved with negated weights.

However, it seems that most publications, e.g., [13], do not perform this transformation.

Thus, we also use the unmodified instances, to allow for better comparison. However,

we additionally report the solution values of the transformed instances, these trans-

formed instances are marked by a . We used a machine with 88 cores of Intel Xeon

E7-8880 v4 CPUs @ 2.20GHz. We ran the two instances (non-exclusively) for at most

3 days while using 80 threads. Both torusg3-15 and torusg3-15could be solved to

optimality in this way, but toruspm3-15-50 and toruspm3-15-50still remained with

a primal-dual gap of 1.8 percent each.

Finally, we note that using more than 8 threads (on the above machine with 88

cores) does not provide additional speed-ups on most instances, neither for our solver

nor Gurobi. This behaviour can be put down to the fact that we parallelize mostly the

branch-and-bound search, and the number of open branch-and-bound nodes is usually

quite small on most instances.

6 Conclusion and outlook

This article has demonstrated how to design a state-of-the-art solver for sparse QUBO

and MaxCut instances, by enhancing and combining key algorithmic ingredients such

as presolving and cutting-plane generation. The newly implemented solver outper-

forms both the leading commercial and non-commercial competitors on a wide range

of test-sets from the literature. Moreover, the best known solutions to several instances

could be improved.

123

Faster exact solution of sparse MaxCut and QUBO problems 467

Table 7 Results of Gurobi 9.5 (Grb) and the new solver (new), with one (-T1) and eight (-T8) threads each

Name Primal-dual gap[%] Run time[s]

Grb-T1 Grb-T8 New-T1 New-T8 Grb-T1 Grb-T8 New-T1 New-T8

torusg3-8 0.0 0.0 0.0 0.0 1494.2 1178.5 8.5 9.3

toruspm3-8-50 1.8 1.80.5 0.0 >3600 >3600 >3600 1415.8

torusg3-15 6.8 3.41.3 0.4 >3600 >3600 >3600 >3600

toruspm3-15-50 9.5 12.22.3 2.3 >3600 >3600 >3600 >3600

mannino_k487a 0.0 0.0 0.0 0.0 3.5 10.7 1.1 1.3

mannino_k487b 0.0 0.0 0.0 0.0 9.2 80.5 4.3 2.8

mannino_k487c 0.1 0.00.1 0.0 >3600 3176.7 >3600 398.2

mannino_k48 0.0 0.0 0.0 0.0 0.1 0.4 2.7 3.8

123

468 D. Rehfeldt et al.

Table 8 Improved solutions for MaxCut (first four) and QUBO (last two) benchmark instances

Name gap [%] New primal Previous primal

torusg3-15 opt 286626481 282534518

torusg3-15opt 292031950 –

toruspm3-15-50 1.8 3010 2968

toruspm3-15-501.8 3008 –

QPLIB_3693 1.0 −1154 −1148

QPLIB_3850 1.1 −1198 −1192

Bold numbers signify that an optimal solution was found

For QUBO and MaxCut instances with not more than 10 % density, the compu-

tational results obtained for this article strongly suggest the use of our new solver.

However, for instances with 20 % or more density, using semidefinite programming

based solvers such as BiqBin [20], BiqCrunch [31] and MADAM [23] seems usually

far more promising. However, one should keep in mind that most dense instances from

the literature are randomly generated. For instances from real-world applications the

picture can be somewhat different, especially when presolving is effective (which is

not included in any of the mentioned solvers). A prominent example are the dense

Ising chain instances discussed in this article, which can all be solved in less than 0.1

seconds by our solver, but some of which cannot be solved even in one hour by a

state-of-the-art semidefinite programming based solver.

There are various promising routes for further improvement. Examples would be a

new branching strategy, or the implementation of additional separation methods [9].

In this way, a considerable further speedup of the new solver might be achieved.

Funding Open Access funding enabled and organized by Projekt DEAL. The work for this article has been

conducted in the Research Campus MODAL 646 funded by the German Federal Ministry of Education and

Research (BMBF) (fund numbers 05M14ZAM, 647 05M20ZBM).

Data availibility The data used for the computational experiments in this article is available on the websites:

https://biqmac.aau.at/biqmaclib.html.http://bqp.cs.uni-bonn.de/library/html/index.html.https://qplib.zib.

de.

Code availability The full code was made available to the reviewers

Conflict of interest The authors declare no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which

permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give

appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,

and indicate if changes were made. The images or other third party material in this article are included

in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If

material is not included in the article’s Creative Commons licence and your intended use is not permitted

by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the

References

1. Achterberg, T.: Constraint integer programming. In: Ph.D. Thesis, Technische Universität Berlin (2007)

123

Faster exact solution of sparse MaxCut and QUBO problems 469

2. Achterberg, T., Bixby, R.E., Gu, Z., Rothberg, E., Weninger, D.: Presolve reductions in mixed integer

programming. INFORMS J. Comput. 32(2), 473–506 (2020). https://doi.org/10.1287/ijoc.2018.0857

3. Barahona, F., Grötschel, M., Jünger, M., Reinelt, G.: An application of combinatorial optimization

to statistical physics and circuit layout design. Oper. Res. 36(3), 493–513 (1988). https://doi.org/10.

1287/opre.36.3.493

4. Barahona, F., Jünger, M., Reinelt, G.: Experiments in quadratic 0–1 programming. Math. Program.

44(1–3), 127–137 (1989). https://doi.org/10.1007/BF01587084

5. Barahona, F., Mahjoub, A.R.: On the cut polytope. Math. Program. 36(2), 157–173 (1986). https://doi.

org/10.1007/BF02592023

6. Beasley, J.: Heuristic algorithms for the unconstrained binary quadratic programming problem. Tech.

Rep. (1998)

7. Bestuzheva, K., Besançon, M., Chen, W.K., Chmiela, A., Donkiewicz, T., van Doornmalen, J., Eifler,

L., Gaul, O., Gamrath, G., Gleixner, A., Gottwald, L., Graczyk, C., Halbig, K., Hoen, A., Hojny, C.,

van der Hulst, R., Koch, T., Lübbecke, M., Maher, S.J., Matter, F., Mühmer, E., Müller, B., Pfetsch, M.E.,

Rehfeldt, D., Schlein, S., Schlösser, F., Serrano, F., Shinano, Y., Sofranac, B., Turner, M., Vigerske, S.,

Wegscheider, F., Wellner, P., Weninger, D., Witzig, J.: The SCIP Optimization Suite 8.0. ZIB-Report

21-41, Zuse Institute Berlin (2021). http://nbn-resolving.de/urn:nbn:de:0297-zib-85309

8. Billionnet, A., Elloumi, S.: Using a mixed integer quadratic programming solver for the unconstrained

quadratic 0–1 problem. Math. Program. 109(1), 55–68 (2007). https://doi.org/10.1007/s10107-005-

0637-9

9. Bonato, T., Jünger, M., Reinelt, G., Rinaldi, G.: Lifting and separation procedures for the cut polytope.

Math. Program. 146(1–2), 351–378 (2014). https://doi.org/10.1007/s10107-013-0688-2

10. Boros, E., Hammer, P.L., Tavares, G.: Preprocessing of unconstrained quadratic binary optimization.

Tech. Rep. (2006)

11. Burer, S., Monteiro, R.D.C., Zhang, Y.: Rank-two relaxation heuristics for MAX-CUT and

other binary quadratic programs. SIAM J. Optim. 12(2), 503–521 (2002). https://doi.org/10.1137/

S1052623400382467

12. Charfreitag, J., Jünger, M., Mallach, S., Mutzel, P.: McSparse: Exact solutions of sparse maximum cut

and sparse unconstrained binary quadratic optimization problems. In: Phillips, C.A., Speckmann, B.

eds.) Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX 2022),

online first (2022)

13. Dunning, I., Gupta, S., Silberholz, J.: What works best when? A systematic evaluation of heuristics

for max-cut and QUBO. INFORMS J. Comput. 30(3), 608–624 (2018). https://doi.org/10.1287/ijoc.

2017.0798

14. Ferizovic, D., Hespe, D., Lamm, S., Mnich, M., Schulz, C., Strash, D.: Engineering Kernelization for

Maximum Cut. In: Blelloch, G.E., Finocchi, I. (eds.) Proceedings of the Symposium on Algorithm

Engineering and Experiments, ALENEX 2020, Salt Lake City, UT, USA, January 5-6, 2020, pp. 27–41.

SIAM (2020). doi:https://doi.org/10.1137/1.9781611976007.3

15. Furini, F., Traversi, E., Belotti, P., Frangioni, A., Gleixner, A.M., Gould, N., Liberti, L., Lodi, A.,

Misener, R., Mittelmann, H.D., Sahinidis, N.V., Vigerske, S., Wiegele, A.: QPLIB: a library of quadratic

programming instances. Math. Program. Comput. 11(2), 237–265 (2019). https://doi.org/10.1007/

s12532-018-0147-4

16. Gamrath, G.: Enhanced predictions and structure exploitation in branch-and-bound. Technische Uni-

versitaet Berlin (Germany) (2020)

17. Glover, F., Kochenberger, G.A., Alidaee, B.: Adaptive memory tabu search for binary quadratic pro-

grams. Manag. Sci. 44(3), 336–345 (1998)

18. Glover, F.W., Lewis, M.W., Kochenberger, G.A.: Logical and inequality implications for reducing the

size and difficulty of quadratic unconstrained binary optimization problems. Eur. J. Oper. Res. 265(3),

829–842 (2018). https://doi.org/10.1016/j.ejor.2017.08.025

19. Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2021). https://www.gurobi.com

20. Gusmeroli, N., Hrga, T., Lužar, B., Povh, J., Siebenhofer, M., Wiegele, A.: Biqbin: A parallel branch-

and-bound solver for binary quadratic problems with linear constraints. ACM Trans. Math. Softw.

(2022). https://doi.org/10.1145/3514039

21. Hammer, P.L., Hansen, P., Simeone, B.: Roof duality, complementation and persistency in quadratic

0–1 optimization. Math. Program. 28(2), 121–155 (1984). https://doi.org/10.1007/BF02612354

22. Hammer, P.L., Rudeanu, S.: Boolean Methods in Operations Research and Related Areas. Springer

(1968)

123

470 D. Rehfeldt et al.

23. Hrga, T., Povh, J.: MADAM: a parallel exact solver for max-cut based on semidefinite programming and

ADMM. Comput. Optim. Appl. 80(2), 347–375 (2021). https://doi.org/10.1007/s10589-021-00310-

24. IBM: CPLEX (2020). https://www.ibm.com/analytics/cplex-optimizer

25. Jünger, M., Lobe, E., Mutzel, P., Reinelt, G., Rendl, F., Rinaldi, G., Stollenwerk, T.: Quantum annealing

versus digital computing: an experimental comparison. ACM J. Exp. Algorithmics (2021). https://doi.

org/10.1145/3459606

26. Jünger, M., Mallach, S.: Odd-cycle separation for maximum cut and binary quadratic optimization.

In: Bender, M.A., Svensson, O., Herman, G. (eds.) 27th Annual European Symposium on Algorithms,

ESA 2019, September 9-11, 2019, Munich/Garching, Germany, LIPIcs, vol. 144, pp. 63:1–63:13.

Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019). doi:https://doi.org/10.4230/LIPIcs.ESA.

2019.63

27. Jünger, M., Mallach, S.: Exact facetial odd-cycle separation for maximum cut and binary quadratic

optimization. INFORMS J. Comput. 33(4), 1419–1430 (2021). https://doi.org/10.1287/ijoc.2020.1008

28. Karp, R.: Reducibility among combinatorial problems. In: R. Miller, J. Thatcher (eds.) Complexity of

Computer Computations, pp. 85–103. Plenum Press (1972). doi:https://doi.org/10.1007/978-1-4684-

2001-2_9

29. Kepner, J., Gilbert, J.: Graph algorithms in the language of linear algebra. SIAM (2011). https://doi.

org/10.1137/1.9780898719918

30. Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J.

49(2), 291–307 (1970). https://doi.org/10.1002/j.1538-7305.1970.tb01770.x

31. Krislock, N., Malick, J., Roupin, F.: Biqcrunch: a semidefinite branch-and-bound method for solving

binary quadratic problems. ACM Trans. Math. Softw. (2017). https://doi.org/10.1145/3005345

32. Lange, J., Andres, B., Swoboda, P.: Combinatorial Persistency Criteria for Multicut and Max-Cut. In:

IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA,

June 16-20, 2019, pp. 6093–6102. Computer Vision Foundation / IEEE (2019). doi:https://doi.org/10.

1109/CVPR.2019.00625

33. Liers, F.: Contributions to determining exact ground-states of ising spin-glasses and to their physics.

In: Ph.D. Thesis, University of Cologne (2004)

34. Lin, J., Cai, S., Luo, C., Su, K.: A reduction based method for coloring very large graphs. In: Sierra, C.

(ed.) Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI

2017, Melbourne, Australia, August 19-25, 2017, pp. 517–523. ijcai.org (2017). https://doi.org/10.

24963/ijcai.2017/73

35. Ralphs, T., Shinano, Y., Berthold, T., Koch, T.: Parallel Solvers for Mixed Integer Linear Optimization,

pp. 283–336. Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-

63516-3_8

36. Rehfeldt, D., Koch, T.: Implications, Conflicts, and Reductions for Steiner Trees. In: Singh, M.,

Williamson, D.P. (eds.) Integer Programming and Combinatorial Optimization - 22nd International

Conference, IPCO 2021, Atlanta, GA, USA, May 19-21, 2021, Proceedings, Lecture Notes in Com-

puter Science, vol. 12707, pp. 473–487. Springer (2021). https://doi.org/10.1007/978-3-030-73879-

2_33

37. Rendl, F., Rinaldi, G., Wiegele, A.: Solving max-cut to optimality by intersecting semidefinite and

polyhedral relaxations. Math. Program. 121(2), 307–335 (2010). https://doi.org/10.1007/s10107-008-

0235-8

38. Shinano, Y., Achterberg, T., Berthold, T., Heinz, S., Koch, T., Winkler, M.: Solving open MIP instances

with parascip on supercomputers using up to 80, 000 cores. In: 2016 IEEE International Parallel and

Distributed Processing Symposium, IPDPS 2016, Chicago, IL, USA, May 23-27, 2016, pp. 770–779.

IEEE Computer Society (2016). doi:10.1109/IPDPS.2016.56

39. Tavares, G.: New algorithms for quadratic unconstrained binary optimization (qubo) with applications

in engineering and social sciences. In: Ph.D. Thesis, Rutgers, the State University of New Jersey-New

Brunswick (2008)

40. Wiegele, A.: BiqMac Library: A collection of Max-Cut and quadratic 0-1 programming instances of

medium size. Tech. Rep. (2007)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps

and institutional affiliations.

123