Constraints#

A constraint in Gurobi captures a restriction on the values that a set of variables may take. The simplest example is a linear constraint, which states that a linear expression on a set of variables takes a value that is less-than-or-equal, greater-than-or-equal, or equal to another linear expression. More complicated constraints are also supported, including quadratic constraints (e.g., \(x^2 + y^2 \leq z^2\)), logical constraints (e.g., logical AND on binary variables, if-then, etc.), and a few non-linear functions (e.g., \(y = \sin(x)\)).

We now consider a few more details about linear, SOS, quadratic (both convex and non-convex), and general constraints. General constraints are the catch-all we use for the constraint types that don’t fit in the other categories.

Recall that Gurobi works in finite-precision arithmetic, so constraints are only satisfied to tolerances. Tolerances can be tightened to reduce such violations, but there are limits to how small the violations can be – errors are inherent in floating-point arithmetic. This point will be reiterated in several places in this section.

Linear Constraints#

A linear constraint allows you to restrict the value of a linear expression. For example, you may require that any feasible solution satisfies the constraint \(3 x + 4 y \leq 5z\). Note that the matrix-oriented Gurobi APIs (C, MATLAB, and R) require the right-hand side of a linear constraint to be a constant, while the object-oriented APIs (C++, Java, .NET, and Python) allow arbitrary linear expressions on both sides of the comparator.

The computed solution should satisfy the stated constraint to within FeasibilityTol (although it may not in cases of numerical ill-conditioning – we’ll discuss this shortly).
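A minimal sketch of stating such a constraint through the Python API (the model and variable names are illustrative, not part of the example above):

```python
import gurobipy as gp

m = gp.Model("linear_example")
x = m.addVar(name="x")
y = m.addVar(name="y")
z = m.addVar(name="z")

# The object-oriented APIs accept linear expressions on both sides
# of the comparator, so 3x + 4y <= 5z can be written directly.
m.addConstr(3 * x + 4 * y <= 5 * z, name="c0")
```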

Gurobi supports a limited set of comparators. Specifically, you can constrain an expression to be less-than-or-equal, greater-than-or-equal, or equal to another. We do not support strict less-than, strict greater-than, or not-equal comparators. While these comparators may seem appropriate for mathematical programming, we exclude them to avoid potential confusion related to numerical tolerances. Consider a simple example of a strict inequality constraint on a pair of continuous variables: \(x > y\). How large would \(x-y\) need to be in order to satisfy the constraint? Rather than trying to embed a subtle and potentially confusing strategy for handling such constraints into the solver, we have chosen instead not to support them.

SOS Constraints#

A Special-Ordered Set, or SOS constraint, is a highly specialized constraint that places restrictions on the values that variables in a given list can take. There are two types of SOS constraints. In an SOS constraint of type 1 (an SOS1 constraint), at most one variable in the specified list is allowed to take a non-zero value. In an SOS constraint of type 2 (an SOS2 constraint), at most two variables in the specified, ordered list are allowed to take a non-zero value, and those non-zero variables must be contiguous in the list. The variables in an SOS constraint can be continuous, integer, or binary.

Again, tolerances play an important role in SOS constraints. Specifically, variables that take values less than IntFeasTol (in absolute value) are considered to be zero for the purposes of determining whether an SOS constraint is satisfied.

An SOS constraint is described using a list of variables and a list of corresponding weights. While the weights have historically had intuitive meanings associated with them, we simply use them to order the list of variables. The weights should be unique. This is especially important for an SOS2 constraint, which relies on the notion of contiguous variables. Since the variables in the SOS are ordered by weight, contiguity becomes ambiguous when multiple variables have the same weight.
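To make the weight-based ordering concrete, here is a small sketch using the Python API's addSOS method (all data values are illustrative):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("sos_example")
x = m.addVars(3, ub=10.0, name="x")

# SOS1: at most one of x[0], x[1], x[2] may be non-zero.
# The weights (1, 2, 3) only order the variables and should be unique.
m.addSOS(GRB.SOS_TYPE1, [x[0], x[1], x[2]], [1, 2, 3])

# SOS2: at most two variables may be non-zero, and any two non-zero
# variables must be adjacent in the weight-ordered list.
m.addSOS(GRB.SOS_TYPE2, [x[0], x[1], x[2]], [1, 2, 3])
```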

It is often more efficient to capture SOS structure using linear constraints rather than SOS constraints. The optimizer will often perform this reformulation automatically. This is controlled with four parameters: PreSOS1BigM, PreSOS1Encoding, PreSOS2BigM and PreSOS2Encoding.

The reformulation adds constraints of the form \(x \leq M b\), where \(x\) is the variable that participates in the SOS constraint, \(b\) is a binary variable, and \(M\) is an upper bound on the value of variable \(x\). Large values of \(M\) can lead to numerical issues. The two parameters PreSOS1BigM and PreSOS2BigM control the maximum value of \(M\) that can be introduced by this reformulation. SOS constraints that would require a larger value aren’t converted. Setting one of these parameters to 0 disables the corresponding reformulation.

Additionally, there are several known integer formulations for SOS1 and SOS2 constraints. These reformulations differ in the number of variables that they introduce to the problem, in the complexity of the resulting LP relaxations, and in their properties in terms of branching and cutting planes. The two parameters PreSOS1Encoding and PreSOS2Encoding control the choice of the reformulation performed.
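If you want to influence this reformulation, the relevant parameters can be set before optimizing; the specific values below are only illustrative:

```python
import gurobipy as gp

m = gp.Model("sos_reformulation")
# ... SOS constraints added as in the previous sketch ...

# Convert SOS1 constraints to a binary formulation only when the
# required big-M value is at most 1e5; setting 0 disables the conversion.
m.Params.PreSOS1BigM = 1e5

# Choose among the available integer encodings (see the parameter
# documentation for the meaning of each value).
m.Params.PreSOS1Encoding = 2
```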

Quadratic Constraints#

A quadratic constraint allows you to restrict the value of a quadratic expression. For example, you may require that any feasible solution satisfies the constraint \(3 x^2 + 4 y^2 + 5 z \leq 10\). Note that the matrix-oriented Gurobi APIs (C, MATLAB, and R) require the right-hand side of a quadratic constraint to be a constant, while the object-oriented APIs (C++, Java, .NET, and Python) allow arbitrary quadratic expressions on both sides of the comparator.

The computed solution should satisfy the stated constraint to within FeasibilityTol. Quadratic constraints are often much more challenging to satisfy than linear constraints, so tightening the parameter may increase runtimes dramatically.
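A brief sketch of stating this constraint in the Python API (illustrative names; the tolerance value is only an example):

```python
import gurobipy as gp

m = gp.Model("qcp_example")
x = m.addVar(name="x")
y = m.addVar(name="y")
z = m.addVar(name="z")

# The quadratic constraint 3x^2 + 4y^2 + 5z <= 10, built with
# overloaded arithmetic operators.
m.addConstr(3 * x * x + 4 * y * y + 5 * z <= 10, name="qc0")

# Tightening FeasibilityTol below its 1e-6 default can increase
# runtimes noticeably on quadratically constrained models.
m.Params.FeasibilityTol = 1e-8
```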

Gurobi can handle both convex and non-convex quadratic constraints. However, there are some subtle and important differences in how the different constraint types are handled. In general, it is much easier to solve a model whose constraints all have convex feasible regions. It is actually quite difficult to recognize all such cases, but the following forms are always recognized:

  • \(x^TQx + q^Tx \le b\), where \(Q\) is Positive Semi-Definite (PSD)

  • \(x^TQx \le y^{2}\), where \(Q\) is Positive Semi-Definite (PSD), \(x\) is a vector of variables, and \(y\) is a non-negative variable (a Second-Order Cone constraint when \(Q = I\), the identity matrix)

  • \(x^TQx \le y z\), where \(Q\) is Positive Semi-Definite (PSD), \(x\) is a vector of variables, and \(y\) and \(z\) are non-negative variables (a rotated Second-Order Cone constraint when \(Q = I\), the identity matrix)

To be more precise, a quadratic constraint will always be recognized as convex if presolve is able to transform it into one of these forms. Note that if all quadratic terms in a quadratic constraint contain at least one binary variable, then presolve will always be able to transform it to a convex form.

Why distinguish between convex and non-convex quadratic constraints? In some situations you may know that your problem should be convex, in which case a failure to recognize the model as convex may be a sign of a modeling error. To avoid accidentally solving a much harder problem than you intended, you can set the NonConvex parameter to either 0 or 1. With the default setting of -1, or with the NonConvex parameter set to 2, Gurobi accepts arbitrary quadratic constraints and attempts to solve the resulting model using the appropriate algorithm.
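A sketch of using this safeguard (the bilinear constraint is deliberately non-convex; the exact error message may vary):

```python
import gurobipy as gp

m = gp.Model("convexity_check")
x = m.addVar(name="x")
y = m.addVar(name="y")
z = m.addVar(name="z")

# x*y is a bilinear, non-convex term. With NonConvex=0, Gurobi
# reports an error instead of silently solving the harder problem.
m.addConstr(x * y <= z, name="bilinear")
m.Params.NonConvex = 0
try:
    m.optimize()
except gp.GurobiError as e:
    print(f"Rejected as non-convex: {e}")
```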

Note that other non-convex quadratic solvers often only find locally optimal solutions. The algorithms in Gurobi explore the entire search space, so they provide a globally valid lower bound on the optimal objective value, and given enough time they will find a globally optimal solution (subject to tolerances).

We would like to note a subtle point here regarding terminology. A quadratic constraint that involves only products of disjoint pairs of variables is often called a bilinear constraint, and a model that contains bilinear constraints is often called a bilinear program. Bilinear constraints are a special case of non-convex quadratic constraints, and the algorithms Gurobi uses to handle the latter are also well suited to solving bilinear programming problems.

General Constraints#

Gurobi includes an additional set of higher-level constraints, which we collectively refer to as general constraints. These constraints allow you to directly model complex relationships between variables. We think of them as belonging to three types that are treated differently by Gurobi: simple constraints, function constraints, and nonlinear constraints.

Simple constraints are a modeling convenience. They allow you to state fairly simple relationships between variables (min, max, absolute value, logical OR, etc.). Techniques for translating these constraints into lower-level modeling objects (typically using auxiliary binary variables and linear or SOS constraints) are well known and can be found in optimization modeling textbooks. By automating these translations and removing the need for the modeler to perform them, the hope is that these simple general constraints will allow you to write more readable and more maintainable models.

The other two types allow you to state relationships between variables in the model defined by a nonlinear function \(y=f(x)\); they differ in the generality of the functions they can represent. With function constraints, you can only state univariate functions \(f: \mathbf{R} \rightarrow \mathbf{R}\) chosen from a predefined list. With nonlinear constraints, you can state multivariate functions \(f: \mathbf{R}^n \rightarrow \mathbf{R}\) that arise from the composition of various operations, possibly involving multiple variables.

Nonlinear constraints are the more general form and can be used to represent all function constraints (as well as linear and quadratic constraints). Depending on the API you use, writing nonlinear constraints may be more involved than writing function constraints. In Python, nonlinear constraints can be expressed naturally and are generally the more convenient choice except in very particular cases.

Note

As with non-convex quadratic constraints, Gurobi solves models containing nonlinear constraints to global optimality. Note that this differs from solvers that seek a locally optimal solution and do not aim to prove global optimality.

Simple Constraints#

Gurobi supports the following simple general constraints, each with its own syntax and semantics:

  • MAX constraint: The constraint \(r = \max\{x_1,\ldots,x_k,c\}\) states that the resultant variable \(r\) should be equal to the maximum of the operand variables \(x_1,\ldots,x_k\) and the constant \(c\). For example, a solution \((r=3, x_1=2, x_2=3, x_3=0)\) would be feasible for the constraint \(r = \max\{x_1,x_2,x_3,1.7\}\) because \(3\) is indeed the maximum of \(2\), \(3\), \(0\), and \(1.7\).

  • MIN constraint: Similar to a MAX constraint, the constraint \(r = \min\{x_1,\ldots,x_k,c\}\) states that the resultant variable \(r\) should be equal to the minimum of the operand variables \(x_1,\ldots,x_k\) and the constant \(c\).

  • ABS constraint: The constraint \(r = \mbox{abs}\{x\}\) states that the resultant variable \(r\) should be equal to the absolute value of the operand variable \(x\). For example, a solution \((r=3, x=-3)\) would be feasible for the constraint \(r = \mbox{abs}\{x\}\).

  • AND constraint: The constraint \(r = \mbox{and}\{x_1,\ldots,x_k\}\) states that the binary resultant variable \(r\) should be \(1\) if and only if all of the binary operand variables \(x_1,\ldots,x_k\) are equal to \(1\). For example, a solution \((r=1, x_1=1, x_2=1, x_3=1)\) would be feasible for the constraint \(r = \mbox{and}\{x_1,x_2,x_3\}\). Note that any involved variables that are not already binary are converted to binary.

  • OR constraint: Similar to an AND constraint, the constraint \(r = \mbox{or}\{x_1,\ldots,x_k\}\) states that the binary resultant variable \(r\) should be \(1\) if and only if at least one of the binary operand variables \(x_1,\ldots,x_k\) is equal to \(1\). Note that any involved variables that are not already binary are converted to binary.

  • NORM constraint: The constraint \(r = \mbox{norm}\{x_1,\ldots,x_k\}\) states that the resultant variable \(r\) should be equal to the vector norm of the operand variables \(x_1,\ldots,x_k\). A few options are available: the 0-norm, 1-norm, 2-norm, and infinity-norm.

  • INDICATOR constraints: An indicator constraint \(y = f \rightarrow a^Tx \leq b\) states that if the binary indicator variable \(y\) is equal to \(f\) in a given solution, where \(f \in \{0,1\}\), then the linear constraint \(a^Tx \leq b\) has to be satisfied. On the other hand, if \(y \neq f\) (i.e., \(y = 1-f\)) then the linear constraint may be violated. Note that the sense of the linear constraint can also be \(=\) or \(\geq\); refer to this earlier section for a more detailed description of linear constraints. Note also that declaring an INDICATOR constraint implicitly declares the indicator variable to be of binary type.

  • Piecewise-linear constraints: A piecewise-linear constraint \(y = f(x)\) states that the point \((x, y)\) must lie on the piecewise-linear function \(f\) defined by a set of points \((x_1, y_1), (x_2, y_2), ..., (x_n, y_n)\). Refer to the description of piecewise-linear objectives for details of how piecewise-linear functions are defined.

Note that adding any of these constraints to an otherwise continuous model will transform it into a MIP.
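To illustrate a few of these, here is a sketch using the corresponding gurobipy helper methods (all names and data are illustrative):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("simple_genconstr")
x = m.addVars(3, lb=-5.0, ub=5.0, name="x")
b = m.addVars(3, vtype=GRB.BINARY, name="b")
r = m.addVar(lb=-GRB.INFINITY, name="r")
flag = m.addVar(vtype=GRB.BINARY, name="flag")

# r = max{x0, x1, x2, 1.7}
m.addGenConstrMax(r, [x[0], x[1], x[2]], constant=1.7, name="max")

# flag = b0 AND b1 AND b2
m.addGenConstrAnd(flag, [b[0], b[1], b[2]], name="and")

# flag = 1  ->  x0 + x1 <= 4
m.addGenConstrIndicator(flag, 1, x[0] + x[1] <= 4.0, name="ind")
```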

As stated above, each simple general constraint has an equivalent MIP formulation that consists of linear and SOS constraints, and possibly auxiliary variables. Thus, you could always model such constraints yourself without using a Gurobi general constraint. For example, the MAX constraint \(r = \max\{x_1,\ldots,x_k,c\}\) can be modeled as follows:

\[\begin{split}\begin{array}{rcll} r & = & x_j + s_j & \mbox{ for all } j = 1,\ldots,k \\ r & = & c + s_{k+1} & \\ z_1 + \ldots + z_{k+1} & = & 1 & \\ SOS1(s_j, z_j) & & & \mbox{ for all } j = 1,\ldots,k+1 \\ s_j & \geq & 0 & \mbox{ for all } j = 1,\ldots,k+1 \\ z_j & \in & \{0,1\} & \mbox{ for all } j = 1,\ldots,k+1 \end{array}\end{split}\]

The first two constraints state that \(r \geq \max\{x_1,\ldots,x_k,c\}\), i.e., that the resultant variable \(r\) must be at least as large as each of the operand variables \(x_j\) and the constant \(c\). This could be modeled using inequalities, but the explicit slack variables play an important role in the constraints that follow.

The next two constraints enforce \(r \leq \max\{x_1,\ldots,x_k,c\}\), which ensures that \(r\) is equal to the MAX expression. Enforcing this side of the equality is actually a lot more complicated. We need to introduce binary auxiliary variables \(z_j \in \{0,1\}\) and SOS1 constraints to require that at most one of the two variables \(s_j\) and \(z_j\) can be non-zero, which models the implication \(z_j = 1 \rightarrow s_j = 0\). Due to the third constraint, one \(z_j\) will be equal to \(1\) and thus at least one \(s_j\) will be zero. Hence, \(r = x_j\) for at least one \(j\) due to the first constraint, or \(r = c\) due to the second constraint.
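For concreteness, here is a sketch of building this formulation by hand in the Python API for \(k = 2\) operands and \(c = 1.7\) (all names and bounds are illustrative):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("max_by_hand")
c = 1.7
x = m.addVars(2, lb=0.0, ub=10.0, name="x")
r = m.addVar(lb=-GRB.INFINITY, name="r")
s = m.addVars(3, lb=0.0, name="s")            # slacks s_j >= 0
z = m.addVars(3, vtype=GRB.BINARY, name="z")

# r = x_j + s_j for each operand, and r = c + s_{k+1}
m.addConstrs((r == x[j] + s[j] for j in range(2)), name="link")
m.addConstr(r == c + s[2], name="link_const")

# Exactly one z_j is 1; SOS1(s_j, z_j) then forces that s_j to 0,
# so r equals the corresponding operand (or the constant c).
m.addConstr(z.sum() == 1, name="pick")
for j in range(3):
    m.addSOS(GRB.SOS_TYPE1, [s[j], z[j]], [1, 2])
```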

Tolerances play a role in general constraints, although as you might expect, the exact role depends on the constraint type. As a general rule, violations in the resultant will be smaller than the feasibility tolerance, and integrality violations in integer resultants will also satisfy the integrality tolerance.

By most measures, general constraints are just a means of concisely capturing relationships between variables while removing the burden of creating an equivalent MIP formulation. However, general constraints have another potential advantage: Gurobi might be able to simplify the MIP formulation if it can prove during presolve that the simplified version suffices for the correctness of the model. For this reason, Gurobi might be able to produce a smaller or tighter representation of the general constraint than you would get from the most general formulation. For example, it might be the case that \(r \leq \max\{x_1,\ldots,x_k,c\}\) is already implied by the other constraints in the model, so that a simple set of inequalities

\[\begin{split}\begin{array}{rcll} r & \geq & x_j \;\;\mbox{ for all } j = 1,\ldots,k \\ r & \geq & c \end{array}\end{split}\]

describing \(r \geq \max\{x_1,\ldots,x_k,c\}\) suffices to model the relevant part of the MAX constraint.

Norm Constraint#

The norm constraint introduces a few complications that are important to be aware of. As mentioned above, this constraint allows you to set one variable equal to the norm of a vector of variables. A few norms are available. The L1 norm is equal to the sum of the absolute values of the operand variables. The L-infinity norm is equal to the maximum absolute value of any operand. The L2 norm is equal to the square root of the sum of the squares of the operands. The L0 norm counts the number of non-zero values among the operands.

Regarding the L2 norm, one obvious complication comes from the fact that enforcing it requires a quadratic constraint. If your model only ever bounds the result from above (e.g., \(r = ||x|| \leq 1\)), then the resulting constraint will be convex, and if your model is otherwise convex, the resulting model will be a (convex) QCP. However, if you bound the result from below (e.g., \(r = ||x|| \geq y\)), adding the L2 norm constraint will lead to a non-convex QCP model, which will typically be significantly harder to solve.

Regarding the L0 norm, note that results obtained with this constraint can be counter-intuitive. This is a consequence of the fact that for nearly any feasible solution with a variable at exactly 0, you can add a small value into that variable while still satisfying all associated constraints to tolerances. The net result is that a lower bound on the L0 norm is often satisfied by “cheating” - by setting enough variables to values that are slightly different from zero. We strongly recommend that you only bound the result from above. That is, you should avoid using the resultant in situations where the model incentivizes a larger value. This would include situations where the objective coefficient is negative, as well as situations where a larger value for the variable could help to satisfy a constraint (e.g., a greater-than constraint where the resultant appears with a positive coefficient).
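A sketch of the recommended usage, bounding the resultant only from above (names and bounds are illustrative):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("norm_example")
x = m.addVars(3, lb=-1.0, ub=1.0, name="x")
r = m.addVar(name="r")

# r = ||x||_2; since r is only bounded from above, the resulting
# quadratic constraint stays convex.
m.addGenConstrNorm(r, [x[0], x[1], x[2]], 2.0, name="l2")
m.addConstr(r <= 1.0, name="cap")
```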

Function Constraints#

Gurobi supports the following function constraints, each with somewhat different syntax and semantics (\(x\) and \(y\) below are Gurobi decision variables, and other terms are constants provided as input when the constraint is added to the model):

  • Polynomial: \(y = p_0 x^n + p_1 x^{n-1} + ... + p_{n-1} x + p_n\)

  • Natural exponential: \(y = \exp(x)\) or \(y = e^x\)

  • Exponential: \(y = a^x\), where \(a > 0\) is the base for the exponential function

  • Natural logarithm: \(y = \log_e(x)\) or \(y = \ln(x)\)

  • Logarithm: \(y = \log_a(x)\), where \(a > 0\) is the base for the logarithmic function

  • Logistic: \(y = \frac{1}{1 + \exp(-x)}\) or \(y = \frac{1}{1 + e^{-x}}\)

  • Power: \(y = x^a\), where \(x \geq 0\) for any fractional \(a\) and \(x > 0\) for \(a < 0\)

  • Sine: \(y = \sin(x)\)

  • Cosine: \(y = \cos(x)\)

  • Tangent: \(y = \tan(x)\)

Function constraints can either be treated directly by our spatial branch-and-bound or solved as a MIP after the corresponding constraints are replaced by a (static) piecewise-linear approximation. This choice can be controlled globally with the FuncNonlinear parameter, or for each constraint individually with the FuncNonlinear attribute. If a static piecewise-linear approximation is chosen, various attributes and parameters can be used to control its accuracy. The section Solution Strategies discusses these choices: Dynamic Piecewise-Linear Approximation gives an overview of the spatial branch-and-bound approach, and Static Piecewise-Linear Approximation covers the static piecewise-linear approximation.
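As a sketch, here is a logarithm constraint added through the Python API, with the treatment selected per constraint (setting the attribute to 1 requests the direct treatment, 0 the static approximation):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("func_example")
x = m.addVar(lb=0.1, ub=10.0, name="x")
y = m.addVar(lb=-GRB.INFINITY, name="y")

# y = log_e(x), stated as a function constraint.
gc = m.addGenConstrLog(x, y, name="log")

# Treat this constraint directly with spatial branch-and-bound
# rather than via a static piecewise-linear approximation.
gc.FuncNonlinear = 1
```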

Nonlinear Constraints#

We use the term nonlinear constraints for relationships of the form \(y = f(x)\), where \(x\) is a vector of \(n\) Gurobi variables, \(y\) is a single Gurobi variable, and \(f\) is a function \(f: \mathbf{R}^n \rightarrow \mathbf{R}\). The function \(f\) may be composed of arithmetic operations (addition, subtraction, multiplication, division) and various univariate nonlinear functions. Essentially the same univariate functions are available as for the function constraints described above; the difference is that here they can be combined with each other and with the arithmetic operations to represent compound multivariate functions.

Internally, the function \(f\) is represented by an expression tree. Such a tree encodes how the function \(f\) is composed of various elementary operations. The leaves of the tree are either Gurobi variables or constants, the internal nodes are operations applied on the children of each node.

In our Python API, you can write nonlinear functions naturally using arithmetic operators and nonlinear function helpers. You can refer to the documentation of NLExpr for the details. In other APIs, you have to construct the expression tree that represents the function \(f\). This is considered advanced usage. You can refer to the section Expression Trees for an explanation of how to build those trees.
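A sketch of the natural Python syntax, assuming the nlfunc helper module shipped with recent gurobipy releases:

```python
import gurobipy as gp
from gurobipy import GRB, nlfunc

m = gp.Model("nl_example")
x = m.addVar(lb=-1.0, ub=1.0, name="x")
z = m.addVar(lb=-1.0, ub=1.0, name="z")
y = m.addVar(lb=-GRB.INFINITY, name="y")

# y = sin(x) * exp(z): a compound multivariate function written
# directly with operators and nonlinear function helpers.
m.addConstr(y == nlfunc.sin(x) * nlfunc.exp(z), name="nlc")
```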

In the table Operations in nonlinear constraints below, we give a succinct list of the various operations that can be used in nonlinear constraints. We describe each operation by the corresponding mathematical function. The arguments of the functions are denoted by \(u\) and \(v\); they may be Gurobi variables, but may also be constants or other parts of the expression. The last column of the table gives the corresponding operation code in the expression tree. You can refer to those operation codes for the details of each operation.

Operations in nonlinear constraints#

| Operation name      | Formula                                      | OPCODE           |
|---------------------|----------------------------------------------|------------------|
| Addition            | \(\sum_{i=1}^n v_i\)                         | OPCODE_PLUS      |
| Subtraction         | \(u - v\)                                    | OPCODE_MINUS     |
| Negation            | \(-v\)                                       | OPCODE_UMINUS    |
| Multiplication      | \(v_1 \times v_2 \times \ldots \times v_n\)  | OPCODE_MULTIPLY  |
| Division            | \(\frac{u}{v}\)                              | OPCODE_DIVIDE    |
| Square              | \(v^2\)                                      | OPCODE_SQUARE    |
| Square root         | \(\sqrt{v}\)                                 | OPCODE_SQRT      |
| Natural exponential | \(\exp(v)\) or \(e^v\)                       | OPCODE_EXP       |
| Natural logarithm   | \(\log_e(v)\) or \(\ln(v)\)                  | OPCODE_LOG       |
| Binary logarithm    | \(\log_2(v)\)                                | OPCODE_LOG2      |
| Decimal logarithm   | \(\log_{10}(v)\)                             | OPCODE_LOG10     |
| Power               | \(u^v\)                                      | OPCODE_POW       |
| Sine                | \(\sin(v)\)                                  | OPCODE_SIN       |
| Cosine              | \(\cos(v)\)                                  | OPCODE_COS       |
| Tangent             | \(\tan(v)\)                                  | OPCODE_TAN       |
| Logistic            | \(\frac{1}{1 + \exp(-v)}\)                   | OPCODE_LOGISTIC  |