Nonlinear models¶
The abstraction for nonlinear models is independent of both the solver and the user’s representation of the problem, whether using an algebraic modeling language or customized low-level code.
The diagram below illustrates MathProgBase as the connection between typical nonlinear programming (NLP) solvers IPOPT, MOSEK, and KNITRO, and modeling languages such as JuMP and AMPL (via AmplNLReader.jl).
This structure also makes it easy to connect solvers written in Julia itself with user-provided instances in a variety of formats.
We take the prototypical format for a nonlinear problem as
Where \(x \in \mathbb{R}^n, f: \mathbb{R}^n \to \mathbb{R}, g: \mathbb{R}^n \to \mathbb{R}^m\), and vectors \(lb \in (\mathbb{R} \cup \{-\infty\})^m, ub \in (\mathbb{R} \cup \{\infty\})^m,l \in (\mathbb{R} \cup \{-\infty\})^n, u \in (\mathbb{R} \cup \{\infty\})^n\).
The objective function \(f\) and constraint function \(g\) may be nonlinear and nonconvex, but are typically expected to be twice differentiable.
Below we describe the interface for AbstractNonlinearModel.
-
loadproblem!(m::AbstractNonlinearModel, numVar, numConstr, l, u, lb, ub, sense, d::AbstractNLPEvaluator)¶ Loads the nonlinear programming problem into the model. The parameter numVar is the number of variables in the problem,
numConstris the number of constraints,lcontains the variable lower bounds,ucontains the variable upper bounds,lbcontains the constraint lower bounds, andubcontains the constraint upper bounds. Sense contains the symbol:Maxor:Min, indicating the direction of optimization. The final parameterdis an instance of anAbstractNLPEvaluator, described below, which may be queried for evaluating \(f\) and \(g\) and their corresponding derivatives.
The abstract type AbstractNLPEvaluator is used by solvers for accessing the objective function \(f\) and constraints \(g\). Solvers may query the value, gradients, Hessian-vector products, and the Hessian of the Lagrangian.
-
initialize(d::AbstractNLPEvaluator, requested_features::Vector{Symbol})¶ Must be called before any other methods. The vector
requested_featureslists features requested by the solver. These may include:Gradfor gradients of \(f\),:Jacfor explicit Jacobians of \(g\),:JacVecfor Jacobian-vector products,:HessVecfor Hessian-vector and Hessian-of-Lagrangian-vector products,:Hessfor explicit Hessians and Hessian-of-Lagrangians, and:ExprGraphfor expression graphs.
-
features_available(d::AbstractNLPEvaluator)¶ Returns the subset of features available for this problem instance, as a list of symbols in the same format as in
initialize.
-
eval_f(d::AbstractNLPEvaluator, x)¶ Evaluate \(f(x)\), returning a scalar value.
-
eval_g(d::AbstractNLPEvaluator, g, x)¶ Evaluate \(g(x)\), storing the result in the vector
gwhich must be of the appropriate size.
-
eval_grad_f(d::AbstractNLPEvaluator, g, x)¶ Evaluate \(\nabla f(x)\) as a dense vector, storing the result in the vector
gwhich must be of the appropriate size.
-
jac_structure(d::AbstractNLPEvaluator)¶ Returns the sparsity structure of the Jacobian matrix \(J_g(x) = \left[ \begin{array}{c} \nabla g_1(x) \\ \nabla g_2(x) \\ \vdots \\ \nabla g_m(x) \end{array}\right]\) where \(g_i\) is the \(i\text{th}\) component of \(g\). The sparsity structure is assumed to be independent of the point \(x\). Returns a tuple
(I,J)whereIcontains the row indices andJcontains the column indices of each structurally nonzero element. These indices are not required to be sorted and can contain duplicates, in which case the solver should combine the corresponding elements by adding them together.
-
hesslag_structure(d::AbstractNLPEvaluator)¶ Returns the sparsity structure of the Hessian-of-the-Lagrangian matrix \(\nabla^2 f + \sum_{i=1}^m \nabla^2 g_i\) as a tuple
(I,J)whereIcontains the row indices andJcontains the column indices of each structurally nonzero element. These indices are not required to be sorted and can contain duplicates, in which case the solver should combine the corresponding elements by adding them together. Any mix of lower and upper-triangular indices is valid. Elements(i,j)and(j,i), if both present, should be treated as duplicates.
-
eval_jac_g(d::AbstractNLPEvaluator, J, x)¶ Evaluates the sparse Jacobian matrix \(J_g(x) = \left[ \begin{array}{c} \nabla g_1(x) \\ \nabla g_2(x) \\ \vdots \\ \nabla g_m(x) \end{array}\right]\). The result is stored in the vector
Jin the same order as the indices returned byjac_structure.
-
eval_jac_prod(d::AbstractNLPEvaluator, y, x, w)¶ Computes the Jacobian-vector product \(J_g(x)w\), storing the result in the vector
y.
-
eval_jac_prod_t(d::AbstractNLPEvaluator, y, x, w)¶ Computes the Jacobian-transpose-vector product \(J_g(x)^Tw\), storing the result in the vector
y.
-
eval_hesslag_prod(d::AbstractNLPEvaluator, h, x, v, σ, μ)¶ Given scalar weight
σand vector of constraint weightsμ, computes the Hessian-of-the-Lagrangian-vector product \(\left(\sigma\nabla^2 f(x) + \sum_{i=1}^m \mu_i \nabla^2 g_i(x)\right)v\), storing the result in the vectorh.
-
eval_hesslag(d::AbstractNLPEvaluator, H, x, σ, μ)¶ Given scalar weight
σand vector of constraint weightsμ, computes the sparse Hessian-of-the-Lagrangian matrix \(\sigma\nabla^2 f(x) + \sum_{i=1}^m \mu_i \nabla^2 g_i(x)\), storing the result in the vectorHin the same order as the indices returned byhesslag_structure.
-
isobjlinear(d::AbstractNLPEvaluator)¶ trueif the objective function is known to be linear,falseotherwise.
-
isobjquadratic(d::AbstractNLPEvaluator)¶ trueif the objective function is known to be quadratic (convex or nonconvex),falseotherwise.
-
isconstrlinear(d::AbstractNLPEvaluator, i)¶ trueif the \(i\text{th}\) constraint is known to be linear,falseotherwise.
-
obj_expr(d::AbstractNLPEvaluator)¶ Returns an expression graph for the objective function as a standard Julia
Exprobject. All sums and products are flattened out as simpleExpr(:+,...)andExpr(:*,...)objects. The symbolxis used as a placeholder for the vector of decision variables. No other undefined symbols are permitted; coefficients are embedded as explicit values. For example, the expression \(x_1+\sin(x_2/\exp(x_3))\) would be represented as the Julia object:(x[1] + sin(x[2]/exp(x[3]))). See the Julia manual for more information on the structure ofExprobjects. There are currently no restrictions on recognized functions; typically these will be built-in Julia functions like^,exp,log,cos,tan,sqrt, etc., but modeling interfaces may choose to extend these basic functions.
-
constr_expr(d::AbstractNLPEvaluator, i)¶ Returns an expression graph for the \(i\text{th}\) constraint in the same format as described above. The head of the expression is
:comparison, indicating the sense of the constraint. The right-hand side of the comparison must be a constant; that is,:(x[1]^3 <= 1)is allowed, while:(1 <= x[1]^3)is not valid. Double-sided constraints are allowed, in which case both the lower bound and upper bounds should be constants; for example,:(-1 <= cos(x[1]) + sin(x[2]) <= 1)is valid.
Nonlinear solvers may also provide optimal Lagrange multipliers if available through getreducedcosts and getconstrduals.
-
getreducedcosts(m::AbstractNonlinearModel)¶ Returns the dual solution vector corresponding to the variable bounds, known as the reduced costs. Not available when integer variables are present.
-
getconstrduals(m::AbstractNonlinearModel)¶ Returns the dual solution vector corresponding to the constraints. Not available when integer variables are present.