Close Menu
    Facebook LinkedIn YouTube WhatsApp X (Twitter) Pinterest
    Trending
    • Breville Promo Code: $700 Off | June 2026
    • Nevada injunction ruling backs regulators against Polymarket
    • Apple’s Foldable iPhone Ultra: Release Date, Price, and Leaks
    • American Rheinmetall and Harbinger Partner on Autonomous Hybrid Military Trucks
    • Startup Muster is back in 2026 thanks to widespread support to save it
    • Pura Promo Codes: $20 Off May 2026
    • June deadline approaches for Hawthorne sale process
    • Today’s NYT Mini Crossword Answers for June 4
    Facebook LinkedIn WhatsApp
    Times FeaturedTimes Featured
    Thursday, June 4
    • Home
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    • More
      • AI
      • Robotics
      • Industries
      • Global
    Times FeaturedTimes Featured
    Home»Artificial Intelligence»A Gentle Introduction to Nonlinear Constrained Optimization with Piecewise Linear Approximations
    Artificial Intelligence

    A Gentle Introduction to Nonlinear Constrained Optimization with Piecewise Linear Approximations

    Editor Times FeaturedBy Editor Times FeaturedMarch 21, 2026No Comments19 Mins Read
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr WhatsApp Email
    Share
    Facebook Twitter LinkedIn Pinterest Telegram Email WhatsApp Copy Link


    drawback, the purpose is to search out one of the best (most or minimal) worth of an goal perform by deciding on actual variables that fulfill a set of equality and inequality constraints.

    A normal constrained optimization drawback is to pick nn actual determination variables x0,x1,…,xn−1x_0,x_1,dots,x_{n-1} from a given possible area in such a means as to optimize (reduce or maximize) a given goal perform

    [f(x_0,x_1,dots,x_{n-1}).]

    We normally let xx denote the vector of nn actual determination variables x0,x1,…,xn−1.x_0,x_1,dots,x_{n-1}. That’s, x=(x0,x1,…,xn−1)x=(x_0,x_1,dots,x_{n-1}) and write the overall nonlinear program as:

    [begin{aligned}text{ Maximize }&f(x)text{ Subject to }&g_i(x)leq b_i&&qquad(i=0,1,dots,m-1)&xinmathbb{R}^nend{aligned}]

    the place every of the constraint capabilities g0g_0 by way of gm−1g_{m-1} is given, and every bib_i is a continuing (Bradley et al., 1977).

    This is just one attainable solution to write the issue. Minimizing f(x)f(x) is equal to maximizing −f(x).-f(x). Likewise, an equality constraint h(x)=bh(x)=b might be expressed because the pair of inequalities h(x)≤bh(x)leq b and −h(x)≤−b.-h(x)leq-b. Furthermore, by including a slack variable, any inequality might be transformed into an equality constraint (Bradley et al., 1977).

    These kinds of issues seem throughout many utility areas, for instance, in enterprise settings the place an organization goals to maximise revenue or reduce prices whereas working on assets or funding constraints (Cherry, 2016).

    If f(x)f(x) is a linear perform and the constraints are linear, then the issue known as a linear programming drawback (LP) (Cherry, 2016).

    “The issue known as a nonlinear programming drawback (NLP) if the target perform is nonlinear and/or the possible area is decided by nonlinear constraints.” (Bradley et al., 1977, p. 410)

    The assumptions and approximations utilized in linear programming can generally produce an appropriate mannequin for the choice variable ranges of curiosity. In different instances, nevertheless, nonlinear conduct within the goal perform and/or the constraints is important to formulate the applying as a mathematical program precisely (Bradley et al., 1977).

    Nonlinear programming refers to strategies for fixing an NLP. Though many mature solvers exist for LPs, NLPs, particularly these involving higher-order nonlinearities, are usually tougher to resolve (Cherry, 2016).

    Difficult nonlinear programming issues come up in areas resembling digital circuit design, monetary portfolio optimization, fuel community optimization, and chemical course of design (Cherry, 2016).

    One solution to deal with nonlinear programming issues is to linearize them and use an LP solver to acquire approximation. One want to try this for a number of causes. As an example, linear fashions are normally quicker to resolve and might be extra numerically steady.

    To maneuver from principle to a working Python mannequin, we are going to stroll by way of the next sections:

    This textual content assumes some familiarity with mathematical programming. After defining Separable Functions, we current a separable nonlinear maximization Example and description an strategy primarily based on piecewise linear (PWL) approximations of the nonlinear goal perform.

    Subsequent, we outline Special Ordered Sets of Type 2 and clarify how they help the numerical formulation.

    We then introduce the theoretical background in On Convex and Concave Functions, offering instruments used all through the rest of this work on nonlinear programming (NLP).

    Lastly, we current a Python Implementation process that makes use of Gurobipy and PWL approximations of the nonlinear goal perform, enabling LP/MIP solvers to acquire helpful approximations for pretty massive NLPs.


    Separable Features

    This text’s answer process is for separable packages. Separable packages are optimization issues of the shape:

    [begin{aligned}text{ Maximize }&sumlimits_{j=0}^{n-1}f_j(x_j)text{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0qquad(i=0,1,dots,m-1)&xinmathbb{R}^nend{aligned}]

    the place every of the capabilities fjf_j and gijg_{ij} are recognized. These issues are referred to as separable as a result of the choice variables seem individually: one in every constraint perform gijg_{ij} and one in every goal perform fjf_j (Bradley et al., 1977).


    Instance

    Take into account the next drawback from Bradley et al. (1977) that arises in portfolio choice:

    [begin{aligned}text{ Maximize }&f(x)=20x_0+16x_1-2x_0^2-x_1^2-(x_0+x_1)^2text{ Subject to }&x_0+x_1leq5&x_0geq0,x_1geq0.end{aligned}]

    Goal perform (Animation by the creator).
    Goal perform over the possible area (Animation by the creator).

    As said, the issue just isn’t separable due to the time period (x0+x1)2(x_0+x_1)^2 within the goal perform. Nonetheless, nonseparable issues can steadily be diminished to a separable type utilizing numerous formulation tips. Specifically, by letting x2=x0+x1x_2=x_0+x_1 we will re-express the issue above in separable type as:

    [begin{aligned}text{ Maximize }&f(x)=20x_0+16x_1-2x_0^2-x_1^2-x_2^2text{ Subject to }&x_0+x_1leq5&x_0+x_1-x_2=0&x_0geq0,x_1geq0,x_2geq 0.end{aligned}]

    Clearly, the linear constraints are separable, and the target perform can now be written as f(x)=f0(x0)+f1(x1)+f2(x2),f(x)=f_0(x_0)+f_1(x_1)+f_2(x_2), the place

    [begin{aligned}f_0(x_0)&=20x_0-2x_0^2f_1(x_1)&=16x_1-x_1^2end{aligned}]

    and

    [f_2(x_2)=-x_2^2.]

    Nonlinear capabilities (Picture by the creator).

    To type the approximation drawback, we approximate every nonlinear time period fj(xj)f_j(x_j) by a piecewise linear perform fj^(xj)hat{f_j}(x_j) through the use of a specified variety of breakpoints.

    Every PWL perform fj^(xj)hat{f_j}(x_j) is outlined over an interval [l,u][l,u] and consists of line segments whose endpoints lie on the nonlinear perform fj(xj),f_j(x_j), beginning at (l,fj(l))(l,f_j(l)) and ending at (u,fj(u)).(u,f_j(u)). We symbolize the PWL capabilities utilizing the next parametrization. One can write any l≤xj≤ulleq x_jleq u as:

    [x_j=sumlimits_{i=0}^{r-1}theta_j^i a_i,quad hat f_j(x_j)=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i),quadsumlimits_{i=0}^{r-1}theta_j^i=1,quadtheta_j=(theta_j^0,theta_j^1,dotstheta_j^{r-1})inmathbb{R}_{geq0}^r]

    for every j=0,1,2,j=0,1,2, the place the PWL capabilities are specified by the factors {(ai,fj(ai)}{(a_i,f_j(a_i)} for i=0,1,…,r−1i=0,1,dots,r-1 (Cherry, 2016). Right here, we use 4 uniformly distributed breakpoints to approximate every fj(xj)f_j(x_j) Additionally, since x0+x1≤5,x_0+x_1leq5, x0+x1=x2,x_0+x_1=x_2, and x0,x1,x2≥0x_0,x_1,x_2geq0 we have now that

    [0leq x_0leq5,qquad0leq x_1leq 5,quadtext{ and }quad0leq x_2leq5.]

    And so, our approximations needn’t lengthen past these variable bounds. Subsequently the PWL perform fj^(xj)hat{f_j}(x_j) shall be outlined by the factors (0,fj(0)),(0,f_j(0)), (5/3,fj(5/3)),(5/3,f_j(5/3)), (10/3,fj(10/3))(10/3,f_j(10/3)) and (5,fj(5))(5,f_j(5)) for j=0,1,2.j=0,1,2.

    Any x0∈[0,5]∩ℝx_0in[0,5]capmathbb{R} might be written as

    [begin{aligned}x_0&=theta_0^0cdot0+theta_0^1cdotfrac{5}{3}+theta_0^2cdotfrac{10}{3}+theta_0^3cdot5&=theta_0^1cdotfrac{5}{3}+theta_0^2cdotfrac{10}{3}+theta_0^3cdot 5text{ s.t. }sumlimits_{i=0}^3theta_0^i=1end{aligned}]

    and f0(x0)f_0(x_0) might be approximated as

    [begin{aligned}hat f_0(x_0)&=theta_0^0f_0(0)+theta_0^1f_0left(frac{5}{3}right)+theta_0^2f_0left(frac{10}{3}right)+theta_0^3f_0(5)&=theta_0^1cdotfrac{250}{9}+theta_0^2cdotfrac{400}{9}+theta_0^3cdot50.end{aligned}]

    The identical reasoning follows for x1x_1 and x2x_2 For instance, evaluating the approximation at x0=1.5x_0=1.5 provides:

    [begin{aligned}hat f_0(1.5)&=frac{1}{10}f_0(0)+frac{9}{10}f_0left(frac{5}{3}right)&=frac{1}{10}cdot 0+frac{9}{10}cdotfrac{250}{9}&=25end{aligned}]

    since

    [1.5=frac{1}{10}cdot0+frac{9}{10}cdotfrac{5}{3}.]

    PWL approximation curves (Picture by the creator).

    Such weighted sums are referred to as convex combos, i.e., linear combos of factors with non-negative coefficients that sum to 1. Now, since f0(x0),f_0(x_0), f1(x1),f_1(x_1), and f2(x2)f_2(x_2) are concave capabilities, one could ignore extra restrictions, and the issue might be solved as an LP (Bradley et al., 1977). Nonetheless, generally, such a formulation doesn’t suffice. Specifically, we nonetheless must implement the adjacency situation: At most two θjitheta_j^i weights are constructive, and if two weights are constructive, then they’re adjoining, i.e., of the shape θjitheta_j^i and θji+1.theta_j^{i+1}. An analogous restriction applies to every approximation. As an example, if the weights θ00=2/5theta_0^0=2/5 and θ02=3/5theta_0^2=3/5 then the approximation provides

    [begin{aligned}hat f_0(x_0)&=frac{2}{5}cdot 0+frac{3}{5}cdotfrac{400}{9}=frac{80}{3}x_0&=frac{2}{5}cdot0+frac{3}{5}cdotfrac{10}{3}=2.end{aligned}]

    In distinction, at x0=2,x_0=2, the approximation curve provides

    [begin{aligned}hat f_0(x_0)&=frac{4}{5}cdotfrac{250}{9}+frac{1}{5}cdotfrac{400}{9}=frac{280}{9}x_0&=frac{4}{5}cdotfrac{5}{3}+frac{1}{5}cdotfrac{10}{3}=2.end{aligned}]

    Want for adjacency situation (Picture by the creator).

    One can implement the adjacency situation by introducing extra binary variables into the mannequin, as formulated in Cherry (2016). Nonetheless, we leverage Gurobipy‘s SOS (Particular Ordered Set) constraint object.

    The whole program is as follows

    [begin{alignat}{2}text{ Maximize }&hat f_0(x_0)+hat f_1(x_1)+hat f_2(x_2)text{ Subject to }&x_0=sumlimits_{i=0}^{r-1}theta_0^ia_i&x_1=sumlimits_{i=0}^{r-1}theta_1^ia_i&x_2=sumlimits_{i=0}^{r-1}theta_2^ia_i&hat f_0=sumlimits_{i=0}^{r-1}theta_0^if_0(a_i)&hat f_1=sumlimits_{i=0}^{r-1}theta_1^if_1(a_i)&hat f_2=sumlimits_{i=0}^{r-1}theta_2^if_2(a_i)&sumlimits_{i=0}^{r-1}theta_j^i=1&&qquad j=0,1,2&{theta_j^0,theta_j^1,dotstheta_j^{r-1}}text{ SOS type 2 }&&qquad j=0,1,2&x_0,x_1,x_2in[0,5]capmathbb{R}&hat f_0in[0,50]capmathbb{R}&hat f_1in[0,55]capmathbb{R}&hat f_2in [-25,0]capmathbb{R}&theta_j^iin[0,1]capmathbb{R}&&qquad j=0,1,2,;i=0,1,dots,{r-1}.finish{alignat}]


    Particular Ordered Units of Kind 2

    “A set of variables {x0,x1,…,xn−1}{x_0,x_1,dots,x_{n-1}} is a particular ordered set of sort 2 (SOS sort 2) if xixj=0x_ix_j=0 each time |i−j|≥2,|i-j|geq2, i.e., at most two variables within the set might be nonzero, and if two variables are nonzero they should be adjoining within the set.” (de Farias et al., 2000, p. 1)

    In our formulation, the adjacency situation on the θjitheta_j^i variables matches precisely a SOS sort 2 situation: at most two θjitheta_j^i might be constructive, they usually should correspond to adjoining breakpoints. Utilizing SOS sort 2 constraints lets us guarantee adjacency instantly in Gurobi, which applies specialised branching methods that robotically seize the SOS construction, as an alternative of introducing further binary variables by hand.


    On Convex and Concave Features

    For the next dialogue, we take a normal separable constrained maximization drawback occasion, as beforehand outlined within the Introduction part:

    [begin{aligned}text{ Maximize }&sumlimits_{j=0}^{n-1}f_j(x_j)text{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0qquad(i=0,1,dots,m-1)&xinmathbb{R}^n.end{aligned}]

    the place every of the capabilities fif_i and gijg_{ij} are recognized.

    All through, the time period piecewise-linear (PWL) approximation refers back to the secant interpolant obtained by connecting ordered breakpoints pairs (aok,f(aok))(a_k,f(a_k)) with line segments.

    Convex and concave are among the many most important practical kinds in mathematical optimization. Formally, a perform f(x)f(x) known as convex if, for each yy and zz and each 0≤λ≤1,0leqlambdaleq1,

    [f[lambda y+(1-lambda)z]leqlambda f(y)+(1-lambda)f(z).]

    This definition implies that the sum of convex capabilities is convex, and nonnegative multiples of convex capabilities are additionally convex. Concave capabilities are merely the adverse of convex capabilities, for which the definition above follows apart from the reversed course of the inequality. In fact, some capabilities are neither convex nor concave.

    Convex and concave capabilities (Eric W. Weisstein on MathWorld)

    It’s simple to see that linear capabilities are each convex and concave. This property is important because it permits us to optimize (maximize or reduce) linear goal capabilities utilizing computationally environment friendly methods, such because the simplex methodology in linear programming (Bradley et al., 1977).

    Moreover, a single-variable differentiable perform f(x)f(x) is convex on an interval precisely when its by-product f′(x)f^prime(x) is nondecreasing — that means the slope doesn’t go down. Likewise, differentiable f(x)f(x) is concave on an interval precisely when f′(x)f^prime(x) is nonincreasing — that means the slope doesn’t go up.

    From the definition above, one can trivially conclude that PWL approximations overestimate convex capabilities since each line section becoming a member of two factors on its graph doesn’t lie under the graph at any level. Equally, PWL approximations underestimate concave capabilities.

    Nonlinear Goal Perform

    This notion helps us clarify why we will omit the SOS sort 2 constraints from an issue such because the one within the Example part. By concavity, the PWL approximation curve lies above the section becoming a member of any two non-adjacent factors. Subsequently, the maximization will choose the approximation curve with solely adjoining weights. An analogous argument applies if three or extra weights are constructive (Bradley et al., 1977). Formally, suppose that every fj(xj)f_j(x_j) is a concave perform and every recognized gijg_{ij} is a linear perform in x.x. Then we will apply the identical process as within the Example part — We let breakpoints fulfill

    [l= a_0lt a_1lt dotslt a_{r-1}= u,qquad rgeq 2.]

    For actual numbers l≤xj≤u.lleq x_jleq u. Additionally, let θj∈ℝ+rtheta_jinmathbb{R}_+^r such that,

    [sumlimits_{i=0}^{r-1}theta_j^i=1]

    and assign

    [x_j=sumlimits_{i=0}^{r-1}theta_j^icdot a_i,qquadhat f_j=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)]

    for every j=0,1,…,n−1.j=0,1,dots,n-1. And so, the reworked LP is as follows

    [begin{alignat}{2}text{ Maximize }&sumlimits_{j=0}^{n-1}hat f_jtext{ Subject to }&sumlimits_{j=0}^{n-1} g_{ij}(x_j)leq 0&&qquad(i=0,1,dots,m-1)&x_j=sumlimits_{i=0}^{r-1}theta_j^icdot a_i&&qquad(j=0,1,dots,n-1)&hat f_j=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)&&qquad(j=0,1,dots,n-1)&sum_{i=0}^{r-1}theta_j^i=1&&qquad(j=0,1,dots,n-1)&theta_jinmathbb{R}_{geq 0}^r&&qquad(j=0,1,dots,n-1)&xinmathbb{R}^n.end{alignat}]

    Now, let pj(xj)p_j(x_j) be the PWL curve that goes by way of the factors (ai,fj(ai))(a_i,f_j(a_i)) i.e., for every xj∈[ak,ak+1],x_jin[a_k,a_{k+1}], j=0,1,…,n−1:j=0,1,dots,n-1:

    [p_j(x_j)=frac{a_{k+1}-x_j}{a_{k+1}-a_k}f_j(a_k)+frac{x_j-a_k}{a_{k+1}-a_k}f_j(a_{k+1})]

    fj(xj)f_j(x_j) concave implies that its secant slopes are non-increasing, and since pj(xj)p_j(x_j) is constructed by becoming a member of ordered breakpoints on the graph of fj(xj),f_j(x_j), we have now that pj(xj)p_j(x_j) can also be concave. And so, by Jensen’s inequality,

    [p_j(x_j)=p_jleft(sumlimits_{i=0}^{r-1}theta_j^icdot a_iright)geqsumlimits_{i=0}^{r-1}theta_j^icdot p_j(a_i)=sumlimits_{i=0}^{r-1}theta_j^if_j(a_i)=hat f_j.]

    Furthermore, for all xjx_j in any possible answer ((x0,f0^,θ0),(x1,f1^,θ1),…,(xn−1,f^n−1,θn−1)),((x_0,hat{f_0},theta_0),(x_1,hat{f_1},theta_1),dots,(x_{n-1},hat{f}_{n-1},theta_{n-1})), one can select okok such that xj∈[ak,ak+1]x_jin[a_k,a_{k+1}] and outline the vector θj^hat{theta_j} by

    [hattheta_j^k=frac{a_{k+1}-x_j}{a_{k+1}-a_k},qquad hattheta_j^{k+1}=frac{x_j-a_k}{a_{k+1}-a_k},qquadhattheta_j^i=0,;(inotin {k,k+1})qquad]

    with θ^jok,θ^jok+1≥0hattheta_j^ok,hattheta_j^{ok+1}geq 0 and θ^jok+θ^jok+1=1.hattheta_j^ok+hattheta_j^{ok+1}=1. Then θ^jok⋅aok+θ^jok+1⋅aok+1=xjhattheta_j^kcdot a_k+hattheta_j^{ok+1}cdot a_{ok+1}=x_j and offers

    [hat f_j^text{new}=hattheta_j^k f_j(a_k)+hattheta_j^{k+1}f_j(a_{k+1})=p_j(x_j)geqhat f_j.]

    As a result of the constraints rely on θjtheta_j solely by way of the induced worth xj,x_j, changing every θjtheta_j by θ^jhattheta_j preserves feasibility (as a result of it leaves every xjx_j unchanged) and it weakly improves the target perform (as a result of it will increase every f^jhat f_j to pj(xj)p_j(x_j)). Subsequently, the SOS sort 2 constraints that implement the adjacency situation don’t change the optimum worth in separable maximization issues with concave goal capabilities. An analogous argument exhibits that the SOS sort 2 constraints don’t change the optimum worth in separable minimization issues with convex goal capabilities.

    Be aware, nevertheless, that this exhibits that there exists an optimum answer wherein solely weights akin to adjoining breakpoints are constructive. It doesn’t assure that every one optimum options of the LP fulfill adjacency. Subsequently, one should embrace SOS sort 2 constraints to implement a constant illustration.

    In observe, fixing fashions with concave goal capabilities through PWL approximations can yield an goal worth that underestimates the true optimum worth. Conversely, fixing fashions with convex goal capabilities through PWL approximations yield an goal worth that overestimates the true optimum worth.

    Nonetheless, though these approximations can distort the target worth, in fashions the place the unique linear constraints stay unchanged, feasibility just isn’t affected by the PWL approximations. Any variable assignments discovered by fixing the mannequin by way of PWL approximations fulfill the identical linear constraints and due to this fact lie within the unique possible area. Consequently, one can consider the unique nonlinear goal f(x)f(x) on the returned answer to acquire the target worth, despite the fact that this worth needn’t equal the true nonlinear optimum.

    Nonlinear Constraints

    What requires completely different approaches are PWL approximations of nonlinear constraints, since these approximations can change the unique possible area. We outline the possible area/set of the issue, just like the one above, by:

    [F=bigcaplimits_{i=0}^{m-1}{xinmathbb{R}^n:G_i(x)leq0},]

    the place

    [G_i(x):=sumlimits_{j=0}^{n-1}g_{ij}(x_j).]

    Now, suppose (for notational simplicity) that every one Gi(x)G_i(x) are nonlinear capabilities. Approximate every univariate gij(xj)g_{ij}(x_j) by a PWL perform g^ij(xj)hat g_{ij}(x_j) and outline:

    [hat G_i(x):=sumlimits_{j=0}^{n-1}hat g_{ij}(x_j).]

    Then

    [hat F=bigcaplimits_{i=0}^{m-1}{xinmathbb{R}^n:hat G_i(x)leq0},]

    is the possible set of the reworked LP that makes use of PWL approximations. To keep away from really infeasible options, we would like

    [hat Fsubseteq F.]

    If, as an alternative, F⊆F^Fsubseteqhat F then there could exist x∈F^∖Fxin hat Fsetminus F that means the mannequin that makes use of PWL approximations may settle for factors that violate the unique nonlinear constraints.

    Possible units (Picture by the creator).

    Given the best way we assemble the PWL approximations, a enough situation for F^⊆Fhat Fsubseteq F is that for constraints of the shape Gi(x)≤0G_i(x)leq0 every gij(xj)g_{ij}(x_j) is convex and for constraints of the shape Gi(x)≥0G_i(x)geq0 every gij(xj)g_{ij}(x_j) is concave.

    To see why this holds, take any x∈F^.xinhat F. First, contemplate constraints of the shape Gi(x)≤0G_i(x)leq0 the place every gij(xj)g_{ij}(x_j) is convex. As a result of PWL approximations overestimate convex capabilities, for any fastened i=0,1,…,m−1,i=0,1,dots,m-1, we have now:

    [(forall j,;hat g_{ij}(x_j)geq g_{ij}(x_j))rightarrowsumlimits_{j=0}^{n-1}hat g_{ij}(x_j)geqsumlimits_{j=0}^{n-1}g_{ij}(x_j)rightarrowhat G_i(x)geq G_i(x)]

    Since x∈F^,xinhat F, it satisfies G^i(x)≤0.hat G_i(x)leq0. Along with G^i(x)≥Gi(x),hat G_i(x)geq G_i(x), this means Gi(x)≤0.G_i(x)leq0.

    Subsequent, contemplate constraints of the shape Gi(x)≥0G_i(x)geq0 the place every gij(xj)g_{ij}(x_j) is concave. As a result of PWL approximations underestimate concave capabilities, for any fastened i=0,1,…,m−1,i=0,1,dots,m-1, we have now:

    [(forall j,;hat g_{ij}(x_j)leq g_{ij}(x_j))rightarrowsumlimits_{j=0}^{n-1}hat g_{ij}(x_j)leqsumlimits_{j=0}^{n-1}g_{ij}(x_j)rightarrowhat G_i(x)leq G_i(x).]

    Once more, since x∈F^,xinhat F, it satisfies G^i(x)≥0.hat G_i(x)geq0. Mixed with G^i(x)≤Gi(x),hat G_i(x)leq G_i(x), this means Gi(x)≥0.G_i(x)geq0.

    This motivates the notion of a convex set: A set CC is convex if, for any two factors x,y∈Cx,yin C your complete line section connecting them lies inside C.C. Formally, CC is convex if for all x,y∈Cx,yin C and any λ∈[0,1]lambdain[0,1] the purpose λx+(1−λ)ylambda x+(1-lambda)y can also be in C.C.

    Convex and concave capabilities (Eric W. Weisstein on MathWorld)

    Specifically, if Gi:ℝn→ℝG_i:mathbb{R}^ntomathbb{R} is convex, then

    [S_i:={xinmathbb{R}^n:G_i(x)leq b}]

    is convex for any b∈ℝ.binmathbb{R}. Proof: Take any x1,x2∈S,x_1,x_2in S, then Gi(x1)≤bG_i(x_1)leq b and Gi(x2)≤b.G_i(x_2)leq b. Since Gi(x)G_i(x) is convex, for any λ∈[0,1]lambdain[0,1] we have now that

    [G_i(lambda x_1+(1-lambda)x_2)leqlambda G_i(x_1)+(1-lambda)G_i(x_2)leqlambda b+(1-lambda)b=b.]

    Therefore, λx1+(1−λ)x2∈S,lambda x_1+(1-lambda) x_2in S, and so, SS is convex. An analogous argument applies to indicate that

    [T_i:={xinmathbb{R}^n:G_i(x)geq b}]

    is convex for any b∈ℝbinmathbb{R} if Gi:ℝn→ℝG_i:mathbb{R}^ntomathbb{R} is concave.

    And since one may simply present that the intersection of any assortment of convex units can also be convex,

    “for a convex possible set we would like convex capabilities for less-than-or-equal-to constraints and concave capabilities for greater-than-or-equal to constraints.” (Bradley et al., 1977, p.418)

    Nonlinear optimization issues wherein the purpose is to attenuate a convex goal (or maximize a concave one) over a convex set of constraints are referred to as convex nonlinear issues. These issues are typically simpler to deal with than nonconvex ones.

    Certainly, if F⊆F^Fsubseteqhat F then variable assignments could result in constraint violations. In these instances, one would wish to depend on different methods, resembling the straightforward modification MAP to the Frank-Wolfe algorithm supplied in Bradley et al. (1977), and/or introduce tolerances that, to a point, permit constraint violations.


    Python Implementation

    Earlier than presenting the code, it’s helpful to contemplate how our modeling strategy will scale as the issue dimension will increase. For a separable program with nn determination variables and rr breakpoints per variable, the mannequin introduces n⋅rncdot r steady variables (for the convex combos) and rr SOS sort 2 constraints. Which means that each the variety of variables and constraints develop linearly with nn and r,r, however their product can shortly result in massive fashions when both parameter is elevated.

    As beforehand talked about, we are going to use Gurobipy to numerically remedy this system above. Usually, the process is as follows:

    1. Select bounds [l,u][l,u]
    2. Select breakpoints aia_i
    3. Add θ,theta, convex-combination constraints
    4. Add SOS sort 2
    5. Clear up
    6. Consider the nonlinear goal on the returned xx
    7. Refine breakpoints if wanted

    After importing the required libraries and dependencies, we declare and instantiate our nonlinear capabilities f0,f_0, f1,f_1, and f2:f_2:

    # Imports
    import gurobipy as grb
    import numpy as np
    
    # Nonlinear capabilities
    f0=lambda x0: 20.0*x0-2.0*x0**2
    f1=lambda x1: 16.0*x1-x1**2
    f2=lambda x2: -x2**2

    Subsequent, we select the variety of breakpoints. To assemble the PWL approximations, we have to distribute them throughout the interval [0,5]∩ℝ.[0,5]capmathbb{R}. Many methods exist to do that. As an example, one may cluster extra factors close to the ends and even depend on adaptive procedures as in Cherry (2016). For simplicity, we keep on with uniform sampling:

    lb,ub=0.0,5.0
    r=4  # Variety of breakpoints
    a_i=np.linspace(begin=lb,cease=ub,num=r)  # Distribute breakpoints uniformly

    Then, we will compute the worth of the capabilities fj(xj)f_j(x_j) on the chosen breakpoints to approximate the choice variables:

    # Compute perform values at breakpoints
    f0_hat_r=f0(a_i)
    f1_hat_r=f1(a_i)
    f2_hat_r=f2(a_i)

    The subsequent step is to declare and instantiate a Gurobipy optimization mannequin:

    mannequin=gp.Mannequin()

    After creating of our mannequin, we have to add the choice variables. Gurobipy gives strategies so as to add a number of determination variables to a mannequin directly, and we leverage this characteristic to declare and instantiate a few of our determination variables, specifically the coefficients of the PWL approximations. Be aware that every one determination variables in our mannequin are steady and might be assigned to any actual worth inside the supplied bounds. Such fashions might be solved effectively utilizing trendy software program packages resembling Gurobi.

    # Determination variables
    x0=mannequin.addVar(lb=0.0,ub=5.0,title='x0',vtype=GRB.CONTINUOUS)
    x1=mannequin.addVar(lb=0.0,ub=5.0,title='x1',vtype=GRB.CONTINUOUS)
    x2=mannequin.addVar(lb=0.0,ub=5.0,title='x2',vtype=GRB.CONTINUOUS)
    
    f0_hat=mannequin.addVar(lb=0.0,ub=50.0,title='f0_hat',vtype=GRB.CONTINUOUS)
    f1_hat=mannequin.addVar(lb=0.0,ub=55.0,title='f1_hat',vtype=GRB.CONTINUOUS)
    f2_hat=mannequin.addVar(lb=-25.0,ub=0.0,title='f2_hat',vtype=GRB.CONTINUOUS)
    
    theta=mannequin.addVars(vary(0,3),vary(0,r),lb=0.0,ub=1.0,title='theta',vtype=GRB.CONTINUOUS)

    Moreover, we have to add constraints that restrict the values our determination variables could take. Firstly, we add the constraints that guarantee the proper determination of the variables that yield the approximations of the nonlinear capabilities:

    # Constraints
    mannequin.addConstr(x0==gp.quicksum(theta[0,i]*a_i[i] for i in vary(0,r)))
    mannequin.addConstr(x1==gp.quicksum(theta[1,i]*a_i[i] for i in vary(0,r)))
    mannequin.addConstr(x2==gp.quicksum(theta[2,i]*a_i[i] for i in vary(0,r)))
    
    mannequin.addConstr(f0_hat==gp.quicksum(theta[0,i]*f0_hat_r[i] for i in vary(0,r)))
    mannequin.addConstr(f1_hat==gp.quicksum(theta[1,i]*f1_hat_r[i] for i in vary(0,r)))
    mannequin.addConstr(f2_hat==gp.quicksum(theta[2,i]*f2_hat_r[i] for i in vary(0,r)))
    
    mannequin.addConstr(gp.quicksum(theta[0,i] for i in vary(0,r))==1)
    mannequin.addConstr(gp.quicksum(theta[1,i] for i in vary(0,r))==1)
    mannequin.addConstr(gp.quicksum(theta[2,i] for i in vary(0,r))==1)

    Secondly, we add the constraints that outline the unique drawback, specifically these restrictions within the nonlinear formulation:

    mannequin.addConstr(x0+x1<=5.0)
    mannequin.addConstr(x0+x1-x2==0.0)

    And at last, we add the SOS Kind 2 constraints to make sure the adjacency situation:

    # Particular ordered units
    mannequin.addSOS(GRB.SOS_TYPE2,[theta[0,i] for i in vary(0,r)])
    mannequin.addSOS(GRB.SOS_TYPE2,[theta[1,i] for i in vary(0,r)])
    mannequin.addSOS(GRB.SOS_TYPE2,[theta[2,i] for i in vary(0,r)])

    All that continues to be is to outline the target perform, which we want to maximize:

    mannequin.setObjective(f0_hat+f1_hat+f2_hat,GRB.MAXIMIZE)  # Goal perform

    The mannequin is now full and we will use the Gurobi solver to retrieve the optimum worth and optimum variable assignments. The code to do that is proven under:

    mannequin.optimize()  # Optimize!
    
    obj_val=mannequin.ObjVal
    
    res={}
    
    all_vars=mannequin.getVars()
    values=mannequin.getAttr('X',all_vars)
    names=mannequin.getAttr('VarName',all_vars)
    
    # Retrieve values of variables after optimizing
    for title,val in zip(names,values):
        res[name]=val

    After executing the code above and printing attributes to the display screen, the output is as follows:

    - Standing: 2
    - Variety of breakpoints: 4
    - Optimum goal worth: 45.00
    - Optimum x0, f0(x0) worth (PWL approximation): 1.67, 27.78
    - Optimum x1, f1(x1) worth (PWL approximation): 3.33, 42.22
    - Optimum x2, f2(x2) worth (PWL approximation): 5.00, -25.00
    PWL approximation curves (3 segments) with optimum values (Picture by the creator).

    The distinction from our computed answer to the optimum worth of 139/3 is 4/3. Now, to extend accuracy and procure a greater answer, one may add extra breakpoints, consequently rising the variety of segments used to approximate the nonlinear capabilities. As an example, the output under exhibits how rising the variety of breakpoints to 16 results in an approximation with virtually no error to the true optimum worth:

    - Standing: 2
    - Variety of breakpoints: 16
    - Optimum goal worth: 46.33
    - Optimum x0, f0(x0) worth (PWL approximation): 2.33, 35.78
    - Optimum x1, f1(x1) worth (PWL approximation): 2.67, 35.56
    - Optimum x2, f2(x2) worth (PWL approximation): 5.00, -25.00
    PWL approximation curves (15 segments) with optimum values (Picture by the creator).
    Goal perform over the possible area with optimum worth utilizing PWL approximation curves consisting of 15 segments (Animation by the creator).

    You’ll find the entire Python code on this GitHub repository: link.


    Additional Readings

    For a extra superior, sturdy algorithm that makes use of PWL capabilities to approximate the nonlinear goal perform, Cherry (2016) is a priceless reference. It presents a process that begins with coarse approximations, that are refined iteratively utilizing the optimum answer to the LP leisure from the earlier step.

    For a deeper understanding of nonlinear programming, together with strategies, utilized examples, and an evidence of why nonlinear issues are intrinsically tougher to resolve, the reader could seek advice from Nonlinear Programming: Utilized Mathematical Programming (Bradley et al., 1977).

    Lastly, de Farias et al. (2000) current computational experiments that use SOS constraints inside a branch-and-cut scheme to resolve a generalized project drawback arising in fiber-optic cable manufacturing scheduling.


    Conclusion

    Nonlinear programming is a related space in numerical optimization — with purposes starting from monetary portfolio optimization to chemical course of management. This text introduced a process for tackling separable nonlinear packages by changing nonlinear phrases with PWL approximations and fixing the ensuing mannequin utilizing LP/MIP solvers. We additionally supplied theoretical background on convex and concave capabilities and defined how these notions relate to nonlinear programming. With these parts, a reader ought to have the ability to mannequin and remedy LP/MIP-compatible approximations of many nonlinear programming situations with out counting on devoted nonlinear solvers.

    Thanks for studying!


    References

    Cherry, A., 2016. Piecewise Linear Approximation for Nonlinear Programming Issues. A dissertation. Texas Tech College. Out there at: https://ttu-ir.tdl.org/server/api/core/bitstreams/2ffce0e6-87e9-4e4c-b41a-60ea670c5a13/content (Accessed: 27 January 2026).

    Bradley, S. P., Hax, A. C. & Magnanti, T. L., 1977. Nonlinear Programming. In: Utilized Mathematical Programming. Studying, MA: Addison-Wesley Publishing Firm, pp. 410–464. Out there at: https://web.mit.edu/15.053/www/AMP-Chapter-13.pdf (Accessed: 27 January 2026).

    de Farias, I. R., Johnson, E. L. & Nemhauser, G. L., 2000. A generalized project drawback with particular ordered units: a polyhedral strategy. Mathematical Programming, 89(1), pp. 187–203. Out there at: https://doi.org/10.1007/PL00011392 (Accessed: 27 January 2026).


    Join With Me



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Editor Times Featured
    • Website

    Related Posts

    I Built a C++ Backend So My GPU Would Stop Eating Air

    June 3, 2026

    I Spent May Evaluating Different Engines for OCR

    June 3, 2026

    Why AI Is NOT Stealing Your Job

    June 3, 2026

    What AI Agents Should Never Do on Their Own

    June 3, 2026

    Exploring Income Patterns with Python Pandas, Matplotlib, and Seaborn

    June 2, 2026

    From Local App to Public Website in Minutes

    June 2, 2026

    Comments are closed.

    Editors Picks

    Breville Promo Code: $700 Off | June 2026

    June 4, 2026

    Nevada injunction ruling backs regulators against Polymarket

    June 4, 2026

    Apple’s Foldable iPhone Ultra: Release Date, Price, and Leaks

    June 4, 2026

    American Rheinmetall and Harbinger Partner on Autonomous Hybrid Military Trucks

    June 4, 2026
    Categories
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    About Us
    About Us

    Welcome to Times Featured, an AI-driven entrepreneurship growth engine that is transforming the future of work, bridging the digital divide and encouraging younger community inclusion in the 4th Industrial Revolution, and nurturing new market leaders.

    Empowering the growth of profiles, leaders, entrepreneurs businesses, and startups on international landscape.

    Asia-Middle East-Europe-North America-Australia-Africa

    Facebook LinkedIn WhatsApp
    Featured Picks

    Eye infection linked to Alzheimer’s disease risk

    February 8, 2026

    Today’s NYT Wordle Hints, Answer and Help for Feb. 1 #1688

    February 1, 2026

    Engineering at Scale – IEEE Spectrum

    October 2, 2025
    Categories
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    Copyright © 2024 Timesfeatured.com IP Limited. All Rights.
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    • About us
    • Contact us

    Type above and press Enter to search. Press Esc to cancel.