fmincon non-linear optimisation: issues with sqp and interior-point

23 views (last 30 days)
chicken vector on 5 Jul 2023
Commented: chicken vector on 12 Jul 2023
I am running an optimisation problem using fmincon.
The problem is subjected to very few bound constraints, linear inequality constraints and non-linear inequality constraints.
There are also a lot or non-linear equality constraints. These are the dynamics of the problem imposed as defect constraints using the direct transcription formulation.
I am trying to find the optimal control using multiple shooting and normalised NLP variables.
I was doing some test as a trade-off between SQP and interior-point and I noticed some issues with both methods.
1) SQP:
As the documentation highlights, a change to the NLP vector is applied if the computed shift improves the objective function.
My issue with this is that SQP tends to improve "too much" the objective function, hence converging towards an infeasible solution.
Basically, it lacks the tools to make a step back when overshoots, and frequently gets stucked into some resonance region where the solution cannot improves in terms of feasibility.
This can usually be compensated by iteratively bounding the NLP variables towards its optimal solution, but requires many runs.
The main advantage of this approach is that the convergence, when it works, is very fast.
2) Interior-point:
This case is much trickier as the interior-point appears to be a more complex algorithm.
From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
The main issue I'm having with interior-point is that, if I start from a solution very close to the optimal one, it just ignores it and start all over.
Note that in the following example, the objective function is bounded to a minimum of -1.
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 443 -9.920000e-01 1.890e-06 7.998e-03 <----------- Near-optimal initial guess
1 890 -9.816996e-01 1.360e+00 1.830e-02 1.999e+00
2 1334 -9.816996e-01 1.360e+00 1.830e-02 2.146e-12
3 1781 -9.214377e-01 1.097e+00 2.277e-01 3.600e+00
4 2225 -9.214377e-01 1.097e+00 1.780e-01 1.720e-10
5 2670 -9.205512e-01 1.251e+00 4.427e-01 2.729e-01
6 3113 -9.236640e-01 1.029e+00 2.175e-01 1.249e+00
................................................................
I understand is a bit silly to ask for help and sharing so little information, but maybe someone is already familiar with these issues and knows some workaround.
Catalytic on 7 Jul 2023
From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
Was this a typo? If it takes 10x the iterations, that makes it sound slower than sqp. In what way then does it have a "larger convergence rate"?
chicken vector on 7 Jul 2023
Edited: chicken vector on 7 Jul 2023
Interior-point is slower than SQP but, out of 100 trials, would converge more times than SQP.

Matt J on 5 Jul 2023
Edited: Matt J on 5 Jul 2023
SQP...Basically, it lacks the tools to make a step back when overshoots
On the contrary, it is one of the few algorithms that can backtrack, but you have to set the objective and/or constraints to NaN or Inf in those regions, see Robustness to Non-Double Results.
Interior-Point...From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
Are you supplying an analytic Hessian computation? If not, that is probably why it is slow.
Matt J on 7 Jul 2023
OK, I think I understand. However, it doesn't appear to me that your originally posted question has any remaining unanswered components. It would appear that for your particular problem, you have difficult local solutions outside the feasible set that the optimization cannot come back from, and therefore it is critical that you choose an algorithm that remains within the feasible set. Assuming all that is true, IP seems like a pretty good choice.
You have speculated that SQP would serve you better in terms of speed if it could be prevented from straying outside the feasible set, but I am skeptical of that. The whole reason that IP is slow seems to be that it is confined to move through the feasible region, and so must traverse a less direct path to the solution. If you confined SQP in the same way, I imagine it would also be slow, and for the same reasons.
chicken vector on 8 Jul 2023
You have speculated that SQP would serve you better in terms of speed if it could be prevented from straying outside the feasible set, but I am skeptical of that. The whole reason that IP is slow seems to be that it is confined to move through the feasible region, and so must traverse a less direct path to the solution. If you confined SQP in the same way, I imagine it would also be slow, and for the same reasons.
I agree on everything.
The computational difference is probably due to the difficulty of selecting an interior point as initial guess, which SQP does not care about.

Alan Weiss on 11 Jul 2023
It sounds like you have done a good job in analyzing the solver behavior. There is one more thing that I have found that sometimes helps the sqp algorithm converge better: multiply the nonlinear equality constraint functions by appropriate constants.
The idea is this. fmincon is trying to satisfy ceq(x) = 0. However, at the same time it is trying to minimize the objective function. If, as you say, it is minimizing the objective function at the expense of feasibility, then try multiplying the various components of ceq(x) by 10 or 100 or even 1000. That will encourage fmincon to pay more attention to feasibility.
This procedue is, of course, quite manual, and can be tricky to balance, meaning it can lead to undesirable behavior such as slow convergence or even divergence. Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
chicken vector on 12 Jul 2023
Thank you Alan for the insight, but I have to admit I have already tried this one and it didn't work for this particular problem.
From my experience, it is helpful when some constraints are a bit harder to satisfy than others, so you can put some weight with a coefficient, e.g.:
ceq(10:20) = 1e5*ceq(10:20);
But for SQP the algorithm didn't care at all about the order of magnitude of the feasibility, exhibiting the same behavior described above.