MATLAB Answers

0

fmincon generating (very) noisy results - initial guesses probably to blame

Asked by Dan Goldwater on 11 Jul 2019
Latest activity Answered by Dan Goldwater on 18 Jul 2019
I have a fairly large optimisation problem consisting of eight variables, nine non-linear inequalities and three non-linear equalities. My problem is that I need to be optimising all of these variables subject to these constraints, whilst also changing the bounds on these variables and the parameters of the constraints. So, in effect, I'm producing data which says 'here are the optimum values for all variables, as we vary this external parameter'. When I do this, I get extremely noisy results, which cannot represent the physics which I'm solving for. For each value of the parameter which I am varying, I am running fmincon inside a forloop, which generates random seedpoints for the optimisation.
A simplified version of the code looks something like this (I will insert text between code blocks, but the following should be read as one script). I have put mini versions of some of the functions called further below.
function datapoint = MakeSingleDataPoint(inputs)
args = DefineArguments(inputs); %Define various constants which are to be used
bounds = DefineBounds(args); %Define the bounds to be used for fmincon
x0 = Makex0(bounds); %This currently picks a point which is the logarithmic mean of the bounds for each variable
Now, one of the problems I have is that my variables (and constraints) are all on extremely different scales, up to 30 orders apart. So in order to treat them properly, they all need to be re-scaled to be of order ~1 before being passed to fmincon. The same goes for the constraints. My feeling is that there is something amiss in the rescaling process which might be the source of the trouble
x0scale = x0+(x0==0); %We will need to scale by x0. This array is to avoid multiplying by 0.
x01 = ones(1,length(x0));% This will be the first guess put into fmincon. It gets multiplied by x0 inside the objective function
testscale.c = ones(1,9); testscale.ceq = ones(1,3); %Here we're making arrays of 1's which are used as dummies for the constraint function
[cscale.c, cscale.ceq] = nlcon(x01,args,x0,testscale); %Generate some 'typical values' of our constraint violations
LC = lincon(x01,args,x0); %Generate the linear constraints for the peoblem. Currently all are simply 0.
To explain the above four lines of code: in the actual equations, since our variables are so many orders apart, the constraint violations can be too. So we want that all elements of our constraint violation c and ceq which get fed into fmincon are of order 1 too. To do this, we make cscale. In nlcon, the outputs c and ceq are divided by cscale before being returned, so hopefully they're not too huge or too tiny. So above we're essentially feeding x0 into the constraint function to see what scale a typical violation would be for each constraint, and then using those to scale future outputs. The testscale structure is there to simply divide this initial result by 1 - in future calls of nlcon, cscale will take it's place, so as to properly scale the output.
Next, we want to create pure functions for fmincon
objectiveopt = @(y) objective(y,args,x0); %Create pure version of the objective
nlconopt = @(y) nlcon(y,args,x0,cscale); %Same for nlcon
And then the forloop for the optimisation itself
parfor i = 1:inputs.loops
InitRand = MakeSeed(bounds); %Generate seeds as initial guesses
[outs(i).xopt, outs(i).optvalOG,outs(i).exitflag,outs(i).output] = fmincon(objectiveopt,InitRand,LC.A,LC.B,LC.Aeq,LC.Beq,bounds.LB,bounds.UB,nlconopt,options);
end
So, we generate a seedpoint for each value of i, and optimise it from there.
datapoint = PickBest(outs); %A function which sorts through the outputs and selects the lowest value which also has a good exitflag
end
This concludes the function to generate each datapoint. This entire function is called in a forloop as we vary over parameter values included in inputs.
When I call this function with varying parameters included in inputs, I get very noisy results out. My suspicion is that perhaps the initial guesses for x0 which are being used to scale the constraints might be to blame. After all, they are not subject to the constraints, and are simply in the middle of the allowed parameter space. Then again, fmincon ought to take care of the constraint violation, and in principle all output results with good exitflags ought to satisfy the constraints.
Things which I've tried:
  • Experimenting with MaxFunEvals, MaxIter, and StepTol
  • Using a generic and fixed x0, regardless of the parameter values in input
  • Relaxing, one by one, individual constraints in nlcon and setting them to -1 or 0 appropriately
Below I've included some of the more important functions which I call.
Functions Called
function out = objective(x,x0,args) %Objective function which is to be minimised
x = x.*x0; %Re-scale the input parameter so that it matches the scale of the actual variables
out = f(x)...;
out = out*args.objectivescaling; %This is in case we want to re-scale the objective
end
function [c,ceq] = nlcon(xscaled,args,x0,cscale) %Non-linear constraint function
x = xscaled.*x0; %Re-scale variables
c(1) = ...; %Add non-linear inequalities
c(2) = ...;
ceq(1) = ...; %Add non-linear equalities
ceq(2) = ...;
c = c./abs(cscale.c+(cscale.c==0)); %Re-scale constraints
c=c*args.constraintscaling; %Re-scale again, this time by brute force
ceq = ceq./abs(cscale.ceq+(cscale.ceq==0));
ceq = ceq*args.constraintscaling;
end
function Seed = MakeSeed(bounds) %Function to generate seed guesses for fmincon
LB = log10(bounds.LB);
UB = log10(bounds.UB);
for i = 1:length(UB)
v = LB(i) + (UB(i)-LB(i))*rand;
Seed(i) = 10^v;
end
end
function x0 = Makex0(bounds) %Generate x0 which is the logarithmic mean of the bounds
LB = log10(bounds.LB);
UB = log10(bounds.UB);
for i = 1:length(UB)
v = LB(i) + (UB(i)-LB(i))/2;
x0(i) = 10^v;
end
end

  4 Comments

Show 1 older comment
No - I don't think I can do that because as far as I can tell the problem itself isn't analytically tractable, or at least it's not within my powers. So I'm relying on the numerics to find my optimal solutions, and don't know any otherwise.
The objective isn't some kind of error function?
No - it's the sensitivity of a detector (a lower objective means a more sensitive detector). The detector is defined by nine variables, and we're trying to find the optimum values for all of them over a range of frequencies.

Sign in to comment.

Products


Release

R2019a

2 Answers

Answer by Dan Goldwater on 18 Jul 2019
 Accepted Answer

After much trial and error and checking a lot of code, I think I've solved this problem by the rather trivial solution of switching the algorighm to 'sqp'.
I think perhaps this was one of those examples where the problem was overly specific, and my code was too built up to be communicated clearly.

  0 Comments

Sign in to comment.


Answer by Matt J
on 11 Jul 2019

If you're suspicious about the initial point, I would recommend MultiStart, or just try lots of different initial guesses yourself. If you get significant differences in the converged objective value for different initial guesses, you know you were right.

  1 Comment

I agree with the thinking here - but I'd already implemented something along these lines with the MakeSeed function, included in the post.

Sign in to comment.