Selecting units for (1) scaling of variables and (2) condition number minimization

In a gradient-based optimization problem, the choice of units can influence the condition number of the gradient. A smaller condition number is generally better for optimization.
At the same time, the choice of units can also leave the variables unbalanced. For example, one variable may be on the order of 10^10, while another may be on the order of 0.0001.
In my experience, if I bring the variables to a similar scale, the optimization generally converges well.
Sometimes the two objectives contradict each other.
How can I balance these two contradictory goals? Thank you very much!
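To make the tension concrete, here is a small toy sketch (illustrative only, in Python/NumPy rather than MATLAB): a change of units is a diagonal rescaling x = D*u, which turns the Hessian H of a quadratic objective into D'*H*D in the new variables, and therefore changes its condition number.

```python
import numpy as np

# Toy quadratic f(x) = 0.5 * x' H x with a perfectly conditioned Hessian.
H = np.diag([2.0, 2.0])
kappa = np.linalg.cond(H)          # ratio of largest to smallest singular value

# A change of units, x = D u (say, metres -> millimetres for x1 only),
# gives the Hessian D' H D in the new variable u.
D = np.diag([1e3, 1.0])
H_u = D.T @ H @ D
kappa_u = np.linalg.cond(H_u)

print(kappa, kappa_u)              # conditioning degrades by a factor of ~1e6
```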
  1 Comment
Matt J on 4 Aug 2023
Edited: Matt J on 4 Aug 2023
"selecting units can influence the condition number of the gradient"
The condition number of the Hessian, I think you mean.


Answers (1)

Matt J on 4 Aug 2023
Edited: Matt J on 4 Aug 2023
You are free to translate as well as scale your optimization variables (or make any other nonlinear 1-1 transformation that might be useful).
For example, a quadratic objective can be well-conditioned, with condition number = 1, and require no change of units,
yet have its solution at very large x and very small y. I'm not sure why you consider this a problem, but you could remedy it by a change of variables (translating each variable toward its expected solution value) and rewriting the problem in the new variables.
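A concrete sketch of this point (my own reconstruction in Python; the exact objective is an assumption, since the original formulas are not reproduced here): a quadratic with identity Hessian has condition number 1 no matter where its minimizer sits. Rescaling the variables to unit size wrecks the conditioning, whereas translating them balances the variables and keeps the condition number at 1.

```python
# Assumed toy objective (not necessarily the answer's exact example):
#   f(x, y) = 0.5*((x - a)**2 + (y - b)**2),  a = 1e10, b = 1e-4
# Its Hessian is the 2x2 identity: condition number 1, minimizer at (a, b).
a, b = 1e10, 1e-4

# Pure rescaling u = x/a, v = y/b balances the variable magnitudes, but
#   f = 0.5*(a**2*(u - 1)**2 + b**2*(v - 1)**2)
# has Hessian diag(a**2, b**2): the conditioning is destroyed.
kappa_scaled = max(a**2, b**2) / min(a**2, b**2)

# Translation u = x - a, v = y - b gives f = 0.5*(u**2 + v**2):
# the variables are O(1) near the solution AND the Hessian stays the identity.
kappa_translated = 1.0

print(kappa_scaled, kappa_translated)
```

This is why translation can reconcile the two goals in the question: it fixes the variable imbalance without touching the Hessian at all.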
  9 Comments
Bruno Luong on 5 Aug 2023
Edited: Bruno Luong on 5 Aug 2023
"I am not talking about the conditioning of the Hessian. I am talking about the conditioning of the gradient."
AFAIK the condition number applies to a matrix. It is defined as cond(A) = ||A|| * ||A^-1||, i.e. the ratio of the largest singular value of A to the smallest.
There is no such thing as the conditioning of the gradient, which is a vector and NOT a matrix.
To describe your problem you must start by using the mathematical terminology correctly.
"I do SVD to compute the gradient of the objective function, whose ratio of largest singular value to smallest singular value is defined as the condition number"
I don't know what SVD you are talking about (what is the matrix? the Jacobian of the model?), but if that is the case, please explain the process.
What you call conditioning might be something entirely different from what WE think (where the matrix is the Hessian), and maybe that explains why you get confusing results.
Note that in the non-linear case the Hessian is NOT J'*J, where J is the Jacobian of the model (at the considered point); it also contains a second-order term involving the residuals.
And the Jacobian changes with the point. Do you take the Jacobian at the first guess? At the solution of the preceding optimization? Something else?
"This condition number is dependant on selection of units."
Of course we all know this, but you did not explain:
  • when you normalize the units and it converges faster, does the conditioning improve or degrade?
Also, the conditioning is just a partial view of the whole picture. Maybe your model has some sort of null space (a subspace of the decision variables that is NOT observable from your data), or you have a constrained problem with active constraints, in which case you need to evaluate the conditioning of the Hessian projected onto the tangent space of the constraints (*); the condition number of the full Hessian then does NOT reflect the convergence rate.
Many things can lead you to a wrong conclusion. If you are not able to show an MWE, the discussion is just in vain.
At least show us the figures of the normalization process, the problem dimension, the condition number you estimate (at the initial point and at the convergence point), the number of iterations to convergence, the number of active constraints at the solution, etc.
I'll stop here; without more details the discussion is a waste of time.
(*) Actually, the curvature of the constraints also matters.
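The projected-Hessian point can be sketched numerically (my own illustration, assuming a single active linear constraint c'x = d): the convergence-relevant quantity is cond(Z'HZ), where the columns of Z span the null space of c'.

```python
import numpy as np

# Full Hessian: badly conditioned because of the x1 direction.
H = np.diag([1e8, 1.0, 1.0])
kappa_full = np.linalg.cond(H)

# One active linear constraint c' x = d that fixes x1.
c = np.array([[1.0, 0.0, 0.0]])

# Z = orthonormal basis of the null space of c (the feasible directions).
Z = np.linalg.svd(c)[2][1:].T      # right singular vectors beyond rank(c)

# Hessian projected onto the constraint's tangent space.
H_proj = Z.T @ H @ Z
kappa_proj = np.linalg.cond(H_proj)

print(kappa_full, kappa_proj)      # the projected problem is well-conditioned
```

Here the full Hessian's condition number (1e8) says nothing about the constrained iteration, which only ever moves in the well-conditioned x2-x3 plane.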
Frank on 5 Aug 2023
Thanks for your thought-provoking reply. Sorry I wasn't clear enough and made some wrong statements.
I am working on a tomography problem, whose main physical observable is the travel time. When I said gradient, I meant the gradient of each ray's travel time with respect to the velocity. Since there are m travel times and n velocity parameters, we get an m-by-n matrix. I did a Singular Value Decomposition of this matrix and obtained its condition number.
What you said is NOT in vain to me, they are very helpful!!! Thank you very much!!!

