... Continuing the discussion started in another thread: http://orbiter-forum.com/showthread.php?p=68073#post68073
I have been working with optimization concepts as well during the last two months. My intention was to calculate a TLI maneuver that would take the vessel into a user-defined fly-by of the Moon, specified by the desired fly-by altitude, inclination and date.
At first I tried to solve the problem with the quasi-Newton BFGS method and penalty functions: minimize a single function in multiple dimensions to find the parameters that lead to a transfer orbit matching the user criteria, while choosing the remaining parameters, those not of interest to the user, to minimize the delta velocity.
That led to a lot of trouble. It really didn't work as I had hoped, probably because of a bad design of the penalty functions, and the Hessian had a habit of becoming non-positive definite or singular for some reason. What's so special about being positive definite anyway?
So I made another try using Broyden's method, searching for roots only instead of trying to find a minimum, and so far it seems to be working well. The only problem is that if the user sets up an inclination close to zero or 180 degrees, the root doesn't exist.
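For reference, the root-finding loop can be sketched in a few lines. This is a generic numpy illustration of Broyden's "good" method with an identity initial Jacobian, not the MFD code, and the toy test system below is made up:

```python
import numpy as np

def broyden_solve(fun, x0, tol=1e-10, max_iter=50):
    """Root-find fun: R^n -> R^n with Broyden's 'good' method.

    The Jacobian estimate starts as the identity and is corrected with a
    rank-one update from each observed (step, residual-change) pair, so
    no extra function evaluations are spent on finite differences.
    """
    x = np.asarray(x0, dtype=float)
    J = np.eye(x.size)                    # initial Jacobian guess
    fx = fun(x)
    for _ in range(max_iter):
        if np.linalg.norm(fx) < tol:
            break
        dx = np.linalg.solve(J, -fx)      # quasi-Newton step
        x_new = x + dx
        fx_new = fun(x_new)
        # secant condition J_new @ dx = fx_new - fx, minimal-change update
        J += np.outer(fx_new - fx - J @ dx, dx) / dx.dot(dx)
        x, fx = x_new, fx_new
    return x

# mildly nonlinear toy system whose true Jacobian is close to the
# identity, loosely mimicking a well-behaved problem formulation
fun = lambda v: np.array([v[0] + 0.1 * v[1]**2 - 1.2,
                          v[1] - 0.1 * v[0]**2 - 0.8])
root = broyden_solve(fun, [0.0, 0.0])
```

When the initial Jacobian really is close to the identity, the very first step already lands near the root and the rank-one corrections only have to mop up the nonlinearity.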
The process requires about 3-7 iterations of the whole system, which takes about 40-80 milliseconds to execute (using all CPU power). However, the speed is not yet fully optimized.
Maybe this should be discussed in another thread when I release the MFD that makes the calculations described above.
I suppose the main question is how to deal with a situation that requires finding the roots of several functions and minimizing a function at the same time.
-----Post Added-----
Well, positive-definiteness is a pretty important property if you want to invert the matrix. Did you try any Levenberg-Marquardt-style regularisation (i.e. adding a lambda*I term to your Hessian) to ensure a positive definite matrix? Maybe you need to increase the value of lambda.
The book I have, ISBN 0387966145, says nothing about Levenberg-Marquardt, but it does describe a method of adding lambda * I to the Hessian until it becomes positive definite. I thought this applied only to the initial Hessian, so I haven't implemented it. That still leaves the question of how to find a proper value for lambda: it must be positive, yet small enough not to perturb the Hessian too much. I have never encountered this problem before. I suppose some iterative scheme could adjust lambda with respect to the most negative principal minor of the Hessian, but I hope there are better ways.
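The book's recipe of growing lambda until the damped Hessian goes positive definite can be sketched like this. In practice an attempted Cholesky factorisation is the cheap positive-definiteness test, rather than inspecting principal minors; the starting lambda and growth factor below are arbitrary choices:

```python
import numpy as np

def damp_until_positive_definite(H, lam0=1e-3, factor=10.0, max_tries=20):
    """Add lam * I to H, growing lam geometrically, until the result
    admits a Cholesky factorisation (i.e. is positive definite).

    Cholesky succeeds exactly when the matrix is positive definite, so
    a try/except around it doubles as the test.
    """
    n = H.shape[0]
    lam = 0.0
    for _ in range(max_tries):
        try:
            np.linalg.cholesky(H + lam * np.eye(n))
            return H + lam * np.eye(n), lam
        except np.linalg.LinAlgError:
            lam = lam0 if lam == 0.0 else lam * factor
    raise RuntimeError("could not make the matrix positive definite")

# indefinite toy example: eigenvalues are +1 and -1
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])
H_pd, lam = damp_until_positive_definite(H)
```

The returned lambda is then a natural starting point for the next iteration, shrinking it again when steps go well, which is essentially what Levenberg-Marquardt does.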
It looks like Broyden's method doesn't suffer from a non-positive definite Jacobian? So does that problem apply only to second-order derivatives?
The book also suggests replacing the initial Hessian with the identity matrix and letting the quasi-Newton updates build up the Hessian. In that case the first step is in the direction of steepest descent: take a full Newton step at first and backtrack if it overshoots. However, the backtracking may require several evaluations of the function, and in this case that's a major problem.
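That full-step-then-backtrack scheme is essentially an Armijo line search. A minimal sketch (toy quadratic objective, constants are the usual textbook defaults) makes the cost visible: every rejected trial step is one more evaluation of f.

```python
import numpy as np

def backtracking_step(f, x, g, p, alpha0=1.0, shrink=0.5, c=1e-4,
                      max_backtracks=10):
    """Armijo backtracking: try the full step alpha0 along direction p,
    halving alpha until f decreases by at least c * alpha * g.p.

    Each rejected trial costs one extra evaluation of f, which is
    exactly the objection above when f is an expensive propagation.
    """
    fx = f(x)
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * p) <= fx + c * alpha * np.dot(g, p):
            return x + alpha * p
        alpha *= shrink
    return x + alpha * p  # give up and accept the tiny step

# with H0 = I the quasi-Newton direction is just -gradient
f = lambda x: np.dot(x, x)      # toy quadratic, minimum at the origin
x = np.array([3.0, -4.0])
g = 2.0 * x                     # exact gradient of x.x
x_new = backtracking_step(f, x, g, p=-g)
```

On this toy quadratic the full step overshoots straight across the bowl, and the first halving lands exactly on the minimum; on a real trajectory problem several halvings, and thus several propagations, may be needed.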
Also, searching the Internet for Levenberg-Marquardt gives a lot of hits, so I may spend the coming night reading them.
A question: are minimisation methods and root-finding methods related anyway via the derivative (the minimum of f(x) is a root of f'(x)), so shouldn't they be equivalent in their convergence behaviour?

I am not exactly sure what you mean by that, but yes, I suppose they should be. Therefore, I suppose the question is why the BFGS approach failed while Broyden's succeeded?
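Locally the two views do coincide: Newton's method for a minimum of f is exactly Newton root-finding applied to f', as a one-dimensional toy shows (the quadratic example below is made up, not from the thread):

```python
def newton_min(df, d2f, x, iters=20):
    """Minimise by driving the derivative to zero: this iteration,
    x -= f'(x) / f''(x), is both Newton minimisation of f and Newton
    root-finding on f'. The two are the same local update."""
    for _ in range(iters):
        x -= df(x) / d2f(x)
    return x

# f(x) = (x - 2)^2 + 1, so f'(x) = 2(x - 2) and f''(x) = 2; minimum at x = 2
xmin = newton_min(lambda x: 2.0 * (x - 2.0), lambda x: 2.0, x=10.0)
```

The practical difference shows up away from the solution: the minimisation form needs f'' (the Hessian) to stay positive definite to produce descent directions, while plain root-finding on the residuals has no such requirement, which may be part of why the two behaved so differently here.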
Noise has been a major problem. The gradient is computed using a finite difference between two solutions, g(x) = (f(x+d) - f(x)) / d. If d is too small, this seems to give the gradient of the noise, not of the function. I suppose that was the second flaw that prevented BFGS from working. There was a problem with noise in the Broyden approach as well, but there it was easier to identify and do something about.
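That trade-off can be seen on a toy function with simulated evaluation noise (the noise amplitude and step sizes below are invented for illustration): the forward-difference error has a noise/d term that blows up as d shrinks, plus an O(d) truncation term, so d cannot be made arbitrarily small.

```python
import numpy as np

def forward_diff(f, x, d):
    """Forward difference g(x) ~ (f(x + d) - f(x)) / d."""
    return (f(x + d) - f(x)) / d

# f(x) = x^2 plus simulated evaluation noise of amplitude ~1e-6,
# roughly mimicking a noisy trajectory propagation
rng = np.random.default_rng(0)
noisy_f = lambda x: x * x + 1e-6 * rng.standard_normal()

# the true gradient at x = 1 is 2
g_tiny = forward_diff(noisy_f, 1.0, 1e-9)  # noise/d dominates: useless
g_ok   = forward_diff(noisy_f, 1.0, 1e-3)  # error of order 1e-3: usable
```

A common rule of thumb is to take d around the square root of the evaluation noise (relative to the function scale), which balances the noise/d and O(d) terms.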
How many parameters does your model actually have? If it is a single-impulse problem, then I guess the parameters are the time of impulse (scalar) and the impulse (vector), so the solution space would be of dimension 4.
Yes, that is exactly what it was. But unfortunately a very small change in the impulse vector causes a very large change in the lunar fly-by. Usually the first step towards the minimum took the trajectory out of the valid range.
Since things were not working out very well, I decided to switch to Broyden's method and re-described the problem in a better-behaved environment by using a lunar offset vector. A change in the lunar offset vector causes a roughly equal change in the lunar fly-by, and the initial Jacobian matrix is pretty close to the identity matrix and doesn't change much during the iteration. Also, evaluating the true Jacobian is not as much of a problem as evaluating the true Hessian.
Of course, this method does require additional work to evaluate the impulse vector from the lunar offset vector, but that's fast and accurate, so not a problem.
I suppose that re-implementing the case with the BFGS method, now with more experience and knowledge about the problem, would make it converge as Broyden's does. However, it would be computationally a lot more expensive. On the other hand, it would allow optimizing bi-elliptic transfer trajectories.
Do you run into problems with local minima? How do you deal with them (e.g. restarts with different initial parameters, genetic algorithms, simulated annealing etc.)? The problem with all of these methods is that they require many forward solutions, so they could be costly.

I haven't encountered that problem yet, and it shouldn't cause problems in this task. But, of course, I will need to face it sooner or later.
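If it ever does become a problem, the cheapest of the options mentioned, restarts from random initial parameters, can be sketched as below (a toy one-dimensional objective and a plain gradient-descent stand-in for the local method; every start is a full local solve, which is exactly the cost issue raised):

```python
import numpy as np

def local_descent(f, x0, step=0.02, iters=300, d=1e-6):
    """Plain gradient descent with a central-difference gradient -- a
    stand-in for whatever local method (BFGS, Broyden, ...) is used."""
    x = x0
    for _ in range(iters):
        g = (f(x + d) - f(x - d)) / (2.0 * d)
        x -= step * g
    return x

def multistart(f, lo, hi, n_starts=20, seed=1):
    """Defence against local minima: run the local method from several
    random starts and keep the best result found."""
    rng = np.random.default_rng(seed)
    return min((local_descent(f, x0) for x0 in rng.uniform(lo, hi, n_starts)),
               key=f)

# toy multimodal objective with several local minima
f = lambda x: x * x + 2.0 * np.sin(3.0 * x) + 2.0
x_best = multistart(f, -3.0, 3.0)
```

Simulated annealing and genetic algorithms attack the same problem but need even more forward solutions per improvement, so for an expensive propagation a handful of restarts is usually the first thing to try.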
-----Post Added-----
I'll take back "the initial Jacobian matrix is pretty close to identity matrix". It's not exactly close, since the exponent was +6, not -6:
(Wed Dec 17 21:51:02 2008) Row 0: [-0.930364, -5.09622e-009, 8.91826e-009]
(Wed Dec 17 21:51:02 2008) Row 1: [1.18611e+006, -0.790991, -0.0271801]
(Wed Dec 17 21:51:02 2008) Row 2: [-1.84411e+006, 0.0458468, -0.277252]
(Wed Dec 17 21:51:02 2008) Jacobian: Principal Minors Vector: [-0.930364, 0.741954, -0.219647]