Optimising parameters in a set of nonlinear equations simultaneously
I have a large number of equations (n) with a large number of unknowns (m), where m is larger than n. I am trying to find the values of the m unknowns using the n equations and a large set of observed values.
I have looked at some implementations of Levenberg-Marquardt in C#, but I couldn't find any that solve more than one equation. For instance, I looked at http://kniaz.net/software/LMA.aspx and it seems to be what I want, except that it only takes a single equation as a parameter, whereas I want to solve a number of equations at the same time. Similarly, this package: http://www.alglib.net/ contains a good implementation of LM, but only for a single equation.
Are there any good implementations in C#, or ones that I can use from my C# code, that can do this? Working out the first-order derivatives of my equations would also be costly, so I am hoping to approximate them with small finite differences.
Furthermore, is there a good, easy-to-understand explanation of how LM works and how to implement it? I have tried reading through some maths textbooks in order to implement it myself, but I am pretty clueless at maths, so most of the explanation is lost on me.
Edit:
More details of my problem:
1) The equations are formed dynamically and can change with each running of my problem
2) I have no good guess for the starting parameters. I am planning to run it multiple times with randomised starting parameters in order to find the global minimum.
Edit 2:
One more question, I am reading this paper: http://ananth.in/docs/lmtut.pdf and I saw the following under section 2:
"x = (x1, x2, ..., xn) is a vector, and each rj is a function from ℜ^n to ℜ. The rj are referred to as residuals and it is assumed that m >= n."
Does that mean that LM does not work if I have more parameters than functions? For instance, suppose I want to solve for A and B in the function:
Y = AX + B
Would that be impossible because my parameter vector has size 2 (A and B) and my function count is 1?
Comments (1)
The Levenberg-Marquardt algorithm can handle your problem; however, I have not found a C# implementation that covers that case [UPDATE: see the edit below for details of how to get alglib.net to do what you want]. MINPACK does have entry points for that case (LMDIF1 or LMDIF if, as you stated, you wish to approximate derivatives using differences). You might try automatically translating the C/C++ version of MINPACK using the tools listed at a previous question on StackOverflow.
As for your question in "Edit 2" ("Does that mean that LM does not work if I have more parameters than functions?"), the answer is no. In your case, the "m" in the paper at that point is the number of equations you have multiplied by the number of data points you have (assuming that by "observed value" you mean a value for the difference between the right-hand side and left-hand side of every equation). In other words, the r_j functions the paper talks about there are exactly those equation disparities (RHS - LHS), one per equation per data point. Note that the paper's n is the number of parameters, not the number of equations: for Y = AX + B with, say, 10 observed (X, Y) pairs, there are m = 10 residuals against n = 2 parameters (A and B), so m >= n holds.
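To make the counting concrete, here is a minimal sketch in Python (used only for brevity; nothing in it is tied to a particular library). The helper name `stacked_residuals`, the `line` function, and the data points are made up for illustration. It builds one residual per (equation, observation) pair, which is the "m" the paper refers to.

```python
def stacked_residuals(equations, observations, params):
    """Return one residual (RHS - LHS) per (equation, observation) pair.

    With n equations and k observations the result has n * k entries,
    ordered so that entry i * k + j belongs to equation i, observation j.
    That ordering is an assumption of this sketch.
    """
    return [eq(params, obs) for eq in equations for obs in observations]


# Hypothetical single equation from the question, Y = A*X + B,
# written as a residual: (A*X + B) - Y.
def line(params, obs):
    a, b = params
    x, y = obs
    return (a * x + b) - y


# Made-up (X, Y) observations that lie exactly on Y = 2X + 1.
observations = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

r = stacked_residuals([line], observations, (2.0, 1.0))
print(len(r), r)  # 3 [0.0, 0.0, 0.0]
```

One equation and three observations give three residuals for two parameters, so the requirement m >= n is met even though there is only a single equation.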
Important edit: I now see that the second package you found, alglib.net, will do what you want (but note that it is only available for free under the GPL). Since you don't want to provide derivatives, you should use the "V" scheme, where, assuming you have n equations and k observed values at the parameters [symbol missing in source], the f vector has n*k elements where
[formula missing in source]
(i and j start at 0, of course).
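For readers who want to see the moving parts, the following is a rough, self-contained Python sketch of Levenberg-Marquardt driven by a residual vector, with the Jacobian approximated by forward differences. It is not the alglib.net or MINPACK implementation; the function names, the step size `h`, the damping schedule (divide or multiply `lam` by 10), and the use of Marquardt's diagonal scaling are all assumptions chosen to keep the example short.

```python
def jacobian_fd(residuals, p, h=1e-6):
    """Forward-difference Jacobian: J[r][c] = d residual_r / d p_c."""
    r0 = residuals(p)
    cols = []
    for c in range(len(p)):
        q = list(p)
        step = h * max(1.0, abs(p[c]))
        q[c] += step
        rc = residuals(q)
        cols.append([(a - b) / step for a, b in zip(rc, r0)])
    return r0, [[cols[c][r] for c in range(len(p))] for r in range(len(r0))]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for i in range(n):
        piv = max(range(i, n), key=lambda k: abs(m[k][i]))
        m[i], m[piv] = m[piv], m[i]
        for k in range(i + 1, n):
            f = m[k][i] / m[i][i]
            for c in range(i, n + 1):
                m[k][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def levenberg_marquardt(residuals, p0, iters=100, lam=1e-3):
    """Minimise the sum of squared residuals starting from p0."""
    p = list(p0)
    n = len(p)
    cost = sum(v * v for v in residuals(p))
    for _ in range(iters):
        r, J = jacobian_fd(residuals, p)
        jtj = [[sum(row[a] * row[b] for row in J) for b in range(n)]
               for a in range(n)]
        jtr = [sum(row[a] * ri for row, ri in zip(J, r)) for a in range(n)]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) dp = -J^T r
        damped = [[jtj[a][b] + (lam * max(jtj[a][a], 1e-12) if a == b else 0.0)
                   for b in range(n)] for a in range(n)]
        dp = solve(damped, [-g for g in jtr])
        trial = [pi + di for pi, di in zip(p, dp)]
        trial_cost = sum(v * v for v in residuals(trial))
        if trial_cost < cost:   # accept the step and trust the model more
            p, cost, lam = trial, trial_cost, lam / 10
        else:                   # reject the step and damp more heavily
            lam *= 10
    return p


# Made-up data on Y = 2X + 1; residuals are (A*X + B) - Y per observation.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

def residuals(p):
    return [(p[0] * x + p[1]) - y for x, y in data]

fit = levenberg_marquardt(residuals, [0.0, 0.0])
print(fit)
```

With several equations, `residuals` would simply return the longer n*k vector. Because LM only finds a local minimum, the random restarts mentioned in the question amount to calling `levenberg_marquardt` from different `p0` values and keeping the result with the lowest cost.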