Why does FindMaximum with Newton's method complain that it can't find a sufficient decrease in the function?
First, judging from the ContourPlot, this looks like a fairly straightforward maximization problem, so why is FindMaximum with Newton's method having problems?
Second, how can I get rid of the warnings?
Third, if I can't get rid of these warnings, how can I tell whether a warning is meaningful, i.e., that the maximization actually failed?
For instance, in the code below, FindMaximum with Newton's method gives a warning, whereas the PrincipalAxis method doesn't:
o = 1/5 Log[E^(-(h/Sqrt[3]))/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    3/10 Log[E^(h/Sqrt[3])/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/5 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];

(* -1 makes more contours towards maximum *)
contourFunc[n_, p_] := Function[{min, max},
   range = max - min;
   Table[Exp[p (x - 1)] x range + min, {x, 0, 1, 1/n}]
   ];
cf = contourFunc[10, -1];

ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}, Contours -> cf}
FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "Newton"}
FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "PrincipalAxis"}
Note: I thought the problem might be that the gradient is 0 along one of the components, but if I perturb the initial point I still get the same warning. Here's an example:
o = 1/5 Log[E^(-(h/Sqrt[3]))/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/5 Log[E^(h/Sqrt[3])/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    3/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] +
    1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];

ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}}
FindMaximum @@ {o, {{j, -0.008983550852535105`}, {h, 0.06931364191023386`}}, Method -> "Newton"}
Comments (1)
Mathematically, I'm not sure exactly why Newton's method fails, but the examples in the documentation for FindMaximum point out this specific problem and error message under Possible Issues: "With machine-precision arithmetic, even functions with smooth maxima may seem bumpy". Thus, if you increase the working precision with e.g. the WorkingPrecision -> 20 option to FindMaximum, the warnings go away.

Given that the text of the error is fairly descriptive, I suspect Newton's method is failing to reach a fixed point with sufficiently small error using machine-precision arithmetic.
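As a rough illustration of the higher-precision run described above, here is a minimal sketch. It rebuilds the objective from the first example in the question out of lists so that the block stands on its own; the names exps, mult, w and z are illustrative helpers and do not appear in the original post. Whether the warning disappears on a given version is something to check by running it.

```mathematica
(* Objective from the first example in the question, rebuilt from lists.
   exps, mult, w and z are illustrative names, not from the original post. *)
exps = {-(h/Sqrt[3]), h/Sqrt[3],
   -(h/Sqrt[3]) - Sqrt[2] j, h/Sqrt[3] - Sqrt[2] j,
   -Sqrt[3] h + Sqrt[2] j, Sqrt[3] h + Sqrt[2] j};
mult = {2, 2, 1, 1, 1, 1};               (* multiplicities in the shared denominator *)
w = {1/5, 3/10, 1/5, 1/10, 1/10, 1/10};  (* weights of the six Log terms *)
z = mult . Exp[exps];                    (* the common denominator *)
o = w . Log[Exp[exps]/z];

(* Newton's method carried out with 20-digit arithmetic instead of machine precision *)
FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton", WorkingPrecision -> 20]
```

The starting values are exact (0 rather than 0.), so they do not limit the requested working precision.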
As the error message hints, if you don't want to switch to slower high-precision arithmetic, you can instead use the AccuracyGoal option to specify how many digits of accuracy you want in the solution.

Hope that helps!
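For completeness, a minimal sketch of the AccuracyGoal route at machine precision, followed by one way to judge whether a remaining warning actually matters: inspect the gradient and the Hessian at the returned point. The value AccuracyGoal -> 6 is an arbitrary choice for illustration, and exps, mult, w, z, fmax, sol, grad and hess are hypothetical names that are not part of the original post.

```mathematica
(* Objective from the first example in the question, rebuilt from lists
   (illustrative helper names, not from the original post). *)
exps = {-(h/Sqrt[3]), h/Sqrt[3],
   -(h/Sqrt[3]) - Sqrt[2] j, h/Sqrt[3] - Sqrt[2] j,
   -Sqrt[3] h + Sqrt[2] j, Sqrt[3] h + Sqrt[2] j};
mult = {2, 2, 1, 1, 1, 1};
w = {1/5, 3/10, 1/5, 1/10, 1/10, 1/10};
z = mult . Exp[exps];
o = w . Log[Exp[exps]/z];

(* Ask for a looser tolerance while staying in machine arithmetic.
   The 6 is an assumed value chosen only for illustration. *)
{fmax, sol} = FindMaximum[o, {{j, 0}, {h, 0}},
   Method -> "Newton", AccuracyGoal -> 6];

(* Sanity checks at the returned point *)
grad = N[D[o, {{j, h}}] /. sol];      (* gradient vector *)
hess = N[D[o, {{j, h}, 2}] /. sol];   (* Hessian matrix *)

Norm[grad]         (* close to zero at a stationary point *)
Eigenvalues[hess]  (* all negative at a strict local maximum *)
```

If the gradient norm is tiny and both Hessian eigenvalues are negative, the returned point is a local maximum to within numerical noise, and the warning only reflects that the line search could not improve the value any further at machine precision.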