Why does FindMaximum with Newton's method complain that it can't find a sufficient decrease in the function?

Posted on 2024-09-16 11:04:30

Firstly, this looks (from the ContourPlot) like a fairly straightforward maximization problem, so why is FindMaximum with Newton's method having problems?

Secondly, how can I get rid of the warnings?

Thirdly, if I can't get rid of these warnings, how can I tell whether a warning is meaningful, i.e., whether the maximization actually failed?

For instance, in the code below, FindMaximum with Newton's method gives a warning, whereas the PrincipalAxis method doesn't:

o = 1/5 Log[E^(-(h/Sqrt[3]))/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   3/10 Log[E^(h/Sqrt[3])/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/5 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];
(* -1 makes more contours towards maximum *)

contourFunc[n_, p_] := Function[{min, max},
   range = max - min;
   Table[Exp[p (x - 1)] x range + min, {x, 0, 1, 1/n}]
   ];
cf = contourFunc[10, -1];
ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}, Contours -> cf}

FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "Newton"}
FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "PrincipalAxis"}

Note: I thought the problem might be that the gradient is 0 along one of the components at the starting point, but if I perturb the initial point I still get the same warning. Here's an example:

o = 1/5 Log[E^(-(h/Sqrt[3]))/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/5 Log[E^(h/Sqrt[3])/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   3/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];
ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}}
FindMaximum @@ {o, {{j, -0.008983550852535105`}, {h, 
    0.06931364191023386`}}, Method -> "Newton"}

红ご颜醉 2024-09-23 11:04:30

Mathematically, I'm not sure exactly why Newton's method fails, but the examples in the documentation for FindMaximum point out this specific problem and error message under Possible Issues: "With machine-precision arithmetic, even functions with smooth maxima may seem bumpy".

Thus, if you increase the working precision, e.g. by passing the WorkingPrecision -> 20 option to FindMaximum, the warnings go away:

In[25]:= FindMaximum[o, {{j, 0}, {h, 0}}, Method->"Newton", WorkingPrecision->20]

Out[25]= {-2.0694248079871222533, {j -> -0.14189560954670761863, h -> 0}}

Given that the text of the error is fairly descriptive:

FindMaximum::lstol: The line search decreased the step size to within the tolerance
specified by AccuracyGoal and PrecisionGoal but was unable to find a sufficient increase
in the function. You may need more than MachinePrecision digits of working precision to
meet these tolerances. >>

... I suspect Newton's method is failing to reach a fixed point with sufficiently small error using machine-precision arithmetic.
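One way to see that the objective itself is well behaved (this is my own rearrangement, not something from the documentation): the weights in front of the six logarithms sum to 1, so o collapses to a linear term minus the log of the shared denominator. Below, Z is just a name introduced here for that denominator, and a_i, w_i stand for the six exponents and their weights in the first example.

```latex
\begin{aligned}
Z(j,h) &= 2e^{-h/\sqrt{3}} + 2e^{h/\sqrt{3}}
        + e^{-h/\sqrt{3}-\sqrt{2}\,j} + e^{h/\sqrt{3}-\sqrt{2}\,j}
        + e^{-\sqrt{3}\,h+\sqrt{2}\,j} + e^{\sqrt{3}\,h+\sqrt{2}\,j},\\
o(j,h) &= \sum_i w_i \log\frac{e^{a_i}}{Z}
        = \sum_i w_i a_i - \log Z
        = -\frac{\sqrt{2}}{10}\,j - \log Z(j,h).
\end{aligned}
```

Z is a sum of exponentials of linear functions, so log Z is convex and o is smooth and concave. Along h = 0 we have Z = 4 + 4 cosh(√2 j), and setting ∂o/∂j = 0 gives tanh(j/√2) = -1/10, i.e. j = √2 artanh(-1/10) ≈ -0.141896, which agrees with the output above. That is consistent with the warning reflecting a tolerance that machine precision cannot meet near the flat top, rather than a failed maximization.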

As the error message hints, if you don't want to switch to slower high-precision arithmetic, you can instead relax the tolerance with the AccuracyGoal option, which specifies how many digits of accuracy you want in the solution:

In[27]:= FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton", AccuracyGoal -> 5]

Out[27]= {-2.06942, {j -> -0.141896, h -> -2.78113*10^-17}}
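As for telling whether a warning is meaningful, one possible check (a sketch of my own, not an official recipe) is to look at the gradient and Hessian at the point FindMaximum returns: a near-zero gradient together with all-negative Hessian eigenvalues suggests the result is a genuine local maximum despite the message. The names exps, w, z, sol, grad and hess are just illustrative; o is rebuilt here in a compact form that should be equivalent to the first expression in the question.

```mathematica
(* The six exponents and their weights from the first example in the question *)
exps = {-h/Sqrt[3], h/Sqrt[3], -h/Sqrt[3] - Sqrt[2] j,
   h/Sqrt[3] - Sqrt[2] j, -Sqrt[3] h + Sqrt[2] j, Sqrt[3] h + Sqrt[2] j};
w = {1/5, 3/10, 1/5, 1/10, 1/10, 1/10};

(* Shared denominator: the first two exponentials appear twice *)
z = 2 E^(-h/Sqrt[3]) + 2 E^(h/Sqrt[3]) + Total[E^(exps[[3 ;;]])];
o = w . Log[E^exps/z];

(* Run Newton's method, silencing only the lstol message *)
sol = Quiet[
   FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton"],
   FindMaximum::lstol];

(* Gradient and Hessian evaluated at the returned point *)
grad = D[o, {{j, h}}] /. sol[[2]];
hess = D[o, {{j, h}, 2}] /. sol[[2]];

Norm[grad]         (* close to 0 at a stationary point *)
Eigenvalues[hess]  (* all negative at a strict local maximum *)
```

If the gradient norm is not small, or an eigenvalue is non-negative, the warning deserves to be taken seriously and a higher WorkingPrecision or a different starting point is worth trying.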

Hope that helps!
