当前位置：文江博客话题详情

有没有更快的方法来找到最小值和最大值？

发布于 2024-11-05 04:41:02 字数 119 浏览 0 评论 0原文

我经常写： {Min@#, Max@#} &

但这似乎效率低下，因为必须扫描表达式两次，一次查找最小值，一次查找最大值。有没有更快的方法来做到这一点？表达式通常是张量或数组。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

客…行舟 2024-11-12 04:41:02

这比它稍胜一筹。

minMax = Compile[{{list, _Integer, 1}},
   Module[{currentMin, currentMax},
    currentMin = currentMax = First[list];
    Do[
     Which[
      x < currentMin, currentMin = x,
      x > currentMax, currentMax = x],
     {x, list}];
    {currentMin, currentMax}],
   CompilationTarget -> "C",
   RuntimeOptions -> "Speed"];
v = RandomInteger[{0, 1000000000}, {10000000}];
minMax[v] // Timing

我认为它比 Leonid 的版本快一点，因为 Do 比 For 快一点，即使在编译代码中也是如此。

但归根结底，这是使用高级函数式编程语言时所遭受的性能影响的一个示例。

回应 Leonid 的补充：

我不认为该算法可以解释所有的时间差异。我认为，通常情况下，这两种测试都会被应用。然而，Do 和 For 之间的差异是可以衡量的。

cf1 = Compile[{{list, _Real, 1}},
   Module[{sum},
    sum = 0.0;
    Do[sum = sum + list[[i]]^2,
     {i, Length[list]}];
    sum]];
cf2 = Compile[{{list, _Real, 1}},
   Module[{sum, i},
    sum = 0.0;
    For[i = 1, i <= Length[list],
     i = i + 1, sum = sum + list[[i]]^2];
    sum]];
v = RandomReal[{0, 1}, {10000000}];
First /@{Timing[cf1[v]], Timing[cf2[v]]}

{0.685562, 0.898232}

This beats it by a bit.

minMax = Compile[{{list, _Integer, 1}},
   Module[{currentMin, currentMax},
    currentMin = currentMax = First[list];
    Do[
     Which[
      x < currentMin, currentMin = x,
      x > currentMax, currentMax = x],
     {x, list}];
    {currentMin, currentMax}],
   CompilationTarget -> "C",
   RuntimeOptions -> "Speed"];
v = RandomInteger[{0, 1000000000}, {10000000}];
minMax[v] // Timing

I think it's a little faster than Leonid's version because Do is a bit faster than For, even in compiled code.

Ultimately, though, this is an example of the kind of performance hit you take when using a high level, functional programming language.

Addition in response to Leonid:

I don't think that the algorithm can account for all the time difference. Much more often than not, I think, both tests will be applied anyway. The difference between Do and For is measurable, however.

cf1 = Compile[{{list, _Real, 1}},
   Module[{sum},
    sum = 0.0;
    Do[sum = sum + list[[i]]^2,
     {i, Length[list]}];
    sum]];
cf2 = Compile[{{list, _Real, 1}},
   Module[{sum, i},
    sum = 0.0;
    For[i = 1, i <= Length[list],
     i = i + 1, sum = sum + list[[i]]^2];
    sum]];
v = RandomReal[{0, 1}, {10000000}];
First /@{Timing[cf1[v]], Timing[cf2[v]]}

{0.685562, 0.898232}

回复收藏 0 原文

毁梦 2024-11-12 04:41:02

我认为这在 Mathematica 编程实践中已经是最快的了。我认为尝试在 mma 中使其更快的唯一方法是将 Compile 与 C 编译目标一起使用，如下所示：

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0.},
       For[i = 1, i <= Length[lst], i++,
        If[min > lst[[i]], min = lst[[i]]];
        If[max < lst[[i]], max = lst[[i]]];];
        {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

但是，即使这似乎比您的代码慢一些：

In[61]:= tst = RandomReal[{-10^7,10^7},10^6];

In[62]:= Do[getMinMax[tst],{100}]//Timing
Out[62]= {0.547,Null}

In[63]:= Do[{Min@#,Max@#}&[tst],{100}]//Timing
Out[63]= {0.484,Null}

您可能可以编写该函数完全用 C 编写，然后编译并加载为 dll - 您可以通过这种方式消除一些开销，但我怀疑您是否会赢得超过几个百分点 - 在我看来，不值得付出努力。

编辑

有趣的是，您可以通过手动公共子表达式消除显着提高编译解决方案的速度（lst[[i]] 此处）：

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0., temp},
     While[i <= Length[lst],
       temp = lst[[i++]];
       If[min > temp, min = temp];
       If[max < temp, max = temp];];
       {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

比 < 快一点代码>{最小@#,最大@#}。

I think this is as fast as it gets, within Mathematica programming practices. The only way I see to attempt to make it faster within mma is to use Compile with the C compilation target, as follows:

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0.},
       For[i = 1, i <= Length[lst], i++,
        If[min > lst[[i]], min = lst[[i]]];
        If[max < lst[[i]], max = lst[[i]]];];
        {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

However, even this seems to be somewhat slower than your code:

In[61]:= tst = RandomReal[{-10^7,10^7},10^6];

In[62]:= Do[getMinMax[tst],{100}]//Timing
Out[62]= {0.547,Null}

In[63]:= Do[{Min@#,Max@#}&[tst],{100}]//Timing
Out[63]= {0.484,Null}

You probably can write the function entirely in C, and then compile and load as dll - you may eliminate some overhead this way, but I doubt that you will win more than a few percents - not worth the effort IMO.

EDIT

It is interesting that you may significantly increase the speed of the compiled solution with manual common subexpression elimination (lst[[i]] here):

getMinMax = 
 Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = 0., max = 0., temp},
     While[i <= Length[lst],
       temp = lst[[i++]];
       If[min > temp, min = temp];
       If[max < temp, max = temp];];
       {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]

is a little faster than {Min@#,Max@#}.

回复收藏 0 原文

ζ澈沫 2024-11-12 04:41:02

对于数组，您可以做最简单的功能性事情并使用

Fold[{Min[##],Max[##]}&, First@#, Rest@#]& @ data

不幸的是，它不是速度恶魔。即使对于短列表，5 个元素，所有列昂尼德的回答和马克的回答至少快 7 倍，在 v.7 中未编译。对于包含 25000 个元素的长列表，情况会变得更糟，Mark 的速度提高了 19.6 倍，但即使在这个长度下，该解决方案也只需要大约 0.05 秒即可运行。

但是，我不会将 {Min[#], Max[#]}& 作为一个选项。未编译时，它比 Mark 的短列表快 1.7 倍，长列表快近 15 倍（分别比 Fold 解决方案快 8 倍和近 300 倍）。

不幸的是，我无法获得 {Min[#], Max[#]}&、Leonid 或 Mark 答案的编译版本的良好数字，相反，我收到了许多难以理解的错误消息。事实上，{Min[#], Max[#]}& 的执行时间增加了。不过，Fold 解决方案得到了显着改进，并且花费的时间是 Leonid 答案的未编译时间的两倍。

编辑：出于好奇，这里有一些未编译函数的计时测量 -

log 上的计时测量- logplot

每个函数都在 100 个列表上使用，水平轴指定长度，垂直轴为平均时间（以秒为单位）。按时间升序排列，曲线分别为 {Min[#], Max[#]}&、Mark 的答案、Leonid 的第二个答案、Leonid 的第一个答案和 Fold 上面的方法。

For an array, you could do the simplest functional thing and use

Fold[{Min[##],Max[##]}&, First@#, Rest@#]& @ data

Unfortunately, it is no speed demon. Even for short lists, 5 elements, all of Leonid's answers and Mark's answer are at least 7 times faster, uncompiled in v.7. For long lists, 25000 elements, this gets worse with Mark's being 19.6 times faster, yet even at this length this solution took only about 0.05 secs to run.

However, I'd not count out {Min[#], Max[#]}& as an option. Uncompiled it was 1.7 times faster than Mark's for short list and nearly 15 times faster for long lists (8 times and nearly 300 times faster, respectively, than the Fold solution).

Unfortunately, I could not get good numbers for the compiled versions of either {Min[#], Max[#]}&, Leonid's, or Mark's answers, instead I got numerous incomprehensible error messages. In fact, {Min[#], Max[#]}& increased in execution time. The Fold solution improved dramatically, though, and took twice as long as Leonid's answers' uncompiled times.

Edit: for the curious, here's some timing measurements of the uncompiled functions -

timing measurements on a log-log plot

Each function was used on 100 lists of the length specified on the horizontal axis and the average time, in seconds, is the vertical axis. In ascending order of time, the curves are {Min[#], Max[#]}&, Mark's answer, Leonid's second answer, Leonid's first answer, and the Fold method from above.

回复收藏 0 原文

瑾兮 2024-11-12 04:41:02

对于所有进行计时的人，我想警告您，执行顺序非常重要。例如，看看以下两个略有不同的计时测试：

(1)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First,
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First
   }
  , {100}
  ]

在此处输入图像描述
这里奇怪的人是最后一个 Min

(2)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First,
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First
   }
  , {100}
  ]

在此处输入图像描述

这里，第一个 Max 的最高时序，第二个 max 的第二最高时序，两个 Min 的时序大致相同且最低。实际上，我希望 Max 和 Min 花费大约相同的时间，但事实并非如此。前者似乎比后者多花 50% 的时间。获得杆位似乎也有 50% 的障碍。

现在与 Mark 和 Leonid 给出的算法进行比较：

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   {Max[a], Min[a]} // AbsoluteTiming // First,
   {Min@#, Max@#} &@a // AbsoluteTiming // First,
   getMinMax[a] // AbsoluteTiming // First,
   minMax[a] // AbsoluteTiming // First,
   {Min[a], Max[a]} // AbsoluteTiming // First
   }
  , {100}
  ]

在此处输入图像描述

这里我们发现大约 0.3 秒的 { Max[a], Min[a]}（包括杆位让分），0.1 级别为 Mark 方法；其他的都差不多。

For all of you who are doing timings I'd like to warn you that order of execution is extremely important. For instance, look at the following two subtly different timings tests:

(1)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First,
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here
The odd man out here is the last Min

(2)

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First,
   Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here

Here, the highest timing is found for the first Max, the second-highest for the second max and the two Mins are about the same and lowest. Actually, I'd expect Max and Min to take about the same time,but they don't. The former seems to take 50% more time than the latter. Having the pole position also seems to come with a 50% handicap.

Now a comparison with the algorithms given by Mark and Leonid:

res =
 Table[
  a = RandomReal[{0, 100}, 10^8];
  {
   {Max[a], Min[a]} // AbsoluteTiming // First,
   {Min@#, Max@#} &@a // AbsoluteTiming // First,
   getMinMax[a] // AbsoluteTiming // First,
   minMax[a] // AbsoluteTiming // First,
   {Min[a], Max[a]} // AbsoluteTiming // First
   }
  , {100}
  ]

enter image description here

Here we find about .3 s for the {Max[a], Min[a]} (which includes the pole position handicap), the .1 level is for Mark's method; the others are all about the same.

回复收藏 0 原文

~没有更多了~