Searching from the end of a list in Mathematica

Posted 2024-12-05 05:50:58

Many algorithms (like the algorithm for finding the next permutation of a list in lexicographical order) involve finding the index of the last element in a list that satisfies some condition. However, I haven't been able to find a way to do this in Mathematica that isn't awkward. The most straightforward approach uses LengthWhile, but it means reversing the sense of the predicate and reversing the whole list, which is likely to be inefficient in cases where you know the element you want is near the end of the list:

findLastLengthWhile[list_, predicate_] :=
 (Length@list - LengthWhile[Reverse@list, ! predicate@# &]) /. (0 -> $Failed)

We could do an explicit, imperative loop with Do, but that winds up being a bit clunky, too. It would help if Return would actually return from a function instead of the Do block, but it doesn't, so you might as well use Break:

findLastDo[list_, pred_] :=
 Module[{k, result = $Failed},
  Do[
   If[pred@list[[k]], result = k; Break[]],
   {k, Length@list, 1, -1}];
  result]

Ultimately, I decided to iterate using tail-recursion, which means early termination is a little easier. Using the weird but useful #0 notation that lets anonymous functions call themselves, this becomes:

findLastRecursive[list_, pred_] :=
 With[{
   step =
    Which[
      #1 == 0, $Failed,
      pred@list[[#1]], #1,
      True, #0[#1 - 1]] &},
  step[Length@list]]

All of this seems too hard, though. Does anyone see a better way?

EDIT to add: Of course, my preferred solution has a bug which means it's broken on long lists because of $IterationLimit.

In[107]:= findLastRecursive[Range[10000], # > 10000 &]
$IterationLimit::itlim: Iteration limit of 4096 exceeded. 
Out[107]= (* gack omitted *)

You can fix this with Block:

findLastRecursive[list_, pred_] :=
 Block[{$IterationLimit = Infinity},
  With[{
    step =
     Which[
       #1 == 0, $Failed,
       pred@list[[#1]], #1,
       True, #0[#1 - 1]] &},
   step[Length@list]]]

$IterationLimit is not my favorite Mathematica feature.
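
For comparison, here is a minimal sketch of a plain While loop that sidesteps both the recursion and $IterationLimit. The name findLastWhile is made up for this illustration and is not part of the original post; the TrueQ wrapper is an added assumption so that a predicate that does not evaluate to True or False is treated as a non-match.

```mathematica
(* Hypothetical helper, not from the original post. *)
findLastWhile[list_, pred_] :=
 Module[{k = Length@list},
  (* Walk backwards until pred holds or we run off the front.
     && short-circuits, so list[[0]] is never evaluated. *)
  While[k > 0 && ! TrueQ[pred@list[[k]]], k--];
  If[k == 0, $Failed, k]]

findLastWhile[Range[25], # <= 22 &]        (* 22 *)
findLastWhile[Range[10000], # > 10000 &]   (* $Failed *)
```

Because the loop is iterative, the long-list case that tripped findLastRecursive should not hit any evaluation limit.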

惟欲睡 2024-12-12 05:50:58

Not really an answer, just a couple of variants on findLastDo.

(1) Actually Return can take an undocumented second argument telling what to return from.

In[74]:= findLastDo2[list_, pred_] := 
 Module[{k, result = $Failed}, 
  Do[If[pred@list[[k]], Return[k, Module]], {k, Length@list, 1, -1}];
  result]

In[75]:= findLastDo2[Range[25], # <= 22 &]
Out[75]= 22

(2) Better is to use Catch[...Throw...]

In[76]:= findLastDo3[list_, pred_] := 
 Catch[Module[{k, result = $Failed}, 
   Do[If[pred@list[[k]], Throw[k]], {k, Length@list, 1, -1}];
   result]]

In[77]:= findLastDo3[Range[25], # <= 22 &]
Out[77]= 22

Daniel Lichtblau
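
A possible refinement of variant (2), sketched under the assumption that the predicate itself might call Throw: giving Catch and Throw a private tag keeps the two from interfering. The name findLastDo4 and the tag symbol are hypothetical and not part of the answer above.

```mathematica
(* Hypothetical variant: Module makes the tag unique per call. *)
findLastDo4[list_, pred_] :=
 Module[{tag},
  Catch[
   Do[If[pred@list[[k]], Throw[k, tag]], {k, Length@list, 1, -1}];
   $Failed,  (* reached only if nothing was thrown *)
   tag]]

findLastDo4[Range[25], # <= 22 &]   (* 22 *)
findLastDo4[Range[25], # > 100 &]   (* $Failed *)
```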

十二 2024-12-12 05:50:58

For the adventurous...

The following definitions define a wrapper expression reversed[...] that masquerades as a list object whose contents appear to be a reversed version of the wrapped list:

reversed[list_][[i_]] ^:= list[[-i]]
Take[reversed[list_], i_] ^:= Take[list, -i]
Length[reversed[list_]] ^:= Length[list]
Head[reversed[list_]] ^:= List

Sample use:

$list = Range[1000000];
Timing[LengthWhile[reversed[$list], # > 499500 &]]
(* {1.248, 500500} *)

Note that this method is slower than actually reversing the list...

Timing[LengthWhile[Reverse[$list], # > 499500 &]]
(* {0.468, 500500} *)

... but of course it uses much less memory.

I would not recommend this technique for general use, as flaws in the masquerade can manifest themselves as subtle bugs. Consider: what other functions need to be implemented to make the simulation perfect? The exhibited wrapper definitions are apparently good enough to fool LengthWhile and TakeWhile for simple cases, but other functions (particularly kernel built-ins) may not be so easily fooled. Overriding Head seems particularly fraught with peril.

Notwithstanding these drawbacks, this impersonation technique can sometimes be useful in controlled circumstances.
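
To make the "flaws in the masquerade" concrete, here is a small sketch that repeats the wrapper definitions so it runs on its own. It shows one spot where the disguise holds and two where it does not; the variable r is just an illustrative name.

```mathematica
ClearAll[reversed];
reversed[list_][[i_]] ^:= list[[-i]]
Take[reversed[list_], i_] ^:= Take[list, -i]
Length[reversed[list_]] ^:= Length[list]
Head[reversed[list_]] ^:= List

r = reversed[{1, 2, 3}];

r[[1]]              (* 3, via the Part upvalue *)
Head[r]             (* List, via the Head upvalue *)
ListQ[r]            (* False: the structural check is not fooled *)
MatchQ[r, {___}]    (* False: pattern matching sees the real head *)
```

Any built-in that tests the structure of its argument rather than calling Head, Length or Part through the evaluator will see the real reversed[...] expression.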

情魔剑神 2024-12-12 05:50:58

Personally, I don't see anything wrong with the LengthWhile-based solution. Also, if we want to reuse mma built-in list-traversing functions (as opposed to explicit loops or recursion), I don't see a way to avoid reversing the list. Here is a version that does that, but does not reverse the predicate:

Clear[findLastLengthWhile];
findLastLengthWhile[{}, _] = 0;
findLastLengthWhile[list_, predicate_] /; predicate[Last[list]] := Length[list];
findLastLengthWhile[list_, predicate_] :=
   Module[{l = Length[list]}, 
     Scan[If[predicate[#], Return[], l--] &, Reverse[list]]; l];

Whether or not it is simpler I don't know. It is certainly less efficient than the one based on LengthWhile, particularly for packed arrays. Also, I use the convention of returning 0 when no element satisfying a condition is found, rather than $Failed, but this is just a personal preference.

EDIT

Here is a recursive version based on linked lists, which is somewhat more efficient:

ClearAll[linkedList, toLinkedList];
SetAttributes[linkedList, HoldAllComplete];
toLinkedList[data_List] := Fold[linkedList, linkedList[], data];

Clear[findLastRec];
findLastRec[list_, pred_] :=
  Block[{$IterationLimit = Infinity},
     Module[{ll = toLinkedList[list], findLR},
       findLR[linkedList[], _] := 0;
       findLR[linkedList[_, el_?pred], n_] := n;
       findLR[linkedList[ll_, _], n_] := findLR[ll, n - 1];
       findLR[ll, Length[list]]]]

Some benchmarks:

In[48]:= findLastRecursive[Range[300000],#<9000&]//Timing
Out[48]= {0.734,8999}

In[49]:= findLastRec[Range[300000],#<9000&]//Timing
Out[49]= {0.547,8999}

EDIT 2

If your list can be made a packed array (of whatever dimensions), then you can exploit compilation to C for loop-based solutions. To avoid the compilation overhead, you can memoize the compiled function, like so:

Clear[findLastLW];
findLastLW[predicate_, signature_] := findLastLW[predicate, Verbatim[signature]] = 
   Block[{list},
       With[{sig = List@Prepend[signature, list]},
      Compile @@ Hold[
        sig,
        Module[{k, result = 0},
          Do[
            If[predicate@list[[k]], result = k; Break[]], 
            {k, Length@list, 1, -1}
          ];
          result], 
        CompilationTarget -> "C"]]]

The Verbatim part is necessary since in typical signatures like {_Integer,1}, _Integer will otherwise be interpreted as a pattern and the memoized definition won't match. Here is an example:

In[60]:= 
fn = findLastLW[#<9000&,{_Integer,1}];
fn[Range[300000]]//Timing

Out[61]= {0.016,8999}
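
The point about Verbatim can be checked in isolation with MatchQ. This is only a sketch of the underlying matching behaviour, separate from findLastLW itself:

```mathematica
(* As a pattern, _Integer matches integers, not the literal expression Blank[Integer]. *)
MatchQ[5, _Integer]                    (* True *)
MatchQ[_Integer, _Integer]             (* False *)

(* Verbatim makes the pattern match the literal expression instead. *)
MatchQ[_Integer, Verbatim[_Integer]]   (* True *)
MatchQ[{_Integer, 1}, Verbatim[{_Integer, 1}]]   (* True *)
```

So a memoized definition stored without Verbatim would have a left-hand side that the same literal signature does not match on the next call.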

EDIT 3

Here is a much more compact and faster version of the recursive solution based on linked lists:

Clear[findLastRecAlt];
findLastRecAlt[{}, _] = 0;
findLastRecAlt[list_, pred_] :=
  Module[{lls, tag},
    Block[{$IterationLimit = Infinity, linkedList},
       SetAttributes[linkedList, HoldAllComplete];
       lls = Fold[linkedList, linkedList[], list];
       ll : linkedList[_, el_?pred] := Throw[Depth[Unevaluated[ll]] - 2, tag];
       linkedList[ll_, _] := ll;
       Catch[lls, tag]/. linkedList[] :> 0]]

It is as fast as the versions based on Do loops, and twice as fast as the original findLastRecursive (the relevant benchmark is to be added soon; I cannot run benchmarks consistent with the previous ones, being on a different machine at the moment). I think this is a good illustration of the fact that tail-recursive solutions in mma can be as efficient as procedural (uncompiled) ones.

往事风中埋 2024-12-12 05:50:58

Here are some alternatives, two of which don't reverse the list:

findLastLengthWhile2[list_, predicate_] := 
 Length[list]-(Position[list//Reverse, _?(!predicate[#] &),1,1]/.{}->{{0}})[[1, 1]]+1

findLastLengthWhile3[list_, predicate_] := 
    Module[{lw = 0}, 
      Scan[If[predicate[#], lw++, lw = 0] &, list]; 
      Length[list] - lw
    ]

findLastLengthWhile4[list_, predicate_] := 
   Module[{a}, a = Split[list, predicate]; 
         Length[list] - If[predicate[a[[-1, 1]]], Length[a[[-1]]], 0]
   ]

Some timings (number 1 is Pillsy's first one) of finding the last run of 1's in an array of 100,000 1's in which a single zero is placed at various positions. Timings are the mean of 10 repeated measurements:

[Figure: time (s) versus zero position for the four functions]

Code used for timings:

Monitor[
 timings = Table[
   ri = ConstantArray[1, {100000}];
   ri[[daZero]] = 0;
   t1 = (a1 = findLastLengthWhile[ri, # == 1 &];) // Timing // First;
   t2 = (a2 = findLastLengthWhile2[ri, # == 1 &];) // Timing // First;
   t3 = (a3 = findLastLengthWhile3[ri, # == 1 &];) // Timing // First;
   t4 = (a4 = findLastLengthWhile4[ri, # == 1 &];) // Timing // First;
   {t1, t2, t3, t4},
   {daZero, {1000, 10000, 20000, 50000, 80000, 90000, 99000}}, {10}
   ], {daZero}
 ]

ListLinePlot[
   Transpose[{{1000, 10000, 20000, 50000, 80000, 90000,99000}, #}] & /@ 
     (Mean /@ timings // Transpose), 
   Mesh -> All, Frame -> True, FrameLabel -> {"Zero position", "Time (s)", "", ""}, 
   BaseStyle -> {FontFamily -> "Arial", FontWeight -> Bold, 
   FontSize -> 14}, ImageSize -> 500
]

墨小墨 2024-12-12 05:50:58

Timing Reverse for Strings and Reals

a = DictionaryLookup[__];
b = RandomReal[1, 10^6];
Timing[Short@Reverse@#] & /@ {a, b}

(*
 ->
{{0.016,         {Zyuganov,Zyrtec,zymurgy,zygotic,zygotes,...}},
 {3.40006*10^-15,{0.693684,0.327367,<<999997>>,0.414146}}}
*)
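
A rough way to see how much of the speed for reals comes from packed arrays is to time Reverse on packed and unpacked copies of the same data. This is a sketch only: the variable names are illustrative, no timings are claimed here, and AbsoluteTiming (wall-clock time) is used in place of Timing.

```mathematica
b = RandomReal[1, 10^6];
Developer`PackedArrayQ[b]          (* True *)

(* Same numbers, stored as an ordinary unpacked list. *)
unpacked = Developer`FromPackedArray[b];
Developer`PackedArrayQ[unpacked]   (* False *)

First@AbsoluteTiming[Reverse[b];]
First@AbsoluteTiming[Reverse[unpacked];]
```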

半城柳色半声笛 2024-12-12 05:50:58

An elegant solution would be:

findLastPatternMatching[{Longest[start___], f_, ___}, f_] := Length[{start}]+1

(* match this pattern if item not in list *)
findLastPatternMatching[_, _] := -1

but as it's based on pattern matching, it's way slower than the other solutions suggested.
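
Note that the definition above looks for a specific element f rather than testing a predicate. A sketch of a predicate-based variant, using Condition so that the matcher backtracks from the longest prefix; the name findLastPatternPred is hypothetical, and it keeps the same -1 convention:

```mathematica
(* Hypothetical predicate variant of the pattern-matching idea. *)
findLastPatternPred[{Longest[start___], x_, ___}, pred_] /; pred[x] :=
 Length[{start}] + 1

(* fallback when no element satisfies pred *)
findLastPatternPred[_, _] := -1

findLastPatternPred[Range[25], # <= 22 &]   (* 22 *)
findLastPatternPred[{1, 3, 5}, EvenQ]       (* -1 *)
```

It is expected to share the performance drawback mentioned above, since it relies on the same pattern matching.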
