查找 Mathematica 列表中大于阈值的第一个元素
我想知道如何获得(已排序)列表中大于给定阈值的第一个元素。
我不太了解 Mathematica 中的列表操作函数,也许有人可以给我一个技巧来有效地做到这一点。
I was wondering how I could obtain the first element of a (already ordered) list that is greater than a given threshold.
I don't know really well the list manipulation function in Mathematica, maybe someone can give me a trick to do that efficiently.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
Select
满足您的需要,并且保持一致,尊重列表预先存在的顺序:例如:
您可以在第二个参数中提供所需的任何阈值或标准函数。
第三个参数指定您仅匹配一个(即第一个)元素。
希望有帮助!
Select
does what you need, and will be consistent, respecting the pre-existing order of the list:For example:
You can provide whatever threshold or criterion function you need in the second argument.
The third argument specifies you only one (i.e., the first) element that matches.
Hope that helps!
乔在他的答案中正确陈述 人们期望二分搜索技术比
Select
更快,即使列表已排序,它似乎也只是进行线性搜索:(出于演示目的,我随意将阈值设置为 2)。
但是,Combinatorica 中的
BinarySearch
函数 a) 不合适(它返回一个与请求的元素匹配的元素,但不是第一个(最左边的),这正是问题所要求的。要获取最左边的元素大于阈值,给定一个有序列表,我们可以递归地进行:
或迭代地进行:
递归方法更清晰,但我会坚持使用迭代方法
来测试其速度,
这会产生
因此,速度更快并且具有更好的缩放行为。
实际上没有必要编译它,但我还是做了。
总之,不要对长列表使用
Select
。下面是
我 对手动或通过 Combinatorica 包进行二进制搜索的一些评论。与
Combinatorica
中的BinarySearch
进行二分搜索 请注意,这不会执行问题所要求的操作(Combinatorica 中的
);我上面给出的代码确实如此。BinarySearch
也不会执行此操作。二分搜索可以迭代地实现
,我们现在可以将其与
Combinatorica
中的BinarySearch
进行比较。请注意,a) 列表必须已排序 b) 这不会返回第一个匹配元素,而是返回a匹配元素。让我们比较两个二分查找例程。重复50000次:
所以手写的速度更快。现在,由于事实上二分搜索仅访问这些参数列表中的 6-7 个点(例如
{500000, 250000, 125000, 62500, 31250, 15625, 23437}
),显然区别只是开销;例如,也许BinarySearch
更通用,或者未编译。Joe correctly states in his answer that one would expect a binary search technique to be faster than
Select
, which seem to just do a linear search even if the list is sorted:(I arbitrarily put the threshold at 2 for demonstration purposes).
However, the
BinarySearch
function in Combinatorica is a) not appropriate (it returns an element which does match the requested one, but not the first (leftmost), which is what the question is asking.To obtain the leftmost element that is larger than a threshold, given an ordered list, we may proceed either recursively:
or iteratively:
The recursive approach is clearer but I'll stick to the iterative one.
To test its speed,
which produces
so, much faster and with better scaling behaviour.
Actually it's not necessary to compile it but I did anyway.
In conclusion, then, don't use
Select
for long lists.This concludes my answer. There follow some comments on doing a binary search by hand or via the Combinatorica package.
I compared the speed of a (compiled) short routine to do binary search vs the
BinarySearch
fromCombinatorica
. Note that this does not do what the question asks (and neither doesBinarySearch
fromCombinatorica
); the code I gave above does.The binary search may be implemented iteratively as
and we can now compare this and
BinarySearch
fromCombinatorica
. Note that a) the list must be sorted b) this will not return the first matching element, but a matching element.Let us compare the two binary search routines. Repeating 50000 times:
So the handwritten one is faster. Now since in fact a binary search just visits 6-7 points in the list for these parameters (something like
{500000, 250000, 125000, 62500, 31250, 15625, 23437}
for instance), clearly the difference is simply overhead; perhapsBinarySearch
is more general, for instance, or not compiled.您可能还想看看 TakeWhile[] 和 LengthWhile[] 。
http://reference.wolfram.com/mathematica/ref/TakeWhile.html
http://reference.wolfram.com/mathematica/ref/LengthWhile.html
You might want to look at TakeWhile[] and LengthWhile[] as well.
http://reference.wolfram.com/mathematica/ref/TakeWhile.html
http://reference.wolfram.com/mathematica/ref/LengthWhile.html
例如
For example
使用
Select
可以解决问题,但如果您关心效率,那么这是一个糟糕的解决方案。Select
会遍历列表中的所有元素,因此所花费的时间与列表的长度成线性关系。既然你说列表是有序的,那么最好使用
BinarySearch< /code>
,它将在列表大小的对数时间内工作。表达式(编辑:我做了一个小调整,因为我之前编写的表达式没有正确处理列表中重复出现的元素。另一个编辑:这仍然不起作用当阈值本身作为重复元素出现在列表中时,请参阅注释):
将为您提供所需元素的索引。如果所有元素都小于阈值,您将得到列表的长度加一。
ps 在使用
BinarySearch
之前不要忘记调用Needs["Combinatorica'"]
。Using
Select
will solve the problem, but it is a poor solution if you care about efficiency.Select
goes over all the elements of the list, and therefore will take time which is linear in the length of the list.Since you say the list is ordered, it is much better to use
BinarySearch
, which will work in a time which is logarithmic in the size of the list. The expression (edit: I have made a small adjustment since the previous expression I wrote did not handle correctly recurring elements in the list. another edit: this still doesn't work when the threshold itself appears in the list as a recurring element, see comments):will give you the index of the desired element. If all the elements are smaller than the threshold, you'll get the length of the list plus one.
p.s. don't forget to call
Needs["Combinatorica'"]
before usingBinarySearch
.仅供将来参考,从 v10 开始,您可以使用
SelectFirst
它有一些额外的优点,例如返回
Missing[]
或默认值。来自文档:
对于您的具体情况,您将使用:
Just for future reference, starting from v10 you can use
SelectFirst
It has some added niceties, such as returning
Missing[]
or default values.From the docs:
For your specific case, you would use: