无分支二分查找
我很好奇是否有人可以向我解释无分支二进制搜索实现。我在最近的问题中看到了它,但我无法想象它将如何实施。我认为如果项目数量很大,避免分支可能会很有用。
I'm curious if anyone could explain a branchless binary search implementation to me. I saw it mentioned in a recent question but I can't imagine how it would be implemented. I assume it could be useful to avoid branches if the number of items is quite large.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我假设您正在谈论这句话“创建一个包含您想要支持的域中所有完美平方的静态常量数组,并对其执行快速无分支二分搜索。”可以在这个答案中找到。
“无分支”二分搜索基本上只是展开的二分搜索循环。仅当您事先知道要搜索的数组中的项目数时,此方法才有效(就像它是
static const
一样)。如果手动执行的代码太长,您可以编写一个程序来编写展开的代码。然后,您必须对您的解决方案进行基准测试,看看它是否真的比循环更快。如果您的无分支代码太大,它将无法容纳在 CPU 的快速指令缓存中,并且运行时间将比等效循环更长。
I'm going to assume you're talking about the sentence "Make a
static const
array of all the perfect squares in the domain you want to support, and perform a fast branchless binary search on it." found in this answer.A "branchless" binary search is basically just an unrolled binary search loop. This only works if you know in advance the number of items in the array you're searching (as you would if it's
static const
). You can write a program to write the unrolled code if it's too long to do by hand.Then, you must benchmark your solution to see whether it really is faster than a loop. If your branchless code is too big, it won't fit inside the CPU's fast instruction cache and will take longer to run than the equivalent loop.
如果有一个函数根据正确项与当前项的位置返回+1、-1或0,则可以将位置初始化为列表大小/2,将步长初始化为位置/2,然后在每次比较后位置+=方向*步长;步长=步长/2。迭代直到步长为零。
If one has a function which returns +1, -1, or 0 based upon the position of the correct item versus the current one, one could initialize position to list size/2, and stepsize to position/2, and then after each comparison do position+=direction*stepsize; stepsize=stepsize/2. Iterate until stepsize is zero.