整数数组的按位异或和移位
假设一个比特序列的大小为M,另一个比特序列的大小为N,其中M>> N. M和N都可以保存在整数数组中:如果N的长度为30,则需要一个仅包含1个整数的数组,但如果N的长度为300,则需要一个包含10个整数的数组来存储它。
我想做的是将 N 移到 M 内,并对 M 内每个可能的位置 k 找到 N 和 M(k) 之间的差异数(通过异或)。如果M有10000位并且N有100位,则有10000-100=9900个位置将执行XOR比较。
您是否知道有一个库可以做到这一点或者可能提出一种算法?我知道可以通过许多其他方法来完成,但我相信最快的方法就是这里提出的方法。如果您能想到更快的方法,那么我愿意接受建议!
我更喜欢 C 或 C++ 语言,但其他语言,甚至伪代码也是可以接受的。
提前致谢。
Suppose a bit sequence of size M, and another bit sequence of size N, with M >> N. Both M and N can be saved inside integer arrays: If N has a length of 30 then an array with only one integer will be needed, but if N has a length of 300 then an array with 10 integers will be needed to store it.
What I am trying to do is to shift N inside M, and for each possible position k inside M to find the number of differences (by XORing) between N and M(k). If M has 10000 bits and N has 100 bits then there are 10000-100=9900 positions in which an XOR comparison will be performed.
Are you aware of a library that could do that or maybe propose an algorithm ? I know that it can be done with many other ways however I believe that the fastest possible method is the one proposed here. If you can think of a faster way then I'm open to suggestions !
I'd prefer something in C or C++ but other languages, even pseudocode are also acceptable.
Thanks in advance.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
这是一个完整且可行的解决方案。稍有疏忽,留给读者作为练习:)
Here's a complete and working solution. Minor sloppiness left as an exercise for the reader :)
简单的方法:
现代 CPU 的功能足够强大,即使对于 10,000 和 100 位,这也只需要几毫秒。
为了“计算并统计 M xor N”,
有很多方法可以优化它。有很多数字大多数时候都是 0,你可以识别并忽略这些......但上面的算法应该可以帮助你入门。
Simple approach:
Modern CPUs are powerful enough that even for 10,000 and 100 bits this should only take a few milliseconds.
To "compute and tally up M xor N",
There are lots of ways to optimize this. There are lots of digits that will be 0 most of the time, and you can recognize and ignore these... but the above algorithm should get you started.
您可以将 N 向上移动到 M 上,或者将 M 向下移动到 N 上。移动 N 时,如果输入与字大小不对齐,您还需要移动掩码。移位可以缓存到长度为字大小(以位为单位)的数组中,但考虑到跨多个字的移位是每个字 1 条指令(如果使用 RCR 指令),可能不值得冒破坏 L1 缓存的风险。
除非您可以依靠带有 POPCNT 指令的 Core i7 处理器,否则最重要的部分无论如何都是位计数。请参阅此页面以快速实现位计数。
对于较小的 N 长度(以机器字计),通过对内循环进行特殊封装,您将获得较大的速度提高。对于具有 SSE4.2 的处理器上的 N <= 192 位,它应该能够运行两个最里面的循环,并将所有内容保存在寄存器中。我生锈的 ASM 向我展示了 14 个实时寄存器,最里面的循环(在 64 位位置上移位)长度为 20 条指令,另外 5 条指令用于从输入读取下一个字。
You can either shift N up over M or shift M down into N. When shifting N you'll also need to shift the mask if the input doesn't align with word size. The shifts could be cached into an array with length of word size in bits, but given that shifting across multiple words is 1 instruction per word (if you use the RCR instruction) it might not be worth the risk of thrashing the L1 cache.
Unless you can count on a Core i7 processor with the POPCNT instruction the most significant part will be the bit counting anyway. See this page for fast implementations of bit counting.
For smaller lengths of N (in machine words) you'll get large increases of speed by special casing the inner loop. For N <= 192 bits on a processor with SSE4.2 it should be able to run the two innermost loops with keeping everything in registers. My rusty ASM shows me 14 live registers with innermost loop (of shifting over 64 bit locations) length of 20 instructions with another 5 for reading in the next word from the input).