在红宝石中计算汉明距离的最有效方法?
在ruby中,计算两个无符号整数之间的位差(例如汉明距离)的最有效方法是什么?
例如,我有整数 a = 2323409845 和 b = 1782647144。
它们的二进制表示形式是:
a = 10001010011111000110101110110101
b = 01101010010000010000100101101000
a 和 b 之间的位差; b 是 17..
我可以对它们进行逻辑异或,但这会给我一个不同的整数!= 17,然后我必须迭代结果的二进制表示并计算 1 的数量。
计算位差最有效的方法是什么?
现在,计算多个整数序列的位差的答案是否会改变?例如,给定 2 个无符号整数序列:
x = {2323409845,641760420,509499086....}
y = {uint,uint,uint...}
计算两个序列之间的位差的最有效方法是什么?
您会迭代序列,还是有更快的方法来一次计算整个序列的差异?
In ruby, what is the most efficient way to calculate the bit difference between two unsigned integers (e.g. the hamming distance)?
Eg, I have integer a = 2323409845 and b = 1782647144.
Their binary representations are:
a = 10001010011111000110101110110101
b = 01101010010000010000100101101000
The bit difference between the a & b is 17..
I can do a logical XOR on them, but that will give me a different integer != 17, I would then have to iterate through the binary representation of the result and tally the # of 1s.
What is the most efficient way to calculate the bit difference?
Now, does the answer change for calculating the bit difference of sequences of many ints? E.g. given 2 sequences of unsigned integers:
x = {2323409845,641760420,509499086....}
y = {uint,uint,uint...}
What is the most efficient way to calculate the bit difference between the two sequences?
Would you iterate through the sequence, or is there a faster way to calculate the difference across the entire sequence at once?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
您可以利用 Ruby 中优化的 String 函数来进行位计数,而不是纯粹的算术。事实证明,通过一些快速基准测试,速度提高了约 6 倍。
h1是正常的计算方式,而h2则将异或转换为字符串,统计“1”的个数
Benchmark:
You can make use of the optimized String functions in Ruby to do the bit counting, instead of pure arithmetic. It turns out to be about 6 times faster with some quick benchmarking.
h1 is the normal way to calculate, while h2 converts the xor into a string, and counts the number of "1"s
Benchmark:
根据 mu 太短的建议,我编写了一个简单的 C 扩展来使用 __builtin_popcount ,并使用基准测试验证它至少比 ruby 的优化字符串函数快 3 倍。
我查看了以下两个教程:
在我的程序中:
然后在包含我的程序的目录中,我创建一个包含以下文件的文件夹“PopCount”。
extconf.rb:
popcount.c:
然后在popcount目录中运行:
ruby extconf.rb
然后运行
程序,你就得到了......在 ruby 中进行汉明距离的最快方法。
Per the suggestion of mu is too short, I wrote a simple C extension to use __builtin_popcount , and using benchmark verified that it is at least 3X faster than ruby's optimized string functions..
I looked at the following two tutorials:
In my program:
Then in the dir containing my program, I create a folder "PopCount" with the following files.
extconf.rb:
popcount.c:
Then in the popcount directory run:
ruby extconf.rb
make
Then run the program, and there you have it....fastest way to do hamming distance in ruby.
韦格纳的算法:
An algorithm of Wegner:
如果打算遵循基于 C 的路径,最好将编译器标志
-msse4.2
添加到 makefile 中。这允许编译器生成基于硬件的popcnt
指令,而不是使用表来生成 popcount。在我的系统上,速度大约快 2.5 倍。If one intends to follow c-based path, it is a good idea to add the compiler flag
-msse4.2
to your makefile. This allows the compiler to generate hardware basedpopcnt
instructions instead of using a table to generate the popcount. On my system this was approximately 2.5x faster.