在 MySQL 中对大位字符串执行按位运算?
我有一个 MySQL 数据库,其中包含大量 2048 位二进制字符串(例如“0111001...0101”)。我需要的一项计算是这些字符串与某些外部生成的位串相比的汉明距离(异或结果中 1 的总数)。为了了解如何编写此查询,我尝试为较小的位字符串编写它。下面是一个示例:
select BIT_COUNT(bin((b'0011100000') ^ (b'1111111111')))
计算 XOR 的内部部分工作正常,但 BIT_COUNT 返回奇怪的结果。此示例返回 14,它比字符串本身长。
所以我有几个问题:
首先,为什么 BIT_COUNT 返回如此奇怪的结果。它是在字符串上操作而不是我希望它操作的二进制字符串吗?如果是这样,我该如何处理?
其次,请注意,我通过在前面加上 b 将字符串转换为二进制(这里用的是正确的词吗?)。我该如何使用列名和变量来做到这一点?显然,我不能简单地将 ab 添加到变量名前,也不能在变量名之间插入空格。有什么想法吗?
谢谢,
编辑: 所以这是第一个问题的解决方案:
select BIT_COUNT(b'0011100000' ^ b'1111111111')
将其用于较大的字符串(2048 位)时似乎存在问题。我尝试过:
select BIT_COUNT(b'001110...00011')
它给了我像 28 这样的结果,而实际位数应该在 1024 左右。如果我删除 b,那么它看起来最大为 64。关于如何解决这个问题有什么想法吗?
I've got a MySQL database with a large amount of 2048-bit binary strings (e.g '0111001...0101'). One calculation I'll need is the Hamming Distance (the total count of 1's in the XOR'd result) of these strings compared to some externally generated bitstring. In order to get an idea of how to write this query, I tried writing it for smaller bitstrings. Here's an example:
select BIT_COUNT(bin((b'0011100000') ^ (b'1111111111')))
The inner portion that computes the XOR works correctly, but BIT_COUNT returns strange results. This example returns 14, which is longer than the string itself.
So I have a few questions:
First, why is BIT_COUNT returning such strange results. Is it operating on a string rather than the binary string I'd like it to operate on? If so, how do I deal with this?
Second, notice that I'm casting (is that the right word here?) the strings as binary by prepending with a b. How would I do this with column names and variables? Clearly I can't simply prepend a b to a variable name, and I can't insert a space between. Any ideas?
Thanks,
EDIT:
So here's a solution to the first problem:
select BIT_COUNT(b'0011100000' ^ b'1111111111')
There seems to be a problem when using this for larger strings (2048 bits). I tried:
select BIT_COUNT(b'001110...00011')
and it gives me results like 28, when the actual bitcount should be around 1024. If I remove the b, then it appears to max-out at 64. Any ideas on how to resolve this problem?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
只需删除
bin
函数即可。使用它,BIN_COUNT
将其参数视为字符串,而不是一组位。所以会做这项工作
Just remove
bin
function. With itBIN_COUNT
treats its argument as a chars string, not as a set of bits. Sowill do the work