How to efficiently read bits out of bytes?

Posted on 2024-12-11 06:00:29

I'm working on a project that includes WebSockets, and data between the server (Node.js) and the client (Chrome) is sent using a custom (very simple) format for data exchange I set up.

I'm sending data in pieces of 3 bits because I'm sending items which all have 8 possibilities. The data format looks like this:

            0          1
bit index   01234567 8901...
item        aaabbbcc cddd...

Currently, I'm parsing the items out of the bytes like this:

var itemA = bytes[0] >> 5;
var itemB = (bytes[0] >> 2) & 7;
var itemC = (bytes[0] & 3) << 1 | bytes[1] >> 7;
var itemD = (bytes[1] >> 4) & 7;

Personally, this feels too complicated. It's only complex because I receive the data as bytes, i.e. in groups of 8 bits. To parse out 3-bit items I have to bit-shift and apply AND masks, and because 8 is not divisible by 3 I sometimes even have to combine parts of two bytes, as for itemC.

It would be much more effective to read this data as groups of 3 bits instead of groups of 8 bits.

What I've come up with is converting all bytes into a string of bits using .toString(2), then using .substring to take substrings of length 3, and converting those back to numbers with parseInt(bitString, 2). But I guess that's not the way to do it, since string manipulation is slow and I'm not actually doing anything string-related.

Is it possible to read bits in groups of e.g. 3 instead of parsing them from bytes? Or is there a more efficient way to read bits out of bytes?

岁月苍老的讽刺 2024-12-18 06:00:29

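Binary AND and bit-shift operations are the fastest way of doing this. They translate well into machine code instructions. The only way to speed things up further is to sacrifice bandwidth for speed, for example by using no more than 3 bits per byte, but judging from your question you have probably already considered and rejected that tradeoff.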
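
If the hand-written shifts are the main annoyance, they can be wrapped in a small helper that keeps leftover bits in an accumulator. This is only a sketch: the name readBitGroups is made up here, it assumes the most significant bit comes first (as in the question's layout), and it assumes bitsPerItem is between 1 and 24 so the accumulator stays within 32 bits.

```javascript
// Hypothetical helper (not from the original post): unpack fixed-width
// groups of bits from an array of bytes, most significant bit first.
function readBitGroups(bytes, bitsPerItem) {
    const items = [];
    const mask = (1 << bitsPerItem) - 1;
    let acc = 0;      // bits that have been read but not yet consumed
    let accBits = 0;  // number of valid bits currently held in acc

    for (const byte of bytes) {
        acc = (acc << 8) | byte;
        accBits += 8;
        while (accBits >= bitsPerItem) {
            accBits -= bitsPerItem;
            items.push((acc >> accBits) & mask);
        }
        // keep only the bits that have not been consumed yet
        acc &= (1 << accBits) - 1;
    }
    return items;
}

console.log(readBitGroups([227, 142], 3)); // [ 7, 0, 7, 0, 7 ]
```

Any bits left over at the end (one bit in this example) are simply dropped.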

帅哥哥的热头脑 2024-12-18 06:00:29

// Convert one byte (0-255) to an 8-character string of '0'/'1', most significant bit first.
function byte2bits(a)
{
    var tmp = "";
    for(var i = 128; i >= 1; i /= 2)
        tmp += a&i?'1':'0';
    return tmp;
}
// Split an array of bytes into groups of n bits (as strings).
// Returns [groups, rest], where rest holds the leftover bits.
function split2Bits(a, n)
{
    var buff = "";
    var b = [];
    for(var i = 0; i < a.length; i++)
    {
        buff += byte2bits(a[i]);
        while(buff.length >= n)
        {
            b.push(buff.slice(0, n));
            buff = buff.slice(n);
        }
    }
    return [b, buff];
}
var a, b, r;
a = [227, 142];
[b, r] = split2Bits(a, 3);
//b = ["111", "000", "111", "000", "111"];
//r = '0'; //rest of bits

抱着落日 2024-12-18 06:00:29

If you take care of endianness, you can access the data as an array of int or long int.
Another possibility is to leave bit 3 and bit 7 of each byte unused.
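
Both suggestions can be sketched roughly as follows. The function names are invented for illustration. The first function assumes the input length is a multiple of 3 and that bytes are combined big-endian, matching the layout in the question. The second reflects one reading of "not using bit 3 and bit 7": two items per byte, stored in bits 0-2 and 4-6, counting from the least significant bit.

```javascript
// Three bytes hold exactly eight 3-bit items, so combine them into one
// 24-bit integer and shift the items out of it.
function unpack24(bytes) {
    const items = [];
    for (let i = 0; i + 2 < bytes.length; i += 3) {
        // big-endian: the first byte holds the most significant bits
        const word = (bytes[i] << 16) | (bytes[i + 1] << 8) | bytes[i + 2];
        for (let shift = 21; shift >= 0; shift -= 3) {
            items.push((word >> shift) & 7);
        }
    }
    return items;
}

// Padded layout: two items per byte, bits 3 and 7 unused.
function unpackPadded(bytes) {
    const items = [];
    for (const byte of bytes) {
        items.push((byte >> 4) & 7, byte & 7);
    }
    return items;
}

console.log(unpack24([0b00000101, 0b00111001, 0b01110111])); // [ 0, 1, 2, 3, 4, 5, 6, 7 ]
console.log(unpackPadded([0b01010011]));                     // [ 5, 3 ]
```

The padded layout never needs to combine two bytes, at the cost of sending 8 bits for every 6 bits of payload.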

你对谁都笑 2024-12-18 06:00:29

We can get the value we need by reading the appropriate 16-bit integer and then bit-shifting it.

Clearly, to get the i-th value we should read a 16-bit integer whose byte offset satisfies (bits * (i + 1) - 16)/8 <= offset <= (bits * i)/8.

Let M = bits*i/8, so we have M + bits/8 - 2 <= offset <= M. We then take the minimal offset, ceil(M + bits/8 - 2), and compute the position of the i-th value within the 16-bit integer from that offset. I have just written the following function:

function getDataFromStream(buffer, bitsPerValue, endianness) {
    var valuesCount = Math.floor(buffer.length * 8 / bitsPerValue);
    // Buffer.alloc replaces the deprecated new Buffer(size); results are stored as bytes,
    // so bitsPerValue is assumed to be at most 8
    var ret = Buffer.alloc(valuesCount);

    if (valuesCount > 0) {
        for (var i = 0; i < valuesCount; i++) {
            // smallest byte offset whose 16-bit word still contains the whole i-th value
            var offsetMin = Math.ceil(bitsPerValue * i / 8 + bitsPerValue / 8 - 2);
            if (offsetMin < 0) {
                offsetMin = 0;
            }
            // the old noAssert argument is no longer supported, so it is omitted here
            var wordWithValue;
            if (endianness == 'BE')
                wordWithValue = buffer.readUInt16BE(offsetMin);
            else
                wordWithValue = buffer.readUInt16LE(offsetMin);
            var offsetInWord = bitsPerValue * i - offsetMin * 8;
            var leftInWord = 16 - bitsPerValue - offsetInWord;

            // then get value in the word by shifting and then remove other bits by "%"
            ret[i] = (wordWithValue >> (endianness == 'BE' ? leftInWord : offsetInWord)) % Math.pow(2, bitsPerValue);
        }
    }
    return ret;
}

And the following example reads eight 5-bit values from a 5-byte Buffer.

// buffer with 5 bytes
var xx = Buffer.from([255, 255, 255, 255, 250]);

// get data, 5 bits per value
var yy = getDataFromStream(xx, 5, 'BE');
console.log('got buffer with length=' + yy.length);
for (var i = 0; i < yy.length; i++) {
    console.log(i + '-' + yy[i]);
}

When I run node test.js I get

got buffer with length=8
0-31
1-31
2-31
3-31
4-31
5-31
6-31
7-26
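
Buffer is a Node.js API, so on the Chrome side of the question the same 16-bit-window idea would need a different container. A possible sketch with Uint8Array and DataView follows. The function name getValuesFromBytes is hypothetical, only the big-endian case is covered, and it assumes at least 2 input bytes and bitsPerValue of at most 8.

```javascript
// Assumed browser-friendly variant of the approach above (big-endian only).
function getValuesFromBytes(bytes, bitsPerValue) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const count = Math.floor(bytes.length * 8 / bitsPerValue);
    const out = new Uint8Array(count);
    const mask = (1 << bitsPerValue) - 1;

    for (let i = 0; i < count; i++) {
        // smallest byte offset whose 16-bit word contains the whole i-th value
        const offset = Math.max(0, Math.ceil((bitsPerValue * (i + 1) - 16) / 8));
        const word = view.getUint16(offset); // DataView reads big-endian by default
        // number of bits to the right of the value inside the word
        const left = 16 - bitsPerValue - (bitsPerValue * i - offset * 8);
        out[i] = (word >> left) & mask;
    }
    return out;
}

console.log(getValuesFromBytes(new Uint8Array([255, 255, 255, 255, 250]), 5));
// Uint8Array [ 31, 31, 31, 31, 31, 31, 31, 26 ]
```

A binary WebSocket message received as an ArrayBuffer could be wrapped with new Uint8Array(data) before being passed in.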

绝不服输 2024-12-18 06:00:29

If you have a number of fixed length (i.e. you can always guarantee it'll be 2 bytes), you can read the bits like this:

// convert to a 16-character binary string, padded with leading zeros
const num = 256;
const numAsBinaryString = num.toString(2).padStart(16, '0');
const leastSignificantByteAsBinaryString = numAsBinaryString.slice(8);

// strings are iterable, so the characters ('0' or '1') can be destructured directly
const [
    eighthBit,
    seventhBit,
    sixthBit,
    fifthBit,
    fourthBit,
    thirdBit,
    secondBit,
    firstBit,
] = leastSignificantByteAsBinaryString;

司马昭之心 2024-12-18 06:00:29

The shortest way I know is this:

function isBitSet(value, bit) {
    // bit is 0-based, counted from the least significant (rightmost) bit
    return (value & (1 << bit)) !== 0;
}

This assumes value has already been assembled into a number. If your bytes arrive in a different byte order, you'll need to convert them first.

Because JavaScript's bitwise operators work on 32-bit integers, this function works for bit positions 0 to 31 (so for 8-, 16- and 32-bit values, but not 64-bit ones).

HOW IT WORKS:
Each bit, counting from the rightmost, has a value of 2 to the power of its 0-based position (1, 2, 4, 8, ...). 1 << bit builds a mask with only that bit set. ANDing the mask with value yields a non-zero result if the bit is set, and 0 if it is not.
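
The same masking idea extends from a single bit to a group of bits, which is closer to what the question needs. The helper below is a sketch with an invented name, getBits. It assumes pos is counted from the least significant bit (0-based) and that width is between 1 and 31.

```javascript
// Hypothetical helper: read `width` bits of `value`, starting `pos` bits
// from the least significant end.
function getBits(value, pos, width) {
    const mask = (1 << width) - 1;   // e.g. width 3 -> 0b111
    return (value >>> pos) & mask;
}

const byte0 = 227, byte1 = 142;          // 11100011 10001110
console.log(getBits(byte0, 5, 3));       // 7, same as bytes[0] >> 5
console.log(getBits(byte0, 2, 3));       // 0, same as (bytes[0] >> 2) & 7

// An item that straddles two bytes: combine them into a 16-bit word first.
const word = (byte0 << 8) | byte1;
console.log(getBits(word, 7, 3));        // 7
```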
