How to efficiently convert 8 17-bit integers into 17 8-bit integers

Posted on 2024-11-26 12:22:25

Okay, I have the following problem: I have a set of 8 (unsigned) numbers that are all 17-bit (i.e. none of them is bigger than 131071). Since 17-bit numbers are annoying to work with (keeping them in a 32-bit int is a waste of space), I would like to turn these into 17 8-bit numbers, like so:

If I have these 8 17-bit integers:

[25409, 23885, 24721, 23159, 25409, 23885, 24721, 23159]

I would turn them into a base-2 representation:

["00110001101000001", "00101110101001101", "00110000010010001", "00101101001110111", "00110001101000001", "00101110101001101", "00110000010010001", "00101101001110111"]

Then join that into one big string:

"0011000110100000100101110101001101001100000100100010010110100111011100110001101000001001011101010011010011000001001000100101101001110111"

Then split that into 17 strings, each with 8 chars:

["00110001", "10100000", "10010111", "01010011", "01001100", "00010010", "00100101", "10100111", "01110011", "00011010", "00001001", "01110101", "00110100", "11000001", "00100010", "01011010", "01110111"]

And, finally, convert the binary representations back into integers:

[49, 160, 151, 83, 76, 18, 37, 167, 115, 26, 9, 117, 52, 193, 34, 90, 119]

This method works, but it's not very efficient. I am looking for something more efficient, preferably coded in C++, since that's the language I am working with. I just can't think of a way to do this more efficiently, and 17-bit numbers aren't exactly easy to work with (16-bit numbers would be much nicer).

Thanks in advance, xfbs

你列表最软的妹 2024-12-03 12:22:25

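Store the lowest 16 bits of each number as-is (i.e. in two bytes). That leaves the most significant bit of each number. Since there are eight such numbers, simply combine those eight bits into one extra byte.

This requires exactly the same amount of memory as your method, but involves far fewer operations.

P.S. Whichever storage method you choose, you should use the bit-manipulation operators (<<, >>, &, | etc.) to do the job; no string-based intermediate representation should be involved.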
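The scheme described above can be sketched roughly like this. The function name pack_split and the byte order (low byte first, 17th bits collected in the last byte) are my own assumptions for illustration; the answer does not prescribe them.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes 0..15 hold the low 16 bits of each value
// (low byte first), byte 16 collects the eight 17th bits.
std::array<std::uint8_t, 17> pack_split(const std::array<std::uint32_t, 8>& in) {
    std::array<std::uint8_t, 17> out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[2 * i]     = static_cast<std::uint8_t>(in[i] & 0xFF);         // bits 0..7
        out[2 * i + 1] = static_cast<std::uint8_t>((in[i] >> 8) & 0xFF);  // bits 8..15
        // bit 16 of value i becomes bit i of the extra byte
        out[16] = static_cast<std::uint8_t>(out[16] | (((in[i] >> 16) & 1u) << i));
    }
    return out;
}
```

Note that this layout differs from the bit-stream layout in the question; it only matters if something else has to read the 17 bytes in that exact order.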

渡你暖光 2024-12-03 12:22:25

Take a look at std::bitset. Maybe you can stuff them into that?
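One way this might look, as a sketch only: the helper name pack_with_bitset is made up, and I am assuming the MSB-first bit-stream layout from the question. It goes bit by bit, so it is more about readability than speed.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: lays the values out MSB-first in a 136-bit bitset,
// then reads the bitset back out eight bits at a time.
std::array<std::uint8_t, 17> pack_with_bitset(const std::array<std::uint32_t, 8>& in) {
    std::bitset<136> bits;
    for (std::size_t i = 0; i < 8; ++i)
        for (std::size_t b = 0; b < 17; ++b)
            // stream position i*17 + b holds bit (16 - b) of in[i]
            bits[i * 17 + b] = ((in[i] >> (16 - b)) & 1u) != 0;

    std::array<std::uint8_t, 17> out{};
    for (std::size_t k = 0; k < 136; ++k)
        if (bits[k])
            // stream position k maps to byte k/8, counted from its top bit
            out[k / 8] = static_cast<std::uint8_t>(out[k / 8] | (0x80u >> (k % 8)));
    return out;
}
```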

天暗了我发光 2024-12-03 12:22:25

Efficiently? Then don't use string conversions, bit fields, etc. Do the shifts yourself. (Note that the arrays must be unsigned so that shifting causes no problems.)

uint32 A[8]; //Your input, unsigned int
ubyte B[17]; //Output, unsigned byte
B[0] = (ubyte)A[0];
B[1] = (ubyte)(A[0] >> 8);
B[2] = (ubyte)A[1];
B[3] = (ubyte)(A[1] >> 8);
.
:

For the last byte, we do what ajx said: take the most significant bit of each number (shifting 16 bits to the right leaves only the 17th bit) and fill the bits of the output byte by shifting those bits left by 0 through 7 positions:

B[16] = (A[0] >> 16)  | ((A[1] >> 16) << 1) | ((A[2] >> 16) << 2) | ((A[3] >> 16) << 3) | ... | ((A[7] >> 16) << 7);

That is the "efficient" way. Other, easier methods exist too.
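For completeness, reading the values back from this layout could look something like the sketch below. The function name unpack_split is hypothetical, and I am assuming exactly the B layout shown above (low byte, high byte, and the 17th bits gathered in B[16]).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical inverse of the layout above: B[2*i] and B[2*i + 1] hold the low
// and high byte of A[i]; bit i of B[16] holds the 17th bit of A[i].
void unpack_split(const std::uint8_t (&B)[17], std::uint32_t (&A)[8]) {
    for (std::size_t i = 0; i < 8; ++i) {
        A[i] = static_cast<std::uint32_t>(B[2 * i])
             | (static_cast<std::uint32_t>(B[2 * i + 1]) << 8)
             | (((static_cast<std::uint32_t>(B[16]) >> i) & 1u) << 16);
    }
}
```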

不必在意 2024-12-03 12:22:25

Though you say they are 17-bit numbers, they must be stored in an array of 32-bit integers in which only the 17 least significant bits are used. You can extract two bytes directly from the first number (dst[0] = src[0] >> 9 is the first, dst[1] = (src[0] >> 1) & 0xff the second); then you "push" the remaining low bit of the first number in front of the second, as if it were that number's 18th bit, so that

  dst[2] = (src[0] & 1) << 7 | src[1] >> 10;
  dst[3] = (src[1] >> 2) & 0xff;

If you generalize this, you will see that the following "formula" can be applied for i from 1 to 7:

   dst[2*i] = src[i] >> (9+i) | (src[i-1] & BITS(i)) << (8-i);
   dst[2*i + 1] = (src[i] >> (i+1)) & 0xff;

and for the last byte: dst[16] = src[7] & 0xff;

The whole code could look like

  dst[0] = src[0] >> 9;
  dst[1] = (src[0] >> 1) & 0xff;

  for(i = 1; i < 8; i++)
  {
    dst[2*i] = src[i] >> (9+i) | (src[i-1] & BITS(i)) << (8-i);
    dst[2*i + 1] = (src[i] >> (i+1)) & 0xff;
  }
  dst[16] = src[7] & 0xff;

With a closer look at the loop, it can probably be restructured so that the boundary cases need no special treatment. The BITS macro creates a mask with the N least significant bits set to 1. Something like this (there may be a better way):

#define BITS(I) (~((~0u)<<(I)))  /* 0u is unsigned, so the left shift is well defined */

Addendum:

Here I assumed src is e.g. int32_t and dst is int8_t or similar.
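Put together as a self-contained function, the loop above might look like the sketch below. The name pack_contiguous and the use of std::array with unsigned fixed-width types are my assumptions; the shifts are the ones from the formula. It assumes every input is at most 131071.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: packs eight 17-bit values MSB-first into 17 bytes.
// Assumes each src[i] <= 131071 (only the low 17 bits are used).
std::array<std::uint8_t, 17> pack_contiguous(const std::array<std::uint32_t, 8>& src) {
    std::array<std::uint8_t, 17> dst{};
    dst[0] = static_cast<std::uint8_t>(src[0] >> 9);  // top 8 bits of the first value
    dst[1] = static_cast<std::uint8_t>(src[0] >> 1);  // next 8 bits (cast keeps the low byte)
    for (std::size_t i = 1; i < 8; ++i) {
        // the low i bits of the previous value have not been written yet
        const std::uint32_t carry_mask = (1u << i) - 1u;
        dst[2 * i] = static_cast<std::uint8_t>(
            (src[i] >> (9 + i)) | ((src[i - 1] & carry_mask) << (8 - i)));
        dst[2 * i + 1] = static_cast<std::uint8_t>(src[i] >> (i + 1));
    }
    dst[16] = static_cast<std::uint8_t>(src[7]);      // low 8 bits of the last value
    return dst;
}
```

With the eight numbers from the question as input, this should produce the same 17 bytes the string-based method gives, starting with 49, 160, 151 and ending with 119.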

压抑⊿情绪 2024-12-03 12:22:25

This is in C; in C++ you can use std::vector instead of the plain arrays.

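#define srcLength 8    /* number of inputs, and the bit width of one output byte */
#define destLength 17  /* number of outputs, and the bit width of one input */
unsigned int src[srcLength] = { 25409, 23885, 24721, 23159, 25409, 23885, 24721, 23159 };
unsigned char dest[destLength] = { 0 };

unsigned int srcElement = 0;  /* bit accumulator (unsigned, so shifts are well defined) */
int bits = 0;                 /* number of valid bits in srcElement */
int i = 0;
int j = 0;

do {
    /* write out bytes while at least 8 bits are buffered */
    while( bits >= srcLength ) {
        bits -= srcLength;
        dest[i++] = (unsigned char)(srcElement >> bits);
        srcElement &= (1u << bits) - 1;  /* drop the bits just written */
    }

    /* load the next 17-bit value below the leftover bits */
    if( j < srcLength ) {
        srcElement <<= destLength;
        bits += destLength;
        srcElement |= src[j++];
    }
} while (bits > 0);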

Disclaimer: if you literally have seventeen integers (and not 100000 groups of 17), you should forget about these optimizations as long as your program doesn't run very slowly.
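The same accumulator idea also works in the other direction. Here is a rough sketch of reading the eight 17-bit values back out of the 17 bytes; the function name unpack_contiguous is hypothetical, and it assumes the MSB-first bit-stream layout from the question.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical inverse of the packing loop: feeds bytes into an accumulator
// and emits a value whenever at least 17 bits are buffered.
std::array<std::uint32_t, 8> unpack_contiguous(const std::array<std::uint8_t, 17>& bytes) {
    std::array<std::uint32_t, 8> out{};
    std::uint32_t acc = 0;  // buffered bits, newest at the low end
    unsigned bits = 0;      // number of valid bits in acc (never more than 24)
    std::size_t j = 0;
    for (std::uint8_t byte : bytes) {
        acc = (acc << 8) | byte;
        bits += 8;
        if (bits >= 17) {
            bits -= 17;
            out[j++] = acc >> bits;    // the 17 oldest buffered bits
            acc &= (1u << bits) - 1u;  // keep only the bits not yet emitted
        }
    }
    return out;
}
```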

颜 2024-12-03 12:22:25

I'd probably go about it this way. I don't want to deal with weird types when I'm doing my processing, though maybe I need to store them in some funky format because of legacy problems. The hard-coded values should probably be derived from the 17; I just didn't bother.

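#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>
typedef std::uint32_t uint32;

struct int_block {
    static const uint32 w = 17;      // byte offset between consecutive values
    static const uint32 m = 131071;  // mask for the low 17 bits
    int_block() : data(151, 0) {} // at least w * 8 + (sizeof(uint32) - w) = 123 bytes
    uint32 get(size_t i) const {
        uint32 retval;
        std::memcpy(&retval, &data[i*w], sizeof retval); // memcpy instead of a cast: no unaligned read
        retval &= m;
        return retval;
    }
    void set(size_t i, uint32 val) {
        uint32 prev;
        std::memcpy(&prev, &data[i*w], sizeof prev);
        prev &= ~m;                  // keep whatever sits above the low 17 bits
        val |= prev;
        std::memcpy(&data[i*w], &val, sizeof val);
    }
    std::vector<char> data;
};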

TEST(int_block_test) {

    int_block ib;
    for (uint32 i = 0; i < 8; i++)
        ib.set(i, i+25);

    for (uint32 i = 0; i < 8; i++)
        CHECK_EQUAL(i+25, ib.get(i));
}

You'd be able to break this by giving it bad values, but I'll leave that as an exercise for the reader. :)

Quite honestly, I think you'd be better off representing them as 32-bit integers and just writing conversion functions. But I suspect you don't have control over that.
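If the storage really has to be 17 packed bytes, a small wrapper can hide the conversion so the rest of the code only ever sees ordinary 32-bit integers. The sketch below is one possible shape for that: the class name Packed17 and its interface are invented for illustration, it assumes the MSB-first layout from the question, and it works bit by bit for clarity rather than speed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: eight 17-bit values stored MSB-first in 17 bytes,
// read and written through plain 32-bit integers.
class Packed17 {
public:
    std::uint32_t get(std::size_t i) const {
        std::uint32_t value = 0;
        for (std::size_t b = 0; b < 17; ++b) {
            const std::size_t pos = i * 17 + b;  // position in the bit stream
            const std::uint32_t bit = (bytes_[pos / 8] >> (7 - pos % 8)) & 1u;
            value = (value << 1) | bit;
        }
        return value;
    }
    void set(std::size_t i, std::uint32_t value) {
        for (std::size_t b = 0; b < 17; ++b) {
            const std::size_t pos = i * 17 + b;
            const std::uint8_t mask = static_cast<std::uint8_t>(0x80u >> (pos % 8));
            if ((value >> (16 - b)) & 1u)  // bits above the 17th are ignored
                bytes_[pos / 8] = static_cast<std::uint8_t>(bytes_[pos / 8] | mask);
            else
                bytes_[pos / 8] = static_cast<std::uint8_t>(bytes_[pos / 8] & ~mask);
        }
    }
    const std::array<std::uint8_t, 17>& bytes() const { return bytes_; }
private:
    std::array<std::uint8_t, 17> bytes_{};  // zero-initialized storage
};
```

Usage would mirror the test above: call set(i, i + 25) for i from 0 to 7, then check that get(i) returns the same values.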
