C/C++将signed char 打包成int

发布于 2024-08-25 04:10:43 字数 328 浏览 15 评论 0原文

我需要将四个有符号字节打包为 32 位整数类型。这就是我想到的：

int32_t byte(int8_t c) { return (unsigned char)c; }

int pack(char c0, char c1, ...) {
  return byte(c0) | byte(c1) << 8 | ...;
}

这是一个好的解决方案吗？它是否便携（不是通信意义上的）？是否有现成的解决方案，也许是 boost？

我最关心的问题是将负位从 char 转换为 int 时的位顺序。我不知道正确的行为应该是什么。

谢谢

原文

I have need to pack four signed bytes into 32-bit integral type.
this is what I came up to:

int32_t byte(int8_t c) { return (unsigned char)c; }

int pack(char c0, char c1, ...) {
  return byte(c0) | byte(c1) << 8 | ...;
}

is this a good solution? Is it portable (not in communication sense)?
is there a ready-made solution, perhaps boost?

issue I am mostly concerned about is bit order when converting of negative bits from char to int. I do not know what the correct behavior should be.

Thanks

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

紫南 2024-09-01 04:10:43

char 不保证有符号或无符号（在 PowerPC Linux 上，char 默认为无符号）。传播这个词！

你想要的是像这样的宏：

#include <stdint.h> /* Needed for uint32_t and uint8_t */

#define PACK(c0, c1, c2, c3) \
    (((uint32_t)(uint8_t)(c0) << 24) | \
    ((uint32_t)(uint8_t)(c1) << 16) | \
    ((uint32_t)(uint8_t)(c2) << 8) | \
    ((uint32_t)(uint8_t)(c3)))

它很难看，主要是因为它不能很好地适应 C 的操作顺序。另外，反斜杠返回在那里，所以这个宏不必是一大长行。

此外，我们在转换为 uint32_t 之前转换为 uint8_t 的原因是为了防止不必要的符号扩展。

char isn't guaranteed to be signed or unsigned (on PowerPC Linux, char defaults to unsigned). Spread the word!

What you want is something like this macro:

#include <stdint.h> /* Needed for uint32_t and uint8_t */

#define PACK(c0, c1, c2, c3) \
    (((uint32_t)(uint8_t)(c0) << 24) | \
    ((uint32_t)(uint8_t)(c1) << 16) | \
    ((uint32_t)(uint8_t)(c2) << 8) | \
    ((uint32_t)(uint8_t)(c3)))

It's ugly mainly because it doesn't play well with C's order of operations. Also, the backslash-returns are there so this macro doesn't have to be one big long line.

Also, the reason we cast to uint8_t before casting to uint32_t is to prevent unwanted sign extension.

回复收藏 0 原文

三生殊途 2024-09-01 04:10:43

我喜欢 Joey Adam 的答案，除了它是用宏编写的（这在许多情况下会造成真正的痛苦），并且如果“char”不是 1 字节宽，编译器不会给您警告。这是我的解决方案（基于乔伊的解决方案）。

inline uint32_t PACK(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
    return (c0 << 24) | (c1 << 16) | (c2 << 8) | c3;
}

inline uint32_t PACK(sint8_t c0, sint8_t c1, sint8_t c2, sint8_t c3) {
    return PACK((uint8_t)c0, (uint8_t)c1, (uint8_t)c2, (uint8_t)c3);
}

我省略了将 c0->c3 转换为 uint32_t ，因为编译器应该在转换时为您处理这个问题，并且我使用了 c 样式转换，因为它们适用于 c 或 c++（OP 标记为两者）。

I liked Joey Adam's answer except for the fact that it is written with macros (which cause a real pain in many situations) and the compiler will not give you a warning if 'char' isn't 1 byte wide. This is my solution (based off Joey's).

inline uint32_t PACK(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
    return (c0 << 24) | (c1 << 16) | (c2 << 8) | c3;
}

inline uint32_t PACK(sint8_t c0, sint8_t c1, sint8_t c2, sint8_t c3) {
    return PACK((uint8_t)c0, (uint8_t)c1, (uint8_t)c2, (uint8_t)c3);
}

I've omitted casting c0->c3 to a uint32_t as the compiler should handle this for you when shifting and I used c-style casts as they will work for either c or c++ (the OP tagged as both).

回复收藏 0 原文

旧人哭 2024-09-01 04:10:43

您可以避免使用隐式转换进行强制转换：

uint32_t pack_helper(uint32_t c0, uint32_t c1, uint32_t c2, uint32_t c3) {
    return c0 | (c1 << 8) | (c2 << 16) | (c3 << 24);
}

uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
    return pack_helper(c0, c1, c2, c3);
}

其想法是您看到“正确转换所有参数。移位并组合它们”，而不是“对于每个参数，正确转换它，移位并组合它们”。不过，其中内容并不多。

然后：

template <int N>
uint8_t unpack_u(uint32_t packed) {
    // cast to avoid potential warnings for implicit narrowing conversion
    return static_cast<uint8_t>(packed >> (N*8));
}

template <int N>
int8_t unpack_s(uint32_t packed) {
    uint8_t r = unpack_u<N>(packed);
    return (r <= 127 ? r : r - 256); // thanks to caf
}

int main() {
    uint32_t x = pack(4,5,6,-7);
    std::cout << (int)unpack_u<0>(x) << "\n";
    std::cout << (int)unpack_s<1>(x) << "\n";
    std::cout << (int)unpack_u<3>(x) << "\n";
    std::cout << (int)unpack_s<3>(x) << "\n";
}

输出：

这与 uint32_t、uint8_t 和 int8_t 类型一样可移植。 C99 中不需要它们，并且 C++ 或 C89 中未定义头文件 stdint.h。但是，如果类型存在并且满足 C99 要求，则代码将起作用。当然，在 C 中，解包函数需要函数参数而不是模板参数。如果您想编写用于解包的短循环，您可能也更喜欢在 C++ 中这样做。

为了解决类型是可选的这一事实，您可以使用 uint_least32_t，这在 C99 中是必需的。类似地，uint_least8_t 和int_least8_t。您必须更改 pack_helper 和 unpack_u 的代码：

uint_least32_t mask(uint_least32_t x) { return x & 0xFF; }

uint_least32_t pack_helper(uint_least32_t c0, uint_least32_t c1, uint_least32_t c2, uint_least32_t c3) {
    return mask(c0) | (mask(c1) << 8) | (mask(c2) << 16) | (mask(c3) << 24);
}

template <int N>
uint_least8_t unpack_u(uint_least32_t packed) {
    // cast to avoid potential warnings for implicit narrowing conversion
    return static_cast<uint_least8_t>(mask(packed >> (N*8)));
}

说实话，这不太值得 - 您的应用程序的其余部分很可能是在假设 int8_t 等确实存在的情况下编写的。这是一种罕见的没有 8 位和 32 位 2 补码类型的实现。

You can avoid casts with implicit conversions:

uint32_t pack_helper(uint32_t c0, uint32_t c1, uint32_t c2, uint32_t c3) {
    return c0 | (c1 << 8) | (c2 << 16) | (c3 << 24);
}

uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3) {
    return pack_helper(c0, c1, c2, c3);
}

The idea is that you see "convert all the parameters correctly. Shift and combine them", rather than "for each parameter, convert it correctly, shift and combine it". Not much in it, though.

Then:

template <int N>
uint8_t unpack_u(uint32_t packed) {
    // cast to avoid potential warnings for implicit narrowing conversion
    return static_cast<uint8_t>(packed >> (N*8));
}

template <int N>
int8_t unpack_s(uint32_t packed) {
    uint8_t r = unpack_u<N>(packed);
    return (r <= 127 ? r : r - 256); // thanks to caf
}

int main() {
    uint32_t x = pack(4,5,6,-7);
    std::cout << (int)unpack_u<0>(x) << "\n";
    std::cout << (int)unpack_s<1>(x) << "\n";
    std::cout << (int)unpack_u<3>(x) << "\n";
    std::cout << (int)unpack_s<3>(x) << "\n";
}

Output:

This is as portable as the uint32_t, uint8_t and int8_t types. None of them is required in C99, and the header stdint.h isn't defined in C++ or C89. If the types exist and meet the C99 requirements, though, the code will work. Of course in C the unpack functions would need a function parameter instead of a template parameter. You might prefer that in C++ too if you want to write short loops for unpacking.

To address the fact that the types are optional, you could use uint_least32_t, which is required in C99. Similarly uint_least8_t and int_least8_t. You would have to change the code of pack_helper and unpack_u:

uint_least32_t mask(uint_least32_t x) { return x & 0xFF; }

uint_least32_t pack_helper(uint_least32_t c0, uint_least32_t c1, uint_least32_t c2, uint_least32_t c3) {
    return mask(c0) | (mask(c1) << 8) | (mask(c2) << 16) | (mask(c3) << 24);
}

template <int N>
uint_least8_t unpack_u(uint_least32_t packed) {
    // cast to avoid potential warnings for implicit narrowing conversion
    return static_cast<uint_least8_t>(mask(packed >> (N*8)));
}

To be honest this is unlikely to be worth it - chances are the rest of your application is written on the assumption that int8_t etc do exist. It's a rare implementation that doesn't have an 8 bit and a 32 bit 2's complement type.

回复收藏 0 原文

网名女生简单气质 2024-09-01 04:10:43

“善良”
恕我直言，这是您将获得的最佳解决方案。编辑：虽然我会使用 static_cast 而不是 C 风格的强制转换，而且我可能不会使用单独的方法来隐藏强制转换......

可移植性：
不会有可移植的方法来做到这一点，因为没有任何规定 char 必须是 8 位，也没有规定 unsigned int 必须是 4 个字节宽。

此外，您依赖于字节顺序，因此在一种体系结构上打包的数据将无法在具有相反字节顺序的体系结构上使用。

有现成的解决方案吗？
我不知道。

回复收藏 0 原文

笑红尘 2024-09-01 04:10:43

这是基于 Grant Peters 和 Joey Adams 的答案，扩展为展示如何解包有符号值（解包函数依赖于 C 中无符号值的模规则）：（

正如 Steve Jessop 在评论中指出的那样，不需要单独的 pack_s 和 pack_u 函数）。

inline uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
{
    return ((uint32_t)c0 << 24) | ((uint32_t)c1 << 16) |
        ((uint32_t)c2 << 8) | (uint32_t)c3;
}

inline uint8_t unpack_c3_u(uint32_t p)
{
    return p >> 24;
}

inline uint8_t unpack_c2_u(uint32_t p)
{
    return p >> 16;
}

inline uint8_t unpack_c1_u(uint32_t p)
{
    return p >> 8;
}

inline uint8_t unpack_c0_u(uint32_t p)
{
    return p;
}

inline uint8_t unpack_c3_s(uint32_t p)
{
    int t = unpack_c3_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c2_s(uint32_t p)
{
    int t = unpack_c2_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c1_s(uint32_t p)
{
    int t = unpack_c1_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c0_s(uint32_t p)
{
    int t = unpack_c0_u(p);
    return t <= 127 ? t : t - 256;
}

（这些是必要的，而不是简单地转换回 int8_t ，因为如果值超过 127，后者可能会导致引发实现定义的信号，因此它不是严格可移植的）。

This is based on Grant Peters and Joey Adams' answers, extended to show how to unpack the signed values (the unpack functions rely on the modulo rules of unsigned values in C):

(As Steve Jessop noted in comments, there is no need for separate pack_s and pack_u functions).

inline uint32_t pack(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
{
    return ((uint32_t)c0 << 24) | ((uint32_t)c1 << 16) |
        ((uint32_t)c2 << 8) | (uint32_t)c3;
}

inline uint8_t unpack_c3_u(uint32_t p)
{
    return p >> 24;
}

inline uint8_t unpack_c2_u(uint32_t p)
{
    return p >> 16;
}

inline uint8_t unpack_c1_u(uint32_t p)
{
    return p >> 8;
}

inline uint8_t unpack_c0_u(uint32_t p)
{
    return p;
}

inline uint8_t unpack_c3_s(uint32_t p)
{
    int t = unpack_c3_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c2_s(uint32_t p)
{
    int t = unpack_c2_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c1_s(uint32_t p)
{
    int t = unpack_c1_u(p);
    return t <= 127 ? t : t - 256;
}

inline uint8_t unpack_c0_s(uint32_t p)
{
    int t = unpack_c0_u(p);
    return t <= 127 ? t : t - 256;
}

(These are necessary rather than simply casting back to int8_t, because the latter may cause an implementation-defined signal to be raised if the value is over 127, so it's not strictly portable).

回复收藏 0 原文

┾廆蒐ゝ 2024-09-01 04:10:43

您还可以让编译器为您完成这项工作。

union packedchars {
  struct {
    char v1,v2,v3,v4;
  }
  int data;
};

packedchars value;
value.data = 0;
value.v1 = 'a';
value.v2 = 'b;

ETC。

You could also let the compiler do the work for you.

union packedchars {
  struct {
    char v1,v2,v3,v4;
  }
  int data;
};

packedchars value;
value.data = 0;
value.v1 = 'a';
value.v2 = 'b;

Etc.

回复收藏 0 原文

~没有更多了~

关于作者

鸠书

暂无简介

0 文章

0 评论

25 人气

关注发私信

友情链接

文江博客

C/C++将signed char 打包成int

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（6）

关于作者

相关话题

热门标签

推荐作者

胡图图

zt006

z祗昰~

冰葑

野の

天空

友情链接

C/C++将signed char 打包成int

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（6）

关于作者

相关话题

热门标签

推荐作者

胡图图

zt006

z祗昰~

冰葑

野の

天空

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。