How is data stored at the bit level according to "Endianness"?

Posted on 2024-09-03 08:36:14

I read about Endianness and understood squat...

so I wrote this

main()
{
    int k = 0xA5B9BF9F;

    BYTE *b = (BYTE*)&k;    //value at *b is 9f
    b++;    //value at *b is BF
    b++;    //value at *b is B9
    b++;    //value at *b is A5
}

k was equal to A5 B9 BF 9F

and (byte)pointer "walk" o/p was 9F BF b9 A5

so I get it bytes are stored backwards...ok.

so now I thought how is it stored at BIT level...

I mean, is "9f" (1001 1111) stored as "f9" (1111 1001)?

so I wrote this

int _tmain(int argc, _TCHAR* argv[])
{
    int k = 0xA5B9BF9F;
    void *ptr = &k;
    bool temp= TRUE;
    cout<<"ready or not here I come \n"<<endl;

    for(int i=0;i<32;i++)
    {   
        temp = *( (bool*)ptr + i );
        if( temp )
            cout<<"1 ";
        if( !temp)
            cout<<"0 ";
        if(i==7||i==15||i==23)
            cout<<" - ";
   }
}

I get some random output

Even for numbers like "32" I don't get anything sensible.

why ?

相思碎 2024-09-10 08:36:14

Just for completeness, machines are described in terms of both byte order and bit order.

The Intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.

The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (same as Intel, which is why it is called 'Inconsistent' Big Endian).

The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.

Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.
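
As a rough illustration of the byte-order half of this, here is a minimal sketch that checks at run time which byte of a 32-bit value sits at the lowest address. The names bytes_in_memory and is_little_endian are made up for this example, and it assumes a machine that is either plain little endian or plain big endian.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Copy the object representation of a 32-bit value into a byte array,
// in increasing address order. (Hypothetical helper for this sketch.)
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value) {
    std::array<unsigned char, 4> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// True when the least significant byte sits at the lowest address.
// Assumes the machine is either plain little endian or plain big endian.
bool is_little_endian() {
    return bytes_in_memory(0x01020304u)[0] == 0x04;
}
```

On an x86 machine bytes_in_memory(0xA5B9BF9F) should come back as 9F BF B9 A5, matching the pointer walk in the question. Nothing in the sketch can observe the bit numbering convention, which only shows up in assembly or hardware documentation.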

三生一梦 2024-09-10 08:36:14

Endianness, as you discovered by your experiment refers to the order that bytes are stored in an object.

Bits within a byte do not get reordered: a byte is always 8 bits, and from the program's point of view its bits always read in the usual "human readable" order (high->low).

Now that we've discussed that you don't need your code... About your code:

for(int i=0;i<32;i++)
{   
  temp = *( (bool*)ptr + i );
  ...
}

This isn't doing what you think it's doing. You're iterating over 0-31, one iteration per bit in a 32-bit word, which is good. But your temp assignment is all wrong :)

It's important to note that a bool* is the same size as an int*, which is the same size as a BigStruct*. On typical platforms all object pointers are the same size: 32 bits on a 32-bit machine, 64 bits on a 64-bit machine.

(bool*)ptr + i adds i * sizeof(bool) bytes to the address in ptr, and sizeof(bool) is usually 1, so each step moves on by a whole byte, not a bit. When i > 3, you're reading past the end of the int into unrelated memory, which is undefined behavior and could possibly cause a segfault.

What you want to use is bit-masks. Something like this should work:

for (int i = 0; i < 32; i++) {
  unsigned int mask = 1u << i;  // a single 1 at bit position i
  // Test the value k itself, not the pointer to it.
  bool bit_is_one = (static_cast<unsigned int>(k) & mask) != 0;
  ...
}
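
For reference, the bit-mask idea can be sketched as a small helper that renders all 32 bits as text. The name bits_of is made up for this example, and it prints the most significant bit first (the loop above starts from bit 0 instead), because that is the order in which the hex digits are written.

```cpp
#include <cstdint>
#include <string>

// Render the 32 bits of a value as text, most significant bit first,
// with a space between bytes. (bits_of is a hypothetical name.)
std::string bits_of(std::uint32_t value) {
    std::string out;
    for (int i = 31; i >= 0; --i) {
        std::uint32_t mask = std::uint32_t{1} << i;  // a single 1 at bit i
        out += (value & mask) ? '1' : '0';
        if (i % 8 == 0 && i != 0) out += ' ';        // byte separator
    }
    return out;
}
```

bits_of(0xA5B9BF9F) gives "10100101 10111001 10111111 10011111" on any machine, little or big endian, because the masks work on the value and never look at how it is laid out in memory.
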
半暖夏伤 2024-09-10 08:36:14

Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.

To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.

烛影斜 2024-09-10 08:36:14

Byte Endianness

On different machines this code may give different results:

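#include <stdio.h>

union endian_example {
    unsigned long u;
    unsigned char a[sizeof(unsigned long)];
};

int main(void)
{
    union endian_example x;
    size_t i;

    x.u = 0x0a0b0c0d;

    /* Print each byte of x.u in increasing address order. */
    for (i = 0; i < sizeof(unsigned long); i++) {
        printf("%u\n", (unsigned)x.a[i]);
    }
    return 0;
}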

This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.

Bit Endianness

Usually you don't ever have to worry about bit endianness. The most common way to access individual bits is with shifts ( >>, << ), but those are really tied to values, not bytes or bits. They perform an arithmetic operation on a value. That value is stored in bits (which are in bytes).
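
To make the "tied to values" point concrete, here is a small sketch (the helper name byte_at is invented for the example) that picks out byte n of a value with a shift and a mask. It is written in C++ to match the question's code, though the same expression works in C.

```cpp
#include <cstdint>

// Return byte n of value, where byte 0 is the least significant byte.
// The shift works on the numeric value, so the result does not depend
// on the order in which the machine stores the bytes in memory.
std::uint8_t byte_at(std::uint32_t value, unsigned n) {
    return static_cast<std::uint8_t>((value >> (8 * n)) & 0xFFu);
}
```

byte_at(0xA5B9BF9F, 0) is 0x9F and byte_at(0xA5B9BF9F, 3) is 0xA5 on every machine, unlike the union example above, whose output depends on the byte order.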

Where you may run into a problem with bit endianness in C is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.

struct thing {
     unsigned y:1; // y will be one bit and can have the values 0 and 1
     signed z:1; // z can only have the values 0 and -1
     unsigned a:2; // a can be 0, 1, 2, or 3
     unsigned b:4; // b is just here to take up the rest of the byte
};

Here the bit endianness is compiler dependent. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (describing things like the layout of an IPv4 packet header, the control registers of a device, or just a storage format in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.
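
When the layout matters, one common alternative is to pack the fields by hand with shifts and masks, so that the bit positions are fixed by the code rather than by the compiler. The sketch below is only an illustration: the layout (y in bit 0, z in bit 1, a in bits 2-3, b in bits 4-7) and the names Thing, pack_thing and unpack_thing are assumptions made for the example, and z is treated as a plain 0/1 flag rather than a signed field.

```cpp
#include <cstdint>

// Plain (non-bit-field) view of the four members.
struct Thing {
    unsigned y;  // 1 bit
    unsigned z;  // 1 bit, treated here as 0 or 1
    unsigned a;  // 2 bits
    unsigned b;  // 4 bits
};

// Pack into one byte using a layout chosen for this sketch:
// y in bit 0, z in bit 1, a in bits 2-3, b in bits 4-7.
std::uint8_t pack_thing(const Thing& t) {
    return static_cast<std::uint8_t>(
        (t.y & 0x1u) | ((t.z & 0x1u) << 1) |
        ((t.a & 0x3u) << 2) | ((t.b & 0xFu) << 4));
}

// Reverse of pack_thing, using the same assumed layout.
Thing unpack_thing(std::uint8_t byte) {
    return Thing{ byte & 0x1u, (byte >> 1) & 0x1u,
                  (byte >> 2) & 0x3u, (byte >> 4) & 0xFu };
}
```

With this layout, pack_thing(Thing{1, 0, 2, 9}) is 0x99 whichever compiler builds it, which is the property you want for packet headers, device registers and file formats.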

你的心境我的脸 2024-09-10 08:36:14

This line here:

temp = *( (bool*)ptr + i );

... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the pointer moves along by the size of one bool per step (typically one byte in C++, although the size is implementation-defined; the Windows BOOL typedef, by contrast, is an int). Either way each step covers at least a whole byte rather than a single bit, so by the end of the loop you'll be printing out memory from well beyond the int.

You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".
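
As a sketch of what that driver-side translation might look like for a single byte, the helper below lists the bits in the order they would go out on the wire. The name wire_bits and the msb_first flag are invented for the example; a real driver would follow whatever order its protocol specifies (I2C, for instance, sends the most significant bit first, while a typical UART sends the least significant bit first).

```cpp
#include <cstdint>
#include <string>

// List the bits of one byte in the order they would be sent on the wire.
// (wire_bits and msb_first are hypothetical names for this sketch.)
std::string wire_bits(std::uint8_t byte, bool msb_first) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        int position = msb_first ? 7 - i : i;  // which bit goes out next
        out += ((byte >> position) & 1u) ? '1' : '0';
    }
    return out;
}
```

wire_bits(0x9F, true) is "10011111" and wire_bits(0x9F, false) is "11111001". The second is the "f9" pattern from the question, but it only exists as an order in time on the wire, not as something you can see in memory.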
