Why are DWORD values usually represented in hexadecimal?

I am trying to understand why a DWORD value is so often described in hexadecimal on MSDN.

The reason I am analyzing this is that I am trying to understand, fundamentally, why all these different numeric data types exist. A local mentor hinted to me that the creation of DWORD and other Microsoft types had something to do with the evolution of processors. This gives meaning and context to my understanding of these data types, and I would like more background.

Either way, I could use some explanation of, or some resources on, how to remember the differences between DWORD, unsigned integers, bytes, bits, WORD, etc.

In summary, my questions are:
1) Why are DWORDs represented in Hex?
2) Can you provide resources on the differences between numerical data types and why they were created?

乖乖哒 2024-11-08 11:15:31

Everything within a computer is a bunch of 0s and 1s. But writing an entire DWORD in binary is quite tedious:

00000000 11111111 00000000 11111111

To save space and improve readability, we like to write it in a shorter form. Decimal is what we're most familiar with, but it doesn't map well to binary. Octal and hexadecimal map quite conveniently, lining up exactly with the binary bits:

// each octal digit is exactly 3 binary digits
01 010 100 binary  =  124 octal

// each hexadecimal digit is exactly 4 binary digits
0101 0100 binary   =  54 hexadecimal

Since hex lines up very nicely with 8-bit Bytes (2 hex digits make a Byte), the notation stuck, and that's what gets used most. It's easier to read, easier to understand, easier to line up when messing around with bitmasks.

The normal shorthand for identifying which base is being used:

  1234543 = decimal
 01234543 = octal (leading zero)
0x1234543 = hexadecimal (starts with 0x)
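
A minimal C sketch of both points above, assuming a C99 compiler (the variable names are mine): the same value written in the three notations, plus a byte extraction that is easy to eyeball because two hex digits are exactly one byte.

#include <stdio.h>

int main(void) {
    /* One value, three notations: decimal, octal (leading 0), hex (leading 0x). */
    printf("%d %d %d\n", 84, 0124, 0x54);     /* prints: 84 84 84 */

    /* The DWORD from above: 00000000 11111111 00000000 11111111 */
    unsigned int v = 0x00FF00FF;
    unsigned int byte2 = (v >> 16) & 0xFF;    /* third byte from the right: 0xFF */
    printf("0x%02X\n", byte2);                /* prints: 0xFF */
    return 0;
}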

As for your question about BYTE, WORD, DWORD, etc...

Computers started with the bit: only a 1 or a 0. (Bit even had a cameo in the original Tron.)

Bytes are 8 bits long (well, once upon a time there were 7-bit bytes, but we can ignore those). This allows you to have a number from 0-255, or a signed number from -128 to 127. Better than just 1/0, but still limited. You may have heard references to "8-bit gaming". This is what we refer to. The system was built around Bytes.

Then computers grew to have 16-bit registers. This is 2 Bytes, and became known as a WORD (no, I don't know why). Now, numbers could be 0-65535 or -32768 to 32767.

We continued to want more power, and computers were expanded to 32-bit registers. 4 Bytes, 2 Words, also known as a DWORD (double-word). To this day, you can look in "C:\Windows" and see a directory for "system" (old 16-bit pieces) and "system32" (new 32-bit components).

Then came the QWORD (quad-word): 4 words, 8 bytes, 64 bits. Ever heard of the Nintendo 64? That's where the name came from. Modern architectures are 64-bit: the internals of the CPU contain 64-bit registers, and you can generally run either a 32-bit or a 64-bit operating system on such CPUs.

That covers Bit, Byte, Word, Dword. Those are raw types, and are used often for flags, bitmasks, etc. If you want to hold an actual number, it's best to use signed/unsigned integer, long, etc.
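
If you want these raw, fixed-size types in portable code, here is a minimal C sketch, assuming a C11 compiler (the Windows-style alias names are illustrative, not the SDK's actual definitions):

#include <stdint.h>

/* Strictly sized aliases in the BYTE/WORD/DWORD/QWORD spirit. */
typedef uint8_t  BYTE;    /*  8 bits */
typedef uint16_t WORD;    /* 16 bits */
typedef uint32_t DWORD;   /* 32 bits */
typedef uint64_t QWORD;   /* 64 bits */

/* Sanity checks: these fail to compile on a platform that can't provide the sizes. */
_Static_assert(sizeof(WORD)  == 2, "WORD must be 2 bytes");
_Static_assert(sizeof(DWORD) == 4, "DWORD must be 4 bytes");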

I didn't cover floating point numbers, but hopefully this helps with the general idea.

云归处 2024-11-08 11:15:31

DWORD constants are typically written in hex when they are used as flags that can be OR'd together in bitwise fashion. Hex makes it easy to see that that is so: 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, etc. are values programmers recognize at a glance as having binary representations with just a single bit set.

When it's an enumeration you'd see 0x01, 0x02, 0x03, etc. They'd often still be written in hex, because programmers tend to get into these habits!
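
A small C sketch of that flag pattern (the flag names are mine): each constant occupies one bit, so any combination can be built with | and tested with &.

#include <stdio.h>

/* Hypothetical flag set: one bit per flag, obvious in hex. */
#define FLAG_READ   0x01   /* 0001 */
#define FLAG_WRITE  0x02   /* 0010 */
#define FLAG_APPEND 0x04   /* 0100 */
#define FLAG_LOCK   0x08   /* 1000 */

int main(void) {
    unsigned int mode = FLAG_READ | FLAG_WRITE;     /* combine with OR */
    if (mode & FLAG_WRITE)                          /* test with AND   */
        printf("writable, mode = 0x%02X\n", mode);  /* prints 0x03     */
    return 0;
}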

凉风有信 2024-11-08 11:15:31

Just for the record, 16-bit unsigned data was named WORD because, at the time, computers had 16-bit registers.

Earlier in computer history, 8 bits was the biggest chunk of data you could store in a register. Since it could hold an ASCII character, it was commonly called a CHAR.

But then 16-bit computers came out, and CHAR was not an appropriate name for 16-bit data. So 16-bit data was commonly called a WORD, because it was the biggest unit of data you could store in one register, which nicely continued the analogy made for CHAR.

So, on some computers with different CPUs, WORD commonly refers to the size of a register. On the Saturn CPU, which uses 64-bit registers, a WORD is 64 bits.

When 32-bit x86 processors came out, WORD stayed 16 bits for compatibility reasons, and DWORD was created to extend it to 32 bits. The same is true for QWORD and 64 bits.

As for why hexadecimal is commonly used to describe a WORD, it has to do with the definition of a WORD being tied to its register origin. In assembler programming you use hexadecimal to describe data, because processors only know binary integers (0s and 1s), and hexadecimal is a more compact way to write binary while keeping some of its properties.

不疑不惑不回忆 2024-11-08 11:15:31

You have a very interesting and tricky question.

In short, two drivers led to the existence of the two competing type families, DWORD-based and int-based:

1) The desire for cross-platform portability on the one hand, and for strictly sized types on the other.

2) People's conservatism.

In any case, to give a full, detailed answer to your question and enough background in this field, we must dig into computer history and start our story from the early days of computing.

First, there is the notion of a machine word. A machine word is a strictly sized chunk of binary data that is natural for processing on a particular processor. So the size of the machine word is highly processor-dependent, and in general it equals the size of the general-purpose internal processor registers. Usually it can be subdivided into two equal parts that the processor can also access as independent chunks of data. For example, on 32-bit x86 processors the machine word size is 32 bits, which means all the general registers (eax, ebx, ecx, edx, esi, edi, ebp, esp and eip) have the same size: 32 bits. Many of them can also be accessed as parts of the register: you can access eax as a 32-bit data chunk, ax as a 16-bit data chunk, or even al as an 8-bit data chunk, yet physically this is all one 32-bit register. I think you can find very good background on this on Wikipedia (http://en.wikipedia.org/wiki/Word_(computer_architecture)). In short, the machine word is the size of the data chunk that can be used as an integer operand of a single instruction. Even today, different processor architectures have different machine word sizes.

OK, now that we have some understanding of the machine word, it is time to go back to the history of computing. The first Intel x86 processor to become popular had a 16-bit word size. It came to the market in 1978. At that time assembler was highly popular, if not the primary programming language. As you know, assembler is just a very thin wrapper over the native processor language, and because of this it is entirely hardware-dependent. When Intel pushed their new 8086 processor onto the market, the first thing they needed in order to succeed was to get an assembler for the new processor onto the market too: nobody wants a processor that nobody knows how to program. When Intel chose names for the different data types in the 8086 assembler, they made the obvious choice and named the 16-bit data chunk a word, because the machine word of the 8086 is 16 bits. Half of the machine word was called a byte (8 bits), and two words used as one operand were called a double word (32 bits). Intel used these terms in the processor manuals and in the assembler mnemonics (db, dw and dd for static allocation of a byte, a word and a double word).

Years passed, and in 1985 Intel moved from the 16-bit architecture to a 32-bit one with the introduction of the 80386 processor. But by then there was a huge number of developers accustomed to a word being a 16-bit value, and a huge amount of software written in the firm belief that a word is 16 bits; much of the already-written code relied on that fact. So, even though the machine word size actually changed, the notation stayed the same, except that a new data type arrived in the assembler: the quad word (64 bits), because the instructions that operated on two machine words stayed the same while the machine word itself was extended. In the same way, the double quad word (128 bits) has now arrived with the 64-bit AMD64 architecture. As a result we have

byte    =   8 bit
word    =  16 bit
dword   =  32 bit
qword   =  64 bit
dqword  = 128 bit

Note the main thing about this type family: it is a strictly sized family of types. Because it comes from assembler and is heavily used there, it needs data types of constant size. The years pass one by one, but the data types in this family keep the same constant sizes, even though their names have long since lost their original meaning.

On the other hand, over the same years high-level languages became more and more popular. Because those languages were developed with cross-platform applications in mind, they looked at the sizes of their internal data types from an entirely different point of view. If I understand correctly, no high-level language clearly claims that some of its internal data types have a fixed, constant size that will never change in the future. Let's take C++ as an example. The C++ standard says:

"The fundamental storage unit in the C++ memory model is the byte. A byte is at 
least large enough to contain any member of the basic execution character set and 
is composed of a contiguous sequence of bits, the number of which is implementa-
tion-defined. The least significant bit is called the low-order bit; the most 
significant bit is called the high-order bit. The memory available to a C++ program
consists of one or more sequences of contiguous bytes. Every byte has a unique 
address."

So we see something surprising: in C++ even the byte does not have a fixed size. Even though we are accustomed to thinking of a byte as 8 bits, according to C++ it can be not only 8 but also 9, 10, 11, 12, etc. bits wide (though not fewer than 8, since CHAR_BIT is required to be at least 8).

"There are five signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs."

That quote makes two main claims:

1) sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)

2) Plain ints have the natural size suggested by the architecture of the execution environment. That means an int should have the machine-word size of the target processor architecture.

You can go through the entire text of the C++ standard, but you will fail to find anything like "the size of int is 4 bytes" or "long is 64 bits long". The size of a particular C++ integer type can change when you move from one processor architecture to another, and from one compiler to another. But even when you write programs in C++, you will periodically face the requirement to use data types of a well-known, constant size.
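
A minimal C sketch of that point: the only portable guarantee is the ordering of the sizes, so the exact numbers printed below vary by platform and compiler.

#include <stdio.h>

int main(void) {
    /* Only char <= short <= int <= long <= long long is guaranteed. */
    printf("char:      %zu\n", sizeof(char));       /* always 1 by definition */
    printf("short:     %zu\n", sizeof(short));
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));       /* 4 on 64-bit Windows (LLP64),
                                                       8 on 64-bit Linux (LP64) */
    printf("long long: %zu\n", sizeof(long long));
    return 0;
}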

At least the early compiler developers followed those claims of the standard. But now we can see people's conservatism come into play one more time. People are used to thinking that an int is 32 bits and can store values in the range from -2,147,483,648 to 2,147,483,647. Earlier, when the industry crossed the border between 16-bit and 32-bit architectures, the second claim was strictly enforced: when you used a C++ compiler to create a 16-bit program, the compiler used a 16-bit int, the "natural size" for 16-bit processors; and when you used another C++ compiler to build a 32-bit program from the same source code, the compiler used a 32-bit int, the "natural size" for 32-bit processors. Nowadays, if you look at the Microsoft C++ compiler, for example, you will find that it uses a 32-bit int regardless of the target processor architecture (32-bit or 64-bit), simply because people are used to thinking that int is 32 bits!

In summary, we can see that there are two data-type families: DWORD-based and int-based. The motivation for the second one is obvious: cross-platform application development. The motivation for the first one covers all the cases where the sizes of variables genuinely matter. Among others, we can mention the following cases:

1) You need to hold a value from a predefined, well-known range inside a class or another data structure that will be instantiated in huge numbers at run time. If you use an int-based type to store that value, you get huge memory overhead on some architectures and potentially broken logic on others. For example, suppose you need to manipulate values in the range 0 to 1,000,000: stored in an int, the program behaves correctly if int is 32 bits, carries a 4-byte memory overhead per value instance if int is 64 bits, and does not work correctly at all if int is 16 bits.

2) Data involved in networking. To handle your network protocol correctly on different PCs, you need to specify it in a plain, size-exact format that describes every packet and header bit by bit. Your network communication will be completely broken if on one PC the protocol header is 20 bytes long (with a 32-bit int) and on another it is 28 bytes long (with a 64-bit int).

3) Your program needs to store values used by special processor instructions, or it communicates with modules or chunks of code written in assembler.

4) You need to store values used to communicate with devices. Each device has its own specification that describes what sort of input it requires and in what form it provides output. If a device requires a 16-bit value as input, it must receive exactly a 16-bit value, regardless of the size of int and even regardless of the machine-word size of the processor in the system the device is installed in.

5) Your algorithm relies on integer-overflow logic. For example, you have an array of 2^16 entries, and you want to walk through it endlessly and sequentially, refreshing the entry values. With a 16-bit unsigned index your program works perfectly, but the moment you move to a 32-bit index you get out-of-range array accesses (see the sketch just after this list).
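
A small C sketch of case 5, assuming a C99 compiler: the uint16_t index wraps from 65535 back to 0 by definition (unsigned overflow is well-defined), while a wider index would walk past the end of the array.

#include <stdint.h>

#define ENTRIES 65536                  /* 2^16 */
static uint8_t table[ENTRIES];

void refresh_forever(void) {
    uint16_t i = 0;                    /* wraps 65535 -> 0, exactly what this loop wants */
    for (;;) {
        table[i] = (uint8_t)(table[i] + 1);
        i++;                           /* with a plain 32-bit i and no masking,
                                          table[65536] would be an out-of-range access */
    }
}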

Because of this, Microsoft uses both families of data types: int-based types where the actual data size is of no great importance, and DWORD-based types where it is. And even then, Microsoft defines the latter as typedefs/macros, providing the ability to adapt the virtual type system it uses to a particular processor architecture and/or compiler quickly and easily, by assigning it the correct C++ equivalent.
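
For a concrete picture, this is roughly how the Windows headers spell those aliases (a sketch: the real definitions live in windef.h and related headers; QWORD as shown here is my addition for symmetry, since the SDK itself tends to use DWORD64/ULONGLONG):

/* Windows-style strictly sized aliases, approximately as the SDK defines them. */
typedef unsigned char      BYTE;   /*  8 bits */
typedef unsigned short     WORD;   /* 16 bits */
typedef unsigned long      DWORD;  /* 32 bits even on 64-bit Windows (LLP64 model) */
typedef unsigned long long QWORD;  /* 64 bits; illustrative, not from windef.h */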

I hope I have covered the question about the origin of these data types and their differences well enough.

So we can switch to the second question: why are hexadecimal digits used to denote the values of DWORD-based data types? There are actually a few reasons:

1) If we use strictly sized binary data types, it is natural to want to look at them in a form close to binary.

2) Bit-mask values are much easier to understand when they are encoded close to binary. Agree that it is much easier to see which bits are set and which are reset when the value is in this form

1100010001011001

than when it is encoded like this

50265

3) A DWORD-based value written in fixed-width hexadecimal has a constant length, while the same data encoded in decimal has a variable length. Note that even for a small number the full-width description is provided,

0x00000100

instead of

0x100

This property of the encoding is very attractive when you need to analyze a huge amount of binary data: for example, in a hex editor, or when inspecting the raw memory used by your program in a debugger after a breakpoint is hit. Agree that it is much more comfortable to look at neat columns of values than at a heap of weakly aligned, variable-size values.
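
A tiny C illustration of the fixed-width point: padding to eight hex digits keeps every DWORD the same width, so the columns line up while the decimal column stays ragged.

#include <stdio.h>

int main(void) {
    unsigned int values[] = { 0x100, 0xFF00FF, 0x1 };
    for (int i = 0; i < 3; i++)
        printf("%10u    0x%08X\n", values[i], values[i]);
    return 0;
}

/* Output:
          256    0x00000100
     16711935    0x00FF00FF
            1    0x00000001 */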

So we have decided that we want an encoding close to binary, and we have three choices: plain binary, octal, and hexadecimal. People prefer hexadecimal because it is the shortest of the available encodings. Just compare

1001000110100010101101000

and

0x1234568

Can you quickly find the number of the bit that is set in the next value?

00000000000100000000000000000000

And in this one?

0x00100000

In the second case you can quickly divide the number into four separate bytes,

0x00 0x10 0x00 0x00
   3    2    1    0

where in each byte the first digit denotes the 4 most significant bits and the second the 4 least significant bits. After you have spent some time working with hex values, you will remember the plain bit pattern of each hexadecimal digit and will swap one for the other in your head without any problem:

0 - 0000  4 - 0100  8 - 1000  C - 1100
1 - 0001  5 - 0101  9 - 1001  D - 1101
2 - 0010  6 - 0110  A - 1010  E - 1110
3 - 0011  7 - 0111  B - 1011  F - 1111

So we need only a second or two to see that bit number 20 is set!

People use hex because it is the shortest form of binary data encoding that is comfortable to understand and use.

昨迟人 2024-11-08 11:15:31

To elaborate on Tim's answer, it's because converting Hex to binary and back is very easy - each Hex digit is 4 binary digits:

0x1 = 0001
0x2 = 0010
...
0xD = 1101
0xE = 1110
0xF = 1111

So, 0x2D = 0010 1101
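
A throwaway C helper that makes the digit-by-digit conversion concrete (the function name is mine):

#include <stdio.h>

/* Print a value in binary, one 4-bit group per hex digit. */
static void print_nibbles(unsigned int v, int hex_digits) {
    for (int d = hex_digits - 1; d >= 0; d--) {
        unsigned int nibble = (v >> (d * 4)) & 0xF;   /* one hex digit */
        for (int b = 3; b >= 0; b--)
            putchar(((nibble >> b) & 1) ? '1' : '0');
        if (d > 0)
            putchar(' ');
    }
    putchar('\n');
}

int main(void) {
    print_nibbles(0x2D, 2);   /* prints: 0010 1101 */
    return 0;
}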
