Can RAM handle different data type sizes?
int, char, and bool usually have different sizes, with int > char > bool, I suppose.
- But does the RAM even support this?
- How is it built up?
- Can it take advantage of bool being only 1 byte and store it in a small "register"?
Comments (8)
On a normal, modern computer, all memory is byte-addressable. That is, each byte-sized storage location in RAM has a unique number assigned to it. If you want to store a one-byte value such as a bool (although bools are not required to be one byte in C++, they usually are), it takes a single storage location, say location 42.
If you want to store something larger than one byte, say an int, then it will take multiple consecutive storage locations. For example, if your int type is 16 bits (2 bytes) long, then half of it will be stored in location 42 and the other half in location 43. This generalizes to larger types. Say you have a 64-bit (8-byte) long long int type. A value of this type might be stored across locations 42, 43, 44, 45, 46, 47, 48, and 49.
There are some more advanced considerations, called "alignment", that some sorts of processors need to have respected. For example, a processor might have a rule that a two-byte value must begin on an even address, or that a four-byte value must begin on an address that is divisible by 4. Your compiler will take care of these details for you.
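As a rough illustration of the points above, here is a small C++ sketch that looks at how many bytes a value occupies and where the compiler places it. The struct Sample and the helper names are made up for this example, and the padding the compiler inserts is implementation-defined, so the comments only say what is typical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical struct used only for illustration.
struct Sample {
    bool flag;           // usually 1 byte
    std::int32_t count;  // exactly 4 bytes, usually 4-byte aligned
};

// True if address p is a multiple of `alignment`.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Copies the bytes of `value` into `out` in address order and returns how
// many consecutive storage locations (bytes) the value occupies.
template <typename T>
std::size_t copy_bytes(const T& value, unsigned char* out) {
    std::memcpy(out, &value, sizeof(T));
    return sizeof(T);
}

// Number of padding bytes the compiler placed between flag and count.
std::size_t padding_before_count() {
    return offsetof(Sample, count) - sizeof(bool);
}
```

On many desktop compilers padding_before_count() returns 3, because count is pushed to the next address divisible by 4, but that is a property of the platform rather than of the language.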
The compiler also knows how long each type is, so when it generates the machine code for your program, it will know at which address the storage for each variable begins, and it will know how many consecutive bytes the variable is stored in.
"Registers", on the other hand, exist in the processor, not in RAM, and usually have a fixed size. One use of processor registers is to hold a value retrieved from RAM. For example, if your processor has 32-bit (4-byte) registers, then a bool value loaded from RAM will still occupy an entire 4-byte register, even though it occupied only one byte while it was in RAM.
Computer memory is organized into "words", sequences of bytes of a given size (often a power of two). Memory is usually read and written in these units, which typically match the size of the registers and the CPU's native support for arithmetic operations. This is typically the source of the "bit rating" of a machine (e.g., a 32-bit CPU, a 64-bit CPU, the old 8-bit video game consoles).
Of course, you often need a different size from the native word size. Machine instructions and smart coding allow you to break these words into smaller units by applying various bit-level logical operators, or to build larger units by combining multiple words.
For instance, if you have a 32 bit word, you could AND a word against a pattern like 0xff0000ff to get the first and last byte in that word, or 0x0000ffff to get just the contents of the second 16-bit int.
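The masking described here might look like the following in C++. This is only a sketch: the function names are invented for the example, and it treats the least significant byte as byte 0, which is a convention chosen here rather than anything stated in the answer.

```cpp
#include <cassert>
#include <cstdint>

// Keeps only the most and least significant bytes of a 32-bit word.
std::uint32_t outer_bytes(std::uint32_t word) {
    return word & 0xff0000ffu;
}

// Keeps only the low 16 bits of a 32-bit word.
std::uint32_t low_half(std::uint32_t word) {
    return word & 0x0000ffffu;
}

// Extracts byte number `index` (0 = least significant) by shifting the
// wanted byte down to the bottom and masking off everything above it.
std::uint8_t byte_at(std::uint32_t word, unsigned index) {
    assert(index < 4);
    return static_cast<std::uint8_t>((word >> (8 * index)) & 0xffu);
}
```

For the word 0x12345678, outer_bytes gives 0x12000078 and low_half gives 0x00005678.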
In the case of bools, it is common to use memory as a bitmap. You can essentially place X "bools" in an X-bit word and access a specific bit by ANDing or ORing against a "mask" that refers to that bool. E.g., 1 for the first bit, 2 for the second bit, 4 for the third bit, etc.
On most machines, it is inadvisable to split a smaller data type across two words (keeping data within word boundaries is called alignment).
When you work with a higher level language like C or C++, you usually don't have to worry about all this memory organization stuff. If you allocate an int, a short, and a double, the compiler will generate the appropriate machine code. You only do this directly when you want to smartly organize things in dynamically allocated memory, for example when manually implementing a bitmap.
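A manually implemented bitmap of the kind mentioned above could be sketched like this in C++. The helper names are hypothetical, and a single 32-bit word is assumed as the storage, so it holds at most 32 flags.

```cpp
#include <cassert>
#include <cstdint>

// Mask with only bit `index` set: index 0 -> 1, index 1 -> 2, index 2 -> 4.
std::uint32_t bit_mask(unsigned index) {
    assert(index < 32);
    return std::uint32_t{1} << index;
}

// OR with the mask turns the flag on.
std::uint32_t set_bit(std::uint32_t bits, unsigned index) {
    return bits | bit_mask(index);
}

// AND with the inverted mask turns the flag off.
std::uint32_t clear_bit(std::uint32_t bits, unsigned index) {
    return bits & ~bit_mask(index);
}

// AND with the mask tells whether the flag is on.
bool test_bit(std::uint32_t bits, unsigned index) {
    return (bits & bit_mask(index)) != 0;
}
```

Starting from 0, setting bits 0 and 2 yields the value 5. The standard library's std::bitset offers the same idea ready-made if you do not need control over the layout.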
When working with larger units than the native word size, the compiler will again handle most things for you. For instance, on a 32-bit machine you can easily handle 32-bit int operations, but to run the same code on an 8-bit machine or a 16-bit machine the compiler would generate code to do the smaller operations and combine them to get the results. This is partially why it is generally considered advisable to run a 64-bit OS on a 64-bit machine, since otherwise you might be performing multiple instructions and read/writes to simulate 64-bit on a 32-bit OS rather than a single instruction or memory access.
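To make the "smaller operations combined" idea concrete, here is a hand-written C++ sketch of a 64-bit addition built from 32-bit pieces. It is not what any particular compiler emits. The Wide type and function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit words (hypothetical type).
struct Wide {
    std::uint32_t low;
    std::uint32_t high;
};

// Adds two 64-bit values using only 32-bit additions plus a carry.
Wide add_wide(Wide a, Wide b) {
    Wide result;
    result.low = a.low + b.low;  // unsigned addition wraps modulo 2^32
    // If the low sum wrapped around, it is smaller than either operand.
    std::uint32_t carry = (result.low < a.low) ? 1u : 0u;
    result.high = a.high + b.high + carry;
    return result;
}

// Reassembles the two halves into a native 64-bit integer for comparison.
std::uint64_t to_native(Wide w) {
    return (static_cast<std::uint64_t>(w.high) << 32) | w.low;
}
```

One 64-bit addition turns into two additions and a comparison here, which is the kind of extra work the answer refers to.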
Presumably you mean cache? Just curious why you're worried about the sizes of data structures, are you programming for embedded? That's usually the only time memory footprint is worth worrying about.
If you have several bit fields that you want to maintain concurrently you can use a byte as a bitfield and remember that values like
are each separate from each other and can be checked for independently of the others. People do this all the time to try and save a bit of space. Is that sort of what you're trying to figure out?
So for instance, if each bool takes up one byte of space, then obviously only one bit per byte is being used. So if you chain 8 bits together, it will only consume one byte of space.
But don't forget each variable in memory also has some sort of marshalling to it (more evident in .NET than in "lower" level languages, but there's always something tracking the variables in use). So like in the case of C# a single byte actually needs like 3 bytes of RAM.
But RAM is transferred in by the block, which is much larger as I understand it than a single byte. Usually that's measured in at least words, and the normal size is either 32, 64, or 128 bits at a time. Those numbers are platform dependent.
If by 'support' you mean does the RAM in a machine have a native storage unit matching each size, the answer is 'it depends on the machine and the compiler'.
Modern machines typically have minimum addressable storage sizes that are multiples of 8 bits (8/16/32/64 bits). Compilers can use any of those sizes to store and manipulate data. A compiler may optimize storage and register usage, but it does not have to.
RAM does not really care about data type sizes. It just stores data in bytes. The CPU controls the basic data types, knowing how many bytes they are. When an int is created, for example, the platform determines how many bytes it uses, such as 4 or 8 bytes (e.g., on a 32-bit or 64-bit architecture).
One bit cannot be addressed, but you can make a custom structure where you store 8 booleans in one byte. In C++, you can utilize this using bit fields.
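A bit-field structure along these lines might look as follows in C++. The struct and member names are hypothetical, and how bit fields are packed is implementation-defined, so the one-byte size mentioned below is what common compilers do rather than a guarantee.

```cpp
#include <cassert>

// Eight one-bit flags declared as bit fields.
// Common compilers pack these into a single byte.
struct PackedFlags {
    unsigned char f0 : 1;
    unsigned char f1 : 1;
    unsigned char f2 : 1;
    unsigned char f3 : 1;
    unsigned char f4 : 1;
    unsigned char f5 : 1;
    unsigned char f6 : 1;
    unsigned char f7 : 1;
};

// The same eight flags as ordinary bools: each one is separately
// addressable, so each takes at least one byte.
struct LooseFlags {
    bool f0, f1, f2, f3, f4, f5, f6, f7;
};
```

Members are read and written like normal fields (for example, flags.f3 = 1), but you cannot take the address of a bit-field member, which matches the point that a single bit is not addressable.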
What does that have to do with RAM?
A bool can be true or false, which is usually represented as 0 or 1 (1 bit).
A char can have different sizes, depending on the charset used. ASCII uses 7 bits. Unicode uses up to 32 bits.
Integers are whole numbers, often supporting the range -2^31 to 2^31-1 (32 bits), but they also come in other sizes.
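If you want to check these sizes and ranges for a given C++ compiler, a sketch like the one below can help. The helper storage_bits is invented for this example. Note that it reports storage size: a bool holds one bit of information, but C++ still gives it at least one whole byte of storage.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Number of bits of storage that type T occupies.
// sizeof counts bytes, and CHAR_BIT is the number of bits per byte.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// Smallest and largest values of a 32-bit signed integer.
constexpr std::int32_t int32_min = std::numeric_limits<std::int32_t>::min();
constexpr std::int32_t int32_max = std::numeric_limits<std::int32_t>::max();
```

Here int32_max is 2147483647, i.e. 2^31-1, and storage_bits<char32_t>() is at least 32, which is enough for any Unicode code point.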
You can use C++ bit fields if you like, but you will be one of the few on this planet who do it (technically, bit fields are well-defined in C, but they were never really used much).
How RAM is accessed is hidden from you by the C++ compiler for good reasons. There are cases where you want to optimize this, but they are extremely rare. In today's world of massive amounts of RAM in client PCs, it's just not worth that amount of micro-optimization.
Generally, you should trust your (optimizing) compiler to do the right thing. The source code you deliver to the compiler only vaguely resembles the machine code produced by the compiler. It's a myth that micro-optimizations help much if your compiler is good. You have to know exactly where the compiler needs help in its optimization process to optimize better than the compiler. You can even make matters worse if the compiler decides your code is too complicated to optimize.
If you want some technical background:
On machine language level it depends on the processor. For example the Motorola 680x0 line of processors allows you to do
to read and write different "units" of RAM (long/word/byte). The processor views its RAM differently depending on which instruction it processes. Some embedded processors may even use 4 bits as their smallest unit.
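The processor instructions themselves are not shown in the text above, so the following is only a C++ analogue of the idea, not 680x0 code: the same bytes of memory are read and written as 1-, 2-, or 4-byte units. The function names are made up, and std::memcpy is used so the access is valid regardless of alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads a unit of sizeof(T) bytes starting at memory[offset].
template <typename T>
T load_unit(const unsigned char* memory, std::size_t offset) {
    T value;
    std::memcpy(&value, memory + offset, sizeof(T));
    return value;
}

// Writes a unit of sizeof(T) bytes starting at memory[offset].
template <typename T>
void store_unit(unsigned char* memory, std::size_t offset, T value) {
    std::memcpy(memory + offset, &value, sizeof(T));
}
```

Calling load_unit<std::uint8_t>, load_unit<std::uint16_t>, or load_unit<std::uint32_t> on the same buffer corresponds to viewing the memory in byte, word, or long units. The numeric result of the wider reads depends on the byte order of the machine.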
Most commodity hardware has byte-addressable memory. Looking a bit deeper, we see that CPU registers have a given bit width (32 or 64 bits for everyday stuff). Then caches and buses operate on larger blocks (64 or 128 bytes or so). You can try taking advantage of this, but you'd need a pretty detailed understanding of the hardware, and you'd be binding yourself to a particular platform. On the other hand, you don't have to take advantage of this, since your compiler already does.