Size of primitive data types
What exactly does the size of a primitive data type like int depend on?
- Compiler
- Processor
- Development Environment
Or is it a combination of these or other factors?
An explanation of the reasons would be really helpful.
EDIT: Sorry for the confusion. I meant to ask about primitive data types like int, not about PODs. I do understand that PODs can include structures, and that with structures it is a whole different ball game, with padding coming into the picture.
I have corrected the question; this edit note should ensure that the answers regarding POD don't look irrelevant.
I think there are two parts to this question:
- What sizes primitive types are allowed to be.
This is specified by the C and C++ standards: each type has a minimum range of values it must be able to represent, which implicitly places a lower bound on its size in bits (e.g. long must be at least 32 bits to comply with the standard). The standards do not specify the size in bytes, because the definition of a byte is up to the implementation: a char is one byte, but the byte size (the CHAR_BIT macro) may be, for example, 16 bits.
- The actual size as defined by the implementation.
This, as other answers have already pointed out, depends on the implementation: the compiler. The compiler implementation, in turn, is heavily influenced by the target architecture. So it is plausible to have two compilers running on the same OS and architecture but with different sizes of int. The only assumptions you can make are the ones stated by the standard (given that the compiler implements it). There may also be additional ABI requirements (e.g. a fixed size for enums).
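As a rough sketch of the lower bounds described above, the following compile-time checks encode the minimum widths that follow from the standards' minimum value ranges. The helper name bit_width_of is hypothetical and only serves the illustration; it reports the storage width, which may include padding bits on unusual implementations.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Lower bounds implied by the minimum value ranges in the C and C++ standards.
// These hold on every conforming implementation; the actual widths may be larger.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(sizeof(short) * CHAR_BIT >= 16, "short has at least 16 bits");
static_assert(sizeof(int) * CHAR_BIT >= 16, "int has at least 16 bits");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long has at least 32 bits");
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long has at least 64 bits");

// Hypothetical helper: storage width of T in bits on this implementation.
template <typename T>
constexpr std::size_t bit_width_of() {
    return sizeof(T) * CHAR_BIT;
}
```

Anything beyond these bounds, such as bit_width_of<int>() being exactly 32, is a property of the particular compiler and target rather than of the language.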
First of all, it depends on the compiler. The compiler, in turn, usually depends on the architecture, processor, development environment and so on, because it takes them into account. So you could say it is a combination of all of them, but I would NOT say that. I would say the compiler, since on the same machine you may get different sizes for POD and built-in types if you use different compilers. Also note that your source code is the input to the compiler, so it is the compiler that makes the final decision about the sizes of POD and built-in types. However, it is also true that this decision is influenced by the underlying architecture of the target machine. After all, a really useful compiler has to emit efficient code that eventually runs on the machine you target.
Compilers provide options too. A few of them may also affect sizes!
EDIT: What the standards say:
The size of char, signed char and unsigned char is defined by the C++ Standard itself! The sizes of all other types are defined by the compiler. This is stated in C++03 Standard §5.3.3/1.
The C99 Standard (§6.5.3.4) likewise defines the size of char, signed char and unsigned char to be 1, but leaves the sizes of other types to be defined by the compiler!
EDIT: I found this C++ FAQ chapter really good. The entire chapter. It's a very tiny chapter, though. :-)
http://www.parashift.com/c++-faq-lite/intrinsic-types.html
Also read the comments below; there are some good arguments!
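To make the point about char concrete, here is a small sketch: the three character types are the only ones whose sizeof is fixed by the standard, and every other sizeof result is simply a count of chars chosen by the implementation. The helper size_in_chars is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>   // std::size_t

// Fixed by the C and C++ standards on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1, "sizeof(signed char) is 1 by definition");
static_assert(sizeof(unsigned char) == 1, "sizeof(unsigned char) is 1 by definition");

// Hypothetical helper: how many chars one object of type T occupies.
// For anything other than the character types the result is implementation-defined.
template <typename T>
constexpr std::size_t size_in_chars() {
    return sizeof(T);
}
```

Printing size_in_chars<int>() or size_in_chars<long>() with two different compilers, or with different compiler options, is a quick way to see the implementation-defined part in action.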
If you're asking about the size of a primitive type like int, I'd say it depends on the factors you cited.
The compiler/environment pair (where environment often means the OS) is surely a part of it, since the compiler can map the various "sensible" sizes onto the built-in types in different ways for various reasons: for example, compilers on x86_64 Windows will usually have a 32-bit long and a 64-bit long long to avoid breaking code written for plain x86; on x86_64 Linux, instead, long is usually 64 bits because it is a more "natural" choice and apps developed for Linux are generally more architecture-neutral (because Linux runs on a much greater variety of architectures).
The processor surely matters in the decision: int should be the "natural size" of the processor, usually the size of its general-purpose registers. This means that it is the type that will work fastest on the current architecture. long, instead, is often thought of as a type that trades performance for an extended range (this is rarely true on regular PCs, but on microcontrollers it is normal).
If instead you're also talking about structs & co. (which, if they respect some rules, are POD), again the compiler and the processor influence their size, since they are made of built-in types and of the appropriate padding chosen by the compiler to achieve the best performance on the target architecture.
As I commented under @Nawaz's answer, it technically depends solely on the compiler.
The compiler is just tasked with taking valid C++ code and outputting valid machine code (or whatever language it targets).
So a C++ compiler could decide to make an int have a size of 15, and require it to be aligned on 5-byte boundaries, and it could decide to insert arbitrary padding between the variables in a POD. Nothing in the standard prohibits this, and it could still generate working code. It'd just be much slower.
So in practice, compilers take some hints from the system they target, in two ways:
- the CPU has certain preferences: for example, it may have 32-bit wide registers, so making an int 32 bits wide would be a good idea, and it usually requires variables to be naturally aligned (a 4-byte wide variable must be aligned on an address divisible by 4, for example), so a sensible compiler respects these preferences because doing so yields faster code.
- the OS may have some influence too, in that if it uses a different ABI from the compiler, making system calls is going to be needlessly difficult.
But those are just practical considerations to make life a bit easier for the programmer or to generate faster code. They're not required.
The compiler has the final word, and it can choose to completely ignore both the CPU and the OS, as long as it generates a working executable with the semantics specified in the C++ standard.
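The alignment and padding choices mentioned above can be inspected rather than guessed. This is a minimal sketch assuming C++11 (for alignof); the struct Sample and the helper padding_in_sample are hypothetical names made up for the example. The static_asserts only state what holds for any layout the compiler picks, while the amount of padding itself is left to the implementation.

```cpp
#include <cstddef>   // std::size_t, offsetof

// Hypothetical POD used only for illustration.
struct Sample {
    char tag;
    int value;
};

// True whatever layout the compiler chooses:
static_assert(sizeof(Sample) >= sizeof(char) + sizeof(int),
              "a struct is at least as large as its members combined");
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "the size is a multiple of the alignment, so arrays stay aligned");
static_assert(offsetof(Sample, value) % alignof(int) == 0,
              "each member sits at an offset suitable for its type");

// Hypothetical helper: bytes of padding the compiler inserted into Sample.
inline std::size_t padding_in_sample() {
    return sizeof(Sample) - (sizeof(char) + sizeof(int));
}
```

On a typical desktop target with a 4-byte, 4-aligned int, padding_in_sample() would be 3, but nothing in the standard fixes that number.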
It depends on the implementation (compiler).
Implementation-defined behavior means unspecified behavior where each implementation documents how the choice is made.
A struct can also be POD, in which case you can explicitly control potential padding between members with #pragma pack on some compilers.
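A minimal sketch of what that looks like, assuming a compiler that accepts the #pragma pack(push, 1) / #pragma pack(pop) form (GCC, Clang and MSVC do; other compilers may ignore or reject it). The struct names and the helper bytes_saved_by_packing are hypothetical.

```cpp
#include <cstddef>   // std::size_t

// Default layout: the compiler may insert padding after tag.
struct Unpacked {
    char tag;
    int value;
};

// Non-standard extension: request 1-byte packing for the structs declared
// between push and pop, then restore the previous setting.
#pragma pack(push, 1)
struct Packed {
    char tag;
    int value;
};
#pragma pack(pop)

// Hypothetical helper: bytes of padding removed by packing on this compiler.
inline std::size_t bytes_saved_by_packing() {
    return sizeof(Unpacked) - sizeof(Packed);
}
```

Packing trades space for speed: accessing value in Packed may be unaligned, which is slower on some CPUs and can fault on others, so it is usually reserved for matching an external binary layout.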