(C/C++) Do copies of a string literal share memory in the TEXT section?
If I call a function like
myObj.setType("fluid");
many times in a program, how many copies of the literal "fluid" are saved in memory? Can the compiler recognize that this literal is already defined and just reference it again?
Comments (6)
This has nothing to do with C++ (the language). Instead, it is an "optimization" that a compiler can do. So the answer is yes and no, depending on the compiler and platform you are using.
@David This is from the latest draft of the language:
The emphasis is mine.
In other words, string literals in C++ are immutable because modifying a string literal is undefined behavior. So the compiler is free to eliminate redundant copies.
BTW, I am talking about C++ only ;)
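As a small illustration of the immutability point above, the following sketch checks at compile time that a narrow string literal has an array-of-`const char` type. The helper `literal_length` is a hypothetical name introduced only for this example; it reads the literal and never writes to it.

```cpp
#include <cstddef>
#include <type_traits>

// In C++, the literal "fluid" is an lvalue of type const char[6]
// (five characters plus the terminating '\0'), so decltype yields
// a reference to that array type.
static_assert(std::is_same<decltype("fluid"), const char (&)[6]>::value,
              "a narrow string literal is an array of const char");

// Hypothetical helper: counts characters up to the terminator
// without modifying the string it is given.
constexpr std::size_t literal_length(const char* s) {
    return *s ? 1 + literal_length(s + 1) : 0;
}
```

Because the element type is `const char`, an assignment such as `"fluid"[0] = 'F'` is rejected by the compiler; casting the `const` away and writing anyway would be undefined behavior.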
Yes, it can. Of course, it depends on the compiler. For VC++, it's even configurable:
http://msdn.microsoft.com/en-us/library/s0s0asdt(VS.80).aspx
Yes it can, but there's no guarantee that it will. Define a constant if you want to be sure.
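A minimal sketch of the "define a constant" suggestion, assuming a `MyObj` class that stores the pointer it is given. The names `types::kFluid`, `MyObj`, and `same_storage` are made up for illustration; the question does not show the real class.

```cpp
#include <cstring>

// Assumed constant: one named array holds the text, so every use
// within this translation unit refers to the same storage.
namespace types {
    const char kFluid[] = "fluid";
}

// Hypothetical stand-in for the class in the question.
class MyObj {
public:
    void setType(const char* type) { type_ = type; }
    const char* type() const { return type_; }
private:
    const char* type_ = nullptr;
};

// Returns true when both objects point at the same storage.
bool same_storage(const MyObj& a, const MyObj& b) {
    return a.type() == b.type();
}
```

Calling `a.setType(types::kFluid)` and `b.setType(types::kFluid)` makes both objects hold the same address regardless of compiler settings. Note that a namespace-scope `const` array has internal linkage, so if it is placed in a header each translation unit gets its own copy; a single program-wide object needs an `extern` declaration or, from C++17, an `inline` variable.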
This is a compiler implementation issue. Many compilers that I have used have an option to share or merge duplicate string literals. Allowing duplicate string literals speeds up the compilation process but produces larger executables.
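One way to see what a given compiler does is to compare the addresses of two identical literals. This is only a sketch with made-up function names, and the result of the comparison is not fixed by the language: it can change with the compiler and with options such as MSVC's `/GF` or GCC's `-fmerge-constants`.

```cpp
#include <cstring>

// Two functions that each return a pointer to an identical literal.
const char* first_fluid()  { return "fluid"; }
const char* second_fluid() { return "fluid"; }

// True if the implementation stored both literals in one object.
// Either answer is permitted, so do not rely on it in real code.
bool literals_share_storage() {
    return first_fluid() == second_fluid();
}
```

The contents always compare equal with `std::strcmp`; only the pointer comparison depends on whether the literals were merged.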
I believe that C and C++ do not specify how this case is handled, but in most cases the compiler would emit multiple definitions of that string.
2.13.4/2: "whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined".
This permits the optimisation you're asking about.
As an aside, there may be a slight ambiguity, at least locally within that section of the standard. The definition of string literal doesn't quite make clear to me whether the following code uses one string literal twice, or two string literals once each:
But the next paragraph says "In translation phase 6 adjacent narrow string literals are concatenated". Unless it means to say that something can be adjacent to itself, I think the intention is pretty clear that this code uses two string literals, which are concatenated in phase 6. So it's not one string literal twice:
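The snippet from the original answer is not preserved here. As an assumed stand-in, the sketch below shows the phase 6 rule being discussed: two adjacent narrow literals are joined into one string before the program is compiled further. The function name `joined` is hypothetical.

```cpp
#include <cstring>

// The two adjacent literals "a" "a" are concatenated in translation
// phase 6, so this function returns a pointer to the string "aa".
const char* joined() { return "a" "a"; }
```

Since the two tokens are combined with each other, they are most naturally read as two separate literals that happen to spell the same characters.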
Still, if you did read that "a" and "a" are the same string literal, then the standard requires the optimisation you're talking about. But I don't think they are the same literal; I think they're different literals that happen to consist of the same characters. This is perhaps made clear elsewhere in the standard, for instance in the general information on grammar and parsing.
Whether it's made clear or not, many compiler-writers have interpreted the standard the way I think it is, so I might as well be right ;-)