Are string literals const?

Neither GCC nor Clang complains if I assign a string literal to a `char*`, even when using lots of pedantic options (`-Wall -W -pedantic -std=c99`):

    char *foo = "bar";

while they (of course) do complain if I assign a `const char*` to a `char*`.

Does this mean that string literals are considered to be of type `char*`? Shouldn't they be `const char*`? Modifying them is undefined behavior!

And (an unrelated question) what about command line parameters (i.e. `argv`): are they considered to be an array of string literals?

They are of type `char[N]`, where `N` is the number of characters including the terminating `\0`. So yes, you can assign them to `char*`, but you still cannot write to them (the effect will be undefined).

With respect to `argv`: it points to an array of pointers to strings. Those strings are explicitly modifiable. You can change them, and they are required to hold the last stored value.
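
For completeness' sake, the C99 draft standard (C89 and C11 have similar wording), in section 6.4.5 String literals, paragraph 5, says: "In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; [...]"

So a string literal has static storage duration (it lasts for the lifetime of the program), its type is `char[]` (not `char *`), and its length is the size of the string literal with an appended zero. Paragraph 6 says: "It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined."

So attempting to modify a string literal is undefined behavior, even though the arrays are not `const`.

With respect to `argv`, section 5.1.2.2.1 Program startup, paragraph 2, says: "The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination."

So `argv` is not considered an array of string literals, and it is OK to modify the contents of `argv`.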

With the `-Wwrite-strings` option you will get a warning for this initialization. Irrespective of that option, GCC will put literals into a read-only memory section unless told otherwise with `-fwritable-strings` (however, this option has been removed from recent GCC versions).

Command line parameters are not const; they typically live on the stack.

(Sorry, I've only just noticed this question is tagged as `c`, not `c++`. Maybe my answer isn't so relevant to this question after all!)

String literals are not quite `const` or not-`const`; there is a special, strange rule for literals.

(Summary: Literals can be taken by reference-to-array as `foo( const char (&)[N] )` and cannot be taken as the non-const array. They prefer to decay to `const char *`. So far, that makes it seem like they are `const`. But there is a special legacy rule which allows literals to decay to `char *`. See the experiments below.)

(The following experiments were done on clang 3.3 with `-std=gnu++0x`. Perhaps this is a C++11 issue? Or specific to clang? Either way, there is something strange going on.)

At first, literals appear to be `const`. With overloads `void foo( char * )` and `void foo( const char * )`, a non-const array `arr_c`, and a const array `arr_cc`, the two arrays behave as expected, demonstrating that `foo` is able to tell us whether the pointer is `const` or not. Then `"hi"` selects the `const` version of `foo`. So it seems like that settles it: literals are `const` ... aren't they?

But if you remove `void foo( const char * )`, it gets strange. First, the call to `foo(arr_cc)` fails with an error at compile time. That is expected. But the literal call (`foo("hi")`) works via the non-const overload.

So literals are "more const" than `arr_c` (because they prefer to decay to `const char *`, unlike `arr_c`). But literals are "less const" than `arr_cc`, because they are willing to decay to `char *` if needed. (Clang gives a warning when a literal decays to `char *`.)

But what about the decaying? Let's avoid it for simplicity and take the arrays by reference into `foo` instead, with overloads `foo( char (&)[3] )` and `foo( const char (&)[3] )`. This gives us more 'intuitive' results. As before, the literal and the const array (`arr_cc`) use the const version, and the non-const version is used by `arr_c`. And if we delete `foo( const char (&)[3] )`, then we get errors with both `foo(arr_cc);` and `foo("hi");`. In short, if we avoid the pointer decay and use reference-to-array instead, literals behave as if they are `const`.

Templates? In templates, the system will deduce `const char *` instead of `char *`, and you're "stuck" with that.

So basically, a literal behaves as `const` at all times, except in the particular case where you directly initialize a `char *` with a literal.

Johannes' answer is correct concerning the type and contents. In addition to that, yes, it is undefined behavior to modify the contents of a string literal.

Concerning your question about `argv`: the C standard requires that `argc`, `argv`, and the strings pointed to by the `argv` array be modifiable by the program.

In both C89 and C99, string literals have a non-const type: an array of `char`, which decays to `char *` (for historical reasons, as I understand it). You are correct that trying to modify one results in undefined behavior. GCC has a specific warning flag, `-Wwrite-strings` (which is not part of `-Wall`), that will warn you when a string literal is assigned to a non-const `char *`.

As for `argv`, the arguments are copied into your program's address space, and can safely be modified in your `main()` function.

EDIT: Whoops, had `-Wno-write-strings` copied by accident. Updated with the correct (positive) form of the warning flag.

String literals have formal type `char []` but semantic type `const char []`. The purists hate it, but this is generally useful and harmless, except for bringing lots of newbies to SO with "WHY IS MY PROGRAM CRASHING?!?!" questions.

They behave like `const char*`, but there is a specific exclusion for assigning them to `char*`, for legacy code that existed before `const` did. And the command line arguments are definitely not literals; they are created at run time.