Are string literals const?

Posted on 2024-10-08 08:53:37

Neither GCC nor Clang complains if I assign a string literal to a char*, even with lots of pedantic options (-Wall -W -pedantic -std=c99):

char *foo = "bar";

while they (of course) do complain if I assign a const char* to a char*.

Does this mean that string literals are considered to be of type char*? Shouldn't they be const char*? Modifying them is undefined behavior, after all!

And (an unrelated question) what about command line parameters (i.e. argv): is argv considered to be an array of string literals?

Comments (8)

倾城泪 2024-10-15 08:53:37

They are of type char[N] where N is the number of characters including the terminating \0. So yes you can assign them to char*, but you still cannot write to them (the effect will be undefined).

Regarding argv: it points to an array of pointers to strings. Those strings are explicitly modifiable: you can change them, and they are required to hold the last stored value.
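A minimal sketch of the char[N] point, assuming nothing beyond standard C. The helper names literal_array_size and modified_copy are made up for illustration. It shows that sizeof sees the array type of the literal, and that the safe way to get writable text is to initialise a char array from the literal and modify that copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof sees the literal's array type, char[4] for "bar"
   ('b', 'a', 'r', '\0'), not the size of a pointer. */
size_t literal_array_size(void)
{
    return sizeof("bar");
}

/* Hypothetical helper: writes a modified copy of "bar" into out. */
void modified_copy(char *out, size_t out_size)
{
    char copy[] = "bar";  /* a fresh, writable char[4] initialised from the literal */

    assert(out != NULL);
    if (out_size == 0)
        return;
    copy[0] = 'c';        /* fine: this changes the copy, not the literal */
    strncpy(out, copy, out_size - 1);
    out[out_size - 1] = '\0';
}
```

Writing through a char* that points at the literal itself (for example foo[0] = 'c' after char *foo = "bar";) is the case that remains undefined.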

情深缘浅 2024-10-15 08:53:37

For completeness' sake, the C99 draft standard (C89 and C11 have similar wording), in section 6.4.5 String literals, paragraph 5, says:

[...]a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence;[...]

So a string literal has static storage duration (it lasts for the lifetime of the program), its type is char[] (not char *), and its length is the size of the string literal plus the appended zero. Paragraph 6 says:

If the program attempts to modify such an array, the behavior is undefined.

So attempting to modify a string literal is undefined behavior regardless of the fact that they are not const.

With respect to argv, section 5.1.2.2.1 Program startup, paragraph 2, says:

If they are declared, the parameters to the main function shall obey the following constraints:

[...]

- The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

So argv is not considered an array of string literals and it is ok to modify the contents of argv.
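As an illustration of that guarantee, here is a small sketch that rewrites the argument strings in place. The function name upcase_args is hypothetical; it only relies on the standard promise that the strings pointed to by argv are modifiable, and it keeps each string at its original length.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases every argument after argv[0] in place. */
void upcase_args(int argc, char **argv)
{
    assert(argv != NULL);
    for (int i = 1; i < argc; i++) {
        for (char *p = argv[i]; *p != '\0'; p++) {
            /* the cast matters: toupper expects a value representable
               as unsigned char (or EOF) */
            *p = (char)toupper((unsigned char)*p);
        }
    }
}
```

From main this would be called as upcase_args(argc, argv). Doing the same to a pointer into a string literal would be undefined behavior.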

夜无邪 2024-10-15 08:53:37

With the -Wwrite-strings option you will get:

warning: initialization discards qualifiers from pointer target type

Irrespective of that option, GCC puts literals into a read-only memory section unless told otherwise with -fwritable-strings (this option has, however, been removed from recent GCC versions).

Command line parameters are not const; they typically live on the stack.

人间☆小暴躁 2024-10-15 08:53:37

(Sorry, I've only just noticed this question is tagged as c, not c++. Maybe my answer isn't so relevant to this question after all!)

String literals are not quite const or not-const, there is a special strange rule for literals.

(Summary: Literals can be taken by reference-to-array as foo( const char (&)[N]) and cannot be taken as the non-const array. They prefer to decay to const char *. So far, that makes it seem like they are const. But there is a special legacy rule which allows literals to decay to char *. See experiments below.)

(Following experiments done on clang3.3 with -std=gnu++0x. Perhaps this is a C++11 issue? Or specific to clang? Either way, there is something strange going on.)

At first, literals appear to be const:

#include <iostream>

void foo( const char  * ) { std::cout << "const char *" << std::endl; }
void foo(       char  * ) { std::cout << "      char *" << std::endl; }

int main() {
        const char arr_cc[3] = "hi";
        char arr_c[3] = "hi";

        foo(arr_cc); // const char *
        foo(arr_c);  //       char *
        foo("hi");   // const char *
}

The two arrays behave as expected, demonstrating that foo is able to tell us whether the pointer is const or not. Then "hi" selects the const version of foo. So it seems like that settles it: literals are const ... aren't they?

But if you remove void foo( const char * ), it gets strange. First, the call to foo(arr_cc) fails with an error at compile time. That is expected. But the literal call (foo("hi")) works via the non-const overload.

So, literals are "more const" than arr_c (because they prefer to decay to const char *, unlike arr_c). But literals are "less const" than arr_cc, because they are willing to decay to char * if needed.

(Clang gives a warning when it decays to char *).

But what about the decaying? Let's avoid it for simplicity.

Let's take the arrays by reference into foo instead. This gives us more 'intuitive' results:

void foo( const char  (&)[3] ) { std::cout << "const char (&)[3]" << std::endl; }
void foo(       char  (&)[3] ) { std::cout << "      char (&)[3]" << std::endl; }

As before, the literal and the const array (arr_cc) use the const version, and the non-const version is used by arr_c. And if we delete foo( const char (&)[3] ), then we get errors with both foo(arr_cc); and foo("hi");. In short, if we avoid the pointer-decay and use reference-to-array instead, literals behave as if they are const.

Templates?

In templates, the system will deduce const char * instead of char * and you're "stuck" with that.

template<typename T>
void bar(T *t) { // will deduce   const char   when a literal is supplied
    foo(t);
}

So basically, a literal behaves as const at all times, except in the particular case where you directly initialize a char * with a literal.
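Since the question is tagged C, here is a rough C counterpart of the overload experiment, as a sketch only. It uses C11 _Generic in place of overloading; the macro POINTER_KIND and the function names are hypothetical. In plain C the literal decays to char *, unlike the C++ result above. With GCC's -Wwrite-strings, which gives literals the type const char[N] when compiling C, the literal case would report const char * instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro: names the pointer type that x decays to.
   &(x)[0] spells out the array-to-pointer step explicitly. */
#define POINTER_KIND(x) _Generic(&(x)[0],      \
        const char *: "const char *",          \
        char *:       "char *")

/* In standard C a literal has type char[N], so this yields "char *"
   (assumption: compiled without -Wwrite-strings). */
const char *kind_of_literal(void)
{
    return POINTER_KIND("hi");
}

/* The two named arrays behave as their declarations suggest. */
void check_arrays(void)
{
    const char arr_cc[3] = "hi";
    char arr_c[3] = "hi";

    assert(strcmp(POINTER_KIND(arr_cc), "const char *") == 0);
    assert(strcmp(POINTER_KIND(arr_c), "char *") == 0);
}
```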

嘿咻 2024-10-15 08:53:37

Johannes' answer is correct concerning the type and contents. But in addition to that, yes, it is undefined behavior to modify contents of a string literal.

Concerning your question about argv:

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

红ご颜醉 2024-10-15 08:53:37

In both C89 and C99, string literals are of type char[N], which decays to char * rather than const char * (for historical reasons, as I understand it). You are correct that trying to modify one results in undefined behavior. GCC has a specific warning flag, -Wwrite-strings (which is not part of -Wall), that will warn you if you try to do so.

As for argv, the arguments are copied into your program's address space, and can safely be modified in your main() function.

EDIT: Whoops, had -Wno-write-strings copied by accident. Updated with the correct (positive) form of the warning flag.

情定在深秋 2024-10-15 08:53:37

String literals have formal type char [] but semantic type const char []. The purists hate it but this is generally useful and harmless, except for bringing lots of newbies to SO with "WHY IS MY PROGRAM CRASHING?!?!" questions.

冷弦 2024-10-15 08:53:37

They are const char*, but there is a specific exception that allows assigning them to char*, for legacy code that existed before const did. And the command line arguments are definitely not literals; they are created at run-time.
