int a[] = {1,2,}; Why is a trailing comma allowed in an initializer list?

Posted on 2024-11-29 20:24:34

Maybe I am not from this planet, but it would seem to me that the following should be a syntax error:

int a[] = {1,2,}; // extra comma at the end

But it's not. I was surprised when this code compiled in Visual Studio, but I have learnt not to trust the MSVC compiler as far as C++ rules are concerned, so I checked the standard, and the standard allows it as well. You can see 8.5.1 for the grammar rules if you don't believe me.

Why is this allowed? This may be a stupid, useless question, but I want you to understand why I am asking. If it were a sub-case of a general grammar rule, I would understand: they decided not to make the general grammar any more difficult just to disallow a redundant comma at the end of an initializer list. But no, the additional comma is explicitly allowed. For example, a redundant comma is not allowed at the end of a function-call argument list (when the function takes ...), which is normal.

So, again, is there any particular reason this redundant comma is explicitly allowed?

Comments (21)

GRAY°灰色天空 2024-12-06 20:24:35

I see one use case that was not mentioned in the other answers,
our favorite: macros.

int a [] = {
#ifdef A
    1, // this can be the last entry if B and C are undefined
#endif
#ifdef B
    2,
#endif
#ifdef C
    3,
#endif
};

Adding macros to handle the last comma would be a big pain. With this small change in syntax it is trivial to manage. And this matters more than machine-generated code, because generation is usually a lot easier to do in a Turing-complete language than in the very limited preprocessor.
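
As a rough sketch of the pattern above, each conditional entry can carry its own comma, so any combination of flags produces a well-formed list. The FEATURE_* flag names, the kEnabled array, and the leading sentinel entry are assumptions made for illustration, not part of the answer:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature flags; in a real build they would come from -D options.
#define FEATURE_A
#define FEATURE_C

// Every entry ends with a comma, so no entry needs to know whether it is last.
// The leading 0 is an assumed sentinel that keeps the array non-empty
// even when no flag is defined.
const int kEnabled[] = {
    0,
#ifdef FEATURE_A
    1,
#endif
#ifdef FEATURE_B
    2,
#endif
#ifdef FEATURE_C
    3,
#endif
};

const std::size_t kEnabledCount = sizeof(kEnabled) / sizeof(kEnabled[0]);

// Small self-check for the flag combination chosen above (A and C defined).
void check_enabled() {
    assert(kEnabledCount == 3);
    assert(kEnabled[kEnabledCount - 1] == 3);
}
```

Without the trailing comma rule, the entry for FEATURE_C would have to drop its comma, and the entry for FEATURE_A would have to drop it whenever neither B nor C is defined.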

信仰 2024-12-06 20:24:35

One of the reasons this is allowed as far as I know is that it should be simple to automatically generate code; you don't need any special handling for the last element.

箹锭⒈辈孓 2024-12-06 20:24:35

It makes code generators that spit out arrays or enumerations easier.

Imagine:

std::cout << "enum Items {\n";
for (Items::iterator i(items.begin()), j(items.end()); i != j; ++i)
    std::cout << *i << ",\n";
std::cout << "};\n";

I.e., there is no need for special handling of the first or last item to avoid emitting the trailing comma.

If the code generator is written in Python, for example, it is easy to avoid emitting the trailing comma by using the str.join() function:

print("enum Items {")
print(",\n".join(items))
print("};")

路弥 2024-12-06 20:24:35

The reason is trivial: ease of adding/removing lines.

Imagine the following code:

int a[] = {
   1,
   2,
   //3, // - not needed any more
};

Now, you can easily add/remove items to the list without having to add/remove the trailing comma sometimes.

In contrast to other answers, I don't really think that ease of generating the list is a valid reason: after all, it's trivial for the code to special-case the last (or first) line. Code-generators are written once and used many times.

笑饮青盏花 2024-12-06 20:24:35

It allows every line to follow the same form. First, this makes it easier to add new rows and lets a version control system track the change meaningfully; it also allows you to analyze the code more easily. I can't think of a technical reason.

我纯我任性 2024-12-06 20:24:35

The only language where it is, in practice*, not allowed is JavaScript, and it causes innumerable problems. For example, if you copy and paste a line from the middle of an array, paste it at the end, and forget to remove the comma, then your site will be totally broken for your IE visitors.

*In theory it is allowed, but Internet Explorer doesn't follow the standard and treats it as an error.

最冷一天 2024-12-06 20:24:35

It's easier for machines, i.e. parsing and generation of code.
It's also easier for humans, i.e. modification, commenting out, and visual elegance via consistency.

Assuming C, would you write the following?

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    puts("Line 1");
    puts("Line 2");
    puts("Line 3");

    return EXIT_SUCCESS
}

No. Not only because the final statement is an error, but also because it's inconsistent. So why do the same to collections? Even in languages that allow you to omit last semicolons and commas, the community usually doesn't like it. The Perl community, for example, doesn't seem to like omitting semicolons, bar one-liners. They apply that to commas too.

Don't omit commas in multiline collections, for the same reason you don't omit semicolons in multiline blocks of code. I mean, you wouldn't do it even if the language allowed it, right? Right?

自找没趣 2024-12-06 20:24:35

This is allowed to protect from mistakes caused by moving elements around in a long list.

For example, let's assume we have code that looks like this.

#include <iostream>
#include <string>
#include <cstddef>
#define ARRAY_SIZE(array) (sizeof(array) / sizeof *(array))
int main() {
    std::string messages[] = {
        "Stack Overflow",
        "Super User",
        "Server Fault"
    };
    size_t i;
    for (i = 0; i < ARRAY_SIZE(messages); i++) {
        std::cout << messages[i] << std::endl;
    }
}

And it's great, as it shows the original trilogy of Stack Exchange sites.

Stack Overflow
Super User
Server Fault

But there is one problem with it. You see, the footer on this website shows Server Fault before Super User. Better fix that before anyone notices.

#include <iostream>
#include <string>
#include <cstddef>
#define ARRAY_SIZE(array) (sizeof(array) / sizeof *(array))
int main() {
    std::string messages[] = {
        "Stack Overflow",
        "Server Fault"
        "Super User",
    };
    size_t i;
    for (i = 0; i < ARRAY_SIZE(messages); i++) {
        std::cout << messages[i] << std::endl;
    }
}

After all, moving lines around couldn't be that hard, could it?

Stack Overflow
Server FaultSuper User

I know, there is no website called "Server FaultSuper User", but our compiler claims it exists. The issue is that C has a string literal concatenation feature, which lets you write two double-quoted strings next to each other and have them joined with nothing in between (a similar issue can also happen with integers, as the - sign has multiple meanings).

Now what if the original array had a useless comma at the end? Well, the lines would be moved around, but such a bug wouldn't have happened. It's easy to miss something as small as a comma. If you remember to put a comma after every array element, such a bug just cannot happen. You wouldn't want to waste four hours debugging something, only to find that the comma is the cause of your problems.
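
To make the difference easy to check, here is a minimal sketch that puts the two variants side by side and counts their elements. The names array_size, kBroken, kFixed and check_messages are made up for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of elements in a built-in array, deduced at compile time.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) { return N; }

// The comma after "Server Fault" is missing, so the two adjacent
// string literals are merged into a single element.
const std::string kBroken[] = {
    "Stack Overflow",
    "Server Fault"
    "Super User",
};

// Every element, including the last, is followed by a comma,
// so lines can be reordered without touching any punctuation.
const std::string kFixed[] = {
    "Stack Overflow",
    "Server Fault",
    "Super User",
};

void check_messages() {
    assert(array_size(kBroken) == 2);
    assert(kBroken[1] == "Server FaultSuper User");
    assert(array_size(kFixed) == 3);
}
```

Some compilers can warn about a suspicious concatenation inside an initializer list, but the code is valid, so the habit of always writing the comma is the more dependable safeguard.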

三寸金莲 2024-12-06 20:24:35

Like many things, the trailing comma in an array initializer is one of the things C++ inherited from C (and will have to support forever). A view totally different from those given here is mentioned in the book "Deep C Secrets".

There, after an example with more than one "comma paradox":

char *available_resources[] = {
"color monitor"           ,
"big disk"                ,
"Cray"                      /* whoa! no comma! */
"on-line drawing routines",
"mouse"                   ,
"keyboard"                ,
"power cables"            , /* and what's this extra comma? */
};

we read:

...that trailing comma after the final initializer is not a typo, but a blip in the syntax carried over from aboriginal C. Its presence or absence is allowed but has no significance. The justification claimed in the ANSI C rationale is that it makes automated generation of C easier. The claim would be more credible if trailing commas were permitted in every comma-separated list, such as in enum declarations, or multiple variable declarators in a single declaration. They are not.

... to me this makes more sense
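
One note for readers on newer compilers: part of the quoted complaint is dated. C99 and C++11 both permit a trailing comma in an enumerator list, while declarator lists and function-call argument lists still do not. A small sketch (the names Color, kValues and check_lists are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Trailing comma in an enumerator list: accepted since C99 and C++11.
enum Color {
    Red,
    Green,
    Blue,
};

// Trailing comma in a braced initializer list: accepted, as discussed above.
const int kValues[] = {1, 2,};

// Still rejected by the grammar (left commented out so the sketch compiles):
// int x, y, ;
// int z = some_function(1, 2, );

void check_lists() {
    assert(Blue == 2);
    assert(sizeof(kValues) / sizeof(kValues[0]) == 2);
}
```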

撞了怀 2024-12-06 20:24:35

In addition to code generation and editing ease, if you want to implement a parser, this type of grammar is simpler and easier to implement. C# follows this rule in several places where there's a list of comma-separated items, like the items in an enum definition.

转身泪倾城 2024-12-06 20:24:35

It makes generating code easier as you only need to add one line and don't need to treat adding the last entry as if it's a special case. This is especially true when using macros to generate code. There's a push to try to eliminate the need for macros from the language, but a lot of the language did evolve hand in hand with macros being available. The extra comma allows macros such as the following to be defined and used:

#define LIST_BEGIN int a[] = {
#define LIST_ENTRY(x) x,
#define LIST_END };

Usage:

LIST_BEGIN
   LIST_ENTRY(1)
   LIST_ENTRY(2)
LIST_END

That's a very simplified example, but often this pattern is used by macros for defining things such as dispatch, message, event or translation maps and tables. If a comma wasn't allowed at the end, we'd need a special:

#define LIST_LAST_ENTRY(x) x

and that would be very awkward to use.
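
The same idea is often written as an "X-macro", where one master list is expanded several times. The following is only a sketch under assumed names (COLOR_LIST, AS_ENUMERATOR, AS_STRING, kColorNames and color_name are all hypothetical); it relies on both the initializer-list trailing comma and, since C++11, the enumerator-list trailing comma:

```cpp
#include <cassert>
#include <string>

// Hypothetical master list: each entry is written exactly once.
#define COLOR_LIST(X) \
    X(Red)            \
    X(Green)          \
    X(Blue)

// Both expansions append a comma to every entry, including the last one.
#define AS_ENUMERATOR(name) name,
#define AS_STRING(name) #name,

enum Color { COLOR_LIST(AS_ENUMERATOR) };  // trailing comma needs C++11 here

const char* const kColorNames[] = { COLOR_LIST(AS_STRING) };

// Looks up the printable name that was generated from the same list.
std::string color_name(Color c) { return kColorNames[c]; }

void check_colors() {
    assert(Blue == 2);
    assert(color_name(Green) == "Green");
}
```

Because every expansion ends in a comma, adding a fourth color means adding one X(...) line and nothing else.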

焚却相思 2024-12-06 20:24:35

So that when two people add a new item in a list on separate branches, Git can properly merge the changes, because Git works on a line basis.

ㄖ落Θ余辉 2024-12-06 20:24:35

It makes editing the code a lot easier.
I'm comparing editing C/C++ array elements with editing JSON documents: if you forget to remove the last comma, the JSON will not parse. (Yes, I know JSON is not meant to be edited manually.)

风柔一江水 2024-12-06 20:24:35

Another great example of the usefulness of this rule is use with parameter packs, which may be empty. Demo:

template <int I, int... Is>    // at least one template argument is required...
struct Test {
  void test() {
    int arr[] = { I, Is... };  // ...because arrays of 0 length are not allowed
  }
};

Without allowing trailing commas, this would not compile in case of an empty parameter pack, such as for:

Test<1> t;  // results in 'int arr[] = { 1, };'
t.test();

Live demo: https://godbolt.org/z/c6fbY87Mr
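
A variant of the demo that can be checked locally is sketched below. Whether the optional trailing comma in the grammar is what makes the empty-pack case work is this answer's reading; the sketch only verifies that both instantiations compile and produce the expected element counts. The names element_count and check_packs are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Returns how many elements the initializer list { I, Is... } produced.
template <int I, int... Is>
std::size_t element_count() {
    int arr[] = { I, Is... };  // Is may be an empty pack
    return sizeof(arr) / sizeof(arr[0]);
}

void check_packs() {
    assert(element_count<1>() == 1);           // empty pack
    assert((element_count<1, 2, 3>() == 3));   // non-empty pack
}
```

The extra parentheses in the second assert are needed only because assert is a macro and would otherwise split its argument at the commas inside the angle brackets.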

浅暮の光 2024-12-06 20:24:34

It makes it easier to generate source code, and also to write code which can be easily extended at a later date. Consider what's required to add an extra entry to:

int a[] = {
   1,
   2,
   3
};

... you have to add the comma to the existing line and add a new line. Compare that with the case where the three already has a comma after it, where you just have to add a line. Likewise if you want to remove a line you can do so without worrying about whether it's the last line or not, and you can reorder lines without fiddling about with commas. Basically it means there's a uniformity in how you treat the lines.

Now think about generating code. Something like (pseudo-code):

output("int a[] = {");
for (int i = 0; i < items.length; i++) {
    output("%s, ", items[i]);
}
output("};");

No need to worry about whether the current item you're writing out is the first or the last. Much simpler.
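
The pseudo-code above could look roughly like this in C++. This is a sketch, not a real tool: generate_array and check_generator are hypothetical names, and the output is built as a string so that it is easy to inspect:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the text of an array definition, writing a comma after every item,
// so the loop needs no first-item or last-item special case.
std::string generate_array(const std::string& name, const std::vector<int>& items) {
    std::string out = "int " + name + "[] = {\n";
    for (std::size_t i = 0; i < items.size(); ++i) {
        out += "    " + std::to_string(items[i]) + ",\n";
    }
    out += "};\n";
    return out;
}

void check_generator() {
    const std::vector<int> items = {1, 2, 3};
    assert(generate_array("a", items) == "int a[] = {\n    1,\n    2,\n    3,\n};\n");
}
```

A generator that had to suppress the final comma would need either an index check inside the loop or a separate step for the last item.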

生来就爱笑 2024-12-06 20:24:34

It's useful if you do something like this:

int a[] = {
  1,
  2,
  3, //You can delete this line and it's still valid
};

酷炫老祖宗 2024-12-06 20:24:34

Ease of use for the developer, I would think.

int a[] = {
            1,
            2,
            2,
            2,
            2,
            2, /*line I could comment out easily without having to remove the previous comma*/
          };

Additionally, if for whatever reason you had a tool that generated code for you, the tool wouldn't have to care about whether an item is the last one in the initializer or not.

孤星 2024-12-06 20:24:34

I've always assumed it makes it easier to append extra elements:

int a[] = {
            5,
            6,
          };

simply becomes:

int a[] = { 
            5,
            6,
            7,
          };

at a later date.

檐上三寸雪 2024-12-06 20:24:34

I am surprised that after all this time no one has quoted the Annotated C++ Reference Manual (ARM), which says the following about [dcl.init] (emphasis mine):

There are clearly too many notations for initializations, but each seems to serve a particular style of use well. The ={initializer_list,opt} notation was inherited from C and serves well for the initialization of data structures and arrays. [...]

Although the grammar has evolved since the ARM was written, the origin remains.

We can go to the C99 rationale to see why this was allowed in C. It says:

K&R allows a trailing comma in an initializer at the end of an
initializer-list. The Standard has retained this syntax, since it
provides flexibility in adding or deleting members from an initializer
list, and simplifies machine generation of such lists.

弱骨蛰伏 2024-12-06 20:24:34

Everything everyone is saying about the ease of adding/removing/generating lines is correct, but the real place this syntax shines is when merging source files together. Imagine you've got this array:

int ints[] = {
    3,
    9
};

And assume you've checked this code into a repository.

Then your buddy edits it, adding to the end:

int ints[] = {
    3,
    9,
    12
};

And you simultaneously edit it, adding to the beginning:

int ints[] = {
    1,
    3,
    9
};

Semantically, these sorts of operations (adding to the beginning, adding to the end) should be entirely merge safe, and your versioning software (hopefully git) should be able to automerge. Sadly, this isn't the case, because your version has no comma after the 9 and your buddy's does. Whereas if the original version had a trailing comma after the 9, they would have automerged.

So, my rule of thumb is: use the trailing comma if the list spans multiple lines, don't use it if the list is on a single line.

三生池水覆流年 2024-12-06 20:24:34

I believe the trailing comma is allowed for backward compatibility reasons. There is a lot of existing code, primarily auto-generated, which puts in a trailing comma. It makes it easier to write a loop without a special condition at the end.
e.g.

std::for_each(my_inits.begin(), my_inits.end(),
    [](const std::string& value) { std::cout << value << ",\n"; });

There isn't really any advantage for the programmer.

P.S. Though it is easier to autogenerate the code this way, I have actually always taken care not to put in the trailing comma: the effort is minimal, readability is improved, and that's more important. You write code once, you read it many times.
