What are the rules about using an underscore in a C++ identifier?

Posted on 2024-07-07 13:55:09

It's common in C++ to name member variables with some kind of prefix to denote the fact that they're member variables, rather than local variables or parameters. If you've come from an MFC background, you'll probably use m_foo. I've also seen myFoo occasionally.

C# (or possibly just .NET) seems to recommend using just an underscore, as in _foo. Is this allowed by the C++ standard?

Comments (6)

酒绊 2024-07-14 13:55:09

The rules (which did not change in C++11):

  • Reserved in any scope, including for use as implementation macros:
    • identifiers beginning with an underscore followed immediately by an uppercase letter
    • identifiers containing adjacent underscores (or "double underscore")
  • Reserved in the global namespace:
    • identifiers beginning with an underscore
  • Also, everything in the std namespace is reserved. (You are allowed to add template specializations, though.)
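The three reservation rules above can be sketched as a small compile-time classifier. This is only an illustration: `Reservation`, `is_upper`, and `classify` are hypothetical names, not part of any standard API, and the check assumes plain ASCII identifiers.

```cpp
#include <string_view>

// Hypothetical helper types and functions, for illustration only.
enum class Reservation { None, GlobalNamespaceOnly, Everywhere };

// ASCII-only uppercase test (an assumption; the standard's notion of
// "uppercase letter" is not re-implemented here).
constexpr bool is_upper(char c) { return c >= 'A' && c <= 'Z'; }

constexpr Reservation classify(std::string_view name) {
    // Adjacent underscores anywhere in the name: reserved in any scope.
    if (name.find("__") != std::string_view::npos) {
        return Reservation::Everywhere;
    }
    // Underscore followed immediately by an uppercase letter: reserved in any scope.
    if (name.size() >= 2 && name[0] == '_' && is_upper(name[1])) {
        return Reservation::Everywhere;
    }
    // Any other leading underscore: reserved only in the global namespace.
    if (!name.empty() && name[0] == '_') {
        return Reservation::GlobalNamespaceOnly;
    }
    return Reservation::None;
}
```

Under these assumptions, `classify("m_foo")` yields `Reservation::None`, while `classify("_foo")` yields `Reservation::GlobalNamespaceOnly`, which is why `_foo` is acceptable as a class member but not as a global name.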

From the 2003 C++ Standard:

17.4.3.1.2 Global names [lib.global.names]

Certain sets of names and function signatures are always reserved to the implementation:

  • Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.
  • Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace. (165)

165) Such names are also reserved in namespace ::std (17.4.3.1).

The C++ language is based on the C language (1.1/2, C++03), and C99 is a normative reference (1.2/1, C++03), so it's useful to know the restrictions from the 1999 C Standard (although they do not apply to C++ directly):

7.1.3 Reserved identifiers

Each header declares or defines all identifiers listed in its associated subclause, and
optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.

  • All identifiers that begin with an underscore and either an uppercase letter or another
    underscore are always reserved for any use.
  • All identifiers that begin with an underscore are always reserved for use as identifiers
    with file scope in both the ordinary and tag name spaces.
  • Each macro name in any of the following subclauses (including the future library
    directions) is reserved for use as specified if any of its associated headers is included;
    unless explicitly stated otherwise (see 7.1.4).
  • All identifiers with external linkage in any of the following subclauses (including the
    future library directions) are always reserved for use as identifiers with external
    linkage. (154)
  • Each identifier with file scope listed in any of the following subclauses (including the
    future library directions) is reserved for use as a macro name and as an identifier with
    file scope in the same name space if any of its associated headers is included.

No other identifiers are reserved. If the program declares or defines an identifier in a
context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved
identifier as a macro name, the behavior is undefined.

If the program removes (with #undef) any macro definition of an identifier in the first
group listed above, the behavior is undefined.

154) The list of reserved identifiers with external linkage includes errno, math_errhandling, setjmp, and va_end.

Other restrictions might apply. For example, the POSIX standard reserves a lot of identifiers that are likely to show up in normal code:

  • Names beginning with a capital E followed by a digit or uppercase letter:
    • may be used for additional error code names.
  • Names that begin with either is or to followed by a lowercase letter:
    • may be used for additional character testing and conversion functions.
  • Names that begin with LC_ followed by an uppercase letter:
    • may be used for additional macros specifying locale attributes.
  • Names of all existing mathematics functions suffixed with f or l are reserved:
    • for corresponding functions that operate on float and long double arguments, respectively.
  • Names that begin with SIG followed by an uppercase letter are reserved:
    • for additional signal names.
  • Names that begin with SIG_ followed by an uppercase letter are reserved:
    • for additional signal actions.
  • Names beginning with str, mem, or wcs followed by a lowercase letter are reserved:
    • for additional string and array functions.
  • Names beginning with PRI or SCN followed by any lowercase letter or X are reserved:
    • for additional format specifier macros
  • Names that end with _t are reserved:
    • for additional type names.

While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of that standard.


Personally, I just don't start identifiers with underscores. New addition to my rule: don't use double underscores anywhere, which is easy because I rarely use underscores.

After doing research on this article I no longer end my identifiers with _t
as this is reserved by the POSIX standard.

The rule about any identifier ending with _t surprised me a lot. I think that is a POSIX standard (not sure yet) looking for clarification and official chapter and verse. This is from the GNU libtool manual, listing reserved names.

CesarB provided the following link to the POSIX 2004 reserved symbols and notes 'that many other reserved prefixes and suffixes ... can be found there'. The POSIX 2008 reserved symbols are defined here. The restrictions are somewhat more nuanced than those above.

人疚 2024-07-14 13:55:09

The rules to avoid name collisions are both in the C++ standard (see Stroustrup's book) and mentioned by C++ gurus (Sutter, etc.).

Personal rule

Because I did not want to deal with cases, and wanted a simple rule, I have designed a personal one that is both simple and correct:


When naming a symbol, you will avoid collision with compiler/OS/standard libraries if you:

  • never start a symbol with an underscore
  • never name a symbol with two consecutive underscores inside.

Of course, putting your code in a unique namespace helps to avoid collisions, too (but won't protect against evil macros).

Some examples

(I use macros because they are the most code-polluting of C/C++ symbols, but it could be anything from a variable name to a class name.)

#define _WRONG
#define __WRONG_AGAIN
#define RIGHT_
#define WRONG__WRONG
#define RIGHT_RIGHT
#define RIGHT_x_RIGHT

Extracts from C++0x draft

From the n3242.pdf file (I expect the final standard text to be similar):

17.6.3.3.2 Global names [global.names]

Certain sets of names and function signatures are always reserved to the implementation:

— Each name that contains a double underscore _ _ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.

— Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

But also:

17.6.3.3.5 User-defined literal suffixes [usrlit.suffix]

Literal suffix identifiers that do not start with an underscore are reserved for future standardization.

This last clause is confusing, unless you consider that a name starting with one underscore and followed by a lowercase letter would be OK if not defined in the global namespace...
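As a rough illustration of the [usrlit.suffix] clause, here is a user-defined literal whose suffix starts with an underscore followed by a lowercase letter, declared inside a namespace. The namespace `units` and the suffix `_km` are made up for this sketch.

```cpp
namespace units {  // hypothetical namespace, for illustration only

// The suffix starts with an underscore, so it does not fall under the
// "reserved for future standardization" clause quoted above.
constexpr long double operator""_km(long double value) {
    return value * 1000.0L;  // kilometres to metres
}

}  // namespace units
```

After `using namespace units;`, an expression such as `1.5_km` calls this operator. A suffix without the leading underscore (for example `km`) would be the kind the standard keeps for itself.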

梦幻的心爱 2024-07-14 13:55:09

From MSDN:

Use of two sequential underscore characters ( __ ) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes. You should avoid using one leading underscore followed by a lowercase letter for names with file scope because of possible conflicts with current or future reserved identifiers.

This means that you can use a single underscore as a member variable prefix, as long as it's followed by a lower-case letter.
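A minimal sketch of what that looks like in practice, assuming a made-up class `Widget`; the commented-out members show the two forms that stay reserved even at class scope:

```cpp
class Widget {  // hypothetical class, for illustration only
public:
    constexpr explicit Widget(int foo) : _foo(foo) {}
    constexpr int foo() const { return _foo; }

private:
    int _foo;      // underscore + lowercase letter at class scope: not reserved
    // int _Foo;   // underscore + uppercase letter: reserved in every scope
    // int __foo;  // double underscore: reserved in every scope
};
```

The same `_foo` declared at global scope would be a reserved name, so the prefix is only safe as long as it stays inside a class, function, or namespace.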

This is apparently taken from section 17.4.3.1.2 of the C++ standard, but I can't find an original source for the full standard online.

See also this question.

内心激荡 2024-07-14 13:55:09

As for the other part of the question, it's common to put the underscore at the end of the variable name to not clash with anything internal.

I do this even inside classes and namespaces because I then only have to remember one rule (compared to "at the end of the name in global scope, and the beginning of the name everywhere else").
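A small sketch of the trailing-underscore convention, with hypothetical names (`Counter`, `count_`, `max_count_`). A trailing underscore is not reserved in any scope, so the same style works for members and for namespace-scope names alike:

```cpp
// Trailing underscore at global scope: not reserved.
constexpr int max_count_ = 10;  // hypothetical constant

class Counter {  // hypothetical class, for illustration only
public:
    constexpr explicit Counter(int start) : count_(start) {}
    constexpr int count() const { return count_; }

private:
    int count_;  // trailing underscore at class scope: not reserved either
};
```

The one thing to watch for is accidentally producing two adjacent underscores, for example when a macro pastes `count_` together with `_suffix`.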

山川志 2024-07-14 13:55:09

Yes, underscores may be used anywhere in an identifier. I believe the rules are: any of a-z, A-Z, or _ for the first character, and those plus 0-9 for the following characters.
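The character rule described here can be sketched as a compile-time check. The function names are hypothetical, and the sketch deliberately covers only the ASCII characters mentioned above; it ignores keywords, universal character names, and the reservation rules discussed in the other answers.

```cpp
#include <string_view>

// Hypothetical helpers, ASCII only.
constexpr bool is_letter_or_underscore(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// True if `s` is spelled like an identifier: a letter or underscore first,
// then any mix of letters, underscores, and digits.
constexpr bool is_basic_identifier(std::string_view s) {
    if (s.empty() || !is_letter_or_underscore(s[0])) {
        return false;
    }
    for (char c : s.substr(1)) {
        if (!is_letter_or_underscore(c) && !is_digit(c)) {
            return false;
        }
    }
    return true;
}
```

Note that a name passing this check is only syntactically valid; `__foo` passes too, even though it is reserved to the implementation.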

Underscore prefixes are common in C code -- a single underscore means "private", and double underscores are usually reserved for use by the compiler.

小嗲 2024-07-14 13:55:09

Firstly, the rules in the current working draft are laid out in [lex.name] p3:

In addition, some identifiers appearing as a token or preprocessing-token are reserved for use by C++ implementations and shall not be used otherwise; no diagnostic is required.

  • Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
  • Each identifier that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

Furthermore, the standard library reserves all names defined in namespace std and some zombie names; see [reserved.names.general].

What about POSIX?

As the accepted answer has pointed out, there may be other parts of the implementation, like the POSIX standard, which limit the identifiers you can use.

Each identifier with file scope described in the header section is reserved for use as an identifier with file scope in the same name space if the header is included.

ANY Header [reserves] Suffix _t

- POSIX 2008 Standard, 2.2.2

In C++, almost all problems associated with POSIX can be avoided through namespaces.
This is also why the C++ standard can add tons of symbols like std::enable_if_t without breaking POSIX compatibility.

Visualization

int x;      // OK
int x_;     // OK
int _x;     // RESERVED
int x__;    // RESERVED (OK in C)
int __x;    // RESERVED
int _X;     // RESERVED
int assert; // RESERVED (macro name)
int x_t;    // RESERVED (only by POSIX)

namespace {
int y;      // OK
int y_;     // OK
int _y;     // OK
int y__;    // RESERVED (OK in C, ignoring namespaces)
int __y;    // RESERVED
int _Y;     // RESERVED
int assert; // RESERVED (macro name)
int y_t;    // OK
}

The above rules for y apply to both named and unnamed namespaces.
Either way, in the following namespace, the rules of the global namespace
no longer apply (see [namespace.unnamed]).

The above rules for y also apply to identifiers in classes, functions, etc.; anything but global scope.

Even though assert isn't used like a function-style macro here, the name is reserved. This is also why proposal P2884 contemplates making it a keyword in C++26, with some success so far.

Recommended Practice

To be safe, always avoid double underscores, and always avoid names with leading underscores.
The latter are okay in some cases, but it's difficult to memorize these rules, and it's better to be safe than sorry.

What about _ in itself?

Some people use _ to indicate that some variable or function parameter isn't used. However, you can avoid this with:

void foo(T _) { /* ... */ }
// replace with:
void foo(T) { /* ... */ }

std::scoped_lock _{mutex};
// replace with:
std::scoped_lock lock{mutex};

You can also cast a parameter p to void, as in (void)p, if the goal is to silence warnings about p being unused and you need C compatibility. See "Why cast unused return values to void?".
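A short sketch of both options for an unused parameter; the function names and the doubling logic are invented for the example:

```cpp
// Hypothetical callback whose second parameter is intentionally ignored.
constexpr int on_event(int code, int context) {
    (void)context;  // marks the parameter as deliberately unused; also valid C
    return code * 2;
}

// C++-only alternative: leave the parameter unnamed, so there is
// nothing to warn about and no need for a placeholder name like _.
constexpr int on_event_unnamed(int code, int /*context*/) {
    return code * 2;
}
```

If C compatibility is not a concern, C++17's `[[maybe_unused]]` attribute on the parameter is another way to express the same intent.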
