Why is an empty wchar_t literal allowed?

Posted on 2024-11-10 04:49:06

Look at the following code:

#include <cassert>

int main(int argc, char* argv[])
{
    // This works: (Disable Lang Ext = *Yes* (/Za))
    wchar_t wc0 = L'\0';
    wchar_t wc_ = L'';
    assert(wc0 == wc_);

    // This doesn't compile (VC++ 2010):
    char c0 = '\0';
    char c_ = ''; // error C2137: empty character constant
    assert(c0 == c_);
    return 0;
}

Why does the compiler accept an empty character literal for wide characters? An empty literal makes no more sense for wchar_t than it does for char, where the compiler correctly reports an error.

Is this allowed by the Standard?


画▽骨i 2024-11-17 04:49:06

It is not allowed per the ISO standard. This is a bug in Microsoft's product. Even their page describing that particular feature makes no mention of this aberrant (or abhorrent, depending on your viewpoint) behaviour.

The definition for a character literal (as taken from 2.14.3 of C++0x but the relevant bit is unchanged from C++03) contains:

character-literal:
    L' c-char-sequence '
c-char-sequence:
    c-char
    c-char-sequence c-char
c-char:
    any member of the source character set except
      the single-quote ', backslash \, or new-line character
    escape-sequence
    universal-character-name
escape-sequence:
    simple-escape-sequence
    octal-escape-sequence
    hexadecimal-escape-sequence
simple-escape-sequence: one of
    \' \" \? \\ \a \b \f \n \r \t \v
octal-escape-sequence:
    \ octal-digit
    \ octal-digit octal-digit
    \ octal-digit octal-digit octal-digit
hexadecimal-escape-sequence:
    \x hexadecimal-digit
    hexadecimal-escape-sequence hexadecimal-digit

As you can see, the grammar gives no way to end up with nothing between the ' characters in L'x': c-char-sequence has no empty alternative, so it must contain one or more c-char elements. In fact, this is made explicit in the following paragraph (my emphasis):

A character literal is *one or more* characters enclosed in single quotes, as in 'x', optionally preceded by one of the letters u, U, or L, as in u'y', U'z', or L'x', respectively.
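
To make the "one or more" requirement concrete, here is a minimal sketch of a recognizer for the L'...' form of the grammar quoted above. The function name is_wide_char_literal is hypothetical and is not taken from any compiler; the sketch also simplifies escape handling by treating every escape as a backslash followed by one character, without validating octal or hexadecimal digits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of any compiler: returns true when `tok`
// has the shape L' c-char-sequence ' with at least one c-char.
// Simplification: an escape is "backslash + one character"; its digits
// (for octal or hexadecimal escapes) are not validated.
bool is_wide_char_literal(const std::string& tok) {
    // Smallest candidate shape is L'' (three characters).
    if (tok.size() < 3 || tok[0] != 'L' || tok[1] != '\'' || tok.back() != '\'')
        return false;

    std::size_t i = 2;                       // first position after L'
    const std::size_t end = tok.size() - 1;  // position of the closing quote
    std::size_t count = 0;                   // number of c-chars consumed

    while (i < end) {
        const char ch = tok[i];
        if (ch == '\'' || ch == '\n')
            return false;                    // not allowed unescaped
        if (ch == '\\') {
            if (i + 1 >= end)
                return false;                // backslash with nothing to escape
            i += 2;                          // skip the escaped character
        } else {
            i += 1;
        }
        ++count;
    }

    // c-char-sequence has no empty alternative, so zero c-chars is rejected.
    return count >= 1;
}
```

Under these assumptions, the token L'\0' is accepted and the token L'' is rejected, which is the behaviour the grammar calls for.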

Oo萌小芽oO 2024-11-17 04:49:06

I would argue that the first example is not allowed, per 2.13.2/1 of the C++ standard (C++03 numbering):

A character literal is *one or more* characters enclosed in single quotes, as in 'x', optionally preceded by the letter L, as in L'x'.

(Emphasis mine.)
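
For code that has to compile on conforming compilers, the null wide character can be written without relying on the empty literal. The following is a small sketch; the variable names are illustrative only. It also shows the contrast with string literals, whose grammar does allow an empty sequence.

```cpp
// Standard-conforming ways to obtain the null wide character.
const wchar_t from_escape = L'\0';          // octal escape: exactly one c-char
const wchar_t from_value_init = wchar_t();  // value-initialization yields zero

// Unlike a character literal, an empty wide *string* literal is valid:
// L"" is an array of one wchar_t holding only the terminating null.
const wchar_t from_string = L""[0];

static_assert(L'\0' == 0, "the null wide character has the value zero");
```

All three constants compare equal, so replacing L'' with L'\0' keeps the intent of the original code while staying within the grammar.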
