Why is an empty wchar_t literal allowed?

Posted on 2024-11-10 04:49:06

Look at the following code:

#include <cassert>

int main(int argc, char* argv[])
{
    // Accepted by VC++ 2010, even with Disable Language Extensions = Yes (/Za):
    wchar_t wc0 = L'\0';
    wchar_t wc_ = L'';
    assert(wc0 == wc_);

    // Rejected by VC++ 2010:
    char c0 = '\0';
    char c_ = ''; // error C2137: empty character constant
    assert(c0 == c_);
    return 0;
}

Why does the compiler accept an empty character literal for wide characters? It makes no more sense for wchar_t than it does for char, where the compiler correctly reports an error.

Is this allowed by the Standard?

奢华的一滴泪 2024-11-17 04:49:06

This is a bug in VC++.

画▽骨i 2024-11-17 04:49:06

It is not allowed per the ISO standard. This is a bug in Microsoft's product. Even their page describing that particular feature makes no mention of this aberrant (or abhorrent, depending on your viewpoint) behaviour.

The definition for a character literal (as taken from 2.14.3 of C++0x but the relevant bit is unchanged from C++03) contains:

character-literal:
    L' c-char-sequence '
c-char-sequence:
    c-char
    c-char-sequence c-char
c-char:
    any member of the source character set except
      the single-quote ', backslash \, or new-line character
    escape-sequence
    universal-character-name
escape-sequence:
    simple-escape-sequence
    octal-escape-sequence
    hexadecimal-escape-sequence
simple-escape-sequence: one of
    \' \" \? \\ \a \b \f \n \r \t \v
octal-escape-sequence:
    \ octal-digit
    \ octal-digit octal-digit
    \ octal-digit octal-digit octal-digit
hexadecimal-escape-sequence:
    \x hexadecimal-digit
    hexadecimal-escape-sequence hexadecimal-digit

As you can see, there is no way that you can end up with nothing between the ' characters in L'x'. It has to be one or more c-char elements. In fact, this is made explicit in the following paragraph (my emphasis on "one or more"):

A character literal is one or more characters enclosed in single quotes, as in 'x', optionally preceded by one of the letters u, U, or L, as in u'y', U'z', or L'x', respectively.
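If the goal is simply a null wide character, there is no need to rely on the compiler accepting L''. The following is a minimal sketch of two spellings that fit the grammar above; the helper names null_from_escape and null_from_value_init are made up for illustration and do not come from the standard or from VC++.

```cpp
#include <cassert>

// Hypothetical helpers, named for illustration only.

// L'\0' is well-formed: its c-char-sequence holds one c-char,
// the octal-escape-sequence \0.
inline wchar_t null_from_escape() { return L'\0'; }

// Value-initialization of an arithmetic type yields zero,
// so wchar_t() is also the null wide character.
inline wchar_t null_from_value_init() { return wchar_t(); }

// Compile-time checks (static_assert requires C++11 or later).
static_assert(L'\0' == 0, "the null wide character has value zero");
static_assert(wchar_t() == L'\0', "value-initialized wchar_t is the null character");
```

Both forms should behave the same on any conforming compiler, whereas L'' is ill-formed and only compiles where the implementation fails to diagnose it.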