New Unicode characters in C++0x
I'm building an API that lets me fetch strings in various encodings, including UTF-8, UTF-16, UTF-32 and wchar_t (which may be UTF-32 or UTF-16 depending on the OS).

The new C++ standard introduces the types `char16_t` and `char32_t`, which do not have this sizeof ambiguity and should be used in the future, so I would like to support them as well. The question is: would they interfere with the normal `uint16_t`, `uint32_t` and `wchar_t` types, preventing overloading because they may refer to the same type?

```cpp
#include <cstdint>
#include <string>

class some_class {
public:
    void set(std::string);  // utf8 string
    void set(std::wstring); // wchar string, utf16 or utf32 according
                            // to sizeof(wchar_t)
    void set(std::basic_string<uint16_t>); // wchar independent utf16 string
    void set(std::basic_string<uint32_t>); // wchar independent utf32 string

#ifdef HAVE_NEW_UNICODE_CHARRECTERS
    void set(std::basic_string<char16_t>); // new standard utf16 string
    void set(std::basic_string<char32_t>); // new standard utf32 string
#endif
};
```
So I can just write:

```cpp
foo.set(U"Some utf32 String");
foo.set(u"Some utf16 string");
```
What are the typedefs for `std::basic_string<char16_t>` and `std::basic_string<char32_t>`, analogous to today's `typedef basic_string<wchar_t> wstring`? I can't find any reference.
Edit: according to the headers of gcc-4.4, which introduced these new types:

```cpp
typedef basic_string<char16_t> u16string;
typedef basic_string<char32_t> u32string;
```

I just want to make sure that this is an actual standard requirement and not a gcc-ism.
Comments (1)
1) `char16_t` and `char32_t` will be distinct new types, so overloading on them will be possible. See ISO/IEC JTC1 SC22 WG21 N2018; a further explanation appears in the devx.com article "Prepare Yourself for the Unicode Revolution".
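To make point 1 concrete, here is a minimal sketch, assuming a C++11 (or later) compiler on a platform that provides `uint16_t` and `uint32_t`. The function name `kind` and the helper `check_overloads` are made up for illustration and do not come from the question or the standard. The `static_assert`s check at compile time that the new character types are not aliases of the fixed-width integer types or of `wchar_t`, and the overload set only compiles because all six parameter types differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// char16_t and char32_t are built-in types of their own in C++11,
// not typedefs for the integer types with the same size.
static_assert(!std::is_same<char16_t, std::uint16_t>::value,
              "char16_t must be distinct from uint16_t");
static_assert(!std::is_same<char32_t, std::uint32_t>::value,
              "char32_t must be distinct from uint32_t");
static_assert(!std::is_same<char16_t, wchar_t>::value,
              "char16_t must be distinct from wchar_t");
static_assert(!std::is_same<char32_t, wchar_t>::value,
              "char32_t must be distinct from wchar_t");

// Hypothetical overload set: one overload per pointee type.
// If any two of these types were the same, this would be a redefinition error.
inline std::string kind(const char*)          { return "char"; }
inline std::string kind(const wchar_t*)       { return "wchar_t"; }
inline std::string kind(const char16_t*)      { return "char16_t"; }
inline std::string kind(const char32_t*)      { return "char32_t"; }
inline std::string kind(const std::uint16_t*) { return "uint16_t"; }
inline std::string kind(const std::uint32_t*) { return "uint32_t"; }

// Small self-check (hypothetical helper): each literal prefix
// selects the overload for its own character type.
inline void check_overloads() {
    assert(kind("narrow") == "char");
    assert(kind(L"wide")  == "wchar_t");
    assert(kind(u"utf16") == "char16_t");
    assert(kind(U"utf32") == "char32_t");
}
```

Calling `check_overloads()` from `main` exercises the four string-literal prefixes. The sketch overloads on pointers rather than on `std::basic_string<uint16_t>` because the standard library only guarantees `char_traits` specializations for the character types, so instantiating `basic_string` with an integer type is not portable.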
2) `u16string` and `u32string` are indeed part of C++0x and not just GCC-isms: they are mentioned in various standard draft papers and will be provided by the C++0x `<string>` header, as the same article also notes.
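Point 2 can be checked in the same way. The sketch below assumes a C++11 (or later) standard library. The class is a simplified stand-in for the `some_class` from the question, and its `last()` accessor is a hypothetical addition used only to observe which overload ran.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// <string> provides these typedefs in C++11; the assertions fail to compile
// if the library defines them differently.
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value,
              "u16string should be basic_string<char16_t>");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value,
              "u32string should be basic_string<char32_t>");

// Simplified stand-in for the class in the question (illustration only).
class some_class {
public:
    void set(const std::string&)    { last_ = "utf8"; }
    void set(const std::wstring&)   { last_ = "wchar"; }
    void set(const std::u16string&) { last_ = "utf16"; }
    void set(const std::u32string&) { last_ = "utf32"; }

    // Hypothetical accessor: reports which overload was called last.
    const std::string& last() const { return last_; }

private:
    std::string last_;
};

// Small self-check (hypothetical helper) using the calls from the question.
inline void check_set_overloads() {
    some_class foo;
    foo.set(U"Some utf32 String");
    assert(foo.last() == "utf32");
    foo.set(u"Some utf16 string");
    assert(foo.last() == "utf16");
}
```

Each literal converts to exactly one of the string types, because only `std::u32string` can be constructed from a `const char32_t*` and only `std::u16string` from a `const char16_t*`, so the calls written in the question are unambiguous.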