C++ project type: Unicode vs. multi-byte; pros and cons
I'm wondering what the Stack Overflow community thinks when it comes to creating a project (thinking primarily C++ here) with a Unicode or a multi-byte character set.
Are there pros to going Unicode straight from the start, implying all your strings will be in wide format? Are there performance issues / larger memory requirements because of a standard use of a larger character?
Is there an advantage to this method? Do some processor architectures handle wide characters better?
Are there any reasons to make your project Unicode if you don't plan on supporting additional languages?
What reasons would one have for creating a project with a multi-byte character set?
How do all of the factors above collide in a high-performance environment (such as a modern video game)?
Comments (6)
Two issues I'd comment on.
First, you don't mention what platform you're targeting. Although recent Windows versions (Win2000, WinXP, Vista and Win7) support both multibyte and Unicode versions of system calls that take strings, the Unicode versions are faster (the multibyte versions are wrappers that convert to Unicode, call the Unicode version, then convert any returned strings back to multibyte). So if you're making a lot of these calls, Unicode will be faster.
Second, even if you're not planning on explicitly supporting additional languages, you should still consider supporting Unicode if your application saves and displays text entered by users. Just because your application is unilingual, it doesn't follow that all its users will be unilingual too. They may be perfectly happy to use your English-language GUI, but might want to enter names, comments or other text in their own language and have it displayed properly.
You are talking about the VC++ Project setting here, right?
The only thing it affects is which version of the Win32 API calls ends up being executed. For instance, a call to MessageBox will end up as a call to MessageBoxA in case of the multi-byte setting, and MessageBoxW in case of the Unicode setting. Of course, that affects the types of the string parameters to that function as well. Internally, MessageBoxA calls MessageBoxW after converting the string parameters from the current system locale to Unicode.
My advice is to use the Unicode setting and pass Unicode strings to Win32 API calls. That does not stop you from using strings in any other encoding internally.
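To make the dispatch concrete, here is a minimal portable sketch of the same pattern. It does not call the real Win32 API: ShowTextA, ShowTextW, ShowText and TXT are hypothetical stand-ins for MessageBoxA, MessageBoxW, MessageBox and the TEXT macro, and the widening step assumes 7-bit ASCII input rather than a real code page conversion.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a "W" function: the real work happens on
// wide strings. Returns the number of wide characters it received.
std::size_t ShowTextW(const std::wstring& text) {
    return text.size();
}

// Hypothetical stand-in for an "A" function: a thin wrapper that widens
// its argument and forwards to the W version.
std::size_t ShowTextA(const std::string& text) {
    // Simplified widening, only correct for 7-bit ASCII input. The real
    // API converts from the current ANSI code page instead.
    std::wstring wide(text.begin(), text.end());
    return ShowTextW(wide);
}

// The project setting boils down to whether UNICODE is defined, which
// selects the function the generic name expands to and the literal type.
#ifdef UNICODE
#define ShowText ShowTextW
#define TXT(s) L##s
#else
#define ShowText ShowTextA
#define TXT(s) s
#endif
```

Code written as ShowText(TXT("hello")) compiles under either setting; only the multi-byte build pays for the extra conversion on every call.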
The short answer (IMO, and I've been proven wrong before) is that it's better to plan for the worst (or the best, depending on your point of view) and go Unicode right now.
Unless your application is very string-intensive, going directly to Unicode will not really matter; in the case of games, it should not be a big factor compared to the rest of the engine.
Max.
Here's a simple consideration: should your program work if it's used by Mr. 菅 直人? His home directory might be hard to represent in ASCII.
A few years and a million lines of code later, you're going to wish you had answered "yes".
I wish Microsoft would quit conflating "Unicode" with UTF-16.
You don't have to store all your strings in wide format. You can use UTF-8 instead, and get a smaller memory footprint (for Latin alphabet languages), and backwards compatibility with 7-bit ASCII.
The one downside to using UTF-8 on Windows is that it's not supported as an ANSI code page, so you have to convert your strings to UTF-16 to make WinAPI calls. How much inconvenience this causes depends on whether you're writing a Windows program or a program that just happens to run on Windows.
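As a rough illustration of that conversion step, here is a portable sketch that re-encodes UTF-8 as UTF-16 by hand. The helper name utf8_to_utf16 is made up for this example, and it assumes well-formed input (malformed sequences are not detected). In a real Windows program you would more likely call MultiByteToWideChar with CP_UTF8 right at the API boundary.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 and re-encode as UTF-16.
// No validation is done; invalid input gives unspecified (but memory-safe) output.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells us how many bytes the sequence has.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The footprint point shows up directly: an ASCII string such as "abc" takes 3 bytes as UTF-8 but 6 bytes after conversion, while a three-byte UTF-8 sequence shrinks to a single 2-byte UTF-16 unit.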
The first answer to that question should... answer everything you need to know.