How to handle Unicode characters in C++

Posted on 2025-01-04 13:47:14

We have a commenting system built into our engine that allows programmers to put comments for various exposed variables/objects which are then used by the GUI front-end for tool-tips and help.

Recently, certain tool-tips started crashing, and after much wasted time I tracked it down to the character ’, which, unless I am mistaken, is a Unicode character and not available in ASCII.

Taking this answer into consideration, I assumed wstring would fix the problem. Before making changes in the bigger project, I created a test project to see if wstring would solve the issue. Although the project doesn't crash, the behavior is not as expected for wstring.

#include <iostream>
#include <string>

using namespace std;

int main()
{
    string someString = "successive attack that DOESN’T result";
    wstring someWString = L"successive attack that DOESN’T result";

    cout << someString << endl;
    wcout << someWString << endl;

    return 0;
}

//Console Output//
successive attack that DOESNÆT result 
successive attack that DOESNPress any key to continue . . .

I read this article quite some time ago and thought I understood the problems associated with character sets, but that is obviously not the case. I would appreciate a solution to this problem as well as a good explanation of what is happening and how to avoid problems similar to this in the future.

Comments (2)

在风中等你 2025-01-11 13:47:14

Since you are using Visual Studio, I assume you are on Windows. The Windows console does not display Unicode by default; it uses the OEM character set. You can convert between wide strings and the OEM character set using CharToOemW/OemToCharW. Obviously, the OEM character set cannot represent every Unicode character.
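One plausible reading of the DOESNÆT output in the question, under two assumptions that the thread does not state: the source file was saved as Windows-1252, and the console was using OEM code page 437. In that case the apostrophe ’ is stored as the single byte 0x92, and the console shows that same byte as Æ. The two decode_* helpers below are hypothetical and cover only this one byte; they are a sketch, not a full code page table.

```cpp
// Hypothetical helper: decode one byte as Windows-1252.
// Only the byte relevant to the question is filled in; ASCII maps to itself.
char32_t decode_cp1252(unsigned char b) {
    if (b < 0x80) return b;
    if (b == 0x92) return 0x2019;  // RIGHT SINGLE QUOTATION MARK (’)
    return 0xFFFD;                 // not covered by this sketch
}

// Hypothetical helper: decode one byte as OEM code page 437.
char32_t decode_cp437(unsigned char b) {
    if (b < 0x80) return b;
    if (b == 0x92) return 0x00C6;  // LATIN CAPITAL LETTER AE (Æ)
    return 0xFFFD;                 // not covered by this sketch
}
```

The same byte yields two different characters depending on which table is used, which is why the narrow string prints with the wrong glyph while the ASCII text around it is unaffected.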

Windows uses UTF-16 for its system API. If your tooltips use the Windows API, wstring is probably what you want. However, you can use UTF-8 internally instead and convert it to UTF-16 before calling the Windows API. This conversion can be performed using MultiByteToWideChar/WideCharToMultiByte.
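To make the UTF-8 to UTF-16 step concrete, here is a minimal portable sketch of what such a conversion does. It is a hand-written stand-in, not the Windows API: on Windows you would normally call MultiByteToWideChar with CP_UTF8 instead. The function name utf8_to_utf16 is hypothetical, and the decoder is deliberately simple: it replaces malformed input with U+FFFD and does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): decode UTF-8 bytes into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte

        // Collect the continuation bytes, each of the form 10xxxxxx.
        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(0xFFFD);  // emit a replacement and skip one byte
            ++i;
            continue;
        }
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For the string from the question, the three UTF-8 bytes E2 80 99 collapse into the single UTF-16 code unit 0x2019. Keeping UTF-8 in the engine and converting only at the API boundary keeps the conversion in one place.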

葮薆情 2025-01-11 13:47:14

Since you are dealing with Unicode characters, it would be appropriate to set Character Set to Use Unicode Character Set in the project's properties.

Another possible problem could be the encoding of the source files. The best practice when working with Unicode characters is to have your source files encoded in UTF-8, especially files where you define string literals like this one. Note that UTF-8 without a BOM can be troublesome, because Visual Studio needs the BOM to interpret the file's contents properly. Convert your files (I use Notepad++ for this) so that they are encoded in UTF-8 with a BOM.
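A complementary option, sketched here as a suggestion rather than something from the thread, is to keep the literal itself pure ASCII by writing the apostrophe as an escape. The result then no longer depends on how the editor saved the file. The constant names kWideText and kUtf8Text are hypothetical.

```cpp
#include <string>

// Wide literal: the apostrophe is written as the universal character name
// \u2019, so the source bytes stay ASCII whatever encoding the file is saved in.
const std::wstring kWideText = L"successive attack that DOESN\u2019T result";

// Narrow literal holding UTF-8: the three bytes of U+2019 are spelled out
// explicitly as hex escapes (E2 80 99).
const std::string kUtf8Text = "successive attack that DOESN\xE2\x80\x99T result";
```

The wide string holds one element for the apostrophe, while the UTF-8 string holds three bytes for it, so the two differ in length by two.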
