Windows 1250 or UTF-8 in MS Visual Studio?

Posted on 2025-02-02 18:50:45

Here is a simple example program in MS Visual Studio:

#include<string>
#include<iostream>

using namespace std;

int main()
{
   cout << static_cast<int>('ą') << endl; // -71
   return 0;
}

The question is: why does this cout print -71, as if MS Visual Studio were using Windows-1250, when as far as I know it uses UTF-8?

北方的韩爷 2025-02-09 18:50:45

Your source file is saved in Windows-1250, not UTF-8, so the byte stored between the two single quotes is 0xB9 (see the Windows-1250 table). 0xB9 interpreted as a signed 8-bit value is -71.
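The arithmetic behind that -71 can be checked without depending on how any file is saved. The following is only a minimal sketch: `as_signed_byte` is a hypothetical helper, not part of the original program, and it assumes the usual two's-complement reading of a byte, which matches MSVC's default of treating plain `char` as signed.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original post): reinterpret one raw byte
// as a signed 8-bit value, the way a signed plain char would hold it.
constexpr int as_signed_byte(std::uint8_t byte)
{
    // Values 0x80..0xFF map to -128..-1 in two's complement.
    return byte < 0x80 ? byte : static_cast<int>(byte) - 256;
}

// 0xB9 is the Windows-1250 byte for a-ogonek: 0xB9 = 185, and 185 - 256 = -71.
static_assert(as_signed_byte(0xB9) == -71, "0xB9 read as a signed byte");
// Bytes below 0x80 (plain ASCII) keep their value.
static_assert(as_signed_byte(0x41) == 65, "'A' is unchanged");
```

Because the byte value is written as a hex constant, the check compiles the same way whatever encoding the source file uses.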

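Save your file in UTF-8 encoding and you'll get a different answer. I get 50309, which is 0xC485: the two UTF-8 bytes C4 85 of ą packed into one int. Since UTF-8 is a multibyte encoding, it would be better to use modern C++ to output the bytes of an explicit UTF-8 string, use UTF-8 source encoding, and tell the compiler explicitly that the source encoding is UTF-8: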

test.c - saved in UTF-8 encoding and compiled with the /utf-8 switch in MSVS:

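#include <string>
#include <iostream>
#include <cstdint>

using namespace std;

int main()
{
    // Before C++20, u8"..." is a const char array holding UTF-8 bytes.
    // (In C++20 it is const char8_t[], which cannot initialize std::string.)
    string s {u8"ą马"};
    for(auto c : s)
        // Cast through uint8_t first so bytes >= 0x80 print as c4, not ffffffc4.
        cout << hex << static_cast<int>(static_cast<uint8_t>(c)) << endl;
    return 0;
}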

Output:

c4
85
e9
a9
ac

Note that C4 85 are the correct UTF-8 bytes for ą, and E9 A9 AC are the correct bytes for the Chinese character 马 (horse).
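To see why those byte sequences are the right ones, they can be decoded back to Unicode code points by hand. The sketch below is illustrative only: `decode_utf8_2` and `decode_utf8_3` are hypothetical helpers that assume well-formed input and do no validation, so they are not a general UTF-8 decoder.

```cpp
#include <cstdint>

// Hypothetical helper: decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx)
// into its code point. Assumes well-formed input; no validation.
constexpr std::uint32_t decode_utf8_2(std::uint8_t lead, std::uint8_t cont)
{
    return (static_cast<std::uint32_t>(lead & 0x1F) << 6) | (cont & 0x3F);
}

// Hypothetical helper: decode a three-byte sequence
// (1110xxxx 10xxxxxx 10xxxxxx). Assumes well-formed input; no validation.
constexpr std::uint32_t decode_utf8_3(std::uint8_t b0, std::uint8_t b1,
                                      std::uint8_t b2)
{
    return (static_cast<std::uint32_t>(b0 & 0x0F) << 12)
         | (static_cast<std::uint32_t>(b1 & 0x3F) << 6)
         | (b2 & 0x3F);
}

// C4 85 decodes to U+0105 (Latin small letter a with ogonek).
static_assert(decode_utf8_2(0xC4, 0x85) == 0x0105, "C4 85 is U+0105");
// E9 A9 AC decodes to U+9A6C (the CJK character in the test string).
static_assert(decode_utf8_3(0xE9, 0xA9, 0xAC) == 0x9A6C, "E9 A9 AC is U+9A6C");
// The same two bytes packed into one int give the 50309 reported above.
static_assert(((0xC4 << 8) | 0x85) == 50309, "0xC485 is 50309");
```

The last assertion ties the decoded bytes back to the 50309 printed for the single-quoted literal once the file is saved as UTF-8.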
