Windows 1250 or UTF-8 in MS Visual Studio?
Here is a simple example program in MS Visual Studio:
#include<string>
#include<iostream>
using namespace std;
int main()
{
cout << static_cast<int>('ą') << endl; // -71
return 0;
}
Why does this cout print -71, as if MS Visual Studio were using Windows-1250, when as far as I know it uses UTF-8?
Comments (1)
Your source file is saved in Windows-1250, not UTF-8, so the byte stored between the two single quotes is 0xB9 (see the Windows-1250 table). 0xB9 taken as a signed 8-bit value is -71.
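The step from 0xB9 to -71 can be checked with a small sketch. The helper name `as_signed_8bit` is made up for illustration. It uses plain arithmetic instead of a cast to `char`, so it does not depend on the compiler's choice of `char` signedness (MSVC treats `char` as signed by default, which is why the question's cast produced a negative number).

```cpp
// Interpret a raw byte as a two's-complement signed 8-bit value using plain
// integer arithmetic, so the result does not depend on whether the compiler
// treats `char` as signed.
constexpr int as_signed_8bit(unsigned char byte) {
    return byte < 0x80 ? static_cast<int>(byte) : static_cast<int>(byte) - 256;
}

// 0xB9 is the Windows-1250 code for 'ą'; 185 - 256 = -71.
static_assert(as_signed_8bit(0xB9) == -71, "0xB9 read as signed 8-bit is -71");
// Bytes below 0x80 (plain ASCII) keep their value.
static_assert(as_signed_8bit(0x41) == 65, "ASCII 'A' stays 65");
```

Any byte at or above 0x80 comes out negative this way, which is typical for accented letters in single-byte code pages such as Windows-1250.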
Save your file in UTF-8 encoding and you'll get a different answer. I get 50309, which is 0xC485. Since UTF-8 is a multibyte encoding, it would be better to use modern C++ to output the bytes of an explicit UTF-8 string, use UTF-8 source encoding, and tell the compiler explicitly that the source encoding is UTF-8:
test.c, saved in UTF-8 encoding and compiled with the /utf-8 switch in MSVS:
Output:
Note that C4 85 are the correct UTF-8 bytes for ą, and E9 A9 AC are correct for Chinese 马 (horse).
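The answer's own program and its output did not survive in this copy, so the following is only a rough sketch of the same idea and not the original code. The helper name `hex_bytes` is hypothetical. It takes a string literal and returns its bytes as hex pairs, and it is templated so that a `u8""` literal is accepted both before C++20 (where it is an array of `char`) and from C++20 on (where it is an array of `char8_t`).

```cpp
#include <cstddef>
#include <string>

// Render the bytes of a string literal as uppercase hex pairs separated by
// spaces, leaving out the terminating null character.
template <typename CharT, std::size_t N>
std::string hex_bytes(const CharT (&literal)[N]) {
    static_assert(sizeof(CharT) == 1, "expects a byte-sized character type");
    const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i + 1 < N; ++i) {  // N counts the terminator too
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char byte = static_cast<unsigned char>(literal[i]);
        if (!out.empty()) out += ' ';
        out += digits[byte >> 4];
        out += digits[byte & 0x0F];
    }
    return out;
}
```

With the source saved as UTF-8 and the compiler told so (the /utf-8 switch on MSVC), `hex_bytes(u8"ą")` should give `C4 85` and `hex_bytes(u8"马")` should give `E9 A9 AC`. Writing the characters as `u8"\u0105"` and `u8"\u9A6C"` gives the same bytes without depending on how the file was saved.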