Changing resource encoding from Windows 1252 to UNICODE UTF-8 with signature
I am working on an older project (compiled with UNICODE defined) and came across a problem in the .rc file. For example, a DIALOGEX resource defines a static text element that includes “©”:
LTEXT "Copyright ©",IDC_COPYRIGHT_STATIC,7,154,110,8
The resource file, probably created by the MSVC application wizard many years ago and migrated forward with each release, now looks like this:
#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_ENU)
LANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_US
#pragma code_page(1252) //present for over 10 years
#endif
For many years the © displayed correctly, but recently it appeared as “Å©” or even “½¿”. This is obviously an encoding issue, but I needed to understand how and why it happens before making changes. After some research, I found that three properties of the .rc play a part in the bug:
- The presence or absence of “#pragma code_page(…)” in the .rc
- The encoding chosen in Save with Encoding… for the .rc file
- Whether Save with Encoding… writes the .rc “with signature” or “without signature” (meaning BOM?)
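To check these three properties on an existing file without opening it in the IDE, a small script can look at the raw bytes. The following is a minimal sketch, assuming the .rc is stored in an 8-bit encoding (not UTF-16); the names `inspect_rc_bytes`, `UTF8_BOM` and `PRAGMA_RE` are made up for this illustration.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"  # the UTF-8 "signature" bytes
# Matches "#pragma code_page(<digits>)" in the raw bytes (hypothetical helper).
PRAGMA_RE = re.compile(rb"#pragma\s+code_page\(\s*(\d+)\s*\)")


def inspect_rc_bytes(data: bytes) -> dict:
    """Report BOM presence, declared code pages, and UTF-8 validity."""
    has_bom = data.startswith(UTF8_BOM)
    code_pages = [int(m.decode("ascii")) for m in PRAGMA_RE.findall(data)]
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "has_bom": has_bom,          # saved "with signature"?
        "code_pages": code_pages,    # every #pragma code_page(...) found
        "valid_utf8": valid_utf8,    # False suggests a legacy code page
        "has_non_ascii": any(b >= 0x80 for b in data),
    }


if __name__ == "__main__":
    sample = b'#pragma code_page(1252)\r\nLTEXT "Copyright \xa9",IDC_COPYRIGHT_STATIC,7,154,110,8\r\n'
    print(inspect_rc_bytes(sample))
```

A file that declares `code_page(1252)` but reports `valid_utf8` as true together with `has_non_ascii` as true is a likely candidate for the mismatch described here, although the check is only a heuristic.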
As an empirical test, I changed these settings in the .rc and looked at the resulting text in the dialog:
#pragma code_page(…) | Save with Encoding | Signature(BOM) | Text in Dlg |
---|---|---|---|
code_page(1252) | Original file | n/a | Å© |
code_page(1252) | Windows 1252 | n/a | © |
code_page(1252) | UTF-8 65001 | BOM | Å© |
code_page(1252) | UTF-8 65001 | No BOM | Å© |
code_page(65001) | Windows 1252 | n/a | © |
code_page(65001) | UTF-8 65001 | BOM | © |
code_page(65001) | UTF-8 65001 | No BOM | © |
No code_page in .rc | UTF-8 65001 | BOM | © |
No code_page in .rc | UTF-8 65001 | No BOM | Å© |
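The rows where a UTF-8 file is combined with code_page(1252) fit the usual byte-level picture: © is one byte in Windows 1252 but two bytes in UTF-8, so reading UTF-8 bytes as Windows 1252 yields two characters, the second of which is ©. The sketch below only illustrates that general mechanism in Python; it makes no claim about what the resource compiler does internally, and `misread_as_cp1252` is a hypothetical helper name.

```python
COPYRIGHT = "\u00a9"  # the © character

# The same character has different byte representations per encoding.
cp1252_bytes = COPYRIGHT.encode("cp1252")  # b'\xa9'      (one byte)
utf8_bytes = COPYRIGHT.encode("utf-8")     # b'\xc2\xa9'  (two bytes)


def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows 1252."""
    return text.encode("utf-8").decode("cp1252")


if __name__ == "__main__":
    garbled = misread_as_cp1252(COPYRIGHT)
    print(cp1252_bytes, utf8_bytes)
    print(len(garbled), garbled.endswith(COPYRIGHT))
```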
I can explicitly use Save with Encoding… on all .rc files and save them either as Windows 1252 or as Unicode UTF-8 with signature (deleting the #pragma code_page lines in the latter case). Either way the specific bug goes away, but is this the best solution?
Switching from Windows 1252 to Unicode UTF-8 seems like a step forward and the right way to go long term. Is there any problem with this, or is there a better solution?
Comments (1)
Raymond Chen explains the encoding issues in the resource compiler in a blog post, stating that when you use UTF-8 as the encoding for .rc files you should not use a BOM/signature and should include the following directive:
#pragma code_page(65001)
According to a Developer Community issue, there are problems in the resource compiler prior to version 10.0.19505.1001. The fixed version ships with Windows SDK 10.0.20348.0.
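If you follow that recommendation, the conversion can be scripted rather than done file by file in the editor. The following is a rough sketch, assuming the existing file really is Windows 1252 and already contains a `#pragma code_page(…)` line; `convert_rc_text` and `convert_rc_file` are hypothetical names, and the script does not add a pragma where none exists.

```python
import re
from pathlib import Path

# Captures the text around the code page number so only the number changes.
PRAGMA_RE = re.compile(r"(#pragma\s+code_page\(\s*)\d+(\s*\))")


def convert_rc_text(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode .rc bytes as UTF-8 without BOM and point the pragma at 65001."""
    text = data.decode(source_encoding)
    text = PRAGMA_RE.sub(r"\g<1>65001\g<2>", text)
    # Python's "utf-8" codec does not emit a BOM ("utf-8-sig" would).
    return text.encode("utf-8")


def convert_rc_file(path: str, source_encoding: str = "cp1252") -> None:
    """Convert one file in place; working on bytes keeps CRLF line endings."""
    p = Path(path)
    p.write_bytes(convert_rc_text(p.read_bytes(), source_encoding))


if __name__ == "__main__":
    original = '#pragma code_page(1252)\r\nLTEXT "Copyright \u00a9",IDC_COPYRIGHT_STATIC,7,154,110,8\r\n'
    print(convert_rc_text(original.encode("cp1252")))
```

Keep a backup or rely on version control before running something like this across a project, and rebuild with a resource compiler at or above the fixed version mentioned above to confirm that the dialog text is correct.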