Changing resource encoding from Windows 1252 to UNICODE UTF-8 with signature
I am working on an older project (compiled with UNICODE defined) and came across a problem within the .rc. For example, a static text element that includes “©” is defined in a DIALOGEX resource by:
LTEXT "Copyright ©",IDC_COPYRIGHT_STATIC,7,154,110,8
The resource file, probably created by MSVC application wizard many years ago and migrated forward with each release, now looks like this:
#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_ENU)
LANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_US
#pragma code_page(1252) //present for over 10 years
#endif
For many years the © displayed correctly, but recently it appeared as “Å©” or even “½¿”. This is obviously an encoding issue, but I needed to understand how and why before making changes. After some research, I found that these three properties of the .rc play a part in the bug and the encoding:
- The presence or absence of “#pragma code_page(…)” in the .rc
- The encoding chosen in Save with Encoding… for the .rc file
- Whether Save with Encoding… writes the .rc “with signature” or “without signature” (meaning a BOM?)
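To check these three properties without opening each file in the editor, something like the following Python sketch could help. The function name `inspect_rc_bytes` and the sample bytes are made up for illustration, and it assumes the .rc is a single-byte or UTF-8 file (a UTF-16 .rc would need different handling).

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"
# Matches "#pragma code_page(1252)" and similar, working on raw bytes.
PRAGMA_RE = re.compile(rb"#pragma\s+code_page\s*\(\s*(\d+)\s*\)")


def inspect_rc_bytes(data: bytes) -> dict:
    """Report encoding-related properties of raw .rc contents (hypothetical helper)."""
    has_bom = data.startswith(UTF8_BOM)
    body = data[len(UTF8_BOM):] if has_bom else data
    try:
        body.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "has_bom": has_bom,                                   # "with signature"?
        "code_pages": [int(m.group(1)) for m in PRAGMA_RE.finditer(body)],
        "has_non_ascii": any(b >= 0x80 for b in body),        # any byte that depends on the encoding
        "valid_utf8": valid_utf8,                             # could the bytes be UTF-8 at all?
    }


# Illustrative sample: the copyright sign stored as the single Windows-1252 byte 0xA9.
sample = (
    b"#pragma code_page(1252)\r\n"
    b'LTEXT "Copyright \xa9",IDC_COPYRIGHT_STATIC,7,154,110,8\r\n'
)
print(inspect_rc_bytes(sample))
```

For a real file, the bytes would come from `open(path, "rb").read()`. A file that has non-ASCII bytes, is not valid UTF-8, and declares code page 65001 (or the reverse) is a likely candidate for the kind of mismatch described here.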
As an empirical test, I changed these settings in the .rc and looked at the resulting text in the dialog:
#pragma code_page(…) | Save with Encoding | Signature(BOM) | Text in Dlg |
---|---|---|---|
code_page(1252) | Original file | n/a | Å© |
code_page(1252) | Windows 1252 | n/a | © |
code_page(1252) | UTF-8 65001 | BOM | Å© |
code_page(1252) | UTF-8 65001 | No BOM | Å© |
code_page(65001) | Windows 1252 | n/a | © |
code_page(65001) | UTF-8 65001 | BOM | © |
code_page(65001) | UTF-8 65001 | No BOM | © |
No code_page in .rc | UTF-8 65001 | BOM | © |
No code_page in .rc | UTF-8 65001 | No BOM | Å© |
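The general mechanism behind garbled text like this can be sketched in Python: the same character is stored as different bytes in Windows 1252 and in UTF-8, and reading the bytes with the wrong code page produces extra or replacement characters. This only illustrates byte-level decoding; it does not model what the resource compiler actually does, and the helper name `decode_as` is made up for the example.

```python
COPYRIGHT = "\u00a9"  # the copyright sign

# The same character, stored two different ways.
utf8_bytes = COPYRIGHT.encode("utf-8")     # two bytes: 0xC2 0xA9
cp1252_bytes = COPYRIGHT.encode("cp1252")  # one byte: 0xA9


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes with the given code page, replacing undecodable bytes (hypothetical helper)."""
    return data.decode(encoding, errors="replace")


# UTF-8 bytes read as Windows 1252: two characters, the second of which is the copyright sign.
misread_as_1252 = decode_as(utf8_bytes, "cp1252")

# Windows 1252 byte read as UTF-8: not a valid sequence, so a replacement character appears.
misread_as_utf8 = decode_as(cp1252_bytes, "utf-8")

print(utf8_bytes.hex(), cp1252_bytes.hex())
print(len(misread_as_1252), len(misread_as_utf8))
```

Rows in the table where the declared code page and the saved encoding disagree but the text still displays correctly suggest that the tooling applies rules of its own on top of this, which is one more reason to test with the actual compiler version in use.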
I can explicitly use Save with Encoding… on all .rc files, either encoding them as Windows (1252) or as UNICODE UTF-8 with signature (and deleting the #pragma code_page lines). The specific bug will go away, but is this the best solution?
It seems that switching from Windows 1252 to UNICODE UTF-8 is a step forward and the right way to go long term. Are there any problems with this, or better solutions?
Comments (1)
Raymond Chen explains the encoding issues in the resource compiler in a blog post, stating that when you use UTF-8 as the encoding for .rc files you should not use a BOM/signature and should include a
#pragma code_page(65001)
directive. According to a Developer Community issue, there are problems in resource compiler versions prior to v10.0.19505.1001. The fixed version ships with Windows SDK 10.0.20348.0.
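If you go that route, a one-off conversion could look roughly like this Python sketch. It assumes the existing file really is Windows 1252 (or UTF-8 with a BOM) and that it already contains a `#pragma code_page(...)` line; the function name `convert_rc_to_utf8` is hypothetical, and the sketch does not add a pragma if none is present.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"
PRAGMA_RE = re.compile(r"#pragma\s+code_page\s*\(\s*\d+\s*\)")


def convert_rc_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return .rc contents re-encoded as UTF-8 without a BOM (hypothetical helper).

    Assumption: `data` is either UTF-8 with a BOM or encoded in `source_encoding`.
    Passing BOM-less UTF-8 with the default would double-encode it.
    """
    if data.startswith(UTF8_BOM):
        text = data[len(UTF8_BOM):].decode("utf-8")
    else:
        text = data.decode(source_encoding)
    # Point every existing code_page pragma at UTF-8.
    text = PRAGMA_RE.sub("#pragma code_page(65001)", text)
    # Encoding with "utf-8" (not "utf-8-sig") writes no BOM.
    return text.encode("utf-8")


# Illustrative input: the copyright sign stored as the Windows-1252 byte 0xA9.
original = (
    b"#pragma code_page(1252)\r\n"
    b'LTEXT "Copyright \xa9",IDC_COPYRIGHT_STATIC,7,154,110,8\r\n'
)
converted = convert_rc_to_utf8(original)
print(converted)
```

Work on the raw bytes (`open(path, "rb")` and `"wb"`) so that line endings are left untouched, and keep a backup, since the result is only correct if the assumed source encoding is right.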