What encoding does System.Windows.Forms.RichTextBox use for Unicode characters?
I've got a WinForms RichTextBox in my application. When I enter the Chinese text "蜜蜜蜜蜜", the control uses the following RTF:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fmodern\fprq6\fcharset134 SimSun;}{\f1\fnil\fcharset0 Microsoft Sans Serif;}}
\viewkind4\uc1\pard\f0\fs17\'c3\'db\'c3\'db\'c3\'db\'c3\'db\f1\par
}
The test string is the same character four times. Its Unicode value is 34588 (0x871C). So how is it that the character is being stored as "\'c3\'db" in the RTF? What kind of encoding is that?
Comments (1)
RTF is old, older than Job, and predates Unicode. I think it is using code page 936, a double-byte character set for Simplified Chinese. Your snippet shows it using c3db for the character, which matches the glyph shown in this table.
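One way to check the code page 936 guess is to treat each \'hh escape as a raw byte and decode the byte pairs with that code page. The sketch below uses Python rather than .NET purely for illustration. The helper name rtf_hex_escapes_to_text is hypothetical, and it only handles a bare run of \'hh escapes, not full RTF. The \fcharset134 entry in the font table is the GB2312 charset, which is consistent with this reading.

```python
import re


def rtf_hex_escapes_to_text(rtf_fragment: str, codepage: str = "cp936") -> str:
    """Decode a run of RTF \\'hh escapes as bytes in the given code page.

    Hypothetical helper for illustration; it ignores every other RTF
    control word and assumes the escapes form complete characters.
    """
    hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", rtf_fragment)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    return raw.decode(codepage)


# The escaped run from the question's RTF.
fragment = r"\'c3\'db\'c3\'db\'c3\'db\'c3\'db"
decoded = rtf_hex_escapes_to_text(fragment)

# The decoded text is the original character repeated four times.
print(decoded == "蜜" * 4)          # True

# Going the other way: U+871C encodes to the two bytes C3 DB in cp936.
print(hex(ord("蜜")))               # 0x871c
print("蜜".encode("cp936").hex())   # c3db
```

So the two bytes c3 and db are not a transformation of 0x871C itself. They are the character's position in the double-byte code page, which is why the Unicode value does not appear anywhere in the RTF.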