Delphi 2009 Unicode + ANSI problem

Posted on 2024-07-29 18:23:37

I'm porting an ISAPI (PageProducer) application from Delphi 7 to Delphi 2009. The pages are based on HTML files encoded in UTF-8.

Everything works except when OnHTMLTag fires and I replace a transparent tag with a value that contains special characters, such as accented characters (áé...). Those characters show up in the output as a � character.

What's wrong?

Comments (3)

扎心 2024-08-05 18:23:37

As a first debugging step, find out exactly which byte values the browser receives in place of each accented character that it displays as �.
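To know what to look for in those bytes, it can help to print the expected encodings of one test character. The following is a minimal sketch for Delphi 2009; the program name and the helper BytesToHex are made up for this illustration and are not part of the RTL.

```delphi
program HexDumpDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Returns the bytes of Data as space-separated hex pairs, e.g. 'C3 A1'.
// RawByteString accepts any AnsiString flavour without a code page conversion.
function BytesToHex(const Data: RawByteString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(Data) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Ord(Data[I]), 2);
  end;
end;

var
  Accented, Replacement: string; // string = UnicodeString in Delphi 2009
begin
  Accented := Char($00E1);    // U+00E1, a-acute
  Replacement := Char($FFFD); // U+FFFD, the replacement character

  // The explicit casts convert the UTF-16 text to the target encoding.
  Writeln('a-acute as UTF-8: ', BytesToHex(UTF8String(Accented)));    // C3 A1
  Writeln('U+FFFD as UTF-8:  ', BytesToHex(UTF8String(Replacement))); // EF BF BD
  // This line depends on the system ANSI code page; it is E1 on Windows-1252.
  Writeln('a-acute as ANSI:  ', BytesToHex(AnsiString(Accented)));
end.
```

If the browser receives a lone E1 byte on a page it decodes as UTF-8, the text was sent in an ANSI code page. If it receives C3 A1 and still shows garbage, the browser is probably not decoding the page as UTF-8. If it receives EF BF BD or 3F (a literal question mark), the character was already lost on the server.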

As you should know, Delphi 2009's string type is Unicode, whereas in all previous versions it was ANSI. Delphi 7 already had the Utf8String type, but Delphi 2009 made that type special. If you're not using that type for holding strings that are encoded as UTF-8, then you should start doing so. Values held in Utf8String variables will be converted to UnicodeString values automatically when you assign one to the other.

If you're storing your UTF-8-encoded strings in ordinary AnsiString variables, then they will be converted to Unicode using the default system code page if you assign them to a UnicodeString. That's not what you want.
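The difference between the two cases can be seen in a small console program. This is only a sketch for Delphi 2009; the program name and the helper CodeUnitsToHex are invented for the example, and the second output line depends on the ANSI code page of the machine it runs on.

```delphi
program Utf8AssignDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Lists the UTF-16 code units of S as hex, e.g. '00C3 00A1'.
function CodeUnitsToHex(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Ord(S[I]), 4);
  end;
end;

var
  Original, FromUtf8, FromAnsi: string; // string = UnicodeString in Delphi 2009
  Utf8Text: UTF8String;
  RawText: AnsiString;
begin
  Original := Char($00E1);          // U+00E1, a-acute
  Utf8Text := UTF8String(Original); // now holds the two bytes C3 A1

  // Copy the same two bytes into a plain AnsiString without converting them,
  // which mimics UTF-8 data stored in the wrong string type.
  SetString(RawText, PAnsiChar(Utf8Text), Length(Utf8Text));

  FromUtf8 := string(Utf8Text); // decoded as UTF-8
  FromAnsi := string(RawText);  // decoded with the system ANSI code page

  Writeln('From Utf8String: ', CodeUnitsToHex(FromUtf8)); // 00E1
  Writeln('From AnsiString: ', CodeUnitsToHex(FromAnsi)); // 00C3 00A1 on Windows-1252
end.
```

The same bytes give one correct character through Utf8String and, on a Windows-1252 system, two wrong characters through AnsiString.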

If you're assigning UTF-8-encoded literals to variables of type string, stop that. That type expects its values to be encoded as UTF-16, just like WideString always has.

If you are loading your files into a TStrings descendant with LoadFromFile, then you need to start using that method's second parameter, which tells it what encoding to use. UTF-8-encoded files should use TEncoding.UTF8. Without that parameter, the encoding is detected from the file's byte-order mark, and a file with no BOM is read using the system ANSI code page (TEncoding.Default).
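A hedged sketch of loading with an explicit encoding in Delphi 2009 follows. The file name utf8-demo.html is hypothetical, and the program writes the file itself first so that it does not depend on existing data.

```delphi
program LoadUtf8Demo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // Hypothetical file name, used only for this illustration.
  DemoFileName = 'utf8-demo.html';

var
  Lines: TStringList;
  Expected: string;
begin
  Expected := '<p>caf' + Char($00E9) + '</p>'; // U+00E9 is e-acute

  Lines := TStringList.Create;
  try
    // Write a small UTF-8 file to read back.
    Lines.Add(Expected);
    Lines.SaveToFile(DemoFileName, TEncoding.UTF8);

    // Read it back, stating the encoding explicitly.
    Lines.Clear;
    Lines.LoadFromFile(DemoFileName, TEncoding.UTF8);

    if (Lines.Count = 1) and (Lines[0] = Expected) then
      Writeln('Round trip preserved the accented character')
    else
      Writeln('Round trip changed the text');
  finally
    Lines.Free;
  end;
end.
```

In a page-producer application the same LoadFromFile call would apply wherever the HTML template is read into a TStrings object.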

云之铃。 2024-08-05 18:23:37

This is probably a character encoding issue.

The Delphi IDE usually uses Windows-1252 or UTF-16 to encode source code.
HTML often uses UTF-8.

You probably need to convert between those encodings.
For that, you need to find out exactly which encodings are in use (as Rob mentions).

Alternatively, fall back to HTML-escaping the accented characters, for example &aacute; for á (as Ralph mentions).

Can you post a small app that shows the problem? (You can also email me: just about anything that has jeroen in the username and pluimers.com in the domain name will arrive in my mailbox.)

--jeroen

夏尔 2024-08-05 18:23:37

Thank you for your help. After some testing, the problem turned out to be very simple (and a bit silly): the response needed a content type that declares the charset.

Response.ContentType := 'text/html; charset=UTF-8';

There is no need to convert manually between UnicodeString, Utf8String, AnsiString, and WideString. String handling in Delphi 2009 is close to perfect.
