Tcl UTF-8 characters not displaying properly in the UI

Posted on 2024-11-25 09:16:18

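Objective: to allow multi-language characters in user IDs in Enovia v6.

I am using UTF-8 encoding in a Tcl script, and it appears to save the multi-language characters in the database (after some conversion). In the UI, however, I see the value literally as it is stored in the database.

When I do the same exercise through Power Web, the saved data is somehow converted back into the proper multi-language characters and displays correctly.

Am I missing something in the Tcl approach?

Here is an example to make the problem clearer.

Original name: Kátai-Pál
Name saved in the database: Kátai-Pál
Name shown in the UI: Kátai-Pál

In Tcl, I use the following syntax to set the encoding: [encoding convertto utf-8 Kátai-Pál]
The user name then becomes: Kátai-Pál
In the UI, I see the name as "Kátai-Pál"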

停顿的约定 2024-12-02 09:16:18

The trick is to think in terms of characters, not bytes. They're different things. Encodings are ways of representing characters as byte sequences (internally, Tcl's really quite complicated, but you shouldn't ever have to care about that if you're not developing Tcl's implementation itself; suffice to say it's Unicode). Thus, when you use:

encoding convertto utf-8 "Kátai-Pál"

You're taking a sequence of characters and asking for the sequence of bytes that encodes those characters in the given encoding (UTF-8). Tcl hands the result back as a string holding one byte per character, so it is binary data, not text to be displayed as-is.
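As a rough illustration of the difference, the sketch below counts characters and bytes for the same name. The name is written with \u escapes so the result does not depend on how the script file itself is encoded; the variable names are only for this example.

```tcl
# "Kátai-Pál" spelled with \u escapes (U+00E1 is á).
set name "K\u00e1tai-P\u00e1l"

# Ask for the UTF-8 bytes that encode those characters.
set bytes [encoding convertto utf-8 $name]

puts [string length $name]    ;# 9  (characters)
puts [string length $bytes]   ;# 11 (bytes: each á takes two, C3 A1)

# Show the bytes in hexadecimal.
binary scan $bytes H* hex
puts $hex                     ;# 4bc3a17461692d50c3a16c
```

If the byte string is shown as though it were text, each á appears as the two characters Ã¡.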

What you need to do is to get the database integration layer to understand what encoding the database is using so it can convert back into characters for you (you can only ever communicate using bytes; everything else is just a simplification). There are two ways that can happen: either the information is correctly shared (via metadata or defined convention), or both sides make assumptions which come unstuck occasionally. It sounds like the latter is what's happening, alas.
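The Enovia database layer is not something this sketch can show, so as an analogy here is the same idea with an ordinary Tcl channel, where the encoding is declared with fconfigure. The temporary file merely stands in for the database and is an assumption made for illustration: when the reader declares the same encoding as the writer, the characters come back intact; when it assumes a different one, they do not.

```tcl
set name "K\u00e1tai-P\u00e1l"

# Writer: declare that characters go out as UTF-8 bytes.
# (The temp file is a hypothetical stand-in for the database.)
set fh [file tempfile path]
fconfigure $fh -encoding utf-8
puts -nonewline $fh $name
close $fh

# Reader that agrees on the encoding gets the characters back.
set fh [open $path r]
fconfigure $fh -encoding utf-8
set good [read $fh]
close $fh

# Reader that assumes a different encoding gets one character per byte.
set fh [open $path r]
fconfigure $fh -encoding iso8859-1
set bad [read $fh]
close $fh
file delete $path

puts [expr {$good eq $name}]          ;# 1
puts [string length $bad]             ;# 11
```

The second reader is the "assumption that comes unstuck" case: nothing fails loudly, the text is just wrong.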

If you can't handle it any other way, you can take the bytes produced out of the database layer and convert into characters:

encoding convertfrom $theEncoding $theBytes

Working out what $theEncoding should be is in general very tricky, but it sounds like it's utf-8 for you. Once you've got characters, Tcl/Tk will be able to display them correctly; it knows how to transfer them correctly into the guts of the platform's GUI. (And in scripts that you actually write, you're best off replacing non-ASCII characters with their \uXXXX escapes, because platforms don't agree on what encoding is right to use for scripts. Alas.)
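A minimal sketch of that last-resort conversion, assuming the bytes really are UTF-8. Here $theBytes is produced with encoding convertto only to stand in for whatever the database layer returns, which is an assumption of the example.

```tcl
set name "K\u00e1tai-P\u00e1l"

# Stand-in for the bytes coming back from the database layer (assumption).
set theBytes [encoding convertto utf-8 $name]

# Viewed as text without decoding, every byte is its own character.
set undecoded "K\u00c3\u00a1tai-P\u00c3\u00a1l"
puts [expr {$theBytes eq $undecoded}]   ;# 1

# Decode the bytes back into characters.
set chars [encoding convertfrom utf-8 $theBytes]
puts [expr {$chars eq $name}]           ;# 1
puts [string length $chars]             ;# 9
```

Applying encoding convertto to text that is about to be handed to a layer which encodes it again would produce this undecoded form, so the conversion belongs at exactly one boundary.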
