Converting Traditional Chinese to Simplified Chinese with the Tcl encoding command
I support a website written in Tcl which displays data in Traditional Chinese (big5). We then have a Java servlet, using the translation code from mandarintools.com, to translate a page request into Simplified Chinese. The conversion as specified to the translation code is from UTF-8 to UTF-8S; Java is apparently correctly translating the data to UTF-8 as it comes in.
The Java translation code works but is slow, and since the website is written in Tcl, someone on another list suggested I try doing the conversion in Tcl instead. Unfortunately, Tcl doesn't support UTF-8S, and I have been unable to figure out which encoding to use in its place. I've tried gb2312, gb2312-raw, gb1988, euc-cn... all of them result in gibberish. My assumption is that Tcl is also translating the data to UTF-8 as it comes in, though I have tried converting from big5 first and it doesn't help.
My test code looks like this:
set page_body [ns_httpget http://www.mysite.com]
set translated_page_body [encoding convertto gb2312 $page_body]
ns_write $translated_page_body
I have also tried
set page_body [ns_httpget http://www.mysite.com]
set translated_page_body [encoding convertto gb2312 [encoding convertfrom big5 $page_body]]
ns_write $translated_page_body
But it didn't change anything.
Does anyone out there have enough experience with this to help me figure it out?
Comments (2)
FYI, for completeness' sake: Tcl experts have told me that you can't do the conversion this way; it has to be done via character replacement.
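For anyone landing here later, a minimal sketch of what "character replacement" could look like in plain Tcl. The reasoning, as I understand it, is that `encoding convertfrom`/`convertto` only change how a character is stored as bytes; they never turn a Traditional character into its Simplified counterpart, so that step has to be an explicit mapping. Everything named here is an assumption for illustration: the `trad2simp` table holds only four pairs (a real one would need thousands of entries from a proper conversion table), `to_simplified` is a hypothetical helper, and `big5_bytes` stands in for the raw page body. It also assumes the page body really arrives as undecoded big5 bytes; if the server has already decoded it, the `convertfrom` step should be skipped.

```tcl
# Hypothetical, deliberately tiny Traditional -> Simplified table.
# Flat list of pairs in the form [string map] expects: from to from to ...
set trad2simp [list \
    \u5b78 \u5b66 \
    \u7fd2 \u4e60 \
    \u570b \u56fd \
    \u8a9e \u8bed]

# Hypothetical helper: replace every mapped character, leave the rest alone.
proc to_simplified {text map} {
    return [string map $map $text]
}

# Stand-in for the raw big5 bytes of a fetched page (assumption: not yet decoded).
set big5_bytes [encoding convertto big5 "\u5b78\u7fd2\u570b\u8a9e"]

# 1. Decode the bytes into Tcl's internal Unicode string.
set text [encoding convertfrom big5 $big5_bytes]

# 2. Swap Traditional characters for Simplified ones.
set simplified [to_simplified $text $trad2simp]

# 3. Encode for output; gb2312 is used here, utf-8 would work the same way.
set gb_bytes [encoding convertto gb2312 $simplified]
```

In the test code from the question, step 1 would take the result of `ns_httpget` and step 3 would feed `ns_write`, with the output charset of the response set to match whichever encoding is chosen.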
By any chance, are you grabbing your data from Oracle?
If so, see if you can use the CONVERT function to convert from "utf8" to "al32utf8", which is the true UTF-8 standard and which Tcl should work with seamlessly.
If not, well, I guess I'll wait for your comment(s).