TCL encoding problem on Eggdrop
I have installed Eggdrop on a new Debian server with Tcl 8.5 and the latest version of Eggdrop. Unfortunately, my script has problems handling special characters such as é, J'aime, etc.
An example probably shows it best:
13:41 <@me> test
13:41 <@me> !tr nl This is a test
13:41 < bot> Dit is een test
13:41 <@me> !tr fr I am a stranger
13:41 < bot> Je suis un étranger
13:41 <@me> !tr fr I love you
13:42 < bot> Je t'aime
I have added the line that applies encoding convertto utf-8, and Eggdrop is running in UTF-8 as well. That seemed to make étranger readable in my IRC client, but most other characters (Chinese, Arabic) weren't even close. The Tcl code is as follows:
namespace eval gTranslator {
    bind pub - !tr gTranslator::translate
    proc translate { nick uhost handle chan text } {
        package require http
        package require json
        set lngto [string tolower [lindex [split $text] 0]]
        set text [::http::formatQuery q [join [lrange [split $text] 1 end]]]
        set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=$text"
        set res [::json::json2dict [::http::data [::http::geturl $dturl]]]
        set lng [dict get $res responseData language]
        if { $lng == $lngto } {
            putserv "PRIVMSG $chan :\002Error\002 translating $lng to $lngto."
            return 0
        }
        set trurl "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=$lng%7c$lngto&$text"
        putlog $trurl
        set res [::json::json2dict [::http::data [::http::geturl $trurl]]]
        putlog $res
        #putserv "PRIVMSG $chan :Language detected: $lng"
        set translated [dict get $res responseData translatedText]
        putserv "PRIVMSG $chan :[encoding convertto utf-8 $translated]"
    }
}
Connecting via telnet gave the following additional information:
*** Me joined the party line.
[13:49:34] http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=en%7cfr&q=I%20like%20cookies
[13:49:34] responseData {translatedText {J'aime les cookies}} responseDetails null responseStatus 200
[13:50:11] http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=en%7cfr&q=I%20am%20a%20stranger
[13:50:11] responseData {translatedText {Je suis un étranger}} responseDetails null responseStatus 200
Comments (1)
There are a number of issues going on here. First, Google is delivering strings that have entity encoding applied independently of the JSON encoding, so you'll have to decode that yourself. Second, you've got a memory leak: tokens returned by http::geturl need to be cleaned up manually, which is best addressed by writing a helper procedure that fetches the URL and releases the token. (You already have encoding convertto utf-8 applied to work around Eggdrop's lack of proper understanding of encodings.)
I've checked the results of querying for an Arabic response, and what comes back appears to be correct UTF-8. As such, any problems you're having with it are in your client. (There may be an issue with some Chinese characters because Tcl currently only handles the Basic Multilingual Plane (BMP) of Unicode. This is a known issue.)
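The two fixes described above could be sketched roughly as follows. This is only a minimal illustration, not the code from the original answer: the procedure names fetchUrl and decodeEntities are hypothetical, and the decoder assumes that only numeric entities (for example &#39; for an apostrophe) and a handful of common named entities need handling. It sticks to commands available in Tcl 8.5.

```tcl
package require http

# Hypothetical helper: fetch a URL and return the body, releasing the
# token so the state array created by http::geturl does not leak.
proc fetchUrl {url} {
    set tok [::http::geturl $url]
    set body [::http::data $tok]
    ::http::cleanup $tok
    return $body
}

# Hypothetical helper: decode numeric entities plus a few named ones.
proc decodeEntities {str} {
    # Protect characters that [subst] would otherwise interpret.
    set str [string map {\\ \\\\ \[ \\\[ \] \\\] \$ \\\$} $str]
    # Rewrite decimal and hexadecimal entities as [format %c ...] calls.
    # [scan ... %d] avoids octal interpretation of values such as 039.
    regsub -all {&#(\d+);} $str {[format %c [scan \1 %d]]} str
    regsub -all {&#[xX]([0-9a-fA-F]+);} $str {[format %c 0x\1]} str
    # Run the generated calls and undo the protective backslashes.
    set str [subst -novariables $str]
    # Named entities are mapped in a single pass, so nothing is decoded twice.
    return [string map {&quot; \" &lt; < &gt; > &apos; ' &amp; &} $str]
}
```

In the script from the question, the nested ::http::data [::http::geturl ...] calls would become [fetchUrl $dturl] and [fetchUrl $trurl], and the translated text would go through decodeEntities before the existing encoding convertto utf-8 step.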