Adding custom charset detection to ICU
Does anybody know how the ICU Charset Detector's data is built? And is it difficult to add additional languages?
For example, I saw in the bug tracker that a ticket for the detection of Thai has been open since 2007, but there has been no progress to date.
Thanks
Comments (1)
I would ask your question on the ICU mailing list or even file a bug and say you are willing to put in the work/data to do it. I couldn't find the ticket you referred to, but ICU is open source, so if you are willing to contribute time and data, that would make a difference in implementation.