Resources for text boundary analysis
I need to do "text boundary analysis" in my project.
I remember there is a resource from Google that might help with this, but I don't quite remember its name or where to download it.
As I recall, the resource is a collection of statistics from Google Search, and it may contain many of the words that people have used as search keywords.
(Actually, I am not sure what the resource contains, because I read about it a long time ago, but I am sure it can be used to find text boundaries.)
Does anyone know about it?
By the way, are there any other resources that might help with text boundary analysis?
(Alex Martelli: I tried ICU and Java, but they can't find words in any of the East Asian languages.)
(dwc: Thanks a lot, this might help.)
Comments (1)
There's good coverage of the general issue in this ICU page and this one for Java, but I believe neither refers to the resource you remember.
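For reference, here is a minimal sketch of what word boundary analysis looks like with the JDK's `java.text.BreakIterator`, which is the Java facility mentioned above. The class name `WordBoundaries` and the helper `words` are made up for illustration, and the letter-or-digit filter is just one assumed way to drop whitespace and punctuation segments. As the question notes, this rule-based approach did not give usable word breaks for East Asian text, so treat it only as a demonstration of the basic API on space-delimited text.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
class WordBoundaries {

    // Walks the word boundaries reported by BreakIterator and keeps only
    // the segments that contain at least one letter or digit, so that
    // whitespace and punctuation segments are skipped.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the word segments found in an English sample sentence.
        System.out.println(words("Text boundary analysis, in short.", Locale.ENGLISH));
    }
}
```

ICU4J offers a class with the same name and a very similar interface (`com.ibm.icu.text.BreakIterator`), so a sketch like this should carry over with little more than an import change, though I have not verified how its results differ for any particular language.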