Korean, Thai and Indonesian POS tagger
Can someone recommend an open source POS tagger for Korean, Indonesian, Thai and Vietnamese?
I'd like to use it to tag the corpus data that I currently have (something like the stanford-postagger).
If you are a developer and are willing to share your POS tagger and let me test it, that works for me too.
With some modifications to the output, I've POS tagged the Vietnamese data with jvntextpro,
but I'd still like more input on Korean, Indonesian and Thai POS tagging.
Comments (2)
After the ACL wiki page on Korean morphological analyzers and part-of-speech taggers,
I would start by looking at the websites of NLP research departments in Korea, Thailand, and Indonesia. On this page, you will find links to the research departments.
Good luck!
UPDATE: OpenNLP has Thai POS models for the OpenNLP POS tagger. Here are the models: http://opennlp.sourceforge.net/models/thai/
You might want to try RDRPOSTagger: a robust, easy-to-use and language-independent toolkit for POS and morphological tagging.
(Programming languages: Python and Java)
RDRPOSTagger is fast in both the learning and the tagging process. In addition, its accuracy is very competitive with state-of-the-art results. See the experimental results in this paper, which cover both tagging speed and tagging accuracy.
RDRPOSTagger now supports pre-trained POS and morphological tagging models for 13 languages, including Thai and Vietnamese.
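
Whichever tagger you end up with, you will probably need to massage its output before merging it into your corpus, as the question already notes for jvntextpro. Here is a minimal sketch of that step, assuming the tagger writes whitespace-separated `word/TAG` tokens, one sentence per line. That format is an assumption, so check what your tagger actually emits. The names `parse_tagged_line` and `to_columns` and the `UNK` placeholder tag are made up for illustration and are not part of any of the tools mentioned.

```python
from typing import List, Tuple


def parse_tagged_line(line: str, sep: str = "/") -> List[Tuple[str, str]]:
    """Split one line of 'word/TAG word/TAG ...' into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST separator, so a word that itself
        # contains the separator (e.g. "1/2/M") keeps its inner slashes.
        word, found, tag = token.rpartition(sep)
        if not found:
            # No separator in the token: keep the word, mark the tag unknown.
            word, tag = token, "UNK"
        pairs.append((word, tag))
    return pairs


def to_columns(pairs: List[Tuple[str, str]]) -> str:
    """Render pairs as one 'word<TAB>tag' row per token."""
    return "\n".join(f"{word}\t{tag}" for word, tag in pairs)


if __name__ == "__main__":
    print(to_columns(parse_tagged_line("Tôi/P đi/V 1/2/M")))
```

If your tagger uses a different separator (the Stanford tagger, for instance, defaults to an underscore), pass it as `sep`.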