Writing code with NLTK and Python
I want code for tagging idioms in a given sentence or text using NLTK and Python.
Comments (1)
It depends on what you mean by an "idiom". Joe's suggestion of POS tagging is probably a good start, and might be what you are really after. If so, read "Natural Language Processing with Python" by Bird et al. It is published by O'Reilly but is also available online under a Creative Commons license. It will get you started with POS tagging and also gives a good overview of NLTK's capabilities. For example, could some "Named Entity Recognition" techniques be adapted to do what you want? Or perhaps what you want is simply too difficult. I suspect the latter (as Rafi implied), but you will find that out along the way. If you develop something new in the process, I hope you will contribute it back to the NLTK community.
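If by "idiom" you mean a fixed multi-word expression, one simple starting point is a dictionary lookup. The sketch below is only an illustration under that assumption: it uses NLTK's `MWETokenizer` to merge known idioms into single tokens and then labels them. The idiom list, the `tag_idioms` helper, and the `IDIOM`/`O` tag names are all made up for this example and are not part of NLTK.

```python
from nltk.tokenize import MWETokenizer, wordpunct_tokenize

# Hypothetical, hand-written idiom list (an assumption for this sketch).
# A real system would need a much larger lexicon.
IDIOMS = [
    ("kick", "the", "bucket"),
    ("piece", "of", "cake"),
    ("under", "the", "weather"),
]

# Idioms joined with the separator, as MWETokenizer will emit them.
SEPARATOR = "_"
IDIOM_FORMS = {SEPARATOR.join(idiom) for idiom in IDIOMS}


def tag_idioms(text):
    """Return (token, tag) pairs; tag is 'IDIOM' for a known idiom, else 'O'."""
    # wordpunct_tokenize is regex-based, so no NLTK data download is needed.
    tokens = wordpunct_tokenize(text.lower())
    # Merge each known multi-word expression into a single token.
    merged = MWETokenizer(IDIOMS, separator=SEPARATOR).tokenize(tokens)
    tagged = []
    for token in merged:
        if token in IDIOM_FORMS:
            tagged.append((token.replace(SEPARATOR, " "), "IDIOM"))
        else:
            tagged.append((token, "O"))
    return tagged


if __name__ == "__main__":
    for token, tag in tag_idioms("The exam was a piece of cake."):
        print(token, tag)
```

This only catches exact surface forms: "kicked the bucket" will not match, and it cannot tell a literal use from a figurative one. That is the hard part of the problem, and where POS tagging or chunking might help.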