How to get better results from NLTK pos_tag
I am just learning NLTK with Python. I tried running pos_tag on various sentences, but the results are not accurate. How can I improve them?
broke = NN
flimsy = NN
crap = NN
I am also getting a lot of extra words categorized as NN. How can I filter these out to get better results?
Comments (1)
Give the context in which you obtained these results. As an example, I get different results when I run pos_tag on the whole phrase "They broke climsy crap".
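To see how much context matters, something along these lines may help. It is only a sketch: `compare_tagging` is a hypothetical helper, it splits on whitespace instead of using a real tokenizer, and it assumes the tagger you pass in is a callable that takes a list of tokens and returns (token, tag) pairs, which is how `nltk.pos_tag` behaves.

```python
def compare_tagging(sentence, tagger):
    """Tag every token both on its own and inside the full sentence.

    `tagger` is assumed to be a callable that maps a list of tokens to a
    list of (token, tag) pairs, for example nltk.pos_tag.
    Returns a list of (token, isolated_tag, in_context_tag) rows.
    """
    # Naive whitespace split; a real tokenizer would handle punctuation better.
    tokens = sentence.split()

    # One call over the whole sentence, so the tagger can use neighbouring words.
    in_context = tagger(tokens)

    rows = []
    for token, (_, context_tag) in zip(tokens, in_context):
        # One call per word, so the tagger sees no neighbours at all.
        isolated_tag = tagger([token])[0][1]
        rows.append((token, isolated_tag, context_tag))
    return rows
```

With NLTK and its tagger data installed, `compare_tagging("They broke climsy crap", nltk.pos_tag)` lists each word with both tags, so you can check whether the NN labels only show up when a word is tagged alone.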
Anyway, if you find that a lot of words are wrongly categorized as 'NN', you can apply another technique specifically to the words tagged 'NN'.
For instance, you can take an appropriate tagged corpus and train a trigram tagger on it
(much as the authors do with bigrams at http://nltk.googlecode.com/svn/trunk/doc/book/ch05.html).
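A minimal sketch of that idea, under some assumptions: `build_backoff_tagger` is a hypothetical helper name, the backoff chain (trigram, then bigram, then unigram, then a default tag) follows the pattern the book uses for bigrams, and `toy_sents` is made-up training data that only exists to keep the example self-contained.

```python
import nltk


def build_backoff_tagger(train_sents, default_tag="NN"):
    """Train a trigram tagger that backs off to simpler taggers.

    `train_sents` is a list of sentences, each a list of (word, tag) pairs.
    Words or contexts a tagger has never seen fall through to the next one
    in the chain, ending at a tagger that always answers `default_tag`.
    """
    default = nltk.DefaultTagger(default_tag)
    unigram = nltk.UnigramTagger(train_sents, backoff=default)
    bigram = nltk.BigramTagger(train_sents, backoff=unigram)
    trigram = nltk.TrigramTagger(train_sents, backoff=bigram)
    return trigram


# Hypothetical toy data, far too small for real use.
toy_sents = [
    [("They", "PRP"), ("broke", "VBD"), ("the", "DT"), ("vase", "NN")],
    [("He", "PRP"), ("sold", "VBD"), ("flimsy", "JJ"), ("crap", "NN")],
]

tagger = build_backoff_tagger(toy_sents)
print(tagger.tag(["They", "broke", "flimsy", "crap"]))
```

For real use you would train on a proper tagged corpus instead, for example `nltk.corpus.brown.tagged_sents(categories="news")` once the Brown corpus has been fetched with `nltk.download("brown")`, and hold some sentences back to check whether accuracy actually goes up.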
Let me know if it improves your results.