NLTK is a good fit for natural language processing. I've used it for my data-mining project. You can train your own analyzer, and the learning curve is not steep.
NLTK comes with a large collection of corpora for training your analyzer. You can also supply your own data set, for example a journal that has been part-of-speech tagged.
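To give a rough idea of what training on your own tagged data can look like, here is a minimal sketch. The tiny `TAGGED_TEXT` sample and the `load_tagged_sentences` helper are made up for illustration (they are not part of NLTK), and a unigram tagger with a default-tag backoff is just one simple choice among NLTK's taggers, not necessarily what you'd want for a real project.

```python
import nltk

# Hypothetical hand-tagged text in "word/TAG" form, one sentence per line.
# In practice this would be read from your own tagged journal or file.
TAGGED_TEXT = """
the/DT cat/NN sat/VBD on/IN the/DT mat/NN
a/DT dog/NN barked/VBD
the/DT dog/NN sat/VBD
"""


def load_tagged_sentences(text):
    """Turn "word/TAG" lines into lists of (word, tag) tuples."""
    sentences = []
    for line in text.strip().splitlines():
        # str2tuple splits "cat/NN" into ("cat", "NN").
        sentences.append([nltk.tag.str2tuple(token) for token in line.split()])
    return sentences


train_sents = load_tagged_sentences(TAGGED_TEXT)

# Words the unigram tagger has no entry for fall back to the default tag "NN".
default_tagger = nltk.DefaultTagger("NN")
tagger = nltk.UnigramTagger(train_sents, backoff=default_tagger)

tagged = tagger.tag("the dog sat on a mat".split())
print(tagged)
```

With so little training data the tagger only knows the words it has seen; anything else gets the fallback tag, so a real data set needs to be much larger.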
Because Python is very good for text processing, you may want to give it a try. Plus, it has an online tutorial.
Please don't forget to use a Python 2.x version; try Python 2.6. NLTK may not work well with Python 3.x.
If you already understand the basics of NLP, I think NLTK should be pretty easy to pick up. It's got a bunch of documentation and two books, and I've written a number of articles & tutorials on streamhacker.com. And if there's anything from the Java packages you don't want to lose, you could theoretically combine it with NLTK using Jython (and perhaps execnet).
You also may want to take a look at the Pattern library.