Natural language parsing tools: what is out there and what is not?

Posted on 2024-08-17 23:45:11

Comments (4)

难理解 2024-08-24 23:45:11

I suggest you take a look at the following:

  1. the usual NLP libraries such as OpenNLP, LingPipe, NLTK, GATE, and UIMA. All of these provide parsers and word stemmers (i.e. they don't give you back the root of a word, but its stem). Some also provide lemmatizers.
  2. websites which collect NLP tools. These are but a few of them: the wiki of the Association for Computational Linguistics, Language Technology World, and the website of the computational linguistics department at Heidelberg University.

I'm not aware of a tool which returns the root of a word, but, as I said, there are stemmers and lemmatizers. For lemmatization, try TreeTagger or Morpha. "Morphophonemic analysis" is not a specific enough term to find you what you want.
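To make the stem/lemma distinction concrete, here is a minimal toy sketch in Ruby. It is not how any of the libraries above work internally: the suffix list, the lookup table, and the helper names `toy_stem` and `toy_lemma` are all made up for illustration. The point is only that a stemmer chops suffixes by rule and may return a non-word, while a lemmatizer maps a form to its dictionary entry.

```ruby
# Toy illustration only: real stemmers (e.g. Porter) use far richer rule sets.
SUFFIXES = %w[ing ed es s].freeze

# Hypothetical helper: strip the first matching suffix,
# keeping at least three characters of the word.
def toy_stem(word)
  w = word.downcase
  SUFFIXES.each do |suffix|
    if w.end_with?(suffix) && w.length - suffix.length >= 3
      return w[0, w.length - suffix.length]
    end
  end
  w
end

# Hypothetical lookup table standing in for a lemmatizer's dictionary.
LEMMAS = {
  "studies" => "study",
  "parsing" => "parse",
  "mice"    => "mouse"
}.freeze

# Hypothetical helper: return the dictionary form if known, else the word itself.
def toy_lemma(word)
  w = word.downcase
  LEMMAS.fetch(w, w)
end

%w[studies parsing mice].each do |word|
  puts "#{word}: stem=#{toy_stem(word)} lemma=#{toy_lemma(word)}"
end
```

Here "studies" stems to "studi" (not a word) but lemmatizes to "study", and the irregular "mice" is untouched by the suffix rules yet lemmatizes to "mouse". Neither result is the root of the word in the morphological sense.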

Once you know more specifically what you need, you could search the archives of the Corpora List or post a question there.

情深如许 2024-08-24 23:45:11

NLTK is an interesting toolkit for building NLP-based applications. It can be used for practical applications which require, for example, POS tagging, or which implement simple classifiers or entity extractors.

I'm unsure what a "language understander" application would encompass, but it sounds like something which may be beyond what can [easily] be built on NLTK.
Reading the question in full, including its reference to morphophonics, seems to confirm that NLTK would probably not serve the OP's purpose very well; to my knowledge NLTK doesn't offer modules that deal with text at this level. You may want to check this for yourself, however, as NLTK is a broad and active project and may have seen recent additions in this area.

飘然心甜 2024-08-24 23:45:11

I want to chime in with a link to the MontyLingua Python package, which can be found at the URL below. I think it uses a different parser than NLTK does.

http://www.fslog.com/2008/09/20/montylingua3-gpled-fork-of-montylingua/
You can google a comparison with NLTK.

这样的小城市 2024-08-24 23:45:11

Maluuba has just released an API to their Natural Language Processor. It's available at http://developer.maluuba.com.

There are three libraries written for it by Maluuba:

Python Library: https://github.com/maluuba/napi-python

Ruby Library: https://github.com/maluuba/napi-ruby

Java Library: https://github.com/maluuba/napi-java

To show what it can do, here is a query and what gets extracted from it:

>> client.interpret phrase: 'Set up a meeting with Bob tomorrow \
          night at 7 PM to discuss the TPS reports'
=> 
    {:entities=>
      {
        :daterange=>[{:start=>"2012-11-15", :end=>"2012-11-16"}],
        :title=>["meeting to discuss the tps reports"],
        :timerange=>[{:start=>"12:00:00AM", :end=>"12:00:00AM"}],
        :contacts=>[{:name=>"bob"}]
      },
     :action=>:CALENDAR_CREATE_EVENT,
     :category=>:CALENDAR
    }
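As a rough sketch of how such a result might be consumed, the snippet below walks the nested hash shown above. The hash is copied from the example output rather than fetched from the API, and the `summarize` helper is hypothetical, not part of any Maluuba library.

```ruby
# Response copied from the example output above; in practice it would come
# from the client call, which is not reproduced here.
RESPONSE = {
  :entities => {
    :daterange => [{ :start => "2012-11-15", :end => "2012-11-16" }],
    :title     => ["meeting to discuss the tps reports"],
    :timerange => [{ :start => "12:00:00AM", :end => "12:00:00AM" }],
    :contacts  => [{ :name => "bob" }]
  },
  :action   => :CALENDAR_CREATE_EVENT,
  :category => :CALENDAR
}.freeze

# Hypothetical helper: flatten the nested result into a one-line summary.
# Missing entity lists are treated as empty rather than raising.
def summarize(response)
  entities = response.fetch(:entities, {})
  title = entities.fetch(:title, []).first
  names = entities.fetch(:contacts, []).map { |contact| contact[:name] }
  day   = entities.fetch(:daterange, []).first&.fetch(:start, nil)
  "#{response[:action]}: #{title} (with #{names.join(', ')}, on #{day})"
end

puts summarize(RESPONSE)
```

Using `fetch` with defaults keeps the helper from failing when a query yields no contacts or no date, which seems likely for queries outside the calendar category.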