Coreference resolution using OpenNLP

Posted on 2024-12-22 14:33:23

I want to do "coreference resolution" using OpenNLP. The Apache documentation (Coreference Resolution) doesn't cover how to do it. Does anybody have any docs or a tutorial on how to do this?

Comments (2)

命硬 2024-12-29 14:33:23

I recently ran into the same problem and wrote up some blog notes for using OpenNLP 1.5.x tools. It's a bit dense to copy in its entirety, so here's a link with more details.


At a high level, you need to load the appropriate OpenNLP coreference model libraries and also the WordNet 3.0 dictionary. Given those dependencies, initializing the linker object is pretty straightforward:

// LinkerMode should be TEST
// Note: I tried LinkerMode.EVAL first, before realizing that it was the problem
Linker _linker = new DefaultLinker("lib/opennlp/coref", LinkerMode.TEST);

Using the Linker, however, is a bit less obvious. You need to:

  1. Break the content down into sentences and the corresponding tokens
  2. Create a Parse object for each sentence
  3. Wrap each sentence Parse so as to indicate the sentence ordering:

    final DefaultParse parseWrapper = new DefaultParse(parse, idx);
  4. Iterate over each sentence parse and use the Linker to get the Mention objects from each parse:

    final Mention[] extents =
       _linker.getMentionFinder().getMentions(parseWrapper);
  5. Finally, use the Linker to identify the distinct entities across all of the Mention objects:

    DiscourseEntity[] entities = _linker.getEntities(arrayOfAllMentions);
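
The bookkeeping in steps 3 to 5 (number each sentence, gather the mentions from every sentence, then make a single linking call over all of them) can be sketched roughly as below. This is only an illustration of the control flow: `MentionSource`, `EntityLinker` and `CorefPipelineSketch` are hypothetical stand-ins invented here so the sketch runs without the OpenNLP jars or models, and mentions are plain strings rather than `Mention` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for step 4 (NOT an OpenNLP type): mentions of one indexed sentence. */
interface MentionSource {
    String[] mentionsIn(String sentence, int sentenceIndex);
}

/** Hypothetical stand-in for step 5 (NOT an OpenNLP type): groups mentions into entities. */
interface EntityLinker {
    List<List<String>> link(String[] allMentions);
}

class CorefPipelineSketch {
    /** Gathers mentions sentence by sentence, then links them all in one call. */
    static List<List<String>> resolve(String[] sentences,
                                      MentionSource finder,
                                      EntityLinker linker) {
        List<String> allMentions = new ArrayList<>();
        for (int idx = 0; idx < sentences.length; idx++) {
            // idx plays the role of the sentence number given to DefaultParse in step 3
            allMentions.addAll(Arrays.asList(finder.mentionsIn(sentences[idx], idx)));
        }
        // A single linking call over every mention, so an entity can span sentences
        return linker.link(allMentions.toArray(new String[0]));
    }

    public static void main(String[] args) {
        String[] sentences = {"Ann left early.", "She waved goodbye."};
        // Toy finder: treats the first word of each sentence as its only mention
        MentionSource finder =
            (sentence, idx) -> new String[] {idx + ":" + sentence.split(" ")[0]};
        // Toy linker: puts every mention into one entity
        EntityLinker linker = all -> Collections.singletonList(Arrays.asList(all));
        System.out.println(resolve(sentences, finder, linker));
    }
}
```

With the real library, the roles of the two stand-ins would be played by `_linker.getMentionFinder().getMentions(parseWrapper)` and `_linker.getEntities(arrayOfAllMentions)` from the steps above, and the accumulated list would hold `Mention` objects instead of strings.
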

零度° 2024-12-29 14:33:23

There is little coreference resolution documentation for OpenNLP at the moment except for a very short mention of how to run it in the readme.

If you're not invested in using OpenNLP, then consider the Stanford CoreNLP package, which includes a Java example of how to run it, including how to perform coreference resolution using the package. It also includes a page summarizing its performance, and the papers published on the coreference package.
