Does coreference resolution require NER?
... or is gender information enough?
More specifically, I'm interested in knowing if I can reduce the number of models loaded by the Stanford Core NLP to extract coreferences. I am not interested in actual named entity recognition.
Thank you
2 Answers
According to the EMNLP paper that describes the coref system packaged with Stanford CoreNLP, named entity tags are only used in the following coref annotation passes: precise constructs, relaxed head matching, and pronouns (Raghunathan et al. 2010).
You can specify which passes to use with the dcoref.sievePasses configuration property. If you want coreference but don't want to do NER, you should be able to run the pipeline without NER and specify that the coref system should only use the annotation passes that don't require NER labels.
However, the resulting coref annotations will take a hit on recall. So, you might want to do some experiments to determine whether the degraded quality of the annotations is a problem for whatever you are using them for downstream.
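As a sketch, a pipeline configuration along these lines would drop the `ner` annotator and the NER-dependent sieve passes. The sieve names below are taken from the Raghunathan et al. (2010) sieve list and should be verified against the dcoref implementation in your CoreNLP release, as the exact identifiers may differ:

```properties
# Run the pipeline without the "ner" annotator; dcoref still needs
# tokenize, ssplit, pos, lemma, and parse.
annotators = tokenize, ssplit, pos, lemma, parse, dcoref

# Keep only the sieve passes that do not consult NER labels, i.e. drop
# the precise-constructs, relaxed-head-matching, and pronoun passes
# (pass names assumed from the paper's sieve list).
dcoref.sievePasses = MarkRole,DiscourseMatch,ExactStringMatch,\
  RelaxedExactStringMatch,StrictHeadMatch1,StrictHeadMatch2,\
  StrictHeadMatch3,StrictHeadMatch4
```

You would then point the pipeline at this file, e.g. `java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props coref.properties -file input.txt`. Note that dropping the pronoun pass means pronominal mentions will largely go unresolved, which is where most of the recall loss mentioned above comes from.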
In general, yes. First, you need named entities because they serve as the candidate antecedents, i.e. the targets to which the pronouns refer. Many (most?) systems perform entity recognition and type classification in one step. Second, the semantic category (e.g. person, organization, location) of the entities is important for constructing accurate coreference chains.