Corpus with semantic role labels for NLP applications
So, I've constructed an NLP program that learns to extract a semantic event description from a sentence, but right now my training set is limited to sentences I've parsed into semantic event components by hand.
While this method does get the job done, it's hardly a proper substitute for a large pre-parsed corpus of text. Unfortunately, all of my attempts at finding such a corpus have proven futile.
What I need specifically is a corpus in which each word (or group of words) in a sentence is tagged with its semantic role. Examples of the roles I have in mind:
- agent
- action
- patient
- instrument
- co-agent
- co-patient
- location
- adverb
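To make the target format concrete, here is a minimal sketch of how one role-tagged sentence could be represented. The `RoleSpan` class, the `label_tokens` helper, and the sample sentence are all hypothetical names and data made up for illustration; they do not come from any existing corpus.

```python
from dataclasses import dataclass

# Role inventory taken from the list above.
ROLES = {
    "agent", "action", "patient", "instrument",
    "co-agent", "co-patient", "location", "adverb",
}


@dataclass(frozen=True)
class RoleSpan:
    """A semantic role assigned to tokens[start:end] (end is exclusive)."""
    role: str
    start: int
    end: int


def label_tokens(tokens, spans):
    """Expand role spans into one label per token (None if untagged)."""
    labels = [None] * len(tokens)
    for span in spans:
        if span.role not in ROLES:
            raise ValueError(f"unknown role: {span.role}")
        for i in range(span.start, span.end):
            if labels[i] is not None:
                raise ValueError(f"token {i} is covered by two spans")
            labels[i] = span.role
    return labels


# Illustrative sentence, invented for this sketch.
tokens = ["The", "boy", "hit", "the", "ball", "with", "a", "bat"]
role_spans = [
    RoleSpan("agent", 0, 2),
    RoleSpan("action", 2, 3),
    RoleSpan("patient", 3, 5),
    RoleSpan("instrument", 5, 8),
]
labels = label_tokens(tokens, role_spans)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Spans are used rather than per-word tags so that a group of words can carry a single role, which is the "word (or group of words)" requirement stated above.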
If any more specifics are needed, feel free to ask, or refer to this paper, which uses a toy corpus with the same constraints as mine.
Comments (1)
The CoNLL Shared Task in 2005 was 'Semantic Role Labelling'. This page describes their corpus and what roles they labelled.
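For orientation, here is a minimal sketch of decoding one argument column of that kind of data. It assumes the bracketed start-end notation used for that task's proposition columns (`(A0*` opens a span, `*` continues, `*)` closes, `(V*)` is a one-token span) and that spans within a column are flat, not nested. The function name and the sample column are made up for illustration; check the task's own format description before relying on this.

```python
def decode_span_column(tags):
    """Turn start-end bracket tags into (label, start, end) spans.

    `end` is exclusive. Assumes flat (non-nested) spans in the column.
    """
    spans = []
    open_label = None
    open_start = None
    for i, tag in enumerate(tags):
        if tag.startswith("("):
            if open_label is not None:
                raise ValueError(f"nested span at token {i} is not supported")
            # The label sits between "(" and the "*" placeholder.
            open_label = tag[1:tag.index("*")]
            open_start = i
        if tag.endswith(")"):
            if open_label is None:
                raise ValueError(f"closing bracket without opener at token {i}")
            spans.append((open_label, open_start, i + 1))
            open_label = None
    if open_label is not None:
        raise ValueError("column ended inside an open span")
    return spans


# Invented column for an eight-token sentence; labels follow the
# PropBank-style numbering (A0, A1, ...) with V marking the predicate.
column = ["(A0*", "*)", "(V*)", "(A1*", "*)", "(A2*", "*", "*)"]
decoded = decode_span_column(column)
print(decoded)
```

The numbered labels are verb-specific, so mapping them onto a fixed inventory such as agent, patient, and instrument would need a separate, hand-checked step.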