How do I get the nodes I want from the Stanford Parser (NLP)?
My main problem is that I don't know how to extract nodes from a GrammaticalStructure.
I am using englishPCFG.ser from Java in NetBeans.
My goal is to find out the quality of the screen from a sentence like:
The screen of iphone 4 is great.
I want to extract "screen" and "great".
How can I extract the NN (screen) and the VP (great)?
The code that I wrote is:
// Load the English PCFG model
LexicalizedParser lp = new LexicalizedParser("C:\\englishPCFG.ser");
lp.setOptionFlags(new String[]{"-maxLength", "80", "-retainTmpSubcategories"});
String sent = "the screen is very good.";
Tree parse = (Tree) lp.apply(Arrays.asList(sent));
// Print the phrase structure tree in Penn Treebank format
parse.pennPrint();
System.out.println();
// Convert the parse tree into collapsed typed dependencies
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
The collection tdl is a list of typed dependencies for this sentence (as you can see by trying it out online).
So the dependency you want, nsubj(great-7, screen-2), is right there in that list. nsubj means that "screen" is the subject of "great".
The collection of dependencies is just a Collection (List). For more sophisticated further processing, people commonly want to turn the dependencies into a graph structure that can be searched and traversed in various ways. There are various ways of doing that. We often use the [jgrapht](http://www.jgrapht.org/) library. But that is then code you are writing yourself.
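As a rough illustration of that last point, here is a minimal sketch of picking the nsubj pair out of such a list and grouping the dependencies by governor, which is about the simplest graph structure you could build by hand. It assumes the dependencies are available in their printed form reln(gov-N, dep-M) so that it runs without the parser on the classpath. The class DependencyDemo, the helper type Dep, and the method names are hypothetical, and only the nsubj entry is taken from the answer above; the other two entries are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DependencyDemo {

    /** One typed dependency such as nsubj(great-7, screen-2). Hypothetical helper type. */
    public static final class Dep {
        public final String reln;   // relation name, e.g. "nsubj"
        public final String gov;    // governor word, e.g. "great"
        public final int govIndex;  // 1-based token position of the governor
        public final String dep;    // dependent word, e.g. "screen"
        public final int depIndex;  // 1-based token position of the dependent

        public Dep(String reln, String gov, int govIndex, String dep, int depIndex) {
            this.reln = reln;
            this.gov = gov;
            this.govIndex = govIndex;
            this.dep = dep;
            this.depIndex = depIndex;
        }
    }

    // Matches the printed form "reln(gov-N, dep-M)".
    private static final Pattern FORM =
            Pattern.compile("(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    /** Parses one printed dependency; throws if the text does not have the expected shape. */
    public static Dep parse(String text) {
        Matcher m = FORM.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a typed dependency: " + text);
        }
        return new Dep(m.group(1), m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)));
    }

    /** Keeps only the dependencies with the given relation name. */
    public static List<Dep> withRelation(List<Dep> deps, String reln) {
        List<Dep> out = new ArrayList<>();
        for (Dep d : deps) {
            if (d.reln.equals(reln)) {
                out.add(d);
            }
        }
        return out;
    }

    /** Groups dependencies by governor token index: a minimal adjacency-list graph. */
    public static Map<Integer, List<Dep>> byGovernor(List<Dep> deps) {
        Map<Integer, List<Dep>> graph = new LinkedHashMap<>();
        for (Dep d : deps) {
            graph.computeIfAbsent(d.govIndex, k -> new ArrayList<>()).add(d);
        }
        return graph;
    }

    public static void main(String[] args) {
        List<Dep> deps = new ArrayList<>();
        deps.add(parse("nsubj(great-7, screen-2)")); // from the answer above
        deps.add(parse("det(screen-2, The-1)"));     // illustrative, not verified parser output
        deps.add(parse("cop(great-7, is-6)"));       // illustrative, not verified parser output

        // Print each subject together with the word it is the subject of.
        for (Dep d : withRelation(deps, "nsubj")) {
            System.out.println(d.dep + " -> " + d.gov);
        }
    }
}
```

With the real tdl you would not need the string parsing: as far as I know, each TypedDependency exposes reln(), gov() and dep(), so the same filter would compare td.reln().getShortName() against "nsubj" and read the two words from the governor and dependent nodes. Accessor details on those nodes differ between parser versions, so check the Javadoc for the release you are using.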