How can I get the nodes I want from the Stanford Parser (NLP)?

Posted 2024-10-10 21:49:23

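My main problem is that I don't know how to extract nodes from a GrammaticalStructure. I am using englishPCFG.ser from Java in NetBeans. My goal is to find out what a sentence says about the quality of the screen, for example:

The screen of iphone 4 is great.

I want to extract "screen" and "great". How can I extract the NN (screen) and the VP (great)?

The code I have written is: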

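import java.util.Arrays;
import java.util.Collection;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.*;

LexicalizedParser lp = new LexicalizedParser("C:\\englishPCFG.ser");
lp.setOptionFlags(new String[]{"-maxLength", "80", "-retainTmpSubcategories"});

// One token per element: Arrays.asList on a single String gives a
// one-element list, so the whole sentence would count as one word.
String[] sent = {"the", "screen", "is", "very", "good", "."};
Tree parse = (Tree) lp.apply(Arrays.asList(sent));
parse.pennPrint();
System.out.println();

// Convert the phrase-structure tree into typed dependencies.
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();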

Comments (1)

寒冷纷飞旳雪 2024-10-17 21:49:23

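The collection tdl is a list of typed dependencies. For the sentence "The screen of iphone 4 is great.", it contains:

det(screen-2, the-1)
nsubj(great-7, screen-2)
amod(4-5, iphone-4)
prep_of(screen-2, 4-5)
cop(great-7, is-6)

(as you can see by trying it out online).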

So the dependency you want, nsubj(great-7, screen-2), is right there in that list. nsubj means that "screen" is the subject of "great".
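To make the filtering step concrete, here is a minimal sketch that picks the nsubj entries out of the list. So that it runs without the parser jar, it works on the printed form of each dependency; DependencyFilter, Dep, parse and withRelation are hypothetical names invented for this illustration, not part of the Stanford Parser. With real TypedDependency objects you would normally skip the string parsing and read the relation, governor and dependent through their accessors (reln(), gov() and dep(), as far as I recall).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Stanford Parser class.
class DependencyFilter {

    /** One dependency in the form relation(governor-i, dependent-j). */
    static final class Dep {
        final String relation;
        final String governor;
        final String dependent;

        Dep(String relation, String governor, String dependent) {
            this.relation = relation;
            this.governor = governor;
            this.dependent = dependent;
        }
    }

    // Matches e.g. "nsubj(great-7, screen-2)"; the word indices are captured but unused.
    private static final Pattern DEP =
            Pattern.compile("(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    /** Parses one printed dependency line into its three parts. */
    static Dep parse(String line) {
        Matcher m = DEP.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dependency: " + line);
        }
        return new Dep(m.group(1), m.group(2), m.group(4));
    }

    /** Returns the dependencies whose relation name equals the given one. */
    static List<Dep> withRelation(List<String> lines, String relation) {
        List<Dep> result = new ArrayList<>();
        for (String line : lines) {
            Dep d = parse(line);
            if (d.relation.equals(relation)) {
                result.add(d);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tdl = Arrays.asList(
                "det(screen-2, the-1)",
                "nsubj(great-7, screen-2)",
                "amod(4-5, iphone-4)",
                "prep_of(screen-2, 4-5)",
                "cop(great-7, is-6)");
        for (Dep d : withRelation(tdl, "nsubj")) {
            // The dependent is the subject, the governor is what is said about it.
            System.out.println(d.dependent + " -> " + d.governor);
        }
    }
}
```

For the example sentence this pairs the subject "screen" with its predicate "great", which is the aspect/opinion pair the question is after.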

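The collection of dependencies is just a Collection (List). For more sophisticated processing, people commonly turn the dependencies into a graph structure that can be searched and traversed in various ways. There are several ways of doing that. We often use the [jgrapht](http://www.jgrapht.org/) library. But at that point it is code you are writing yourself.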
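As a rough sketch of the graph idea, the class below stores each dependency as a labelled edge from governor to dependent and walks the result breadth-first. It deliberately uses a plain adjacency map from the standard library in place of jgrapht, so that it runs on its own; DependencyGraph, Edge, addEdge, edgesFrom and reachableFrom are hypothetical names made up for this example, and nodes are identified by their word-index labels such as "great-7".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical minimal dependency graph; a stand-in for a real graph library.
class DependencyGraph {

    /** A labelled edge pointing at a dependent node. */
    static final class Edge {
        final String relation;
        final String dependent;

        Edge(String relation, String dependent) {
            this.relation = relation;
            this.dependent = dependent;
        }
    }

    // governor -> outgoing edges, kept in insertion order
    private final Map<String, List<Edge>> outgoing = new LinkedHashMap<>();

    /** Adds the edge governor --relation--> dependent. */
    void addEdge(String governor, String relation, String dependent) {
        outgoing.computeIfAbsent(governor, k -> new ArrayList<>())
                .add(new Edge(relation, dependent));
    }

    /** Returns the edges leaving a node, or an empty list if it has none. */
    List<Edge> edgesFrom(String governor) {
        return outgoing.getOrDefault(governor, Collections.<Edge>emptyList());
    }

    /** Breadth-first traversal: all nodes reachable from start, in visit order. */
    List<String> reachableFrom(String start) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            order.add(node);
            for (Edge e : edgesFrom(node)) {
                if (seen.add(e.dependent)) {   // add() is false if already visited
                    queue.add(e.dependent);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        DependencyGraph g = new DependencyGraph();
        g.addEdge("screen-2", "det", "the-1");
        g.addEdge("great-7", "nsubj", "screen-2");
        g.addEdge("4-5", "amod", "iphone-4");
        g.addEdge("screen-2", "prep_of", "4-5");
        g.addEdge("great-7", "cop", "is-6");

        // Everything that hangs off the predicate "great".
        System.out.println(g.reachableFrom("great-7"));
    }
}
```

Starting the traversal at the predicate reaches its subject first and then the subject's own modifiers, which is the kind of query that is awkward on a flat list. A library such as jgrapht would add ready-made algorithms on top of the same idea.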
