Stanford Parser question

Posted on 2024-10-14 04:43:25

I am writing a project that does NLP (natural language processing) and uses the Stanford parser.

I create a thread pool that takes sentences and runs the parser on them.
With one thread everything works fine, but as soon as I create more threads I get errors.
The "test" method finds words that are connected by some relation.
Making it synchronized should make it behave as if there were only one thread, but I still get errors.
The errors come from this code:

public synchronized String test(String s, LexicalizedParser lp)
{
    if (s.isEmpty()) return "";
    if (s.length() > 80) return "";
    System.out.println(s);
    String[] sent = s.split(" ");
    Tree parse = (Tree) lp.apply(Arrays.asList(sent));
    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    Collection tdl = gs.typedDependenciesCollapsed();
    List list = new ArrayList(tdl);

    //for (int i=0;i<list.size();i++)
    //System.out.println(list.get(1).toString());

    // strip the parentheses and indices, e.g. sbj(screen-4,good-6) -> screen good
    Pattern p = Pattern.compile(".*\\((.*?)\\-\\d+,(.*?)\\-\\d+\\).*");

    if (list.size() > 2) {
        // match the second dependency in the list against the pattern
        Matcher m = p.matcher(list.get(1).toString());
        // check that the pattern matched and has more than one group
        if (m.find() && m.groupCount() > 1) {
            System.out.println(list);
            return m.group(1) + m.group(2);
        }
    }
    return "";
}

The errors I get are:

    at blogsOpinions.ParserText.<init>(ParserText.java:47)
    at blogsOpinions.ThreadPoolTest$1.run(ThreadPoolTest.java:50)
    at blogsOpinions.ThreadPool$PooledThread.run(ThreadPoolTest.java:196)
Recovering using fall through strategy: will construct an (X ...) tree.
Exception in thread "PooledThread-21" java.lang.ClassCastException: java.lang.String cannot be cast to edu.stanford.nlp.ling.HasWord
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.apply(LexicalizedParser.java:289)
    at blogsOpinions.ParserText.test(ParserText.java:174)
    at blogsOpinions.ParserText.insertDb(ParserText.java:76)
    at blogsOpinions.ParserText.<init>(ParserText.java:47)
    at blogsOpinions.ThreadPoolTest$1.run(ThreadPoolTest.java:50)
    at blogsOpinions.ThreadPool$PooledThread.run(ThreadPoolTest.java:196)

Also, how can I get the description of the subject? For example, for "the screen is very good" I want to get "screen good" from the list I get back, without relying on a fixed position such as list.get(1).


Comments (2)

心病无药医 2024-10-21 04:43:25

You can't call LexicalizedParser.parse on a List of Strings; it expects a list of HasWord objects. It's much easier to call the apply method on your input string. This will also run a proper tokenizer on your input (instead of your simple split on spaces).

To get relations such as subjectness out of the returned Tree, call its dependencies member.
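
One way to avoid depending on list.get(1) is to select a dependency by its relation name instead of by its position. The following is only a minimal sketch that works on the printed form of the dependencies. It assumes each entry prints as relation(governor-index, dependent-index) and that the subject relation is labeled nsubj. The class DependencyText and the method firstPair are hypothetical names, and the sample strings are hand-written for illustration, not actual parser output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that works on the printed form of typed dependencies. */
final class DependencyText {
    // Matches relation(word-index, word-index), for example nsubj(good-5, screen-2).
    private static final Pattern DEPENDENCY =
            Pattern.compile("(\\w+)\\((.+?)-\\d+,\\s*(.+?)-\\d+\\)");

    /**
     * Returns "dependent governor" for the first entry with the given relation name,
     * or "" when no entry matches (the same convention as the test method above).
     */
    static String firstPair(List<String> dependencies, String relation) {
        for (String dependency : dependencies) {
            Matcher m = DEPENDENCY.matcher(dependency.trim());
            if (m.matches() && m.group(1).equals(relation)) {
                // Assumption: group 2 is the governor and group 3 is the dependent.
                return m.group(3) + " " + m.group(2);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // Hand-written sample for "The screen is very good".
        List<String> sample = Arrays.asList(
                "det(screen-2, The-1)",
                "nsubj(good-5, screen-2)",
                "cop(good-5, is-3)",
                "advmod(good-5, very-4)");
        System.out.println(firstPair(sample, "nsubj")); // prints "screen good"
    }
}
```

In the question's code, the strings would come from calling toString() on each element of the dependency list. Filtering on the structured dependency objects instead of their text would likely be more robust, but that depends on the parser version in use.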

兲鉂ぱ嘚淚 2024-10-21 04:43:25

Hm, I witnessed the same stack trace. Turned out I was loading two instances of the LexicalizedParser in the same JVM. This seemed to be the problem. When I made sure only one instance is created, I was able to call lp.apply(Arrays.asList(sent)) just fine.
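
A possible way to guarantee a single instance in a thread pool is sketched below. The stack trace in the question suggests that each pooled task constructs its own ParserText, and a synchronized instance method only locks the object it is called on, so separate instances would not exclude each other. The sketch therefore keeps one lazily created instance behind one shared lock. SharedResource and SharedResourceDemo are hypothetical names, and a StringBuilder stands in for the parser so the example runs without the Stanford library. Whether the parser itself may safely be called from several threads at once is not confirmed here, so the sketch serializes every call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helper: creates one instance lazily and serializes every call to it. */
final class SharedResource<T> {
    private final Object lock = new Object();
    private final Supplier<T> loader;
    private T instance; // only read or written while holding lock

    SharedResource(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the action against the single shared instance, one caller at a time. */
    <R> R call(Function<T, R> action) {
        synchronized (lock) {
            if (instance == null) {
                instance = loader.get(); // runs at most once, assuming the loader never returns null
            }
            return action.apply(instance);
        }
    }
}

final class SharedResourceDemo {
    /** Runs the given number of tasks on a pool and returns how often the loader ran. */
    static int countLoads(int threads, int tasks) {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in for the expensive parser load; a real loader would build the parser here.
        SharedResource<StringBuilder> shared = new SharedResource<>(() -> {
            loads.incrementAndGet();
            return new StringBuilder();
        });
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            final String sentence = "sentence " + i;
            // Every task goes through the same SharedResource, hence the same lock.
            pool.execute(() -> shared.call(sb -> sb.append(sentence).length()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loader ran " + countLoads(4, 20) + " time(s)");
    }
}
```

Locking around every call gives up parallel parsing in exchange for safety. If the parser turns out to be safe for concurrent use, the lock could be narrowed to cover only the one-time load.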
