Lucene and ontology

Posted on 11-02 01:26

I do not have much experience with Lucene, but I need to complete a research project.
I want to build a Lucene index based on an ontology, so I would appreciate any advice: what I should use, how to combine Lucene with an ontology domain, and so on.

Thanks,

  • Lucky

Comments (1)

左秋2024-11-09 01:26:41

In Lucene, you might do something like

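import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Uses the Lucene 3.x Field constructor; Tuple and createId are the caller's own types/helpers.
protected Document createDocumentFromTuple(Tuple t) {
    Document doc = new Document(); // this is the Lucene document to create
    String docid = createId(t);    // unique id for this tuple
    // Each value is stored and indexed as a single, unanalyzed term.
    doc.add(new Field("id", docid, Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("name", t.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("author", t.getAuthor(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("book", t.getBook(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    return doc;
}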

This assumes that the three value fields (name, author, book) should not be decomposed into constituent terms by an Analyzer; if that is not a correct assumption, change the last parameter to Field.Index.ANALYZED.

The Solr equivalent (which might make more sense if you are not analyzing the fields) would be

import org.apache.solr.common.SolrInputDocument;

protected SolrInputDocument createIndexableDocument(Tuple t) {
    SolrInputDocument doc = new SolrInputDocument();
    String docid = createId(t); // unique id for this tuple
    doc.addField("id", docid);
    doc.addField("name", t.getName());
    doc.addField("author", t.getAuthor());
    doc.addField("book", t.getBook());
    return doc;
}

In Solr, the server-side configuration determines which fields are stored, how they are parsed, etc.

In each case, you will need to figure out how to create a unique id for each Tuple. One way to do it is to generate a hash of the concatenation (with delimiters) of the three values.
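
The hashing approach just described could look roughly like the sketch below. It is only one possible way to fill in the createId helper: the class name TupleIds, the three-String signature (instead of a Tuple argument), the choice of SHA-256, and the U+001F delimiter are all assumptions made for illustration, not something the snippets above prescribe.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; the name and signature are illustrative assumptions.
public class TupleIds {
    // Unit separator (U+001F), assumed not to occur inside the values themselves.
    // The delimiter keeps ("ab", "c") and ("a", "bc") from producing the same input.
    private static final char DELIMITER = '\u001F';

    public static String createId(String name, String author, String book) {
        String joined = name + DELIMITER + author + DELIMITER + book;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(joined.getBytes(StandardCharsets.UTF_8));
            // Render the 32-byte digest as a 64-character lowercase hex string.
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With a helper like this, createId(t) in the snippets above could delegate to TupleIds.createId(t.getName(), t.getAuthor(), t.getBook()). The same tuple then always maps to the same id, so re-indexing it can overwrite the earlier document instead of adding a duplicate.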
