Lucene and ontologies

Posted on 11-02 01:26

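I don't have much experience with Lucene, but I need to complete a research project. I want to use ontology-based indexing with Lucene, so I would appreciate any advice on what I should use, how to combine Lucene with an ontology domain, and so on.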

Thanks,

Lucky

左秋 2024-11-09 01:26:41

In Lucene, you might do something like this:

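import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Field.Index is the Lucene 3.x-era API; newer releases use StringField/TextField instead.
protected Document createDocumentFromTuple(Tuple t) {
    Document doc = new Document(); // this is the Lucene document to create
    String docid = createId(t);    // unique id for this tuple
    // Store each value and index it as a single, unanalyzed term
    doc.add(new Field("id", docid, Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("name", t.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("author", t.getAuthor(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    doc.add(new Field("book", t.getBook(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    return doc;
}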

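This assumes that the three fields should not be broken into constituent terms by an Analyzer; if that assumption is wrong, change the last parameter to Field.Index.ANALYZED.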
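To make the difference between the two settings concrete, here is a toy, plain-Java sketch of which terms a field value would contribute to the index. It is not Lucene code: the class `IndexedTerms`, the method `termsFor`, and the lowercase-and-split rule are all assumptions standing in for whatever a real Analyzer does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
final class IndexedTerms {

    // Returns the terms a value would produce under each setting (simplified).
    static List<String> termsFor(String value, boolean analyzed) {
        if (!analyzed) {
            // Like NOT_ANALYZED: the whole value becomes one exact term.
            return Collections.singletonList(value);
        }
        // Like ANALYZED, roughly: lowercase, then split on runs of
        // characters that are neither letters nor digits.
        List<String> terms = new ArrayList<>();
        String lowered = value.toLowerCase(Locale.ROOT);
        for (String token : lowered.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

Under this sketch, `termsFor("Frank Herbert", false)` yields the single term `Frank Herbert`, so only an exact match on the full value would find it, while `termsFor("Frank Herbert", true)` yields `frank` and `herbert`, so a search for either word could match.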

The Solr equivalent (which might make more sense if you are not analyzing the fields) would be:

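import org.apache.solr.common.SolrInputDocument;

protected SolrInputDocument createIndexableDocument(Tuple t) {
    SolrInputDocument doc = new SolrInputDocument();
    String docid = createId(t); // unique id for this tuple
    // Only names and values are set here; storage and analysis are configured on the server
    doc.addField("id", docid);
    doc.addField("name", t.getName());
    doc.addField("author", t.getAuthor());
    doc.addField("book", t.getBook());
    return doc;
}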

In Solr, the server-side configuration determines which fields are stored, how they are parsed, and so on.

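In either case, you will need to figure out how to create a unique id for each Tuple. One way to do it is to generate a hash of the concatenation (with delimiters) of the three values.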
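The hashing approach could be sketched roughly as follows. The answer does not show `createId` or the `Tuple` class, so this version is an assumption throughout: it takes the three values as plain strings, joins them with the ASCII unit separator (assumed never to appear in the values), and uses SHA-256 from the standard library. The class name `TupleIds` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; the original answer leaves createId undefined.
final class TupleIds {

    // Assumed delimiter: ASCII unit separator (code 31), unlikely in ordinary text.
    private static final char DELIMITER = (char) 31;

    // Builds a hex-encoded SHA-256 hash of "name<sep>author<sep>book".
    static String createId(String name, String author, String book) {
        String joined = name + DELIMITER + author + DELIMITER + book;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(joined.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The delimiter matters because without it the values ("ab", "c", "d") and ("a", "bc", "d") would concatenate to the same string and so receive the same id. A wrapper such as `createId(Tuple t)` could simply call this with `t.getName()`, `t.getAuthor()`, and `t.getBook()`.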
