Searching a subset of objects with Compass/Lucene

Posted 2024-08-12 14:34:04

I'm using the Searchable plugin for Grails (which provides an API for Compass, which is itself an API over Lucene). I have an Order class that I would like to search, but I don't want to search all instances of Order, just a subset of them. Something like this:

// This is a Hibernate/GORM call
List<Order> searchableOrders = Customer.findAllByName("Bob").orders

// Now search only these orders with the searchable plugin - something like
searchableOrders.search("name: foo")

In reality, the relational query that produces searchableOrders is more complex than this, so I can't express the entire query (Hibernate + Compass) in Compass alone. Is there a way to search only a subset of the instances of a particular class using Compass/Lucene?

Comments (2)

楠木可依 2024-08-19 14:34:04

One way to do this is with a custom Filter. For example, if you wanted to filter based on ids for your domain class, you would add the id to the searchable configuration for the domain class:

static searchable = {
   id name: "id"
}

Then you would write your custom filter (which can go in [project]/src/java):

import org.apache.lucene.search.Filter;
import java.util.BitSet;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.IndexReader;

import java.io.IOException;
import java.util.List;

public class IdFilter extends Filter {

    private List<String> ids;

    public IdFilter(List<String> ids) {
        this.ids = ids;
    }

    public BitSet bits(IndexReader reader) throws IOException {
        BitSet bits = new BitSet(reader.maxDoc());
        int[] docs = new int[1];
        int[] freqs = new int[1];
        for( String id : ids ) {
            if (id != null) {
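                TermDocs termDocs = reader.termDocs(new Term("id", id ) );
                try {
                    // Each id is expected to match exactly one document
                    int count = termDocs.read(docs, freqs);
                    if (count == 1) {
                        bits.set(docs[0]);
                    }
                } finally {
                    // Release the enumerator so index resources are not leaked
                    termDocs.close();
                }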
            }
        }
        return bits;
    }
}

Then you would pass the filter as an argument to your search (making sure to import the IdFilter class if it is in a different package):

def theSearchResult = MyDomainClass.search( 
{
    must( queryString(params.q) )       
}, 
params,
filter: new IdFilter( [ "1" ] ))

Here I'm just creating a hard-coded list containing the single value "1", but you could retrieve the list of ids from the database, from a previous search, or wherever.

You could easily generalize this filter to take the term name as a constructor argument, then pass in "name" as you wanted.

青春如此纠结 2024-08-19 14:34:04

Two ways of doing this:

The easiest from an implementation standpoint is to run two queries over all objects (one findAll and one search) and then take the intersection of the results. If you cache the result of the findAll call, you are really down to one query per search.
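The intersection step could look something like the sketch below. It assumes both queries have already been reduced to lists of numeric ids; the class name ResultIntersection and the method intersect are made up for illustration and are not part of Grails, Compass, or Lucene. The search hits are walked in order so that the relevance ranking survives the intersection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of the Searchable plugin or Compass.
public class ResultIntersection {

    /**
     * Keeps the ids from searchHitIds that also appear in allowedIds,
     * preserving the order (ranking) of the search hits.
     */
    public static List<Long> intersect(List<Long> searchHitIds, List<Long> allowedIds) {
        Set<Long> allowed = new HashSet<>(allowedIds); // constant-time membership checks
        List<Long> result = new ArrayList<>();
        for (Long id : searchHitIds) {
            if (allowed.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> hits = Arrays.asList(8L, 3L, 5L, 9L);   // ids returned by the full-text search
        List<Long> bobsOrders = Arrays.asList(4L, 5L, 8L); // ids returned by the findAll query
        System.out.println(intersect(hits, bobsOrders));   // prints [8, 5]
    }
}
```

One thing to keep in mind with this approach: if the search is paginated, the intersection has to be taken before paging, otherwise a page may end up with fewer results than requested.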

A "cleaner" way to do this is to make sure the IDs of the domain objects are indexed with Searchable; then, once you have the findAll result, pass those IDs into the search query to restrict it.

I don't remember the Lucene syntax off the top of my head, but you'd have to do something like

searchableOrders.search("name: foo AND (ID:4 OR ID:5 OR ID:8 ...)" )

You may run into query size limits in Lucene (BooleanQuery allows 1024 clauses by default), but that limit is configurable via BooleanQuery.setMaxClauseCount.
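As a rough sketch of how such a query string might be assembled from the findAll result, the helper below joins the ids with OR and, to stay under whatever clause limit is configured, splits long id lists into several smaller queries. The class IdQueryBuilder, its method names, and the batch size are hypothetical; the field name "ID" is taken from the example above and must match however the id is actually indexed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds Lucene query-string syntax only, no Lucene dependency.
public class IdQueryBuilder {

    /** Builds a query such as "(name: foo) AND (ID:4 OR ID:5)" for one batch of ids. */
    public static String build(String userQuery, String idField, List<Long> ids) {
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("ids must not be empty");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('(').append(userQuery).append(") AND (");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(" OR "); // the query parser expects the operator in upper case
            }
            sb.append(idField).append(':').append(ids.get(i));
        }
        return sb.append(')').toString();
    }

    /** Splits ids into batches of at most maxIdsPerQuery and builds one query per batch. */
    public static List<String> buildBatched(String userQuery, String idField,
                                            List<Long> ids, int maxIdsPerQuery) {
        if (maxIdsPerQuery < 1) {
            throw new IllegalArgumentException("maxIdsPerQuery must be at least 1");
        }
        List<String> queries = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += maxIdsPerQuery) {
            int to = Math.min(from + maxIdsPerQuery, ids.size());
            queries.add(build(userQuery, idField, ids.subList(from, to)));
        }
        return queries;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(4L, 5L, 8L);
        System.out.println(build("name: foo", "ID", ids));
        // prints (name: foo) AND (ID:4 OR ID:5 OR ID:8)
    }
}
```

Batching means running several searches and merging their hits, so relevance scores from different batches would need to be combined by hand; for large id sets the custom Filter from the other answer is likely the better fit.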
