Lucene.NET 2.9 and BitArray/DocIdSet

Posted on 2024-09-04 09:13:46

I found a great example of getting facet counts on a base query. It stores the BitArray of the base query so that it does not have to be recomputed each time a facet is counted.

        var genreQuery = new TermQuery(new Term("genre", genre));
        var genreQueryFilter = new QueryFilter(genreQuery);
        BitArray genreBitArray = genreQueryFilter.Bits(searcher.GetIndexReader());
        Console.WriteLine("There are " + GetCardinality(genreBitArray) + " document with the genre " + genre);

        // Next perform a regular search and get its BitArray result
        Query searchQuery = MultiFieldQueryParser.Parse(term, new[] {"title", "description"}, new[] {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD}, new StandardAnalyzer());
        var searchQueryFilter = new QueryFilter(searchQuery);
        BitArray searchBitArray = searchQueryFilter.Bits(searcher.GetIndexReader());
        Console.WriteLine("There are " + GetCardinality(searchBitArray) + " document containing the term " + term);

The only problem is that I am using a newer version of Lucene.NET (2.9), where Filter.Bits is obsolete. We are told to use DocIdSet instead of BitArray.

I cannot find out how to do the equivalent of bitArray.And(bitArray) with a DocIdSet. I looked in Reflector and found OpenBitSet, which has And operations. I'm not sure whether OpenBitSet is the route to go; I'm just mentioning it.
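
For reference, the BitArray intersection being replaced can be sketched roughly as follows. The original example never shows GetCardinality, so the version here is an assumption (it simply counts the set bits), and the class name BitArrayFacets and the method CountIntersection are hypothetical names used only for illustration.

```csharp
using System.Collections;

public static class BitArrayFacets
{
    // Assumed implementation: the source calls GetCardinality but does not define it.
    public static int GetCardinality(BitArray bits)
    {
        int count = 0;
        foreach (bool bit in bits)
        {
            if (bit) count++;
        }
        return count;
    }

    // Hypothetical helper: counts documents present in both bit arrays.
    public static int CountIntersection(BitArray baseBits, BitArray facetBits)
    {
        // BitArray.And modifies the instance it is called on,
        // so intersect a copy to keep the cached base bits intact.
        var combined = new BitArray(baseBits);
        combined.And(facetBits);
        return GetCardinality(combined);
    }
}
```

Both arrays must have the same length, which should hold when they come from the same index reader; BitArray.And throws otherwise.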

Thanks in advance!

暖心男生 2024-09-11 09:13:46

Figured it out.

            var productsDISI = new OpenBitSetDISI(productResults.Iterator(), 25000);
            var termQuery = new TermQuery(new Term("Spec" + expectedFacet.SpecificationId, expectedFacet.SpecificationOptionId.ToString()));
            var termQueryFilter = new QueryWrapperFilter(termQuery);
            var termIterator = termQueryFilter.GetDocIdSet(productReader).Iterator();
            productsDISI.InPlaceAnd(termIterator);
            var total = productsDISI.Cardinality();

It turns out to be much faster too.
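
The same approach can be applied to the genre/search example from the question. This is only a sketch under some assumptions: the class FacetCounter and the method CountIntersection are hypothetical names, reader.MaxDoc() is used as the bit set size in place of the hard-coded 25000, and the null checks assume that Iterator() may return null when a query matches nothing in this version.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Util;

public static class FacetCounter
{
    // Hypothetical helper: counts documents matched by both queries.
    public static long CountIntersection(IndexReader reader, Query baseQuery, Query facetQuery)
    {
        var baseIterator = new QueryWrapperFilter(baseQuery).GetDocIdSet(reader).Iterator();
        var facetIterator = new QueryWrapperFilter(facetQuery).GetDocIdSet(reader).Iterator();

        // Assumption: a null iterator means no matching documents.
        if (baseIterator == null || facetIterator == null)
        {
            return 0;
        }

        // Load the base query's documents into a bit set sized to the index
        // (assumes MaxDoc() is a method, as in the 2.9 API).
        var bits = new OpenBitSetDISI(baseIterator, reader.MaxDoc());

        // Keep only the documents that also match the facet query.
        bits.InPlaceAnd(facetIterator);

        return bits.Cardinality();
    }
}
```

Because InPlaceAnd modifies the bit set, counting several facets against one base query would need a fresh OpenBitSetDISI (or a clone of a cached one) per facet.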
