Boosting fields or documents has no effect in Lucene.Net

Posted on 2024-10-17 11:58:19

I am trying to get boosting to work so that I can boost documents and/or fields and shape the search results the way I want.

However, I cannot get document or field boosts to have any effect at all on the scoring.

Either boosting in Lucene.Net does not work (not very likely) or I am misunderstanding something (very likely).

Here is my showcase code, stripped down to the bare essentials:

using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

namespace SO_LuceneTest
{  
class Program
{
    static void Main(string[] args)
    {

        const string INDEXNAME = "TextIndex";

        var writer = new IndexWriter(INDEXNAME, new SimpleAnalyzer(), true);
        writer.DeleteAll();

        var persons = new Dictionary<string, string>
                          {
                            { "Smithers", "Jansen" },
                            { "Jan", "Smith" }
                          };

        foreach (var p in persons)
        {
            var doc = new Document();
            var firstnameField = new Field("Firstname", p.Key, Field.Store.YES, Field.Index.ANALYZED);
            var lastnameField = new Field("Lastname", p.Value, Field.Store.YES, Field.Index.ANALYZED);
            //firstnameField.SetBoost(2.0f);
            doc.Add(firstnameField);
            doc.Add(lastnameField);
            writer.AddDocument(doc);
        }

        writer.Commit();
        writer.Close();

        var term = "jan*";
        var queryFields = new string[] { "Firstname", "Lastname" };

        var boosts = new Dictionary<string, float>();
        //boosts.Add("Firstname", 10);

        QueryParser mqp = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_24, queryFields, new SimpleAnalyzer(), boosts);

        var query = mqp.Parse(term);

        IndexSearcher searcher = new IndexSearcher(INDEXNAME);

        Hits hits = searcher.Search(query);

        int results = hits.Length();
        Console.WriteLine("Found {0} results", results);
        for (int i = 0; i < results; i++)
        {
            Document doc = hits.Doc(i);
            Console.WriteLine("{0} {1}\t\t{2}", doc.Get("Firstname"), doc.Get("Lastname"), hits.Score(i));
        }

        searcher.Close();

        Console.WriteLine("...");
        Console.Read();

    }
}
}

I have commented out the two places where boosting is applied. With them enabled, the scores are still exactly the same as without boosting.

What am I missing here?

I am using Lucene.Net v2.9.2.2, the latest version as of now.

樱花细雨 2024-10-24 11:58:19

Please try whether this works for you. It works for me, but you will have to adapt it, because it depends on a lot of other code that I won't include in this post unless necessary. The main difference is the use of TopFieldCollector to get the results.

        var dir = SimpleFSDirectory.Open(new DirectoryInfo(IndexPath));
        var ixSearcher = new IndexSearcher(dir, false);
        var qp = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, f_Text, analyzer);
        query = CleanQuery(query);
        Query q = qp.Parse(query);
        TopFieldCollector collector = TopFieldCollector.Create(
              new Sort(new SortField(null, SortField.SCORE, false), new SortField(f_Date, SortField.LONG, true)),
              MAX_RESULTS,
              false,         // fillFields - not needed, we want score and doc only
              true,          // trackDocScores - need doc and score fields
              true,          // trackMaxScore - related to trackDocScores
              false); // should docs be in docId order?
        ixSearcher.Search(q, collector);
        TopDocs topDocs = collector.TopDocs();
        ScoreDoc[] hits = topDocs.ScoreDocs;

        uint pageCount = (uint)Math.Ceiling((double)hits.Length / pageSize);

        for (uint i = pageIndex * pageSize; i < (pageIndex + 1) * pageSize; i++) {
            if (i >= hits.Length) {
                break;
            }
            int doc = hits[i].Doc;

            Content c = new Content {
                Title = ixSearcher.Doc(doc).GetField(f_Title).StringValue(),
                Text = FragmentOnOrgText(ixSearcher.Doc(doc).GetField(f_TextOrg).StringValue(), highligter.GetBestFragments(analyzer, ixSearcher.Doc(doc).GetField(f_Text).StringValue(), maxNumberOfFragments)),
                Date = DateTools.StringToDate(ixSearcher.Doc(doc).GetField(f_Date).StringValue()),
                Score = hits[i].Score
            };

            rv.Add(c);
        }

        ixSearcher.Close();
