What is the correct way to rebuild a Lucene index?

Posted on 2024-09-11 22:32:39

I have a forum-like web application written in ASP.NET MVC, and I'm trying to use Lucene.Net as its search engine. When I build my index, every now and then I get exceptions about Lucene not being able to rename the "deletable" file. I think it's because I empty the index every time I want to rebuild it. Here is the code that deals with indexing:

public class SearchService : ISearchService
{
    Directory   IndexFileLocation;
    IndexWriter Writer;
    IndexReader Reader; 
    Analyzer    Analyzer;

    public SearchService(String indexLocation)
    {
        IndexFileLocation = FSDirectory.GetDirectory(indexLocation, System.IO.Directory.Exists(indexLocation) == false);
        Reader            = IndexReader.Open(IndexFileLocation);
        Writer            = new IndexWriter(IndexFileLocation, Analyzer, IndexFileLocation.List().Length == 0);
        Analyzer          = new StandardAnalyzer();
    }

    public void ClearIndex()
    {
        var DocumentCount = Writer.DocCount();
        if (DocumentCount == 0)
            return;

        for (int i = 0; i < DocumentCount; i++)
            Reader.DeleteDocument(i);
    }

    public void AddToSearchIndex(ISearchableData Data)
    {
        Document Doc = new Document();

        foreach (var Entry in Data)
        {
            Field field = new Field(Entry.Key, 
                                    Entry.Value, 
                                    Lucene.Net.Documents.Field.Store.NO, 
                                    Lucene.Net.Documents.Field.Index.TOKENIZED, 
                                    Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);
            Doc.Add(field);
        }

        Field KeyField = new Field(
            SearchField.Key.ToString(), 
            Data.Key, 
            Lucene.Net.Documents.Field.Store.YES, 
            Lucene.Net.Documents.Field.Index.NO);

        Doc.Add(KeyField);
        Writer.AddDocument(Doc);
    }

    public void Dispose()
    {
        Writer.Optimize();
        Writer.Close();
        Reader.Close();
    }
}

And here is the code that executes it all:

    private void btnRebuildIndex_Click(object sender, EventArgs e)
    {
        using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
        {
            SearchService.ClearIndex();
        }

        using (var SearchService = new SearchService(Application.StartupPath + @"\indexs\"))
        {
            Int32 BatchSize = 50;
            Int32 Current = 0;
            var TotalQuestions = SubmissionService.GetQuestionsCount();

            while (Current < TotalQuestions)
            {
                var Questions = SubmissionService.ListQuestions(Current, BatchSize, "Id", Qsparx.SortOrder.Asc);

                foreach (var Question in Questions)
                {
                    SearchService.AddToSearchIndex(Question.ToSearchableData());
                }

                Current += BatchSize;
            }
        }
    }

Why does Lucene complain about renaming the "deletable" file?

Comments (2)

最偏执的依靠 2024-09-18 22:32:39

Not sure why you are recreating the index every time. You can append to the existing index like this:

Writer = new IndexWriter(IndexFileLocation, Analyzer,false);

The false flag at the end tells the IndexWriter to open the index in append mode (i.e. not to overwrite it).
That might make your problem go away.
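To make the meaning of that last constructor argument concrete, here is a minimal sketch that assumes the same Lucene.Net 2.x-era API the question uses. The class and method names (IndexWriterModes, OpenForRebuild, OpenForAppend) are made up for illustration. Passing true starts a new, empty index in the directory, which may be all a full rebuild needs instead of deleting documents one by one; passing false appends to what is already there.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper, not part of Lucene.Net: it only names the two modes.
public static class IndexWriterModes
{
    // create == true: start a new, empty index, replacing any index
    // that already exists in the directory.
    public static IndexWriter OpenForRebuild(Directory directory, Analyzer analyzer)
    {
        return new IndexWriter(directory, analyzer, true);
    }

    // create == false: open the existing index and add to it.
    // This expects an index to be present in the directory already.
    public static IndexWriter OpenForAppend(Directory directory, Analyzer analyzer)
    {
        return new IndexWriter(directory, analyzer, false);
    }
}
```

With this split, a rebuild button would call OpenForRebuild once, add every document, and then close the writer, while incremental updates would use OpenForAppend.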

旧伤还要旧人安 2024-09-18 22:32:39

It turned out that if no index files exist yet, creating an IndexReader before the IndexWriter is not a good idea. I also realized that even though IndexWriter's AddDocument method has two overloads (one with and one without an Analyzer parameter), only the one with the Analyzer parameter works for me.
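One detail in the question's constructor may explain the second observation: the Analyzer field is passed to the IndexWriter constructor one line before it is assigned, so the writer is given null as its default analyzer. The sketch below is one possible arrangement, not a verified fix. It assumes the Lucene.Net 2.x-era API used in the question, and the class name RebuildableIndex and the rebuild parameter are invented for illustration. It assigns the analyzer first, never opens an IndexReader, and decides whether to create the index by checking IndexReader.IndexExists.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical replacement for the question's SearchService plumbing.
public class RebuildableIndex : IDisposable
{
    private readonly Directory directory;
    private readonly Analyzer analyzer;
    private readonly IndexWriter writer;

    public RebuildableIndex(string indexLocation, bool rebuild)
    {
        // Assign the analyzer before anything is constructed with it.
        analyzer = new StandardAnalyzer();

        // Same call as in the question: create the folder only if it is missing.
        bool directoryMissing = !System.IO.Directory.Exists(indexLocation);
        directory = FSDirectory.GetDirectory(indexLocation, directoryMissing);

        // Start from scratch when a rebuild is requested or no index exists yet;
        // otherwise append. No IndexReader is opened at any point.
        bool create = rebuild || !IndexReader.IndexExists(directory);
        writer = new IndexWriter(directory, analyzer, create);
    }

    public void Add(Document doc)
    {
        // The explicit-analyzer overload mentioned in the answer above.
        writer.AddDocument(doc, analyzer);
    }

    public void Dispose()
    {
        writer.Optimize();
        writer.Close();
        directory.Close();
    }
}
```

With something like this, the click handler in the question would need only one using block, opened with rebuild set to true, rather than a separate clearing pass followed by a second service instance.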
