Haystack Whoosh not indexing everything

Posted on 2024-10-30 19:16:35


I am using Haystack v1.0 and Whoosh v1.8.1 to build a customized search engine for my website. Everything works well, except that searches return no results for many entries in my indexed models.

For example, I have four registered models: Member, Guest, Event, and Sponsor. When I rebuild the index, I get the following:

./manage.py rebuild_index

Indexing 26 members.  
Indexing 3 events.
Indexing <x> guests.  
Indexing <y> sponsors.  

But when I run SearchQuery API commands, or search through the search page, about half of the member names cannot be found. What puzzles me is why 14-15 members are searchable while the rest are not. My *_text.txt* template file should be correct, since half of the members are indexed correctly.

You can try this:
http://www.edciitr.com/search/?q=x
x = Vikesh returns 1 result (as expected)
x = Akshit returns no results (the problem!)

Both values, 'Akshit' and 'Vikesh', were present before running rebuild_index. Here is the list of all 26 members that I am trying to search: http://www.edciitr.com/contact/

Comments (2)

乜一 2024-11-06 19:16:35

Okay, here is what I did to find out whether the problem is in Whoosh or in Haystack. I opened the Django shell and queried the Whoosh index directly for a term that was not showing up in searches through the Haystack SearchQuery API:

./manage.py shell
>>> import whoosh
>>> from whoosh.query import *
>>> from whoosh.index import open_dir
>>> ix = open_dir('/home/somedir/my_project/haystack/whoosh/')
>>> ix.schema
<Schema: ['branch', 'category', 'coordinator', 'date_event', 'designation', 'details', 'django_ct', 'django_id', 'name', 'organisation', 'overview', 'text', 'title']>
>>> searcher = ix.searcher()
>>> res = searcher.search(Term('text', u'akshit'))
>>> print res
<Top 1 Results for Term('text', 'akshit') runtime=0.000741004943848>
>>> print res[0]['name']
u'Akshit Khurana'

So Whoosh is indexing all the data correctly. Next, I tried the same term through the SearchQuery API:

./manage.py shell
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().filter(content='akshit')
>>> sqs
[]
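
To find out exactly which names are affected, rather than testing them one at a time, the two lookups above can be compared in a loop. The following is only a minimal sketch and was not part of the original session: `find_missing_terms` and its two callables are hypothetical names, and the callables are stand-ins that you would have to wire to the Whoosh searcher and to `SearchQuerySet` yourself.

```python
def find_missing_terms(terms, search_raw, search_haystack):
    """Return terms found by the raw index lookup but not through Haystack.

    search_raw and search_haystack are hypothetical callables: each takes a
    term and returns a sized collection of hits (empty when nothing matches).
    """
    missing = []
    for term in terms:
        raw_hits = search_raw(term)
        haystack_hits = search_haystack(term)
        # Suspicious case: the index contains the term, but Haystack hides it.
        if len(raw_hits) > 0 and len(haystack_hits) == 0:
            missing.append(term)
    return missing


if __name__ == "__main__":
    # Made-up stand-in data that mimics the symptom (not real output).
    raw = {"akshit": ["doc1"], "vikesh": ["doc2"]}
    visible = {"vikesh": ["doc2"]}
    print(find_missing_terms(
        ["akshit", "vikesh", "nobody"],
        lambda term: raw.get(term, []),
        lambda term: visible.get(term, []),
    ))
```

In a Django shell, `search_raw` could presumably wrap `searcher.search(Term('text', term))` and `search_haystack` could wrap `SearchQuerySet().filter(content=term)`, both taken from the sessions above.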

So I realized I had to look at the whoosh_backend.py file of the Haystack library to see what was happening. In haystack.backends.whoosh_backend, around line 345, change

# Comment out these two lines, because the raw_results set becomes empty after the filter call for some queries
if narrowed_results:
    raw_results.filter(narrowed_results)

to

#if narrowed_results:
    #raw_results.filter(narrowed_results)

And then it works. The SearchQuery API returns exactly one result for the test query, as expected, and the web search works too, but I would still like to know what the issue with Haystack is here.
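
As for why those two lines matter: they intersect the main hits with the hits of a second, "narrowing" query. One possible explanation, which is an assumption and not something confirmed in this thread, is that the narrowing query's result set is capped at some number of hits, so any document outside that cap is discarded by the intersection even though it matches the main query. The toy model below uses plain Python lists and sets; `search`, `filtered_search`, and the `narrow_limit` parameter are hypothetical and do not reproduce Haystack's or Whoosh's actual code.

```python
def search(docs, predicate, limit=None):
    """Return ids of docs matching predicate, optionally capped at `limit` hits."""
    hits = [doc_id for doc_id, doc in sorted(docs.items()) if predicate(doc)]
    return hits if limit is None else hits[:limit]


def filtered_search(docs, term, model, narrow_limit=None):
    """Run the main query, then keep only hits also found by the narrowing query."""
    main_hits = search(docs, lambda d: term in d["text"])
    narrowed = set(search(docs, lambda d: d["model"] == model, limit=narrow_limit))
    # Plays the role of raw_results.filter(narrowed_results) in this toy model.
    return [doc_id for doc_id in main_hits if doc_id in narrowed]


# 26 made-up member documents; only the last one contains the search term.
docs = {i: {"model": "member", "text": "member%d" % i} for i in range(26)}
docs[25]["text"] = "akshit"

print(filtered_search(docs, "akshit", "member"))                   # uncapped narrowing
print(filtered_search(docs, "akshit", "member", narrow_limit=10))  # capped narrowing
```

With uncapped narrowing the matching document survives the intersection (`[25]`); with the cap it is lost (`[]`), which resembles the empty raw_results described above. If something like this is the cause, commenting the two lines out also switches off the narrowing itself, so it is better seen as a workaround than as a fix.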

病毒体 2024-11-06 19:16:35

I have a similar symptom. This is the question I asked: "Django django-haystack cannot import CategoryBase from django-categories on the first run".

It might relate to your problem too.
