Django Haystack exact filtering
I have a haystack search which has the following SearchIndex:
class GrantIndex(indexes.SearchIndex):
    """
    This provides the search index for the Grant application.
    """
    text = indexes.CharField(document=True, use_template=True)
    year = indexes.IntegerField(model_attr='year__year')
    date = indexes.DateField(model_attr='date')
    program = indexes.CharField(model_attr='program__area')
    grantee = indexes.CharField(model_attr='grantee')
    amount = indexes.IntegerField(model_attr='amount')

site.register(Grant, GrantIndex)
If I want the search to return only grants whose program is 'Health', I run the following query:
from haystack.query import SearchQuerySet

sqs = SearchQuerySet()
sqs = sqs.filter(program='Health')
Unfortunately, this also returns objects from the programs 'Health\Other' and 'Health\Cardiovascular'. How do I stop the search from letting those other programs in?
I run Ubuntu 9.10 with Xapian as my search back-end.
Comments (5)
You've probably solved the problem already, but I just stumbled over the same problem with the Whoosh backend. Maybe the Xapian and Whoosh backends behave the same? It seems Whoosh stems all CharFields by default and searches inside them with some kind of contains query. Switching to a custom backend without stemming enabled on CharFields fixed this issue for me.
Hopefully this will push someone else in the right direction.
You can use field lookups as described here.
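As a rough sketch of what a field lookup might look like for this question: Haystack's SearchQuerySet accepts Django-style lookups such as `__exact` in `filter()`. The helper name `filter_program_exact` is hypothetical, and whether the backend really treats the lookup as a whole-value match depends on the backend and its version, so treat this as something to try rather than a guaranteed fix.

```python
def filter_program_exact(sqs, name):
    """Narrow a SearchQuerySet-like object to an exact program match.

    `sqs` is assumed to be a haystack.query.SearchQuerySet (or anything
    with a compatible filter() method); `program` is the field name from
    the GrantIndex in the question.
    """
    # program__exact asks the backend for an exact match on the field,
    # instead of the default lookup used by filter(program=...).
    return sqs.filter(program__exact=name)


# Intended use inside the Django project (requires Haystack):
#   from haystack.query import SearchQuerySet
#   sqs = filter_program_exact(SearchQuerySet(), 'Health')
```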
Use "prepare_data" for the program field and get rid of the Health\blabla parts.
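One possible reading of this suggestion, sketched below: Haystack calls a `prepare_<fieldname>(obj)` method on the index, if one exists, when it builds the document, so a `prepare_program` method could rewrite the value before it is indexed. The mixin and the `normalize_program` helper are hypothetical names, and replacing the backslash with an underscore is only an assumption about what keeps the value together as one term; check how your backend tokenizes it.

```python
def normalize_program(area):
    """Turn a value like Health\\Other into a single token, Health_Other."""
    return area.replace("\\", "_")


class ProgramPrepareMixin(object):
    """Hypothetical mixin: list it before indexes.SearchIndex in GrantIndex."""

    def prepare_program(self, obj):
        # Called with the model instance being indexed; the return value
        # is stored in the `program` field instead of the raw attribute.
        return normalize_program(obj.program.area)
```

Queries would then need to filter on the normalized value as well, and the index has to be rebuilt for the change to take effect.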
For the Solr backend I need to use _exact (just one underscore instead of two).

Disclaimer: I'm the maintainer of Xapian-Haystack.
I believe this happens because Xapian-Haystack was using a term generator that was escaping special characters like /. So, in your case, "Health\Other" was being indexed as "health" and "other". This was recently fixed in the master branch of Xapian-Haystack, see e.g. here.
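To illustrate the effect described above, here is a small standalone sketch in plain Python. It is not Xapian-Haystack's code: the `tokenize` function is a stand-in that assumes the term generator splits on every non-alphanumeric character and lowercases the pieces, which is enough to show why a query for 'Health' also matches 'Health\Other'.

```python
import re


def tokenize(value):
    """Split on non-alphanumeric characters and lowercase each term."""
    return [term.lower() for term in re.split(r"[^0-9A-Za-z]+", value) if term]


def term_match(query, value):
    """Match when every query term is among the value's indexed terms."""
    indexed = set(tokenize(value))
    return all(term in indexed for term in tokenize(query))


def exact_match(query, value):
    """Match only when the whole stored value equals the query."""
    return query == value


programs = ["Health", "Health\\Other", "Health\\Cardiovascular", "Education"]

# Term-based matching lets the sub-programs through...
term_hits = [p for p in programs if term_match("Health", p)]
# ...while whole-value matching keeps only the 'Health' program.
exact_hits = [p for p in programs if exact_match("Health", p)]

print(term_hits)
print(exact_hits)
```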