Django Haystack exact filtering

Posted on 2024-08-20 04:29:50

I have a haystack search which has the following SearchIndex:

class GrantIndex(indexes.SearchIndex):
    """
    This provides the search index for the Grant application.
    """
    text = indexes.CharField(document=True, use_template=True)
    year = indexes.IntegerField(model_attr='year__year')
    date = indexes.DateField(model_attr='date')
    program = indexes.CharField(model_attr='program__area')
    grantee = indexes.CharField(model_attr='grantee')
    amount = indexes.IntegerField(model_attr='amount')
site.register(Grant, GrantIndex)

To keep only grants whose program is 'Health' (filtering out every other program), I run the following query:

from haystack.query import SearchQuerySet

sqs = SearchQuerySet()
sqs = sqs.filter(program='Health')

Unfortunately, this also produces objects from the program 'Health\Other' and 'Health\Cardiovascular'. How do I stop the search from allowing those other programs in?

I run Ubuntu 9.10 with Xapian as my search back-end.

Comments (5)

浮光之海 2024-08-27 04:29:50

You've probably solved the problem already, but I just stumbled over the same problem with the Whoosh backend. Maybe the Xapian and Whoosh backends behave the same? It seems Whoosh stems all CharFields by default and searches inside them with some kind of contains-query. Switching to a custom backend, without stemming enabled on CharFields, fixed this issue for me.

Hopefully this will push someone else in the right direction.

孤星 2024-08-27 04:29:50

You can use a field lookup, in this case __exact:

sqs = sqs.filter(program__exact='Health')

郁金香雨 2024-08-27 04:29:50

Use "prepare_data" for the program field and get rid of the Health\blabla parts.
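
One way to read this suggestion, sketched under assumptions: normalize each program value into a single separator-free token before it is indexed, so that 'Health' and 'Health\Other' can no longer share a term. The helper name normalize_program is hypothetical. In Haystack the natural place to call it would probably be a prepare_program(self, obj) method on the index (the prepare_<fieldname> hook), and the same normalization would have to be applied to the value passed to filter(). Whether the backend still stems the resulting token is not covered here.

```python
import re


def normalize_program(area):
    """Collapse a program area into one lowercase token with no separators.

    Hypothetical helper: meant to be called from a prepare_program() hook
    at index time and on the query value at search time.
    """
    # Drop everything that is not an ASCII letter or digit, including the backslash.
    return re.sub(r"[^0-9a-z]+", "", area.lower())


areas = ["Health", "Health\\Other", "Health\\Cardiovascular"]
tokens = [normalize_program(a) for a in areas]

# Each area now maps to its own single token, so comparing tokens is exact.
matches = [a for a, t in zip(areas, tokens) if t == normalize_program("Health")]
print(matches)
```

With this normalization only the plain 'Health' program compares equal to the query value. The sub-programs become different tokens instead of sharing the term 'health'.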

赴月观长安 2024-08-27 04:29:50

With the Solr backend I needed to use _exact (a single underscore instead of two).

国际总奸 2024-08-27 04:29:50

Disclaimer: I'm the maintainer of Xapian-Haystack.

I believe this happens because Xapian-Haystack was using a term generator that was escaping special characters like /.

So, in your case, "Health\Other" was being indexed as "health" and "other". This was recently fixed in the master branch of Xapian-Haystack, see e.g. here.
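
The effect described above can be imitated in plain Python. This is only a rough sketch, not Xapian's actual term generator: split_terms and term_match are hypothetical stand-ins, and the 'Education' entry is made-up sample data.

```python
import re


def split_terms(value):
    """Rough stand-in for a term generator: lowercase, split on non-word characters."""
    return re.findall(r"\w+", value.lower())


def term_match(query, value):
    """True if every term of the query occurs among the terms of the value."""
    return set(split_terms(query)) <= set(split_terms(value))


programs = ["Health", "Health\\Other", "Health\\Cardiovascular", "Education"]

# Term-based matching: the backslash splits the value, so all Health programs match.
loose = [p for p in programs if term_match("Health", p)]

# Whole-value comparison: only the exact program matches.
strict = [p for p in programs if p == "Health"]

print(loose)
print(strict)
```

Because "Health\Other" is broken into the terms "health" and "other", a query for the single term "health" finds it. Comparing the unsplit value does not.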
