Converting from Thinking Sphinx to Solr Sunspot

Posted on 2024-12-19 22:06:53

We are converting from Thinking Sphinx to Sunspot for a few reasons. I have to rewrite the search logic, but I am not sure how to convert the following:

I want to convert the :any match mode to Sunspot. This means that not all of the keywords need to be present for an object to match: any one of the keywords will do, and the results are ordered by relevance. However, I can't find an equivalent in the Sunspot documentation.

# Thinking Sphinx
search_result = Business.search([attributes[:name], attributes[:address]], match_mode: :any)

I am also looking for an easy way to add stopwords to Solr through Sunspot. Thinking Sphinx lets you specify them in its yml file, but there is no equivalent setting in the Sunspot::Rails yml.

Comments (1)

定格我的天空 2024-12-26 22:06:53

Minimum match in Sunspot

not all of the keywords need to be present for the object to be a match

In Solr, this is the "minimum should match" (mm) concept. It is covered in some previous answers, a blog article of mine, and the Solr wiki docs.

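A query similar to the one you've written above might look like this:

@search = Business.search do
  # query_string holds the user's keywords; :minimum_match sets Solr's minimum-should-match
  fulltext query_string, :minimum_match => 0
end
# Matching records, ordered by relevance
@businesses = @search.results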
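
To make the "minimum should match" idea concrete without a running Solr instance, here is a toy sketch in plain Ruby. It is not how Solr scores documents: it only counts distinct matching terms. The BUSINESSES data and the tokenize and search_any helpers are hypothetical names made up for this illustration, and the sketch assumes that a record matching none of the terms is never returned.

```ruby
# Toy illustration of "minimum should match"; NOT Solr's actual scoring.
# BUSINESSES, tokenize and search_any are hypothetical names for this sketch.
BUSINESSES = [
  { name: "Joe's Pizza",      address: "12 Main Street" },
  { name: "Main Street Cafe", address: "40 Main Street" },
  { name: "Harbor Books",     address: "7 Dock Road" }
].freeze

# Split text into lowercase word tokens.
def tokenize(text)
  text.downcase.scan(/[a-z0-9']+/)
end

# Return records that match at least `minimum_match` distinct query terms,
# ordered by the number of distinct terms matched (most first).
# Assumption of this sketch: at least one term must always match.
def search_any(records, query, minimum_match: 1)
  terms = tokenize(query).uniq
  required = [minimum_match, 1].max

  scored = records.each_with_index.map do |record, index|
    tokens = tokenize("#{record[:name]} #{record[:address]}")
    { record: record, hits: (terms & tokens).size, index: index }
  end

  scored.select { |s| s[:hits] >= required }
        .sort_by { |s| [-s[:hits], s[:index]] } # ties keep input order
        .map { |s| s[:record] }
end

# "Any keyword" behaviour: one matching term is enough.
any_match = search_any(BUSINESSES, "main street pizza", minimum_match: 0)
# Stricter behaviour: all three terms are required.
all_match = search_any(BUSINESSES, "main street pizza", minimum_match: 3)
```

Raising minimum_match narrows the result set toward "all keywords required", while a low value gives the :any style of matching asked about in the question.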

Stopwords

For stopwords, I would start by recommending that you not use them. Solr's DisMax algorithm should do a sufficient job of ignoring common terms for the purpose of sorting results. The only time I have ever really needed stopwords was when generating word clouds by faceting on text fields.

If you really do need stopwords, add the StopFilterFactory to your text field's analyzer block in schema.xml. Then create a corresponding stopwords.txt file in your solr/conf directory (the same directory as your schema.xml).

(Sunspot configs should actually come with a sample stopwords.txt file by default.)
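
As a rough picture of what a stop filter does to text during analysis, here is a small plain-Ruby sketch. It does not touch Solr or Sunspot. The STOPWORDS_TXT contents and the load_stopwords and analyze helpers are hypothetical, and the file layout (one word per line, '#' for comments) is an assumption about how a stopwords.txt file is typically written.

```ruby
require "set"

# Hypothetical stop list, standing in for the contents of a stopwords.txt file.
# Assumed layout: one word per line, lines starting with '#' are comments.
STOPWORDS_TXT = <<~WORDS
  # common English function words
  a
  and
  of
  the
WORDS

# Parse the stop list into a Set of lowercase words, skipping blanks and comments.
def load_stopwords(text)
  words = text.each_line.map(&:strip).reject do |line|
    line.empty? || line.start_with?("#")
  end
  Set.new(words.map(&:downcase))
end

# Tokenize and lowercase the text, then drop every token found in the stop list.
def analyze(text, stopwords)
  text.downcase.scan(/[a-z0-9']+/).reject { |token| stopwords.include?(token) }
end

stopwords = load_stopwords(STOPWORDS_TXT)
tokens = analyze("The House of the Rising Sun", stopwords)
```

In this sketch only the remaining tokens would be indexed or searched, which is why a stop list mostly matters for cases like the word clouds mentioned above.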
