Solr query based on a subset of a string field

Posted on 2024-12-07 07:18:40

I'd like to send a string to Solr and have it return all records whose field value is a subset of that string.

The string I would send contains integers separated by spaces. I want Solr to give me all records where a specific string field is a subset of the numbers I provide in the request string.

An example...

Imagine I have a string field indexed in Solr which is in reality a set of integers separated by spaces. For example, let's say the following field values are indexed in Solr, one per record:

  • "888110"
  • "888110 888120"
  • "888110 888120 888130"
  • "888110 888120 888130 888140"
  • "888110 888130 888140"
  • "888110 888140"
  • "888140"
  • "888120 888130"

I want Solr to receive a query such as "888110 888140" and reply with the following records:

  • "888110"
  • "888110 888140"
  • "888140"

If I query by "888110 888120 888130", the retrieved records would be:

  • "888110"
  • "888110 888120"
  • "888110 888120 888130"
  • "888120 888130"

Every retrieved record must contain only numbers that appear in the query string; a record containing any other number must not match.
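To pin down the matching rule, here is a minimal Python sketch of the intended semantics, run against the example values above. It does not talk to Solr; `parse_codes` and `subset_matches` are hypothetical helper names used only for illustration.

```python
from typing import FrozenSet, List

# The example field values from the question, one per record.
RECORDS = [
    "888110",
    "888110 888120",
    "888110 888120 888130",
    "888110 888120 888130 888140",
    "888110 888130 888140",
    "888110 888140",
    "888140",
    "888120 888130",
]


def parse_codes(text: str) -> FrozenSet[str]:
    """Split a space-separated string of integers into a set of tokens."""
    return frozenset(text.split())


def subset_matches(records: List[str], query: str) -> List[str]:
    """Return the records whose codes all appear in the query string."""
    allowed = parse_codes(query)
    # `<=` on sets is the subset test: every code in the record is allowed.
    return [r for r in records if parse_codes(r) <= allowed]


if __name__ == "__main__":
    for line in subset_matches(RECORDS, "888110 888140"):
        print(line)
```

Any Solr-side solution would need to reproduce this set-inclusion test, not merely an "any term matches" test.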

Is it possible to make Solr behave like this?

Comments (1)

掩耳倾听 2024-12-14 07:18:40

I'm a bit confused why in the first example "888110" is not returned, but it is in the second example.

Anyway, if I understand generally what you are trying to do, I would add a new multi-valued field and use boolean operators (AND, OR) in the query.

e.g. in the schema

       <field name="code_string" ... />
       <field name="codes" ... multiValued="true"/>

so you have a document like

<doc>
    <arr name="codes">
       <str>811001</str>
       <str>811002</str>
    </arr>
</doc>

and in your query

?q=codes:811001 OR codes:811002 OR ....
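Note that an OR query over the multi-valued field matches any document sharing at least one code with the request, so it can also return documents that carry extra codes. A minimal sketch of one way to handle that, assuming the results are filtered on the client afterwards: build the OR clause, then keep only documents whose codes are all in the request. The helper names are hypothetical, and the response documents are modeled as plain dicts in place of a real Solr client.

```python
from typing import Dict, List


def build_or_query(field: str, codes: List[str]) -> str:
    """Build a Solr standard-parser clause such as codes:(811001 OR 811002)."""
    return f"{field}:({' OR '.join(codes)})"


def strict_subset_filter(
    docs: List[Dict], codes: List[str], field: str = "codes"
) -> List[Dict]:
    """Keep only documents whose field values all appear in the requested codes."""
    allowed = set(codes)
    return [d for d in docs if set(d.get(field, [])) <= allowed]


if __name__ == "__main__":
    requested = ["811001", "811002"]
    print(build_or_query("codes", requested))

    # Hypothetical documents, standing in for what the OR query might return.
    returned = [
        {"id": "a", "codes": ["811001"]},
        {"id": "b", "codes": ["811001", "811003"]},  # extra code, not a subset
        {"id": "c", "codes": ["811001", "811002"]},
    ]
    for doc in strict_subset_filter(returned, requested):
        print(doc["id"])
```

The post-filter keeps the Solr side simple at the cost of transferring some documents that are then discarded.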

In my experience with Solr, it is generally cleaner and more maintainable to sacrifice a little memory than to create debilitatingly complex chains of filters.
