Can I search Solr documents by the members of a multi-valued field?
I have a set of Solr documents containing (among other fields) multi-value fields with percentage data or -1 if the value is null, e.g.
<doc>
...
<arr name="alpha">
<float>0.23</float>
<float>0.23</float>
<float>0.43</float>
</arr>
<arr name="beta">
<float>0.52</float>
<float>-1.0</float>
<float>0.34</float>
</arr>
<arr name="gamma">
<float>-1.0</float>
<float>-1.0</float>
<float>-1.0</float>
</arr>
...
</doc>
I need to find documents where a multi-value field contains or doesn't contain a certain member for a complete set of test cases. If I can get either of the queries below to work, it would be a tremendous help to locate a particular document out of several hundred thousand:
1) Can I find a document where none of the members of a specific multi-value field meet a certain criterion? (The above doc would be returned if I queried for "alpha has no members matching -1".)
2) Can I find a document where at least one of the members of a specific multi-value field meets a certain criterion? (The above doc would be returned if I queried for "alpha has at least one member > 0" or "beta has at least one member > 0".)
I'm assuming that a query like alpha:[0 TO 1] doesn't work because the field is an array instead of a scalar. A definitive answer of "This is impossible" is just as useful as an answer of "Here's how you do it" -- thanks in advance.
EDIT: As with so many problems, the answer is "recheck your assumptions" -- specifically, the developer who generated our documents turned off indexing on the percentage fields.
Comments (2)
Yes.
-alpha:"-1.0"
handles your first case. Your own example,
alpha:[0 TO 1]
is the solution to the second. Put simply, this works because each field is neither a single value nor an array, but a vector of terms. Querying a field for a certain term is a request for inclusion (or exclusion), not an equality operation.
The array you are referring to is part of the result set: plain stored data that Solr returns along with the search results.
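To make the inclusion/exclusion idea concrete, here is a small Python model of how the two queries behave against the sample document from the question. It is only a sketch of the matching semantics, not how Solr works internally; the helper names any_in_range and none_equal are made up for illustration.

```python
from typing import Dict, List

# A document modelled as field name -> list of indexed values.
Doc = Dict[str, List[float]]


def any_in_range(doc: Doc, field: str, low: float, high: float) -> bool:
    """Models alpha:[0 TO 1]: true if at least one value lies in the inclusive range."""
    return any(low <= value <= high for value in doc.get(field, []))


def none_equal(doc: Doc, field: str, target: float) -> bool:
    """Models -alpha:"-1.0": true if no value of the field equals the target."""
    return all(value != target for value in doc.get(field, []))


# The sample document from the question.
sample_doc: Doc = {
    "alpha": [0.23, 0.23, 0.43],
    "beta": [0.52, -1.0, 0.34],
    "gamma": [-1.0, -1.0, -1.0],
}

if __name__ == "__main__":
    for name in ("alpha", "beta", "gamma"):
        print(
            name,
            "any in [0, 1]:", any_in_range(sample_doc, name, 0.0, 1.0),
            "| none equal to -1:", none_equal(sample_doc, name, -1.0),
        )
```

Under this model alpha satisfies both conditions, beta satisfies only the range condition, and gamma satisfies neither, which matches what the question expects for the sample document.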
It is certainly possible.
I usually use the fq (filter query) parameter for this kind of restriction:
http://wiki.apache.org/solr/CommonQueryParameters#fq
But you can just throw it on the query as well.
Solution for #1: filter out anything that has alpha equal to -1.0.
I am not sure about solution #2. Have you tried the code you mentioned?
I don't have a good sample dataset to test on.
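For anyone who wants to try both queries as filter queries, the following Python sketch builds the request URLs. The host, port, and core name in BASE_URL are placeholders (assumptions, not taken from the question), and the query strings are the ones quoted elsewhere in this thread. As the question's EDIT notes, none of this matches anything unless the fields are actually indexed.

```python
from typing import List
from urllib.parse import urlencode

# Hypothetical Solr endpoint: adjust host, port, and core name to your setup.
BASE_URL = "http://localhost:8983/solr/mycore/select"


def build_select_url(query: str, filters: List[str]) -> str:
    """Return a select URL with one q parameter and one fq parameter per filter."""
    params = [("q", query)] + [("fq", f) for f in filters]
    # urlencode percent-escapes the colons, quotes, brackets, and spaces.
    return BASE_URL + "?" + urlencode(params)


# Case 1: documents where no alpha value is -1.0.
url_no_missing = build_select_url("*:*", ['-alpha:"-1.0"'])

# Case 2: documents where at least one alpha value is in the inclusive range 0 to 1.
url_any_in_range = build_select_url("*:*", ["alpha:[0 TO 1]"])

if __name__ == "__main__":
    print(url_no_missing)
    print(url_any_in_range)
```

Putting the restriction in fq rather than q keeps it out of relevance scoring, but as noted above the same clauses can go directly into q.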