Using a multivalued field in the map function

Posted on 2024-12-03 09:43:51

I'm implementing Solr in a project, and right now I'm stuck on a specific search that involves a multivalued (arr) field. The thing is:

I'd like to search for sub-IDs on an object. These sub-IDs are stored in a multivalued field, e.g.:

<arr name="SubIds">
   <int>12272</int>
   <int>12304</int>
   <int>12306</int>
</arr>

The query (or part of the query) that I want to use is as follows:
map(SubIds,i,i,1,0)

When I, for example, put 12304 in place of 'i' in the map function above, I expect the function to return 1. If I enter 12345, it should return 0. The problem is that when I run this query with 12304, it returns 0, as if to say "There's no number 12304 in this field, so I return 0".

When I remove the 0 (the default argument) from the map function, I can see the actual value being compared: the function returns 1 when the value is 12304, and the value itself otherwise. In this case that's 12306! I've tried this with several different multivalued fields, but the result is the same: it looks like the function compares only the last value in the multivalued field against the ID I fill in.

Is this true? And if so, is there any way to look through the whole arr and return 0 only when the value doesn't exist anywhere in the multivalued field?

Edit: It's just a hunch, but could it be that the map() function automatically orders the arr list when it sees that all the items are of type int (for example)? That could mean that map() uses the first number after ordering (the highest), which in my example would be 12306, not 12304.
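
To make the behaviour described above concrete, here is a small plain-Python model of map(x,min,max,target,default) as I understand it from the Solr function query documentation. The helper name solr_map and the idea that only the last value of the field reaches the function are assumptions taken from my own observations, not confirmed Solr internals:

```python
def solr_map(x, lo, hi, target, default=None):
    """Model of map(x,min,max,target[,default]) for ONE numeric value.

    Values in [lo, hi] become target; anything else becomes default,
    or stays as x when no default is given.
    """
    if lo <= x <= hi:
        return target
    return x if default is None else default


sub_ids = [12272, 12304, 12306]

# Assumption (my hypothesis): the function sees a single value, here the last.
seen_by_function = sub_ids[-1]

with_default = solr_map(seen_by_function, 12304, 12304, 1, 0)
without_default = solr_map(seen_by_function, 12304, 12304, 1)

# What I actually want: a membership test over ALL values in the field.
wanted = 1 if 12304 in sub_ids else 0

print(with_default, without_default, wanted)
```

Under that assumption the model reproduces both symptoms: 0 with the default argument and 12306 without it, while the result I am after is 1.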

Thanks!

Comments (1)

一身软味 2024-12-10 09:43:51

... It looks like function queries don't work with multivalued fields ...

http://lucene.472066.n3.nabble.com/Using-multivalued-field-in-map-function-td3318843.html#a3322023:

Function queries don't work with multivalued fields.
http://wiki.apache.org/solr/FunctionQuery#Vector_Functions

Given the following case, does anybody have a better idea of how I can query the data I want?

I've got a website full of blog posts, and every blog post has an owner,
who is referred to through his/her ID. For example: BloggerId
= 123. A blog can also have multiple co-writers, who
are also referred to by their BloggerId, but these IDs are stored in
the multivalued field (SubIds in my previous example).

When searching for a specific blogger, one searches on the BloggerId.
Search results are influenced by a number of variables: the
country/state/more specific geographical data, the blog category, etc.
For this I use a faceted query. Next, I want to make some results more
important, depending on the BloggerId. I tried to do this with the
following query:

?q={!func}map(sum(map(BloggerId,12304,12304,2,0),map(BloggerId,12304,12304,1,0)),3,3,2)&fl=*,score&facet.field=Country&f.Country.facet.limit=6&facet.field=State&fq=(BlogCategory:internet%20OR%20BlogCategory:sports&sort=score%20desc,Top%20desc,%20SortPriority%20asc&start=0&omitHeader=true

In the resulting list, blogs written by BloggerId 12304 should be at
the top of the list, followed by the blogs where BloggerId 12304 was
a co-writer. After that come all other blogs that match the criteria but
aren't written (or co-written) by BloggerId 12304.
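
The ordering described above can be sketched outside Solr as a three-tier sort key. This is only an illustration of the intended result, not a Solr query; the Blog class, the tier function and the sample data are hypothetical names made up for the example:

```python
from dataclasses import dataclass


@dataclass
class Blog:
    title: str
    blogger_id: int       # owner, single-valued (BloggerId)
    sub_ids: tuple = ()   # co-writers, multivalued (SubIds)


def tier(blog, wanted_id):
    """2 = written by wanted_id, 1 = co-written by wanted_id, 0 = neither."""
    if blog.blogger_id == wanted_id:
        return 2
    if wanted_id in blog.sub_ids:
        return 1
    return 0


blogs = [
    Blog("unrelated post", 555, (777,)),
    Blog("co-written post", 123, (12272, 12304, 12306)),
    Blog("owned post", 12304),
    Blog("another unrelated post", 999),
]

# sorted() is stable, so blogs within the same tier keep their original order.
ranked = sorted(blogs, key=lambda b: tier(b, 12304), reverse=True)
titles = [b.title for b in ranked]
print(titles)
```

The middle tier is the part that needs a membership test over every value in SubIds, which is exactly what map() on a multivalued field does not provide.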

Maybe I could make this multivalued field a string field (where IDs are separated by ";") and query my value, but if anyone has a better idea, you're always welcome!

In the end I chose to add a string-valued field that uses whitespace to separate the different values. After that I used the solr.WhitespaceTokenizerFactory class to quickly scan the string for occurrences of a specific ID.
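
As a rough illustration of why the whitespace-separated field works, here is a plain-Python model of the idea. It is not Solr code: the helper names are hypothetical, and str.split() only stands in for what the whitespace tokenizer does at index and query time:

```python
def to_indexed_text(sub_ids):
    """Join the IDs into one space-separated string, as stored in the extra field."""
    return " ".join(str(i) for i in sub_ids)


def whitespace_tokens(text):
    """Stand-in for whitespace tokenization: split on runs of whitespace."""
    return text.split()


def has_id(text, wanted_id):
    """True only if wanted_id appears as a whole token, not as a substring."""
    return str(wanted_id) in whitespace_tokens(text)


field_value = to_indexed_text([12272, 12304, 12306])

print(field_value)
print(has_id(field_value, 12304))
print(has_id(field_value, 12345))
print(has_id("112304 99", 12304))
```

Matching on whole tokens matters here: a naive substring check would wrongly find 12304 inside 112304, whereas a token match does not.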
