Solr geographic hierarchy

Posted on 2024-10-11 20:03:20

I've been trying to find a way to implement hierarchical faceting in Solr and can't work out how to do it in my situation. I've read a couple of the articles on hierarchies in Solr, along with the solutions in patches SOLR-64 and SOLR-792. The main issue is that my entities can belong to multiple branches of the hierarchy. My data currently takes the form of a user document with multivalued attributes (MVAs) for country, state, and city.

Take, for instance, a geographical hierarchy in which people can list their services down to the city level. A person may service all of Alabama but only certain towns in Georgia. The facet count at the state level counts the distinct individuals who service an area, which is 1 for Alabama and 1 for Georgia. When expanded down to the city level, each city has its own count (in other words, the sum of the city counts won't necessarily equal the state count, since the counts are not mutually exclusive).

US (1)
    Georgia (1)
        Atlanta (1)
        Columbus (0)
        Athens (0)
    Alabama (1)
        Mobile (1)
        Birmingham (1)
        Huntsville (1)


The part I'm getting hung up on is that when I facet on the cities, I have no way of knowing which state each one belongs to, since the user is listed in both Alabama and Georgia, and I can't figure out a way to relate the attributes to each other. SOLR-64 might work if it supports multiple branches such as US/Alabama/Mobile/ and US/Georgia/Atlanta/ for the same document. So far, though, I haven't been able to get it to compile against the nightly build, so I'm kind of stuck.
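To make the problem concrete, here is a minimal sketch of the data shape described above. The field names (`country`, `state`, `city`) and the sample user are assumptions for illustration, and the counting only mimics what a plain field facet reports; it is not Solr code.

```python
from collections import Counter

# Hypothetical user document: three independent multivalued fields.
user_doc = {
    "id": "user-1",
    "country": ["US"],
    "state": ["Alabama", "Georgia"],
    "city": ["Mobile", "Birmingham", "Huntsville", "Atlanta"],
}


def facet_counts(docs, field):
    """Count how many documents carry each value of a multivalued field."""
    counts = Counter()
    for doc in docs:
        # A document contributes at most once per distinct value.
        counts.update(set(doc.get(field, [])))
    return counts


state_counts = facet_counts([user_doc], "state")
city_counts = facet_counts([user_doc], "city")
print(state_counts)
print(city_counts)
```

Nothing in `city_counts` says that Atlanta belongs to Georgia rather than Alabama; the flat fields have dropped that link.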

Does anyone have a better way of doing this?

Comments (4)

往日 2024-10-18 20:03:20

See the first use case described here. (It requires client-side processing for both indexing and querying!)

Category navigation

The problem: you have a tree of categories, and each product can belong to several of those categories.

There are two relatively similar solutions to this problem. I will describe one of them:

  • Create a multivalued string field called 'category'. Use the category id (or the name, if you want to avoid DB queries).
  • You have a category tree. Make sure a document gets not only its leaf category but also every category on the path up to the root node.
  • Now facet over the category field with -1 as the limit (facet.limit=-1).
  • But what if you want to display the categories of only one level, e.g. because you don't need the other levels at that point or because there are too many of them?

    Then index the category field values as <level>_category. For that you will need the complete category tree in RAM while indexing. Then use facet.prefix=<level>_ to restrict the category list to that level.

  • Clicking on a category entry should result in a filter query such as fq=category:"<level>_categoryId"
  • The slightly tricky part is that your UI or middle tier has to parse the level, e.g. 2, and then append 2+1=3 to the query: facet.prefix=3_
  • If you filter by level, one question remains:

    Q: How can you display the path from the selected category up to the root category?

    A: Either get the category parents from the DB, which is easy if you store the category ids in Solr rather than the category names,
    or get the parents from the parameter list, which is a bit more complicated but doable. In that case you'll need to store the category names in Solr.
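The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names are made up, levels are counted from 1 at the root, and using the slash-joined path as the category id is my choice (the text only says "category id"). The request parameters (`facet.field`, `facet.limit`, `facet.prefix`, `fq`) are standard Solr ones.

```python
from urllib.parse import urlencode


def category_tokens(path):
    """Return one '<level>_<id>' token per node from the root down to the leaf.

    `path` is a list such as ["US", "Georgia", "Atlanta"]. Using the
    slash-joined path as the category id is an assumption of this sketch.
    """
    return [f"{level}_{'/'.join(path[:level])}" for level in range(1, len(path) + 1)]


def index_values(paths):
    """Merge the tokens of every branch a document belongs to, without duplicates."""
    tokens = []
    for path in paths:
        for token in category_tokens(path):
            if token not in tokens:
                tokens.append(token)
    return tokens


def drill_down_params(selected_token=None):
    """Build query parameters that facet on the level below the selected token."""
    params = [("q", "*:*"), ("facet", "true"),
              ("facet.field", "category"), ("facet.limit", "-1")]
    if selected_token is None:
        next_level = 1  # nothing selected yet: show the root level
    else:
        level = int(selected_token.split("_", 1)[0])
        next_level = level + 1
        params.append(("fq", f'category:"{selected_token}"'))
    params.append(("facet.prefix", f"{next_level}_"))
    return params


# A document that sits on two branches of the hierarchy.
values = index_values([["US", "Alabama", "Mobile"], ["US", "Georgia", "Atlanta"]])
print(values)
print(urlencode(drill_down_params("2_US/Georgia")))
```

With path-style ids, the prefix could presumably be narrowed further, e.g. to `3_US/Georgia/`, so that cities from a document's other branches do not show up under the selected state. That is an extension of the scheme, not part of the description above.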

浮华 2024-10-18 20:03:20

I am not that familiar with your problem, but it seems you need to group by city and state.

Have a look at Solr's group-by feature, called field collapsing (http://wiki.apache.org/solr/FieldCollapsing).

Also have a look at bobo-browse, specifically its CompositeFacetHandlers (http://code.google.com/p/bobo-browse/wiki/CompositeFacetHandlers). bobo-browse can be integrated into Solr (http://code.google.com/p/bobo-browse/wiki/SolrIntegration).

樱花落人离去 2024-10-18 20:03:20

Check out Pivot (i.e. Decision Tree) Faceting: http://wiki.apache.org/solr/SimpleFacetParameters#Pivot_.28ie_Decision_Tree.29_Faceting

It is supported as of Solr 4.0.
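As a rough illustration, the snippet below builds a request for a state-then-city pivot and then simulates in memory what such a pivot counts. The field names `state` and `city` and the sample document are assumptions, and the simulation is my own approximation of pivot counting, not Solr code.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Request parameters for a state -> city pivot (field names are assumed).
params = [("q", "*:*"), ("rows", "0"), ("facet", "true"), ("facet.pivot", "state,city")]
print(urlencode(params))

# One hypothetical user who serves a city in each of two states.
docs = [{"state": ["Alabama", "Georgia"], "city": ["Mobile", "Atlanta"]}]


def simulate_pivot(docs, outer, inner):
    """For each outer value, count inner values among documents that contain it."""
    tree = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        for outer_value in set(doc.get(outer, [])):
            for inner_value in set(doc.get(inner, [])):
                tree[outer_value][inner_value] += 1
    return {key: dict(value) for key, value in tree.items()}


pivot = simulate_pivot(docs, "state", "city")
print(pivot)
```

As far as I can tell, a pivot over multivalued fields counts values that co-occur in the same document, so Atlanta also appears under Alabama for a user who serves both states. Indexing one document per service area, or a combined state/city value, would avoid that.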

铜锣湾横着走 2024-10-18 20:03:20

Assuming each document in your index represents a single service...

For the city, manufacture a field that is the state concatenated with the city, using a delimiter of some sort. This field never has to be displayed to the user; it can sit alongside a separate field, stored but not indexed, that holds the real name of the city.

For example you could have a city_facet field with values of:

  • "Ohio - Miami"
  • "Florida - Miami"

You probably want to pick a safer delimiter. I imagine a hyphen could conflict with place names that contain one.
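A minimal sketch of that idea, assuming a pipe as the delimiter and made-up facet counts; the helper names are hypothetical and the second function only shows how a client might rebuild the state/city tree from the facet values it gets back.

```python
from collections import defaultdict

DELIMITER = "|"  # assumed delimiter; pick one that cannot occur in a state or city name


def city_facet_value(state, city):
    """Join state and city into a single facet token for the city_facet field."""
    if DELIMITER in state or DELIMITER in city:
        raise ValueError("delimiter occurs in a name")
    return f"{state}{DELIMITER}{city}"


def group_by_state(facet_counts):
    """Turn {'Ohio|Miami': 2, ...} into {'Ohio': {'Miami': 2}, ...}."""
    tree = defaultdict(dict)
    for token, count in facet_counts.items():
        state, city = token.split(DELIMITER, 1)
        tree[state][city] = count
    return dict(tree)


tokens = [city_facet_value("Ohio", "Miami"), city_facet_value("Florida", "Miami")]
print(tokens)

# Made-up counts standing in for a facet response on city_facet.
grouped = group_by_state({"Ohio|Miami": 2, "Florida|Miami": 5})
print(grouped)
```

Because each token carries its own state, two cities with the same name stay separate, and a city can be attributed to the right state even when a document lists several states.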
