Faceting on dynamic fields with Apache Solr

Posted on 2024-12-05 15:41:03

I have defined a dynamic field in Apache Solr and use it to store product features such as color_feature, diameter_feature, material_feature and so on. The number of these fields is not constant because the products keep changing.

Is it possible to get facet results for all of those dynamic fields with a single query, or do I always need to list every field in the query, like ... facet.field=color_feature&facet.field=diameter_feature&facet.field=material_feature&facet.field=...

Comments (2)

惜醉颜 2024-12-12 15:41:03

Solr currently does not support wildcards in the facet.field parameter, so *_feature won't work for you.

You may want to check on this: https://issues.apache.org/jira/browse/SOLR-247

If you don't want to pass these parameters with every request, you can add them to your request handler defaults. A request made with qt=requesthandler would then always include these facets.
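
Since every field has to be named explicitly, the facet.field list can also be generated by the client instead of being written by hand. Below is a minimal Python sketch of that idea. The helper name build_facet_query and the hard-coded list of field names are assumptions made for illustration; how the application learns which *_feature fields exist is left open.

```python
from urllib.parse import urlencode


def build_facet_query(q, facet_fields):
    """Return a URL-encoded query string with one facet.field per field name."""
    params = [("q", q), ("facet", "true")]
    # facet.field may be repeated, so emit one (name, value) pair per field.
    params.extend(("facet.field", name) for name in facet_fields)
    return urlencode(params)


# Hypothetical field names; in practice they would come from wherever the
# application keeps track of the *_feature fields that currently exist.
feature_fields = ["color_feature", "diameter_feature", "material_feature"]

query_string = build_facet_query("*:*", feature_fields)
print(query_string)
```

The resulting string can be appended to the select URL of the Solr core being queried. When the set of features changes, only the list passed to the helper has to change.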

生寂 2024-12-12 15:41:03

I was in a similar situation when working on an e-commerce platform. Each item had static fields (Price, Name, Category) that mapped easily to SOLR's schema.xml, but each item could also have a varying number of variations.

For example, a t-shirt in the store could have Color (Black, White, Red, etc.) and Size (Small, Medium, etc.) attributes, whereas a candle in the same store could have a Scent (Pumpkin, Vanilla, etc.) variation. Essentially, this is an entity-attribute-value (EAV) relational database design used to describe some features of the product.

Since the schema.xml file in SOLR is flat from the perspective of faceting, I worked around it by munging the variations into a single multi-valued field ...

<field
  name="variation"
  type="string"
  indexed="true"
  stored="true"
  required="false"
  multiValued="true" />

... shoving data from the database into this field as Color|Black, Size|Small, and Scent|Pumpkin ...

  <doc>
    <field name="id">ITEM-J-WHITE-M</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Medium</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-L</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Large</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-XL</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Extra Large</field>
  </doc>

... so that when I tell SOLR to facet on that field, I get results that look like ...

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="variation">
      <int name="Color|White">2</int>
      <int name="Size|Extra Large">2</int>
      <int name="Size|Large">2</int>
      <int name="Size|Medium">2</int>
      <int name="Size|Small">2</int>
      <int name="Color|Black">1</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>

... so that my code that parses these results to display to the user can just split on my | delimiter (assuming that neither my keys nor values will have a | in them) and then group by the keys ...

Color
    White (2)
    Black (1)
Size
    Extra Large (2)
    Large (2)
    Medium (2)
    Small (2)

... which is good enough for government work.
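
The encoding and the split-and-group step described above could look roughly like the following Python sketch. It is not the code from that project: the function names are made up for illustration, and the facet counts are assumed to have already been read out of the Solr response into (term, count) pairs.

```python
DELIMITER = "|"


def encode_variation(key, value):
    """Build the 'Key|Value' string stored in the multi-valued variation field."""
    # The scheme only works if the delimiter never occurs in a key or a value.
    if DELIMITER in key or DELIMITER in value:
        raise ValueError("key and value must not contain the delimiter")
    return f"{key}{DELIMITER}{value}"


def group_facets(facet_counts):
    """Group (term, count) pairs by the key part that precedes the delimiter."""
    grouped = {}
    for term, count in facet_counts:
        # partition splits on the first delimiter only.
        key, _, value = term.partition(DELIMITER)
        grouped.setdefault(key, []).append((value, count))
    return grouped


# The facet counts from the response above, in the order Solr returned them.
variation_counts = [
    ("Color|White", 2),
    ("Size|Extra Large", 2),
    ("Size|Large", 2),
    ("Size|Medium", 2),
    ("Size|Small", 2),
    ("Color|Black", 1),
]

grouped = group_facets(variation_counts)
for key, values in grouped.items():
    print(key)
    for value, count in values:
        print(f"    {value} ({count})")
```

Because Python dictionaries keep insertion order, the keys come out in the order they first appear in the facet list, and the values within each key keep Solr's ordering.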

One disadvantage of doing it this way is that you lose the ability to do range facets on this EAV data. In my case that didn't matter: the Price field applies to all items, so it is defined in schema.xml and can be faceted in the usual way.

Hope this helps someone!
