Is there a way in Elasticsearch to retrieve the different filter permutations?

Posted 2025-01-17 11:09:33

Suppose the documents have two attributes: gender (male or female) and hair_color (blonde or brunette). We want the number of documents for every combination of these values and, importantly, for each attribute value on its own. For example, we would like the document counts for the following cases:

Male
Female
Blonde
Brunette
Male blonde
Male brunette
Female blonde
Female brunette

The goal is to have something like the following table:

gender	hair_color	doc_count
male	-	23
male	blonde	10
male	brunette	11
female	-	81
female	blonde	55
female	brunette	1
-	brunette	70
-	blonde	14

Note that there may be cases where one of the attributes is not defined (so brunette is not necessarily equal to male brunette + female brunette).

Is there any query that can return this to us in a clear way?

I'm new to Elasticsearch, so please excuse me if the question is trivial.

I've tried filters, first doing an aggregation by gender (so we get 1-2), and then an aggregation by hair_color + gender (so we get 3-8). But with more attributes (age, for example) the query gets too complicated and is hard to build and read programmatically.


撩人痒 2025-01-24 11:09:33

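Yes, terms aggregations can do that. In the query below, the outer buckets give the count per gender, the nested buckets give each gender and hair_color combination, and missing collects the documents that lack the field. Counts for hair_color on its own need a second top-level terms aggregation on hair_color.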

{
  "size": 0,
  "aggs": {
    "gender": {
      "terms": {
        "field": "gender",
        "size": 10,
        "missing": "no_gender"
      },
      "aggs": {
        "hair": {
          "terms": {
            "field": "hair_color",
            "size": 10,
            "missing": "no_color"
          }
        }
      }
    }
  }
}
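
To get from the nested response to the table in the question, the buckets can be walked recursively. The following is a minimal Python sketch, not part of the original answer: flatten_terms is a hypothetical helper, and the sample response is made up to match the shape Elasticsearch returns for the query above (only the gender and hair aggregation names come from that query).

```python
def flatten_terms(aggs, names, prefix=()):
    """Yield (keys, doc_count) rows from nested terms aggregations.

    `names` lists the aggregation names from outermost to innermost.
    Levels deeper than the current bucket are reported as "-".
    """
    name, rest = names[0], names[1:]
    for bucket in aggs[name]["buckets"]:
        keys = prefix + (bucket["key"],)
        # One row for this bucket on its own, e.g. ("male", "-").
        yield keys + ("-",) * len(rest), bucket["doc_count"]
        if rest:
            # Sub-aggregation results sit inside the parent bucket.
            yield from flatten_terms(bucket, rest, keys)


# Made-up response fragment (an assumption, for illustration only).
sample = {
    "aggregations": {
        "gender": {
            "buckets": [
                {
                    "key": "male",
                    "doc_count": 23,
                    "hair": {
                        "buckets": [
                            {"key": "brunette", "doc_count": 11},
                            {"key": "blonde", "doc_count": 10},
                            {"key": "no_color", "doc_count": 2},
                        ]
                    },
                }
            ]
        }
    }
}

rows = list(flatten_terms(sample["aggregations"], ["gender", "hair"]))
for keys, count in rows:
    print(*keys, count, sep="\t")
```

Adding another attribute then only means nesting one more terms aggregation and appending its name to the list passed to flatten_terms.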

Another option is the composite aggregation, whose result is easier to process because it is a single array of buckets, one for each combination of values present in the data. In Elasticsearch 7.0 and later, composite sources accept missing_bucket rather than missing, and documents without the field are returned under a null key:

{
  "size": 0,
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          {
            "gender": {
              "terms": {
                "field": "gender",
                "missing_bucket": true
              }
            }
          },
          {
            "color": {
              "terms": {
                "field": "hair_color",
                "missing_bucket": true
              }
            }
          },
          {
            "age": {
              "terms": {
                "field": "age",
                "missing_bucket": true
              }
            }
          }
        ]
      }
    }
  }
}
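
A composite aggregation only returns counts for full combinations, so the counts for isolated attributes have to be summed on the client. The sketch below is one possible way to do that in Python and is not part of the original answer: rollup is a hypothetical helper, the buckets are made-up sample data using only two sources for brevity, and it assumes each field holds a single value per document and that missing values are bucketed (shown here as None; a placeholder string such as no_color would be handled the same way). It also assumes all pages of the composite result have already been collected, since the aggregation returns 10 buckets per request by default and continues from the after_key of the previous response.

```python
from collections import Counter
from itertools import combinations


def rollup(buckets, fields):
    """Sum composite buckets into counts for every non-empty subset of fields."""
    totals = Counter()
    for size in range(1, len(fields) + 1):
        for subset in combinations(fields, size):
            for bucket in buckets:
                # Fields outside the subset are left unconstrained ("-").
                row = tuple(
                    bucket["key"][f] if f in subset else "-" for f in fields
                )
                totals[row] += bucket["doc_count"]
    return totals


# Made-up composite buckets (an assumption, for illustration only).
buckets = [
    {"key": {"gender": "male", "color": "blonde"}, "doc_count": 10},
    {"key": {"gender": "male", "color": "brunette"}, "doc_count": 11},
    {"key": {"gender": "male", "color": None}, "doc_count": 2},
    {"key": {"gender": "female", "color": "brunette"}, "doc_count": 1},
    {"key": {"gender": None, "color": "brunette"}, "doc_count": 58},
]

totals = rollup(buckets, ["gender", "color"])
for row, count in totals.items():
    print(*row, count, sep="\t")
```

Because every document falls into exactly one composite bucket when missing values are included, a row such as ("-", "brunette") also counts the documents that have no gender, which is the behaviour the question asks for.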
