Sorted pagination for composite aggregation

Posted on 2025-02-11 16:30:58

I have Elasticsearch 7.1 documents with the following mapping:

{
  "event" : {
    "mappings" : {
      "properties" : {
        "Code1" : {
          "type" : "keyword"
        },
        "Code2" : {
          "type" : "keyword"
        },
        "Date1" : {
          "type" : "date"
        },
        "Date2" : {
          "type" : "date"
        },
        "Value" : {
          "type" : "long"
        }
      }
    }
  }
}

I want to group the documents into buckets by Code1, Code2, Date1 and Date2, and compute for each bucket:

TotalValue, the sum of the Value field over all documents in the bucket,

and

Count, the number of documents in the bucket.

The final output I want looks like this:

[
    {
        "Code1": "ABC",
        "Code2": "XYZ",
        "Date1": "01/01/2022",
        "Date2": "31/01/2022",
        "TotalValue": "100",
        "Count": "3"
    },
    ...
]

I also want paginated output that can be sorted on any of the bucket's output fields, namely Code1, Code2, Date1, Date2, TotalValue and Count.

Using a composite aggregation, I came up with the query below. It aggregates as required, with a paginated response and sorting on Code1, Code2, Date1 and Date2,

but it cannot do proper sorted pagination on the TotalValue and Count (doc_count) fields.

GET event/_search
{
  "size":0,
  "aggs": {
      "AggregatedBucket": {
        "composite": {
          "size": 10,
          "sources": [
           {
              "Code1": {
                "terms": {
                  "field": "Code1",
                  "order": "desc"
                }
              }
            },
           {
              "Code2": {
                "terms": {
                  "field": "Code2",
                  "order": "desc"
                }
              }
            },
            {
              "Date1": {
                "terms": {
                  "field": "Date1",
                  "order": "desc"
                }
              }
            },
            {
              "Date2": {
                "terms": {
                  "field": "Date2",
                  "order": "desc"
                }
              }
            }
          ]
        },
        "aggs":{
            "TotalValue":{
              "sum": {
                "field": "Value"
              }
            }
        }
      }
    }
}

Here is the truncated response I am getting:

  "aggregations" : {
    "AggregatedBucket" : {
      "after_key" : {
        "Code1" : "ABC2",
        "Code2" : "XYZ2",
        "Date1" : "02/01/2022",
        "Date2" : "02/02/2022"
      },
      "buckets" : [
        {
          "key" : {
            "Code1" : "ABC1",
            "Code2" : "XYZ1",
            "Date1" : "01/01/2022",
            "Date2" : "01/02/2022"
          },
          "doc_count" : 1,
          "TotalValue" : {
            "value" : 4.0
          }
        },
        {
          "key" : {
            "Code1" : "ABC2",
            "Code2" : "XYZ2",
            "Date1" : "02/01/2022",
            "Date2" : "02/02/2022"
          },
          "doc_count" : 1,
          "TotalValue" : {
            "value" : 3.0
          }
        }
     ]
   }
 }

Any alternative approach that returns my expected response would also be helpful.


毅然前行 2025-02-18 16:30:58

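Sorry to say this, but you cannot paginate with an arbitrary sort order. A composite aggregation is already "sorted" by the keys you specify for pagination.
In your case, the buckets are ordered:

by Code1,
then, if two Code1 values are the same, by Code2,
then, if two Code2 values are the same, by Date1,
then, if two Date1 values are the same, by Date2,

each in the direction set by that source's "order" (desc in your query; ascending is the default).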

The sub-aggregation you created (TotalValue) cannot be used to sort the composite aggregation.

This is, and has always been, the major drawback of the composite aggregation.
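Because the composite aggregation cannot order by TotalValue or doc_count, one workaround is to walk every page with after_key and sort in application code. The following is only a minimal sketch of that idea: `fetch_page` is a hypothetical callable that stands in for the actual search request and returns the "AggregatedBucket" part of the response, and `collect_rows` and `sorted_page` are illustrative names. It holds every bucket in memory, so it only suits a modest number of groups.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical callable: takes the previous after_key (or None for the first
# page) and returns the "AggregatedBucket" object from the search response.
FetchPage = Callable[[Optional[Dict[str, Any]]], Dict[str, Any]]


def collect_rows(fetch_page: FetchPage) -> List[Dict[str, Any]]:
    """Walk all composite pages via after_key and flatten buckets to rows."""
    rows: List[Dict[str, Any]] = []
    after_key: Optional[Dict[str, Any]] = None
    while True:
        agg = fetch_page(after_key)
        buckets = agg.get("buckets", [])
        if not buckets:
            break
        for bucket in buckets:
            row = dict(bucket["key"])  # Code1, Code2, Date1, Date2
            row["TotalValue"] = bucket["TotalValue"]["value"]
            row["Count"] = bucket["doc_count"]
            rows.append(row)
        after_key = agg.get("after_key")
        if after_key is None:
            break
    return rows


def sorted_page(
    rows: List[Dict[str, Any]],
    sort_field: str,
    descending: bool = True,
    page: int = 0,
    page_size: int = 10,
) -> List[Dict[str, Any]]:
    """Sort the flattened rows on any output field and slice one page."""
    ordered = sorted(rows, key=lambda r: r[sort_field], reverse=descending)
    start = page * page_size
    return ordered[start:start + page_size]
```

In real use, `fetch_page` would send the composite query from the question, adding "after" to the composite body whenever the previous after_key is not None.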

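If you want to make this less complicated, an easy way out is to build a concatenated field from the four fields:
"Code1-Code2-Date1-Date2", and insert it into each document. Then run a terms aggregation on the concatenated field and sort it in descending order by count, or by your TotalValue sub-aggregation. This still will not let you paginate, but you can set the size of the returned aggregation response to something large enough for your needs.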
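The concatenated-field approach could look roughly like the sketch below. The field name `GroupKey` and both helper functions are assumptions made for illustration; the field would also need to be mapped as keyword, and the "-" separator only works if it cannot appear inside the codes or dates. The search body is written as a Python dict and uses the terms aggregation's "order" option, which accepts either "_count" or the name of a single-value metric sub-aggregation.

```python
from typing import Any, Dict

GROUP_KEY_FIELD = "GroupKey"  # hypothetical field name, not in the original mapping


def with_group_key(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document with the concatenated key added."""
    parts = (doc["Code1"], doc["Code2"], doc["Date1"], doc["Date2"])
    enriched = dict(doc)
    enriched[GROUP_KEY_FIELD] = "-".join(str(p) for p in parts)
    return enriched


def terms_query(sort_by: str = "_count", size: int = 1000) -> Dict[str, Any]:
    """Search body: terms on the concatenated key, ordered descending.

    sort_by is "_count" (document count) or "TotalValue" (the sum below).
    """
    return {
        "size": 0,
        "aggs": {
            "AggregatedBucket": {
                "terms": {
                    "field": GROUP_KEY_FIELD,
                    "size": size,
                    "order": {sort_by: "desc"},
                },
                "aggs": {"TotalValue": {"sum": {"field": "Value"}}},
            }
        },
    }
```

Each document would be passed through `with_group_key` before indexing, and `terms_query("TotalValue")` would then be sent to event/_search. The `size` value is a guess and has to be large enough to cover the groups you care about.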

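Aggregations have poor support for pagination. They are in fact designed to take all the data in the index and produce a response; pagination is not a concept that aggregations were designed around.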
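One partial option worth knowing about is the bucket_sort pipeline aggregation, which can sort the buckets of a terms aggregation and truncate them with "from" and "size". The sketch below is an assumption-laden illustration rather than a full answer to the pagination problem: `paged_terms_query` and the `max_buckets` parameter are made-up names, `field` is assumed to be a keyword field such as the concatenated key suggested above, and the paging only covers the buckets that the terms aggregation returned in the first place (at most `max_buckets`, chosen by document count).

```python
from typing import Any, Dict


def paged_terms_query(
    field: str,
    sort_by: str,
    page: int,
    page_size: int,
    max_buckets: int = 1000,
) -> Dict[str, Any]:
    """Search body: terms buckets sorted and sliced by bucket_sort.

    sort_by may be "_count", "_key" or the "TotalValue" sub-aggregation.
    """
    if (page + 1) * page_size > max_buckets:
        # Pages beyond max_buckets would come back empty or incomplete.
        raise ValueError("requested page lies beyond max_buckets")
    return {
        "size": 0,
        "aggs": {
            "AggregatedBucket": {
                "terms": {"field": field, "size": max_buckets},
                "aggs": {
                    "TotalValue": {"sum": {"field": "Value"}},
                    "Page": {
                        "bucket_sort": {
                            "sort": [{sort_by: {"order": "desc"}}],
                            "from": page * page_size,
                            "size": page_size,
                        }
                    },
                },
            }
        },
    }
```

Every page request recomputes all `max_buckets` buckets on the server, so this trades query cost for convenience and is only dependable when `max_buckets` covers every group.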

HTH.