Concatenating fields in an OpenSearch / Elasticsearch aggregation

Posted 2025-02-11 16:34:34

I have an OpenSearch index with the following mapping (simplified):

PUT /house
{
  "mappings": {
    "properties": {
      "house": { "type": "keyword" },
      "people": {
        "type": "nested",
        "properties": {
          "forename": { "type": "keyword" },
          "surname": { "type": "keyword" }
        }
      }
    }
  }
}

I'd like to retrieve an aggregate where the bucket key is "[forename] [surname]".

Toy data:

PUT /house/_doc/1
{
  "house": "house1",
  "people": [
    { "forename": "Dave", "surname": "Daveson" },
    { "forename": "Jeff", "surname": "Jeffson" }
  ]
}

PUT /house/_doc/2
{
  "house": "house1",
  "people": [
    { "forename": "Dave", "surname": "Daveson" },
    { "forename": "Jeffs", "surname": "Jeffsons" }
  ]
}

The following doesn't return what I'd expect, and I can't figure out what object paths to put in the script to get it to work:

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.name": {
          "terms": {
            "script": "[params._source['forename'], params._source['surname']].join(' ')"
          }
        }
      }
    }
  },
  "size": 0
}

Returns:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.name" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "null null",
            "doc_count" : 4
          }
        ]
      }
    }
  }
}

Without a script I can aggregate correctly on forename, surname, or both, but when I use both I can't reliably "join" the results, since the buckets can only be sorted by doc_count or key:

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.forename": {
          "terms": { "field": "people.forename" }
        },
        "people.surname": {
          "terms": { "field": "people.surname" }
        }
      }
    }
  },
  "size": 0
}

Returns:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.surname" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Daveson",
            "doc_count" : 2
          },
          {
            "key" : "Jeffson",
            "doc_count" : 1
          },
          {
            "key" : "Jeffsons",
            "doc_count" : 1
          }
        ]
      },
      "people.forename" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Dave",
            "doc_count" : 2
          },
          {
            "key" : "Jeff",
            "doc_count" : 1
          },
          {
            "key" : "Jeffs",
            "doc_count" : 1
          }
        ]
      }
    }
  }
}
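
To make the pairing problem concrete, here is a small Python sketch (plain Python, not OpenSearch code) that counts the toy data the way two independent terms aggregations do and the way the desired combined bucket would. The variable names are illustrative only.

```python
from collections import Counter

# Toy data copied from the question: one list of nested "people" per house document.
docs = [
    [("Dave", "Daveson"), ("Jeff", "Jeffson")],
    [("Dave", "Daveson"), ("Jeffs", "Jeffsons")],
]

# Flatten to one entry per nested person, which is what the nested aggregation iterates over.
people = [person for doc in docs for person in doc]

# Two independent terms aggregations: each field is counted on its own.
forenames = Counter(forename for forename, _ in people)
surnames = Counter(surname for _, surname in people)

# The desired aggregation: one bucket per "[forename] [surname]" pair.
full_names = Counter(f"{forename} {surname}" for forename, surname in people)

print(forenames)
print(surnames)
print(full_names)
```

"Jeff" and "Jeffs" both have a count of 1, as do "Jeffson" and "Jeffsons", so the separate counts carry no information about which surname belongs to which forename. Only a key built per nested person keeps the pairing.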

Comments (1)

罪#恶を代价 2025-02-18 16:34:36

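To get the result you want, read both fields through doc values using their full nested path (doc['people.forename'], doc['people.surname']) instead of params._source: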

GET house/_search
{
  "aggs": {
    "people": {
      "nested": {
        "path": "people"
      },
      "aggs": {
        "people.name": {
          "terms": {
            "script": "doc['people.forename'].value + ' ' +  doc['people.surname'].value"
          }
        }
      }
    }
  },
  "size": 0
}

Results:

"aggregations" : {
    "people" : {
      "doc_count" : 4,
      "people.name" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "Dave Daveson",
            "doc_count" : 2
          },
          {
            "key" : "Jeff Jeffson",
            "doc_count" : 1
          },
          {
            "key" : "Jeffs Jeffsons",
            "doc_count" : 1
          }
        ]
      }
    }
  }
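
The script above calls .value directly, which is fine for the toy data. As a hedged variation, the sketch below builds the same request body in Python but guards against a nested person that lacks one of the two fields, and flattens the buckets into a dict. The "(missing)" placeholder, the `missing` script parameter, and the helper names `build_request` and `name_counts` are assumptions made for illustration; they are not part of the original answer. The guarded script has not been run against a cluster here, so treat it as a starting point.

```python
import json

MISSING = "(missing)"  # hypothetical placeholder for an absent forename or surname

# Painless source: check size() before reading .value so an empty field does not raise.
SCRIPT = (
    "def f = doc['people.forename'].size() == 0 ? params.missing : doc['people.forename'].value; "
    "def s = doc['people.surname'].size() == 0 ? params.missing : doc['people.surname'].value; "
    "return f + ' ' + s;"
)


def build_request(missing=MISSING):
    """Return the search body: a nested aggregation with a scripted terms sub-aggregation."""
    return {
        "size": 0,
        "aggs": {
            "people": {
                "nested": {"path": "people"},
                "aggs": {
                    "people.name": {
                        "terms": {
                            "script": {
                                "lang": "painless",
                                "source": SCRIPT,
                                "params": {"missing": missing},
                            }
                        }
                    }
                },
            }
        },
    }


def name_counts(response):
    """Flatten the 'people.name' buckets into a {full name: doc_count} dict."""
    buckets = response["aggregations"]["people"]["people.name"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# The aggregation result reported in the answer, used as sample input.
sample_response = {
    "aggregations": {
        "people": {
            "doc_count": 4,
            "people.name": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "Dave Daveson", "doc_count": 2},
                    {"key": "Jeff Jeffson", "doc_count": 1},
                    {"key": "Jeffs Jeffsons", "doc_count": 1},
                ],
            },
        }
    }
}

print(json.dumps(build_request(), indent=2))
print(name_counts(sample_response))
```

The printed request body can be sent as the body of GET house/_search with whatever client you already use.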