Elasticsearch nested field array aggregation script
I am very new to Elasticsearch, so please bear with me. I've searched everywhere online and tried different things, but can't get an answer.
I have a mapping structured like this:
"index_1": { ...
},
"index_2": { ...
},
"index_3": {
  "mappings": {
    "dynamic": "strict",
    "properties": {
      ...
      "keywords": {
        "type": "nested",
        "properties": {
          "id": {
            "type": "keyword",
            "index": false,
            "ignore_above": 256
          },
          "term": {
            "type": "text",
            "copy_to": [
              "keywordsSearchField"
            ],
            "term_vector": "with_positions_offsets",
            "analyzer": "pasc_index_autocomplete_analyzer",
            "search_analyzer": "pasc_standard_analyzer"
          },
          "vocab": {
            "type": "keyword",
            "ignore_above": 256,
            "copy_to": [
              "keywordsSearchField"
            ]
          },
          "vocabUri": {
            "type": "keyword",
            "ignore_above": 256,
            "copy_to": [
              "keywordsSearchField"
            ]
          }
        }
      },
      "keywordsSearchField": {
        "type": "text",
        "analyzer": "pasc_standard_analyzer"
      },
      ...
    }
  }
}
All indexes have the same mappings. What I'm trying to do is calculate the size of the nested keywords array for each document in every index, and group the documents by size category, like:
keywords 1-5: 500 docs, keywords 6-10: 1000 docs, etc.
I started out looking at script_fields, before I discovered that they can't be used in aggregations. This is an example:
{
  "_source": "*",
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  },
  "script_fields": {
    "keywords_size": {
      "script": {
        "lang": "painless",
        "source": "params['_source']['keywords'].size() >= 1 && params['_source']['keywords'].size() <= 5"
      }
    },
    "keywords_size1": {
      "script": {
        "lang": "painless",
        "source": "params['_source']['keywords'].size() >= 6 && params['_source']['keywords'].size() <= 10"
      }
    },
    "keywords_size2": {
      "script": {
        "lang": "painless",
        "source": "params['_source']['keywords'].size() >= 11 && params['_source']['keywords'].size() <= 15"
      }
    },
    "size": {
      "script": {
        "lang": "painless",
        "source": "params['_source']['keywords'].size()"
      }
    }
  }
}
This works well enough for adding some fields to every document. I also tried moving the scripts into aggs, to create a bucket for every category I require, but can't get it to work.
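Script fields are evaluated only for the hits that are returned, which is why aggregations cannot see them. One possible way around this is to expose the array size as a search-time runtime field and bucket on it with a `range` aggregation. The following is a minimal, untested sketch, assuming Elasticsearch 7.11 or later (where `runtime_mappings` is accepted in a search request) and that `_source` is enabled. The field name `keywords_count`, the aggregation name `keywords_size_ranges`, and the helper `build_size_range_query` are hypothetical names chosen for illustration. The code only builds the request body; sending it is left to whichever client you use.

```python
import json

# Painless runtime-field script (assumption: _source is enabled).
# Emits 0 when "keywords" is missing, the list length when it is an array,
# and 1 when a single object was indexed instead of an array.
COUNT_SCRIPT = (
    "def k = params['_source']['keywords']; "
    "if (k == null) { emit(0); } "
    "else if (k instanceof List) { emit(k.size()); } "
    "else { emit(1); }"
)


def build_size_range_query(bounds):
    """Build a search body that buckets documents by keywords array size.

    bounds is a list of inclusive (low, high) pairs, e.g. [(1, 5), (6, 10)].
    The range aggregation treats "to" as exclusive, hence high + 1.
    """
    ranges = [
        {"key": f"{low}-{high}", "from": low, "to": high + 1}
        for low, high in bounds
    ]
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "runtime_mappings": {
            # hypothetical field name, defined for this request only
            "keywords_count": {
                "type": "long",
                "script": {"source": COUNT_SCRIPT},
            }
        },
        "aggs": {
            "keywords_size_ranges": {
                "range": {"field": "keywords_count", "ranges": ranges}
            }
        },
    }


if __name__ == "__main__":
    body = build_size_range_query([(1, 5), (6, 10), (11, 15)])
    print(json.dumps(body, indent=2))
```

Posting the printed body to `index_1,index_2,index_3/_search` should return one bucket per range with its `doc_count`. Because the script reads `_source` for every matching document, this is likely to be slow on large indexes.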
Comments (1)
Okay, so I managed to solve this by using scripts. I will post the answer here in case it helps anyone; however, I would still like to know what the answer would be using nested field aggregations without a script. So, here goes:
Sample Response:
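
On the script-free question: a plain `nested` aggregation counts nested objects across all the parent documents in a bucket, not per parent, so it does not produce size categories by itself. A common workaround is to store the array length in its own field at index time and run an ordinary `range` aggregation on it. The sketch below is not taken from the answer above. It assumes a hypothetical integer field `keywordsCount`, which would have to be added to the mapping first because `dynamic` is `strict`, and the helper names are made up for illustration.

```python
# Hypothetical body for PUT <index>/_mapping; needed because the mapping
# is "dynamic": "strict" and rejects unknown fields.
KEYWORDS_COUNT_MAPPING = {
    "properties": {"keywordsCount": {"type": "integer"}}
}


def with_keywords_count(doc):
    """Return a copy of doc with keywordsCount set to the keywords length."""
    enriched = dict(doc)
    enriched["keywordsCount"] = len(doc.get("keywords") or [])
    return enriched


def build_count_range_query(bounds):
    """Build a script-free search body bucketing on the stored count.

    bounds is a list of inclusive (low, high) pairs; the range
    aggregation's "to" is exclusive, hence high + 1.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "keywords_size_ranges": {
                "range": {
                    "field": "keywordsCount",
                    "ranges": [
                        {"key": f"{low}-{high}", "from": low, "to": high + 1}
                        for low, high in bounds
                    ],
                }
            }
        },
    }


if __name__ == "__main__":
    doc = {"keywords": [{"term": "a"}, {"term": "b"}]}
    print(with_keywords_count(doc))
    print(build_count_range_query([(1, 5), (6, 10), (11, 15)]))
```

Documents that are already indexed would need to be reindexed or updated to receive the new field. In exchange, the aggregation no longer has to load `_source` at search time.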