Elasticsearch: add a text field to my aggregation
I have article information like this in Elasticsearch:
{
  "ArticleId": 355027,
  "ArticleNumber": "433398",
  "CharacteristicsMultiValue": [
    {
      "Name": "Aantal cartridges",
      "Value": "4",
      "NumValue": 4,
      "Priority": 2147483647
    },
    {
      "Name": "ADF",
      "Value": "Ja",
      "Priority": 10,
      "Description": "Een Automatic Document Feeder (ADF), of automatische documentinvoer, laat een multifunctionele printer (all-in-one) automatisch meerdere vellen na elkaar verwerken. Door meerdere vellen in de ADF te plaatsen, wordt ieder vel papier stuk voor stuk automatisch gekopieerd of gescand."
    },
    {
      "Name": "Scanresolutie",
      "Value": "600x600 DPI",
      "Priority": 2147483647
    }
  ]
}
I'm running the following query to retrieve all occurrences of CharacteristicsMultiValue for my search, with all possible values, and sort them to my liking.
{
  "query": {
    "query_string": {
      "query": "433398",
      "default_operator": "and"
    }
  },
  "aggs": {
    "CharacteristicsMultiValue": {
      "nested": {
        "path": "CharacteristicsMultiValue"
      },
      "aggs": {
        "Name": {
          "terms": {
            "field": "CharacteristicsMultiValue.Name",
            "size": 25
          },
          "aggs": {
            "Value": {
              "terms": {
                "field": "CharacteristicsMultiValue.Value",
                "size": 25
              }
            },
            "Priority": {
              "avg": {
                "field": "CharacteristicsMultiValue.Priority"
              }
            },
            "Characteristics_sort": {
              "bucket_sort": {
                "sort": [
                  { "Priority": { "order": "asc" } }
                ]
              }
            }
          }
        }
      }
    }
  }
}
The result shows a list of CharacteristicsMultiValue entries like the one below.
{
  "key": "ADF",
  "doc_count": 1,
  "Priority": {
    "value": 10
  },
  "Value": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "Ja",
        "doc_count": 1
      }
    ]
  }
}
This all works great. I want to make a change so that the CharacteristicsMultiValue.Description field is included in the aggregation. I'm not really experienced with Elasticsearch, but I feel I should be able to do this pretty easily.
I did some research and, to my understanding, I need to add a new sub-aggregation for the Description field. I tried to do that by adding the JSON below to my current query in several places, but I keep getting 404 errors. Could anyone tell me how I can add the (first found) Description field to my aggregation?
"aggs": {
  "Description": {
    "terms": {
      "field": "CharacteristicsMultiValue.Description",
      "size": 1
    }
  }
}
I tested the solution proposed by Joe. This results in the following error response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [CharacteristicsMultiValue.Description] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "articles_dev1_nl",
        "node": "HiGH6JY9QvOozRSWJmFXpw",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [CharacteristicsMultiValue.Description] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [CharacteristicsMultiValue.Description] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [CharacteristicsMultiValue.Description] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    }
  },
  "status": 400
}
Comments (1)
I don't know why you're getting 404 errors; a malformed aggregation usually comes back as 400 Bad Request.
Either way, if you want to find the top Description term under every bucketed Value, you can nest a terms sub-aggregation on CharacteristicsMultiValue.Description inside the Value aggregation.
Generally speaking, a sub-aggregation is declared in an aggs object inside its parent aggregation, and you can either nest sub-aggregations under one another (Name -> Value in your query, or Value -> Description as suggested above) or keep them side by side as siblings (Name -> Value and Name -> Priority in your query).
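As a rough sketch of that nesting, here is the query from the question with a Description sub-aggregation placed inside Value. Everything except the inner aggs block is copied from the question; placing it under Value (rather than directly under Name) is one reading of "under every bucketed Value". It also assumes Description has been made aggregatable, since a terms aggregation does not run on a plain text field by default.

```json
{
  "query": {
    "query_string": {
      "query": "433398",
      "default_operator": "and"
    }
  },
  "aggs": {
    "CharacteristicsMultiValue": {
      "nested": { "path": "CharacteristicsMultiValue" },
      "aggs": {
        "Name": {
          "terms": { "field": "CharacteristicsMultiValue.Name", "size": 25 },
          "aggs": {
            "Value": {
              "terms": { "field": "CharacteristicsMultiValue.Value", "size": 25 },
              "aggs": {
                "Description": {
                  "terms": { "field": "CharacteristicsMultiValue.Description", "size": 1 }
                }
              }
            },
            "Priority": {
              "avg": { "field": "CharacteristicsMultiValue.Priority" }
            },
            "Characteristics_sort": {
              "bucket_sort": {
                "sort": [ { "Priority": { "order": "asc" } } ]
              }
            }
          }
        }
      }
    }
  }
}
```

With this shape, each Value bucket in the response should carry its own Description object with at most one bucket. Characteristics that have no Description (such as Scanresolutie in the sample document) would simply return an empty bucket list there.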
Tip: your query is already quite heavily nested, so you could explore the typed_keys query parameter to determine more easily which bucket corresponds to which sub-aggregation.
Edit
As described in the error message, the Description field needs to be aggregatable before any aggregation can be run on it.
So if you can drop and recreate your index, turn fielddata on for that field in the mapping; or, if your index already exists, you can set it with the update mapping API.
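For the existing-index case, a minimal sketch of a request body for the update mapping API, sent as PUT articles_dev1_nl/_mapping (index name taken from the error response in the question). It assumes Elasticsearch 7 or later, where mappings carry no type name, and that CharacteristicsMultiValue is already mapped as nested, which the nested aggregation in the question implies.

```json
{
  "properties": {
    "CharacteristicsMultiValue": {
      "type": "nested",
      "properties": {
        "Description": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }
}
```

The same properties block should also work inside the mappings section when creating the index from scratch. As the error message warns, fielddata on a long free-text field can use significant memory, and the resulting buckets are the analyzed tokens of the text rather than whole descriptions.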
You can learn more about fielddata vs. keyword in the Elasticsearch docs.
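The keyword route that the error message recommends could look roughly like the mapping body below, again sent with the update mapping API. The sub-field name keyword is only an illustrative choice, not something taken from the question, and documents indexed before the change would need to be reindexed (for example with an update-by-query) before the sub-field holds any values.

```json
{
  "properties": {
    "CharacteristicsMultiValue": {
      "type": "nested",
      "properties": {
        "Description": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```

The terms aggregation would then target CharacteristicsMultiValue.Description.keyword instead of CharacteristicsMultiValue.Description, and each bucket key would be the full description string rather than individual tokens.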