Elasticsearch multi-term auto-completion

Posted on 2025-01-12 20:02:47

I'm trying to implement the Multi-Term Auto Completion that's presented here.

Filtering down to the correct documents works, but the aggregation on completion_terms is not restricted to terms that match the current partial query. Instead, it returns every completion_terms value from every matched document.

Here are the mappings:

{
  "mappings": {
    "dynamic" : "false",
    "properties" : {
      "completion_ngrams" : {
        "type" : "text",
        "analyzer" : "completion_ngram_analyzer",
        "search_analyzer" : "completion_ngram_search_analyzer"
      },
      "completion_terms" : {
        "type" : "keyword",
        "normalizer" : "completion_normalizer"
      }
    }
  }
}

Here are the settings:

{
    "settings" : {
      "index" : {
        "analysis" : {
          "filter" : {
            "edge_ngram" : {
              "type" : "edge_ngram",
              "min_gram" : "1",
              "max_gram" : "10"
            }
          },
          "normalizer" : {
            "completion_normalizer" : {
              "filter" : [
                "lowercase",
                "german_normalization"
              ],
              "type" : "custom"
            }
          },
          "analyzer" : {
            "completion_ngram_search_analyzer" : {
              "filter" : [
                "lowercase"
              ],
              "tokenizer" : "whitespace"
            },
            "completion_ngram_analyzer" : {
              "filter" : [
                "lowercase",
                "edge_ngram"
              ],
              "tokenizer" : "whitespace"
            }
          }
        }
      }
    }
}

I'm then indexing data like this:

{
  "completion_terms" : ["Hammer", "Fortis", "Tool", "2000"],
  "completion_ngrams": "Hammer Fortis Tool 2000"
}
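
To make it easier to see why the partial input "too" matches this document, here is a rough Python approximation of what the two analyzers above produce. It is only a sketch: the function names edge_ngrams and search_tokens are hypothetical, and the code imitates the whitespace tokenizer, lowercase filter and edge_ngram filter rather than calling Elasticsearch.

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Approximate completion_ngram_analyzer: whitespace split, lowercase, edge n-grams."""
    grams = []
    for token in text.split():
        token = token.lower()
        # Emit prefixes of length min_gram..max_gram, capped at the token length.
        for n in range(min_gram, min(max_gram, len(token)) + 1):
            grams.append(token[:n])
    return grams


def search_tokens(text):
    """Approximate completion_ngram_search_analyzer: whitespace split and lowercase only."""
    return [token.lower() for token in text.split()]


indexed = set(edge_ngrams("Hammer Fortis Tool 2000"))
print(edge_ngrams("Tool"))  # ['t', 'to', 'too', 'tool']
print(all(token in indexed for token in search_tokens("too")))  # True
```

Because the search analyzer does not build n-grams, the query token "too" is compared as-is against the indexed prefixes, which is what makes the match clause behave like a prefix search.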

Finally, the autocomplete search looks like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "completion_terms": "fortis"
          }
        },
        {
          "term": {
            "completion_terms": "hammer"
          }
        },        
        {
          "match": {
            "completion_ngrams": "too"
          }
        }
      ]
    }
  },
  "aggs": {
    "autocomplete": {
      "terms": {
        "field": "completion_terms",
        "size": 100
      }
    }
  }
}

This correctly returns documents matching the search string "fortis hammer too", but the aggregations include ALL completion terms that are included in any of the matched documents, e.g. for the query above:

"buckets": [
  { "key": "fortis" },
  { "key": "hammer" },
  { "key": "tool" },
  { "key": "2000" }
]

Ideally, I'd expect

"buckets": [
  { "key": "tool" }
]

I could filter out the terms that are already covered by the search query ("fortis" and "hammer" in this case) in the app, but the "2000" doesn't make any sense from a user's perspective, because it doesn't partially match any of the provided search terms.
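
For reference, the app-side filtering mentioned above could look roughly like the following sketch. The helper name filter_buckets is hypothetical, and it assumes the bucket keys are already lowercased by the normalizer, so only the user input is lowercased here.

```python
def filter_buckets(buckets, completed_terms, partial_term):
    """Keep buckets that extend the partial term and were not already typed."""
    done = {term.lower() for term in completed_terms}
    prefix = partial_term.lower()
    return [
        bucket
        for bucket in buckets
        if bucket["key"].startswith(prefix) and bucket["key"] not in done
    ]


buckets = [
    {"key": "fortis"},
    {"key": "hammer"},
    {"key": "tool"},
    {"key": "2000"},
]
print(filter_buckets(buckets, ["fortis", "hammer"], "too"))
# prints [{'key': 'tool'}]
```

One drawback of doing this in the app: the terms aggregation only returns the top "size" buckets, so with many distinct terms the matching ones may be cut off before the filter ever sees them.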

I understand why this is happening, but I can't think of a solution. Can anyone help?

Comments (1)

抱猫软卧 2025-01-19 20:02:47

Please try a filters aggregation:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "completion_terms": "fortis"
          }
        },
        {
          "term": {
            "completion_terms": "hammer"
          }
        },
        {
          "match": {
            "completion_ngrams": "too"
          }
        }
      ]
    }
  },
  "aggs": {
    "findOuthammerAndfortis": {
      "filters": {
        "filters": {
          "fortis": {
            "term": {
              "completion_terms": "fortis"
            }
          },
          "hammer": {
            "term": {
              "completion_terms": "hammer"
            }
          }
        }
      }
    }
  }
}
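
A different option that may be closer to the expected output (it is not part of the answer above) is the include parameter of the terms aggregation, which accepts a regular expression and keeps only the buckets whose key matches it. The Python sketch below builds such a request body from the raw user input. The names build_autocomplete_request and escape_regex are hypothetical, the set of escaped characters is an assumption based on Lucene regular expression syntax, and plain lowercasing stands in for completion_normalizer (german_normalization is not reproduced).

```python
import json

# Characters escaped before building the regex. This set is an assumption
# based on the reserved characters of Lucene regular expression syntax.
_REGEX_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_regex(text):
    """Backslash-escape reserved regex characters in user input."""
    return "".join("\\" + ch if ch in _REGEX_RESERVED else ch for ch in text)


def build_autocomplete_request(user_input, size=100):
    """Treat the last whitespace-separated token as the partial term."""
    tokens = user_input.lower().split()
    if not tokens:
        raise ValueError("user_input must contain at least one term")
    *completed, partial = tokens
    # Completed terms must match exactly; the partial term matches via n-grams.
    must = [{"term": {"completion_terms": term}} for term in completed]
    must.append({"match": {"completion_ngrams": partial}})
    return {
        "query": {"bool": {"must": must}},
        "aggs": {
            "autocomplete": {
                "terms": {
                    "field": "completion_terms",
                    "size": size,
                    # Only keep buckets whose key starts with the partial term.
                    "include": escape_regex(partial) + ".*",
                }
            }
        },
    }


print(json.dumps(build_autocomplete_request("fortis hammer too"), indent=2))
```

For the input "fortis hammer too" the include value becomes "too.*", so "2000", "fortis" and "hammer" should no longer appear as buckets. A completed term that happens to start with the partial term would still be returned and would need to be dropped in the app.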