Elasticsearch: How to search for multiple words in a copy_to field?

Posted 2025-02-06 09:19:04 · 3217 characters · 1 view · 0 comments

I am currently learning Elasticsearch and am stuck on the issue described below.

On an existing index (I don't know whether that matters), I added this new mapping:

PUT user-index
{
  "mappings": {
    "properties": {
      "common_criteria": { // new property which aggregates other properties via copy_to
        "type": "text"
      },
      "name": { // already existed before this mapping
        "type": "text",
        "copy_to": "common_criteria"
      },
      "username": { // already existed before this mapping
        "type": "text",
        "copy_to": "common_criteria"
      },
      "phone": { // already existed before this mapping
        "type": "text",
        "copy_to": "common_criteria"
      },
      "country": { // already existed before this mapping
        "type": "text",
        "copy_to": "common_criteria"
      }
    }
  }
}
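
Because the index already existed, one thing that may be worth ruling out: copy_to is applied when a document is indexed, so documents stored before the mapping change have nothing in common_criteria until they are indexed again. An _update_by_query call with no script re-indexes each document in place and picks up the new mapping. The following is a minimal sketch using only the JDK HTTP client. The base URL http://localhost:9200 and the class and method names are assumptions for illustration, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ReindexRequestSketch {

    // Builds (but does not send) a POST to <baseUrl>/<index>/_update_by_query.
    // conflicts=proceed asks Elasticsearch to continue past version conflicts.
    static HttpRequest updateByQuery(String baseUrl, String index) {
        URI uri = URI.create(baseUrl + "/" + index + "/_update_by_query?conflicts=proceed");
        return HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        // Assumed local cluster address; replace with the real one.
        HttpRequest request = updateByQuery("http://localhost:9200", "user-index");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would take an HttpClient and whatever authentication the cluster requires, both of which are left out of this sketch.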

The goal is to search for one or more values using only common_criteria. Say we have:

{
 "common_criteria": ["John Smith","johny","USA"]
}

What I would like to achieve is exact-match searching across multiple values of common_criteria:

  1. We should get a result if we search with John Smith, with USA + John Smith, with johny + USA, with USA, with johny, or with John Smith + USA + johny (word order does not matter).
  2. We should get no result if we search with multiple words where one of them is absent, such as John Smith + Germany or johny + England.
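
The two rules above amount to "every search term must be present somewhere in common_criteria, in any order", which is roughly what a match query with operator AND does on a multi-valued text field, since non-phrase matching ignores token positions. The following plain-Java sketch models that behaviour without Elasticsearch. The tokenizer is a simplified stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), fuzziness is not modelled, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AndMatchSketch {

    // Simplified stand-in for the standard analyzer: lowercase the text
    // and split on anything that is not a letter or a digit.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // True when every query token occurs among the tokens of all field values.
    // An empty query matches nothing, mirroring zero_terms_query = NONE.
    static boolean matchesAll(List<String> fieldValues, String query) {
        Set<String> indexed = new HashSet<>();
        for (String value : fieldValues) {
            indexed.addAll(tokens(value));
        }
        Set<String> wanted = tokens(query);
        return !wanted.isEmpty() && indexed.containsAll(wanted);
    }

    public static void main(String[] args) {
        List<String> commonCriteria = List.of("John Smith", "johny", "USA");
        System.out.println(matchesAll(commonCriteria, "USA John Smith"));
        System.out.println(matchesAll(commonCriteria, "John Smith Germany"));
    }
}
```

Note that this is term-level matching: "Smith John" would also match, so it is "all words present" rather than a strict phrase match.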

I am using Spring Data Elasticsearch to build my query:

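 // Imports assume the Elasticsearch 7.x Java client and Spring Data Elasticsearch 4.x.
 import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.index.query.BoolQueryBuilder;
 import org.elasticsearch.index.query.Operator;
 import org.elasticsearch.index.query.QueryBuilders;
 import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

 NativeSearchQueryBuilder nativeSearchQuery = new NativeSearchQueryBuilder();
 BoolQueryBuilder booleanQuery = QueryBuilders.boolQuery();
 String valueToSearch = "johny";
 // Require every analyzed term to match (AND), with automatic fuzziness.
 nativeSearchQuery.withQuery(booleanQuery.must(QueryBuilders.matchQuery("common_criteria", valueToSearch)
                        .fuzziness(Fuzziness.AUTO)
                        .operator(Operator.AND)));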

Logging the request sent to Elasticsearch, I get:

{
  "bool" : {
    "must" : {
      "match" : {
        "common_criteria" : {
          "query" : "johny",
          "operator" : "AND",
          "fuzziness" : "AUTO",
          "prefix_length" : 0,
          "max_expansions" : 50,
          "fuzzy_transpositions" : true,
          "lenient" : false,
          "zero_terms_query" : "NONE",
          "auto_generate_synonyms_phrase_query" : true,
          "boost" : 1.0
        }
      }
    },
    "adjust_pure_negative" : true,
    "boost" : 1.0
  }
}

With that request I get 0 results. I suspect the request is incorrect because of the must/match condition, and the common_criteria field may also not be defined properly.

Thanks in advance for your help and explanations.

EDIT: After trying the multi_match query.

Following @rabbitbr's suggestion, I tried the multi_match query, but it does not seem to work. This is an example of a request sent to Elasticsearch (with 0 results):

{
  "bool" : {
    "must" : {
      "multi_match" : {
        "query" : "John Smith USA",
        "fields" : [
          "name^1.0",
          "username^1.0",
          "phone^1.0",
          "country^1.0"
        ],
        "type" : "best_fields",
        "operator" : "AND",
        "slop" : 0,
        "fuzziness" : "AUTO",
        "prefix_length" : 0,
        "max_expansions" : 50,
        "zero_terms_query" : "NONE",
        "auto_generate_synonyms_phrase_query" : true,
        "fuzzy_transpositions" : true,
        "boost" : 1.0
      }
    },
    "adjust_pure_negative" : true,
    "boost" : 1.0
  }
}

That request does not return any results.

Comments (1)

幽蝶幻影 2025-02-13 09:19:04

I would try the multi_match query before creating a field that stores all the others in one place.

The multi_match query builds on the match query to allow multi-field queries.
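
One caveat when combining multi_match with operator AND: the best_fields type is field-centric, so the operator is applied to each field separately, and a query such as "John Smith USA" only matches when a single field contains all three terms. That may explain a multi_match request returning nothing even though each term exists in some field. The plain-Java sketch below contrasts the two behaviours without Elasticsearch. The tokenizer is a simplified stand-in for the standard analyzer, fuzziness is not modelled, the class and method names are hypothetical, and the assignment of sample values to name, username, and country is an assumption, since the question only shows the combined values.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class FieldCentricSketch {

    // Assumed sample document; the per-field split is inferred, not from the source.
    static final Map<String, String> SAMPLE =
            Map.of("name", "John Smith", "username", "johny", "country", "USA");

    // Simplified analyzer: lowercase, split on non-alphanumeric characters.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Field-centric AND: some single field must contain every query term.
    static boolean perFieldAnd(Map<String, String> doc, String query) {
        Set<String> wanted = tokens(query);
        if (wanted.isEmpty()) {
            return false;
        }
        for (String value : doc.values()) {
            if (tokens(value).containsAll(wanted)) {
                return true;
            }
        }
        return false;
    }

    // Combined-field AND: terms may be spread across fields, as with a copy_to target.
    static boolean combinedAnd(Map<String, String> doc, String query) {
        Set<String> wanted = tokens(query);
        Set<String> all = new HashSet<>();
        for (String value : doc.values()) {
            all.addAll(tokens(value));
        }
        return !wanted.isEmpty() && all.containsAll(wanted);
    }

    public static void main(String[] args) {
        System.out.println(perFieldAnd(SAMPLE, "John Smith USA"));
        System.out.println(combinedAnd(SAMPLE, "John Smith USA"));
    }
}
```

Under these assumptions, a match query against the copy_to field corresponds to the combined variant. The cross_fields type of multi_match is the term-centric alternative, although the Elasticsearch documentation notes that it cannot be combined with fuzziness.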
