Elasticsearch _knn_search: querying multiple vector fields

Posted on 2025-01-29 11:36:02

I'm using Elasticsearch 8.2 and would like to use the approximate method of _knn_search on more than one vector. Below is my current code, which searches on a single vector. As far as I have read, _knn_search does not support searching on nested fields.
Alternatively, I could use a multi-index search: one index per vector, one search per index, then sum up all the results. However, I need to store all these vectors together in one index, because besides the kNN search on the vectors I also need to filter on some other fields.

So the question is: is there a workaround that lets me perform _knn_search on more than one vector?

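import numpy as np

# Placeholder query vector; replace with a real 512-dimensional embedding.
search_vector = np.zeros(512).tolist()
es_query = {
    # Approximate kNN search on the first vector field only.
    "knn": {
        "field": "feature_vector_1.vector",
        "query_vector": search_vector,
        "k": 100,
        "num_candidates": 1000
    },
    # Restrict the kNN candidates to documents matching this filter.
    "filter": [
        {
            "range": {
                "feature_vector_1.match_prc": {
                    "gt": 10
                }
            }
        }
    ],
    # Keep the large vector fields out of the returned hits.
    "_source": {
        "excludes": ["feature_vector_1.vector", "feature_vector_2.vector"]
    }
}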
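
The multi-search alternative described above (one kNN search per vector field, then summing the results) could be sketched client-side roughly as follows. This is only an illustration under assumptions: `merge_knn_hits` is a hypothetical helper, not part of any Elasticsearch client, and each input list is assumed to hold hits shaped like the `hits.hits` entries of a search response (with `_id` and `_score`). The sample hits are made up.

```python
from collections import defaultdict


def merge_knn_hits(*hit_lists, top_k=10):
    """Sum `_score` per `_id` across several kNN result lists (hypothetical helper)."""
    totals = defaultdict(float)
    for hits in hit_lists:
        for hit in hits:
            totals[hit["_id"]] += hit["_score"]
    # Highest combined score first; ties broken by _id for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Made-up hits standing in for two separate _knn_search responses.
hits_vector_1 = [{"_id": "a", "_score": 0.9}, {"_id": "b", "_score": 0.5}]
hits_vector_2 = [{"_id": "b", "_score": 0.8}, {"_id": "c", "_score": 0.25}]
merged = merge_knn_hits(hits_vector_1, hits_vector_2)
```

A document that appears in only one result list contributes nothing for the other field, so the merged ranking is an approximation of a true combined score rather than an exact one.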

Comments (2)

梦里人 2025-02-05 11:36:02

The working query that I ended up with is:

es_query = {
    "knn": {
        "field": "feature_vector_1.vector",
        "query_vector": search_vector,
        "k": 1000,
        "num_candidates": 1000
    },
    "filter": [
        {
            "function_score": {
                "query": {
                    "match_all": {}
                },
                "script_score": {
                    "script": {
                        "source": """
                            double value = dotProduct(params.queryVector, 'feature_vector_2.vector');
                            return 100 * (1 + value) / 2;
                        """,
                        "params": {
                            "queryVector": search_vector
                        }
                    }
                }
            }
        }
    ],
    "_source": {
        "excludes": ["feature_vector_1.vector", "feature_vector_2.vector"]
    }
}

Note that this is not a true approximate kNN (AKNN) search on two vectors, but it is still a workable option if the performance of such a query meets your expectations.
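
The script's return value, `100 * (1 + value) / 2`, rescales a dot product from the range [-1, 1] to [0, 100], which only holds if both the stored and the query vectors are unit length (an assumption, since the answer does not say so). A small sketch of that arithmetic in plain Python, where `rescale_dot_product` and `normalize` are hypothetical names used only for this illustration:

```python
import math


def rescale_dot_product(query_vector, doc_vector):
    """Mirror the script's arithmetic: map a dot product in [-1, 1] to [0, 100]."""
    value = sum(q * d for q, d in zip(query_vector, doc_vector))
    return 100 * (1 + value) / 2


def normalize(vector):
    """Scale a vector to unit length so the dot product stays within [-1, 1]."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Identical, opposite, and orthogonal directions land near 100, 0, and 50.
same = rescale_dot_product(normalize([3.0, 4.0]), normalize([3.0, 4.0]))
opposite = rescale_dot_product(normalize([3.0, 4.0]), normalize([-3.0, -4.0]))
orthogonal = rescale_dot_product([1.0, 0.0], [0.0, 1.0])
```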

半山落雨半山空 2025-02-05 11:36:02

The query below works for me for combining kNN searches by taking the average of multiple cosine similarity scores. Note that this differs a little from the original request, since it performs a brute-force search, but you can still filter the results up front by replacing the match_all part.

GET my-index/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "(cosineSimilarity(params.vector1, 'my-vector1') + cosineSimilarity(params.vector2, 'my-vector2'))/2 + 1.0",
        "params": {
          "vector1": [
            1.3012068271636963,
            ...
            0.23468133807182312
          ],
          "vector2": [
            -0.49404603242874146,
            ...
            -0.15835021436214447
          ]
        }
      }
    }
  }
}
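
To illustrate replacing the `match_all` part with a pre-filter, here is a minimal sketch that builds the same request body in Python. `build_avg_cosine_query` is a hypothetical helper, and the `range` filter on `feature_vector_1.match_prc` is borrowed from the question purely as an example; the field names `my-vector1` and `my-vector2` come from the query above, and the short vectors are placeholders.

```python
def build_avg_cosine_query(vector1, vector2, filter_clauses=None):
    """Build a script_score body averaging two cosine similarities (hypothetical helper)."""
    # Without filters, score every document; otherwise score only matching ones.
    if filter_clauses:
        inner_query = {"bool": {"filter": filter_clauses}}
    else:
        inner_query = {"match_all": {}}
    return {
        "query": {
            "script_score": {
                "query": inner_query,
                "script": {
                    "source": (
                        "(cosineSimilarity(params.vector1, 'my-vector1') + "
                        "cosineSimilarity(params.vector2, 'my-vector2'))/2 + 1.0"
                    ),
                    "params": {"vector1": vector1, "vector2": vector2},
                },
            }
        }
    }


# Example filter taken from the question; the short vectors are placeholders.
body = build_avg_cosine_query(
    [0.1, 0.2],
    [0.3, 0.4],
    filter_clauses=[{"range": {"feature_vector_1.match_prc": {"gt": 10}}}],
)
```

The resulting dict could be sent as the body of a `_search` request. The script still runs on every document that passes the filter, so the search stays exact rather than approximate.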