Elasticsearch: bool with regexp, filter and aggs

Posted on 2025-01-17 11:10:15

I am completely new to AWS Elasticsearch and am trying to run some queries on a dataset of tagged movies. The dataset has five columns: genres, movieId, tag, title, userId. The year of each movie is contained in the title, like so: Waterworld (1995).
I want to see how many movies with the tag true story were produced in 2002.
Since I first have to match the year, then filter by the tag, and finally count the movies, I tried doing it with a bool query like so:

GET tagged_movies/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "title": "(2002)"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "tag": "true story"
          }
        }
      ],
      "aggs": {
        "by_numberofmovies": {
          "terms": {
            "field": "movieId"
          }
        }
      }
    }
  }
}

But I get the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[18:7] [bool] unknown field [aggs]"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[18:7] [bool] unknown field [aggs]"
  },
  "status" : 400
}

I don't understand this at all, since as far as I can tell bool should recognize aggs. I've looked through the documentation and searched online, and my reading was that bool should indeed accept aggs. Could someone point out where the problem might be?

Here is a sample document that this query should match:

{
  "_index" : "tagged_movies",
  "_id" : "EgADsX8B2WnPqWZmot9b",
  "_score" : 1.0,
  "_source" : {
    "@timestamp" : "2011-03-22T04:22:48.000+01:00",
    "genres" : "Comedy",
    "movieId" : 5283,
    "tag" : "true story",
    "title" : "National Lampoon's Van Wilder (2002)",
    "userId" : 121,
    "timestamp" : "2011-03-22 04:22:48"
  }
}


Comments (1)

冷情 2025-01-24 11:10:15

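aggs can't be inside the query block; aggs and query are siblings at the top level of the request body. In your request, aggs sits inside bool, which is why the parser reports [bool] unknown field [aggs] (the [18:7] is the line and column of that key in the body). Your corrected query should look like this: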

{
    "query": {
        "bool": {
            "must": [
                {
                    "regexp": {
                        "title": "(2002)"
                    }
                }
            ],
            "filter": [
                {
                    "match": {
                        "tag": "true story"
                    }
                }
            ]
        }
    },
    "aggs": {
        "by_numberofmovies": {
            "terms": {
                "field": "movieId"
            }
        }
    }
}
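
If the request body is built in code rather than typed by hand, the same fix can be sketched as a small Python helper. This is only an illustration under assumptions: the helper name hoist_aggs is hypothetical and not part of any Elasticsearch client, and the body is handled as a plain dict without contacting a cluster. It takes the body from the question and moves a misplaced aggs block out of bool so that it sits next to query.

```python
import copy
import json


def hoist_aggs(body):
    """Return a copy of a search body with `aggs` moved out of query.bool.

    Hypothetical helper for illustration only; it is not an Elasticsearch API.
    """
    fixed = copy.deepcopy(body)  # leave the caller's dict untouched
    bool_clause = fixed.get("query", {}).get("bool", {})
    for key in ("aggs", "aggregations"):
        if key in bool_clause:
            # Aggregations belong at the top level, as a sibling of "query".
            fixed[key] = bool_clause.pop(key)
    return fixed


# The request body from the question, with "aggs" wrongly nested in "bool".
broken = {
    "query": {
        "bool": {
            "must": [{"regexp": {"title": "(2002)"}}],
            "filter": [{"term": {"tag": "true story"}}],
            "aggs": {
                "by_numberofmovies": {"terms": {"field": "movieId"}}
            },
        }
    }
}

fixed = hoist_aggs(broken)
print(json.dumps(fixed, indent=2))
```

The helper only relocates the aggregation block. It leaves the filter clause as the question wrote it (term), whereas the corrected query above uses match for the tag filter.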