Elasticsearch "data": {"type": "float"} query returns incorrect results

Posted on 2025-01-16 04:52:32

I have the queries below. When the date_partition field is mapped as "type": "float", the document with the value 20220109 is returned for queries such as 20220109, 20220108 and 20220107.
When the field is mapped as "type": "long", it is returned only for the query 20220109, which is what I want.

With the float field, each of the queries below returns results as if the query 20220109 had been sent.
--> 20220109, 20220108, 20220107

PUT date
{
  "mappings": {
    "properties": {
      "date_partition_float": {
        "type": "float"
      },
      "date_partition_long": {
        "type": "long"
      }
    }
  }
}
POST date/_doc
{
  "date_partition_float": "20220109",
  "date_partition_long": "20220109"
}
# This returns the document
GET date/_search
{
  "query": {
    "match": {
      "date_partition_float": "20220108"
    }
  }
}
# This returns nothing
GET date/_search
{
  "query": {
    "match": {
      "date_partition_long": "20220108"
    }
  }
}

Is this a bug, or is this how the float type works?
Two years of data are loaded into Elasticsearch, one day at a time (like day-1, day-2), with about 20 GB of primary shard size per day (15 TB in total). What is the best way to change the type of just this field?
I have 5 float fields in my mapping. What is the fastest way to change all of them?
Note: I have the solutions below in mind, but I'm afraid they are slow:

  • update by query API
  • reindex API
  • runtime fields in the search request (especially this one)

Thank you!

Comments (2)

娇柔作态 2025-01-23 04:52:33

That date_partition field should have the date type with format=yyyyMMdd. That's the only sensible type to use, not long and, even worse, float.

PUT date
{
  "mappings": {
    "properties": {
      "date_partition": {
        "type": "date",
        "format": "yyyyMMdd"
      }
    }
  }
}

It's not logical to query for 20220108 and have the 20220109 document returned in the results.

Using the date type would also allow you to use proper time-based range queries and create date_histogram aggregations on your data.
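To illustrate why date semantics help with ranges, here is a minimal Java sketch. It assumes the yyyyMMdd pattern behaves like the same pattern in Java's DateTimeFormatter; the class and method names (DatePartitionParsing, nextDay, inRange) are hypothetical and are not part of Elasticsearch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DatePartitionParsing {
    // Assumption: same pattern string as the "format" in the mapping above.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Parses a partition value such as "20220109" into a calendar date.
    static LocalDate parse(String value) {
        return LocalDate.parse(value, FMT);
    }

    // Returns the following calendar day in the same yyyyMMdd form.
    static String nextDay(String value) {
        return parse(value).plusDays(1).format(FMT);
    }

    // Inclusive range check on calendar dates rather than on raw numbers.
    static boolean inRange(String value, String from, String to) {
        LocalDate d = parse(value);
        return !d.isBefore(parse(from)) && !d.isAfter(parse(to));
    }

    public static void main(String[] args) {
        // Numeric arithmetic would give 20220132; date arithmetic rolls over the month.
        System.out.println(nextDay("20220131"));
        System.out.println(inRange("20220109", "20220101", "20220131"));
    }
}
```

With a numeric type, the day after 20220131 would be computed as 20220132, which is not a date; with date arithmetic it rolls over to 20220201.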

You can either recreate the index with the appropriate type and reindex your data, or add a new field to your existing index and update it by query. Both options are valid.

原来分手还会想你 2025-01-23 04:52:33

Here is the answer to my question: https://discuss.elastic.co/t/elasticsearch-data-type-float-returns-incorrect-results/300335

You're running into a Java quirk here (although it works as designed).
To reproduce it, run jshell locally and type this:

Float.valueOf(20220109.0f);

The result is 2.0220108E7, because floating-point values are not stored exactly and are rounded.
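The rounding can be checked outside jshell with a small standalone program. This is a minimal sketch, assuming the float mapping stores values as 32-bit IEEE 754 single precision, as Java's float does; the class and method names (FloatDatePrecision, throughFloat) are made up for illustration. A 32-bit float has a 24-bit significand, so between 2^24 (16777216) and 2^25 (33554432) only even integers are representable, and neighbouring values collapse onto the same float.

```java
public class FloatDatePrecision {
    // Round-trips an integer through a 32-bit float, mimicking what happens
    // when the value is stored in (or queried against) a float field.
    static long throughFloat(long value) {
        return (long) (float) value;
    }

    public static void main(String[] args) {
        long[] dates = {20220107L, 20220108L, 20220109L};
        for (long d : dates) {
            // All three inputs print the same stored value: 20220108
            System.out.println(d + " -> " + throughFloat(d));
        }
        // Distance between adjacent floats at this magnitude: 2.0
        System.out.println("ulp = " + Math.ulp(20220108f));
    }
}
```

Because 20220107, 20220108 and 20220109 all become the same float, a match query on the float field cannot tell them apart, while a long field keeps every integer in this range exact.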

You can use the reindex functionality to reindex your data into an index with the mapping fixed (you could also add new fields to the existing index and use update-by-query, but I am not sure that is clean).
