How cost-effective is Kibana ILM?

Published 2025-01-10 17:51:47


I understand that the hot-warm(-cold-frozen-delete) lifecycle is a great tool, but I haven't found much numerical documentation: one of the few documents that gives examples with numbers (and not just feature descriptions) is this blog post. In the hot-warm example without rollup, it seems to me that the main storage optimization comes from the number of replicas:

  • one day of data = 86.4 GB
  • 7 hot days = one day of data * 7 days * 2 copies (1 primary + 1 replica) = 1.2 TB
  • 23 warm days (days 8-30) = one day of data * 23 days * 1 copy (no replica) = 1.98 TB
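The arithmetic above can be reproduced directly. A small sketch, assuming decimal units (1 TB = 1000 GB) and that the multiplier counts total copies (primary plus replicas), which is what makes the blog post's totals come out:

```python
# Storage math from the blog post's hot-warm example (no rollup).
# Assumption: the multiplier is the total copy count (primary + replicas),
# and 1 TB = 1000 GB (decimal units).

DAY_GB = 86.4  # one day of ingested data, in GB


def tier_storage_tb(days: int, copies: int, day_gb: float = DAY_GB) -> float:
    """Total storage held by a tier, in TB."""
    return days * copies * day_gb / 1000


hot_tb = tier_storage_tb(days=7, copies=2)    # 1 primary + 1 replica
warm_tb = tier_storage_tb(days=23, copies=1)  # primary only, no replica

print(f"hot:  {hot_tb:.2f} TB")   # ~1.21 TB (the post rounds to 1.2 TB)
print(f"warm: {warm_tb:.2f} TB")  # ~1.99 TB (the post rounds to 1.98 TB)
```

Dropping the replica in warm is what halves the per-day footprint between the two tiers; the cold/frozen phases are where the per-copy cost itself changes.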

There are other resources, like this webinar, yet it doesn't distinguish between storage usage and RAM usage. Is there an official document (or a third-party experiment/report) that shows whether and by how much the cold/frozen/"non-searchable snapshot after deletion" phases optimize storage usage? Or is it only about lower RAM usage?


Comments (1)

七月上 2025-01-17 17:51:47


There can't be a single "benchmark" here, since ILM is just a tool for tuning your hardware configuration to your data usage patterns.

For example, suppose you have heavy indexing and heavy searching across all of your data. In that case, you don't want to reduce the replica count for old data, and the gain would come primarily from slightly cheaper "warm" SSD storage. So the difference here would be minimal, or none at all if the separation overhead offsets that gain.

An opposite example is storing logs for compliance purposes (lots of writes but minimal reads, mostly of the last 24 hours). Then you probably want to move everything older than a week or so into the "frozen" tier, which uses S3 buckets for storage and is very cheap. Also, those shards don't count towards the cluster shard count in terms of heap usage and stability. In this case, tiered storage can turn out to be orders of magnitude cheaper than a single-tier cluster.
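As a concrete illustration of that setup, a policy along these lines (a sketch only: the phase timings and the `my_snapshot_repo` repository name are hypothetical, while the field layout follows Elasticsearch's `PUT _ilm/policy` format) drops the replica in warm and moves data onto searchable snapshots in frozen:

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": { "number_of_replicas": 0 }
        }
      },
      "frozen": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": { "snapshot_repository": "my_snapshot_repo" }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

In the frozen phase the full copy lives only in the snapshot repository (e.g. an S3 bucket), with nodes keeping just a local cache, which is where the storage saving beyond replica reduction comes from.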
