What is the likelihood of multiple compactions overlapping in ScyllaDB?
In the open source version, Scylla recommends keeping up to 50% of disk space free for “compactions”. At the same time, the documentation states that each table is compacted independently of the others. Logically, this suggests that in an application with dozens (or even more) tables there’s only a small chance that so many compactions will coincide.
Is there a mathematical model for calculating how multiple compactions might overlap in an application with several tables? Based on a cursory analysis, it seems that the likelihood of many overlapping compactions is small, especially when we are dealing with dozens of independent tables.
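For concreteness, here is the sort of model I have in mind; the assumptions are purely illustrative (they are not from the Scylla docs): treat each table's large compaction as an independent process that is active some fraction p of the time, so the number of tables compacting simultaneously is Binomial(N, p).

```python
# Back-of-the-envelope overlap model (illustrative assumptions only):
# each of n_tables runs a large compaction independently, active a
# fraction p_active of the time, so the count of tables compacting
# at any instant is Binomial(n_tables, p_active).
from math import comb

def p_at_least_k_overlapping(n_tables: int, p_active: float, k: int) -> float:
    """Probability that at least k tables are compacting simultaneously."""
    return sum(comb(n_tables, i) * p_active**i * (1 - p_active)**(n_tables - i)
               for i in range(k, n_tables + 1))

# Example: 30 tables, each in a large compaction 5% of the time.
for k in (1, 3, 5, 10):
    print(f"P(>= {k:2d} overlapping) = {p_at_least_k_overlapping(30, 0.05, k):.6f}")
```

Under such a model the tail probability drops off quickly with k, which is what my cursory analysis suggested; whether the independence assumption actually holds is part of what I'm asking.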
You're absolutely right:
With the size-tiered compaction strategy a compaction may temporarily double the disk requirements. But it doesn't double the entire disk requirement, only the space taken by the sstables involved in this compaction (see also my blog post on size-tiered compaction and its space amplification). There is indeed a difference between "the entire disk usage" and just "the sstables involved in this compaction", for two reasons:
1. A compaction involves only some of a table's sstables - usually a few of roughly similar size - so the temporary extra space is bounded by those sstables, not by the whole data set.
2. Scylla divides each node's data into per-core shards, and each shard compacts its own sstables independently, so the shards don't normally run their biggest compactions at the same moment.
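A toy calculation with made-up numbers may help make the first point concrete: suppose a shard holds 925 GB of sstables across several tiers, and the compaction currently running merges the four 100 GB sstables of one tier. The temporary overhead is bounded by that one merge's output, not by the whole 925 GB:

```python
# Made-up sizes, just to illustrate the bound described above.
all_sstables_gb = [400, 100, 100, 100, 100, 25, 25, 25, 25, 25]  # whole shard
merge_inputs_gb = [100, 100, 100, 100]   # the one tier being compacted

resident_gb = sum(all_sstables_gb)       # 925 GB already on disk
# Worst case (no overwrites or tombstones): the merged output is as large
# as its inputs, and the inputs are deleted only after the output is sealed.
peak_extra_gb = sum(merge_inputs_gb)     # <= 400 GB temporary overhead
print(resident_gb, peak_extra_gb)        # far less than doubling all 925 GB
```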
The second reason cannot be counted on - since shards choose when to compact independently, if you're unlucky all shards may decide to compact the same table at exactly the same time and, worse, may happen to run their biggest compactions all at once. This "unluckiness" can even occur with 100% probability if you start a "major compaction" (nodetool compact).
The first reason, the one you asked about, is indeed more useful and reliable: beyond it being unlikely that all shards will choose to compact all sstables at exactly the same time, there is an important detail in Scylla's compaction algorithm which helps here: each shard only does one compaction of a (roughly) given size at a time. So if you have many roughly-equal-sized tables, no shard can be doing a full compaction of more than one of those tables at a time. This is guaranteed - it's not a matter of probability.
Of course, this "trick" only helps if you really do have many roughly-equal-sized tables. If one table is much bigger than the rest, or the tables have very different sizes, it won't help you much in controlling the maximum temporary disk use.
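Put roughly (this is my simplification, assuming full merges with no overwrites), the per-shard worst-case overhead under that guarantee scales with the largest single table rather than with the total data:

```python
# Simplified bound implied by "one compaction of a given size at a time":
# per shard, at most one table's full merge is in flight, so the temporary
# overhead is roughly that table's size, not the sum over all tables.
def worst_case_extra_gb(table_sizes_gb: list[float]) -> float:
    """Rough per-shard bound on temporary compaction space (see assumptions)."""
    return max(table_sizes_gb)

print(worst_case_extra_gb([30] * 30))        # 30 equal 30 GB tables -> ~30 GB
print(worst_case_extra_gb([600] + [10]*30))  # one huge table dominates -> ~600 GB
```

This is exactly why the trick stops helping once one table dominates the data set.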
In issue https://github.com/scylladb/scylla/issues/2871 I proposed an idea for how Scylla could guarantee that, when disk space is low, the sharding (point 2 above) is also used to reduce temporary disk space usage. We haven't implemented this idea, but instead implemented a better one - the "incremental compaction strategy", which does huge compactions in pieces ("incrementally") to avoid most of the temporary disk usage. See this blog post for how this new compaction strategy works, with graphs demonstrating how it lowers the temporary disk usage. Note that Incremental Compaction Strategy is currently part of the Scylla Enterprise version (it's not in the open-source version).
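To see why compacting in pieces caps the overhead, here is a crude accounting simulation - not ICS's actual mechanics, just the disk bookkeeping under the assumption that each output chunk is sealed before the corresponding input chunk is deleted:

```python
def peak_disk_during_compaction(input_gb: float, chunk_gb: float) -> float:
    """Peak disk usage while merging input_gb of sstables in chunk_gb pieces.

    Crude accounting only (assumption, not ICS internals): each chunk's
    output is written before the corresponding input chunk is deleted,
    which is the worst moment for disk usage.
    """
    remaining_input, output, peak = input_gb, 0.0, input_gb
    while remaining_input > 0:
        chunk = min(chunk_gb, remaining_input)
        output += chunk                             # output chunk sealed...
        peak = max(peak, remaining_input + output)  # ...input not yet freed
        remaining_input -= chunk                    # now free the input chunk
    return peak

print(peak_disk_during_compaction(400, 400))  # one big merge: peak ~800 GB
print(peak_disk_during_compaction(400, 1))    # incremental:   peak ~401 GB
```

The smaller the chunk, the closer the peak stays to the data's resident size, which is the effect the blog post's graphs show.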