What is the meaning of dfs.replication.max?

Posted on 2025-01-10 08:34:04

Regarding HDFS:

What is the meaning of dfs.replication.max?

From the docs: https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

It says only: "Maximal block replication."

But I still don't understand what this means.

Comments (1)

沐子 2025-01-17 08:34:04

Let's think this through. We have a default replication factor (dfs.replication), and this is typically 3.

Why have a max? Maybe you do a lot of maintenance and regularly take a node out of the cluster. As you [take nodes out] and [put nodes back into] the cluster, it's reasonable to think a block might end up with 4 replicas as nodes leave and return. Given your regular maintenance, that can be a good situation: an extra copy hanging around means maintenance doesn't always trigger a lot of re-replication. So you might accept 4 replicas as the maximum. Taken to the extreme, things get out of hand if you have 50 replicas of a file: that is just too much duplication, and it starts to eat into HDFS space. Think of the max as the point at which you might start to cull extra replicas.
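To make the idea of a bound concrete, here is a small Python sketch. It is not Hadoop code: the function names `check_replication` and `raw_bytes` are hypothetical and only model the idea. The constants mirror the defaults listed in hdfs-default.xml for r2.6.0 (dfs.replication = 3, dfs.namenode.replication.min = 1, dfs.replication.max = 512). As far as I know, the NameNode treats dfs.replication.max as an upper limit on the replication factor a client may request for a file, and refuses requests above it.

```python
# Illustrative model only; NOT Hadoop's implementation.
# The constants mirror the defaults in hdfs-default.xml (r2.6.0).
DEFAULT_REPLICATION = 3  # dfs.replication
MIN_REPLICATION = 1      # dfs.namenode.replication.min
MAX_REPLICATION = 512    # dfs.replication.max


def check_replication(requested, min_repl=MIN_REPLICATION, max_repl=MAX_REPLICATION):
    """Hypothetical helper: accept a requested replication factor only if it
    lies within [min_repl, max_repl]; otherwise raise ValueError."""
    if requested < min_repl:
        raise ValueError(
            f"requested replication {requested} is below the minimum {min_repl}"
        )
    if requested > max_repl:
        raise ValueError(
            f"requested replication {requested} exceeds the maximum {max_repl}"
        )
    return requested


def raw_bytes(file_size_bytes, replication):
    """Hypothetical helper: raw cluster storage used by a file, which is
    simply its size multiplied by the number of replicas."""
    return file_size_bytes * replication


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # With the maximum lowered to 4 (the figure used in the answer above),
    # a request for 4 replicas passes and a request for 50 is refused.
    print(check_replication(4, max_repl=4))
    try:
        check_replication(50, max_repl=4)
    except ValueError as err:
        print("rejected:", err)
    # 50 replicas of a 1 GiB file occupy 50 GiB of raw disk.
    print(raw_bytes(one_gib, 50) // one_gib, "GiB")
```

The storage helper shows why a very high replication factor gets expensive: raw usage grows linearly with the replica count, so a lower maximum caps how much space any single file can claim.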
