Hadoop job schedulers: Capacity vs. Fair Share, or something else?

Posted 2024-09-19 11:32:52

Background

My employer is gradually moving our resource-intensive ETL and backend processing logic from MySQL to Hadoop (HDFS and Hive). At the moment everything is still fairly small and manageable (20 TB across 10 nodes), but we intend to grow the cluster over time.

Now that Hadoop is moving into production use, batch scheduling is becoming a bigger issue: the cluster has to be shared between ad hoc user Hive queries, hourly M/R jobs and, I believe, eventually some HBase usage. The fear is that a user will submit a naive query that runs for an unreasonable amount of time (say, 4 hours), clogging the task queue and potentially destabilizing the load on our infrastructure.

Question

Another part of my company has already been burned by Flume's immaturity, so my question is: how stable are the two well-known schedulers (Capacity and Fair), and are they used anywhere besides their sponsoring companies (Yahoo and Facebook)?

Edit: Background info

http://www.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/

http://hadoop.apache.org/mapreduce/docs/r0.21.0/fair_scheduler.html

http://hadoop.apache.org/mapreduce/docs/r0.21.0/capacity_scheduler.html

Comments (1)

长途伴 2024-09-26 11:32:52

We ship CDH with the Fair Share scheduler on by default. It's quite stable.
