MapReduce 中的容错
我正在阅读有关 Hadoop 及其容错能力的内容。我阅读了 HDFS 并阅读了如何处理主节点和从节点的故障。但是,我找不到任何提到 MapReduce 如何执行容错的文档。特别是,当包含 Job Tracker 的主节点发生故障或任何从属节点发生故障时会发生什么?
如果有人可以向我指出一些详细解释这一点的链接和参考文献。
I was reading about Hadoop and how fault tolerant it is. I read the HDFS and read how failure of master and slave nodes can be handled. However, i couldnt find any document that mentions how the mapreduce performs fault tolerance. Particularly, what happens when the Master node containing Job Tracker goes down or any of the slave nodes goes down?
If anyone can point me to some links and references that explains this in detail.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
MapReduce层的容错能力取决于hadoop版本。对于hadoop.0.21之前的版本,没有做检查点,JobTracker失败会导致数据丢失。
然而,从 hadoop.0.21 开始的版本中,添加了检查点,JobTracker 在文件中记录其进度。当 JobTracker 启动时,它会查找此类数据,以便可以从中断处重新开始工作。
Fault Tolerance of MapReduce layer depends on the hadoop version. For versions before hadoop.0.21, no checkpointing was done and failure of JobTracker would lead to loss of data.
However, versions starting hadoop.0.21, checkpointing was added where JobTracker records its progress in a file. When a JobTracker starts up, it looks for such data, so that it can restart work from where it left off.
HADOOP 中的容错
如果 JobTracker 在指定时间内未收到来自 TaskTracker 的任何心跳
时间(默认设置为 10 分钟),JobTracker 知道关联到的工作线程
TaskTracker 失败了。当这种情况发生时,JobTracker需要重新调度所有
待处理和正在进行的任务转移到另一个TaskTracker,因为中间数据属于
失败的 TaskTracker 可能不再可用。
所有已完成的地图任务还需要
如果它们属于未完成的作业,则重新安排,因为中间结果驻留在失败的作业中
reduce 任务可能无法访问 TaskTracker 文件系统。
TaskTracker 也可以被列入黑名单。在这种情况下,列入黑名单的 TaskTracker 仍保留在
与JobTracker通信,但没有任务分配给相应的worker。当一个
属于由 a 管理的特定作业的给定数量的任务(默认情况下,该数量设置为 4)
TaskTracker失败,系统认为发生故障。
TaskTracker 发送到 JobTracker 的心跳中的一些相关信息是:
● TaskTrackerStatus
● 已重新启动
● 如果是第一次心跳
● 如果节点需要执行更多任务
TaskTrackerStatus 包含有关 TaskTracker 管理的 Worker 的信息,例如
可用的虚拟和物理内存以及有关 CPU 的信息。
JobTracker 会保存有故障的 TaskTracker 的黑名单以及最后收到的心跳
来自任务跟踪器。因此,当收到新的重新启动/第一个心跳时,JobTracker 通过使用
该信息,可以决定是否重新启动TaskTracker或从TaskTracker中删除
黑名单
之后,JobTracker 中更新 TaskTracker 的状态,并发出 HeartbeatResponse
创建的。此 HeartbeatResponse 包含 TaskTracker 要采取的下一步操作。
如果有任务要执行,TaskTracker 需要新任务(这是
Heartbeat)且不在黑名单中,则创建清理任务和设置任务(
清理/设置机制尚未得到进一步研究)。如果没有清理或
设置要执行的任务后,JobTracker 会获取新任务。当任务可用时,
LunchTaskAction 被封装在每一个中,然后 JobTracker 还会查找:
- 要被杀死的任务
- 要杀死/清理的作业
- 其输出尚未保存的任务。
所有这些操作(如果适用)都会添加到要在 HeartbeatResponse 中发送的操作列表中。
Hadoop 中实现的容错机制仅限于在给定的情况下重新分配任务。
执行失败。在这种情况下,支持两种场景:
1. 如果分配给给定 TaskTracker 的任务失败,则通过 Heartbeat 进行通信
用于通知JobTracker,如果可能的话,它会将任务重新分配给另一个节点。
2.如果某个TaskTracker失败,JobTracker会注意到故障情况,因为它不会接收到
来自该任务跟踪器的心跳。然后,JobTracker 将分配 TaskTracker 拥有的任务
到另一个任务跟踪器。
JobTracker 中也存在单点故障,因为如果它失败,整个执行就会失败。
Hadoop 中实现的容错标准方法的主要优点在于其简单性,并且似乎在本地集群中运行良好。但是,标准方法对于大型分布式基础设施来说还不够,节点之间的距离可能太大,并且重新分配任务所浪费的时间可能会降低系统速度
FAULT TOLERANCE IN HADOOP
In case the JobTracker does not receive any heartbeat from a TaskTracker for a specified period of
time (by default, it is set to 10 minutes), the JobTracker understands that the worker associated to
that TaskTracker has failed. When this situation happens, the JobTracker needs to reschedule all
pending and in progress tasks to another TaskTracker, because the intermediate data belonging to
the failed TaskTracker may not be available anymore.
All completed map tasks need also to be
rescheduled if they belong to incomplete jobs, because the intermediate results residing in the failed
TaskTracker file system may not be accessible to the reduce task.
A TaskTracker can also be blacklisted. In this case, the blacklisted TaskTracker remains in
communication with the JobTracker, but no tasks are assigned to the corresponding worker. When a
given number of tasks (by default, this number is set to 4) belonging to a specific job managed by a
TaskTracker fails, the system considers that a fault has occurred.
Some of the relevant information in the heartbeats the TaskTracker sends to the JobTracker are:
● The TaskTrackerStatus
● Restarted
● If it is the first heartbeat
● If the node requires more tasks to execute
The TaskTrackerStatus contains information about the worker managed by the TaskTracker, such as
available virtual and physical memory and information about the CPU.
The JobTracker keeps the blacklist with the faulty TaskTracker and also the last heartbeat received
from that TaskTracker. So, when a new restarted/first heartbeat is received, the JobTracker, by using
this information, may decide whether to restart the TaskTracker or to remove the TaskTracker from
the blacklist
After that, the status of the TaskTracker is updated in the JobTracker and a HeartbeatResponse is
created. This HeartbeatResponse contains the next actions to be taken by the TaskTracker .
If there are tasks to perform, the TaskTracker requires new tasks (this is a parameter of the
Heartbeat) and it is not in the blacklist, then cleanup tasks and setup tasks are created (the
cleanup/setup mechanisms have not been further investigated yet). In case there are not cleanup or
setup tasks to perform, the JobTracker gets new tasks. When tasks are available, the
LunchTaskAction is encapsulated in each of them, and then the JobTracker also looks up for:
-Tasks to be killed
-Jobs to kill/cleanup
-Tasks whose output has not yet been saved.
All this actions, if they apply, are added to the list of actions to be sent in the HeartbeatResponse.
The fault tolerance mechanisms implemented in Hadoop are limited to reassign tasks when a given
execution fails. In this situation, two scenarios are supported:
1. In case a task assigned to a given TaskTracker fails, a communication via the Heartbeat is
used to notify the JobTracker, which will reassign the task to another node if possible.
2. If a TaskTracker fails, the JobTracker will notice the faulty situation because it will not receive
the Heartbeats from that TaskTracker. Then, the JobTracker will assign the tasks the TaskTracker had
to another TaskTracker.
There is also a single point of failure in the JobTracker, since if it fails, the whole execution fails.
The main benefits of the standard approach for fault tolerance implemented in Hadoop consists on its simplicity and that it seems to work well in local clusters However, the standard approach is not enough for large distributed infrastructures the distance between nodes may be too big, and the time lost in reassigning a task may slow the system
Master节点(NameNode)是hadoop中的单点故障。如果出现故障,系统将不可用。
从属(计算)节点故障没有问题,故障时在其上运行的任何内容都只需在不同的节点上重新运行。事实上,即使节点运行缓慢,也可能会发生这种情况。
有些项目/公司希望消除单点故障。如果您有兴趣,谷歌搜索“hadoop ha”(高可用性)应该会让您上路。
The Master node (NameNode) is a single point of failure in hadoop. If it goes down, the system is unavailable.
Slave (Computational) node failures are fine, and anything running on them at the time of failure are simply rerun on a different node. In fact this may occur even if a node is running slowly.
There are some projects / companies looking to eliminate the single point of failure. Googling "hadoop ha" (High availablity) should get you on your way if you're interested.