Big Data Project in Practice: Hadoop High Availability (HA) Architecture

Published on 2023-05-14 19:35:29

HDFS High Availability (HA)

Edit hdfs-site.xml as follows:

<configuration>
    <!-- Replication factor; 1 is only suitable for a small test cluster -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <!-- Disable HDFS permission checks (test environments only) -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

    <!-- Logical name of the HA nameservice -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>

    <!-- The two NameNode ids under nameservice "ns" -->
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn1,nn2</value>
    </property>

    <!-- RPC address of each NameNode -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn1</name>
        <value>header:8020</value>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>worker-1:8020</value>
    </property>

    <!-- HTTP (web UI) address of each NameNode -->
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>header:50070</value>
    </property>

    <property>
       <name>dfs.namenode.http-address.ns.nn2</name>
       <value>worker-1:50070</value>
    </property>

    <!-- JournalNode quorum that stores the shared edit log -->
    <property>
       <name>dfs.namenode.shared.edits.dir</name>
       <value>qjournal://header:8485;worker-1:8485;worker-2:8485/ns</value>
    </property>

    <!-- Local directory where each JournalNode keeps its edits -->
    <property>
       <name>dfs.journalnode.edits.dir</name>
       <value>/opt/modules/hadoop-2.5.0/data/jn</value>
    </property>

    <!-- Proxy class HDFS clients use to locate the active NameNode -->
    <property>
       <name>dfs.client.failover.proxy.provider.ns</name>
       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fence the old active NameNode over SSH during failover -->
    <property>
       <name>dfs.ha.fencing.methods</name>
       <value>sshfence</value>
    </property>

    <!-- Private key used by the sshfence method -->
    <property>
       <name>dfs.ha.fencing.ssh.private-key-files</name>
       <value>/home/hadoop/.ssh/id_rsa</value>
    </property>

    <!-- Enable automatic failover via ZKFC -->
    <property>
       <name>dfs.ha.automatic-failover.enabled.ns</name>
       <value>true</value>
    </property>

    <!-- ZooKeeper quorum used for failover coordination -->
    <property>
       <name>ha.zookeeper.quorum</name>
       <value>header:2181,worker-1:2181,worker-2:2181</value>
    </property>

</configuration>
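Clients address the cluster through the nameservice alias ns, which only resolves if core-site.xml points the default filesystem at it. A minimal sketch of that file (assumed here; it is not shown in this post):

```xml
<configuration>
    <!-- Default filesystem is the HA nameservice, not a single host -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns</value>
    </property>
</configuration>
```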

YARN High Availability (HA)

Edit yarn-site.xml:

<configuration>
    <!-- Enable the MapReduce shuffle service on NodeManagers -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- ResourceManager web UI -->
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>header:8088</value>
    </property>

    <!-- Aggregate container logs into HDFS -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- Retention time for aggregated logs (-1 = keep forever) -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>-1</value>
    </property>

    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- Logical id of this RM cluster -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>rm</value>
    </property>

    <!-- The two ResourceManager ids -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- Host of each ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>header</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>worker-1</value>
    </property>

    <!-- ZooKeeper quorum used for RM leader election and state -->
    <property>
       <name>yarn.resourcemanager.zk-address</name>
       <value>header:2181,worker-1:2181,worker-2:2181</value>
    </property>

    <!-- Recover running applications after an RM restart or failover -->
    <property>
       <name>yarn.resourcemanager.recovery.enabled</name>
       <value>true</value>
    </property>

    <!-- Persist ResourceManager state in ZooKeeper -->
    <property>
       <name>yarn.resourcemanager.store.class</name>
       <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>

</configuration>
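With RM HA enabled, the single yarn.resourcemanager.webapp.address above is usually replaced by one entry per RM id, so each ResourceManager serves its own web UI. A sketch using this post's hostnames (these per-RM entries are an assumption, not part of the original config):

```xml
<property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>header:8088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>worker-1:8088</value>
</property>
```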

Running and Testing

Push the updated configuration to worker-1 and worker-2 (note: if you copy the whole hadoop tree over, you also need to fix zookeeper-3.4.5/zkData/myid on each node). Deployment plan:

                 header    worker-1    worker-2
namenode         yes       yes         -
datanode         yes       yes         yes
journalnode      yes       yes         yes
zkfc             yes       yes         -
resourcemanager  yes       yes         -
nodemanager      yes       yes         yes
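The myid note above can be sketched as a small shell helper. The id-to-host mapping here is an assumption; it must match the server.N entries in your zoo.cfg, and myid_for is a hypothetical helper, not a ZooKeeper command:

```shell
# Hypothetical helper: ZooKeeper server id for each host in this cluster.
# The value written to zkData/myid must match the server.N line in zoo.cfg.
myid_for() {
  case "$1" in
    header)   echo 1 ;;
    worker-1) echo 2 ;;
    worker-2) echo 3 ;;
  esac
}
# On each node, after copying the tree over:
#   myid_for "$(hostname)" > zookeeper-3.4.5/zkData/myid
```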

Start ZooKeeper:

$ bin/zkServer.sh start  # header
$ bin/zkServer.sh start  # worker-1
$ bin/zkServer.sh start  # worker-2

Start the JournalNodes:

$ sbin/hadoop-daemon.sh start journalnode # header
$ sbin/hadoop-daemon.sh start journalnode # worker-1
$ sbin/hadoop-daemon.sh start journalnode # worker-2

Format HDFS (first-time initialization only):

$ bin/hdfs namenode -format

Initialize the HA state in ZooKeeper:

$ bin/hdfs zkfc -formatZK  # header

Start HDFS:

$ sbin/hadoop-daemon.sh start namenode  # header
$ bin/hdfs namenode -bootstrapStandby   # worker-1: sync nn1's metadata
$ sbin/hadoop-daemon.sh start namenode  # worker-1
$ sbin/hadoop-daemon.sh start zkfc      # header; the NameNode on whichever host starts ZKFC becomes active
$ sbin/hadoop-daemon.sh start zkfc      # worker-1
$ sbin/hadoop-daemon.sh start datanode  # header
$ sbin/hadoop-daemon.sh start datanode  # worker-1
$ sbin/hadoop-daemon.sh start datanode  # worker-2
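Once both NameNodes and ZKFC are up, exactly one NameNode should report "active". A shell sketch of that check; `hdfs haadmin -getServiceState` is the real command, while count_active is a hypothetical helper for counting the states:

```shell
# Hypothetical helper: count how many of the given states are "active".
# In a healthy HA pair the count should be exactly 1.
count_active() {
  n=0
  for s in "$@"; do
    [ "$s" = "active" ] && n=$((n+1))
  done
  echo "$n"
}
# On the cluster:
#   s1=$(bin/hdfs haadmin -getServiceState nn1)
#   s2=$(bin/hdfs haadmin -getServiceState nn2)
#   [ "$(count_active "$s1" "$s2")" = "1" ] || echo "unexpected HA state"
```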

Start YARN:

$ sbin/yarn-daemon.sh start resourcemanager  # header
$ sbin/yarn-daemon.sh start resourcemanager  # worker-1
$ sbin/yarn-daemon.sh start nodemanager  # header
$ sbin/yarn-daemon.sh start nodemanager  # worker-1
$ sbin/yarn-daemon.sh start nodemanager  # worker-2
$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount hdfs://header/data /user/hadoop/output/1  # run a test job
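For the wordcount example to actually run on YARN, mapred-site.xml must also declare YARN as the MapReduce framework. A minimal sketch of that file (assumed here; it is not shown in this post):

```xml
<configuration>
    <!-- Run MapReduce jobs on YARN rather than locally -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```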

Troubleshooting

$ hdfs namenode -format
18/07/07 23:02:15 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = header/172.18.179.240
...
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
18/07/07 23:02:28 WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
172.18.179.240:8485: Call From header/172.18.179.240 to header:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:875)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:922)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1354)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)
18/07/07 23:02:28 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 1 exceptions thrown:
172.18.179.240:8485: Call From header/172.18.179.240 to header:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
    at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:875)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:922)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1354)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1473)

Solution: first start ZooKeeper on every node with ./zkServer.sh start, then start the JournalNode process on each node with ./hadoop-daemon.sh start journalnode, and only then run the format.
