I followed the installation procedure from here, and when I reach the Inspect Role Assignments stage I see only one managed host: localhost.localdomain.
Any subsequent attempt to add other hosts has the same outcome:
- each cluster host installation is successful
- but the host does not show up as managed
What am I missing?
Update: I don't like to answer my own questions, so I am writing my answer here.
The solution is so obvious that I could not see it, and I left the problem unresolved for quite some time until it hit me while doing some checks.
The hostname provided at installation time was set in /etc/hosts for the IP 127.0.0.1 and localhost.localdomain, which misled the Cloudera setup and basically made all hosts have the same IP and hostname.
I've redone the setup with hostname.domain.local, and now the hosts file features a separate line with the specific IP and hostname, and the /etc/resolv.conf file has a line with search domain.local.
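The misconfiguration described above can be detected with a small check. This is only a sketch: the function name `name_on_loopback` and its arguments are made up for illustration, and it inspects nothing but the hosts file it is given.

```shell
#!/bin/sh
# Succeeds if NAME appears on a loopback line (127.x.x.x or ::1) of HOSTS_FILE.
# name_on_loopback is a hypothetical helper; HOSTS_FILE defaults to /etc/hosts.
name_on_loopback() {
    name="$1"
    hosts_file="${2:-/etc/hosts}"
    awk -v name="$name" '
        /^[ \t]*#/ { next }                  # skip comment lines
        $1 ~ /^127\./ || $1 == "::1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # ignore trailing comments
                if ($i == name) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$hosts_file"
}

# Usage on a cluster node:
#   name_on_loopback "$(hostname)" && echo "hostname is mapped to loopback"
```

If the check succeeds for a node's own hostname, that node is likely to register with the manager under the loopback address, which matches the symptom in the question.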
Even after this unpleasant experience, I think the installation documentation should mention these small details, although it does feel like stating the obvious.
Looks like Cloudera (possibly recently) added a blurb about this to their documentation. I've been having this problem for a while, and the key for me was getting the following command to give correct results:
My solution involved setting up a local DNS server, but perhaps just having the same /etc/hosts on every node would have been sufficient. YMMV.
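The command referred to above did not survive in this copy of the answer. As a stand-in, here is one possible name-resolution check using `getent`; it is an assumption about what is useful to verify, not the command the author ran. The function name `check_resolution` is made up.

```shell
#!/bin/sh
# Forward-resolve NAME, reverse-resolve the answer, and flag loopback results.
# check_resolution is a hypothetical helper, not the lost original command.
check_resolution() {
    name="$1"
    ip=$(getent hosts "$name" 2>/dev/null | awk '{ print $1; exit }')
    if [ -z "$ip" ]; then
        echo "forward lookup failed for $name" >&2
        return 1
    fi
    back=$(getent hosts "$ip" 2>/dev/null | awk '{ print $2; exit }')
    echo "$name -> $ip -> ${back:-unknown}"
    case "$ip" in
        127.*|::1)
            echo "$name resolves to a loopback address" >&2
            return 1
            ;;
    esac
    return 0
}

# Usage on each node:
#   check_resolution "$(hostname -f)"
```

Every node should print its own routable IP, and the reverse lookup should come back to the same name.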
All right, I set up a cluster on virtual machines, so I want to share everything I did. In my cluster I created one manager node (only for Cloudera Manager), one namenode, and two datanodes. This made adding new nodes to the cluster easier and trouble-free. I also prepared a simple instruction document. It may be a little summarized, but it works. Most of the commands are taken from various sites, so I tried to keep them as simple as I could. I am adding this answer here because my setup also covers adding new hosts to the cluster.
Note: I am very new to the Linux environment and did my best; I welcome anyone who can correct my comments on usage or my explanations.
==================================================================================
These instructions were carried out on CentOS 6.2 x64 (non-live desktop version). If you use the server version, you may need to configure the network yourself.
Use the same version on all machines as far as possible. Some say the IP values of the machines matter, but I used different IP ranges, for example 192.168.12.13 on one machine and 192.168.13.144 on another, and it caused no problems.
I also used Oracle VirtualBox as the virtual machine environment on Windows 7 Enterprise.
Suggestion: once you have one common CentOS installation, create a clone of it in case a configuration goes wrong. Always keep a backup clone.
Download these files manually first:
Cloudera Manager (you can download the community edition). We need this for the master node, but that does not mean the master node is part of the cluster. I ran the manager on a machine with no namenode or job tracker, just the manager application.
Oracle JDK. You can download the proper one from the Oracle web site. Either download it from the browser, or copy the link and use wget. It is your choice.
Be sure to uninstall OpenJDK:
Install the Oracle JDK manually:
Note that the wget line may change. You can also download the file from a browser.
Make the system and browsers use the new Java:
Add your user to sudoers:
Find the line "root ALL=(ALL) ALL" and add this line below it:
//This line means that the user root can execute from ALL terminals,
//acting as ALL (any) users, and run ALL (any) command.
Install the SSH server:
Check the SSH server status to be sure it is running:
Start the sshd service if it is not started:
Or you can simply test SSH with:
After a successful test you can exit.
These instructions are also given on the Cloudera web site.
If you check the /var/log/cloudera-scm-agent/cloudera-scm-agent-log or .out files and see persistence- or Hibernate-related exceptions/errors, the problem is with the PostgreSQL database. Probably the database is not set up yet. All we need to do is set it up.
Note: PostgreSQL is only needed on the manager (master) node, not on the slaves.
Make sure the PostgreSQL instance is installed by checking the service status:
Note: the instructions below need repo configuration! If you do not know how to do that, skip to the script file usage.
Install the embedded PostgreSQL database package on the Cloudera Manager Server host:
Prepare the embedded PostgreSQL database for use with the Cloudera Manager Server by running this command:
Start the embedded PostgreSQL database by running this command:
Script file usage: the instructions below set up PostgreSQL manually with the script file.
Required parameters and their descriptions:
database-type: To connect to a MySQL database, specify mysql as the database type, or specify postgresql to connect to an external PostgreSQL database.
database-name: The name of the Cloudera Manager Server database you want to create.
username: The username for the Cloudera Manager Server database you want to create.
password: The password for the Cloudera Manager Server database you want to create. If you don't specify the password on the command line, the script will prompt you to enter it.
You can check this page for details : https://ccp.cloudera.com/display/ENT/Installation+Path+B+-+Installation+Using+Your+Own+Method#InstallationPathB-InstallationUsingYourOwnMethod-Step5%3AConfigureaDatabasefortheClouderaManagerServer
Start PostgreSQL if it is not started (you can check the status and, to be sure, restart it):
If there is a routing/firewall restriction on Linux, the agent's heartbeat will not reach the master node (manager), so we need to remove these security obstacles. In this case SELinux and iptables can cause problems. Cloudera says to disable iptables completely, but if you are experienced with iptables configuration you can add rules like this.
Open the iptables configuration and set a rule for access to port 7180
by adding this line:
Or simply (the Cloudera way) disable iptables completely. Be sure it is the same on all nodes.
Check the iptables status with the status parameter:
Note: every time the machine restarts, iptables will be activated again, so you may need a way to stop it automatically.
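The original firewall lines are missing from this copy, so the following is only a sketch of the two approaches described above, using the stock CentOS 6 tools (`iptables`, `service`, `chkconfig`). The helper names are made up, and the functions print the commands for review instead of running them, since they need root and change the firewall.

```shell
#!/bin/sh
# Print (do not run) iptables commands that accept inbound TCP on each port.
# print_port_rules and print_disable_firewall are hypothetical helpers.
print_port_rules() {
    for port in "$@"; do
        printf 'iptables -I INPUT -p tcp --dport %s -j ACCEPT\n' "$port"
    done
    # Persist the running rules across reboots (CentOS 6 init script).
    printf 'service iptables save\n'
}

# The "Cloudera way" from the text: stop the firewall now and at every boot.
print_disable_firewall() {
    printf 'service iptables stop\n'
    printf 'chkconfig iptables off\n'
}

print_port_rules 7180
```

`chkconfig iptables off` addresses the reboot note above: it keeps the service from starting again at boot. Pass any further ports from Cloudera's port list as extra arguments to `print_port_rules`.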
Any problem caused by iptables or SELinux will show up in the log file "cloudera-scm-agent.log". You may see some "deprecated" warnings about Python code; just ignore them. The errors/exceptions are generally "no route to host" or something similar.
Disable SELinux. You may need to do this before many of the operations above, especially when you try to install Cloudera Manager. Linux will warn you about SELinux.
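The original SELinux commands are not in this copy. On CentOS the persistent setting normally lives in /etc/selinux/config as a `SELINUX=` line (`enforcing`, `permissive`, or `disabled`), and `setenforce 0` switches to permissive mode until the next reboot. A minimal sketch for reading the configured mode, with a made-up function name:

```shell
#!/bin/sh
# Print the SELINUX= value from a config file (default /etc/selinux/config).
# selinux_config_mode is a hypothetical helper; it only reads the file.
selinux_config_mode() {
    config="${1:-/etc/selinux/config}"
    # Matches "SELINUX=..." but not "SELINUXTYPE=..." or comment lines.
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$config" | tail -n 1
}

# Usage:
#   [ "$(selinux_config_mode)" = "disabled" ] || echo "SELinux is not disabled"
```

Changing the value in the file takes effect only after a reboot.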
Set a unique host name for each machine. On each machine, edit this file and give that machine a name. We will use this name in the hosts file.
Update the hosts file with the IP values and host names of all nodes. Do this on all nodes. You can simply copy the file to the other nodes; all hosts files will be the same.
Example:
127.0.0.1 localhost
192.168.1.2 masternode
192.168.1.3 namenode
192.168.1.4 datanode1
192.168.1.5 datanode2
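Since every node needs the same entries, it can help to verify a hosts file against the list of node names before starting the installation. The sketch below uses the example names above; `missing_hosts` is a made-up helper.

```shell
#!/bin/sh
# Print every NAME that has no non-loopback entry in HOSTS_FILE.
# missing_hosts is a hypothetical helper: missing_hosts HOSTS_FILE NAME...
missing_hosts() {
    hosts_file="$1"
    shift
    for name in "$@"; do
        awk -v name="$name" '
            /^[ \t]*#/ || $1 ~ /^127\./ || $1 == "::1" { next }
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file" || echo "$name"
    done
}

# Usage on each node:
#   missing_hosts /etc/hosts masternode namenode datanode1 datanode2
```

Empty output means every listed node has a real (non-loopback) address in the file.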
Check the Cloudera Manager status and restart it if needed:
Be sure the internet connection is good enough on all nodes, because the manager will connect to them and start a series of downloads on each of them. If the manager comes across any problem it rolls everything back, which costs you a full restart. Trust me, this part takes a lot of time!
If you are using virtual machines as nodes (which I did), you may choose bridged network mode. That gives all nodes internet connectivity, but it has one downside: if you restart your physical machine you may lose your IP values and automatically get new ones, which can force you to update the hosts file on each node. If you use NAT or something like an internal network, you can give static IP values to your nodes so no reconfiguration is needed, but then you must provide an internet gateway IP for all machines, because not just the manager but also the agents need internet access to download files. Of course, once you finish setting up your cluster you can remove the agent (slave) nodes' internet access.
Run ifconfig when you start a virtual machine to see whether it is getting an IP value from the network. If not, the virtual machine's configuration in your VM application must be changed. If you are working on a physical machine that has both cable and wireless connectivity, you will have more than one Ethernet adapter to choose from. Be sure to choose the right one; the wrong one will not give you an IP address.
Be sure to use the Oracle JDK.
Check the Cloudera SCM status from time to time.
Check that 7180 and the other Cloudera Manager related ports are being listened on. You can use "nmap" or "netstat --listen".
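As a sketch of the port check, the helper below reads `netstat -ltn` output (listening, TCP, numeric) on standard input and reports whether a given port is in the LISTEN state. The function name `is_listening` is made up.

```shell
#!/bin/sh
# Reads `netstat -ltn` style output on stdin; succeeds if PORT is listening.
# is_listening is a hypothetical helper: netstat -ltn | is_listening PORT
is_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            n = split($4, parts, ":")        # local address is field 4
            if (parts[n] == port) found = 1  # port is the last ":" part
        }
        END { exit found ? 0 : 1 }
    '
}

# Usage on the manager node:
#   netstat -ltn | is_listening 7180 && echo "port 7180 is listening"
```

Splitting on the last colon keeps the check working for IPv6 lines such as `:::7180`.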
If you are unable to install Cloudera Manager on the master node (probably a SELinux, PostgreSQL, or download problem; by the way, make sure the download is not interrupted), you may need to clean up and start over.
This line will clean up the Cloudera-related files and let you start again:
You can restart cloudera-scm-agent on the slave nodes if you change anything, to be sure the processes are working correctly. But you should clean the log files first so you can see whether the new configuration works properly. Log files are important for seeing what is going wrong or right.
The next steps are adding hosts from the Cloudera Manager web interface:
On the manager machine I used "localhost:7180" to connect to the manager GUI. In the hosts section you will see the option to add new hosts to the cluster. Just enter the name of the node in the text box and press the "Find Hosts" button. The host names are already defined in the /etc/hosts file, if you remember, so you can use either the IP or the host name in the text box; if they are set up correctly, the manager will find the matching hosts and list them. If they are not managed yet (meaning nothing is installed on them yet), the "currently managed" column shows "no"; otherwise it shows "yes".
After that you can continue to install the Cloudera agent and Hadoop files on the chosen hosts. If you have already installed them (if they are managed), you can begin adding services to them. Just go to the "Services" page and continue. If you set up the hosts correctly and see that they are managed, adding services is very easy and trouble-free (at least it was for me).
Please send any comments about my answer. It is rather long, maybe unnecessarily so, but I tried to include every detail.
I also had a similar problem. Cloudera Manager was able to install all the components, but the hosts weren't showing up in the managed hosts list.
In my case the IP/DNS name configuration was fine. I was able to do lookups successfully.
Later I realized that Cloudera needs a bunch of ports to manage the nodes, and additional ports are needed for the various Hadoop services. To see whether this is the problem, you can turn off the firewall temporarily. If it is, refer to Cloudera's documentation for the list of ports. Currently it's located at:
https://ccp.cloudera.com/display/ENT4DOC/Configuring+Ports+for+Cloudera+Manager
To solve this error, I did three things:
1) vim /etc/cloudera-scm-agent/config.ini
Originally it was
Changed hostname to:
Also make sure the 'manager' is added in the /etc/hosts file
2) Installed Java in the /usr/local/java/jdk1.7xxx directory.
In ~/.bash_profile
I included the following:
A soft link can also be used for this purpose:
Cloudera probably takes the java path as '/usr/java'. So I created a symbolic link in the /usr directory.
3) When it still did not work, I installed MySQL Connector using the following:
Restart server and restart agents. It worked for me then.
If you tried all the suggestions above and you still can't add the new host to the cluster,
please try this:
Follow the "Uninstall Cloudera Manager Agent and Managed Software" steps from "Uninstalling Cloudera Manager and Managed Software".
The reason:
The Cloudera Manager agent is written in Python. If a previous installation failed, some zombie processes can stay on your new host, and this is hard to notice.
You can check the /etc/hostname file. It should have the hostname followed by the FQDN.
HOSTNAME=hostname.fqdn
Then you can run this command as well:
hostname
more /etc/hostname
(use backticks ``, not single quotes '')
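To compare the configured name with what the system reports, a sketch like the following may help. It assumes the file either contains a `HOSTNAME=` line, as in the example above, or just the bare name on its first line; `configured_hostname` is a made-up helper.

```shell
#!/bin/sh
# Print the HOSTNAME= value from FILE, or the first line if there is no such key.
# configured_hostname is a hypothetical helper; it only reads the file.
configured_hostname() {
    file="$1"
    value=$(sed -n 's/^HOSTNAME=//p' "$file" | head -n 1)
    if [ -z "$value" ]; then
        value=$(head -n 1 "$file")
    fi
    printf '%s\n' "$value"
}

# Usage:
#   [ "$(configured_hostname /etc/hostname)" = "$(hostname -f)" ] ||
#       echo "configured name and hostname -f differ"
```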