HBase pseudo-distributed remote connection

Posted on 2024-12-29 01:24:48

I have HBase and HDFS set up and working in pseudo-distributed mode (on Mac OS X). I also have a simple Java application that works when run locally.
I would like to make it work remotely. The server sits behind a router, and all necessary ports have been forwarded.

When I try to connect remotely I get:

...
12/01/25 23:21:15 INFO zookeeper.ClientCnxn: Session establishment complete on server remote.host.com/remoteip:53058, sessionid = 0x13516f179a30005, negotiated timeout = 40000
12/01/25 23:21:36 INFO client.HConnectionManager$HConnectionImplementation: getMaster attempt 0 of 10 failed; retrying after sleep of 1000
java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=192.168.52.53/192.168.52.53:58023]

To me this means that ZooKeeper accepts the connection but gives the client the wrong address:
1) because it's a local (LAN) address
2) because it's on the wrong port
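The first point can be checked mechanically. The following is a minimal sketch, not part of HBase: the class name `AdvertisedEndpoint` and its methods are hypothetical helpers introduced here for illustration. It pulls the endpoint out of the exception message from the log above and asks the standard `java.net.InetAddress` API whether that address is private, loopback, or link-local, and therefore not reachable from outside the router.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an HBase class): inspects the endpoint named in a
// SocketTimeoutException message like the one in the log above.
final class AdvertisedEndpoint {
    // Matches "remote=<host>/<ip>:<port>" as it appears in the log above.
    private static final Pattern REMOTE =
            Pattern.compile("remote=[^/\\]]*/([0-9.]+):(\\d+)");

    final String ip;
    final int port;

    private AdvertisedEndpoint(String ip, int port) {
        this.ip = ip;
        this.port = port;
    }

    // Extracts the IP and port the client was told to connect to.
    static AdvertisedEndpoint parse(String message) {
        Matcher m = REMOTE.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("no remote=... endpoint in: " + message);
        }
        return new AdvertisedEndpoint(m.group(1), Integer.parseInt(m.group(2)));
    }

    // True if the address is private (10/8, 172.16/12, 192.168/16),
    // loopback, or link-local, i.e. not routable over the public internet.
    boolean isNonRoutable() {
        try {
            InetAddress addr = InetAddress.getByName(ip); // IP literal: no DNS lookup
            return addr.isSiteLocalAddress()
                    || addr.isLoopbackAddress()
                    || addr.isLinkLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ip, e);
        }
    }

    public static void main(String[] args) {
        AdvertisedEndpoint e = parse(
                "SocketChannel[connection-pending remote=192.168.52.53/192.168.52.53:58023]");
        System.out.println(e.ip + ":" + e.port + " non-routable: " + e.isNonRoutable());
    }
}
```

If the advertised address falls in one of these ranges, a port forward on the router's public IP does not help by itself, because the client never sends traffic to the public IP for that connection.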

I tried fixing issue #1 by setting the remote address in the HDFS core-site.xml (fs.default.name) and in hbase-site.xml (hbase.rootdir).
HDFS won't bind to the remote address. If HDFS is bound to the local address and works, HBase will not connect when it is given the remote address in hbase-site.xml (the IP and port forwarding work for sure; I checked with telnet).
I played around with /etc/hosts: whether ping -c 1 $(hostname) returns the local or the remote address, both HDFS and HBase start only when bound to the local address.
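The telnet check mentioned above can also be done from Java, which makes it easy to probe both the forwarded ZooKeeper port and the address the master advertises. This is a rough sketch using only `java.net.Socket`; the class name `PortProbe` is hypothetical, and the hosts and ports in `main` are simply the ones from the log, so substitute your own.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: a telnet-style TCP reachability probe.
final class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Endpoints taken from the log in the question; adjust to your setup.
        System.out.println("ZooKeeper endpoint reachable: "
                + canConnect("remote.host.com", 53058, 5000));
        System.out.println("Advertised master reachable:  "
                + canConnect("192.168.52.53", 58023, 5000));
    }
}
```

Run from the remote client, a successful first probe combined with a failed second one would match the symptom in the log: the forwarded port works, but the address handed back afterwards does not.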

I also tried fixing issue #2 by setting hbase.master.port in hbase-site.xml. No matter what I set, the HBase master binds to a random port.

I've wasted tons of time trying to get this right, checked all possible sources and tried every possible combination.

Comments (1)

何处潇湘 2025-01-05 01:24:48

The usual problem in this situation is that you expect to reach HBase through a single IP address from outside a NAT firewall. While this is probably possible, it is very hard to set up and almost certainly unsupported.

When a client connects to HBase, it first connects to ZooKeeper to determine which machine hosts the tables it is looking for (or which machine is the current Master, if you are performing admin operations, which seems to be the case here).

Then the client connects directly to the remote machines. If the remote machines (the HBase RegionServers, specifically) are behind a NAT router and report themselves to ZooKeeper using their internal IPs, then a machine outside the router has no way to reach the internal IP of a RegionServer inside the firewall.

The only reasonable way to make HBase work through NAT is to channel all outside requests through a proxy. There are two options for that: Thrift and REST. Much more on proxies here: http://ofps.oreilly.com/titles/9781449396107/clients.html
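To illustrate the REST option, here is a minimal sketch of building a request for a single row against an HBase REST gateway with the JDK's own HTTP client (Java 11 or later). It assumes the gateway exposes rows at `/<table>/<row>` and returns JSON when asked for `application/json`. The class name `RestGatewayRequest`, the host `gateway.example.com`, port 8080, and the table and row names are placeholders, not values from the question.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpRequest;

// Hypothetical helper: builds requests for an HBase REST gateway, so the
// client only ever talks to one forwarded host and port.
final class RestGatewayRequest {
    // Builds http://<host>:<port>/<table>/<rowKey>.
    // Assumption: table and rowKey contain no '/' characters.
    static URI rowUri(String gatewayHost, int gatewayPort, String table, String rowKey) {
        try {
            return new URI("http", null, gatewayHost, gatewayPort,
                    "/" + table + "/" + rowKey, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Builds a GET request for one row, asking for a JSON response.
    static HttpRequest getRow(String gatewayHost, int gatewayPort, String table, String rowKey) {
        return HttpRequest.newBuilder(rowUri(gatewayHost, gatewayPort, table, rowKey))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder gateway address and names; replace with your own.
        HttpRequest request = getRow("gateway.example.com", 8080, "mytable", "row1");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Only the gateway's port needs to be forwarded through the router, since the gateway talks to ZooKeeper, the Master, and the RegionServers from inside the network.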

Incidentally, you almost never want this setup: all client machines should be able to communicate directly with RegionServers, so that you don't end up with a bottleneck at your HBase proxy server.
