Hadoop DistCp: which ports are used?
If I want to use DistCp on an on-prem Hadoop cluster to 'push' data to external cloud storage, what firewall considerations must be made in order to use this tool? Which ports does the actual data transfer take place on? Is it via SSH, and/or port 8020? I need to make sure network connectivity is provided from source to destination, but with the least privilege possible (i.e., opening only the ports that are absolutely needed).
Comments (1)
I do not believe SSH is used for the actual data transfer; it only comes into play when you log into the cluster to start the command, for example.
At a minimum, it would be the RPC data-transfer ports for the NameNodes and DataNodes, so whatever you've configured for fs.defaultFS, dfs.namenode.rpc-address, and dfs.datanode.address.
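To find out which ports those three properties resolve to on a particular cluster, something like the following sketch could help. It only parses Hadoop-style configuration XML and pulls out the port of each property. The sample XML, the hostname nn.example.internal, and the values 8020 and 9866 are placeholders assumed for illustration (they are common defaults, but your cluster may differ), and the helper names extract_port and ports_from_conf are made up for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical sample config. On a real cluster these values live in
# core-site.xml and hdfs-site.xml, and the ports may be different.
SAMPLE_CONF = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://nn.example.internal:8020</value></property>
  <property><name>dfs.namenode.rpc-address</name><value>nn.example.internal:8020</value></property>
  <property><name>dfs.datanode.address</name><value>0.0.0.0:9866</value></property>
</configuration>"""

# The properties named in the answer above.
KEYS = ("fs.defaultFS", "dfs.namenode.rpc-address", "dfs.datanode.address")


def extract_port(value):
    """Return the port from 'host:port' or 'scheme://host:port', else None."""
    # urlparse only recognises the host:port part when a '//' prefix is present.
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.port


def ports_from_conf(xml_text, keys=KEYS):
    """Map each requested property name found in the XML to its port."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in keys:
            value = prop.findtext("value", default="").strip()
            found[name] = extract_port(value)
    return found


if __name__ == "__main__":
    for name, port in ports_from_conf(SAMPLE_CONF).items():
        print(f"{name}: {port}")
```

To use it against a real cluster, read the contents of core-site.xml and hdfs-site.xml and pass each one to ports_from_conf instead of SAMPLE_CONF; the resulting ports are the candidates for the firewall rules.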