How do I send a csv file from machine 1 to machine 3 using NiFi running on machine 2?
I have 3 machines in my scenario:
Machine 1: has a .csv file
Machine 2: NiFi is installed and running
Machine 3: HDFS and HBase are installed and running.
Now I have to send the .csv file from machine 1 to an HBase table running on machine 3, using NiFi, which is running on machine 2.
To get the file from machine 1 I am using the GetSFTP processor, and I can get the .csv file as far as NiFi on machine 2. Now I don't know which processor to use so that I can send the file to my HBase table running on machine 3. I have used PutHBaseRecord, but that only helps me store into an HBase table if my HBase and HDFS are running on machine 2.
So can someone let me know how I can send to machine 3 using NiFi?
hbase-site.xml
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
<property>
<name>hbase.wal.provider</name>
<value>filesystem</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>./tmp</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdoop/tmpdata</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoop/hdfs/datanode</value>
</property>
</configuration>
Comments (1)
I don't think that's true. According to the documentation, you need to provide an HBase Client Service:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-hbase-nar/1.6.0/org.apache.nifi.hbase.PutHBaseRecord/
In this service, you provide Hadoop configuration files (an hbase-site.xml) which contain the IP:port information of the remote HBase cluster (via ZooKeeper), and that will not be localhost; if NiFi is given (or ZooKeeper returns) localhost, then yes, it will think HBase is running on the NiFi node. You should only need these properties set in the XML to connect to a distributed HBase cluster, as sketched below.
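For illustration, a minimal sketch of the client-side hbase-site.xml handed to NiFi's HBase Client Service (via its "Hadoop Configuration Files" property) might contain just the quorum and port; machine3.example.com is a placeholder for machine 3's real hostname, and the client port matches the 2222 configured in the question:

<configuration>
  <property>
    <!-- ZooKeeper quorum of the remote HBase cluster on machine 3, not localhost -->
    <name>hbase.zookeeper.quorum</name>
    <value>machine3.example.com</value>
  </property>
  <property>
    <!-- must match hbase.zookeeper.property.clientPort on machine 3 -->
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
  </property>
</configuration>

Note also that the question's configs use localhost throughout (e.g. hbase.rootdir); the services on machine 3 must advertise hostnames that are reachable from machine 2, or remote clients will resolve localhost to themselves.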
By the way, HBase has its own CLI commands for importing CSV files, and NiFi seems like overkill for the simple task of uploading data; an example follows.
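For example, a typical ImportTsv run (executed where the HBase client is configured, e.g. on machine 3) might look like this; the table name, column family/qualifiers, and HDFS input path are placeholders:

# load a comma-separated file into HBase; first CSV column becomes the row key
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator=',' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  my_table hdfs:///user/hadoop/data.csv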