hadoop dfs -copyFromLocal src dest
My question is why we need to specify a dest. The file I am putting into HDFS does not necessarily reside entirely on the local machine, so what is the use of specifying dest in the command?
When I run the command from the command line and then run hadoop dfs -ls, I can see my file listed in HDFS, but when I create the file programmatically using
FileSystem fs = FileSystem.get(conf);
Path filenamePath = new Path("hello.txt");
fs.create(filenamePath);
and then run hadoop dfs -ls, I can't find this file.
In my core-site.xml I have the following:
<!-- In: conf/core-site.xml -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/apurv/hadoop/hdfs</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
Intuitively, it also does not make sense to me where the copied file would reside, since it might be too large to fit on a single machine.
Comments (2)
We chatted about this on Talk, and I now have a bit more time to explain it to you.
If you use FileSystem.get(conf) in your code, then what is inside the conf object matters. If you put nothing into it (and no core-site.xml is found on the classpath), the FileSystem returned is always the local one. If you set the default file system in your conf to your namenode's URI, for example the fs.default.name value hdfs://localhost:54310 from your core-site.xml, then you are connected to HDFS via the namenode on that "server" and you are able to write to HDFS.
If you want the configuration to read the XML files, you have to use the #addResource() methods. Look into the documentation here:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/conf/Configuration.html
Once you have added hdfs-site.xml this way, all of its mappings will be inside your conf. Play around with it a bit; it really feels intuitive. At least for me ;)
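A minimal sketch of the steps described in this answer, assuming the Hadoop client libraries are on the classpath. The class name HdfsCreateSketch and the helper writeAndClose are illustrative names, not part of Hadoop, and the location of core-site.xml is taken from the first command-line argument rather than assumed. The sketch also closes the output stream, since a file created through FileSystem#create(Path) may not show up in a listing until its stream is closed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class HdfsCreateSketch {

    /**
     * Creates the target file on whatever file system the configuration
     * selects, writes one line, closes the stream, and returns the fully
     * qualified path (scheme and authority included).
     */
    static Path writeAndClose(Configuration conf, Path target) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        // try-with-resources closes the stream when the block exits.
        try (FSDataOutputStream out = fs.create(target)) {
            out.write("hello\n".getBytes(StandardCharsets.UTF_8));
        }
        return fs.makeQualified(target);
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        if (args.length > 0) {
            // args[0]: path to your core-site.xml (supplied by the caller).
            conf.addResource(new Path(args[0]));
        }
        Path written = writeAndClose(conf, new Path("hello.txt"));
        // The printed scheme shows whether the file went to hdfs:// or file://.
        System.out.println("Wrote " + written);
    }
}
```

If the printed path starts with file:, the configuration did not pick up fs.default.name and the file was written to the local file system, which would explain why hadoop dfs -ls does not show it.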
FileSystem#create(Path) opens a stream to the indicated path. The stream has to be closed before the file is visible.
Not sure exactly what you mean, but the destination specifies the target location.
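To illustrate what "target location" means here: dest is a path in the file system's namespace, not a physical machine. A relative path such as hello.txt is resolved against the default file system and a working directory, and on HDFS that working directory defaults to the user's home directory, /user/<username>. The sketch below is a simplified model built only on java.net.URI, not Hadoop's own resolution code. The class name PathResolutionSketch, the method qualify, and the directories /user/apurv and /home/apurv/work are assumptions made for illustration.

```java
import java.net.URI;

class PathResolutionSketch {

    /**
     * Simplified model: qualifies a possibly relative path against a
     * default file system URI and a working directory.
     */
    static URI qualify(String defaultFs, String workingDir, String path) {
        // A trailing slash makes URI.resolve treat the directory as a folder.
        String dir = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        URI base = URI.create(defaultFs).resolve(dir);
        return base.resolve(path);
    }

    public static void main(String[] args) {
        // With fs.default.name pointing at the namenode from the question.
        System.out.println(qualify("hdfs://localhost:54310", "/user/apurv", "hello.txt"));
        // An absolute dest ignores the working directory.
        System.out.println(qualify("hdfs://localhost:54310", "/user/apurv", "/data/hello.txt"));
        // With an empty configuration the default is the local file system.
        System.out.println(qualify("file:///", "/home/apurv/work", "hello.txt").getPath());
    }
}
```

The same relative name therefore lands in different places depending on which file system the configuration selects, which is why a file created with an unconfigured conf does not appear under hadoop dfs -ls.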