Difference between hadoop fs -put and hadoop fs -copyFromLocal
-put and -copyFromLocal are documented as identical, while most examples use the verbose variant -copyFromLocal. Why?

The same goes for -get and -copyToLocal.

Latest: There is no difference between -copyFromLocal and the -put command. Reference: Hadoop's documentation.

Earlier: -copyFromLocal is similar to the -put command, except that the source is restricted to a local file reference. So you can do with -put everything that you can do with -copyFromLocal, but not vice versa.

Similarly, -copyToLocal is similar to the -get command, except that the destination is restricted to a local file reference. Hence you can use -get instead of -copyToLocal, but not the other way round.

Let's work through an example.

If your HDFS contains the path /tmp/dir/abc.txt and your local disk also contains this path, the hdfs API won't know which one you mean unless you specify a scheme like file:// or hdfs://. It may pick the path you did not want to copy.

That is why you have -copyFromLocal, which prevents you from accidentally copying the wrong file by limiting the parameter you give to the local filesystem. -put is for more advanced users who know which scheme to put in front.

New Hadoop users are often confused about which filesystem they are currently in and where their files actually are.
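
To make the point about schemes concrete, here is a minimal sketch that uses only java.net.URI from the JDK. It is not the Hadoop API and does not claim to reproduce how -put resolves its arguments. The default filesystem address hdfs://namenode:8020/ is a made-up value, and the helper name qualify is hypothetical. It only shows that a path without a scheme depends on whatever default is applied to it, while a path with an explicit scheme does not.

```java
import java.net.URI;

class SchemeDemo {
    // Hypothetical helper: if the path carries no scheme of its own,
    // qualify it against the given default filesystem URI.
    static URI qualify(String path, URI defaultFs) {
        URI uri = URI.create(path);
        return uri.getScheme() != null ? uri : defaultFs.resolve(uri);
    }

    public static void main(String[] args) {
        // Assumed default filesystem; a real cluster would define its own.
        URI defaultFs = URI.create("hdfs://namenode:8020/");

        // No scheme: the result depends entirely on the default.
        System.out.println(qualify("/tmp/dir/abc.txt", defaultFs).getScheme());

        // Explicit scheme: the default is ignored.
        System.out.println(qualify("file:///tmp/dir/abc.txt", defaultFs).getScheme());
    }
}
```

With these assumed values, the same path string ends up on a different filesystem depending on whether a scheme was written out, which is the ambiguity the answer above describes.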

Despite what the documentation claims, as of now (Oct. 2015) -copyFromLocal and -put are the same. The online help says as much.

This is confirmed by the sources, where you can see that the CopyFromLocal class extends the Put class without adding any new behavior.

As you might notice, exactly the same holds for get/copyToLocal.
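
The inheritance relationship mentioned in this answer can be pictured with a toy model. This is not the Hadoop source: the classes below are simplified stand-ins that only borrow the names Put and CopyFromLocal, and the run method and its return string are invented for illustration.

```java
// Toy stand-ins, NOT the real Hadoop classes.
class Put {
    // Hypothetical method: describes the copy this command would perform.
    String run(String localSrc, String dst) {
        return "copy " + localSrc + " -> " + dst;
    }
}

// The subclass adds no behavior of its own, so every call falls through to Put.
class CopyFromLocal extends Put {
}

class AliasDemo {
    public static void main(String[] args) {
        String viaPut = new Put().run("abc.txt", "/tmp/dir");
        String viaCopy = new CopyFromLocal().run("abc.txt", "/tmp/dir");
        // Both "commands" produce the same result because only one implementation exists.
        System.out.println(viaPut.equals(viaCopy));
    }
}
```

A subclass with an empty body is effectively a second name for the same implementation, which is why the two commands cannot differ in such a design.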

-copyFromLocal is restricted to copying from the local filesystem, while -put can take files from any source (another HDFS, the local filesystem, ...).

They're the same. This can be seen by printing the usage for hdfs (or hadoop) on the command line. The same goes for hdfs (the hadoop command specific to HDFS filesystems).

The -put and -copyFromLocal commands work exactly the same. You cannot use the -put command to copy files from one HDFS directory to another.

Let's see this with an example: say your root has two directories, named 'test1' and 'test2'. If 'test1' contains a file 'customer.txt' and you try copying it to the test2 directory, it will result in a 'no such file or directory' error, since -put will look for the file in the local file system and not in HDFS.

Both commands are meant only to copy files (or directories) from the local file system to HDFS.