Difference between hadoop fs -put and hadoop fs -copyFromLocal

Posted on 2024-12-10 08:37:47

-put and -copyFromLocal are documented as identical, while most examples use the verbose variant -copyFromLocal. Why?

Same thing for -get and -copyToLocal

Comments (6)

黑白记忆 2024-12-17 08:37:47

Latest

There is no difference between -copyFromLocal and the -put command.

Reference: Hadoop's documentation.


Earlier

-copyFromLocal is similar to the -put command, except that the source is restricted to a local file reference.

So basically, anything you can do with -copyFromLocal you can also do with -put, but not vice versa.

Similarly,

-copyToLocal is similar to the -get command, except that the destination is restricted to a local file reference.

Hence, you can use -get instead of -copyToLocal, but not the other way round.

稚然 2024-12-17 08:37:47

Let's take an example:
Suppose your HDFS contains the path /tmp/dir/abc.txt.
If your local disk also contains this path, the HDFS API won't know which one you mean unless you specify a scheme such as file:// or hdfs://. It may pick the path you did not want to copy.

That is why -copyFromLocal exists: by restricting the source argument to the local filesystem, it prevents you from accidentally copying the wrong file.

-put is for more advanced users who know which scheme to put in front.

New Hadoop users are often confused about which filesystem they are currently working in and where their files actually are.
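
One way to sidestep the ambiguity described above is to spell out the scheme on both sides. The following is only a sketch: it builds and prints the command instead of running it, the helper name explicit_put_cmd is made up for illustration, and the NameNode address hdfs://namenode.example.com:8020 is a placeholder that would have to match your cluster.

```shell
#!/bin/sh
# Hypothetical helper: print a -put command with explicit schemes on both paths.
# $1 = absolute local path
# $2 = HDFS base URI (placeholder value below, not a real cluster)
# $3 = absolute path inside HDFS
explicit_put_cmd() {
  case "$1" in
    /*) ;;  # absolute local path, accepted
    *)  echo "local path must be absolute: $1" >&2; return 1 ;;
  esac
  printf 'hadoop fs -put file://%s %s%s\n' "$1" "$2" "$3"
}

# Same path on both filesystems, but no doubt about which side is which.
explicit_put_cmd /tmp/dir/abc.txt hdfs://namenode.example.com:8020 /tmp/dir/abc.txt
```

The helper does no quoting, so paths containing spaces would need extra care before the printed command is actually run.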

只是一片海 2024-12-17 08:37:47

Despite what the documentation claims, as of now (Oct. 2015) -copyFromLocal and -put are the same.

From the online help:

[cloudera@quickstart ~]$ hdfs dfs -help copyFromLocal 
-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst> :
  Identical to the -put command.

This is confirmed by the sources, where the CopyFromLocal class extends the Put class without adding any new behavior:

  public static class CopyFromLocal extends Put {
    public static final String NAME = "copyFromLocal";
    public static final String USAGE = Put.USAGE;
    public static final String DESCRIPTION = "Identical to the -put command.";
  }

  public static class CopyToLocal extends Get {
    public static final String NAME = "copyToLocal";
    public static final String USAGE = Get.USAGE;
    public static final String DESCRIPTION = "Identical to the -get command.";
  }

As you may notice, exactly the same holds for get/copyToLocal.

情栀口红 2024-12-17 08:37:47
  • Both are the same, except that -copyFromLocal is restricted to copying from the local filesystem, while -put can take files from any source (another HDFS, the local filesystem, ...).
脱离于你 2024-12-17 08:37:47

They're the same. You can see this by printing the usage for hadoop (or hdfs) on the command line:

$ hadoop fs -help
# Usage: hadoop fs [generic options]
# [ . . . ]
# -copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
#   Identical to the -put command.

# -copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> :
#   Identical to the -get command.

The same goes for hdfs (the command specific to HDFS filesystems):

$ hdfs dfs -help
# [ . . . ]
# -copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
#   Identical to the -put command.

# -copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> :
#   Identical to the -get command.
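
The same check can be scripted rather than read by eye. This is a rough sketch: the function name declares_identical_to is invented here, the match relies on the help wording quoted above ("Identical to the -put command."), and other Hadoop versions may word it differently. The real CLI is only queried when hadoop is on the PATH.

```shell
#!/bin/sh
# Succeeds if the help text on stdin says the command is identical to "-$1".
# Assumes the wording "Identical to the -<name> command" shown in the help output.
declares_identical_to() {
  grep -q "Identical to the -$1 command"
}

# Only query the real CLI when it is installed; otherwise skip.
if command -v hadoop >/dev/null 2>&1; then
  for pair in copyFromLocal:put copyToLocal:get; do
    alias_cmd=${pair%%:*}   # part before the colon
    base_cmd=${pair##*:}    # part after the colon
    if hadoop fs -help "$alias_cmd" 2>/dev/null | declares_identical_to "$base_cmd"; then
      echo "-$alias_cmd is documented as identical to -$base_cmd"
    else
      echo "-$alias_cmd is NOT documented as identical to -$base_cmd in this version"
    fi
  done
else
  echo "hadoop not found on PATH; skipping check"
fi
```

Note that this only inspects the help text; it says nothing about whether the two implementations behave identically in every release.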
染年凉城似染瑾 2024-12-17 08:37:47

The -put and -copyFromLocal commands work exactly the same. You cannot use -put to copy files from one HDFS directory to another. Let's see this with an example: say your root has two directories, named 'test1' and 'test2'. If 'test1' contains a file 'customer.txt' and you try copying it to the test2 directory with

$ hadoop fs -put /test1/customer.txt /test2

it will fail with a 'no such file or directory' error, since -put looks for the file in the local file system, not in HDFS.
Both commands are meant only for copying files (or directories) from the local file system to HDFS.
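
For the HDFS-to-HDFS copy attempted in the example above, the FS shell has a separate -cp subcommand. A small dry-run sketch that picks between the two is shown here; the helper copy_cmd and its "local"/"hdfs" argument are hypothetical, and the script only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical helper: print the FS shell command for a copy into HDFS.
# $1 = where the source lives ("local" or "hdfs")
# $2 = source path, $3 = destination path in HDFS
copy_cmd() {
  case "$1" in
    local) printf 'hadoop fs -put %s %s\n' "$2" "$3" ;;  # local file system -> HDFS
    hdfs)  printf 'hadoop fs -cp %s %s\n' "$2" "$3" ;;   # HDFS -> HDFS
    *)     echo "unknown source location: $1" >&2; return 1 ;;
  esac
}

# The failing case from the example, expressed with -cp instead of -put.
copy_cmd hdfs /test1/customer.txt /test2
# A genuinely local file still goes through -put.
copy_cmd local ./customer.txt /test2
```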
