Does 'hdfs dfs -cp' use /tmp as part of its implementation?

Posted on 2025-01-23 06:57:04

I'm trying to investigate an issue where /tmp is filling up and we don't know what's causing it. We do have a recent change that uses the HDFS command to copy a file to another host (hdfs dfs -cp /source/file hdfs://other.host:port/target/file). While the copy operation doesn't directly touch or reference /tmp, it could potentially be using it as part of its implementation.

But I can't find anything in the documentation to confirm or refute that theory - does anyone else know the answer?

Comments (1)

离不开的别离 2025-01-30 06:57:04

You could look at the source code for the HDFS copy.

It uses its own internal CommandWithDestination class and does the actual write through another internal class, which is really just java.io classes. So it is buffering bytes in memory and sending them along, which makes it unlikely to be the cause of the issue. You could check this by changing the temporary directory used by Java (java.io.tmpdir):
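To make the "buffering bytes in memory" point concrete, here is a minimal sketch of a stream-to-stream copy written with plain java.io. It is a stand-in for illustration, not Hadoop's actual code; the class name BufferedCopySketch and the buffer size are assumptions of mine. The point is that a copy of this shape never creates a temporary file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class; not part of Hadoop.
public class BufferedCopySketch {

    // Copies everything from 'in' to 'out' through a fixed-size in-memory buffer.
    // Only the byte array is used as scratch space; nothing is written to /tmp.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes("UTF-8");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4);
        System.out.println("copied " + copied + " bytes");
    }
}
```

If the real copy path behaves like this, memory use is bounded by the buffer size and disk use on the client side is zero.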

export _JAVA_OPTIONS=-Djava.io.tmpdir=/new/tmp/dir

According to the java.io.File Javadoc:

The default temporary-file directory is specified by the system
property java.io.tmpdir. On UNIX systems the default value of this
property is typically "/tmp" or "/var/tmp"; on Microsoft Windows
systems it is typically "c:\temp". A different value may be given to
this system property when the Java virtual machine is invoked, but
programmatic changes to this property are not guaranteed to have any
effect upon the temporary directory used by this method.
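To confirm which temporary directory a JVM actually picked up, a small probe like the following may help. It is only a sketch: the class name TmpDirCheck and the probe file prefix are made up for illustration. It reads the java.io.tmpdir property and creates a temp file through File.createTempFile, which is the method the quoted Javadoc describes.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical illustration class; not part of Hadoop.
public class TmpDirCheck {

    // Returns the value of the java.io.tmpdir system property as seen by this JVM.
    static String configuredTmpDir() {
        return System.getProperty("java.io.tmpdir");
    }

    // Creates a small probe file in the default temporary-file directory.
    static File createProbe() throws IOException {
        File probe = File.createTempFile("tmpdir-probe-", ".tmp");
        probe.deleteOnExit();
        return probe;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + configuredTmpDir());
        File probe = createProbe();
        System.out.println("probe file     = " + probe.getAbsolutePath());
    }
}
```

Running it once with and once without the java.io.tmpdir option set should show whether the setting reaches the JVM; the same check does not prove anything about what the HDFS client itself writes.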

Method used by the HDFS copy:

protected void copyStreamToTarget(InputStream in, PathData target)
    throws IOException {
  if (target.exists && (target.stat.isDirectory() || !overwrite)) {
    throw new PathExistsException(target.toString());
  }
  TargetFileSystem targetFs = new TargetFileSystem(target.fs);
  try {
    System.out.flush();
    System.out.println("Hello Copy Stream");
    PathData tempTarget = direct ? target : target.suffix("._COPYING_");
    targetFs.setWriteChecksum(writeChecksum);
    // here's where it uses java.io to write the file to HDFS
    targetFs.writeStreamToFile(in, tempTarget, lazyPersist, direct);
    if (!direct) {
      targetFs.rename(tempTarget, target);
    }
  } finally {
    targetFs.close(); // last ditch effort to ensure temp file is removed
  }
}
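The method above writes to a "._COPYING_" file next to the target and then renames it, so the intermediate file lives at the destination rather than in /tmp. A rough local-filesystem analogue of that pattern, using java.nio.file, is sketched below. The class and method names are hypothetical, and unlike the Hadoop method this sketch always overwrites an existing target.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration class; a local-filesystem analogue, not Hadoop code.
public class CopyThenRenameSketch {

    // Streams 'in' to a sibling file named <target>._COPYING_, then renames it
    // onto the target. The intermediate file sits in the target's own directory.
    static Path copyViaSibling(InputStream in, Path target) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + "._COPYING_");
        try {
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Best-effort cleanup if the copy or the rename failed part-way.
            Files.deleteIfExists(temp);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // The demo directory itself is created under java.io.tmpdir for convenience.
        Path dir = Files.createTempDirectory("copy-sketch");
        Path target = dir.resolve("file.txt");
        copyViaSibling(new ByteArrayInputStream("abc".getBytes("UTF-8")), target);
        System.out.println("wrote " + Files.size(target) + " bytes to " + target);
    }
}
```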