How to extract a single file from a remote archive file?

Posted 2024-09-07 08:14:06

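Given:

The URL of an archive (e.g. a zip file)
The full name (including path) of a file inside that archive

I'm looking for a way (preferably in Java) to create a local copy of that file without downloading the entire archive first.

From my (limited) understanding this should be possible, though I have no idea how to do it. I've been using TrueZip, since it seems to support a large variety of archive types, but I have doubts about its ability to work this way. Does anyone have experience with this sort of thing?

Edit: being able to do this with tarballs and compressed tarballs is also important to me.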

So要识趣 2024-09-14 08:14:06

Well, at a minimum, you have to download the portion of the archive up to and including the compressed data of the file you want to extract. That suggests the following solution: open a URLConnection to the archive, get its input stream, wrap it in a ZipInputStream, and repeatedly call getNextEntry() and closeEntry() to iterate through all the entries in the file until you reach the one you want. Then you can read its data using ZipInputStream.read(...).

The Java code would look something like this:

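import java.io.*;
import java.net.URL;
import java.util.zip.*;

URL url = new URL("http://example.com/path/to/archive");
try (ZipInputStream zin = new ZipInputStream(url.openStream())) {
    ZipEntry ze = zin.getNextEntry();
    // getNextEntry() closes the previous entry itself, so closeEntry() is not needed
    while (ze != null && !ze.getName().equals(pathToFile)) {
        ze = zin.getNextEntry();
    }
    if (ze == null) {
        throw new FileNotFoundException(pathToFile); // reached the end without a match
    }
    // ze.getSize() can be -1 when reading from a stream, so read until the entry ends
    byte[] bytes = zin.readAllBytes(); // Java 9+
}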

This is, of course, untested.

挽梦忆笙歌 2024-09-14 08:14:06

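Contrary to the other answers here, I'd like to point out that ZIP entries are compressed individually, so (in theory) you don't need to download anything more than the central directory and the entry itself. The server needs to support the HTTP Range header for this to work.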
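As a rough illustration of the Range idea, here is a minimal sketch using only the standard library. The class name `RemoteZipTail` and both method names are made up for this example. `fetchRange` asks the server for a byte range and insists on a 206 Partial Content reply; `centralDirectory` looks for the End of Central Directory record in the last bytes of the archive and returns where the central directory starts and how long it is. It assumes a plain (non-ZIP64) archive and does not guard against the signature bytes appearing inside an archive comment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RemoteZipTail {
    /** Fetches bytes [from, to] (inclusive) of the resource with an HTTP Range request. */
    static byte[] fetchRange(URL url, long from, long to) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", "bytes=" + from + "-" + to);
        try {
            // 206 Partial Content means the server honoured the Range header.
            if (conn.getResponseCode() != HttpURLConnection.HTTP_PARTIAL) {
                throw new IOException("Server did not return a partial response");
            }
            try (InputStream in = conn.getInputStream()) {
                return in.readAllBytes(); // Java 9+
            }
        } finally {
            conn.disconnect();
        }
    }

    /** Returns {offset, size} of the central directory, parsed from the archive's tail. */
    static long[] centralDirectory(byte[] tail) {
        // The End of Central Directory record is at least 22 bytes and starts with "PK\5\6".
        for (int i = tail.length - 22; i >= 0; i--) {
            if (tail[i] == 0x50 && tail[i + 1] == 0x4b
                    && tail[i + 2] == 0x05 && tail[i + 3] == 0x06) {
                // Size is stored at byte 12 of the record, offset at byte 16.
                return new long[] { le32(tail, i + 16), le32(tail, i + 12) };
            }
        }
        throw new IllegalArgumentException("End of Central Directory record not found");
    }

    /** Reads an unsigned 32-bit little-endian integer. */
    private static long le32(byte[] b, int p) {
        return (b[p] & 0xFFL) | (b[p + 1] & 0xFFL) << 8
                | (b[p + 2] & 0xFFL) << 16 | (b[p + 3] & 0xFFL) << 24;
    }
}
```

From there a client would fetch the central directory with a second range request, find the wanted entry's local header offset in it, and fetch only that entry's bytes. Because the record can be followed by a comment of up to 65535 bytes, the tail request would need to cover up to 65557 bytes in the worst case.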

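The standard Java API only supports reading ZIP files from local files and input streams. As far as I know, there is no provision for reading from random-access remote files.

Since you're using TrueZip, I recommend implementing de.schlichtherle.io.rof.ReadOnlyFile using Apache HTTP Client and creating a de.schlichtherle.util.zip.ZipFile with that.

This won't provide any advantage for compressed TAR archives, since the entire archive is compressed together (beyond just using an InputStream and killing it once you have your entry).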
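For the compressed tarball case, the "read the stream and stop once you have your entry" approach might look like the sketch below. The class `TarGzEntryReader` and its methods are hypothetical names for this example. It uses only the standard library and hand-parses the 512-byte tar headers, so it assumes plain headers with names of at most 100 bytes (no PAX or GNU long-name extensions) and does not verify checksums. A real project would more likely use an existing tar library.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class TarGzEntryReader {
    /** Reads one entry from a gzip-compressed tar stream, or returns null if it is absent. */
    static byte[] readEntry(InputStream gzippedTar, String wanted) throws IOException {
        DataInputStream in = new DataInputStream(new GZIPInputStream(gzippedTar));
        byte[] header = new byte[512];
        while (true) {
            try {
                in.readFully(header);
            } catch (EOFException e) {
                return null; // stream ended without a match
            }
            if (header[0] == 0) {
                return null; // an all-zero block marks the end of the archive
            }
            String name = field(header, 0, 100);
            long size = Long.parseLong(field(header, 124, 12).trim(), 8); // octal
            if (name.equals(wanted)) {
                byte[] data = new byte[(int) size];
                in.readFully(data);
                return data; // the caller can now close the stream and stop downloading
            }
            skipFully(in, (size + 511) / 512 * 512); // entry data is padded to 512 bytes
        }
    }

    /** Extracts a NUL-terminated ASCII field from a tar header. */
    private static String field(byte[] h, int off, int len) {
        int end = off;
        while (end < off + len && h[end] != 0) {
            end++;
        }
        return new String(h, off, end - off, StandardCharsets.US_ASCII);
    }

    /** Skips exactly n bytes, since InputStream.skip may skip fewer than requested. */
    private static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    throw new EOFException();
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

Passing `url.openStream()` as the first argument and closing it after the call returns means only the part of the archive up to and including the wanted entry is transferred and decompressed.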

殤城〤 2024-09-14 08:14:06

Since TrueZIP 7.2, there is a new client API in the TrueZIP Path module. It is an implementation of an NIO.2 FileSystemProvider for JSE 7. Using this API, you can access an HTTP URI as follows:

Path path = new TPath(new URI("http://acme.com/download/everything.tar.gz/README.TXT"));
try (InputStream in = Files.newInputStream(path)) {
    // Read archive entry contents here.
    ...
}

鸵鸟症 2024-09-14 08:14:06

I'm not sure if there's a way to pull out a single file from a ZIP without downloading the whole thing first. But, if you're the one hosting the ZIP file, you could create a Java servlet which reads the ZIP file and returns the requested file in the response:

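import java.io.*;
import java.util.zip.*;
import javax.servlet.ServletException;
import javax.servlet.http.*;

public class GetFileFromZIPServlet extends HttpServlet {
  // Hypothetical location of the hosted ZIP file; adjust to your deployment.
  private static final String ZIP_PATH = "/path/to/archive.zip";

  @Override
  public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    String pathToFile = request.getParameter("pathToFile");
    // look up the requested file in the ZIP
    try (ZipFile zip = new ZipFile(ZIP_PATH)) {
      ZipEntry entry = pathToFile == null ? null : zip.getEntry(pathToFile);
      if (entry == null) {
        response.sendError(HttpServletResponse.SC_NOT_FOUND);
        return;
      }
      // set the appropriate content type, maybe based on the file extension
      response.setContentType("...");
      // write file to the response
      try (InputStream in = zip.getInputStream(entry)) {
        in.transferTo(response.getOutputStream()); // Java 9+
      }
    }
  }
}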
