How do I get the filename of a file inside a gzip in Java?

Posted on 2024-09-28 06:28:15

int BUFFER_SIZE = 4096;
byte[] buffer = new byte[BUFFER_SIZE];
try {
    InputStream input = new GZIPInputStream(new FileInputStream("a_gunzipped_file.gz"));
    OutputStream output = new FileOutputStream("current_output_name");
    // Copy the decompressed bytes until read() signals end of stream (-1).
    int n = input.read(buffer, 0, BUFFER_SIZE);
    while (n >= 0) {
        output.write(buffer, 0, n);
        n = input.read(buffer, 0, BUFFER_SIZE);
    }
    output.close();
    input.close();
} catch (IOException e) {
    System.out.println("error: \n\t" + e.getMessage());
}

Using the code above I can successfully extract a gzip's contents, although the extracted file's name will, as expected, always be current_output_name (I know that's because I declared it that way in the code). My problem is that I don't know how to get the file's name while it is still inside the archive.

Although java.util.zip provides ZipEntry, I couldn't use it on gzip files.
Any alternatives?

Comments (5)

凉城 2024-10-05 06:28:15

I partly agree with Michael Borgwardt's reply, but it is not entirely true: the gzip file specification includes an optional file name stored in the header of the .gz file. Sadly, as far as I know, there is no way to get that name in current Java (1.6). As seen in the implementation of GZIPInputStream in OpenJDK, in the readHeader method,

they skip over the file name instead of reading it:

// Skip optional file name
if ((flg & FNAME) == FNAME) {
      while (readUByte(in) != 0) ;
}

I have modified the GZIPInputStream class to read the optional filename from the gzip archive (I'm not sure whether I am allowed to do that; download the original version from here). You only need to add a member String filename; to the class and change the code above to:

 // Read the optional file name instead of skipping it
 if ((flg & FNAME) == FNAME) {
      filename = "";
      int _byte = 0;
      while ((_byte = readUByte(in)) != 0) {
           filename += (char) _byte;
      }
 }

This worked for me.
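
If patching a JDK class is not an option, the same header field can probably be read with a few lines of plain Java before the stream is handed to GZIPInputStream. The following is only a sketch based on the header layout in RFC 1952; the class name GzipHeaderName and the method readOriginalName are made up for illustration and are not part of any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a JDK class: parses the start of a gzip member by hand.
public class GzipHeaderName {
    private static final int FEXTRA = 0x04; // FLG bit 2: extra field present
    private static final int FNAME = 0x08;  // FLG bit 3: original file name present

    /** Returns the FNAME field of the first gzip member, or null if it is absent. */
    public static String readOriginalName(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int id1 = data.readUnsignedByte();
        int id2 = data.readUnsignedByte();
        if (id1 != 0x1f || id2 != 0x8b) {
            throw new IOException("Not in GZIP format");
        }
        data.readUnsignedByte();            // CM (compression method)
        int flg = data.readUnsignedByte();  // FLG
        data.readFully(new byte[6]);        // MTIME (4 bytes), XFL, OS
        if ((flg & FEXTRA) != 0) {
            // XLEN is a 2-byte little-endian length followed by that many bytes.
            int xlen = data.readUnsignedByte() | (data.readUnsignedByte() << 8);
            data.readFully(new byte[xlen]);
        }
        if ((flg & FNAME) == 0) {
            return null;
        }
        // The name is zero-terminated and encoded as ISO 8859-1 (Latin-1).
        ByteArrayOutputStream name = new ByteArrayOutputStream();
        int b;
        while ((b = data.readUnsignedByte()) != 0) {
            name.write(b);
        }
        return new String(name.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

One way to use it would be to open the file once to read the name, then open it again with GZIPInputStream to decompress. A null result means the writer did not store a name, in which case the only fallback is the name of the .gz file itself.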

烛影斜 2024-10-05 06:28:15

Apache Commons Compress offers two options for obtaining the filename:

With metadata (Java 7+ sample code)

try (GzipCompressorInputStream gcis =
         new GzipCompressorInputStream(new FileInputStream("a_gunzipped_file.gz"))) {
    String filename = gcis.getMetaData().getFilename();
}

With "the convention"

 String filename = GzipUtils.getUncompressedFilename("a_gunzipped_file.gz");
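
For readers who cannot add the library, the "convention" variant is easy to approximate by hand. The sketch below is a simplified stand-in, not the Commons Compress implementation: the class GzipNameConvention and the method guessUncompressedName are hypothetical, and only the .gz and .tgz suffixes are handled here.

```java
import java.util.Locale;

// Hypothetical helper: guesses the uncompressed name from the archive's own name.
public class GzipNameConvention {
    public static String guessUncompressedName(String gzName) {
        String lower = gzName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".tgz")) {
            // By convention .tgz is shorthand for .tar.gz
            return gzName.substring(0, gzName.length() - 4) + ".tar";
        }
        if (lower.endsWith(".gz") && gzName.length() > 3) {
            return gzName.substring(0, gzName.length() - 3);
        }
        return gzName; // no known suffix: leave the name unchanged
    }
}
```

This never looks inside the file, so it gives the wrong answer if the .gz file was renamed after compression.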


秋风の叶未落 2024-10-05 06:28:15

Actually, the GZIP file format allows the original filename to be specified: a member whose FLG field has the FLG.FNAME bit set carries the name in its header. I do not see a way to do this in the Java libraries, though.

http://www.gzip.org/zlib/rfc-gzip.html#specification

泡沫很甜 2024-10-05 06:28:15

Following the answers above, here is an example that creates a file "myTest.csv.gz" which decompresses to "myTest.csv". Note that with GZIPOutputStream you can't set the internal file name, and you can't add more files to the .gz file.

@Test
public void gzipFileName() throws Exception {
    File workingFile = new File( "target", "myTest.csv.gz" );
    GZIPOutputStream gzipOutputStream = new GZIPOutputStream( new FileOutputStream( workingFile ) );

    PrintWriter writer = new PrintWriter( gzipOutputStream );
    writer.println("hello,line,1");
    writer.println("hello,line,2");
    writer.close();

}
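
To see what GZIPOutputStream actually puts in the header, one can compress a few bytes in memory and look at the FLG byte (offset 3 in RFC 1952). This is a small sketch with hypothetical names (GzipFlagCheck, headerFlags, hasOriginalName). It assumes the JDK's usual behaviour of writing a fixed header with no name field, in which case the name "myTest.csv" comes only from dropping the .gz suffix.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: inspects the header that GZIPOutputStream writes.
public class GzipFlagCheck {
    /** Compresses the text in memory and returns the FLG byte of the gzip header. */
    public static int headerFlags(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        byte[] data = bytes.toByteArray();
        return data[3] & 0xff; // header layout: ID1, ID2, CM, FLG, ...
    }

    /** True if the FNAME bit (0x08) is set, i.e. a name is stored in the header. */
    public static boolean hasOriginalName(String text) throws IOException {
        return (headerFlags(text) & 0x08) != 0;
    }
}
```

If hasOriginalName returns false for your JDK, a tool such as gunzip has nothing to read from the header and falls back to the archive's file name.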

三月梨花 2024-10-05 06:28:15

Gzip is purely compression. There is no archive, it's just the file's data, compressed.

The convention is for gzip to append .gz to the filename, and for gunzip to remove that extension. So, logfile.txt becomes logfile.txt.gz when compressed, and again logfile.txt when it's decompressed. If you rename the file, the name information is lost.
