Ruby streaming tar/gz

Posted on 2024-12-11 10:23:22

Basically, I want to stream data from memory into tar/gz format (possibly multiple files in the tar, but it should NEVER TOUCH THE HARD DRIVE, only streaming!), then stream the result somewhere else (an HTTP request body in my case).

Anyone know of an existing library that can do this? Is there something in Rails?

libarchive-ruby is only a C wrapper and seems like it would be very platform-dependent (the docs want you to compile as an installation step?!).

SOLUTION:

require 'stringio'
require 'zlib'
require 'rubygems/package'

# Build the tar archive in an in-memory buffer.
tar = StringIO.new

Gem::Package::TarWriter.new(tar) { |writer|
  writer.add_file("a_file.txt", 0644) { |f|
    1000.times { f.write("some text\n") }
  }
  writer.add_file("another_file.txt", 0644) { |f|
    f.write("some more text\n")
  }
}
tar.seek(0)

# Gzip the tar bytes into an IO; the File here is only an example destination.
gz = Zlib::GzipWriter.new(File.new('this_is_a_tar_gz.tar.gz', 'wb'))  # Make sure you use 'wb' for binary write!
gz.write(tar.read)
tar.close
gz.close

That's it! You can swap out the File in the GzipWriter with any IO to keep it streaming. Cookies for dw11wtq!
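
Note that the solution above still builds the complete tar in a StringIO before compressing it. A minimal sketch of writing the tar straight through the GzipWriter instead, assuming each file's size is known up front so that add_file_simple can be used (add_file needs a seekable IO, which GzipWriter is not). The helper name stream_tgz and the sample data are made up for illustration:

```ruby
require 'stringio'
require 'zlib'
require 'rubygems/package'

# Hypothetical helper: writes a tar.gz of `files` (name => content) to `sink`,
# which can be any object that responds to #write (StringIO, socket, pipe, ...).
def stream_tgz(files, sink)
  gz = Zlib::GzipWriter.new(sink)
  Gem::Package::TarWriter.new(gz) do |tar|
    files.each do |name, content|
      # add_file_simple takes the size in advance, so it never seeks backwards.
      tar.add_file_simple(name, 0644, content.bytesize) { |f| f.write(content) }
    end
  end
  gz.finish # writes the gzip trailer but leaves `sink` open
  sink
end

# A binary StringIO stands in for the real destination here.
sink = StringIO.new(''.b)
stream_tgz({ 'a_file.txt'       => "some text\n" * 1000,
             'another_file.txt' => "some more text\n" }, sink)
```

Compressed bytes reach the sink as the GzipWriter's buffer fills, so only the file contents themselves have to be in memory.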

Comments (2)

旧话新听 2024-12-18 10:23:22

Take a look at the TarWriter class in rubygems (http://rubygems.rubyforge.org/rubygems-update/Gem/Package/TarWriter.html): it just operates on an IO stream, which may be a StringIO.

require 'stringio'
require 'rubygems/package'

tar = StringIO.new

Gem::Package::TarWriter.new(tar) do |writer|
  writer.add_file("hello_world.txt", 0644) { |f| f.write("Hello world!\n") }
end

tar.seek(0)

p tar.read #=> mostly padding, but a tar nonetheless

It also provides methods to add directories if you need a directory layout in the tarball.
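
As a rough illustration of the directory support, here is a small sketch that uses TarWriter#mkdir and then lists the entries back with TarReader. The directory and file names are invented for the example:

```ruby
require 'stringio'
require 'rubygems/package'

tar = StringIO.new(''.b)

Gem::Package::TarWriter.new(tar) do |writer|
  writer.mkdir("docs", 0755) # directory entry
  writer.add_file("docs/readme.txt", 0644) { |f| f.write("Read me\n") }
end

tar.rewind

# List what ended up in the archive.
Gem::Package::TarReader.new(tar).each do |entry|
  kind = entry.directory? ? "dir " : "file"
  puts "#{kind} #{entry.full_name}"
end
```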

For reference, you could achieve the gzipping with IO.popen, just piping the data in/out of the system process:

http://www.ruby-doc.org/core-1.9.2/IO.html#method-c-popen

The gzipping itself would look something like this:

gzipped_data = IO.popen("gzip", "w+") do |gzip|
  gzip.puts "Hello world!"
  gzip.close_write
  gzip.read
end
# => "\u001F\x8B\b\u0000\xFD\u001D\xA2N\u0000\u0003\xF3H\xCD\xC9\xC9W(\xCF/\xCAIQ\xE4\u0002\u0000A\xE4\xA9\xB2\r\u0000\u0000\u0000"
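
If depending on a gzip binary on the PATH is a concern, the same compression can be done in-process with Ruby's bundled zlib. With popen, writing a large input in full before reading anything back can also stall once the pipe buffers fill up. A minimal sketch, assuming Ruby 2.4 or later for Zlib.gzip and Zlib.gunzip:

```ruby
require 'zlib'

data = "Hello world!\n"

# Compress in-process; no child process involved.
gzipped = Zlib.gzip(data)

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
magic = gzipped.byteslice(0, 2).unpack('C*')

# Round-trip to confirm nothing was lost.
restored = Zlib.gunzip(gzipped)
```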

那些过往 2024-12-18 10:23:22

Based on the OP's solution, I wrote a fully in-memory tgz archive function that I want to use to POST to a web server.

  # Create a tar.gz archive from files, entirely in memory.
  # Requires 'stringio', 'zlib' and 'rubygems/package' to be loaded.
  # Parameters:
  #   files: Array of hashes with string keys "filename" and "body"
  #     Ex: [{"filename" => "foo.txt", "body" => "This is foo.txt"}, ...]
  #
  # Return:: the tar.gz archive as a binary string
  def create_tgz_archive_from_files(files)
    # Build the uncompressed tar in memory first.
    tar = StringIO.new
    Gem::Package::TarWriter.new(tar){ |tar_writer|
      files.each{|file|
        tar_writer.add_file(file['filename'], 0644){|f|
          f.write(file['body'])
        }
      }
    }
    tar.rewind

    # Then gzip it into a second, binary in-memory buffer.
    gz = StringIO.new('', 'r+b')
    gz.set_encoding("BINARY")
    gz_writer = Zlib::GzipWriter.new(gz)
    gz_writer.write(tar.read)
    tar.close
    gz_writer.finish
    gz.rewind
    tar_gz_buf = gz.read
    return tar_gz_buf
  end
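
For the POST itself, one possible sketch with Net::HTTP follows. The endpoint URL and the Content-Type value are assumptions, not part of the original answer, and a small archive is built inline as a stand-in for the string returned by create_tgz_archive_from_files. The actual send is commented out because the endpoint is made up:

```ruby
require 'net/http'
require 'uri'
require 'stringio'
require 'zlib'
require 'rubygems/package'

# Stand-in for the bytes returned by create_tgz_archive_from_files.
tar = StringIO.new(''.b)
Gem::Package::TarWriter.new(tar) do |w|
  w.add_file("foo.txt", 0644) { |f| f.write("This is foo.txt") }
end
tgz = Zlib.gzip(tar.string)

uri = URI("http://example.com/upload") # hypothetical endpoint
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/gzip' # assumed; use what the server expects
request['Content-Length'] = tgz.bytesize.to_s
request.body_stream = StringIO.new(tgz) # any IO-like object with #read works

# Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Using body_stream rather than body lets Net::HTTP read the payload in chunks from any IO-like object, which fits the streaming goal of the question.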