How to wrap a Ruby IO with a sliding window filter
I'm using an opaque API in some Ruby code that takes a File/IO as a parameter. I want to be able to pass it an IO object that only gives access to a given range of data in the real IO object.
For example, I have an 8 GB file, and I want to give the API an IO object that exposes only a 1 GB range in the middle of my real file.
real_file = File.new('my-big-file')
offset = 1 * 2**30 # start 1 GB into it
length = 1 * 2**30 # end 1 GB after start
filter = IOFilter.new(real_file, offset, length)
# The api only sees the 1GB of data in the middle
opaque_api(filter)
The filter_io project looks like it would be the easiest to adapt to do this, but doesn't seem to support this use case directly.
Comments (2)
I think you would have to write it yourself, as it seems like a rather specific thing: you would have to implement all of IO's methods (or the subset that you need) using a chunk of the opened file as a data source. An example of the "speciality" would be writing to such a stream: you would have to take care not to cross the boundary of the given segment, i.e. constantly keep track of your current position in the big file. It doesn't seem like a trivial job, and I don't see any shortcuts that could help you there.
Perhaps you can find some OS-based solution, e.g. making a loopback device out of the part of the large file (see man losetup, and particularly the -o and --sizelimit options, for example).
Variant 2:
If you are OK with keeping the contents of the window in memory all the time, you may read the window into a StringIO and use it as you would use the block variant of IO#open.
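For Variant 2 in the first answer, the original sketch did not survive on this page. A minimal reconstruction might look like the following, assuming the whole window fits comfortably in memory. The helper name open_window is hypothetical and is not part of Ruby or of filter_io.

```ruby
require 'stringio'

# Hypothetical helper (an assumption, not from the original answer):
# reads `length` bytes starting at `offset` into memory and yields them
# wrapped in a StringIO, in the style of the block form of File.open.
def open_window(path, offset, length)
  File.open(path, 'rb') do |file|
    file.seek(offset)
    # IO#read(length) returns nil when positioned at or past end of file.
    data = file.read(length) || ''.b
    yield StringIO.new(data)
  end
end

# Possible usage with the names from the question:
#   open_window('my-big-file', 2**30, 2**30) { |io| opaque_api(io) }
```

With a 1 GB window this holds 1 GB in memory for as long as the block runs, so it only suits cases where that cost is acceptable.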
I believe the IO object has the functionality you are looking for. I've used it before for MD5 hash summing similarly sized files.
This was the block I used, where I was passing the chunk to the MD5 Digest object.
http://www.ruby-doc.org/core/classes/IO.html#M000918
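The block this answer refers to is not preserved here. As a rough sketch of the idea, IO#seek plus IO#read with a length are enough to build a bounded, read-only view and to feed it to an MD5 digest in chunks. WindowedReader and md5_of_window are hypothetical names, and the class implements only a small subset of IO (read, pos, eof?, rewind).

```ruby
require 'digest/md5'

# Hypothetical read-only wrapper (an assumption, not a real library class):
# exposes only the bytes in [offset, offset + length) of a seekable IO.
class WindowedReader
  attr_reader :pos

  def initialize(io, offset, length)
    @io = io
    @offset = offset
    @length = length
    @pos = 0 # position relative to the start of the window
  end

  def eof?
    @pos >= @length
  end

  def rewind
    @pos = 0
  end

  # Roughly like IO#read: returns at most `len` bytes without crossing the
  # end of the window. At the end it returns nil (or '' when len is omitted).
  def read(len = nil)
    want = [len || @length, @length - @pos].min
    data = nil
    if want > 0
      @io.seek(@offset + @pos)
      data = @io.read(want)
    end
    return(len.nil? ? '' : nil) if data.nil?
    @pos += data.bytesize
    data
  end
end

# Hypothetical helper: MD5 of the window, fed in fixed-size chunks.
def md5_of_window(path, offset, length, chunk_size = 1 << 20)
  digest = Digest::MD5.new
  File.open(path, 'rb') do |file|
    window = WindowedReader.new(file, offset, length)
    while (chunk = window.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end
```

Whether such an object can be handed to the opaque API depends on which IO methods that API actually calls. Anything beyond the methods above would need to be added.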