It's possible, but it's very, very tricky. You'd have to develop custom drivers: encrypted sectors are the same size as regular sectors, and thus use the same math to find data, but compressed sectors are "smaller," so you have to hold a map from 'real' sectors to compressed sectors, either in the OS or on the drive itself.
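A minimal sketch of the kind of logical-to-physical map such a driver or drive would need (Python, toy design; `CompressedStore` and its append-only layout are invented purely for illustration): logical sectors are fixed-size, but their compressed images are variable-length, so every lookup has to translate a sector number into an (offset, length) pair.

```python
import zlib

SECTOR_SIZE = 512  # logical sector size in bytes

class CompressedStore:
    """Toy logical-sector -> compressed-extent map (illustration only)."""

    def __init__(self):
        self.blob = bytearray()   # stands in for the platter surface
        self.map = {}             # logical sector number -> (offset, length)

    def write_sector(self, lsn, data):
        assert len(data) == SECTOR_SIZE
        comp = zlib.compress(data)
        # Append-only for simplicity; a real drive would also need
        # free-space management for sectors that get rewritten.
        self.map[lsn] = (len(self.blob), len(comp))
        self.blob.extend(comp)

    def read_sector(self, lsn):
        off, length = self.map[lsn]
        return zlib.decompress(bytes(self.blob[off:off + length]))

store = CompressedStore()
store.write_sector(0, b"A" * SECTOR_SIZE)  # highly compressible payload
assert store.read_sector(0) == b"A" * SECTOR_SIZE
print("sector 0 stored in", store.map[0][1], "bytes instead of", SECTOR_SIZE)
```

Note that the map itself costs memory and metadata space, which is part of the extra on-drive processing and RAM mentioned below.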
The only other aspect is speed of access and latency. It shouldn't affect seek, but it may take longer to compress data than it would to write it - compression is fairly compute intensive.
Further, compression isn't really effective until you get to large chunks of data. You can probably compress 512 bytes (one sector) on the fly and get a few percent compression on average, but people really want to see 20% or more compression before they're willing to pony up the extra cash for the hardware.
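A quick way to see the block-size effect (Python's zlib on synthetic data, so the absolute numbers mean nothing; the point is only that compressing each 512-byte sector independently does worse than compressing one large stream, since the small window can't exploit cross-sector redundancy and pays per-block header overhead):

```python
import os
import zlib

# Mix of repetitive text and incompressible (random) bytes, as a crude
# stand-in for a real disk's contents.
data = (b"The quick brown fox jumps over the lazy dog. " * 100
        + os.urandom(4096))

# Per-sector compression: each 512-byte chunk compressed on its own.
sectors = [data[i:i + 512] for i in range(0, len(data), 512)]
per_sector = sum(len(zlib.compress(s)) for s in sectors)

# Whole-stream compression: one big window over the same bytes.
whole = len(zlib.compress(data))

print(f"original: {len(data)}  per-sector: {per_sector}  whole-stream: {whole}")
```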
It will require more on-disk processing power and memory, which will increase the cost of the drive.
Further, drive capacity is growing at a rate that makes this probably not cost-effective.
Let's say, for instance, that you develop the miracle compression that doubles the space with no performance drop, no extra (flaky or crash prone) drivers, works on any OS, etc. But it adds $100 to the cost of the drive.
It might make sense for someone to do this for a 1TB drive now, converting it into a 2TB drive, but in 6-8 months the 2TB drives will be below $200. It won't be worth it for any smaller drive, because you can get a 1TB for $99 now.
If you built it as a device that sits between the drive and the computer, you'd have much greater latency than building it directly into the drive, and the price/performance hit might not make it worthwhile.
So, technically it's possible, but it has pitfalls and adds complexity and points of weakness to the system; and even if it didn't have those drawbacks, it likely wouldn't be worthwhile.
-Adam
One other consideration is that most of the big files on your disk (music, pictures and videos) are usually already compressed (MP3, JPEG, MP4/MOV), so compression wouldn't help those. And the files that aren't compressed (text files, word processing, text of emails) tend to be a drop in the bucket.
I was wondering the same thing myself because I was searching through thousands of gzipped text files, and ungzipping pegs my quad-core i7, and I wondered if dedicated gzip hardware could speed this up like a GPU speeds up parallel processing. But I'm suspecting the above concerns would make it so that for most uses a compressed hard drive wouldn't help much.
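On the gzip-search aside: dedicated hardware isn't the only way past a pegged core, because independent .gz files parallelize trivially across processes. A hedged sketch (the `logs/*.gz` path and the `ERROR` needle are made up for the example):

```python
import glob
import gzip
from concurrent.futures import ProcessPoolExecutor

def search_one(path, needle=b"ERROR"):
    """Decompress one .gz file and count the lines containing needle."""
    hits = 0
    with gzip.open(path, "rb") as f:
        for line in f:
            if needle in line:
                hits += 1
    return path, hits

if __name__ == "__main__":
    paths = glob.glob("logs/*.gz")  # hypothetical directory of gzipped text
    # One worker process per core; each file is an independent job,
    # so decompression no longer serializes on a single CPU.
    with ProcessPoolExecutor() as pool:
        for path, hits in pool.map(search_one, paths):
            print(f"{path}: {hits} matching lines")
```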
In the world of 1TB drives for $100, speed is a much more valuable resource than space. It wouldn't be worth it.
EDIT:
Ah, so you're saying that it would be quicker to grab 100 bytes of compressed data off the platters, decompress it, and then send it along to the system than to grab 800 bytes of uncompressed data and send that along, because seek times are so slow.
That seems like a clever approach, but I'm willing to bet that if the trade-off were worth it, hard drive manufacturers would already have employed this technique, and hard drive speeds are what they are in spite of that.
But who knows, you may be on to something!
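Some back-of-envelope numbers on that idea (assumed round figures, not measurements: a 10 ms average seek and a 100 MB/s sustained transfer rate): for a small read, the seek dominates so completely that shrinking the transfer from 800 bytes to 100 bytes buys almost nothing.

```python
# Assumed round numbers for a spinning disk, not measured values.
SEEK_S = 10e-3        # 10 ms average seek
TRANSFER_BPS = 100e6  # 100 MB/s sustained transfer

def read_time(nbytes):
    """Seconds to seek once and then read nbytes sequentially."""
    return SEEK_S + nbytes / TRANSFER_BPS

plain = read_time(800)   # uncompressed read
packed = read_time(100)  # compressed read (decompression time ignored)

saving = plain - packed
print(f"uncompressed: {plain * 1e3:.4f} ms")
print(f"compressed:   {packed * 1e3:.4f} ms")
print(f"saving:       {saving * 1e6:.1f} us  ({saving / plain:.4%})")
```

Transfer-time savings only start to matter on large sequential reads, where transfer dominates the seek; but that is also exactly where per-request decompression cost shows up.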
I remember about 15 years ago seeing an advertisement for an IDE controller card that would do hardware compression. Not sure if it was any good or not. Those were the days when 1GB drives were over $1,000.
Who remembers Stacker? This was all done to death in the '80s/'90s. Speed was never a problem, and neither was it "tricky." It's just completely unnecessary these days.
As previously said, the gain is not that big, especially if you are storing seldom accessed files in a compressed form anyway.
As it would be hard to do in hardware (What disk size should be reported? What do you do if the entropy of the input equals its size?) and modern CPUs+RAM are blazingly fast compared to HDDs anyway, just do it in software.
An implementation I know of is compFUSed, which is layered on top of any other file system; another is ZFS, which supports compression natively (blog entry about how to enable it: http://jforonda.blogspot.com/2007/01/zfs-compression.html).
I had also thought of this idea a while ago for network traffic, which has been done before: there are accelerator cards for compressing using gzip: http://www.aha.com/show_prod.php?id=36
Another benefit I had thought of: you could transfer the contents from the drive without compressing them first; simply read the already-compressed blocks off the disk and send them, rather than having to compress them at that later time.
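That "store compressed, ship compressed" idea is the same pattern as serving pre-compressed files over HTTP with `Content-Encoding: gzip`. A minimal Python sketch of the round trip (in-memory stand-ins, no actual disk or network):

```python
import gzip

# Pretend these bytes already live on the compressed drive.
payload = b"hello " * 1000
stored = gzip.compress(payload)

# "Transfer": the sender ships the stored bytes verbatim, with no
# compression step on the hot path. (Over HTTP this is the
# pre-compressed file plus a Content-Encoding: gzip header.)
wire_bytes = stored

# The receiver decompresses at its end.
received = gzip.decompress(wire_bytes)
assert received == payload
print(f"sent {len(wire_bytes)} bytes for {len(received)} bytes of payload")
```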