Difference between sequential writes and random writes

Posted 2024-08-18 11:08:49 | 179 words | 6 views | 0 comments

What is the difference between sequential writes and random writes on:
1) Disk-based systems
2) SSD (flash device) based systems

When an application writes something and the data has to be modified on disk, how do we know whether it is a sequential write or a random write? Up to that point, the write cannot be distinguished as "sequential" or "random": it is just buffered and then applied to the disk when the buffer is flushed.

Please correct me if I am wrong.

Comments (1)

疑心病 2024-08-25 11:08:49

When people talk about sequential vs random writes to a file, they're generally drawing a distinction between writing without intermediate seeks ("sequential"), vs. a pattern of seek-write-seek-write-seek-write, etc. ("random").
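To make the two patterns concrete, here is a minimal Python sketch. The block size, write count, and function names are assumptions chosen for illustration; they are not from the question. Both functions produce a file of the same size, and the only difference is whether a seek happens before each write.

```python
import os
import random
import tempfile

BLOCK = 4096   # bytes per write (assumed value, for illustration only)
COUNT = 256    # number of writes (assumed value)

def write_sequential(path, block=BLOCK, count=COUNT):
    """Write `count` blocks back to back, with no seek between writes."""
    data = b"s" * block
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(data)          # the file offset simply advances

def write_random(path, block=BLOCK, count=COUNT, seed=0):
    """Write the same number of blocks, seeking to a shuffled offset each time."""
    data = b"r" * block
    offsets = [i * block for i in range(count)]
    random.Random(seed).shuffle(offsets)
    with open(path, "wb") as f:
        f.truncate(block * count)  # reserve the final file size up front
        for off in offsets:
            f.seek(off)            # the "seek" in seek-write-seek-write
            f.write(data)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        seq_path = os.path.join(d, "seq.bin")
        rnd_path = os.path.join(d, "rnd.bin")
        write_sequential(seq_path)
        write_random(rnd_path)
        print(os.path.getsize(seq_path), os.path.getsize(rnd_path))
```

Note that this only shows what the application asks for. Whether the device actually sees the writes in this order depends on the buffering layers underneath.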

The distinction is very important in traditional disk-based systems, where each disk seek will take around 10ms. Sequentially writing data to that same disk takes about 30ms per MB. So if you sequentially write 100MB of data to a disk, it will take around 3 seconds. But if you do 100 random writes of 1MB each, that will take a total of 4 seconds (3 seconds for the actual writing, and 10ms*100 == 1 second for all the seeking).
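The arithmetic above can be written as a tiny cost model. This is only a sketch: the 10ms and 30ms/MB figures are the rough numbers from this answer, and the function name is made up for illustration.

```python
SEEK_S = 0.010         # ~10 ms per disk seek (rough figure from the answer)
XFER_S_PER_MB = 0.030  # ~30 ms per MB written (rough figure from the answer)

def estimate_seconds(total_mb, seeks):
    """Estimated time: one fixed cost per seek plus the transfer time."""
    return seeks * SEEK_S + total_mb * XFER_S_PER_MB

sequential = estimate_seconds(100, seeks=0)    # one long write, no seeks
random_1mb = estimate_seconds(100, seeks=100)  # 100 writes of 1 MB each

print(f"sequential: {sequential:.2f} s, 100 x 1 MB random: {random_1mb:.2f} s")
# prints: sequential: 3.00 s, 100 x 1 MB random: 4.00 s
```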

As each random write gets smaller, you pay more and more of a penalty for the disk seeks. In the extreme case where you perform 100 million random 1-byte writes, you'll still spend about 3 seconds on the actual writing, but you'd now have 10ms * 100,000,000 == 1,000,000 seconds, or about 11.57 days, worth of seeking to do! So clearly the degree to which your writes are sequential vs. random can really affect the time it takes to accomplish your task.
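A short sweep over write sizes shows how quickly seeking comes to dominate. It is a sketch under the same rough assumptions as the answer (10ms per seek, 30ms per MB, 100MB total, one seek per write, 1MB taken as 10**6 bytes); `seek_share` is a hypothetical helper name.

```python
SEEK_S = 0.010              # ~10 ms per seek (rough figure from the answer)
XFER_S_PER_MB = 0.030       # ~30 ms per MB (rough figure from the answer)
TOTAL_BYTES = 100 * 10**6   # 100 MB in total, with 1 MB taken as 10**6 bytes

def seek_share(write_bytes):
    """Fraction of the total time spent seeking if every write needs one seek."""
    n_writes = TOTAL_BYTES // write_bytes
    seek_s = n_writes * SEEK_S
    xfer_s = TOTAL_BYTES / 10**6 * XFER_S_PER_MB
    return seek_s / (seek_s + xfer_s)

for size in (10**6, 10**5, 10**4, 10**3, 1):
    print(f"{size:>8} B per write: {seek_share(size):.2%} of the time is seeking")
```

With 1MB writes the seeks account for a quarter of the time; with 1-byte writes the transfer time is a rounding error next to the seeking.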

The situation is a bit different when it comes to flash. With flash, you don't have a physical disk head that you must move around, which is where the 10ms seek cost on a traditional disk comes from. However, flash devices tend to have large page sizes (the smallest "typical" page size is around 512 bytes according to Wikipedia, and 4K page sizes appear to be common as well). So if you're writing a small number of bytes, flash still has overhead, because you must read out an entire page, modify the bytes you're writing, and then write back the entire page. I don't know the characteristic numbers for flash off the top of my head, but the rule of thumb is this: if each of your writes is comparable in size to the device's page size, you won't see much performance difference between random and sequential writes; if each write is small compared to the page size, you'll see some overhead when doing random writes.
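The page-granularity overhead can be estimated with a simple model. This sketch assumes a 4K page and assumes the device rewrites every page a write overlaps, as described above; the function names are hypothetical, and a real flash controller may behave differently.

```python
PAGE = 4096  # assumed page size in bytes (the answer mentions 512 B and 4K)

def pages_touched(offset, length, page=PAGE):
    """Number of pages that a write of `length` bytes at `offset` overlaps."""
    if length <= 0:
        return 0
    first = offset // page
    last = (offset + length - 1) // page
    return last - first + 1

def amplification(offset, length, page=PAGE):
    """Bytes rewritten by the device per byte the application asked to write."""
    return pages_touched(offset, length, page) * page / length

print(amplification(0, PAGE))      # page-sized, page-aligned write
print(amplification(10, 1))        # a single byte still rewrites a whole page
print(pages_touched(PAGE - 1, 2))  # a tiny write that straddles two pages
```

A page-sized, aligned write has a ratio of 1.0, which matches the rule of thumb: the overhead only shows up when writes are small relative to the page.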

Now for all of the above, it's true that at the application layer much is hidden from you. There are layers in the kernel, the disk/flash controller, etc. that could, for example, insert non-obvious seeks in the middle of your "sequential" writing. But in most cases, writing that "looks" sequential at the application layer (no seeks, lots of continuous I/O) will have sequential-write performance, while writing that "looks" random at the application layer will have the (generally worse) random-write performance.
