What are the best practices for data-intensive reading and writing on an HD?

Posted 2024-10-14 01:35:08

I'm developing a C++ application (running on a Linux box) that is very intensive in reading log files and writing derived results to disk. I'd like to know the best practices for optimizing this kind of application:

  • Which OS tweaks improve performance?
  • Which programming patterns boost I/O throughput?
  • Is pre-processing the data (converting to binary, compressing it, etc.) a helpful measure?
  • Does chunking/buffering data help performance?
  • Which hardware capabilities should I be aware of?
  • Which practices are best for profiling and measuring performance in these applications?
  • (express here the concern I'm missing)

Is there a good read where I could get the basics of this, so I could adapt the existing know-how to my problem?

Thanks

Comments (5)

冷血 2024-10-21 01:35:26

On Windows, use CreateFile() with FILE_FLAG_SEQUENTIAL_SCAN and/or FILE_FLAG_NO_BUFFERING rather than fopen(). At least for writing, this returns immediately rather than waiting for the data to flush to disk.
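
A minimal sketch of that call, with a placeholder output file; note that FILE_FLAG_NO_BUFFERING imposes sector-alignment requirements on every write:

```cpp
// Minimal sketch of the suggested CreateFile() call (Windows only).
// "output.bin" is a placeholder file name.
#include <windows.h>

int main() {
    HANDLE h = CreateFileA(
        "output.bin",
        GENERIC_WRITE,
        0,                      // no sharing
        nullptr,                // default security attributes
        CREATE_ALWAYS,          // overwrite any existing file
        FILE_FLAG_SEQUENTIAL_SCAN | FILE_FLAG_NO_BUFFERING,
        nullptr);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    // With FILE_FLAG_NO_BUFFERING, buffers passed to WriteFile() must be
    // aligned to the volume's sector size and sized in whole sectors.
    // ... WriteFile(h, alignedBuf, alignedBytes, &written, nullptr); ...

    CloseHandle(h);
    return 0;
}
```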

绝對不後悔。 2024-10-21 01:35:22

As stated elsewhere here, you should check the block size, which you can do with the stat family of functions. In struct stat this information is located in the st_blksize field.

The second thing is the function posix_fadvise(), which gives the OS advice about paging. You tell the system how you're going to use the file (or even a fragment of a file). You'll find more on the manual page.
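
A minimal sketch of both suggestions, with the input file name as a placeholder:

```cpp
// Minimal sketch: query st_blksize with fstat(), then declare
// sequential access with posix_fadvise(). "input.log" is a
// placeholder file name.
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("input.log", O_RDONLY);
    if (fd < 0) { std::perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == 0)
        std::printf("preferred I/O block size: %ld bytes\n",
                    static_cast<long>(st.st_blksize));

    // Offset 0 with length 0 means "the whole file": the kernel may
    // read ahead more aggressively for sequential access.
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    // ... read the file in st_blksize-sized chunks ...
    close(fd);
    return 0;
}
```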

千鲤 2024-10-21 01:35:18

Get information about the volume you'll be writing to/reading from and create buffers that match the characteristics of the volume, e.g. 10 * clusterSize.

Buffering helps a lot, as does minimizing the amount of writing you have to do.
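
A minimal sketch of that idea on Linux using statvfs(); the path and the 10x multiplier are illustrative values from the answer, not requirements:

```cpp
// Minimal sketch: size a buffer from the filesystem's preferred block
// size. "/var/log" and the factor of 10 are placeholder example values.
#include <sys/statvfs.h>
#include <cstdio>
#include <vector>

int main() {
    struct statvfs vfs;
    if (statvfs("/var/log", &vfs) != 0) { std::perror("statvfs"); return 1; }

    // f_bsize is the filesystem block size; scale it up per the
    // answer's 10 * clusterSize suggestion.
    size_t bufSize = 10 * static_cast<size_t>(vfs.f_bsize);
    std::vector<char> buffer(bufSize);
    std::printf("block size %lu, buffer %zu bytes\n",
                static_cast<unsigned long>(vfs.f_bsize), bufSize);
    return 0;
}
```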

红ご颜醉 2024-10-21 01:35:16

1) Check out your disk's sector size.
2) Make sure the disk is defragmented.
3) Read data that is "local" to your previous reads to improve cache locality (caching is performed by the operating system, and many hard drives also have a built-in cache).
4) Write data contiguously.

For write performance, cache blocks of data in memory until you reach a multiple of the sector size, then initiate an asynchronous write to disk. Do not overwrite the data currently being written until you are certain it has been written (i.e. sync the write). Double or triple buffering can help here.

For best read performance you can double-buffer reads. Say you read in 16K blocks: read the 1st 16K from disk into block 1, then initiate an asynchronous read of the 2nd 16K into block 2 and start working on block 1. When you have finished with block 1, sync the read of block 2 and start an async read of the 3rd 16K into block 1. Now work on block 2. When finished, sync the read of the 3rd 16K, initiate an async read of the 4th 16K into block 2, and work on block 1. Rinse and repeat until you have processed all the data.
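
A minimal sketch of that loop using POSIX AIO; the file name, the 16K block size, and the process() stub are placeholders:

```cpp
// Minimal sketch of the double-buffered read loop using POSIX AIO.
// "input.log" and the 16K block size are placeholders; link with -lrt
// on older glibc versions.
#include <aio.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

static const size_t BLOCK = 16 * 1024;

static void process(const char* data, size_t n) {
    (void)data; (void)n;  // placeholder for the real per-block work
}

int main() {
    int fd = open("input.log", O_RDONLY);
    if (fd < 0) { std::perror("open"); return 1; }

    static char buf[2][BLOCK];
    // Synchronous read of the 1st block into buffer 0.
    ssize_t got = read(fd, buf[0], BLOCK);
    off_t offset = got;
    int cur = 0;

    while (got > 0) {
        // Kick off an async read of the next block into the other buffer.
        struct aiocb cb;
        std::memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf[1 - cur];
        cb.aio_nbytes = BLOCK;
        cb.aio_offset = offset;
        if (aio_read(&cb) != 0) { std::perror("aio_read"); break; }

        process(buf[cur], static_cast<size_t>(got));  // overlap work with I/O

        // Sync the pending read, then swap buffers and repeat.
        const struct aiocb* list[1] = { &cb };
        aio_suspend(list, 1, nullptr);
        got = aio_return(&cb);
        if (got > 0) offset += got;
        cur = 1 - cur;
    }
    close(fd);
    return 0;
}
```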

As already stated, the less data you have to read, the less time is lost reading from disk, so it may well be worth reading compressed data and spending CPU time expanding each block on read. Equally, compressing each block before writing will save you disk time. Whether this is a win really depends on how CPU-intensive your processing of the data is.

Also, if the processing of the blocks is asymmetric (i.e. processing block 1 can take 3 times as long as processing block 2), then consider triple or more buffering for reads.

丑疤怪 2024-10-21 01:35:14

Compression may certainly help a lot and is much simpler than tweaking the OS. Check out the gzip and bzip2 support in the Boost.IOStreams library. It takes its toll on the processor, though.
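
A minimal sketch of the read side with the gzip filter; the file name is a placeholder:

```cpp
// Minimal sketch: read a gzip-compressed log line by line with
// Boost.IOStreams. "input.log.gz" is a placeholder; link against
// boost_iostreams (and zlib).
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <fstream>
#include <string>

int main() {
    std::ifstream file("input.log.gz", std::ios::binary);
    boost::iostreams::filtering_istream in;
    in.push(boost::iostreams::gzip_decompressor());  // decompress on the fly
    in.push(file);

    std::string line;
    while (std::getline(in, line)) {
        // ... process each decompressed line ...
    }
    return 0;
}
```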

Measuring these kinds of jobs starts with the time command. If system time is high compared to user time, your program is spending a lot of time in system calls. If wall-clock ("real") time is high compared to system plus user time, it's waiting for the disk or the network. The top command showing significantly less than 100% CPU usage for the program is also a sign of an I/O bottleneck.
