Why is ostringstream faster than ofstream?

Posted on 2024-10-20 01:04:06

To write many pieces of data to a file, I have two approaches:

  1. Write to the ofstream directly, one item at a time

    ofstream file("c:\\test.txt");
    for (int i = 0; i < 10000; ++i)
    {
        file << data[i];
    }

  2. Write to an ostringstream first, then write the result to the ofstream in one go

    ostringstream strstream;
    for (int i = 0; i < 10000; ++i)
    {
        strstream << data[i];
    }
    ofstream file("c:\\test.txt");
    file << strstream.str();

Not surprisingly, the second approach is faster; in fact, it is 4 times faster than the first on my HP7800 machine.

But why? I know ofstream uses a filebuf internally and ostringstream uses a stringbuf. As buffers, both should reside in memory, so there should be no difference.

What is the difference under the hood?


Comments (3)

空‖城人不在 2024-10-27 01:04:06

Are you using std::endl a lot instead of '\n'? std::endl does two things: it inserts a '\n' into the stream and then flushes the buffer to disk. I've seen code take a severe performance hit from doing so. (The code ran 5-10 times faster after that was fixed.)
Flushing a string buffer is much faster than flushing to disk, so that would explain your findings.

If that's not the case, you might consider increasing the buffer size:

    const std::size_t buf_size = 32768;
    char my_buffer[buf_size];
    ofstream file;
    // Set the buffer before opening the file: some implementations
    // (libstdc++, for example) ignore pubsetbuf on an already open file.
    file.rdbuf()->pubsetbuf(my_buffer, buf_size);
    file.open("c:\\test.txt");

    for (int i = 0; i < 10000; ++i)
    {
        file << data[i];
    }
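To make the std::endl point concrete, here is a minimal sketch that counts how often a stream asks its buffer to flush. `CountingBuf` and `count_flushes` are hypothetical names made up for this illustration; the buffer only keeps characters in a string and does no disk I/O, so it shows how many flushes are requested, not what each one costs.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical helper: stores characters in a string and counts how often
// the stream asks it to flush (through sync()).
class CountingBuf : public std::streambuf {
public:
    std::size_t flushes = 0;
    std::string contents;

protected:
    // Called for every character because no put area is set up.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            contents.push_back(traits_type::to_char_type(ch));
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::flush, and therefore by std::endl.
    int sync() override {
        ++flushes;
        return 0;
    }
};

// Writes `lines` integers, each ended by std::endl or by '\n',
// and returns how many flushes the stream requested.
std::size_t count_flushes(int lines, bool use_endl) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        if (use_endl) {
            out << i << std::endl;
        } else {
            out << i << '\n';
        }
    }
    return buf.flushes;
}
```

In this sketch, `count_flushes(100, true)` reports 100 flush requests and `count_flushes(100, false)` reports none. With a real ofstream, each of those requests pushes the buffered data out to the operating system.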
深巷少女 2024-10-27 01:04:06

Disk is slow. Many small writes are more expensive than one large write.
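A rough way to check this on your own machine is to time both approaches from the question. The sketch below is only a starting point: the function names (`write_directly`, `write_batched`, `read_file`, `time_ms`) are made up for illustration, the data is assumed to be a vector of strings, and the numbers you get will depend on the OS, the standard library, and file-system caching.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Approach 1: insert each item straight into the file stream.
void write_directly(const std::string& path, const std::vector<std::string>& data) {
    std::ofstream file(path, std::ios::binary);
    for (const auto& item : data) file << item;
}

// Approach 2: collect everything in memory, then hand it over in one write.
void write_batched(const std::string& path, const std::vector<std::string>& data) {
    std::ostringstream buffer;
    for (const auto& item : data) buffer << item;
    const std::string all = buffer.str();
    std::ofstream file(path, std::ios::binary);
    file.write(all.data(), static_cast<std::streamsize>(all.size()));
}

// Reads a whole file back so the two outputs can be compared.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    assert(in.is_open());  // sketch only: real code should report the error
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Times one call of `fn` in milliseconds using a monotonic clock.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, `time_ms([&] { write_directly("direct.txt", data); })` and `time_ms([&] { write_batched("batched.txt", data); })` give the two timings, and comparing `read_file` on both paths confirms that they wrote the same bytes. Run each several times, since a single measurement is noisy.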

浮生未歇 2024-10-27 01:04:06

It may be an implementation issue with the specific OS.
Also, I guess the ofstream buffer (buflen) is smaller than 10000; a typical value is 4095. So try running with i<4096, and the response times should be about the same!

The reason the second case is faster:

In the first case, the buffer is written to disk whenever it fills up (buflen = 4095 bytes). So for i<10000 it would have been flushed 3 times.

In the second case, all the data is first prepared in RAM and then flushed to the hard disk in one go. So two flushes have been saved!
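The flush arithmetic can be checked with a small model. This is a sketch under stated assumptions, not how any particular filebuf is implemented: `ModelFileBuf` and `count_device_writes` are hypothetical names, every item is assumed to take exactly one byte (so 10000 items are 10000 bytes), and each time the fixed-size buffer is emptied counts as one write to the device.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <vector>

// Hypothetical model of a file buffer: characters collect in a fixed-size
// buffer, and each time it is emptied one "device write" is counted.
class ModelFileBuf : public std::streambuf {
public:
    explicit ModelFileBuf(std::size_t size) : buffer_(size) {
        assert(size > 0);
        setp(buffer_.data(), buffer_.data() + buffer_.size());
    }
    std::size_t device_writes() const { return device_writes_; }

protected:
    // Called when the buffer is full: empty it, then store the new character.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called on an explicit flush.
    int sync() override {
        drain();
        return 0;
    }

private:
    // Counts one device write if anything is pending, then resets the buffer.
    void drain() {
        if (pptr() != pbase()) ++device_writes_;
        setp(buffer_.data(), buffer_.data() + buffer_.size());
    }

    std::vector<char> buffer_;
    std::size_t device_writes_ = 0;
};

// Pushes `total_bytes` single characters through a buffer of `buffer_size`
// bytes, flushes once at the end, and returns the number of device writes.
std::size_t count_device_writes(std::size_t buffer_size, std::size_t total_bytes) {
    ModelFileBuf buf(buffer_size);
    std::ostream out(&buf);
    for (std::size_t i = 0; i < total_bytes; ++i) out.put('x');
    out.flush();
    return buf.device_writes();
}
```

In this model, `count_device_writes(4095, 10000)` gives 3 and a buffer that holds all 10000 bytes gives 1, which matches the count above. Note that the count depends on the number of bytes written, not on the number of loop iterations, so items longer than one byte would fill the buffer sooner.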
