Why is buffering in C++ important?

Posted on 2024-10-19 10:01:17

I tried to print Hello World 200,000 times and it took forever, so I had to stop the program. But right after I added a char array to act as a buffer, it took less than 10 seconds. Why?

Before adding a buffer:

#include <iostream> 
using namespace std;

int main() {
    int count = 0;
    std::ios_base::sync_with_stdio(false);
    for (int i = 1; i < 200000; i++) {
        cout << "Hello world!\n";
        count++;
    }
    cout << "Count:%d\n" << count;
    return 0;
}

And this is after adding a buffer:

#include <iostream> 
using namespace std;

int main() {
    int count = 0;
    std::ios_base::sync_with_stdio(false);
    char buffer[1024];
    cout.rdbuf()->pubsetbuf(buffer, 1024);
    for (int i = 1; i < 200000; i++) {
        cout << "Hello world!\n";
        count++;
    }
    cout << "Count:%d\n" << count;
    return 0;
}

This makes me think about Java. What are the advantages of using BufferedReader to read a file?

嗳卜坏 2024-10-26 10:01:17

For file operations, writing to memory (RAM) is always faster than writing directly to a file on disk.

For illustration, let's define:

each write IO operation to a file on the disk costs 1 ms
each write IO operation to a file on the disk over a network costs 5 ms
each write IO operation to the memory costs 0.5 ms

Let's say we have to write some data to a file 100 times.

Case 1: Directly Writing to File On Disk

100 times x 1 ms = 100 ms

Case 2: Directly Writing to File On Disk Over Network

100 times x 5 ms = 500 ms

Case 3: Buffering in Memory before Writing to File on Disk

(100 times x 0.5 ms) + 1 ms = 51 ms

Case 4: Buffering in Memory before Writing to File on Disk Over Network

(100 times x 0.5 ms) + 5 ms = 55 ms
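
The four cases above can be reproduced with a tiny cost model. This is only a sketch of the arithmetic, using the illustrative per-operation costs from this answer; the names `CostModel`, `direct_cost_ms`, and `buffered_cost_ms` are made up for the example and do not measure real hardware.

```cpp
// Toy cost model built from the illustrative numbers in the answer above.
// All names here are hypothetical; nothing is measured.
struct CostModel {
    double memory_write_ms;  // cost of one write into the in-memory buffer
    double device_write_ms;  // cost of one write to the disk (local or networked)
};

// Every write goes straight to the device.
double direct_cost_ms(int writes, const CostModel& m) {
    return writes * m.device_write_ms;
}

// Every write goes to memory first, followed by a single flush to the device.
double buffered_cost_ms(int writes, const CostModel& m) {
    return writes * m.memory_write_ms + m.device_write_ms;
}
```

With 100 writes, a local disk modeled as `CostModel{0.5, 1.0}` gives 100 ms direct and 51 ms buffered, and a networked disk modeled as `CostModel{0.5, 5.0}` gives 500 ms direct and 55 ms buffered, matching cases 1 to 4.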

Conclusion

Buffering in memory is always faster than writing directly. However, if your system is low on memory and has to swap to the page file, it will be slow again. So you have to balance your IO operations between memory and disk/network.

囚我心虐我身 2024-10-26 10:01:17

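The main issue with writing to a disk is that the time taken is not a linear function of the number of bytes, but an affine one with a huge constant term.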

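In computing terms, this means that IO gives you good throughput (lower than memory, but still quite decent) but poor latency (usually only a little better than a network).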
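
To make the affine claim concrete, here is a small worked model. The symbols are assumptions introduced for illustration and do not come from the answer: $c$ is the fixed cost of one write, $B$ is the sustained throughput, $k$ is the number of writes, and $s$ is the size of each write in bytes.

```latex
% Assumed notation: c = fixed cost per write, B = throughput,
% k = number of writes, s = bytes per write.
\begin{align*}
T(n) &= c + \frac{n}{B}
  && \text{time for one write of } n \text{ bytes} \\
T_{\text{unbuffered}} &= k\left(c + \frac{s}{B}\right) = kc + \frac{ks}{B}
  && k \text{ separate writes of } s \text{ bytes} \\
T_{\text{buffered}} &= c + \frac{ks}{B}
  && \text{one write of } ks \text{ bytes} \\
T_{\text{unbuffered}} - T_{\text{buffered}} &= (k - 1)\,c
\end{align*}
```

Under this model the transfer term $ks/B$ is the same either way, so the whole saving comes from paying the constant $c$ once instead of $k$ times. The larger $c$ is relative to $s/B$, the more buffering helps.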

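If you look at review articles for HDDs or SSDs, you will notice that the read/write tests are split into two categories:

throughput for random reads
throughput for contiguous reads

The latter is normally significantly greater than the former.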

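Normally, the OS and the IO library should abstract this away for you, but as you noticed, if your routine is IO intensive you may gain by increasing the buffer size. This is normal: the library is generally tailored for all kinds of uses and therefore offers a good middle ground for average applications. If your application is not "average", it may not perform as fast as it could.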
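
As a rough sketch of what "increasing the buffer size" can look like for a file stream, the helper below asks the stream to use a caller-sized buffer via `pubsetbuf`, the same call used in the question. The function name `write_lines` and its parameters are hypothetical. Note that the effect of `pubsetbuf` on a file buffer is implementation-defined; calling it before the file is opened is what some implementations (libstdc++, for example) require for it to take effect.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes `lines` copies of a fixed line to `path`
// through a caller-sized stream buffer and returns the bytes written
// (0 if the file could not be opened or a write failed).
std::size_t write_lines(const std::string& path, std::size_t lines,
                        std::size_t buffer_bytes) {
    assert(buffer_bytes > 0);

    // Declared before the stream so that it outlives it: the stream is
    // destroyed (and flushed) first, while the buffer is still valid.
    std::vector<char> buffer(buffer_bytes);

    std::ofstream out;
    // Request the custom buffer before opening the file. Whether and how
    // this request is honored is implementation-defined.
    out.rdbuf()->pubsetbuf(buffer.data(),
                           static_cast<std::streamsize>(buffer.size()));
    out.open(path, std::ios::binary);
    if (!out) {
        return 0;
    }

    const std::string line = "Hello world!\n";
    std::size_t written = 0;
    for (std::size_t i = 0; i < lines; ++i) {
        out << line;
        written += line.size();
    }
    out.flush();
    return out ? written : 0;
}
```

Timing calls such as `write_lines("out.txt", 200000, 1024)` against `write_lines("out.txt", 200000, 1 << 20)` on your own platform is the only reliable way to tell whether a larger buffer helps; the default buffer may already be good enough.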

帅气称霸 2024-10-26 10:01:17

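What compiler/platform are you using? I see no significant difference here (RedHat, gcc 4.1.2); both programs take 5-6 seconds to finish (but "user" time is about 150 ms). If I redirect the output to a file (through the shell), the total time is about 300 ms, so most of the 6 seconds is spent waiting for the console to catch up with the program.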

In other words, output should be buffered by default, so I'm curious why you're seeing such a huge speedup.

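3 tangentially related notes:

Your program has an off-by-one error: it prints only 199999 times instead of the stated 200000 (either start with i = 0 or end with i <= 200000).
You're mixing printf syntax with cout syntax when outputting count... the fix for that is obvious enough.
Disabling sync_with_stdio produces a small speedup (about 5%) for me when outputting to the console, but the impact is negligible when redirecting to a file. This is a micro-optimization that you probably won't need in most cases (IMHO).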