Why is this C code faster than this C++ code? (Getting the biggest line in a file)
I have two versions of a program that do basically the same thing: get the biggest length of a line in a file. I have a file with about 8,000 lines, and my code in C is a little more primitive (of course!) than my code in C++. The C program takes about 2 seconds to run, while the C++ program takes 10 seconds (I'm testing with the same file in both cases). But why? I was expecting it to take the same amount of time, or a little more, but not 8 seconds slower!
My C code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if _DEBUG
#define DEBUG_PATH "../Debug/"
#else
#define DEBUG_PATH ""
#endif

const char FILE_NAME[] = DEBUG_PATH "data.noun";

int main()
{
    int sPos = 0;
    int maxCount = 0;
    int cPos = 0;
    int ch;
    FILE *in_file;

    in_file = fopen(FILE_NAME, "r");
    if (in_file == NULL)
    {
        printf("Cannot open %s\n", FILE_NAME);
        exit(8);
    }

    while (1)
    {
        ch = fgetc(in_file);
        if (ch == 0x0A || ch == EOF) // \n or \r or \r\n or end of file
        {
            if ((cPos - sPos) > maxCount)
                maxCount = (cPos - sPos);
            if (ch == EOF)
                break;
            sPos = cPos;
        }
        else
            cPos++;
    }

    fclose(in_file);
    printf("Max line length: %i\n", maxCount);
    getch();
    return (0);
}
My C++ code:
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string>

using namespace std;

#ifdef _DEBUG
#define FILE_PATH "../Debug/data.noun"
#else
#define FILE_PATH "data.noun"
#endif

int main()
{
    string fileName = FILE_PATH;
    string s = "";
    ifstream file;
    int size = 0;

    file.open(fileName.c_str());
    if (!file)
    {
        printf("could not open file!");
        return 0;
    }

    while (getline(file, s))
        size = (s.length() > size) ? s.length() : size;

    file.close();
    printf("biggest line in file: %i", size);
    getchar();
    return 0;
}
Answers (8)
My guess is that it is a problem with the compiler options you are using, the compiler itself, or the file system. I just now compiled both versions (with optimizations on) and ran them against a 92,000-line text file:

And I suspect that the reason the C++ version is faster is that fgetc is most likely slower. fgetc does use buffered I/O, but it makes a function call to retrieve every character. I've tested it before, and fgetc is not as fast as a call that reads an entire line at once (e.g., compared to fgets).
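To illustrate the line-at-a-time point, here is a minimal sketch (not this answer's original code; the 4 KB buffer size and the handling of over-long lines are assumptions) of the same maximum-line-length scan written around fgets:

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *in_file = fopen("data.noun", "r");  /* file name taken from the question */
    if (in_file == NULL)
        return 1;

    char buf[4096];          /* assumed buffer size */
    size_t maxCount = 0;
    size_t current = 0;      /* length of the line currently being assembled */

    while (fgets(buf, sizeof buf, in_file) != NULL)
    {
        size_t len = strlen(buf);
        int sawNewline = (len > 0 && buf[len - 1] == '\n');
        if (sawNewline)
            len--;           /* don't count the newline itself */
        current += len;      /* lines longer than the buffer span several fgets calls */
        if (sawNewline)
        {
            if (current > maxCount)
                maxCount = current;
            current = 0;
        }
    }
    if (current > maxCount)  /* the last line may have no trailing newline */
        maxCount = current;

    fclose(in_file);
    printf("Max line length: %zu\n", maxCount);
    return 0;
}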
So in a few comments I echoed people's answers that the problem was likely the extra copying done by your C++ version, where it copies the lines into memory in a string. But I wanted to test that.
First I implemented the fgetc and getline versions and timed them. I confirmed that in debug mode the getline version is slower, about 130 µs vs 60 µs for the fgetc version. This is unsurprising given conventional wisdom that iostreams are slower than using stdio. However in the past it's been my experience that iostreams get a significant speed up from optimization. This was confirmed when I compared my release mode times: about 20 µs using getline and 48 µs with fgetc.
The fact that using getline with iostreams is faster than fgetc, at least in release mode, runs counter to the reasoning that copying all that data must be slower than not copying it, so I'm not sure what all optimization is able to avoid, and I didn't really look to find any explanation, but it'd be interesting to understand what's being optimized away. edit: when I looked at the programs with a profiler it wasn't obvious how to compare the performance since the different methods looked so different from each other
Anyway, I wanted to see if I could get a faster version by avoiding the copying, using the get() method on the fstream object to do exactly what the C version is doing. When I did this I was quite surprised to find that using fstream::get() was quite a bit slower than both the fgetc and getline methods in both debug and release; about 230 µs in debug, and 80 µs in release.

To narrow down whatever the slow-down is, I went ahead and did another version, this time using the stream_buf attached to the fstream object, and the snextc() method on that. This version is by far the fastest; 25 µs in debug and 6 µs in release.

I'm guessing that the thing that makes the fstream::get() method so much slower is that it constructs a sentry object for every call. Though I haven't tested this, I can't see that get() does much beyond just getting the next character from the stream_buf, except for these sentry objects.

Anyway, the moral of the story is that if you want fast I/O you're probably best off using the high-level iostream functions rather than stdio, and for really fast I/O, access the underlying stream_buf. edit: actually this moral may only apply to MSVC; see the update at the bottom for results from a different toolchain.
For reference:
I used VS2010 and chrono from boost 1.47 for timing. I built 32-bit binaries (seemingly required by boost chrono, because it can't seem to find a 64-bit version of that lib). I didn't tweak the compile options, but they may not be completely standard since I did this in a scratch VS project I keep around.
The file I tested with was the 1.1 MB 20,000 line plain text version of Oeuvres Complètes de Frédéric Bastiat, tome 1 by Frédéric Bastiat from Project Gutenberg, http://www.gutenberg.org/ebooks/35390
Release mode times
Debug mode times:
Here's my fgetc() version:

Here's my getline() version:

the fstream::get() version

and the snextc() version
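The code listings those labels refer to aren't reproduced here; as a rough stand-in (a sketch, not this answer's original code; the file name is taken from the question and the timing scaffolding is omitted), the getline(), fstream::get(), and filebuf snextc() loops might look like this:

#include <fstream>
#include <string>
#include <cstdio>

// std::getline() version: copies each line into a string and takes its length.
static std::size_t max_line_getline(const char* path)
{
    std::ifstream file(path);
    std::string s;
    std::size_t max_len = 0;
    while (std::getline(file, s))
        if (s.length() > max_len)
            max_len = s.length();
    return max_len;
}

// fstream::get() version: one character at a time through the istream interface.
static std::size_t max_line_get(const char* path)
{
    std::ifstream file(path);
    std::size_t max_len = 0, cur = 0;
    int ch;
    while ((ch = file.get()) != std::ifstream::traits_type::eof())
    {
        if (ch == '\n') { if (cur > max_len) max_len = cur; cur = 0; }
        else            ++cur;
    }
    if (cur > max_len) max_len = cur;
    return max_len;
}

// filebuf::snextc() version: one character at a time straight from the stream buffer.
static std::size_t max_line_snextc(const char* path)
{
    std::filebuf fb;
    fb.open(path, std::ios::in);
    std::size_t max_len = 0, cur = 0;
    // sgetc() peeks the current character; snextc() advances and peeks the next one.
    for (int ch = fb.sgetc(); ch != std::filebuf::traits_type::eof(); ch = fb.snextc())
    {
        if (ch == '\n') { if (cur > max_len) max_len = cur; cur = 0; }
        else            ++cur;
    }
    if (cur > max_len) max_len = cur;
    return max_len;
}

int main()
{
    // "data.noun" is the file name from the question; substitute any text file.
    std::printf("getline: %zu\nget: %zu\nsnextc: %zu\n",
                max_line_getline("data.noun"),
                max_line_get("data.noun"),
                max_line_snextc("data.noun"));
    return 0;
}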
update:
I reran the tests using clang (trunk) on OS X with libc++. The results for the iostream-based implementations stayed relatively the same (with optimization turned on): fstream::get() is much slower than std::getline(), which is much slower than filebuf::snextc(). But the performance of fgetc() improved relative to the getline() implementation and became faster. Perhaps this is because the copying done by getline() becomes an issue with this toolchain whereas it wasn't with MSVC? Maybe Microsoft's CRT implementation of fgetc() is bad or something? Anyway, here are the times (I used a much larger file, 5.3 MB):
using -Os
using -O0
-O2
-O3
The C++ version constantly allocates and deallocates instances of std::string. Memory allocation is a costly operation. In addition to that, the constructors/destructors are executed.

The C version, however, uses constant memory and just does what's necessary: reading single characters and, for each newline, setting the line-length counter to the new value if it is higher. That's it.
You are not comparing apples to apples. Your C program does no copying of data from the FILE* buffer into your program's memory. It also operates on the raw file.

Your C++ program needs to traverse the length of each string several times - once in the stream code to know when to terminate the string that it returns to you, once in the constructor of std::string (struck out in a later edit), and once in your code's call to s.length().

It is possible that you could improve the performance of your C program, for example by using getc_unlocked if it is available to you (a sketch follows below). But the biggest win comes from not having to copy your data.

EDIT: edited in response to a comment by bames53
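getc_unlocked is a POSIX function, not part of standard C or the MSVC CRT, so this is only a sketch of how the question's loop might use it where it is available (the file name is taken from the question):

#include <stdio.h>

int main(void)
{
    FILE *in_file = fopen("data.noun", "r");
    if (in_file == NULL)
        return 1;

    int maxCount = 0, cur = 0, ch;

    /* getc_unlocked behaves like getc but skips the per-call stream locking,
       which usually makes it faster in single-threaded code (POSIX only). */
    while ((ch = getc_unlocked(in_file)) != EOF)
    {
        if (ch == '\n')
        {
            if (cur > maxCount)
                maxCount = cur;
            cur = 0;
        }
        else
            cur++;
    }
    if (cur > maxCount)   /* the last line may have no trailing newline */
        maxCount = cur;

    fclose(in_file);
    printf("Max line length: %d\n", maxCount);
    return 0;
}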
2 seconds for just 8,000 lines? I don't know how long your lines are, but chances are that you are doing something very wrong.

This trivial Python program executes almost instantly with El Quijote downloaded from Project Gutenberg (40,006 lines, 2.2 MB):

The timing:

You could improve your C code by buffering the input rather than reading char by char (see the sketch after this answer).
As for why the C++ is slower than the C: it is probably related to building the string objects and then calling the length method. In C you are just counting the chars as you go.
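One way to read the buffering suggestion (a sketch, not this answer's code; the 64 KiB chunk size is an arbitrary assumption) is to pull the file in large blocks with fread and scan each block with memchr:

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *in_file = fopen("data.noun", "rb");   /* file name taken from the question */
    if (in_file == NULL)
        return 1;

    static char buf[64 * 1024];                 /* assumed chunk size */
    size_t maxCount = 0, cur = 0, n;

    while ((n = fread(buf, 1, sizeof buf, in_file)) > 0)
    {
        const char *p = buf, *end = buf + n, *nl;
        while ((nl = (const char *)memchr(p, '\n', (size_t)(end - p))) != NULL)
        {
            cur += (size_t)(nl - p);            /* finish the line in progress */
            if (cur > maxCount)
                maxCount = cur;
            cur = 0;
            p = nl + 1;
        }
        cur += (size_t)(end - p);               /* a partial line continues into the next chunk */
    }
    if (cur > maxCount)                         /* the last line may have no trailing newline */
        maxCount = cur;

    fclose(in_file);
    printf("Max line length: %zu\n", maxCount);
    return 0;
}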
I tried compiling and running your programs against 40K lines of C++ source and they both completed in about 25 ms or so. I can only conclude that your input files have extremely long lines, possibly 10K-100K characters per line. In that case the C version doesn't suffer any performance penalty from the long line length, while the C++ version would have to keep increasing the size of the string and copying the old data into the new buffer. If it had to increase in size a sufficient number of times, that could account for the excessive performance difference.

The key here is that the two programs don't do the same thing, so you can't really compare their results. If you were able to provide the input file we might be able to provide additional details.

You could probably use tellg and ignore to do this faster in C++.
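A sketch of that idea (one possible reading of the suggestion, using istream::ignore plus gcount() so that no line is ever copied into a std::string; this is not the answerer's code):

#include <fstream>
#include <limits>
#include <cstdio>

int main()
{
    std::ifstream file("data.noun");   // file name taken from the question
    if (!file)
        return 1;

    std::streamsize max_len = 0;
    while (file)
    {
        // Skip forward to the next '\n' without building a string.
        file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        std::streamsize n = file.gcount();      // characters consumed by ignore()
        if (n > 0 && !file.eof())
            --n;                                // exclude the '\n' delimiter itself
        if (n > max_len)
            max_len = n;
    }

    std::printf("Max line length: %lld\n", static_cast<long long>(max_len));
    return 0;
}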
The C++ program builds string objects of the lines, while the C program just reads characters and looks at the characters.
EDIT:
Thanks for the upvotes, but after the discussion I now think this answer is wrong. It was a reasonable first guess, but in this case it seems that the different (and very slow) execution times are caused by other things.
I'm alright with the theory, folks. But let's get empirical.

I generated a text file with 13 million lines to work with.

The original code, edited to read from stdin (which shouldn't affect the performance too much), made it in almost 2 min.
C++ code:
C++ time:
A 'C' version:
C performance:
Do your own math...