What is the preferred pattern for reading lines from a file in C++?
I've seen at least two ways of reading lines from a file in C++ tutorials:
    std::ifstream fs("myfile.txt");
    if (fs.is_open()) {
        while (fs.good()) {
            std::string line;
            std::getline(fs, line);
            // ...
        }
    }
and:
    std::ifstream fs("myfile.txt");
    std::string line;
    while (std::getline(fs, line)) {
        // ...
    }
Of course, I can add a few checks to make sure that the file exists and is opened. Other than the exception handling, is there a reason to prefer the more-verbose first pattern? What's your standard practice?
Comments (5)
The second form, while (std::getline(fs, line)), is not only correct but also preferable, because it is idiomatic.

I assume that in the first case you are not checking fs after std::getline() with if (!fs) break; or something equivalent. If you don't, the first case is completely wrong. If you do, the second one is still preferable, as it is more concise and its logic is clearer.

The function good() should be used after you have made an attempt to read from the stream; it is used to check whether the attempt was successful. In your first case you don't do so. After std::getline(), you assume that the read was successful without even checking what fs.good() returns. You also seem to assume that if fs.good() returns true, std::getline will successfully read a line from the stream. That is exactly backwards: the fact is that if std::getline successfully reads a line from the stream, then fs.good() would return true.

The documentation at cplusplus says of good() that it returns true if none of the stream's error flags (eofbit, failbit and badbit) is set. That is, when you attempt to read data from an input stream and the attempt fails, only then is a failure flag set, and good() returns false as an indication of the failure.

If you want to limit the scope of the line variable to the loop, you can write a for loop that declares line in its init-statement and calls std::getline(fs, line) in its condition.

Note: this solution came to my mind after reading @john's solution, but I think it's better than his version.
Read a detailed explanation here of why the second one is preferable and idiomatic:
Or read this nicely written blog by @Jerry Coffin:
Think of this as an extended comment to Nawaz's already excellent answer.
Regarding your first option, the while (fs.good()) loop:

This has multiple problems. Problem number one is that the while condition is in the wrong place and is superfluous. It's in the wrong place because fs.good() indicates whether or not the most recent action performed on the file was OK. A while condition should be with respect to the upcoming actions, not the previous ones. There is no way to know whether the upcoming action on the file will be OK. What upcoming action? fs.good() does not read your code to see what that upcoming action is.

Problem number two is that you are ignoring the return status from std::getline(). That's OK if you immediately check the status with fs.good(). So, fixing this up a bit, you would loop unconditionally, call std::getline(fs, line), and break as soon as fs.good() returns false. Alternatively, you can do if (! std::getline(fs, line)) { break; } but now you have a break in the middle of the loop. Yech. It is much, much better to make the exit conditions a part of the loop statement itself if at all possible.

Compare that to the second option, while (std::getline(fs, line)).

This is the standard idiom for reading lines from a file. A very similar idiom exists in C. This idiom is very old, very widely used, and very widely viewed as the correct way to read lines from a file.
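What if you come from a shop that bans conditionals with side effects? (There are lots and lots of programming standards that do just that.) There is a way around this without resorting to the break-in-the-middle-of-the-loop approach: call std::getline(fs, line) in both the init and increment clauses of a for loop, and test fs.good() in the condition.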
Not as ugly as the break approach, but most will agree that this isn't nearly as nice-looking as is the standard idiom.
My recommendation is to use the standard idiom unless some standards idiot has banned its use.
Addendum
Regarding for (std::getline(fs, line); fs.good(); std::getline(fs, line)): this is ugly for two reasons. One is that obvious chunk of replicated code.

Less obvious is that calling getline and then good breaks atomicity. What if some other thread is also reading from the file? This isn't quite so important right now because C++ I/O currently is not threadsafe. It will be in the upcoming C++11. Breaking atomicity just to keep the enforcers of the standards happy is a recipe for disaster.
Actually, I prefer another way.
To me it reads better, and the string is scoped correctly (i.e. inside the loop where it is being used, not outside the loop).
But of the two you've written, the second is correct.
The first one deallocates and reallocates the string on every iteration, wasting time.
The second one writes each line into storage that already exists, which removes the deallocation and reallocation and makes it actually faster (and better) than the first one.
Try this =>