Read whole ASCII file into C++ std::string
I need to read a whole file into memory and place it in a C++ std::string.
If I were to read it into a char[], the answer would be very simple:
std::ifstream t;
int length;
t.open("file.txt"); // open input file
t.seekg(0, std::ios::end); // go to the end
length = t.tellg(); // report location (this is the length)
t.seekg(0, std::ios::beg); // go back to the beginning
char* buffer = new char[length]; // allocate memory for a buffer of appropriate dimension
t.read(buffer, length); // read the whole file into the buffer
t.close(); // close file handle
// ... Do stuff with buffer here, then release it with delete[] buffer ...
Now, I want to do the exact same thing, but using a std::string instead of a char[]. I want to avoid loops, i.e. I don't want to:
std::ifstream t;
t.open("file.txt");
std::string buffer;
std::string line;
while (std::getline(t, line)) {
    // ... Append line to buffer and go on
}
t.close();
Any ideas?
There are a couple of possibilities. One I like uses a stringstream as a go-between:
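A minimal sketch of the stringstream approach, wrapped in a function so it can be reused. The helper name read_via_stringstream and its path parameter are assumptions made for illustration; the names t and buffer follow the surrounding text.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: copies the file's stream buffer into a stringstream.
std::string read_via_stringstream(const std::string& path) {
    std::ifstream t(path);       // opened in text mode, as in the question
    std::stringstream buffer;
    buffer << t.rdbuf();         // streams the whole file in one expression
    return buffer.str();         // empty if the file is empty or could not be opened
}
```

For the file in the question this would be called as read_via_stringstream("file.txt").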
Now the contents of "file.txt" are available in a string as buffer.str().
Another possibility (though I certainly don't like it as well) is much more like your original:
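A sketch of what this second approach could look like, assuming C++11 or later so that the string's storage is contiguous. The helper name read_presized is hypothetical, and the final resize to the number of characters actually read is an addition of this sketch, meant to trim the surplus that text-mode translation can leave behind.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: sizes the string from the file length, then reads in one call.
std::string read_presized(const std::string& path) {
    std::ifstream t(path);
    if (!t) return std::string();              // could not open the file
    t.seekg(0, std::ios::end);
    const std::streamoff size = t.tellg();     // -1 if the position is unavailable
    if (size <= 0) return std::string();
    std::string buffer(static_cast<std::size_t>(size), '\0');  // filled, then overwritten
    t.seekg(0, std::ios::beg);
    t.read(&buffer[0], size);
    // Assumption of this sketch: shrink to what was really read.
    buffer.resize(static_cast<std::size_t>(t.gcount()));
    return buffer;
}
```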
Officially, this isn't required to work under the C++98 or 03 standard (string isn't required to store data contiguously) but in fact it works with all known implementations, and C++11 and later do require contiguous storage, so it's guaranteed to work with them.
As to why I don't like the latter as well: first, because it's longer and harder to read. Second, because it requires that you initialize the contents of the string with data you don't care about, then immediately write over that data (yes, the time to initialize is usually trivial compared to the reading, so it probably doesn't matter, but to me it still feels kind of wrong). Third, in a text file, position X in the file doesn't necessarily mean you'll have read X characters to reach that point -- it's not required to take into account things like line-end translations. On real systems that do such translations (e.g., Windows) the translated form is shorter than what's in the file (i.e., "\r\n" in the file becomes "\n" in the translated string) so all you've done is reserved a little extra space you never use. Again, doesn't really cause a major problem but feels a little wrong anyway.
Update: Turns out that this method, while following STL idioms well, is actually surprisingly inefficient! Don't do this with large files. (See: http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html)
You can make a streambuf iterator out of the file and initialize the string with it:
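A minimal sketch of the iterator-based initialization, assuming the file is opened in text mode as in the question. The helper name read_via_iterators is hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: builds the string directly from a pair of streambuf iterators.
std::string read_via_iterators(const std::string& path) {
    std::ifstream t(path);
    // The parentheses around the first argument stop the compiler from
    // treating this line as a function declaration.
    std::string str((std::istreambuf_iterator<char>(t)),
                    std::istreambuf_iterator<char>());
    return str;
}
```

A default-constructed std::istreambuf_iterator acts as the end-of-stream marker, so the constructor copies until the file is exhausted.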
Not sure where you're getting the t.open("file.txt", "r") syntax from. As far as I know that's not a method that std::ifstream has. It looks like you've confused it with C's fopen.
Edit: Also note the extra parentheses around the first argument to the string constructor. These are essential. They prevent the problem known as the "most vexing parse", which in this case won't actually give you a compile error like it usually does, but will give you interesting (read: wrong) results.
Following KeithB's point in the comments, here's a way to do it that allocates all the memory up front (rather than relying on the string class's automatic reallocation):
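One way the up-front allocation could be sketched, assuming the file is seekable so that tellg reports its length. The helper name read_with_reserve is hypothetical, and the guard against a failed tellg is an addition of this sketch.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reserves capacity from the file length, then copies via iterators.
std::string read_with_reserve(const std::string& path) {
    std::ifstream t(path);
    std::string str;
    if (!t) return str;                        // could not open the file
    t.seekg(0, std::ios::end);
    const std::streamoff size = t.tellg();     // -1 if the position is unavailable
    if (size > 0) str.reserve(static_cast<std::size_t>(size));
    t.seekg(0, std::ios::beg);
    str.assign(std::istreambuf_iterator<char>(t),
               std::istreambuf_iterator<char>());
    return str;
}
```

Unlike constructing the string with a size, reserve only sets capacity, so no characters are initialized and then overwritten.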
I think the best way is to use a string stream. Simple and quick!
You may not find this in any book or site, but I found out that it works pretty well:
Try one of these two methods:
I figured out another way that works with most istreams, including std::cin!
If you happen to use glibmm you can try Glib::file_get_contents.
I could do it like this:
If this is something to be frowned upon, please let me know why
I don't think you can do this without an explicit or implicit loop, without reading into a char array (or some other container) first and then constructing the string. If you don't need the other capabilities of a string, it could be done with vector<char> the same way you are currently using a char *.
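A sketch of the vector<char> variant, mirroring the char[] code from the question. The helper name read_into_vector is hypothetical, and opening the file in binary mode is an assumption of this sketch so that the reported length matches the number of bytes read.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into a vector that owns its memory.
std::vector<char> read_into_vector(const std::string& path) {
    std::ifstream t(path, std::ios::binary);
    std::vector<char> buffer;
    if (!t) return buffer;                     // could not open the file
    t.seekg(0, std::ios::end);
    const std::streamoff length = t.tellg();   // -1 if the position is unavailable
    if (length <= 0) return buffer;
    t.seekg(0, std::ios::beg);
    buffer.resize(static_cast<std::size_t>(length));
    t.read(buffer.data(), length);
    buffer.resize(static_cast<std::size_t>(t.gcount()));  // keep only what was read
    return buffer;
}
```

The vector releases its memory automatically, so no delete[] is needed as it would be with a raw char buffer.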