Read from the file directly, or read the file into a buffer and then use the buffer (in C++)?
I am writing a parser that needs to read characters from a file. I will be reading the file character by character, and I may stop reading partway through if some conditions are not satisfied.
So is it advisable to create an ifstream for the file, seek to the position every time, and start reading from there? Or should I read the entire file into a stream or buffer and use that instead?
Comments (4)
If you can, use a memory-mapped file.
Boost offers a cross-platform one: http://www.boost.org/doc/libs/1_35_0/libs/iostreams/doc/classes/mapped_file.html
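To give a feel for what a memory-mapped file looks like to the parser, here is a minimal sketch. It does not use the Boost class linked above; it calls POSIX `mmap` directly, so it assumes a POSIX system (Linux, macOS, BSD). The `MappedFile` name and its members are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical read-only wrapper around a POSIX memory mapping.
// Assumption: a POSIX platform; Boost's mapped_file is the portable route.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;  // leave the object in the "not ok" state
        struct stat st;
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            std::size_t len = static_cast<std::size_t>(st.st_size);
            void* p = ::mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const char*>(p);
                size_ = len;
            }
        }
        ::close(fd);  // the mapping remains valid after the descriptor is closed
    }
    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return data_ != nullptr; }
    const char* data() const { return data_; }  // file contents as a plain byte range
    std::size_t size() const { return size_; }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

Once mapped, the parser can walk `data()` from index 0 to `size()` like any array and stop wherever it likes; pages it never touches generally do not have to be read from disk.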
How big is the file? Do you make more than one pass? Whether you read it into an in-memory buffer or not, reading the file will take (file size / BUFSIZ) reads to go through the whole thing. Reading character by character doesn't matter, because the underlying read still consumes BUFSIZ bytes at a time (unless you take steps to change that behavior); it just hands them out one character at a time.
If you're reading it and processing it in one pass anyway, then reading it into memory first means you always need (file size / BUFSIZ) reads, whereas, assuming the point at which you stop is uniformly distributed, reading and processing it inline takes on average (file size / BUFSIZ) * 0.5 reads, which on a big file could be a substantial gain.
An even more important question might be "why are you looking for a solution this complicated?" The time it takes to figure out the cute solution probably dominates any gains you'll make from something fancier than the standard "while not end of file, get a character and process it" solution.
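For reference, the standard "while not end of file, get a character and process it" loop might look roughly like this. It is only a sketch: `should_stop` and `read_until_stop` are made-up names, and the stop condition (the first `;`) stands in for whatever your parser actually checks.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stop condition, purely for illustration: stop at the first ';'.
inline bool should_stop(char c) { return c == ';'; }

// Reads one character at a time until end of input or the stop condition.
// Returns the characters consumed before stopping.
std::string read_until_stop(std::istream& in) {
    std::string consumed;
    char c;
    // For a file stream, get() is normally served from the stream's internal
    // buffer, so most calls do not trigger a read from the file itself.
    while (in.get(c)) {
        if (should_stop(c)) break;  // stop early; the rest is left unread by us
        consumed.push_back(c);
    }
    return consumed;
}
```

With a real file you would pass a `std::ifstream` (from `<fstream>`) opened on your input instead of a string stream; the function only needs a `std::istream&`.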
Seeking to the position every time and reading from there is not the better option, because it degrades performance.
Create a buffer and read from that instead; it is the better idea and more efficient.
Read the entire file contents into the buffer in one go, then serve all subsequent input from the buffer rather than going back to the file every time.
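A rough sketch of that approach, assuming the file fits comfortably in memory. The names `slurp` and `Cursor` are invented for this example; nothing here comes from a particular library beyond the standard one.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>

// Reads everything remaining in the stream into one string.
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical cursor over the in-memory buffer. "Seeking" becomes a plain
// index assignment instead of a file operation.
class Cursor {
public:
    explicit Cursor(std::string text) : text_(std::move(text)) {}

    bool at_end() const { return pos_ >= text_.size(); }
    char peek() const { assert(!at_end()); return text_[pos_]; }
    char next() { assert(!at_end()); return text_[pos_++]; }

    // Save and restore a position, e.g. to back out of a failed parse attempt.
    std::size_t mark() const { return pos_; }
    void reset(std::size_t saved) { assert(saved <= text_.size()); pos_ = saved; }

private:
    std::string text_;
    std::size_t pos_ = 0;
};
```

In use, you would open a `std::ifstream` on the file (binary mode if exact bytes matter), call `slurp` once, and hand the result to `Cursor`; the parser then never touches the file again.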
On a full-service OS (e.g. Windows, Mac OS, Linux, BSD...) the operating system has a caching mechanism that handles this for you to some extent (assuming your usage patterns meet some definition of "usual").
Unless you are experiencing unacceptable performance, you might want to merrily ignore the whole issue (i.e. just use the naive file access primitives).