Truncating a file in C++
I am writing a program in C++ and wonder if anyone can help me with the situation explained here.
Suppose I have a log file of about 30MB, and I have copied the last 2MB of the file to a buffer within the program.
I delete the file (or clear its contents) and then write my 2MB back to the file.
Everything works fine up to here. The concern is that I read the file (the last 2MB), clear the file (the 30MB file), and then write back the last 2MB.
Too much time will be needed in a scenario where I am copying the last 300MB from a 1GB file.
Does anyone have an idea for making this process simpler?
With a large log file, the following reasons should and will be considered.
Disk space: Log files are uncompressed plain text and consume large amounts of space. Typical compression reduces the file size by 10:1. However, a file cannot be compressed while it is in use (locked), so a log file must be rotated out of use.
System resources: Opening and closing a file regularly consumes a lot of system resources and would reduce the performance of the server.
File size: Small files are easier to back up and restore in case of a failure.
I just do not want to copy, clear and re-write the last specific lines to a file. Just a simpler process.... :-)
EDIT: I am not building any in-house process to support log rotation. logrotate is the tool.
Comments (4)
I would suggest a slightly different approach.
To improve the performance of the copy, you can copy the data in chunks. You can play around with the chunk size to find the optimal value.
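A minimal sketch of chunked copying, assuming C++17 and std::filesystem. The function name keep_tail_chunked, the ".tmp" suffix, and the use of a temporary file that is renamed over the original are illustrative assumptions, not something stated above. It also assumes no other process keeps the log open while it runs, and that renaming over an existing file replaces it (true on POSIX).

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: keeps only the last `keep_bytes` of `path` by copying
// them to a temporary file in chunks of `chunk_size` bytes, then renaming the
// temporary file over the original. Returns false on any I/O failure.
bool keep_tail_chunked(const fs::path& path, std::uintmax_t keep_bytes,
                       std::size_t chunk_size = 1 << 20) {
    if (chunk_size == 0) return false;
    std::error_code ec;
    const std::uintmax_t size = fs::file_size(path, ec);
    if (ec) return false;
    if (keep_bytes >= size) return true;  // nothing to drop

    fs::path tmp = path;
    tmp += ".tmp";  // assumed naming scheme for the temporary file
    bool ok = false;
    {
        std::ifstream in(path, std::ios::binary);
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        in.seekg(static_cast<std::streamoff>(size - keep_bytes));

        std::vector<char> buf(chunk_size);  // memory use is bounded by chunk_size
        std::uintmax_t remaining = keep_bytes;
        while (in && out && remaining > 0) {
            const auto want = static_cast<std::streamsize>(
                std::min<std::uintmax_t>(remaining, buf.size()));
            in.read(buf.data(), want);
            out.write(buf.data(), in.gcount());  // write only what was read
            remaining -= static_cast<std::uintmax_t>(in.gcount());
        }
        out.flush();
        ok = out.good() && remaining == 0;
    }  // both streams are closed here, before the rename

    if (ok) fs::rename(tmp, path, ec);
    if (!ok || ec) {
        std::error_code ignore;
        fs::remove(tmp, ignore);  // best-effort cleanup
        return false;
    }
    return true;
}
```

For the 1GB case from the question, a call such as keep_tail_chunked("app.log", 300u * 1024 * 1024) would never hold more than one chunk in memory; the chunk size is the knob to experiment with.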
If - marks the part of the file you don't want and + marks the part you do want, the most portable way of keeping only the + part is just as you said: read in the section you want (+), delete/clear the file (as with fopen(..., "wb") or something similar), and write out the bit you want (+).
Anything more complicated requires OS-specific help and isn't portable. Unfortunately, I don't believe any major OS out there has support for what you want. There might be support for "truncate after position X" (a sort of head), but not the tail-like operation you're requesting.
Such an operation would be difficult to implement, as varying block sizes on filesystems (if the filesystem has a block size) would cause trouble. At best, you'd be limited to cutting on block-size boundaries, but this would be hairy. This is such a rare case that this is probably why such a procedure is not directly supported.
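The portable read, clear, write sequence described above could look roughly like this with standard C++ streams. The function name keep_tail_in_memory is a made-up example, and opening with std::ios::trunc stands in for the fopen(..., "wb") call mentioned in the answer. Note that the whole tail is held in memory, and a crash between the truncation and the write would lose it.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads the last `keep_bytes` of `path` into memory,
// truncates the file by reopening it, and writes the saved bytes back.
// Returns false if the file cannot be read or rewritten.
bool keep_tail_in_memory(const std::string& path, std::streamoff keep_bytes) {
    if (keep_bytes < 0) return false;
    std::string tail;
    {
        // std::ios::ate positions the stream at the end, so tellg() is the size.
        std::ifstream in(path, std::ios::binary | std::ios::ate);
        if (!in) return false;
        const std::streamoff size = in.tellg();
        if (size < 0) return false;
        if (keep_bytes >= size) return true;  // nothing to drop

        tail.resize(static_cast<std::size_t>(keep_bytes));
        if (keep_bytes > 0) {
            in.seekg(size - keep_bytes);  // jump to the start of the tail
            in.read(&tail[0], static_cast<std::streamsize>(keep_bytes));
            if (in.gcount() != keep_bytes) return false;
        }
    }  // input stream closed before the file is truncated

    // std::ios::trunc discards the old contents, much like fopen(path, "wb").
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(tail.data(), static_cast<std::streamsize>(tail.size()));
    out.flush();
    return out.good();
}
```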
A better approach might be not to let the file grow that big but rather use rotating log files with a set maximum size per log file and a maximum number of old files being kept.
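As a rough illustration of size-based rotation, here is a sketch assuming C++17. The class name RotatingLog and the base.1, base.2, ... naming scheme are assumptions chosen for the example; in practice an existing tool such as logrotate or a logging library would usually do this job.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical sketch: appends lines to `base` and, once the next line would
// push the file past `max_bytes`, shifts base -> base.1 -> base.2 ... and
// keeps at most `max_old_files` rotated files.
class RotatingLog {
public:
    RotatingLog(fs::path base, std::uintmax_t max_bytes, int max_old_files)
        : base_(std::move(base)), max_bytes_(max_bytes), max_old_(max_old_files) {
        open();
    }

    void write(const std::string& line) {
        const std::uintmax_t add = line.size() + 1;  // +1 for the newline
        if (size_ > 0 && size_ + add > max_bytes_) rotate();
        out_ << line << '\n';
        out_.flush();
        size_ += add;
    }

private:
    fs::path numbered(int n) const {
        fs::path p = base_;
        p += "." + std::to_string(n);
        return p;
    }

    void open() {
        out_.open(base_, std::ios::binary | std::ios::app);
        std::error_code ec;
        size_ = fs::file_size(base_, ec);
        if (ec) size_ = 0;
    }

    void rotate() {
        out_.close();  // close first; some platforms refuse to rename open files
        std::error_code ec;
        if (max_old_ < 1) {
            fs::remove(base_, ec);  // no history kept: just start over
        } else {
            fs::remove(numbered(max_old_), ec);  // drop the oldest file
            for (int n = max_old_ - 1; n >= 1; --n)
                fs::rename(numbered(n), numbered(n + 1), ec);  // error ignored if missing
            fs::rename(base_, numbered(1), ec);
        }
        open();
    }

    fs::path base_;
    std::uintmax_t max_bytes_;
    int max_old_;
    std::ofstream out_;
    std::uintmax_t size_ = 0;
};
```

With this layout no data ever has to be copied: trimming the log is just a rename plus deleting the oldest file.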
If you can control the writing process, what you probably want to do here is to write to the file like a circular buffer. That way you can keep the last X bytes of data without having to do what you're suggesting at all.
Even if you can't control the writing process, if you can at least control what file it writes to, then maybe you could get it to write to a named pipe. You could attach your own program at the end of this named pipe that writes to a circular buffer as discussed.
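One way to picture writing to a file like a circular buffer is the sketch below. The class name RingFile and its interface are invented for illustration, and the write position is kept only in memory; a real version would probably persist it (for example in a small header) so that a reader or a restarted process can find the oldest byte.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical sketch: a fixed-size file that always holds the most recent
// `capacity` bytes appended to it. New data overwrites the oldest data.
class RingFile {
public:
    RingFile(const std::string& path, std::size_t capacity)
        : capacity_(std::max<std::size_t>(capacity, 1)) {
        {   // Create (or reset) the file at its fixed size.
            std::ofstream create(path, std::ios::binary | std::ios::trunc);
            create << std::string(capacity_, '\0');
        }
        file_.open(path, std::ios::binary | std::ios::in | std::ios::out);
    }

    void append(const std::string& data) {
        // Only the last capacity_ bytes can survive, so skip anything older.
        const std::size_t skip = data.size() > capacity_ ? data.size() - capacity_ : 0;
        pos_ = (pos_ + skip) % capacity_;
        const std::size_t n = data.size() - skip;
        const std::size_t first = std::min(n, capacity_ - pos_);
        write_at(pos_, data.data() + skip, first);           // up to the end of the file
        write_at(0, data.data() + skip + first, n - first);  // wrap around to the start
        pos_ = (pos_ + n) % capacity_;
        filled_ = std::min(capacity_, filled_ + data.size());
        file_.flush();
    }

    // Returns the stored bytes in logical order, oldest first.
    std::string contents() {
        std::string out(filled_, '\0');
        if (filled_ < capacity_) {
            read_at(0, &out[0], filled_);  // not wrapped yet
        } else {
            read_at(pos_, &out[0], capacity_ - pos_);        // oldest part
            read_at(0, &out[0] + (capacity_ - pos_), pos_);  // newest part
        }
        return out;
    }

private:
    void write_at(std::size_t off, const char* p, std::size_t n) {
        if (n == 0) return;
        file_.seekp(static_cast<std::streamoff>(off));
        file_.write(p, static_cast<std::streamsize>(n));
    }

    void read_at(std::size_t off, char* p, std::size_t n) {
        if (n == 0) return;
        file_.seekg(static_cast<std::streamoff>(off));
        file_.read(p, static_cast<std::streamsize>(n));
    }

    std::size_t capacity_;
    std::fstream file_;
    std::size_t pos_ = 0;     // next write offset inside the file
    std::size_t filled_ = 0;  // number of valid bytes, at most capacity_
};
```

The file never grows past its capacity, so nothing has to be copied or truncated later. The trade-off is that the bytes on disk are not in chronological order once the buffer has wrapped.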