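How do I flush a file in Perl?
I have a Perl script that appends a new line to an existing file every 3 seconds. There is also a C++ application that reads from that file.
The problem is that the application only begins to read the file after the script has finished and the file handle is closed. To avoid this, I want to flush after each line is appended. How can I do that?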
Try enabling autoflush on the file handle.
This was actually posted as a way of auto-flushing in an earlier question of mine, which asked about the universally accepted bad way of achieving this :-)
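The code from this answer did not survive on this page, so what follows is only a minimal sketch of auto-flushing a file handle with IO::Handle's autoflush method. The temporary file and the variable names are assumptions made for illustration; the real script would open its own file in append mode instead.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush method on file handles
use File::Temp qw(tempfile);    # temporary file, only so the sketch is self-contained

my ($fh, $path) = tempfile(UNLINK => 1);

# With autoflush on, every print to $fh leaves Perl's buffer immediately.
$fh->autoflush(1);

print {$fh} "first line\n";

# The handle is still open, yet another process could already see the data.
my $size_while_open = (stat $path)[7];
print "bytes visible while the handle is still open: $size_while_open\n";
```

Autoflush has to be enabled before the prints it should affect; it does not push out data that is already sitting in the buffer.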
TL;DR: use IO::Handle and its flush method.
First, you need to decide how "flushed" you want it. There can be quite a few layers of buffering:
Perl's internal buffer on the file handle. Other programs can't see data until it has left this buffer.
File-system level buffering of "dirty" file blocks. Other programs can still see these changes, and they seem "written", but they'll be lost if the OS or machine crashes.
Disk-level write-back buffering of writes. The OS thinks these are written to disk, but the disk is actually just storing them in volatile memory on the drive. If the OS crashes the data won't be lost, but if power fails it might be, unless the disk can write it out first. This is a big problem with cheap consumer SSDs.
It gets even more complicated when SANs, remote file systems, RAID controllers, etc. get involved. If you're writing via pipes there's also the pipe buffer to consider.
If you just want to flush the Perl buffer, you can close the file, print a string containing "\n" (Perl flushes on newlines only when the handle is line-buffered, which typically means it is attached to a terminal, so this is not reliable for regular files), or use IO::Handle's flush method.
You can also, per the Perl FAQ, use binmode or play with $| to make the file handle unbuffered. This is not the same thing as flushing a buffered handle, since queuing up a bunch of buffered writes and then doing a single flush has a much lower performance cost than writing to an unbuffered handle.
If you want to flush the file system write-back buffer you need to use a system call like fsync(), open your file in O_DSYNC mode, or use one of the numerous other options. It's painfully complicated, as evidenced by the fact that PostgreSQL has its own tool just to test file syncing methods.
If you want to make sure it's really, truly, honestly on the hard drive in permanent storage, you must flush it to the file system in your program. You also need to configure the hard drive/SSD/RAID controller/SAN/whatever to really flush when the OS asks it to. This can be surprisingly complicated to do and is quite OS/hardware specific. "Plug-pull" testing is strongly recommended to make sure you've really got it right.
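The TL;DR of this answer can be sketched roughly as follows. This only illustrates the first buffering layer (Perl's own buffer); the temporary file and the variable names are assumptions made for the sake of a self-contained example.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the flush method
use File::Temp qw(tempfile);    # temporary file, only for the demonstration

my ($fh, $path) = tempfile(UNLINK => 1);

print {$fh} "line 1\n";
my $size_before = (stat $path)[7];   # data is still in Perl's buffer

# Push Perl's buffer out to the operating system.
$fh->flush or die "flush failed: $!";
my $size_after = (stat $path)[7];    # now other programs can see the line

print "before flush: $size_before bytes, after flush: $size_after bytes\n";
```

Calling flush after each appended line keeps the handle buffered in between, which is the cheaper option the answer contrasts with a fully unbuffered handle.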
From 'man perlfaq5':
If you just want to flush stdout, you can probably just set $| to 1.
But check the FAQ for details on a module that gives you a nicer-to-use abstraction, like IO::Handle.
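The snippets quoted from perlfaq5 are missing on this page. As a hedged sketch, the usual $| idiom for a handle other than stdout looks something like this: $| applies to the currently selected output handle, so the handle is selected, $| is set, and the previous selection is restored. The temporary file and the names are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # temporary file, only for the demonstration

my ($fh, $path) = tempfile(UNLINK => 1);

# $| affects whichever handle is currently selected, so select ours first.
my $old_fh = select($fh);
$| = 1;                          # unbuffer $fh
select($old_fh);                 # restore the previously selected handle (normally STDOUT)

print {$fh} "written with \$| set\n";

my $visible_bytes = (stat $path)[7];
print "bytes visible without an explicit flush: $visible_bytes\n";
```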
Here's the answer - the real answer.
Stop maintaining an open file handle for this file for the life of the process.
Start abstracting your file-append operation into a sub that opens the file in append mode, writes to it, and closes it.
The process observing the file will encounter a closed file that gets modified whenever the function is called.
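A minimal sketch of the sub this answer describes might look like the following. The name append_line, the temporary directory, and the file name are hypothetical; they are not from the original answer.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);     # temporary directory, only for the demonstration

# append_line is a hypothetical helper: open in append mode, write, close.
sub append_line {
    my ($path, $line) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path for append: $!";
    print {$fh} "$line\n"     or die "Cannot write to $path: $!";
    close($fh)                or die "Cannot close $path: $!";  # close flushes Perl's buffer
    return;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/log.txt";

# The script from the question would sleep 3 seconds between calls.
append_line($path, "entry $_") for 1 .. 3;

open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close($in);
print "lines in file: ", scalar(@lines), "\n";
```

The trade-off is one open and one close per line, which is negligible at one line every 3 seconds.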
All of the solutions that suggest setting autoflush ignore the basic fact that most modern OSes buffer file I/O irrespective of what Perl is doing.
Your only possibility to force the commitment of the data to disk is by closing the file.
I'm trapped in the same dilemma at the moment, where we have an issue with rotation of the log being written.
To automatically flush the output, you can set autoflush/$| as described by others before you output to the filehandle.
If you've already output to the filehandle and need to ensure that it gets to the physical file, you need to use the IO::Handle flush and sync methods.
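As a rough sketch of the flush-then-sync sequence mentioned above (the temporary file and names are assumptions, and IO::Handle documents sync as not being implemented on every platform):

```perl
use strict;
use warnings;
use IO::Handle;                 # provides flush and sync
use File::Temp qw(tempfile);    # temporary file, only for the demonstration

my ($fh, $path) = tempfile(UNLINK => 1);

print {$fh} "important record\n";

# Step 1: move the data from Perl's buffer to the operating system.
$fh->flush or die "flush failed: $!";

# Step 2: ask the operating system to write the file's data to the storage device.
$fh->sync  or die "sync failed: $!";

my $bytes_on_file = (stat $path)[7];
print "bytes in file after flush and sync: $bytes_on_file\n";
```

The order matters: sync works on the file descriptor, so data still held in Perl's buffer would not be covered unless flush is called first.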
There is an article about this in PerlDoc: How do I flush/unbuffer an output filehandle? Why must I do this?
Two solutions:
Set $| to a true value while the output filehandle is selected.
Call the autoflush method if you use IO::Handle or one of its subclasses.
An alternative approach would be to use a named pipe between your Perl script and C++ program, in lieu of the file you're currently using.
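A hedged sketch of the named-pipe idea, for Unix-like systems only: here a forked child plays the role of the Perl script and the parent stands in for the C++ reader, purely so the example runs on its own. The FIFO path and the variable names are assumptions.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use IO::Handle;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/lines.fifo";                 # hypothetical pipe name
mkfifo($fifo, 0600) or die "mkfifo failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the writer. Opening a FIFO blocks until a reader opens it too.
    open(my $out, '>', $fifo) or die "Cannot open FIFO for writing: $!";
    $out->autoflush(1);                       # Perl buffers pipe output as well
    print {$out} "line $_\n" for 1 .. 3;
    close($out);
    POSIX::_exit(0);                          # leave without running the parent's cleanup code
}

# Parent: the reader, standing in for the C++ application.
open(my $in, '<', $fifo) or die "Cannot open FIFO for reading: $!";
my @received = <$in>;                         # returns once the writer closes the pipe
close($in);
waitpid($pid, 0);

print "received: $_" for @received;
```

Unlike a regular file, the pipe keeps no history: the reader only gets lines written while it is connected.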
For those who are searching for a solution to flush output line by line to a file in Ansys CFD Post using a session file (*.cse), this is the only solution that worked for me:
Note that you need an exclamation mark at the beginning of every line that contains Perl script.
sleep(3); is only there for demonstration purposes. use IO::Handle; is not needed.
The genuine correct answer is to use:-
and although that is one cause of your problem, the other cause of the same problem is this: "Also, there is a C++ application which reads from that file."
It is EXTREMELY NON-TRIVIAL to write C++ code that can properly read from a file that is growing, because your C++ program will encounter an EOF when it gets to the end (you cannot read past the end of a file without serious extra trickery). You have to do a pile of complicated stuff with IO blocking and flags to properly monitor a file this way (like how the Linux "tail" command works).
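To illustrate the EOF problem on the reading side, here is a small sketch written in Perl rather than C++, to stay in one language. It is not the C++ application from the question. After hitting EOF, the reader resets the handle's EOF state with a zero-byte seek before trying again; a C++ reader would need an analogous step, such as clearing the stream's EOF state. The helper name read_new_lines and the temporary file are assumptions.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);    # temporary file, only for the demonstration

my ($writer, $path) = tempfile(UNLINK => 1);
$writer->autoflush(1);                        # writer side: flush after every print

open(my $reader, '<', $path) or die "Cannot open $path: $!";

# read_new_lines is a hypothetical helper: return whatever can be read right now.
sub read_new_lines {
    my ($fh) = @_;
    seek($fh, 0, 1);                          # zero-byte seek from the current position resets EOF
    my @lines;
    while (defined(my $line = <$fh>)) {
        push @lines, $line;
    }
    return @lines;
}

my @seen;

print {$writer} "line 1\n";
push @seen, read_new_lines($reader);          # reads "line 1", then hits EOF

print {$writer} "line 2\n";                   # the file grows after the reader reached EOF
push @seen, read_new_lines($reader);          # picks up "line 2" after the EOF reset

print "reader saw: $_" for @seen;
```

A real monitor would call the helper in a loop with a short sleep between polls, and would have to cope with partially written lines.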
I had the same problem with the only difference of writing the same file over and over again with new content. This association of "$| = 1" and autoflush worked for me:
Best of luck.
H