How do I watch a file for changes?
I have a log file being written by another process which I want to watch for changes. Each time a change occurs I'd like to read the new data in to do some processing on it.
What's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.
If anyone's done anything like this I'd be really grateful to hear how...
[Edit] I should have mentioned that I was after a solution that doesn't require polling.
[Edit] Curses! It seems this doesn't work over a mapped network drive. I'm guessing Windows doesn't 'hear' any updates to the file the way it does on a local disk.
Seems that no one has posted fswatch. It is a cross-platform file system watcher. Just install it, run it and follow the prompts.
I've used it with python and golang programs and it just works.
Since I have it installed globally, my favorite approach is to use nodemon. If your source code is in src, and your entry point is src/app.py, then it's as easy as:
... where -e py,html lets you control what file types to watch for changes.
Just to put this out there since no one mentioned it: there's a Python module in the Standard Library named filecmp which has a cmp() function that compares two files.
Just make sure you don't do from filecmp import cmp, so as not to shadow the built-in cmp() function in Python 2.x. That's okay in Python 3.x, though, since there's no such built-in cmp() function anymore.
Anyway, it is used by calling filecmp.cmp() with the paths of the two files.
The argument shallow defaults to True. If it is True and the os.stat() signatures of the two files (file type, size and modification time) are identical, the files are taken to be equal without reading them; otherwise, and always when the argument is False, the contents of the files are compared.
Maybe this information will be useful to someone.
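The snippet that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of how filecmp.cmp() can be called. The helper name files_match and the throwaway file names are made up for illustration.

```python
import filecmp
import os
import tempfile


def files_match(path_a, path_b):
    """Return True if the two files have identical contents."""
    # shallow=False forces a content comparison instead of trusting
    # matching os.stat() signatures (type, size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)


# Demo with two throwaway files (hypothetical names).
demo_dir = tempfile.mkdtemp()
first_path = os.path.join(demo_dir, "first.log")
second_path = os.path.join(demo_dir, "second.log")
for path in (first_path, second_path):
    with open(path, "w") as handle:
        handle.write("line 1\n")

same_before = files_match(first_path, second_path)
with open(second_path, "a") as handle:
    handle.write("line 2\n")
same_after = files_match(first_path, second_path)

print(same_before, same_after)  # prints: True False
```

Note that this compares two different files; to detect changes to a single log file you would have to keep a copy of its earlier state to compare against.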
Here's an example geared toward watching input files that write no more than one line per second but usually a lot less. The goal is to append the last line (most recent write) to the specified output file. I've copied this from one of my projects and just deleted all the irrelevant lines. You'll have to fill in or change the missing symbols.
Of course, the encompassing QMainWindow class is not strictly required, i.e. you can use QFileSystemWatcher alone.
watchfiles (https://github.com/samuelcolvin/watchfiles) is a Python API and CLI that uses the Notify (https://github.com/notify-rs/notify) library written in Rust.
The rust implementation currently (2022-10-09) supports:
Binaries available on PyPI (https://pypi.org/project/watchfiles/) and conda-forge (https://github.com/conda-forge/watchfiles-feedstock).
You can also use a simple library called repyt, here is an example:
Related to @4Oh4's solution, here is a small change that watches a list of files:
The best and simplest solution is to use pygtail:
https://pypi.python.org/pypi/pygtail
The easiest solution would be to take two snapshots of the same file, one interval apart, and compare them. You could try something like this:
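The code from this answer is missing from the page, so here is a rough sketch of the snapshot-and-compare idea. The function names and the max_checks parameter are assumptions added so the example can terminate; reading the whole file on each poll is only reasonable for small files.

```python
import time


def read_snapshot(path):
    """Return the whole file as bytes (fine for small files only)."""
    with open(path, "rb") as handle:
        return handle.read()


def wait_for_change(path, interval=1.0, max_checks=None):
    """Poll until the contents differ from the first snapshot.

    Returns True on a change, or False if max_checks polls pass
    without one. With max_checks=None it polls indefinitely.
    """
    previous = read_snapshot(path)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        if read_snapshot(path) != previous:
            return True
        checks += 1
    return False
```

A caller would typically run wait_for_change("some.log") in a loop and process the file each time it returns True.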
If you're using windows, create this POLL.CMD file
then you can type "poll dir1 dir2" and it will copy all the files from dir1 to dir2 and check for updates once per second.
The "find" is optional, just to make the console less noisy.
This is not recursive. Maybe you could make it recursive using /e on the xcopy.
I don't know any Windows specific function. You could try getting the MD5 hash of the file every second/minute/hour (depends on how fast you need it) and compare it to the last hash. When it differs you know the file has been changed and you read out the newest lines.
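The hash comparison described above could be sketched as follows. This is an illustration, not code from the answer: file_md5 and HashWatcher are hypothetical names, and the file is hashed in chunks so a large log does not have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class HashWatcher:
    """Remembers the last digest and reports whether it has changed."""

    def __init__(self, path):
        self.path = path
        self.last_digest = file_md5(path)

    def has_changed(self):
        """Re-hash the file; return True if the digest differs from last time."""
        current = file_md5(self.path)
        changed = current != self.last_digest
        self.last_digest = current
        return changed
```

Calling has_changed() once per second, minute or hour gives the polling behaviour the answer describes. Hashing rereads the entire file each time, so for big logs a cheaper check such as file size or modification time may be preferable.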
I'd try something like this.
The loop checks whether there are new lines since the last time the file was read. If there are, each one is read and passed to the functionThatAnalisesTheLine function. If not, the script waits 1 second and retries the process.
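The loop from this answer was not preserved, so the following is one possible reconstruction rather than the author's code. It remembers the file offset between polls and passes every new line to a callback, which stands in for functionThatAnalisesTheLine. The max_idle_polls parameter is an assumption added so that the loop can be stopped in a demo.

```python
import time


def read_new_lines(path, offset):
    """Return (lines appended since `offset`, new offset)."""
    lines = []
    with open(path, "r") as handle:
        handle.seek(offset)
        while True:
            line = handle.readline()
            if not line:  # empty string means end of file
                break
            lines.append(line)
        return lines, handle.tell()


def watch(path, handle_line, interval=1.0, max_idle_polls=None):
    """Pass each new line to `handle_line`, sleeping when there is none."""
    offset = 0
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        lines, offset = read_new_lines(path, offset)
        if lines:
            idle = 0
            for line in lines:
                handle_line(line)
        else:
            idle += 1
            time.sleep(interval)
```

For example, watch("app.log", print) would echo every line as it is appended. Reopening the file on each poll keeps the sketch simple and copes with the writer closing and reopening the file, at the cost of a little overhead.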
Check out pyinotify.
inotify replaces dnotify (from an earlier answer) in newer linuxes and allows file-level rather than directory-level monitoring.
For watching a single file with polling and minimal dependencies, here is a fully fleshed-out example, based on the answer from Deestan (above):
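The example itself did not survive on this page. A minimal sketch in the same spirit, using only the standard library, might look like the following; the class name FileWatcher and its methods are illustrative assumptions, not the original code.

```python
import os
import time


class FileWatcher:
    """Calls `callback(path)` whenever the file's modification time changes."""

    def __init__(self, path, callback):
        self.path = path
        self.callback = callback
        self._last_mtime = os.stat(path).st_mtime

    def check(self):
        """Poll once; return True if a change was detected."""
        mtime = os.stat(self.path).st_mtime
        if mtime == self._last_mtime:
            return False
        self._last_mtime = mtime
        self.callback(self.path)
        return True

    def run(self, interval=1.0):
        """Poll until interrupted with Ctrl+C."""
        try:
            while True:
                self.check()
                time.sleep(interval)
        except KeyboardInterrupt:
            pass
```

Usage would be along the lines of FileWatcher("app.log", print).run(). The sketch assumes the file exists for the whole run; a more robust version would catch FileNotFoundError in check().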
Well after a bit of hacking of Tim Golden's script, I have the following which seems to work quite well:
It could probably do with a load more error checking, but for simply watching a log file and doing some processing on it before spitting it out to the screen, this works well.
Thanks everyone for your input - great stuff!
Check my answer to a similar question. You could try the same loop in Python. This page suggests:
Also see the question tail() a file with Python.
Here is a simplified version of Kender's code that appears to do the same trick and does not import the entire file:
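The simplified code is missing here. One way to follow a file without reading what is already in it is to seek to the end before entering the loop, as in this sketch. The generator name follow and the max_idle_polls parameter are assumptions for illustration.

```python
import os
import time


def follow(path, interval=1.0, max_idle_polls=None):
    """Yield lines appended to `path` after iteration starts.

    The file is opened and positioned at its end on the first next()
    call, because generator bodies run lazily.
    """
    with open(path, "r") as handle:
        handle.seek(0, os.SEEK_END)  # skip everything already in the file
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            line = handle.readline()
            if line:
                idle = 0
                yield line
            else:
                idle += 1
                time.sleep(interval)
```

A typical use is for line in follow("app.log"): process(line), where process is whatever handling the log lines need.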
This is another modification of Tim Golden's script that runs on Unix-like systems and adds a simple watcher for file modification by using a dict (file=>time).
usage: whateverName.py path_to_dir_to_watch
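The script is not reproduced on this page. The dict-based idea (file name mapped to modification time, compared between polls) can be sketched like this; the function names are hypothetical and subdirectories are ignored for brevity.

```python
import os
import time


def snapshot(directory):
    """Map each regular file in `directory` to its modification time."""
    stamps = {}
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            stamps[name] = os.stat(full_path).st_mtime
    return stamps


def diff_snapshots(before, after):
    """Return (added, removed, modified) lists of file names."""
    added = sorted(name for name in after if name not in before)
    removed = sorted(name for name in before if name not in after)
    modified = sorted(name for name in after
                      if name in before and after[name] != before[name])
    return added, removed, modified


def watch(directory, interval=1.0):
    """Print changes to the directory forever; stop with Ctrl+C."""
    before = snapshot(directory)
    while True:
        time.sleep(interval)
        after = snapshot(directory)
        added, removed, modified = diff_snapshots(before, after)
        if added or removed or modified:
            print("added:", added, "removed:", removed, "modified:", modified)
        before = after
```

To match the usage line above, a script would call watch(sys.argv[1]) after importing sys.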
The simplest solution for me is using watchdog's tool watchmedo, from https://pypi.python.org/pypi/watchdog. I now have a process that looks for SQL files in a directory and executes them if necessary.
Well, since you are using Python, you can just open a file and keep reading lines from it.
If the line read is not empty, you process it.
You may be missing that it is OK to keep calling readline at the EOF. It will just keep returning an empty string in this case. And when something is appended to the log file, the reading will continue from where it stopped, as you need.
If you are looking for a solution that uses events, or a particular library, please specify this in your question. Otherwise, I think this solution is just fine.
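The readline behaviour at end of file can be seen in a small self-contained demo. The temporary file and the variable names are made up for illustration; in a real watcher the writer would be the other process, and the reader would sleep briefly whenever readline returns an empty string.

```python
import os
import tempfile

# Throwaway log file standing in for the real one.
fd, log_path = tempfile.mkstemp(suffix=".log")
os.close(fd)

with open(log_path, "a") as writer, open(log_path, "r") as reader:
    writer.write("first entry\n")
    writer.flush()
    first = reader.readline()    # the line just written
    at_eof = reader.readline()   # nothing more yet: empty string, no error

    writer.write("second entry\n")
    writer.flush()
    resumed = reader.readline()  # same handle picks up where it stopped

print(repr(first), repr(at_eof), repr(resumed))
# prints: 'first entry\n' '' 'second entry\n'
os.remove(log_path)
```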
As you can see in Tim Golden's article, pointed to by Horst Gutmann, the Win32 approach is relatively complex and watches directories, not a single file.
I'd like to suggest you look into IronPython, which is a .NET python implementation.
With IronPython you can use all the .NET functionality - including
Which handles single files with a simple Event interface.
This is an example of checking a file for changes. One that may not be the best way of doing it, but it sure is a short way.
Handy tool for restarting application when changes have been made to the source. I made this when playing with pygame so I can see effects take place immediately after file save.
When used in pygame make sure the stuff in the 'while' loop is placed in your game loop aka update or whatever. Otherwise your application will get stuck in an infinite loop and you will not see your game updating.
In case you wanted the restart code which I found on the web. Here it is. (Not relevant to the question, though it could come in handy)
Have fun making electrons do what you want them to do.
Did you try using Watchdog?
If polling is good enough for you, I'd just watch whether the file's "modified time" stat changes. To read it, call os.stat() on the file and look at its st_mtime attribute.
(Also note that the Windows native change event solution does not work in all circumstances, e.g. on network drives.)
If you want a multiplatform solution, then check QFileSystemWatcher.
Here is some example code (not sanitized):
This will not work on Windows (maybe with Cygwin?), but Unix users can use the "fcntl" system call. Here is an example in Python. The code is mostly the same if you need to write it in C (same function names).