Erlang: creating a file watcher
I have to implement file watcher functionality in Erlang: there should be a process that lists the files in a specific directory and does something when files appear.
I took a look at OTP. At the moment I have the following ideas:
1. Create a Supervisor that will control the gen_servers (one server per folder).
2. Create a WatchServer: a gen_server for each folder that I want to monitor.
3. Create a ProcessFileServer: a gen_server that should do something with the files (assume copying them to a different folder).
The first problem: a WatchServer should not wait for requests; it should generate one at predefined intervals.
At the moment I have created a timer in the init/1 function and handle the on_timer event in the handle_info function.
Now the questions:
1. Are there better ideas?
2. How should I inform the ProcessFileServer that a file was found? It seems to me that it would be much more convenient to create WatchServers and ProcessServers independently, but in that case I do not know to whom to send the message.
Are there perhaps similar projects or libraries available?
Comments (3)
If you are using Linux, you can use inotify. It is a kernel service that lets you subscribe to file system events. Don't poll the filesystem; let the filesystem call you.
You can try https://github.com/massemanet/inotify for observing your directory.
Ulf
In Erlang it is very cheap to create processes (orders of magnitude cheaper than in other systems).
Therefore I recommend creating a new ProcessFileServer each time a new file to process appears. When it is done, just terminate the process with exit reason normal.
I would suggest the following structure. When a dir_watcher notices that a new file has appeared, it calls the supervisor:start_child/2 function on the processing_supervisor, with the file path as an extra parameter.
The processing_supervisor should start its children with the transient restart policy. So if one of the proc_file servers crashes, it will be restarted, but when they terminate with exit reason normal they are not restarted. So you just exit normal when done and crash when anything else happens.
If you don't overdo it, cyclic polling for files is OK. If the system becomes loaded because of this polling, you can look into kernel notification systems (e.g. FreeBSD kqueue or the higher-level services built on it on Mac OS X) to send you a message when a file appears in a directory. These services have their own complexity, however, because they have to give up when too many events happen (otherwise they would not be a performance improvement but the opposite). So you will need a robust polling solution as a fallback anyway.
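The supervisor arrangement described above could be sketched roughly as follows. This is only an illustration under assumptions, not the answer's actual code: the function names start_link/1, process_file/1 and start_worker/2 and the DestDir argument are hypothetical, the worker is a bare proc_lib process rather than a full gen_server, and "processing" is assumed to mean copying the file into DestDir. The simple_one_for_one strategy is what lets supervisor:start_child/2 append the file path to the child's start arguments.

```erlang
-module(processing_supervisor).
-behaviour(supervisor).

%% API (names are hypothetical, chosen for this sketch)
-export([start_link/1, process_file/1]).
%% supervisor callback and worker entry point
-export([init/1, start_worker/2]).

%% Start the supervisor; DestDir is the assumed target directory.
start_link(DestDir) ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, DestDir).

%% Called by a dir_watcher for every new file it notices.
%% [FilePath] is appended to the start arguments of the child spec.
process_file(FilePath) ->
    supervisor:start_child(?MODULE, [FilePath]).

init(DestDir) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    %% transient: restart after a crash, but not after exit reason normal
    Child = #{id => proc_file,
              start => {?MODULE, start_worker, [DestDir]},
              restart => transient,
              type => worker},
    {ok, {SupFlags, [Child]}}.

%% One short-lived worker per file. When the fun returns, the process
%% exits with reason normal and is therefore not restarted.
start_worker(DestDir, FilePath) ->
    Pid = proc_lib:spawn_link(fun() -> copy(DestDir, FilePath) end),
    {ok, Pid}.

%% A failed copy causes a badmatch, i.e. a crash, which triggers a restart.
copy(DestDir, FilePath) ->
    Target = filename:join(DestDir, filename:basename(FilePath)),
    {ok, _BytesCopied} = file:copy(FilePath, Target),
    ok.
```

Note that a file which can never be copied would crash its worker repeatedly; once the restart intensity (here 5 restarts in 10 seconds) is exceeded, the supervisor itself terminates, so a real implementation would probably want to distinguish permanent failures from transient ones.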
So don't optimize prematurely: start with polling and add improvements (which would be isolated in the dir_watcher servers) when they become necessary.
Regarding the comment about which behaviour to use for the dir_watcher process, since it doesn't use much of gen_server's functionality: there is no problem with using only part of gen_server's possibilities; in fact it is very common not to use all of them. In your case you only set up a timer in init and use handle_info to do your work. The rest of the gen_server is just the unchanged template. If you later want changeable parameters such as the poll frequency, it is easy to add them.
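A polling dir_watcher along these lines might look like the sketch below. It is a minimal example under assumptions rather than a finished implementation: the start_link/3 signature, the state record and the OnFile callback are all hypothetical names introduced here, and a file counts as "new" simply when its name was not in the previous directory listing.

```erlang
-module(dir_watcher).
-behaviour(gen_server).

%% API (the signature is an assumption made for this sketch)
-export([start_link/3]).
%% gen_server callbacks; only init/1 and handle_info/2 do real work
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {dir, interval, on_file, seen = []}).

%% Dir:        directory to poll
%% IntervalMs: poll period in milliseconds
%% OnFile:     fun(Path) called once for each newly seen file
start_link(Dir, IntervalMs, OnFile) ->
    gen_server:start_link(?MODULE, {Dir, IntervalMs, OnFile}, []).

init({Dir, IntervalMs, OnFile}) ->
    %% Schedule the first poll; the server never waits for a request.
    erlang:send_after(IntervalMs, self(), poll),
    {ok, #state{dir = Dir, interval = IntervalMs, on_file = OnFile}}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(poll, State = #state{dir = Dir, interval = Interval,
                                 on_file = OnFile, seen = Seen}) ->
    NewState =
        case file:list_dir(Dir) of
            {ok, Names} ->
                %% Everything not present in the previous listing is new.
                %% On the first poll this includes pre-existing files.
                New = Names -- Seen,
                lists:foreach(
                  fun(Name) -> OnFile(filename:join(Dir, Name)) end,
                  New),
                State#state{seen = Names};
            {error, _Reason} ->
                %% Directory unreadable: keep the old listing and retry.
                State
        end,
    erlang:send_after(Interval, self(), poll),
    {noreply, NewState};
handle_info(_Other, State) ->
    {noreply, State}.
```

The callback keeps the watcher independent of whoever processes the files: OnFile could, for example, be a fun that calls supervisor:start_child/2 on the processing supervisor with the path as the extra argument. A file that is still being written when the poll runs would be reported too early, so a real watcher may need an extra check for that.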
gen_fsm is much less used, since it only fits a quite limited model and is not very flexible. I use it only when it really fits the requirement 100% (which it almost never does).
In a case where you just want a simple, plain Erlang server, you can use the spawn functions in proc_lib to get just the minimal functionality needed to run under a supervisor.
An interesting way to write more natural Erlang code and still have the OTP advantages is plain_fsm. Here you have the advantages of selective receive and flexible message handling, needed especially when handling protocols, paired with the nice features of OTP.
Having said all this: if I were to write a dir_watcher, I'd just use a gen_server and use only what I need. The unused functionality doesn't really cost you anything, and everybody understands what it does.
I have written such a library, based on polling. (It would be nice to extend it to use inotify on platforms where this is supported.) It was originally meant to be used in EUnit, but I turned it into a separate project instead. You can find it here:
https://github.com/richcarl/file_monitor