Erlang: Creating a file watcher

Posted on 2024-11-01 20:06:13

I have to implement file watcher functionality in Erlang: there should be a process that lists the files in a specific directory and does something when files appear.

I took a look at OTP, so at the moment I have the following ideas:
1. Create a Supervisor that will control the gen_servers (one server per folder).
2. Create a WatchServer, a gen_server for each folder that I want to monitor.
3. Create a ProcessFileServer, a gen_server that should do something with the files (say, copy them to a different folder).

So the first problem: a WatchServer should not wait for a request; it should generate one at predefined intervals.

At the moment I create a timer in the init/1 function and handle the on_timer event in the handle_info function.

Now the questions:
1. Are there better ideas?
2. How should I inform the ProcessFileServer that a file was found? It seems to me that it would be much more convenient to create the WatchServers and ProcessServers independently, but in that case I do not know whom to send the message to.

Maybe there are some similar projects or libraries available?

Comments (3)

失去的东西太少 2024-11-08 20:06:13

If you are using Linux, you can use inotify. It is a kernel service that lets you subscribe to file system events. Don't poll the filesystem; let the filesystem call you.

You can try https://github.com/massemanet/inotify for observing your directory.

Ulf

如梦初醒的夏天 2024-11-08 20:06:13

In Erlang it is very cheap to create processes (orders of magnitude cheaper than in other systems).

Therefore I recommend creating a new ProcessFileServer each time a new file to process appears. When it is done, just terminate the process with exit reason normal.

I would suggest the following structure:

                              top_supervisor
                                      |
              +-----------------------+-------------------------+
              |                                                 |
       directory_supervisor                             processing_supervisor
               |                                         simple_one_for_one
    +----------+-----...-----+                                   |
    |          |             |                       starts children transient
    |          |             |                                   |
dir_watcher_1 dir_watcher_2 dir_watcher_n   +-------------+------+---...----+
                                            |             |                 |
                                        proc_file_1   proc_file_2       proc_file_n

When a dir_watcher notices that a new file has appeared, it calls supervisor:start_child/2 on the processing_supervisor, passing the file path as the extra argument.

The processing_supervisor should start its children with the transient restart policy.

So if one of the proc_file servers crashes, it is restarted, but when it terminates with exit reason normal, it is not. You just exit with reason normal when done and crash when anything else happens.
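A minimal sketch of such a processing supervisor, under some assumptions: the module name follows the diagram above, while process_file/1, start_worker/1 and handle_file/1 are hypothetical names introduced only for illustration, and the worker here is a bare proc_lib process rather than a full gen_server. The restart intensity values are arbitrary.

```erlang
-module(processing_supervisor).
-behaviour(supervisor).

-export([start_link/0, process_file/1, start_worker/1]).
-export([init/1]).

%% Start the supervisor, registered locally under the module name.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Hypothetical helper a dir_watcher could call for each new file.
%% For a simple_one_for_one supervisor, the second argument of
%% start_child/2 is a list of extra arguments that is appended to
%% the argument list in the child specification.
process_file(Path) ->
    supervisor:start_child(?MODULE, [Path]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,   % arbitrary: at most 5 restarts ...
                 period => 10},    % ... within 10 seconds
    ChildSpec = #{id => proc_file,
                  start => {?MODULE, start_worker, []},
                  restart => transient, % restart on crash, not on normal exit
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

%% Child start function: must return {ok, Pid} of a linked process.
start_worker(Path) ->
    {ok, proc_lib:spawn_link(fun() -> handle_file(Path) end)}.

%% Placeholder for the real work (for example copying the file).
%% Returning from this function ends the process with reason normal,
%% so the transient child is not restarted.
handle_file(Path) ->
    io:format("processing ~ts~n", [Path]).
```

With this in place, a watcher would call something like processing_supervisor:process_file("/some/dir/file.txt") and get back {ok, Pid} for the freshly started worker.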

If you don't overdo it, cyclic polling for files is OK. If the system becomes loaded because of this polling, you can look into kernel notification systems (e.g. FreeBSD kqueue or the higher-level services built on it on Mac OS X) to send you a message when a file appears in a directory. These services come with their own complexity, however, because they have to give up if too many events happen (otherwise they would not be a performance improvement but the opposite). So you will need a robust polling solution as a fallback anyway.

So don't optimize prematurely: start with polling and add improvements (which would be isolated in the dir_watcher servers) when they become necessary.


Regarding the comment about which behaviour to use for the dir_watcher process, since it doesn't use much of gen_server's functionality:

  • There is no problem with using only part of gen_server's possibilities; in fact it is very common not to use all of them. In your case you only set up a timer in init and use handle_info to do your work. The rest of the gen_server is just the unchanged template.

  • If you later want to change parameters such as the poll frequency, that is easy to add.

  • gen_fsm is much less used, since it only fits a quite limited model and is not very flexible. I use it only when it really fits the requirement 100% (which it almost never does).

  • If you just want a simple plain Erlang server, you can use the spawn functions in proc_lib to get just the minimal functionality needed to run under a supervisor.

  • An interesting way to write more natural Erlang code and still have the OTP advantages is plain_fsm. It gives you selective receive and flexible message handling, which are needed especially when handling protocols, paired with the nice features of OTP.
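The proc_lib option from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: plain_watcher, its Work fun and the stop message are hypothetical names, and the loop does not handle system messages, so it is less complete than what gen_server provides.

```erlang
-module(plain_watcher).
-export([start_link/2, init/3]).

%% Interval is in milliseconds; Work is a zero-arity fun run on each tick.
%% proc_lib:start_link/3 blocks until the new process calls init_ack/2,
%% which gives the {ok, Pid} return value a supervisor expects.
start_link(Interval, Work) ->
    proc_lib:start_link(?MODULE, init, [self(), Interval, Work]).

init(Parent, Interval, Work) ->
    proc_lib:init_ack(Parent, {ok, self()}),
    loop(Interval, Work).

%% Plain receive loop: wake up every Interval ms and do the work.
%% A stop message (hypothetical) ends the process with reason normal.
loop(Interval, Work) ->
    receive
        stop -> ok
    after Interval ->
        Work(),
        loop(Interval, Work)
    end.
```

The trade-off is visible here: the code is short, but everything gen_server gives for free (calls, debugging via sys, code change) would have to be added by hand.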

Having said all this: if I were to write a dir_watcher, I'd just use a gen_server and use only what I need. The unused functionality doesn't really cost you anything, and everybody understands what it does.
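A polling dir_watcher along these lines might look like the sketch below. It is an illustration, not a definitive implementation: the start_link/3 arguments, the state record and the OnNew callback are assumptions made for the example, and on its first poll the watcher treats every file already in the directory as new.

```erlang
-module(dir_watcher).
-behaviour(gen_server).

-export([start_link/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {dir, interval, on_new, seen = []}).

%% Dir: directory to watch. Interval: poll period in milliseconds.
%% OnNew: hypothetical hook, a fun(Path) called once per newly seen file.
start_link(Dir, Interval, OnNew) ->
    gen_server:start_link(?MODULE, {Dir, Interval, OnNew}, []).

init({Dir, Interval, OnNew}) ->
    %% Schedule the first poll; the message arrives in handle_info/2.
    erlang:send_after(Interval, self(), poll),
    {ok, #state{dir = Dir, interval = Interval, on_new = OnNew}}.

%% Unused parts of the gen_server template.
handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(poll, State = #state{dir = Dir, interval = Interval,
                                 on_new = OnNew, seen = Seen}) ->
    %% If the directory cannot be listed, keep the previous snapshot.
    Names = case file:list_dir(Dir) of
                {ok, Listed} -> Listed;
                {error, _Reason} -> Seen
            end,
    %% Report every name that was not in the previous snapshot.
    lists:foreach(fun(Name) -> OnNew(filename:join(Dir, Name)) end,
                  Names -- Seen),
    %% Re-arm the timer for the next poll.
    erlang:send_after(Interval, self(), poll),
    {noreply, State#state{seen = Names}};
handle_info(_Other, State) ->
    {noreply, State}.
```

The OnNew fun is where the watcher would hand the path over, for instance by calling supervisor:start_child/2 on the processing supervisor. Keeping it as a parameter means the watcher does not need to know who processes the files.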

隱形的亼 2024-11-08 20:06:13

I have written such a library, based on polling. (It would be nice to extend it to use inotify on platforms where this is supported.) It was originally meant to be used in EUnit, but I turned it into a separate project instead. You can find it here:

https://github.com/richcarl/file_monitor
