Directory monitoring

Posted on 2024-09-01 11:43:30

What is the best way for me to check for new files added to a directory? I don't think FileSystemWatcher would be suitable, as this is not an always-on service but a method that runs when my program starts up.

There are over 20,000 files in the folder structure I am monitoring. At present I am checking each file individually to see whether its file path is in my database table, but this takes around ten minutes and I would like to speed it up if possible.

I can store the date the folder was last checked. Is it easy to get all files with a created date later than the last checked date?

Anyone got any ideas?

Thanks,

Mark

Comments (6)

意中人 2024-09-08 11:43:30

Your approach is the only feasible one (FileSystemWatcher lets you see changes as they happen; it does not check on startup).

Find out what takes so long. 20,000 checks should not take 10 minutes; one minute at most. Your program is slow as written. How do you test it?

Hint: do not query the database for each file. Load a list of all files on disk into memory, load a list of all files in the database, and compare the two in memory. Sending 20,000 SQL statements to the database is too slow; this way you need only ONE to get the list.
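The hint above can be sketched roughly as follows. This is a minimal illustration in Python rather than .NET, since the thread shows no code of its own; the table name `known_files`, its `path` column, and the function name `find_new_files` are all hypothetical, and SQLite merely stands in for whatever database is actually used.

```python
import os
import sqlite3


def find_new_files(root, conn):
    """Return paths under root that are not yet recorded in known_files."""
    # One query fetches every known path, instead of one query per file.
    # (known_files/path are assumed names, not taken from the thread.)
    known = {row[0] for row in conn.execute("SELECT path FROM known_files")}

    # Collect every file path currently on disk.
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            on_disk.add(os.path.join(dirpath, name))

    # Set difference is done in memory; no further database round trips.
    return on_disk - known
```

Swapping the operands (`known - on_disk`) would give the paths that are in the database but no longer on disk. The comparison only works if paths are stored in the same form that `os.path.join` produces here.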

软的没边 2024-09-08 11:43:30

10 minutes seems awfully long for 20,000 files. How are you going about doing the comparison? Your suggestion doesn't account for deleted files either. If you want to remove those from the database, you will have to do a full comparison.

Perhaps the problem is the database round trips. You can retrieve the known file list from the database in large chunks (or all at once), sorted alphabetically. Sort the local file list as well and walk the two lists, processing missing or new entries as you go along.
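A minimal sketch of walking the two sorted lists, written in Python for illustration. The function name `diff_sorted` is hypothetical, and the sketch assumes both lists are sorted with the same ordering and contain no duplicates; a database collation may sort differently from Python's string comparison, in which case the lists would need to be re-sorted locally first.

```python
def diff_sorted(db_paths, disk_paths):
    """Walk two sorted lists once; return (new_on_disk, missing_from_disk)."""
    new, missing = [], []
    i = j = 0
    while i < len(db_paths) and j < len(disk_paths):
        if db_paths[i] == disk_paths[j]:
            # Present in both: nothing to do.
            i += 1
            j += 1
        elif db_paths[i] < disk_paths[j]:
            # In the database but no longer on disk.
            missing.append(db_paths[i])
            i += 1
        else:
            # On disk but not yet in the database.
            new.append(disk_paths[j])
            j += 1
    # Whatever remains in either list has no counterpart in the other.
    missing.extend(db_paths[i:])
    new.extend(disk_paths[j:])
    return new, missing
```

Because each list is traversed only once, this also handles the deleted-file case mentioned above without a second pass.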

油焖大侠 2024-09-08 11:43:30

FileSystemWatcher is not reliable, so even if you could use a service, it would not necessarily work for you.

The two options I can see are:

  1. Keep a list of files you know about and keep comparing to this list. This will allow you to see if files were added, deleted etc. Keep this list in memory, instead of querying the database for each file.
  2. As you suggest, store a timestamp and compare to that.

罪#恶を代价 2024-09-08 11:43:30

You can write the timestamp of the most recently created file somewhere; it is simple and could work for you.
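One way the timestamp idea might look, as a Python sketch with hypothetical names. It compares each file's modification time (`st_mtime`) rather than its creation time, because creation time is not exposed consistently across platforms; that substitution is an assumption of this sketch, and a file copied into the folder can keep an older modification time and be missed.

```python
import os


def files_modified_since(root, last_checked):
    """Return paths under root modified after last_checked (epoch seconds)."""
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_mtime stands in for the creation date (assumption).
            if os.stat(path).st_mtime > last_checked:
                recent.append(path)
    return recent
```

The caller would record the time at which the scan started and pass it as `last_checked` on the next run. This still visits every file, but it avoids any database lookup per file.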

洒一地阳光 2024-09-08 11:43:30

Can you write a service that runs on that machine? The service can then use FileSystemWatcher.

苄①跕圉湢 2024-09-08 11:43:30

Having a FileSystemWatcher service like Kevin Jones suggests is probably the most pragmatic answer, but there are some other options.

You can watch the directory with inotify if you mount it with Samba on a Linux box. That of course assumes you don't mind fragmenting your platform, but that's what inotify is there for.

A more correct answer, though with correspondingly less chance of you getting a go-ahead: if you're monitoring a directory with 20K files in it, it is probably time to evolve your system architecture. Without knowing much more about your application, it sounds like a message queue might be worth looking at.
