How to verify that an FTP transfer is complete?

Posted on 2024-07-29 16:23:46

I have an application that continuously polls a folder. Once a file has been transferred to that folder via FTP, the application has to move it to another folder for processing.

The problem is that we have no way to verify whether the FTP transfer is complete.

The "lsof" command is suggested in the technical forums. It has a file descriptor (FD) column that gives the file status.

Since this command comes from FreeBSD and is not present in old versions of Linux, I want to clarify how it is used.

Can you tell us about your experience with file verification, and is there any alternative solution available?

Also, is there any risk in using this utility?

Thanks in advance for your help.

Thanks,
Mathew Liju

Comments (3)

韬韬不绝 2024-08-05 16:23:46

We've done this before in a number of different ways.

Method one:

If you can control the process sending the files, have it send the file itself followed by a sentinel file. For example, send the real file "contracts.doc" followed by a one-byte "contracts.doc.sentinel".

Then have your listener process watch for the sentinel files. When one of them is created, process the corresponding data file, then delete both.

Delete any data file that is more than a day old and has no corresponding sentinel file: it was a failed transmission.
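Method one could be sketched roughly as follows in Python. This is only an illustration under assumptions: the `.sentinel` suffix follows the example above, while the function names, the one-day constant, and the idea of moving the data file to a separate work directory (as the question describes) are hypothetical choices, not something the sender or any FTP server prescribes.

```python
from __future__ import annotations

import time
from pathlib import Path

SENTINEL_SUFFIX = ".sentinel"   # assumed naming convention, as in "contracts.doc.sentinel"
MAX_AGE_SECONDS = 24 * 60 * 60  # "more than a day old"


def ready_files(inbox: Path) -> list[Path]:
    """Return data files in inbox whose sentinel file has arrived."""
    ready = []
    for sentinel in sorted(inbox.glob("*" + SENTINEL_SUFFIX)):
        data = sentinel.with_name(sentinel.name[: -len(SENTINEL_SUFFIX)])
        if data.is_file():
            ready.append(data)
    return ready


def claim(data: Path, work_dir: Path) -> Path:
    """Move a completed data file into work_dir and delete its sentinel."""
    target = work_dir / data.name
    # rename() is atomic only when both directories are on the same filesystem.
    data.rename(target)
    data.with_name(data.name + SENTINEL_SUFFIX).unlink()
    return target


def stale_files(inbox: Path, now: float | None = None) -> list[Path]:
    """Return data files older than a day that never received a sentinel."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(inbox.iterdir()):
        if not path.is_file() or path.name.endswith(SENTINEL_SUFFIX):
            continue
        has_sentinel = path.with_name(path.name + SENTINEL_SUFFIX).exists()
        if not has_sentinel and now - path.stat().st_mtime > MAX_AGE_SECONDS:
            stale.append(path)
    return stale
```

A polling loop would call `ready_files()` on each pass, `claim()` every file it returns, and periodically delete whatever `stale_files()` reports.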

Method two:

Keep an eye on the files themselves (specifically the last modification date/time). Only process files whose modification time is more than N minutes in the past. That increases the latency of processing the files but you can usually be certain that, if a file hasn't been written to in five minutes (for example), it's done.
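A minimal sketch of the timestamp check, again in Python. The function name and the five-minute default are assumptions taken from the example figure above; pick N to suit how your senders behave.

```python
from __future__ import annotations

import time
from pathlib import Path

QUIET_SECONDS = 5 * 60  # N minutes; five is only the example figure used above


def quiet_files(inbox: Path,
                quiet_seconds: float = QUIET_SECONDS,
                now: float | None = None) -> list[Path]:
    """Return files whose last modification is more than quiet_seconds ago."""
    now = time.time() if now is None else now
    return sorted(
        path
        for path in inbox.iterdir()
        if path.is_file() and now - path.stat().st_mtime > quiet_seconds
    )
```

Note that this only tells you nobody has written to the file recently. A transfer that died halfway looks exactly the same as a finished one.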

Conclusion:

We have used both methods successfully in the past. I prefer the first, but we had to use the second once when we were not allowed to change the process sending the files.

The advantage of the first one is that you know the file is ready when the sentinel file appears. With both lsof (I'm assuming you're treating files that aren't open by any process as ready for processing) and the timestamps, it's possible that the FTP transfer crashed in the middle, and you may end up processing half a file.

独留℉清风醉 2024-08-05 16:23:46

There are normally three approaches to this sort of problem.

  1. Provide a signal file, so that after your file is transferred, an additional file is sent to mark the transfer as complete.
  2. Add an entry to a log file within that directory to indicate that a transfer is complete (this really only works if you have a single peer updating the directory, to avoid concurrency issues).
  3. Parse the file to determine completeness. For example, does the file start with a length field, or is it obviously incomplete? Parsing an incomplete XML file, for instance, will result in a parse error due to the missing end element. Depending on your file's size and format, this can be trivial or very time-consuming.
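The third approach might look something like this in Python. Both checks are sketches under assumptions: the XML check uses the standard library parser, and the length-prefixed layout (a 4-byte big-endian length followed by exactly that many payload bytes) is a purely hypothetical format chosen for illustration, so substitute whatever your files actually use.

```python
import os
import struct
import xml.etree.ElementTree as ET


def xml_is_complete(path) -> bool:
    """Return True if the file parses as well-formed XML from start to end."""
    try:
        with open(path, "rb") as fh:
            ET.parse(fh)
    except ET.ParseError:
        # A truncated or empty document fails to parse.
        return False
    return True


def length_prefixed_is_complete(path) -> bool:
    """Check a hypothetical format: 4-byte big-endian length, then the payload."""
    with open(path, "rb") as fh:
        header = fh.read(4)
    if len(header) < 4:
        return False
    (declared,) = struct.unpack(">I", header)
    return os.path.getsize(path) == 4 + declared
```

Parsing the whole file on every poll gets expensive for large files, which is the time cost mentioned in point 3.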

lsof is a possible option, although you've already identified the Linux portability issue. If you use it, note the -F option, which formats the output for processing by other programs rather than for human reading.
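As a rough idea of what consuming `lsof -F` output could look like, here is a Python sketch. It assumes lsof is installed and on the PATH, and it relies on the field letters `p` (process ID), `c` (command name) and `a` (access mode: `r`, `w` or `u`); the function names are hypothetical, and you should check the field list in the man page of your lsof version before depending on it.

```python
from __future__ import annotations

import subprocess


def parse_lsof_fields(output: str) -> list[dict]:
    """Parse `lsof -F pca` output into one dict per process.

    Each output line is a one-letter field identifier followed by its value.
    Fields other than p, c and a are ignored.
    """
    processes: list[dict] = []
    for line in output.splitlines():
        if not line:
            continue
        field, value = line[0], line[1:]
        if field == "p":
            processes.append({"pid": int(value), "command": "", "access": set()})
        elif processes and field == "c":
            processes[-1]["command"] = value
        elif processes and field == "a":
            processes[-1]["access"].add(value)
    return processes


def writers(path: str) -> list[dict]:
    """Return processes that hold path open for writing (assumes lsof exists)."""
    # lsof exits non-zero when nothing has the file open, so no check=True here.
    result = subprocess.run(
        ["lsof", "-F", "pca", "--", path],
        capture_output=True,
        text=True,
    )
    return [p for p in parse_lsof_fields(result.stdout) if p["access"] & {"w", "u"}]
```

An empty result from `writers()` only means no process has the file open for writing right now; it does not prove the transfer finished successfully.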

EDIT: Pax identified a fourth (!) method I'd forgotten: using the fact that the file's timestamp hasn't been updated for some time.

ぶ宁プ宁ぶ 2024-08-05 16:23:46

There is a fifth method. You can also check whether the FTP session is still active. This works if every peer has its own FTP user account. As long as the user has not logged off from FTP, assume the files are not complete.
