How can I efficiently find where a file diverges from an earlier version of itself?

Posted 2024-09-28 12:43:24 · 508 characters · 4 views · 0 comments

I have a file that is constantly appended to (by a process beyond my control), and I capture that file every x seconds. I want to extract the new contents of the file (whatever was added since my previous capture) and work with them. Unfortunately, the file has nothing in it to indicate when it was last appended to, and I can't write to it, so my only option is to store what I already know is in the file and compare that to the new version I have.

What I need to know is how best to do this. I'm using PHP, and I figured the simplest solution is to store the previous contents and then use explode() to work out what comes after them. That is (quite obviously) a terrible solution: once the file gets large (1GB+), it's going to be hell to process.

One idea I had is to store the position of the final character and work from there. For example, if the last character was the 100th, I'd start from the 100th character on the next run. I'm not sure how I could do this, though, or whether it's even possible with PHP.

So my question is: what is the correct method for doing this, and how can I do it with PHP (if possible)? Function names or a general idea are fine. I'm good with the implementation, I'm just not sure of the theory behind it.

Comments (1)

叶落知秋 2024-10-05 12:43:24

Assuming the file is simply appended to, it would intuitively be easiest to store the previous file size and use fseek() or the offset parameter of file_get_contents to move to where the old version of the file ended. I.e.:

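// Offset where the previous capture ended (0 on the first run).
$old_position = (int)file_get_contents("last_position.temp");

// The third argument is a stream context, so pass null rather than "r".
// Offsets are zero-based, so starting at the old size yields only the new bytes.
$new_entry = file_get_contents("thebigfile.txt", false, null, $old_position);

// Advance by what was actually read, so data appended mid-run is not skipped.
file_put_contents("last_position.temp", $old_position + strlen($new_entry));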

To get this rolling for the first time, you'll want to put 0 in last_position.temp so there are no errors or hard feelings.
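
For the fseek() route mentioned above, here is a minimal sketch rather than a definitive implementation. It assumes the file is only ever appended to (never truncated or rotated). The function name read_appended() and the 8192-byte chunk size are made up for illustration and are not part of the original answer. Reading in chunks means a real consumer could process each piece as it arrives instead of holding a 1GB+ string in memory.

```php
<?php
// Hypothetical helper (not from the original answer): returns the bytes
// appended to $path since the offset stored in $stateFile.
function read_appended(string $path, string $stateFile, int $chunkSize = 8192): string
{
    // Previously stored offset; a missing state file means "start at byte 0".
    $offset = is_file($stateFile) ? (int) file_get_contents($stateFile) : 0;

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Jump straight to where the previous capture ended.
    fseek($handle, $offset);

    $new = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        // Collected here for simplicity; each chunk could be processed directly.
        $new .= $chunk;
    }

    // ftell() reports how far we actually read, which becomes the next offset.
    $offset = ftell($handle);
    fclose($handle);
    file_put_contents($stateFile, (string) $offset);

    return $new;
}

// Usage, with the file names from the answer:
// $new_entry = read_appended("thebigfile.txt", "last_position.temp");
```

Because the stored offset comes from ftell() after reading, anything appended while the read is in progress is either returned now or picked up on the next run, never skipped.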

Hope this helps :)
