Reading changes in a file in real time using .NET
I have a .csv file that is frequently updated (about 20 to 30 times per minute). I want to insert the newly added lines into a database as soon as they are written to the file.
The FileSystemWatcher class listens to the file system change notifications and can raise an event whenever there is a change in a specified file. The problem is that the FileSystemWatcher cannot determine exactly which lines were added or removed (as far as I know).
One way to read those lines is to save the line count after each change and, on the next change, read only the lines beyond the previously saved count. However, I am looking for a cleaner (perhaps more elegant) solution.
Comments (6)
You're right about the FileSystemWatcher. You can listen for created, modified, deleted, etc. events but you don't get deeper than the file that raised them.
Do you have control over the file itself? You could change the model slightly to use the file like a buffer. Instead of one file, have two. One is the staging file, the other is the sum of all processed output. Read all lines from your "buffer" file, process them, then append them to the end of the other file, which holds every line processed so far. Then delete the lines you processed. This way, everything in your buffer file is pending processing. The catch is that this only works if the producer does nothing but append; if it also deletes lines, it won't work.
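A minimal sketch of that two-file idea, assuming the producer tolerates the staging file being locked for a moment and retries its write. The names StagingDrain, Drain, stagingPath, archivePath and process are hypothetical and only there for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of any library.
public static class StagingDrain
{
    // Reads every line currently in the staging file, hands it to `process`,
    // appends it to the archive file, then empties the staging file.
    // Returns the number of lines moved.
    public static int Drain(string stagingPath, string archivePath, Action<string> process)
    {
        int count = 0;

        // FileShare.None keeps the producer out while we drain.
        // Assumption: the producer retries when it cannot open the file.
        using (var staging = new FileStream(stagingPath, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
        using (var reader = new StreamReader(staging))
        using (var archive = new StreamWriter(archivePath, append: true))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                process(line);           // e.g. insert the row into the database
                archive.WriteLine(line); // keep a record of everything processed
                count++;
            }

            archive.Flush();
            staging.SetLength(0);        // "delete the lines you processed"
        }

        return count;
    }
}
```

Because the staging file is emptied only after every line has been processed and archived, an exception part-way through leaves the staging file untouched. The lines handled before the failure would then be processed again on the next run, so the database insert should tolerate repeats.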
Off the top of my head, you could store the last known file size. Check against the file size, and when it changes, open a reader.
Then seek the reader to your last file size, and start reading from there.
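A rough sketch of that idea, assuming the file is only ever appended to and is UTF-8 without a byte-order mark. AppendedLineReader and ReadNewLines are hypothetical names. Instead of remembering the raw file size, the sketch remembers the offset just past the last complete line, so a row the producer has only half written is picked up on the next call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: remembers how far into the file we have already read.
public sealed class AppendedLineReader
{
    private readonly string _path;
    private long _offset; // byte offset just past the last complete line we returned

    public AppendedLineReader(string path)
    {
        _path = path;
    }

    // Returns the complete lines appended since the previous call.
    public IList<string> ReadNewLines()
    {
        var lines = new List<string>();

        // FileShare.ReadWrite lets the producer keep writing while we read.
        using (var stream = new FileStream(_path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long length = stream.Length;
            if (length < _offset) _offset = 0;   // file shrank: start over
            if (length == _offset) return lines; // nothing new

            // Seek to where we stopped last time and read everything after it.
            stream.Seek(_offset, SeekOrigin.Begin);
            var buffer = new byte[length - _offset];
            int total = 0;
            while (total < buffer.Length)
            {
                int n = stream.Read(buffer, total, buffer.Length - total);
                if (n == 0) break;
                total += n;
            }
            if (total == 0) return lines;

            // Only consume up to the last newline; a trailing partial line
            // stays in the file for the next call.
            int lastNewline = Array.LastIndexOf(buffer, (byte)'\n', total - 1);
            if (lastNewline < 0) return lines;

            string text = Encoding.UTF8.GetString(buffer, 0, lastNewline + 1);
            _offset += lastNewline + 1;

            // Blank lines are dropped here; keep them if they matter for your CSV.
            foreach (string raw in text.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
            {
                string line = raw.TrimEnd('\r');
                if (line.Length > 0) lines.Add(line);
            }
        }

        return lines;
    }
}
```

Calling ReadNewLines from a timer or from a FileSystemWatcher Changed handler would then yield only the rows that still need to go into the database. If lines can also be removed from the file, the stored offset no longer means anything and this approach breaks down.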
I would keep the current text in memory if it is small enough and then use a diff algorithm to compare the new text with the previous text. This library, http://www.mathertel.de/Diff/, will tell you not only that something changed but also what changed. You can then insert the changed data into the db.
I think you should use NTFS Change Journal or similar:
You can find a description on TechNet. You will need to use PInvoke in .NET.
Right, the FileSystemWatcher doesn't know anything about your file's contents. It'll tell you if it changed, etc. but not what changed.
Are you only adding to the file? It was a little unclear from the post as to whether lines were added or could also be removed. Assuming they are appended, the solution is pretty straightforward, otherwise you'll be doing some comparisons.
I've written something very similar. I used the FileSystemWatcher to get notifications about changes. I then used a FileStream to read the data (keeping track of my last position within the file and seeking to that before reading the new data). Then I add the read data to a buffer which automatically extracts complete lines and then outputs them to the UI.
Note: "this.MoreData(..)" is an event, the listener of which adds to the aforementioned buffer, and handles the complete line extraction.
Note: As has already been mentioned, this will only work if the modifications are always additions to the file. Any deletions will cause problems.
Hope this helps.
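The approach described in this answer could look roughly like the sketch below. This is not the answerer's original code: FileTail, LineBuffer, ReadAppended and LineCompleted are hypothetical names, MoreData is modelled as a plain Action<string> event, and the file is assumed to be UTF-8 and append-only.

```csharp
using System;
using System.IO;
using System.Text;

// Collects raw chunks and raises LineCompleted once per complete line.
public sealed class LineBuffer
{
    private readonly StringBuilder _pending = new StringBuilder();
    public event Action<string> LineCompleted;

    public void Append(string chunk)
    {
        _pending.Append(chunk);
        string text = _pending.ToString();
        int start = 0, newline;
        while ((newline = text.IndexOf('\n', start)) >= 0)
        {
            LineCompleted?.Invoke(text.Substring(start, newline - start).TrimEnd('\r'));
            start = newline + 1;
        }
        // Keep whatever follows the last newline for the next chunk.
        _pending.Clear();
        _pending.Append(text, start, text.Length - start);
    }
}

// Watches one file and raises MoreData with the text appended since the last read.
public sealed class FileTail : IDisposable
{
    private readonly string _path;
    private readonly FileSystemWatcher _watcher;
    private readonly Decoder _decoder = Encoding.UTF8.GetDecoder();
    private readonly object _gate = new object();
    private long _position; // last position read within the file

    public event Action<string> MoreData;

    public FileTail(string path)
    {
        _path = Path.GetFullPath(path);
        _watcher = new FileSystemWatcher(Path.GetDirectoryName(_path), Path.GetFileName(_path));
        _watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size;
        _watcher.Changed += (sender, e) => ReadAppended();
    }

    public void Start()
    {
        ReadAppended();                       // pick up what is already in the file
        _watcher.EnableRaisingEvents = true;  // then wait for change notifications
    }

    public void ReadAppended()
    {
        lock (_gate) // Changed events can arrive on several threads
        {
            using (var stream = new FileStream(_path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                if (stream.Length < _position) { _position = 0; _decoder.Reset(); }
                stream.Seek(_position, SeekOrigin.Begin);

                var bytes = new byte[stream.Length - _position];
                int read = 0, n;
                while (read < bytes.Length && (n = stream.Read(bytes, read, bytes.Length - read)) > 0)
                    read += n;
                if (read == 0) return;
                _position += read;

                // The stateful decoder copes with a multi-byte character split across reads.
                var chars = new char[_decoder.GetCharCount(bytes, 0, read)];
                int count = _decoder.GetChars(bytes, 0, read, chars, 0);
                MoreData?.Invoke(new string(chars, 0, count));
            }
        }
    }

    public void Dispose()
    {
        _watcher.Dispose();
    }
}
```

Wiring it up would be a matter of subscribing the buffer's Append method to MoreData, subscribing the database insert to LineCompleted, and calling Start. FileSystemWatcher may raise several Changed events for a single write, which is harmless here because a read that finds nothing new simply returns.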