How can I detect changes in a directory during program execution?
I am making a protocol, client and server which provide file transfer functionality similar to FTP (among other features). One difference between my protocol and FTP is that I would like to store a copy of the remote server's directory structure in a local cache. The server will only be running on Windows (written in C++) so any applicable Win32 API calls would be appreciated (if any). When initially connected, the client requests the immediate children (both files and directories, just like "ls" or "dir" with no options), then when a user navigates into a directory, this step repeats with the new parent like you might expect.
Of course, most of the time, if the same directory of a given server is requested twice by a client, the directory's contents will be the same. Therefore I would like to cache the results of each directory listing on the client. I would like a simple way of implementing this, but it would need to take into account expiring cache entries because of file/directory access and modification time and name changes, which is the tricky part. I would ideally like something which would enable almost instant directory listings by the client, with something like a hash which takes into account not only file contents, but also changes in subdirectories' contents' filenames, data, modification and access dates, etc.
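As a rough illustration of the client side of this, here is a minimal sketch of a per-directory cache keyed by remote path. The names `CachedListing` and `ListingCache` are hypothetical, and the token is assumed to be an opaque 64-bit value issued by the server; nothing here is tied to a particular protocol.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical cache entry: one per remote directory.
struct CachedListing {
    std::uint64_t token;                // opaque change token from the server (assumed 64-bit)
    std::vector<std::string> children;  // names of the immediate children
};

// Hypothetical client-side cache, keyed by remote directory path.
class ListingCache {
public:
    // Returns the cached listing for a path, or nullptr if it was never listed.
    const CachedListing* lookup(const std::string& path) const {
        auto it = entries_.find(path);
        return it == entries_.end() ? nullptr : &it->second;
    }

    // Called when the server sends a fresh listing: replaces any older entry.
    void store(const std::string& path, CachedListing listing) {
        entries_[path] = std::move(listing);
    }

private:
    std::unordered_map<std::string, CachedListing> entries_;
};
```

The client would send the cached `token` along with a listing request and keep showing `children` if the server reports no change; otherwise it calls `store` with the new listing.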
This is NOT something that could completely rely on FileSystemWatcher (or similar) objects because it would need to maintain this cache even if the program is only run occasionally. Of course these would be nice to help maintain the cache, but that's only part of the problem.
My best(?) idea so far is using FindFirstFile() and FindNextFile(), and sorting (somehow), concatenating and hashing values found in the WIN32_FIND_DATA structs (with file contents maybe), and using that as a token for expiration (just to indicate change in any of these fields). Then I would have one of these tokens for each directory. When a directory is requested, the server would hash everything and compare that to the cached hash provided by the client, and if it's different, return the normal data, otherwise an HTTP 304 equivalent. Is there a less elaborate way of doing something like this? Does "directory last modified date" take into account every one of its subdirectories' files' modification dates under all circumstances? I'm sure that the built-in Windows indexing service has something like this but ideally I wouldn't need to rely on it.
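To make the token idea concrete, here is a minimal sketch under some assumptions. `DirEntry` is a hypothetical, platform-neutral stand-in for the fields one would copy out of `WIN32_FIND_DATA` (name, attributes, size, last-write time); the actual `FindFirstFile()`/`FindNextFile()` loop is left out so the example stays portable. FNV-1a is used only as a placeholder hash and is not collision-resistant, and `directory_token` and `answer_listing_request` are made-up names. File contents are not hashed here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical view of one directory entry (e.g. filled from WIN32_FIND_DATA).
struct DirEntry {
    std::string name;
    std::uint32_t attributes;
    std::uint64_t size;
    std::uint64_t last_write;  // e.g. a FILETIME packed into 64 bits
};

// 64-bit FNV-1a over a byte range, updating the running hash h.
inline void fnv1a(std::uint64_t& h, const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;  // FNV prime
    }
}

// Feed an integer as 8 bytes in a fixed (little-endian) order.
inline void fnv1a_u64(std::uint64_t& h, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        unsigned char b = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
        fnv1a(h, &b, 1);
    }
}

// Sort by name so enumeration order does not matter, then hash every field.
std::uint64_t directory_token(std::vector<DirEntry> entries) {
    std::sort(entries.begin(), entries.end(),
              [](const DirEntry& a, const DirEntry& b) { return a.name < b.name; });
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (const DirEntry& e : entries) {
        fnv1a_u64(h, e.name.size());  // length prefix keeps field boundaries unambiguous
        fnv1a(h, e.name.data(), e.name.size());
        fnv1a_u64(h, e.attributes);
        fnv1a_u64(h, e.size);
        fnv1a_u64(h, e.last_write);
    }
    return h;
}

enum class ListingStatus { NotModified, FullListing };

// Server side: compare the client's cached token with the current one.
ListingStatus answer_listing_request(const std::vector<DirEntry>& current,
                                     std::uint64_t client_token) {
    return directory_token(current) == client_token ? ListingStatus::NotModified
                                                    : ListingStatus::FullListing;
}
```

`NotModified` plays the role of the "HTTP 304 equivalent". Note that the server still enumerates the directory on every request in this sketch; it only saves sending the listing over the wire.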
Because this service is for file sharing, something involving hashes would be especially nice so that I could automatically and efficiently find other people who are sharing a given file, but that's less of a concern than hosing the disk during the hash calculation.
I'm wondering what others who are more experienced than I am with programming would do to solve this problem (rsync and subversion have solved similar problems but not identical).
Comments (1)
You're asking a lot of a File System Implementation of Very Little Brain (with apologies to A. A. Milne).
This is actually well-trodden ground and you'd do well to look at the existing literature on distributed filesystems. AFS comes to mind as an example of a very well studied approach.
I doubt you'll be able to come up with something useful and accurate without doing some serious homework. Put another way, 'twould be folly to ignore all the prior art.