How to deal with Windows' ReadDirectoryChangesW() and its mixed long/short filename output?
I am developing a piece of C code that uses ReadDirectoryChangesW() to monitor changes under a directory in Windows. I have read the related MSDN entries for ReadDirectoryChangesW() and the FILE_NOTIFY_INFORMATION structure, as well as several other pieces of documentation. At this point I have managed to monitor multiple directories with no apparent problems in the monitoring itself. The problem is that the filenames put in the FILE_NOTIFY_INFORMATION structure by this function are not canonical.
According to MSDN they can be in either long or short form. I have found several posts which suggest caching both short and long pathnames to handle this case. Unfortunately, according to my own testing on a Windows 7 system this is not sufficient to eliminate the issue, because there are not just two alternatives for each filename. The problem is that in a pathname EACH COMPONENT can be in either long or short form. The following pathnames could all refer to the same file:
c:\PROGRA~1\MYPROG~1\MYDATA~1.TXT
c:\PROGRA~1\MYPROG~1\MyDataFile.txt
c:\PROGRA~1\MyProgram\MYDATA~1.TXT
c:\PROGRA~1\MyProgram\MyDataFile.txt
c:\Program Files\MYPROG~1\MYDATA~1.TXT
...
and as far as I can tell from my testing using cmd.exe they are all perfectly acceptable. Essentially, the number of valid pathnames for each file rises exponentially with the number of components in its pathname.
Unfortunately, ReadDirectoryChangesW() seems to fill in its output buffer with the filenames as provided to the system call that causes each operation. For example, if you use cmd.exe commands to create, rename, or delete files, the FILE_NOTIFY_INFORMATION will contain the filenames as specified at the command line.
Now, in most cases I could use GetLongPathName() and friends to get a unique path for my use. Unfortunately that cannot be done when deleting files - by the time I get the notification, the file is already gone and the Get*PathName() functions will not work.
At the moment I am thinking about using more extensive caching to determine which alternative pathnames applications use for each file. That would handle every case except the one where someone deletes a file out of the blue using a mixed pathname I have not seen before. For that case I am considering some creative data mining of the parent directory modification events, falling back to checking the actual directory.
Any suggestions for an easier way to do this?
PS1: While Change Journals would deal with this effectively (I hope), I do not believe I can use them, due to their ties to NTFS and my application's lack of administrator privileges. I'd rather not go there unless I am absolutely forced to.
PS2: Please, keep in mind that I code mainly on Unix, so be gentle...
Comments (1)
You don't need to cache every combination. It is enough to cache each subpath so that you can convert it to its long form. For example, store this:
C:\PROGRA~1 => c:\Program Files
c:\Program Files\MYPROG~1 => c:\Program Files\MyProgram
c:\Program Files\MyProgram\MYDATA~1.TXT => c:\Program Files\MyProgram\MyDataFile.txt
c:\Program Files\MyProgram\MYDATA~2.TXT => c:\Program Files\MyProgram\MyDataFile2.txt
Now if you get a notification of c:\PROGRA~1\MYPROG~1\MYDATA~1.TXT, split it at every \ and look up the long form of each part. Don't forget that MyDataFile.txt and MYDATAFILE.TXT also point to the same file, so compare case-insensitively or convert everything to uppercase.
And if c:\PROGRA~1\MYPROG~1\MYDATA~1.TXT is deleted, you might still use GetLongPathName() on c:\PROGRA~1\MYPROG~1.
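The lookup described above could be sketched roughly as follows. This is only a minimal illustration in portable C, not a tested monitor: the names alias, cache, lookup and to_long_path are hypothetical, the table is hard-coded with the sample entries from the listing, and the comparison is ASCII-only. It also assumes narrow-character, drive-letter paths in which the relative name from FILE_NOTIFY_INFORMATION has already been joined with the monitored directory; real code would work on wide strings.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* One cache entry (hypothetical layout): a subpath whose parent is already
   in long form, mapped to the fully long form of that subpath. */
struct alias {
    const char *subpath;
    const char *long_form;
};

/* Sample entries from the listing above. A real monitor would fill this
   table while the files still exist, e.g. from GetLongPathName() results. */
static const struct alias cache[] = {
    { "C:\\PROGRA~1", "c:\\Program Files" },
    { "c:\\Program Files\\MYPROG~1", "c:\\Program Files\\MyProgram" },
    { "c:\\Program Files\\MyProgram\\MYDATA~1.TXT",
      "c:\\Program Files\\MyProgram\\MyDataFile.txt" },
    { "c:\\Program Files\\MyProgram\\MYDATA~2.TXT",
      "c:\\Program Files\\MyProgram\\MyDataFile2.txt" },
};

/* ASCII-only case-insensitive equality (an assumption; Windows applies its
   own case-folding rules to file names). */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Return the cached long form of a subpath, or NULL if it is unknown.
   Matching the long form as well normalizes differences in case. */
static const char *lookup(const char *subpath)
{
    for (size_t i = 0; i < sizeof cache / sizeof cache[0]; i++) {
        if (equals_ignore_case(cache[i].subpath, subpath) ||
            equals_ignore_case(cache[i].long_form, subpath))
            return cache[i].long_form;
    }
    return NULL;
}

/* Rebuild `path` one component at a time, replacing the prefix built so far
   with its cached long form whenever the cache knows it. Unknown components
   are copied unchanged. Returns 0 on success, -1 if `out` is too small. */
int to_long_path(const char *path, char *out, size_t out_size)
{
    size_t len = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    while (*path != '\0') {
        size_t n = strcspn(path, "\\"); /* length of the next component */
        const char *known;

        if (len + (len ? 1 : 0) + n >= out_size)
            return -1;
        if (len)
            out[len++] = '\\';
        memcpy(out + len, path, n);
        len += n;
        out[len] = '\0';

        known = lookup(out);
        if (known != NULL) {
            size_t k = strlen(known);
            if (k >= out_size)
                return -1;
            memcpy(out, known, k + 1);
            len = k;
        }

        path += n;
        if (*path == '\\')
            path++;
    }
    return 0;
}
```

With the sample table, passing c:\PROGRA~1\MYPROG~1\MYDATA~1.TXT to to_long_path() yields c:\Program Files\MyProgram\MyDataFile.txt. Because the lookup never touches the file system, it keeps working after the file has been deleted. For a deleted file whose last component is not in the table, the parent directory could still be resolved with GetLongPathName() as described above.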