Searching for files in NTFS
We have a fairly large disk array with roughly 2-3 million XML files on it. The disk is formatted with NTFS, and we would like to search the filesystem using wildcards. A typical search query would be something like *SomePartOfTheFilename*.
We are using .NET and are finding that DirectoryInfo is slow:
DirectoryInfo directoryInfo = new DirectoryInfo(directory);
List<FileInfo> fileInfos = directoryInfo.GetFiles(searchString, SearchOption.AllDirectories).ToList();
Walking the tree ourselves with loops and recursion is also very slow.
Is there a lower-level API call we can use to search the NTFS index directly?
Running dir *SomePartOfTheFilename* /s from the command line is almost instant. Is there something there that we can leverage?
Comments (2)
I'm not sure whether you can use the Indexing Service, but it may be handy for what you are trying to do:
http://msdn.microsoft.com/en-us/library/ee805985%28VS.85%29.aspx
http://www.codeproject.com/KB/database/Indexing_Service_HOW-TO.aspx
It lets you build complex queries against an index of the files on a computer. Note that this index is a catalog maintained by the service itself, not the NTFS on-disk structures.
You can use the MFT directly (see: NTFS Wiki). That is the data table where all information about files is stored. You can see the structure of the MFT for example here or here. The Windows API ends up reading the same table, so alternatively you can try to speed up searches by making sure the table is paged into memory before you search (a simple read of e.g. c:\$Mft is enough).
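Before going as low as the MFT, it may be worth checking whether the cost comes from how the results are collected rather than from NTFS itself. GetFiles with SearchOption.AllDirectories builds the whole result array before returning anything, and a single inaccessible folder aborts the entire call. The sketch below is only an illustration under assumptions: the class name FileSearch and the method FindByNamePart are hypothetical, and it uses nothing beyond the standard System.IO calls. It walks the tree with an explicit stack, yields matches as they are found, and skips directories it cannot read. It has not been measured against the disk array in the question, so treat it as something to benchmark, not as a guaranteed fix.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original question's code.
public static class FileSearch
{
    // Lazily yields the full paths of files under `root` whose names
    // match the wildcard *namePart*.
    public static IEnumerable<string> FindByNamePart(string root, string namePart)
    {
        string pattern = "*" + namePart + "*";

        // Explicit stack instead of recursion, so deep trees cannot
        // overflow the call stack.
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            string[] files;
            string[] subDirs;

            try
            {
                // Only one directory's entries are materialized at a time.
                files = Directory.GetFiles(dir, pattern);
                subDirs = Directory.GetDirectories(dir);
            }
            catch (Exception ex) when (ex is UnauthorizedAccessException || ex is IOException)
            {
                // Skip folders that are unreadable or vanished mid-scan.
                continue;
            }

            foreach (string sub in subDirs)
            {
                pending.Push(sub);
            }

            // Yielding happens outside the try block because C# does not
            // allow `yield return` inside a try that has a catch clause.
            foreach (string file in files)
            {
                yield return file;
            }
        }
    }
}
```

A caller would consume it with something like foreach (string path in FileSearch.FindByNamePart(directory, "SomePartOfTheFilename")) { ... }, so the first hits can be processed while the rest of the tree is still being scanned. Returning plain path strings instead of FileInfo objects is a deliberate choice here, since it avoids allocating one object per file when only the name is needed.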