NTFS directory with 100K entries: how much performance gain from spreading them over 100 subdirectories?
Context
We have a homegrown filesystem-backed caching library. We currently have performance problems with one installation due to a large number of entries (e.g. up to 100,000). The problem: we store all fs entries in one "cache directory". Very large directories perform poorly.
We're looking at spreading those entries over subdirectories--as git does, e.g. 100 subdirectories with ~ 1,000 entries each.
The question
I understand that smaller directory sizes will help with filesystem access.
But will "spreading into subdirectories" speed up traversing all entries, e.g. enumerating/reading all 100,000 entries? I.e. when we initialize/warm the cache from the FS store, we need to traverse all 100,000 entries (and delete old ones), which can take 10+ minutes.
Will "spreading the data" decrease this "traversal time"? Additionally, this "traversal" can/does delete stale entries (e.g. older than N days). Will "spreading the data" improve delete times?
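To make the workload concrete, the warm-up/cleanup pass is conceptually something like this (an illustrative sketch, not our actual library code; the age cutoff and counting are simplified):

```java
import java.io.File;
import java.util.concurrent.TimeUnit;

public class CacheWarmup {
    // One pass over the cache directory: delete entries older than
    // maxAgeDays, count the survivors (which would be loaded into the
    // in-memory cache index).
    static int warm(File cacheDir, long maxAgeDays) {
        long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(maxAgeDays);
        int kept = 0;
        File[] entries = cacheDir.listFiles();
        if (entries == null) return 0;   // not a directory, or I/O error
        for (File f : entries) {
            if (f.lastModified() < cutoff) {
                f.delete();              // stale: evict from the store
            } else {
                kept++;                  // fresh: keep / load into the cache
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(warm(new File(args.length > 0 ? args[0] : "."), 30));
    }
}
```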
Additional Context
-NTFS
-Windows Family OS (Server 2003, 2008)
-Java (J2EE) application.
I/we would appreciate any schooling on filesystem scalability issues.
Thanks in advance.
Will
p.s. I should comment that I have the tools and ability to test this myself, but figured I'd pick the hive mind for the theory and experience first.
4 Answers
I also believed that spreading files across subdirectories would speed up operations.
So I conducted a test: I generated files from AAAA to ZZZZ (26^4 files, about 450K) and placed them into one NTFS directory. I also placed the identical files into subdirectories from AA to ZZ (i.e. grouped the files by the first 2 letters of their names). Then I performed some tests - enumeration and random access. I rebooted the system after creation and between tests.
The flat structure showed slightly better performance than subdirectories. I believe this is because the directories are cached and NTFS indexes directory contents, so lookup is fast.
Note that full enumeration (in both cases) took about 3 minutes for the 450K files. That is significant time, but subdirectories make it even worse.
Conclusion: on NTFS in particular, it makes no sense to group files into subdirectories if any of the files may be accessed. If you have a cache, I would also test grouping the files by date or by domain, assuming that some files are accessed more frequently than others and the OS doesn't need to keep all directories in memory. However, for your number of files (under 100K), this probably wouldn't provide significant benefits either. You need to measure such specific scenarios yourself, I think.
Update: I've reduced my random-access test to only access half of the files (from AA to OO). The assumption was that this would involve the one flat directory but only half of the subdirectories (giving a bonus to the subdirectory case). Still, the flat directory performed better. So I assume that unless you have millions of files, keeping them in one flat directory on NTFS will be faster than grouping them into subdirectories.
If you never need to stat or list the cache directory, and only ever stat and open files within it by full path, it should not really matter (at least not at the 100k files level) how many files are in the directory.
Many caching frameworks and filesystem-heavy storage engines will create subdirectories based on the first character in the filenames in such scenarios, so that if you are storing a file "abcdefgh.png" in your cache, it would go into "cache/a/b/cdefgh.png" instead of just "cache/abcdefgh.png". This assumes that the distributions of the first two letters of your file names are roughly uniform across the character space.
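A minimal sketch of that sharding scheme (illustrative only; the two-level split and names are assumptions, not any specific framework's API):

```java
import java.io.File;

public class CacheSharding {
    // Map a cache key like "abcdefgh.png" to "cache/a/b/cdefgh.png",
    // assuming names are more than two characters long and roughly
    // uniformly distributed over the character space.
    static File shardedPath(File cacheRoot, String name) {
        File level1 = new File(cacheRoot, name.substring(0, 1));
        File level2 = new File(level1, name.substring(1, 2));
        return new File(level2, name.substring(2));
    }

    public static void main(String[] args) {
        System.out.println(shardedPath(new File("cache"), "abcdefgh.png"));
    }
}
```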
As you mentioned, since your primary task that involves listing or traversing the directories is in deleting outdated files, I would recommend that you create directories based on the date and/or time the file was cached, i.e. "cache/2010/12/04/22/abcdefgh.png" and, wherever you index the cache, be sure to index it by filename AND date (especially if it's in a database) so that you can quickly remove items by date from the index and remove the corresponding directory.
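A sketch of such a date-bucketed layout (illustrative; the yyyy/MM/dd/HH granularity is just one choice, and expiring a whole hour's worth of entries then becomes deleting one directory):

```java
import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DatedCachePath {
    // Place a cached file under cache/yyyy/MM/dd/HH/<name>, so that
    // "delete everything older than N days" means removing whole
    // date directories instead of scanning every entry.
    static File datedPath(File cacheRoot, String name, Date cachedAt) {
        String bucket = new SimpleDateFormat("yyyy/MM/dd/HH").format(cachedAt);
        return new File(new File(cacheRoot, bucket), name);
    }

    public static void main(String[] args) {
        System.out.println(datedPath(new File("cache"), "abcdefgh.png", new Date()));
    }
}
```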
How are you loading your cache? If you are using standard Java file system interaction, that's going to be your first bottleneck - Java is pretty bad at folder content iteration - and if you are doing checks against each file as you iterate (get the modified date, make sure the File isn't a directory, etc...) performance can take a big hit (these all involve round trips to native land). Moving to a solution based on native FindFirstFile may provide significant (like orders of magnitude) improvement. FindFirstFile returns all of the information about the file with each iteration step. Java File.listFiles() returns the list of paths. Then when you query for attributes or other meta - each call is a round trip to the file system. Horribly, horribly inefficient.
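If moving off java.io.File is an option, the Java 7+ NIO.2 walk API is one way to cut those round trips: the visitor is handed the attributes collected during directory iteration instead of needing a separate call per file. A sketch, assuming a Java 7+ runtime is available:

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;

public class FastEnumeration {
    // Count regular files using the attributes the walk already fetched --
    // no extra per-file stat call, unlike File.listFiles() + isDirectory().
    static long countFiles(Path dir) throws IOException {
        final long[] count = {0};
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) count[0]++;  // attrs came with the directory read
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countFiles(Paths.get(args.length > 0 ? args[0] : ".")));
    }
}
```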
OK - that's out of the way. Next, raw iteration of a huge directory in NTFS isn't particularly slower than an n-ary tree approach (folders and subfolders, etc...). With FAT32, this was a very big deal - but NTFS handles this sort of thing pretty well. That said, splitting into sub-folders opens up some natural parallelization opportunities that are much harder to achieve with a single folder. If you can spawn 10 or 15 threads, each hitting separate folders, then you can effectively eliminate disk latency as a contributing factor.
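A rough sketch of that per-subfolder parallelization (illustrative; the thread count and the counting work are placeholders for real cache-loading logic):

```java
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ParallelScan {
    // Scan each subdirectory on its own pool thread; subdirectories
    // partition the work naturally, which a single flat folder doesn't.
    static long countAll(File root, int threads) throws InterruptedException {
        File[] subdirs = root.listFiles();
        if (subdirs == null) return 0;
        final AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (final File sub : subdirs) {
            pool.submit(new Runnable() {
                public void run() {
                    String[] names = sub.isDirectory() ? sub.list() : null;
                    total.addAndGet(names == null ? (sub.isFile() ? 1 : 0) : names.length);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countAll(new File(args.length > 0 ? args[0] : "."), 8));
    }
}
```

Whether this actually helps depends on the disk: on a single spindle the threads may just contend for the head, while on RAID or SSD storage the parallelism can pay off.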
I would probably suggest that you start with profiling (you knew that already, of course) - and see where the bulk of the load time is coming from. You might be surprised (for example, in one of our apps that does a lot of file list processing, I was shocked to find how much time we were getting hit for when checking isDirectory() - a simple change like doing the date compare before the directory/file determination made a 30% improvement in our iteration speeds).
Something to look at is how your disk subsystem is arranged. While disks are growing in size rapidly, they are not getting much faster (in access time). Is a different disk arrangement (using more disks) or using SSD drives an option? For example, an SSD has no moving parts and can touch 100K files in 10 seconds, making the warmup unnecessary.