Performance implications of storing 600,000 images in the same folder (NTFS)
I need to store about 600,000 images on a web server that uses NTFS. Would I be better off storing the images in subfolders, in chunks of 20,000 images each? (Windows Server 2008)
I'm concerned about incurring operating system overhead during image retrieval.
Comments (3)
Go for it. As long as you have an external index and a direct file path to each file, so that you never have to list the contents of the directory, you are OK.
I have a folder that is over 500 GB in size and contains over 4 million folders (which in turn contain more folders and files). I have somewhere on the order of 10 million files in total.
If I accidentally open this folder in Windows Explorer, it gets stuck at 100% CPU usage (for one core) until I kill the process. But as long as you refer to a file or folder directly, performance is great (meaning I can access any of those 10 million files with no noticeable overhead).
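To illustrate what "an external index and a direct file path" could look like, here is a minimal sketch. It assumes the index is an in-memory mapping from an image ID to a full path; the function name `read_image` and the ID format are hypothetical, and in practice the index might live in a database instead.

```python
from pathlib import Path
from typing import Mapping


def read_image(index: Mapping[str, Path], image_id: str) -> bytes:
    """Read a file through its known path; the directory is never listed."""
    path = index[image_id]    # lookup in the external index (assumed to exist)
    return path.read_bytes()  # opens the file directly by name
```

The point is only that the code goes from ID to path to file without ever enumerating the folder.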
NTFS does index directories (entries are kept in a B-tree keyed by name), so it should be all right at the application level.
I mean that opening files by name, deleting, renaming, etc., programmatically, should work nicely.
But the problem is always tools. Tools such as Windows Explorer, your backup tool, etc. are likely to perform badly, or at least be close to unusable, with large numbers of files per directory.
Anything that does a directory scan is likely to be quite slow, but worse, some of these tools have poor algorithms that don't scale to even modest (10k+) numbers of files per directory.
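If your own code ever does have to look inside a huge directory, one way to limit the damage is to iterate lazily rather than build the full listing. A rough sketch, using Python's standard `os.scandir`; the helper name `first_entries` and the default limit are made up for illustration:

```python
import os
from itertools import islice
from typing import List


def first_entries(directory: str, limit: int = 100) -> List[str]:
    """Return at most `limit` entry names without materializing the whole listing."""
    with os.scandir(directory) as entries:  # lazy iterator over directory entries
        return [entry.name for entry in islice(entries, limit)]
```

This does not make a full scan cheap; it only avoids reading and holding every entry when a few are enough.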
An NTFS folder stores an index with links to all of its contents. With a large number of images, that index is going to grow a lot and impact your performance negatively. So, yes, on that argument alone you are better off storing the images in chunks in subfolders. Fragmentation inside the index is a pain.
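A minimal sketch of the chunking idea, assuming each image has a sequential integer ID (an assumption, the question does not say how images are named). The names `CHUNK_SIZE` and `sharded_path`, the zero-padded folder names, and the `.jpg` extension are all hypothetical:

```python
from pathlib import Path

CHUNK_SIZE = 20_000  # images per subfolder, as proposed in the question


def sharded_path(root: Path, image_id: int, ext: str = ".jpg") -> Path:
    """Map an integer image ID to root/<bucket>/<id><ext>."""
    bucket = image_id // CHUNK_SIZE  # IDs 0..19999 -> bucket 0, 20000..39999 -> bucket 1, ...
    return root / f"{bucket:04d}" / f"{image_id}{ext}"
```

With 600,000 images this gives 30 subfolders (`0000` through `0029`), and the path for any image can still be computed directly from its ID without listing anything.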