Windows Azure Cloud Storage - impact of a large number of files in the root directory
Sorry if I get any of the terminology wrong here, but hopefully you will get what I mean.
I am using Windows Azure Cloud Storage to store a vast quantity of small files (images, 20 KB each).
At the minute, these files are all stored in the root directory. I understand it's not a normal file system, so maybe root isn't the correct term.
I've tried to find information on the long-term effects of this plan, but with no luck, so if anyone can give me some information I'd be grateful.
Basically, am I going to run into problems if the number of files stored in this root ends up in the hundreds of thousands or millions?
Thanks,
Steven
Answers (2)
I've been in a similar situation where we were storing ~10M small files in one blob container. Accessing individual files through code was fine and there weren't any performance problems.
Where we did have problems was with managing that many files outside of code. The storage explorers I've encountered (the one that comes with VS2010 and several of the others) don't support the return-files-by-prefix API, so you can only list the first 5K, then the next 5K, and so on. You can see how this becomes a problem when you want to look at the 125,000th file in the container.
The other problem is that there is no easy way of finding out how many files are in your container (which can be important for knowing exactly how much all of that blob storage is costing you) without writing something that simply iterates over all the blobs and counts them.
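A minimal sketch of that counting step, assuming you already have some way to iterate over the container's blobs as (name, size in bytes) pairs. How that iterable is produced depends on the storage client you use; `summarize_blobs` and the sample data are hypothetical names for illustration, not part of any Azure SDK.

```python
from typing import Iterable, Tuple


def summarize_blobs(blobs: Iterable[Tuple[str, int]]) -> Tuple[int, int]:
    """Return (blob_count, total_bytes) for an iterable of (name, size) pairs.

    The iterable is consumed lazily, so it can be a generator that pages
    through a very large container without holding every entry in memory.
    """
    count = 0
    total_bytes = 0
    for _name, size in blobs:
        count += 1
        total_bytes += size
    return count, total_bytes


# Hypothetical stand-in for a real container listing: three 20 KB images.
sample = [("1.jpg", 20480), ("2.jpg", 20480), ("3.jpg", 20480)]
count, total_bytes = summarize_blobs(sample)
print(count, total_bytes)
```

Because the function only walks the iterable once, the same code works whether the listing has three entries or ten million; the cost is the time taken to page through the container.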
This was an easy problem to solve for us as our blobs had sequential numeric names, so we simply partitioned them into folders of 1K items each. Depending on how many items you've got, you can group 1K of these folders into subfolders.
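The partitioning scheme could look something like the sketch below. It assumes blob "folders" are just name prefixes separated by `/`, that IDs are non-negative integers, and that the files use a `.jpg` extension; the function name `partitioned_name` and its parameters are made up for illustration and are not our actual code.

```python
def partitioned_name(blob_id: int, extension: str = "jpg",
                     per_folder: int = 1000, levels: int = 1) -> str:
    """Build a blob name that puts `per_folder` sequential IDs in each folder.

    With levels=2, every `per_folder` folders are grouped into a parent
    folder as well, and so on for higher levels.
    """
    if blob_id < 0 or per_folder < 1 or levels < 1:
        raise ValueError("blob_id must be >= 0; per_folder and levels must be >= 1")
    parts = []
    bucket = blob_id
    for _ in range(levels):
        bucket //= per_folder      # index of the enclosing folder at this level
        parts.append(str(bucket))
    parts.reverse()                # outermost folder first
    return "/".join(parts + [f"{blob_id}.{extension}"])


print(partitioned_name(125000))             # one level of folders
print(partitioned_name(2345678, levels=2))  # folders grouped into subfolders
```

With names like these, listing by the prefix `125/` returns at most 1K blobs, which stays well under the 5K page size of a single listing request.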
http://social.msdn.microsoft.com/Forums/en-US/windowsazure/thread/d569a5bb-c4d4-4495-9e77-00bd100beaef
Short Answer: No
Medium Answer: Kind of?
Long Answer: No, but if you query for a file list it will only return 5,000 entries at a time. According to that MSDN page, you'll need to re-query for each subsequent batch of 5,000 to get a full listing.
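A rough sketch of that re-query loop, written against a hypothetical `list_page(marker)` callable rather than any real SDK call. It assumes each request returns one batch of names plus a continuation marker, with `None` meaning there is nothing left. `make_fake_lister` is an in-memory stand-in so the example runs without a storage account.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]


def iter_all_names(list_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every name by re-querying until no continuation marker remains."""
    marker: Optional[str] = None
    while True:
        names, marker = list_page(marker)
        yield from names
        if marker is None:
            return


def make_fake_lister(all_names: List[str],
                     page_size: int = 5000) -> Callable[[Optional[str]], Page]:
    """Hypothetical in-memory lister that hands out `page_size` names per call."""
    def list_page(marker: Optional[str]) -> Page:
        start = int(marker) if marker is not None else 0
        end = start + page_size
        next_marker = str(end) if end < len(all_names) else None
        return all_names[start:end], next_marker
    return list_page


names = [f"{i}.jpg" for i in range(12000)]
fetched = list(iter_all_names(make_fake_lister(names)))
print(len(fetched) == len(names))
```

To use the loop against a real container you would replace `make_fake_lister` with a function that wraps whatever listing call your client library provides.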
Edit: Root works fine for describing it. 99.99% of people will grok what you're trying to say.