SQL Server FILESTREAM or a traditional file server?
I am designing a system that will have 10 million+ users, each with one photo of about 1~2 MB.
We are going to deploy both the database and the web app on Microsoft Azure.
I am wondering how I should store the photos. There are currently two options:
1. Store all photos using SQL Server FILESTREAM
2. Use a file server
I have no experience with BLOB data at this scale using FILESTREAM.
Can anybody give me any suggestions? What are the pros and cons?
Input from anyone with experience storing large numbers of photos on Microsoft Azure would be really appreciated!
Thanks,
Ryan.
Comments (5)
I vote for neither. Use Windows Azure Blob storage. Simple REST API, $0.15/GB/month. You can even serve the images directly from there, if you make them public (like <img src="http://myaccount.blob.core.windows.net/container/image.jpg" />), meaning you don't have to funnel them through your web app.
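As a rough illustration of serving images straight from Blob storage, here is a minimal Python sketch that builds the public URL for a blob, following the URL pattern shown in the answer above. The function names are hypothetical, the use of `https` is an assumption (the example above uses `http`), and the URL only works if the container has actually been made public.

```python
from html import escape
from urllib.parse import quote


def public_blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the public URL of a blob, assuming a publicly readable container.

    The host pattern mirrors the example in the answer above
    (myaccount.blob.core.windows.net); nothing here contacts Azure.
    """
    # Percent-encode the blob name but keep "/" so virtual folders survive.
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name, safe='/')}"
    )


def img_tag(account: str, container: str, blob_name: str) -> str:
    """Render an <img> tag pointing directly at the blob, bypassing the web app."""
    url = public_blob_url(account, container, blob_name)
    return f'<img src="{escape(url, quote=True)}" />'
```

The web app only has to emit the tag; the browser fetches the image bytes from the storage account directly.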
A database is almost always a horrible choice for any large-scale binary storage needs. Databases are best kept for relational data only; instead of storing the binaries there, keep references in your database to the actual storage location. There are a few factors you should consider:
In other words, there is no reason not to go with Blob storage. It is simple to use, cost-effective, and will scale to any needs.
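To put "cost effective" in perspective, here is a back-of-the-envelope sketch in Python using the numbers from this thread: 10 million users, 1~2 MB per photo, and the $0.15/GB/month price quoted in another answer. That price is simply the figure mentioned here and may not match current Azure pricing. The sketch assumes 1 GB = 1000 MB and ignores transaction and bandwidth charges.

```python
def monthly_storage_cost(users: int, mb_per_photo: float, usd_per_gb_month: float) -> float:
    """Estimate monthly storage cost for one photo per user.

    Assumes decimal units (1 GB = 1000 MB) and counts storage only.
    """
    total_gb = users * mb_per_photo / 1000
    return total_gb * usd_per_gb_month


USERS = 10_000_000
PRICE = 0.15  # USD per GB per month, as quoted in this thread (assumption)

low = monthly_storage_cost(USERS, 1, PRICE)   # 10,000 GB of photos
high = monthly_storage_cost(USERS, 2, PRICE)  # 20,000 GB of photos
print(f"Estimated storage cost: ${low:,.0f} to ${high:,.0f} per month")
```

Under these assumptions the photos add up to roughly 10 to 20 TB, or about $1,500 to $3,000 per month for storage alone.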
I can't speak to anything Azure-related, but for my money the biggest advantage of using FILESTREAM is that the data gets backed up as part of the normal SQL Server backup process. The size of the data you are talking about also suggests that FILESTREAM may be a good choice.
I've worked on an SCM system with an RDBMS back end, and one of our big decisions was whether to store the file deltas on the file system or inside the DB itself. Because it was cross-RDBMS we had to cook up a generic, non-FILESTREAM way of doing it, but the ability to do a single-shot backup sold us.
FILESTREAM is a horrible option for storing images. I'm surprised MS ever promoted it.
We're currently using it for the images on our website, mainly user-generated images and any CMS-related content that admins create. The decision to use FILESTREAM was made before I started. The biggest issue is serving the images. You had better have a CDN sitting in front; if not, plan on your system coming to a screeching halt. Of course, most sites have a CDN, but you don't want to be at the mercy of that service: if it goes down, your system will get overloaded. The amount of stress put on your SQL Server is the main problem here.
As for ease of backup, the tradeoff is that your DB is MUCH, MUCH LARGER and, therefore, the backup takes longer, potentially much longer, and the system runs slower while the backup is in progress. Not to mention that moving backups around takes longer (e.g., restoring prod data in a dev environment or on local machines for dev purposes). Don't use this as a deciding factor.
Most cloud services provide automatic redundancy for any files that you store on their systems (e.g., AWS's S3 and Azure's Blob storage). If you're on premises, just make sure you use a shared location for the images and make sure that location is backed up.
I think the best option is to set it up so that each image (and other UGC file types too) has an entry in your DB with a path to that file. Going one step further, separate the root path into a config setting and store only the remaining path with the entry. For example, the root path in config might be a base URL, a shared drive or virtual directory, or a blank entry. Then your entry might have "/files/images/image.jpg". This way, if you move your file store, you only need to update the root config. I would also suggest creating a FileStoreProvider interface (Singleton) that can be used for managing (saving, deleting, updating) these files. This way, if you switch between AWS, Azure, or on premises, you can just create a new Provider.
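The provider idea can be sketched roughly as follows in Python. This is only an illustration under assumptions: the method names (`save`, `delete`, `resolve`) and the `LocalFileStoreProvider` class are hypothetical, "update" is treated as overwriting via `save`, and the singleton aspect is left out (a single shared instance could be created at application startup). The root comes from configuration and only the relative path is stored in the DB.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class FileStoreProvider(ABC):
    """Hypothetical storage interface; one implementation per backend."""

    @abstractmethod
    def save(self, relative_path: str, data: bytes) -> None:
        """Create or overwrite the file stored under relative_path."""

    @abstractmethod
    def delete(self, relative_path: str) -> None:
        """Remove the file stored under relative_path, if it exists."""

    @abstractmethod
    def resolve(self, relative_path: str) -> str:
        """Combine the configured root with the path stored in the DB."""


class LocalFileStoreProvider(FileStoreProvider):
    """On-premises variant: the root is a shared drive or local directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)  # comes from config, not from the DB

    def _full_path(self, relative_path: str) -> Path:
        # DB entries look like "/files/images/image.jpg"; drop the leading
        # slash so the path is joined under the configured root.
        return self.root / relative_path.lstrip("/")

    def save(self, relative_path: str, data: bytes) -> None:
        target = self._full_path(relative_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def delete(self, relative_path: str) -> None:
        self._full_path(relative_path).unlink(missing_ok=True)  # Python 3.8+

    def resolve(self, relative_path: str) -> str:
        return str(self._full_path(relative_path))
```

Moving the file store then means changing the root passed to the provider, or swapping in another `FileStoreProvider` subclass for a cloud backend, while the paths stored in the DB stay the same.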
I have a client-server DB in which I manage many files (doc, txt, pdf, ...), and all of them go into a FILESTREAM BLOB. Customers have 50+ MB DBs. If you can do the same in Azure, go for it. Having everything in the DB is a wonderful thing. It is also considered good policy for Postgres and MySQL.