Considerations for where to store documents - on the file server or in the database?

Posted on 2024-08-20 10:51:29

I have a design decision to make regarding documents uploaded to my web site: I can either store them on my file server somewhere, or I can store them as a blob in my database (MSSQL 2005). If it makes any difference to the design decision, these documents are confidential and must have a certain degree of protection.

The considerations I've thought of are:

  1. Storing on the file server makes for HUUUUUUUGE numbers of files all dumped in a single directory, and therefore slower access, unless I can work out a reasonable semantic definition for a directory tree structure
  2. OTOH, I'm guessing that the file server can handle compression somewhat better than the DB... or am I wrong?
  3. My instincts tell me that the DB's security is stronger than the file server's, but I'm not sure if that's necessarily true.
  4. I don't know how having terabytes of blobs in my DB will affect performance.

I'd very much appreciate some recommendations here. Thanks!

Comments (3)

夜唯美灬不弃 2024-08-27 10:51:29

In SQL Server 2005, your only choices are to store the files inside a database table using VARBINARY(MAX), or to keep them outside the database.

The obvious drawback of leaving them outside the database is that the database can't really control what happens to them; they could be moved, renamed, deleted...

SQL Server 2008 introduces the FILESTREAM attribute on VARBINARY(MAX) columns, which lets you keep the files outside the database table but still under the database's transactional control. For example, you cannot just delete the files from disk; they are an integral part of the database and so get copied and backed up with it. Great if you need it, but it could make for some huge backups! :-)

The SQL Server 2008 launch presented some "best practices" as to when to store stuff in the database directly, and when to use FILESTREAM. These are:

  • if the files are typically less than 256 KB in size, the database table is the best option
  • if the files are typically over 1 MB in size, or could be more than 2 GB in size, then FILESTREAM (or in your case: plain old filesystem) is your best choice
  • no recommendation for files between those two margins
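
The thresholds above can be written down as a small helper. This is only a sketch of the rule of thumb quoted in this answer; the function name `suggest_storage` and the returned labels are made up for illustration, and sizes exactly at a boundary are treated as falling in the middle band, which is an assumption.

```python
KB = 1024
MB = 1024 * KB


def suggest_storage(typical_size_bytes: int) -> str:
    """Map a typical file size to the rule of thumb quoted above.

    The labels are illustrative only; nothing here talks to SQL Server.
    """
    if typical_size_bytes < 256 * KB:
        return "database table"
    if typical_size_bytes > 1 * MB:
        # On SQL Server 2005 there is no FILESTREAM, so this means the filesystem.
        return "FILESTREAM or filesystem"
    return "no recommendation"
```

For example, `suggest_storage(50 * KB)` falls in the first band and `suggest_storage(5 * MB)` in the second.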

Also, to avoid hurting the performance of your queries, it's often a good idea to put the large files into a separate table altogether. Don't make the huge blobs part of the regular tables you query; instead, create a separate table that you only query when you really need the megabytes of documents or images.
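
Here is a rough sketch of that split between a metadata table and a blob table. It uses Python's built-in `sqlite3` purely as a runnable stand-in; on SQL Server the `content` column would be `VARBINARY(MAX)`. The table names, column names, and helper functions are all hypothetical.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a light metadata table and a separate table holding only the blobs."""
    conn.executescript(
        """
        CREATE TABLE document (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            uploaded_at TEXT NOT NULL
        );
        CREATE TABLE document_content (
            document_id INTEGER PRIMARY KEY REFERENCES document(id),
            content BLOB NOT NULL
        );
        """
    )


def add_document(conn, title: str, uploaded_at: str, content: bytes) -> int:
    """Insert the metadata row and its blob row, returning the new document id."""
    cur = conn.execute(
        "INSERT INTO document (title, uploaded_at) VALUES (?, ?)",
        (title, uploaded_at),
    )
    doc_id = cur.lastrowid
    conn.execute(
        "INSERT INTO document_content (document_id, content) VALUES (?, ?)",
        (doc_id, content),
    )
    conn.commit()
    return doc_id


def list_documents(conn):
    """Everyday listing query: never touches the blob table."""
    return conn.execute("SELECT id, title FROM document ORDER BY id").fetchall()


def load_content(conn, doc_id: int):
    """Fetch the blob only when the document itself is actually needed."""
    row = conn.execute(
        "SELECT content FROM document_content WHERE document_id = ?",
        (doc_id,),
    ).fetchone()
    return None if row is None else row[0]
```

Listing and searching go through `document` only, so the large rows are read just when `load_content` is called for a specific id.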

So that might give you an idea of where to start out from!

潇烟暮雨 2024-08-27 10:51:29

I strongly suggest you consider the filesystem solution. The reasons are:

  • you have better access to the files (precious in case of debugging), meaning that you can use regular console-based tools
  • you can quickly and easily take advantage of the OS to distribute the load, for example by using a distributed filesystem, adding redundancy via hardware RAID, etc.
  • you can take advantage of the OS access control lists to enforce permissions.
  • you don't clog your database

If you are worried about large numbers of entries in your directories, you can always create a branching scheme. For example:

filename : hello.txt
filename md5: 2e54144ba487ae25d03a3caba233da71
final filesystem position: /path/2e/54/hello.txt
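
A minimal sketch of that branching scheme in Python follows. The function name `sharded_path` and its `levels` and `width` parameters are made up for illustration; MD5 is used here only to spread files across directories, not for any security purpose.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Build root/<aa>/<bb>/filename from the leading hex digits of md5(filename)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive chunks of `width` hex characters as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)
```

With two levels of two hex characters each, files are spread over up to 65,536 directories. Because only the name is hashed, two uploads with the same filename map to the same path; hashing a unique document ID instead would be one way around that, though that goes beyond what this answer describes.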

萌︼了一个春 2024-08-27 10:51:29

There's a LOT of "it depends" behind this popular subject. Since you say the documents are sensitive and confidential, off the cuff I'd go with storing in the database. Here are a few reasons:

  • Potentially better security. It is often easier to hack a file system than a database.
  • Better volume control. Thousands of files in one folder can strain an OS, whereas a database can take millions of rows in one table without blinking.
  • Better searching and scanning. Add categorizing columns when you load the data, or try out full text indexing to scan the actual documents.
  • Backups may be more efficient: just add another database to your backup plan, and you're covered (once you work out space details, of course). And those backup files are another layer of obfuscation against anyone trying to get at your sensitive documents.
  • SQL Server 2008 has data compression options that may help here. That, or have the application do it? (More security through obfuscation, perhaps)
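
If the application does the compression itself, one possible shape is to compress the bytes before writing them to the blob column and decompress after reading. This sketch uses Python's standard `zlib` module; the names `pack` and `unpack` are hypothetical, and note that compression alone does not encrypt anything.

```python
import zlib


def pack(data: bytes, level: int = 6) -> bytes:
    """Compress document bytes before they are written to the blob column."""
    return zlib.compress(data, level)


def unpack(blob: bytes) -> bytes:
    """Restore the original document bytes after reading the blob back."""
    return zlib.decompress(blob)
```

How much this saves depends on the content: plain text shrinks a lot, while formats that are already compressed (many images and some office formats) gain little.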

SQL Server 2008 also has the FILESTREAM storage attribute, which may help here, but I'm not familiar enough with it to give a recommendation for your situation.
