How to scale a document storage system?

Posted 2024-10-15 03:11:04

I maintain a web application (ASP.NET/IIS7/SQL2K8/Win2K8) that needs to access documents: hundreds of thousands of them, and growing. Currently, they all live on a Windows 2K8 Server file share and are accessed by UNC path (SMB). The files are in a single flat directory, and I'm trying to plan how best to improve this setup. I don't want to use the SQL Server FILESTREAM attribute, as migrating everything into it would take significant effort and would lock us into SQL Server. I also need a way to replicate the data for disaster recovery, so ideally the solution would help with that too.

Options could be:

  • Segment files into multiple directories?
    • Application would add metadata for which directory it's on (or segment by other means)
  • Segment files into separate servers? (virtualize)
    • Backup becomes more complicated.
    • Application would add metadata for which server it's on
  • NAS Storage
  • SAN Storage
  • Put a service (WCF) in front of the files and have the app talk to the service
    • bonus of being reusable across many applications
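To make the first option concrete, here is a minimal sketch of hash-based directory segmentation. It assumes each document has a stable ID or file name; the `shard_path` helper, the `\\fileserver\docs` share, and the two-level layout are all hypothetical choices for illustration, and Python is used only for brevity (the same idea ports directly to C#).

```python
import hashlib
from pathlib import PureWindowsPath


def shard_path(root: str, doc_id: str, levels: int = 2, width: int = 2) -> PureWindowsPath:
    """Map a document ID to a nested directory derived from a hash of the ID.

    With levels=2 and width=2 there are 256 * 256 = 65,536 possible leaf
    directories, so a few hundred thousand files spread out to a handful each.
    """
    # The hash is only used to spread files evenly; it is not a security feature.
    digest = hashlib.sha1(doc_id.encode("utf-8")).hexdigest()
    # Take consecutive slices of the hex digest as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PureWindowsPath(root, *parts, doc_id)


if __name__ == "__main__":
    # Hypothetical share and file name, for illustration only.
    print(shard_path(r"\\fileserver\docs", "invoice-0001.pdf"))
```

Because the directory is computed from the ID, the application would not strictly need to store which directory a file is in, although keeping the path as metadata leaves room to change the scheme later.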

Assuming I'm going to store the files on the filesystem and not in the database (I've read those discussions here), which of these would be the more scalable solution?

Comments (1)

无声情话 2024-10-22 03:11:04

You've got a couple of issues:
- managing a large volume of (static?) files
- preparing for backups and disaster recovery of said files

I'll throw this out there, even though I'm not a fan of the answer: you might poke around with SharePoint Foundation 2010, which is available as a free download for Windows Server 2008. If you're having trouble finding the documents you need (by search, by taxonomy via tagging, or by other metadata) or handling document expiration, and you don't want to buy a full-blown document management system, this might be a solution. Of course, it introduces new problems...

If your only goal is to have these files available to serve over the web, then a file store like the one you're using now really is the simplest solution. For DR/redundancy purposes, I'd look at a) running it on RAID or a SAN of some sort and b) auto-syncing the files with the cloud (either Azure or Amazon). For b), you can get apps that make cloud storage appear as a mapped drive, and then use rsync-type software to keep the cloud copy up to date.
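As a rough illustration of the "rsync-type" idea in b), the sketch below does a one-way incremental copy from a source directory to a destination such as a mapped drive. The `mirror` function and its size/mtime comparison are assumptions made for illustration; it is not rsync itself, it never deletes anything from the destination, and a real tool would also handle retries and partial transfers.

```python
import shutil
from pathlib import Path


def mirror(src: Path, dst: Path) -> list[Path]:
    """Copy files that are new or changed under src into dst.

    Returns the list of destination paths that were written.
    """
    copied = []
    for f in sorted(src.rglob("*")):
        if not f.is_file():
            continue
        target = dst / f.relative_to(src)
        if target.exists():
            s, t = f.stat(), target.stat()
            # Treat the file as unchanged if size matches and the copy is not older.
            if s.st_size == t.st_size and int(s.st_mtime) <= int(t.st_mtime):
                continue
        target.parent.mkdir(parents=True, exist_ok=True)
        # copy2 preserves the modification time, so the next run can skip this file.
        shutil.copy2(f, target)
        copied.append(target)
    return copied
```

Run on a schedule, something like this keeps the remote copy current, and only new or changed files are transferred on each pass.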

If you want to build something new and cool, you might think about moving the entire file archive into the cloud, then writing a table in a database to track each file's name, old location, and new cloud location, plus redirector code that can provide access tokens to requesters.
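A minimal sketch of that table-plus-redirector idea might look like the following. Everything here is hypothetical: the `documents` table, the column names, the `SECRET` key, and the HMAC-signed `expires`/`token` query parameters stand in for whatever signed-URL mechanism the chosen cloud provider actually offers. SQLite is used only to keep the example self-contained.

```python
import hashlib
import hmac
import sqlite3
import time

SECRET = b"replace-me"  # hypothetical signing key, for illustration only


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Create (if needed) the table that maps file names to their locations."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               file_name      TEXT PRIMARY KEY,
               old_location   TEXT NOT NULL,
               cloud_location TEXT            -- NULL until the file is migrated
           )"""
    )
    return db


def sign(url: str, expires: int) -> str:
    """Return an HMAC-SHA256 token binding the URL to its expiry time."""
    msg = f"{url}|{expires}".encode("utf-8")
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def redirect_target(db, file_name, ttl=300, now=None):
    """Look up a file and return where the requester should be sent."""
    row = db.execute(
        "SELECT old_location, cloud_location FROM documents WHERE file_name = ?",
        (file_name,),
    ).fetchone()
    if row is None:
        return None  # unknown file
    old_location, cloud_location = row
    if cloud_location is None:
        return old_location  # not migrated yet, keep serving from the share
    expires = int(time.time() if now is None else now) + ttl
    return f"{cloud_location}?expires={expires}&token={sign(cloud_location, expires)}"
```

Keeping the old location in the table means files can be migrated gradually: the redirector falls back to the existing share until a cloud location has been recorded.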

3 different approaches... your choice.
