Web app architecture question (large database, exponential growth): Azure Tables or SimpleDB?
I have a web app that stores a large amount of text data. The database currently grows by about 1GB a week. I expect that growth to become exponential as we get more customers: 1GB this week, 2GB next week, 4GB the week after, then 8GB, and so on.
Right now this data is stored in a single MS SQL 2008 database that is 10GB in size. Performance is great right now, with no issues so far.
But, I am worried about what will happen in a few months as the DB keeps growing. I want to ensure that we are able to scale and performance is not affected.
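To put rough numbers on that worry, here is a back-of-the-envelope sketch in Python. It assumes the weekly growth doubles exactly every week (1GB, 2GB, 4GB, ...) on top of the current 10GB, which is only the simplified pattern described above, not measured data. The function names and the 1024GB threshold are illustrative assumptions.

```python
def projected_size_gb(current_gb: float, first_week_growth_gb: float, weeks: int) -> float:
    """Projected size after `weeks` weeks if the weekly growth doubles each week."""
    # Growth over n weeks is a geometric series: g + 2g + 4g + ... = g * (2**n - 1).
    return current_gb + first_week_growth_gb * (2 ** weeks - 1)


def weeks_until(limit_gb: float, current_gb: float = 10.0,
                first_week_growth_gb: float = 1.0) -> int:
    """Number of whole weeks until the projected size reaches `limit_gb`."""
    weeks = 0
    while projected_size_gb(current_gb, first_week_growth_gb, weeks) < limit_gb:
        weeks += 1
    return weeks


if __name__ == "__main__":
    # Assumed starting point: 10GB today, 1GB added in the first week.
    for n in (4, 8, 12):
        print(f"after {n} weeks: {projected_size_gb(10.0, 1.0, n):.0f} GB")
    print("weeks until 1024 GB:", weeks_until(1024.0))
```

Under these assumptions the database would pass 1TB in about 10 weeks, so true week-over-week doubling leaves very little time. Real customer growth is unlikely to follow this curve exactly, so the sketch is only useful for sizing the problem.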
Also, we need to figure out a good backup strategy for the DB that is not too expensive.
I'm considering moving the storage over to Amazon's SimpleDB, or moving our web app over to Azure and using Azure Tables to store this data.
The advantage of Azure is that backups would be taken care of automatically (both for Azure Tables and the Azure SQL database). The downsides are the cost and the fact that several parts of the app would need to be re-architected to run on Azure and use Azure Tables.
The advantage of SimpleDB is that we are currently on EC2 and can stay there, and less of the app would need to be rewritten to use SimpleDB instead of SQL Server. The downside: we still need an effective backup strategy for the SQL Server.
We could also just leave the app as it is right now, in an MS SQL 2008 database. I'm just not sure how large a database SQL Server can handle; the largest case studies I've seen are 1TB or so. Again, we would need an effective backup and recovery strategy for a database that is pretty large. The benefit is that we can run relational queries on the data, so there is a slight advantage in keeping the data in SQL Server.
What is the best solution here? How do other companies scale databases that are this big and grow at this rate? And which backup and recovery options work best?
Any advice or experience you can share with Azure Tables, SimpleDB, or large SQL Server databases would be great as well!
Comments (1)
Read up on distributed databases; it might give you another perspective on data storage. I'm not saying a distributed database is the best option for you. Just read about it and see whether it's what you're looking for.
http://www.google.com/search?q=distributed+database
http://cassandra.apache.org/
http://voltdb.com/
or read some articles from http://highscalability.com/
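To give a feel for the basic idea behind spreading data over several machines, here is a minimal Python sketch of hash partitioning. It is a toy, in-memory illustration and is not how Cassandra or VoltDB are actually implemented; `shard_for` and `ShardedStore` are made-up names, and each "shard" is just a dict standing in for a separate database server.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the mapping is the same on every run.
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy store: each dict stands in for one database node (an assumption)."""

    def __init__(self, shard_count: int) -> None:
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, text: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = text

    def get(self, key: str):
        # Returns None when the key was never stored.
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for i in range(1000):
        store.put(f"doc-{i}", f"text body {i}")
    print([len(s) for s in store.shards])  # roughly even spread across shards
    print(store.get("doc-42"))
```

The catch with plain modulo hashing is that changing the number of shards remaps most keys, which is one reason real distributed databases use more elaborate schemes. The trade-off mentioned in the question also applies: relational queries across shards get harder.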
Good luck!