Are cloud services suitable for this application?

Posted on 2024-07-12 10:42:12

I'm looking into the cloud services that are popping up (e.g. Amazon, Azure) and am wondering whether they would be suitable for my app.

My application basically has a single-table database of about 500 GB, which grows by 3-5 GB per day.
I need to extract text data from it, about 1 million rows at a time, filtering on about 5 columns. The extracted data is usually about 1-5 GB, zips down to 100-500 MB, and is then made available on the web.

There are some details of my existing implementation here:
One 400GB table, One query - Need Tuning Ideas (SQL2005)

So, my question:
Would the existing cloud services be suitable for hosting this type of app? What would it cost to store this amount of data and to cover the bandwidth (usage would be about 2 GB/day)?

Are the persistence systems suitable for storing large flat tables like this, and do they offer the ability to search on a number of columns?

My current implementation runs on sub-$10k hardware, so it wouldn't make sense to move if the costs are much higher than, say, $5k per year.

Comments (2)

野却迷人 2024-07-19 10:42:12

Given the large volume of data and the rate at which it's growing, I don't think Amazon would be a good option. I'm assuming you'll want to store the data on persistent storage. With EC2, however, you need to allocate a fixed amount of storage and attach it as a disk. Unless you allocate a really large amount of space up front (and then pay for unused disk space), you will constantly have to add more disks. I did a quick back-of-the-envelope calculation and estimate that hosting will cost between $2,500 and $10,000 per year. It's difficult for me to estimate accurately because of all the variable things Amazon charges for (instance uptime, storage space, bandwidth, disk I/O, etc.). See the EC2 pricing page for the current rates.
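To make the "constantly adding more disks" point concrete, here is a rough sketch of how long a fixed allocation would last at the growth rates given in the question. The 1,000 GB allocation is a hypothetical figure chosen only for illustration, and the function name is made up for this example.

```python
import math


def days_until_full(allocated_gb, used_gb, growth_gb_per_day):
    """Return the whole days of growth that still fit in the allocated volume."""
    free_gb = allocated_gb - used_gb
    if free_gb <= 0:
        return 0
    return math.floor(free_gb / growth_gb_per_day)


USED_GB = 500        # current table size, from the question
ALLOCATED_GB = 1000  # hypothetical allocation, not from the original post

# The question states growth of 3-5 GB per day; check both ends of the range.
for growth in (3, 5):
    days = days_until_full(ALLOCATED_GB, USED_GB, growth)
    print(f"{growth} GB/day: volume full after about {days} days")
```

Under these assumptions, even doubling the current size buys only a few months before another disk has to be added.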

感情旳空白 2024-07-19 10:42:12

Assuming that this is non-relational data (you can't do relational data on a single table), you could consider Azure Table Storage, a storage mechanism designed for non-relational structured data.

The problem you will run into is that Azure Tables only have a primary index, so the data cannot be indexed on 5 columns as you require, unless you store the data 5 times, with each copy indexed by one of the columns you wish to filter on. I'm not sure that would work out to be very cost-effective, though.
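As a rough illustration of the "store the data 5 times" idea, the sketch below uses plain Python dictionaries to stand in for a store that can only look rows up by a single key. It does not use any Azure API, and the column names and helper functions are hypothetical, since the question only says "about 5 columns".

```python
from collections import defaultdict

# Hypothetical column names, invented for this example.
FILTER_COLUMNS = ["col_a", "col_b", "col_c", "col_d", "col_e"]


def build_copies(rows, filter_columns):
    """Build one keyed copy of the data per filter column.

    Each copy maps a column value to the full rows having that value,
    mimicking a store with a single primary index.
    """
    copies = {col: defaultdict(list) for col in filter_columns}
    for row in rows:
        for col in filter_columns:
            copies[col][row[col]].append(row)
    return copies


def query(copies, filters):
    """Do one keyed lookup, then filter the remaining columns in memory."""
    (first_col, first_val), *rest = filters.items()
    candidates = copies[first_col].get(first_val, [])
    return [r for r in candidates if all(r[c] == v for c, v in rest)]


rows = [
    {"id": 1, "col_a": "x", "col_b": "p", "col_c": 1, "col_d": "m", "col_e": "u"},
    {"id": 2, "col_a": "x", "col_b": "q", "col_c": 2, "col_d": "m", "col_e": "v"},
    {"id": 3, "col_a": "y", "col_b": "p", "col_c": 1, "col_d": "n", "col_e": "u"},
]

copies = build_copies(rows, FILTER_COLUMNS)
matches = query(copies, {"col_a": "x", "col_b": "q"})
print([r["id"] for r in matches])
```

Every row is written once per filter column, so storage and write volume scale with the number of copies, which is the cost concern raised above.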

Azure Table storage costs start from as little as USD 0.08 per GB per month, depending on how much data you store. There are also per-transaction charges and charges for egress data.
For more info on pricing, see http://www.windowsazure.com/en-us/pricing/calculator/advanced/
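For a rough sense of scale, the storage part alone can be estimated from the figures in this thread: 500 GB to start, 3-5 GB of growth per day, and USD 0.08 per GB per month. The sketch below assumes a flat rate with no tiering, 30-day months, and billing on the size at the start of each month; it leaves out transaction and egress charges entirely, so treat the result as a lower bound rather than a quote.

```python
PRICE_PER_GB_MONTH = 0.08  # USD, the figure quoted in this answer
INITIAL_GB = 500           # from the question
DAYS_PER_MONTH = 30        # simplifying assumption


def first_year_storage_cost(growth_gb_per_day, copies=1):
    """Estimate first-year storage cost in USD, excluding transactions and egress.

    Bills each month on the size at the start of that month (an assumption).
    """
    total = 0.0
    for month in range(12):
        size_gb = INITIAL_GB + growth_gb_per_day * DAYS_PER_MONTH * month
        total += size_gb * PRICE_PER_GB_MONTH * copies
    return total


for growth in (3, 5):
    single = first_year_storage_cost(growth)
    five = first_year_storage_cost(growth, copies=5)
    print(f"{growth} GB/day: 1 copy ${single:,.0f}, 5 copies ${five:,.0f}")
```

Under these assumptions, a single copy comes to roughly $955 to $1,272 for the first year, and five copies to roughly $4,776 to $6,360, which already sits around the $5k per year ceiling mentioned in the question.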

Where do you need to access this data from?
How is it written?

Depending on the answers, there could be other options to consider too, such as Azure Drives.
