Should I keep images on EBS or S3?
I am migrating my Java, Tomcat, MySQL server to AWS EC2.
I have already attached an EBS volume for storing the MySQL data. In my web application people can upload images, so I need to persist them. I have two alternatives in mind:
- Save uploaded images to EBS volume.
- Use the S3 service.
The following are my notes. Please be skeptical about them, as my expertise is in software development, not servers.
EBS plus: S3 storage is more expensive ($0.15/GB vs. $0.10/GB).
S3 plus: Serving static files from EBS may hurt my web server's performance. Is this true? Does serving images affect server performance noticeably? With S3, my server would not be responsible for serving static files.
S3 plus: Serving static files from EBS may incur I/O costs, though they would probably be minor.
EBS plus: People say EBS is faster.
S3 plus: People say S3 is safer for persistence.
EBS plus: No need to learn an API; saving the images to an EBS volume is straightforward.
In short, I cannot decide and would appreciate your guidance.
Thanks
Comments (5)
The price comparison is not quite right:
S3 charges are $0.14 per GB USED, whereas EBS charges are $0.10 per GB PROVISIONED (the size of your EBS volume), whether you use it or not. As a result, S3 may or may not be cheaper than EBS.
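To make the used-versus-provisioned distinction concrete, here is a minimal sketch using the per-GB prices quoted in this answer. The function names are hypothetical, the prices are the thread's figures rather than current AWS pricing, and request and I/O charges are ignored.

```python
S3_PRICE_PER_GB = 0.14   # price quoted in this answer, charged on GB used
EBS_PRICE_PER_GB = 0.10  # price quoted in this answer, charged on GB provisioned


def s3_cost(gb_used):
    """S3 bills only for the data actually stored."""
    return gb_used * S3_PRICE_PER_GB


def ebs_cost(gb_provisioned):
    """EBS bills for the whole volume size, whether it is full or not."""
    return gb_provisioned * EBS_PRICE_PER_GB


def cheaper_option(gb_used, gb_provisioned):
    """Return which service is cheaper for the given usage and volume size."""
    if gb_used > gb_provisioned:
        raise ValueError("cannot store more data than the volume holds")
    s3, ebs = s3_cost(gb_used), ebs_cost(gb_provisioned)
    if s3 < ebs:
        return "S3"
    if ebs < s3:
        return "EBS"
    return "tie"


if __name__ == "__main__":
    # A mostly empty 100 GB volume versus a nearly full one.
    print(cheaper_option(20, 100))
    print(cheaper_option(90, 100))
```

With these prices, EBS only comes out cheaper once the volume is more than about 71% full (0.10 / 0.14).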
I'm currently using S3 for a project and it's working extremely well.
EBS means you need to manage a volume plus the machines to attach it to. You need to add space as it fills up and perform backups (not saying you shouldn't back up your S3 data, just that it's not as critical).
It also makes it harder to scale: when you want to add machines, you either need to move the images to a separate machine or clone them across all machines. This also adds a bottleneck: you have to manage your own upload process, which either uploads to all machines or has a single machine managing it.
I recommend S3: it's set and forget. Any number of machines can be performing uploads in parallel and you don't really need to notify other machines about the upload.
In addition, you can use Amazon CloudFront as a cheap CDN in front of the images instead of serving downloads directly from S3.
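The "any number of machines in parallel" point can be sketched as follows. This is not code from my project, just an illustration: `upload_image`, the `uploads/` key prefix, and the bucket name are hypothetical, and the client is assumed to be a boto3 S3 client (whose `put_object` call accepts `Bucket`, `Key`, `Body`, and `ContentType`).

```python
import uuid


def upload_image(s3_client, bucket, data, content_type="image/jpeg"):
    """Store image bytes under a unique key and return that key.

    Each upload gets a random key, so several web servers can upload
    at the same time without coordinating with each other.
    """
    key = f"uploads/{uuid.uuid4().hex}"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ContentType=content_type,
    )
    return key
```

Assuming boto3 is installed and credentials are configured, usage would look like `upload_image(boto3.client("s3"), "my-image-bucket", image_bytes)`, with the returned key saved next to the rest of the record in MySQL.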
I have architected solutions on AWS for stock photography sites that store millions of images spanning terabytes of data. I would like to share some AWS best practices that fit your requirement:
P1) Store the original image file in the S3 Standard storage option.
P2) Store reproducible images, such as thumbnails, in the S3 Reduced Redundancy option (RRS) to save costs.
P3) Metadata about the images, including the S3 URL, can be stored in Amazon RDS or Amazon DynamoDB depending on query complexity, and queried from there. If your queries are complex, it is also common practice to store the metadata in Amazon CloudSearch or Apache Solr.
P4) Deliver your thumbnails to users with low latency using Amazon CloudFront.
P5) Queue your image conversions through either SQS or RabbitMQ on Amazon EC2.
P6) If you are planning to use EBS, note that the volumes do not scale with your EC2 instances. Ideally, you would then use GlusterFS as a common storage pool for all your images. Multiple Amazon EC2 instances in an Auto Scaling setup can still connect to it and read and write images.
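Points P1, P2, P3 and P5 can be sketched roughly like this. It is only an illustration under stated assumptions: the bucket name, key layout and function names are hypothetical, the dictionaries are shaped as keyword arguments for a boto3 `put_object` call, and a standard-library `queue.Queue` stands in for SQS or RabbitMQ.

```python
import queue

BUCKET = "example-images"  # hypothetical bucket name

# In-process stand-in for an SQS or RabbitMQ queue of conversion jobs.
conversion_jobs = queue.Queue()


def original_put_args(image_id, data):
    """P1: originals go to the Standard storage class."""
    return {"Bucket": BUCKET, "Key": f"originals/{image_id}.jpg",
            "Body": data, "StorageClass": "STANDARD"}


def thumb_put_args(image_id, data):
    """P2: thumbnails can be regenerated, so Reduced Redundancy is enough."""
    return {"Bucket": BUCKET, "Key": f"thumbs/{image_id}.jpg",
            "Body": data, "StorageClass": "REDUCED_REDUNDANCY"}


def metadata_row(image_id, owner):
    """P3: the record kept in RDS or DynamoDB, including the S3 locations."""
    return {
        "image_id": image_id,
        "owner": owner,
        "original_url": f"s3://{BUCKET}/originals/{image_id}.jpg",
        "thumb_url": f"s3://{BUCKET}/thumbs/{image_id}.jpg",
    }


def handle_upload(image_id, owner, data):
    """Prepare the original upload and metadata, and queue the thumbnail job (P5)."""
    conversion_jobs.put(image_id)
    return original_put_args(image_id, data), metadata_row(image_id, owner)
```

A worker would take image ids off the queue, generate the thumbnail, and upload it with `thumb_put_args`; CloudFront (P4) would then sit in front of the `thumbs/` keys.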
You already outlined the advantages and disadvantages of both.
If you are planning to store terabytes of images, with storage requirements increasing day after day, S3 will probably be your best bet as it is built especially for these kinds of situations. You get unlimited storage space, without having to worry about sharding your data over many EBS volumes.
The recurring cost is that S3 comes out 50% more expensive than EBS, going by the prices in your question. You will also have to learn the API and implement it in your application, but that is a one-off expense which I think you should be able to absorb very quickly.
Do you expect the images to last indefinitely?
The Amazon EBS FAQ is pretty clear; the annual failure rate is not "essentially zero"; they quote 0.1% to 0.5%. It's better than the disk under your desk, but it would need some kind of backup.
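To see what that rate means over the lifetime of the images, here is a rough calculation. It assumes a constant annual failure rate and independence between years, which is a simplification of my own and not something the FAQ states.

```python
def loss_probability(annual_failure_rate, years):
    """Chance of at least one volume failure over the given number of years."""
    return 1.0 - (1.0 - annual_failure_rate) ** years


if __name__ == "__main__":
    # The quoted range of 0.1% to 0.5% per year, over one and ten years.
    for rate in (0.001, 0.005):
        for years in (1, 10):
            print(f"{rate:.1%} per year, {years} years: "
                  f"{loss_probability(rate, years):.2%}")
```

Under those assumptions, an unbacked-up volume has roughly a 1% to 5% chance of failing at some point within ten years.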