I know this has been asked one way or another before, but most of the main issues to do with GAE stability seem to have been asked around the end of 2008, early 2009, or aren't directly related to games at scale (which I'm interested in).
Basically, I have been arguing back and forth with my business partner about whether to use GAE or AWS for the back-end of our social game engine, and now it's crunch time. I love GAE (Java) for so many reasons, and although it used to be unstable, it's pretty good now. The main argument in favour of AWS is that it has proven itself with multiple games serving tens of millions of active users per day. The obvious poster child for AWS is Zynga, with its Farmville peaking at 80+ million DAU. And that's just one of the hugely successful games running on the AWS infrastructure. A remarkable achievement.
So, one way or another it's KNOWN to work. GAE on the other hand doesn't have any examples that I could find doing these sorts of numbers. Not even close. So can I trust it? Is there a single example of a large social game with 2 million+ Daily Active Users, using GAE?
The main considerations for our social game back-end are:
- Reliable CDN (Amazon CloudFront/S3 is excellent for this, as is Google's DataStore).
- Ability to scale without falling over (AWS-EC2 is proven here; GAE doesn't seem to have examples of large game apps handling thousands of requests per second. GAE used to be quite unstable in this regard, so this is my main concern).
- Reliable NoSQL database (AWS-SimpleDB and Google's DataStore are both excellent for this; we really don't need SQL).
- Support/someone to call/contact if there is a problem. (This is one of the biggest worries with GAE. I have no idea who I can call, or if it's even possible. AWS has an SLA and support.)
I look forward to your thoughts, but please note that this is not intended to start any sort of flame war. I love both systems and each has its positives and negatives, but I'm about to make an architectural decision that likely won't be undone moving forward.
Regards,
Shane
Comments (2)
I've never worked with AWS-EC2 so I'm going to share my knowledge just on the Google App Engine side.
Further data:
Recently the Google App Engine team shut down the App Gallery for one simple reason: too many toy apps!
Google wants to counteract this tendency by showcasing successful business case studies; here are some of them:
Other interesting case studies here
"We are well aware of downtimes and reliability issues, and are working hard to solve them: Improving App Engine reliability is our number one priority," a Google Developer Relations Manager recently said here.
App Engine is still in beta and is an evolving platform, so you have to be prepared to deal with downtimes and issues.
The Google App Engine team has just launched a preview of App Engine for Business, which offers a 99.9% uptime service level agreement and premium developer support.
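To put the 99.9% figure in perspective, here is a small sketch of the downtime such a target still allows. It is plain arithmetic and uses no App Engine API; the function name and the measurement periods (a 30-day month, a 365-day year) are assumptions for illustration, since how the SLA actually measures uptime isn't stated here.

```python
# Downtime budget implied by an uptime percentage (plain arithmetic, no GAE APIs).
def allowed_downtime_minutes(uptime_percent, period_days):
    """Return the maximum downtime, in minutes, that still meets the uptime target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    # Assumed measurement periods; the real SLA terms may differ.
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        minutes = allowed_downtime_minutes(99.9, days)
        print("%s: %.1f minutes (%.2f hours)" % (label, minutes, minutes / 60))
```

Under those assumptions, 99.9% still leaves room for roughly 43 minutes of downtime a month, or a little under 9 hours a year.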
Here is my opinion for what it's worth:
I'm aware that it's a tough call; having read a lot of articles about GAE, I have mixed feelings about it, because the reports range from the recent catastrophic Carlos Ble report to the happy experiences of Flower Garden or Gri.pe.
App Engine for Business looks promising and I would consider it in the case of a serious business project plan.
The fresh SDK 1.4.0 is huge, and it clearly shows that the team is pushing hard to fix some annoying issues (warmup requests) and relax some limitations (10 minutes of processing on Task Queue).
Last thing to consider: if you are going to have big numbers, the Google App Engine team will probably pick up your app as a successful case study, which comes with a boost of free and powerful hype.
BuddyPoke is one example of a large-scale social app running on GAE. How large I'm not sure. This article says 30m daily page views (not users):
http://googleappengine.blogspot.com/2008/10/app-engine-case-studies.html
Their Facebook page says 2.7 million monthly (not daily) users:
http://www.facebook.com/buddypoke
Although, they are also on a heap of other social networks:
http://www.buddypoke.com/
Personally I decided to go with GAE, for a couple of main reasons:
If your point 4 is a big one for you, then you may be better off with AWS. With GAE there appears to be nothing you can do, and no-one you can contact.
About a week ago I had an issue with my app - it suddenly started failing inside Google's code, in a location which had been working fine for the previous 5 days, i.e. since I had last uploaded my app. The only way to report issues to Google seems to be via their production issue template, here:
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue
I reported the issue, and didn't hear anything. Since it's running on Google's servers, I was unable to resort to any 'usual' emergency tactics like restarting a server. An hour later the problem resolved itself - I'm not sure if someone at Google saw my message and fixed something, or if it just went away. I updated my bug report to say the problem was fixed, but even now, a week later, the issue hasn't been closed or even acknowledged. Also, since the issue has to be posted publicly, my app is now getting random hits from bots.
Admittedly my app is currently only in beta and has only a hundred or so users, so it wasn't a major incident for me. If I were getting thousands / millions of hits, maybe Google would have noticed the problem themselves earlier, or they would have paid more attention to my bug report.
On your point 3, even my small app with a small amount of traffic throws occasional data store errors (even during times which aren't reported on the availability charts as outages).
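One common way to live with occasional transient errors like these is to retry the failed operation with exponential backoff. The sketch below is generic Python, not GAE code: `TransientStoreError`, `with_retries` and `make_flaky_write` are hypothetical names invented for illustration, and a real app would catch whatever exception its datastore client actually raises.

```python
import random
import time


class TransientStoreError(Exception):
    """Hypothetical stand-in for a short-lived datastore failure (not a real GAE class)."""


def with_retries(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientStoreError wait and retry, doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientStoreError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Exponential backoff with jitter so many clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


def make_flaky_write(failures_before_success):
    """Build a fake write that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def write():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientStoreError("simulated datastore timeout")
        return "saved after %d calls" % state["calls"]

    return write


if __name__ == "__main__":
    # The sleep is stubbed out so the demo finishes instantly.
    print(with_retries(make_flaky_write(2), sleep=lambda seconds: None))
```

Retries only make sense for operations that are safe to repeat, so a write that is not idempotent needs more care than this sketch shows.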
Having said this, I still like GAE (I am using the Python version), and plan to stick with it. The promise of GAE is its scalability - although it falls over occasionally now with my small traffic, it shouldn't fall over any more often when it scales to much more traffic (i.e. your point 2), provided I've coded it correctly to avoid contention. I'll see how it goes.
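As one illustration of what coding "to avoid contention" can mean, a pattern often used on the datastore is the sharded counter: instead of every request updating one hot entity, each write goes to one of several shards and reads add the shards up. The sketch below is an in-memory simulation only. `ShardedCounter` and its parameters are hypothetical, each dict slot stands in for what would be a separate datastore entity, and a real app would update the chosen shard inside a transaction.

```python
import random


class ShardedCounter(object):
    """In-memory sketch of a sharded counter; each dict slot stands in for one datastore entity."""

    def __init__(self, num_shards=20, rng=None):
        self.num_shards = num_shards
        self.shards = {}  # shard index -> partial count
        self.rng = rng or random.Random()

    def increment(self, amount=1):
        # Writes go to a randomly chosen shard, so concurrent writers
        # rarely touch the same entity.
        index = self.rng.randrange(self.num_shards)
        self.shards[index] = self.shards.get(index, 0) + amount

    def total(self):
        # Reads pay for the spread: every shard has to be summed.
        return sum(self.shards.values())


if __name__ == "__main__":
    counter = ShardedCounter(num_shards=20)
    for _ in range(1000):
        counter.increment()
    print(counter.total())
```

The trade-off is that reads get more expensive as the shard count grows, so the number of shards is something to tune against the expected write rate.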
Finally, regarding your point 1: on GAE, the blobstore and/or static files are closer to a CDN than the datastore is. However, for very large amounts of traffic, a real CDN may be cheaper. GAE's static file serving is also not necessarily a true CDN; see Google app engine & CDN.