SimpleDB parallelism
There is a comment in the SimpleDB documentation that basically states that if you need more parallelism, you should use multiple domains.
This leads me to my question.
Does SimpleDB serialize all of its requests, even if they come from multiple client applications?
Does anyone have a definitive answer to that?
Comments (2)
No, of course SimpleDB doesn't "serialize all of its requests"; however, it must do some amount of locking to ensure transactional consistency. Sharding across domains is an easy way to minimize the impact of this.
SimpleDB (per Netflix's lead engineer, who blogs about their transition onto it) seems to rate-limit accounts per domain. So you might have one account issuing queries or inserts from 10 threads against a single domain, and those (from what I gathered) will be rate-limited to approximately 40-70 requests per second (I have seen varying reports).
The other thing to consider is that as your domain grows in size, query performance degrades.
Because of these two behaviors, it is recommended that for large data sets you "shard" your data across multiple domains.
So consider a social app that tracks tweets; you might create the following 5 domains:
TWEETS_0,TWEETS_1,TWEETS_2,TWEETS_3,TWEETS_4
then shard your inserts across them:
int domainIndex = tweet.getId() % 5;
simpleDB.doInsert(domainIndex, arguments...)
or some such pseudo-code. AWS recently upped the domain limit to 250 per customer, so it seems this sharding design is expected to be used.
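To make the pseudo-code above a bit more concrete, here is a minimal Java sketch of the routing step only. The class name TweetShardRouter, the "TWEETS_" prefix parameter, and the method names are hypothetical, chosen for illustration; the sketch does not call SimpleDB or the AWS SDK at all, it only computes which domain name a given tweet ID maps to. It assumes tweet IDs are numeric and uses Math.floorMod so that a negative ID still yields a valid index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a numeric tweet ID to one of N domain names.
final class TweetShardRouter {
    private final String prefix;   // e.g. "TWEETS_"
    private final int shardCount;  // e.g. 5

    TweetShardRouter(String prefix, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.prefix = prefix;
        this.shardCount = shardCount;
    }

    // Returns the domain name for this ID, e.g. "TWEETS_2".
    // floorMod (unlike %) never returns a negative result for a positive divisor.
    String domainFor(long tweetId) {
        return prefix + Math.floorMod(tweetId, (long) shardCount);
    }

    // Lists every shard domain, useful when a read has to visit all of them.
    List<String> allDomains() {
        List<String> domains = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            domains.add(prefix + i);
        }
        return domains;
    }
}
```

An insert would then be sent to router.domainFor(tweet.getId()) using whatever client you already have. Note that a SimpleDB query targets a single domain, so a read that is not keyed by tweet ID has to be issued against each name in allDomains() and merged on the client side. Changing the shard count later also changes where existing IDs map, so it is worth picking the number of domains up front.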
The pipe-dream promise of SimpleDB is "we scale, you worry about code", but the reality is that we just aren't there yet.
You still have to worry about a handful of details.