Besides batching, is there any other way to optimize multiple put calls to Google App Engine?
I hold messages in a map for each user in the datastore. It's held as an unindexed serialized value keyed by a unique name. A user can message many users at once. Currently I execute a batch get for the (e.g.) 20 targets, update the serialized value in each, then execute a batch put. The serialized message size is small enough to be unimportant, around 1KB.
This is quick for the user; the real time shown in appstats is 90ms. However, the cpu-time cost is 918ms. This causes warnings and may become expensive with high usage, or cause trouble if I wish to message 50 users. Is there any way to reduce this cpu-time cost, either with datastore tweaks, or an obvious change to the architecture I've missed? A task queue solution would remove the warnings but would really only redistribute the cost.
EDIT: The datastore key is the username of the receiver, and the value is the messages stored as a serialized Map where the key is the username of the sender and Message is a simple object holding two ints. There are two types of request. The 'update' type described above, where the message map is retrieved, the new message is added to the map, and the map is stored. The 'get' type is the inbox owner reading the messages, which is a simple get based on key. My thinking was that even if this were split out into a multi-value relationship or similar, that might improve the fidelity (allowing two updates at once), but the amount of put work would still be the same provided it's a simple key-value approach.
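For reference, a minimal sketch of the update/get pattern described above. A plain dict stands in for the datastore (so the batch get/put calls are simulated), and the `Message`, `update`, and `get_inbox` names are illustrative, not the actual code:

```python
import pickle

# A plain dict stands in for the datastore:
# key = receiver's username, value = serialized map {sender_username: Message}.
datastore = {}

class Message(object):
    """Simple object holding two ints, as described above."""
    def __init__(self, a, b):
        self.a = a
        self.b = b

def update(sender, receivers, message):
    """'Update' request: get each receiver's map, add the message, put it back."""
    for receiver in receivers:  # stands in for one batch get + one batch put
        raw = datastore.get(receiver)
        inbox = pickle.loads(raw) if raw else {}
        inbox[sender] = message
        datastore[receiver] = pickle.dumps(inbox)

def get_inbox(receiver):
    """'Get' request: the inbox owner reads messages with a simple keyed get."""
    raw = datastore.get(receiver)
    return pickle.loads(raw) if raw else {}
```

Note that every message to N receivers deserializes, modifies, and reserializes N whole inboxes, which is where the per-request CPU cost concentrates.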
1 Answer
It sounds like you're already doing things fairly efficiently. It's not likely you're going to be able to reduce this substantially. Less than 1000 cpu milliseconds per request is a fairly reasonable amount anyway.
There are two things you might gain by splitting entities up: if your lists are long, you save the CPU cost of reading and writing large entities when you only need to read or modify a small part of one, and you save on transaction collisions. That is, if several tasks need to add items to the queue simultaneously, you can do it without transaction retries, saving you CPU time.