计划优化对非常大的 InnoDB 表的访问
我是一款社交游戏的开发者,我们拥有近 200 万玩家(而且这个数字还在增长)。
主 MySQL 数据库服务器有 24 Gb RAM,如果不是一张非常大的表,数据库可以装入内存。目前它拥有近 10 亿条记录,大小为 33Gb。它具有以下架构:
CREATE TABLE `plant` (
`player_id` int(10) unsigned NOT NULL DEFAULT '0',
`id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`x` tinyint(3) unsigned NOT NULL DEFAULT '0',
`y` tinyint(3) unsigned NOT NULL DEFAULT '0',
`distort_step` tinyint(3) unsigned NOT NULL DEFAULT '0',
`grow_step` tinyint(3) unsigned NOT NULL DEFAULT '0',
`proto_id` int(10) unsigned DEFAULT '0',
`yflip` tinyint(4) NOT NULL DEFAULT '0',
`grow_start` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`player_id`,`id`)
) ENGINE=InnoDB
我正在考虑以下关于如何优化它的计划:
添加带有“archive_”前缀的类似表
对此进行分区按哈希值创建新表
找出未玩游戏的非活跃玩家,例如一个月。
将他们的记录从大表复制到存档表
标记正在存档的玩家并使用存档表 每当他/她登录时都使用原始的
也可以选择通过哈希对原始表进行分区(可选, 因为它可能会导致大量停机时间)
如果没有任何帮助,请考虑分片
您对此有何看法?听起来是个好计划吗?
I'm a developer of a social game where we have almost 2 millions players(and this number is growing).
The master MySQL DB server has 24 Gb RAM and the database could fit into the memory if it wasn't for one table which has really large size. Currently it has almost a billion of records and its size is 33Gb. It has the following schema:
CREATE TABLE `plant` (
`player_id` int(10) unsigned NOT NULL DEFAULT '0',
`id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`x` tinyint(3) unsigned NOT NULL DEFAULT '0',
`y` tinyint(3) unsigned NOT NULL DEFAULT '0',
`distort_step` tinyint(3) unsigned NOT NULL DEFAULT '0',
`grow_step` tinyint(3) unsigned NOT NULL DEFAULT '0',
`proto_id` int(10) unsigned DEFAULT '0',
`yflip` tinyint(4) NOT NULL DEFAULT '0',
`grow_start` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`player_id`,`id`)
) ENGINE=InnoDB
I'm thinking about the following plan on how to optimize it:
Add a similar table with "archive_" prefix
Partition this new table by hash
Figure out inactive players who haven't played the game, say, for a month.
Copy their records from the big table to the archive table
Mark the player being archived and use the archive table instead
the original one whenever he/she logs inOptionally partition the original table by hash as well(optionally,
because it may cause lots of downtime)If nothing helps think about sharding
What do you think about it? Does it sound like a good plan?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我认为你的建议非常好,既简单又可能非常有效。您可以避开诸如分区之类的“可怕”东西。当然,如果您预计同时有 200 万玩家玩游戏,您将需要重新考虑您的方法。
您甚至可以一路走下去,只跟踪哪些“植物”现在实际上正在积极玩游戏。我假设您永远不会为现在不玩游戏的玩家更新表格。
您甚至可以按照您认为合适的方式对存档表进行分区,例如通过player_id 上的散列或类似的东西。
I think your suggestion is a very good one, it's simple and likely very effective. You steer away from 'scary' stuff like partitioning. Of course if you are expecting to have 2 million players playing all at once you will need to rethink your approach.
You could even go all the way and only keep track of which 'plants' are actually actively playing the game RIGHT NOW. I'm assuming you wont ever update the table for players that aren't playing right now.
You could even partition the archive tables as you see fit, for instance by hash on player_id or something similar.
对表进行分片怎么样?然后您就可以毫无问题地进行扩展。如果您担心与分片相关的繁重工作,请尝试 ScaleBase 以获得透明的分片解决方案
How about sharding the table? Then you can scale without any problem. If you're worried about the heavy lifting associated with Sharding, try out ScaleBase for a transparent sharding solution