MySQL: Optimizing a 137,000-row table
I'm trying to optimize a Redmine database before it becomes too much of a pain; the Changes table (basically a log of all the SVN changes) is at about 137,000 rows, and the table is set to the basic default settings. No key packing etc.
The table is as follows
ID int[11] Auto Inc (PK)
changeset_id int[11]
action varchar[1]
path varchar[255]
from_path varchar[255]
from_revision varchar[255]
revision varchar[255]
branch varchar[255]
Indices: Primary (ID),
changeset_id set to INDEX BTREE
All on latin1 charset based on a bit of info from http://forge.mysql.com/wiki/Top10SQLPerformanceTips
The Table Engine is InnoDB
Pack Keys is set to Default (only packs char varchar)
All the other options are turned off.
What's the best way to optimize this? (Bar truncate ;o))
2 Answers
There are some general optimization techniques for MySQL: the first would be to make sure your datatypes fit the ABCs (see here). Going over them from top to bottom, ID and changeset_id look good; action should probably be a CHAR(1) instead of a VARCHAR (nullable if you can leave it blank — and in general, make sure nullability is set correctly on the other fields). As for the five other fields (which, depending on size, probably dominate the table), are strings the correct datatype? (I'm guessing yes for path, from_path, and branch, but maybe revision should be a number. I'm guessing it isn't, so that it supports Git or something.)
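A minimal sketch of those datatype changes, assuming the table is named `changes` as in Redmine's schema (the column list is taken from the question; verify nullability against your actual data before running anything like this):

```sql
-- Hypothetical DDL sketch, not a drop-in fix.
-- Shrinks `action` to a fixed one-character column and makes the
-- from_* columns (only populated for copies/moves) explicitly nullable.
ALTER TABLE changes
  MODIFY action        CHAR(1)      NOT NULL,
  MODIFY from_path     VARCHAR(255) NULL,
  MODIFY from_revision VARCHAR(255) NULL;
```

CHAR(1) drops the per-value length byte that VARCHAR carries, which matters little per row but keeps the row format simpler.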
Also, they look like normalization targets, especially since "paths" and "revisions" tables would normalize four of them (here's a basic tutorial, if you need it).
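One possible shape for that normalization — table and column names here are purely illustrative, and Redmine itself would need code changes to use it:

```sql
-- Sketch only: deduplicate repeated path strings into a lookup table.
CREATE TABLE paths (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  path VARCHAR(255) NOT NULL,
  UNIQUE KEY uq_path (path)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- `changes` would then store path_id / from_path_id INT references
-- instead of repeating full VARCHAR(255) strings on every row.
```

With 137,000 rows of SVN history, the same paths recur constantly, so a 4-byte INT per reference instead of a repeated string is where most of the space savings would come from.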
It depends entirely on your read and write characteristics, i.e., the queries you're making, and how often you're writing to it.
The way to optimize for writing is to minimize the number of indexes. Ideally, you use what MS SQL Server would call a "clustered index" with a monotonically incrementing key, ensuring that new records are written to the end of the table, and you maintain no other separate indexes. Better yet, if you don't need any transactional capability, skip the DBMS entirely and write to a plain old log file of some sort.
For queries, well, that can get as complex as you like. Do keep in mind, though, that if you need any significant amount of data from the table for a query (i.e., it's more than just looking up a single record based on a key), table scans may not be such a bad thing. Generally, if you're examining more than 3-5% of the contents of a table, a table scan will be very fast. Again, for this, a plain old file will probably be faster than a DBMS.
If you have to optimize for both, consider optimizing for writing, and then making a copy on a regular basis that you optimize for queries, and doing the queries against the copy.
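A crude way to refresh such a read-optimized copy, assuming the source table is called `changes` and the copy `changes_report` (both names are illustrative — and note the copy goes stale between refreshes):

```sql
-- Rebuild a reporting copy from scratch; run on a schedule, not per query.
DROP TABLE IF EXISTS changes_report;
CREATE TABLE changes_report LIKE changes;
INSERT INTO changes_report SELECT * FROM changes;
-- Add whatever extra indexes the reporting queries actually need:
ALTER TABLE changes_report ADD INDEX idx_path (path);
```

Because the write-side table keeps only its clustered primary key, inserts stay cheap, while the copy can carry as many query-serving indexes as you like.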