How to partition a MyISAM table by day in MySQL

Posted on 2024-10-11 14:19:15

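I'd like to keep the last 45 days of log data in a MySQL table for statistical reporting. Each day could be 20-30 million rows. I'm planning to create a flat file and use LOAD DATA INFILE to load each day's data. Ideally, each day would get its own partition without my having to write a script that creates a partition every day.

Is there a way in MySQL to have each day get its own partition automatically?

Thanks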

找回味觉 2024-10-18 14:19:16

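I would strongly suggest using Redis or Cassandra rather than MySQL to store high-traffic data such as logs. You could then stream the data in all day long instead of doing a daily import.

You can read more about those two (and others) in comparisons of "NoSQL" databases.

If you insist on MySQL, I think the easiest approach is to create a new table per day, for example logs_2011_01_13, and load the whole day's file into it. That makes dropping old dates very easy, and you can also easily move individual tables to different servers.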
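
As a rough sketch of the table-per-day idea, the helper below builds the two statements a daily job would need: one to create the day's table and one to drop the table that has just fallen out of the 45-day window. The `logs_template` table (an empty table holding the desired schema) and the function names are assumptions for illustration; only the `logs_2011_01_13` naming pattern comes from the answer.

```python
from datetime import date, timedelta

RETENTION_DAYS = 45  # window taken from the question


def table_name(day: date) -> str:
    """Return the per-day table name, e.g. logs_2011_01_13."""
    return day.strftime("logs_%Y_%m_%d")


def daily_statements(today: date, template: str = "logs_template") -> list[str]:
    """Build SQL that creates today's table and drops the one that just expired.

    `template` is a hypothetical empty table that carries the schema.
    """
    expired = today - timedelta(days=RETENTION_DAYS)
    return [
        f"CREATE TABLE IF NOT EXISTS {table_name(today)} LIKE {template};",
        f"DROP TABLE IF EXISTS {table_name(expired)};",
    ]


for stmt in daily_statements(date(2011, 1, 13)):
    print(stmt)
```

The printed statements could be piped to the mysql client ahead of the day's LOAD DATA INFILE run.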

放低过去 2024-10-18 14:19:16

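Uh... number them mod 45 with a composite key and cycle through them...

Seriously, one table per day is a valid suggestion, and since the data is static I would create packed MyISAM tables, depending on my host's ability to sort.

Building queries that union some or all of them would be only moderately challenging.

So: one table per day, and partition those tables to improve load performance.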
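
The "mod 45" remark is terse, so the following is only one possible reading of it: each calendar day maps to a slot number from 0 to 44, and a slot is emptied and reused once the day that previously occupied it is 45 days old. The `logs_slot_NN` table naming and both function names are hypothetical.

```python
from datetime import date, timedelta

SLOTS = 45  # retention window from the question


def slot_for(day: date) -> int:
    """Map a calendar day onto one of SLOTS reusable slot numbers."""
    return day.toordinal() % SLOTS


def recycle_statement(day: date) -> str:
    """SQL that empties the slot `day` will reuse (hypothetical table naming)."""
    return f"TRUNCATE TABLE logs_slot_{slot_for(day):02d};"


start = date(2011, 1, 13)
# A day and the day 45 days later land in the same slot.
print(slot_for(start), slot_for(start + timedelta(days=SLOTS)))
print(recycle_statement(start))
```

Any 45 consecutive days occupy 45 distinct slots, so a slot is never overwritten while its data is still inside the retention window.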

最美的太阳 2024-10-18 14:19:16

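Yes, you can partition MySQL tables by date:

-- The partitioning column d must be part of every unique key, hence PRIMARY KEY (id, d).
CREATE TABLE ExampleTable (
  id INT AUTO_INCREMENT,
  d DATE,
  PRIMARY KEY (id, d)
) PARTITION BY RANGE COLUMNS(d) (
  PARTITION p1 VALUES LESS THAN ('2014-01-01'),
  PARTITION p2 VALUES LESS THAN ('2014-01-02'),
  PARTITION pN VALUES LESS THAN (MAXVALUE)  -- catch-all for later dates
);

Later, when you get close to overflowing into partition pN, you can split it:

-- Carve the next day's range out of the catch-all partition.
ALTER TABLE ExampleTable REORGANIZE PARTITION pN INTO (
  PARTITION p3 VALUES LESS THAN ('2014-01-03'),
  PARTITION pN VALUES LESS THAN (MAXVALUE)
);

This doesn't partition by date automatically, but you can reorganize whenever you need to. It is best to split pN while it is still empty or nearly so, because the reorganization then has little or no data to move and stays quick.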
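
Since the split has to be issued by something, here is a minimal sketch of a helper that a daily cron job could use to generate the statements: one that splits pN to add the next day's partition, and one that drops the partition that has left the 45-day window. The `pYYYYMMDD` partition naming (instead of p1, p2, p3) and the function names are assumptions, and the DROP PARTITION statement only makes sense if the older partitions were created under that same naming scheme.

```python
from datetime import date, timedelta

RETENTION_DAYS = 45  # window taken from the question


def partition_name(day: date) -> str:
    """Hypothetical naming scheme: the partition holding `day` is pYYYYMMDD."""
    return day.strftime("p%Y%m%d")


def rotate_statements(table: str, new_day: date) -> list[str]:
    """Build SQL that adds a partition for `new_day` and drops the expired one."""
    upper = new_day + timedelta(days=1)  # rows with d < upper fall into new_day's partition
    expired = new_day - timedelta(days=RETENTION_DAYS)
    return [
        f"ALTER TABLE {table} REORGANIZE PARTITION pN INTO ("
        f"PARTITION {partition_name(new_day)} VALUES LESS THAN ('{upper.isoformat()}'), "
        f"PARTITION pN VALUES LESS THAN (MAXVALUE));",
        f"ALTER TABLE {table} DROP PARTITION {partition_name(expired)};",
    ]


for stmt in rotate_statements("ExampleTable", date(2014, 1, 3)):
    print(stmt)
```

Running this a day ahead of the load keeps pN empty, which is the cheap case described above.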

未蓝澄海的烟 2024-10-18 14:19:16

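I stumbled on this question while looking for something else and wanted to point out the MERGE storage engine (http://dev.mysql.com/doc/refman/5.7/en/merge-storage-engine.html).

A MERGE table is more or less a simple pointer to multiple MyISAM tables, and it can be rebuilt in seconds. For rotating logs, it can be very powerful. Here's what I'd do:

Create one table per day and fill it with LOAD DATA, as the OP mentioned. Once that is done, drop the MERGE table and recreate it so that it includes the new table and omits the oldest one. After that, the oldest table can be deleted or archived. This lets me quickly query either a specific day or all days, because both the original tables and the MERGE table are valid query targets.

-- CREATE TABLE ... LIKE takes no table options; it copies the MyISAM engine from the source table.
CREATE TABLE logs_day_46 LIKE logs_day_45;
DROP TABLE IF EXISTS logs;
-- Recreate the MERGE table with the same schema, then point it at the current set of tables.
CREATE TABLE logs LIKE logs_day_46;
ALTER TABLE logs ENGINE=MERGE UNION=(logs_day_2,[...],logs_day_46);
DROP TABLE logs_day_1;

Note that a MERGE table is not the same as a partitioned table; each has its advantages and drawbacks. Keep in mind that aggregating across all the tables will be slower than if all the data lived in a single table (the same is true for partitions, since they are basically separate tables under the hood). If you mostly query specific days, with per-day tables you have to pick the table yourself, whereas if the table is partitioned on the day value, MySQL automatically picks the right partition(s), which may be faster and is easier to write.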
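
The rotation can also be generated rather than typed by hand. The sketch below assumes the numbered `logs_day_N` tables and the `logs` MERGE table from the statements above; the function name and the `keep` parameter are hypothetical. Note one deliberate swap: instead of dropping and recreating the MERGE table, it remaps the existing one with `ALTER TABLE ... UNION=(...)`, which the MySQL manual describes as a way to change the list of underlying tables.

```python
def merge_rotation_statements(newest: int, keep: int = 45) -> list[str]:
    """Build SQL that adds logs_day_<newest> and retires the oldest table.

    Assumes tables are numbered consecutively and that `logs` already exists
    as a MERGE table over the previous `keep` tables.
    """
    oldest_kept = newest - keep + 1
    members = ",".join(f"logs_day_{n}" for n in range(oldest_kept, newest + 1))
    return [
        # New daily table with the same schema and engine as yesterday's.
        f"CREATE TABLE logs_day_{newest} LIKE logs_day_{newest - 1};",
        # Remap the MERGE table before dropping anything it still points to.
        f"ALTER TABLE logs UNION=({members});",
        f"DROP TABLE logs_day_{oldest_kept - 1};",
    ]


for stmt in merge_rotation_statements(46):
    print(stmt)
```

The remap is ordered before the DROP so that the MERGE table never references a table that no longer exists.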
