MySQL 可以从事件调度程序创建新分区吗

发布于 2024-08-12 08:45:29 字数 679 浏览 1 评论 0原文

我有一个看起来像这样的表:

CREATE TABLE `Calls` (
  `calendar_id` int(11) NOT NULL,
  `db_date` timestamp NOT NULL,
  `cgn` varchar(32) DEFAULT NULL,
  `cpn` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`calendar_id`),
  KEY `db_date_idx` (`db_date`)
) 
 PARTITION BY RANGE (calendar_id)(
   PARTITION p20091024 VALUES LESS THAN (20091024) ,
   PARTITION p20091025 VALUES LESS THAN (20091025));

我可以以某种方式使用 mysql 调度程序自动添加一个新分区(提前 2 天) - 我正在寻找一个示例,每天都会添加一个新分区 - 它' d 运行类似于

alter table Calls add partition (partition p20091026 values less than(20091026));

计划任务运行时构造 p20091026/20091026 的内容,从现在 + 2 天得出值。 (或者我最好通过 cron 编写脚本吗?)

I'm having a table looking something like this:

CREATE TABLE `Calls` (
  `calendar_id` int(11) NOT NULL,
  `db_date` timestamp NOT NULL,
  `cgn` varchar(32) DEFAULT NULL,
  `cpn` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`calendar_id`),
  KEY `db_date_idx` (`db_date`)
) 
 PARTITION BY RANGE (calendar_id)(
   PARTITION p20091024 VALUES LESS THAN (20091024) ,
   PARTITION p20091025 VALUES LESS THAN (20091025));

Can I somehow use the mysql scheduler to automatically add a new partition(2 days in advance) - I'm looking for an example that would, every day add a new partition - it'd run something like

alter table Calls add partition (partition p20091026 values less than(20091026));

Where p20091026/20091026 is constructed when the scheduled task run, deriving the value from now + 2 day. (Or am I better of scripting this through cron ?)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

淡淡绿茶香 2024-08-19 08:45:29

是的,你可以这样做。

请注意,默认情况下调度程序未处于活动状态(请参阅事件调度程序配置),所以这不是一个零风险的选择。例如,如果您的运营团队将您的应用程序迁移到新服务器,但忘记启用调度程序,您的应用程序将被淹没。还需要特殊权限,这可能再次需要在新服务器上设置。

我的建议:首先,创建一个存储过程(请参阅下面的代码示例)来处理定期分区维护:如果表变得太大,则删除旧分区,并添加足够的新分区(例如 1 周),以便即使维护过程不存在只要不运行一段时间,你的应用程序就不会死掉。

然后冗余地安排对该存储过程的调用。使用 MySQL 调度程序、使用 cron 作业以及使用您喜欢的任何其他方式。这样,如果一个调度程序不起作用,另一个调度程序就可以弥补这一不足。如果您正确设计了存储过程,那么在不需要执行任何操作的情况下执行无操作应该很便宜。您甚至可能想从应用程序中调用它,例如作为生成长期运行的报告时的第一个语句,或者作为日常 ETL 流程的一部分(如果有)。我的观点是,计划任务的致命弱点是确保调度程序实际上正在工作——所以请考虑这里的冗余。

请确保不要同时安排所有通话,以免它们互相干扰! :-)

这是维护过程的代码示例 - 首先它修剪旧分区,然后添加新分区。我将错误检查和防止多个同时执行作为读者的练习。

DELIMITER $

DROP PROCEDURE IF EXISTS `test`.`UpdatePartitions` $
CREATE PROCEDURE `test`.`UpdatePartitions` ()
BEGIN

  DECLARE maxpart_date date;
  DECLARE partition_count int;
  DECLARE minpart date;
  DECLARE droppart_sql date;
  DECLARE newpart_date date;
  DECLARE newpart_sql varchar(500);

  SELECT COUNT(*)
    INTO partition_count
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- first, deal with pruning old partitions
  -- TODO: set your desired # of partitions below, or make it parameterizable
  WHILE (partition_count > 1000)
  DO

    -- optionally, do something here to deal with the parition you're dropping, e.g.
    -- copy the data into an archive table

     SELECT MIN(PARTITION_DESCRIPTION)
       INTO minpart
       FROM INFORMATION_SCHEMA.PARTITIONS
       WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

     SET @sql := CONCAT('ALTER TABLE Calls DROP PARTITION p'
                        , CAST((minpart+0) as char(8))
                        , ';');

     PREPARE stmt FROM @sql;
     EXECUTE stmt;
     DEALLOCATE PREPARE stmt;

    SELECT COUNT(*)
      INTO partition_count
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';


  END WHILE;

  SELECT MAX(PARTITION_DESCRIPTION)
    INTO maxpart_date
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- create enough partitions for at least the next week
  WHILE (maxpart_date < CURDATE() + INTERVAL 7 DAY)
  DO

    SET newpart_date := maxpart_date + INTERVAL 1 DAY;
    SET @sql := CONCAT('ALTER TABLE Calls ADD PARTITION (PARTITION p'
                        , CAST((newpart_date+0) as char(8))
                        , ' values less than('
                        , CAST((newpart_date+0) as char(8))
                        , '));');

    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SELECT MAX(PARTITION_DESCRIPTION)
      INTO maxpart_date
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  END WHILE;

END $

DELIMITER ;

顺便说一句,分区维护(确保提前创建新分区、修剪旧分区等)对于自动化来说至关重要。我亲眼目睹过大型企业数据仓库瘫痪了一天,因为最初创建了一年的分区,但没有人记得在下一年到来时创建更多分区。因此,您在这里考虑自动化非常好——这对您正在从事的项目来说是个好兆头。 :-)

Yes, you can do this.

Note that the scheduler isn't active by default (see Event Scheduler Configuration), so it's not a zero-risk option. For example, if your operations team migrates your app to a new server, but forgets to enable the scheduler, your app will get hosed. There's also special privileges needed, which again may need to be set up on a new server.

My advice: first, create a stored procedure (see code sample below) which handles periodic partition maintenance: dropping old partitions if the table gets too big, and adding enough new partitions (e.g. 1 week) so that even if the maintenance proc isn't run for a while, your app won't die.

Then redundantly schedule calls to that stored proc. Use the MySQL scheduler, use a cron job, and use any other way you like. Then if one scheduler isn't working, the other can pick up the slack. If you design the sproc correctly, it should be cheap to execute a no-op if it doesn't need to do anything. You might even want to call it from your app, e.g. as the first statement when generating a long-running report, or as part of your daily ETL process (if you have one). My point is that the achilles heel of scheduled tasks is ensuring that the scheduler is actually working-- so think about redundancy here.

Just make sure not to schedule all the calls at the same time so they won't step on each other! :-)

Here's a code sample for what your maintenance proc could look like-- first it prunes old partitions, then adds new ones. I left error checking and preventing multiple simultaneous executions as an exerise for the reader.

DELIMITER $

DROP PROCEDURE IF EXISTS `test`.`UpdatePartitions` $
CREATE PROCEDURE `test`.`UpdatePartitions` ()
BEGIN

  DECLARE maxpart_date date;
  DECLARE partition_count int;
  DECLARE minpart date;
  DECLARE droppart_sql date;
  DECLARE newpart_date date;
  DECLARE newpart_sql varchar(500);

  SELECT COUNT(*)
    INTO partition_count
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- first, deal with pruning old partitions
  -- TODO: set your desired # of partitions below, or make it parameterizable
  WHILE (partition_count > 1000)
  DO

    -- optionally, do something here to deal with the parition you're dropping, e.g.
    -- copy the data into an archive table

     SELECT MIN(PARTITION_DESCRIPTION)
       INTO minpart
       FROM INFORMATION_SCHEMA.PARTITIONS
       WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

     SET @sql := CONCAT('ALTER TABLE Calls DROP PARTITION p'
                        , CAST((minpart+0) as char(8))
                        , ';');

     PREPARE stmt FROM @sql;
     EXECUTE stmt;
     DEALLOCATE PREPARE stmt;

    SELECT COUNT(*)
      INTO partition_count
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';


  END WHILE;

  SELECT MAX(PARTITION_DESCRIPTION)
    INTO maxpart_date
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  -- create enough partitions for at least the next week
  WHILE (maxpart_date < CURDATE() + INTERVAL 7 DAY)
  DO

    SET newpart_date := maxpart_date + INTERVAL 1 DAY;
    SET @sql := CONCAT('ALTER TABLE Calls ADD PARTITION (PARTITION p'
                        , CAST((newpart_date+0) as char(8))
                        , ' values less than('
                        , CAST((newpart_date+0) as char(8))
                        , '));');

    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SELECT MAX(PARTITION_DESCRIPTION)
      INTO maxpart_date
      FROM INFORMATION_SCHEMA.PARTITIONS
      WHERE TABLE_NAME='Calls' AND TABLE_SCHEMA='test';

  END WHILE;

END $

DELIMITER ;

BTW, partition maintenance (ensuring new partitions are created in advance, pruning old partitions, etc.) is, IMHO, critically important to automate. I've personally seen a large enterprise data warehouse go down for a day because a year's worth of partitions was cretaed initially but no one remembered to create more partitions once the next year came around. So it's very good you're thinking about automation here-- it bodes well for the project you're working on. :-)

反差帅 2024-08-19 08:45:29

贾斯汀那里提供了出色的解决方案。我把他的代码作为我当前项目的起点,并想提一下我在实现它时遇到的一些事情。

  1. 您运行此操作的表中的现有分区结构不应包含 MAXVALUE 类型分区 - 所有分区都必须由文字日期分隔。这是因为 SELECT MAX(PARTITION_DESCRIPTION) 将返回“MAXVALUE”,而该值在下一步中无法转换为日期。如果在调用该过程时收到奇怪的消息,例如:“<”的排序规则非法混合,这可能就是问题所在。

  2. 从 INFORMATION_SCHEMA 表中选择分区名称时,最好添加:“AND TABLE_SCHEMA = 'dbname'”,因为虽然同一个表(在不同的数据库中)可以存在多个具有相同名称的分区,但它们全部一起列在 INFORMATION_SCHEMA 表中。如果没有 TABLE_SCHEMA 规范,您的选择例如。 MAX(PARTITION_DESCRIPTION) 将为您提供每个数据库中同名表的每个现有分区中的最大分区名称。

  3. 在此过程中,我遇到了 ALTER TABLE xxx ADD PARTITION 问题,因为它是在 Justin 的解决方案中,我认为分区名称的相同格式 (yyyymmdd) 被用作分区分隔符,而预期是 yyyy- mm-dd (v5.6.2)。

  4. 默认行为是仅在将来根据需要添加分区。如果要创建过去的分区,则需要首先设置比您想要的最旧分区更早的日期的分区。例如。如果您要保留过去 30 天的数据,请首先添加 35 天前的分区,然后运行该过程。当然,可能只在空表上执行此操作才可行,但我认为值得一提。

  5. 为了创建所需的过去/未来分区范围(如 4 中所示),您最初需要运行该过程两次。对于上面 4. 中的示例,第一次运行将创建当前 -35 天的分区,以及必要的未来分区。然后,第二次运行将修剪 -35 和 -30 之间的分区。

这是我目前正在使用的。我添加了一些参数,以使其从调用者的角度来看更加灵活。您可以指定数据库、表、当前日期以及为过去和将来保留多少个分区。

我还更改了分区的命名,以便名为 p20110527 的分区代表从 2011-5-27 00:00 开始的日期,而不是当时的结束日期。

仍然没有错误检查或防止同时执行:-)

DELIMITER $

DROP PROCEDURE IF EXISTS UpdatePartitions $

-- Procedure to delete old partitions and create new ones based on a given date.
-- partitions older than (today_date - days_past) will be dropped
-- enough new partitions will be made to cover until (today_date + days_future)
CREATE PROCEDURE UpdatePartitions (dbname TEXT, tblname TEXT, today_date DATE, days_past INT, days_future INT)
BEGIN

DECLARE maxpart_date date;
DECLARE partition_count int;
DECLARE minpart date;
DECLARE droppart_sql date;
DECLARE newpart_date date;
DECLARE newpart_sql varchar(500); 

SELECT COUNT(*)
INTO partition_count
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

-- first, deal with pruning old partitions
WHILE (partition_count > days_past + days_future)
DO
-- optionally, do something here to deal with the parition you're dropping, e.g.
-- copy the data into an archive table

 SELECT STR_TO_DATE(MIN(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
   INTO minpart
   FROM INFORMATION_SCHEMA.PARTITIONS
   WHERE TABLE_NAME=tblname
   AND TABLE_SCHEMA=dbname;

-- SELECT minpart;

 SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' DROP PARTITION p'
                    , CAST(((minpart - INTERVAL 1 DAY)+0) as char(8))
                    , ';');

 -- SELECT @sql;
 PREPARE stmt FROM @sql;
 EXECUTE stmt;
 DEALLOCATE PREPARE stmt;

SELECT COUNT(*)
  INTO partition_count
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

END WHILE;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
INTO maxpart_date
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- select maxpart_date;
-- create enough partitions for at least the next days_future days
WHILE (maxpart_date < today_date + INTERVAL days_future DAY)
DO

-- select 'here1';
SET newpart_date := maxpart_date + INTERVAL 1 DAY;
SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' ADD PARTITION (PARTITION p'
                    , CAST(((newpart_date - INTERVAL 1 DAY)+0) as char(8))
                    , ' VALUES LESS THAN ('''
                    , newpart_date
                    , '''));');

-- SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
  INTO maxpart_date
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

SET maxpart_date := newpart_date;

END WHILE;

END $

DELIMITER ;

Excellent solution from Justin there. I took his code as the starting point for my current project and would like to mention a few things that came up while I was implementing it.

  1. The existing partition structure in the table you run this on should not include a MAXVALUE type partition - all partitions must be delimited by literal dates. This is because SELECT MAX(PARTITION_DESCRIPTION) will return 'MAXVALUE' which fails to be converted to a date in the next step. If you get odd message when calling the procedure saying something like: illegal mix of collations for '<', this could be the problem.

  2. It's a good idea to add: "AND TABLE_SCHEMA = 'dbname'" when selecting partition names from the INFORMATION_SCHEMA table, because while more than one partition can exist with the same name for the same table (in different databases), they are all listed in the INFORMATION_SCHEMA table together. Without the TABLE_SCHEMA specification your select eg. MAX(PARTITION_DESCRIPTION) will give you the max partition name among every existing partition for tables of that name in every database.

  3. Somewhere along the way I had problems with the ALTER TABLE xxx ADD PARTITION as it is in Justin's solution, I think it was that the same format for the partition name (yyyymmdd) was being used as the partition delimiter which expected yyyy-mm-dd (v5.6.2).

  4. The default behaviour is to only add partitions in the future as necessary. If you want to create partitions for the past, you will need to first set up a partition for a date older than the oldest partition you want. Eg. if you are keeping data for the past 30 days, first add a partition for say, 35 days ago and then run the procedure. Granted, it may only be feasible to do this on an empty table, but I thought it worth mentioning.

  5. In order to create the desired span of past/future partitions as in 4. you will initially need to run the procedure twice. For the example in 4. above, the first run will create partitions for -35 days to present, and the necessary future partitions. The second run will then trim the partitions between -35 and -30 away.

Here is what I am using at the moment. I added some parameters to make it a bit more flexible from the caller's point of view. You can specify the database, table, current date, and how many partitions to keep for both past and future.

I also altered the naming of partitions so that the partition named p20110527 represents the day starting from 2011-5-27 00:00 instead of the day ending at that time.

There is still no error checking or prevention of simultaneous execution :-)

DELIMITER $

DROP PROCEDURE IF EXISTS UpdatePartitions $

-- Procedure to delete old partitions and create new ones based on a given date.
-- partitions older than (today_date - days_past) will be dropped
-- enough new partitions will be made to cover until (today_date + days_future)
CREATE PROCEDURE UpdatePartitions (dbname TEXT, tblname TEXT, today_date DATE, days_past INT, days_future INT)
BEGIN

DECLARE maxpart_date date;
DECLARE partition_count int;
DECLARE minpart date;
DECLARE droppart_sql date;
DECLARE newpart_date date;
DECLARE newpart_sql varchar(500); 

SELECT COUNT(*)
INTO partition_count
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

-- first, deal with pruning old partitions
WHILE (partition_count > days_past + days_future)
DO
-- optionally, do something here to deal with the parition you're dropping, e.g.
-- copy the data into an archive table

 SELECT STR_TO_DATE(MIN(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
   INTO minpart
   FROM INFORMATION_SCHEMA.PARTITIONS
   WHERE TABLE_NAME=tblname
   AND TABLE_SCHEMA=dbname;

-- SELECT minpart;

 SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' DROP PARTITION p'
                    , CAST(((minpart - INTERVAL 1 DAY)+0) as char(8))
                    , ';');

 -- SELECT @sql;
 PREPARE stmt FROM @sql;
 EXECUTE stmt;
 DEALLOCATE PREPARE stmt;

SELECT COUNT(*)
  INTO partition_count
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

-- SELECT partition_count;

END WHILE;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
INTO maxpart_date
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_NAME=tblname
AND TABLE_SCHEMA=dbname;

-- select maxpart_date;
-- create enough partitions for at least the next days_future days
WHILE (maxpart_date < today_date + INTERVAL days_future DAY)
DO

-- select 'here1';
SET newpart_date := maxpart_date + INTERVAL 1 DAY;
SET @sql := CONCAT('ALTER TABLE '
                    , tblname
                    , ' ADD PARTITION (PARTITION p'
                    , CAST(((newpart_date - INTERVAL 1 DAY)+0) as char(8))
                    , ' VALUES LESS THAN ('''
                    , newpart_date
                    , '''));');

-- SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

SELECT STR_TO_DATE(MAX(PARTITION_DESCRIPTION), '''%Y-%m-%d''')
  INTO maxpart_date
  FROM INFORMATION_SCHEMA.PARTITIONS
  WHERE TABLE_NAME=tblname
  AND TABLE_SCHEMA=dbname;

SET maxpart_date := newpart_date;

END WHILE;

END $

DELIMITER ;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文