MySQL partitioning and temporary tables

Posted on 2025-01-31 09:36:52

A large table (~10.5M rows) has been causing issues lately. I previously modified my application to use temporary tables for faster selects, but was still having issues due to UPDATE statements. Today I implemented partitioning so that writes happen more quickly, but now creating my temporary tables fails with an error. The purpose of the UPDATE is to group events by placing the first event ID of a set in the EVENT_ID column. Example: writing 4 events beginning at 1000 would result in events 1000, 1001, 1002, 1003, all with an EVENT_ID of 1000. I have tried to do away with the UPDATE statements, but that would require too much refactoring, so it is not an option. Here is the table definition:

CREATE TABLE `all_events` (
  `ID` bigint NOT NULL AUTO_INCREMENT,
  `EVENT_ID` bigint unsigned DEFAULT NULL,
  `LAST_UPDATE` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `EMPLOYEE_ID` int unsigned NOT NULL,
  `QUANTITY` float unsigned NOT NULL,
  `OPERATORS` float unsigned NOT NULL DEFAULT '0',
  `SECSEARNED` decimal(10,2) unsigned NOT NULL DEFAULT '0.00' COMMENT 'for all parts in QUANTITY',
  `SECSBURNED` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
  `YR` smallint unsigned NOT NULL DEFAULT (year(curdate())),
  PRIMARY KEY (`ID`,`YR`),
  KEY `LAST_UPDATE` (`LAST_UPDATE`),
  KEY `EMPLOYEE_ID` (`EMPLOYEE_ID`),
  KEY `EVENT_ID` (`EVENT_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=17464583 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
/*!50100 PARTITION BY RANGE (`YR`)
(PARTITION p2015 VALUES LESS THAN (2016) ENGINE = InnoDB,
 PARTITION p2016 VALUES LESS THAN (2017) ENGINE = InnoDB,
 PARTITION p2017 VALUES LESS THAN (2018) ENGINE = InnoDB,
 PARTITION p2018 VALUES LESS THAN (2019) ENGINE = InnoDB,
 PARTITION p2019 VALUES LESS THAN (2020) ENGINE = InnoDB,
 PARTITION p2020 VALUES LESS THAN (2021) ENGINE = InnoDB,
 PARTITION p2021 VALUES LESS THAN (2022) ENGINE = InnoDB,
 PARTITION p2022 VALUES LESS THAN (2023) ENGINE = InnoDB,
 PARTITION p2023 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */

Now in my application when running a report the statement:

CREATE TEMPORARY TABLE IF NOT EXISTS ape ENGINE=MEMORY AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP()

Produces the error: 'Specified storage engine' is not supported for default value expressions.

Is there a way to still use temporary tables with ENGINE=MEMORY, or is there another high performance engine I can use? The statement worked until the partitioning was implemented. InnoDB is the only engine my tables can be in due to the MySQL implementation, and it has been InnoDB since before partitioning.

Edit: When removing ENGINE=MEMORY it does work, but running SHOW CREATE TABLE tells me that it's using InnoDB. I would prefer the performance increase of MEMORY vs InnoDB.

Second Edit:
The MySQL server has been crashing 2 to 3 times daily, and every time I catch it I find this error:

TRANSACTION 795211228, ACTIVE 0 sec fetching rows
mysql tables in use 13, locked 13
LOCK WAIT 866 lock struct(s), heap size 106704, 4800 row lock(s), undo log entries 1
MySQL thread id 5032986, OS thread handle 140442167994112, query id 141216988 myserver 192.168.1.100 my-user Searching rows for update
UPDATE `all_events` SET `EVENT_ID`=LAST_INSERT_ID() WHERE `EVENT_ID` IS NULL
RECORD LOCKS space id 30558 page no 16 n bits 792 index EVENT_ID of table `mydb`.`all_events` trx id 795211228 lock_mode X

It's running Galera Cluster with 3 nodes. Node 3 is the main node; when it becomes unavailable, node 1 goes offline to resync node 3. I fail over to node 2 and we're usually fine until node 3 catches up, but it's causing downtime. The temp tables are for faster reads; the partitioning is my attempt at improving write performance.

Third edit:
Added an example SELECT. Note that it uses fields that are not in the table definition above: I reduced what was displayed for simplicity of the post, but all fields in the SELECT do in fact exist.

CREATE TEMPORARY TABLE IF NOT EXISTS allpe AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN ? AND ?;

CREATE TEMPORARY TABLE IF NOT EXISTS ap1 AS SELECT * FROM allpe;
CREATE TEMPORARY TABLE IF NOT EXISTS ap2 AS SELECT * FROM allpe;

SELECT PART_NUMBER, WORKCENTER_NAME, SUM(SECSEARNED) AS EARNED, SUM(SECSBURNED) AS BURNED, SUM(QUANTITY) AS QUANTITY, (
        SELECT SUM(ap1.SECSEARNED)
        FROM ap1
        WHERE ap1.PART_NUMBER = ape.PART_NUMBER AND ap1.WORKCENTER_ID = ape.WORKCENTER_ID
    ) AS EARNEDALL, (
        SELECT SUM(ap2.SECSBURNED)
        FROM ap2
        WHERE ap2.PART_NUMBER = ape.PART_NUMBER AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID
    ) AS BURNEDALL
FROM allpe ape
WHERE EMPLOYEE_ID = ?
GROUP BY PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID
ORDER BY EARNED;

DROP TEMPORARY TABLE allpe;
DROP TEMPORARY TABLE ap1;
DROP TEMPORARY TABLE ap2;

Fourth edit:
Writing inside of stored procedure - this is not in a loop, but multiple rows can come from multiple joins to employee_presence, so I cannot get the ID and store it for writing subsequent rows.

INSERT INTO `all_events`(`EVENT_ID`,`LAST_UPDATE`,`PART_NUMBER`, `WORKCENTER_ID`,`XPPS_WC`, `EMPLOYEE_ID`,`WORKCENTER_NAME`, `QUANTITY`, `LEVEL_PART_NUMBER`,`OPERATORS`,`SECSEARNED`,`SECSBURNED`)
SELECT NULL,NOW(),NEW.PART_NUMBER,NEW.ID,OLD.XPPS_WC,ep.EMPLOYEE_ID,NEW.NAME,(NEW.PARTS_MADE-OLD.PARTS_MADE)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID),IFNULL(NEW.LEVEL_PART_NUMBER,NEW.PART_NUMBER),WorkerCount(NEW.ID)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID),WorkerContrib(ep.EMPLOYEE_ID,OLD.ID)*CreditSeconds,WorkerCount(NEW.ID)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID)*IFNULL(TIMESTAMPDIFF(SECOND, GREATEST(NEW.LAST_PART_TIME,NEW.JOB_START_TIME), now()),0)
FROM employee_presence ep WHERE ep.WORKCENTER_ID=OLD.ID;
UPDATE `all_events` SET `EVENT_ID`=LAST_INSERT_ID() WHERE `WORKCENTER_ID`=NEW.ID AND `EVENT_ID` IS NULL;

Comments (4)

没有伤那来痛 2025-02-07 09:36:52

I would suggest reading the following from dev.mysql.com:

You cannot use CREATE TEMPORARY TABLE ... LIKE to create an empty
table based on the definition of a table that resides in the mysql
tablespace, InnoDB system tablespace (innodb_system), or a general
tablespace. The tablespace definition for such a table includes a
TABLESPACE attribute that defines the tablespace where the table
resides, and the aforementioned tablespaces do not support temporary
tables. To create a temporary table based on the definition of such a
table, use this syntax instead:

CREATE TEMPORARY TABLE new_tbl SELECT * FROM orig_tbl LIMIT 0;

So it seems the correct syntax for your case will be:

CREATE TEMPORARY TABLE ape
SELECT * FROM all_events
WHERE... 

风向决定发型 2025-02-07 09:36:52

In the current issue the problematic column is YR smallint unsigned NOT NULL DEFAULT (year(curdate())). This DEFAULT value is not legal for a column that is used in a partitioning expression. The error will be "Constant, random or timezone-dependent expressions in (sub)partitioning function are not allowed ...".

Only once you fix this by removing the partitioning will you receive the error "'Specified storage engine' is not supported for default value expressions".

CREATE TABLE ... SELECT inherits the main column properties from the source tables.

Here the problematic column is again YR smallint unsigned NOT NULL DEFAULT (year(curdate())). The column in the temporary table must inherit the main properties, including the DEFAULT expression, but this expression is not allowed for the MEMORY engine.

长亭外,古道边 2025-02-07 09:36:52

As the error suggests, the expression default does not work with the MEMORY storage engine.

One solution would be to remove that default from your all_events.yr column.

The other solution is to create an empty temporary table initially as an InnoDB table, then use ALTER TABLE to remove the expression default and convert to MEMORY engine before filling it with data.

Example:

mysql> create temporary table t as select * from all_events where false;

mysql> alter table t alter column yr drop default, engine=memory;

mysql> insert into t select * from all_events;
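
Applied to the report statement from the question, the same three steps might look like the sketch below. It assumes the temporary table keeps the name ape and the date range from the question, and that the selected rows fit within the server's max_heap_table_size limit for MEMORY tables; none of this has been run against the asker's schema.

```sql
-- 1. Create an empty InnoDB temporary table with the columns of all_events.
CREATE TEMPORARY TABLE ape AS
SELECT * FROM all_events WHERE FALSE;

-- 2. Drop the expression default on YR, then switch the table to MEMORY.
ALTER TABLE ape ALTER COLUMN YR DROP DEFAULT, ENGINE=MEMORY;

-- 3. Load only the reporting window instead of the whole table.
INSERT INTO ape
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP();
```

IF NOT EXISTS is left out on purpose here: if the table already existed, steps 2 and 3 would run again against it and load the rows a second time.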

染年凉城似染瑾 2025-02-07 09:36:52

Sufficient? If I am not mistaken, this is equivalent to what your SELECT finds (no temp tables needed):

 SELECT  PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID,
         SUM(SECSEARNED) AS TOT_EARNED,
         SUM(SECSBURNED) AS TOT_BURNED,
         SUM(QUANTITY) AS TOT_QUANTITY
    FROM  all_events
    WHERE  EMPLOYEE_ID = ?
      AND  LAST_UPDATE >= '2022-05-01'
    GROUP BY  PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME;

For performance, it would need this.

INDEX(EMPLOYEE_ID, LAST_UPDATE)
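
As a sketch, that index could be added with a statement like the following. The index name idx_employee_last_update is a placeholder chosen for illustration, not something from the question, and on a table of this size the ALTER may take a while to complete.

```sql
-- Composite index: equality column (EMPLOYEE_ID) first, range column (LAST_UPDATE) second.
ALTER TABLE all_events
    ADD INDEX idx_employee_last_update (EMPLOYEE_ID, LAST_UPDATE);
```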

Also, removing the partitioning might speed it up a little more.

Otherwise, some notes on other fixes for the path you have taken:

Since yr is not needed, avoid it by changing '*' to a list of the needed columns in

CREATE TEMPORARY TABLE IF NOT EXISTS ape ENGINE=MEMORY AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP()
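
With an explicit column list, that statement might look like the sketch below. The columns are the ones the report query in the question reads; it assumes none of them carries an expression default and none is a TEXT or BLOB column, which the MEMORY engine does not support.

```sql
-- YR is left out, so its expression default is never copied to the MEMORY table.
CREATE TEMPORARY TABLE IF NOT EXISTS ape ENGINE=MEMORY AS
SELECT PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID,
       QUANTITY, SECSEARNED, SECSBURNED
FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP();
```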

For the condition used in the correlated subqueries,

WHERE ap2.PART_NUMBER = ape.PART_NUMBER AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID

add this composite index to all_events:

INDEX(PART_NUMBER, WORKCENTER_ID)

That will probably suffice to make the query fast enough without the temp tables.
Also add that index to `allpe` after building it.
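
A sketch of both additions follows. The index name idx_part_workcenter is a placeholder, and the second statement assumes the temporary table allpe from the question has already been created in the current session.

```sql
-- On the base table.
ALTER TABLE all_events
    ADD INDEX idx_part_workcenter (PART_NUMBER, WORKCENTER_ID);

-- On the temporary table, after it has been built and filled.
ALTER TABLE allpe
    ADD INDEX idx_part_workcenter (PART_NUMBER, WORKCENTER_ID);
```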

If you are running MySQL 8.0, you can use WITH instead of needing the two extra temp tables.
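
A sketch of what that could look like, assuming MySQL 8.0 and the same columns and placeholders as the query in the question. A temporary table cannot be referenced more than once in the same query, which is why the copies ap1 and ap2 were needed; a common table expression can be referenced several times, so here ap1 and ap2 become plain aliases of the same CTE.

```sql
WITH allpe AS (
    SELECT * FROM all_events
    WHERE LAST_UPDATE BETWEEN ? AND ?
)
SELECT PART_NUMBER, WORKCENTER_NAME,
       SUM(SECSEARNED) AS EARNED,
       SUM(SECSBURNED) AS BURNED,
       SUM(QUANTITY) AS QUANTITY,
       (
           -- Total earned seconds for this part and workcenter across all employees.
           SELECT SUM(ap1.SECSEARNED)
           FROM allpe ap1
           WHERE ap1.PART_NUMBER = ape.PART_NUMBER AND ap1.WORKCENTER_ID = ape.WORKCENTER_ID
       ) AS EARNEDALL,
       (
           -- Total burned seconds for this part and workcenter across all employees.
           SELECT SUM(ap2.SECSBURNED)
           FROM allpe ap2
           WHERE ap2.PART_NUMBER = ape.PART_NUMBER AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID
       ) AS BURNEDALL
FROM allpe ape
WHERE EMPLOYEE_ID = ?
GROUP BY PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID
ORDER BY EARNED;
```

Whether this is faster than the temporary tables depends on how the optimizer handles the CTE, so it is worth checking with EXPLAIN before switching.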
