处理视图的 MySQL 临时表算法
如果将临时表算法重命名为不可扩展算法就好了。也许当开发人员在视图定义中看到这一点时,它会向开发人员提供更多警告 - 类似地,当它在解释结果中使用临时表时。大多数情况下这只是一个半开玩笑的请求,但实际上这对于不知情的人来说可能是灾难性的。
问题在于,如果您在视图定义中执行某些操作,它将从正常的合并算法切换到效率低下的临时表算法。如果涉及的数据很小,这没什么大不了的。但这是随着数据增长而降低性能的因素之一。
但如何最好地处理这个问题呢?自从 5 年前实施视图以来,这一直是一个问题,我不知道有任何解决方案。其他流行的数据库系统是否也存在这样的问题?
在下面的链接中向下滚动到讨论何时无法使用合并算法的地方,以查看是什么导致 MySQL 退化为使用糟糕的临时表算法: http://mysql2.mirrors-r-us .net/doc/refman/5.1/en/create-view.html
这有什么不好呢?临时表算法的工作原理如下:
- 在视图定义中“按原样”运行视图,而不将查询的 where 子句条件合并到其中。
- 将结果数据转储到临时表中。
- 根据查询的 where 子句条件过滤临时表中的数据 - 这里也没有索引。
所以你可以想象当你有一个 5000 万行的表和一个像这样的视图定义的地狱时 - 愚蠢的例子,但你明白了:
create view last_order_date as
select max(order_date) last_date, username from orders group by username;
用户可能会写:
select * from last_order_date where username = 'joejoe22'
2小时后返回结果......
那么如何最好地处理这种情况?
It would be nice if the Temp Table algorithm would be renamed to Unscalable algorithm. Perhaps then it would provide more of a warning to developers when seeing this in a view definition - similarly when it says using temp table in the explain results. Just a tongue-in-cheek request for the most part but really it can be disastrous to the unaware.
The trouble is that if you do certain things in your view definition it'll switch over from the sane merge algorithm to the hopelessly inefficient temp table algorithm. This is no big deal if the data involved is small. But it's one of those things that will kill your performance as your data grows.
How best to deal with this though? It's been a problem since views were implemented over 5 years ago and I don't know of any effort to fix it. Does this sort of problem exist in other popular database systems?
Scroll down in the link below to where it discusses when the Merge Algorithm cannot be used to see what causes MySQL to degenerate into using the awful Temp Table algorithm:
http://mysql2.mirrors-r-us.net/doc/refman/5.1/en/create-view.html
What's so bad about it? The temp table algorithm works like so:
- Run the view "AS IS" in the view definition without merging the query's where clause criteria into it.
- Dump the resulting data into a temp table.
- Filter the data in the temp table based on the query's where clause criteria - no indexes here either.
So you can imagine the hell involved here when you have a 50 million row table with a view definition like - silly example but you get the point:
create view last_order_date as
select max(order_date) last_date, username from orders group by username;
A user might then write:
select * from last_order_date where username = 'joejoe22'
2 hours later the result is returned...
So how best to deal with this situation?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
编写一个返回结果集的存储过程。
现在调用该过程,它将返回其中的 select 语句的结果集。
只需执行
CALL
,不要执行不起作用的SELECT CALL
。您也可以不在另一个 SELECT 语句中使用
CALL
。当然,您可以指示存储过程将输出放入临时表中,然后在另一个 SELECT 中使用它。
存储过程将使用正确的索引,并且您的 select 语句已准备好。
另外,您可以在前面的语句中准备一些内容,从而加快时间关键的查询。
不过你是对的:
MySQL 视图确实很糟糕
Write a stored procedure that returns a result set.
Now call the procedure, it will return the result set of the select statement within.
Just do
CALL
, don't soSELECT CALL
that does not work.You can also not use the
CALL
inside another SELECT statement.You can of course instruct the stored procedure to put the output in a temporary table and than use that in another SELECT.
The stored procedure will use correct indexes, and your select statement is already prepared.
Plus you can prepare stuff in preceding statements, speeding up your time-critical query.
You are right though:
MySQL Views sure do suck