SQL tuning - queries with multiple joins

Posted on 2024-09-18 09:42:02

I recently read somewhere that one way of tuning a SQL query with too many joins is to join a smaller set of the tables first, cache the result in a temporary table, and then run the rest of the query's joins against that table.

My question is how this improves performance, given that you are still joining the same number of tables (just not all at once).

Note: I agree this is a generic statement; I read it recently in an article. To rephrase: under which conditions does storing a result in a temp table help?

Comments (5)

风吹过旳痕迹 2024-09-25 09:42:02

One of the reasons you invest in a product like Oracle is the development work they have put into the optimizer part of their engine. It has improved constantly over 20+ years, and in general, with proper statistics for your tables and indexes, it is hard to outguess it on how to access your data.

If I interpret your question as asking how performance would improve for queries on real-time data by building temporary tables each time the query executes, I would say that in most cases it wouldn't. In the remaining cases, instead of building a temporary table, invest the time in structuring the query with Oracle's relatively new WITH clause, which materializes subsets of data dynamically in those cases where it makes sense to the optimizer.

If your question is about denormalizing data in a materialized view, data mart, or data warehouse fashion, then yes, this can dramatically improve query performance, at the expense of access to the current state of the information (since the denormalized tables are always out of date). The improvement generally comes about because the RDBMS engine has less physical access work to do for the query: you have already done that work once to build the denormalized structures.
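To make the comparison concrete, here is a minimal sketch of the two styles side by side. It uses Python's built-in `sqlite3` module rather than Oracle, so it only shows that the two formulations return the same rows; it says nothing about which one Oracle's optimizer would execute faster. The `customers` and `orders` tables and their data are invented for illustration.

```python
import sqlite3

# Hypothetical schema and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 75.0), (12, 3, 20.0), (13, 1, 5.0);
""")

# Style A: materialize the intermediate join by hand in a temporary table,
# then run the rest of the query against it.
conn.execute("""
    CREATE TEMP TABLE eu_orders AS
    SELECT o.customer_id, o.amount
    FROM orders o JOIN customers c ON c.id = o.customer_id
    WHERE c.region = 'EU'
""")
via_temp = conn.execute("""
    SELECT customer_id, SUM(amount)
    FROM eu_orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# Style B: express the same intermediate result with a WITH clause and
# leave the decision about materializing it to the engine.
via_with = conn.execute("""
    WITH eu_orders_cte AS (
        SELECT o.customer_id, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id
        WHERE c.region = 'EU'
    )
    SELECT customer_id, SUM(amount)
    FROM eu_orders_cte
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

print(via_temp)              # [(1, 55.0), (3, 20.0)]
print(via_temp == via_with)  # True
```

The WITH version keeps everything in one statement, so the engine remains free to inline or materialize the subquery as it sees fit.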

明媚如初 2024-09-25 09:42:02

This would improve performance if you run the query multiple times without rebuilding the temporary table. The first run would likely take about the normal time, but subsequent runs would skip the work of gathering the data for the temporary table. However, the data in that table would become stale: updates made after it was built would not show up in later queries.

This might be acceptable for queries that do not have to be absolutely up to date. For example, statistical reporting queries can usually get away with day-old data.

As an alternative, many databases support materialized views (or indexed views), which are effectively temporary tables that the database keeps up to date for you, either on every update or on a refresh schedule, depending on the product and its settings.

You could also reproduce this effect manually using triggers on update or by performing your updates via stored procedures. This approach results in a very fragile database and is error prone, so I would generally recommend against it.
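The staleness trade-off can be seen in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for whatever database you run, and the `sales` table and `sales_summary` cache are hypothetical names made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table, invented for this sketch.
    CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO sales VALUES (1, 100), (2, 200);

    -- Cache an expensive aggregate once in a temporary table.
    CREATE TEMP TABLE sales_summary AS
    SELECT SUM(amount) AS total FROM sales;
""")

# A later update to the base table is not reflected in the cached copy.
conn.execute("INSERT INTO sales VALUES (3, 300)")

cached_total = conn.execute("SELECT total FROM sales_summary").fetchone()[0]
live_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(cached_total, live_total)  # 300 600

# Refreshing the cache means paying the cost of the original query again.
conn.executescript("""
    DELETE FROM sales_summary;
    INSERT INTO sales_summary SELECT SUM(amount) FROM sales;
""")
refreshed_total = conn.execute("SELECT total FROM sales_summary").fetchone()[0]
print(refreshed_total)  # 600
```

Reads from `sales_summary` are cheap between refreshes, which is exactly the window in which they can be wrong.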

各自安好 2024-09-25 09:42:02

This very much depends on your specific situation: such a change could hurt or improve performance. There is no general rule for this; what is the query that you are having trouble with?

It may improve performance because the intermediate result may be a smaller table, which is easier to query and join to. The query optimizer might do this automatically, but in some cases it gets it wrong. This is a way of doing the optimizer's work manually.

〃安静 2024-09-25 09:42:02

I think this 'rule' has emerged because the behavior of the database engine becomes hard to predict when many tables are involved - each extra table multiplies the number of possible ways to execute the query.
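As a rough illustration of that growth, the sketch below counts only left-deep join orders, of which there are n! for n tables. This is a simplification chosen for the example: a real optimizer also weighs bushy join trees, join algorithms and access paths, and prunes most candidates rather than enumerating them all. The function name is made up for the sketch.

```python
from math import factorial


def left_deep_join_orders(table_count: int) -> int:
    """Number of left-deep join orders for the given number of tables (n!)."""
    return factorial(table_count)


for n in (2, 4, 8, 12):
    print(n, left_deep_join_orders(n))
# 2 2
# 4 24
# 8 40320
# 12 479001600
```

Going from n - 1 to n tables multiplies the count by n, which is why plans for a dozen-table join are so much harder to reason about than plans for three or four.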

In theory, it is possible to trace exactly how the Oracle optimizer makes its decision, and to use statistics, hints and plans to give it the information it requires to do its job properly.

In reality, this process often seems to fall down at the developer/DBA gap - both in terms of training, and in terms of access to the required tools.

The disadvantage of the temp table approach is that you have prevented the database from using a 'better' optimisation when resources change (e.g. the DB server now has 8 GB of memory, so the fastest approach is to load all the tables entirely into memory, but the temp table approach has forced a write back to disk).

酒中人 2024-09-25 09:42:02

I would never, ever consider using temporary tables to improve the performance of a single query. (I assume you're talking about actual tables, not materialized views.) In my experience, Oracle can join several dozen tables without a problem at least 99.9% of the time, provided you have up-to-date statistics.

For those rare cases when things don't seem optimal, you should first try working within the system Oracle provides you. Most performance problems I see arise because someone isn't doing something in a logical way, or is not aware of existing features: for example, querying the same table twice instead of using analytic functions. If Oracle is still choosing a bad execution plan, then you should look into using hints, or a trick like adding a ROWNUM to stop Oracle from rewriting certain subqueries.

If a temporary table will help, Oracle will do it all for you. Sometimes you can see objects like "SYS_TEMP..." in the explain plan.
