通过大量连接提高视图的性能

发布于 2024-08-15 01:44:12 字数 164 浏览 4 评论 0原文

我有一个使用 11 个外部联接和两个内部联接来创建数据的视图。这会产生超过 800 万行。当我在桌子上计数 (*) 时,运行大约需要 5 分钟。我不知道如何提高该表的性能。有人对从哪里开始有什么建议吗?似乎所有要连接的列上都有索引(尽管有些是复合的,不确定这是否有影响......)

感谢任何帮助。

I have a view that uses 11 outer joins and two inner joins to create the data. This results in over 8 million rows. When I do a count (*) on the table it takes about 5 minutes to run. I'm at a loss as to how to improve the performance of this table. Does anyone have any suggestions on where to begin? There appear to be indexes on all of the columns that are joining (though some are composit, not sure if that makes a difference...)

Any help appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

不打扰别人 2024-08-22 01:44:12

这是一项困难的任务,对于复杂的视图,您还可能与视图的查询进行交互,因此保证合理的性能将非常困难。视图中的外连接(尤其是复杂的视图)也容易给查询优化器带来麻烦。

一种选择是具体化视图(在 SQL Server 上称为“索引视图”)。但是,您可能需要监视更新性能以检查它是否不会造成太多开销。此外,物化视图中的外部联接可能会妨碍实时刷新;如果您需要这个,那么您可能必须将视图重新实现为非规范化表并使用触发器维护数据。

另一种可能性是检查视图是否可以分为两个或三个更简单的视图,可能会具体化部分视图,但不是全部视图。以这种方式实现某些视图并从系统中获得性能可能会更容易。

This is a hard one, with a complex view you also have potential interactions with queries against the view, so guaranteeing reasonable performance will be quite hard. Outer joins in views (especially complex ones) are also prone to cause trouble for the query optimiser.

One option would be to materialise the view (called 'indexed views' on SQL Server). However you may need to monitor update performance to check that it does not impose too much overhead. Also, outer joins in a materialised view may preclude real-time refresh; if you need this then you may have to re-implement the view as a denormalised table and maintain the data with triggers.

Another possibility would be to examine whether the view could be split into two or three simpler views, possibly materialising some but not all of the view. It may be easier to materialise some of the view and get performance from the system that way.

私野 2024-08-22 01:44:12

你的基本前提是错误的。
拥有返回 800 万行的视图并不是一个好主意,因为实际上您无法用这么多数据做任何事情。
由于所有这些连接,5 分钟对于 800 万个 count() 来说听起来相当不错。

您要做的就是考虑您的业务问题并编写一个较小的查询/视图。

your basic premise is wrong.
having a view that returns 8 million rows is not a good idea because realisticaly you can't really do anything with so much data.
5 minutes sounds pretty good for 8 million count() because of all those joins.

what you have to do is to think about your business problem and write a smaller query/view.

天涯沦落人 2024-08-22 01:44:12

您可以考虑以下几件事:

  1. 非规范化。通过非规范化数据结构分区来减少所需的联接数量
  2. 。您可以对大表中的数据进行分区吗?例如,如果将一个大表划分为多个较小的表,则性能会更好。从 SQL 2005 开始的企业版对分区有很好的支持,请参见此处 。如果您开始进入 10/100 数百万行
  3. 索引管理/统计的领域, 请考虑这一点。所有索引都已整理碎片吗?统计数据是最新的吗?

A few things you could consider:

  1. denormalisation. Reduce the number of joins required by denormalising your data structure
  2. partitioning. Can you partition data from large tables? e.g. a large table, could perform better if partitioned into a number of smaller tables. Enterprise Edition from SQL 2005 onwards has good support for partitioning, see here. Would consider this if you start getting in the realms of 10s/100s of millions of rows
  3. index management/statistics. Are all indexes defragged? Are statistics up to date?
成熟的代价 2024-08-22 01:44:12

运行 sql 探查器/索引调整向导。有时它提出的索引建议并没有立即有意义,但结果却具有极好的性能优势

Run the sql profiler/index tuning wizard. sometimes it makes index recommendations that don't immediately make sense, but turn out to have wonderful perf benefits

陪我终i 2024-08-22 01:44:12

也许您尝试(外部)连接的某些表是不相交的?如果是这样,请考虑创建存储过程而不是视图,并创建如下内容:


选择 ...
进入#set1
从 T1 左连接 T2 左连接...
其中...


选择 ...
进入#set2
从 T3 左连接 T4 左连接...
where ...

...

select ... from #set1 left join #set2 left join ...

这样,您可以避免处理大量数据的数据。当您进行外连接时,优化器通常无法在查询的解析树中向下移动选择(如果这样做,您将不会获得您可能想要的带有空值的行)

当然,您无法通过与存储过程连接来创建查询。这只是您可以使用的基本想法。

Maybe some of the tables which you are trying to (outer) join are disjoint? If so, consider creating stored procedure instead of view and create something like this:


select ...
into #set1
from T1 left join T2 left join...
where ...


select ...
into #set2
from T3 left join T4 left join...
where ...

...

select ... from #set1 left join #set2 left join ...

With this, you can avoid processing huge amount of data. When you make outer joins, optimizer often cannot move selection down in the parse tree of query (if it done so, you wouldn't get rows with nulls which you probably want)

Of course you cannot create query with joining with stored procedure. This is only basic idea which you can use.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文