通过大量连接提高视图的性能
我有一个使用 11 个外部联接和两个内部联接来创建数据的视图。这会产生超过 800 万行。当我在桌子上计数 (*) 时,运行大约需要 5 分钟。我不知道如何提高该表的性能。有人对从哪里开始有什么建议吗?似乎所有要连接的列上都有索引(尽管有些是复合的,不确定这是否有影响......)
感谢任何帮助。
I have a view that uses 11 outer joins and two inner joins to create the data. This results in over 8 million rows. When I do a count (*) on the table it takes about 5 minutes to run. I'm at a loss as to how to improve the performance of this table. Does anyone have any suggestions on where to begin? There appear to be indexes on all of the columns that are joining (though some are composit, not sure if that makes a difference...)
Any help appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
这是一项困难的任务,对于复杂的视图,您还可能与视图的查询进行交互,因此保证合理的性能将非常困难。视图中的外连接(尤其是复杂的视图)也容易给查询优化器带来麻烦。
一种选择是具体化视图(在 SQL Server 上称为“索引视图”)。但是,您可能需要监视更新性能以检查它是否不会造成太多开销。此外,物化视图中的外部联接可能会妨碍实时刷新;如果您需要这个,那么您可能必须将视图重新实现为非规范化表并使用触发器维护数据。
另一种可能性是检查视图是否可以分为两个或三个更简单的视图,可能会具体化部分视图,但不是全部视图。以这种方式实现某些视图并从系统中获得性能可能会更容易。
This is a hard one, with a complex view you also have potential interactions with queries against the view, so guaranteeing reasonable performance will be quite hard. Outer joins in views (especially complex ones) are also prone to cause trouble for the query optimiser.
One option would be to materialise the view (called 'indexed views' on SQL Server). However you may need to monitor update performance to check that it does not impose too much overhead. Also, outer joins in a materialised view may preclude real-time refresh; if you need this then you may have to re-implement the view as a denormalised table and maintain the data with triggers.
Another possibility would be to examine whether the view could be split into two or three simpler views, possibly materialising some but not all of the view. It may be easier to materialise some of the view and get performance from the system that way.
你的基本前提是错误的。
拥有返回 800 万行的视图并不是一个好主意,因为实际上您无法用这么多数据做任何事情。
由于所有这些连接,5 分钟对于 800 万个 count() 来说听起来相当不错。
您要做的就是考虑您的业务问题并编写一个较小的查询/视图。
your basic premise is wrong.
having a view that returns 8 million rows is not a good idea because realisticaly you can't really do anything with so much data.
5 minutes sounds pretty good for 8 million count() because of all those joins.
what you have to do is to think about your business problem and write a smaller query/view.
您可以考虑以下几件事:
A few things you could consider:
运行 sql 探查器/索引调整向导。有时它提出的索引建议并没有立即有意义,但结果却具有极好的性能优势
Run the sql profiler/index tuning wizard. sometimes it makes index recommendations that don't immediately make sense, but turn out to have wonderful perf benefits
也许您尝试(外部)连接的某些表是不相交的?如果是这样,请考虑创建存储过程而不是视图,并创建如下内容:
选择 ...
进入#set1
从 T1 左连接 T2 左连接...
其中...
选择 ...
进入#set2
从 T3 左连接 T4 左连接...
where ...
...
select ... from #set1 left join #set2 left join ...
这样,您可以避免处理大量数据的数据。当您进行外连接时,优化器通常无法在查询的解析树中向下移动选择(如果这样做,您将不会获得您可能想要的带有空值的行)
当然,您无法通过与存储过程连接来创建查询。这只是您可以使用的基本想法。
Maybe some of the tables which you are trying to (outer) join are disjoint? If so, consider creating stored procedure instead of view and create something like this:
select ...
into #set1
from T1 left join T2 left join...
where ...
select ...
into #set2
from T3 left join T4 left join...
where ...
...
select ... from #set1 left join #set2 left join ...
With this, you can avoid processing huge amount of data. When you make outer joins, optimizer often cannot move selection down in the parse tree of query (if it done so, you wouldn't get rows with nulls which you probably want)
Of course you cannot create query with joining with stored procedure. This is only basic idea which you can use.