SQL Join 需要很长时间
我有两个表
(1) MonthlyTarget {SalesManCode, TargetMonthYear, TargetValue};
该表有 1966177 行。
(2) MonthlySales {SalesManCode, SaleDate, AchievedValue};
该表有 400310 行。
我必须进行一个查询,生成如下表所示的结果:
{SalesManCode, JanTar, JanAch, FebTar, FebAch,....., DecTar, DecAch}
问题是,连接这两个表需要很长时间。
应该查询什么?
如何优化查询?
我不想考虑索引。
I have two tables
(1) MonthlyTarget {SalesManCode, TargetMonthYear, TargetValue};
this table has 1966177 rows.
(2) MonthlySales {SalesManCode, SaleDate, AchievedValue};
this table has 400310 rows.
I have to make a query that produces a result like the following table:
{SalesManCode, JanTar, JanAch, FebTar, FebAch,....., DecTar, DecAch}
The problem is, joining these two tables taking a long time.
What should be the query?
How can the query be optimized?
I don't want to consider indexing.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
好吧,如果您不想考虑索引,那么您将始终执行全表扫描,并且性能不会提高。
Well, if you don't want to consider indexing, then you will always be preforming full table scans and performance will not be improved.
您的 MonthlyTarget 表中似乎缺少一些列,即“TargetDate”列。
除了每个人都已经说过的关于索引的内容之外,有时分而治之的方法确实很有帮助。 不要将 1966177 行表连接到 400310 行表,而是创建小型临时表并将它们连接在一起:
上面的查询创建两个中间表,它们应包含与 SalesMan 表相同数量的记录。 加入它们很简单:
如果您发现自己需要始终按月提取数据,请将代码移至视图中。
PIVOT 需要 SQL Server 2005 或更高版本,并且它通常是一个非常有用的运算符。 希望 SQL Server 2008 将允许用户一次旋转多个列,这将导致比上面显示的查询更简单的结果。
使用 SQL Server 2000:
PIVOT 是语法糖。 例如,
变成
It looks like you're missing some columns in your MonthlyTarget table, namely a "TargetDate" column.
In addition to what everyone has already said about indexing, sometimes a divide-and-conquer approach can really help. Rather than joining a 1966177 row table to a 400310 row table, create to tiny temp tables and join them together instead:
The queries above create two intermediate tables which should contain the same number of records as your SalesMan table. Joining them is straightforward:
If you find yourself needing to pull out data by month all the time, move the code into a view instead.
PIVOT requires SQL Server 2005 or higher, and its often a very useful operator. Hopefully SQL Server 2008 will allow users to pivot on more than one column at a time, which will result in an even simpler query than shown above.
Using SQL Server 2000:
PIVOT is syntax sugar. For example,
Becomes
检查表上的索引是否正确。 如果不查看数据库本身,几乎不可能说出来,但 99% 的慢连接是由于不正确或丢失的表索引造成的。
Check for proper indexing on the tables. It is almost impossible to say without looking at the database itself but 99% of the time slow joins are due to improper or missing table indices.
你必须考虑索引。 无论您如何编写查询,数据库引擎别无选择,只能扫描每个表以查找连接,并且它可能会一遍又一遍地执行此操作。 你没有选择。
如果您因为无法控制数据库而不想搞乱索引,请考虑将数据导出到本地 SQL Express 实例。
哎呀,即使将数据导出到平面文件,按 SalesManCode 对文件进行排序并编写一个简单的程序来读取和匹配它们也会更快。
You have to consider indexing. It doesn't matter how you write the query, the db engine has no choice but to scan each table looking for the joins and it will likely do that over and over again. You have no choice.
If you don't want to mess with indexing because you don't have control over the database, consider exporting the data to a local SQL Express instance.
Heck, even exporting the data to flat files, sorting the files by SalesManCode and writing a simple program to read and match them would be faster.
除了解雇一群销售人员之外,请考虑以下选项:
Short of firing a bunch of sales people, consider the following options:
你有过滤器吗? 您能否将一些部分结果存储在临时表中,然后在减小数据大小后加入到其余数据中?
Do you have any filters? Could you store some partial results in a temp-table, then join to the rest of your data after you've reduced the size of your data?
如果无法选择索引,则唯一加快速度的其他方法是将其放在更快的服务器上。 有些事情告诉我索引会更容易。
If indexing isn't an option, the only other way to speed it up is putting it on a faster server. Something tells me indexing would be easier though.