构建一个汇总另一个表数据的表 - 优化建议?
我正在尝试建立一个表格来总结我们的 Wiki 中的页面引用某些宏的次数。
我构建了一个名为 @currentpages 的临时表,其中有 55k 行,如下所示:
DECLARE @currentpages table(
ContentID NUMERIC(19,0) NOT NULL PRIMARY KEY,
PageTitle VARCHAR(255) NULL,
SpaceKey VARCHAR(255) NULL,
OriginalAuthor VARCHAR(255) NULL,
LastChangedBy VARCHAR(255) NULL,
LastChangedDt VARCHAR(10) NULL,
ContentBody TEXT NULL);
另一个表如下所示:(
DECLARE @usage table(
SpaceKey VARCHAR(255) NOT NULL PRIMARY KEY,
Macro1 NUMERIC(19,0) NULL,
Macro2 NUMERIC(19,0) NULL,
Macro3 NUMERIC(19,0) NULL);
我简化了上面的内容;它实际上有大约 40 列,如 Macro1、Macro2 等)我'我试图计算有多少个对这些不同宏的引用(在@currentpages中)。
我通过为每个可能的 SpaceKey 值创建一行来初始化 @usage 表,将所有宏“x”计数器清零:
INSERT INTO @usage (SpaceKey, Macro1, Macro2, Macro3)
SELECT S.spacekey, 0, 0, 0
FROM spaces S
ORDER BY S.spacekey
然后我运行几个 UPDATE 语句中的第一个来识别每个宏引用:
UPDATE @usage
SET U.AdvancedSearch = C.Counter
FROM @usage U
INNER JOIN (SELECT SpaceKey, COUNT(*) AS Counter
FROM @currentpages
WHERE ContentBody LIKE '%{search-%' GROUP BY SpaceKey) C
ON U.SpaceKey = C.SpaceKey
这似乎工作正常,但它运行很长时间。有没有更有效的方法来完成我想做的事情?
该数据库位于 SQL Server 2005 上。
非常感谢, 贝特西
I am trying to build a table that summarizes how many times pages in our Wiki references certain macros.
I have built a temporary table called @currentpages that with 55k rows that looks like this:
DECLARE @currentpages table(
ContentID NUMERIC(19,0) NOT NULL PRIMARY KEY,
PageTitle VARCHAR(255) NULL,
SpaceKey VARCHAR(255) NULL,
OriginalAuthor VARCHAR(255) NULL,
LastChangedBy VARCHAR(255) NULL,
LastChangedDt VARCHAR(10) NULL,
ContentBody TEXT NULL);
and another table that looks like this:
DECLARE @usage table(
SpaceKey VARCHAR(255) NOT NULL PRIMARY KEY,
Macro1 NUMERIC(19,0) NULL,
Macro2 NUMERIC(19,0) NULL,
Macro3 NUMERIC(19,0) NULL);
(I've simplified the above; it actually has about 40 columns like Macro1, Macro2, etc.) I'm trying to get counts of how many references there are (in @currentpages) to these various macros.
I initialize the @usage table by creating a row for each possible SpaceKey value, zeroing out all of the Macro"x" counters:
INSERT INTO @usage (SpaceKey, Macro1, Macro2, Macro3)
SELECT S.spacekey, 0, 0, 0
FROM spaces S
ORDER BY S.spacekey
Then I run the first of several UPDATE statements to identify each macro reference:
UPDATE @usage
SET U.AdvancedSearch = C.Counter
FROM @usage U
INNER JOIN (SELECT SpaceKey, COUNT(*) AS Counter
FROM @currentpages
WHERE ContentBody LIKE '%{search-%' GROUP BY SpaceKey) C
ON U.SpaceKey = C.SpaceKey
This appears to work fine, but it runs a very long time. Is there a more efficient way to do what I'm trying to do?
This database is on SQL Server 2005.
Many thanks,
Betsy
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
调查全文搜索。 (全文搜索 @ MSDN)
我对此没有任何经验,因此无法提供建议。
不要为每个宏查询一次@CurrentPages,而是一次获取所有宏计数。像这样的东西:
Investigate full text search. (Full Text Search @ MSDN)
I don't have any experience with it, so can not offer advice.
Instead of querying @CurrentPages once per macro, get all the macro counts at once. Something like:
不要为每个宏创建一列,而是创建一个像这样的表。
一旦你有了它,你就可以通过运行来填充它。
在你的故事结束时,你可以查询@usage表,以你想要的任何形式进行计数。
对于 55k 条记录,这将运行几分钟。
Don't make a column per macro, but make a table like this.
Once you have that you can populate it by running
At the end of your story you can query the @usage table to make counts in whatever form you wish.
On 55k records this will run for a few minutes.
我想任何性能影响都将来自包含 LIKE 子句的子查询,因为它将进行全表扫描。我真的不知道还有什么会减慢速度太多。
您可以通过仅运行子查询并将返回所需的时间与某一列的整个更新所需的时间进行比较来进行测试。
I would imagine any performance hits are going to come from the subquery containing the LIKE clause since it's going to be doing a full table scan. I don't really see what else would slow this down too much.
You could test that by running just the sub-query and comparing the time it takes to return to the time the entire update takes for one of your columns.