In SQL Server 2005, how long should a SELECT INTO query take when selecting from a table of roughly 200 million rows?

Posted on 2024-10-27 03:13:23

I have a table with 193,569,270 rows in a SQL Server 2005 database. The table houses activities that are performed by users of our website. The table is defined as:

Name                  DataType
ID                    int (identity)             PK
ActivityTime          datetime
PersonID              int                        (should be an FK, but isn't)
ActivityTypeID        int                        (should be an FK, but isn't)
Data1                 varchar(50)
Data2                 varchar(50)

I have the following indexes:

CREATE NONCLUSTERED INDEX [_MS_Sys_3] ON [dbo].[tblPersonActivity] ([PersonID] ASC)
INCLUDE ( [ID], [ActivityTime], [ActivityTypeID], [Data1], [Data2]) 
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Activity] ON [dbo].[tblPersonActivity] ([PersonID] ASC, [ActivityTypeID] ASC, [ActivityTime] ASC)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_tblPersonActivity_PersonArchive] ON [dbo].[tblPersonActivity] ([ActivityTime] ASC)
INCLUDE ([ID], [PersonID], [ActivityTypeID], [Data1], [Data2]) 
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

ALTER TABLE [dbo].[tblPersonActivity] ADD  CONSTRAINT [PK_tblPersonActivity] PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

This is the query I've written:

declare @archiveDate            datetime
declare @curDate                datetime
declare @startDate              datetime
declare @curYear                int
declare @preYear                int

set @curDate = getdate()
set @curYear = year(@curDate)
set @preYear = @curYear - 1
set @archiveDate = @curDate
set @startDate = cast(('1/1/' + cast(@preYear as varchar(4))) as datetime)

declare @InactivePersons table
    (PersonID       int     not null PRIMARY KEY)

insert into @InactivePersons
    select 
        b.PersonID 
    from 
        HBM.dbo.tblPersons b with (INDEX(IX_tblPersons_InactiveDate_PersonID), nolock)
    where 
        b.InactiveDate is not null 
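        and b.InactiveDate <> '1/1/1900'    -- operator missing in the source; <> is assumed
        and b.InactiveDate <> '12/31/1899'  -- operator missing in the source; <> is assumed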
        and b.InactiveDate = @StartDate

The last time I ran the query it ran for over 1 day before I killed it. Have I missed something or is this just going to take that kind of time?

Thanks for any help you can provide.

Wayne E. Pfeffer

Answer from 雨后咖啡店, 2024-11-03 03:13:23

No, this should not take that long if your database is properly set up and indexed.

First, you need to create those FKs! There is no excuse for not having them to ensure your data integrity. FKs should have their own indexes.

InactiveDate doesn't seem to be in your table structure. Is it a date field? If it is not, make it one, or you are wasting time on implicit conversions.
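
As a rough sketch of that advice, the statements below add the two missing constraints and an index for the one foreign key column that no existing index leads with. Several things here are assumptions rather than facts from the question: that `tblPersons` lives in the same database as `tblPersonActivity` (SQL Server cannot enforce a foreign key across databases, and the query refers to `HBM.dbo.tblPersons`), that `PersonID` is its primary key, and that a lookup table named `tblActivityTypes` exists. That table name and all constraint and index names are hypothetical.

```sql
-- Assumes dbo.tblPersons is in the same database and PersonID is its primary key.
ALTER TABLE dbo.tblPersonActivity WITH CHECK
    ADD CONSTRAINT FK_tblPersonActivity_tblPersons          -- hypothetical name
    FOREIGN KEY (PersonID) REFERENCES dbo.tblPersons (PersonID);
GO

-- dbo.tblActivityTypes is a hypothetical lookup table keyed on ActivityTypeID.
ALTER TABLE dbo.tblPersonActivity WITH CHECK
    ADD CONSTRAINT FK_tblPersonActivity_tblActivityTypes    -- hypothetical name
    FOREIGN KEY (ActivityTypeID) REFERENCES dbo.tblActivityTypes (ActivityTypeID);
GO

-- SQL Server does not index foreign key columns automatically.
-- PersonID already leads [_MS_Sys_3] and [IX_Activity]; ActivityTypeID leads no index.
CREATE NONCLUSTERED INDEX IX_tblPersonActivity_ActivityTypeID  -- hypothetical name
    ON dbo.tblPersonActivity (ActivityTypeID);
GO
```

`WITH CHECK` validates the existing rows, so on a table of this size each statement may run for a while and is probably best scheduled for a quiet period.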

b.InactiveDate is not null 
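        and b.InactiveDate <> '1/1/1900'    -- operator missing in the source; <> is assumed
        and b.InactiveDate <> '12/31/1899'  -- operator missing in the source; <> is assumed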
        and b.InactiveDate = @StartDate

This whole where clause doesn't make sense. If you are looking for the records that match @startdate then you don't need any of the rest.
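
To illustrate the point, here is a sketch of the insert with the predicate reduced to the equality alone. An equality against a non-NULL variable can never be true for a NULL `InactiveDate`, and `@startDate` is January 1 of the previous year, so it cannot equal either placeholder date. The sketch also leaves out the index hint so the optimizer can choose an access path; that is a choice made here, not something the answer states. The `dateadd` expression is a locale-independent substitute for the string cast in the question.

```sql
declare @startDate datetime
-- January 1 of the previous year, computed without locale-dependent string parsing
set @startDate = dateadd(year, datediff(year, 0, getdate()) - 1, 0)

declare @InactivePersons table
    (PersonID int not null PRIMARY KEY)

insert into @InactivePersons
    select
        b.PersonID
    from
        HBM.dbo.tblPersons b with (nolock)
    where
        b.InactiveDate = @startDate   -- implies "is not null" and excludes the placeholder dates
```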

Check the execution plan to see where this is taking so long; something is causing a table scan.
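
One way to do that without waiting for the statement to finish is to ask for the estimated plan. The sketch below uses `SET SHOWPLAN_ALL`, which makes SQL Server return the plan for each statement instead of executing it. The date literal is a stand-in for `@startDate` and is an assumed value.

```sql
SET SHOWPLAN_ALL ON;
GO

-- Not executed while SHOWPLAN_ALL is ON; only the estimated plan is returned.
SELECT b.PersonID
FROM HBM.dbo.tblPersons AS b
WHERE b.InactiveDate = '20230101';   -- stand-in for @startDate (assumed value)
GO

SET SHOWPLAN_ALL OFF;
GO
```

In the result, the `PhysicalOp` column shows whether each step is a seek or a scan, and `EstimateRows` shows how many rows the optimizer expects from it.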

And if there will be a large number of records in the table variable, then a temp table tends to perform faster. You don't say what you are doing with this table in the rest of the proc; are you sure it is the insert statement that takes the most time, and not something else you are doing?
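
A minimal sketch of the temp table variant follows, assuming the same `tblPersons` columns as in the question and using the simplified equality predicate. The temp table name simply mirrors the table variable and is otherwise arbitrary.

```sql
-- Drop a leftover copy from an earlier run in this session, if any.
if object_id('tempdb..#InactivePersons') is not null
    drop table #InactivePersons

create table #InactivePersons
    (PersonID int not null PRIMARY KEY)

declare @startDate datetime
set @startDate = dateadd(year, datediff(year, 0, getdate()) - 1, 0)  -- January 1 of last year

insert into #InactivePersons
    select b.PersonID
    from HBM.dbo.tblPersons b with (nolock)
    where b.InactiveDate = @startDate

-- Unlike a table variable, a temp table has column statistics,
-- so later joins against it get realistic row estimates.
```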
