How to copy a huge table's data into another table in SQL Server

Posted 2024-10-21 23:07:48


I have a table with 3.4 million rows. I want to copy all of this data into another table.

I am performing this task using the below query:

select * 
into new_items 
from productDB.dbo.items

I need to know the best possible way to do this task.


Comments (12)

当梦初醒 2024-10-28 23:07:48


I had the same problem, except I have a table with 2 billion rows, so the log file would grow to no end if I did this, even with the recovery model set to Bulk-Logged:

insert into newtable select * from oldtable

So I operate on blocks of data. This way, if the transfer is interrupted, you just restart it. Also, you don't need a log file as big as the table. You also seem to get less tempdb I/O, not sure why.

set identity_insert newtable on
DECLARE @StartID bigint, @LastID bigint, @EndID bigint

-- Resume from wherever the last run stopped.
select @StartID = isNull(max(id),0) + 1
from newtable

select @LastID = max(ID)
from oldtable

-- Copy in blocks of roughly 1 million rows so each transaction stays small.
while @StartID <= @LastID
begin
    set @EndID = @StartID + 1000000

    insert into newtable (FIELDS,GO,HERE)
    select FIELDS,GO,HERE from oldtable WITH (NOLOCK)
    where id BETWEEN @StartID AND @EndID

    set @StartID = @EndID + 1
end
set identity_insert newtable off
go

You might need to change how you deal with IDs; this works best if your table is clustered by ID.

脸赞 2024-10-28 23:07:48


If you are copying into a new table, the quickest way is probably what you have in your question, unless your rows are very large.

If your rows are very large, you may want to use the bulk insert functions in SQL Server. I think you can call them from C#.

Or you can first download that data into a text file, then bulk-copy (bcp) it. This has the additional benefit of allowing you to ignore keys, indexes etc.

Also try the Import/Export utility that comes with SQL Server Management Studio; not sure whether it will be as fast as a straight bulk copy, but it should allow you to skip the intermediate step of writing out a flat file and just copy directly table-to-table, which might be a bit faster than your SELECT INTO statement.
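
As a rough illustration of the flat-file route: after exporting the source with bcp (something like: bcp productDB.dbo.items out items.dat -n -T), a minimal BULK INSERT to load the file into the new table might look like the sketch below; the file path, batch size, and native format are illustrative assumptions, not from the question.

-- Load the bcp native-format export into the target table in committed batches.
BULK INSERT dbo.new_items
FROM 'C:\export\items.dat'
WITH (
    DATAFILETYPE = 'native',  -- matches a file produced by bcp ... out ... -n
    BATCHSIZE    = 50000,     -- commit every 50,000 rows to keep the log in check
    TABLOCK                   -- table lock allows minimal logging on a heap
);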

暖伴 2024-10-28 23:07:48


I have been working with our DBA to copy an audit table with 240M rows to another database.

Using a simple select/insert created a huge tempdb file.

Using the Import/Export wizard worked, but it copied 8M rows in 10 minutes.

Creating a custom SSIS package and adjusting its settings copied 30M rows in 10 minutes.

The SSIS package turned out to be the fastest and most efficient for our purposes.

Earl

無處可尋 2024-10-28 23:07:48


Here's another way of transferring large tables. I've just transferred 105 million rows between two servers using this. Quite quick too.

  1. Right-click on the database and choose Tasks/Export Data.
  2. A wizard will take you through the steps, but choosing your SQL Server client as the data source and target will allow you to select the database and table(s) you wish to transfer.

For more information, see https://www.mssqltips.com/sqlservertutorial/202/simple-way-to-export-data-from-sql-server/

我要还你自由 2024-10-28 23:07:48


If it's a one-time import, the Import/Export utility in SSMS will probably be the easiest and fastest. SSIS also seems to work better than a straight INSERT for importing large data sets.

BULK INSERT or BCP can also be used to import large record sets.

Another option would be to temporarily remove all indexes and constraints on the table you're importing into and add them back once the import process completes. A straight INSERT that previously failed might work in those cases.

If you're dealing with timeouts or locking/blocking issues when going directly from one database to another, you might consider going from one db into TEMPDB and then going from TEMPDB into the other database as it minimizes the effects of locking and blocking processes on either side. TempDB won't block or lock the source and it won't hold up the destination.
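
A loose sketch of that two-hop approach is below; the global temp table name and the assumption that the column lists line up are mine, not the answerer's.

-- Hop 1: pull the rows into tempdb without touching the destination.
SELECT *
INTO ##items_stage            -- global temp table lives in tempdb
FROM productDB.dbo.items;

-- Hop 2: load the destination from tempdb, so the source is no longer involved.
INSERT INTO targetDB.dbo.new_items WITH (TABLOCK)
SELECT * FROM ##items_stage;

DROP TABLE ##items_stage;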

Those are a few options to try.

-Eric Isaacs

远山浅 2024-10-28 23:07:48


I like the solution from @Mathieu Longtin of copying in batches, thereby minimising log file issues, and I created a version with OFFSET FETCH as suggested by @CervEd.

Others have suggested using the Import/Export Wizard or SSIS packages, but that's not always possible.

It's probably overkill for many, but my solution also includes some checks of the record counts and outputs progress as well.

USE [MyDB]
GO

SET NOCOUNT ON;
DECLARE @intStart int = 1;
DECLARE @intCount int;
DECLARE @intFetch int = 10000;
DECLARE @strStatus VARCHAR(200);
DECLARE @intCopied int = 0;

SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Getting count of HISTORY records currently in MyTable...';
RAISERROR (@strStatus, 10, 1) WITH NOWAIT;
SELECT @intCount = COUNT(*) FROM [dbo].MyTable WHERE IsHistory = 1;
SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Count of HISTORY records currently in MyTable: ' + CONVERT(VARCHAR(20), @intCount);
RAISERROR (@strStatus, 10, 1) WITH NOWAIT;  --(note: PRINT resets @@ROWCOUNT to 0 so using RAISERROR instead)
SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Starting copy...';
RAISERROR (@strStatus, 10, 1) WITH NOWAIT;

WHILE @intStart < @intCount
BEGIN

    INSERT INTO [dbo].[MyTable_History] (
        [PK1], [PK2], [PK3], [Data1], [Data2])
    SELECT
        [PK1], [PK2], [PK3], [Data1], [Data2]
    FROM [MyDB].[dbo].[MyTable]
    WHERE IsHistory = 1
    ORDER BY 
        [PK1], [PK2], [PK3]
        OFFSET @intStart - 1 ROWS 
        FETCH NEXT @intFetch ROWS ONLY;

    SET @intCopied = @intCopied + @@ROWCOUNT;
    SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Records copied so far: ' + CONVERT(VARCHAR(20), @intCopied); 
    RAISERROR (@strStatus, 10, 1) WITH NOWAIT;

    SET @intStart = @intStart + @intFetch;

END

--Check the record count is correct.
IF @intCopied = @intCount
    BEGIN
        SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Correct record count.'; 
        RAISERROR (@strStatus, 10, 1) WITH NOWAIT;
    END
ELSE
    BEGIN
        SET @strStatus = CONVERT(VARCHAR(30), GETDATE()) + ' Only ' + CONVERT(VARCHAR(20), @intCopied) + ' records were copied, expected: ' + CONVERT(VARCHAR(20), @intCount);
        RAISERROR (@strStatus, 10, 1) WITH NOWAIT;
    END


GO
粉红×色少女 2024-10-28 23:07:48


Simple INSERT/SELECT stored procedures work great until the row count exceeds 1 million. I've watched the tempdb file explode trying to insert/select 20+ million rows. The simplest solution was SSIS, with the batch row size buffer set to 5000 and the commit size buffer set to 1000.

娜些时光,永不杰束 2024-10-28 23:07:48


I know this is late, but if you are encountering semaphore timeouts, you can use ROW_NUMBER to split your insert(s) into increments, using something like:

INSERT INTO DestinationTable (column1, column2, etc)
SELECT column1, column2, etc
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY ID) AS RN, column1, column2, etc
    FROM SourceTable
) AS A
WHERE A.RN >= 1 AND A.RN <= 10000;

The size of the log file will still grow, so there is that to contend with. You get better performance if you disable constraints and indexes when inserting into an existing table, then re-enable the constraints and rebuild the indexes once the insertion is complete.
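
A rough sketch of that disable/rebuild pattern (the index name and table name here are placeholders):

-- Disable a nonclustered index and skip constraint checking before the big load.
ALTER INDEX IX_DestinationTable_SomeColumn ON DestinationTable DISABLE;
ALTER TABLE DestinationTable NOCHECK CONSTRAINT ALL;

-- ... run the batched INSERTs here ...

-- Afterwards, rebuild the index and re-enable (and re-validate) the constraints.
ALTER INDEX IX_DestinationTable_SomeColumn ON DestinationTable REBUILD;
ALTER TABLE DestinationTable WITH CHECK CHECK CONSTRAINT ALL;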

宣告ˉ结束 2024-10-28 23:07:48


If your focus is archiving (DW), you are dealing with a VLDB with 100+ partitioned tables, and you want to isolate most of this resource-intensive work on a non-production server (OLTP), here is a suggestion (OLTP -> DW):

  1. Use backup/restore to get the data onto the archive server (so now, on the Archive or DW server, you will have a Stage and a Target database).
  2. Stage database: use a partition switch to move the data into the corresponding stage table (a sketch of the switch follows below).
  3. Use SSIS to transfer the data from the stage database to the target database, one staged table at a time.
  4. Target database: use a partition switch on the target database to move the data from the stage table into the base table.

Hope this helps.
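
For steps 2 and 4, the switch itself is a metadata-only operation and is near-instant. A minimal sketch, assuming hypothetical table names and that both tables share an identical structure, partition scheme, and filegroup, with the receiving partition empty:

-- Move partition 5 of the restored source table into the matching (empty)
-- partition of the stage table; no rows are physically copied.
ALTER TABLE dbo.AuditFact
    SWITCH PARTITION 5 TO dbo.AuditFact_Stage PARTITION 5;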

国际总奸 2024-10-28 23:07:48


I am answering this 12-year-old post since it is still relevant, and my blog post (linked below) might be useful. It is all about breaking the large table/data into "chunks" and loading the chunks one at a time or in parallel. If there are failures, we just reset the status table that holds the "chunks" and continue from where we left off.

My blog post on copying a Huge table/dataset

The key query that creates the status table with the chunks is this query:

--++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--Author: Jana Sattainathan (@SQLJana - https://sqljana.wordpress.com)
--Sourcecode: https://sqljana.wordpress.com/?p=7980
--++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--## **** Script to split any table into chunks based on a column to chunk it by and create a status tracking table out of the chunk data **** 
--@TableName could be any schema.table 
--@ChunkByColumn can be of any data type but prefer an indexed PK ID type column for performance!
--@ChunkSize decides how big each chunk is when table is broken up into chunks (ranges of ChunkByColumn values)
 
DECLARE @StatusTrackingTable VARCHAR(50) = '[dbo].[JanasTableStatusTracker]';
DECLARE @SourceServerName VARCHAR(50) = @@SERVERNAME;
DECLARE @SourceDBName VARCHAR(50) = DB_NAME();
DECLARE @SourceTableName VARCHAR(50) = '[Sales].[SalesOrderHeader]';
DECLARE @ChunkByColumn VARCHAR(50) = '[SalesOrderID]';
DECLARE @ChunkSize INT = 100;
DECLARE @TargetServerName VARCHAR(50) = @SourceServerName;
DECLARE @TargetDBName VARCHAR(50) = 'DBAT';
DECLARE @TargetTableName VARCHAR(50) = '[dbo].[SalesOrderHeader_Jana]' --CASE WHEN @TargetServerName IS NOT NULL THEN @SourceTableName ELSE NULL END;
 
DECLARE @Batch BIGINT = CAST(FORMAT(GETDATE(),'yyyyMMddHHmmssffff') AS BIGINT);
DECLARE @BatchString VARCHAR(18) = LTRIM(STR(@Batch,18));
DECLARE @SQL NVARCHAR(1600); 
DECLARE @CreateInsertToStatusTableSQL NVARCHAR(1600); 
 
BEGIN
    SET @SQL = 'SELECT          
            [Batch], ' +
            '''' + @SourceServerName + ''' AS SourceServerName, ' +
            '''' + @SourceDBName + ''' AS SourceDBName, ' +
            '''' + @SourceTableName + ''' AS SourceTableName, ' +
            '''' + @ChunkByColumn + ''' AS ChunkByColumn, ' +
            '''' + COALESCE(@TargetServerName,'') + ''' AS TargetServerName, ' +
            '''' + COALESCE(@TargetDBName,'')  + ''' AS TargetDBName, ' +
            '''' + COALESCE(@TargetTableName,'')  + ''' AS TargetTableName, 
            ChunkNumber,
            MIN(Id) AS MinValue,
            MAX(Id) AS MaxValue,
            CAST(COUNT(1) AS BIGINT) AS ChunkRowCount, 
            /*'' SELECT *  FROM ' + @SourceDBName + '.' + @SourceTableName + ' WHERE ' + @ChunkByColumn + ' BETWEEN ''+' + 'LTRIM(STR(MIN(Id)))' + '+'' AND ''+' + 'LTRIM(STR(MAX(Id)))' + ' + '''' AS SimpleChunkSelectSQLForNumIds, */
            '' SELECT *  FROM ' + @SourceDBName + '.' + @SourceTableName + ' WHERE ' + @ChunkByColumn + ' BETWEEN (SELECT MinValue FROM ' + @SourceDBName + '.' + @StatusTrackingTable + ' WHERE [Batch] = ' + @BatchString + ' AND ChunkNumber = ''+' + 'LTRIM(STR(ChunkNumber))' + '+'') AND (SELECT MaxValue FROM ' + @SourceDBName + '.' + @StatusTrackingTable + ' WHERE [Batch] = ' + @BatchString + ' AND ChunkNumber = ''+' + 'LTRIM(STR(ChunkNumber))' + '+'')' + ' '' AS ChunkSelectSQL, ' +
            CASE WHEN (LEN(COALESCE(@StatusTrackingTable,'')) > 0)
                THEN ' CONVERT(VARCHAR(20), NULL) AS Status, 0 AS TargetBatchRowCount, CONVERT(DATETIME, NULL) AS StartDateTime,  CONVERT(DATETIME, NULL) AS EndDateTime'
                ELSE ''
            END + '
        /*INTO STATUSTABLE*/
        FROM
            (
                SELECT ' +
                    @BatchString + ' AS [Batch],
                    ' + @ChunkByColumn + '  AS Id,
                    ROW_NUMBER() OVER (ORDER BY ' + @ChunkByColumn + ') RowNum,
                    CEILING(ROW_NUMBER() OVER (ORDER BY ' + @ChunkByColumn + ')/' + LTRIM(STR(@ChunkSize)) + '.0 /*INT to REAL*/) AS ChunkNumber
                FROM 
                    ' + @SourceTableName + '
            ) splits
        GROUP BY
            [Batch], ChunkNumber
        ORDER BY
            ChunkNumber';
 
    --Create or insert the data into the status tracking table if a name is provided
    IF (LEN(COALESCE(@StatusTrackingTable,'')) = 0)
    BEGIN
        PRINT @SQL;
        EXECUTE(@SQL);
    END
    ELSE
    BEGIN
        IF OBJECT_ID(@StatusTrackingTable, 'U') IS NOT NULL
        BEGIN
            SET @CreateInsertToStatusTableSQL = 'INSERT INTO ' + @StatusTrackingTable + ' ' + @SQL;
        END;
        ELSE
        BEGIN
            SET @CreateInsertToStatusTableSQL = REPLACE(@SQL, '/*INTO STATUSTABLE*/', (' INTO ' + @StatusTrackingTable + ' '));         
        END;
 
        PRINT @CreateInsertToStatusTableSQL;
        EXECUTE(@CreateInsertToStatusTableSQL);
    END;
END;
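
The loader itself is not shown above; a simplified, hypothetical single-batch driver (the target table name and the 'DONE' status value are my assumptions, not from the original script) could consume the tracking table like this:

-- Run each pending chunk's stored SELECT as an INSERT into the target and mark it
-- done, so a failed run can simply be restarted and resume where it left off.
DECLARE @ChunkNumber BIGINT, @InsertSQL NVARCHAR(MAX);

WHILE EXISTS (SELECT 1 FROM [dbo].[JanasTableStatusTracker] WHERE Status IS NULL)
BEGIN
    SELECT TOP 1
        @ChunkNumber = ChunkNumber,
        @InsertSQL   = N'INSERT INTO [DBAT].[dbo].[SalesOrderHeader_Jana] ' + ChunkSelectSQL
    FROM [dbo].[JanasTableStatusTracker]
    WHERE Status IS NULL
    ORDER BY ChunkNumber;

    EXEC (@InsertSQL);

    UPDATE [dbo].[JanasTableStatusTracker]
    SET Status = 'DONE', EndDateTime = GETDATE()
    WHERE ChunkNumber = @ChunkNumber;
END;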
几味少女 2024-10-28 23:07:48


For anyone out there looking to accomplish this and remove source records as you go, keep it batched and just scoot the records over to the new table with an OUTPUT clause.

WHILE (SELECT TOP 1 1 FROM SourceTable) IS NOT NULL
BEGIN
    DELETE TOP (100000)
    FROM SourceTable
    OUTPUT DELETED.* --Either * or list columns
    INTO TargetTable
END;

If you're not looking to delete, try using a cursor to populate a table var from your source table, set a check in your cursor so that when the table var hits ~100K rows, it performs an insert then truncates the table var. (Sorry, I don't have time to dive into that at the moment.)

"What can I say, except..."

醉城メ夜风 2024-10-28 23:07:48


select * into new_items from productDB.dbo.items

That pretty much is it. This is the most efficient way to do it.
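
One caveat worth adding as a note here: SELECT ... INTO is only minimally logged when the database uses the SIMPLE or BULK_LOGGED recovery model, so a sketch like the following keeps log growth down during the copy. Temporarily switching the recovery model is an assumption on my part and may not be acceptable in every environment, since it affects point-in-time recovery.

-- Minimal logging for SELECT INTO requires the SIMPLE or BULK_LOGGED recovery model.
ALTER DATABASE productDB SET RECOVERY BULK_LOGGED;

SELECT * INTO new_items FROM productDB.dbo.items;

-- Switch back and take a log backup afterwards to restore the normal backup chain.
ALTER DATABASE productDB SET RECOVERY FULL;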
