将数据从一个sql表移动到另一个表的最有效方法是什么
目前,我有一个 SSIS 作业,它运行将数据从一台 sql 20008 机器移动到另一台机器。该作业从大约 6 个表中移动大约 200 万条记录。这大约需要 5-10 分钟,具体取决于服务器负载,这很好。由于数据被移至临时表中,因此除了服务器压力之外,不会受到任何影响。
但当我现在想要将这些数据与其各自的实时表合并时,我的问题就出现了。这可能需要大约 15 分钟的时间,在此期间,表将被清空,然后重新填充。我想知道在表之间移动数据的最有效方法是什么。
目前情况如下:
删除表
使用索引和约束重建表
插入选择以移动数据
然后运行任何需要的计算
数据移动后运行以下命令来重建所有索引:
sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"
我觉得应该有更好的方法,以便用户的停机时间最少。我的一个想法是创建第二组表,然后在它们准备好后重命名它们,但我不确定这是否是最好的方法。
我还刚刚阅读了有关合并命令的信息,这可能会更好,因为我不必删除表并重新填充,这意味着所有数据都将保持可用,但如果不查看几乎所有列,很难知道记录是否发生变化。
我将不胜感激任何帮助。
Currently I have an SSIS job that runs to move data from one sql 20008 machine to another. The job moves about 2 million records from about 6 tables. That takes about 5-10 minutes depending on the server load and that is fine. Because the data is moved into temp tables so nothing is affected besides the strain on the server.
But my problem becomes when I now want to merge that data with their respective live tables. That can take about 15 minutes during which the tables are emptied and then repopulated. What I am wondering is what is the most efficient way to move that data between the tables.
Currently here is how it goes:
drop tables
rebuild tables with indexes and constraints
insert into select to move the data
then run any calculations that are needed
run following command to rebuild all the indexes after the data has been moved:
sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"
I feel like there should be a better way so that there is minimal down time to the users. One thought I had was to create a second set of tables and then just rename those once they are ready but I am not sure if that is the best way either.
I have also just read about the merge command which might be better as I don't have to drop the tables and repopulate which means all the data would stay available but it is hard to know if the records change without looking at nearly all the columns.
I would appreciate any help.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果要清空并重新填充,我们经常做的就是创建一个与当前表同名的视图(因此没有现有的代码中断),并创建两个名为 tablenameA 和 tablenameB 的表,它们具有相同的结构和所有数据。将视图指向 tablenameA。截断表名 B.删除索引。运行进程填充tablenameB并重新索引,运行脚本将视图指向tablenameB。用户停机时间?毫秒。然后下次切换并截断并填充TableNameA,然后将视图重做为TableNameA。
If you are emptying and repopulating what we often do is create a view that is the same name as the current tables (so no existing code breaks) and create two tables called tablenameA and tablenameB with the same structure and all the data. Point the view to tablenameA. Truncate TableNameB. Drop indexes. Run the process to fill tablenameB and reindex, run the script to point the view to tablenameB. Down time to the user? Milliseconds. Then the next time you switch and truncate and fill TableNameA and then redo the view to TableNameA.
看一下表分区。我相信您的用例是表分区存在的原因之一。
这是一个摘要
这更符合您的问题
我应该提到的是,此功能仅适用于企业和开发人员 SKU
Have a look at table partitioning. I believe your use case is one of the reasons table partitioning exists.
Here's a summary
This is more on point with your question
I should mention that this feature is only available on Enterprise and Developer SKUs
除了上面关于表分区的一点之外,您还可以避免临时表的步骤。使用 SQL Server 目标加载,在目标服务器上运行包,将其加载到空分区。使用分区索引并仅为该空分区重建索引。合并到新分区中。
In addition to the above point about the table partitioning, you can then avoid the step to the temp table. Load using a SQL Server Destination, running the package on the destination server, to an empty partition. Use partitioned indexes and rebuild the index only for that empty partition. Merge in the new partition.
在考虑了具有不同基础表的视图选项后,我决定反对,以避免它可能带来的复杂性和混乱。我研究了分区,但由于我对源计算机没有太多控制权,这似乎不是合适的解决方案。所以最后我决定使用SQL MERGE语句并使用BINARY_CHECKSUM来比较行并确定差异与否。虽然我在锁定方面没有遇到任何问题。但我为此提出了另一个问题。
如何提高 SQL MERGE 语句的性能
After considering the option for the view with different underlying tables I decided against to avoid the complexity and confusion that it might introduce. I looked into partitioning but since I do not have much control on the source machine it didn't seem like the appropriate solution. So in the end I just decided to use the SQL MERGE statement and use BINARY_CHECKSUM to compare rows and determine differences or not. Thhough I am not having some issues with the locking with that. But I have opened another question for that.
How to improve performance of SQL MERGE statement