How do I merge two databases in SQL Server?
Both databases have the same schema, but some tables may contain rows whose primary keys conflict. I want the merge to simply ignore the duplicate rows and carry on with the rest.
Comments (7)
Here is how I did this twice in recent years: http://byalexblog.net/merge-sql-databases
If your primary keys are IDENTITY columns, here is my suggestion (it shouldn't require modifying the schema).
Make sure ON UPDATE CASCADE is set on the foreign keys.
Use SET IDENTITY_INSERT ON / OFF on either side of inserting into each of the tables, starting with the parent table and then moving on to the child tables.
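The SET IDENTITY_INSERT suggestion could be sketched roughly as follows. This is only an illustration under stated assumptions: the temp tables (#TargetCustomer, #TargetOrder, #SourceCustomer, #SourceOrder) and their columns are hypothetical stand-ins for the two databases, and the NOT EXISTS filter is one possible way to skip rows whose key already exists, as the question asks. It is not part of the suggestion itself.

```sql
-- Hypothetical stand-ins: the #Target* tables play the destination database,
-- the #Source* tables play the database being merged in.
CREATE TABLE #TargetCustomer (Id INT IDENTITY(1,1) PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
CREATE TABLE #TargetOrder    (Id INT IDENTITY(1,1) PRIMARY KEY, CustomerId INT NOT NULL, Total DECIMAL(10,2) NOT NULL);
CREATE TABLE #SourceCustomer (Id INT PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
CREATE TABLE #SourceOrder    (Id INT PRIMARY KEY, CustomerId INT NOT NULL, Total DECIMAL(10,2) NOT NULL);

INSERT INTO #TargetCustomer (Name) VALUES (N'Alice');   -- gets Id 1
INSERT INTO #TargetCustomer (Name) VALUES (N'Bob');     -- gets Id 2
INSERT INTO #SourceCustomer (Id, Name) VALUES (2, N'Bob'), (3, N'Carol');  -- Id 2 conflicts
INSERT INTO #SourceOrder (Id, CustomerId, Total) VALUES (1, 3, 25.00);

-- Parent table first. Only one table per session can have IDENTITY_INSERT ON,
-- so it is switched ON and OFF around each table in turn.
SET IDENTITY_INSERT #TargetCustomer ON;
INSERT INTO #TargetCustomer (Id, Name)
SELECT s.Id, s.Name
FROM #SourceCustomer AS s
WHERE NOT EXISTS (SELECT 1 FROM #TargetCustomer AS t WHERE t.Id = s.Id);  -- skip conflicting keys
SET IDENTITY_INSERT #TargetCustomer OFF;

-- Then the child table, so every CustomerId it references already exists.
SET IDENTITY_INSERT #TargetOrder ON;
INSERT INTO #TargetOrder (Id, CustomerId, Total)
SELECT s.Id, s.CustomerId, s.Total
FROM #SourceOrder AS s
WHERE NOT EXISTS (SELECT 1 FROM #TargetOrder AS t WHERE t.Id = s.Id);
SET IDENTITY_INSERT #TargetOrder OFF;

SELECT Id, Name FROM #TargetCustomer ORDER BY Id;
SELECT Id, CustomerId, Total FROM #TargetOrder ORDER BY Id;
```

Skipping a row because its key already exists is only safe if the two rows really are the same record. If they are different records that happen to share an Id, the skipped data is lost and child rows may end up attached to the wrong parent.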
You could just add an additional field (called DatabaseID, for example) to all the tables in your merged database and add it to the primary keys. This way you can keep the original keys while having unique keys in the merged database, and you can tell which database each row came from. This is what SQL-Hub does; if it's just a one-off job, you can do this with the free trial.
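A minimal sketch of the DatabaseID idea, assuming hypothetical Customer tables from two source databases (modelled here as temp tables #Db1Customer and #Db2Customer) that both contain a row with Id 1:

```sql
-- Hypothetical source tables; both use Id 1 for different customers.
CREATE TABLE #Db1Customer (Id INT PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
CREATE TABLE #Db2Customer (Id INT PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
INSERT INTO #Db1Customer (Id, Name) VALUES (1, N'Alice');
INSERT INTO #Db2Customer (Id, Name) VALUES (1, N'Carol');

-- Merged table: DatabaseID becomes part of the primary key, so the original
-- Id values can be kept unchanged.
CREATE TABLE #MergedCustomer (
    DatabaseID TINYINT      NOT NULL,
    Id         INT          NOT NULL,
    Name       NVARCHAR(50) NOT NULL,
    PRIMARY KEY (DatabaseID, Id)
);

INSERT INTO #MergedCustomer (DatabaseID, Id, Name)
SELECT 1, Id, Name FROM #Db1Customer;

INSERT INTO #MergedCustomer (DatabaseID, Id, Name)
SELECT 2, Id, Name FROM #Db2Customer;   -- no conflict: (2, 1) differs from (1, 1)

SELECT DatabaseID, Id, Name FROM #MergedCustomer ORDER BY DatabaseID, Id;
```

With this design, foreign keys in child tables have to become composite as well (DatabaseID plus the original parent Id), and existing queries that join on Id alone need the extra column added.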
First, determine the highest Id across all of the tables. Say it's less than 10,000,000.
In your second database, update all PKs and FKs by adding 10,000,000.
For example, if your current Id is 500, it should become 10,000,500.
UPDATE MyTable SET Id = Id + 10000000
After you have updated all the Ids, generate an insert script and run it against the first database.
Lastly, set the identity seed of each table to its max Id.
Note that you will have a problem updating the PKs. You need to remove them first before updating. In SQL Server Management Studio, you can generate a script with "Script Primary Keys" set to false. Modify the generated script to remove IDENTITY(1,1), then run it against your temp database.
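The offset approach might look something like this. It is a sketch under assumptions: #FirstCustomer stands in for a table in the first database, #StagingCustomer and #StagingOrder stand in for copies of the second database's tables created without IDENTITY and without primary keys, and the table and column names are made up for illustration.

```sql
-- Table in the first database (keeps its IDENTITY column).
CREATE TABLE #FirstCustomer (Id INT IDENTITY(1,1) PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
INSERT INTO #FirstCustomer (Name) VALUES (N'Alice');
INSERT INTO #FirstCustomer (Name) VALUES (N'Bob');

-- Staging copies of the second database's tables: no IDENTITY and no PK,
-- so the key columns can be updated freely.
CREATE TABLE #StagingCustomer (Id INT NOT NULL, Name NVARCHAR(50) NOT NULL);
CREATE TABLE #StagingOrder    (Id INT NOT NULL, CustomerId INT NOT NULL);
INSERT INTO #StagingCustomer (Id, Name) VALUES (1, N'Carol'), (2, N'Dave');
INSERT INTO #StagingOrder (Id, CustomerId) VALUES (1, 2);

-- Must be larger than the highest Id in any table of the first database.
DECLARE @Offset INT = 10000000;

-- Shift every primary key and every foreign key by the same offset,
-- so relationships stay intact (an Id of 500 would become 10000500).
UPDATE #StagingCustomer SET Id = Id + @Offset;
UPDATE #StagingOrder    SET Id = Id + @Offset, CustomerId = CustomerId + @Offset;

-- Insert the shifted rows, keeping the explicit Id values.
SET IDENTITY_INSERT #FirstCustomer ON;
INSERT INTO #FirstCustomer (Id, Name)
SELECT Id, Name FROM #StagingCustomer;
SET IDENTITY_INSERT #FirstCustomer OFF;

-- Bring the identity value in line with the new maximum Id if it is behind.
-- A permanent table would be referenced by its schema-qualified name here.
DBCC CHECKIDENT ('#FirstCustomer', RESEED);

SELECT Id, Name FROM #FirstCustomer ORDER BY Id;
SELECT Id, CustomerId FROM #StagingOrder ORDER BY Id;
```

The offset has to be chosen once and used for every table, otherwise foreign keys in the second database stop matching the rows they point to.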
First, a conflict of keys indicates that whatever process you are currently using is a poor one.
To correctly merge two databases which are using autogenerated (non-GUID) keys, you need to take several steps. First add a new autogenerated key to the parent table, then import all the data from both tables, rename the old id field to ID_old and rename the new field to the old id name. At this point you can move on to the child tables. You will need to copy to the child tables by joining to the parent table and taking the new id field as the value for the foreign key instead of the one in the existing table. You will need to repeat this process for every foreign key table, and if that table is also a parent table, you will need to add the conversionid field to the table before copying any data, so that you can work all the way down the chain. To do this properly involves a great deal of knowledge of the structure of the database and lots of planning. Do not consider doing this without a good backup of both source databases. It is also best if the process can happen while both databases are in single-user mode.
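As a rough illustration of the re-keying steps described above, here is a sketch with one parent table and one child table. Everything named here is hypothetical: the temp tables stand in for the real databases, and the SourceDb column is an assumption added so that old Ids from the two sources can be told apart during the join. Only rows from the second source are shown; the first source would be loaded the same way with SourceDb = 1.

```sql
-- Hypothetical tables from the second source database.
CREATE TABLE #Db2Customer (Id INT PRIMARY KEY, Name NVARCHAR(50) NOT NULL);
CREATE TABLE #Db2Order    (Id INT PRIMARY KEY, CustomerId INT NOT NULL, Total DECIMAL(10,2) NOT NULL);
INSERT INTO #Db2Customer (Id, Name) VALUES (1, N'Carol'), (2, N'Dave');
INSERT INTO #Db2Order (Id, CustomerId, Total) VALUES (1, 2, 40.00);

-- Merged parent table: Id is the new autogenerated key, ID_old keeps the
-- key the row had in its source database.
CREATE TABLE #MergedCustomer (
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    ID_old   INT          NOT NULL,
    SourceDb TINYINT      NOT NULL,   -- assumption: identifies the source database
    Name     NVARCHAR(50) NOT NULL
);

-- The child table is itself a potential parent, so it also keeps its old key.
CREATE TABLE #MergedOrder (
    Id         INT IDENTITY(1,1) PRIMARY KEY,
    ID_old     INT           NOT NULL,
    SourceDb   TINYINT       NOT NULL,
    CustomerId INT           NOT NULL,  -- holds the NEW customer Id
    Total      DECIMAL(10,2) NOT NULL
);

-- Step 1: import the parent rows; new Ids are generated automatically.
INSERT INTO #MergedCustomer (ID_old, SourceDb, Name)
SELECT Id, 2, Name FROM #Db2Customer;

-- Step 2: import the child rows, joining to the parent on the old key and
-- taking the parent's new Id as the foreign key value.
INSERT INTO #MergedOrder (ID_old, SourceDb, CustomerId, Total)
SELECT o.Id, 2, c.Id, o.Total
FROM #Db2Order AS o
JOIN #MergedCustomer AS c
  ON c.ID_old = o.CustomerId
 AND c.SourceDb = 2;

SELECT Id, ID_old, SourceDb, Name FROM #MergedCustomer ORDER BY Id;
SELECT Id, ID_old, SourceDb, CustomerId, Total FROM #MergedOrder ORDER BY Id;
```

The same pattern would be repeated down the chain: any table referencing orders joins to #MergedOrder on ID_old and SourceDb to pick up the new order Id.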
If you use natural keys and have duplicates, you have a far different problem. All duplicate key records should be moved to a separate table first, and a determination should be made as to which holds the more correct data. In some cases you will find that the natural key is in fact not unique (they rarely are, which is why I almost never use them) and the merged database will need to work with an autogenerated key of some type. This will involve code changes as well as database changes, so it is the option of last resort.
What you often find with natural keys is that the data for each record is different but similar (St. vs. Street in the address). In this case, mark one of the records for insert and then do the insert in two steps: first the records which have no duplicates, then the records in the duplicates table that are marked for insertion. Remember you will have to examine all records in all foreign key tables to determine which to keep and which not to keep. Just throwing out any duplicates is a bad idea, and you will lose data that way, possibly critical data (such as a customer's orders). This is a long, tedious process which will require someone with expertise in the data to make the determinations. As a programmer, you should provide them with a dedup tool that lets them examine all the data for each set of duplicates and choose what to keep and what to get rid of; once everything is marked, it runs a process to insert the records. Remember in your design that for true duplicates, there will be some child tables (such as orders) where the records from both duplicates need to be sent to the database under the record chosen as the one to enter, while for other tables you will want to choose which is correct (address, for instance). So you can see this is a complex process requiring a thorough understanding of the database.
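The two-step insert described above could be sketched like this. The table names, the Email natural key and the MarkedForInsert flag are all hypothetical, and the UPDATE that sets the flag is a stand-in for the decision a person reviewing the data would make.

```sql
-- Hypothetical source tables keyed on a natural key (Email).
CREATE TABLE #Db1Customer (Email NVARCHAR(100) PRIMARY KEY, Address NVARCHAR(100) NOT NULL);
CREATE TABLE #Db2Customer (Email NVARCHAR(100) PRIMARY KEY, Address NVARCHAR(100) NOT NULL);
INSERT INTO #Db1Customer VALUES (N'alice@example.com', N'1 Main St.'), (N'bob@example.com', N'5 Oak St.');
INSERT INTO #Db2Customer VALUES (N'bob@example.com', N'5 Oak Street'), (N'carol@example.com', N'9 Elm Street');

CREATE TABLE #Merged (Email NVARCHAR(100) PRIMARY KEY, Address NVARCHAR(100) NOT NULL);

-- Holding table for every record whose natural key appears in both sources.
CREATE TABLE #Duplicates (
    SourceDb        TINYINT       NOT NULL,
    Email           NVARCHAR(100) NOT NULL,
    Address         NVARCHAR(100) NOT NULL,
    MarkedForInsert BIT           NOT NULL DEFAULT 0
);

INSERT INTO #Duplicates (SourceDb, Email, Address)
SELECT 1, a.Email, a.Address FROM #Db1Customer AS a
WHERE EXISTS (SELECT 1 FROM #Db2Customer AS b WHERE b.Email = a.Email)
UNION ALL
SELECT 2, b.Email, b.Address FROM #Db2Customer AS b
WHERE EXISTS (SELECT 1 FROM #Db1Customer AS a WHERE a.Email = b.Email);

-- Step 1: insert the records that have no duplicates.
INSERT INTO #Merged (Email, Address)
SELECT a.Email, a.Address FROM #Db1Customer AS a
WHERE NOT EXISTS (SELECT 1 FROM #Duplicates AS d WHERE d.Email = a.Email)
UNION ALL
SELECT b.Email, b.Address FROM #Db2Customer AS b
WHERE NOT EXISTS (SELECT 1 FROM #Duplicates AS d WHERE d.Email = b.Email);

-- A reviewer picks which version of each duplicate to keep
-- (hard-coded here purely for illustration).
UPDATE #Duplicates
SET MarkedForInsert = 1
WHERE SourceDb = 2 AND Email = N'bob@example.com';

-- Step 2: insert the duplicates that were marked for insertion.
INSERT INTO #Merged (Email, Address)
SELECT Email, Address FROM #Duplicates WHERE MarkedForInsert = 1;

SELECT Email, Address FROM #Merged ORDER BY Email;
```

This only covers a single table. Child records belonging to the version that was not chosen still need to be reviewed and, where appropriate, re-pointed to the record that was kept.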
If you have a lot of duplicates, they may be cleaning up and adding the data for several months, so a tool is really critical. The people doing this will likely be system users, not database specialists or programmers, as they are the only people who can truly make the judgement, most of the time, as to which record to keep. You will likely need to do something similar in any event, as there may be records which are duplicates even when you have an autogenerated key. They are just more difficult to find.
There is no easy way to merge two databases (even using GUIDs, you have the problem of duplicates in the natural key).
I know this is an old topic, but I have to comment on the general approach I see in many posts, which is trying to do everything natively using SQL queries. What such solutions have in common is the fairly large amount of time that needs to be spent on creating and testing a query before applying it.
So yes, you can merge two databases natively using relatively complex queries, but you can save yourself a ton of time and use third-party tools for free (most or all have a fully functional free trial).
There are a ton of these on the market. Red Gate, already mentioned in another post, is one of the best, but you can also try ApexSQL Data Diff, dbForge, SQL Comparison toolset and many others.
Best bet would probably be going with a 3rd party application such as RedGate SQL Data Compare. Costs some money, but it's worth it over writing that script IMO.