当前位置：文江博客话题详情

无论主键如何，如何检查两个关系数据库是否相同？

发布于 2024-09-27 19:13:26 字数 1017 浏览 4 评论 0原文

我有一个大约有 100 个表的关系数据库。每个表都有唯一的数字主键和合成值，并且有许多链接表的外键。表不大（数十或数百或记录）。这是一个 SQLite 数据库。

出于测试目的，我需要通过 Linux 脚本比较数据库的两个副本（可以使用简单的 bash 脚本、perl、diff、sed）。我需要验证两个数据库的记录数量是否相同并且记录具有相同的内容并转储差异。问题是，只要关系相同，键的值就可以不同。

例如：

有一个表“country”，其主键“ix_country”和“name”，以及一个表“customer”，其字段“name”、主键“ix_customer”和外键“ix_country” 。

这两个数据库是相等的：第一个数据库：

country:name=“USA”ix_country=1；客户：name=“Joe”ix_customer=10 ix_country=1

第二个数据库：

国家/地区：name=“USA”ix_country=1771； customer: name="Joe" ix_customer=27 ix_country=1771

两个副本具有相同的结构。

有没有简单的方法可以做到这一点？

更新：

还有一项要求 - 脚本必须能够抵御结构的变化。如果添加或删除表或字段，它必须起作用。

更新2：

我开始自己解决这个问题。一般策略是编写一个创建“身份映射”文件的 SQL 脚本。该映射包含每个记录的主键值（“人工身份”）和“自然身份”键（唯一标识该记录的字符串）。对于数据库中的某些表，有一个唯一的自然 id 键（如我的示例中的国家名称）。其他表需要序列中的序数，还有一些表将其自身的身份与父项中的身份结合起来（如果父项也有父项，则可能是递归的）。

所有记录都由第二个 SQL 脚本以识别人工身份的格式转储到第二个文本文件。

perl 脚本将第二个文件中的所有人工身份替换为地图中的自然身份。

对结果进行排序和差异。

原文

I have a relational database with about 100 tables. Each table has unique, numerical, primary key with synthetic values and there are many foreign keys which link the tables. The tables are not big (tens or hundreds or records). This is a SQLite database.

I need, for testing purposes, to compare two copies of the database by a linux script (simple bash scripts, perl, diff, sed are available). I need to validate that the number of records of both databases is the same and that the records have the same content and to dump the differences. The problem is, that the values of the keys are allowed to be different as far as the relations are the same.

For example:

There is a table "country" with primary key "ix_country" and "name" and a table "customer" with fields "name", primary key "ix_customer" and foreign key "ix_country".

These two databases are equal: first database:

country: name="USA" ix_country=1; customer: name="Joe" ix_customer=10 ix_country=1

second database:

country: name="USA" ix_country=1771; customer: name="Joe" ix_customer=27 ix_country=1771

Both copies have the same structure.

Is there an easy way to do this?

Update:

One more requirement - the script must be robust against changes in the structure. It must work if a table or a field is added or deleted.

Update 2:

I started to work on the problem myself. The general strategy is to write a SQL scripts which creates "identity map" file. The map contains for each record its primary key value ("artificial identity") and "natural identity" key - a string which uniquely identifies the record. For some tables in the database, there is an unique natural id key (like contry name in my example). Other tables require ordinal number in a sequence and still others combine its own identity with identity in parent (maybe recursively if the parent has also a parent).

All records are dumped to second text file by a second SQL script in a format which identifies the artificial identities.

The a perl script replaces all artificial identities in the second file with their natural identities from the map.

The the result is sorted and diffed.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

荒岛晴空 2024-10-04 19:13:26

有一个简单的方法可以做到这一点

不。这需要编程工作。

回复收藏 0 原文

忆悲凉 2024-10-04 19:13:26

如果数据库非常简单，那么在命令行上运行一个查询，转储所有正确格式化的数据，没有 id，正确排序并与 diff 进行比较，可以让您获得很大的帮助。

例如，

sqlite3 test.db 'CREATE TABLE Country (id  integer, name varchar(20))'
sqlite3 test.db 'CREATE TABLE Customer (id  integer, name varchar(20), country integer)'
sqlite3 test.db 'insert into country values (1, "USA")'
sqlite3 test.db 'insert into country values (2, "Belgium")'
sqlite3 test.db 'insert into customer values (1, "Joe", 1)'
sqlite3 test.db 'insert into customer values (1, "Peter", 2)'

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name'

Peter|Belgium
Joe|USA

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name' >db1.txt

在 bash 脚本中执行最后一个查询，在两个数据库上运行它并比较这两个文件将为您提供不同的客户，而无需编程。

当数据模型更加复杂时，这当然会失败。

If the database is pretty simple, running a query on the commandline which dumps all data properly formatted, without the ids, properly sorted and comparing with diffcould get you a large way.

e.g.

sqlite3 test.db 'CREATE TABLE Country (id  integer, name varchar(20))'
sqlite3 test.db 'CREATE TABLE Customer (id  integer, name varchar(20), country integer)'
sqlite3 test.db 'insert into country values (1, "USA")'
sqlite3 test.db 'insert into country values (2, "Belgium")'
sqlite3 test.db 'insert into customer values (1, "Joe", 1)'
sqlite3 test.db 'insert into customer values (1, "Peter", 2)'

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name'

Peter|Belgium
Joe|USA

sqlite3 test.db 'select cust.name, c.name from customer cust, country c where cust.country = c.id order by c.name, cust.name' >db1.txt

doing the last query in a bash script, running it on both db's and diffing the 2 files will give you the different customers without programming.

This breaks down of course when the datamodel is more convoluted.

回复收藏 0 原文