第一个具有引用完整性约束的数据库——建议、反馈、错误?
TARGET_RDBMS: MySQL-5.X-InnoDB(“X”等于当前稳定版本)
背景:构建我的第一个具有真正引用完整性约束的数据库,以努力获得反馈,在创建“真正的”DDL之后,我做了一个抽象,我相信它涵盖了数据库的“感觉”;这只是 3 个表,每张表大约 20 个,所有表都具有引用完整性约束;我发现唯一缺少的模式是复合键表,无论如何,它现在没有要转储的数据,所以我只关注第一次迭代。
样本数据/单元测试:我不知道的一件事是如何构建一个样本数据集,该数据集将提供 100% 的引用完整性建模覆盖率 - 并围绕该样本构建“单元测试”数据和此 DDL:
示例 DLL:(
注意:需要明确的是,图例和命名标准仅适用于本示例,我从“真实”数据库中抽象出该示例。列名称本质上是机器人名称,旨在使给定实例的含义和关系尽可能清晰。如果您对所使用的符号系统有任何建议,请随时发表评论,谢谢!)
CREATE DATABASE sampleDB;
use sampleDB;
# ###############
# LEGEND
# - sID = surrogate key
# - nID = natural key
# - cID = common/shared across tables, but NOT unique/natural-key
# - PK = Primary Key
# - FK = Foreign Key
# - data01 = Sample data (non-key,not-shared-across-tables)
# - data02 = Sample data NOT NULL (non-key,not-shared-across-tables)
#
# - uID = user defined unique/natural key (NOTE: not used)
# ###############
# Behavior
# - create_timestamp (NOT NULL, updated on record creation, NOT update)
# - update_timestamp (NOT NULL, updated on record creation AND updates)
CREATE TABLE `TABLE_01` (
`TABLE_01_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_01_cID` int(8) NOT NULL,
`TABLE_01_data01` varchar(128) default NULL,
`TABLE_01_data02` varchar(128) default NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_02` (
`TABLE_02_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_02_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
`TABLE_02_cID` int(8) NOT NULL,
`TABLE_02_data01` varchar(128) default NULL,
`TABLE_02_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_02_sID_PK`),
FOREIGN KEY (TABLE_02_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
INDEX `TABLE_02_nID_FK__TABLE_01_sID_PK` (`TABLE_02_nID_FK__TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_03` (
`TABLE_03_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_03_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
`TABLE_03_nID_FK__TABLE_02_sID_PK` int(8) NOT NULL,
`TABLE_03_cID` int(8) NOT NULL,
`TABLE_03_data01` varchar(128) default NULL,
`TABLE_03_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_03_sID_PK`),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_02_sID_PK) REFERENCES TABLE_02(TABLE_02_sID_PK),
INDEX `TABLE_03_nID_FK__TABLE_01_sID_PK` (`TABLE_03_nID_FK__TABLE_01_sID_PK`),
INDEX `TABLE_03_nID_FK__TABLE_02_sID_PK` (`TABLE_03_nID_FK__TABLE_02_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SHOW TABLES;
# DROP DATABASE `sampleDB`;
# #######################
# View table definition
# DESC inserttablename;
# #######################
# View table create statement
# SHOW CREATE TABLE example;
问题:
有关缺失的任何和所有反馈。欢迎使用错误的或“更好”的方法来构建数据库。如果您有疑问,请发表评论 - 我会尽快回复。再次感谢~!
更新(1):
刚刚在 PK 中添加了“MEDIUMINT NOT NULL AUTO_INCRMENT”——不知道我是怎么把它留下的。
TARGET_RDBMS: MySQL-5.X-InnoDB ("X" equals current stable release)
BACKGROUND: Building my first database with true referential integrity constraints, in an effort to get feedback, after creating the "real" DDL, I've made an abstraction that I believe covers the "feel" of the database; this is only 3 tables of about 20, all with referential integrity constraints; only pattern I see that is missing is a composite key table, which does not have data to be dumped in right now anyway, so I'm just focus on the first iteration.
Sample Data / Unit Test: One thing I do not know is how to build out a sample data set that will offer 100% coverage of the referential integrity modeled -- AND build "Unit Test" around that sample data and this DDL:
Sample DLL:
(Note: Just to be clear, the LEGEND and naming standards are JUST for this example, which I've abstracted from the "real" database. The column names are robotic in nature, and meant to make the meaning and relationship of a given instance as clear as possible. If you have suggestions on the notation system used, please feel free to comment. I'm open to any suggestions. Thanks!)
CREATE DATABASE sampleDB;
use sampleDB;
# ###############
# LEGEND
# - sID = surrogate key
# - nID = natural key
# - cID = common/shared across tables, but NOT unique/natural-key
# - PK = Primary Key
# - FK = Foreign Key
# - data01 = Sample data (non-key,not-shared-across-tables)
# - data02 = Sample data NOT NULL (non-key,not-shared-across-tables)
#
# - uID = user defined unique/natural key (NOTE: not used)
# ###############
# Behavior
# - create_timestamp (NOT NULL, updated on record creation, NOT update)
# - update_timestamp (NOT NULL, updated on record creation AND updates)
CREATE TABLE `TABLE_01` (
`TABLE_01_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_01_cID` int(8) NOT NULL,
`TABLE_01_data01` varchar(128) default NULL,
`TABLE_01_data02` varchar(128) default NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_02` (
`TABLE_02_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_02_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
`TABLE_02_cID` int(8) NOT NULL,
`TABLE_02_data01` varchar(128) default NULL,
`TABLE_02_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_02_sID_PK`),
FOREIGN KEY (TABLE_02_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
INDEX `TABLE_02_nID_FK__TABLE_01_sID_PK` (`TABLE_02_nID_FK__TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `TABLE_03` (
`TABLE_03_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
`TABLE_03_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
`TABLE_03_nID_FK__TABLE_02_sID_PK` int(8) NOT NULL,
`TABLE_03_cID` int(8) NOT NULL,
`TABLE_03_data01` varchar(128) default NULL,
`TABLE_03_data02` varchar(128) NOT NULL,
`create_timestamp` DATETIME DEFAULT NULL,
`update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`TABLE_03_sID_PK`),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
FOREIGN KEY (TABLE_03_nID_FK__TABLE_02_sID_PK) REFERENCES TABLE_02(TABLE_02_sID_PK),
INDEX `TABLE_03_nID_FK__TABLE_01_sID_PK` (`TABLE_03_nID_FK__TABLE_01_sID_PK`),
INDEX `TABLE_03_nID_FK__TABLE_02_sID_PK` (`TABLE_03_nID_FK__TABLE_02_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SHOW TABLES;
# DROP DATABASE `sampleDB`;
# #######################
# View table definition
# DESC inserttablename;
# #######################
# View table create statement
# SHOW CREATE TABLE example;
Questions:
Any and all feedback on missing, wrong, or "better" ways to do this database build are welcome. If you have questions, just comment -- and I'll respond ASAP. Again, thanks~!
UPDATE (1):
Just added "MEDIUMINT NOT NULL AUTO_INCREMENT" to the PKs -- not sure how I left that off.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
首先,我要赞扬您定义了标准。未来它将给你带来无穷无尽的帮助。
话虽如此,我有一些非常主观的意见:
我不喜欢在名称中嵌入类型信息,例如“TABLE_PERSON”或“PERSON_T”,因为当您用视图替换表时,它会变得令人困惑。此时,您当然可以搜索“PERSON_T”并将其替换为“PERSON_VW”,但这有点错过了重点:)
列也是如此(尽管我在您的示例中看不到这一点)。想想“n_is_dead”列从数字更改为 varchar。
表中是否可以存在未创建的行(create_timestamp)?如果列确实不能为空,请将它们声明为 NOT NULL。事实上,我开始在大多数列上使用 NOT NULL,因为它让我更加深入地思考数据的性质。
我喜欢将主键列命名为 ID 以外的名称。例如,
我听说有些人对 O/R 映射器有问题,希望您始终将主键命名为“ID”,但我不知道这是否仍然如此,或者最近是否发生了变化。
我不清楚您是否打算在列名称中嵌入 (s,n,c) 以指示它们是代理键、自然键还是公用键。但我也不认为这是一个好主意。我觉得这会“揭示”一些不自然地适合逻辑模型的实现细节。
看起来您正在公开/嵌入列名称中的外键关系。我从来没有想过这一点,但我想你会为此深深后悔。不仅仅是因为它使列名变得难以忍受的丑陋:)
在为索引选择名称时。我唯一后悔为索引命名的时候是当我查看执行计划并看到正在使用“index_01”时。我总是希望将列名放入索引中以使其在 xplan 中可见。我不知道索引名称的限制,但我总是遇到 Oracle 的限制。因此,尝试制定一些如何缩写表名的规则。列名在这里很重要。
关于混合情况。我总是(无一例外)选择 ALL_UPPER_CASE 或 all_lower_case。原因是,过去在数据库之间迁移查询时,当它们以不同的方式处理大小写时,我感到很恼火。最近,我使用 all_lower_case,因为我们编辑器的典型字体使小写字母比大写字母更容易发现拼写错误。当我做事失败时,编辑似乎并没有对我大喊大叫;)
First of all, I want to applaud you for defining a standard. There is no end to how much it will come to help you in the future.
Having said that, a couple of very subjective opinions from my part:
I don't like to embed type information in names, such as "TABLE_PERSON" or "PERSON_T" because it becomes confusing the second you replace a table with a view instead. At this point you could of course search and replace "PERSON_T" with "PERSON_VW" instead, but it kind of misses the point :)
The same goes for columns (although i can't see this in your example). Think of the "n_is_dead" column that gets changed from numeric to varchar.
Can a row exist in a table without being created (create_timestamp)? Declare columns as NOT NULL if they really can't be null. In fact, I start of having NOT NULL on most of my columns because it makes me think harder about the nature of the data.
I'm a fan of naming the primary key column something other than ID. For example
I've heard some people have problems with O/R mappers that want you to have the primary key named "ID" at all times, but I don't know if this is still true of if this has changed recently.
It's not clear to me if you intented to embed (s,n,c) in the column names to indicate whether they are surrogate, natural or common key. But I also don't think this is a good idea. I feel that would "reveal" some implementation detail that doesn't fit naturally in the logical model.
It looks like you are exposing/embedding the foreign key relationship in the column names. I have never thought of this, but I think you will deeply regret this one. If not only because it makes the column names unbearably uggly :)
When choosing a name for an index. The only time I regret naming an index something is when I look at an execution plan and see "index_01" being used. I always wish I had put the column name in the index to make it visible in the xplan. I don't know the limit for an index name, but I always run into the limit on Oracle. So, try to come up with some rule for how to abbreviate the table name. The column name is the important thing here.
Regarding mixed case. I always (no exceptions) go with either ALL_UPPER_CASE or all_lower_case. The reason is that in the past I've been burned when migrating queries between databases when they treat case differently. Lately, I use all_lower_case because the typical font of our editors makes it easier to spot spelling errors in lower case than in upper case. And when I fail at things, it doesn't seem like the editor is SHOUTING AT ME ;)