DB Fk/Pk 密钥性能
在我们的数据库中,我们有一个中心表,其中包含数百万行,并且不断插入和更新。 该表具有充当唯一标识符的单列,用于将该表的内容与具有一对多关系的多个表链接。
这意味着当我们向 USERS 表插入条目时,在同一事务中, USERS_PETS 和 USERS_PARENTS (以及另外 10 个)也将根据主表中相同的唯一标识符填充多行。
由于使用此数据库的应用程序不断插入新条目并更新现有条目,因此这些表之间的关系仅保留在应用程序级别(即逻辑 ERD 而不是通过 FK/PK 减速来处理此问题)。
问题:
- 从纯粹性能的角度来看,这是最好的方法,这种假设是否正确?
- 有没有办法在不影响性能的情况下设置这些键(以便数据库更具自我描述性)?
In our DB we have a single centric table with millions of rows that is constantly being inserted and updated.
This table has a single column acting as the unique identifier and is used to link the content of this table with mutliple tables with a one to many relation.
This means that wehn inserting entry to, say, USERS table, in the same transaction also USERS_PETS and USERS_PARENTS (and 10 more) will be populated, with multiple rows, based on the same unique identifier from the main table.
Since the application using this DB is constantly inserting new entries and updating existing ones the relation between these tables is kept only at the application level (i.e. logical ERD instead of handling this via FK/PK decelrations).
Questions:
- Is this correct to assume that from pure performnces point of view, this is the best approach?
- Is there a way to set these keys (so that the DB will be more self descriptive) without impacting performaces?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
这是最糟糕的方法,我保证您最终会遇到数据完整性问题。数据完整性比性能更重要。这是愚蠢和短视的。
This is the worst possible approach and I guarantee you will have data integrity issues eventually. Data integrity is far more critical than performance. This is stupid and short-sighted.
DISABLE NOVALIDATE
.您将数据完整性建立在希望之上。希望不能很好地扩展。
而且不存在“纯粹的性能观点”这样的东西。除非,也就是说,您从未从数据库中读取数据。如果您只插入,从不更新,从不删除,并且从不读取,那么您可以证明存在“纯粹的性能观点”。但是,如果您更新、删除或读取,那么性能就不再是重点——它更像是一个表面或一个实体,您所能做的就是在插入之间移动平衡点,更新、删除和读取。
而且,由于阅读本文的人仍然无法理解,因此阅读性能的最关键部分是返回正确答案。如果你不能保证正确的答案,明智的人不会关心你的插入速度有多快。
You base data integrity on hope. Hope doesn't scale well.
And there's no such thing as "pure performance point of view". Unless, that is, you never read from the database. If you only insert, never update, never delete, and never read, you can make a case that there exists a "pure performance point of view". But if you ever update, delete, or read, then performance isn't a point--it's more like a surface or a solid, and all you can do is move the balancing point around among inserts, updates, deletes, and reads.
And, because somebody reading this still won't get it, the most critical part of read performance is getting back the right answer. If you can't guarantee the right answer, sensible people won't care how marginally faster your inserts are.