实现关系数据模型:该表有哪些约束?

发布于 2024-12-12 05:17:13 字数 1031 浏览 2 评论 0原文

我试图提出一个关系模型和数据库实现,但不断遇到这个问题。但我什至不知道该叫什么!改进标题的建议将不胜感激。

我试图将问题归结为基本问题。


简化示例:

这是 MySQLWorkbench 图表:

MySQLWorkbench data model


正如您所看到的,问题都与 <代码>原子表

期望目标概述:

  • 我需要存储特定蛋白质原子的数据
  • 我可能有某些原子的数据,或者没有任何原子的数据
  • 我不希望能够插入垃圾数据 - - 我希望数据库约束能够防止这种

情况发生,我不确定:

  • 是否应该有一个 Atom 表 - 它似乎是 Atom_ref 之间的联接和Residue将生成所有原子蛋白质 - 但我还需要存储有关原子的数据

问题概要:

  • 每个原子需要一个残基和一个atom_ref,
  • 但由于残基与残基_ref相关联,所以atom_ref只能是一个相关的(与residue_ref)atom_ref的
  • 不知道如何将残基的residue_ref与atom_ref的residual_ref相匹配

到目前为止我尝试过的:

  • 添加pk将 Atom_ref 更改为 Atom 的 pk - 但是,residue_ref 可能与 Residue
  • 更改 Residue.rr_name 的不匹配到 pk 的一部分——违反了域语义

我知道这是对问题的一个糟糕的解释,我正在尝试找出如何更清楚地解释它!非常欢迎提出改进建议!

I'm trying to come up with a relational model and database implementation, but keep running into this problem. But I don't even know what to call it! Suggestions for improving the title would be appreciated.

I've tried to boil the problem down to its basics.


Simplified example:

Here's the MySQLWorkbench diagram:

MySQLWorkbench data model


As you can see, the problems are all with the Atom table.

Outline of desired goal:

  • I need to store data for specific atoms of proteins
  • I may have data for some of the atoms, or none of the atoms
  • I don't want to be able to insert junk data -- I'd like for the database constraints to prevent this

What I'm not sure about:

  • whether there should be an Atom table -- it seems like a join between Atom_ref and Residue would generate all the atoms of the protein -- but I also need to store data about the atoms

Outline of problem:

  • each atom needs a residue, and an atom_ref
  • but since the residue is associated with a residue_ref, the atom_ref can only be one of the associated (with the residue_ref) atom_ref's
  • don't know how to match the residue_ref of the Residue with the residue_ref of the atom_ref

What I've tried so far:

  • add the pk of Atom_ref to the pk of Atom -- but then the residue_ref might not match that of Residue
  • change Residue.rr_name to part of the pk -- violates domain semantics

I know this is a poor explanation of the problem, I'm trying to figure out how to explain it more clearly! Suggestions for improvement are more than welcome!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

尸血腥色 2024-12-19 05:17:13

如果我理解正确,那么您所追求的是 (a) Atom 和 Residue 的连接以及 (b) Atom_ref 之间的包含依赖关系。 (即,Atom 中的所有原子名称,与残差中为其定义的 rr_name 相结合,必须声明为有效组合,即必须出现在 Atomref 中)。

仅使用 RI/FK 来实现此目的的方法是将 rr_name 冗余地包含在 Atom 中。将 FK 从 Atom 扩展到 Residue 到所有三列。这将保证Atom中记录的rr_names与Residue中的信息保持一致。但是由于您现在已经在 Atom 中引入了 rr_name,因此您现在可以确保(通过从 Atom 到 Atom_ref 的 FKatomname+rrname)Atom 中记录的任何内容也与已声明的原子名称一致(在 Atomref 中)存在于所涉及的残基引用中。

请注意,此“解决方案”使更新数据库变得更加困难(需要维护更多冗余,从而产生更多违规可能性),因为您刚刚降低了设计的 NF 级别。

另一种方法是保持设计不变,并通过对每个涉及的表(其中更新可能会导致违反业务规则)的适当触发器来强制执行约束。也就是说,对 Atom_ref 进行删除和更新(即任何导致有效存在于某处的有效组合消失的任何事情),对 Residue 进行(rr_name 的)更新,以及对 Atom 进行插入和更新(即可能导致出现某些组合的任何事情)的组合可能无效)。

If I understand you correctly, then what you're after is an inclusion dependency between (a) the join of Atom and Residue, and (b) Atom_ref. (i.e. all atomnames in Atom, combined with the rr_name defined for it in residue, must be declared as being valid combinations, i.e. must appear in Atomref).

The way to do it using merely RI/FK, is to include rr_name in Atom, redundantly. Extend the FK from Atom to Residue to all three columns. This will guarantee you that the rr_names recorded in Atom remain consistent with the information in Residue. But since you have now introduced rr_name in Atom, you now have the means to ensure (through an FK atomname+rrname from Atom to Atom_ref) that whatever is recorded in Atom, is also consistent with the atomnames which have been declared (In atomref) to exist for the residueref involved.

Note that this "solution" makes updating your database harder (more redundancy to maintain, giving rise to more potential for violations), because you have just lowered the NF level of your design.

The other way to do it is to leave your design as is, and enforce the constraint through appropriate triggers on every involved table where an update could cause a violation of your business rule. That would be, deletes and updates on Atom_ref (i.e. anything that causes the disappearing of a valid combination that effectively exists somewhere), updates (of rr_name) on Residue, and inserts and updates on Atom (i.e. anything that might cause the appearance of some combination that might not be valid).

〃温暖了心ぐ 2024-12-19 05:17:13

在蛋白质内,残基名称是否唯一? IE,您可以为 Residue (p_id, rr_name) 创建唯一约束吗?

如果是这样,那么在 Atom 中,您可以将 r_index 替换为 rr_name;在 (p_id,rr_name) 上创建一个到 Residue 的 FK;并在 (rr_name,atom_name) 上创建一个到 Atom_ref 的 FK。

编辑:是的,我想事情可能不会那么简单。我认为你的第二个要点是正确的方向 - 但不要改变 Residue 的 PK,只需在所有三列上创建一个新的唯一约束。这两个约束甚至可以共享一个索引。然后,您可以将 rr_name 添加到 Atom,并为 Residue 指定一个三列 FK,为 Atom_ref 指定一个两列 FK

Within a protein, are the residue names unique? I.E., could you create a unique constraint on Residue (p_id, rr_name)?

If so, then in Atom you could replace r_index with rr_name; create an FK on (p_id,rr_name) to Residue; and create an FK on (rr_name, atom_name) to Atom_ref.

Edit: Yeah, I figured it probably wouldn't be that simple. I think your second bullet point is sort of the right direction -- but don't change the PK of Residue, just create a new unique constraint on all three columns. The two constraints may even be able to share one index. Then you can add rr_name to Atom, and give it a three-column FK to Residue and a two-column FK to Atom_ref.

优雅的叶子 2024-12-19 05:17:13

原子和atom_ref有什么区别?查看您的表结构,看起来 Residue_ref 和 A​​tom_ref 似乎是表达原子和残基之间的多对多关系。 (抱歉,我从未学过生物化学,所以我可能错过了一些重要的东西)。

在特定残基的上下文中是否存在特定于原子的数据?如果答案是否定的,我会将 Residue_ref 和 A​​tom_ref 表缩减为单个表:atoms_residues,如下所示:

atoms_residue(r_index, atom_index)

这将确保原子表中的特定原子与残基表相关。

如果存在原子-残基配对的上下文信息,我仍然会创建上面的表并考虑将上下文信息作为列添加到该表中。

关键是封装。特定于原子的数据应位于原子表中。残留物特定数据应位于残留物表中。连接表表达了两者之间的多对多关系,并且还可以选择包含该关系特有的任何信息。

What is the difference between atom and atom_ref? Looking at your table structure, it appears as if the Residue_ref and Atom_ref are to express the many-to-many relationships between Atoms and Residues. (Sorry, I never took biochem, so I may be missing something important).

Is there data specific to an Atom in context of a specific Residue? If the answer is no, I would reduce the Residue_ref and Atom_ref tables down to a single table: atoms_residues that looks like this:

atoms_residue(r_index, atom_index)

This would ensure that a specific atom from the atom table appears in relation to the residue table.

If there IS context information for the atom-residue pairing, I would still create the above table and consider adding the context information as columns to that table.

The key is encapsulation. Atom-specific data should be in the atom table. Residue-specific data should be in the Residue table. A join table expresses the many-many relationship between the two and can optionally also contain any information unique to that relationship.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文