It's a good idea to choose a data structure that makes common questions that you ask of the model easy to answer. It's most likely that you're interested in the current position most of the time. On occasion, you will want to drill into the history for particular problems and solutions.
I would have tables for problem, solution, and relationship that represent the current position. There would also be problem_history, solution_history, etc. tables. Each of these would be a child table of its current-position table (problem_history of problem, and so on), with extra columns for VersionNumber and EffectiveDate. The key would be (ProblemId, VersionNumber).
When you update a problem, you would write the old values into the problem_history table. Point in time queries are therefore possible as you can pick out the problem_history record that is valid as-at a particular date.
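To make the write-old-values-then-update step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names, the choice to keep version_number and effective_date on the problem table as well, and the helper names update_problem and problem_as_at are my assumptions for illustration, not part of the design described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE problem (
    problem_id     INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    description    TEXT,
    version_number INTEGER NOT NULL DEFAULT 1,
    effective_date TEXT NOT NULL   -- ISO date this version took effect (assumed)
);
CREATE TABLE problem_history (
    problem_id     INTEGER NOT NULL REFERENCES problem(problem_id),
    version_number INTEGER NOT NULL,
    name           TEXT NOT NULL,
    description    TEXT,
    effective_date TEXT NOT NULL,
    PRIMARY KEY (problem_id, version_number)
);
""")

def update_problem(conn, problem_id, name, description, effective_date):
    """Archive the current row, then overwrite it with the new values."""
    with conn:  # one transaction: both statements commit or neither does
        conn.execute(
            """INSERT INTO problem_history
                   (problem_id, version_number, name, description, effective_date)
               SELECT problem_id, version_number, name, description, effective_date
               FROM problem WHERE problem_id = ?""",
            (problem_id,),
        )
        conn.execute(
            """UPDATE problem
               SET name = ?, description = ?,
                   version_number = version_number + 1, effective_date = ?
               WHERE problem_id = ?""",
            (name, description, effective_date, problem_id),
        )

def problem_as_at(conn, problem_id, as_at):
    """Return (version_number, name) valid on the given ISO date, or None."""
    current = conn.execute(
        "SELECT version_number, name FROM problem "
        "WHERE problem_id = ? AND effective_date <= ?",
        (problem_id, as_at),
    ).fetchone()
    if current is not None:
        return current  # the current row was already in effect on that date
    # Otherwise take the newest archived version that had taken effect by then.
    return conn.execute(
        "SELECT version_number, name FROM problem_history "
        "WHERE problem_id = ? AND effective_date <= ? "
        "ORDER BY version_number DESC LIMIT 1",
        (problem_id, as_at),
    ).fetchone()

with conn:
    conn.execute(
        "INSERT INTO problem (problem_id, name, description, effective_date) "
        "VALUES (1, 'Slow login', 'Takes 10s', '2024-01-01')"
    )
update_problem(conn, 1, "Slow login page", "Takes 10s on first load", "2024-03-01")

print(problem_as_at(conn, 1, "2024-02-15"))  # (1, 'Slow login')
print(problem_as_at(conn, 1, "2024-03-10"))  # (2, 'Slow login page')
```

Everyday queries only ever touch problem; the history table is read only when you actually ask an as-at question.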
Where I've done this before, I have also created a view to UNION problem and problem_history as this is sometimes useful in various queries.
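A rough sketch of such a view, again with Python's sqlite3. The view name problem_all_versions and the is_current flag are hypothetical, and the tables are trimmed to a few assumed columns. I use UNION ALL rather than plain UNION here because a current row and a history row never share a (problem_id, version_number) pair, so no de-duplication is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE problem (
    problem_id     INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    version_number INTEGER NOT NULL,
    effective_date TEXT NOT NULL
);
CREATE TABLE problem_history (
    problem_id     INTEGER NOT NULL REFERENCES problem(problem_id),
    version_number INTEGER NOT NULL,
    name           TEXT NOT NULL,
    effective_date TEXT NOT NULL,
    PRIMARY KEY (problem_id, version_number)
);

-- Hypothetical view: every version of every problem, current or archived.
CREATE VIEW problem_all_versions AS
    SELECT problem_id, version_number, name, effective_date, 1 AS is_current
    FROM problem
    UNION ALL
    SELECT problem_id, version_number, name, effective_date, 0 AS is_current
    FROM problem_history;

INSERT INTO problem VALUES (1, 'Slow login page', 3, '2024-05-01');
INSERT INTO problem_history VALUES (1, 1, 'Slow login', '2024-01-01');
INSERT INTO problem_history VALUES (1, 2, 'Slow login (v2)', '2024-03-01');
""")

# Full timeline for one problem, oldest version first.
rows = conn.execute(
    "SELECT version_number, name, is_current FROM problem_all_versions "
    "WHERE problem_id = ? ORDER BY version_number",
    (1,),
).fetchall()
for row in rows:
    print(row)
```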
Option 1 makes it difficult to query the current situation, as all your historic data is mixed in with your current data.
Option 3 is going to be bad for query performance and nasty to code against as you'll be accessing lots of rows for what should just be a simple query.
I suppose there's Option 4: the hybrid
Move the common Thing attributes into a single-inheritance table, then add a custom_attributes table. This makes foreign keys simpler, reduces duplication, and allows flexibility. It doesn't solve the problem of type safety for the additional attributes. It also adds a little complexity, since there are now two ways for a Thing to have an attribute.
If description and other large fields stay in the Things table, though, it also doesn't solve the duplication-space problem.
table things
int id | int type | string name | text description | datetime created_at | other common fields...
foreign key type -> thing_types.id
table custom_attributes
int id | int thing_id | string name | string value
foreign key thing_id -> things.id
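Here is a small sketch of how that schema might look in practice, using Python's sqlite3. The thing_types contents, the UNIQUE (thing_id, name) constraint, and the load_thing helper are assumptions of mine. It also shows the type-safety gap mentioned above: every custom value comes back as a string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE thing_types (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE things (
    id          INTEGER PRIMARY KEY,
    type        INTEGER NOT NULL REFERENCES thing_types(id),
    name        TEXT NOT NULL,
    description TEXT,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE custom_attributes (
    id       INTEGER PRIMARY KEY,
    thing_id INTEGER NOT NULL REFERENCES things(id),
    name     TEXT NOT NULL,
    value    TEXT,
    UNIQUE (thing_id, name)  -- assumption: one value per attribute per thing
);

INSERT INTO thing_types (id, name) VALUES (1, 'problem'), (2, 'solution');
INSERT INTO things (id, type, name, description)
    VALUES (1, 1, 'Slow login', 'Login takes 10 seconds');
INSERT INTO custom_attributes (thing_id, name, value)
    VALUES (1, 'severity', '3');
""")

def load_thing(conn, thing_id):
    """Merge the common columns and the custom attributes into one dict."""
    row = conn.execute(
        "SELECT t.id, tt.name, t.name, t.description "
        "FROM things t JOIN thing_types tt ON tt.id = t.type "
        "WHERE t.id = ?",
        (thing_id,),
    ).fetchone()
    if row is None:
        return None
    thing = {"id": row[0], "type": row[1], "name": row[2], "description": row[3]}
    attrs = conn.execute(
        "SELECT name, value FROM custom_attributes WHERE thing_id = ?",
        (thing_id,),
    ).fetchall()
    thing["custom"] = dict(attrs)  # values are plain strings, e.g. '3' not 3
    return thing

thing = load_thing(conn, 1)
print(thing["type"], thing["name"], thing["custom"])
```

Note that the caller has to know that severity is meant to be a number and convert it; the database will not enforce that.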
What do you think about this:
table problems
int id | string name | text description | datetime created_at
table problems_revisions
int revision | int id | string name | text description | datetime created_at
foreign key id -> problems.id
Before each update you have to perform an additional insert into the revisions table. This extra insert is fast, but it is the price you pay for:
- efficient access to the current version: select from problems as usual
- a schema that is intuitive and close to the reality you want to model
- joins between tables in your schema that stay efficient
Using one revision number per business transaction, you can version table records the way SVN versions files.
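A minimal sketch of the insert-before-update step with one revision number per business transaction, using Python's sqlite3. The helpers next_revision and update_problems are hypothetical, and deriving the revision from MAX(revision) is a simplification; with several versioned tables you would probably want a shared counter instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE problems (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    description TEXT,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE problems_revisions (
    revision    INTEGER NOT NULL,
    id          INTEGER NOT NULL REFERENCES problems(id),
    name        TEXT NOT NULL,
    description TEXT,
    created_at  TEXT NOT NULL,
    PRIMARY KEY (revision, id)
);
INSERT INTO problems (id, name, description) VALUES
    (1, 'Slow login', 'Takes 10s'),
    (2, 'Broken search', 'No results');
""")

def next_revision(conn):
    """Simplified revision counter (assumption): highest stored revision + 1."""
    row = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) + 1 FROM problems_revisions"
    ).fetchone()
    return row[0]

def update_problems(conn, changes):
    """Apply {id: (name, description)} as one business transaction.

    Every row touched is first copied into problems_revisions under the
    same revision number, then updated in place.
    """
    with conn:  # commit everything together, or roll everything back
        revision = next_revision(conn)
        for problem_id, (name, description) in changes.items():
            conn.execute(
                "INSERT INTO problems_revisions "
                "    (revision, id, name, description, created_at) "
                "SELECT ?, id, name, description, created_at "
                "FROM problems WHERE id = ?",
                (revision, problem_id),
            )
            conn.execute(
                "UPDATE problems SET name = ?, description = ? WHERE id = ?",
                (name, description, problem_id),
            )
    return revision

r1 = update_problems(conn, {
    1: ("Slow login page", "Takes 10s on first load"),
    2: ("Broken search box", "Returns no results"),
})
r2 = update_problems(conn, {1: ("Slow login page", "Fixed by caching")})
print(r1, r2)  # 1 2
```

Both rows changed by the first call share revision 1, which is what lets you reconstruct the state of several records as of one logical change.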
@Gaius
foreign key (thing_id, thing_type) -> problems.id or solutions.id
Be careful with these kinds of "multidirectional" foreign keys. In my experience, query performance suffers dramatically when the join condition has to check the type before figuring out which table to join on. It doesn't seem as elegant, but nullable problem_id and solution_id columns will work much better.
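For illustration, here is roughly what the nullable-column variant could look like with Python's sqlite3. The notes table is a hypothetical stand-in for whatever table was going to carry (thing_id, thing_type), and the CHECK constraint requiring exactly one of the two columns to be set is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE problems  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE solutions (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Hypothetical table that can point at either a problem or a solution.
CREATE TABLE notes (
    id          INTEGER PRIMARY KEY,
    body        TEXT NOT NULL,
    problem_id  INTEGER REFERENCES problems(id),
    solution_id INTEGER REFERENCES solutions(id),
    -- exactly one of the two references must be set
    CHECK ((problem_id IS NULL) <> (solution_id IS NULL))
);

INSERT INTO problems  (id, name) VALUES (1, 'Slow login');
INSERT INTO solutions (id, name) VALUES (1, 'Add cache');
INSERT INTO notes (body, problem_id)  VALUES ('Repro steps', 1);
INSERT INTO notes (body, solution_id) VALUES ('Verified in staging', 1);
""")

# Each join is a plain key lookup; no type column has to be inspected first.
rows = conn.execute("""
    SELECT n.body, p.name, s.name
    FROM notes n
    LEFT JOIN problems  p ON p.id = n.problem_id
    LEFT JOIN solutions s ON s.id = n.solution_id
    ORDER BY n.id
""").fetchall()
for row in rows:
    print(row)
```

A side benefit is that the database can now enforce both references as real foreign keys, which it cannot do for a single thing_id column that may point at either table.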
Of course, query performance will also suffer with an MVCC design when you have to add the check to get the latest version of a record. The tradeoff is that you never have to worry about contention with updates.
Hmm, sounds kind of like this site...
As far as database design goes, a versioning system somewhat like SVN, where you never actually do any updates, just inserts (with a version number) when things change, might be what you need. This is called MVCC, Multiversion Concurrency Control. A wiki is another good example of this.
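As a rough sketch of the insert-only idea with Python's sqlite3: the table problem_versions and the helpers save_problem and latest_problem are hypothetical names, and computing the next version with MAX(version) + 1 is a simplification that relies on the primary key to reject a duplicate if two writers race.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE problem_versions (
    problem_id  INTEGER NOT NULL,
    version     INTEGER NOT NULL,
    name        TEXT NOT NULL,
    description TEXT,
    PRIMARY KEY (problem_id, version)
)""")

def save_problem(conn, problem_id, name, description):
    """Never UPDATE: every change is stored as a new row with the next version."""
    with conn:
        (next_version,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) + 1 FROM problem_versions "
            "WHERE problem_id = ?",
            (problem_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO problem_versions (problem_id, version, name, description) "
            "VALUES (?, ?, ?, ?)",
            (problem_id, next_version, name, description),
        )
    return next_version

def latest_problem(conn, problem_id):
    """The extra check every 'current state' read has to make."""
    return conn.execute(
        "SELECT version, name, description FROM problem_versions "
        "WHERE problem_id = ? ORDER BY version DESC LIMIT 1",
        (problem_id,),
    ).fetchone()

v1 = save_problem(conn, 1, "Slow login", "Takes 10s")
v2 = save_problem(conn, 1, "Slow login page", "Takes 10s on first load")
print(latest_problem(conn, 1))  # (2, 'Slow login page', 'Takes 10s on first load')
```

Old versions stay readable forever, at the cost of that ORDER BY ... LIMIT 1 (or an equivalent MAX subquery) on every read of the current state.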