为审计问题状态建模关系实体

发布于 2024-07-09 16:56:42 字数 521 浏览 8 评论 0原文

这让我怀疑了一段时间,所以我认为将其发布在这里以寻求一些见解是一个好主意,这是一个关系数据库建模问题/疑问

我有以下问题:

我有“问题”必须是在一个特定的“状态”中,所有状态的改变都必须经过审计。

我找到了两种解决方案,但我真的看不出它们之间的区别,如果有的话......你觉得怎么样。

这是带有两个图表的图像。

编辑:

选项A:表“questions”不应包含state_id,并且Question_State不应包含“id”字段。 对错误表示歉意。

EDIT2

感谢所有现实世界的例子和见解,但这是一个学术问题,与现实世界无关:)。

图表

This is something that made me doubt for a while so I thought it would be a good idea to post it here to find some insight, it's a relational database modeling problem/doubt

I have the following problem:

I have "questions" that must be in an specific "state", and all changes of state must be audited.

I've found two solutions to this, but I can't really see the difference between them, if there's any... what do you think.

Here's the image with both diagrams.

EDIT:

Option A: table "questions" should not contain state_id, and Question_State should not contain "id" field. Sorry for the mistakes.

EDIT2:

Thanks for all the real world examples and insight, but this was an academic problem, not real-world related :).

Diagrams

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

痴梦一场 2024-07-16 16:56:42

我认为您要问的要点是:问题的状态是否应该基于问题和状态之间具有时间分量(A)的中间表,或者该表应该更加静态,但具有面向日志的历史表在侧面(B)。

(注意:如果你想做(A)的纯粹版本,那么 Boofus 是对的,你可能不会把 state_id 也放在 Questions 表中,因为它是多余的;但这肯定会很不方便,因为它会让简单的查询来获取特定状态的问题要困难得多。因此,这里有一个混合版本。)

一般来说,如果保留有关状态的历史信息的要求确实只是为了审计目的 - 也就是说,如果它不会'应用程序本身不会定期查询 - 您可能最好选择选项 B,因为它更简单一些(实际上只有一个“问题”表,带有状态参考表和“日志”表对于以前的状态)。 我认为这更好地表明了你的意图。

但是,如果应用程序语义更复杂(例如,如果您有诸如“显示过去 24 小时内处于状态 X 的所有问题……”之类的查询),那么 (A) 这样的方法可能更有意义。 它本质上是将问题的状态变成一个依赖于时间的事实。 如果您这样做,请注意它会使事情变得复杂 - 要么您的所有查询都更困难并且必须考虑时间,要么您必须担心问题上的 state_id 与问题表中的最新状态保持同步。 如果你走这条路,也许可以将其称为“current_state”或问题上的其他名称,所以很明显这是一种派生信息。

I think the gist of what you're asking is: Should the state of the question be based on an intermediary table between Questions and State that has a time component (A) or should the table be more static, but with a log-oriented history table on the side (B).

(Note: If you wanted to do a pure version of (A), then Boofus is right, you probably wouldn't put the state_id in the Questions table as well, as it's redundant; but that would definitely be inconvenient because it would make simple queries to get questions in a particular state much harder. So you've got a hybrid version here.)

In general, if the requirement of keeping historical information about the state is really just for audit purposes - that is, if it won't be regularly queried by the application itself - you're probably better off going with option B, because it's a little simpler (there's really just the one "Questions" table, with a reference table for the states, and a "log" table for previous states). I think that shows your intent a little better.

However, if the application semantics are more complex (e.g., if you have queries like "show all questions that have been in state X within the last 24 hours ..."), then an approach like (A) might make more sense. It's essentially making the state of a question into a time-dependent fact. If you do that, just be aware that it complicates things - either all your queries are harder and have to consider time, or you have to worry about keeping the state_id on Questions in sync with the most recent state in the Questions table. If you go that route, maybe call it "current_state" or something on Questions, so it's clear that it's sort of derivative information.

像你 2024-07-16 16:56:42

您可能想在网上搜索“时态数据库”主题。 基本上,存储任何变量的变化历史都会引发相同的问题,无论变量是否捕获问题状态或人的体重或其他什么。

其次,我认为您的问题与数据库设计有关,而不是与概念数据建模有关。 如果我的意思是正确的,那么您是在问哪种桌子设计更好。

第三,我更喜欢选项 B,但这实际上取决于您要如何处理数据。

我询问数据库设计与概念建模的原因是,我很久以前就采用了使用“实体和关系”进行与数据分析相关的概念数据建模的实践。 在讨论逻辑数据库设计时,我使用术语“表、列和行”。 在大型项目中,将分析和设计分开是非常有价值的。 这并不像听起来那么容易。

您确实应该在选项 B 的图表中的历史表和状态表之间添加一个箭头。从图表的呈现方式来看,历史表几乎看起来是一个不相交的表。 在这个简单的示例中这不是问题,但如果在扩展到具有数十个表的数据库时保持相同的做法,最终会让每个查看该图的人感到困惑。

You might want to scour the web on the subject of "temporal databases". Basically, storing the history of the changes of any variable raises the same problems, regardless of whether the variable captures question state or person's weight or whatever.

Second, I think your question relates to database design, and not to conceptual data modeling. If I get your drift right, you are asking which table design is better.

Third, I like option B better, but it really depends on what you are going to do with the data.

The reason I asked about database design versus conceptual modeling is that I long ago adopted the practice of using "entities and relationships" for conceptual data modeling associated with data analysis. I use the terms "tables, columns, and rows" when discussing logical database design. Keeping analysis and design separate turns out to be very valuable in large projects. And it isn't as easy to do as it sounds.

You really ought to add an arrow between the Historical table and the State table in the diagram for option B. The way the diagram is presented, it almost looks like the Historical table is a disjoint table. Not a problem in this simple example, but if you keep the same practice when you scale up to databases with dozens of tables, you'll end up confusing everybody who looks at the diagram.

南七夏 2024-07-16 16:56:42

一旦你画出了所有的关系,它们都是一样的。

我不明白为什么问题表中有 state_id ——因为你有历史表,所以问题表中的状态是多余的,并且可能会导致数据不同步。

在我看来,如果您想要问题的当前状态,您可以

SELECT State_ID FROM Historical WHERE Question_id = ? ORDER BY Date DESC LIMIT 1

(或者您的 SQL 风格使用的任何方法来限制为仅 1 行)

Once you draw in all the relationships, they're the same.

I don't understand why you have state_id in the question table -- since you have the historical table, having the state in the question table is redundant and can leave you with out-of-sync data.

Seems to me that if you want the current state on a question, you do

SELECT State_ID FROM Historical WHERE Question_id = ? ORDER BY Date DESC LIMIT 1

(or whatever method your flavor of SQL uses to limit to just 1 row)

剑心龙吟 2024-07-16 16:56:42

假设您在数据库和 OO 之间有良好的抽象层,您可能会考虑将 State 表从数据库中取出并使其成为类中的枚举。 这并不一定是需要坚持的事情。

然后在“问题”表和“审核”表中添加“状态”列。

Assuming you have good abstraction layers between the database and your OO, you might consider taking the State table out of the database and make it an enumeration in a class. It's not necessarily something that needs to be persisted.

Then have the State column in the Questions table, and the audit table.

人心善变 2024-07-16 16:56:42

您说已审计,这意味着您只是希望保留历史信息以用于报告目的。 在这种情况下,我建议图 B 更清晰,尽管您可能也应该标记问题和历史以及状态和历史之间的一对多关系。

至于实用性,如果情况如上所述,我自己会将历史插入功能封装到问题上的插入/更新触发器中,并且如果问题表的数量和/或状态更改的数量将很大,我将会考虑将历史表放在不同的数据库中。 这只是简化了以后的数据库管理。 通常我对触发器持谨慎态度,因为过度使用可能会导致数据库难以维护(因为发生的情况并不明显),但这是一个明显的例子,它们非常适合并且是比使用触发器更好的选择应用逻辑。

顺便说一句,你的两个图表都暗示一个问题只能进入每个状态一次(来自你的 PK) - 你应该考虑这是否正确,因为在大多数现实世界的应用程序中都会犯错误并且状态会颠倒。

You say audited, which implies that you simply wish to retain historical information for reporting purposes. In which case I'd suggest that diagram B is the clearer, although you should probably mark the one to many relationships between Questions and Historical and State and Historical too.

As to practicalities, if the circumstances are as above, myself I would encapsulate the Historical insert functionality into an insert/update trigger on Questions, and if the volume of the Questions table and/or the number of state changes is going to be significant I would consider putting Historical table in a different database. This just eases database management later. Normally I am wary of triggers as over-zealous usage can lead to hard-to-maintain databases (as it's not immediately obvious what is going on) but this is a clear case where they are a good fit and are a superior choice to using application logic.

Incidentally both your two diagrams imply that a question can only ever enter each state once (from your PK) - you should consider if this is correct as in most real world applications mistakes will be made and states reversed.

花辞树 2024-07-16 16:56:42

我不明白,像 #Boofus 一样,在 Questions 表中拥有 state_id 字段有什么好处。

我在我们自己的应用程序中经常使用此类“状态”概念。 在大多数复杂的情况下,我们必须跟踪完整的状态历史记录以及对象可以具有多个状态的情况,我们使用以下模型:

alt text

对于多个状态情况,想法是检查 end_date 值是否为 null(另一个想法是在表中拥有一个布尔字段 isActiveState)。 不要低估这种“多状态”配置的好处。 示例:

问题可以

  • 关闭并解决

  • 关闭但不解决。

这可能对应于两种不同的状态:

  • “已关闭并已解决”状态

  • “已关闭且未解决”状态

但我认为最好的解决方案是同时具有

  • “打开/已关闭” 状态状态

  • “已解决/未解决”状态

并允许问题具有多个状态

I did not understand, like #Boofus, the interest of having a state_id field in the Questions table.

I ve been working quite a lot with such "state" concepts in our own application. In most complex situations, where we have to follow up a complete state history and situations where an object can have multiple states, we are using the following model:

alt text

For multiple state situations, the idea is to check if the end_date value is null (another idea would be to have a boolean field isActiveState in the table). Do not underestimate the interest of having this "multiple state" configuration. Example:

A question can be

  • closed and solved

or

  • closed and not solved.

This could correspond to either 2 different states:

  • A "Closed and solved" state

or

  • A "Closed and not solved" state

But I think the best solution would be to have both a

  • A "Open/Closed" state

and

  • A "Solved/Unsolved" state

And to allow the question to have multiple states

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文