一张桌子还是许多张桌子来举办许多不同但相互作用的活动?

发布于 2024-10-13 23:01:34 字数 4229 浏览 2 评论 0原文

我正在创建一个应用程序,其核心功能是跟踪一段时间内的各种数据(血糖水平、胰岛素剂量、食物摄入量等),并且我正在尝试决定如何最好地在数据库中组织这些信息。

最基本的是,这个特定保护伞中的所有内容都是一个事件,因此我想到有一个事件表,其中包含可能出现的所有属性的字段。不过,这可能不太实用,因为对于许多整体来说,绝大多数字段最终都是空白的;但我不确定这是否真的是一个问题。这种方式的好处是可以更容易地调用和显示所有事件。但由于许多事件只有共同的“时间戳”,我怀疑它们是否属于同一张表。

我不确定为每种事件建立一个表是否有意义,因为单独来看,大多数事件只有时间戳以外的一个属性,并且它们通常必须混合在一起。 (许多类型的数据通常但并不总是出现在一个组中)

某些类型的事件具有持续时间。有些是相对非常罕见的。一类事件通常是保持不变的速率,除非速率被永久改变或临时覆盖(这些是我最担心的)。有些是简单的二进制标签(我计划为其建立一个链接表,但为了方便起见,我需要/更喜欢一个整体 event_id 来链接它们。

我的倾向是最好有一些具有密切相关类型的表信息而不是一张包含所有内容和大量空间的表格..但我不太确定如何继续,

我希望得到一些关于在这种情况下确定最佳方法的策略建议

: 以下是我正在处理的数据类型的概述,以防它使问题变得更加清晰

events:
-blood glucose 
     timestamp
     value 
     (tagged w/: from pump, manually entered
     [pre-meal, post-meal (breakfast, lunch, dinner) before bed, fasting, hypo, high, hyper  - which will be either manually entered or inferred based on settings or other user entries], before/after exercise etc i imagine would be better off dynamically generated with queries as necessary. though could apply same paradigm to the meals?

-sensor glucose (must be separate bc it is not as reliable so will be different number from regular bg test, also unlikely to be used by majority of users.)
     timestamp
     amount

-bolus 
     (timestamp)
     bolus total
     food total
     correction total 
     active insulin**
     bolus type - normal[vast majority] square wave or dual wave

-food
     (timestamp)
     carb amount
     carb type (by weight or exchanges) <- this could probably be in user settings table
     food-description
     carb-estimated (binary) 
     meal? - or separate table.
     (accompanying bolus id? though that seems to finicky)

-meals
     timestamp
     mealname (breakfast, lunch, supper) (or mealnames table? seems excessive?)

-basal
     timestamp
     rate per hour
     rate changes throughout day on regular pattern, so either automatically fill in from 'last activated pattern' (in the form midnight: 0.7/hr, 7am: 0.9/hr, 12pm: 0.8/hr etc)
     create new pattern whenever one is used

-temp basal
     (regular basal pattern can be overridden with temporary basal)
     temp basal start
     ?temp basal end and/or temp basal duration
     temp basal amount
     temp basal type -> either in % or specific rate.

-exercise
     start-time
     end-time
     intensity
     ?description (unless 'notes' is universal for any event)

-pump rewind (every 3 days or so)
     -time

-pump prime
     -amount
     -type (fixed or manual)

-pump suspended
     start-time
     end-time

-keytones
     time
     result

-starred
     event

-flagged
     event

-notes
     timestamp
     (user can place a note with any event to provide details or comments, but might want a note where there is no data as well.)

(i want a way for users to flag specific events to indicate they are result of error or otherwise suspect, and to star events as noteworthy either to discuss with doctor or to look at later)

**only place I get active insulin from is when a bolus is entered, but it could be useful other times as a constantly tracked variable, which could be calculated by looking at boluses delivered up to X time ago where X is the Active Insulin Time.

other infrequent events (likely 2-10 per year):
-HbA1C 
     time
     value
-weight
     time
     value
     units
-cholesterol
     time
     value
-blood pressure
     time
     value

-pump settings (will need to track settings changes, but should be able to do that with queries)
     -timestamp
     -bg-target
     -active insulin time
     -carb ratios (changes throughout day like basal)
     -sensitivity
     -active insulin time

。 1)具有类型的总体“事件”表,可以快速带回一段时间内的所有事件,而无需查询每个表? (缺点是我如何处理有持续时间的事件?事件表上有可选的结束时间吗?)

2)这是一个本地数据库,通常是一个用户,并且永远不需要比较或交互任何一个如果在线同步,则会记录其他用户的记录,因此我正在考虑只为每个用户保留一个版本的数据库,尽管可能会在上传时添加“用户”ID。

3)为了便于解释和分析,许多事件通常会放在一起(例如血糖、膳食、食物、丸剂、注释),我认为最好在事后通过查询来做到这一点,而不是对任何内容进行硬编码以保持完整性。

有关数据库用途的一些信息: -一天内所有数据类型的直观表示 - 平均所有测试结果和用于食物、校正、基础的胰岛素百分比。 -以及特定的高级查询,例如:列出最多 20 个示例,说明睡前血糖和早晨血糖之间的血糖水平差异(自上次更改设置以来,在睡前 2 小时内未进食且未运动时)等。 -程序将根据参数自动分配标签。就像如果在指定的“午餐”期间吃了超过 20 种碳水化合物,它会说该食物是午餐。如果 30 分钟内摄入了两次食物(或“膳食长度”偏好),它会将它们分组为一顿饭……目前还不完全确定这将如何发挥作用。

I'm creating an application that as its core function tracks various data over time (blood glucose levels, insulin dosages, food intake etc) and I'm trying to decide how best to organize this information in a database.

At its most basic everything within this particular umbrella is an event, so I thought of having a single Events table with fields for all the properties that might come up. This may be unwieldy, though, because the vast majority of the fields will end up being blank for many of the entires; but i'm not sure if that's actually a problem. The benefit of that way is that it will be easier to call and display all the events. But since many of the events will only have 'timestamp' in common, i question whether they belong on the same table.

I'm not sure that it makes sense to have a table for every kind of event, because taken separately most of the events have only one property other than timestamp, and they will often have to co-mingle. (many types of data will often but not always come in a group)

some types of events have durations. some are comparatively very rare. One class of events are normally a rate that stays the same unless the rate is altered for good or with a temporary override (those are the ones i'm most worried about). Some are simple binary tags (which i was planning on having a linking table for, but to make that easy I would need/prefer an overall event_id to link them with.

My inclination is that is best to have a few tables with closely related types of information rather than one table with everything and a whole lot of space.. but am not quite sure how to proceed.

I would love some advice on strategy for determining the best approach in a situation like this.

edit:
Here is a rundown of the data types I'm dealing with in case it makes things more clear

events:
-blood glucose 
     timestamp
     value 
     (tagged w/: from pump, manually entered
     [pre-meal, post-meal (breakfast, lunch, dinner) before bed, fasting, hypo, high, hyper  - which will be either manually entered or inferred based on settings or other user entries], before/after exercise etc i imagine would be better off dynamically generated with queries as necessary. though could apply same paradigm to the meals?

-sensor glucose (must be separate bc it is not as reliable so will be different number from regular bg test, also unlikely to be used by majority of users.)
     timestamp
     amount

-bolus 
     (timestamp)
     bolus total
     food total
     correction total 
     active insulin**
     bolus type - normal[vast majority] square wave or dual wave

-food
     (timestamp)
     carb amount
     carb type (by weight or exchanges) <- this could probably be in user settings table
     food-description
     carb-estimated (binary) 
     meal? - or separate table.
     (accompanying bolus id? though that seems to finicky)

-meals
     timestamp
     mealname (breakfast, lunch, supper) (or mealnames table? seems excessive?)

-basal
     timestamp
     rate per hour
     rate changes throughout day on regular pattern, so either automatically fill in from 'last activated pattern' (in the form midnight: 0.7/hr, 7am: 0.9/hr, 12pm: 0.8/hr etc)
     create new pattern whenever one is used

-temp basal
     (regular basal pattern can be overridden with temporary basal)
     temp basal start
     ?temp basal end and/or temp basal duration
     temp basal amount
     temp basal type -> either in % or specific rate.

-exercise
     start-time
     end-time
     intensity
     ?description (unless 'notes' is universal for any event)

-pump rewind (every 3 days or so)
     -time

-pump prime
     -amount
     -type (fixed or manual)

-pump suspended
     start-time
     end-time

-keytones
     time
     result

-starred
     event

-flagged
     event

-notes
     timestamp
     (user can place a note with any event to provide details or comments, but might want a note where there is no data as well.)

(i want a way for users to flag specific events to indicate they are result of error or otherwise suspect, and to star events as noteworthy either to discuss with doctor or to look at later)

**only place I get active insulin from is when a bolus is entered, but it could be useful other times as a constantly tracked variable, which could be calculated by looking at boluses delivered up to X time ago where X is the Active Insulin Time.

other infrequent events (likely 2-10 per year):
-HbA1C 
     time
     value
-weight
     time
     value
     units
-cholesterol
     time
     value
-blood pressure
     time
     value

-pump settings (will need to track settings changes, but should be able to do that with queries)
     -timestamp
     -bg-target
     -active insulin time
     -carb ratios (changes throughout day like basal)
     -sensitivity
     -active insulin time

concerns.
1) overarching 'events' table with a type, to quickly bring back all events in period of time without having to query every single table? (disadvantage is how do I work with events with duration? have optional end-time on event table?)

2) this is a local database which will generally be one user, and there will never be a need to compare or interact any of the records of other users if it is synced online, so i was thinking of just keeping one version of the database per user, though maybe adding a 'user' id as it is uploaded.

3) many of the events will often go together for ease of interpretation and analysis (blood sugar, meal, food, bolus, notes for instance), i gather it's better to do that after the fact with queries rather than hardcoding anything to maintain integrity.

Some info on what the database will be used for:
-A visual representation of all data types over the course of a day
-Average all test results and percent of insulin which is used for food, correction, basal.
-As well as specific advanced queries like: list up to 20 examples of the difference in glucose level between before bed glucose and morning glucose when no food eaten and no exercise w/in 2 hours of bed, since settings were last changed, etc.
-program will automatically assign tags based on parameters. like if >20 carbs are eaten during assigned 'lunch' period, it will say that food is lunch. if there are two food intakes within 30 minutes (or 'meal length' preference) it will group them as one meal.. not totally sure how that will function right now.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

青芜 2024-10-20 23:01:35

看看这些示例:一个两个三个四个

Take a look at these SO examples: one, two, three, four.

空心↖ 2024-10-20 23:01:34

“我的倾向是最好有几张包含密切相关类型信息的表格,而不是一张包含所有内容和大量空间的表格......但我不太确定如何继续。”

简单又最好。要了解原因,我们可以检查替代方案。

每个指标一张表。完全正常化但令人难以置信的疲劳。我通常是一个标准化鹰派,而且我通常说“桌子是免费的”,当它们开始达到数十或数百个类似的东西时,这并不完全正确。所以我们更喜欢更简单的东西。

在另一端,我们有 EAV,一个表,其中包含 test_id、metric_id、value 等值。众所周知,这几乎不可能查询和使用。就像捕蝇草一样,用芳香的花蜜吸引你,然后靠近你并吃掉你。

在紧握的手上,有一张大桌子,上面有所有可能的列。出于显而易见的原因,这被称为“稀疏”解决方案。我在直接营销中这样做了,效果很好,但这是一个高度专业化的情况,通常不推荐这种方法。

因此,中间的某个地方有一些表,每组测试都有一个表,需要存储相似的值。

嗯嗯,这正是你的建议。听起来不错!

"My inclination is that is best to have a few tables with closely related types of information rather than one table with everything and a whole lot of space.. but am not quite sure how to proceed."

Simple and best. To see why, we can examine the alternatives.

One table per metric. Fully normalized but incredibly fatiguing. I'm normally a normalization hawk, and also I normally say 'tables are free', this is not quite true when they start to number in the dozens or hundreds for like things. So we'd prefer something simpler.

At the other end we have E-A-V, one table, with values like test_id, metric_id, value. This is well known to be nigh-on impossible to query and work with. Like the venus flytrap, draws you in with sweet smelling nectar, then closes in on you and eats you.

On the gripping hand, there is the one large table with all possible columns. This is called the 'sparse' solution for obvious reasons. I did this in Direct Marketing and it works well, but that was a highly specialized situation, this approach is not generally recommended.

So somewhere in the middle is a handful of tables, one table for each set of tests with similar values to store.

Hmmmm, which is just what you suggested. Sounds good!

于我来说 2024-10-20 23:01:34

V1.0

当数据经过组织和规范化时, 关系数据库和 SQL(专为它们设计)的性能会更好。就性能和关系能力而言,一张大表是不规范的,并且是残缺的。

您的需求需要一个普通的超类型-子类型表簇。不幸的是,像这样的普通关系结构并不“常见”。

  • 标准子类型符号是半圆。

    • Supertype::Subtype 的基数始终为 1::0-to-1。

    • 子类型主键是父类型主键。它也是超类型的外键。

  • 有两种类型:

    • 独占,其中每个超类型行只有一个子类型,用穿过半圆的 X 表示。

    • 非排他性,每个父类型行有多个子类型

  • 您的是独占的。该类型需要一个鉴别器,以识别哪个子类型对于超类型行是活动的。当子类型数量较少时,可以使用指标;否则需要分类表。

  • 请注意,支持它并提供数据完整性所需的所有这些(结构、规则、约束)都可以在普通 IEC/ISO/ANSI SQL 中获得。 (非 SQL 不符合 SQL 要求)。

数据

  1. 命名非常重要。建议我们按行命名表格,而不是内容、含义或操作。你说的是Events,但我只能看到Readings。

  2. 这些读物或事件必须有一个背景。我不明白 EventId 是如何悬在空中的。我假设这些读物是关于某个特定患者的。请提出建议,我将更改模型。

  3. 复合键或复合键正常。 SQL 的功能非常强大(非 SQL 则不然)。 PatientId 已作为 Reading 中的 FK 存在,并用于形成其 PK。不需要附加 ReadingId 列和附加索引,这将是 100% 冗余。

  4. SQL 也能很好地处理很多表(我目前正在处理的数据库超过 500 个表),而大量较小的表是关系数据库的本质。

  5. 这是纯粹的第五范式(没有重复的列;没有更新异常)。

    • 这可以进一步归一化为第六范式,从而获得更多好处; 6NF可优化等;但这里不需要所有这些。

    • 有些表恰好位于 6NF 中,但这是结果,而不是意图,因此不能这样声明。

  6. 如果您提供有关您关心的限制和覆盖的信息,我可以提供一个解决这些问题的模型。

  7. 由于数据建模的,因此已经设置为非常快速的比较(生成警报等)。

▶阅读数据模型◀

不熟悉关系数据库建模标准的读者可能会发现▶IDEF1X 符号◀ 有用。

请随意提出澄清问题,无论是作为评论,还是作为对您的问题的编辑。

警告

  1. OO 和 ORM 群体(由 Fowler 和 Ambler 领导)对关系技术和数据库一无所知。设计对象与建模数据有很大不同。如果您将他们的对象设计应用于数据库,您最终会得到需要“重构”的怪物,并且您将不得不购买另一本向您展示如何有效地做到这一点的“书”。与此同时,“数据库”已瘫痪。

  2. 正确建模的关系数据库(作为数据,而不是对象)永远不需要“重构”。在高度规范化的数据库中,您可以添加表、列和函数,而无需更改现有数据或代码。

  3. 甚至 ORM 的概念也是完全有缺陷的。数据比对象更持久。如果你先对数据建模,然后为数据建模你的对象,那么它是非常稳定的。但是,如果您首先对对象进行建模(无论如何,这很奇怪,因为不了解数据),然后在对象之后对数据进行建模,那么您将不断地来回正确地进行操作。

  4. 关系数据库拥有完全普通的结构(例如超类型-子类型)已有 30 多年的历史,如果按照这样的方式实现,它们就能很好地工作。它们不是“gen-spec”或“class-inheritance”或任何类似的面向对象的东西;如果在没有正确建模数据的情况下实现这些 OO 或 ORM 结构,“数据库”将会瘫痪,并且需要“我们分解”。

    • 此外,它们没有实现所需的数据完整性约束,因此数据质量通常很差。我们不允许不良数据进入数据库;他们的“数据库”里充满了坏数据,他们需要另一本关于如何清洗脏数据的“书”。
  5. 它们的顺序和层次结构混合在一起。如果做得正确,就不会有“阻抗不匹配”,也不会有伪技术名称来掩盖纯粹的愚蠢;证明一遍又一遍地做同一套工作的合理性。

因此,在处理关系数据库时,请远离任何使用 OO 或 ORM 术语的人。

V1.1

您的编辑提供了更多细节,这当然是必需的,因为如果要正确建模数据,则上下文、整体是必要的。这包含了所有这些信息。然而,问题依然存在,在完成之前还需要进行一些反复讨论。对于任何不完全清楚的问题,请随时提出问题;我不确定差距到底是什么,直到我抛出一些东西,然后你对它说话。

▶事件数据模型V1.1◀< /strong>

  1. 我的所有模型都是纯关系模型(保留完整的关系能力)、IDEF1X 兼容和第五范式(无更新异常)。模型中绘制的所有规则(业务或数据/引用完整性)都可以作为 ISO/IEC/ANSI SQL 中的声明来实现。

  2. 永远不要对任何东西进行硬编码。我的模型不需要这样做,并且任何使用数据库的代码都不需要这样做。所有固定文本均被标准化为参考表或查找表。 (该位不完整;您需要填补空白)。

    • 简短的字母代码比枚举要好得多;一旦你习惯了它,它的价值和意义就会立即被识别。

    • 因为它们是 PK,因此稳定,您可以安全地编码:

      ... WHERE EventTypeCode = "P"

      ... WHERE EventTypeCode LIKE "T%"

  3. 我相信数据类型是不言而喻的或者可以很容易地计算出来。如果没有,请询​​问。

  4. 您所说的“挑剔”的一切都是完全有效的。问题是,由于您没有可以使用的数据库,因此您不知道数据库中应该有什么,以及 SQL 代码应该是什么或可以是什么。因此,所有“挑剔”的项目(数据库元素)都已提供,您需要构建代码。再次,如果有差距请询问。

    • 我想说的是,按照传统的工作方式我是数据建模者,你是开发人员,你必须确保从你的角度交付每一项内容,而不是依赖我解释你的笔记。我将提供一个数据库,该数据库支持我从您的笔记中收集到的所有要求。
  5. 每个数据库一个患者。让我们考虑到您的系统成功的可能性,将来您将拥有一个中央主力数据库,而不是限制每个患者一个数据库,这对于管理来说将是一场噩梦。假设您需要将所有患者详细信息保存在一处,一种事实版本。这就是我所提供的。这不会在短期内限制您为每位患者实施 1 Db; Patient 表中只有一行完全没有问题。

    • 或者,我可以从所有表中删除 PatientId,当您发展成为中央数据库配置时,您将需要进行重大数据库升级。

    • 同样,如果您有需要跟踪的传感器或泵,请识别其属性。然后,任何传感器或泵属性都将标准化到这些表中。如果它们是“每个患者一个”,那就没问题,这些表中将只有一行,除非您需要存储传感器或泵的历史记录。

  6. 在 V1.0 中,子类型是独占的。现在它们是非排他性的。这意味着我们正在跟踪事件的年表,没有重复;任何单个事件都可能包含多个子类型。例如。可以为任何事件插入注释。

    • 在完成之前,需要以网格的形式归档所提供的事件类型列表,显示每个事件类型的 (a) 允许的 (b) 强制子类型。 Thate 将作为事件中的 CHECK 约束来实现。
  7. 命名非常重要。我使用 ISO 标准 11179(指南和原则)以及我自己的惯例。阅读类型事件具有这样的前缀。请随意提出更改建议。

  8. 单位。传统上,我们在数据库中使用公制 x 或美制英制,允许输入用户喜欢的任何内容,并在存储之前进行转换。如果您需要混合物,那么至少我们应该在患者或泵级别指定 UnitType,而不是允许存储任一 UnitType。如果您确实需要存储任一 UnitType,并前后更改,那么是的,我们需要将 UnitType 与每个此类值一起存储。

  9. 时态数据库。您记录了时间序列,并通过 SQL 进行解释。主题很大,所以要仔细阅读。我要求您阅读和理解的最低要求是:

    ▶时态数据库性能 (0NF与 5NF)◀

    ▶经典 5NF 时态数据库◀(检查数据模型仔细)

  10. 基本上问题归结为:

    • 要么您拥有真正的 5NF 数据库,没有数据重复,没有更新异常。

      • 这意味着,对于连续时间序列,仅记录 StartDateTimeEndDtateTime 很容易从下一个 行的StartDateTime 派生出来,它不会被存储。例如。事件是一个连续的年表; EventType 标识事件是特定的DateTime 还是周期/持续时间。

      • EndDateTime 仅存储不相交的周期,其中周期之间存在合法的间隙;无论如何,它都可以通过 EventType 清楚地识别。例如。锻炼,泵暂停。 (顺便说一句,我建议患者在锻炼期结束时只知道实际的属性,而不是计划的属性。)

      • 由于通常没有 EndDateTime,因此 StartDateTime 只是 DateTime。例如。 EventDtm

      • 这需要使用普通的 SQL 子查询。一旦编码员掌握了该主题,这实际上非常简单。对于那些不这样做的人,我提供了关于子查询的完整教程,特别是在时态上下文中使用它们,位于:

      ▶只要知道如何做就很容易◀。并非巧合,与上面的经典 5NF 时态数据库完全相同。

    • XOR 您有一个数据库,其中每个 StartDateTime 列都存储了 EndDateTime(100% 重复),并且您可以使用平面,查询速度慢。大量使用 GROUP BY 操作大型结果集,而不是小型结果集。引入了海量数据重复和更新异常,将数据库缩减为平面文件,以满足能力有限的编码人员的需求(当然不是“易于编码”)。

    • 因此,请仔细考虑并选择,仅从长远来看,因为这会影响访问时态数据的每个代码段。当您意识到维护更新异常比编写子查询更糟糕时,您不希望在中途重写。

      • 当然,我将提供支持 5NF 时态数据库、正确的数据类型等的明确要求,以支持您确定的所有要求。

      • 此外,如果您选择 0NF,我将提供这些字段,以便数据模型完整地满足您的目的。

      • 无论哪种情况,您都需要准确计算出任何给定查询所需的 SQL 代码。

    • 要么

  11. 数据类型处理很重要。不要将时间(小时等)存储为整数或偏移量。仅将其存储为 TIME 或 DATETIME 数据类型。如果是偏移量,请将其存储为自午夜以来的时间。这将允许不受限制的 SQL 和日期算术函数。

  12. 给你任务。仔细检查模型,并确保:

    • 每个非键属性与其主键都有 1::1 关系

    • 并且它与任何其他 PK(在其他表中)没有关系

    当然,检查模型并提供反馈。

问题

鉴于上述解释和指导。

  • 什么是 ReadingBasalTemperature.Type,请列出值?

  • 什么是HBA1C?

  • 什么是keytone?

  • 我们需要(即持续时间/erigation endDateTime`):

    • readitbasaltemperaturedend
    • ReadingBolusend
    • 基础模式
    • 基底模式
    • 实际上,什么是模式,如何得出/比较?
  • 如何确定基础质量(或持续时间)

  • 启动位置是,无需存储活动的胰岛素持续时间。但是您需要定义endDateTime的确定方式。基于此,如果它不能轻易得出,并且或基于太多因素或始终更改,则存储enddateTime可能很好。

  • 泵设置需要澄清。

v1.2

好吧,我已将您已证明的所有信息纳入了问题和评论。这是一个进步的数据模型。

仍然有一些问题需要解决。

  • 仅使用一个百分比或一个费率,而不是两个都带有附加指标。一个可以源自另一个。我一直在使用速率。

  • ...对方法的唯一担心是多天的基础速率将是相同的。因此,冗余

    • 不是“冗余”。这是一个时间序列的事实的存储,这恰好是不变的。所需的查询是直接的。

    • 但是,在高级使用中,是的,您可以避免存储不变的事实,而是将持续时间延长到包括新的时间间隔。

  • 我仍然不清楚您对基础温度的解释。请研究新模型。首先,模式现在分别存储。其次,我们正在以速率开始记录一个基础温度。我们是否需要基础温度端(速率)?

  • “葡萄糖素类型将能够每个葡萄糖结果具有多个值” 需要更多的定义。不用担心ID键。只需告诉我数据。对于每个读取的葡萄糖蛋白,命名结果值,以及它们适用于哪些葡萄糖素型;强制性和可选的

  • pumphistory.insulinenddateTime是持续时间的结局。当然,这是通用的,启动瞬间就是您将其比较的任何行。因此,自1900年1月1日午夜以来,应该是秒或几分钟

  • 检查新事件PK。在传入记录标识几个事件的地方,您需要对其进行解析,并使用相同的DateTime插入每个事件 - eventsubtype行。

  • 除了患者外,此数据库中没有ID密钥,到目前为止,不需要键。 除了患者外,请参考完整pk。

05年2月11日

未收到反馈RE v1.2。

我得到的许多数据都从外部(有些混乱)的CSV中提取,该CSV将某些事件类型分组在一行下,并且经常在同一第二行中进行事件,该事件与

很容易克服。但是,这意味着瞬间不是即时。现在,我可以带您完成整个练习,但最重要的是简单。

  • 如果您真的需要它,我们可以在PK中添加Sequenceno,以使其与众不同。但是我怀疑EventTypecode已经足够了(每秒不会超过一个EventType)。如果没有,请告诉我,我将更改穆尔。

  • 将瞬间视为即时的含义,从而避免偏离时间数据库的架构要求。

  • 使用EventType为DateTime PK提供独特性。

    • 请记住,事件类型是在事件PK中部署的,不是作为歧视者的要求,而是提供独特性。因此,它在子类型的PK中的存在是一种人工制品,而不是歧视者(这是通过亚型已知的)。
  • 然而,由于非排他性亚型而导致不必要的复杂性(每个超级型行可能有多种亚型)。

  • 因此,我将其更改为独家亚型,确定性。每个Supertype行的一个EventType;最大一个子类型。

请参阅 实现子类型 对于特定信息,请重新限制等。

数据模型的更改太小,无法保证其他版本。我已经更新了V1.2数据模型。

06年3月11日,

应当在常见问题解答中坚持“最重要的,从技术上讲是诚实的”指南,并按照其中的要求面对错误信息,我被暂停了(这意味着我将不再正确地对此进行正确的错误信息,并且这些海报已经保护了Reign Reign )。与寻求者的互动进行了完成,完成,并完成了最终数据模型。因此,读者失去了进步。但是,发布结果可能是有价值的, ▶最终数据模型v1.16

  • 事件始终具有启动瞬间(event.datetime)。
  • 事件可能是持续时间,在这种情况下,需要立即结束(事件)。
  • 有些事件仅由超级型组成;其他需要亚型。这是在EventType博览会的第三列中标识的。
  • 第四列标识事件类型:
    • 即时或持续时间
    • 持续时间:连接或分离
  • 请注意,在Seeker平台上的DateTime分辨率为一秒钟,并且许多事件可能在一秒钟内发生,但不超过同一EventType之一。因此,EventTypecode已包含在事件主要密钥中以实现该规则。因此,它是一种人工制品,它不是超级型 - 安排结构或独家/非排他性亚型的一般要求。
  • 打算在两个面向美国的字母页面上打印,无论是否放大。

V1.0

Relational Databases, and SQL (which was designed for them) perform much better when the data is organised and Normalised. The one big table is un-normalised, and crippled, in terms of both performance and Relational power.

Your requirement calls for an ordinary Supertype-Subtype cluster of tables. Unfortunately ordinary Relational structures such as this are not "common".

  • The Standard Subtype symbol is the semicircle.

    • The cardinality of the Supertype::Subtype is always 1::0-to-1.

    • The Subtype Primary Key is the Supertype Primary Key. It is also the Foreign Key to the Supertype.

  • There are two types:

    • Exclusive, where there is only one Subtype for each Supertype row, denoted with an X through the semicircle.

    • Non-exclusive, where there is more than one Subtype per Supertype row

  • Yours is Exclusive. This type needs a Discriminator, to identify which Subtype is active for the Supertype row. Where the number of Subtypes is small, Indicators can be used; otherwise a classification table is required.

  • Note that all this, the structures, the rules, the constraints, that are required to support it, and to provide Data Integrity, is available in ordinary IEC/ISO/ANSI SQL. (The Non-SQLs do not comply with the SQL requirement).

Data

  1. Naming is very important. We are advised to name the table by the row, not the content or meaning or action. You speak of Events, but I can only see Readings.

  2. There must be a context for these Readings or Events. I do not see how an EventId hangs in the air. I have assumed that the Readings are about a particular Patient. Please advise, and I will change the model.

  3. Composite or Compound Keys are normal. SQL is quite capable (the Non-SQLs aren't). PatientId already exists as an FK in Reading, and it is used to form its PK. There is no need for an additional ReadingId column and the additional index, which would be 100% redundant.

  4. SQL is also quite capable of handling many tables (the database I am working on currently exceeds 500 tables), and large numbers of smaller tables are the nature of Relational Databases.

  5. This is pure Fifth Normal Form (no columns duplicated; no Update Anomalies).

    • This can be further Normalised to Sixth Normal Form, and thus further benefits can be gained; and the 6NF can be optimised, etc.; but all that is not required here.

    • Some tables happen to be in 6NF, but that is a consequence, not an intent, so it cannot be declared as such.
      .

  6. If you provide information about the limits and overrides that concern you, I can provide a model that resolves those issues.

  7. Since the data is modelled, it is already set up for very fast comparisons (generating alarms, etc.).

▶Reading Data Model◀

Readers who are unfamiliar with the Standard for Modelling Relational Databases may find ▶IDEF1X Notational◀ useful.

Feel free to ask clarifying questions, either as comments, or as Edits to your question.

Caveat

  1. The OO and ORM crowd (lead by Fowler and Ambler) are clueless about Relational technology and Databases. Designing Objects is quite different to modelling data. If you apply their Object design to databases, you will end up with monstrosities that need "re-factoring", and you will have to buy yet another "book" that shows you how to do that efficiently. In the meantime the "database" is crippled.

  2. Relational Databases that are modelled correctly (as data, not objects) never need "re-factoring". In highly Normalised Databases, you can add tables, columns and functions without having to change existing data or code.

  3. Even the concept of ORM is totally flawed. Data has more permanence than Objects. If you model the data first, then model your Objects for the data, it is very stable. But if you model your objects first (which is weird anyway, without an understanding of the data), then model the data after the Objects, you will be going back and forth, constantly correctly both.

  4. Relational Databases have had perfectly ordinary structures such as Supertype-Subtype for over 30 years, and they work well, if they are implemented as that. They are not "gen-spec" or "class-inheritance" or any such OO thing; and if those OO or ORM structures are implemented, without modelling the data correctly, the "database" will be crippled, and need "we-factoring".

    • Additionally, they do not implement the required Data Integrity constraints, so usually data quality is poor. We do not allow bad data to enter the Database; their "databases" are full of bad data, and they need another "book" on how to wash dirty data.
      .
  5. They have the sequence, and the hierarchy, mixed up. Done correctly, there is no "impedance mismatch", no pseudo-technical names to mask pure stupidity; to justify doing the same set of work over and over and over again.

So run like hell from anyone using OO or ORM terminology when dealing with Relational Databases.

V1.1

Your Edit provides far more detail, which of course is demanded, because the context, the whole, is necessary, if data is to be modelled correctly. This incorporates all that info. However, questions remain, and some back-and-forth will be required before it can be complete. Feel free to ask questions about anything that is not absolutely clear; I am not sure exactly what the gap is until I throw something up, and you speak to it.

▶Event Data Model V1.1◀

  1. All my models are pure Relational (retain full Relational power), IDEF1X compliant and Fifth Normal Form (no Update Anomalies). All Rules (business or data/referential Integrity) that are drawn in the model can be implemented as Declaration in ISO/IEC/ANSI SQL.

  2. Never hard-code anything. My models do not require that, and any code working with the database does not have to do that. All fixed text is Normalised into Reference or Look-up tables. (that bit is incomplete,; you need to fill in the gaps).

    • A short alphabetic code is far better than an Enumeration; once you get used to it, the values and meanings become immediately recognisable.

    • Because they are PKs, and therefore stable, you can safely code:

      ... WHERE EventTypeCode = "P"
      or
      ... WHERE EventTypeCode LIKE "T%"

  3. I believe the DataTypes are self-evident or can be worked out easily. If not, please ask.

  4. Everything that your note as "finicky" is perfectly valid. The issue is, since you have not had a database to engage with, you did not know what should be in the database vs what should be or can be SQL code. Therefore all the "finicky" items have been provided for (the database elements), you need to construct the code. Again, if there is a gap please ask.

    • What I am saying is, working in the traditional style of I am the Data Modeller, you are the Developer, you have to ensure every item from your perspective is delivered, rather than relying on me to interpret your notes. I will be delivering a database that supports all the requirements that I can glean from your notes.
      .
  5. One Patient per Database. Let's allow for the possibility that your system is successful, in the future, you will have one central workhorse database, rather than limiting it one database per patient, which would be a nightmare to administer. Let's say that you need to keep all your Patient details in one place, one version of the truth. That is what I have provided. This does not limit you in the short term, from implementing one Db per patient; there is no problem at all with only one row in the Patient table.

    • Alternately, I can strip PatientId out of al the tables, and when you grow into a central database configuration, you will require a major database upgrade.

    • Likewise, if you have Sensors or Pumps that you need to track, please identify their attributes. Any Sensor or Pump attributes would then be Normalised into those tables. If they are "one per patient" that's fine, there will be one row in those tables, unless you need to store the history of Sensors or Pumps.

  6. In V1.0 the Subtypes were Exclusive. Now they are Non-exclusive. This means we are tracking a chronology of Events, without duplication; and any single Event may consist of more than one Subtype. Eg. Notes can be inserted for any Event.

    • Before completion, the EventType list provided needs to be filed out in the form of a grid, showing (a) permitted (b) mandatory Subtypes per EventType. Thate will be implemented as CHECK Constraints in Event.
      .
  7. Naming is very important. I am using ISO standard 11179 (guidelines and principles) plus my own conventions. Reading type Events are prefixed as such. Feel free to suggest changes.

  8. Units. Traditionally, we use either Metric xor US Imperial across the database, allow entry in whatever the user likes, and convert before storage. If you need a mixture, then at least we should have the UnitType specified at the Patient or Pump level, rather than allowing storage of either UnitType. If you really need either UnitType stored, changing back and forth, then yes, we need to store UnitType with each such Value.

  9. Temporal Database. You have Times Series being recorded, and well as interpreted via SQL. Big subject, so read up on it. The minimum I would ask you to read and understand is:

    ▶Temporal Database Performance (0NF vs 5NF)◀

    ▶Classic 5NF Temporal Database◀ (Inspect the Data Model carefully)

  10. Basically the issue boils down to this:

    • Either you have a true 5NF database, no data duplication, no Update Anomalies.

      • That means, for continuous time series, only the StartDateTime is recorded. The EndDtateTime is easily derived from the StartDateTime of the next row, it is not stored. Eg. Event is a continuos chronology; the EventType identifies whether the Event is a specific DateTime or a Period/Duration.

      • EndDateTime is stored only for disjoint Periods, where there are legitimate gaps between Periods; in any case it is clearly identified via the EventType. Eg. Exercise, PumpSuspended. (Incidentally, I am suggesting the patient only knows the actual, as opposed to planned, attributes, at the end of the Exercise period.)

      • Since generally there is no EndDateTime, StartDateTime is simply DateTime. Eg. EventDtm

      • This requires the use of ordinary SQL Subqueries. This is actually quite simple once the coder has a grasp on the subject. For those who don't, I have supplied a full tutorial on Subqueries in general, and using them in a Temporal context in particular, in:

      ▶It Is Easy When You Know How◀. Not coincidentally, re the very same Classic 5NF Temporal Database above.

    • XOR you have a database with EndDateTime stored (100% duplication) with every StartDateTime column, and you can use flat, slow queries. Lots of manipulating large result sets with GROUP BYs, instead of small result sets. Massive data duplication and Update Anomalies have been introduced, reducing the database to a flat file, to supply the needs of coders with limited ability (certainly not "ease of coding").

    • Therefore, consider carefully and choose, for the long term only, because this affects every code segment accessing temporal data. You do not want a re-write halfway down the track when you realise that maintaining Update Anomalies is worse than writing Subqueries.

      • Of course, I will provide the explicit requirements to support a 5NF Temporal Database, correct DataTypes, etc., to support all your identified requirements.

      • Further, if you choose 0NF, I will provide those fields, so that the Data Model is complete for your purpose.

      • In either case, you need to work out exactly the SQL code required for any given query.

  11. DataType handling is important. Do not store Time (hours, etc) as Integer or an offset. Store it only as TIME or DATETIME Datatype. If an offset, store it as Time since midnight. That will allow unrestricted SQL, and Date Arithmetic functions.

  12. Task for you. Go through the model carefully, and ensure that:

    • every non-key Attribute has a 1::1 relationship with its Primary Key

    • and that it does not have a relationship to any other PK (in some other table)

    And of course, check the Model and provide feedback.

Question

Given the above explanations and guidance.

  • What is ReadingBasalTemperature.Type, list values please ?

  • What is HbA1C ?

  • What is KeyTone ?

  • Do we need (ie. Duration/Period EndDateTime`):

    • ReadingBasalTemperatureEnd
    • ReadingBolusEnd
    • Basal Pattern
    • BasalTemp Pattern
    • Actually, what is a pattern, and how is it derived/compared ?
  • How is BasalTemperatureEnd (Or Duration) determined

  • Starting position is, there is no need to store Active Insulin Duration. But you need to define how the EndDateTime is determined. Based on that, if it cannot be easily derived, and or it based on too many factors or changes all the time, storing an EndDateTime might be good.

  • The Pump Settings need clarification.

V1.2

Ok, I have incorporated all information you have proved in the question and the comments. Here is a progressed Data Model.

▶Event Data Model V1.2◀

There are still some issues to be resolved.

  • Use a Percentage or a Rate only, not both with an additional indicator. One can be derived from the other. I am using Rate consistently.

  • ... the only worry about the approach is that for many days the basal rate will be identical.. hence redundancy

    • That is not "redundancy". That is storage of a time series of facts, which happen to be unchanging. The queries required are straight-forward.

    • However, in advanced use, yes, you can avoid storing an unchanged fact, and instead extend the duration to include the new time interval.

  • I am still not clear re your explanation of Basal Temp. Please study the new Model. First, the patterns are now stored separately. Second, we are recording a Basal Temp Start with a Rate. Do we need a Basal Temp End (with a Rate) ?

  • "GlucoseEventType would be able to have more than one value per Glucose Result" needs more definition. Don't worry about ID keys. Just tell me about the data. For each ReadingGlucoseBlood, name the result values, and which GlucoseEventType they apply to; which are mandatory and which are optional.

  • PumpHistory.InsulinEndDateTime is the ending Instant for the Duration. Of course that is generic, the starting Instant is whatever row you compare it to. Thus it should be seconds or minutes since midnight 01 Jan 1900.

  • Check the new Event PK. Where the incoming record identifies several Events, you need to parse that, and INSERT each Event-EventSubtype row, using the same DateTime.

  • Except for Patient, there are no ID keys in this database, none are required thus far. Refer to the parent by full PK.

05 Feb 11

No feedback received re V1.2.

a lot of the data i'm getting is being pulled from an external (and somewhat disorganized) csv which groups certain event types under one row and often has events on the same second, which is as granular as it gets

That is easy to overcome. However, that means that an Instant is not an Instant. Now, I could walk you through the whole exercise, but the bottom line is simple.

  • If you really need it, we could add a SequenceNo to the PK, to make it unique. But I suspect the EventTypeCode is enough (there will not be more than one EventType per second). If not, let me know, and I will change the moel.

  • Retain the meaning of an Instant as an Instant, and thus avoid departing from the architectural requirements of Temporal Databases.

  • Use EventType to afford uniqueness to the DateTime Pk.

    • Keep in mind that the EventTypeCode is deployed in the Event PK, not as a Discriminator requirement, but to afford uniqueness. Thus its presence in the PK of the Subtypes is an artefact, not that of a Discriminator (which is already known by virtue of the Subtype).
  • However there is unnecessary complexity due to the Non-exclusive Subtype (there can be more than one Subtype per Supertype row).

  • Therefore I have changed it back to an Exclusive Subtype, deterministic. One EventType per Supertype row; max one Subtype.

Refer to Implementing Referential Integrity for Subtypes for specific information re Constraints, etc.

The change to the Data Model is too small to warrant another release. I have updated the V1.2 Data Model.

06 Mar 11

Due upholding the "above all, be technically honest" guideline in the FAQ, and confronting misinformation as requested therein, I was suspended for my efforts (which means I will no longer correct misinformation on SO, and such posters have protected reign). Interaction with the seeker was carried on, to completion, and the Final Data Model was completed, away from SO. The progression is therefore lost to SO readers. However, it may be of value to post the result, the ▶Final Data Model V1.16◀.

  • Events always have a starting Instant (Event.DateTime).
  • Events may be Durations, in which case an ending Instant (Event) is required.
  • Some Events consist of only the Supertype; others require a Subtype. This is identified in third column of the EventType exposition.
  • The fourth column identifies the type of Event:
    • Instant or Duration
    • Duration: Conjunct or Disjunct
  • Note that the resolution of DateTime on the seeker's platform is one second, and many Events may occur in one second, but not more than one of the same EventType. EventTypeCode has therefore been included in the Event Primary Key to implement that rule. Thus it is an artefact, it is not a generic requirement for a supertype-subtype structure or for Exclusive/Non-exclusive subtypes.
  • Intended for printing on two facing US Letter pages, enlarged or not, with a gusset.
千鲤 2024-10-20 23:01:34

不,当你说“绝大多数字段对于许多条目来说最终都是空白的”之类的话时,你的模式几乎肯定被破坏了。

血糖应该在自己的表格中,只有日期/时间和水平。如果您只是储存碳水化合物含量,食物摄入量也类似。如果您要存储膳食的各个成分,则食物摄入表和食物成分表之间应该具有一对多的关系。

与胰岛素相同,有时间戳和数量。

表示不同对象之间关系的正确方法是使用关系:外键等。


而且,除了“与编程无关”(关于事件的混合)之外,如果您同时服用胰岛素和吃饭,你可能做错了。大多数速效胰岛素需要半小时才能开始发挥作用,进入肠道的食物将在大约五到十分钟内开始产生糖。因此,进食和进食之间应间隔半小时左右。

据我所知,唯一常见的“相同时间戳或大约”事件是餐前血糖水平,以确保您半小时前注射的胰岛素已经开始发挥作用。

但我不是你的内分泌科医生,我什至不在电视上播放:-)所以请先咨询你的医疗机构。


No, the second you say something like "the vast majority of the fields will end up being blank for many of the entries", your schema is almost certainly broken.

Blood glucose should be in its own table with nothing more than a date/time and level. Food intake similarly, if you're just storing the carb content. If you're storing the individual components of the meal, you should have a one to many relationship between the food-intake table and the food-components table.

Same as the insulin, with a timestamp and quantity.

The right way to represent the relationships between disparate objects is with, well, relationships: foreign keys and such.


And, as a "nothing-to-do-with-the-programming-side-of-it" aside (on the co-mingling of events), if you're taking your insulin and having your meal at the same time, you're probably doing it wrong. Most fast-acting insulin takes half an hour to start doing a proper job and the food entering your gut will start the production of sugars in about five to ten minutes. So there should be about a half-hour delay between shooting up and eating.

The only usual "same-timestamp-or-thereabouts" events that I'm aware of would be the blood sugar level immediately before the meal to ensure the insulin you injected half an hour ago has started working.

But I'm not your endocrinologist and I don't even play one on TV :-) So check with your medical bods first.


~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文