Database model for data quality

Posted 2024-10-16 09:56:44

I need an example of a database model that can be attached to a database for data quality. The best form of answer would, at the very least, be DDL that's executable in MySQL; DDL for other RDBMSs is okay, and I'll just post another question asking for a port of the code.

A good explanation would be a huge plus.

Questions, comments, feedback, etc. -- just comment, thanks!!

Comments (1)

甜妞爱困 2024-10-23 09:56:44

The biggest problem is identifying meaningful measures of quality. That's so highly application-dependent that I doubt anybody will be able to help you very much. (At least not without a lot more information--perhaps more than you're allowed to give.)

But let's say your application records observations of birds by individuals. (I'm just throwing this together off the top of my head. Read it for the gist, and expect the details to crumble under scrutiny.) Under average field conditions,

  • some species are hard for even a beginner to get wrong
  • some species are hard for an expert to get right
  • a specific individual's ability varies irregularly over time (good days, bad days)
  • individuals usually become more skilled over time
  • you might be highly skilled at identifying hawks, and totally suck at identifying gulls
  • individuals are prone to suggestion (who they're with makes a difference in their reliability)

So, to take a shot at assessing the quality of an identification, you might try to record a lot of information besides the observation "3 red-tailed hawks at Cape May on 05-Feb-2011 at 4:30 pm". You might try to record

  • weather
  • lighting
  • temperature (some birders suck in the cold)
  • hours afield (some birders suck after 3 hours, or after 20 cold minutes)
  • names of others present
  • average difficulty of correctly identifying red-tailed hawks
  • probability that this individual could correctly identify red-tails under these field conditions
  • alcohol intake

Although this might be "meta" to field birders, to the database designer it's just data. And you'd design the tables just like you'd design them for any other application. (That's what I did, anyway.)
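
To make "it's just data" a bit more concrete, here is a rough sketch of what such tables might look like. Every table and column name is made up for illustration, and so are the difficulty and probability figures in the sample rows; only the red-tailed hawk observation itself comes from the example above. The sketch runs the DDL through Python's built-in sqlite3 module so that it is self-contained. The SQL sticks to fairly standard constructs, so porting it to MySQL should mostly be a matter of adjusting types, but that port is untested here.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions made for
# illustration. Quality attributes are modelled as ordinary columns.
DDL = """
CREATE TABLE observers (
    observer_id INTEGER PRIMARY KEY,
    full_name   VARCHAR(100) NOT NULL
);

CREATE TABLE species (
    species_id    INTEGER PRIMARY KEY,
    common_name   VARCHAR(100) NOT NULL UNIQUE,
    -- average difficulty of a correct identification: 0 (easy) to 1 (hard)
    id_difficulty DECIMAL(3,2) NOT NULL CHECK (id_difficulty BETWEEN 0 AND 1)
);

CREATE TABLE observations (
    observation_id INTEGER PRIMARY KEY,
    observer_id    INTEGER NOT NULL,
    species_id     INTEGER NOT NULL,
    location       VARCHAR(100) NOT NULL,
    observed_at    TIMESTAMP NOT NULL,
    bird_count     INTEGER NOT NULL CHECK (bird_count > 0),
    FOREIGN KEY (observer_id) REFERENCES observers (observer_id),
    FOREIGN KEY (species_id) REFERENCES species (species_id)
);

-- Field conditions recorded alongside each observation.
CREATE TABLE observation_conditions (
    observation_id INTEGER PRIMARY KEY,
    weather        VARCHAR(30),
    lighting       VARCHAR(30),
    temperature_c  DECIMAL(4,1),
    hours_afield   DECIMAL(4,2) CHECK (hours_afield >= 0),
    alcohol_units  DECIMAL(4,1) CHECK (alcohol_units >= 0),
    -- estimated probability that this observer identified the species correctly
    p_correct      DECIMAL(3,2) CHECK (p_correct BETWEEN 0 AND 1),
    FOREIGN KEY (observation_id) REFERENCES observations (observation_id)
);

-- Names of others present (who you're with affects reliability).
CREATE TABLE observation_companions (
    observation_id INTEGER NOT NULL,
    observer_id    INTEGER NOT NULL,
    PRIMARY KEY (observation_id, observer_id),
    FOREIGN KEY (observation_id) REFERENCES observations (observation_id),
    FOREIGN KEY (observer_id) REFERENCES observers (observer_id)
);
"""

# Observations whose estimated reliability meets a caller-supplied threshold.
QUERY = """
SELECT s.common_name, o.bird_count, c.p_correct
FROM observations AS o
JOIN species AS s ON s.species_id = o.species_id
JOIN observation_conditions AS c ON c.observation_id = o.observation_id
WHERE c.p_correct >= ?
"""


def build_demo():
    """Create the schema in an in-memory database and load one sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    # The 0.20, 0.85 and condition values below are invented sample data.
    conn.execute("INSERT INTO observers VALUES (1, 'Example Observer')")
    conn.execute("INSERT INTO species VALUES (1, 'Red-tailed Hawk', 0.20)")
    conn.execute(
        "INSERT INTO observations VALUES (1, 1, 1, 'Cape May', '2011-02-05 16:30:00', 3)"
    )
    conn.execute(
        "INSERT INTO observation_conditions VALUES (1, 'overcast', 'dusk', -2.0, 3.5, 0, 0.85)"
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for row in build_demo().execute(QUERY, (0.8,)):
        print(row)
```

Keeping the field conditions in their own table is just one option; they could equally live as nullable columns on the observations table. Which measures belong there at all is the application-dependent part.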
