Database model for data quality
Need an example of a database model to be attached to a database for data quality. The best form of answer would at the very least be DDL that's executable in MySQL; DDL for other RDBMSs is okay, I'll just post another question asking for the code to be ported.
A good explanation would be a huge plus.
Questions, comments, feedback, etc. -- just comment, thanks!!
Comments (1)
The biggest problem is identifying meaningful measures of quality. That's so highly application-dependent that I doubt anybody will be able to help you very much. (At least not without a lot more information--perhaps more than you're allowed to give.)
But let's say your application records observations of birds by individuals. (I'm just throwing this together off the top of my head. Read it for the gist, and expect the details to crumble under scrutiny.) Under average field conditions,
So, to take a shot at assessing the quality of an identification, you might try to record a lot of information besides the observation "3 red-tailed hawks at Cape May on 05-Feb-2011 at 4:30 pm". You might try to record
identifying red-tailed hawks
could correctly identify red-tails
under these field conditions
Although this might be "meta" to field birders, to the database designer it's just data. And you'd design the tables just like you'd design them for any other application. (That's what I did, anyway.)
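To make the "it's just data" point concrete, here is a rough sketch of what such tables might look like. Everything in it is an assumption made for illustration: the table names (`observers`, `species`, `observations`, `peer_assessments`), the columns, and the idea of storing field conditions and peer opinions as the quality measures. It runs the DDL through Python's built-in `sqlite3` module so it can be tried without a server. The DDL is written in fairly plain SQL, but it would likely need adjusting for MySQL (for example, MySQL parses but ignores inline `REFERENCES` clauses, and enforces `CHECK` constraints only from 8.0.16 onward).

```python
import sqlite3

# Hypothetical schema: every table and column name below is an assumption
# made for illustration; none of them comes from the original answer.
DDL = """
CREATE TABLE observers (
    observer_id INTEGER PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE species (
    species_id  INTEGER PRIMARY KEY,
    common_name VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE observations (
    observation_id   INTEGER PRIMARY KEY,
    observer_id      INTEGER NOT NULL REFERENCES observers (observer_id),
    species_id       INTEGER NOT NULL REFERENCES species (species_id),
    bird_count       INTEGER NOT NULL CHECK (bird_count > 0),
    location         VARCHAR(100) NOT NULL,
    observed_at      TIMESTAMP NOT NULL,
    -- Quality-related data is stored like any other attribute.
    field_conditions VARCHAR(20) NOT NULL
        CHECK (field_conditions IN ('poor', 'average', 'good'))
);

-- One observer's opinion of how well another observer identifies a species.
CREATE TABLE peer_assessments (
    assessor_id  INTEGER NOT NULL REFERENCES observers (observer_id),
    observer_id  INTEGER NOT NULL REFERENCES observers (observer_id),
    species_id   INTEGER NOT NULL REFERENCES species (species_id),
    can_identify INTEGER NOT NULL CHECK (can_identify IN (0, 1)),
    PRIMARY KEY (assessor_id, observer_id, species_id),
    CHECK (assessor_id <> observer_id)
);
"""


def build_demo_db():
    """Create an in-memory database with the sketch schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    conn.executemany("INSERT INTO observers VALUES (?, ?)",
                     [(1, "Alice"), (2, "Bob"), (3, "Carol")])
    conn.execute("INSERT INTO species VALUES (1, 'Red-tailed hawk')")
    # The observation from the text: 3 red-tailed hawks at Cape May.
    conn.execute(
        "INSERT INTO observations VALUES (?, ?, ?, ?, ?, ?, ?)",
        (1, 1, 1, 3, "Cape May", "2011-02-05 16:30:00", "average"),
    )
    # Bob thinks Alice can identify red-tails; Carol does not.
    conn.executemany("INSERT INTO peer_assessments VALUES (?, ?, ?, ?)",
                     [(2, 1, 1, 1), (3, 1, 1, 0)])
    conn.commit()
    return conn


def quality_report(conn):
    """Per observation: field conditions, vouching peers, total peer opinions."""
    return conn.execute("""
        SELECT o.observation_id,
               o.field_conditions,
               COALESCE(SUM(p.can_identify), 0) AS vouching_peers,
               COUNT(p.assessor_id)             AS total_peers
        FROM observations AS o
        LEFT JOIN peer_assessments AS p
               ON p.observer_id = o.observer_id
              AND p.species_id  = o.species_id
        GROUP BY o.observation_id, o.field_conditions
        ORDER BY o.observation_id
    """).fetchall()


if __name__ == "__main__":
    for row in quality_report(build_demo_db()):
        print(row)
```

Whether "number of peers who vouch for the observer" is a meaningful quality measure is exactly the application-dependent question raised above; the sketch only shows that once you pick the measures, they go into ordinary tables with ordinary keys and constraints.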