Normalizing database entries with "chaining"
I'm new to database design, and I haven't found a definitive way to structure my vocabulary database. Irrelevant fields aside, the data I want to store is conceptually something like this:
Word 1
- Meaning 1 (1-n of these)
  - Example 1 (0-n of these)
  - Example 2
  - ...
- Meaning 2
- ...
Word 2
...
Now, a Word is identified by three attributes: Wordname, Language, and POS (part of speech). I've set this up as a compound key. From what I've read, I gather that the meanings and examples should be in separate tables, perhaps something like this:
Word table
- Key
- Wordname
- Language
- POS
- ...
Meaning table
- Key
- Wordname
- Language
- POS
- Meaning (1-n rows per key)
Example table
- Key
- Wordname
- Language
- POS
- Meaning
- Example (0-n rows per key)
But this strikes me as a horrific amount of data duplication. Would it be better to abstract out the wordname-language-POS key into a separate table and give each row a single unique key? Is there some approach that's altogether better?
Thanks much.
Comments (3)
In general, you can save yourself some headaches by generating a unique key for each row of each table, where the key is a simple integer, rather than actual data. Foreign key references are easier, and you don't have to deal with problems like "oops, someone misspelled a word, but that word is now part of a foreign key in another table!" Databases that enforce foreign key integrity can really make life difficult when key values change.
And, as you note, using words and other info as foreign keys is a lot of duplication. The whole point of normalization (well, one of them, anyway) is to eliminate duplication.
Most database engines will generate those keys for you, usually with a property called "identity." These databases will usually have an easy way to retrieve those keys programmatically, when new data is inserted. That gets more into code and implementation, however.
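As a minimal sketch of the idea above, assuming SQLite through Python's built-in `sqlite3` module: the table and column names (`word`, `meaning`, `id`, `word_id`) and the sample row are made up for illustration and are not from the question. An `INTEGER PRIMARY KEY` column plays the role of the engine-generated identity key, and `lastrowid` is how this particular driver hands the new key back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript(
    """
    CREATE TABLE word (
        id       INTEGER PRIMARY KEY,      -- integer key generated by the engine
        wordname TEXT NOT NULL,
        language TEXT NOT NULL,
        pos      TEXT NOT NULL,
        UNIQUE (wordname, language, pos)   -- the natural key is still unique
    );
    CREATE TABLE meaning (
        id      INTEGER PRIMARY KEY,
        word_id INTEGER NOT NULL REFERENCES word(id),
        meaning TEXT NOT NULL
    );
    """
)

cur = conn.execute(
    "INSERT INTO word (wordname, language, pos) VALUES (?, ?, ?)",
    ("recieve", "en", "verb"),  # misspelled on purpose
)
word_id = cur.lastrowid  # retrieve the key that was just generated

conn.execute(
    "INSERT INTO meaning (word_id, meaning) VALUES (?, ?)",
    (word_id, "to get or accept something"),
)

# Fixing the spelling touches one row; the meaning row still points at word_id.
conn.execute("UPDATE word SET wordname = ? WHERE id = ?", ("receive", word_id))

row = conn.execute(
    "SELECT w.wordname, m.meaning FROM meaning m JOIN word w ON w.id = m.word_id"
).fetchone()
print(row)
```

Because the foreign key holds only the integer, correcting the misspelling never has to cascade into the `meaning` table. Other engines expose the generated key differently, so treat `lastrowid` as specific to this driver.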
You're on the right track, but mind that there is a column limit.
In the MEANING table, the key would be a foreign key to the WORD.key value. This allows you to relate to the values in the WORD table without needing them duplicated in the MEANING table.
If MEANING.key is not unique, you can support infinite MEANING.meaning values.
[Example: WORD and MEANING sample rows]
If you want the meaning values in a particular order, you'll have to define a column to indicate the order somehow, e.g. meaning_id.
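The layout this answer describes might look roughly like the following sketch, again assuming SQLite via Python's `sqlite3`. The column names `key`, `meaning` and `meaning_id` come from the answer; the column types, the composite primary key on `("key", meaning_id)`, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE word (
        "key"    INTEGER PRIMARY KEY,
        wordname TEXT NOT NULL,
        language TEXT NOT NULL,
        pos      TEXT NOT NULL
    );
    CREATE TABLE meaning (
        "key"      INTEGER NOT NULL REFERENCES word("key"),  -- not unique here
        meaning_id INTEGER NOT NULL,                          -- order within a word
        meaning    TEXT NOT NULL,
        PRIMARY KEY ("key", meaning_id)                       -- assumed, not from the answer
    );
    """
)

# Hypothetical sample data: one word with two meanings, inserted out of order.
conn.execute("INSERT INTO word VALUES (1, 'bank', 'en', 'noun')")
conn.executemany(
    "INSERT INTO meaning VALUES (?, ?, ?)",
    [
        (1, 2, "the land alongside a river"),
        (1, 1, "a financial institution"),
    ],
)

ordered = [
    r[0]
    for r in conn.execute(
        'SELECT meaning FROM meaning WHERE "key" = ? ORDER BY meaning_id', (1,)
    )
]
print(ordered)
```

The word's name, language and part of speech are stored once in `word`; each `meaning` row repeats only the integer key, and `meaning_id` gives the rows a stable order.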
I too would advocate a key which is an integer. Your tables then become very simple:
Word
- KeyTable
- WordName
- Language
- PartOfSpeech
Meaning
- KeyTable
- KeyWord
- Description
Example
- KeyTable
- KeyMeaning
- Description
Given a word, you could then get all of its meanings relatively easily:
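One way such a lookup could be written, as a sketch only: it assumes SQLite via Python's `sqlite3`, treats `KeyTable` as each table's integer primary key and `KeyWord` as the foreign key to `Word`, and uses made-up sample rows. The helper `meanings_for` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Word (
        KeyTable     INTEGER PRIMARY KEY,
        WordName     TEXT NOT NULL,
        Language     TEXT NOT NULL,
        PartOfSpeech TEXT NOT NULL
    );
    CREATE TABLE Meaning (
        KeyTable    INTEGER PRIMARY KEY,
        KeyWord     INTEGER NOT NULL REFERENCES Word(KeyTable),
        Description TEXT NOT NULL
    );
    INSERT INTO Word VALUES
        (1, 'light', 'en', 'noun'),
        (2, 'light', 'en', 'adjective');
    INSERT INTO Meaning VALUES
        (1, 1, 'visible radiation'),
        (2, 1, 'a lamp'),
        (3, 2, 'not heavy');
    """
)


def meanings_for(wordname, language, pos):
    """Return every meaning of the word identified by name, language and POS."""
    rows = conn.execute(
        """
        SELECT m.Description
        FROM Word AS w
        JOIN Meaning AS m ON m.KeyWord = w.KeyTable
        WHERE w.WordName = ? AND w.Language = ? AND w.PartOfSpeech = ?
        ORDER BY m.KeyTable
        """,
        (wordname, language, pos),
    )
    return [r[0] for r in rows]


noun_meanings = meanings_for("light", "en", "noun")
print(noun_meanings)
```

The three identifying attributes are matched once, in `Word`; the join then follows the integer key into `Meaning`.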
Examples for a given word are also fairly simple:
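A sketch of the examples lookup under the same assumptions (SQLite via Python's `sqlite3`, `KeyTable` as the integer primary key, `KeyWord` and `KeyMeaning` as foreign keys, invented sample rows): it adds one more join, from `Meaning` to `Example`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Word (
        KeyTable     INTEGER PRIMARY KEY,
        WordName     TEXT NOT NULL,
        Language     TEXT NOT NULL,
        PartOfSpeech TEXT NOT NULL
    );
    CREATE TABLE Meaning (
        KeyTable    INTEGER PRIMARY KEY,
        KeyWord     INTEGER NOT NULL REFERENCES Word(KeyTable),
        Description TEXT NOT NULL
    );
    CREATE TABLE Example (
        KeyTable    INTEGER PRIMARY KEY,
        KeyMeaning  INTEGER NOT NULL REFERENCES Meaning(KeyTable),
        Description TEXT NOT NULL
    );
    INSERT INTO Word VALUES (1, 'light', 'en', 'noun');
    INSERT INTO Meaning VALUES
        (1, 1, 'visible radiation'),
        (2, 1, 'a lamp');               -- this meaning has no examples
    INSERT INTO Example VALUES
        (1, 1, 'Light travels faster than sound.'),
        (2, 1, 'The room was full of light.');
    """
)

# Word -> Meaning -> Example, following the integer keys at each step.
examples = conn.execute(
    """
    SELECT m.Description, e.Description
    FROM Word AS w
    JOIN Meaning AS m ON m.KeyWord = w.KeyTable
    JOIN Example AS e ON e.KeyMeaning = m.KeyTable
    WHERE w.WordName = ? AND w.Language = ? AND w.PartOfSpeech = ?
    ORDER BY m.KeyTable, e.KeyTable
    """,
    ("light", "en", "noun"),
).fetchall()
print(examples)
```

With inner joins, a meaning that has no examples simply contributes no rows. A `LEFT JOIN` on `Example` would keep such meanings in the result with a NULL example, if that is what the application needs.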