Table design
I was wondering whether this is a good design, assuming the tables are as follows:
ADDRESS(id, address, city_fk, stateFK, countryFK),
CITY(id, name, stateFK, countryFK),
STATE(id, name, countryFK),
COUNTRY(id, name)
Notice how the country FK is repeated in three tables, and the state FK in two? Can anyone tell me if this is a good design? If so, why? I don't see the need to repeat it so often.
Cheers
Comments (8)
You want something more like this:
And if it were me, I'd rename the fields a bit:
I'm not sure I would head down that path. I can understand having a country table and certainly a state table, but I'm less convinced by a city table whose job is to ensure that a city belongs to the particular country/state in question. I imagine the amount of data needed in your city table would be enormous, and I'm not sure I see the benefit. If you could explain in a little more detail what benefit you hope to get from a city table, I might be better able to answer this. Most systems I have seen have FK tables for country and state, but those tables are not necessarily related to one another.
I guess my question is "Good design for what?" If the precise integrity of an address is absolutely crucial to your design, then this might be the start of a fruitful discussion. On the other hand, if the purpose is to collect addresses and be able to store them in all the variety you might expect to get from real users, you might consider something with a little more flexibility: i.e. fewer tables with more optional fields and a good set of friendly validation rules in the UI.
I would do this:
Now tell me, what are the odds of having the same address in different cities, states, and countries? Minimal. So this should work. Remember that normalizing the DB to the extreme does not always work and is not always helpful.
Hope this helps
Best Regards!
Depends on how you have defined the primary keys on all the tables. If you have the id column as a key on all of the tables, you can reduce the columns in all of the tables so that they just keep the FK to the next region.
address (GIVEN)
city via address.city_fk
state via city.state_fk
country via state.country_fk
The design would be like Chris'. If you need all the data in one query you would do a join.
However, if id is not a key but a partial key (it cannot define the data set alone) in all of the tables, then I would assume this is a good design. An address would then be identified like this:
Just knowing "City 23" wouldn't be enough, because then the question emerges: the 23rd city in WHICH state, and which state in WHICH country? That would cause you to store the data in all of the tables.
But it really depends on how you have defined your keys.
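To make the partial-key case concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the schema from the answer above (that listing did not survive on this page); the composite keys, the column names (country_fk, state_fk, city_fk) and the sample rows are all assumptions for illustration. The point is only that when a city id is unique within a state rather than globally, the address has to carry the state and country columns, and a composite foreign key keeps them consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Hypothetical schema: ids are only unique inside their parent region.
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE state (
    country_fk INTEGER NOT NULL REFERENCES country(id),
    id         INTEGER NOT NULL,
    name       TEXT NOT NULL,
    PRIMARY KEY (country_fk, id)
);
CREATE TABLE city (
    country_fk INTEGER NOT NULL,
    state_fk   INTEGER NOT NULL,
    id         INTEGER NOT NULL,
    name       TEXT NOT NULL,
    PRIMARY KEY (country_fk, state_fk, id),
    FOREIGN KEY (country_fk, state_fk) REFERENCES state(country_fk, id)
);
CREATE TABLE address (
    id         INTEGER PRIMARY KEY,
    address    TEXT NOT NULL,
    country_fk INTEGER NOT NULL,
    state_fk   INTEGER NOT NULL,
    city_fk    INTEGER NOT NULL,
    FOREIGN KEY (country_fk, state_fk, city_fk)
        REFERENCES city(country_fk, state_fk, id)
);
INSERT INTO country VALUES (1, 'USA');
INSERT INTO state VALUES (1, 1, 'Texas'), (1, 2, 'Wyoming');
-- "City 23" exists twice, once per state.
INSERT INTO city VALUES (1, 1, 23, 'Dallas'), (1, 2, 23, 'Cheyenne');
INSERT INTO address VALUES (1, '1 Main St', 1, 1, 23);
""")

# The full (country, state, city) triple picks out exactly one city.
city_name = conn.execute("""
    SELECT c.name
    FROM address a
    JOIN city c ON c.country_fk = a.country_fk
               AND c.state_fk   = a.state_fk
               AND c.id         = a.city_fk
    WHERE a.id = 1
""").fetchone()[0]

# A triple that matches no city row is rejected by the composite FK.
try:
    conn.execute("INSERT INTO address VALUES (2, '2 Elm St', 1, 3, 23)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With keys defined this way the repeated columns are part of the key rather than redundant copies, which is the situation in which the original layout would make sense.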
What I always do with table design is look at it and work out what sort of trouble I can get into (i.e., what inconsistencies can be generated).
In your particular case, the obvious one is that you could create an address that's in the state of Wyoming and in the city of Dallas, which is in the state of decay, oops, I mean Texas :-)
Similarly you could have an address that is in the country of Australia and the state of Texas (in the US) and the city of Xolotllotl (a) in Peru in South America. I wouldn't want to have to explain your schema to the postman when he tracks you down.
How do you resolve that? You should probably only reference the city from the address, then the state from the city, then the country from the state.
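The inconsistency described above can be reproduced in a few lines. This is a sketch only, using Python's sqlite3 module and the table layout from the question; the snake_case column names (state_fk, country_fk) and the sample rows are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite

# The questioner's layout: state and country FKs repeated down the chain.
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE state   (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      country_fk INTEGER NOT NULL REFERENCES country(id));
CREATE TABLE city    (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      state_fk   INTEGER NOT NULL REFERENCES state(id),
                      country_fk INTEGER NOT NULL REFERENCES country(id));
CREATE TABLE address (id INTEGER PRIMARY KEY, address TEXT NOT NULL,
                      city_fk    INTEGER NOT NULL REFERENCES city(id),
                      state_fk   INTEGER NOT NULL REFERENCES state(id),
                      country_fk INTEGER NOT NULL REFERENCES country(id));
INSERT INTO country VALUES (1, 'USA'), (2, 'Australia');
INSERT INTO state   VALUES (1, 'Texas', 1), (2, 'Wyoming', 1);
INSERT INTO city    VALUES (1, 'Dallas', 1, 1);
-- Every FK below points at a real row, so the insert is accepted,
-- yet the row says: Dallas, Wyoming, Australia.
INSERT INTO address VALUES (1, '1 Main St', 1, 2, 2);
""")

# Find addresses whose own state/country disagree with their city's.
contradictions = conn.execute("""
    SELECT a.id
    FROM address a
    JOIN city c ON c.id = a.city_fk
    WHERE a.state_fk <> c.state_fk OR a.country_fk <> c.country_fk
""").fetchall()
print(contradictions)
```

Each foreign key is individually valid, so the database has no way to notice that the three of them contradict one another. Dropping the repeated columns removes the possibility altogether.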
But addresses are a tricky beast and the form they take depends a great deal on where they are located. Some have counties, some cities cross state boundaries and so on.
Unless you have a dire need for finding people in specific cities, I'd just incorporate that into a free-form address (most places have postcodes of some sort which is a better choice for that). You may well want to select by state or country, in which case leave those as separate fields.
My first attempt would have been:
You would have to distinguish between 'identical' states in different countries (e.g., Washington and Western Australia both being WA) but you could easily show the country in data entry screens so that the operator knows which state they're entering (or more likely, they'll choose the country first and the choices of state will be whittled down).
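As a rough illustration of the "choose the country first" idea (not the schema from this answer, which is missing from the page), the sketch below keeps state codes unique per country rather than globally. The code column, the UNIQUE constraint and the helper states_for_country are hypothetical names introduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE state (
    id         INTEGER PRIMARY KEY,
    country_fk INTEGER NOT NULL REFERENCES country(id),
    code       TEXT NOT NULL,
    name       TEXT NOT NULL,
    UNIQUE (country_fk, code)   -- 'WA' may repeat, but not within one country
);
INSERT INTO country VALUES (1, 'USA'), (2, 'Australia');
INSERT INTO state VALUES
    (1, 1, 'WA', 'Washington'),
    (2, 2, 'WA', 'Western Australia'),
    (3, 1, 'TX', 'Texas');
""")

def states_for_country(conn, country_name):
    """Return (code, name) pairs to offer once the operator has picked a country."""
    return conn.execute(
        """
        SELECT s.code, s.name
        FROM state s
        JOIN country c ON c.id = s.country_fk
        WHERE c.name = ?
        ORDER BY s.code
        """,
        (country_name,),
    ).fetchall()

usa_states = states_for_country(conn, "USA")
australian_states = states_for_country(conn, "Australia")
```

A data entry screen could call something like states_for_country after the country is selected, so the operator never sees both "WA" entries at once.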
(a) Xolotllotl is not a real city as far as I know, it's just one I made up that sounds like those beautiful Incan names of old.
Consider: If I know the state, and the state FK's the country, I transitively know the country and it's a redundant denormalization to include it again.
If I know the city, and the city FK's the state, I transitively know the state and it's a redundant denormalization to include it again.
So if I know the city, I know the state, and I know the country.
ADDRESS(id, address, city_fk),
CITY(id, name, stateFK),
STATE(id, name, countryFK),
COUNTRY(id, name)
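For what it's worth, here is how that chain of single foreign keys might look in practice, sketched with Python's sqlite3 module. The column names are adapted from the listing above (state_fk and country_fk in place of stateFK and countryFK); the sample data and the helper full_address are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Each table references only the next level up.
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE state   (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      country_fk INTEGER NOT NULL REFERENCES country(id));
CREATE TABLE city    (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      state_fk INTEGER NOT NULL REFERENCES state(id));
CREATE TABLE address (id INTEGER PRIMARY KEY, address TEXT NOT NULL,
                      city_fk INTEGER NOT NULL REFERENCES city(id));
INSERT INTO country VALUES (1, 'USA');
INSERT INTO state   VALUES (1, 'Texas', 1);
INSERT INTO city    VALUES (1, 'Dallas', 1);
INSERT INTO address VALUES (1, '1 Main St', 1);
""")

def full_address(conn, address_id):
    """Follow city -> state -> country to rebuild the complete address."""
    return conn.execute(
        """
        SELECT a.address, ci.name, s.name, co.name
        FROM address a
        JOIN city    ci ON ci.id = a.city_fk
        JOIN state   s  ON s.id  = ci.state_fk
        JOIN country co ON co.id = s.country_fk
        WHERE a.id = ?
        """,
        (address_id,),
    ).fetchone()

row = full_address(conn, 1)
```

The state and country are never stored on the address, so they cannot disagree with the city; the cost is the three joins needed to read them back.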
In practice, there are legitimate arguments for making city either an entity or an attribute. For something like a mailing address, it's probably simpler to make it an attribute.
For something like a "voter file" (a database used for electioneering and political campaigns) it may make sense to make a city an entity. It's especially important in the US state of Virginia, which is unique in the US in that incorporated cities are independent political jurisdictions, unlike the rest of the US, where jurisdictionally, most cities are part of counties.
Even in states other than Virginia, because cities that are part of counties have separate elections (for mayors and councils), in a political database it's useful to make cities (and, getting smaller, wards, precincts, townships and villages; and, getting larger, counties, HDs, SDs and CDs) into entities.
Now take, for example, Lakewood, in the US state of Ohio. Jurisdictionally, Lakewood is part of Cuyahoga County. In terms of political TV ads, it is part of the Cleveland-Akron-Canton television Designated Market Area. It lies just outside the city of Cleveland, which is also jurisdictionally part of Cuyahoga County. Lakewood is divided into four councilmanic wards, each of which is divided into fifteen or more precincts. It is part of State House District 13, which includes Lakewood and part of the western side of the city of Cleveland. It is part of State Senate District 23, which includes Brooklyn, Brook Park, Cleveland (Wards 7-21 and part of Ward 14), Lakewood, Linndale, Middleburg Heights, Parma and Parma Heights. Lakewood is in the 10th US Congressional District of Ohio (represented by Dennis Kucinich).
Now, for any given Ohio voter, how do we represent in the database who will be on his ballot when he votes?
Are you perhaps normalizing too far?
While you are reducing the cost of repeated data, you are possibly negating that saving by requiring 3 or 4 joins to get a whole address...
Personally I would stick to:
ADDRESS:
Of course you would have to index on things such as city/state/postcode/country, but it would possibly be a lot quicker than joining multiple tables.
(Note: I'm not a DBA, so I can't back up my claims, but this is what I use and it works fine)
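A single-table layout along these lines might look like the sketch below. The column list from this answer is missing on the page, so every column name, both index names and the sample rows here are assumptions, not the author's actual schema. No performance claim is made either way; the sketch only shows that lookups by country and state need no joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flat address table with the location parts as plain columns.
conn.executescript("""
CREATE TABLE address (
    id       INTEGER PRIMARY KEY,
    street   TEXT NOT NULL,
    city     TEXT NOT NULL,
    state    TEXT,
    postcode TEXT,
    country  TEXT NOT NULL
);
-- Indexes on the columns the answer suggests searching by.
CREATE INDEX idx_address_country_state_city ON address (country, state, city);
CREATE INDEX idx_address_postcode ON address (postcode);
INSERT INTO address (street, city, state, postcode, country) VALUES
    ('1 Main St',   'Dallas', 'Texas',             '75201', 'USA'),
    ('9 Oak Ave',   'Austin', 'Texas',             '78701', 'USA'),
    ('3 Hay St',    'Perth',  'Western Australia', '6000',  'Australia');
""")

# One table, no joins: filter directly on the indexed columns.
texas_streets = [
    row[0]
    for row in conn.execute(
        "SELECT street FROM address WHERE country = ? AND state = ? ORDER BY street",
        ("USA", "Texas"),
    )
]
```

The trade-off is the one discussed elsewhere in this thread: nothing in this table stops someone from entering Dallas with a state of Wyoming, so any consistency checking has to happen in the application.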