Address.City:字符串还是引用?
我必须创建一个数据库来存储大量人员及其地址,并且必须可以按地点进行搜索(即查找该城市或其他城市的人员)。
我怀疑是否应该将地址中的城市字段保留为简单的 varchar,还是创建一个城市表并引用它,以避免重复的城市名称等。
注意:我正在使用 SQL-Server 并将通过 EF 访问数据。
I have to create a database that will store a huge amount of people along with their addresses, that will have to be searchable by place (i.e. find people in that or other city).
I am doubting whether I should keep the city field in the address as simple varchar or create a city table and refer to it, to avoid duplicate city names etc.
Note: I am using SQL-Server and will access the data via EF.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
背景
权威表 - 一个参考表,其中包含其具有权威的“事物”的可能值。例如,国家/地区权限表将包含国家/地区的所有可能值。
答案
根据您对“大量……”的定义,您肯定会想要一些权限表。
根据您的地址的地理范围,其中部分或全部似乎是您的权限表的良好开端:
Background
Authority table - a reference table that contains the possible values for the "thing" for which it is an authority. For example, a Country authority table would contain all possible values for country.
An Answer
Depending on your definition of "huge amount ..." you definately will want to have some authority tables.
Based on the geographic range of your addresses, some or all of these seem like a good start for your authority tables:
您当前的设计是否有一个如下所示的 ADDRESS varchar 列:?
如果是,那么如果您必须按街道、城市、州、邮政编码或组合进行搜索,那么您的生活就会变得复杂。
我建议使用 ADDRESS 表,其中包含 STREET、CITY、STATE 和 ZIP 的单独列。这样您就可以单独查询每一项。请务必为您使用的每个 WHERE 子句添加索引。
下一个问题是进一步规范化(例如,城市、州、邮政编码的单独表)和联接以获得地址是否有用。我不确定是否有必要,但你可以尝试一下。
Doesyour current design have an ADDRESS varchar column that looks like this:?
If yes, then you complicate your life if you have to search by street, city, state, zip or combinations.
I'd recommend an ADDRESS table with separate columns for STREET, CITY, STATE, and ZIP. That way you can query for each one individually. Be sure to add an index for each WHERE clause you use.
The next question is whether it's useful to normalize further (e.g. separate tables for CITY, STATE, ZIP) and JOINing to get an address. I'm not sure that it's necessary, but you can try it.
我假设您的模型看起来像这样:
等等。
如果是这样,则无法消除
City
列中某些值的重复。如果您要创建一个City
表,那么城市名称显然是自然键的选择,这意味着您的City
列的实际数据将保持不变。如果您要使用代理键,那么您只需重复该键值而不是城市名称。不过,我不建议在这里使用代理键,因为城市的名称不太可能改变,而且您会添加额外的间接级别,但没有任何好处。最后,有一些潜在的场景需要
City
表:City
的所有值都来自已知值列表,您将其存储在城市列表表中。这还允许您向用户呈现城市列表,而不仅仅是允许他们以自由格式文本的形式输入数据。如果其中任何一个适用于您,那么请务必创建一个城市表。如果没有,那就没有必要了。
I'm assuming that your model looks something like this:
etc.
If so, then there's no way to eliminate the repeating of some value in the
City
column. If you were to create aCity
table, the city name would be an obvious choice for a natural key, which would mean that yourCity
column's actual data would remain unchanged. If you were to use a surrogate key, then you'd simply be repeating that key value instead of the city name. I would not suggest a surrogate key here, though, since a city's name is unlikely to change and you'd be adding an additional level of indirection with no benefit.In the end, there are a couple of potential scenarios that would warrant a
City
table:City
come from a list of known values, which you'd store in your city list table. This would also allow you to present the user with a list of cities rather than just allowing them to enter the data as free-form text.If either of those apply to you, then, by all means, create a city table. If not, then there's no need.