Database normalization for addresses
I am trying to build a database for a limousine company, and I am stuck on how much normalization I should apply to the addresses related to customers, drivers, affiliates and orders.
Basically the Affiliate and Driver addresses look like this:
address_line_1, address_line_2, city, state, zipcode, country
My problem comes from the order and customer addresses.
They should look like this:
address_line_1, address_line_2, city, state, zipcode, country, address_type_1 (home, business), address_type_2 (pick-up, drop-off - this only needs to be included for orders).
So across all four tables the address fields are the same, except for two fields that differ in the customer and order tables.
I need to mention that every record will be identified by a unique ID.
Example:
Customer ID - 10,000 - 99,999
Order ID - 100,000 - no limit
Driver ID - a1 - a999 (maybe)
Affiliate ID - 1,000 - 9,999
These are just examples, so don't spend too much time trying to understand them.
How many address tables should I use to create a well-normalized database?
At the moment I have three ideas in mind:
One Addresses table with all the fields included, plus an extra field describing the type of address (customer, order, affiliate, driver). I don't really like this one.
Two Addresses tables. One for drivers and affiliates, and one for customers and orders. In the second table I would have a field that is always NULL for customers. I don't like this one either.
Three Addresses tables. One for drivers and affiliates, one for customers, and one for orders. Having no unused fields leads me to think that this could be a better option than the other two.
Does anyone have advice on these three options, or maybe an even better option?
Thanks a lot.
UPDATE:
Don't worry yet about the numbering system for the table IDs. That was just an example. I still haven't had time to figure out the best numbering system. I will get to that once I have my address problem sorted out.
From Matt's answer I am tempted to leave the driver and affiliate tables with the addresses included and just sort out the customer and order tables somehow.
For customers, I would definitely need an Addresses table because a customer can have multiple addresses (home, business1, business2, favorite places, etc) that I want to have stored in their profile for easier access.
I forgot to mention something about the orders table which may change the problem a little.
For any order I would need to have a PICK-UP and a DROP-OFF location. But each can be either an address (street address) or an airport. This means that the fields for a street address do not match the airport-specific fields. So I am pretty sure that having four entities (pu_address, pu_airport, do_address, do_airport) inside one table (each with its specific fields) would leave me with unused space and a programming mess.
Ex:
for pick-up fields: Address_type, Address_line_1, ..., state, country, Airport, Airline, Flt no, ...
and for drop off same thing as pick-up.
So I still have a problem with the Orders table, and I am not sure how to move forward. I need both address and airport pick-up and drop-off locations to be included, with or without the use of extra tables.
UPDATE
Thanks again Matt. First, yes, I will store addresses in separate fields. The problem still remains for orders. Here is an example of the types of pick-up (pu) and drop-off (do) a limo service uses. Address: 123 Main St, Chicago, IL, 60640; Airport: ORD, AA, 123. I need to have all those fields somehow integrated into the table.
Options:
Order table
order_id, ..., pick-up fields which need to have both airports and addresses fields, drop-off fields with both airport and address fields.
This option still doesn't sound right.
Next would be to have two extra tables. One would be for addresses (including a field for recognizing pick-up or drop-off). The other would be for airports (with a field for pu or do as well).
I don't like this option either, because I would need two queries to retrieve the information for a single order record. First I would retrieve the order information, and once I know the type of pick-up and drop-off (airport or address), I would run another query to retrieve the specific pick-up and drop-off information.
So, again... what am I doing wrong? Am I missing something?
And yes, I will definitely use some verification system to make sure that the addresses are correct.
Comments (2)
It's probably too late now, but I would suggest one Addresses table (address_id, address_line_1, address_line_2, city, state, zipcode, country, address_type (FK to the AddressTypes table)), as this would follow the standard normalization rules. Your Orders table will have two foreign key relationships with the Addresses table: pickup_address_id and delivery_address_id. I have questions around the design of the Customers, Drivers and Affiliates tables, but without a better understanding of exactly how they relate it is difficult to prescribe a solution.
One option (but I don't know if it is the right one for you) would be to have a Parties table (party_id, party_type) which creates a supertype/subtype relationship (one to one-or-zero in each case) with Customers, Drivers and Affiliates, all of which are types of Party. I suggest reading one or two of David C. Hay's articles on data modelling for a better understanding.
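For illustration, here is a minimal sketch of the single Addresses table design described in this answer, using Python's built-in sqlite3 module. The table and column names come from the answer; the address_type_id key, the sample rows and the "456 Example Ave" address are assumptions made up for the example, not part of the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE AddressTypes (
    address_type_id INTEGER PRIMARY KEY,      -- assumed key name
    name            TEXT NOT NULL UNIQUE      -- e.g. 'home', 'business'
);
CREATE TABLE Addresses (
    address_id      INTEGER PRIMARY KEY,
    address_line_1  TEXT NOT NULL,
    address_line_2  TEXT,
    city            TEXT NOT NULL,
    state           TEXT NOT NULL,
    zipcode         TEXT NOT NULL,
    country         TEXT NOT NULL,
    address_type    INTEGER NOT NULL REFERENCES AddressTypes(address_type_id)
);
CREATE TABLE Orders (
    order_id            INTEGER PRIMARY KEY,
    pickup_address_id   INTEGER NOT NULL REFERENCES Addresses(address_id),
    delivery_address_id INTEGER NOT NULL REFERENCES Addresses(address_id)
);
""")

# Sample data (made up for the example).
conn.executemany("INSERT INTO AddressTypes VALUES (?, ?)",
                 [(1, "home"), (2, "business")])
conn.executemany(
    "INSERT INTO Addresses VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "123 Main St", None, "Chicago", "IL", "60640", "US", 1),
        (2, "456 Example Ave", None, "Chicago", "IL", "60640", "US", 2),
    ],
)
conn.execute("INSERT INTO Orders VALUES (100000, 1, 2)")

# One query returns the order with both addresses by joining Addresses twice.
row = conn.execute("""
    SELECT o.order_id, pu.address_line_1, pu.city, d.address_line_1, d.city
    FROM Orders o
    JOIN Addresses pu ON pu.address_id = o.pickup_address_id
    JOIN Addresses d  ON d.address_id  = o.delivery_address_id
    WHERE o.order_id = ?
""", (100000,)).fetchone()
print(row)
```

Joining the Addresses table twice under different aliases is what lets a single query return both the pick-up and the delivery address, so the order lookup does not need a second round trip.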
I actually work in the address verification industry with SmartyStreets, where processing and storing addresses is our area of expertise. In my experience I've seen a number of situations quite like yours.
I'm initially concerned with your segmenting ID numbers based on the type of record. If the four types of records (Customers, Drivers, Affiliates, Orders) are stored in different tables, why are the ID range limits needed? (Update: this isn't really the main issue at hand...)
Now, a bit about database design. Ideally, your design should reflect the operation of your core domain (that is, coordinating customers, orders, drivers, etc), without being coupled merely to the address data. While the addresses may be important, they're not the core operation of your business. On this ground and from what I've gathered from your original post, I would immediately hesitate to store the addresses separately from the actual record.
While you will have similar fields in each table, they represent different business purposes, and you won't risk unused, unnecessary fields. So the question isn't so much "how many address tables do I make," it's more a question of even making any tables for addresses only.
While addresses come in many shapes and forms, it's important for the limo company to have correct address information, and for your database to be normalized. The USPS (I assume you're US-based) certifies certain vendors to provide address normalization services. This is called CASS™ Certification. Run each address through a CASS™ service and you're done. The addresses will look the same, have complete information, and be deliverable, too. I suggest you start your search with something like LiveAddress, which will verify addresses at point-of-entry, or a CASS list scrubbing service, which will verify a batch of addresses at once (and warn you of duplicates).
UPDATE: If a customer may have several addresses, then yes, I would advocate using a separate table for that. However, you'll still want to standardize/verify them with CASS so that, if needed, you can pull out duplicates later (plus you'll know the addresses actually exist).
So, except for that, consider storing each address inline with the actual record it associates with (not in separate tables).
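As a rough sketch of that split, assuming SQLite via Python's sqlite3 module: drivers keep their address inline, while customers get a separate table because they can have several addresses. All table names, column names and sample values here are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One address per driver, stored inline with the record itself.
CREATE TABLE drivers (
    driver_id      INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    address_line_1 TEXT NOT NULL,
    address_line_2 TEXT,
    city           TEXT NOT NULL,
    state          TEXT NOT NULL,
    zipcode        TEXT NOT NULL,
    country        TEXT NOT NULL
);
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- Many addresses per customer, so these get their own table.
CREATE TABLE customer_addresses (
    customer_address_id INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL REFERENCES customers(customer_id),
    label          TEXT NOT NULL,          -- 'home', 'business1', ...
    address_line_1 TEXT NOT NULL,
    address_line_2 TEXT,
    city           TEXT NOT NULL,
    state          TEXT NOT NULL,
    zipcode        TEXT NOT NULL,
    country        TEXT NOT NULL
);
""")

# Sample data (made up for the example).
conn.execute("INSERT INTO customers VALUES (10000, 'Sample Customer')")
conn.executemany(
    "INSERT INTO customer_addresses "
    "(customer_id, label, address_line_1, city, state, zipcode, country) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (10000, "home", "123 Main St", "Chicago", "IL", "60640", "US"),
        (10000, "business1", "456 Example Ave", "Chicago", "IL", "60640", "US"),
    ],
)

# List the saved addresses for one customer profile.
labels = [r[0] for r in conn.execute(
    "SELECT label FROM customer_addresses WHERE customer_id = ? ORDER BY label",
    (10000,),
)]
print(labels)
```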
For further questions or direction, I can personally assist.
UPDATE
About separating addresses from airports: that's potentially a valid distinction depending on your business needs, but remember that airports have addresses, too. You could add a field to your table to store the name of the firm or location the address points to, such as "O'Hare International Airport." This could consolidate a few of the fields. Also, I suggest you store the address in separate fields by component (Street, City, State, ZIP, etc).
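One way this could look, again as a sketch in Python's sqlite3 module rather than a definitive design: a single locations table whose optional location_name holds the airport name, so street addresses and airports share the same fields. The table layout, the choice to keep airline and flight_no on the order, and the airport's street address and ZIP shown here are all assumptions (the airport address is a placeholder, not the real one).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE locations (
    location_id    INTEGER PRIMARY KEY,
    location_name  TEXT,             -- NULL for a plain street address
    address_line_1 TEXT NOT NULL,
    city           TEXT NOT NULL,
    state          TEXT NOT NULL,
    zipcode        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id            INTEGER PRIMARY KEY,
    pickup_location_id  INTEGER NOT NULL REFERENCES locations(location_id),
    dropoff_location_id INTEGER NOT NULL REFERENCES locations(location_id),
    airline             TEXT,        -- assumption: flight details live on the order
    flight_no           TEXT
);
""")

conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, None, "123 Main St", "Chicago", "IL", "60640"),
        # Placeholder street address and ZIP for the airport row.
        (2, "O'Hare International Airport", "1 Airport Rd (placeholder)",
         "Chicago", "IL", "60666"),
    ],
)
conn.execute("INSERT INTO orders VALUES (100000, 1, 2, 'AA', '123')")

# A single query covers both cases: show the location name when there is one,
# otherwise fall back to the street address.
row = conn.execute("""
    SELECT o.order_id,
           COALESCE(pu.location_name, pu.address_line_1) AS pickup,
           COALESCE(d.location_name, d.address_line_1)   AS dropoff,
           o.airline, o.flight_no
    FROM orders o
    JOIN locations pu ON pu.location_id = o.pickup_location_id
    JOIN locations d  ON d.location_id  = o.dropoff_location_id
    WHERE o.order_id = ?
""", (100000,)).fetchone()
print(row)
```

Because an airport is just another row with a name attached, the order record can be read in one query whether the pick-up or drop-off is a street address or an airport.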