Best practices/standards for storing addresses in a SQL database
I am wondering if there is some sort of "standard" for storing US addresses in a database? It seems this is a common task, and there should be some sort of a standard.
What I am looking for is a specific schema of how the database tables should work and interact, already in third normal form, including data types (MySQL). A good UML document would work.
Maybe I'm just being lazy, but this is a very common task, and I am sure someone has published an efficient way to do this somewhere. I just don't know where to look and Google isn't helping. Please point me to the resource. Thanks.
EDIT
Although this is more of a general question, I would like to clarify my specific needs.
Addresses will be used to specify road addresses of locations of events. These addresses will need to be in a format that can be best broken down and searched, and also used by any third-party applications I may end up linking my data source to.
Also, data will be geocoded (longitude, latitude) on entry and stored separately, so it must fit the (as yet undecided) protocol of whatever geocoder / application / library does that.
Comments (6)
For international addresses, refer to the Universal Postal Union's Postal Addressing Systems database.
For U.S. addresses, refer to USPS Publication 28 "Postal Addressing Standards".
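The USPS wants the following unpunctuated address components concatenated on a single line:
- house number
- predirectional (N, SE, etc.)
- street name
- suffix (AVE, BLVD, etc.)
- postdirectional (SW, E, etc.)
- unit (APT, STE, etc.)
- apartment/suite number
E.g., 102 N MAIN ST SE APT B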
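As a rough illustration of that concatenation, here is a minimal Python sketch. The function name and parameter names are my own assumptions, not USPS terminology or any library's API, and the cleanup only strips periods and commas and uppercases the text.

```python
from typing import Optional


def delivery_address_line(
    house_number: str,
    street: str,
    predirectional: Optional[str] = None,
    suffix: Optional[str] = None,
    postdirectional: Optional[str] = None,
    unit: Optional[str] = None,
    unit_number: Optional[str] = None,
) -> str:
    """Join the non-empty components in order, uppercased and unpunctuated."""
    parts = [house_number, predirectional, street, suffix,
             postdirectional, unit, unit_number]
    cleaned = []
    for part in parts:
        if not part:
            continue  # skip components that were not supplied
        # Drop periods and commas, collapse whitespace, uppercase.
        text = part.replace(".", "").replace(",", "")
        text = " ".join(text.split()).upper()
        if text:
            cleaned.append(text)
    return " ".join(cleaned)


line = delivery_address_line("102", "Main", predirectional="N", suffix="St.",
                             postdirectional="SE", unit="Apt", unit_number="B")
print(line)  # 102 N MAIN ST SE APT B
```

This only assembles components that are already separated; it does not validate or standardize them the way CASS software would.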
If you keep the entire address line as a single field in your database, input and editing are easy, but searches can be more difficult (e.g., in SOUTH EAST LANE, is the street EAST as in S EAST LN, or is it LANE as in SE LANE ST?). If you keep the address parsed into separate fields, searches for components like street name or apartment become easier, but you have to append everything together for output, you need CASS software to parse correctly, and PO boxes, rural route addresses, and APO/FPO addresses have special parsings.
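To make the ambiguity concrete, the sketch below lists every reading a naive parser could produce for a street string. The lookup tables are tiny illustrative subsets that I made up for the example, and the function is hypothetical; real parsing needs CASS software, as noted above.

```python
DIRECTIONALS = {
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
    "NORTHEAST": "NE", "NORTHWEST": "NW",
    "SOUTHEAST": "SE", "SOUTHWEST": "SW",
}
SUFFIXES = {"LANE": "LN", "STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD"}


def candidate_parses(street_text):
    """Return every (predirectional, street name, suffix) reading of the text."""
    words = street_text.upper().split()
    results = []
    # Try reading the first 0, 1, or 2 words as a predirectional.
    for n in (0, 1, 2):
        head, rest = words[:n], words[n:]
        if not rest:
            continue  # at least one word must remain for the street name
        if n == 0:
            pre = None
        else:
            pre = DIRECTIONALS.get("".join(head))
            if pre is None:
                continue  # the leading words do not form a directional
        # Read a trailing suffix word only if a street name is left over.
        if len(rest) >= 2 and rest[-1] in SUFFIXES:
            name, suffix = " ".join(rest[:-1]), SUFFIXES[rest[-1]]
        else:
            name, suffix = " ".join(rest), None
        results.append((pre, name, suffix))
    return results


for parse in candidate_parses("South East Lane"):
    print(parse)
# (None, 'SOUTH EAST', 'LN')
# ('S', 'EAST', 'LN')
# ('SE', 'LANE', None)
```

Three plausible readings come out of three words, and nothing in the string itself says which one is right. That is the case for either storing the components separately at entry time or relying on a validator that knows the actual streets.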
A physical location with multiple addresses at that location is either a multiunit building, in which case letters/numbers after units like APT and STE designate the address, or it's a Commercial Mail Receiving Agency (e.g., a UPS store) and a maildrop/private mailbox number is appended (like 100 MAIN ST STE B PMB 102), or it's a business with one USPS delivery point and mail is routed after USPS delivery (which usually requires a separate mailstop field which the company might need but the USPS won't want on the address line).
A contact with more than one physical address is usually a business or person with a street address and a PO box. Note that it's common for each address to have a different ZIP code.
It's quite typical that one business transaction might have a shipping address and a billing address (again, with different ZIP codes). The information I keep for EACH address is:
- contact's name prefix (DR, MS, etc.)
- contact's name
- contact's name suffix (III, PHD, etc.)
- mail stop
- company name
- address (one line)
- city
- state
- ZIP code
- country
I typically print mail stops somewhere between the person's name and company because the country contains the state/ZIP, which contains the city, which contains the address, which contains the company, which contains the mail stop, which contains the person. I use CASS software to validate and standardize addresses when entered or edited.
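A minimal sketch of such a record and the print order described above, going from most specific (person) to least specific (country). The class name, field names, and sample values are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MailingAddress:
    name: str
    address_line: str
    city: str
    state: str
    postal_code: str
    name_prefix: Optional[str] = None   # DR, MS, ...
    name_suffix: Optional[str] = None   # III, PHD, ...
    mail_stop: Optional[str] = None     # internal routing, kept off the address line
    company: Optional[str] = None
    country: Optional[str] = None

    def label_lines(self) -> List[str]:
        """Return label lines ordered person, mail stop, company, address, city/state/ZIP, country."""
        person = " ".join(
            p for p in (self.name_prefix, self.name, self.name_suffix) if p
        )
        lines = [person]
        if self.mail_stop:
            lines.append(self.mail_stop)
        if self.company:
            lines.append(self.company)
        lines.append(self.address_line)
        lines.append(f"{self.city} {self.state} {self.postal_code}")
        if self.country:
            lines.append(self.country)
        return lines


# Made-up sample data.
shipping = MailingAddress(
    name="JANE DOE", name_prefix="DR", mail_stop="MAIL STOP 4",
    company="EXAMPLE CORP", address_line="100 MAIN ST STE B",
    city="ANYTOWN", state="NY", postal_code="12345",
)
print("\n".join(shipping.label_lines()))
```

A transaction with separate shipping and billing addresses would simply hold two such records.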
First, as a person who spends most of their professional day working with addresses, I can say they are hard to manage from a data perspective.
If you ask 5 people what address they live at, you will find that you get 5 different answers. While you and I can tell that 123 Main Street Apt 1 and Apt 1 123 Main Street are the same address, the database program will have a challenge.
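To show why a plain string comparison fails and what even a crude normalization buys you, here is a naive sketch. The abbreviation tables are small illustrative subsets and the function is hypothetical; it is in no way a substitute for CASS-certified standardization.

```python
import re

STREET_WORDS = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "LANE": "LN"}
UNIT_WORDS = {"APARTMENT": "APT", "APT": "APT", "SUITE": "STE", "STE": "STE", "UNIT": "UNIT"}
UNIT_PATTERN = re.compile(r"\b(APARTMENT|APT|SUITE|STE|UNIT)\s+(\w+)\b")


def naive_normalize(address: str) -> str:
    """Uppercase, strip punctuation, abbreviate, and move the unit to the end."""
    text = re.sub(r"[.,#]", " ", address.upper())
    text = " ".join(text.split())
    unit = ""
    match = UNIT_PATTERN.search(text)
    if match:
        # Pull the unit designator and its number out of wherever they appear.
        unit = f"{UNIT_WORDS[match.group(1)]} {match.group(2)}"
        text = text[:match.start()] + " " + text[match.end():]
    words = [STREET_WORDS.get(word, word) for word in text.split()]
    if unit:
        words.append(unit)
    return " ".join(words)


a = "123 Main Street Apt 1"
b = "Apt 1 123 Main Street"
print(a == b)                                    # False
print(naive_normalize(a) == naive_normalize(b))  # True
print(naive_normalize(a))                        # 123 MAIN ST APT 1
```

This handles only the two spellings from the example. Input such as a street that is itself named UNIT, or a unit designator with no number, would trip it up.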
If you are using United States-centric addresses, CASS-certified software from almost any vendor will standardize your addresses reasonably well. I would recommend a simple format as follows:
However, if you want a universal address, I would look at the ADIS standard from IdeaAlliance. This standard can be used to break down (parse) addresses from almost any country into the relevant parts. They can then be put back together using templates/components based on the Universal Postal Union standards (UPU S42 Standard on International Postal Address Components and Templates).
The big plus of this format is that addresses that don't exist in a postal database like CASS can be entered and stored as separate parts.
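One way to keep the parts separate, sketched with Python's built-in sqlite3 module. This is not the ADIS element set: the table name, column names, and sample row are illustrative assumptions, and the validated flag is a hypothetical way to mark rows that no postal database has confirmed yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE address (
        id              INTEGER PRIMARY KEY,
        house_number    TEXT,
        predirectional  TEXT,
        street_name     TEXT NOT NULL,
        suffix          TEXT,
        postdirectional TEXT,
        unit            TEXT,
        unit_number     TEXT,
        city            TEXT NOT NULL,
        state           TEXT,
        postal_code     TEXT,
        country         TEXT NOT NULL,
        validated       INTEGER NOT NULL DEFAULT 0  -- 1 once a validator confirms the row
    )
""")

# Store an address as parts even though nothing has validated it yet.
conn.execute(
    "INSERT INTO address (house_number, predirectional, street_name, suffix,"
    " city, state, postal_code, country) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("102", "N", "MAIN", "ST", "ANYTOWN", "NY", "12345", "US"),
)

# Searching on a single component is a plain equality test.
rows = conn.execute(
    "SELECT house_number, street_name, validated FROM address WHERE street_name = ?",
    ("MAIN",),
).fetchall()
print(rows)  # [('102', 'MAIN', 0)]
```

The same column layout carries over to MySQL with the appropriate types (for example, VARCHAR columns in place of TEXT).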
Very similar questions have been asked before.
Addresses are messy at best.
It partly depends on what you want to do with the addresses. If you're going to use them to mail things to people, then you simply need to record the image that will appear on the address label in a convenient form. If you're going to analyze the address, you have to work a lot harder.
Remember that the first time you have to deal with someone outside the US, all previous rules go astray. You may be strictly US-only, but beware.
I looked into this a while ago, but for international addresses. I didn't find much in the way of a consensus. However, for the US, I found the succinctly named United States Thoroughfare, Landmark, and Postal Address Data Standard (Draft):
http://www.fgdc.gov/standards/projects/FGDC-standards-projects/street-address/index_html
I don't think that they actually provide any specific database schema ideas, but it might be a good starting point.
First, the "best" means of storing an address depends greatly on how it will be used. Is it just for reference, or for searches on, say, city? Do you plan on addressing envelopes? Are you going to integrate with a shipping system like FedEx or UPS? Will you store non-US addresses? Once you get into the realm of integrating with something that ships, you should start looking at CASS (the USPS Coding Accuracy Support System). This is a specification for handling USPS addresses. There are CASS-certified applications out there that will store and verify addresses. Thus, the second best practice would be to avoid reinventing the wheel and see whether there is a system out there that will solve your problem, especially if you are going to go international. You want to leverage the fact that someone else has worked out all the details of how to properly and efficiently store addresses for many countries around the world instead of having to do that investigation yourself.
I've had to try to do this before, and I found this document, which gives you some pointers. I ended up shelving my schema since my application does have to deal with international addresses.