这是一个很好的位置数据库架构吗

发布于 2024-12-25 11:28:25 字数 1358 浏览 4 评论 0原文

我正在开发一个特定于位置的应用程序 - 将其视为商店定位器，商店所有者在其中输入他们的地址信息，而其他用户只能看到一定范围内的附近商店。然而，从某种意义上说，它有点不同，不需要确切的位置，只需要城市/州（我知道很奇怪）。我考虑过存储位置的模式，并决定采用这个模式。

地点

id                      -- int
formatted_address       -- varchar(200)
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city                    -- varchar(40)
state                   -- varchar(40)
state_code              -- varchar(3)
postal_code             -- varchar(10)
country                 -- varchar(40)
country_code            -- varchar(3)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

以下是有关该应用程序的一些注意事项：

我想对国际地点敞开大门
我计划使用地理编码服务来搜索和验证店主指定的地点
我确实只需要纬度/经度，但显示商店信息需要其他数据。
formatted_address 字段将包含完全格式化的地址 - 例如，Giants Stadium, 50 NJ-120, East Rutherford, NJ 07073, USA - 以允许更容易搜索存储的位置
可能会有很多重复字段，因为每行可能具有不同的粒度级别 - 例如，123 Main Street, City, State 12345 与 不同>Main Street, City, State 12345 因为一个有指定的街道号码，而另一个则不

知道该模式不是很标准化，但我也认为没有必要再对其进行标准化，因为位置非常复杂，其中这就是为什么我依赖稳定的地理编码服务（谷歌）。另外，我计划允许自由格式的文本输入/搜索，因此不需要任何下拉列表。

考虑到我提到的内容，有人认为有什么问题或有任何改进吗？我可以看到这张桌子变得相当大。

原文

I'm working on an application that is location specific -- think of it as a store locator where store owners enter their address information and other users can only see nearby stores within a certain range. However, it's a little different in the sense that an exact location is not required, only the city/state is required (weird, I know). I have thought about the schema for storing locations, and have decided on this one.

Locations

id                      -- int
formatted_address       -- varchar(200)
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city                    -- varchar(40)
state                   -- varchar(40)
state_code              -- varchar(3)
postal_code             -- varchar(10)
country                 -- varchar(40)
country_code            -- varchar(3)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

Here are some notes about the application:

I want to keep the door open for international locations
I plan to use a geocoding service to search for and validate the locations specified by the store owner
I truly only need the lat/lon, but the other data is necessary for displaying store information
The formatted_address field will contain the fully formatted address -- e.g., Giants Stadium, 50 NJ-120, East Rutherford, NJ 07073, USA -- to allow for easier searching of stored locations
There will possibly be a lot of duplicate fields, because each row may have a different level of granularity -- for instance, 123 Main Street, City, State 12345 is different from Main Street, City, State 12345 because one has a specified street number and the other doesn't

I understand that the schema is not very normalized, but I also do not see the need to normalize it any more because locations are very complex, which is why I'm relying on a stable geocode service (google). Also, I plan to allow freeform text input/search, so theres no need for any dropdown lists.

Does anybody see anything wrong or have any improvements, taking into consideration what I've mentioned? I can see this table growing rather large.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

雨轻弹 2025-01-01 11:28:25

我不这么认为。这是我的两分钟概要：

这非常糟糕的标准化。至少应该将city->country移出到另一个表（并从那里标准化）。我相信邮政编码可以跨越城市边界（或者我记错了）；我不知道有这样一个跨越国界的城市。

formatted_address 是一个“优化”，应该是一个计算字段：也就是说，重新创建它的所有数据都应该存在于其他地方。（这意味着现在不需要担心。）

快乐设计。

简单的“更规范化”形式只是执行上述建议：

LOCATIONS
location_id             -- int
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city_id                 -- int
postal_code             -- varchar(10)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

CITIES
city_id                 
name                    -- varchar
-- similarly, the following should be normalized to STATES and COUNTRIES
state                   -- varchar(40)
state_code              -- varchar(3)
country                 -- varchar(40)
country_code            -- varchar(3)

当然，CITIES可以进一步规范化，因此可以一个POSTALS表：我知道得不够关于邮政编码或应用程序域。 postal_code 充当隐式复合代理 FK 的一部分，因此它并不像它所在的那样超级可怕。但是，将其移至单独的表中可以轻松实现验证和完整性约束。

编辑：规范化 POSTALs 表是最好的，因为对于给定城市只有很少数量的邮政编码有效：不过，我不确定邮政编码和城市之间的关系，所以我无法推荐如何做到这一点。也许看看现有的使用模式？

I do not think so. Here is my two-minute synopsis:

This very badly normalized. At least city->country should be moved out to a different table (and normalized from there). I believe postal codes can cross city boundaries though (or I am very badly misremembering); I am not aware of such a city that crosses a state boundary.

formatted_address is an "optimization" and should likely be a computed field: that is, all the data to re-create it should exist elsewhere. (This means that it doesn't need to worried about now.)

Happy designing.

The simple "more-normalized" form just doing the above proposed:

LOCATIONS
location_id             -- int
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city_id                 -- int
postal_code             -- varchar(10)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

CITIES
city_id                 
name                    -- varchar
-- similarly, the following should be normalized to STATES and COUNTRIES
state                   -- varchar(40)
state_code              -- varchar(3)
country                 -- varchar(40)
country_code            -- varchar(3)

Of course, CITIES can be further normalized, and so could a POSTALS table: I don't know enough about postal codes, or the application domain though. postal_code acts as part of an implicit compound-surrogate-FK so it's not super terrible as it is there. However, moving it into a separate table could easily allow verification and integrity constraints.

Edit: Normalizing a POSTALs table would be best, as only a very samll number of postal codes are valid for a given city: I am not sure the relation between a postal code and a city, though, so I can't recommend how to do this. Perhaps look at existing schemas used?

回复收藏 0 原文