Which is better: has_one (a single-column table) or a string column?
Suppose I have a table called cities (id, name) and another called people (id, name, city_id).
Now, you can add as many cities as you want, without any kind of limit. So, in this situation, wouldn't it be better to have the city column inside people, and when I want to get the different cities I can just call DISTINCT?
Is the complexity I save better than the flexibility I lose?
Are the joins I save better than the distinct I have to use?
If I set the city column as an index, would it be as fast as looking up the ID in another table?
I must say that in the application I'm working on I have around 5 tables of this kind, so it's, like, a lot of joins.
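To make the two designs concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Rails. The people_flat table, the index name, and the sample rows are made up for illustration; only cities and people come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A: normalized, people reference cities by id
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city_id INTEGER NOT NULL REFERENCES cities(id)
    );
    -- Design B (hypothetical name): city stored as an indexed string
    CREATE TABLE people_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE INDEX idx_people_flat_city ON people_flat(city);
""")

conn.executemany("INSERT INTO cities (id, name) VALUES (?, ?)",
                 [(1, "Lisbon"), (2, "Porto")])
conn.executemany("INSERT INTO people (name, city_id) VALUES (?, ?)",
                 [("Ana", 1), ("Rui", 1), ("Eva", 2)])
conn.executemany("INSERT INTO people_flat (name, city) VALUES (?, ?)",
                 [("Ana", "Lisbon"), ("Rui", "Lisbon"), ("Eva", "Porto")])

# Listing the cities: a plain read in design A, a DISTINCT in design B.
cities_a = [r[0] for r in conn.execute("SELECT name FROM cities ORDER BY name")]
cities_b = [r[0] for r in conn.execute(
    "SELECT DISTINCT city FROM people_flat ORDER BY city")]

# Listing people with their city: a JOIN in design A, no JOIN in design B.
people_a = conn.execute(
    "SELECT p.name, c.name FROM people p "
    "JOIN cities c ON c.id = p.city_id ORDER BY p.id").fetchall()
people_b = conn.execute(
    "SELECT name, city FROM people_flat ORDER BY id").fetchall()
```

Both designs return the same data here; the trade-off is where the extra work goes (the JOIN in one case, the DISTINCT in the other).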
Comments (2)
DISTINCT will work reasonably well if all your users have different cities. However, the more people you have from the same city, the higher the overhead of not separating the data. The best-case optimization would be a unique scan on the index for the city column.
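As a rough illustration of that overhead, the sketch below (Python's sqlite3, with a hypothetical people_flat table and made-up data) compares how many city values sit in the people table with how many rows a separate cities table would hold. How a particular database actually executes DISTINCT varies, so this only counts entries and does not measure time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE people_flat (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE INDEX idx_people_flat_city ON people_flat(city);
""")

city_list = ["Lisbon", "Porto", "Faro"]
conn.executemany("INSERT INTO cities (name) VALUES (?)",
                 [(c,) for c in city_list])
# 3000 people spread evenly over 3 cities (illustrative data only).
conn.executemany(
    "INSERT INTO people_flat (name, city) VALUES (?, ?)",
    [(f"person{i}", city_list[i % 3]) for i in range(3000)],
)

# City values stored with people: one per person.
index_entries = conn.execute(
    "SELECT COUNT(city) FROM people_flat").fetchone()[0]
# Rows in the lookup table: one per city.
lookup_rows = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
# DISTINCT has to boil the per-person values down to the same small set.
distinct_cities = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM people_flat").fetchone()[0]
```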
Putting the city column in the people table can also slow certain access requests to the people table.
The complexity can make maintaining the data quality of your cities table easier. Assuming the appropriate metadata is available, it is reasonably simple to validate new cities which have been added. This allows you to clean up issues like having the cities 'Newyork', 'new york', 'New York', and 'new York'.
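One way a separate cities table can catch such variants is sketched below with Python's sqlite3. The name_key column, the normalize rule (lower-case, whitespace removed), and the find_or_create_city helper are assumptions made for this example, not something the answer prescribes.

```python
import sqlite3

def normalize(name: str) -> str:
    """Hypothetical normalization: lower-case and drop all whitespace."""
    return "".join(name.lower().split())

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cities (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        name_key TEXT NOT NULL UNIQUE  -- normalized form, one row per city
    )
""")

def find_or_create_city(conn: sqlite3.Connection, name: str) -> int:
    """Return the id of the matching city, inserting it only if it is new."""
    key = normalize(name)
    row = conn.execute(
        "SELECT id FROM cities WHERE name_key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO cities (name, name_key) VALUES (?, ?)",
        (name.strip(), key))
    return cur.lastrowid

variants = ["New York", "Newyork", "new york", "new York"]
ids = [find_or_create_city(conn, v) for v in variants]

# With a free-text column these would count as four different cities.
free_text_distinct = len(set(variants))
stored_cities = conn.execute("SELECT name FROM cities").fetchall()
```

All four spellings resolve to the same row, and the first spelling seen becomes the stored display name. A real application would probably need a more careful rule than this one.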
The question of which approach will be faster really depends on the application. If the city is always required when accessing a person's record, putting the city in the person record might be faster. If you frequently need a list of cities, then having them in a separate table is better.
Having the extra tables may make your queries a little more complex. However, you will likely gain significant data quality. The flexibility you get by putting the city column in the people table will likely be at the expense of data quality.
Your database designer appears to have done a good job.
You can always cache the cities table. If you use a string, you lose object orientation. Let's say you want an abbreviation for cities, or a city has_many zip_codes, or a city has_many area_codes. Now you will have a pain upgrading your app.
You can always do something like this to avoid the extra query...
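The snippet this sentence pointed to is not included above. One plausible reading, offered as an assumption rather than the answerer's actual code, is to fetch each person together with their city in a single JOIN instead of issuing one extra lookup per person (in Rails this is usually done with eager loading). A sketch with Python's sqlite3, where the run helper and the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city_id INTEGER NOT NULL REFERENCES cities(id)
    );
""")
conn.executemany("INSERT INTO cities (id, name) VALUES (?, ?)",
                 [(1, "Lisbon"), (2, "Porto")])
conn.executemany("INSERT INTO people (name, city_id) VALUES (?, ?)",
                 [("Ana", 1), ("Rui", 1), ("Eva", 2)])

queries = 0

def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global queries
    queries += 1
    return conn.execute(sql, params).fetchall()

# Approach 1: load people, then look up each city separately (N + 1 queries).
lazy = []
for name, city_id in run("SELECT name, city_id FROM people ORDER BY id"):
    city = run("SELECT name FROM cities WHERE id = ?", (city_id,))[0][0]
    lazy.append((name, city))
lazy_queries = queries

# Approach 2: one JOIN brings back people and city names together.
queries = 0
joined = run(
    "SELECT p.name, c.name FROM people p "
    "JOIN cities c ON c.id = p.city_id ORDER BY p.id")
joined_queries = queries
```

Both approaches return the same rows; the second one needs a single round trip regardless of how many people there are.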
Also, you should probably use belongs_to rather than has_one, since the foreign key (city_id) lives in the people table.
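The caching suggestion from this answer can be sketched roughly as follows, again with Python's sqlite3 instead of Rails. The city_cache dictionary and the sample rows are illustrative assumptions, and a real application would also need to decide when to refresh the cache.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city_id INTEGER NOT NULL REFERENCES cities(id)
    );
""")
conn.executemany("INSERT INTO cities (id, name) VALUES (?, ?)",
                 [(1, "Lisbon"), (2, "Porto")])
conn.executemany("INSERT INTO people (name, city_id) VALUES (?, ?)",
                 [("Ana", 1), ("Rui", 1), ("Eva", 2)])

# Load the small lookup table once and keep it in memory as {id: name}.
city_cache = dict(conn.execute("SELECT id, name FROM cities"))

# Later reads of people need neither a JOIN nor a second query:
# the city name is resolved from the in-memory cache.
people = [
    (name, city_cache[city_id])
    for name, city_id in conn.execute(
        "SELECT name, city_id FROM people ORDER BY id")
]
```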