Django 查询性能:使用外键的单向包含与使用外键和双向的双向包含相关名称
所有,
当谈到正确的设计和设计时,您更喜欢哪一个?清洁度和查询性能?
选项A:(双向参考)
class Country(models.Model):
name = models.CharField(max_length=128)
capital = models.ForeignKey('City',related_name="country_capital", null=True)
class State(models.Model):
name = models.CharField(max_length=128)
capital = models.ForeignKey('City', related_name="state_capital", null=True)
country = models.ForeignKey('Country', null=True)
class City(models.Model):
name = models.CharField(max_length=128)
state = models.ForeignKey('State', null=True)
country = models.ForeignKey('Country', null=True)
选项B:(单向参考)
class Country(models.Model):
name = models.CharField(max_length=128)
class State(models.Model):
name = models.CharField(max_length=128)
country = models.ForeignKey('Country', null=True)
class City(models.Model):
name = models.CharField(max_length=128)
state = models.ForeignKey('State', null=True)
state_capital = models.BooleanField(default=False)
country_capital = models.BooleanField(default=False)
Search will be done as:
Get me all cities where name is Waterloo and country is US.
Get me all cities where country is UK.
Get me all states where country is US.
Get me a country where country is US. and capital is Washington DC.
Get me the capital where country is US.
All,
Which one do you prefer when it comes to proper design & cleanness as well as query performance?
Option A: (bidirectional reference)
class Country(models.Model):
name = models.CharField(max_length=128)
capital = models.ForeignKey('City',related_name="country_capital", null=True)
class State(models.Model):
name = models.CharField(max_length=128)
capital = models.ForeignKey('City', related_name="state_capital", null=True)
country = models.ForeignKey('Country', null=True)
class City(models.Model):
name = models.CharField(max_length=128)
state = models.ForeignKey('State', null=True)
country = models.ForeignKey('Country', null=True)
Option B: (unidirectional reference)
class Country(models.Model):
name = models.CharField(max_length=128)
class State(models.Model):
name = models.CharField(max_length=128)
country = models.ForeignKey('Country', null=True)
class City(models.Model):
name = models.CharField(max_length=128)
state = models.ForeignKey('State', null=True)
state_capital = models.BooleanField(default=False)
country_capital = models.BooleanField(default=False)
Search will be done as:
Get me all cities where name is Waterloo and country is US.
Get me all cities where country is UK.
Get me all states where country is US.
Get me a country where country is US. and capital is Washington DC.
Get me the capital where country is US.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
从我的角度来看,选项 B 有一些问题。一是没有什么可以阻止两个或更多城市被标记为特定州或国家的首都。您需要在表单上添加额外的验证以确保不会发生这种情况。
从查询的角度来看,我认为还是A更可取。例如,如果您想获取首府为 X 市的州的所有城市,则在第一个查询中您将执行以下操作:
在第二个查询中您将必须执行以下操作:
这将解析为子查询,这可能会更少高效的。
然而,在选项中,AI并不认为你一定需要从城市到国家的FK。在您的模型 (*) 中,所有城市都位于州内,因此您始终可以通过州获取国家/地区 - 再次拥有额外的 FK 意味着您需要更多验证以确保您无法将城市分配给与以下国家/地区不同的国家/地区其州的国家。
(*尽管请注意,这实际上并不符合现实:无论如何,并非所有国家都有州,即使是那些有州的国家也常常拥有不属于所有州的城市,例如澳大利亚的堪培拉)
There are a few things wrong with option B, from my point of view. One is that there's nothing to stop two or more cities from being marked as capital for a particular state or country. You'd need to add extra validation on your forms to ensure that didn't happen.
From the point of querying, I think A is still preferable. For instance, if you wanted to get all cities in the state whose capital is City X, in the first one you would do:
and in the second you would have to do this:
which resolves to a subquery, which is likely to be less efficient.
However, in option A I don't think you necessarily need the FK from City to Country. In your model (*), all cities are in states, so you can always get the country via the state - having the extra FK again means you need more validation to ensure that you can't assign a city to a country that's different from its state's country.
(* although note that this doesn't actually match reality: not all countries have states anyway, and even those that do will often have cities that fall outside all states, like Canberra in Australia)
选项 A 更好,因为它包含相同的信息,而不需要一组几乎始终为空的字段(state_capital 和 Country_capital)。
正如您所注意到的,并非每个国家/地区都有州/省,因此您应该为城市模型同时拥有国家/地区和州字段。无论如何,这是一个巨大回报的非常少量的冗余,并且您不太可能遇到标准化问题(例如,特定州改变其国家的情况极为罕见)。即使每个城市在模型中都有关联的州,最好还是添加冗余字段来加快查询速度——这样您最多只能处理一个表连接,而不是两个。
请注意,华盛顿特区由两个地方组成:华盛顿市和华盛顿州。所以你的查询:
应该读为
(我在这里假设对于国家和州模型,您实际上添加了一个代码字段来处理缩写)
我会注意到其中一些查询似乎相当多余:Why would you need to match states on国家名称和首都?
Option A is better because it contains the same information without requiring a set of fields that will almost always be blank (state_capital and country_capital).
As you note, not every country has states/provinces, so you should have both country and state fields for the City model. In any case, it is a very small amount of redundancy for a huge payoff, and you are unlikely to encounter normalization problems (for example, it's extremely rare that a given state changes its country). Even if every city did have an associated state in your model, it's still better to add the redundant field to speed up the query -- you're then dealing with at most one table join instead of two.
Note that Washington, DC is composed of two places: Washington the city and DC the state. So your query:
Should read as
(I'm assuming here that for both the Country and State models you're actually adding a code field to deal with abbreviations)
I would note that some of these queries seem pretty redundant: Why would you need to match countries on both the country name and the capital?