太多的左连接是否会带来代码异味?
例如,如果您有 > 查询中的 5 个左连接是一种代码味道……
- 您的设计有问题吗?
- 您在一个查询中执行了太多操作?
- 你的数据库太标准化了?
If you have for example > 5 left joins in a query is that a code smell that there is ...
- something wrong with your design?
- you're doing too much in one query?
- you're database is too normalized?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
对于某些设计来说,这是一个完全合法的解决方案。
假设您有一个一对多关系的层次结构,例如
Customer
-Order
-Basket
-Item
- < code>Price 等,可以在任何级别上未填写:Customer
可以没有Orders
,Order
可以没有Baskets
等。在这种情况下,您会发出类似以下内容:
请注意,在某些情况下它可能效率低下,并且可能会替换为
EXISTS
或NOT EXISTS
(如果你只想知道其他表中是否存在相应的记录)。有关性能详细信息,请参阅我博客中的这篇文章:
NOT EXISTS
替换LEFT JOIN
中受益It's a perfectly legitimate solution for some designs.
Say you have a hierarchy of one-to-many relations like
Customer
-Order
-Basket
-Item
-Price
, etc., which can be unfilled on any level: aCustomer
may have noOrders
, anOrder
can have noBaskets
, etc.In this case you issue something like:
Note that it may be inefficient in some cases, and may be replaced with
EXISTS
orNOT EXISTS
(if you only want to figure out that the corresponding records exist or do not exist in other tables).See this article in my blog for performance details:
LEFT JOIN
's withNOT EXISTS
从某种意义上说,这是你可以/应该调查的事情,我会说是的。 通过从中提取一些视图,您可能可以获得更好的实用性和维护性。
从某种意义上说,它是“糟糕的代码”,不,这很容易是合理的,特别是对于较大的数据库,并且现代数据库可能会优化任何低效率的情况。
In the sense that it's something you could/should investigate I'd say yes. It's likely you can get better utility and maintenance by factoring some views out of that.
In the sense that it's "bad code" no, this could quite easily be reasonable especially for larger DBs and modern databases will likely optimise any inefficiencies out.
不,这样做完全没问题,但如果您发现自己使用相同表的相同连接一遍又一遍地编写相同的查询/过程,
它可能是创建视图的候选者,只是为了简化您将来的查询,并减少架构更改时需要更改的接触点数量
Nope it's perfectly fine to do, though if you find yourself writing the same queries/procedures over and over again using the same joins to the same tables,
it maybe a candidate for creating a View just to simplify you're queries in future, and to reduce the number of touch points you'd need to change if you're schema changes
很多时候,您可以通过创建辅助视图来减轻视觉味道,我不认为有一个硬性规定来规定有多少左连接被认为是坏的。
与过程编码不同,将 SQL 分解成小块可能会导致查询效率低下。
A lot of times you can alleviate the visual smell by creating helper views, I do not think there is a hard and fast rule of how many left joins are considered bad.
Unlike procedural coding, breaking down SQL into little bits and pieces can result in inefficient queries.
你的结果我的变化
任何不寻常的事情都可能是任何东西的代码气味。 正如夸斯诺伊所说,这可能是完全合法的。 对于真正深入的报告来说,需要大量的连接才能正确地拼凑信息,这种情况并不罕见。 这并不意味着开发人员应该考虑对其数据库进行非规范化。
Your Results My Vary
Anything out of the ordinary could be a code-smell for anything. Like Quassnoi said it could be perfectly legitimate. It's not uncommon for really in-depth reports to require a crazy amount of joins to piece together the information correctly. That doesn't mean that the developer should looking at denormalizing their database.
某人几乎不可能回答如此笼统的问题,并且尝试创建这样一个任意规则是毫无意义的。
左连接是一种完全可以接受的连接类型,它映射到一个非常常见的需求:获取所有 x,如果它们具有关联的 y,则也获取它们。
It is pretty much impossible for someone to answer a question as general as this and to attempt to create such an arbitrary rule would pointless.
Left joins are a perfectly acceptable type of join which map onto a very common need: get me all x's, if they have associated y's then get those too.
一点都不。 构建在某些查询上使用大量左连接的数据库设计是完全合法的。
话虽如此,我通常更愿意构建数据库,以便限制需要外连接的情况数量,因为经验往往表明使用它们的(复杂)查询更容易出错,并且可能会引起维护问题。
抛开一个有趣的历史不谈,IBM DB2 的早期版本仅在大型机上运行时,不支持外连接(Oracle 和 Ingress 当时都支持外连接,这是一个主要卖点)。 这导致数据库设计中出现一些有趣的问题,因为有必要确保仅使用内部联接即可解决数据库的所有预期数据访问要求。
No, not at all. It's perfectly legitimate to construct a database design that uses a significant number of left joins on some queries.
Having said that I would generally prefer to construct the database so that the number of cases where outer joins are required is limited as experience tends to suggest that (complex) queries that use them are more error prone and likely to give rise to maintenance problems.
As an interesting historical aside, the early versions of IBM's DB2, when it ran on mainframes only, did not support outer joins (Oracle and Ingress both did at the time which was a major selling point). This lead to some interesting issues in database design as it was necessary to ensure that all expected data access requirements for the database could be solved using just inner joins.
我认为必须使用许多连接(例如处理标准化数据)并不是代码味道,而是表明您可能没有在正确的位置工作。 根据我的经验,那些关心查询中联接数量的人在数据库中开发得太多,而在公开数据的应用程序和报告中开发得不够。 数据结构必须足够灵活以支持多种用途,这就是为什么规范化在某种程度上很重要。
在构建当今的企业应用程序时,开发人员可以利用昨天的成就,在高于 SQL 甚至 XML 等技术的抽象级别上工作,以便以更少的工作提供更多价值。 有一些工具,即报告编写器、代码生成器、ORM、实体框架等,可以抽象出手动构建 SQL 的低级工作,并为您执行联接。 大多数人都知道正在使用的 SQL 方言(例如,Oracle 9 与 MySQL 3),并且可以生成对该方言最有效的 SQL 语法; 这意味着他们可能可以比您更好地创建连接。
然而,在没有充分规范化的关系环境中,这些工具的工作效果非常差,或者根本不起作用。 对我来说,这就是“发展”气味显现出来的地方; 如果现有的数据访问工具无法理解我构建数据的关系,我可能需要寻找一种更规范的方式来创建这些关系,并从中获得的好处远远超出了使用该工具的范围。 通常,第二个和第三个正常形式之间的某个位置是最佳点; 尽管总是存在关系数据的小区域,其中较高程度的标准化才有意义并增加价值。
干杯,
特拉维斯
I would contend that having to use many joins (e.g. deal with normalized data) is not a code smell, but rather an indication you might not be working in the right place. In my experience, those that are concerned about the number of joins in queries are developing too much in the database and not enough in the applications and reports that expose the data. The data structures must be flexible enough to support a myriad of uses and this is why normalization, to one degree or another, is important.
In building today's enterprise applications, developers can leverage yesterday's accomplishments to work at abstract levels high above technologies like SQL, and even XML, in order to deliver more value with less work. There are tools, i.e. Report Writers, Code Generators, ORMs, entity frameworks, etc., that abstract away the low level work of constructing SQL manually, and will perform the joins for you. Most are aware of the SQL dialect being used (for example, Oracle 9 vs MySQL 3) and can generate the SQL syntax that is most efficient for that dialect; meaning they can probably create the joins better than you can.
However, these tools work very poorly, or not at all, in a relational environment without sufficient normalization. For me, this is where a "development" smell manifests itself; if an established data access tool can't understand the relations under which I've structured my data, I probably need to look for a more normalized way to create those relationships and therein reap benefits far exceeding just the use of the tool. Typically, somewhere between 2nd and 3rd normal form is the sweet-spot; although invariably there tend to be small areas of relational data where a higher degree of normalization makes sense and adds value.
Cheers,
Travis