NATURAL(JOIN)在生产环境中是否被认为有害?
我正在阅读有关 SQL 连接的 NATURAL 简写形式,我看到了一些陷阱:
- 它只会自动使用所有相同的命名列对(使用 USING 指定显式列列表)
- 如果添加了一些新列, ,然后join 输出也可能被“意外”改变,这在复杂的结构中可能不是那么明显(即使你知道 NATURAL 是如何工作的)
I am reading about NATURAL shorthand form for SQL joins and I see some traps:
- it just takes automatically all same named column-pairs (use USING to specify explicit column list)
- if some new column is added, then join output can be "unexpectedly" changed too, which may be not so obvious (even if you know how NATURAL works) in complicated structures
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
NATURAL JOIN
语法是反模式:因此,我不建议在任何环境中使用该语法。
我也不建议混合语法(即:同时使用 NATURAL JOIN 和显式 INNER/OUTER JOIN 语法) - 保持一致的代码库格式。
NATURAL JOIN
syntax is anti-pattern:Because of this, I don't recommend the syntax in any environment.
I also don't recommend mixing syntax (IE: using both
NATURAL JOIN
and explicit INNER/OUTER JOIN syntax) - keep a consistent codebase format.对我来说,完全破坏
NATURAL
的一件事是,我的大多数表都有一个id
列,这些列在语义上显然是不同的。您可能会认为拥有user_id
比id
更有意义,但最终您会编写诸如user.user_id
之类的内容,这违反了 DRY 。此外,按照相同的逻辑,您还会有诸如user_first_name
、user_last_name
、user_age
... 之类的列(这在以下情况下也有意义)认为它与例如session_age
不同)...恐怖。我会坚持我的
JOIN ... ON ...
,非常感谢。 :)One thing that completely destroys
NATURAL
for me is that most of my tables have anid
column, which are obviously semantically all different. You could argue that having auser_id
makes more sense thanid
, but then you end up writing things likeuser.user_id
, a violation of DRY. Also, by the same logic, you would also have columns likeuser_first_name
,user_last_name
,user_age
... (which also kind of makes sense in view that it would be different from, for example,session_age
)... The horror.I'll stick to my
JOIN ... ON ...
, thankyouverymuch. :)这些似乎反对自然连接的“陷阱”是双向的。假设您向表 A 添加一列新列,完全希望将其用于与表 B 的联接。如果您知道 A 和 B 的每个联接都是自然联接,那么您就完成了。如果每个连接都明确使用 USING,那么您必须全部跟踪并更改它们。错过一个就会出现错误。
当表的语义表明这是正确的做法时,请使用自然连接。当您想要确保以特定方式完成联接时,请使用显式联接条件,无论表定义如何演变。
These "traps", which seem to argue against natural joins, cut both ways. Suppose you add a new column to table A, fully expecting it to be used in joining with table B. If you know that every join of A and B is a natural join, then you're done. If every join explicitly uses USING, then you have to track them all down and change them. Miss one and there's a bug.
Use NATURAL joins when the semantics of the tables suggests that this is the right thing to do. Use explicit join criteria when you want to make sure the join is done in a specific way, regardless of how the table definitions might evolve.
我同意其他发帖者的观点,即为了清晰起见,应该使用显式连接,并且如果您的需求发生变化,也可以轻松地切换到“外部”连接。
然而,您的大多数“陷阱”与联接无关,而是使用“SELECT *”而不是显式命名您需要的列“SELECT a.col1,a.col2,b.col1,b.col2”的弊端。只要使用通配符列列表,就会出现这些陷阱。
I agree with the other posters that an explicit join should be used for reasons of clarity and also to easily allow a switch to an "OUTER" join should your requirements change.
However most of your "traps" have nothing to do with joins but rather the evils of using "SELECT *" instead of explicitly naming the columns you require "SELECT a.col1, a.col2, b.col1, b.col2". These traps occurs whenever a wildcard column list is used.
添加上面任何答案中未列出的额外原因。在 postgres 中(不确定其他数据库是否也是如此),如果使用 NATURAL JOIN 时在两个表之间没有发现共同的列名,则执行 CROSS JOIN 。这意味着,如果您有一个现有查询,然后您随后要更改表中的一个列名,您仍然会得到从查询返回的一组行,而不是错误。相反,如果您使用
JOIN ... USING(...)
语法,如果连接列不再存在,您将会收到错误。postgres 文档对此有注释:
Adding an extra reason not listed in any of the answers above. In postgres (not sure if this the case for other databases) if no column names are found in common between the two tables when using
NATURAL JOIN
then aCROSS JOIN
is performed. This means that if you had an existing query and then you were to subsequently change one of the column names in a table, you would still get a set of rows returned from the query rather than an error. If instead you used theJOIN ... USING(...)
syntax you would get an error if the joining column was no longer there.The postgres documentation has a note to this effect:
您的意思是这样的语法吗:
与此相比:
我更喜欢第二种语法,并且格式也不同:
在这种情况下,很清楚我要加入哪些表以及我使用什么 ON 子句来加入它们。通过使用第一种语法太容易了,无法放入正确的 JOIN 并获得巨大的结果集。我这样做是因为我很容易出现拼写错误,这是我防止拼写错误的保险。另外,它在视觉上更容易调试。
Do you mean the syntax like this:
Versus this:
I prefer the 2nd syntax and also format it differently:
In this case, it is very clear what tables I am joining and what ON clause I am using to join them. By using that first syntax is just too easy to not put in the proper JOIN and get a huge result set. I do this because I am prone to typos, and this is my insurance against that. Plus, it is visually easier to debug.