为什么位置查询不好?
我正在阅读 CJ Date 的 SQL 和关系理论:如何编写准确的 SQL 代码< /a>,他认为位置查询是不好的 - 例如,这个 INSERT
:
INSERT INTO t VALUES (1, 2, 3)
相反,您应该使用基于属性的查询,如下所示:
INSERT INTO t (one, two, three) VALUES (1, 2, 3)
现在,我知道第一个查询已经结束与关系模型一致,因为元组(行)是无序的属性集(列)。 我无法理解第一个查询中的危害在哪里。 谁可以给我解释一下这个?
I'm reading CJ Date's SQL and Relational Theory: How to Write Accurate SQL Code, and he makes the case that positional queries are bad — for example, this INSERT
:
INSERT INTO t VALUES (1, 2, 3)
Instead, you should use attribute-based queries like this:
INSERT INTO t (one, two, three) VALUES (1, 2, 3)
Now, I understand that the first query is out of line with the relational model since tuples (rows) are unordered sets of attributes (columns). I'm having trouble understanding where the harm is in the first query. Can someone explain this to me?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
每当表模式发生更改时,第一个查询就会中断。 第二个查询适应任何架构更改,使其列保持不变,并且不添加无默认列。
执行
SELECT *
查询,然后依靠位置表示法提取他们关心的值的人是 软件维护超级恶棍出于同样的原因。The first query breaks pretty much any time the table schema changes. The second query accomodates any schema change that leaves its columns intact and doesn't add defaultless columns.
People who do
SELECT *
queries and then rely on positional notation for extracting the values they're concerned about are software maintenance supervillains for the same reason.虽然列的顺序是在架构中定义的,但通常不应将其视为重要的,因为它在概念上并不重要。
此外,这意味着任何阅读第一个版本的人都必须查阅架构以找出这些值的含义。 诚然,这就像在大多数编程语言中使用位置参数一样,但不知何故 SQL 在这方面感觉略有不同 - 我当然会更容易地理解第二个版本(假设列名是合理的)。
While the order of columns is defined in the schema, it should generally not be regarded as important because it's not conceptually important.
Also, it means that anyone reading the first version has to consult the schema to find out what the values are meant to mean. Admittedly this is just like using positional arguments in most programming languages, but somehow SQL feels slightly different in this respect - I'd certainly understand the second version much more easily (assuming the column names are sensible).
我并不真正关心这方面的理论概念(在实践中,表确实有定义的列顺序)。 与第一个相比,我更喜欢第二个的主要原因是增加了抽象层。 您可以修改表中的列,而不会搞乱查询。
I don't really care about theoretical concepts in this regard (as in practice, a table does have a defined column order). The primary reason I would prefer the second one to the first is an added layer of abstraction. You can modify columns in a table without screwing up your queries.
您应该尝试使 SQL 查询尽可能少地依赖于表的确切布局。
第一个查询依赖于仅具有三个字段的表,并且按照确切的顺序。 对表的任何更改都会破坏查询。
第二个查询仅依赖于表中存在这三个字段,字段的顺序无关。 您可以在不中断查询的情况下更改表中字段的顺序,甚至可以添加允许空值或具有默认值的字段。
尽管您不经常重新排列表布局,但向表添加更多字段是很常见的。
此外,第二个查询更具可读性。 您可以从查询本身看出记录中的值的含义。
You should try to make your SQL queries depend on the exact layout of the table as little as possible.
The first query relies on the table only having three fields, and in that exact order. Any change at all to the table will break the query.
The second query only relies on there being those three felds in the table, and the order of the fields is irrelevant. You can change the order of fields in the table without breaking the query, and you can even add fields as long as they allow null values or has a default value.
Although you don't rearrange the table layout very often, adding more fields to a table is quite common.
Also, the second query is more readable. You can tell from the query itself what the values put in the record means.
尚未提及的是,您通常会使用代理键作为 PK,并使用
auto_increment
(或类似的东西)来分配值。 对于第一个,您必须在那里指定一些东西 - 但如果不使用它,您可以指定什么值?NULL
可能是一个选项,但这并不适合考虑将 PK 设置为NOT NULL
。但除此之外,整个“锁定到特定模式”是一个更重要的原因,IMO。
Something that hasn't been mentioned yet is that you will often be having a surrogate key as your PK, with
auto_increment
(or something similar) to assign a value. With the first one, you'd have to specify something there — but what value can you specify if it isn't to be used?NULL
might be an option, but that doesn't really fit in considering the PK would be set toNOT NULL
.But apart from that, the whole "locked to a specific schema" is a much more important reason, IMO.
SQL 为您提供了为 INSERT 和 SELECT 语句指定列名的语法。 您应该使用它,因为:
SQL gives you syntax for specifying the name of the column for both INSERT and SELECT statements. You should use this because:
我更喜欢使用类似 UPDATE 的语法:
这比这两个示例更容易阅读和维护。
I prefer to use the UPDATE-like syntax:
Which is far easier to read and maintain than both the examples.
从长远来看,如果您向表中再添加一列,除非您显式指定列列表,否则 INSERT 将不起作用。 如果有人更改了列的顺序,您的 INSERT 可能会默默地成功地将值插入到错误的列中。
Long term, if you add one more column to your table, your INSERT will not work unless you explicitly specify list of columns. If someone changes the order of columns, your INSERT may silently succeed inserting values into wrong columns.
我还要添加一件事,即使在更改表之前,第二个查询本来也不太容易出错。 我为什么这么说呢? 因为使用第二个形式,您可以(并且应该在编写查询时)直观地检查插入表中的列以及 Values 子句或 select 子句中的数据实际上是否按正确的顺序开始。 否则,您最终可能会不小心将社会安全号码放入“酬金”字段中,并向演讲者支付他们的 SSN,而不是他们应为演讲支付的金额(示例不是随机选择的,除非我们在实际发生之前就发现了这一点,因为目视检查!)。
I'm going to add one more thing, the second query is less prone to error orginally even before tables are changed. Why do I say that? Becasue with the seocnd form you can (and should when you write the query) visually check to see if the columns in the insert table and the data in the values clause or select clause are in fact in the right order to begin with. Otherwise you may end up putting the Social Security Number in the Honoraria field by accident and paying speakers their SSN instead of the amount they should make for a speech (example not chosen at random, except we did catch it before it actually happened thanks to that visual check!).