SELECT* vs. SELECT *
Yesterday a colleague showed me the following postgres query. We were both surprised that it worked:
SELECT* FROM mytable;
Since I recently wrote a parser for another language, I am trying to understand in more depth why this query "compiles" and returns the same results as SELECT * FROM mytable;
Presumably this is recognized as a valid query because, during lexical analysis, postgres reads SELECT from the input as a token, then searches for the next token, which it finds to be *, and so on. Is that more or less what is going on here?
Also, does the postgres lexer/parser just happen to be robust enough to understand this query, or would other databases understand a similar SELECT* query?
Comments (3)
Usually a lexer adds characters to the current token until it finds a character that can't belong to that token. It then emits the token and starts a new one at the character where it stopped.
So what's going on here is that the lexer gobbles up SELECT and sees that the next character is *, which can't belong with SELECT because the lexer is collecting a word. So it stops, analyzes SELECT, which turns out to be a keyword, and starts over with the *, which it recognizes as a token of its own, and so on. It's the same reason you get 4 from both 2*2 and 2 * 2 in other programming languages.
As for whether it will work on other databases, it all depends on the details of the lexical analyzer and the rules of the grammar.
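The scanning loop described in this answer can be sketched in a few lines of Python. This is only a toy illustration of the "keep consuming until the character no longer fits" idea, not how postgres actually implements its scanner. The names `tokenize`, `KEYWORDS`, and `SINGLE_CHAR_TOKENS` are made up for the example, and the keyword and symbol sets are a tiny, arbitrary subset of SQL.

```python
# Toy lexer: hypothetical names, small illustrative subset of SQL.
KEYWORDS = {"SELECT", "FROM", "WHERE"}
SINGLE_CHAR_TOKENS = set("*,;()=+-/")


def tokenize(text):
    """Split text into (kind, value) pairs using longest-match scanning."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            # Whitespace only separates tokens; it produces none itself.
            i += 1
        elif ch.isalpha() or ch == "_":
            # Collect a word until a character that cannot belong to it.
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i]
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENT"
            tokens.append((kind, word))
        elif ch.isdigit():
            # Collect a run of digits as one number token.
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("NUMBER", text[start:i]))
        elif ch in SINGLE_CHAR_TOKENS:
            # Symbols such as * are complete tokens on their own.
            tokens.append(("SYMBOL", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens


print(tokenize("SELECT* FROM mytable;") == tokenize("SELECT * FROM mytable;"))  # True
print(tokenize("2*2") == tokenize("2 * 2"))  # True
```

Because the word-collecting branch stops at `*` whether or not a space precedes it, both spellings of the query produce the same token list, and the parser never sees a difference.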
Apparently, the tokenizer splits tokens on whitespace and on the special characters used in arithmetic.
Here's the BNF of SELECT statements: h2database.com
As far as I know, SQL ignores whitespace between tokens, so you can write
SELECT*FROM or SELECT * FROM
and it's basically the same.
It also uses quote characters such as ` (for identifiers, in MySQL) and ' (for string literals) to understand what is what. So
SELECT * FROM myTable WHERE id = my string
would be an invalid query, because the unquoted "string" at the end is not understood.
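One way to check how another engine behaves is simply to try it. The sketch below uses SQLite through Python's built-in sqlite3 module, so it says nothing about postgres or any other database. The table mytable and its two rows are invented for the example.

```python
import sqlite3

# In-memory database with a made-up table, just for this experiment.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, "a"), (2, "b")])

# The same query with and without whitespace around the asterisk.
spaced = sorted(conn.execute("SELECT * FROM mytable").fetchall())
half = sorted(conn.execute("SELECT* FROM mytable").fetchall())
tight = sorted(conn.execute("SELECT*FROM mytable").fetchall())
print(spaced == half == tight)  # True

# A quoted string literal is one token, so this query is accepted.
quoted = conn.execute("SELECT * FROM mytable WHERE name = 'a'").fetchall()

# Without quotes, "my string" is read as two separate words and rejected.
try:
    conn.execute("SELECT * FROM mytable WHERE name = my string")
    unquoted_ok = True
except sqlite3.Error:
    unquoted_ok = False
print(unquoted_ok)  # False
```

SQLite accepts all three spellings of the SELECT, which fits the explanation that whitespace only matters where it is needed to separate two tokens that would otherwise run together.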