是否可以用普通的汇编语言处理器级代码编写 SQL 语句?
就在最近,一位朋友建议用汇编代码编写 SQL 语句是可能且可以实现的(尽管非常困难),因为每个编程操作最终都会进入处理器级别的执行。
我对 SQL 的行为做了一些研究,尽管它遵循关系代数的理论和独立于平台的执行,但我仍然相信抽象和语义的水平相当遥远,甚至无法考虑将 SQL 语句转换为汇编代码的方法(一组非常特定于操作/内存/资源的指令)。
也许您可以模仿 SQL 语句的处理器操作结果,并尝试使用纯汇编指令集来复制它。但您会意识到,您仍然不会编写/翻译 SQL 语句。
以MonetDB的SQL Reference页面为例,他们在第三段中声明了以下内容:
“该架构基于编译器,它可以翻译 SQL 语句转换为 MonetDB 汇编语言 (MAL)。在这个过程中 特定于关系代数的常见优化启发法是 执行。”
SQL 语言甚至不允许键入暴力汇编指令,而基于 C 和 C# 等常见语言则允许此类键入/导入。
你们觉得怎么样?谢谢分享你的想法!
Just recently a friend suggested it is possible and achievable (though very difficult) to write a SQL statement in assembly code, since every programming operation eventually gets down to processor-level execution.
I did a bit of research on SQL's behaviour and although it follows relational algebra's theory and platform-independent execution, I still believe that the level of abstraction and semantics are rather distant as to even consider a way to translate a SQL statement to assembly code (a very operations/memory/resources specific set of instructions).
Perhaps you could mimic a SQL statement's processor operations result and try to replicate it with a pure assembly set of instructions. You would come to realise though, that you still would not be writing/translating SQL statements.
Take for instance, MonetDB's SQL Reference page, they state the following in the third paragraph:
"The architecture is based on a compiler, which translates SQL
statements into the MonetDB Assembly Language (MAL). In this process
common optimization heuristics, specific to the relational algebra are
performed."
The SQL language however does not even allow for brute assembly instructions to be typed, whereas common languages such as C-based, and C# do allow for such typing/imports.
What do you guys think? Thanks for sharing your thoughts!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
计算机上运行的任何内容都可以使用汇编语言进行编码。如果 SQL 数据库可以在您的计算机上运行,那么它就可以用汇编语言进行编码。
但这可能很难做到。
您提到的 SQL 示例与 C 或其他编译语言翻译为机器代码时发生的情况相差不远。现代优化编译器不会将 C 代码直接转换为汇编语言。他们使用一种(或多种)更容易执行优化的中间表示。这是一个多步骤的过程,并且实际的装配输出并不是其复杂性的主要部分。
如果您这样看,您的 SQL 情况并没有太大不同。您可以想象一个 SQL 预处理器,它在给定足够固定的环境(尤其是架构)的情况下从 MAL 生成本机代码。有了类似的东西,向 SQL 方言添加扩展以允许内联汇编(例如聚合函数)可能是有意义的。并且手动完成所有这些(即没有预处理器本身)是可能的。
不过,您会失去从运行时 SQL 解释器获得的所有可移植性和灵活性,每次模式更改时都必须重新编译,依赖于数据的优化几乎不可能等等。因此,我认为,这在非常有用的情况下非常有用。有限的。 (对于通常通过虚拟机或解释器运行的其他语言也是如此 - 将它们编译为本机代码通常会带来严格的限制。)
Anything that runs on your computer can be coded using an assembly language. If a SQL database can run on your machine, then it can be coded in assembly.
It can be ridiculously hard to do though.
The SQL example you mention isn't that far removed from what happens when C or other compiled languages are translated to machine code. Modern optimizing compilers don't translate your C code directly to assembly. They use one (or more) intermediate representations that are easier to perform optimizations on. It's a multi-step process, and the actual assembly output isn't the main part of it complexity-wise.
If you look at it that way, your SQL case is not very different. You could imagine an SQL pre-processor that produces native code from the MAL given a sufficiently fixed environment (schema notably). With something like that, adding extensions to that SQL dialect to allow inline assembly (for aggregate functions for instance) could make sens. And doing all that manually (i.e. without the pre-processor itself) would be possible.
You loose all the portability and flexibility you get from a runtime SQL interpreter though, would have to recompile every time your schema changes, data-dependent optimizations become nearly impossible, etc. So the situations where this would be useful are, I believe, very limited. (Same thing for other languages that are usually run through a VM or interpreter - compiling them down to native code usually carries heavy restrictions.)
不可以,SQL 不允许这样做,因为它是比 C(或 C#)更高级的语言。在 SQL 中,代码描述了应该做什么,而不是如何做,也没有任何关于如何做的细节。实现必须解析代码并将其编译成执行 SQL 代码描述的操作的一组或低级指令。
例如,对于 SELECT,我们无法保证访问表的计划是什么、访问它们的顺序、将使用哪些(如果有)索引、将使用什么类型的操作进行连接,如果将使用临时表或在内存中完成排序等...
因此,允许这样的事情定义不明确并且极其危险:
No, SQL does not allow this because it is a higher level language than C (or C#). In SQL, the code describes what should be done and not how, nor any details on how to do it. The implementation has to parse the code and compile it into a set or low-level instructions that do what SQL code describes.
For example, for a SELECT we have no guarantee on what the plan to access the tables will be, in what order they will be accessed, which (if any) indices will be used, what type of operations will be used for joins, if temporary tables will be used or the sorting is done in memory, etc...
So, something like this would be ill-defined and extremely dangerous to be allowed:
如果您对将关系查询/操作编译为汇编程序感兴趣,您可能需要查看本文:http://www.vldb.org/pvldb/vol4/p539-neumann.pdf。在此 DBMS 中,LLVM 组件用于从 DBMS 内的查询生成 CPU 指令(我认为这就是您所说的汇编程序的意思)。
另外,尽管我可能是在向合唱团说教,但我想澄清 MAL 与 CPU 指令汇编器无关。每个单独的 MAL 语句都由 C 中的实现支持。MAL 仅用作 (taadaa:) 作为易于优化和解释的中间表示。
If you are interested in compilation of relational queries/operations to assembler, you might want to check out this paper: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf. In this DBMS, components of LLVM are used to produce CPU-instructions (which I assume is what you mean when you say assembler) from a query within the DBMS.
Also, even though I might be preaching to the choir, I want to make clear the MAL has nothing to do with CPU-instruction Assembler. Every single MAL-statement its backed by an implementation in C. MAL is used only (taadaa:) as an intermediate representation that is easy to optimized and interpret.
嗯,机器执行您可以用汇编语言编写的指令。但是,我不会将直接编写汇编语言称为执行 SQL 查询。 SQL 的解释可能非常不同......例如,在原始汇编可能没有什么意义的情况下,图书馆员查阅百科全书。
Well, the machine executes instructions you could have written in assembly. However, I wouldn't call writing the assembly language directly doing a SQL query. SQL could be interpreted very differently... e.g. by librarians consulting encyclopediae, in contexts where raw assembly might have little meaning.
不。SQL 是一种抽象,可以通过具有不同物理布局的不同 SQL 环境的不同 SQL 实现来解释*。也许布局甚至会随着时间的推移而改变,因为您
ALTER TABLE
现在您拥有新旧元组布局的混合。此外,您可以使用 SQL 做更多事情,而不仅仅是运行它。您还可以对它进行类型检查,分析它以查看它具有什么样的效果,将其放入视图定义或存储过程等中。这是另一种放置方式。你能用汇编语言“编写”HTML吗?也许您可以编写一个程序,该程序在执行时具有与浏览器渲染特定页面相同的效果。但是你的程序可以被 AdBlock、NoScript 以及我安装的任何其他过滤器处理吗?任何支持 HTML 上所有相关操作的东西都将与 HTML 本身同构。 SQL 和任何其他语言也是如此。事实上,任何其他数据结构:表示形式的更改必须保留该数据结构上所有相关操作的含义。语言往往有很多相关的操作。
(*我的意思不是“解释”,如“与编译”一样;我的意思是“给定的含义”。)
No. SQL is an abstraction that can be interpreted* by different SQL implementations with different SQL environments with different physical layouts. Maybe the layouts even change over time, as you
ALTER TABLE
and now you have a mixture of old and new tuple layouts. Also, there's more you can do with SQL than just run it. You can also type-check it, analyze it to see what kind of effects it has, put it in a view definition or stored procedure, etc.Here's another way to put it. Can you "write" HTML as assembly language? Maybe you can write a program that, when executed, has the same effect as a browser rendering a particular page. But can your program be processed by AdBlock, NoScript, and whatever other filters I have installed? Anything that supports all of the relevant operations on HTML is going to be isomorphic to HTML itself. Similarly with SQL, and any other language. Any other data structure, in fact: a change in representation must preserve the meaning of all the relevant operations on that data structure. And languages tend to have lots of relevant operations.
(* I don't mean "interpreted" as in "vs compiled"; I mean "given meaning".)