SQL 可移植性问题
我的公司让我致力于为 Python ORM 完成 Oracle 后端。 我很惊讶 RDBMS 做事的方式有多么不同,即使对于简单的事情也是如此。 我了解了很多有关 Oracle 和其他 RDBMS 之间的差异的知识。 纯粹出于好奇,我想了解更多。
将 SQL 从一个平台移植到另一个平台时有哪些常见的“陷阱”?
拜托,每个答案只能有一个问题。
My company has me working on finishing a back end for Oracle for a Python ORM. I'm amazed at how much differently RDBMSes do things even for the simple stuff. I've learned a lot about the differences between Oracle and other RDBMSes. Just out of sheer curiosity, I'd like to learn more.
What are some common "gotchas" in terms of porting SQL from one platform to another?
Please, only one gotcha per answer.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(17)
另一个例子是生成唯一(通常是代理)主键。
许多数据库(例如 SQL Server 和 sqlite)允许将列声明为标识:通常,如果插入时缺少该列的值,数据库将为该列生成唯一值。
相比之下,Oracle 让您创建一个独立于表的序列,然后在该序列上使用 nextval 来生成下一个值:
或者,更典型的是:
Another example is generating unique (typically surrogate) primary keys.
Many databases, such as SQL Server and sqlite allow a column to be declared as an identity: typically, if the value for this column is missing on insert, the database will generate a unique value for the column.
Oracle, by contrast, has you create a sequence separate to the table and then use nextval on the sequence to generate the next value:
Or, more typically:
多值 IN 子句查询。 我以前一直在 Oracle 上使用这些,但很惊讶地发现在 SQL Server 中不能这样做。 例如,这个查询:
Multi-value IN clause queries. I used to use these all the time on Oracle and was surprised to find that you can't do this in SQL Server. For example, this query:
Oracle 不允许您插入空字符串:它们会默默地转换为 NULL。
Oracle will not allow you to insert empty strings: they are silently converted to NULL.
Oracle 对引用的处理方式与 MySQL 不同。
而且,逃避是不同的。
(请注意,要转义 Oracle 中引号以外的任何内容,您可以在查询中使用 ESCAPE 指令;SELECT * FROM testTable WHERE% = '50\%' ESCAPE '\')
Oracle has a different approach to quotes opposed to MySQL.
Furthermore, escaping is different.
(Note that to escape anything other than quotes in Oracle, you can use the ESCAPE directive in your query; SELECT * FROM testTable WHERE percent = '50\%' ESCAPE '\')
与尝试通过将字典单词直接替换到短语中来将
英语
翻译为俄语
相同。适用于你好和再见,但对于玛丽有一只小羊则失败,更不用说莎士比亚了。
尽管名称中包含
SQL
,但不同的 RDBMS 具有不同的文化。例如,行限制。
在
Oracle
中:在
SQL Server
中:在
MySQL
和PostgreSQL
中:在
DB2
中:四个不同的子句。
Same as attempts to translate from
English
toRussian
by substituting the dictionary words right into the phrases.Works for hello and goodbye, but fails for Mary had a little lamb to say nothing of Shakespeare.
Different
RDBMS
's have different cultures, despite havingSQL
in their name.For instance, the row limiting.
In
Oracle
:In
SQL Server
:In
MySQL
andPostgreSQL
:In
DB2
:Four different clauses.
我所拥有的 SQL Server 和 Oracle 之间的差异列表
移植时遇到的问题:
认为 ANSI 标准 SQL 在数据库之间完全可移植吗? 再想一想。
编写 ANSI 标准 SQL 不切实际
A list of discrepancies between SQL Server and Oracle that I have
encountered when porting:
Think ANSI Standard SQL Is Fully Portable Between Databases? Think Again.
Writing ANSI Standard SQL is not practical
性能问题是一个大问题。 例如,据我所知,Oracle 中的视图几乎与表一样快。 当我不得不使用 SQLServer 时,情况并非如此。 视图有效地降低了性能,使相同的选择速度降低了一个数量级或更多(直接从表中进行查询需要 0.5 秒,而使用视图可能需要一分钟)。 它们的使用也有很多限制,例如并非所有 SQL 函数都可以在视图上使用。
请注意,六年前 5-6 年前确实如此,我不知道微软此后是否对此进行了改进。
Performance issues is a big one. For instance, views in Oracle are, AFAIK, pretty much as fast as tables. This was not the case with SQLServer when I had to use them. The views effectively killed the performance slowing down the same selects by an order of magnitude or more (a query straight from the tables took, say 0.5s, while using a view could take a minute). There were also a lot of limitations on their use, for instance not all SQL functions could be used on views.
Note that this was true 5-6 six years ago, I don't know if Microsoft has improved this since.
晦涩的连接语法,如 Oracle 的外连接 (+) 语法。 在我曾经工作过的一家公司,到处都使用这种语法,而不是标准的 LEFT OUTER JOIN / LEFT JOIN 语法,这使得将一些东西移植到 MySQL 非常痛苦。
Obscure join syntax, like Oracle's (+) syntax for outer joins. At a company I used to work for this syntax was used all over the place rather than the standard LEFT OUTER JOIN / LEFT JOIN syntax, which made porting some stuff over to MySQL quite a pain.
Oracle 不允许在一个查询中进行多次插入。 MySQL 允许这样做:
Oracle does not allow you to have multiple inserts in one query. MySQL allows this:
临时表 - Oracle 与 SQL Server/MySQL。 从 Oracle 过渡到 MS/MySQL,没问题。 反过来说,有点不同。
Temporary Tables - Oracle vs SQL Server/MySQL. Transitioning from Oracle to MS/MySQL, no problem. The other way around, a little different.
为了全面了解产品特定的问题,您需要了解逻辑数据库设计和物理数据库设计之间的区别。
数据库逻辑设计主要与表的特性有关。 表的特征包括列和约束。 尽管表本身是物理的,但表设计通常非常容易从一个数据库系统移植到另一个数据库系统。 某些数据类型的工作方式存在差异,语法也存在一些差异,例如表名中是否可以使用下划线。 但良好的逻辑设计应该从一个系统移植到另一个系统,只需进行较小的更改或不进行任何更改。
物理数据库设计主要与表结构所依赖的基础设施的特性有关。 几乎所有系统都支持索引,默认索引类型是 B 树,尽管它可能被称为其他名称。 但从那时起,每个系统都有自己的物理特征,每个系统的物理特征可能完全不同。 Oracle 的一个典型物理特性是表空间。 与表空间密切相关的是表和表空间之间的映射。 物理设计必须在系统特定的基础上完成。
除了您使用的 RDM 系统之外,您还需要在设计中考虑数据量、负载、响应时间要求以及磁盘等系统资源。 好消息是,无需更改应用程序代码即可对物理设计进行大量更改。 这称为物理数据独立性。 这意味着在编写了一些应用程序代码并加载了一些数据后,您可以自由地调整和调整物理设计。
您可能想看一些有关数据库设计的书籍,以更深入地了解逻辑设计和物理设计以及它们之间的区别。 一些受欢迎的作家是 CJ Date 和 Joe Celko。
For the big picture on product specific gotchas, you need to learn the difference between logical database design and physical database design.
Logical database design mostly has to do with features of tables. Features of tables include columns and constraints. Although tables themselves are physical, table design is generally very portable from one database system to another. There are differences in the way some datatypes work, and a few differences in syntax, like whether the underscore can be used in a table name or not. But a good logical design should port from one system to another with only minor changes or no changes.
Physical database design mostly has to do with features of the infrastructure on which the table structure rests. Almost all systems support indexes, and the default index type is B-tree although it might be called something else. But from there on out, each system has its own physical features, which can be completely different from one system to another. A typical physical feature of Oracle is tablespaces. Closely associated with tablespaces is the mapping between tables and tablespaces. Physical design has to be done on a system specific basis.
In addition to which RDM system you are using, you need to take data volume, load, response time requirements, and system resources like disks into account in your design. The good news is that a lot of changes to physical design can be made with no changes to application code. This is known as physical data independence. This means that you are somewhat free to tweak and tune the physical desing after you've got some application code written and some data loaded.
You might want to look at some books on database design to get a deeper idea of logical and physical design and the difference between them. Some popular authors are C.J. Date and Joe Celko.
我记得一个奇怪的 Oracle 问题让我完全措手不及。 我不确定这是实例的配置还是默认设置,但 IN 语句中的元素不能超过 1000 个。 所以我们不得不欺骗它做我们想要的事情:
等等。
虽然丑陋,但它有效。
(在有人指出子查询或内联视图的明显解决方案之前,查询是在完全不同的系统上生成的)
I recall a peculiar Oracle problem that got me completely off gueard. I'm not sure it this was a configuration of the instance or a default settings, but we could not have more that 1000 elements in a IN statement. So we had to trick it into doing what wanted:
etc.
Ugly, but it worked.
(Before anyone points out the obvious solution of a sub-query or inline view, the query was generated on a completely different system)
不同的数据库处理二进制数据的方式略有不同。 例如,这将在 MySQL 下工作:
但是,Oracle 依赖于这些十六进制值:
相反,我们必须将其转换为十六进制:
Different databases handle binary data a bit differently. So for instance, this will work under MySQL:
However, Oracle relies on these values being hex:
Instead, we have to convert it to hex:
设置操作员支持。
除了 UNION / UNION ALL 之外,数据库中的集合运算符支持也很不稳定。 Oracle 和 SQL Server 支持其中的大多数操作,但 Oracle 支持 MINUS 操作以及等效的标准 EXCEPT DISTINCT 操作。 AFIK,MySQL 仅支持 UNION(不支持 INTERSECT 或 EXCEPT)。
Set operator support.
Aside from UNION / UNION ALL, set operator support is pretty spotty across databases. Oracle and SQL server support most of them, but Oracle supports a MINUS operation as well as the equivalent standard EXCEPT DISTINCT operation. AFIK, MySQL only supports UNION (no INTERSECT or EXCEPT support).
我不确定在Python附带的驱动程序中这是否属实,但在我使用的“Horrable”版本中,如果结果集中的列为该列中的所有值返回NULL,则列本身是未在结果集结构中返回。 这可能(并且经常)导致生产错误,而这些错误在开发中是不可能重复的。
由于您要放弃 Oracle,所以这不应该是一个问题,但是开发人员使用一个关键字来减轻这个“功能”,我怀疑 MySQL 是否支持该功能。 但我忘记了它是什么,谷歌也没有提供帮助。
I'm not sure if this is true in the drivers that come with Python, but in the version of "Horrable" I used, if a column in a resultset returns NULL for all values in the column, the column itself is not returned in the structure of the resultset. This could (and often did) result in production bugs that were impossible to duplicate in development.
Since you're moving away from Oracle, this shouldn't be a concern, but there is a keyword developers use to mitigate this "feature" that I doubt is supported in MySQL. I forget what it is, though, and Google is not helping.
Oracle似乎没有游标的问题,它们是SQL Server中一个巨大的性能问题。
实际上,几乎所有性能调优都是特定于数据库的(这就是为什么 ANSII 标准代码通常与设计到特定于数据库的特定 SQL 风格的更好方法相比性能非常差)。
日期是另一件事,不同数据库的处理方式似乎非常不同。
数据类型也不等同。 对于 SQL Server 新手来说,时间戳数据类型与日期和时间完全无关,并且无法转换为数据时间值。
Oracle does not seem to have a problem with cursors, they are a huge performance problem in SQL server.
Actually pretty much all performance tuning is database specific (which is why ANSII standard code often performs very poorly compared to the better methods designed into the specifc flavor of SQL that is database specific).
Dates are another thing that seem to be handled very differntly from database to database.
Datatypes are not equivalent either. One thing that tends to get newcomers to SQL Server is that timestamp data type has absolutely nothing to do with dates and times and cannot be converted to a datatime value.
Oracle 不允许没有 FROM 子句的 select 语句。 因此,您不能执行这样的查询:
相反,您必须说查询来自
DUAL
表:Oracle does not allow a select statement without a FROM clause. Therefore, you cannot do queries like this:
Instead, you have to say that the query is from the
DUAL
table: