我多次读到和听到应该避免 sql_variant
。我想我有一个很好的用例。我过去曾使用 varchar(max) 来在同一列中存储不同的类型,但当有一个内置类型完全符合我的要求时,避免反/序列化开销似乎是明智的想。
那么,使用 sql_variant 到底有哪些陷阱呢?它们与性能相关,还是容易犯编程错误,还是其他原因?顺便说一句,如果需要考虑的话,我将通过客户端代码和 CLR 函数与本专栏进行交互。
I've read and heard several times that sql_variant
should be avoided. I think I have a great use case for it. I've used varchar(max)
in the past to store different types in the same column, but it seems sensible to avoid the de/serialization overhead when there's a built-in type that does exactly what I want.
So, what exactly are the pitfalls of using sql_variant
? Are they performance related, or easy-to-make programming mistakes, or something else? Incidentally, I'll be interacting with this column from client code and CLR functions, if that's something to consider.
发布评论
评论(4)
通过
SQL_VARIANT
在同一列中存储不同类型与 .NET 中的所有内容都转换为Object
几乎是一样的。有时使用这种类型是有充分理由的,因为它肯定可以允许更通用的编程结构。然而,正如您所预料的,使用
SQL_VARIANT
存在一些您应该注意的陷阱,特别是其中一个可能会破坏交易:就像将所有内容都转换为
.NET 中的对象(并且可能需要装箱/拆箱,具体取决于基本类型),使用 SQL_VARIANT 时会明显影响性能。根据用例,如果功能确实需要它和/或使用不是很频繁(即每秒多次),那么降低性能可能是可以接受的。
与 .NET 中将所有内容转换为
Object
不同,SQL_VARIANT
数据类型对其可以包含的基本数据类型有限制。以下数据类型不能存储为SQL_VARIANT
:VARCHAR(MAX)
NVARCHAR(MAX)
VARBINARY(MAX)
XML
时间戳
/ROWVERSION
TEXT
(从 SQL Server 2005 开始,您不应该使用此类型)NTEXT
(从 SQL Server 2005 开始,您不应该使用此类型)IMAGE
(从 SQL Server 2005 开始,您不应该使用此类型)如果需要存储任何这些数据类型,此限制可以轻松阻止使用
SQL_VARIANT
。请注意,这里的问题是基本数据类型,而不是数据的大小,如以下测试所示:返回:
公平地说,使用
SQL_VARIANT
的一个好处是将所有内容转换为NVARCHAR
的原因是SQL_VARIANT
保留基础类型信息并强制其使用,以便您不能轻易地在完全不适当的上下文中误用值。返回:
关于无法使用
SQL_VARIANT
作为 PK:这实际上不是问题,因为通用数据类型的本质几乎使其从一开始就不适合这种用途。关于无法将
SQL_VARIANT
与LIKE
运算符一起使用:这基本上不是问题,因为能够将其转换为适用于的适当类型>LIKE
,如:上面当然不是最有效的,但它是功能性的,并且如上所述,在决定使用
时,效率已经被排除,因为它是为了换取功能而牺牲的SQL_VARIANT
数据类型。Storing different types in the same column via
SQL_VARIANT
is almost the same thing as casting everything toObject
in .NET. And sometimes there are valid reasons for using this type as it can certainly allow for a more generic programmatic structure.However, as you were anticipating, there are some pitfalls to using
SQL_VARIANT
that you should be aware of, especially as one of them might be a deal-breaker:Just like casting everything to
Object
in .NET (and possibly requiring boxing / unboxing depending on the base type), there is a definite performance hit when usingSQL_VARIANT
. Depending on the use case, it might be acceptable to have reduced performance if the functionality really needs it and/or the usage is not very frequent (i.e. many times per second).Unlike casting everything to
Object
in .NET, theSQL_VARIANT
datatype has limitations on what base datatypes it can contain. The following datatypes cannot be stored asSQL_VARIANT
:VARCHAR(MAX)
NVARCHAR(MAX)
VARBINARY(MAX)
XML
TIMESTAMP
/ROWVERSION
TEXT
(you shouldn't be using this type anyway as of SQL Server 2005)NTEXT
(you shouldn't be using this type anyway as of SQL Server 2005)IMAGE
(you shouldn't be using this type anyway as of SQL Server 2005)This limitation can easily prevent the possibility of using
SQL_VARIANT
if there is a requirement to store any of these datatypes. Please note that the issue here is the base datatype and not the size of the data, as the following test shows:Returns:
To be fair, one benefit to using
SQL_VARIANT
over casting everything toNVARCHAR
is thatSQL_VARIANT
retains the underlying type info and enforces its usage so that you cannot easily misuse values in completely inappropriate contexts.Returns:
Regarding not being able to use
SQL_VARIANT
as a PK: this is really a non-issue since the very nature of a generic datatype pretty much excludes it from being desirable in the first place for such a use.Regarding not being able to use
SQL_VARIANT
with aLIKE
operator: this is mostly a non-issue due to being able to convert it to an appropriate type that does work withLIKE
, as in:The above is certainly not the most efficient, but it is functional, and as mentioned above, efficiency was already ruled out as it was sacrificed in return for functionality when deciding to use the
SQL_VARIANT
datatype.我见过性能问题和代码质量相关问题:
大多数时候您访问此字段,您必须检查类型(使用 sql_variant_property)。这会使您的查询更加复杂,从而可能导致您列出的两个问题。
您还必须在每次使用该字段时投射该字段,从而导致进一步的性能损失。
此外,sql_variant 列不能作为主键的一部分,它们不能作为计算列的一部分,并且不能与 WHERE 子句中的 LIKE 一起使用。
I've seen both performance problems and code quality related problems:
Most of the time you access this field, you are going to have to check the type (using sql_variant_property). This makes your queries more complex, which can cause both of the problems you list.
You will also have to cast this field every time you use it, causing further performance penalty.
Further, sql_variant columns cant be part of a primary key, they don't work as part of a computed column, and they don't work with LIKE in a WHERE clause.
这也使得编程错误更容易发生。 DBA/程序员查看一列,它看起来像一个整数,因此他在其中放入一个整数,但在更远的地方,进程希望它是一个字符串。我见过这种情况,因为 sql_variant 列的导入写得不好。
It also makes it easier for programming errors to occur. A DBA/Programmer looks at a column and it looks like a integer, so he puts an integer in it, but farther down the line a process wants that to be a string. I've seen this with poorly written imports into sql_variant columns.
想到的唯一明显的陷阱是在您想要推入 sql_variant 字段的值超过其最大长度(8016 字节,根据此网页:http://msdn.microsoft.com/en-us/library/ms173829.aspx)。如果您的值永远不会达到该限制,那么 sql_variant 可能是一个非常好的方法。否则,您仍然可以使用 sql_variant,但提供一个单独的“isBlob”位字段,该字段指向带有 varbinary(max) 值的单独表(例如)。
The only obvious pitfall that comes to mind is in situations wherein you have values that you want to push into your sql_variant field that exceed its max length (8016 bytes, per this web page: http://msdn.microsoft.com/en-us/library/ms173829.aspx). If your values never approach that limit, then sql_variant can be a very good approach. Else, you could still use sql_variant, but provide a separate "isBlob" bit field that points to a separate table with your varbinary(max) values (for eg).