检查 DBNull 然后分配给变量的最有效方法?
这个问题偶尔会出现,但我还没有看到令人满意的答案。
典型的模式是(行是DataRow):
if (row["value"] != DBNull.Value)
{
someObject.Member = row["value"];
}
我的第一个问题是哪个更有效(我已经翻转了条件):
row["value"] == DBNull.Value; // Or
row["value"] is DBNull; // Or
row["value"].GetType() == typeof(DBNull) // Or... any suggestions?
这表明 .GetType() 应该更快,但也许编译器知道一些我不知道的技巧?
第二个问题,是否值得缓存 row["value"] 的值,或者编译器是否会优化索引器?
例如:
object valueHolder;
if (DBNull.Value == (valueHolder = row["value"])) {}
注意:
- row["value"] 存在。
- 我不知道该列的列索引(因此不知道列名查找)。
- 我特别询问关于检查 DBNull 然后分配(而不是关于过早优化等)。
我对几个场景进行了基准测试(时间以秒为单位,10,000,000 次试验):
row["value"] == DBNull.Value: 00:00:01.5478995
row["value"] is DBNull: 00:00:01.6306578
row["value"].GetType() == typeof(DBNull): 00:00:02.0138757
Object.ReferenceEquals 与“==”具有相同的性能
最有趣的结果是什么? 如果您按大小写不匹配列的名称(例如,“Value”而不是“value”,则大约需要十倍的时间(对于字符串):
row["Value"] == DBNull.Value: 00:00:12.2792374
这个故事的寓意似乎是,如果您看不到按索引向上查找列,然后确保提供给索引器的列名称与 DataColumn 的名称完全匹配,
缓存该值的速度也几乎两倍:
No Caching: 00:00:03.0996622
With Caching: 00:00:01.5659920
因此,最有效的方法似乎是:
object temp;
string variable;
if (DBNull.Value != (temp = row["value"]))
{
variable = temp.ToString();
}
This question comes up occasionally, but I haven't seen a satisfactory answer.
A typical pattern is (row is a DataRow):
if (row["value"] != DBNull.Value)
{
someObject.Member = row["value"];
}
My first question is which is more efficient (I've flipped the condition):
row["value"] == DBNull.Value; // Or
row["value"] is DBNull; // Or
row["value"].GetType() == typeof(DBNull) // Or... any suggestions?
This indicates that .GetType() should be faster, but maybe the compiler knows a few tricks I don't?
Second question, is it worth caching the value of row["value"] or does the compiler optimize the indexer away anyway?
For example:
object valueHolder;
if (DBNull.Value == (valueHolder = row["value"])) {}
Notes:
- row["value"] exists.
- I don't know the column index of the column (hence the column name lookup).
- I'm asking specifically about checking for DBNull and then assignment (not about premature optimization, etc.).
I benchmarked a few scenarios (time in seconds, 10,000,000 trials):
row["value"] == DBNull.Value: 00:00:01.5478995
row["value"] is DBNull: 00:00:01.6306578
row["value"].GetType() == typeof(DBNull): 00:00:02.0138757
Object.ReferenceEquals has the same performance as "=="
The most interesting result? If you mismatch the name of the column by case (for example, "Value" instead of "value", it takes roughly ten times longer (for a string):
row["Value"] == DBNull.Value: 00:00:12.2792374
The moral of the story seems to be that if you can't look up a column by its index, then ensure that the column name you feed to the indexer matches the DataColumn's name exactly.
Caching the value also appears to be nearly twice as fast:
No Caching: 00:00:03.0996622
With Caching: 00:00:01.5659920
So the most efficient method seems to be:
object temp;
string variable;
if (DBNull.Value != (temp = row["value"]))
{
variable = temp.ToString();
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(15)
我肯定错过了什么。 不检查
DBNull
到底是什么DataRow.IsNull
方法有什么作用?我一直在使用以下两种扩展方法:
用法:
如果您不希望
Nullable
返回GetValue
的值,您可以轻松返回< code>default(T) 或其他一些选项。顺便提一下,这里有一个 VB.NET 替代 Stevo3000 的建议:
I must be missing something. Isn't checking for
DBNull
exactly what theDataRow.IsNull
method does?I've been using the following two extension methods:
Usage:
If you didn't want
Nullable<T>
return values forGetValue<T>
, you could easily returndefault(T)
or some other option instead.On an unrelated note, here's a VB.NET alternative to Stevo3000's suggestion:
您应该使用该方法:
考虑到它是内置于框架中的,我希望这是最有效的。
我建议这样做:
是的,编译器应该为您缓存它。
You should use the method:
Considering it's built-in to the Framework, I would expect this to be the most efficient.
I'd suggest something along the lines of:
And yes, the compiler should cache it for you.
编译器不会优化索引器(即,如果您使用 row["value"] 两次),所以是的,它稍微更快:
然后使用 value 两次; 如果它为 null,则使用 .GetType() 会带来问题...
DBNull.Value
实际上是一个单例,因此要添加第四个选项 - 您也许可以使用 ReferenceEquals - 但实际上,我认为您'我在这里担心太多了...我不认为“is”、“==”等之间的速度差异将成为您所看到的任何性能问题的原因。 分析您的整个代码并专注于重要的事情......不会是这个。The compiler won't optimise away the indexer (i.e. if you use row["value"] twice), so yes it is slightly quicker to do:
and then use value twice; using .GetType() risks issues if it is null...
DBNull.Value
is actually a singleton, so to add a 4th option - you could perhaps use ReferenceEquals - but in reality, I think you're worrying too much here... I don't think the speed different between "is", "==" etc is going to be the cause of any performance problem you are seeing. Profile your entire code and focus on something that matters... it won't be this.我将在 C# 中使用以下代码(VB.NET 并不那么简单)。
如果该值不为 null/DBNull,则代码会分配该值,否则它会分配默认值,该默认值可以设置为 LHS 值,从而允许编译器忽略分配。
I would use the following code in C# (VB.NET is not as simple).
The code assigns the value if it is not null/DBNull, otherwise it asigns the default which could be set to the LHS value allowing the compiler to ignore the assign.
我觉得这里只有极少数的方法不会让潜在客户面临最担心的风险(Marc Gravell、Stevo3000、Richard Szalay、Neil、Darren Koppand),而且大多数方法都不必要地复杂。 充分意识到这是无用的微优化,让我说你基本上应该采用这些:
1)不要从 DataReader/DataRow 读取值两次 - 所以要么在空检查和转换/转换之前缓存它,要么最好直接传递你的
record[X]
对象到具有适当签名的自定义扩展方法。2)要遵守上述规定,请勿在 DataReader/DataRow 上使用内置的
IsDBNull
函数,因为它会在内部调用record[X]
,因此实际上您将这样做两次。3) 作为一般规则,类型比较总是比值比较慢。 只需做得更好
record[X] == DBNull.Value
即可。4) 直接转换比调用
Convert
类进行转换要快,尽管我担心后者会出现更少的问题。5)最后,通过索引而不是列名访问记录将再次更快。
我觉得按照 Szalay、Neil 和 Darren Koppand 的方法会更好。 我特别喜欢 Darren Koppand 的扩展方法方法,它接受
IDataRecord
(尽管我想将其进一步缩小到IDataReader
)和索引/列名称。请小心调用它:
以防万一
您需要区分
0
和DBNull
。 例如,如果枚举字段中有空值,否则default(MyEnum)
可能会返回第一个枚举值。 因此最好调用record.GetColumnValue("Field")
。由于您正在从
DataRow
读取数据,因此我将通过 干燥通用代码。所以现在这样称呼它:
我相信这就是它在框架中应该的样子(而不是
record.GetInt32
、record.GetString
等方法) - 没有运行时异常,使我们能够灵活地处理空值。根据我的经验,使用一种通用方法从数据库读取数据的运气较差。 我总是必须自定义处理各种类型,因此从长远来看,我必须编写自己的
GetInt
、GetEnum
、GetGuid
等方法。 如果您想在默认情况下从数据库读取字符串时修剪空格,或者将DBNull
视为空字符串,该怎么办? 或者,如果您的小数应该被截断所有尾随零。 我在Guid
类型上遇到了最大的麻烦,当底层数据库可以将它们存储为字符串或二进制时,不同的连接器驱动程序的行为也不同。 我有这样的重载:使用 Stevo3000 的方法,我发现调用有点丑陋和乏味,并且很难用它来创建通用函数。
I feel only very few approaches here doesn't risk the prospect OP the most worry (Marc Gravell, Stevo3000, Richard Szalay, Neil, Darren Koppand) and most are unnecessarily complex. Being fully aware this is useless micro-optimization, let me say you should basically employ these:
1) Don't read the value from DataReader/DataRow twice - so either cache it before null checks and casts/conversions or even better directly pass your
record[X]
object to a custom extension method with appropriate signature.2) To obey the above, do not use built in
IsDBNull
function on your DataReader/DataRow since that calls therecord[X]
internally, so in effect you will be doing it twice.3) Type comparison will be always slower than value comparison as a general rule. Just do
record[X] == DBNull.Value
better.4) Direct casting will be faster than calling
Convert
class for converting, though I fear the latter will falter less.5) Lastly, accessing record by index rather than column name will be faster again.
I feel going by the approaches of Szalay, Neil and Darren Koppand will be better. I particularly like Darren Koppand's extension method approach which takes in
IDataRecord
(though I would like to narrow it down further toIDataReader
) and index/column name.Take care to call it:
and not
in case you need to differentiate between
0
andDBNull
. For example, if you have null values in enum fields, otherwisedefault(MyEnum)
risks first enum value being returned. So better callrecord.GetColumnValue<MyEnum?>("Field")
.Since you're reading from a
DataRow
, I would create extension method for bothDataRow
andIDataReader
by DRYing common code.So now call it like:
I believe this is how it should have been in the framework (instead of the
record.GetInt32
,record.GetString
etc methods) in the first place - no run-time exceptions and gives us the flexibility to handle null values.From my experience I had less luck with one generic method to read from the database. I always had to custom handle various types, so I had to write my own
GetInt
,GetEnum
,GetGuid
, etc. methods in the long run. What if you wanted to trim white spaces when reading string from db by default, or treatDBNull
as empty string? Or if your decimal should be truncated of all trailing zeroes. I had most trouble withGuid
type where different connector drivers behaved differently that too when underlying databases can store them as string or binary. I have an overload like this:With Stevo3000's approach, I find the calling a bit ugly and tedious, and it will be harder to make a generic function out of it.
存在一种麻烦的情况,即该对象可能是字符串。 下面的扩展方法代码处理所有情况。 以下是您将如何使用它:
There is the troublesome case where the object could be a string. The below extension method code handles all cases. Here's how you would use it:
我个人比较喜欢这种语法,它使用 IDataRecord 公开的显式 IsDbNull 方法,并缓存列索引以避免重复的字符串查找。
为了便于阅读,它进行了扩展,如下所示:
为了 DAL 代码的紧凑性而重写为适合单行 - 请注意,在本示例中,我们将分配
int bar = -1
ifrow[" Bar"]
为空。如果您不知道内联赋值的存在,它可能会令人困惑,但它将整个操作保留在一行上,我认为当您从一个代码块中的多个列填充属性时,这会增强可读性。
I personally favour this syntax, which uses the explicit IsDbNull method exposed by
IDataRecord
, and caches the column index to avoid a duplicate string lookup.Expanded for readability, it goes something like:
Rewritten to fit on a single line for compactness in DAL code - note that in this example we're assigning
int bar = -1
ifrow["Bar"]
is null.The inline assignment can be confusing if you don't know it's there, but it keeps the entire operation on one line, which I think enhances readability when you're populating properties from multiple columns in one block of code.
并不是说我已经这样做了,但是您可以通过使用静态/扩展方法来绕过双索引器调用并仍然保持代码干净。
IE。
然后:
还有将空检查逻辑保留在一处的好处。 当然,缺点是这是一个额外的方法调用。
只是一个想法。
Not that I've done this, but you could get around the double indexer call and still keep your code clean by using a static / extension method.
Ie.
Then:
Also has the benefit of keeping the null checking logic in one place. Downside is, of course, that it's an extra method call.
Just a thought.
我尽量避免这种检查。
显然不需要对不能容纳
null
的列执行此操作。如果您存储为 Nullable 值类型(
int?
等),则只需使用as int?
进行转换。如果您不需要区分
string.Empty
和null
,您可以直接调用.ToString()
,因为 DBNull 将返回 <代码>字符串.空。I try to avoid this check as much as possible.
Obviously doesn't need to be done for columns that can't hold
null
.If you're storing in a Nullable value type (
int?
, etc.), you can just convert usingas int?
.If you don't need to differentiate between
string.Empty
andnull
, you can just call.ToString()
, since DBNull will returnstring.Empty
.我总是使用:
发现它简短而全面。
I always use :
Found it short and comprehensive.
这就是我处理从 DataRows 读取的方式
用法示例:
Props to Monsters得到了我的 .Net 的 ChageTypeTo 代码。
This is how I handle reading from DataRows
Usage example:
Props to Monsters Got My .Net for ChageTypeTo code.
我已经用扩展方法做了类似的事情。 这是我的代码:
要使用它,您需要执行类似的操作
I've done something similar with extension methods. Here's my code:
To use it, you would do something like
如果在 DataRow 中 row["fieldname"] isDbNull 将其替换为 0,否则获取十进制值:
if in a DataRow the row["fieldname"] isDbNull replace it with 0 otherwise get the decimal value:
像这样使用
use like this
我的程序中有 IsDBNull,它从数据库读取大量数据。 使用 IsDBNull,它可以在大约 20 秒内加载数据。
如果没有 IsDBNull,大约需要 1 秒。
所以我认为最好使用:
I have IsDBNull in a program which reads a lot of data from a database. With IsDBNull it loads data in about 20 seconds.
Without IsDBNull, about 1 second.
So I think it is better to use: