DataReader - 硬编码序数?

发布于 2024-09-02 05:54:10 字数 596 浏览 2 评论 0原文

DataReader 返回数据时,我通常会使用 DataReader 上的序号引用来获取相关列:

if (dr.HasRows)         
   Console.WriteLine(dr[0].ToString());

或者

if (dr.HasRows)         
   Console.WriteLine(dr.GetString(0));

或者

if (dr.HasRows)         
   Console.WriteLine((string)dr[0]);

我一直这样做,因为我很早就被建议了使用 dr["ColumnName"] 或更优雅的索引方式会导致性能下降。

然而,虽然对数据实体的所有引用都变得越来越强类型,但我对此感到更不舒服。我还知道上面的内容不会检查 DBNull

DataReader 返回数据的最可靠方法是什么?

When returning data from a DataReader I would typically use the ordinal reference on the DataReader to grab the relevant column:

if (dr.HasRows)         
   Console.WriteLine(dr[0].ToString());

or

if (dr.HasRows)         
   Console.WriteLine(dr.GetString(0));

or

if (dr.HasRows)         
   Console.WriteLine((string)dr[0]);

I have always done this because I was advised at an early stage that using dr["ColumnName"] or a more elegant way of indexing causes a performance hit.

However, while all references to data entities are becoming increasingly strongly-typed I feel more uncomfortable with this. I'm also aware that the above does not check for DBNull.

What is the most robust way to return data from a DataReader?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

述情 2024-09-09 05:54:10

在这种情况下,双方都有可能争论。正如其他人已经指出的那样,使用名称更具可读性,并且如果有人更改底层数据库中的列顺序也不会中断。但也有人可能会争辩说,使用序数的优点是,如果有人更改基础数据库中的列名,也不会中断。不过,我更喜欢前一个参数,并且认为列名的可读性参数通常胜过第二个参数。名称的另一个论据是它可以“自我检测”错误。如果有人确实更改了字段名称,那么代码更有可能被破坏,而不是在读取错误字段时出现看似正常的微妙错误。

这似乎是显而易见的,但也许值得一提的是一个既具有自检测错误又具有序数性能的用例。如果您在 SQL 中显式指定 SELECT 列表,那么使用序数不会有问题,因为代码中的语句保证了顺序:

SELECT name, address, phone from mytable

在这种情况下,使用序数访问数据会相当安全。如果有人移动表中的字段也没关系。如果有人更改了名称,那么 SQL 语句在运行时会产生错误。

最后一点。我刚刚对我帮助编写的提供程序进行了测试。该测试读取 100 万行并访问每条记录上的“lastname”字段(与值进行比较)。使用 rdr[“lastname”] 的处理时间为 3301 毫秒,而 rdr.GetString(1) 的处理时间为 2640 毫秒(大约加速了 25%)。在这个特定的提供程序中,名称的查找使用排序查找将名称转换为序数。

It is possible to argue both sides in this situation. As already pointed out by others, using the name is more readable and will not break if someone changes the order of columns in the underlying database. But one might also argue the case that using an ordinal has the advantage of not breaking if someone changes the column name in the underlying database. I prefer the former argument, though, and think the readability argument for column names trumps the second argument in general. And an additional argument for names is that it is that it can “self-detect” errors. If someone does change a field name, then the code has a better chance of breaking rather than having the subtle bug of appearing to work while it reads the wrong field.

It seems obvious but maybe it is worth mentioning a usage case that has both the self-detecting error and the performance of ordinals. If you specify the SELECT list explicitly in the SQL, then using ordinals won’t be a problem because the statement in the code guarantees the order:

SELECT name, address, phone from mytable

In this case, it would be fairly safe to use ordinals to access the data. It doesn’t matter if someone moves fields around in the table. And if someone changes a name, then the SQL statement produce an error when it runs.

And one final point. I just ran a test on a provider I helped write. The test read 1 million rows and accessed the “lastname” field on each record (compared against a value). The usage of rdr[“lastname”] took 3301 milliseconds to process while rdr.GetString(1) took 2640 milliseconds (approximately a 25% speedup). In this particular provider, the lookup of the name uses a sorted lookup to translate the name to ordinal.

萌吟 2024-09-09 05:54:10

字符串名称查找比序号调用昂贵得多,但比对序号进行硬编码更易于维护且更不易“脆弱”。这就是我一直在做的事情。这是两全其美的。我不必记住序数值或关心列顺序是否发生变化,但我可以获得使用序数的性能优势。

var dr = command.ExecuteQuery();
if (dr.HasRows)
{
    //Get your ordinals here, before you run through the reader
    int ordinalColumn1 = dr.GetOrdinal("Column1");
    int ordinalColumn2 = dr.GetOrdinal("Column2");
    int ordinalColumn3 = dr.GetOrdinal("Column3");

    while(dr.Read())
    {
        // now access your columns by ordinal inside the Read loop. 
        //This is faster than doing a string column name lookup every time.
        Console.WriteLine("Column1 = " + dr.GetString(ordinalColumn1);
        Console.WriteLine("Column2 = " + dr.GetString(ordinalColumn2);
        Console.WriteLine("Column3 = " + dr.GetString(ordinalColumn3);
    }
}

注意:这仅对您期望获得相当多行数的读者才有意义。GetOrdinal() 调用是额外的,并且只有当您通过调用 GetString 节省了总成本时,才可以收回成本。循环中的 (int ordinalNumber) 大于调用 GetOrdinal 的成本。

编辑:错过了这个问题的第二部分。关于 DBNull 值,我已经开始编写处理这种可能性的扩展方法。示例:dr.GetDatetimeSafely() 在这些扩展方法中,您可以执行任何您需要的操作,以确保获得预期值。

The string name lookup is much more expensive than the ordinal call, but more maintainable and less "fragile" than hard coding the ordinals. So here's how I've been doing it. It's the best of both worlds. I don't have to remember the ordinal values or care if the column order changes, but I get the performance benefits of using ordinals.

var dr = command.ExecuteQuery();
if (dr.HasRows)
{
    //Get your ordinals here, before you run through the reader
    int ordinalColumn1 = dr.GetOrdinal("Column1");
    int ordinalColumn2 = dr.GetOrdinal("Column2");
    int ordinalColumn3 = dr.GetOrdinal("Column3");

    while(dr.Read())
    {
        // now access your columns by ordinal inside the Read loop. 
        //This is faster than doing a string column name lookup every time.
        Console.WriteLine("Column1 = " + dr.GetString(ordinalColumn1);
        Console.WriteLine("Column2 = " + dr.GetString(ordinalColumn2);
        Console.WriteLine("Column3 = " + dr.GetString(ordinalColumn3);
    }
}

Note: this only really makes sense for readers you expect to have a decent number of rows in. The GetOrdinal() calls are extra, and only pay for themselves if your combined savings from calling GetString(int ordinalNumber) in the loop are greater than the cost of calling GetOrdinal.

Edit: missed the second part of this question. Regarding DBNull values, I've started writing extension methods that deal with that possibility. example: dr.GetDatetimeSafely() Within those extension methods, you can do whatever you need to feel confident you get the expected value back.

﹂绝世的画 2024-09-09 05:54:10

我总是使用字符串名称方法,只是因为阅读代码更清晰。必须在心里将索引解析为列名是可怕的。

I always go with the string name approach just because reading the code is cleaner. Having to mentally parse index to column name is horrible.

调妓 2024-09-09 05:54:10

按名称索引数据读取器的成本稍高。造成这种情况的主要原因有两个。

  • 典型的实现将字段信息存储在使用数字索引的数据结构中。必须进行映射操作才能将名称转换为数字。
  • 某些实现将对名称进行两次查找。第一遍尝试在启用区分大小写的情况下匹配字段名称。如果该传递失败,则第二次传递将在关闭区分大小写的情况下开始。

然而,在大多数情况下,通过名称查找字段所带来的性能损失与数据库执行命令所花费的时间相比是微不足道的。不要让性能损失决定您在名称索引和数字索引之间的选择。

尽管性能略有下降,但我通常选择名称索引有两个原因。

  • 代码更容易阅读。
  • 该代码更能容忍结果集架构的更改。

如果您觉得名称索引的性能损失变得有问题(也许命令执行得很快,但返回很多行),那么按名称查找数字索引一次,将其保存起来,并将其用于剩余的行。

Indexing the data reader by name is slightly more expensive. There are two main reasons for this.

  • Typical implementations store field information in a data structure that uses numerical indexing. A mapping operation has to take place to transpose a name into a number.
  • Some implementations will do a two-pass lookup on the name. The first pass tries to match field names with case-sensitivity turned on. If that pass fails then a second pass begins with case-sensitivity turned off.

However, in most cases the performance penalty incurred by looking up a field by name is dwarfed by the amount of time it takes for the database to execute the command. Do not let the performance penalty dictate your choice between name and numeric indexing.

Despite the slight performance penalty I normally choose name indexing for two reasons.

  • The code is easier to read.
  • The code is more tolerant to changes in the schema of the resultset.

If you feel like the performance penalty of name indexing is becoming problematic (maybe the command executes quickly, but returns a lot of rows) then lookup the numeric index once by name, save it away, and use it for the remaining rows.

酒浓于脸红 2024-09-09 05:54:10

我确实认为索引字段是更好的方法,如果它只是为了避免底层数据库的字段名称更改,这将需要重新编译您的应用程序,因为您对字段名称进行了硬编码。

对于每个字段,您需要手动检查空值。

var dr = command.ExecuteQuery();

if (dr.HasRows) {
    var customer = new Customer();
    // I'm assuming that you know each field's position returned from your query.
    // Where comes the importance to write all of your selected fields and not just "*" or "ALL".
    customer.Id = dr[0] == null || dr[0] == DBNull.Value ? null : Convert.ToInt32(dr[0]);
    ...
}

除此之外,它还允许您使用反射并使此“GetData()”方法更加通用,提供 typeof(T) 并获取正确类型的正确构造函数。与每列顺序的绑定是一些人唯一希望避免的一件事,但在这种情况下,它变得值得。

I do think that indexed fields is the better approach, if it would only be for avoiding field names changes from the underlying database, which would require a recompile of your application because you hardcoded the field name.

For each of the fields, you will need to manually check for nulls.

var dr = command.ExecuteQuery();

if (dr.HasRows) {
    var customer = new Customer();
    // I'm assuming that you know each field's position returned from your query.
    // Where comes the importance to write all of your selected fields and not just "*" or "ALL".
    customer.Id = dr[0] == null || dr[0] == DBNull.Value ? null : Convert.ToInt32(dr[0]);
    ...
}

In addition to it, it would allow you to use reflection and make this "GetData()" method more generic providing a typeof(T) and getting the proper constructor for the proper type. The binding to the order of each columns is the only one thing some wish to avoid, but in this case, it becomes worthy.

姜生凉生 2024-09-09 05:54:10

序数的问题是,如果列的顺序发生变化,您将被迫修改 DataReader 的使用者代码,这与使用列名称不同。

我认为使用序数或列名不会带来性能提升,更多的是最佳实践、编码标准和代码可维护性

The problem with ordinal is if the order of the columns change and you are forced to modify the consumer code for the DataReader, unlike using column names.

I dont think there is a performance gain when using ordinal or column names, it is more on best practice and coding standards and code maintainability really

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文