IDataReader 和“HasColumn”,最佳方法?

发布于 2024-07-11 03:30:27 字数 613 浏览 11 评论 0原文

我见过两种常见的方法来检查 IDataReader 中是否存在列:

public bool HasColumn(IDataReader reader, string columnName)
{
  try
  {
      reader.getOrdinal(columnName)
      return true;
  }
  catch 
  {
       return false;
  }
}

或者:

public bool HasColumn(IDataReader reader, string columnName)
{

    reader.GetSchemaTable()
         .DefaultView.RowFilter = "ColumnName='" + columnName + "'";

    return (reader.GetSchemaTable().DefaultView.Count > 0);
}

就我个人而言,我使用了第二种方法,因为我讨厌因此而使用异常。

然而,在大型数据集上,我相信 RowFilter 可能必须对每列进行表扫描,这可能会非常慢。

想法?

I've seen two common approaches for checking if a column exists in an IDataReader:

public bool HasColumn(IDataReader reader, string columnName)
{
  try
  {
      reader.getOrdinal(columnName)
      return true;
  }
  catch 
  {
       return false;
  }
}

Or:

public bool HasColumn(IDataReader reader, string columnName)
{

    reader.GetSchemaTable()
         .DefaultView.RowFilter = "ColumnName='" + columnName + "'";

    return (reader.GetSchemaTable().DefaultView.Count > 0);
}

Personally, I've used the second one, as I hate using exceptions for this reason.

However, on a large dataset, I believe RowFilter might have to do a table scan per column, and this may be incredibly slow.

Thoughts?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

柠檬色的秋千 2024-07-18 03:30:27

我想我对这颗古老的宝石有一个合理的答案。

我会选择第一种方法,因为它简单得多。 如果您想避免异常,您可以缓存字段名称并在缓存上执行 TryGet。

public Dictionary<string,int> CacheFields(IDataReader reader)
{

    var cache = new Dictionary<string,int>();
    for (int i = 0; i < reader.FieldCount; i++)
    {
        cache[reader.GetName(i)] = i;
    }
    return cache;
}

这种方法的优点是更简单并且可以更好地控制。 另外,请注意,您可能需要研究不区分大小写或不区分假名的比较,这会使事情变得有点棘手。

I think I have a reasonable answer for this old gem.

I would go with the first approach cause its much simpler. If you want to avoid the exception you can cache the field names and do a TryGet on the cache.

public Dictionary<string,int> CacheFields(IDataReader reader)
{

    var cache = new Dictionary<string,int>();
    for (int i = 0; i < reader.FieldCount; i++)
    {
        cache[reader.GetName(i)] = i;
    }
    return cache;
}

The upside of this approach is that it is simpler and gives you better control. Also, note, you may want to look into case insensitive or kana insensitive compares, which would make stuff a little trickier.

昔梦 2024-07-18 03:30:27

很大程度上取决于您如何使用 HasColumn。 您是只调用一次或两次,还是在循环中重复调用它? 该专栏可能存在还是事先完全未知?

设置行过滤器可能每次都会进行表扫描。 (另外,从理论上讲,GetSchemaTable() 可以在每次调用时生成一个全新的表,这会更加昂贵——我不相信 SqlDataReader 会这样做,但在 IDataReader 级别,谁知道呢?)但是如果您只调用它一次或两次我无法想象这是一个多大的问题(除非您有数千个列或其他东西)。

以防您的特定 IDataReader 重新生成它。)

(但是,我至少将 GetSchemaTable() 的结果存储在方法内的本地变量中,以避免快速连续调用它两次,如果不将其缓存在某处, 您事先知道在正常情况下您要求的列将会存在,例外方法更容易接受(因为实际上不存在列是一种例外情况)。 即使没有,它的性能可能会稍好一些,但除非您反复调用它,否则您应该问自己性能是否真的那么重要。

如果您重复调用它,您可能应该考虑采用不同的方法,例如:预先调用 GetSchemaTable() 一次,循环遍历表,并将字段名称加载到字典或其他专为快速设计的结构中查找。

A lot depends on how you're using HasColumn. Are you calling it just once or twice, or repeatedly in a loop? Is the column likely to be there or is that completely unknown in advance?

Setting a row filter probably would do a table scan each time. (Also, in theory, GetSchemaTable() could generate an entirely new table with every call, which would be even more expensive -- I don't believe SqlDataReader does this, but at the IDataReader level, who knows?) But if you only call it once or twice I can't imagine this being that much of an issue (unless you have thousands of columns or something).

(I would, however, at least store the result of GetSchemaTable() in a local var within the method to avoid calling it twice in quick succession, if not cache it somewhere on the off chance that your particular IDataReader DOES regenerate it.)

If you know in advance that under normal circumstances the column you ask for will be present, the exception method is a bit more palatable (because the column not being there is, in fact, an exceptional case). Even if not, it might perform slightly better, but again unless you're calling it repeatedly you should ask yourself if performance is really that much of a concern.

And if you ARE calling it repeatedly, you probably should consider a different approach anyway, such as: call GetSchemaTable() once up front, loop through the table, and load the field names into a Dictionary or some other structure that is designed for fast lookups.

属性 2024-07-18 03:30:27

我不担心性能影响。 即使您有一个包含 1000 列的表(这将是一个巨大的表),您仍然只执行 1000 行的“表扫描”。 这可能是微不足道的。

过早的优化只会导致您实现不必要的复杂实现。 实施最适合您的版本,然后衡量性能影响。 如果与您的性能要求相比不可接受,请考虑替代方案。

I wouldn't worry about the performance impact. Even if you had a table with 1000 columns (which would be an enormous table), you are still only doing a "table scan" of 1000 rows. That is likely to be trivial.

Premature optimization will just lead you toward an unnecessarily complex implementation. Implement the version that seems best to you, and then measure the performance impact. If it is unacceptable compared to your performance requirements, then consider alternatives.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文