IEnumerable to DataTable performance issue
I have the following extension, which generates a DataTable from an IEnumerable:
public static DataTable AsDataTable<T>(this IEnumerable<T> enumerable)
{
    DataTable table = new DataTable();
    T first = enumerable.FirstOrDefault();
    if (first == null)
        return table;
    PropertyInfo[] properties = first.GetType().GetProperties();
    foreach (PropertyInfo pi in properties)
        table.Columns.Add(pi.Name, pi.PropertyType);
    foreach (T t in enumerable)
    {
        DataRow row = table.NewRow();
        foreach (PropertyInfo pi in properties)
            row[pi.Name] = t.GetType().InvokeMember(pi.Name, BindingFlags.GetProperty, null, t, null);
        table.Rows.Add(row);
    }
    return table;
}
However, on huge amounts of data, the performance isn't very good. Are there any obvious performance fixes that I'm missing?
First, a couple of non-perf problems:
On the perf side of things, I can see potential improvements on both the reflection and the data table loading sides of things:
With these mods, you would end up with something like the following:
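As a rough sketch of the kind of result this describes, the version below makes a few changes on the reflection and loading sides. The specific changes are assumptions for illustration, not necessarily the modifications meant above: it reflects over typeof(T) once (which differs from first.GetType() when T is a base type or interface), reads values through the cached PropertyInfo instead of InvokeMember, adds rows positionally, and wraps the load in BeginLoadData/EndLoadData. The method name AsDataTableBatched is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

public static class DataTableExtensions
{
    // Hypothetical name, chosen so it does not clash with the original AsDataTable.
    public static DataTable AsDataTableBatched<T>(this IEnumerable<T> source)
    {
        var table = new DataTable();

        // Reflect over T once. The source is enumerated only a single time,
        // and an empty sequence still produces a table with columns.
        PropertyInfo[] properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        foreach (PropertyInfo pi in properties)
        {
            // A DataColumn cannot be typed as Nullable<T>; use the underlying type.
            Type columnType = Nullable.GetUnderlyingType(pi.PropertyType) ?? pi.PropertyType;
            table.Columns.Add(pi.Name, columnType);
        }

        // Turns off notifications, index maintenance and constraints while loading.
        table.BeginLoadData();
        try
        {
            foreach (T item in source)
            {
                var values = new object[properties.Length];
                for (int i = 0; i < properties.Length; i++)
                {
                    // A DataRow stores missing values as DBNull.Value, not null.
                    values[i] = properties[i].GetValue(item, null) ?? DBNull.Value;
                }

                // Positional add: no per-column lookup by name.
                table.Rows.Add(values);
            }
        }
        finally
        {
            table.EndLoadData();
        }

        return table;
    }
}
```

Whether each of these changes helps measurably depends on the data, so it is worth profiling before and after.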
Instead of doing:
use:
You could always use a library like Fasterflect to emit IL instead of using true reflection for every property on every item in the list. I'm not sure about any gotchas with the DataTable. Alternatively, if this code is not trying to be a generic solution, you could always have whatever type is within the IEnumerable translate itself to a DataRow, thus avoiding reflection altogether.
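To illustrate the idea of paying the reflection cost once per property rather than once per value, here is a minimal sketch that compiles a getter delegate. It does not use Fasterflect, whose API is not shown here; it swaps in System.Linq.Expressions from the base class library as a stand-in for the same idea. The CompiledGetters helper and the Person class are hypothetical names introduced only for this example.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sample type used only to demonstrate the helper.
public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class CompiledGetters
{
    // Builds a delegate equivalent to: (T item) => (object)item.Property
    // The expression tree is compiled once; calling the delegate afterwards
    // involves no reflection.
    public static Func<T, object> Create<T>(PropertyInfo property)
    {
        ParameterExpression item = Expression.Parameter(typeof(T), "item");
        Expression access = Expression.Property(item, property);
        Expression boxed = Expression.Convert(access, typeof(object));
        return Expression.Lambda<Func<T, object>>(boxed, item).Compile();
    }
}
```

A caller would build one delegate per property before the row loop, for example CompiledGetters.Create<Person>(typeof(Person).GetProperty("Age")), and then invoke the delegates for each item. Compiling has its own up-front cost, so this is more likely to pay off on large inputs.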
You may not have a choice about this, but consider looking at the architecture of the code to see if you can avoid using a DataTable and return an IEnumerable<T> yourself instead.
The main reason for doing that would be:
You are going from an IEnumerable to a DataTable, which is effectively going from a streamed operation to a buffered operation.
Streamed: uses yield return so that results are only pulled off the enumeration as and when they are needed. Nothing is produced until the caller asks for the next item, unlike a foreach loop that walks the whole collection before returning.
Buffered: pulls all of the results into memory (e.g. a populated collection, DataTable or array), so all of the expense is incurred at once.
If you can use an IEnumerable return type, then you can make use of the yield return keyword yourself, meaning you spread the cost of all that reflection out instead of incurring it all at once.
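The streamed alternative could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in replacement: AsRows is a hypothetical helper that yields one object[] of property values per item, and it only helps if the consuming code can work with a sequence instead of a DataTable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class RowStreaming
{
    // Hypothetical helper: lazily yields one array of property values per item.
    public static IEnumerable<object[]> AsRows<T>(this IEnumerable<T> source)
    {
        PropertyInfo[] properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        foreach (T item in source)
        {
            var values = new object[properties.Length];
            for (int i = 0; i < properties.Length; i++)
                values[i] = properties[i].GetValue(item, null);

            // Execution pauses here until the caller asks for the next row,
            // so items the caller never requests are never reflected over.
            yield return values;
        }
    }
}
```

With this shape, a call such as source.AsRows().Take(2) reads only two items from the source. The total reflection cost for a full pass is unchanged; it is just paid row by row as the caller consumes the sequence.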