IEnumerable to DataTable performance issue

I have the following extension method, which generates a DataTable from an IEnumerable<T>:

    public static DataTable AsDataTable<T>(this IEnumerable<T> enumerable)
    {
        DataTable table = new DataTable();

        T first = enumerable.FirstOrDefault();
        if (first == null)
            return table;

        // Columns are taken from the runtime type of the first item.
        PropertyInfo[] properties = first.GetType().GetProperties();
        foreach (PropertyInfo pi in properties)
            table.Columns.Add(pi.Name, pi.PropertyType);

        foreach (T t in enumerable)
        {
            DataRow row = table.NewRow();
            foreach (PropertyInfo pi in properties)
                row[pi.Name] = t.GetType().InvokeMember(pi.Name, BindingFlags.GetProperty, null, t, null);
            table.Rows.Add(row);
        }

        return table;
    }

However, on large amounts of data the performance is not very good. Is there an obvious performance fix that I am not seeing?

Instead of doing:

    row[pi.Name] = t.GetType().InvokeMember(pi.Name, BindingFlags.GetProperty, null, t, null);

use:

    row[pi.Name] = pi.GetValue(t, null);

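You could always use a library like Fasterflect to emit IL instead of using true reflection for every property of every item in the list. I am not sure about any gotchas with a DataTable.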
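Fasterflect's own API is not shown here. As a related technique that needs only the base class library, each getter can be compiled once into a delegate with System.Linq.Expressions and then reused for every item. This is a minimal sketch; the names GetterCompiler and Compile are made up for illustration, and it assumes the property is a readable, non-indexed instance property.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper, not part of Fasterflect or the original code.
public static class GetterCompiler
{
    // Builds the lambda (T item) => (object)item.Property and compiles it.
    // Assumes an instance property with a getter and no index parameters.
    public static Func<T, object> Compile<T>(PropertyInfo property)
    {
        ParameterExpression item = Expression.Parameter(typeof(T), "item");
        Expression read = Expression.Property(item, property);
        // Box value types so that every getter has the same delegate type.
        Expression boxed = Expression.Convert(read, typeof(object));
        return Expression.Lambda<Func<T, object>>(boxed, item).Compile();
    }
}
```

Compiling has a one-off cost per property, so this should only pay off when there are many rows. The delegates would be built once before the loop, for example with properties.Select(p => GetterCompiler.Compile<T>(p)).ToArray(), and then called per item.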

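Alternatively, if this code is not meant to be a generic solution, you could always have whatever type is in the IEnumerable translate itself into a DataRow, thus avoiding reflection altogether.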
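A rough sketch of that non-generic route, assuming a made-up Person type (not from the question) that knows its own columns and fills its own row:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical item type, used only for illustration.
public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }

    // The type declares its own columns...
    public static DataTable CreateTable()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        return table;
    }

    // ...and copies its own values into a row, so no reflection is involved.
    public DataRow ToDataRow(DataTable table)
    {
        DataRow row = table.NewRow();
        row[0] = (object)Name ?? DBNull.Value; // DataRow stores DBNull, not null
        row[1] = Age;
        return row;
    }
}

public static class PersonExtensions
{
    public static DataTable AsDataTable(this IEnumerable<Person> people)
    {
        if (people == null) throw new ArgumentNullException("people");

        DataTable table = Person.CreateTable();
        foreach (Person person in people)
            table.Rows.Add(person.ToDataRow(table));
        return table;
    }
}
```

The trade-off is that every type needs its own mapping code, which is only reasonable when a handful of known types are involved.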

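First, a couple of non-performance problems:

- The type of the first item in the enumerable might be a subclass of T that defines properties which may not exist on the other items. To avoid the problems this can cause, use the T type as the source of the property list.
- The type might have properties that have no getter or that have an indexed getter. Your code should not attempt to read their values.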

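On the performance side, I can see potential improvements in both the reflection and the data table loading:

- Cache the property getters and invoke them directly.
- Avoid accessing the data row columns by name to set the row values.
- Put the data table into "data loading" mode while adding the rows.

With these mods, you would end up with something like the following: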

    public static DataTable AsDataTable<T>(this IEnumerable<T> enumerable)
    {
        if (enumerable == null)
        {
            throw new ArgumentNullException("enumerable");
        }

        DataTable table = new DataTable();
        if (enumerable.Any())
        {
            // Use T, not the runtime type of the first item, and skip
            // properties without a getter as well as indexers.
            IList<PropertyInfo> properties = typeof(T)
                .GetProperties()
                .Where(p => p.CanRead && (p.GetIndexParameters().Length == 0))
                .ToList();

            foreach (PropertyInfo property in properties)
            {
                table.Columns.Add(property.Name, property.PropertyType);
            }

            // Look the getters up once instead of once per row.
            IList<MethodInfo> getters = properties.Select(p => p.GetGetMethod()).ToList();

            table.BeginLoadData();
            try
            {
                object[] values = new object[properties.Count];
                foreach (T item in enumerable)
                {
                    for (int i = 0; i < getters.Count; i++)
                    {
                        values[i] = getters[i].Invoke(item, BindingFlags.Default, null, null, CultureInfo.InvariantCulture);
                    }

                    // Add by position rather than by column name.
                    table.Rows.Add(values);
                }
            }
            finally
            {
                table.EndLoadData();
            }
        }

        return table;
    }

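You may not have a choice in the matter, but it may be worth looking at the architecture of your code to see whether you can avoid using a DataTable and return an IEnumerable<T> yourself instead.

The main reasons for doing this are: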

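- You are going from an IEnumerable to a DataTable, which effectively moves from a streamed operation to a buffered operation.
  - Streamed: uses yield return, so results are only pulled off the enumeration as and when they are needed. The whole collection is not iterated up front the way it is in a foreach loop that fills a table.
  - Buffered: pulls all the results into memory (for example a populated collection, a data table or an array), so the whole cost is incurred up front.
- If you can use an IEnumerable<T> return type, then you can use the yield return keyword yourself, which means the cost of all that reflection is spread out instead of being incurred all at once.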
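To make the streamed variant concrete, here is a minimal sketch, assuming the caller can consume plain object[] rows instead of a DataTable. The names RowStreaming, AsRows and Iterate are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical streaming counterpart to AsDataTable.
public static class RowStreaming
{
    public static IEnumerable<object[]> AsRows<T>(this IEnumerable<T> source)
    {
        // Validate eagerly; the iterator body below only runs on enumeration.
        if (source == null) throw new ArgumentNullException("source");
        return Iterate(source);
    }

    private static IEnumerable<object[]> Iterate<T>(IEnumerable<T> source)
    {
        // Readable, non-indexed public instance properties of T.
        PropertyInfo[] properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        foreach (T item in source)
        {
            object[] values = new object[properties.Length];
            for (int i = 0; i < properties.Length; i++)
                values[i] = properties[i].GetValue(item, null);

            // One row is produced each time the consumer asks for the next one.
            yield return values;
        }
    }
}
```

With this shape a consumer that only needs the first few rows, for example source.AsRows().Take(10), never pays the reflection cost for the rest of the sequence.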
