Efficiently extracting subsets of rows from a large DataTable?

Posted 2024-11-30 16:11:58

Given the following table, as an example:

+----+---------+-----------+
| ID | GroupID | OtherData |
+----+---------+-----------+
| 1  | 1       | w4ij6u    |
+----+---------+-----------+
| 2  | 2       | ai6465    |
+----+---------+-----------+
| 3  | 2       | ows64rg   |
+----+---------+-----------+
| 4  | 2       | wqoi46suj |
+----+---------+-----------+
| 5  | 3       | w9rthzv   |
+----+---------+-----------+
| 6  | 3       | 03ehsat   |
+----+---------+-----------+
| 7  | 4       | w469ia    |
+----+---------+-----------+
| 8  | 5       | nhwh57rt  |
+----+---------+-----------+
| 9  | 5       | mwitgjhx4 |
+----+---------+-----------+

How can I efficiently get a List<List<DataRow>> extracted from this table that is based upon the "GroupID" column?

Basically, I want the result to be:

MyList(0) = List: 1 DataRow, ID(s) 1
MyList(1) = List: 3 DataRows, ID(s) 2,3,4
MyList(2) = List: 2 DataRows, ID(s) 5,6
MyList(3) = List: 1 DataRow, ID(s) 7
MyList(4) = List: 2 DataRows, ID(s) 8,9

Here's the problem though: This DataTable contains hundreds of columns and tens of thousands of rows, so this operation must be as efficient as possible.

I have already tried the following methods:

  • Creating a DataView with a RowFilter, and extracting a table/list of rows from that view.
  • A LINQ query inside a loop, after getting the unique list of GroupIDs. The query selects the rows for each GroupID with a Where clause.

I'm hoping someone else has a better, more efficient way of extracting this data.

Comments (2)

洋洋洒洒 2024-12-07 16:11:58

Have you tried the DataTable.Select() method? Here's an example of how to use it:

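using System.Collections.Generic;
using System.Data;
using System.Linq; // needed for ToList() on the DataRow[] array

DataTable table = GetSomeData(); // GetSomeData() stands in for your own data source
DataRow[] results = table.Select("SomeInt > 0"); // returns the rows matching the filter expression

List<DataRow> resultList = results.ToList();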

Using DataTable.Select should be quicker than filtering through DefaultView.RowFilter, and because it returns a DataRow[] array, LINQ's ToList() puts the results into a list in one call.
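
Applied to the table in the question, this idea might look like the sketch below: collect the distinct GroupID values, then call DataTable.Select once per value. The class name SelectPerGroup and the method name Split are made up for illustration, and the code assumes GroupID is a non-null int column.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class SelectPerGroup
{
    // Hypothetical helper, not part of any library.
    // Assumes "GroupID" is a non-null int column.
    public static List<List<DataRow>> Split(DataTable table)
    {
        // Distinct GroupID values, in order of first appearance.
        List<int> groupIds = table.Rows.Cast<DataRow>()
            .Select(row => (int)row["GroupID"])
            .Distinct()
            .ToList();

        var result = new List<List<DataRow>>();
        foreach (int id in groupIds)
        {
            // One filter call per group.
            DataRow[] rows = table.Select("GroupID = " + id);
            result.Add(rows.ToList());
        }
        return result;
    }
}
```

Note that this issues one Select call per distinct GroupID, so the work grows with the number of groups; whether it beats the approaches already tried is something to measure on the real data.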

从此见与不见 2024-12-07 16:11:58

Turns out that A) My returning data set was too large (because of a bug), and B) LINQ is probably the fastest way to do this, without writing some very long or hack-ish code. Thanks for your ideas, everyone, but I'll stick with LINQ for now.
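
The answer does not show the LINQ query that was kept. One plausible shape for it, given only as a sketch, is a single GroupBy pass rather than a Where query per GroupID. The names RowGrouping and GroupRows are invented for illustration, and GroupID is again assumed to be a non-null int column.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class RowGrouping
{
    // Hypothetical helper, not part of any library.
    // Assumes "GroupID" is a non-null int column.
    public static List<List<DataRow>> GroupRows(DataTable table)
    {
        return table.Rows.Cast<DataRow>()
            // Enumerates the rows once, bucketing them by GroupID.
            .GroupBy(row => (int)row["GroupID"])
            // Each group becomes one inner list of DataRow references.
            .Select(group => group.ToList())
            .ToList();
    }
}
```

GroupBy yields groups in the order their keys first appear and keeps rows in source order within each group, so for the sample table the result should be five lists holding IDs 1; 2,3,4; 5,6; 7; and 8,9. The inner lists hold references to the existing DataRow objects, so the hundreds of columns are not copied.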
