Most efficient way to sort a DataTable by a computed column
Let's say you have a DataTable with the columns "id", "cost", and "qty":
DataTable dt = new DataTable();
dt.Columns.Add("id", typeof(int));
dt.Columns.Add("cost", typeof(double));
dt.Columns.Add("qty", typeof(int));
And it's keyed on "id":
dt.PrimaryKey = new DataColumn[1] { dt.Columns["id"] };
What we are interested in is the cost per quantity. In other words, if you had a row of:
id | cost | qty
----------------
42 | 10.00 | 2
The cost per quantity is 5.00.
My question, then: given the preceding table, assume it contains many thousands of rows and you're interested in the 3 rows with the highest cost per quantity. The information needed is the id and the cost per quantity. You cannot use LINQ.
In SQL it would be trivial; how BEST (most efficiently) would you accomplish it in C# without LINQ?
Update: Seeking answers that do not modify the table.
Comments (3)
I'm not sure if this is best, but it beats sorting and then picking the top three elements, which has time complexity O(n log n).
You can use a priority queue to filter the top three elements. Information about .NET priority queue implementations is available here.
The basic idea is to insert the first three elements of your data table into a priority queue ordered so that the element with the smallest cost per quantity is on top (a min-heap). You then add each of the remaining elements in turn, removing the top (smallest) element after each add, so the queue never holds more than three. The elements remaining in the priority queue (heap) at the end are the top three.
No modification to the table is needed: you don't add another column (you just need to define the relative ordering / priority criteria), and the order of the table's rows doesn't change. Time complexity will be O(n log 3) = O(n).
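As a rough illustration of the bounded-heap idea above, here is a minimal sketch. It assumes .NET 6 or later, where `System.Collections.Generic.PriorityQueue<TElement, TPriority>` is available (it dequeues the lowest priority first, so it acts as the min-heap). The class and method names (`TopCostPerQty`, `Find`) are made up for this example, and it assumes no `DBNull` values; rows with a zero quantity are skipped, which is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TopCostPerQty
{
    // Returns up to k (id, cost per quantity) pairs, highest ratio first.
    // Only reads the table: no column is added and no row is reordered.
    public static List<KeyValuePair<int, double>> Find(DataTable dt, int k)
    {
        var result = new List<KeyValuePair<int, double>>();
        if (k <= 0) return result;

        // Min-heap keyed on cost per quantity: the root is always the
        // smallest of the candidates kept so far.
        var heap = new PriorityQueue<int, double>();

        foreach (DataRow row in dt.Rows)
        {
            int qty = (int)row["qty"];
            if (qty == 0) continue; // assumption: ignore rows with no quantity

            double ratio = (double)row["cost"] / qty;
            heap.Enqueue((int)row["id"], ratio);
            if (heap.Count > k) heap.Dequeue(); // drop the current smallest
        }

        // The heap drains smallest-first, so reverse to get largest-first.
        while (heap.TryDequeue(out int topId, out double topRatio))
            result.Add(new KeyValuePair<int, double>(topId, topRatio));
        result.Reverse();
        return result;
    }
}
```

Calling `TopCostPerQty.Find(dt, 3)` would return the three ids with the highest cost per quantity. On older frameworks without a built-in priority queue, a hand-written heap or a small sorted buffer of three entries would play the same role.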
Add a column:
Then sort:
Then just get the top 3 rows.
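The code for these three steps did not survive on this page, so the following is only a guess at what they might look like, not the answerer's original. It uses an expression column and `DataTable.Select` with a sort string; the column name `costPerQty` and the class and method names are hypothetical. Note that this adds a column to the table, which the question's update rules out unless you work on a copy (for example `dt.Copy()`).

```csharp
using System;
using System.Data;

public static class ComputedColumnSort
{
    // "costPerQty" is a hypothetical column name chosen for this sketch.
    public static DataRow[] TopByCostPerQty(DataTable dt, int k)
    {
        // Add a column: an expression column is computed by the DataTable itself.
        if (!dt.Columns.Contains("costPerQty"))
            dt.Columns.Add("costPerQty", typeof(double), "cost / qty");

        // Then sort: an empty filter keeps every row, the second argument is the sort order.
        DataRow[] sorted = dt.Select("", "costPerQty DESC");

        // Then keep only the top k rows.
        int n = Math.Max(0, Math.Min(k, sorted.Length));
        DataRow[] top = new DataRow[n];
        Array.Copy(sorted, top, n);
        return top;
    }
}
```

This sorts every row, so it costs O(n log n) rather than the O(n) of a bounded heap, but it is short and stays entirely within the DataTable API.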
I like BFree's solution, but if you don't want an extra column in your dataset for some reason, is there a reason you can't call your database and execute a stored procedure to get the results?
Alternatively, don't use a DataTable, but parse objects from your ADO.NET results and create an IEnumerable<T> (or List, or array, or whatever) of those objects. Then just sort them? I prefer the object solution (even without using Entities, or Linq2SQL, or such) just because it gives me so much more flexibility in what I can do with a row once I have it...
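One way the object approach could look, as a sketch only: read each record from an ADO.NET reader into a small object, then sort the list with `List<T>.Sort` and a comparison delegate (no LINQ involved). The `CostItem` type and the `ObjectSort` helper are hypothetical names, and the sketch assumes the reader exposes "id", "cost" and "qty" columns with non-null values and non-zero quantities.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical row type for this sketch.
public sealed class CostItem
{
    public int Id;
    public double CostPerQty;
}

public static class ObjectSort
{
    // Reads (id, cost, qty) records from any ADO.NET reader into objects.
    public static List<CostItem> Read(IDataReader reader)
    {
        int idCol = reader.GetOrdinal("id");
        int costCol = reader.GetOrdinal("cost");
        int qtyCol = reader.GetOrdinal("qty");

        var items = new List<CostItem>();
        while (reader.Read())
        {
            var item = new CostItem();
            item.Id = reader.GetInt32(idCol);
            item.CostPerQty = reader.GetDouble(costCol) / reader.GetInt32(qtyCol);
            items.Add(item);
        }
        return items;
    }

    // Sorts the list in place, descending by cost per quantity, and returns the first k items.
    public static List<CostItem> Top(List<CostItem> items, int k)
    {
        items.Sort(delegate (CostItem a, CostItem b)
        {
            return b.CostPerQty.CompareTo(a.CostPerQty);
        });
        return items.GetRange(0, Math.Max(0, Math.Min(k, items.Count)));
    }
}
```

If the data is already in a DataTable, `ObjectSort.Top(ObjectSort.Read(dt.CreateDataReader()), 3)` would exercise the same code without touching the table.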