How do I get distinct records from a DataTable?

Posted on 2024-09-10 19:06:16

I am using C# + VS2008 + .NET + ASP.NET + IIS 7.0 + ADO.NET + SQL Server 2008. I have an ADO.NET DataTable object and I want to filter out duplicate/similar records, keeping only one record from each group of duplicates. My rule for judging duplicates is specific: rows that have the same value in a particular string column count as duplicate/similar.

The output needs to be a DataTable. It may be the same DataTable object if the filtering can be done in place.

What is the most efficient solution?

Comments (3)

奶气 2024-09-17 19:06:16

Are you using .NET 3.5? If you cast your data rows, you can use LINQ to Objects:

var distinctRows = table.Rows.Cast<DataRow>().Distinct(new E());

...

public class E : IEqualityComparer<DataRow>
{
    bool IEqualityComparer<DataRow>.Equals(DataRow x, DataRow y)
    {
        // The indexer returns object, so == would compare references, not values.
        return object.Equals(x["colA"], y["colA"]);
    }

    int IEqualityComparer<DataRow>.GetHashCode(DataRow obj)
    {
        return obj["colA"].GetHashCode();
    }
}

Or an even simpler way, since you're basing it on a single column's values:

var distinct = from r in table.Rows.Cast<DataRow>()
               group r by (string)r["colA"] into g
               select g.First();

If you need to make a new DataTable out of these distinct rows, you can do this:

// Clone copies the schema only; a DataColumn cannot belong to two tables.
var t2 = table.Clone();
foreach(var r in distinct)
{
    // ImportRow copies the row; Rows.Add would throw because r still belongs to table.
    t2.ImportRow(r);
}

Or if it would be more handy to work with business objects, you can do an easy conversion:

var persons = (from r in distinct
               select new PersonInfo
               {
                   EmpId = (string)r["colA"],
                   FirstName = (string)r["colB"],
                   LastName = (string)r["colC"],
               }).ToList();

...

public class PersonInfo
{
    public string EmpId {get;set;}
    public string FirstName {get;set;}
    public string LastName {get;set;}
}

Update

Everything you can do in LINQ to Objects can also be done without it: it just takes more code. For example:

// table is the existing DataTable that contains the duplicates.
var rowSet = new HashSet<DataRow>(new E());
// Clone copies the schema (columns) without any rows.
var newTable = table.Clone();
foreach(DataRow row in table.Rows)
{
    // HashSet.Add returns false when an equal row is already in the set.
    if(rowSet.Add(row))
    {
        // ImportRow copies the row; a row cannot belong to two tables.
        newTable.ImportRow(row);
    }
}

You could also use a similar strategy to simply remove duplicate rows from the original table instead of creating a new table.
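
As a rough sketch of that in-place variant (the class name DataTableDedup and the method name RemoveDuplicates are made up for illustration, and the comparison assumes ordinary case-sensitive string equality on the chosen column), the duplicates can be collected first and removed afterwards, since changing the Rows collection while enumerating it is not safe:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of ADO.NET.
public static class DataTableDedup
{
    // Removes every row whose value in columnName has already been seen,
    // keeping the first occurrence. Returns the number of rows removed.
    public static int RemoveDuplicates(DataTable table, string columnName)
    {
        var seen = new HashSet<object>();
        var duplicates = new List<DataRow>();

        foreach (DataRow row in table.Rows)
        {
            // Add returns false when an equal value is already in the set.
            if (!seen.Add(row[columnName]))
            {
                duplicates.Add(row);
            }
        }

        // Remove only after the scan has finished.
        foreach (DataRow row in duplicates)
        {
            table.Rows.Remove(row);
        }
        return duplicates.Count;
    }
}
```

Rows.Remove drops the row from the table outright. If the table is later used to push changes back to the database through a data adapter, row.Delete() followed by an update may be the better fit, because it records the deletion instead of discarding the row.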

回忆躺在深渊里 2024-09-17 19:06:16

You can do a SELECT INTO with a GROUP BY clause, so that no duplicates are created. Then drop the old table and rename the table you selected into to the original table name.

琉璃梦幻 2024-09-17 19:06:16

I would do this in the database layer:

SELECT Distinct...
FROM MyTable

Or if you need aggregates:

SELECT SUM(Field1), ID FROM MyTable
GROUP BY ID

Put the SELECT statement in a stored procedure. Then, in .NET, open a connection to the database and call the stored procedure. Use ExecuteReader() or a SqlDataAdapter rather than ExecuteNonQuery(), which returns only the number of affected rows and no result set. Load the rows into a DataTable and return that DataTable to your UI.
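
A minimal sketch of that call sequence might look like the following. The class DistinctLoader and both method names are invented for this example, and connectionString and procedureName are placeholders the caller would supply; nothing here comes from the original answer beyond the general idea of calling a stored procedure and returning a DataTable.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper for illustration only.
public static class DistinctLoader
{
    // Copies everything an open reader returns into a new DataTable.
    public static DataTable ReadAll(IDataReader reader)
    {
        var table = new DataTable();
        table.Load(reader);
        return table;
    }

    // Calls a stored procedure that is assumed to contain the
    // SELECT DISTINCT (or GROUP BY) query and returns its rows.
    public static DataTable LoadFromProcedure(string connectionString, string procedureName)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(procedureName, connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                return ReadAll(reader);
            }
        }
    }
}
```

ReadAll is split out so the loading step can be exercised without a database, for example by passing it the reader returned by DataTable.CreateDataReader().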
