如何根据其他列数据过滤掉相似的行(在某些列上相等)

发布于 2024-09-13 18:32:25 字数 831 浏览 3 评论 0原文

我如何选择名字姓氏上不同的所有行,并且在有重复的地方选择具有较高SomeDate的行,然后选择Id 如果仍然重复,例如

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 2  | James    | Joyce   | 2011-02-01 |
| 3  | Peter    | Lennon  | 2011-03-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

结果:

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

我如何

  1. 在T-SQL从 DataTable 中实现此目的
  2. 使用 C#

How would I select all rows distinct on Forename and Surname and where there's a duplicate select the one with the higher SomeDate, then Id if still duplicates e.g.

For:

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 2  | James    | Joyce   | 2011-02-01 |
| 3  | Peter    | Lennon  | 2011-03-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

Results in:

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

How could I achieve this in

  1. T-SQL
  2. From a DataTable using C#

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

浪推晚风 2024-09-20 18:32:25

假设 SQL Server 2005+,使用:

SELECT x.id,
       x.forename,
       x.surname,
       x.somedate
  FROM (SELECT t.id,
               t.forename,
               t.surname,
               t.somedate,
               ROW_NUMBER() OVER (PARTITION BY t.forename, t.surname 
                                      ORDER BY t.somedate DESC, t.id DESC) AS rank
          FROM TABLE t_ x
WHERE x.rank = 1

一个有风险的方法是:

  SELECT MAX(t.id) AS id,
         t.forename,
         t.surname,
         MAX(t.somedate) AS somedate
    FROM TABLE t
GROUP BY t.forename, t.surname

Assuming SQL Server 2005+, use:

SELECT x.id,
       x.forename,
       x.surname,
       x.somedate
  FROM (SELECT t.id,
               t.forename,
               t.surname,
               t.somedate,
               ROW_NUMBER() OVER (PARTITION BY t.forename, t.surname 
                                      ORDER BY t.somedate DESC, t.id DESC) AS rank
          FROM TABLE t_ x
WHERE x.rank = 1

A risky approach would be:

  SELECT MAX(t.id) AS id,
         t.forename,
         t.surname,
         MAX(t.somedate) AS somedate
    FROM TABLE t
GROUP BY t.forename, t.surname
你又不是我 2024-09-20 18:32:25

我倾向于对非分组值使用子选择。

SELECT Forename, Surname, 
    (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY m.SomeDate DESC) AS Id
    (SELECT TOP 1 SomeDate FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY m.SomeDate DESC) AS SomeDate
FROM myTable m
GROUP BY Forename, Surname

或者您可以在 WHERE 子句中对其进行过滤:

SELECT Id, Forename, Surname, SomeDate
FROM myTable m
WHERE m.Id = (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
    ORDER BY m.SomeDate DESC)

恐怕这两种方法都不会非常有效,但如果需要的话,索引调整会改善这种情况。

对于数据表示例,您基本上会做同样的事情。

var recs = from record in dataTable
           where record.Id == 
               (from rec in dataTable
                where rec.Forename == record.Forename && rec.Surname == record.Surname
                orderby rec.SomeDate descending
                select rec.Id).First()
           select record;

I'd tend to use subselects for the non-grouped values.

SELECT Forename, Surname, 
    (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY m.SomeDate DESC) AS Id
    (SELECT TOP 1 SomeDate FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY m.SomeDate DESC) AS SomeDate
FROM myTable m
GROUP BY Forename, Surname

Or you can filter it in the WHERE clause:

SELECT Id, Forename, Surname, SomeDate
FROM myTable m
WHERE m.Id = (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
    ORDER BY m.SomeDate DESC)

I'm afraid neither will be terribly efficient, but index toning will ameliate that if needed.

For a datatable example, you'd do essentially the same thing.

var recs = from record in dataTable
           where record.Id == 
               (from rec in dataTable
                where rec.Forename == record.Forename && rec.Surname == record.Surname
                orderby rec.SomeDate descending
                select rec.Id).First()
           select record;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文