C# 3.0: Need to return duplicates from a List<>

Posted on 07-13 02:52

I have a List<> of objects in C# and I need a way to return the objects that are considered duplicates within the list. I do not need the distinct result set; I need a list of the items that I will be deleting from my repository.

For the sake of this example, let's say I have a list of "Car" types and I need to know which of these cars are the same color as another car in the list. Here are the cars in the list and their color property:

Car1.Color = Red;

Car2.Color = Blue;

Car3.Color = Green;

Car4.Color = Red;

Car5.Color = Red;

For this example I need the result (IEnumerable<>, List<>, or whatever) to contain Car4 and Car5 because I want to delete these from my repository or db so that I only have one car per color in my repository. Any help would be appreciated.

Comments (8)

把时间冻结 2024-07-20 02:52:32

I inadvertently coded this yesterday, when I was trying to write a "distinct by a projection". I included a ! when I shouldn't have, but this time it's just right:

public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        // Yield it if the key hasn't actually been added - i.e. it
        // was already in the set
        if (!seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

You'd then call it with:

var duplicates = cars.DuplicatesBy(car => car.Color);
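
Because DuplicatesBy is an iterator, nothing runs until the result is enumerated. The following is a minimal sketch of using it on the question's sample data. The Car class with Name and Color string properties and the DuplicatesByDemo helper are assumptions made for illustration, not part of the original post. The result is materialized with ToList() before anything is removed, since modifying the list while the lazy sequence is still being enumerated would throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car type for illustration; the question does not show its definition.
public class Car
{
    public string Name { get; set; }
    public string Color { get; set; }
}

public static class DuplicateExtensions
{
    // Same technique as the answer: HashSet<T>.Add returns false when the
    // key is already present, so those elements are the repeats.
    public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        HashSet<TKey> seenKeys = new HashSet<TKey>();
        foreach (TSource element in source)
        {
            if (!seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}

// Hypothetical demo helper, not from the original answer.
public static class DuplicatesByDemo
{
    public static List<Car> SampleCars()
    {
        return new List<Car>
        {
            new Car { Name = "Car1", Color = "Red" },
            new Car { Name = "Car2", Color = "Blue" },
            new Car { Name = "Car3", Color = "Green" },
            new Car { Name = "Car4", Color = "Red" },
            new Car { Name = "Car5", Color = "Red" }
        };
    }

    // Removes the repeated colors from 'cars' and returns the removed names.
    public static List<string> RemoveDuplicateColors(List<Car> cars)
    {
        // ToList() runs the iterator to completion before the list is modified.
        List<Car> duplicates = cars.DuplicatesBy(car => car.Color).ToList();
        foreach (Car dupe in duplicates)
        {
            cars.Remove(dupe);
        }
        return duplicates.Select(car => car.Name).ToList();
    }
}
```

With the five sample cars, the first car of each color stays in the list and the later ones are reported as duplicates.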
著墨染雨君画夕 2024-07-20 02:52:32
var duplicates = from car in cars
                 group car by car.Color into grouped
                 from car in grouped.Skip(1)
                 select car;

This groups the cars by color and then skips the first result from each group, returning the remainder from each group flattened into a single sequence.

If you have particular requirements about which one you want to keep, e.g. if the car has an Id property and you want to keep the car with the lowest Id, then you could add some ordering in there, e.g.

var duplicates = from car in cars
                 group car by car.Color into grouped
                 from car in grouped.OrderBy(c => c.Id).Skip(1)
                 select car;
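
For readers who prefer method syntax, the same query can probably be written with GroupBy and SelectMany. This is a sketch only; the Car class and the GroupingDuplicates helper are assumed names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car type; Id and Color are the properties the answer refers to.
public class Car
{
    public int Id { get; set; }
    public string Color { get; set; }
}

public static class GroupingDuplicates
{
    public static List<Car> FindDuplicates(IEnumerable<Car> cars)
    {
        return cars
            .GroupBy(car => car.Color)
            // Order each color group by Id and skip the lowest one (the car
            // to keep); SelectMany flattens the remainders into one sequence.
            .SelectMany(grouped => grouped.OrderBy(c => c.Id).Skip(1))
            .ToList();
    }
}
```

Groups with a single car contribute nothing, because Skip(1) leaves them empty.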
揽月 2024-07-20 02:52:32

Here's a slightly different LINQ solution that I think makes it more obvious what you're trying to do:

var s = from car in cars
    group car by car.Color into g
    where g.Count() == 1
    select g.First();

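It groups the cars by color, tosses out all the groups that have more than one element, and puts the rest into the returned IEnumerable. Note that, as written, this yields the cars whose color is unique rather than the duplicates the question asks for.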
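
To get the duplicates in the same grouping style, one option is to keep the groups that have more than one element and skip the first car in each. This is a sketch, not the answerer's code; the Car class and the CountBasedDuplicates helper are assumed for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car type for illustration.
public class Car
{
    public string Name { get; set; }
    public string Color { get; set; }
}

public static class CountBasedDuplicates
{
    public static List<Car> FindDuplicates(IEnumerable<Car> cars)
    {
        var query = from car in cars
                    group car by car.Color into g
                    // Keep only colors that occur more than once. Skip(1) would
                    // already empty single-car groups; the filter is kept to
                    // mirror the structure of the answer's query.
                    where g.Count() > 1
                    from dupe in g.Skip(1)
                    select dupe;
        return query.ToList();
    }
}
```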

许久 2024-07-20 02:52:32
IEnumerable<Car> GetDuplicateColors(List<Car> cars)
{
    return cars.Where(c => cars.Any(c2 => c2.Color == c.Color && cars.IndexOf(c2) < cars.IndexOf(c) ) );
}    

It basically means "return cars where there's any car in the list with the same color and a smaller index".

I'm not sure about the performance, though. Each IndexOf call is itself a linear scan of the list, so I suspect an approach with an O(1) lookup for duplicates (like the dictionary/hashset method) will be faster for large sets.
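
One way to avoid the IndexOf calls is the Where overload that passes each element's index to the predicate. This is a sketch under the assumption of a hypothetical Car class with Name and Color properties. It is still quadratic overall, but it no longer rescans the list to find positions.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car type for illustration.
public class Car
{
    public string Name { get; set; }
    public string Color { get; set; }
}

public static class IndexedDuplicates
{
    public static IEnumerable<Car> GetDuplicateColors(List<Car> cars)
    {
        // The (element, index) overload of Where supplies the position
        // directly; Take(i) restricts the search to the cars before it.
        return cars.Where((c, i) => cars.Take(i).Any(c2 => c2.Color == c.Color));
    }
}
```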

还在原地等你 2024-07-20 02:52:32

Create a new Dictionary<Color, Car> foundColors and a List<Car> carsToDelete

Then you iterate through your original list of cars like so:

foreach(Car c in listOfCars)
{
    if (foundColors.ContainsKey(c.Color))
    {
        carsToDelete.Add(c);
    }
    else
    {
        foundColors.Add(c.Color, c);
    }
}

Then you can delete every car that's in carsToDelete; foundColors holds the one car kept for each color.

You could get a minor performance boost by putting your "delete record" logic in the if statement instead of creating a new list, but the way you worded the question suggested that you needed to collect them in a List.
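
Put together with the declarations, the loop might look like the sketch below. The Car class, the string-typed Color, and the DictionaryDuplicates.Split helper are assumptions for illustration; the answer itself uses a Color key type that it does not define.

```csharp
using System.Collections.Generic;

// Hypothetical Car type for illustration; Color is assumed to be a string.
public class Car
{
    public string Name { get; set; }
    public string Color { get; set; }
}

public static class DictionaryDuplicates
{
    // Returns the cars to delete; 'kept' receives the first car seen per color.
    public static List<Car> Split(IEnumerable<Car> listOfCars,
                                  out Dictionary<string, Car> kept)
    {
        Dictionary<string, Car> foundColors = new Dictionary<string, Car>();
        List<Car> carsToDelete = new List<Car>();

        foreach (Car c in listOfCars)
        {
            if (foundColors.ContainsKey(c.Color))
            {
                carsToDelete.Add(c);         // color already seen: a duplicate
            }
            else
            {
                foundColors.Add(c.Color, c); // first car of this color: keep it
            }
        }

        kept = foundColors;
        return carsToDelete;
    }
}
```

Returning both collections makes it easy to check which car survives for each color before anything is deleted from the repository.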

对岸观火 2024-07-20 02:52:32

Without actually coding it, how about an algorithm something like this:

  • iterate through your List<T> creating a Dictionary<T, int>
  • iterate through your Dictionary<T, int> deleting entries where the int is 1

Anything left in the Dictionary has duplicates. The second part where you actually delete is optional, of course. You can just iterate through the Dictionary and look for the >1's to take action.
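
The two steps above could be sketched as follows. The code counts occurrences per color string rather than per whole object, which is an assumption made to fit the question, and ColorCounts is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ColorCounts
{
    // Returns only the colors that occur more than once, with their counts.
    public static Dictionary<string, int> CountDuplicatedColors(IEnumerable<string> colors)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();

        // Step 1: count how often each key occurs.
        foreach (string color in colors)
        {
            int seen;
            counts.TryGetValue(color, out seen); // seen is 0 when the key is new
            counts[color] = seen + 1;
        }

        // Step 2: drop the entries that occur once. The keys are collected
        // first because removing while enumerating the dictionary throws.
        List<string> singles = counts.Where(p => p.Value == 1)
                                     .Select(p => p.Key)
                                     .ToList();
        foreach (string color in singles)
        {
            counts.Remove(color);
        }

        return counts;
    }
}
```

This tells you which colors are duplicated and how often, but not which individual cars to delete, so it still needs a second pass over the list.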

EDIT: OK, I bumped up Ryan's since he actually gave you code. ;)

北风几吹夏 2024-07-20 02:52:32

My answer takes inspiration (in this order) from the following respondents: Joe Coehoorn, Greg Beech and Jon Skeet.

I decided to provide a full example, with the assumption being (for real-world efficiency) that you have a static list of car colors. I believe the following code illustrates a complete solution to the problem in an elegant, although not necessarily hyper-efficient, manner.

#region SearchForNonDistinctMembersInAGenericListSample
public static string[] carColors = new[]{"Red", "Blue", "Green"}; 
public static string[] carStyles = new[]{"Compact", "Sedan", "SUV", "Mini-Van", "Jeep"}; 
public class Car
{
    public Car(){}
    public string Color { get; set; }
    public string Style { get; set; }
}
public static List<Car> SearchForNonDistinctMembersInAList()
{
    // pass in cars normally, but declare here for brevity
    var cars = new List<Car>(5) { new Car(){Color=carColors[0], Style=carStyles[0]}, 
                                      new Car(){Color=carColors[1],Style=carStyles[1]},
                                      new Car(){Color=carColors[0],Style=carStyles[2]}, 
                                      new Car(){Color=carColors[2],Style=carStyles[3]}, 
                                      new Car(){Color=carColors[0],Style=carStyles[4]}};
    List<Car> carDupes = new List<Car>();

    for (int i = 0; i < carColors.Length; i++)
    {
        Func<Car,bool> dupeMatcher = c => c.Color == carColors[i];

        int count = cars.Count<Car>(dupeMatcher);

        if (count > 1) // we have duplicates
        {
            foreach (Car dupe in cars.Where<Car>(dupeMatcher).Skip<Car>(1))
            {
                carDupes.Add(dupe);
            }
        }
    }
    return carDupes;
}
#endregion

I'm going to come back through here later and compare this solution to all three of its inspirations, just to contrast the styles. It's rather interesting.

唠甜嗑 2024-07-20 02:52:32

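public static IQueryable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source) where TSource : IComparable
{
    if (source == null)
        throw new ArgumentNullException("source");
    // Keeps every element that equals at least one other element,
    // so all copies are returned, including the first occurrence.
    return source.Where(x => source.Count(y => y.Equals(x)) > 1).AsQueryable<TSource>();
}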
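
Note that this method compares whole elements with Equals and returns every copy of a repeated value, not just the surplus ones. The sketch below shows that behavior on plain strings and adds a hypothetical SurplusCopies helper (not from the original answer) that drops the first occurrence of each value. For Car objects it would only work if Car implemented IComparable and defined equality by color, which is an assumption the thread does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateQueries
{
    // The extension method from the answer, with generic parameters spelled out.
    public static IQueryable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source)
        where TSource : IComparable
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return source.Where(x => source.Count(y => y.Equals(x)) > 1).AsQueryable<TSource>();
    }

    // Hypothetical helper: keep only the copies after the first of each value.
    public static List<TSource> SurplusCopies<TSource>(this IEnumerable<TSource> source)
        where TSource : IComparable
    {
        return source.Duplicates()
            .AsEnumerable()            // continue with LINQ to Objects
            .GroupBy(x => x)
            .SelectMany(g => g.Skip(1))
            .ToList();
    }
}
```

For the colors Red, Blue, Green, Red, Red, Duplicates() yields all three Red entries, while SurplusCopies() yields two.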
