使用 LINQ 查找对称差异

发布于 2024-09-03 03:57:16 字数 449 浏览 11 评论 0原文

我有两个集合 a 和 b。我想计算 a 或 b 中的项目集，但不能同时计算两者（逻辑异或）。通过 LINQ，我可以想到：

IEnumerable<T> Delta<T>(IEnumerable<T> a, IEnumerable<T> b)
{
    return a.Except (b).Union (b.Except (a));
}

我想知道是否有其他更有效或更紧凑的方法来产生两个集合之间的差异。

编辑 1：Jon Skeet 发布了第一个解决方案，该解决方案不依赖 HashSet 来保留项目的顺序。我想知道是否有其他方法可以保留输出中 a 和 b 的顺序。

原文

I have two collections a and b. I would like to compute the set of items in either a or b, but not in both (a logical exclusive or). With LINQ, I can come up with this:

IEnumerable<T> Delta<T>(IEnumerable<T> a, IEnumerable<T> b)
{
    return a.Except (b).Union (b.Except (a));
}

I wonder if there are other more efficient or more compact ways of producing the difference between the two collections.

Edit 1: Jon Skeet posted a first solution which does not preserve the order of the items by relying on a HashSet. I wonder if there are other approaches which would preserve the order of a and b in the output.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

仙女 2024-09-10 03:57:16

直接使用 HashSet - 它有一个 SymmetricExceptWith 方法：

HashSet<T> data = new HashSet<T>(a);
data.SymmetricExceptWith(b);

编辑：如果您想保持顺序，这里有一个替代方案：

HashSet<T> data = new HashSet<T>(a);
data.IntersectWith(b);
foreach (T t in a.Concat(b))
{
    if (!data.Contains(t))
    {
        yield return t;
    }
}

这有以下重要区别：

两者 < code>a 和 b 迭代两次。在某些情况下，这可能是一件非常糟糕的事情 - 您可以对每个对象调用 ToList 以保留缓冲区。
如果a或b中有重复项，它们将被产生多次。如果您想避免这种情况，您可以保留一组已经生成的值。此时，它相当于：
```
a.Concat(b).Except(a.Intersect(b))
```

不过，这仍然只有两个集合操作，而不是原始代码中的三个。

Use HashSet<T> directly - it has a SymmetricExceptWith method:

HashSet<T> data = new HashSet<T>(a);
data.SymmetricExceptWith(b);

EDIT: If you want to maintain the order, here's an alternative:

HashSet<T> data = new HashSet<T>(a);
data.IntersectWith(b);
foreach (T t in a.Concat(b))
{
    if (!data.Contains(t))
    {
        yield return t;
    }
}

This has the following important differences:

Both a and b are iterated over twice. In some cases that could be a very bad thing - you could call ToList on each of them to start with to retain a buffer.
If there are duplicates in either a or b, they will be yielded multiple times. If you wanted to avoid this you could keep a set of already-yielded values. At this point, it would be equivalent to:
```
a.Concat(b).Except(a.Intersect(b))
```

That's still only two set operations instead of the three in your original code though.

回复收藏 0 原文

飘过的浮云 2024-09-10 03:57:16

鉴于 a.Except(b) 和 b.Except(a) 不相交，您可以使用 concat 而不是 union，从而保存集合运算符（以及 concat< /code> 效率更高）。

return a.Except (b).Concat (b.Except (a));

这仍然会在每个列表中运行两次。

Given a.Except(b) and b.Except(a) are disjoint, you can use concat instead of union, saving a set operator (and concat is more efficient).

return a.Except (b).Concat (b.Except (a));

This still runs through each list twice.

回复收藏 0 原文

相思故 2024-09-10 03:57:16

我们公司的一个项目也有类似的需求，所以我们编写了这个扩展：

public class EnumerablePair<T> : IReadOnlyCollection<T>
{
    private IReadOnlyCollection<T> _Left;
    private IReadOnlyCollection<T> _Right;
    private IEnumerable<T> _Union;
    private int _Count;
    public EnumerablePair(IEnumerable<T> left, IEnumerable<T> right)
    {
        _Left = left?.ToList() ?? Enumerable.Empty<T>().ToList();
        _Right = right?.ToList() ?? Enumerable.Empty<T>().ToList();
        _Count = Left.Count + Right.Count;
        _Union = Left.Union(Right);
    }

    public int Count => _Count;
    public IReadOnlyCollection<T> Left { get => _Left; }
    public IReadOnlyCollection<T> Right { get => _Right; }

    public IEnumerator<T> GetEnumerator()
    {
        return _Union.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return _Union.GetEnumerator();
    }
}

public static class EnumerableExtension
{
    public static EnumerablePair<T> ExclusiveDisjunction<T>(this IEnumerable<T> leftOperand, IEnumerable<T> rightOperand, IEqualityComparer<T> comparer = null)
    {
        if (leftOperand == null)
            throw new ArgumentNullException(nameof(leftOperand), $"{nameof(leftOperand)} is null.");
        if (rightOperand == null)
            throw new ArgumentNullException(nameof(rightOperand), $"{nameof(rightOperand)} is null.");

        // TODO : Can be optimized if one of the IEnumerable parameters is empty.

        bool leftIsBigger = leftOperand.Count() > rightOperand.Count();
        var biggestOperand = leftIsBigger ? leftOperand.ToList() : rightOperand.ToList();
        var smallestOperand = leftIsBigger ? rightOperand.ToList() : leftOperand.ToList();

        var except1 = biggestOperand.ToList();
        var except2 = Enumerable.Empty<T>().ToList();

        Func<T, T, bool> areEquals;
        if (comparer != null)
            areEquals = (one, theOther) => comparer.Equals(one, theOther);
        else
            areEquals = (one, theOther) => one?.Equals(theOther) ?? theOther == null;

        foreach (T t in smallestOperand)
            if (except1.RemoveAll(item => areEquals(item, t)) == 0)
                except2.Add(t);

        if (leftIsBigger)
            return new EnumerablePair<T>(except1, except2);
        return new EnumerablePair<T>(except2, except1);
    }
}

它比较两个集合的元素（使用或不使用 IEqualityComparer，由您选择）。

返回的对象是 EnumerablePair，包含位于 leftOperand 或 rightOperand 中的对象，但不能同时包含在两者中 (XOR)。
EnumerablePair.Left 包含位于 leftOperand 中但不在 rightOperand 中的对象。
EnumerablePair.Right 包含位于 rightOperand 中但不在 leftOperand 中的对象。

您可以使用如下扩展：

var xorList = list1.ExclusiveDisjunction(list2);
var leftXor = xorList.Left;
var rightXor = xorList.Right;

xorList、leftXor 和 rightXor 是 IEnumerable。

We had a similar need for a project in my company, so we wrote this extension:

public class EnumerablePair<T> : IReadOnlyCollection<T>
{
    private IReadOnlyCollection<T> _Left;
    private IReadOnlyCollection<T> _Right;
    private IEnumerable<T> _Union;
    private int _Count;
    public EnumerablePair(IEnumerable<T> left, IEnumerable<T> right)
    {
        _Left = left?.ToList() ?? Enumerable.Empty<T>().ToList();
        _Right = right?.ToList() ?? Enumerable.Empty<T>().ToList();
        _Count = Left.Count + Right.Count;
        _Union = Left.Union(Right);
    }

    public int Count => _Count;
    public IReadOnlyCollection<T> Left { get => _Left; }
    public IReadOnlyCollection<T> Right { get => _Right; }

    public IEnumerator<T> GetEnumerator()
    {
        return _Union.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return _Union.GetEnumerator();
    }
}

public static class EnumerableExtension
{
    public static EnumerablePair<T> ExclusiveDisjunction<T>(this IEnumerable<T> leftOperand, IEnumerable<T> rightOperand, IEqualityComparer<T> comparer = null)
    {
        if (leftOperand == null)
            throw new ArgumentNullException(nameof(leftOperand), $"{nameof(leftOperand)} is null.");
        if (rightOperand == null)
            throw new ArgumentNullException(nameof(rightOperand), $"{nameof(rightOperand)} is null.");

        // TODO : Can be optimized if one of the IEnumerable parameters is empty.

        bool leftIsBigger = leftOperand.Count() > rightOperand.Count();
        var biggestOperand = leftIsBigger ? leftOperand.ToList() : rightOperand.ToList();
        var smallestOperand = leftIsBigger ? rightOperand.ToList() : leftOperand.ToList();

        var except1 = biggestOperand.ToList();
        var except2 = Enumerable.Empty<T>().ToList();

        Func<T, T, bool> areEquals;
        if (comparer != null)
            areEquals = (one, theOther) => comparer.Equals(one, theOther);
        else
            areEquals = (one, theOther) => one?.Equals(theOther) ?? theOther == null;

        foreach (T t in smallestOperand)
            if (except1.RemoveAll(item => areEquals(item, t)) == 0)
                except2.Add(t);

        if (leftIsBigger)
            return new EnumerablePair<T>(except1, except2);
        return new EnumerablePair<T>(except2, except1);
    }
}

It compares elements of two collections (using an IEqualityComparer or not, at your choice).

The returned object, an EnumerablePair<T>, contains objects that are in leftOperand or rightOperand, but not both (XOR).
EnumerablePair<T>.Left contains objects that are in leftOperand but not in rightOperand.
EnumerablePair<T>.Right contains objects that are in rightOperand but not in leftOperand.

You can use the extension like this :

var xorList = list1.ExclusiveDisjunction(list2);
var leftXor = xorList.Left;
var rightXor = xorList.Right;

xorList, leftXor and rightXor are IEnumerable<T>.

回复收藏 0 原文

~没有更多了~

关于作者

酒废

暂无简介

文章

24 人气

关注发私信

友情链接

文江博客

使用 LINQ 查找对称差异

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

佚名

羁客

天天爱笑的徐老师

星

夏日落

隐诗

友情链接

使用 LINQ 查找对称差异

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（3）

关于作者

相关话题

热门标签

推荐作者

佚名

羁客

天天爱笑的徐老师

星

夏日落

隐诗

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。