减去 HashSets(并返回一个副本)?

发布于 2024-09-26 22:55:38 字数 361 浏览 4 评论 0原文

我有一个 HashSet,

var universe = new HashSet<int>();

和一堆子集,

var sets = new List<HashSet<int>>(numSets);

我想减去一个块,我可以这样做:

var remaining = universe.ExceptWith(sets[0]);

但是 ExceptWith 可以就地工作。我不想修改宇宙。我应该先克隆它,还是有更好的方法?

I've got a HashSet,

var universe = new HashSet<int>();

And a bunch of subsets,

var sets = new List<HashSet<int>>(numSets);

I want to subtract a chunk, which I can do like this:

var remaining = universe.ExceptWith(sets[0]);

But ExceptWith works in-place. I don't want to modify the universe. Should I clone it first, or is there a better way?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

羁拥 2024-10-03 22:55:38

我想我应该克隆它
第一的?我该怎么做?

var universe = new HashSet<int>();
var subset = new HashSet<int>();
...

// clone the universe
var remaining = new HashSet<int>(universe);
remaining.ExceptWith(subset);

不像 Except 扩展方法那么简单,但可能更快(您应该运行一些性能测试来确保)

I guess I should clone it
first? How do I do that?

var universe = new HashSet<int>();
var subset = new HashSet<int>();
...

// clone the universe
var remaining = new HashSet<int>(universe);
remaining.ExceptWith(subset);

Not as simple as with the Except extension method, but probably faster (you should run a few performance tests to make sure)

苍暮颜 2024-10-03 22:55:38

Except() 怎么样?

var x = new HashSet<int>();
var y = new HashSet<int>();

var xminusy = new HashSet<int>(x.Except(y));

How about Except()?

var x = new HashSet<int>();
var y = new HashSet<int>();

var xminusy = new HashSet<int>(x.Except(y));
一世旳自豪 2024-10-03 22:55:38

我针对克隆和使用 HashSet 原生函数 ExceptWith 对 Linq 的 Except 方法进行了基准测试。这是结果。

static class Program
{
    public static HashSet<T> ToSet<T>(this IEnumerable<T> collection)
    {
        return new HashSet<T>(collection);
    }

    public static HashSet<T> Subtract<T>(this HashSet<T> set, IEnumerable<T> other)
    {
        var clone = set.ToSet();
        clone.ExceptWith(other);
        return clone;
    }

    static void Main(string[] args)
    {
        var A = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var B = new HashSet<int> { 2, 4, 6, 8, 10 };
        var sw = new Stopwatch();

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Except(B).ToSet();
        }
        sw.Stop();
        Console.WriteLine("Linq: {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Subtract(B);
        }
        sw.Stop();
        Console.WriteLine("Native: {0} ms", sw.ElapsedMilliseconds);

        Console.ReadLine();
    }
}

Linq:1297 毫秒
本机:762 毫秒

http://programanddesign.com/cs/subtracting-sets/

I benchmarked Linq's Except method against cloning and using the HashSet-native function ExceptWith. Here are the results.

static class Program
{
    public static HashSet<T> ToSet<T>(this IEnumerable<T> collection)
    {
        return new HashSet<T>(collection);
    }

    public static HashSet<T> Subtract<T>(this HashSet<T> set, IEnumerable<T> other)
    {
        var clone = set.ToSet();
        clone.ExceptWith(other);
        return clone;
    }

    static void Main(string[] args)
    {
        var A = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var B = new HashSet<int> { 2, 4, 6, 8, 10 };
        var sw = new Stopwatch();

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Except(B).ToSet();
        }
        sw.Stop();
        Console.WriteLine("Linq: {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < 1000000; ++i)
        {
            var C = A.Subtract(B);
        }
        sw.Stop();
        Console.WriteLine("Native: {0} ms", sw.ElapsedMilliseconds);

        Console.ReadLine();
    }
}

Linq: 1297 ms
Native: 762 ms

http://programanddesign.com/cs/subtracting-sets/

你的心境我的脸 2024-10-03 22:55:38

哈希集必须跟踪其哈希算法常量及其溢出箱。集合中的元素通过引用保存。正如 Thomas Levesque 所建议的那样,使用复制构造函数创建新的哈希会创建此开销的浅表副本,并且速度应该相当快。按照 James McNellis 建议的方式使用 except() 首先创建一个匿名副本,然后将其传递给复制构造函数,该复制构造函数使用匿名中的字段来初始化自己的字段。正如托马斯所说,你可能会做一些性能测试,但理论上他的答案应该击败詹姆斯的答案。顺便说一句,按照我的思维方式,浅拷贝不是克隆,因为我相信克隆意味着底层元素也被复制。具有共同元素的哈希集使用修改时复制策略。

A hash set has to track its hash algorithm constants, and its overflow bins. The elements in the set are held by reference. Creating a new hash with the copy constructor, as Thomas Levesque suggests, creates a shallow copy of this overhead and should be quite fast. Using Except() in the way that James McNellis suggests first creates an anonymous copy and then passes that to the copy constructor which uses the fields in the anonymous to initialize its own fields. As Thomas said, you might do a few performance tests, but theoretically his answer should beat James' answer. And by the way, to my way of thinking, a shallow copy is not a clone since I believe a clone implies that the underlying elements are also copied. Hash sets with common elements use a copy when modified strategy.

罪#恶を代价 2024-10-03 22:55:38

答案很晚,但有时可能有用。

@mpen 通过使用 Linq 的 except(IEnumerable<>) 来回答,

这使得 linq 循环通过 IEnumerable 检查它是否包含。

怎么样

setA.Where(i => !setB.Contains(i))

Very late answer but maybe useful sometimes.

@mpen answered by using Linq's Except(IEnumerable<>)

Which make linq loop trough IEnumerable check if it's contains.

How about

setA.Where(i => !setB.Contains(i))

眼趣 2024-10-03 22:55:38

显然,在某些情况下,“手动”在循环中添加项目比复制整个集合然后删除项目更有效。我能想到的一个...

// no more set ops planned? then returning list is an option
public static List<T> ExceptWith<T>(HashSet<T> allObjects, Hashset<T> minus)
{
    //  Set Capacity of list   (allObjects.Count-minus.Count?)
    List<T> retlst = new List<T>(allObjects.Count); 

    foreach( var obj in allObjects) {
        if( minus.Contains(obj)==false)
            retlst.Add(obj);
    }
    return retlst;
}

// Special case where quantity of copying will be high
// more expensive in fact than just adding
public static HashSet<T> ExceptWith<T>(HashSet<T> allObjects, HashSet<T> minus)
{
    if( minus.Count > allObjects.Count * 7/8 )
    {
        HashSet<T> retHash = new HashSet<T>(); 

        foreach( var obj in allObjects) {
            if( minus.Contains(obj)==false)
                retHash.Add(obj);
        }
        return retHash;

    }
    else
    {
        // usual clone and remove
    }
}

Obviously in a few cases 'manually' adding items in a loop is more efficient than copying the whole set and then removing items. One I can think of ...

// no more set ops planned? then returning list is an option
public static List<T> ExceptWith<T>(HashSet<T> allObjects, Hashset<T> minus)
{
    //  Set Capacity of list   (allObjects.Count-minus.Count?)
    List<T> retlst = new List<T>(allObjects.Count); 

    foreach( var obj in allObjects) {
        if( minus.Contains(obj)==false)
            retlst.Add(obj);
    }
    return retlst;
}

// Special case where quantity of copying will be high
// more expensive in fact than just adding
public static HashSet<T> ExceptWith<T>(HashSet<T> allObjects, HashSet<T> minus)
{
    if( minus.Count > allObjects.Count * 7/8 )
    {
        HashSet<T> retHash = new HashSet<T>(); 

        foreach( var obj in allObjects) {
            if( minus.Contains(obj)==false)
                retHash.Add(obj);
        }
        return retHash;

    }
    else
    {
        // usual clone and remove
    }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文