tr1::unordered_set 并集和交集

发布于 2024-07-20 06:11:46 字数 211 浏览 7 评论 0原文

如何在 C++ 中对 tr1::unordered_set 类型的集合进行交集和并集? 我找不到太多关于它的参考。

任何参考和代码将受到高度赞赏。 非常感谢。

更新:我只是猜测 tr1::unordered_set 应该提供交集、并集、差集的函数。因为这是集合的基本操作。 当然我可以自己写一个函数,但我只是想知道tr1是否有内置函数。 非常感谢。

How to do intersection and union for sets of the type tr1::unordered_set in c++? I can't find much reference about it.

Any reference and code will be highly appreciated. Thank you very much.

Update: I just guessed the tr1::unordered_set should provide the function for intersection, union, difference.. Since that's the basic operation of sets.
Of course I can write a function by myself, but I just wonder if there are built in function from tr1.
Thank you very much.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

雾里花 2024-07-27 06:11:46

我看到 set_intersection() 等人。 来自 algorithm 标头将不起作用,因为它们明确要求对输入进行排序 - 猜猜您已经排除了它们。

在我看来,迭代哈希 A 并查找哈希 B 中的每个元素的“天真的”方法实际上应该为您提供接近最佳的性能,因为哈希 B 中的连续查找将进入同一个哈希桶(假设两个哈希桶)哈希使用相同的哈希函数)。 即使这些存储桶几乎肯定是作为链表实现的,这也应该为您提供良好的内存局部性。

以下是 unordered_set_difference() 的一些代码,您可以调整它以制作集合并集和集合差集的版本:

template <typename InIt1, typename InIt2, typename OutIt>
OutIt unordered_set_intersection(InIt1 b1, InIt1 e1, InIt2 b2, InIt2 e2, OutIt out) {
    while (!(b1 == e1)) {
        if (!(std::find(b2, e2, *b1) == e2)) {
            *out = *b1;
            ++out;
        }

        ++b1;
    }

    return out;
}

假设您有两个 unordered_setx 和 y,您可以使用以下方法将它们的交集放在 z 中:

unordered_set_intersection(
    x.begin(), x.end(),
    y.begin(), y.end(),
    inserter(z, z.begin())
);

bdonlan 的答案这实际上适用于任何键类型和容器类型的任何组合(尽管使用 set_intersection()<如果源容器已排序, /code> 当然会更快)。

注意:如果存储桶占用率很高,则将每个散列复制到向量中,对它们进行排序并在那里 set_intersection() 它们可能会更快,因为在包含 n 个元素的存储桶中进行搜索是 O(n)。

I see that set_intersection() et al. from the algorithm header won't work as they explicitly require their inputs to be sorted -- guess you ruled them out already.

It occurs to me that the "naive" approach of iterating through hash A and looking up every element in hash B should actually give you near-optimal performance, since successive lookups in hash B will be going to the same hash bucket (assuming that both hashes are using the same hash function). That should give you decent memory locality, even though these buckets are almost certainly implemented as linked lists.

Here's some code for unordered_set_difference(), you can tweak it to make the versions for set union and set difference:

template <typename InIt1, typename InIt2, typename OutIt>
OutIt unordered_set_intersection(InIt1 b1, InIt1 e1, InIt2 b2, InIt2 e2, OutIt out) {
    while (!(b1 == e1)) {
        if (!(std::find(b2, e2, *b1) == e2)) {
            *out = *b1;
            ++out;
        }

        ++b1;
    }

    return out;
}

Assuming you have two unordered_sets, x and y, you can put their intersection in z using:

unordered_set_intersection(
    x.begin(), x.end(),
    y.begin(), y.end(),
    inserter(z, z.begin())
);

Unlike bdonlan's answer, this will actually work for any key types, and any combination of container types (although using set_intersection() will of course be faster if the source containers are sorted).

NOTE: If bucket occupancies are high, it's probably faster to copy each hash into a vector, sort them and set_intersection() them there, since searching within a bucket containing n elements is O(n).

稚气少女 2024-07-27 06:11:46

没有什么太多的 - 对于相交,只需遍历一个元素并确保它在另一个元素中即可。 对于并集,添加两个输入集中的所有项目。

例如:

void us_isect(std::tr1::unordered_set<int> &out,
        const std::tr1::unordered_set<int> &in1,
        const std::tr1::unordered_set<int> &in2)
{
    out.clear();
    if (in2.size() < in1.size()) {
        us_isect(out, in2, in1);
        return;
    }
    for (std::tr1::unordered_set<int>::const_iterator it = in1.begin(); it != in1.end(); it++)
    {
        if (in2.find(*it) != in2.end())
            out.insert(*it);
    }
}

void us_union(std::tr1::unordered_set<int> &out,
        const std::tr1::unordered_set<int> &in1,
        const std::tr1::unordered_set<int> &in2)
{
    out.clear();
    out.insert(in1.begin(), in1.end());
    out.insert(in2.begin(), in2.end());
}

There's nothing much to it - for intersect, just go through every element of one and ensure it's in the other. For union, add all items from both input sets.

For example:

void us_isect(std::tr1::unordered_set<int> &out,
        const std::tr1::unordered_set<int> &in1,
        const std::tr1::unordered_set<int> &in2)
{
    out.clear();
    if (in2.size() < in1.size()) {
        us_isect(out, in2, in1);
        return;
    }
    for (std::tr1::unordered_set<int>::const_iterator it = in1.begin(); it != in1.end(); it++)
    {
        if (in2.find(*it) != in2.end())
            out.insert(*it);
    }
}

void us_union(std::tr1::unordered_set<int> &out,
        const std::tr1::unordered_set<int> &in1,
        const std::tr1::unordered_set<int> &in2)
{
    out.clear();
    out.insert(in1.begin(), in1.end());
    out.insert(in2.begin(), in2.end());
}
盛装女皇 2024-07-27 06:11:46

基于之前的答案:
C++11版本,如果集合支持快速查找函数find()
(返回值被有效移动)

/** Intersection and union function for unordered containers which support a fast lookup function find()
 *  Return values are moved by move-semantics, for c++11/c++14 this is efficient, otherwise it results in a copy
 */

namespace unorderedHelpers {

    template<typename UnorderedIn1, typename UnorderedIn2,
             typename UnorderedOut = UnorderedIn1>
    UnorderedOut makeIntersection(const  UnorderedIn1 &in1, const  UnorderedIn2 &in2)
    {
        if (in2.size() < in1.size()) {
            return makeIntersection<UnorderedIn2,UnorderedIn1,UnorderedOut>(in2, in1);
        }

        UnorderedOut out;
        auto e = in2.end();
        for(auto & v : in1)
        {
            if (in2.find(v) != e){
                out.insert(v);
            }
        }
        return out;
    }

    template<typename UnorderedIn1, typename UnorderedIn2,
             typename UnorderedOut = UnorderedIn1>
    UnorderedOut makeUnion(const UnorderedIn1 &in1, const UnorderedIn2 &in2)
    {
        UnorderedOut out;
        out.insert(in1.begin(), in1.end());
        out.insert(in2.begin(), in2.end());
        return out;
    }
}

based on the previous answer:
C++11 version, if the set supports a fast look up function find()
(return values are moved efficiently)

/** Intersection and union function for unordered containers which support a fast lookup function find()
 *  Return values are moved by move-semantics, for c++11/c++14 this is efficient, otherwise it results in a copy
 */

namespace unorderedHelpers {

    template<typename UnorderedIn1, typename UnorderedIn2,
             typename UnorderedOut = UnorderedIn1>
    UnorderedOut makeIntersection(const  UnorderedIn1 &in1, const  UnorderedIn2 &in2)
    {
        if (in2.size() < in1.size()) {
            return makeIntersection<UnorderedIn2,UnorderedIn1,UnorderedOut>(in2, in1);
        }

        UnorderedOut out;
        auto e = in2.end();
        for(auto & v : in1)
        {
            if (in2.find(v) != e){
                out.insert(v);
            }
        }
        return out;
    }

    template<typename UnorderedIn1, typename UnorderedIn2,
             typename UnorderedOut = UnorderedIn1>
    UnorderedOut makeUnion(const UnorderedIn1 &in1, const UnorderedIn2 &in2)
    {
        UnorderedOut out;
        out.insert(in1.begin(), in1.end());
        out.insert(in2.begin(), in2.end());
        return out;
    }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文