unordered_map / unordered_set 中元组的通用哈希

发布于 2024-11-30 01:32:32 字数 597 浏览 1 评论 0 原文

为什么 std::unordered_map, string> 不只是 开箱即用? 必须为 tuple 定义哈希函数是很乏味的,例如

template<> struct do_hash<tuple<int, int>>                               
{   size_t operator()(std::tuple<int, int> const& tt) const {...}  }; 

用元组作为键构建无序映射 (Matthieu M.) 展示了如何 为 boost::tuple 自动执行此操作。无论如何,是否可以在不使用可变参数模板的情况下对 c++0x 元组执行此操作?

当然这应该在标准中:(

Why doesn't std::unordered_map<tuple<int, int>, string> just
work out of the box?
It is tedious to have to define a hash function for tuple<int, int>, e.g.

template<> struct do_hash<tuple<int, int>>                               
{   size_t operator()(std::tuple<int, int> const& tt) const {...}  }; 

Building an unordered map with tuples as keys (Matthieu M.) shows how to
automate this for boost::tuple. Is there anyway of doing this for c++0x tuples without using variadic templates?

Surely this should be in the standard :(

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

拥醉 2024-12-07 01:32:32

这适用于 gcc 4.5,允许所有包含标准可哈希类型的 c++0x 元组成为以下成员
unordered_mapunordered_set 不用多说。
(我将代码放入头文件中,然后将其包含在内。)

该函数必须位于 std 命名空间中,以便由
参数相关名称查找 (ADL)。

有更简单的解决方案吗?

#include <tuple>
namespace std{
    namespace
    {

        // Code from boost
        // Reciprocal of the golden ratio helps spread entropy
        //     and handles duplicates.
        // See Mike Seymour in magic-numbers-in-boosthash-combine:
        //     http://stackoverflow.com/questions/4948780

        template <class T>
        inline void hash_combine(std::size_t& seed, T const& v)
        {
            seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }

        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              

    };
}

标准一致性代码

Yakk 指出,在 std 命名空间中专门化事物实际上是未定义的行为。如果您希望拥有一个符合标准的解决方案,那么您需要将所有这些代码移至您自己的命名空间中,并放弃任何 ADL 自动查找正确哈希实现的想法。而不是 :

unordered_set<tuple<double, int> > test_set;

您需要:

unordered_set<tuple<double, int>, hash_tuple::hash<tuple<double, int>>> test2;

其中 hash_tuple 是您自己的命名空间,而不是 std::

为此,您首先必须在 hash_tuple 命名空间内声明一个哈希实现。这会将所有非元组类型转发到 std::hash

namespace hash_tuple{

template <typename TT>
struct hash
{
    size_t
    operator()(TT const& tt) const
    {                                              
        return std::hash<TT>()(tt);                                 
    }                                              
};
}

确保 hash_combine 调用 hash_tuple::hash 而不是 std ::hash

namespace hash_tuple{

namespace
    {
    template <class T>
    inline void hash_combine(std::size_t& seed, T const& v)
    {
        seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
    }
}

然后包含所有其他先前的代码,但将其放入 namespace hash_tuple 而不是 std::

namespace hash_tuple{

    namespace
    {
        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              
    };

}

This works on gcc 4.5 allowing all c++0x tuples containing standard hashable types to be members of
unordered_map and unordered_set without further ado.
(I put the code in a header file and just include it.)

The function has to live in the std namespace so that it is picked up by
argument-dependent name lookup (ADL).

Is there a simpler solution?

#include <tuple>
namespace std{
    namespace
    {

        // Code from boost
        // Reciprocal of the golden ratio helps spread entropy
        //     and handles duplicates.
        // See Mike Seymour in magic-numbers-in-boosthash-combine:
        //     http://stackoverflow.com/questions/4948780

        template <class T>
        inline void hash_combine(std::size_t& seed, T const& v)
        {
            seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }

        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              

    };
}

Standard Conformant code

Yakk points out that specialising things in the std namespace is actually undefined behaviour. If you wish to have a standards conforming solution, then you need to move all of this code into your own namespace and give up any idea of ADL finding the right hash implementation automatically. Instead of :

unordered_set<tuple<double, int> > test_set;

You need:

unordered_set<tuple<double, int>, hash_tuple::hash<tuple<double, int>>> test2;

where hash_tuple is your own namespace rather than std::.

To do this, you first have to declare a hash implementation inside the hash_tuple namespace. This will forward all non tuple types to the std::hash:

namespace hash_tuple{

template <typename TT>
struct hash
{
    size_t
    operator()(TT const& tt) const
    {                                              
        return std::hash<TT>()(tt);                                 
    }                                              
};
}

Make sure that hash_combine calls hash_tuple::hash and not std::hash

namespace hash_tuple{

namespace
    {
    template <class T>
    inline void hash_combine(std::size_t& seed, T const& v)
    {
        seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
    }
}

Then include all the other previous code but put it inside namespace hash_tuple and not std::

namespace hash_tuple{

    namespace
    {
        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              
    };

}
情绪失控 2024-12-07 01:32:32
#include <boost/functional/hash.hpp>
#include <tuple>

namespace std
{

template<typename... T>
struct hash<tuple<T...>>
{
    size_t operator()(tuple<T...> const& arg) const noexcept
    {
        return boost::hash_value(arg);
    }
};

}
#include <boost/functional/hash.hpp>
#include <tuple>

namespace std
{

template<typename... T>
struct hash<tuple<T...>>
{
    size_t operator()(tuple<T...> const& arg) const noexcept
    {
        return boost::hash_value(arg);
    }
};

}
眼角的笑意。 2024-12-07 01:32:32

对于 C++20,可以使用 折叠表达式通用 lambda 用于计算元组的哈希值,无需递归。我更喜欢依赖 std::hash 而不是手动组合哈希:

#include <cinttypes>
#include <cstddef>
#include <functional>
#include <tuple>

class hash_tuple {
    template<class T>
    struct component {
        const T& value;
        component(const T& value) : value(value) {}
        uintmax_t operator,(uintmax_t n) const {
            n ^= std::hash<T>()(value);
            n ^= n << (sizeof(uintmax_t) * 4 - 1);
            return n ^ std::hash<uintmax_t>()(n);
        }
    };

public:
    template<class Tuple>
    size_t operator()(const Tuple& tuple) const {
        return std::hash<uintmax_t>()(
            std::apply([](const auto& ... xs) { return (component(xs), ..., 0); }, tuple));
    }
};

- 1 in sizeof(uintmax_t) * 4 - 1是可选的,但似乎稍微改善了哈希分布。该类可以与 std::tuplestd::pair 一起使用。

With C++20, it is possible to use fold expressions and generic lambdas to compute hash of a tuple without recursion. I prefer to rely on std::hash<uintmax_t> instead of manually combining hashes:

#include <cinttypes>
#include <cstddef>
#include <functional>
#include <tuple>

class hash_tuple {
    template<class T>
    struct component {
        const T& value;
        component(const T& value) : value(value) {}
        uintmax_t operator,(uintmax_t n) const {
            n ^= std::hash<T>()(value);
            n ^= n << (sizeof(uintmax_t) * 4 - 1);
            return n ^ std::hash<uintmax_t>()(n);
        }
    };

public:
    template<class Tuple>
    size_t operator()(const Tuple& tuple) const {
        return std::hash<uintmax_t>()(
            std::apply([](const auto& ... xs) { return (component(xs), ..., 0); }, tuple));
    }
};

- 1 in sizeof(uintmax_t) * 4 - 1 is optional, but appears to slightly improve hash distribution. This class can be used both with std::tuple and std::pair.

寻找一个思念的角度 2024-12-07 01:32:32

在我的 C++0x 草案中,20.8.15 表示 hash 专门用于内置类型(包括指针,但似乎并不意味着取消引用它们)。它似乎还专门用于 error_codebitsetunique_ptrshared_ptr< /code>、类型索引字符串u16stringu32stringwstringvectorthread::id。 (令人着迷的列表!)

我没有使用过 C++0x 变量,所以我的格式可能很差,但是沿着这些思路的东西可能适用于所有元组。

size_t hash_combiner(size_t left, size_t right) //replacable
{ return left + 0x9e3779b9 + (right<<6) + (right>>2);}

template<int index, class...types>
struct hash_impl {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<index, std::tuple<types...>>::type nexttype;
        hash_impl<index-1, types...> next;
        size_t b = std::hash<nexttype>()(std::get<index>(t));
        return next(hash_combiner(a, b), t); 
    }
};
template<class...types>
struct hash_impl<0, types...> {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<0, std::tuple<types...>>::type nexttype;
        size_t b = std::hash<nexttype>()(std::get<0>(t));
        return hash_combiner(a, b); 
    }
};

template<class...types>
struct tuple_hash<std::tuple<types...>> {
    size_t operator()(const std::tuple<types...>& t) {
        const size_t begin = std::tuple_size<std::tuple<types...>>::value-1;
        return hash_impl<begin, types...>()(0, t);
    }
}

这个版本实际上可以编译并运行

Yakk 观察到直接专门化 std::hash 是 <从技术上来说是不允许的,因为我们正在专门使用一个不依赖于用户定义类型的声明的标准库模板。

In my C++0x draft, 20.8.15 says hash is specialized for built-in types (including pointers, but doesn't seem to imply dereferencing them). It also appears to be specialized for error_code, bitset<N>, unique_ptr<T, D>, shared_ptr<T>, typeindex, string, u16string, u32string, wstring, vector<bool, Allocator>, and thread::id. (facinating list!)

I've not used C++0x variadics, so my formatting is probably way off, but something along these lines might work for all tuples.

size_t hash_combiner(size_t left, size_t right) //replacable
{ return left + 0x9e3779b9 + (right<<6) + (right>>2);}

template<int index, class...types>
struct hash_impl {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<index, std::tuple<types...>>::type nexttype;
        hash_impl<index-1, types...> next;
        size_t b = std::hash<nexttype>()(std::get<index>(t));
        return next(hash_combiner(a, b), t); 
    }
};
template<class...types>
struct hash_impl<0, types...> {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<0, std::tuple<types...>>::type nexttype;
        size_t b = std::hash<nexttype>()(std::get<0>(t));
        return hash_combiner(a, b); 
    }
};

template<class...types>
struct tuple_hash<std::tuple<types...>> {
    size_t operator()(const std::tuple<types...>& t) {
        const size_t begin = std::tuple_size<std::tuple<types...>>::value-1;
        return hash_impl<begin, types...>()(0, t);
    }
}

This version actually compiles and runs

Yakk has observed that specializing std::hash directly is technically not allowed, since we're specializing a standard library template with a declaration that does not depend on a user-defined type.

指尖上得阳光 2024-12-07 01:32:32

如果你想以简单的方式做到这一点。只要做

std::unordered_map<std::tuple<int, int>, std::string, boost::hash<std::tuple<int, int>>> mp;

If you want to do it in a simple way. Just do

std::unordered_map<std::tuple<int, int>, std::string, boost::hash<std::tuple<int, int>>> mp;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文