Hashing 2D, 3D and nD vectors

Posted on 2024-11-06 10:22:39

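What are good hash functions (fast, well distributed, few collisions) for hashing 2D and 3D vectors composed of IEEE 32-bit floats? I assume general 3D vectors, but algorithms that assume normals (always in [-1,1]) are also welcome. I am also not afraid of bit manipulation, since IEEE floats are always IEEE floats.

Another, more general problem is hashing an N-dimensional float vector, where N is quite small (3-12) and constant, but not known at compile time. At the moment I just reinterpret these floats as uints and XOR them together, which is probably not the best solution.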

吻风 2024-11-13 10:22:39

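A spatial hash function is described in Optimized Spatial Hashing for Collision Detection of Deformable Objects. They use the hash function

hash(x,y,z) = ( x p1 xor y p2 xor z p3) mod n
where p1, p2, p3 are large prime numbers, in our case 73856093, 19349663, 83492791, respectively. The value n is the hash table size.

In the paper, x, y, and z are the discretized coordinates; you could probably also use the binary values of your floats.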
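The suggestion to hash the binary values of the floats could be sketched as follows. This is only an illustration, not code from the paper: `float_bits` and `hash_float3` are hypothetical helper names, and the table size 1024 is an arbitrary choice. Note two assumptions: Python integers do not wrap at 32 bits the way C unsigned integers do, and 0.0 and -0.0 have different bit patterns, so they may land in different buckets.

```python
import struct

# Primes quoted in the answer above.
P1, P2, P3 = 73856093, 19349663, 83492791

def float_bits(value):
    """Return the IEEE 754 binary32 bit pattern of value as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def hash_float3(x, y, z, n):
    """Hypothetical helper: apply the quoted formula to the raw float bits."""
    h = (float_bits(x) * P1) ^ (float_bits(y) * P2) ^ (float_bits(z) * P3)
    return h % n

# Assumed table size of 1024, chosen only for this example.
print(hash_float3(0.5, -0.25, 1.0, 1024))
```

Because no quantization is involved, two vectors collide only by chance or when their bit patterns are identical, so this variant suits exact-match lookup rather than neighbor queries.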

美人如玉 2024-11-13 10:22:39

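I have two suggestions.

Assume a grid cell size of l and quantize the x, y, and z coordinates by computing ix = floor(x/l), iy = floor(y/l), and iz = floor(z/l), where ix, iy, and iz are integers. Then use the hash function defined in Optimized Spatial Hashing for Collision Detection of Deformable Objects.

If you skip the quantization, the hash will not be sensitive to closeness (locality): nearby points will not tend to share a bucket.

Locality-sensitive hashing (LSH) has been mentioned for hashing higher-dimensional vectors. Why not use it for 3D or 2D vectors as well? The LSH variant suited to the Euclidean distance metric (which is what we need for 2D and 3D vectors) is called Locality-Sensitive Hashing using p-stable distributions. There is a very readable tutorial here.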
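As a rough sketch of what a single p-stable LSH function for Euclidean distance looks like, the commonly cited form is h(v) = floor((a · v + b) / w), where a has independent standard normal entries and b is drawn uniformly from [0, w). The helper name `make_pstable_hash`, the bucket width w = 4.0, and the seed are assumptions made for this example and do not come from the answer.

```python
import math
import random

def make_pstable_hash(dim, w, seed=0):
    """Build one hash h(v) = floor((a . v + b) / w) with Gaussian a."""
    rng = random.Random(seed)
    # The Gaussian distribution is 2-stable, which matches Euclidean distance.
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Random offset so that bucket boundaries are not fixed at multiples of w.
    b = rng.uniform(0.0, w)

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h

h = make_pstable_hash(dim=3, w=4.0)
# Close points are likely, but not guaranteed, to share a bucket.
print(h([0.10, 0.20, 0.30]), h([0.11, 0.20, 0.30]), h([50.0, -40.0, 30.0]))
```

In practice several such functions are usually concatenated into one key and several tables are kept, trading memory for a better chance of finding true neighbors. The same construction works for any small N, since `dim` is an ordinary runtime parameter.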

睫毛上残留的泪 2024-11-13 10:22:39

I wrote this in Python based on the comments seen here,

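import numpy as np

l = 5  # grid cell size
n = 5  # hash table size
p1, p2, p3 = 73856093, 19349663, 83492791  # large primes from the paper

x1 = [33, 4, 11]
x2 = [31, 1, 14]
x3 = [10, 44, 19]

def spatial_hash(x):
    # Quantize each coordinate to an integer grid cell index.
    ix, iy, iz = (int(np.floor(c / l)) for c in x)
    # Multiply by the primes in integer arithmetic, XOR, then reduce mod n.
    return ((ix * p1) ^ (iy * p2) ^ (iz * p3)) % n

print(spatial_hash(x1))
print(spatial_hash(x2))
print(spatial_hash(x3))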

It gives

1
1
3

It seemed to work: x1 and x2 fall in the same grid cell and get the same hash.

In C++

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <random>
#include <unordered_map>
#include <vector>

#include <eigen3/Eigen/Dense>

using Eigen::Vector3d;

const std::uint32_t HASH_SIZE = 200;  // number of hash buckets
const double L = 0.2;                 // grid cell size

// Buckets of points keyed by their spatial hash.
std::unordered_map<std::size_t, std::vector<Vector3d>> buckets;

// Quantize a point in [-1, 1]^3 to grid cell indices, then hash the indices.
// The +2 offset keeps the coordinates positive before truncation.
inline std::size_t hasha(const Vector3d &p) {
    // Unsigned arithmetic, so large cell indices wrap instead of
    // overflowing a signed int.
    std::uint32_t ix = static_cast<std::uint32_t>((p[0] + 2.0) / L);
    std::uint32_t iy = static_cast<std::uint32_t>((p[1] + 2.0) / L);
    std::uint32_t iz = static_cast<std::uint32_t>((p[2] + 2.0) / L);
    return ((ix * 73856093u) ^ (iy * 19349663u) ^ (iz * 83492791u)) % HASH_SIZE;
}

int main() {
    std::default_random_engine generator;
    std::uniform_real_distribution<double> distribution(-1.0, 1.0);

    // Insert 300 random points and report the bucket each one lands in.
    for (std::size_t i = 0; i < 300; i++) {
        double x = distribution(generator);
        double y = distribution(generator);
        double z = distribution(generator);
        Vector3d v(x, y, z);
        std::size_t key = hasha(v);
        std::cout << key << " " << v[0] << " " << v[1] << " " << v[2] << std::endl;
        buckets[key].push_back(v);
        std::cout << "size " << buckets[key].size() << std::endl;
    }

    // For each bucket, print the distance between every ordered pair of points.
    // Structured bindings require C++17.
    for (const auto &[key, points] : buckets) {
        std::cout << key << std::endl;
        for (std::size_t i = 0; i < points.size(); i++) {
            for (std::size_t j = 0; j < points.size(); j++) {
                if (i != j) {
                    std::cout << "   dist " << (points[i] - points[j]).norm() << std::endl;
                }
            }
        }
    }
}
