
Posted 2024-08-02 10:12:19, 270 words, 7 views, 0 comments







I'm looking for the optimal way to compute a hashcode for a set of two-dimensional points (so that I can store polygons in a hashtable).

There are some obvious ways to do that, such as concatenating all the points' coordinates into a string and taking its hashcode, but this would be very slow.

At the other end of the speed/collision spectrum, I could for example sum up all the coordinates, which would be very fast but would also create a lot of collisions.

What's the optimal way to compute a hashcode for a set of points?

Is the optimal solution different if the coordinates are integer (vs real coordinates)?

Edit: I'm using .NET, so the hashcode should be 32 bits long.



痕至 2024-08-09 10:12:19





There is no optimal way to do this job. It all depends on how big a hash you can afford, and you have to make tradeoffs between speed and diffusion. Keep in mind that there is no such thing as an optimal solution if you do not know exactly what you are going to hash. In some cases XOR can be good enough.

Take this code, for instance:

unsigned int JSHash(char* str, unsigned int len)
{
    unsigned int hash = 1315423911;
    unsigned int i    = 0;

    /* Mix each input byte into the running hash. */
    for(i = 0; i < len; str++, i++)
    {
        hash ^= ((hash << 5) + (*str) + (hash >> 2));
    }

    return hash;
}
/* End Of JS Hash Function */

You said that aggregating the points together is too slow. If you adapt the code above, it does not need any kind of aggregation: just pass each coordinate through it in turn (not much different from summing). And if you are using integers and floats, you will probably want to adjust the shifts (<< and >> are shift operations which, used together, work much like a bitwise rotation) to fit your data type.
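As a rough illustration of that adaptation, here is a minimal C# sketch that feeds each coordinate through the same mixing step instead of each character. The class name PointHashing, the tuple-based point type, and the way a double is folded to 32 bits are all assumptions made for the example, not part of the original function.

```csharp
using System;
using System.Collections.Generic;

public static class PointHashing
{
    // One step of the JS-hash mixing shown above, applied to a 32-bit value.
    static uint Mix(uint hash, uint v)
    {
        return unchecked(hash ^ ((hash << 5) + v + (hash >> 2)));
    }

    // Folds the 64-bit pattern of a double into 32 bits (high half XOR low half).
    static uint Fold(double v)
    {
        long bits = BitConverter.DoubleToInt64Bits(v);
        return unchecked((uint)(bits ^ (bits >> 32)));
    }

    // Integer coordinates: order-dependent, no string building or other aggregation.
    public static int HashPoints(IEnumerable<(int X, int Y)> points)
    {
        uint hash = 1315423911;
        foreach (var p in points)
        {
            hash = Mix(hash, unchecked((uint)p.X));
            hash = Mix(hash, unchecked((uint)p.Y));
        }
        return unchecked((int)hash);
    }

    // Real coordinates: same loop, each double folded to 32 bits first.
    public static int HashPoints(IEnumerable<(double X, double Y)> points)
    {
        uint hash = 1315423911;
        foreach (var p in points)
        {
            hash = Mix(hash, Fold(p.X));
            hash = Mix(hash, Fold(p.Y));
        }
        return unchecked((int)hash);
    }
}
```

Note that with real coordinates, 0.0 and -0.0 compare equal but have different bit patterns, so they would need normalizing before hashing if the equality test treats them as the same point.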

Check for other hash functions here:

流殇 2024-08-09 10:12:19




Optimal is dependent on your requirements from the hash computation.

Performance will come at the cost of more hash collisions.

Do you have a hard bound on either one? It's going to come down to a mathematical analysis of how much each percent of hash collisions is going to cost you in terms of performance.

幻想少年梦 2024-08-09 10:12:19



If your data set is by any chance one of polygons that can have common edges but not overlap otherwise, you only need to hash on three points in each polygon to avoid collisions.

Edit: Reconsidering this, picturing possible collisions with concave/convex boundaries, it is just as well your polygons overlap. - Sigh

Alas: When the convex and the concave meet, it always gets me into trouble. :-P

时光是把杀猪刀 2024-08-09 10:12:19




Alternatively, you can just XOR the hashes of the individual points.

return p1.GetHashCode() ^ p2.GetHashCode();

Depending on what the values are going to be, you could probably just add them instead.
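To make the difference between the two concrete, here is a small sketch, assuming integer points stored as tuples and a made-up per-point hash (PointHash stands in for p.GetHashCode(); it is not what any particular .NET point type does). Both combiners ignore point order. The visible difference is that XOR cancels out any point that appears twice, while wrapping addition does not.

```csharp
using System;
using System.Collections.Generic;

public static class CommutativeCombine
{
    // Hypothetical per-point hash used only for this illustration.
    public static int PointHash((int X, int Y) p)
    {
        return unchecked(p.X * 31 + p.Y);
    }

    // XOR of the point hashes: order-independent, but equal points cancel in pairs.
    public static int XorHash(IEnumerable<(int X, int Y)> points)
    {
        int h = 0;
        foreach (var p in points)
        {
            h ^= PointHash(p);
        }
        return h;
    }

    // Wrapping sum of the point hashes: also order-independent, no pairwise cancelling.
    public static int SumHash(IEnumerable<(int X, int Y)> points)
    {
        int h = 0;
        foreach (var p in points)
        {
            h = unchecked(h + PointHash(p));
        }
        return h;
    }
}
```

With either one, two polygons that use the same points connected in a different order get the same hash, so the equality check still has to do the real comparison.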

鹤仙姿 2024-08-09 10:12:19




If you want polygons that are defined clockwise and anticlockwise, but are otherwise identical, to compare as equal, then you'll have to create a canonicalization function: a function that, given a polygon's points starting from any point and in either direction, returns the points in the same order.

One algorithm that I can think of is to find the minimum of all possible sequences of points:

  1. Find the set of top-leftmost points (points with minimum x of the points with minimum y), these are the starting points.
  2. For each starting point and each direction, iteratively add connected points in the given direction and eliminate all that aren't top-leftmost in the current iteration.
    Halt when only one (starting point, direction) pair is left or when n-1 iterations are completed. If more than one pair remains, choose any of them: they are all isomorphic.
  3. Reorder the points starting from the found point in the found direction.

This is O(n^2) worst-case for fully degenerate polygons, but if your polygons don't have overlapping points, this is O(n), with a pretty small constant factor.
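For reference, here is a brute-force sketch of the same idea. It does not do the pruning described in the steps above: it simply compares every (starting point, direction) traversal lexicographically and keeps the smallest, so it is O(n^2) in all cases. The names PolygonCanon, Canonicalize, and the tuple point type are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class PolygonCanon
{
    // Orders points by Y first, then X, so the top-leftmost point sorts first.
    static int ComparePoints((int X, int Y) a, (int X, int Y) b)
    {
        int c = a.Y.CompareTo(b.Y);
        return c != 0 ? c : a.X.CompareTo(b.X);
    }

    // Point at position k of the traversal that starts at index start
    // and walks around the ring in direction dir (+1 or -1).
    static (int X, int Y) At(IReadOnlyList<(int X, int Y)> pts, int start, int dir, int k)
    {
        int n = pts.Count;
        return pts[((start + dir * k) % n + n) % n];
    }

    // Returns the lexicographically smallest traversal of the ring.
    public static List<(int X, int Y)> Canonicalize(IReadOnlyList<(int X, int Y)> pts)
    {
        int n = pts.Count;
        int bestStart = 0, bestDir = 1;
        for (int start = 0; start < n; start++)
        {
            foreach (int dir in new[] { 1, -1 })
            {
                // Compare this traversal with the best one found so far.
                for (int k = 0; k < n; k++)
                {
                    int c = ComparePoints(At(pts, start, dir, k), At(pts, bestStart, bestDir, k));
                    if (c == 0) continue;
                    if (c < 0) { bestStart = start; bestDir = dir; }
                    break;
                }
            }
        }
        var result = new List<(int X, int Y)>(n);
        for (int k = 0; k < n; k++)
        {
            result.Add(At(pts, bestStart, bestDir, k));
        }
        return result;
    }
}
```

Two descriptions of the same square, one rotated and reversed relative to the other, come out as the same list, which is what makes an order-dependent hash safe to use afterwards.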

With the canonicalized order you can easily compare two polygons for equality: just compare the points pairwise. Hashcode calculation is also trivial; use any reasonably robust hash combination method. For example:

int result = 0;
foreach (var point in this.points) {
    // Order-dependent combine, so the points must already be in canonical order.
    result = (result * 31 + point.X.GetHashCode()) * 31 + point.Y.GetHashCode();
}
return result;

栀子花开つ 2024-08-09 10:12:19






For a very quick (to calculate) hash with the desired clockwise/counterclockwise independence, you would not want to depend on finding a well-defined ordering of the points.

This limits your hash combining operations to ones which commute. Therefore we wish to keep any and all data which is independent of orientation separate during the combining operations.

Here is a simple solution:

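Assume a combine function int -> int -> int that is both associative and commutative, so that the result does not depend on the order in which the points are visited. Either of the following will do to start with (a multiply-and-add such as h * 31 + x is neither, and would make the hash order-dependent):

public static int combine(int h, int x) {
    return unchecked(h + x);
}

public static int combine(int h, int x) {
    return h ^ x;
}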

Then we can do the following:

public override int GetHashCode()
{
    int x = 0;
    int y = 0;
    foreach (var p in polygon)
    {
        x = combine(x, p.X);
        y = combine(y, p.Y);
    }
    // simplified, unrolled Murmur2 hash for end stage
    // (uint arithmetic wraps around in the default unchecked context)
    const uint m = 0x5bd1e995;
    const int r = 24;
    // seed with the number of points (assumes polygon exposes Count)
    uint h = (uint)polygon.Count;
    uint k = ReinterpretInt32ToUInt32(x);
    k *= m;
    k ^= k >> r;
    k *= m;
    h *= m;
    h ^= k;
    k = ReinterpretInt32ToUInt32(y);
    k *= m;
    k ^= k >> r;
    k *= m;
    h *= m;
    h ^= k;
    // avalanche
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return ReinterpretUInt32ToInt32(h);
}

The code above relies on these helpers to keep it simple:

public unsafe uint ReinterpretInt32ToUInt32(int i)
{
    return *((uint*) (void*) &i);
}

public unsafe int ReinterpretUInt32ToInt32(uint u)
{
    return *((int*) (void*) &u);
}
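
If compiling with unsafe code enabled is not an option, the same bit-for-bit reinterpretation can be had with unchecked casts, since in C# an unchecked conversion between int and uint keeps the 32-bit pattern unchanged. A minimal sketch (the class and method names are made up for the example):

```csharp
public static class Reinterpret
{
    // Same bits, viewed as unsigned: -1 becomes 0xFFFFFFFF.
    public static uint ToUInt32(int i)
    {
        return unchecked((uint)i);
    }

    // Same bits, viewed as signed: 0x80000000 becomes int.MinValue.
    public static int ToInt32(uint u)
    {
        return unchecked((int)u);
    }
}
```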

This will not be the best hash in terms of collision avoidance, but it should be very fast to calculate, and you may find it sufficient for your needs.
