Collision-free hash function for a specific data structure

Posted on 2024-08-29 22:47:10

Is it possible to create a collision-free hash function for a data structure with specific properties?

  1. The data structure is int[][][]
  2. It contains no duplicates
  3. The range of the integers it contains is known. Let's say it's 0..1000; the maximum integer is definitely not greater than 10000.

The big problem is that this hash function should also be very fast.
Is there a way to create such a hash function? Maybe at run time, depending on the integer range?

ADDITION: I should say that the purpose of this hash function is to quickly check whether a particular combination has already been processed. When some combination of numbers in the data structure is processed, I calculate its hash value and store it. Then, when processing another combination of numbers within the data structure, I compare the hash values.

Comments (3)

野却迷人 2024-09-05 22:47:10

I think what you want is a "perfect hash" or even a "minimal perfect hash":

http://en.wikipedia.org/wiki/Perfect_hash_function

Edit: That said, if you are certain you'll never go above [0...1000], then depending on what you need to do, you can probably just "bucket" your results directly in an array. If you don't have many elements, that array will be sparse (and hence a bit wasteful), but for at most 1001 elements in [0...1000], an Object[1001] (or int[1001], or whatever) will probably do.
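As a rough illustration of the "bucket directly in an array" idea, here is a minimal C# sketch. It assumes each thing being tracked is a single integer in [0, maxValue]; the class name SeenTable and its methods are hypothetical and not from the answer.

```csharp
using System;

// Hypothetical helper: a direct-address "seen" table for values in [0, maxValue].
// The value itself is the array index, so no two values can collide.
public sealed class SeenTable
{
    private readonly bool[] _seen;

    public SeenTable(int maxValue)
    {
        if (maxValue < 0) throw new ArgumentOutOfRangeException(nameof(maxValue));
        _seen = new bool[maxValue + 1]; // one slot per possible value, e.g. 1001 for 0..1000
    }

    // Marks the value as processed; returns true only the first time it is seen.
    public bool MarkProcessed(int value)
    {
        if (value < 0 || value >= _seen.Length)
            throw new ArgumentOutOfRangeException(nameof(value));
        if (_seen[value]) return false;
        _seen[value] = true;
        return true;
    }

    // Out-of-range values are reported as "not processed" rather than throwing.
    public bool WasProcessed(int value) =>
        value >= 0 && value < _seen.Length && _seen[value];
}
```

With `new SeenTable(1000)`, the first `MarkProcessed(42)` returns true and any later call with 42 returns false. The cost is one array slot per possible value regardless of how many values are actually stored.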

夜灵血窟げ 2024-09-05 22:47:10

What if you just use a 64-bit value and store the position at each level of the hierarchy in its own section of bits?

Something like (off the top of my head): hash = (a << 34) | (b << 17) | (c)
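A minimal C# sketch of that packing, assuming a, b and c are three non-negative indices that each fit in 17 bits (0..131071, which covers the stated maximum of 10000). The class name PackedKey and the Unpack helper are hypothetical additions. Note that in C# the operands have to be widened to long before shifting, because shifting an int by 34 only uses the low 5 bits of the shift count.

```csharp
using System;

// Hypothetical helper illustrating the (a << 34) | (b << 17) | c layout.
public static class PackedKey
{
    private const int BitsPerField = 17;                      // 2^17 = 131072 > 10000
    private const long FieldMask = (1L << BitsPerField) - 1;  // low 17 bits set

    // Packs three indices into disjoint bit ranges of one 64-bit key.
    // Distinct (a, b, c) triples always give distinct keys.
    public static long Pack(int a, int b, int c)
    {
        Check(a); Check(b); Check(c);
        long la = a, lb = b, lc = c; // widen first so the shifts happen in 64 bits
        return (la << (2 * BitsPerField)) | (lb << BitsPerField) | lc;
    }

    // Recovers the original triple, which shows the mapping is reversible.
    public static (int a, int b, int c) Unpack(long key) =>
        ((int)((key >> (2 * BitsPerField)) & FieldMask),
         (int)((key >> BitsPerField) & FieldMask),
         (int)(key & FieldMask));

    private static void Check(int v)
    {
        if (v < 0 || v > FieldMask)
            throw new ArgumentOutOfRangeException(nameof(v));
    }
}
```

The resulting keys could then be kept in a `HashSet<long>`; the set may still have internal bucket collisions, but because each key identifies exactly one triple, a membership test never gives a false positive.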

芯好空 2024-09-05 22:47:10

A perfect hash is likely not feasible, because it can take a lot of computation time to find one for your data set.

Would a bool[][][] work for you, where true means a certain x,y,z combination has been processed? Below is a prototype for a three-dimensional bit array. Because a BitArray is indexed by an Int32, this only works up to a per-dimension size of about 1,024 (1,024^3 = 2^30 bits, which fits within 128 MB). You could get to 10,000 by creating a BitArray[][]. However, that is probably not practical at that size, because 10,000^3 bits would occupy over 116 GB of RAM.

Depending on your exact problem size and needs, a plain old hash table (with collisions) may be your best bet. That said, here is the prototype code:

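using System;
using System.Collections;

public class ThreeDimensionalBitArray
{
    // todo: consider making the size configurable
    private const int MAX_INDEX = 1000;

    // Indices run from 0 to MAX_INDEX inclusive, so each dimension needs MAX_INDEX + 1 slots.
    private const int SIZE = MAX_INDEX + 1;

    // One bit per (x, y, z) combination; 1001^3 bits is about 125 MB.
    private readonly BitArray _bits = new BitArray(SIZE * SIZE * SIZE);

    public bool this[int x, int y, int z]
    {
        get { return _bits[getBitIndex(x, y, z)]; }
        set { _bits[getBitIndex(x, y, z)] = value; }
    }

    public ThreeDimensionalBitArray()
    {
    }

    // Maps (x, y, z) to a unique position in the flat bit array.
    private static int getBitIndex(int x, int y, int z)
    {
        // Without this check, an out-of-range index would alias another cell.
        if (x < 0 || x > MAX_INDEX || y < 0 || y > MAX_INDEX || z < 0 || z > MAX_INDEX)
            throw new ArgumentOutOfRangeException();

        return (x * SIZE * SIZE) + (y * SIZE) + z;
    }
}


public class BitArrayExample
{
    public static void Main()
    {
        ThreeDimensionalBitArray bitArray = new ThreeDimensionalBitArray();
        Console.WriteLine(bitArray[500, 600, 700]); // "False"
        bitArray[500, 600, 700] = true;
        Console.WriteLine(bitArray[500, 600, 700]); // "True"
    }
}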
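
For comparison, the "plain old hash table" option mentioned above could look roughly like the following C# sketch. It assumes the thing being tracked is an (x, y, z) triple, as in the prototype; the class name ProcessedSet and its members are hypothetical. Internal hash collisions can still happen, but HashSet resolves them by comparing the full tuple, so lookups stay exact.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tracks processed (x, y, z) combinations in a hash set.
// Memory grows with the number of stored combinations, not with the index range.
public sealed class ProcessedSet
{
    private readonly HashSet<(int x, int y, int z)> _seen =
        new HashSet<(int x, int y, int z)>();

    // Returns true if the combination is new, false if it was already recorded.
    public bool TryMark(int x, int y, int z) => _seen.Add((x, y, z));

    public bool Contains(int x, int y, int z) => _seen.Contains((x, y, z));

    public int Count => _seen.Count;
}
```

This trades the constant-size bit array for per-entry overhead, which tends to pay off when only a small fraction of the possible combinations is ever processed or when indices can reach 10,000.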