以一种独特且确定性的方式将两个整数映射为一个

发布于 2024-07-22 01:41:59 字数 237 浏览 7 评论 0 原文

想象两个正整数 A 和 B。我想将这两个组合成一个整数 C。

不能有其他整数 D 和 E 组合成 C。 因此将它们与加法运算符结合起来是行不通的。 例如 30 + 10 = 40 = 40 + 0 = 39 + 1 连接也不起作用。 例如“31”+“2”= 312 =“3”+“12”

此组合运算也应该是确定性的(使用相同的输入始终产生相同的结果)并且应该始终产生一个整数整数的正数或负数。

Imagine two positive integers A and B. I want to combine these two into a single integer C.

There can be no other integers D and E which combine to C.
So combining them with the addition operator doesn't work. Eg 30 + 10 = 40 = 40 + 0 = 39 + 1
Neither does concatination work. Eg "31" + "2" = 312 = "3" + "12"

This combination operation should also be deterministic (always yield the same result with the same inputs) and should always yield an integer on either the positive or the negative side of integers.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(21

还在原地等你 2024-07-29 01:41:59

Cantor 配对函数 考虑到其简单、快速且节省空间,它确实是最好的函数之一,但是,Wolfram Matthew Szudzik 在此处发表了一些更好的内容。 康托配对函数(相对)的局限性在于,如果输入是两个 N 位,则编码结果的范围并不总是保持在 2N 位整数的限制内整数。 也就是说,如果我的输入是两个 16 位整数,范围从 0 到 2^16 -1,则有 2^16 * (2^16 - 1) 输入可能的组合,因此根据明显的鸽洞原理,我们需要一个输出大小至少为 2^16 * (2^16 -1),等于 2^32 - 2^16,或者换句话说,一个地图理想情况下,32 位数字应该是可行的。 这在编程世界中可能具有不小的实际重要性。

康托尔配对函数

(a + b) * (a + b + 1) / 2 + a; where a, b >= 0

两个最大 16 位整数 (65535, 65535) 的映射将为 8589803520,如您所见,它无法适合 32 位。

输入Szudzik 函数

a >= b ? a * a + a + b : a + b * b;  where a, b >= 0

(65535, 65535) 的映射现在将为 4294967295,如您所见,它是一个 32 位(0 到 2^32 -1)整数。 这就是该解决方案的理想之处,它只是利用该空间中的每个点,因此没有什么可以提高空间效率。


现在考虑到我们通常在语言/框架中处理不同大小的数字的有符号实现,让我们考虑范围从 -(2^15) 到 2^15 的有符号 16 位整数-1(稍后我们将看到如何扩展输出以跨越有符号范围)。 由于 ab 必须为正数,因此它们的范围为 0 到 2^15 - 1

康托尔配对函数

两个最大 16 位有符号整数 (32767, 32767) 的映射将为 2147418112,这刚好低于有符号 32 位整数的最大值。

现在Szudzik 的函数

(32767, 32767) => 1073741823,小得多..

让我们考虑负整数。 这超出了我知道的最初问题,但只是为了帮助未来的访客而详细阐述。

康托尔配对函数

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
(A + B) * (A + B + 1) / 2 + A;

(-32768, -32768) => 8589803520 这是 Int64。 16 位输入的 64 位输出可能是不可原谅的!!

Szudzik 的函数

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
A >= B ? A * A + A + B : A + B * B;

(-32768, -32768) => 4294967295,对于无符号范围是 32 位,对于有符号范围是 64 位,但仍然更好。

现在,输出始终为正。 在有符号的世界中,如果我们可以将一半的输出转移到负轴,将会更加节省空间。 对于 Szudzik 来说,您可以这样做:

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
C = (A >= B ? A * A + A + B : A + B * B) / 2;
a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;

(-32768, 32767) => -2147483648

(32767, -32768) => -2147450880

(0, 0) => 0 

(32767, 32767) => 2147418112

(-32768, -32768) => 2147483647

我所做的:在对输入应用 2 权重并遍历该函数后,我将输出除以二,并将其中一些取负数轴乘以 -1

查看结果,对于有符号 16 位数字范围内的任何输入,输出都在有符号 32 位整数的限制内,这很酷。 我不确定如何以相同的方式实现康托配对功能,但没有尝试太多,因为它效率不高。 此外,Cantor配对函数涉及的计算量较多,也意味着其速度较慢

这是一个 C# 实现。

public static long PerfectlyHashThem(int a, int b)
{
    var A = (ulong)(a >= 0 ? 2 * (long)a : -2 * (long)a - 1);
    var B = (ulong)(b >= 0 ? 2 * (long)b : -2 * (long)b - 1);
    var C = (long)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}

public static int PerfectlyHashThem(short a, short b)
{
    var A = (uint)(a >= 0 ? 2 * a : -2 * a - 1);
    var B = (uint)(b >= 0 ? 2 * b : -2 * b - 1);
    var C = (int)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}

由于中间计算可能超出 2N 有符号整数的限制,因此我使用了 4N 整数类型(最后除以 2 将结果返回到2N)。

我在替代解决方案上提供的链接很好地描述了利用空间中每个点的函数图。 令人惊讶的是,您可以将一对坐标唯一地可逆地编码为单个数字! 神奇的数字世界!!

Cantor pairing function is really one of the better ones out there considering its simple, fast and space efficient, but there is something even better published at Wolfram by Matthew Szudzik, here. The limitation of Cantor pairing function (relatively) is that the range of encoded results doesn't always stay within the limits of a 2N bit integer if the inputs are two N bit integers. That is, if my inputs are two 16 bit integers ranging from 0 to 2^16 -1, then there are 2^16 * (2^16 -1) combinations of inputs possible, so by the obvious Pigeonhole Principle, we need an output of size at least 2^16 * (2^16 -1), which is equal to 2^32 - 2^16, or in other words, a map of 32 bit numbers should be feasible ideally. This may not be of little practical importance in programming world.

Cantor pairing function:

(a + b) * (a + b + 1) / 2 + a; where a, b >= 0

The mapping for two maximum most 16 bit integers (65535, 65535) will be 8589803520 which as you see cannot be fit into 32 bits.

Enter Szudzik's function:

a >= b ? a * a + a + b : a + b * b;  where a, b >= 0

The mapping for (65535, 65535) will now be 4294967295 which as you see is a 32 bit (0 to 2^32 -1) integer. This is where this solution is ideal, it simply utilizes every single point in that space, so nothing can get more space efficient.


Now considering the fact that we typically deal with the signed implementations of numbers of various sizes in languages/frameworks, let's consider signed 16 bit integers ranging from -(2^15) to 2^15 -1 (later we'll see how to extend even the ouput to span over signed range). Since a and b have to be positive they range from 0 to 2^15 - 1.

Cantor pairing function:

The mapping for two maximum most 16 bit signed integers (32767, 32767) will be 2147418112 which is just short of maximum value for signed 32 bit integer.

Now Szudzik's function:

(32767, 32767) => 1073741823, much smaller..

Let's account for negative integers. That's beyond the original question I know, but just elaborating to help future visitors.

Cantor pairing function:

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
(A + B) * (A + B + 1) / 2 + A;

(-32768, -32768) => 8589803520 which is Int64. 64 bit output for 16 bit inputs may be so unpardonable!!

Szudzik's function:

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
A >= B ? A * A + A + B : A + B * B;

(-32768, -32768) => 4294967295 which is 32 bit for unsigned range or 64 bit for signed range, but still better.

Now all this while the output has always been positive. In signed world, it will be even more space saving if we could transfer half the output to negative axis. You could do it like this for Szudzik's:

A = a >= 0 ? 2 * a : -2 * a - 1;
B = b >= 0 ? 2 * b : -2 * b - 1;
C = (A >= B ? A * A + A + B : A + B * B) / 2;
a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;

(-32768, 32767) => -2147483648

(32767, -32768) => -2147450880

(0, 0) => 0 

(32767, 32767) => 2147418112

(-32768, -32768) => 2147483647

What I do: After applying a weight of 2 to the the inputs and going through the function, I then divide the ouput by two and take some of them to negative axis by multiplying by -1.

See the results, for any input in the range of a signed 16 bit number, the output lies within the limits of a signed 32 bit integer which is cool. I'm not sure how to go about the same way for Cantor pairing function but didn't try as much as its not as efficient. Furthermore, more calculations involved in Cantor pairing function means its slower too.

Here is a C# implementation.

public static long PerfectlyHashThem(int a, int b)
{
    var A = (ulong)(a >= 0 ? 2 * (long)a : -2 * (long)a - 1);
    var B = (ulong)(b >= 0 ? 2 * (long)b : -2 * (long)b - 1);
    var C = (long)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}

public static int PerfectlyHashThem(short a, short b)
{
    var A = (uint)(a >= 0 ? 2 * a : -2 * a - 1);
    var B = (uint)(b >= 0 ? 2 * b : -2 * b - 1);
    var C = (int)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}

Since the intermediate calculations can exceed limits of 2N signed integer, I have used 4N integer type (the last division by 2 brings back the result to 2N).

The link I have provided on alternate solution nicely depicts a graph of the function utilizing every single point in space. Its amazing to see that you could uniquely encode a pair of coordinates to a single number reversibly! Magic world of numbers!!

如梦 2024-07-29 01:41:59

您正在寻找双射 NxN -> N 映射。 这些用于例如吻合。 请查看此 PDF,了解所谓的配对功能。 维基百科介绍了一个特定的配对函数,即 Cantor 配对函数

pi(k1, k2) = 1/2(k1 + k2)(k1 + k2 + 1) + k2]

三点注释:

  • 正如其他人已经明确指出的,如果您打算实现配对函数,您可能很快就会发现您需要任意大的整数(bignums)。
  • 如果您不想区分 (a, b) 和 (b, a) 对,请在应用配对函数之前对 a 和 b 进行排序。
  • 其实我撒谎了。 您正在寻找双射 ZxZ -> N 映射。 康托尔函数仅适用于非负数。 但这不是问题,因为定义双射很容易 f : Z -> N,如下所示:
  • f(n) = n * 2 if n >= 0
  • f(n) = -n * 2 - 1 if n < 0

You're looking for a bijective NxN -> N mapping. These are used for e.g. dovetailing. Have a look at this PDF for an introduction to so-called pairing functions. Wikipedia introduces a specific pairing function, namely the Cantor pairing function:

pi(k1, k2) = 1/2(k1 + k2)(k1 + k2 + 1) + k2]

Three remarks:

  • As others have made clear, if you plan to implement a pairing function, you may soon find you need arbitrarily large integers (bignums).
  • If you don't want to make a distinction between the pairs (a, b) and (b, a), then sort a and b before applying the pairing function.
  • Actually I lied. You are looking for a bijective ZxZ -> N mapping. Cantor's function only works on non-negative numbers. This is not a problem however, because it's easy to define a bijection f : Z -> N, like so:
  • f(n) = n * 2 if n >= 0
  • f(n) = -n * 2 - 1 if n < 0
分分钟 2024-07-29 01:41:59

如果A和B可以用2个字节表示,则可以将它们组合在4个字节上。 将 A 放在最重要的一半,将 B 放在最不重要的一半。

在 C 语言中,这给出(假设 sizeof(short)=2 和 sizeof(int)=4):

unsigned int combine(unsigned short A, unsigned short B)
{
    return ((unsigned)A<<16) | (unsigned)B;
}

unsigned short getA(unsigned int C)
{
    return C>>16;
}

unsigned short getB(unsigned int C)
{
    return C & 0xFFFF;    // or  return (unsigned short)C;
}

使输入 unsigned Shortuint16_t 确保它们在之前进行零扩展您将 |+ 它们放在一起。 否则,负数 B 会通过 OR 将高位设置为全 1,或者如果进行 ADD,则从上半部分减一。

在默认将窄类型提升为有符号 int 后,强制转换 (unsigned)A 可以避免左移中的有符号溢出 UB。 对于更广泛的类型,避免移出要保留的位也很重要,例如 ((uint64_t)A << 32 | B,因为默认提升在 int 处停止

(unsigned)B 转换不是必需的;重要的是它是 unsigned Short B 的左侧。 >| 是 unsigned 意味着这也将转换为 unsigned

您可以将其与有符号类型一起使用,至少是 getA 和 getB,并且您可以返回有符号。来自 combineint,但输入需要零扩展,因此在 C 中,您需要在加宽之前将它们设置为 unsigned Short。 ((unsigned)(unsigned short)A << 16) | (unsigned short)B

您可能需要使用 uint16_tuint32_t 来定义类型宽度与您正在使用的班次计数相匹配。

If A and B can be expressed with 2 bytes, you can combine them on 4 bytes. Put A on the most significant half and B on the least significant half.

In C language this gives (assuming sizeof(short)=2 and sizeof(int)=4):

unsigned int combine(unsigned short A, unsigned short B)
{
    return ((unsigned)A<<16) | (unsigned)B;
}

unsigned short getA(unsigned int C)
{
    return C>>16;
}

unsigned short getB(unsigned int C)
{
    return C & 0xFFFF;    // or  return (unsigned short)C;
}

Making the inputs unsigned short or uint16_t makes sure they zero-extend before you | or + them together. Otherwise negative B would set the upper bits to all-ones with OR, or subtract one from the upper half if you ADD.

Casting (unsigned)A avoids signed overflow UB in the left shift after default promotion of narrow types to signed int. And for for wider types, it's also essential to avoid shifting out bits you to keep, like ((uint64_t)A << 32 | B, since default promotion stops at int.

The (unsigned)B cast isn't necessary; the important part is that it was unsigned short B to start with. The left hand side of the | being unsigned means this will also convert to unsigned.

You can use this with signed types, at least the getA and getB, and you can return signed int from combine, but the inputs need to zero-extend so in C you need them to be unsigned short before widening. Like ((unsigned)(unsigned short)A << 16) | (unsigned short)B

You might want to use uint16_t and uint32_t, to define the type widths to match the shift counts you're using.

自此以后,行同陌路 2024-07-29 01:41:59

这可能吗?
您正在组合两个整数。 它们的范围都是 -2,147,483,648 到 2,147,483,647,但您只能取正数。
这使得 2147483647^2 = 4,61169E+18 个组合。
由于每个组合都必须是唯一的并且结果是一个整数,因此您需要某种可以包含这么多数字的神奇整数。

还是我的逻辑有问题?

Is this even possible?
You are combining two integers. They both have the range -2,147,483,648 to 2,147,483,647 but you will only take the positives.
That makes 2147483647^2 = 4,61169E+18 combinations.
Since each combination has to be unique AND result in an integer, you'll need some kind of magical integer that can contain this amount of numbers.

Or is my logic flawed?

人间不值得 2024-07-29 01:41:59

正整数的标准数学方法是利用素因数分解的唯一性。

f( x, y ) -> 2^x * 3^y

缺点是图像往往跨越相当大的整数范围,因此当涉及到用计算机算法表达映射时,您可能会在为结果选择合适的类型时遇到问题。

您可以修改它,通过对具有 5 和 7 项的幂的标志进行编码来处理负 xy

例如

f( x, y ) -> 2^|x| * 3^|y| * 5^(x<0) * 7^(y<0)

The standard mathematical way for positive integers is to use the uniqueness of prime factorization.

f( x, y ) -> 2^x * 3^y

The downside is that the image tends to span quite a large range of integers so when it comes to expressing the mapping in a computer algorithm you may have issues with choosing an appropriate type for the result.

You could modify this to deal with negative x and y by encoding a flags with powers of 5 and 7 terms.

e.g.

f( x, y ) -> 2^|x| * 3^|y| * 5^(x<0) * 7^(y<0)
淡墨 2024-07-29 01:41:59

设数字 a 为第一个,b 为第二个。 令 p 为第 a+1 个素数,q 为第 b+1 个素数

那么,如果a,结果为pq,如果a>b,结果为2pq。 如果a=b,则设为p^2

Let number a be the first, b the second. Let p be the a+1-th prime number, q be the b+1-th prime number

Then, the result is pq, if a<b, or 2pq if a>b. If a=b, let it be p^2.

终弃我 2024-07-29 01:41:59

尽管 Stephan202 的答案是唯一真正通用的答案,但对于有界范围内的整数,您可以做得更好。 例如,如果您的范围是 0..10,000,那么您可以执行以下操作:

#define RANGE_MIN 0
#define RANGE_MAX 10000

unsigned int merge(unsigned int x, unsigned int y)
{
    return (x * (RANGE_MAX - RANGE_MIN + 1)) + y;
}

void split(unsigned int v, unsigned int &x, unsigned int &y)
{
    x = RANGE_MIN + (v / (RANGE_MAX - RANGE_MIN + 1));
    y = RANGE_MIN + (v % (RANGE_MAX - RANGE_MIN + 1));
}

结果可以适合范围最大为整数类型基数平方根的单个整数。 这种打包方式比 Stephan202 更通用的方法稍微高效一些。 解码起来也相当简单; 对于初学者来说不需要平方根:)

Although Stephan202's answer is the only truly general one, for integers in a bounded range you can do better. For example, if your range is 0..10,000, then you can do:

#define RANGE_MIN 0
#define RANGE_MAX 10000

unsigned int merge(unsigned int x, unsigned int y)
{
    return (x * (RANGE_MAX - RANGE_MIN + 1)) + y;
}

void split(unsigned int v, unsigned int &x, unsigned int &y)
{
    x = RANGE_MIN + (v / (RANGE_MAX - RANGE_MIN + 1));
    y = RANGE_MIN + (v % (RANGE_MAX - RANGE_MIN + 1));
}

Results can fit in a single integer for a range up to the square root of the integer type's cardinality. This packs slightly more efficiently than Stephan202's more general method. It is also considerably simpler to decode; requiring no square roots, for starters :)

微暖i 2024-07-29 01:41:59

对于正整数作为参数且参数顺序无关紧要:

  1. 这是一个 无序配对函数

    ;   = x * y + trunc((|x - y| - 1)^2 / 4) =  
      
  2. 对于 x ≠ y,这里有一个 独特的无序配对功能

    ;   = 如果 x <   y: 
                 x * (y - 1) + trunc((y - x - 2)^2 / 4) 
               如果x>   y: 
                 (x - 1) * y + 截断((x - y - 2)^2 / 4) 
             =  
      

For positive integers as arguments and where argument order doesn't matter:

  1. Here's an unordered pairing function:

    <x, y> = x * y + trunc((|x - y| - 1)^2 / 4) = <y, x>
    
  2. For x ≠ y, here's a unique unordered pairing function:

    <x, y> = if x < y:
               x * (y - 1) + trunc((y - x - 2)^2 / 4)
             if x > y:
               (x - 1) * y + trunc((x - y - 2)^2 / 4)
           = <y, x>
    
別甾虛僞 2024-07-29 01:41:59

f(a, b) = s(a+b) + a,其中 s(n) = n*(n+1)/2

  • 这是一个函数 - - 它是确定性的。
  • 它也是单射的——f 为不同的 (a,b) 对映射不同的值。 你可以证明
    这是使用以下事实: s(a+b+1)-s(a+b) = a+b+1
    < 一个。
  • 它返回非常小的值——如果您打算将它用于数组索引,那么这很好,因为数组不必很大。
  • 它是缓存友好的——如果两个 (a, b) 对彼此接近,则 f 将数字映射到彼此接近的它们(与其他方法相比)。

我不明白你的意思:

应该总是产生一个整数
无论是积极的还是消极的
整数的一边

如何在这个论坛中写(大于)、(小于)字符?

f(a, b) = s(a+b) + a, where s(n) = n*(n+1)/2

  • This is a function -- it is deterministic.
  • It is also injective -- f maps different values for different (a,b) pairs. You can prove
    this using the fact: s(a+b+1)-s(a+b) = a+b+1
    < a
    .
  • It returns quite small values -- good if your are going to use it for array indexing, as the array does not have to be big.
  • It is cache-friendly -- if two (a, b) pairs are close to each other, then f maps numbers to them which are close to each other (compared to other methods).

I did not understand what You mean by:

should always yield an integer on
either the positive or the negative
side of integers

How can I write (greater than), (less than) characters in this forum?

淡莣 2024-07-29 01:41:59

这是基于 @nawfal 给出的方法将 @DoctorJ 的代码扩展到无界整数。 它可以编码和解码。 它适用于普通数组和 numpy 数组。

#!/usr/bin/env python
from numbers import Integral    

def tuple_to_int(tup):
    """:Return: the unique non-negative integer encoding of a tuple of non-negative integers."""
    if len(tup) == 0:  # normally do if not tup, but doesn't work with np
        raise ValueError('Cannot encode empty tuple')
    if len(tup) == 1:
        x = tup[0]
        if not isinstance(x, Integral):
            raise ValueError('Can only encode integers')
        return x
    elif len(tup) == 2:
        # print("len=2")
        x, y = tuple_to_int(tup[0:1]), tuple_to_int(tup[1:2])  # Just to validate x and y

        X = 2 * x if x >= 0 else -2 * x - 1  # map x to positive integers
        Y = 2 * y if y >= 0 else -2 * y - 1  # map y to positive integers
        Z = (X * X + X + Y) if X >= Y else (X + Y * Y)  # encode

        # Map evens onto positives
        if (x >= 0 and y >= 0):
            return Z // 2
        elif (x < 0 and y >= 0 and X >= Y):
            return Z // 2
        elif (x < 0 and y < 0 and X < Y):
            return Z // 2
        # Map odds onto negative
        else:
            return (-Z - 1) // 2
    else:
        return tuple_to_int((tuple_to_int(tup[:2]),) + tuple(tup[2:]))  # ***speed up tuple(tup[2:])?***


def int_to_tuple(num, size=2):
    """:Return: the unique tuple of length `size` that encodes to `num`."""
    if not isinstance(num, Integral):
        raise ValueError('Can only encode integers (got {})'.format(num))
    if not isinstance(size, Integral) or size < 1:
        raise ValueError('Tuple is the wrong size ({})'.format(size))
    if size == 1:
        return (num,)
    elif size == 2:

        # Mapping onto positive integers
        Z = -2 * num - 1 if num < 0 else 2 * num

        # Reversing Pairing
        s = isqrt(Z)
        if Z - s * s < s:
            X, Y = Z - s * s, s
        else:
            X, Y = s, Z - s * s - s

        # Undoing mappint to positive integers
        x = (X + 1) // -2 if X % 2 else X // 2  # True if X not divisible by 2
        y = (Y + 1) // -2 if Y % 2 else Y // 2  # True if Y not divisible by 2

        return x, y

    else:
        x, y = int_to_tuple(num, 2)
        return int_to_tuple(x, size - 1) + (y,)


def isqrt(n):
    """":Return: the largest integer x for which x * x does not exceed n."""
    # Newton's method, via http://stackoverflow.com/a/15391420
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

Here is an extension of @DoctorJ 's code to unbounded integers based on the method given by @nawfal. It can encode and decode. It works with normal arrays and numpy arrays.

#!/usr/bin/env python
from numbers import Integral    

def tuple_to_int(tup):
    """:Return: the unique non-negative integer encoding of a tuple of non-negative integers."""
    if len(tup) == 0:  # normally do if not tup, but doesn't work with np
        raise ValueError('Cannot encode empty tuple')
    if len(tup) == 1:
        x = tup[0]
        if not isinstance(x, Integral):
            raise ValueError('Can only encode integers')
        return x
    elif len(tup) == 2:
        # print("len=2")
        x, y = tuple_to_int(tup[0:1]), tuple_to_int(tup[1:2])  # Just to validate x and y

        X = 2 * x if x >= 0 else -2 * x - 1  # map x to positive integers
        Y = 2 * y if y >= 0 else -2 * y - 1  # map y to positive integers
        Z = (X * X + X + Y) if X >= Y else (X + Y * Y)  # encode

        # Map evens onto positives
        if (x >= 0 and y >= 0):
            return Z // 2
        elif (x < 0 and y >= 0 and X >= Y):
            return Z // 2
        elif (x < 0 and y < 0 and X < Y):
            return Z // 2
        # Map odds onto negative
        else:
            return (-Z - 1) // 2
    else:
        return tuple_to_int((tuple_to_int(tup[:2]),) + tuple(tup[2:]))  # ***speed up tuple(tup[2:])?***


def int_to_tuple(num, size=2):
    """:Return: the unique tuple of length `size` that encodes to `num`."""
    if not isinstance(num, Integral):
        raise ValueError('Can only encode integers (got {})'.format(num))
    if not isinstance(size, Integral) or size < 1:
        raise ValueError('Tuple is the wrong size ({})'.format(size))
    if size == 1:
        return (num,)
    elif size == 2:

        # Mapping onto positive integers
        Z = -2 * num - 1 if num < 0 else 2 * num

        # Reversing Pairing
        s = isqrt(Z)
        if Z - s * s < s:
            X, Y = Z - s * s, s
        else:
            X, Y = s, Z - s * s - s

        # Undoing mappint to positive integers
        x = (X + 1) // -2 if X % 2 else X // 2  # True if X not divisible by 2
        y = (Y + 1) // -2 if Y % 2 else Y // 2  # True if Y not divisible by 2

        return x, y

    else:
        x, y = int_to_tuple(num, 2)
        return int_to_tuple(x, size - 1) + (y,)


def isqrt(n):
    """":Return: the largest integer x for which x * x does not exceed n."""
    # Newton's method, via http://stackoverflow.com/a/15391420
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x
旧城烟雨 2024-07-29 01:41:59

检查一下:http://en.wikipedia.org/wiki/Pigeonhole_principle。 如果A、B、C是同一类型,则不能这样做。 如果A和B是16位整数,而C是32位整数,那么你可以简单地使用移位。

哈希算法的本质是它们无法为每个不同的输入提供唯一的哈希值。

Check this: http://en.wikipedia.org/wiki/Pigeonhole_principle. If A, B and C are of same type, it cannot be done. If A and B are 16-bit integers, and C is 32-bit, then you can simply use shifting.

The very nature of hashing algorithms is that they cannot provide a unique hash for each different input.

攀登最高峰 2024-07-29 01:41:59

构建映射并不难:

   1  2  3  4  5  use this mapping if (a,b) != (b,a)
1  0  1  3  6 10
2  2  4  7 11 16
3  5  8 12 17 23
4  9 13 18 24 31
5 14 19 25 32 40

   1  2  3  4  5 use this mapping if (a,b) == (b,a) (mirror)
1  0  1  2  4  6
2  1  3  5  7 10
3  2  5  8 11 14
4  4  8 11 15 19
5  6 10 14 19 24


    0  1 -1  2 -2 use this if you need negative/positive
 0  0  1  2  4  6
 1  1  3  5  7 10
-1  2  5  8 11 14
 2  4  8 11 15 19
-2  6 10 14 19 24

弄清楚如何获取任意 a、b 的值要困难一些。

It isn't that tough to construct a mapping:

   1  2  3  4  5  use this mapping if (a,b) != (b,a)
1  0  1  3  6 10
2  2  4  7 11 16
3  5  8 12 17 23
4  9 13 18 24 31
5 14 19 25 32 40

   1  2  3  4  5 use this mapping if (a,b) == (b,a) (mirror)
1  0  1  2  4  6
2  1  3  5  7 10
3  2  5  8 11 14
4  4  8 11 15 19
5  6 10 14 19 24


    0  1 -1  2 -2 use this if you need negative/positive
 0  0  1  2  4  6
 1  1  3  5  7 10
-1  2  5  8 11 14
 2  4  8 11 15 19
-2  6 10 14 19 24

Figuring out how to get the value for an arbitrary a,b is a little more difficult.

注定孤独终老 2024-07-29 01:41:59

让我们有两个数字 B 和 C ,将它们编码为单个数字 A

A = B + C * N

其中

B=A % N = B

C=A / N = C

let us have two number B and C , encoding them into single number A

A = B + C * N

where

B=A % N = B

C=A / N = C

计㈡愣 2024-07-29 01:41:59

你的建议是不可能的。 你们总会有冲突。

为了将两个对象映射到另一个集合,映射集合的最小大小必须达到预期的组合数量:

假设是 32 位整数,则有 2147483647 个正整数。 选择其中两个(顺序无关紧要且重复)会产生 2305843008139952128 个组合。 这不太适合 32 位整数集合。

不过,您可以将此映射装入 61 位。 使用 64 位整数可能是最简单的。 将高位字设置为较小的整数,将低位字设置为较大的整数。

What you suggest is impossible. You will always have collisions.

In order to map two objects to another single set, the mapped set must have a minimum size of the number of combinations expected:

Assuming a 32-bit integer, you have 2147483647 positive integers. Choosing two of these where order doesn't matter and with repetition yields 2305843008139952128 combinations. This does not fit nicely in the set of 32-bit integers.

You can, however fit this mapping in 61 bits. Using a 64-bit integer is probably easiest. Set the high word to the smaller integer and the low word to the larger one.

七婞 2024-07-29 01:41:59

更简单的事情怎么样:给定两个数字,A 和 B 让 str 成为串联:'A' + ';' +“B”。 然后让输出为hash(str)。 我知道这不是一个数学答案,但一个简单的 python(有一个内置的哈希函数)脚本应该可以完成这项工作。

How about something much simpler: Given two numbers, A and B let str be the concatenation: 'A' + ';' + 'B'. Then let the output be hash(str). I know that this is not a mathematical answer, but a simple python (which has an in built hash function) script should do the job.

旧话新听 2024-07-29 01:41:59

假设您有一个 32 位整数,为什么不将 A 移至前 16 位一半,将 B 移至另一半呢?

def vec_pack(vec):
    return vec[0] + vec[1] * 65536;


def vec_unpack(number):
    return [number % 65536, number // 65536];

除了尽可能节省空间和计算成本之外,一个非常酷的副作用是您可以对打包的数字进行向量数学运算。

a = vec_pack([2,4])
b = vec_pack([1,2])

print(vec_unpack(a+b)) # [3, 6] Vector addition
print(vec_unpack(a-b)) # [1, 2] Vector subtraction
print(vec_unpack(a*2)) # [4, 8] Scalar multiplication

Say you have a 32 bit integer, why not just move A into the first 16 bit half and B into the other?

def vec_pack(vec):
    return vec[0] + vec[1] * 65536;


def vec_unpack(number):
    return [number % 65536, number // 65536];

Other than this being as space efficient as possible and cheap to compute, a really cool side effect is that you can do vector math on the packed number.

a = vec_pack([2,4])
b = vec_pack([1,2])

print(vec_unpack(a+b)) # [3, 6] Vector addition
print(vec_unpack(a-b)) # [1, 2] Vector subtraction
print(vec_unpack(a*2)) # [4, 8] Scalar multiplication
陌上青苔 2024-07-29 01:41:59

如果您想要更多控制,例如为第一个数字分配 X 位,为第二个数字分配 Y 位,您可以使用以下代码:

class NumsCombiner
{

    int num_a_bits_size;
    int num_b_bits_size;

    int BitsExtract(int number, int k, int p)
    {
        return (((1 << k) - 1) & (number >> (p - 1)));
    }

public:
    NumsCombiner(int num_a_bits_size, int num_b_bits_size)
    {
        this->num_a_bits_size = num_a_bits_size;
        this->num_b_bits_size = num_b_bits_size;
    }

    int StoreAB(int num_a, int num_b)
    {
        return (num_b << num_a_bits_size) | num_a;
    }

    int GetNumA(int bnum)
    {
        return BitsExtract(bnum, num_a_bits_size, 1);
    }

    int GetNumB(int bnum)
    {
        return BitsExtract(bnum, num_b_bits_size, num_a_bits_size + 1);
    }
};

我总共使用 32 位。 这里的想法是,例如,如果您希望第一个数字最多为 10 位,第二个数字最多为 12 位,您可以这样做:

NumsCombiner nums_mapper(10/*bits for first number*/, 12/*bits for second number*/);

现在您可以在 num_a 中存储最大数字即 2^10 - 1 = 1023,在 num_b 中,最大值为 2^12 - 1 = 4095

设置 num A 和 num B 的值:

int bnum = nums_mapper.StoreAB(10/*value for a*/, 12 /*value from b*/);

现在 bnum 是所有位(总共 32 位。您可以修改代码以使用 64 位)
获取数字 a:

int a = nums_mapper.GetNumA(bnum);

获取数字 b:

int b = nums_mapper.GetNumB(bnum);

编辑:
bnum 可以存储在类内部。 我不是因为我自己的需要才这么做的
我分享了代码,希望对您有所帮助。

感谢您提供来源:
https://www.geeksforgeeks.org/extract-k-bits -给定位置编号/
用于提取位的函数,也感谢本文中的 mouviciel 回答。
使用这些资源我可以找出更高级的解决方案

If you want more control such as allocate X bits for the first number and Y bits for the second number, you can use this code:

class NumsCombiner
{

    int num_a_bits_size;
    int num_b_bits_size;

    int BitsExtract(int number, int k, int p)
    {
        return (((1 << k) - 1) & (number >> (p - 1)));
    }

public:
    NumsCombiner(int num_a_bits_size, int num_b_bits_size)
    {
        this->num_a_bits_size = num_a_bits_size;
        this->num_b_bits_size = num_b_bits_size;
    }

    int StoreAB(int num_a, int num_b)
    {
        return (num_b << num_a_bits_size) | num_a;
    }

    int GetNumA(int bnum)
    {
        return BitsExtract(bnum, num_a_bits_size, 1);
    }

    int GetNumB(int bnum)
    {
        return BitsExtract(bnum, num_b_bits_size, num_a_bits_size + 1);
    }
};

I use 32 bits in total. The idea here is that if you want for example that first number will be up to 10 bits and second number will be up to 12 bits, you can do this:

NumsCombiner nums_mapper(10/*bits for first number*/, 12/*bits for second number*/);

Now you can store in num_a the maximum number that is 2^10 - 1 = 1023 and in num_b naximum value of 2^12 - 1 = 4095.

To set value for num A and num B:

int bnum = nums_mapper.StoreAB(10/*value for a*/, 12 /*value from b*/);

Now bnum is all of the bits (32 bits in total. You can modify the code to use 64 bits)
To get num a:

int a = nums_mapper.GetNumA(bnum);

To get num b:

int b = nums_mapper.GetNumB(bnum);

EDIT:
bnum can be stored inside the class. I did not did it because my own needs
I shared the code and hope that it will be helpful.

Thanks for source:
https://www.geeksforgeeks.org/extract-k-bits-given-position-number/
for function to extract bits and thanks also to mouviciel answer in this post.
Using these to sources I could figure out more advanced solution

贩梦商人 2024-07-29 01:41:59

我们可以在 O(1) 空间和 O(N) 时间内将两个数字编码为一个。
假设您要将 0-9 范围内的数字编码为 1,例如: 5、6.怎么做? 很简单,

  5*10 + 6 = 56. 
   
    5 can be obtained by doing 56/10 
    6 can be obtained by doing 56%10.

即使对于两位整数,比如 56 和 45,56*100 + 45 = 5645。我们可以通过执行 5645/100 和 5645%100 再次获得单个数字,

但是对于大小为 n 的数组,例如: a = {4,0,2,1,3},假设我们要对 3 和 4 进行编码,因此:

 3 * 5 + 4 = 19               OR         3 + 5 * 4 = 23
 3 :- 19 / 5 = 3                         3 :- 23 % 5 = 3
 4 :- 19 % 5 = 4                         4 :- 23 / 5 = 4

推广后,我们得到

    x * n + y     OR       x + n * y

但是我们还需要注意我们更改的值; 所以它最终是

    (x%n)*n + y  OR x + n*(y%n)

你可以通过除法并找到结果数字的模来单独获得每个数字。

We can encode two numbers into one in O(1) space and O(N) time.
Suppose you want to encode numbers in the range 0-9 into one, eg. 5 and 6. How to do it? Simple,

  5*10 + 6 = 56. 
   
    5 can be obtained by doing 56/10 
    6 can be obtained by doing 56%10.

Even for two digit integer let's say 56 and 45, 56*100 + 45 = 5645. We can again obtain individual numbers by doing 5645/100 and 5645%100

But for an array of size n, eg. a = {4,0,2,1,3}, let's say we want to encode 3 and 4, so:

 3 * 5 + 4 = 19               OR         3 + 5 * 4 = 23
 3 :- 19 / 5 = 3                         3 :- 23 % 5 = 3
 4 :- 19 % 5 = 4                         4 :- 23 / 5 = 4

Upon generalising it, we get

    x * n + y     OR       x + n * y

But we also need to take care of the value we changed; so it ends up as

    (x%n)*n + y  OR x + n*(y%n)

You can obtain each number individually by dividing and finding mod of the resultant number.

云柯 2024-07-29 01:41:59

对于新手,我想通过指出以下参考文献(那里的公式 2.1)来补充 @nawfal 的精彩答案:

The Rosenberg-Strong Pairing Function

数学公式如下:给定 a, b 两个非负整数,我们感兴趣的是函数 f,使得 f(a, b) = f(a', b') == > a=a',b=b'。

上述参考文献提供了 Rosenberg-Strong 函数 $f$ 定义如下:

f(a, b) = (max(a,b))^2 + max(a, b) + a - b

这看起来比 Cantor 的更经济和 Szudzik 的(查看参考文献中的图 2 以了解编码的工作原理!枚举看起来是最佳的)。

请注意,Szudzik 撰写了这篇论文,但并未将他自己的优雅解决方案放入本文中。

For newcomers, I would like to add to @nawfal's brilliant answer by pointing to the following reference (equation 2.1 there):

The Rosenberg-Strong Pairing Function

The mathematical formulation is the following: given a, b two nonnegative integers, we are interested in a function f such that f(a, b) = f(a', b') ==> a=a', b=b'.

The above references provide a Rosenberg-Strong function $f$ defined as follows:

f(a, b) = (max(a,b))^2 + max(a, b) + a - b

This looks more economical than Cantor's and Szudzik's (check out Figure 2 in the reference to see how the encoding works! The enumeration looks optimal).

Note that Szudzik wrote this paper but did not put his own elegant solution in this paper.

我不在是我 2024-07-29 01:41:59

给定正整数 A 和 B,令 D = A 的位数,E = B 的位数
结果可以是 D、0、E、0、A 和 B 的串联。

示例:A = 300,B = 12。D = 3,E=2 结果 = 302030012。
这利用了唯一以 0 开头的数字是 0 的事实,

优点:易于编码,易于解码,人类可读,可以首先比较有效数字,无需计算即可进行比较的潜力,简单的错误检查。

缺点:结果的大小是一个问题。 但这没关系,为什么我们要在计算机中存储无界整数呢?

Given positive integers A and B, let D = number of digits A has, and E=number of digits B has
The result can be a concatenation of D, 0, E, 0, A, and B.

Example: A = 300, B = 12. D = 3, E=2 result = 302030012.
This takes advantage of the fact that the only number that starts with 0, is 0,

Pro: Easy to encode, easy to decode, human readable, significant digits can be compared first, potential for compare without calculation, simple error checking.

Cons: Size of results is an issue. But that's ok, why are we storing unbounded integers in a computer anyways.

深巷少女 2024-07-29 01:41:59

要将两个任意大小的非负整数合并为一个,只需将它们的数字交错即可。

示例:12345 和 678:

1 2 3 4 5
     6 7 8
----------
1020364758

结果是 1020364758。

您可以使用任何基数(例如基数 2 或基数 16)执行此操作。 结果最多是最大数字的两倍,这意味着例如组合两个 16 位数字将始终适合 32 位。

如果我们还想允许负整数,我们可以通过将绝对值乘以 2 来编码其中之一或两者,然后如果数字为负则加 1。 解码时,模 2 将表示符号。

我们可以将此方法扩展到小数,例如,如果我们想要组合 1234.5 和 6.78:

1 2 3 4 .5
       6. 7 8
-------- ----
10203046.5708

结果是 10203046.5708。 (这在编程中可能不是很有用,除非您使用定点数。)

请注意我们如何利用整数部分中的前导零无关紧要,就像小数部分中的尾随零无关紧要一样。

To combine two non-negative integers of any size into one, you can simply interleave their digits.

Example: 12345 and 678:

1 2 3 4 5
     6 7 8
----------
1020364758

The result is 1020364758.

You can do this with any base (e.g. base 2 or base 16). The result will be at most twice as big as the biggest number, which means that e.g. combining two 16-bit numbers will always fit in 32-bit.

If we wanted to also allow negative integers, we could encode one or both by multiplying the absolute value by 2, and then adding 1 if the number was negative. When decoding, the modulus of 2 would then indicate the sign.

We can extend this approach to decimals, for example if we wanted to combine 1234.5 and 6.78:

1 2 3 4 .5
       6. 7 8
-------- ----
10203046.5708

The result is 10203046.5708. (This is probably not very useful in programming, unless you use fixed-point numbers.)

Note how we take advantage that leading zeroes are insignificant in the integer part in the same way that trailing zeroes are insignificant in the fractional part.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文