Is it possible to recover the original value of a number after several multiplications with overflow?

Posted on 2024-11-05 19:06:38

Summary: Suppose I have an unsigned int. I multiply it several times, and it overflows, which is expected. Is it possible to recover the original value?

In detail:

This is all about the Rabin-Karp rolling hash. I have the hash of a long string, for example "abcd", and the hash of a shorter substring at its end, for example "cd". How can I compute the hash of "ab" in O(1), using only the two given hashes?

My current algorithm is:

subtract the "cd" hash from the "abcd" hash (this removes the last terms of the polynomial)
divide the result by p ^ len( "cd" ), where p is the base (a prime number).

So we have:

hash( "abcd" ) = a * p ^ 3 + b * p ^ 2 + c * p ^ 1 + d * p ^ 0

hash( "cd" ) = c * p ^ 1 + d * p ^ 0

and the hash of "ab" is:

( 
  ( a * p ^ 3 + b * p ^ 2 + c * p ^ 1 + d * p ^ 0 ) -
  ( c * p ^ 1 + d * p ^ 0 ) 
)
/ ( p ^ 2 )
= a * p ^ 1 + b * p ^ 0

This works as long as nothing overflows (that is, if p is a small number). Once the hash overflows, the division no longer gives the right result.
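To make the failure concrete, here is a minimal sketch of the subtract-then-divide approach with a 32-bit hash. The helper names poly_hash and naive_prefix_hash, and the bases 31 and 1000003, are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Polynomial hash; unsigned overflow wraps, so this is arithmetic mod 2^32.
std::uint32_t poly_hash(const std::string& s, std::uint32_t p) {
    std::uint32_t h = 0;
    for (unsigned char c : s) {
        h = h * p + c;
    }
    return h;
}

// The approach from the question: subtract the suffix hash, then use
// ordinary integer division by p ^ len(suffix).
std::uint32_t naive_prefix_hash(std::uint32_t whole, std::uint32_t suffix,
                                std::uint32_t p_pow) {
    assert(p_pow != 0);  // guard against dividing by zero
    return (whole - suffix) / p_pow;
}
```

With p = 31 nothing wraps for "abcd", and naive_prefix_hash returns the hash of "ab". With p = 1000003 the hash of "abcd" wraps modulo 2^32, and the same call returns something else.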

Is there a trick for this?

P.S. I added the c++ tag because the overflow behaviour is language-specific (and different from Python, Scheme, etc.).

聚集的泪 2024-11-12 19:06:38

I don't know about the overflow part, but there is a way to get the original value back.

The Chinese Remainder Theorem helps a great deal. Let h = abcd - cd, computed with overflow, and let G be the same value computed without overflow, so G = h + k*2^32 for some integer k (assuming the overflow simply reduces modulo 2^32). Then ab = G / p^2, and G satisfies:

G = h (mod 2^32)
G = 0 (mod p^2)

If p^2 and 2^32 are coprime (that is, if p is odd), the Chinese Remainder Theorem gives us

G = h * b * p^2 (mod 2^32 * p^2)

where b is the modular multiplicative inverse of p^2 modulo 2^32, i.e. b * p^2 = 1 (mod 2^32). After you calculate G, simply divide by p^2 to find ab; equivalently, ab = h * b (mod 2^32).
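As a rough sketch of the result ab = h * b (mod 2^32), the code below recovers the prefix hash with a 32-bit hash type. The function names are hypothetical, and the inverse is computed with Newton's iteration, which is a technique swapped in here for brevity and is not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Polynomial hash; unsigned overflow wraps, so this is arithmetic mod 2^32.
std::uint32_t poly_hash(const std::string& s, std::uint32_t p) {
    std::uint32_t h = 0;
    for (unsigned char c : s) {
        h = h * p + c;
    }
    return h;
}

// Inverse of an odd number modulo 2^32 (Newton's iteration, an assumption
// of this sketch; any other way of finding the inverse works as well).
std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    assert(a % 2 == 1);       // only odd numbers are invertible mod 2^32
    std::uint32_t x = a;      // a * a == 1 (mod 8), so x is correct to 3 bits
    for (int i = 0; i < 4; ++i) {
        x *= 2u - a * x;      // each step doubles the number of correct bits
    }
    return x;
}

// hash(prefix) from hash(whole) and hash(suffix), all modulo 2^32.
std::uint32_t prefix_hash(std::uint32_t whole, std::uint32_t suffix,
                          std::uint32_t p, std::size_t suffix_len) {
    std::uint32_t p_pow = 1;
    for (std::size_t i = 0; i < suffix_len; ++i) {
        p_pow *= p;           // p ^ suffix_len, wrapped modulo 2^32
    }
    std::uint32_t h = whole - suffix;       // h = G (mod 2^32)
    return h * inverse_mod_2_32(p_pow);     // h * b = G / p^len (mod 2^32)
}
```

For O(1) queries, the powers of p and their inverses would be precomputed and cached rather than rebuilt on every call as this sketch does.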

后知后觉 2024-11-12 19:06:38

The extended Euclidean algorithm is a good solution for this, but it is complicated and hard to implement. There is a simpler way (thanks to a friend of mine (: ).

There's a nice article on Wikipedia about computing the modular multiplicative inverse using Euler's theorem, for the case when m and a are coprime:

a ^ φ(m) = 1 (mod m), and therefore a ^ (-1) = a ^ ( φ(m) - 1 ) (mod m)

where φ(m) is Euler's totient function.

In my case, m (the modulus) is the size of the hash type: 2^32, 2^64, etc. (64-bit in my case).
This means we only need to find the value of φ(m). But m == 2 ^ 64, which guarantees that m is coprime with every odd number and NOT coprime with any even number. So φ(m) is the number of all values divided by 2, i.e. 2 ^ 63.
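Spelled out for the 64-bit case, the reasoning above amounts to the following short derivation (a sketch of the arithmetic only, valid for odd a):

```latex
\varphi(2^{64}) = 2^{64} - 2^{63} = 2^{63},
\qquad
a^{2^{63}} \equiv 1 \pmod{2^{64}}
\;\Longrightarrow\;
a^{-1} \equiv a^{2^{63} - 1} \pmod{2^{64}},
\qquad
\left\lfloor \frac{2^{64} - 1}{2} \right\rfloor = 2^{63} - 1 .
```

So the exponent needed for the inverse is the largest value of the unsigned 64-bit type divided by 2.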

Also, we know that the hash type is unsigned, as otherwise we would have some issues. This lets us compute the exponent φ(m) - 1 like this:

hash_t x = -1;   // all bits set: 2^64 - 1
x /= 2;          // 2^63 - 1 == φ(2^64) - 1
hash_t a_reverse = fast_pow( a, x );

For 64-bit numbers, x is a really big number (19 digits: 9 223 372 036 854 775 807), but fast_pow is really fast, and we can cache the inverse in case we need it for more than one query.

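fast_pow is a well-known algorithm (exponentiation by squaring):

// hash_t is an unsigned 64-bit type here, e.g. std::uint64_t from <cstdint>.
// Returns source ^ pow; unsigned overflow wraps, so the result is modulo 2^64.
hash_t fast_pow( hash_t source, hash_t pow )
{
    if( 0 == pow )
    {
        return 1;   // anything to the power 0 is 1
    }

    if( 0 != pow % 2 )
    {
        // odd exponent: peel off one factor
        return source * fast_pow( source, pow - 1 );
    }
    else
    {
        // even exponent: square the base and halve the exponent
        return fast_pow( source * source, pow / 2 );
    }
}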

Addendum, an example:

    hash_t base = 2305843009213693951;  // 9th Mersenne prime, 2^61 - 1
    hash_t x = 1234567890987654321;

    x *= fast_pow( base, 123456789 );   // x * ( base ^ 123456789 )

    hash_t y = -1;
    y /= 2;                             // 2^63 - 1
    hash_t base_reverse = fast_pow( base, y );  // inverse of base modulo 2^64

    x *= fast_pow( base_reverse, 123456789 );   // x * ( base_reverse ^ 123456789 )
    assert( x == 1234567890987654321 );

This works perfectly and is very fast.
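If recursion is a concern, the same exponentiation can be written as a loop. This is only a sketch of an alternative, assuming hash_t is std::uint64_t; the names fast_pow_iter and inverse_mod_2_64 are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t hash_t;  // assumption: 64-bit unsigned hash type

// Square-and-multiply in loop form; arithmetic wraps modulo 2^64.
hash_t fast_pow_iter(hash_t base, hash_t exp) {
    hash_t result = 1;
    while (exp != 0) {
        if (exp & 1) {
            result *= base;   // this bit of the exponent is set
        }
        base *= base;         // move on to the next power of two
        exp >>= 1;
    }
    return result;
}

// Inverse of an odd value modulo 2^64 via Euler's theorem: a ^ (2^63 - 1).
hash_t inverse_mod_2_64(hash_t a) {
    assert(a % 2 == 1);       // even values have no inverse modulo 2^64
    return fast_pow_iter(a, ~hash_t(0) / 2);
}
```

The loop runs once per bit of the exponent, so at most 64 iterations for a 64-bit hash_t.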

謸气贵蔟 2024-11-12 19:06:38

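You have a * b = c mod 2^32 (or mod something else, depending on how you do your hashing). If you can find d such that b * d = 1 mod 2^32 (or mod whatever), then you can compute c * d = a * b * d = a
and retrieve a. If gcd(b, 2^32) = 1, then you can use the extended Euclidean algorithm (http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm) to find x and y such that b * x + 2^32 * y = 1, or
b * x = 1 - y * 2^32, or
b * x = 1 mod 2^32, so x is the number you want to multiply by.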
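A minimal sketch of that computation for the modulus 2^32 might look like this. The function name inverse_by_euclid is hypothetical, and 64-bit signed intermediates are used so that the coefficients cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Finds x with b * x = 1 (mod 2^32) using the extended Euclidean algorithm.
std::uint32_t inverse_by_euclid(std::uint32_t b) {
    assert(b % 2 == 1);                       // gcd(b, 2^32) must be 1
    const std::int64_t m = std::int64_t(1) << 32;
    std::int64_t r0 = m, r1 = b;              // remainders
    std::int64_t x0 = 0, x1 = 1;              // invariant: r_i = x_i * b (mod m)
    while (r1 != 0) {
        std::int64_t q = r0 / r1;
        std::int64_t r2 = r0 - q * r1;
        std::int64_t x2 = x0 - q * x1;
        r0 = r1; r1 = r2;
        x0 = x1; x1 = x2;
    }
    // Now r0 == gcd(b, m) == 1 and x0 * b = 1 (mod m); x0 may be negative.
    return static_cast<std::uint32_t>((x0 % m + m) % m);
}
```

Given c = a * b computed with 32-bit wraparound, multiplying c by inverse_by_euclid(b) in the same unsigned type gives back a.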
