How can I safely average two unsigned ints in C++?

Posted on 2024-09-24 23:52:58

Using integer math alone, I'd like to "safely" average two unsigned ints in C++.

What I mean by "safely" is avoiding overflows (and anything else that can be thought of).

For instance, averaging 200 and 5000 is easy:

unsigned int a = 200;
unsigned int b = 5000;
unsigned int average = (a + b) / 2; // Equals: 2600 as intended

But in the case of 4294967295 and 5000 then:

unsigned int a = 4294967295;
unsigned int b = 5000;
unsigned int average = (a + b) / 2; // Equals: 2499 instead of 2147486147

The best I've come up with is:

unsigned int a = 4294967295;
unsigned int b = 5000;
unsigned int average = (a / 2) + (b / 2); // Equals: 2147486147 as expected

Are there better ways?

Comments (10)

一绘本一梦想 2024-10-01 23:52:58

Your last approach seems promising. You can improve on that by manually considering the lowest bits of a and b:

unsigned int average = (a / 2) + (b / 2) + (a & b & 1);

This gives the correct results in case both a and b are odd.
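One way to convince yourself of this is to compare the expression against a reference computed in a wider type. The following is only a minimal sketch: the names `average_floor` and `reference_average` are made up for illustration, and it assumes `unsigned int` is 32 bits wide, as in the question.

```cpp
#include <cstdint>

// Floor of (a + b) / 2 without ever forming a + b.
// a / 2 + b / 2 drops one half from each odd operand; when both are odd the
// two dropped halves add up to 1, which (a & b & 1) puts back.
constexpr unsigned int average_floor(unsigned int a, unsigned int b) {
    return (a / 2) + (b / 2) + (a & b & 1u);
}

// Reference result computed in 64 bits (assumption: unsigned int is 32 bits,
// so the 64-bit sum cannot overflow).
constexpr unsigned int reference_average(unsigned int a, unsigned int b) {
    return static_cast<unsigned int>((static_cast<std::uint64_t>(a) + b) / 2);
}

static_assert(average_floor(3u, 3u) == reference_average(3u, 3u), "both odd");
static_assert(average_floor(4294967295u, 5000u) == 2147486147u, "no overflow");
```

Because both functions are `constexpr`, the checks run at compile time and cost nothing at run time.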

何止钟意 2024-10-01 23:52:58

If you know ahead of time which one is higher, a very efficient way is possible. Otherwise you're better off using one of the other strategies, instead of conditionally swapping to use this.

unsigned int average = low + ((high - low) / 2);

Here's a related article: http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
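The ordering requirement is usually met for free inside a binary search, which is the setting the linked article discusses. Below is a minimal sketch of both uses; `average_ordered` and `find_sorted` are hypothetical names chosen for this example, not anything from the article.

```cpp
#include <cstddef>
#include <vector>

// Precondition: low <= high. Then high - low cannot wrap, and adding half of
// it to low cannot go past high, so no intermediate value overflows.
constexpr unsigned int average_ordered(unsigned int low, unsigned int high) {
    return low + (high - low) / 2;
}

// Hypothetical helper: index of key in the sorted vector v, or v.size() if
// key is not present.
std::size_t find_sorted(const std::vector<int>& v, int key) {
    std::size_t low = 0, high = v.size();  // search the half-open range [low, high)
    while (low < high) {
        const std::size_t mid = low + (high - low) / 2;  // low <= high holds here
        if (v[mid] < key) {
            low = mid + 1;
        } else if (key < v[mid]) {
            high = mid;
        } else {
            return mid;
        }
    }
    return v.size();
}
```

If the caller cannot guarantee `low <= high`, the result wraps around silently, which is why this form only fits when the order is already known.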

°如果伤别离去 2024-10-01 23:52:58

Your method is not correct if both numbers are odd: e.g. for 5 and 7 the average is 6, but your method #3 returns 5.

Try this:

average = (a>>1) + (b>>1) + (a & b & 1)

with math operators only:

average = a/2 + b/2 + (a%2) * (b%2)
泅人 2024-10-01 23:52:58

And the correct answer is...

(A&B)+((A^B)>>1)
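
The reasoning behind this expression is not spelled out above, so here is a short sketch of why it works. The function name `average_bits` is an assumption for illustration, and the second check assumes a 32-bit `unsigned int`.

```cpp
// Identity: a + b == 2 * (a & b) + (a ^ b)
//   a & b keeps the bit positions where both operands have a 1; each such
//         position contributes to the sum twice.
//   a ^ b keeps the positions where exactly one operand has a 1; each such
//         position contributes once.
// Halving both sides gives floor((a + b) / 2) == (a & b) + ((a ^ b) >> 1),
// and neither term on the right can overflow.
constexpr unsigned int average_bits(unsigned int a, unsigned int b) {
    return (a & b) + ((a ^ b) >> 1);
}

static_assert(average_bits(5u, 7u) == 6u, "both odd");
static_assert(average_bits(4294967295u, 5000u) == 2147486147u, "no overflow");
```

Unlike `low + (high - low) / 2`, this form does not care which operand is larger.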
抽个烟儿 2024-10-01 23:52:58

If you don't mind a little x86 inline assembly (GNU C syntax), you can take advantage of supercat's suggestion to use rotate-with-carry after an add to put the high 32 bits of the full 33-bit result into a register.

Of course, you usually should mind using inline-asm, because it defeats some optimizations (https://gcc.gnu.org/wiki/DontUseInlineAsm). But here we go anyway:

// works for 64-bit long as well on x86-64, and doesn't depend on calling convention
unsigned average(unsigned x, unsigned y)
{
    unsigned result;
    asm("add   %[x], %[res]\n\t"
        "rcr   %[res]"
        : [res] "=r" (result)   // output
        : [y] "%0"(y),  // input: in the same reg as results output.  Commutative with next operand
          [x] "rme"(x)  // input: reg, mem, or immediate
        :               // no clobbers.  ("cc" is implicit on x86)
    );
    return result;
}

The % modifier to tell the compiler the args are commutative doesn't actually help make better asm in the case I tried, calling the function with y being a constant or pointer-deref (memory operand). Probably using a matching constraint for an output operand defeats that, since you can't use it with read-write operands.

As you can see on the Godbolt compiler explorer, this compiles correctly, and so does a version where we change the operands to unsigned long, with the same inline asm. clang3.9 makes a mess of it, though, and decides to use the "m" option for the "rme" constraint, so it stores to memory and uses a memory operand.


RCR-by-one is not too slow, but it's still 3 uops on Skylake, with 2 cycle latency. It's great on AMD CPUs, where RCR has single-cycle latency. (Source: Agner Fog's instruction tables; see also the x86 tag wiki for performance links.) It's still better than @sellibitze's version, but worse than @Sheldon's order-dependent version. (See code on Godbolt)

But remember that inline-asm defeats optimizations like constant-propagation, so any pure-C++ version will be better in that case.
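For comparison, the same idea of keeping the 33rd bit can be written in portable C++ by widening before the add. This is a sketch, not a claim about what any particular compiler emits for it; `average_widened` is a name invented here, and it relies on a 64-bit integer type being available.

```cpp
#include <cstdint>

// Form the full 33-bit sum in a 64-bit integer, then shift right by one so
// the carry bit ends up as the top bit of the 32-bit result.
constexpr std::uint32_t average_widened(std::uint32_t x, std::uint32_t y) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(x) + y) >> 1);
}

static_assert(average_widened(4294967295u, 5000u) == 2147486147u,
              "the carry out of bit 31 is not lost");
```

Being ordinary C++, this version stays visible to constant propagation, which the inline-asm version does not. It does not extend to the widest integer type, where no wider type exists to hold the carry.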

离旧人 2024-10-01 23:52:58

What you have is fine, with the minor detail that it will claim that the average of 3 and 3 is 2. I'm guessing that you don't want that; fortunately, there's an easy fix:

unsigned int average = a/2 + b/2 + (a & b & 1);

This just bumps the average back up in the case that both divisions were truncated.

意中人 2024-10-01 23:52:58

In C++20, you can use std::midpoint:

template <class T>
constexpr T midpoint(T a, T b) noexcept;

The paper P0811R3 that introduced std::midpoint recommended this snippet (slightly adapted to work with C++11):

#include <type_traits>

template <typename Integer>
constexpr Integer midpoint(Integer a, Integer b) noexcept {
  using U = typename std::make_unsigned<Integer>::type;  // typename is required for the dependent type before C++20
  return a>b ? a-(U(a)-b)/2 : a+(U(b)-a)/2;
}

For completeness, here is the unmodified C++20 implementation from the paper:

constexpr Integer midpoint(Integer a, Integer b) noexcept {
  using U = make_unsigned_t<Integer>;
  return a>b ? a-(U(a)-b)/2 : a+(U(b)-a)/2;
}
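
One property of this formula that is easy to miss: when the exact midpoint is not an integer, the result rounds toward the first argument rather than always rounding down, so swapping the arguments can change the answer by one. The sketch below repeats the C++11 variant under the made-up name `midpoint_sketch` (to avoid clashing with `std::midpoint`) and checks that behaviour at compile time.

```cpp
#include <type_traits>

template <typename Integer>
constexpr Integer midpoint_sketch(Integer a, Integer b) noexcept {
    // The difference is computed in the unsigned type, so it cannot overflow
    // even for signed arguments with opposite signs.
    using U = typename std::make_unsigned<Integer>::type;
    return a > b ? a - (U(a) - b) / 2 : a + (U(b) - a) / 2;
}

// Exact midpoint of 0 and 3 is 1.5: the result depends on argument order.
static_assert(midpoint_sketch(0u, 3u) == 1u, "rounds toward a == 0");
static_assert(midpoint_sketch(3u, 0u) == 2u, "rounds toward a == 3");
// Signed arguments: exact midpoint of -3 and 4 is 0.5.
static_assert(midpoint_sketch(-3, 4) == 0, "rounds toward a == -3");
static_assert(midpoint_sketch(4, -3) == 1, "rounds toward a == 4");
```

The other answers in this thread always round down, so their results can differ from this one by one when the sum is odd.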
极致的悲 2024-10-01 23:52:58

If the code is for an embedded micro, and if speed is critical, assembly language may be helpful. On many microcontrollers, the result of the add would naturally go into the carry flag, and instructions exist to shift it back into a register. On an ARM, the average operation (source and dest. in registers) could be done in two instructions; any C-language equivalent would likely yield at least 5, and probably a fair bit more than that.

Incidentally, on machines with shorter word sizes, the differences can be even more substantial. On an 8-bit PIC-18 series, averaging two 32-bit numbers would take twelve instructions. Doing the shifts, add, and correction, would take 5 instructions for each shift, eight for the add, and eight for the correction, so 26 (not quite a 2.5x difference, but probably more significant in absolute terms).

生活了然无味 2024-10-01 23:52:58
    int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    decimal avg = 0;
    for (int i = 0; i < array.Length; i++){
        avg = (array[i] - avg) / (i+1) + avg;
    }

expects avg == 5.0 for this test
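
The snippet above is not C++ (`decimal` and `array.Length` suggest C#), and it averages a whole sequence using non-integer arithmetic rather than two unsigned ints. A rough C++ equivalent of the same incremental-mean update might look like the sketch below; the function name `running_mean` and the choice of `double` are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Incremental mean: after processing k values, avg holds the mean of those k
// values, so the running total of all elements is never formed.
double running_mean(const std::vector<int>& values) {
    double avg = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        avg += (values[i] - avg) / static_cast<double>(i + 1);
    }
    return avg;
}
```

Called with the values 1 through 9 it returns 5.0. Because it works in floating point, it does not meet the question's "integer math only" requirement.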

苍景流年 2024-10-01 23:52:58

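((((a&b) << 1) + (a^b)) >> 1) is also a nice way, provided the addition is carried out in a type wider than a and b: ((a&b) << 1) + (a^b) is equal to a + b, so in the operands' own width it overflows exactly as (a + b) / 2 does. Note the parentheses around a&b; without them, a&b << 1 parses as a & (b << 1).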

Courtesy: http://www.ragestorm.net/blogs/?p=29
