What does the restrict keyword mean in C++?

Posted on 2024-07-17 07:31:50 · 99 words · 5 views · 0 comments

I was always unsure; what does the restrict keyword mean in C++?

Does it mean that the two or more pointers given to the function do not overlap?
What else does it mean?

Comments (7)

未蓝澄海的烟 2024-07-24 07:31:50

As others said, it means nothing as of C++14, so let's consider the __restrict__ GCC extension which does the same as the C99 restrict.

C99

restrict says that two pointers cannot point to overlapping memory regions. The most common usage is for function arguments.

This restricts how the function can be called, but allows for more compiler optimizations.

If the caller does not follow the restrict contract, undefined behavior can occur.
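To make the contract concrete, here is a minimal sketch using the GCC-style __restrict__ spelling in C++. The function names add_into and sum_after_valid_call are made up for illustration and do not come from the answer.

```cpp
#include <cstddef>

// Hypothetical example. __restrict__ is a compiler extension (GCC/Clang spelling),
// not standard C++. The promise made by the caller: dst and src do not overlap.
void add_into(int *__restrict__ dst, const int *__restrict__ src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}

// A call that honours the contract: two distinct arrays.
int sum_after_valid_call() {
    int a[3] = {1, 2, 3};
    int b[3] = {10, 20, 30};
    add_into(a, b, 3);         // fine: a and b are separate objects
    // add_into(a, a + 1, 2);  // would break the contract: undefined behavior
    return a[0] + a[1] + a[2];
}
```

The commented-out call is the kind of usage the contract forbids: a[1] would be written through dst and read through src.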

The C99 N1256 draft 6.7.3/7 "Type qualifiers" says:

The intended use of the restrict qualifier (like the register storage class) is to promote optimization, and deleting all instances of the qualifier from all preprocessing translation units composing a conforming program does not change its meaning (i.e., observable behavior).

and 6.7.3.1 "Formal definition of restrict" gives the gory details.

A possible optimization

The Wikipedia example is very illuminating.

It clearly shows how restrict allows the compiler to save one assembly instruction.

Without restrict:

void f(int *a, int *b, int *x) {
  *a += *x;
  *b += *x;
}

Pseudo assembly:

load R1 ← *x    ; Load the value of x pointer
load R2 ← *a    ; Load the value of a pointer
add R2 += R1    ; Perform Addition
set R2 → *a     ; Update the value of a pointer
; Similarly for b, note that x is loaded twice,
; because x may point to a (a aliased by x) thus 
; the value of x will change when the value of a
; changes.
load R1 ← *x
load R2 ← *b
add R2 += R1
set R2 → *b

With restrict:

void fr(int *restrict a, int *restrict b, int *restrict x);

Pseudo assembly:

load R1 ← *x
load R2 ← *a
add R2 += R1
set R2 → *a
; Note that x is not reloaded,
; because the compiler knows it is unchanged
; "load R1 ← *x" is no longer needed.
load R2 ← *b
add R2 += R1
set R2 → *b

Does GCC really do it?

g++ 4.8 Linux x86-64:

g++ -g -std=gnu++98 -O0 -c main.cpp
objdump -S main.o

With -O0, they are the same.

With -O3:

void f(int *a, int *b, int *x) {
    *a += *x;
   0:   8b 02                   mov    (%rdx),%eax
   2:   01 07                   add    %eax,(%rdi)
    *b += *x;
   4:   8b 02                   mov    (%rdx),%eax
   6:   01 06                   add    %eax,(%rsi)  

void fr(int *__restrict__ a, int *__restrict__ b, int *__restrict__ x) {
    *a += *x;
  10:   8b 02                   mov    (%rdx),%eax
  12:   01 07                   add    %eax,(%rdi)
    *b += *x;
  14:   01 06                   add    %eax,(%rsi) 

For the uninitiated, the calling convention is:

  • rdi = first parameter
  • rsi = second parameter
  • rdx = third parameter

GCC output was even clearer than the wiki article: 4 instructions vs 3 instructions.
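For anyone who wants to reproduce the disassembly, the source file could look roughly like the sketch below. This is a reconstruction from the objdump listing, not the exact main.cpp that was compiled.

```cpp
// main.cpp (reconstructed sketch): compile with, e.g., g++ -O3 -c main.cpp

// Without restrict: *x must be reloaded, because x may alias a.
void f(int *a, int *b, int *x) {
    *a += *x;
    *b += *x;
}

// With the GCC extension: the compiler may keep *x in a register,
// because the caller promises that a, b and x do not alias.
void fr(int *__restrict__ a, int *__restrict__ b, int *__restrict__ x) {
    *a += *x;
    *b += *x;
}
```

Both functions give the same result when the three pointers refer to distinct objects. Only f has defined behavior when x aliases a.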

Arrays

So far we have single instruction savings, but if the pointers represent arrays to be looped over, a common use case, then a bunch of instructions could be saved, as mentioned by supercat and michael.

Consider for example:

void f(char *restrict p1, char *restrict p2, size_t size) {
    for (size_t i = 0; i < size; i++) {
        p1[i] = 4;
        p2[i] = 9;
    }
}

Because of restrict, a smart compiler (or human) could optimize that to:

memset(p1, 4, size);
memset(p2, 9, size);

This is potentially much more efficient, as memset may be assembly-optimized in a decent libc implementation (like glibc), possibly with SIMD instructions. See: Is it better to use std::memcpy() or std::copy() in terms to performance?

Without restrict, this optimization could not be done, e.g. consider:

char p1[4];
char *p2 = &p1[1];
f(p1, p2, 3);

Then the for version makes:

p1 == {4, 4, 4, 9}

while the memset version makes:

p1 == {4, 9, 9, 9}
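The two outcomes can be checked with a small sketch. The names fill_loop and fill_memset are made up here; neither function uses restrict, so calling them with overlapping pointers is well defined.

```cpp
#include <cstddef>
#include <cstring>

// The plain loop, without restrict: stores to p1 and p2 are interleaved.
void fill_loop(char *p1, char *p2, std::size_t size) {
    for (std::size_t i = 0; i < size; i++) {
        p1[i] = 4;
        p2[i] = 9;
    }
}

// What the optimizer is allowed to produce once restrict rules out overlap:
// all stores to p1 first, then all stores to p2.
void fill_memset(char *p1, char *p2, std::size_t size) {
    std::memset(p1, 4, size);
    std::memset(p2, 9, size);
}
```

Calling each with p2 = p1 + 1 and size 3 on a 4-byte buffer gives the two different results listed above, which is why the compiler cannot swap one for the other unless it knows the ranges do not overlap.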

Does GCC really do it?

GCC 5.2.1, Linux x86-64, Ubuntu 15.10:

gcc -g -std=c99 -O0 -c main.c
objdump -dr main.o

With -O0, both are the same.

With -O3:

  • with restrict:

    3f0:   48 85 d2                test   %rdx,%rdx
    3f3:   74 33                   je     428 <fr+0x38>
    3f5:   55                      push   %rbp
    3f6:   53                      push   %rbx
    3f7:   48 89 f5                mov    %rsi,%rbp
    3fa:   be 04 00 00 00          mov    $0x4,%esi
    3ff:   48 89 d3                mov    %rdx,%rbx
    402:   48 83 ec 08             sub    $0x8,%rsp
    406:   e8 00 00 00 00          callq  40b <fr+0x1b>
                            407: R_X86_64_PC32      memset-0x4
    40b:   48 83 c4 08             add    $0x8,%rsp
    40f:   48 89 da                mov    %rbx,%rdx
    412:   48 89 ef                mov    %rbp,%rdi
    415:   5b                      pop    %rbx
    416:   5d                      pop    %rbp
    417:   be 09 00 00 00          mov    $0x9,%esi
    41c:   e9 00 00 00 00          jmpq   421 <fr+0x31>
                            41d: R_X86_64_PC32      memset-0x4
    421:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
    428:   f3 c3                   repz retq
    

    Two memset calls as expected.

  • without restrict: no stdlib calls, just a 16-iteration-wide unrolled loop, which I do not intend to reproduce here :-)

I haven't had the patience to benchmark them, but I believe that the restrict version will be faster.

Strict aliasing rule

The restrict keyword only affects pointers of compatible types (e.g. two int*), because the strict aliasing rule says that aliasing incompatible types is undefined behavior by default, so compilers can already assume it does not happen and optimize accordingly.

See: What is the strict aliasing rule?
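As a rough illustration, consider the sketch below; store_then_read is a made-up name. Because int and float are incompatible types, a compiler that applies strict aliasing may already assume the store through f does not change *i, without any restrict qualifier.

```cpp
// Hypothetical example: i and f have incompatible pointee types.
int store_then_read(int *i, float *f) {
    *i = 1;
    *f = 2.0f;   // assumed not to modify *i under strict aliasing
    return *i;   // so the compiler may return the constant 1 directly
}
```

With two int* parameters the same reasoning would not hold, and that is the case where restrict adds information.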

Does it work for references?

According to the GCC docs it does: https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Restricted-Pointers.html with syntax:

int &__restrict__ rref

There is even a version for the this pointer of member functions:

void T::fn () __restrict__
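A small sketch of both forms, assuming a compiler that accepts the GCC syntax. The names add_refs, Counter and add_twice_to are invented for illustration.

```cpp
// Restrict-qualified references: the caller promises a and b are distinct objects.
int add_refs(int &__restrict__ a, int &__restrict__ b) {
    a += b;
    return a + b;
}

struct Counter {
    int value;

    // __restrict__ after the parameter list qualifies the this pointer:
    // the caller promises that out does not point into *this.
    void add_twice_to(int *out) __restrict__ {
        *out += value;
        *out += value;  // value need not be reloaded after the first store
    }
};
```

As with pointers, these are promises made by the caller; the compiler does not check them.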
落花随流水 2024-07-24 07:31:50

In his paper, Memory Optimization, Christer Ericson says that while restrict is not part of the C++ standard yet, it is supported by many compilers, and he recommends its usage when available:

restrict keyword

! New to 1999 ANSI/ISO C standard

! Not in C++ standard yet, but supported by many C++ compilers

! A hint only, so may do nothing and still be conforming

A restrict-qualified pointer (or reference)...

! ...is basically a promise to the compiler that for the scope of the pointer, the target of the pointer will only be accessed through that pointer (and pointers copied from it).

In C++ compilers that support it, it should probably behave the same as in C.

See this SO post for details: Realistic usage of the C99 ‘restrict’ keyword?

Take half an hour to skim through Ericson's paper; it's interesting and worth the time.

Edit

I also found that IBM's AIX C/C++ compiler supports the __restrict__ keyword.

g++ also seems to support this as the following program compiles cleanly on g++:

#include <stdio.h>

int foo(int * __restrict__ a, int * __restrict__ b) {
    return *a + *b;
}

int main(void) {
    int a = 1, b = 1, c;
    
    c = foo(&a, &b);

    printf("c == %d\n", c);

    return 0;
}

I also found a nice article on the use of restrict:

Demystifying The Restrict Keyword

Edit2

I ran across an article which specifically discusses the use of restrict in C++ programs:

Load-hit-stores and the __restrict keyword

Microsoft Visual C++ also supports the __restrict keyword.

呆萌少年 2024-07-24 07:31:50

Nothing. It was added to the C99 standard.

非要怀念 2024-07-24 07:31:50

This is the original proposal to add this keyword. As dirkgently pointed out though, this is a C99 feature; it has nothing to do with C++.

海螺姑娘 2024-07-24 07:31:50

There's no such keyword in C++. The list of C++ keywords can be found in section 2.11/1 of the C++ language standard. restrict is a keyword in the C99 version of the C language, not in C++.

暮色兮凉城 2024-07-24 07:31:50

Since header files from some C libraries use the keyword, the C++ language will have to do something about it: at a minimum, ignore the keyword, so that we don't have to #define it to a blank macro to suppress it.
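Until then, a common workaround is a portability macro along these lines. This is a sketch: MY_RESTRICT and copy_ints are made-up names, and the compiler checks only cover the toolchains mentioned in this thread.

```cpp
// Map a project-local macro to whatever spelling the compiler understands.
#if defined(__GNUC__) || defined(__clang__)
#  define MY_RESTRICT __restrict__
#elif defined(_MSC_VER)
#  define MY_RESTRICT __restrict
#else
#  define MY_RESTRICT  /* unknown compiler: expand to nothing */
#endif

// Usage: the function still compiles where the extension is unavailable,
// it just loses the optimization hint.
void copy_ints(int *MY_RESTRICT dst, const int *MY_RESTRICT src, int n) {
    for (int i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

Expanding to nothing on unknown compilers is safe because removing the qualifier never changes the meaning of a conforming program.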

笑看君怀她人 2024-07-24 07:31:50

With __restrict__, the compiler can do sophisticated optimizations, as the programmer has guaranteed that the __restrict__-decorated pointers point to data ranges that will definitely not overlap each other.

This is usually the case, so when high performance is the goal, you can most of the time put a __restrict__ decorator on the pointers in your code.
