Does using a "pointer to volatile" always prevent compiler optimization?

Posted 2024-09-25 02:21:21

Here's the problem: your program temporarily uses some sensitive data and wants to erase it once it is no longer needed. Using std::fill() by itself won't always help: the compiler may decide that the memory block is never accessed afterwards, conclude that erasing it is a waste of time, and eliminate the erasing code.

User ybungalobill suggests using the volatile keyword:

{
  char buffer[size];
  //obtain and use password
  std::fill_n( (volatile char*)buffer, size, 0);
}

The intent is that, upon seeing the volatile keyword, the compiler will not try to eliminate the call to std::fill_n().

Will the volatile keyword always prevent the compiler from eliminating such memory-modifying code?
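For reference, the plain std::fill() version that the first paragraph says may be eliminated could look like the following sketch (the function name, buffer size, and placeholder comment are assumptions added for illustration): because buffer is dead once the block ends, the compiler is free to treat the fill as a dead store.

#include <algorithm>
#include <cstddef>

void use_password() {
    constexpr std::size_t size = 64;       // assumed size for the sketch
    char buffer[size];
    // ... obtain and use password ...
    std::fill(buffer, buffer + size, 0);   // dead store: may be optimized away
}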


Comments (4)

千紇 2024-10-02 02:21:21

The compiler is free to optimize your code away, because buffer is not a volatile object.

The Standard only requires a compiler to adhere strictly to the semantics of volatile objects. Here is what C++03 says:

  The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.
    [...]

and

  The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

In your example, what you have are reads and writes using volatile lvalues on non-volatile objects. C++0x removed the second passage quoted above because it is redundant. C++0x just says:

  The least requirements on a conforming implementation are:

  • Access to volatile objects are evaluated strictly according to the rules of the abstract machine. [...]

  These collectively are referred to as the observable behavior of the program.

While one may argue that "volatile data" could mean "data accessed through volatile lvalues", which would still be quite a stretch, the C++0x wording removes all doubt about your code and clearly allows implementations to optimize it away.

But as people have pointed out to me, it probably does not matter in practice. A compiler that optimizes such a thing away would most likely be going against the programmer's intention (why else would anyone use a pointer to volatile?) and so would probably contain a bug. Still, I have seen compiler vendors cite these paragraphs when faced with bug reports about their overly aggressive optimizations. In the end, volatile is inherently platform-specific and you are supposed to double-check the result anyway.
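To make the distinction this answer draws concrete, here is a minimal sketch (the size constant and the password-handling placeholder are assumptions added for illustration): writes to a volatile object are part of the observable behavior, while writes through a volatile lvalue to a non-volatile object, as in the question, are the case the answer argues may legally be removed.

#include <algorithm>
#include <cstddef>

constexpr std::size_t size = 64;   // assumed buffer size for the sketch

void clear_via_cast() {
    char buffer[size];                              // buffer is NOT a volatile object
    // ... obtain and use password ...
    std::fill_n((volatile char*)buffer, size, 0);   // volatile lvalue, non-volatile object:
                                                    // per this answer, may be optimized away
}

void clear_volatile_object() {
    volatile char buffer[size];                     // buffer IS a volatile object
    // ... obtain and use password ...
    for (std::size_t i = 0; i < size; ++i)
        buffer[i] = 0;                              // access to a volatile object:
                                                    // part of the observable behavior
}

Whether a particular compiler actually removes the first call is, as the answer notes, something to verify on your platform by inspecting the generated code.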

泡沫很甜 2024-10-02 02:21:21

From the latest C++0x draft, [intro.execution]:

  8 The least requirements on a conforming implementation are:

  — Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

  [...]

  12 Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, [...]

So even the code you provided must not be optimized away.
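Spelled out without std::fill_n, the kind of access paragraph 12 talks about looks roughly like this sketch (the function and its signature are made up for illustration); every store goes through a volatile glvalue, which this answer reads as a side effect the compiler has to keep.

#include <cstddef>

void wipe(char* buffer, std::size_t size) {
    for (volatile char* p = buffer; p != buffer + size; ++p)
        *p = 0;   // access through a volatile glvalue: a side effect under paragraph 12
}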

四叶草在未来唯美盛开 2024-10-02 02:21:21

The memory content you wish to remove may already have been flushed from your CPU/core's inner cache out to RAM, where other CPUs can continue to see it. After overwriting it, you need a mutex / memory-barrier instruction / atomic operation or something similar to trigger a sync with the other cores. In practice, your compiler will probably do this before calling any external functions (google Dave Butenhof's post on volatile's dubious utility in multi-threading), so if your thread does that soon afterwards anyway it's not a major issue. In summary: volatile isn't needed.
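As a sketch of the "overwrite, then synchronize" idea in this answer, assuming the other threads coordinate through the same (hypothetical) mutex; the lock is just one of the mechanisms the answer lists (mutex / barrier / atomic), and whether it is sufficient on a given platform is exactly the kind of thing you are expected to verify yourself.

#include <algorithm>
#include <cstddef>
#include <mutex>

constexpr std::size_t size = 64;   // assumed buffer size for the sketch
std::mutex buffer_mutex;           // hypothetical lock shared with the other threads

void wipe_buffer(char (&buffer)[size]) {
    std::lock_guard<std::mutex> lock(buffer_mutex);
    std::fill_n((volatile char*)buffer, size, 0);   // overwrite the secret
}   // releasing the lock makes the overwrite visible to threads
    // that later acquire buffer_mutex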

木落 2024-10-02 02:21:21

A conforming implementation may, at its leisure, defer the actual performance of any volatile reads and writes until the result of a volatile read would affect the execution of a volatile write or an I/O operation.

For example, given something like:

volatile unsigned char vol1, vol2;
extern unsigned char res[1000];

void test(int scale)
{
  for (int i = 0; i < 1000; i++)
  {
    res[i] = i * vol1 * scale;
    vol2 = res[i];
  }
}

a conforming compiler could, at its option, check whether scale is a multiple of 128 and, if so, clear out all even-indexed values of res before doing any reads from vol1 or writes to vol2. Even though the compiler would need to do each read from vol1 before it could do the following write to vol2, it may be able to defer both operations until after it has run an essentially unlimited amount of code.
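To illustrate the transformation described above, here is a sketch of code the compiler could behave "as if" it had been given (illustrative only, not real compiler output): when scale is a multiple of 128 and i is even, i*scale is a multiple of 256, so the truncated unsigned char value stored into res[i] is 0 regardless of what vol1 reads as.

extern unsigned char res[1000];

void test_as_if(int scale)
{
  if (scale % 128 == 0)
  {
    // even-indexed stores hoisted in front of every volatile access
    for (int i = 0; i < 1000; i += 2)
      res[i] = 0;
  }
  // the reads of vol1, the writes to vol2, and the odd-indexed stores still
  // have to happen, but as the answer explains they may themselves be
  // deferred past an essentially unlimited amount of other code
}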
