Is a "pointer to volatile" always effective at preventing compiler optimization?

Here's the problem: your program temporarily uses some sensitive data and wants to erase it when it's no longer needed. Using std::fill() by itself won't always help - the compiler may decide that the memory block is never accessed afterwards, conclude that erasing it is a waste of time, and eliminate the erasing code.

User ybungalobill suggests using the volatile keyword:

{
    char buffer[size];
    // obtain and use password
    std::fill_n( (volatile char*)buffer, size, 0);
}

The intent is that upon seeing the volatile keyword the compiler will not try to eliminate the call to std::fill_n().

Will the volatile keyword always prevent the compiler from eliminating such memory-modifying code?
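As a rough illustration of the concern (the buffer size and function name below are invented, not from the question): once nothing reads the buffer again, a plain fill is a dead store that an optimizer is allowed to drop.

#include <algorithm>

// Hedged sketch of the problem described above: after the fill, nothing
// observes buffer, so an optimizing compiler may remove the zeroing entirely.
void use_password()
{
    char buffer[64];
    // ... obtain and use the password ...
    std::fill_n(buffer, sizeof buffer, 0);  // dead store: may be optimized away
}   // buffer goes out of scope; nothing ever reads the zeroes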
Answers (4)
The compiler is free to optimize your code out, because buffer is not a volatile object. The Standard only requires a compiler to adhere strictly to the semantics of volatile objects. Here is what C++03 says

and

In your example, what you have are reads and writes using volatile lvalues on non-volatile objects. C++0x removed the second text I quoted above, because it is redundant. C++0x just says

While one may argue that "volatile data" could mean "data accessed through volatile lvalues", which would still be quite a stretch, the C++0x wording removes all doubt about your code and clearly allows implementations to optimize it away.

But as people have pointed out to me, it probably does not matter in practice. A compiler that optimizes such a thing away would most likely go against the programmer's intention (why else would someone use a pointer to volatile?) and so would probably contain a bug. Still, I have encountered compiler vendors who cited these paragraphs when faced with bug reports about their over-aggressive optimizations. In the end, volatile is inherently platform-specific and you are supposed to double-check the result anyway.
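To make the distinction this answer draws more concrete, here is a small hedged sketch (names and sizes are invented for illustration): only the first buffer is a volatile object, so only accesses to it are observable behavior under the quoted wording.

#include <algorithm>

volatile char vbuf[16];   // vbuf IS a volatile object: accesses to it are part of
                          // the observable behavior and may not be optimized away
char buf[16];             // buf is NOT a volatile object

int main()
{
    std::fill_n(vbuf, 16, 0);                 // writes to a volatile object: kept
    std::fill_n((volatile char*)buf, 16, 0);  // volatile lvalue access to a
                                              // non-volatile object: on this
                                              // answer's reading, may be elided
}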
From the latest C++0x draft, [intro.execution]:

So even the code you provided must not be optimized out.
The memory content you wish to remove may already have been flushed out of your CPU/core's inner cache to RAM, where other CPUs can continue to see it. After overwriting it, you need to use a mutex / memory-barrier instruction / atomic operation or something similar to trigger a sync with the other cores. In practice, your compiler will probably do this before calling any external functions (google Dave Butenhof's post on volatile's dubious utility in multi-threading), so if your thread does that soon afterwards anyway, it's not a major issue. In summary: volatile isn't needed.
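A minimal sketch of the synchronization this answer has in mind, assuming a mutex that every thread uses when touching the buffer (the names g_mutex and g_buffer are hypothetical):

#include <algorithm>
#include <mutex>

std::mutex g_mutex;   // assumed to guard g_buffer for all threads
char g_buffer[64];

void wipe_shared_buffer()
{
    std::lock_guard<std::mutex> lock(g_mutex);  // acquire
    std::fill_n(g_buffer, sizeof g_buffer, 0);  // overwrite the sensitive data
}   // releasing the mutex publishes the overwrite to any thread that
    // subsequently locks g_mutex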
A conforming implementation may, at its leisure, defer the actual performance of any volatile reads and writes until the result of a volatile read would affect the execution of a volatile write or I/O operation.

For example, given something like:

a conforming compiler could, at its option, check whether scale is a multiple of 128 and, if so, clear out all even-indexed values of res before doing any reads from vol1 or writes to vol2. Even though the compiler would need to do each read from vol1 before it could do the following write to vol2, it may be able to defer both operations until after it has run an essentially unlimited amount of code.
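The snippet this answer refers to ("given something like:") did not survive the page conversion. The following is only a hypothetical reconstruction consistent with the description; the variable names come from the prose, everything else is assumed.

volatile unsigned char vol1, vol2;
unsigned char res[100];

void test(unsigned char scale)
{
    for (int i = 0; i < 100; i++)
    {
        // if scale is a multiple of 128, then i * scale truncated to
        // unsigned char is 0 for every even i
        res[i] = static_cast<unsigned char>(i * scale);
        vol2 = vol1;   // each write to vol2 depends on the preceding read of vol1
    }
}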