Undefined behaviour with const_cast
I was hoping that someone could clarify exactly what is meant by undefined behaviour in C++. Given the following class definition:
    class Foo
    {
    public:
        explicit Foo(int Value): m_Int(Value) { }
        void SetValue(int Value) { m_Int = Value; }
    private:
        Foo(const Foo& rhs);
        const Foo& operator=(const Foo& rhs);
    private:
        int m_Int;
    };
If I've understood correctly, the two const_casts in the following code (one to a reference, one to a pointer) remove the const-ness of the original object of type Foo, but any attempt to modify this object through either the pointer or the reference results in undefined behaviour.
    int main()
    {
        const Foo MyConstFoo(0);
        Foo& rFoo = const_cast<Foo&>(MyConstFoo);
        Foo* pFoo = const_cast<Foo*>(&MyConstFoo);
        //MyConstFoo.SetValue(1); //Error as MyConstFoo is const
        rFoo.SetValue(2); //Undefined behaviour
        pFoo->SetValue(3); //Undefined behaviour
        return 0;
    }
What puzzles me is why this appears to work and modifies the original const object, yet doesn't even produce a warning to tell me that the behaviour is undefined. I know that const_casts are, broadly speaking, frowned upon, but I can imagine a const_cast being performed unnoticed by someone unaware that a C-style cast can have the same effect, for example:
    Foo& rAnotherFoo = (Foo&)MyConstFoo;
    Foo* pAnotherFoo = (Foo*)&MyConstFoo;
    rAnotherFoo.SetValue(4);  // a reference uses '.', not '->'
    pAnotherFoo->SetValue(5);
In what circumstances might this behaviour cause a fatal runtime error? Is there some compiler setting that I can set to warn me of this (potentially) dangerous behaviour?
NB: I use MSVC2008.
Comments (7)
Technically, "Undefined Behaviour" means that the language defines no semantics for doing such a thing.
In practice, this usually means "don't do it; it can break when your compiler performs optimisations, or for other reasons".
In this specific example, attempting to modify any non-mutable object may "appear to work", or it may overwrite memory that doesn't belong to the program or that belongs to [part of] some other object, because the non-mutable object might have been optimised away at compile-time, or it may exist in some read-only data segment in memory.
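The word "non-mutable" matters here: a member declared `mutable` may legitimately be modified even inside a const object, with no cast involved. A minimal sketch, assuming a hypothetical `CachedSquare` class that is not part of the question:

```cpp
#include <cassert>

// Hypothetical class for illustration; not part of the question's code.
class CachedSquare
{
public:
    explicit CachedSquare(int value) : m_Value(value), m_Cached(false), m_Square(0) { }

    // A const member function may still write to the mutable members below.
    int Square() const
    {
        if (!m_Cached)
        {
            m_Square = m_Value * m_Value; // allowed: m_Square is mutable
            m_Cached = true;              // allowed: m_Cached is mutable
        }
        return m_Square;
    }

    bool IsCached() const { return m_Cached; }

private:
    int m_Value;           // not mutable: writing to it via const_cast in a const object would be UB
    mutable bool m_Cached;
    mutable int m_Square;
};

// Builds a genuinely const object and modifies only its mutable members.
inline int SquareOfConstObject(int value)
{
    const CachedSquare square(value);
    assert(!square.IsCached());
    return square.Square();
}
```

When a const object really does need internal state that changes, this is the route the language sanctions, rather than casting the constness away.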
The factors that may lead to these things happening are simply too complex to list. Consider the case of dereferencing an uninitialised pointer (also UB): the "object" you're then working with will have some arbitrary memory address that depends on whatever value happened to be in memory at the pointer's location; that "value" is potentially dependent on previous program invocations, previous work in the same program, storage of user-provided input etc. It's simply not feasible to try to rationalise the possible outcomes of invoking Undefined Behaviour so, again, we usually don't bother and instead just say "don't do it".
As a further complication, compilers are not required to diagnose Undefined Behaviour (that is, to emit warnings or errors for it), because code that invokes Undefined Behaviour is not the same as code that is ill-formed (i.e. explicitly illegal). In many cases it's not tractable for the compiler to even detect UB, so this is an area where it is the programmer's responsibility to write the code properly.
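The difference between ill-formed code and UB can be observed at compile time. The sketch below assumes a C++11 compiler (newer than the MSVC2008 mentioned in the question), and the names `CastResult` and `CastHidesConstness` are made up for illustration. It shows that a write through a const reference is rejected outright, whereas after a `const_cast` the expression is an ordinary `int&` whose type carries no record of how the object was declared.

```cpp
#include <type_traits>
#include <utility>

// Assigning through a const reference is ill-formed, so the compiler must reject it.
static_assert(!std::is_assignable<const int&, int>::value,
              "writing through a const int& is ill-formed");

// After the cast the expression has type int&; the type no longer records
// whether the referenced object was declared const.
typedef decltype(const_cast<int&>(std::declval<const int&>())) CastResult;
static_assert(std::is_same<CastResult, int&>::value,
              "const_cast<int&> yields a plain int&");
static_assert(std::is_assignable<CastResult, int>::value,
              "the assignment compiles, whether or not it is UB at run time");

// Hypothetical helper that bundles both observations into one value.
constexpr bool CastHidesConstness()
{
    return std::is_assignable<CastResult, int>::value
        && !std::is_assignable<const int&, int>::value;
}
```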
The type system, including the existence and semantics of the const keyword, presents basic protection against writing code that will break; a C++ programmer should always remain aware that subverting this system, e.g. by hacking away constness, is done at your own risk, and is generally A Bad Idea.™
Absolutely. With warning levels set high enough, a sane compiler may choose to warn you about this, but it doesn't have to and it may not. In general, this is a good reason why C-style casts are frowned upon, but they are still supported for backwards compatibility with C. It's just one of those unfortunate things.
Undefined behaviour depends on the way the object was born. You can see Stephan explaining it at around 00:10:00, but essentially, follow the code below:
Now there are two cases for calling f.
To sum up, K was born non-const, so the cast is OK when calling f, whereas AK was born const, so ... UB it is.
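The snippet this answer refers to is not present here. The following is a minimal sketch of what such code could look like. It keeps the names `f`, `K` and `AK` from the text, but the function body, the values, and the wrapper functions are assumptions made for illustration.

```cpp
#include <cassert>

// Assumed shape of f: takes a const reference and writes through it anyway.
inline void f(const int& r)
{
    const_cast<int&>(r) = 42;
}

// Case 1: K was born non-const, so casting the constness away again is well defined.
inline int CallWithNonConst()
{
    int K = 1;
    f(K);
    assert(K == 42); // holds, because K is a non-const object
    return K;
}

// Case 2: AK was born const. The call below would compile, but the write
// inside f would be undefined behaviour, so it is left commented out.
inline int CallWithConst()
{
    const int AK = 1;
    // f(AK); // UB: modifies an object that was defined const
    return AK;
}
```

Both calls type-check identically, so the well-defined case and the UB case differ only in how the object passed to `f` was originally declared.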
Undefined behaviour literally means just that: behaviour which is not defined by the language standard. It typically occurs in situations where the code is doing something wrong, but the error can't be detected by the compiler. The only way to catch the error would be to introduce a run-time test - which would hurt performance. So instead, the language specification tells you that you mustn't do certain things and, if you do, then anything could happen.
In the case of writing to a constant object, using const_cast to subvert the compile-time checks, there are three likely scenarios:
1. The object is treated just like a non-constant object, and writing to it modifies it.
2. The object is placed in write-protected memory, and writing to it causes a protection fault.
3. The object is replaced at compile time by constant values embedded in the compiled code, so after writing to it, it still appears to have its initial value.
In your test, you ended up in the first scenario: the object was (almost certainly) created on the stack, which is not write-protected. You may find that you get the second scenario if the object is static, and the third if you enable more optimisation.
In general, the compiler can't diagnose this error: there is no way to tell (except in very simple examples like yours) whether the target of a reference or pointer is constant or not. It's up to you to make sure that you only use const_cast when you can guarantee that it's safe, either when the object isn't constant, or when you're not actually going to modify it anyway.
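One situation where the "not actually going to modify it" condition holds is wrapping an older interface that was never made const-correct. A minimal sketch, where `LegacyLength` and `Length` are hypothetical functions invented for this example:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy function: takes a non-const pointer but only reads from it.
inline std::size_t LegacyLength(char* text)
{
    std::size_t length = 0;
    while (text[length] != '\0')
    {
        ++length;
    }
    return length;
}

// Wrapper with a const-correct signature. The cast is safe only because
// LegacyLength is known not to write through the pointer.
inline std::size_t Length(const char* text)
{
    assert(text != 0);
    return LegacyLength(const_cast<char*>(text));
}
```

The guarantee rests entirely on knowing what the callee does. If `LegacyLength` ever started writing to its argument, calling `Length` on a const object would become undefined behaviour with no change at the call site.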
That is what undefined behavior means.
It can do anything including appear to work.
If you increase your optimization level to its top value it will probably stop working.
At the point where the modification happens, the object being written to no longer looks const to the compiler. In the general case the compiler cannot tell that the object was originally const, so it is not possible to warn you. Even if it could, each statement is evaluated on its own, without reference to the others (when looking at that kind of warning generation).
Secondly, by using a cast you are telling the compiler "I know what I am doing; override all your safety features and just do it".
For example, the following works just fine (or will seem to, in the nasal-demons kind of way):
That is the wrong way to look at casts. They are a way of documenting in the code that you are doing something strange that needs to be validated by smart people (as the compiler will obey the cast without question). The reason you need a smart person to validate it is that it can lead to undefined behavior, but the good thing is that you have now explicitly documented this in your code (and people will definitely look closely at what you have done).
In C++ there is no need to use a C-style cast.
In the worst case the C-style cast can be replaced by reinterpret_cast<>, but when porting code you want to see if you could have used static_cast<>. The point of the C++ casts is to make them stand out so you can see them and at a glance spot the difference between the dangerous casts and the benign casts.
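To make the "stand out" point concrete, here is a minimal sketch comparing the three spellings on a cut-down version of the question's Foo. `GetValue`, `ResetThroughConstRef` and `ResetNonConstFoo` are additions made for this illustration, and the private copy operations are omitted for brevity.

```cpp
#include <cassert>

class Foo
{
public:
    explicit Foo(int Value) : m_Int(Value) { }
    void SetValue(int Value) { m_Int = Value; }
    int GetValue() const { return m_Int; } // added for illustration
private:
    int m_Int;
};

// Assumed contract: callers only ever pass objects that were created non-const.
inline void ResetThroughConstRef(const Foo& foo)
{
    // Foo& writable = static_cast<Foo&>(foo); // does not compile: static_cast cannot remove const
    // Foo& writable = (Foo&)foo;              // compiles silently, hiding the const removal
    Foo& writable = const_cast<Foo&>(foo);     // compiles, and is easy to search for in review
    writable.SetValue(0);
}

inline int ResetNonConstFoo()
{
    Foo foo(5); // born non-const, so the write above is well defined
    assert(foo.GetValue() == 5);
    ResetThroughConstRef(foo);
    return foo.GetValue();
}
```

GCC and Clang offer `-Wold-style-cast` to flag every C-style cast in C++ code. Whether MSVC2008 has an equivalent is not something this sketch assumes.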
A classic example would be trying to modify a const string literal, which may exist in a protected data segment.
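The usual way around this is to modify a copy rather than the literal. A minimal sketch, assuming a hypothetical helper `Capitalised` and an ASCII-compatible character set:

```cpp
#include <cassert>
#include <string>

// Copies the literal into a std::string, which owns writable storage,
// and modifies the copy instead of the literal itself.
inline std::string Capitalised(const char* literal)
{
    assert(literal != 0);
    std::string copy(literal);
    if (!copy.empty() && copy[0] >= 'a' && copy[0] <= 'z')
    {
        // Assumes 'a'..'z' and 'A'..'Z' are contiguous, as in ASCII.
        copy[0] = static_cast<char>(copy[0] - 'a' + 'A');
    }
    return copy;
}

// By contrast, this would be undefined behaviour:
//     char* p = const_cast<char*>("hello");
//     p[0] = 'H'; // writes into the literal's storage
```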
Compilers may place const data in read-only parts of memory for optimization reasons, and attempting to modify this data results in UB.
Static and const data are often stored in a different part of your program than local variables. For const variables, these areas are often read-only to enforce the constness of the variables. Attempting to write to read-only memory results in "undefined behavior" because the reaction depends on your operating system. "Undefined behavior" means that the language doesn't specify how this case is to be handled.
If you want a more detailed explanation about memory, I suggest you read this. It's an explanation based on UNIX, but similar mechanisms are used on all operating systems.