Can a local variable's memory be accessed outside its scope?

Posted 2024-11-16 10:23:55

I have the following code.

#include <iostream>

int * foo()
{
    int a = 5;
    return &a;
}

int main()
{
    int* p = foo();
    std::cout << *p;
    *p = 8;
    std::cout << *p;
}

And the code just runs, with no runtime exceptions!

The output is 58

How can it be? Isn't the memory of a local variable inaccessible outside its function?

溺ぐ爱和你が 2024-11-23 10:23:55

How can it be? Isn't the memory of a local variable inaccessible outside its function?

You rent a hotel room. You put a book in the top drawer of the bedside table and go to sleep. You check out the next morning, but "forget" to give back your key. You steal the key!

A week later, you return to the hotel, do not check in, sneak into your old room with your stolen key, and look in the drawer. Your book is still there. Astonishing!

How can that be? Aren't the contents of a hotel room drawer inaccessible if you haven't rented the room?

Well, obviously that scenario can happen in the real world no problem. There is no mysterious force that causes your book to disappear when you are no longer authorized to be in the room. Nor is there a mysterious force that prevents you from entering a room with a stolen key.

The hotel management is not required to remove your book. You didn't make a contract with them that said that if you leave stuff behind, they'll shred it for you. If you illegally re-enter your room with a stolen key to get it back, the hotel security staff is not required to catch you sneaking in. You didn't make a contract with them that said "if I try to sneak back into my room later, you are required to stop me." Rather, you signed a contract with them that said "I promise not to sneak back into my room later", a contract which you broke.

In this situation anything can happen. The book can be there—you got lucky. Someone else's book can be there and yours could be in the hotel's furnace. Someone could be there right when you come in, tearing your book to pieces. The hotel could have removed the table and book entirely and replaced it with a wardrobe. The entire hotel could be just about to be torn down and replaced with a football stadium, and you are going to die in an explosion while you are sneaking around.

You don't know what is going to happen; when you checked out of the hotel and stole a key to illegally use later, you gave up the right to live in a predictable, safe world because you chose to break the rules of the system.

C++ is not a safe language. It will cheerfully allow you to break the rules of the system. If you try to do something illegal and foolish like going back into a room you're not authorized to be in and rummaging through a desk that might not even be there anymore, C++ is not going to stop you. Safer languages than C++ solve this problem by restricting your power—by having much stricter control over keys, for example.

Compilers are in the business of generating code which manages the storage of the data manipulated by the program being compiled. There are lots of different ways of generating code to manage memory, but over time two basic techniques have become entrenched.

The first is to have some sort of "long lived" storage area where the "lifetime" of each byte in the storage—that is, the period of time when it is validly associated with some program variable—cannot be easily predicted ahead of time. The compiler generates calls into a "heap manager" that knows how to dynamically allocate storage when it is needed and reclaim it when it is no longer needed.
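As a rough sketch of what this first technique looks like from the C++ side (this is not part of the original answer, and the names `make_value` and `read_then_write` are hypothetical): the `int` below lives in heap storage owned by a `std::unique_ptr`, so its lifetime is not tied to the stack frame of the function that created it, and the caller may legitimately keep using it after that function returns.

```cpp
#include <memory>

// Hypothetical counterpart to the question's foo(): the int lives in
// heap storage, and the returned unique_ptr owns that storage.
std::unique_ptr<int> make_value()
{
    return std::make_unique<int>(5);
}

// Reads and then modifies the value through the owning pointer.
// This is well defined because the storage is still alive here.
int read_then_write(int new_value)
{
    std::unique_ptr<int> p = make_value();
    int before = *p;          // the value stored by make_value()
    *p = new_value;           // fine: we still "rent the room"
    return before * 10 + *p;  // combine both reads into one number
}   // p goes out of scope here and the heap storage is released
```

For a plain `int`, simply returning the value by copy would be simpler still; the heap version is shown only to contrast its lifetime with that of a local variable.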

The second method is to have a "short-lived" storage area where the lifetime of each byte is well known. Here, the lifetimes follow a "nesting" pattern. The longest-lived of these short-lived variables will be allocated before any other short-lived variables, and will be freed last. Shorter-lived variables will be allocated after the longest-lived ones, and will be freed before them. The lifetime of these shorter-lived variables is "nested" within the lifetime of longer-lived ones.

Local variables follow the latter pattern; when a method is entered, its local variables come alive. When that method calls another method, the new method's local variables come alive. They'll be dead before the first method's local variables are dead. The relative order of the beginnings and endings of the lifetimes of the storage associated with local variables can be worked out ahead of time.

For this reason, local variables are usually generated as storage on a "stack" data structure, because a stack has the property that the first thing pushed on it is going to be the last thing popped off.
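The nesting described above can be observed directly in C++. The following is a minimal sketch, not taken from the original answer; `Tracer`, `inner`, `outer`, and `trace_lifetimes` are hypothetical names. Each `Tracer` records when its lifetime begins and ends, and the log shows the inner function's local ending before the outer function's local does.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends a line to a shared log when it is
// constructed and another when it is destroyed.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;

    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n))
    {
        log.push_back("begin " + name);
    }

    ~Tracer() { log.push_back("end " + name); }
};

void inner(std::vector<std::string>& log)
{
    Tracer t(log, "inner local");   // comes alive second, dies first
}

void outer(std::vector<std::string>& log)
{
    Tracer t(log, "outer local");   // comes alive first, dies last
    inner(log);
}

// Runs outer() and returns the recorded order of lifetime events.
std::vector<std::string> trace_lifetimes()
{
    std::vector<std::string> log;
    outer(log);
    return log;
}
```

The recorded order is last-in, first-out, which is the property that makes a stack a natural fit for this kind of storage.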

It's like the hotel decides to only rent out rooms sequentially, and you can't check out until everyone with a room number higher than you has checked out.

So let's think about the stack. In many operating systems you get one stack per thread and the stack is allocated to be a certain fixed size. When you call a method, stuff is pushed onto the stack. If you then pass a pointer to the stack back out of your method, as the original poster does here, that's just a pointer to the middle of some entirely valid million-byte memory block. In our analogy, you check out of the hotel; when you do, you just checked out of the highest-numbered occupied room. If no one else checks in after you, and you go back to your room illegally, all your stuff is guaranteed to still be there in this particular hotel.
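To make the "your stuff is still there until someone else checks in" idea concrete without invoking undefined behavior, here is a toy model. It is only an illustration under stated assumptions: `ToyStack`, `Observation`, and `observe_stale_slot` are hypothetical names, and real compilers do not lay out stack frames this way. The point is that "popping" only moves a marker, so the old value stays readable until the next push overwrites it. Reading the array here is well defined because the array itself is still alive; the equivalent read through a dangling pointer in the question's program is not.

```cpp
#include <array>
#include <cstddef>

// Toy model only: a fixed block of slots plus a "top" marker.
struct ToyStack {
    std::array<int, 8> slots{};
    std::size_t top = 0;

    // "Check in": store a value in the next free slot, return its index.
    std::size_t push(int value)
    {
        slots[top] = value;
        return top++;
    }

    // "Check out": only the marker moves; the slot is not cleared.
    void pop() { --top; }
};

// What the stale slot holds right after the pop, and after the next push.
struct Observation {
    int after_pop;
    int after_next_push;
};

Observation observe_stale_slot()
{
    ToyStack s;
    std::size_t stale = s.push(5);       // like `int a = 5;` inside foo()
    s.pop();                             // like foo() returning
    Observation o{};
    o.after_pop = s.slots[stale];        // old value: slot not reused yet
    s.push(8);                           // a later "call" reuses the slot
    o.after_next_push = s.slots[stale];  // old value has been overwritten
    return o;
}
```

In the real program nothing guarantees either observation, which is exactly why the question's output of 58 is luck rather than a promise.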

We use stacks for temporary stores because they are really cheap and easy. An implementation of C++ is not required to use a stack for storage of locals; it could use the heap. It doesn't, because that would make the program slower.

An implementation of C++ is not required to leave the garbage you left on the stack untouched so that you can come back for it later illegally; it is perfectly legal for the compiler to generate code that zeroes out everything in the "room" that you just vacated. It doesn't because, again, that would be expensive.

An implementation of C++ is not required to ensure that when the stack logically shrinks, the addresses that used to be valid are still mapped into memory. The implementation is allowed to tell the operating system "we're done using this page of stack now. Until I say otherwise, issue an exception that destroys the process if anyone touches the previously-valid stack page". Again, implementations do not actually do that because it is slow and unnecessary.

Instead, implementations let you make mistakes and get away with it. Most of the time. Until one day something truly awful goes wrong and the process explodes.

This is problematic. There are a lot of rules and it is very easy to break them accidentally. I certainly have many times. And worse, the problem often only surfaces when memory is detected to be corrupt billions of nanoseconds after the corruption happened, when it is very hard to figure out who messed it up.

More memory-safe languages solve this problem by restricting your power. In "normal" C# there simply is no way to take the address of a local and return it or store it for later. You can take the address of a local, but the language is cleverly designed so that it is impossible to use it after the lifetime of the local ends. In order to take the address of a local and pass it back, you have to put the compiler in a special "unsafe" mode, and put the word "unsafe" in your program, to call attention to the fact that you are probably doing something dangerous that could be breaking the rules.

For further reading:

What if C# did allow returning references? Coincidentally that is the subject of today's blog post:
Ref returns and ref locals
Why do we use stacks to manage memory? Are value types in C# always stored on the stack? How does virtual memory work? And many more topics in how the C# memory manager works. Many of these articles are also germane to C++ programmers:
Memory management
