Why do threads share the heap space?
Threads each have their own stack, but they share a common heap.
It is commonly understood that the stack holds local variables and method parameters, while the heap holds instance and class variables.
What is the benefit of sharing the heap among threads?
Several threads run simultaneously, so sharing memory can lead to problems such as concurrent modification, along with the overhead of mutual exclusion.
Which contents of the heap are shared by threads?
Why is this the case? Why not have each thread own its own heap as well? Can anyone provide a real-world example of how threads make use of shared memory?
Comments (8)
What do you do when you want to pass data from one thread to another? (If you never did that, you'd be writing separate programs, not one multi-threaded program.) There are two major approaches:
The approach you seem to take for granted is shared memory: except for data that has a compelling reason to be thread-specific (such as the stack), all data is accessible to all threads. Basically, there is a shared heap. That gives you speed: any time a thread changes some data, other threads can see it. (Limitation: this does not hold automatically if the threads are executing on different processors; there the programmer needs to work especially hard to use shared memory correctly and efficiently.) Most major imperative languages, in particular Java and C#, favor this model.
It is possible to have one heap per thread, plus a shared heap. This requires the programmer to decide which data to put where, and that often doesn't mesh well with existing programming languages.
The dual approach is message passing: each thread has its own data space; when a thread wants to communicate with another thread, it needs to explicitly send a message to the other thread, so that the data is copied from the sender's heap to the recipient's heap. In this setting many communities prefer to call the threads processes. That gives you safety: since a thread can't overwrite some other thread's memory on a whim, a lot of bugs are avoided. Another benefit is distribution: you can make your threads run on separate machines without having to change a single line in your program. You can find message passing libraries for most languages, but integration tends to be less good. Good languages to understand message passing in are Erlang and JoCaml.
In fact, message passing environments usually use shared memory behind the scenes, at least as long as the threads are running on the same machine/processor. This saves a lot of time and memory, since passing a message from one thread to another then doesn't require making a copy of the data. But since the shared memory is not exposed to the programmer, its inherent complexity is confined to the language/library implementation.
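To make the message-passing style a little more concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: the class name `MessagePassingSketch`, the method `sumViaQueue`, and the `STOP` sentinel are invented for this example, and a `BlockingQueue` stands in for a real message-passing runtime. Note that the queue itself lives on the shared heap, which is in line with the remark that message passing is often built on shared memory behind the scenes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; none of these names come from the answer above.
class MessagePassingSketch {
    // A producer thread sends 1..n as messages; the calling thread receives and sums them.
    static long sumViaQueue(int n) {
        BlockingQueue<Long> mailbox = new ArrayBlockingQueue<>(16);
        final long STOP = -1L; // assumed sentinel meaning "no more messages"

        Thread producer = new Thread(() -> {
            try {
                for (long i = 1; i <= n; i++) {
                    mailbox.put(i); // blocks while the mailbox is full
                }
                mailbox.put(STOP);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long sum = 0; // only the receiving thread ever touches this variable
        try {
            while (true) {
                long message = mailbox.take(); // blocks while the mailbox is empty
                if (message == STOP) {
                    break;
                }
                sum += message;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumViaQueue(100)); // prints 5050
    }
}
```

The two threads never touch each other's variables directly; the only point of contact is the mailbox, so all of the locking is hidden inside the queue implementation.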
Because otherwise they would be processes. Sharing memory is the whole idea of threads.
Processes generally don't share heap space. There are APIs that permit this, but the default is that processes are separate.
Threads share heap space.
That's the "practical idea": there are two ways to use memory, shared and not shared.
In many languages/runtimes the stack is used, among other things, to keep function/method parameters and local variables. If threads shared a stack, things would get really messy.
When the call to 'MyFunc' is done, the stack is popped and a and b are no longer on the stack. Because threads don't share stacks, there is no threading issue for the variables a and b.
Because of the nature of the stack (pushing/popping), it's not really suited for keeping 'long-term' state or state shared across function calls. Like this:
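The code that this answer refers to is not preserved in this copy. As a stand-in, here is a minimal Java sketch of what such an example might look like. Only the names `MyFunc`, `a`, and `b` come from the answer (the method is written `myFunc` here to follow Java naming); the class `StackVersusHeap`, the field `total`, and the method bodies are assumptions made for illustration.

```java
// Hypothetical reconstruction; only MyFunc, a and b are named in the answer.
class StackVersusHeap {
    // Instance field: part of an object on the heap, so it outlives any single call.
    private int total;

    // The parameter a and the local b live in the calling thread's stack frame.
    static int myFunc(int a) {
        int b = a * 2;
        return a + b;
    } // the frame is popped here; a and b are gone

    // State that must survive between calls is kept in the object instead.
    void add(int amount) {
        total += amount;
    }

    int total() {
        return total;
    }

    public static void main(String[] args) {
        System.out.println(myFunc(2)); // prints 6

        StackVersusHeap state = new StackVersusHeap();
        state.add(3);
        state.add(4);
        System.out.println(state.total()); // prints 7
    }
}
```

Each thread that calls `myFunc` gets its own `a` and `b`, whereas every thread holding a reference to the same `StackVersusHeap` object would see the same `total`.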
The heap is just all dynamically allocated memory outside of the stack. Since the OS provides a single address space per process, it becomes clear that the heap is by definition shared by all threads in the process. As for why stacks are not shared, that's because a thread of execution has to have its own stack to be able to manage its call tree (it contains information about what to do when you leave a function, for instance!).
Now you could of course write a memory manager that allocates data from different areas of your address space depending on the calling thread, but other threads would still be able to see that data (just as, if you somehow leak a pointer to something on your thread's stack to another thread, that other thread could read it, despite this being a horrible idea).
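A rough Java analogy for this point, offered as a sketch rather than as an actual memory manager: `ThreadLocal` hands each thread its own object, yet that object still sits in the one shared heap, so any thread that gets hold of the reference can read it. The class name `PerThreadButVisible` and its members are invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo: per-thread data is still reachable once its reference leaks.
class PerThreadButVisible {
    // Each thread that calls BUFFER.get() receives its own one-element array.
    static final ThreadLocal<int[]> BUFFER = ThreadLocal.withInitial(() -> new int[1]);

    // Returns {value read from the worker's buffer, value in the caller's own buffer}.
    static int[] readLeakedValue() {
        AtomicReference<int[]> leaked = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            int[] mine = BUFFER.get(); // the worker's private buffer
            mine[0] = 42;
            leaked.set(mine);          // the reference escapes to other threads
        });
        worker.start();
        try {
            worker.join(); // join also makes the worker's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int fromWorker = leaked.get()[0]; // readable although it "belonged" to the worker
        int own = BUFFER.get()[0];        // the caller's own buffer is a different array
        return new int[] {fromWorker, own};
    }

    public static void main(String[] args) {
        int[] result = readLeakedValue();
        System.out.println(result[0] + " " + result[1]); // prints "42 0"
    }
}
```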
The problem is that having local heaps adds significant complexity for very little value.
There is a small performance advantage, and this is handled well by the TLAB (Thread Local Allocation Buffer), which gives you most of that advantage transparently.
In a multi-threaded application each thread has its own stack, but all threads share the same heap. This is why care should be taken in your code to avoid concurrent access issues in the heap space. The stack is thread-safe (each thread has its own stack), but the heap is not thread-safe unless access to it is guarded by synchronisation in your code.
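A minimal Java sketch of that distinction, with all names (`SharedCounterSketch`, `increment`, `run`) assumed for illustration: the loop variable `i` lives on each worker's own stack and needs no protection, while the `count` field lives in a heap object shared by all workers and is guarded with `synchronized`.

```java
// Hypothetical demo: one heap object shared by several threads.
class SharedCounterSketch {
    private int count; // stored in the heap object, visible to every thread that holds it

    // synchronized makes the read-modify-write of count atomic across threads.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts the given number of threads, each incrementing the same counter perThread times.
    static int run(int threads, int perThread) {
        SharedCounterSketch counter = new SharedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                // i is a local on this worker's stack; no other thread can touch it.
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000)); // prints 40000
    }
}
```

Without the `synchronized` keyword on `increment`, concurrent increments could be lost and the final value could come out lower than expected.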
That's because the idea of threads is "share everything". Of course, there are some things you cannot share, such as the processor context and the stack, but everything else is shared.