Where do Java and .NET string literals reside?

Posted on 2024-07-10 14:54:18

A recent question about string literals in .NET caught my eye. I know that string literals are interned so that different strings with the same value refer to the same object. I also know that a string can be interned at runtime:

string now = string.Intern(DateTime.Now.ToString());

Obviously a string that is interned at runtime resides on the heap, but I had assumed that a literal is placed in the program's data segment (and said so in my answer to said question). However, I don't remember seeing this documented anywhere. I assumed it because that is how I would implement it, and because the ldstr IL instruction is used to load literals and no allocation seems to take place, which appears to back me up.

To cut a long story short, where do string literals reside? Is it on the heap, in the data segment, or somewhere I haven't thought of?


Edit: If string literals do reside on the heap, when are they allocated?

Comments (7)

蓝色星空 2024-07-17 14:54:19

Correct me if I am wrong, but don't all objects reside on the heap in both Java and .NET?

淡紫姑娘! 2024-07-17 14:54:19

In .NET, string literals, when "interned", are stored in a special data structure called the "intern table". This is separate from the heap and the stack. Not all strings are interned, however... I'm pretty sure that those that aren't are stored on the heap.

Don't know about Java

不醒的梦 2024-07-17 14:54:19

I found this on MSDN's site about the ldstr IL instruction:

The ldstr instruction pushes an object reference (type O) to a new string object representing the specific string literal stored in the metadata. The ldstr instruction allocates the requisite amount of memory and performs any format conversion required to convert the string literal from the form used in the file to the string format required at runtime.

The Common Language Infrastructure (CLI) guarantees that the result of two ldstr instructions referring to two metadata tokens that have the same sequence of characters return precisely the same string object (a process known as "string interning").

This implies that the string literals are in fact stored on the heap in .NET (unlike Java as pointed out by mmyers).

岁月流歌 2024-07-17 14:54:19

In Java, strings, like all objects, reside on the heap.
Only local variables (primitives such as ints and chars, and references to objects) reside on the stack.

马蹄踏│碎落叶 2024-07-17 14:54:19

Interned Strings in Java are located in a separate pool called the String Pool. This pool is maintained by the String class and resides on the normal heap (not the Perm pool as mentioned above, which is used for storing class data).

As I understand it, not all Strings are interned, but calling myString.intern() returns a String that is guaranteed to come from the String Pool.

See also:
http://www.javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html
and the javadoc
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#intern()
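
The pooling behaviour described in this answer can be observed by comparing references with ==. The following is a minimal sketch, not taken from the linked articles; the class and method names (InternDemo, buildAtRuntime and so on) are invented for illustration, and only String.intern() and StringBuilder are real library APIs.

```java
public class InternDemo {
    // Builds "hello" at run time so the compiler cannot turn it into a literal.
    static String buildAtRuntime() {
        return new StringBuilder("hel").append("lo").toString();
    }

    // Two identical literals resolve to the same pooled String object.
    static boolean literalsAreSameObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // A string built at run time is a distinct object until it is interned.
    static boolean runtimeStringIsSameObject() {
        return buildAtRuntime() == "hello";
    }

    // intern() returns the canonical pooled instance, which is the literal's object.
    static boolean internedIsSameObject() {
        return buildAtRuntime().intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(literalsAreSameObject());     // true
        System.out.println(runtimeStringIsSameObject()); // false
        System.out.println(internedIsSameObject());      // true
    }
}
```

Note that == is used here only to expose object identity; ordinary code should compare string contents with equals().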

自演自醉 2024-07-17 14:54:18

Strings in .NET are reference types, so they are always on the heap (even when they are interned). You can verify this using a debugger such as WinDbg.

If you have the class below

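   using System;

   class SomeType {
      public void Foo() {
         // The literal is loaded with ldstr and is therefore interned.
         string s = "hello world";
         Console.WriteLine(s);
         Console.WriteLine("press enter");
         // Block here so a debugger can be attached.
         Console.ReadLine();
      }
   }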

and you call Foo() on an instance, you can use WinDbg to inspect the heap.

The reference will most likely be stored in a register for a small program, so the easiest way to find the reference to the specific string is to run !dso. This gives us the address of the string in question:

0:000> !dso
OS Thread Id: 0x1660 (0)
ESP/REG  Object   Name
002bf0a4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0b4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0e8 025d4e5c System.Byte[]
002bf0ec 025d4c0c System.IO.__ConsoleStream
002bf110 025d4c3c System.IO.StreamReader
002bf114 025d4c3c System.IO.StreamReader
002bf12c 025d5180 System.IO.TextReader+SyncTextReader
002bf130 025d4c3c System.IO.StreamReader
002bf140 025d5180 System.IO.TextReader+SyncTextReader
002bf14c 025d5180 System.IO.TextReader+SyncTextReader
002bf15c 025d2d04 System.String    hello world             // THIS IS THE ONE
002bf224 025d2ccc System.Object[]    (System.String[])
002bf3d0 025d2ccc System.Object[]    (System.String[])
002bf3f8 025d2ccc System.Object[]    (System.String[])

Now use !gcgen to find out which generation the instance is in:

0:000> !gcgen 025d2d04 
Gen 0

It's in generation zero - i.e. it has just been allocated. Who's rooting it?

0:000> !gcroot 025d2d04 
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 1660
ESP:2bf15c:Root:025d2d04(System.String)
Scan Thread 2 OSTHread 16b4
DOMAIN(000E4840):HANDLE(Pinned):6513f4:Root:035d2020(System.Object[])->
025d2d04(System.String)

The ESP entry is the reference held on the stack of our Foo() method, but notice that there is an object[] as well. That's the intern table. Let's take a look.

0:000> !dumparray 035d2020
Name: System.Object[]
MethodTable: 006984c4
EEClass: 00698444
Size: 528(0x210) bytes
Array: Rank 1, Number of elements 128, Type CLASS
Element Methodtable: 00696d3c
[0] 025d1360
[1] 025d137c
[2] 025d139c
[3] 025d13b0
[4] 025d13d0
[5] 025d1400
[6] 025d1424
...
[36] 025d2d04  // THIS IS OUR STRING
...
[126] null
[127] null

I reduced the output somewhat, but you get the idea.

In conclusion: strings are on the heap - even when they are interned. The intern table holds a reference to the instance on the heap, i.e. interned strings are not collected during GC because the intern table roots them.

月寒剑心 2024-07-17 14:54:18

In Java (from the Java Glossary):

In Sun’s JVM, the interned Strings (which include String literals) are stored in a special pool of RAM called the perm gen, where the JVM also loads classes and stores natively compiled code. However, the interned Strings behave no differently than had they been stored in the ordinary object heap.
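
To illustrate the last sentence of the quote, here is a small sketch showing that an interned literal and a separately allocated copy are interchangeable from the program's point of view. The class and method names (InternedBehaviourDemo, heapCopy and so on) are hypothetical; the sketch says nothing about which memory region a particular JVM uses.

```java
import java.util.HashMap;
import java.util.Map;

public class InternedBehaviourDemo {
    // new String(...) always creates a distinct, non-interned object.
    static String heapCopy() {
        return new String("key");
    }

    // The copy and the interned literal are equal by value.
    static boolean equalByValue() {
        return heapCopy().equals("key");
    }

    // Equal strings have equal hash codes, interned or not.
    static boolean sameHash() {
        return heapCopy().hashCode() == "key".hashCode();
    }

    // A value stored under the interned literal can be looked up with the copy.
    static boolean interchangeableAsMapKeys() {
        Map<String, Integer> map = new HashMap<>();
        map.put("key", 1);
        return Integer.valueOf(1).equals(map.get(heapCopy()));
    }

    public static void main(String[] args) {
        System.out.println(equalByValue());             // true
        System.out.println(sameHash());                 // true
        System.out.println(interchangeableAsMapKeys()); // true
    }
}
```

The only observable difference is reference identity: heapCopy() == "key" is false, while heapCopy().intern() == "key" is true.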
