When is it a good idea to manually intern strings in .NET code?
The reference is here:
http://msdn.microsoft.com/en-us/library/system.string.intern.aspx
Looks like this is done automatically by the compiler a lot, but can also be done manually.
Please correct me if I am wrong and shed some more light on this.
Does it matter whether the language is C#, VB.Net, C++/CLI, other?
Thanks.
Comments (3)
I have done this in deserialization/materialization code when there is a good chance of repeated values (almost an enum, but not quite). When deserializing thousands of records this can give a significant memory benefit. However, in such cases you might prefer to use a separate intern cache, to avoid saturating the shared one (or maybe the shared one is fine; it depends on the scenario).
But the key point there is: a scenario where you are likely to have lots and lots of different string instances with the same value. Deserialization is a big candidate there. It should also be noted that there is some CPU overhead in checking the interned cache (progressively more overhead as you add data), so this should only be done if there is a chance that the constructed objects are going to live beyond gen-0; if they are always going to be collected quickly anyway, then it isn't worth swapping them for interned versions.
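A minimal sketch of the "separate intern cache" idea, assuming a plain dictionary is good enough for the scenario. The class name `LocalStringPool` and its members are hypothetical, not taken from the answer or from any library. The pool is meant to live only as long as one deserialization pass, so unlike strings passed to `String.Intern`, everything in it becomes collectable once the pool is dropped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a private intern pool scoped to one operation
// (for example, one deserialization pass). Not thread-safe.
public sealed class LocalStringPool
{
    private readonly Dictionary<string, string> _pool =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Number of distinct values seen so far.
    public int Count => _pool.Count;

    // Returns the first instance seen with this value, so later
    // duplicates can be garbage collected while still in gen-0.
    public string Intern(string value)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));

        if (_pool.TryGetValue(value, out var existing))
        {
            return existing;
        }

        _pool.Add(value, value);
        return value;
    }
}
```

A materializer would call `pool.Intern(reader.GetString(i))` (or the equivalent for its data source) on columns that are known to repeat, and simply let the pool go out of scope afterwards.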
It's a good idea to do so when profiling shows that it gives performance benefits.
It is done by the runtime, but a language could introduce its own string type with different behavior. It is only done automatically for literal strings. If you want to intern dynamically created strings, you can do so with String.Intern. For one thing, it makes comparing strings really simple, since two interned strings with the same value are the same object. Keep in mind, though, that while some operations will benefit from interning, others will not. E.g. interned strings are not released until process shutdown (as they are rooted by the internal structure, see this question for details), so if you intern a lot of strings manually, the process will carry around a lot of memory.
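To illustrate the difference between value equality and sharing one instance, here is a small sketch. `InternSketch`, `Build` and `Demo` are made-up names for illustration; only `String.Intern`, `String.IsInterned` and `Object.ReferenceEquals` are real framework calls. The comparison against the literal is printed rather than promised, because whether a literal is already in the pool at that point depends on how the runtime loads it.

```csharp
using System;

public static class InternSketch
{
    // Concatenating non-empty parts at run time yields a new string object on each call.
    public static string Build(string prefix, int id) => prefix + id.ToString();

    public static void Demo()
    {
        string a = Build("user-", 42);
        string b = Build("user-", 42);

        Console.WriteLine(a == b);                               // same value
        Console.WriteLine(object.ReferenceEquals(a, b));         // but two separate objects

        string ia = string.Intern(a);
        string ib = string.Intern(b);
        Console.WriteLine(object.ReferenceEquals(ia, ib));       // both map to one pooled object

        // A literal with the same value normally resolves to that pooled object as well.
        Console.WriteLine(object.ReferenceEquals(ia, "user-42"));

        // IsInterned returns the pooled instance, or null if the value is not in the pool.
        Console.WriteLine(string.IsInterned(a) != null);
    }
}
```

Once the strings are interned, a reference comparison is enough to test equality, which is the "really simple" comparison mentioned above. The cost is that the pooled instance stays alive for the rest of the process.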