.NET unique object identifier

Posted on 2024-07-17 12:56:55

Is there a way of getting a unique identifier of an instance?

GetHashCode() is the same for the two references pointing to the same instance. However, two different instances can (quite easily) get the same hash code:

Hashtable hashCodesSeen = new Hashtable();
LinkedList<object> l = new LinkedList<object>();
int n = 0;
while (true)
{
    object o = new object();
    // Remember objects so that they don't get collected.
    // This does not make any difference though :(
    l.AddFirst(o);
    int hashCode = o.GetHashCode();
    n++;
    if (hashCodesSeen.ContainsKey(hashCode))
    {
        // Same hashCode seen twice for DIFFERENT objects (n is as low as 5322).
        Console.WriteLine("Hashcode seen twice: " + n + " (" + hashCode + ")");
        break;
    }
    hashCodesSeen.Add(hashCode, null);
}

I'm writing a debugging addin, and I need to get some kind of ID for a reference which is unique during the run of the program.

I already managed to get internal ADDRESS of the instance, which is unique until the garbage collector (GC) compacts the heap (= moves the objects = changes the addresses).

Stack Overflow question Default implementation for Object.GetHashCode() might be related.

The objects are not under my control, as I am accessing objects in a program being debugged using the debugger API. If I were in control of the objects, adding my own unique identifiers would be trivial.

I wanted the unique ID for building a hashtable ID -> object, to be able to look up already seen objects. For now I solved it like this:

Build a hashtable: 'hashCode' -> (list of objects with hash code == 'hashCode')
Find if object seen(o) {
    candidates = hashtable[o.GetHashCode()] // Objects with the same hashCode.
    If no candidates, the object is new
    If some candidates, compare their addresses to o.Address
        If no address is equal (the hash code was just a coincidence) -> o is new
        If some address equal, o already seen
}
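
The pseudocode above could be sketched in C# roughly as follows. This is only an illustration, not the asker's debugger code: the names SeenObjects and CheckAndRemember are made up, and object.ReferenceEquals stands in for the address comparison done through the debugger API. Note that this sketch holds strong references, so every recorded object stays alive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: buckets objects by hash code, then confirms identity
// with ReferenceEquals (a stand-in for comparing debugger-API addresses).
public sealed class SeenObjects
{
    private readonly Dictionary<int, List<object>> buckets =
        new Dictionary<int, List<object>>();

    // Returns true if this exact instance was recorded earlier; otherwise records it.
    public bool CheckAndRemember(object o)
    {
        int hashCode = o.GetHashCode();
        List<object> candidates;
        if (!buckets.TryGetValue(hashCode, out candidates))
        {
            candidates = new List<object>();
            buckets.Add(hashCode, candidates);
        }
        foreach (object candidate in candidates)
        {
            // Equal hash codes are only a hint; identity decides.
            if (object.ReferenceEquals(candidate, o)) return true;
        }
        candidates.Add(o); // new object, possibly sharing a hash code with others
        return false;
    }
}

public static class SeenObjectsDemo
{
    public static void Main()
    {
        var seen = new SeenObjects();
        object a = new object();
        Console.WriteLine(seen.CheckAndRemember(a));            // False
        Console.WriteLine(seen.CheckAndRemember(a));            // True
        Console.WriteLine(seen.CheckAndRemember(new object())); // False
    }
}
```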

Comments (11)

横笛休吹塞上声 2024-07-24 12:56:55

.NET 4 and later only

Good news, everyone!

The perfect tool for this job is built into .NET 4 and it's called ConditionalWeakTable<TKey, TValue>. This class:

  • can be used to associate arbitrary data with managed object instances much like a dictionary (although it is not a dictionary)
  • does not depend on memory addresses, so is immune to the GC compacting the heap
  • does not keep objects alive just because they have been entered as keys into the table, so it can be used without making every object in your process live forever
  • uses reference equality to determine object identity; moreover, class authors cannot modify this behavior, so it can be used consistently on objects of any type
  • can be populated on the fly, so does not require that you inject code inside object constructors
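
As a rough illustration of the points above, here is a minimal sketch that hands out numeric IDs through a ConditionalWeakTable. The names ObjectIds, IdHolder and GetId are made up for this example; the table's value type has to be a reference type, hence the small holder class. Under concurrent first lookups of the same key the callback may run more than once, so IDs can have gaps, but each object still ends up with a single ID.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public static class ObjectIds
{
    // Hypothetical holder: ConditionalWeakTable values must be reference types.
    private sealed class IdHolder
    {
        public readonly long Id;
        public IdHolder(long id) { Id = id; }
    }

    private static readonly ConditionalWeakTable<object, IdHolder> table =
        new ConditionalWeakTable<object, IdHolder>();

    private static long lastId; // IDs start at 1; 0 is never handed out

    public static long GetId(object o)
    {
        // GetValue returns the existing entry, or creates one via the callback.
        return table.GetValue(o, key => new IdHolder(Interlocked.Increment(ref lastId))).Id;
    }
}

public static class ObjectIdsDemo
{
    public static void Main()
    {
        object a = new object();
        object b = new object();
        Console.WriteLine(ObjectIds.GetId(a) == ObjectIds.GetId(a)); // True
        Console.WriteLine(ObjectIds.GetId(a) == ObjectIds.GetId(b)); // False
    }
}
```

The table does not keep a or b alive, so an entry disappears once its key has been collected.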

自演自醉 2024-07-24 12:56:55

Checked out the ObjectIDGenerator class? This does what you're attempting to do, and what Marc Gravell describes.

The ObjectIDGenerator keeps track of previously identified objects. When you ask for the ID of an object, the ObjectIDGenerator knows whether to return the existing ID, or generate and remember a new ID.

The IDs are unique for the life of the ObjectIDGenerator instance. Generally, an ObjectIDGenerator's life lasts as long as the Formatter that created it. Object IDs have meaning only within a given serialized stream, and are used for tracking which objects have references to others within the serialized object graph.

Using a hash table, the ObjectIDGenerator retains which ID is assigned to which object. The object references, which uniquely identify each object, are addresses in the runtime garbage-collected heap. Object reference values can change during serialization, but the table is updated automatically so the information is correct.

Object IDs are 64-bit numbers. Allocation starts from one, so zero is never a valid object ID. A formatter can choose a zero value to represent an object reference whose value is a null reference (Nothing in Visual Basic).
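
A small usage sketch of the behaviour described above, using only GetId and HasId. The demo class name is made up. Keep in mind that the generator holds on to every object it has seen, and that recent .NET versions may flag the type as obsolete along with the rest of formatter-based serialization.

```csharp
using System;
using System.Runtime.Serialization;

public static class ObjectIdGeneratorDemo
{
    public static void Main()
    {
        var generator = new ObjectIDGenerator();
        object a = new object();
        bool firstTime;

        long id1 = generator.GetId(a, out firstTime);
        Console.WriteLine(firstTime);      // True: a had no ID yet

        long id2 = generator.GetId(a, out firstTime);
        Console.WriteLine(firstTime);      // False: the existing ID is returned
        Console.WriteLine(id1 == id2);     // True

        // HasId only looks up; it returns 0 for an object it has not seen.
        long unknown = generator.HasId(new object(), out firstTime);
        Console.WriteLine(unknown);        // 0
    }
}
```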

傾城如夢未必闌珊 2024-07-24 12:56:55

The reference is the unique identifier for the object. I don't know of any way of converting this into anything like a string etc. The value of the reference will change during compaction (as you've seen), but every previous value A will be changed to value B, so as far as safe code is concerned it's still a unique ID.

If the objects involved are under your control, you could create a mapping using weak references (to avoid preventing garbage collection) from a reference to an ID of your choosing (GUID, integer, whatever). That would add a certain amount of overhead and complexity, however.

日裸衫吸 2024-07-24 12:56:55

RuntimeHelpers.GetHashCode() may help (MSDN).
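
To show what this buys you, here is a minimal sketch (AlwaysZero is a made-up class). RuntimeHelpers.GetHashCode ignores any GetHashCode override and is stable for a given instance, but it is still a hash code: two different objects may share the same value, so it is not a unique ID on its own.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose own hash code is useless for telling instances apart.
public sealed class AlwaysZero
{
    public override int GetHashCode() { return 0; }
}

public static class RuntimeHashDemo
{
    public static void Main()
    {
        var a = new AlwaysZero();

        Console.WriteLine(a.GetHashCode()); // 0, the override

        // The runtime's identity-based hash bypasses the override and
        // stays the same for the lifetime of the instance.
        int h1 = RuntimeHelpers.GetHashCode(a);
        int h2 = RuntimeHelpers.GetHashCode(a);
        Console.WriteLine(h1 == h2);        // True
    }
}
```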

樱娆 2024-07-24 12:56:55

You can develop your own thing in a second. For instance:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        var a = new object();
        var b = new object();
        // Prints the two IDs: "0 1"
        Console.WriteLine("{0} {1}", a.GetId(), b.GetId());
    }
}

public static class MyExtensions
{
    // This dictionary should use weak key references; as written it keeps
    // every object alive and relies on the key's own Equals/GetHashCode.
    static Dictionary<object, int> d = new Dictionary<object, int>();
    static int gid = 0;

    public static int GetId(this object o)
    {
        if (d.ContainsKey(o)) return d[o];
        return d[o] = gid++;
    }
}

You can choose what you would like to use as the unique ID yourself, for instance System.Guid.NewGuid(), or simply an integer for the fastest access.

黯淡〆 2024-07-24 12:56:55

How about this method:

Set a field in the first object to a new value. If the same field in the second object has the same value, it's probably the same instance. Otherwise, exit as different.

Now set the field in the first object to a different new value. If the same field in the second object has changed to the different value, it's definitely the same instance.

Don't forget to set the field in the first object back to its original value on exit.

Problems?

捂风挽笑 2024-07-24 12:56:55

It is possible to make a unique object identifier in Visual Studio: In the watch window, right-click the object variable and choose Make Object ID from the context menu.

Unfortunately, this is a manual step, and I don't believe the identifier can be accessed via code.

十六岁半 2024-07-24 12:56:55

You would have to assign such an identifier yourself, manually - either inside the instance, or externally.

For records related to a database, the primary key may be useful (but you can still get duplicates). Alternatively, either use a Guid, or keep your own counter, allocating using Interlocked.Increment (and make it large enough that it isn't likely to overflow).
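
For the "inside the instance" variant with your own counter, a minimal sketch could look like this. The class name Tracked is made up, and it only works for types you control, which is not the asker's situation. A long counter is used so that overflow is not a practical concern.

```csharp
using System;
using System.Threading;

// Hypothetical class that stamps each instance with a process-wide unique ID.
public class Tracked
{
    private static long lastId; // shared counter; the first ID handed out is 1

    public long Id { get; private set; }

    public Tracked()
    {
        // Interlocked.Increment is atomic, so concurrent constructors never get the same value.
        Id = Interlocked.Increment(ref lastId);
    }
}

public static class TrackedDemo
{
    public static void Main()
    {
        var first = new Tracked();
        var second = new Tracked();
        Console.WriteLine(first.Id != second.Id); // True
        Console.WriteLine(second.Id > first.Id);  // True
    }
}
```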

喜爱纠缠 2024-07-24 12:56:55

I know that this has been answered, but it's at least useful to note that you can use:

http://msdn.microsoft.com/en-us/library/system.object.referenceequals.aspx

Which will not give you a "unique id" directly, but combined with WeakReferences (and a hashset?) could give you a pretty easy way of tracking various instances.
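
One way to read that suggestion, as a rough sketch: keep a list of WeakReference entries and compare targets with object.ReferenceEquals. WeakInstanceTracker and Track are made-up names, and a plain list with a linear scan is used here instead of a hash set to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tracker: remembers instances without keeping them alive.
public sealed class WeakInstanceTracker
{
    private readonly List<WeakReference> entries = new List<WeakReference>();

    // Returns true if the instance was not tracked before.
    public bool Track(object o)
    {
        entries.RemoveAll(w => !w.IsAlive); // drop entries whose object was collected
        foreach (WeakReference w in entries)
        {
            // Identity comparison; Equals overrides play no role here.
            if (object.ReferenceEquals(w.Target, o)) return false;
        }
        entries.Add(new WeakReference(o));
        return true;
    }
}

public static class WeakInstanceTrackerDemo
{
    public static void Main()
    {
        var tracker = new WeakInstanceTracker();
        string s1 = new string('x', 3);
        string s2 = new string('x', 3); // equal content, different instance

        Console.WriteLine(tracker.Track(s1)); // True
        Console.WriteLine(tracker.Track(s1)); // False
        Console.WriteLine(tracker.Track(s2)); // True
        GC.KeepAlive(s1);
        GC.KeepAlive(s2);
    }
}
```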

只是偏爱你 2024-07-24 12:56:55

If you are writing a module in your own code for a specific usage, majkinetor's method MIGHT have worked. But there are some problems.

First, the official documentation does NOT guarantee that GetHashCode() returns a unique identifier (see Object.GetHashCode Method ()):

You should not assume that equal hash codes imply object equality.

Second, even if you have so few objects that GetHashCode() works in most cases, this method can be overridden by some types.
For example, suppose you are using some class C that overrides GetHashCode() to always return 0. Then every object of C will get the same hash code.
Unfortunately, Dictionary, Hashtable and some other associative containers make use of this method:

A hash code is a numeric value that is used to insert and identify an object in a hash-based collection such as the Dictionary<TKey, TValue> class, the Hashtable class, or a type derived from the DictionaryBase class. The GetHashCode method provides this hash code for algorithms that need quick checks of object equality.

So, this approach has great limitations.

Furthermore, what if you want to build a general-purpose library?
Not only are you unable to modify the source code of the classes being used, but their behavior is also unpredictable.

I appreciate that Jon and Simon have posted their answers, and I will post a code example and a suggestion on performance below.

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.Serialization;
using System.Collections.Generic;


namespace ObjectSet
{
    public interface IObjectSet
    {
        /// <summary> check the existence of an object. </summary>
        /// <returns> true if object is exist, false otherwise. </returns>
        bool IsExist(object obj);

        /// <summary> if the object is not in the set, add it in. else do nothing. </summary>
        /// <returns> true if successfully added, false otherwise. </returns>
        bool Add(object obj);
    }

    public sealed class ObjectSetUsingConditionalWeakTable : IObjectSet
    {
        /// <summary> unit test on object set. </summary>
        internal static void Main() {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            ObjectSetUsingConditionalWeakTable objSet = new ObjectSetUsingConditionalWeakTable();
            for (int i = 0; i < 10000000; ++i) {
                object obj = new object();
                if (objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.Add(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }


        public bool IsExist(object obj) {
            return objectSet.TryGetValue(obj, out tryGetValue_out0);
        }

        public bool Add(object obj) {
            if (IsExist(obj)) {
                return false;
            } else {
                objectSet.Add(obj, null);
                return true;
            }
        }

        /// <summary> internal representation of the set. (only use the key) </summary>
        private ConditionalWeakTable<object, object> objectSet = new ConditionalWeakTable<object, object>();

        /// <summary> used to fill the out parameter of ConditionalWeakTable.TryGetValue(). </summary>
        private static object tryGetValue_out0 = null;
    }

    [Obsolete("It will crash if there are too many objects and ObjectSetUsingConditionalWeakTable get a better performance.")]
    public sealed class ObjectSetUsingObjectIDGenerator : IObjectSet
    {
        /// <summary> unit test on object set; call it explicitly, since a program can have only one Main. </summary>
        internal static void RunTest() {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            ObjectSetUsingObjectIDGenerator objSet = new ObjectSetUsingObjectIDGenerator();
            for (int i = 0; i < 10000000; ++i) {
                object obj = new object();
                if (objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.Add(obj)) { Console.WriteLine("bug!!!"); }
                if (!objSet.IsExist(obj)) { Console.WriteLine("bug!!!"); }
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }


        public bool IsExist(object obj) {
            bool firstTime;
            idGenerator.HasId(obj, out firstTime);
            return !firstTime;
        }

        public bool Add(object obj) {
            bool firstTime;
            idGenerator.GetId(obj, out firstTime);
            return firstTime;
        }


        /// <summary> internal representation of the set. </summary>
        private ObjectIDGenerator idGenerator = new ObjectIDGenerator();
    }
}

In my test, ObjectIDGenerator throws an exception complaining that there are too many objects when 10,000,000 objects (10x more than in the code above) are created in the for loop.

Also, in the benchmark the ConditionalWeakTable implementation was 1.8x faster than the ObjectIDGenerator implementation.

心清如水 2024-07-24 12:56:55

The information I give here is not new; I just added it for completeness.

The idea of this code is quite simple:

  • Objects need a unique ID, which isn't there by default. Instead, we have to rely on the next best thing, which is RuntimeHelpers.GetHashCode, to get us a sort-of unique ID.
  • To check uniqueness, we then need to use object.ReferenceEquals.
  • However, we would still like to have a unique ID, so I added a GUID, which is unique for all practical purposes.
  • Because I don't like locking everything if I don't have to, I don't use ConditionalWeakTable.

Combined, that will give you the following code:

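using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Note: the dictionary holds strong references, so mapped objects stay alive.
public class UniqueIdMapper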
{
    private class ObjectEqualityComparer : IEqualityComparer<object>
    {
        public bool Equals(object x, object y)
        {
            return object.ReferenceEquals(x, y);
        }

        public int GetHashCode(object obj)
        {
            return RuntimeHelpers.GetHashCode(obj);
        }
    }

    private Dictionary<object, Guid> dict = new Dictionary<object, Guid>(new ObjectEqualityComparer());
    public Guid GetUniqueId(object o)
    {
        Guid id;
        if (!dict.TryGetValue(o, out id))
        {
            id = Guid.NewGuid();
            dict.Add(o, id);
        }
        return id;
    }
}

To use it, create an instance of the UniqueIdMapper and use the GUIDs it returns for the objects.
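
As a side note, on .NET 5 and later (an assumption about your target framework) the hand-written comparer can probably be replaced by the built-in ReferenceEqualityComparer.Instance, which likewise compares by reference and hashes via RuntimeHelpers.GetHashCode. A minimal sketch, with a made-up demo class name:

```csharp
using System;
using System.Collections.Generic;

public static class ReferenceComparerDemo
{
    public static void Main()
    {
        // Keys are compared by identity, not by Equals/GetHashCode overrides.
        var ids = new Dictionary<object, Guid>(ReferenceEqualityComparer.Instance);

        string s1 = new string('x', 3);
        string s2 = new string('x', 3); // equal content, different instance

        ids[s1] = Guid.NewGuid();
        ids[s2] = Guid.NewGuid();

        Console.WriteLine(s1.Equals(s2)); // True
        Console.WriteLine(ids.Count);     // 2: the two strings are separate keys
    }
}
```

Like the mapper above, this dictionary keeps its keys alive for as long as the dictionary itself lives.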


Addendum

So, there's a bit more going on here; let me write down a few things about ConditionalWeakTable.

ConditionalWeakTable does a couple of things. The most important one is that it does not keep objects alive: the objects you use as keys in this table will be collected regardless. If you look up an object, it basically works the same as the dictionary above.

Curious, no? After all, when the GC considers an object for collection, it checks whether there are references to the object, and only if there are none does it collect the object. So if a ConditionalWeakTable refers to an object, why will the referenced object still be collected?

ConditionalWeakTable uses a small trick, which some other .NET structures also use: instead of storing a reference to the object, it actually stores an IntPtr. Because that's not a real reference, the object can be collected.

So, at this point there are 2 problems to address. First, objects can be moved on the heap, so what will we use as IntPtr? And second, how do we know that objects have an active reference?

  • The object can be pinned on the heap, and its real pointer can be stored. When the GC hits the object for removal, it unpins it and collects it. However, that would mean we get a pinned resource, which isn't a good idea if you have a lot of objects (due to memory fragmentation issues). This is probably not how it works.
  • When the GC moves an object, it calls back, which can then update the references. This might be how it's implemented judging by the external calls in DependentHandle - but I believe it's slightly more sophisticated.
  • Not the pointer to the object itself, but a pointer in the list of all objects from the GC is stored. The IntPtr is either an index or a pointer in this list. The list only changes when an object changes generations, at which point a simple callback can update the pointers. If you remember how Mark & Sweep works, this makes more sense. There's no pinning, and removal is as it was before. I believe this is how it works in DependentHandle.

This last solution does require that the runtime doesn't re-use the list buckets until they are explicitly freed, and it also requires that all objects are retrieved by a call to the runtime.

If we assume they use this solution, we can also address the second problem. The Mark & Sweep algorithm keeps track of which objects have been collected, so as soon as an object has been collected, the runtime knows. Once the table checks whether the object is still there and finds that it is not, it calls 'Free', which removes the pointer and the list entry. The object is really gone.

One important thing to note at this point is that things would go horribly wrong if ConditionalWeakTable were updated from multiple threads without being thread-safe. The result would be a memory leak. This is why all calls in ConditionalWeakTable do a simple 'lock', which ensures this doesn't happen.

Another thing to note is that cleaning up entries has to happen once in a while. While the actual objects will be cleaned up by the GC, the entries are not. This is why ConditionalWeakTable only grows in size. Once it hits a certain limit (determined by collision chance in the hash), it triggers a Resize, which checks if objects have to be cleaned up -- if they do, free is called in the GC process, removing the IntPtr handle.

I believe this is also why DependentHandle is not exposed directly - you don't want to mess with things and get a memory leak as a result. The next best thing for that is a WeakReference (which also stores an IntPtr instead of an object) - but unfortunately doesn't include the 'dependency' aspect.

What remains is for you to toy around with the mechanics, so that you can see the dependency in action. Be sure to start it multiple times and watch the results:

using System;
using System.Runtime.CompilerServices;

class DependentObject
{
    public class MyKey : IDisposable
    {
        public MyKey(bool iskey)
        {
            this.iskey = iskey;
        }

        private bool disposed = false;
        private bool iskey;

        public void Dispose()
        {
            if (!disposed)
            {
                disposed = true;
                Console.WriteLine("Cleanup {0}", iskey);
            }
        }

        ~MyKey()
        {
            Dispose();
        }
    }

    static void Main(string[] args)
    {
        var dep = new MyKey(true); // also try passing this to cwt.Add

        ConditionalWeakTable<MyKey, MyKey> cwt = new ConditionalWeakTable<MyKey, MyKey>();
        cwt.Add(new MyKey(true), dep); // try doing this 5 times f.ex.

        GC.Collect(GC.MaxGeneration);
        GC.WaitForFullGCComplete();

        Console.WriteLine("Wait");
        Console.ReadLine(); // Put a breakpoint here and inspect cwt to see that the IntPtr is still there
    }
}