用于唯一标识节点的最佳集合是什么？

发布于 2024-12-13 06:59:06 字数 1234 浏览 0 评论 0原文

目前我正在使用 Dictionary 来存储大约 10,000 个节点。键用作稍后查找的 ID 号，“节点”是包含一些数据的类。程序中的其他类使用 ID 号作为指向节点的指针。（这听起来可能效率低下。但是，解释我使用字典的原因超出了我的问题范围。）

但是，20% 的节点是重复的。我想要做的是当我添加一个节点时检查它是否全部准备好存在。如果确实如此，则使用该 ID 号。如果没有创建一个新的。

这是我当前对该问题的解决方案：

public class nodeDictionary 
{

    Dictionary<int, node> dict = new Dictionary<int, node>( );
    public int addNewNode( latLng ll )
    {
        node n = new node( ll );
        if ( dict.ContainsValue( n ) )
        {
            foreach ( KeyValuePair<int, node> kv in dict )
            {
                if ( kv.Value == n )
                {
                    return kv.Key;
                }
            }
        }
        else
        {
            if ( dict.Count != 0 )
            {
                dict.Add( dict.Last( ).Key + 1, n );
                return dict.Last( ).Key + 1;
            }
            else
            {
                dict.Add( 0, n );
                return 0;
            }
        }
        throw new Exception( );
    }//end add new node
}

问题是当尝试将新节点添加到 100,000 个节点的列表中时，添加该节点需要 78 毫秒。这是不可接受的，因为我可能在任何给定时间添加额外的 1,000 个节点。

那么，有更好的方法吗？我不是在寻找有人为我编写代码，我只是在寻找指导。

原文

Currently I am using a Dictionary<int,node> to store around 10,000 nodes. The key is used as an ID number for later look up and the 'node' is a class that contains some data. Other classes within the program use the ID number as a pointer to the node. (this may sound inefficient. However, explaining my reasoning for using a dictionary for this is beyond the scope of my question.)

However, 20% of the nodes are duplicate.
What i want to do is when i add a node check to see if it all ready exists. if it does then use that ID number. If not create a new one.

This is my current solution to the problem:

public class nodeDictionary 
{

    Dictionary<int, node> dict = new Dictionary<int, node>( );
    public int addNewNode( latLng ll )
    {
        node n = new node( ll );
        if ( dict.ContainsValue( n ) )
        {
            foreach ( KeyValuePair<int, node> kv in dict )
            {
                if ( kv.Value == n )
                {
                    return kv.Key;
                }
            }
        }
        else
        {
            if ( dict.Count != 0 )
            {
                dict.Add( dict.Last( ).Key + 1, n );
                return dict.Last( ).Key + 1;
            }
            else
            {
                dict.Add( 0, n );
                return 0;
            }
        }
        throw new Exception( );
    }//end add new node
}

The problem with this is when trying to add a new node to a list of 100,000 nodes it takes 78 milliseconds to add the node. This is unacceptable because i could be adding an additional 1,000 nodes at any given time.

So, is there a better way do do this? I am not looking for someone to write the code for me, I am just looking for guidance.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

屋顶上的小猫咪 2024-12-20 06:59:06

听起来您想

确保 LatLng 覆盖 Equals/GetHashCode （最好实现 IEquatable接口）
将所有项目直接填充到 HashSet

中实现 GetHashCode，请参见此处：为什么是当覆盖 Equals 方法时覆盖 GetHashCode 很重要吗？

如果您需要以某种方式生成“人工”唯一 ID，我建议您使用再次使用字典方法，但“相反”：

// uses the same hash function for speedy lookup/insertion
IDictionary<LatLng, int> idMap = new Dictionary<LatLng, int>(); 

foreach (LatLng latLng in LatLngCoords)
{
    if (!idMap.ContainsKey(latLng))
        idMap.Add(latLng, idMap.Count+1); // to start with 1
}

您可以使用 idMap 替换 HashSet；实现（和性能特征）本质上是相同的，但作为关联容器。

这是从 LatLng 到 Id 的查找函数：

int IdLookup(LatLng latLng)
{
     int id;
     if (idMap.TryGetValue(latLng, id))
         return id;
     throw new InvalidArgumentException("Coordinate not in idMap");
}

您可以及时添加它：

int IdFor(LatLng latLng)
{
     int id;
     if (idMap.TryGetValue(latLng, id))
         return id;

     id = idMap.Count+1;
     idMap.Add(latLng, id);
     return id;
}

It sounds like you want to

make sure that LatLng overrides Equals/GetHashCode (preferrably implement the IEquatable<LatLng> interface)
stuff all the items directly into a HashSet<LatLng>

For implementing GetHashCode, see here: Why is it important to override GetHashCode when Equals method is overridden?

If you need to generate 'artificial' unique IDs in some fashion, I suggest you use the dictionary approach again, but 'in reverse':

// uses the same hash function for speedy lookup/insertion
IDictionary<LatLng, int> idMap = new Dictionary<LatLng, int>(); 

foreach (LatLng latLng in LatLngCoords)
{
    if (!idMap.ContainsKey(latLng))
        idMap.Add(latLng, idMap.Count+1); // to start with 1
}

You can have the idMap replace the HashSet<>; the implementation (and performance characteristics) is essentially the same but as an associative container.

Here's a lookup function to get from LatLng to Id:

int IdLookup(LatLng latLng)
{
     int id;
     if (idMap.TryGetValue(latLng, id))
         return id;
     throw new InvalidArgumentException("Coordinate not in idMap");
}

You could just-in-time add it:

int IdFor(LatLng latLng)
{
     int id;
     if (idMap.TryGetValue(latLng, id))
         return id;

     id = idMap.Count+1;
     idMap.Add(latLng, id);
     return id;
}

回复收藏 0 原文

久夏青 2024-12-20 06:59:06

我会为相反方向添加第二个字典。 ie Dictionary

那么你要么

满足于引用相等并且什么也不做。
创建一个 IEqualityComparer 并将其提供给
Node 上的字典 Override Equals 和 GetHashCode

在这两种情况下哈希码的良好实现对于获得良好的性能至关重要。

回复收藏 0 原文

故事还在继续 2024-12-20 06:59:06

你的解决方案不仅慢，而且是错误的。 Dictionary 中项目的顺序未定义，因此 dict.Last() 不能保证返回最后添加的项目。（尽管通常看起来是这样。）

使用 id 来标识应用程序中的对象似乎也是错误的。您应该考虑直接使用对该对象的引用。

但是，如果您想使用当前的设计并假设您根据 latLng 比较节点，则可以创建两个字典：您已有的一个和第二个字典，Dictionary，可以用来有效地找出某个节点是否已经存在。如果是的话，它会给你它的 id。

回复收藏 0 原文

孤星 2024-12-20 06:59:06

这段代码的目的究竟是什么？

if ( dict.ContainsValue( n ) )
{
    foreach ( KeyValuePair kv in dict )
    {
        if ( kv.Value == n )
        {
            return kv.Key;
        }
    }
}

ContainsValue 搜索值（而不是键）并且效率非常低 (O(n))。 foreach 也是如此。更不用说当只需要一个时您就同时执行这两项操作（您可以通过稍微重新安排您的 if 来完全删除 ContainsValue）！

您可能应该维护与原始字典“相反”的附加字典（即旧字典中的值是新字典中的键，反之亦然），以“覆盖”您的搜索模式（类似于数据库如何维护表中的多个索引）以涵盖查询表的多种方式）。

What exactly is the purpose of this code?

if ( dict.ContainsValue( n ) )
{
    foreach ( KeyValuePair kv in dict )
    {
        if ( kv.Value == n )
        {
            return kv.Key;
        }
    }
}

The ContainsValue searches for a value (instead of a key) and is very inefficient (O(n)). Ditto for foreach. Let alone you do both when only one is necessary (you could completely remove ContainsValue by rearranging your ifs a little)!

You should probably mainntain additional dictionary that is "reverse" of the original one (i.e. values in old dictionary are keys in the new one and vice versa), to "cover" your search patterns (similarly to how databases can maintain multiple indexes par table to cover multiple ways table can be queried).

回复收藏 0 原文

半枫 2024-12-20 06:59:06

您可以尝试使用 HashSet

回复收藏 0 原文

音盲 2024-12-20 06:59:06

您可能需要考虑将其重组为仅使用列表（其中“键”只是列表中的索引）而不是字典。一些优点：

通过整数键查找元素现在是 O(1)（并且非常快的 O(1)，因为它只是内部的数组取消引用）。
当插入一个新元素时，您将执行 O(n) 搜索来查看它是否已存在于列表中。如果没有，您也已经遍历了该列表，并且可以记录您是否遇到了具有空记录的条目。如果有，该索引就是新键。如果不是，则新键是当前列表计数。您只需枚举一次集合而不是多次，并且枚举本身比枚举字典要快得多。

回复收藏 0 原文

~没有更多了~