Is there a data structure in .NET that holds sets of data?

Posted 2024-08-21 01:02:37 · 900 characters · 4 views · 0 comments

I'm looking for a data structure similar to a dictionary that returns the set of all related items to a key.

For example, I would use it like this:

var data = new FancyDataStructure();

data.Add(new string[] {"Elizabeth", "Liz", "Betty"});
data.Add(new string[] {"Bob", "Robert", "Rob"});

string[] alternateNames1 = data["Betty"];
string[] alternateNames2 = data["Liz"];

In this instance, alternateNames1 would be an array containing "Liz" and "Elizabeth", and alternateNames2 would be an array containing "Elizabeth" and "Betty."

I don't want to reinvent this, but I couldn't find any examples of such a structure.

Update

Thank you to those who have written back with suggestions. Many people have suggested using some version of Dictionary<string, IEnumerable<string>>. I am currently using this approach, but it doesn't actually fulfill the requirement without being horribly difficult to maintain. Every value in every list needs to be able to function as a key to every other value ever added to the same set.

Thus, given the following:

data.Add(new string[] {"Elizabeth", "Liz"});
data.Add(new string[] {"Liz", "Betty"});
alternates = data["Betty"];

I would expect alternates to now contain "Elizabeth" and "Liz".

It looks as though I might just have to build such a structure to suit my needs. Keep the ideas coming though!

Brian

Comments (12)

守望孤独 2024-08-28 01:02:37

Your problem sounds like it is really a graph problem. Think of the names as nodes and membership in the same set as the edges. From this standpoint, you would want a data structure that handles sparse graphs well, such as an adjacency list. This is, of course, similar to what you are already doing with a Dictionary<string, IEnumerable<string>>, but thinking about it this way might lead you to some helpful implementations and algorithms.
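To make the graph view concrete, here is a minimal sketch of an adjacency list with a breadth-first search that collects everything connected to a key. The class name NameGraph and its Add and Related members are hypothetical, invented for this illustration; they are not part of .NET or of the question's code.

```csharp
using System.Collections.Generic;

// Hypothetical helper: names are nodes, "listed in the same group" is an undirected edge.
public class NameGraph
{
    private readonly Dictionary<string, HashSet<string>> adjacency =
        new Dictionary<string, HashSet<string>>();

    // Link every name in the group to the first one. That is enough to
    // connect the whole group without storing every pair of names.
    public void Add(params string[] names)
    {
        foreach (string name in names)
        {
            Neighbors(name); // make sure the node exists
            if (name != names[0])
            {
                Neighbors(names[0]).Add(name);
                Neighbors(name).Add(names[0]);
            }
        }
    }

    // Breadth-first search: every name reachable from key, excluding key itself.
    public List<string> Related(string key)
    {
        var result = new List<string>();
        if (!adjacency.ContainsKey(key)) return result;

        var seen = new HashSet<string> { key };
        var queue = new Queue<string>();
        queue.Enqueue(key);
        while (queue.Count > 0)
        {
            foreach (string next in adjacency[queue.Dequeue()])
            {
                if (seen.Add(next)) // Add returns false if already visited
                {
                    result.Add(next);
                    queue.Enqueue(next);
                }
            }
        }
        return result;
    }

    // Returns the neighbor set for name, creating an empty one if needed.
    private HashSet<string> Neighbors(string name)
    {
        HashSet<string> set;
        if (!adjacency.TryGetValue(name, out set))
        {
            set = new HashSet<string>();
            adjacency[name] = set;
        }
        return set;
    }
}
```

The trade-off in this sketch is that adding a group is cheap, while each lookup walks the whole connected component instead of being a single dictionary read.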

微凉 2024-08-28 01:02:37

The System.Collections.Generic and System.Collections namespaces are loaded with key/value-pair dictionaries, sorted dictionaries, list objects, and much more.

System.Collections.Generic.Dictionary<int, string> dic = new Dictionary<int, string>();
dic.Add(1, "test");

Or a nested list inside a dictionary:

Dictionary<string, List<string>> dic = new Dictionary<string, List<string>>();
List<string> alternatives = new List<string>();
alternatives.Add("Brenda");
dic.Add("Betty", alternatives);
迷荒 2024-08-28 01:02:37

Just a thought in another direction - strongly typed datasets seem to have a lot going for them. And serialized as byte arrays they are pretty fast for moving multidimensionally structured data around.

Iteration and LINQ capability are more or less built in.

Maybe overkill for a lot of things, but I have a number of places where I store the whole dataset in one varbinary(max) column in SQL.

烟雨凡馨 2024-08-28 01:02:37

Something like this seems simple enough.

var data = new List<string[]>();

data.Add(new string[] {"Elizabeth", "Liz", "Betty"});
data.Add(new string[] {"Bob", "Robert", "Rob"});

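// Requires: using System.Linq;
// SelectMany flattens the matching groups into a single string[] of alternates.
string[] alternateNames1 = data
    .Where(x => x.Contains("Betty"))
    .SelectMany(x => x.Where(y => y != "Betty"))
    .ToArray();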
寄风 2024-08-28 01:02:37

I would just use the type Dictionary<string, IEnumerable<string>>. To build this structure from a list of lists, you could have code like this:

var alternateNames = new string[][] {
    new string[] { "Elizabeth", "Liz", "Betty" },
    new string[] { "Bob", "Robert", "Rob" }, };
var altNameLookup = 
    (
        from nameList in alternateNames
        from name in nameList
        select new { 
            Name = name, NameList = nameList.Except(new string[] { name } ) }
    ).ToDictionary(o => o.Name, o => o.NameList);
人事已非 2024-08-28 01:02:37

You basically have a dictionary where multiple keys map to the same value. There's no built-in data structure that supports the operation you want, but it's easy to represent as a Dictionary<string, HashSet<string>> in .NET:

static void AddNames(Dictionary<string, HashSet<string>> map, params string[] names)
{
    for (int i = 0; i < names.Length; i++)
    {
        HashSet<string> value;
        if (!map.TryGetValue(names[i], out value))
        {
            value = new HashSet<string>();
            map.Add(names[i], value);
        }

        for (int j = 0; j < names.Length; j++)
        {
            value.Add(names[j]);
        }
    }
}

static void Main(string[] args)
{
    Dictionary<string, HashSet<string>> names = new Dictionary<string,HashSet<string>>();
    AddNames(names, "Chris", "Christopher");
    AddNames(names, "Christina", "Chrissy", "Chris");

    HashSet<string> relatedToChris = names["Chris"];              // gets "Chris", "Christina", "Chrissy", "Christopher"
    HashSet<string> namesRelatedToChristina = names["Christina"]; // gets "Christina", "Chrissy", "Chris"
}

You can think of your data structure as a directed graph where each node has an edge to each of its related names. Since a group of n names has n^2 edges, inserting it takes O(n^2) time and memory. It's not possible to reduce the lookup time to anything better.

Fortunately, since it's implemented as a dictionary, lookups are still O(1). Deletes are O(m), where m is the number of values related to a key.

友欢 2024-08-28 01:02:37

The de facto alt.net standard is in Iesi.Collections, but the base class library only has HashSet<T>, in .NET 3.5 or above.

I've used "group by"-like clauses in LINQ to easily remove duplicates from arbitrary IEnumerable<T> collections, but that doesn't quite give you set semantics.

HashSet<> is close to what you want.

Based on your requirements, I don't think there's anything off the shelf that would map strings to pre-existing collections. Basically, you'd have to write a class with a method like StoreAssociations<T>(IEnumerable<T> names) that converts the IEnumerable to a HashSet and then iterates over each item in the HashSet, adding a mapping in an IDictionary<string,HashSet<T>> from that item to the newly created hash set.

仙气飘飘 2024-08-28 01:02:37

How about a pair of data structures: Dictionary<string, Guid> and Dictionary<Guid, List<string>>?

To add a pair of keys (a, b) [you can decompose a larger add into pairs (1+2, 2+3, ...)], proceed as follows:

Look up a and b in the first dictionary.

If neither exists, create a new Guid g, add (a, g) and (b, g) to the first dictionary, and add (g, List{a, b}) to the second dictionary.

If one exists, say a, grab its Guid (g), add (b, g) to the first dictionary, and append b to the list found at [g] in the second dictionary.

If both exist and they have the same Guid, there is nothing to do.

If both exist and they have different Guids, you need to merge the two sets (this is something most of the other proposed solutions seem to be missing). Pick a Guid to eliminate, get its list from the second dictionary, add those strings to the other entry, and then remove the eliminated entry. Finally, update every word from that list in the first dictionary so that it maps to the surviving Guid.

To get all the related words, lookup the Guid in the first dictionary and grab the list from the second dictionary.

Of course a static incrementing long value would probably work better than a Guid.
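The steps above can be sketched roughly as follows, using an incrementing long instead of a Guid as the last remark suggests. The class name AliasGroups and its members are hypothetical names made up for this illustration. As a simplification, the sketch first gives every unseen name its own one-element group and then merges, rather than branching on the four cases separately; that shortcut is an assumption of this sketch, not part of the answer.

```csharp
using System.Collections.Generic;

// Hypothetical class following the two-dictionary idea: name -> group id, group id -> members.
public class AliasGroups
{
    private readonly Dictionary<string, long> groupOf = new Dictionary<string, long>();
    private readonly Dictionary<long, List<string>> members = new Dictionary<long, List<string>>();
    private long nextId;

    // Puts all given names into one group, merging any groups they already belong to.
    public void Add(params string[] names)
    {
        if (names.Length == 0) return;
        long target = GroupFor(names[0]);
        for (int i = 1; i < names.Length; i++)
        {
            long other = GroupFor(names[i]);
            if (other != target) target = Merge(target, other);
        }
    }

    // Returns every name in key's group except key itself.
    // Throws KeyNotFoundException if key was never added.
    public string[] this[string key]
    {
        get
        {
            var result = new List<string>(members[groupOf[key]]);
            result.Remove(key);
            return result.ToArray();
        }
    }

    // Returns the group id of name, creating a one-element group if needed.
    private long GroupFor(string name)
    {
        long id;
        if (!groupOf.TryGetValue(name, out id))
        {
            id = nextId++;
            groupOf[name] = id;
            members[id] = new List<string> { name };
        }
        return id;
    }

    // Folds the smaller group into the larger one and returns the surviving id.
    private long Merge(long a, long b)
    {
        if (members[a].Count < members[b].Count) { long t = a; a = b; b = t; }
        foreach (string name in members[b])
        {
            groupOf[name] = a; // repoint every word of the eliminated group
        }
        members[a].AddRange(members[b]);
        members.Remove(b);
        return a;
    }
}
```

Folding the smaller group into the larger one keeps the number of repointed names per merge low, and a lookup is two dictionary reads plus a copy of the member list.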

三寸金莲 2024-08-28 01:02:37

Or, since List is a reference type you could do the following ...

Dictionary<string, List<string>> dict = new ...

Proceed as follows:

To add a single association (a = b) {decomposed from a list of equivalences}

Lookup a and b in the Dictionary

If neither exists

// a and b must share one list so that each can look up the other.
var list = new List<string>(){a, b};
dict.Add(a, list);
dict.Add(b, list);

If one exists, say, a

var list = dict[a];
list.Add(b);
dict.Add(b, list);

If both exist and the lists are the same (object compare), you are done.

If both exist and the lists are different:

var list1 = dict[a];
var list2 = dict[b];
list1.AddRange(list2);
// Repoint every member of list2 (not only b) at the merged list.
foreach (var name in list2)
    dict[name] = list1;
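
The four cases can be combined into a single helper. This is a minimal sketch under the assumption that every name in a group should reference one shared List instance; the names AliasLists and Associate are hypothetical. In the merge case it repoints every member of the absorbed list, which lookups need in order to stay consistent.

```csharp
using System.Collections.Generic;

public static class AliasLists
{
    // Records that a and b are alternates; all names in a group share one List instance.
    public static void Associate(Dictionary<string, List<string>> dict, string a, string b)
    {
        List<string> listA, listB;
        bool hasA = dict.TryGetValue(a, out listA);
        bool hasB = dict.TryGetValue(b, out listB);

        if (!hasA && !hasB)
        {
            // Neither exists: start a new shared list.
            var list = new List<string> { a };
            if (a != b) list.Add(b);
            dict[a] = list;
            dict[b] = list;
        }
        else if (hasA && !hasB)
        {
            // Only a exists: b joins a's list.
            listA.Add(b);
            dict[b] = listA;
        }
        else if (!hasA)
        {
            // Only b exists: a joins b's list.
            listB.Add(a);
            dict[a] = listB;
        }
        else if (!ReferenceEquals(listA, listB))
        {
            // Both exist in different lists: merge and repoint all of listB's members.
            listA.AddRange(listB);
            foreach (string name in listB) dict[name] = listA;
        }
        // Both exist and already share a list: nothing to do.
    }
}
```

Note that each list includes the key itself, so a caller who wants only the alternates has to filter the key out.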
谷夏 2024-08-28 01:02:37

I wrote some code. I don't know how efficient it is, but I think it does what you want.

Here is your structure:

using System.Collections.Generic;
using System.Linq;

class FancyDataStructure
{
    // Every name in a group maps to the same shared HashSet instance.
    private IDictionary<string, HashSet<string>> dictionary 
        = new Dictionary<string, HashSet<string>>();

    public void Add(params string[] names)
    {
        // Start from the new names and absorb every group they already belong to.
        HashSet<string> merged = new HashSet<string>(names);
        foreach (string name in names)
        {
            HashSet<string> existing;
            if (dictionary.TryGetValue(name, out existing))
            {
                merged.UnionWith(existing);
            }
        }

        // Repoint every member, old and new, at the merged set so that no
        // entry is left referring to a stale, smaller set.
        foreach (string name in merged)
        {
            dictionary[name] = merged;
        }
    }

    public string[] this[string key]
    {
        get
        {
            List<string> result = dictionary[key].ToList<string>();
            result.Remove(key);
            return result.ToArray();
        }
    }
}

You can use it like this:

    static void Main(string[] args)
    {

        FancyDataStructure data = new FancyDataStructure();

        data.Add("Elizabeth", "Liz");
        data.Add("Liz", "Betty");

        string[] alternates = data["Betty"];
        foreach (var item in alternates)
        {
            Console.WriteLine(item);
        }
    }
站稳脚跟 2024-08-28 01:02:37

I use this:

It has a generic Set<a> type and implements all of the lovely iterators, .Contains, .Count, etc.

唯憾梦倾城 2024-08-28 01:02:37

Try using a dictionary, something like:

Dictionary<string, List<string>>

That is, a dictionary with string keys and List<string> values.
