对不同字符串字典的基准测试显示普通字典更快,我不知道为什么

发布于 2024-10-07 18:46:01 字数 5592 浏览 8 评论 0原文

我正在对几个字符串字典进行一些基准测试,因为我想了解如何执行不同的配置,并且令我惊讶的是普通的 Dictionary 更快。可能是因为我缺少一些概念或者我做错了什么,但这些是我得到的结果:

Collection:             2147482 items.
Random Keys:            1000 keys.

Normal dictionary
        Add 1000 items:         573ms.
        Get 1000 keys:          0ms.

Normal Dictionary with OrdinalIgnoreCase comparer
        Add 1000 items:         642ms.
        Get 1000 keys:          0ms.

Normal Dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items:         1661ms.
        Get 1000 keys:          0ms.

Sorted dictionary
        Add 1000 items:         11996ms.
        Get 1000 keys:          5ms.

Sorted dictionary with OrdinalIgnoreCase comparer
        Add 1000 items:         11097ms.
        Get 1000 keys:          5ms.

Sorted dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items:         9814ms.
        Get 1000 keys:          5ms.

这是我用来测试它的代码:

    static void Main(string[] args)
    {
        String[] col = GenerateCollectionUnsorted(Int32.MaxValue / 1000).ToArray();
        Int32 len = col.Length;
        Console.WriteLine("Collection:\t\t" + len.ToString() + " items.");

        String[] randomKeys = GetRandomKeys(col, 1000);
        Console.WriteLine("Random Keys:\t\t" + randomKeys.Length.ToString() + " keys.");

        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal dictionary");
        TestDic(col, randomKeys, new Dictionary<String, String>());
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal Dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.OrdinalIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal Dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Sorted dictionary");
        TestDic(col,randomKeys, new SortedDictionary<String,String>());
        GC.Collect(); 
        GC.WaitForPendingFinalizers();
        
        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.OrdinalIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("\nEnd");
        Console.ReadKey(true);
    }

    static void TestDic(String[] col, String[] randomKeys, IDictionary<String,String> dic)
    {
        Stopwatch crono = new Stopwatch();
        
        crono.Start();
        foreach (String s in col)
            dic.Add(s, s);
        crono.Stop();

        Console.WriteLine("\tAdd "+randomKeys.Length.ToString()+" items:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");

        crono.Reset();
        crono.Start();
        String sv;
        foreach (var rk in randomKeys)
        {
            sv = dic[rk];
            sv.ToString();
        }
        crono.Stop();
        Console.WriteLine("\tGet " + randomKeys.Length.ToString() + " keys:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");
    }

    static String[] GetRandomKeys(String[] col, Int32 count)
    {
        Random ran = new Random(DateTime.Now.Millisecond);

        List<Int32> indexSelection = new List<int>();
        List<String> selection = new List<string>();
        
        Int32 len = col.Length;
        
        while (indexSelection.Count < count)
        {
            Int32 value = ran.Next(0, len - 1);
            if (!indexSelection.Contains(value))
            {
                indexSelection.Add(value);
                selection.Add(col[value]);
            }
        }
        return selection.ToArray();
    }

    static IEnumerable<String> GenerateCollection(Int32 count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
        }
    }

    static IEnumerable<String> GenerateCollectionUnsorted(Int32 amount)
    {
        for (int i = 0; i < amount / 2; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
            yield return (amount-i).ToString("X").PadLeft(5, '0');
        }
    }

知道为什么会发生这种情况吗?

编辑1:据我了解,排序字典添加项目的速度会更慢,而获取它们的速度会更快,因为集合是排序的。而且,使用 OrdinalIgnoreCase 或 InvariantCultureIgnoreCase 比较字符串应该比正常比较更快,因此查找应该更快。但也许我的理解是完全错误的:D

提前致谢。

编辑2:出于好奇,我对字符串比较做了一些测试:

Collection:             2147482 items.
Random Keys:            1000 keys.

CurrentCulture
        LookUp:         158209ms.

CurrentCultureIgnoreCase
        LookUp:         160710ms.

InvariantCulture
        LookUp:         132765ms.

InvariantCultureIgnoreCase
        LookUp:         133409ms.

Ordinal
        LookUp:         36115ms.

OrdinalIgnoreCase
        LookUp:         36329ms.

I'm doing some benchmark on several string dictionaries, because I wanted to find out how performs the different configurations, and I got surprised that the normal Dictionary<String,String> is the faster. Probably because I'm missing some concept or I'm doing something wrong, but these are the results I got:

Collection:             2147482 items.
Random Keys:            1000 keys.

Normal dictionary
        Add 1000 items:         573ms.
        Get 1000 keys:          0ms.

Normal Dictionary with OrdinalIgnoreCase comparer
        Add 1000 items:         642ms.
        Get 1000 keys:          0ms.

Normal Dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items:         1661ms.
        Get 1000 keys:          0ms.

Sorted dictionary
        Add 1000 items:         11996ms.
        Get 1000 keys:          5ms.

Sorted dictionary with OrdinalIgnoreCase comparer
        Add 1000 items:         11097ms.
        Get 1000 keys:          5ms.

Sorted dictionary with InvariantCultureIgnoreCase comparer
        Add 1000 items:         9814ms.
        Get 1000 keys:          5ms.

An this is the code I'm using to test it:

    static void Main(string[] args)
    {
        String[] col = GenerateCollectionUnsorted(Int32.MaxValue / 1000).ToArray();
        Int32 len = col.Length;
        Console.WriteLine("Collection:\t\t" + len.ToString() + " items.");

        String[] randomKeys = GetRandomKeys(col, 1000);
        Console.WriteLine("Random Keys:\t\t" + randomKeys.Length.ToString() + " keys.");

        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal dictionary");
        TestDic(col, randomKeys, new Dictionary<String, String>());
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal Dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.OrdinalIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Normal Dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Sorted dictionary");
        TestDic(col,randomKeys, new SortedDictionary<String,String>());
        GC.Collect(); 
        GC.WaitForPendingFinalizers();
        
        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with OrdinalIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.OrdinalIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine();
        Console.WriteLine("Sorted dictionary with InvariantCultureIgnoreCase comparer");
        TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("\nEnd");
        Console.ReadKey(true);
    }

    static void TestDic(String[] col, String[] randomKeys, IDictionary<String,String> dic)
    {
        Stopwatch crono = new Stopwatch();
        
        crono.Start();
        foreach (String s in col)
            dic.Add(s, s);
        crono.Stop();

        Console.WriteLine("\tAdd "+randomKeys.Length.ToString()+" items:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");

        crono.Reset();
        crono.Start();
        String sv;
        foreach (var rk in randomKeys)
        {
            sv = dic[rk];
            sv.ToString();
        }
        crono.Stop();
        Console.WriteLine("\tGet " + randomKeys.Length.ToString() + " keys:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");
    }

    static String[] GetRandomKeys(String[] col, Int32 count)
    {
        Random ran = new Random(DateTime.Now.Millisecond);

        List<Int32> indexSelection = new List<int>();
        List<String> selection = new List<string>();
        
        Int32 len = col.Length;
        
        while (indexSelection.Count < count)
        {
            Int32 value = ran.Next(0, len - 1);
            if (!indexSelection.Contains(value))
            {
                indexSelection.Add(value);
                selection.Add(col[value]);
            }
        }
        return selection.ToArray();
    }

    static IEnumerable<String> GenerateCollection(Int32 count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
        }
    }

    static IEnumerable<String> GenerateCollectionUnsorted(Int32 amount)
    {
        for (int i = 0; i < amount / 2; i++)
        {
            yield return i.ToString("X").PadLeft(5, '0');
            yield return (amount-i).ToString("X").PadLeft(5, '0');
        }
    }

Any idea why is this happening?

EDIT 1: It was my understanding that sorted dictionary would be slower adding items, and faster getting them because the collection is sorted. And also that comparing strings with a OrdinalIgnoreCase or InvariantCultureIgnoreCase should be faster that a normal comparision, so the seek should be faster. But maybe my understanding is completely wrong :D

Thanks in advance.

EDIT 2: As curiosity, I've made some test on String comparisons:

Collection:             2147482 items.
Random Keys:            1000 keys.

CurrentCulture
        LookUp:         158209ms.

CurrentCultureIgnoreCase
        LookUp:         160710ms.

InvariantCulture
        LookUp:         132765ms.

InvariantCultureIgnoreCase
        LookUp:         133409ms.

Ordinal
        LookUp:         36115ms.

OrdinalIgnoreCase
        LookUp:         36329ms.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

如梦 2024-10-14 18:46:01

因为只有带有默认比较器的普通字典才是真正的普通字典,其他的都是排序字典。

所以,结果是非常一致的。

编辑:

修复后,结果会有所不同。 :)

排序字典在添加项目时会比较慢,因为它们需要排序,但在获取项目时却不会更快。搜索二叉树很快,但是搜索常规字典使用的哈希列表更快。当二叉树增长时,将需要更多步骤来定位每个项目,而字典主要通过添加更多存储桶来增长,因此比较次数几乎不受影响。

使用 Ordinal 比较字符串比常规 (CurrentCulture) 比较更快,并且 OrdinalIgnoreCaseCurrentCultureIgnoreCase 更快,但它不确定 OrdinalIgnoreCaseCurrentCulture 更快。 InvariantCulture 比较并不比常规比较快,它只是使用不同的区域性。序数比较速度快得多的原因是它根本不必考虑文化设置。

顺便说一下,我注意到 GetRandomKeys 方法中有一个错误。它永远不会选择最后一个项目,因为您将获得 0 到 Length - 2 之间的随机数。

Because only the normal dictionary with the default comparer is actually a normal dictionary, all the others are sorted dictionaries.

So, the result is very consistent.

Edit:

With that fixed, the result is different. :)

A sorted dictionary is slower when adding items as they need to be sorted, but it's not faster when getting items. Searching the binary tree is fast, but searching the hash list that a regular dictionary uses is faster. When the binary tree grows, there will be more steps to locate each item, while a dictionary mostly grows by adding more buckets so the number of compares is barely affected.

Comparing strings with Ordinal is faster than regular (CurrentCulture) compare, and OrdinalIgnoreCase is faster than CurrentCultureIgnoreCase, but it's not certain that OrdinalIgnoreCase is faster than CurrentCulture. InvariantCulture compare is not faster than a regular compare, it just uses a different culture. The reason that ordinal compare is a lot faster is that it doesn't have to bother with cultural settings at all.

By the way, I noticed an error in the GetRandomKeys method. It will never pick the last item, as you are getting a random number between 0 and Length - 2.

绝不放开 2024-10-14 18:46:01

来自 关于 SortedDictionary 的 MSDN 文档:

SortedDictionary 泛型类是一个具有 O(log n) 检索时间的二叉搜索树,其中 n 是字典中的元素数量。

来自 Dictionary 上的文档:

使用键检索值非常快,接近 O(1),因为 Dictionary 类是作为哈希表实现的。

因此,在许多情况下 Dictionary 的性能优于 SortedDictionary 也就不足为奇了。

From the MSDN documentation on SortedDictionary:

The SortedDictionary<TKey, TValue> generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary.

From the docs on Dictionary:

Retrieving a value by using its key is very fast, close to O(1), because the Dictionary<TKey, TValue> class is implemented as a hash table.

Therefore it should not be surprising that Dictionary can outperform SortedDictionary in many cases.

玩世 2024-10-14 18:46:01

我不明白为什么你会对结果感到惊讶。

“正常”/未排序显然比排序字典更快,因为排序字典必须执行额外的操作。

具有额外选项的两个“正常”选项也都需要额外的处理。

I'm failing to see why you would be surprised at the results.

"normal"/unsorted would obviously be faster than a sorted dictionary as the sorted one would have to perform extra operations.

The two "normal" ones that have extra options both require extra processing as well.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文