.NET ConcurrentDictionary initial capacity is set to an arbitrary prime number, rather than the expected capacity, in the MSDN example documentation. Why?

Posted on 2024-10-02 15:04:03

I was just looking at the MSDN documentation for ConcurrentDictionary, and I saw this in the "example" code:

// We know how many items we want to insert into the ConcurrentDictionary.
// So set the initial capacity to some prime number above that, to ensure that
// the ConcurrentDictionary does not need to be resized while initializing it.
int NUMITEMS = 64;
int initialCapacity = 101;

For reference, the dictionary in the MSDN example is initialized as follows:

ConcurrentDictionary<int, int> cd = new ConcurrentDictionary<int, int>(Environment.ProcessorCount * 2, initialCapacity);
for (int i = 0; i < NUMITEMS; i++) cd[i] = i * i;

In the example, the dictionary is never going to contain more than 64 items. Why not set the initial capacity to 64, rather than to a seemingly arbitrary prime number greater than 64? The comment says that this is to ensure that the dictionary does not need to be resized while initializing it, but why would a similar dictionary with initialCapacity=64 need to be resized? Why was this prime number chosen?

Comments (1)

柏拉图鍀咏恒 2024-10-09 15:04:03

A dictionary or hash table relies on hashing the key to get a small index into its backing store (an array). So the choice of hash function is very important. A typical approach is to take the hash code of the key (so that we get a good random distribution), divide it by a prime number, and use the remainder to index into a fixed number of buckets. This converts arbitrarily large hash codes into a bounded set of small numbers that can serve as array indices. A prime divisor shares no common factor with regular strides in the hash codes, so the keys spread more evenly across the buckets. So it is important for the array size to be a prime number, and the best choice for the size is then a prime number larger than the required capacity. That is exactly what the dictionary implementation does.
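To make the remainder-based indexing concrete, here is a minimal sketch in C# (top-level statements, so it assumes C# 9 or later). The helpers `BucketIndex` and `DistinctBuckets` are hypothetical names made up for this illustration, not the actual .NET internals. The sketch takes 64 integer keys that are all multiples of 8 and counts how many distinct buckets they land in with 64 buckets versus 101 buckets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: map a hash code to a bucket by taking the remainder.
// The mask clears the sign bit so the index is never negative.
int BucketIndex(int hashCode, int bucketCount) =>
    (hashCode & 0x7FFFFFFF) % bucketCount;

// Hypothetical helper: count how many distinct buckets a set of keys uses.
int DistinctBuckets(IEnumerable<int> keys, int bucketCount) =>
    keys.Select(k => BucketIndex(k.GetHashCode(), bucketCount)).Distinct().Count();

// 64 keys with a regular stride: 0, 8, 16, ..., 504.
// For an int, GetHashCode() returns the value itself.
int[] keys = Enumerable.Range(0, 64).Select(i => i * 8).ToArray();

Console.WriteLine(DistinctBuckets(keys, 64));   // 8  (the stride shares a factor with 64)
Console.WriteLine(DistinctBuckets(keys, 101));  // 64 (101 is prime, no shared factor)
```

With 64 buckets the keys pile into only 8 of them, while with 101 buckets every key gets its own bucket. This only illustrates the general argument for prime bucket counts; it does not claim to reproduce how ConcurrentDictionary sizes its table.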

So basically, any dictionary implementation that indexes its buckets modulo N, with N a prime number, needs its capacity to be a prime number. If you say the required capacity is X, such an implementation will typically choose the next prime number larger than X.
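As a rough sketch of "choose a prime at or above the required capacity", the following C# snippet (again top-level statements, C# 9 or later) uses simple trial division. `IsPrime` and `NextPrimeAtLeast` are hypothetical helper names for illustration only; a real implementation may well use a precomputed table of primes instead.

```csharp
using System;

// Hypothetical helper: trial-division primality test, adequate for small table sizes.
bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; (long)d * d <= n; d++)
    {
        if (n % d == 0) return false;
    }
    return true;
}

// Hypothetical helper: smallest prime greater than or equal to the requested capacity.
int NextPrimeAtLeast(int capacity)
{
    int candidate = Math.Max(capacity, 2);
    while (!IsPrime(candidate)) candidate++;
    return candidate;
}

Console.WriteLine(NextPrimeAtLeast(64));   // 67
Console.WriteLine(NextPrimeAtLeast(101));  // 101
```

Note that the MSDN sample uses 101, which is simply "some prime number above" 64, as its comment says, rather than the smallest such prime (67).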
