GHashTable size/resizing
Here is my use case: I want to use GLib's GHashTable with IP addresses as keys and the volume of data sent or received by each IP address as the value. I have already managed to implement the whole thing in user space, using some kernel variables to look at the volume per IP address.
Now the question: suppose I have a LOT of IP addresses (500,000 up to 1,000,000 unique ones). It is really not clear how much space is allocated, and what initial size is given, to a new hash table created with g_hash_table_new() or g_hash_table_new_full(), or how the whole thing works in the background. Resizing a hash table is known to be potentially time-consuming. So how can we play with these parameters?
Comments (1)
Neither g_hash_table_new() nor g_hash_table_new_full() lets you specify the size. The size of a hash table is only available as the number of values stored in it (via g_hash_table_size()); you don't have access to the size of the actual array that the implementation typically uses.
However, the existence of g_spaced_primes_closest() kind of hints that GLib's hash table uses a prime-sized internal array. I would say that although a million keys is quite a lot, it's not extraordinary. Try it, and then measure the performance to determine if it's worth digging deeper.