Is Dictionary.Add() thread-safe when you only insert?
I have code that inserts keys from multiple threads. Do I still need locking around Dictionary.Add()?
I got this exception while adding a new key:
Exception Source: mscorlib
Exception Type: System.IndexOutOfRangeException
Exception Message: Index was outside the bounds of the array.
Exception Target Site: Insert
It happens quite rarely, though. I know that Dictionary is not thread-safe, but I thought that calling only .Add wouldn't cause any problems.
Dictionary is not thread-safe at all, regardless of whether you only add to it or not. It has a few internal structures that need to be kept in sync (especially when the internal hash buckets get resized).
You either have to implement your own locking around every operation on it or, if you're on .NET 4.0, you can use the new ConcurrentDictionary, which is absolutely fantastic and totally thread-safe.
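To make the two options concrete, here is a minimal sketch rather than a definitive implementation. The class name `SafeInsertSketch`, the method names, and the `"item" + i` values are made up for illustration; `lock`, `Dictionary.Add`, `ConcurrentDictionary.TryAdd`, and `Parallel.For` are the standard .NET members.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class SafeInsertSketch
{
    // Option 1: a plain Dictionary where every access goes through one lock.
    public static Dictionary<int, string> FillWithLock(int count)
    {
        var gate = new object();
        var dictionary = new Dictionary<int, string>();
        Parallel.For(0, count, i =>
        {
            lock (gate)
            {
                // Only one thread at a time can be inside Add, so a resize
                // of the internal buckets is never observed half-done.
                dictionary.Add(i, "item" + i);
            }
        });
        return dictionary;
    }

    // Option 2: ConcurrentDictionary synchronizes internally.
    public static ConcurrentDictionary<int, string> FillConcurrent(int count)
    {
        var dictionary = new ConcurrentDictionary<int, string>();
        Parallel.For(0, count, i =>
        {
            // TryAdd returns false instead of throwing when the key already exists.
            dictionary.TryAdd(i, "item" + i);
        });
        return dictionary;
    }
}
```

With the lock approach, reads of the dictionary from other threads need to take the same lock too; locking only the writers is not enough.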
Another option (update)
That said - there is another technique you can use - but it'll require a bit of tweaking depending upon the kind of data you're inserting into your dictionary, and whether all your keys are guaranteed unique:
Give each thread its own private dictionary that it inserts into.
When each thread finishes, collate all the dictionaries together and merge them into a bigger one; how you handle duplicate keys is up to you. For example, if you're caching lists of items by a key, then you can simply merge each same-keyed list into one and put it in the master dictionary.
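The two steps above might look roughly like this. It is only a sketch under assumptions: the class name `PrivateDictionaryMergeSketch`, the `"key" + (i % 10)` keys, and the choice to cache lists of integers are all invented for the example, and duplicate keys are handled by concatenating the lists, as suggested above.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class PrivateDictionaryMergeSketch
{
    public static Dictionary<string, List<int>> Run(int workerCount, int itemsPerWorker)
    {
        var tasks = new Task<Dictionary<string, List<int>>>[workerCount];
        for (int w = 0; w < workerCount; w++)
        {
            int worker = w; // copy the loop variable for the closure
            tasks[w] = Task.Run(() =>
            {
                // Private to this worker, so no locking is needed while inserting.
                var local = new Dictionary<string, List<int>>();
                for (int i = 0; i < itemsPerWorker; i++)
                {
                    string key = "key" + (i % 10); // keys deliberately overlap across workers
                    if (!local.TryGetValue(key, out var list))
                    {
                        list = new List<int>();
                        local.Add(key, list);
                    }
                    list.Add(worker * itemsPerWorker + i);
                }
                return local;
            });
        }
        Task.WaitAll(tasks);

        // Merge phase: runs on a single thread after all workers have finished.
        var master = new Dictionary<string, List<int>>();
        foreach (var task in tasks)
        {
            foreach (var pair in task.Result)
            {
                if (master.TryGetValue(pair.Key, out var existing))
                    existing.AddRange(pair.Value); // same key: merge the lists
                else
                    master.Add(pair.Key, pair.Value);
            }
        }
        return master;
    }
}
```

The merge loop is the part that decides what a duplicate key means; for data where keys are guaranteed unique it collapses to a plain `Add`.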
Official answer re: performance (after you accepted)
So as your comments say, you need an idea of the best method (lock or merge) for performance etc. I can't tell you what this will be; ultimately it will need to be benchmarked. I'll see if I can offer some guidance, though :)
Firstly - if you have any idea how many items your dictionary (or dictionaries) will ultimately need, use the constructor that takes an (int) capacity to minimize resizing.
The merge operation is likely to be best, since none of the threads will be interfering with each other. The exception is when the process involved when two objects share the same key is particularly lengthy; in that case, forcing it all to happen on a single thread at the end of the operation might end up cancelling out all the performance gained by parallelizing the first stage!
Equally, there are potential memory concerns, since you will effectively be cloning the dictionary; if the final result is large enough you could end up consuming a lot of resources. Granted, they will be released afterwards.
If a decision needs to be made at the thread level when a key is already present, then you will need a lock(){} construct.
Over a dictionary, this typically takes the following shape:
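A minimal sketch of that shape, under stated assumptions: the wrapper class `LockedMergeDictionary` and its member names are hypothetical, and the `mergeOperation` callback stands in for whatever MergeOperation applies to your data. The essential part is that `TryGetValue`, the merge, and the insert all happen inside the same lock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, for illustration only.
public class LockedMergeDictionary<TKey, TValue>
{
    private readonly object _gate = new object();
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();

    // Assumed callback: combines the existing value with the incoming one.
    private readonly Func<TValue, TValue, TValue> _mergeOperation;

    public LockedMergeDictionary(Func<TValue, TValue, TValue> mergeOperation)
    {
        _mergeOperation = mergeOperation;
    }

    public void AddOrMerge(TKey key, TValue value)
    {
        lock (_gate)
        {
            // Check and update under one lock, so no other thread can
            // insert the same key between the lookup and the write.
            if (_items.TryGetValue(key, out TValue existing))
                _items[key] = _mergeOperation(existing, value);
            else
                _items.Add(key, value);
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (_gate) // readers take the same lock as writers
        {
            return _items.TryGetValue(key, out value);
        }
    }
}
```

For example, `new LockedMergeDictionary<string, int>((a, b) => a + b)` would sum the values inserted under the same key.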
Now if that MergeOperation is really slow, you might consider releasing the lock, creating a cloned object that represents the merge of the existing and the new object, then re-acquiring the lock. However, you need a reliable way of checking that the state of the existing object hasn't changed between the first lock and the second (a version number is useful for this).
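The release-and-reacquire idea with a version number could be sketched roughly as follows. Everything here is an assumption made for illustration: the class `VersionedMergeDictionary`, the nested `Entry` type, and the `slowMerge` callback are hypothetical, the callback must build a new value without modifying the existing one, and entries are never removed. When the version has changed in the meantime, the sketch simply retries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, for illustration only.
public class VersionedMergeDictionary<TKey, TValue>
{
    // Immutable pairing of a value with the version it was stored at.
    private sealed class Entry
    {
        public readonly TValue Value;
        public readonly long Version;
        public Entry(TValue value, long version) { Value = value; Version = version; }
    }

    private readonly object _gate = new object();
    private readonly Dictionary<TKey, Entry> _items = new Dictionary<TKey, Entry>();
    private readonly Func<TValue, TValue, TValue> _slowMerge; // assumed to be side-effect free

    public VersionedMergeDictionary(Func<TValue, TValue, TValue> slowMerge)
    {
        _slowMerge = slowMerge;
    }

    public void AddOrMerge(TKey key, TValue incoming)
    {
        while (true)
        {
            Entry snapshot;
            lock (_gate) // first lock: add, or take a snapshot of what is there
            {
                if (!_items.TryGetValue(key, out snapshot))
                {
                    _items.Add(key, new Entry(incoming, 0));
                    return;
                }
            }

            // The slow work runs outside the lock, on the snapshot.
            TValue merged = _slowMerge(snapshot.Value, incoming);

            lock (_gate) // second lock: publish only if nobody changed the entry
            {
                Entry current;
                if (_items.TryGetValue(key, out current) && current.Version == snapshot.Version)
                {
                    _items[key] = new Entry(merged, current.Version + 1);
                    return;
                }
            }
            // Version mismatch: another thread won the race, so merge again.
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (_gate)
        {
            Entry entry;
            bool found = _items.TryGetValue(key, out entry);
            value = found ? entry.Value : default(TValue);
            return found;
        }
    }
}
```

Under heavy contention on a single key the retries can cost more than they save, so this is worth benchmarking against the plain lock before adopting it.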
Yup, this is the exception you can get when you insert an element just as the dictionary is busy increasing the number of buckets. That resize is triggered when another thread adds an item and the load factor gets too high. Dictionary is especially sensitive to this because the reorganization takes a while. That's a good thing: it makes your code crash quickly instead of only once a week.
Review every line of code that's used in a thread and check where a shared object is used. You haven't yet found the once-a-week crashes. Or worse, the ones that don't crash but just generate bad data once in a while.