Objective-C: A question about blocks and NSEnumerationConcurrent
I have a dictionary containing a second dictionary with 1000 entries. The entries are all NSStrings of the form key = key XXX and value = element XXX, where XXX is a number from 0 to the number of elements - 1. (Several days ago, I asked about Objective-C dictionaries containing a dictionary. Please refer to that question if you want the code that creates the dictionary.)
The total length of all the strings in the sub dictionary is 28,670 characters, i.e.:
strlen("key 0")+strlen("element 0")+
//and so on up through
strlen("key 999")+strlen("element 999") == 28670.
Treat this as a very simple checksum that indicates whether a method has enumerated every key+value pair once and only once.
I have one subroutine that works perfectly (using blocks) to access the individual dictionary keys and values:
NSUInteger KVC_access3(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];
    [subDict
     enumerateKeysAndObjectsUsingBlock:
     ^(id key, id object, BOOL *stop) {
         ll+=[object length];
         ll+=[key length];
     }];
    return ll;
}
// will correctly return the expected length...
If I try the same using concurrent blocks (on a multi processor machine), I get a number close to but not exactly the expected 28670:
NSUInteger KVC_access4(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];
    [subDict
     enumerateKeysAndObjectsWithOptions:
     NSEnumerationConcurrent
     usingBlock:
     ^(id key, id object, BOOL *stop) {
         ll+=[object length];
         ll+=[key length];
     }];
    return ll;
}
// will return correct value sometimes; a shortfall value most of the time...
The Apple docs for NSEnumerationConcurrent state:
"the code of the Block must be safe against concurrent invocation."
I think that is probably the issue, but what in my code, or in the block in KVC_access4, is NOT safe for concurrent invocation?
Edit & Conclusion
Thanks to BJ Homer's excellent solution, I got NSEnumerationConcurrent working. I timed both methods extensively. The code I have above in KVC_access3 is faster and easier for small and medium sized dictionaries. It is much faster on lots of dictionaries. However, if you have a very large dictionary (millions or tens of millions of key/value pairs), then this code:
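// Assumes ll is a 64-bit __block variable declared before this call
// (for example: __block int64_t ll = 0;) and that <libkern/OSAtomic.h> is imported.
[subDict
 enumerateKeysAndObjectsWithOptions:
 NSEnumerationConcurrent
 usingBlock:
 ^(id key, id object, BOOL *stop) {
     // Accumulate in a local first; only the final add touches shared state.
     NSUInteger workingLength = [object length];
     workingLength += [key length];
     OSAtomicAdd64Barrier(workingLength, &ll);
 }];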
is up to 4x faster. The crossover point is roughly a single dictionary of 100,000 of my test elements. With more dictionaries, the crossover point is higher, presumably because of set-up time.
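For anyone who wants to repeat the comparison, here is a minimal timing sketch. It is not the harness used for the numbers above. The helper names MakeTestDictionary and AverageSeconds are hypothetical, and the key/value format is assumed from the description at the top of the question.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical helper: builds a dictionary shaped like the test data described
// in the question ("key N" -> "element N" for N in 0..count-1).
static NSDictionary *MakeTestDictionary(NSUInteger count) {
    NSMutableDictionary *d = [NSMutableDictionary dictionaryWithCapacity:count];
    for (NSUInteger i = 0; i < count; i++) {
        NSString *key = [NSString stringWithFormat:@"key %lu", (unsigned long)i];
        d[key] = [NSString stringWithFormat:@"element %lu", (unsigned long)i];
    }
    return d;
}

// Hypothetical helper: runs `work` `runs` times (runs must be > 0) and
// returns the average wall-clock seconds per run.
static NSTimeInterval AverageSeconds(NSUInteger runs, void (^work)(void)) {
    CFAbsoluteTime start = CFAbsoluteTimeGetCurrent();
    for (NSUInteger i = 0; i < runs; i++) {
        work();
    }
    return (CFAbsoluteTimeGetCurrent() - start) / (NSTimeInterval)runs;
}
```

To compare the two approaches, build one dictionary per size of interest, then pass AverageSeconds one block that runs the serial enumeration and another that runs the concurrent one. Averaging over several runs should smooth out some of the noise.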
Comments (1)
With concurrent enumeration, you'll have the block being run simultaneously on multiple threads. This means that multiple threads are accessing ll at the same time. Since you have no synchronization, you're prone to race conditions.
This is a problem because the += operation is not an atomic operation. Remember, ll += x is the same thing as ll = ll + x. This involves reading ll, adding x to that value, and then storing the new value back in ll. Between the time that ll is read on Thread X and when it is stored, any changes caused by other threads will be lost when Thread X gets back to storing its calculation.
You need to add synchronization such that multiple threads can't be modifying the value at the same time. The naive solution is this:
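A minimal sketch of that naive approach, written as a free function so it stands alone. The function name and the choice of the dictionary itself as the lock object are assumptions made for illustration; any object shared by all invocations would do.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical name. Sums key and value lengths under concurrent enumeration,
// serializing the entire block body with @synchronized.
static NSUInteger TotalLengthFullySynchronized(NSDictionary *subDict) {
    __block NSUInteger ll = 0;
    [subDict enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
                                     usingBlock:^(id key, id object, BOOL *stop) {
        // Assumption: subDict doubles as the lock object.
        // Only one thread at a time can be inside this section.
        @synchronized (subDict) {
            ll += [object length];
            ll += [key length];
        }
    }];
    return ll;
}
```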
However, this discards all the benefits you get from concurrent enumeration, since the entire body of the block is now enclosed in a synchronized block; in effect, only one instance of this block would actually be running at a time.
If concurrency is actually a significant performance requirement here, I'd suggest the following:
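A minimal sketch of that suggestion, modeled on the snippet in the question's edit. The function name and the int64_t accumulator (chosen to match the parameter type of OSAtomicAdd64Barrier) are assumptions. The function is declared in <libkern/OSAtomic.h>, which Apple has since deprecated in favor of <stdatomic.h>, so the sketch silences the deprecation warning.

```objc
#import <Foundation/Foundation.h>
#import <libkern/OSAtomic.h>
#include <assert.h>

#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wdeprecated-declarations"

// Hypothetical name. Sums key and value lengths under concurrent enumeration,
// using one atomic add per entry instead of a lock.
static NSUInteger TotalLengthAtomic(NSDictionary *subDict) {
    __block int64_t ll = 0;  // 64-bit signed, as OSAtomicAdd64Barrier expects
    [subDict enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
                                     usingBlock:^(id key, id object, BOOL *stop) {
        // workingLength is local to this invocation, so it needs no protection.
        int64_t workingLength = (int64_t)[object length];
        workingLength += (int64_t)[key length];
        // The only access to shared state: a single atomic read-modify-write.
        OSAtomicAdd64Barrier(workingLength, &ll);
    }];
    return (NSUInteger)ll;
}

#pragma clang diagnostic pop
```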
Note that I'm using OSAtomicAdd64Barrier, which is a fairly low-level function that is guaranteed to increment a value atomically. You could also use @synchronized to control the access, but if this operation is actually a significant performance bottleneck, then you're probably going to want the most performant option, even at the cost of a bit of clarity. If this feels like overkill, then I suspect enabling concurrent enumeration isn't really going to affect your performance all that much.
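For comparison, here is a sketch of the @synchronized alternative mentioned above, with the lock held only around the shared addition rather than the whole block body. The function name and the dedicated lock object are hypothetical, and the sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical name. The length lookups run concurrently; only the update
// of the shared total is serialized.
static NSUInteger TotalLengthNarrowLock(NSDictionary *subDict) {
    __block NSUInteger ll = 0;
    NSObject *lock = [[NSObject alloc] init];  // hypothetical dedicated lock object
    [subDict enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
                                     usingBlock:^(id key, id object, BOOL *stop) {
        // Computed outside the lock: touches only per-invocation state.
        NSUInteger workingLength = [object length];
        workingLength += [key length];
        // Held just long enough to update the shared total.
        @synchronized (lock) {
            ll += workingLength;
        }
    }];
    return ll;
}
```

With work this small per entry, the lock is likely still contended on nearly every invocation, so this version mainly trades a little speed for readability.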