Objective-C: a problem with blocks and NSEnumerationConcurrent

Posted on 2024-11-04 22:14:37

I have a dictionary containing a second dictionary with 1000 entries. The entries are all NSStrings of the form key = key XXX and value = element XXX, where XXX is a number between 0 and the number of elements minus 1. (Several days ago, I asked about Objective-C dictionaries containing a dictionary. Please refer to that question if you want the code that creates the dictionary.)

The total length of all the strings in the sub-dictionary is 28,670 characters, i.e.:

strlen("key 0")+strlen("element 0")+
//and so on up through 
strlen("key 999")+strlen("element 999") == 28670.

Treat this total as a very simple checksum that indicates whether a method has enumerated every key/value pair once and only once.

I have one subroutine that works perfectly (using blocks) to access the individual dictionary keys and values:

NSUInteger KVC_access3(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

    [subDict 
        enumerateKeysAndObjectsUsingBlock:
            ^(id key, id object, BOOL *stop) {
                ll+=[object length];
                ll+=[key length];
    }];
    return ll;
}
// will correctly return the expected length...

If I try the same using concurrent blocks (on a multi processor machine), I get a number close to but not exactly the expected 28670:

NSUInteger KVC_access4(NSMutableDictionary *dict){
    __block NSUInteger ll=0;
    NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

    [subDict 
        enumerateKeysAndObjectsWithOptions:
            NSEnumerationConcurrent
        usingBlock:
            ^(id key, id object, BOOL *stop) {
                ll+=[object length];
                ll+=[key length]; 
    }];
    return ll;
}
// will return correct value sometimes; a shortfall value most of the time...

The Apple docs for NSEnumerationConcurrent state:

 "the code of the Block must be safe against concurrent invocation."

I think that is probably the issue, but what is the issue with my code or the block in KVC_access4 that is NOT safe for concurrent invocation?

Edit & Conclusion

Thanks to BJ Homer's excellent solution, I got NSEnumerationConcurrent working. I timed both methods extensively. The code I have above in KVC_access3 is faster and simpler for small and medium-sized dictionaries. It is much faster across lots of dictionaries. However, if you have one really big dictionary (millions or tens of millions of key/value pairs), then this code:

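[subDict 
    enumerateKeysAndObjectsWithOptions:
        NSEnumerationConcurrent
    usingBlock:
        ^(id key, id object, BOOL *stop) {
            // Sum the two lengths locally, then publish them with one atomic add.
            // ll must be declared as a 64-bit __block variable (see the answer).
            NSUInteger workingLength = [object length];
            workingLength += [key length];

            OSAtomicAdd64Barrier(workingLength, &ll);
    }];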

is up to 4x faster. The crossover point is roughly a single dictionary holding 100,000 of my test elements. With more dictionaries the crossover point is higher, presumably because of set-up time.

往事风中埋 2024-11-11 22:14:37

With concurrent enumeration, you'll have the block being run simultaneously on multiple threads. This means that multiple threads are accessing ll at the same time. Since you have no synchronization, you're prone to race conditions.

This is a problem because the += operation is not an atomic operation. Remember, ll += x is the same thing as ll = ll + x. This involves reading ll, adding x to that value, and then storing the new value back in ll. Any change another thread makes to ll after Thread X has read it, but before Thread X stores its result, is lost: Thread X overwrites it with a value computed from the stale read.
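To make the lost update concrete, here is a minimal single-threaded sketch in plain C (which also compiles as Objective-C) that replays one unlucky interleaving by hand. The function names interleaved_total and serialized_total are made up for this illustration, and the "threads" are only simulated by the order of the statements, so the result is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Replays `ll += x_a` on "Thread A" and `ll += x_b` on "Thread B",
 * with both reads happening before either store. */
uint64_t interleaved_total(uint64_t x_a, uint64_t x_b) {
    uint64_t ll = 0;

    uint64_t seen_by_a = ll;      /* Thread A reads ll (0)               */
    uint64_t seen_by_b = ll;      /* Thread B reads ll (0) as well       */
    assert(seen_by_a == seen_by_b); /* both work from the same stale value */

    ll = seen_by_a + x_a;         /* Thread A stores 0 + x_a             */
    ll = seen_by_b + x_b;         /* Thread B overwrites it with 0 + x_b */
    return ll;                    /* x_a has been lost                   */
}

/* The same two additions with no overlap between read and store. */
uint64_t serialized_total(uint64_t x_a, uint64_t x_b) {
    uint64_t ll = 0;
    ll += x_a;
    ll += x_b;
    return ll;
}
```

With x_a = 14 (the lengths of "key 0" and "element 0") and x_b = 16 ("key 10" and "element 10"), interleaved_total returns 16 while serialized_total returns 30. That is the same kind of shortfall KVC_access4 shows, just forced to happen every time.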

You need to add synchronization such that multiple threads can't be modifying the value at the same time. The naive solution is this:

__block NSUInteger ll=0;
NSMutableDictionary *subDict=[dict objectForKey:@"dict_key"];

[subDict 
    enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
    usingBlock:
        ^(id key, id object, BOOL *stop) {
            @synchronized(subDict) { // <-- Only one thread can be in this block at a time.
                ll+=[object length];
                ll+=[key length];
            }
}];
return ll;

However, this discards all the benefits you get from concurrent enumeration, since the entire body of the block is now enclosed in a synchronized block—in effect, only one instance of this block would be actually running at a time.

If concurrency is actually a significant performance requirement here, I'd suggest the following:

__block int64_t ll = 0; // Note the change in type here; OSAtomicAdd64Barrier operates on a 64-bit integer (int64_t).

^(id key, id object, BOOL *stop) {
    NSUInteger workingLength = [object length];
    workingLength += [key length];

    OSAtomicAdd64Barrier(workingLength, &ll); 
}

Note that I'm using OSAtomicAdd64Barrier, which is a fairly low-level function that is guaranteed to increment a value atomically. You could also use @synchronized to control the access, but if this operation is actually a significant performance bottleneck, then you're probably going to want the most performant option, even at the cost of a bit of clarity. If this feels like overkill, then I suspect enabling concurrent enumeration isn't really going to affect your performance all that much.
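For readers on newer SDKs: as far as I know, Apple's headers mark the OSAtomic family as deprecated (from macOS 10.12) and point to the C11 `<stdatomic.h>` functions instead. The following is a rough sketch of the same idea in plain C, assuming C11 atomics and POSIX threads stand in for the concurrent enumeration. The names add_lengths, concurrent_total, WORKER_COUNT and ITEMS_PER_WORKER are hypothetical, and the fixed length of 14 simply stands in for [key length] + [object length].

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define WORKER_COUNT 4          /* hypothetical number of concurrent workers */
#define ITEMS_PER_WORKER 250000 /* hypothetical share of entries per worker  */

/* Shared running total; it is only ever touched through atomic operations. */
static _Atomic uint64_t total;

/* Each worker plays the role of the enumeration block on one thread. */
static void *add_lengths(void *unused) {
    (void)unused;
    for (int i = 0; i < ITEMS_PER_WORKER; i++) {
        /* Stand-in for [key length] + [object length] of one entry. */
        uint64_t working_length = 5 + 9;

        /* Atomic read-modify-write: no update from another thread is lost. */
        atomic_fetch_add(&total, working_length);
    }
    return NULL;
}

/* Runs all workers to completion and returns the combined total. */
uint64_t concurrent_total(void) {
    pthread_t workers[WORKER_COUNT];
    atomic_store(&total, 0);

    for (int i = 0; i < WORKER_COUNT; i++) {
        int rc = pthread_create(&workers[i], NULL, add_lengths, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < WORKER_COUNT; i++) {
        int rc = pthread_join(workers[i], NULL);
        assert(rc == 0);
    }
    return atomic_load(&total);
}
```

Calling concurrent_total() should always give 4 * 250000 * 14 = 14000000, whereas swapping atomic_fetch_add for a plain += on a non-atomic variable would typically come up short. As in the block above, the per-entry work is summed into a local variable first, so each entry costs only one atomic operation.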
