Locking on an interned string?
Update: It is acceptable if this method is not thread safe, but I'm interested in learning how I would make it thread safe. Also, I do not want to lock on a single object for all values of key if I can avoid it.
Original Question: Suppose I want to write a higher-order function that takes a key and a function, and checks whether an object has been cached with the given key. If it has, the cached value is returned. Otherwise, the given function is run and the result is cached and returned.
Here's a simplified version of my code:
public static T CheckCache<T>(string key, Func<T> fn, DateTime expires)
{
    object cache = HttpContext.Current.Cache.Get(key);
    //clearly not thread safe, two threads could both evaluate the below condition as true
    //what can I lock on since the value of "key" may not be known at compile time?
    if (cache == null)
    {
        T result = fn();
        HttpContext.Current.Cache.Insert(key, result, null, expires, Cache.NoSlidingExpiration);
        return result;
    }
    else
        return (T)cache;
}
Also, suppose I do not know all possible values of key at compile time.
How can I make this thread safe? I know I need to introduce locking here, to prevent more than one thread from evaluating my condition as true, but I don't know what to lock on. Many of the examples I've read about locking (such as Jon Skeet's article) recommend using a "dummy" private variable that's used only for locking. This isn't possible in this case, because keys are unknown at compile time. I know I could trivially make this thread safe by having the same lock be used for every key, but that could be wasteful.
Now, my main question is:
Is it possible to lock on key? Will string interning help here?
After reading ".NET 2.0 string interning inside out", I understand that I can explicitly call String.Intern() to obtain a 1-to-1 mapping from the value of a string to an instance of a string. Is this suitable to lock on? Let's change the above code to:
public static T CheckCache<T>(string key, Func<T> fn, DateTime expires)
{
    //check for the scenario where two strings with the same value are stored at different memory locations
    key = String.Intern(key);
    lock (key) //is this object suitable for locking?
    {
        object cache = HttpContext.Current.Cache.Get(key);
        if (cache == null)
        {
            T result = fn();
            HttpContext.Current.Cache.Insert(key, result, null, expires, Cache.NoSlidingExpiration);
            return result;
        }
        else
            return (T)cache;
    }
}
Is the above implementation thread safe?
Comments (6)
Problems with @wsanville's own solution, partly mentioned before:
1. [...] String.Intern locking pattern). Note that this includes locks on the same interned string even if they are in different AppDomains, potentially leading to cross-AppDomain deadlocks.
2. [...]
3. String.Intern() is slow.
To address all 3 of these issues, you could implement your own Intern() that you tie to your specific locking purpose, i.e. do not use it as a global, general-purpose string interner:
I called this method ...Safe(), because when interning I will not store the passed-in String instance, as that might e.g. be an already interned String, making it subject to the problems mentioned in 1. above.
To compare the performance of various ways of interning strings, I also tried the following 2 methods, as well as String.Intern.
Benchmark
100 threads, each randomly selecting one of 5000 different strings (each containing 8 digits) 50000 times and then calling the respective intern method. All values were taken after warming up sufficiently. This is Windows 7, 64-bit, on a 4-core i5.
N.B. Warming up the above setup implies that after warming up, there won't be any writes to the respective interning dictionaries, but only reads. It's what I was interested in for the use case at hand, but different write/read ratios will probably affect the results.
Results
String.Intern(): 2032 ms
InternLocked(): 1245 ms
InternConcurrent(): 458 ms
InternConcurrentSafe(): 453 ms
The fact that InternConcurrentSafe is as fast as InternConcurrent makes sense in light of the fact that these figures are after warming up (see above N.B.), so there are in fact no or only a few invocations of String.Copy during the test.
In order to properly encapsulate this, create a class like this:
and after instantiating one StringLocker for every use case you might have, it is as easy as calling
N.B. Thinking again, there's no need to return an object of type string if all you want to do is lock on it, so copying the characters is totally unnecessary, and the following would perform better than above class.
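A minimal sketch of the idea in that last note, not the answer's own code: it assumes a ConcurrentDictionary<string, object> that maps each string value to a plain lock object, so nothing is interned or copied. The method name GetLockObject is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

// Sketch only: create one StringLocker per use case so that unrelated
// code never ends up sharing lock objects.
public sealed class StringLocker
{
    // Maps a string value (ordinal comparison) to its dedicated lock object.
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    // Returns the same object for equal strings, even if they are
    // different string instances. GetLockObject is a hypothetical name.
    public object GetLockObject(string key)
    {
        // The factory may run more than once under contention, but every
        // caller receives the single object that ends up in the dictionary.
        return _locks.GetOrAdd(key, _ => new object());
    }
}
```

A caller would then write `lock (locker.GetLockObject(key)) { ... }`. Note that in this sketch the dictionary only grows; entries are never removed.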
A variant of Daniel's answer...
Rather than creating a new lock object for every single string you could share a small-ish set of locks, choosing which lock to use depending on the string's hashcode. This will mean less GC pressure if you potentially have thousands, or millions, of keys, and should allow enough granularity to avoid any serious blocking (perhaps after a few tweaks, if necessary).
This will allow you to tweak the locking granularity to suit your particular needs just by changing the number of elements in the _stripes array. (However, if you need close to one-lock-per-string granularity then you're better off going with Daniel's answer.)
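The striping described above might be sketched as follows. This is an illustration under assumptions rather than the answer's original code: the stripe count of 32 and the names StripedLocks, CreateStripes and GetStripe are hypothetical, and only the _stripes array name comes from the text.

```csharp
using System;

public static class StripedLocks
{
    // Changing the array length adjusts the locking granularity.
    private static readonly object[] _stripes = CreateStripes(32);

    private static object[] CreateStripes(int count)
    {
        var stripes = new object[count];
        for (int i = 0; i < count; i++)
            stripes[i] = new object();
        return stripes;
    }

    // Equal strings always map to the same stripe; different strings may
    // share a stripe, which only costs some extra contention.
    public static object GetStripe(string key)
    {
        // Mask off the sign bit so the index is never negative.
        int index = (key.GetHashCode() & 0x7FFFFFFF) % _stripes.Length;
        return _stripes[index];
    }
}
```

Usage would be `lock (StripedLocks.GetStripe(key)) { ... }`. The number of lock objects stays fixed no matter how many distinct keys are seen.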
Never lock on strings. In particular on those that are interned. See this blog entry on the danger of locking on interned strings.
Just create a new object and lock on that:
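A minimal sketch of that advice, assuming a plain Dictionary stands in for HttpContext.Current.Cache so the example is self-contained. The names SingleLockCache, Sync and GetOrCreate are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SingleLockCache
{
    // Dedicated lock object: private, never exposed, used for nothing else.
    private static readonly object Sync = new object();

    // Stand-in for the real cache (an assumption for this sketch).
    private static readonly Dictionary<string, object> Store =
        new Dictionary<string, object>();

    public static T GetOrCreate<T>(string key, Func<T> fn)
    {
        lock (Sync)
        {
            object cached;
            if (Store.TryGetValue(key, out cached))
                return (T)cached;

            // Only one thread at a time can reach this point,
            // so fn runs at most once per key.
            T result = fn();
            Store[key] = result;
            return result;
        }
    }
}
```

The trade-off is the one the question mentions: a single lock serializes fn for all keys, not just for equal ones.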
According to the documentation, the Cache type is thread safe. So the downside for not synchronizing yourself is that when the item is being created, it may be created a few times before the other threads realize they don't need to create it.
If the situation is simply to cache common static / read-only things, then don't bother synchronizing just to save the odd few collisions that might occur. (Assuming the collisions are benign.)
The locking object won't be specific to the strings, it will be specific to the granularity of the lock you require. In this case, you are trying to lock access to the cache, so one object would service locking the cache. The idea of locking on the specific key that comes in isn't the concept locking is usually concerned with.
If you want to stop expensive calls from occurring multiple times, then you can rip the loading logic out into a new class LoadMillionsOfRecords, call .Load and lock once on an internal locking object as per Oded's answer.
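One way such a class might look, as a rough sketch: the constructor taking a loader delegate and the IList<string> record type are assumptions made for illustration. Only the LoadMillionsOfRecords and Load names come from the text above.

```csharp
using System;
using System.Collections.Generic;

public sealed class LoadMillionsOfRecords
{
    // Internal lock object, invisible to callers.
    private readonly object _sync = new object();

    // Hypothetical: the expensive operation is passed in as a delegate.
    private readonly Func<IList<string>> _loader;

    private IList<string> _records;

    public LoadMillionsOfRecords(Func<IList<string>> loader)
    {
        _loader = loader;
    }

    public IList<string> Load()
    {
        lock (_sync)
        {
            // The expensive load runs once; later callers get the same result.
            if (_records == null)
                _records = _loader();
            return _records;
        }
    }
}
```

Here the lock granularity follows the expensive operation itself, not the cache key that happens to be passed in.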
I would go with the pragmatic approach and use the dummy variable.
If this is not possible for whatever reason, I would use a Dictionary<TKey, TValue> with key as the key and a dummy object as the value and lock on that value, because strings are not suitable for locking:
However, in most cases this is overkill and unnecessary micro optimization.
I added a solution to the Bardock.Utils package, inspired by @eugene-beresovsky's answer.
Usage:
Solution code: