Questions about synchronization in Java: when, how, and to what extent
I am working on my first multithreaded program and have gotten stuck on a couple of aspects of synchronization. I have gone over the multithreading tutorial on the Oracle/Sun homepage, as well as a number of questions here on SO, so I believe I have an idea of what synchronization is. However, there are a couple of aspects I am not sure how to figure out. I have formulated them below as clear-cut questions:
Question 1: I have a singleton class that holds methods for checking valid identifiers. It turns out this class needs to hold two collections to keep track of associations between two different identifier types. (If the word identifier sounds complicated: these are just strings.) I chose to use two MultiValueMap instances to implement this many-to-many relationship. I am not sure whether these collections have to be thread-safe, as they will be updated only when the instance of the singleton class is created, but I noticed that the documentation says:
Note that MultiValueMap is not synchronized and is not thread-safe. If you wish to use this map from multiple threads concurrently, you must use appropriate synchronization. This class may throw exceptions when accessed by concurrent threads without synchronization.
Could anyone elaborate on this "appropriate synchronization"? What exactly does it mean? I can't really use MultiValueMap.decorate() on a synchronized HashMap, or have I misunderstood something?
Question 2: I have another class that extends HashMap to hold my experimental values, which are parsed in when the software starts. This class is meant to provide appropriate methods for my analysis, such as permutation(), randomization(), filtering(criteria), etc. Since I want to protect my data as much as possible, the class is created and updated once, and all of the above methods return new collections. Again, I am not sure whether this class needs to be thread-safe: it is not supposed to be updated from multiple threads, but the methods will most certainly be called from a number of threads, and to be "safe" I have added the synchronized modifier to all my methods. Can you foresee any problems with that? What kind of potential problems should I be aware of?
Thanks,
Answer 1: Your singleton class should not expose the collections it uses internally to other objects. Instead, it should provide methods that expose the behaviours you want. For example, if your object has a Map in it, don't provide a public or protected method that returns that Map. Instead, have a method that takes a key and returns the corresponding value in the Map (and optionally one that sets the value for a key). These methods can then be made thread-safe if required.
NB: even for collections that you do not intend to write to, I don't think you should assume that reads are necessarily thread-safe unless they are documented to be so. The collection object might maintain some internal state that you don't see but that gets modified on reads.
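The encapsulation advice in Answer 1 could look roughly like the sketch below. It is only an illustration: IdentifierRegistry, lookup and isValid are made-up names, the sample identifiers are invented, and a plain JDK Map of Sets stands in for the MultiValueMap from the question so that the example has no external dependency.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical singleton; the class and method names are illustrative only.
final class IdentifierRegistry {
    private static final IdentifierRegistry INSTANCE = new IdentifierRegistry();

    // The map never leaves this class; callers only ever receive copies.
    private final Map<String, Set<String>> associations = new HashMap<>();

    private IdentifierRegistry() {
        // Populated once, during construction (invented sample data).
        associate("id1", "alias1");
        associate("id1", "alias2");
        associate("id2", "alias2");
    }

    static IdentifierRegistry getInstance() {
        return INSTANCE;
    }

    private void associate(String key, String value) {
        associations.computeIfAbsent(key, k -> new HashSet<>()).add(value);
    }

    // Returns a defensive copy, so callers cannot modify the internal sets.
    Set<String> lookup(String key) {
        Set<String> values = associations.get(key);
        if (values == null) {
            return Collections.<String>emptySet();
        }
        return new HashSet<String>(values);
    }

    boolean isValid(String key) {
        return associations.containsKey(key);
    }
}
```

Because the only ways in are lookup and isValid, adding locking later (if writes after construction ever become necessary) would mean changing two methods rather than every caller.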
Answer 2: Firstly, I don't think that inheritance is necessarily the correct thing to use here. I would have a class that provides your methods and has a HashMap as a private member. As long as your methods don't change the internal state of the object or the HashMap, they won't have to be synchronised.
It's hard to give general rules about synchronization, but your general understanding is right. A data structure that is used in a read-only way does not have to be synchronized. But (1) you have to ensure that nobody (i.e. no other thread) can use this structure before it is properly initialized, and (2) the structure must indeed be read-only. Remember, even iterators have a remove method.
As to your second question: to ensure immutability, i.e. that it is read-only, I would not inherit from HashMap but use it inside your class.
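A minimal sketch of using a HashMap inside the class instead of extending it, assuming the values are doubles keyed by strings (the question does not say). ExperimentData and the DoublePredicate parameter are hypothetical; only the method name filtering comes from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

// Hypothetical wrapper; the class name and value types are assumptions.
final class ExperimentData {
    // Unmodifiable view over a private copy: put() and remove() throw
    // UnsupportedOperationException, and so does Iterator.remove().
    private final Map<String, Double> values;

    ExperimentData(Map<String, Double> parsed) {
        this.values = Collections.unmodifiableMap(new HashMap<String, Double>(parsed));
    }

    // Only reads the map and returns a brand-new list, so it does not
    // need the synchronized modifier.
    List<String> filtering(DoublePredicate criteria) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : values.entrySet()) {
            if (criteria.test(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    int size() {
        return values.size();
    }
}
```

Since none of HashMap's mutators are inherited, callers have no way to change the data after construction, which is what makes the read-only assumption checkable.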
Synchronization is commonly needed when you could have concurrent modifications of the underlying data, or when one thread modifies the data while another reads it and needs to see that modification.
In your case, if I understand it correctly, the MultiValueMap is filled once upon creation and then just read. So unless reading the map modifies some internals, it should be safe to read it from multiple threads without synchronization. The creation process should be synchronized, or you should at least prevent read access during initialization (a simple flag might be sufficient).
The class you describe in question 2 might not need to be synchronized if you always return new collections and no internals of the base collection are modified during creation of those "copies".
One additional note: be aware that the values in the collections might need to be synchronized as well. If you safely get an object from the collection in multiple threads but then concurrently modify that object, you'll still get problems.
So as a general rule of thumb: read-only access does not necessarily need synchronization (if the objects are not modified during those reads, or if that doesn't matter); write access should generally be synchronized.
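One way to prevent read access during initialization is sketched below. It swaps the "simple flag" for a volatile reference that stays null until the map is complete; that is my substitution, chosen because a plain (non-volatile) flag gives no visibility guarantee under the Java memory model. LookupTable and its sample entries are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; names and sample data are illustrative only.
final class LookupTable {
    // volatile: the write that publishes the finished map happens-before
    // any later read that observes the non-null reference.
    private volatile Map<String, String> table;

    // Intended to be called once, by the initializing thread.
    void initialize() {
        Map<String, String> m = new HashMap<>();
        m.put("a", "1");
        m.put("b", "2");
        table = m; // publish only after the map is fully populated
    }

    boolean isReady() {
        return table != null;
    }

    String get(String key) {
        Map<String, String> t = table; // single volatile read
        if (t == null) {
            throw new IllegalStateException("not initialized yet");
        }
        return t.get(key);
    }
}
```

Readers either see no table at all or the fully populated one, provided nothing writes to the map after it has been published.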
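The additional note about values can be illustrated with a small sketch: the map itself is only read after construction, but the values are updated from several threads, so the values carry their own thread safety. HitCounters is a hypothetical class, and using AtomicInteger is just one possible choice for a thread-safe value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class; names are illustrative only.
final class HitCounters {
    // Filled in the constructor and never structurally modified afterwards.
    private final Map<String, AtomicInteger> counters = new HashMap<>();

    HitCounters(String... keys) {
        for (String k : keys) {
            counters.put(k, new AtomicInteger());
        }
    }

    // The map is only read here; the AtomicInteger handles concurrent updates.
    void hit(String key) {
        AtomicInteger c = counters.get(key);
        if (c != null) {
            c.incrementAndGet();
        }
    }

    int count(String key) {
        AtomicInteger c = counters.get(key);
        return c == null ? 0 : c.get();
    }
}
```

With a plain mutable value (say, an int[] holding a counter) in place of AtomicInteger, the reads of the map would still be fine but concurrent increments could be lost.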
If your maps are populated once, at the time the class is initialized (i.e. in a static initializer block), and are never modified afterwards (i.e. no elements or associations are added or removed), you are fine. Static initialization is guaranteed to be performed in a thread-safe manner by the JVM, and its results are visible to all threads. So in this case you most probably don't need any further synchronization.
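A minimal sketch of the static-initializer case, with a hypothetical Units class and invented entries. Wrapping the map in Collections.unmodifiableMap is an extra precaution of mine, not something the answer requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; names and entries are illustrative only.
final class Units {
    private static final Map<String, String> SYMBOLS;

    // Runs exactly once, under the JVM's class-initialization lock,
    // before any thread can call symbolOf().
    static {
        Map<String, String> m = new HashMap<>();
        m.put("meter", "m");
        m.put("second", "s");
        SYMBOLS = Collections.unmodifiableMap(m);
    }

    private Units() {
    }

    static String symbolOf(String unit) {
        return SYMBOLS.get(unit);
    }
}
```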
If the maps are instance members (this is not clear to me from your description) but are not modified after creation, I would again say you are most probably safe if you declare your members final (unless you publish the this object reference prematurely, i.e. somehow pass it to the outside world from the constructor before the constructor has finished).
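For the instance-member case, a sketch under the same assumptions might look as follows. Associations is a hypothetical class; the point is the final field assigned once in a constructor that never hands this to anyone else.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class; names are illustrative only.
final class Associations {
    // final: a thread that obtains a reference to a fully constructed
    // instance sees the map as it was when the constructor finished.
    private final Map<String, List<String>> byId;

    Associations(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : source.entrySet()) {
            List<String> values = new ArrayList<String>(e.getValue());
            copy.put(e.getKey(), Collections.unmodifiableList(values));
        }
        this.byId = Collections.unmodifiableMap(copy);
        // Nothing else happens here on purpose: no listener registration and
        // no thread start, either of which could leak 'this' too early.
    }

    List<String> get(String id) {
        List<String> found = byId.get(id);
        return found == null ? Collections.<String>emptyList() : found;
    }
}
```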