假设我想将单词放入数据结构中,并且希望进行恒定时间查找以查看该单词是否在此数据结构中。 我只想看看这个词是否存在。 我会为此使用 HashMap
(containsKey()) 吗? HashMap 使用键->值配对,但在我的例子中,我没有值。 当然,我可以使用 null 作为值,但即使 null 也会占用空间。 看来这个应用程序应该有一个更好的数据结构。
该集合可能会被多个线程使用,但由于该集合包含的对象不会更改,因此我认为我没有同步/并发要求。
谁能帮我吗?
Let's say I want to put words in a data structure and I want to have constant time lookups to see if the word is in this data structure. All I want to do is to see if the word exists. Would I use a HashMap
(containsKey()) for this? HashMap
s use key->value pairings, but in my case I don't have a value. Of course I could use null for the value, but even null takes space. It seems like there ought to be a better data structure for this application.
The collection could potentially be used by multiple threads, but since the objects contained by the collection would not change, I do not think I have a synchronization/concurrency requirement.
Can anyone help me out?
发布评论
评论(6)
请改用 HashSet 。 它是 Set 的哈希实现,它主要用于您所描述的内容(一组无序的项目)。
Use HashSet instead. It's a hash implementation of Set, which is used primarily for exactly what you describe (an unordered set of items).
您通常会使用 Set 的实现,最常见的是 HashSet。 如果您确实需要并发访问,那么 ConcurrentHashSet 提供了一个直接替代方案,可以提供安全的并发访问,包括对集合的安全迭代。
我建议在任何情况下,在整个代码中将其简单地称为 Set,除了在构建它的地方; 这样,如果您以后需要,可以更轻松地为另一种实现添加一个实现。
即使集合是只读的,如果它被创建它的线程以外的线程使用,您确实需要考虑安全发布 (也就是说,确保任何其他线程看到该集合处于一致状态:记住任何内存写入,即使在构造函数中,也不能保证在您期望的时间或在您期望的时间中可供其他线程使用,除非您采取措施确保这一点)。 这可以通过以下两种方法来完成:
您可以使用 Collections.unmodifyingSet() 包装器来帮助确保后者。 这为您提供了给定集合的不可修改的视图——因此,如果没有其他对集合转义的“正常”引用,您是安全的。
You'd generally use an implementation of Set, and most usually HashSet. If you did need concurrent access, then ConcurrentHashSet provides a drop-in replacement that provides safe, concurrent access, including safe iteration over the set.
I'd recommend in any case referring to it as simply a Set throughout your code, except in the one place where you construct it; that way, it's easier to drop in one implementation for the other if you later require it.
Even if the set is read-only, if it's used by a thread other than the one that creates it, you do need to think about safe publication (that is, making sure that any other thread sees the set in a consistent state: remember any memory writes, even in constructors, aren't guaranteed to be made available to other threads when or in the otder you expect, unless you take steps to ensure this). This can be done by both of the following:
You can help to ensure the latter by using the Collections.unmodifiableSet() wrapper. This gives you an unmodifiable view of the given set-- so provided no other "normal" reference to the set escapes, you're safe.
您可能想使用 java.util.Set。 实现包括 java.util.HashSet,其中是 HashMap 的 Set 等价物。
即使集合中包含的对象没有改变,你也可能需要做同步。 Set 传递到不同线程后是否需要将新对象添加到 Set 中? 如果是这样,您可以使用 Collections.synchronizedSet() 使 Set 线程安全。
如果您有一个带有值的 Map,并且您有一些代码只想将 Map 视为 Set,则可以使用 Map.entrySet() (但请记住,entrySet 返回 Map 中键的 Set 视图;如果Map是可变的,则可以通过entrySet返回的集合来更改Map)。
You probably want to use a java.util.Set. Implementations include java.util.HashSet, which is the Set equivalent of HashMap.
Even if the objects contained in the collection do not change, you may need to do synchronization. Do new objects need to be added to the Set after the Set is passed to a different thread? If so, you can use Collections.synchronizedSet() to make the Set thread-safe.
If you have a Map with values, and you have some code that just wants to treat the Map as a Set, you can use Map.entrySet() (though keep in mind that entrySet returns a Set view of the keys in the Map; if the Map is mutable, the Map can be changed through the set returned by entrySet).
您想使用实现 Set 接口的 Collection(可能是 HashSet)来获得您所说的性能。 请参阅 http://java.sun.com/javase /6/docs/api/java/util/Set.html
You want to use a Collection implementing the Set interface, probably HashSet to get the performance you stated. See http://java.sun.com/javase/6/docs/api/java/util/Set.html
除了
Set
之外,在某些情况下,您可能需要使用Collections.newSetFromMap(MapMap
转换为Set
,Boolean>) (某些Map
不允许null
值,因此使用Boolean
)。Other than
Set
s, in some circumstances you might want to convert aMap
into aSet
withCollections.newSetFromMap(Map<E,Boolean>)
(someMap
s disallownull
values, hence theBoolean
).正如大家所说,HashSet 可能是最简单的解决方案,但您不会在 HashSet 中进行恒定时间查找(因为条目可能是链接的),并且您将为每个条目存储一个虚拟对象(始终相同)...
有关信息 这里是数据结构列表也许您会找到更适合您需求的一个。
as everyone said HashSet is probably the simplest solution but you won't have constant time lookup in a HashSet (because entries may be chained) and you will store a dummy object (always the same) for every entry...
For information here a list of data structures maybe you'll find one that better fits your needs.