从 Set 中获取元素
为什么 Set
不提供获取与另一个元素相等的元素的操作?
Set<Foo> set = ...;
...
Foo foo = new Foo(1, 2, 3);
Foo bar = set.get(foo); // get the Foo element from the Set that equals foo
我可以问Set
是否包含等于bar
的元素,那么为什么我无法获取该元素呢? :(
澄清一下,equals
方法被重写,但它只检查其中一个字段,而不是所有字段。因此,两个被视为相等的 Foo
对象实际上可以具有不同的值,这就是为什么我不能只使用 foo
。
Why doesn't Set
provide an operation to get an element that equals another element?
Set<Foo> set = ...;
...
Foo foo = new Foo(1, 2, 3);
Foo bar = set.get(foo); // get the Foo element from the Set that equals foo
I can ask whether the Set
contains an element equal to bar
, so why can't I get that element? :(
To clarify, the equals
method is overridden, but it only checks one of the fields, not all. So two Foo
objects that are considered equal can actually have different values, that's why I can't just use foo
.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(22)
要回答这个精确的问题“为什么
Set
不提供获取与另一个元素相等的元素的操作?”,答案是:因为集合的设计者框架不是很有前瞻性。他们没有预料到您非常合法的用例,天真地尝试“对数学集合抽象进行建模”(来自 javadoc),只是忘记添加有用的get()
方法。现在隐含的问题“如何获取元素”:我认为最好的解决方案是使用
Map
而不是设置
,将元素映射到自身。这样,您就可以高效从“集合”中检索元素,因为Map
的get()方法将使用高效的哈希表或树来查找元素算法。如果需要,您可以编写自己的Set
实现,它提供附加的get()
方法,封装Map
。我认为以下答案是不好或错误的:
“您不需要获取元素,因为您已经有一个相等的对象”:断言是错误的,正如您在问题中已经表明的那样。两个相等的对象仍然可以具有与对象相等性无关的不同状态。目标是访问
Set
中包含的元素的状态,而不是用作“查询”的对象的状态。“你别无选择,只能使用迭代器”:这是对集合的线性搜索,对于大型集合来说效率很低(具有讽刺意味的是,
Set
在内部被组织为哈希映射或树,可以高效查询)。不要这样做!通过使用这种方法,我在现实系统中看到了严重的性能问题。在我看来,缺少get()
方法的可怕之处并不在于解决它有点麻烦,而是大多数程序员会使用线性搜索方法而不考虑其含义。To answer the precise question "Why doesn't
Set
provide an operation to get an element that equals another element?", the answer would be: because the designers of the collection framework were not very forward looking. They didn't anticipate your very legitimate use case, naively tried to "model the mathematical set abstraction" (from the javadoc) and simply forgot to add the usefulget()
method.Now to the implied question "how do you get the element then": I think the best solution is to use a
Map<E,E>
instead of aSet<E>
, to map the elements to themselves. In that way, you can efficiently retrieve an element from the "set", because the get() method of theMap
will find the element using an efficient hash table or tree algorithm. If you wanted, you could write your own implementation ofSet
that offers the additionalget()
method, encapsulating theMap
.The following answers are in my opinion bad or wrong:
"You don't need to get the element, because you already have an equal object": the assertion is wrong, as you already showed in the question. Two objects that are equal still can have different state that is not relevant to the object equality. The goal is to get access to this state of the element contained in the
Set
, not the state of the object used as a "query"."You have no other option but to use the iterator": that is a linear search over a collection which is totally inefficient for large sets (ironically, internally the
Set
is organized as hash map or tree that could be queried efficiently). Don't do it! I have seen severe performance problems in real-life systems by using that approach. In my opinion what is terrible about the missingget()
method is not so much that it is a bit cumbersome to work around it, but that most programmers will use the linear search approach without thinking of the implications.如果元素相等,那么获取元素就没有意义了。
Map
更适合此用例。如果您仍然想查找该元素,您别无选择,只能使用迭代器:
There would be no point of getting the element if it is equal. A
Map
is better suited for this usecase.If you still want to find the element you have no other option but to use the iterator:
如果你有一个相同的对象,为什么你需要集合中的那个呢?如果仅通过键“相等”,则
Map
将是更好的选择。无论如何,下面的代码可以做到这一点:
使用 Java 8,这可以成为一个行:
或者(如果
equals()
按预期工作):If you have an equal object, why do you need the one from the set? If it is "equal" only by a key, a
Map
would be a better choice.Anyway, the following will do it:
With Java 8 this can become a one liner:
or (if
equals()
works as expected):不幸的是,Java 中的默认集并非旨在提供“获取”操作,正如 jschreiner 准确解释的那样。
使用迭代器查找感兴趣的元素的解决方案(由dacwe建议)或删除元素并重新添加它的值已更新(由 KyleM),可以工作,但效率可能非常低。
重写 equals 的实现,使非相等的对象“相等”,如 David Ogren,很容易造成维护问题。
恕我直言,使用 Map 作为显式替换(正如许多人所建议的那样)会使代码不太优雅。
如果目标是访问集合中包含的元素的原始实例(希望我正确理解您的用例),那么这是另一个可能的解决方案。
我个人在使用 Java 开发客户端-服务器视频游戏时也有同样的需求。就我而言,每个客户端都有存储在服务器中的组件的副本,问题是每当客户端需要修改服务器的对象时。
通过互联网传递对象意味着客户端无论如何都有该对象的不同实例。为了将这个“复制的”实例与原始实例相匹配,我决定使用 Java UUID。
因此,我创建了一个抽象类 UniqueItem,它自动为其子类的每个实例提供一个随机的唯一 id。
此 UUID 在客户端和服务器实例之间共享,因此通过这种方式,只需使用 Map 即可轻松匹配它们。
然而,在类似的用例中直接使用 Map 仍然不优雅。有人可能会争辩说,使用地图的维护和处理可能会更复杂。
由于这些原因,我实现了一个名为 MagicSet 的库,它使 Map 的使用对开发人员来说“透明”。
https://github.com/ricpacca/magicset
与原始 Java HashSet 一样,MagicHashSet(它是一个库中提供的 MagicSet 实现的一部分)使用后备 HashMap,但它不是将元素作为键,将虚拟值作为值,而是使用元素的 UUID 作为键,将元素本身作为值。与普通的 HashSet 相比,这不会导致内存使用的开销。
此外,MagicSet 可以完全用作 Set,但还有一些提供附加功能的方法,如 getFromId()、popFromId()、removeFromId() 等。
使用它的唯一要求是您想要的任何元素MagicSet 中的存储需要扩展抽象类 UniqueItem。
下面是一个代码示例,设想从 MagicSet 检索城市的原始实例,给定具有相同 UUID(甚至只是其 UUID)的该城市的另一个实例。
Default Set in Java is, unfortunately, not designed to provide a "get" operation, as jschreiner accurately explained.
The solutions of using an iterator to find the element of interest (suggested by dacwe) or to remove the element and re-add it with its values updated (suggested by KyleM), could work, but can be very inefficient.
Overriding the implementation of equals so that non-equal objects are "equal", as stated correctly by David Ogren, can easily cause maintenance problems.
And using a Map as an explicit replacement (as suggested by many), imho, makes the code less elegant.
If the goal is to get access to the original instance of the element contained in the set (hope I understood correctly your use case), here is another possible solution.
I personally had your same need while developing a client-server videogame with Java. In my case, each client had copies of the components stored in the server and the problem was whenever a client needed to modify an object of the server.
Passing an object through the internet meant that the client had different instances of that object anyway. In order to match this "copied" instance with the original one, I decided to use Java UUIDs.
So I created an abstract class UniqueItem, which automatically gives a random unique id to each instance of its subclasses.
This UUID is shared between the client and the server instance, so this way it could be easy to match them by simply using a Map.
However directly using a Map in a similar usecase was still inelegant. Someone might argue that using an Map might be more complicated to mantain and handle.
For these reasons I implemented a library called MagicSet, that makes the usage of an Map "transparent" to the developer.
https://github.com/ricpacca/magicset
Like the original Java HashSet, a MagicHashSet (which is one of the implementations of MagicSet provided in the library) uses a backing HashMap, but instead of having elements as keys and a dummy value as values, it uses the UUID of the element as key and the element itself as value. This does not cause overhead in the memory use compared to a normal HashSet.
Moreover, a MagicSet can be used exactly as a Set, but with some more methods providing additional functionalities, like getFromId(), popFromId(), removeFromId(), etc.
The only requirement to use it is that any element that you want to store in a MagicSet needs to extend the abstract class UniqueItem.
Here is a code example, imagining to retrieve the original instance of a city from a MagicSet, given another instance of that city with the same UUID (or even just its UUID).
使用 Java 8,您可以执行以下操作:
但要小心,.get() 会抛出 NoSuchElementException,或者您可以操作可选项。
With Java 8 you can do:
But be careful, .get() throws a NoSuchElementException, or you can manipulate a Optional item.
如果您的集合实际上是一个
NavigableSet
(例如TreeSet
),并且Foo 实现了 Comparable
,您可以使用(感谢@eliran-malka 的评论提示。)
If your set is in fact a
NavigableSet<Foo>
(such as aTreeSet
), andFoo implements Comparable<Foo>
, you can use(Thanks to @eliran-malka’s comment for the hint.)
原因:
Set 似乎在提供比较手段方面发挥了有用的作用。它被设计为不存储重复元素。
由于这种意图/设计,如果要 get() 对存储对象的引用,然后对其进行变异,则 Set 的设计意图可能会受到阻碍,并可能导致意外行为。
来自 JavaDocs
方法:
现在已经引入了 Streams,我们可以执行以下操作
Why:
It seems that Set plays a useful role in providing a means of comparison. It is designed not to store duplicate elements.
Because of this intention/design, if one were to get() a reference to the stored object, then mutate it, it is possible that the design intentions of Set could be thwarted and could cause unexpected behavior.
From the JavaDocs
How:
Now that Streams have been introduced one can do the following
将set转换为list,然后使用list的
get
方法Convert set to list, and then use
get
method of list如果您只执行一次获取,则性能不会很好,因为您将循环所有元素,但是当对一个大集合执行多次检索时,您会注意到差异。
If you only do one get this will not be very performing because you will loop over all your elements but when performing multiple retrieves on a big set you will notice the difference.
如果您查看
java.util.HashSet
实现的前几行,您将看到:因此
HashSet
无论如何都会内部使用HashMap
,这意味着如果您直接使用HashMap
并使用相同的值作为键和值,您将获得您想要的效果并节省一些内存。If you look at the first few lines of the implementation of
java.util.HashSet
you will see:So
HashSet
usesHashMap
interally anyway, which means that if you just use aHashMap
directly and use the same value as the key and the value you will get the effect you want and save yourself some memory.你可以使用迭代器类
you can use Iterator class
看起来正确使用的对象是 Interner :
它还有一些非常有趣的杠杆,例如 concurrencyLevel 或使用的引用类型(可能值得注意的是,它不提供 SoftInterner,我认为它比 WeakInterner 更有用)。
it looks like the proper object to use is the Interner from guava :
It also has a few very interesting levers, like concurrencyLevel, or the type of references used (it might be worth noting that it doesn't offer a SoftInterner which I could see as more useful than a WeakInterner).
我知道,这个问题很久以前就被问过并回答过,但是如果有人感兴趣,这是我的解决方案 - 由 HashMap 支持的自定义集类:
http://pastebin.com/Qv6S91n9
您可以轻松实现所有其他 Set 方法。
I know, this has been asked and answered long ago, however if anyone is interested, here is my solution - custom set class backed by HashMap:
http://pastebin.com/Qv6S91n9
You can easily implement all other Set methods.
去过那儿做过吗!如果您使用 Guava,将其转换为地图的快速方法是:
Been there done that!! If you are using Guava a quick way to convert it to a map is:
如果你想要 HashSet 中的第 n 个元素,你可以使用下面的解决方案,
这里我在HashSet中添加了ModelClass的对象。
If you want nth Element from HashSet, you can go with below solution,
here i have added object of ModelClass in HashSet.
因为 Set 的任何特定实现可能是也可能不是随机访问。
您始终可以获得 迭代器 并单步执行 Set,一旦找到相等的元素,就使用迭代器的
next()
方法返回所需的结果。无论实现如何,这都有效。如果实现不是随机访问(想象一个链表支持的 Set),则接口中的 get(E element) 方法将具有欺骗性,因为它必须迭代集合才能找到元素返回,并且get(E element)
似乎暗示这是必要的,Set 可以直接跳转到要获取的元素。当然,contains() 可能需要也可能不需要做同样的事情,具体取决于实现,但这个名称似乎不会引起同样的误解。
Because any particular implementation of Set may or may not be random access.
You can always get an iterator and step through the Set, using the iterators'
next()
method to return the result you want once you find the equal element. This works regardless of the implementation. If the implementation is NOT random access (picture a linked-list backed Set), aget(E element)
method in the interface would be deceptive, since it would have to iterate the collection to find the element to return, and aget(E element)
would seem to imply this would be necessary, that the Set could jump directly to the element to get.contains()
may or may not have to do the same thing, of course, depending on the implementation, but the name doesn't seem to lend itself to the same sort of misunderstandings.是的,使用
HashMap
...但是以一种特殊的方式:我在尝试使用HashMap
作为伪Set
时预见到的陷阱是Map/Set
的“实际”元素与“候选”元素(即用于测试equal
元素是否已存在的元素)之间可能存在混淆。这远非万无一失,但可以让您远离陷阱:然后这样做:
但是...您现在希望候选者以某种方式自毁,除非程序员实际上立即将其放入
Map/Set
...您希望contains
“污染”candidate
,以便任何使用它,除非它加入>Map
使它成为“诅咒”。也许您可以让SomeClass
实现一个新的Taintable
接口。更令人满意的解决方案是 GettableSet,如下所示。但是,要实现此目的,您必须负责
SomeClass
的设计,以使所有构造函数不可见(或者......能够并愿意设计和使用包装类it):实现:
你的
NoVisibleConstructor
类看起来像这样:PS 这样一个
NoVisibleConstructor
类的一个技术问题:可能会反对这样一个类本质上是final
,这可能是不可取的。实际上,您始终可以添加一个虚拟无参数受保护的构造函数:...这至少可以让子类编译。然后,您必须考虑是否需要在子类中包含另一个
getOrCreate()
工厂方法。最后一步是一个抽象基类(注意列表的“元素”,集合的“成员”),就像您的集合成员一样(如果可能的话 - 再次,使用包装器的范围类,其中该类不在您的控制之下,或者已经具有基类等),以实现最大程度的实现隐藏:
...用法相当明显(在您的
SomeClass
内) sstatic
工厂方法):Yes, use
HashMap
... but in a specialised way: the trap I foresee in trying to use aHashMap
as a pseudo-Set
is the possible confusion between "actual" elements of theMap/Set
, and "candidate" elements, i.e. elements used to test whether anequal
element is already present. This is far from foolproof, but nudges you away from the trap:Then do this:
But... you now want the
candidate
to self-destruct in some way unless the programmer actually immediately puts it in theMap/Set
... you'd wantcontains
to "taint" thecandidate
so that any use of it unless it joins theMap
makes it "anathema". Perhaps you could makeSomeClass
implement a newTaintable
interface.A more satisfactory solution is a GettableSet, as below. However, for this to work you have either to be in charge of the design of
SomeClass
in order to make all constructors non-visible (or... able and willing to design and use a wrapper class for it):Implementation:
Your
NoVisibleConstructor
classes then look like this:PS one technical issue with such a
NoVisibleConstructor
class: it may be objected that such a class is inherentlyfinal
, which may be undesirable. Actually you could always add a dummy parameterlessprotected
constructor:... which would at least let a subclass compile. You'd then have to think about whether you need to include another
getOrCreate()
factory method in the subclass.Final step is an abstract base class (NB "element" for a list, "member" for a set) like this for your set members (when possible - again, scope for using a wrapper class where the class is not under your control, or already has a base class, etc.), for maximum implementation-hiding:
... usage is fairly obvious (inside your
SomeClass
'sstatic
factory method):哈希码的契约明确指出:
所以你的假设:
是错误的,你违反了合同。如果我们看一下 Set 接口的“contains”方法,我们会发现:
要完成您想要的任务,您可以使用 Map 来定义键,并使用定义对象彼此不同或相等的键来存储元素。
The contract of the hash code makes clear that:
So your assumption:
is wrong and you are breaking the contract. If we look at the "contains" method of Set interface, we have that:
To accomplish what you want, you can use a Map where you define the key and store your element with the key that defines how objects are different or equal to each other.
如果您有一个
NavigableSet
(例如TreeSet
),您可以执行以下操作:对于
HashSet
及其后代(如),事情稍微棘手一些>LinkedHashSet
:Here's what you can do if you have a
NavigableSet
(e.g. aTreeSet
):The things are slightly trickier for
HashSet
and its descendants likeLinkedHashSet
:可能解决这种情况的快速帮助方法:
Quick helper method that might address this situation:
尝试使用数组:
Try using an array:
以下可以是一种方法
Following can be an approach