java - Live view of collections contained in collections contained in ... etc.
I have a class A which can contain many instances of class B, which may in turn contain many instances of class C, which can contain many instances of class D.
Now, in class A I have a method getAllD. Currently, every time it is called a lot of iterating takes place, and a rather large list is freshly created and returned. This cannot be very efficient.
I was wondering how I could do this better. The question "Combine multiple Collections into a single logical Collection?" seems to touch upon a similar topic, but I'm not really sure how I could apply it to my situation.
All comments are much appreciated!
Comments (4)
I would combine Iterables.concat with Iterables.transform to obtain a live view of the Ds.
This works well if your goal is simply to iterate over the Ds and you don't really need a collection view. It avoids instantiating a big temporary collection.
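A minimal sketch of such a live view, using only the JDK: a lazy Iterable built on Stream.flatMap stands in here for Guava's Iterables.concat and Iterables.transform, so this is a swapped-in technique rather than the Guava calls themselves. The class shapes and the field names bs, cs and ds are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal model; the field names bs, cs and ds are assumptions.
class D {
    final String name;
    D(String name) { this.name = name; }
}

class C { final List<D> ds = new ArrayList<>(); }

class B { final List<C> cs = new ArrayList<>(); }

class A {
    final List<B> bs = new ArrayList<>();

    // Returns a lazy view: nothing is copied, and every call to iterator()
    // walks the current state of the nested lists.
    Iterable<D> getAllD() {
        return () -> bs.stream()
                .flatMap(b -> b.cs.stream())
                .flatMap(c -> c.ds.stream())
                .iterator();
    }
}

class LiveViewDemo {
    // Counts the elements the view currently exposes.
    static int count(Iterable<D> view) {
        int n = 0;
        for (D ignored : view) n++;
        return n;
    }

    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        C c = new C();
        a.bs.add(b);
        b.cs.add(c);
        c.ds.add(new D("d1"));

        Iterable<D> all = a.getAllD();
        System.out.println(count(all)); // prints 1
        c.ds.add(new D("d2"));          // added after the view was created
        System.out.println(count(all)); // prints 2
    }
}
```

Because the view reads the underlying lists directly, modifying them while a loop over the view is in progress can fail with a ConcurrentModificationException, just as it would for the lists themselves.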
The answer to your question is going to depend on the specifics of your situation. Are these collections static or dynamic? How big is your collection of Bs in A? Are you only going to access the Ds from A, or will you sometimes want to start farther down the tree, or return Bs or Cs? How frequently are you going to want to access the same set of Ds from a particular A? Can a D (or C or B) be associated with more than one A?
If everything is dynamic, then the best chance of improving performance is to have parent references from the Cs to A, and to update the parent whenever a C's list of Ds changes. This way, you can keep a collection of Ds in your A object and update A whenever one of the Cs gets a new D or has one deleted.
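One way the parent-reference idea could look is sketched below. The method names addD, removeD, dAdded and dRemoved are hypothetical, B is left out for brevity, and the sketch assumes that a given D lives in at most one C under a given A.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class D { }

class A {
    // Flat collection of every D below this A, maintained incrementally.
    private final Set<D> allDs = new LinkedHashSet<>();

    void dAdded(D d)   { allDs.add(d); }
    void dRemoved(D d) { allDs.remove(d); }

    // Read-only view of the maintained set: no traversal, no copying.
    Collection<D> getAllD() {
        return Collections.unmodifiableCollection(allDs);
    }
}

class C {
    private final A owner;                      // parent reference back to A
    private final List<D> ds = new ArrayList<>();

    C(A owner) { this.owner = owner; }

    void addD(D d) {
        ds.add(d);
        owner.dAdded(d);                        // keep A's flat collection in sync
    }

    void removeD(D d) {
        // Assumption: a D is held by at most one C, so removing it here
        // means it is gone from A as well.
        if (ds.remove(d)) owner.dRemoved(d);
    }
}

class ParentReferenceDemo {
    public static void main(String[] args) {
        A a = new A();
        C c1 = new C(a);
        C c2 = new C(a);
        D d1 = new D();
        c1.addD(d1);
        c2.addD(new D());
        System.out.println(a.getAllD().size()); // prints 2
        c1.removeD(d1);
        System.out.println(a.getAllD().size()); // prints 1
    }
}
```

The price is that every change to a C's list has to go through addD and removeD; any code that touches the list directly leaves A out of date.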
If everything is static and there is some reuse of the D collections from each A, then caching may be a good choice, particularly if there are a lot of Bs. A would have a map with a key of B and a value of a collection of Ds. The getAllDs() method would first check to see if the map had a key for B and if so return its collection of Ds. If not, then it would generate the collection, store it into the cache map, and return the collection.
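A rough sketch of the caching variant follows, assuming the structure never changes after it is built. The helper getDsOf and the computations counter are hypothetical names added for the example; the counter only exists to show that the traversal runs once per B.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class D { }
class C { final List<D> ds = new ArrayList<>(); }
class B { final List<C> cs = new ArrayList<>(); }

class A {
    final List<B> bs = new ArrayList<>();

    // Cache keyed by B; only valid while the structure stays unchanged.
    private final Map<B, List<D>> cache = new HashMap<>();
    int computations = 0; // demo counter: how often a B was actually traversed

    // Returns the Ds below one B, computing them on the first request only.
    List<D> getDsOf(B b) {
        return cache.computeIfAbsent(b, key -> {
            computations++;
            List<D> result = new ArrayList<>();
            for (C c : key.cs) result.addAll(c.ds);
            return Collections.unmodifiableList(result);
        });
    }

    // Still builds a new list, but skips the C-level traversal on cache hits.
    List<D> getAllDs() {
        List<D> all = new ArrayList<>();
        for (B b : bs) all.addAll(getDsOf(b));
        return all;
    }
}

class CacheDemo {
    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        C c = new C();
        c.ds.add(new D());
        c.ds.add(new D());
        b.cs.add(c);
        a.bs.add(b);

        a.getAllDs();
        a.getAllDs();
        System.out.println(a.computations); // prints 1: second call hit the cache
    }
}
```

If the data can change after all, the cache needs an invalidation hook, which brings back much of the bookkeeping of the dynamic case.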
You could also use a tree to store the objects, particularly if they were fairly simple. For example, you could create an XML DOM object and use XPath expressions to pull out the subset of Ds that you wanted. This would allow far more dynamic access to the sets of objects you were interested in.
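To give a feel for the XPath route, here is a small sketch using the XML APIs that ship with the JDK. The element names a, b, c and d and the id and name attributes are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathDemo {
    // Hypothetical document mirroring the A > B > C > D nesting.
    static final String XML =
          "<a>"
        + "<b id='b1'><c><d name='d1'/><d name='d2'/></c></b>"
        + "<b id='b2'><c><d name='d3'/></c></b>"
        + "</a>";

    // Parses the XML and returns how many nodes the XPath expression selects.
    static int countMatches(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XML or XPath processing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(XML, "//d"));               // prints 3
        System.out.println(countMatches(XML, "/a/b[@id='b2']//d")); // prints 1
    }
}
```

The second expression shows the flexibility mentioned above: the same tree answers "all Ds" and "all Ds under one particular B" without any extra Java code.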
Each of these solutions has different tradeoffs in terms of cost to set up, cost to maintain, timeliness of results, flexibility of use, and cost to fetch results. Which one you should choose is going to depend on your context.
Actually, I think Iterables.concat (or IteratorChain from Apache Commons) would work fine for your case.
Iterating in-memory is pretty damn fast. Also, the efficiency of creating an ArrayList of 10k elements compared to creating 10 ArrayLists with 1k elements each won't be that drastically different. So, in conclusion, you should probably first just go with the most straightforward iteration. Chances are that this works just fine.
Even if you have a gazillion elements, it is probably wise to implement the straightforward iteration anyway for comparison. Otherwise you don't know whether you are actually optimizing or slowing things down by being clever.
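For reference, the straightforward version with a very crude timing could look like the sketch below. The build helper and the sizes are made up for the example, and a single System.nanoTime measurement is only a rough indication; a harness such as JMH would be the better tool for a serious comparison.

```java
import java.util.ArrayList;
import java.util.List;

class D { }
class C { final List<D> ds = new ArrayList<>(); }
class B { final List<C> cs = new ArrayList<>(); }

class A {
    final List<B> bs = new ArrayList<>();

    // The plain approach from the question: walk everything, build a new list.
    List<D> getAllD() {
        List<D> all = new ArrayList<>();
        for (B b : bs)
            for (C c : b.cs)
                all.addAll(c.ds);
        return all;
    }
}

class BaselineDemo {
    // Builds an A with perLevel Bs, each with perLevel Cs, each with perLevel Ds.
    static A build(int perLevel) {
        A a = new A();
        for (int i = 0; i < perLevel; i++) {
            B b = new B();
            for (int j = 0; j < perLevel; j++) {
                C c = new C();
                for (int k = 0; k < perLevel; k++) c.ds.add(new D());
                b.cs.add(c);
            }
            a.bs.add(b);
        }
        return a;
    }

    public static void main(String[] args) {
        A a = build(100);                          // 100 * 100 * 100 Ds
        for (int i = 0; i < 5; i++) a.getAllD();   // crude warm-up
        long start = System.nanoTime();
        int size = a.getAllD().size();
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(size + " Ds collected in " + micros + " microseconds");
    }
}
```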
Having said that, if you want to optimize for sequential read access of all Ds, I'd maintain an "index" outside. The index could be a LinkedList, ArrayList, TreeList etc., depending on your situation. For example, if you aren't sure of the length of the index, it is probably wise to avoid ArrayList. If you want to efficiently remove random elements using the reference of that element, OrderedSet might be much better than a list etc.
When you do this you have to worry about the consistency of the index and the actual references in your classes. I.e. more complexity = more places to hide bugs. So, unless you find it necessary through performance testing, it is really not advisable to attempt an optimization.
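A possible shape for such an external index is sketched below, using the JDK's LinkedHashSet as the ordered set (an assumption; the answer does not say which implementation it has in mind). The class DIndex and its methods are hypothetical, and isConsistentWith illustrates the consistency worry: it only passes as long as every change goes through the index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class D { }
class C { final List<D> ds = new ArrayList<>(); }
class B { final List<C> cs = new ArrayList<>(); }
class A { final List<B> bs = new ArrayList<>(); }

// External index over every D. LinkedHashSet keeps insertion order and
// removes an element by reference in expected constant time.
class DIndex {
    private final Set<D> index = new LinkedHashSet<>();

    // All changes must go through these two methods to keep the index in sync.
    void add(C c, D d)    { c.ds.add(d); index.add(d); }
    void remove(C c, D d) { if (c.ds.remove(d)) index.remove(d); }

    Iterable<D> all() { return Collections.unmodifiableSet(index); }
    int size()        { return index.size(); }

    // Debug helper: compares the index against a full traversal of the tree.
    boolean isConsistentWith(A a) {
        Set<D> actual = new HashSet<>();
        for (B b : a.bs)
            for (C c : b.cs)
                actual.addAll(c.ds);
        return actual.equals(index);
    }
}

class IndexDemo {
    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        C c = new C();
        a.bs.add(b);
        b.cs.add(c);

        DIndex idx = new DIndex();
        idx.add(c, new D());
        System.out.println(idx.isConsistentWith(a)); // prints true

        c.ds.add(new D());                           // bypasses the index
        System.out.println(idx.isConsistentWith(a)); // prints false
    }
}
```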
(By the way, avoiding the instantiation of new collection objects is unlikely to make things much faster unless you are talking about EXTREMELY high-performance code. Object instantiation in modern JVMs takes only a few tens of nanoseconds or so. Also, you could mistakenly use an ArrayList with a small initial capacity or something similar and make things worse.)