此用例的 Java 集合

发布于 2024-10-30 13:21:40 字数 694 浏览 4 评论 0 原文

假设我们有一堆 Car 对象。

每辆汽车都有一些独特的属性,例如制造商、型号、年份等(这些可用于创建不同的哈希代码)。

每辆车都有一个 PurchaseOffer 对象列表(PurchaseOffer 对象包含定价\零售商信息)。

我们从多个不同来源收到汽车列表,每辆车都有一个购买报价。 事实是,这些列表可能会重叠 - 一辆汽车可以出现在多个列表中。

我们希望将列表聚合到单个汽车集合中,其中每辆车都包含其遇到的所有购买报价。

我的问题是在这个聚合过程中选择要使用的集合:

使用 java.util.HashSet 来保存我们的汽车感觉很自然,这样当检查不同的汽车列表时,我们可以检查是否集合中已经存在一辆汽车,分摊时间为 O(1), 但是 - 您无法从 Set 中检索元素(在我们的例子中 - 当我们遇到 Set 中已存在的汽车时 - 我们希望根据其标识 hashCode 从 Set 中检索该汽车并向其添加 PurchaseOffers ) 。

我可以使用 HashMap,其中每个 Car 的 hashCode 映射到实际的 Car 对象,但它可能不是教科书解决方案,因为它不安全 - 我必须确保自己每个 hashCode 都映射到具有该 hashCode 的 Car -可能会出现不一致的情况。 当然,可以制定一个指定的数据结构来保证这种一致性——难道不应该已经存在吗?

任何人都可以建议我所追求的数据结构,或者指出设计错误吗? 谢谢。

Let's say we have a bunch of Car objects.

Each Car has some distinguishing properties e.g. manufacturer, model, year, etc. (these can be used to create distinct hashCodes).

Each car has a List of PurchaseOffer objects (a PurchaseOffer object contains pricing\retailer info).

We receive Lists of Cars from several different sources, each Car with a single PurchaseOffer.
Thing is, these lists may overlap - a Car can appear in more than one list.

We wish to aggregate the lists into a single collection of Cars where each Car holds all encountered PurchaseOffers for it.

My Problem is choosing what to collection to use in this aggregation process:

Feels natural to use java.util.HashSet for holding our cars, that way when going over the different lists of Cars, we can check if a car already exists in the Set in amortized O(1),
however - you cannot retrieve an element from a Set (in our case - when we go encounter a Car that already exists in the Set - we would have liked to retrieve that Car from the Set based on its identifying hashCode and add PurchaseOffers to it).

I can use a HashMap where each Car's hashCode maps to the actual Car object, but it probably isn't the school-book solution since it is unsafe - I would have to make sure myself that every hashCode maps to a Car with that hashCode - there could be inconsistency.
Of course, can make a designated data structure that guarantees this consistency - Shouldn't one already exist ?

Can anyone suggest the data-structure I am after, or point out a design mistake ?
Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

夜无邪 2024-11-06 13:21:40

由于这是多对多关系,因此您需要双向多映射。 Car 是第一个的键,以 PurchaseOrder 列表作为值。 PurchaseOrder 是第二个订单的键,其值为汽车列表。

底层实现是两个HashMap。

在其之上放置一个 API 以获得您需要的行为。或者查看 Google 收藏集是否可以为您提供帮助。它是一个 BiMap 和两个 MultiMap 的组合。

Since this is a many-to-many relationship, you need a bi-directional multi-map. Car is the key for the first one, with a List of PurchaseOrder as the value. The PurchaseOrder is the key for the second one, with a List of Cars as the value.

The underlying implementation is two HashMaps.

Put an API on top of it to get the behavior you need. Or see if Google Collections can help you. It's a combination of a BiMap and two MultiMaps.

春风十里 2024-11-06 13:21:40

我认为你确实需要(至少)一个 HashMap> ...正如 @Andreas_D 所建议的,

你反对每个 Car 已经有一个 List 不是重点。 HashMap 中的列表是聚合列表,包含所有 Car 对象中的所有 PurchaseOffer 对象,这些对象代表同一辆实体车。

创建新列表的目的是避免更改原始 Car 对象上的原始列表。 (如果这不是问题,那么您可以从代表一辆实体汽车的集合中选择一个 Car 实例,并将其他对象中的 PurchaseOffer 对象合并到该列表中。)

我不完全确定为什么 @duffymo 建议在两者之间建立双向映射,但我认为这是因为来自不同来源的不同 Car 对象可能具有相同的互补(或矛盾)信息实体车。通过保留所有实例,您可以避免丢弃信息。 (再一次,如果您愿意放弃变异和/或放弃信息,您可以尝试将每辆汽车的信息合并到一个 Car 对象中。


如果您真的不关心保留信息并准备随意合并内容,那么以下方法可能会起作用:

  HashMap<Car, Car> map = new HashMap<Car, Car>(...);
  for (Car car : carsToBeAggregated) {
      Car master = nap.get(car);
      if (master == null) {
          map.put(car, car);
      } else {
          master.offers.addAll(car.offers);
          // optionally, merge other Car information from car to master
      }
  }

您不应该尝试使用 Car.hashCode() 作为任何内容的键。哈希码值不是唯一标识符。 :很可能两辆不同的汽车最终会具有相同的哈希码值,如果您尝试将它们当作唯一标识符来使用,您会遇到麻烦......

I think that you really do need (at least) a HashMap<Car, List<PurchaseOffer>> ... as suggested by @Andreas_D

Your objection that each Car already has a List<PurchaseOffer> is beside the point. The list in the HashMap is the aggregate list, containing all PurchaseOffer objects from all Car objects that stand for the same physical car.

The point of creating a new list is to avoid changing the original lists on the original Car objects. (If that was not a concern, then you could pick one instance of Car from the set that represent a physical car, and merge the PurchaseOffer objects from the others into that list.)

I'm not entirely sure why @duffymo suggested a bi-directional map between, but I think it is because the different Car objects from different sources may have complementary (or contradictory) information for the same physical car. By keeping all instances, you avoid discarding information. (Once again, if you are happy to discard mutate and/or discard information, you could attempt to merge the information about each individual car into a single Car object.


If you really didn't care about preserving information and were prepared to merge stuff willy-nilly then the following approach would probably work:

  HashMap<Car, Car> map = new HashMap<Car, Car>(...);
  for (Car car : carsToBeAggregated) {
      Car master = nap.get(car);
      if (master == null) {
          map.put(car, car);
      } else {
          master.offers.addAll(car.offers);
          // optionally, merge other Car information from car to master
      }
  }

You should NOT be trying to use the Car.hashCode() as a key for anything. Hashcode values are not unique identifiers: there is a distinct possibility that two different cars will end up with the same hashcode value. If you attempt to use them as if they were unique identifiers you'll get into trouble ...

春庭雪 2024-11-06 13:21:40

基本数据结构应该是HashMap>。这允许存储和接收一辆选定汽车的所有报价。

现在您可能必须为 Car.equals() 找到合适的实现,以确保来自不同来源的“汽车”确实相同。将 equals() 基于真实世界汽车的唯一标识符 (VIN) 怎么样?

The basic datastructure should be a HashMap<Car, List<PurchaseOffer>>. This allows for storing and receiving all offers for one selected car.

Now you may have to find a suitable implementation for Car.equals() to assure, that "cars" coming from different source are really the same. What about basing equals() on a unique identifier for a real world car (VIN)?

我要还你自由 2024-11-06 13:21:40

我更喜欢使用 HashMap>,如之前建议的(Andreas、Stephen),主要是在 Car 对象保存列表的情况下购买优惠。
否则,我会考虑使用 HashMap ,或者更好的 IMO,如果每辆车都有唯一的 ID,则使用 HashMap

正如问题中提到的,它不能简单地将汽车的 hashCode 映射到汽车,因为不同的汽车可以具有相同的 hashCode!

(无论如何,我会创建一个自己的类来存储和管理汽车。这将包含 HashMap 或任何一个 - 因此很容易更改实现而不需要更改其接口)

I would prefer to use a HashMap<Car, List<PurchaseOffer>>, as suggested before (Andreas, Stephen), mainly if the Car object does not hold the list of PurchaseOffers.
Otherwise I would consider using a HashMap<Car, Car> or, better IMO, a HashMap<ID, Car> if there is an unique ID for each Car.

It can not simply map the Car's hashCode to the Car, as mentioned in the question, since distinct Cars can have the same hashCode!

(Anyway, I would create an own class for storing and managing the Cars. This would contain the HashMap, or whichever - so it's easy to change the implementation without needing to change its interface)

殊姿 2024-11-06 13:21:40

创建扩展哈希的自定义类
Set,
重写方法包含(Object o)
检查操作系统哈希码是否相同并根据返回结果,并将对象添加到集合中,并且仅当它不包含该对象时

create tout custom class that extends hash
Set,
override method contains(Object o)
check there os hash code is same or not and return result according, and add object to set of and only if it not containing that object

樱娆 2024-11-06 13:21:40

定义一个新的自定义聚合类怎么样?定义哈希码,使汽车的 id 作为键并相应地覆盖 equals()。定义一个自定义方法来接受您的原始汽车并在列表上执行联合操作。最后将自定义对象存储在 HashSet 中以实现恒定时间查找。

用纯粹的术语来说,聚合是一种超出单个对象范围的行为。访问者模式试图解决类似的问题。

或者,如果您有一个 sql 数据存储,则使用 group by 进行简单的选择即可解决问题。

How about a defining a new custom Aggregation class? Define the hashcode such that the id of the car acts as the key and override the equals() accordingly. Define a custom method for accepting your original car and do a union operation on the lists. Finally store the custom objects in a HashSet for achieving constant time look up.

In purist terms, aggregation is a behavior beyond the scope of a single object. Visitor pattern tries to address a similar problem.

Alternatively if you have a sql datastore, a simple select using group by would do the trick.

梦途 2024-11-06 13:21:40

好吧,是的,如果不是因为以下事实,HashMap> 将是完美的
每个 Car 都包含一个 List 作为属性。可以说一个 Car 对象是由
由两部分组成:识别部分(假设每辆车确实有一个唯一的 VIN)和列表
购买优惠

在本例中,将 Car 类分为两个类 - 具有标识属性的 CarType 类,然后是列表部分(可能两者一起由 Car 使用)。然后使用 Map 作为您的数据结构(或 MultiMap)。

Welp, yeah, HashMap<Car, List<PurchaseOffer>> would be perfect if it wasn't for the fact that
each Car contains a List<PurchaseOffer> as a property. Can say that a Car object is composed
of two parts: an identifying part (let's say each car indeed has a unique VIN), and the list of
PurchaseOffers.

In this case split the Car class in two classes - the CarType class with the identifying attributes, and then the list part (maybe both together used by Car). Then use Map<CarType, Lost<PurchaseOffer> for your datastructure (or MultiMap<CarType, PurchaseOffer>).

糖粟与秋泊 2024-11-06 13:21:40
    //alt. 1
    List<Offer> offers;
    List<Car> cars;
    Map<Car, List<Offer>> mapCarToOffers;
    Map<Offer, List<Car>> mapOfferToCars;
    public void List<Offer> getOffersForCar(Car aCar);
    public void List<Car> getCarsForOffer(Offer anOffer);

替代方案 1 将利用 CarOfferhashCode()

    //alt. 2
    List<Offer> offers;
    List<Car> cars;
    Map<Integer, List<Offer>> mapCarIdToOffers;
    Map<Integer, List<Car>> mapOfferIdToCars;
    public void List<Offer> getOffersForCarId(int aCarId);
    public void List<Car> getCarsForOfferId(int anOfferId);

替代方案 2 将利用 IntegerhashCode()。这将减轻您对“安全”的担忧,因为 Integer 对象的哈希码不应在值唯一的情况下重叠。这会产生额外的开销,因为必须为每个 CarOffer 对象维护唯一的 ID,但是,我猜测您可能已经根据业务需求拥有了这些 ID。
请注意,您可以选择使用其他类来替代 ID 的 int(例如 String)。

对于这两种替代方案,请使用 ArrayListLinkedList 实现 List - 哪一个更好由您根据其他要求来确定,例如作为插入/删除频率与查找频率。使用 HashMap 实现 Map - 请参阅上面有关如何使用哈希码的注释。


作为旁注,在我们的软件中,我们使用上述两者来表示类似类型的多对多数据。与您的用例非常相似。
两种选择都效果很好。

    //alt. 1
    List<Offer> offers;
    List<Car> cars;
    Map<Car, List<Offer>> mapCarToOffers;
    Map<Offer, List<Car>> mapOfferToCars;
    public void List<Offer> getOffersForCar(Car aCar);
    public void List<Car> getCarsForOffer(Offer anOffer);

Alternative 1 would make use of the hashCode() of Car and Offer

    //alt. 2
    List<Offer> offers;
    List<Car> cars;
    Map<Integer, List<Offer>> mapCarIdToOffers;
    Map<Integer, List<Car>> mapOfferIdToCars;
    public void List<Offer> getOffersForCarId(int aCarId);
    public void List<Car> getCarsForOfferId(int anOfferId);

Alternative 2 would make use of the hashCode() of Integer. This would allay your concerns about "safety" as the hash codes for Integer objects should not overlap where the values are unique. This incurs the additional overhead of having to maintain unique IDs for each Car and Offer object, however, I am guessing that you probably already have those from your business requirements.
Note, you may choose to use other classes as alternative to ints for ID's (e.g. String).

For both alternatives, implement the Lists with ArrayList or LinkedList - which one is better is up to you to determine based on other requirements, such as the frequency of insertion/deletion vs lookup. Implement the Maps with HashMap - see comments above about how hash codes are used.


As a side note, in our software, we use these both of the above to represent similar types of many-to-many data. Very similar to your use case.
Both alternatives work very well.

短暂陪伴 2024-11-06 13:21:40

为什么不使用对象数据库呢?您可以存储您想要的任何对象图,并且您将获得一个搜索 API,您可以使用它执行您想要的任何关系/检索机制。一个简单的集合可以工作,但听起来您想要比集合提供的更复杂的关系。查看 db4o (http://db4o.com) - 它对于此类事情非常强大。

Why not use an object database for this? You could store any object graph you wanted, and you'd get a search API with which you could do any relationship/retrieval mechanism you wanted. A simple collection could work, but it sounds like you want a more complex relationship than a collection would provide. Look into db4o (http://db4o.com) - it's very powerful for this sort of thing.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文