Ruby 2.7: How to merge arrays of hashes into one and eliminate duplicates based on a key:value pair


I'm trying to complete a project-based assessment for a job interview, and they only offer it in Ruby on Rails, which I know little to nothing about. I'm trying to take one hash that contains two or more hashes of arrays and combine the arrays into one array of hashes, while eliminating duplicate hashes based on an "id":value pair.

So I'm trying to take this:

h = {
  'first' =>
      [
        { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
        { 'authorId' => 5, 'id' => 8, 'likes' => 735 },
        { 'authorId' => 8, 'id' => 10, 'likes' => 853 }
      ],
  'second' =>
      [
        { 'authorId' => 9, 'id' => 1, 'likes' => 960 },
        { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
        { 'authorId' => 8, 'id' => 4, 'likes' => 728 }
      ]
}

And turn it into this:

[
  { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
  { 'authorId' => 5, 'id' => 8, 'likes' => 735 },
  { 'authorId' => 8, 'id' => 10, 'likes' => 853 },
  { 'authorId' => 9, 'id' => 1, 'likes' => 960 },
  { 'authorId' => 8, 'id' => 4, 'likes' => 728 }
]

2 Answers

两相知 2025-02-09 00:41:29


Ruby has many ways to achieve this.

My first instinct is to group them by id and pick only the first item from each group.

h.values.flatten.group_by { |x| x['id'] }.map { |_id, posts| posts.first }

A much cleaner approach is to pick distinct items based on id after flattening the arrays of hashes, which is what Cary Swoveland suggested in the comments:

h.values.flatten.uniq { |post| post['id'] } # keeps the first hash seen for each id
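
For the h posted in the question, both expressions return the same five hashes; a quick sanity check (a sketch, assuming Ruby 2.7 and the h defined above):

h.values.flatten.uniq { |post| post['id'] }.map { |post| post['id'] }
#=> [2, 8, 10, 1, 4]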
自演自醉 2025-02-09 00:41:29


TL;DR

The simplest solution to the problem that fits the data you posted is h.values.flatten.uniq. You can stop reading here unless you want to understand why you don't need to care about duplicate IDs with this particular data set, or when you might need to care and why that's often less straightforward than it seems.

Near the end I also mention some features of Rails that address edge cases that you don't need for this specific data. However, they might help with other use cases.

Skip ID-Specific Deduplication; Focus on Removing Duplicate Hashes Instead

First of all, you have no duplicate id keys that aren't also part of duplicate Hash objects. Despite the fact that Ruby implementations preserve entry order of Hash objects, a Hash is conceptually unordered. Pragmatically, that means two Hash objects with the same keys and values (even if they are in a different insertion order) are still considered equal. So, perhaps unintuitively:

{'authorId' => 12, 'id' => 2, 'likes' => 469} ==
  {'id' => 2, 'likes' => 469, 'authorId' => 12}
#=> true

Given your example input, you don't actually have to worry about unique IDs for this exercise. You just need to eliminate duplicate Hash objects from your merged Array, and you have only one of those.

duplicate_ids =
  h.values.flatten.group_by { _1['id'] } # group every post by its 'id'
    .reject { _2.one? }.keys             # keep only ids that occur more than once
#=> [2]

unique_hashes_with_duplicate_ids =
  h.values.flatten.group_by { _1['id'] }
    .reject { _2.uniq.one? }.count       # ids whose duplicates are *distinct* hashes
#=> 0

As you can see, 'id' => 2 is the only ID found in both Hash values, albeit in identical Hash objects. Since you have only one duplicate Hash, the problem has been reduced to flattening the Array of Hash values stored in h so that you can remove any duplicate Hash elements (not duplicate IDs) from the combined Array.

Solution to the Posted Problem

There might be use cases where you need to handle uniqueness based on a specific key, but this is not one of them. Unless you want to sort your result by some key, all you really need is:

h.values.flatten.uniq

Since you aren't being asked to sort the Hash objects in your consolidated Array, you can avoid the need for another method call that (in this case, anyway) is a no-op.
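
If a sorted result were required (a hypothetical requirement, not part of the posted question), chaining sort_by would do it:

h.values.flatten.uniq.sort_by { |post| post['id'] }
#=> the same five hashes, ordered by id: 1, 2, 4, 8, 10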

"Uniqueness" Can Be Tricky Absent Additional Context

The only reason to look at your id keys at all would be if you had duplicate IDs in multiple unique Hash objects, and if that were the case you'd then have to worry about which Hash was the correct one to keep. For example, given:

[ {'id' => 1, 'authorId' => 9, 'likes' => 1_920},
  {'id' => 1, 'authorId' => 9, 'likes' => 960} ]

which one of these records is the "duplicate" one? Without other data such as a timestamp, simply chaining uniq { _1['id'] } or merging the Hash objects will net you either the first or the last record, respectively. Consider:

[
  {'id' => 1, 'authorId' => 9, 'likes' => 1_920},
  {'id' => 1, 'authorId' => 9, 'likes' => 960}
].uniq { _1['id'] }
#=> [{"id"=>1, "authorId"=>9, "likes"=>1920}]

[
  {'id' => 1, 'authorId' => 9, 'likes' => 1_920},
  {'id' => 1, 'authorId' => 9, 'likes' => 960}
].reduce({}, :merge)
#=> {"id"=>1, "authorId"=>9, "likes"=>960}

Leveraging Context Like Rails-Specific Timestamp Features

While the uniqueness problem described above may seem out of scope for the question you're currently being asked, understanding the limitations of any kind of data transformation is useful. In addition, knowing that Ruby on Rails supports ActiveRecord::Timestamp and the creation and management of timestamp-related columns within database migrations may be highly relevant in a broader sense.

You don't need to know these things to answer the original question. However, knowing when a given solution fits a specific use case and when it doesn't is important too.
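
For reference, the timestamp columns mentioned above are what a standard Rails migration creates with t.timestamps (the posts table and its columns here are illustrative, not taken from the question):

class CreatePosts < ActiveRecord::Migration[6.1]
  def change
    create_table :posts do |t|
      t.integer :author_id
      t.integer :likes
      t.timestamps # adds created_at and updated_at, maintained by ActiveRecord::Timestamp
    end
  end
end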
