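Ruby 2.7: How to merge a hash of arrays of hashes and eliminate duplicates based on one key:value pair
I'm trying to complete a project-based assessment for a job interview, and they only offer it in Ruby on Rails, which I know little to nothing about. I'm trying to take one hash that contains two or more arrays of hashes and combine those arrays into a single array of hashes, while eliminating duplicate hashes based on the "id":value pair.
So I'm trying to take this: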
h = {
'first' =>
[
{ 'authorId' => 12, 'id' => 2, 'likes' => 469 },
{ 'authorId' => 5, 'id' => 8, 'likes' => 735 },
{ 'authorId' => 8, 'id' => 10, 'likes' => 853 }
],
'second' =>
[
{ 'authorId' => 9, 'id' => 1, 'likes' => 960 },
{ 'authorId' => 12, 'id' => 2, 'likes' => 469 },
{ 'authorId' => 8, 'id' => 4, 'likes' => 728 }
]
}
and turn it into this:
[
{ 'authorId' => 12, 'id' => 2, 'likes' => 469 },
{ 'authorId' => 5, 'id' => 8, 'likes' => 735 },
{ 'authorId' => 8, 'id' => 10, 'likes' => 853 },
{ 'authorId' => 9, 'id' => 1, 'likes' => 960 },
{ 'authorId' => 8, 'id' => 4, 'likes' => 728 }
]
Answers (2)
Ruby has many ways to achieve this.
My first instinct was to group the hashes by id and pick only the first item from each group. A much cleaner approach is to pick the distinct items based on id after flattening the array of hashes, which is what Cary Swoveland suggested in the comments.
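Both ideas can be sketched against the h from the question. This is only an illustration: the names posts, by_group and by_uniq are made up here, and the sketch assumes every inner hash carries an 'id' key.

```ruby
# Sample data copied from the question.
h = {
  'first' => [
    { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
    { 'authorId' => 5, 'id' => 8, 'likes' => 735 },
    { 'authorId' => 8, 'id' => 10, 'likes' => 853 }
  ],
  'second' => [
    { 'authorId' => 9, 'id' => 1, 'likes' => 960 },
    { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
    { 'authorId' => 8, 'id' => 4, 'likes' => 728 }
  ]
}

# Combine all inner arrays into one flat array of hashes.
posts = h.values.flatten

# First instinct: group by id, then keep the first hash of each group.
by_group = posts.group_by { |post| post['id'] }.values.map(&:first)

# Cleaner: let uniq compare elements by id through its block.
by_uniq = posts.uniq { |post| post['id'] }

puts by_uniq.map { |post| post['id'] }.inspect # prints [2, 8, 10, 1, 4]
```

Both variants keep the first hash seen for each id and preserve the order in which ids first appear, so they return the same array here.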
TL;DR
The simplest solution to the problem that fits the data you posted is h.values.flatten.uniq. You can stop reading here unless you want to understand why you don't need to care about duplicate IDs with this particular data set, or when you might need to care and why that's often less straightforward than it seems.
Near the end I also mention some features of Rails that address edge cases that you don't need for this specific data. However, they might help with other use cases.
Skip ID-Specific Deduplication; Focus on Removing Duplicate Hashes Instead
First of all, you have no duplicate id keys that aren't also part of duplicate Hash objects. Despite the fact that Ruby implementations preserve entry order of Hash objects, a Hash is conceptually unordered. Pragmatically, that means two Hash objects with the same keys and values (even if they are in a different insertion order) are still considered equal. So, perhaps unintuitively: given your example input, you don't actually have to worry about unique IDs for this exercise. You just need to eliminate duplicate Hash objects from your merged Array, and you have only one of those.
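A small sketch of that equality rule, using one record from the question written in two different insertion orders (the variable names a and b are illustrative only):

```ruby
# Same key/value pairs, different insertion order.
a = { 'authorId' => 12, 'id' => 2, 'likes' => 469 }
b = { 'likes' => 469, 'id' => 2, 'authorId' => 12 }

puts(a == b)           # prints true
puts(a.eql?(b))        # prints true
puts(a.hash == b.hash) # prints true

# Array#uniq compares elements with hash and eql?, so the two
# hashes collapse into a single element.
puts([a, b].uniq.size) # prints 1
```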
As you can see, 'id' => 2 is the only ID found in both Hash values, albeit in identical Hash objects. Since you have only one duplicate Hash, the problem has been reduced to flattening the Array of Hash values stored in h so that you can remove any duplicate Hash elements (not duplicate IDs) from the combined Array.
Solution to the Posted Problem
There might be use cases where you need to handle the uniqueness of Hash keys, but this is not one of them. Unless you want to sort your result by some key, all you really need is:
h.values.flatten.uniq
Since you aren't being asked to sort the Hash objects in your consolidated Array, you can skip the extra method call, which (in this case, anyway) isn't needed.
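As a runnable check against the data from the question, the sketch below compares the result with the desired output posted there. The names result, expected and sorted are illustrative, and the sort_by line is only the optional step mentioned above, not something the exercise asks for.

```ruby
# Input copied from the question.
h = {
  'first' => [
    { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
    { 'authorId' => 5, 'id' => 8, 'likes' => 735 },
    { 'authorId' => 8, 'id' => 10, 'likes' => 853 }
  ],
  'second' => [
    { 'authorId' => 9, 'id' => 1, 'likes' => 960 },
    { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
    { 'authorId' => 8, 'id' => 4, 'likes' => 728 }
  ]
}

# Desired output copied from the question.
expected = [
  { 'authorId' => 12, 'id' => 2, 'likes' => 469 },
  { 'authorId' => 5, 'id' => 8, 'likes' => 735 },
  { 'authorId' => 8, 'id' => 10, 'likes' => 853 },
  { 'authorId' => 9, 'id' => 1, 'likes' => 960 },
  { 'authorId' => 8, 'id' => 4, 'likes' => 728 }
]

# Flatten the arrays into one, then drop hashes that are equal as a whole.
result = h.values.flatten.uniq
puts(result == expected) # prints true

# Optional: order by id if a sorted result is ever wanted.
sorted = result.sort_by { |post| post['id'] }
puts sorted.map { |post| post['id'] }.inspect # prints [1, 2, 4, 8, 10]
```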
"Uniqueness" Can Be Tricky Absent Additional Context
The only reason to look at your id keys at all would be if you had duplicate IDs in multiple unique Hash objects, and if that were the case you'd then have to worry about which Hash was the correct one to keep. For example, given several records that share an id but differ in their other values, which one of them is the "duplicate"? Without other data such as a timestamp, simply chaining uniq { |h| h['id'] } or merging the Hash objects will net you the first or the last record, respectively.
Leveraging Context Like Rails-Specific Timestamp Features
While the uniqueness problem described above may seem out of scope for the question you're currently being asked, understanding the limitations of any kind of data transformation is useful. In addition, knowing that Ruby on Rails supports ActiveRecord::Timestamp and the creation and management of timestamp-related columns within database migrations may be highly relevant in a broader sense.
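To make the first-versus-last behaviour concrete, here is a plain-Ruby sketch with invented records that share an id. Everything in it is hypothetical: the question's data has no timestamps, and the 'updated_at' key merely mirrors the column name Rails conventionally maintains. No Rails code is involved.

```ruby
# Hypothetical records: same id, different content.
# 'updated_at' is an assumed key used only for illustration.
records = [
  { 'id' => 2, 'likes' => 469, 'updated_at' => Time.utc(2022, 1, 1) },
  { 'id' => 2, 'likes' => 500, 'updated_at' => Time.utc(2022, 3, 1) },
  { 'id' => 2, 'likes' => 480, 'updated_at' => Time.utc(2022, 2, 1) }
]

# uniq with a block keeps the first record seen for each id.
first_seen = records.uniq { |r| r['id'] }

# Building a Hash keyed by id keeps the last record seen for each id,
# because later entries overwrite earlier ones.
last_seen = records.to_h { |r| [r['id'], r] }.values

# With a timestamp the choice can be deliberate: keep the newest per id.
newest = records.group_by { |r| r['id'] }
                .values
                .map { |group| group.max_by { |r| r['updated_at'] } }

puts first_seen.first['likes'] # prints 469
puts last_seen.first['likes']  # prints 480
puts newest.first['likes']     # prints 500
```

Which record counts as the right one to keep is a decision about the data, so the timestamp only helps if "most recent wins" really is the rule you want.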
You don't need to know these things to answer the original question. However, knowing when a given solution fits a specific use case and when it doesn't is important too.