主力:删除键值数组中的重复项
我有一对大小相等的数组,我将它们称为键和值。
例如:
K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67
键已排序,与每个键关联的值是 已排序。如何删除与每个键关联的重复项值 及其对应的键?
也就是说,我想将上面的内容压缩为:
1: 99
1: 100
1: 103
2: 103 <-- This should remain, since key is different
2: 105
3: 45
3: 67
我查看了 Thrust 中可用的流压缩函数,但是 找不到任何可以做到这一点的东西。这可能吗? 推力?或者我是否需要编写自己的内核来标记重复项 模板然后将其移除?
I have a pair of arrays of equal size, I will call them keys and values.
For example:
K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67
The keys are sorted and the values associated with each key are
sorted. How do I remove the value duplicates associated with each key
and its corresponding key?
That is, I want to compact the above to:
1: 99
1: 100
1: 103
2: 103 <-- This should remain, since key is different
2: 105
3: 45
3: 67
I looked at the stream compaction functions available in Thrust, but
was not able to find anything which does this. Is this possible with
Thrust? Or do I need to write my own kernel to mark the duplicates in
a stencil and then remove them?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
键已排序,与每个键关联的值也已排序。因此,我们可以认为键值对已排序。如果
thrust::unique
可以将这两个向量视为单个向量,则它将直接处理此问题。这可以通过使用 zip_iterator 将每个位置的 2 个项目(键值)压缩到单个元组中来实现。以下是如何就地实现此目的,并将键值向量修剪为仅唯一元素:
如果要压缩并生成单独的结果流,则需要为您的类型编写自己的二进制谓词,该谓词同时查看元组的元素。
thrust::zip_iterator
可用于从单独的数组形成虚拟元组迭代器。一个完整的工作示例如下所示:
The keys are sorted and the values associated with each key are also sorted. Thus, we can consider that the key-value pairs are sorted.
thrust::unique
will work directly on this if it can see these 2 vectors as a single vector. This can be achieved by zipping up the 2 items (key-value) at each position into a single tuple usingzip_iterator
.Here is how to achieve this in-place and also trim the key-value vectors to only the unique elements:
If you want to compact and produce a separate result stream, you need to write your own binary predicate for your type which looks at both elements of the tuple. The
thrust::zip_iterator
can be used to form a virtual tuple iterator from separate arrays.A complete working example of how you might do it looks like this:
只需做一点准备就可以进行流压缩。您基本上可以为每个键值对启动一个线程,检查前一个键值对是否相等,如果不相等:在与这些键值对大小相同的单独数组中设置一个标志(int = 1)。所有其他标志保持未设置状态 (int = 0)。然后根据flag数组对键值对进行流式压缩。
Stream compaction with a little bit of preparation will do. You can basically launch a thread for each key-value pair, check if the previous key-value pair equals, if not: set a flag (int = 1) in a separate array of the same size as those pairs. All other flags remain unset (int = 0). Then do stream compaction of the key-value pairs based on the flag array.