主力:删除键值数组中的重复项

发布于 2024-10-28 10:55:17 字数 488 浏览 5 评论 0原文

我有一对大小相等的数组,我将它们称为键和值。

例如:

K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67

键已排序,与每个键关联的值是 已排序。如何删除与每个键关联的重复项值 及其对应的键?

也就是说,我想将上面的内容压缩为:

1: 99
1: 100
1: 103
2: 103 <-- This should remain, since key is different
2: 105
3: 45
3: 67

我查看了 Thrust 中可用的流压缩函数,但是 找不到任何可以做到这一点的东西。这可能吗? 推力?或者我是否需要编写自己的内核来标记重复项 模板然后将其移除?

I have a pair of arrays of equal size, I will call them keys and values.

For example:

K: V
1: 99
1: 100
1: 100
1: 100
1: 103
2: 103
2: 105
3: 45
3: 67

The keys are sorted and the values associated with each key are
sorted. How do I remove the value duplicates associated with each key
and its corresponding key?

That is, I want to compact the above to:

1: 99
1: 100
1: 103
2: 103 <-- This should remain, since key is different
2: 105
3: 45
3: 67

I looked at the stream compaction functions available in Thrust, but
was not able to find anything which does this. Is this possible with
Thrust? Or do I need to write my own kernel to mark the duplicates in
a stencil and then remove them?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

明月夜 2024-11-04 10:55:17

键已排序,与每个键关联的值也已排序。因此,我们可以认为键值对已排序。如果 thrust::unique 可以将这两个向量视为单个向量,则它将直接处理此问题。这可以通过使用 zip_iterator 将每个位置的 2 个项目(键值)压缩到单个元组中来实现。

以下是如何就地实现此目的,并将键值向量修剪为仅唯一元素:

typedef thrust::device_vector< int >                IntVector;
typedef IntVector::iterator                         IntIterator;
typedef thrust::tuple< IntIterator, IntIterator >   IntIteratorTuple;
typedef thrust::zip_iterator< IntIteratorTuple >    ZipIterator;

IntVector keyVector;
IntVector valVector;

ZipIterator newEnd = thrust::unique( thrust::make_zip_iterator( thrust::make_tuple( keyVector.begin(), valVector.begin() ) ),
                                     thrust::make_zip_iterator( thrust::make_tuple( keyVector.end(), valVector.end() ) ) );

IntIteratorTuple endTuple = newEnd.get_iterator_tuple();

keyVector.erase( thrust::get<0>( endTuple ), keyVector.end() );
valVector.erase( thrust::get<1>( endTuple ), valVector.end() );

如果要压缩并生成单独的结果流,则需要为您的类型编写自己的二进制谓词,该谓词同时查看元组的元素。 thrust::zip_iterator 可用于从单独的数组形成虚拟元组迭代器。

一个完整的工作示例如下所示:

#include <iostream>
#include <thrust/tuple.h>
#include <thrust/functional.h>
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/unique.h>

// Binary predicate for a tuple pair
typedef thrust::tuple<int, int> tuple_t;
struct tupleEqual
{
  __host__ __device__
    bool operator()(tuple_t x, tuple_t y)
    {
      return ( (x.get<0>()== y.get<0>()) && (x.get<1>() == y.get<1>()) );
    }
};

typedef thrust::device_vector<int>::iterator  intIterator;
typedef thrust::tuple<intIterator, intIterator> intIteratorTuple;
typedef thrust::zip_iterator<intIteratorTuple> zipIterator;
typedef thrust::device_vector<tuple_t>::iterator tupleIterator;

int main(void)
{
  thrust::device_vector<int> k(9), v(9);
  thrust::device_vector<tuple_t> kvcopy(9);

  k[0] = 1; k[1] = 1; k[2] = 1; 
  k[3] = 1; k[4] = 1; k[5] = 2;
  k[6] = 2; k[7] = 3; k[8] = 3;

  v[0] = 99;  v[1] = 100; v[2] = 100;
  v[3] = 100; v[4] = 103; v[5] = 103;
  v[6] = 105; v[7] = 45;  v[8] = 67;

  zipIterator kvBegin(thrust::make_tuple(k.begin(),v.begin()));
  zipIterator kvEnd(thrust::make_tuple(k.end(),v.end()));
  thrust::copy(kvBegin, kvEnd, kvcopy.begin());

  tupleIterator kvend = 
        thrust::unique(kvcopy.begin(), kvcopy.end(), tupleEqual());

  for(tupleIterator kvi = kvcopy.begin(); kvi != kvend; kvi++) {
    tuple_t r = *kvi;
    std::cout << r.get<0>() << "," << r.get<1>() << std::endl;
  }

  return 0;
}

The keys are sorted and the values associated with each key are also sorted. Thus, we can consider that the key-value pairs are sorted. thrust::unique will work directly on this if it can see these 2 vectors as a single vector. This can be achieved by zipping up the 2 items (key-value) at each position into a single tuple using zip_iterator.

Here is how to achieve this in-place and also trim the key-value vectors to only the unique elements:

typedef thrust::device_vector< int >                IntVector;
typedef IntVector::iterator                         IntIterator;
typedef thrust::tuple< IntIterator, IntIterator >   IntIteratorTuple;
typedef thrust::zip_iterator< IntIteratorTuple >    ZipIterator;

IntVector keyVector;
IntVector valVector;

ZipIterator newEnd = thrust::unique( thrust::make_zip_iterator( thrust::make_tuple( keyVector.begin(), valVector.begin() ) ),
                                     thrust::make_zip_iterator( thrust::make_tuple( keyVector.end(), valVector.end() ) ) );

IntIteratorTuple endTuple = newEnd.get_iterator_tuple();

keyVector.erase( thrust::get<0>( endTuple ), keyVector.end() );
valVector.erase( thrust::get<1>( endTuple ), valVector.end() );

If you want to compact and produce a separate result stream, you need to write your own binary predicate for your type which looks at both elements of the tuple. The thrust::zip_iterator can be used to form a virtual tuple iterator from separate arrays.

A complete working example of how you might do it looks like this:

#include <iostream>
#include <thrust/tuple.h>
#include <thrust/functional.h>
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/unique.h>

// Binary predicate for a tuple pair
typedef thrust::tuple<int, int> tuple_t;
struct tupleEqual
{
  __host__ __device__
    bool operator()(tuple_t x, tuple_t y)
    {
      return ( (x.get<0>()== y.get<0>()) && (x.get<1>() == y.get<1>()) );
    }
};

typedef thrust::device_vector<int>::iterator  intIterator;
typedef thrust::tuple<intIterator, intIterator> intIteratorTuple;
typedef thrust::zip_iterator<intIteratorTuple> zipIterator;
typedef thrust::device_vector<tuple_t>::iterator tupleIterator;

int main(void)
{
  thrust::device_vector<int> k(9), v(9);
  thrust::device_vector<tuple_t> kvcopy(9);

  k[0] = 1; k[1] = 1; k[2] = 1; 
  k[3] = 1; k[4] = 1; k[5] = 2;
  k[6] = 2; k[7] = 3; k[8] = 3;

  v[0] = 99;  v[1] = 100; v[2] = 100;
  v[3] = 100; v[4] = 103; v[5] = 103;
  v[6] = 105; v[7] = 45;  v[8] = 67;

  zipIterator kvBegin(thrust::make_tuple(k.begin(),v.begin()));
  zipIterator kvEnd(thrust::make_tuple(k.end(),v.end()));
  thrust::copy(kvBegin, kvEnd, kvcopy.begin());

  tupleIterator kvend = 
        thrust::unique(kvcopy.begin(), kvcopy.end(), tupleEqual());

  for(tupleIterator kvi = kvcopy.begin(); kvi != kvend; kvi++) {
    tuple_t r = *kvi;
    std::cout << r.get<0>() << "," << r.get<1>() << std::endl;
  }

  return 0;
}
可爱暴击 2024-11-04 10:55:17

只需做一点准备就可以进行流压缩。您基本上可以为每个键值对启动一个线程,检查前一个键值对是否相等,如果不相等:在与这些键值对大小相同的单独数组中设置一个标志(int = 1)。所有其他标志保持未设置状态 (int = 0)。然后根据flag数组对键值对进行流式压缩。

Stream compaction with a little bit of preparation will do. You can basically launch a thread for each key-value pair, check if the previous key-value pair equals, if not: set a flag (int = 1) in a separate array of the same size as those pairs. All other flags remain unset (int = 0). Then do stream compaction of the key-value pairs based on the flag array.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文