就地联合排序向量

发布于 2024-09-16 19:02:02 字数 524 浏览 10 评论 0原文

我想要一种有效的方法来将排序向量与另一个排序向量进行就地并集。通过就地，我的意思是该算法不应该创建一个全新的向量或其他存储来存储并集，即使是暂时的。相反，第一个向量应该简单地按新元素的数量增长。

类似于：

void inplace_union(vector & A, const vector & B);

之后，A包含A联合B的所有元素并且被排序。

中的 std::set_union 不起作用，因为它覆盖了其目的地，即 A。

另外，这可以通过一次传递两个向量来完成吗？

编辑：同时 A 和 B 中的元素只能在 A 中出现一次。

原文

I'd like an efficient method for doing the inplace union of a sorted vector with another sorted vector. By inplace, I mean that the algorithm shouldn't create a whole new vector or other storage to store the union, even temporarily. Instead, the first vector should simple grow by exactly the number of new elements.

Something like:

void inplace_union(vector & A, const vector & B);

Where, afterwards, A contains all of the elements of A union B and is sorted.

std::set_union in <algorithm> wont work because it overwrites its destination, which would be A.

Also, can this be done with just one pass over the two vectors?

Edit: elements that are in both A and B should only appear once in A.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

旧情勿念 2024-09-23 19:02:02

我相信你可以使用算法std::inplace_merge。这是示例代码：

void inplace_union(std::vector<int>& a, const std::vector<int>& b)
{
    int mid = a.size(); //Store the end of first sorted range

    //First copy the second sorted range into the destination vector
    std::copy(b.begin(), b.end(), std::back_inserter(a));

    //Then perform the in place merge on the two sub-sorted ranges.
    std::inplace_merge(a.begin(), a.begin() + mid, a.end());

    //Remove duplicate elements from the sorted vector
    a.erase(std::unique(a.begin(), a.end()), a.end());
}

I believe you can use the algorithm std::inplace_merge. Here is the sample code:

void inplace_union(std::vector<int>& a, const std::vector<int>& b)
{
    int mid = a.size(); //Store the end of first sorted range

    //First copy the second sorted range into the destination vector
    std::copy(b.begin(), b.end(), std::back_inserter(a));

    //Then perform the in place merge on the two sub-sorted ranges.
    std::inplace_merge(a.begin(), a.begin() + mid, a.end());

    //Remove duplicate elements from the sorted vector
    a.erase(std::unique(a.begin(), a.end()), a.end());
}

回复收藏 0 原文

赏烟花じ飞满天 2024-09-23 19:02:02

是的，这可以在 O(n) 时间内就地完成，假设两个输入都已排序，并且一次传递两个向量。方法如下：

通过 B.size() 扩展 A（目标向量） - 为我们的新元素腾出空间。

从 B 的末尾和 A 的原始末尾开始向后迭代两个向量。如果向量按小→大排序（最后大），则将指向较大数字的迭代器放在A的真实末尾。继续下去，直到 B 的迭代器到达 B 的开头。反向迭代器在这里应该特别好。

示例：

A: [ 1, 2, 4, 9 ]
B: [ 3, 7, 11 ]

* = iterator, ^ = where we're inserting, _ = unitialized
A: [ 1, 3, 4, 9*, _, _, _^ ]   B: [ 3, 7, 11* ]
A: [ 1, 3, 4, 9*, _, _^, 11 ]  B: [ 3, 7*, 11 ]
A: [ 1, 3, 4*, 9, _^, 9, 11 ]  B: [ 3, 7*, 11 ]
A: [ 1, 3, 4*, 9^, 7, 9, 11 ]  B: [ 3*, 7, 11 ]
A: [ 1, 3*, 4^, 4, 7, 9, 11 ]  B: [ 3*, 7, 11 ]
A: [ 1, 3*^, 3, 4, 7, 9, 11 ]  B: [ 3, 7, 11 ]

超级编辑：您考虑过std::inplace_merge吗？（我可能刚刚重新发明了它？）

Yes, this can be done in-place, and in O(n) time, assuming both inputs are sorted, and with one pass over both vectors. Here's how:

Extend A (the destination vector) by B.size() - make room for our new elements.

Iterate backwards over the two vectors, starting from the end of B and the original end of A. If the vectors are sorted small → big (big at the end), then take the iterator pointing at the larger number, and stick it in the true end of A. Keep going until B's iterator hits the beginning of B. Reverse iterators should prove especially nice here.

Example:

A: [ 1, 2, 4, 9 ]
B: [ 3, 7, 11 ]

* = iterator, ^ = where we're inserting, _ = unitialized
A: [ 1, 3, 4, 9*, _, _, _^ ]   B: [ 3, 7, 11* ]
A: [ 1, 3, 4, 9*, _, _^, 11 ]  B: [ 3, 7*, 11 ]
A: [ 1, 3, 4*, 9, _^, 9, 11 ]  B: [ 3, 7*, 11 ]
A: [ 1, 3, 4*, 9^, 7, 9, 11 ]  B: [ 3*, 7, 11 ]
A: [ 1, 3*, 4^, 4, 7, 9, 11 ]  B: [ 3*, 7, 11 ]
A: [ 1, 3*^, 3, 4, 7, 9, 11 ]  B: [ 3, 7, 11 ]

Super edit: Have you considered std::inplace_merge? (which I may have just re-invented?)

回复收藏 0 原文

薄凉少年不暖心 2024-09-23 19:02:02

set_difference 的想法很好，但缺点是我们不知道需要提前将向量增长多少。

这是我的解决方案，它执行两次 set_difference ，一次是为了计算我们需要的额外插槽的数量，另一次是为了执行实际的复制。

注意：这意味着我们将迭代源两次。

#include <algorithm>
#include <boost/function_output_iterator.hpp>

// for the test
#include <vector>
#include <iostream>


struct output_counter
{
   output_counter(size_t & r) : result(r) {}
   template <class T> void operator()(const T & x) const { ++result; }
private:
   size_t & result;
};


// Target is assumed to work the same as a vector
// Both target and source must be sorted
template <class Target, class It>
void inplace_union( Target & target, It src_begin, It src_end )
{
   const size_t mid = target.size(); // Store the end of first sorted range

   // first, count how many items we will copy
   size_t extra = 0;
   std::set_difference( 
         src_begin, src_end,
         target.begin(), target.end(),
         boost::make_function_output_iterator(output_counter(extra)));

   if (extra > 0) // don't waste time if nothing to do
   {
      // reserve the exact memory we will require
      target.reserve( target.size() + extra );

      // Copy the items from the source that are missing in the destination
      std::set_difference( 
            src_begin, src_end,
            target.begin(), target.end(),
            std::back_inserter(target) );

      // Then perform the in place merge on the two sub-sorted ranges.
      std::inplace_merge( target.begin(), target.begin() + mid, target.end() );
   }
}



int main()
{
   std::vector<int> a(3), b(3);
   a[0] = 1;
   a[1] = 3;
   a[2] = 5;

   b[0] = 4;
   b[1] = 5;
   b[2] = 6;

   inplace_union(a, b.begin(), b.end());

   for (size_t i = 0; i != a.size(); ++i)
      std::cout << a[i] << ", ";
   std::cout << std::endl;

   return 0;
}

使用 boost 头文件编译，结果是：

$ ./test 
1, 3, 4, 5, 6,

The set_difference idea is good, but the disadvantage is we don't know how much we need to grow the vector in advance.

This is my solution which does the set_difference twice, once to count the number of extra slots we'll need, and once again to do the actual copy.

Note: that means we will iterate over the source twice.

#include <algorithm>
#include <boost/function_output_iterator.hpp>

// for the test
#include <vector>
#include <iostream>


struct output_counter
{
   output_counter(size_t & r) : result(r) {}
   template <class T> void operator()(const T & x) const { ++result; }
private:
   size_t & result;
};


// Target is assumed to work the same as a vector
// Both target and source must be sorted
template <class Target, class It>
void inplace_union( Target & target, It src_begin, It src_end )
{
   const size_t mid = target.size(); // Store the end of first sorted range

   // first, count how many items we will copy
   size_t extra = 0;
   std::set_difference( 
         src_begin, src_end,
         target.begin(), target.end(),
         boost::make_function_output_iterator(output_counter(extra)));

   if (extra > 0) // don't waste time if nothing to do
   {
      // reserve the exact memory we will require
      target.reserve( target.size() + extra );

      // Copy the items from the source that are missing in the destination
      std::set_difference( 
            src_begin, src_end,
            target.begin(), target.end(),
            std::back_inserter(target) );

      // Then perform the in place merge on the two sub-sorted ranges.
      std::inplace_merge( target.begin(), target.begin() + mid, target.end() );
   }
}



int main()
{
   std::vector<int> a(3), b(3);
   a[0] = 1;
   a[1] = 3;
   a[2] = 5;

   b[0] = 4;
   b[1] = 5;
   b[2] = 6;

   inplace_union(a, b.begin(), b.end());

   for (size_t i = 0; i != a.size(); ++i)
      std::cout << a[i] << ", ";
   std::cout << std::endl;

   return 0;
}

Compiled with the boost headers, the result is: