How to make the elements of a vector unique? (Removing non-adjacent duplicates)

Published 2024-08-04 20:51:19


I have a vector containing a few non-adjacent duplicates.

As a simple example, consider:

2 1 6 1 4 6 2 1 1

I am trying to make this vector unique by removing the non-adjacent duplicates and maintaining the order of elements.

Result would be:

2 1 6 4 

The solutions I tried are:

  1. Inserting into a std::set, but the problem with this approach is that it will disturb the order of elements.
  2. Using the combination of std::sort and std::unique. But the same ordering problem arises again.
  3. Manual duplicate elimination:

        Define a temporary vector TempVector.
        for (each element in the vector)
        {
            if (the element does not exist in TempVector)
            {
                add it to TempVector;
            }
        }
        swap the original vector with TempVector.
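For reference, approach 3 can be written out in real C++ roughly like this (a sketch; the function name `dedup_keep_order` is made up):

```cpp
#include <algorithm>
#include <vector>

// Manual duplicate elimination (approach 3): O(n^2) time, keeps the
// first occurrence of each value and preserves order.
void dedup_keep_order(std::vector<int>& v)
{
    std::vector<int> tmp;
    tmp.reserve(v.size());
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        if (std::find(tmp.begin(), tmp.end(), *it) == tmp.end())
            tmp.push_back(*it);  // not seen yet, keep it
    v.swap(tmp);
}
```

On the sample input 2 1 6 1 4 6 2 1 1 this leaves 2 1 6 4.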
    

My question is:

Is there any STL algorithm which can remove the non-adjacent duplicates from the vector? What is its complexity?


Comments (11)

始于初秋 2024-08-11 20:51:19

I think you would do it like this:

I would use two iterators on the vector:

The first one reads the data and inserts it into a temporary set.

When the value read was not already in the set, you copy it from the read iterator to the write iterator and increment the write iterator.

At the end you keep only the data up to the write iterator.

The complexity is O(n log n), as the lookup for duplicated elements uses the set, not the vector.

#include <vector>
#include <set>
#include <iostream>

int main(int argc, char* argv[])
{
    std::vector< int > k ;

    k.push_back( 2 );
    k.push_back( 1 );
    k.push_back( 6 );
    k.push_back( 1 );
    k.push_back( 4 );
    k.push_back( 6 );
    k.push_back( 2 );
    k.push_back( 1 );
    k.push_back( 1 );

    {
        std::vector< int >::iterator r , w ;

        std::set< int > tmpset ;

        for( r = k.begin() , w = k.begin() ; r != k.end() ; ++r )
        {
            if( tmpset.insert( *r ).second )
            {
                *w++ = *r ;
            }
        }

        k.erase( w , k.end() );
    }

    {
        std::vector< int >::iterator r ;

        for( r = k.begin() ; r != k.end() ; ++r )
        {
            std::cout << *r << std::endl ;
        }
    }
}
污味仙女 2024-08-11 20:51:19


Without using a temporary set it's possible to do this with (possibly) some loss of performance:

template<class Iterator>
Iterator Unique(Iterator first, Iterator last)
{
    while (first != last)
    {
        Iterator next(first);
        last = std::remove(++next, last, *first);
        first = next;
    }

    return last;
}

used as in:

vec.erase( Unique( vec.begin(), vec.end() ), vec.end() );

For smaller data sets, the implementation simplicity and lack of extra allocation required may offset this approach's theoretically higher complexity compared to using an additional set. Measurement with a representative input is the only way to be sure, though.

折戟 2024-08-11 20:51:19


As the question was "is there any STL algorithm...? what is its complexity?" it makes sense to implement the function like std::unique:

#include <unordered_set>

template <class FwdIterator>
inline FwdIterator stable_unique(FwdIterator first, FwdIterator last)
{
    FwdIterator result = first;
    std::unordered_set<typename FwdIterator::value_type> seen;

    for (; first != last; ++first)
        if (seen.insert(*first).second)
            *result++ = *first;
    return result;
}

So this is how std::unique is implemented, plus an extra set. The unordered_set should be faster than a regular set. Every element that compares equal to an element seen earlier in the range is removed (the first occurrence of each value is kept). The iterator returned points to the new end within the range [first,last).

EDIT: The last sentence means that the container itself is NOT modified by unique. This can be confusing. The following example actually reduces the container to the unified set.

1: std::vector<int> v(3, 5);
2: v.resize(std::distance(v.begin(), unique(v.begin(), v.end())));
3: assert(v.size() == 1);

Line 1 creates a vector { 5, 5, 5 }. In line 2 calling unique returns an iterator to the 2nd element, which is the first element that is not unique. Hence distance returns 1 and resize prunes the vector.
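To see the behavior on the question's sample input, stable_unique can be exercised like this (restated with its include so the snippet is self-contained; `stable_unique_vec` is just a hypothetical convenience wrapper):

```cpp
#include <unordered_set>
#include <vector>

template <class FwdIterator>
FwdIterator stable_unique(FwdIterator first, FwdIterator last)
{
    FwdIterator result = first;
    std::unordered_set<typename FwdIterator::value_type> seen;
    for (; first != last; ++first)
        if (seen.insert(*first).second)  // true only for the first occurrence
            *result++ = *first;
    return result;
}

// Convenience wrapper: apply stable_unique and actually shrink the vector.
std::vector<int> stable_unique_vec(std::vector<int> v)
{
    v.erase(stable_unique(v.begin(), v.end()), v.end());
    return v;
}
```

On {2, 1, 6, 1, 4, 6, 2, 1, 1} this yields {2, 1, 6, 4}, with expected O(1) hash lookups per element rather than the O(log n) of a tree-based set.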

东走西顾 2024-08-11 20:51:19


You can remove some of the loops in fa's answer using remove_copy_if:

#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

class NotSeen : public std::unary_function <int, bool>
{
public:
  NotSeen (std::set<int> & seen) : m_seen (seen) { }

  // remove_copy_if skips elements for which this returns true, so
  // return true exactly when the element was already seen.
  bool operator ()(int i) const  {
    return !m_seen.insert (i).second;
  }

private:
  std::set<int> & m_seen;
};

void removeDups (std::vector<int> const & iv, std::vector<int> & ov) {
  std::set<int> seen;
  std::remove_copy_if (iv.begin ()
      , iv.end ()
      , std::back_inserter (ov)
      , NotSeen (seen));
}

This has no effect on the complexity of the algorithm (i.e. as written it's also O(n log n)). You can improve upon this using unordered_set, or if the range of your values is small enough you could simply use an array or a bit array.
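The array/bit-array suggestion might look like this for values known to lie in a small non-negative range (a sketch; the bound 256 and the name `dedup_small_range` are assumptions):

```cpp
#include <bitset>
#include <vector>

// Stable in-place dedup for values in [0, 256): O(n) time, constant extra space.
void dedup_small_range(std::vector<int>& v)
{
    std::bitset<256> seen;
    std::vector<int>::iterator w = v.begin();  // write position
    for (std::vector<int>::const_iterator r = v.begin(); r != v.end(); ++r)
        if (!seen.test(*r)) {
            seen.set(*r);
            *w++ = *r;  // keep first occurrence, compact in place
        }
    v.erase(w, v.end());
}
```

This trades generality for speed: membership tests become O(1) bit operations, at the cost of requiring a known, small value range.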

月亮是我掰弯的 2024-08-11 20:51:19


There's no STL algorithm doing what you want preserving the sequence's original order.

You could create a std::set of iterators or indexes into the vector, with a comparison predicate that uses the referenced data rather than the iterators/indexes themselves to sort. Then delete everything from the vector that isn't referenced in the set. (Of course, you could just as well use another std::vector of iterators/indexes, apply std::sort and std::unique to that, and use it as a reference for what to keep.)
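A sketch of the index-based variant described above, using a plain index vector plus std::sort/std::unique rather than a std::set (`stable_unique_by_index` is a made-up name; this assumes C++11 for the lambdas):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stable unique via an index vector: O(n log n) time, O(n) extra space.
std::vector<int> stable_unique_by_index(const std::vector<int>& v)
{
    std::vector<std::size_t> idx(v.size());
    for (std::size_t i = 0; i != idx.size(); ++i) idx[i] = i;

    // Sort indices by the referenced value; the index breaks ties so the
    // first occurrence of each value comes first within its run.
    std::sort(idx.begin(), idx.end(), [&](std::size_t a, std::size_t b) {
        return v[a] != v[b] ? v[a] < v[b] : a < b;
    });
    // std::unique keeps the first index of each run of equal values.
    idx.erase(std::unique(idx.begin(), idx.end(),
                          [&](std::size_t a, std::size_t b) {
        return v[a] == v[b];
    }), idx.end());
    // Re-sort the surviving indices to restore the original order.
    std::sort(idx.begin(), idx.end());

    std::vector<int> out;
    out.reserve(idx.size());
    for (std::size_t i : idx) out.push_back(v[i]);
    return out;
}
```

On {2, 1, 6, 1, 4, 6, 2, 1, 1} this returns {2, 1, 6, 4}.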

烟凡古楼 2024-08-11 20:51:19


Based on @fa's answer, it can also be rewritten using the STL algorithm std::stable_partition:

struct dupChecker_ {
    inline dupChecker_() : tmpSet() {}
    inline bool operator()(int i) {
        return tmpSet.insert(i).second;
    }
private:
    std::set<int> tmpSet;
};

k.erase(std::stable_partition(k.begin(), k.end(), dupChecker_()), k.end());

This way it is more compact and we don't need to care about the iterators. (Note that the standard allows algorithms to copy their predicate, so a stateful predicate like this relies on implementation behavior rather than on a guarantee.)

It does not even seem to introduce much of a performance penalty. I use it in my project, which often needs to handle quite large vectors of complex types, and it makes no real difference.

Another nice feature is that the notion of uniqueness can be adjusted by using std::set<int, myCmp_> tmpSet;. For instance, in my project this is used to ignore certain rounding errors.
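The adjustable-uniqueness idea could be sketched like this for doubles with a tolerance (`eps_less_` and `stable_unique_eps` are hypothetical names; note that such a comparator is only a valid strict weak ordering when distinct values are separated by more than the tolerance):

```cpp
#include <set>
#include <vector>

// Values closer than 1e-9 are considered equivalent for uniqueness purposes.
// Caution: this is only a valid strict weak ordering if the data's distinct
// values are separated by more than the tolerance.
struct eps_less_ {
    bool operator()(double a, double b) const {
        return a < b - 1e-9;
    }
};

std::vector<double> stable_unique_eps(const std::vector<double>& v)
{
    std::set<double, eps_less_> seen;
    std::vector<double> out;
    for (std::vector<double>::const_iterator it = v.begin(); it != v.end(); ++it)
        if (seen.insert(*it).second)  // fails when an eps-equal value exists
            out.push_back(*it);
    return out;
}
```

With this comparator, 1.0 and 1.0 + 1e-12 are treated as the same value, so only the first is kept.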

梦萦几度 2024-08-11 20:51:19


There is a nice article by John Torjo which deals with this very question in a systematic way. The result he comes up with seems more generic and more efficient than any of the solutions suggested here so far:

http://www.builderau.com.au/program/java/soa/C-Removing-duplicates-from-a-range/0,339024620,320271583,00.htm

https://web.archive.org/web/1/http://articles.techrepublic%2ecom%2ecom/5100-10878_11-1052159.html

Unfortunately, the complete code of John's solution seems to be no longer available, and John did not respond to my email. Therefore, I wrote my own code, which is based on similar grounds to his, but intentionally differs in some details. Feel free to contact me (vschoech think-cell com) and discuss the details if you wish.

To make the code compile for you, I added some of my own library stuff which I use regularly. Also, instead of going with plain stl, I use boost a lot to create more generic, more efficient, and more readable code.

Have fun!

#include <vector>
#include <functional>

#include <boost/bind.hpp>
#include <boost/range.hpp>
#include <boost/iterator/counting_iterator.hpp>

/////////////////////////////////////////////////////////////////////////////////////////////
// library stuff

template< class Rng, class Func >
Func for_each( Rng& rng, Func f ) {
    return std::for_each( boost::begin(rng), boost::end(rng), f );
};

template< class Rng, class Pred >
Rng& sort( Rng& rng, Pred pred ) {
    std::sort( boost::begin( rng ), boost::end( rng ), pred );
    return rng; // to allow function chaining, similar to operator+= et al.
}

template< class T >
boost::iterator_range< boost::counting_iterator<T> > make_counting_range( T const& tBegin, T const& tEnd ) {
    return boost::iterator_range< boost::counting_iterator<T> >( tBegin, tEnd );
}

template< class Func >
class compare_less_impl {
private:
    Func m_func;
public:
    typedef bool result_type;
    compare_less_impl( Func func ) 
    :   m_func( func )
    {}
    template< class T1, class T2 > bool operator()( T1 const& tLeft, T2 const& tRight ) const {
        return m_func( tLeft ) < m_func( tRight );
    }
};

template< class Func >
compare_less_impl<Func> compare_less( Func func ) {
    return compare_less_impl<Func>( func );
}


/////////////////////////////////////////////////////////////////////////////////////////////
// stable_unique

template<class forward_iterator, class predicate_type>
forward_iterator stable_unique(forward_iterator itBegin, forward_iterator itEnd, predicate_type predLess) {
    typedef std::iterator_traits<forward_iterator>::difference_type index_type;
    struct SIteratorIndex {
        SIteratorIndex(forward_iterator itValue, index_type idx) : m_itValue(itValue), m_idx(idx) {}
        std::iterator_traits<forward_iterator>::reference Value() const {return *m_itValue;}
        index_type m_idx;
    private:
        forward_iterator m_itValue;
    };

    // {1} create array of values (represented by iterators) and indices
    std::vector<SIteratorIndex> vecitidx;
    vecitidx.reserve( std::distance(itBegin, itEnd) );
    struct FPushBackIteratorIndex {
        FPushBackIteratorIndex(std::vector<SIteratorIndex>& vecitidx) : m_vecitidx(vecitidx) {}
        void operator()(forward_iterator itValue) const {
            m_vecitidx.push_back( SIteratorIndex(itValue, m_vecitidx.size()) );
        }
    private:
        std::vector<SIteratorIndex>& m_vecitidx;
    };
    for_each( make_counting_range(itBegin, itEnd), FPushBackIteratorIndex(vecitidx) );

    // {2} sort by underlying value
    struct FStableCompareByValue {
        FStableCompareByValue(predicate_type predLess) : m_predLess(predLess) {}
        bool operator()(SIteratorIndex const& itidxA, SIteratorIndex const& itidxB) {
            return m_predLess(itidxA.Value(), itidxB.Value())
                // stable sort order, index is secondary criterion
                || !m_predLess(itidxB.Value(), itidxA.Value()) && itidxA.m_idx < itidxB.m_idx;
        }
    private:
        predicate_type m_predLess;
    };
    sort( vecitidx, FStableCompareByValue(predLess) );

    // {3} apply std::unique to the sorted vector, removing duplicate values
    vecitidx.erase(
        std::unique( vecitidx.begin(), vecitidx.end(),
            !boost::bind( predLess,
                // redundant boost::mem_fn required to compile
                boost::bind(boost::mem_fn(&SIteratorIndex::Value), _1),
                boost::bind(boost::mem_fn(&SIteratorIndex::Value), _2)
            )
        ),
        vecitidx.end()
    );

    // {4} re-sort by index to match original order
    sort( vecitidx, compare_less(boost::mem_fn(&SIteratorIndex::m_idx)) );

    // {5} keep only those values in the original range that were not removed by std::unique
    std::vector<SIteratorIndex>::iterator ititidx = vecitidx.begin();
    forward_iterator itSrc = itBegin;
    index_type idx = 0;
    for(;;) {
        if( ititidx==vecitidx.end() ) {
            // {6} return end of unique range
            return itSrc;
        }
        if( idx!=ititidx->m_idx ) {
            // original range must be modified
            break;
        }
        ++ititidx;
        ++idx;
        ++itSrc;
    }
    forward_iterator itDst = itSrc;
    do {
        ++idx;
        ++itSrc;
        // while there are still items in vecitidx, there must also be corresponding items in the original range
        if( idx==ititidx->m_idx ) {
            std::swap( *itDst, *itSrc ); // C++0x move
            ++ititidx;
            ++itDst;
        }
    } while( ititidx!=vecitidx.end() );

    // {6} return end of unique range
    return itDst;
}

template<class forward_iterator>
forward_iterator stable_unique(forward_iterator itBegin, forward_iterator itEnd) {
    return stable_unique( itBegin, itEnd, std::less< std::iterator_traits<forward_iterator>::value_type >() );
}

void stable_unique_test() {
    std::vector<int> vecn;
    vecn.push_back(1);
    vecn.push_back(17);
    vecn.push_back(-100);
    vecn.push_back(17);
    vecn.push_back(1);
    vecn.push_back(17);
    vecn.push_back(53);
    vecn.erase( stable_unique(vecn.begin(), vecn.end()), vecn.end() );
    // result: 1, 17, -100, 53
}
得不到的就毁灭 2024-08-11 20:51:19


My question is:

Is there any STL algorithm which can remove the non-adjacent duplicates from the vector ? what is its complexity?

The STL options are the ones you mentioned: put the items in a std::set, or call std::sort and std::unique and then erase() on the container. Neither of these fulfills your requirement of "removing the non-adjacent duplicates and maintaining the order of elements."

So why doesn't the STL offer some other option? No standard library will offer everything for every user's needs. The STL's design considerations include "be fast enough for nearly all users," "be useful for nearly all users," and "provide exception safety as much as possible" (and "be small enough for the Standards Committee" as the library Stepanov originally wrote was much larger, and Stroustrup axed out something like 2/3 of it).

The simplest solution I can think of would look like this:

#include <algorithm>
#include <vector>

// Note:  an STL-like method would be templatized and use iterators instead of
// hardcoding std::vector<int>
std::vector<int> stable_unique(const std::vector<int>& input)
{
    std::vector<int> result;
    result.reserve(input.size());
    for (std::vector<int>::const_iterator itor = input.begin();
                                          itor != input.end();
                                          ++itor)
        if (std::find(result.begin(), result.end(), *itor) == result.end())
            result.push_back(*itor);
    return result;
}

This solution should be O(M * N) where M is the number of unique elements and N is the total number of elements.

场罚期间 2024-08-11 20:51:19


As far as I know there is none in the STL. Look it up in a reference.

花之痕靓丽 2024-08-11 20:51:19


Based on @Corden's answer, but this uses a lambda expression and removes the duplicates instead of writing them to the output vector:

    set<int> s;
    vector<int> nodups;
    remove_copy_if(v.begin(), v.end(), back_inserter(nodups), 
        [&s](int x){ 
            return !s.insert(x).second; //-> .second false means already exists
        } ); 
美人骨 2024-08-11 20:51:19


Given that your input is in vector<int> foo, you can use remove to do the legwork for you here. If you want to shrink the vector, just use erase; otherwise, just use last as your one-past-the-end iterator when you want the vector with duplicates removed but order preserved:

auto last = end(foo);

for(auto first = begin(foo); first < last; ++first) last = remove(next(first), last, *first);

foo.erase(last, end(foo));

Live Example

As far as time complexity goes, this will be O(nm), where n is the number of elements and m is the number of unique elements. As for space, it uses O(1) additional space because it does the removal in place.
