Fastest way to convert a std::vector to another std::vector

Posted on 2024-12-11 20:01:07

What is the fastest way (if there is any other) to convert a std::vector from one datatype to another (with the idea to save space)? For example:

std::vector<unsigned short> ----> std::vector<bool>

We obviously assume that the first vector only contains 0s and 1s. Copying element by element is highly inefficient in case of a really large vector.

Conditional question:
If you think there is no way to do it faster, is there a complex datatype which actually allows fast conversion from one datatype to another?

因为看清所以看轻 2024-12-18 20:01:07

std::vector<bool>

Stop.

A std::vector<bool> is... not. std::vector has a specialization for the use of the type bool, which causes certain changes in the vector. Namely, it stops acting like a std::vector.

There are certain things that the standard guarantees you can do with a std::vector. And vector<bool> violates those guarantees. So you should be very careful about using them.
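One concrete example of a broken guarantee is that operator[] on a std::vector<bool> does not return a bool&. The sketch below checks this at compile time; flip_first is a hypothetical helper introduced only for illustration.

```cpp
#include <type_traits>
#include <vector>

// For an ordinary element type, operator[] yields a real reference.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");

// For bool, operator[] yields a proxy object rather than bool&,
// so code that expects bool& or bool* from the container will not compile.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy class");

// Hypothetical helper: flips the first flag through the proxy and
// returns the new value. The vector is taken by value, so the caller's
// copy is left untouched.
inline bool flip_first(std::vector<bool> bits) {
    std::vector<bool>::reference first = bits[0];  // proxy, not bool&
    first = !first;                                // writes through to the stored bit
    return bits[0];
}
```

Generic code written against std::vector<T> that takes &v[0] or binds T& to an element is the kind of code that stops working when T is bool.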

Anyway, I'm going to pretend you said vector<int> instead of vector<bool>, as the latter really complicates things.

Copying element by element is highly inefficient in case of a really large vector.

Only if you do it wrong.

Vector casting of the type you want needs to be done carefully to be efficient.

If the source T type is convertible to the destination T, then this works just fine:

vector<Tnew> vec_new(vec_old.begin(), vec_old.end());

Decent implementations should recognize when they've been given random-access iterators and optimize the memory allocation and loop appropriately.
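A minimal sketch of the constructor above wrapped in a reusable function might look like this. The name convert_vector is hypothetical, and it assumes From is implicitly convertible to To.

```cpp
#include <vector>

// Hypothetical helper: builds a vector<To> from a vector<From> using the
// iterator-range constructor. Requires From to be implicitly convertible to To.
template <typename To, typename From>
std::vector<To> convert_vector(const std::vector<From>& src) {
    // With random-access iterators the constructor can compute the element
    // count up front, so the destination does not need to reallocate
    // while it is being filled.
    return std::vector<To>(src.begin(), src.end());
}
```

A call such as convert_vector<int>(old_vec) then reads much like a cast at the call site.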

For non-convertible types, the biggest pitfall with simple types is doing this, so avoid it:

std::vector<int> newVec(oldVec.size());

That's bad. That will allocate a buffer of the proper size, but it will also fill it with data. Namely, default-constructed ints (int()).

Instead, you should do this:

std::vector<int> newVec;
newVec.reserve(oldVec.size());

This reserves capacity equal to the size of the original vector, but it also ensures that no default construction takes place. You can now push_back to your heart's content, knowing that you will never cause reallocation in your new vector.

From there, you can just loop over each entry in the old vector, doing the conversion as needed.
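Put together, the reserve-then-push_back loop could be sketched as follows. The conversion to std::string via std::to_string is only an assumed example of a conversion that is not implicit; to_strings is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Hypothetical example: unsigned short has no implicit conversion to
// std::string, so each element is converted explicitly.
inline std::vector<std::string> to_strings(const std::vector<unsigned short>& old_vec) {
    std::vector<std::string> new_vec;
    new_vec.reserve(old_vec.size());  // capacity only; no elements are constructed
    for (unsigned short value : old_vec) {
        // Capacity is already sufficient, so push_back never reallocates here.
        new_vec.push_back(std::to_string(value));
    }
    return new_vec;
}
```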

ぃ弥猫深巷。 2024-12-18 20:01:07

There's no way to avoid the copy, since a std::vector<T> is a distinct
type from std::vector<U>, and there's no way for them to share the
memory. Other than that, it depends on how the data is mapped. If the
mapping corresponds to an implicit conversion (e.g. unsigned short to
bool), then simply creating a new vector using the begin and end
iterators from the old will do the trick:

std::vector<bool> newV( oldV.begin(), oldV.end() );
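As a small illustration of what the implicit conversion does, the sketch below wraps that constructor in a hypothetical to_bits function. Note that any nonzero value silently becomes true, which is why the verification case discussed next matters.

```cpp
#include <vector>

// Hypothetical helper: copies 0/1 values into a std::vector<bool>.
// The implicit unsigned short -> bool conversion maps every nonzero
// value to true, not just the value 1.
inline std::vector<bool> to_bits(const std::vector<unsigned short>& old_v) {
    return std::vector<bool>(old_v.begin(), old_v.end());
}
```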

If the mapping isn't just an implicit conversion (and this includes
cases where you want to verify things; e.g. that the unsigned short
does contain only 0 or 1), then it gets more complicated. The
obvious solution would be to use std::transform:

std::vector<TargetType> newV;
newV.reserve( oldV.size() );    //  avoids unnecessary reallocations
std::transform( oldV.begin(), oldV.end(),
                std::back_inserter( newV ),
                TranformationObject() );

where TranformationObject is a function object which does the
transformation, e.g.:

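struct ToBool
{
    // No base class needed: std::unary_function was deprecated in C++11
    // and removed in C++17, and std::transform does not require it.
    bool operator()( unsigned short original ) const
    {
        if ( original != 0 && original != 1 )
            throw Something();    //  placeholder exception type
        return original != 0;
    }
};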

(Note that I'm just using this transformation function as an example.
If the only thing which distinguishes the transformation function from
an implicit conversion is the verification, it might be faster to verify
all of the values in oldV first, using std::for_each, and then use
the two-iterator constructor above.)
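The verify-first idea in that note could be sketched roughly as below. This version swaps std::for_each for std::all_of (C++11), which stops at the first bad value, and it assumes std::invalid_argument as the exception type in place of the Something placeholder; to_bits_checked is a hypothetical name.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

inline std::vector<bool> to_bits_checked(const std::vector<unsigned short>& old_v) {
    // Pass 1: verify that every element is 0 or 1.
    const bool valid = std::all_of(old_v.begin(), old_v.end(),
                                   [](unsigned short x) { return x <= 1; });
    if (!valid)
        throw std::invalid_argument("expected only 0 or 1");  // assumed exception type
    // Pass 2: plain implicit conversion through the two-iterator constructor.
    return std::vector<bool>(old_v.begin(), old_v.end());
}
```

Whether two simple passes beat one pass with a branch per element depends on the compiler and the data, so it is worth measuring rather than assuming.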

Depending on the cost of default constructing the target type, it may be
faster to create the new vector with the correct size, then overwrite
it:

std::vector<TargetType> newV( oldV.size() );
std::transform( oldV.begin(), oldV.end(),
                newV.begin(),
                TranformationObject() );
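For a trivially constructible target type, the size-then-overwrite variant might look like the following sketch. The target type std::uint8_t, the lambda, and the name to_bytes are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

inline std::vector<std::uint8_t> to_bytes(const std::vector<unsigned short>& old_v) {
    // Value-initializes every element to 0; cheap for a trivial type.
    std::vector<std::uint8_t> new_v(old_v.size());
    // Overwrite in place, so no back_inserter and no capacity checks are needed.
    std::transform(old_v.begin(), old_v.end(), new_v.begin(),
                   [](unsigned short x) { return static_cast<std::uint8_t>(x != 0); });
    return new_v;
}
```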

Finally, another possibility would be to use a
boost::transform_iterator. Something like:

std::vector<TargetType> newV(
    boost::make_transform_iterator( oldV.begin(), TranformationObject() ),
    boost::make_transform_iterator( oldV.end(), TranformationObject() ) );

In many ways, this is the solution I prefer; depending on how
boost::transform_iterator has been implemented, it could also be the
fastest.

南巷近海 2024-12-18 20:01:07

You should be able to use assign like this:

vector<unsigned short> v;
//...
vector<bool> u;
//...
u.assign(v.begin(), v.end());
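One possible reason to prefer assign over constructing a new vector is that the destination can be reused across calls. A minimal sketch, with refill as a hypothetical helper name:

```cpp
#include <vector>

// Hypothetical helper: refills an existing vector<bool> from the source.
// assign() replaces the previous contents, so 'u' can be reused across calls.
inline void refill(std::vector<bool>& u, const std::vector<unsigned short>& v) {
    u.assign(v.begin(), v.end());
}
```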

青萝楚歌 2024-12-18 20:01:07

class A { ... };
class B { .... };
B convert_A_to_B(const A& a) { ....... }

void convertVector_A_to_B(const std::vector<A>& va, std::vector<B>& vb)
{
    vb.clear();
    vb.reserve(va.size());   // allocate once, up front
    std::transform(va.begin(), va.end(), std::back_inserter(vb), convert_A_to_B);
}

牛↙奶布丁 2024-12-18 20:01:07

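The fastest way is not to do it. For example, if you know in advance that your items only need a byte of storage each, use a byte-sized vector to begin with. You'll find it hard to find a faster way than that :-)

If that's not possible, then just absorb the cost of the conversion. Even if it is a little slow (and that's by no means certain, see Nicol's excellent answer for details), it's still necessary. If it weren't, you would simply leave the data in the larger-type vector.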
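To make the space argument concrete, here is a rough sketch that compares element storage for the two choices. payload_bytes is a hypothetical helper; it ignores spare capacity and allocator overhead, and it is not meaningful for std::vector<bool>, which packs its elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes of element storage currently in use.
template <typename T>
std::size_t payload_bytes(const std::vector<T>& v) {
    return v.size() * sizeof(T);
}
```

For the same element count, a std::vector<std::uint8_t> reports a smaller figure than a std::vector<unsigned short>, and no conversion is ever needed if the data starts out in the narrow type.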

夜唯美灬不弃 2024-12-18 20:01:07

First, a warning: Don't do what I'm about to suggest. It's dangerous and must never be done. That said, if you just have to squeeze out a tiny bit more performance No Matter What...

First, there are some caveats. If you don't meet these, you can't do this:

The vectors must contain plain-old-data. If your type has pointers, or uses a destructor, or needs an operator = to copy correctly ... do not do this.
The sizeof() of both vectors' contained types must be the same. That is, vector< A > can copy from vector< B > only if sizeof(A) == sizeof(B).

Here is a fairly stable method:

vector< A > a;
vector< B > b;
a.resize( b.size() );
assert( sizeof(vector< A >::value_type) == sizeof(vector< B >::value_type) );
if( b.size() == 0 )
   a.clear();
else
   memcpy( &(*a.begin()), &(*b.begin()), b.size() * sizeof(B) );

This does a very fast block copy of the memory contained in vector b, directly smashing whatever data you have in vector a. The memcpy itself doesn't call constructors (although the resize before it does value-initialize any newly added elements), it doesn't do any safety checking, and it's much faster than any of the other methods given here. An optimizing compiler should be able to match the speed of this in theory, but unless you're using an unusually good one, it won't (I checked with Visual C++ a few years ago, and it wasn't even close).
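If you do go down this road, the two caveats can at least be enforced at compile time. The following is a sketch under the assumption that C++11 type traits are available; bitwise_copy is a hypothetical name, and it cannot be used with std::vector<bool>, which has no data() member.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: bitwise copy between same-sized, trivially copyable
// element types. The bit patterns are reinterpreted, not converted.
template <typename To, typename From>
std::vector<To> bitwise_copy(const std::vector<From>& src) {
    static_assert(sizeof(To) == sizeof(From), "element sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both element types must be trivially copyable");
    std::vector<To> dst(src.size());  // value-initializes the elements first
    if (!src.empty())                 // skip memcpy entirely for an empty source
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(From));
    return dst;
}
```

Because the bytes are copied verbatim, a std::uint16_t holding 65535 arrives in a std::int16_t as -1, which may or may not be what you want.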

Also, given these constraints, you could forcibly (via void *) cast one vector type to the other and swap them -- I had a code sample for that, but it started oozing ectoplasm on my screen, so I deleted it.

牵你的手，一向走下去 2024-12-18 20:01:07

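Copying element by element is not highly inefficient. std::vector provides constant-time access to any of its elements, so the whole operation will be O(n). You will not notice it.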

请别遗忘我 2024-12-18 20:01:07

#ifdef VECTOR_H_TYPE1
#ifdef VECTOR_H_TYPE2
#ifdef VECTOR_H_CLASS
/* Other methods can be added as needed, provided they likewise carry out the same operations on both */

#include <vector>

using namespace std;

class VECTOR_H_CLASS {
public:
        vector<VECTOR_H_TYPE1> *firstVec;
        vector<VECTOR_H_TYPE2> *secondVec;

        VECTOR_H_CLASS(vector<VECTOR_H_TYPE1> &v1, vector<VECTOR_H_TYPE2> &v2) { firstVec = &v1; secondVec = &v2; }
        ~VECTOR_H_CLASS() {}

        void init() { // Use this to copy a full vector into an empty (or garbage) vector to equalize them
                secondVec->clear();
                for(vector<VECTOR_H_TYPE1>::iterator it = firstVec->begin(); it != firstVec->end(); it++) secondVec->push_back((VECTOR_H_TYPE2)*it);
        }

        void push_back(const VECTOR_H_TYPE1 &value) { // Append the same value to both vectors, converting it for the second
                firstVec->push_back(value);
                secondVec->push_back((VECTOR_H_TYPE2)value);
        }

        void pop_back() {
                firstVec->pop_back();
                secondVec->pop_back();
        }

        void clear() {
                firstVec->clear();
                secondVec->clear();
        }
};
#undef VECTOR_H_CLASS
#endif
#undef VECTOR_H_TYPE2
#endif
#undef VECTOR_H_TYPE1
#endif
