Small string optimization for vectors?
I know several (all?) STL implementations implement a "small string" optimization: instead of storing the usual 3 pointers for begin, end and capacity, a string stores the actual character data in the memory used for the pointers if sizeof(characters) <= sizeof(pointers). I am in a situation where I have lots of small vectors with an element size <= sizeof(pointer). I cannot use fixed-size arrays, since the vectors need to be able to resize dynamically and may potentially grow quite large. However, the median (not mean) size of the vectors will only be 4-12 bytes. So a "small string" optimization adapted to vectors would be quite useful to me. Does such a thing exist?
I'm thinking about rolling my own by simply brute-force converting a vector to a string, i.e. providing a vector interface to a string. Good idea?
Comments (4)
Boost 1.58 was just released, and its Container library has a small_vector class based on the LLVM SmallVector. There is also a static_vector, which cannot grow beyond the initially given size. Both containers are header-only.
facebook's folly library also has some awesome containers. It has a small_vector which can be configured with a template parameter to act like Boost's static or small vectors. It can also be configured to use small integer types for its internal size bookkeeping, which, given that they are Facebook, is no surprise :) There is work in progress to make the library cross-platform, so Windows/MSVC support should land some day...
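You can borrow the SmallVector implementation from LLVM. (It is essentially header-only, located in LLVM\include\llvm\ADT; note that a small out-of-line helper for growing the buffer lives in lib/Support/SmallVector.cpp.)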
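For readers who have not looked inside one of these containers, here is a minimal sketch of the general idea, assuming trivial element types only. TinyVec and all of its members are hypothetical names made up for this illustration; this is not the Boost, folly, or LLVM code, and unlike a real small-string layout it does not overlay the inline buffer onto the pointer fields to save space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical sketch: TinyVec is an illustrative name, not a real library class.
// Restricted to trivial element types to keep the example short.
template <typename T, std::size_t N>
class TinyVec {
    static_assert(std::is_trivial<T>::value, "sketch supports trivial types only");
    static_assert(N > 0, "inline capacity must be non-zero");

public:
    TinyVec() = default;
    TinyVec(const TinyVec&) = delete;             // copy and move omitted for brevity
    TinyVec& operator=(const TinyVec&) = delete;
    ~TinyVec() {
        if (!is_inline()) delete[] data_;         // only the heap buffer is owned
    }

    void push_back(const T& value) {
        const T copy = value;                     // value may alias our own buffer
        if (size_ == capacity_) grow();
        data_[size_++] = copy;
    }

    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    const T& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    // True while the elements still live inside the object itself.
    bool is_inline() const { return data_ == inline_; }

private:
    // Doubles the capacity and moves the elements to a heap buffer.
    void grow() {
        const std::size_t new_capacity = capacity_ * 2;
        T* fresh = new T[new_capacity];
        std::copy(data_, data_ + size_, fresh);
        if (!is_inline()) delete[] data_;
        data_ = fresh;
        capacity_ = new_capacity;
    }

    T inline_[N];                 // storage used while size() <= N
    T* data_ = inline_;           // points at inline_ or at a heap buffer
    std::size_t size_ = 0;
    std::size_t capacity_ = N;
};
```

With something like TinyVec<int, 4>, the first four push_back calls touch no heap memory at all; the fifth spills everything into a heap buffer of capacity 8. If the median vector is only a handful of bytes, most instances never allocate.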
It was discussed years ago (and a few of the names in that thread may look a bit familiar :-) ), but I don't know of an existing implementation.
I don't think I'd try to adapt std::string to the task. The exact requirements on the type over which std::basic_string is instantiated aren't well stated, but the standard is pretty clear that it's only intended for something that acts a lot like char. For types that are substantially different, it might still work, but it's hard to say what would happen: it was never intended for them, and probably hasn't been tested with many types other than small integers.
A fully conforming implementation of std::vector is a lot of work. But implementing a usable subset of std::vector from scratch (even including a small vector optimization) won't usually be terribly difficult. If you include a small vector optimization, I'm reasonably certain you can't meet all the requirements on std::vector though.
In particular, swapping or moving a vector where you've stored actual data in the vector object means you'll need to swap/move actual data items, whereas the requirements on std::vector are predicated on its storing only a pointer to the data, so it can normally[1] swap or move the contents just by manipulating the pointers, without actually touching the data items themselves at all. As such, it's required to be able to do these things without throwing, even if manipulating the data items themselves would throw. A small vector optimization will therefore preclude meeting those requirements.
On the other hand, as noted above, one of the requirements on std::string is that it can only store items that can be manipulated without throwing. As such, if std::string is a viable option at all, implementing your own vector-like container probably won't need to worry about those details a lot either.
[1] For std::vector: if the two vectors use different allocators, then you have to allocate space for the objects in the destination via that vector's allocator.
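To make the swap/move point concrete, here is a small sketch (the function names are made up for this illustration). The first helper checks that swapping two std::vector objects with the default allocator just exchanges their buffers, so pointers to the elements stay valid. The second reports whether a short std::string keeps its buffer when moved; on implementations with a small string optimization it typically does not, but that is implementation-specific, so nothing should rely on either answer.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if swapping two vectors exchanged their buffers
// without relocating any element.
inline bool vector_swap_keeps_buffers() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{4};
    const int* a_buf = a.data();
    const int* b_buf = b.data();
    a.swap(b);                       // only the internal pointers are exchanged
    return b.data() == a_buf && a.data() == b_buf;
}

// Hypothetical helper: true if moving a short string handed over the same
// character buffer. The result is implementation-specific: with a small
// string optimization the characters live inside the string object, so they
// have to be copied into the destination object.
inline bool short_string_move_keeps_buffer() {
    std::string s = "abc";
    const char* before = s.data();
    std::string t = std::move(s);
    return t.data() == before;
}
```

A container with inline storage behaves like the second case: moving or swapping it has to touch the elements themselves, which is exactly where the no-throw requirements on std::vector can no longer be met for arbitrary element types.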
If T is a POD type, why not use basic_string instead of vector?