STL 中的双端队列到底是什么？

发布于 2024-11-14 22:39:47 字数 556 浏览 3 评论 0原文

我正在查看 STL 容器并试图弄清楚它们到底是什么（即使用的数据结构），双端队列阻止了我：我一开始以为它是一个双链表，这将允许在恒定时间内从两端插入和删除，但我对做出的承诺感到困扰< /a> 由运算符 []在恒定时间内完成。在链表中，任意访问应该是O(n)，对吧？

如果它是一个动态数组，它如何在恒定时间内添加元素？应该提到的是，可能会发生重新分配，并且 O(1) 是摊余成本，就像向量一样。

所以我想知道这个结构是什么，它允许在恒定时间内任意访问，同时永远不需要移动到新的更大的地方。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

骑趴 2024-11-21 22:39:47

双端队列在某种程度上是递归定义的：它在内部维护一个由固定大小的块组成的双端队列。每个块都是一个向量，块的队列（下图中的“映射”）本身也是一个向量。

对性能特征以及它与 vector 的比较进行了很好的分析，位于 CodeProject。

GCC 标准库实现内部使用 T** 来表示地图。每个数据块都是一个T*，它分配有一些固定大小的__deque_buf_size（取决于sizeof(T)）。

回复收藏 0 原文

晨曦慕雪 2024-11-21 22:39:47

从概述来看，您可以将deque视为双端队列

deque 中的数据以固定大小的块存储向量，它们是

有指针的通过 map（这也是一大块向量，但其大小可能会改变）

deque迭代器主要部分代码如下：

/*
buff_size is the length of the chunk
*/
template <class T, size_t buff_size>
struct __deque_iterator{
    typedef __deque_iterator<T, buff_size>              iterator;
    typedef T**                                         map_pointer;

    // pointer to the chunk
    T* cur;       
    T* first;     // the begin of the chunk
    T* last;      // the end of the chunk

    //because the pointer may skip to other chunk
    //so this pointer to the map
    map_pointer node;    // pointer to the map
}

主要部分代码deque如下：

/*
buff_size is the length of the chunk
*/
template<typename T, size_t buff_size = 0>
class deque{
    public:
        typedef T              value_type;
        typedef T&            reference;
        typedef T*            pointer;
        typedef __deque_iterator<T, buff_size> iterator;

        typedef size_t        size_type;
        typedef ptrdiff_t     difference_type;

    protected:
        typedef pointer*      map_pointer;

        // allocate memory for the chunk 
        typedef allocator<value_type> dataAllocator;

        // allocate memory for map 
        typedef allocator<pointer>    mapAllocator;

    private:
        //data members

        iterator start;
        iterator finish;

        map_pointer map;
        size_type   map_size;
}

下面给大家带来deque的核心代码，主要是三个部分：

iterator
如何构造a deque

1. iterator(`__deque_iterator`)

迭代器的主要问题是，当++时，- - 迭代器，它可能会跳到其他块（如果它指向块的边缘）。例如，存在三个数据块：chunk 1、chunk 2、chunk 3。

pointer1 指针指向 chunk 2 的开头，当操作符 --pointer 时，它将指向 chunk 1 的结尾code>，从而指向pointer2。

下面我给出__deque_iterator的主要函数：

首先，跳到任意一个chunk：

void set_node(map_pointer new_node){
    node = new_node;
    first = *new_node;
    last = first + chunk_size();
}

注意，计算chunk的chunk_size()函数尺寸，你可以想想这里返回 8 是为了简化。

operator* 获取 chunk 中的数据

reference operator*()const{
    return *cur;
}

operator++, --

// 增量的前缀形式

self& operator++(){
    ++cur;
    if (cur == last){      //if it reach the end of the chunk
        set_node(node + 1);//skip to the next chunk
        cur = first;
    }
    return *this;
}

// postfix forms of increment
self operator++(int){
    self tmp = *this;
    ++*this;//invoke prefix ++
    return tmp;
}
self& operator--(){
    if(cur == first){      // if it pointer to the begin of the chunk
        set_node(node - 1);//skip to the prev chunk
        cur = last;
    }
    --cur;
    return *this;
}

self operator--(int){
    self tmp = *this;
    --*this;
    return tmp;
}

iterator skip n steps / random access

self& operator+=(difference_type n){ // n can be postive or negative
    difference_type offset = n + (cur - first);
    if(offset >=0 && offset < difference_type(buffer_size())){
        // in the same chunk
        cur += n;
    }else{//not in the same chunk
        difference_type node_offset;
        if (offset > 0){
            node_offset = offset / difference_type(chunk_size());
        }else{
            node_offset = -((-offset - 1) / difference_type(chunk_size())) - 1 ;
        }
        // skip to the new chunk
        set_node(node + node_offset);
        // set new cur
        cur = first + (offset - node_offset * chunk_size());
    }

    return *this;
}

// skip n steps
self operator+(difference_type n)const{
    self tmp = *this;
    return tmp+= n; //reuse  operator +=
}

self& operator-=(difference_type n){
    return *this += -n; //reuse operator +=
}

self operator-(difference_type n)const{
    self tmp = *this;
    return tmp -= n; //reuse operator +=
}

// random access (iterator can skip n steps)
// invoke operator + ,operator *
reference operator[](difference_type n)const{
    return *(*this + n);
}

2. 如何构造 `deque`

常用函数code>deque

iterator begin(){return start;}
iterator end(){return finish;}

reference front(){
    //invoke __deque_iterator operator*
    // return start's member *cur
    return *start;
}

reference back(){
    // cna't use *finish
    iterator tmp = finish;
    --tmp; 
    return *tmp; //return finish's  *cur
}

reference operator[](size_type n){
    //random access, use __deque_iterator operator[]
    return start[n];
}


template<typename T, size_t buff_size>
deque<T, buff_size>::deque(size_t n, const value_type& value){
    fill_initialize(n, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::fill_initialize(size_t n, const value_type& value){
    // allocate memory for map and chunk
    // initialize pointer
    create_map_and_nodes(n);

    // initialize value for the chunks
    for (map_pointer cur = start.node; cur < finish.node; ++cur) {
        initialized_fill_n(*cur, chunk_size(), value);
    }

    // the end chunk may have space node, which don't need have initialize value
    initialized_fill_n(finish.first, finish.cur - finish.first, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::create_map_and_nodes(size_t num_elements){
    // the needed map node = (elements nums / chunk length) + 1
    size_type num_nodes = num_elements / chunk_size() + 1;

    // map node num。min num is  8 ，max num is "needed size + 2"
    map_size = std::max(8, num_nodes + 2);
    // allocate map array
    map = mapAllocator::allocate(map_size);

    // tmp_start，tmp_finish poniters to the center range of map
    map_pointer tmp_start  = map + (map_size - num_nodes) / 2;
    map_pointer tmp_finish = tmp_start + num_nodes - 1;

    // allocate memory for the chunk pointered by map node
    for (map_pointer cur = tmp_start; cur <= tmp_finish; ++cur) {
        *cur = dataAllocator::allocate(chunk_size());
    }

    // set start and end iterator
    start.set_node(tmp_start);
    start.cur = start.first;

    finish.set_node(tmp_finish);
    finish.cur = finish.first + num_elements % chunk_size();
}

假设 i_deque 有 20 个 int 元素 0~19，其块大小为 8，现在将 3 个元素 (0, 1, 2) 推送到i_deque：

i_deque.push_back(0);
i_deque.push_back(1);
i_deque.push_back(2);

其内部结构如下：

然后再次push_back，它将调用分配新块：

push_back(3)

如果我们 push_front，则将在上一个 start

注意，当 push_back 元素放入双端队列时，如果所有映射和块被填满，会导致分配新的map，并调整chunks。但是上面的代码可能足以让你理解deque。

From overview, you can think deque as a double-ended queue

The datas in deque are stored by chuncks of fixed size vector, which are

pointered by a map(which is also a chunk of vector, but its size may change)

The main part code of the deque iterator is as below:

/*
buff_size is the length of the chunk
*/
template <class T, size_t buff_size>
struct __deque_iterator{
    typedef __deque_iterator<T, buff_size>              iterator;
    typedef T**                                         map_pointer;

    // pointer to the chunk
    T* cur;       
    T* first;     // the begin of the chunk
    T* last;      // the end of the chunk

    //because the pointer may skip to other chunk
    //so this pointer to the map
    map_pointer node;    // pointer to the map
}

The main part code of the deque is as below:

/*
buff_size is the length of the chunk
*/
template<typename T, size_t buff_size = 0>
class deque{
    public:
        typedef T              value_type;
        typedef T&            reference;
        typedef T*            pointer;
        typedef __deque_iterator<T, buff_size> iterator;

        typedef size_t        size_type;
        typedef ptrdiff_t     difference_type;

    protected:
        typedef pointer*      map_pointer;

        // allocate memory for the chunk 
        typedef allocator<value_type> dataAllocator;

        // allocate memory for map 
        typedef allocator<pointer>    mapAllocator;

    private:
        //data members

        iterator start;
        iterator finish;

        map_pointer map;
        size_type   map_size;
}

Below i will give you the core code of deque, mainly about three parts:

iterator
How to construct a deque

1. iterator(`__deque_iterator`)

The main problem of iterator is, when ++, -- iterator, it may skip to other chunk(if it pointer to edge of chunk). For example, there are three data chunks: chunk 1,chunk 2,chunk 3.

The pointer1 pointers to the begin of chunk 2, when operator --pointer it will pointer to the end of chunk 1, so as to the pointer2.

Below I will give the main function of __deque_iterator:

Firstly, skip to any chunk:

void set_node(map_pointer new_node){
    node = new_node;
    first = *new_node;
    last = first + chunk_size();
}

Note that, the chunk_size() function which compute the chunk size, you can think of it returns 8 for simplify here.

operator* get the data in the chunk

reference operator*()const{
    return *cur;
}

operator++, --

// prefix forms of increment

self& operator++(){
    ++cur;
    if (cur == last){      //if it reach the end of the chunk
        set_node(node + 1);//skip to the next chunk
        cur = first;
    }
    return *this;
}

// postfix forms of increment
self operator++(int){
    self tmp = *this;
    ++*this;//invoke prefix ++
    return tmp;
}
self& operator--(){
    if(cur == first){      // if it pointer to the begin of the chunk
        set_node(node - 1);//skip to the prev chunk
        cur = last;
    }
    --cur;
    return *this;
}

self operator--(int){
    self tmp = *this;
    --*this;
    return tmp;
}

iterator skip n steps / random access

self& operator+=(difference_type n){ // n can be postive or negative
    difference_type offset = n + (cur - first);
    if(offset >=0 && offset < difference_type(buffer_size())){
        // in the same chunk
        cur += n;
    }else{//not in the same chunk
        difference_type node_offset;
        if (offset > 0){
            node_offset = offset / difference_type(chunk_size());
        }else{
            node_offset = -((-offset - 1) / difference_type(chunk_size())) - 1 ;
        }
        // skip to the new chunk
        set_node(node + node_offset);
        // set new cur
        cur = first + (offset - node_offset * chunk_size());
    }

    return *this;
}

// skip n steps
self operator+(difference_type n)const{
    self tmp = *this;
    return tmp+= n; //reuse  operator +=
}

self& operator-=(difference_type n){
    return *this += -n; //reuse operator +=
}

self operator-(difference_type n)const{
    self tmp = *this;
    return tmp -= n; //reuse operator +=
}

// random access (iterator can skip n steps)
// invoke operator + ,operator *
reference operator[](difference_type n)const{
    return *(*this + n);
}

2. How to construct a `deque`

common function of deque

iterator begin(){return start;}
iterator end(){return finish;}

reference front(){
    //invoke __deque_iterator operator*
    // return start's member *cur
    return *start;
}

reference back(){
    // cna't use *finish
    iterator tmp = finish;
    --tmp; 
    return *tmp; //return finish's  *cur
}

reference operator[](size_type n){
    //random access, use __deque_iterator operator[]
    return start[n];
}


template<typename T, size_t buff_size>
deque<T, buff_size>::deque(size_t n, const value_type& value){
    fill_initialize(n, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::fill_initialize(size_t n, const value_type& value){
    // allocate memory for map and chunk
    // initialize pointer
    create_map_and_nodes(n);

    // initialize value for the chunks
    for (map_pointer cur = start.node; cur < finish.node; ++cur) {
        initialized_fill_n(*cur, chunk_size(), value);
    }

    // the end chunk may have space node, which don't need have initialize value
    initialized_fill_n(finish.first, finish.cur - finish.first, value);
}

template<typename T, size_t buff_size>
void deque<T, buff_size>::create_map_and_nodes(size_t num_elements){
    // the needed map node = (elements nums / chunk length) + 1
    size_type num_nodes = num_elements / chunk_size() + 1;

    // map node num。min num is  8 ，max num is "needed size + 2"
    map_size = std::max(8, num_nodes + 2);
    // allocate map array
    map = mapAllocator::allocate(map_size);

    // tmp_start，tmp_finish poniters to the center range of map
    map_pointer tmp_start  = map + (map_size - num_nodes) / 2;
    map_pointer tmp_finish = tmp_start + num_nodes - 1;

    // allocate memory for the chunk pointered by map node
    for (map_pointer cur = tmp_start; cur <= tmp_finish; ++cur) {
        *cur = dataAllocator::allocate(chunk_size());
    }

    // set start and end iterator
    start.set_node(tmp_start);
    start.cur = start.first;

    finish.set_node(tmp_finish);
    finish.cur = finish.first + num_elements % chunk_size();
}

Let's assume i_deque has 20 int elements 0~19 whose chunk size is 8, and now push_back 3 elements (0, 1, 2) to i_deque:

i_deque.push_back(0);
i_deque.push_back(1);
i_deque.push_back(2);

It's internal structure like below:

Then push_back again, it will invoke allocate new chunk:

push_back(3)

If we push_front, it will allocate new chunk before the prev start

Note when push_back element into deque, if all the maps and chunks are filled, it will cause allocate new map, and adjust chunks.But the above code may be enough for you to understand deque.

回复收藏 0 原文

小苏打饼 2024-11-21 22:39:47

将其想象为向量的向量。只是它们不是标准的 std::vector 。

外部向量包含指向内部向量的指针。当通过重新分配更改其容量时，它不会像 std::vector 那样将所有空白空间分配到末尾，而是将空白空间在向量的开头和结尾处分成相等的部分。这允许该向量上的 push_front 和 push_back 都在摊销 O(1) 时间内发生。

内部向量的行为需要根据它位于双端队列的前面还是后面而改变。在后面，它可以表现为标准的 std::vector ，它在末尾增长，并且在 O(1) 时间内发生 push_back 。在前面，它需要做相反的事情，从每个 push_front 开始增长。实际上，通过将指针添加到前面的元素以及增长方向和大小，可以轻松实现这一点。通过这个简单的修改，push_front 也可以是 O(1) 时间。

访问任何元素都需要偏移并划分到正确的外部向量索引（发生在 O(1) 中），并索引到内部向量（也是 O(1)）。这假设内部向量都是固定大小的，除了位于双端队列开头或结尾的向量。

回复收藏 0 原文

一抹淡然 2024-11-21 22:39:47

（这是我在另一个线程中给出的答案。本质上，我认为即使是相当幼稚的实现，使用单个向量，符合“constant non-amortized push_{front,back}”的要求，你可能会感到惊讶，并认为这是不可能的，但我在中找到了其他相关的引用。以令人惊讶的方式定义上下文的标准；如果我在这个答案中犯了错误，那么识别我所说的哪些内容以及我的逻辑在哪里崩溃将非常有帮助）

。，我并不是想找出一个好的实现，我只是想帮助我们解释 C++ 标准中的复杂性要求。我引用N3242，根据维基百科，最新的免费 C++11 标准化文档。（它的组织方式似乎与最终标准不同，因此我不会引用确切的页码。当然，这些规则可能在最终标准中发生了变化，但我认为这种情况没有发生。

） code>deque可以通过使用 vector 正确实现。所有元素都被复制到堆上，指针存储在向量中。（稍后将详细介绍向量）。

为什么使用 T* 而不是 T？因为标准要求

“在双端队列任一端插入都会使所有迭代器无效
到双端队列，但对引用的有效性没有影响
双端队列的元素。"

（我的重点）。T* 有助于满足这一点。它还帮助我们满足这一点：

“在双端队列的开头或结尾插入单个元素总是......导致对 T 的构造函数的单次调用。”

现在来说说（有争议的）部分。为什么使用向量来存储T*？它为我们提供了随机访问，这是一个好的开始。让我们暂时忘记向量的复杂性，并仔细构建这一点：

标准谈论“所包含对象上的操作数量”。对于 deque::push_front 来说，这显然是 1，因为恰好构造了一个 T 对象，并且在任何一个中读取或扫描了零个现有的 T 对象。方式。这个数字 1 显然是一个常量，并且与当前双端队列中的对象数量无关。这让我们可以这样说：

“对于我们的 deque::push_front，所包含对象（Ts）上的操作数量是固定的，并且与双端队列中已有的对象数量无关。”

当然，T*上的操作数就不会那么乖了。当向量变得太大时，它将被重新分配，并且许多T*将被复制。所以，是的，T* 上的操作数量会有很大差异，但 T 上的操作数量不会受到影响。

为什么我们要关心 T 上的计数操作和 T* 上的计数操作之间的区别？这是因为标准说：

本条款中的所有复杂性要求仅根据所包含对象的操作数量来说明。

对于deque，包含的对象是T，而不是T*，这意味着我们可以忽略任何复制（或重新分配）a的操作T*。

我没有过多谈论向量在双端队列中的行为。也许我们会将其解释为循环缓冲区（向量始终占用其最大容量（），然后当向量已满时将所有内容重新分配到更大的缓冲区中。细节并不重要。

在前面几段中，我们分析了 deque::push_front 以及双端队列中已经存在的对象数量与 push_front 对包含的 T 执行的操作数量之间的关系-我们发现它们是物体。 由于标准要求复杂性以 T 上的操作为单位，因此我们可以说它具有恒定的复杂性。

是的， Operations-On-T*-Complexity 已摊销（由于向量），但我们只对 Operations-On-T-Complexity 感兴趣这是恒定的（非摊销）。

在此实现中，向量::push_back 或向量::push_front 的复杂性无关紧要；这些考虑因素涉及 T* 上的操作，因此无关紧要。如果该标准指的是复杂性的“传统”理论概念，那么他们就不会明确地将自己限制在“所包含对象上的操作数量”。我是否过度解读了这句话？

(This is an answer I've given in another thread. Essentially I'm arguing that even fairly naive implementations, using a single vector, conform to the requirements of "constant non-amortized push_{front,back}". You might be surprised, and think this is impossible, but I have found other relevant quotes in the standard that define the context in a surprising way. Please bear with me; if I have made a mistake in this answer, it would be very helpful to identify which things I have said correctly and where my logic has broken down. )

In this answer, I am not trying to identify a good implementation, I'm merely trying to help us to interpret the complexity requirements in the C++ standard. I'm quoting from N3242, which is, according to Wikipedia, the latest freely available C++11 standardization document. (It appears to be organized differently from the final standard, and hence I won't quote the exact page numbers. Of course, these rules might have changed in the final standard, but I don't think that has happened.)

A deque<T> could be implemented correctly by using a vector<T*>. All the elements are copied onto the heap and the pointers stored in a vector. (More on the vector later).

Why T* instead of T? Because the standard requires that

"An insertion at either end of the deque invalidates all the iterators
to the deque, but has no effect on the validity of references to
elements of the deque."

(my emphasis). The T* helps to satisfy that. It also helps us to satisfy this:

"Inserting a single element either at the beginning or end of a deque always ..... causes a single call to a constructor of T."

Now for the (controversial) bit. Why use a vector to store the T*? It gives us random access, which is a good start. Let's forget about the complexity of vector for a moment and build up to this carefully:

The standard talks about "the number of operations on the contained objects.". For deque::push_front this is clearly 1 because exactly one T object is constructed and zero of the existing T objects are read or scanned in any way. This number, 1, is clearly a constant and is independent of the number of objects currently in the deque. This allows us to say that:

'For our deque::push_front, the number of operations on the contained objects (the Ts) is fixed and is independent of the number of objects already in the deque.'

Of course, the number of operations on the T* will not be so well-behaved. When the vector<T*> grows too big, it'll be realloced and many T*s will be copied around. So yes, the number of operations on the T* will vary wildly, but the number of operations on T will not be affected.

Why do we care about this distinction between counting operations on T and counting operations on T*? It's because the standard says:

All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects.

For the deque, the contained objects are the T, not the T*, meaning we can ignore any operation which copies (or reallocs) a T*.

I haven't said much about how a vector would behave in a deque. Perhaps we would interpret it as a circular buffer (with the vector always taking up its maximum capacity(), and then realloc everything into a bigger buffer when the vector is full. The details don't matter.

In the last few paragraphs, we have analyzed deque::push_front and the relationship between the number of objects in the deque already and the number of operations performed by push_front on contained T-objects. And we found they were independent of each other. As the standard mandates that complexity is in terms of operations-on-T, then we can say this has constant complexity.

Yes, the Operations-On-T*-Complexity is amortized (due to the vector), but we're only interested in the Operations-On-T-Complexity and this is constant (non-amortized).

The complexity of vector::push_back or vector::push_front is irrelevant in this implementation; those considerations involve operations on T* and hence are irrelevant. If the standard was referring to the 'conventional' theoretical notion of complexity, then they wouldn't have explicitly restricted themselves to the "number of operations on the contained objects". Am I overinterpreting that sentence?

回复收藏 0 原文