STL 中的双端队列到底是什么?
我正在查看 STL 容器并试图弄清楚它们到底是什么(即使用的数据结构),双端队列阻止了我:我一开始以为它是一个双链表,这将允许在恒定时间内从两端插入和删除,但我对做出的承诺感到困扰< /a> 由运算符 []在恒定时间内完成。在链表中,任意访问应该是O(n),对吧?
如果它是一个动态数组,它如何在恒定时间内添加元素 ?应该提到的是,可能会发生重新分配,并且 O(1) 是摊余成本, 就像向量一样。
所以我想知道这个结构是什么,它允许在恒定时间内任意访问,同时永远不需要移动到新的更大的地方。
I was looking at STL containers and trying to figure what they really are (i.e. the data structure used), and the deque stopped me: I thought at first that it was a double linked list, which would allow insertion and deletion from both ends in constant time, but I am troubled by the promise made by the operator [] to be done in constant time. In a linked list, arbitrary access should be O(n), right?
And if it's a dynamic array, how can it add elements in constant time? It should be mentioned that reallocation may happen, and that O(1) is an amortized cost, like for a vector.
So I wonder what is this structure that allows arbitrary access in constant time, and at the same time never needs to be moved to a new bigger place.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
双端队列在某种程度上是递归定义的:它在内部维护一个由固定大小的块组成的双端队列。每个块都是一个向量,块的队列(下图中的“映射”)本身也是一个向量。
对性能特征以及它与
vector
的比较进行了很好的分析,位于 CodeProject。GCC 标准库实现内部使用
T**
来表示地图。每个数据块都是一个T*
,它分配有一些固定大小的__deque_buf_size
(取决于sizeof(T)
)。A deque is somewhat recursively defined: internally it maintains a double-ended queue of chunks of fixed size. Each chunk is a vector, and the queue (“map” in the graphic below) of chunks itself is also a vector.
There’s a great analysis of the performance characteristics and how it compares to the
vector
over at CodeProject.The GCC standard library implementation internally uses a
T**
to represent the map. Each data block is aT*
which is allocated with some fixed size__deque_buf_size
(which depends onsizeof(T)
).从概述来看,您可以将
deque
视为双端队列
deque
中的数据以固定大小的块存储向量,它们是有指针的通过
map
(这也是一大块向量,但其大小可能会改变)deque迭代器
主要部分代码如下:主要部分代码
deque
如下:下面给大家带来
deque
的核心代码,主要是三个部分:iterator
如何构造a
deque
1. iterator(
__deque_iterator
)迭代器的主要问题是,当++时,- - 迭代器,它可能会跳到其他块(如果它指向块的边缘)。例如,存在三个数据块:
chunk 1
、chunk 2
、chunk 3
。pointer1
指针指向chunk 2
的开头,当操作符--pointer
时,它将指向chunk 1
的结尾code>,从而指向pointer2
。下面我给出
__deque_iterator
的主要函数:首先,跳到任意一个chunk:
注意,计算chunk的
chunk_size()
函数尺寸,你可以想想这里返回 8 是为了简化。operator*
获取 chunk 中的数据operator++, --
// 增量的前缀形式
iterator skip n steps / random access
2. 如何构造
deque
常用函数code>deque
假设
i_deque
有 20 个 int 元素0~19
,其块大小为 8,现在将 3 个元素 (0, 1, 2) 推送到i_deque
:其内部结构如下:
然后再次push_back,它将调用分配新块:
如果我们
push_front
,则将在上一个start
注意,当
push_back
元素放入双端队列时,如果所有映射和块被填满,会导致分配新的map,并调整chunks。但是上面的代码可能足以让你理解deque
。From overview, you can think
deque
as adouble-ended queue
The datas in
deque
are stored by chuncks of fixed size vector, which arepointered by a
map
(which is also a chunk of vector, but its size may change)The main part code of the
deque iterator
is as below:The main part code of the
deque
is as below:Below i will give you the core code of
deque
, mainly about three parts:iterator
How to construct a
deque
1. iterator(
__deque_iterator
)The main problem of iterator is, when ++, -- iterator, it may skip to other chunk(if it pointer to edge of chunk). For example, there are three data chunks:
chunk 1
,chunk 2
,chunk 3
.The
pointer1
pointers to the begin ofchunk 2
, when operator--pointer
it will pointer to the end ofchunk 1
, so as to thepointer2
.Below I will give the main function of
__deque_iterator
:Firstly, skip to any chunk:
Note that, the
chunk_size()
function which compute the chunk size, you can think of it returns 8 for simplify here.operator*
get the data in the chunkoperator++, --
// prefix forms of increment
iterator skip n steps / random access
2. How to construct a
deque
common function of
deque
Let's assume
i_deque
has 20 int elements0~19
whose chunk size is 8, and now push_back 3 elements (0, 1, 2) toi_deque
:It's internal structure like below:
Then push_back again, it will invoke allocate new chunk:
If we
push_front
, it will allocate new chunk before the prevstart
Note when
push_back
element into deque, if all the maps and chunks are filled, it will cause allocate new map, and adjust chunks.But the above code may be enough for you to understanddeque
.将其想象为向量的向量。只是它们不是标准的 std::vector 。
外部向量包含指向内部向量的指针。当通过重新分配更改其容量时,它不会像 std::vector 那样将所有空白空间分配到末尾,而是将空白空间在向量的开头和结尾处分成相等的部分。这允许该向量上的
push_front
和push_back
都在摊销 O(1) 时间内发生。内部向量的行为需要根据它位于双端队列的前面还是后面而改变。在后面,它可以表现为标准的 std::vector ,它在末尾增长,并且在 O(1) 时间内发生
push_back
。在前面,它需要做相反的事情,从每个push_front
开始增长。实际上,通过将指针添加到前面的元素以及增长方向和大小,可以轻松实现这一点。通过这个简单的修改,push_front
也可以是 O(1) 时间。访问任何元素都需要偏移并划分到正确的外部向量索引(发生在 O(1) 中),并索引到内部向量(也是 O(1))。这假设内部向量都是固定大小的,除了位于双端队列开头或结尾的向量。
Imagine it as a vector of vectors. Only they aren't standard
std::vector
s.The outer vector contains pointers to the inner vectors. When its capacity is changed via reallocation, rather than allocating all of the empty space to the end as
std::vector
does, it splits the empty space to equal parts at the beginning and the end of the vector. This allowspush_front
andpush_back
on this vector to both occur in amortized O(1) time.The inner vector behavior needs to change depending on whether it's at the front or the back of the
deque
. At the back it can behave as a standardstd::vector
where it grows at the end, andpush_back
occurs in O(1) time. At the front it needs to do the opposite, growing at the beginning with eachpush_front
. In practice this is easily achieved by adding a pointer to the front element and the direction of growth along with the size. With this simple modificationpush_front
can also be O(1) time.Access to any element requires offsetting and dividing to the proper outer vector index which occurs in O(1), and indexing into the inner vector which is also O(1). This assumes that the inner vectors are all fixed size, except for the ones at the beginning or the end of the
deque
.(这是我在另一个线程中给出的答案。本质上,我认为即使是相当幼稚的实现,使用单个
向量
,符合“constant non-amortized push_{front,back}”的要求,你可能会感到惊讶,并认为这是不可能的,但我在 中找到了其他相关的引用。以令人惊讶的方式定义上下文的标准;如果我在这个答案中犯了错误,那么识别我所说的哪些内容以及我的逻辑在哪里崩溃将非常有帮助)。 ,我并不是想找出一个好的实现,我只是想帮助我们解释 C++ 标准中的复杂性要求。我引用N3242,根据维基百科,最新的免费 C++11 标准化文档。 (它的组织方式似乎与最终标准不同,因此我不会引用确切的页码。当然,这些规则可能在最终标准中发生了变化,但我认为这种情况没有发生。
) code>deque可以通过使用
vector
正确实现。所有元素都被复制到堆上,指针存储在向量中。 (稍后将详细介绍向量)。为什么使用
T*
而不是T
?因为标准要求(我的重点)。
T*
有助于满足这一点。它还帮助我们满足这一点:现在来说说(有争议的)部分。为什么使用
向量
来存储T*
?它为我们提供了随机访问,这是一个好的开始。让我们暂时忘记向量的复杂性,并仔细构建这一点:标准谈论“所包含对象上的操作数量”。对于
deque::push_front
来说,这显然是 1,因为恰好构造了一个T
对象,并且在任何一个中读取或扫描了零个现有的T
对象。方式。这个数字 1 显然是一个常量,并且与当前双端队列中的对象数量无关。这让我们可以这样说:“对于我们的
deque::push_front
,所包含对象(Ts)上的操作数量是固定的,并且与双端队列中已有的对象数量无关。”当然,
T*
上的操作数就不会那么乖了。当向量
变得太大时,它将被重新分配,并且许多T*
将被复制。所以,是的,T*
上的操作数量会有很大差异,但T
上的操作数量不会受到影响。为什么我们要关心
T
上的计数操作和T*
上的计数操作之间的区别?这是因为标准说:对于
deque
,包含的对象是T
,而不是T*
,这意味着我们可以忽略任何复制(或重新分配)a的操作T*
。我没有过多谈论向量在双端队列中的行为。也许我们会将其解释为循环缓冲区(向量始终占用其最大容量(),然后当向量已满时将所有内容重新分配到更大的缓冲区中。细节并不重要。
在前面几段中,我们分析了
deque::push_front
以及双端队列中已经存在的对象数量与 push_front 对包含的T
执行的操作数量之间的关系-我们发现它们是物体。 由于标准要求复杂性以T
上的操作为单位,因此我们可以说它具有恒定的复杂性。是的, Operations-On-T*-Complexity 已摊销(由于
向量
),但我们只对 Operations-On-T-Complexity 感兴趣这是恒定的(非摊销)。在此实现中,向量::push_back 或向量::push_front 的复杂性无关紧要;这些考虑因素涉及
T*
上的操作,因此无关紧要。如果该标准指的是复杂性的“传统”理论概念,那么他们就不会明确地将自己限制在“所包含对象上的操作数量”。我是否过度解读了这句话?(This is an answer I've given in another thread. Essentially I'm arguing that even fairly naive implementations, using a single
vector
, conform to the requirements of "constant non-amortized push_{front,back}". You might be surprised, and think this is impossible, but I have found other relevant quotes in the standard that define the context in a surprising way. Please bear with me; if I have made a mistake in this answer, it would be very helpful to identify which things I have said correctly and where my logic has broken down. )In this answer, I am not trying to identify a good implementation, I'm merely trying to help us to interpret the complexity requirements in the C++ standard. I'm quoting from N3242, which is, according to Wikipedia, the latest freely available C++11 standardization document. (It appears to be organized differently from the final standard, and hence I won't quote the exact page numbers. Of course, these rules might have changed in the final standard, but I don't think that has happened.)
A
deque<T>
could be implemented correctly by using avector<T*>
. All the elements are copied onto the heap and the pointers stored in a vector. (More on the vector later).Why
T*
instead ofT
? Because the standard requires that(my emphasis). The
T*
helps to satisfy that. It also helps us to satisfy this:Now for the (controversial) bit. Why use a
vector
to store theT*
? It gives us random access, which is a good start. Let's forget about the complexity of vector for a moment and build up to this carefully:The standard talks about "the number of operations on the contained objects.". For
deque::push_front
this is clearly 1 because exactly oneT
object is constructed and zero of the existingT
objects are read or scanned in any way. This number, 1, is clearly a constant and is independent of the number of objects currently in the deque. This allows us to say that:'For our
deque::push_front
, the number of operations on the contained objects (the Ts) is fixed and is independent of the number of objects already in the deque.'Of course, the number of operations on the
T*
will not be so well-behaved. When thevector<T*>
grows too big, it'll be realloced and manyT*
s will be copied around. So yes, the number of operations on theT*
will vary wildly, but the number of operations onT
will not be affected.Why do we care about this distinction between counting operations on
T
and counting operations onT*
? It's because the standard says:For the
deque
, the contained objects are theT
, not theT*
, meaning we can ignore any operation which copies (or reallocs) aT*
.I haven't said much about how a vector would behave in a deque. Perhaps we would interpret it as a circular buffer (with the vector always taking up its maximum
capacity()
, and then realloc everything into a bigger buffer when the vector is full. The details don't matter.In the last few paragraphs, we have analyzed
deque::push_front
and the relationship between the number of objects in the deque already and the number of operations performed by push_front on containedT
-objects. And we found they were independent of each other. As the standard mandates that complexity is in terms of operations-on-T
, then we can say this has constant complexity.Yes, the Operations-On-T*-Complexity is amortized (due to the
vector
), but we're only interested in the Operations-On-T-Complexity and this is constant (non-amortized).The complexity of vector::push_back or vector::push_front is irrelevant in this implementation; those considerations involve operations on
T*
and hence are irrelevant. If the standard was referring to the 'conventional' theoretical notion of complexity, then they wouldn't have explicitly restricted themselves to the "number of operations on the contained objects". Am I overinterpreting that sentence?可以向任一方向增长的容器。
双端队列通常被实现为数组的
向量
(数组的链接列表不能提供恒定时间随机访问)。数组的大小取决于实现,通常以字节为单位的常量大小(libc++ 为 4096,libstdc++ 为 512,MS STL 为 16)。这种方法意味着,对于大于固定大小一半的任何元素,双端队列实际上成为指向单个元素分配的指针向量。
A container which can grow in either direction.
Deque is typically implemented as a
vector
of arrays (a linked-list of arrays can't give constant time random access). The size of the arrays is implementation dependent, commonly a constant size in bytes (4096 for libc++, 512 for libstdc++, 16 for MS STL).This approach means that with any element larger than half the fixed size, the deque effectively becomes a vector of pointers to single element allocations.
我正在阅读 Adam Drozdek 的《C++ 中的数据结构和算法》,发现这很有用。
HTH。
您可以注意到中间是指向数据的指针数组(右侧的块),并且您还可以注意到中间的数组是动态变化的。
一张图片胜过一千个文字。
I was reading "Data structures and algorithms in C++" by Adam Drozdek, and found this useful.
HTH.
You can notice in the middle is the array of pointers to the data (chunks on the right), and also you can notice that the array in the middle is dynamically changing.
An image is worth a thousand words.
虽然该标准没有强制要求任何特定的实现(仅恒定时间随机访问),但双端队列通常被实现为连续内存“页面”的集合。新页面会根据需要分配,但您仍然可以随机访问。与 std::vector 不同,我们不保证数据是连续存储的,但与向量一样,中间的插入需要大量的重新定位。
While the standard doesn't mandate any particular implementation (only constant-time random access), a deque is usually implemented as a collection of contiguous memory "pages". New pages are allocated as needed, but you still have random access. Unlike
std::vector
, you're not promised that data is stored contiguously, but like vector, insertions in the middle require lots of relocating.deque
可以实现为固定大小数组的循环缓冲区:deque
could be implemented as a circular buffer of fixed size array:双端队列是“双端队列”的缩写,是 C++ 标准模板库 (STL) 中的通用数据结构。它允许在前端和后端高效地插入和删除元素。
双端队列通常被实现为内存块的集合,这使得它们能够从两端高效地增长或收缩。
A deque, short for "double-ended queue," is a versatile data structure in the C++ Standard Template Library (STL). It allows for efficient insertion and deletion of elements at both the front and back ends.
Deques are usually implemented as a collection of memory blocks, which allows them to grow or shrink from both ends efficiently.