数据结构的选择
我使用 C++,假设我想存储 40 个用户名,我将简单地使用一个数组。 但是,如果我想存储 40000 个用户名,就搜索速度而言这仍然是一个好主意吗? 我应该使用哪种数据结构来提高这个速度?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
我使用 C++,假设我想存储 40 个用户名,我将简单地使用一个数组。 但是,如果我想存储 40000 个用户名,就搜索速度而言这仍然是一个好主意吗? 我应该使用哪种数据结构来提高这个速度?
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(5)
您需要指定插入和删除的要求。 是否需要在序列中的随机点删除和插入内容?
另外,为什么要求按顺序搜索? 您是否正在执行不适合哈希表查找的搜索?
目前我建议使用
deque
或list
。 通常,最好选择一个具有最简单的算法实现接口的容器,然后仅在性能不足且替代方案提供必要的加速时才更改选择。向量有两个主要优点,没有每个对象的内存开销,尽管向量会过度分配以防止频繁复制,并且对象是连续存储的,因此顺序访问往往会很快。 这些也是它的缺点。 增长的向量需要重新分配和复制,并且从向量末端以外的任何位置插入和删除也需要复制。 对于具有大量对象或大型对象的向量,连续存储可能会产生问题,因为即使只有轻微的内存碎片,也很难满足连续存储要求。
列表不需要连续存储,但列表节点通常有两个指针的每个对象开销(在大多数实现中)。 这在非常小的对象列表中可能很重要(例如,在指针列表中,每个节点是数据项大小的 3 倍)。 不过,从列表中间插入和删除的成本非常低,并且列表节点一旦创建就不需要在内存中移动。
双端队列使用分块存储,因此它与向量类似,每个对象的开销较低,但不需要在整个容器上连续存储,因此不存在碎片内存空间的相同问题。 它通常是一个非常好的收藏选择,但常常被忽视。
You need to specify what the insertion and removal requirements are. Do things need to be removed and inserted at random points in the sequence?
Also, why the requirement to search sequentially? Are you doing searches that aren't suitable for a hash table lookup?
At the moment I'd suggest a
deque
or alist
. Often it's best to choose a container with the interface that makes for the simplest implementation for your algorithm and then only change the choice if the performance is inadequate and an alternative provides the necessary speedup.A
vector
has two principle advantages, there is no per-object memory overhead, although vectors will over-allocate to prevent frequent copying and objects are stored contiguously so sequential access tends to be fast. These are also its disadvantages. Growing vectors require reallocation and copying, and insertion and removal from anywhere other than the end of the vector also require copying. Contiguous storage can produce problems for vectors with large numbers of objects or large objects as the contiguous storage requirements can be hard to satisfy even with only mild memory fragmentation.A
list
doesn't require contigous storage but list nodes usually have a per-object overhead of two pointers (in most implementation). This can be significant in list of very small objects (e.g. in a list of pointers, each node is 3x the size of the data item). Insertion and removal from the middle of a list is very cheap though and list nodes never need to me moved in memory once created.A
deque
uses chunked storage, so it has a low per-object overhead similar to a vector, but doesn't require contiguous storage over the whole container so doesn't have the same problem with fragmented memory spaces. It is often a very good choice for collections and is often overlooked.根据经验,与
list
相比,更喜欢使用vector
,或者,饮食禁止,C 风格数组。填充向量后,请确保使用
sort
算法。 然后,您可以使用
find
、binary_search
或lower_bound
。 (您不需要排序即可使用find
。)As a rule of thumb, prefer
vector
tolist
or, diety forbid, C-style array.After the vector is filled, make sure it is properly ordered using the
sort
algorithm. You can then search for a particular record using eitherfind
,binary_search
orlower_bound
. (You don't need to sort to usefind
.)说真的,除非您处于资源受限的环境(嵌入式平台、电话或其他)。 使用
std::map
,保存进行排序或搜索的努力,让容器来处理一切。 这可能是一个排序的树结构,可能是平衡的(例如红黑),这意味着您将获得良好的搜索性能。 除非数据的大小接近一两个指针的大小,否则您选择的任何数据结构的内存开销都是可以忽略不计的。 您的显卡可能有更多的内存,您将用完您正在考虑的数据。正如其他人所说,没有什么充分的理由使用普通数组,如果您不想使用
map
使用std::vector
或std:: list
取决于您是否需要插入/删除数据 (=>list) 或不需要 (=>vector)还要考虑如果您确实需要内存中的所有数据,如何通过 sqlite。 或者甚至使用 sqlite 进行内存访问。 这完全取决于您需要如何处理数据。
Seriously unless you are in a resource constrained environment (embedded platform, phone, or other). Use a
std::map
, save the effort of doing sorting or searching and let the container take care of everything. This will possibly be a sorted tree structure, probably balance (e.g. Red-Black), which means you will get good searching performance. Unless the size of you data is close to the size of one or two pointers, the memory overhead of whatever data structure you pick is negligable. You Graphics Card probably has more memory that you are going to use up for the data you are think about.As others said there is very little good reason to use vanilla array, if you don't want to use a
map
usestd::vector
orstd::list
depending on whether you need insert/delete data (=>list) or not (=>vector)Also consider if you really need all that data in memory, how about putting it on disk via sqlite. Or even use sqlite for in memory access. It all depends on what you need to do with your data.
std::vector 和 std::list 似乎很适合这项任务。 如果您事先知道最大记录数,则可以使用数组。
std::vector and std::list seem good for this task. You can use an array if you know the maximum number of records beforehands.
如果您只需要顺序搜索和存储,那么 list 是合适的容器。< br>
另外, vector 也不是一个糟糕的选择。
If you need only sequentially search and storage, then list is the proper container.
Also, vector wouldn't be a bad choice.