STL Set:以最高效的方式插入两百万个有序数
对于下面的批量插入,由于输入是有序的,是否有任何(轻微的)优化?
set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
primes.insert(i);
}
// then follows Sieve of Eratosthenes algorithm
新改进,速度提高了一倍:
set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
primes.insert(primes.end(), i);
}
// then follows Sieve of Eratosthenes algorithm
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
来自 http://www.cplusplus.com/reference/stl/set/set/< /a>
这应该有 O(n) 复杂度,而你的复杂度是 O(n*log(n)) 。
引用的链接说,当您使用基于迭代器的构造函数时,如果迭代器已排序,那么您将获得线性。但是,我没有看到任何关于如何显式通知构造函数它是一个排序迭代器的明确内容。
对于更干净的东西,我宁愿使用 boost 的计数迭代器。这将缩短为:
From http://www.cplusplus.com/reference/stl/set/set/
That should have O(n) complexity, where as yours has O(n*log(n)) complexity.
The referenced link says that when you use the iterator based constructor, if the iterator is sorted, then you get linear. However, I'm not seeing anything explicit on howto explicitly inform the constructor that it's a sorted iterator.
For something cleaner, I'd rather use boost's counting iterator though. That shortens this to:
insert
方法有一个重载版本,它采用迭代器作为第一个参数。该迭代器用作提示,即比较将从该迭代器开始。因此,如果您传递正确的迭代器作为提示,那么您应该在集合上进行非常有效的插入操作。有关详细信息,请参阅此处。我相信在你的情况下,numbers.end() -1
将是一个不错的选择。There is a overloaded version of
insert
method available which takes an iterator as the first parameter. This iterator is used as the hint i.e. the comparison will start from this iterator. So if you pass the proper iterator as the hint, then you should have a very efficient insert operation on the set. See here for details. I believe in your case,numbers.end() -1
will be a good option.如果您从全部候选
int
开始,我根本不会使用set
,我会使用vector
>,将它们全部初始化为false
,并将目标(质数)标记为true
,因为它们所在的位置。您可以使用当前候选
int
- 1(候选范围 = 1..200000)对此进行索引。然后使用find
进行迭代来定位真实值,并使用偏移量与begin()
转换为int
。vector
是一种特殊情况,请参阅 @Greg Rogers 的评论)与set 的 >= 800000
,折扣 BTree 开销。If you are starting with the full range of candidate
int
s, I would not use aset
at all, I would use avector<bool>
, init them all tofalse
, and mark the targets (primes) astrue
as they are located.You can index this using the current candidate
int
- 1 (candidate range = 1..200000). Then iterate usingfind
to locate the true values, converting toint
by using the offset versusbegin()
.vector<bool>
is a special case, see comment from @Greg Rogers) vs >= 800000 forset<int>
, discounting BTree overhead.以下是一些:
实现一个自定义的 stl 分配器,用于预先保留内存。
如果您使用矢量解决方案,请致电保留。
如果您只是要实现筛子,请使用(或实现)增强计数迭代器并且仅将结果存储在向量中,该向量调用reserve。
Here's a few:
implement a custom stl allocator that does the reservations for memory upfront.
If you use the vector solution, call reserve.
If you're just going to to implement the sieve use (or implement) a boost counting iterator and only store the results in a vector, which calls reserve.