STL Set:以最高效的方式插入两百万个有序数

发布于 2024-09-28 03:49:42 字数 406 浏览 0 评论 0 原文

对于下面的批量插入,由于输入是有序的,是否有任何(轻微的)优化?

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(i);
}
// then follows Sieve of Eratosthenes algorithm

新改进,速度提高了一倍:

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(primes.end(), i);
}
// then follows Sieve of Eratosthenes algorithm

For the following mass-insert, because the inputs are ordered, are there any (slight) optimizations?

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(i);
}
// then follows Sieve of Eratosthenes algorithm

New improvement, twice as fast:

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(primes.end(), i);
}
// then follows Sieve of Eratosthenes algorithm

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

何以畏孤独 2024-10-05 03:49:42

来自 http://www.cplusplus.com/reference/stl/set/set/< /a>

vector<int> numbers;
for (int i=2; i<=2000000; ++i)
    numbers.push_back(i);

set<int> primes(numbers.begin(), numbers.end());

这应该有 O(n) 复杂度,而你的复杂度是 O(n*log(n)) 。

引用的链接说,当您使用基于迭代器的构造函数时,如果迭代器已排序,那么您将获得线性。但是,我没有看到任何关于如何显式通知构造函数它是一个排序迭代器的明确内容。


对于更干净的东西,我宁愿使用 boost 的计数迭代器。这将缩短为:

set<int> primes(
    boost::counting_iterator<int>(0),
    boost::counting_iterator<int>(200000));

From http://www.cplusplus.com/reference/stl/set/set/

vector<int> numbers;
for (int i=2; i<=2000000; ++i)
    numbers.push_back(i);

set<int> primes(numbers.begin(), numbers.end());

That should have O(n) complexity, where as yours has O(n*log(n)) complexity.

The referenced link says that when you use the iterator based constructor, if the iterator is sorted, then you get linear. However, I'm not seeing anything explicit on howto explicitly inform the constructor that it's a sorted iterator.


For something cleaner, I'd rather use boost's counting iterator though. That shortens this to:

set<int> primes(
    boost::counting_iterator<int>(0),
    boost::counting_iterator<int>(200000));
恋竹姑娘 2024-10-05 03:49:42

insert 方法有一个重载版本,它采用迭代器作为第一个参数。该迭代器用作提示,即比较将从该迭代器开始。因此,如果您传递正确的迭代器作为提示,那么您应该在集合上进行非常有效的插入操作。有关详细信息,请参阅此处。我相信在你的情况下,numbers.end() -1将是一个不错的选择。

There is a overloaded version of insert method available which takes an iterator as the first parameter. This iterator is used as the hint i.e. the comparison will start from this iterator. So if you pass the proper iterator as the hint, then you should have a very efficient insert operation on the set. See here for details. I believe in your case, numbers.end() -1 will be a good option.

古镇旧梦 2024-10-05 03:49:42

如果您从全部候选 int 开始,我根本不会使用 set,我会使用 vector >,将它们全部初始化为 false,并将目标(质数)标记为 true,因为它们所在的位置。

vector<bool> candidates(200000);

您可以使用当前候选 int - 1(候选范围 = 1..200000)对此进行索引。然后使用 find 进行迭代来定位真实值,并使用偏移量与 begin() 转换为 int

  • 内存分配总数 - 1.
  • 总内存使用量 25000 字节(vector 是一种特殊情况,请参阅 @Greg Rogers 的评论)与 set 的 >= 800000 ,折扣 BTree 开销。
  • 所有元素访问都是通过指针算术进行的。

If you are starting with the full range of candidate ints, I would not use a set at all, I would use a vector<bool>, init them all to false, and mark the targets (primes) as true as they are located.

vector<bool> candidates(200000);

You can index this using the current candidate int - 1 (candidate range = 1..200000). Then iterate using find to locate the true values, converting to int by using the offset versus begin().

  • Total number of memory allocations - 1.
  • Total memory usage 25000 bytes (vector<bool> is a special case, see comment from @Greg Rogers) vs >= 800000 for set<int>, discounting BTree overhead.
  • All element access is via pointer arithmetic.
拥抱没勇气 2024-10-05 03:49:42

以下是一些:

  1. 实现一个自定义的 stl 分配器,用于预先保留内存。

  2. 如果您使用矢量解决方案,请致电保留。

  3. 如果您只是要实现筛子,请使用(或实现)增强计数迭代器并且仅将结果存储在向量中,该向量调用reserve。

Here's a few:

  1. implement a custom stl allocator that does the reservations for memory upfront.

  2. If you use the vector solution, call reserve.

  3. If you're just going to to implement the sieve use (or implement) a boost counting iterator and only store the results in a vector, which calls reserve.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文