为什么erase()函数如此昂贵?

发布于 2024-10-11 16:22:44 字数 689 浏览 3 评论 0原文

考虑一个 2d 向量 vector <向量> N并且假设它的内容如下:

1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4

所以这里N的大小是4,即N.size() = 4

现在,考虑下面的代码:

int i = 0;
while(N != empty()){
N.erase(i);
++i;
}

我计算了时间这段代码单独具有不同大小的 N,结果如下:

N 的大小为 1000 执行时间:0.230000s

N的大小为10000 执行时间:22.900000s

N的大小为20000 执行时间:91.760000s

N的大小为30000 执行时间:206.620000s

N的大小为47895 执行时间:526.540000s

我的问题是为什么这个函数如此昂贵?如果是这样,那么许多程序中的条件擦除语句可能会因为这个函数而永远花费下去。当我在 std::map 中使用擦除函数时也是同样的情况。这个功能有什么替代方案吗?其他库(例如 Boost)是否提供任何功能?

请不要说我可以整体执行 N.erase() 因为我只是想分析这个函数。

Consider a 2d vector vector < vector <int> > Nand lets say its contents are as follows:

1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4

So the size of N here is 4 i.e. N.size() = 4

Now, consider the following code :

int i = 0;
while(N != empty()){
N.erase(i);
++i;
}

I calculated the time just for this piece of code alone with various sizes for N and following are the results:

The size of N is 1000
Execution Time: 0.230000s

The size of N is 10000
Execution Time: 22.900000s

The size of N is 20000
Execution Time: 91.760000s

The size of N is 30000
Execution Time: 206.620000s

The size of N is 47895
Execution Time: 526.540000s

My question is why is this function so expensive ? If it is so then conditional erase statements in many programs could take forever just because of this function. It is the same case when I use erase function in std::map too. Is there any alternative for this function. Does other libraries like Boost offer any?

Please do not say I could do N.erase() as a whole because I'm just trying to analyze this function.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

要走干脆点 2024-10-18 16:22:45

向量是一个数组,当您向其中添加元素时,它会自动增长。因此,向量中的元素在内存中是连续的。这允许对元素进行恒定的时间访问。因为它们是从末尾开始增长的,所以它们也需要摊销常数时间来添加或删除到末尾。

现在,当你从中间移除时会发生什么?嗯,这意味着擦除元素后存在的任何内容都必须向后移动一个位置。这是非常昂贵的。

如果您想在中间进行大量插入/删除操作,请使用链表,例如 std::deque 的 std::list。

A vector is an array that grows automatically as you add elements to it. As such, elements in a vector a contiguous in memory. This allows constant time access to an element. Because they grow from the end, they also take amortized constant time to add or remove to/from the end.

Now, what happens when you remove in the middle? Well, it means whatever exists after the erased element must be shifted back one position. This is very expensive.

If you want to do lots of insertion/removal in the middle, use a linked list such as std::list of std::deque.

花期渐远 2024-10-18 16:22:45

正如 Oli 所说,从向量的第一个元素中删除意味着必须向下复制其后面的元素,以便数组能够按预期运行。

这就是为什么链表用于从列表中的随机位置删除元素的情况 - 它更快(在较大的列表上),因为没有复制,仅重置一些节点指针。

As Oli said, erasing from the first element of a vector means the elements following it have to be copied down in order for the array to behave as desired.

This is why linked lists are used for situations in which elements will be removed from random locations in the list - it is quicker (on larger lists) because there is no copying, only resetting some node pointers.

野鹿林 2024-10-18 16:22:44

考虑一下当删除向量的第一个元素时会发生什么。向量的其余部分必须向下“移动”一个索引,这涉及到复制它。尝试从另一端擦除,看看这是否会产生影响(我怀疑它会......)

Consider what happens when you delete the first element of a vector. The rest of the vector must be "moved" down by one index, which involves copying it. Try erasing from the other end, and see if that makes a difference (I suspect it will...)

痴梦一场 2024-10-18 16:22:44

因为你的算法是O(n^2)。每次调用 erase 都会强制 vector 将已擦除元素之后的所有元素移回原位。因此,在具有 4 元素向量的循环中,第一个循环导致 3 个元素移动,第二次迭代导致 1 个元素移动,之后您将出现未定义的行为。

如果有 8 个元素,则第一次迭代将移动 7 个元素,下一次迭代将移动 5 个元素,下一次迭代将移动 3 个元素,最后的枚举将移动 1 个元素。 (同样,你有未定义的行为)

当你遇到这样的情况时,通常你应该使用标准算法(即 std::removestd::remove_if),因为它们运行一次容器并将典型的 O(n^2) 算法转换为 O(n) 算法。有关详细信息,请参阅 Scott Meyers 的“有效 STL”第 43 项:优先使用算法调用而不是显式循环。

Because your algorithm is O(n^2). Each call to erase forces the vector to move all elements after the erased element back. So in your loop with the 4 element vector, the first loop causes 3 elements to be shifted, the second iteration causes 1 element to be shifted, and after that you have undefined behavior.

If you had 8 elements, the first iteration would move 7 elements, the next would move 5 elements, the next would move 3 elements, and the final enumeration would move 1 element. (And again you have undefined behavior)

When you encounter situations like this, generally you should use the standard algorithms (i.e. std::remove, std::remove_if) instead, as they run through the container once and turn typical O(n^2) algorithms into O(n) algorithms. For more information see Scott Meyers' "Effective STL" Item 43: Prefer Algorithm Calls to Explicit Loops.

梦里°也失望 2024-10-18 16:22:44

std::vector 在内部只是一个元素数组。如果删除中间的一个元素,则该元素后面的所有元素都必须下移。这可能非常昂贵 - 如果元素有一个可以完成大量工作的自定义 operator= ,则成本会更高!

如果您需要 erase() 速度更快,您应该使用 std::list - 这将使用双向链表结构,允许从中间快速擦除(但是,其他操作会稍微慢一些)。如果您只需要快速从列表的开始中删除,请使用std::deque - 这会创建一个数组的链接列表,并提供 std::vector 的大部分速度优势,同时仍然允许快速擦除仅开始或结束。

此外,请注意,您的循环使问题变得更糟 - 您首先扫描所有等于零的元素并删除它们。扫描需要 O(n) 时间,擦除也需要 O(n) 时间。然后重复 1,依此类推 - 总体时间为 O(n^2)。如果您需要删除多个值,您应该使用迭代器并使用 erase()。或者,如果您使用向量,您会发现复制到新向量中的速度会更快。

至于 std::map (和 std::set) - 这根本不是问题。 std::map 既能够随机删除元素,也能够随机搜索元素,时间为 O(lg n) - 这对于大多数用途来说是相当合理的。即使你的天真循环也不应该太糟糕;手动迭代并一次性删除您想要删除的所有内容在某种程度上更有效,但远没有达到 std::list 和朋友的程度。

A std::vector is, internally, just an array of elements. If you delete an element in the middle, all the elements after it have to be shifted down. This can be very expensive - even more so if the elements have a custom operator= that does a lot of work!

If you need erase() to be fast, you should use a std::list - this will use a doubly linked list structure that allows fast erasure from the middle (however, other operations get somewhat slower). If you just need to remove from the start of the list quickly, use std::deque - this creates a linked list of arrays, and offers most of the speed advantages of std::vector while still allowing fast erasures from the beginning or end only.

Furthermore, note that your loop there makes the problem worse - you first scan through all elements equal to zero and erase them. The scan takes O(n) time, the erasure also O(n) time. You then repeat for 1, and so on - overall, O(n^2) time. If you need to erase multiple values, you should take an iterator and go through the std::list yourself, using the iterator variant of erase(). Or if you use a vector, you'll find it can be faster to copy into a new vector.

As for std::map (and std::set) - this isn't a problem at all. std::map is capable of both removing elements at random, as well as searching for elements at random, with O(lg n) time - which is quite reasonable for most uses. Even your naive loop there shouldn't be too bad; manually iterating through and removing everything you want to remove in one pass is somewhat more efficient, but not nearly to the extent that it is with std::list and friends.

心凉怎暖 2024-10-18 16:22:44

vector.erase 会将 i 之后的所有元素向前推进 1。这是一个 O(n) 操作。

此外,您是按值而不是按引用传递向量。

您的代码也不会删除整个向量。

例如:
我=0
擦除N[0]
N = {{2, 2, 2, 2}, {3, 3, 3, 3}, {4, 4, 4, 4}}

i = 1
擦除N[1]
N = {{2, 2, 2, 2}, {4, 4, 4, 4}}

i = 2
擦除 N[2] 没有任何反应,因为最大索引是 N[1]

最后,我认为这是 vector.erase() 的正确语法。您需要将迭代器传递到开始位置以删除所需的元素。
试试这个:

vector<vector<int>> vectors; // still passing by value so it'll be slow, but at least erases everything
for(int i = 0; i < 1000; ++i)
{
    vector<int> temp;
    for(int j = 0; j < 1000; ++j)
    {
        temp.push_back(i);
    }
    vectors.push_back(temp);
}

// erase starting from the beginning
while(!vectors.empty())
{
    vectors.erase(vectors.begin());
}

您还可以将其与从末尾擦除进行比较(它应该明显更快,特别是在使用值而不是引用时):

// just replace the while-loop at the end
while(!vectors.empty())
{
    vectors.erase(vectors.end()-1);
}

vector.erase will advance all elements after i forward by 1. This is an O(n) operation.

Additionally, you're passing vectors by value rather than by reference.

Your code also doesn't erase the entire vector.

For example:
i = 0
erase N[0]
N = {{2, 2, 2, 2}, {3, 3, 3, 3}, {4, 4, 4, 4}}

i = 1
erase N[1]
N = {{2, 2, 2, 2}, {4, 4, 4, 4}}

i = 2
erase N[2] nothing happens because the maximum index is N[1]

Lastly, I don' think that's the correct syntax for vector.erase(). You need to pass in an iterator to the begin location to erase the element you want.
Try this:

vector<vector<int>> vectors; // still passing by value so it'll be slow, but at least erases everything
for(int i = 0; i < 1000; ++i)
{
    vector<int> temp;
    for(int j = 0; j < 1000; ++j)
    {
        temp.push_back(i);
    }
    vectors.push_back(temp);
}

// erase starting from the beginning
while(!vectors.empty())
{
    vectors.erase(vectors.begin());
}

You can also compare this to erasing from the end (it should be significantly faster, especially when using values rather than references):

// just replace the while-loop at the end
while(!vectors.empty())
{
    vectors.erase(vectors.end()-1);
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文