Finding the range of smallest elements
Basically a follow-up question to my old question here: Choosing specific objects satisfying conditions
I have objects called variables that can either be assigned or unassigned. Every variable has a domain, i.e. the set of values the variable can be assigned. For example, a variable might take on values from 1 to 32. I can query the domain size of a variable in constant time with a highly optimized method.
One important operation is to find the range of unassigned variables with the smallest domain sizes. For this I keep a vector of pointers to all variable objects. Bookkeeping is done to maintain the invariant that pointers to all unassigned variables come before pointers to all assigned ones.
As time steps go by, the variables change dynamically: their domain sizes change, and unassigned variables may become assigned and vice versa. When a variable's state changes, pointers are swapped so that the invariant holds again. This means that the size of the vector stays the same throughout the execution of the program.
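The swap-based bookkeeping can be sketched roughly like this. The `Variable` struct and the function name are made up for illustration; the real objects are only assumed to expose an assigned flag:

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the real variable objects.
struct Variable {
    int domainSize;   // e.g. 1..32
    bool assigned;
};

// Restore the invariant with a single swap after vars[i] flips state.
// firstAssigned = index of the first assigned pointer (== number of
// unassigned variables), so [0, firstAssigned) is the unassigned block.
void onStateChange(std::vector<Variable*>& vars, std::size_t& firstAssigned,
                   std::size_t i) {
    if (vars[i]->assigned && i < firstAssigned) {
        // Newly assigned: move it to the end of the unassigned block.
        std::swap(vars[i], vars[firstAssigned - 1]);
        --firstAssigned;
    } else if (!vars[i]->assigned && i >= firstAssigned) {
        // Newly unassigned: move it to the front of the assigned block.
        std::swap(vars[i], vars[firstAssigned]);
        ++firstAssigned;
    }
}
```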
Now to my question: how to efficiently find the range of unassigned variables that have the smallest domain size. What I am doing now: I maintain the invariant in the vector of pointers, and whenever I need to find the range of elements, I fully sort the part of the vector containing the pointers to unassigned variables.
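A minimal sketch of this sort step, again with a hypothetical `Variable` stand-in for the real objects:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the real variable objects.
struct Variable {
    int domainSize;
    bool assigned;
};

// Sort only the unassigned prefix [0, firstAssigned) by domain size --
// this is the step the profiler shows as the hot spot.
void sortUnassignedPrefix(std::vector<Variable*>& vars,
                          std::size_t firstAssigned) {
    std::sort(vars.begin(), vars.begin() + firstAssigned,
              [](const Variable* a, const Variable* b) {
                  return a->domainSize < b->domainSize;
              });
}
```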
This is incredibly slow. According to the Visual Studio profiler, around 50% of total time is spent sorting. The ranges of pointers I am sorting are quite small, at most 500-600 elements. Is there a plausible reason for this?
Before this scheme I was actually using a method that did not rely on sorting. It felt like it should be slower, and asymptotically it was. Yet I am seeing a very big negative impact on performance with my new scheme, and sorting is doing the biggest share of the work.
Is there anything I could try? Sorting 500-600 elements should be nothing on modern hardware, right (especially with std::sort)? std::partial_sort doesn't really change the situation either.
I am a bit unsure whether I can provide any more useful info, but please let me know if there is something. I am really stunned by this phenomenon, whatever is causing it. I am using C++ if that matters, but I suppose this could also be seen as a language-independent question.
EDIT: Here's a simplified example:
A zero in this vector designates an assigned variable; non-zero elements are unassigned variables with that domain size. Let's say my vector looks like this:
{1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 0, 0, 0, 0, 0}
Then my algorithm takes one step and changes the states of the variables. The vector then looks like
{2, 0, 3, 3, 3, 4, 4, 3, 3, 3, 4, 5, 0, 0, 1, 0, 0}
Right now the invariant is violated: there is an assigned variable (the 0 at index 1) in the front part, and an unassigned one (the 1 near the end) inside the assigned block. I need to do something to correct this. Then I need to find the range of elements with the smallest values. Fixing and sorting then produces, for example:
{1, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 0, 0, 0, 0, 0}
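Since only the range of smallest values is needed, a full sort might be avoidable. A hedged sketch on the plain-int version of the example, using `std::stable_partition` to restore the invariant and then surface only the minimal elements (the function name is mine):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Restore the invariant (non-zero values first) and return the index range
// [first, last) of unassigned entries holding the smallest value. Two
// stable partitions plus one min scan -- O(n), no comparison sort.
std::pair<std::size_t, std::size_t> smallestRange(std::vector<int>& v) {
    // Move unassigned (non-zero) values to the front, keeping their order.
    auto firstZero = std::stable_partition(v.begin(), v.end(),
                                           [](int x) { return x != 0; });
    if (v.begin() == firstZero) return {0, 0};   // everything is assigned
    int minVal = *std::min_element(v.begin(), firstZero);
    // Move all minimal elements to the front; they form the answer range.
    auto endMin = std::stable_partition(v.begin(), firstZero,
                                        [minVal](int x) { return x == minVal; });
    return {0, static_cast<std::size_t>(endMin - v.begin())};
}
```

Note this does not produce the fully sorted vector shown above, only the invariant plus the minimal block at the front, which is all the stated operation requires.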
EDIT2:
Additional details about the problem. There are not many distinct values for the unassigned variables, and I know the range in advance: the maximum value is always at most 32, and 1 is always the lowest value. Changes may occur anywhere in the set of elements, but when they do, each changing value changes by at most 1 in one step. And yes, I need to find all the elements with the lowest value.
Yes, I believe the reordering can be performed when the bookkeeping is done. We can assume that we start from a sorted range for which the invariant holds. But then comes the question: how does one actually go about the reordering? Probably I should check which elements have changed and somehow let them float to their right places. Any ideas how to do this (at the code level)? And is the data structure itself a good idea in the first place?
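One possible answer to the "float into place" question: because each value moves by at most 1 per step, the prefix stays nearly sorted, and a plain insertion-sort pass runs in O(n + inversions), which is close to linear here. A sketch on the int version of the example (the real code would compare through pointers):

```cpp
#include <vector>

// Insertion sort over v[0..n): each element "floats" left until it is in
// place. On a range where every value changed by at most 1 since the last
// sorted state, very few shifts are needed.
void floatIntoPlace(std::vector<int>& v, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];   // shift larger elements one slot right
            --j;
        }
        v[j] = key;
    }
}
```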
The first EDIT was a bit misleading; it is now corrected. So the ranges are always at most 500-600 elements, with many consecutive equal values in a small range (at most 1 to 32).
EDIT3:
If anyone is curious, I'll explain the previous approach, which was much faster and did not rely on sorting. I believe all of this can be done even more efficiently, which is why I wanted to try switching to another scheme.
The idea was as follows: at every point, I would have a vector of pointers that contains only the elements with the smallest domain value. Then the algorithm does its work and something happens to the variables. First, I do a linear scan through all variables and check to see if the smallest domain value has changed. If it has, I rebuild the vector of elements. If it has not, I check the small range of pointers and either add or remove elements to make the invariant hold again (the container holds all the unassigned variables with the smallest domain).
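A rough sketch of the rebuild step of that old scheme, again with a hypothetical `Variable` struct and a function name of my own:

```cpp
#include <climits>
#include <vector>

// Hypothetical stand-in for the real variable objects.
struct Variable {
    int domainSize;
    bool assigned;
};

// One linear scan: find the smallest domain among unassigned variables and
// collect pointers to exactly those variables -- the rebuild performed when
// the minimum domain value has changed.
std::vector<Variable*> rebuildMinSet(std::vector<Variable>& all) {
    int minSize = INT_MAX;
    for (const Variable& v : all)
        if (!v.assigned && v.domainSize < minSize) minSize = v.domainSize;
    std::vector<Variable*> minSet;
    for (Variable& v : all)
        if (!v.assigned && v.domainSize == minSize) minSet.push_back(&v);
    return minSet;
}
```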
I have a gut feeling that the new scheme could be faster, since the size of the vector stays constant and elements are only swapped or rotated within it. I might be totally wrong though; it is very hard to say what is really efficient on modern hardware.
1 Answer
If I understood it correctly, you only want to sort about 500-600 pointers to objects, but it's too slow for you. First, what algorithm do you use? Some algorithms have almost linear complexity on mostly sorted data. If your data contains only a few unsorted elements, you might want to try insertion sort (it takes O(n^2) on average, but it works well on mostly sorted data), or TimSort (it's a little harder to implement, but it's really quick).
I don't know if the 1..32 range is just an example, but if you really work with such a small set of values, the fastest way to sort is to create 32 lists (linked lists, for example) and, whenever you encounter the value X, put it into the X-th list. When you're done, select the first non-empty list, and there you have it.
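What this describes is pigeonhole (bucket) sorting; with a dense 1..32 range, plain counters work instead of linked lists. A sketch, assuming every value in the range is between 1 and 32 (i.e. only the unassigned prefix is sorted):

```cpp
#include <array>
#include <vector>

// Pigeonhole sort over the known 1..32 value range: one counting pass plus
// one writing pass, O(n) total, no comparisons at all.
void pigeonholeSort(std::vector<int>& v) {
    std::array<int, 33> count{};          // count[x] for values x in 1..32
    for (int x : v) count[x]++;           // assumes 1 <= x <= 32
    std::size_t out = 0;
    for (int val = 1; val <= 32; ++val)
        for (int c = 0; c < count[val]; ++c)
            v[out++] = val;               // rewrite the range in order
}
```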