为什么在链表中查找循环时将指针增加 2,而不是 3、4、5?
我查看了问题已经讨论了在链表中查找循环的算法。我读过 Floyd 的循环查找算法解决方案,在很多地方都提到过,我们必须拿两个指针。一个指针(slower/tortoise)加1,另一个指针(faster/hare)加2。当它们相等时,我们发现循环,如果faster指针达到空,则链表中没有循环。
现在我的问题是为什么我们将更快的指针增加 2。为什么不做其他的事情呢?需要增加 2,或者我们可以将其增加 X 以获得结果。如果我们将 Faster 指针增加 2,是否有必要找到循环,或者可能存在需要增加 3、5 或 x 的情况。
I had a look at question already which talk about algorithm to find loop in a linked list. I have read Floyd's cycle-finding algorithm solution, mentioned at lot of places that we have to take two pointers. One pointer( slower/tortoise ) is increased by one and other pointer( faster/hare ) is increased by 2. When they are equal we find the loop and if faster pointer reaches null there is no loop in the linked list.
Now my question is why we increase faster pointer by 2. Why not something else? Increasing by 2 is necessary or we can increase it by X to get the result. Is it necessary that we will find a loop if we increment faster pointer by 2 or there can be the case where we need to increment by 3 or 5 or x.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(10)
从正确性的角度来看,没有理由需要使用数字 2。任何步长的选择都可以(当然,除了一个)。然而,选择大小为 2 的步长可以最大限度地提高效率。
为了明白这一点,我们首先来看看弗洛伊德算法为何有效。这个想法是考虑序列 x0, x1, x2, ..., xn >, ...如果您从列表的开头开始,然后继续沿着列表走,直到到达结尾,您将访问的链接列表的元素。如果列表不包含循环,则所有这些值都是不同的。但如果它确实包含循环,那么这个序列将无限重复。
以下是弗洛伊德算法发挥作用的定理:
让我们来证明这一点;这并不难。对于“if”情况,如果存在这样的 aj,则选择 k = 2。然后对于某些正 j,xj = x2j 且 j ≠ 2j,所以列表包含一个循环。
对于另一个方向,假设列表包含从位置 s 开始的长度为 l 的循环。令 j 为 l 大于 s 的最小倍数。那么对于任何k,如果我们考虑xj和xjk,由于j是循环长度的倍数,我们可以认为xjk > 作为从列表中的位置 j 开始,然后执行 j 步 k-1 次而形成的元素。但每次执行 j 步时,您最终都会回到列表中开始的位置,因为 j 是循环长度的倍数。因此,xj = xjk。
这个证明向您保证,如果您在每次迭代中采取任何恒定数量的步骤,您确实会遇到慢指针。更准确地说,如果您在每次迭代中执行 k 步,那么您最终将找到点 xj 和 xkj 并检测循环。直观上,人们倾向于选择 k = 2 来最小化运行时间,因为每次迭代所采取的步骤数最少。
我们可以更正式地分析运行时,如下所示。如果列表不包含循环,则快速指针将在 n 步后到达列表末尾,时间复杂度为 O(n),其中 n 是列表中元素的数量。否则,当慢指针走了j步后,两个指针就会相遇。请记住,j 是大于 s 的 l 的最小倍数。如果 s ≤ l,则 j = l;否则如果 s > l,则 j 最多为 2s,因此 j 的值为 O(s + l)。由于 l 和 s 不能大于列表中的元素数量,这意味着 j = O(n)。然而,在慢速指针走了 j 步之后,对于慢速指针所采取的 j 步,快指针将采取 k 步,因此它将采取 O(kj) 步。由于 j = O(n),因此净运行时间至多为 O(nk)。请注意,这表示我们使用快速指针执行的步骤越多,算法完成所需的时间就越长(尽管只是成比例)。因此选择 k = 2 可以最大限度地减少算法的整体运行时间。
希望这有帮助!
From a correctness perspective, there is no reason that you need to use the number two. Any choice of step size will work (except for one, of course). However, choosing a step of size two maximizes efficiency.
To see this, let's take a look at why Floyd's algorithm works in the first place. The idea is to think about the sequence x0, x1, x2, ..., xn, ... of the elements of the linked list that you'll visit if you start at the beginning of the list and then keep on walking down it until you reach the end. If the list does not contain a cycle, then all these values are distinct. If it does contain a cycle, though, then this sequence will repeat endlessly.
Here's the theorem that makes Floyd's algorithm work:
Let's go prove this; it's not that hard. For the "if" case, if such a j exists, pick k = 2. Then we have that for some positive j, xj = x2j and j ≠ 2j, and so the list contains a cycle.
For the other direction, assume that the list contains a cycle of length l starting at position s. Let j be the smallest multiple of l greater than s. Then for any k, if we consider xj and xjk, since j is a multiple of the loop length, we can think of xjk as the element formed by starting at position j in the list, then taking j steps k-1 times. But each of these times you take j steps, you end up right back where you started in the list because j is a multiple of the loop length. Consequently, xj = xjk.
This proof guarantees you that if you take any constant number of steps on each iteration, you will indeed hit the slow pointer. More precisely, if you're taking k steps on each iteration, then you will eventually find the points xj and xkj and will detect the cycle. Intuitively, people tend to pick k = 2 to minimize the runtime, since you take the fewest number of steps on each iteration.
We can analyze the runtime more formally as follows. If the list does not contain a cycle, then the fast pointer will hit the end of the list after n steps for O(n) time, where n is the number of elements in the list. Otherwise, the two pointers will meet after the slow pointer has taken j steps. Remember that j is the smallest multiple of l greater than s. If s ≤ l, then j = l; otherwise if s > l, then j will be at most 2s, and so the value of j is O(s + l). Since l and s can be no greater than the number of elements in the list, this means than j = O(n). However, after the slow pointer has taken j steps, the fast pointer will have taken k steps for each of the j steps taken by the slower pointer so it will have taken O(kj) steps. Since j = O(n), the net runtime is at most O(nk). Notice that this says that the more steps we take with the fast pointer, the longer the algorithm takes to finish (though only proportionally so). Picking k = 2 thus minimizes the overall runtime of the algorithm.
Hope this helps!
假设不包含循环的列表长度为
s
,循环长度为t
,fast_pointer_speed
与slow_pointer_speed
为k
。让两个指针在距离循环起点
j
处相交。因此,慢指针移动的距离 =
s + j
。快指针移动的距离 = s + j + m * t(其中 m 是快指针完成循环的次数)。但是,快指针也会移动一段距离k * (s + j)
(k
乘以慢指针的距离)。因此,我们得到
k * (s + j) = s + j + m * t
。s + j = (m / k-1)t
。因此,根据上式,慢指针行进的长度是循环长度的整数倍。
为了获得最大效率,
(m / k-1) = 1
(慢速指针不应多次遍历循环。)因此,
m = k - 1 => k = m + 1
由于
m
是快指针完成循环的次数,因此m >= 1
。为了获得最大效率,
m = 1
。因此
k = 2
。如果我们取 k > 的值2 ,两个指针必须移动的距离更大。
希望以上解释有帮助。
Let us suppose the length of the list which does not contain the loop be
s
, length of the loop bet
and the ratio offast_pointer_speed
toslow_pointer_speed
bek
.Let the two pointers meet at a distance
j
from the start of the loop.So, the distance slow pointer travels =
s + j
. Distance the fast pointer travels =s + j + m * t
(wherem
is the number of times the fast pointer has completed the loop). But, the fast pointer would also have traveled a distancek * (s + j)
(k
times the distance of the slow pointer).Therefore, we get
k * (s + j) = s + j + m * t
.s + j = (m / k-1)t
.Hence, from the above equation, length the slow pointer travels is an integer multiple of the loop length.
For greatest efficiency ,
(m / k-1) = 1
(the slow pointer shouldn't have traveled the loop more than once.)therefore ,
m = k - 1 => k = m + 1
Since
m
is the no.of times the fast pointer has completed the loop ,m >= 1
.For greatest efficiency ,
m = 1
.therefore
k = 2
.if we take a value of
k > 2
, more the distance the two pointers would have to travel.Hope the above explanation helps.
考虑一个大小为 L 的循环,这意味着第 k 个元素是循环的位置: xk -> xk+1 -> ...-> xk+L-1 -> xk。假设一个指针以 r1=1 的速率运行,另一个指针以 r2 的速率运行。当第一个指针到达 xk 时,第二个指针将已经在循环中的某个元素 xk+s 处,其中 0 <= s < L. m 进一步增加指针后,第一个指针位于 xk+(m mod L) 处,第二个指针位于 xk+((m*r2 +s) mod L)。因此,两个指针碰撞的条件可以表述为存在满足同余 m
= m*r2 + s (mod L)
的 m ,这可以通过以下步骤简化
m(1 -r2) = s (mod L)
m(L+1-r2) = s (mod L)
这是线性同余的形式。如果 s 可被 gcd(L+1-r2,L) 整除,则有解 m。如果 gcd(L+1-r2,L)=1,情况肯定会如此。如果 r2=2 则 gcd(L+1-r2,L)=gcd(L-1,L)=1 并且解 m 始终存在。
因此 r2=2 具有良好的性质,即对于任何周期大小 L,它满足 gcd(L+1-r2,L)=1 ,从而保证即使两个指针从不同的位置开始,指针最终也会发生冲突。 r2 的其他值不具有此属性。
Consider a cycle of size L, meaning at the kth element is where the loop is: xk -> xk+1 -> ... -> xk+L-1 -> xk. Suppose one pointer is run at rate r1=1 and the other at r2. When the first pointer reaches xk the second pointer will already be in the loop at some element xk+s where 0 <= s < L. After m further pointer increments the first pointer is at xk+(m mod L) and the second pointer is at xk+((m*r2+s) mod L). Therefore the condition that the two pointers collide can be phrased as the existence of an m satisfying the congruence
m = m*r2 + s (mod L)
This can be simplified with the following steps
m(1-r2) = s (mod L)
m(L+1-r2) = s (mod L)
This is of the form of a linear congruence. It has a solution m if s is divisible by gcd(L+1-r2,L). This will certainly be the case if gcd(L+1-r2,L)=1. If r2=2 then gcd(L+1-r2,L)=gcd(L-1,L)=1 and a solution m always exists.
Thus r2=2 has the good property that for any cycle size L, it satisfies gcd(L+1-r2,L)=1 and thus guarantees that the pointers will eventually collide even if the two pointers start at different locations. Other values of r2 do not have this property.
如果快指针移动
3
步,慢指针移动1
步,则不能保证两个指针在包含偶数个节点的循环中相遇。然而,如果慢指针移动2
步,会议就能得到保证。一般来说,如果兔子移动
H
步,乌龟移动T
步,则保证在一个周期内相遇,当且仅当H = T + 1
.考虑兔子相对于乌龟的移动。
H - T
个节点。给定一个长度为
N =(H - T) * k
的循环,其中k
为任意正数整数,野兔会跳过每个
H - T - 1
节点(同样,相对到乌龟),并且如果
乌龟位于这些节点中的任何一个中。
保证会面的唯一可能性是当
H - T - 1 = 0
时。因此,只要慢速指针增加
x - 1
,就允许将快指针增加x
。If the fast pointer moves
3
steps and slow pointer at1
step, it is not guaranteed for both pointers to meet in cycles containing even number of nodes. If the slow pointer moved at2
steps, however, the meeting would be guaranteed.In general, if the hare moves at
H
steps, and tortoise moves atT
steps, you are guaranteed to meet in a cycle iffH = T + 1
.Consider the hare moving relative to the tortoise.
H - T
nodes per iteration.Given a cycle of length
N =(H - T) * k
, wherek
is any positiveinteger, the hare would skip every
H - T - 1
nodes (again, relativeto the tortoise), and it would be impossible to for them to meet if
the tortoise was in any of those nodes.
The only possibility where a meeting is guaranteed is when
H - T - 1 = 0
.Hence, increasing the fast pointer by
x
is allowed, as long as the slow pointer is increased byx - 1
.这是一个直观的非数学方式来理解这一点:
如果快速指针超出了列表的末尾,显然就没有循环。
忽略指针位于链表初始非循环部分的初始部分,我们只需要把它们放入循环即可。当慢指针最终到达循环时,快指针在循环中的位置并不重要。
一旦它们都进入循环,它们就会围绕循环运行,但在不同的点上。想象一下,如果它们每次都移动一格。然后他们会绕着自行车转一圈,但保持相同的距离。换句话说,进行相同的循环但异相。现在,通过将快速指针每移动两步,它们就可以相互改变相位;每走一步,彼此的距离就减少一分。快指针将追上慢指针,我们可以检测到循环。
为了证明这是真的,它们会相遇,并且快指针不会以某种方式超越并跳过慢指针,只需手动模拟当快指针落后慢指针三步时会发生什么,然后模拟当快指针落后慢指针时会发生什么落后慢速两步,则当快指针仅落后慢速指针一步时。在每种情况下,它们都在同一节点相遇。任何更大的距离最终都会变成三、二或一的距离。
编辑:如果快指针移动了 3、4 或 5,那么快指针有可能“跳过”慢指针,而不会击中同一个节点。
Here is a intuitive non-mathematical way to understand this:
If the fast pointer runs off the end of the list obviously there is no cycle.
Ignore the initial part where the pointers are in the initial non-cycle part of the list, we just need to get them into the cycle. It doesn't matter where in the cycle the fast pointer is when the slow pointer finally reaches the cycle.
Once they are both in the cycle, they are circling the cycle but at different points. Imagine if they were both moving by one each time. Then they would be circling the cycle but staying the same distance apart. In other words, making the same loop but out of phase. Now by moving the fast pointer by two each step they are changing their phase with each other; Decreasing their distance apart by one each step. The fast pointer will catch up to the slow pointer and we can detect the loop.
To prove this is true, that they will meet each other and the fast pointer will not somehow overtake and skip over the slow pointer just hand simulate what happens when the fast pointer is three steps behind the slow, then simulate what happens when the fast pointer is two steps behind the slow, then when the fast pointer is just one step behind the slow pointer. In every case they meet at the same node. Any larger distance will eventually become a distance of three, two or one.
Edit: If the fast pointer moves by 3, 4 or 5 then it would be possible for the fast pointer to "jump over" the slow pointer and not hit the same node.
如果有一个循环(n 个节点),那么一旦指针进入循环,它将永远保留在那里;所以我们可以及时向前移动,直到两个指针都在循环中。从这里开始,指针可以用初始值为 a 和 b 的以 n 为模的整数来表示。 t步后满足的条件是
a+t≡b+2t mod n
有解 t=a−b mod n。
只要速度之间的差异不与 n 共享质因数,这种方法就有效。
参考
https://math. stackexchange.com/questions/412876/proof-of-the-2-pointer-method-for-finding-a-linked-list-loop
对速度的唯一限制是它们的差异应该与循环的长度。
If there is a loop (of n nodes), then once a pointer has entered the loop it will remain there forever; so we can move forward in time until both pointers are in the loop. From here on the pointers can be represented by integers modulo n with initial values a and b. The condition for them to meet after t steps is then
a+t≡b+2t mod n
which has solution t=a−b mod n.
This will work so long as the difference between the speeds shares no prime factors with n.
Reference
https://math.stackexchange.com/questions/412876/proof-of-the-2-pointer-method-for-finding-a-linked-list-loop
The single restriction on speeds is that their difference should be co-prime with the loop's length.
理论上,将循环(循环)视为公园(圆形、矩形等),第一人 X 移动缓慢,第二人 Y 移动速度比 X 快。现在,如果人 Y 以 2 的速度移动并不重要X 倍或 3,4,5... 倍。当他们在某个时刻相遇时,总会有一种情况。
Theoretically, consider the cycle(loop) as a park(circular, rectangle whatever), First person X is moving slow and Second person Y is moving faster than X. Now, it doesn't matter if person Y is moving with speed of 2 times that of X or 3,4,5... times. There will always be a case when they meet at one point.
假设我们使用两个参考 Rp 和 Rq,它们在每次迭代中采取 p 和 q 步; p> q.在Floyd算法中,p = 2,q = 1。
我们知道,经过一定的迭代后,Rp和Rq都会处于循环的某些元素处。然后,假设 Rp 领先 Rq x 步。也就是说,从 Rq 的元素开始,我们可以采取 x 步到达 Rp 的元素。
假设循环有 n 个元素。经过 t 次进一步迭代后,Rp 将领先 Rq (x + (pq)*t) 步。因此,只有在以下情况下,它们才能在 t 次迭代后满足:
可以写为:
由于模运算,这仅当满足以下条件时才可能: GCD(p−q, n) | x。
但我们不知道x。不过,如果 GCD 为 1,它将整除任何 x。为了使 GCD 为 1:
更新:在后来的一些进一步分析中,我意识到任何不相等的正整数
p
和q
都会使两个引用在一些迭代后相遇。不过,1 和 2 的值似乎需要较少的总步进数。Say we use two references Rp and Rq which take p and q steps in each iteration; p > q. In the Floyd's algorithm, p = 2, q = 1.
We know that after certain iterations, both Rp and Rq will be at some elements of the loop. Then, say Rp is ahead of Rq by x steps. That is, starting at the element of Rq, we can take x steps to reach the element of Rp.
Say, the loop has n elements. After t further iterations, Rp will be ahead of Rq by (x + (p-q)*t) steps. So, they can meet after t iterations only if:
Which can be written as:
Due to modular arithmetic, this is possible only if: GCD(p−q, n) | x.
But we do not know x. Though, if the GCD is 1, it will divide any x. To make the GCD as 1:
Update: On some further analysis later, I realized that any unequal positive integers
p
andq
will make the two references meet after some iterations. Though, the values of 1 and 2 seem to require less number of total stepping.选择 2 的原因是因为可以说
慢速指针移动到 1
快速移动2
该循环有 5 个元素。
现在让慢指针和快指针相遇,
1,2 和 5 的最小公倍数 (LCM) 必须存在,这就是它们相交的地方。在本例中为 10。
如果您模拟慢速和快速指针,您将看到慢速和快速指针在循环中的 2 * 个元素处相遇。当您进行 2 次循环时,您会在与起点完全相同的点处相遇。
在非循环的情况下,它变为 1,2 和无穷大的 LCM。所以他们永远不会见面。
The reason why 2 is chosen is because lets say
slow pointer moves at 1
fast moves at 2
The loop has 5 elements.
Now for slow and fast pointer to meet ,
Lowest common multiple (LCM) of 1,2 and 5 must exist and thats where they meet. In this case its 10.
If you simulate slow and fast pointer you will see that the slow and fast pointer meet at 2 * elements in loop. When you do 2 loops , you meet at exactly same point of as starting point.
In case of non loop , it becomes LCM of 1,2 and infinity. so they never meet.
如果链表有循环,那么增量为 2 的快速指针会比增量为 3 或 4 或更多更好,因为它确保一旦我们进入循环内部,指针肯定会发生冲突,并且不会发生超车。
例如,如果我们采用增量 3 并在循环内进行假设
,而增量为 2 时这种情况永远不会发生。
此外,如果您真的很不幸,那么您最终可能会遇到循环长度为 L 的情况> 并且您将快速指针增加
L+1
。那么你将被无限地卡住,因为快慢指针的移动差异将始终是L
。我希望我说清楚了。
If the linked list has a loop then a fast pointer with increment of 2 will work better then say increment of 3 or 4 or more because it ensures that once we are inside the loop the pointers will surely collide and there will be no overtaking.
For example if we take increment of 3 and inside the loop lets assume
whereas such case will never happen with increment of 2.
Also if you are really unlucky then you may end up in a situation where loop length is
L
and you are incrementing the fast pointer byL+1
. Then you will be stuck infinitely since the difference of the movement fast and slow pointer will always beL
.I hope I made myself clear.