对于图中的每个顶点,找到距离 d 内的所有顶点
在我的特定情况下,该图表示为邻接列表,并且是无向且稀疏的,n 可以是数百万,d 是 3。计算 A^d (其中 A 是邻接矩阵)并挑选出非零条目有效,但我想要一些不涉及矩阵乘法的东西。对每个顶点进行广度优先搜索也是一种选择,但速度很慢。
In my particular case, the graph is represented as an adjacency list and is undirected and sparse, n can be in the millions, and d is 3. Calculating A^d (where A is the adjacency matrix) and picking out the non-zero entries works, but I'd like something that doesn't involve matrix multiplication. A breadth-first search on every vertex is also an option, but it is slow.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
假设我们有一个函数
verticesWithin(d,x)
,它可以查找顶点x
距离d
内的所有顶点。对于这样的问题,为了暴露缓存/记忆化机会,一个好的策略是提出一个问题:这个问题的子问题如何相互关联?
在这种情况下,我们可以看到
verticesWithin(d,x)
如果d >= 1
是所有vertices(d-1,y[i])
的并集i
在范围内,其中y=verticesWithin(1,x)
。如果d == 0
那么它就是{x}
。 (我假设一个顶点被认为与自身的距离为 0。)在实践中,您需要查看
d == 1
情况的邻接表,而不是使用它关系,以避免无限循环。您还需要避免将x
本身视为y
的成员的冗余。此外,如果
verticesWithin(d,x)
的返回类型从简单列表或集合更改为d
集合列表,表示与x 的距离增加
,然后verticesWithin(d,x) = init(verticesWithin(d+1,x))
其中
init
是生成列表所有元素的函数除了最后一个。显然,如果按字面转录成代码,这将是一个非终止递归关系,因此您必须对如何实现它有点聪明。配备了子问题之间的这些关系,我们现在可以缓存 verticesWithin 的结果,并使用这些缓存的结果来避免执行冗余遍历(尽管以执行一些集合操作为代价 - 我不完全确保这是一场胜利)。我将把它作为练习来填写实现细节。
Let's say that we have a function
verticesWithin(d,x)
that finds all vertices within distanced
of vertexx
.One good strategy for a problem such as this, to expose caching/memoisation opportunities, is to ask the question: How are the subproblems of this problem related to each other?
In this case, we can see that
verticesWithin(d,x)
ifd >= 1
is the union ofvertices(d-1,y[i])
for alli
within range, wherey=verticesWithin(1,x)
. Ifd == 0
then it's simply{x}
. (I'm assuming that a vertex is deemed to be of distance 0 from itself.)In practice you'll want to look at the adjacency list for the case
d == 1
, rather than using that relation, to avoid an infinite loop. You'll also want to avoid the redundancy of consideringx
itself as a member ofy
.Also, if the return type of
verticesWithin(d,x)
is changed from a simple list or set, to a list ofd
sets representing increasing distance fromx
, thenverticesWithin(d,x) = init(verticesWithin(d+1,x))
where
init
is the function that yields all elements of a list except the last one. Obviously this would be a non-terminating recursive relation if transcribed literally into code, so you have to be a little bit clever about how you implement it.Equipped with these relations between the subproblems, we can now cache the results of
verticesWithin
, and use these cached results to avoid performing redundant traversals (albeit at the cost of performing some set operations - I'm not entirely sure that this is a win). I'll leave it as an exercise to fill in the implementation details.您已经提到了计算
A^d
的选项,但这比您需要的要多得多(正如您已经说过的那样)。然而,有一种更便宜的方法可以使用这个想法。假设您有一个由 0 和 1 组成的(列)向量
v
,表示一组顶点。现在,向量w := A v
在每个节点上都有一个 1,从起始节点只需一步即可到达该向量。迭代时,u := A w
对于从起始节点只需两步即可到达的每个节点都有一个 1,等等。对于
d=3
,您可以执行以下操作以下(MATLAB 伪代码):向量
w
现在对于每个节点都有一个正条目,可以从最多d< 中的第
j
个节点进行访问/代码> 步骤。You already mention the option of calculating
A^d
, but this is much, much more than you need (as you already remark).There is, however, a much cheaper way of using this idea. Suppose you have a (column) vector
v
of zeros and ones, representing a set of vertices. The vectorw := A v
now has a one at every node that can be reached from the starting node in exactly one step. Iterating,u := A w
has a one for every node you can reach from the starting node in exactly two steps, etc.For
d=3
, you could do the following (MATLAB pseudo-code):the vector
w
now has a positive entry for each node that can be accessed from thej
th node in at mostd
steps.在这种情况下,从给定顶点开始的广度优先搜索是最佳解决方案。你会找到距离 d 内的所有顶点,并且你永远不会访问距离 >= d + 2 的任何顶点。
这是递归代码,尽管如果需要的话,可以通过使用队列轻松消除递归。
Breadth first search starting with the given vertex is an optimal solution in this case. You will find all the vertices that within the distance d, and you will never even visit any vertices with distance >= d + 2.
Here is recursive code, although recursion can be easily done away with if so desired by using a queue.