Katz相似度-矩阵/图运算(JAVA+JUNG)

发布于 2024-12-09 11:57:48 字数 394 浏览 2 评论 0原文

我正在图表中测试一些相似性指标。我正在使用 JUNG API 来处理图表。我已经成功计算了基本指标,例如共同邻居和优先依恋。现在,我想计算 katz 指标,如下所示: 卡茨(v1,v2) = B.paths_1(v1,v2) + B^2.paths_2(v1,v2) + ...+ B^n.paths_n(v1,v2) 其中 paths_n(v1,v2) 是 v1 和 v2 之间长度为“n”的路径数; B 是标量。我将 n 限制为 4,因此最终的 katz 矩阵可以通过以下方式轻松计算: BA + (BA)^2 + ... + (BA)^4 其中 A 是图的邻接矩阵。问题是我正在处理的图表非常巨大,我无法将整个卡茨矩阵存储在内存中。此外,我不需要所有的分数,因为我只测试几对节点。 不过,我无法找到一种有效的方法来计算个人分数,而不必在图表上执行步行。 有什么想法吗?

I'm testing some similarity metrics in a graph. I'm using JUNG API to handle the graphs. I've managed to compute basic metrics like common neighbors and prefferential attachment. Now, I want to calculate katz metric that is as follow:
katz(v1,v2) = B.paths_1(v1,v2) + B^2.paths_2(v1,v2) + ...+ B^n.paths_n(v1,v2)
where paths_n(v1,v2) is the number of paths of length "n" between v1 and v2; and B is a scalar. I'm restricting n to 4, so the final katz matrix could be calculated easily through:
B.A + (B.A)^2 + ... + (B.A)^4 where A is the adjacency matrix of the graph. The thing is that the graphs I'm working with are very huge and I'm not able to store the whole katz matrix in memory. Besides, I won't need all the scores because I only test few pairs of nodes.
I can't find an efficient way to calculate individual scores without having to performe a walk on the graph though.
Any ideas?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

暗喜 2024-12-16 11:57:48

要计算个人得分 ketz(v1,v2),您只需要考虑邻接子矩阵,该矩阵仅包含与 v1 或 v2 距离小于 4 的顶点。

您可以使用 v1 和 v2 中的呼吸优先搜索来定位这些顶点。

但如果从 v1 开始执行 BFS 时直接计算 #paths,实际上可以做得更好。你只需要记住距 v1 的距离,并在每个顶点检查是否到达 v2。如果是这样,则增加适当的计数器。

类似的东西(伪代码):

Queue q = new Queue();
q.enqueue((v1, 0));
int[] counts = new int[] { 0,0,0,0,0 };

while (!q.empty()) {
    (v, dist) = q.dequeue();
    for(Vertex w : v.Neighbors()) {
        if(dist < 3)
            q.enqueue((w, dist+1));

        if(dist < 4 && w == v2)
            counts[dist+1]++;
    }
}

因此,运行此命令后,您将得到 counts[n] = paths_n(v1,v2) for n =1,2,3,4

To calculate individual score ketz(v1,v2), you just need to consider the adjacency sub-matrix, which contains only the vertices which are within a distance less than 4 from either v1 or v2.

You can locate those vertices using breath-first-search from v1 and from v2.

But you can actually do much better if you directly count the #paths while doing a BFS starting at v1. You only need to remember the distance from v1 and at each vertex check whether you have arrived at v2. If so increment the appropriate counter.

Something like that (pseudo code):

Queue q = new Queue();
q.enqueue((v1, 0));
int[] counts = new int[] { 0,0,0,0,0 };

while (!q.empty()) {
    (v, dist) = q.dequeue();
    for(Vertex w : v.Neighbors()) {
        if(dist < 3)
            q.enqueue((w, dist+1));

        if(dist < 4 && w == v2)
            counts[dist+1]++;
    }
}

So after you run this, you'll have counts[n] = paths_n(v1,v2) for n =1,2,3,4

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文