Python implementation of PageRank
I am trying to understand the concepts behind Google's PageRank and to implement a similar (though rudimentary) version in Python. I have spent the last few hours familiarizing myself with the algorithm, but it is still not all that clear.
I've located a particularly interesting website that outlines the implementation of PageRank in Python. However, I can't quite seem to understand the purpose of all of the functions shown on this page. Could anyone clarify what exactly the functions are doing, particularly pageRankeGenerator?
Comments (1)
I'll try to give a simple explanation (definition) of the PageRank algorithm from my personal notes.
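Let us say that pages T1, T2, ... Tn are pointing to page A. Then
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where PR(Ti) is the PageRank of page Ti, C(Ti) is the number of outgoing links on page Ti, and d is a damping factor between 0 and 1 (0.85 in the example below).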
Every PR(x) can start at 1; the page ranks are then adjusted by repeating the calculation for each page about 10-20 times.
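To make the repetition concrete, here is a minimal sketch in Python. It is not the code from the website mentioned in the question; the function name pagerank, the links dictionary layout and the default of 20 rounds are my own assumptions. It also assumes every page appears as a key and has at least one outgoing link.

```python
def pagerank(links, d=0.85, rounds=20):
    """Return a dict of page -> rank after a fixed number of rounds.

    links maps each page to the list of pages it links to
    (hypothetical layout chosen for this sketch).
    """
    ranks = {page: 1.0 for page in links}  # start value 1 for every page
    for _ in range(rounds):
        new_ranks = {}
        for page in links:
            # Sum PR(T) / C(T) over every page T that links to this page.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        # Every page is updated from the previous round's values.
        ranks = new_ranks
    return ranks


# Two pages that link only to each other should both stay at about 1.
print(pagerank({"X": ["Y"], "Y": ["X"]}))
```

Building new_ranks separately, rather than overwriting ranks in place, keeps each round based only on the previous round's numbers.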
Example for pages A, B, C with d = 0.85. The arithmetic corresponds to the links A -> B, B -> A, B -> C and C -> A:
Round 1
A = 0.15 + 0.85 (1/2 + 1/1) = 1.425
B = 0.15 + 0.85 (1/1) = 1
C = 0.15 + 0.85 (1/2) = 0.575
round's sum = 3
Round 2
A = 0.15 + 0.85 (1/2 + 0.575) = 1.06375
B = 0.15 + 0.85 (1.425) = 1.36125
C = 0.15 + 0.85 (1/2) = 0.575
round's sum = 3
Round 3
A = 0.15 + 0.85 (1.36125/2 + 0.575) = 1.217
B = 0.15 + 0.85 (1.06375) = 1.054
C = 0.15 + 0.85 (1.36125/2) = 0.729
round's sum = 3
...
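The rounds can also be checked mechanically. The sketch below is only an illustration: the generator name iter_rounds is made up, and the link structure (A -> B, B -> A and C, C -> A) is inferred from the arithmetic in the rounds rather than stated anywhere in the post.

```python
from itertools import islice


def iter_rounds(links, d=0.85):
    """Yield the rank dict after round 1, round 2, and so on, without end."""
    ranks = {page: 1.0 for page in links}  # start value 1 for every page
    out_count = {page: len(targets) for page, targets in links.items()}
    while True:
        # The comprehension reads the old ranks; the name is rebound afterwards.
        ranks = {
            page: (1 - d) + d * sum(
                ranks[src] / out_count[src]
                for src in links
                if page in links[src]
            )
            for page in links
        }
        yield ranks


# Assumed graph, inferred from the worked example.
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}

for number, ranks in enumerate(islice(iter_rounds(links), 3), start=1):
    rounded = {page: round(value, 3) for page, value in ranks.items()}
    print("Round", number, rounded, "sum:", round(sum(ranks.values()), 3))
```

If the inferred graph is right, the printed values should agree with the three rounds to within rounding, and taking more items with islice shows how the ranks settle over further rounds.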