PageRank implementation in Python

Posted on 2024-09-28 06:27:16

I am trying to understand the concepts behind Google PageRank and to implement a similar (though rudimentary) version in Python. I have spent the last few hours familiarizing myself with the algorithm, but it is still not all that clear.

I've located a particularly interesting website that outlines the implementation of PageRank in Python. However, I can't quite seem to understand the purpose of all of the functions shown on this page. Could anyone clarify what exactly the functions are doing, particularly pageRankeGenerator?

Comments (1)

卖梦商人 2024-10-05 06:27:16

I'll try to give a simple explanation (definition) of the PageRank algorithm from my personal notes.

Let us say that pages T1, T2, ... Tn are pointing to page A, then

PR(A) = (1-d) + d * (PR(T1) / C(T1) + ... + PR(Tn) / C(Tn))

where

  • PR(Ti) is the PageRank of Ti
  • C(Ti) is the number of outgoing links from page Ti
  • d is the damping factor in the range 0 < d < 1, usually set to 0.85

Every PR(x) can start at 1. The page ranks are then adjusted by recomputing the formula for every page, each round using the values from the previous round, and repeating this about 10-20 times.
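
As a rough illustration of the formula above, here is a minimal Python sketch. The names `pagerank_round` and `pagerank`, and the `links` dictionary that maps each page to the pages it links to, are my own assumptions for this example and are not taken from the website mentioned in the question. It also assumes every page has at least one outgoing link and that no link is listed twice.

```python
def pagerank_round(links, ranks, d=0.85):
    """Apply the PR formula once to every page, using the previous round's ranks.

    links: hypothetical dict mapping each page to the list of pages it links to.
    ranks: dict mapping each page to its current PageRank.
    Assumes every page has at least one outgoing link and no duplicate links.
    """
    new_ranks = {}
    for page in links:
        # Sum PR(Ti) / C(Ti) over every page Ti that links to `page`.
        incoming = sum(
            ranks[src] / len(targets)
            for src, targets in links.items()
            if page in targets
        )
        new_ranks[page] = (1 - d) + d * incoming
    return new_ranks


def pagerank(links, d=0.85, rounds=20):
    """Start every page at 1 and repeat the update a fixed number of times."""
    ranks = {page: 1.0 for page in links}
    for _ in range(rounds):
        ranks = pagerank_round(links, ranks, d)
    return ranks
```

Building `new_ranks` separately, instead of updating `ranks` in place, is what makes each round depend only on the previous round's values.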

Example for pages A, B, C, where A links to B, B links to A and C, and C links to A:

   A <--> B
   ^     /
    \   v
      C

Round 1
A = 0.15 + 0.85 (1/2 + 1/1) = 1.425
B = 0.15 + 0.85 (1/1) = 1
C = 0.15 + 0.85 (1/2) = 0.575

round's sum = 3

Round 2
A = 0.15 + 0.85 (1/2 + 0.575) = 1.06375
B = 0.15 + 0.85 (1.425) = 1.36125
C = 0.15 + 0.85 (1/2) = 0.575

round's sum = 3

Round 3
A = 0.15 + 0.85 (1.36125/2 + 0.575) = 1.217
B = 0.15 + 0.85 (1.06375) = 1.054
C = 0.15 + 0.85 (1.36125/2) = 0.729

round's sum = 3

...
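
To check the rounds by machine, the hand calculation can be replayed with a few lines of Python. This is only a sketch: the `links` dictionary is my reading of the diagram (A links to B, B links to A and C, C links to A), and the variable names are made up for this example.

```python
# Example graph as read from the diagram: A -> B, B -> A and C, C -> A.
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}
d = 0.85

ranks = {page: 1.0 for page in links}  # every page starts at 1
history = []                           # ranks after each round

for round_no in range(1, 4):
    # The comprehension reads the old `ranks` and only then rebinds the name,
    # so every page is updated from the previous round's values.
    ranks = {
        page: (1 - d) + d * sum(
            ranks[src] / len(out)
            for src, out in links.items()
            if page in out
        )
        for page in links
    }
    history.append(ranks)
    rounded = {page: round(value, 3) for page, value in ranks.items()}
    print("Round", round_no, rounded, "sum =", round(sum(ranks.values()), 3))
```

The printed values should agree with the three rounds worked out by hand, up to rounding in the last digit.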
