How to calculate PageRank for a small network?
I have two tables in my MySQL database.
table1 holds all the web pages in my network:
| table1 (pages) |
|----------------|
| id | url       |
|----------------|
table2 has two fields: the source page of a link and the destination page of the link:
|---------------------------|
| table2 (links)            |
|---------------------------|
| from_page_id | to_page_id |
|---------------------------|
How do I calculate the PageRank of the pages in my network?
I have found this article here that explains the PageRank algorithm, but I find it very difficult to write its formula in PHP, and I am not good at math.
Thanks
Update:
I have almost 5000 pages in my network.
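For reference, the form in which the PageRank formula is usually written is shown below. The notation is an assumption and may differ from the article mentioned in the question: d is the damping factor (0.85 is the commonly quoted value), N is the number of pages, M(p_i) is the set of pages that link to p_i (rows in the links table with to_page_id = i), and L(p_j) is the number of outgoing links of p_j (rows with from_page_id = j).

```latex
PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}
```

Read as a recipe: every page gets a small fixed share, plus a fraction of the rank of each page linking to it, divided by how many links that page sends out. The values are recomputed repeatedly until they stop changing.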
Comments (2)
Hi again,
I think I have figured out how to do it, but I am not sure.
I will tell you what I did, and you can judge whether my way of calculating the PageRank is correct.
First, I added a new column to the "pages" table and called it "outgoinglinks"; it holds the number of outgoing links from that page.
I also added two more columns, "pagerank" and "pagerank2",
and another column called "i", which counts the number of iterations.
Now let's move on to the programming.
Note:
Before you start, make sure to set the pagerank of one of the pages (any page) to 1 and leave the other pages at 0.
Why two pagerank columns?
I did that because I think each iteration should be kept separate to get an accurate calculation. The script alternates between the two columns: every iteration reads from one pagerank column and saves the new results to the other pagerank column.
The previous code will loop many times, say 50 times, to get accurate results; each pass gets us closer to the real PageRanks of our pages.
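The approach described above (an outgoing-link count, two rank columns that alternate, and about 50 passes) can be sketched as follows. This is only an illustration in Python rather than PHP, working on in-memory lists instead of the MySQL tables. The function name `pagerank`, the damping factor of 0.85, the sample data, and the handling of pages without outgoing links are all assumptions, not taken from the post. It also starts every page at 1/N instead of setting a single page to 1; both starting points sum to 1.

```python
def pagerank(page_ids, links, damping=0.85, iterations=50):
    """Iterate PageRank over (from_page_id, to_page_id) pairs.

    page_ids and links stand in for the rows of the pages and links tables.
    """
    n = len(page_ids)

    # Equivalent of the "outgoinglinks" column: links leaving each page.
    outgoing = {p: 0 for p in page_ids}
    for src, _dst in links:
        outgoing[src] += 1

    # Equivalent of the "pagerank" column: a uniform start that sums to 1.
    rank = {p: 1.0 / n for p in page_ids}

    for _ in range(iterations):
        # Assumption: rank held by pages with no outgoing links is
        # shared equally among all pages so that no rank is lost.
        dangling = sum(rank[p] for p in page_ids if outgoing[p] == 0)
        base = (1.0 - damping) / n + damping * dangling / n

        # Equivalent of "pagerank2": built only from the previous pass.
        new_rank = {p: base for p in page_ids}
        for src, dst in links:
            new_rank[dst] += damping * rank[src] / outgoing[src]

        rank = new_rank  # swap the two columns for the next pass
    return rank


# Made-up sample network: four pages and five links.
pages = [1, 2, 3, 4]
links = [(1, 2), (1, 3), (2, 3), (3, 1), (4, 3)]
ranks = pagerank(pages, links)
```

Keeping `rank` and `new_rank` separate is the same idea as the two columns: every value in a pass is computed from the previous pass only, never from a half-updated mix.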
My question is: should the sum of all the PageRanks in my network equal 1?
If yes, how does Google give every page a rank out of 10?
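On the second question: raw PageRank values that sum to 1 and a score out of 10 can coexist if the score is just a rescaling of the raw value. Google never published how its 0 to 10 toolbar score was derived; it is widely believed to have been roughly logarithmic. The sketch below is therefore only one possible mapping, with a hypothetical function name and made-up input values.

```python
import math


def to_ten_point_scale(ranks):
    """Map raw PageRank values onto 0..10 using a log scale (an assumption).

    Requires every rank to be greater than 0.
    """
    low = math.log(min(ranks.values()))
    high = math.log(max(ranks.values()))
    if high == low:
        # All pages tie, so there is nothing to spread across the scale.
        return {page: 0 for page in ranks}
    return {
        page: round(10 * (math.log(value) - low) / (high - low))
        for page, value in ranks.items()
    }


# Made-up raw values that sum to 1.
raw = {1: 0.37, 2: 0.20, 3: 0.39, 4: 0.04}
scores = to_ten_point_scale(raw)
```

The lowest-ranked page lands on 0 and the highest on 10; everything else falls in between according to the logarithm of its raw value.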
Any ideas?
Thanks
Why do you need PageRank specifically if it's your own network? Why not just count the number of unique pages that link to a particular page and use that number as the page's rating?
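A rough sketch of that simpler rating, counting distinct linking pages per target, might look like this. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL. The table names `pages` and `links` follow the labels in the question, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE links (from_page_id INTEGER, to_page_id INTEGER);
    -- Made-up sample data; page 1 links to page 3 twice.
    INSERT INTO pages VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO links VALUES (1, 3), (1, 3), (2, 3), (3, 1);
""")

# COUNT(DISTINCT ...) counts each linking page once, however many
# links it has to the target; LEFT JOIN keeps pages with no inbound links.
query = """
    SELECT p.id, COUNT(DISTINCT l.from_page_id) AS inbound
    FROM pages AS p
    LEFT JOIN links AS l ON l.to_page_id = p.id
    GROUP BY p.id
    ORDER BY inbound DESC, p.id
"""
ratings = dict(conn.execute(query).fetchall())
conn.close()
```

Unlike PageRank, this treats every linking page as equally important, which may be good enough for a network of a few thousand pages under one owner.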