Flooding a Bayesian rating with votes produces out-of-range values

Posted on 2024-11-07 01:48:00

I'm trying to apply the Bayesian rating formula, but if I submit a rating of 1 (out of 5) hundreds of thousands of times, the final rating ends up greater than 5.

For example, a given item starts with no votes; after 170,000 votes of 1 star, its final rating is 5.23. If I vote only 100 times, the result is a normal value.

Here is what I have in PHP.

<?php
// these values came from DB
$total_votes     = 2936;    // total of votes for all items
$total_rating    = 582.955; // sum of all ratings
$total_items     = 202;

// now the specific item, it has no votes yet
$this_num_votes  = 0;
$this_score      = 0;
$this_rating     = 0;

// simulating a lot of votes with 1 star
for ($i=0; $i < 170000; $i++) { 
    $rating_sent = 1; // the new rating, always 1

    $total_votes++; // adding 1 to total
    $total_rating = $total_rating+$rating_sent; // adding 1 to total

    $avg_num_votes = ($total_votes/$total_items); // Average number of votes in all items
    $avg_rating = ($total_rating/$total_items);   // Average rating for all items
    $this_num_votes = $this_num_votes+1;          // Number of votes for this item
    $this_score = $this_score+$rating_sent;       // Sum of all votes for this item
    $this_rating = $this_score/$this_num_votes;   // Rating for this item

    $bayesian_rating = ( ($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating) ) / ($avg_num_votes + $this_num_votes);
}
echo $bayesian_rating;
?>

Even if I flood with 1 or 2:

$rating_sent = rand(1,2)

The final rating after 100,000 votes is over 5.

I just did a new test using

$rating_sent = rand(1,5)

After 100,000 votes I got a value completely out of range (10.53). I know that in a normal situation no item will get 170,000 votes while all the other items get none. But I wonder whether there is something wrong with my code, or whether this is expected behavior of the Bayesian formula given the massive number of votes.

Edit

Just to make it clear, here is a better explanation for some variables.

$avg_num_votes   // SUM(votes given to all items)/COUNT(all items)
$avg_rating      // SUM(rating of all items)/COUNT(all items)
$this_num_votes  // COUNT(votes given for this item)
$this_score      // SUM(rating for this item)
$bayesian_rating // is the formula itself

The formula is: ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes). Taken from here

Comments (1)

一口甜 2024-11-14 01:48:00

You need to divide by total_votes rather than total_items when calculating avg_rating.
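
To see why, it may help to plug in the question's numbers after the 170,000 one-star votes (total_votes = 172936, total_rating = 170582.955, total_items = 202). This is only a worked check of the arithmetic, with values rounded; the first pair of lines uses the question's division by total_items, the second pair uses division by total_votes.

```latex
% Question's version: avg_rating = total_rating / total_items
\text{avg\_num\_votes} = \frac{172936}{202} \approx 856.12, \qquad
\text{avg\_rating} = \frac{170582.955}{202} \approx 844.47

\text{bayesian\_rating} \approx
\frac{856.12 \times 844.47 + 170000 \times 1}{856.12 + 170000}
\approx \frac{892967}{170856} \approx 5.23

% Suggested version: avg_rating = total_rating / total_votes
\text{avg\_rating} = \frac{170582.955}{172936} \approx 0.986

\text{bayesian\_rating} \approx
\frac{856.12 \times 0.986 + 170000 \times 1}{856.12 + 170000}
\approx \frac{170844}{170856} \approx 1.00
```

With the division by total_items, avg_rating is no longer on the 1 to 5 scale (it grows with every vote), which is what pushes the result past 5.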

I made the changes and got something that behaves much better here.

http://codepad.org/gSdrUhZ2
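
The codepad paste is not reproduced here. As a rough sketch of the change described above, and not necessarily what the linked code does, the calculation could be wrapped in a function like the one below. The function name `bayesian_rating` and its parameter list are assumptions made for illustration; the variable names inside follow the question.

```php
<?php
// Hypothetical helper, written for illustration only (not from the original post).
// It applies the question's formula, but computes avg_rating per vote
// (total_rating / total_votes) instead of per item.
function bayesian_rating(
    float $total_votes,   // votes across all items, including this item's
    float $total_rating,  // sum of all ratings, including this item's
    int $total_items,     // number of items (assumed > 0)
    int $this_num_votes,  // votes for this item
    float $this_score     // sum of ratings for this item
): float {
    $avg_num_votes = $total_votes / $total_items;
    $avg_rating    = $total_votes > 0 ? $total_rating / $total_votes : 0.0;

    $denominator = $avg_num_votes + $this_num_votes;
    if ($denominator == 0) {
        return 0.0; // no votes anywhere yet
    }

    // this_num_votes * this_rating is simply this_score, so use it directly.
    return ($avg_num_votes * $avg_rating + $this_score) / $denominator;
}

// Same scenario as the question: 170,000 one-star votes on a fresh item.
$n = 170000;
$result = bayesian_rating(2936 + $n, 582.955 + $n, 202, $n, $n * 1.0);
echo round($result, 2), "\n";
```

With the question's starting values, the result stays close to 1 after the flood of one-star votes instead of climbing to 5.23.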
