Popularity formula? (based on "likes", "comments", "views")

Posted 2024-09-05 01:49:16 · 208 words · 13 views · 0 comments

I have some pages on a website and I have to create an ordering based on "popularity"/"activity"

The parameters that I have to use are:

  • views to the page
  • comments made on the page (there is a form at the bottom where users can make comments)
  • clicks made to the "like it" icon

Are there any standards for what a formula for popularity would be? (If not, opinions are good too.)

(initially I thought of views + 10*comments + 10*likeit)

Comments (6)

黯淡〆 2024-09-12 01:49:16

Actually there is an accepted best way to calculate this:
http://www.evanmiller.org/how-not-to-sort-by-average-rating.html

You may need to combine 'likes' and 'comments' into a single score, assigning your own weighting factor to each, before plugging it into the formula as the 'positive vote' value.

From the link above:

Score = Lower bound of Wilson score confidence interval for a
Bernoulli parameter

We need to balance the proportion of positive ratings with
the uncertainty of a small number of observations. Fortunately, the
math for this was worked out in 1927 by Edwin B. Wilson. What we want
to ask is: Given the ratings I have, there is a 95% chance that the
"real" fraction of positive ratings is at least what? Wilson gives the
answer. Considering only positive and negative ratings (i.e. not a
5-star scale), the lower bound on the proportion of positive ratings
is given by:

$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p}) + \frac{z_{\alpha/2}^2}{4n}}{n}}}{1 + \frac{z_{\alpha/2}^2}{n}}$$

(Use minus where it says plus/minus to calculate the lower bound.)
Here $\hat{p}$ is the observed fraction of positive ratings, $z_{\alpha/2}$ is the
(1-α/2) quantile of the standard normal distribution, and n is the
total number of ratings. The same formula implemented in Ruby:

require 'statistics2'

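# Lower bound of the Wilson score interval.
# pos: positive ratings, n: total ratings, confidence: e.g. 0.95
def ci_lower_bound(pos, n, confidence)
    if n == 0
        return 0
    end
    # z-score for the two-sided confidence level (about 1.96 for 0.95)
    z = Statistics2.pnormaldist(1-(1-confidence)/2)
    # observed fraction of positive ratings
    phat = 1.0*pos/n
    (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end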

pos is the number of positive ratings, n is the total number of
ratings, and confidence refers to the statistical confidence level:
pick 0.95 to have a 95% chance that your lower bound is correct, 0.975
to have a 97.5% chance, etc. The z-score in this function never
changes, so if you don't have a statistics package handy or if
performance is an issue you can always hard-code a value here for z.
(Use 1.96 for a confidence level of 0.95.)

The same formula as an SQL query:

SELECT widget_id, ((positive + 1.9208) / (positive + negative) - 
                   1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) / 
                          (positive + negative)) / (1 + 3.8416 / (positive + negative)) 
       AS ci_lower_bound FROM widgets WHERE positive + negative > 0 
       ORDER BY ci_lower_bound DESC;
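
As a rough illustration of the suggestion at the top of this answer, the Ruby sketch below hard-codes z = 1.96 (so it does not need the statistics2 gem) and feeds a weighted mix of likes and comments into the same lower-bound formula. Treating page views as the total number of "ratings", the default weights of 1.0, the cap at the view count, and the name `page_score` are all assumptions made for this example; none of them come from the linked article.

```ruby
# Wilson lower bound with z defaulting to 1.96 (0.95 confidence level).
def wilson_lower_bound(pos, n, z = 1.96)
  return 0.0 if n == 0
  phat = pos.to_f / n
  (phat + z * z / (2 * n) -
    z * Math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)
end

# Hypothetical helper: weighted likes and comments count as "positive votes",
# views count as the total. Weights and the cap at views are assumptions.
def page_score(views, likes, comments, like_weight: 1.0, comment_weight: 1.0)
  positive = [like_weight * likes + comment_weight * comments, views].min
  wilson_lower_bound(positive, views)
end

small = page_score(2, 2, 0)        # 2 likes from only 2 views
large = page_score(1000, 400, 50)  # 450 positive actions from 1000 views
puts small
puts large
```

With these inputs the page with only two views scores lower than the page with a thousand views, even though its raw like rate is 100%. That is the effect the lower bound is meant to produce.
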
调妓 2024-09-12 01:49:16

There is no standard formula for this (how could there be?)

What you have looks like a fairly normal solution, and would probably work well. Of course, you should play around with the 10's to find values that suit your needs.

Depending on your requirements, you might also want to add in a time factor (e.g. -X points per week) so that old pages become less popular. Alternatively, you could change your "page views" to "page views in the last month". Again, this depends on your needs; it may not be relevant.
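
One way to sketch the time factor in Ruby is to keep the weighted sum from the question and subtract a fixed penalty per week of age. The weights of 10, the penalty of 5 points per week, the sample pages, and the method name are placeholder assumptions to be tuned against real data.

```ruby
# Weighted popularity with a linear age penalty. All weights are assumed
# placeholders, not recommended values.
def popularity(views, comments, likes, age_in_weeks,
               comment_weight: 10, like_weight: 10, weekly_penalty: 5)
  views + comment_weight * comments + like_weight * likes -
    weekly_penalty * age_in_weeks
end

# Made-up sample data for illustration only.
pages = [
  { name: "old",   views: 500, comments: 3, likes: 4, age_in_weeks: 60 },
  { name: "fresh", views: 200, comments: 5, likes: 6, age_in_weeks: 1 }
]

# Highest score first.
ranked = pages.sort_by do |page|
  -popularity(page[:views], page[:comments], page[:likes], page[:age_in_weeks])
end
puts ranked.map { |page| page[:name] }.inspect
```

Without the penalty the older page would rank first on raw counts. With it, the newer page moves ahead.
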

貪欢 2024-09-12 01:49:16

You could do something like what YouTube does: sort by the largest count in each category, for example most viewed, most commented, most liked. A different page could come first in each category, though the rankings are likely to be correlated. If you need a single ranking, you will have to come up with a formula of some sort, preferably derived empirically: analyze the data you already have, decide which pages should count as good or bad, and work backwards to an equation that reproduces those decisions.

You could even attempt a machine learning approach to "learn" what a good weighting is for combining each of these numbers as in your example formula. Doing it manually might also not be too hard.
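
The per-category ordering described at the start of this answer can be sketched as one sort per counter. The page data, field names, and the helper `rank_by` below are made up for illustration.

```ruby
# Made-up sample data for illustration only.
pages = [
  { name: "a", views: 900, comments: 2, likes: 10 },
  { name: "b", views: 300, comments: 9, likes: 25 },
  { name: "c", views: 500, comments: 4, likes: 40 }
]

# Returns page names ordered by the given metric, highest count first.
def rank_by(pages, metric)
  pages.sort_by { |page| -page[metric] }.map { |page| page[:name] }
end

most_viewed    = rank_by(pages, :views)
most_commented = rank_by(pages, :comments)
most_liked     = rank_by(pages, :likes)

puts most_viewed.inspect
puts most_commented.inspect
puts most_liked.inspect
```

In this sample a different page leads each list, which is the situation where a single combined formula has to make a trade-off.
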

绝不放开 2024-09-12 01:49:16

I use,

(C*comments + L*likeit)*100/views

where you choose C and L depending on how much you value each attribute.
I use C=1 and L=1.

This gives you the percentage of views that generated a positive action, making the items with
the highest percentage the most "popular".
I like this because it lets newer items be very popular at first: they show up first, get more views, and then become less (or more) popular until the score stabilizes.

Anyway,
I hope it helps.
PS: Of course it would work just the same without the "*100", but I like percentages.
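
A minimal Ruby sketch of the formula in this answer follows. The method name `engagement_rate` and the choice to return 0.0 for a page with no views are assumptions added here to avoid dividing by zero; the answer itself does not say how to handle that case.

```ruby
# Percentage of views that led to a comment or a like.
# c and l are the weights C and L from the formula; both default to 1.
def engagement_rate(views, comments, likes, c: 1, l: 1)
  return 0.0 if views == 0 # assumption: unviewed pages score zero
  (c * comments + l * likes) * 100.0 / views
end

new_page         = engagement_rate(10, 1, 2)      # few views so far
established_page = engagement_rate(5000, 40, 60)  # many views
puts new_page
puts established_page
```

A page with very few views can score much higher than an established one, which matches the behaviour described above: new items start near the top and settle as views accumulate.
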

乞讨 2024-09-12 01:49:16

I would value comments more than likes if the content invites discussion. If it's just stating facts, an equal weight for comments and likes seems OK (though 10 is a bit too much, I think...)

Does a view somehow take into account the time the user spent on the page? You might use that as well, since a 2-second view means less than a 3-minute one.
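
If view durations are available, one hypothetical way to use them is to let each view count in proportion to the time spent, up to a cap. The 60-second cap and the method name below are assumptions for illustration, not something the question specifies.

```ruby
# Hypothetical weighting: each view contributes between 0 and 1, growing
# with time spent and capped at full_credit_seconds (assumed 60 seconds).
def weighted_views(view_durations_in_seconds, full_credit_seconds: 60.0)
  view_durations_in_seconds.sum do |seconds|
    [seconds / full_credit_seconds, 1.0].min
  end
end

# A 2-second view adds very little; a 3-minute view counts in full.
puts weighted_views([2, 180])
```

The result could then replace the raw view count in whichever popularity formula is chosen.
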

半世晨晓 2024-09-12 01:49:16

Java code for Anentropic's answer:

public static double getRank(double thumbsUp, double thumbsDown) {
  double totalVotes = thumbsUp + thumbsDown;

  if (totalVotes > 0) {
    return ((thumbsUp + 1.9208) / totalVotes - 
      1.96 * Math.sqrt((thumbsUp * thumbsDown) / totalVotes + 0.9604) / 
      totalVotes) / (1 + (3.8416 / totalVotes));
  } else {
    return 0;
  }
}