How to model product ratings in a database?

Posted on 2024-09-02 19:51:51

What is the best approach to storing product ratings in a database? I have in mind the following two (simplified, and assuming a MySQL db) scenarios:

1. Create two columns in the products table to store the number of votes and the sum of all votes, respectively. Use these columns to compute the average at run time or in a query.

   This approach means I only need to access one table, which keeps things simple.

2. Normalize the data by creating an additional table to store the ratings.

   This isolates the ratings data in a separate table, leaving the products table to furnish data on available products, although it would require a join or a separate query for ratings.
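For comparison with the answers below, here is a minimal sketch of the first (denormalized) option. It uses Python's built-in `sqlite3` module rather than MySQL so that it runs on its own, and the column names `vote_count` and `vote_sum` are hypothetical, not taken from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        vote_count INTEGER NOT NULL DEFAULT 0,  -- hypothetical column name
        vote_sum   INTEGER NOT NULL DEFAULT 0   -- hypothetical column name
    )
    """
)
conn.execute("INSERT INTO products (id, name) VALUES (1234, 'Widget')")


def add_vote(conn, product_id, rating):
    """Record one vote by bumping both counters in a single UPDATE."""
    conn.execute(
        "UPDATE products "
        "SET vote_count = vote_count + 1, vote_sum = vote_sum + ? "
        "WHERE id = ?",
        (rating, product_id),
    )


def average_rating(conn, product_id):
    """Average computed at query time; None if the product is missing or unrated."""
    row = conn.execute(
        "SELECT CAST(vote_sum AS REAL) / NULLIF(vote_count, 0) "
        "FROM products WHERE id = ?",
        (product_id,),
    ).fetchone()
    return row[0] if row else None


for rating in (4, 5, 3):
    add_vote(conn, 1234, rating)

print(average_rating(conn, 1234))  # 4.0
```

Updating both counters in one statement keeps them consistent with each other, but the individual votes are gone: there is no way to tell later who voted or what any single vote was.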

Which approach is best, normalised or denormalised?

Comments (3)

内心旳酸楚 2024-09-09 19:51:52

A separate table for ratings is highly recommended to keep things flexible. Don't worry about hundreds (or thousands, or tens of thousands) of entries; that's peanuts for a database.

Suggestion:

table products

  • id
  • name
  • etc

table products_ratings

  • id
  • productId
  • rating
  • date (if needed)
  • ip (if needed, e.g. to prevent double rating)
  • etc
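As a rough sketch, the two tables above might be declared like this. It uses Python's built-in `sqlite3` instead of MySQL so it can run standalone. The 1 to 5 rating scale, the `UNIQUE (productId, ip)` constraint as the double-rating guard, and the helper `try_rate` are assumptions of this example, not part of the suggestion itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE products (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );

    CREATE TABLE products_ratings (
        id        INTEGER PRIMARY KEY,
        productId INTEGER NOT NULL REFERENCES products(id),
        rating    INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),  -- assumed scale
        date      TEXT DEFAULT CURRENT_TIMESTAMP,
        ip        TEXT,
        UNIQUE (productId, ip)  -- one rating per product and ip (assumed policy)
    );
    """
)
conn.execute("INSERT INTO products (id, name) VALUES (1234, 'Widget')")


def try_rate(product_id, rating, ip):
    """Insert one rating; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO products_ratings (productId, rating, ip) VALUES (?, ?, ?)",
            (product_id, rating, ip),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_rate(1234, 4, "203.0.113.7"))  # True
print(try_rate(1234, 5, "203.0.113.7"))  # False: same product and ip
print(try_rate(9999, 3, "203.0.113.8"))  # False: no such product
```

Letting the database reject duplicates is only one way to handle double rating; an ip address is a weak identity, so a user id column would probably serve better where logins exist.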

Retrieve all ratings for product 1234:

SELECT pr.rating
FROM products_ratings pr
INNER JOIN products p
  ON pr.productId = p.id
WHERE p.id = 1234;

Average rating for product 1234:

SELECT AVG(pr.rating) AS rating_average -- or ROUND(AVG(pr.rating))
FROM products_ratings pr
INNER JOIN products p
  ON pr.productId = p.id
WHERE p.id = 1234;

And it's just as easy to get a list of products along with their average rating:

SELECT
  p.id, p.name, p.etc,
  AVG(pr.rating) AS rating_average
FROM products p
INNER JOIN products_ratings pr
  ON pr.productId = p.id
WHERE p.id > 10 AND p.id < 20 -- or whatever
GROUP BY p.id, p.name, p.etc;
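
One thing to watch with the list query: an INNER JOIN drops products that have no ratings yet. If those should still appear in the list, a LEFT JOIN is one option. The sketch below runs that variant against SQLite through Python's `sqlite3` (a stand-in for MySQL); the sample rows and the `rating_count` column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products_ratings (
        id        INTEGER PRIMARY KEY,
        productId INTEGER NOT NULL REFERENCES products(id),
        rating    INTEGER NOT NULL
    );
    -- Illustrative data: product 12 has no ratings yet.
    INSERT INTO products (id, name) VALUES (11, 'Kettle'), (12, 'Toaster');
    INSERT INTO products_ratings (productId, rating) VALUES (11, 4), (11, 5);
    """
)

rows = conn.execute(
    """
    SELECT p.id, p.name,
           AVG(pr.rating) AS rating_average,  -- NULL when there are no ratings
           COUNT(pr.id)   AS rating_count     -- 0 when there are no ratings
    FROM products p
    LEFT JOIN products_ratings pr
      ON pr.productId = p.id
    GROUP BY p.id, p.name
    ORDER BY p.id
    """
).fetchall()

for row in rows:
    print(row)
# (11, 'Kettle', 4.5, 2)
# (12, 'Toaster', None, 0)
```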
苯莒 2024-09-09 19:51:52

I know this isn't what you actually asked, but you may want to consider that with such a system new products can almost never beat the old ones. Say you have a product with a 99% rating. If you sort by highest rating, it will be very difficult for new products to climb to the top.

多孤肩上扛 2024-09-09 19:51:52

Do not store a record of each rating unless you specifically need the individual records. One such case would be a psychological experiment that analyzes specific properties of the raters themselves. So, yeah! You'd have to be about that crazy to store each rating in a separate record.

Now, on to the solution: add two more columns to your products table, AverageRating and RateCount.
What would you store in them? Well, suppose you already have the average of two ratings, 2 and 3, which is 2.5. When a new rating of 10 comes in, multiply the average (2.5) by the rate count (2 in this case) to recover the running sum, 5. Add the new rating (10) to get 15, then divide by the new count, 3.

All of the above fits in one simple formula:

(AverageRating * RateCount + NewRateValue) / (RateCount + 1)

So (2.5 * 2 + 10) / (2 + 1) = 5.
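
The formula above translates directly into a small server-side helper. This is only a sketch: the function name `add_rating` and its parameter names are made up here to mirror AverageRating, RateCount and NewRateValue.

```python
def add_rating(average_rating, rate_count, new_rate_value):
    """Fold one new rating into a stored average.

    Returns the (new_average, new_count) pair to write back to the
    AverageRating and RateCount columns.
    """
    new_count = rate_count + 1
    new_average = (average_rating * rate_count + new_rate_value) / new_count
    return new_average, new_count


print(add_rating(2.5, 2, 10))  # (5.0, 3)
print(add_rating(0.0, 0, 4))   # (4.0, 1), the very first rating
```

Because the old values are read and the new ones written in separate steps, two requests arriving at the same moment could overwrite each other's update, so in practice the read and write would probably need to happen inside one transaction.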

Calculate the average on the server side (not in your database), then store the result in the AverageRating column and the updated count in the RateCount column.

Simple, right?!

Edit

This solution doesn't require storing each rating separately, as long as ratings are never reviewed, edited, or deleted. For those cases, suppose you have a review with a rating of 3 that its owner would like to change to 4. The formula to recalculate the average rating would then be:

(AverageRating * RateCount - OldRateValue + NewRateValue) / RateCount
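
Here is the edit formula as a sketch in Python; `change_rating` and its parameter names are hypothetical and simply mirror the symbols in the formula.

```python
def change_rating(average_rating, rate_count, old_rate_value, new_rate_value):
    """Recompute the stored average after one existing rating is changed.

    The count stays the same; only the running sum shifts by (new - old).
    """
    if rate_count <= 0:
        raise ValueError("there is no existing rating to change")
    return (average_rating * rate_count - old_rate_value + new_rate_value) / rate_count


# Three ratings 2, 3 and 4 average to 3.0; changing the 3 to a 4 gives 10 / 3.
print(change_rating(3.0, 3, 3, 4))
```

Note that the formula needs OldRateValue, so the rating being edited has to be available from somewhere, for example stored with the review itself.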

References

https://math.stackexchange.com/a/106314
