Improving the performance of a Rails model
I have the following model that allows Users to cast Votes on Photos.
class Vote < ActiveRecord::Base
  attr_accessible :value
  belongs_to :photo
  belongs_to :user
  validates_associated :photo, :user
  validates_uniqueness_of :user_id, :scope => :photo_id
  validates_uniqueness_of :photo_id, :scope => :user_id
  validates_inclusion_of :value, :in => [-2,-1,1,2], :allow_nil => true
  after_save :write_photo_data

  def self.score
    dd = where( :value => -2 ).count
    d = where( :value => -1 ).count
    u = where( :value => 1 ).count
    uu = where( :value => 2 ).count
    self.compute_score(dd,d,u,uu)
  end

  def self.compute_score(dd, d, u, uu)
    tot = [dd,d,u,uu].sum.to_f
    score = [-5*dd, -2*d, 2*u, 5*uu].sum / [tot,4].sum*20.0
    score.round(2)
  end

  private

  def write_photo_data
    self.photo.score = self.photo.votes.score
    self.photo.save!
  end
end
This functions very well; however, computing the score for a photo is pretty slow: it seems to take 7-12 seconds on average. I've tried adding indices for photo_id, user_id, and one combined for photo_id and value, but this hasn't really improved the performance as far as I can tell.
I'd be interested in feedback from any serious rails gurus (I'm totally an amateur) as to how this could be optimized / improved. How would you tally up votes for a particular photo and value?
Thanks!
--EDIT--
Note that the scores -2, -1, 1, 2 represent "two thumbs down, one thumb down, thumb up, two thumbs up", not specific values. I could match these to the values I've assigned to them in the compute score method, but I haven't done that so far because I may want to tweak the weightings over time after seeing more data accumulated.
Also, regardless of how I represent those four possible votes in the DB, I still need both the COUNT of each kind of vote as well as the weighted value of those votes for each photo to compute the score. Thanks!
Comments (4)
Why do you store -2, -1, 1, 2 instead of the actual grade? If you store the grade (-5, for example), you will be able to compute the score in the DB directly without having to run 4 count queries. This will be an improvement for sure.
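As a rough illustration of this suggestion: if the stored value were already the weight, the score would only need the sum of the values and the number of votes, which in ActiveRecord would presumably come from photo.votes.sum(:value) and photo.votes.count. The sketch below is plain Ruby with an array standing in for the table; score_from_sum is a hypothetical helper name, and the formula is the one from compute_score in the question.

```ruby
# Hypothetical helper: same formula as compute_score in the question,
# but fed a precomputed weighted sum and a vote count.
def score_from_sum(weighted_sum, vote_count)
  (weighted_sum / (vote_count + 4).to_f * 20.0).round(2)
end

# Stand-in for the votes of one photo, stored as weights (-5, -2, 2, 5).
# In the app these two numbers would come from the database, e.g.
# photo.votes.sum(:value) and photo.votes.count (assumed, not run here).
stored_values = [-5, -2, -2, 2, 2, 2, 5, 5, 5, 5]

puts score_from_sum(stored_values.sum, stored_values.size)
```

The trade-off is the one raised in the question's edit: with weights stored in the table, changing the weighting later means rewriting existing rows.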
Putting an index on the value column will speed up the SELECTs if you have lots of records in the DB.
The above posts also bring up some good points on direct optimization. However, as your DB scales, all of these approaches will eventually fall down. Since the score is a derived value, you could cache it in Memcached, Redis, or even SQL, which ensures that fetching the score stays constant-time as the app grows. You can allow the caches to get out of date and keep them updated using a background process. By doing so, your calculation function can take arbitrarily long without impacting the user experience.
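A minimal sketch of the caching idea, assuming an in-process hash as a stand-in for Memcached or Redis. ScoreCache, its ttl parameter, and the injectable clock are all hypothetical names for illustration. Note that this version recomputes lazily when an entry expires, rather than refreshing from a background process as described above.

```ruby
# Hypothetical in-process stand-in for Memcached/Redis: caches a derived
# score per photo id and recomputes it only after ttl seconds have passed.
class ScoreCache
  def initialize(ttl:, clock: -> { Time.now.to_f })
    @ttl = ttl
    @clock = clock
    @entries = {} # photo_id => [score, stored_at]
  end

  # Returns the cached score while it is fresh; otherwise runs the block
  # (the slow calculation) and stores its result.
  def fetch(photo_id)
    score, stored_at = @entries[photo_id]
    return score if stored_at && @clock.call - stored_at < @ttl

    fresh = yield
    @entries[photo_id] = [fresh, @clock.call]
    fresh
  end
end

slow_calls = 0
cache = ScoreCache.new(ttl: 60)
2.times { cache.fetch(42) { slow_calls += 1; 24.29 } }
puts slow_calls # prints 1: the second read was served from the cache
```

In a Rails app, Rails.cache.fetch with an expires_in option plays a similar role against whichever cache store is configured.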
You need an index on value, by itself. Combined indexes only work when the query uses their columns starting from the left. Since your where clause does not specify a photo id, it's not using your combined index.
Update: see http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
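The leftmost-prefix rule described here can be modelled in a few lines of plain Ruby. This is a deliberately simplified sketch, not how MySQL's optimizer actually decides: usable_prefix is a hypothetical helper that only reports which leading columns of a composite index are constrained by a query's WHERE clause.

```ruby
# Simplified model of the leftmost-prefix rule: returns the leading index
# columns that a query can use, given the columns in its WHERE clause.
def usable_prefix(index_columns, where_columns)
  index_columns.take_while { |col| where_columns.include?(col) }
end

composite = [:photo_id, :value]

p usable_prefix(composite, [:value])            # => []
p usable_prefix(composite, [:photo_id])         # => [:photo_id]
p usable_prefix(composite, [:photo_id, :value]) # => [:photo_id, :value]
```

A query filtering only on value gets no help from an index on (photo_id, value), which is why a separate index on value is suggested.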
One thing you could do is ask the database once instead of four times for the score counts, for example with a grouped count on value. That results in a single database query and gives you a hash of counts keyed by vote value.
Besides that, if you store the actual values [-5, -2, 2, 5] instead of [-2, -1, 1, 2] in the database, you could get your sum directly from the database (or even use avg to get the average instead).
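The answer's original snippets did not survive on this page, so what follows is only a sketch of the single-query idea under stated assumptions. In ActiveRecord, photo.votes.group(:value).count returns a hash of value => count from one query; the plain Ruby below takes such a hash and applies the formula from compute_score in the question. WEIGHTS and score_from_counts are hypothetical names.

```ruby
# Weight per vote value, taken from compute_score in the question.
WEIGHTS = { -2 => -5, -1 => -2, 1 => 2, 2 => 5 }.freeze

# counts is a hash of vote value => number of votes, the shape that a
# grouped count returns. Values without a weight (such as nil) are
# ignored, matching the four where(...) counts in the question.
def score_from_counts(counts)
  known = WEIGHTS.keys.map { |value| [value, counts.fetch(value, 0)] }
  total = known.sum { |_value, n| n }.to_f
  weighted = known.sum { |value, n| WEIGHTS[value] * n }
  (weighted / (total + 4) * 20.0).round(2)
end

# Stand-in for the database result; in the app this would come from
# photo.votes.group(:value).count (assumed, not run here).
counts = { -2 => 1, -1 => 2, 1 => 3, 2 => 4 }
puts score_from_counts(counts)
```

Keeping the weights in a Ruby constant leaves the stored values untouched, so the weighting can still be tweaked later without migrating data.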