Is storing database record counts redundant?

Posted 2024-08-06 07:37:03 | 422 words | 2 views | 0 comments

I'm using Rails and MySQL, and have an efficiency question based on row counting.

I have a Project model that has_many :donations.

I want to count the number of unique donors for a project.

Is it a good idea to add a field called num_donors to the projects table and increment it whenever a new donor is created?

Or is something like @num_donors = Donor.count(:select => 'DISTINCT user_id') going to be similar or the same in terms of efficiency thanks to database optimization? Will this require me to create indexes for user_id and any other fields I want to count?

Does the same answer hold for summing the total amount donated?

Comments (5)

热情消退 2024-08-13 07:37:03

To answer the title question: yes, it is redundant, but whether you should do it depends on your situation.

Unless you have known performance problems, calculate the counts and totals on the fly in your application and don't store them. That is, don't store calculated values unless you have no other choice.

In most situations, you won't have to resort to this, and you shouldn't.
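
As a rough illustration of computing both numbers on the fly, here is a minimal plain-Ruby sketch. The Donation struct, the DONATIONS sample rows, and the amount field are assumptions made for the example and stand in for the real donations table; in a Rails app the database would do this work, with something along the lines of project.donations.distinct.count(:user_id) and project.donations.sum(:amount) in recent Rails versions.

```ruby
# Hypothetical in-memory stand-in for rows of a donations table.
# Amounts are whole numbers to keep the arithmetic exact.
Donation = Struct.new(:project_id, :user_id, :amount)

DONATIONS = [
  Donation.new(1, 10, 25),
  Donation.new(1, 11, 50),
  Donation.new(1, 10, 5),   # repeat donor for project 1
  Donation.new(2, 12, 100)
]

# Count the unique donors of one project, derived from the source rows.
def num_donors(donations, project_id)
  donations.select { |d| d.project_id == project_id }
           .map(&:user_id)
           .uniq
           .size
end

# Sum the amount donated to one project, derived from the source rows.
def total_donated(donations, project_id)
  donations.select { |d| d.project_id == project_id }
           .sum(&:amount)
end

puts num_donors(DONATIONS, 1)     # prints 2
puts total_donated(DONATIONS, 1)  # prints 80
```

Nothing is stored here, so there is no second copy of the numbers that could drift away from the donations themselves.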

If you must store calculated values, do the following:

  • Don't keep it up to date by incrementing it. Recalculate the count/total from all the data each time you update it.
  • If you don't have a lot of updates, put the code in an update trigger to keep the count/totals up to date.
  • The trouble with redundancy in databases is that when the numbers disagree, you are unsure which one is authoritative. Add a note to the documentation stating that the source data is authoritative if they disagree, and that the stored value can be overwritten.
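
The "recalculate, don't increment" advice above can be sketched roughly as follows. This is plain Ruby with hypothetical names (Gift, ProjectStats, add_gift, recalculate!), not the asker's actual models; in a real app the recalculation would be a query against the donations table, possibly run from a trigger or callback.

```ruby
# Hypothetical source record for one donation.
Gift = Struct.new(:user_id, :amount)

class ProjectStats
  attr_reader :num_donors, :total_amount

  def initialize
    @gifts = []         # source data (authoritative)
    @num_donors = 0     # stored, derived value
    @total_amount = 0   # stored, derived value
  end

  # Record a donation, then rebuild the stored values from all source data
  # instead of incrementing them.
  def add_gift(user_id, amount)
    @gifts << Gift.new(user_id, amount)
    recalculate!
  end

  # Overwrite the stored values with what the source data says.
  def recalculate!
    @num_donors = @gifts.map(&:user_id).uniq.size
    @total_amount = @gifts.sum(&:amount)
  end
end

stats = ProjectStats.new
stats.add_gift(10, 25)
stats.add_gift(11, 50)
stats.add_gift(10, 5)    # repeat donor: a naive "+= 1" would overcount here
puts stats.num_donors    # prints 2
puts stats.total_amount  # prints 80
```

Because every update rebuilds the stored values from the source rows, a missed or doubled increment cannot leave a permanent error behind.
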
澉约 2024-08-13 07:37:03

While it depends on the size of your database, these are the kinds of operations that databases specialize in, so they should be fast. This is probably a case of premature optimization: start by not storing the totals, which keeps things simpler, and optimize later if necessary.

话少情深 2024-08-13 07:37:03

Remember the maxim "A man with one watch always knows the time. A man with two watches is never sure." I would only store the derived number if:

Performance issues stop you from getting the derived numbers when you need them (which should not be a problem in this case since the answer is likely to be available from the indexes)

or

You have reason to believe that you are losing records from the main table through programmer error or through deliberate or accidental user action. In that case, you can use the derived number to audit the currently calculated number.
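
A minimal sketch of that kind of audit, assuming the stored number and the list of donor user_ids have already been fetched; audit_count and the shape of its return value are hypothetical and chosen only for illustration.

```ruby
# Compare a stored (derived) donor count with a count recomputed from the
# source rows. Returns nil when they agree, or a hash describing the mismatch.
def audit_count(stored_count, user_ids)
  computed = user_ids.uniq.size
  return nil if stored_count == computed

  { stored: stored_count, computed: computed, difference: stored_count - computed }
end

p audit_count(2, [10, 11, 10])  # nil: the numbers agree
p audit_count(3, [10, 11, 10])  # stored is one higher than the source data shows
```

A positive difference suggests source rows went missing after the stored number was last written; a negative one suggests the stored number was never updated.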

◇流星雨 2024-08-13 07:37:03

Peter's and JohnFx's answers are sound. What you're proposing is a denormalization of your database schema, which can improve read performance but slows down writes, and it puts the onus on the developer (or on additional DBMS cleverness) to prevent inconsistencies within your dataset.

ActiveRecord has some built-in functionality to automatically manage counts on has_many relationships. Check out the Railscast on counter caches.

淡忘如思 2024-08-13 07:37:03

Did you know that a single flag gets ActiveRecord to do this for you?

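class ThingOwner < ActiveRecord::Base
  # The thing_owners table has a column like
  #   t.integer :things_count, :default => 0
  has_many :things
end

class Thing < ActiveRecord::Base
  # :counter_cache is declared on the belongs_to side; Rails then keeps
  # thing_owners.things_count up to date as things are created and destroyed.
  belongs_to :thing_owner, :counter_cache => true
end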

As for the question: yes, it is certainly redundant. I would add such a counter if and only if things.count accounts for too large a share of the running time.

Otherwise it's premature optimization.
