Will normalization really hurt the performance of a high-traffic website?
I am designing a database and I would like to normalize it. In one query I will be joining about 30-40 tables. Will this hurt the website's performance if it ever becomes extremely popular? This will be the main query, and it will be called 50% of the time. In the other queries I will be joining about two tables.
I have a choice right now to normalize or not to normalize, but if normalization becomes a problem in the future I may have to rewrite 40% of the software, and that could take me a long time. Does normalization really hurt in this case? Should I denormalize now while I have the time?
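Purely for illustration (the real schema is not shown in the question), here is a minimal sketch of the trade-off being asked about, using SQLite and invented table names (`users`, `products`, `orders`): a normalized design reads through joins, while a denormalized copy trades duplicated data and more expensive writes for a join-free read.

```python
# Hypothetical illustration only: "users", "products", and "orders" are
# invented names; the question's actual schema is not shown.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                           user_id    INTEGER REFERENCES users(id),
                           product_id INTEGER REFERENCES products(id));
""")

# Normalized read: each fact is stored once, so the query joins.
normalized_query = """
    SELECT u.name, p.title
    FROM orders o
    JOIN users    u ON u.id = o.user_id
    JOIN products p ON p.id = o.product_id
"""

# Denormalized read: the same facts copied into one wide table,
# trading duplicated data (and harder writes) for a join-free read.
conn.execute("""
    CREATE TABLE orders_flat (id INTEGER PRIMARY KEY,
                              user_name TEXT, product_title TEXT)
""")
denormalized_query = "SELECT user_name, product_title FROM orders_flat"

print(conn.execute(normalized_query).fetchall())    # [] (no data inserted)
print(conn.execute(denormalized_query).fetchall())  # []
conn.close()
```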
5 Answers
I quote: "normalize for correctness, denormalize for speed - and only when necessary"
I refer you to: In terms of databases, is "Normalize for correctness, denormalize for performance" a right mantra?
HTH.
When performance is a concern, there are usually better alternatives than denormalization:
Normalization can hurt performance. However, this is no reason to denormalize prematurely.
Start with full normalization and then you'll see if you have any performance problems. At the rate you are describing (1000 updates/inserts per day) I don't think you'll run into problems unless the tables are huge.
And even then, there are tons of database optimization options (indexes, prepared statements, stored procedures, materialized views, ...) that you can use.
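As a rough sketch of two of the options this answer mentions, the snippet below adds an index and uses a parameterized ("prepared") query via SQLite. The `orders` table and its columns are invented for the example; materialized views are not shown since SQLite does not support them.

```python
# Minimal sketch, assuming an invented "orders" table: an index on the
# column the hot query filters on, plus a parameterized query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)

# Index the column the frequent query filters/joins on, so lookups no
# longer scan the whole table.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")

# Parameterized query: the plan can be reused and it avoids SQL injection;
# many databases treat this as a prepared statement.
rows = conn.execute(
    "SELECT id, total FROM orders WHERE user_id = ?", (42,)
).fetchall()
print(rows)  # [] -- no data inserted in this sketch
conn.close()
```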
Maybe I'm missing something here. But if your architecture requires you to join 30 to 40 tables in a single query, and that query is the main use of your site, then you have larger problems.
I agree with the others: don't prematurely optimize your site. However, you should optimize your architecture to account for your main use case. A 40-table join for a query run over 50% of the time is not optimized, IMO.
Don't make early optimizations. Denormalization isn't the only way to speed up a website. Your caching strategy is also quite important and if that query of 30-40 tables is of fairly static data, caching the results may prove to be a better optimization.
Also, take into account the number of writes to the number of reads. If you are doing approximately 10 reads for every insert or update, you could say that data is fairly static, hence you should cache it for some period of time.
If you end up denormalizing your schema, your writes will also become more expensive and potentially slow things down as well.
Really analyze your problem before making too many optimizations, and wait to see where the bottlenecks in the system really are, as you might end up being surprised by what you should optimize in the first place.
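To make the caching idea above concrete, here is a minimal sketch of keeping the result of the expensive read in memory for a short TTL, so mostly-static data is not recomputed on every request. `run_expensive_query()` and the 60-second TTL are placeholders, not anything specified in the answer.

```python
# Minimal TTL-cache sketch: run_expensive_query() stands in for the real
# 30-40 table join; the TTL should match how static the data actually is.
import time

_cache = {"result": None, "expires_at": 0.0}
CACHE_TTL_SECONDS = 60  # placeholder value

def run_expensive_query():
    # Placeholder for the real database call (the big join).
    return [("example", "row")]

def get_main_page_data():
    now = time.monotonic()
    if _cache["result"] is None or now >= _cache["expires_at"]:
        _cache["result"] = run_expensive_query()        # cache miss: hit the DB
        _cache["expires_at"] = now + CACHE_TTL_SECONDS  # refresh the TTL
    return _cache["result"]                             # cache hit: no DB work

print(get_main_page_data())  # first call queries; later calls within the TTL do not
```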