Are there scalability best practices specifically for sites with very large audiences?
While this question has been asked in a variety of contexts before, I can't find any information pertaining specifically to sites targeting very large audiences - for example on the scale of hundreds of thousands or even millions of users.
When writing sites that target smaller audiences (such as intranet-hosted, data-driven sites that handle from a few to a few thousand users) we only tend to follow best practices within the confines of our project budgets/deadlines - i.e. developer costs, rollout schedules and maintainability have a far bigger impact on how we code things than we would often like.
Some things are also negligible (to a point), for instance delivery time, image compression/size and bandwidth, because a LAN-hosted application tends to incur only a relatively small financial cost for them, which (within reason) we don't need to worry about too much.
However, when looking to target a much broader audience, for instance an audience of (hopefully) millions of users:
- Are there any best practices that no longer need to be worried about (i.e. become more negligible the larger the audience)?
- Are there any practices that should be adhered to even more tightly?
- Also, are there any practices that only really come into play as your audience achieves some critical mass [and what would that critical mass be]? i.e. applying artificial constraints that wouldn't begin to concern you on a private network
Examples I've come across so far are:
- Host libraries such as jQuery on Google's CDN rather than on your own servers, as they can be served much faster from there. This will also help keep bandwidth costs down for delivery of your site.
- Host images on a CDN for the same reason as hosting your JavaScript code elsewhere.
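To make the CDN examples concrete, here is a minimal sketch of one way to keep that choice in a single place. The host `cdn.example.com`, the `/static/` prefix and the `asset_url` helper are all made up for illustration; the idea is only that templates ask a helper for the asset URL, so moving images and scripts to a CDN is a configuration change rather than a code change.

```python
# Hypothetical setting: an empty string means "serve assets from our own servers".
CDN_BASE = "https://cdn.example.com"


def asset_url(path: str, cdn_base: str = CDN_BASE) -> str:
    """Return the URL a template should use for a static asset."""
    clean = path.lstrip("/")
    if not cdn_base:
        # No CDN configured: fall back to the local static directory.
        return "/static/" + clean
    return cdn_base.rstrip("/") + "/" + clean
```

On an intranet deployment you would presumably leave the CDN setting empty and let the same templates serve everything locally.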
Comments (4)
I guess it depends on what one aims for on the "triangle" of pressures: CAP (Consistency, Availability & Partition tolerance). E.g. one can only have so much "C" when faced with network disruptions which incur "P".
Nowadays, it would appear that the emphasis is more on delivering a "good user experience", which seems to hinge on "Time to Result" (e.g. having a complete web page on the user's desktop): this translates to investing (amongst other things) more on the "A" and "P" sides than on the "C" one.
More concretely: spend some time deciding when to perform data aggregation for the presentation layer to your users e.g. can I aggregate this data over a longer time period before recomputing another view to push?
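As a rough sketch of that trade-off (not a definitive implementation), the class below recomputes an aggregated view at most once per `max_age` seconds and serves the slightly stale copy in between. The name `CachedView` and its parameters are invented for this example; the clock is injectable only so the behaviour can be checked without waiting.

```python
import time
from typing import Any, Callable, Optional


class CachedView:
    """Serve an aggregate that is recomputed at most once per `max_age` seconds."""

    def __init__(self, compute: Callable[[], Any], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._compute = compute      # the expensive aggregation query
        self._max_age = max_age      # how much staleness we tolerate, in seconds
        self._clock = clock
        self._value: Any = None
        self._computed_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        stale = (self._computed_at is None
                 or now - self._computed_at >= self._max_age)
        if stale:
            # Recompute only when the cached copy is older than max_age.
            self._value = self._compute()
            self._computed_at = now
        return self._value
```

A larger `max_age` means fewer recomputations and a faster "Time to Result", at the price of showing users older data, which is the "C" being traded away.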
Of course, I am only barely scratching the surface of the problem.
I think there are three big things to keep in mind here:
a) You aren't going to write the next twitter/youtube/facebook/ebay/amazon/whatever. It doesn't happen too often, so it is a big case of YAGNI.
b) If you do happen to write one of those, chances are you'll have the opportunity to rewrite the application more than a few times.
c) The only object lesson from any of the architecture types who have spoken publicly about those apps is that scaling horizontally is the way to go. Vertical scaling maxes out real, real quick.
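For what scaling horizontally can look like in code, here is a minimal sketch of one common approach, hash-based sharding. The answer above does not prescribe this or any other technique, and `pick_node` and the node names are assumptions made up for the example. Each user is mapped to one of several identical boxes by a stable hash of their ID.

```python
import hashlib
from typing import Sequence


def pick_node(user_id: str, nodes: Sequence[str]) -> str:
    """Map a user to one node using a stable hash of the user ID."""
    if not nodes:
        raise ValueError("at least one node is required")
    # hashlib gives the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node names, for illustration only.
NODES = ["app-1", "app-2", "app-3"]
chosen = pick_node("alice", NODES)
```

Note that plain modulo sharding like this moves most users to a different node whenever the number of nodes changes; consistent hashing is the usual way to soften that.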
Also, I'd argue that process improvements become much bigger at these lofty scales. You will have legions of developers, strict deployment windows and lots of boxes to worry about. It had better be real scripted, automated and repeatable.
I would check out YSlow and follow its recommendations for improving performance.
@jldupont - Just looked at the presentation that you linked to. One thing I didn't get is why "Distributed Databases" is given as an example scenario where you lose Availability to gain Consistency and Partition tolerance.
I think for distributed databases you lose Consistency.