Plan for optimal performance and scalability from the start?
First question on Stack Overflow. I have no previous experience of running a high-traffic website, and I would consider myself somewhere between a novice and an intermediate programmer... please be gentle :)
I am trying to make a social website that I ultimately hope will handle a lot of traffic and users. However, I don't know if the concept will fly, and programming for scalability is a lot of additional work compared to slapping together some sloppy code that functionally works the same way. In addition, since I'm relatively uninformed about programming for high scalability, I find myself doing a lot of research, which is slowing me down further (highscalability.com is amazing... I'm currently trying to figure out offline queues).
My question is, should I:
A)
1. put together some code that's suboptimal but functional (somewhat sloppy code, excessive database queries, no caches, etc.)
2. work on gathering traffic
3. rewrite and restructure code
or
B)
1. fully research scalable designs and apply from the beginning so I don't have to restructure much
2. work on gathering traffic
Any advice is appreciated, thank you.
Comments (4)
Web development is a continual process. We may think we know what we want at the beginning, but it will inevitably change by the time we get there.
I suggest that you start by getting the book by the 37 Signals Crew -- Getting Real:
http://gettingreal.37signals.com/
Mix A and B. Try to get a good hosting situation. Think about ways that you can cache (memcache -- it's easy). Write clear code, but don't spend too much time...
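As a rough illustration of the caching idea above, here is a minimal cache-aside sketch in Python. It assumes an in-process dictionary with expiry times standing in for a real memcached client; `TTLCache`, `load_profile_from_db`, `get_profile`, and the `profile:<id>` key format are hypothetical names invented for this example, not part of any library.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a memcached client (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


cache = TTLCache()
stats = {"db_calls": 0}  # counts how often the "database" is hit


def load_profile_from_db(user_id):
    """Placeholder for a slow database query (hypothetical)."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user{user_id}"}


def get_profile(user_id, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile


if __name__ == "__main__":
    get_profile(1)
    get_profile(1)  # served from the cache, no second database call
    print(stats["db_calls"])
```

The point of the pattern is that the calling code only ever uses `get_profile`, so the cache can later be swapped for a shared one without touching the callers.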
"Release Early, Release Often".
--
Here is a tale of two projects.
Let wisdom guide you...
I would go for option A. It's much harder to generate traffic to a website than it is to improve performance. If your idea is unique then time to market should be your primary goal. http://highscalability.com/ contains tons of good articles on how others have solved scalability problems.
Spend a couple weeks studying Cal Henderson's Building Scalable Web Sites, Theo Schlossnagle's Scalable Internet Architectures, and of course the site you've already found, Todd Hoff's excellent highscalability.com. At a minimum you'll understand the tradeoffs between (A) and (B) and be able to make a better decision.
Also spend time looking at Amazon Web Services, especially their EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service). A group at my company just deployed a web application on the Amazon infrastructure, and it was dramatically simpler than trying to run it on their own physical hardware.
If you're still at an early ideation stage and just want to work out your ideas and run small experiments, (A) would work well. But once you decide you want to deploy a small-scale trial leading into a full-scale product, you absolutely need to follow (B).
When you start to shift into (B) mode, I'd suggest you use AWS to save nearly all the effort and capital expenditure in setting up your own infrastructure. Use some of the time you'll save using AWS to thoroughly learn (B) and apply the lessons. Then if you succeed, your scalable architecture will allow you to rent as many AWS machine-hours as you need. If you don't succeed, you'll have learned a lot of very useful things to apply to your next startup idea (or job).
Keep in mind this isn't an either-or choice either. Once you understand the basic principles behind scaling, you'll be able to start out along path (B) with something simple, while having the comfort of knowing how you'll progress to the next step. Danga has some very interesting presentations along these lines. Take a look at this one, and you'll see how they started off with just one machine, shifted to an app server machine and a database machine, then to three app servers and a database machine, and so on.
You make it sound like A) would result in sloppy, poorly thought-out code that works, but will not scale well and is almost certainly going to require a rewrite once you already have users and need to provide reasonable uptime. Fixing preventable problems once you already have traffic sounds like a nightmare.
I would definitely go with B). Thinking about, researching and planning the architecture of your application, not just for optimisation or performance but also just for sensible overall design, is an absolute must for any non-trivial software application.
There is a common myth that premature optimisation is the root of all evil. I think this is false; it would be more accurate to say that unnecessary optimisation is the root of all evil. Do not make the newbie mistake of optimising where it doesn't matter, which is just going to mess up your code, but do spend the time finding out which optimisations DO matter.
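One way to find out which optimisations matter is to measure before changing anything. The sketch below uses Python's standard `cProfile` and `pstats` modules to see where a request spends its time; `slow_step`, `fast_step`, `handle_request`, and `profile_report` are hypothetical names made up for illustration.

```python
import cProfile
import io
import pstats


def slow_step():
    """Stand-in for the expensive part of a request (hypothetical)."""
    return sum(i * i for i in range(200_000))


def fast_step():
    """Stand-in for a cheap part that is not worth optimising (hypothetical)."""
    return sum(range(100))


def handle_request():
    fast_step()
    return slow_step()


def profile_report(func, top=5):
    """Run func under the profiler and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(handle_request))
```

In a report like this, the functions with the largest cumulative time are the candidates worth attention; anything far down the list is probably not worth complicating the code for.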
Twitter nearly died when they realised they'd made some poor DB design choices once they already had traffic.