Can you really scale up with Django, given that you can only use one database? (in models.py and settings.py)
Django only allows you to use one database in settings.py.
Does that prevent you from scaling up? (millions of users)
Comments (6)
Django now has support for multiple databases.
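As a rough illustration of what that support looks like: Django reads a `DATABASES` dict keyed by alias and consults the classes listed in `DATABASE_ROUTERS` to decide which alias a query uses. The sketch below assumes two aliases, `default` and `replica`, with made-up hostnames and a hypothetical router name; it is one possible read/write split, not a recommended production setup.

```python
import random

# Would live in settings.py. The "replica" alias, the hosts and the database
# name are assumptions for illustration.
DATABASES = {
    "default": {"ENGINE": "django.db.backends.mysql", "NAME": "app", "HOST": "db-primary.example"},
    "replica": {"ENGINE": "django.db.backends.mysql", "NAME": "app", "HOST": "db-replica.example"},
}


class PrimaryReplicaRouter:
    """Hypothetical router: reads go to a replica, writes go to the primary."""

    read_aliases = ["replica"]

    def db_for_read(self, model, **hints):
        return random.choice(self.read_aliases)

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same data, so relations between objects are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Only run migrations against the primary.
        return db == "default"


# settings.py would then reference the router by dotted path (path is an assumption):
# DATABASE_ROUTERS = ["myproject.routers.PrimaryReplicaRouter"]
```

A single query can also name its database explicitly, for example `Model.objects.using("replica")`.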
The database isn't your bottleneck.
Check your browser carefully.
For each page of HTML you're sending (on average) 8 other files, some of which may be quite large. These are your JS, CSS, graphics, etc.
The actual performance bottleneck is the browser requesting those files and accepting the bytes s... l... o... w... l... y...
To scale, then, do this.
Use multiple front-ends balanced with a pure software solution like wackamole. http://www.backhand.org/wackamole/
Use proxy servers like squid to send the "other" files. They're largely static. This is where 7/8ths of the work is done downloading to the client. Don't scrimp on getting these right.
Use multiple, concurrent mod_wsgi/Django to create the -- rare -- piece of dynamic HTML based on DB queries. Be sure that mod_wsgi is in daemon mode so that you can have multiple Django servers available to Apache. Build as many of these as you need. They're all identical, all in parallel, and all shared by Wackamole.
Use a single, fast database like MySQL for the few things that must come from a database. MySQL will make use of multiple cores on its server, so it will scale reasonably well without you having to do anything other than buy memory. Put this on a separate box, all by itself, dedicated and tuned for just this.
You'll find that this scales nicely. You'll find that the load is shared nicely between squid, apache, the Django daemons and the actual database. You'll also find that each part of the load (from the boring static parts to the interesting database query) happens separately and concurrently.
Finally, buy Schlossnagle's book. http://www.amazon.com/Scalable-Internet-Architectures-Theo-Schlossnagle/dp/067232699X
Read scaling to millions of users is not a database problem; it is solved with load balancing, caching and so on (see S. Lott's answer above).
Write scaling can indeed be a database problem. "Sharding" and having multiple databases can be one solution, but that's hard with SQL while still retaining the relationality of the database. Popular solutions there are the new types of "nosql" databases. But if you really have those problems, then you need serious expert help, not just answers from dudes on Stack Overflow. :)
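To make the sharding idea a little more concrete, here is a minimal sketch of hash-based shard selection. The shard aliases and function names are hypothetical; in a Django project the aliases would be keys in `DATABASES`.

```python
import zlib

# Hypothetical shard aliases, one per database server.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for_user(user_id: int) -> str:
    """Map a user id to a shard alias with a stable hash."""
    # crc32 gives the same result in every process, so all web servers agree.
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]


def group_by_shard(user_ids):
    """Bucket ids by shard; a query spanning users must visit every bucket."""
    buckets = {}
    for uid in user_ids:
        buckets.setdefault(shard_for_user(uid), []).append(uid)
    return buckets
```

This also shows where the relational pain comes from: rows for two users may sit on different servers, so a single SQL join cannot cover both, and changing the number of shards moves most keys to a new home.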
Some great answers already (S. Lott's, for example); however, I thought I should chime in with a few more things:
Make sure not to use the database for logical operations
I understand the attractiveness of ORDER BY or SQL procedures; however, you only have one database but multiple Django servers, so let the servers handle this if you can. Of course, if you only want the last ten rows according to a certain criterion (date), then by all means specify that in the request ;) Just make sure not to overload your database with operations that could be handled elsewhere.
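A small sketch of that split, using SQLite from the standard library as a stand-in for the real database (the table and function names are made up): the "last ten rows" selection stays in SQL, while presentation-level reshuffling happens on the web server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, created TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO event (created, label) VALUES (?, ?)",
    [(f"2024-01-{day:02d}", f"event {day}") for day in range(1, 31)],
)


def last_ten_in_db(conn):
    """Let the database pick the ten newest rows; only ten rows leave the server."""
    return conn.execute(
        "SELECT label, created FROM event ORDER BY created DESC LIMIT 10"
    ).fetchall()


def regroup_in_app(rows):
    """Display-level work done on the web server instead of in SQL."""
    return sorted(rows, key=lambda row: row[0])  # re-sort by label


newest = last_ten_in_db(conn)
display = regroup_in_app(newest)
```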
Throw more hardware at the problem
MySQL and Oracle scale quite well with hardware; if you have a small performance problem, you could begin by adding more hardware.
Split your database
I know that, because of relationships, you have to manage some tables together. However, if you ever have a load problem, try to group your tables: for example, if you have a "history" group of tables, perhaps it could work without the others and live on a separate server.
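In current Django, one way to express such a split is a database router keyed on the app label. The sketch below assumes a hypothetical `history` app and a `history_db` alias defined in `DATABASES`; `fake_model` is only a stand-in so the router can be tried without Django installed.

```python
from types import SimpleNamespace

# Hypothetical app label and alias; "history_db" would be a second entry in
# settings.DATABASES pointing at the separate server.
HISTORY_APPS = {"history"}


class HistoryRouter:
    """Send every model in the history app to its own database."""

    def _alias(self, app_label):
        # None means "no opinion", so Django falls back to the default database.
        return "history_db" if app_label in HISTORY_APPS else None

    def db_for_read(self, model, **hints):
        return self._alias(model._meta.app_label)

    def db_for_write(self, model, **hints):
        return self._alias(model._meta.app_label)

    def allow_relation(self, obj1, obj2, **hints):
        # Relations cannot span databases, so both objects must be on the same side.
        in_history_1 = obj1._meta.app_label in HISTORY_APPS
        in_history_2 = obj2._meta.app_label in HISTORY_APPS
        return in_history_1 == in_history_2

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in HISTORY_APPS:
            return db == "history_db"
        return db != "history_db"


def fake_model(app_label):
    """Stand-in exposing only the attribute the router reads."""
    return SimpleNamespace(_meta=SimpleNamespace(app_label=app_label))
```

The `allow_relation` check mirrors the caveat in the text: the group only works on its own server if nothing outside it needs a foreign key into it.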
Do consider tuning, and watch out for your requests and indexes
You would need expert advice here, but I can tell from experience that even a single badly tuned request can wreak havoc... and it's quite difficult to track down. You can consider the Ask Tom website for examples of diagnosis and fine tuning.
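As a toy version of that kind of diagnosis, the sketch below asks SQLite (used here only as a convenient stand-in for MySQL or Oracle, which have their own EXPLAIN tools) how it plans to run a query before and after an index is added. The table and index names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500, float(i)) for i in range(5000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # fourth column holds the textual detail


before = plan(conn, QUERY, (42,))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, QUERY, (42,))   # the plan now names idx_orders_customer
```

Comparing `before` and `after` is the small-scale equivalent of spotting the one request that forces a full scan on every page view.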
Don't decide on your table architecture in isolation, but do consider the requests
Hierarchical requests and multiple joins can be really costly. You don't have to build a fully normalized relational schema, and you may consider some denormalization in order to better accommodate the type of requests the database will face.
Just a couple of thoughts :)
A few miscellaneous pieces of advice:
I'm surprised no one's mentioned this yet. Use memcached. If you're getting a lot of repetitive types of queries (which most webapps do), this can make a huge difference.
Consider using Oracle's failover and load balancing. It allows you to add support for multiple databases on a single db connection.
Another thing to consider is using a system similar to FriendFeed's. This solves the problem of "how do we make changes to the database without halting the world?" more than anything else.
If you find that the DB is the bottleneck of your app, and there is no way around it (such as caching), then you should scale your DB as well. Django has nothing to do with this.