MySQL 的最大功能
我如何知道一个项目对于 MySQL 来说太大了,我应该使用在可扩展性方面享有更好声誉的项目?
在性能下降之前,MySQL 是否存在最大数据库大小?与 Oracle 或 SQL Server 等商业 DBMS 相比,哪些因素导致 MySQL 不是一个可行的选择?
How do I know when a project is just to big for MySQL and I should use something with a better reputation for scalability?
Is there a max database size for MySQL before degradation of performance occurs? What factors contribute to MySQL not being a viable option compared to a commercial DBMS like Oracle or SQL Server?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
谷歌使用MySQL。你的项目比谷歌还大吗?
抛开聪明人的评论不谈,MySQL 是一个专业级的数据库应用程序。如果您的应用程序给 MySQL 带来了压力,我敢打赌它也会对任何其他数据库造成同样的影响。
Google uses MySQL. Is your project bigger than Google?
Smart-alec comments aside, MySQL is a professional level database application. If your application puts a strain on MySQL, I bet it'll do the same to just about any other database.
如果您正在寻找几个示例:
Facebook 已迁移至 Cassandra仅在它存储了超过 7 TB 的收件箱数据之后。 (来源:Lakshman、Malik:Cassandra - 去中心化结构化存储系统。)(...尽管他们在那个阶段遇到了很多问题。)
维基百科也在 MySQL 中处理数百 GB 的文本数据。
If you are looking for a couple of examples:
Facebook moved to Cassandra only after it was storing over 7 Terabytes of inbox data. (Source: Lakshman, Malik: Cassandra - A Decentralized Structured Storage System.) (... Even though they were having quite a few issues at that stage.)
Wikipedia also handles hundreds of Gigabytes of text data in MySQL.
我在一家非常大的互联网公司工作。 MySQL 可以扩展得非常非常大,并且性能非常好,但有一些注意事项。
您可能遇到的一个问题是大于 4 GB 的索引无法进入内存。我曾经花了很多时间试图通过摆弄一些索引参数来提高 MySQL 的全文性能,但是你无法解决一个根本问题:如果你的查询访问磁盘上的索引,它就会变慢。
您可能会发现一些帮助应用程序可以帮助解决您的问题。对于全文问题,有Sphinx: http://www.sphinxsearch.com/
Jeremy Zawodny现在在 Craig's List 工作,他有一个博客,偶尔会在其中讨论大型数据库的性能: http://blog .zawodny.com/
总之,您的项目对于 MySQL 来说可能不太大。对于您之前使用 MySQL 的某些方式来说,它可能太大,您可能需要对其进行调整。
I work for a very large Internet company. MySQL can scale very, very large with very good performance, with a couple of caveats.
One problem you might run into is that an index greater than 4 gigabytes can't go into memory. I spent a lot of time once trying to improve the MySQL's full-text performance by fiddling with some index parameters, but you can't get around the fundamental problem that if your query hits disk for an index, it gets slow.
You might find some helper applications that can help solve your problem. For the full-text problem, there is Sphinx: http://www.sphinxsearch.com/
Jeremy Zawodny, who now works at Craig's List, has a blog on which he occasionally discusses the performance of large databases: http://blog.zawodny.com/
In summary, your project probably isn't too big for MySQL. It may be too big for some of the ways that you've used MySQL before, and you may need to adapt them.
主要是桌子的大小。
我在这里假设您将使用 MySQL 的 Oracle innoDB 插件作为您的引擎。如果您不这样做,这可能意味着您正在使用商业引擎,例如 infiniDB、InfoBright for Tokutek,在这种情况下,您的问题应该发送给他们。
InnoDB 对于非常大的表来说有点令人讨厌。如果可能的话,建议您对非常大的实例对表进行分区。本质上,如果你的(经常使用的)索引不能全部放入内存,插入将会非常慢,因为它们需要接触很多不在内存中的页面。这是无法解决的。
如果 MySQL 5.1 分区功能可以满足您的要求,您可以使用它;如果不能,则可以在应用程序级别对表进行分区。如果您可以让表的索引适合内存,并且一次只加载一张表,那么您就获胜了。
您可以使用插件的压缩功能使您的 RAM 走得更远(因为页面在 RAM 和光盘上都被压缩),但它无法克服基本限制。
如果你的表的索引不全部(或者至少大部分 - 如果你有一些索引在 99.99% 的情况下都是 NULL,那么如果没有这些索引,你可能会逃脱)适合 ram,插入速度会很糟糕。
数据库大小不是主要问题,只要您在进行批量加载时各个表都适合 RAM(当然,一次只加载一个)。
大多数基于行的数据库确实存在这些限制。如果您需要更多,请考虑列数据库。
Infobright和Infinidb都使用基于mysql的核心,并且是基于列的引擎,可以处理非常大的表。
Tokutek 也很有趣 - 您可能需要联系他们进行评估。
当您评估引擎的适用性时,请务必在生产级硬件上加载非常大的数据。使用(例如)10G 数据库测试它是没有意义的,这不能证明任何事情。
Mostly it is table size.
I am assuming here that you will use the Oracle innoDB plugin for mysql as your engine. If you do not, that probably means you're using a commercial engine such as infiniDB, InfoBright for Tokutek, in which case your questions should be sent to them.
InnoDB gets a bit nasty with very large tables. You are advised to partition your tables if at all possible with very large instances. Essentially, if your (frequently used) indexes don't all fit into ram, inserts will be very slow as they need to touch a lot of pages not in ram. This cannot be worked around.
You can use the MySQL 5.1 partitioning feature if it does what you want, or partition your tables at the application level if it does not. If you can get your tables' indexes to fit in ram, and only load one table at a time, then you're on a winner.
You can use the plugin's compression to make your ram go a bit further (as the pages are compressed in ram as well as on disc) but it cannot beat the fundamental limtation.
If your table's indexes don't all (or at least MOSTLY - if you have a few indexes which are NULL in 99.99% of cases you might get away without those ones) fit in ram, insert speed will suck.
Database size is not a major issue, provided your tables individually fit in ram while you're doing bulk loading (and of course, you only load one at once).
These limitations really happen with most row-based databases. If you need more, consider a column database.
Infobright and Infinidb both use a mysql-based core and are column based engines which can handle very large tables.
Tokutek is quite interesting too - you may want to contact them for an evaluation.
When you evaluate the engine's suitability, be sure to load it with very large data on production-grade hardware. There's no point in testing it with a (e.g.) 10G database, that won't prove anything.
MySQL是一个商业DBMS,您只需选项来获取Oracle 或 Microsoft 提供的支持/监控。或者您可以使用社区支持或社区提供的监控软件。
MySQL is a commercial DBMS, you just have the option to get the support/monitoring that is offered by Oracle or Microsoft. Or you can use community support or community provided monitoring software.
您应该关注的不仅仅是运营规模。还关键的是:
无论你做什么,都要小心 Oracle RAC(他们的集群)——众所周知,它是“有问题的”(准确地说)。众所周知,SQL Server 更便宜,扩展性更差(没有“RAC”选项),但基本上可以工作,不会让管理员每小时都想自杀(“RAC”选项似乎可以做到这一点)。对于 Terra Server 来说,“差很多”的可扩展性仍然足够好 (http://msdn.microsoft.com/en-us/library/aa226316(SQL.70).aspx)
最近这里出现了一些问题,人们在 10gb 数据库或其他数据库上重建索引时遇到问题。
我的 2 美分就这么多了。我确信一些 MySQL 专家会介入解决那里的问题。
Things you should look at are not only size at operations. Critical are also:
Whatever you go for, be carefull with Oracle RAC (their cluster) - it is known to be "problematic" (to say it finely). SQL Server is known to be a lot cheaper, scale a lot worse (no "RAC" option) but basically work without making admins want to commit suicide every hour (the "RAC" option seems to do that). Scalability "a lot worse" still is good enough for the Terra Server (http://msdn.microsoft.com/en-us/library/aa226316(SQL.70).aspx)
THere wer some questions here recently of people having problems rebuilding indices on a 10gb database or something.
So much for my 2 cents. I am sure some MySQL specialists will jump in on issues there.