结合 MySQL、Sphinx 和 MongDB。好主意吗?

发布于 2024-11-26 06:42:37 字数 530 浏览 1 评论 0原文

对于一个新项目,我希望将 MySQL、Sphinx 和 MongoDB 结合起来。 MySQL 用于关系数据和数值搜索,Sphinx 用于自由文本搜索,MongoDB 用于地理数据。据我的(快速)基准测试显示,MongoDB 是地理查询最快的,sphinx 是自由文本搜索最快的,MySQL 是关系数据搜索最快的。因此,为了获得最佳性能,我可能必须将它们结合到我的项目中。

然而,这存在三个缺点。

  1. 三点故障,Sphinx、MySQL、MongoDB 可能崩溃 这将停止我的网站
  2. 我需要三个数据库中的数据并且需要使它们保持最新 (所有数据每天只改变一个,所以这不是最糟糕的问题)。
  3. 硬件要求(主要是 RAM)正在飞速增长 因为所有数据库都希望拥有大部分 RAM 能够执行。

所以问题是我应该将这三者结合起来,留下一个(可能是 MongoDB 并使用 Sphinx 进行地理数据),或者甚至只使用一个(MongoDB 或 MySQL)?

为了了解数据,关系数据约为 6GB,地理数据约为 4GB,自由文本数据约为 16GB。

For a new project I'm looking to combine MySQL, Sphinx and MongoDB. MySQL for the relational data and searching on numeric values, Sphinx for free text search and MongoDB for geodata. As far as my (quick) benchmarks shows MongoDB is the fastest for geo queries, sphinx for free text search and MySQL for relational data searches. So to get the best performance I might have to combine them in my project.

There are however three drawbacks to this.

  1. Three points of failure, i.e. Sphinx, MySQL, and MongoDB can crash
    which will stop my site
  2. I need data in three databases and need to keep them up to date
    (all data only changes ones per day so its not the worst problem).
  3. Hardware requirements and mainly RAM is going through the roof
    since all databases wants to have a large portion of the RAM to be
    able to perform.

So the questions is should I combine the three, leave one out (probably MongoDB and use Sphinx for geodata as well) or even go with only one (MongoDB or MySQL)?

To give an idea of the data, the relational data is aprox 6GB, the geodata about 4GB and the freetext data about 16GB.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

ぇ气 2024-12-03 06:42:37

不太明白 3 个数据库中包含的记录/集合/文档是否具有数据库间引用。 EG 如果用户名、工作、电话号码在 Mysql 中并且用户地址在 Mongo 中。我假设答案是肯定的。

恕我直言,不建议使用 3 种不同的存储解决方案,因为:

1)(最重要)您无法聚合来自 2 个数据库的数据(以可扩展的方式)。

例子:
假设您将用户数据(用户名)保存在 Mysql 中,将用户地理坐标保存在 Mongo 中。您无法查询对两个数据库上的字段进行过滤/排序。例如,您不能:

SELECT all users 
WHERE name starts with 'A'
SORT BY distance_from_center

同样适用于 Sphinx。

解决方案:您要么限制单个数据库上的可用数据,要么将数据从一个数据库复制/镜像到另一个数据库。

2)维护成本:维护3台服务器,不同的备份/冗余策略,不同的扩展策略;开发成本:开发人员必须使用3个查询库,有3种不同的查询方式等。

3)必须手动处理的不一致/同步问题(例如,您想在mongo和mysql中都插入数据;假设mongo写入数据,但 mysql 引发了引用完整性异常,因此现在数据库之间存在不一致)

4)关于硬件成本,唯一消耗 RAM 的是 MongoDB(建议是它必须在内存中拥有所有索引)。对于MySQL和Solr服务器,您可以控制内存消耗。

我会做什么:

  • 如果我不需要所有 SQL 功能(如事务、引用完整性、联接等),我会选择 Mongo

  • 如果我需要这些功能,并且我可以忍受地理操作的较低性能,我会选择 MySQL

  • ,如果我需要(我的意思是,我真的需要)全文搜索,并且 Mongo/Mysql FTS 功能还不够,我还会附加一个 FTS 服务器,例如 Sphinx、Solr、Elasticsearch, 等

Didn't quite understood if the records/collections/documents contained in the 3 dbs have inter-db references. EG if user names, jobs, telephone numbers are in Mysql and user addresses are in Mongo. I'll assume that the answer is Yes.

IMHO having 3 different storage solutions is not recommended, because:

1) (most important) You can not aggregate data from 2 DBs (in a scalable way).

Example:
Let's say that you keep user data (user names) in Mysql and user geo coordinates in Mongo. You can't query having filters/sorts on fields located on both dbs. For example, you can't:

SELECT all users 
WHERE name starts with 'A'
SORT BY distance_from_center

Same applies for Sphinx.

Solution: you either limit to data available on a single DB, or you duplicate/mirror data from one db to another.

2) Maintenance costs: 3 servers to maintain, different backup/redundance strategies, different scaling strategies; Development costs: developer must use 3 querying libraries, with 3 different ways to query, etc etc.

3) Inconsistence/Synchronization issues that must be manually dealt with(EG you want to insert data both in mongo and in mysql; let's say that mongo wrote the data, but mysql raised a referential integrity exception, so now you have an inconsistency between dbs)

4) About HW costs, the only RAM-eater is MongoDB (the recommendation is that it has to have all indexes in ram). For MySQL and Solr servers, you can control memory consumption.

What I would do:

  • If I don't need all the SQL features (like transactions, referential integrity, joins, etc) I would go with Mongo

  • If I need those features, and I can live with a lower performance on geo operations, I would go with MySQL

  • now, If I need (I mean, I really really need) full-text search, and Mongo/Mysql FTS capabilities are not enough, I would attach also a FTS server like Sphinx, Solr, Elasticsearch, etc

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文