NoSQL recommendation for a specific need
I'm using a MySQL table with the following simple structure:
ID_A: int 8
ID_B: int 8
Primary key: ID_A, ID_B
Index: ID_B
This MySQL table contains more than 500 million rows and weighs 20 GB.
I need to be able to run these kinds of queries:
select *,count(*) as cpt from table group by ID_A order by cpt DESC
select *,count(*) as cpt from table group by ID_B order by cpt DESC
select * from table where ID_A in (1,2,3,4,5,5)
select * from table where ID_B in (1,2,3,4,5,5)
select *,count(*) as cpt from table where ID_B in (1,2,3,4,5) group by ID_A order by cpt DESC
select *,count(*) as cpt from table where ID_A in (1,2,3,4,5) group by ID_B order by cpt DESC
I tried InnoDB and MyISAM, but even on a server with a large configuration, MySQL can't answer the GROUP BY queries. I can't even do it on the script side, because it would consume too much memory.
The data can't all fit in RAM (20 GB today, but 60 GB in the near future).
Should we use a NoSQL database? MongoDB? A map-reduce database?
Thanks for your help.
Comments (2)
Err, that will parse, but it's very bad programming style, and I'm not even sure what it will return.
I suspect that it won't be much faster on a NoSQL system unless you've got a lot of concurrency and can distribute the load across multiple servers (which you could equally do with MySQL). So you'd probably have to look at sharding / map-reduce to parallelize the requests (again implying multiple servers).
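To make the sharding / map-reduce idea concrete, here is a minimal single-process sketch in Python. It assumes the rows have already been split into shards (plain lists of `(ID_A, ID_B)` tuples here); the function names `map_shard`, `reduce_counts` and `group_count` are made up for illustration. In a real deployment each map step would run on the server holding that shard, and only the small per-key counters would travel over the network.

```python
from collections import Counter

def map_shard(rows, key_index):
    """Map step: count rows per key inside a single shard."""
    return Counter(row[key_index] for row in rows)

def reduce_counts(partials):
    """Reduce step: merge the per-shard counters into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total

def group_count(shards, key_index):
    """Emulate `group by <key> order by cpt DESC` across all shards."""
    # Each map_shard call is independent, so it could run in parallel.
    partials = [map_shard(shard, key_index) for shard in shards]
    merged = reduce_counts(partials)
    # Highest count first; ties broken by key to keep the order stable.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))

if __name__ == "__main__":
    # Hypothetical data: three shards of (ID_A, ID_B) rows.
    shards = [[(1, 10), (1, 11)], [(2, 10), (1, 12)], [(3, 10)]]
    print(group_count(shards, 0))  # counts per ID_A
    print(group_count(shards, 1))  # counts per ID_B
```

The point is that each shard only ever holds its own counter in memory, which is what makes the approach workable when the full data set does not fit in RAM.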
Leaving aside the strange SQL, why not just denormalise your data? Add tables for the ID_A and ID_B counts, then put a trigger on the existing table to populate the new tables.
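A rough sketch of the count-table idea, using Python's built-in `sqlite3` module so that it runs without a server. The table and trigger names (`pairs`, `count_by_a`, `count_by_b`, `pairs_after_insert`, `pairs_after_delete`) are invented for the example, and the trigger syntax is SQLite's, not MySQL's; in MySQL the upsert would typically be written with `INSERT ... ON DUPLICATE KEY UPDATE` instead.

```python
import sqlite3

SCHEMA = """
CREATE TABLE pairs (
    id_a INTEGER NOT NULL,
    id_b INTEGER NOT NULL,
    PRIMARY KEY (id_a, id_b)
);
-- Denormalised count tables, one row per distinct key.
CREATE TABLE count_by_a (id_a INTEGER PRIMARY KEY, cpt INTEGER NOT NULL);
CREATE TABLE count_by_b (id_b INTEGER PRIMARY KEY, cpt INTEGER NOT NULL);

CREATE TRIGGER pairs_after_insert AFTER INSERT ON pairs
BEGIN
    INSERT OR IGNORE INTO count_by_a (id_a, cpt) VALUES (NEW.id_a, 0);
    UPDATE count_by_a SET cpt = cpt + 1 WHERE id_a = NEW.id_a;
    INSERT OR IGNORE INTO count_by_b (id_b, cpt) VALUES (NEW.id_b, 0);
    UPDATE count_by_b SET cpt = cpt + 1 WHERE id_b = NEW.id_b;
END;

CREATE TRIGGER pairs_after_delete AFTER DELETE ON pairs
BEGIN
    UPDATE count_by_a SET cpt = cpt - 1 WHERE id_a = OLD.id_a;
    DELETE FROM count_by_a WHERE id_a = OLD.id_a AND cpt = 0;
    UPDATE count_by_b SET cpt = cpt - 1 WHERE id_b = OLD.id_b;
    DELETE FROM count_by_b WHERE id_b = OLD.id_b AND cpt = 0;
END;
"""

# Whitelist of (table, key column) pairs, so no user text reaches the SQL.
COUNT_TABLES = {"a": ("count_by_a", "id_a"), "b": ("count_by_b", "id_b")}

def build_db():
    """Create an in-memory database with the base table and its triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def top_counts(conn, which, limit=10):
    """Read the precomputed counts instead of running GROUP BY on the big table."""
    table, key = COUNT_TABLES[which]
    sql = f"SELECT {key}, cpt FROM {table} ORDER BY cpt DESC, {key} LIMIT ?"
    return conn.execute(sql, (limit,)).fetchall()

if __name__ == "__main__":
    conn = build_db()
    conn.executemany(
        "INSERT INTO pairs (id_a, id_b) VALUES (?, ?)",
        [(1, 10), (1, 11), (1, 12), (2, 10), (3, 10), (3, 11)],
    )
    print(top_counts(conn, "a"))
    print(top_counts(conn, "b"))
```

This only covers the two unfiltered GROUP BY queries; the variants with a `where ... in (...)` filter would still have to scan the matching rows. On an existing 500-million-row table, the count tables would also need a one-off backfill before the triggers take over.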
I've never used MongoDB for large data, but for more than 10,000 keys you can use MongoDB map/reduce instead of the default groupBy.
You can find the MongoDB documentation for this here:
MongoDB groupBy for larger grouping operations
Hope this helps.