NoSQL recommendation for specific needs

Posted on 2024-11-24 04:15:55

  • I'm using a MySQL table with the following simple structure:

    ID_A : int 8

    ID_B : int 8

    Primary key : ID_A, ID_B
    Index : ID_B

  • This MySQL table contains more than 500 million rows and is 20 GB in size.

  • I need to be able to run these kinds of queries:

    select *,count(*) as cpt from table group by ID_A order by cpt DESC
    
    select *,count(*) as cpt from table group by ID_B order by cpt DESC
    
    select * from table where ID_A in (1,2,3,4,5,5) 
    
    select * from table where ID_B in (1,2,3,4,5,5) 
    
    select *,count(*) as cpt from table where ID_B in (1,2,3,4,5) group by ID_A order by cpt DESC
    
    select *,count(*) as cpt from table where ID_A in (1,2,3,4,5) group by ID_B order by cpt DESC
    
  • I tried InnoDB and MyISAM, but even on a server with a large configuration, MySQL can't answer the GROUP BY queries. I can't do the grouping on the script side either, because it would consume too much memory.

The data can't all fit in RAM (20 GB today, but 60 GB in the near future).

Should we use a NoSQL database? MongoDB? A map-reduce database?

Thanks for your help.

Comments (2)

断念 2024-12-01 04:15:58

select *,count(*) as cpt from table group by ID_A order by cpt DESC

Err, that will parse, but it's very bad programming style; I'm not even sure what it will return.
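To illustrate the concern about `select *` in a grouped query: a grouped count is usually written by selecting only the grouping column and the aggregate, so every returned column is well defined. The following is a minimal sketch, not a benchmark. It uses Python's built-in `sqlite3` module only so that it runs without a server; the table name `pairs` and the helper `counts_by_column` are hypothetical, and nothing here reflects MySQL's behaviour at 500 million rows.

```python
import sqlite3

ALLOWED_COLUMNS = ("ID_A", "ID_B")  # whitelist, since column names cannot be bound as parameters


def counts_by_column(conn, column):
    """Return (value, count) pairs for one ID column, most frequent first."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    # Select only the grouping column and the aggregate, never '*'.
    sql = (
        f"SELECT {column}, COUNT(*) AS cpt FROM pairs "
        f"GROUP BY {column} ORDER BY cpt DESC, {column} ASC"
    )
    return conn.execute(sql).fetchall()


# Tiny in-memory stand-in for the table described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pairs ("
    "ID_A INTEGER NOT NULL, ID_B INTEGER NOT NULL, "
    "PRIMARY KEY (ID_A, ID_B))"
)
conn.execute("CREATE INDEX idx_pairs_b ON pairs (ID_B)")
conn.executemany(
    "INSERT INTO pairs (ID_A, ID_B) VALUES (?, ?)",
    [(1, 10), (1, 11), (1, 12), (2, 10), (2, 11), (3, 10)],
)

by_a = counts_by_column(conn, "ID_A")
by_b = counts_by_column(conn, "ID_B")
print(by_a)
print(by_b)
```

The secondary `ORDER BY` on the ID column is an addition of this sketch; it only makes the order of ties deterministic.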

I suspect that it won't be much faster on a NoSQL system unless you've got a lot of concurrency and can distribute the load across multiple servers (which you could equally do with MySQL). So you'd probably have to look at sharding / map-reduce to parallelize the requests (again implying multiple servers).
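The sharding / map-reduce idea can be sketched in a few lines: split the rows into shards, count each shard independently (the "map" step, which is the part that could run on separate servers), then merge the partial counts (the "reduce" step). This is a single-process illustration under stated assumptions; the function names `shard_rows`, `map_count`, and `reduce_counts` are hypothetical, and sharding by `ID_A` modulo the shard count is just one possible choice.

```python
from collections import Counter


def shard_rows(rows, n_shards):
    """Split (ID_A, ID_B) rows into n_shards lists, keyed on ID_A."""
    shards = [[] for _ in range(n_shards)]
    for id_a, id_b in rows:
        shards[id_a % n_shards].append((id_a, id_b))
    return shards


def map_count(shard, key_index):
    """Map step: count occurrences of one column within a single shard."""
    return Counter(row[key_index] for row in shard)


def reduce_counts(partials):
    """Reduce step: merge the per-shard counters into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


rows = [(1, 10), (1, 11), (1, 12), (2, 10), (2, 11), (3, 10)]
shards = shard_rows(rows, n_shards=2)

# Each map_count call is independent, so in a real deployment
# each one could be executed by a different server.
partials_b = [map_count(shard, key_index=1) for shard in shards]
counts_b = reduce_counts(partials_b)

# Equivalent of "group by ID_B order by cpt DESC".
ranked_b = counts_b.most_common()
print(ranked_b)
```

Only the small per-shard counters travel to the reducer, not the rows themselves, which is what keeps the merge step cheap when the number of distinct IDs is much smaller than the number of rows.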

Leaving aside the strange SQL, why not just denormalise your data? Add tables for the ID_A and ID_B counts, then put a trigger on the existing table to populate the data into the new tables.
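A minimal sketch of that denormalisation follows, assuming two counter tables kept up to date by insert and delete triggers. It uses Python's built-in `sqlite3` module only to be runnable as-is; MySQL trigger syntax differs (for example, it would typically use `INSERT ... ON DUPLICATE KEY UPDATE`), and the names `pairs`, `count_a`, `count_b`, `pairs_ai`, and `pairs_ad` are all hypothetical. Updates to the key columns are not handled here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE pairs (
    ID_A INTEGER NOT NULL,
    ID_B INTEGER NOT NULL,
    PRIMARY KEY (ID_A, ID_B)
);
-- Denormalised counters, one row per distinct ID.
CREATE TABLE count_a (ID_A INTEGER PRIMARY KEY, cpt INTEGER NOT NULL);
CREATE TABLE count_b (ID_B INTEGER PRIMARY KEY, cpt INTEGER NOT NULL);

CREATE TRIGGER pairs_ai AFTER INSERT ON pairs
BEGIN
    INSERT OR IGNORE INTO count_a (ID_A, cpt) VALUES (NEW.ID_A, 0);
    UPDATE count_a SET cpt = cpt + 1 WHERE ID_A = NEW.ID_A;
    INSERT OR IGNORE INTO count_b (ID_B, cpt) VALUES (NEW.ID_B, 0);
    UPDATE count_b SET cpt = cpt + 1 WHERE ID_B = NEW.ID_B;
END;

CREATE TRIGGER pairs_ad AFTER DELETE ON pairs
BEGIN
    UPDATE count_a SET cpt = cpt - 1 WHERE ID_A = OLD.ID_A;
    DELETE FROM count_a WHERE ID_A = OLD.ID_A AND cpt = 0;
    UPDATE count_b SET cpt = cpt - 1 WHERE ID_B = OLD.ID_B;
    DELETE FROM count_b WHERE ID_B = OLD.ID_B AND cpt = 0;
END;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.executemany(
    "INSERT INTO pairs (ID_A, ID_B) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)
# The ranking is now a read of a small table, not a GROUP BY over all rows.
top_a_after_insert = conn.execute(
    "SELECT ID_A, cpt FROM count_a ORDER BY cpt DESC, ID_A ASC"
).fetchall()

conn.execute("DELETE FROM pairs WHERE ID_A = 1 AND ID_B = 11")
top_a_after_delete = conn.execute(
    "SELECT ID_A, cpt FROM count_a ORDER BY cpt DESC, ID_A ASC"
).fetchall()
top_b_after_delete = conn.execute(
    "SELECT ID_B, cpt FROM count_b ORDER BY cpt DESC, ID_B ASC"
).fetchall()
print(top_a_after_insert, top_a_after_delete, top_b_after_delete)
```

The trade-off is extra write work on every insert and delete in exchange for cheap reads. This covers the two unfiltered rankings; the filtered variants with an `IN (...)` list would still need to touch the main table.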

酒中人 2024-12-01 04:15:57

I've never used MongoDB for large data, but for more than 10,000 keys you can use MongoDB map/reduce instead of the default groupBy.

You can find the MongoDB documentation for this here:

MongoDB groupBy for larger grouping operations

Hope this helps.
