How do I make ActiveRecord work with legacy partitioned/sharded databases and tables?

Posted on 2024-08-09 10:18:35

Thanks for your time first. After all the searching on Google, GitHub, and here, I got even more confused by the big words (partition/shard/federate), so I figure I have to describe the specific problem I ran into and ask around.

My company's databases deal with a massive number of users and orders, so we split databases and tables in various ways, some of which are described below:

way             database and table name      shard by (maybe it should be called partitioned by?)
YZ.X            db_YZ.tb_X                   last three digits of the order serial number
YYYYMMDD.       db_YYYYMMDD.tb               date
YYYYMM.DD       db_YYYYMM.tb_DD              date too

The basic concept is that databases and tables are separated according to a field (not necessarily the primary key). There are too many databases and too many tables, so writing or magically generating one database.yml config for each database and one model for each table isn't possible, or at least isn't the best solution.
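
To make the three layouts concrete, here is a minimal Ruby sketch of the key-to-name mapping. The `ShardNames` module and its method names are hypothetical. It also assumes that the last three digits of the serial number are read as X, Y, Z in that order; that reading of the first row is a guess and may need swapping.

```ruby
require "date"

# Hypothetical helper: maps a shard key to [database_name, table_name]
# for the three layouts listed in the table above.
module ShardNames
  module_function

  # Layout "YZ.X": assumes the last three digits of the serial number
  # are X, Y, Z in that order (an assumption, not confirmed by the table).
  def by_serial(serial)
    digits = serial.to_s[-3, 3]
    unless digits && digits.match?(/\A\d{3}\z/)
      raise ArgumentError, "serial needs three trailing digits: #{serial.inspect}"
    end
    ["db_#{digits[1, 2]}", "tb_#{digits[0]}"]
  end

  # Layout "YYYYMMDD.": one database per day, a single table named tb.
  def by_day(date)
    ["db_#{date.strftime('%Y%m%d')}", "tb"]
  end

  # Layout "YYYYMM.DD": one database per month, one table per day.
  def by_month_and_day(date)
    ["db_#{date.strftime('%Y%m')}", "tb_#{date.strftime('%d')}"]
  end
end

ShardNames.by_serial("20100805123")                  # => ["db_23", "tb_1"]
ShardNames.by_day(Date.new(2010, 8, 5))              # => ["db_20100805", "tb"]
ShardNames.by_month_and_day(Date.new(2010, 8, 5))    # => ["db_201008", "tb_05"]
```

Keeping the mapping in pure functions like these means it can be tested without any database connection, whatever mechanism ends up switching the connection and table name.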

I looked into drnic's magic solutions, datafabric, and even the ActiveRecord source code. Maybe I could use ERB to generate database.yml and switch the database connection in an around filter, and maybe I could use named_scope to dynamically decide the table name for find, but update/create operations are bound to "self.class.quoted_table_name", so I couldn't easily solve my problem that way. I could even generate one model per table, since there are at most 30 of them.

But this is just not DRY!

What I need is a clean solution like the following DSL:

class Order < ActiveRecord::Base
   shard_by :order_serialno do |key|
      # Some or all of the databases might share the same machine in a regular way;
      # the config could also come from a hash of regexes, or be a constant.
      [get_db_config_by(key),
       get_db_name_by(key),
       get_tb_name_by(key),
      ]
   end
end

Can anybody enlighten me? Any help would be greatly appreciated.

Comments (3)

街角卖回忆 2024-08-16 10:18:35

Case two (where only the db name changes) is pretty easy to implement with DbCharmer. You need to create your own sharding method in DbCharmer that returns a connection parameters hash based on the key.
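
As a rough sketch of what such a sharding method might look like for the date-based layout: the class below is plain Ruby and does not call DbCharmer itself. The class name `DailyShardResolver`, the `shard_for_key` method name, and the adapter, host, and username values are all assumptions for illustration; check the DbCharmer documentation for the exact interface it expects from a custom sharding method.

```ruby
require "date"

# Hypothetical resolver: turns a date key into connection parameters.
class DailyShardResolver
  # base_config holds the settings shared by every shard (assumed values).
  def initialize(base_config)
    @base_config = base_config
  end

  # Returns a connection parameters hash whose database name is db_YYYYMMDD.
  # Accepts a Date or anything Date.parse can read from its string form.
  def shard_for_key(key)
    date = key.is_a?(Date) ? key : Date.parse(key.to_s)
    @base_config.merge(:database => "db_#{date.strftime('%Y%m%d')}")
  end
end

resolver = DailyShardResolver.new({ :adapter => "mysql", :host => "localhost", :username => "app" })
config = resolver.shard_for_key("2010-08-05")
config[:database] # => "db_20100805"
```

Because the shared settings are merged with a computed database name, no per-database entry in database.yml is needed for this layout.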

The other two cases are not supported out of the box, but could easily be added to your system:

  1. Implement a sharding method that knows how to deal with the database names in your sharded database. This lets you make shard_for(key) calls on your model to switch the db connection.

  2. You add a method like this:

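    class MyModel < ActiveRecord::Base
      db_magic :sharded => { :sharded_connection => :my_sharding_method }

      # Class-level method, so it can be called as MyModel.switch_shard(key).
      # table_for_key is your own helper that maps a key to a table name.
      def self.switch_shard(key)
        set_table_name(table_for_key(key))  # switch table
        shard_for(key)                      # switch connection
      end
    end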
    
  3. Now you could use your model like this:

    MyModel.switch_shard(key).first
    MyModel.switch_shard(key).count
    

    And since switch_shard returns the result of the shard_for(key) call, you could also use it like this:

    m = MyModel.switch_shard(key) # Switch connection and get a connection proxy
    m.first                       # Call any AR methods on the proxy
    m.count 
    
小嗲 2024-08-16 10:18:35

If you want that particular DSL, or something that matches the logic behind the legacy sharding, you are going to need to dig into ActiveRecord and write a gem that gives you that capability. The existing solutions you mention were not necessarily written with your situation in mind. You may be able to bend any number of them to your will, but in the end you will probably have to write custom code to get what you are looking for.
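
As a starting point for such custom code, the `shard_by` macro from the question could be sketched in plain Ruby as below. Everything here is hypothetical: the `Shardable` module, the `shard_for` method, and the `ShardTarget` struct are invented for illustration, and the resolver block's digit layout and host are assumed values. The sketch only resolves names; wiring the result into ActiveRecord's connection and table-name handling is the hard part and is left out.

```ruby
# Hypothetical sketch: a class macro that records how to resolve a shard.
module Shardable
  ShardTarget = Struct.new(:config, :database, :table)

  # Class macro: remembers the key attribute and the resolver block.
  def shard_by(attribute, &resolver)
    raise ArgumentError, "shard_by requires a block" unless resolver
    @shard_attribute = attribute
    @shard_resolver = resolver
  end

  # Exposes the attribute name passed to shard_by.
  attr_reader :shard_attribute

  # Runs the resolver block for a key and wraps its three-element result.
  def shard_for(key)
    raise "shard_by was not declared on #{name}" unless @shard_resolver
    ShardTarget.new(*@shard_resolver.call(key))
  end
end

# A plain class stands in for the ActiveRecord model here.
class Order
  extend Shardable

  shard_by :order_serialno do |key|
    digits = key.to_s[-3, 3] # assumed layout: last three digits are X, Y, Z
    [{ :host => "localhost" }, "db_#{digits[1, 2]}", "tb_#{digits[0]}"]
  end
end

target = Order.shard_for("20100805123")
target.database # => "db_23"
target.table    # => "tb_1"
```

Storing the resolver as a block keeps the per-model declaration to a few lines, which is the DRY property the question asks for; the remaining work is making find, create, and update honor the resolved connection and table.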

他夏了夏天 2024-08-16 10:18:35

It sounds like, in this case, you should consider not using SQL.

If the data sets are that big and can be expressed as key/value pairs (with a little denormalization), you should look into CouchDB or other NoSQL solutions.
These solutions are fast, fully scalable, and REST-based, so they are easy to grow, back up, and replicate.

We have all fallen into solving every problem with the same tool (believe me, I do it too).

It would be much easier to switch to a NoSQL solution than to rewrite ActiveRecord.
