How can I turn these multiple complex queries into a single one? (ActiveRecord, Arel)

Posted on 2025-01-11 09:38:36

I'm trying to optimize a really expensive part of my code that queries some very large tables. I compute the average of a metric stored in a child model for several time steps. Right now I loop and run one query per time step, but I'm convinced this could be done in a single query, possibly at the Arel level.

That said, I'm not great with database stuff and could use some help, or any input on whether this is even possible.

For context, here is a short description of the database:

I have a Car model for which I collect past data and run simulations. The simulations generate simulated data for the future, which is stored in child models.

Here is what the database looks like:

class Car < ApplicationRecord 
  # each sim_step is corresponding 
  # to a point in time where 
  # simulations are performed
  has_many :sim_steps 
end

# t0: the point in time where the simulations start
# tf: duration of the simulations 
#    in seconds (simulations make 
#    calculations on the  
#    collected data, and generates 
#    sim_data from t0 to (t0 + 
#    tf.seconds))
class SimStep < ApplicationRecord
  has_many :sims  
end

# state: state of the simulation 
#   (done / running)
# kind: the kind of simulation 
#   (different algorithms)
class Sim < ApplicationRecord 
  # sim_datas contain the metric 
  # for aggregation 
  has_many :sim_datas
end

Here is how it's done right now:

# Computes the average of :metric for each of the past 12 hours.
# Returns an Array of 12 values, oldest hour first:
# [<avg for h-12 to h-11>, <avg for h-11 to h-10>, ... , <avg for h-1 to h-0>]
def previous_12_hours_avg
  @previous_12_hours_avg ||= (1..12).reverse_each.map do |offset|
    time = current_hour - offset.hour
    all_data_for_1_hour_since(time)&.average(:metric)
  end
end


def all_data_for_1_hour_since(time)
  # The best simulation for this hour; nil when none matches, hence the &.
  best_simulation_for_1_hour_since(time)
    &.sim_datas
    &.where(time: time..(time + 1.hour))
end

# Retrieve the best matching past simulation for a specific time frame.
# This step may look useless, but I absolutely need it.
def best_simulation_for_1_hour_since(time)
  # This is scoped: we are inside a model, and `sims` is every simulation
  # of a specific car, via a has_many :through association (through sim_steps).
  sims
    .where('sims.state': :done, 'sims.kind': 'regular')
    .order('sim_steps.t0': :desc)
    .where('sim_steps.t0 <= ?', time)
    .where("? - sim_steps.t0 < (sim_steps.tf * '1 sec'::interval)", time + 1.hour)
    .first
end
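
Read as plain Ruby, the selection rule in the query above seems to amount to: keep the simulations whose step started at or before `time` and whose window (`t0` to `t0 + tf` seconds) extends past the end of the hour, then take the one with the latest `t0`. Here is a minimal in-memory sketch of that rule, assuming `tf` is a number of seconds stored next to `t0`. `Step`, `covers_hour?`, `best_step_for` and the sample data are hypothetical names used only for illustration, not part of the real models.

```ruby
# Hypothetical in-memory stand-in for a sim joined with its sim_step.
Step = Struct.new(:id, :t0, :tf, keyword_init: true)

SECONDS_PER_HOUR = 3600

# True when the step starts at or before `time` and its window
# (t0 .. t0 + tf seconds) ends after the hour that begins at `time`.
def covers_hour?(step, time)
  step.t0 <= time && (time + SECONDS_PER_HOUR) - step.t0 < step.tf
end

# Among the steps covering the hour, the most recent one wins
# (same idea as ORDER BY t0 DESC ... first). Returns nil when none qualifies.
def best_step_for(steps, time)
  steps.select { |step| covers_hour?(step, time) }.max_by(&:t0)
end

# Made-up sample data: two 6-hour simulations and one 30-minute simulation.
STEPS = [
  Step.new(id: 1, t0: Time.utc(2025, 1, 11, 0), tf: 6 * SECONDS_PER_HOUR),
  Step.new(id: 2, t0: Time.utc(2025, 1, 11, 2), tf: 6 * SECONDS_PER_HOUR),
  Step.new(id: 3, t0: Time.utc(2025, 1, 11, 4), tf: 1800)
].freeze

best = best_step_for(STEPS, Time.utc(2025, 1, 11, 4))
puts best&.id.inspect
```

Step 3 starts latest but is too short to cover the whole hour, so the rule falls back to the most recent step that does cover it.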

This works as I want, but because the tables are really large it takes ages to execute, and I feel this should be doable in a single query.

The really tricky parts, the ones that get me lost every time I try to work out a solution, are that I need to split my result into 12 time steps based on the current time, and that for each of those time steps I need a subquery to retrieve the best simulation for that specific time and access its data.
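
One possible shape for such a single query, offered as a sketch rather than a tested solution: the `::interval` cast in the code above suggests PostgreSQL, which can produce the 12 hourly buckets with `generate_series` and pick one best simulation per bucket with `LEFT JOIN LATERAL`. Everything schema-related here is an assumption: the table names (`sims`, `sim_steps`, `sim_datas`), the foreign keys (`sim_step_id`, `car_id`, `sim_id`), `tf` living on `sim_steps`, and `state` being stored as the string `'done'` rather than as an integer enum. The module and method names are hypothetical too.

```ruby
# Hypothetical helper; table and column names below are assumptions.
module HourlyBestSimAverage
  SECONDS_PER_HOUR = 3600

  # One row per hourly bucket. For each bucket, the LATERAL subquery picks
  # the most recent finished 'regular' simulation covering the whole hour,
  # then that simulation's data points in the hour are averaged.
  SQL = <<~SQL
    SELECT buckets.bucket_start,
           AVG(sim_datas.metric) AS avg_metric
    FROM generate_series(
           CAST(:from AS timestamp),
           CAST(:to AS timestamp),
           interval '1 hour'
         ) AS buckets(bucket_start)
    LEFT JOIN LATERAL (
      SELECT sims.id
      FROM sims
      INNER JOIN sim_steps ON sim_steps.id = sims.sim_step_id
      WHERE sim_steps.car_id = :car_id
        AND sims.state = 'done'
        AND sims.kind = 'regular'
        AND sim_steps.t0 <= buckets.bucket_start
        AND (buckets.bucket_start + interval '1 hour') - sim_steps.t0
            < sim_steps.tf * interval '1 second'
      ORDER BY sim_steps.t0 DESC
      LIMIT 1
    ) AS best_sim ON TRUE
    LEFT JOIN sim_datas
      ON sim_datas.sim_id = best_sim.id
     AND sim_datas.time >= buckets.bucket_start
     AND sim_datas.time <= buckets.bucket_start + interval '1 hour'
    GROUP BY buckets.bucket_start
    ORDER BY buckets.bucket_start
  SQL

  # Named bind values: the first bucket starts `hours` before current_hour,
  # the last bucket starts one hour before it.
  def self.binds(car_id:, current_hour:, hours: 12)
    {
      car_id: car_id,
      from: current_hour - hours * SECONDS_PER_HOUR,
      to: current_hour - SECONDS_PER_HOUR
    }
  end
end
```

In a Rails app, something like `Car.find_by_sql([HourlyBestSimAverage::SQL, HourlyBestSimAverage.binds(car_id: id, current_hour: current_hour)])` should return one row per bucket, with a NULL average for hours where no simulation qualifies. Whether it is actually faster depends on indexes (for example on `sim_steps.t0` and on `sim_datas.sim_id` with `time`), so checking it with `EXPLAIN ANALYZE` on real data would be the next step.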

I realize how tricky this is, and I'm sorry in advance because I can't be more specific about the code for confidentiality reasons. I hope this is clear enough and that somebody out there has a clue about it!

Thanks!
