How can I turn these multiple complex requests into a single one? (ActiveRecord, Arel)
I'm trying to optimize some really greedy parts of my code where I perform requests on really large tables. I'm calculating the average of a metric stored in a submodel for different time steps. To do this I loop and perform one request per time step, but I'm convinced this could be done in a single request at the Arel level.
Nevertheless, I'm not very good with database stuff and could use some help, or any input on whether that's even possible.
For a bit of context, here is a short explanation of the database:
I have a car model on which I collect data from the past, and I perform simulations that generate simulated data for the future and store it in submodels.
Here is what the database looks like:
class Car < ApplicationRecord
  # each sim_step corresponds to a point in time
  # where simulations are performed
  has_many :sim_steps
end

# t0: the point in time where the simulations start
# tf: duration of the simulations in seconds (simulations make
#     calculations on the collected data and generate sim_data
#     from t0 to (t0 + tf.seconds))
class SimStep < ApplicationRecord
  has_many :sims
end

# state: state of the simulation (done / running)
# kind: the kind of simulation (different algorithms)
class Sim < ApplicationRecord
  # sim_datas contain the metric for aggregation
  has_many :sim_datas
end
So here is how it's done right now:
# Calculation of the average of :metric for each of the past 12 hours.
# Returns an Array of 12 values, one average per hour:
# [<avg for h-12 to h-11>, <avg for h-11 to h-10>, ... , <avg for h-1 to h-0>]
def previous_12_hours_avg
  @previous_12_hours_avg ||= (1..12).reverse_each.map do |offset|
    time = current_hour - offset.hours
    all_data_for_1_hour_since(time)&.average(:metric)
  end
end

def all_data_for_1_hour_since(time)
  # best simulation for this window; nil when none matches
  best_sim_for_1_hour_since(time)
    &.sim_datas
    &.where(time: time..(time + 1.hour))
end

# retrieving the best matching simulation in the past for a specific time frame
# this may look unnecessary, but I absolutely need this step
def best_sim_for_1_hour_since(time)
  # this is scoped, and we're in the context of a model
  # here **sims** corresponds to all of the simulations of a specific car
  # via a has_many through association (through sim_steps)
  sims
    .where('sims.state': :done, 'sims.kind': 'regular')
    .where('sim_steps.t0 <= ?', time)
    # tf is documented on sim_steps above, so it is referenced through that table
    .where("? - sim_steps.t0 < (sim_steps.tf * '1 sec'::interval)", time + 1.hour)
    .order('sim_steps.t0': :desc)
    .first
end
This works as I want, but as the tables are really large it takes ages to execute, and I feel this should be doable in a single request.
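For reference, the selection rule used in the code above can be restated in plain Ruby without any database. This is only a minimal sketch: `SimRow`, `HOUR`, `best_sim_for` and `window_starts` are hypothetical names, the struct flattens a sim and its sim_step into one row, and it assumes `t0` and `tf` are available together on that row.

```ruby
# Hypothetical in-memory stand-in for a joined sims / sim_steps row.
SimRow = Struct.new(:id, :t0, :tf, :state, :kind, keyword_init: true)

HOUR = 3600 # seconds

# Latest finished 'regular' simulation that starts at or before `time`
# and whose duration reaches past the end of the one-hour window.
def best_sim_for(rows, time)
  rows
    .select { |r| r.state == 'done' && r.kind == 'regular' }
    .select { |r| r.t0 <= time && (time + HOUR) - r.t0 < r.tf }
    .max_by(&:t0)
end

# Start times of the 12 past hourly windows, oldest first.
def window_starts(current_hour)
  (1..12).reverse_each.map { |offset| current_hour - offset * HOUR }
end
```

Calling `window_starts(current_hour).map { |t| best_sim_for(rows, t) }` gives one simulation (or `nil`) per window, which is the per-hour lookup the loop currently performs with one query each.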
The super tricky parts that get me lost every time I try to figure out a solution are that I need to split my result into 12 different time steps based on the current time, and that for each of those time steps I need to perform a subrequest to retrieve the best simulation for that specific time and access its data.
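One possible direction, sketched under assumptions rather than tested: the `::interval` cast in the code suggests PostgreSQL, so `generate_series` could produce the 12 hourly windows and a `LATERAL` subquery could pick the best simulation for each window. The names `HOURLY_AVG_SQL` and `previous_12_hours_avg_single_query`, the foreign keys (`car_id`, `sim_step_id`, `sim_id`), `tf` living on `sim_steps`, and `state` being stored as a string are all assumptions, and the query has not been run against the real schema.

```ruby
# Assumed schema: sim_steps.car_id, sims.sim_step_id, sim_datas.sim_id,
# timestamps stored without time zone, sims.state stored as a string.
HOURLY_AVG_SQL = <<~SQL
  SELECT hours.h AS hour_start,
         AVG(sim_datas.metric) AS avg_metric
  FROM generate_series(
         CAST(:current_hour AS timestamp) - interval '12 hours',
         CAST(:current_hour AS timestamp) - interval '1 hour',
         interval '1 hour'
       ) AS hours(h)
  -- one "best simulation" lookup per hourly window
  LEFT JOIN LATERAL (
    SELECT sims.id
    FROM sims
    JOIN sim_steps ON sim_steps.id = sims.sim_step_id
    WHERE sim_steps.car_id = :car_id
      AND sims.state = 'done'
      AND sims.kind = 'regular'
      AND sim_steps.t0 <= hours.h
      AND (hours.h + interval '1 hour') - sim_steps.t0
          < sim_steps.tf * interval '1 second'
    ORDER BY sim_steps.t0 DESC
    LIMIT 1
  ) AS best ON true
  -- windows without a matching simulation keep a NULL average
  LEFT JOIN sim_datas
         ON sim_datas.sim_id = best.id
        AND sim_datas."time" BETWEEN hours.h AND hours.h + interval '1 hour'
  GROUP BY hours.h
  ORDER BY hours.h
SQL

# Requires Rails and the Car model; only defined here, not called.
# Returns 12 values, oldest hour first, with nil where no data matched.
def previous_12_hours_avg_single_query(car_id, current_hour)
  rows = Car.find_by_sql(
    [HOURLY_AVG_SQL, { car_id: car_id, current_hour: current_hour }]
  )
  rows.map { |row| row[:avg_metric] }
end
```

The `LEFT JOIN`s are there so that an hour with no matching simulation still produces a row, mirroring the `nil` entries the loop version returns. Whether this is actually faster would depend on the indexes on `sim_steps.t0` and `sim_datas.sim_id` / `time`.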
I realize how tricky this is, and I'm sorry in advance because I can't be more specific about the code for confidentiality reasons. I hope this is clear enough and that somebody out there has a clue about it!
Thanks!