Matching time vectors of different lengths: a tricky problem

Posted on 2024-11-16 15:35:55

I have two sets of measurements from different machines. They are measured over time, at slightly different intervals - e.g. one makes a measurement every 5 mins, but the other, every 3 mins. The advantage is that the one every 5 mins is computed as an average over the whole interval, so the values should correspond roughly to one another. I would like to expand the vector with measurements every 5 minutes (Light) so that its values are roughly synchronous with the values in the vector of measurements made every 3 minutes. The gaps should then be filled with the preceding value.

Here is an example of the data every 5 minutes

Date             Light 
26/05/2011 16:00 -529.98            
26/05/2011 16:05 -276.68            
26/05/2011 16:10 -179.63            
26/05/2011 16:15 -385.57            
26/05/2011 16:20 -1273.6            
26/05/2011 16:25 -1109.7 

and the data every 3 minutes

    Date             Flux 
26/05/2011 16:01     0.64
26/05/2011 16:04    -1.96
26/05/2011 16:07    -0.51
26/05/2011 16:10    -1.34
26/05/2011 16:13    -1.28
26/05/2011 16:15    -0.22

I should also note that the vector of light measurements (every 5 mins) is shorter than the vector every 3 minutes. The goal is thus to make the vector of 5 min measurements the same length as the 3 minute vector.

I realise that this is quite a tricky problem, but any suggestions would be gratefully received.

Comments (2)

や三分注定 2024-11-23 15:35:55

If I understand correctly, this is easily accomplished with either zoo or xts. First, here's your sample data:

Lines1 <- "Date,Light
26/05/2011 16:00,-529.98
26/05/2011 16:05,-276.68
26/05/2011 16:10,-179.63
26/05/2011 16:15,-385.57
26/05/2011 16:20,-1273.6
26/05/2011 16:25,-1109.7"

Lines2 <- "Date,Flux
26/05/2011 16:01,0.64
26/05/2011 16:04,-1.96
26/05/2011 16:07,-0.51
26/05/2011 16:10,-1.34
26/05/2011 16:13,-1.28
26/05/2011 16:15,-0.22"

con <- textConnection(Lines1)
Light <- read.csv(con, stringsAsFactors=FALSE, header=TRUE)
close(con)
con <- textConnection(Lines2)
Flux <- read.csv(con, stringsAsFactors=FALSE, header=TRUE)
close(con)

Now we load the xts package, which also loads zoo. Then we convert the Light and Flux data.frame objects to xts objects.

library(xts)
light <- xts(Light$Light, as.POSIXct(Light$Date, format="%d/%m/%Y %H:%M"))
flux <- xts(Flux$Flux, as.POSIXct(Flux$Date, format="%d/%m/%Y %H:%M"))

Here's the awesome part. merge.xts and merge.zoo will align each series by index. na.locf fills in each NA with the previous value.

Data <- merge(light,flux)
#                        light  flux
# 2011-05-26 16:00:00  -529.98    NA
# 2011-05-26 16:01:00       NA  0.64
# 2011-05-26 16:04:00       NA -1.96
# 2011-05-26 16:05:00  -276.68    NA
# 2011-05-26 16:07:00       NA -0.51
# 2011-05-26 16:10:00  -179.63 -1.34
# 2011-05-26 16:13:00       NA -1.28
# 2011-05-26 16:15:00  -385.57 -0.22
# 2011-05-26 16:20:00 -1273.60    NA
# 2011-05-26 16:25:00 -1109.70    NA
Data <- na.locf(Data)

Finally, we can extract the 3 minute index from the merged Data object.

Data[index(flux),]
#                       light  flux
# 2011-05-26 16:01:00 -529.98  0.64
# 2011-05-26 16:04:00 -529.98 -1.96
# 2011-05-26 16:07:00 -276.68 -0.51
# 2011-05-26 16:10:00 -179.63 -1.34
# 2011-05-26 16:13:00 -179.63 -1.28
# 2011-05-26 16:15:00 -385.57 -0.22
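
If installing xts or zoo is not an option, the same "carry the previous Light value forward" alignment can be sketched in base R with findInterval. This is only an illustrative alternative, not part of the approach above; the variable names (light_time, light_val, flux_time, flux_val, idx, aligned) and the choice of tz = "UTC" are assumptions made for the example.

```r
# Sample data from the question, typed in as plain vectors (base R only).
# tz = "UTC" is an assumption; the question does not state a time zone.
fmt <- "%d/%m/%Y %H:%M"
light_time <- as.POSIXct(c("26/05/2011 16:00", "26/05/2011 16:05",
                           "26/05/2011 16:10", "26/05/2011 16:15",
                           "26/05/2011 16:20", "26/05/2011 16:25"),
                         format = fmt, tz = "UTC")
light_val <- c(-529.98, -276.68, -179.63, -385.57, -1273.6, -1109.7)

flux_time <- as.POSIXct(c("26/05/2011 16:01", "26/05/2011 16:04",
                          "26/05/2011 16:07", "26/05/2011 16:10",
                          "26/05/2011 16:13", "26/05/2011 16:15"),
                        format = fmt, tz = "UTC")
flux_val <- c(0.64, -1.96, -0.51, -1.34, -1.28, -0.22)

# For each Flux timestamp, find the position of the latest Light timestamp
# at or before it (light_time must be sorted in increasing order).
idx <- findInterval(as.numeric(flux_time), as.numeric(light_time))

# A result of 0 means the Flux reading precedes every Light reading;
# mark those as NA instead of indexing with 0.
idx[idx == 0] <- NA

# Light now has one value per Flux timestamp.
aligned <- data.frame(Date = flux_time, Light = light_val[idx], Flux = flux_val)
print(aligned)
```

On the sample data this selects the same Light value for each 3 minute timestamp as the merged xts result shown above.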

猥琐帝 2024-11-23 15:35:55

You can use approx, which will linearly interpolate between your datapoints. Here's a quick example:

x = sort( rnorm(20) )
y = 1:20
plot(x, y, main = 'function interpolation example' )
points(approx(x, y), col = 2, pch = 3 )

To specify how many points you want to interpolate, you can use the xout parameter, like this:

points( approx( x, y, xout = seq( from = min(x), to = max(x), by = 0.1 ) ), pch = 3, col = 3 )

For more interpolation points:

points( approx( x, y, xout = seq( from = min(x), to = max(x), by = 0.05 ) ), pch = 3, col = 4 )

For your specific example, you'd want to interpolate the x,y values of both series at the union of the timepoints from both machines. Here is one suggestion:

x_interp = unique( sort( c(seq( from = 0, to = 100, by = 5 ), seq( from = 0, to = 100, by = 3 ) ) ) )
x_interp
 [1]   0   3   5   6   9  10  12  15  18  20  21  24  25  27  30  33  35
[18]  36  39  40  42  45  48  50  51  54  55  57  60  63  65  66  69  70
[35]  72  75  78  80  81  84  85  87  90  93  95  96  99 100

Then, you can use this x_interp as an xout to interpolate between points from both machines:

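# x_light, y_light and x_flux, y_flux are placeholders for each machine's
# measurement times and values (they are not defined above)
par( mfrow = c(1,2) )
plot( x_light, y_light )
# the argument is named xout; x_out is rejected as an unused argument
points(approx(x_light, y_light, xout = x_interp), col = 2, pch = 3 )

plot( x_flux, y_flux )
points(approx(x_flux, y_flux, xout = x_interp), col = 3, pch = 3 )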

If you'd like to get a function that interpolates values for arbitrary inputs, see the related function called approxfun.
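
As a rough sketch of how approxfun could be applied to the question's data: with method = "constant" and f = 0 it returns a step function that holds the previous value, which is closer to the "fill with the preceding value" request than linear interpolation. The names light_min, flux_min, light_step and light_linear are made up for this example, and times are expressed as minutes after 16:00 purely for brevity.

```r
# Light readings from the question, with time as minutes after 16:00
light_min <- c(0, 5, 10, 15, 20, 25)
light_val <- c(-529.98, -276.68, -179.63, -385.57, -1273.6, -1109.7)

# Timestamps of the Flux readings on the same scale
flux_min <- c(1, 4, 7, 10, 13, 15)

# Step function: f = 0 returns the value at the left end of each interval,
# i.e. the most recent Light reading. rule = 2 reuses the nearest end value
# for query points outside the observed time range.
light_step <- approxfun(light_min, light_val, method = "constant", f = 0, rule = 2)

# Linear interpolation (the default method), shown for comparison
light_linear <- approxfun(light_min, light_val)

print(light_step(flux_min))    # previous-value fill at the Flux timestamps
print(light_linear(flux_min))  # linearly interpolated values
```

Which variant is more appropriate depends on the data: since each Light value is described as an average over its 5 minute interval, the step function may be the more natural reading, but that is a judgment call.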
