Is there a way to use a loop (or nested for loops) to take date information from one data frame and find the corresponding dates in another df?

Posted on 2025-01-21 09:08:13

This is a fairly general question. I have a data frame with one row per individual and a date on each row. Using a second, daily data frame, I would like to look up the values for the consecutive days starting at each individual's date. For example, if individual X was born on 01-01-2000 (1st df), I would like a function that finds 01-01-2000 in the daily data frame (2nd df), takes the mean over the first 3 days from birth (namely 01-01-2000 : 03-01-2000), and adds it to a new column of the 1st df. It does not matter which variable is averaged; it could be weight, sunlight hours, or number of calls. The question may be a bit vague, so any help interpreting it would be appreciated.

name<-c("A","B","C","D")
dob<-c("01-01-2000","02-01-2000","03-01-2000","08-01-2000")
df1<-data.frame(name,dob)

  name        dob
1    A 01-01-2000
2    B 02-01-2000
3    C 03-01-2000
4    D 08-01-2000

date<- c("31-12-1999","01-01-2000","02-01-2000","03-01-2000","04-01-2000","05-01-2000","06-01-2000","07-01-2000","08-01-2000","09-01-2000","10-01-2000","11-01-2000")
calls<-c(0,0,1,2,2,2,0,0,1,4,2,3)
df2<-data.frame(date,calls)

         date calls
1  31-12-1999     0
2  01-01-2000     0
3  02-01-2000     1
4  03-01-2000     2
5  04-01-2000     2
6  05-01-2000     2
7  06-01-2000     0
8  07-01-2000     0
9  08-01-2000     1
10 09-01-2000     4
11 10-01-2000     2
12 11-01-2000     3

What I would like is the following:

  name        dob mean.call
1    A 01-01-2000      1.00
2    B 02-01-2000      1.67
3    C 03-01-2000      2.00
4    D 08-01-2000      2.33

As the data frames are rather large, I would like to implement for loops.

星星的軌跡 2025-01-28 09:08:13

I would calculate the means first using zoo's rollmean function and then join df2 and df1:

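library(dplyr)
library(zoo)

df2 %>%
  # pad with two zero-call rows so the last two dates still get a 3-row window
  add_row(calls = rep(0, 2)) %>%
  # left-aligned mean over each row and the two rows after it; calls is dropped
  mutate(means = rollmean(calls, k = 3, align = "left", fill = NA),
         .keep = "unused") %>%
  # keep only the dates that appear as a dob in df1
  right_join(df1, by = c("date" = "dob")) %>%
  select(name, date, means)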

This returns

  name       date    means
1    A 01-01-2000 1.000000
2    B 02-01-2000 1.666667
3    C 03-01-2000 2.000000
4    D 08-01-2000 2.333333

Note: I added two dummy rows with zero calls to df2 so that a mean can be calculated for the last two entries. Since the question gives no rule for those dates, I chose to pad with zeros. Keep in mind that this pulls the means for the last two dates toward zero.
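
Since the question asks specifically about loops, here is a minimal base R sketch of the same calculation with an explicit for loop, for comparison. It assumes the window is the birth date plus the two following days, which is what the expected output in the question implies. The names window_days and in_window are made up for illustration. Unlike the zero-padded version above, a window cut short by the end of df2 is averaged over the days that actually exist, and a dob with no matching dates stays NA.

```r
# Example data, copied from the question
df1 <- data.frame(
  name = c("A", "B", "C", "D"),
  dob  = c("01-01-2000", "02-01-2000", "03-01-2000", "08-01-2000")
)
df2 <- data.frame(
  date  = c("31-12-1999", "01-01-2000", "02-01-2000", "03-01-2000",
            "04-01-2000", "05-01-2000", "06-01-2000", "07-01-2000",
            "08-01-2000", "09-01-2000", "10-01-2000", "11-01-2000"),
  calls = c(0, 0, 1, 2, 2, 2, 0, 0, 1, 4, 2, 3)
)

# Parse the dd-mm-yyyy strings so that date arithmetic works
dob  <- as.Date(df1$dob,  format = "%d-%m-%Y")
date <- as.Date(df2$date, format = "%d-%m-%Y")

# Assumption: the window is the birth date plus the next two days
window_days <- 3

df1$mean.call <- NA_real_
for (i in seq_len(nrow(df1))) {
  # Rows of df2 whose date falls inside this individual's window
  in_window <- date >= dob[i] & date < dob[i] + window_days
  if (any(in_window)) {
    df1$mean.call[i] <- mean(df2$calls[in_window])
  }
}

print(df1)
```

On the example data this gives the same four means as the expected output in the question. Because the loop compares parsed dates instead of counting rows, it does not rely on df2 being sorted or having exactly one row per day, although on very large data frames the rolling-mean and join approach is likely to be faster.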
