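Is there a way to use a loop (or nested for loops) to take date information from one data frame and find the corresponding dates in another data frame?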

Posted 2025-01-21 09:08:13

This is a fairly general question. I have a data frame of individuals, with a date on each row. Using a second, daily data frame, I would like to look up the values for the consecutive days starting from each individual's date. For example, if individual X was born on 01-01-2000 (1st df), I would like a function that finds 01-01-2000 in the daily data frame (2nd df), computes the mean over the first 3 days from birth (namely 01-01-2000 through 03-01-2000), and adds it as a new column to the 1st df. It does not matter what is being averaged; it could be weight, sunlight hours, or number of calls. The question may be a bit vague, so any help interpreting it would be appreciated.

name<-c("A","B","C","D")
dob<-c("01-01-2000","02-01-2000","03-01-2000","08-01-2000")
df1<-data.frame(name,dob)

  name        dob
1    A 01-01-2000
2    B 02-01-2000
3    C 03-01-2000
4    D 08-01-2000

date<- c("31-12-1999","01-01-2000","02-01-2000","03-01-2000","04-01-2000","05-01-2000","06-01-2000","07-01-2000","08-01-2000","09-01-2000","10-01-2000","11-01-2000")
calls<-c(0,0,1,2,2,2,0,0,1,4,2,3)
df2<-data.frame(date,calls)

         date calls
1  31-12-1999     0
2  01-01-2000     0
3  02-01-2000     1
4  03-01-2000     2
5  04-01-2000     2
6  05-01-2000     2
7  06-01-2000     0
8  07-01-2000     0
9  08-01-2000     1
10 09-01-2000     4
11 10-01-2000     2
12 11-01-2000     3

What I would like is the following;

  name        dob mean.call
1    A 01-01-2000      1.00
2    B 02-01-2000      1.67
3    C 03-01-2000      2.00
4    D 08-01-2000      2.33

As the data frames are rather large, I would like to implement this with for loops.

星星的軌跡 2025-01-28 09:08:13

I would first calculate the rolling means with zoo's rollmean function and then join df2 to df1:


library(dplyr)
library(zoo)

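df2 %>%
  # pad with two dummy rows (calls = 0, date = NA) so the last two dates get a full window
  add_row(calls = rep(0, 2)) %>%
  # forward-looking 3-day mean: each row averages itself and the next two rows
  mutate(means = rollmean(calls, k = 3, align = "left", fill = NA),
         .keep = "unused") %>%
  # keep only the dates that appear as a dob in df1
  right_join(df1, by = c("date" = "dob")) %>%
  select(name, date, means)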

This returns

  name       date    means
1    A 01-01-2000 1.000000
2    B 02-01-2000 1.666667
3    C 03-01-2000 2.000000
4    D 08-01-2000 2.333333

Note: I added two dummy rows to df2 so that a mean can be calculated for the last two entries. Since there is no specific rule for those values, I chose zeros. Keep this in mind.
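Since the question asks specifically for a for loop, here is a minimal base R sketch of that variant. It is an illustration under stated assumptions, not the method used above: the strings are assumed to be day-month-year, the window is assumed to be the dob plus the next two calendar days, and window_days is a hypothetical parameter name. Unlike the rolling-mean approach, it matches on actual dates, so it does not rely on df2 being sorted or free of gaps, and it leaves NA where the window is incomplete instead of padding with zeros.

```r
# Sample data from the question
df1 <- data.frame(name = c("A", "B", "C", "D"),
                  dob  = c("01-01-2000", "02-01-2000", "03-01-2000", "08-01-2000"))
df2 <- data.frame(date  = c("31-12-1999", "01-01-2000", "02-01-2000", "03-01-2000",
                            "04-01-2000", "05-01-2000", "06-01-2000", "07-01-2000",
                            "08-01-2000", "09-01-2000", "10-01-2000", "11-01-2000"),
                  calls = c(0, 0, 1, 2, 2, 2, 0, 0, 1, 4, 2, 3))

# Parse the strings into Date objects (assumed format: day-month-year)
dob_dates   <- as.Date(df1$dob,  format = "%d-%m-%Y")
daily_dates <- as.Date(df2$date, format = "%d-%m-%Y")

window_days <- 3  # hypothetical parameter: number of days, starting on the dob

df1$mean.call <- NA_real_
for (i in seq_len(nrow(df1))) {
  # The calendar days that make up this individual's window
  wanted <- dob_dates[i] + 0:(window_days - 1)
  in_window <- daily_dates %in% wanted
  # Average only complete windows; incomplete ones stay NA
  if (sum(in_window) == window_days) {
    df1$mean.call[i] <- mean(df2$calls[in_window])
  }
}

df1
```

On the sample data, round(df1$mean.call, 2) gives 1.00, 1.67, 2.00 and 2.33, matching the output requested in the question. For large data frames the loop scans df2 once per row of df1, so the rolling-mean-and-join approach above will likely scale better.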
