Group drugs together when dates overlap
This is a bit tricky to explain so please bear with me and ask questions if I am not making sense.
Here is my data
    mydata <- data.frame(
      id = c(1, 1, 1, 1, 1, 1, 1, 1),
      drug = c("let", "per", "pac", "tra", "chem", "tem", "cap", "nem"),
      type = c("type1", "type2", "type1", "type1", "type1", "type2", "type1", "type2"),
      startdate = c("2016-05-12", "2016-05-30", "2016-05-31", "2016-05-31", "2018-01-18", "2018-04-01", "2020-11-05", "2020-11-04"),
      enddate = c("2016-05-12", "2018-04-05", "2017-11-08", "2018-04-05", "2018-01-18", "2020-11-06", "2021-08-18", "2021-08-11")
    )
My goal is to group the drugs whose dates overlap with each other. However, even when two drugs have overlapping dates, if the drug type switches to type2 I want that to trigger a new row with its own start and end dates.
I was able to group the overlapping dates using the following code:
    library(dplyr)

    mydata <- mydata %>%
      arrange(id, startdate, drug) %>%
      group_by(id) %>%
      # start a new index whenever the next start date is past every end date seen so far
      mutate(indx = c(0, cumsum(as.numeric(lead(startdate)) >
                                  cummax(as.numeric(enddate)))[-n()])) %>%
      group_by(id, indx) %>%
      mutate(drugs = paste0(drug, collapse = ", ")) %>%
      summarise(startDate = min(startdate), endDate = max(enddate), drugs = drugs) %>%
      distinct()
But as you can see, after drug "let" all the other rows get grouped together. Instead, I want new rows for "tem" and "nem", as they are type2 drugs.
This is the output I am hoping to get:
    mydata1 <- data.frame(
      id = c(1, 1, 1, 1),
      drugs = c("let", "per,pac,tra,chem", "tem", "cap, nem"),
      startdate = c("2016-05-12", "2016-05-30", "2018-04-01", "2020-11-04"),
      enddate = c("2016-05-12", "2018-01-18", "2020-11-06", "2021-08-11")
    )
Any help is appreciated!
Comments (1)
I split the dataframe into separate dataframes, one for each drug type, and then used your existing code on each. Then I put the two new dataframes back together into one dataframe.
I was also getting NA when the date strings were converted to numbers, so I parsed the dates using lubridate first.
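The answer does not include its code, so here is a minimal sketch of what the steps above might look like, assuming dplyr and lubridate are installed. The helper name `merge_overlaps` is hypothetical and not from the original post; the overlap logic inside it is the question's own `cumsum`/`cummax` index, with the paste moved into `summarise`.

```r
library(dplyr)
library(lubridate)

mydata <- data.frame(
  id = c(1, 1, 1, 1, 1, 1, 1, 1),
  drug = c("let", "per", "pac", "tra", "chem", "tem", "cap", "nem"),
  type = c("type1", "type2", "type1", "type1", "type1", "type2", "type1", "type2"),
  startdate = c("2016-05-12", "2016-05-30", "2016-05-31", "2016-05-31",
                "2018-01-18", "2018-04-01", "2020-11-05", "2020-11-04"),
  enddate = c("2016-05-12", "2018-04-05", "2017-11-08", "2018-04-05",
              "2018-01-18", "2020-11-06", "2021-08-18", "2021-08-11")
)

# Parse the character columns as Date so that as.numeric() returns day counts
# instead of NA.
mydata <- mydata %>%
  mutate(startdate = ymd(startdate), enddate = ymd(enddate))

# Hypothetical helper: the question's overlap grouping, applied to one subset.
merge_overlaps <- function(df) {
  df %>%
    arrange(id, startdate, drug) %>%
    group_by(id) %>%
    # New index whenever the next start date is later than every end date so far.
    mutate(indx = c(0, cumsum(as.numeric(lead(startdate)) >
                                cummax(as.numeric(enddate)))[-n()])) %>%
    group_by(id, indx) %>%
    summarise(drugs = paste0(drug, collapse = ", "),
              startDate = min(startdate),
              endDate = max(enddate),
              .groups = "drop")
}

# Split by drug type, merge overlaps within each type, then recombine.
result <- mydata %>%
  split(.$type) %>%
  lapply(merge_overlaps) %>%
  bind_rows(.id = "type") %>%
  arrange(id, startDate)

print(result)
```

Note that on the sample data this keeps type1 and type2 drugs in separate rows, but the three type2 drugs overlap one another and so collapse into a single row. That is not identical to the desired output shown in the question, so treat this as a starting point rather than a complete solution.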