Finding values within a date interval in a second data.table
I am a beginner in R and I am looking for help with a function/loop.
I have this data.table, by_newborn:
id_mom | date | id_newborn | week | weight | conception_date | pregnancy_interval | one_year_before_pregnany | one_year_before_interval | first_trimester |
---|---|---|---|---|---|---|---|---|---|
1.21e+12 | 01/05/2020 | 1234 | 18 | 2 | 2019-12-27 | 2019-12-27 UTC--2020-05-01 UTC | 2018-12-26 | 2018-12-26 UTC--2019-12-27 UTC | 2020-04-02 |
1.21e+12 | 01/05/2020 | 5489 | 18 | 2 | 2019-12-27 | 2019-12-27 UTC--2020-05-01 UTC | 2018-12-26 | 2018-12-26 UTC--2019-12-27 UTC | 2020-04-02 |
Structure of by_newborn:
structure(list(ן..ID = c(2602035392, 2602035392, 4104232942),
date_of_birth = structure(c(1L, 1L, 2L), .Label = c("01/05/2020",
"02/05/2018", "03/05/2020", "04/05/2020", "05/05/2020", "06/05/2020",
"07/05/2020", "08/05/2020"), class = "factor"), week = c(38L,
38L, 36L), conception_date = structure(c(18117, 18117, 17401
), class = "Date"), pregnancy_interval = new("Interval",
.Data = c(22982400, 22982400, 21772800), start = structure(c(1565308800,
1565308800, 1503446400, 1503446400, 1563062400, 1563062400,
1564358400, 1564358400, 1563840000, 1563840000, 1563926400,
1564617600, 1567728000), tzone = "UTC", class = c("POSIXct",
"POSIXt")), tzone = "UTC")), row.names = c(NA,
-3L), class = c("data.table", "data.frame"))
I created the intervals using data.table and lubridate:
library(data.table)
library(lubridate)
# Conception date = birth date minus gestational age in weeks
by_newborn[, conception_date := dmy(date) - weeks(week)]
# Interval spanning conception to birth
by_newborn[, pregnancy_interval := interval(conception_date, dmy(date))]
I have a second table, TSH_results, with the test result history for each id_mom:
id_mom | date | tsh_level | Units |
---|---|---|---|
1.21e+12 | 01/02/2020 | 0.5 | ng/dl |
1.21e+12 | 05/02/2020 | 0.5 | ng/dl |
1.21e+12 | 03/05/2015 | 1.8 | ng/dl |
1.21e+12 | 09/05/2015 | 1.8 | ng/dl |
Structure of TSH_results:
structure(list(ן..id_mom = c(1.21e+12, 1.21e+12, 1.21e+12, 1.21e+12,
1.21e+12), date = c("01/02/2020", "01/02/2020", "01/02/2020",
"01/02/2020", "01/02/2020"), TSH_level = c("0.5", "0.5", "0.5",
"0.5", "0.5"), measur = c("ng/dl", "ng/dl", "ng/dl", "ng/dl",
"ng/dl")), row.names = c(NA, -5L), class = c("data.table", "data.frame"
))
I would like help writing code that, for each id_mom, looks in TSH_results for a result whose date falls within an interval (or between two dates) and returns the TSH level in a new column of by_newborn.
I have tried this, but it seems I might need a loop or another way:
by_newborn[id_mom == TSH_results$id_mom & (dmy(TSH_results$date) %within%
pregnancy_interval), preg_results := TSH_results$result]
Many thanks in advance!
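
One possible direction, offered as a minimal sketch rather than a definitive answer: a data.table non-equi join can match each newborn row to the tests of the same mother whose date lies between conception and birth, without an explicit loop. The toy tables below are rebuilt from the printed examples in the question; the names birth_date, test_date, matched and latest are hypothetical helpers introduced here, tsh_level is assumed to be numeric, and keeping only the latest test inside the window is an assumption, since several tests can fall within one pregnancy.

```r
library(data.table)

# Toy data modelled on the tables printed in the question (illustrative values)
by_newborn <- data.table(
  id_mom     = c(1.21e12, 1.21e12),
  id_newborn = c(1234, 5489),
  date       = c("01/05/2020", "01/05/2020"),
  week       = c(18L, 18L)
)
TSH_results <- data.table(
  id_mom    = rep(1.21e12, 4),
  date      = c("01/02/2020", "05/02/2020", "03/05/2015", "09/05/2015"),
  tsh_level = c(0.5, 0.5, 1.8, 1.8)
)

# Parse dd/mm/yyyy strings; conception = birth date minus gestational weeks
by_newborn[, birth_date := as.IDate(date, format = "%d/%m/%Y")]
by_newborn[, conception_date := birth_date - 7L * week]
TSH_results[, test_date := as.IDate(date, format = "%d/%m/%Y")]

# Non-equi join: same mother, and test date inside [conception, birth]
matched <- TSH_results[by_newborn,
  on = .(id_mom, test_date >= conception_date, test_date <= birth_date),
  .(id_newborn = i.id_newborn, test_date = x.test_date, tsh_level = x.tsh_level),
  nomatch = NULL]

# Assumption: keep the latest test in the window for each newborn
latest <- matched[order(test_date), .(tsh_level = tsh_level[.N]), by = id_newborn]

# Update join: write the result into a new column of by_newborn
by_newborn[latest, on = "id_newborn", preg_results := i.tsh_level]
```

Newborns with no test inside the window get NA in preg_results. The same pattern should carry over to the other windows in the table (for example the year before pregnancy) by swapping in that window's start and end date columns.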