使用AgeAdjust.Direct计算每个子组的原油和AJUST率

发布于 2025-01-22 17:25:24 字数 4420 浏览 1 评论 0原文

我正在尝试计算每年疾病的发生率和每个年龄类别的发生率。我也想应用直接标准化。我使用函数ageadjust.direct（软件包epitools）。

   age_cat persondays_individual contactfirst_cat ESPpop personyears year 
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01 2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00 2012 
10 41-50                  102182               11  14214      280.   2012 
11 51-60                  209241               29  13534      573.   2012 
12 61-70                  224701               33  11403      616.   2012 
13 71-80                  104898               26  77484      287.   2012 
14 81-90                   23771                9   3945       65.1  2012 

df<-structure(list(age_cat = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("<40", ">90", "41-50", 
"51-60", "61-70", "71-80", "81-90"), class = "factor"), persondays_individual = c(38624, 
367, 111208, 222777, 219567, 102593, 20056, 32673, 366, 102182, 
209241, 224701, 104898, 23771), contactfirst_cat = c(3, 2, 10, 
29, 41, 26, 10, 6, 0, 11, 29, 33, 26, 9), ESPpop = c(26938, 691, 
14214, 13534, 11403, 77484, 3945, 26938, 691, 14214, 13534, 11403, 
77484, 3945), personyears = c(105.819178082192, 1.00547945205479, 
304.679452054795, 610.34794520548, 601.553424657534, 281.076712328767, 
54.9479452054794, 89.5150684931507, 1.0027397260274, 279.950684931507, 
573.26301369863, 615.619178082192, 287.391780821918, 65.1260273972603
), year = c("2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2012", "2012", "2012", "2012", "2012", "2012", "2012")), row.names = c(NA, 
14L), class = "data.frame")

我想计算原油发生率（$ Contactfirst_cat / persyears），调整后的发病率和95％CI每年和年龄类别。

所需的输出

   age_cat persondays_individual contactfirst_cat ESPpop personyears year    crude.rate  adj. rate   lci  uci
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01   2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00   2012 
10 41-50                  102182               11  14214      280.   2012

我已经尝试了以下代码（我从计算原油和年龄标准化的速率，速率差异和子组与顺式的比率）

df%>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  ungroup()

但是，此代码并不能给我提供原油/调整率和每个子组的CI （Year and Age_cat），但我只能观察到这4列。

我如何每年和age_cat为粗糙，adj。rate，adj。任何帮助将不胜感激！非常感谢！

编辑当我在Quinten的答案中运行脚本时，我会得到以下结果：

    crude.rate    adj.rate     lci          uci
1   0.06070314     0.07801597  0.06298263   0.09764104

输出Quinten显示为DF_2正是我想要的。我不确定我在做什么错。

原文

I am trying to calculate the incidence of a disease per year and per age-category. I also want to apply direct standardization. Im using the function ageadjust.direct (package epitools).

   age_cat persondays_individual contactfirst_cat ESPpop personyears year 
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01 2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00 2012 
10 41-50                  102182               11  14214      280.   2012 
11 51-60                  209241               29  13534      573.   2012 
12 61-70                  224701               33  11403      616.   2012 
13 71-80                  104898               26  77484      287.   2012 
14 81-90                   23771                9   3945       65.1  2012 

df<-structure(list(age_cat = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("<40", ">90", "41-50", 
"51-60", "61-70", "71-80", "81-90"), class = "factor"), persondays_individual = c(38624, 
367, 111208, 222777, 219567, 102593, 20056, 32673, 366, 102182, 
209241, 224701, 104898, 23771), contactfirst_cat = c(3, 2, 10, 
29, 41, 26, 10, 6, 0, 11, 29, 33, 26, 9), ESPpop = c(26938, 691, 
14214, 13534, 11403, 77484, 3945, 26938, 691, 14214, 13534, 11403, 
77484, 3945), personyears = c(105.819178082192, 1.00547945205479, 
304.679452054795, 610.34794520548, 601.553424657534, 281.076712328767, 
54.9479452054794, 89.5150684931507, 1.0027397260274, 279.950684931507, 
573.26301369863, 615.619178082192, 287.391780821918, 65.1260273972603
), year = c("2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2012", "2012", "2012", "2012", "2012", "2012", "2012")), row.names = c(NA, 
14L), class = "data.frame")

I would like to calculate the crude incidence ($contactfirst_cat / personyears), the adjusted incidence and the 95% CI per year and age category.

Desired output

   age_cat persondays_individual contactfirst_cat ESPpop personyears year    crude.rate  adj. rate   lci  uci
 1 <40                     38624                3  26938      106.   2011 
 2 >90                       367                2    691      1.01   2011 
 3 41-50                  111208               10  14214      305.   2011 
 4 51-60                  222777               29  13534      610.   2011 
 5 61-70                  219567               41  11403      602.   2011 
 6 71-80                  102593               26  77484      281.   2011 
 7 81-90                   20056               10   3945       54.9  2011 
 8 <40                     32673                6  26938       89.5  2012 
 9 >90                       366                0    691      1.00   2012 
10 41-50                  102182               11  14214      280.   2012

I've tried the following code (which i got from Calculating crude and age-standardized rates, rate differences, and rate ratios with CIs by subgroups)

df%>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  ungroup()

However, this code doesnt give me the crude/adj rates and CI's per subgroup (year and age_cat), but i get only 1 observation of these 4 columns.

How can i create new columns for crude.rate, adj.rate, lci and uci PER year and age_cat? Any help would be very appreciated!
Many thanks in advance!

EDIT
When I run the script in Quinten's answer i get the following result:

    crude.rate    adj.rate     lci          uci
1   0.06070314     0.07801597  0.06298263   0.09764104

The output Quinten shows as df_2 is exactly what i want however. Im not sure what im doing wrong.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

南街九尾狐 2025-01-29 17:25:24

首先，您需要unnest您的代码而不是UNGROUP喜欢：

library(tidyverse)
library(epitools)
df_2 <- df %>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  unnest(cols = c(age_adjust))

outputd df_2：

# A tibble: 14 × 6
# Groups:   year [2]
   year  age_cat crude.rate adj.rate       lci    uci
   <chr> <fct>        <dbl>    <dbl>     <dbl>  <dbl>
 1 2011  <40         0.0284   0.0284   0.00585 0.0829
 2 2011  >90         1.99     1.99     0.241   7.19  
 3 2011  41-50       0.0328   0.0328   0.0157  0.0604
 4 2011  51-60       0.0475   0.0475   0.0318  0.0682
 5 2011  61-70       0.0682   0.0682   0.0489  0.0925
 6 2011  71-80       0.0925   0.0925   0.0604  0.136 
 7 2011  81-90       0.182    0.182    0.0873  0.335 
 8 2012  <40         0.0670   0.0670   0.0246  0.146 
 9 2012  >90         0        0      NaN       3.68  
10 2012  41-50       0.0393   0.0393   0.0196  0.0703
11 2012  51-60       0.0506   0.0506   0.0339  0.0727
12 2012  61-70       0.0536   0.0536   0.0369  0.0753
13 2012  71-80       0.0905   0.0905   0.0591  0.133 
14 2012  81-90       0.138    0.138    0.0632  0.262

在您可以绑定两个数据框后，以获取所需的数据框架您想要这样：

df_desired <- cbind(df, df_2[,c(3:6)]) 
df_desired

输出：

   age_cat persondays_individual contactfirst_cat ESPpop personyears year crude.rate   adj.rate         lci        uci
1      <40                 38624                3  26938  105.819178 2011 0.02835025 0.02835025 0.005846503 0.08285146
2      >90                   367                2    691    1.005479 2011 1.98910082 1.98910082 0.240889337 7.18531607
3    41-50                111208               10  14214  304.679452 2011 0.03282138 0.03282138 0.015739127 0.06035969
4    51-60                222777               29  13534  610.347945 2011 0.04751388 0.04751388 0.031820792 0.06823786
5    61-70                219567               41  11403  601.553425 2011 0.06815687 0.06815687 0.048910548 0.09246249
6    71-80                102593               26  77484  281.076712 2011 0.09250144 0.09250144 0.060425010 0.13553604
7    81-90                 20056               10   3945   54.947945 2011 0.18199043 0.18199043 0.087271484 0.33468687
8      <40                 32673                6  26938   89.515068 2012 0.06702782 0.06702782 0.024598029 0.14589135
9      >90                   366                0    691    1.002740 2012 0.00000000 0.00000000         NaN 3.67880055
10   41-50                102182               11  14214  279.950685 2012 0.03929263 0.03929263 0.019614742 0.07030538
11   51-60                209241               29  13534  573.263014 2012 0.05058760 0.05058760 0.033879310 0.07265223
12   61-70                224701               33  11403  615.619178 2012 0.05360457 0.05360457 0.036898918 0.07528074
13   71-80                104898               26  77484  287.391781 2012 0.09046884 0.09046884 0.059097248 0.13255781
14   81-90                 23771                9   3945   65.126027 2012 0.13819360 0.13819360 0.063190912 0.26233449

First you need to unnest your code instead of ungroup like this:

library(tidyverse)
library(epitools)
df_2 <- df %>%
  group_by(year, age_cat) %>%
  summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat,   #count of events
                                               pop = personyears,          #person years of DFpop
                                               rate = NULL,                
                                               stdpop = ESPpop,            #standard population (European standard population per age_cat)
                                               conf.level = 0.95))) %>%
  mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
  unnest(cols = c(age_adjust))

Outputd df_2:

# A tibble: 14 × 6
# Groups:   year [2]
   year  age_cat crude.rate adj.rate       lci    uci
   <chr> <fct>        <dbl>    <dbl>     <dbl>  <dbl>
 1 2011  <40         0.0284   0.0284   0.00585 0.0829
 2 2011  >90         1.99     1.99     0.241   7.19  
 3 2011  41-50       0.0328   0.0328   0.0157  0.0604
 4 2011  51-60       0.0475   0.0475   0.0318  0.0682
 5 2011  61-70       0.0682   0.0682   0.0489  0.0925
 6 2011  71-80       0.0925   0.0925   0.0604  0.136 
 7 2011  81-90       0.182    0.182    0.0873  0.335 
 8 2012  <40         0.0670   0.0670   0.0246  0.146 
 9 2012  >90         0        0      NaN       3.68  
10 2012  41-50       0.0393   0.0393   0.0196  0.0703
11 2012  51-60       0.0506   0.0506   0.0339  0.0727
12 2012  61-70       0.0536   0.0536   0.0369  0.0753
13 2012  71-80       0.0905   0.0905   0.0591  0.133 
14 2012  81-90       0.138    0.138    0.0632  0.262

After you can bind your two dataframes to get the desired dataframe you want like this:

df_desired <- cbind(df, df_2[,c(3:6)]) 
df_desired

Output:

   age_cat persondays_individual contactfirst_cat ESPpop personyears year crude.rate   adj.rate         lci        uci
1      <40                 38624                3  26938  105.819178 2011 0.02835025 0.02835025 0.005846503 0.08285146
2      >90                   367                2    691    1.005479 2011 1.98910082 1.98910082 0.240889337 7.18531607
3    41-50                111208               10  14214  304.679452 2011 0.03282138 0.03282138 0.015739127 0.06035969
4    51-60                222777               29  13534  610.347945 2011 0.04751388 0.04751388 0.031820792 0.06823786
5    61-70                219567               41  11403  601.553425 2011 0.06815687 0.06815687 0.048910548 0.09246249
6    71-80                102593               26  77484  281.076712 2011 0.09250144 0.09250144 0.060425010 0.13553604
7    81-90                 20056               10   3945   54.947945 2011 0.18199043 0.18199043 0.087271484 0.33468687
8      <40                 32673                6  26938   89.515068 2012 0.06702782 0.06702782 0.024598029 0.14589135
9      >90                   366                0    691    1.002740 2012 0.00000000 0.00000000         NaN 3.67880055
10   41-50                102182               11  14214  279.950685 2012 0.03929263 0.03929263 0.019614742 0.07030538
11   51-60                209241               29  13534  573.263014 2012 0.05058760 0.05058760 0.033879310 0.07265223
12   61-70                224701               33  11403  615.619178 2012 0.05360457 0.05360457 0.036898918 0.07528074
13   71-80                104898               26  77484  287.391781 2012 0.09046884 0.09046884 0.059097248 0.13255781
14   81-90                 23771                9   3945   65.126027 2012 0.13819360 0.13819360 0.063190912 0.26233449

回复收藏 0 原文

~没有更多了~