survdiff with frequency weights (package survival) (obtained after coarsened exact matching, package MatchIt)
I am doing a counterfactual impact evaluation on survival data. More precisely, I am trying to evaluate the impact of vocational training on time spent in unemployment. I use the Kaplan-Meier estimator of the survival curve (package survival).
Before running Kaplan-Meier, I use coarsened exact matching (the estimand is the ATT) to make the control and treatment groups similar in terms of pretreatment covariates (package MatchIt).
For the Kaplan-Meier estimator, I have to use the weights from the matching, which works well with the weights option and the robust standard errors of survfit:
library(survival)
library(survminer)
kp_cem <- survfit(Surv(time = time_cem, event = status_cem) ~ treatment_cem, data = data_impact_cem, robust = TRUE, weights = weights)
However, when I try to use a log-rank test for the difference in survival curves between the treatment and control groups, I cannot take the frequency weights from the matching into account, so the test statistics are not correct.
log_rank <- survdiff(Surv(time = time_cem, event = status_cem) ~ treatment_cem, data = data_impact_cem, rho = 0)
I tried the option "pval = TRUE" of ggsurvplot (package survminer), but the problem is the same: the frequency weights are not taken into account.
How can I include frequency weights in survdiff? Are there other packages that compute the log-rank test while taking into account frequency weights (obtained after matching)?
Comments (2)
There are at least two ways to do this.
First, you can use the survey::svylogrank function, as @IRTFM suggests. This will treat the weights as sampling weights, but I think that's ok with the robust standard errors that svylogrank uses.
Second, you can use survival::coxph. The logrank test is the score test in a Cox model, and coxph takes frequency weights. Use robust=TRUE if you want a robust score test: it will be at the bottom of the output of summary(your_cox_model) and you can extract it as summary(your_cox_model)$robscore
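The claim that the logrank test is the score test in a Cox model can be checked on a small example. The sketch below is illustrative only: the data frame sim and its columns are simulated and hypothetical, not the asker's data, and the event times are continuous so that there are no ties (with tied event times the two statistics can differ slightly, depending on how coxph handles ties).

```r
library(survival)

set.seed(1)
n <- 200
# Hypothetical simulated data: continuous times, hence no tied event times
sim <- data.frame(
  treatment = rbinom(n, 1, 0.5),
  time      = rexp(n, rate = 0.1),
  status    = rbinom(n, 1, 0.8)
)

# Unweighted log-rank test
lr <- survdiff(Surv(time, status) ~ treatment, data = sim, rho = 0)

# Score test of the corresponding Cox model
cox <- coxph(Surv(time, status) ~ treatment, data = sim)

lr_chisq    <- lr$chisq
score_chisq <- summary(cox)$sctest[["test"]]

# The two chi-square statistics should agree up to numerical precision
c(logrank = lr_chisq, cox_score = score_chisq)
```

Since coxph accepts a weights argument and survdiff does not, this equivalence is what makes the weighted Cox score test a natural stand-in for a weighted log-rank test.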
Thank you very much @Thomas Lumley and @IRTFM for your answers.
Here is how I applied your two suggestions (I added some comments and references).
1. Using survey::svylogrank
I don't feel very comfortable using sampling weights when what I really have are frequency weights.
How should I specify the survey design? The weights come from coarsened exact matching (matchit with method = "cem"), which is a form of stratum matching.
Should I specify both the strata and the weights in the survey design? The MatchIt vignette "Estimating Effects After Matching" suggests using only the weights and robust standard errors in the survival analysis, not the strata (p. 27).
Here is how I specify the design and obtain the log-rank test with the package survey, taking into account the weights from matching:
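A minimal sketch of such a design, on simulated data because the original data are not available. The data frame sim, its column names, and the weights are all hypothetical stand-ins; in practice the weights would be the ones returned by the matching step. The design uses weights only (ids = ~1, no strata), following the vignette suggestion mentioned above, which is an assumption rather than a settled answer to the strata question.

```r
library(survival)
library(survey)

set.seed(2)
n <- 300
# Hypothetical simulated data standing in for the matched sample
sim <- data.frame(
  treatment = rbinom(n, 1, 0.4),
  time      = rexp(n, rate = 0.1),
  status    = rbinom(n, 1, 0.8)
)
# Made-up ATT-style weights: 1 for treated units, varying for controls
sim$weights <- ifelse(sim$treatment == 1, 1, runif(n, 0.5, 2))

# Survey design with weights only: no clusters (ids = ~1) and no strata
des <- svydesign(ids = ~1, weights = ~weights, data = sim)

# Design-based log-rank test (rho = 0 and the default gamma = 0)
lr_svy <- svylogrank(Surv(time, status) ~ factor(treatment), design = des, rho = 0)
lr_svy
```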
2. Using survival::coxph
Thank you for this piece of information. Being quite new to survival analysis, I had overlooked this nice property: the score test from the Cox model is equivalent to the log-rank test. For people wanting more information on this subject, I found this book very instructive: Moore, D. (2016). Applied survival analysis using R. New York, NY: Springer (p. 58).
I find this second option more attractive than the first one involving survey. Here is how I apply it:
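A minimal sketch of this approach, again on simulated data with hypothetical column names and made-up weights rather than the original matched sample. It fits a weighted Cox model with robust = TRUE and extracts the robust score test with summary(...)$robscore, as described in the answer.

```r
library(survival)

set.seed(3)
n <- 300
# Hypothetical simulated data standing in for the matched sample
sim <- data.frame(
  treatment = rbinom(n, 1, 0.4),
  time      = rexp(n, rate = 0.1),
  status    = rbinom(n, 1, 0.8)
)
# Made-up ATT-style weights: 1 for treated units, varying for controls
sim$weights <- ifelse(sim$treatment == 1, 1, runif(n, 0.5, 2))

# Weighted Cox model with a robust (sandwich) variance
cox_w <- coxph(Surv(time, status) ~ treatment, data = sim,
               weights = weights, robust = TRUE)

# Robust score test: the weighted counterpart of the log-rank test
rob <- summary(cox_w)$robscore
rob
```

The same statistic is printed at the bottom of summary(cox_w), next to the likelihood ratio and Wald tests.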
The two test statistics are quite close in the end.
I still wonder, though, why the weights option does not exist in survdiff.