Add frequency and percentage for missing values in gtsummary
df_nhpi %>%
  select(AGE, SEX, MAR_STAT, HEIGHT, WEIGHT, BMI, HTN, HTNMED, MI, Smoking, COPD, CANCER, DIABETES) %>%
  tbl_summary(
    by = SEX,
    label = list(
      MAR_STAT ~ 'Marital Status',
      HTN ~ 'Hypertension',
      HTNMED ~ 'Hypertension Medication',
      MI ~ 'Heart Attack',
      Smoking ~ 'Smoking Status',
      COPD ~ 'Chronic Obstructive Pulmonary Disease'
    ),
    type = list(c("HTN", "HTNMED", "MI", "COPD", "CANCER") ~ "categorical"),
    missing = "ifany",
    missing_text = "Unknown",
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    percent = "column"
  ) %>%
  add_stat_label() %>%
  add_p(
    test = all_continuous() ~ "t.test",
    pvalue_fun = function(x) style_pvalue(x, digits = 3)
  ) %>%
  bold_p() %>%
  modify_caption("**Table 1. Baseline Characteristics**") %>%
  bold_labels()
I'm trying to generate a Table 1. I want the percentage of missing values reported in each column (specifically for the categorical variables), but I don't want the missing values to be included when the p-values are calculated. I'd like to do this in a single chunk of code. Is there any way to do this, or should I fall back on the conventional method?
I've been searching the internet for the past three days, but I haven't found anything that works in my case.
PS: Using mutate and forcats doesn't work, as it skews my p-values.
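To illustrate the concern in the PS, here is a minimal base R sketch on made-up data (the vectors `sex` and `htn` below are hypothetical, not taken from `df_nhpi`). It computes the percentage missing within each group, then compares a chi-squared test on the observed values only with the same test after the missing values are treated as a category of their own, which is roughly what recoding NA into an explicit factor level does.

```r
# Hypothetical toy data: 50 observations per group, with some values missing
sex <- factor(rep(c("F", "M"), each = 50))
htn <- c(rep("Yes", 20), rep("No", 25), rep(NA, 5),
         rep("Yes", 28), rep("No", 10), rep(NA, 12))

# Percentage of missing values within each column (group) of the table
pct_missing <- tapply(is.na(htn), sex, mean) * 100

# Test on observed values only: table() drops NA by default
p_observed <- chisq.test(table(htn, sex))$p.value

# Same test with NA counted as a third category
p_with_na_level <- chisq.test(table(htn, sex, useNA = "ifany"))$p.value

print(pct_missing)
print(c(observed_only = p_observed, na_as_level = p_with_na_level))
```

The two p-values differ because the second test has an extra row in its contingency table, so the missingness pattern itself contributes to the statistic.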
Comments (1)
I prepared two solutions that both report the proportion of missing data. Hopefully one of them works for you!
Created on 2022-03-22 by the reprex package (v2.0.1)
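The reprex code itself is not reproduced on this page, so the following is only a sketch of one possible approach, not the answerer's original code. It assumes gtsummary 2.0.0 or later, where `tbl_summary()` has a `missing_stat` argument, and it uses the package's built-in `trial` dataset in place of `df_nhpi`. The "Unknown" row then shows the count and percentage of missing values in each column, while `add_p()` runs its tests on the observed values only.

```r
library(gtsummary)

# Assumes gtsummary >= 2.0.0 (for the `missing_stat` argument).
# `trial` ships with gtsummary and stands in for the question's data.
tbl <-
  trial %>%
  tbl_summary(
    by = trt,
    include = c(age, grade, response),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    missing = "ifany",
    missing_text = "Unknown",
    # Count and percentage of missing values, reported per column
    missing_stat = "{N_miss} ({p_miss}%)"
  ) %>%
  # p-values are computed from the underlying data with NA dropped;
  # the "Unknown" row is display-only
  add_p(test = all_continuous() ~ "t.test") %>%
  bold_labels()

tbl
```

Because the missing values stay as `NA` in the data rather than being recoded into a factor level, no extra category enters the tests.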