Removing values from a CrossTable in R

Posted on 2025-01-16 03:43:50

I'm just getting started in R and I'm trying to wrap my head around Chi square for a university assignment.

Specifically, I am using the General Social Survey 2018 dataset (for codebook: https://www.thearda.com/Archive/Files/Codebooks/GSS2018_CB.asp), and I am trying to figure out if religion has any effect on the way people seek out help for mental health.

I want to use reliten (self-assessment of religiousness - from strong to no religion) as the independent variable, and mentloth (which asks whether a person with mental health issues should reach out to a mental health professional - yes or no) as the dependent variable. Alongside the chi-square test, I also want to add CrossTable(GSS18$reliten, GSS18$mentloth), but I'm not sure how to take out the "Not applicable", "Don't know" and "No answer" values coded as 0, 8 and 9. Does anyone have any tips?

Below is a short preview of my data, in case it helps.

structure(list(reliten = structure(c(1, 1, 4, 1, 1, 2, 1, 1, 
4, 2, 2, 3, 2, 2, 4, 1, 4, 3, 2, 1, 2, 1, 2, 2, 1), label = "Would you call yourself a strong [religious preference] or a not very strong [re", format.stata = "%8.0g", labels = c(`Not applicable` = 0, 
Strong = 1, `Not very strong` = 2, `Somewhat strong` = 3, `No religion` = 4, 
`Don't know` = 8, `No answer` = 9), class = c("haven_labelled", 
"vctrs_vctr", "double")), mentloth = structure(c(0, 1, 0, 1, 
2, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0
), label = "Should [NAME] go to a therapist, or counselor, like a psychologist, social worke", format.stata = "%8.0g", labels = c(`Not applicable` = 0, 
Yes = 1, No = 2, `Don't know` = 8, `No answer` = 9), class = c("haven_labelled", 
"vctrs_vctr", "double"))), row.names = c(NA, -25L), class = c("tbl_df", 
"tbl", "data.frame"))

Any help would be much appreciated!

梦里南柯 2025-01-23 03:43:50

The CrossTable function is from the gmodels package, which doesn't know how to handle objects of class haven_labelled, so it treats them as plain numeric vectors.

To get a nicer output, you can convert them into base R factors for CrossTable to retain the names. Fortunately, the haven package contains the function as_factor for doing exactly that.

Once you have done that, it is easy to drop the factor levels you don't want, as shown below:

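library(gmodels)
library(haven)

# Drop rows where mentloth is "Not applicable" (0), "Don't know" (8) or "No answer" (9)
df <- GSS18[!GSS18$mentloth %in% c(0, 8, 9), ]

# Convert the haven_labelled columns to factors so the value labels are kept
df$reliten <- as_factor(df$reliten)
df$mentloth <- as_factor(df$mentloth)

# Keep only the substantive reliten levels; any other category becomes NA
df$reliten <- factor(as.character(df$reliten), 
                     levels = c("No religion", "Somewhat strong", 
                                "Not very strong", "Strong"))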
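
As a side note, the re-leveling step can be illustrated with a minimal base R sketch. The vector `x` and the names `keep`, `x_clean` and `counts` are made up for illustration and are not part of GSS18. The sketch shows that values not listed in `levels` turn into NA, and that `table()` drops NA by default.

```r
# Hypothetical factor with one unwanted category, standing in for df$reliten
x <- factor(c("Strong", "No answer", "Strong", "No religion"))

# Re-create the factor with only the levels to keep;
# any value not listed in `levels` becomes NA
keep <- c("No religion", "Strong")
x_clean <- factor(as.character(x), levels = keep)

levels(x_clean)      # only the kept levels remain
sum(is.na(x_clean))  # number of entries that were turned into NA

# table() drops NA by default, so the unwanted category no longer appears
counts <- table(x_clean)
print(counts)
```

This is presumably why reliten does not need its own row filter here: rows with other reliten categories end up as NA and are left out of the cross-tabulation, although filtering them explicitly would make the intent clearer.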

Now you can run:

CrossTable(df$reliten, df$mentloth)

   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

 
Total Observations in Table:  12 

 
                | df$mentloth 
     df$reliten |       Yes |        No | Row Total | 
----------------|-----------|-----------|-----------|
    No religion |         1 |         0 |         1 | 
                |     0.008 |     0.083 |           | 
                |     1.000 |     0.000 |     0.083 | 
                |     0.091 |     0.000 |           | 
                |     0.083 |     0.000 |           | 
----------------|-----------|-----------|-----------|
Somewhat strong |         1 |         0 |         1 | 
                |     0.008 |     0.083 |           | 
                |     1.000 |     0.000 |     0.083 | 
                |     0.091 |     0.000 |           | 
                |     0.083 |     0.000 |           | 
----------------|-----------|-----------|-----------|
Not very strong |         3 |         0 |         3 | 
                |     0.023 |     0.250 |           | 
                |     1.000 |     0.000 |     0.250 | 
                |     0.273 |     0.000 |           | 
                |     0.250 |     0.000 |           | 
----------------|-----------|-----------|-----------|
         Strong |         6 |         1 |         7 | 
                |     0.027 |     0.298 |           | 
                |     0.857 |     0.143 |     0.583 | 
                |     0.545 |     1.000 |           | 
                |     0.500 |     0.083 |           | 
----------------|-----------|-----------|-----------|
   Column Total |        11 |         1 |        12 | 
                |     0.917 |     0.083 |           | 
----------------|-----------|-----------|-----------|
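
Since the question also mentions a chi-square test, here is a rough base R sketch of how the same cleaned table could be passed to `chisq.test()`. It assumes the 25-row preview from the question, rebuilt as plain numeric vectors rather than read from GSS18, and uses `factor()` with explicit `levels` and `labels` instead of `as_factor()`. The names `valid`, `rel`, `men`, `tab` and `res` are illustrative only.

```r
# Rebuild the 25-row preview as plain numeric codes (a stand-in for GSS18)
reliten  <- c(1, 1, 4, 1, 1, 2, 1, 1, 4, 2, 2, 3, 2, 2, 4, 1, 4, 3, 2, 1,
              2, 1, 2, 2, 1)
mentloth <- c(0, 1, 0, 1, 2, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1,
              0, 1, 0, 0, 0)

# Keep only substantive answers: drop codes 0, 8 and 9 on both variables
valid <- !(reliten %in% c(0, 8, 9)) & !(mentloth %in% c(0, 8, 9))

# Attach the value labels by hand, using the codes from the question's preview
rel <- factor(reliten[valid], levels = c(4, 3, 2, 1),
              labels = c("No religion", "Somewhat strong",
                         "Not very strong", "Strong"))
men <- factor(mentloth[valid], levels = c(1, 2), labels = c("Yes", "No"))

# Cross-tabulate and run Pearson's chi-square test.
# With counts this small R warns that the approximation may be incorrect,
# so the warning is suppressed here only to keep the sketch quiet.
tab <- table(rel, men)
print(tab)

res <- suppressWarnings(chisq.test(tab))
print(res)
```

On the 12 usable rows of the preview the expected counts are far too small for the chi-square approximation to be trusted, so this only demonstrates the mechanics; the full dataset should give a more meaningful result.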
