R: Iterating Fisher's exact test over multiple rows of a large data frame to get row-by-row output

Posted on 2025-01-12 17:32:37

I have a large dataset with multiple categorical values that have different integer values (counts) in two different groups.

As an example

Element <- c("zinc", "calcium", "magnesium", "sodium", "carbon", "nitrogen")
no_A <- c(45, 143, 10, 35, 70, 40)
no_B <- c(10, 11, 1, 4, 40, 30)
elements_df <- data.frame(Element, no_A, no_B)
Element     no_A   no_B
Zinc          45     10
Calcium      143     11
Magnesium     10      1
Sodium        35      4
Carbon        70     40
Nitrogen      40     30

Previously I’ve just been using the code below and changing x manually to get the output values:

library(dplyr)  # provides %>% and filter()

x = "calcium"

# Counts for element x and for all other elements, in group A and group B
n1 = (elements_df %>% filter(Element == x))$no_A
n2 = sum(elements_df$no_A) - n1
n3 = (elements_df %>% filter(Element == x))$no_B
n4 = sum(elements_df$no_B) - n3

fisher.test(matrix(c(n1, n2, n3, n4), nrow = 2, ncol = 2, byrow = TRUE))

But I have a very large dataset with 4000 rows and I’d like the most efficient way to iterate through all of them and see which have significant p values.

I imagined I’d need a for loop and function, although I’ve looked through a few previous similar questions (none that I felt I could use) and it seems using apply might be the way to go.

So, in short, can anyone help me with writing code that iterates over x in each row and prints out the corresponding p values and odds ratio for each element?

Comments (1)

江心雾 2025-01-19 17:32:37

You could get them all in a nice data frame like this:

# Run one Fisher test per row, bind the per-row results, then reset the row names
`row.names<-`(do.call(rbind, lapply(seq(nrow(elements_df)), function(i) {
  # 2x2 table: element i versus all other elements, in groups A and B
  f <- fisher.test(matrix(c(elements_df$no_A[i], sum(elements_df$no_A[-i]),
                            elements_df$no_B[i], sum(elements_df$no_B[-i])), nrow = 2))
  data.frame(Element = elements_df$Element[i],
             "odds ratio" = f$estimate, "p value" = scales::pvalue(f$p.value),
             "Lower CI" = f$conf.int[1], "Upper CI" = f$conf.int[2],
             check.names = FALSE)
})), NULL)

#>     Element odds ratio p value  Lower CI    Upper CI
#> 1      zinc  1.2978966   0.601 0.6122734   3.0112485
#> 2   calcium  5.5065701  <0.001 2.7976646  11.8679909
#> 3 magnesium  2.8479528   0.469 0.3961312 125.0342574
#> 4    sodium  2.6090482   0.070 0.8983185  10.3719176
#> 5    carbon  0.3599468  <0.001 0.2158107   0.6016808
#> 6  nitrogen  0.2914476  <0.001 0.1634988   0.5218564
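
Since the question also asks which rows are significant across roughly 4000 tests, here is a minimal sketch of one possible variation on the code above. It keeps the p-values numeric (rather than formatting them with scales::pvalue) so they can be filtered, and it computes the column totals once. The Benjamini-Hochberg adjustment via p.adjust and the 0.05 cutoff are assumptions added here for illustration; they are not part of the original answer, and the names `stats`, `results` and `significant` are hypothetical.

```r
# Example data from the question
elements_df <- data.frame(
  Element = c("zinc", "calcium", "magnesium", "sodium", "carbon", "nitrogen"),
  no_A    = c(45, 143, 10, 35, 70, 40),
  no_B    = c(10, 11, 1, 4, 40, 30)
)

# Column totals are computed once, outside the per-row function
total_A <- sum(elements_df$no_A)
total_B <- sum(elements_df$no_B)

# One Fisher test per row; vapply returns a 2 x n numeric matrix
# whose row names come from the named FUN.VALUE template
stats <- vapply(seq_len(nrow(elements_df)), function(i) {
  a <- elements_df$no_A[i]
  b <- elements_df$no_B[i]
  f <- fisher.test(matrix(c(a, total_A - a, b, total_B - b), nrow = 2))
  c(odds_ratio = unname(f$estimate), p_value = f$p.value)
}, c(odds_ratio = 0, p_value = 0))

results <- data.frame(
  Element    = elements_df$Element,
  odds_ratio = stats["odds_ratio", ],
  p_value    = stats["p_value", ]
)

# Assumption: adjust for multiple testing with Benjamini-Hochberg
results$p_adjusted <- p.adjust(results$p_value, method = "BH")

# Assumption: 0.05 as the significance cutoff on the adjusted p-values
significant <- results[results$p_adjusted < 0.05, ]
print(significant)
```

With thousands of rows, some correction for multiple comparisons is usually worth considering before reading individual p-values; whether Benjamini-Hochberg or a stricter method such as Bonferroni is appropriate depends on the analysis.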