R tidyverse - correlate by group, comparing multiple columns against a single column and returning a single data frame
I have a dataset which contains count data where the structure looks like:
| Sample | ID | Expected | Observed_A | Observed_B |
|---|---|---|---|---|
| A | id1 | 10 | 8 | 10 |
| A | id2 | 6 | 8 | 4 |
| B | id1 | 15 | 12 | 18 |
| B | id2 | 1 | 2 | 4 |
What I'm trying to get to with tidyr/dplyr is the per-sample correlation between each of the observed counts and the expected counts (i.e. I'm unfussed by the correlation between each of the observed columns).
| Sample | Dataset | Correlation |
|---|---|---|
| A | Observed_A | 0.99 |
| A | Observed_B | 0.93 |
| B | Observed_A | 0.89 |
| B | Observed_B | 0.91 |
I can do this by looping, but was wondering whether there is a 'clearer' approach to take using tidyverse functions?
Any help much appreciated!!
Comments (2)
How about this:
Created on 2022-03-04 by the reprex package (v2.0.1)
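The code for this answer is not shown here, so what follows is only a minimal sketch of one way to get the requested table with tidyr/dplyr, not necessarily what the answerer posted. It assumes the observed columns all share the `Observed_` prefix and rebuilds the question's four rows as a tibble named `counts` (the names `counts` and `result` are made up for illustration). The idea is to pivot the observed columns into long form, then compute one correlation against `Expected` per Sample/Dataset group.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question's table
counts <- tibble::tribble(
  ~Sample, ~ID,   ~Expected, ~Observed_A, ~Observed_B,
  "A",     "id1", 10,        8,           10,
  "A",     "id2", 6,         8,           4,
  "B",     "id1", 15,        12,          18,
  "B",     "id2", 1,         2,           4
)

result <- counts %>%
  # One row per Sample/ID/Dataset, with the observed count in a single column
  pivot_longer(
    cols = starts_with("Observed_"),
    names_to = "Dataset",
    values_to = "Observed"
  ) %>%
  # One correlation per sample and per observed dataset
  group_by(Sample, Dataset) %>%
  summarise(Correlation = cor(Expected, Observed), .groups = "drop")

print(result)
```

Note that with only two IDs per sample, as in the question's toy table, each correlation can only be 1, -1 or `NA` (Sample A / Observed_A is constant, so `cor()` returns `NA` with a warning). Real data with more IDs per sample would give values like those in the desired output. An equivalent route, if you prefer to stay wide first, is `summarise(across(starts_with("Observed_"), ~ cor(.x, Expected)))` after `group_by(Sample)`, followed by `pivot_longer()`.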