How to subset a column from each dataframe when calculating the correlation coefficient (R) in R?

Posted on 2025-01-24 16:25:58

I have two dataframes Vobs and Vest. See the example below:

dput(head(Vobs,20))
structure(list(ID = c("LAM_1", "LAM_2", "LAM_3", "LAM_4", "LAM_5", 
"LAM_6", "LAM_7", "AUR_1", "AUR_2", "AUR_3", "AUR_4", "AUR_5", 
"AUR_6"), SOS = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 
26), EOS = c(3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27)), row.names = c(NA, 
-13L), class = c("tbl_df", "tbl", "data.frame"))

dput(head(Vest,30))
structure(list(ID = c("LAM", "LAM", "LAM", "LAM", "LAM", "AUR", 
"AUR", "AUR", "AUR", "AUR", "AUR", "P0", "P01", "P01", "P02", 
"P1", "P2", "P3", "P4", "P13", "P14", "P15", "P17", "P18", "P19", 
"P20", "P22", "P23", "P24"), EVI_SOS = c(2, 6, 10, 14, NA, 20, 
24, 28, 32, 36, NA, 42, 42, NA, 48, 48, 52, 56, 60, 64, 68, NA, 
NA, 72, NA, 78, 82, 86, 90), EVI_EOS = c(3, 7, 11, 15, NA, 21, 
25, 29, 33, 37, NA, 43, 43, NA, 49, 49, 53, 57, 61, 65, 69, NA, 
NA, 73, NA, 79, 83, 87, 91), NDVI_SOS = c(4, 8, 12, 16, 18, 22, 
26, 30, 34, 38, 40, 44, 44, 46, 50, 50, 54, 58, 62, 66, 70, NA, 
NA, 74, 76, 80, 84, 88, 92), NDVI_EOS = c(5, 9, 13, 17, 19, 23, 
27, 31, 35, 39, 41, 45, 45, 47, 51, 51, 55, 59, 63, 67, 71, NA, 
NA, 75, 77, 81, 85, 89, 93)), row.names = c(NA, -29L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to compute the correlation coefficient (R) between columns of the two dataframes. As an example, I want the R between the SOS column of Vobs and the EVI_SOS column of Vest for the LAM ID (which exists in both dataframes).
In other words, I want to subset the data for the ID of interest. In this example, I'm interested in the LAM ID for Vest, and in LAM_3 to LAM_7 (that is, LAM_3, LAM_4, LAM_5, LAM_6, LAM_7) for Vobs.

I have been using this code:
cor(Vobs$SOS, Vest$EVI_SOS, use = "complete.obs")
but it does not subset by ID in either of the two dataframes. How can I add the subsetting to this code?

Any help will be much appreciated.

眼眸 2025-01-31 16:25:58

In your specific case, to subset a character variable with a sequential numerical suffix, try using sprintf() to build the IDs by appending the numbers, and then subset as follows:

sprintf("LAM_%s",3:7)
[1] "LAM_3" "LAM_4" "LAM_5" "LAM_6" "LAM_7"

So:

Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"]

# SOS
# <dbl>
# 1     6
# 2     8
# 3    10
# 4    12
# 5    14
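As a minimal sketch of what that subset is doing: sprintf() is vectorised, so it builds all the wanted IDs at once, and %in% turns them into a logical mask over the ID column. The names ids, wanted and keep below are illustrative only and do not come from the question.

```r
# Illustrative ID vector, shortened from the Vobs$ID column in the question
ids <- c("LAM_1", "LAM_2", "LAM_3", "LAM_4", "LAM_5", "LAM_6", "LAM_7",
         "AUR_1", "AUR_2")

# sprintf() is vectorised over 3:7, giving one ID per number
wanted <- sprintf("LAM_%d", 3:7)

# %in% returns one TRUE/FALSE per element of ids
keep <- ids %in% wanted

print(keep)
print(ids[keep])
```

The same mask can be used as the row index of a dataframe, which is what Vobs$ID %in% sprintf(...) does in the line above.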

Since the Vest dataset uses the plain ID LAM for these observations, that subset is simpler. Try

cor(Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"], 
    Vest[Vest$ID %in% "LAM","EVI_SOS"], use = "complete.obs")
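A self-contained sketch of the same calculation follows. It assumes plain data.frame objects holding only a few rows copied from the question (the originals are tibbles) and pulls each column out as a numeric vector with $. The names obs_ids, x, y and r are illustrative. Note that cor() pairs values by position, not by ID, so both subsets must have the same length and be in the intended order.

```r
# Reduced copies of the question's data (base data frames, not tibbles)
Vobs <- data.frame(
  ID  = c("LAM_1", "LAM_2", "LAM_3", "LAM_4", "LAM_5", "LAM_6", "LAM_7"),
  SOS = c(2, 4, 6, 8, 10, 12, 14)
)
Vest <- data.frame(
  ID      = c("LAM", "LAM", "LAM", "LAM", "LAM", "AUR"),
  EVI_SOS = c(2, 6, 10, 14, NA, 20)
)

# Subset each column to the IDs of interest, as plain numeric vectors
obs_ids <- sprintf("LAM_%s", 3:7)
x <- Vobs$SOS[Vobs$ID %in% obs_ids]
y <- Vest$EVI_SOS[Vest$ID == "LAM"]

# cor() matches x[i] with y[i]; it needs equal lengths
stopifnot(length(x) == length(y))

# "complete.obs" drops the pair where EVI_SOS is NA
r <- cor(x, y, use = "complete.obs")
print(r)
```

With vectors, cor() returns a single number; with one-column tibbles, as in the call above, it returns a 1 x 1 matrix holding the same value.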
