How do I subset a column of each dataframe when computing the correlation coefficient (r) in R?

Posted on 2025-01-24 16:25:58

I have two dataframes Vobs and Vest. See the example below:

dput(head(Vobs,20))
structure(list(ID = c("LAM_1", "LAM_2", "LAM_3", "LAM_4", "LAM_5", 
"LAM_6", "LAM_7", "AUR_1", "AUR_2", "AUR_3", "AUR_4", "AUR_5", 
"AUR_6"), SOS = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 
26), EOS = c(3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27)), row.names = c(NA, 
-13L), class = c("tbl_df", "tbl", "data.frame"))
dput(head(Vest,30))
structure(list(ID = c("LAM", "LAM", "LAM", "LAM", "LAM", "AUR", 
"AUR", "AUR", "AUR", "AUR", "AUR", "P0", "P01", "P01", "P02", 
"P1", "P2", "P3", "P4", "P13", "P14", "P15", "P17", "P18", "P19", 
"P20", "P22", "P23", "P24"), EVI_SOS = c(2, 6, 10, 14, NA, 20, 
24, 28, 32, 36, NA, 42, 42, NA, 48, 48, 52, 56, 60, 64, 68, NA, 
NA, 72, NA, 78, 82, 86, 90), EVI_EOS = c(3, 7, 11, 15, NA, 21, 
25, 29, 33, 37, NA, 43, 43, NA, 49, 49, 53, 57, 61, 65, 69, NA, 
NA, 73, NA, 79, 83, 87, 91), NDVI_SOS = c(4, 8, 12, 16, 18, 22, 
26, 30, 34, 38, 40, 44, 44, 46, 50, 50, 54, 58, 62, 66, 70, NA, 
NA, 74, 76, 80, 84, 88, 92), NDVI_EOS = c(5, 9, 13, 17, 19, 23, 
27, 31, 35, 39, 41, 45, 45, 47, 51, 51, 55, 59, 63, 67, 71, NA, 
NA, 75, 77, 81, 85, 89, 93)), row.names = c(NA, -29L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to compute the correlation coefficient (r) between the two dataframes. As an example, I intend to compute r between the SOS column of Vobs and the EVI_SOS column of Vest for the LAM ID (which exists in both dataframes).
In other words, I want to subset the data for the ID of interest. In this example, I'm interested in the ID LAM for Vest and in LAM_3 to LAM_7 (that is, LAM_3, LAM_4, LAM_5, LAM_6, LAM_7) for Vobs.

I have been using this code:
cor(Vobs$SOS, Vest$EVI_SOS, use = "complete.obs")
but it does not restrict either column to the IDs of interest. How can I add the ID subsetting for both dataframes to this code?

Any help will be much appreciated.

Comments (1)

眼眸 2025-01-31 16:25:58

In your specific case, to subset a character variable with a sequential numerical suffix, try using sprintf() to build the IDs from the numbers, and then subset as follows:

sprintf("LAM_%s",3:7)
[1] "LAM_3" "LAM_4" "LAM_5" "LAM_6" "LAM_7"

So:

Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"]

# SOS
# <dbl>
# 1     6
# 2     8
# 3    10
# 4    12
# 5    14
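As an aside, the same ID vector could also be built with paste0(). The short sketch below only checks that the two constructions agree; the variable names ids_sprintf and ids_paste0 are made up for illustration, and %d is used here instead of %s because the suffixes are integers.

```r
# Two equivalent ways to build the IDs LAM_3 ... LAM_7
ids_sprintf <- sprintf("LAM_%d", 3:7)   # format string with an integer placeholder
ids_paste0  <- paste0("LAM_", 3:7)      # plain concatenation, vectorised over 3:7

# Both give the same character vector, so either can be used with %in%
identical(ids_sprintf, ids_paste0)
```

Either vector can go on the right-hand side of %in% in the subset above.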

Since the Vest dataset uses the plain ID LAM for these observations, subsetting it is simpler. Try

cor(Vobs[Vobs$ID %in% sprintf("LAM_%s",3:7),"SOS"], 
    Vest[Vest$ID %in% "LAM","EVI_SOS"], use = "complete.obs")
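
The call above can also be written with plain numeric vectors, which makes it easier to check that both subsets have the same length before correlating them. The following is a minimal sketch, not the only way to do it: it rebuilds only the LAM and AUR rows of the question's data as base data frames (the question uses tibbles), and the names obs_ids, x, y and r are chosen here for illustration. It assumes the rows of the two subsets correspond by position, which is what cor() relies on.

```r
# Cut-down copy of the question's data as base data frames
Vobs <- data.frame(
  ID  = c(paste0("LAM_", 1:7), paste0("AUR_", 1:6)),
  SOS = seq(2, 26, by = 2)
)
Vest <- data.frame(
  ID      = c(rep("LAM", 5), rep("AUR", 6)),
  EVI_SOS = c(2, 6, 10, 14, NA, 20, 24, 28, 32, 36, NA)
)

# IDs of interest in Vobs
obs_ids <- sprintf("LAM_%d", 3:7)

# Extract the columns first, then subset, so x and y are numeric vectors
x <- Vobs$SOS[Vobs$ID %in% obs_ids]
y <- Vest$EVI_SOS[Vest$ID == "LAM"]

# cor() pairs values by position, so both subsets must have the same length
stopifnot(length(x) == length(y))

# Pairs containing NA are dropped by use = "complete.obs"
r <- cor(x, y, use = "complete.obs")
r
```

With this toy data the four complete pairs lie on a straight line, so r comes out at (numerically) 1. If the two subsets had different lengths, the stopifnot() check would fail before cor() is reached, which is a hint that the rows cannot simply be matched by position.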