Sum row values under a column in a data frame in R using row index values from a matrix - reproducible example
I have a data frame (df1) that contains a column of values. I also have a matrix (ma1) with two columns that refer to row indices in df1 (i.e., row 1, column 1 of ma1 is 4 and column 2 is 6; these refer to rows 4 and 6 of df1).
I need to use the row indices in ma1 to sum the values of the rows in df1 for each window. So I need the output to be the sum of rows 4 to 6, rows 9 to 12, rows 15 to 19, etc., corresponding to the index values in ma1.
I have read about rowSums but am unsure how to fully implement it in this example.
Example: row 1 of ma1 is 4 and 6, so I need the output to be the sum of rows 4, 5, and 6 of df1, i.e., 0.89 + 5.89 + 0.17. Row 2 of ma1 is 9 and 12, so I need the output to be the sum of rows 9, 10, 11, and 12 of df1, and so on.
df1 <- structure(list(CO2 = c(3.37, 0, 1.19, 0.889999999999986, 5.88999999999999,
0.169999999999959, 3.92000000000002, 1.46000000000004, 1.23000000000002,
2.60000000000002, 1.39999999999998, 0, 4.35999999999996, 0.649999999999977,
0.149999999999977, 2.08999999999997, 4.23999999999995, 5.69,
0, 3.38, 1.95000000000005, 3.16999999999996, 2.82999999999998,
0, 1.69, 1.36000000000001, 0.669999999999959, 0.54000000000002,
0.529999999999973, 0.95999999999998, 0.600000000000023, 0.850000000000023,
0, 0.00999999999999091, 1.77999999999997, 1.98000000000002, 1.63,
2.74000000000001, 2.56, 3.50999999999999, 0, 0, 3.37, 0, 0.630000000000052,
0, 0.270000000000039, 0.769999999999982, 0.75, 1.25999999999999,
0, 0.689999999999998, 1.12, 0.210000000000036, 2.66000000000003,
3.14000000000004, 2.24000000000001, 0.620000000000005, 0.0900000000000318,
0)), row.names = c(NA, -60L), class = c("tbl_df", "tbl", "data.frame"
))
ma1 <- structure(c(4, 6, 9, 12, 15, 19, 33, 37, 41, 54, 6, 9, 12, 15,
19, 24, 37, 41, 44, 60), .Dim = c(10L, 2L), .Dimnames = list(
NULL, c("co2_start", "co2_end")))
Answers (2)
You can use apply to go over the rows of ma1, each time using between(row_number(), ...) to filter, and wrap the result in colSums(). Here, x represents each row vector of ma1.
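The code for this answer is not preserved on this page, so the following is only a sketch of what the description suggests, not the answerer's exact code. It assumes dplyr is installed, and it rebuilds df1 (with the CO2 values rounded to two decimals, as a plain data.frame) and ma1 from the question so that it runs on its own. The name window_sums is illustrative.

```r
library(dplyr)

# Data from the question; CO2 values rounded to two decimals for brevity
co2 <- c(3.37, 0, 1.19, 0.89, 5.89, 0.17, 3.92, 1.46, 1.23, 2.60,
         1.40, 0, 4.36, 0.65, 0.15, 2.09, 4.24, 5.69, 0, 3.38,
         1.95, 3.17, 2.83, 0, 1.69, 1.36, 0.67, 0.54, 0.53, 0.96,
         0.60, 0.85, 0, 0.01, 1.78, 1.98, 1.63, 2.74, 2.56, 3.51,
         0, 0, 3.37, 0, 0.63, 0, 0.27, 0.77, 0.75, 1.26,
         0, 0.69, 1.12, 0.21, 2.66, 3.14, 2.24, 0.62, 0.09, 0)
df1 <- data.frame(CO2 = co2)
ma1 <- matrix(c(4, 6, 9, 12, 15, 19, 33, 37, 41, 54,
                6, 9, 12, 15, 19, 24, 37, 41, 44, 60),
              ncol = 2, dimnames = list(NULL, c("co2_start", "co2_end")))

# For each row x of ma1, keep the rows of df1 whose position lies
# between the start and end index (inclusive), then sum the column
window_sums <- apply(ma1, 1, function(x) {
  df1 %>%
    filter(between(row_number(), x[["co2_start"]], x[["co2_end"]])) %>%
    colSums()
})

print(window_sums)
```

The result has one sum per row of ma1, in the same order as the rows of ma1.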
A data.table approach will be faster, as would a base R approach. It is the filter/between step from the tidyverse that slows things down.
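The faster alternatives mentioned here are not preserved on this page. As a rough illustration of what a base R version could look like (an assumption, not the answerer's code), the sketch below sums each window by direct indexing with mapply, and also shows a vectorized variant based on cumulative sums. The data is rebuilt from the question with CO2 rounded to two decimals; loop_sums, cs, and cumsum_sums are illustrative names.

```r
# Data from the question; CO2 values rounded to two decimals for brevity
co2 <- c(3.37, 0, 1.19, 0.89, 5.89, 0.17, 3.92, 1.46, 1.23, 2.60,
         1.40, 0, 4.36, 0.65, 0.15, 2.09, 4.24, 5.69, 0, 3.38,
         1.95, 3.17, 2.83, 0, 1.69, 1.36, 0.67, 0.54, 0.53, 0.96,
         0.60, 0.85, 0, 0.01, 1.78, 1.98, 1.63, 2.74, 2.56, 3.51,
         0, 0, 3.37, 0, 0.63, 0, 0.27, 0.77, 0.75, 1.26,
         0, 0.69, 1.12, 0.21, 2.66, 3.14, 2.24, 0.62, 0.09, 0)
df1 <- data.frame(CO2 = co2)
ma1 <- matrix(c(4, 6, 9, 12, 15, 19, 33, 37, 41, 54,
                6, 9, 12, 15, 19, 24, 37, 41, 44, 60),
              ncol = 2, dimnames = list(NULL, c("co2_start", "co2_end")))

# Variant 1: index the column directly for each (start, end) pair
loop_sums <- mapply(function(s, e) sum(df1$CO2[s:e]),
                    ma1[, "co2_start"], ma1[, "co2_end"])

# Variant 2: difference of cumulative sums, with a leading 0 so that
# cs[i + 1] is the sum of the first i values
cs <- c(0, cumsum(df1$CO2))
cumsum_sums <- cs[ma1[, "co2_end"] + 1] - cs[ma1[, "co2_start"]]

print(loop_sums)
print(cumsum_sums)
```

Both variants avoid filtering the data frame once per window. The cumulative-sum variant touches df1 only once, which may matter when there are many windows.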
This approach returns only the sums for the index windows you actually want, rather than for every sequential pair of indices.
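The code for this answer is not preserved on this page either. As posted, row 2 of ma1 is (6, 9) rather than (9, 12), so consecutive rows chain into one another. One possible reading of this answer, and it is only an assumption, is that the wanted windows are every other row of ma1 (rows 1, 3, 5, 7, 9), which gives 4 to 6, 9 to 12, 15 to 19, and so on as described in the question. The sketch below follows that reading in base R; wanted and wanted_sums are illustrative names, and the data is rebuilt from the question with CO2 rounded to two decimals.

```r
# Data from the question; CO2 values rounded to two decimals for brevity
co2 <- c(3.37, 0, 1.19, 0.89, 5.89, 0.17, 3.92, 1.46, 1.23, 2.60,
         1.40, 0, 4.36, 0.65, 0.15, 2.09, 4.24, 5.69, 0, 3.38,
         1.95, 3.17, 2.83, 0, 1.69, 1.36, 0.67, 0.54, 0.53, 0.96,
         0.60, 0.85, 0, 0.01, 1.78, 1.98, 1.63, 2.74, 2.56, 3.51,
         0, 0, 3.37, 0, 0.63, 0, 0.27, 0.77, 0.75, 1.26,
         0, 0.69, 1.12, 0.21, 2.66, 3.14, 2.24, 0.62, 0.09, 0)
df1 <- data.frame(CO2 = co2)
ma1 <- matrix(c(4, 6, 9, 12, 15, 19, 33, 37, 41, 54,
                6, 9, 12, 15, 19, 24, 37, 41, 44, 60),
              ncol = 2, dimnames = list(NULL, c("co2_start", "co2_end")))

# ASSUMPTION: only the odd-numbered rows of ma1 are wanted windows
wanted <- ma1[seq(1, nrow(ma1), by = 2), , drop = FALSE]

# Sum df1$CO2 over each wanted window (both ends inclusive)
wanted_sums <- mapply(function(s, e) sum(df1$CO2[s:e]),
                      wanted[, "co2_start"], wanted[, "co2_end"])

print(cbind(wanted, sum = wanted_sums))
```

If the intended windows differ from this reading, only the row selection that builds wanted needs to change.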