Aggregating columns in a data frame using a separate category vector
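I have a set of observations, each containing yes/no answers to a set of detailed questions. I want to aggregate the "yes" scores by a reduced set of question categories. I can do this with a "for" loop, but I know there must be a better way. Below is some code that sets up a random data set and should clarify what I am trying to do.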
#
# Initial data: 10 observations with no/yes (0/1) answers to 15 questions.
# Questions are grouped into 3 categories, 5 questions each. The category is
# not included in the data set. I want the sum of "yes" answers in each
# category for each of the 10 observations.
#
# Set up random test data
#
myData <- data.frame(COL1  = sample(rep(c(0, 1), 5), 10),
                     COL2  = sample(rep(c(0, 1), 5), 10),
                     COL3  = sample(rep(c(0, 1), 5), 10),
                     COL4  = sample(rep(c(0, 1), 5), 10),
                     COL5  = sample(rep(c(0, 1), 5), 10),
                     COL6  = sample(rep(c(0, 1), 5), 10),
                     COL7  = sample(rep(c(0, 1), 5), 10),
                     COL8  = sample(rep(c(0, 1), 5), 10),
                     COL9  = sample(rep(c(0, 1), 5), 10),
                     COL10 = sample(rep(c(0, 1), 5), 10),
                     COL11 = sample(rep(c(0, 1), 5), 10),
                     COL12 = sample(rep(c(0, 1), 5), 10),
                     COL13 = sample(rep(c(0, 1), 5), 10),
                     COL14 = sample(rep(c(0, 1), 5), 10),
                     COL15 = sample(rep(c(0, 1), 5), 10))
print(myData)
#
# Allocate storage for category totals
#
catSums <- data.frame(
  Cat01 = rep(NA, 10),
  Cat02 = rep(NA, 10),
  Cat03 = rep(NA, 10))
#
# For loop to aggregate sums in each category
#
for (i in 1:10) {
  catSums$Cat01[i] <- sum(myData[i, 1:5])
  catSums$Cat02[i] <- sum(myData[i, 6:10])
  catSums$Cat03[i] <- sum(myData[i, 11:15])
}
print(catSums)
Comments (3)
You could use dplyr:
This returns
You could do:
Output:
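One base-R possibility, offered as a sketch and not as the code from the comments above: keep the categories in a separate vector with one label per column, split the data frame by column with `split.default()`, and total each group with `rowSums()`. The `categories` vector, the fixed seed, and the stand-in data built with `matrix()` are assumptions for illustration; the question's own `myData` would work the same way.

```r
# Hypothetical stand-in for the question's myData:
# 10 observations with no/yes (0/1) answers to 15 questions.
set.seed(42)  # assumption: fixed seed so the example is reproducible
myData <- as.data.frame(matrix(sample(0:1, 150, replace = TRUE), nrow = 10))
names(myData) <- paste0("COL", 1:15)

# Separate category vector: one label per column of myData
# (assumed layout: questions 1-5, 6-10, 11-15 form the three categories).
categories <- rep(c("Cat01", "Cat02", "Cat03"), each = 5)

# split.default() treats the data frame as a list of columns and groups
# those columns by category; rowSums() then totals the "yes" answers
# per observation within each group.
catSums <- as.data.frame(lapply(split.default(myData, categories), rowSums))
print(catSums)
```

Because the grouping lives in `categories` and not in hard-coded column ranges, regrouping the questions only requires changing that vector.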