length() not working in by() aggregation?
I have some survey data that I want to describe by political party and state.
I'm having some trouble with the by() aggregation command. It works with lots of functions, just not with length(). For example:
by(x, list(party=nn$info$party, state=nn$info$st), mean)
works fine, but
by(x, list(party=nn$info$party, state=nn$info$st), length)
returns an array filled not with the counts I'm looking for, but with a series of 1s. This is what it looks like for Alabama:
party: D
state: AL
[1] 1
---------------------------------------------------------------------------
party: I
state: AL
[1] 1
---------------------------------------------------------------------------
party: R
state: AL
[1] 1
---------------------------------------------------------------------------
Very mystifying. Any ideas?
Comments (2)
Ok, I'm going to guess that x is a data frame. In that case, length returns the number of columns, not the number of elements. You want nrow instead. Note that if foo is a data frame, extracting a single column with foo["bar"] returns a data frame with one column, whereas foo$bar returns a plain vector.

If you are trying to get the number of records for different groups of your data, then the easiest way to do it is usually with table. It isn't clear from your post which data frame you want to use: is it x or nn$info? With this in mind, your code should look something like a table call on the party and state columns of that data frame. The same idea can be replicated by anyone using the Cars93 dataset in the MASS package.
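
To make both answers concrete, here is a minimal sketch using Cars93. The grouping columns (Type and Origin) and the one-column data frame x are assumptions chosen only for illustration; they stand in for the asker's party, state, and x, whose real structure isn't shown in the post.

```r
# Cars93 ships with the MASS package.
library(MASS)

# Hypothetical stand-ins for the asker's grouping variables.
groups <- list(type = Cars93$Type, origin = Cars93$Origin)

# Assumption: x is a data frame with a single column.
x <- Cars93["Price"]

# length() of a data frame is its number of columns,
# so each non-empty group reports 1 rather than a count.
lengths_by <- by(x, groups, length)
print(lengths_by)

# nrow() counts the rows that fall into each group.
counts_by <- by(x, groups, nrow)
print(counts_by)

# table() produces the group counts directly from the factors,
# without going through by() at all.
counts_table <- table(type = Cars93$Type, origin = Cars93$Origin)
print(counts_table)
```

For the survey data, the analogous call would presumably be table(party = nn$info$party, state = nn$info$st), assuming those columns hold the grouping factors.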