length doesn't work in by() aggregation?

Posted 2024-08-13 02:09:59

I have some survey data that I want to describe by political party and state.

I'm having some trouble with the by() aggregation command. It works with lots of functions, but just not length(). Eg:

by(x, list(party=nn$info$party,state=nn$info$st),mean)

works fine but not

by(x, list(party=nn$info$party,state=nn$info$st),length)

This returns an array filled not with the counts I'm looking for, but with a series of 1s. Here is the output for Alabama:

party: D
state: AL
[1] 1
--------------------------------------------------------------------------- 
party: I
state: AL
[1] 1
--------------------------------------------------------------------------- 
party: R
state: AL
[1] 1
--------------------------------------------------------------------------- 

Very mystifying. Any ideas?

Comments (2)

乖不如嘢 2024-08-20 02:09:59

OK, I'm going to guess that x is a data frame. In that case by() passes each group to the function as a data frame, and the length of a data frame is its number of columns, not its number of rows. You want nrow instead. Note that if foo is a data frame, foo$bar gives you the column as a plain vector, whereas foo["bar"] returns a data frame with one column.

> by(1:10, rep(1:5, 2), length)
rep(1:5, 2): 1
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 2
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 3
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 4
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 5
[1] 2
> by(data.frame(1:10), rep(1:5, 2), length)
rep(1:5, 2): 1
[1] 1
------------------------------------------------------------ 
rep(1:5, 2): 2
[1] 1
------------------------------------------------------------ 
rep(1:5, 2): 3
[1] 1
------------------------------------------------------------ 
rep(1:5, 2): 4
[1] 1
------------------------------------------------------------ 
rep(1:5, 2): 5
[1] 1
> by(data.frame(1:10), rep(1:5, 2), nrow)
rep(1:5, 2): 1
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 2
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 3
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 4
[1] 2
------------------------------------------------------------ 
rep(1:5, 2): 5
[1] 2
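
To see the same effect without the original data, here is a minimal sketch on a made-up data frame. The name df, its columns and its values are all invented for illustration and are not taken from the question. It only shows that by() hands each group to the function as a data frame, so length counts columns while nrow counts rows.

```r
# Hypothetical survey data; every name and value here is invented.
df <- data.frame(
  party = c("D", "D", "R", "R", "R", "I"),
  state = c("AL", "AL", "AL", "AL", "AL", "AL"),
  score = c(1, 2, 3, 4, 5, 6)
)

length(df)           # 3: number of columns
nrow(df)             # 6: number of rows
length(df$score)     # 6: df$score is a plain vector
length(df["score"])  # 1: df["score"] is a one-column data frame

# Each group reaches FUN as a data frame with all three columns.
cols_per_group <- by(df, list(party = df$party), length)
rows_per_group <- by(df, list(party = df$party), nrow)

print(cols_per_group)  # 3 for every party
print(rows_per_group)  # D: 2, I: 1, R: 3
```

If x in the question is a one-column data frame, the same mechanism would explain the run of 1s.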
一世旳自豪 2024-08-20 02:09:59

If you are trying to count the records in different groups of your data, the easiest way is usually table. It isn't clear from your post which data frame you want to use: is it x or nn$info? Assuming it is nn$info, your code should look something like

with(nn$info, table(party, state=st))

Here's an example anyone can replicate, using the Cars93 dataset in the MASS package.

> library(MASS)
> with(Cars93, table(Type, AirBags))
         AirBags
Type      Driver & Passenger Driver only None
  Compact                  2           9    5
  Large                    4           7    0
  Midsize                  7          11    4
  Small                    0           5   16
  Sporty                   3           8    3
  Van                      0           3    6
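
For the party-by-state case, a rough sketch along the same lines might look like the following. The data frame info stands in for nn$info, and its contents are invented for illustration. The as.data.frame() step is an optional extra, not part of the suggestion above, that turns the cross-tabulation into one row per party and state combination.

```r
# Hypothetical stand-in for nn$info; all values are invented.
info <- data.frame(
  party = c("D", "D", "R", "R", "R", "I"),
  st    = c("AL", "AL", "AL", "AK", "AK", "AL")
)

# Cross-tabulate party against state; 'state = st' only relabels the dimension.
counts <- with(info, table(party, state = st))
print(counts)

# Individual cells can be looked up by name.
counts["R", "AK"]  # 2

# Optional: long format with columns party, state and Freq.
long <- as.data.frame(counts)
print(long)
```

Unlike by() with nrow, table also reports combinations that never occur, as zeros.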