How do I write a function for lapply()?
This is the top of my dataset:
state start_date end_date created_at cycle party answer candidate_name pct survey_length
1 Florida 2020-11-02 2020-11-02 6/14/21 15:36 2020 REP Trump Donald Trump 48.0 0 days
2 Iowa 2020-11-01 2020-11-02 11/2/20 09:02 2020 REP Trump Donald Trump 48.0 1 days
3 Pennsylvania 2020-11-01 2020-11-02 11/2/20 12:49 2020 REP Trump Donald Trump 49.2 1 days
4 Florida 2020-11-01 2020-11-02 11/2/20 19:02 2020 REP Trump Donald Trump 48.2 1 days
5 Florida 2020-10-31 2020-11-02 11/4/20 09:17 2020 REP Trump Donald Trump 49.4 2 days
6 Nevada 2020-10-31 2020-11-02 11/4/20 10:38 2020 REP Trump Donald Trump 49.1 2 days
I want to take the average of the 'pct' column, for each month, for each state.
I can filter the data individually and use aggregate() like this:
Alabama <- filter(prep2020, prep2020$state == 'Alabama')
Alabama$end_date <- format(Alabama$end_date, '%m')
AL <- aggregate(Alabama$pct, by=list(Alabama$end_date), mean)
I think the best way would be to write a function that does this for all states and then use that function in lapply(), but I can't figure out how to do that. Any suggestions?
Comments (2)
The aggregate() function can do this automatically. To illustrate, I use the mtcars dataset and take the mean of mpg for each combination of the am and cyl variables. The same grouping can also be written as a data.table solution or a dplyr solution.
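A minimal sketch of what such grouped means could look like on mtcars. The object names (by_base, by_dt, by_dplyr) are made up for illustration, and the data.table and dplyr variants are wrapped in requireNamespace() checks on the assumption that those packages may not be installed:

```r
# Base R: mean of mpg for every combination of am and cyl
by_base <- aggregate(mpg ~ am + cyl, data = mtcars, FUN = mean)
print(by_base)

# data.table variant (only runs if the package is available)
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(mtcars)
  by_dt <- dt[, .(mpg = mean(mpg)), by = .(am, cyl)]
  print(by_dt)
}

# dplyr variant (only runs if the package is available)
if (requireNamespace("dplyr", quietly = TRUE)) {
  by_dplyr <- dplyr::summarise(
    dplyr::group_by(mtcars, am, cyl),
    mpg = mean(mpg),
    .groups = "drop"
  )
  print(by_dplyr)
}
```

All three return one row per (am, cyl) pair. For the polling data, the grouping variables would be the state and the month instead.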
You can do this:
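One way this could look, as a sketch rather than the answerer's exact code: add a month column and let aggregate() group by state and month in a single call. The prep2020 data frame below is a small stand-in that keeps only three columns from the question; its last two rows are invented so that more than one month appears.

```r
# Stand-in for prep2020: first six rows follow the question's sample,
# the last two rows are invented for illustration (not real poll data)
prep2020 <- data.frame(
  state = c("Florida", "Iowa", "Pennsylvania", "Florida", "Florida", "Nevada",
            "Florida", "Iowa"),
  end_date = as.Date(c("2020-11-02", "2020-11-02", "2020-11-02", "2020-11-02",
                       "2020-11-02", "2020-11-02", "2020-10-15", "2020-10-20")),
  pct = c(48.0, 48.0, 49.2, 48.2, 49.4, 49.1, 47.0, 46.0)
)

# Extract the month as a two-digit string, leaving end_date untouched
prep2020$month <- format(prep2020$end_date, "%m")

# Mean pct for every state/month combination, no per-state filtering needed
monthly <- aggregate(pct ~ state + month, data = prep2020, FUN = mean)
monthly <- monthly[order(monthly$state, monthly$month), ]
print(monthly)
```

If the real data spans more than one year, grouping on format(end_date, "%Y-%m") instead of "%m" would keep the same month of different years apart.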
However, if you want to write a function using the approach you started with, you can do something like this:
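A sketch of that function-plus-lapply() approach, assuming the same filter, format, aggregate steps as in the question. The function name monthly_mean_for_state and the stand-in prep2020 data frame are hypothetical; the last two rows of the data are invented for illustration.

```r
# Stand-in for prep2020: first six rows follow the question's sample,
# the last two rows are invented for illustration (not real poll data)
prep2020 <- data.frame(
  state = c("Florida", "Iowa", "Pennsylvania", "Florida", "Florida", "Nevada",
            "Florida", "Iowa"),
  end_date = as.Date(c("2020-11-02", "2020-11-02", "2020-11-02", "2020-11-02",
                       "2020-11-02", "2020-11-02", "2020-10-15", "2020-10-20")),
  pct = c(48.0, 48.0, 49.2, 48.2, 49.4, 49.1, 47.0, 46.0)
)

# Monthly mean of pct for a single state
monthly_mean_for_state <- function(state_name, data) {
  one_state <- data[data$state == state_name, ]
  one_state$month <- format(one_state$end_date, "%m")
  aggregate(pct ~ month, data = one_state, FUN = mean)
}

# Apply the function to every state; extra arguments to lapply()
# (here data = prep2020) are passed through to the function
states <- sort(unique(prep2020$state))
by_state <- lapply(states, monthly_mean_for_state, data = prep2020)
names(by_state) <- states

# Optionally stack the per-state results into one data frame
combined <- do.call(
  rbind,
  lapply(states, function(s) cbind(state = s, by_state[[s]]))
)
print(combined)
```

lapply() returns a list with one data frame per state, so by_state[["Florida"]] holds Florida's monthly means; the final do.call(rbind, ...) step is only needed if a single table is preferred.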