将“by”转换为“by” R 中数据框的对象

发布于 2024-08-28 16:03:44 字数 945 浏览 10 评论 0原文

我正在使用 R 中的“by”函数来分割数据框并将函数应用于不同的部分，如下所示：

pairwise.compare <- function(x) {
Nright <- ...
Nwrong <- ...
Ntied <- ...
return(c(Nright=Nright, Nwrong=Nwrong, Ntied=Ntied))
}
Z.by <- by(rankings, INDICES=list(rankings$Rater, rankings$Class), FUN=pairwise.compare)

结果 (Z.by) 看起来像这样：

: 4 
: 357 
Nright Nwrong Ntied
     3      0     0
------------------------------------------------------------
: 8 
: 357 
NULL
------------------------------------------------------------
: 10 
: 470 
Nright Nwrong Ntied
     3      4     1 
------------------------------------------------------------ 
: 11 
: 470 
Nright Nwrong Ntied
    12      4     1

我想要的是将此结果转换进入数据帧（不存在 NULL 条目），因此它看起来像这样：

  Rater Class Nright Nwrong Ntied
1     4   357      3      0     0
2    10   470      3      4     1
3    11   470     12      4     1

我该怎么做？

原文

I'm using the "by" function in R to chop up a data frame and apply a function to different parts, like this:

pairwise.compare <- function(x) {
Nright <- ...
Nwrong <- ...
Ntied <- ...
return(c(Nright=Nright, Nwrong=Nwrong, Ntied=Ntied))
}
Z.by <- by(rankings, INDICES=list(rankings$Rater, rankings$Class), FUN=pairwise.compare)

The result (Z.by) looks something like this:

: 4 
: 357 
Nright Nwrong Ntied
     3      0     0
------------------------------------------------------------
: 8 
: 357 
NULL
------------------------------------------------------------
: 10 
: 470 
Nright Nwrong Ntied
     3      4     1 
------------------------------------------------------------ 
: 11 
: 470 
Nright Nwrong Ntied
    12      4     1

What I would like is to have this result converted into a data frame (with the NULL entries not present) so it looks like this:

  Rater Class Nright Nwrong Ntied
1     4   357      3      0     0
2    10   470      3      4     1
3    11   470     12      4     1

How do I do that?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

心舞飞扬 2024-09-04 16:03:45

by 函数返回一个列表，因此您可以执行以下操作：

data.frame(do.call("rbind", by(x, column, mean)))

The by function returns a list, so you can do something like this:

data.frame(do.call("rbind", by(x, column, mean)))

回复收藏 0 原文

妄司 2024-09-04 16:03:45

考虑在 plyr 包中使用 ddply 而不是 by。它处理将列添加到数据框的工作。

回复收藏 0 原文

戏蝶舞 2024-09-04 16:03:45

旧线程，但对于搜索此主题的任何人：

analysis = by(...)
data.frame(t(vapply(analysis,unlist,unlist(analysis[[1]]))))

unlist() 将获取 by() 输出的元素（在本例中为 analysis) 并将其表示为命名向量。 vapply() 取消列出analysis 的所有元素并输出结果。它需要一个虚拟参数来了解输出类型，这就是 analysis[[1]] 的用途。如果可能的话，您可能需要添加一个检查来确保分析不为空。每个输出都是一列，因此 t() 将其转置到所需的方向，其中每个分析条目都成为一行。

Old thread, but for anyone who searches for this topic:

analysis = by(...)
data.frame(t(vapply(analysis,unlist,unlist(analysis[[1]]))))

unlist() will take an element of a by() output (in this case, analysis) and express it as a named vector.
vapply() does unlist to all the elemnts of analysis and outputs the result. It requires a dummy argument to know the output type, which is what analysis[[1]] is there for. You may need to add a check that analysis is not empty if that will be possible.
Each output will be a column, so t() transposes it to the desired orientation where each analysis entry becomes a row.

回复收藏 0 原文

氛圍 2024-09-04 16:03:45

这扩展了 Shane 使用 rbind() 的解决方案，但还添加了标识组的列并删除了 NULL 组 - 问题中要求的两个功能。通过使用基础包函数，不需要其他依赖项，例如 plyr。

simplify_by_output = function(by_output) {
    null_ind = unlist(lapply(by_output, is.null))  # by() returns NULL for combinations of grouping variables for which there are no data. rbind() ignores those, so you have to keep track of them.
    by_df = do.call(rbind, by_output)  # Combine the results into a data frame.
    return(cbind(expand.grid(dimnames(by_output))[!null_ind, ], by_df))  # Add columns identifying groups, discarding names of groups for which no data exist.
}

This expands upon Shane's solution of using rbind() but also adds columns identifying groups and removes NULL groups - two features which were requested in the question. By using base package functions, no other dependencies are required, e.g., plyr.

simplify_by_output = function(by_output) {
    null_ind = unlist(lapply(by_output, is.null))  # by() returns NULL for combinations of grouping variables for which there are no data. rbind() ignores those, so you have to keep track of them.
    by_df = do.call(rbind, by_output)  # Combine the results into a data frame.
    return(cbind(expand.grid(dimnames(by_output))[!null_ind, ], by_df))  # Add columns identifying groups, discarding names of groups for which no data exist.
}

回复收藏 0 原文

疯了 2024-09-04 16:03:45

我会做

x = by(data, list(data$x, data$y), function(d) whatever(d))
array(x, dim(x), dimnames(x))

I would do

x = by(data, list(data$x, data$y), function(d) whatever(d))
array(x, dim(x), dimnames(x))

回复收藏 0 原文

~没有更多了~

关于作者

撧情箌佬

暂无简介

文章

26 人气

关注发私信

友情链接

文江博客

将“by”转换为“by” R 中数据框的对象

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（5）

关于作者

相关话题

热门标签

推荐作者

Promise

qq_lbRlsh

待＂谢繁草

yy2010hell

漫无边际

傲娇萝莉攻

友情链接

将“by”转换为“by” R 中数据框的对象

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（5）

关于作者

相关话题

热门标签

推荐作者

Promise

qq_lbRlsh

待＂谢繁草

yy2010hell

漫无边际

傲娇萝莉攻

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。