Merge two list components

Posted on 2024-12-10 06:19:42, 488 words, 1 view, 0 comments

I have a big list, but a small example looks like this:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 
mylist <- list(A=A, B=B, C= C)

The expected output merges A with B element by element, so that each component looks like AB:

AA, aA, Aa, aa, Aa

Better still, each pair should be sorted so that the uppercase letter always comes first:

AA, Aa, Aa, aa, Aa

Thus the new list or matrix should have two columns (or rows):

AA, Aa, Aa, aa, Aa
1,   2, 3,   1, 4

Now I want to calculate the average of C for each class: "AA", "Aa", and "aa".

It looks simple, but I could not figure it out easily.

Comments (3)

2024-12-17 06:19:42
> (ab <- paste(A, B, sep=""))  # the joining step
[1] "AA" "aA" "Aa" "aa" "Aa"
> (ab <- sub("([a-z])([A-Z])", "\\2\\1", ab))  # swap a lowercase-uppercase pair so uppercase comes first
[1] "AA" "Aa" "Aa" "aa" "Aa"

> rbind(ab, C)                  # matrix; C is coerced to character
   [,1] [,2] [,3] [,4] [,5]
ab "AA" "Aa" "Aa" "aa" "Aa"
C  "1"  "2"  "3"  "1"  "4" 
> data.frame(alleles=ab, count=C)  # data frames are lists; count stays numeric
  alleles count
1      AA     1
2      Aa     2
3      Aa     3
4      aa     1
5      Aa     4
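
The transcript above stops before the averaging step the question asks for. A minimal sketch of that step, assuming the A, B and C vectors from the question and using base R's tapply. The name class_means is only illustrative, and the POSIX classes [[:lower:]] and [[:upper:]] are swapped in for [a-z] and [A-Z] because character ranges can behave differently across locales:

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Join the pairs, then put the uppercase letter first ("aA" becomes "Aa").
ab <- sub("([[:lower:]])([[:upper:]])", "\\2\\1", paste(A, B, sep = ""))

# Mean of C within each class; tapply returns a named numeric vector.
# class_means is an illustrative name, not part of the original answer.
class_means <- tapply(C, ab, mean)
```

Individual values can then be looked up by name, for example class_means[["Aa"]], which avoids relying on the order in which the classes are listed.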
似狗非友 2024-12-17 06:19:42

I can do it with the plyr package if your data is arranged in a data.frame:

> A <- c("A", "a", "A", "a", "A")
> B <- c("A", "A", "a", "a", "a")
> C <- c(1, 2, 3, 1, 4) 
> (groups <- paste(A, B, sep=""))  # no sort() here: sorting would break the alignment with C
[1] "AA" "aA" "Aa" "aa" "Aa"
> my.df <- data.frame(A=A, B=B, C=C, group=groups)

> library(plyr)
> result <- ddply(my.df, "group", transform, group.means=mean(C))
> result[order(result$group, decreasing=TRUE),]
  A B C group group.means
5 A A 1    AA         1.0
3 A a 3    Aa         3.5
4 A a 4    Aa         3.5
2 a A 2    aA         2.0
1 a a 1    aa         1.0
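
Note that the grouping above keeps "aA" and "Aa" as separate classes. If they should count as one class, as the question asks, a base R sketch without plyr might look like this. It assumes the same A, B and C vectors; the helper name swap is hypothetical, and ave() is used here in place of ddply(..., transform, ...):

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

my.df <- data.frame(A = A, B = B, C = C, stringsAsFactors = FALSE)

# Swap a pair only when A is lowercase and B is uppercase,
# so the uppercase letter always comes first.
# `swap` is a hypothetical helper name used for illustration.
swap <- A == tolower(A) & B == toupper(B)
my.df$group <- ifelse(swap, paste0(B, A), paste0(A, B))

# ave() returns one value per row: the mean of C within that row's group.
my.df$group.means <- ave(my.df$C, my.df$group, FUN = mean)
```

With the normalized key, the three "Aa" rows share a single mean instead of being split across "Aa" and "aA".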
千纸鹤 2024-12-17 06:19:42

With your data:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 

I define a data.frame using the combination of A and B as the key column:

AB <- paste(A, B, sep='')
df <- data.frame(id=AB, C=C)

> df
  id C
1 AA 1
2 aA 2
3 Aa 3
4 aa 1
5 Aa 4

If you need to order this data.frame before the aggregation:

df <- df[order(AB, decreasing=TRUE),]

> df
  id C
 1 AA 1
 3 Aa 3
 5 Aa 4
 2 aA 2
 4 aa 1

With aggregate you can then calculate the mean of C for each id:

meanDF <- aggregate(C~id, data=df, mean)

> meanDF

  id   C
1 aa 1.0
2 aA 2.0
3 Aa 3.5
4 AA 1.0

If you want to order after the aggregation instead:

df <- data.frame(id=AB, C=C)
meanDF <- aggregate(C~id, data=df, mean)
meanDF <- meanDF[order(meanDF$id, decreasing=TRUE),]
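
The row order of the aggregate output above depends on the locale's collation, and "aA" is still counted separately from "Aa". One possible way around both, sketched under the assumption that only the three classes AA, Aa and aa occur, is to normalize the key and give it explicit factor levels:

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Join the pairs and move the uppercase letter to the front ("aA" becomes "Aa").
AB <- sub("([[:lower:]])([[:upper:]])", "\\2\\1", paste0(A, B))

# Explicit levels fix the row order as AA, Aa, aa regardless of locale.
# Assumption: no other classes occur; any other value would become NA.
df <- data.frame(id = factor(AB, levels = c("AA", "Aa", "aa")), C = C)

# aggregate() lists the groups in factor-level order.
meanDF <- aggregate(C ~ id, data = df, FUN = mean)
```

Because the levels are set by hand, no separate order() call is needed after the aggregation.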