R - Select all rows for a random sample of column values?

Posted on 2024-11-05 17:45:03

How can I select all of the rows for a random sample of column values?

I have a dataframe that looks like this:

tag  weight

R007     10
R007     11
R007      9
J102     11
J102      9
J102     13
J102     10
M942      3
M054      9
M054     12  
V671     12
V671     13
V671      9
V671     12
Z990     10
Z990     11

You can reproduce it with:

weights_df <- structure(list(tag = structure(c(4L, 4L, 4L, 1L, 1L, 1L, 1L, 
3L, 2L, 2L, 5L, 5L, 5L, 5L, 6L, 6L), .Label = c("J102", "M054", 
"M942", "R007", "V671", "Z990"), class = "factor"), value = c(10L, 
11L, 9L, 11L, 9L, 13L, 10L, 3L, 9L, 12L, 12L, 14L, 5L, 12L, 11L, 
15L)), .Names = c("tag", "value"), class = "data.frame", row.names = c(NA, 
-16L))

I need to create a dataframe containing all of the rows from the above dataframe for two randomly sampled tags. Let's say tags R007 and M942 get selected at random; my new dataframe then needs to look like this:

tag  weight

R007     10
R007     11
R007      9
M942      3

How do I do this?

I know I can create a list of two random tags like this:

library(plyr)
tags <- ddply(weights_df, .(tag), summarise, count = length(tag))
set.seed(5464)
tag_sample <- tags[sample(nrow(tags),2),]
tag_sample

Resulting in...

   tag count
4 R007     3
3 M942     1

But I just don't know how to use that to subset my original dataframe.

月依秋水 2024-11-12 17:45:03

Is this what you want?

subset(weights_df, tag %in% sample(levels(tag), 2))
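As a rough sketch of how that one-liner applies to the data from the question, the sampled tags can be stored first so the selection is reproducible and easy to inspect. The seed value and the names `chosen_tags` and `sampled_df` are illustrative assumptions, not part of the original answer; the data frame is rebuilt from the values in the question's `structure()` call.

```r
# Rebuild the question's data (values copied from its structure() call).
weights_df <- data.frame(
  tag = factor(c("R007", "R007", "R007", "J102", "J102", "J102", "J102",
                 "M942", "M054", "M054", "V671", "V671", "V671", "V671",
                 "Z990", "Z990")),
  value = c(10L, 11L, 9L, 11L, 9L, 13L, 10L, 3L, 9L, 12L, 12L, 14L, 5L,
            12L, 11L, 15L)
)

set.seed(1)  # arbitrary seed, used only to make the draw repeatable

# Draw two distinct tags from the factor levels (hypothetical name).
chosen_tags <- sample(levels(weights_df$tag), 2)

# Keep every row whose tag is one of the sampled tags.
sampled_df <- subset(weights_df, tag %in% chosen_tags)

# Optional: drop factor levels that no longer occur in the subset.
sampled_df <- droplevels(sampled_df)
```

Sampling from `levels(tag)` gives each tag the same chance of being picked, regardless of how many rows it has. This relies on `tag` being a factor; for a character column, `unique(weights_df$tag)` would play the same role.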

贱人配狗天长地久 2024-11-12 17:45:03

If your data.frame is named dfrm, then this will select the tags of 100 randomly chosen rows:

dfrm[ sample(NROW(dfrm), 100), "tag" ]   # possibly with repeats

If, on the other hand, you want a dataframe with the same columns (possibly with repeats):

samp <- dfrm[ sample(NROW(dfrm), 100),  ]  # leave the col name entry blank to get all

A third possibility: you want 100 distinct tags at random, with the selection probability not weighted by how often each tag occurs:

samp.tags <- unique(dfrm$tag)[ sample(length(unique(dfrm$tag)), 100) ]
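One caveat with sampling distinct tags: `sample()` without replacement raises an error when asked for more items than exist, so the call above fails if there are fewer than 100 unique tags. A minimal sketch of a guard, assuming a small made-up `dfrm` and hypothetical helper names `n_wanted`, `uniq_tags`, and `n_take`:

```r
# Small illustrative data frame (made up for this sketch).
dfrm <- data.frame(
  tag = c("R007", "R007", "J102", "M942", "M054", "V671", "Z990"),
  value = c(10L, 11L, 11L, 3L, 9L, 12L, 10L),
  stringsAsFactors = FALSE
)

n_wanted <- 100                      # number of distinct tags requested
uniq_tags <- unique(dfrm$tag)        # each tag appears once here

# Never ask sample() for more tags than are available.
n_take <- min(n_wanted, length(uniq_tags))

# Each distinct tag is equally likely, regardless of its row count.
samp.tags <- sample(uniq_tags, n_take)
```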

Edit: With the revised question, one of these:

subset(dfrm, tag %in% c("R007", "M942"))

Or:

dfrm[dfrm$tag %in% c("R007", "M942"), ]

Or:

dfrm[grep("R007|M942", dfrm$tag), ]
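The hard-coded tags above can be replaced by the `tag_sample` data frame from the question. The sketch below assumes the question's data under the name `dfrm`, and uses base R's `table()` as a stand-in for the `ddply()` call so that it runs without plyr; that substitution is an assumption of this example, not the asker's code.

```r
# The question's data, under the name used in this answer.
dfrm <- data.frame(
  tag = factor(c("R007", "R007", "R007", "J102", "J102", "J102", "J102",
                 "M942", "M054", "M054", "V671", "V671", "V671", "V671",
                 "Z990", "Z990")),
  value = c(10L, 11L, 9L, 11L, 9L, 13L, 10L, 3L, 9L, 12L, 12L, 14L, 5L,
            12L, 11L, 15L)
)

# One row per tag with its row count (base-R stand-in for ddply()).
tags <- as.data.frame(table(tag = dfrm$tag), responseName = "count")

set.seed(5464)
tag_sample <- tags[sample(nrow(tags), 2), ]  # two randomly chosen tags

# Exact matching against the sampled tags; no regular expression needed.
result <- dfrm[dfrm$tag %in% tag_sample$tag, ]
```

Of the three variants, `%in%` compares whole values, while `grep()` matches substrings, so the regular-expression form could also pick up tags that merely contain one of the sampled tags.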
