Resampling multilevel data by group
I am trying to write a function that resamples names nested in groups. My function works for resampling without respect to groups, but I don't want to create samples of names that aren't in the same group.
Here's the function, where x is a vector of all names (some repeated), a is a vector of unique name observations, and b is a vector of unique names in randomized order.
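rep <- function(x, a, b) {  # note: this name masks base::rep()
  for (i in seq_along(a)) {
    x1 <- x                       # x1 is reset to x on every iteration
    x1[which(x == a[i])] <- b[i]  # replace every occurrence of a[i] with b[i]
  }
  x1
}
x <- c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller")
a <- sort(unique(x))       # unique names
b <- sample(a, length(a))  # the same names in random order
dat <- rep(x, a, b)
View(dat)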
"Smith" "Jones" "Washington" "Miller" "Jones" "Smith" "Smith" "Miller"
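In the output shown, only "Wells" changed. That is consistent with the loop copying x into x1 on every pass, so only the replacement for the last element of a survives. A minimal sketch of a version that applies every replacement is below. The name relabel_names is hypothetical (chosen to avoid masking base::rep()), and the seed is arbitrary.

```r
# Hypothetical helper: map every name in x from a[i] to b[i] in one pass.
relabel_names <- function(x, a, b) {
  stopifnot(length(a) == length(b))
  idx <- match(x, a)   # position of each element of x within a
  out <- b[idx]        # the same position selects its replacement in b
  # Names that do not appear in a are left unchanged.
  out[is.na(idx)] <- x[is.na(idx)]
  out
}

x <- c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller")
a <- sort(unique(x))
set.seed(1)                # arbitrary seed, only for reproducibility
b <- sample(a, length(a))
dat <- relabel_names(x, a, b)
print(dat)
```

Because b is a permutation of a, every occurrence of a given name receives the same replacement, and two different names never collapse into one.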
However, each name is nested in a group, so I need to avoid creating samples of names that are not in the same group. For example:
x           groupid
Smith       A1
Jones       B1
Washington  C1
Miller      A2
Wells       B1
Smith       A2
Smith       A3
Miller      A3
How can I account for that?
Comments (1)
This would be easier to accomplish with the tidyverse packages:
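As an illustration only, here is a minimal sketch of a grouped approach using dplyr. It assumes that "resampling within a group" means shuffling the unique names inside each groupid, so that a name can only be replaced by a name from the same group. The helper shuffle_within, the column x_resampled, and the seed are all hypothetical choices, not part of the original post.

```r
library(dplyr)

dat <- data.frame(
  x = c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller"),
  groupid = c("A1", "B1", "C1", "A2", "B1", "A2", "A3", "A3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: permute the unique names of one group among themselves.
shuffle_within <- function(v) {
  u <- unique(v)
  # sample.int() avoids the special case sample() has for length-one input.
  shuffled <- u[sample.int(length(u))]
  shuffled[match(v, u)]
}

set.seed(123)  # arbitrary seed, only for reproducibility
resampled <- dat %>%
  group_by(groupid) %>%
  mutate(x_resampled = shuffle_within(x)) %>%  # runs once per group
  ungroup()

print(resampled)
```

Groups that contain a single name (A1 and C1 in the example data) necessarily keep that name, since there is nothing else in the group to swap with.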