I am trying to write code in R that takes a certain column, finds all rows that share each value in it, and extracts those rows (including all their other columns) into a new data frame. I want this to repeat for every distinct value in the base column. For instance:
mydata <- data.frame(x = c(1,2,3), y = c('a','b','c'), z = c('red','red','yellow'))
colors <- list(mydata$z)
for (i in 1:length(colors)) {
  assign(paste0("mydata_", i), subset(mydata, z == colors[[i]]))
}
This was my latest attempt, but I can't get it to work. The goal in this example is to have 2 new data frames called "mydata_red" and "mydata_yellow", each containing only the matching rows.
Comments (3)
Using assign() to split a frame or list into multiple objects is an anti-pattern, and rarely an improvement over the preferred method of keeping all frames in a list. See the discussions on this topic under "How do I make a list of data frames?". One premise is that when you do something to one frame in the list, you will likely do something very similar to the other frames, and working on the list with lapply() while generalizing your methods a little can make for cleaner solutions. To get there with this data, it is as easy as splitting the frame on z.
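The code that originally followed this paragraph is missing from the text. A minimal sketch of the idea, assuming base R's split() is what was meant (the names frames and row_counts are only illustrative):

```r
mydata <- data.frame(x = c(1, 2, 3),
                     y = c("a", "b", "c"),
                     z = c("red", "red", "yellow"))

# split() returns a named list with one data frame per distinct value of z
frames <- split(mydata, mydata$z)

names(frames)   # one name per distinct color
frames$red      # only the rows where z == "red"

# Work on every frame at once instead of on separate objects
row_counts <- lapply(frames, nrow)
```

Keeping the frames in one list means any later step can be written once and applied to all of them with lapply().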
As suggested in jay.sf's comment, this list of frames can then be converted into individual objects. While I discourage that in general, perhaps it is best for your use case.
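The function named in that comment did not survive in the text. One base R function that does this is list2env(), used here as an assumption and not necessarily what the comment proposed:

```r
mydata <- data.frame(x = c(1, 2, 3),
                     y = c("a", "b", "c"),
                     z = c("red", "red", "yellow"))

frames <- split(mydata, mydata$z)

# Prefix the list names so the objects end up as mydata_red and mydata_yellow
names(frames) <- paste0("mydata_", names(frames))

# list2env() copies each list element into the environment as its own object;
# invisible() only suppresses printing of the returned environment
invisible(list2env(frames, envir = globalenv()))
```

After this runs, mydata_red and mydata_yellow exist as separate data frames in the global environment.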
Your code works fine. Just remove the list() call so you create a vector of color names and not a list. If you only want distinct values, use unique().
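A sketch of the loop with those two changes applied. Building the object name from the color value instead of the loop index is my own addition, made so that the names match the ones asked for in the question:

```r
mydata <- data.frame(x = c(1, 2, 3),
                     y = c("a", "b", "c"),
                     z = c("red", "red", "yellow"))

# Without list(), colors is a plain vector; unique() keeps each color once
colors <- unique(mydata$z)

for (color in colors) {
  # Name each object after the value it holds, e.g. mydata_red
  assign(paste0("mydata_", color), subset(mydata, z == color))
}
```

With the index-based name from the question, the objects would be called mydata_1, mydata_2 and so on.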
In tidyverse:
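The code for this answer is missing from the text. A minimal sketch consistent with the explanation that follows, assuming dplyr is installed (the name frames is illustrative):

```r
library(dplyr)

mydata <- data.frame(x = c(1, 2, 3),
                     y = c("a", "b", "c"),
                     z = c("red", "red", "yellow"))

frames <- mydata %>%
  group_by(z) %>%
  # .x holds the rows of the current group without the grouping column,
  # .y is a one-row tibble holding the group key, so z is added back
  group_map(~ .x %>% mutate(z = .y$z))

length(frames)   # one element per distinct value of z
```

The result is an unnamed list of tibbles, one per group.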
The expression ~.x %>% mutate(z = .y$z) may look a bit strange at first sight. The ~ creates a lambda (function). By default, the .f argument to group_map() takes one required and one optional parameter. The required argument is by default named .x and contains the subset of the input data frame that makes up the current group. Similarly, .y, the optional argument, contains a single row that defines the current group. group_map() applies the function defined by .f to each group of the input data frame in turn and returns the results in a list.
Has the same effect.
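The snippet that the closing remark refers to is not present in the text. One dplyr call that gives a comparable result is group_split(), shown here as an assumption and not necessarily what the author posted:

```r
library(dplyr)

mydata <- data.frame(x = c(1, 2, 3),
                     y = c("a", "b", "c"),
                     z = c("red", "red", "yellow"))

# group_split() returns a list of tibbles, one per distinct value of z,
# and keeps the z column in each piece
pieces <- mydata %>% group_split(z)

length(pieces)
```

Like group_map(), this returns an unnamed list, so the pieces are accessed by position.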