Function that creates a data frame using multiple arguments [r]

Posted on 2025-01-22 05:33:05

I have a data frame named titanic with 2021 rows of passengers on the Titanic and specific characteristics of each passenger:

  Class  Sex   Age Survived
1   3rd Male Child       No
2   3rd Male Child       No
3   3rd Male Child       No
4   3rd Male Child       No
5   3rd Male Child       No
6   3rd Male Child       No
...

I want to create a function that has multiple arguments that looks something like this:

f1 <- function(sex, age, class, survived){
...
}

Here the arguments are the criteria I supply for the passengers. As an example, I want to be able to pass criteria to the function such that

f1("Female", "Child","3rd", "Yes")

returns

     Class    Sex   Age Survived
1534   3rd Female Child      Yes
1535   3rd Female Child      Yes
1536   3rd Female Child      Yes
1537   3rd Female Child      Yes
1538   3rd Female Child      Yes

Right now I have hard-coded it, using an if/else chain to cover all of the possibilities.

function.q6.1 <- function(sex,age,class,survival){
  if(sex == "Male" & age == "Child" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Male" & Age == "Child" & Class == "3rd" & Survived == "No")
  }
  else if(sex == "Female" & age == "Child" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Female" & Age == "Child" & Class == "3rd" & Survived == "No")
  }
  else if(sex == "Male" & age == "Adult" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Male" & Age == "Adult" & Class == "3rd" & Survived == "No")
  }
...
}

I want to know if there is a more efficient way of doing this. Thank you ahead of time.


Comments (4)

就此别过 2025-01-29 05:33:05

This assumes that the first argument is the data frame and that the remaining arguments are values for its columns, given either in the order the columns appear in the data frame or by name.

There can be fewer arguments than columns. In that case unnamed arguments are matched against the first columns of the data frame, one column per argument; named arguments are matched against the columns with those names. The arguments after the data frame must be either all named or all unnamed. If only the data frame is passed, with no other arguments, then NULL is returned invisibly.

If there is a non-zero number of arguments after the data frame, we take their names or, if they are unnamed, the first n column names, where n is the number of arguments after the data frame. We then remove rows containing NAs from dat, on the assumption that those rows cannot match. mapply compares successive columns to successive argument values, returning a logical matrix. apply then reduces that matrix to one logical value per row, and we subscript dat by the result.

The test calls use the data frame defined reproducibly in the Note at the end.

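f1 <- function(dat, ...) {
  # proceed only if at least one criterion was passed after dat
  if (n <- ...length()) {
    # use the argument names if given, else the first n column names
    # (...names() requires R >= 4.1.0)
    if (is.null(nms <- ...names())) nms <- head(names(dat), n)
    dat <- na.omit(dat)  # rows with NAs are assumed not to match
    # logical matrix (one column per criterion); keep rows that are all TRUE
    dat[apply(mapply(`==`, dat[nms], list(...)), 1, all), ]
  }
}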

Now we run some tests

f1(dat, "3rd", "Male", "Child", "No")
##   Class  Sex   Age Survived
## 1   3rd Male Child       No
## 2   3rd Male Child       No
## 3   3rd Male Child       No
## 4   3rd Male Child       No
## 5   3rd Male Child       No
## 6   3rd Male Child       No

f1(dat, "3rd", "Female", "Child", "No")
## [1] Class    Sex      Age      Survived
## <0 rows> (or 0-length row.names)

f1(dat, "3rd")
##   Class  Sex   Age Survived
## 1   3rd Male Child       No
## 2   3rd Male Child       No
## 3   3rd Male Child       No
## 4   3rd Male Child       No
## 5   3rd Male Child       No
## 6   3rd Male Child       No

f1(BOD, 1, 8.3)  # BOD is built into R
##   Time demand
## 1    1    8.3

f1(BOD, demand = 8.3)
##   Time demand
## 1    1    8.3
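
To make the mapply/apply step described above more concrete, here is a small walk-through of the intermediate objects. It is only a sketch: the three-row data frame toy and the variable names criteria, m and keep are made up for illustration and are not part of the answer.

```r
# Small illustrative data frame (made up for this sketch, not the real titanic data)
toy <- data.frame(
  Class = c("3rd", "3rd", "1st"),
  Sex   = c("Male", "Female", "Male"),
  stringsAsFactors = FALSE
)

criteria <- list("3rd", "Male")             # values to match, in column order
nms <- head(names(toy), length(criteria))   # the first two column names

# mapply() compares column i with criterion i; the equal-length results
# are simplified into a logical matrix with one column per criterion
m <- mapply(`==`, toy[nms], criteria)
print(m)

# apply(m, 1, all) collapses each row of the matrix to a single TRUE/FALSE
keep <- apply(m, 1, all)
print(keep)

# subscripting by the logical vector keeps only the fully matching rows
result <- toy[keep, ]
print(result)
```

Only the first row of toy satisfies both criteria, so result contains that single row.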

Note

Lines <- "
Class  Sex   Age Survived
1   3rd Male Child       No
2   3rd Male Child       No
3   3rd Male Child       No
4   3rd Male Child       No
5   3rd Male Child       No
6   3rd Male Child       No"
dat <- read.table(text = Lines)

Update

Allow fewer arguments than columns and allow arguments to be named.

铜锣湾横着走 2025-01-29 05:33:05

If you are using a data.frame like the one shown in your question, you could use

library(dplyr)

# df is assumed to hold the passenger data shown in the question
my_filter <- function(sex, age, class, survived) {
  df %>% 
    filter(Sex == sex, Age == age, Class == class, Survived == survived)
}

Now my_filter("Female", "Child","3rd", "Yes") returns

   Class    Sex   Age Survived
7    3rd Female Child      Yes
8    3rd Female Child      Yes
9    3rd Female Child      Yes
10   3rd Female Child      Yes
11   3rd Female Child      Yes

浪菊怪哟 2025-01-29 05:33:05

# toy dataset
set.seed(1912)
titanic <- data.frame(class = sample(c("1st", "2nd", "3rd"), 100, replace = TRUE),
                      sex = sample(c("Male", "Female"), 100, replace = TRUE),
                      age = sample(c("Child", "Adult"), 100, replace = TRUE),
                      survival = sample(c("Yes", "No"), 100, replace = TRUE)
                      )

f1 <- function(sex, age, class, survival) {
  titanic[titanic$class == class & titanic$sex == sex & titanic$age == age & titanic$survival == survival, ]
}

f1("Female", "Child", "3rd", "Yes")

   class    sex   age survival
11   3rd Female Child      Yes
15   3rd Female Child      Yes
38   3rd Female Child      Yes
71   3rd Female Child      Yes
85   3rd Female Child      Yes
94   3rd Female Child      Yes
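
A possible variation on the function above, sketched under stated assumptions: it takes the data frame as an argument instead of reading the global titanic, treats every criterion as optional, and wraps the condition in which() so that rows with NA in a compared column are dropped rather than returned as all-NA rows. The name filter_passengers and the four-row sample data are hypothetical and only serve the illustration.

```r
# Toy data in the same shape as the example above (values made up for illustration)
titanic <- data.frame(
  class    = c("3rd", "3rd", "1st", "3rd"),
  sex      = c("Female", "Male", "Female", "Female"),
  age      = c("Child", "Child", "Adult", "Child"),
  survival = c("Yes", "No", "Yes", NA),
  stringsAsFactors = FALSE
)

# filter_passengers() is a hypothetical name; NULL means "do not filter on this column"
filter_passengers <- function(data, sex = NULL, age = NULL,
                              class = NULL, survival = NULL) {
  keep <- rep(TRUE, nrow(data))
  if (!is.null(sex))      keep <- keep & data$sex == sex
  if (!is.null(age))      keep <- keep & data$age == age
  if (!is.null(class))    keep <- keep & data$class == class
  if (!is.null(survival)) keep <- keep & data$survival == survival
  # which() keeps only positions that are TRUE, so NA comparisons are dropped
  data[which(keep), , drop = FALSE]
}

# all four criteria, in the same argument order as f1 above
all_four <- filter_passengers(titanic, "Female", "Child", "3rd", "Yes")
print(all_four)

# only some criteria, passed by name
girls <- filter_passengers(titanic, sex = "Female", age = "Child")
print(girls)
```

With this sample data, all_four contains only row 1 (row 4 has a missing survival value and is dropped), while girls contains rows 1 and 4 because survival is not tested.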
烟花易冷人易散 2025-01-29 05:33:05

Update:

Store your columns and conditions in a vector each, and then apply the function to the data frame:

library(dplyr)
library(stringr)

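# Assumed criteria vector (the answer does not show how f1 was defined);
# these are the values from the question's example call
f1 <- c("Female", "Child", "3rd", "Yes")
f1 <- paste(f1, collapse = "|")  # regex alternation: "Female|Child|3rd|Yes"
cols <- c("Sex", "Age", "Class", "Survived")

my_function <- function(df){
  df %>% 
    select(all_of(cols)) %>% 
    filter(if_all(everything(), ~ str_detect(., f1)))
}
my_function(df)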

First answer:

Maybe another strategy could be:

library(dplyr)
library(stringr)

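# Assumed criteria vector (the answer does not show how f1 was defined);
# these are the values from the question's example call
f1 <- c("Female", "Child", "3rd", "Yes")
f1 <- paste(f1, collapse = "|")  # regex alternation: "Female|Child|3rd|Yes"

my_function <- function(df){
  df %>% 
    select(Sex, Age, Class, Survived) %>% 
    filter(if_all(everything(), ~ str_detect(., f1)))
}

my_function(df)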

Output:

       Sex   Age Class Survived
1534 Female Child   3rd      Yes
1535 Female Child   3rd      Yes
1536 Female Child   3rd      Yes
1537 Female Child   3rd      Yes
1538 Female Child   3rd      Yes
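
Note that the pattern built with paste(..., collapse = "|") tests every selected column against any of the values, not each column against its own value. If a stricter, column-by-column match is wanted, the same "one vector of columns, one vector of conditions" idea can be sketched in base R with a named vector. The function name match_criteria and the three-row sample data are assumptions for illustration, not part of the answer.

```r
# Sample data in the shape of the question's data frame (values made up for illustration)
titanic <- data.frame(
  Class    = c("3rd", "3rd", "1st"),
  Sex      = c("Female", "Male", "Female"),
  Age      = c("Child", "Child", "Adult"),
  Survived = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Names are the columns to test, values are the required entries
criteria <- c(Sex = "Female", Age = "Child", Class = "3rd", Survived = "Yes")

# match_criteria() is a hypothetical helper name
match_criteria <- function(df, criteria) {
  stopifnot(all(names(criteria) %in% names(df)))
  # one logical vector per criterion, each comparing a single column with its own value
  tests <- Map(function(col, value) df[[col]] %in% value, names(criteria), criteria)
  # combine with element-wise AND; the all-TRUE seed keeps every row if criteria is empty
  keep <- Reduce(`&`, tests, rep(TRUE, nrow(df)))
  df[keep, , drop = FALSE]
}

matched <- match_criteria(titanic, criteria)
print(matched)
```

Because %in% never returns NA, rows with missing values in a tested column simply fail to match. Passing fewer named criteria, for example c(Sex = "Male"), filters on those columns only.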