How do I pass arguments to srvyr inside a function?
I'm using srvyr to calculate survey means of a variable (y) from a survey object, grouped by a categorical variable (x) from that same survey object. The basic code looks like this:
survey_means <- survey_object %>%
  filter( #remove NAs) %>%
  group_by(x) %>%
  summarise(Mean = survey_mean(y))
Suppose I instead want to put this block of code inside a function that accepts the survey object and two variables as parameters. This is a simplified version of what I'm actually trying to do (a function that will handle a group of up to four or so variables), but this is the base case:
SurveyMeanFunc <- function(survey_object, x, y) {
  survey_means <- survey_object %>%
    filter( #remove NAs ) %>%
    group_by(survey_object[["variables"]][[x]]) %>%
    summarise(Mean = survey_mean(survey_object[["variables"]][[y]]))
  return(survey_means)
}
When I try to use this function, I always get an error message along the lines of
! Assigned data `x` must be compatible with existing data.
x Existing data has n rows.
x Assigned data has m rows. (m > n)
i Only vectors of size 1 are recycled.
Even when I split up the pipes and verify that the number of rows in x is the same as in y right before calling summarise(), I still get this message. What is summarise() doing that I don't understand?
[EDIT] Full Context with suggested changes:
SurveyMeanMedFunc <- function(survey_obj, xvar, yvar, categ1= NULL, categ2= NULL) {
  if (is.null(categ1) & is.null(categ2)) {
    survey_estimate <- survey_obj %>%
      filter(!is.na({{ xvar }}), !is.na({{ yvar }})) %>%
      group_by({{ xvar }}) %>%
      summarise(Mean = survey_mean({{ yvar }}, vartype = "ci"))
  } else if (is.null(categ2)) {
    survey_estimate <- survey_obj %>%
      filter(!is.na({{ xvar }}), !is.na({{ yvar }})) %>%
      group_by({{ xvar }}, {{ categ1 }}) %>%
      summarise(Mean = survey_mean({{ yvar }}, vartype = "ci"))
  } else {
    NULL #fix
  }
  return(survey_estimate)
}
The remaining issue is that using quasiquotation to reference the survey variables works in the first branch of this if-else statement, but the function parameters are not recognised inside the following else if block, even though they are wrapped in {{ }} in exactly the same way.
Comments (1)
You don't give an example of how you want to use the function, but if I'm understanding correctly, you want to take your first block of code and run it with x replaced by the name of the variable passed in as the x argument and y replaced by the name of the variable passed in as the y argument (with the 'remove NAs' line either deleted or fixed to do something).
That is, you want
SurveyMeanFunc(my_design, species, height)
to be equivalent to
my_design %>%
  group_by(species) %>%
  summarise(Mean = survey_mean(height))
This is complicated because you don't want the value of x or the name x; you want the name species.
One way is quasiquotation, which used to require enquo and !! but can now be done more easily with the {{ }} operator.
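As a minimal sketch of the {{ }} approach described above: the `apistrat` example data from the survey package, the design specification, and the name `my_design` are stand-ins chosen for illustration, not the asker's data, and the NA filter is one assumed way of filling in the 'remove NAs' placeholder.

```r
library(survey)
library(srvyr)

# Example data shipped with the survey package (an assumption for
# illustration; substitute your own survey object).
data(api, package = "survey")
my_design <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

SurveyMeanFunc <- function(survey_object, x, y) {
  survey_object %>%
    # {{ }} forwards the unquoted column names supplied by the caller
    filter(!is.na({{ x }}), !is.na({{ y }})) %>%
    group_by({{ x }}) %>%
    summarise(Mean = survey_mean({{ y }}))
}

# Column names are passed unquoted, as in a direct srvyr call
result <- SurveyMeanFunc(my_design, stype, api00)
print(result)
```

The call passes bare column names, so inside the function `x` and `y` are never evaluated as ordinary R objects; they are only ever handed on to the srvyr verbs.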
Update
You still don't give an example of how you want to use the function, but I think this works.
The issue is that you can't evaluate categ1 or categ2 in the if condition if they are supplied by the user, because you're not evaluating them in the survey object, so R doesn't know where to look. This is a problem because of the way the tidyverse uses unquoted variable names; if you supplied them as model formulas (as you would in survey) or as quoted strings, you'd be fine.
The missing function asks whether an argument was supplied, which in this case is what you want. There's a more flexible is_missing/maybe_missing setup in the rlang package; you could look at that for another option. But this seems to work.
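A hedged sketch of the missing() approach, reusing the asker's function name and arguments. The `apistrat` example data and the `my_design` object are assumptions for illustration, and the final branch simply stops with an error because the original left that case unfinished.

```r
library(survey)
library(srvyr)

# Example data from the survey package (illustrative assumption)
data(api, package = "survey")
my_design <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

SurveyMeanMedFunc <- function(survey_obj, xvar, yvar, categ1, categ2) {
  # Drop rows with NA in the grouping or outcome variable
  filtered <- survey_obj %>%
    filter(!is.na({{ xvar }}), !is.na({{ yvar }}))

  # missing() checks whether the caller supplied the argument,
  # without trying to evaluate the unquoted column name
  if (missing(categ1) && missing(categ2)) {
    grouped <- filtered %>% group_by({{ xvar }})
  } else if (missing(categ2)) {
    grouped <- filtered %>% group_by({{ xvar }}, {{ categ1 }})
  } else {
    # Left unfinished in the original ("NULL #fix"); not handled here
    stop("This sketch does not handle categ2.")
  }

  grouped %>%
    summarise(Mean = survey_mean({{ yvar }}, vartype = "ci"))
}

res_one <- SurveyMeanMedFunc(my_design, stype, api00)
res_two <- SurveyMeanMedFunc(my_design, stype, api00, awards)
print(res_one)
print(res_two)
```

Because the optional arguments have no defaults here, nothing ever forces R to look up `awards` as an ordinary variable; it is only evaluated inside group_by(), where the survey object's columns are in scope.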