Call by reference in R (using functions to modify an object)

Posted 2025-01-11 01:29:53

I'm just getting my feet wet in R and was surprised to see that a function doesn't modify an object, at least it seems that's the default. For example, I wrote a function just to stick an asterisk on one label in a table; it works inside the function but the table itself is not changed. (I'm coming mainly from Ruby)

So, what is the normal, accepted way to use functions to change objects in R? How would I add an asterisk to the table title?

  • Replace the whole object: myTable = title.asterisk(myTable)

  • Use a work-around to call by reference (as described, for example, in Call by reference in R by TszKin Julian)?

  • Use some structure other than a function? An object method?

Comments (3)

调妓 2025-01-18 01:29:53

The reason you're having trouble is that you are passing the object into the function's local namespace (in R terms, its environment). This is one of the great / terrible things about R: it allows implicit variable declarations and then lets inner names supersede outer ones as the namespaces become deeper.

This is affecting you because a function creates a new namespace within the current namespace. The object 'myTable' was, I assume, originally created in the global namespace, but when it is passed into the function 'title.asterisk' a new function-local namespace now has an object with the same properties. This works like so:

title.asterisk <- function(myTable) {
    # do some stuff to 'myTable'
}

In this case, the function 'title.asterisk' does not make any changes to the global object 'myTable'. Instead, a local object is created with the same name, so the local object supersedes the global one. If we call title.asterisk(myTable) this way, the function changes only the local variable.
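
A minimal sketch of this behaviour. The data frame and the idea that the "title" is its first column name are assumptions made for illustration; the question does not say what kind of table is involved.

```r
# Hypothetical stand-in for the question's table: a two-column data frame.
myTable <- data.frame(count = 1:3, label = c("a", "b", "c"))

# Appends an asterisk to the first column name of its *local* copy.
title.asterisk <- function(myTable) {
    names(myTable)[1] <- paste0(names(myTable)[1], "*")
    names(myTable)   # value of the function: the names of the local copy
}

inside  <- title.asterisk(myTable)   # names as seen inside the function
outside <- names(myTable)            # names of the global object afterwards
```

Here 'inside' carries the asterisk while 'outside' does not, which is the effect described in the question.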

There are two direct ways to modify the global object (and many indirect ways).

Option 1: The first, as you mention, is to have the function return the object and overwrite the global object, like so:

title.asterisk <- function(myTable){
    # do some stuff to 'myTable'
    return(myTable)
}
myTable <- title.asterisk(myTable)

This is okay, but it still makes your code a little hard to follow, since there are really two different 'myTable' objects, one global and one local to the function. A lot of coders clear this up by adding a period '.' in front of argument names, like so:

title.asterisk <- function(.myTable){
    # do some stuff to '.myTable'
    return(.myTable)
}
myTable <- title.asterisk(myTable)

Okay, now we have a visual cue that the two variables are different. This is good, because we don't want to rely on invisible things like namespace supersedence when we're trying to debug our code later. It just makes things harder than they have to be.
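
A runnable sketch of Option 1, again assuming a hypothetical data frame whose first column name plays the role of the table title:

```r
# Hypothetical example data; the column names stand in for the table's labels.
myTable <- data.frame(count = 1:3, label = c("a", "b", "c"))

# Returns a modified copy; the caller decides whether to overwrite the original.
title.asterisk <- function(.myTable) {
    names(.myTable)[1] <- paste0(names(.myTable)[1], "*")
    .myTable
}

myTable <- title.asterisk(myTable)   # overwrite the global with the result
names(myTable)
```

The function itself has no side effects, so the only place the global changes is the visible assignment on the calling line.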

Option 2: You could just modify the object from within the function. This is the better option when you want to do destructive edits to an object and don't want memory inflation. If you are doing destructive edits, you don't need to save an original copy. Also, if your object is suitably large, you don't want to be copying it when you don't have to. To edit a global namespace object, don't pass it into the function or declare it there, and assign to it with the superassignment operator '<<-'.

title.asterisk <- function() {
    # do some stuff to 'myTable', assigning the result with '<<-'
}

Now we are making direct edits to the object 'myTable' from within the function. Because we aren't passing the object, the function looks to higher levels of namespace to resolve the variable name. Lo and behold, it finds a 'myTable' object higher up! With '<<-', the changes land on that object; a plain '<-' would instead create a new local 'myTable' and leave the global one untouched.
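
A minimal sketch of Option 2 on hypothetical data. Note that the assignment uses R's superassignment operator '<<-', since a plain '<-' inside the function would bind a new local 'myTable' rather than change the outer one:

```r
# Hypothetical example data, defined outside the function.
myTable <- data.frame(count = 1:3, label = c("a", "b", "c"))

# No argument: 'myTable' is resolved in the enclosing environment.
title.asterisk <- function() {
    # '<<-' assigns to the existing 'myTable' found in an enclosing environment.
    names(myTable)[1] <<- paste0(names(myTable)[1], "*")
    invisible(NULL)
}

title.asterisk()
names(myTable)
```

The trade-off is that the function now works only for a variable called 'myTable', which is one reason to prefer Option 1 for reusable code.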

A note to consider: I hate debugging. I mean I really hate debugging. This means a few things for me in R:

  • I wrap almost everything in a function. As I write my code, as soon as I get a piece working, I wrap it in a function and set it aside. I make heavy use of the '.' prefix for all my function arguments and use no prefix for anything that is native to the namespace it exists in.
  • I try not to modify global objects from within functions. I don't like where this leads. If an object needs to be modified, I modify it from within the function that declared it. This often means I have layers of functions calling functions, but it makes my work both modular and easy to understand.
  • I comment all of my code, explaining what each line or block is intended to do. It may seem a bit unrelated, but I find that these three things go together for me. Once you start wrapping code in functions, you will find yourself wanting to reuse more of your old code. That's where good commenting comes in. For me, it's a necessary piece.

情独悲 2025-01-18 01:29:53

The two paradigms are replacing the whole object, as you indicate, or writing 'replacement' functions such as

`updt<-` <- function(x, ..., value) {
    ## x is the object to be manipulated, value the object to be assigned
    x$lbl <- paste0(x$lbl, value)
    x
}

with

> d <- data.frame(x=1:3, lbl=letters[1:3])
> d
  x lbl
1 1   a
2 2   b
3 3   c
> updt(d) <- "*"
> d
  x lbl
1 1  a*
2 2  b*
3 3  c*

This is the behavior of, for instance, `$<-`: an in-place update of the element accessed by $. One could think of replacement functions as syntactic sugar for

updt1 <- function(x, ..., value) {
    x$lbl <- paste0(x$lbl, value)
    x
}
d <- updt1(d, value="*")

but the label 'syntactic sugar' doesn't really do justice, in my mind, to the central paradigm that is involved. It is enabling convenient in-place updates, which is different from the copy-on-change illusion that R usually maintains, and it is really the 'R' way of updating objects (rather than using ?ReferenceClasses, for instance, which have more of the feel of other languages but will surprise R users expecting copy-on-change semantics).
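
To make the "syntactic sugar" reading concrete, here is a small sketch that repeats the replacement function so it is self-contained, then compares the replacement-call syntax with an explicit call. The names 'd1' and 'd2' are introduced only for this illustration.

```r
# Replacement function from the answer, repeated so the sketch runs on its own.
`updt<-` <- function(x, ..., value) {
    x$lbl <- paste0(x$lbl, value)
    x
}

d1 <- data.frame(x = 1:3, lbl = letters[1:3])
d2 <- d1

updt(d1) <- "*"                    # replacement-call syntax
d2 <- `updt<-`(d2, value = "*")    # roughly what R evaluates for the line above

identical(d1, d2)
```

Both forms rebind the variable to the value returned by the function, so neither one mutates the caller's object from inside 'updt<-' itself.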

一紙繁鸢 2025-01-18 01:29:53

For anybody in the future looking for a simple way to get this solved (I do not know whether it is the most appropriate one):

Inside the function, create an object to temporarily hold the modified version of the one you want to change. Use deparse(substitute()) to get the name of the variable that was passed as the function argument, and then use assign() to overwrite your object. You will need envir = parent.frame() inside assign() so that the object is defined in the environment the function was called from.

(MyTable <- 1:10)

[1] 1 2 3 4 5 6 7 8 9 10

title.asterisk <- function(table) {
  tmp.table <- paste0(table, "*")
  name      <- deparse(substitute(table))
  assign(name, tmp.table, envir = parent.frame())
}

(title.asterisk(MyTable))

[1] "1*" "2*" "3*" "4*" "5*" "6*" "7*" "8*" "9*" "10*"

Wrapping an assignment in parentheses, which prints the assigned value, is a little more concise (and to me, better looking) than defining and then printing.
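
One caveat with the deparse(substitute()) approach: if the argument is an expression rather than a plain variable name, assign() creates a variable whose name is the deparsed expression. The variant below is only a sketch of one possible guard; the is.name() check and the error message are additions for illustration, not part of the answer above.

```r
# Variant of the function above that refuses arguments that are not plain names.
title.asterisk <- function(table) {
    expr <- substitute(table)   # the unevaluated argument expression
    if (!is.name(expr)) {
        stop("'table' must be passed as a plain variable name")
    }
    assign(deparse(expr), paste0(table, "*"), envir = parent.frame())
    invisible(NULL)
}

MyTable <- 1:3
title.asterisk(MyTable)   # overwrites MyTable in the calling environment
MyTable

# A non-name argument is rejected instead of creating an oddly named variable.
result <- tryCatch(title.asterisk(1:3), error = function(e) "rejected")
```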
