Difference between single brackets [ ] and double brackets [[ ]] when accessing the elements of a list or data frame

Posted on 2024-07-29 10:51:37

R provides two different methods for accessing the elements of a list or data.frame: [] and [[]].

What is the difference between the two, and when should I use one over the other?

不醒的梦 2024-08-05 10:51:37

The R Language Definition is handy for answering these types of questions:

R has three basic indexing operators, with syntax displayed by the following examples:

    x[i]
    x[i, j]
    x[[i]]
    x[[i, j]]
    x$a
    x$"a"

For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.

The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.
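
As a rough illustration of the points quoted above, the sketch below uses made-up objects (`v`, `lst` and `nested` are assumptions, not taken from the manual). It shows `[[` dropping names on an atomic vector, how character indices are matched, and a vector index being applied recursively to a list. One caveat: in current versions of R, `[[` matches names exactly unless `exact = FALSE` is passed, while `$` still partial-matches on lists.

```r
# Hypothetical objects, chosen only for illustration.
v <- c(a = 1, b = 2)
names(v["a"])    # "a": `[` keeps the name
names(v[["a"]])  # NULL: `[[` drops it

lst <- list(alpha = 1, beta = "x")
class(lst["alpha"])           # "list": a sub-list of length 1
class(lst[["alpha"]])         # "numeric": the element itself
lst[["al"]]                   # NULL: exact matching is the default for `[[`
lst[["al", exact = FALSE]]    # 1: partial matching when requested
lst$al                        # 1: `$` partial-matches on lists

nested <- list(a = list(b = list(c = 42)))
nested[[c("a", "b", "c")]]    # 42: same as nested[["a"]][["b"]][["c"]]
```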

清眉祭 2024-08-05 10:51:37

The significant differences between the two methods are the class of the objects they return when used for extraction, and whether they accept a range of values or just a single value during assignment.

Consider the case of data extraction on the following list:

foo <- list( str='R', vec=c(1,2,3), bool=TRUE )

Say we would like to extract the value stored by bool from foo and use it inside an if() statement. This will illustrate the differences between the return values of [] and [[]] when they are used for data extraction. The [] method returns objects of class list (or data.frame if foo was a data.frame) while the [[]] method returns objects whose class is determined by the type of their values.

So, using the [] method results in the following:

if( foo[ 'bool' ] ){ print("Hi!") }
Error in if (foo["bool"]) { : argument is not interpretable as logical

class( foo[ 'bool' ] )
[1] "list"

This is because the [] method returned a list, and a list is not a valid object to pass directly into an if() statement. In this case we need to use [[]] because it will return the "bare" object stored in 'bool', which will have the appropriate class:

if( foo[[ 'bool' ]] ){ print("Hi!") }
[1] "Hi!"

class( foo[[ 'bool' ]] )
[1] "logical"

The second difference is that the [] operator may be used to access a range of slots in a list or columns in a data frame, while the [[]] operator is limited to accessing a single slot or column. Consider the case of value assignment using a second list, bar:

bar <- list( mat=matrix(0,nrow=2,ncol=2), rand=rnorm(1) )

Say we want to overwrite the last two slots of foo with the data contained in bar. If we try to use the [[]] operator, this is what happens:

foo[[ 2:3 ]] <- bar
Error in foo[[2:3]] <- bar : 
more elements supplied than there are to replace

This is because [[]] is limited to accessing a single element. We need to use []:

foo[ 2:3 ] <- bar
print( foo )

$str
[1] "R"

$vec
     [,1] [,2]
[1,]    0    0
[2,]    0    0

$bool
[1] -0.6291121

Note that while the assignment was successful, the slots in foo kept their original names.
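
A minimal sketch of the two points above, under a couple of assumptions: the data frame `df` is a hypothetical example, and `rand` is fixed at 0.5 instead of `rnorm(1)` so the result is reproducible. It also shows one way to carry the new names across, which range assignment with `[` does not do by itself.

```r
# `[` on a data frame returns a data frame; `[[` returns the column itself.
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))  # hypothetical data
class(df["x"])    # "data.frame"
class(df[["x"]])  # "numeric"

# Range assignment with `[` replaces the values but keeps the old names.
foo <- list(str = "R", vec = c(1, 2, 3), bool = TRUE)
bar <- list(mat = matrix(0, nrow = 2, ncol = 2), rand = 0.5)
foo[2:3] <- bar
names(foo)        # "str" "vec" "bool"

# Copy the names across explicitly if the new names are wanted.
names(foo)[2:3] <- names(bar)
names(foo)        # "str" "mat" "rand"
```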

流绪微梦 2024-08-05 10:51:37

Double brackets access a list element, while a single bracket gives you back a list with a single element.

lst <- list('one','two','three')

a <- lst[1]
class(a)
## returns "list"

a <- lst[[1]]
class(a)
## returns "character"
难得心□动 2024-08-05 10:51:37

From Hadley Wickham:

[image: from Hadley Wickham]

My (crappy looking) modification to show using tidyverse / purrr:

[image]

层林尽染 2024-08-05 10:51:37

[] extracts a list, while [[]] extracts elements within the list.

alist <- list(c("a", "b", "c"), c(1,2,3,4), c(8e6, 5.2e9, -9.3e7))

str(alist[[1]])
 chr [1:3] "a" "b" "c"

str(alist[1])
List of 1
 $ : chr [1:3] "a" "b" "c"

str(alist[[1]][1])
 chr "a"
泼猴你往哪里跑 2024-08-05 10:51:37

Just adding here that [[ also is equipped for recursive indexing.

This was hinted at in the answer by @JijoMatthew but not explored.

As noted in ?"[[", syntax like x[[y]], where length(y) > 1, is interpreted as:

x[[ y[1] ]][[ y[2] ]][[ y[3] ]] ... [[ y[length(y)] ]]

Note that this doesn't change what should be your main takeaway on the difference between [ and [[ -- namely, that the former is used for subsetting, and the latter is used for extracting single list elements.

For example,

x <- list(list(list(1), 2), list(list(list(3), 4), 5), 6)
x
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [1] 1
#
# [[1]][[2]]
# [1] 2
#
# [[2]]
# [[2]][[1]]
# [[2]][[1]][[1]]
# [[2]][[1]][[1]][[1]]
# [1] 3
#
# [[2]][[1]][[2]]
# [1] 4
#
# [[2]][[2]]
# [1] 5
#
# [[3]]
# [1] 6

To get the value 3, we can do:

x[[c(2, 1, 1, 1)]]
# [1] 3

Getting back to @JijoMatthew's answer above, recall r:

r <- list(1:10, foo=1, far=2)

In particular, this explains the errors we tend to get when mis-using [[, namely:

r[[1:3]]

Error in r[[1:3]] : recursive indexing failed at level 2

Since this code actually tried to evaluate r[[1]][[2]][[3]], and the nesting of r stops at level one, the attempt to extract through recursive indexing failed at [[2]], i.e., at level 2.

r[[c("foo", "far")]]

Error in r[[c("foo", "far")]] : subscript out of bounds

Here, R was looking for r[["foo"]][["far"]], which doesn't exist, so we get the subscript out of bounds error.

It probably would be a bit more helpful/consistent if both of these errors gave the same message.
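
The recursive reading above can be checked with a small sketch. The helper `fails()` is a hypothetical name introduced here only to capture errors; the exact wording of the error messages may differ between R versions, so the sketch only records whether an error occurs.

```r
x <- list(list(list(1), 2), list(list(list(3), 4), 5), 6)

# A vector index inside [[ is shorthand for chained [[ calls.
identical(x[[c(2, 1, 1, 1)]], x[[2]][[1]][[1]][[1]])  # TRUE

r <- list(1:10, foo = 1, far = 2)

# Hypothetical helper: TRUE if evaluating the expression raises an error.
fails <- function(expr) {
  tryCatch({ expr; FALSE }, error = function(e) TRUE)
}

fails(r[[1:3]])               # TRUE: r[[1]][[2]] is not a list, so [[3]] cannot apply
fails(r[[c("foo", "far")]])   # TRUE: r[["foo"]] has no element named "far"
fails(r[["foo"]])             # FALSE: a single name extracts one element
```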

凉风有信 2024-08-05 10:51:37

In terms of terminology, the [[ operator extracts an element from a list, whereas the [ operator takes a subset of a list.

治碍 2024-08-05 10:51:37

Both of them are ways of subsetting.
The single bracket returns a subset of the list, which is itself a list; it may contain one element or several.
The double bracket, on the other hand, returns just a single element from the list.

A single bracket gives us a list. We can also use a single bracket when we want to return multiple elements from the list.
Consider the following list:

>r<-list(c(1:10),foo=1,far=2);

Now, please note the way the list is returned when I try to display it.
I type r and press enter.

>r

#the result is:-

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

$foo

[1] 1

$far

[1] 2

Now we will see the magic of single bracket:

>r[c(1,2,3)]

#the above command will return a list with all three elements of the actual list r as below

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

$foo

[1] 1


$far

[1] 2

This is exactly the same as what we saw when we displayed r on screen, which means the single bracket has returned a list: at index 1 we have a vector of 10 elements, followed by two more elements named foo and far.
We may also give a single index or element name as input to the single bracket.
For example:

> r[1]

[[1]]

 [1]  1  2  3  4  5  6  7  8  9 10

In this example, we gave one index, 1, and in return got a list with one element (which is a vector of 10 numbers).

> r[2]

$foo

[1] 1

In the above example, we gave one index, 2, and in return got a list with one element. The same works with an element name:

> r["foo"];

$foo

[1] 1

In this example, we passed the name of one element and in return a list was returned with one element.

You may also pass a vector of element names like:

> x<-c("foo","far")

> r[x];

$foo

[1] 1

$far
[1] 2

In this example, we passed a vector with two element names, "foo" and "far".

In return we got a list with two elements.

In short, a single bracket returns another list. With positive integer or name indices like those above, it has as many elements as the indices you pass in.

In contrast, a double bracket always returns exactly one element.
Before moving on to the double bracket, keep one note in mind.
NOTE: The major difference between the two is that a single bracket returns a list with as many elements as you wish, while a double bracket returns only a single element of the list, not a list wrapped around it.

I will cite a few examples. Please keep the note above in mind and come back to it after you are done with the examples below:

A double bracket returns the actual value at the index (not a list wrapped around it):

  > r[[1]]

     [1]  1  2  3  4  5  6  7  8  9 10


  >r[["foo"]]

    [1] 1

With double brackets, if we try to view more than one element by passing a vector, the result is an error, because [[ was not built for that; it returns a single element.

Consider the following:

> r[[c(1:3)]]
Error in r[[c(1:3)]] : recursive indexing failed at level 2
> r[[c(1,2,3)]]
Error in r[[c(1, 2, 3)]] : recursive indexing failed at level 2
> r[[c("foo","far")]]
Error in r[[c("foo", "far")]] : subscript out of bounds
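
A compact check of the rule of thumb above, with one refinement that is an addition here rather than part of the answer: `[[` returns the element as it is stored, so when the element is itself a list, the result is a list too. The object `nested` is a hypothetical example.

```r
r <- list(1:10, foo = 1, far = 2)

length(r[c(1, 3)])        # 2: one list element per index supplied
is.list(r["foo"])         # TRUE: still a list, of length 1
is.list(r[["foo"]])       # FALSE: the bare numeric value
sum(r[[1]])               # 55: the extracted vector can be used directly

# Hypothetical nested list: here the element itself is a list.
nested <- list(inner = list(1, 2))
is.list(nested[["inner"]])  # TRUE
length(nested[["inner"]])   # 2
```
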
〆一缕阳光ご 2024-08-05 10:51:37

To help newbies navigate through the manual fog, it might be helpful to see the [[ ... ]] notation as a collapsing function: in other words, it is what you use when you just want to 'get the data' from a named vector, list or data frame. This is good practice when you want to use data from these objects for calculations. These simple examples illustrate.

(x <- c(x=1, y=2)); x[1]; x[[1]]
(x <- list(x=1, y=2, z=3)); x[1]; x[[1]]
(x <- data.frame(x=1, y=2, z=3)); x[1]; x[[1]]

So from the third example:

> 2 * x[1]
  x
1 2
> 2 * x[[1]]
[1] 2
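
To make the "get the data" idea concrete, here is a small sketch with a made-up data frame (`scores` and its columns are assumptions). It shows why the collapsed form is usually what a calculation needs.

```r
scores <- data.frame(math = c(90, 80, 70), art = c(60, 75, 85))  # made-up data

is.numeric(scores["math"])    # FALSE: a one-column data frame
is.numeric(scores[["math"]])  # TRUE: the numeric vector itself

mean(scores[["math"]])        # 80
# mean(scores["math"]) would warn and return NA, since a data frame is not numeric.

# Named atomic vector: `[[` drops the name, `[` keeps it.
v <- c(x = 1, y = 2)
names(v[1])    # "x"
names(v[[1]])  # NULL
```
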
第几種人 2024-08-05 10:51:37

For yet another concrete use case, use double brackets when you want to select a data frame created by the split() function. In case you don't know, split() divides a vector or data frame into groups based on a key field and returns them as a list. It's useful when you want to operate on multiple groups, plot them, etc.

> class(data)
[1] "data.frame"

> dsplit<-split(data, data$id)
> class(dsplit)
[1] "list"

> class(dsplit['ID-1'])
[1] "list"

> class(dsplit[['ID-1']])
[1] "data.frame"
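
The transcript above does not show how `data` was built, so here is a self-contained sketch with made-up data (the column names `id` and `value` are assumptions). It reproduces the same classes and then operates on each group.

```r
# Made-up data; the column names id and value are assumptions.
data <- data.frame(id = c("ID-1", "ID-1", "ID-2"), value = c(10, 20, 30))

dsplit <- split(data, data$id)
names(dsplit)               # "ID-1" "ID-2"

class(dsplit["ID-1"])       # "list": a list holding one data frame
class(dsplit[["ID-1"]])     # "data.frame": the group itself
nrow(dsplit[["ID-1"]])      # 2

# Operate on every group, e.g. the mean of value per id.
group_means <- sapply(dsplit, function(d) mean(d$value))
group_means[["ID-1"]]       # 15
group_means[["ID-2"]]       # 30
```
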
甩你一脸翔 2024-08-05 10:51:37

Please refer to the detailed explanation below.

I have used the built-in data frame in R called mtcars.

> mtcars 
               mpg cyl disp  hp drat   wt ... 
Mazda RX4     21.0   6  160 110 3.90 2.62 ... 
Mazda RX4 Wag 21.0   6  160 110 3.90 2.88 ... 
Datsun 710    22.8   4  108  93 3.85 2.32 ... 
           ............

The top line of the table is called the header and contains the column names. Each horizontal line after it denotes a data row, which begins with the name of the row, followed by the actual data.
Each data member of a row is called a cell.

Single square bracket "[]" operator

To retrieve the data in a cell, we enter its row and column coordinates in the single square bracket "[]" operator. The two coordinates are separated by a comma: the row position comes first, then a comma, then the column position. The order is important.

Example 1: Here is the cell value from the first row, second column of mtcars.

> mtcars[1, 2] 
[1] 6

Example 2: We can also use the row and column names instead of the numeric coordinates.

> mtcars["Mazda RX4", "cyl"] 
[1] 6 

Double square bracket "[[]]" operator

We reference a data frame column with the double square bracket "[[]]" operator.

Example 1: To retrieve the ninth column vector of the built-in data set mtcars, we write mtcars[[9]].

mtcars[[9]]
[1] 1 1 1 0 0 0 0 0 0 0 0 ...

Example 2: We can retrieve the same column vector by its name.

mtcars[["am"]]
[1] 1 1 1 0 0 0 0 0 0 0 0 ...
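
For comparison, the sketch below puts the single- and double-bracket forms side by side on the same column of mtcars. It only uses base R; the `drop = FALSE` variant is included as an aside, since it is the usual way to keep a data frame when selecting one column with the two-index form.

```r
class(mtcars["am"])                  # "data.frame": one-column data frame
class(mtcars[["am"]])                # "numeric": the column vector
class(mtcars[, "am"])                # "numeric": [ , j] drops to a vector by default
class(mtcars[, "am", drop = FALSE])  # "data.frame": keep the data frame

identical(mtcars[["am"]], mtcars$am)        # TRUE
identical(mtcars[["am"]], mtcars[, "am"])   # TRUE
dim(mtcars[1:2, c("mpg", "cyl")])           # 2 2: [ can take several rows and columns
```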
