In R, what is the difference between df["x"] and df$x?

Posted on 2024-09-12 06:24:44

Where can I find information on the differences between calling on a column within a data.frame via:

df <- data.frame(x=1:20,y=letters[1:20],z=20:1)

df$x
df["x"]

They both return the "same" results, but not necessarily in the same format. Another thing that I've noticed is that df$x returns a list. Whereas df["x"] returns a data.frame.

EDIT: However, knowing which one to use in which situation has become a challenge. Is there a best practice here or does it really come down to knowing what the command or function requires? So far I've just been cycling through them if my function doesn't work at first (trial and error).

Comments (8)

不如归去 2024-09-19 06:24:44

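Another difference is that df$w returns NULL, whereas df['w'] gives an error ("undefined columns selected") with your example data frame. In current versions of R, df[['w']] returns NULL as well rather than raising an error.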
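To see how each accessor treats a column that does not exist, here is a small sketch using the data frame from the question. The name `msg` is only illustrative, and the exact error text may vary with your R version and locale.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# `$` on a missing column silently returns NULL
is.null(df$w)

# `[` with a missing name raises an error; capture its message to inspect it
msg <- tryCatch(df["w"], error = function(e) conditionMessage(e))
msg

# `[[` with a missing name returns NULL for a data frame
is.null(df[["w"]])
```

The silent NULL from `$` is worth keeping in mind, since a typo in a column name will not stop your script.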

破晓 2024-09-19 06:24:44

If I'm not mistaken, df$x is the same as df[['x']]. [[ selects a single element, whereas [ returns a list of the selected elements (for a data frame, that list is itself a data frame). See also the language reference. I usually see [[ used for lists, [ for arrays, and $ for getting a single column or element. If the column name comes from an expression (for example df[[name]] or df[,name]), use the [ or [[ notation. The [ notation is also used when selecting multiple columns, for example df[,c('name1', 'name2')]. I don't think there is a best practice for this.
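The points above can be checked directly. This is a minimal sketch on the data frame from the question; the variables `name` and `cols` are placeholders introduced for illustration.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# `$` and `[[` both extract the column itself (here an integer vector)
identical(df$x, df[["x"]])

# `[` keeps the data frame wrapper, even for a single column
class(df["x"])

# A column name held in a variable works with `[[` and `[`, but not with `$`
name <- "z"
head(df[[name]])

# Selecting several columns at once needs `[`
cols <- c("x", "z")
head(df[, cols])
```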

睫毛溺水了 2024-09-19 06:24:44

In addition to the indexing page in the manual, you can find this succinct description on the help page ?"$":

Indexing by ‘[’ is similar to atomic
vectors and selects a list of the
specified element(s).

Both ‘[[’ and ‘$’ select a single
element of the list. The main
difference is that ‘$’ does not allow
computed indices, whereas ‘[[’ does.
‘x$name’ is equivalent to ‘x[["name",
exact = FALSE]]’. Also, the partial
matching behavior of ‘[[’ can be
controlled using the ‘exact’ argument.

The function calls are, of course, different. See get("[.data.frame") versus get("[[.data.frame") versus get("$")
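The partial matching mentioned in the help text is easy to miss, so here is a short sketch of it. The data frame and its column names are made up for this example, and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version.

```r
df <- data.frame(value = 1:3, label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# `$` partially matches names: "val" uniquely abbreviates "value"
df$val

# `[[` matches exactly by default, so the same abbreviation finds nothing
df[["val"]]

# exact = FALSE switches partial matching on for `[[`
df[["val", exact = FALSE]]

# `[[` accepts a computed index; `$` would take the name literally
df[[paste0("la", "bel")]]
```

If partial matching worries you, `options(warnPartialMatchDollar = TRUE)` asks R to warn whenever `$` relies on it.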

英雄似剑 2024-09-19 06:24:44

In this instance, for most uses, I'd avoid subsetting altogether rather than try to remember what $, [ and [[ do with a data frame. I would just use with():

> df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)
> with(df, y)
 [1] a b c d e f g h i j k l m n o p q r s t
Levels: a b c d e f g h i j k l m n o p q r s t

That is a lot clearer than any of the sub-setting methods in most cases (IMHO).
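As a rough illustration of where with() pays off, consider an expression that involves more than one column. The variable `total` is just a name chosen for this sketch.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# Inside with(), column names behave like ordinary variables
total <- with(df, x + z)
head(total)

# The same computation written with `$`
identical(total, df$x + df$z)
```

Note that with() evaluates the column names literally, so a column name stored in a variable still calls for `[[`.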

夕嗳→ 2024-09-19 06:24:44

One thing I haven't seen explained explicitly is that [ and [[ can select based on the value of a variable or expression, while $ cannot; i.e., you can do:

> example_frame <- data.frame(Var1 = c(1,2), Var2 = c('a', 'b'))
> x <- 'Var1'

> example_frame$x
NULL  # Not what you wanted

> example_frame[x]
  Var1
1    1
2    2

> example_frame[[x]]
[1] 1 2

> example_frame[[ paste(c("V","a","r",2), collapse='') ]]
[1] a b
Levels: a b

The differences between [ and [[ have been well covered by other posts and other questions.

半枫 2024-09-19 06:24:44

If you use df[,"x"] instead of df["x"] you will get the same result as df$x. The comma means you are indexing matrix-style (all rows, column "x"), and a single column selected this way is simplified to a vector by default.
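The simplification to a vector is governed by the `drop` argument of `[` for data frames. A brief sketch on the question's data frame:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# Matrix-style indexing drops a single column to a vector by default
identical(df[, "x"], df$x)

# drop = FALSE keeps the result as a one-column data frame, like df["x"]
is.data.frame(df[, "x", drop = FALSE])

# With two or more columns the result stays a data frame either way
class(df[, c("x", "z")])
```

Passing `drop = FALSE` is a common safeguard in functions where the number of selected columns is not known in advance.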

大海や 2024-09-19 06:24:44

df$x and df[["x"]] do the same thing.

Let's assume that you have a data set named one. One of its variables is a factor variable, Region. Using one$Region will allow you to select that specific variable. Consider the following:

one <- read.csv("IED.csv")
one$Region

Running the following code also allows you to isolate that variable/level.

one[["Region"]]

Each command produces the following output:

> one$Region
    [1] RC SOUTH      RC SOUTH      RC SOUTH      RC EAST       RC EAST      
    [6] RC EAST       RC EAST       RC EAST       RC EAST       RC EAST      
   [11] RC SOUTH      RC SOUTH      RC EAST       RC EAST       RC EAST      
   [16] RC EAST       RC EAST       RC SOUTH      RC SOUTH      RC EAST      
   [21] RC SOUTH      RC EAST       RC CAPITAL    RC EAST       RC EAST 


> one[["Region"]]
    [1] RC SOUTH      RC SOUTH      RC SOUTH      RC EAST       RC EAST      
    [6] RC EAST       RC EAST       RC EAST       RC EAST       RC EAST      
   [11] RC SOUTH      RC SOUTH      RC EAST       RC EAST       RC EAST      
   [16] RC EAST       RC EAST       RC SOUTH      RC SOUTH      RC EAST      
   [21] RC SOUTH      RC EAST       RC CAPITAL    RC EAST       RC EAST 

"They both return the "same" results, but not necessarily in the same format." - I didn't notice any differences. Each command produced the same output in the same format. Perhaps it's your data.

Hope that helps.

EDIT:

Misread the original question. df["x"] produces the following:

> one["Region"]
             Region
1          RC SOUTH
2          RC SOUTH
3          RC SOUTH
4           RC EAST
5           RC EAST
6           RC EAST
7           RC EAST
8           RC EAST
9           RC EAST
10          RC EAST

Not sure why the difference occurs.

傲影 2024-09-19 06:24:44

df["x"] returns a dataframe with a single column named "x". This means the result is still a dataframe, not a vector. It preserves the data structure, which can be useful in certain situations.

df$x returns a vector containing the values of the column "x". This means the result is not a dataframe but a vector of values.
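Which form to use then comes down to what the receiving function expects. The following is only a sketch of how one might check, using the data frame from the question:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# Functions that expect a vector want df$x (or df[["x"]])
mean(df$x)

# Functions that expect a data frame want df["x"]
nrow(df["x"])
nrow(df$x)   # NULL, because a plain vector has no rows

# When unsure, inspect what you are about to pass along
str(df$x)
str(df["x"])
```

Checking with str() or class() first tends to be quicker than cycling through the accessors by trial and error.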
