In R, what is the difference between df["x"] and df$x?

Posted on 2024-09-12 06:24:44


Where can I find information on the differences between calling on a column within a data.frame via:

df <- data.frame(x=1:20,y=letters[1:20],z=20:1)

df$x
df["x"]

They both return the "same" results, but not necessarily in the same format. Another thing that I've noticed is that df$x returns a list. Whereas df["x"] returns a data.frame.

EDIT: However, knowing which one to use in which situation has become a challenge. Is there a best practice here or does it really come down to knowing what the command or function requires? So far I've just been cycling through them if my function doesn't work at first (trial and error).


不如归去 2024-09-19 06:24:44

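Another difference is that df$w returns NULL, whereas df['w'] raises an error ("undefined columns selected") with the example data frame. Note that in current versions of R, df[['w']] behaves like df$w here and returns NULL rather than raising an error.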
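A minimal sketch of the missing-column behaviour, using the data frame from the question. The column name w is simply one that does not exist, and tryCatch() is used only so that the script keeps running after the error from `[`:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# `$` and `[[` return NULL for a column that does not exist
dollar_result <- df$w
double_result <- df[["w"]]

# `[` raises an error instead; capture its message rather than stopping
single_result <- tryCatch(
  df["w"],
  error = function(e) conditionMessage(e)
)

print(is.null(dollar_result))
print(is.null(double_result))
print(single_result)
```

Because a silent NULL is easy to miss, checking "w" %in% names(df) before extracting a column is a reasonable habit in scripts.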


破晓 2024-09-19 06:24:44

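If I'm not mistaken, df$x is the same as df[['x']]. [[ is used to select any single element, whereas [ returns a list of the selected elements. See also the language reference. I usually see [[ used for lists, [ for arrays, and $ for getting a single column or element. If you need an expression (for example df[[name]] or df[,name]), then use the [ or [[ notation. The [ notation is also used when multiple columns are selected, for example df[,c('name1', 'name2')]. I don't think there is a best practice for this.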
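To make the difference in return types concrete, here is a small sketch using the data frame from the question. The variable names by_dollar, by_double, by_single and two_cols are only illustrative:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

by_dollar <- df$x        # the column itself: an integer vector
by_double <- df[["x"]]   # the same integer vector
by_single <- df["x"]     # a data.frame that contains one column

print(class(by_dollar))
print(class(by_single))
print(identical(by_dollar, by_double))

# Selecting several columns at once needs `[`
two_cols <- df[, c("x", "z")]
print(dim(two_cols))
```

Note that df$x gives back an atomic vector here, not a list; it is df["x"] that keeps the list-like data.frame wrapper.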


睫毛溺水了 2024-09-19 06:24:44

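In addition to the indexing page in the manual, you can find this succinct description on the help page for "$":

Indexing by '[' is similar to atomic
vectors and selects a list of the
specified element(s).
Both '[[' and '$' select a single
element of the list. The main
difference is that '$' does not allow
computed indices, whereas '[[' does.
'x$name' is equivalent to 'x[["name",
exact = FALSE]]'. Also, the partial
matching behavior of '[[' can be
controlled using the 'exact' argument.

Of course, the function calls are different. See get("[.data.frame") vs. get("[[.data.frame") vs. get("$").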
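The partial-matching point in that help text is easy to overlook, so here is a hedged sketch. The data frame and its column names value and label are made up for illustration; val is a deliberately abbreviated name:

```r
df <- data.frame(value = 1:3, label = c("a", "b", "c"))

partial_dollar <- df$val                      # `$` partially matches "value"
exact_double   <- df[["val"]]                 # `[[` matches exactly by default
loose_double   <- df[["val", exact = FALSE]]  # partial matching switched on

print(partial_dollar)
print(is.null(exact_double))
print(identical(partial_dollar, loose_double))
```

If silent partial matching worries you, setting options(warnPartialMatchDollar = TRUE) makes `$` warn when it happens.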


英雄似剑 2024-09-19 06:24:44


In this instance, for most uses, I'd avoid sub-setting altogether rather than trying to remember what $, [ and [[ do with a data frame. I would just use with():

> df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)
> with(df, y)
 [1] a b c d e f g h i j k l m n o p q r s t
Levels: a b c d e f g h i j k l m n o p q r s t

That is a lot clearer than any of the sub-setting methods in most cases (IMHO).
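As a small illustration of why with() can read more cleanly once an expression involves more than one column, here is a sketch using the data frame from the question. The name row_totals is only illustrative:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# Inside with(), column names can be used like ordinary variables
row_totals <- with(df, x + z)
print(row_totals)

# The same computation written with `$`
print(identical(row_totals, df$x + df$z))
```

One caveat: with() expects the column names to be written out literally, so it does not help when the column name is stored in a variable.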


夕嗳→ 2024-09-19 06:24:44


One thing I haven't seen explained explicitly is that [ and [[ can be used to select based on the value of a variable or expression, while $ cannot. I.e., you can do:

> example_frame <- data.frame(Var1 = c(1,2), Var2 = c('a', 'b'))
> x <- 'Var1'

> example_frame$x
NULL  # Not what you wanted

> example_frame[x]
  Var1
1    1
2    2

> example_frame[[x]]
[1] 1 2

> example_frame[[ paste(c("V","a","r",2), collapse='') ]]
[1] a b
Levels: a b

The differences between [ and [[ have been well covered by other posts and other questions.
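This matters most inside functions, where the column name usually arrives as an argument. A minimal sketch, assuming a hypothetical helper called column_range() (not part of base R):

```r
# column_range() is a hypothetical helper written for this example
column_range <- function(data, column_name) {
  values <- data[[column_name]]   # `[[` evaluates column_name
  if (is.null(values)) {
    stop("no column named '", column_name, "'")
  }
  range(values)
}

example_frame <- data.frame(Var1 = c(1, 2), Var2 = c("a", "b"))
print(column_range(example_frame, "Var1"))
```

Writing data$column_name inside the function would instead look for a column literally called column_name and return NULL.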


半枫 2024-09-19 06:24:44

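If you use df[,"x"] instead of df["x"], you will get the same result as df$x. The comma switches to matrix-style [rows, columns] indexing: the empty row position selects all rows, and because only one column is selected, the result is simplified to a vector (drop = TRUE is the default in that case).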
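A short sketch of the comma form on the data frame from the question, including the drop argument of `[` that controls whether a single column is simplified. The names as_vector and as_frame are only illustrative:

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

as_vector <- df[, "x"]                # simplified to a vector, like df$x
as_frame  <- df[, "x", drop = FALSE]  # kept as a one-column data.frame

print(identical(as_vector, df$x))
print(class(as_frame))
```

Passing drop = FALSE is useful in functions that must always return a data frame, whether one column or several are selected.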


大海や 2024-09-19 06:24:44


df$x and df[["x"]] do the same thing.

Let's assume that you have a data set named one. One of its variables is a factor variable, Region. Using one$Region will allow you to select that specific variable. Consider the following:

one <- read.csv("IED.csv")
one$Region

Running the following code also allows you to isolate that variable/level.

one[["Region"]]

Each code produces the following output:

> one$Region
    [1] RC SOUTH      RC SOUTH      RC SOUTH      RC EAST       RC EAST      
    [6] RC EAST       RC EAST       RC EAST       RC EAST       RC EAST      
   [11] RC SOUTH      RC SOUTH      RC EAST       RC EAST       RC EAST      
   [16] RC EAST       RC EAST       RC SOUTH      RC SOUTH      RC EAST      
   [21] RC SOUTH      RC EAST       RC CAPITAL    RC EAST       RC EAST 


> one[["Region"]]
    [1] RC SOUTH      RC SOUTH      RC SOUTH      RC EAST       RC EAST      
    [6] RC EAST       RC EAST       RC EAST       RC EAST       RC EAST      
   [11] RC SOUTH      RC SOUTH      RC EAST       RC EAST       RC EAST      
   [16] RC EAST       RC EAST       RC SOUTH      RC SOUTH      RC EAST      
   [21] RC SOUTH      RC EAST       RC CAPITAL    RC EAST       RC EAST

"They both return the 'same' results, but not necessarily in the same format." - I didn't notice any differences. Each command produced the same output in the same format. Perhaps it's your data.

Hope that helps.

EDIT:

Misread the original question. df["x"] produces the following:

> one["Region"]
             Region
1          RC SOUTH
2          RC SOUTH
3          RC SOUTH
4           RC EAST
5           RC EAST
6           RC EAST
7           RC EAST
8           RC EAST
9           RC EAST
10          RC EAST

Not sure why the difference occurs.
