In R, what is the difference between df["x"] and df$x?
Where can I find information on the differences between calling on a column within a data.frame via:
df <- data.frame(x=1:20,y=letters[1:20],z=20:1)
df$x
df["x"]
They both return the "same" results, but not necessarily in the same format. Another thing I've noticed is that df$x returns a list, whereas df["x"] returns a data.frame.
EDIT: However, knowing which one to use in which situation has become a challenge. Is there a best practice here or does it really come down to knowing what the command or function requires? So far I've just been cycling through them if my function doesn't work at first (trial and error).
Another difference is how a missing column is handled. df$w returns NULL, and so does df[['w']], whereas df['w'] gives an error ("undefined columns selected") with your example data frame.
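A minimal sketch of how each accessor treats a column name that does not exist, using the data frame from the question. The tryCatch wrapper is only there so the script keeps running, and the variable names are made up for illustration. This assumes a base R data.frame; tibbles and other classes may behave differently.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# $ returns NULL silently when the column does not exist
missing_dollar <- df$w

# [[ on a base data frame also returns NULL for an unknown name
missing_double <- df[["w"]]

# [ raises an error for an unknown name; capture it instead of stopping
bracket_result <- tryCatch(df["w"], error = function(e) "error")

print(is.null(missing_dollar))
print(is.null(missing_double))
print(bracket_result)
```

The silent NULL from $ is worth keeping in mind, since a misspelled column name will not stop a script on its own.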
If I'm not mistaken, df$x is the same as df[['x']]. [[ is used to select any single element, whereas [ returns a list of the selected elements. See also the language reference. I usually see [[ used for lists, [ for arrays, and $ for getting a single column or element. If the column name comes from an expression (for example df[[name]] or df[,name]), use the [ or [[ notation. The [ notation is also used when multiple columns are selected, for example df[,c('name1', 'name2')]. I don't think there is a best practice for this.
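To make the distinction concrete, here is a small sketch on the question's data frame. It only checks what kind of object each form returns; the variable names are illustrative.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# $ and [[ both extract the column itself (here an integer vector)
same_vector <- identical(df$x, df[["x"]])

# [ keeps the container: the result is still a data frame
one_col <- df["x"]
two_cols <- df[, c("x", "z")]

print(same_vector)
print(class(one_col))
print(names(two_cols))
```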
In addition to the indexing page in the manual, you can find a succinct description on the help page ?"$". The function calls are, of course, different. See get("[.data.frame") versus get("[[.data.frame") versus get("$").
In this instance, for most uses, I'd avoid subsetting altogether, along with trying to remember what $, [ and [[ do with a data frame. I would just use with(). That is a lot clearer than any of the subsetting methods in most cases (IMHO).
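A brief illustration of the with() approach on the question's data frame. This is a hypothetical example rather than the answerer's original code, and the result names are made up.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# Inside with(), the columns are visible as ordinary variables
mean_x <- with(df, mean(x))
sum_xz <- with(df, x + z)

print(mean_x)
print(head(sum_xz))
```

Because the expression is evaluated with the data frame's columns in scope, no subsetting operator appears at all.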
One thing I haven't seen explained explicitly is that [ and [[ can be used to select based on the value of a variable or expression, while $ cannot. That is, the column name passed to [ or [[ can be stored in a variable or computed. The differences between [ and [[ have been well covered by other posts and other questions.
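A short sketch of selecting by the value of a variable. The variable col_name is hypothetical and simply holds the name of a column from the question's data frame.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

col_name <- "x"  # hypothetical variable holding the column name

by_double <- df[[col_name]]  # the column as a vector, same as df$x
by_single <- df[col_name]    # a one-column data frame
by_dollar <- df$col_name     # looks for a column literally named "col_name"

print(identical(by_double, df$x))
print(class(by_single))
print(is.null(by_dollar))
```

This is why [[ tends to be the safer choice inside functions and loops, where the column name usually arrives as an argument.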
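If you use df[,"x"] instead of df["x"] you will get the same result as df$x. The comma separates the row index from the column index, so "x" here selects a column by name, and a single selected column is returned as a vector by default.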
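A minimal sketch of the comma form, assuming a base R data.frame. The drop argument of [ controls whether a single column is simplified to a vector.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

as_vector <- df[, "x"]                # single column is dropped to a vector
as_frame <- df[, "x", drop = FALSE]   # keep the data frame structure

print(identical(as_vector, df$x))
print(class(as_frame))
```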
df$x and df[["x"]] do the same thing. Let's assume that you have a data set named one, and that one of its variables is a factor variable, Region. Using one$Region will allow you to select that specific variable, and the [[ form also lets you isolate that variable and its levels. Each produces the same output.
"They both return the "same" results, but not necessarily in the same format." - I didn't notice any differences. Each command produced the same output in the same format. Perhaps it's your data.
Hope that helps.
EDIT:
Misread the original question. df["x"] does produce a different result. Not sure why the difference occurs.
df["x"] returns a dataframe with a single column named "x". This means the result is still a dataframe, not a vector. It preserves the data structure, which can be useful in certain situations.
df$x returns a vector containing the values of the column "x". This means the result is not a dataframe but a vector of values.
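On the question of which form to use, one way to see why the return type matters is to pass each result to a function that expects a vector. The sketch below uses mean() and colMeans() from base R; the result names are illustrative.

```r
df <- data.frame(x = 1:20, y = letters[1:20], z = 20:1)

# mean() expects a numeric vector, so the $ form works directly
vec_mean <- mean(df$x)

# Given a data frame, mean() warns and returns NA
frame_mean <- suppressWarnings(mean(df["x"]))

# colMeans() is written for data frames and matrices
col_means <- colMeans(df["x"])

print(vec_mean)
print(frame_mean)
print(col_means)
```

So it does largely come down to what the receiving function requires: a vector ($ or [[) or a data frame ([).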