Which 1-2 letter object names conflict with existing R objects?
To make my code more readable, I like to avoid names of objects that already exist when creating new objects. Because of the package-based nature of R, and because functions are first-class objects, it can be easy to overwrite common functions that are not in base R (since a common package might use a short function name but without knowing what package to load there is no way to check for it). Objects such as the built-in logicals T and F also cause trouble.
Some examples that come to mind are:
One letter
- c
- t
- T/F
- J
Two letters
- df
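As a small illustration of the trouble mentioned above (a sketch only, not part of the original question): T and F are ordinary variables defined in base, so they can be reassigned, whereas a non-function object named c does not stop calls to c() from working.

```r
# T and F are ordinary bindings in base, not reserved words like TRUE/FALSE,
# so a user-level assignment masks them.
T <- 0
masked_value <- isTRUE(T)    # T now refers to the numeric 0
rm(T)                        # remove the mask; base's T is visible again
restored_value <- isTRUE(T)

# A non-function object named `c` does not break calls to c():
# when evaluating a call, R skips non-function bindings of that name.
c <- 5
still_works <- c(1, 2)
rm(c)

print(masked_value)
print(restored_value)
print(still_works)
```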
A better solution might be to avoid using short names altogether in favor of more descriptive ones, and I generally try to do that as a matter of habit. Yet "df" for a function which manipulates a generic data.frame is plenty descriptive and a longer name adds little, so short names have their uses. In addition, for SO questions where the larger context isn't necessarily known, coming up with descriptive names is well-nigh impossible.
What other one- and two-letter variable names conflict with existing R objects? Which among those are sufficiently common that they should be avoided? If they are not in base, please list the package as well. The best answers will involve at least some code; please provide it if used.
Note that I am not asking whether overwriting functions that already exist is advisable. That question is already addressed on SO:
In R, what exactly is the problem with having variables with the same name as base R functions?
For visualizations of some answers here, see this question on CV:
https://stats.stackexchange.com/questions/13999/visualizing-2-letter-combinations
Comments (2)
apropos is ideal for this:
With no packages loaded, this returns:
The exact contents will depend upon the search list. Try loading a few packages and re-running it if you care about conflicts with packages that you commonly use.
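The call itself is not preserved in this copy of the answer. A minimal sketch of what such an apropos call could look like, assuming a regular expression that matches whole names of one or two letters (the pattern and the variable names are illustrative, not the original code):

```r
# Assumed pattern: names consisting of exactly one or two letters.
short_name_pattern <- "^[[:alpha:]]{1,2}$"

# apropos() searches every environment on the current search list,
# so the result depends on which packages are attached.
short_names <- apropos(short_name_pattern)
print(short_names)
```

The result changes as packages are attached, which is why re-running it after loading the packages you commonly use is worthwhile.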
I loaded all the (>200) packages installed on my machine with this:
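The loading code is missing from this copy. One way it might be done, as an assumption rather than the original code, is to attach each package by name and record which ones succeeded; load_packages is a hypothetical helper name:

```r
# Hypothetical helper: attach each named package, returning TRUE/FALSE per package.
load_packages <- function(pkgs) {
  vapply(
    pkgs,
    function(pkg) {
      suppressWarnings(suppressPackageStartupMessages(
        require(pkg, character.only = TRUE, quietly = TRUE)
      ))
    },
    logical(1)
  )
}

# Small demonstration with packages that ship with R.
loaded <- load_packages(c("stats", "utils"))
print(loaded)

# To attach everything installed (can be slow and may produce many messages):
# loaded <- load_packages(rownames(installed.packages()))
```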
And reran the call to apropos, wrapping it in unique, since there were a few duplicates. This returned:
You can see where they came from with
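The code that followed is not preserved here. A hedged sketch of one way to do the lookup uses find() from utils, which reports where on the search list a name is defined; the pattern and variable names below are assumptions:

```r
# Deduplicated one- and two-letter names visible on the search list.
short_names <- unique(apropos("^[[:alpha:]]{1,2}$"))

# For each name, list the search-path entries that define it.
origins <- setNames(lapply(short_names, find), short_names)

# Inspect a couple of entries.
print(origins[c("c", "df")])
```

A name that appears in more than one attached package gets several entries, which makes masking visible.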
Been thinking about this more. Here's a list of one-letter object names in base R:
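The listing is missing from this copy. A minimal sketch of how such a list might be produced, assuming it was built by filtering the names in package:base by length (not necessarily the code originally posted):

```r
# All non-hidden object names exported by base.
base_names <- ls("package:base")

# Every one-character name, operators such as "+" included.
one_char <- base_names[nchar(base_names) == 1]

# Only the single letters, which are the ones likely to be used as variable names.
one_letter <- grep("^[[:alpha:]]$", base_names, value = TRUE)

print(one_char)
print(one_letter)
```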
And one- and two-letter object names in base R:
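This listing is also missing. A sketch that generalises the length filter to any attached package, which additionally lets names outside base be checked; short_names_in is a hypothetical helper, not something from the original answer:

```r
# Hypothetical helper: names of at most max_chars characters in an attached package.
short_names_in <- function(env_name, max_chars = 2) {
  all_names <- ls(env_name)
  all_names[nchar(all_names) <= max_chars]
}

base_short <- short_names_in("package:base")
stats_short <- short_names_in("package:stats")

print(base_short)
print(stats_short)
```

Running the helper over each entry returned by search() would cover every attached package, though it still says nothing about which names matter most.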
That's a much bigger list than I initially suspected, although I would never think of naming a variable "if", so to a certain degree it makes sense.
Still doesn't capture object names not in base, or give any sense of which functions are best avoided. I think a better answer would either use expert opinion to figure out which functions are important (e.g. using c is probably worse than using qf) or use a data mining approach on a bunch of R code to see what short-named functions get used the most.
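The data mining idea could be prototyped roughly as follows. This is only a sketch under assumptions: it counts function-call tokens in R source text using utils::getParseData, and count_short_calls and the sample code are made up for illustration.

```r
# Hypothetical helper: tally calls to functions whose names have at most max_chars characters.
count_short_calls <- function(code, max_chars = 2) {
  # keep.source = TRUE is needed so that parse data is recorded.
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)

  # The parser tags the name in a call position as SYMBOL_FUNCTION_CALL.
  calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]
  short_calls <- calls[nchar(calls) <= max_chars]

  sort(table(short_calls), decreasing = TRUE)
}

# Made-up sample standing in for a larger corpus of R code.
sample_code <- c(
  "x <- c(1, 2, 3)",
  "m <- t(matrix(x))",
  "y <- c(x, qf(0.5, 1, 2))"
)

call_counts <- count_short_calls(sample_code)
print(call_counts)
```

Applied to a large collection of scripts, the resulting counts would give a rough ranking of which short names are most costly to mask.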