What is your preferred style for naming variables in R?

Posted on 2024-08-15 11:46:47

Comments (9)

泪冰清 2024-08-22 11:46:47

Good previous answers so just a little to add here:

  • underscores are really annoying for ESS users; given that ESS is pretty widely used, you won't see many underscores in code authored by ESS users (and that set includes a bunch of R Core as well as CRAN authors, exceptions like Hadley notwithstanding);

  • dots are evil too because they can get mixed up in simple method dispatch; I believe I once read comments to this effect on one of the R lists: dots are a historical artifact and no longer encouraged;

  • so we have a clear winner still standing in the last round: camelCase. I am also not sure if I really agree with the assertion of 'lacking precedent in the R community'.

And yes: pragmatism and consistency trump dogma. So whatever works and is used by colleagues and co-authors. After all, we still have white-space and braces to argue about :)
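
The dispatch concern in the second bullet can be illustrated with a small S3 sketch. The names describe, describe.default, and describe.frame are hypothetical and invented here for illustration; they do not come from the answer.

```r
# Hypothetical S3 generic; every name in this block is an illustrative assumption.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method"

# Intended as an ordinary helper, but the dotted name has the shape
# generic.class, so S3 treats it as the "describe" method for class "frame".
describe.frame <- function(x, ...) "helper for frames"

obj <- structure(list(), class = "frame")
result <- describe(obj)
print(result)  # the helper is picked up by dispatch
```

Nothing in the name distinguishes a dotted helper from a method, which is why a dot separator can surprise readers once classes are involved.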

那小子欠揍 2024-08-22 11:46:47

I did a survey of which naming conventions are actually used on CRAN, and it got accepted to the R Journal :) Here is a graph summarizing the results:

[Figure: graph summarizing the survey results; image not available]

It turns out (perhaps unsurprisingly) that lowerCamelCase was most often used for function names and period.separated names were most often used for parameters. UpperCamelCase, as advocated by Google's R style guide, is really rare, however, and it is a bit strange that they advocate that naming convention.

The full paper is here:

http://journal.r-project.org/archive/2012-2/RJournal_2012-2_Baaaath.pdf
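
As a rough sketch of how names could be sorted into these conventions, the function below classifies an identifier with regular expressions. The function name classify_name and the exact patterns are assumptions made for illustration; this is not the method used in the paper.

```r
# Hypothetical classifier; the patterns are a simplification, not the paper's rules.
classify_name <- function(name) {
  if (grepl("^[a-z0-9]+$", name)) {
    "alllowercase"
  } else if (grepl("^[a-z][a-zA-Z0-9]*$", name)) {
    "lowerCamelCase"
  } else if (grepl("^[A-Z][a-zA-Z0-9]*$", name)) {
    "UpperCamelCase"
  } else if (grepl("^[a-zA-Z0-9]+(\\.[a-zA-Z0-9]+)+$", name)) {
    "period.separated"
  } else if (grepl("^[a-zA-Z0-9]+(_[a-zA-Z0-9]+)+$", name)) {
    "underscore_separated"
  } else {
    "other"
  }
}

example_names <- c("print", "readTable", "ReadTable", "read.table", "read_table")
styles <- vapply(example_names, classify_name, character(1))
print(table(styles))
```

Applying something like this to the exported names of installed packages would give a crude local tally, though single-word names are inherently ambiguous between conventions.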

々眼睛长脚气 2024-08-22 11:46:47

Underscores all the way! Contrary to popular opinion, there are a number of functions in base R that use underscores. Run grep("^[^\\.]*$", apropos("_"), value = TRUE) to see them all.

I use the official Hadley style of coding ;)
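
To show what that filter does without depending on which packages are attached, here is a sketch that applies the same two steps to a fixed vector. The vector candidates is an assumption chosen for illustration; apropos("_") would supply the real names.

```r
# Illustrative input; in the original command these names come from apropos("_").
candidates <- c("seq_len", "seq_along", "Sys.time", "as.data.frame",
                "R_system_version", "is.na_example")

# Step 1: keep names containing an underscore (what apropos("_") does).
with_underscore <- grep("_", candidates, value = TRUE)

# Step 2: keep only names with no dot anywhere in them.
underscore_only <- grep("^[^.]*$", with_underscore, value = TRUE)
print(underscore_only)
```

The second step drops names that mix the two separators, so only purely underscore-separated names remain.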

×纯※雪 2024-08-22 11:46:47

I like camelCase when the camel actually provides something meaningful -- like the datatype.

dfProfitLoss, where df = dataframe

or

vdfMergedFiles(), where the function takes in a vector and spits out a dataframe

While I think _ really adds to the readability, there just seem to be too many issues with using ., -, _ or other characters in names, especially if you work across several languages.
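
One way to check how R itself treats these separators is make.names(), which rewrites a string into a syntactically valid name. This is a small sketch; the example names are made up for illustration.

```r
# Made-up candidate names using the three separators discussed above.
candidates <- c("profit_loss", "profit.loss", "profit-loss")

# make.names() leaves valid names alone and replaces invalid characters with ".".
syntactic <- make.names(candidates)

# A name is already valid in R when make.names() does not change it.
is_valid <- candidates == syntactic
print(data.frame(candidates, syntactic, is_valid))
```

The hyphenated form is rejected because R would parse profit-loss as a subtraction, while other languages have their own rules for dots and underscores.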

触ぅ动初心 2024-08-22 11:46:47

As I point out here:

How does the verbosity of identifiers affect the performance of a programmer?

it's worth bearing in mind how understandable your variable names are to your co-workers/users if they are non-native speakers...

For that reason I'd say underscores and periods are better than capitalisation, but as you point out consistency is essential within your script.

梦冥 2024-08-22 11:46:47

This comes down to personal preference, but I follow the Google style guide because it's consistent with the style of the core team. I have yet to see an underscore in a variable in base R.

顾挽 2024-08-22 11:46:47

As others have mentioned, underscores will screw up a lot of folks. No, it's not verboten but it isn't particularly common either.

Using dots as a separator gets a little hairy with S3 classes and the like.

In my experience, it seems like a lot of the high muckity mucks of R prefer the use of camelCase, with some dot usage and a smattering of underscores.

故人如初 2024-08-22 11:46:47

I have a preference for mixedCapitals.

But I often use periods to indicate what the variable type is:

mixedCapitals.mat is a matrix.
mixedCapitals.lm is a linear model.
mixedCapitals.lst is a list object.

and so on.

林空鹿饮溪 2024-08-22 11:46:47

Usually I rename my variables using a mix of underscores and mixed capitalization (camelCase). Simple variables are named using underscores, for example:

PSOE_votes -> number of votes for the PSOE (a political group in Spain).

PSOE_states -> Categorical, indicates the states where PSOE wins {Aragon, Andalucia, ...}

PSOE_political_force -> Categorical, indicates the position of PSOE among the political groups {first, second, third}

PSOE_07 -> Union of PSOE_votes + PSOE_states + PSOE_political_force for 2007 (header -> votes, states, position)

If my variable is the result of applying a function to one or two variables, I use mixed capitalization.

Example:

positionXstates <- xtabs(~states+position, PSOE_07)
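
For a runnable version of that line, here is a sketch with a toy PSOE_07 data frame. The rows below are invented purely for illustration and are not real election results.

```r
# Toy data invented for illustration only; not real election results.
PSOE_07 <- data.frame(
  states   = c("Aragon", "Andalucia", "Aragon", "Andalucia"),
  position = c("first", "second", "first", "first"),
  stringsAsFactors = TRUE
)

# The derived table gets a mixed-capitalization name built from its two inputs.
positionXstates <- xtabs(~ states + position, data = PSOE_07)
print(positionXstates)
```

The name positionXstates records which two variables were crossed, which is the point of the convention described above.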
