How to add leading zeros?
I have a set of data which looks something like this:

    anim <- c(25499,25500,25501,25502,25503,25504)
    sex <- c(1,2,2,1,2,1)
    wt <- c(0.8,1.2,1.0,2.0,1.8,1.4)
    data <- data.frame(anim,sex,wt)

    data
       anim sex  wt anim2
    1 25499   1 0.8     2
    2 25500   2 1.2     2
    3 25501   2 1.0     2
    4 25502   1 2.0     2
    5 25503   2 1.8     2
    6 25504   1 1.4     2

I would like a zero to be added before each animal id:

    data
        anim sex  wt anim2
    1 025499   1 0.8     2
    2 025500   2 1.2     2
    3 025501   2 1.0     2
    4 025502   1 2.0     2
    5 025503   2 1.8     2
    6 025504   1 1.4     2

And for interest's sake, what if I need to add two or three zeros before the animal ids?
The short version: use `formatC` or `sprintf`.
The longer version:
There are several functions available for formatting numbers, including adding leading zeroes. Which one is best depends upon what other formatting you want to do.
The example from the question is quite easy since all the values have the same number of digits to begin with, so let's try a harder example of making powers of 10 width 8 too.
`paste` (and its variant `paste0`) are often the first string manipulation functions that you come across. They aren't really designed for manipulating numbers, but they can be used for that. In the simple case where we always have to prepend a single zero, `paste0` is the best solution.
For the case where there are a variable number of digits in the numbers, you have to manually calculate how many zeroes to prepend, which is horrible enough that you should only do it out of morbid curiosity.
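Both cases can be sketched as follows. The vectors `anim` and `x` are assumed sample data (the ids from the question and the powers of 10 mentioned earlier), and the manual calculation shown is only one possible way to do it.

```r
# Assumed sample data: the ids from the question and powers of 10.
anim <- 25499:25504
x <- 10^(0:5)

# Simple case: every value gets exactly one extra zero.
one_zero <- paste0("0", anim)

# Variable number of digits: render the digits, then build the zeros by hand.
digits <- format(x, scientific = FALSE, trim = TRUE)
padded <- paste0(strrep("0", 8 - nchar(digits)), digits)

print(one_zero)
print(padded)
```

Note that `strrep()` fails on a negative count, so this sketch assumes no value is wider than the target width.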
`str_pad` from `stringr` works similarly to `paste`, making it more explicit that you want to pad things.
Again, it isn't really designed for use with numbers, so the harder case requires a little thinking about. We ought to just be able to say "pad with zeroes to width 8", but values that R renders in scientific notation can end up padded in that form. You need to set the scientific penalty option (`scipen`) so that numbers are always formatted using fixed notation (rather than scientific notation).
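A minimal sketch of the effect, using base R's `format()` to show how `scipen` changes the rendering. The value 999 is an arbitrary large penalty chosen for illustration, and the `str_pad` call only runs if `stringr` happens to be installed.

```r
x <- 10^(0:5)

default_form <- format(1e5)      # scientific notation under default options
old <- options(scipen = 999)     # heavily penalise scientific notation
fixed_form <- format(1e5)        # now fixed notation
options(old)                     # restore the previous setting

# Converting explicitly sidesteps the global option altogether.
digits <- format(x, scientific = FALSE, trim = TRUE)
if (requireNamespace("stringr", quietly = TRUE)) {
  print(stringr::str_pad(digits, width = 8, pad = "0"))
}
```

Passing already-formatted strings to the padding function keeps the result independent of whatever `scipen` is set to in the session.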
`stri_pad` in `stringi` works exactly like `str_pad` from `stringr`.
`formatC` is an interface to the C function `printf`. Using it requires some knowledge of the arcana of that underlying function (see the `printf` documentation). In this case, the important points are the `width` argument, `format` being `"d"` for "integer", and a `"0"` `flag` for prepending zeroes.
This is my favourite solution, since it is easy to tinker with changing the width, and the function is powerful enough to make other formatting changes.
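The three arguments named above fit together roughly like this; `anim` and `x` are assumed sample data, not taken from the answer.

```r
# Assumed sample data: the ids from the question and powers of 10.
anim <- 25499:25504
x <- 10^(0:5)

# width = total number of characters, format = "d" for integers,
# flag = "0" to fill the unused width with zeros.
ids6 <- formatC(anim, width = 6, format = "d", flag = "0")
pow8 <- formatC(x, width = 8, format = "d", flag = "0")

print(ids6)
print(pow8)
```

Changing `width` is all it takes to add two or three zeros instead of one.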
`sprintf` is an interface to the C function of the same name; like `formatC` but with a different syntax.
The main advantage of `sprintf` is that you can embed formatted numbers inside longer bits of text.
See also goodside's answer.
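A short sketch of both uses, again with assumed sample data; the label text is made up for illustration.

```r
# Assumed sample data: the ids from the question and powers of 10.
anim <- 25499:25504
x <- 10^(0:5)

ids6 <- sprintf("%06d", anim)
pow8 <- sprintf("%08d", x)   # whole-number doubles are accepted by %d

# The format string can carry surrounding text as well.
label <- sprintf("Animal ID: %06d", anim[1])

print(ids6)
print(pow8)
print(label)
```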
For completeness it is worth mentioning the other formatting functions that are occasionally useful, but have no method of prepending zeroes.
`format`, a generic function for formatting any kind of object, with a method for numbers. It works a little bit like `formatC`, but with yet another interface.
`prettyNum` is yet another formatting function, mostly for creating manual axis tick labels. It works particularly well for wide ranges of numbers.
The `scales` package has several functions such as `percent`, `date_format` and `dollar` for specialist format types.
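To illustrate what the base functions do instead of zero padding, here is a small sketch; the input values are arbitrary.

```r
# format() can widen a number, but it fills with spaces rather than zeros.
spaces <- format(42L, width = 6)

# prettyNum() is aimed at readability, e.g. thousands separators.
grouped <- prettyNum(123456, big.mark = ",")

print(spaces)
print(grouped)
```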
For a general solution that works regardless of how many digits are in `data$anim`, use the `sprintf` function. A format such as `%06d` pads an integer with leading zeros up to a total width of 6. In your case, you probably want:

    data$anim <- sprintf("%06d", data$anim)
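As a sketch of the follow-up question about two or three zeros, the data frame from the question can be rebuilt and padded to different widths; the variable names `one_zero`, `two_zeros` and `three_zeros` are only for illustration.

```r
anim <- c(25499, 25500, 25501, 25502, 25503, 25504)
sex <- c(1, 2, 2, 1, 2, 1)
wt <- c(0.8, 1.2, 1.0, 2.0, 1.8, 1.4)
data <- data.frame(anim, sex, wt)

# The number after the 0 is the total width, so widen it to add more zeros.
one_zero <- sprintf("%06d", data$anim)
two_zeros <- sprintf("%07d", data$anim)
three_zeros <- sprintf("%08d", data$anim)

data$anim <- one_zero   # the column is now character, not numeric
print(data)
```

Because the width is a total, ids with fewer digits simply receive more zeros.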
Expanding on @goodside's response:
In some cases you may want to pad a string with zeros (e.g. FIPS codes or other numeric-like factors). On OSX/Linux, `sprintf()` can do this directly. But because `sprintf()` calls the OS's C `sprintf()` function, on Windows 7 you get a different result, so Windows machines need a workaround.
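One portable workaround can be sketched like this, assuming the codes contain only digits so that they survive a round trip through an integer; `fips` is made-up sample data and this is not necessarily the exact code the answer used.

```r
fips <- c("104", "1001", "56045")   # made-up sample codes stored as strings

# Convert to integer first, so the well-defined %d zero padding applies.
padded <- sprintf("%05d", as.integer(fips))
print(padded)
```

This avoids zero-padding a `%s` conversion, whose behaviour depends on the platform's C library.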
`str_pad` from the `stringr` package is an alternative.
Here's a generalizable base R function:
I like `sprintf` but it comes with caveats like:
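As a stand-in for the missing snippets, here is a minimal sketch of a base R padding helper. The name `pad_left` and its arguments are hypothetical, and the caveat shown at the end is just one that is easy to reproduce, not necessarily the one the answer had in mind.

```r
# Hypothetical helper: left-pad numbers or strings to a fixed width.
pad_left <- function(x, width, pad = "0") {
  # Render numbers in fixed notation; treat other input as plain strings.
  s <- if (is.numeric(x)) {
    format(x, scientific = FALSE, trim = TRUE)
  } else {
    as.character(x)
  }
  n_pad <- pmax(width - nchar(s), 0)   # never pad by a negative amount
  paste0(strrep(pad, n_pad), s)
}

nums <- pad_left(c(7, 25499, 1e5), 8)
strs <- pad_left("104", 5)
print(nums)
print(strs)

# One sprintf caveat: %d rejects numbers that are not whole.
caveat <- tryCatch(sprintf("%05d", 1.5), error = function(e) conditionMessage(e))
print(caveat)
```

The helper is meant for whole numbers and strings; fractional input would be padded including its decimal part.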
Here is another alternative for adding leading 0s to strings such as CUSIPs, which can sometimes look like numbers and which many applications such as Excel will corrupt by removing the leading 0s or converting them to scientific notation.
When I tried the answer provided by @metasequoia, the vector returned had leading spaces and not `0`s. This was the same problem mentioned by @user1816679, and removing the quotes around the `0` or changing from `%d` to `%s` did not make a difference either. FYI, I am using RStudio Server running on an Ubuntu Server. This little two-step solution worked for me:

    gsub(pattern = " ", replacement = "0", x = sprintf(fmt = "%09s", ids[,CUSIP]))

Using the `%>%` pipe function from the `magrittr` package it could look like this:

    sprintf(fmt = "%09s", ids[,CUSIP]) %>% gsub(pattern = " ", replacement = "0", x = .)

I'd prefer a one-function solution, but it works.
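A self-contained version of the two-step idea, for readers without the `ids` table: `cusip` is made-up sample data, and the sketch uses `%9s` (plain space padding) instead of `%09s`, since the effect of the `0` flag on strings varies by platform.

```r
cusip <- c("12345", "037833100")   # made-up sample values

# Step 1: right-align to width 9 with spaces. Step 2: turn spaces into zeros.
padded <- gsub(pattern = " ", replacement = "0", x = sprintf("%9s", cusip))
print(padded)
```

Note that `gsub()` replaces every space, so this only suits identifiers that contain no spaces of their own.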
For other circumstances in which you want the number string to be consistent, I made a function.
Someone may find this useful:
Sorry about the formatting.