Splitting strings with different fixed-width formats
Newbie question! I have a column of strings in two different fixed-width formats. The format can be recognized from the first three characters of each string (its name), and the string should be split according to that format.
df <- data.frame(
var1 = c('M1B123456789MM1158','M1C123456789zMM1183'),
var2 = c('code1','code8'))
The fixed-width formats are:
formatM1B = c(3,9,2,4)
formatM1C = c(3,9,1,2,4)
So I am hoping for this result:
  |format|var1_2   |var1_3|var1_5|var1_6|code |
 1|M1B   |123456789|      |MM    |1158  |code1|
 2|M1C   |123456789|z     |MM    |1183  |code8|
I tried the functions separate, str_split and str_split_fixed, but I don't know how to combine them with some sort of IF function to "test" or "regex" the format named in the string.
This question has certainly been asked many times; I spent hours searching without finding anything I could adapt to my data :/
Comments (3)
If we define the widths with a zero where "z" is missing, then we can use the dedicated read.fwf function:
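A minimal sketch of that idea, assuming the format is always given by the first three characters and that a width of 0 comes back from read.fwf as an empty (NA) field. The list `formats`, the helper `parse_one`, and the output column names are hypothetical names chosen for illustration.

```r
df <- data.frame(
  var1 = c("M1B123456789MM1158", "M1C123456789zMM1183"),
  var2 = c("code1", "code8"),
  stringsAsFactors = FALSE
)

# Both formats get five widths; the 0 stands for the field that M1B lacks.
formats <- list(
  M1B = c(3, 9, 0, 2, 4),
  M1C = c(3, 9, 1, 2, 4)
)

# Parse one string with the widths selected by its three-character prefix.
parse_one <- function(x) {
  con <- textConnection(x)
  on.exit(close(con))
  read.fwf(con, widths = formats[[substr(x, 1, 3)]],
           colClasses = "character")
}

# One single-row data frame per string, stacked into one table.
parts <- do.call(rbind, lapply(df$var1, parse_one))
names(parts) <- c("format", "var1_2", "var1_3", "var1_5", "var1_6")
result <- cbind(parts, code = df$var2)
print(result)
```

Reading every field as character keeps "123456789" and "1158" from being converted to numbers; the missing field in the M1B row shows up as NA rather than an empty string.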
The regular expression has 5 groups:
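One way such a pattern could look, as a sketch: three characters for the format name, nine for the second field, an optional single character, then two and four characters. It assumes each string is either 18 or 19 characters long, so the anchors alone decide whether the optional group is empty. The pattern, the column names, and the choice of utils::strcapture are assumptions made for illustration.

```r
df <- data.frame(
  var1 = c("M1B123456789MM1158", "M1C123456789zMM1183"),
  var2 = c("code1", "code8"),
  stringsAsFactors = FALSE
)

# Five capture groups; the third is optional and is empty for the M1B layout.
pattern <- "^(.{3})(.{9})(.?)(.{2})(.{4})$"

# Prototype describing the name and type of each captured column.
proto <- data.frame(
  format = character(), var1_2 = character(), var1_3 = character(),
  var1_5 = character(), var1_6 = character(),
  stringsAsFactors = FALSE
)

# strcapture() returns one row per input string, one column per group.
parts <- strcapture(pattern, df$var1, proto)
result <- cbind(parts, code = df$var2)
print(result)
```

Strings that do not match the pattern at all come back as a row of NA values, which makes malformed input easy to spot.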
Output:
Here is a function that does the splitting based on your formatM1B/C vectors, and we can apply it as follows:
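A minimal sketch of such a function in base R, assuming only the two formats from the question occur and that the M1B row should get an empty string where M1C has its extra field. The names `split_fixed` and `split_by_format` are hypothetical.

```r
df <- data.frame(
  var1 = c("M1B123456789MM1158", "M1C123456789zMM1183"),
  var2 = c("code1", "code8"),
  stringsAsFactors = FALSE
)

formatM1B <- c(3, 9, 2, 4)
formatM1C <- c(3, 9, 1, 2, 4)

# Cut a single string into consecutive pieces of the given widths.
split_fixed <- function(x, widths) {
  ends <- cumsum(widths)
  starts <- ends - widths + 1
  substring(x, starts, ends)
}

# Choose the widths from the prefix and always return five fields.
split_by_format <- function(x) {
  if (startsWith(x, "M1B")) {
    p <- split_fixed(x, formatM1B)
    c(p[1:2], "", p[3:4])        # pad the field that M1B lacks
  } else if (startsWith(x, "M1C")) {
    split_fixed(x, formatM1C)
  } else {
    rep(NA_character_, 5)        # unknown format
  }
}

# vapply gives a 5 x n matrix; transpose it to one row per string.
parts <- t(vapply(df$var1, split_by_format, character(5), USE.NAMES = FALSE))
colnames(parts) <- c("format", "var1_2", "var1_3", "var1_5", "var1_6")
result <- data.frame(parts, code = df$var2, stringsAsFactors = FALSE)
print(result)
```

Keeping the width vectors exactly as given in the question means a third format only needs one more branch and its own padding rule.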