Merge multiple columns with x repeated attributes into x columns
I have a data frame arranged as shown below. The columns are grouped by month (enero, febrero, marzo, etc.), and every row corresponds to a value that I need to extract from the time series. Each Month/Caudal pair varies in length depending on the number of days in the month.
Also, as in the original dataset, each Month/Caudal pair is separated from the next by an empty column of NAs.
           enero Caudal  X        febrero Caudal.1 X.1          marzo Caudal.2 X.2
1 1/1/2003 00:15      - NA 1/2/2003 00:15        -  NA 1/3/2003 00:15     1.68  NA
2 1/1/2003 00:30      - NA 1/2/2003 00:30        -  NA 1/3/2003 00:30     1.69  NA
3 1/1/2003 00:45      - NA 1/2/2003 00:45        -  NA 1/3/2003 00:45     1.68  NA
4 1/1/2003 01:00      - NA 1/2/2003 01:00        -  NA 1/3/2003 01:00     1.68  NA
5 1/1/2003 01:15      - NA 1/2/2003 01:15        -  NA 1/3/2003 01:15     1.68  NA
6 1/1/2003 01:30      - NA 1/2/2003 01:30        -  NA 1/3/2003 01:30     1.68  NA
My desired result is a time series with only two columns: Date and Caudal.
             Date Caudal
1  1/1/2003 00:15      -
2  1/1/2003 00:30      -
3  1/1/2003 00:45      -
4  1/1/2003 01:00      -
5  1/1/2003 01:15      -
6  1/1/2003 01:30      -
7  1/2/2003 00:15      -
8  1/2/2003 00:30      -
9  1/2/2003 00:45      -
10 1/2/2003 01:00      -
11 1/2/2003 01:15      -
12 1/2/2003 01:30      -
13 1/3/2003 00:15   1.68
14 1/3/2003 00:30   1.69
15 1/3/2003 00:45   1.68
16 1/3/2003 01:00   1.68
17 1/3/2003 01:15   1.68
18 1/3/2003 01:30   1.68
I need to do this for 40 .txt files with exactly the same format. How can I reshape each file this way and concatenate all of them into one continuous data frame?
Sample data:
structure(list(enero = c("1/1/2003 00:15", "1/1/2003 00:30",
"1/1/2003 00:45", "1/1/2003 01:00", "1/1/2003 01:15", "1/1/2003 01:30"
), Caudal = c(" - ", " - ", " - ", " - ", " - ", " - "
), X = c(NA, NA, NA, NA, NA, NA), febrero = c("1/2/2003 00:15",
"1/2/2003 00:30", "1/2/2003 00:45", "1/2/2003 01:00", "1/2/2003 01:15",
"1/2/2003 01:30"), Caudal.1 = c(" - ", " - ", " - ", " - ",
" - ", " - "), X.1 = c(NA, NA, NA, NA, NA, NA), marzo = c("1/3/2003 00:15",
"1/3/2003 00:30", "1/3/2003 00:45", "1/3/2003 01:00", "1/3/2003 01:15",
"1/3/2003 01:30"), Caudal.2 = c(" 1.68 ", " 1.69 ", " 1.68 ",
" 1.68 ", " 1.68 ", " 1.68 "), X.2 = c(NA, NA, NA, NA, NA, NA
)), row.names = c(NA, 6L), class = "data.frame")
Comments (1)
We can first remove the empty columns; then it is easiest to rename the sets of columns (i.e., Date and Caudal) so that each pair shares a numbered suffix. Then we can pivot into long form using `_` as the names separator.
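The code that accompanied this answer did not survive on this page, so the following is only a minimal sketch of the steps described above, not the answerer's original code. It assumes the dplyr and tidyr packages, that every non-empty column belongs to a Date/Caudal pair in that order, and that the files are tab-delimited with a header row. The names `reshape_caudal`, `read_caudal_files`, and the `"data"` directory are hypothetical and chosen only for illustration.

```r
library(dplyr)
library(tidyr)

# Small stand-in for the sample data in the question (two rows per month).
df <- data.frame(
  enero    = c("1/1/2003 00:15", "1/1/2003 00:30"),
  Caudal   = c(" - ", " - "),
  X        = NA,
  febrero  = c("1/2/2003 00:15", "1/2/2003 00:30"),
  Caudal.1 = c(" - ", " - "),
  X.1      = NA,
  marzo    = c("1/3/2003 00:15", "1/3/2003 00:30"),
  Caudal.2 = c(" 1.68 ", " 1.69 "),
  X.2      = NA
)

reshape_caudal <- function(df) {
  # 1. Drop the separator columns (entirely NA or blank).
  is_empty <- vapply(
    df,
    function(col) all(is.na(col) | trimws(as.character(col)) == ""),
    logical(1)
  )
  df <- df[, !is_empty, drop = FALSE]
  # Assumption: what remains is an even number of columns, Date then Caudal.
  stopifnot(ncol(df) %% 2 == 0)

  # 2. Rename the pairs to Date_1, Caudal_1, Date_2, Caudal_2, ...
  names(df) <- paste0(c("Date", "Caudal"), "_",
                      rep(seq_len(ncol(df) / 2), each = 2))

  df %>%
    # Make every column character so months read as numeric and months
    # containing "-" can be stacked; keep the row index for ordering.
    mutate(across(everything(), as.character), row = row_number()) %>%
    # 3. Pivot to long form, splitting the names at "_".
    pivot_longer(-row, names_to = c(".value", "month"), names_sep = "_") %>%
    # Shorter months leave padding rows without a date; drop them.
    filter(!is.na(Date), trimws(Date) != "") %>%
    arrange(as.integer(month), row) %>%
    mutate(Caudal = trimws(Caudal)) %>%
    select(Date, Caudal)
}

# Apply the same reshaping to many files and stack the results.
# Extra arguments are passed on to read.delim() (assumed file format).
read_caudal_files <- function(paths, ...) {
  bind_rows(lapply(paths, function(p) reshape_caudal(read.delim(p, ...))))
}

result <- reshape_caudal(df)

# Hypothetical usage for the 40 files:
# files  <- list.files("data", pattern = "\\.txt$", full.names = TRUE)
# all_df <- read_caudal_files(files)
```

`Caudal` is left as character here because the data use `-` for missing readings; if numeric values are needed, replacing `-` with `NA` before calling `as.numeric()` is one option. If the files use a different separator, the corresponding `read.delim()` arguments (for example `sep`) would need adjusting.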