I have records like these in a file:
1867 121 2 56
1868 121 1 6
1868 121 2 65
1868 122 0 53
1869 121 0 41
1869 121 1 41
1871 121 1 13
1871 121 2 194
I would like to get this output:
1867 121 2 56
1868 121 1 6
1868 121 2 65
1868 122 0 53
1869 121 0 41
1869 121 1 41
1870 121 0 0
1871 121 1 13
1871 121 2 194
The difference is the 1870 121 0 0 row.
So, if the difference between consecutive numbers in the first column is greater than 1, a line has to be inserted for each missing number (1870 in the case above), together with the other columns. The second column of the inserted line should be the minimum of the values that occur in that column (in the example these are 121 and 122), and likewise for the third column. The last column should always be zero.
Can anybody suggest something? Thanks in advance!
I am trying to solve it with awk, but maybe there are nicer or more practical solutions for this...
Comments (4)
Something like this could work:
Explanation:
BEGIN{getline;a=$1;b=$2;c=$3}
In this BEGIN block we read the first line and assign the value in column 1 to variable a, column 2 to variable b, and column 3 to variable c.
NR==FNR{if (b>$2) b=$2; if (c>$3) c=$3;next}
Here we scan through the entire file (NR==FNR holds only while the first file argument is being read, so the file has to be given twice on the command line) and keep track of the lowest values in column 2 and column 3, storing them in variables b and c respectively. We use next to avoid running the second pattern{action} statement during this pass.
{if ($1-a>1) {x=($1-a); for (i=1;i<x;i++) {print (a+1)"\t"b,c,"0";a++};a=$1} else a=$1;print}
This action compares the value in column 1 with a. If the difference is more than 1, a for loop prints all the missing lines, and a is then set to $1. If the difference is not greater than 1, we simply assign the value of column 1 to a. In both cases the current line is then printed.
Test:
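The three pieces explained above can be put together roughly as follows. This is a minimal sketch rather than the answer's original command: the function name fill_gaps and the file name records.txt are assumptions made for illustration, the inner loop is written in a slightly simpler form, and the filler rows are separated by spaces instead of a tab.

```shell
# fill_gaps is a hypothetical wrapper name; it passes the same file to awk twice.
fill_gaps() {
    awk '
    # Seed a, b and c from the first line (+0 forces numeric values).
    BEGIN { getline; a = $1 + 0; b = $2 + 0; c = $3 + 0 }

    # Pass 1 (first copy of the file): track the minima of columns 2 and 3.
    NR == FNR {
        if ($2 + 0 < b) b = $2 + 0
        if ($3 + 0 < c) c = $3 + 0
        next
    }

    # Pass 2 (second copy): print one filler row per missing value of column 1,
    # then the current line unchanged.
    {
        for (i = a + 1; i < $1; i++) print i, b, c, 0
        a = $1 + 0
        print
    }' "$1" "$1"
}

# Sample input from the question, written to an assumed file name.
printf '%s\n' \
    '1867 121 2 56' '1868 121 1 6' '1868 121 2 65' '1868 122 0 53' \
    '1869 121 0 41' '1869 121 1 41' '1871 121 1 13' '1871 121 2 194' \
    > records.txt

fill_gaps records.txt
```

Reading the file twice keeps memory use constant, at the cost of requiring a regular file rather than a pipe.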
Perl solution. It should work for large files too, as it does not load the whole file into memory but reads the file twice.
A Bash solution:
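The code of this answer is not preserved here. As a rough illustration of how a shell-only approach might look, here is a minimal sketch that uses only POSIX shell constructs, so it also runs in Bash. It is not the original answer; the function name fill_gaps_sh is hypothetical, and the input is assumed to be sorted on the first column and to end with a newline.

```shell
# fill_gaps_sh is a hypothetical name; the body runs in a subshell so that
# its variables do not leak into the caller.
fill_gaps_sh() (
    file=$1
    min2= min3= prev=

    # Pass 1: find the smallest value in columns 2 and 3.
    while read -r year col2 col3 rest; do
        if [ -z "$min2" ] || [ "$col2" -lt "$min2" ]; then min2=$col2; fi
        if [ -z "$min3" ] || [ "$col3" -lt "$min3" ]; then min3=$col3; fi
    done < "$file"

    # Pass 2: copy each line, first printing a filler row for every value
    # of column 1 skipped since the previous line.
    while read -r year col2 col3 rest; do
        if [ -n "$prev" ]; then
            y=$((prev + 1))
            while [ "$y" -lt "$year" ]; do
                echo "$y $min2 $min3 0"
                y=$((y + 1))
            done
        fi
        echo "$year $col2 $col3 $rest"
        prev=$year
    done < "$file"
)
```

Called as fill_gaps_sh records.txt, where records.txt is an assumed file holding the records from the question. A shell loop like this is noticeably slower than awk on large files, so it is mainly of interest when awk is not an option.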
One way using awk:
Running the script:
Result: