Deleting a selection of columns in AWK
I'd like to delete a selection of columns from a list of CSV files. The awk call is in-line as it is used in a shell script. I don't know beforehand how many columns the files have, only that the columns that I want gone are included in each file of the list.
Let's say I want the first 4 columns removed. Blanking out the column values will leave the separators, which I also want gone.
I thought the following would work: build an array of the column numbers to drop, then rebuild each row without those columns.
The value of length(row) below is as expected, but the final loop still iterates over the original column count, not the actual length(row) value.
head $f | awk 'BEGIN{FS=",";split("1,2,3,4",dropers,",")}{split($0,row,FS);for(i in dropers) delete row[i]; print NF "," length(row) "<<<";out=""; print NF "," length(row) ">>>";for(i=1;i<=length(row);i++){print row[i] "lulu"; out = out "," row[i]}; sub(/[ \t]*$/,"",out);print out}' > $g
Here's the output for 2 files: 6 columns going in, 2 left when I've deleted columns 1 through 4, yet the loop iterates over the full 6 cols rather than the expected 2. Thank you for any advice.
Aust.
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000009lulu
461474lulu
,,,,,0000009,461474
6,2<<<
6,2>>>
lulu
lulu
lulu
lulu
0000010lulu
94942lulu
,,,,,0000010,94942
Edit (Belisarius)
Formatted code follows:
BEGIN {
    FS = ","
    split("1,2,3,4", dropers, ",")
}
{
    split($0, row, FS)
    for (i in dropers) delete row[i]
    print NF "," length(row) "<<<"
    out = ""
    print NF "," length(row) ">>>"
    for (i = 1; i <= length(row); i++) {
        print row[i] "lulu"
        out = out "," row[i]
    }
    sub(/[ \t]*$/, "", out)
    print out
}
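A likely explanation for the extra iterations: in awk, merely referencing an array element such as row[i] creates it if it does not exist. The loop condition re-evaluates length(row) on every pass, so each reference to a deleted index puts an empty element back, and the bound grows from 2 to 6 again. That matches the four empty "lulu" lines in the output. The sketch below sidesteps the intermediate array: it loops over the fields 1..NF and skips the unwanted ones with the `in` operator, which tests membership without creating an element. The function name drop_cols and the sample row are made up for illustration, and it assumes plain CSV with no quoted fields that contain commas.

```shell
#!/bin/sh
# drop_cols (hypothetical helper): remove the comma-separated columns listed
# in $1 (e.g. "1,2,3,4") from CSV text read on stdin.
# Assumption: simple CSV, no quoted fields containing commas.
drop_cols() {
    awk -v cols="$1" '
    BEGIN {
        FS = ","
        # Build a lookup keyed by column number: drop[1], drop[2], ...
        n = split(cols, list, ",")
        for (k = 1; k <= n; k++) drop[list[k]] = 1
    }
    {
        out = ""
        sep = ""
        for (i = 1; i <= NF; i++) {
            # "i in drop" tests membership without creating drop[i]
            if (i in drop) continue
            out = out sep $i
            sep = ","
        }
        print out
    }'
}

# Sample row shaped like the data in the question
printf 'a,b,c,d,0000009,461474\n' | drop_cols "1,2,3,4"
```

With that sample row this prints 0000009,461474, with no leading separators. Keying the lookup by the column values (rather than iterating `for (i in dropers)`, which walks the array indices and only works in the question because indices and values happen to coincide) also lets the list be non-contiguous, for example drop_cols "2,5".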