Assigning the output of a command to a shell variable and getting the variable's size
I have a file consisting of digits. Usually, each line contains a single number. I would like to count the number of lines in the file that begin with the digit '0'. If there are any, I would like to do some post-processing.
Although I am able to retrieve the corresponding line numbers correctly, the total number of retrieved lines is not correct. Below is the code that I'm using.
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
# linesToRemove=$(grep -n "^0" ${inputFile} | cut -d":" -f1);
linesNr=${#linesToRemove} # <- here, the error
# linesNr=${#linesToRemove[@]} # <- here, the error
if [ "${linesNr}" -gt "0" ]; then
    # do something here, e.g. remove corresponding lines.
    awk -v n=$linesToRemove 'NR == n {next} {print}' ${anotherFile} > ${outputFile}
fi
Also, for the awk-based command, how can I use a shell variable? I tried the command below, but it does not work correctly, since 'myIndex' is interpreted as literal text and not as a variable.
linesToRemove=$(awk -v myIndex="$myIndex" '/^myIndex/ { print NR;}' ${inputFile});
Given the line numbers of the lines starting with 0 found in ${inputFile}, I would like to remove the corresponding lines from ${anotherFile}. An example for both ${inputFile} and ${anotherFile} is given below:
// ${inputFile}
0
1
3
0
// ${anotherFile}
2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02
5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02
1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02
2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02
// ${outputFile}
5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02
1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02
In the example above, I need to delete lines 0 and 3 (counting from 0) from ${anotherFile}, given that those lines correspond to the lines starting with 0 in ${inputFile}.
Given the large number of edits to this question, it seems easiest to start a new answer. Your problem can be solved with a simple one-liner:
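The one-liner itself did not survive in this copy of the answer. As a stand-in, here is one possible pipeline, not necessarily the author's: it assumes each line of the flag file is a single token without spaces, and the temporary file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample files mirroring the example in the question.
workdir=$(mktemp -d)
inputFile="$workdir/input.txt"
anotherFile="$workdir/another.txt"
outputFile="$workdir/output.txt"

printf '%s\n' 0 1 3 0 > "$inputFile"
printf '%s\n' \
  '2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02' \
  '5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02' \
  '1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02' \
  '2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02' > "$anotherFile"

# Glue the flag onto the front of each data line, drop the rows whose
# flag starts with 0, then cut the flag column off again.
paste -d' ' "$inputFile" "$anotherFile" | awk '$1 !~ /^0/' | cut -d' ' -f2- > "$outputFile"

cat "$outputFile"
```

This avoids collecting line numbers at all, at the cost of relying on both files having the same number of lines.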
If you want to count the number of lines in the file that begin with 0, then this line is wrong:
linesToRemove=$(awk '/^0/ { print NR; }' ${inputFile});
The above prints the line number whenever a line starts with 0, so your linesToRemove variable will contain all the matching line numbers, not the total number of lines. Use an END{} block to capture the total, e.g.
linesToRemove=$(awk '/^0/ {c++} END {print c+0}' "${inputFile}");
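To make the difference concrete, here is a small sketch using the sample input from the question (the temporary file and the variable names lineNumbers, charCount and matchCount are only for illustration). It shows that ${#var} on a plain string gives the number of characters, while the END block gives the number of matching lines.

```shell
#!/usr/bin/env bash
# Hypothetical sample input: lines 1 and 4 start with 0.
inputFile=$(mktemp)
printf '%s\n' 0 1 3 0 > "$inputFile"

# Holds the two line numbers separated by a newline.
lineNumbers=$(awk '/^0/ { print NR }' "$inputFile")

# ${#var} is the string length: '1', a newline and '4'.
charCount=${#lineNumbers}

# Counting inside awk; c+0 forces a numeric 0 when nothing matches.
matchCount=$(awk '/^0/ { c++ } END { print c+0 }' "$inputFile")

echo "characters: $charCount, matching lines: $matchCount"
rm -f "$inputFile"
```

Here charCount is 3 and matchCount is 2, which is why the original test on ${#linesToRemove} gives a misleading number.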
As for your second question about using a variable inside awk, use the regex match operator ~ and set your myIndex variable to include the ^ anchor:
linesToRemove=$(awk -v myIndex="^$myIndex" '$0 ~ myIndex { print NR; }' "${inputFile}");
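A runnable sketch of that idea, assuming the sample input from the question and myIndex=0 (both chosen here only for illustration):

```shell
#!/usr/bin/env bash
inputFile=$(mktemp)
printf '%s\n' 0 1 3 0 > "$inputFile"

myIndex=0

# Inside /.../ awk sees the literal text "myIndex". Matching $0 against
# the awk variable with ~ treats the variable's value as the regex.
linesToRemove=$(awk -v myIndex="^$myIndex" '$0 ~ myIndex { print NR }' "$inputFile")

echo "$linesToRemove"
rm -f "$inputFile"
```

Note that the value passed with -v goes through awk's escape-sequence processing, so a pattern containing backslashes may need extra escaping.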
Finally, if you just want to remove the lines that start with 0, then simply remove them.
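The command for this step is missing from this copy; a minimal sketch of such a filter, with hypothetical temporary files, could be:

```shell
#!/usr/bin/env bash
inputFile=$(mktemp)
filtered=$(mktemp)
printf '%s\n' 0 1 3 0 > "$inputFile"

# Print every line that does NOT start with 0.
# (grep -v '^0' behaves similarly, but exits non-zero when no line is left.)
awk '!/^0/' "$inputFile" > "$filtered"

cat "$filtered"
```

With the sample input this leaves the lines 1 and 3.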
If you want to remove lines from another file using what is captured in the input file, here's one way:
Or use just one awk command:
Crude explanation: FNR==NR && /^0/ means: while processing the first file, take each line that starts with 0 and put its line number into array a. NR>FNR means: while processing the next file, print the line if its line number is not in the array. See the gawk documentation for what FNR, NR, etc. mean.
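The awk command itself was lost in this copy. Reconstructed from the explanation above, it might look like the following sketch; the sample files are hypothetical and it assumes the first file is not empty (otherwise FNR==NR would also hold for the second file).

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
inputFile="$workdir/input.txt"
anotherFile="$workdir/another.txt"
outputFile="$workdir/output.txt"

printf '%s\n' 0 1 3 0 > "$inputFile"
printf '%s\n' \
  '2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02' \
  '5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02' \
  '1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02' \
  '2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02' > "$anotherFile"

# First file (FNR == NR): remember the line numbers of lines starting with 0.
# Second file (NR > FNR): print only lines whose number was not remembered.
awk 'FNR == NR && /^0/ { a[FNR]; next }
     NR > FNR && !(FNR in a)' "$inputFile" "$anotherFile" > "$outputFile"

cat "$outputFile"
```

Merely referencing a[FNR] is enough to create the array element, so no value needs to be assigned.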
I think you have to do the following to assign an array:
And to get the number of elements do (as you have in a commented line):
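The snippets for these two steps are missing here. A sketch of both, assuming bash and the sample input from the question:

```shell
#!/usr/bin/env bash
inputFile=$(mktemp)
printf '%s\n' 0 1 3 0 > "$inputFile"

# The outer parentheses turn the command output into an array; word
# splitting on whitespace yields one element per line number.
linesToRemove=( $(awk '/^0/ { print NR }' "$inputFile") )

# With a real array, ${#name[@]} is the number of elements.
linesNr=${#linesToRemove[@]}

echo "$linesNr lines to remove: ${linesToRemove[*]}"
rm -f "$inputFile"
```

In bash 4 or later, mapfile -t linesToRemove < <(awk ...) is an alternative that does not depend on word splitting.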
To remove the lines from the file you could do something like:
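The original snippet is not preserved, so the following is only one possible approach under the assumption that the line numbers are already in a bash array: each number becomes a sed delete command. The array contents and file names are hypothetical.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
anotherFile="$workdir/another.txt"
outputFile="$workdir/output.txt"

printf '%s\n' \
  '2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02' \
  '5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02' \
  '1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02' \
  '2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02' > "$anotherFile"

# Line numbers as they would come out of the awk step for the sample input.
linesToRemove=(1 4)

# Build one "-e Nd" argument pair per line number.
sedArgs=()
for n in "${linesToRemove[@]}"; do
    sedArgs+=( -e "${n}d" )
done

if [ "${#sedArgs[@]}" -gt 0 ]; then
    sed "${sedArgs[@]}" "$anotherFile" > "$outputFile"
else
    # Nothing to delete: keep the file unchanged.
    cp "$anotherFile" "$outputFile"
fi

cat "$outputFile"
```

All deletions happen in a single sed pass, so the line numbers do not shift between removals.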
In general if you do this:
instead of this:
use this:
POC :
It greatly depends on the post-processing you are doing, but do you really need the actual count? Why not do something like this:
Or, even just:
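The snippets that followed are missing in this copy. One way to test for a match without computing a count, offered as an assumption rather than the author's code, is to branch on grep's exit status:

```shell
#!/usr/bin/env bash
inputFile=$(mktemp)
printf '%s\n' 0 1 3 0 > "$inputFile"

# grep -q prints nothing and succeeds as soon as one line matches.
if grep -q '^0' "$inputFile"; then
    found=yes
    echo "at least one line starts with 0; post-processing goes here"
else
    found=no
fi
rm -f "$inputFile"
```

The found variable exists only to make the outcome visible; in practice the post-processing would go directly inside the if branch.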
--EDIT--
Based on what you've said in your comment, I would recommend a different approach. If I understand you correctly, you want to read file a, looking for lines of the form ^0[0-9]*, and then remove the lines with those line numbers from file b. Doing it one line at a time is pretty slow if the files get big. Just do:
The assignment to cmd forms a sed command to delete the lines. Invoking sed on b will omit those lines. You'll need to redirect the sed output appropriately (perhaps to a temp file and then back to b, or just use 'sed -i' if you're using GNU sed).
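The command itself is missing from this copy. Based on the description above, a sketch might look like this; the files a and b are created here with the question's sample data, and the out file is a hypothetical name for the redirected result.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
a="$workdir/a"
b="$workdir/b"
out="$workdir/out"

printf '%s\n' 0 1 3 0 > "$a"
printf '%s\n' \
  '2.617300e+01 5.886700e+01 -1.894697e-01 1.251225e+02' \
  '5.707397e+01 2.214040e+02 8.607959e-02 1.229114e+02' \
  '1.725900e+01 1.734360e+02 -1.298053e-01 1.250318e+02' \
  '2.177940e+01 1.249531e+02 1.538853e-01 1.527150e+02' > "$b"

# Emit one sed delete command ("1d", "4d", ...) per matching line of a.
cmd=$(awk '/^0[0-9]*/ { print NR "d" }' "$a")

# sed accepts newline-separated commands, so the whole list is applied
# to b in a single pass.
sed "$cmd" "$b" > "$out"

cat "$out"
```

To update b in place, the output can be moved back with mv "$out" "$b" once sed has succeeded.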