AWK script - what does this script do?

Posted 2024-11-14 06:37:28

I need to duplicate processing of this AWK script but cannot figure out what it is doing. Can anyone please advise what the basic function of this script is?

It takes an input file and creates an output file, but I do not have access to either file to see what it is doing. It has something to do with the pipe delimiter which delimits columns in the input file.

{
  if (NR == 1) {
    line = $0
    len = length(line)
    newlen = len
    while ( substr(line,newlen-1,1) == "|" )
    {
      newlen = newlen - 1
    }
    line = substr(line,1,newlen-1)
  }
  else {
    print line
    line = $0
  }
}
END {
  len = length(line)
  newlen = len
  while ( substr(line,newlen-1,1) == "|" ) {
    newlen = newlen - 1
  }
  line = substr(line,1,newlen-1)
  print line
}

喜你已久 2024-11-21 06:37:28

It looks like it just trims all the trailing pipe characters on the first and last lines.
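One way to check this without the real files is to run the script from the question on made-up data. The sketch below assumes a POSIX shell with awk on the PATH; the function name run_original and the three sample records are hypothetical.

```shell
# Wrap the script from the question, unchanged apart from indentation.
run_original() {
  awk '
  {
    if (NR == 1) {
      line = $0
      len = length(line)
      newlen = len
      while (substr(line, newlen-1, 1) == "|") {
        newlen = newlen - 1
      }
      line = substr(line, 1, newlen-1)
    } else {
      print line
      line = $0
    }
  }
  END {
    len = length(line)
    newlen = len
    while (substr(line, newlen-1, 1) == "|") {
      newlen = newlen - 1
    }
    line = substr(line, 1, newlen-1)
    print line
  }'
}

# Made-up sample: every record ends in two pipes.
printf '%s\n' 'id|name||' '1|a||' '2|b||' | run_original
```

With this input the first and last records come out as id|name and 2|b, while the middle record is printed unchanged as 1|a||.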

西瓜 2024-11-21 06:37:28

Wow, whoever wrote this must have been paid by the line.

The block of code that occurs twice, from len = length(line) to line = substr(line,1,newlen-1), is doing a string transformation that could be expressed more simply (and more clearly) as a regular expression replacement. It strips the | characters at the end of line. Because the loop tests position newlen-1 rather than newlen, the last character is always dropped, whatever it is: when the line ends with a character other than |, that character is stripped (this may be accidental), along with any | characters directly before it. This could be performed as sub(/\|*.$/, "", line), or sub(/\|+$/, "", line) if the behavior with no final | doesn't matter.

As for the overall structure, there are three parts in the code: what's done for the first line (if (NR == 1) {…}), what's done for other lines (else {…}), and what's done after the last line (END {…}). On the first line, the variable line is set to the transformed $0. On subsequent lines, the saved line is printed, then line is set to the current line. Finally the last line is printed, transformed. This print-previous-then-save-current pattern is a common trick to act differently on the last line: when you read a line, you can't know whether it's the last one, so you save it, print the previous line and move on; in the END block you do that different thing for the last line.

Here's how I'd write it. The data flow is similarly nontrivial (but hardly contrived either), but at least it's not drowned in a messy text transformation.

function cleanup (line) { sub(/\|*.$/, "", line); return line }
NR != 1 { print prev }
{ prev = (NR == 1 ? cleanup($0) : $0) }
END { print cleanup(prev) }
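If the output has to match the old script exactly, the trimming step can be checked on its own against a one-line replacement. The sketch below is only an illustration under assumptions: loop_trim, regex_trim and samples are hypothetical helper names, the sample lines are invented, and a POSIX shell with awk is assumed. It applies the original loop and a single sub() call to every input line so the two outputs can be compared.

```shell
# Apply the trimming loop from the original script to every line.
loop_trim() {
  awk '{
    line = $0
    newlen = length(line)
    # substr() is 1-based, so newlen-1 is the character before the last one.
    while (substr(line, newlen-1, 1) == "|") {
      newlen = newlen - 1
    }
    print substr(line, 1, newlen-1)
  }'
}

# Candidate replacement: drop the last character and any "|" run before it.
regex_trim() {
  awk '{ sub(/\|*.$/, ""); print }'
}

# Invented samples: two trailing pipes, one trailing pipe, no trailing pipe.
samples() {
  printf '%s\n' 'a|b||' 'a|b|' 'a|b|c' 'abc'
}

samples | loop_trim
samples | regex_trim
```

On these samples both functions print a|b, a|b, a|b and ab, so a line that does not end in | still loses its last character, plus any pipes just before it.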
