About the UNIX grep command

Posted 2024-08-22 01:33:32

I need to write a shell script that picks up all the files (not directories) in the /exp/files directory. For each file, I want to check whether the last line of the file has been received. The last line is a trailer record, and its third field is the number of data records, e.g. 2315 (the total number of lines in the file minus 2, for the header and the trailer). In my UNIX shell script I want to check that the last line is a trailer record by looking for the leading T, and that the number of lines in the file equals 2315+2. If both checks pass, I want to move the file to a different directory, /exp/ready.

tail -1 test.csv 
T,Test.csv,2315,80045.96

Also, in the input file, zero, one, or more fields of the trailer record may be enclosed in double quotes:

"T","Test.csv","2315","80045.96"
"T", Test.csv, 2212,"80045.96"
T,Test.csv,2315,80045.96

清秋悲枫 2024-08-29 01:33:32

You can test for the presence of the last line with the following:

tail -n 1 "${filename}" | grep -Eq '^T,|^"T",'
rc=$?

At that point, $rc will be 0 if the line starts with either T, or "T", (assuming that is enough to identify the trailer record).
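As a quick check of that pattern, the sketch below runs it against the three trailer variants from the question plus a header line. The function name is_trailer is hypothetical (it is not part of the answer), and grep -E is used in place of egrep on the assumption that the two behave the same for this pattern.

```shell
#!/bin/bash
# is_trailer (hypothetical helper): succeeds when the given line starts
# with T, or "T", and fails otherwise.
is_trailer() {
    printf '%s\n' "$1" | grep -Eq '^T,|^"T",'
}

# The first three lines are the trailer variants from the question;
# the last one is a header line and should be rejected.
for line in '"T","Test.csv","2315","80045.96"' \
            '"T", Test.csv, 2212,"80045.96"' \
            'T,Test.csv,2315,80045.96' \
            'H,Test.csv,other rubbish goes here'; do
    if is_trailer "$line"; then
        echo "trailer:     $line"
    else
        echo "not trailer: $line"
    fi
done
```

Anchoring on the comma after T keeps data lines that merely begin with a capital T (for example "Total,...") from being mistaken for a trailer.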

Once you've established that, you can extract the line count with:

lc=$(wc -l < "${filename}")

and you can get the expected line count with:

elc=$(tail -n 1 "${filename}" | awk -F, '{sub(/^"/,"",$3);print 2+$3}')

and compare the two.
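To make the expected-count arithmetic concrete, here is a small sketch that applies the same idea to the trailer variants from the question. The helper name expected_lines is hypothetical, and unlike the one-liner above it strips every double quote and space from the third field, which is an assumption about how tolerant the check should be.

```shell
#!/bin/bash
# expected_lines (hypothetical helper): print the record count found in the
# third field of a trailer line, plus 2 for the header and the trailer.
expected_lines() {
    printf '%s\n' "$1" | awk -F, '{ gsub(/[" ]/, "", $3); print $3 + 2 }'
}

expected_lines '"T","Test.csv","2315","80045.96"'   # prints 2317
expected_lines '"T", Test.csv, 2212,"80045.96"'     # prints 2214
expected_lines 'T,Test.csv,2315,80045.96'           # prints 2317
```

One thing to keep in mind when comparing against wc -l: it counts newline characters, so a file whose last line has no trailing newline reports one line fewer than expected.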

So, tying all that together, this would be a good start. It outputs the file itself (my test files num[1-9].tst) along with a message indicating whether the file is okay or why it is not okay.

#!/bin/bash
cd /exp/files || exit 1
for fspec in *.tst ; do
    if [[ -f ${fspec} ]] ; then
        # Show the file contents, indented by three spaces.
        sed 's/^/   /' "${fspec}"
        # Does the last line start with T, or "T", ?
        if tail -n 1 "${fspec}" | grep -Eq '^T,|^"T",' ; then
            # Actual line count.
            lc=$(wc -l < "${fspec}")
            # Expected count: third trailer field plus header and trailer.
            elc=$(tail -n 1 "${fspec}" | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
            if [[ ${lc} -eq ${elc} ]] ; then
                echo '***' File "${fspec}" is done and dusted.
            else
                echo '***' File "${fspec}" line count mismatch: ${lc}/${elc}.
            fi
        else
            echo '***' File "${fspec}" has no valid trailer.
        fi
    else
        ls -ald "${fspec}" | sed 's/^/   /'
        echo '***' File "${fspec}" is not a regular file.
    fi
done

The sample run, showing the test files I used:

   H,Test.csv,other rubbish goes here
   this file does not have a trailer
*** File num1.tst has no valid trailer.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes and correct count
   "T","Test.csv","1","80045.96"
*** File num2.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes but bad count
   "T","Test.csv","9","80045.96"
*** File num3.tst line count mismatch: 3/11.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes except T, and correct count
   T,"Test.csv","1","80045.96"
*** File num4.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with no quotes on T or count and correct count
   T,"Test.csv",1,"80045.96"
*** File num5.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a traier with quotes on T only, and correct count
   "T",Test.csv,1,80045.96
*** File num6.tst is done and dusted.
   drwxr-xr-x+ 2 pax None 0 Feb 23 09:55 num7.tst
*** File num7.tst is not a regular file.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes except the bad count
   "T","Test.csv",8,"80045.96"
*** File num8.tst line count mismatch: 3/10.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with no quotes and a bad count
   T,Test.csv,7,80045.96
*** File num9.tst line count mismatch: 3/9.
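The script above only reports on each file; the question also asks for complete files to be moved to /exp/ready. A minimal sketch of that last step follows, assuming the same trailer and line-count checks. The function name move_if_complete and its two-argument interface are hypothetical.

```shell
#!/bin/bash
# move_if_complete (hypothetical helper): move FILE into directory DEST when
# its last line is a trailer and its line count equals the trailer's record
# count plus 2. Returns non-zero (and moves nothing) otherwise.
move_if_complete() {
    local file=$1 dest=$2 last lc elc
    [[ -f $file ]] || return 1
    last=$(tail -n 1 "$file")
    # Trailer check: the last line must start with T, or "T",
    printf '%s\n' "$last" | grep -Eq '^T,|^"T",' || return 1
    # Actual line count versus the count announced in the trailer.
    lc=$(wc -l < "$file")
    elc=$(printf '%s\n' "$last" | awk -F, '{ gsub(/[" ]/, "", $3); print $3 + 2 }')
    [[ $lc -eq $elc ]] || return 1
    mv -- "$file" "$dest"/
}

# Usage with the directories from the question:
# for f in /exp/files/*; do move_if_complete "$f" /exp/ready; done
```

Passing the destination as an argument makes it easy to try the function on a scratch directory before pointing it at the real one.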
╰◇生如夏花灿烂 2024-08-29 01:33:32

If you want to move the files only after they have been written and closed, you should consider using something like inotify, incron, FAM, or gamin.

深白境迁sunset 2024-08-29 01:33:32

This code does all of the logic in a single call to awk, which makes it very efficient. It also does NOT hardcode the example value of 2315; it uses the value contained in the trailer line instead, as I believe that was your intent.

Remember to remove the echo if you are satisfied with the results.

#!/bin/bash

for file in /exp/files/*; do
  if [[ -f "$file" ]]; then
    if nawk -F, '{v0=$0;v1=$1;v3=$3}END{gsub(/"/,"",v1);gsub(/"/,"",v3);exit !(v1 == "T" && NR == v3+2)}' "$file"; then
      echo mv "$file" /exp/ready
    fi
  fi
done

Update

I had to add {v0=$0;v1=$1;v3=$3} because SunOS's implementation of awk does not give END{} access to the field variables ($0, $1, $2, etc.); they must instead be saved to user-defined variables if you want to work on them inside END{}. See the last row of the first table in this awk feature comparison link.
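A variation on the same single-pass idea, offered as a sketch rather than as the answer's own code: save only the whole line on every record, then strip the quotes and split it inside END. This avoids relying on field variables in END, and it handles quoted fields because the quotes are removed before the fields are compared. The function name trailer_ok is hypothetical, and plain awk is assumed to be available in place of nawk.

```shell
#!/bin/bash
# trailer_ok (hypothetical helper): exit status 0 when FILE's last line is a
# trailer whose third field equals the total line count minus 2.
trailer_ok() {
    awk '
        { line = $0 }                    # remember the most recent line
        END {
            gsub(/"/, "", line)          # drop any double quotes
            n = split(line, f, ",")      # split the saved line into fields
            exit !(n >= 3 && f[1] == "T" && NR == f[3] + 2)
        }' "$1"
}

# Usage with the directories from the question:
# for file in /exp/files/*; do
#     [[ -f $file ]] && trailer_ok "$file" && echo mv "$file" /exp/ready
# done
```

As in the answer, the echo in the usage comment is there so the moves can be reviewed before they are performed for real.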

寂寞花火° 2024-08-29 01:33:32

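I don't have a UNIX shell handy here, but

#!/bin/bash
# mapfile needs bash 4 or later; -maxdepth 1 keeps find out of subdirectories
mapfile -t files < <(find /exp/files -maxdepth 1 -type f)

should put all the files in a bash array; iterating through each of them as paxdiablo suggested above should then get you sorted.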
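For file names that may contain spaces or other awkward characters, a NUL-delimited read is one way to fill the array safely. The sketch below assumes a find that supports -maxdepth and -print0 (both GNU and BSD find do, though neither option is in POSIX); the function name collect_files is hypothetical.

```shell
#!/bin/bash
# collect_files (hypothetical helper): fill the global array "files" with the
# regular files directly inside DIR, without descending into subdirectories.
collect_files() {
    local dir=$1 f
    files=()
    # Read NUL-delimited names so spaces and newlines in names are preserved.
    while IFS= read -r -d '' f; do
        files+=("$f")
    done < <(find "$dir" -maxdepth 1 -type f -print0)
}

# Usage with the directory from the question:
# collect_files /exp/files
# for f in "${files[@]}"; do echo "checking $f"; done
```

Quoting "${files[@]}" in the loop is what keeps each file name intact as a single word.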

回忆躺在深渊里 2024-08-29 01:33:32
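
destination=/exp/ready
for file in /exp/files/*.csv
do
    # Strip double (\042) and single (\047) quotes from the trailer, then
    # require field 1 to be T and field 3 to be the example count 2315.
    var=$(tail -n 1 "$file" | awk -F"," '{ gsub(/\042|\047/,"") }
    $1=="T" && $3+0 == 2315 { print "ok" }')
    if [ "$var" = "ok" ]; then
        # Remove the echo to actually move the file.
        echo mv "$file" "$destination"
    else
        echo "invalid: $file"
    fi
done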
﹏雨一样淡蓝的深情 2024-08-29 01:33:32
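
#!/bin/bash

# Generate the per-file helper script (written with cat here rather than ex).
cat > findready.sh <<'HERE'
#!/bin/bash
# $1 is the file to check; 2317 is the example count 2315 plus header and trailer.
NUMLINES=$(wc -l < "$1")
# First character of the last line, with double quotes removed.
TRAILER=$(tail -n 1 "$1" | tr -d '"' | sed 's/^\(.\).*$/\1/')

if [[ $NUMLINES -eq 2317 && $TRAILER == "T" ]] ; then
    mv "$1" /exp/ready/
fi
HERE

chmod a+x findready.sh

find /exp/files/ -type f -name '*.csv' -exec ./findready.sh {} ';' > /dev/null 2>&1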