Need help converting a while-do block to awk to speed up processing

Posted on 2025-01-10 00:09:31

I need the 7th field of a CSV file converted from Julian (yyddd or yyJJJ) to yyyymmdd. I have the while-do loop below, and I need the same logic in awk for faster processing. Can someone help?

count=0
while read -r line1; do
        col_7=$( echo $line1 | cut -d ',' -f7 | cut -c4-6)
        year1=$( echo $line1 | cut -d ',' -f7 | cut -c2-3)
        echo $col_7
        col_1=$( echo $line1 | cut -d ',' -f1,2,3,4,5,6)
        col_8=$( echo $line1 | cut -d ',' -f8 )
        date7=$(date -d "01/01/${year1} +${col_7} days -1 day" +%Y%m%d)
        echo $date7
        echo $col_1,$date7,$col_8 >> ${t2}
        count=$[count+1]
done < ${t1}

Input

xx,x,x,xxx,xxxx,xxxxx,021276,x  
xx,x,x,xxx,xxxx,xxxxx,021275,x  
xx,x,x,xxx,xxxx,xxxxx,021275,x  

Output

xx,x,x,xxx,xxxx,xxxxx,20211003,x  
xx,x,x,xxx,xxxx,xxxxx,20211002,x  
xx,x,x,xxx,xxxx,xxxxx,20211002,x  


Comments (2)

柠北森屋 2025-01-17 00:09:31

Just eliminating all the calls to cut will do wonders; you may not need awk.

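count=0
# Let read split each line on commas instead of forking cut several times per line.
while IFS=, read -r c1 c2 c3 c4 c5 c6 c7 c8 rest; do
    col_7=${c7:3:3}                  # day of year: characters 4-6 of field 7
    year1=${c7:1:2}                  # two-digit year: characters 2-3 of field 7
    col_1=$c1,$c2,$c3,$c4,$c5,$c6    # rejoin the first six fields with commas
    col_8=$c8
    date7=$(date -d "01/01/$year1 +$col_7 days -1 day" +%Y%m%d)
    ...
    count=$((count+1))
done < "$t1"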
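
For reference, the loop above could be completed along these lines. This is only a sketch: the function name `julian_loop` is made up for illustration, it reads from standard input and writes to standard output rather than using `$t1` and `$t2`, it assumes GNU `date` (for `-d`) and a 6-character field 7 laid out as 0yyddd, and it uses POSIX prefix/suffix stripping in place of bash's `${c7:3:3}` so that it should also run under plain `sh`.

```shell
#!/bin/sh
# Hypothetical helper: reads CSV lines on stdin, writes converted lines to stdout.
# Assumes GNU date and a 6-character field 7 laid out as 0yyddd.
julian_loop() {
    count=0
    while IFS=, read -r c1 c2 c3 c4 c5 c6 c7 c8 rest; do
        ddd=${c7#???}      # drop the first three characters -> day of year
        yy=${c7#?}         # drop the leading character ...
        yy=${yy%???}       # ... and the trailing ddd -> two-digit year
        # Still one date process per line, but no cut/echo pipelines.
        date7=$(date -d "01/01/$yy +$ddd days -1 day" +%Y%m%d)
        printf '%s\n' "$c1,$c2,$c3,$c4,$c5,$c6,$date7,$c8"
        count=$((count + 1))
    done
    echo "converted $count lines" >&2
}
```

It would be called as `julian_loop < "$t1" > "$t2"`. Anything after the 8th field ends up in `rest` and is dropped, as in the original loop.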
家住魔仙堡 2025-01-17 00:09:31

Here is an awk solution. It requires GNU awk for its time functions. I tested it in a terminal, so it is pretty much a one-liner.

awk 'BEGIN { FS=OFS="," } { $7=strftime("%Y%m%d",mktime("20"substr($7,2,2)" 01 01 00 00 00")+(substr($7,4)*86400)-3600) } 1' filename.txt

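Explanations:

  • FS is the input field separator. It is set to ",".
  • OFS is the output field separator. It is also set to ",".
  • $7 is the 7th field.
  • strftime(format, timestamp) is a built-in function that formats a timestamp, given in seconds since the epoch, according to the specification in format.
  • mktime(datespec) is a function that turns datespec into a timestamp in seconds. The format for datespec is YYYY MM DD HH MM SS.
  • substr($7,2,2) extracts the two-digit year.
  • substr($7,4) extracts the day of the year. Because these functions work in seconds, a conversion to seconds is required.
  • 86400 is one day in seconds: 24 (hours) * 60 (minutes) * 60 (seconds).
  • 3600 is one hour in seconds: 60 (minutes) * 60 (seconds). January 1 plus ddd days lands on midnight at the start of the following day, so subtracting an hour moves the timestamp back to 23:00 on the intended day. This relies on the local time zone having no daylight-saving offset in effect on the target date.
  • 1 is an always-true pattern, so the (modified) input line is printed. It doesn't have to be 1; any nonzero value is fine. If you like RPGs, you might want to change that to 999.
  • filename.txt is your input file.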
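
If GNU awk is not available, or the result should not depend on the local time zone, the day-of-year arithmetic can also be done by hand. A minimal sketch, assuming field 7 is always laid out as 0yyddd and all years fall in 2000-2099 (the same "20" prefix the one-liner hard-codes); the function name `julian_awk` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: convert field 7 (assumed layout 0yyddd, years 2000-2099)
# to yyyymmdd with integer arithmetic only, so no mktime/strftime is needed.
julian_awk() {
    awk 'BEGIN {
        FS = OFS = ","
        # Days per month for a non-leap year; February is adjusted per record.
        split("31,28,31,30,31,30,31,31,30,31,30,31", mdays, ",")
    }
    {
        year = 2000 + substr($7, 2, 2)
        day  = substr($7, 4, 3) + 0
        # Within 2000-2099 every year divisible by 4 is a leap year.
        mdays[2] = (year % 4 == 0) ? 29 : 28
        # Walk through the months, subtracting their lengths from the day count.
        month = 1
        while (month < 12 && day > mdays[month] + 0) {
            day -= mdays[month]
            month++
        }
        $7 = sprintf("%04d%02d%02d", year, month, day)
        print
    }' "$@"
}
```

Called as `julian_awk filename.txt`, it should produce the output shown in the question for the sample input. It does not validate the day of year, so a value past the end of the year is not rejected.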