Using awk to print each character as its own column?

Posted 2024-08-28 05:31:55

I need to reorganize a large CSV file. The first column, which is currently a 6-digit number, needs to be split into its individual digits, using commas as the field separator.

For example, I need this:

022250,10:50 AM,274,22,50
022255,11:55 AM,275,22,55

turned into this:

0,2,2,2,5,0,10:50 AM,274,22,50
0,2,2,2,5,5,11:55 AM,275,22,55

Let me know what you think!

Thanks!

最丧也最甜 2024-09-04 05:31:55

It's a lot shorter in perl:

perl -F, -ane '$,=","; print split("",$F[0]), @F[1..$#F]' <file>

Since you don't know perl, a quick explanation. -F, indicates the input field separator is the comma (like awk). -a activates auto-split (into the array @F), and -n implicitly wraps the code in a while (<>) { ... } loop, which reads input line by line. -e indicates the next argument is the script to run. $, is the output field separator (it gets set on every iteration of the loop this way, but oh well). split has an obvious purpose, and you can see how the array is indexed/sliced. print, when given lists as arguments like this, uses the output field separator and prints all their elements.

In awk:

awk -F, '{n=split($1,a,""); for (i=1;i<=n;i++) {printf("%s,",a[i])}; for (i=2;i<NF;i++) {printf("%s,",$i)}; print $NF}' <file>

盗心人 2024-09-04 05:31:55

I think this might work. The split function (at least in the version I am running) splits the value into individual characters if the third parameter is an empty string.

  BEGIN{ FS="," }
  {
     n = split( $1, a, "" );
     for ( i = 1; i <= n; i++ )
        printf("%s,", a[i] );

     sep = "";
     for ( i = 2; i <= NF; i++ )
        {
        printf( "%s%s", sep, $i );
        sep = ",";
        }
     printf("\n");
  }
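
Since the empty-separator behavior of split() depends on the awk in use, a quick check like the one below may help before relying on it. This is only a sketch: `count_chars` is a made-up helper name, and the claim that gawk and mawk split character by character reflects their documented behavior as an extension rather than anything POSIX requires.

```shell
# Hypothetical helper: prints how many pieces split() makes of its argument
# when the separator is the empty string.
count_chars() {
    awk -v s="$1" 'BEGIN { n = split(s, a, ""); print n }'
}

# With an awk that splits on "" character by character, this prints the
# length of the string. Any other number means the feature is missing and
# a substr() loop is the safer route.
count_chars 022250
```

If the number printed matches the length of the input, the script above should work as written with that awk.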

掩于岁月 2024-09-04 05:31:55

Here's another way in awk:

$ awk -F"," '{gsub(".",",&",$1);sub("^,","",$1)}1' OFS="," file
0,2,2,2,5,0,10:50 AM,274,22,50
0,2,2,2,5,5,11:55 AM,275,22,55
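
The one-liner is fairly dense, so here is the same idea spread out with comments, as a rough sketch. The function name `split_first_field` is just for illustration, and the regexes are written as /.../ literals instead of strings, which should behave the same way here.

```shell
# Hypothetical wrapper around the gsub/sub approach; reads files given as
# arguments, or standard input if there are none.
split_first_field() {
    awk -F, -v OFS=, '
    {
        gsub(/./, ",&", $1)   # put a comma in front of every character of field 1
        sub(/^,/, "", $1)     # drop the extra comma at the very start
        print                 # assigning to $1 rebuilt $0 with OFS, so print it
    }' "$@"
}

printf '022250,10:50 AM,274,22,50\n' | split_first_field
```

Because assigning to $1 makes awk rebuild the record with OFS, no loop over the remaining fields is needed.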

⊕婉儿 2024-09-04 05:31:55

Here's a variation on a theme. One thing to note is it prints the remaining fields without using a loop. Another is that since you're looping over the characters in the first field anyway, why not just do it without using the null-delimiter feature of split() (which may not be present in some versions of AWK):

awk -F, 'BEGIN{OFS=","} {len=length($1); for (i=1;i<len; i++) {printf "%s,", substr($1,i,1)}; printf "%s", substr($1,len,1);$1=""; print $0}' filename

As a script:

BEGIN {FS = OFS = ","}
{
    len = length($1); 
    for (i=1; i<len; i++)
        {printf "%s,", substr($1, i, 1)}; 
    printf "%s", substr($1, len, 1)
    $1 = "";
    print $0
}
