制表符间隔数据到填充列

发布于 2024-07-09 12:07:17 字数 1060 浏览 6 评论 0原文

正在寻找一个列格式化脚本,我有一种感觉这可能是一个单行 awk。 理想情况下,我只需要一个小的 shell 脚本。

数据是制表符分隔的,每行中的每个单元格的长度都是可变的,当然,其中可能有空格。

所以我们有这样的东西

dasj    dhsahdwe    dhasdhajks  ewqhehwq    dsajkdhas
e dward das dsaw    das daswf
fjdk    ewf jken    dsajkw  dskdw
hklt    ewq vn1 daskcn  daskw

应该最终是这样的:

dasj       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
e dward    das        dsaw       das        daswf     
fjdk       ewf        jken       dsajkw     dskdw     
hklt       ewq        vn1        daskcn     daskw     

理想情况下,能够调整每个之间的硬间隔量。 如果它逐列查看就更好了,因此前导短单元格不会全部获得相同的正确填充。

不理想:

1       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
2       das        dsaw       das        daswf     
3       ewf        jken       dsajkw     dskdw     
4       ewq        vn1        daskcn     daskw     

理想:

1  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas 
2  das       dsaw        das       daswf     
3  ewf       jken        dsajkw    dskdw     
4  ewq       vn1         daskcn    daskw     

Looking for a column formatting script, I have a feeling this could be a one line awk. Ideally, a small shell script is all I am after.

The data is tab separated, each cell in each row is of variable length, and of course, may have spaces in it.

So we have something like this

dasj    dhsahdwe    dhasdhajks  ewqhehwq    dsajkdhas
e dward das dsaw    das daswf
fjdk    ewf jken    dsajkw  dskdw
hklt    ewq vn1 daskcn  daskw

Should end up something like this:

dasj       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
e dward    das        dsaw       das        daswf     
fjdk       ewf        jken       dsajkw     dskdw     
hklt       ewq        vn1        daskcn     daskw     

Ideally, being able to adjust the amount of hard spaced between each one. Even better if it looks on a column by column basis, so leading short cells do not all get the same right padding.

Not ideal:

1       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
2       das        dsaw       das        daswf     
3       ewf        jken       dsajkw     dskdw     
4       ewq        vn1        daskcn     daskw     

Ideal:

1  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas 
2  das       dsaw        das       daswf     
3  ewf       jken        dsajkw    dskdw     
4  ewq       vn1         daskcn    daskw     

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

長街聽風 2024-07-16 12:07:17

如果您使用的是 BSD 派生操作系统(包括 Mac OS X),column(1) 及其 -t 选项可能会执行您想要的操作:

% column -t coltest                                                               
dasj  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas
e     dward     das         dsaw      das        daswf
fjdk  ewf       jken        dsajkw    dskdw
hklt  ewq       vn1         daskcn    daskw

If you're on a BSD-derived OS (including Mac OS X), column(1) and its -t option might do what you want:

% column -t coltest                                                               
dasj  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas
e     dward     das         dsaw      das        daswf
fjdk  ewf       jken        dsajkw    dskdw
hklt  ewq       vn1         daskcn    daskw
说好的呢 2024-07-16 12:07:17

在未混淆的 Perl 中:

#!/usr/bin/perl -w

use strict;

my (@data, @length);
while (<>) {
    chomp;
    my @line = split(/\t/);
    foreach my $i (0 .. $#line) {
        my $n = length($line[$i]);
        $length[$i] = $n if (!defined($length[$i]) || $n > $length[$i]);
    }
    push(@data, [ @line ]);
}

$length[$#length] = 0; # no need to pad the last column
my $fmt = join("  ", map { "%-${_}s" } @length) . "\n";
foreach my $ref (@data) {
    printf $fmt, @$ref;
}

In un-obsfucated Perl:

#!/usr/bin/perl -w

use strict;

my (@data, @length);
while (<>) {
    chomp;
    my @line = split(/\t/);
    foreach my $i (0 .. $#line) {
        my $n = length($line[$i]);
        $length[$i] = $n if (!defined($length[$i]) || $n > $length[$i]);
    }
    push(@data, [ @line ]);
}

$length[$#length] = 0; # no need to pad the last column
my $fmt = join("  ", map { "%-${_}s" } @length) . "\n";
foreach my $ref (@data) {
    printf $fmt, @$ref;
}
水水月牙 2024-07-16 12:07:17

干得好。 用“gawk”测试。

BEGIN {
    FS = "\t";
    # max: Column width
    # fpl: Fields per line
    # data: Fields in every line
}
 { # Note the blank before this brace
    fpl[FNR] = NF;
    for (i=1; i<=NF; i++) {
        data[FNR, i] = $i;
        if (length($i) > max[i]) {
            max[i] = length($i);
        }
    }
}
END {
    for (l=1; l<=length(fpl); l++) {
        for (i=1; i<=fpl[l]; i++) {
            fmt = "%-" max[i] "s";
            if (i > 1) {
                printf " "; # This goes between columns
            }
            printf fmt, data[l, i];
        }
        printf "\n";
    }
}

Here you go. Tested with "gawk".

BEGIN {
    FS = "\t";
    # max: Column width
    # fpl: Fields per line
    # data: Fields in every line
}
 { # Note the blank before this brace
    fpl[FNR] = NF;
    for (i=1; i<=NF; i++) {
        data[FNR, i] = $i;
        if (length($i) > max[i]) {
            max[i] = length($i);
        }
    }
}
END {
    for (l=1; l<=length(fpl); l++) {
        for (i=1; i<=fpl[l]; i++) {
            fmt = "%-" max[i] "s";
            if (i > 1) {
                printf " "; # This goes between columns
            }
            printf fmt, data[l, i];
        }
        printf "\n";
    }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文