How to transform a table into a matrix?

Posted on 2024-07-13 11:37:28

Suppose I have a table in a text file, such as:

  • A B 1
  • A C 2
  • A D 1
  • B A 3
  • C D 2
  • A E 1
  • E D 2
  • C B 2
  • . . .
  • . . .
  • . . .

I also have a list of symbols in another text file. I want to transform this table into a Perl data structure like:

  • _ A D E . . .
  • A 0 1 1 . . .
  • D 1 0 2 . . .
  • E 1 2 0 . . .
  • . . . . . . .

But I only need the selected symbols. For example, A, D and E are listed in the symbol file, but B and C are not.

Comments (4)

鹿童谣 2024-07-20 11:37:28

Use an array for the first one and a two-dimensional hash for the second one. The first one should look roughly like:

$list[0] # row 1 - the value is "A B 1"

And the hash like:

$hash{A}{A} # the intersection of A and A - the value is 0

Figuring out how to implement a problem is about 75% of the mental battle for me. I'm not going to go into specifics about how to print the hash or the array, because that's easy and I'm also not entirely clear on how you want it printed or how much you want printed. But converting the array to the hash should look a bit like this:

my %hash;
foreach (@list) {
  # split ' ' splits on any run of whitespace and ignores leading whitespace
  my ($letter1, $letter2, $value) = split ' ';
  $hash{$letter1}{$letter2} = $value;
}

At least, I think that's what you're looking for. If you really want you could use a regular expression, but that's probably overkill for just extracting 3 values out of a string.

EDIT: Of course, you could forgo the @list and just assemble the hash straight from the file. But that's your job to figure out, not mine.
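
For what it's worth, assembling the hash straight from the file might look something like the sketch below. The names `read_symbols` and `read_matrix` are made up for illustration, and in-memory filehandles stand in for the two text files. It also assumes the relation is symmetric (the question's expected output shows D/A = 1 although only "A D 1" appears in the input) and that pairs missing from the input should be 0.

```perl
use strict;
use warnings;

# Hypothetical helper: read whitespace-separated symbols from a filehandle.
sub read_symbols {
    my ($fh) = @_;
    my @symbols;
    while (my $line = <$fh>) {
        push @symbols, split ' ', $line;
    }
    return @symbols;
}

# Hypothetical helper: build the hash of hashes directly from the table file,
# keeping only rows where both symbols are selected.
sub read_matrix {
    my ($fh, @symbols) = @_;
    my %selected = map { $_ => 1 } @symbols;
    my %hash;

    # Start every selected pair at 0 so the diagonal and missing pairs are defined.
    for my $row (@symbols) {
        $hash{$row}{$_} = 0 for @symbols;
    }

    while (my $line = <$fh>) {
        my ($letter1, $letter2, $value) = split ' ', $line;
        next unless defined $value;    # skip blank or short lines
        next unless $selected{$letter1} && $selected{$letter2};
        # Assumption: the relation is symmetric, so fill both directions.
        $hash{$letter1}{$letter2} = $hash{$letter2}{$letter1} = $value;
    }
    return %hash;
}

# In-memory filehandles stand in for the two text files here.
my $table_text  = "A B 1\nA C 2\nA D 1\nB A 3\nC D 2\nA E 1\nE D 2\nC B 2\n";
my $symbol_text = "A\nD\nE\n";
open my $table_fh,  '<', \$table_text  or die "cannot open table: $!";
open my $symbol_fh, '<', \$symbol_text or die "cannot open symbols: $!";

my @symbols = read_symbols($symbol_fh);
my %hash    = read_matrix($table_fh, @symbols);

# Print the header row, then one row per selected symbol.
print join(' ', '_', @symbols), "\n";
for my $row (@symbols) {
    print join(' ', $row, map { $hash{$row}{$_} } @symbols), "\n";
}
```

With real files you would open them by name instead of by scalar reference. If the table is not meant to be symmetric, drop the second assignment and only `$hash{$letter1}{$letter2}` gets set.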

や莫失莫忘 2024-07-20 11:37:28

You can try this with awk:

awk -f matrix.awk yourfile.txt > newfile.matrix.txt

where matrix.awk is:

BEGIN {
   OFS="\t"
}
{
  row[$1,$2]=$3
  if (!($2 in f2)) { header=(header)?header OFS $2:$2; f2[$2] }
  # Record each row label once, even when the input is not grouped by column 1
  if (!($1 in f1)) { f1[$1]; col1[++c]=$1 }
}
END {
  # Leave the top-left cell empty so the header lines up with the data columns
  printf("%s%s\n", OFS, header)
  ncol=split(header,colA,OFS)
  for(i=1;i<=c;i++) {
    printf("%s", col1[i])
    for(j=1;j<=ncol;j++)
      printf("%s%s%s", OFS, row[col1[i],colA[j]], (j==ncol)?ORS:"")
  }
}

网名女生简单气质 2024-07-20 11:37:28

Another way to do this would be to make a two-dimensional array:

my @fArray = ();
## Set the 0,0th element to "_"
push @{$fArray[0]}, '_';

## Assuming that the first line is the range of characters to skip, e.g. BC
chomp(my $skipExpr = <>);

while(<>) {
    my ($xVar, $yVar, $val) = split;

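    ## Skip this line if it contains any of the characters to skip
    ## (a character class, so "BC" matches B or C rather than the literal "BC")
    next if (/[$skipExpr]/);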

    ## Check if these elements have already been added in your array
    checkExists($xVar);
    checkExists($yVar);

    ## Find their position
    my ($xPos, $yPos);
    for my $i (1..$#fArray) {
        $xPos = $i if ($fArray[0][$i] eq $xVar);
        $yPos = $i if ($fArray[0][$i] eq $yVar);
    }

    ## Set the value 
    $fArray[$xPos][$yPos] = $fArray[$yPos][$xPos] = $val;
}

## Print array
for my $i (0..$#fArray) {
    for my $j (0..$#{$fArray[$i]}) {
        print "$fArray[$i][$j]", " ";
    }
    print "\n";
}

sub checkExists {
    ## Checks if the corresponding array element exists,
    ## else creates and initialises it.
    my $nElem = shift;
    ## Count the header entries equal to $nElem (0 when it is new)
    my $found = grep { $_ eq $nElem } @{$fArray[0]};

    if( $found == 0 ) {
        ## Create its corresponding column
        push @{$fArray[0]}, $nElem;

        ## and row entry.
        push @fArray, [$nElem];

        ## Get its array index
        my $newIndex = $#fArray;

        ## Initialise its corresponding column and rows with '_'
        ## this is done to enable easy output when printing the array
        for my $i (1..$#fArray) {
            $fArray[$newIndex][$i] = $fArray[$i][$newIndex] = '_';
        }

        ## Set the intersection cell value to 0
        $fArray[$newIndex][$newIndex] = 0;
    }
}

I am not too proud of the way I have handled references, but please bear with a beginner here (leave your suggestions/changes in the comments). The hash method mentioned above by Chris sounds a lot easier (not to mention a lot less typing).
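
A more compact variant of the same two-dimensional array idea is sketched below, in case it helps. It is not the code above: it takes the list of selected symbols up front (rather than a skip expression) and uses a hash to map each symbol to its row/column index, which avoids the linear search. The name `build_table` is made up for illustration, and the sketch assumes a symmetric relation with 0 for pairs that never appear.

```perl
use strict;
use warnings;

# Hypothetical helper: build an array of arrays with a header row and a
# label column, keeping only pairs where both symbols are selected.
sub build_table {
    my ($lines, $symbols) = @_;
    my $count = scalar @$symbols;

    # Map each selected symbol to its row/column index (index 0 holds labels).
    my %pos;
    @pos{@$symbols} = (1 .. $count);

    # Header row, then one zero-filled row per symbol.
    my @table = (['_', @$symbols]);
    push @table, [$_, (0) x $count] for @$symbols;

    for my $line (@$lines) {
        my ($x, $y, $value) = split ' ', $line;
        next unless defined $value;    # skip blank or short lines
        my ($i, $j) = ($pos{$x}, $pos{$y});
        next unless $i && $j;          # skip unselected symbols
        # Assumption: the relation is symmetric, so fill both cells.
        $table[$i][$j] = $table[$j][$i] = $value;
    }
    return \@table;
}

my @lines = ('A B 1', 'A C 2', 'A D 1', 'B A 3', 'C D 2', 'A E 1', 'E D 2', 'C B 2');
my $table = build_table(\@lines, [qw(A D E)]);

# Print the table one row per line.
print join(' ', @$_), "\n" for @$table;
```

Because the symbols are known before the table is read, every row can be allocated at full width once, so no cell needs back-filling later.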

零時差 2024-07-20 11:37:28

CPAN has many potentially useful modules. I use Data::Table for many purposes. Data::Pivot also looks promising, but I have never used it.
