Rename files matching strings in a txt file

Posted 2025-01-28 06:36:16 | 727 characters | 2 views | 0 comments

I am trying to rename multiple files according to matches in a .txt file.
My files are:

GCF_000698265.1_ASM69826v1_genomic.gff.gz
GCF_000785125.1_ASM78512v1_genomic.gff.gz
GCF_000934565.1_ASM93456v1_genomic.gff.gz
GCF_000963495.1_ASM96349v1_genomic.gff.gz

My tab-separated txt file looks like this:

GCF_000698265.1_ASM69826v1  Pseudomonas_str1
GCF_000785125.1_ASM78512v1  Pseudomonas_str2
GCF_000934565.1_ASM93456v1  Pseudomonas_str3
GCF_000963495.1_ASM96349v1  Pseudomonas_str4

So, for each filename that matches the first column of the txt file, I want to replace the matching part with the second column.
I was trying to work out how to do it by piping awk into mv, but I got lost.
I would like the output to look like this:

Pseudomonas_str1_genomic.gff.gz
Pseudomonas_str2_genomic.gff.gz
Pseudomonas_str3_genomic.gff.gz
Pseudomonas_str4_genomic.gff.gz

Can anybody help with this?
I hope I was clear, and thanks a lot!

Comments (3)

半世蒼涼 2025-02-04 06:36:17

So I created a fairly sizable synthetic test set and intentionally made only 1/7th of the entries match. There are no duplicates anywhere, since the synthetic file names are all based on a unique list of primes, and the files are in shuffled order.

  254923  19113991  19113991 test_rename_output_2b.txt
  254923  15545069  15545069 test_need_to_rename_2.txt
 1784459  53025088  53025088 test_ref_lookup_2.txt
 2294305  87684148  87684148 total

# gawk profile, created Thu May 12 03:57:36 2022

    # Rule(s)

     1  FNR == NR { # 1
1784459     do {
1784459         __[$!_]
1784459         getline
        } while (FNR == NR)
    }

     1  FNR != NR { # 1
254923      do {
254923          if ($!_ in __) { 
                    printf("gmv -vn \47%s\47 "\
                                    "\47%s\47 ;\n",$!!NF,$NF)
            }
        } while (getline)
    }
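
For readers who find the profile hard to follow: `$!_` and `$!!NF` both evaluate to `$1` on non-empty lines, and `__` is just the lookup array. A more conventional spelling of the same two-file lookup might look like the sketch below. The file names keys.txt and map.txt are hypothetical stand-ins for the two inputs, the tab field separator is an assumption, and plain `mv` is printed instead of `gmv`. As far as I can tell it behaves the same way, but treat it as an illustration rather than the author's exact program.

```shell
#!/bin/sh
# Sketch of the two-file lookup with readable names.
# keys.txt and map.txt are hypothetical sample inputs created in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'file1.txt\nfile3.txt\n' > keys.txt
printf 'file1.txt\tcat.txt\nfile2.txt\tdog.txt\nfile3.txt\tfish.txt\n' > map.txt

awk -F'\t' '
    NR == FNR { seen[$1]; next }     # first file: remember each key
    $1 in seen {                     # second file: keep rows whose key was seen
        printf "mv -vn \047%s\047 \047%s\047 ;\n", $1, $NF
    }
' keys.txt map.txt > commands.sh

# Nothing is renamed here; the commands are only printed.
cat commands.sh
```

The `NR == FNR` test is true only while the first file is being read, which is what the `FNR == NR` rule with its `getline` loop achieves in the profile.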

The advantage of this solution is that its output is already formatted as mv commands, ready for direct renaming (sample output):

gmv -vn 'file522111333101.txt' 'newname_799042B2ED_.txt' ;
gmv -vn 'file2011113799793759.txt' 'newname_72518EBA3BC5F_.txt' ;
gmv -vn 'file476743673269.txt' 'newname_6F002325B5_.txt' ;
gmv -vn 'file7979798079897989.txt' 'newname_1C599585EE8185_.txt' ;
gmv -vn 'file211031042203.txt' 'newname_31226E289B_.txt' ;
gmv -vn 'file172888842428207.txt' 'newname_9D3DD209DF2F_.txt' ;

To play it safe, I've added the no-clobber (don't overwrite) flag -n to every rename command. The output can be sent straight into something lightweight like dash to execute, without any further filename manipulation.
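
Applied to the file names in the question, the generate-then-execute idea could be sketched roughly as follows. This assumes the mapping file is called rename.txt, that it is tab-separated, and that every file shares the `_genomic.gff.gz` suffix; the scratch directory and empty sample files exist only to make the example self-contained. Note that `-n` is supported by GNU and BSD mv but is not required by POSIX, and the single-quoting would break on names that themselves contain a single quote.

```shell
#!/bin/sh
# Sketch: print the mv commands first, inspect them, then run them.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
suffix=_genomic.gff.gz

# Hypothetical sample data mirroring the question.
for id in GCF_000698265.1_ASM69826v1 GCF_000785125.1_ASM78512v1; do
    : > "${id}${suffix}"
done
printf 'GCF_000698265.1_ASM69826v1\tPseudomonas_str1\n' > rename.txt
printf 'GCF_000785125.1_ASM78512v1\tPseudomonas_str2\n' >> rename.txt

# Step 1: write the commands to a file so they can be checked by eye.
awk -F'\t' -v sfx="$suffix" '
    { printf "mv -n \047%s%s\047 \047%s%s\047\n", $1, sfx, $2, sfx }
' rename.txt > commands.sh
cat commands.sh

# Step 2: execute them once they look right.
sh commands.sh
ls
```

Keeping the review step separate from the execution step costs one extra file but makes a bad mapping visible before anything is renamed.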

Performance is acceptable, I suppose:

  • mawk2 took 1.218 secs to complete all steps (including writing the final output file to disk).
孤千羽 2025-02-04 06:36:16

Using sed and bash, assuming the txt file is named 'rename.txt':

sed 's/^/mv /' rename.txt | bash

Using awk:

awk '{system("mv " $1 " " $2)}' rename.txt

The key here is to insert "mv " at the beginning of each line and execute the result.

This last solution does not use any external tool, just bash:

while read old new; do mv "$old" "$new"; done < rename.txt

Update

Based on Alberto's updated question, here are the changes:

Using sed:

sed 's/^/mv /;s/$/_genomic.gff.gz/' rename.txt | bash

Note: the ;s/$/_genomic.gff.gz/ expression says: find the end of the line and append "_genomic.gff.gz" to it. This works only if the lines have no trailing spaces.

Using awk:

awk '{system("mv " $1 " " $2 "_genomic.gff.gz")}' rename.txt

Using Bash:

while read old new; do mv "$old" "${new}_genomic.gff.gz"; done < rename.txt
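
One caveat with the question as posted: the first column of the txt file does not include the `_genomic.gff.gz` suffix, while the files on disk do, so the source name probably needs the suffix appended as well. A minimal sketch of the loop under that assumption follows; the scratch directory, the sample files, and the third row with no matching file are hypothetical and only there to make the example runnable. It splits on tabs only and skips rows whose source file is missing.

```shell
#!/bin/sh
# Sketch: append the shared suffix to both the old and the new name.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
suffix=_genomic.gff.gz
tab=$(printf '\t')

# Hypothetical sample data; the third row deliberately has no file on disk.
: > "GCF_000698265.1_ASM69826v1${suffix}"
: > "GCF_000785125.1_ASM78512v1${suffix}"
printf 'GCF_000698265.1_ASM69826v1\tPseudomonas_str1\n' > rename.txt
printf 'GCF_000785125.1_ASM78512v1\tPseudomonas_str2\n' >> rename.txt
printf 'GCF_000934565.1_ASM93456v1\tPseudomonas_str3\n' >> rename.txt

while IFS="$tab" read -r old new; do
    src="${old}${suffix}"
    if [ -e "$src" ]; then
        # -n refuses to overwrite an existing target (GNU and BSD mv).
        mv -n -- "$src" "${new}${suffix}"
    else
        echo "skipping: $src not found" >&2
    fi
done < rename.txt
ls
```

The existence check keeps a stale row in the mapping from producing an mv error.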
酒几许 2025-02-04 06:36:16

was trying to understand how to do it piping mv and awk

You might use AWK to prepare a series of commands, which you then feed to bash as standard input. Be warned that your case

file1.txt   cat.txt  
file2.txt   dog.txt
file3.txt   fish.txt
file4.txt   mouse.txt

is a special case, as there are no spaces in the filenames. If spaces are prohibited in names, you can simply prepend mv to each line. For example, if said file is named renaming.txt, you might do

awk '{print "mv " $0}' renaming.txt | bash

However, this will fail if there is a space in any name. If spaces are allowed, I suggest using Python (which is likely installed if you use a Linux machine) in the following way: create a file renamer.py with the following content

import os

# Each line of renaming.txt holds: current name <TAB> desired name
with open("renaming.txt", "r") as f:
    for line in f:
        src, dst = line.rstrip().split("\t")
        os.rename(src, dst)

where renaming.txt is the name of a file with 2 tab-separated columns holding the current name and the desired name, then use it as follows

python renamer.py

How it works: it opens renaming.txt for reading (r); for each line it strips trailing whitespace (including the newline) and splits the line at the TAB character. The 1st part goes to src and the 2nd to dst, which are then passed to the os.rename function.

You might choose another language for this, preferably one that has functions for managing files, as this will make developing code for this task easier.
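
For instance, the same tab-split rename can be sketched in plain shell, which also copes with spaces in names as long as the two columns are separated by a single TAB. The file names below are hypothetical, and the scratch directory is only there to make the example self-contained.

```shell
#!/bin/sh
# Sketch: split each line on TAB only, so spaces inside names survive.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
tab=$(printf '\t')

# Hypothetical sample files with spaces in their names.
: > 'old name 1.txt'
: > 'old name 2.txt'
printf 'old name 1.txt\tnew name 1.txt\n' > renaming.txt
printf 'old name 2.txt\tnew name 2.txt\n' >> renaming.txt

while IFS="$tab" read -r src dst; do
    # Quoting keeps each name as one argument; -- guards against names starting with a dash.
    mv -- "$src" "$dst"
done < renaming.txt
ls
```

Unlike piping generated text into bash, the names here are never re-parsed by a shell, so quotes or other special characters in them do not need escaping.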
