Cut lines in a file to equal length, but with new identifiers
I have a file in which every second line has a different length. I want to cut these lines to equal length (every second line of the output should be exactly 10 characters), each with a new identifier (every odd line).
File:
>ZQMK36301EDYQE
ZHZHHEXZZHHZZHHZZXHHHEHHHZZZHHHZHXZHZ
>ZQMK36301EEMJ9
ZZZXHZHHXHHHEZZEEZZHZZZZXEZ
>ZQMK36301EOEM5
ZXHXHZZHEHHHXZEZHXXXHXHHHHXEHHHZHHHH
Desired output:
>ZQMK36301EDYQE
ZHZHHEXZZH
>ZQMK36301EDYQE#2
HZZHHZZXHH
>ZQMK36301EDYQE#3
HEHHHZZZHH
>ZQMK36301EEMJ9
ZZZXHZHHXH
>ZQMK36301EEMJ9#2
HHEZZEEZZH
>ZQMK36301EOEM5
ZXHXHZZHEH
>ZQMK36301EOEM5#2
HHXZEZHXXX
>ZQMK36301EOEM5#3
HXHHHHXEHH
Take the first record as an example: its first line is the identifier (>ZQMK36301EDYQE) and its second line contains 37 characters. This should produce 3 sequences of equal length (i.e. 10 characters each); if fewer than 10 characters remain, that remainder (here 7 characters) is thrown away. Each new 10-character line gets an identifier that is the same as the identifier of the sequence it came from, followed by "#" and the chunk number (the first chunk keeps the original identifier unchanged). I want to do this for the whole file. Please help.
Thanks and Best regards,
Vikas
Comments (1)
As a one-liner:
`-n` reads the file line by line and stores each line in `$_`. `-l` autochomps the input. We assume the first line is a header and the second is data. `$i` is the counter, and it is reset for each new pair of lines. The `for` loop list is made on the fly by reading one line with `<>` and then extracting 10-character strings from it with a regex. Then we just print each header and chunk, making sure not to show the counter while it is still zero.
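For readers who prefer a script, the same idea can be sketched in Python. This is not the Perl one-liner described above, only an illustration under the same assumptions: lines strictly alternate between header and data, and any tail shorter than the chunk width is discarded. The function name `split_records` and the `width` parameter are hypothetical names chosen for this example.

```python
import re


def split_records(lines, width=10):
    """Yield (header, chunk) pairs from alternating header/data lines."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        # Assumption: the line right after a header is its data line.
        data = next(it, "").rstrip("\r\n")
        # Non-overlapping full-width matches; a shorter tail never matches,
        # so the leftover characters are dropped automatically.
        chunks = re.findall(".{%d}" % width, data)
        for i, chunk in enumerate(chunks, start=1):
            # The first chunk keeps the plain header; later ones get "#2", "#3", ...
            yield (header if i == 1 else f"{header}#{i}"), chunk


# Demo on the first two records from the question.
sample = [
    ">ZQMK36301EDYQE\n",
    "ZHZHHEXZZHHZZHHZZXHHHEHHHZZZHHHZHXZHZ\n",
    ">ZQMK36301EEMJ9\n",
    "ZZZXHZHHXHHHEZZEEZZHZZZZXEZ\n",
]
for header, chunk in split_records(sample):
    print(header)
    print(chunk)
```

To process a real file, an open file object can be passed in place of `sample`, since iterating over it also yields one line at a time.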