Split a file based on a pattern across two consecutive lines
I have files with the following format:
ATOM 3736 CB THR A 486 -6.552 153.891 -7.922 1.00115.15 C
ATOM 3737 OG1 THR A 486 -6.756 154.842 -6.866 1.00114.94 O
ATOM 3738 CG2 THR A 486 -7.867 153.727 -8.636 1.00115.11 C
ATOM 3739 OXT THR A 486 -4.978 151.257 -9.140 1.00115.13 O
HETATM10351 C1 NAG A 203 33.671 87.279 39.456 0.50 90.22 C
HETATM10483 C1 NAG A 702 28.025 104.269 -27.569 0.50 92.75 C
ATOM 3736 CB THR B 486 -6.552 86.240 7.922 1.00115.15 C
ATOM 3737 OG1 THR B 486 -6.756 85.289 6.866 1.00114.94 O
ATOM 3738 CG2 THR B 486 -7.867 86.404 8.636 1.00115.11 C
ATOM 3739 OXT THR B 486 -4.978 88.874 9.140 1.00115.13 O
HETATM10351 C1 NAG B 203 33.671 152.852 -39.456 0.50 90.22 C
HETATM10639 C2 FUC B 402 -48.168 162.221 -22.404 0.50103.03 C
I would like to split the file after each line starting with HETATM* but only if the next line starts with ATOM. I would like the new files to be called $basename_$column, where $basename is the base name of the input file and $column is the character at position 22-23 (either A or B, in the example). I am not able to figure out how to check both consecutive lines to determine the splitting point.
Comments (2)
Here's an awk version. Use FILENAME instead of file to create the same file name.
Here's a simple Python solution with no error checking. Should work in Python 2 or 3; change the first line to match your environment. Don't take this as an example of good coding style.
Edited for unique file names.
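As a rough illustration of the approach described above, the sketch below remembers the previous line and starts a new output file whenever an ATOM line directly follows a HETATM line. It is not the answer's original code. The function name split_pdb, its out_dir and chain_col parameters, the "unknown" fallback and the numeric suffix for repeated chain IDs are all assumptions made for this example. It also assumes fixed-width PDB lines with the chain ID in column 22 (0-based index 21), and that "base name" means the file name without its extension.

```python
import os


def split_pdb(path, out_dir=".", chain_col=21):
    """Split a PDB-style file after each HETATM run that is followed by ATOM.

    Pieces are written to out_dir as "<basename>_<chain>", where <chain> is
    the character at 0-based index chain_col of the piece's first line.
    Returns the output paths in the order they were created.
    """
    base = os.path.splitext(os.path.basename(path))[0]
    created = []
    out = None
    prev = ""
    with open(path) as src:
        for line in src:
            # A new piece starts at the first line, and wherever an ATOM
            # line directly follows a HETATM line.
            if out is None or (prev.startswith("HETATM")
                               and line.startswith("ATOM")):
                if out is not None:
                    out.close()
                # Assumption: fall back to "unknown" if the column is blank.
                chain = line[chain_col:chain_col + 1].strip() or "unknown"
                name = os.path.join(out_dir, "%s_%s" % (base, chain))
                # Assumption: if a chain ID repeats, add a counter so an
                # earlier piece is not overwritten.
                candidate, n = name, 1
                while candidate in created:
                    n += 1
                    candidate = "%s_%d" % (name, n)
                out = open(candidate, "w")
                created.append(candidate)
            out.write(line)
            prev = line
    if out is not None:
        out.close()
    return created
```

Calling split_pdb("sample.pdb") on a file laid out like the example in the question would be expected to write sample_A and sample_B to the current directory. If the lines are whitespace-separated rather than fixed-width, the chain ID would have to be taken from line.split() instead of a character index.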