Nested loops with two arrays
I have two arrays, each with a variable number of elements, for instance:
chain=(B C)
hresname=(BMA MAN NAG NDG)
I am parsing a number of files that may contain elements from the array chain at a given position and elements of the array hresname at a different position (position is always fixed in both cases). This is a sample of the data:
ATOM 5792 CB MET D 213 49.385 -5.683 125.489 1.00142.66 C
ATOM 5793 CG MET D 213 50.834 -5.674 125.990 1.00154.50 C
ATOM 5794 SD MET D 213 51.530 -7.337 126.277 1.00164.73 S
ATOM 5795 CE MET D 213 52.854 -7.386 125.068 1.00169.73 C
HETATM 5797 C1 NAG B 323 70.090 50.934 125.869 1.00 86.35 C
HETATM 5798 C2 NAG B 323 69.687 52.074 126.879 1.00 95.95 C
HETATM 5799 C3 NAG B 323 68.377 52.740 126.390 1.00 87.65 C
HETATM 5800 C4 NAG B 323 68.598 53.314 125.014 1.00 83.97 C
First, I need to copy lines that start with ATOM and whose 5th column matches an element of the chain array into separate files, one per chain:
while read pdb ; do
    for c in "${chain[@]}" ; do
        #if [ ${#chain[@]} -eq 1 ] && \
        if [ $(echo "$pdb" | cut -c1-4) == "ATOM" ] && \
           [ $(echo "$pdb" | cut -c22-23) == "${chain[$c]}" ]; then
            echo "$pdb" >> ../../properpdb/${pdbid}_${chain[$c]}.pdb
        fi
    done
done < ${pdbid}.pdb
This works well (slow but sure). Both the commented and the uncommented versions work.
Next, I want to copy lines that start with HETATM and whose 4th column matches an element of hresname, but only if those lines also match an element from the chain array in column 5:
while read pdb ; do
    for c in "${chain[@]}" ; do
        for h in "${hresname[@]}" ; do
            if [ ${#chain[@]} -eq 1 ] && \
               [ $(echo "$pdb" | cut -c1-6) == "HETATM" ] && \
               [ $(echo "$pdb" | cut -c22-23) == "${chain[$c]}" ] \
               [ $(echo "$pdb" | cut -c18-20) == "${hresname[$h]}" ] ; then
                echo "$pdb" >> ../../properpdb/${pdbid}_${chain[$c]}.pdb
            fi
        done
    done
done < ${pdbid}.pdb
However, this does not work. I repeatedly receive an error:
line 66: [: too many arguments
Line 66 is:
[ $(echo "$pdb" | cut -c22-23) == "${chain[$c]}" ] \
Which puzzles me because the error happens even if I restrict the loop to chain arrays containing a single element.
According to other StackOverflow questions, it should be perfectly possible to do this in bash. Any idea what the problem could be?
Comments (1)
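You forgot to add && at the end of the line that the error points to. Change this line:
[ $(echo "$pdb" | cut -c22-23) == "${chain[$c]}" ] \
to:
[ $(echo "$pdb" | cut -c22-23) == "${chain[$c]}" ] && \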
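To see why a missing && produces this particular message, here is a minimal sketch. The variable names and values are made up for illustration and stand in for the fields cut out of one line. Without &&, the backslash simply joins the two lines, so the second [ ... ] is handed to the first [ command as extra arguments:

```shell
#!/usr/bin/env bash
# Hypothetical values standing in for the fields of one parsed line.
chain_id="B"
resname="NAG"

# Broken: no && before the line continuation, so bash runs ONE [ command
# with the arguments:  B == B ] [ NAG == NAG ]
broken_status=0
broken_msg=$( { [ "$chain_id" == "B" ] \
                [ "$resname" == "NAG" ]; } 2>&1 ) || broken_status=$?

# Fixed: && turns this into two separate [ commands.
if [ "$chain_id" == "B" ] && \
   [ "$resname" == "NAG" ]; then
    fixed_result="match"
else
    fixed_result="no match"
fi

echo "broken: status=$broken_status message=$broken_msg"
echo "fixed:  $fixed_result"
```

This also explains why the error does not depend on how many elements the arrays hold: the shell rejects the argument list of the test before any comparison is made.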
Update:
The script has several other errors as well. I fixed them and it now works. I suggest reading up on the for loop syntax in the bash manual first.
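The fixed script itself is not reproduced here, so what follows is only one possible correction, not the answerer's version. It addresses a second problem in the question's code: for c in "${chain[@]}" puts the element itself (B, C) into c, not an index, so ${chain[$c]} is wrong. Bash evaluates indexed-array subscripts arithmetically, an unset name such as B evaluates to 0, and the expression therefore always yields the first element. The sketch compares against "$c" and "$h" directly, merges the ATOM and HETATM passes into one loop, and drops the ${#chain[@]} -eq 1 restriction. The sample data, the pdbid value, and the temporary output directory are assumptions made so the example can run on its own, and the sample lines assume the standard fixed-width PDB layout (residue name in columns 18-20, chain ID in column 22):

```shell
#!/usr/bin/env bash
# Sketch of a corrected filter. The sample file, the pdbid value, and the
# output directory are made up so the example is self-contained.
chain=(B C)
hresname=(BMA MAN NAG NDG)
pdbid="sample"

workdir=$(mktemp -d)
outdir="$workdir/properpdb"
mkdir -p "$outdir"

# Truncated fixed-width PDB records (coordinates omitted for brevity).
cat > "$workdir/$pdbid.pdb" <<'EOF'
ATOM   5792  CB  MET D 213
ATOM   5793  CG  MET B 213
HETATM 5797  C1  NAG B 323
HETATM 5798  C2  NAG D 323
HETATM 5799  O1  HOH C 401
HETATM 5800  C1  MAN C 402
EOF

while IFS= read -r pdb; do
    record=${pdb:0:6}      # columns 1-6: record type
    resname=${pdb:17:3}    # columns 18-20: residue name
    chain_id=${pdb:21:1}   # column 22: chain ID

    for c in "${chain[@]}"; do
        # c is the array element itself, so compare against "$c".
        [ "$chain_id" == "$c" ] || continue

        if [ "$record" == "ATOM  " ]; then
            echo "$pdb" >> "$outdir/${pdbid}_${c}.pdb"
        elif [ "$record" == "HETATM" ]; then
            # Keep the line only if its residue name is in hresname.
            for h in "${hresname[@]}"; do
                if [ "$resname" == "$h" ]; then
                    echo "$pdb" >> "$outdir/${pdbid}_${c}.pdb"
                    break
                fi
            done
        fi
    done
done < "$workdir/$pdbid.pdb"

kept_b=$(cat "$outdir/${pdbid}_B.pdb")
kept_c=$(cat "$outdir/${pdbid}_C.pdb")
printf 'chain B:\n%s\nchain C:\n%s\n' "$kept_b" "$kept_c"
```

Taking the fields with ${pdb:offset:length} instead of echo | cut avoids starting several subprocesses per line, which is the main reason the original loop is slow. Because both record types are handled in one pass, lines are written in file order rather than all ATOM lines first.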