Iterating over a text file and storing the minimum value in a dictionary
I have a very large text file (Summary_post_docking.txt) and I want to filter it to find the lowest scores.
This is what I came up with:
class Ranker:
    def __init__(self):
        self.results = {}
        with open('HTS_post_docking/Summary_post_docking.txt', 'r') as summary:
            for line in summary:
                score = float(line.split()[2])
                frag_name = str(line.split()[0].split('/')[9]).split('_')[0]
                if 0 >= score >= -200:
                    self.results[frag_name] = score
                    old = self.results[frag_name]
                    if frag_name in self.results.keys():
                        new = float(line.split()[2])
                        if new < old:
                            self.results[frag_name] = new
        print(self.results)
Unfortunately, all this does is store the last value it reads for each molecule; it never keeps the lower one.
str(line.split()[0].split('/')[9]).split('_')[0] is the name of the molecule, while float(line.split()[2]) is the score associated with it.
I want the script to store the name of the molecule as the key and the score as the value. For every line, every time it finds a lower score for the same key, I want it to update the value, so that each key ends up with the smallest score found.
EDIT:
I'm including a few lines from the txt file:
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose1 SCORE_sum: -70.13763978228677 avg_score: -0.7 SD_score: 0.44 avg_GBSA: -5.92 SD_GBSA: 2.96 avg_RMSD: 9.75 SD_RMSD: 3.49
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose2 SCORE_sum: -18.39638945104759 avg_score: -0.18 SD_score: 0.26 avg_GBSA: -5.2 SD_GBSA: 4.57 avg_RMSD: 34.57 SD_RMSD: 9.29
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose3 SCORE_sum: -206.23402454507794 avg_score: -2.06 SD_score: 1.15 avg_GBSA: -6.8 SD_GBSA: 1.66 avg_RMSD: 4.05 SD_RMSD: 1.73
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose4 SCORE_sum: -27.56483931516906 avg_score: -0.28 SD_score: 0.64 avg_GBSA: -2.2 SD_GBSA: 3.13 avg_RMSD: 15.43 SD_RMSD: 6.74
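Judging from the sample lines above, each line appears to be a path followed by alternating "label:" and number tokens. A minimal sketch of reading the values by label instead of by position might look like the following. The function name parse_summary_line is hypothetical, and the alternating layout is an assumption based only on these four lines.

```python
def parse_summary_line(line):
    """Split one summary line into (pose path, {label: value}).

    Assumes the layout seen in the sample lines: a path, then
    alternating "label:" and numeric tokens.
    """
    path, *rest = line.split()
    # Pair every label token with the number that follows it
    values = {
        label.rstrip(':'): float(number)
        for label, number in zip(rest[0::2], rest[1::2])
    }
    return path, values
```

With this, the score used in the question would be values['SCORE_sum'] rather than line.split()[2].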
I have updated the code as suggested!
The script needs to update the value associated with the key to the lowest score it finds.
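For reference, the behaviour described here (keep the lowest in-range score per molecule) could be sketched as below. This is only one possible approach, not the fix that was actually applied. The function name lowest_scores and the low/high parameters are hypothetical, and taking the last path component with rsplit is an assumption that replaces the fixed index [9] used in the question.

```python
def lowest_scores(lines, low=-200.0, high=0.0):
    """Return {molecule name: lowest score seen} for scores within [low, high]."""
    best = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Last path component, e.g. "Z385446130_pose1" -> "Z385446130"
        name = fields[0].rsplit('/', 1)[-1].split('_')[0]
        score = float(fields[2])
        if not (low <= score <= high):
            continue  # same range filter as in the question
        # Store the score only if the key is new or the score is lower
        if name not in best or score < best[name]:
            best[name] = score
    return best

# Possible usage with the file from the question:
# with open('HTS_post_docking/Summary_post_docking.txt') as summary:
#     print(lowest_scores(summary))
```

The key difference from the code in the question is that the comparison happens before the assignment, so an existing lower value is never overwritten.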
Comments (2)
Your old value could be None. Also, shouldn't the old value be tracked per molecule? Your code doesn't do that.
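One reading of this comment is that the previous value should be looked up per molecule and may not exist yet. A small sketch of that idea, using dict.get (which returns None for a missing key), is shown below. The helper name update_lowest is hypothetical.

```python
def update_lowest(results, name, score):
    """Keep the lowest score per molecule name in the results dict."""
    old = results.get(name)  # None if this molecule has not been seen yet
    if old is None or score < old:
        results[name] = score
    return results
```

Called once per line with the parsed name and score, this leaves each key holding the smallest value passed in for it.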
Solved!