如何优化 Python 字典的使用？

发布于 2024-10-16 02:02:21 字数 18290 浏览 2 评论 0原文

问题：

我已经对我的 Python 程序进行了彻底的分析，并且有一个函数正在减慢一切。它大量使用Python字典，所以我可能没有以最好的方式使用它们。如果我不能让它运行得更快，我将不得不用C++重新编写它，那么有谁可以帮助我用Python优化它吗？

我希望我已经给出了正确的解释，并且您可以理解我的代码！预先感谢您的任何帮助。

我的代码：

这是有问题的函数，使用 line_profiler 和 kernprof 进行分析。我正在运行 Python 2.7，

我对第 363、389 和 405 行等内容特别感到困惑，其中比较两个变量的 if 语句似乎花费了过多的时间。

我考虑过使用 NumPy （因为它可以稀疏矩阵），但我认为它不合适，因为：（1 ）我没有使用整数索引我的矩阵（我正在使用对象实例）； (2) 我没有在矩阵中存储简单的数据类型（我存储浮点数和对象实例的元组）。但我愿意被说服接受 NumPy。如果有人了解 NumPy 的稀疏矩阵性能与 Python 哈希表的比较，我会很感兴趣。

抱歉，我没有给出一个可以运行的简单示例，但是这个函数被绑定在一个更大的项目中，我无法弄清楚如何设置一个简单的示例来测试它，而不给你一半的代码根据！

Timer unit: 3.33366e-10 s
File: routing_distances.py
Function: propagate_distances_node at line 328
Total time: 807.234 s

Line #   Hits         Time  Per Hit   % Time  Line Contents
328                                               @profile
329                                               def propagate_distances_node(self, node_a, cutoff_distance=200):
330                                                       
331                                                   # a makes sure its immediate neighbours are correctly in its distance table
332                                                   # because its immediate neighbours may change as binds/folding change
333    737753   3733642341   5060.8      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
334    512120   2077788924   4057.2      0.1              use_neighbour_link = False
335                                                       
336    512120   2465798454   4814.9      0.1              if(node_b not in self.node_distances[node_a]): # a doesn't know distance to b
337     15857     66075687   4167.0      0.0                  use_neighbour_link = True
338                                                       else: # a does know distance to b
339    496263   2390534838   4817.1      0.1                  (node_distance_b_a, next_node) = self.node_distances[node_a][node_b]
340    496263   2058112872   4147.2      0.1                  if(node_distance_b_a > neighbour_distance_b_a): # neighbour distance is shorter
341        81       331794   4096.2      0.0                      use_neighbour_link = True
342    496182   2665644192   5372.3      0.1                  elif((None == next_node) and (float('+inf') == neighbour_distance_b_a)): # direct route that has just broken
343        75       313623   4181.6      0.0                      use_neighbour_link = True
344                                                               
345    512120   1992514932   3890.7      0.1              if(use_neighbour_link):
346     16013     78149007   4880.3      0.0                  self.node_distances[node_a][node_b] = (neighbour_distance_b_a, None)
347     16013     83489949   5213.9      0.0                  self.nodes_changed.add(node_a)
348                                                           
349                                                           ## Affinity distances update
350     16013     86020794   5371.9      0.0                  if((node_a.type == Atom.BINDING_SITE) and (node_b.type == Atom.BINDING_SITE)):
351       164      3950487  24088.3      0.0                      self.add_affinityDistance(node_a, node_b, self.chemistry.affinity(node_a.data, node_b.data))     
352                                                   
353                                                   # a sends its table to all its immediate neighbours
354    737753   3549685140   4811.5      0.1          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
355    512120   2129343210   4157.9      0.1              node_b_changed = False
356                                               
357                                                       # b integrates a's distance table with its own
358    512120   2203821081   4303.3      0.1              node_b_chemical = node_b.chemical
359    512120   2409257898   4704.5      0.1              node_b_distances = node_b_chemical.node_distances[node_b]
360                                                       
361                                                       # For all b's routes (to c) that go to a first, update their distances
362  41756882 183992040153   4406.3      7.6              for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
363  41244762 172425596985   4180.5      7.1                  if(node_after_b == node_a):
364                                                               
365  16673654  64255631616   3853.7      2.7                      try:
366  16673654  88781802534   5324.7      3.7                          distance_b_a_c = neighbour_distance_b_a + self.node_distances[node_a][node_c][0]
367    187083    929898684   4970.5      0.0                      except KeyError:
368    187083   1056787479   5648.8      0.0                          distance_b_a_c = float('+inf')
369                                                                   
370  16673654  69374705256   4160.7      2.9                      if(distance_b_c != distance_b_a_c): # a's distance to c has changed
371    710083   3136751361   4417.4      0.1                          node_b_distances[node_c] = (distance_b_a_c, node_a)
372    710083   2848845276   4012.0      0.1                          node_b_changed = True
373                                                                   
374                                                                   ## Affinity distances update
375    710083   3484577241   4907.3      0.1                          if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
376     99592   1591029009  15975.5      0.1                              node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
377                                                                   
378                                                               # If distance got longer, then ask b's neighbours to update
379                                                               ## TODO: document this!
380  16673654  70998570837   4258.1      2.9                      if(distance_b_a_c > distance_b_c):
381                                                                   #for (node, neighbour_distance) in node_b_chemical.neighbours[node_b].iteritems():
382   1702852   7413182064   4353.4      0.3                          for node in node_b_chemical.neighbours[node_b]:
383   1204903   5912053272   4906.7      0.2                              node.chemical.nodes_changed.add(node)
384                                                       
385                                                       # Look for routes from a to c that are quicker than ones b knows already
386  42076729 184216680432   4378.1      7.6              for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
387                                                           
388  41564609 171150289218   4117.7      7.1                  node_b_update = False
389  41564609 172040284089   4139.1      7.1                  if(node_c == node_b): # a-b path
390    512120   2040112548   3983.7      0.1                      pass
391  41052489 169406668962   4126.6      7.0                  elif(node_after_a == node_b): # a-b-a-b path
392  16251407  63918804600   3933.1      2.6                      pass
393  24801082 101577038778   4095.7      4.2                  elif(node_c in node_b_distances): # b can already get to c
394  24004846 103404357180   4307.6      4.3                      (distance_b_c, node_after_b) = node_b_distances[node_c]
395  24004846 102717271836   4279.0      4.2                      if(node_after_b != node_a): # b doesn't already go to a first
396   7518275  31858204500   4237.4      1.3                          distance_b_a_c = neighbour_distance_b_a + distance_a_c
397   7518275  33470022717   4451.8      1.4                          if(distance_b_a_c < distance_b_c): # quicker to go via a
398    225357    956440656   4244.1      0.0                              node_b_update = True
399                                                           else: # b can't already get to c
400    796236   3415455549   4289.5      0.1                      distance_b_a_c = neighbour_distance_b_a + distance_a_c
401    796236   3412145520   4285.3      0.1                      if(distance_b_a_c < cutoff_distance): # not too for to go
402    593352   2514800052   4238.3      0.1                          node_b_update = True
403                                                                   
404                                                           ## Affinity distances update
405  41564609 164585250189   3959.7      6.8                  if node_b_update:
406    818709   3933555120   4804.6      0.2                      node_b_distances[node_c] = (distance_b_a_c, node_a)
407    818709   4151464335   5070.7      0.2                      if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
408    104293   1704446289  16342.9      0.1                          node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
409    818709   3557529531   4345.3      0.1                      node_b_changed = True
410                                                       
411                                                       # If any of node b's rows have exceeded the cutoff distance, then remove them
412  42350234 197075504439   4653.5      8.1              for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
413  41838114 180297579789   4309.4      7.4                  if(distance_b_c > cutoff_distance):
414    206296    894881754   4337.9      0.0                      del node_b_distances[node_c]
415    206296    860508045   4171.2      0.0                      node_b_changed = True
416                                                               
417                                                               ## Affinity distances update
418    206296   4698692217  22776.5      0.2                      node_b_chemical.del_affinityDistance(node_b, node_c)
419                                                       
420                                                       # If we've modified node_b's distance table, tell its chemical to update accordingly
421    512120   2130466347   4160.1      0.1              if(node_b_changed):
422    217858   1201064454   5513.1      0.0                  node_b_chemical.nodes_changed.add(node_b)
423                                                   
424                                                   # Remove any neighbours that have infinite distance (have just unbound)
425                                                   ## TODO: not sure what difference it makes to do this here rather than above (after updating self.node_distances for neighbours)
426                                                   ##       but doing it above seems to break the walker's movement
427    737753   3830386968   5192.0      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].items(): # Can't use iteritems() here, as deleting from the dictionary
428    512120   2249770068   4393.1      0.1              if(neighbour_distance_b_a > cutoff_distance):
429       150       747747   4985.0      0.0                  del self.neighbours[node_a][node_b]
430                                                           
431                                                           ## Affinity distances update
432       150      2148813  14325.4      0.0                  self.del_affinityDistance(node_a, node_b)

我的代码说明：

该函数维护一个稀疏距离矩阵，表示（非常大的）网络中节点之间的网络距离（最短路径上的边权重之和）。要处理完整的表格并使用 Floyd-Warshall 算法将非常慢。（我首先尝试了这个，它比当前版本慢了几个数量级。）因此我的代码使用稀疏矩阵来表示全距离矩阵的阈值版本（任何距离大于 200 个单位的路径都将被忽略）。网络拓扑随时间变化，因此该距离矩阵需要随时间更新。为此，我使用距离矢量路由协议的粗略实现：网络中的每个节点都知道到每个其他节点以及路径上的下一个节点的距离。当拓扑发生变化时，与此变化相关的节点相应地更新其距离表，并告知其直接邻居。信息通过网络传播，节点将距离表发送给邻居，邻居更新距离表并将其传播给邻居。

有一个表示距离矩阵的对象：self.node_distances。这是一个将节点映射到路由表的字典。节点是我定义的一个对象。路由表是将节点映射到（距离，next_node）元组的字典。 Distance 是从node_a 到node_b 的图距离，next_node 是在node_a 和node_b 之间的路径上必须首先访问的node_a 的邻居。 next_node 为 None 表示node_a 和node_b 是图邻居。例如，距离矩阵的示例可以是：

self.node_distances = { node_1 : { node_2 : (2.0, None),
                                   node_3 : (5.7, node_2),
                                   node_5 : (22.9, node_2) },
                        node_2 : { node_1 : (2.0, None),
                                   node_3 : (3.7, None),
                                   node_5 : (20.9, node_7)},
                        ...etc...

由于拓扑变化，相距较远（或根本未连接）的两个节点可能会变得很近。发生这种情况时，条目将添加到该矩阵中。由于阈值处理，两个节点可能会变得相距太远而无法关心。发生这种情况时，条目将从该矩阵中删除。

self.neighbours 矩阵与 self.node_distances 类似，但包含有关网络中直接链接（边）的信息。 self.neighbours 通过化学反应不断从外部对该函数进行修改。这就是网络拓扑变化的来源。

我遇到问题的实际函数： propagate_distances_node() 执行距离矢量路由协议。给定一个节点 node_a，该函数确保 node_a 的邻居在距离矩阵中正确（拓扑更改）。然后，该函数将 node_a 的路由表发送到网络中 node_a 的所有直接邻居。它将node_a的路由表与每个邻居自己的路由表集成。

在我的程序的其余部分中，重复调用 propagate_distances_node() 函数，直到距离矩阵收敛。维护一个self.nodes_changed 集合，其中包含自上次更新以来已更改其路由表的节点。在我的算法的每次迭代中，都会选择这些节点的随机子集，并对它们调用 propagate_distances_node()。这意味着节点异步且随机地传播其路由表。当集合 self.nodes_changed 变空时，该算法收敛于真实距离矩阵。

“亲和距离”部分（add_affinityDistance 和 del_affinityDistance）是距离矩阵的（小）子矩阵的缓存，由距离矩阵的不同部分使用。程序。

我这样做的原因是我正在模拟参与反应的化学物质的计算类似物，作为我博士学位的一部分。 “化学”是“原子”的图（图中的节点）。模拟两种化学物质结合在一起，因为它们的两个图形通过新边连接。发生了化学反应（通过与此处无关的复杂过程），改变了图的拓扑结构。但反应中发生的情况取决于构成化学物质的不同原子之间的距离。因此，对于模拟中的每个原子，我想知道它与哪些其他原子接近。稀疏的阈值距离矩阵是存储此信息的最有效方法。由于网络的拓扑随着反应的发生而变化，因此我需要更新矩阵。距离矢量路由协议是我能想到的最快的方法。我不需要更复杂的路由协议，因为像路由循环这样的事情不会发生在我的特定应用程序中（因为我的化学品的结构）。我随机这样做的原因是，我可以将化学反应过程与距离传播交织在一起，并模拟化学反应发生时随着时间逐渐改变形状（而不是立即改变形状）。

该函数中的 self 是代表化学物质的对象。 self.node_distances.keys() 中的节点是组成化学物质的原子。 self.node_distances[node_x].keys() 中的节点是来自化学物质的节点，并且可能是来自与化学物质结合（并与之反应）的任何化学物质的节点。

更新：

我尝试将 node_x == node_y 的每个实例替换为 node_x is node_y （根据 @Sven Marnach 的评论），~~但是它减慢了速度！（我没想到会这样！）我原来的配置文件需要 807.234 秒才能运行，但经过此修改，它增加到 895.895 秒。~~ 抱歉，我执行的配置文件错误！我使用的是 line_by_line，它（在我的代码中）有太多的方差（大约 90 秒的差异都在噪音中）。当正确分析它时，is 绝对比 == 更快。使用 CProfile，我的代码 == 采取了34.394 秒，但使用 is 时，花费了 33.535 秒（我可以确认这是在噪音之外）。

更新： 现有的库

我不确定是否有一个现有的库可以完成我想要的事情，因为我的要求很不寻常：我需要计算加权无向图中所有节点对之间的最短路径长度。我只关心低于阈值的路径长度。计算路径长度后，我对网络拓扑进行了一些小的更改（添加或删除边缘），然后我想重新计算路径长度。与阈值相比，我的图很大（从给定节点开始，大部分图都比阈值更远），因此拓扑变化不会影响大多数最短路径长度。这就是我使用路由算法的原因：因为它通过图结构传播拓扑变化信息，所以当它超出阈值时我可以停止传播它。即，我不需要每次都重新计算所有路径。我可以使用先前的路径信息（来自拓扑更改之前）来加快计算速度。这就是为什么我认为我的算法将比最短路径算法的任何库实现更快。我从未见过在通过物理网络实际路由数据包之外使用的路由算法（但如果有人见过，那么我会感兴趣）。

NetworkX 由 @Thomas K 建议。它有很多用于计算最短路径的算法。它有一个计算所有对最短路径长度带有截止值（这就是我想要的），但它仅适用于未加权的图（我的是加权的）。不幸的是，它的加权图算法不允许使用截止点（这可能会使我的图变慢）。而且它的算法似乎都不支持在非常相似的网络上使用预先计算的路径（即路由内容）。

igraph 是另一个图形库我知道，但看看其文档，我找不到有关最短路径的任何内容。但我可能错过了它 - 它的文档似乎不是很全面。

感谢 @9000 的评论，NumPy 可能是可能的。如果我为节点的每个实例分配一个唯一的整数，我可以将稀疏矩阵存储在 NumPy 数组中。然后，我可以使用整数而不是节点实例来索引 NumPy 数组。我还需要两个 NumPy 数组：一个用于距离，一个用于“next_node”引用。这可能比使用 Python 字典更快（我还不知道）。

有谁知道其他可能有用的库？

更新：内存使用情况

我运行的是 Windows (XP)，因此这里有一些有关内存使用情况的信息，来自进程资源管理器。 CPU使用率为50%，因为我有双核机器。

全局内存使用我的程序的内存使用情况

我的程序没有耗尽 RAM 并开始使用交换区。您可以从数字和没有任何活动的 IO 图中看到这一点。 IO 图上的尖峰是程序打印到屏幕上以说明其运行情况的地方。

然而，随着时间的推移，我的程序确实会消耗越来越多的 RAM，这可能不是一件好事（但它总体上并没有消耗太多 RAM，这就是为什么我直到现在才注意到增加的原因）。

IO 图上尖峰之间的距离随着时间的推移而增加。这很糟糕 - 我的程序每 100,000 次迭代都会打印到屏幕上，因此这意味着随着时间的推移，每次迭代都会花费更长的时间来执行......我通过长时间运行我的程序并测量之间的时间来确认这一点print 语句（程序每 10,000 次迭代之间的时间）。这应该是恒定的，但正如您从图表中看到的那样，它呈线性增加......所以有些东西在那里。（该图上的噪声是因为我的程序使用了大量随机数，因此每次迭代的时间各不相同。）

在我的程序运行了很长一段时间后，内存使用情况如下所示（因此它绝对没有耗尽 RAM）：

全局内存使用情况 - 长时间运行后我的程序的内存使用情况 - 长时间运行后

原文

Question:

I've profiled my Python program to death, and there is one function that is slowing everything down. It uses Python dictionaries heavily, so I may not have used them in the best way. If I can't get it running faster, I will have to re-write it in C++, so is there anyone who can help me optimise it in Python?

I hope I've given the right sort of explanation, and that you can make some sense of my code! Thanks in advance for any help.

My code:

This is the offending function, profiled using line_profiler and kernprof. I'm running Python 2.7

I'm particularly puzzled by things like lines 363, 389 and 405, where an if statement with a comparison of two variables seems to take an inordinate amount of time.

I've considered using NumPy (as it does sparse matrices) but I don't think it's appropriate because: (1) I'm not indexing my matrix using integers (I'm using object instances); and (2) I'm not storing simple data types in the matrix (I'm storing tuples of a float and an object instance).
But I'm willing to be persuaded about NumPy.
If anyone knows about NumPy's sparse matrix performance vs. Python's hash tables, I'd be interested.

Sorry I haven't given a simple example that you can run, but this function is tied up in a much larger project and I couldn't work out how to set up a simple example to test it, without giving you half of my code base!

Timer unit: 3.33366e-10 s
File: routing_distances.py
Function: propagate_distances_node at line 328
Total time: 807.234 s

Line #   Hits         Time  Per Hit   % Time  Line Contents
328                                               @profile
329                                               def propagate_distances_node(self, node_a, cutoff_distance=200):
330                                                       
331                                                   # a makes sure its immediate neighbours are correctly in its distance table
332                                                   # because its immediate neighbours may change as binds/folding change
333    737753   3733642341   5060.8      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
334    512120   2077788924   4057.2      0.1              use_neighbour_link = False
335                                                       
336    512120   2465798454   4814.9      0.1              if(node_b not in self.node_distances[node_a]): # a doesn't know distance to b
337     15857     66075687   4167.0      0.0                  use_neighbour_link = True
338                                                       else: # a does know distance to b
339    496263   2390534838   4817.1      0.1                  (node_distance_b_a, next_node) = self.node_distances[node_a][node_b]
340    496263   2058112872   4147.2      0.1                  if(node_distance_b_a > neighbour_distance_b_a): # neighbour distance is shorter
341        81       331794   4096.2      0.0                      use_neighbour_link = True
342    496182   2665644192   5372.3      0.1                  elif((None == next_node) and (float('+inf') == neighbour_distance_b_a)): # direct route that has just broken
343        75       313623   4181.6      0.0                      use_neighbour_link = True
344                                                               
345    512120   1992514932   3890.7      0.1              if(use_neighbour_link):
346     16013     78149007   4880.3      0.0                  self.node_distances[node_a][node_b] = (neighbour_distance_b_a, None)
347     16013     83489949   5213.9      0.0                  self.nodes_changed.add(node_a)
348                                                           
349                                                           ## Affinity distances update
350     16013     86020794   5371.9      0.0                  if((node_a.type == Atom.BINDING_SITE) and (node_b.type == Atom.BINDING_SITE)):
351       164      3950487  24088.3      0.0                      self.add_affinityDistance(node_a, node_b, self.chemistry.affinity(node_a.data, node_b.data))     
352                                                   
353                                                   # a sends its table to all its immediate neighbours
354    737753   3549685140   4811.5      0.1          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
355    512120   2129343210   4157.9      0.1              node_b_changed = False
356                                               
357                                                       # b integrates a's distance table with its own
358    512120   2203821081   4303.3      0.1              node_b_chemical = node_b.chemical
359    512120   2409257898   4704.5      0.1              node_b_distances = node_b_chemical.node_distances[node_b]
360                                                       
361                                                       # For all b's routes (to c) that go to a first, update their distances
362  41756882 183992040153   4406.3      7.6              for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
363  41244762 172425596985   4180.5      7.1                  if(node_after_b == node_a):
364                                                               
365  16673654  64255631616   3853.7      2.7                      try:
366  16673654  88781802534   5324.7      3.7                          distance_b_a_c = neighbour_distance_b_a + self.node_distances[node_a][node_c][0]
367    187083    929898684   4970.5      0.0                      except KeyError:
368    187083   1056787479   5648.8      0.0                          distance_b_a_c = float('+inf')
369                                                                   
370  16673654  69374705256   4160.7      2.9                      if(distance_b_c != distance_b_a_c): # a's distance to c has changed
371    710083   3136751361   4417.4      0.1                          node_b_distances[node_c] = (distance_b_a_c, node_a)
372    710083   2848845276   4012.0      0.1                          node_b_changed = True
373                                                                   
374                                                                   ## Affinity distances update
375    710083   3484577241   4907.3      0.1                          if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
376     99592   1591029009  15975.5      0.1                              node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
377                                                                   
378                                                               # If distance got longer, then ask b's neighbours to update
379                                                               ## TODO: document this!
380  16673654  70998570837   4258.1      2.9                      if(distance_b_a_c > distance_b_c):
381                                                                   #for (node, neighbour_distance) in node_b_chemical.neighbours[node_b].iteritems():
382   1702852   7413182064   4353.4      0.3                          for node in node_b_chemical.neighbours[node_b]:
383   1204903   5912053272   4906.7      0.2                              node.chemical.nodes_changed.add(node)
384                                                       
385                                                       # Look for routes from a to c that are quicker than ones b knows already
386  42076729 184216680432   4378.1      7.6              for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
387                                                           
388  41564609 171150289218   4117.7      7.1                  node_b_update = False
389  41564609 172040284089   4139.1      7.1                  if(node_c == node_b): # a-b path
390    512120   2040112548   3983.7      0.1                      pass
391  41052489 169406668962   4126.6      7.0                  elif(node_after_a == node_b): # a-b-a-b path
392  16251407  63918804600   3933.1      2.6                      pass
393  24801082 101577038778   4095.7      4.2                  elif(node_c in node_b_distances): # b can already get to c
394  24004846 103404357180   4307.6      4.3                      (distance_b_c, node_after_b) = node_b_distances[node_c]
395  24004846 102717271836   4279.0      4.2                      if(node_after_b != node_a): # b doesn't already go to a first
396   7518275  31858204500   4237.4      1.3                          distance_b_a_c = neighbour_distance_b_a + distance_a_c
397   7518275  33470022717   4451.8      1.4                          if(distance_b_a_c < distance_b_c): # quicker to go via a
398    225357    956440656   4244.1      0.0                              node_b_update = True
399                                                           else: # b can't already get to c
400    796236   3415455549   4289.5      0.1                      distance_b_a_c = neighbour_distance_b_a + distance_a_c
401    796236   3412145520   4285.3      0.1                      if(distance_b_a_c < cutoff_distance): # not too for to go
402    593352   2514800052   4238.3      0.1                          node_b_update = True
403                                                                   
404                                                           ## Affinity distances update
405  41564609 164585250189   3959.7      6.8                  if node_b_update:
406    818709   3933555120   4804.6      0.2                      node_b_distances[node_c] = (distance_b_a_c, node_a)
407    818709   4151464335   5070.7      0.2                      if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
408    104293   1704446289  16342.9      0.1                          node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
409    818709   3557529531   4345.3      0.1                      node_b_changed = True
410                                                       
411                                                       # If any of node b's rows have exceeded the cutoff distance, then remove them
412  42350234 197075504439   4653.5      8.1              for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
413  41838114 180297579789   4309.4      7.4                  if(distance_b_c > cutoff_distance):
414    206296    894881754   4337.9      0.0                      del node_b_distances[node_c]
415    206296    860508045   4171.2      0.0                      node_b_changed = True
416                                                               
417                                                               ## Affinity distances update
418    206296   4698692217  22776.5      0.2                      node_b_chemical.del_affinityDistance(node_b, node_c)
419                                                       
420                                                       # If we've modified node_b's distance table, tell its chemical to update accordingly
421    512120   2130466347   4160.1      0.1              if(node_b_changed):
422    217858   1201064454   5513.1      0.0                  node_b_chemical.nodes_changed.add(node_b)
423                                                   
424                                                   # Remove any neighbours that have infinite distance (have just unbound)
425                                                   ## TODO: not sure what difference it makes to do this here rather than above (after updating self.node_distances for neighbours)
426                                                   ##       but doing it above seems to break the walker's movement
427    737753   3830386968   5192.0      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].items(): # Can't use iteritems() here, as deleting from the dictionary
428    512120   2249770068   4393.1      0.1              if(neighbour_distance_b_a > cutoff_distance):
429       150       747747   4985.0      0.0                  del self.neighbours[node_a][node_b]
430                                                           
431                                                           ## Affinity distances update
432       150      2148813  14325.4      0.0                  self.del_affinityDistance(node_a, node_b)

Explanation of my code:

This function maintains a sparse distance matrix representing the network distance (sum of edge weights on the shortest path) between nodes in a (very big) network. To work with the complete table and use the Floyd-Warshall algorithm would be very slow. (I tried this first, and it was orders of magnitude slower than the current version.) So my code uses a sparse matrix to represent a thresholded version of the full distance matrix (any paths with a distance greater than 200 units are ignored). The network topolgy changes over time, so this distance matrix needs updating over time. To do this, I am using a rough implementation of a distance-vector routing protocol: each node in the network knows the distance to each other node and the next node on the path. When a topology change happens, the node(s) associated with this change update their distance table(s) accordingly, and tell their immediate neighbours. The information spreads through the network by nodes sending their distance tables to their neighbours, who update their distance tables and spread them to their neighbours.

There is an object representing the distance matrix: self.node_distances. This is a dictionary mapping nodes to routing tables. A node is an object that I've defined. A routing table is a dictionary mapping nodes to tuples of (distance, next_node). Distance is the graph distance from node_a to node_b, and next_node is the neighbour of node_a that you must go to first, on the path between node_a and node_b. A next_node of None indicates that node_a and node_b are graph neighbours. For example, a sample of a distance matrix could be:

self.node_distances = { node_1 : { node_2 : (2.0, None),
                                   node_3 : (5.7, node_2),
                                   node_5 : (22.9, node_2) },
                        node_2 : { node_1 : (2.0, None),
                                   node_3 : (3.7, None),
                                   node_5 : (20.9, node_7)},
                        ...etc...

Because of topology changes, two nodes that were far apart (or not connected at all) can become close. When this happens, entries are added to this matrix. Because of the thresholding, two nodes can become too far apart to care about. When this happens, entries are deleted from this matrix.

The self.neighbours matrix is similar to self.node_distances, but contains information about the direct links (edges) in the network. self.neighbours is continually being modified externally to this function, by the chemical reaction. This is where the network topology changes come from.

The actual function that I'm having problems with: propagate_distances_node() performs one step of the distance-vector routing protocol. Given a node, node_a, the function makes sure that node_a's neighbours are correctly in the distance matrix (topology changes). The function then sends node_a's routing table to all of node_a's immediate neighbours in the network. It integrates node_a's routing table with each neighbour's own routing table.

In the rest of my program, the propagate_distances_node() function is called repeatedly, until the distance matrix converges. A set, self.nodes_changed, is maintained, of the nodes that have changed their routing table since they were last updated. On every iteration of my algorithm, a random subset of these nodes are chosen and propagate_distances_node() is called on them. This means the nodes spread their routing tables asynchronously and stochastically. This algorithm converges on the true distance matrix when the set self.nodes_changed becomes empty.

The "affinity distances" parts (add_affinityDistance and del_affinityDistance) are a cache of a (small) sub-matrix of the distance matrix, that is used by a different part of the program.

The reason I'm doing this is that I'm simulating computational analogues of chemicals participating in reactions, as part of my PhD. A "chemical" is a graph of "atoms" (nodes in the graph). Two chemicals binding together is simulated as their two graphs being joined by new edges. A chemical reaction happens (by a complicated process that isn't relevant here), changing the topology of the graph. But what happens in the reaction depends on how far apart the different atoms are that make up the chemicals. So for each atom in the simulation, I want to know which other atoms it is close to. A sparse, thresholded distance matrix is the most efficient way to store this information. Since the topology of the network changes as the reaction happens, I need to update the matrix. A distance-vector routing protocol is the fastest way I could come up with of doing this. I don't need a more compliacted routing protocol, because things like routing loops don't happen in my particular application (because of how my chemicals are structured). The reason I'm doing it stochastically is so that I can interleve the chemical reaction processes with the distance spreading, and simulate a chemical gradually changing shape over time as the reaction happens (rather than changing shape instantly).

The self in this function is an object representing a chemical. The nodes in self.node_distances.keys() are the atoms that make up the chemical. The nodes in self.node_distances[node_x].keys() are nodes from the chemical and potentially nodes from any chemicals that the chemical is bound to (and reacting with).

Update:

I tried replacing every instance of node_x == node_y with node_x is node_y (as per @Sven Marnach's comment), but it slowed things down! (I wasn't expecting that!)
My original profile took 807.234s to run, but with this modification it increased to 895.895s. Sorry, I was doing the profiling wrong! I was using line_by_line, which (on my code) had far too much variance (that difference of ~90 seconds was all in the noise). When profiling it properly, is is detinitely faster than ==. Using CProfile, my code with == took 34.394s, but with is, it took 33.535s (which I can confirm is out of the noise).

Update:
Existing libraries

I'm unsure as to whether there will be an existing library that can do what I want, since my requirements are unusual:
I need to compute the shortest-path lengths between all pairs of nodes in a weighted, undirected graph. I only care about path lengths that are lower than a threshold value. After computing the path lengths, I make a small change to the network topology (adding or removing an edge), and then I want to re-compute the path lengths. My graphs are huge compared to the threshold value (from a given node, most of the graph is further away than the threshold), and so the topology changes don't affect most of the shortest-path lengths. This is why I am using the routing algorithm: because this spreads topology-change information through the graph structure, so I can stop spreading it when it's gone further than the threshold. i.e., I don't need to re-compute all the paths each time. I can use the previous path information (from before the topology change) to speed up the calculation. This is why I think my algorithm will be faster than any library implementations of shortest-path algorithms.
I've never seen routing algorithms used outside of actually routing packets through physical networks (but if anyone has, then I'd be interested).

NetworkX was suggested by @Thomas K.
It has lots of algorithms for calculating shortest paths.
It has an algorithm for computing the all-pairs shortest path lengths with a cutoff (which is what I want), but it only works on unweighted graphs (mine are weighted).
Unfortunately, its algorithms for weighted graphs don't allow the use of a cutoff (which might make them slow for my graphs). And none of its algorithms appear to support the use of pre-calculated paths on a very similar network (i.e. the routing stuff).

igraph is another graph library that I know of, but looking at its documentation, I can't find anything about shortest-paths. But I might have missed it - its documentation doesn't seem very comprehensive.

NumPy might be possible, thanks to @9000's comment. I can store my sparse matrix in a NumPy array if I assign a unique integer to each instance of my nodes. I can then index a NumPy array with integers instead of node instances. I will also need two NumPy arrays: one for the distances and one for the "next_node" references. This might be faster than using Python dictionaries (I don't know yet).

Does anyone know of any other libraries that might be useful?

Update: Memory usage

I'm running Windows (XP), so here is some info about memory usage, from Process Explorer. The CPU usage is at 50% because I have a dual-core machine.

global memory usage
my program's memory usage

My program doesn't run out of RAM and start hitting the swap. You can see that from the numbers, and from the IO graph not having any activity. The spikes on the IO graph are where the program prints to the screen to say how it's doing.

However, my program does keep using up more and more RAM over time, which is probably not a good thing (but it's not using up much RAM overall, which is why I didn't notice the increase until now).

And the distance between the spikes on the IO graph increases over time. This is bad - my program prints to the screen every 100,000 iterations, so that means that each iteration is taking longer to execute as time goes on... I've confirmed this by doing a long run of my program and measuring the time between print statements (the time between each 10,000 iterations of the program). This should be constant, but as you can see from the graph, it increases linearly... so something's up there. (The noise on this graph is because my program uses lots of random numbers, so the time for each iteration varies.)

lag between print statements increasing over time

After my program's been running for a long time, the memory usage looks like this (so it's definitely not running out of RAM):

global memory usage - after a long run
my program's memory usage - after a long run

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

乙白 2024-10-23 02:02:21

node_after_b == node_a 将尝试调用 node_after_b.__eq__(node_a)：

>>> class B(object):
...     def __eq__(self, other):
...         print "B.__eq__()"
...         return False
... 
>>> class A(object):
...     def __eq__(self, other):
...         print "A.__eq__()"
...         return False
... 
>>> a = A()
>>> b = B()
>>> a == b
A.__eq__()
False
>>> b == a
B.__eq__()
False
>>>

在使用之前尝试使用优化版本覆盖 Node.__eq__()更新

。

我做了这个小实验（python 2.6.6）：

#!/usr/bin/env python
# test.py
class A(object):
    def __init__(self, id):
        self.id = id

class B(A):
    def __eq__(self, other):
        return self.id == other.id

@profile
def main():
    list_a = []
    list_b = []
    for x in range(100000):
        list_a.append(A(x))
        list_b.append(B(x))

    ob_a = A(1)
    ob_b = B(1)
    for ob in list_a:
        if ob == ob_a:
            x = True
        if ob is ob_a:
            x = True
        if ob.id == ob_a.id:
            x = True
        if ob.id == 1:
            x = True
    for ob in list_b:
        if ob == ob_b:
            x = True
        if ob is ob_b:
            x = True
        if ob.id == ob_b.id:
            x = True
        if ob.id == 1:
            x = True

if __name__ == '__main__':
    main()

结果：

Timer unit: 1e-06 s

File: test.py Function: main at line 10 Total time: 5.52964 s

Line #      Hits         Time  Per Hit % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def main():
    12         1            5      5.0      0.0      list_a = []
    13         1            3      3.0      0.0      list_b = []
    14    100001       360677      3.6      6.5      for x in range(100000):
    15    100000       763593      7.6     13.8          list_a.append(A(x))
    16    100000       924822      9.2     16.7          list_b.append(B(x))
    17
    18         1           14     14.0      0.0      ob_a = A(1)
    19         1            5      5.0      0.0      ob_b = B(1)
    20    100001       500454      5.0      9.1      for ob in list_a:
    21    100000       267252      2.7      4.8          if ob == ob_a:
    22                                                       x = True
    23    100000       259075      2.6      4.7          if ob is ob_a:
    24                                                       x = True
    25    100000       539683      5.4      9.8          if ob.id == ob_a.id:
    26         1            3      3.0      0.0              x = True
    27    100000       271519      2.7      4.9          if ob.id == 1:
    28         1            3      3.0      0.0              x = True
    29    100001       296736      3.0      5.4      for ob in list_b:
    30    100000       472204      4.7      8.5          if ob == ob_b:
    31         1            4      4.0      0.0              x = True
    32    100000       283165      2.8      5.1          if ob is ob_b:
    33                                                       x = True
    34    100000       298839      3.0      5.4          if ob.id == ob_b.id:
    35         1            3      3.0      0.0              x = True
    36    100000       291576      2.9      5.3          if ob.id == 1:
    37         1            3      3.0      0.0              x = True

我非常惊讶：

“点”访问（ob.property）似乎非常昂贵（第25行与第27行）
is 和 '==' 之间没有太大区别，至少对于简单的对象来说是这样。

然后我尝试了更复杂的对象，结果与第一个实验一致。

你交换很多吗？如果您的数据集太大以至于无法容纳可用 RAM，我猜您可能会遇到与虚拟内存获取相关的某种 I/O 争用。

你运行的是 Linux 吗？如果是这样，您可以在运行程序时发布您机器的 vmstat 吗？向我们发送类似以下内容的输出：

vmstat 10 100

祝你好运！

更新（来自OP的评论）

我建议使用sys.setcheckinterval并启用/禁用GC。基本原理是，对于这种特殊情况（大量实例），默认 GC 引用计数检查有点昂贵，并且其默认间隔过于频繁。

是的，我之前玩过
sys.setcheckinterval。我把它改为
1000（默认为 100），但它
没有做任何可测量的差异。
禁用垃圾收集有
帮助 - 谢谢。这已经是
迄今为止最大的加速 - 节省约
20%（全程 171 分钟，
减少到 135 分钟）-我不确定
误差线是什么，但是
它必须具有统计显着性
增加。 – Adam Nellis 2 月 9 日 15:10

我的猜测：

我认为Python GC是基于
引用计数。时不时地
将检查引用计数
每个实例；既然你是
遍历这些巨大的内存
结构，根据您的具体情况
GC 默认频率（1000
周期？）太频繁了 - 一个巨大的
浪费。 – 真诚的您，2 月 10 日 2:06

node_after_b == node_a will try to call node_after_b.__eq__(node_a):

>>> class B(object):
...     def __eq__(self, other):
...         print "B.__eq__()"
...         return False
... 
>>> class A(object):
...     def __eq__(self, other):
...         print "A.__eq__()"
...         return False
... 
>>> a = A()
>>> b = B()
>>> a == b
A.__eq__()
False
>>> b == a
B.__eq__()
False
>>>

Try to override Node.__eq__() with an optimized version before resorting to C.

UPDATE

I made this little experiment (python 2.6.6):

#!/usr/bin/env python
# test.py
class A(object):
    def __init__(self, id):
        self.id = id

class B(A):
    def __eq__(self, other):
        return self.id == other.id

@profile
def main():
    list_a = []
    list_b = []
    for x in range(100000):
        list_a.append(A(x))
        list_b.append(B(x))

    ob_a = A(1)
    ob_b = B(1)
    for ob in list_a:
        if ob == ob_a:
            x = True
        if ob is ob_a:
            x = True
        if ob.id == ob_a.id:
            x = True
        if ob.id == 1:
            x = True
    for ob in list_b:
        if ob == ob_b:
            x = True
        if ob is ob_b:
            x = True
        if ob.id == ob_b.id:
            x = True
        if ob.id == 1:
            x = True

if __name__ == '__main__':
    main()

Results:

Timer unit: 1e-06 s

File: test.py Function: main at line 10 Total time: 5.52964 s

Line #      Hits         Time  Per Hit % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def main():
    12         1            5      5.0      0.0      list_a = []
    13         1            3      3.0      0.0      list_b = []
    14    100001       360677      3.6      6.5      for x in range(100000):
    15    100000       763593      7.6     13.8          list_a.append(A(x))
    16    100000       924822      9.2     16.7          list_b.append(B(x))
    17
    18         1           14     14.0      0.0      ob_a = A(1)
    19         1            5      5.0      0.0      ob_b = B(1)
    20    100001       500454      5.0      9.1      for ob in list_a:
    21    100000       267252      2.7      4.8          if ob == ob_a:
    22                                                       x = True
    23    100000       259075      2.6      4.7          if ob is ob_a:
    24                                                       x = True
    25    100000       539683      5.4      9.8          if ob.id == ob_a.id:
    26         1            3      3.0      0.0              x = True
    27    100000       271519      2.7      4.9          if ob.id == 1:
    28         1            3      3.0      0.0              x = True
    29    100001       296736      3.0      5.4      for ob in list_b:
    30    100000       472204      4.7      8.5          if ob == ob_b:
    31         1            4      4.0      0.0              x = True
    32    100000       283165      2.8      5.1          if ob is ob_b:
    33                                                       x = True
    34    100000       298839      3.0      5.4          if ob.id == ob_b.id:
    35         1            3      3.0      0.0              x = True
    36    100000       291576      2.9      5.3          if ob.id == 1:
    37         1            3      3.0      0.0              x = True

I was very surprised:

"dot" access (ob.property) seems to be very expensive (line 25 versus line 27).
there was not much difference between is and '==', at least for simple objects

Then I tried with more complex objects and results are consistent with the first experiment.

Are you swapping a lot? If your dataset is so large that it does not fit available RAM, I guess you may experience some kind of I/O contention related to virtual memory fetches.

Are you running Linux? If so, could you post a vmstat of your machine while running your program? Send us the output of something like:

vmstat 10 100

Good luck!

UPDATE (from comments by OP)

I sugested playing with sys.setcheckinterval and enable/disable the GC. The rationale is that for this particular case (huge number of instances) the default GC reference count check is somewhat expensive and its default interval is away too often.

Yes, I had previously played with
sys.setcheckinterval. I changed it to
1000 (from its default of 100), but it
didn't do any measurable difference.
Disabling Garbage Collection has
helped - thanks. This has been the
biggest speedup so far - saving about
20% (171 minutes for the whole run,
down to 135 minutes) - I'm not sure
what the error bars are on that, but
it must be a statistically significant
increase. – Adam Nellis Feb 9 at 15:10

My guess:

I think the Python GC is based on
reference count. From time to time it
will check the reference count for
every instance; since you are
traversing these huge in-memory
structures, in your particular case
the GC default frequency (1000
cycles?) is away too often - a huge
waste. – Yours Truly Feb 10 at 2:06

回复收藏 0 原文

云醉月微眠 2024-10-23 02:02:21

您是否考虑过 Pyrex / Cython？

它将 python 编译为 C，然后自动编译为 .pyd，因此它可能会在不需要太多工作的情况下加快速度。

回复收藏 0 原文

眼眸印温柔 2024-10-23 02:02:21

这需要大量的工作，但是......您可能会考虑使用在 GPU 上运行的 Floyd-Warshall。为了让 Floyd-Warshall 在 GPU 上高效运行，我们做了很多工作。快速谷歌搜索结果：

http://cvit.iiit.ac.in/papers/Pawan07acceleating .pdf

http://my.safaribooksonline.com/book/programming/graphics/9780321545411/gpu-computing-for- Protein-struct-prediction/ch43lev1sec2#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODAzMjE1NDU0MTEvNDg3

http://www.gpucomputing.net/?q=node/1203

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter43.html

尽管在 Python 中实现，Floyd-Warshall 的速度较慢在一个数量级上，强大 GPU 上的良好 GPU 版本可能仍会显着优于新的 Python 代码。

这是一个轶事。我有一段简短的计算密集型代码，它执行类似于霍夫累积的操作。在 Python 中，尽我所能进行优化，在快速的 i7 上大约需要 7 秒。然后我写了一个完全未优化的 GPU 版本；在 Nvidia GTX 480 上花费了大约 0.002 秒。YMMV，但对于任何显着并行的事物，GPU 可能是长期赢家，并且由于它是经过充分研究的算法，因此您应该能够利用现有的高度调整的代码。

对于 Python / GPU 桥接器，我推荐 PyCUDA 或 PyOpenCL。

回复收藏 0 原文

温柔戏命师 2024-10-23 02:02:21

我没有看到你的代码在性能方面有任何问题（没有尝试理解算法），你只是受到大量迭代的打击。您的部分代码被执行了 40百万次！

请注意 80% 的时间花费在 20% 的代码中 - 这 13 行代码执行了超过 2400 万次。顺便说一下，通过这段代码，您为帕累托原理（或“20% 的啤酒饮酒者喝掉了 80% 的啤酒”）。

首先要做的事情：您尝试过Psycho吗？它是一个 JIT 编译器，可以极大地加速你的代码 - 考虑到大量的迭代 - 比如说 4 倍到 5 倍 - 你所要做的就是（当然，在下载和安装之后）将此代码片段插入到开头：

import psyco
psyco.full()

这就是我喜欢 Psycho 并在 GCJ 中使用它的原因，在 GCJ 中，时间至关重要 - 无需编写代码，无需出错，并且添加的 2 行代码会突然提升。

回到挑剔（由于时间改进很小，所以将 == 替换为 is 等）。这里是“错误”的 13 行：

Line    #   Hits    Time    Per Hit % Time  Line Contents
412 42350234    197075504439    4653.5  8.1 for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
386 42076729    184216680432    4378.1  7.6 for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
362 41756882    183992040153    4406.3  7.6 for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
413 41838114    180297579789    4309.4  7.4 if(distance_b_c > cutoff_distance):
363 41244762    172425596985    4180.5  7.1 if(node_after_b == node_a):
389 41564609    172040284089    4139.1  7.1 if(node_c == node_b): # a-b path
388 41564609    171150289218    4117.7  7.1 node_b_update = False
391 41052489    169406668962    4126.6  7   elif(node_after_a == node_b): # a-b-a-b path
405 41564609    164585250189    3959.7  6.8 if node_b_update:
394 24004846    103404357180    4307.6  4.3 (distance_b_c, node_after_b) = node_b_distances[node_c]
395 24004846    102717271836    4279    4.2 if(node_after_b != node_a): # b doesn't already go to a first
393 24801082    101577038778    4095.7  4.2 elif(node_c in node_b_distances): # b can already get to c

A) 除了你提到的几行之外，我注意到 #388 在微不足道的时候有相对较高的时间，它所做的一切都是node_b_update = False。哦，但是等等 - 每次执行时，False 都会在全局范围内查找！为了避免这种情况，请在方法的开头分配 F, T = False, True 并将后面使用的 False 和 True 替换为局部变量F 和 T。这应该会减少总体时间，尽管减少了一点（3%？）。

B) 我注意到 #389 中的条件“仅”出现了 512,120 次（基于 #390 的执行次数），而 #391 中的条件出现了 16,251,407 次。由于不存在依赖性，因此颠倒这些检查的顺序是有意义的 - 因为早期的“削减”应该不会带来什么提升（2％？）。我不确定完全避免 pass 语句是否会有帮助，但如果它不会损害可读性：

if (node_after_a is not node_b) and (node_c is not node_b):
   # neither a-b-a-b nor a-b path
   if (node_c in node_b_distances): # b can already get to c
       (distance_b_c, node_after_b) = node_b_distances[node_c]
       if (node_after_b is not node_a): # b doesn't already go to a first
           distance_b_a_c = neighbour_distance_b_a + distance_a_c
           if (distance_b_a_c < distance_b_c): # quicker to go via a
               node_b_update = T
   else: # b can't already get to c
       distance_b_a_c = neighbour_distance_b_a + distance_a_c
       if (distance_b_a_c < cutoff_distance): # not too for to go
           node_b_update = T

C) 我刚刚注意到你在一个案例中使用 try- except (#365 -367) 你只需要字典中的默认值 - 尝试使用 .get(key, defaultVal) 或使用 collections.defaultdict(itertools.repeat(float('+inf)) 创建字典'）））。使用 try-except 有它的代价 - 请参阅 #365 报告 3.5% 的时间，即设置堆栈帧等等。

D) 尽可能避免索引访问（无论是使用 obj.field 或 obj[idx]）。例如，我看到您在多个地方使用 self.node_distances[node_a] (#336, 339, 346, 366, 386)，这意味着每次使用索引都会使用两次（一次用于 . 一次用于 []） - 当执行数千万次时，这会变得昂贵。在我看来，您可以在以 node_a_distances = self.node_distances[node_a] 开头的方法中执行操作，然后进一步使用它。

I don't see anything wrong with your code regarding performance (without trying to grok the algorithm), you are just getting hit by the big number of iterations. Parts of your code get executed 40 million times!

Notice how 80% of the time is spent in 20% of your code - and those are the 13 lines that get executed 24+ million times. By the way with this code you provide great illustration to the Pareto principle (or "20% of beer drinkers drink 80% of the beer").

First things first: have you tried Psycho? It's a JIT compiler that can greatly speed up your code - considering the big number of iterations - say by a factor of 4x-5x - and all you have to do (after downloading and installing, of course) is to insert this snippet in the beginning:

import psyco
psyco.full()

This is why i liked Psycho and used it in GCJ too, where time is of essence - nothing to code, nothing to get wrong and sudden boost from 2 lines added.

Back to nit-picking (which changes like replacing == with is etc is, because of the small % time improvement). Here they are the 13 lines "at fault":

Line    #   Hits    Time    Per Hit % Time  Line Contents
412 42350234    197075504439    4653.5  8.1 for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
386 42076729    184216680432    4378.1  7.6 for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
362 41756882    183992040153    4406.3  7.6 for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
413 41838114    180297579789    4309.4  7.4 if(distance_b_c > cutoff_distance):
363 41244762    172425596985    4180.5  7.1 if(node_after_b == node_a):
389 41564609    172040284089    4139.1  7.1 if(node_c == node_b): # a-b path
388 41564609    171150289218    4117.7  7.1 node_b_update = False
391 41052489    169406668962    4126.6  7   elif(node_after_a == node_b): # a-b-a-b path
405 41564609    164585250189    3959.7  6.8 if node_b_update:
394 24004846    103404357180    4307.6  4.3 (distance_b_c, node_after_b) = node_b_distances[node_c]
395 24004846    102717271836    4279    4.2 if(node_after_b != node_a): # b doesn't already go to a first
393 24801082    101577038778    4095.7  4.2 elif(node_c in node_b_distances): # b can already get to c

A) Besides the lines you mention, i notice that #388 has relatively high time when it is trivial, all it does it node_b_update = False. Oh but wait - each time it gets executed, False gets looked up in the global scope! To avoid that, assign F, T = False, True in th e beginning of the method and replace later uses of False and True with locals F and T. This should decrease overall time, although by little (3%?).

B) I notice that the condition in #389 occurred "only" 512,120 times (based on number of executions of #390) vs the condition in #391 with 16,251,407. Since there is no dependency, it makes sense to reverse the order of those checks - because of the early "cut" that should give little boost (2%?). I am not sure if avoiding pass statements altogether will help but if it does not hurt readability:

if (node_after_a is not node_b) and (node_c is not node_b):
   # neither a-b-a-b nor a-b path
   if (node_c in node_b_distances): # b can already get to c
       (distance_b_c, node_after_b) = node_b_distances[node_c]
       if (node_after_b is not node_a): # b doesn't already go to a first
           distance_b_a_c = neighbour_distance_b_a + distance_a_c
           if (distance_b_a_c < distance_b_c): # quicker to go via a
               node_b_update = T
   else: # b can't already get to c
       distance_b_a_c = neighbour_distance_b_a + distance_a_c
       if (distance_b_a_c < cutoff_distance): # not too for to go
           node_b_update = T

C) I just noticed you are using try-except in a case (#365-367) you just need default value from a dictionary - try using instead .get(key, defaultVal) or create your dictionaries with collections.defaultdict(itertools.repeat(float('+inf'))). Using try-except has it's price - see #365 reports 3.5% of the time, that's setting up stack frames and whatnot.

D) Avoid indexed access (be it with obj.field or obj[idx]) when possible. For example i see you use self.node_distances[node_a] in multiple places (#336, 339, 346, 366, 386), which means for every use indexing is used twice (once for . and once for []) - and that gets expensive when executed tens of millions of times. Seems to me you can just do at the method beginning node_a_distances = self.node_distances[node_a] and then use that further.

回复收藏 0 原文

墨落画卷 2024-10-23 02:02:21

我本想将此发布为我的问题的更新，但 Stack Overflow 只允许问题中包含 30000 个字符，因此我将其发布为答案。

更新：到目前为止，我的最佳优化

已经采纳了人们的建议，现在我的代码运行速度比以前快了约 21%，这很好 - 谢谢大家！

这是迄今为止我能做到的最好的事情。我已将节点的所有 == 测试替换为 is，禁用垃圾回收并重写第 388 行的大 if 语句部分，符合@Nas Banov 的建议。我添加了众所周知的 try/ except 技巧来避免测试（第 390 行 - 删除 node_b_distances 中的测试 node_c），这有助于加载，因为它几乎不会抛出异常例外。我尝试切换第 391 行和第 392 行，并将 node_b_distances[node_c] 分配给变量，但这种方式是最快的。

但是，我仍然没有找到内存泄漏（参见我的问题中的图表）。但我认为这可能位于我的代码的不同部分（我没有在这里发布）。如果我能修复内存泄漏，那么这个程序将运行得足够快，供我使用:)

Timer unit: 3.33366e-10 s
File: routing_distances.py
Function: propagate_distances_node at line 328
Total time: 760.74 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
328                                               @profile
329                                               def propagate_distances_node(self, node_a, cutoff_distance=200):
330                                                       
331                                                   # a makes sure its immediate neighbours are correctly in its distance table
332                                                   # because its immediate neighbours may change as binds/folding change
333    791349   4158169713   5254.5      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
334    550522   2331886050   4235.8      0.1              use_neighbour_link = False
335                                                       
336    550522   2935995237   5333.1      0.1              if(node_b not in self.node_distances[node_a]): # a doesn't know distance to b
337     15931     68829156   4320.5      0.0                  use_neighbour_link = True
338                                                       else: # a does know distance to b
339    534591   2728134153   5103.2      0.1                  (node_distance_b_a, next_node) = self.node_distances[node_a][node_b]
340    534591   2376374859   4445.2      0.1                  if(node_distance_b_a > neighbour_distance_b_a): # neighbour distance is shorter
341        78       347355   4453.3      0.0                      use_neighbour_link = True
342    534513   3145889079   5885.5      0.1                  elif((None is next_node) and (float('+inf') == neighbour_distance_b_a)): # direct route that has just broken
343        74       327600   4427.0      0.0                      use_neighbour_link = True
344                                                               
345    550522   2414669022   4386.1      0.1              if(use_neighbour_link):
346     16083     81850626   5089.3      0.0                  self.node_distances[node_a][node_b] = (neighbour_distance_b_a, None)
347     16083     87064200   5413.4      0.0                  self.nodes_changed.add(node_a)
348                                                           
349                                                           ## Affinity distances update
350     16083     86580603   5383.4      0.0                  if((node_a.type == Atom.BINDING_SITE) and (node_b.type == Atom.BINDING_SITE)):
351       234      6656868  28448.2      0.0                      self.add_affinityDistance(node_a, node_b, self.chemistry.affinity(node_a.data, node_b.data))     
352                                                   
353                                                   # a sends its table to all its immediate neighbours
354    791349   4034651958   5098.4      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
355    550522   2392248546   4345.4      0.1              node_b_changed = False
356                                               
357                                                       # b integrates a's distance table with its own
358    550522   2520330696   4578.1      0.1              node_b_chemical = node_b.chemical
359    550522   2734341975   4966.8      0.1              node_b_distances = node_b_chemical.node_distances[node_b]
360                                                       
361                                                       # For all b's routes (to c) that go to a first, update their distances
362  46679347 222161837193   4759.3      9.7              for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
363  46128825 211963639122   4595.0      9.3                  if(node_after_b is node_a):
364                                                               
365  18677439  79225517916   4241.8      3.5                      try:
366  18677439 101527287264   5435.8      4.4                          distance_b_a_c = neighbour_distance_b_a + self.node_distances[node_a][node_c][0]
367    181510    985441680   5429.1      0.0                      except KeyError:
368    181510   1166118921   6424.5      0.1                          distance_b_a_c = float('+inf')
369                                                                   
370  18677439  89626381965   4798.6      3.9                      if(distance_b_c != distance_b_a_c): # a's distance to c has changed
371    692131   3352970709   4844.4      0.1                          node_b_distances[node_c] = (distance_b_a_c, node_a)
372    692131   3066946866   4431.2      0.1                          node_b_changed = True
373                                                                   
374                                                                   ## Affinity distances update
375    692131   3808548270   5502.6      0.2                          if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
376     96794   1655818011  17106.6      0.1                              node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
377                                                                   
378                                                               # If distance got longer, then ask b's neighbours to update
379                                                               ## TODO: document this!
380  18677439  88838493705   4756.5      3.9                      if(distance_b_a_c > distance_b_c):
381                                                                   #for (node, neighbour_distance) in node_b_chemical.neighbours[node_b].iteritems():
382   1656796   7949850642   4798.3      0.3                          for node in node_b_chemical.neighbours[node_b]:
383   1172486   6307264854   5379.4      0.3                              node.chemical.nodes_changed.add(node)
384                                                       
385                                                       # Look for routes from a to c that are quicker than ones b knows already
386  46999631 227198060532   4834.0     10.0              for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
387                                                           
388  46449109 218024862372   4693.8      9.6                  if((node_after_a is not node_b) and # not a-b-a-b path
389  28049321 126269403795   4501.7      5.5                     (node_c is not node_b)):         # not a-b path
390  27768341 121588366824   4378.7      5.3                      try: # Assume node_c in node_b_distances ('try' block will raise KeyError if not)
391  27768341 159413637753   5740.8      7.0                          if((node_b_distances[node_c][1] is not node_a) and # b doesn't already go to a first
392   8462467  51890478453   6131.8      2.3                             ((neighbour_distance_b_a + distance_a_c) < node_b_distances[node_c][0])):
393                                                               
394                                                                       # Found a route
395    224593   1168129548   5201.1      0.1                              node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
396                                                                       ## Affinity distances update
397    224593   1274631354   5675.3      0.1                              if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
398     32108    551523249  17177.1      0.0                                  node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
399    224593   1165878108   5191.1      0.1                              node_b_changed = True
400                                                                       
401    809945   4449080808   5493.1      0.2                      except KeyError:
402                                                                   # b can't already get to c (node_c not in node_b_distances)
403    809945   4208032422   5195.5      0.2                          if((neighbour_distance_b_a + distance_a_c) < cutoff_distance): # not too for to go
404                                                                       
405                                                                       # These lines of code copied, for efficiency 
406                                                                       #  (most of the time, the 'try' block succeeds, so don't bother testing for (node_c in node_b_distances))
407                                                                       # Found a route
408    587726   3162939543   5381.7      0.1                              node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
409                                                                       ## Affinity distances update
410    587726   3363869061   5723.5      0.1                              if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
411     71659   1258910784  17568.1      0.1                                  node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
412    587726   2706161481   4604.5      0.1                              node_b_changed = True
413                                                                   
414                                                               
415                                                       
416                                                       # If any of node b's rows have exceeded the cutoff distance, then remove them
417  47267073 239847142446   5074.3     10.5              for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
418  46716551 242694352980   5195.0     10.6                  if(distance_b_c > cutoff_distance):
419    200755    967443975   4819.0      0.0                      del node_b_distances[node_c]
420    200755    930470616   4634.9      0.0                      node_b_changed = True
421                                                               
422                                                               ## Affinity distances update
423    200755   4717125063  23496.9      0.2                      node_b_chemical.del_affinityDistance(node_b, node_c)
424                                                       
425                                                       # If we've modified node_b's distance table, tell its chemical to update accordingly
426    550522   2684634615   4876.5      0.1              if(node_b_changed):
427    235034   1383213780   5885.2      0.1                  node_b_chemical.nodes_changed.add(node_b)
428                                                   
429                                                   # Remove any neighbours that have infinite distance (have just unbound)
430                                                   ## TODO: not sure what difference it makes to do this here rather than above (after updating self.node_distances for neighbours)
431                                                   ##       but doing it above seems to break the walker's movement
432    791349   4367879451   5519.5      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].items(): # Can't use iteritems() here, as deleting from the dictionary
433    550522   2968919613   5392.9      0.1              if(neighbour_distance_b_a > cutoff_distance):
434       148       775638   5240.8      0.0                  del self.neighbours[node_a][node_b]
435                                                           
436                                                           ## Affinity distances update
437       148      2096343  14164.5      0.0                  self.del_affinityDistance(node_a, node_b)

I would have posted this as an update to my question, but Stack Overflow only allows 30000 characters in questions, so I'm posting this as an answer.

Update: My best optimisations so far

I've taken on board people's suggestions, and now my code runs about 21% faster than before, which is good - thanks everyone!

This is the best I've managed to do so far. I've replaced all the == tests with is for nodes, disabled garbage collection and re-written the big if statement part at Line 388, in line with @Nas Banov's suggestions. I added in the well-known try/except trick for avoiding tests (line 390 - to remove the test node_c in node_b_distances), which helped loads, since it hardly ever throws the exception. I tried switching lines 391 and 392 around, and assigning node_b_distances[node_c] to a variable, but this way was the quickest.

However, I still haven't tracked down the memory leak yet (see graph in my question). But I think this might be in a different part of my code (that I haven't posted here). If I can fix the memory leak, then this program will run quickly enough for me to use :)

Timer unit: 3.33366e-10 s
File: routing_distances.py
Function: propagate_distances_node at line 328
Total time: 760.74 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
328                                               @profile
329                                               def propagate_distances_node(self, node_a, cutoff_distance=200):
330                                                       
331                                                   # a makes sure its immediate neighbours are correctly in its distance table
332                                                   # because its immediate neighbours may change as binds/folding change
333    791349   4158169713   5254.5      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
334    550522   2331886050   4235.8      0.1              use_neighbour_link = False
335                                                       
336    550522   2935995237   5333.1      0.1              if(node_b not in self.node_distances[node_a]): # a doesn't know distance to b
337     15931     68829156   4320.5      0.0                  use_neighbour_link = True
338                                                       else: # a does know distance to b
339    534591   2728134153   5103.2      0.1                  (node_distance_b_a, next_node) = self.node_distances[node_a][node_b]
340    534591   2376374859   4445.2      0.1                  if(node_distance_b_a > neighbour_distance_b_a): # neighbour distance is shorter
341        78       347355   4453.3      0.0                      use_neighbour_link = True
342    534513   3145889079   5885.5      0.1                  elif((None is next_node) and (float('+inf') == neighbour_distance_b_a)): # direct route that has just broken
343        74       327600   4427.0      0.0                      use_neighbour_link = True
344                                                               
345    550522   2414669022   4386.1      0.1              if(use_neighbour_link):
346     16083     81850626   5089.3      0.0                  self.node_distances[node_a][node_b] = (neighbour_distance_b_a, None)
347     16083     87064200   5413.4      0.0                  self.nodes_changed.add(node_a)
348                                                           
349                                                           ## Affinity distances update
350     16083     86580603   5383.4      0.0                  if((node_a.type == Atom.BINDING_SITE) and (node_b.type == Atom.BINDING_SITE)):
351       234      6656868  28448.2      0.0                      self.add_affinityDistance(node_a, node_b, self.chemistry.affinity(node_a.data, node_b.data))     
352                                                   
353                                                   # a sends its table to all its immediate neighbours
354    791349   4034651958   5098.4      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
355    550522   2392248546   4345.4      0.1              node_b_changed = False
356                                               
357                                                       # b integrates a's distance table with its own
358    550522   2520330696   4578.1      0.1              node_b_chemical = node_b.chemical
359    550522   2734341975   4966.8      0.1              node_b_distances = node_b_chemical.node_distances[node_b]
360                                                       
361                                                       # For all b's routes (to c) that go to a first, update their distances
362  46679347 222161837193   4759.3      9.7              for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
363  46128825 211963639122   4595.0      9.3                  if(node_after_b is node_a):
364                                                               
365  18677439  79225517916   4241.8      3.5                      try:
366  18677439 101527287264   5435.8      4.4                          distance_b_a_c = neighbour_distance_b_a + self.node_distances[node_a][node_c][0]
367    181510    985441680   5429.1      0.0                      except KeyError:
368    181510   1166118921   6424.5      0.1                          distance_b_a_c = float('+inf')
369                                                                   
370  18677439  89626381965   4798.6      3.9                      if(distance_b_c != distance_b_a_c): # a's distance to c has changed
371    692131   3352970709   4844.4      0.1                          node_b_distances[node_c] = (distance_b_a_c, node_a)
372    692131   3066946866   4431.2      0.1                          node_b_changed = True
373                                                                   
374                                                                   ## Affinity distances update
375    692131   3808548270   5502.6      0.2                          if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
376     96794   1655818011  17106.6      0.1                              node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
377                                                                   
378                                                               # If distance got longer, then ask b's neighbours to update
379                                                               ## TODO: document this!
380  18677439  88838493705   4756.5      3.9                      if(distance_b_a_c > distance_b_c):
381                                                                   #for (node, neighbour_distance) in node_b_chemical.neighbours[node_b].iteritems():
382   1656796   7949850642   4798.3      0.3                          for node in node_b_chemical.neighbours[node_b]:
383   1172486   6307264854   5379.4      0.3                              node.chemical.nodes_changed.add(node)
384                                                       
385                                                       # Look for routes from a to c that are quicker than ones b knows already
386  46999631 227198060532   4834.0     10.0              for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
387                                                           
388  46449109 218024862372   4693.8      9.6                  if((node_after_a is not node_b) and # not a-b-a-b path
389  28049321 126269403795   4501.7      5.5                     (node_c is not node_b)):         # not a-b path
390  27768341 121588366824   4378.7      5.3                      try: # Assume node_c in node_b_distances ('try' block will raise KeyError if not)
391  27768341 159413637753   5740.8      7.0                          if((node_b_distances[node_c][1] is not node_a) and # b doesn't already go to a first
392   8462467  51890478453   6131.8      2.3                             ((neighbour_distance_b_a + distance_a_c) < node_b_distances[node_c][0])):
393                                                               
394                                                                       # Found a route
395    224593   1168129548   5201.1      0.1                              node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
396                                                                       ## Affinity distances update
397    224593   1274631354   5675.3      0.1                              if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
398     32108    551523249  17177.1      0.0                                  node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
399    224593   1165878108   5191.1      0.1                              node_b_changed = True
400                                                                       
401    809945   4449080808   5493.1      0.2                      except KeyError:
402                                                                   # b can't already get to c (node_c not in node_b_distances)
403    809945   4208032422   5195.5      0.2                          if((neighbour_distance_b_a + distance_a_c) < cutoff_distance): # not too for to go
404                                                                       
405                                                                       # These lines of code copied, for efficiency 
406                                                                       #  (most of the time, the 'try' block succeeds, so don't bother testing for (node_c in node_b_distances))
407                                                                       # Found a route
408    587726   3162939543   5381.7      0.1                              node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
409                                                                       ## Affinity distances update
410    587726   3363869061   5723.5      0.1                              if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
411     71659   1258910784  17568.1      0.1                                  node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
412    587726   2706161481   4604.5      0.1                              node_b_changed = True
413                                                                   
414                                                               
415                                                       
416                                                       # If any of node b's rows have exceeded the cutoff distance, then remove them
417  47267073 239847142446   5074.3     10.5              for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
418  46716551 242694352980   5195.0     10.6                  if(distance_b_c > cutoff_distance):
419    200755    967443975   4819.0      0.0                      del node_b_distances[node_c]
420    200755    930470616   4634.9      0.0                      node_b_changed = True
421                                                               
422                                                               ## Affinity distances update
423    200755   4717125063  23496.9      0.2                      node_b_chemical.del_affinityDistance(node_b, node_c)
424                                                       
425                                                       # If we've modified node_b's distance table, tell its chemical to update accordingly
426    550522   2684634615   4876.5      0.1              if(node_b_changed):
427    235034   1383213780   5885.2      0.1                  node_b_chemical.nodes_changed.add(node_b)
428                                                   
429                                                   # Remove any neighbours that have infinite distance (have just unbound)
430                                                   ## TODO: not sure what difference it makes to do this here rather than above (after updating self.node_distances for neighbours)
431                                                   ##       but doing it above seems to break the walker's movement
432    791349   4367879451   5519.5      0.2          for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].items(): # Can't use iteritems() here, as deleting from the dictionary
433    550522   2968919613   5392.9      0.1              if(neighbour_distance_b_a > cutoff_distance):
434       148       775638   5240.8      0.0                  del self.neighbours[node_a][node_b]
435                                                           
436                                                           ## Affinity distances update
437       148      2096343  14164.5      0.0                  self.del_affinityDistance(node_a, node_b)

回复收藏 0 原文

~没有更多了~