Fastest way in numpy to get the distances of all pairs of points in an array
I have N points, for example:
A = [2, 3]
B = [3, 4]
C = [3, 3]
.
.
.
And they're in an array like so:
arr = np.array([[2, 3], [3, 4], [3, 3]])
I need as output all pairwise distances in BFS (Breadth First Search) order to track which distance is which, like: A->B, A->C, B->C. For the above example data, the result would be [1.41, 1.0, 1.0].
EDIT: I have to accomplish it with numpy or core libraries.
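For concreteness, here is a naive nested-loop baseline (my own illustration, not part of the original question) that produces the expected values in the expected pair order using only core libraries:

import math

# A, B, C from the example above
arr = [[2, 3], [3, 4], [3, 3]]

dists = []
for i in range(len(arr)):
    for j in range(i + 1, len(arr)):        # pair order: (A,B), (A,C), (B,C)
        (x1, y1), (x2, y2) = arr[i], arr[j]
        dists.append(math.hypot(x2 - x1, y2 - y1))

print(dists)   # [1.4142135623730951, 1.0, 1.0]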
3 Answers
As an alternative method, but similar to ddejohn's answer, we can use np.triu_indices, which returns just the upper-triangular indices of the matrix and may be more memory-efficient (a sketch is shown below). This doesn't need additional operations like flattening and indexing. Its performance is similar to the aforementioned answer for large data (e.g. you can check it with arr = np.random.rand(10000, 2) on Colab, where both finish in roughly 4.6 s; it may beat the np.triu-and-flatten approach on larger data). I have tested the memory usage once with memory-profiler, but it should be checked again if memory usage matters (I'm not sure).
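A minimal sketch of this approach (my reconstruction, not necessarily the answerer's exact code):

import numpy as np

arr = np.array([[2, 3], [3, 4], [3, 3]])

# upper-triangle index pairs only: (0, 1), (0, 2), (1, 2)
i, j = np.triu_indices(len(arr), k=1)
dists = np.linalg.norm(arr[i] - arr[j], axis=1)
print(dists)   # [1.41421356 1.         1.        ]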
Update:
I have tried to limit the calculations to just the upper triangle, which speeds the code up 2 to 3 times on the tested arrays. As the array size grows, the performance difference between this loop (sketched below) and the previous np.triu_indices or np.triu methods grows and becomes more obvious. Also, this way the memory consumption is reduced by at least ~4x, so this method makes it possible to work on larger arrays, and more quickly. Benchmarks:
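A hypothetical sketch of such an upper-triangle loop together with a rough timeit comparison against the np.triu_indices version (the answer's original loop may have been compiled, e.g. with numba, and the actual numbers will vary by machine):

import timeit
import numpy as np

def pairwise_triu_indices(arr):
    i, j = np.triu_indices(len(arr), k=1)
    return np.linalg.norm(arr[i] - arr[j], axis=1)

def pairwise_upper_loop(arr):
    # fill the condensed result row by row, touching only pairs (a, b) with b > a
    n = len(arr)
    out = np.empty(n * (n - 1) // 2)
    k = 0
    for a in range(n - 1):
        diff = arr[a + 1:] - arr[a]
        m = n - 1 - a
        out[k:k + m] = np.sqrt((diff * diff).sum(axis=1))
        k += m
    return out

arr = np.random.rand(3000, 2)
assert np.allclose(pairwise_triu_indices(arr), pairwise_upper_loop(arr))
for fn in (pairwise_triu_indices, pairwise_upper_loop):
    print(fn.__name__, timeit.timeit(lambda: fn(arr), number=3) / 3, "s per call")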
If you can use it, SciPy has a function for this:
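The function in question is presumably scipy.spatial.distance.pdist, which the next answer also mentions; a minimal sketch, assuming that is the one meant:

import numpy as np
from scipy.spatial.distance import pdist

arr = np.array([[2, 3], [3, 4], [3, 3]])
# pdist returns the condensed distance vector in (A,B), (A,C), (B,C) order
print(pdist(arr))   # [1.41421356 1.         1.        ]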
Here's a numpy-only solution (fair warning: it requires a lot of memory, unlike pdist)... Demo:
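A hypothetical reconstruction of such a numpy-only solution, based on the np.triu-and-flatten description in the other answer, wrapped in a function named triu as referenced by the timings below:

import numpy as np

def triu(arr):
    # full n x n distance matrix via broadcasting -- this is what costs memory
    dist = np.linalg.norm(arr - arr[:, None], axis=-1)
    # keep only the strict upper triangle, flattened in row-major order,
    # which yields the A->B, A->C, B->C ordering
    return dist[np.triu(np.ones(dist.shape, dtype=bool), k=1)]

arr = np.array([[2, 3], [3, 4], [3, 3]])
print(triu(arr))   # [1.41421356 1.         1.        ]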
Timings (with the solution above wrapped in a function called triu):
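A rough stand-in for such a timing run (my own harness, not the original benchmark; results depend on machine and array size):

import timeit
import numpy as np
from scipy.spatial.distance import pdist

def triu(arr):
    # same as the sketch above: full distance matrix, then strict upper triangle
    dist = np.linalg.norm(arr - arr[:, None], axis=-1)
    return dist[np.triu(np.ones(dist.shape, dtype=bool), k=1)]

arr = np.random.rand(2000, 2)
print("triu :", timeit.timeit(lambda: triu(arr), number=10) / 10, "s per call")
print("pdist:", timeit.timeit(lambda: pdist(arr), number=10) / 10, "s per call")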