Multidimensional Euclidean distance in Python

Posted on 2025-01-08 05:50:14

I want to calculate the Euclidean distance between two arrays in multiple dimensions (24 dimensions). I'm using NumPy and SciPy.

Here is my code:

import numpy
import scipy.spatial.distance

A = numpy.array([116.629, 7192.6, 4535.66, 279714, 176404, 443608, 295522, 1.18399e+07, 7.74233e+06, 2.85839e+08, 2.30168e+08, 5.6919e+08, 168989, 7.48866e+06, 1.45261e+06, 7.49496e+07, 2.13295e+07, 3.74361e+08, 54.5, 3349.39, 262.614, 16175.8, 3693.79, 205865])

B = numpy.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 151246, 6795630, 4566625, 2.0355328e+08, 1.4250515e+08, 3.2699482e+08, 95635, 4470961, 589043, 29729866, 6124073, 222.3])

I then used scipy.spatial.distance.cdist(A[numpy.newaxis,:],B,'euclidean') to calculate the Euclidean distance.

But it gave me an error:

raise ValueError('XB must be a 2-dimensional array.');

I don't understand this error.

I looked up scipy.spatial.distance.pdist, but I don't understand how to use it.

Is there a better way to do this?

Comments (7)

疯了 2025-01-15 05:50:14

Perhaps scipy.spatial.distance.euclidean?

Examples

>>> from scipy.spatial import distance
>>> distance.euclidean([1, 0, 0], [0, 1, 0])
1.4142135623730951
>>> distance.euclidean([1, 1, 0], [0, 1, 0])
1.0
输什么也不输骨气 2025-01-15 05:50:14

Use either

numpy.sqrt(numpy.sum((A - B)**2))

or more simply

numpy.linalg.norm(A - B)
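
As a quick check that the two expressions above agree, here is a minimal sketch. It uses small made-up vectors rather than the asker's 24-dimensional data, and the names a, b, X and Y are illustrative only. It also shows the axis argument of numpy.linalg.norm for computing one distance per row.

```python
import numpy as np

# Small illustrative vectors (not the data from the question).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# Square root of the summed squared differences.
explicit = np.sqrt(np.sum((a - b) ** 2))
# The same quantity expressed as the 2-norm of the difference vector.
via_norm = np.linalg.norm(a - b)

# Row-wise distances between two batches of points: one distance per row.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
Y = np.array([[3.0, 4.0], [1.0, 1.0]])
row_dists = np.linalg.norm(X - Y, axis=1)

print(explicit, via_norm, row_dists)
```

Both scalar forms should agree up to floating-point rounding.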
伴我老 2025-01-15 05:50:14

Starting with Python 3.8, you can use the standard library's math module and its new dist function, which returns the Euclidean distance between two points (given as lists or tuples of coordinates):

from math import dist

dist([1, 0, 0], [0, 1, 0]) # 1.4142135623730951
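
A slightly longer sketch of the same function, assuming Python 3.8 or later. The points p and q are made-up examples. It also shows that math.dist raises ValueError when the two points have different numbers of coordinates.

```python
from math import dist

# Two illustrative points given as tuples of coordinates.
p = (1.0, 2.0, 3.0)
q = (4.0, 6.0, 3.0)
d = dist(p, q)
print(d)

# Points with different numbers of coordinates are rejected.
try:
    dist((1.0, 2.0), (1.0, 2.0, 3.0))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```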
一场春暖 2025-01-15 05:50:14

A and B are two points in 24-dimensional space, so you should use scipy.spatial.distance.euclidean (see the SciPy documentation):

scipy.spatial.distance.euclidean(A, B)
ゝ杯具 2025-01-15 05:50:14

Since all of the above answers refer to NumPy and/or SciPy, I just wanted to point out that something really simple can be done with reduce here:

from functools import reduce
from math import sqrt

def n_dimensional_euclidean_distance(a, b):
   """
   Returns the euclidean distance for n>=2 dimensions
   :param a: tuple of numbers
   :param b: tuple of numbers
   :return: the euclidean distance as a float
   """
   dimension = len(a)  # note: this raises an IndexError if len(b) < len(a)

   # Accumulate the squared differences over every index, then take the square root
   return sqrt(reduce(lambda i, j: i + ((a[j] - b[j]) ** 2), range(dimension), 0))

This sums (a[j] - b[j])^2 over every dimension index j and takes the square root of the total (note that for simplicity this is meant for distances in n >= 2 dimensions).
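
The same sum can also be written without reduce. The following is a minimal sketch of an alternative, not the answer author's code. The function name euclidean_distance and the explicit length check are assumptions added for illustration.

```python
from math import sqrt

def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length sequences."""
    # Fail loudly on mismatched lengths, because zip would silently truncate.
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of dimensions")
    # Sum the squared coordinate differences, then take the square root.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance((1, 0, 0), (0, 1, 0)))
```

Pairing coordinates with zip avoids manual indexing, at the cost of needing the explicit length check.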

没企图 2025-01-15 05:50:14

Apart from the already mentioned ways of computing the Euclidean distance, here's one that's close to your original code:

scipy.spatial.distance.cdist([A], [B], 'euclidean')

or

scipy.spatial.distance.cdist(numpy.atleast_2d(A), numpy.atleast_2d(B), 'euclidean')

This returns a 1×1 np.ndarray holding the L2 distance.
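
The original call failed because cdist expects both arguments to be 2-D arrays of shape (points, dimensions), while B was passed as a 1-D array. Here is a minimal sketch with small made-up vectors (a and b are illustrative names). It also shows pdist, which takes a single 2-D array and returns the pairwise distances between its rows.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

# Small illustrative vectors (not the data from the question).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# cdist needs two 2-D arrays, so each vector becomes a single-row matrix.
d_cdist = cdist(a[np.newaxis, :], b[np.newaxis, :], "euclidean")  # shape (1, 1)

# pdist takes one 2-D array and returns the distances between all pairs of rows
# in condensed form; with two rows there is exactly one pair.
d_pdist = pdist(np.vstack([a, b]), "euclidean")  # shape (1,)

print(d_cdist[0, 0], d_pdist[0])
```

Indexing with [0, 0] or [0] extracts the scalar distance from the returned arrays.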

鯉魚旗 2025-01-15 05:50:14

Writing your own custom square root of a sum of squares is not always safe

You can use math.hypot, numpy.hypot or a SciPy distance function rather than writing numpy.sqrt(numpy.sum((A - B)**2)) or (i**2 + j**2)**0.5. The hand-written forms can overflow or underflow when the intermediate squares fall outside the floating-point range.

Speed-wise

%%timeit
# passing more than two arguments to math.hypot requires Python 3.8+
math.hypot(*(A - B))
# 3 µs ± 64.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
numpy.sqrt(numpy.sum((A - B)**2))
# 5.65 µs ± 50.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Safety-wise

Underflow

i, j = 1e-200, 1e-200
np.sqrt(i**2+j**2)
# 0.0

Overflow

i, j = np.float64(1e+200), np.float64(1e+200)  # NumPy floats; with plain Python floats, i**2 raises OverflowError
np.sqrt(i**2+j**2)
# inf

No Underflow

i, j = 1e-200, 1e-200
np.hypot(i, j)
# 1.414213562373095e-200

No Overflow

i, j = 1e+200, 1e+200
np.hypot(i, j)
# 1.414213562373095e+200
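
The same contrast can be sketched for more than two coordinates using only the standard library, assuming Python 3.8 or later, where math.hypot accepts any number of arguments. The lists big and small are made-up values chosen so that their squares leave the double-precision range.

```python
import math

# Values whose squares overflow or underflow in double precision.
big = [1e200, 1e200, 1e200]
small = [1e-200, 1e-200, 1e-200]

# Naive form: x * x overflows to inf (or underflows to 0.0) before the sqrt.
naive_big = math.sqrt(sum(x * x for x in big))
naive_small = math.sqrt(sum(x * x for x in small))

# math.hypot avoids the intermediate overflow and underflow.
safe_big = math.hypot(*big)
safe_small = math.hypot(*small)

print(naive_big, safe_big)
print(naive_small, safe_small)
```

The naive results lose the answer entirely, while math.hypot stays close to sqrt(3) times the common magnitude.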