NumPy: vectorize sum of distances to a set of points
I'm trying to implement a k-medoids clustering algorithm in Python/NumPy. As part of this algorithm, I have to compute the sum of distances from objects to their "medoids" (cluster representatives).
I have: a distance matrix on five points
n_samples = 5
D = np.array([[ 0. , 3.04959014, 4.74341649, 3.72424489, 6.70298441],
[ 3.04959014, 0. , 5.38516481, 4.52216762, 6.16846821],
[ 4.74341649, 5.38516481, 0. , 1.02469508, 8.23711114],
[ 3.72424489, 4.52216762, 1.02469508, 0. , 7.69025357],
[ 6.70298441, 6.16846821, 8.23711114, 7.69025357, 0. ]])
a set of initial medoids
medoids = np.array([0, 3])
and the cluster memberships
cl = np.array([0, 0, 1, 1, 0])
I can compute the required sum using
>>> sum(D[i, medoids[cl[i]]] for i in range(n_samples))
10.777269622938899
but that uses a Python loop. Am I missing some kind of vectorized idiom for computing this sum?
How about:
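The answer's code did not survive on this page, so what follows is only a sketch of one vectorized approach, not necessarily what the answerer posted. It assumes NumPy integer-array ("fancy") indexing: `medoids[cl]` looks up the medoid index for every sample at once, and indexing `D` with a pair of equal-length index arrays picks one entry per row. The name `nearest` is introduced here for illustration only.

```python
import numpy as np

n_samples = 5
D = np.array([[0.        , 3.04959014, 4.74341649, 3.72424489, 6.70298441],
              [3.04959014, 0.        , 5.38516481, 4.52216762, 6.16846821],
              [4.74341649, 5.38516481, 0.        , 1.02469508, 8.23711114],
              [3.72424489, 4.52216762, 1.02469508, 0.        , 7.69025357],
              [6.70298441, 6.16846821, 8.23711114, 7.69025357, 0.        ]])
medoids = np.array([0, 3])
cl = np.array([0, 0, 1, 1, 0])

# Map each sample to the index of its cluster's medoid (hypothetical helper name).
nearest = medoids[cl]  # one medoid index per sample

# Pair row i with column nearest[i], i.e. select D[i, nearest[i]] for every i,
# then add up the selected distances without a Python-level loop.
total = D[np.arange(n_samples), nearest].sum()

print(total)  # approximately 10.7773 for the rounded matrix above
```

Note that `D[:, nearest]` would not do the same thing: it selects whole columns and returns a 5x5 array, so the explicit `np.arange(n_samples)` row index is what keeps the selection to one element per sample.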