K-nearest neighbors in Python

Posted on 2024-10-30 17:41:56

Comments (4)

温馨耳语 2024-11-06 17:41:56

I think that you should use scikit ann.

There is a good tutorial about nearest neighbours here.

According to the documentation :

ann is a SWIG-generated python wrapper for the Approximate Nearest Neighbor (ANN) Library (http://www.cs.umd.edu/~mount/ANN/), developed by David M. Mount and Sunil Arya. ann provides an immutable kdtree implementation (via ANN) which can perform k-nearest neighbor and approximate k
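The quoted description distinguishes exact from approximate k-nearest-neighbor search on a kd-tree. The ann wrapper's own API is not shown in this answer, so the sketch below illustrates the same distinction with scipy.spatial.cKDTree instead (a swapped-in library, not the ann package). The data sizes and the `eps` value are arbitrary assumptions. With `eps > 0`, `query` may return neighbors whose k-th distance is up to `(1 + eps)` times the true k-th distance.

```python
import numpy as np
from scipy.spatial import cKDTree

# Assumed toy data: 10,000 random points in 3D and 100 query points.
rng = np.random.default_rng(0)
points = rng.random((10_000, 3))
queries = rng.random((100, 3))
k = 5
eps = 0.5  # assumed tolerance for the approximate search

tree = cKDTree(points)

# Exact k-NN: eps defaults to 0.
d_exact, i_exact = tree.query(queries, k=k)

# Approximate k-NN: the k-th neighbor is at most (1 + eps) times
# farther away than the true k-th neighbor.
d_approx, i_approx = tree.query(queries, k=k, eps=eps)

worst_ratio = (d_approx[:, -1] / d_exact[:, -1]).max()
print("worst k-th distance ratio (approx / exact):", worst_ratio)
```

A larger `eps` lets the tree prune more branches, so it trades accuracy for speed. Whether that pays off depends on the dimension and size of the data.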

月亮坠入山谷 2024-11-06 17:41:56

Here is a script comparing scipy.spatial.cKDTree and pyflann.FLANN. See for yourself which one is faster for your application.

import cProfile
import numpy as np
import os
import pyflann
import scipy.spatial

# Config params
dim = 4
data_size = 1000
test_size = 1

# Generate data
np.random.seed(1)
dataset = np.random.rand(data_size, dim)
testset = np.random.rand(test_size, dim)

def test_pyflann_flann(num_reps):
    # Rebuild the FLANN index and run one 5-NN query on each repetition.
    flann = pyflann.FLANN()
    for rep in range(num_reps):
        params = flann.build_index(dataset, target_precision=0.0, log_level='info')
        result = flann.nn_index(testset, 5, checks=params['checks'])

def test_scipy_spatial_kdtree(num_reps):
    # Rebuild the cKDTree and run one 5-NN query on each repetition.
    for rep in range(num_reps):
        kdtree = scipy.spatial.cKDTree(dataset, leafsize=10)
        result = kdtree.query(testset, 5)

num_reps = 1000
cProfile.run('test_pyflann_flann(num_reps); test_scipy_spatial_kdtree(num_reps)', 'out.prof')
os.system('runsnake out.prof')
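
If pyflann or runsnake is not installed, a rough timing can still be had from the standard library. The following is a minimal sketch, not part of the original script: it times the same cKDTree build-and-query step with `timeit` and, as an assumed baseline, compares it against a brute-force NumPy search. The function names `kdtree_knn` and `brute_force_knn` are made up for illustration.

```python
import timeit

import numpy as np
from scipy.spatial import cKDTree

# Same configuration as the script above.
dim, data_size, test_size, k = 4, 1000, 1, 5
rng = np.random.default_rng(1)
dataset = rng.random((data_size, dim))
testset = rng.random((test_size, dim))

def kdtree_knn():
    # Build the tree and query it, as in test_scipy_spatial_kdtree.
    tree = cKDTree(dataset, leafsize=10)
    return tree.query(testset, k)

def brute_force_knn():
    # Baseline: compute all pairwise distances, then sort each row.
    diff = dataset[None, :, :] - testset[:, None, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))
    idx = np.argsort(dists, axis=1)[:, :k]
    return np.take_along_axis(dists, idx, axis=1), idx

# Sanity check: both methods should find the same neighbors.
kd_dist, kd_idx = kdtree_knn()
bf_dist, bf_idx = brute_force_knn()
assert np.allclose(kd_dist, bf_dist)

reps = 100
print("cKDTree    :", timeit.timeit(kdtree_knn, number=reps) / reps, "s per call")
print("brute force:", timeit.timeit(brute_force_knn, number=reps) / reps, "s per call")
```

Note that, like the original script, this rebuilds the tree on every call. If your application builds the index once and queries it many times, time the two steps separately.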

巾帼英雄 2024-11-06 17:41:56

scipy.spatial.cKDTree is fast and solid. For an example of using it for NN interpolation, see (ahem) inverse-distance-weighted-idw-interpolation-with-python on SO.

(If you could say e.g. "I have 1M points in 3d, and want k=5 nearest neighbors of 1k new points", you might get better answers or code examples. What do you want to do with the neighbors once you've found them?)
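
For the kind of problem described in that parenthetical, a minimal sketch could look like the following. It is not the code from the linked IDW answer: the point count is scaled down to 100,000 so that it runs quickly, the `values` field is a made-up function of the coordinates, and the inverse-square weighting is just one common choice.

```python
import numpy as np
from scipy.spatial import cKDTree

# Assumed sizes: the example in the text uses 1M points; 100k keeps this quick.
n_points, n_queries, k = 100_000, 1_000, 5
rng = np.random.default_rng(0)
points = rng.random((n_points, 3))    # known points in 3D
values = points.sum(axis=1)           # hypothetical scalar value at each point
queries = rng.random((n_queries, 3))  # new points to look up

# Build the tree once, then find the k nearest neighbors of every query.
tree = cKDTree(points)
dist, idx = tree.query(queries, k=k)  # both arrays have shape (n_queries, k)

# Inverse-distance weighting: closer neighbors count more.
# The floor on dist avoids division by zero for exact matches.
weights = 1.0 / np.maximum(dist, 1e-12) ** 2
weights /= weights.sum(axis=1, keepdims=True)
estimates = (weights * values[idx]).sum(axis=1)

print("first query, neighbor indices:", idx[0])
print("first query, interpolated value:", estimates[0])
```

Each row of `dist` is sorted from nearest to farthest, so `idx[:, 0]` is the single nearest neighbor if that is all you need.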
