Suggestions on how to speed up a distance calculation
Consider the following class:
from numpy import var  # assumption: the question does not show where var comes from; numpy.var fits the usage

class SquareErrorDistance(object):
    def __init__(self, dataSample):
        variance = var(list(dataSample))
        if variance == 0:
            self._norm = 1.0
        else:
            self._norm = 1.0 / (2 * variance)

    def __call__(self, u, v):  # u and v are floats
        return (u - v) ** 2 * self._norm
I use it to calculate the distance between two elements of a vector. I basically create one instance of that class for every dimension of the vector that uses this distance measure (there are dimensions that use other distance measures). Profiling reveals that the __call__ function of this class accounts for 90% of the running time of my knn implementation (who would have thought). I do not think there is any pure-Python way to speed this up, but maybe if I implement it in C?
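For illustration, a minimal sketch of how such per-dimension measures might be combined into one distance (the helper names and the use of SquareErrorDistance for every column are assumptions, not taken from the question):

def make_distance_functions(trainingData):
    # One distance callable per dimension (column) of the training data.
    columns = list(zip(*trainingData))
    return [SquareErrorDistance(col) for col in columns]

def combined_distance(distFuncs, x, y):
    # Total distance between vectors x and y: sum of the per-dimension distances.
    return sum(d(u, v) for d, u, v in zip(distFuncs, x, y))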
If I run a simple C program that just calculates distances for random values using the formula above, it is orders of magnitude faster than Python. So I tried using ctypes and calling a C function that does the computation, but apparently the conversion of the parameters and return values is far too expensive, because the resulting code is much slower.
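For reference, a sketch of what such a ctypes attempt could look like (the library name libsqdist.so and the function name sq_dist are hypothetical, not from the question):

import ctypes

# Hypothetical C side, compiled into libsqdist.so:
#   double sq_dist(double u, double v, double norm) { return (u - v) * (u - v) * norm; }
_lib = ctypes.CDLL("./libsqdist.so")
_lib.sq_dist.argtypes = [ctypes.c_double, ctypes.c_double, ctypes.c_double]
_lib.sq_dist.restype = ctypes.c_double

def c_square_error(u, v, norm):
    # Each call converts Python floats to c_double and back, which is where
    # the per-call overhead described above comes from.
    return _lib.sq_dist(u, v, norm)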
I could of course implement the entire knn in C and just call that, but the problem is that, as I described, I use different distance functions for some dimensions of the vectors, and translating these to C would be too much work.
So what are my alternatives? Will writing the C-function using the Python C-API get rid of the overhead? Are there any other ways to speed this calculation up?
Comments (2)
The following Cython code (I realize the first line of __init__ is different, I replaced it with random stuff because I don't know var and because it doesn't matter anyway - you stated __call__ is the bottleneck):
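A sketch along those lines, assuming the module lives in a file such as square_error_distance.pyx (making _norm a typed field requires turning the class into a cdef class, and the first line of __init__ is only a placeholder, as noted above):

from random import uniform  # placeholder data source, standing in for var()

cdef class SquareErrorDistance:
    cdef double _norm                        # typed field

    def __init__(self, dataSample):
        variance = uniform(0.5, 2.0)         # placeholder for var(list(dataSample))
        if variance == 0:
            self._norm = 1.0
        else:
            self._norm = 1.0 / (2 * variance)

    def __call__(self, double u, double v):  # typed parameters
        return (u - v) ** 2 * self._norm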
Compiled via a simple setup.py (just the example from the docs with the file name altered), it performs nearly 20 times better than the equivalent pure Python in a simple contrived timeit benchmark. Note that the only changes were cdefs for the _norm field and the __call__ parameters. I consider this pretty impressive.
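The setup.py referred to above, with the assumed file name square_error_distance.pyx substituted in, would look roughly like the basic example from the Cython documentation:

from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("square_error_distance.pyx"))

Build it in place with python setup.py build_ext --inplace and import the resulting extension module as usual.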
This probably won't help much, but you can rewrite it using nested functions:
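A sketch of what such a nested-function rewrite might look like (the factory name square_error_distance is illustrative; var is assumed to be the same function used in the question, e.g. numpy.var):

from numpy import var  # assumption, as in the question

def square_error_distance(dataSample):
    # Precompute the normalisation once; the inner function closes over it,
    # so each call does a fast closure-cell lookup instead of an instance
    # attribute lookup.
    variance = var(list(dataSample))
    norm = 1.0 if variance == 0 else 1.0 / (2 * variance)

    def distance(u, v):  # u and v are floats
        return (u - v) ** 2 * norm

    return distance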