Numba does not speed up the code
I tried this code both with Numba and in plain Python, but both runs completed in 13 seconds, so Numba gave no speedup.
How can I set up Numba for this situation?
import numpy as np
from numba import jit, cuda

a = []

@jit(target_backend="cuda")
def func():
    for i in range(100000):
        a.append(i)
    return a

print(func())
Comments (1)
CUDA cannot be used here because this code cannot run on the GPU (and even if it were modified to do so, it would be inefficient). GPUs are very different from CPUs and are programmed differently. To understand why, please read the CUDA programming guide.
If you run the code and read the warnings from Numba, you can see that the code falls back to a basic Python implementation.
The reason is that the type of the global list a is not provided and Numba fails to infer it. Numba is fast because its type system lets it compile the code to a native binary. Additionally, you should not modify global variables: it is poor practice in terms of software engineering, and Numba does not support it anyway.
Thus, you need to use a typed list that is returned from the function. Note that typed lists are not much faster than Python lists when read or written from CPython, because Numba has to convert to and from CPython lists, which is an expensive operation. Still, on my machine this is about 3 times faster.
Corrected code:
For more information about lazy compilation, please read the Numba documentation.