Numba does not speed up my code

Posted on 2025-01-13 09:39:45

I ran this code both with Numba and as plain Python. Both versions finished in 13 seconds, so Numba gave no speed-up.

How should I set up Numba for this case?

import numpy as np
from numba import jit, cuda
a=[]
@jit(target_backend="cuda")
def func():
    for i in range(100000):
        a.append(i)
    return a

print(func())


 

Comments (1)

甜尕妞 2025-01-20 09:39:45

CUDA cannot be used here because this code cannot run on a GPU (and even if it were modified to do so, it would be inefficient). GPUs are very different from CPUs and are programmed differently. To understand why, please read the CUDA programming guide.

If you run the code and read the warnings from Numba, you can see that it falls back to a basic Python implementation:

Compilation is falling back to object mode WITH looplifting enabled because Function "func" failed type inference due to: Untyped global name 'a': Cannot type empty list

The reason is that no type is given for the list a, and Numba fails to infer one. Numba is fast because its type system enables it to compile the code to a native binary.

Additionally, you should not modify global variables. It is not a good idea in terms of software engineering, and Numba does not support it anyway.

Thus, you need to use a typed list that is created inside the function and returned from it. Note that typed lists are not much faster than Python lists when read from or written to CPython, because Numba has to convert them from/to CPython lists, which is an expensive operation. Still, on my machine this is about 3 times faster.
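Since the conversion back to a CPython list is the expensive part, one way to avoid it is to return a NumPy array instead of a list. This is an alternative technique, not the approach used in the corrected code of this answer, and it is only a minimal sketch: the name fill_array is hypothetical, and the try/except fallback is an assumption added so the snippet still runs (without any speed-up) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # nopython-mode JIT decorator
except ImportError:
    # Assumption: fallback so the sketch runs without Numba (no speed-up then).
    def njit(func):
        return func


@njit
def fill_array(n):
    # Preallocate a typed array; no list-to-CPython conversion is needed
    # when the result is handed back to the interpreter.
    out = np.empty(n, dtype=np.int64)
    for i in range(n):
        out[i] = i
    return out


values = fill_array(100000)
print(values[:5], values.shape)
```

Whether this is actually faster than the list version depends on the machine and the Numba version, so it is worth measuring rather than assuming.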

Corrected code:

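from numba import jit

@jit(nopython=True)  # fail loudly instead of falling back to object mode
def func():
    a = []  # local list: Numba infers its element type from append()
    for i in range(100000):
        a.append(i)
    return a

func()  # First call: compiles the function (lazy compilation)

a = func()  # Second call: runs the already compiled code

print(a)  # Printing is slow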

For more information about lazy compilation, please read the Numba documentation.
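The effect of lazy compilation can be seen by timing two consecutive calls. The following is a minimal sketch, not a rigorous benchmark: build_list and timed are illustrative names that do not appear in the original code, and the try/except fallback is an assumption added so the snippet still runs where Numba is not installed (in that case nothing is compiled and both calls take a similar time).

```python
import time

try:
    from numba import njit  # nopython-mode JIT decorator
except ImportError:
    # Assumption: fallback so the sketch runs without Numba (no speed-up then).
    def njit(func):
        return func


@njit
def build_list(n):
    # The list is created inside the function, so Numba can infer
    # its element type from the values appended to it.
    a = []
    for i in range(n):
        a.append(i)
    return a


def timed(n):
    """Return (result, elapsed seconds) for one call of build_list(n)."""
    start = time.perf_counter()
    result = build_list(n)
    return result, time.perf_counter() - start


first_result, first_time = timed(100000)    # includes JIT compilation
second_result, second_time = timed(100000)  # reuses the compiled code

print(f"first call:  {first_time:.4f} s")
print(f"second call: {second_time:.4f} s")
```

With Numba installed, the first call is expected to be noticeably slower than the second because it includes compilation; only the second measurement reflects the speed of the compiled code.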
