In Python, why are modules implemented in C faster than pure Python modules, and how do I write one?
The Python documentation states that cPickle is faster than pickle because the former is implemented in C. What does that mean exactly?
I am making a module for advanced mathematics in Python, and some calculations take a significant amount of time. Does that mean that if my program is implemented in C, it can be made much faster?
I wish to import this module from other Python programs, just the way I can import cPickle.
Can you explain how to implement a Python module in C?
You can write fast C code and then call it from your Python scripts, so the program as a whole runs faster.[1]
http://docs.python.org/extending/index.html#extending-index
An example is NumPy, whose core is written in C ( https://numpy.org/ ).
The typical approach is to implement only the bottleneck in C for its speed (or, of course, to use an existing library written in C ;) ) and to keep the remaining code in Python.
[1] By the way, this is why cPickle is faster than pickle.
Edit:
Take a look at Pyrex: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html
It's not the 'official' way, but it may be useful.
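To get a feel for the gap described above, here is a minimal sketch that times a hand-written Python loop against the built-in sum, which CPython implements in C. The function name py_sum and the data size are made up for illustration, and the actual timings will vary by machine and interpreter.

```python
import timeit


def py_sum(values):
    """Add up the values with an explicit Python-level loop."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Both versions must agree before the timings mean anything.
assert py_sum(data) == sum(data)

# Time 20 runs of each; the built-in runs its loop in C.
t_py = timeit.timeit(lambda: py_sum(data), number=20)
t_c = timeit.timeit(lambda: sum(data), number=20)

print(f"pure Python loop: {t_py:.4f}s")
print(f"built-in sum:     {t_c:.4f}s")
```

On CPython the built-in is usually several times faster, which is the same effect that makes a C implementation of a bottleneck worthwhile.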
As mentioned, NumPy is excellent for vector computations. (It could be better still, but the comment that it beats anything you could write without putting in real work is definitely true.)
Not everything can be easily vectorized, though, so if you do have tight inner loops with lots of function calls (say, a heavily recursive algorithm), you still have a couple of options. Probably the most popular is Cython, which lets you write modules and functions in a kind of annotated Python and get C-like speed when you need it.
Or maybe your time is dominated by library calls to compute eigenvalues, invert matrices, evaluate special functions, or divide really large integers, in which case the time spent in Python might not even matter. (The Sage project handles many of these very well, by the way, if what you're doing is more mathematical than pure number crunching.) It all depends on the details of the kind of numerics you're doing.
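One way to find out which of these cases applies is to profile before rewriting anything. The following is a minimal sketch using the standard library's cProfile; the function special_sum is a hypothetical stand-in for your own numerical code.

```python
import cProfile
import io
import math
import pstats


def special_sum(n):
    """Hypothetical workload: many calls into a C-implemented function."""
    return sum(math.lgamma(k) for k in range(1, n))


profiler = cProfile.Profile()
profiler.enable()
result = special_sum(20_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

The report lists C-implemented calls such as math.lgamma alongside the Python-level functions, so you can see whether the time goes to library code or to your own loops.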
When you write a function in Python, a new function object is created, and the function's code is parsed and compiled to bytecode (stored in the "func_code" attribute, which Python 3 calls "__code__"). When you call that function, the interpreter reads its bytecode and executes it.
If you write the same function in C, following the Python/C API to make it available in Python, the interpreter still creates a function object, but that function has no bytecode.
When the interpreter encounters a call to that function, it calls the real C function, which therefore executes at "machine" speed rather than "Python virtual machine" speed.
You can verify this by inspecting functions written in C: they have no code attribute.
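A quick check along those lines, as a minimal sketch assuming CPython 3 (where the attribute is named __code__ rather than func_code). The helper py_hypot is invented for the comparison; math.hypot is the C-implemented counterpart.

```python
import math


def py_hypot(x, y):
    """Pure Python function: compiled to bytecode by the interpreter."""
    return (x * x + y * y) ** 0.5


# A Python function carries a code object holding its bytecode.
has_py_code = hasattr(py_hypot, "__code__")

# A function implemented in C has no code object at all.
has_c_code = hasattr(math.hypot, "__code__")

print(type(py_hypot).__name__, has_py_code)
print(type(math.hypot).__name__, has_c_code)
```

On CPython the first line reports a plain function with a code object, while math.hypot shows up as a built-in with none.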
To learn how to write C code for use from Python, follow the guides on the official site.
Anyway, if you are simply doing N-dimensional array calculations, NumPy ought to be sufficient.
Besides Pyrex/Cython, already mentioned, you have other alternatives:
Shed Skin: Translates (a restricted subset of) Python to C++. It can automatically generate an extension module for you. You'd create an extension doing this (assuming Linux):
PyPy: A faster Python, with a JIT compiler. You can simply run your code on it instead of on CPython. At the time of writing it only supports Python 2.5, with 2.7 support coming soon. It can give huge speedups on math-heavy code. To install and run it (assuming Linux 32-bit):
Weave: Allows you to write C inline, then compiles it.
Edit: If you want us to run these tools for you and benchmark them, just post your code ;)
I am very late, but if anyone needs to know this in 2023, here is the solution:
Compile the C source into a position-independent object file:
gcc -Wall -fPIC -c library.c
There should now be a .o file in the same directory.
Convert the object file (.o) into a shared object (.so):
gcc -shared library.o -o library.so
Load the shared object from Python (after import ctypes; the raw string keeps the backslashes from being read as escape sequences):
clibrary = ctypes.CDLL(r"path\to\file.so")
Call the function:
clibrary.function(args)
and you have called a C function in Python!
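The same ctypes pattern can be tried without compiling anything by loading a C library that is already installed. This is a minimal sketch that assumes a Unix-like system where the C math library can be located; it also declares the argument and return types, which ctypes needs for anything other than plain int arguments and results.

```python
import ctypes
import ctypes.util
import math

# Assumption: a Unix-like system where the C math library can be found.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path)

# Declare the C signature: double sqrt(double).
# Without this, ctypes would treat the argument and result as C ints.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

c_result = libm.sqrt(2.0)
print(c_result, math.sqrt(2.0))
```

For your own library, you would point ctypes.CDLL at your .so file and set argtypes and restype on each exported function in the same way.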