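Why is my Python C extension leaking memory?
The function below takes a Python file handle, reads packed binary data from the file, builds a Python dictionary, and returns it. If I call it in an endless loop, it continually consumes RAM. What's wrong with my reference counting?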
static PyObject* __binParse_getDBHeader(PyObject *self, PyObject *args){
    PyObject *o; //generic object
    PyObject* pyDB = NULL; //this has to be a py file object
    if (!PyArg_ParseTuple(args, "O", &pyDB)){
        return NULL;
    } else {
        Py_INCREF(pyDB);
        if (!PyFile_Check(pyDB)){
            Py_DECREF(pyDB);
            PyErr_SetString(PyExc_IOError, "argument 1 must be open file handle");
            return NULL;
        }
    }
    FILE *fhDB = PyFile_AsFile(pyDB);
    long offset = 0;
    DB_HEADER *pdbHeader = malloc(sizeof(DB_HEADER));
    fseek(fhDB,offset,SEEK_SET); //at the beginning
    fread(pdbHeader, 1, sizeof(DB_HEADER), fhDB );
    if (ferror(fhDB)){
        fclose(fhDB);
        Py_DECREF(pyDB);
        PyErr_SetString(PyExc_IOError, "failed reading database header");
        return NULL;
    }
    Py_DECREF(pyDB);
    PyObject *pyDBHeader = PyDict_New();
    Py_INCREF(pyDBHeader);
    o=PyInt_FromLong(pdbHeader->version_number);
    PyDict_SetItemString(pyDBHeader, "version", o);
    Py_DECREF(o);
    PyObject *pyTimeList = PyList_New(0);
    Py_INCREF(pyTimeList);
    int i;
    for (i=0; i<NUM_DRAWERS; i++){
        //epochs
        o=PyInt_FromLong(pdbHeader->last_good_test[i]);
        PyList_Append(pyTimeList, o);
        Py_DECREF(o);
    }
    PyDict_SetItemString(pyDBHeader, "lastTest", pyTimeList);
    Py_DECREF(pyTimeList);
    o=PyInt_FromLong(pdbHeader->temp);
    PyDict_SetItemString(pyDBHeader, "temp", o);
    Py_DECREF(o);
    free(pdbHeader);
    return (pyDBHeader);
}
Comments (3)
PyDict_New() returns a new reference; check the docs for PyDict. So if you increase the refcount immediately after creating it, you hold two references to it. One is transferred to the caller when you return it as the result value, but the other one never goes away.
You also don't need to incref pyTimeList. It's yours when you create it. You do need to decref it, but you only decref it once, so it is leaked as well.
You also don't need to call Py_INCREF on pyDB. It's a borrowed reference, and it won't go away as long as your function does not return, because it's still referenced in a lower stack frame.
Only if you want to keep the reference in another structure somewhere do you need to increase the refcount.
Cf. the API docs.
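The "new reference plus an extra incref" leak described above can be reproduced with a toy model in plain C. This is only a sketch, not the CPython API: Obj, obj_new, obj_incref, obj_decref, make_leaky, make_correct and leaked_after are hypothetical names invented for illustration, with obj_new standing in for a call such as PyDict_New() that hands back a new reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted object; NOT the CPython API. */
typedef struct { int refcnt; } Obj;

static int live_objects = 0;   /* objects allocated and not yet freed */

/* Plays the role of PyDict_New(): returns a NEW reference (count 1). */
static Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcnt = 1;
    live_objects++;
    return o;
}

static void obj_incref(Obj *o) { o->refcnt++; }

static void obj_decref(Obj *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        live_objects--;
        free(o);
    }
}

/* Mirrors the question: incref right after creation, then return. */
static Obj *make_leaky(void) {
    Obj *d = obj_new();   /* count is 1, and that reference is ours */
    obj_incref(d);        /* count is 2, nobody will ever drop this one */
    return d;
}

/* Just hand the one reference we own to the caller. */
static Obj *make_correct(void) {
    return obj_new();
}

/* Plays the caller: receive the result, release it once, and report
   how many objects created by this call are still alive. */
static int leaked_after(Obj *(*make)(void)) {
    int before = live_objects;
    Obj *result = make();
    obj_decref(result);
    return live_objects - before;
}
```

In this model, leaked_after(make_leaky) reports one surviving object per call, while leaked_after(make_correct) reports none, which matches the steady memory growth seen when the extension function is called in a loop.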
OT: Using successive calls to PyList_Append is a performance issue. Since you know in advance how many results you'll get, you can create the list at its final size and fill it with PyList_SET_ITEM. Note that you must not decrease the refcount of o after calling PyList_SET_ITEM, because it "steals" a reference. Check the docs.
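The difference between an append that takes its own reference and a set-item that steals the caller's reference can be sketched with a toy model in plain C. This is not the CPython API: Obj, List, list_append, list_set_item_steal, list_clear, build_with_append and build_with_steal are hypothetical names, and N_ITEMS is an assumed stand-in for NUM_DRAWERS.

```c
#include <assert.h>
#include <stdlib.h>

#define N_ITEMS 4   /* assumed stand-in for NUM_DRAWERS */

/* Toy reference-counted object; NOT the CPython API. */
typedef struct { int refcnt; } Obj;
typedef struct { Obj *items[N_ITEMS]; int len; } List;

static int live_objects = 0;   /* objects allocated and not yet freed */

static Obj *obj_new(void) {    /* returns a new reference (count 1) */
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcnt = 1;
    live_objects++;
    return o;
}

static void obj_decref(Obj *o) {
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) { live_objects--; free(o); }
}

/* Like PyList_Append: the list takes its OWN reference. */
static void list_append(List *l, Obj *o) {
    assert(l->len < N_ITEMS);
    o->refcnt++;
    l->items[l->len++] = o;
}

/* Like PyList_SET_ITEM: the list STEALS the caller's reference. */
static void list_set_item_steal(List *l, int i, Obj *o) {
    assert(i >= 0 && i < l->len);
    l->items[i] = o;
}

/* Destroying the list drops one reference per stored item. */
static void list_clear(List *l) {
    for (int i = 0; i < l->len; i++) obj_decref(l->items[i]);
    l->len = 0;
}

/* Returns how many objects survive after the list is destroyed. */
static int build_with_append(int release_own) {
    int before = live_objects;
    List l = { {0}, 0 };
    for (int i = 0; i < N_ITEMS; i++) {
        Obj *o = obj_new();
        list_append(&l, o);
        if (release_own) obj_decref(o);  /* we still own our reference */
    }
    list_clear(&l);
    return live_objects - before;
}

static int build_with_steal(void) {
    int before = live_objects;
    List l = { {0}, N_ITEMS };           /* preallocated, slots empty */
    for (int i = 0; i < N_ITEMS; i++) {
        Obj *o = obj_new();
        list_set_item_steal(&l, i, o);   /* no decref: ownership moved */
    }
    list_clear(&l);
    return live_objects - before;
}
```

In this model, appending without releasing your own reference leaks every item, while the stealing variant needs no decref at all; adding one after the steal would free objects the list still points to.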
I don't know about Python-C. However, my experience with COM reference counting says that a newly created reference-counted object has a reference count of 1. So your Py_INCREF calls after PyArg_ParseTuple(args, "O", &pyDB) and after PyObject *pyDBHeader = PyDict_New(); are the culprits. With them, the reference counts are already 2.