How does direct data access with NpyIter (the new API) work? How do I handle the char * type?

Posted on 2024-12-10 14:23:17

I'm throwing up my hands and hoping that someone here knows enough about the new NpyIter API in NumPy's C API to quickly tell me what I'm doing wrong.

I have an array of shape (really big, somewhat big). The elements are doubles >= 0. For every row, I need the largest sum over any run of contiguous nonzero values. I don't know of any way to do this quickly in Python alone (really big is ~1e5 at times), so I've been using Weave instead.
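To pin down the per-row computation independently of any NumPy iterator, here is a minimal sketch in plain C. The function name `max_run_sum` and its signature are hypothetical, introduced only for illustration; it assumes one row is available as a contiguous buffer of non-negative doubles and uses the same accumulate-and-reset logic as the loops in this question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest sum over any run of contiguous nonzero
 * values in one row. Assumes every element is >= 0. */
double max_run_sum(const double *row, size_t n)
{
    assert(row != NULL || n == 0);

    double best = 0.0; /* largest run sum seen so far */
    double run = 0.0;  /* sum of the run currently being extended */

    for (size_t i = 0; i < n; i++) {
        run += row[i];
        if (run > best) {
            best = run;
        }
        if (row[i] == 0.0) {
            run = 0.0; /* a zero ends the current run */
        }
    }
    return best;
}
```

For a row such as {1, 2, 0, 4, 0.5, 0, 3} the runs sum to 3, 4.5, and 3, so the function returns 4.5.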

In my old code, I had the following:

            double *p1,*res;
            double g,d,q;
            PyArrayIterObject *itr;
            int axis = 1;
            g = 0;
            d = 0;
            itr = (PyArrayIterObject *) PyArray_IterAllButAxis(py_x,&axis);
            while(PyArray_ITER_NOTDONE(itr)) {
                const int go = x_array->strides[axis]/sizeof(double);
                p1 = (double *) PyArray_ITER_DATA(itr);
                res = (double *) PyArray_GETPTR1(py_r,itr->index);
                g = 0;
                d = 0;
                for (int i = 0; i < x_array->dimensions[axis]; i++) {
                    d+=*p1;
                    if (d>g) g=d;
                    if ((*p1)==0) d=0;
                    p1+=go;
                }
                *res = g;
                PyArray_ITER_NEXT(itr);
            }
            PyArray_free(itr);

This works, but it leaks memory terribly. I'm not sure how to stop it from leaking, and the documentation for the old PyArrayIter is rather lacking in terms of memory management.

I've tried to write new code with the NpyIter API, but its documentation is lacking on everything other than memory management. Specifically, I'm not at all sure how I'm supposed to get access to the actual array values. I've tried the following:

            char *p1; 
            double *res;
            char **p1p;
            double g,d,q;
            int go;
            NpyIter* iter;
            NpyIter_IterNextFunc *iternext;
            g = 0;
            d = 0;
            iter = NpyIter_New(x_array, NPY_ITER_READONLY|NPY_ITER_EXTERNAL_LOOP, NPY_KEEPORDER, NPY_NO_CASTING, NULL);
            iternext = NpyIter_GetIterNext(iter, NULL);
            p1p = NpyIter_GetDataPtrArray(iter);

            do {
                p1 = *p1p;
                const int go = x_array->strides[1]/sizeof(double);
                res = (double *) PyArray_GETPTR1(py_r,NpyIter_GetIterIndex(iter));
                g = 0;
                d = 0;
                for (int i = 0; i < x_array->dimensions[1]; i++) {
                    d+= p1;
                    if (d>g) g=d;
                    if ((*p1)==0) d=0;
                    p1+=go;
                }
                *res = g;
            } while(iternext(iter));

            NpyIter_Deallocate(iter);

This obviously doesn't work because of the char * vs. double * mismatch. I'm not sure how to take the char ** returned from NpyIter_GetDataPtrArray and turn it into actual array values: the documentation, extremely unhelpfully, uses a function that isn't shown and that takes a char *.

How can I do this in a way that works and doesn't leak memory?
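As a point of reference for the char * question, here is a minimal sketch in plain C of reading doubles through a byte pointer and a byte stride, which is the general shape of data that a char * slot implies. The function `max_run_sum_strided` is hypothetical and does not call NumPy. It assumes the pointer addresses the first element of a row, that the stride is expressed in bytes (so the pointer advances by the stride itself, not by stride / sizeof(double)), and that the element type really is double. In an NpyIter loop these inputs would presumably come from `NpyIter_GetDataPtrArray`, `NpyIter_GetInnerStrideArray`, and `NpyIter_GetInnerLoopSizePtr`. Whether one inner loop covers exactly one row depends on how the iterator was constructed, which this sketch does not address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: same run-sum logic, but the row is described by a
 * byte pointer and a byte stride instead of a double pointer. */
double max_run_sum_strided(const char *data, ptrdiff_t stride, ptrdiff_t count)
{
    assert(data != NULL || count == 0);

    double best = 0.0;
    double run = 0.0;

    for (ptrdiff_t i = 0; i < count; i++) {
        double value;
        /* Copy the bytes out rather than dereferencing a cast pointer, so
         * the read is valid even if the address is not double-aligned. For
         * aligned data, *(const double *)data reads the same value. */
        memcpy(&value, data, sizeof value);

        run += value;
        if (run > best) {
            best = run;
        }
        if (value == 0.0) {
            run = 0.0;
        }
        data += stride; /* char arithmetic: advance by bytes, not elements */
    }
    return best;
}
```

With a 3x2 array of doubles stored row by row, walking down a column uses a stride of 2 * sizeof(double) bytes, while walking along a contiguous row uses a stride of sizeof(double).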
