返回从 C 到 R 的动态向量

发布于 2024-12-26 00:13:42 字数 236 浏览 1 评论 0原文

我正在用 C 编写一些代码,以便从 R 动态调用。

该代码生成一条随机 泊松 过程的路径,直到所需的时间 T。 因此,每次调用我的 C 函数时,返回向量的长度都会有所不同,具体取决于生成的随机数。

我必须创建什么 R 数据结构? LISTSXP?另一个?

如何创建它以及如何附加它?特别是我如何将其返回给 R?

感谢您的帮助。

I am writing some code in C to be called dynamically from R.

This code generates a path of a random Poisson process up to a desired time T.
So at every call to my C function, the length of the returned vector will be different depending on the random numbers generated.

What R data structure will I have to create? a LISTSXP? another one?

How can I create it, and how can I append to it? And especially how can I return it back to R?

Thanks for help.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

傻比既视感 2025-01-02 00:13:42

这实际上取决于您想要使用什么作为临时结构,因为最终您无论如何都必须为结果分配一个向量。因此,无论您将使用什么,都不会返回什么。有几种可能性:

  1. 使用Calloc + Realloc + Free,它允许您根据需要扩展临时内存。一旦获得完整的集合,就可以分配结果向量并返回它。
  2. 如果您可以轻松高估大小,则可以过度分配结果向量并在返回之前使用 SETLENGTH 。不过,这存在问题,因为结果将保持过度分配,直到稍后重复。
  3. 您可以使用您暗示的向量块链表,即分配和保护一个对列表,根据需要将向量附加到尾部。最后,您分配返回向量并复制您分配的块的内容。这比上面两个更复杂。

它们各自都有缺点和优点,因此实际上取决于您的应用程序来选择最适合您的一种。

编辑:添加了使用配对列表方法的示例。我仍然推荐 Realloc 方法,因为它更容易,但尽管如此:

#define BLOCK_SIZE xxx /* some reasonable size for increments - could be adaptive, too */ 
SEXP block; /* last vector block */
SEXP root = PROTECT(list1(block = allocVector(REALSXP, BLOCK_SIZE)));
SEXP tail = root;
double *values = REAL(block);
int count = 0, total = 0;
do { /* your code to generate values - if you want to add one 
        first try to add it to the existing block, otherwise allocate new one */
    if (count == BLOCK_SIZE) { /* add a new block when needed */
        tail = SETCDR(tail, list1(block = allocVector(REALSXP, BLOCK_SIZE)));
        values = REAL(block);
        total += count;
        count = 0;
    }
    values[count++] = next_value;
} while (...);
total += count;
/* when done, we need to create the result vector */
{
    SEXP res = allocVector(REALSXP, total);
    double *res_values = REAL(res);
    while (root != R_NilValue) {
        int size = (CDR(root) == R_NilValue) ? count : BLOCK_SIZE;
        memcpy(res_values, REAL(CAR(root)), sizeof(double) * size);
        res_values += size;
        root = CDR(root);
    }
    UNPROTECT(1);
    return res;
 }

It's really up to you what you want to use as temporary structure, because in the end you'll have to allocate a vector for result anyway. So whatever you'll be using is not what you'll return. There are several possibilities:

  1. use Calloc + Realloc + Free which allows you to expand the temporary memory as needed. Once you have the full set, you allocate the result vector and return it.
  2. if you can overestimate the size easily, you can over-allocate the result vector and use SETLENGTH before you return it. There are issues with this, though, because the result will remain over-allocated until duplicated later on.
  3. You can use what you hinted at which is a chained list of vector blocks, i.e. allocate and protect one pairlist of which you append vectors to the tail as you need them. At the end you allocate the return vector and copy content of the blocks you allocated. This is more convoluted than both of the above.

Each of them has drawbacks and benefits, so it really depends on your application to pick the one best suited for you.

Edit: added an example of using the pairlists approach. I would still recommend the Realloc approach since it's much easier, but nonetheless:

#define BLOCK_SIZE xxx /* some reasonable size for increments - could be adaptive, too */ 
SEXP block; /* last vector block */
SEXP root = PROTECT(list1(block = allocVector(REALSXP, BLOCK_SIZE)));
SEXP tail = root;
double *values = REAL(block);
int count = 0, total = 0;
do { /* your code to generate values - if you want to add one 
        first try to add it to the existing block, otherwise allocate new one */
    if (count == BLOCK_SIZE) { /* add a new block when needed */
        tail = SETCDR(tail, list1(block = allocVector(REALSXP, BLOCK_SIZE)));
        values = REAL(block);
        total += count;
        count = 0;
    }
    values[count++] = next_value;
} while (...);
total += count;
/* when done, we need to create the result vector */
{
    SEXP res = allocVector(REALSXP, total);
    double *res_values = REAL(res);
    while (root != R_NilValue) {
        int size = (CDR(root) == R_NilValue) ? count : BLOCK_SIZE;
        memcpy(res_values, REAL(CAR(root)), sizeof(double) * size);
        res_values += size;
        root = CDR(root);
    }
    UNPROTECT(1);
    return res;
 }
把回忆走一遍 2025-01-02 00:13:42

如果您愿意从 C 切换到 C++,您可以免费获得附加的 Rcpp 层。下面是(仍然相当不完整)RcppExample 包页面中的一个示例:

#include <RcppClassic.h>
#include <cmath>

RcppExport SEXP newRcppVectorExample(SEXP vector) {
BEGIN_RCPP

    Rcpp::NumericVector orig(vector);                // keep a copy 
    Rcpp::NumericVector vec(orig.size());            // create vector same size

    // we could query size via
    //   int n = vec.size();
    // and loop over the vector, but using the STL is so much nicer
    // so we use a STL transform() algorithm on each element
    std::transform(orig.begin(), orig.end(), vec.begin(), ::sqrt);

    return Rcpp::List::create(Rcpp::Named( "result" ) = vec,
                              Rcpp::Named( "original" ) = orig) ;

END_RCPP
}

如您所见,没有显式内存分配、释放、PROTECT/UNPROTECT 等,并且您将获得第一类 R 列表对象。

还有更多示例,包括 其他问题,例如就像这个

编辑:你并没有真正说出你的路径会做什么,但作为一个简单的说明,这里是使用 Rcpp 添加 cumsum()的 C++ 代码rpois() 其行为就像在 R 中一样:

R> library(inline)
R> 
R> fun <- cxxfunction(signature(ns="integer", lambdas="numeric"),
+                    plugin="Rcpp",
+                    body='
+    int n = Rcpp::as<int>(ns);
+    double lambda = Rcpp::as<double>(lambdas);
+ 
+    Rcpp::RNGScope tmp;                  // make sure RNG behaves
+ 
+    Rcpp::NumericVector vec = cumsum( rpois( n, lambda ) );
+ 
+    return vec;
+ ')
R> set.seed(42)
R> fun(3, 0.3)
[1] 1 2 2
R> fun(4, 0.4)
[1] 1 1 1 2

作为证明,回到 R 中,如果我们设置种子,我们可以生成完全相同的数字:

R> set.seed(42)
R> cumsum(rpois(3, 0.3))
[1] 1 2 2
R> cumsum(rpois(4, 0.4))
[1] 1 1 1 2
R> 

If you're open to switching from C to C++, you get added layer of Rcpp for free. Here is an example from the page of the (still rather incomple) RcppExample package:

#include <RcppClassic.h>
#include <cmath>

RcppExport SEXP newRcppVectorExample(SEXP vector) {
BEGIN_RCPP

    Rcpp::NumericVector orig(vector);                // keep a copy 
    Rcpp::NumericVector vec(orig.size());            // create vector same size

    // we could query size via
    //   int n = vec.size();
    // and loop over the vector, but using the STL is so much nicer
    // so we use a STL transform() algorithm on each element
    std::transform(orig.begin(), orig.end(), vec.begin(), ::sqrt);

    return Rcpp::List::create(Rcpp::Named( "result" ) = vec,
                              Rcpp::Named( "original" ) = orig) ;

END_RCPP
}

As you see, no explicit memory allocation, freeing, PROTECT/UNPROTECT etc, and you get a first class R list object back.

There are lots more examples, included in other SO questions such as this one.

Edit: And you didn't really say what you paths would do, but as a simple illustration, here is C++ code using the Rcpp additions cumsum() and rpois() which behave just like they do in R:

R> library(inline)
R> 
R> fun <- cxxfunction(signature(ns="integer", lambdas="numeric"),
+                    plugin="Rcpp",
+                    body='
+    int n = Rcpp::as<int>(ns);
+    double lambda = Rcpp::as<double>(lambdas);
+ 
+    Rcpp::RNGScope tmp;                  // make sure RNG behaves
+ 
+    Rcpp::NumericVector vec = cumsum( rpois( n, lambda ) );
+ 
+    return vec;
+ ')
R> set.seed(42)
R> fun(3, 0.3)
[1] 1 2 2
R> fun(4, 0.4)
[1] 1 1 1 2

And as a proof, back in R, if we set the seed, we can generate exactly the same numbers:

R> set.seed(42)
R> cumsum(rpois(3, 0.3))
[1] 1 2 2
R> cumsum(rpois(4, 0.4))
[1] 1 1 1 2
R> 
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文