R extension in C: setting matrix row/column names

Posted 2024-11-02 07:33:55

I'm writing an R package that manipulates matrices in C. Currently, the matrices returned to R have numbers for the row/column names. I would rather assign my own row/column names when modifying the object in C.

I've googled around for about an hour, but haven't found a good solution yet. The closest I've found is dimnames, but I want to name each column, not just the two dimensions. The matrices get larger than 4x4; below is just a small example of what I want to do.

The number of rows is 4^x, where x is the length of the row name.

Current
     [,1] [,2] [,3] [,4]
[1,] 0.20 0.00 0.00 0.80
[2,] 0.25 0.25 0.25 0.25
[3,] 0.25 0.25 0.25 0.25
[4,] 1.00 0.00 0.00 0.00
[5,] 0.20 0.00 0.00 0.80
[6,] 0.25 0.25 0.25 0.25
[7,] 0.25 0.25 0.25 0.25
[8,] 1.00 0.00 0.00 0.00
[9,] 0.20 0.00 0.00 0.80
[10,] 0.25 0.25 0.25 0.25
[11,] 0.25 0.25 0.25 0.25
[12,] 1.00 0.00 0.00 0.00
[13,] 0.20 0.00 0.00 0.80
[14,] 0.25 0.25 0.25 0.25
[15,] 0.25 0.25 0.25 0.25
[16,] 1.00 0.00 0.00 0.00

Desired
     [A] [C] [G] [T]
 [AA] 0.20 0.00 0.00 0.80
 [AC] 0.25 0.25 0.25 0.25
 [AG] 0.25 0.25 0.25 0.25
 [AT] 1.00 0.00 0.00 0.00
 [CA] 0.20 0.00 0.00 0.80
 [CC] 0.25 0.25 0.25 0.25
 [CG] 0.25 0.25 0.25 0.25
 [CT] 1.00 0.00 0.00 0.00
 [GA] 0.20 0.00 0.00 0.80
 [GC] 0.25 0.25 0.25 0.25
 [GG] 0.25 0.25 0.25 0.25
 [GT] 1.00 0.00 0.00 0.00
 [TA] 0.20 0.00 0.00 0.80
 [TC] 0.25 0.25 0.25 0.25
 [TG] 0.25 0.25 0.25 0.25
 [TT] 1.00 0.00 0.00 0.00

Comments (3)

2024-11-09 07:33:55

If you are open to C++ instead of C, then Rcpp can make this a little easier. We just create a list object with row and column names as we would in R, and assign that to the dimnames attribute of the matrix object:

R> library(inline)                         # to compile, link, load the code here
R> src <- '
+   Rcpp::NumericMatrix x(2,2);
+   x.fill(42);                           // or more interesting values
+   // C++0x can assign a set of values to a vector, but we use older standard
+   Rcpp::CharacterVector rows(2); rows[0] = "aa"; rows[1] = "bb";
+   Rcpp::CharacterVector cols(2); cols[0] = "AA"; cols[1] = "BB";
+   // now create an object "dimnms" as a list with rows and cols
+   Rcpp::List dimnms = Rcpp::List::create(rows, cols);
+   // and assign it
+   x.attr("dimnames") = dimnms;
+   return(x);
+ '
R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
R> fun()
   AA BB
aa 42 42
bb 42 42
R> 

The assignment of the row and column names is this manual because the current C++ standard does not allow assigning a set of values to a vector at initialization, but that will change.

Edit: I just realized that I can of course use the static create() method on the row and column names too, which makes this a little easier and shorter still:

R> src <- '
+   Rcpp::NumericMatrix x(2,2);
+   x.fill(42);                           // or more interesting values
+   Rcpp::List dimnms =                   // two vec. with static names
+       Rcpp::List::create(Rcpp::CharacterVector::create("cc", "dd"),
+                          Rcpp::CharacterVector::create("ee", "ff"));
+   // and assign it
+   x.attr("dimnames") = dimnms;
+   return(x);
+ '
R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
R> fun()
   ee ff
cc 42 42
dd 42 42
R> 

So we are down to three or four statements, no monkeying with PROTECT / UNPROTECT and no memory management.

可是我不能没有你 2024-11-09 07:33:55

As Jim said, this is much easier to do in R. I'm passing the names into the C function via the nam argument.

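#include <Rinternals.h>

/* Return a length(nam) x length(nam) numeric matrix whose row and
   column names are both taken from the character vector nam. */
SEXP myMat(SEXP nam) {
  /*PrintValue(nam);*/
  SEXP ans, dimnames;
  /* the matrix cells are allocated but not initialized here */
  PROTECT(ans = allocMatrix(REALSXP, length(nam), length(nam)));
  /* dimnames is a list of two elements: row names, then column names */
  PROTECT(dimnames = allocVector(VECSXP, 2));
  SET_VECTOR_ELT(dimnames, 0, nam);
  SET_VECTOR_ELT(dimnames, 1, nam);
  setAttrib(ans, R_DimNamesSymbol, dimnames);
  UNPROTECT(2);
  return(ans);
}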

If you put that code in a file called myMat.c, you can test it via the lines below. I'm using Ubuntu, so you will have to change myMat.so to myMat.dll if you're on Windows.

R CMD SHLIB myMat.c
Rscript -e 'dyn.load("myMat.so"); .Call("myMat", c("A","C","G","T"))'

喜你已久 2024-11-09 07:33:55

The note above is instructive. The dimnames attribute is a list with as many elements as the dataset has dimensions, where each element is a character vector holding one name per element along that dimension, i.e., list(c('a','c','g','t'), c('a','c','g','t')).

To set that in C, I would recommend:

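PROTECT(dimnames = allocVector(VECSXP, 2));
PROTECT(rownames = allocVector(STRSXP, 4));
PROTECT(colnames = allocVector(STRSXP, 4));
/* fill each name, e.g. SET_STRING_ELT(rownames, 0, mkChar("AA")); */
SET_VECTOR_ELT(dimnames, 0, rownames);   /* row names */
SET_VECTOR_ELT(dimnames, 1, colnames);   /* column names */
setAttrib( ? , R_DimNamesSymbol, dimnames);
UNPROTECT(3);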

You'll have to then set the relevant rowname and colname elements. In general, this stuff is much easier to do in R.
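
For the 4^x row labels in the question, the names themselves can be generated in plain C before they are copied into the STRSXP. The following is only a minimal sketch, assuming the labels enumerate the alphabet A, C, G, T in base-4 order (AA, AC, AG, AT, CA, ...) as in the desired output. The names `kmer_count`, `kmer_label` and `ALPHABET` are hypothetical and do not come from R or from the question.

```c
#include <stddef.h>

/* Alphabet for the labels; assumed from the column names in the question. */
static const char ALPHABET[] = "ACGT";

/* Number of rows for labels of length k, i.e. 4^k. */
size_t kmer_count(unsigned k)
{
    size_t n = 1;
    for (unsigned i = 0; i < k; i++)
        n *= 4;
    return n;
}

/*
 * Write the label for the zero-based row `index` into `buf`, which must
 * hold at least k + 1 bytes. The index is read as a k-digit base-4 number,
 * most significant digit first, so with k = 2: 0 -> "AA", 1 -> "AC".
 */
void kmer_label(size_t index, unsigned k, char *buf)
{
    for (unsigned pos = k; pos > 0; pos--) {
        buf[pos - 1] = ALPHABET[index % 4];  /* lowest digit goes last */
        index /= 4;
    }
    buf[k] = '\0';
}
```

Inside a .Call routine, each label could then be stored with SET_STRING_ELT(rownames, i, mkChar(buf)) for i from 0 to kmer_count(k) - 1, with rownames allocated as a STRSXP of that length rather than the fixed 4 used above.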

jim
