Creating (and accessing) a sparse matrix with NA default entries

Posted 2024-08-01 14:04:19

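After learning about the options for working with sparse matrices in R, I want to use the Matrix package to create a sparse matrix from the following data frame, with all other elements set to NA.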

     s    r d
1 1089 3772 1
2 1109  190 1
3 1109 2460 1
4 1109 3071 2
5 1109 3618 1
6 1109   38 7

I know I can create a sparse matrix with the following and access its elements as usual:

> library(Matrix)
> Y <- sparseMatrix(s,r,x=d)
> Y[1089,3772]
[1] 1
> Y[1,1]
[1] 0

But when I wanted the default value to be NA, I tried the following:

  M <- Matrix(NA,max(s),max(r),sparse=TRUE)
  for (i in 1:nrow(X))
    M[s[i],r[i]] <- d[i]

and got this error:

Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class "numeric" is not valid for slot "x" in an object of class "lgCMatrix"; is(value, "logical") is not TRUE

Not only that, I find that M takes much longer than Y to access elements.

> system.time(Y[3,3])
   user  system elapsed 
  0.000   0.000   0.003 
> system.time(M[3,3])
   user  system elapsed 
  0.660   0.032   0.995

How should I be creating this matrix? Why is one matrix so much slower to work with than the other?

Here's the code snippet for the above data:

X <- structure(list(s = c(1089, 1109, 1109, 1109, 1109, 1109), r = c(3772, 
190, 2460, 3071, 3618, 38), d = c(1, 1, 1, 2, 1, 7)), .Names = c("s", 
"r", "d"), row.names = c(NA, 6L), class = "data.frame")

花想c 2024-08-08 14:04:19

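Why do you need NA as the default value? As far as I know, a matrix is sparse only if most of its cells are zero. Since NA is a non-zero value that has to be stored explicitly, you lose all the benefits of the sparse format. If the matrix has hardly any zeros, a classic (dense) matrix is actually more efficient. A classic matrix is like a vector that is cut up according to the dimensions, so it only needs to store the data vector and the dimensions. A sparse matrix stores only the non-zero values, but it also has to store their positions. That is an advantage if and only if you have enough zero values.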
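To make the storage argument concrete, here is a minimal sketch using the data from the question. It compares the size of the sparse matrix with a dense matrix that uses NA as its default, and then shows one possible workaround: keep the values sparse and track which cells were actually observed in a separate pattern matrix. The names `D`, `obs` and `value_or_na` are made up for this illustration and are not part of the Matrix package; whether this layout suits your use case is an assumption you would need to check.

```r
library(Matrix)

# Data from the question
X <- data.frame(s = c(1089, 1109, 1109, 1109, 1109, 1109),
                r = c(3772, 190, 2460, 3071, 3618, 38),
                d = c(1, 1, 1, 2, 1, 7))

# Sparse: only the 6 stored values and their positions are kept
Y <- sparseMatrix(i = X$s, j = X$r, x = X$d)

# Dense with NA default: all 1109 * 3772 cells are stored as doubles
D <- matrix(NA_real_, nrow = max(X$s), ncol = max(X$r))
D[cbind(X$s, X$r)] <- X$d

print(object.size(Y), units = "Kb")
print(object.size(D), units = "Mb")

# Possible workaround (hypothetical helper): a pattern matrix records
# which cells were observed; everything else is reported as NA
obs <- sparseMatrix(i = X$s, j = X$r, dims = dim(Y))
value_or_na <- function(i, j) if (obs[i, j]) Y[i, j] else NA_real_

value_or_na(1109, 38)  # an observed cell
value_or_na(1, 1)      # an unobserved cell
```

The dense version has to hold about 4.2 million cells at 8 bytes each, while the sparse version holds six values plus index vectors. The pattern matrix keeps the "missing" information without giving up sparsity, at the cost of going through a helper for lookups.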
