Creating a sparse vector in Scala
This is the functionality I am trying to achieve in Scala:
Create a list of some numbers, say (1, 2, 3, 4, 5). // this represents one document and its features
There will be n such lists with different features.
I want to put these n lists into a matrix, so that later on, if I want to perform operations on it such as matrix transpose or matrix inverse, I can do so easily.
Currently I have the lists ready, but I am not sure how to use the sparseVector and Encoder functions of Scala, since the matrix would be huge (approximately 1 million rows and 200,000 columns). So performance is also an issue.
Comments (1)
You could use a map with a default value to represent a sparse matrix:
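A minimal sketch of this idea, assuming (row, column) tuple keys, Int entries, and that each document list holds the indices of the features present in that document. The names SparseMatrixSketch, fromDocuments and transpose are illustrative and not taken from the answer:

```scala
object SparseMatrixSketch {
  // Keys are (row, column) pairs; only non-zero entries are stored.
  type SparseMatrix = Map[(Int, Int), Int]

  // An empty matrix whose missing cells read as 0.
  val empty: SparseMatrix = Map.empty[(Int, Int), Int].withDefaultValue(0)

  // Assumption: each document is a list of feature indices, stored as 1s in its row.
  def fromDocuments(docs: Seq[Seq[Int]]): SparseMatrix =
    docs.zipWithIndex.foldLeft(empty) { case (m, (features, row)) =>
      features.foldLeft(m) { (acc, col) => acc.updated((row, col), 1) }
    }

  // Transpose by swapping row and column of every stored entry.
  // map returns a plain Map, so the default value is re-attached afterwards.
  def transpose(m: SparseMatrix): SparseMatrix =
    m.map { case ((r, c), v) => ((c, r), v) }.withDefaultValue(0)
}
```

With this representation, memory use grows with the number of non-zero entries rather than with rows times columns, and a transpose only touches stored entries. Operations such as matrix inversion are not covered by this sketch.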
If only the rows need to be sparse, you can use something like a Vector of Map[Int, Int]s instead.
In general, though, if you care about memory or the performance of matrix operations, you're going to be much better off with a library that's been designed to solve this kind of problem. I've been happy with the Colt libraries in the past, but there are a number of other options, like Scalala and JScience.
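The row-sparse layout mentioned above (a Vector of Map[Int, Int]s) could look roughly like the following. This is a sketch under the assumption that each map goes from column index to value; the names RowSparseSketch, row and dot are hypothetical helpers introduced for illustration:

```scala
object RowSparseSketch {
  // One map per row: column index -> value; absent columns read as 0.
  type Row = Map[Int, Int]

  // Hypothetical helper that builds a sparse row from (column, value) pairs.
  def row(entries: (Int, Int)*): Row = Map(entries: _*).withDefaultValue(0)

  // Dot product of two sparse rows, iterating over the smaller one only.
  def dot(a: Row, b: Row): Int = {
    val (small, large) = if (a.size <= b.size) (a, b) else (b, a)
    small.iterator.map { case (col, v) => v * large.getOrElse(col, 0) }.sum
  }

  // Two example documents, indexed by row position in the Vector.
  val rows: Vector[Row] = Vector(
    row(1 -> 1, 2 -> 1, 3 -> 1),
    row(2 -> 1, 5 -> 1)
  )
}
```

Row access by index is cheap here, but column-oriented operations such as a transpose have to visit every row, which is one reason a dedicated matrix library tends to be the better fit at the sizes described in the question.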