我有大的1D数组 A
和 b
,以及将它们分为子阵列的Pointers i
的数组。我的 a
和 b
几乎不适合RAM,并且具有不同的dtypes(一个包含 uint32
s,另一个 Rational> Rational {Int64} < /code> s),所以我不想加入2D数组,以避免更改dtypes。
对于i [2:end] 中的每个 i与相应的子阵列 b [i [i-1],i [i] -1]
相同。我对此的尝试是:
function sort!(a,b)
p=sortperm(a);
a[:], b[:] = a[p], b[p]
end
Threads.@threads for i in I[2:end]
sort!( a[I[i-1], I[i]-1], b[I[i-1], I[i]-1] )
end
但是,在一个小例子上,我看到 sort!
不会改变子阵列的视图:
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a,b); println(a,"\n",b) # works like it should
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a[1:5],b[1:5]); println(a,"\n",b) # does nothing!!!
关于如何创建此类函数排序的任何帮助!(尽可能高效)。
背景:我正在处理来自稀疏阵列的数据:
using SparseArrays
n=10^6; x=sprand(n,n,1000/n); #random matrix with 1000 entries per column on average
x = SparseMatrixCSC(n,n,x.colptr,x.rowval,rand(-99:99,nnz(x)).//1); #chnging entries to rationals
U = randperm(n) #permutation of rows of matrix x
a, b, I = U[x.rowval], x.nzval, x.colptr;
因此,这些 a,b,i
是我发布的问题的好示例。我要做的是对每列中条目的行索引(以及对应的矩阵值)进行排序。
注意:我已经在朱莉娅话语上问了这个问题在这里,但没有收到任何答复或评论。如果我可以提高问题的质量,请随时告诉我。
I have large 1D arrays a
and b
, and an array of pointers I
that separates them into subarrays. My a
and b
barely fit into RAM and are of different dtypes (one contains UInt32
s, the other Rational{Int64}
s), so I don’t want to join them into a 2D array, to avoid changing dtypes.
For each i in I[2:end]
, I wish to sort the subarray a[I[i-1],I[i]-1]
and apply the same permutation to the corresponding subarray b[I[i-1],I[i]-1]
. My attempt at this is:
function sort!(a,b)
p=sortperm(a);
a[:], b[:] = a[p], b[p]
end
Threads.@threads for i in I[2:end]
sort!( a[I[i-1], I[i]-1], b[I[i-1], I[i]-1] )
end
However, already on a small example, I see that sort!
does not alter the view of a subarray:
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a,b); println(a,"\n",b) # works like it should
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a[1:5],b[1:5]); println(a,"\n",b) # does nothing!!!
Any help on how to create such function sort!
(as efficient as possible) are welcome.
Background: I am dealing with data coming from sparse arrays:
using SparseArrays
n=10^6; x=sprand(n,n,1000/n); #random matrix with 1000 entries per column on average
x = SparseMatrixCSC(n,n,x.colptr,x.rowval,rand(-99:99,nnz(x)).//1); #chnging entries to rationals
U = randperm(n) #permutation of rows of matrix x
a, b, I = U[x.rowval], x.nzval, x.colptr;
Thus these a,b,I
serve as good examples to my posted problem. What I am trying to do is sort the row indices (and corresponding matrix values) of entries in each column.
Note: I already asked this question on Julia discourse here, but received no replies nor comments. If I can improve on the quality of the question, don't hesitate to tell me.
发布评论
评论(2)
问题在于
a [1:5]
不是视图,而只是副本。相反,将视图视为您正在寻找的
PS。
@view a [2:3]
,@view(a [2:3])
或@views
宏可以帮助使THINS更可读。The problem is that
a[1:5]
is not a view, it's just a copy. instead make the view likeis what you are looking for
ps.
the
@view a[2:3]
,@view(a[2:3])
or the@views
macro can help making thins more readable.首先,您不应该重新定义
base.sort!
这样。现在,sort!
Will Shadowbase.sort!
,如果您调用sort!(a)
,您将获得错误。另外,
a [i [i-1],i [i] -1]
和b [i [i-1],i [i] -1]
是不是切片,它们只是单个元素,因此,如果您是否使用views
对它们进行排序,则什么都不会发生。并以移动窗口的方式对数组进行排序是不正确的。您想在这里做的事情,因为您的向量很大,请呼叫
在循环中反复,选择
根据block_size
适合内存,并根据b
bp
进行修改,然后继续到<代码> a 并找到下一个p
,然后重复直到要对a
中的任何内容保留为止。我将把实施作为您的练习,但是如果您陷入某件事,您可以回来。First of all, you shouldn't redefine
Base.sort!
like this. Now,sort!
will shadowBase.sort!
and you'll get errors if you callsort!(a)
.Also,
a[I[i-1], I[i]-1]
andb[I[i-1], I[i]-1]
are not slices, they are just single elements, so nothing should happen if you sort them either withviews
or not. And sorting arrays in a moving-window way like this is not correct.What you want to do here, since your vectors are huge, is call
p = partialsortperm(a[i:end], i:i+block_size-1)
repeatedly in a loop, choosing ablock_size
that fits into memory, and modify botha
andb
according top
, then continue to the remaining part ofa
and find nextp
and repeat until nothing remains ina
to be sorted. I'll leave the implementation as an exercise for you, but you can come back if you get stuck on something.