Extremely large weighted average
I am using 64-bit MATLAB with 32 GB of RAM (just so you know).
I have a file (vector) of 1.3 million numbers (integers). I want to make another vector of the same length, where each point is a weighted average of the entire first vector, weighted by the inverse distance from that position (actually it's position^-0.1, not ^-1, but for example purposes). I can't use MATLAB's 'filter' function, because it can only average things before the current point, right? To explain more clearly, here's an example with 3 elements:
data = [ 2 6 9 ]
weights = [ 1 1/2 1/3; 1/2 1 1/2; 1/3 1/2 1 ]
results=data*weights= [ 8 11.5 12.666 ]
i.e.
8 = 2*1 + 6*1/2 + 9*1/3
11.5 = 2*1/2 + 6*1 + 9*1/2
12.666 = 2*1/3 + 6*1/2 + 9*1
So each point in the new vector is the weighted average of the entire first vector, weighting by 1/(distance from that position+1).
I could just remake the weight vector for each point, then calculate the results vector element by element, but this requires 1.3 million iterations of a for loop, each of which contains 1.3million multiplications. I would rather use straight matrix multiplication, multiplying a 1x1.3mil by a 1.3milx1.3mil, which works in theory, but I can't load a matrix that large.
I am then trying to make the matrix using a shell script and index it in matlab so only the relevant column of the matrix is called at a time, but that is also taking a very long time.
I don't have to do this in MATLAB, so any advice people have about working with such large numbers and getting averages would be appreciated. Since I am using a weight of ^-0.1, and not ^-1, it does not drop off that fast - the millionth point is still weighted at 0.25 compared to the original point's weighting of 1, so I can't just cut it off as it gets big either.
Hope this was clear enough?
Here is the code for the answer below (so it can be formatted?):
data = load('/Users/mmanary/Documents/test/insertion.txt');
data = data.';                                  % work with a row vector
total = length(data);
x = 1:total;
datapad = [zeros(1,total) data];                % zero-pad so the circular convolution does not wrap around
weights = ([(total+1):-1:2 1:total]).^(-.4);    % (distance+1)^-0.4 for every signed offset, length 2*total
weights = weights/sum(weights);                 % normalise so each result is a weighted mean
Fdata = fft(datapad);
Fweights = fft(weights);
Fresults = Fdata .* Fweights;                   % multiplication in the frequency domain = circular convolution
results = ifft(Fresults);
results = results(1:total);                     % keep only the part aligned with the original data
plot(x,results)
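As a quick sanity check (this snippet is mine, not part of the original post), the same FFT-convolution trick reproduces the 3-element example above when the exponent is -1 and the weights are left unnormalised:

data    = [2 6 9];
total   = length(data);
datapad = [zeros(1,total) data];                % zero-pad so the circular convolution does not wrap
weights = ([(total+1):-1:2 1:total]).^(-1);     % 1/(distance+1), unnormalised to match the example
check   = real(ifft(fft(datapad) .* fft(weights)));
check   = check(1:total)                        % -> [8.0000 11.5000 12.6667]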
The only sensible way to do this is with FFT convolution, which underpins the filter function and similar. It is very easy to do manually; the FFT code appended to the question is an implementation of exactly this approach, and it takes less than half a second for n = 10^6!
This is probably not the best way to do it, but with lots of memory you could definitely parallelize the process.
You can construct sparse matrices consisting of the entries of your original weight matrix which have value i^(-1) (where i = 1 .. 1.3 million), multiply them with your original vector, and sum all the results together; for your example the product is essentially the sum of per-distance sparse matrices illustrated in the sketch below. Of course, you wouldn't construct the sparse matrices this way. If you wanted fewer iterations of the inside loop, you could have more than one of the i's in each matrix. Look into the parfor loop in MATLAB: http://www.mathworks.com/help/toolbox/distcomp/parfor.html
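For illustration, here is a minimal sketch (mine, not the answerer's code) of that decomposition on the question's 3-element example, with one sparse matrix per distance k built with spdiags and the example's 1/(k+1) weights:

data = [2 6 9];
n = numel(data);
results = full(data * speye(n));                 % distance 0: weight 1 on the main diagonal
for k = 1:n-1                                    % k = distance from the diagonal
    w  = 1/(k+1);                                % the example's weight for that distance
    Sk = spdiags(w*ones(n,2), [-k k], n, n);     % w on the +k and -k diagonals only
    results = results + full(data * Sk);
end
% results -> [8 11.5 12.6667], matching the question's example

For the real problem the weight would be (k+1)^-0.1, and each sparse matrix could hold several distances at once, as the answer suggests, so that parfor has fewer but larger pieces to hand out.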
That is not correct. You can always add or remove samples (i.e., pad with or strip zeros) from your data or from the filtered data. Since filtering with filter (you can also use conv, by the way) is a linear operation, it won't change the result; it's like adding and removing zeros, which does nothing, and then filtering, and linearity then lets you swap the order to add samples -> filter -> remove samples. Anyway, in your example you can take the averaging kernel to be the 1/(distance+1) weights and then simply convolve it with the data, as in the sketch below.
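A minimal sketch of that kernel-plus-conv idea on the question's 3-element example (the variable names and the 1/(distance+1) kernel are my reading of the answer, not its exact code):

data    = [2 6 9];
d       = -(numel(data)-1):(numel(data)-1);      % distances -2..2
kernel  = 1 ./ (abs(d) + 1);                     % [1/3 1/2 1 1/2 1/3]
results = conv(data, kernel, 'same')             % -> [8 11.5 12.6667]

For the full 1.3-million-point kernel a direct conv like this is O(n^2), so in practice it would be combined with the FFT approach shown earlier.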
You considered only two options: multiplying the 1.3M-by-1.3M weight matrix with a vector once, or multiplying two 1.3M-element vectors 1.3M times.
But you can divide your weight matrix into as many sub-matrices as you wish and multiply an n-by-1.3M matrix with the vector 1.3M/n times.
I assume the fastest case is the one with the smallest number of iterations, where n creates the largest sub-matrix that fits in your memory without making your computer start swapping pages to the hard drive.
With your memory size you should start with n = 5000.
You can also make it faster by using parfor (with n divided by the number of processors). A sketch of the blocked loop follows.
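A rough sketch of that blocked loop (entirely my own illustration; the block size, the bsxfun distance computation, and the small demo sizes are assumptions, and for the real 1.3M-point vector n has to be chosen so one n-by-1.3M block fits in memory):

data  = rand(1, 100);                            % small stand-in for the 1.3M-point vector
total = numel(data);
n     = 16;                                      % rows of the weight matrix built per iteration
results = zeros(1, total);
for start = 1:n:total                            % the for can become parfor to spread blocks over workers
    idx = start:min(start+n-1, total);           % rows handled in this block
    W   = (abs(bsxfun(@minus, idx(:), 1:total)) + 1).^(-0.1);   % block of the weight matrix
    results(idx) = W * data(:);                  % (block rows x total) * (total x 1)
end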
The brute force way will probably work for you, with one minor optimisation in the mix.
The ^-0.1 operations to create the weights will take a lot longer than the + and * operations to compute the weighted means, but you can re-use the weights across all of the million weighted-mean operations. The algorithm becomes (a sketch follows these steps):
Create a weightings vector with all the weights any computation could need, e.g. weights = (abs(-n:n)+1).^-0.1 (the abs and +1 keep the bases positive, matching the question's 1/(distance+1) form).
For each element in the vector:
Index the relevant portion of the weights vector so that the current element is treated as the 'centre'.
Perform the weighted mean with that weights portion and the entire vector. This can be done with a fast vector dot-multiply followed by a scalar division.
The main loop does n^2 multiplications and n^2 additions. With n equal to 1.3 million that's about 3.4 trillion operations. A single core of a modern 3 GHz CPU can do, say, 6 billion additions/multiplications a second, so that comes out to around 10 minutes. Add time for indexing the weights vector and for overheads, and I still estimate you could come in under half an hour.
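A sketch of that loop (the variable names and the small stand-in data are mine; the weights follow the question's 1/(distance+1)^0.1 form and each result is normalised by the weight sum, per the scalar-division step):

data = rand(1, 1000);                            % small stand-in for the 1.3M-point vector
n    = numel(data);
w    = (abs(-(n-1):(n-1)) + 1).^(-0.1);          % all 2n-1 weights any element can need
results = zeros(1, n);
for k = 1:n
    wk = w(n-k+1 : 2*n-k);                       % slice so that element k sits at the centre
    results(k) = (data * wk.') / sum(wk);        % dot-multiply, then scalar division
end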