3-day rolling correlation calculation in MATLAB
I need to calculate a 3-day rolling correlation. A sample matrix is given below. My problem is that an ID may not be in the universe every day. For example, AAPL may always be in the universe, but another company, CCL, may be in my universe for just 2 days. I would appreciate a vectorized solution. I might have to use structs, accumarray, etc. here, as the size of the correlation matrix may vary.
% col1 = tradingDates, col2 = companyID_asInts, col3 = VALUE_forCorrelation
rawdata = [ ...
734614 1 0.5;
734614 2 0.4;
734614 3 0.1;
734615 1 0.6;
734615 2 0.4;
734615 3 0.2;
734615 4 0.5;
734615 5 0.12;
734618 1 0.11;
734618 2 0.9;
734618 3 0.2;
734618 4 0.1;
734618 5 0.33;
734618 6 0.55;
734619 2 0.11;
734619 3 0.45;
734619 4 0.1;
734619 5 0.6;
734619 6 0.5;
734620 5 0.1;
734620 6 0.3] ;
Expected '3-day correlation' output:
% 734614 and 734615: no correlation is computed, since fewer than 3 days of history exist
% 734618_corr = corrcoef over IDs 1,2,3 (IDs 4,5,6 are ignored) -> 3x3 (days x IDs) input matrix
% 734619_corr = corrcoef over IDs 2,3,4,5 (IDs 1,6 are ignored) -> 3x4 (days x IDs) input matrix
% 734620_corr = corrcoef over IDs 5,6 (IDs 1,2,3,4 are ignored) -> 3x2 (days x IDs) input matrix
The real data covers the Russell 1000 universe from 1995 to 2011 and has over 4.1 million rows. The desired correlation is over a 20-day window.
Comments (1)
I wouldn't try to get a vectorized solution here: the MATLAB JIT compiler means that loops can often be just as fast on recent versions of MATLAB.
Your matrix looks a lot like a sparse matrix: does it help to convert it into that form, so that you can use array indexing? This probably only works if the data in the third column can never be 0, otherwise you'll have to keep the current explicit list and use something like this:
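As a rough illustration of a loop-based approach, and not necessarily the code this answer originally had in mind, the sketch below pivots the explicit list into a dense dates-by-IDs matrix with `accumarray`, using NaN rather than 0 to mark "not in the universe", so genuine zero values are not lost. It then slides a window over the dates and keeps only the IDs present on every day of the window. The names `winLen`, `V`, `corrByDate` and `idsByDate` are hypothetical, and the sketch assumes each (date, ID) pair appears at most once and that no real value is NaN.

```matlab
% Sample data from the question: [tradingDate, companyID, value]
rawdata = [ ...
    734614 1 0.5;  734614 2 0.4;  734614 3 0.1; ...
    734615 1 0.6;  734615 2 0.4;  734615 3 0.2; ...
    734615 4 0.5;  734615 5 0.12; ...
    734618 1 0.11; 734618 2 0.9;  734618 3 0.2; ...
    734618 4 0.1;  734618 5 0.33; 734618 6 0.55; ...
    734619 2 0.11; 734619 3 0.45; 734619 4 0.1; ...
    734619 5 0.6;  734619 6 0.5; ...
    734620 5 0.1;  734620 6 0.3];

winLen = 3;   % window length in trading days (20 for the real data)

% Map dates and IDs to consecutive row/column indices.
[dates, ~, dIdx] = unique(rawdata(:,1));   % sorted trading dates
[ids,   ~, iIdx] = unique(rawdata(:,2));   % sorted company IDs

% Pivot to a dates-by-IDs matrix; cells with no observation become NaN.
% Assumption: each (date, ID) pair occurs at most once, otherwise the
% default accumarray behaviour would sum the duplicates.
V = accumarray([dIdx, iIdx], rawdata(:,3), ...
               [numel(dates), numel(ids)], [], NaN);

nDates     = numel(dates);
corrByDate = cell(nDates, 1);   % correlation matrix per window end date
idsByDate  = cell(nDates, 1);   % IDs that each matrix refers to

for k = winLen:nDates
    W    = V(k-winLen+1:k, :);        % the winLen most recent days
    keep = ~any(isnan(W), 1);         % IDs present on every day of the window
    idsByDate{k} = ids(keep);
    if nnz(keep) >= 2                 % need at least two series to correlate
        corrByDate{k} = corrcoef(W(:, keep));
    end
end
```

The first `winLen-1` entries of `corrByDate` stay empty because no full window exists yet. An ID whose value is constant across a window yields NaN entries in that window's correlation matrix, which may need separate handling on real data.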