FA:基于“简单结构标准”选择旋转矩阵
使用因子分析时最重要的问题之一是其解释。因子分析经常使用因子旋转来增强其解释。经过满意的旋转后,旋转后的因子载荷矩阵L'将具有相同的表示相关性矩阵的能力,可以代替未旋转的矩阵L<作为因子载荷矩阵。 /强>。
旋转的目的是使旋转后的因子载荷矩阵具有一些理想的性质。使用的方法之一是旋转因子载荷矩阵,使得旋转后的矩阵具有简单的结构。
LL Thurstone 介绍了简单结构原理,作为因子旋转的一般指南:
简单结构标准:
- 因子矩阵的每一行应至少包含一个零
- 如果有 m 个公共因子,则因子矩阵的每一列应至少有m 个零
- 对于因子矩阵中的每一对列,应该有几个变量的条目在一列中接近零,但在另一列中不接近零
- 对于因子矩阵中的每一对列,很大一部分变量应该有条目当有四个或更多因子时,两列都接近零
- 对于因子矩阵中的每一对列,两列中都应该只有少量具有非零条目的变量
理想的简单结构是这样的:
- 每个项目都有一个高的或有意义的负载仅在一个因素上,并且
- 每个因素仅对某些项目具有高或有意义的负载。
问题在于,尝试几种旋转方法的组合以及每种旋转方法接受的参数(尤其是倾斜的),候选矩阵的数量会增加,并且很难看出哪一种更好地满足上述标准。
当我第一次遇到这个问题时,我意识到我无法仅通过“查看”它们来选择最佳匹配,并且我需要一种算法来帮助我做出决定。在项目截止日期的压力下,我最多能做的就是在 MATLAB 中编写以下代码,该代码一次接受一个旋转矩阵,并返回(在某些假设下)是否满足每个条件。 新版本(如果我尝试升级它)将接受 3d 矩阵(一组 2d 矩阵)作为参数,并且算法应该返回更符合上述标准的矩阵。
我只是询问您的意见(我也认为对该方法本身的有用性存在批评)以及也许更好的解决旋转矩阵选择问题的方法。如果有人想提供一些代码,我更喜欢 R 或 MATLAB。
PS 以上简单结构标准制定可以在《Making Sense of》一书中找到因子分析”,作者:PETT, M.、LACKEY, N.、SULLIVAN, J.
PS2(来自同一本书):“成功因子分析的检验是它能够在多大程度上再现原始结果corr 矩阵。如果您还使用了倾斜解,请在所有解中选择生成最多数量的最高和最低因子载荷的解。” 这听起来像是该算法可以使用的另一个约束。
function [] = simple_structure_criteria (my_pattern_table)
%Simple Structure Criteria
%Making Sense of Factor Analysis, page 132
disp(' ');
disp('Simple Structure Criteria (Thurstone):');
disp('1. Each row of the factor matrix should contain at least one zero');
disp( '2. If there are m common factors, each column of the factor matrix should have at least m zeros');
disp( '3. For every pair of columns in the factor matrix, there should be several variables for which entries approach zero in the one column but not in the other');
disp( '4. For every pair of columns in the factor matrix, a large proportion of the variables should have entries approaching zero in both columns when there are four or more factors');
disp( '5. For every pair of columns in the factor matrix, there should be only a small number of variables with nonzero entries in both columns');
disp(' ');
disp( '(additional by Pedhazur and Schmelkin) The ideal simple structure is such that:');
disp( '6. Each item has a high, or meaningful, loading on one factor only and');
disp( '7. Each factor have high, or meaningful, loadings for only some of the items.');
disp('')
disp('Start checking...')
%test matrix
%ct=[76,78,16,7;19,29,10,13;2,6,7,8];
%test it by giving: simple_structure_criteria (ct)
ct=abs(my_pattern_table);
items=size(ct,1);
factors=size(ct,2);
my_zero = 0.1;
approach_zero = 0.2;
several = floor(items / 3);
small_number = ceil(items / 4);
large_proportion = 0.30;
meaningful = 0.4;
some_bottom = 2;
some_top = floor(items / 2);
% CRITERION 1
disp(' ');
disp('CRITERION 1');
for i = 1 : 1 : items
count = 0;
for j = 1 : 1 : factors
if (ct(i,j) < my_zero)
count = count + 1;
break
end
end
if (count == 0)
disp(['Criterion 1 is NOT MET for item ' num2str(i)])
end
end
% CRITERION 2
disp(' ');
disp('CRITERION 2');
for j = 1 : 1 : factors
m=0;
for i = 1 : 1 : items
if (ct(i,j) < my_zero)
m = m + 1;
end
end
if (m < factors)
disp(['Criterion 2 is NOT MET for factor ' num2str(j) '. m = ' num2str(m)]);
end
end
% CRITERION 3
disp(' ');
disp('CRITERION 3');
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_several = 0;
for i = 1 : 1 : items
if ( (ct(i,c1)>my_zero && ct(i,c2)<my_zero) || (ct(i,c1)<my_zero && ct(i,c2)>my_zero) ) % approach zero in one but not in the other
test_several = test_several + 1;
end
end
disp(['several = ' num2str(test_several) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
if (test_several < several)
disp(['Criterion 3 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
end
end
end
% CRITERION 4
disp(' ');
disp('CRITERION 4');
if (factors > 3)
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_several = 0;
for i = 1 : 1 : items
if (ct(i,c1)<approach_zero && ct(i,c2)<approach_zero) % approach zero in both
test_several = test_several + 1;
end
end
disp(['large proportion = ' num2str((test_several / items)*100) '% for factors ' num2str(c1) ' and ' num2str(c2)]);
if ((test_several / items) < large_proportion)
pr = sprintf('%4.2g', (test_several / items) * 100 );
disp(['Criterion 4 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2) '. Proportion is ' pr '%']);
end
end
end
end
% CRITERION 5
disp(' ');
disp('CRITERION 5');
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_number = 0;
for i = 1 : 1 : items
if (ct(i,c1)>approach_zero && ct(i,c2)>approach_zero) % approach zero in both
test_number = test_number + 1;
end
end
disp(['small number = ' num2str(test_number) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
if (test_number > small_number)
disp(['Criterion 5 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
end
end
end
% CRITERION 6
disp(' ');
disp('CRITERION 6');
for i = 1 : 1 : items
count = 0;
for j = 1 : 1 : factors
if (ct(i,j) > meaningful)
count = count + 1;
end
end
if (count == 0 || count > 1)
disp(['Criterion 6 is NOT MET for item ' num2str(i)])
end
end
% CRITERION 7
disp(' ');
disp('CRITERION 7');
for j = 1 : 1 : factors
m=0;
for i = 1 : 1 : items
if (ct(i,j) > meaningful)
m = m + 1;
end
end
disp(['some items = ' num2str(m) ' for factor ' num2str(j)]);
if (m < some_bottom || m > some_top)
disp(['Criterion 7 is NOT MET for factor ' num2str(j)]);
end
end
disp('')
disp('Checking completed.')
return
One of the most important issues in using factor analysis is its interpretation. Factor analysis often uses factor rotation to enhance its interpretation. After a satisfactory rotation, the rotated factor loading matrix L' will have the same ability to represent the correlation matrix and it can be used as the factor loading matrix, instead of the unrotated matrix L.
The purpose of rotation is to make the rotated factor loading matrix have some desirable properties. One of the methods used is to rotate the factor loading matrix such that the rotated matrix will have a simple structure.
L. L. Thurstone introduced the Principle of Simple Structure, as a general guide for factor rotation:
Simple Structure Criteria:
- Each row of the factor matrix should contain at least one zero
- If there are m common factors, each column of the factor matrix should have at least m zeros
- For every pair of columns in the factor matrix, there should be several variables for which entries approach zero in the one column but not in the other
- For every pair of columns in the factor matrix, a large proportion of the variables should have entries approaching zero in both columns when there are four or more factors
- For every pair of columns in the factor matrix, there should be only a small number of variables with nonzero entries in both columns
The ideal simple structure is such that:
- each item has a high, or meaningful, loading on one factor only and
- each factor have high, or meaningful, loadings for only some of the items.
The problem is that, trying several combinations of rotation methods along with the parameters that each one accepts (especially for oblique ones), the number of candidate matrices increases and it is very difficult to see which one better meets the above criteria.
When I first faced that problem I realized that I was unable to select the best match by merely 'looking' at them, and that I needed an algorithm to help me decide. Under the stress of project's deadlines, the most I could do was to write the following code in MATLAB, which accepts one rotation matrix at a time and returns (under some assumptions) whether each criterion is met or not.
A new version (If I would ever tried to upgrade it) would accept a 3d matrix (a set of 2d matrices) as an argument, and the algorithm should return the one that better fits the above criteria.
I am just asking for your opinions (I also think that there's been criticism over the usefulness of the method by itself) and perhaps better approaches to the rotation matrix selection problem. If someone wants to provide some code, I would prefer R or MATLAB.
P.S. The above Simple Structure Criteria formulation can be found in the book "Making Sense of Factor Analysis" by PETT, M., LACKEY, N., SULLIVAN, J.
P.S.2 (from the same book): "A test of successful factor analysis is the extent to which it can reproduce the original corr matrix. If you also used oblique solutions, among all select the one that generated the greatest number of highest and lowest factor loadings."
This sounds like another constraint that the algorithm could use.
function [] = simple_structure_criteria (my_pattern_table)
%Simple Structure Criteria
%Making Sense of Factor Analysis, page 132
disp(' ');
disp('Simple Structure Criteria (Thurstone):');
disp('1. Each row of the factor matrix should contain at least one zero');
disp( '2. If there are m common factors, each column of the factor matrix should have at least m zeros');
disp( '3. For every pair of columns in the factor matrix, there should be several variables for which entries approach zero in the one column but not in the other');
disp( '4. For every pair of columns in the factor matrix, a large proportion of the variables should have entries approaching zero in both columns when there are four or more factors');
disp( '5. For every pair of columns in the factor matrix, there should be only a small number of variables with nonzero entries in both columns');
disp(' ');
disp( '(additional by Pedhazur and Schmelkin) The ideal simple structure is such that:');
disp( '6. Each item has a high, or meaningful, loading on one factor only and');
disp( '7. Each factor have high, or meaningful, loadings for only some of the items.');
disp('')
disp('Start checking...')
%test matrix
%ct=[76,78,16,7;19,29,10,13;2,6,7,8];
%test it by giving: simple_structure_criteria (ct)
ct=abs(my_pattern_table);
items=size(ct,1);
factors=size(ct,2);
my_zero = 0.1;
approach_zero = 0.2;
several = floor(items / 3);
small_number = ceil(items / 4);
large_proportion = 0.30;
meaningful = 0.4;
some_bottom = 2;
some_top = floor(items / 2);
% CRITERION 1
disp(' ');
disp('CRITERION 1');
for i = 1 : 1 : items
count = 0;
for j = 1 : 1 : factors
if (ct(i,j) < my_zero)
count = count + 1;
break
end
end
if (count == 0)
disp(['Criterion 1 is NOT MET for item ' num2str(i)])
end
end
% CRITERION 2
disp(' ');
disp('CRITERION 2');
for j = 1 : 1 : factors
m=0;
for i = 1 : 1 : items
if (ct(i,j) < my_zero)
m = m + 1;
end
end
if (m < factors)
disp(['Criterion 2 is NOT MET for factor ' num2str(j) '. m = ' num2str(m)]);
end
end
% CRITERION 3
disp(' ');
disp('CRITERION 3');
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_several = 0;
for i = 1 : 1 : items
if ( (ct(i,c1)>my_zero && ct(i,c2)<my_zero) || (ct(i,c1)<my_zero && ct(i,c2)>my_zero) ) % approach zero in one but not in the other
test_several = test_several + 1;
end
end
disp(['several = ' num2str(test_several) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
if (test_several < several)
disp(['Criterion 3 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
end
end
end
% CRITERION 4
disp(' ');
disp('CRITERION 4');
if (factors > 3)
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_several = 0;
for i = 1 : 1 : items
if (ct(i,c1)<approach_zero && ct(i,c2)<approach_zero) % approach zero in both
test_several = test_several + 1;
end
end
disp(['large proportion = ' num2str((test_several / items)*100) '% for factors ' num2str(c1) ' and ' num2str(c2)]);
if ((test_several / items) < large_proportion)
pr = sprintf('%4.2g', (test_several / items) * 100 );
disp(['Criterion 4 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2) '. Proportion is ' pr '%']);
end
end
end
end
% CRITERION 5
disp(' ');
disp('CRITERION 5');
for c1 = 1 : 1 : factors - 1
for c2 = c1 + 1 : 1 : factors
test_number = 0;
for i = 1 : 1 : items
if (ct(i,c1)>approach_zero && ct(i,c2)>approach_zero) % approach zero in both
test_number = test_number + 1;
end
end
disp(['small number = ' num2str(test_number) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
if (test_number > small_number)
disp(['Criterion 5 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
end
end
end
% CRITERION 6
disp(' ');
disp('CRITERION 6');
for i = 1 : 1 : items
count = 0;
for j = 1 : 1 : factors
if (ct(i,j) > meaningful)
count = count + 1;
end
end
if (count == 0 || count > 1)
disp(['Criterion 6 is NOT MET for item ' num2str(i)])
end
end
% CRITERION 7
disp(' ');
disp('CRITERION 7');
for j = 1 : 1 : factors
m=0;
for i = 1 : 1 : items
if (ct(i,j) > meaningful)
m = m + 1;
end
end
disp(['some items = ' num2str(m) ' for factor ' num2str(j)]);
if (m < some_bottom || m > some_top)
disp(['Criterion 7 is NOT MET for factor ' num2str(j)]);
end
end
disp('')
disp('Checking completed.')
return
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我知道这不是您要问的,但即使在其他情况下您也可能会发现这很有用:
MATLAB 仅应在确实不可避免时才使用循环。例如,你的代码
应该写成
I know this is not what you're asking, but you might find this useful even in other cases:
MATLAB should use loops only when really unavoidable. for example, your code
Should be written as