How best to calculate correlations for observations that are not independent of each other (in Stata)?
I have the following case. My data set consists of 78 policy documents (= my observations). These were written by the governments of 50 different countries between 2005 and 2020. While 27 countries have written only one policy document, 23 countries have written several. In the latter case, the documents from the same country were usually written years apart, by different governments/administrations and different ministries. Nevertheless, I reckon there is a risk that the observations are not independent. My overarching question is, therefore: how would you calculate correlations in this case? More specifically:
- Pearson's correlation assumes that the observations are independent, so it is not suitable here, correct? Or could one credibly argue that the observations are independent after all, since they were usually published many years (and therefore governments) apart and by different ministries?
- Would "within-participants correlation" (Bland & Altman 1995 a & b) or "repeated measures correlation" (= RMCORR in R and Stata) be more suitable? Or is something else more appropriate?
- Furthermore: would I also have to take any time effects into account when running correlations in my setting?
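To make the two options above concrete, here is a minimal sketch in plain Python (not Stata, and not the RMCORR package itself). It contrasts a pooled Pearson correlation, which treats all documents as independent, with a within-country correlation computed by subtracting each country's own mean from both variables first. As far as I understand it, that centered correlation matches the point estimate of the repeated measures correlation, but any p-value would need degrees of freedom adjusted for the number of countries. The function names `pearson` and `within_country_corr` and the toy data are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation, treating every document as independent."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def within_country_corr(country, x, y):
    """Pearson correlation after removing each country's own mean from x and y.

    Countries with a single document end up with centered values of zero,
    so they contribute nothing to this estimate.
    """
    groups = defaultdict(list)
    for c, a, b in zip(country, x, y):
        groups[c].append((a, b))
    xc, yc = [], []
    for rows in groups.values():
        mx = sum(a for a, _ in rows) / len(rows)
        my = sum(b for _, b in rows) / len(rows)
        xc.extend(a - mx for a, _ in rows)
        yc.extend(b - my for _, b in rows)
    return pearson(xc, yc)


# Hypothetical toy data: three countries, one of them with a single document.
country = ["A", "A", "A", "B", "B", "C"]
x = [1, 2, 3, 11, 12, 20]
y = [3, 2, 1, 13, 12, 25]

pooled = pearson(x, y)
within = within_country_corr(country, x, y)
print("pooled Pearson:", round(pooled, 3))
print("within-country:", round(within, 3))
```

In this toy example the pooled correlation is strongly positive because it is driven by differences between countries, while the within-country correlation is negative. In my data the 27 single-document countries would drop out of a within-country estimate entirely, which is part of why I am unsure it is the right tool.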
Thank you very much for your advice!
Disclaimer: also posted on Statalist.