Generate a table of means in Stata with variables in rows and quantiles of a given variable in columns

Posted on 2024-12-11 10:02:33

... and add columns for differences and t-statistics.

I learned how to make a quantile by quantile table of means and how to add a column/row of differences here (thanks to @lejohn).

Now, instead of each row being one quantile of one variable, I would like each row to be a different variable, with each cell holding the mean of that variable over the observations that fall in the column's quantile of a given variable.

I can calculate the cell entries easily with tabstat, but I would like the variables in the rows and the quantiles in the columns (tabstat produces the transpose). I would also like the ability to difference columns (as in my first question) and calculate t-statistics for the cell differences.

I feel like the intermediate step is to reshape to long data with three columns: id (here acc_d), variable name, and variable value. But I can't figure out how to do this and I may be stuck in an R paradigm.

Here is an example of the type of table I would like to make:

[image: example table]

and here is some code with which I have been (unsuccessfully) tinkering:

* generate data
clear
set obs 2000
generate acc = rnormal()
generate r1 = rnormal()
generate sar1 = rnormal()
generate arbrisk = rnormal()

* generate deciles of acc
xtile acc_d = acc, nquantiles(10)

* form table (at least my attempts)
* w/ tabstat (but transposed and can't manipulate columns)
tabstat acc r1 sar1 arbrisk, stat(mean) by(acc_d) nototal 

* my attempts to reshape fail, but I would want something like the following to use with tabulate
* acc_d   variable    value
* 1       acc         0.01
* 1       r1          1.03
* 1       sar1        -0.03
* 1       arbrisk     0.05
* 2       acc         1.01
* 2       r1          2.03
* 2       sar1        0.03
* 2       arbrisk     1.05

Thanks!
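
For reference, the long layout described above can be sketched with reshape long. This is only a minimal sketch on the simulated data: the names obs_id, varname and the stub value_ are hypothetical choices, and the seed is arbitrary. reshape long needs a common stub and a unique observation id, so the variables are renamed first.

```stata
* simulated data as in the question (seed is an arbitrary choice)
clear
set seed 12345
set obs 2000
generate acc = rnormal()
generate r1 = rnormal()
generate sar1 = rnormal()
generate arbrisk = rnormal()
xtile acc_d = acc, nquantiles(10)

* reshape long needs a unique id and a common stub (obs_id, value_ are hypothetical names)
generate long obs_id = _n
foreach v of varlist acc r1 sar1 arbrisk {
    rename `v' value_`v'
}

* one row per observation and variable; varname holds "acc", "r1", ...
reshape long value_, i(obs_id) j(varname) string
rename value_ value

* variables in rows, deciles in columns, cell = mean of value
tabulate varname acc_d, summarize(value) nostandard nofreq
```

The result has the three columns sketched in the question (acc_d, the variable name, and the value) plus the id. Differences and t-statistics still have to be computed separately.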

Comments (2)

和我恋爱吧 2024-12-18 10:02:33

Here I would proceed a bit differently. I would first gather the information needed to compute the difference and the t statistic:

* store mean, variance and N of each variable in deciles 1 and 10
foreach v of varlist acc r1 sar1 arbrisk {
    quietly summarize `v' if acc_d == 1
    local m_`v'_1 = r(mean)
    local var_`v'_1 = r(Var)
    local n_`v'_1 = r(N)
    quietly summarize `v' if acc_d == 10
    local m_`v'_10 = r(mean)
    local var_`v'_10 = r(Var)
    local n_`v'_10 = r(N)
}

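Then I would collapse and transpose the data:

* one row per decile, then transpose so that variables are in rows
collapse (mean) acc r1 sar1 arbrisk, by(acc_d)
xpose, clear varname
drop if _varname == "acc_d"
order _varname
* xpose names the columns v1-v10; rename them after the deciles
forvalues n = 1 / 10 {
    rename v`n' acc_d`n'
}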

In a last step, I would add the difference and the t statistic:

generate diff_d10_d1 = . 
generate tstat_d10_d1 = .
* use the full variable names rather than relying on abbreviation
foreach v in acc r1 sar1 arbrisk {
    replace diff_d10_d1 = `m_`v'_10' - `m_`v'_1' if _varname == "`v'"
    replace tstat_d10_d1 = (`m_`v'_10' - `m_`v'_1') / sqrt((`var_`v'_10'/`n_`v'_10') + (`var_`v'_1'/`n_`v'_1')) if _varname == "`v'"
}

And finally print the results:

list, abb(12) noobs
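
The t statistic above is the unequal-variance form, (m10 - m1) / sqrt(var10/n10 + var1/n1). As a rough cross-check, it can be compared with ttest and its unequal option on the uncollapsed data. This is a minimal sketch on freshly simulated data: the seed and the scalar names are assumptions made for illustration, and only r1 is checked.

```stata
* simulated data (seed is an arbitrary choice); must be run before collapse
clear
set seed 12345
set obs 2000
generate acc = rnormal()
generate r1 = rnormal()
xtile acc_d = acc, nquantiles(10)

* hand-computed statistic for r1, decile 10 minus decile 1
quietly summarize r1 if acc_d == 1
scalar s_m1 = r(mean)
scalar s_v1 = r(Var)
scalar s_n1 = r(N)
quietly summarize r1 if acc_d == 10
scalar s_m10 = r(mean)
scalar s_v10 = r(Var)
scalar s_n10 = r(N)
scalar t_hand = (s_m10 - s_m1) / sqrt(s_v10/s_n10 + s_v1/s_n1)

* ttest reports mean(decile 1) - mean(decile 10), so flip the sign
quietly ttest r1 if inlist(acc_d, 1, 10), by(acc_d) unequal
scalar t_ttest = -r(t)

display "by hand: " t_hand "   ttest: " t_ttest
```

The two numbers should agree up to rounding. ttest additionally reports degrees of freedom and p-values, which the hand computation does not.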

Hope this helps.

血之狂魔 2024-12-18 10:02:33

Here is a clunkier solution that creates two tables.

* generate data
clear
set obs 2000
generate acc = rnormal()
generate r1 = rnormal()
generate sar1 = rnormal()
generate arbrisk = rnormal()

* generate quantiles
xtile acc_d = acc, nquantiles(10)

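* aggregate
* (date_y is not created by the simulated data above, so this step is
*  commented out here; it only applies to data that has a year variable)
* collapse (mean) acc r1 sar1 arbrisk, by(date_y acc_d) cw

* label variables for the table
label variable acc "Acc"
label variable r1 "R1"
label variable sar1 "SAR1"
label variable arbrisk "ArbRisk"

* main part of table (eststo, estpost and esttab come from the estout package: ssc install estout)
* with time-series data, restrict the sample by adding: if tin(1975, 2000)
eststo clear
estpost tabstat acc r1 sar1 arbrisk ///
    , stat(mean) by(acc_d) columns(statistics) listwise nototal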
esttab using tab_1a.tex ///
    , booktabs replace main(mean) nonumbers noobs ///
    label unstack nogaps not nomtitles nostar ///
    eqlabels(, prefix("Acc ")) 

* add difference t-test (decile 1 vs. decile 10), including arbrisk
estpost ttest acc r1 sar1 arbrisk if acc_d == 1 | acc_d == 10, by(acc_d)
esttab using tab_1a_ttest.tex, booktabs replace nonumbers noobs ///
    label mtitles("Acc 1-Acc10") wide ///
    varlabels(acc Acc r1 R1 sar1 SAR1 arbrisk ArbRisk)