How to make Stata report zeros in a tabulation

Posted 2024-10-17 13:14:42

I'm trying to use the tabulate command in Stata to create a time series of frequencies. The problem arises when I try to combine the tabulate output from each date. tabulate does not include a 0 entry when no observation exists for a value of the variable in question. For instance, if I wanted to count the 10-, 11- and 12-year-olds in a class over a three-year period, Stata might output (8) if only one of the groups were represented, so we would not know which group the 8 students belonged to: it could be (8,0,0), (0,8,0) or (0,0,8).

This is not a problem if the time series is short, since the "Results" window shows which categories are or are not represented, but my data cover a much longer time series. Does anyone know of a solution or method that forces Stata to include zeros in these tabulations? The relevant parts of my code follow:

# delimit;
set more off;
clear;
matrix drop _all;
set mem 1200m;
cd ;
global InputFile "/Users/.../1973-2010.dta";
global OutputFile "/Users/.../results.txt";

use $InputFile;
compress;

log using "/Users/.../log.txt", append;

gen yr_mn = ym(year(datadate), month(datadate));
la var yr_mn "Year-Month Date";

xtset, clear;
xtset id datadate, monthly;

/*Converting the Ratings Scale to Numeric*/;
gen LT_num = .;
replace LT_num = 1 if splticrm=="AAA";
replace LT_num = 2 if (splticrm=="AA"||splticrm=="AA+"||splticrm=="AA-");
replace LT_num = 3 if (splticrm=="A"||splticrm=="A+"||splticrm=="A-");
replace LT_num = 4 if (splticrm=="BBB"||splticrm=="BBB+"||splticrm=="BBB-");
replace LT_num = 5 if (splticrm=="BB"||splticrm=="BB+"||splticrm=="BB-");
replace LT_num = 6 if (splticrm=="B"||splticrm=="B+"||splticrm=="B-");
replace LT_num = 7 if (splticrm=="CCC"||splticrm=="CCC+"||splticrm=="CCC-");
replace LT_num = 8 if (splticrm=="CC");
replace LT_num = 9 if (splticrm=="SD");
replace LT_num = 10 if (splticrm=="D");

summarize(yr_mn);
local start = r(min);
local finish = r(max);

forv x = `start'/`finish' {;
    qui tab LT_num if yr_mn == `x', matcell(freq_`x');
};

log close;

Comments (3)

橙幽之幻 2024-10-24 13:14:42

What you want is not an option with the tab command. If you want to display the results to the screen, you might be able to use table ..., missing successfully.

Instead of the loop, you could try the following, which I think will work for your purposes:

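preserve
drop if missing(LT_num)  // reshape wide does not allow missing values in j()
gen n = 1  // (n could be a variable that indicates if you want to include the row or not; or just something that never ==.)
collapse (count) n , by(LT_num yr_mn)
reshape wide n, i(yr_mn) j(LT_num)
foreach v of varlist n* {
    replace `v' = 0 if missing(`v')  // rating/month pairs with no observations
}
mkmat _all , matrix(mymatname)
restore
mat list mymatname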

I think that is what you're going after (but can't tell how you use the matrices you are trying to generate).
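A related route, not part of the original answer, is the official contract command, whose zero option keeps combinations with frequency zero. The sketch below uses the auto dataset that ships with Stata as a stand-in for the rating data (rep78 plays the role of LT_num and foreign the role of yr_mn); the variable name n is chosen only for illustration.

```stata
sysuse auto, clear

* One row per foreign x rep78 combination; zero keeps the empty cells,
* nomiss drops observations where either variable is missing
contract foreign rep78, zero nomiss freq(n)
list, sepby(foreign)

* One row per group, one column per category (n1 ... n5)
reshape wide n, i(foreign) j(rep78)
list
```

Foreign cars have no rep78 values of 1 or 2, so those cells are listed with a frequency of 0 instead of being dropped. Note that zero only fills in values that occur somewhere in the data: a category that never appears in any group still gets no column.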

P.S. I prefer to use the inlist function for things like:

replace LT_num = 2 if inlist(splticrm,"AA","AA+","AA-")
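As a fuller sketch of the same idea, the whole recode from the question can be written with inlist. The toy data below are entered only so the block runs on its own, and the "NR" value is a hypothetical label included to show that unmatched ratings stay missing.

```stata
clear
input str4 splticrm
"AAA"
"AA+"
"BBB-"
"CCC"
"D"
"NR"
end

gen byte LT_num = .
replace LT_num = 1  if splticrm == "AAA"
replace LT_num = 2  if inlist(splticrm, "AA", "AA+", "AA-")
replace LT_num = 3  if inlist(splticrm, "A", "A+", "A-")
replace LT_num = 4  if inlist(splticrm, "BBB", "BBB+", "BBB-")
replace LT_num = 5  if inlist(splticrm, "BB", "BB+", "BB-")
replace LT_num = 6  if inlist(splticrm, "B", "B+", "B-")
replace LT_num = 7  if inlist(splticrm, "CCC", "CCC+", "CCC-")
replace LT_num = 8  if splticrm == "CC"
replace LT_num = 9  if splticrm == "SD"
replace LT_num = 10 if splticrm == "D"

* "NR" matches none of the conditions, so its LT_num remains missing
list
```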

甜嗑 2024-10-24 13:14:42

This problem is addressed by tabcount. See the 2003 paper

http://www.stata-journal.com/article.html?article=pr0011

and download the program code and help files after getting a link by search tabcount.

执笏见 2024-10-24 13:14:42

This is the solution that I used. Keith's is probably better, and I will explore his solution in the future.

I saved the row labels (using matrow) in a vector and used it as an index for a matrix of the correct dimensions initialized to zero. That way I could place each frequency into the matrix at the correct place, and keep all of the zeros. The solution follows the above code after "local finish=r(max)". [note that I include a counter to eliminate the first observations which are empty for this variable.]

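/* Bounds come from the summarize step in the question */;
local first = `start';
local last = `finish';
local counter = 0;
forv x = `first'/`last' {;
    qui tab LT_num if yr_mn == `x', matrow(index_`x') matcell(freq_`x');
    local rows = r(r); /* r(r) is the number of rows tabulate produced */;

    if `rows' != 0 {;
        /* One row per rating category, all zeros to begin with */;
        matrix A_`x' = J(10,1,0);
        forv r = 1/`rows' {;
            local a = index_`x'[`r',1];
            matrix A_`x'[`a',1] = freq_`x'[`r',1];
        };
    };
    else {;
        local counter = `counter' + 1;
    };
};

/* Skip the leading empty months (this assumes empty months occur only
   at the start of the sample), then append one column per month */;
local begin = `first' + `counter';
matrix FREQ = A_`begin';
local next = `begin' + 1;
forv i = `next'/`last' {;
    matrix FREQ = (FREQ, A_`i');
};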
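
The indexing idea can be tried in isolation with a minimal sketch on the auto dataset that ships with Stata. It is only an illustration, not the code used above: rep78 stands in for LT_num, and the matrix names vals, freq and full are made up for the example.

```stata
sysuse auto, clear

* rep78 ranges over 1-5, but foreign cars have no 1s or 2s,
* so matcell() alone returns only three rows here
quietly tabulate rep78 if foreign == 1, matrow(vals) matcell(freq)
local rows = r(r)

* Start from a zero vector with one row per possible category,
* then place each observed frequency at the row given by its value
matrix full = J(5, 1, 0)
forvalues r = 1/`rows' {
    local k = vals[`r', 1]
    matrix full[`k', 1] = freq[`r', 1]
}
matrix list full
```

This works because the category values are the integers 1 to 5 and can be used directly as row numbers; categories coded differently would need a lookup from value to row.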