How to rank observations within groups in Stata?

Posted 2024-11-07 15:33:34

I have some data in Stata that look like the first two columns of:

group_id   var_to_rank  desired_rank
____________________________________

1           10          1
1           20          2
1           30          3
1           40          4
2           10          1
2           20          2
2           20          2
2           30          3

I'd like to rank each observation within its group (group_id) according to one variable (var_to_rank). Usually, for this purpose, I have used:

gen id = _n

However, some of my observations (group_id = 2 in my small example) share the same value of the ranking variable, and this approach doesn't work for them: tied observations should receive the same rank.

I have also tried the

egen rank

command with different options, but cannot make my rank variable look like desired_rank.

Could you point me to a solution to this problem?


Comments (6)

寄离 2024-11-14 15:33:34

The following works for me:

bysort group_id: egen desired_rank=rank(var_to_rank)
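For anyone comparing the egen rank() options mentioned in the question, here is a minimal sketch on the example data. The variable names r_default, r_track, r_field and r_unique are made up for illustration. With ties present, the default assigns the mean rank (1, 2.5, 2.5, 4 in group 2) and track assigns 1, 2, 2, 4, so neither reproduces the 1, 2, 2, 3 pattern of desired_rank.

```stata
clear
input group_id var_to_rank
1 10
1 20
1 30
1 40
2 10
2 20
2 20
2 30
end

* default: tied values share the mean of the ranks they span
bysort group_id: egen r_default = rank(var_to_rank)
* track: ties share the lowest rank; the next distinct value skips ahead
bysort group_id: egen r_track = rank(var_to_rank), track
* field: like track, but the highest value gets rank 1
bysort group_id: egen r_field = rank(var_to_rank), field
* unique: ties are broken arbitrarily so that every rank is distinct
bysort group_id: egen r_unique = rank(var_to_rank), unique

list, sepby(group_id)
```

Comparing the listed columns against desired_rank shows why a different approach is needed when tied values should not leave gaps in the numbering.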

生死何惧 2024-11-14 15:33:34

I'd say this question is posed the wrong way round for best understanding. The aim is to group observations: those with the lowest value are all assigned grade 1, those with the next lowest are all assigned 2, and so forth. This isn't ranking in most senses that I have seen discussed, but Stata's egen, rank() does get you part of the way.

But the direct way, which was mentioned in the Statalist thread cited elsewhere in this thread (start here), is simpler in spirit than any solution quoted:

bysort group_id (var_to_rank): gen desired_rank = sum(var_to_rank != var_to_rank[_n-1])

Once the data are sorted on var_to_rank within each group, the expression var_to_rank != var_to_rank[_n-1] evaluates to 1 at the start of each block of identical values, where the value differs from the previous one, and to 0 otherwise. In the first observation of each group, var_to_rank[_n-1] is missing, so the comparison also yields 1 there (provided var_to_rank itself is not missing). Summing those 1s and 0s cumulatively gives the desired variable. The prefix command bysort does the sorting required and ensures that this is all done separately within the groups defined by group_id. No need for egen at all (a command that many people who only use Stata occasionally often find bizarre).

Declaration of interest: The Statalist thread cited shows that when asked a similar question I too did not think of this solution straight away.
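To make the cumulative-sum logic concrete, here is a small sketch that splits the one-liner into two steps on the example data from the question. The helper variable is_new is a made-up name used only for illustration.

```stata
clear
input group_id var_to_rank
1 10
1 20
1 30
1 40
2 10
2 20
2 20
2 30
end

* is_new is 1 where a value differs from the previous one within the group
* (in the first observation of a group the previous value is missing, so it is 1)
bysort group_id (var_to_rank): gen is_new = var_to_rank != var_to_rank[_n-1]

* running total of the flags, restarted in each group
by group_id: gen desired_rank = sum(is_new)

list, sepby(group_id)
```

In group 2 the flags are 1, 1, 0, 1, so the running total is 1, 2, 2, 3: the two tied observations share rank 2 and the next value gets rank 3 with no gap.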

灵芸 2024-11-14 15:33:34

I stumbled upon this solution on Statalist:

bysort group_id (var_to_rank) : gen rank = var_to_rank != var_to_rank[_n-1]
by group_id : replace rank = sum(rank)

It seems to sort out this issue.

浅浅淡淡 2024-11-14 15:33:34

@radek: you surely got it sorted out in the meantime, but this would have been an easy (though not very elegant) solution:

bysort group_id: egen desired_rank_HELP = rank(var_to_rank), field
egen desired_rank = group(group_id desired_rank_HELP)
drop desired_rank_HELP

篱下浅笙歌 2024-11-14 15:33:34

Way too much work. Here is an easy and elegant one. Try this:

gen desired_rank=int(var_to_rank/10)

一梦浮鱼 2024-11-14 15:33:34

Try this command, it works well for me: egen newid=group(oldid)
