R - How to create a variable that contains different intervals (terms of office in several countries)?

Posted 2025-01-23 14:13:40 · 1943 characters · 0 views · 0 comments

I have a database with information about elections in several countries across the globe. All of the countries in my database have elections, and every election happens the year before the elected official takes office. Hence, a term always starts the year after the election.

However, the interval between elections differs from country to country, which makes it hard to automate the process. Country X might hold elections every 4 years and Country Y every 10 years. I made a fictional database for clarity.

tab_elections <- data.frame(country = c('Country X', 'Country X', 'Country X', 'Country X', 'Country Y', 'Country Y', 'Country Y'),
                            election = c('1996', '2000', '2004', '2008', '1990', '2000', '2010'))
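
To make the differing intervals concrete, here is a minimal sketch in base R that computes the gap between consecutive elections for each country. It rebuilds the same toy data in a more compact form; the name `election_gaps` is only an illustrative choice, and the sketch assumes the rows are already sorted by year within each country.

```r
# Same toy data as above, written compactly
tab_elections <- data.frame(
  country  = c(rep("Country X", 4), rep("Country Y", 3)),
  election = c("1996", "2000", "2004", "2008", "1990", "2000", "2010")
)

# Split the election years by country, then take the difference
# between consecutive elections (assumes years are sorted per country)
election_gaps <- lapply(
  split(as.numeric(tab_elections$election), tab_elections$country),
  diff
)

election_gaps
```

For this sample, every gap for Country X is 4 years and every gap for Country Y is 10 years.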

I also have a yearly database for the same countries that does not have electoral information.

tab_country <- data.frame(country = c('Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y'),
                  year = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003, 2004, 2004, 2005, 2005, 2006, 2006, 2007, 2007, 2008, 2008, 2009, 2009, 2010, 2010, 2011, 2011))

I want to merge the "Elections database" with the "Yearly database" creating a variable that informs me "when the current president was elected".

It should look like this:

tab <- data.frame(country = c('Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y', 'Country X', 'Country Y'),
                  year = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003, 2004, 2004, 2005, 2005, 2006, 2006, 2007, 2007, 2008, 2008, 2009, 2009, 2010, 2010, 2011, 2011),
                  year_president_elected = c(1996, 1990, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2004, 2000, 2004, 2000, 2004, 2000, 2004, 2000, 2008, 2000, 2008, 2000, 2008, 2010))

I don't know if this is correct, but basically I'd like insight on how to merge both databases creating a variable that gets "the latest observation for each country without counting the year itself" (because the year the president is elected is not the year they start their term). If there is any R package that understands dates, it would be even better.

Thanks in advance.
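
The rule described above ("the latest election in the same country, strictly before the current year") can be sketched in base R as follows. This is only one possible approach under stated assumptions, not a definitive solution: the helper `last_election_before` is a hypothetical name introduced for illustration, and the sketch assumes that `election` holds plain years stored as strings, as in the toy data.

```r
# Toy data from the question, written compactly
tab_elections <- data.frame(
  country  = c(rep("Country X", 4), rep("Country Y", 3)),
  election = c("1996", "2000", "2004", "2008", "1990", "2000", "2010")
)
tab_country <- data.frame(
  country = rep(c("Country X", "Country Y"), times = 12),
  year    = rep(2000:2011, each = 2)
)

# Hypothetical helper: latest election in `country` strictly before `year`.
# Returns NA when no earlier election exists in tab_elections.
last_election_before <- function(country, year) {
  years <- as.numeric(tab_elections$election[tab_elections$country == country])
  years <- years[years < year]  # strict "<" excludes the election year itself
  if (length(years) == 0) NA_real_ else max(years)
}

# Apply the helper row by row; USE.NAMES = FALSE keeps the result unnamed
tab_country$year_president_elected <- mapply(
  last_election_before,
  tab_country$country,
  tab_country$year,
  USE.NAMES = FALSE
)
```

On the sample data this reproduces the `year_president_elected` column of `tab`. The row-by-row lookup is easy to read but may be slow on large tables; a rolling or non-equi join would likely scale better.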
