How to add rows with a condition and a binary variable to a panel dataset in R
I am constructing a panel dataset from an original dataset in which each row contains a company (name) and its sales across 10 years.
Concretely, it looks like this:
The panel dataset I am building has to look like this:
So far, I have the panel dataset with all companies but only with the years when they have sales.
For each company that stopped selling after showing positive sales (there is a "-" in year y after sales in years x and x+1), I need to add a row copying the info about the company (the whole row: name, sales, year) and put a 1 in the column "country exit". In the example above, I would have to do what has been done for company D in the last row of the second picture.
How can I avoid doing that manually in RStudio, given that approximately 250 companies in the dataset fall into this case?
Thanks
I've tried some functions in R, but I was unable to do this in a simple way that is easy to apply to every company.
This is an example using `tidyverse`. Let's say your data frame is in wide format, with one row per company and one column per year.
The first thing to do is apply `pivot_longer` to all columns except the company name, to create the `year` column.
Next, make sure the data frame is arranged correctly (by `company_name` and `year`), `group_by` company name, and for each company check: if `sales` in a row equals `NA` but is larger than 0 in the previous row, put 1 in the new column `Country Exit`.
Finally, to get the cleaner output you mention, just remove the `NA`s from `Country Exit`.
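
The answer's original code blocks are not preserved here, so the following is only a minimal sketch of the steps described above, not the answerer's actual code. The sample data, the column names `company_name`, `sales`, `year` and `country_exit`, and the use of `NA` in place of the "-" entries are all assumptions made for illustration. "Removing the `NA`s" is read here as replacing them with 0 and then dropping the no-sales rows that are not exit rows; that reading is also an assumption.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per company, one column per year.
# NA stands in for the "-" entries of the original data (assumption).
wide <- tibble(
  company_name = c("A", "B", "D"),
  `2015` = c(10, NA, 5),
  `2016` = c(12, 3, 7),
  `2017` = c(15, 4, NA)
)

panel <- wide %>%
  # Reshape to long format: one row per company and year.
  pivot_longer(-company_name, names_to = "year", values_to = "sales") %>%
  mutate(year = as.integer(year)) %>%
  # Order rows so that lag() refers to the previous year of the same company.
  arrange(company_name, year) %>%
  group_by(company_name) %>%
  # Flag a year with no sales that directly follows a year with positive sales.
  mutate(country_exit = if_else(is.na(sales) & lag(sales) > 0, 1, NA_real_)) %>%
  ungroup() %>%
  # Assumed reading of "remove the NAs": turn the remaining NAs into 0 ...
  mutate(country_exit = replace_na(country_exit, 0)) %>%
  # ... and keep only years with sales plus the flagged exit rows.
  filter(!is.na(sales) | country_exit == 1)

print(panel)
```

Because the check runs inside `group_by(company_name)`, the first year of each company has no previous row to compare against and is never flagged, so a company that starts selling late (like the hypothetical company B) is not mistaken for an exit. If the real data stores "-" as text, those entries would need converting to `NA` (and the year columns to numeric) before this pipeline runs.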