How do I convert factor levels into variables in R?
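I am relatively new to R and am trying to build a population pyramid. I need the population data for males and females side by side in two variables (popMale, popFemale). Currently Sex is a factor with 2 levels. How do I convert these 2 factor levels into 2 new variables (popMale, popFemale)? I would appreciate any help. Here is a dput snippet of my data: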
structure(list(V1 = c("Location", "Dominican Republic", "Dominican Republic",
"Dominican Republic", "Dominican Republic"), V2 = c("Sex", "Female",
"Female", "Male", "Male"), V3 = c("Age", "0-4", "5-9", "0-4",
"5-9"), V4 = c(1950L, 217L, 164L, 223L, 167L), V5 = c(1955L,
277L, 199L, 286L, 204L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L))
Comments (1)
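As your data contains the column names in the first row, the first step towards your desired result is to name the columns according to the first row and then drop that row. After doing so, convert your data to long (tidy) format, i.e. move the years and population numbers into separate columns, using e.g. tidyr::pivot_longer. Finally, you could use tidyr::pivot_wider to spread the data for males and females into separate columns.
Note: Depending on the next steps in your analysis, the last step isn't really needed and may actually complicate plotting a population pyramid.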
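The three steps described above could be sketched roughly as follows, using the dput snippet from the question. The object names (dat, dat_long, dat_wide) and the new column names Year and pop are assumptions chosen for illustration, and names_prefix = "pop" is just one way to arrive at the popFemale/popMale names asked for.

```r
library(dplyr)
library(tidyr)

# Data as posted in the question: the real column names sit in the first row.
dat <- structure(list(
  V1 = c("Location", "Dominican Republic", "Dominican Republic",
         "Dominican Republic", "Dominican Republic"),
  V2 = c("Sex", "Female", "Female", "Male", "Male"),
  V3 = c("Age", "0-4", "5-9", "0-4", "5-9"),
  V4 = c(1950L, 217L, 164L, 223L, 167L),
  V5 = c(1955L, 277L, 199L, 286L, 204L)
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L))

# Step 1: name the columns after the first row, then drop that row.
names(dat) <- as.character(unlist(dat[1, ]))
dat <- dat[-1, ]

# Step 2: long format, one row per Location/Sex/Age/Year combination.
# "Year" and "pop" are illustrative names, not taken from the question.
dat_long <- dat %>%
  pivot_longer(-c(Location, Sex, Age), names_to = "Year", values_to = "pop")

# Step 3 (optional): spread Sex into two columns, popFemale and popMale.
dat_wide <- dat_long %>%
  pivot_wider(names_from = Sex, values_from = pop, names_prefix = "pop")

dat_wide
```

If the pyramid is going to be drawn with a plotting package that expects tidy data, dat_long may already be the more convenient shape, which is why the last step is marked as optional.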