Add columns to a data frame based on values in an existing column in R

Posted on 2025-01-22 20:04:31

I'm working in RStudio and have a data frame similar to the following:

Favorite<-c("Apple","Lemon","Orange","Salat","Onion", "Apple","Strawberry","Celery","Blueberry","Sweetpotatoes","Strawberry",
                "Oragne","Celery","Sweetpotatoes","Onion","Blueberry","Strawberry","Salad")
PersonID<-c(67,82,67,21,02,12,90,23,65,32,44,67,56,77,30,198,20,99)
all_Data<-data.frame(PersonID,Favorite)

> head(all_Data)
  PersonID Favorite
1       67    Apple
2       82    Lemon
3       67   Orange
4       21    Salat
5        2    Onion
6       12    Apple

I want to add 3 more columns, and they should contain the following:

If a row in all_Data$Favorite is Apple or Blueberry, then all_Data$Country = Ireland, all_Data$Continent = Europe and all_Data$city = Belfast

If a row in all_Data$Favorite is Strawberry, then all_Data$Country = Holland, all_Data$Continent = Europe and all_Data$city = Emmen

If a row in all_Data$Favorite is Lemon or Orange, then all_Data$Country = France, all_Data$Continent = Europe and all_Data$city = Menton

If a row in all_Data$Favorite is Salad or Onion, then all_Data$Country = Sweden, all_Data$Continent = Europe and all_Data$city = Malmoe

If a row in all_Data$Favorite is Sweetpotatoes, then all_Data$Country = USA, all_Data$Continent = America and all_Data$city = Verona

If a row in all_Data$Favorite is Celery, then all_Data$Country = Germany, all_Data$Continent = Europe and all_Data$city = Berlin

library(tidyverse)

all_Data |> 
  mutate(ctry_cont = case_when(
    str_detect(Favorite, "Appl|Blueb")  ~ "Ireland|Europe",
    str_detect(Favorite, "Straw")       ~ "Brazillian|South's of America",
    str_detect(Favorite, "Lemon|Orang") ~ "France|Europe",
    str_detect(Favorite, "Salad|Onion") ~ "Sweden|Europe",
    str_detect(Favorite, "Sweetpot")    ~ "United of state|America",
    str_detect(Favorite, "Celery")      ~ "Germany|Europe",
    TRUE                                ~ "Other|Other"
  )) |> 
  separate(ctry_cont, c("country", "continent"))

After running the code above I get the following warning and output, where multi-word values are cut off: only "United" and "of" remain of "United of state", and "South's of America" is reduced to "South". I added the value with an apostrophe because my original data contains words with apostrophes, and it is cut off as well:

     PersonID      Favorite    country continent
1        67         Apple    Ireland    Europe
2        82         Lemon     France    Europe
3        67        Orange     France    Europe
4        21         Salat      Other     Other
5         2         Onion     Sweden    Europe
6        12         Apple    Ireland    Europe
7        90    Strawberry Brazillian     South
8        23        Celery    Germany    Europe
9        65     Blueberry    Ireland    Europe
10       32 Sweetpotatoes     United        of
11       44    Strawberry Brazillian     South
12       67        Oragne      Other     Other
13       56        Celery    Germany    Europe
14       77 Sweetpotatoes     United        of
15       30         Onion     Sweden    Europe
16      198     Blueberry    Ireland    Europe
17       20    Strawberry Brazillian     South
18       99         Salad     Sweden    Europe

    Warning message:
Expected 2 pieces. Additional pieces discarded in 5 rows [7, 10, 11, 14, 17].

I also tried adding sep="" in the last step of the code, but it gives an error:

separate(ctry_cont, c("country", "continent"), sep="")
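
A likely explanation for the warning is the default separator. The tidyr documentation gives the default `sep` of `separate()` as the regular expression `"[^[:alnum:]]+"`, so every run of non-alphanumeric characters (spaces and apostrophes as well as `|`) acts as a split point. The sketch below reproduces that behaviour with base R's `strsplit()` instead of `separate()` itself, so it is an approximation of what happens rather than tidyr's actual code path; the variable names are illustrative only.

```r
# Two of the values used in the question's case_when() call
x <- c("Brazillian|South's of America", "United of state|America")

# Assumption: this mimics separate()'s documented default, sep = "[^[:alnum:]]+"
default_split <- strsplit(x, "[^[:alnum:]]+")
default_split[[1]]  # five pieces; separate() keeps the first two and warns
default_split[[2]]  # four pieces

# Splitting only on a literal pipe. "|" means alternation in a regex,
# so it has to be escaped to be matched literally.
pipe_split <- strsplit(x, "\\|")
pipe_split[[1]]
pipe_split[[2]]
```

Under the same assumption, passing `sep = "\\|"` to `separate()` should keep the multi-word and apostrophe values in one piece.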


天涯离梦残月幽梦 2025-01-29 20:04:32

You could do something like this ...

Favorite <- c(
  "Apple",
  "Lemon",
  "Orange",
  "Salad",
  "Onion",
  "Apple",
  "Strawberry",
  "Celery",
  "Blueberry",
  "Sweetpotatoes",
  "Strawberry",
  "Orange",
  "Celery",
  "Sweetpotatoes",
  "Onion",
  "Blueberry",
  "Strawberry",
  "Salad"
)

PersonID <-
  c(67, 82, 67, 21, 02, 12, 90, 23, 65, 32, 44, 67, 56, 77, 30, 198, 20, 99)

all_Data <- data.frame(PersonID, Favorite)

library(tidyverse)

# Encode country and continent in one string, using ", " as the delimiter
all_Data |> 
  mutate(ctry_cont = case_when(
    str_detect(Favorite, "Appl|Blueb")  ~ "Ireland, Europe",
    str_detect(Favorite, "Straw")       ~ "Holland, Europe",
    str_detect(Favorite, "Lemon|Orang") ~ "France, Europe",
    str_detect(Favorite, "Salad|Onion") ~ "Sweden, Europe",
    str_detect(Favorite, "Sweetpot")    ~ "United States, North America",
    str_detect(Favorite, "Celery")      ~ "Germany, Europe",
    TRUE                                ~ "Other, Other"  # fallback when no pattern matches
  )) |> 
  # Split on the explicit delimiter so multi-word names stay intact
  separate(ctry_cont, c("country", "continent"), sep = ", ")
#>    PersonID      Favorite       country     continent
#> 1        67         Apple       Ireland        Europe
#> 2        82         Lemon        France        Europe
#> 3        67        Orange        France        Europe
#> 4        21         Salad        Sweden        Europe
#> 5         2         Onion        Sweden        Europe
#> 6        12         Apple       Ireland        Europe
#> 7        90    Strawberry       Holland        Europe
#> 8        23        Celery       Germany        Europe
#> 9        65     Blueberry       Ireland        Europe
#> 10       32 Sweetpotatoes United States North America
#> 11       44    Strawberry       Holland        Europe
#> 12       67        Orange        France        Europe
#> 13       56        Celery       Germany        Europe
#> 14       77 Sweetpotatoes United States North America
#> 15       30         Onion        Sweden        Europe
#> 16      198     Blueberry       Ireland        Europe
#> 17       20    Strawberry       Holland        Europe
#> 18       99         Salad        Sweden        Europe

Created on 2022-04-22 by the reprex package (v2.0.1)
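
The question also asks for a city column, which the code above does not add. One alternative, sketched here rather than taken from the answer, is to keep the mapping in a small lookup table and match on it, which avoids packing several values into one string. The table name `lookup` and the helper `idx` are made up for illustration; the values come from the rules stated in the question. Note that this uses exact matching, so the misspelled entries in the question's data ("Salat", "Oragne") get `NA` instead of "Other".

```r
# Data from the question, including its misspelled entries
Favorite <- c("Apple", "Lemon", "Orange", "Salat", "Onion", "Apple",
              "Strawberry", "Celery", "Blueberry", "Sweetpotatoes",
              "Strawberry", "Oragne", "Celery", "Sweetpotatoes", "Onion",
              "Blueberry", "Strawberry", "Salad")
PersonID <- c(67, 82, 67, 21, 2, 12, 90, 23, 65, 32, 44, 67, 56, 77, 30,
              198, 20, 99)
all_Data <- data.frame(PersonID, Favorite)

# Hypothetical lookup table: one row per Favorite value
lookup <- data.frame(
  Favorite  = c("Apple", "Blueberry", "Strawberry", "Lemon", "Orange",
                "Salad", "Onion", "Sweetpotatoes", "Celery"),
  country   = c("Ireland", "Ireland", "Holland", "France", "France",
                "Sweden", "Sweden", "USA", "Germany"),
  continent = c("Europe", "Europe", "Europe", "Europe", "Europe",
                "Europe", "Europe", "America", "Europe"),
  city      = c("Belfast", "Belfast", "Emmen", "Menton", "Menton",
                "Malmoe", "Malmoe", "Verona", "Berlin")
)

# Position of each Favorite in the lookup table (NA when there is no match)
idx <- match(all_Data$Favorite, lookup$Favorite)

# Attach the three new columns; unmatched rows are filled with NA
result <- cbind(all_Data, lookup[idx, c("country", "continent", "city")])
rownames(result) <- NULL
result
```

With dplyr loaded, `left_join(all_Data, lookup, by = "Favorite")` should give the same columns; the base R version is shown only so the sketch runs without extra packages.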
