Find the unique values of a column, excluding values in specified vectors

Posted 2025-02-12 20:01:43

I'd like to find the unique values of a column, excluding any values that appear in specified vectors. In the example data below, I want the unique values of the column all_areas minus the values in the vectors area1 and area2,
i.e. the result should be "town", "city", "village".

set.seed(1)
area_df = data.frame(all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"),5),20),
                    number =  sample(1:100, 20))

area1 = c("foo", "bar")
area2 = c("big", "small")

Comments (3)

番薯 2025-02-19 20:01:43

You could use the function setdiff to find the set difference between all_areas and area1 and area2 combined:

setdiff(area_df$all_areas, c(area1, area2))

[1] "city" "village" "town"   
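
To make the behaviour of setdiff easier to see, here is a minimal sketch on a small fixed vector. The input values are hypothetical (they are not the sampled data from the question), chosen so the result does not depend on the random seed or the R version. It illustrates that setdiff already removes duplicates and returns values in the order they first appear in its first argument.

```r
# Hypothetical fixed input (not the sampled data above), so the result is reproducible
all_areas <- c("foo", "town", "big", "city", "town", "village", "bar", "city")
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# setdiff() drops every value found in the second argument and removes duplicates,
# keeping the order of first appearance in the first argument
remaining <- setdiff(all_areas, c(area1, area2))
print(remaining)
# "town" "city" "village"
```

Because the order follows first appearance, the order of the result on the sampled data depends on which rows sample happens to draw.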

別甾虛僞 2025-02-19 20:01:43

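We can use %in% to create a logical vector, negate it with ! to keep the rows whose 'all_areas' value is not in either vector, and then return the unique rows with unique. Note that unique is applied to whole rows here, so an area name repeats whenever its number differs: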

unique(subset(area_df, !all_areas %in% c(area1, area2)))

Output:

   all_areas number
5    village     44
7       city     33
8       town     84
9       city     35
10   village     70
11      town     74
16   village     87
19      town     40
20   village     93
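
If only the unique area names are wanted, rather than unique rows, the same %in% filter can be applied to the column itself. The following is a sketch under that assumption; the names keep and remaining are introduced here for illustration, and the as.character call is a precaution in case all_areas is stored as a factor (the data.frame default before R 4.0).

```r
set.seed(1)
area_df <- data.frame(
  all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"), 5), 20),
  number = sample(1:100, 20)
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

# Logical vector: TRUE for rows whose area is in neither exclusion vector
keep <- !area_df$all_areas %in% c(area1, area2)

# Subset the column (not the whole data frame), then deduplicate
remaining <- unique(as.character(area_df$all_areas[keep]))
print(remaining)
```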

白云不回头 2025-02-19 20:01:43

With a dplyr approach:

library(dplyr)

area_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%
  distinct()

#>   all_areas number
#> 1   village     44
#> 2      city     33
#> 3      town     84
#> 4      city     35
#> 5   village     70
#> 6      town     74
#> 7   village     87
#> 8      town     40
#> 9   village     93
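
As with the row-based output above, distinct with no arguments deduplicates whole rows. A possible variation, assuming only the area names are needed, is to pass the column to distinct and then extract it with pull. The name remaining is introduced here for illustration.

```r
library(dplyr)

set.seed(1)
area_df <- data.frame(
  all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"), 5), 20),
  number = sample(1:100, 20)
)
area1 <- c("foo", "bar")
area2 <- c("big", "small")

remaining <- area_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%  # drop rows with excluded areas
  distinct(all_areas) %>%                      # deduplicate on all_areas only; other columns are dropped
  pull(all_areas)                              # extract the column as a plain vector
print(remaining)
```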