Find the unique values in a column, excluding the values in given vectors

Posted on 2025-02-12 20:01:43

I'd like to find the unique values of a column, but take away values that are in specified vectors. In the example data below I'd like to find the unique values from the column all_areas minus the values in the vectors area1 and area2.
i.e. the result should be "town", "city", "village"

set.seed(1)
area_df = data.frame(all_areas = sample(rep(c("foo", "bar", "big", "small", "town", "city", "village"),5),20),
                    number =  sample(1:100, 20))

area1 = c("foo", "bar")
area2 = c("big", "small")

番薯 2025-02-19 20:01:43

You could use the function setdiff to find the set difference between all_areas and area1 and area2 combined:

setdiff(area_df$all_areas, c(area1, area2))

[1] "city" "village" "town"
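
As a minimal sketch of why no separate call to unique is needed here: setdiff treats both arguments as sets, so duplicates in the column are dropped automatically. The data frame `toy_df` below is a made-up stand-in for area_df (an assumption for illustration), chosen so that the result does not depend on the random sample.

```r
# Hypothetical toy data (not the question's sampled data), so the result
# is the same regardless of R version or random seed.
toy_df <- data.frame(
  all_areas = c("foo", "town", "big", "city", "town", "bar", "village", "city"),
  number    = 1:8
)

area1 <- c("foo", "bar")
area2 <- c("big", "small")

# setdiff() removes duplicates from its first argument and drops every
# value that also appears in the second argument.
result <- setdiff(toy_df$all_areas, c(area1, area2))
print(result)
```

The remaining values come back in the order in which they first appear in the column.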

別甾虛僞 2025-02-19 20:01:43

We may use %in% to create a logical vector, negate it (!) to keep only the rows whose 'all_areas' value is not in either vector, and then return the unique rows with unique:

unique(subset(area_df, !all_areas %in% c(area1, area2)))

-output

   all_areas number
5    village     44
7       city     33
8       town     84
9       city     35
10   village     70
11      town     74
16   village     87
19      town     40
20   village     93
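
Note that the output above consists of whole rows, so an area can appear more than once when its number differs. If only the distinct area names are wanted, the same %in% idea can be applied to the column itself. The following is a rough sketch on a made-up data frame `toy_df` (a hypothetical stand-in for area_df, not the question's data):

```r
# Hypothetical toy data standing in for area_df.
toy_df <- data.frame(
  all_areas = c("foo", "town", "big", "city", "town", "bar", "village", "city"),
  number    = 1:8,
  stringsAsFactors = FALSE
)

area1 <- c("foo", "bar")
area2 <- c("big", "small")

# TRUE for rows whose area is in neither vector.
keep <- !toy_df$all_areas %in% c(area1, area2)

# Subset the column (not the data frame), then de-duplicate.
unique_areas <- unique(toy_df$all_areas[keep])
print(unique_areas)
```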

白云不回头 2025-02-19 20:01:43

With a dplyr approach:

library(dplyr)

area_df %>% 
  filter(!all_areas %in% c(area1, area2)) %>% 
  distinct()

#>   all_areas number
#> 1   village     44
#> 2      city     33
#> 3      town     84
#> 4      city     35
#> 5   village     70
#> 6      town     74
#> 7   village     87
#> 8      town     40
#> 9   village     93
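
The pipeline above de-duplicates whole rows. To get just the area names as a vector, one possible variation is to pass the column to distinct and then extract it with pull. This is a sketch on a made-up data frame `toy_df` (a hypothetical stand-in for area_df), assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical toy data standing in for area_df.
toy_df <- data.frame(
  all_areas = c("foo", "town", "big", "city", "town", "bar", "village", "city"),
  number    = 1:8,
  stringsAsFactors = FALSE
)

area1 <- c("foo", "bar")
area2 <- c("big", "small")

unique_areas <- toy_df %>%
  filter(!all_areas %in% c(area1, area2)) %>%  # drop the excluded areas
  distinct(all_areas) %>%                      # keep one row per area name
  pull(all_areas)                              # return the column as a vector

print(unique_areas)
```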
