如何获得不相交的积分，r

发布于 2025-01-18 06:06:43 字数 491 浏览 2 评论 0原文

我在坐标系中有两组：A和B。B中的所有点都在A中。因此，我想选择不在B中的A中的点。我正在使用SF库，但我有点困惑。

在这里，我显示了一个可再现的Exameple。考虑下一个数据。

fname <- system.file("shape/nc.shp", package="sf")
nc <- st_read(fname)
library(sp)
library(sf)
data(meuse)
coordinates(meuse) = ~x+y

因此，我从那里开始有接下来的两组A和B点A和B

A <- st_as_sf(meuse) %>% select(geometry) #155 obs
B <- filter(A, row_number()<= 100)%>% select(geometry) #100 obs

，我想挽救A中不在B（即剩下的55分）中的A中的点。

原文

I have two sets of points in coordinate system: A and B. All points in B are in A. So I want to pick the points in A which are not in B.
I'm using the sf library, but I'm little stuck.

Here I show a reproducible exameple. Consider the next data.

fname <- system.file("shape/nc.shp", package="sf")
nc <- st_read(fname)
library(sp)
library(sf)
data(meuse)
coordinates(meuse) = ~x+y

So I have the next two set of points A and B

A <- st_as_sf(meuse) %>% select(geometry) #155 obs
B <- filter(A, row_number()<= 100)%>% select(geometry) #100 obs

Starting from there, I want to rescue the points in A that are not in B, i.e., the remaining 55 points.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

红衣飘飘貌似仙 2025-01-25 06:06:44

看起来您可以在％操作员中使用％或st_difference函数。如果使用st_difference，则可能需要ST_COMBINE单独设置，如果这些点为sf，则不是sfc sfc代码>。

这些方法可能会因您使用的数据类型而有所不同。它们只是点（SFC），还是与数据（SF）相关的点？

*Edit for your reproducible example:

library(dplyr)
library(ggplot2)
library(sp)
library(sf)

data(meuse)
coordinates(meuse) = ~x+y

A <- st_as_sf(meuse) %>% select(geometry) #155 obs
B <- filter(A, row_number()<= 100)%>% select(geometry) #100 obs

diff_meuse <- st_difference(st_combine(A), st_combine(B)) %>% st_cast('POINT')

diff_meuse
#> Geometry set for 55 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 178786 ymin: 329714 xmax: 180923 ymax: 333116
#> CRS:           NA
#> First 5 geometries:
#> POINT (178786 329822)
#> POINT (178875 330311)
#> POINT (179024 329733)
#> POINT (179030 330082)
#> POINT (179034 330561)

ggplot() +
  geom_sf(data = A, size = 4, alpha = .3) + 
  geom_sf(data = diff_meuse, color = 'red')

^{Created on 2022-03-31 by the reprex package (v2.0.1)}


``` r
library(sf)
library(ggplot2)
set.seed(10)

# using nc data incl w/sf
nc <- st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source  
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS:  NAD27

a_b <- st_sample(nc, size = 20)

a <- a_b[1:20] 
b <- a_b[1:5]  

diff <- a[!a%in%b]
head(diff)
#> Geometry set for 6 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -81.98142 ymin: 34.15653 xmax: -78.5457 ymax: 36.41477
#> Geodetic CRS:  NAD27
#> First 5 geometries:
#> POINT (-78.5457 34.15653)
#> POINT (-79.28979 36.045)
#> POINT (-79.03986 36.41477)
#> POINT (-80.52165 35.15185)
#> POINT (-81.98142 35.45012)

## or using st_difference
diff_st <- st_difference(st_combine(a), st_combine(b))
head(st_cast(diff_st, 'POINT'))
#> Geometry set for 6 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -82.19945 ymin: 35.11489 xmax: -80.52165 ymax: 35.93451
#> Geodetic CRS:  NAD27
#> First 5 geometries:
#> POINT (-82.19945 35.93451)
#> POINT (-80.57638 35.60809)
#> POINT (-80.72523 35.14048)
#> POINT (-80.78783 35.11489)
#> POINT (-80.52165 35.15185)

ggplot() + 
  geom_sf(data = diff, color = "black", alpha = .2, size = 6) +
  geom_sf(data = diff_st, color = "black", alpha = .8, size = 4) + 
  geom_sf(data = a, color = 'red', size = 1)

^{Created on 2022-03-31 by the reprex package (v2.0.1)< /sup>}

Looks like you can use the %in% operator or the st_difference function. If using st_difference, you might need to st_combine each set individually if the points are sf but not if they're sfc.

The methods may differ a little depending on the type of data you're using. Are they just points(sfc), or are they points associated with data(sf)?

*Edit for your reproducible example:

library(dplyr)
library(ggplot2)
library(sp)
library(sf)

data(meuse)
coordinates(meuse) = ~x+y

A <- st_as_sf(meuse) %>% select(geometry) #155 obs
B <- filter(A, row_number()<= 100)%>% select(geometry) #100 obs

diff_meuse <- st_difference(st_combine(A), st_combine(B)) %>% st_cast('POINT')

diff_meuse
#> Geometry set for 55 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 178786 ymin: 329714 xmax: 180923 ymax: 333116
#> CRS:           NA
#> First 5 geometries:
#> POINT (178786 329822)
#> POINT (178875 330311)
#> POINT (179024 329733)
#> POINT (179030 330082)
#> POINT (179034 330561)

ggplot() +
  geom_sf(data = A, size = 4, alpha = .3) + 
  geom_sf(data = diff_meuse, color = 'red')

^{Created on 2022-03-31 by the reprex package (v2.0.1)}


``` r
library(sf)
library(ggplot2)
set.seed(10)

# using nc data incl w/sf
nc <- st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source  
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS:  NAD27

a_b <- st_sample(nc, size = 20)

a <- a_b[1:20] 
b <- a_b[1:5]  

diff <- a[!a%in%b]
head(diff)
#> Geometry set for 6 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -81.98142 ymin: 34.15653 xmax: -78.5457 ymax: 36.41477
#> Geodetic CRS:  NAD27
#> First 5 geometries:
#> POINT (-78.5457 34.15653)
#> POINT (-79.28979 36.045)
#> POINT (-79.03986 36.41477)
#> POINT (-80.52165 35.15185)
#> POINT (-81.98142 35.45012)

## or using st_difference
diff_st <- st_difference(st_combine(a), st_combine(b))
head(st_cast(diff_st, 'POINT'))
#> Geometry set for 6 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -82.19945 ymin: 35.11489 xmax: -80.52165 ymax: 35.93451
#> Geodetic CRS:  NAD27
#> First 5 geometries:
#> POINT (-82.19945 35.93451)
#> POINT (-80.57638 35.60809)
#> POINT (-80.72523 35.14048)
#> POINT (-80.78783 35.11489)
#> POINT (-80.52165 35.15185)

ggplot() + 
  geom_sf(data = diff, color = "black", alpha = .2, size = 6) +
  geom_sf(data = diff_st, color = "black", alpha = .8, size = 4) + 
  geom_sf(data = a, color = 'red', size = 1)

^{Created on 2022-03-31 by the reprex package (v2.0.1)}

回复收藏 0 原文

很糊涂小朋友 2025-01-25 06:06:44

使用 st_difference 完全没问题，但是在某些情况下，使用空间操作可能会很昂贵（时间和内存）。

因此，对于此示例，您可以考虑将数据视为 data.frame，而不是空间对象，并使用“反连接”

library(data.table)
library(sfheaders)

## Convert to data.frame/data.table
dfA <- sfheaders::sf_to_df(A, fill = T)
dfB <- sfheaders::sf_to_df(B, fill = T)

setDT(dfA)
setDT(dfB)

## do an 'anti-join' to find those in A but not in B 
dfA[ !dfB, on = .(x, y) ]

## I prefer using data.table, but the same anti-join exists in dplyr
dplyr::anti_join(dfA, dfB, by = c("x","y"))


library(ggplot2)

ggplot() +
  geom_sf(data = A, size = 4, alpha = 0.3) +
  geom_point(data = dfA[ !dfB, on = .(x, y) ], aes(x = x, y = y), colour = 'red')

Using st_difference is perfectly fine, but, in some cases using spatial operations can be costly (in time and memory).

So for this example you could consider treating the data as data.frames, rather than spatial objects, and use an 'anti-join'

library(data.table)
library(sfheaders)

## Convert to data.frame/data.table
dfA <- sfheaders::sf_to_df(A, fill = T)
dfB <- sfheaders::sf_to_df(B, fill = T)

setDT(dfA)
setDT(dfB)

## do an 'anti-join' to find those in A but not in B 
dfA[ !dfB, on = .(x, y) ]

## I prefer using data.table, but the same anti-join exists in dplyr
dplyr::anti_join(dfA, dfB, by = c("x","y"))


library(ggplot2)

ggplot() +
  geom_sf(data = A, size = 4, alpha = 0.3) +
  geom_point(data = dfA[ !dfB, on = .(x, y) ], aes(x = x, y = y), colour = 'red')

回复收藏 0 原文

~没有更多了~