使用来自不同来源的 Shapefile 和数据文件在 R 中绘制专题地图

发布于 2024-12-26 13:11:55 字数 2386 浏览 5 评论 0原文

给定一个 shapefile,如何塑造和使用数据文件,以便能够使用与 shapefile 中的形状区域相对应的标识符来绘制专题图?

#Download English Government Office Network Regions (GOR) from:
#http://www.sharegeo.ac.uk/handle/10672/50
tmp_dir = tempdir()
url_data = "http://www.sharegeo.ac.uk/download/10672/50/English%20Government%20Office%20Network%20Regions%20(GOR).zip"
zip_file = sprintf("%s/shpfile.zip", tmp_dir)
download.file(url_data, zip_file)
unzip(zip_file, exdir = tmp_dir)

library(maptools)

#Load in the data file (could this be done from the downloaded zip file directly?
gor=readShapeSpatial(sprintf('%s/Regions.shp', tmp_dir))

#I can plot the shapefile okay...
plot(gor)

#and I can use these commands to get a feel for the data...
summary(gor)
attributes(gor@data)
gor@data$NAME
#[1] North East               North West              
#[3] Greater London Authority West Midlands           
#[5] Yorkshire and The Humber South West              
#[7] East Midlands            South East              
#[9] East of England         
#9 Levels: East Midlands East of England ... Yorkshire and The Humber

#download data from http://www.justice.gov.uk/downloads/publications/statistics-and-data/courts-and-sentencing/csq-q3-2011-insolvency-tables.csv
#insolvency<- read.csv("~/Downloads/csq-q3-2011-insolvency-tables.csv")
insolvency=read.csv("http://www.justice.gov.uk/downloads/publications/statistics-and-data/courts-and-sentencing/csq-q3-2011-insolvency-tables.csv")
insolvencygor.2011Q3=subset(insolvency,Time.Period=='2011 Q3' & Geography.Type=='Government office region')
#tidy the data
require(gdata)
insolvencygor.2011Q3=drop.levels(insolvencygor.2011Q3)

names(insolvencygor.2011Q3)
#[1] "Time.Period"                 "Geography"                  
#[3] "Geography.Type"              "Company.Winding.up.Petition"
#[5] "Creditors.Petition"          "Debtors.Petition"  

levels(insolvencygor.2011Q3$Geography)
#[1] "East"                     "East Midlands"           
#[3] "London"                   "North East"              
#[5] "North West"               "South East"              
#[7] "South West"               "Wales"                   
#[9] "West Midlands"            "Yorkshire and the Humber"

#So what next?   

到目前为止,我如何采取下一步来生成主题/分区统计图,例如根据 Debtors.Petition 值为每个区域着色?

(我还刚刚注意到一个可能的问题 - GOR 级别的大写不匹配:“Yorkshire and the Humber”和“Yorkshire and The Humber”)

Given a shapefile, how do I shape and use a data file in order to be able to plot thematic maps using identifiers that correspond to shape regions in the shapefile?

#Download English Government Office Network Regions (GOR) from:
#http://www.sharegeo.ac.uk/handle/10672/50
tmp_dir = tempdir()
url_data = "http://www.sharegeo.ac.uk/download/10672/50/English%20Government%20Office%20Network%20Regions%20(GOR).zip"
zip_file = sprintf("%s/shpfile.zip", tmp_dir)
download.file(url_data, zip_file)
unzip(zip_file, exdir = tmp_dir)

library(maptools)

#Load in the data file (could this be done from the downloaded zip file directly?
gor=readShapeSpatial(sprintf('%s/Regions.shp', tmp_dir))

#I can plot the shapefile okay...
plot(gor)

#and I can use these commands to get a feel for the data...
summary(gor)
attributes(gor@data)
gor@data$NAME
#[1] North East               North West              
#[3] Greater London Authority West Midlands           
#[5] Yorkshire and The Humber South West              
#[7] East Midlands            South East              
#[9] East of England         
#9 Levels: East Midlands East of England ... Yorkshire and The Humber

#download data from http://www.justice.gov.uk/downloads/publications/statistics-and-data/courts-and-sentencing/csq-q3-2011-insolvency-tables.csv
#insolvency<- read.csv("~/Downloads/csq-q3-2011-insolvency-tables.csv")
insolvency=read.csv("http://www.justice.gov.uk/downloads/publications/statistics-and-data/courts-and-sentencing/csq-q3-2011-insolvency-tables.csv")
insolvencygor.2011Q3=subset(insolvency,Time.Period=='2011 Q3' & Geography.Type=='Government office region')
#tidy the data
require(gdata)
insolvencygor.2011Q3=drop.levels(insolvencygor.2011Q3)

names(insolvencygor.2011Q3)
#[1] "Time.Period"                 "Geography"                  
#[3] "Geography.Type"              "Company.Winding.up.Petition"
#[5] "Creditors.Petition"          "Debtors.Petition"  

levels(insolvencygor.2011Q3$Geography)
#[1] "East"                     "East Midlands"           
#[3] "London"                   "North East"              
#[5] "North West"               "South East"              
#[7] "South West"               "Wales"                   
#[9] "West Midlands"            "Yorkshire and the Humber"

#So what next?   

Having got that far, how do I take the next step in generating a thematic/choropleth map, that colours each region according to the the Debtors.Petition value, for example?

(I also just noticed a possible gotcha - there is a mismatch in the capitalisation GOR levels: "Yorkshire and the Humber" and "Yorkshire and The Humber" )

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

北斗星光 2025-01-02 13:11:55

没有看到树木的木材,为了回答我自己的问题,这里有一种方法(代码来自问题中的代码):

#Convert factors to numeric [ http://stackoverflow.com/questions/4798343/convert-factor-to-integer ]
#There's probably a much better formulaic way of doing this/automating this?
insolvencygor.2011Q3$Creditors.Petition=as.numeric(levels(insolvencygor.2011Q3$Creditors.Petition))[insolvencygor.2011Q3$Creditors.Petition]
insolvencygor.2011Q3$Company.Winding.up.Petition=as.numeric(levels(insolvencygor.2011Q3$Company.Winding.up.Petition))[insolvencygor.2011Q3$Company.Winding.up.Petition]
insolvencygor.2011Q3$Debtors.Petition=as.numeric(levels(insolvencygor.2011Q3$Debtors.Petition))[insolvencygor.2011Q3$Debtors.Petition]

#Tweak the levels so they match exactly (really should do this via a lookup table of some sort?)
i2=insolvencygor.2011Q3
i2c=c('East of England','East Midlands','Greater London Authority','North East','North West','South East','South West','Wales','West Midlands','Yorkshire and The Humber')
i2$Geography=factor(i2$Geography,labels=i2c)

#Merge the data with the shapefile
gor@data=merge(gor@data,i2,by.x='NAME',by.y='Geography')

#Plot the data using a greyscale
plot(gor,col=gray(gor@data$Creditors.Petition/max(gor@data$Creditors.Petition)))

所以这种方法的作用是将数字数据合并到 shapefile 中,然后直接绘制它。

也就是说,将数据文件和形状文件分开不是更干净的方法吗? (我还是不知道该怎么做?)

Having not seen the wood for the trees, to answer my own question, here's one way (code following on from code in the question):

#Convert factors to numeric [ http://stackoverflow.com/questions/4798343/convert-factor-to-integer ]
#There's probably a much better formulaic way of doing this/automating this?
insolvencygor.2011Q3$Creditors.Petition=as.numeric(levels(insolvencygor.2011Q3$Creditors.Petition))[insolvencygor.2011Q3$Creditors.Petition]
insolvencygor.2011Q3$Company.Winding.up.Petition=as.numeric(levels(insolvencygor.2011Q3$Company.Winding.up.Petition))[insolvencygor.2011Q3$Company.Winding.up.Petition]
insolvencygor.2011Q3$Debtors.Petition=as.numeric(levels(insolvencygor.2011Q3$Debtors.Petition))[insolvencygor.2011Q3$Debtors.Petition]

#Tweak the levels so they match exactly (really should do this via a lookup table of some sort?)
i2=insolvencygor.2011Q3
i2c=c('East of England','East Midlands','Greater London Authority','North East','North West','South East','South West','Wales','West Midlands','Yorkshire and The Humber')
i2$Geography=factor(i2$Geography,labels=i2c)

#Merge the data with the shapefile
gor@data=merge(gor@data,i2,by.x='NAME',by.y='Geography')

#Plot the data using a greyscale
plot(gor,col=gray(gor@data$Creditors.Petition/max(gor@data$Creditors.Petition)))

So what this approach does is merge the numeric data into the shapefile, and then plot it directly.

That said, wouldn't a cleaner way be to keep the data file and the shapefile separate? (I'm still not sure how to do that?)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文