Reading multiple .gpx files

Posted 2024-11-16 05:38:01

Suppose I have a number of .gpx files (these contain GPX waypoint data from a Garmin eTrex). I want to load them into R with different names and manipulate them.

I can read one file thus:

library(maptools)
gpx.raw <- readGPS(i = "gpx", f = "file1_w_12_f_ddf.gpx", type="w")

Suppose I want to read a number of them into memory. I could try a for loop:

files <- list.files(".",pattern = "*.gpx")
for(x in files){

    #Create new file name
    temp <- strsplit(x,"_",fixed=TRUE)
    visit.id <- sapply(temp,FUN=function(x){paste(x[1],x[4],substr(x[5],1,3),sep="_")})

    #read file with new filename
    assign(visit.id, readGPS(i = "gpx", f = x, type="w"))
}

Running the above program yields the following error:

Error in read.table(con <- textConnection(gpsdata), fill = TRUE, ...) :
no lines available in input
In addition: Warning message:
running command 'C:\PROGRA~2\GPSBabel\gpsbabel.exe -w -i gpx -f file1_w_12_f_ddf.gpx -o tabsep -F -' had status 1

Note that I was able to read this file on its own, so it would seem it has nothing to do with the file itself but with running readGPS in a loop.

In general I still find it very confusing how R treats variables like x above. I am not sure how to adapt the argument from the standalone call, f = "file1_w_12_f_ddf.gpx": should it be x, or f = x, or f = "x", or something else? Or maybe the problem is in the call to GPSBabel...
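For readers with the same doubt about x, f = x and f = "x": a minimal sketch of how R treats a bare name versus a quoted string in a function call. It uses a hypothetical stand-in function, show_file, rather than readGPS itself, so it runs without maptools or GPSBabel. An unquoted name is looked up and its value is passed; a quoted string is passed literally.

```r
# show_file() is a hypothetical stand-in for readGPS(); it only reports
# the value that its f argument received.
show_file <- function(i, f) {
  paste("would read:", f)
}

x <- "file1_w_12_f_ddf.gpx"

by_name     <- show_file(i = "gpx", f = x)    # named argument: the value of x
by_position <- show_file("gpx", x)            # positional: the same value
literal     <- show_file(i = "gpx", f = "x")  # the one-character string "x"

print(by_name)
print(literal)
```

Under this reading, f = x as written in the loop passes the file name, while f = "x" would ask for a file literally called x.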

I include a sample file below so you can copy it into a text editor, save it twice as .gpx files with different names, and try it yourself.

<?xml version="1.0" encoding="UTF-8"?>
<gpx
 version="1.0"
 creator="GPSBabel - http://www.gpsbabel.org"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns="http://www.topografix.com/GPX/1/0"
 xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd">
 <time>2010-09-14T18:35:43Z</time>
 <bounds minlat="18.149888897" minlon="-96.747799935" maxlat="50.982883293" maxlon="121.640266674"/>
<wpt lat="38.855549991" lon="-94.799016668">
<ele>325.049072</ele>
 <name>GARMIN</name>
 <cmt>GARMIN</cmt>
 <desc>GARMIN</desc>
 <sym>Flag</sym>
 </wpt>
 <wpt lat="50.982883293" lon="-1.463899976">
 <ele>35.934692</ele>
 <name>GRMEUR</name>
 <cmt>GRMEUR</cmt>
 <desc>GRMEUR</desc>
 <sym>Flag</sym>
 </wpt>
 <wpt lat="25.061783362" lon="121.640266674">
 <ele>38.097656</ele>
 <name>GRMTWN</name>
 <cmt>GRMTWN</cmt>
 <desc>GRMTWN</desc>
 <sym>Flag</sym>
 </wpt>
 </gpx>

NOTE: To run readGPS you will need the open source GPSBabel program installed and referenced in your PATH variable.

留一抹残留的笑 2024-11-23 05:38:01

Fred,

After installing GPSBabel and updating the PATH variable, your code snippet ran fine. I have two objects named test1.gpx_NA_NA and test2.gpx_NA_NA, each with three observations of 28 variables. Is that right? I assume the NA parts of the object names come from how you define visit.id, since my test file names don't fit that pattern.
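The NA parts can be reproduced without reading any file. The sketch below wraps the naming rule from the question in a function (make_visit_id is a name chosen here for illustration) and applies it to a file name with five underscore-separated pieces and to one with none.

```r
# Same naming rule as in the question, wrapped in a function for reuse.
# make_visit_id is an illustrative name, not part of any package.
make_visit_id <- function(filename) {
  parts <- strsplit(filename, "_", fixed = TRUE)[[1]]
  paste(parts[1], parts[4], substr(parts[5], 1, 3), sep = "_")
}

id_fits   <- make_visit_id("file1_w_12_f_ddf.gpx")  # "file1_f_ddf"
id_no_fit <- make_visit_id("test1.gpx")             # "test1.gpx_NA_NA"

print(id_fits)
print(id_no_fit)
```

Indexing past the end of the split vector returns NA, and paste turns that into the text "NA", which is where the odd object names come from.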

Have you tried this on a fresh instance of R?
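If the error persists, one way to narrow it down is to check each path before reading it and to record failures per file instead of stopping at the first one. This is only a sketch: read_safely and its reader argument are hypothetical names, and with maptools loaded the reader would be something like function(p) readGPS(i = "gpx", f = p, type = "w").

```r
# Try every file, recording success or the error message for each one.
# read_safely and reader are illustrative names, not part of maptools.
read_safely <- function(files, reader) {
  results <- lapply(files, function(path) {
    # A missing file is often a working-directory problem, so report it.
    if (!file.exists(path)) {
      return(list(ok = FALSE, value = NULL,
                  message = paste("not found from", getwd())))
    }
    tryCatch(
      list(ok = TRUE, value = reader(path), message = ""),
      error = function(e) {
        list(ok = FALSE, value = NULL, message = conditionMessage(e))
      }
    )
  })
  names(results) <- files
  results
}
```

Looking at the message field of the failed entries shows whether the file was not found from the current working directory or whether the reader itself raised the error.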

FWIW, I would probably read all of these files into a single list object. I find dealing with a list object easier than having lots of different objects floating around. For example,

files <- dir(pattern = "\\.gpx")
#Replace all space characters with a "_". Replace with the character of your choice.
lapply(files, function(x) file.rename(from = x, to = gsub("\\s+", "_", x)))

#Reread in files with better names:
files <- dir(pattern = "\\.gpx")
out <- lapply(files, function(x) readGPS(i = "gpx", f = x, type = "w"))
names(out) <- files

and out is now a list of 2, where each element is a data.frame named after the file it was read from. Using something from the *apply family also has the benefit of leaving a clean workspace behind. The for loop leaves x, temp, and visit.id hanging around afterwards. You could wrap the loop in a function call, but I think just using lapply is more straightforward.
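A sketch of the same list-based approach with two small adjustments that are assumptions of this example, not part of the answer above: the pattern is anchored with $ so that only names ending in .gpx match, and the list is named without the file extension. read_gpx_dir and its reader argument are hypothetical names; the reader is passed in so the function can be tried without GPSBabel installed.

```r
# Read every .gpx file in a directory into one named list.
# read_gpx_dir and reader are illustrative names, not part of maptools.
read_gpx_dir <- function(dir = ".", reader) {
  # pattern is a regular expression, so the dot is escaped and the end anchored
  files <- list.files(dir, pattern = "\\.gpx$", full.names = TRUE)
  out <- lapply(files, reader)
  names(out) <- tools::file_path_sans_ext(basename(files))
  out
}

# With maptools and GPSBabel available, a call might look like this:
# out <- read_gpx_dir(".", function(p) readGPS(i = "gpx", f = p, type = "w"))
# out[["file1_w_12_f_ddf"]]   # look up one file's data.frame by name
```

Keeping everything in one list means later steps can also use lapply, for example lapply(out, nrow) to count the waypoints in each file.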
