R - Idiomatic way to deal with lists of data frames

Posted 2024-08-21 02:23:57

I have 30 runs of data, each stored in a separate CSV file named run<i>.csv for i = 0 to 29 (that is, run0.csv through run29.csv).

Let's say I want to collect them all into a list. The best way I know to do this is:

runs = list()
# paste0() joins without spaces; plain paste() would produce "run 0 .csv"
for (i in 1:30) { runs[[i]] = read.csv(paste0("run", i-1, ".csv")) }

Now let's further say that each of the data frames stored in the list has the same column layout, and that I'm interested in the column named "x" and the column named "y".

What is the easiest way to plot all 30 runs' worth of (x, y) pairs? Here's how I would currently do it (and I feel there must be a better way):

xList = list()
yList = list()
for (i in 1:30) { xList[[i]] = runs[[i]]$x; yList[[i]] = runs[[i]]$y; }
matplot(x=as.data.frame(xList), y=as.data.frame(yList))

This gets even more painful when I'm trying to do transformations to the data; I can't figure out how to apply a function to a specific column of each data frame stored in a list.

Any help would be greatly appreciated.

Comments (2)

痴梦一场 2024-08-28 02:23:57

You would probably be much better off creating one data frame with all the data. For example, add the run number when importing (runs[[i]] = data.frame(read.csv(paste0("run", i-1, ".csv")), Run=i)), and then do alldata <- do.call(rbind, runs).
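As a rough sketch of that idea in base R, assuming the runs are already in a list: the dummy data and the names `tagged` and `alldata` are illustrative only, and the conversion to a factor is my own assumption about what is wanted for plotting.

```r
# Dummy stand-ins for the imported runs (hypothetical data, not the real CSV files)
set.seed(1)
runs <- lapply(1:3, function(i) data.frame(x = 1:5, y = rnorm(5)))

# Tag each data frame with its run number, then stack them row-wise
tagged <- Map(function(df, i) cbind(df, Run = i), runs, seq_along(runs))
alldata <- do.call(rbind, tagged)

# Run is numeric at this point; converting it to a factor makes plotting
# packages treat it as a discrete grouping variable (an assumption about intent)
alldata$Run <- factor(alldata$Run)
str(alldata)
```

Keeping everything in one long data frame means a per-run transformation becomes an ordinary column operation, with Run available for grouping.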

Now you can use lattice or ggplot2 to make plots. For example, to get a scatterplot of all runs with a different colour for each run:

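library(ggplot2)
# same plot as the older qplot(x, y, colour=Run, data=alldata, geom="point"),
# which is deprecated in recent ggplot2 releases
ggplot(alldata, aes(x, y, colour = Run)) + geom_point()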
往日情怀 2024-08-28 02:23:57

It is probably best to use an l*ply function (from plyr) or lapply when dealing with lists like this.

The easiest way to do the import is probably like so:

library(plyr)
# the files are numbered 0 to 29, so build run0.csv ... run29.csv
runs <- llply(paste("run", 0:29, ".csv", sep=""), read.csv)
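If you would rather not depend on plyr, the same import can probably be done with base R's lapply. This is only a sketch: it first writes three small dummy files to a temporary directory so that it runs on its own, and `csv_dir` and the dummy contents are made up for illustration.

```r
# Write three small dummy CSV files to a temporary directory so the
# example is self-contained (the real files would already exist)
csv_dir <- tempfile("runs")
dir.create(csv_dir)
for (i in 0:2) {
  write.csv(data.frame(x = 1:5, y = (1:5) * (i + 1)),
            file.path(csv_dir, paste0("run", i, ".csv")),
            row.names = FALSE)
}

# Build the file names run0.csv, run1.csv, ... and read each one
# into its own list element
files <- file.path(csv_dir, paste0("run", 0:2, ".csv"))
runs <- lapply(files, read.csv)

length(runs)
names(runs[[1]])
```

For the real data you would drop the file-writing part and use 0:29 in place of 0:2.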

Here's one way to plot them:

# some dummy data
runs <- list(a=data.frame(x=1:5, y=rnorm(5)), b=data.frame(x=1:5, y=rnorm(5)))
# one panel per run, two columns; ceiling() handles an odd number of runs
par(mfrow=c(ceiling(length(runs)/2), 2))
l_ply(seq_along(runs), function(i) { plot(runs[[i]]$x, runs[[i]]$y) })

Of course, you can also output this to another device (e.g. pdf) and not use par().
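For the transformation part of the question, something along these lines might work. It is a minimal base R sketch: abs() is just an arbitrary stand-in for whatever transformation you need, and `runs_t`, `xs` and `ys` are names I made up.

```r
# Dummy data with the same layout as the runs in the question
set.seed(42)
runs <- list(a = data.frame(x = 1:5, y = rnorm(5)),
             b = data.frame(x = 1:5, y = rnorm(5)))

# Return a new list in which column y of every data frame is transformed
# (abs() is an arbitrary example transformation)
runs_t <- lapply(runs, function(df) {
  df$y <- abs(df$y)
  df
})

# Pull one column out of every data frame; sapply() returns a matrix
# here because every run has the same number of rows
xs <- sapply(runs_t, function(df) df$x)
ys <- sapply(runs_t, function(df) df$y)

# One series per run (one matrix column each), as in the question's matplot() call
matplot(xs, ys, type = "b", pch = 1, xlab = "x", ylab = "y")
```

Because lapply returns a new list, the original `runs` is left untouched, which makes it easy to compare before and after.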
