How to plot all the columns of a data frame in R

Posted 2024-10-15 17:22:34

The data frame has n columns, and I would like to get n plots, one for each column.

I'm new to R and not fluent in it, but I found two solutions.

The first one works, but it does not print the column names (and I need them!):

data <- read.csv("sample.csv", header = TRUE, sep = ",")
for (c in data) plot(c, type = "l")

The second one works better because it prints the column names:

data <- read.csv("sample.csv", header = TRUE, sep = ",")
for (i in seq(1, length(data), 1)) plot(data[, i], ylab = names(data[i]), type = "l")

Is there a better solution (from the R language point of view)?

原野 2024-10-22 17:22:34

The ggplot2 package takes a little learning, but the results look really nice: you get good-looking legends and many other useful features, all without having to write much code.

library(ggplot2)
library(reshape2)
df <- data.frame(time = 1:10,
                 a = cumsum(rnorm(10)),
                 b = cumsum(rnorm(10)),
                 c = cumsum(rnorm(10)))
df <- melt(df ,  id.vars = 'time', variable.name = 'series')

# plot on same grid, each series colored differently -- 
# good if the series have same scale
ggplot(df, aes(time,value)) + geom_line(aes(colour = series))

# or plot on different plots
ggplot(df, aes(time,value)) + geom_line() + facet_grid(series ~ .)

遥远的她 2024-10-22 17:22:34

There is a very simple way to plot all columns of a data frame, using either separate panels or the same panel:

plot.ts(data)

Which yields a plot with one panel per column (X1 to X4 are the column names in my example).

Have a look at ?plot.ts for all the options.
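
The same-panel variant is not spelled out above, so here is a minimal sketch of both variants. The data frame dat and its column names are invented for illustration, and it assumes all columns are numeric.

```r
# Hypothetical example data with four numeric columns (not from the question).
set.seed(42)
dat <- data.frame(X1 = cumsum(rnorm(50)),
                  X2 = cumsum(rnorm(50)),
                  X3 = cumsum(rnorm(50)),
                  X4 = cumsum(rnorm(50)))

# Default (plot.type = "multiple"): one panel per column.
plot.ts(dat)

# plot.type = "single": all columns in the same panel, one colour each.
plot.ts(dat, plot.type = "single", col = seq_along(dat), ylab = "value")
legend("topleft", legend = names(dat), col = seq_along(dat), lty = 1)
```

The single-panel form is mainly useful when the columns share a comparable scale; otherwise the separate panels are easier to read.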

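If you want more control over the plotting function and don't want to use a loop, you could also do something like:

# stack the plots vertically, one row per column
par(mfcol = c(ncol(data), 1))
# Map passes each column and its name to the plotting function
invisible(Map(function(x, y) plot(x, main = y), data, names(data)))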

千と千尋 2024-10-22 17:22:34

You can jump through hoops and convert your solution to an lapply, sapply or apply call. (I see @jonw shows one way to do this.) Other than that, what you already have is perfectly acceptable code.
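
For reference, such an apply-style rewrite might look roughly like the sketch below. The example data dat and the helper name plot_column are made up for illustration; the question reads its data from sample.csv instead.

```r
# Hypothetical example data (not from the question).
set.seed(1)
dat <- data.frame(X = cumsum(rnorm(100)),
                  Y = cumsum(rnorm(100)),
                  Z = cumsum(rnorm(100)))

# plot_column is an illustrative helper, not part of any package.
# It draws one column as a line plot and labels the y axis with its name.
plot_column <- function(name) {
  plot(dat[[name]], type = "l", ylab = name)
  invisible(name)
}

# Iterate over the column names rather than the columns themselves,
# so that the name is available for labelling.
plotted <- lapply(names(dat), plot_column)
```

Iterating over names(dat) is what makes the labels available; looping over the columns directly, as in the first attempt in the question, loses them.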

If these are all time series or similar, then the following might be a suitable alternative, which plots each series in its own panel on a single plotting region. We use the zoo package, as it handles ordered data like this very well indeed.

library(zoo)
set.seed(1)
## example data
dat <- data.frame(X = cumsum(rnorm(100)), Y = cumsum(rnorm(100)),
                  Z = cumsum(rnorm(100)))
## convert to multivariate zoo object
datz <- zoo(dat)
## plot it
plot(datz)

Which gives:
[Figure: example of zoo plotting capabilities]

我家小可爱 2024-10-22 17:22:34

I'm surprised that no one has mentioned matplot. It's pretty convenient if you don't need to plot each line on separate axes.
It takes just one command:

matplot(y = data, type = 'l', lty = 1)

Use ?matplot to see all the options.

To add a legend, you can set a color palette and then add it:

mypalette <- rainbow(ncol(data))
matplot(y = data, type = 'l', lty = 1, col = mypalette)
# a keyword position such as "topright" replaces the x/y coordinates
legend("topright", legend = colnames(data), lty = 1, lwd = 2, col = mypalette)

淡淡離愁欲言轉身 2024-10-22 17:22:34

Using some of the tips above (thanks especially to @daroczig for the names(df)[i] form), this function draws a histogram for numeric variables and a bar chart for factor variables. It is a good start for exploring a data frame:

par(mfrow = c(3, 3), mar = c(2, 1, 1, 1))  # my example has 9 columns

dfplot <- function(df) {
  for (i in seq_along(df)) {
    if (is.factor(df[, i])) {
      # plot() on a factor draws a bar chart of the level counts
      plot(df[, i], main = names(df)[i])
    } else {
      hist(df[, i], main = names(df)[i])
    }
  }
}

Best wishes, Mat.

享受孤独 2024-10-22 17:22:34

Unfortunately, ggplot2 does not offer an easy way to do this without transforming your data into long format. You can try to fight it, but it will be easier to just do the data transformation. All the methods, including melt from reshape2, gather from tidyr, and pivot_longer from tidyr, are covered in: Reshaping data.frame from wide to long format

Here's a simple example using pivot_longer:

> library(tidyr)  # provides pivot_longer() and re-exports %>%
> df <- data.frame(time = 1:5, a = 1:5, b = 3:7)
> df
  time a b
1    1 1 3
2    2 2 4
3    3 3 5
4    4 4 6
5    5 5 7

> df_long <- df %>% pivot_longer(c(a, b), names_to = "colname", values_to = "val")
> df_long
# A tibble: 10 x 3
    time colname   val
   <int> <chr>   <int>
 1     1 a           1
 2     1 b           3
 3     2 a           2
 4     2 b           4
 5     3 a           3
 6     3 b           5
 7     4 a           4
 8     4 b           6
 9     5 a           5
10     5 b           7

As you can see, pivot_longer puts the selected column names into the column specified by names_to (default "name"), and puts the values into the column specified by values_to (default "value"). If I'm OK with the default names, I can use df %>% pivot_longer(c("a", "b")).

Now you can plot as normal:

ggplot(df_long, aes(x = time, y = val, color = colname)) + geom_line()
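
Since the question asks for one plot per column, the same long-format data can also be split into panels instead of colours. This is a minimal sketch, assuming the ggplot2 and tidyr packages are installed; it reuses the small df from above and selects every column except time, so the column list does not have to be typed out.

```r
library(ggplot2)
library(tidyr)

# Same toy data as in the example above.
df <- data.frame(time = 1:5, a = 1:5, b = 3:7)

# Pivot every column except time, so that additional columns are picked up
# without listing them by name.
df_long <- pivot_longer(df, cols = -time, names_to = "colname", values_to = "val")

# One panel per original column; scales = "free_y" gives each panel
# its own y-axis range.
p <- ggplot(df_long, aes(x = time, y = val)) +
  geom_line() +
  facet_wrap(~ colname, ncol = 1, scales = "free_y")
print(p)
```

Free y scales are a judgment call: they make each series readable on its own, but hide differences in magnitude between columns.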

那支青花 2024-10-22 17:22:34

With lattice:

library(lattice)

df <- data.frame(time = 1:10,
                 a = cumsum(rnorm(10)),
                 b = cumsum(rnorm(10)),
                 c = cumsum(rnorm(10)))

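# build the formula a + b + c ~ time from the column names
form <- as.formula(paste(paste(names(df)[-1], collapse = ' + '),
                         'time', sep = ' ~ '))

# outer = TRUE draws each series in its own panel
xyplot(form, data = df, type = 'b', outer = TRUE)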

巷子口的你 2024-10-22 17:22:34

You can specify the plot title with the main option (and the axis titles via xlab and ylab). E.g.:

plot(data[,i], main=names(data)[i])

And if you want to plot (and save) each variable of a data frame, you should use png, pdf or whichever other graphics device you need, and then issue a dev.off() command afterwards. E.g.:

data <- read.csv("sample.csv", header = TRUE, sep = ",")
for (i in seq_along(data)) {
    # open one PDF file per column, named after the column
    pdf(paste0('fileprefix_', names(data)[i], '.pdf'))
    plot(data[, i], ylab = names(data)[i], type = "l")
    dev.off()
}

Or draw all plots to the same image with the mfrow parameter of par(). E.g.: use par(mfrow = c(2, 2)) to include the next 4 plots in the same "image".
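
When the number of columns is not known in advance, the grid size can be derived from it rather than hard-coded. A minimal sketch, assuming n2mfrow() from grDevices is an acceptable way to choose the layout; the example data frame stands in for the contents of sample.csv and its columns are invented.

```r
# Hypothetical example data; the original reads sample.csv instead.
set.seed(1)
data <- data.frame(a = cumsum(rnorm(50)),
                   b = cumsum(rnorm(50)),
                   c = cumsum(rnorm(50)),
                   d = cumsum(rnorm(50)),
                   e = cumsum(rnorm(50)))

# n2mfrow() suggests a rows x columns layout for a given number of plots.
layout_dims <- n2mfrow(ncol(data))

# par() returns the previous settings, so they can be restored afterwards.
old_par <- par(mfrow = layout_dims, mar = c(4, 4, 2, 1))
for (i in seq_along(data)) {
  plot(data[[i]], type = "l", main = names(data)[i], ylab = names(data)[i])
}
par(old_par)
```

Restoring the old par() settings at the end keeps later plots from inheriting the grid.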

听不够的曲调 2024-10-22 17:22:34

I don't have R on this computer, but here is a crack at it. You can use par to display multiple plots in one window, or, as below, to prompt before displaying the next page.

plotfun <- function(col)
  plot(data[, col], ylab = names(data)[col], type = "l")
par(ask = TRUE)  # ask for confirmation before each new page
invisible(sapply(seq_along(data), plotfun))

北风几吹夏 2024-10-22 17:22:34

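In case the column names in the .csv file are not valid R names:

# read the data; invalid column names get converted to valid R names
data <- read.csv("sample.csv", sep = ";", header = TRUE)
# read the header row on its own to keep the original column names
data2 <- read.csv("sample.csv", sep = ";", header = FALSE, nrows = 1)

for (i in seq_along(data)) plot(data[, i], ylab = as.character(data2[1, i]), type = "l")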
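
A possible alternative that avoids reading the file twice is the check.names argument of read.csv, which keeps the header names exactly as written. This is only a sketch: the inline CSV text stands in for sample.csv, and its column names are invented.

```r
# Inline stand-in for sample.csv, with column names that are not valid R names.
csv_text <- "my col;2nd-col\n1;10\n2;20\n3;15\n"

# check.names = FALSE stops read.csv from rewriting the header names.
data <- read.csv(text = csv_text, sep = ";", header = TRUE, check.names = FALSE)

# The original names are now available directly for labelling.
for (i in seq_along(data)) {
  plot(data[[i]], ylab = names(data)[i], type = "l")
}
```

The trade-off is that such columns then have to be accessed with data[["my col"]] or backticks rather than plain $ syntax.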

过度放纵 2024-10-22 17:22:34

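This link helped me a lot with the same problem:

library(ggplot2)
# df_plot is a data frame with an index column idx and value columns col1 and col2
p <- ggplot() +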
  geom_line(data = df_plot, aes(x = idx, y = col1), color = "blue") +
  geom_line(data = df_plot, aes(x = idx, y = col2), color = "red") 

print(p)

https://rpubs.com/euclid/343644
