Problems with ggplot and pgfSweave
I started using Sweave some time ago. However, like most people, I soon ran into a major problem: speed. Sweaving a large document takes ages, which makes it hard to work efficiently. Data processing can be sped up a lot with cacheSweave, but plots, especially ggplot ones ;), still take too long to render.
That's why I want to use pgfSweave.
After many, many hours, I finally succeeded in setting up a working system with Eclipse/StatET/Texlipse. I then wanted to convert an existing report for use with pgfSweave and had a bad surprise: most of my ggplots don't seem to work anymore. The following plot, for example, works perfectly in the console and with Sweave:
pl <- ggplot(plot_info,aes(elevation,area))
pl <- pl + geom_point(aes(colour=que_id))
print(pl)
Running it with pgfSweave, however, I get this error:
Error in if (width > 0) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In if (width > 0) { :
the condition has length > 1 and only the first element will be used
Error in driver$runcode(drobj, chunk, chunkopts) :
Error in if (width > 0) { : missing value where TRUE/FALSE needed
When I remove aes(...) from geom_point, the plot works perfectly with pgfSweave.
pl <- ggplot(plot_info,aes(elevation,area))
pl <- pl + geom_point()
print(pl)
Edit:
I investigated further and narrowed the problem down to the tikz device.
This works just fine:
quartz()
pl <- ggplot(plot_info,aes(elevation,area))
pl <- pl + geom_point(aes(colour=que_id))
print(pl)
This gives the above error:
tikz( 'myPlot.tex',standAlone = T )
pl <- ggplot(plot_info,aes(elevation,area))
pl <- pl + geom_point(aes(colour=que_id))
print(pl)
dev.off()
This works just fine as well:
tikz( 'myPlot.tex',standAlone = T )
pl <- ggplot(plot_info,aes(elevation,area))
pl <- pl + geom_point()
print(pl)
dev.off()
I could reproduce this with 5 different ggplots. As long as colour (or size, alpha, ...) is not used in the mapping, the plot works with tikz.
Q1: Does anybody have an explanation for this behavior?
Additionally, caching of non-plot code chunks doesn't work very well. The following code chunk takes no time at all with Sweave. With pgfSweave, it takes about 10 seconds.
<<plot.opts,echo=FALSE,results=hide,cache=TRUE>>=
#colour and plot options are globally set
pal1 <- brewer.pal(8,"Set1")
pal_seq <- brewer.pal(8,"YlOrRd")
pal_seq <- c("steelblue1","tomato2")
opt1 <- opts(panel.grid.major = theme_line(colour = "white"),panel.grid.minor = theme_line(colour = "white"))
sca_fill_cont_opt <- scale_fill_continuous(low="steelblue1", high="tomato2")
ory <- geom_hline(yintercept=0,alpha=0.4,linetype=2)
orx <- geom_vline(xintercept=0,alpha=0.4,linetype=2)
ts1 <- 2.3
ts2 <- 2.5
ts3 <- 2.8
ps1 <- 6
offset_x <- function(x,y) 0.15*x/pmax(abs(x),abs(y))
offset_y <- function(x,y) 0.05*y/pmax(abs(x),abs(y))
plot_size <- 50*50
@
This seems pretty strange as well, since the chunk only sets a few variables for later use.
Q2: Does anybody have an explanation for that?
Q3: More generally, is anybody at all using pgfSweave successfully? By "successfully" I mean that everything that works in Sweave also works in pgfSweave, with the additional benefit of nice fonts and improved speed. ;)
Thanks very much for your responses!
These are the three reasons why tikzDevice gives an error when trying to construct your plot:
1. When you add an aesthetic mapping that creates a legend, such as aes(colour=que_id), ggplot2 will use the variable name as the title of the legend (in this case, que_id).
2. The tikzDevice passes all strings, such as legend titles, to LaTeX for typesetting.
3. In LaTeX the underscore character, _, is used to denote a subscript. If an underscore is used outside of math mode, it causes an error.
When the tikzDevice tries to calculate the height and width of the legend title, "que_id", it passes the string to LaTeX for typesetting and expects LaTeX to return the width and height of the string. LaTeX raises an error because the string contains an unescaped underscore outside of math mode. The tikzDevice receives a NULL for the string width instead of a number, which causes an if (width > 0) check to fail.
Ways to avoid the problem
1. Specify a legend title to use by adding a color scale.
2. Use the string sanitization feature introduced in tikzDevice 0.5.0 (but broken until 0.5.2). Currently, string sanitization only escapes the following characters by default: %, $, {, }, and ^. However, you can specify additional substitution pairs via the tikzSanitizeCharacters and tikzReplacementCharacters options:
options(tikzSanitizeCharacters = c('%','$','}','{','^','_'))
options(tikzReplacementCharacters = c('\\%','\\$','\\}','\\{','\\^{}','\\textunderscore'))
# Turn on string sanitization when starting the plotting device
tikz('myPlot.tex', standAlone = TRUE, sanitize = TRUE)
print(p1)
dev.off()
We will be publishing version 0.5.3 of the tikzDevice in the next couple of weeks in order to address some annoying warning messages that now show up due to changes in the way R handles system(). I will add the following changes to this next version:
1. A better warning message when width is NULL, indicating that there is probably something wrong with plot text.
2. Underscores and a few other characters added to the default set of characters that the string sanitizer looks for.
Hope this helps!
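To make the escaping idea in the answer above concrete, here is a minimal base-R sketch of replacing LaTeX-special characters in a label before it reaches the device. The helper name escape_latex is hypothetical and is not part of tikzDevice or ggplot2; it only illustrates the kind of substitution that string sanitization performs, not how tikzDevice implements it. The optional ggplot2 part uses made-up data in place of plot_info and sets the legend title with labs(), which is a stand-in for the colour-scale approach the answer mentions.

```r
# Hypothetical helper for illustration; not part of tikzDevice or ggplot2.
# Replaces characters that are special in LaTeX text mode with escaped forms.
escape_latex <- function(x) {
  # Lookup table: special character -> LaTeX escape
  # (in R source, "\\" is a single backslash)
  map <- c("%" = "\\%", "$" = "\\$", "{" = "\\{", "}" = "\\}",
           "^" = "\\^{}", "_" = "\\_")
  vapply(x, function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]  # split into single characters
    hit <- chars %in% names(map)                 # which characters need escaping
    chars[hit] <- map[chars[hit]]                # swap in the escapes
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

cat(escape_latex("que_id"), "\n")  # prints: que\_id

# Optional: use the escaped name as the legend title.
# This part runs only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Made-up data standing in for the asker's plot_info
  plot_info <- data.frame(elevation = c(100, 200, 300),
                          area = c(1.5, 2.0, 2.5),
                          que_id = factor(c("a", "b", "c")))
  p1 <- ggplot(plot_info, aes(elevation, area)) +
    geom_point(aes(colour = que_id)) +
    labs(colour = escape_latex("que_id"))  # legend title without a bare underscore
}
```

The escaped title is only meant for devices that hand text to LaTeX; on a screen device such as quartz() the backslash would show up literally in the legend.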
Q2: I am the maintainer of pgfSweave.
Here are the results of a test I ran:
I believe there are two reasons for the time difference, but it would take more work to verify them exactly:
As an example of the caching consider the following test file to see the real benefits of caching:
And the results:
Q3: I use pgfSweave all the time for my own work. Some changes to Sweave in R 2.12 have been causing minor problems with pgfSweave, but a new version that fixes them is forthcoming. The development version on GitHub (https://github.com/cameronbracken/pgfSweave) already has the changes. If you are having additional problems, I would be happy to help.
Q2: Do you use \pgfrealjobname{<DOCUMENTNAME>} in the header and the option external=TRUE for the graphics chunks? I've found that this increases the speed a lot (not for the first compilation, but for subsequent ones if the graphics are unchanged). You'll find more background in the pgfSweave vignette.
Q3: Everything works fine for me; I use Windows + Eclipse/StatET/Texlipse like you.
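As a rough illustration of the two settings mentioned for Q2, a skeleton .Rnw file might look like the sketch below. The file name report.Rnw, the chunk label, and the assumption that the argument of \pgfrealjobname matches the file name without its extension are mine, not taken from the thread; the pgfSweave vignette remains the place to check the exact requirements.

```latex
% report.Rnw (hypothetical file name)
\documentclass{article}
\usepackage{tikz}
% Assumed to match the file name without the .Rnw extension
\pgfrealjobname{report}

\begin{document}

% ggplot2 and plot_info are assumed to be set up in an earlier chunk
<<elevation-plot, fig=TRUE, echo=FALSE, external=TRUE>>=
pl <- ggplot(plot_info, aes(elevation, area)) + geom_point()
print(pl)
@

\end{document}
```

With a setup along these lines, the speed-up described above would be expected on the second and later compilations, when the figure chunk has not changed.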