Sweave/R - Automatically generating an appendix that contains all the model summaries/plots/data profiles from an analysis
I like the idea of making research available at multiple levels of detail, i.e., an abstract for the casually curious, the full text for the more interested, and finally the data and code for those working in the same area or trying to reproduce your results. Between the actual text and the data/code level, I'd like to insert another layer. Namely, I'd like to create a kind of automatically generated appendix that contains the full regression output, diagnostic plots, exploratory graphs, data profiles, etc. from the analysis, regardless of whether those plots, regressions, etc. made it into the final paper.
One idea I had was to write a script that would examine the .Rnw file and automatically:
- Profile all data sets that are loaded (sort of like the Hmisc(?) package)
- Summarize all regressions - i.e., run summary(model) for all models
- Present all plots (regardless of whether they made it in the final version)
The idea is to make this kind of a low-effort, push-button sort of thing as opposed to a formal appendix written like the rest of a paper. What I'm looking for is some ideas on how to do this in R in a relatively simple way. My hunch is that there is some way of going through the namespace, figuring out what something is and then dumping into a PDF.
Thoughts? Does something like this already exist?
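The "walk through the workspace, figure out what each thing is, dump it" idea can be sketched in base R roughly as follows. This is only a minimal illustration, not an existing tool: the function name make_appendix, its arguments, and the output file names are all hypothetical. It assumes models are lm or glm fits and that data sets are data frames. Base-graphics plots drawn earlier are not stored as workspace objects, so the sketch only regenerates diagnostic plots for the models it finds.

```r
# Hypothetical helper (not from any package): scan an environment,
# summarize every data frame and lm/glm fit, and save diagnostic plots.
make_appendix <- function(env = globalenv(),
                          text_file = "appendix.txt",
                          plot_file = "appendix-diagnostics.pdf") {
  out <- character(0)
  models <- list()

  for (nm in ls(envir = env)) {
    obj <- get(nm, envir = env)
    if (is.data.frame(obj)) {
      # Profile the data set with a plain summary()
      out <- c(out, paste("== Data set:", nm, "=="),
               capture.output(print(summary(obj))), "")
    } else if (inherits(obj, "lm")) {
      # glm objects also inherit from "lm", so they are caught here too
      out <- c(out, paste("== Model:", nm, "=="),
               capture.output(print(summary(obj))), "")
      models[[nm]] <- obj
    }
  }
  writeLines(out, text_file)

  if (length(models) > 0) {
    pdf(plot_file)
    on.exit(dev.off(), add = TRUE)
    for (nm in names(models)) {
      # One page per model, four standard diagnostic plots per page
      par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
      plot(models[[nm]])
      mtext(paste("Diagnostics:", nm), outer = TRUE)
    }
  }
  invisible(list(text = out, models = names(models)))
}
```

Called at the end of an analysis script, for example make_appendix(globalenv()), it would write one text file with the summaries and one PDF with a page of diagnostics per model. The text file could then be pulled into the .Rnw appendix verbatim; Hmisc::describe() could replace summary() for richer data profiles if that package is installed.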
Comments (2)
We made an attempt at this with our recent JASA article: http://hdl.handle.net/1902.1/12174. You should be able to "make" the whole paper. One thing to notice about our reproduction archive: we packaged the versions of the R packages that we used. It turned out that as people improve their packages, they sometimes change defaults, which would break our build. Perhaps in the future one might distribute an entire virtual machine, including the R binary that would be called [recall how round(x, digits=) lost its argument names and became positional from one version of R to the next, making round(digits=, x) give nonsense results without warning?].
Anyway, this is our first attempt at such a complex document. I have a smaller version, which does not use make, here: http://hdl.handle.net/1902.1/13376
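A lighter-weight complement to bundling the packages themselves (this is not what the archive above does, just a sketch) is to record the R version and package versions alongside the build, so that a broken build can at least be traced to a version change. The helper name record_session and the file name are hypothetical; sessionInfo() is base R.

```r
# Hypothetical helper: save the R version and the versions of the
# loaded packages to a text file that ships with the paper.
record_session <- function(file = "session-info.txt") {
  info <- capture.output(print(sessionInfo()))
  writeLines(info, file)
  invisible(info)
}
```

Running record_session() as the last step of the build leaves a plain-text record next to the generated PDF.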
John, this sounds interesting, but if you provide the data and the article is written in Sweave, wouldn't this long log file be redundant?
Back to your question: one package you might want to look into is Zelig, since it "automates the creation of replication data files so that you (or, if you wish, anyone else) can replicate the results of your analyses (hence satisfying the replication standard)". It is not exactly what you are looking for, but the concept of replication data files might give you some other ideas. Note that multiple journals are now using replication data files.