Replicating "Custom Tables" comparisons in R
I use SPSS every day but have really been trying to learn R. The major thing holding me back is my need to easily generate tables, banners, and cross-tabs for the market research that I do. I love the Custom Tables option in SPSS and am looking for advice on how to replicate it with R.
I believe R has a ton of advantages over SPSS, one of which is the ability to integrate with LaTeX for reproducible reports. SPSS is great for quick exploration (point and click), but leaves a lot to be desired when taking the results and packaging them into an acceptable deliverable for clients, etc. That said, R is so powerful that I feel I could do everything I need in it if I could only produce my banners/crosstabs the way I need them.
In short, what are my options to generate report-worthy tables similar to what I have below? I am copying the SPSS syntax command and the output for reference.
CTABLES
/VLABELS VARIABLES=age educ paeduc maeduc speduc prestg80 happy
DISPLAY=DEFAULT
/TABLE age [MEAN F40.3, VALIDN COMMA40.0] + educ [MEAN F40.3, VALIDN COMMA40.0] + paeduc [MEAN F40.3, VALIDN COMMA40.0] + maeduc [MEAN F40.3, VALIDN COMMA40.0] + speduc [MEAN F40.3, VALIDN COMMA40.0] + prestg80 [MEAN F40.3, VALIDN COMMA40.0] BY happy
/SLABELS POSITION=ROW
/CATEGORIES VARIABLES=happy ORDER=A KEY=VALUE EMPTY=INCLUDE TOTAL=YES POSITION=AFTER MISSING=EXCLUDE
/SIGTEST TYPE=CHISQUARE ALPHA=0.05 INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE
/COMPARETEST TYPE=MEAN ALPHA=0.05 ADJUST=BONFERRONI ORIGIN=COLUMN INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE MEANSVARIANCE=ALLCATS MERGE=NO
/COMPARETEST TYPE=PROP ALPHA=0.05 ADJUST=BONFERRONI ORIGIN=COLUMN INCLUDEMRSETS=YES CATEGORIES=ALLVISIBLE MERGE=NO.
I attached a picture of what the output looks like. I am particularly interested in the ability to have multiple variables in the rows/columns, and I like the flexibility to nest them if I need to. In the image, I have a few continuous variables cut by a categorical variable in the columns, with the summary statistics placed in the rows. As an aside, I also really like the quick column mean comparisons feature, but I figure I can quickly access them in R for conditional crosstab generation.
Comments (6)
See the xtable package for exporting tables to LaTeX and HTML. There may be other packages, though. This looks promising as well. Have you heard of Sweave?
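As a rough illustration of the xtable route, here is a minimal sketch. It assumes the xtable package is installed, and it uses the built-in iris data as a stand-in because the survey variables from the question are not available; the object names (means, tab, latex_lines, html_lines) are made up for the example.

```r
# Assumes the xtable package is installed: install.packages("xtable")
library(xtable)

# Stand-in data: mean of each numeric iris column by Species
# (iris replaces the survey variables from the question).
means <- aggregate(. ~ Species, data = iris, FUN = mean)

# Build the table object; digits applies to the numeric columns.
tab <- xtable(means, digits = 3, caption = "Means by group")

# LaTeX is the default output type of print() for xtable objects.
latex_lines <- capture.output(print(tab, include.rownames = FALSE))

# The same table rendered as HTML.
html_lines <- capture.output(print(tab, type = "html", include.rownames = FALSE))

cat(latex_lines, sep = "\n")
```

Inside an Sweave document, the print() call would normally sit in a chunk whose results are passed through as LaTeX, so the table lands directly in the report.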
I have also struggled many times with R's user-unfriendly output. The only solution I found was writing my own function, and I'm happy to share it with you here:
For all factor variables in a data.frame, the following function returns the frequency or the percentage (calc="perc") for each level of the factor variable "variable".
The most important thing may be that the output is a simple, user-friendly data.frame, so it is no problem to export the results and work with them in any way you want.
I realize that there is much potential for further improvement, e.g. adding the option to select row vs. column percentages. It's a work in progress, but it gets the job done.
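The function itself did not survive in this copy of the thread. What follows is not the commenter's code but a hypothetical base-R sketch of a function matching the description; the name freq_table, its arguments, and the choice of column percentages are all assumptions.

```r
# Hypothetical reconstruction: freq_table and its arguments are illustrative
# names, not the original commenter's function.
freq_table <- function(data, variable, calc = c("freq", "perc")) {
  calc <- match.arg(calc)
  # Every factor column except the banner variable becomes a block of rows.
  is_fac <- vapply(data, is.factor, logical(1))
  others <- setdiff(names(data)[is_fac], variable)
  pieces <- lapply(others, function(v) {
    # Rows: levels of v; columns: levels of `variable`.
    counts <- table(data[[v]], data[[variable]])
    if (calc == "perc") {
      counts <- 100 * prop.table(counts, margin = 2)  # column percentages
    }
    out <- as.data.frame.matrix(counts)
    data.frame(variable = v, level = rownames(out), out,
               row.names = NULL, check.names = FALSE,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Usage with stand-in data: mtcars with a few columns converted to factors.
cars <- transform(mtcars, cyl = factor(cyl), gear = factor(gear), am = factor(am))
res <- freq_table(cars, variable = "am", calc = "perc")
print(res)
```

Because the result is a plain data.frame, it can be passed on to write.csv() or any table-formatting package.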
Try exploring the "tabular" function from the "tables" package. I think it might be helpful.
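A minimal sketch of what a tabular() call might look like for a layout like the one in the question, with variables and their statistics in the rows and a categorical banner in the columns. It assumes the tables package is installed and again uses iris as stand-in data, with Species playing the role of the banner variable.

```r
# Assumes the tables package is installed: install.packages("tables")
library(tables)

# Stand-in data: iris replaces the survey variables from the question.
# Left of ~ : rows, each variable crossed with its summary statistics.
# Right of ~: columns, one per Species level plus an "All" total (the + 1).
tab <- tabular(
  (Sepal.Length + Sepal.Width) * (mean + sd) ~ (Species + 1),
  data = iris
)
tab
```

The formula interface allows stacking terms with + and nesting them with *, which is close to how the CTABLES /TABLE expression is built. The package documentation also covers LaTeX output.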
This is something that is not currently easy in R. You are likely to have to string together multiple functions from multiple packages to get output like this.
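To give a sense of what "stringing functions together" can mean, here is a base-R sketch that builds a mean / valid N banner with a total column and then runs Bonferroni-adjusted pairwise comparisons of the group means. The data (iris) and the helper name summarise_var are assumptions for illustration, and the tests are only in the spirit of the SPSS column comparisons, not a verified equivalent.

```r
# Stand-in data: iris replaces the survey variables; Species plays the banner role.
vars  <- c("Sepal.Length", "Sepal.Width")
group <- iris$Species

# Mean and valid N (count of non-missing values) per group, plus a Total column.
# summarise_var is a hypothetical helper name.
summarise_var <- function(x, g) {
  stats <- function(v) c(Mean = mean(v, na.rm = TRUE), `Valid N` = sum(!is.na(v)))
  cbind(sapply(split(x, g), stats), Total = stats(x))
}

# Stack one block of rows per variable, labelling rows "variable: statistic".
banner <- do.call(rbind, lapply(vars, function(v) {
  m <- summarise_var(iris[[v]], group)
  rownames(m) <- paste(v, rownames(m), sep = ": ")
  m
}))
print(round(banner, 3))

# Pairwise t tests between columns with Bonferroni adjustment, one per variable.
tests <- lapply(iris[vars], function(x) {
  pairwise.t.test(x, group, p.adjust.method = "bonferroni")
})
print(tests$Sepal.Length$p.value)
```

The banner is a plain numeric matrix, so it can be handed to a formatting package for the final report.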
I just downloaded the psych package, and it is pretty good at producing tables of summary statistics broken down by a grouping variable. It doesn't format them as nicely as, say, Stata does. I think you can output the result to a text file and then format it the way you want.
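A short sketch of that workflow, assuming the psych package is installed and using iris as stand-in data; the output file is a temporary file chosen only for the example.

```r
# Assumes the psych package is installed: install.packages("psych")
library(psych)

# Summary statistics for each numeric column, broken down by Species.
# mat = TRUE returns one flat data.frame instead of a list of tables.
desc <- describeBy(iris[, 1:4], group = iris$Species, mat = TRUE, digits = 3)
print(desc[, c("group1", "n", "mean", "sd")])

# Write the result to a tab-separated text file for later formatting.
out_file <- tempfile(fileext = ".txt")
write.table(desc, file = out_file, sep = "\t", quote = FALSE)
```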
There were several presentations on this topic at useR 2010, so you may see more packages out soon that attempt to address this.