I am getting mixed results from googling. I need to parse an SPSS .sav file to discover the data layout and extract the survey results. Step one is to read the "schema" of the data: for example, I need to know each question and the responses it allows. I plan to model this data in my own SQL tables so I can slice and dice it per my app's requirements. Step two is to populate my data model with the respondents' answers. From looking at the SPSS .sav file, I believe it contains both kinds of data I am looking for.
I don't need or want the expensive SPSS software if I don't strictly require it. We will not be doing statistics on this data, just selecting subsets of respondents based on answer filters. The SPSS file will be provided by a partner company that licenses SPSS. I do not need to export any data back into SPSS; my use case is read-only.
I can use Python, Java (with or without Groovy), or C/C++ for my parser program. The program will be run once at the end of data collection, so performance is not particularly important. Ideally I'd like my code to be cross-platform so I can develop on my Mac and deploy to Linux, but I can use Windows if I must.
A lot of what I am finding is either Java classes from 2004 or modern Python code that requires a DLL from IBM and is Windows-specific. Based on this quick explanation of my requirements, I would appreciate recommendations from the SO community. I think my needs are simple, but I haven't found exactly what I had hoped for. An open source library would be ideal, but I'd even pay for a simple commercial solution at a reasonable price.
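For illustration only, the two-step plan above (store the "schema", then store the answers and filter respondents) could be sketched with Python's built-in `sqlite3` module. Every table name, column name, and sample row below is hypothetical and does not come from any real .sav file; how the values get out of the file is a separate problem that this sketch does not address.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE question (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE,   -- variable name in the data file, e.g. 'Q1'
    label TEXT                    -- question text shown to respondents
);
CREATE TABLE allowed_response (
    question_id INTEGER NOT NULL REFERENCES question(id),
    code        TEXT NOT NULL,    -- coded value stored in the data
    label       TEXT NOT NULL,    -- human-readable meaning of the code
    PRIMARY KEY (question_id, code)
);
CREATE TABLE answer (
    respondent_id INTEGER NOT NULL,
    question_id   INTEGER NOT NULL REFERENCES question(id),
    code          TEXT,           -- NULL when the respondent skipped the question
    PRIMARY KEY (respondent_id, question_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Step one: the "schema" (one question and its allowed responses).
    conn.execute(
        "INSERT INTO question (id, name, label) VALUES (1, 'Q1', 'Do you own a car?')"
    )
    conn.executemany(
        "INSERT INTO allowed_response (question_id, code, label) VALUES (1, ?, ?)",
        [("1", "Yes"), ("2", "No")],
    )
    # Step two: the respondents' answers.
    conn.executemany(
        "INSERT INTO answer (respondent_id, question_id, code) VALUES (?, 1, ?)",
        [(101, "1"), (102, "2"), (103, "1")],
    )
    conn.commit()
    return conn


def respondents_with_answer(conn, question_name, code):
    """Return the ids of respondents who gave `code` for the named question."""
    rows = conn.execute(
        "SELECT a.respondent_id FROM answer a "
        "JOIN question q ON q.id = a.question_id "
        "WHERE q.name = ? AND a.code = ? "
        "ORDER BY a.respondent_id",
        (question_name, code),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(respondents_with_answer(demo, "Q1", "1"))
```

Storing one row per (respondent, question) pair keeps the answer filters as plain joins, at the cost of a taller table than one column per question.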
Comments (3)
You can get the SPSS I/O modules, with detailed documentation, for free in order to build your own app to read (or write) .sav files. The modules are available for all platforms supported by SPSS Statistics.
Go to the SPSS Community site at http://www.ibm.com/developerworks/spssdevcentral and follow the links for SPSS Downloads. You have to register, but that is free.
The SAV file is a binary format with a number of complex structures, so it is better to use the I/O modules. And if new features are added to the SAV format, which has often happened, the I/O modules are updated at the same time, so your code won't go out of date.
HTH,
Jon Peck
GNU PSPP can apparently read SPSS data files. I also found a link to a description of the format in the PSPP source, although it comes with a warning "don't try to read/write this format directly."
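To give a feel for why that warning exists, here is a minimal sketch that reads only the fixed-size file header at the start of a .sav file using Python's `struct` module. The field layout (176 bytes, beginning with the magic string `$FL2` or `$FL3`) is written from my reading of the PSPP format description and should be treated as an assumption to check against that document. The function name `parse_sav_header` and the dictionary keys are made up for this example. The variable records, value labels, and (possibly compressed) case data that follow the header are much more involved and are not handled here.

```python
import struct

HEADER_SIZE = 176
# Assumed layout: magic(4) product(60) five int32 fields, one float64 bias,
# creation date(9) creation time(8) file label(64) padding(3).
_FIELDS = "4s60s5id9s8s64s3x"


def _text(raw):
    """Decode a fixed-width header field and drop trailing padding."""
    return raw.decode("ascii", "replace").rstrip(" \x00")


def parse_sav_header(data):
    """Parse the first 176 bytes of a .sav file into a dict (sketch only)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least %d bytes" % HEADER_SIZE)
    if data[:4] not in (b"$FL2", b"$FL3"):
        raise ValueError("missing $FL2/$FL3 magic; not a .sav file?")
    # The layout code should read as 2 or 3; try both byte orders to find out
    # which one the writing machine used.
    for order in ("<", ">"):
        fields = struct.unpack(order + _FIELDS, data[:HEADER_SIZE])
        if fields[2] in (2, 3):
            break
    else:
        raise ValueError("unrecognised layout code")
    (magic, product, layout, case_size, compression,
     weight_index, ncases, bias, cdate, ctime, label) = fields
    return {
        "magic": _text(magic),
        "product": _text(product),
        "byte_order": "little" if order == "<" else "big",
        "layout_code": layout,
        "nominal_case_size": case_size,
        "compression": compression,   # 0 is assumed to mean uncompressed
        "weight_index": weight_index, # 0 is assumed to mean no weighting
        "ncases": ncases,             # may be -1 when the writer did not know
        "bias": bias,
        "creation_date": _text(cdate),
        "creation_time": _text(ctime),
        "file_label": _text(label),
    }
```

A typical call would be `parse_sav_header(open(path, "rb").read(176))`, where `path` points at the partner's file. For anything beyond a quick sanity check, a maintained reader such as PSPP itself is the safer route.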
There is a Java library here:
http://sourceforge.net/projects/spss-writer/