Reading an SPSS file into R
I am trying to learn R and want to bring in an SPSS file, which I can open in SPSS.
I have tried using read.spss from foreign and spss.get from Hmisc. Both error messages are the same.
Here is my code:
## install.packages("Hmisc")
library(foreign)
## change the working directory
getwd()
setwd('C:/Documents and Settings/BTIBERT/Desktop/')
## load in the file
## ?read.spss
asq <- read.spss('ASQ2010.sav', to.data.frame=T)
And the resulting error:
Error in read.spss("ASQ2010.sav", to.data.frame = T) :
  error reading system-file header
In addition: Warning message:
In read.spss("ASQ2010.sav", to.data.frame = T) :
  ASQ2010.sav: position 0: character `\000' (
Also, I tried saving the SPSS file as an SPSS 7 .sav file (I was previously using SPSS 18). That gives:
Warning messages:
1: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 14 encountered in system file
2: In read.spss("ASQ2010_test.sav", to.data.frame = T) :
  ASQ2010_test.sav: Unrecognized record type 7, subtype 18 encountered in system file
I had a similar issue and solved it following a hint in the read.spss help. Using the memisc package instead, you can import a portable SPSS file with spss.portable.file, and similarly a .sav file with spss.system.file, although in the .sav case I seem to miss some string values, while the portable import works seamlessly. The help page for spss.portable.file claims:
The importer mechanism is more flexible and extensible than read.spss and read.dta of package "foreign", as most of the parsing of the file headers is done in R. They are also adapted to load efficiently large data sets. Most importantly, importer objects support the labels, missing.values, and descriptions, provided by this package.
read.spss seems to be a little outdated, so I used a package called memisc. To get this to work, do this:
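A minimal sketch of the memisc route mentioned in the answers above, assuming the memisc package is installed. The helper name `read_spss_memisc` and the file names in the usage note are hypothetical; `spss.portable.file`, `spss.system.file` and `as.data.set` are memisc functions.

```r
# Sketch only: assumes the memisc package is installed.
# read_spss_memisc is a hypothetical helper, not part of memisc.
read_spss_memisc <- function(path) {
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  if (!requireNamespace("memisc", quietly = TRUE)) {
    stop("package 'memisc' is required: install.packages('memisc')")
  }
  # Pick the importer from the file extension: .por is the portable
  # format, anything else is treated as a system (.sav) file.
  importer <- if (grepl("\\.por$", path, ignore.case = TRUE)) {
    memisc::spss.portable.file(path)
  } else {
    memisc::spss.system.file(path)
  }
  # Load the data behind the importer into a memisc data.set, which keeps
  # labels and missing-value definitions.
  memisc::as.data.set(importer)
}
```

Usage would be along the lines of `ds <- read_spss_memisc("ASQ2010.sav")`. With memisc attached via `library(memisc)`, `as.data.frame(ds)` should then give a plain data frame.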
You may also try this:
and if you want to read all files from one folder:
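For the folder case, one possible sketch uses `list.files` together with `foreign::read.spss`. This is an illustration under assumptions, not necessarily what the answer used: the helper name `read_all_sav` is hypothetical, and any other reader could be swapped in for `read.spss`.

```r
# Hypothetical helper: read every .sav file in `dir` into a named list of
# data frames. Files that fail to parse are reported and stored as NULL.
read_all_sav <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.sav$", full.names = TRUE,
                      ignore.case = TRUE)
  out <- lapply(files, function(f) {
    tryCatch(
      foreign::read.spss(f, to.data.frame = TRUE),
      error = function(e) {
        message("could not read ", basename(f), ": ", conditionMessage(e))
        NULL
      }
    )
  })
  # Name each element after its file so results can be looked up by name.
  names(out) <- basename(files)
  out
}
```

Calling `all_data <- read_all_sav("path/to/folder")` and then `all_data[["ASQ2010.sav"]]` would pick out one file, where both the folder and the file name are placeholders.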
I know this post is old, but I also had problems loading a Qualtrics SPSS file into R. R's read.spss code came from PSPP a long time ago, and hasn't been updated in a while. (And Hmisc's code uses read.spss(), too, so no luck there.)
The good news is that PSPP 0.6.1 should read the files fine, as long as you specify a "String Width" of "Short - 255 (SPSS 12.0 and earlier)" on the "Download Data" page in Qualtrics. Read it into PSPP, save a new copy, and you should be in business. Awkward, but free.
You can read an SPSS file from R using the above solutions or the one you are currently using. Just make sure the command is given a file that it can read properly. I had the same error, and the problem was that SPSS could not access that file. You should make sure the file path is correct, the file is accessible, and it is in the correct format.
As far as the warning message is concerned, it does not affect the data. Record type 7 is used to store features of newer SPSS software in a way that lets older SPSS software still read the data, and it does not affect the data itself. I have used this numerous times and no data was lost.
You can also read about this at http://r.789695.n4.nabble.com/read-spss-warning-message-Unrecognized-record-type-7-subtype-18-encountered-in-system-file-td3000775.html#a3007945
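The path, access and format checks suggested above can be sketched in base R. The helper name `check_sav` is hypothetical, and it rests on one assumption: SPSS system files begin with the ASCII tag `$FL2` (compressed .zsav files with `$FL3`). A warning about character `\000' at position 0 may point to a file whose first bytes are not what the reader expects.

```r
# Sketch: inspect a file before handing it to read.spss().
# Assumption: SPSS system (.sav) files start with the ASCII tag "$FL2"
# (compressed .zsav files use "$FL3").
check_sav <- function(path) {
  if (!file.exists(path)) {
    return("file not found")
  }
  if (file.access(path, mode = 4) != 0) {
    return("file exists but is not readable")
  }
  # Read the first four bytes; drop NUL bytes so rawToChar() cannot fail.
  magic <- readBin(path, what = "raw", n = 4)
  tag <- rawToChar(magic[magic != as.raw(0)])
  if (tag %in% c("$FL2", "$FL3")) {
    "looks like an SPSS system file"
  } else {
    "readable, but does not start with an SPSS system-file tag"
  }
}
```

Running `check_sav("ASQ2010.sav")` after `setwd()` would show whether the problem is the path, the permissions, or the file contents.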
It looks like the R read.spss implementation is incomplete or broken. R 2.10.1 does better than R 2.8.1, however. It appears that R gets upset about custom attributes in a sav file even with 2.10.1 (the latest I have). R also may not understand the character encoding field in the file, and in particular it probably does not work with SPSS Unicode files.
You might try opening the file in SPSS, deleting any custom attributes, and resaving the file.
You can see whether there are custom attributes with the SPSS command display attributes.
If so, delete them (see the VARIABLE ATTRIBUTE and DATAFILE ATTRIBUTE commands), and try again.
HTH,
Jon Peck
If you have access to SPSS, save the file as .csv, then import it with read.csv or read.table. I can't recall any problem with .sav file importing. So far it has worked like a charm with both read.spss and spss.get. I reckon that spss.get will not give different results, since it depends on foreign::read.spss.
Can you provide some info on SPSS/R/Hmisc/foreign version?
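A minimal sketch of the CSV route, assuming the data were exported from SPSS with a header row and comma separators. The helper name `read_spss_csv` and the file name in the usage note are hypothetical. Note that a CSV export carries only the cell values, so variable labels and missing-value definitions stored in the .sav file are not available this way.

```r
# Sketch: read a CSV that was exported from SPSS.
# Assumes a header row and comma-separated fields.
read_spss_csv <- function(path) {
  read.csv(path,
           stringsAsFactors = FALSE,    # keep text columns as character
           na.strings = c("", "NA"))    # treat empty cells as missing
}
```

For example, `asq <- read_spss_csv("ASQ2010.csv")` followed by `str(asq)` gives a quick check of the column types.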
Another solution not mentioned here is to read SPSS data in R via ODBC. You need the RODBC package in R. See the example here. However, I have to admit that there could be problems with very big data files.
For me it works well using memisc!
I agree with @SDahm that the haven package would be the way to go. I myself have struggled a bit with string values when starting to use it, so I thought I'd share my approach on that here, too. The "semantics" vignette has some useful information on this topic.
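One way the haven route might look, as a sketch that assumes haven is installed. It is not necessarily the approach this answer had in mind. The helper name `read_sav_haven` is hypothetical; `read_sav` and `as_factor` are haven functions.

```r
# Sketch only: assumes the haven package is installed.
# read_sav_haven is a hypothetical helper, not part of haven.
read_sav_haven <- function(path, labels_as_factors = TRUE) {
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  if (!requireNamespace("haven", quietly = TRUE)) {
    stop("package 'haven' is required: install.packages('haven')")
  }
  dat <- haven::read_sav(path)
  if (labels_as_factors) {
    # Columns with SPSS value labels become factors built from those
    # labels; unlabelled columns, including plain strings, are left alone.
    dat <- haven::as_factor(dat, only_labelled = TRUE)
  }
  dat
}
```

With `labels_as_factors = FALSE` the labelled columns stay in haven's labelled form, which keeps the underlying codes next to the labels.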
There is no such problem with the packages you are using. The only requirement for reading an SPSS file is to put it into PORTABLE format. That is, SPSS files have the *.sav extension, and you need to convert your file into a portable file that uses the *.por extension.
There is more info at http://www.statmethods.net/input/importingdata.html
In my case this warning was combined with three other symptoms: a new variable appeared before the first column of my data with values -100, 2, 2, 2, ..., the correspondence between labels and values was shifted, and the last variable was deleted. A solution that worked was (using SPSS) to create a new dummy variable in the last column of the file, fill it with random values and execute the following code:
(filename is the path to the sav file, and in my case the original SPSS file had 62 columns, thus 63 with the additional dummy variable)
Hope the above code will help someone else.
Turn Unicode off in SPSS:
Open SPSS without any data open and run the command that switches Unicode mode off in your syntax editor.
Open the data set and resave it to remove the Unicode encoding.
read.spss('yourdata.sav', to.data.frame=T)
then works correctly.
I just came across an SPSS file that I couldn't open using haven, foreign, or memisc, but readspss::read.por did the trick for me:
Nice! Thanks, @JanMarvin!
1) I've found the program stat-transfer useful for importing SPSS and Stata files into R. It resolves the issue you mention by converting the SPSS file to an R dataset. It is also very useful for subsetting very large datasets into smaller portions that R can handle. It is not free, but it is a very useful tool for working with datasets from different programs, especially if you don't have access to those programs.
2) The memisc package also has an SPSS import function worth trying.