Remove variable labels attached by the foreign/Hmisc SPSS import functions
As usual, I got some SPSS file that I've imported into R with the spss.get function from the Hmisc package. I'm bothered by the labelled class that Hmisc::spss.get adds to all variables in the data.frame, and want to remove it.
The labelled class gives me headaches when I try to run ggplot, or even when I want to do some simple analysis! One solution would be to remove the labelled class from each variable in the data.frame. How can I do that? Is it possible at all? If not, what are my other options?
I really want to avoid re-editing the variables "from scratch" with as.data.frame(lapply(x, as.numeric)) and as.character where applicable... And I certainly don't want to run SPSS and remove the labels manually (I don't like SPSS, nor care to install it)!
Thanks!
Comments (5)
Here's how I get rid of the labels altogether. It is similar to Jyotirmoy's solution, but works for a vector as well as a data.frame. (Partial credit to Frank Harrell.)
Use as follows:
EDIT
Here's a slightly cleaner version of the function, with the same results:
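The code blocks from this answer are not shown above. As a rough sketch of the idea only, the following drops the "labelled" class and the "label" attribute from either a single vector or every column of a data.frame. The names clear_labels and fake_label are hypothetical (not necessarily what the answer's author used), and the toy data imitates an spss.get result with base R so that Hmisc is not needed to run it.

```r
# Hypothetical helper: strip the "labelled" class and the "label"
# attribute from a vector, or from every column of a data.frame.
clear_labels <- function(x) {
  if (is.list(x)) {
    # A data.frame is a list of columns; x[] keeps its structure intact
    x[] <- lapply(x, clear_labels)
    return(x)
  }
  if (inherits(x, "labelled")) {
    class(x) <- setdiff(class(x), "labelled")
  }
  attr(x, "label") <- NULL
  x
}

# Hypothetical helper that imitates what a labelled column looks like
fake_label <- function(x, label) {
  attr(x, "label") <- label
  class(x) <- c("labelled", class(x))
  x
}

# Toy stand-in for an spss.get() result (assumed structure, made-up values)
df <- data.frame(id = 1:3)
df$age <- fake_label(c(23, 35, 41), "Age in years")
df$sex <- fake_label(factor(c("f", "m", "f")), "Sex")

clean <- clear_labels(df)                      # whole data.frame
sapply(clean, function(col) inherits(col, "labelled"))
age_only <- clear_labels(df$age)               # single vector
```

Removing the class by name with setdiff, rather than by position, keeps any other classes (such as "factor") in place.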
A belated note/warning regarding class membership in R objects. The correct way to identify "labelled" objects is not to test with an is function or with equality (==), but rather with inherits. Methods that test a specific position in the class vector will not pick up cases where the order of the existing classes is not the one assumed.
You can avoid creating "labelled" variables in spss.get with the argument use.value.labels=FALSE.
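As a small illustration of the point about inherits (a sketch using toy objects built with base R, not the code from this answer), compare a positional test with inherits on two vectors that carry the same classes in a different order. The helper name strip is hypothetical.

```r
# Same classes, different order (toy objects; no Hmisc needed)
a <- structure(1:3, class = c("labelled", "integer"), label = "A")
b <- structure(1:3, class = c("integer", "labelled"), label = "B")

# Positional test: only looks at the first class, so it misses b
class(a)[1] == "labelled"   # TRUE
class(b)[1] == "labelled"   # FALSE

# inherits() checks the whole class vector
inherits(a, "labelled")     # TRUE
inherits(b, "labelled")     # TRUE

# Hypothetical helper: remove the class by name, wherever it sits,
# including the case where "labelled" is the only class
strip <- function(x) {
  if (inherits(x, "labelled")) {
    remaining <- setdiff(class(x), "labelled")
    if (length(remaining) > 0) {
      class(x) <- remaining
    } else {
      x <- unclass(x)
    }
    attr(x, "label") <- NULL
  }
  x
}
```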
The code from Bhattacharya could fail if the class of the labelled vector were simply "labelled" rather than c("labelled", "factor") in which case it should have been:
The error you report can be reproduced with this code:
You can try out the read.spss function from the foreign package.
A rough and ready way to get rid of the labelled class created by spss.get:
But can you please give an example where labelled causes problems?
If I have a variable MAED in a data frame x created by spss.get, I have:
So well-written code that expects a factor (say) should not have any problems.
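To illustrate that last point (a sketch with made-up values, since the output from the original answer is not reproduced here), a factor that also carries the "labelled" class still passes the usual factor checks. The variable name MAED comes from the answer; its levels and label below are assumptions.

```r
# Toy labelled factor built with base R to imitate an spss.get() column
MAED <- factor(c("primary", "secondary", "primary", "tertiary"))
attr(MAED, "label") <- "Example label"
class(MAED) <- c("labelled", "factor")

is.factor(MAED)             # TRUE: "factor" is still among the classes
inherits(MAED, "labelled")  # TRUE
nlevels(MAED)               # 3
```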
Suppose:
You could remove the label of a variable called "var1" by using:
If you also want to remove the class "labelled", you could do:
or, if the variable has more than one class:
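The snippets for these steps are missing here, so the following is only a sketch of how they might look. The data frame name dta and its contents are assumptions standing in for the result of an spss.get call.

```r
# Toy data frame imitating an imported SPSS file (made-up values)
dta <- data.frame(id = 1:3)
var1 <- c(1.5, 2.5, 3.5)
attr(var1, "label") <- "Some label"
class(var1) <- c("labelled", "numeric")
dta$var1 <- var1

# Remove the label text of var1
attr(dta[["var1"]], "label") <- NULL

# Remove only the "labelled" class, keeping any other classes.
# (If "labelled" were the only class, class(dta[["var1"]]) <- NULL
# would be enough.)
class(dta[["var1"]]) <- setdiff(class(dta[["var1"]]), "labelled")

class(dta[["var1"]])
```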
Hope this helps!
Well, I figured out that the unclass function can be used to remove classes (who would have guessed, eh?!):
It's not the most fortunate solution; just imagine back-converting a bunch of vectors... If anyone tops this, I'll accept it as the answer...
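A short sketch of why unclass is a blunt tool here (toy objects, not the code from this answer): it removes every class, so the "label" attribute stays behind and a factor is reduced to its integer codes, which then have to be converted back.

```r
# Labelled numeric vector (made-up values)
x <- c(10, 20, 30)
attr(x, "label") <- "Some label"
class(x) <- c("labelled", "numeric")

ux <- unclass(x)
inherits(ux, "labelled")   # FALSE
attr(ux, "label")          # the label attribute is still there

# A factor loses its class too: only the integer codes remain
f <- factor(c("low", "high", "low"))
uf <- unclass(f)
is.factor(uf)              # FALSE
is.integer(uf)             # TRUE
```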