I am dealing with data that includes binary variables, which arrive coded as characters. At one point, I convert them to factors.
At a different point, I have to convert them to numbers and work with the numeric values. Basically, I am doing something very similar to the rescaling in the arm package, and I need two numerical values for the two categories so that I can calculate a mean. I don't care much which two values are used, but once they have been used, I need the same values for every batch of data.
For example, if I have the variable "ikt" coded as "y" and "n", and I run my code once, I may do
ikt <- c("y", "n", "n", "n")
ikt.factor <- as.factor(ikt)
ikt.num <- as.numeric(ikt.factor)
and now that I have run it, ikt.num contains 2,1,1,1.
The question is: how can I get R to always produce the same conversion, and never 1,2,2,2? I need this to happen independently of the dataset and the environment, as long as I can guarantee that the "ikt" column will always arrive coded as "y" and "n".
I don't want to have to hardcode the correspondence of variable levels to numbers, since this code has to work generically for different binary variables, which can arrive with different character codes.
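To make the requirement concrete, here is a minimal sketch of one way this could be approached. It assumes the full set of two codes for a variable is known (or can be taken from a reference batch) before conversion. The helper name make_encoder is hypothetical and not part of any package. The idea is to fix the level order once, sorted in a locale-independent way, and reuse it for every batch instead of letting as.factor derive the levels from whatever values happen to be present.

```r
# Hypothetical helper (an assumption for illustration, not an existing API):
# fix the level order once from the known set of codes, then reuse it.
make_encoder <- function(codes) {
  # method = "radix" sorts character vectors in C-locale order,
  # so the order does not depend on the session's collation settings.
  lev <- sort(unique(as.character(codes)), method = "radix")
  stopifnot(length(lev) == 2)  # binary variables only
  # Values outside the two known codes become NA.
  function(x) as.numeric(factor(x, levels = lev))
}

encode_ikt <- make_encoder(c("y", "n"))

encode_ikt(c("y", "n", "n", "n"))   # 2 1 1 1
encode_ikt(c("y", "y"))             # 2 2, even though "n" is absent
as.numeric(as.factor(c("y", "y")))  # 1 1, levels derived from this batch only
```

The last line shows the case the sketch is meant to guard against: when the levels are derived from each batch, a batch that contains only one of the two codes gets a different numbering. Nothing here hardcodes which code maps to which number; the mapping follows from the sorted codes, so the same helper could be built for other binary variables with different character codes.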