Adding noise to dummy variables
I am trying to run a kNN regression. However, I have a lot of dummy variables and therefore a lot of ties. To solve this problem, I want to add noise to the dummies: rows with a 1 on a specific variable should get a random value between 0.99 and 1, and rows with a 0 should get a random value between 0 and 0.01. Can somebody help me with an efficient way to transform my dummy variables?
Comments (3)
There is a great function in base R for this called `jitter`.
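As a rough sketch of how `jitter` might be applied here: the vector `dummy` and the choice `amount = 0.01` are assumptions for illustration, not part of the original answer. Note that `jitter` adds symmetric noise, so the results land in roughly [-0.01, 0.01] and [0.99, 1.01] rather than strictly inside the one-sided ranges the question asks for.

```r
set.seed(42)  # only to make this sketch reproducible

# Hypothetical 0/1 dummy variable (assumed example data)
dummy <- c(0, 1, 1, 0, 1, 0, 0, 1)

# With amount = 0.01, jitter() adds uniform noise from [-0.01, 0.01] to each value
dummy_jittered <- jitter(dummy, amount = 0.01)

# The noise is symmetric, so values may fall slightly below 0 or above 1
range(dummy_jittered[dummy == 0])
range(dummy_jittered[dummy == 1])
```

If the values must stay inside [0, 1], a one-sided approach based on `runif` is probably a better fit than `jitter`.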
You can use an `ifelse` statement to transform your dummy vars. Here I add a new column, but you can replace the existing one by assigning the result of the `ifelse` statement to the old dummy variable.
However, I agree with @SamR's answer about dummy variables: it is not very clear what you want to do with the dummy variable.
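The `ifelse` approach described in this answer could look something like the following. This is a minimal sketch, not the answerer's exact code: the data frame `df` and the column names `dummy` and `dummy_noisy` are hypothetical.

```r
set.seed(1)  # only to make this sketch reproducible

# Hypothetical data frame with a single 0/1 dummy column
df <- data.frame(dummy = c(1, 0, 0, 1, 1, 0))
n <- nrow(df)

# Rows with 1 get a value in [0.99, 1]; rows with 0 get a value in [0, 0.01]
df$dummy_noisy <- ifelse(df$dummy == 1,
                         runif(n, min = 0.99, max = 1),
                         runif(n, min = 0, max = 0.01))

df
```

To overwrite the original column instead of adding a new one, assign the same expression to `df$dummy`.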
To just add noise you can do something like:
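One possible way to do this for several dummy columns at once is sketched below. The data frame `df` and the column names `d1` and `d2` are made up for illustration. The trick assumed here is that `abs(x - u)` with `u` drawn from [0, 0.01] maps 1 to a value in [0.99, 1] and 0 to a value in [0, 0.01].

```r
set.seed(123)  # only to make this sketch reproducible

# Hypothetical data: one outcome and two 0/1 dummy columns
df <- data.frame(
  y  = rnorm(5),
  d1 = c(1, 0, 1, 0, 0),
  d2 = c(0, 0, 1, 1, 0)
)
dummy_cols <- c("d1", "d2")

# Draw one small uniform noise value per cell of the dummy columns
m <- as.matrix(df[dummy_cols])
noise <- matrix(runif(length(m), min = 0, max = 0.01), nrow = nrow(m))

# 1 becomes 1 - u (in [0.99, 1]); 0 becomes u (in [0, 0.01])
df[dummy_cols] <- as.data.frame(abs(m - noise))

df
```

Because the whole matrix is handled in one vectorised step, this avoids looping over rows or columns.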
However, I would question whether this is the right approach. Dummy variables do not usually need noise added to them. What do you mean when you say you are getting ties? In general, if you have a variable representing n levels of a factor, you only need n-1 dummy variables. Is this what you are referring to?
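To illustrate the n-1 point, here is a small sketch using `model.matrix` from base R with its default treatment contrasts. The factor `colour` is a hypothetical example with three levels.

```r
# Hypothetical factor with n = 3 levels: blue, green, red
colour <- factor(c("red", "green", "blue", "green", "red"))

# Default treatment contrasts give an intercept plus n - 1 = 2 dummy columns;
# the first level ("blue") is the baseline and gets no column of its own
X <- model.matrix(~ colour)

colnames(X)
ncol(X) - 1  # number of dummy columns, excluding the intercept
```

The baseline level is represented by all dummy columns being zero, which is why a separate column for it is redundant.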