Simulating a variable based on another variable in R

Posted on 2025-01-23 23:16:07

I want to simulate a categorical variable that is correlated with a simulated numerical variable. More specifically, I have a variable age defined as age <- rnorm(n = 1000, mean = 35, sd = 9), and I want to simulate another variable, class, in which a higher age tends to go with a higher class. Can anyone point me in the right direction? Thanks in advance!

Comments (1)

秋日私语 2025-01-30 23:16:07

As I understand it, (Pearson) correlation measures how strongly a and b are linearly related. So a simple way to get a variable a that is correlated with b is to write a as a linear function of b and then add random noise, so that the relationship is not exact.

Here is one way of doing that:

set.seed(1)
age <- rnorm(n = 10, mean = 35, sd = 9)
# Slope: any finite min and max work; it should be positive here so that class increases with age
beta <- runif(1, min = 1, max = 5)
# Linear function of age plus Gaussian noise (any other mean and sd would also work)
class <- beta * age + rnorm(length(age), mean = 0, sd = 2)

# Check correlation between age and class
cor(age, class)
#[1] 0.9994416

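# Check if higher age makes for higher class
# Note: sorting each vector separately always gives two increasing columns;
# data.frame(age, class)[order(age), ] would keep each age paired with its own class value.
data.frame(sort(age), sort(class))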

   sort.age. sort.class.
1   27.47934    129.6408
2   27.61578    131.3707
3   29.36192    137.5428
4   32.25150    152.3856
5   36.65279    171.3957
6   37.96557    179.0890
7   39.38686    184.8634
8   40.18203    187.9404
9   41.64492    198.2192
10  49.35753    233.2981
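
The code above produces a continuous value, while the question asks for a categorical class. One possible extension, not part of the original answer, is to treat the linear-plus-noise value as a latent score and bin it into ordered categories with cut(). This is only a sketch: the names score and social_class, the slope 0.5, the noise sd of 3, and the three labels are assumptions chosen for illustration.

```r
set.seed(1)
age <- rnorm(n = 1000, mean = 35, sd = 9)

# Latent continuous score: linear in age plus Gaussian noise.
# The slope (0.5) and noise sd (3) are arbitrary, illustrative choices.
score <- 0.5 * age + rnorm(length(age), mean = 0, sd = 3)

# Cut the score at its terciles to obtain three ordered classes of similar size.
breaks <- quantile(score, probs = c(0, 1/3, 2/3, 1))
social_class <- cut(score, breaks = breaks,
                    labels = c("low", "middle", "high"),
                    include.lowest = TRUE, ordered_result = TRUE)

# Mean age within each class; it should rise from "low" to "high".
mean_age_by_class <- tapply(age, social_class, mean)
print(mean_age_by_class)

# Rank correlation between age and the integer class codes.
rho <- cor(age, as.integer(social_class), method = "spearman")
print(rho)
```

A larger noise sd weakens the link between age and class, and a smaller one strengthens it. Changing probs in quantile() changes how many classes there are and how large each one is.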