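Is there an R function to select N random columns from a data frame?
I am trying to check the time complexity of the sparsebn package for structure learning of Bayesian networks.
I've tried the code below, but instead of selecting only N columns, it also selects N rows. How can I fix that?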
library(sparsebn)
library(igraph)
library(graph)
df <- read.csv("data/arth150.csv", header = TRUE, sep = ",", check.names = FALSE)
df <- as.data.frame(unclass(df), stringsAsFactors = TRUE)
experiment_range <- list(10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 106)
timelist <- list()
for (i in experiment_range) {
  rand_df <- df[sample(ncol(df), size=i), ]
  start_time <- Sys.time()
  dat <- sparsebnData(rand_df, type = 'c')
  dags <- estimate.dag(data = dat)
  end_time <- Sys.time()
  ctime <- end_time - start_time
  otime <- list(ctime)
  timelist <- append(timelist, otime)
}
Comments (2)
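If df is a data frame, you can sample i columns randomly by putting the sampled indices in the column position of the brackets, that is, after the comma: for example, df[, sample(ncol(df), i)]. In the question's code the indices sit before the comma, which selects rows instead.
Or use dplyr: its select() function also accepts column positions, and it can be used in a pipe.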
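As a minimal sketch of the row-versus-column indexing point, the following uses R's built-in mtcars data set as a stand-in for the arth150 data (an assumption, since that file is not available here) and a hypothetical helper named sample_columns, which is not part of sparsebn:

```r
# Hypothetical helper (not part of sparsebn): keep all rows, pick n random columns.
sample_columns <- function(data, n) {
  stopifnot(n <= ncol(data))
  # Indices placed AFTER the comma select columns; drop = FALSE keeps the
  # result a data frame even when n is 1.
  data[, sample(ncol(data), size = n), drop = FALSE]
}

set.seed(1)  # only so the illustration is reproducible
rand_df <- sample_columns(mtcars, 4)
dim(rand_df)   # all rows of mtcars, 4 columns

# The form used in the question puts the indices BEFORE the comma,
# which selects rows and keeps every column:
wrong_df <- mtcars[sample(ncol(mtcars), size = 4), ]
dim(wrong_df)  # 4 rows, all columns of mtcars
```

In the question's loop, the corresponding change would be to write rand_df <- df[, sample(ncol(df), size = i)] so that the sampled indices apply to columns.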