Splitting a data set into arbitrary parts

Posted on 2025-01-13 01:53:43

I have this data set:

var_1 = rnorm(1000,1000,1000)
var_2 = rnorm(1000,1000,1000)
var_3 = rnorm(1000,1000,1000)

sample_data = data.frame(var_1, var_2, var_3)

I would like to split this data set into 10 different datasets (each containing 100 rows) and then upload them on to a server.

I know how to do this by hand:

sample_1 = sample_data[1:100,]
sample_2 = sample_data[101:200,]
sample_3 = sample_data[201:300,]

# etc.

library(DBI)

#establish connection (my_connection)

dbWriteTable(my_connection,  SQL("sample_1"), sample_1)
dbWriteTable(my_connection,  SQL("sample_2"), sample_2)
dbWriteTable(my_connection,  SQL("sample_3"), sample_3)

# etc

Is there a way to do this "quicker"?

I thought of a general way to do this - but I am not sure how to correctly write the code for this:

i = seq(1:1000, by = 100)
j = 1 - 99
{
sample_i = sample_data[ i:j,]

dbWriteTable(my_connection,  SQL("sample_i"), sample_i)
}

Can someone please help me with this?

Thank you!

冬天的雪花 2025-01-20 01:53:43

Here's an example using the SQLite database engine. We'll start with your sample data set:

var_1 = rnorm(1000,1000,1000)
var_2 = rnorm(1000,1000,1000)
var_3 = rnorm(1000,1000,1000)

sample_data = data.frame(var_1, var_2, var_3)

Now we'll break your large data frame into a list of 10 data frames using split(). The grouping expression (seq(nrow(sample_data))-1) %/% 100 assigns rows 1 to 100 to group 0, rows 101 to 200 to group 1, and so on:

list_of_dfs <- split(
  sample_data, (seq(nrow(sample_data))-1) %/% 100
)
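To make the grouping trick easier to check and reuse, here is a minimal sketch that wraps it in a function and inspects the result. The helper name split_into_chunks and the chunk_size argument are hypothetical, introduced only for illustration; they are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): split a data frame
# into consecutive chunks of at most `chunk_size` rows.
split_into_chunks <- function(df, chunk_size) {
  # Rows 1..chunk_size fall in group 0, the next chunk_size rows in group 1, etc.
  groups <- (seq_len(nrow(df)) - 1) %/% chunk_size
  split(df, groups)
}

sample_data <- data.frame(
  var_1 = rnorm(1000, 1000, 1000),
  var_2 = rnorm(1000, 1000, 1000),
  var_3 = rnorm(1000, 1000, 1000)
)

chunks <- split_into_chunks(sample_data, 100)
length(chunks)        # number of chunks
sapply(chunks, nrow)  # rows in each chunk
names(chunks)         # group labels taken from the grouping vector

# A chunk size that does not divide the row count leaves a shorter last chunk
uneven <- split_into_chunks(sample_data, 300)
sapply(uneven, nrow)
```

Because the groups come from integer division of the row index, the same function should work for any row count and any positive chunk size.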

We'll create a vector with the names of the tables in the database. Here, I'm just making a simple vector with the names sample_1, sample_2, etc.

table_names <- paste0("sample_", 1:10)
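If the number of chunks may change, the names can be derived from the list itself rather than from a hardcoded 1:10. A small sketch, assuming a stand-in list_of_dfs built from a one-column data frame so the block runs on its own:

```r
# Stand-in for the list produced by split() (assumption: 1000 rows, chunks of 100)
list_of_dfs <- split(data.frame(x = 1:1000), (seq_len(1000) - 1) %/% 100)

# Number the tables from the list itself instead of hardcoding 1:10
table_names <- paste0("sample_", seq_along(list_of_dfs))

# Attaching the names to the list keeps each table name next to its data
names(list_of_dfs) <- table_names

table_names
nrow(list_of_dfs[["sample_3"]])
```

Naming the list is optional here, but it makes it harder for the names and the data frames to get out of step later.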

Now we're ready to write to the database. We'll make a connection and then iterate over the list of data frames and the vector of table names simultaneously, calling dbWriteTable() each time:

library(DBI)    # dbConnect(), dbWriteTable()
library(purrr)  # map2()

connection <- dbConnect(RSQLite::SQLite(), dbname = "test.db")
map2(
  table_names, 
  list_of_dfs, 
  function(x,y) dbWriteTable(connection, x, y)
)
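The loop sketched in the question can also be written directly, without building the list first. This is one possible version, assuming the RSQLite package is installed and using an in-memory SQLite database as a stand-in for the real server; the variable names chunk_size, starts, written and row_counts are illustrative.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the real server
con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")

sample_data <- data.frame(
  var_1 = rnorm(1000, 1000, 1000),
  var_2 = rnorm(1000, 1000, 1000),
  var_3 = rnorm(1000, 1000, 1000)
)

chunk_size <- 100
# First row of each chunk: 1, 101, 201, ...
starts <- seq(1, nrow(sample_data), by = chunk_size)

for (k in seq_along(starts)) {
  i <- starts[k]
  # Last row of the chunk, capped at the end of the data frame
  j <- min(i + chunk_size - 1, nrow(sample_data))
  # The table name is built from the loop counter: sample_1, sample_2, ...
  dbWriteTable(con, paste0("sample_", k), sample_data[i:j, ])
}

# Check what was written before closing the connection
written <- dbListTables(con)
row_counts <- sapply(written, function(tbl) nrow(dbReadTable(con, tbl)))
dbDisconnect(con)
```

The two things the question's attempt was missing are the loop over the start indices and a table name built with paste0() from the counter, instead of the literal string "sample_i".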
