UNIX: importing a LARGE csv into SQLite

Posted 2024-10-05 18:37:41

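I have a 5 GB csv file (also available as a SAS data file, if that would be easier) that I need to put into an SQL database so I can work with it in R.

The variable names are all contained in the first observation line and are double quoted. There are 1000+ variables, some numeric and others character (though some of the character variables are strings of numerals; I'm not too worried about that, since I can fix it in R).

My question is: how can I import the csv file into a new table in my database with minimal pain?

What I've found says to create the table first (which includes specifying all the variables, of which I have 1000+) and then use ".import file table" to bring in the data. The alternative is some GUI import wizard, which is not an option for me.

Sorry if this is SQL 101, but thanks for the help.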

冰火雁神 2024-10-12 18:37:41

Here's my workflow:

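library("DBI")      ## generics: dbConnect(), dbWriteTable(), dbExecute(), ...
library("RSQLite")  ## SQLite driver
setwd("~/your/dir")
db <- dbConnect(SQLite(), dbname="your_db.sqlite") ## creates the file if it does not exist
## optional: SQL type for each column, keyed by column name
field.types <- list(
        date="INTEGER",
        symbol="TEXT",
        permno="INTEGER",
        shrcd="INTEGER",
        prc="REAL",
        ret="REAL")
## import the csv straight from disk into a new table
dbWriteTable(conn=db, name="your_table", value="your_file.csv", row.names=FALSE, header=TRUE, field.types=field.types)
## index the columns you will filter on; the table name must match the one written above
dbExecute(db, "CREATE INDEX IF NOT EXISTS idx_your_table_date_sym ON your_table (date, symbol)")
dbDisconnect(db)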

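field.types isn't required. If you don't provide this list, RSQLite will guess the column types from the file. The index isn't required either, but it will speed up your later queries, provided you index the columns those queries actually filter on.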
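To illustrate those two optional pieces, here is a minimal sketch that is not part of the workflow above. It assumes a tiny made-up csv written to a temp file, an in-memory database, and the hypothetical names demo_table and idx_demo_date_sym. It imports without field.types, prints the column types RSQLite picked, and then asks SQLite whether a query filtering on date and symbol would go through the index.

```r
library(DBI)
library(RSQLite)

# Hypothetical stand-in for the real file (assumption: unquoted values, comma separated).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "date,symbol,prc",
  "20240102,AAA,10.5",
  "20240103,BBB,20.25",
  "20240104,AAA,11"
), csv_path)

db <- dbConnect(SQLite(), dbname = ":memory:")

# No field.types supplied, so RSQLite chooses the column types itself.
dbWriteTable(db, "demo_table", csv_path, header = TRUE, row.names = FALSE)

# Inspect the column names and the types that were chosen.
col_info <- dbGetQuery(db, "PRAGMA table_info(demo_table)")
print(col_info[, c("name", "type")])

# Confirm every data row arrived.
n_rows <- dbGetQuery(db, "SELECT COUNT(*) AS n FROM demo_table")$n

# Index the two columns the query below filters on.
dbExecute(db, "CREATE INDEX IF NOT EXISTS idx_demo_date_sym ON demo_table (date, symbol)")

# Ask SQLite how it would run the query; the plan text names the index if it is used.
plan <- dbGetQuery(db, paste(
  "EXPLAIN QUERY PLAN",
  "SELECT prc FROM demo_table WHERE date = 20240102 AND symbol = 'AAA'"
))
print(plan$detail)

dbDisconnect(db)
```

With 1000+ columns it is probably worth running the same PRAGMA table_info check on the real table after the import, since a guessed type may not be the one you want for every column.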

I've been learning a lot of this stuff here on SO, so if you check the questions I've asked and answered about SQLite, you may find some tangential material.
