Reading and storing CSV data line by line in Postgres

Posted on 2024-11-01 10:34:13

I want to copy CSV data from different files and store it in a table. The problem is that the number of columns differs between files: some CSV files have 3 columns while others have 4. If a file has 4 columns, I simply want to ignore the fourth column and save only the first three.

Using the following code, I can copy data into the table if there are only 3 columns:

CREATE TABLE ImportCSVTable (
    name    varchar(100),
    address varchar(100),
    phone   varchar(100)
);

COPY ImportCSVTable (name, address, phone)
    FROM 'path'
    WITH DELIMITER ';' CSV QUOTE '"';

But I would like to check each row individually and then store it in the table.

Thank you.

Comments (3)

茶花眉 2024-11-08 10:34:13

Since you want to read and store the data one line at a time, the Python csv module should make it easy to read the first 3 columns from your CSV file regardless of any extra columns.

You can construct an INSERT statement and execute it with your preferred Python-PostgreSQL module. I have used pyPgSQL in the past; I don't know what's current now.

#!/usr/bin/env python3
import csv

filesource = 'PeopleAndResources.csv'
# Placeholders let the database module quote the values, instead of
# pasting raw CSV text into the SQL string.
statement = ("INSERT INTO ImportCSVTable (name, address, phone) "
             "VALUES (%s, %s, %s)")

# The csv module expects text-mode files opened with newline=''.
with open(filesource, newline='') as f:
    reader = csv.reader(f, delimiter=';', quotechar='"')
    for row in reader:
        params = tuple(row[0:3])  # first three columns; extras are ignored
        # execute statement with params using your database module

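To make the per-row check explicit, here is a minimal sketch, assuming that rows with fewer than three fields should be skipped rather than inserted (the question does not say how to treat them). The helper name `first_three_columns` and the sample data are hypothetical, and the last comment assumes a DB-API driver such as psycopg2, which the answer itself does not name.

```python
import csv
import io

INSERT_SQL = ("INSERT INTO ImportCSVTable (name, address, phone) "
              "VALUES (%s, %s, %s)")

def first_three_columns(stream):
    """Yield (name, address, phone) tuples, ignoring any extra columns."""
    reader = csv.reader(stream, delimiter=";", quotechar='"')
    for row in reader:
        if len(row) < 3:
            continue  # assumption: short or blank rows are skipped
        yield tuple(row[:3])

# Hypothetical sample input: a 3-column row, a 4-column row with a quoted
# delimiter, and a malformed 2-column row.
sample = io.StringIO(
    'Ann;1 Main St;555-0100\n'
    'Bob;"2 Oak Ave; Apt 4";555-0101;ignored\n'
    'broken;row\n'
)
rows = list(first_three_columns(sample))
# With a DB-API driver (assumed, e.g. psycopg2):
#   cursor.executemany(INSERT_SQL, rows)
```

Passing the values as parameters, rather than formatting them into the SQL text, means a name such as O'Brien does not break the statement.
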
旧城烟雨 2024-11-08 10:34:13

Use a text utility to chop off the fourth column. That way, all your input files will have three columns. Some combination of awk, cut, and sed should take care of it for you, but it depends on what your columns look like.
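
If a shell one-liner is awkward, for instance because a quoted field contains the delimiter (which a plain `cut -d';' -f1-3` would split), the same trimming can be sketched in Python. This assumes the files use `;` as the delimiter and `"` as the quote character, as in the question; `trim_to_three` is a hypothetical helper name.

```python
import csv
import io

def trim_to_three(src, dst):
    """Copy CSV rows from src to dst, keeping only the first three fields."""
    reader = csv.reader(src, delimiter=";", quotechar='"')
    writer = csv.writer(dst, delimiter=";", quotechar='"', lineterminator="\n")
    for row in reader:
        writer.writerow(row[:3])

# In-memory streams stand in for real files in this sketch.
src = io.StringIO('Ann;1 Main St;555-0100;x\n"Bob; Jr";2 Oak Ave;555-0101\n')
dst = io.StringIO()
trim_to_three(src, dst)
trimmed = dst.getvalue()
```

With real files, open both with `newline=''` and then point the original three-column COPY command at the trimmed output.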

濫情▎り 2024-11-08 10:34:13

You can also just make your input table have a fourth column that is nullable, then after the import drop the extra column.
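
One wrinkle with this approach is that COPY expects every row to supply all of the listed columns, so a three-column file still has to be loaded with a three-column list. Here is a minimal sketch, assuming each file has a consistent column count and the spare column is called `extra` (a hypothetical name), that picks the column list from a file's first row and builds the matching COPY statement:

```python
import csv
import io

# "extra" is a hypothetical name for the nullable fourth column.
COLUMNS = ["name", "address", "phone", "extra"]

def copy_statement(stream, table="ImportCSVTable"):
    """Build a COPY ... FROM STDIN statement matching the stream's first row."""
    first_row = next(csv.reader(stream, delimiter=";", quotechar='"'))
    width = len(first_row)
    if not 3 <= width <= len(COLUMNS):
        raise ValueError(f"unexpected column count: {width}")
    column_list = ", ".join(COLUMNS[:width])
    return (f"COPY {table} ({column_list}) FROM STDIN "
            "WITH DELIMITER ';' CSV QUOTE '\"'")

three = copy_statement(io.StringIO("Ann;1 Main St;555-0100\n"))
four = copy_statement(io.StringIO("Bob;2 Oak Ave;555-0101;unused\n"))
```

Once every file is loaded, `ALTER TABLE ImportCSVTable DROP COLUMN extra;` removes the spare column.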
