Reading and storing CSV data line by line in Postgres
I want to copy CSV data from different files and store it in a table. The problem is that the number of columns differs between files: some CSV files have 3 columns while others have 4. If a file has 4 columns, I want to simply ignore the fourth column and save only the first three.
Using the following code, I can copy data into the table if there are only 3 columns:
CREATE TABLE ImportCSVTable (
name varchar(100),
address varchar(100),
phone varchar(100));
COPY ImportCSVTable (name , address , phone)
FROM 'path'
WITH DELIMITER ';' CSV QUOTE '"';
But I would like to check each row individually and then store it in the table.
Thank you.
Comments (3)
Since you want to read and store it one line at a time, the Python csv module should make it easy to read the first 3 columns from your CSV file regardless of any extra columns.
You can construct an INSERT statement and execute it with your preferred Python PostgreSQL module. I have used pyPgSQL in the past; I don't know what is current now.
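A minimal sketch of that approach, using only the standard csv module. The helper names first_three_columns and insert_rows are made up for illustration, and the cursor is assumed to be any DB-API cursor whose driver accepts %s placeholders (psycopg2 does, other drivers may differ). Skipping rows with fewer than three fields is also an assumption; that is the place where a per-row check would go.

```python
import csv
import io

# Parameterized statement; the %s placeholder style is an assumption
# about the driver (psycopg2 uses it, other modules may not).
INSERT_SQL = "INSERT INTO ImportCSVTable (name, address, phone) VALUES (%s, %s, %s)"


def first_three_columns(lines):
    """Yield the first three fields of each CSV row, ignoring any extras."""
    reader = csv.reader(lines, delimiter=";", quotechar='"')
    for row in reader:
        if len(row) < 3:
            continue  # assumption: blank or short rows are skipped
        yield tuple(row[:3])


def insert_rows(cursor, lines):
    """Insert one row at a time through a DB-API cursor; return the row count."""
    count = 0
    for record in first_three_columns(lines):
        cursor.execute(INSERT_SQL, record)
        count += 1
    return count


# In-memory sample: one 3-column row (with a quoted ';') and one 4-column row.
sample = io.StringIO(
    'Ann;"1 Main St; Apt 2";555-0100\n'
    'Bob;2 Oak Ave;555-0101;extra\n'
)
rows = list(first_three_columns(sample))
```

With a real connection, you would pass an open file and a cursor to insert_rows and then commit on the connection. Values are passed as parameters rather than pasted into the SQL string, so quotes inside a field do not break the statement.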
Use a text utility to chop off the fourth column. That way, all your input files will have three columns. Some combination of awk, cut, and sed should take care of it for you, but it depends on what your columns look like.
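As an illustration of chopping off the fourth column before the import, here is a sketch that does the trimming in Python instead of awk, cut, or sed. It handles the case where a quoted field contains the ';' delimiter, which a plain field splitter would cut in the wrong place. The function name trim_to_three_columns is hypothetical, and the delimiter and quote character are taken from the COPY statement in the question.

```python
import csv
import io


def trim_to_three_columns(src, dst):
    """Copy CSV rows from src to dst, keeping at most the first three fields."""
    reader = csv.reader(src, delimiter=";", quotechar='"')
    writer = csv.writer(dst, delimiter=";", quotechar='"', lineterminator="\n")
    for row in reader:
        writer.writerow(row[:3])


# In-memory demo; with real files, open both with newline="" as the csv docs advise.
src = io.StringIO('a;b;c;d\ne;"f;g";h\n')
dst = io.StringIO()
trim_to_three_columns(src, dst)
trimmed = dst.getvalue()
```

The trimmed output always has three columns per row, so it can be loaded with the three-column COPY statement from the question.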
You can also just make your input table have a fourth column that is nullable, then after the import drop the extra column.