使用 LOAD DATA INFILE 将两个 CSV 文件加载到同一行中

发布于 2024-10-12 08:10:59 字数 267 浏览 4 评论 0原文

我有两个 CSV 文件,每个文件都包含我想要合并到一个数据库中的不同列。我使用两个文件中的所有列设置了一个数据库,但是当我使用 load data infile 导入两个文件而不是合并时(IE 数据文件 1 填充第 1-6 列,数据文件 2 填充 7-10 列),我得到一个数据库行数是两倍(每个 CSV 中的每条记录一行),并且 NULL 填充源 CSV 中未表示的数据。

我知道我可以通过以某种方式合并 CSV、在启用覆盖的情况下导入或合并数据库中的数据来解决此问题 - 对我来说最有效的方法是什么?

I have two CSV files, each containing different columns that I want to merge into one database. I set up a database with all the columns from both files, but when I use load data infile to import both files instead of merging (IE data file 1 fills columns 1-6, data file 2 fills 7-10) I get a database with twice as many rows (one row for each record in each CSV) and NULLS filling in the data not represented in the source CSV.

I know I can fix this by merging the CSVs somehow, importing with overwrite enabled, or combining the data in the database - what's the most efficient way for me to do this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

溇涏 2024-10-19 08:10:59

执行此操作的最佳方法是使用脚本。 CSV 导入脚本通常使用脚本语言(例如 python、ruby 或 php)编写。

您只需要第二个 CSV 的导入程序对第一个 CSV 中创建的记录执行更新,因此脚本实际上只有 5-10 行。如果您提供每个 CSV 的示例记录,我很乐意为您写一个。

编辑:这是一个用于组合文件的 python 脚本,在 file1 的行和 file2 的行之间添加分号。这本质上与 Linux 的 paste 命令的作用相同。

lines1 = open('file1.txt').readlines()
lines2 = open('file2.txt').readlines()
outfile = open('outfile.txt', 'w')

if len(lines1) != len(lines2):
    raise Exception("Files need to be the same length, but file1 is %s lines long and file2 is %s lines long" % (len(lines1), len(lines2)));

for i in range(len(lines1)):
    combined = lines1[i].strip() + ";" + lines2[i].strip() + "\n"
    outfile.write(combined)

您可以通过将其另存为 combine.py 并输入 python merge.py 来运行它。您放置的文件夹应包含 file1.txtfile2.txtoutfile.txt

The best way to do this is with a script. CSV import scripts are usually written in a scripting language such as python, ruby, or php.

You just need the importer for the second CSV to perform updates on the records created in the first CSV, so the script will really only be 5-10 lines. If you provide a sample record from each CSV, I'd be happy to write one for you.

Edit: Here's a python script to combine the files, adding a semicolon between lines from file1 and lines from file2. This essentially does what Linux's paste command would do.

lines1 = open('file1.txt').readlines()
lines2 = open('file2.txt').readlines()
outfile = open('outfile.txt', 'w')

if len(lines1) != len(lines2):
    raise Exception("Files need to be the same length, but file1 is %s lines long and file2 is %s lines long" % (len(lines1), len(lines2)));

for i in range(len(lines1)):
    combined = lines1[i].strip() + ";" + lines2[i].strip() + "\n"
    outfile.write(combined)

You can run it by saving it as combine.py and typing python combine.py. The folder you place it in should contain file1.txt, file2.txt, and outfile.txt.

半窗疏影 2024-10-19 08:10:59

将两个 CSV 合并为一个。

如果您使用的是Linux平台,请使用paste命令来连接两个或多个文件。

PASTE(1)

NAME
       paste - merge lines of files

SYNOPSIS
       paste [OPTION]... [FILE]...

DESCRIPTION
       Write lines consisting of the sequentially corresponding lines from 
       each FILE, separated by TABs, to standard output.  
       With no FILE, or when FILE is -, read standard input.

       Mandatory arguments to long options are mandatory for short options too.

       -d, --delimiters=LIST
              reuse characters from LIST instead of TABs

       -s, --serial
              paste one file at a time instead of in parallel

       --help display this help and exit

       --version
              output version information and exit

例如

paste file1.csv file2.csv > file3.csv

Combine two CSV into one.

If you are in linux platform, use the paste command to join two or more files.

PASTE(1)

NAME
       paste - merge lines of files

SYNOPSIS
       paste [OPTION]... [FILE]...

DESCRIPTION
       Write lines consisting of the sequentially corresponding lines from 
       each FILE, separated by TABs, to standard output.  
       With no FILE, or when FILE is -, read standard input.

       Mandatory arguments to long options are mandatory for short options too.

       -d, --delimiters=LIST
              reuse characters from LIST instead of TABs

       -s, --serial
              paste one file at a time instead of in parallel

       --help display this help and exit

       --version
              output version information and exit

such as

paste file1.csv file2.csv > file3.csv
晚雾 2024-10-19 08:10:59

我会看看 Perl 和 Text::CSV 模块。您需要考虑的一个问题是两个文件中的数据顺序是否相同。

I would look at Perl and the Text::CSV module. One issue you'll need to think about is whether the data is in the same order in the two files.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文