How can I merge multiple .csv files horizontally with Python?
I have several .csv files (~10) and need to merge them horizontally into a single file. Each file has the same number of rows (~300) and 4 header lines. The header lines are not necessarily identical across files and should not be merged: only the header lines from the first .csv file should be kept. The tokens in each line are comma-separated, with no spaces in between.
As a Python noob I haven't come up with a solution, though I'm sure there is a simple one. Any help is welcome.
Comments (6)
You can load the CSV files using the csv module in Python. Please refer to the module's documentation for the loading code; I can't remember it offhand, but it is really easy.
Once the CSV files are loaded in that form (each file as a list of tuples), you can merge two such lists line by line and then write the merged rows out to a new file.
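The load, merge, and save steps described above could look roughly like the sketch below. It assumes Python 3 and the 4 header lines mentioned in the question; the function names (load_rows, merge_horizontally, save_rows, merge_files) are made up for illustration and are not taken from the original answer.

```python
import csv

HEADER_LINES = 4  # assumption taken from the question: 4 header lines per file


def load_rows(path):
    """Read one CSV file into a list of tuples, one tuple per line."""
    with open(path, newline="") as handle:
        return [tuple(row) for row in csv.reader(handle)]


def merge_horizontally(tables, header_lines=HEADER_LINES):
    """Join tables row by row; header rows come from the first table only."""
    merged = [tuple(row) for row in tables[0][:header_lines]]
    bodies = [table[header_lines:] for table in tables]
    for rows in zip(*bodies):
        # rows holds the same-numbered data row from every file
        merged.append(tuple(cell for row in rows for cell in row))
    return merged


def save_rows(path, rows):
    """Write the merged rows out as a CSV file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def merge_files(input_paths, output_path):
    """Hypothetical convenience wrapper: load every file, merge, save."""
    tables = [load_rows(path) for path in input_paths]
    save_rows(output_path, merge_horizontally(tables))
```

Note that zip stops at the shortest file, which is fine here because the question says every file has the same number of rows.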
The csv module is your friend.
If you don't necessarily have to use Python, you can use shell tools such as paste or gawk. Doing so puts the files side by side without the headers. If you want the headers, just take them from file1.
You don't need the csv module for this. You can just use izip_longest from itertools (named zip_longest in Python 3).
After opening all your files, pass them to izip_longest together. This gives you the structure that kon has already described: one group per row, holding the matching line from each file. It also works if the files have different numbers of lines.
After this you can write the result to a new file, taking one list at a time.
PS: more about izip_longest in the itertools documentation.
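A minimal sketch of the izip_longest idea, written for Python 3, where the function is called zip_longest. The helper names merge_lines and merge_files and the empty-string fill value are assumptions for illustration, not part of the original answer. Header handling is left out to keep the focus on the pairing step.

```python
from itertools import zip_longest  # named izip_longest in Python 2


def merge_lines(line_groups, fill=""):
    """Join the n-th line of every group with commas.

    line_groups is a list of lists of strings, one inner list per file.
    Files that run out of lines early contribute `fill` instead.
    """
    merged = []
    for lines in zip_longest(*line_groups, fillvalue=fill):
        merged.append(",".join(line.rstrip("\r\n") for line in lines))
    return merged


def merge_files(input_paths, output_path):
    """Hypothetical wrapper: read every file, merge, write one line at a time."""
    line_groups = []
    for path in input_paths:
        with open(path) as handle:
            line_groups.append(handle.readlines())
    with open(output_path, "w") as out:
        for line in merge_lines(line_groups):
            out.write(line + "\n")
```

One caveat of this sketch: a missing line is padded with a single empty field, so a short file with several columns will leave the padded rows narrower than the others.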
You learn by doing (and by trying, even). So I'll just give you a few hints. Use the following functions:
open()
IOBase.readlines()
str.split()
If you really don't know what to do, I recommend reading the tutorial and Dive Into Python 3. (Depending on how much Python you know, you'll either have to read through the first few chapters or can cut straight to the file I/O chapters.)
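For readers who want to check their own attempt, here is one way the three hinted functions might fit together. This is only a sketch: the names merge_lines_by_tokens and merge_with_hints are invented, and the 4 header lines are taken from the question.

```python
HEADER_LINES = 4  # assumption taken from the question


def merge_lines_by_tokens(all_lines, header_lines=HEADER_LINES):
    """all_lines holds one list of raw lines per file, as returned by readlines()."""
    # Header lines are copied from the first file only.
    merged = [line.rstrip("\r\n") for line in all_lines[0][:header_lines]]
    for index in range(header_lines, len(all_lines[0])):
        tokens = []
        for lines in all_lines:
            # str.split() turns "1,2,3\n" into ["1", "2", "3"]
            tokens.extend(lines[index].rstrip("\r\n").split(","))
        merged.append(",".join(tokens))
    return merged


def merge_with_hints(input_paths, output_path):
    """Hypothetical wrapper using open() and readlines() on every input file."""
    all_lines = []
    for path in input_paths:
        with open(path) as handle:
            all_lines.append(handle.readlines())
    with open(output_path, "w") as out:
        for line in merge_lines_by_tokens(all_lines):
            out.write(line + "\n")
```

Splitting into tokens is not strictly needed just to glue lines together, but it makes it easy to drop or reorder columns later.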
Purely for learning purposes.
A simple approach that does not take advantage of the csv module:
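One way such an approach might look is sketched below. It is not the answerer's own code: it assumes Python 3, the 4 header lines from the question, and invented names (merge_streams, merge_files). It reads all files in lockstep, so nothing has to be held in memory.

```python
from contextlib import ExitStack

HEADER_LINES = 4  # assumption taken from the question


def merge_streams(streams, out, header_lines=HEADER_LINES):
    """streams: iterables of lines (e.g. open files); out: anything with write()."""
    for number, lines in enumerate(zip(*streams)):
        if number < header_lines:
            # Header lines are taken from the first file only.
            out.write(lines[0].rstrip("\r\n") + "\n")
        else:
            out.write(",".join(line.rstrip("\r\n") for line in lines) + "\n")


def merge_files(input_paths, output_path):
    """Hypothetical wrapper: open every input plus the output, then merge."""
    with ExitStack() as stack:
        streams = [stack.enter_context(open(path)) for path in input_paths]
        out = stack.enter_context(open(output_path, "w"))
        merge_streams(streams, out)
```

Because it works on raw strings, this sketch breaks on quoted fields that contain commas or newlines; the question says the tokens are plain comma-separated values, so that should not matter here.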