How to merge two files in Python
I have two tab-delimited CSV files (with headers) that I need to merge in Python.
In the merged file I also want to add a column at the end to identify the source file: although the two files have the same format, they contain different data that I need to separate later on.
So I want to add a column called 'source' to each line of output, which is 0 for file1 and 1 for file2.
I have gotten as far as using the csv module, but writerow adds an additional newline character between each line it writes, and this code doesn't write anything from file2. What am I doing wrong here? Also, how do I add the extra column 'source' to the line object?
import os, csv
path1 = os.path.abspath("../data/file1.txt")
path2 = os.path.abspath("../data/file2.txt")
merged_path = os.path.abspath('../data/output.txt')
# merge the two files for further processing
merged_file = csv.writer(open(merged_path, 'a'), delimiter = '\t')
#file1
fg = csv.reader(open(path1, 'r'), delimiter = '\t')
for line in fg:
    if line[7] != '\N':
        merged_file.writerow(line)
#file2
bg = csv.reader(open(path2, 'r'), delimiter = '\t')
for line in bg:
    if line[16] != '\N':
        merged_file.writerow(line)
Comments (1)
I prefer to use csv.DictWriter for this. Also, your code doesn't work because the csv library (in Python 2) requires opening files in binary mode: 'rb' for reading and 'wb' for writing.
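To make the DictWriter suggestion concrete, here is a minimal sketch of how the merge might look. It is written for Python 3, where the counterpart of the binary-mode advice is opening files in text mode with newline="". The function name merge_with_source, the NULL_MARKER constant, and the (path, source_id, required_column) tuple layout are all hypothetical names chosen for illustration; the sketch also assumes both files share exactly the same header row and that the first file is not empty.

```python
import csv

# Placeholder value that the question's code filters on (assumed to mean "null").
NULL_MARKER = r"\N"


def merge_with_source(inputs, merged_path):
    """Merge tab-delimited files that share a header, tagging each row.

    inputs is a list of (path, source_id, required_column) tuples
    (a hypothetical layout). Rows whose required_column equals
    NULL_MARKER are skipped; pass None to keep every row.
    """
    writer = None
    # newline="" lets the csv module control line endings, which avoids
    # the blank line between rows that shows up on some platforms.
    with open(merged_path, "w", newline="") as out:
        for path, source_id, required_column in inputs:
            with open(path, newline="") as src:
                reader = csv.DictReader(src, delimiter="\t")
                if writer is None:
                    # Build the output header once, from the first file,
                    # with the extra 'source' column at the end.
                    fieldnames = list(reader.fieldnames) + ["source"]
                    writer = csv.DictWriter(
                        out, fieldnames=fieldnames, delimiter="\t"
                    )
                    writer.writeheader()
                for row in reader:
                    if (required_column is not None
                            and row[required_column] == NULL_MARKER):
                        continue
                    row["source"] = source_id
                    writer.writerow(row)
```

With the paths from the question, a call could look like merge_with_source([("../data/file1.txt", 0, None), ("../data/file2.txt", 1, None)], "../data/output.txt"), replacing None with a column name to reproduce the '\N' filtering. Because DictReader works with column names, the filter is expressed by name rather than by the numeric indexes (7 and 16) used in the question.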