Append a specific column from one CSV file to another using Python
I'll explain my whole problem:
I have 2 csv files:
- project-table.csv (has about 50 columns)
- interaction-matrix.csv (has about 45 columns)
I want to append the string in col[43] from project-table.csv with string in col[1] of interaction-matrix.csv with a dot(.) in between both the strings
next,
- interaction-matrix.csv has a set of headers..
- its 1st col will now have the appended string after doing what I've mentioned above
- all other remaining columns have only 0's and 1's
- I'm supposed to extract only those columns with 1's from this interaction-matrix.csv and copy it to a new csv file... (with the first column intact)
This is the code I've come up with...
I'm getting an error with the keepcols line...
import csv
reader=csv.reader(open("project-table.csv","r"))
writer=csv.writer(open("output.csv","w"),delimiter=" ")
for data in reader:
    name1=data[1].strip()+'.'+data[43].strip()
    writer.writerow((name1, None))
reader=csv.DictReader(open("interaction-matrix.csv","r"),[])
allrows = list(reader)
keepcols = [c for c in allrows[0] if all(r[c] != '0' for r in allrows)]
print keepcols
writer=csv.DictWriter(open("output1.csv","w"),fieldnames='keepcols',extrasaction='ignore')
writer.writerows(allrows)
This is the error I get:
Traceback (most recent call last):
  File "prg1.py", line 23, in ?
    keepcols = [c for c in allrows[0] if all([r[c] != '0' for r in allrows])]
NameError: name 'all' is not defined
The project table and the interaction matrix both have the same data in their respective 1st columns, so I just appended col[43] of project-table to col[1] of the same table itself...
Comments (1)
Edit your question to show what error message you are getting.
Update: NameError probably means either that you are using an (older) version of Python (which one?) without all(), or that you have used all as a variable name AND are not showing the exact code that you ran.
Note: open both files in binary mode ("rb" for reading and "wb" for writing).
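The built-in all() was added in Python 2.5, so on an older interpreter the keepcols line fails exactly as the traceback shows. A minimal fallback sketch, assuming you cannot upgrade: all_compat is a hypothetical helper name (not part of the standard library), and the sample rows are made up for illustration.

```python
def all_compat(iterable):
    """Return True when every item is truthy, like the built-in all()."""
    for item in iterable:
        if not item:
            return False
    return True


# Made-up rows standing in for csv.DictReader output.
rows = [
    {"name": "p1", "f1": "1", "f2": "0"},
    {"name": "p2", "f1": "1", "f2": "1"},
]

# Same test as the question's keepcols line: keep a column only if
# no row holds '0' in it.
keepcols = [c for c in ["name", "f1", "f2"]
            if all_compat([r[c] != "0" for r in rows])]
print(keepcols)
```

On Python 2.5 or later the built-in all() can be used directly and the helper is unnecessary.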
You say "I want to append the string in col[43] from project-table.csv with string in col[1] of interaction-matrix.csv with a dot(.) in between both the strings". However, you are using col[2] (not col[1]) of project-table.csv (not interaction-matrix.csv, which you haven't opened at that stage).
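One way to read the stated goal is sketched below, under several assumptions that the question does not confirm: both files list their rows in the same order, indices are 0-based as in the question's code, and "columns with 1's" means columns in which no row holds '0' (the test the question's keepcols line uses). The function names and sample data are hypothetical. The code targets Python 3, where the csv documentation recommends opening real files in text mode with newline="" rather than in binary mode. It reads the header with csv.reader, since passing an explicit empty field-name list to csv.DictReader appears to stop it from taking the names from the first row.

```python
import csv
import io


def build_labels(matrix_rows, project_rows, matrix_col=1, project_col=43):
    """Join matrix_col of one file with project_col of the other, dot-separated."""
    return [m[matrix_col].strip() + "." + p[project_col].strip()
            for m, p in zip(matrix_rows, project_rows)]


def filter_columns(src, dst):
    """Copy a 0/1 matrix, keeping column 0 and every column with no '0' value."""
    reader = csv.reader(src)
    header = next(reader)  # interaction-matrix.csv has a header row
    rows = list(reader)
    keep = [0] + [i for i in range(1, len(header))
                  if all(row[i].strip() != "0" for row in rows)]
    writer = csv.writer(dst)
    writer.writerow([header[i] for i in keep])
    writer.writerows([row[i] for i in keep] for row in rows)
    return [header[i] for i in keep]


# In-memory demo with made-up data; real code would pass file objects
# opened with open(path, newline="").
matrix = io.StringIO("name,f1,f2,f3\np1,1,0,1\np2,1,1,1\n")
out = io.StringIO()
kept = filter_columns(matrix, out)

# Tiny rows, so the column indices are overridden for the demo.
labels = build_labels([["x", "p1"], ["x", "p2"]],
                      [["x", "alpha"], ["x", "beta"]],
                      matrix_col=1, project_col=1)
```

Swapping the all(...) test for any(row[i].strip() == "1" for row in rows) would instead keep every column that contains at least one 1, which is the other plausible reading of the requirement.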