Add the filename as the last column of a CSV file
I have a Python script which modifies a CSV file to add the filename as the last column:
import sys
import glob
for filename in glob.glob(sys.argv[1]):
    file = open(filename)
    data = [line.rstrip() + "," + filename for line in file]
    file.close()
    file = open(filename, "w")
    file.write("\n".join(data))
    file.close()
Unfortunately, it also adds the filename to the header (first) row of the file. I would like the string "ID" added to the header instead. Can anybody suggest how I could do this?
Comments (6)
Have a look at the official csv module.
Here are a few minor notes on your current code:

- Using file as a variable name is a bad idea, since that shadows the built-in type.
- File objects are closed automatically if you open them with the with syntax.
- Don't you want a header for the new column (something like Filename), rather than just omitting a column in the first row?
- Simply appending the filename does no quoting, so a filename that contains a comma would be split into extra columns.

That last consideration would incline me to use the csv module instead, which will deal with the quoting and unquoting for you. For example, you could read the rows with csv.reader, append the extra value to each row, and write them back with csv.writer.

That may quote the data slightly differently from your input file, so you might want to play with the quoting options for csv.reader and csv.writer described in the documentation for the csv module.

As a further point, you might have good reasons for taking a glob as a parameter rather than just the files on the command line, but it's a bit surprising: you'll have to call your script as ./whatever.py '*.csv' rather than just ./whatever.py *.csv. Instead, you could loop over sys.argv[1:] and let the shell expand your glob before the script knows anything about it.
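The answer's own snippets are not preserved in this copy. As a rough sketch of the csv-based approach, taking the filenames straight from the command line so the shell can expand the glob, something like the following might work. It assumes Python 3, and the function name add_filename_column is made up for illustration.

```python
import csv
import sys


def add_filename_column(filename, header="ID"):
    """Append `header` to the first row and `filename` to every other row."""
    # newline="" lets the csv module handle line endings itself.
    with open(filename, newline="") as infile:
        rows = list(csv.reader(infile))

    with open(filename, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for index, row in enumerate(rows):
            # Only the first row is treated as the header.
            writer.writerow(row + [header if index == 0 else filename])


if __name__ == "__main__":
    # The shell has already expanded *.csv into separate arguments.
    for name in sys.argv[1:]:
        add_filename_column(name)
```

Because csv.writer does the quoting, a filename that contains a comma ends up as a single quoted field rather than two columns.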
One last thing - the current approach you're taking is slightly dangerous, in that if anything fails when writing back to the same filename, you'll lose data. The standard way of avoiding this is to instead write to a temporary file, and, if that was successful, rename the temporary file over the original. So, you might rewrite the whole thing as:
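A minimal sketch of that temporary-file approach, assuming Python 3 (os.replace and tempfile.mkstemp are standard library calls; the function name add_filename_column_safely is hypothetical, and this is not the answer's original code):

```python
import csv
import os
import tempfile


def add_filename_column_safely(filename, header="ID"):
    """Rewrite `filename` through a temporary file so a failure keeps the original."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the original so the final rename
    # stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as outfile, \
                open(filename, newline="") as infile:
            writer = csv.writer(outfile)
            for index, row in enumerate(csv.reader(infile)):
                writer.writerow(row + [header if index == 0 else filename])
        # Replace the original only after everything was written successfully.
        os.replace(temp_path, filename)
    except BaseException:
        # Something went wrong: discard the partial output, keep the original.
        os.remove(temp_path)
        raise
```

One caveat of this sketch: mkstemp creates the temporary file with restrictive permissions, so the rewritten file may not keep the original file's mode.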
You can try:
You can try changing your code, but using the csv module is recommended. This should give you the result you want:
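The snippet that followed is missing from this copy. One guess at what such a change could look like: keep the question's string handling, but use enumerate so the first line gets "ID" instead of the filename. The helper name add_filename_to_lines is hypothetical, and as in the original script nothing is quoted, so this only suits filenames without commas.

```python
import glob
import sys


def add_filename_to_lines(lines, filename, header="ID"):
    """Return lines with `header` appended to the first and `filename` to the rest."""
    return [
        line.rstrip() + "," + (header if index == 0 else filename)
        for index, line in enumerate(lines)
    ]


if __name__ == "__main__":
    # Accepts quoted glob patterns such as '*.csv' as well as plain filenames.
    for pattern in sys.argv[1:]:
        for filename in glob.glob(pattern):
            with open(filename) as infile:
                data = add_filename_to_lines(infile, filename)
            with open(filename, "w") as outfile:
                outfile.writelines(line + "\n" for line in data)
```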
Use the CSV module that comes with Python.
You can run this as follows:
You can use fileinput to do in-place editing.
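The answer's code is not preserved here, so the following is only a sketch of how in-place editing with fileinput might look, assuming Python 3. The function name add_filename_in_place is made up; fileinput.input, isfirstline() and filename() are standard library calls. Like the question's script, it appends text without any CSV quoting.

```python
import fileinput
import sys


def add_filename_in_place(filenames, header="ID"):
    """Edit each file in place, appending `header` or the file's own name per line."""
    if not filenames:
        # An empty list would make fileinput fall back to standard input.
        return
    # With inplace=True, anything printed inside the loop is written back
    # to the file currently being read.
    with fileinput.input(files=filenames, inplace=True) as stream:
        for line in stream:
            extra = header if stream.isfirstline() else stream.filename()
            print(line.rstrip("\r\n") + "," + extra)


if __name__ == "__main__":
    add_filename_in_place(sys.argv[1:])
```

Since isfirstline() is tracked per file, each file's header row gets "ID" even when several files are processed in one call.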