需要在Python中对多个CSV文件中的每一行进行数学运算
我有大约 100 个 CSV 文件,每个月都要处理一次,我试图解决这个问题,但遇到了困难。我开始了解有关 Python 的一些事情,但是将几件事结合起来仍然给我带来问题,所以我无法弄清楚这一点。
这是我的问题:
我有很多 CSV 文件,这就是我需要做的:
在每行的前面添加一个“列”(或后面,实际上并不重要,但前面是理想的)。另外,每行有 5 行(不包括将要添加的文件名),格式如下:
6 位 ID 号,YYYY-MM-DD(1),YYYY-MM-DD(2),YYYY- MM-DD(3),1-2 位数字
我需要为文件中的每一行从 YYYY-MM-DD(2) 中减去 YYYY-MM-DD(3)(没有标题)行),对于给定目录中的每个 CSV。
我需要行内的文件名,因为我将合并这些文件(如果包含在脚本中,那就太棒了,但我想我可以弄清楚该部分),并且我需要知道记录来自哪个文件。文件名的格式始终为“4-5-digit-number.csv”
我希望这是有意义的,如果没有,请告诉我。我有点不知道从哪里开始,所以我没有任何真正开始为我工作的示例代码。真的很沮丧,所以我感谢你们提供的任何帮助,这个网站很棒!
迈兰
I have about 100 CSV files I have to operate on once a month and I was trying to wrap my head around this but I'm running into a wall. I'm starting to understand some things about Python, but combining several things is still giving me issues, so I can't figure this out.
Here's my problem:
I have many CSV files, and here's what I need done:
add a "column" to the front of each row (or the back, doesn't matter really, but front is ideal). In addition, each line has 5 rows (not counting the filename that will be added), and here's the format:
6-digit ID number,YYYY-MM-DD(1),YYYY-MM-DD(2),YYYY-MM-DD(3),1-2-digit number
I need to subtract YYYY-MM-DD(3) from YYYY-MM-DD(2) for every line in the file (there is no header row), for every CSV in a given directory.
I need the filename inside the row because I will combine the files (which, if is included in the script would be awesome, but I think I can figure that part out), and I need to know what file the records came from. Format of filename is always '4-5-digit-number.csv'
I hope this makes sense, if it does not, please let me know. I'm kind of stumped as to where to even begin, so I don't have any sample code that even really began to work for me. Really frustrated, so I appreciate any help you guys may provide, this site rocks!
Mylan
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
标准库中有一个工具可以完成这些任务:
要迭代目录中的所有 CSV 文件,请使用
glob
模块:要解析 CSV 文件,请使用
csv
模块:要解析日期并计算差异,请使用
datetime module
:
将值添加到行的开头:
将文件名附加到行的末尾:
将一行写入新的 CSV 文件:
总而言之,您将得到:
There's a tool in the standard library for each of these tasks:
To iterate over all CSV files in a directory, use the
glob
module:To parse a CSV file, use the
csv
module:To parse a date and calculate a difference, use the
datetime
module:To add a value to the beginning of a row:
To append the filename to the end of a row:
And to write a row to a new CSV file:
Taken all together, you get:
该程序的基本轮廓如下:
The basic outline of the program is going to be like this:
columns = line.split(",")