Delete or ignore a line from a CSV - Python
I am trying to find a way to make my script ignore or delete the first line of my CSV files. I know this can be done with pandas, but is it possible without it?
Many thanks for your help.
Here is my code:
from os import mkdir
from os.path import join, splitext, isdir
from glob import iglob
from csv import DictReader
from collections import defaultdict
from urllib.request import urlopen
from shutil import copyfileobj

csv_folder = r"/Users/folder/PycharmProjects/pythonProject/CSVfiles/"
glob_pattern = "*.csv"
for file in iglob(join(csv_folder, glob_pattern)):
    with open(file) as csv_file:
        reader = DictReader(csv_file)
        save_folder, _ = splitext(file)
        if not isdir(save_folder):
            mkdir(save_folder)
        title_counter = defaultdict(int)
        for row in reader:
            url = row["link"]
            title = row["title"]
            title_counter[title] += 1
            _, ext = splitext(url)
            save_filename = join(save_folder, f"{title}_{title_counter[title]}{ext}".replace('/', '-'))
            print(f"'{save_filename}'")
            with urlopen(url) as req, open(save_filename, "wb") as save_file:
                copyfileobj(req, save_file)
Comments (2)
Use the `next()` function to skip the first row of your CSV.
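As a minimal sketch of the `next()` approach: calling `next()` on the open file object before handing it to `DictReader` discards the first physical line, so the second line is the one read as the header. The helper name `read_rows_skipping_first_line` and the sample data are made up for illustration, and the sketch assumes the unwanted line sits above the real header row.

```python
import csv
import io


def read_rows_skipping_first_line(csv_file):
    """Drop the first physical line, then parse the rest with DictReader."""
    # The None default keeps an empty file from raising StopIteration.
    next(csv_file, None)
    # DictReader now treats the next line as the header row.
    return list(csv.DictReader(csv_file))


# Hypothetical data: one unwanted line above the real header.
sample = io.StringIO(
    "exported by some tool\n"
    "title,link\n"
    "cat,http://example.com/cat.jpg\n"
)
rows = read_rows_skipping_first_line(sample)
print(rows)
```

In the script from the question, the equivalent would be a `next(csv_file, None)` line placed just before `reader = DictReader(csv_file)`. If the goal is instead to drop the first data row after the header, `next(reader, None)` can be called once the reader has been created.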
You could just read the raw text from the file as normal, split it into lines, and delete the first line. This loads the whole file into memory and may take a long time for larger CSV files, though, so it may not be the best solution.
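The read-and-split idea could look roughly like the sketch below. `csv.DictReader` accepts any iterable of lines, so the list that remains after slicing off the first line can be passed to it directly. The function name `rows_without_first_line` and the sample text are hypothetical, and the sketch assumes no quoted field contains a line break of its own.

```python
import csv


def rows_without_first_line(text):
    """Parse CSV text after discarding its first line."""
    # splitlines() handles \n and \r\n; [1:] drops the first line.
    lines = text.splitlines()[1:]
    return list(csv.DictReader(lines))


# Hypothetical file contents; in the question's script this would
# come from csv_file.read().
raw_text = "junk line\ntitle,link\ndog,http://example.com/dog.png\n"
parsed = rows_without_first_line(raw_text)
print(parsed)
```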
Or you could simply skip the first row in your for loop, so that the loop body only runs from the second row onward. I think that should skip the first row.
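One way to skip a row from within the loop, sketched here with `itertools.islice` (an assumption, not necessarily what the answer had in mind), is to wrap the reader so that iteration starts at the second item. Because `DictReader` has already consumed the header line as field names, this drops the first data row, not the header. The names `rows_after_first` and `sample_file` are illustrative.

```python
import csv
import io
from itertools import islice


def rows_after_first(csv_file):
    """Return every parsed row except the first data row."""
    reader = csv.DictReader(csv_file)
    kept = []
    # islice(reader, 1, None) starts iterating at the second row.
    for row in islice(reader, 1, None):
        kept.append(row)
    return kept


# Hypothetical data: a header followed by two data rows.
sample_file = io.StringIO(
    "title,link\n"
    "first,http://example.com/a.png\n"
    "second,http://example.com/b.png\n"
)
remaining = rows_after_first(sample_file)
print(remaining)
```

A slice such as `reader[1:]` would not work here, since a `DictReader` is an iterator and cannot be indexed.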