Delete or ignore a line from a CSV - Python

Posted on 2025-01-19 03:41:47, 1130 characters, 3 views, 0 comments

I am trying to find a way to make my script ignore or delete the first line of my CSV files. I know this can be done with pandas, but is it possible without it?

Many thanks for your help.

Here is my code:

from os import mkdir
from os.path import join, splitext, isdir
from glob import iglob
from csv import DictReader
from collections import defaultdict
from urllib.request import urlopen
from shutil import copyfileobj

csv_folder = r"/Users/folder/PycharmProjects/pythonProject/CSVfiles/"
glob_pattern = "*.csv"
for file in iglob(join(csv_folder, glob_pattern)):
    with open(file) as csv_file:
        reader = DictReader(csv_file)
        save_folder, _ = splitext(file)
        if not isdir(save_folder):
            mkdir(save_folder)
        title_counter = defaultdict(int)
        for row in reader:
            url = row["link"]
            title = row["title"]
            title_counter[title] += 1
            _, ext = splitext(url)
            save_filename = join(save_folder, f"{title}_{title_counter[title]}{ext}".replace('/', '-'))
            print(f"'{save_filename}'")
            with urlopen(url) as req, open(save_filename, "wb") as save_file:
                copyfileobj(req, save_file)

您的好友蓝忘机已上羡 2025-01-26 03:41:47

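Use the next() function to skip a row of your CSV. Because DictReader consumes the first line as the header, calling next(reader) skips the first data row; to discard the first physical line instead, call next(csv_file) before creating the DictReader.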

with open(file) as csv_file:
    reader = DictReader(csv_file)

    # skip first row
    next(reader)
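
To make the difference between the two calls concrete, here is a minimal sketch. The sample data, the function names, and the use of io.StringIO in place of a real file are all assumptions made for illustration; only next() and DictReader come from the answer above.

```python
import csv
import io

# Hypothetical sample with a junk first line before the real header.
SAMPLE_WITH_JUNK = (
    "exported by some tool\n"
    "title,link\n"
    "a,http://example.com/a.png\n"
    "b,http://example.com/b.png\n"
)

# Hypothetical sample whose first line is already the header.
SAMPLE_PLAIN = (
    "title,link\n"
    "a,http://example.com/a.png\n"
    "b,http://example.com/b.png\n"
)


def rows_skipping_first_line(text_file):
    """Discard the first physical line, then parse the rest with DictReader."""
    next(text_file, None)  # the default avoids StopIteration on an empty file
    return list(csv.DictReader(text_file))


def rows_skipping_first_record(text_file):
    """Let DictReader use line 1 as the header, then drop the first data row."""
    reader = csv.DictReader(text_file)
    next(reader, None)
    return list(reader)


by_line = rows_skipping_first_line(io.StringIO(SAMPLE_WITH_JUNK))
by_record = rows_skipping_first_record(io.StringIO(SAMPLE_PLAIN))

print([row["title"] for row in by_line])    # both data rows survive
print([row["title"] for row in by_record])  # only the second data row survives
```

Which variant fits depends on whether the unwanted line sits above the header or is the first record below it.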

笑，眼淚并存 2025-01-26 03:41:47

You could read the raw text from the file, split it into lines, and delete the first one:

with open(filename, 'r') as file:  # Open the file; it is closed when the block ends
    content = file.read()          # Read the whole file into memory
lines = content.split("\n")        # Split the text on the newline character
del lines[0]                       # Delete the first element of the list, i.e. the first line

This loads the entire file into memory, so it can be slow for large CSV files and is probably not the best solution.
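
If the goal is to actually delete the first line from the file on disk, a streaming variant avoids holding the whole file in memory. The following is only a sketch under assumptions: the function name drop_first_line and the demo file are hypothetical, and it treats the first line as a plain text line (it does not handle a quoted CSV field that spans several lines).

```python
import os
import tempfile


def drop_first_line(path):
    """Rewrite the file at `path` without its first line, one line at a time."""
    directory = os.path.dirname(os.path.abspath(path))
    # newline="" on both sides keeps the original line endings untouched.
    with open(path, newline="") as src, tempfile.NamedTemporaryFile(
        "w", newline="", dir=directory, delete=False
    ) as dst:
        next(src, None)  # discard the first line; no error if the file is empty
        for line in src:
            dst.write(line)
    os.replace(dst.name, path)  # swap the trimmed copy into place


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "demo.csv")
    with open(demo_path, "w", newline="") as f:
        f.write("junk line\ntitle,link\na,b\n")
    drop_first_line(demo_path)
    with open(demo_path, newline="") as f:
        result = f.read()

print(repr(result))
```

Writing to a temporary file in the same directory and then calling os.replace means the original is only overwritten once the trimmed copy is complete.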

Or you could simply skip the first row in your for loop.
Instead of:

...
for row in reader:
...

you could use:

...
for row_num, row in enumerate(reader):
    if row_num == 0:
        continue
    ...

instead. This skips the first row the reader yields, which for a DictReader is the first data row after the header.
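
As an alternative to counting rows by hand, itertools.islice from the standard library can start the iteration at the second row. This is a swapped-in technique, not part of the answer above, and the sample data is made up for the example.

```python
import csv
import io
from itertools import islice

# Hypothetical sample data standing in for one of the CSV files.
SAMPLE = (
    "title,link\n"
    "first,http://example.com/1.png\n"
    "second,http://example.com/2.png\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))

# islice(reader, 1, None) lazily skips one row and yields the rest.
titles = [row["title"] for row in islice(reader, 1, None)]
print(titles)
```

Like enumerate, islice consumes the reader lazily, so no extra copy of the rows is built.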

About the author

深爱不及久伴
