Delete blank rows from CSV?

Posted on 2024-10-09 07:42:50

I have a large CSV file in which some rows are entirely blank. How do I use Python to delete all blank rows from the CSV?

After all your suggestions, this is what I have so far:

import csv

# open input csv for reading
inputCSV = open(r'C:\input.csv', 'rb')

# create output csv for writing
outputCSV = open(r'C:\OUTPUT.csv', 'wb')

# prepare output csv for appending
appendCSV = open(r'C:\OUTPUT.csv', 'ab')

# create reader object
cr = csv.reader(inputCSV, dialect = 'excel')

# create writer object
cw = csv.writer(outputCSV, dialect = 'excel')

# create writer object for append
ca = csv.writer(appendCSV, dialect = 'excel')

# add pre-defined fields
cw.writerow(['FIELD1_','FIELD2_','FIELD3_','FIELD4_'])

# delete existing field names in input CSV
# ???????????????????????????

# loop through input csv, check for blanks, and write all changes to append csv
for row in cr:
    if row or any(row) or any(field.strip() for field in row):
        ca.writerow(row)

# close files
inputCSV.close()
outputCSV.close()
appendCSV.close()

Is this ok or is there a better way to do this?

Comments (11)

梦中的蝴蝶 2024-10-16 07:42:50

Use the csv module:

import csv
...

with open(in_fnam, newline='') as in_file:
    with open(out_fnam, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        for row in csv.reader(in_file):
            if row:
                writer.writerow(row)

If you also need to remove rows where all of the fields are empty, change the if row: line to:

if any(row):

And if you also want to treat fields that consist of only whitespace as empty you can replace it with:

if any(field.strip() for field in row):

Note that in Python 2.x the csv module expected binary files, so you needed to open your files with the 'b' flag. In 3.x, doing this will result in an error; open the files in text mode with newline='' instead, as shown above.
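To make the difference between the three checks concrete, and to show one way the question's header replacement could be done with a single writer, here is a minimal sketch. The names `is_blank`, `copy_non_blank_rows` and `new_header`, and the sample data, are assumptions for illustration only; they are not part of the answer above.

```python
import csv
import io


def is_blank(row):
    """True for [], for rows of empty fields, and for whitespace-only fields."""
    return not any(field.strip() for field in row)


def copy_non_blank_rows(in_file, out_file, new_header=None):
    """Copy CSV rows from in_file to out_file, dropping blank rows.

    If new_header is given, the first non-blank input row is assumed to be
    the old header and is replaced by new_header.
    """
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    rows = (row for row in reader if not is_blank(row))
    if new_header is not None:
        next(rows, None)            # discard the existing header row
        writer.writerow(new_header)
    writer.writerows(rows)


# In-memory demonstration; real files would be opened with newline=''.
sample = 'a,b,c\r\n\r\n1,2,3\r\n,,\r\n  , ,\r\n4,5,6\r\n'
out = io.StringIO(newline='')
copy_non_blank_rows(io.StringIO(sample, newline=''), out,
                    new_header=['FIELD1_', 'FIELD2_', 'FIELD3_'])
result = out.getvalue()
```

Using one writer for both the header and the data rows avoids opening the output file twice, which the code in the question does.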

两人的回忆 2024-10-16 07:42:50

Surprised that nobody here mentioned pandas. Here is a possible solution; it works because read_csv skips blank lines by default (skip_blank_lines=True).

import pandas as pd
df = pd.read_csv('input.csv')
df.to_csv('output.csv', index=False)

又爬满兰若 2024-10-16 07:42:50

Delete empty rows from a .csv file using Python:

import csv
...

with open('demo004.csv', newline='') as infile, open('demo005.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for row in csv.reader(infile):
        # keep the row only if at least one field has non-whitespace content
        if any(field.strip() for field in row):
            writer.writerow(row)

Thank you

友谊不毕业 2024-10-16 07:42:50

Doing it with pandas is very simple. Open your csv file with pandas:

import pandas as pd
df = pd.read_csv("example.csv")
# Check the number of missing values in each column of the csv file
print(df.isnull().sum())
# Drop the rows in which every field is missing
# (a bare dropna() would drop any row with even one missing value)
modifiedDF = df.dropna(how='all')
# Save the result to a new csv file
modifiedDF.to_csv('modifiedExample.csv', index=False)

世俗缘 2024-10-16 07:42:50

You have to open a second file, write all non-blank lines to it, delete the original file and rename the second file to the original name.

EDIT: A truly blank line looks like '\n':

for line in f1:  # iterate lazily instead of loading everything with readlines()
    if line.strip() == '':
        continue
    f2.write(line)

A line with all blank fields would look like ',,,,,\n'. If you consider this a blank line:

for line in f1:
    if ''.join(line.split(',')).strip() == '':
        continue
    f2.write(line)

Opening, closing, deleting and renaming the files is left as an exercise for you. (Hint: import os, help(open), help(os.rename), help(os.unlink))

EDIT2: Laurence Gonsalves brought to my attention that a valid csv file could have blank lines embedded in quoted csv fields, like 1, 'this\n\nis tricky',123.45. In this case the csv module will take care of that for you. I'm sorry Laurence, your answer deserved to be accepted. The csv module will also address the concerns about a line like "","",""\n.
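The point in EDIT2 can be checked with a small in-memory sketch. The sample data is made up for illustration, and it uses double quotes because that is the csv module's default quote character.

```python
import csv
import io

# One record whose quoted second field contains an empty line,
# followed by a truly blank row and a plain record.
data = '1,"this\n\nis tricky",123.45\n\n2,plain,6.7\n'

# Line-based filtering also drops the empty line inside the quoted field.
line_filtered = ''.join(
    line for line in io.StringIO(data) if line.strip() != ''
)

# csv-based filtering parses the quoted field as one value and
# only drops the row that is really blank.
rows = [row for row in csv.reader(io.StringIO(data, newline='')) if row]
```

Here `line_filtered` has silently lost a newline from the field's content, while `rows` keeps the field intact.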

帅气称霸 2024-10-16 07:42:50

Python code to remove blank lines from a csv file without creating another file.

import csv

def ReadWriteconfig_file(file):
    try:
        # read all non-blank rows into memory
        with open(file, 'r', newline='') as file_object:
            lines = csv.reader(file_object, delimiter=',', quotechar='"')
            flag = 0
            data = []
            for line in lines:
                if line == []:
                    flag = 1  # found a blank row
                    continue
                data.append(line)
        if flag == 1:  # rewrite only if a blank line is present in the file
            with open(file, 'w', newline='') as file_object:
                # csv.writer re-quotes fields that contain commas or quotes,
                # which a plain ','.join(line) would not
                writer = csv.writer(file_object, delimiter=',', quotechar='"', lineterminator='\n')
                writer.writerows(data)
    except Exception as e:
        print(e)
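Overwriting the file in place keeps every row in memory, and the data is lost if the write fails halfway. A hedged alternative, not the method of the answer above, is to stream into a temporary file in the same folder and swap it in with os.replace once writing has succeeded. The function name `remove_blank_rows_in_place` is hypothetical.

```python
import csv
import os
import tempfile


def remove_blank_rows_in_place(path):
    """Rewrite path without blank rows, via a temporary file in the same folder."""
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix='.tmp')
    try:
        # Wrap the temporary descriptor first so it is always closed.
        with os.fdopen(fd, 'w', newline='') as dst, \
                open(path, newline='') as src:
            writer = csv.writer(dst)
            for row in csv.reader(src):
                # keep rows that have at least one non-whitespace field
                if any(field.strip() for field in row):
                    writer.writerow(row)
        os.replace(tmp_path, path)   # swap in the cleaned file
    except BaseException:
        os.remove(tmp_path)          # clean up the partial temporary file
        raise
```

A temporary file is still created briefly, but the caller ends up with a single file under the original name.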

尛丟丟 2024-10-16 07:42:50

Here is a solution using pandas that removes blank rows.

import pandas as pd
df = pd.read_csv('input.csv')
df.dropna(axis=0, how='all', inplace=True)
df.to_csv('output.csv', index=False)

嗳卜坏 2024-10-16 07:42:50

I need to do this without a blank row being written at the end of the CSV file, which this code unfortunately does (and which is also what Excel does if you Save -> .csv). My (even simpler) code using the csv module does this too:

import csv

# Python 3: open both files in text mode with newline='' (Python 2 used 'rb'/'wb')
with open("M51_csv_proc.csv", newline='') as infile, open("dumpFile.csv", 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for row in csv.reader(infile):
        writer.writerow(row)

M51_csv_proc.csv has exactly 125 rows; the program always outputs 126 rows, the last one being blank.

I've been through all these threads and nothing seems to change this behaviour.
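One likely explanation, offered as an assumption rather than a diagnosis of this particular file: csv.writer ends every row, including the last one, with a line terminator ('\r\n' by default), and some tools display that final terminator as an empty extra line. If the consumer really cannot accept it, a sketch like the following serializes the rows in memory and strips the final terminator before saving. The function name `rows_to_csv_text` is hypothetical.

```python
import csv
import io


def rows_to_csv_text(rows, lineterminator='\r\n'):
    """Serialize rows to CSV text with no line terminator after the last row."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer, lineterminator=lineterminator).writerows(rows)
    text = buffer.getvalue()
    if text.endswith(lineterminator):
        text = text[:-len(lineterminator)]
    return text


text = rows_to_csv_text([['a', 'b'], ['1', '2']])
```

The resulting text can then be written with open(out_fnam, 'w', newline='').write(text). This holds the whole output in memory, so it is only a reasonable trade-off for files that fit comfortably in RAM.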

娜些时光,永不杰束 2024-10-16 07:42:50

This script removes all the CR / CRLF characters from a CSV file that has lines like this:

"My name";[email protected];"This is a comment.
Thanks!"

Execute the script https://github.com/eoconsulting/lr2excelcsv/blob/master/lr2excelcsv.py

Result (in Excel CSV format):

"My name",[email protected],"This is a comment. Thanks!"

ぃ弥猫深巷。 2024-10-16 07:42:50

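Replace PATH_TO_YOUR_CSV with the path to your file:

import pandas as pd

df = pd.read_csv('PATH_TO_YOUR_CSV')
# how='all' drops only rows in which every field is missing;
# a bare dropna() would drop any row with even one missing value
new_df = df.dropna(how='all')
new_df.to_csv('output.csv', index=False)

or in-line:

import pandas as pd

pd.read_csv('data.csv').dropna(how='all').to_csv('output.csv', index=False)

对不⑦ 2024-10-16 07:42:50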

I had the same problem.

I converted the .csv file to a dataframe and then converted the dataframe back to a .csv file.

The initial .csv file with the blank lines was 'csv_file_logger2.csv'.

So I did the following:

import pandas as pd

# read_csv skips blank lines by default, so they are gone when the data is written back
df = pd.read_csv('csv_file_logger2.csv')
df.to_csv('out2.csv', index=False)