Replace empty csv column values with zero

Posted on 2024-09-02 05:25:06

So I'm dealing with a csv file that has missing values.
What I want my script to do is:

#!/usr/bin/python

import csv
import sys

#1. Place each record of a file in a list.
#2. Iterate thru each element of the list and get its length.
#3. If the length is less than one replace with value x.


reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for x in row[:]:
                if len(x)< 1:
                         x = 0
                print x
print row

Here is an example of the data I'm trying it on. Ideally it should work on any column length.

Before:
actnum,col2,col4
xxxxx ,    ,
xxxxx , 845   ,
xxxxx ,    ,545

After:
actnum,col2,col4
xxxxx , 0  , 0
xxxxx , 845, 0
xxxxx , 0  ,545

Any guidance would be appreciated.

Update: Here is what I have now (thanks):

reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
                if len(x)< 1:
                         x = row[i] = 0
print row

However, it only seems to output one record. I will be piping the output to a new file on the command line.

Update 3: OK, now I have the opposite problem: I'm outputting duplicates of each record.
Why is that happening?

After:
actnum,col2,col4
actnum,col2,col4
xxxxx , 0  , 0
xxxxx , 0  , 0
xxxxx , 845, 0
xxxxx , 845, 0
xxxxx , 0  ,545
xxxxx , 0  ,545

OK, I fixed it (below). Thank you guys for your help.

#!/usr/bin/python

import csv
import sys

#1. Place each record of a file in a list.
#2. Iterate thru each element of the list and get its length.
#3. If the length is less than one replace with value x.


reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
    print ','.join(str(x) for x in row)
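
For readers on Python 3, where `print` is a function and csv files are opened in text mode, the same idea might look roughly like the sketch below. The helper names `fill_empty` and `convert` are made up for illustration and are not part of the original script. The demo runs on in-memory sample data with the padding spaces left out, and it writes the string "0" through `csv.writer` instead of joining fields by hand.

```python
import csv
import io


def fill_empty(rows, fill="0"):
    """Yield each row with zero-length fields replaced by `fill` (hypothetical helper)."""
    for row in rows:
        yield [fill if len(field) < 1 else field for field in row]


def convert(infile, outfile):
    """Copy CSV data from infile to outfile, filling empty fields (hypothetical helper)."""
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerows(fill_empty(csv.reader(infile)))


# Demo on in-memory data. For a real file, one would open it with
# open(path, newline="") and pass sys.stdout as the output.
sample = io.StringIO("actnum,col2,col4\nxxxxx,,\nxxxxx,845,\nxxxxx,,545\n")
result = io.StringIO()
convert(sample, result)
print(result.getvalue(), end="")
```

Using `csv.writer` rather than `','.join(...)` also means fields containing commas or quotes would be escaped correctly, which the hand-joined version does not handle.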

Comments (2)

昔日梦未散 2024-09-09 05:25:06

Change your code:

for row in reader:
    for x in row[:]:
                if len(x)< 1:
                         x = 0
                print x

into:

for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
        print x

Not sure what you think you're accomplishing by the print, but the key issue is that you need to modify row, and for that purpose you need an index into it, which enumerate gives you.
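
To make the difference concrete, here is a small standalone sketch (Python 3 syntax, with a made-up sample row) showing that rebinding the loop variable leaves the list alone, while assigning through the index from `enumerate` changes it:

```python
row = ["xxxxx", "", "545"]

# Rebinding the loop variable only changes what the name x points to;
# the list itself is left untouched.
for x in row:
    if len(x) < 1:
        x = 0
unchanged = list(row)

# Assigning through the index changes the list itself.
for i, x in enumerate(row):
    if len(x) < 1:
        row[i] = 0

print(unchanged)  # ['xxxxx', '', '545']
print(row)        # ['xxxxx', 0, '545']
```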

Note also that all other values, except the empty ones which you're changing into the number 0, will remain strings. If you want to turn them into ints you have to do that explicitly.
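
If you do want ints, one possible way is sketched below (Python 3 syntax). The helper name `to_int_or_keep` is hypothetical, and leaving non-numeric fields such as the account number untouched is an assumption about what you would want:

```python
def to_int_or_keep(field):
    """Return int(field) when it parses as an integer, else the field unchanged."""
    try:
        return int(field)
    except (TypeError, ValueError):
        return field


# Sample row as it might look after the empty field was replaced by 0.
row = ["xxxxx ", " 845   ", 0]
converted = [to_int_or_keep(f) for f in row]
print(converted)  # ['xxxxx ', 845, 0]
```

`int()` ignores surrounding whitespace, so " 845   " converts cleanly without stripping it first.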

血之狂魔 2024-09-09 05:25:06

You are very nearly there!

There are just a couple of small bugs.

  • len(x)< 1 will not work for the second column in the second row of your data because x will contain only spaces (and so have a length >= 1). You'll need to strip your strings.

  • print row sits outside the for loop, so it runs only once, after iteration has finished, and prints just the last row. Either move it inside the loop or remove this line.
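
A rough sketch of the strip-based check (Python 3 syntax). The name `zero_if_blank` is made up for illustration, and the rows mirror the sample data from the question:

```python
def zero_if_blank(field, fill="0"):
    """Return fill when field is empty or whitespace-only, else the field unchanged."""
    return fill if not field.strip() else field


# Fields as csv.reader would return them for the sample data, padding included.
rows = [
    ["xxxxx ", "    ", ""],
    ["xxxxx ", " 845   ", ""],
    ["xxxxx ", "    ", "545"],
]
cleaned = [[zero_if_blank(f) for f in row] for row in rows]
for row in cleaned:
    print(",".join(row))
```

Note that this only strips for the test and keeps the padding on non-blank fields such as " 845   ". If you also want those trimmed, return `field.strip()` instead of `field`.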

Also: Are you trying to modify the file or just output the corrections to pipe to some other file or process?
