Can Python remove double quotes from a string when reading a text file?

Posted on 2024-08-11 05:13:25

I have some text files like this, each with about 5000 lines:

5.6  4.5  6.8  "6.5" (new line)
5.4  8.3  1.2  "9.3" (new line)

So the last item on each line is a number enclosed in double quotes.

What I want to do is use Python (if possible) to read the four columns into floating-point variables. The main problem is the last item: I have found no way to remove the double quotes from the number. Is this possible on Linux?

This is what I tried:

#!/usr/bin/python

import os,sys,re,string,array

name=sys.argv[1]
infile = open(name,"r")

cont = 0
while 1:
         line = infile.readline()
         if not line: break
         l = re.split("\s+",string.strip(line)).replace('\"','')
     cont = cont +1
     a = l[0]
     b = l[1]
     c = l[2]
     d = l[3]


Comments (9)

ぶ宁プ宁ぶ 2024-08-18 05:13:25
for line in open(name, "r"):
    line = line.replace('"', '').strip()
    a, b, c, d = map(float, line.split())

This is fairly bare-bones: it will raise an exception if, for example, a line does not contain exactly four values.
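
If the bare-bones behaviour is too strict or too quiet for your data, one possible way to harden it is sketched below in Python 3. The helper name parse_rows and the expected parameter are hypothetical, introduced only for this illustration. The function accepts any iterable of lines (an open file works), skips blank lines, and reports the line number when the column count is wrong.

```python
def parse_rows(lines, expected=4):
    """Yield a list of floats for each non-blank line in `lines`."""
    for lineno, line in enumerate(lines, start=1):
        # Drop every double quote, then split on runs of whitespace.
        fields = line.replace('"', "").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != expected:
            raise ValueError(
                f"line {lineno}: expected {expected} values, got {len(fields)}"
            )
        # float() raises ValueError itself if a field is not numeric.
        yield [float(x) for x in fields]


sample = ['5.6  4.5  6.8  "6.5"\n', '5.4  8.3  1.2  "9.3"\n']
rows = list(parse_rows(sample))
print(rows)
```

With a real file you would pass the file object instead of the sample list, for example inside "with open(name) as infile:".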

╭ゆ眷念 2024-08-18 05:13:25

There's a module in the standard library you can use for this, called shlex:

>>> import shlex
>>> print(shlex.split('5.6  4.5  6.8  "6.5"'))
['5.6', '4.5', '6.8', '6.5']
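
As a small follow-up sketch (Python 3), the tokens returned by shlex.split can be converted to floats directly. The helper name parse_line is made up for this example. Keep in mind that shlex applies shell-style rules, so apostrophes and backslashes in the input would also be interpreted.

```python
import shlex


def parse_line(line):
    """Split a line with shell-like quoting rules and convert each token to float."""
    return [float(token) for token in shlex.split(line)]


values = parse_line('5.6  4.5  6.8  "6.5"')
print(values)
```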
時窥 2024-08-18 05:13:25

The csv module (standard library) does this automatically, although the docs aren't very specific about skipinitialspace:

>>> import csv

>>> with open(name, newline='') as f:
...     for row in csv.reader(f, delimiter=' ', skipinitialspace=True):
...         print('|'.join(row))

5.6|4.5|6.8|6.5
5.4|8.3|1.2|9.3
再见回来 2024-08-18 05:13:25
for line in open(fname):
    line = line.split()
    line[-1] = line[-1].strip('"\n')
    floats = [float(i) for i in line]

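Another option is to use a standard-library module that is intended for this task, namely csv. If the columns are separated by more than one space, as in the sample, skipinitialspace=True is needed so that no empty fields are produced:

>>> import csv
>>> for line in csv.reader(open(fname), delimiter=' ', skipinitialspace=True):
...     print([float(i) for i in line])
...
[5.6, 4.5, 6.8, 6.5]
[5.4, 8.3, 1.2, 9.3]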
久而酒知 2024-08-18 05:13:25

Or you can simply replace your line

l = re.split("\s+",string.strip(line)).replace('\"','')

with this:

l = re.split('[\s"]+',string.strip(line))
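
Note that string.strip() only exists in Python 2. A rough Python 3 equivalent of the same idea is sketched below; the helper name split_fields is hypothetical. Because the line ends with a double quote, re.split leaves a trailing empty string, which this sketch filters out.

```python
import re


def split_fields(line):
    """Split on runs of whitespace and/or double quotes, dropping empty tokens."""
    return [tok for tok in re.split(r'[\s"]+', line.strip()) if tok]


fields = split_fields('5.6  4.5  6.8  "6.5"\n')
a, b, c, d = map(float, fields)
print(fields, a, b, c, d)
```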
踏月而来 2024-08-18 05:13:25

In essence, I removed the " characters from "25" using:

Code:
        result = result.strip("\"") #remove double quotes characters 
落日海湾 2024-08-18 05:13:25

I think the easiest and most efficient thing to do would be to slice it!

From your code:

d = l[3]
returns "6.5"

so you simply add another statement:

d = d[1:-1]

Now d holds 6.5, without the leading and trailing double quotes.

Voilà! :)
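
One caveat: slicing removes the first and last characters whatever they are, so it only gives the right result when the quotes are actually there. A cautious variant might look like the sketch below (Python 3); the helper name unquote is invented for this example.

```python
def unquote(token):
    """Remove one pair of surrounding double quotes, if present."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return token


quoted = unquote('"6.5"')     # quotes removed
unquoted = unquote('6.8')     # left unchanged
blind_slice = '6.8'[1:-1]     # plain slicing would cut off real digits here
print(quoted, unquoted, repr(blind_slice))
```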

迷爱 2024-08-18 05:13:25

You can use a regular expression. Try something like this:

import re
re.findall(r"[0-9.]+", open(name).read())

This will give you a list of all numbers in your file as strings without any quotes.
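
The pattern above does not match signs or exponents, and it returns one flat list for the whole file. If that matters, a slightly broader sketch (Python 3) could look like this. The NUMBER pattern and the numbers_in helper are assumptions made for illustration, not part of the answer above.

```python
import re

# Optional sign, digits with an optional fractional part (or a leading dot),
# and an optional exponent. All groups are non-capturing, so findall()
# returns the whole match.
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')


def numbers_in(text):
    """Return every number found in `text` as a float."""
    return [float(m) for m in NUMBER.findall(text)]


text = '5.6  4.5  6.8  "6.5"\n5.4  8.3  1.2  "9.3"\n'
flat = numbers_in(text)                                  # one list for everything
rows = [numbers_in(line) for line in text.splitlines()]  # one list per line
print(flat)
print(rows)
```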

英雄似剑 2024-08-18 05:13:25

IMHO, the most general double-quote stripper is this (after import csv):

In [1]: s = '1 " 1 2" 0 a "3 4 5 " 6'
In [2]: [i[0].strip() for i in csv.reader(s, delimiter=' ') if i != ['', '']]
Out[2]: ['1', '1 2', '0', 'a', '3 4 5', '6']