Python: ignoring the first character (a tab) of each line when reading

Posted 2024-11-04 08:08:05 · 711 words · 5 views · 0 comments

This is a continuation of my previous questions (check them if you are curious).
I can already see the light at the end of the tunnel, but I've got a last problem.

For some reason, every line starts with a TAB character.
How can I ignore that first character ("tab" (\t) in my case)?

filename = "terem.txt"

OraRend = collections.namedtuple('OraRend', 'Nap, OraKezdese, OraBefejezese, Azonosito, Terem, OraNeve, Emelet')


csv.list_dialects()
for line in csv.reader(open(filename, "rb"), delimiter='\t', lineterminator='\t\t', doublequote=False, skipinitialspace=True):
    print line  
    orar = OraRend._make(line) # Here comes the trouble!

The text file:
http://pastebin.com/UYg4P4J1
(Can't really paste it here with all the tabs.)

I have found lstrip, strip and other methods, but all of them would eat every such character rather than only the first one, so filling the tuple would fail.

Comments (4)

甜扑 2024-11-11 08:08:05

You could do line = line[1:] to strip just the first character. If you do this, add an assertion that the first character is indeed a tab, so that lines without a leading tab are not mangled.

There is an easier alternative that also handles several other cases and doesn't break anything if the characters to be removed aren't there. You can strip all leading and trailing whitespace with line = line.strip(). Alternatively, use .lstrip() to strip only leading whitespace, and pass '\t' as the argument to either method if you want to leave other whitespace in place and remove only tabs.
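A minimal sketch of the guarded slicing approach described above, written for Python 3 (the question itself uses Python 2). The helper name drop_leading_tab and the sample line are made up for illustration, and a ValueError stands in for the assertion the answer mentions.

```python
def drop_leading_tab(line):
    """Remove exactly one leading tab; fail loudly if it is missing."""
    if not line.startswith("\t"):
        # Hypothetical policy: refuse to touch lines that lack the tab.
        raise ValueError("expected a leading tab: %r" % line)
    return line[1:]


# Made-up sample line; the real file contents are only linked, not shown.
sample = "\tMonday\t08:00\t09:30\n"
cleaned = drop_leading_tab(sample)
print(repr(cleaned))
```

Raising an error rather than silently returning the line unchanged is a design choice; returning the line as-is would be the more forgiving variant.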

温暖的光 2024-11-11 08:08:05

To remove the first character from a string:

>>> s = "Hello"
>>> s
'Hello'
>>> s[1:]
'ello'
撕心裂肺的伤痛 2024-11-11 08:08:05

From the docs:

str.lstrip([chars])

Return a copy of the string with leading characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped.

If you want to only remove the tab at the beginning of a line, use

str.lstrip("\t")

This has the benefit that you don't have to check that the first character is, in fact, a tab. However, if a line can start with more than one tab and you want to keep everything from the second tab onward, you will have to use str[1:] instead.
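The difference between the two approaches shows up when the first real field is empty, so the line starts with two tabs. The sample line below is invented for illustration, and the sketch assumes Python 3.

```python
# A made-up line: one unwanted leading tab, then an empty first field.
line = "\t\tB\tC"

stripped = line.lstrip("\t")  # removes every leading tab
sliced = line[1:]             # removes only the first character

fields_stripped = stripped.split("\t")
fields_sliced = sliced.split("\t")

print(fields_stripped)  # ['B', 'C']
print(fields_sliced)    # ['', 'B', 'C']
```

With lstrip("\t") the empty field disappears and the remaining values shift left, which is the kind of failure the question describes when filling the tuple.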

幻想少年梦 2024-11-11 08:08:05

Consider this. You don't need to pass a "file" to csv.reader. Any iterable that yields the lines as strings works just as well, so you can clean each line before the reader sees it.

import collections
import csv

filename = "terem.txt"

OraRend = collections.namedtuple('OraRend', 'Nap, OraKezdese, OraBefejezese, Azonosito, Terem, OraNeve, Emelet')

# Python 2 syntax, as in the question.
with open(filename, "rb") as source:
    # Strip leading whitespace from every line before csv.reader parses it.
    cleaned = (line.lstrip() for line in source)
    rdr = csv.reader(cleaned, delimiter='\t', lineterminator='\t\t', doublequote=False, skipinitialspace=True)
    for line in rdr:
        print line
        orar = OraRend._make(line)
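
As a rough Python 3 sketch of the same idea, the generator below removes only the first tab of each line and hands the result to csv.reader. The Row fields, the helper name without_leading_tab, and the in-memory sample data are all assumptions for illustration; the real file is only linked in the question.

```python
import collections
import csv
import io

# Hypothetical three-field record; the question's OraRend has seven fields.
Row = collections.namedtuple("Row", "day start end")

# Made-up stand-in for the file, with a leading tab on every line.
raw = io.StringIO("\tMonday\t08:00\t09:30\n\tTuesday\t10:00\t11:30\n")


def without_leading_tab(lines):
    """Yield each line with exactly one leading tab removed, if present."""
    for line in lines:
        yield line[1:] if line.startswith("\t") else line


# csv.reader accepts any iterable of strings, not only file objects.
reader = csv.reader(without_leading_tab(raw), delimiter="\t")
rows = [Row._make(fields) for fields in reader]

for row in rows:
    print(row)
```

For a real file in Python 3, open(filename, newline="") would take the place of the StringIO object.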