Python: how to grab a certain number of lines after a match
Let's say I have an input text file of the following format:
Section1 Heading Number of lines: n1
Line 1
Line 2
...
Line n1
Maybe some irrelevant lines
Section2 Heading Number of lines: n2
Line 1
Line 2
...
Line n2
where certain sections of the file start with a header line that specifies how many lines are in that section. Each section heading has a different name.
I have written a regular expression that will match the header line based on the header name the user searches for each section, parse it, and then return the number n1/n2/etc that tells me how many lines are in the section. I have been trying to use a for-in loop to read through each line until a counter reaches n1, but it hasn't worked out so far.
Here's my question: how do I return just a certain number of lines following a matched line when that number is given in the match and different for each section? I'm new to programming, and I appreciate any help.
EDIT: Okay, here's the relevant code that I have so far:
import re
print
fname = raw_input("Enter filename: ")
toolname = raw_input("Enter toolname: ")

def findcounter(fname, toolname):
    logfile = open(fname, "r")
    pat = 'SUCCESS Number of lines :'
    #headers all have that format
    for line in logfile:
        if toolname in line:
            if pat in line:
                s=line

    pattern = re.compile(r"""(?P<name>.*?)     #starting name
                         \s*SUCCESS                #whitespace and success
                         \s*Number\s*of\s*lines    #whitespace and strings
                         \s*\:\s*(?P<n1>.*)""",re.VERBOSE)
    match = pattern.match(s)
    name = match.group("name")
    n1 = int(match.group("n1"))
    #after matching line, I attempt to loop through the next n1 lines
    lcount = 0
    for line in logfile:
        if line == match:
            while lcount <= n1:
                match.append(line)
                lcount += 1
    return result
The file itself is pretty long, and there are lots of irrelevant lines interspersed between the sections I'm interested in. What I'm not too sure about is how to specify printing the lines directly after a matched line.
You can put logic like this in a generator:
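The answer's code did not survive on this page, so what follows is only a minimal sketch of what such a generator might look like, not the answerer's original. It assumes headers follow the `<name> SUCCESS Number of lines : <n>` layout taken from the question's regular expression. The names `HEADER` and `section_lines` are made up for illustration.

```python
import re
from itertools import islice

# Assumed header layout, borrowed from the question's own pattern:
#   "<name> SUCCESS Number of lines : <n>"
HEADER = re.compile(
    r"(?P<name>.*?)\s*SUCCESS\s*Number\s*of\s*lines\s*:\s*(?P<n>\d+)"
)


def section_lines(lines, toolname):
    """Yield (name, body) for every section whose header mentions toolname.

    `lines` can be any iterable of strings, e.g. an open file object.
    (Hypothetical helper, not part of the original answer.)
    """
    lines = iter(lines)  # one shared iterator: islice below advances it too
    for line in lines:
        if toolname not in line:
            continue  # irrelevant line or a different section's header
        match = HEADER.match(line)
        if match is None:
            continue  # mentions the tool name but is not a header line
        count = int(match.group("n"))
        # Pull exactly `count` lines that follow the header; fewer are
        # returned if the input ends early.
        body = [text.rstrip("\n") for text in islice(lines, count)]
        yield match.group("name"), body
```

To use it on a file, pass the open file object, for example `for name, body in section_lines(open(fname), toolname): ...`. Because the header scan and `islice` draw from the same iterator, the scan resumes on the line right after each section body, so no separate counter is needed.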