Read a file with Python and check whether a specific string is present

Posted on 2024-09-13 14:29:02

I have a file in the following format

Summary;None;Description;Emails\nDarlene\nGregory Murphy\nDr. Ingram\n;DateStart;20100615T111500;DateEnd;20100615T121500;Time;20100805T084547Z
Summary;Presence tech in smart energy management;Description;;DateStart;20100628T130000;DateEnd;20100628T133000;Time;20100628T055408Z
Summary;meeting;Description;None;DateStart;20100629T110000;DateEnd;20100629T120000;Time;20100805T084547Z
Summary;meeting;Description;None;DateStart;20100630T090000;DateEnd;20100630T100000;Time;20100805T084547Z
Summary;Balaji Viswanath: Meeting;Description;None;DateStart;20100712T140000;DateEnd;20100712T143000;Time;20100805T084547Z
Summary;Government Industry Training:  How Smart is Your City - The Smarter City Assessment Tool\nUS Call-In Information:  1-866-803-2143\,     International Number:  1-210-795-1098\,     International Toll-free Numbers:  See below\,     Passcode:  6785765\nPresentation Link - Copy and paste URL into web browser:  http://w3.tap.ibm.com/medialibrary/media_view?id=87408;Description;International Toll-free Numbers link - Copy and paste this URL into your web browser:\n\nhttps://w3-03.sso.ibm.com/sales/support/ShowDoc.wss?docid=NS010BBUN-7P4TZU&infotype=SK&infosubtype=N0&node=clientset\,IA%7Cindustries\,Y&ftext=&sort=date&showDetails=false&hitsize=25&offset=0&campaign=#International_Call-in_Numbers;DateStart;20100811T203000;DateEnd;20100811T213000;Time;20100805T084547Z

Now I need to create a function that does the following:

The function argument would specify which line to read, and let's say I have already done line.split(';').

See if there is "meeting" or "call in number" anywhere in line[1], and see if there is "meeting" or "call in number" anywhere in line[2]. If either or both of these are true, the function should return "call-in meeting". Otherwise it should return "None Inferred".

Thanks in advance

星星的轨迹 2024-09-20 14:29:02

Use the in operator to see if there is a match:

with open("file") as f:
    for line in f:
        if "string" in line:
            ...  # handle the matching line here
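
To test for several phrases at once, the same in check can be wrapped in any(). This is only a minimal sketch, assuming the match should ignore case; the name contains_any and the KEYWORDS tuple are made up for illustration and are not from the question.

```python
KEYWORDS = ("meeting", "call in number")  # phrases taken from the question

def contains_any(text, keywords=KEYWORDS):
    """Return True if any keyword occurs anywhere in text, ignoring case."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)

print(contains_any("Balaji Viswanath: Meeting"))                 # True
print(contains_any("Presence tech in smart energy management"))  # False
```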

孤君无依 2024-09-20 14:29:02

vlad003 is right: if there are newline characters inside the records, each record will span several lines of the file! In this case, I would split on "Summary" instead:

from itertools import islice

def chunks( filePath ):
    """Since you have newline characters in each section,
    you can't read each line in turn. This function reads
    lines of the file and splits them into chunks, restarting
    each time 'Summary' starts a line."""
    with open( filePath ) as theFile:
        chunk = [ ]
        for line in theFile:
            if line.startswith( "Summary" ):
                if chunk: yield chunk
                chunk = [ line ]
            else:
                chunk.append( line )
        if chunk: yield chunk  # skip the final yield if the file was empty

def nth(iterable, n, default=None):
    "Gets the nth element of an iterator."
    return next(islice(iterable, n, None), default)

def getStatus( filePath, chunkNum ):
    "Get the nth chunk of the file, split it by ';', and return the status."
    lines = nth( chunks( filePath ), chunkNum )
    if lines is None:
        raise IndexError( "could not get the right chunk" )
    # a chunk is a list of lines; join them before splitting into fields
    chunk = "".join( lines ).split( ";" )
    if "meeting" in chunk[ 1 ].lower() or "call in number" in chunk[ 1 ].lower():
        return "call-in meeting"
    else:
        return "None Inferred"

Note that this is silly if you plan to read all the chunks of the file, since it opens the file and reads through it once per query. If you plan to do this often, it would be worth parsing it into a better data format (e.g. an array of statuses). This would require one pass through the file, and give you much better lookups.
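
A rough sketch of that one-pass idea, under the same assumption as above that every record starts on a line beginning with "Summary". The names status_of and load_statuses are hypothetical, and only the summary field (index 1 after splitting on ";") is checked, as in getStatus.

```python
KEYWORDS = ("meeting", "call in number")

def status_of(record):
    """Classify one record (the full text of a 'Summary;...' entry)."""
    fields = record.split(";")
    summary = fields[1].lower() if len(fields) > 1 else ""
    if any(keyword in summary for keyword in KEYWORDS):
        return "call-in meeting"
    return "None Inferred"

def load_statuses(file_path):
    """Read the file once and return a list with one status per record."""
    records = []
    with open(file_path) as the_file:
        for line in the_file:
            if line.startswith("Summary") or not records:
                records.append(line)   # a new record starts here
            else:
                records[-1] += line    # continuation of the previous record
    return [status_of(record) for record in records]
```

After statuses = load_statuses("file"), looking up record n is just statuses[n], with no further file reads.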

独守阴晴ぅ圆缺 2024-09-20 14:29:02

Building on ghostdog74's answer:

def finder(line):
    '''Takes line number as argument. First line is number 0.'''
    with open('/home/vlad/Desktop/file.txt') as f:
        lines = f.read().split('Summary')[1:]
        searchLine = lines[line]
        if 'meeting' in searchLine.lower() or 'call in number' in searchLine.lower():
            return 'call-in meeting'
        else:
            return 'None Inferred'

I don't quite understand what you meant by line[1] and line[2] so this is the best I could do.

EDIT: Fixed the problem with the \n's. I figure since you're searching for the meeting and call in number you don't need the Summary so I used it to split the lines.
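
One possible reading of line[1] and line[2], sketched for a record that has already been split on ";". This is a guess at the intent: index 1 holds the summary text, and because index 2 is the literal label "Description", the description text is taken from index 3. The name infer_status and the KEYWORDS tuple are made up for illustration.

```python
KEYWORDS = ("meeting", "call in number")

def infer_status(fields):
    """fields is one record already split on ';'."""
    summary = fields[1] if len(fields) > 1 else ""
    description = fields[3] if len(fields) > 3 else ""
    # check each field separately so a phrase cannot match across the two
    for part in (summary, description):
        lowered = part.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            return "call-in meeting"
    return "None Inferred"

record = ("Summary;Balaji Viswanath: Meeting;Description;None;"
          "DateStart;20100712T140000;DateEnd;20100712T143000;Time;20100805T084547Z")
print(infer_status(record.split(";")))  # call-in meeting
```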
