How do I match a word in a text file using Python?

Posted 2024-10-20 13:16:02

I want to search for and match a particular word in a text file.

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        if word in line:
            print(line)

This code also returns words that merely contain the target word as a substring. For example, if the word is "there", the search returns "there", "therefore", "thereby", etc.

I want the code to return only the lines that contain "there". Period.


岁月染过的梦 2024-10-27 13:16:02

import re

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        # ^ and $ anchor the match, so the line must consist of "there" alone
        if re.search(r'^there$', line, re.I):
            print(line)

The re.search function scans the string line and returns a match object (which is truthy) if it finds the regular expression given as the first argument, or None otherwise; re.I makes the match case-insensitive. The ^ character means "beginning of the line" and the $ character means "end of the line" ($ also matches just before a trailing newline, so lines read from a file still qualify). The search therefore succeeds only when "there" is preceded by the beginning of the line and followed by the end of the line, i.e. when it stands on its own.
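
The anchored pattern can be checked without touching a file by running it over a few in-memory lines. This is only a minimal sketch: the helper name is_exact_word and the sample lines are made up for illustration and are not part of the original answer.

```python
import re

def is_exact_word(line, word="there"):
    """Return True when the line consists of `word` and nothing else."""
    # re.escape guards against regex metacharacters in `word`;
    # $ also matches just before a trailing newline.
    pattern = "^" + re.escape(word) + "$"
    return re.search(pattern, line, re.I) is not None

# Hypothetical sample data standing in for the contents of wordlist.txt
sample_lines = ["there\n", "therefore\n", "There\n", "over there\n"]
matches = [line.strip() for line in sample_lines if is_exact_word(line)]
print(matches)
```

Note that "over there" is rejected as well, so this approach suits a file with one word per line rather than free-form text.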


岁月染过的梦 2024-10-27 13:16:02

Split the line into tokens: if word in line.split():
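
Spelled out as a full loop, the split approach might look like the sketch below. The function name lines_with_word and the sample lines are hypothetical. One caveat worth stating: str.split() with no arguments splits on whitespace only, so a token with attached punctuation such as "there," would not match.

```python
def lines_with_word(lines, word):
    """Yield lines in which `word` appears as a whitespace-delimited token."""
    for line in lines:
        # split() with no arguments splits on any run of whitespace,
        # so the trailing newline never ends up inside a token.
        if word in line.split():
            yield line

# Hypothetical sample data standing in for the contents of wordlist.txt
sample_lines = ["there\n", "therefore\n", "go over there now\n", "there, it is\n"]
found = list(lines_with_word(sample_lines, "there"))
print(found)
```

Since a file object is itself an iterable of lines, the same function should also accept the searchfile handle from the question in place of the sample list.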


随风而去 2024-10-27 13:16:02

You can always use a regex, something along the lines of:

import re

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        if re.search(r'\sthere\s', line, re.M|re.I):
            print(line)

\sthere\s - any whitespace character, followed by "there", followed by any whitespace character
re.I - case-insensitive matching
re.M - doesn't really matter in this case (since each line has only one \n)
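
One limitation of \sthere\s is that it needs whitespace on both sides, so a line that begins with "there" or has it followed by punctuation is skipped. A common alternative is the \b word-boundary assertion from Python's re module. The following is a minimal sketch of that swapped-in technique, not the answer's own method; the helper name contains_word and the sample lines are made up for illustration.

```python
import re

def contains_word(line, word="there"):
    """Return True when `word` occurs in `line` as a whole word."""
    # \b matches between a word character and a non-word character
    # (or at the start/end of the string), so no surrounding
    # whitespace is required.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, line, re.I) is not None

# Hypothetical sample data standing in for the contents of wordlist.txt
sample_lines = ["there\n", "therefore\n", "Is anyone there?\n", "thereby\n"]
whole_word = [line.strip() for line in sample_lines if contains_word(line)]
print(whole_word)
```

By comparison, re.search(r'\sthere\s', "there\n") finds nothing, because no whitespace precedes the word.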