How do I match a word in a text file using Python?

Posted 2024-10-20 13:16:02, 304 characters, 8 views, 0 comments

I want to search and match a particular word in a text file.

word = 'there'  # example target word

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        if word in line:  # substring test, so 'therefore' also matches
            print(line)

This code also returns words that merely contain the target word as a substring. For example, if the word is "there", the search returns "there", "therefore", "thereby", etc.

I want the code to return only the lines that contain "there" as a whole word. Period.

Comments (5)

岁月染过的梦 2024-10-27 13:16:02
import re

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        # ^ and $ anchor the pattern to the whole line; re.I ignores case
        if re.search(r'^there$', line, re.I):
            print(line)

The re.search function scans the string line and returns a match object (which is truthy) if it finds the regular expression given as the first argument, or None otherwise; re.I makes the match case-insensitive. The ^ character means 'beginning of the line' while the $ character means 'end of the line'. ($ also matches just before a trailing newline, so the '\n' at the end of each line read from the file does not get in the way.) Therefore, the condition is true only when there is preceded by the beginning of the line and followed by the end of the line, i.e. when it stands alone on its line.
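To make the effect of the anchors concrete, here is a minimal sketch that runs the same pattern over a few in-memory lines instead of a file. The sample_lines list is a made-up stand-in for wordlist.txt, assuming one entry per line.

```python
import re

# Hypothetical sample data standing in for the contents of wordlist.txt.
sample_lines = ["there\n", "There\n", "therefore\n", "thereby\n", "over there\n"]

# Same pattern as above: the whole line must be "there", in any letter case.
pattern = re.compile(r'^there$', re.I)

# Keep only the lines the anchored pattern accepts (newline stripped for display).
matches = [line.rstrip("\n") for line in sample_lines if pattern.search(line)]
print(matches)
```

Note that "over there" is rejected as well: the anchored pattern accepts a line only when the word is the entire line, which suits a one-word-per-line list but not running text.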

岁月染过的梦 2024-10-27 13:16:02

Split the line into tokens: if word in line.split():
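A minimal sketch of the split approach, run on made-up in-memory lines rather than a file. The helper name lines_with_token and the sample data are illustrative assumptions, not part of the answer.

```python
def lines_with_token(lines, word):
    """Return the lines in which `word` appears as a whitespace-separated token."""
    # str.split() with no arguments splits on any run of whitespace,
    # so the trailing newline never ends up inside a token.
    return [line for line in lines if word in line.split()]

# Hypothetical sample data.
sample_lines = ["there\n", "therefore\n", "go over there now\n", "Is anyone there?\n"]

result = lines_with_token(sample_lines, "there")
print(result)
```

The comparison is exact, so it is case-sensitive, and a token with punctuation attached (such as "there?") does not count as a match.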

随风而去 2024-10-27 13:16:02

You can always use a regex, something along the lines of:

import re

with open('wordlist.txt', 'r') as searchfile:
    for line in searchfile:
        if re.search(r'\sthere\s', line, re.M|re.I):
            print(line)

  • \sthere\s - any whitespace character, followed by 'there', followed by any whitespace character; because whitespace is required on both sides, 'there' at the very start of a line or next to punctuation is not matched
  • re.I - means case insensitive
  • re.M - has no effect here, since it only changes how ^ and $ behave and the pattern uses neither
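As a possible variation on the pattern above (not something this answer proposes), the word-boundary assertion \b can stand in for \s. It matches at the edge of a word without consuming a character, so the word is also found at the start of a line or next to punctuation. A rough sketch, with a hypothetical helper name and made-up sample lines:

```python
import re

def lines_with_word(lines, word):
    """Return the lines that contain `word` as a whole word, ignoring case."""
    # re.escape guards against regex metacharacters inside `word`;
    # \b matches at a transition between word and non-word characters.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.I)
    return [line for line in lines if pattern.search(line)]

# Hypothetical sample data.
sample_lines = ["there\n", "therefore\n", "Is anyone there?\n", "There it is\n", "thereby\n"]

result = lines_with_word(sample_lines, "there")
print(result)
```

One caveat: \b treats letters, digits, and the underscore as word characters, so "there_1" would not match, while "there's" would.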
歌入人心 2024-10-27 13:16:02

You ought to use a regular expression. The Regular Expression HOWTO in the Python docs is a good place to start.

栀子花开つ 2024-10-27 13:16:02

Look up the re module (regular expressions). re.search with the regex ' there ' is what you want.
