Matching "~" at the end of a filename with a Python regular expression
I'm working on a Python script that finds some files. I compare the file names against a regular expression pattern. Now I have to find files ending with a "~" (tilde), so I built this regex:
    if re.match("~$", string_test):
        print("ok!")
Well, Python doesn't seem to recognize the regex, and I don't know why. I tried the same regex in other languages and it works perfectly. Any idea?
PS: I read on a website that I have to insert
    # -*- coding: utf-8 -*-
but it doesn't help :(
Thanks a lot. Meanwhile, I'm going to keep reading to see if I find something.
Comments (4)
re.match() is only successful if the regular expression matches at the beginning of the input string. To search for a match anywhere in the string, use re.search() instead.
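
The difference between re.match() and re.search() can be sketched as follows. The file names are made up for illustration, and string_test from the question is replaced by a small list:

```python
import re

# Hypothetical file names, chosen only for illustration.
names = ["notes.txt~", "notes.txt", "~"]

# re.match() anchors at the start of the string, so "~$" only fits a lone "~".
matched = [n for n in names if re.match(r"~$", n)]

# re.search() scans the whole string, so "~$" finds a trailing tilde anywhere.
searched = [n for n in names if re.search(r"~$", n)]

print(matched)   # ['~']
print(searched)  # ['notes.txt~', '~']
```
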
Your regex will only match the strings "~" and (believe it or not) "~\n".
You need re.match(r".*~$", whatever). That means zero or more of (anything except a newline), followed by a tilde, followed by (end of string, or a newline immediately preceding the end of string).
In the unlikely event that a filename can include a newline, use the re.DOTALL flag and use \Z instead of $.
As for it having "worked" in other languages: you must have used a search function there.
The r at the beginning of a string constant means raw escapes: '\n' is a newline, but r'\n' is two characters, a backslash followed by n, which can also be written as '\\n'. Raw strings save a lot of \\ in regexes; one should use r"regex" automatically.
BTW: in this case, avoid the regex confusion and use whatever.endswith('~').
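
A rough comparison of the three approaches mentioned in this answer. The helper names and sample strings are hypothetical, picked to show the newline edge cases:

```python
import re

def ends_with_tilde_loose(name):
    # "$" also matches just before a trailing newline, and "." stops at newlines.
    return re.match(r".*~$", name) is not None

def ends_with_tilde_strict(name):
    # re.DOTALL lets "." cross newlines; \Z matches only at the very end.
    return re.match(r".*~\Z", name, re.DOTALL) is not None

def ends_with_tilde_plain(name):
    # No regex at all: a plain suffix test.
    return name.endswith("~")

samples = ["draft.txt~", "draft.txt", "draft~\n", "odd\nname~"]
for s in samples:
    print(repr(s),
          ends_with_tilde_loose(s),
          ends_with_tilde_strict(s),
          ends_with_tilde_plain(s))
```

The strict regex and endswith() agree on every sample here; the loose pattern differs only for names that contain a newline.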
For finding files, use glob instead.

The correct regex and the glob solution have already been posted. Another option is to use the fnmatch module. This is a tiny bit easier than using a regex. Note that all methods posted here are essentially equivalent: fnmatch is implemented using regular expressions, and glob in turn uses fnmatch.
Note that only in 2009 was a patch added to fnmatch (after six years!) that added support for file names with newlines.
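
A minimal sketch of the fnmatch and glob approaches. The file names are invented, and a temporary directory stands in for whatever directory the script actually scans:

```python
import fnmatch
import glob
import os
import tempfile

# Hypothetical file names, chosen only for illustration.
names = ["notes.txt~", "notes.txt", "todo~"]

# fnmatch.fnmatch() tests a single name against a shell-style pattern.
backups = [n for n in names if fnmatch.fnmatch(n, "*~")]

# glob.glob() applies the same kind of pattern to real directory entries.
with tempfile.TemporaryDirectory() as tmp:
    for n in names:
        open(os.path.join(tmp, n), "w").close()
    # glob.escape() guards against special characters in the directory path.
    pattern = os.path.join(glob.escape(tmp), "*~")
    found = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(backups)  # ['notes.txt~', 'todo~']
print(found)    # ['notes.txt~', 'todo~']
```
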