Alternative to the `match = re.match(); if match: ...` idiom?
If you want to check whether something matches a regex and, if so, print the first group, you do:
    import re
    match = re.match(r"(\d+)g", "123g")
    if match is not None:
        print(match.group(1))
This is completely pedantic, but the intermediate `match` variable is a bit annoying.
Languages like Perl do this by creating new `$1`..`$9` variables for match groups, like:
    if ($blah =~ /(\d+)g/) {
        print $1;
    }
From this reddit comment:
    with re_context.match('^blah', s) as match:
        if match:
            ...
        else:
            ...
I thought this was an interesting idea, so I wrote a simple implementation of it:
    #!/usr/bin/env python2.6
    import re

    class SRE_Match_Wrapper:
        """Context manager that hands the wrapped match object (or None) to `with`."""
        def __init__(self, match):
            self.match = match

        def __enter__(self):
            return self.match

        def __exit__(self, type, value, tb):
            pass

        def __getattr__(self, name):
            # Delegate every other attribute to the underlying match object.
            return getattr(self.match, name)

    def rematch(pattern, inp):
        matcher = re.compile(pattern)
        return SRE_Match_Wrapper(matcher.match(inp))

    if __name__ == '__main__':
        # Example:
        with rematch(r"(\d+)g", "123g") as m:
            if m:
                print(m.group(1))
        with rematch(r"(\d+)g", "123") as m:
            if m:
                print(m.group(1))
(This functionality could theoretically be patched into the `_sre.SRE_Match` object.)
It would be nice if you could skip the execution of the `with` statement's code block when there was no match, which would simplify this to:
    with rematch(r"(\d+)g", "123") as m:
        print(m.group(1))  # only executed if the match occurred
But this seems impossible, based on what I can deduce from PEP 343.
Any ideas? As I said, this is a really trivial annoyance, almost to the point of being code golf.
I don't think using `with` is the solution in this case. You'd have to raise an exception in the BLOCK part (which is specified by the user) and have the `__exit__` method return `True` to "swallow" the exception. So it would never look good.
I'd suggest going for a syntax similar to the Perl syntax. Make your own extended `re` module (I'll call it `rex`) and have it set variables in its module namespace:
As you can see in the comments below, this method is neither scope- nor thread-safe. You would only use this if you were completely certain that your application wouldn't become multi-threaded in the future and that any functions called from the scope that you're using this in will also use the same method.
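A minimal sketch of what a `rex`-style module might look like, assuming the state is kept in a module-level global. The names `match`, `group`, and `_last` are hypothetical and chosen only for illustration; in a real module you would call them as `rex.match(...)` and `rex.group(...)`.

```python
import re

# Hypothetical stand-in for the body of a `rex` module.
# The last match is stored in a module-level global, which is exactly
# why this approach is neither scope-safe nor thread-safe.
_last = None

def match(pattern, string, flags=0):
    """Run re.match, remember the result, and return True on success."""
    global _last
    _last = re.match(pattern, string, flags)
    return _last is not None

def group(n=0):
    """Return group n of the most recent successful match (like Perl's $1..$9)."""
    return _last.group(n)

if match(r"(\d+)g", "123g"):
    print(group(1))
```

Any other call to `match` between the test and the use of `group` silently overwrites `_last`, which illustrates the caveat about scope and threads.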
This is not really pretty-looking, but you can profit from the `getattr(object, name[, default])` built-in function using it like this:
To mimic the if match print group flow, you can (ab)use the `for` statement this way:
Of course you can define a little function to do the dirty work:
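The three ideas above could be sketched roughly as follows. This is an illustration under assumptions, not the answer's original code; the helper name `matches` is hypothetical.

```python
import re

# 1. getattr with a default: if re.match() returns None, fall back to a
#    callable that returns None instead of raising AttributeError.
first = getattr(re.match(r"(\d+)g", "123g"), "group", lambda n: None)(1)

# 2. and 3. (Ab)using `for` through a small helper that returns a list
#    holding zero or one match objects.
def matches(pattern, string):
    """Hypothetical helper: [match] on success, [] on failure."""
    m = re.match(pattern, string)
    return [m] if m is not None else []

collected = []
for m in matches(r"(\d+)g", "123g"):
    collected.append(m.group(1))  # body runs once
for m in matches(r"(\d+)g", "123"):
    collected.append(m.group(1))  # body is skipped, no match
```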
Not a perfect solution, but it does allow you to chain several match options for the same string:
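One way such chaining might work is a small wrapper that binds the string once and remembers the last match. The class `StringMatcher` and the function `describe` are hypothetical names for this sketch, and the patterns are made up for the example.

```python
import re

class StringMatcher:
    """Hypothetical wrapper: binds one string and remembers the last match."""
    def __init__(self, string):
        self.string = string
        self.last = None

    def match(self, pattern):
        # Store the result so it can be read after the condition succeeds.
        self.last = re.match(pattern, self.string)
        return self.last is not None

    def group(self, n=0):
        return self.last.group(n)

def describe(text):
    s = StringMatcher(text)
    if s.match(r"(\d+)g"):
        return "grams: " + s.group(1)
    elif s.match(r"(\d+)kg"):
        return "kilograms: " + s.group(1)
    return "no match"
```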
Here's my solution:
You can use as many elif clauses as needed.
Even better:
Both append and update return None. So you have to actually check the result of your expression by using the or part in every case.
Unfortunately, this only works as long as the code resides top-level, i.e. not in a function.
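A sketch of the `append ... or ...` trick described above, assuming a plain list is used as the holder. Because `list.append` returns `None`, the `or` part is what actually decides the condition. The function name `classify` and the patterns are made up for illustration. This list-based variant also works inside a function; the top-level restriction presumably applies to a variant that updates a namespace dictionary, which is a guess on my part.

```python
import re

def classify(s):
    found = []
    # append() returns None (falsy), so `or found[-1]` is always evaluated
    # and yields the match object just stored, or None.
    if found.append(re.match(r"w\w+", s)) or found[-1]:
        return "W: " + found[-1].group(0)
    elif found.append(re.match(r"h\w+", s)) or found[-1]:
        return "H: " + found[-1].group(0)
    return "no match"
```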
This is what I do:
That is, I pass a list to the function to emulate pass-by-reference.
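A minimal sketch of the list-as-out-parameter idea, assuming the helper stores the match object in the first slot of the list. The names `match_into` and `first_number` are hypothetical.

```python
import re

def match_into(pattern, string, out):
    """Hypothetical helper: store the match in out[0], return True on success."""
    out[:] = [re.match(pattern, string)]  # mutate the caller's list in place
    return out[0] is not None

def first_number(text):
    m = []
    if match_into(r"(\d+)g", text, m):
        return m[0].group(1)
    return None
```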
I don't think it's trivial. I don't want to have to sprinkle a redundant conditional around my code if I'm writing code like that often.
This is slightly odd, but you can do this with an iterator:
The odd thing is that it's using an iterator for something that isn't iterating--it's closer to a conditional, and at first glance it might look like it's going to yield multiple results for each match.
It does seem odd that a context manager can't cause its managed function to be skipped entirely; while that's not explicitly one of the use cases of "with", it seems like a natural extension.
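The iterator approach could be sketched with a generator that yields the match object at most once. The names `rematch_iter` and `grams` are assumptions made for this example.

```python
import re

def rematch_iter(pattern, string):
    """Yield the match object once if the pattern matches, otherwise nothing."""
    m = re.match(pattern, string)
    if m is not None:
        yield m

def grams(text):
    results = []
    for m in rematch_iter(r"(\d+)g", text):
        results.append(m.group(1))  # runs zero or one times
    return results
```

The loop body plays the role of the `if` block, which is why it can look misleading at first glance.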
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572) (the `:=` operator), we can now capture the condition value `re.match(r'(\d+)g', '123g')` in a variable `match` in order to both check that it is not `None` and then re-use it within the body of the condition:
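A short sketch of the assignment-expression form, wrapped in a hypothetical function `first_group` so the result is easy to check. It requires Python 3.8 or later.

```python
import re

def first_group(text):
    # The walrus operator binds the match and tests it in one expression.
    if (match := re.match(r"(\d+)g", text)) is not None:
        return match.group(1)
    return None

print(first_group("123g"))
```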
Another nice syntax would be something like this:
I have another way of doing this, based on Glen Maynard's solution:
Similar to Glen's solution, this iterates either 0 (if no match) or 1 (if a match) times.
No sub needed, but less tidy as a result.
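One way to get the zero-or-one iteration without a helper function is an inline list comprehension that filters out a failed match. This is a sketch under that assumption; the wrapping function `grams_inline` exists only to make the example testable.

```python
import re

def grams_inline(text):
    results = []
    # The inner list holds the match or None; the comprehension drops None,
    # so the loop body runs once on success and not at all on failure.
    for m in [m for m in [re.match(r"(\d+)g", text)] if m]:
        results.append(m.group(1))
    return results
```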
If you're doing a lot of these in one place, here's an alternative answer:
You can compile the regex once with the same thread safety as re, create a single reusable Matcher object for the whole function, and then you can use it very concisely. This also has the benefit that you can reverse it in the obvious way--to do that with an iterator, you'd need to pass a flag to tell it to invert its result.
It's not much help if you're only doing a single match per function, though; you don't want to keep Matcher objects in a broader context than that; it'd cause the same issues as Blixt's solution.
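A rough sketch of the reusable Matcher idea, assuming a compiled pattern object writes its result into a holder that the caller creates per function. The class names `Matcher` and `Pattern`, the constant `GRAMS`, and the function `parse` are hypothetical.

```python
import re

class Matcher:
    """Hypothetical holder for the most recent match result."""
    def __init__(self):
        self.result = None

    def __getattr__(self, name):
        # Only called for attributes not found normally, e.g. group().
        return getattr(self.result, name)

class Pattern:
    """Compiles the regex once; match() stores its result in a Matcher."""
    def __init__(self, expr):
        self.regex = re.compile(expr)

    def match(self, matcher, string):
        matcher.result = self.regex.match(string)
        return matcher.result is not None

GRAMS = Pattern(r"(\d+)g")  # shared and compiled once

def parse(text):
    m = Matcher()  # local to the function, so no shared mutable state
    if not GRAMS.match(m, text):  # inverting the test is straightforward
        return "no match"
    return m.group(1)
```

Keeping the `Matcher` local to the function is what avoids the scope and thread issues of the module-level approach.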