Extracting content between a regex in Python?

Posted on 2024-12-11 13:07:29

Is there a simple method to pull content between a regex? Assume I have the following sample text:

 SOME TEXT [SOME MORE TEXT] value="ssss" SOME MORE TEXT

My regex is:

 compiledRegex = re.compile('\[.*\] value=("|\').*("|\')')

This will obviously return the entire [SOME MORE TEXT] value="ssss"; however, I only want ssss to be returned, since that's what I'm looking for.

I can obviously define a parser function, but I feel as if Python provides some simple, Pythonic way to do such a task.

り繁华旳梦境 2024-12-18 13:07:29

This is what capturing groups are designed to do.

import re
compiledRegex = re.compile(r'\[.*\] value=(?:"|\')(.*)(?:"|\')')
# search() rather than match(): match() only matches at the start of the string
matches = compiledRegex.search(sampleText)
capturedGroup = matches.group(1) # grab contents of first group

The ?: inside the old groups (the parentheses) means that the group is now a non-capturing group; that is, it won't be accessible as a group in the result. I converted them to keep the output simpler, but you can leave them as capturing groups if you prefer (but then you have to use matches.group(2) instead, since the first quote would be the first captured group).
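
To make the group numbering concrete, here is a minimal sketch that runs both variants of the pattern against the sample text from the question. The variable names are made up for illustration, and the sketch uses search() (an assumption on my part, not something the answer prescribes) because the pattern does not begin at the start of the text.

```python
import re

sample_text = 'SOME TEXT [SOME MORE TEXT] value="ssss" SOME MORE TEXT'

# Quotes wrapped in non-capturing groups: the value is group 1.
non_capturing = re.compile(r'\[.*\] value=(?:"|\')(.*)(?:"|\')')
# Quotes left as capturing groups: the value moves to group 2.
capturing = re.compile(r'\[.*\] value=("|\')(.*)("|\')')

m1 = non_capturing.search(sample_text)
m2 = capturing.search(sample_text)

value_from_group_1 = m1.group(1)
value_from_group_2 = m2.group(2)
all_groups = m2.groups()  # every captured group, in order

print(value_from_group_1)
print(value_from_group_2)
print(all_groups)
```

With the capturing variant, groups() also contains the two quote characters, which is the extra output the non-capturing version avoids.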

一曲琵琶半遮面シ 2024-12-18 13:07:29

Your original regex is too greedy: r'.*\]' won't stop at the first ']' and the second '.*' won't stop at '"'. To stop at a character c you could use [^c]* or the non-greedy '.*?':

import re
regex = re.compile(r"""\[[^]]*\] value=("|')(.*?)\1""")

Example

m = regex.search("""SOME TEXT [SOME MORE TEXT] value="ssss" SOME MORE TEXT""")
print(m.group(2))
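
The greediness problem only shows up when the text contains more than one candidate, so here is a small sketch on a hypothetical input with two bracketed sections and mixed quote styles. The input string and variable names are assumptions made for illustration; the two patterns are the question's original and the one from this answer.

```python
import re

# Hypothetical input: two bracketed sections, one value per quote style.
text = """[A] value="first" [B] value='second'"""

# The question's pattern: both .* are greedy.
greedy = re.compile(r"""\[.*\] value=("|')(.*)("|')""")
# This answer's pattern: [^]]* stops at the first ']', .*? stops at the
# first quote that equals the opening one (backreference \1).
lazy = re.compile(r"""\[[^]]*\] value=("|')(.*?)\1""")

# The greedy bracket part swallows everything up to the last ']'.
greedy_value = greedy.search(text).group(2)
# The restricted pattern finds each value separately.
lazy_values = [m.group(2) for m in lazy.finditer(text)]

print(greedy_value)
print(lazy_values)
```

The greedy pattern yields a single match that skips over the first value, while the restricted pattern reports both; the \1 backreference also ensures a value opened with one quote style is closed with the same one.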

About the author

初吻给了烟
