UserWarning: This pattern is interpreted as a regular expression, and has match groups

Posted on 2025-01-12 13:47:28

Given the following pandas DataFrame -

json_path	Reporting Group	Entity/Grouping	Entity ID	Adjusted Value (Today, No Div, USD)	Adjusted TWR (Current Quarter, No Div, USD)	Adjusted TWR (YTD, No Div, USD)	Annualized Adjusted TWR (Since Inception, No Div, USD)	Adjusted Value (No Div, USD)
data.attributes.total.children.[0].children.[0].children.[0]	Barrack Family	William and Rupert Trust	9957007	-1.44				-1.44
data.attributes.total.children.[0].children.[0].children.[0].children.[0]	Barrack Family	Cash	-	-1.44				-1.44
data.attributes.total.children.[0].children.[0].children.[1]	Barrack Family	Gratia Holdings No. 2 LLC	8413655	55491732.66	-0.971018847	-0.971018847	11.52490309	55491732.66
data.attributes.total.children.[0].children.[0].children.[1].children.[0]	Barrack Family	Investment Grade Fixed Income	-	18469768.6				18469768.6
data.attributes.total.children.[0].children.[0].children.[1].children.[1]	Barrack Family	High Yield Fixed Income	-	3668982.44	-0.205356545	-0.205356545	4.441190127	3668982.44

I am trying to keep only the rows whose json_path contains four occurrences of .children.[n], using the following statement:

Code: perf_by_entity_df = df[df['json_path'].str.contains(r'(\.children\.\[\d+\]){4}')]

However, I receive the following:

Error: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.

Any suggestions as to why this is happening?

留一抹残留的笑 2025-01-19 13:47:28

Use the code below to suppress the warning:

perf_by_entity_df = df[df['json_path'].str.contains(r'(?:\.children\.\[\d+\]){4}')]

Replace:

r'(\.children\.\[\d+\]){4}'

With:

r'(?:\.children\.\[\d+\]){4}'
#  ^^-- HERE: Non-capturing group
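
To check the effect of the change, here is a minimal sketch that rebuilds only the json_path column from the question and runs both patterns while recording warnings. The helper name contains_with_warnings and the exactly_four mask are my own additions for illustration, not part of the original code, and the exact warning text may differ between pandas versions.

```python
import warnings

import pandas as pd

# json_path values copied from the question; the other columns are omitted.
df = pd.DataFrame({
    "json_path": [
        "data.attributes.total.children.[0].children.[0].children.[0]",
        "data.attributes.total.children.[0].children.[0].children.[0].children.[0]",
        "data.attributes.total.children.[0].children.[0].children.[1]",
        "data.attributes.total.children.[0].children.[0].children.[1].children.[0]",
        "data.attributes.total.children.[0].children.[0].children.[1].children.[1]",
    ]
})

capturing = r"(\.children\.\[\d+\]){4}"        # pattern from the question
non_capturing = r"(?:\.children\.\[\d+\]){4}"  # pattern from the answer


def contains_with_warnings(pattern):
    """Hypothetical helper: return the boolean mask plus any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        mask = df["json_path"].str.contains(pattern)
    return mask, caught


mask_capturing, caught_capturing = contains_with_warnings(capturing)
mask_non_capturing, caught_non_capturing = contains_with_warnings(non_capturing)

# Both masks select the same rows; only the capturing pattern warns.
perf_by_entity_df = df[mask_non_capturing]
print(perf_by_entity_df)

# Assumption: if "exactly four" segments is the real requirement, counting
# the segments avoids matching deeper paths that contain five or more.
exactly_four = df["json_path"].str.count(r"\.children\.\[\d+\]") == 4
print(df[exactly_four])
```

The warning is informational: the filter already worked with the original pattern, and the non-capturing group only tells pandas that no groups need to be extracted. Note that {4} matches any path with at least four consecutive segments, so on deeper data the str.count variant may be closer to what the question describes.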

From the Python re module documentation:

(?:...)
A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
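
The difference described in that passage can be seen with the standard re module alone. This is a small sketch using one json_path value from the question; the variable names are illustrative.

```python
import re

capturing = re.compile(r"(\.children\.\[\d+\]){4}")
non_capturing = re.compile(r"(?:\.children\.\[\d+\]){4}")

path = "data.attributes.total.children.[0].children.[0].children.[1].children.[0]"

# Number of capturing groups each compiled pattern defines.
print(capturing.groups)
print(non_capturing.groups)

m_cap = capturing.search(path)
m_non = non_capturing.search(path)

# The overall match is the same for both patterns.
print(m_cap.group(0) == m_non.group(0))

# Only the capturing version keeps a retrievable substring
# (the last repetition of the group); the other keeps none.
print(m_cap.groups())
print(m_non.groups())
```

Since str.contains only needs a yes/no answer per row, nothing is lost by making the group non-capturing.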
