regex r regex-lookarounds stringr

Using quantifiers in lookarounds (R/stringr)

Posted on 2025-01-22 20:53:42

I'd like to extract the name John Doe from the following string:

str <- 'Name: |             |John Doe     |'

I can do:

library(stringr)
str_extract(str,'(?<=Name: \\|             \\|).*(?=     \\|)')
[1] "John Doe"

But that involves typing a lot of spaces, and it doesn't work well when the number of spaces is not fixed. When I try to use a quantifier (+) instead, I get an error:

str_extract(str,'(?<=Name: \\| +\\|).*(?= +\\|)')
Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
  Look-Behind pattern matches must have a bounded maximum length. (U_REGEX_LOOK_BEHIND_LIMIT, context=`(?<=Name: \| +\|).*(?= +\|)`)

The same goes for other variants:

str_extract(str,'(?<=Name: \\|\\s+\\|).*(?=\\s+\\|)') 
str_extract(str,'(?<=Name: \\|\\s{1,}\\|).*(?=\\s{1,}\\|)')

Is there a solution to this?
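
The error message itself hints at one way out: the ICU engine behind stringr only needs an upper bound on the lookbehind length, so a bounded quantifier such as {1,50} should be accepted where + is not. The sketch below assumes the padding never exceeds 50 characters (an arbitrary bound picked for illustration) and switches to a lazy .*? so that the trailing spaces are left to the lookahead. A capture group with str_match sidesteps lookarounds altogether.

```r
library(stringr)

str <- 'Name: |             |John Doe     |'

# Bounded quantifier in the lookbehind; the upper bound of 50 is an
# assumption about the maximum padding, not something from the question.
# The lazy .*? stops before the trailing spaces instead of swallowing them.
bounded <- str_extract(str, '(?<=Name: \\|\\s{1,50}\\|).*?(?=\\s+\\|)')

# Alternative without lookarounds: capture the name and take column 2
# of the matrix returned by str_match (column 1 is the full match).
captured <- str_match(str, 'Name: \\|\\s+\\|(.*?)\\s+\\|')[, 2]

print(bounded)
print(captured)
```

Note that keeping the greedy .* from the question together with (?= +\\|) would leave some trailing spaces in the result, because the lookahead is satisfied by a single space.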

愁杀 2025-01-29 20:53:42

How about:
First we remove "Name",
then we replace all non-alphanumeric characters with a space,
and finally we str_squish the result.

library(stringr)

str_squish(str_replace_all(str_remove(str, "Name"), "[^[:alnum:]]", " "))

[1] "John Doe"
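
To make the three steps easier to follow, here is a minimal sketch that runs them one at a time. The intermediate names step1 to step3 are introduced here for illustration only. The last line shows a limitation worth keeping in mind: punctuation inside a name is replaced as well.

```r
library(stringr)

str <- 'Name: |             |John Doe     |'

step1 <- str_remove(str, "Name")                      # drop the first "Name" label
step2 <- str_replace_all(step1, "[^[:alnum:]]", " ")  # colon, pipes and spaces all become spaces
step3 <- str_squish(step2)                            # trim the ends, collapse runs of spaces
print(step3)

# Caveat: apostrophes and hyphens inside a name are turned into spaces too
print(str_squish(str_replace_all("O'Brien-Smith", "[^[:alnum:]]", " ")))
```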

弱骨蛰伏 2025-01-29 20:53:42

Another solution using base R:

sub("Name: \\|\\s+\\|(.*\\S)\\s+\\|", "\\1", str)
# [1] "John Doe"
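
One thing to be aware of with sub(): when the pattern does not match, the input comes back unchanged. If a missing value is preferable in that case, the same capture group can be read with regexec() and regmatches(). This is a sketch on made-up input; the vector x and the names pattern, m and result are hypothetical.

```r
# Made-up input: two rows with different padding and one that does not match
x <- c("Name: | |John Doe |",
       "Name: |      |Jane Roe      |",
       "no match here")

pattern <- "Name: \\|\\s+\\|(.*\\S)\\s+\\|"

# regexec() records the full match and each capture group;
# regmatches() returns character(0) for elements that did not match
m <- regmatches(x, regexec(pattern, x))

# Take the first capture group, or NA when there was no match
result <- vapply(m, function(g) if (length(g) == 2) g[2] else NA_character_,
                 character(1))
print(result)
```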

篱下浅笙歌 2025-01-29 20:53:42

You might also use \K to drop everything matched so far from the reported match.

Name: \|\h+\|\K.*?(?=\h+\|)

Explanation

Name: \| Match Name: |
\h+\| Match 1+ horizontal whitespace characters and |
\K Forget what has been matched so far
.*? Match as few characters as possible
(?=\h+\|) Positive lookahead, assert 1+ horizontal whitespace characters to the right, followed by |

See a regex demo and an R demo.

Example

str <- 'Name: |             |John Doe     |'
regmatches(str, regexpr("Name: \\|\\h+\\|\\K.*?(?=\\h+\\|)", str, perl=TRUE))

Output

[1] "John Doe"
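
Because regmatches() silently drops elements that do not match, the result can be shorter than the input when this is applied to a vector. A minimal sketch of a wrapper that keeps the positions aligned follows. The function name extract_name and the sample vector are hypothetical, not part of the answer above.

```r
# Hypothetical helper: returns one value per input element, NA when nothing matches
extract_name <- function(x) {
  m <- regexpr("Name: \\|\\h+\\|\\K.*?(?=\\h+\\|)", x, perl = TRUE)
  hit <- as.integer(m) != -1L          # regexpr() reports -1 for no match
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)         # regmatches() returns matches only
  out
}

x <- c("Name: |             |John Doe     |",
       "Name: |\t|Jane Roe\t|",        # \h also matches tabs
       "no match here")
print(extract_name(x))
```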

Posted by 浅沫记忆