str_extract_all syntax

Posted 2025-02-08 11:14:32 · 1293 characters · 3 views · 0 comments

I need some help with stringr::str_extract_all

x is the name of my data frame.

V1
(A_K9B,A_K9one,A_K9two,B_U10J) 
x = x %>% 
  mutate(N_alph = map_chr(str_extract_all(x$V1, 'A_([A-Z][0-10])[A-Z]'), toString))
x = x %>% 
  mutate(N_.1 = map_chr(str_extract_all(x$V1, 'A_([A-Z][0-10])[o][n][e]'), toString))
x = x %>% 
  mutate(N_.2 = map_chr(str_extract_all(x$V1, 'A_([A-Z][0-10])[t][w][o]'), toString))

This is my current output:

V1                                N_alph  N_.1     N_.2
(A_K9B,A_K9one,A_K9two,B_U10J)   A_K9B   A_K9one  A_K9two 

I am fine with my column N_alph as is; I want it kept separate from the other two. Ideally, though, I would like to avoid typing [o][n][e] and [t][w][o] for the variables that are followed by a word rather than a single letter. If I use:

x = x %>% 
  mutate(N_alph = map_chr(str_extract_all(x$V1, 'A_([A-Z][0-10])[A-Z]'), toString))
x = x %>% 
  mutate(N_all.words = map_chr(str_extract_all(x$V1, 'A_([A-Z][0-10])[\\w+]'), toString))

Output is:

V1                                N_alph  N_all.words    
(A_K9B,A_K9one,A_K9two,B_U10J)   A_K9B   A_K9B,A_K9o,A_K9t 

Desired output would be

V1                                N_alph  N_all.words    
(A_K9B,A_K9one,A_K9two,B_U10J)   A_K9B   A_K9one,A_K9two 

Comments (1)

逆光飞翔i 2025-02-15 11:14:32

When you use metacharacters like \w, \b, \s, etc., you don't need the square brackets. If you do use square brackets, then the + has to go outside them; inside a character class it is treated as a literal plus sign. Also, the digit group should be [0-9], because a character class describes individual characters, not multi-digit numbers. To allow numbers above 9, repeat the digit class with a {} quantifier or simply the + operator. The final result looks like this:

x %>% 
  mutate(N_all.words = str_extract_all(V1, 'A_([A-Z][0-9]{1,2})\\w+'))

This results in:

                              V1             N_all.words
1 (A_K9B,A_K9one,A_K9two,B_U10J) A_K9B, A_K9one, A_K9two
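
To make the two character-class points above concrete, here is a small sketch on the sample string from the question. The variable names (`s`, `one_char`, `whole`, `digit_ok`) are only illustrative, and the patterns use [0-9] so that the effect of the brackets can be seen in isolation.

```r
library(stringr)

s <- "(A_K9B,A_K9one,A_K9two,B_U10J)"

# [\\w+] is a character class: it matches exactly ONE character that is
# either a word character or a literal "+", so each match stops after one letter.
one_char <- str_extract_all(s, "A_[A-Z][0-9][\\w+]")[[1]]
print(one_char)

# \\w+ outside the brackets means "one or more word characters".
whole <- str_extract_all(s, "A_[A-Z][0-9]\\w+")[[1]]
print(whole)

# [0-10] is the range 0-1 plus a literal 0, so it only accepts "0" or "1".
digit_ok <- str_detect(c("0", "1", "9"), "^[0-10]$")
print(digit_ok)
```

The first call reproduces the truncated `A_K9o` / `A_K9t` matches from the question, the second returns the full tokens, and the third shows that a digit such as 9 does not match `[0-10]` at all.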

I also created a version that I found a little tidier (note that \w also matches digits and underscores, so it is slightly more permissive than [A-Z]):

x %>% 
  mutate(N_all.words = str_extract_all(V1, 'A_\\w\\d{1,2}\\w+'))
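
Both patterns above still return A_K9B, while the desired output in the question lists only A_K9one and A_K9two. Without `map_chr(..., toString)`, `str_extract_all` also yields a list column rather than a single string. The sketch below is one possible way to get closer to the desired output. It assumes that the word suffixes are lowercase and the single-letter suffix is uppercase, which holds for the sample row but may not hold for the real data. It uses base `vapply` in place of `map_chr` only to keep the example free of extra packages.

```r
library(stringr)

x <- data.frame(V1 = "(A_K9B,A_K9one,A_K9two,B_U10J)")

# Assumption: "word" suffixes are lowercase, so [a-z]+ skips the
# single uppercase letter in A_K9B.
matches <- str_extract_all(x$V1, "A_[A-Z][0-9]{1,2}[a-z]+")

# Collapse each list element into one comma-separated string per row.
x$N_all.words <- vapply(matches, toString, character(1))
print(x$N_all.words)
```

If the real suffixes can be uppercase or mixed case, the character class would need to be adjusted accordingly.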