Ruby regex: split a string on matches that follow a newline or sit at the start of the string?

Posted on 2024-12-10 22:04:54

Here's my regular expression that I have for this. I'm in Ruby, which — if I'm not mistaken — uses POSIX regular expressions.

regex = /(?:\n^)(\*[\w+ ?]+\*)\n/

Here's my goal: I want to split a string with a regex that is *delimited by asterisks*, including those asterisks. However: I only want to split by the match if it is prefaced with a newline character (\n), or it's the start of the whole string. This is the string I'm working with.

"*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"

My regular expression is not splitting properly at the *Friday* match, but it is splitting at the *But break here* match (it's also throwing in a here split). My issue is somewhere in the first group, I think: (?:\n^) — I know it's wrong, and I'm not entirely sure of the correct way to write it. Can someone shed some light? Here's my complete code.

regex = /(?:\n^)(\*[\w+ ?]+\*)\n/
str = "*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"
str.split(regex)

Which results in this:

>>> ["*Friday*\nDo not *break here*", "*But break here*", "But again, not this"]

I want it to be this:

>>> ["*Friday*", "Do not *break here*", "*But break here*", "But again, not this"]

Edit #1: I've updated my regex and result. (2011/10/18 16:26 CST)
Edit #2: I've updated both again. (16:32 CST)
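
One way to read the problem above: `(?:\n^)` always requires a literal newline before the delimiter, so a delimiter at the very start of the string can never match. Below is a minimal sketch, assuming the fix is to also accept the start of the string via `\A` (this alternation is an illustration and does not come from the original post):

```ruby
str = "*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"

# Accept either the start of the string (\A) or a newline before the delimiter.
regex = /(?:\A|\n)(\*[\w+ ?]+\*)\n/

# A match at position 0 leaves an empty leading piece, so drop empty strings.
parts = str.split(regex).reject(&:empty?)

p parts
```

Because the delimiter sits in a capture group, `split` keeps it in the result, which is why the starred lines show up as their own elements.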

Comments (3)

如若梦似彩虹 2024-12-17 22:04:54

What if you just add a '\n' to the front of the string? That simplifies the processing quite a bit:

regex = /(?:\n)(\*[\w+ ?]+\*)\n/
str = "*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"

res = ("\n"+str).split(regex)
res.shift if res[0] == ""
res
=> [ "*Friday*", "Do not *break here*", 
     "*But break here*", "But again, not this"]

We have to watch for the initial extra match but it's not too bad. I suspect someone can shorten this a bit.

七禾 2024-12-17 22:04:54

Groups 1 and 2 of the regex below:

(?:\A|\n)(\*.*?\*)|(?:\A|\n)(.*?)(?=\n|\Z)

will give you your desired output. I am no Ruby expert, so you will have to create the list yourself :)
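
Building the list from those two groups could look like the following in Ruby. This is a sketch that assumes `\n` in the pattern is meant to match a real newline character (written `\n` in a Ruby regex literal), and it uses `String#scan` to collect the groups:

```ruby
str = "*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"

# Group 1: a starred span at the start of a line; group 2: any other line.
regex = /(?:\A|\n)(\*.*?\*)|(?:\A|\n)(.*?)(?=\n|\Z)/

# scan returns one [group1, group2] pair per match; the group that did not
# take part in the match is nil, so compact removes it.
parts = str.scan(regex).flatten.compact

p parts
```

One caveat with this pattern: a line that begins with a starred span and then continues (for example `*A* tail`) would only contribute the starred part, so it fits inputs shaped like the sample better than arbitrary text.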

海之角 2024-12-17 22:04:54

Why not just split at newlines? From your example, it looks like that's what you're really trying to do.

str.split("\n")
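
For the sample string, splitting on newlines does produce the desired list. The short sketch below compares it with a delimiter-based split on a second, made-up input to show where the two approaches would diverge (the `other` string and the `\A`-based regex are assumptions added for illustration):

```ruby
str = "*Friday*\nDo not *break here*\n*But break here*\nBut again, not this"
by_newline = str.split("\n")
p by_newline

# Hypothetical input with two plain lines before the starred line.
other = "intro\nmore\n*Head*\nbody"
regex = /(?:\A|\n)(\*[\w+ ?]+\*)\n/

p other.split("\n")                     # every line becomes its own piece
p other.split(regex).reject(&:empty?)   # plain lines stay joined together
```

So the plain newline split is enough when every line should be its own element, and the regex is only needed when consecutive non-starred lines must stay in one piece.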