How do I fix this multiline regular expression in Ruby?
I have a regular expression in Ruby that isn't working properly in multiline mode.
I'm trying to convert Markdown text into the Textile-esque markup used in Redmine. The problem is in my regular expression for converting code blocks. It should find any lines that start with 4 spaces or a tab, then wrap them in pre tags.
markdownText = '# header
some text that precedes code
    var foo = 9;
    var fn = function() {}
    fn();
some post text'
puts markdownText.gsub!(/(^(?:\s{4}|\t).*?$)+/m,"<pre>\n\\1\n</pre>")
Intended result:
# header
some text that precedes code
<pre>
    var foo = 9;
    var fn = function() {}
    fn();
</pre>
some post text
The problem is that the closing pre tag is printed at the end of the document instead of after "fn();". I tried some variations of the following expression but it doesn't match:
gsub!(/(^(?:\s{4}|\t).*?$)+^(\S)/m, "<pre>\n\\1\n</pre>\\2")
How do I get the regular expression to match just the indented code block? You can test this regular expression on Rubular here.
Comments (4)
First, note that 'm' multi-line mode in Ruby is equivalent to 's' single-line mode in other languages. In other words, 'm' mode in Ruby means "dot matches all".
This regex will do a pretty good job of matching a markdown-like code section:
This requires a blank line before and after the code section and allows blank lines within the code section itself. It allows for either \r\n or \n line terminations. Note that this does not strip the leading 4 spaces (or tab) before each line. Doing that will require more code complexity. (I am not a Ruby guy, so I can't help out with that.) I would recommend looking at the markdown source itself to see how it's really being done.
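To make the point about the 'm' flag concrete, here is a small sketch. The sample string and variable names are made up for illustration; the only thing it relies on is Ruby's documented behavior that 'm' lets the dot match a newline, while ^ and $ anchor at line boundaries whether or not the flag is set.

```ruby
text = "first line\nsecond line"

without_m   = text[/first.*/]     # the dot stops at the newline
with_m      = text[/first.*/m]    # the dot also matches "\n"
line_starts = text.scan(/^\w+/)   # ^ anchors at every line start, no flag needed

p without_m    # "first line"
p with_m       # "first line\nsecond line"
p line_starts  # ["first", "second"]
```

So for a pattern that is meant to work one line at a time, leaving the 'm' flag off is usually what you want in Ruby.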
/^(\s{4}|\t)+.+\;\n$/m
works a little better, but it still picks up a newline that we don't want.
Here it is on Rubular.
This is working for me with your sample input.
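For reference, here is a minimal sketch that produces the intended result for the sample input. It is not necessarily the pattern this answer used. It assumes the three code lines in the sample are indented with 4 spaces, and it makes two choices of its own: it drops the 'm' flag so the dot stops at each newline, and it uses a literal space instead of \s so the indentation cannot match a newline. The names code_block and result are illustrative.

```ruby
markdown_text = [
  "# header",
  "some text that precedes code",
  "    var foo = 9;",
  "    var fn = function() {}",
  "    fn();",
  "some post text"
].join("\n")

# One or more consecutive lines that start with 4 spaces or a tab.
# Each line is consumed up to and including its line ending (or end of string).
code_block = /(?:^(?: {4}|\t).*(?:\r?\n|\z))+/

# Wrap each run of indented lines in a single pair of pre tags.
result = markdown_text.gsub(code_block) do |block|
  "<pre>\n#{block.chomp}\n</pre>\n"
end

puts result
```

Using the non-bang gsub also avoids a small trap in the original snippet: gsub! returns nil when nothing was replaced, so puts would print an empty line instead of the text.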
Here's another one that captures all the indented lines in a single block
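A sketch of that idea, under the same assumptions as the question (a code line starts with 4 spaces or a tab). The pattern, the sample string, and the variable names are illustrative and not taken from this answer. It also shows one possible way to strip the leading indentation afterwards, as a separate step.

```ruby
text = "intro\n    a = 1\n    b = 2\nmiddle\n\tc = 3\nend"

# The outer group captures a whole run of indented lines as one block.
# No m flag, so the dot cannot run past the end of a line.
indented_block = /((?:^(?: {4}|\t).*\n?)+)/

# scan returns one array per match (one element per group), hence flatten.
blocks = text.scan(indented_block).flatten

# Remove one level of indentation (4 spaces or a tab) from each line.
stripped = blocks.map { |block| block.gsub(/^(?: {4}|\t)/, "") }

p blocks    # ["    a = 1\n    b = 2\n", "\tc = 3\n"]
p stripped  # ["a = 1\nb = 2\n", "c = 3\n"]
```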