Regex: Matching non-escaped double quoted strings
I'm writing a code box for my website, written in PHP, that does syntax highlighting for Ruby. So far I can get it to color instance variables, comments, symbols and global variables, but I have run into a problem with the regex I use to match double-quoted strings. Here is my code:
<?php
function codebox($code, $name="", $highlighted_line = -1)
{
echo '<table class="code_table">';
echo '<tr>';
echo '<td class="code_table_header"></td>';
echo '<td class="code_table_name">$name</td>';
echo '<td class="code_table_header"><a href="" class="copy_to_clipboard_link">copy to clipboard</a></td>';
echo '</tr>';
$oddity = 'even';
$line_number = 1;
foreach(preg_split('/(\r?\n)/', $code) as $line)
{
echo '<tr>';
if($line_number % 10 == 0)
{
echo '<td class="line_number" style="font-weight:bold;">' . $line_number . '</td>';
} else {
echo '<td class="line_number">' . $line_number . '</td>';
}
if($line_number == $highlighted_line)
{
echo '<td class="selected_code_cell" colspan="2">' . syntax_highlight($line) . '</td>';
} else {
echo '<td class="' . $oddity . '_code_cell" colspan="2">' . syntax_highlight($line) . '</td>';
}
echo '</tr>';
$line_number += 1;
if($oddity == 'even')
{
$oddity = 'odd';
} else {
$oddity = 'even';
};
};
};
function syntax_highlight($code)
{
// Make it so html doesn't bodge up
$code = htmlentities($code);
// Replace tabs with 4 non-breaking spaces
$code = str_replace("\t", '&nbsp;&nbsp;&nbsp;&nbsp;', $code);
//instance variables
$code = preg_replace('/\B(\@\w*\S)/', '<span style="color:lime;">$1</span>', $code);
//global variables
$code = preg_replace('/\B(\$\w*\S)/', '<span style="font-weight:bolder;color:#00b0f0;">$1</span>', $code);
//symbols
$code = preg_replace('/\B(\:\w*\S)/', '<span style="color:yellow;">$1</span>', $code);
//strings (double quote)
$code = preg_replace('/"(?:\.|(\\\")|[^\""\n])*"/', '<span style="font-style:italic;color:#FF5A00;">$1</span>', $code);
//strings (single quote)
//$code = preg_replace('/\'(?:\.|(\\\')|[^\'\'\n])*\'/', '<span style="font-style:italic;color:#FF5A00;">$1</span>', $code);
return $code;
};
?>
For some reason, the double-quoted string rule breaks the other rules and no syntax highlighting is performed. Does anyone know why? Thanks in advance, ell.
Comments (2)
Don’t try to parse an irregular language like Ruby with regular expressions. Instead, try to find a proper Ruby parser that returns an array of the language tokens used.
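To give a rough idea of what "an array of tokens" could look like, here is a minimal sketch, assuming PHP 7 or later. It is not a real Ruby parser: it only knows a handful of token kinds, and the function name `tokenize_ruby_line` and the token type names are made up for illustration. What it does show is the single left-to-right pass, where each position is claimed by exactly one token, so a `#` inside a string is not mistaken for a comment.

```php
<?php
// Hypothetical helper: splits one line of Ruby into [type, text] pairs.
// Covers only a few token kinds; this is NOT a real Ruby parser.
function tokenize_ruby_line(string $line): array
{
    // x flag: whitespace and "# ..." comments inside the pattern are ignored.
    $pattern = '/
        (?<string>"(?:[^"\\\\]|\\\\.)*")   # double-quoted string with escapes
      | (?<comment>\#.*)                   # comment to end of line
      | (?<ivar>@\w+)                      # instance variable
      | (?<gvar>\$\w+)                     # global variable
      | (?<symbol>(?<!:):\w+)              # symbol, but not the tail of ::
      | (?<other>.)                        # anything else, one char at a time
    /x';

    preg_match_all($pattern, $line, $matches, PREG_SET_ORDER);

    $types  = ['string', 'comment', 'ivar', 'gvar', 'symbol', 'other'];
    $tokens = [];
    foreach ($matches as $m) {
        foreach ($types as $type) {
            // Exactly one named group is non-empty for each match.
            if (isset($m[$type]) && $m[$type] !== '') {
                $last = count($tokens) - 1;
                if ($type === 'other' && $last >= 0 && $tokens[$last][0] === 'other') {
                    $tokens[$last][1] .= $m[$type]; // merge runs of plain text
                } else {
                    $tokens[] = [$type, $m[$type]];
                }
                break;
            }
        }
    }
    return $tokens;
}

print_r(tokenize_ruby_line('@name = "a # b" # note'));
```

Each token can then be HTML-escaped and wrapped in its own span, which avoids running one `preg_replace` over markup that an earlier `preg_replace` inserted.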
What Gumbo said. You cannot make this work correctly with regular expressions alone. But you might try this:
Or maybe you'll have better luck with a negative lookbehind assertion, (?<![\\]), right before the quote.
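A minimal sketch of one way to do this, assuming PHP 7 or later and assuming the highlighting runs on the raw line before any HTML escaping. It uses a different technique from the lookbehind: rather than checking the character before the quote, it consumes every backslash together with the character that follows it, so a string ending in an escaped backslash still closes in the right place. The function name `highlight_double_quoted` and the `str` CSS class are made up for illustration.

Two things in the question's code seem worth noting here. `htmlentities()` runs first and turns every `"` into `&quot;`, so a pattern looking for a literal `"` can only hit the quotes inside the `<span style="...">` attributes added by the earlier replacements. And `$1` there refers to the inner `(\\\")` group, not to the whole string.

```php
<?php
// Sketch only: highlight_double_quoted() and the "str" class are hypothetical names.
function highlight_double_quoted(string $line): string
{
    // Regex after PHP unescaping: ("(?:[^"\\]|\\.)*")
    // An opening quote, then any run of ordinary characters or
    // backslash-plus-one-character pairs, then the closing quote.
    $pattern = '/("(?:[^"\\\\]|\\\\.)*")/';

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes are plain code, odd indexes are string literals.
    $pieces = preg_split($pattern, $line, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = '';
    foreach ($pieces as $i => $piece) {
        // Escape each piece only now, so the regex saw real quote characters.
        $escaped = htmlspecialchars($piece, ENT_QUOTES, 'UTF-8');
        $html .= ($i % 2 === 1)
            ? '<span class="str">' . $escaped . '</span>'
            : $escaped;
    }
    return $html;
}

echo highlight_double_quoted('puts "say \"hi\"" + name'), "\n";
```

Using a CSS class instead of an inline `style="..."` attribute is a deliberate choice in this sketch: it keeps characters such as `:` and `#` out of the generated markup, so any pattern applied afterwards has less to trip over.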