Keep the text but eliminate CR LF between tags

Posted 2024-11-08 20:49:05

Fellow Regexers,

I have a flat file full of expressions like:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY
WHERE IS_SPREAD_OVER == 123
ORDER BY MULTIPLE_LINES
HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

I want to remove the CRLFs between the quotes, and the quotes themselves, so that all my queries become convenient one-liners like this:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

Please state the regex flavor used in your solution. I'm using TextCrawler, which claims to be ECMA-262 compliant (the same flavor as VBScript/JavaScript), and the closest I have come to a solution is something like:

(\r\n".*)(.*)\r\n(.*"\r\n)

Forgive my n00biness.
Best regards,
Lynx Kepler

戒ㄋ 2024-11-15 20:49:05

You could first replace each CRLF with a space whenever the next " is at the end of a line (JavaScript flavor):

result = subject.replace(/\r\n(?=[^"]*"$)/mg, " ");

Explanation:

\r\n    # Match a CRLF
(?=     # if and only if
 [^"]*  # it is followed by any number of non-quote characters
 "      # and a quote
 $      # at the end of a line
)       # End of lookahead.

This transforms your example into

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER
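
To double-check which line breaks the lookahead selects, a small JavaScript sketch can list the lines whose trailing CRLF matches. The `sample` string and the `joinedLines` name are illustrative assumptions; only the pattern itself comes from the answer.

```javascript
// Sample input from the question, joined with CRLF line endings (assumed).
const sample = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join('\r\n');

// Same pattern as in the answer: a CRLF followed by non-quote
// characters and then a quote that ends a line.
const pattern = /\r\n(?=[^"]*"$)/mg;

// For every matched CRLF, record the text of the line it terminates.
const joinedLines = [...sample.matchAll(pattern)].map((m) => {
  const lineStart = sample.lastIndexOf('\n', m.index - 1) + 1;
  return sample.slice(lineStart, m.index);
});

console.log(joinedLines);
```

Only the line breaks inside the quoted block should be reported. The CRLF before the opening quote is skipped because that quote is not at the end of its line, and the CRLF after the closing quote is skipped because no further quote follows it.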

Then, in a second step, remove the quotes from that result:

result = result.replace(/^"|"$/mg, "");
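
Chained together, the two steps might look like the following sketch. The function name `flattenQuotedQueries` and the sample data are hypothetical; it assumes CRLF line endings, no quotes inside a query, and a closing quote that always ends its line.

```javascript
// Hypothetical helper that applies both replacements from the answer in order.
function flattenQuotedQueries(subject) {
  // Step 1: turn each CRLF inside a quoted block into a single space.
  const joined = subject.replace(/\r\n(?=[^"]*"$)/mg, ' ');
  // Step 2: drop a quote at the start or at the end of any line.
  return joined.replace(/^"|"$/mg, '');
}

// Sample input from the question, joined with CRLF (assumed line ending).
const input = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join('\r\n');

const output = flattenQuotedQueries(input);
console.log(output);
```

Feeding step 2 the result of step 1, rather than the original `subject`, is what makes the pipeline produce one query per line.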

一笑百媚生 2024-11-15 20:49:05

With Perl you could do something like:

s/^"([^"]*)"$/$s = $1; $s =~ s!(?:\n|\r)+! !g; $s/meg
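
For readers who want to stay in an ECMAScript environment, the same single-pass idea could be sketched in JavaScript with a replacement callback. This is a hedged port rather than the Perl answer itself: the function name `unwrapQuotedBlocks` is hypothetical, and it needs a scripting host such as Node.js or a browser, since a plain search-and-replace dialog cannot run a callback.

```javascript
// Hypothetical single-pass variant: match a whole quoted block,
// then clean up the line breaks inside the captured text.
function unwrapQuotedBlocks(text) {
  return text.replace(/^"([^"]*)"$/mg, (match, inner) =>
    // Collapse each run of CR/LF characters into one space.
    inner.replace(/[\r\n]+/g, ' ')
  );
}

// Sample input from the question, joined with CRLF (assumed line ending).
const source = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join('\r\n');

const flattened = unwrapQuotedBlocks(source);
console.log(flattened);
```

Because the quotes sit outside the capture group, they disappear in the same pass that joins the lines.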
