JavaScript regex to convert _xxx_ to [i]xxx[/i] while ignoring <> (URLs) and `` (code)

Posted on 2025-01-28 10:51:00

I'm trying to reformat Slack formatting to bbcode and need a little help. Slack does italics like this:

_this is italic_ and this isn't

My current expression (/\_([^\_]*)\_/gm) works but unfortunately picks up underscores in URLs and inside code snippets. Slack formats URLs and code like this:

<www.thislink.com|here's a link>
`here's a code snippet`

How can I tell regex not to match any underscore pairs inside a link or code snippet? I've been trying negative lookahead and lookbehind but without success.
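
To make the problem concrete, here is a minimal reproduction. The sample string is made up for illustration (it is not from a real Slack export), and the pattern is the expression above with the unnecessary backslashes dropped:

```javascript
// The current pattern, written without the redundant escapes.
const naive = /_([^_]*)_/g;

// Hypothetical input: a Slack-style link and a code snippet that both contain underscores.
const sample = "<www.this_link.com/a_b|link> and `snake_case_name` and _italic_";

// Every pair of underscores is treated as italics, including those in the link and the code.
const naiveResult = sample.replace(naive, "[i]$1[/i]");
console.log(naiveResult);
// <www.this[i]link.com/a[/i]b|link> and `snake[i]case[/i]name` and [i]italic[/i]
```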

Comments (1)

余生共白头 2025-02-04 10:51:00

You need to match and capture what you need and just match what you do not need.

Once you get a match, analyze it and implement the appropriate code logic:

const re = /<[^<>|]*(?:\|[^<>]*)?>|`[^`]*`|_([^_]*)_/g;
const text = "<www.thislink.com|here's a link>\n`here's a code snippet`\n_this is italic_ and this isn't";
console.log( text.replace(re, (m,g) => g !== undefined ? "[i]" + g + "[/i]" : m ) )

See the regex demo. Details:

  • <[^<>|]*(?:\|[^<>]*)?> - a <, then zero or more chars other than <, > and |, then an optional sequence of a | and then zero or more chars other than < and > and then a > char
  • | - or
  • `[^`]*` - a backtick, then zero or more chars other than a backtick, and then a closing backtick
  • | - or
  • _([^_]*)_ - _, Group 1: zero or more chars other than _, a _.
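
The same approach can be wrapped in a small reusable function. This is only a sketch: the function name slackItalicsToBbcode and the sample input are assumptions made for illustration, while the pattern is the one explained above. It relies on the fact that Group 1 is undefined whenever the link or code alternative matched, so those matches are returned unchanged:

```javascript
// Hypothetical helper name; the regex is the one from the answer above.
function slackItalicsToBbcode(text) {
  const re = /<[^<>|]*(?:\|[^<>]*)?>|`[^`]*`|_([^_]*)_/g;
  return text.replace(re, (match, italic) =>
    // Group 1 is set only when the _..._ alternative matched.
    italic !== undefined ? "[i]" + italic + "[/i]" : match
  );
}

// Made-up input with underscores inside a link and inside a code snippet.
const input = "<www.this_link.com/a_b|link> and `snake_case_name` and _italic_";
const output = slackItalicsToBbcode(input);
console.log(output);
// <www.this_link.com/a_b|link> and `snake_case_name` and [i]italic[/i]
```

Because the alternatives are tried from left to right at each position, a link or code span is consumed as a whole before the italic branch gets a chance to see the underscores inside it.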