Regex for Java string literals that excludes invalid escapes

Posted on 2025-01-30 19:04:58

I've been stuck on this for days and I just can't figure it out. I'm trying to write a regex for double-quoted string literals that supports escape characters. The regex should accept a string such as "1\t2" (\"1\\t2\") and reject a string such as "invalid\escape" (\"invalid\\escape\"). This handles valid and invalid escapes: \"(\\(?=[bnrt\'\"\\]).)*\" but as soon as I introduce anything to handle the rest of the string, it just accepts everything (i.e., ^\"(.*(\\(?=[bnrt\\\'\"]))*)*\"$). I'm pretty sure the issue is that when it loops back around and the preceding character is \\ (\\\\), it allows any character to be placed after it. I've deleted my work and started from scratch more times than I can remember, I've gone to office hours twice, and I still can't figure it out. I need fresh eyes on this because I'm blind to it and at my wits' end. I'm not looking to be given the answer; I just want help figuring out what I'm doing wrong.

Comments (1)

我是有多爱你 2025-02-06 19:04:58

Your string regex starts with .* after the opening quote. This matches any string, regardless of its content. You follow it with (<escape_regex>)*, but since a * also allows zero matches, the regex engine just ignores it. In other words, your string regex is equivalent to ^\".*\"$.
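To see this collapse in practice, here is a minimal sketch that runs the pattern from the question next to the plain ^\".*\"$ pattern on the two sample strings. The class and method names are made up for illustration, and the needless regex escapes on the quote characters are dropped so the Java string literal stays readable (a quote needs no backslash in a regex).

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not from the question.
final class GreedyDotDemo {
    // Pattern from the question. As a regex: ^"(.*(\\(?=[bnrt\\'"]))*)*"$
    // In a Java string literal every regex backslash is doubled.
    static final Pattern QUESTION =
            Pattern.compile("^\"(.*(\\\\(?=[bnrt\\\\'\"]))*)*\"$");

    // The pattern the answer says it is equivalent to. As a regex: ^".*"$
    static final Pattern ANY_QUOTED = Pattern.compile("^\".*\"$");

    static boolean acceptedByQuestion(String s) {
        return QUESTION.matcher(s).matches();
    }

    static boolean acceptedByAnyQuoted(String s) {
        return ANY_QUOTED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String valid = "\"1\\t2\"";             // the text "1\t2"
        String invalid = "\"invalid\\escape\""; // the text "invalid\escape"
        for (String s : new String[] {valid, invalid}) {
            System.out.println(s + " question=" + acceptedByQuestion(s)
                    + " anyQuoted=" + acceptedByAnyQuoted(s));
        }
    }
}
```

Both patterns accept both samples, including the one with the invalid \e escape, because the greedy .* consumes the backslash before the escape group is ever consulted.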

To change this behaviour, you have to ensure that a backslash can only match if it is followed by a valid escape character. This can be done by replacing your dot with [^\\], which matches every character except a backslash. The resulting regex looks like this and works on your given samples: ^\"([^\\]*(\\(?=[bnrt\'\"\\]).)*)*\"$.
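A short sketch of that lookahead-based regex in Java, assuming Pattern.matcher(...).matches() is how the check is run (the question does not say). The class name is hypothetical, and the quote characters are left unescaped inside the regex since that does not change what it matches.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the lookahead-based fix.
final class LookaheadEscapeDemo {
    // As a regex: ^"([^\\]*(\\(?=[bnrt'"\\]).)*)*"$
    // [^\\]* consumes non-backslash characters; a backslash is only consumed
    // when the lookahead sees a valid escape character right after it.
    static final Pattern FIXED =
            Pattern.compile("^\"([^\\\\]*(\\\\(?=[bnrt'\"\\\\]).)*)*\"$");

    static boolean isValid(String s) {
        return FIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("\"1\\t2\""));            // the text "1\t2"
        System.out.println(isValid("\"invalid\\escape\""));  // the text "invalid\escape"
    }
}
```

On the two samples this accepts "1\t2" and rejects "invalid\escape", since no part of the pattern can consume a backslash that is followed by e.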

There is, however, a much simpler approach. ([^\\]|\\[bnrt\'\"\\]) matches either a non-backslash character, or a backslash followed by a valid escape character. Repeat that any number of times and you have a string that contains only valid escapes: ^\"([^\\]|\\[bnrt\'\"\\])*\"$. This regex also works on your samples, and will perform a lot better, as the nested quantifiers in your suggested regex make it prone to catastrophic backtracking.
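For reference, the simpler alternation could be wired into Java roughly as follows. This is a sketch under assumptions: the class and method names are invented, and matches() is assumed as the entry point. Each regex backslash is doubled in the Java string literal, which is why [^\\] becomes [^\\\\].

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative and not from the thread.
final class StringLiteralRegex {
    // As a regex: ^"([^\\]|\\[bnrt'"\\])*"$
    // Each repetition consumes either one non-backslash character
    // or a backslash plus one valid escape character.
    static final Pattern LITERAL =
            Pattern.compile("^\"([^\\\\]|\\\\[bnrt'\"\\\\])*\"$");

    static boolean isValid(String s) {
        return LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "\"1\\t2\"",            // the text "1\t2"
            "\"invalid\\escape\"",  // the text "invalid\escape"
            "\"a\\\\\"",            // the text "a\\" (escaped backslash)
            "\"a\\\""               // the text "a\" (closing quote is escaped)
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first and third samples are accepted; the second and fourth are rejected. One thing to be aware of: [^\\] also matches an unescaped double quote, so a string such as "a"b" is accepted as well. If that should be rejected (an assumption about the requirements, which the question does not state), the character class would need to exclude the quote too.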
