String.replaceAll single backslashes with double backslashes

Posted on 2024-08-10 08:14:52

I'm trying to convert the String \something\ into the String \\something\\ using replaceAll, but I keep getting all kinds of errors. I thought this was the solution:

theString.replaceAll("\\", "\\\\");

But this gives the below exception:

java.util.regex.PatternSyntaxException: Unexpected internal error near index 1

Comments (5)

七婞 2024-08-17 08:14:52

String#replaceAll() interprets its first argument as a regular expression. The \ is an escape character in both String literals and regex, so you need to double-escape it for regex:

string.replaceAll("\\\\", "\\\\\\\\");

But you don't necessarily need regex for this, simply because you want an exact character-by-character replacement and you don't need patterns here. So String#replace() should suffice:

string.replace("\\", "\\\\");
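
As a minimal sketch of the two calls above side by side (the class and method names here are made up for illustration), both variants turn \something\ into \\something\\:

```java
public class BackslashDoubling {
    // Regex-based: the literal "\\\\" is the regex \\ (one backslash);
    // the replacement literal "\\\\\\\\" is \\\\, i.e. two literal backslashes.
    static String withReplaceAll(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    // Literal replacement: only Java string-literal escaping is needed.
    static String withReplace(String s) {
        return s.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String input = "\\something\\"; // the text \something\
        System.out.println(withReplaceAll(input)); // \\something\\
        System.out.println(withReplace(input));    // \\something\\
    }
}
```

Remember that strings are immutable, so the result has to be assigned or used; calling the method alone does not change the original.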

Update: as per the comments, you appear to want to use the string in a JavaScript context. You may be better off using StringEscapeUtils#escapeEcmaScript() (from Apache Commons Lang) instead, as it covers more characters.

亣腦蒛氧 2024-08-17 08:14:52

TLDR: use theString = theString.replace("\\", "\\\\"); instead.


Problem

replaceAll(target, replacement) uses regular expression (regex) syntax for target and partially for replacement.

The problem is that \ is a special character both in regex (it can be used like \d to represent a digit) and in String literals (it can be used like "\n" to represent a line separator, or \" to escape the double quote symbol, which would normally mark the end of the string literal).

In both cases, to create a literal \ symbol we escape it (make it a literal instead of a special character) by placing an additional \ before it (just as we escape " in string literals via \").

So the target regex representing a \ symbol needs to hold \\, and the string literal representing that text needs to look like "\\\\".

So we escaped \ twice:

  • once in regex \\
  • once in String literal "\\\\" (each \ is represented as "\\").

In the replacement, \ is also special. It lets us escape the other special character, $, which via the $x notation refers to the portion of the match held by the capturing group with index x. For example, "012".replaceAll("(\\d)", "$1$1") matches each digit, places it in capturing group 1, and $1$1 replaces it with two copies of itself, resulting in "001122".
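
A small sketch of both special characters in the replacement string (the class and method names are hypothetical): $1 refers to a capturing group, while \$ produces a literal dollar sign.

```java
public class ReplacementSyntaxDemo {
    // $1 refers to capturing group 1, so each digit is written twice.
    static String duplicateDigits(String s) {
        return s.replaceAll("(\\d)", "$1$1");
    }

    // The literal "\\$" is the replacement text \$ : an escaped, literal dollar sign.
    static String digitsToDollar(String s) {
        return s.replaceAll("\\d", "\\$");
    }

    public static void main(String[] args) {
        System.out.println(duplicateDigits("012")); // 001122
        System.out.println(digitsToDollar("a1b2")); // a$b$
    }
}
```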

So again, to let the replacement represent a literal \ we need to escape it with an additional \, which means that:

  • replacement must hold two backslash characters \\
  • and String literal which represents \\ looks like "\\\\"

But since we want the replacement to hold two backslashes, we need "\\\\\\\\" (each \ is represented by one "\\\\").

So the version with replaceAll can look like

replaceAll("\\\\", "\\\\\\\\");

Easier way with replaceAll

To make our life easier, Java provides tools to automatically escape text for the target and replacement parts. So now we can focus only on strings and forget about regex syntax:

replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement))

which in our case can look like

replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"))
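
For illustration, the two quoting helpers can be wrapped in a small utility; the class and method names below are assumptions, only Pattern.quote and Matcher.quoteReplacement come from the JDK:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteHelpersDemo {
    // Treats both target and replacement as plain text, even though
    // replaceAll itself works with regex syntax.
    static String replaceLiterally(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target),
                               Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Doubles every backslash: \something\ becomes \\something\\
        System.out.println(replaceLiterally("\\something\\", "\\", "\\\\"));
        // "." and "$" lose their special meaning: a.b becomes a$b
        System.out.println(replaceLiterally("a.b", ".", "$"));
    }
}
```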

Even better: use replace

If we don't really need regex syntax support, let's not involve replaceAll at all. Instead, let's use replace. Both methods replace all occurrences of the target, but replace doesn't involve regex syntax. So you could simply write

theString = theString.replace("\\", "\\\\");
莫多说 2024-08-17 08:14:52

To avoid this sort of trouble, you can use replace (which takes a plain string) instead of replaceAll (which takes a regular expression). You will still need to escape backslashes, but not in the wild ways required with regular expressions.

生寂 2024-08-17 08:14:52

You'll need to escape the (escaped) backslash in the first argument, as it is a regular expression. The replacement (second argument, see Matcher#replaceAll(String)) also gives backslashes a special meaning, so you'll have to write the call as:

theString.replaceAll("\\\\", "\\\\\\\\");
活雷疯 2024-08-17 08:14:52

Yes... by the time the regex compiler sees the pattern you've given it, it sees only a single backslash (since Java's lexer has turned the double backwhack into a single one). You need to replace "\\\\" with "\\\\\\\\", believe it or not! Java really needs a good raw string syntax.
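
To see what the regex compiler actually receives, here is a rough sketch (class and method names are illustrative). The exact exception message varies between Java versions, so the code only checks whether compilation succeeds:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class LexerVsRegexDemo {
    // Returns true if the given text is a valid regular expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("\\".length());    // 1: the literal "\\" is a single backslash
        System.out.println("\\\\".length());  // 2: the regex \\ (an escaped backslash)
        System.out.println(compiles("\\"));   // false: a lone backslash is an incomplete escape
        System.out.println(compiles("\\\\")); // true
    }
}
```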
