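How to replace tokens in a string without StringTokenizer
Given a string like this:
Hello {FIRST_NAME}, this is a personalized message for you.
where FIRST_NAME is an arbitrary token (a key in a map passed to the method), write a routine that turns that string into:
Hello Jim, this is a personalized message for you.
given a map with the entry FIRST_NAME -> Jim.
StringTokenizer would seem to be the most straightforward approach, but the Javadocs say you should prefer the regex approach. How would you do this in a regex-based solution?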
Depending on how ridiculously complex your string is, you could try using a more serious string templating language, like Velocity. In Velocity's case, you'd do something like this:
But that is likely overkill if you only want to replace one or two values.
The docs mean that you should prefer writing a regex-based tokenizer, IIRC. What might work better for you is a standard regex search-replace.
Generally we'd use MessageFormat in a case like this, coupled with loading the actual message text from a ResourceBundle. This gives you the added benefit of being localization (L10N) friendly.
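As a rough illustration of the MessageFormat suggestion, here is a minimal sketch. It assumes the template can be rewritten with positional placeholders ({0}, {1}, ...), because MessageFormat does not support named tokens such as {FIRST_NAME}. The class name MessageFormatSketch and the method greet are made up for this example, and the pattern is hard-coded here rather than loaded from a ResourceBundle.

```java
import java.text.MessageFormat;

final class MessageFormatSketch {
    // Hypothetical helper: formats a pattern that uses positional
    // placeholders ({0}, {1}, ...) with the supplied arguments.
    static String greet(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // In a real application this text would typically come from
        // ResourceBundle.getString(...); it is inlined to keep the sketch small.
        String pattern = "Hello {0}, this is a personalized message for you.";
        System.out.println(greet(pattern, "Jim"));
    }
}
```

Note that this only fits when the caller controls the template format; it does not look up values by key in a map.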
Thanks everyone for the answers!
Gizmo's answer was definitely out of the box, and a great solution, but unfortunately not appropriate as the format can't be limited to what the Formatter class does in this case.
Adam Paynter really got to the heart of the matter, with the right pattern.
Peter Nix and Sean Bright had a great workaround to avoid all of the complexities of the regex, but I needed to raise some errors if there were bad tokens, which that didn't do.
But in terms of both doing a regex and a reasonable replace loop, this is the answer I came up with (with a little help from Google and the existing answer, including Sean Bright's comment about how to use group(1) vs group()):
Where doParameter gets the value out of the map and converts it to a string and throws an exception if it isn't there.
Note also I changed the pattern to find empty braces (i.e. {}), as that is an error condition explicitly checked for.
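The original code for this answer is missing here, so the following is only a sketch of the kind of loop described, not the author's exact code. The pattern, the class name TokenReplacer, the method name replaceTokens, and the choice of IllegalArgumentException are assumptions; doParameter follows the description above (look up the key, convert to a string, throw if absent). Matcher.quoteReplacement is used so that $ and backslashes in values are inserted literally.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenReplacer {
    // Matches {KEY}; [^{}]* (rather than +) also matches "{}" so that an
    // empty token can be detected and reported as an error.
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]*)\\}");

    static String replaceTokens(String template, Map<String, ?> values) {
        Matcher matcher = TOKEN.matcher(template);
        // StringBuffer keeps this compatible with Java 8's appendReplacement.
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // group() would be the whole match including braces;
            // group(1) is just the key between them.
            String replacement = doParameter(matcher.group(1), values);
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Assumed behavior: fail loudly on empty or unknown tokens.
    static String doParameter(String key, Map<String, ?> values) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("Empty token {} in template");
        }
        Object value = values.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Unknown token: " + key);
        }
        return value.toString();
    }
}
```

Calling TokenReplacer.replaceTokens with the question's template and a map containing FIRST_NAME -> Jim returns the personalized message; a template containing {} or an unmapped key raises the exception.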
EDIT:
Note that appendReplacement is not agnostic about the content of the replacement string. Per the javadocs, it treats $ and backslash as special characters, so I added some escaping to the sample above to handle that. It is not done in the most performance-conscious way, but in my case it isn't a big enough deal to be worth micro-optimizing the string creations.
Thanks to the comment from Alan M, this can be made even simpler, avoiding the special-character issues of appendReplacement.
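To make the special-character issue concrete, here is a small sketch. It uses Matcher.replaceAll, which follows the same replacement-string rules as appendReplacement, and it assumes the simplification meant here is the standard Matcher.quoteReplacement method (the comment itself is not shown in this thread). The class and method names are invented for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteReplacementDemo {
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]*)\\}");

    // Quotes the value first, so $ and \ are inserted literally.
    static String replaceQuoted(String template, String value) {
        return TOKEN.matcher(template).replaceAll(Matcher.quoteReplacement(value));
    }

    // No quoting: "$1" inside the value is read as a reference to group 1.
    static String replaceUnquoted(String template, String value) {
        return TOKEN.matcher(template).replaceAll(value);
    }

    public static void main(String[] args) {
        System.out.println(replaceQuoted("Price: {PRICE}", "$1.50"));   // Price: $1.50
        System.out.println(replaceUnquoted("Price: {PRICE}", "$1.50")); // Price: PRICE.50
    }
}
```

The unquoted call silently substitutes the captured key for $1, which is the kind of surprise the escaping is meant to prevent.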
Well, I would rather use String.format(), or better MessageFormat.
Check out the javadocs for it here.
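For comparison, a minimal sketch of the String.format route mentioned above. It assumes the template can use java.util.Formatter conversions such as %s instead of {FIRST_NAME}, which is exactly the restriction that made this approach unsuitable for the asker. The class and method names are hypothetical.

```java
final class StringFormatSketch {
    // Hypothetical helper: the template must use Formatter syntax (%s),
    // so values are positional rather than looked up by key.
    static String greet(String firstName) {
        return String.format("Hello %s, this is a personalized message for you.", firstName);
    }

    public static void main(String[] args) {
        System.out.println(greet("Jim"));
    }
}
```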
Try this:
Note: The author's final solution builds upon this sample and is much more concise.
With import java.util.regex.*:
So, the regex is recommended because it can easily identify the places in the string that require substitution, as well as extract the name of the key to substitute. It's much more efficient than breaking up the whole string into pieces.
You'll probably want to loop with the Matcher line inside and the Pattern line outside, so you can replace all lines. The pattern never needs to be recompiled, and it's more efficient to avoid doing so unnecessarily.
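The snippet this answer referred to is missing here, so the following is a sketch of the structure it describes rather than the original code: the Pattern is compiled once outside the loop, and a new Matcher is created per line inside it. The pattern itself, the class name TokenFinder, and the method findKeys are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenFinder {
    // Compiled once, outside any loop; a Pattern is immutable and reusable.
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]+)\\}");

    // Collects every token key found in the given lines, in order.
    static List<String> findKeys(List<String> lines) {
        List<String> keys = new ArrayList<>();
        for (String line : lines) {
            // A fresh Matcher per line; only this part lives inside the loop.
            Matcher matcher = TOKEN.matcher(line);
            while (matcher.find()) {
                keys.add(matcher.group(1)); // the key without its braces
            }
        }
        return keys;
    }
}
```

Once the keys and their positions are known, the substitution itself can be done with the Matcher's appendReplacement and appendTail methods.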
The most straightforward would seem to be something along the lines of this:
It loops through all your tokens and replaces every token with what you need, and uses the standard String method for replacement, thus skipping the whole RegEx frustrations.
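A sketch of what such a loop could look like, since the original snippet is not shown here. The class name SimpleTokenReplacer and the method signature are assumptions. String.replace performs literal, non-regex replacement of every occurrence.

```java
import java.util.Map;

final class SimpleTokenReplacer {
    static String replaceTokens(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // Build the literal token, e.g. "{FIRST_NAME}", and replace
            // every occurrence with the mapped value.
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }
}
```

One consequence of this design is that a token with no entry in the map is silently left in the output, so it cannot report bad tokens. A value that itself contains a {TOKEN} may also be substituted again on a later iteration.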