Java: using replaceAll with a regular expression
I have some text like this:
...<span>my name is bob and I live in </p><p>America</span>...
I would like to replace it with:
...<span>my name is bob and I live in </span></p><p><span>America</span>...
I know the replace() function, but I don't know regular expressions well. How can I do this?
Keep in mind that there may be other span tags correctly closed before the </p>, for example:
...<span>my name is bob</span> and <span>I live in </p><p>America</span>...
Comments (2)
In general, you can't parse HTML with regexes, because it's not a regular language.
If you generate the string yourself in one particular place, and you know it contains only that value, then this may be possible. Even then it is unlikely to be clean, because you shouldn't be embedding tags in something that is supposed to be plain CDATA. Once you start parsing documents that include tags, it is impossible in general to write a regex that reliably captures your case. It might work if your document uses a very limited syntax, but I'd be wary: I doubt anyone will remember to enforce those limits through future refactoring.
A better solution is to use something like the DOM to walk the actual generated HTML and modify the node tree. Alternatively, on the off chance you're actually outputting pure XHTML, you could use XSLT to do this transformation.
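For the "very limited syntax" case only, here is a minimal sketch of a hand-written scanner that tracks whether a span is currently open. It is not a DOM-based solution, and it rests on assumptions that are not in the question: spans carry no attributes, spans are never nested, and the tags are written exactly as `<span>`, `</span>` and `</p><p>`. The class name `SpanSplitter` and the method `split` are made up for illustration.

```java
public class SpanSplitter {
    // Assumed exact tag spellings; real HTML may have attributes or whitespace.
    private static final String OPEN = "<span>";
    private static final String CLOSE = "</span>";
    private static final String BREAK = "</p><p>";

    // Closes and reopens a span around every paragraph break found inside it.
    static String split(String html) {
        StringBuilder out = new StringBuilder();
        boolean inSpan = false; // assumption: spans are never nested
        int i = 0;
        while (i < html.length()) {
            if (html.startsWith(OPEN, i)) {
                inSpan = true;
                out.append(OPEN);
                i += OPEN.length();
            } else if (html.startsWith(CLOSE, i)) {
                inSpan = false;
                out.append(CLOSE);
                i += CLOSE.length();
            } else if (html.startsWith(BREAK, i)) {
                // Only rewrite the break when a span is open across it.
                out.append(inSpan ? CLOSE + BREAK + OPEN : BREAK);
                i += BREAK.length();
            } else {
                out.append(html.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "<span>my name is bob</span> and <span>I live in </p><p>America</span>";
        System.out.println(split(input));
    }
}
```

Unlike a blind substitution, this leaves a `</p><p>` that sits outside any span untouched. It breaks as soon as the markup departs from the stated assumptions, which is exactly the maintenance risk described above.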
This is a horrible non-solution, but you can use
String.replace(CharSequence, CharSequence)
to perform the string replacement. It has no respect for the well-formedness of the HTML; it just blindly substitutes one string for another. This may or may not work for you. Like any regex approach to HTML, though, it most likely works only some of the time.
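As a rough illustration of that blind substitution, applied to the input from the question: the class name `BlindReplace` and the helper `splitSpanAtParagraphBreak` are hypothetical, and the sketch assumes every `</p><p>` in the input falls inside an open span.

```java
public class BlindReplace {
    // Hypothetical helper: closes the span before each paragraph break and reopens it after.
    static String splitSpanAtParagraphBreak(String html) {
        // String.replace treats both arguments as literal text, not as a regex.
        return html.replace("</p><p>", "</span></p><p><span>");
    }

    public static void main(String[] args) {
        String input = "<span>my name is bob</span> and <span>I live in </p><p>America</span>";
        System.out.println(splitSpanAtParagraphBreak(input));
    }
}
```

If a `</p><p>` ever appears outside a span, this inserts a stray `</span>` and `<span>`; for example, `<p>a</p><p>b</p>` becomes `<p>a</span></p><p><span>b</p>`.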