Get a specific word in the text of an HTML page
If I have the following HTML page:
<div>
<p>
Hello world!
</p>
<p> <a href="example.com"> Hello and Hello again this is an example</a></p>
</div>
I want to find a specific word, for example 'hello', and change it to 'welcome' wherever it occurs in the document.
Do you have any suggestions? I will be happy with an answer using any type of parser.
Comments (1)
This is easy to do with XSLT.
XSLT 1.0 solution:
when this transformation is applied on the provided XML document:
the wanted, correct result is produced:
My assumption is that matching and replacement are case-insensitive (i.e. "hello" and "heLlo" should both be replaced with "welcome"). If a case-sensitive match is required, the transformation can be considerably simplified.
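To make the difference between the two assumptions concrete, here is a minimal Python sketch. It is not the XSLT from this answer; the function names replace_ci and replace_cs are made up for illustration, and both variants replace substrings rather than whole words.

```python
import re

def replace_ci(text, word="hello", replacement="welcome"):
    """Case-insensitive: 'Hello', 'heLlo' and 'hello' are all replaced."""
    # re.escape guards against regex metacharacters in the search word.
    return re.sub(re.escape(word), replacement, text, flags=re.IGNORECASE)

def replace_cs(text, word="hello", replacement="welcome"):
    """Case-sensitive: a plain substring replace is enough."""
    return text.replace(word, replacement)

sample = "Hello heLlo hello"
ci_result = replace_ci(sample)  # every spelling variant is replaced
cs_result = replace_cs(sample)  # only the exact lowercase match is replaced
```

The case-sensitive variant needs no pattern matching at all, which loosely mirrors why a case-sensitive transformation can be much simpler.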
XSLT 2.0 solution:
when this transformation is applied on the provided XML document (shown above), again the wanted, correct result is produced:
Explanation: Use of the standard XPath 2.0 functions matches() and replace(), passing "i" as the flags argument (the third argument of matches(), the fourth of replace()), which is the flag for case-insensitive operation.
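Since the question accepts any parser, the same idea (touch only text nodes, match case-insensitively) can also be sketched in Python with the standard library. This is an illustration, not the transformation from this answer: it assumes the page is well-formed XML, as the sample is, and the name replace_in_text_nodes is hypothetical. re.IGNORECASE plays the role of the "i" flag here.

```python
import re
import xml.etree.ElementTree as ET

# Case-insensitive pattern, analogous to passing "i" to XPath replace().
HELLO = re.compile(r"hello", re.IGNORECASE)

def replace_in_text_nodes(root, pattern=HELLO, replacement="welcome"):
    """Rewrite text content in place; tags and attributes are left untouched."""
    for elem in root.iter():
        if elem.text:   # text directly after the element's start tag
            elem.text = pattern.sub(replacement, elem.text)
        if elem.tail:   # text following the element's end tag
            elem.tail = pattern.sub(replacement, elem.tail)
    return root

SAMPLE = """<div>
<p>
Hello world!
</p>
<p> <a href="example.com"> Hello and Hello again this is an example</a></p>
</div>"""

root = replace_in_text_nodes(ET.fromstring(SAMPLE))
result = ET.tostring(root, encoding="unicode")
print(result)
```

Because only text and tail are rewritten, an attribute value such as href="hello.html" would stay unchanged. Real-world HTML that is not well-formed XML would need an HTML parser instead of ElementTree.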