Python: Injecting HTML content into a tag with lxml.html
I'm using the lxml.html library to parse an HTML document.
I located a specific tag, which I call content_tag, and I want to change its content (i.e. the text between <div> and </div>). The new content is a string with some HTML in it, say 'Hello <b>world!</b>'.
How do I do that? I tried content_tag.text = 'Hello <b>world!</b>', but that escapes all the HTML tags, replacing < with &lt; and so on.
I want to inject the text without escaping any HTML. How can I do that?
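
For reference, the escaping described in the question can be reproduced with a short sketch like the one below. The one-element sample document is made up for illustration, and Python 3 is assumed.

```python
from lxml import html

# Hypothetical stand-in for the tag located in the real document.
content_tag = html.fromstring('<div>old text</div>')

# Assigning to .text stores a plain text node, so markup characters
# are escaped when the element is serialized.
content_tag.text = 'Hello <b>world!</b>'

serialized = html.tostring(content_tag, encoding='unicode')
print(serialized)
```

The string is stored as text, not parsed, so no <b> child element is created.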
This is one way:
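
The snippet that belonged here is not part of this page, so what follows is only a minimal sketch of how such an approach might look, assuming it is based on the E-factory from lxml.html.builder. The sample document and the id 'content' are invented for illustration.

```python
from lxml import html
from lxml.html import builder as E

# Hypothetical sample document; 'content' is an assumed id.
doc = html.fromstring(
    '<html><body><div id="content">old text</div></body></html>'
)
content_tag = doc.get_element_by_id('content')

# Build the replacement element programmatically instead of parsing a string.
new_tag = E.DIV('Hello ', E.B('world!'), id='content')

# Swap the old element for the new one in its parent.
content_tag.getparent().replace(content_tag, new_tag)

result = html.tostring(doc, encoding='unicode')
print(result)
```

This only helps when the new content can be written out as builder calls; it does not take an arbitrary HTML string.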
See also: http://lxml.de/lxmlhtml.html#creating-html-with-the-e-factory
Edit: I should have confessed earlier that I'm not all that familiar with lxml. I looked at the docs and source briefly, but didn't find a clean solution. Perhaps someone more familiar will stop by and set us both straight.
In the meantime, this seems to work, but is not well tested:
Edit again: this version removes the existing text and children first:
Assuming content_tag doesn't have any subelement, you can just do:
After tinkering around, I found this solution:
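
The solution itself did not survive on this page, so the following is not the poster's code. It is one more possible sketch: parse the new content into a fresh element with the same tag name, copy the attributes and tail over, and swap it in. The helper name replace_with_markup is hypothetical, and the tag is assumed to have a parent.

```python
from lxml import html

def replace_with_markup(tag, markup):
    """Swap tag for a newly parsed element with the same name and attributes."""
    # create_parent wraps the parsed markup in a new element named like tag.
    new_tag = html.fragment_fromstring(markup, create_parent=tag.tag)
    for name, value in tag.attrib.items():
        new_tag.set(name, value)
    # Keep the text that follows the element in its parent.
    new_tag.tail = tag.tail
    # Assumes tag is not the root element, so getparent() is not None.
    tag.getparent().replace(tag, new_tag)
    return new_tag

doc = html.fromstring(
    '<html><body><div id="content">old</div> after</body></html>'
)
replace_with_markup(doc.get_element_by_id('content'), 'Hello <b>world!</b>')
print(html.tostring(doc, encoding='unicode'))
```

Any reference to the old element held elsewhere in the code would go stale after the swap, so the helper returns the new one.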