Replacing an element with lxml.html
I'm fairly new to lxml and HTML Parsers as a whole.
I was wondering if there is a way to replace an element within a tree with another element...
For example I have:
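import lxml.html
from pygments import highlight
from pygments.lexers import guess_lexer
from pygments.formatters import HtmlFormatter

body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """
doc = lxml.html.fromstring(body)
codeblocks = doc.cssselect('code')
for block in codeblocks:
    lexer = guess_lexer(block.text_content())
    hilited = highlight(block.text_content(), lexer, HtmlFormatter())
    doc.replace(block, hilited)  # raises TypeError: hilited is a str, not an element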
I want to do something along those lines, but this results in a "TypeError" because "hilited" isn't an lxml.etree._Element.
Is this feasible?
Regards,
Comments (2)
Regarding lxml: in
doc.replace(block, hilited)
block is an lxml Element object, but hilited is a string, so you cannot use it as the replacement.
There are two ways to do that
or
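The snippets for the two approaches are missing here. As a rough sketch of one way it could work (not necessarily what was originally posted): parse the highlighted string into an element first, then ask the block's parent to do the replacement. `replace_with_markup` is a made-up helper name, and the hard-coded stand-in markup takes the place of the Pygments output so the example needs only lxml. One thing to watch for: lxml moves an element's tail text along with the element, so the sketch copies `block.tail` over to keep the "Blah blah blah" between the two blocks.

```python
from html import escape

import lxml.html


def replace_with_markup(block, markup):
    """Hypothetical helper: swap `block` for the element parsed from `markup`."""
    # Turn the HTML string into a single element (raises if there is more than one).
    new_element = lxml.html.fragment_fromstring(markup.strip())
    # The text following `block` is stored in block.tail; carry it over so it is not lost.
    new_element.tail = block.tail
    # replace() must be called on the direct parent of `block`.
    # Assumes `block` is not the root element (getparent() would return None).
    block.getparent().replace(block, new_element)
    return new_element


body = ("<code>def function(arg): print(arg)</code> Blah blah blah "
        "<code>int main() { return 0; }</code>")
doc = lxml.html.fromstring(body)

# Collect the blocks first so the tree is not modified while iterating over it.
for block in list(doc.iter("code")):
    # Stand-in for highlight(block.text_content(), lexer, HtmlFormatter()).
    stand_in = '<div class="highlight"><pre>%s</pre></div>' % escape(block.text_content())
    replace_with_markup(block, stand_in)

result = lxml.html.tostring(doc, encoding="unicode")
```

In the code from the question, the stand-in string would simply be the `hilited` value returned by Pygments.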
If you're new to Python HTML parsers, you might try out BeautifulSoup, an HTML/XML parser that lets you modify the parse tree easily.
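For comparison, here is a minimal sketch of the same kind of replacement with BeautifulSoup, assuming the `bs4` package is installed. The `<pre class="highlight">` wrapper is only an illustrative stand-in for real highlighter output. The replacement is built as a tag because a plain string passed to `replace_with` would be inserted as text rather than as markup.

```python
from bs4 import BeautifulSoup

body = ("<code>def function(arg): print(arg)</code> Blah blah blah "
        "<code>int main() { return 0; }</code>")
soup = BeautifulSoup(body, "html.parser")

for block in soup.find_all("code"):
    # Build the replacement as a real tag (illustrative stand-in for highlighted output).
    wrapper = soup.new_tag("pre")
    wrapper["class"] = "highlight"
    wrapper.string = block.get_text()
    # Text between the blocks is a separate node here, so it stays in place.
    block.replace_with(wrapper)

result = str(soup)
```

If the replacement already exists as an HTML string, it could be parsed with `BeautifulSoup(markup, "html.parser")` first and the resulting tag passed to `replace_with`.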