Strange behavior with tagsoup and Groovy's XmlSlurper
Let's say I want to parse the phone number from an XML string like this:
str = """ <root>
<address>123 New York, NY 10019
<div class="phone"> (212) 212-0001</div>
</address>
</root>
"""
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.div.text()
It doesn't print the phone number.
If I change the "div" element to "foo" like this
str = """ <root>
<address>123 New York, NY 10019
<foo class="phone"> (212) 212-0001</foo>
</address>
</root>
"""
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.foo.text()
Then it's able to parse and print the phone number.
What the heck is going on?
By the way, I am using Groovy 1.7.5 and tagsoup 1.2.
Comments (3)
Just change code to
This is the curse of Groovy and many other dynamic languages: "div" is a reserved method name, so you don't get the node but instead try to divide the "address" node :)
I seem to recall that tagsoup normalizes HTML tags - i.e. it uppercases them. So the GPath expression you want is probably
I find it handy to be able to print out the result of the parse - then you can see why your GPath isn't working. Use this..
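As a hedged sketch of the debugging approach described above (this is not the answerer's own snippet): one way to see what tagsoup actually produced is to serialize the parsed tree with `groovy.xml.XmlUtil.serialize`, and then search it depth-first so the lookup does not depend on where the `div` ended up or how its name was cased. It assumes Groovy 2.x, where `XmlSlurper` lives in the default-imported `groovy.util` package, and that tagsoup 1.2 can be fetched with `@Grab`. The variable names `html`, `phoneDiv` and `phone` are illustrative.

```groovy
@Grab('org.ccil.cowan.tagsoup:tagsoup:1.2')
import groovy.xml.XmlUtil

def str = '''<root>
<address>123 New York, NY 10019
<div class="phone"> (212) 212-0001</div>
</address>
</root>'''

// Parse with tagsoup as the underlying SAX parser, as in the question
def html = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText(str)

// Dump the tree exactly as tagsoup built it, to compare against the GPath expression
println XmlUtil.serialize(html)

// Depth-first search: independent of the element's position in the tree
// and of whether the tag name was upper- or lower-cased
def phoneDiv = html.'**'.find { node ->
    node.name().equalsIgnoreCase('div') && node.@class.text() == 'phone'
}
def phone = phoneDiv?.text()?.trim()
println phone
```

Comparing the serialized output with the path used in the question should show whether `div` is still a child of `address` after tagsoup has restructured the markup.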
I know this question is very old, but I ran into the same problem recently, and this is what I used:
Groovy version is 2.4