YQL drops HTML element attributes?

Posted 2024-11-19 04:25:39

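YQL Console Link: http://developer.yahoo.com/yql/console/#h=select%20*%20from%20html%20where%20url=%27http://www.cbs.com/shows/big_brother/video/%27%20and%20xpath=%27//div%5B@id=%22cbs-video-metadata-wrapper%22%5D/div%5B@class=%22cbs-video-share%22%5D/a%27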

Query:

select * from html where url='http://www.cbs.com/shows/big_brother/video/' and xpath='//div[@id="cbs-video-metadata-wrapper"]/div[@class="cbs-video-share"]/a'

Returns:

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
    yahoo:count="1" yahoo:created="2011-07-09T23:14:02Z" yahoo:lang="en-US">
    <diagnostics>
        <publiclyCallable>true</publiclyCallable>
        <url execution-time="146" proxy="DEFAULT"><![CDATA[http://www.cbs.com/shows/big_brother/video/]]></url>
        <user-time>163</user-time>
        <service-time>146</service-time>
        <build-version>19262</build-version>
    </diagnostics> 
    <results>
        <a class="twitter-share-button" href="http://twitter.com/share"/>
    </results>
</query>

It should return something similar to:

    <results>
        <a href="http://twitter.com/share" data-url="http://www.cbs.com/shows/big_brother/video/2045825951/big-brother-episode-1" class="twitter-share-button"></a>
    </results>

If I back the XPath out one level, YQL strips the element out entirely, even though I could also use that element to get the data I need.


Comments (1)

菩提树下叶撕阳。 2024-11-26 04:25:39

We now have a new HTML parser that recognizes custom attributes such as data-url.

Add compat="html5" to trigger the new parser.

e.g.:

select * from html where url = "http://mydomain.com" and compat="html5"
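
Applied to the query from the question, that suggestion might look like the sketch below. It only builds the YQL statement and then pulls data-url out of a response shaped like the one the question hopes for; it does not contact any service. The helper names build_query and extract_data_urls are hypothetical, and the sample XML is copied from the question's expected output, not from a real response.

```python
import xml.etree.ElementTree as ET

PAGE_URL = "http://www.cbs.com/shows/big_brother/video/"
XPATH = (
    '//div[@id="cbs-video-metadata-wrapper"]'
    '/div[@class="cbs-video-share"]/a'
)

# Expected result copied from the question (an assumption, not real output).
SAMPLE_RESULTS = """
<results>
    <a href="http://twitter.com/share" data-url="http://www.cbs.com/shows/big_brother/video/2045825951/big-brother-episode-1" class="twitter-share-button"></a>
</results>
"""


def build_query(page_url, xpath):
    """Build the question's YQL statement with compat="html5" appended."""
    # Single quotes wrap url and xpath so the double quotes inside the
    # XPath predicates do not need escaping.
    return (
        "select * from html "
        f"where url='{page_url}' and xpath='{xpath}' "
        'and compat="html5"'
    )


def extract_data_urls(results_xml):
    """Return the data-url value of every <a> element that has one."""
    root = ET.fromstring(results_xml.strip())
    return [a.get("data-url") for a in root.iter("a") if a.get("data-url")]


if __name__ == "__main__":
    print(build_query(PAGE_URL, XPATH))
    print(extract_data_urls(SAMPLE_RESULTS))
```

Whether the attribute actually comes back still depends on the parser selected by compat="html5", so the extraction step is only useful once the response contains data-url.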