How do I extract a value embedded inside an attribute value returned by an XPath query?

Posted on 2024-11-18 07:00:45

I'm trying to "select" the link from the onclick attribute in the following portion of HTML:

    <span onclick="Javascript:document.quickFindForm.action='/blah_blah'" 
     class="specialLinkType"><img src="blah"></span>

but I can't get any further than the following XPath:

    //span[@class="specialLinkType"]/@onclick

which only returns:

    Javascript:document.quickFindForm.action

Any ideas on how to pick out the link assigned to quickFindForm.action with XPath?

z祗昰~ 2024-11-25 07:00:45

I tried the XPath in a Java application and it worked fine; the full attribute value comes back:

    import java.io.IOException;
    import java.io.StringReader;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;

    public class Teste {

        public static void main(String[] args) throws Exception {
            Document doc = stringToDom("<span onclick=\"Javascript:document.quickFindForm.action='/blah_blah'\" class=\"specialLinkType\"><img src=\"blah\"/></span>");
            XPath newXPath = XPathFactory.newInstance().newXPath();
            XPathExpression xpathExpr = newXPath.compile("//span[@class=\"specialLinkType\"]/@onclick");
            String result = xpathExpr.evaluate(doc);
            System.out.println(result);

        }

        public static Document stringToDom(String xmlSource) throws SAXException, ParserConfigurationException, IOException {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xmlSource)));
        }
    }

Result:

Javascript:document.quickFindForm.action='/blah_blah'

乖乖公主 2024-11-25 07:00:45

If Scrapy supports XPath string functions, this will work:

substring-before(
   substring-after(
      //span[@class="specialLinkType"]/@onclick,"quickFindForm.action='")
   ,"'")
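
The expression above uses only XPath 1.0 string functions, so it can be checked outside Scrapy. Below is a minimal sketch that evaluates it with the JDK's built-in `javax.xml.xpath`, reusing the sample markup from the question (with `<img/>` closed so that it parses as XML). The class name `SubstringXPathDemo` and the helper `extractAction` are illustrative names, not part of any library.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are illustrative only.
class SubstringXPathDemo {

    // The XPath 1.0 expression from the answer, written as a Java string.
    static final String EXPR =
        "substring-before("
        + "substring-after(//span[@class=\"specialLinkType\"]/@onclick, "
        + "\"quickFindForm.action='\"), \"'\")";

    // Parses well-formed markup and evaluates EXPR as a string.
    static String extractAction(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(EXPR, doc);
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<span onclick=\"Javascript:document.quickFindForm.action='/blah_blah'\""
            + " class=\"specialLinkType\"><img src=\"blah\"/></span>";
        System.out.println(extractAction(xml)); // prints /blah_blah
    }
}
```

If the attribute does not contain the marker text, `substring-after` yields an empty string, so the helper returns an empty string rather than failing.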

It looks like Scrapy also supports regular expressions. Something like this should work:

.select('//span[@class="specialLinkType"]/@onclick').re(r'quickFindForm.action=\'(.*?)\'')

Caveat: I can't test the second solution, so you will have to check that \' is the proper escape sequence for a single quote in this case.
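
The capture-group idea can also be tried independently of Scrapy. The following is a sketch in Java with `java.util.regex`, applied to the attribute string that the XPath returns. It is not the Scrapy call itself: the class and method names are hypothetical, and the dot in `quickFindForm.action` is escaped here so that it matches only a literal dot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class OnclickRegexDemo {

    // Lazily capture whatever sits between the single quotes after "action=".
    static final Pattern ACTION =
        Pattern.compile("quickFindForm\\.action='(.*?)'");

    // Returns the captured link, or null when the pattern does not match.
    static String extractAction(String onclick) {
        Matcher m = ACTION.matcher(onclick);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String onclick = "Javascript:document.quickFindForm.action='/blah_blah'";
        System.out.println(extractAction(onclick)); // prints /blah_blah
    }
}
```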

天荒地未老 2024-11-25 07:00:45

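I used XQuery, but it should be the same in XPath 2.0 or later. I used the function "tokenize", which splits a string based on a regular expression (http://www.xqueryfunctions.com/xq/fn_tokenize.html).
In this case I split the string on the single quote character: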

        xquery version "1.0";
        let $x := //span[@class="specialLinkType"]/@onclick
        let $c := fn:tokenize( $x, '''' )
        return $c[2]

In XPath that should be:

        fn:tokenize(//span[@class="specialLinkType"]/@onclick, '''' )[2]
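
`fn:tokenize` belongs to XPath 2.0, so an XPath 1.0 engine (the JDK's built-in `javax.xml.xpath` is one, as far as I know) will not accept it. In that case the same split can be done in host code after the plain `@onclick` query. Below is a rough Java sketch of that fallback; the class and method names are hypothetical. XPath sequences are 1-based, so `[2]` in the expression above corresponds to index 1 in Java.

```java
// Hypothetical demo class; the names here are illustrative only.
class TokenizeFallbackDemo {

    // Splits on the single quote and returns the second piece,
    // mirroring fn:tokenize($x, '''')[2]; null when there is no second piece.
    static String secondToken(String onclick) {
        String[] parts = onclick.split("'");
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        String onclick = "Javascript:document.quickFindForm.action='/blah_blah'";
        System.out.println(secondToken(onclick)); // prints /blah_blah
    }
}
```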
