Web scraping with a bookmarklet?

Posted on 2024-11-04 23:04:03

I'd like to use a bookmarklet to gather together a few resources from different webpages.
That is, instead of using a browser extension to get the HTML elements from the pages, I would like to use a JavaScript bookmarklet to capture the code from the sites.

[Edit] How do I get the HTML elements from a page with a JavaScript bookmarklet?

The question is about getting the inner HTML with a bookmarklet, not about bookmarklets in general.

Comments (3)

所有深爱都是秘密 2024-11-11 23:04:03

You don't need any libraries to do this. Just create your functionality in Firebug or Chrome Inspector and then format it on one line like this:

javascript:(function(){alert(1);})();

Paste this into the location bar and press Enter to run it, replacing alert(1); with your own code. The code is wrapped in a self-executing anonymous function because otherwise the value your code returns would replace the web page.
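
As a concrete illustration of the pattern above, here is a minimal sketch that collects the HTML of every element matching a CSS selector and turns the code into a one-line bookmark URL. The names collectHtml and toBookmarklet, the selector "a", and the use of prompt to show the result are all made up for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: return the outerHTML of every element matching `selector`.
// `doc` is any object with querySelectorAll (in a browser, pass `document`).
function collectHtml(doc, selector) {
  return Array.from(doc.querySelectorAll(selector), function (el) {
    return el.outerHTML;
  });
}

// Hypothetical helper: wrap source code in a self-executing anonymous function
// and percent-encode it so it can be saved as a bookmark URL.
function toBookmarklet(source) {
  return "javascript:" + encodeURIComponent("(function(){" + source + "})();");
}

// Body of the bookmarklet: the helper's own source, followed by a call that
// shows the collected HTML in a prompt so it can be copied.
var body =
  collectHtml.toString() +
  'prompt("Copy the HTML:", collectHtml(document, "a").join("\\n"));';

var bookmarkletUrl = toBookmarklet(body);
console.log(bookmarkletUrl);
```

The printed string is what you would save as the bookmark's URL. Passing doc as a parameter is only a convenience so the helper can be tried outside a browser; inside the bookmarklet it is called with the real document.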

If your code is really long, you can put it all in an external JavaScript file; in place of the alert above, create a script tag with its src pointing at that file and append it to the page.
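
The external-file approach could look roughly like the sketch below. The function name loadExternalScript and the URL https://example.com/scraper.js are placeholders, not something from the original answer, and a page with a strict Content-Security-Policy may refuse to load the script.

```javascript
// Hypothetical loader: append a <script> tag pointing at `src` to the page.
// `doc` is passed in so the function can be exercised outside a browser.
function loadExternalScript(doc, src) {
  var script = doc.createElement("script");
  script.src = src;
  // Fall back to the root element if the page has no <head>.
  (doc.head || doc.documentElement).appendChild(script);
  return script;
}

// In a browser, run it against the real document. The URL is a placeholder.
if (typeof document !== "undefined") {
  loadExternalScript(document, "https://example.com/scraper.js");
}
```

Squeezed onto one line and wrapped as described above, the same idea becomes a bookmarklet that stays short no matter how long the external file grows.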

如痴如狂 2024-11-11 23:04:03

Because bookmarklets are limited in length, consider having the bookmarklet load external JavaScript code that performs the scraping when it is clicked. For accessing DOM elements, see this reference.

Please note that scraping is only possible for FRAMEs/IFRAMEs that originate from the same domain as the main window, due to cross-frame security restrictions.
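
To make the frame restriction concrete, here is a small sketch that reads the HTML of the main document and of each directly nested frame, skipping any frame the browser refuses to expose. The name collectFrameHtml is hypothetical, and the sketch only looks one level deep.

```javascript
// Hypothetical helper: collect the HTML of the top document and of every
// direct child frame whose document is readable (same origin). Accessing the
// document of a cross-origin frame throws in browsers, so those are skipped.
function collectFrameHtml(win) {
  var results = [win.document.documentElement.outerHTML];
  for (var i = 0; i < win.frames.length; i++) {
    try {
      results.push(win.frames[i].document.documentElement.outerHTML);
    } catch (err) {
      // Blocked by cross-frame security: nothing can be read from this frame.
    }
  }
  return results;
}

// In a browser, run it against the real window.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  console.log(collectFrameHtml(window));
}
```

The try/catch is the important part: without it, the first cross-origin frame would abort the whole bookmarklet.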

不一样的天空 2024-11-11 23:04:03

This long script I made will give you exactly that, as well as a few other unique enhancements:

javascript:void function(e){var t=function(e){document.writeln("<!DOCTYPE html>"),document.writeln("<html>"),document.writeln("<body>"),document.writeln(""),document.writeln('<p style="font-size:20px"><b>Public Bookmarklet for viewing a whois of a site. Of course this isnt as complex as the real thing, because I got all the data below from scratch.</b></p><p style="font-size:13px"><i>made by shoe%231327</i></p>'),document.writeln('<p style="font-size:20px">DOMAIN INFO:</p>'),document.writeln(""),document.writeln('{"dig":{"header":{"id":"43226","qr":"1","opcode":"Query","aa":"false","tc":"false","rd":"false","ra":"false","ad":"false","cd":"false","rcode":"NXDOMAIN","qdcount":"1","ancount":"0","nscount":"0","arcount":"0"},"answer":[],"additional":[],"authority":[],"bind":";; Security Level : UNCHECKED\n;; HEADER SECTION\n;; id = 43226\n;; qr = 1    opcode = Query    aa = false    tc = false    rd = false\n;; ra = false    ad = false    cd = false    rcode  = NXDOMAIN\n;; qdcount = 1  ancount = 0  nscount = 0  arcount = 0\n\n;; QUESTION SECTION (1  record)\n;; :fqdn.INANY\n"},"error":false}'),document.writeln('<p id="demo"></p>'),document.writeln("<script>"),document.writeln('document.getElementById("demo").innerHTML = '),document.writeln('"DOMAIN:<br>" + window.location.href;'),document.writeln("</script>"),document.writeln("<!--"),document.writeln('<script type="application/javascript">'),document.writeln("  function getIP(json) {"),document.writeln('    document.write("CLIENT IP: ", json.ip);'),document.writeln("  }"),document.writeln("</script>"),document.writeln(""),document.writeln('<script type="application/javascript" src="https://api.ipify.org%3Fformat=jsonp%26callback=getIP"></script>'),document.writeln("-->"),document.writeln("</body>"),document.writeln("</html>"),document.writeln("<p>statuses: [ <br>"),document.writeln('            "clientTransferProhibited"'),document.writeln("            <br>"),document.writeln("            
]"),document.writeln("</p>"),document.writeln('<p style="font-size:20px">CLIENT INFO:</p>'),document.writeln('<pre id="response"></pre>'),document.writeln(""),e.get("https://api.ipdata.co/%3Fapi-key=test",function(t){e("%23response").html(JSON.stringify(t,null,4))},"jsonp"),document.writeln("</body>"),document.writeln("</html>")},n=e%26%26e.fn%26%26parseFloat(e.fn.jquery)>=1.7;if(n)t(e);else{var o=document.createElement("script");o.src="//ajax.googleapis.com/ajax/libs/jquery/1/jquery.js",o.onload=o.onreadystatechange=function(){var e=this.readyState;e%26%26"loaded"!==e%26%26"complete"!==e||t(jQuery.noConflict())}}document.getElementsByTagName("head")[0].appendChild(o)}(window.jQuery);