How do I detect visited and unvisited links on a page?

Posted 2024-12-02 12:33:46

My aim is to detect the unvisited links on a webpage and then create a Greasemonkey script to click on those links. By unvisited links I mean links that I have not opened. Since all browsers can render visited and unvisited links in different colors, is it possible to detect these links in some way?
While searching I came upon this link: http://www.mozdev.org/pipermail/greasemonkey/2005-November/006821.html but someone told me that this is no longer possible. Please help.

Comments (2)

旧人 2024-12-09 12:33:46

Correct, it is not possible for JavaScript to detect whether a link has been visited in either Firefox or Chrome, which are the only 2 browsers applicable in this Greasemonkey context.

That is because Firefox and Chrome take security and privacy seriously. From the CSS2 spec:

Note. It is possible for style sheet authors to abuse the :link and :visited pseudo-classes to determine which sites a user has visited without the user's consent.

UAs may therefore treat all links as unvisited links, or implement other measures to preserve the user's privacy while rendering visited and unvisited links differently. See [P3P] for more information about handling privacy.

See also "Privacy and the :visited selector".
You can see a demo showing that secure-ish browsers will not let you sniff visited links at jsfiddle.net/n8F9U.
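
To make the restriction concrete, here is a minimal sketch of the kind of color check that history sniffing relies on. The function name `findLinksReportedAsVisited` and its parameters are hypothetical and not part of the script below; `getStyle` stands in for `window.getComputedStyle`. In browsers that implement the privacy measures described above, the computed style of a link is reported as if the link were unvisited, so in a real page this would be expected to come back empty whatever your history contains.

```javascript
// Hypothetical helper: returns the links whose *reported* color differs
// from the color used for unvisited links.
// `getStyle` is injected so the logic can be exercised outside a browser.
function findLinksReportedAsVisited(links, getStyle, unvisitedColor) {
    return Array.from(links).filter(function (link) {
        return getStyle(link).color !== unvisitedColor;
    });
}

// In a page you might call it like this (assuming the page styles
// unvisited links as rgb(0, 0, 238)):
//
//   findLinksReportedAsVisited(
//       document.links,
//       window.getComputedStyle.bind(window),
//       "rgb(0, 0, 238)"
//   );
```

Because the browser, not the script, decides what `getComputedStyle` reports, no amount of cleverness inside the helper changes the result; that is why the workaround below tracks the links itself.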




For your specific situation, because you are visiting a page and keeping it open, you can help a script keep track of what links were visited. It's not fool-proof, but I believe it will do what you've asked for.

First, see the script in action by doing the following:

  1. Install the script, as is.
  2. Browse to the test page, jsbin.com/eledog.
    The test page adds a new link, every time it is reloaded or refreshed.
  3. The GM script adds 2 buttons to the pages it runs on: a "Start/Stop" button in the upper left and a "Clear" button in the lower right.

    When you press the "Start" button, it does the following:

    1. All existing links on the page are logged as "visited".
    2. It starts a timer (default setting: 3 seconds). When the timer goes off, it reloads the page.
    3. Each time the page reloads, it opens any new links and kicks off a new reload timer.
    4. Press the "Stop" button to stop the reloads. The list of visited links is preserved.

    The "Clear" button erases the list of visited pages.
    WARNING: If you press "Clear" while the refresh loop is active, then the next time the page reloads, all links will be opened in new tabs.
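
The Start/Stop/Clear behavior described above can be sketched as a small, DOM-free model. This is only an illustration of the logic, not the script itself: the function names (`pressStart`, `pressStop`, `pressClear`, `onPageLoad`) and the shape of the `state` object are assumptions made for this sketch.

```javascript
// Hypothetical model of the refresh cycle.
// `state` is { running: boolean, visited: string[] }.
// `hrefs` is the list of link URLs found on the current page.

// "Start": every link currently on the page is recorded as visited.
function pressStart(state, hrefs) {
    var seen = new Set(state.visited);
    hrefs.forEach(function (href) { seen.add(href); });
    return { running: true, visited: Array.from(seen) };
}

// "Stop": reloading stops, but the visited list is preserved.
function pressStop(state) {
    return { running: false, visited: state.visited };
}

// "Clear": the visited list is erased; the running flag is untouched.
function pressClear(state) {
    return { running: state.running, visited: [] };
}

// One page load: while running, every link not yet seen is opened
// (returned in `toOpen`) and added to the visited list.
function onPageLoad(state, hrefs) {
    if (!state.running) {
        return { state: state, toOpen: [] };
    }
    var seen = new Set(state.visited);
    var toOpen = [];
    hrefs.forEach(function (href) {
        if (!seen.has(href)) {
            seen.add(href);
            toOpen.push(href);
        }
    });
    return {
        state: { running: true, visited: Array.from(seen) },
        toOpen: toOpen
    };
}
```

The model also shows where the warning comes from: after `pressClear` on a running state, the next `onPageLoad` finds an empty visited list and therefore returns every link on the page in `toOpen`.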

Next, to use the script on your site...

Carefully read the comments in the script. You will have to change the @include, @exclude, and selectorStr values to match the site you are using.

For best results, disable any "Reload Every" add-ons, or "Autoupdate" options.

Important notes:

  1. The script has to use permanent storage to track the links.
    The options are: cookies, sessionStorage, localStorage, globalStorage, GM_setValue(), and IndexedDB.

    These all have drawbacks, and in this case (single site, potentially huge number of links, multiple sessions), localStorage is the best choice (IndexedDB might be, but it is still too unstable -- causing frequent FF crashes on my machine).

    This means that links can only be tracked on a per-site basis, and that "security", "privacy", or "cleaner" utilities can block or erase the list of visited links. (Similarly, clearing the browser's history resets any CSS styling for visited links.)

  2. The script is Firefox-only, for now. It should not work on Chrome, even with Tampermonkey installed, without a little re-engineering.
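
As an illustration of the storage trade-off discussed in note 1, here is a minimal sketch of a visited-link store. It is not the approach the script below takes (the script writes one `Visited_N` key per link); this variant keeps the whole list as one JSON array under a single key. The names `createVisitedStore`, `markIfNew`, and the key `"GM_visitedLinks"` are hypothetical. `storage` can be `localStorage` in a browser, or any object with `getItem`, `setItem`, and `removeItem`.

```javascript
// Hypothetical visited-link store backed by a Web Storage-like object.
function createVisitedStore(storage, key) {
    // Read the saved list, tolerating missing or corrupt data.
    function load() {
        try {
            var parsed = JSON.parse(storage.getItem(key) || "[]");
            return Array.isArray(parsed) ? parsed : [];
        } catch (err) {
            return [];  // Unreadable data: start over with an empty list.
        }
    }

    var visited = new Set(load());

    return {
        // Records the link and returns true if it was not seen before;
        // otherwise returns false and changes nothing.
        markIfNew: function (href) {
            if (visited.has(href)) {
                return false;
            }
            visited.add(href);
            storage.setItem(key, JSON.stringify(Array.from(visited)));
            return true;
        },
        count: function () {
            return visited.size;
        },
        // Erases both the in-memory list and the stored copy.
        clear: function () {
            visited.clear();
            storage.removeItem(key);
        }
    };
}

// Possible use in a page:
//   var store = createVisitedStore(localStorage, "GM_visitedLinks");
//   if (store.markIfNew(link.href)) { /* open the link */ }
```

A single key is easy to clear and cannot collide with the site's own keys by pattern, but every new link rewrites the whole array, so with a very large number of links the per-key layout used by the script may be the better fit.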


The script:

/*******************************************************************************
**  This script:
**      1)  Keeps track of which links have been clicked.
**      2)  Refreshes the page at regular intervals to check for new links.
**      3)  If new links are found, opens those links in a new tab.
**
**  To Set Up:
**      1)  Carefully choose and specify `selectorStr` based on the particulars
**          of the target page(s).
**          The selector string uses any valid jQuery syntax.
**      2)  Set the @include, and/or, @exclude, and/or @match directives as
**          appropriate for the target site.
**      3)  Turn any "Auto update" features off.  Likewise, do not use any
**          "Reload Every" addons.  This script will handle reloads/refreshes.
**
**  To Use:
**      The script will place 2 buttons on the page: A "Start/Stop" button in
**      the upper left and a "Clear" button in the lower right.
**
**      Press the "Start" button to start the script reloading the page and
**      opening any new links.
**      When the button is pressed, it is assumed that any existing links have
**      been visited.
**
**      Press the "Stop" button to halt the reloading and link opening.
**
**      The "Clear" button erases the list of visited links -- which might
**      otherwise be stored forever.
**
**  Methodology:
**      Uses localStorage to track state-machine state, and to keep a
**      persistent list of visited links.
**
**      Implemented with jQuery and some GM_ functions.
**
**      For now, this script is Firefox-only.  It probably will not work on
**      Chrome, even with Tampermonkey.
*/
// ==UserScript==
// @name        _New link / visited link, tracker and opener
// @include     http://jsbin.com/*
// @exclude     /\/edit\b/
// @require     http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
// @grant       GM_addStyle
// @grant       GM_openInTab
// ==/UserScript==
/*- The @grant directive is needed to work around a design change
    introduced in GM 1.0.   It restores the sandbox.
*/

//--- Key control/setup variables:
var refreshDelay    = 3000;    //-- milliseconds.
var selectorStr     = 'ul.topicList a.topicTitle';

//--- Add the control buttons.
$("body")  .append (  '<div id="GM_StartStopBtn" class="GM_ControlWrap">'
                    + '<button>Start checking for new links.</button></div>'
            )
           .append (  '<div id="GM_ClearVisitListBtn" class="GM_ControlWrap">'
                    + '<button>Clear the list of visited links.</button></div>'
            );
$('div.GM_ControlWrap').hover (
    function () { $(this).stop (true, false).fadeTo ( 50, 1); },
    function () { $(this).stop (true, false).fadeTo (900, 0.8); }// Coordinate with CSS.
);

//--- Initialize the link-handler object, but wait until the load event.
var stateMachine;
window.addEventListener ("load", function () {
        stateMachine    = new GM_LinkTrack (    selectorStr,
                                                '#GM_StartStopBtn button',
                                                '#GM_ClearVisitListBtn button',
                                                refreshDelay
                                            );

        /*--- Display the current number of visited links.
            We only update once per page load here.
        */
        var numLinks    = stateMachine.GetVisitedLinkCount ();
        $("body").append ('<p>The page opened with ' + numLinks + ' visited links.</p>');
    },
    false
);


/*--- The link and state tracker object.
    Public methods:
        OpenAllNewLinks ()
        StartStopBtnHandler ()
        ClearVisitedLinkList ()
        StartRefreshTimer ();
        StopRefreshTimer ();
        SetAllCurrentLinksToVisited ()
        GetVisitedLinkCount ()
*/
function GM_LinkTrack (selectorStr, startBtnSel, clearBtnSel, refreshDelay)
{
    var visitedLinkArry = [];
    var numVisitedLinks = 0;
    var refreshTimer    = null;
    var startTxt        = 'Start checking for new links.';
    var stopTxt         = 'Stop checking links and reloading.';

    //--- Get visited link-list from storage.
    for (var J = localStorage.length - 1;  J >= 0;  --J) {
        var itemName    = localStorage.key (J);

        if (/^Visited_\d+$/i.test (itemName) ) {
            visitedLinkArry.push (localStorage[itemName] );
            numVisitedLinks++;
        }
    }

    function LinkIsNew (href) {
        /*--- If the link is new, adds it to the list and returns true.
            Otherwise returns false.
        */
        if (visitedLinkArry.indexOf (href) == -1) {
            visitedLinkArry.push (href);

            var itemName    = 'Visited_' + numVisitedLinks;
            localStorage.setItem (itemName, href);
            numVisitedLinks++;

            return true;
        }
        return false;
    }

    //--- For each new link, open it in a separate tab.
    this.OpenAllNewLinks        = function ()
    {
        $(selectorStr).each ( function () {

            if (LinkIsNew (this.href) ) {
                GM_openInTab (this.href);
            }
        } );
    };

    this.StartRefreshTimer      = function () {
        if (typeof refreshTimer != "number") {
            refreshTimer        = setTimeout ( function() {
                                        window.location.reload ();
                                    },
                                    refreshDelay
                                );
        }
    };

    this.StopRefreshTimer       = function () {
        if (typeof refreshTimer == "number") {
            clearTimeout (refreshTimer);
            refreshTimer        = null;
        }
    };

    this.SetAllCurrentLinksToVisited = function () {
        $(selectorStr).each ( function () {
            LinkIsNew (this.href);
        } );
    };

    this.GetVisitedLinkCount = function () {
        return numVisitedLinks;
    };

    var context = this; //-- This seems clearer than using `.bind(this)`.
    this.StartStopBtnHandler    = function (zEvent) {
        if (inRefreshCycle) {
            //--- "Stop" pressed.  Stop searching for new links.
            $(startBtnSel).text (startTxt);
            context.StopRefreshTimer ();
            localStorage.setItem ('inRefreshCycle', '0'); //Set false.
        }
        else {
            //--- "Start" pressed.  Start searching for new links.
            $(startBtnSel).text (stopTxt);
            localStorage.setItem ('inRefreshCycle', '1'); //Set true.

            context.SetAllCurrentLinksToVisited ();
            context.StartRefreshTimer ();
        }
        inRefreshCycle  ^= true;    //-- Toggle value.
    };

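    this.ClearVisitedLinkList   = function (zEvent) {
        //--- Reset the in-memory list too, so that links can be re-recorded
        //    without waiting for a page reload.
        visitedLinkArry = [];
        numVisitedLinks = 0;

        for (var J = localStorage.length - 1;  J >= 0;  --J) {
            var itemName    = localStorage.key (J);

            if (/^Visited_\d+$/i.test (itemName) ) {
                localStorage.removeItem (itemName);
            }
        }
    };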

    //--- Activate the buttons.
    $(startBtnSel).click (this.StartStopBtnHandler);
    $(clearBtnSel).click (this.ClearVisitedLinkList);

    //--- Determine state.  Are we running the refresh cycle now?
    var inRefreshCycle  = parseInt (localStorage.inRefreshCycle, 10)  ||  0;
    if (inRefreshCycle) {
        $(startBtnSel).text (stopTxt); //-- Change the btn label to "Stop".
        this.OpenAllNewLinks ();
        this.StartRefreshTimer ();
    }
}

//--- Style the control buttons.
GM_addStyle ( "                                                             \
    .GM_ControlWrap {                                                       \
        opacity:            0.8;    /*Coordinate with hover func. */        \
        background:         pink;                                           \
        position:           fixed;                                          \
        padding:            0.6ex;                                          \
        z-index:            666666;                                         \
    }                                                                       \
    .GM_ControlWrap button {                                                \
        padding:            0.2ex 0.5ex;                                    \
        border-radius:      1em;                                            \
        box-shadow:         3px 3px 3px gray;                               \
        cursor:             pointer;                                        \
    }                                                                       \
    .GM_ControlWrap button:hover {                                          \
        color:              red;                                            \
    }                                                                       \
    #GM_StartStopBtn {                                                      \
        top:                0;                                              \
        left:               0;                                              \
    }                                                                       \
    #GM_ClearVisitListBtn {                                                 \
        bottom:             0;                                              \
        right:              0;                                              \
    }                                                                       \
" );
魔法少女 2024-12-09 12:33:46

You can parse all the links on the page and get their CSS color property. If a link's color matches the color you defined in CSS for unvisited links, then that link is unvisited.

This kind of technique is usually used to determine all visited links.
It is a sort of security breach that lets you determine whether a user has visited a particular website. It is usually used by sleazy marketers.

Tricks of this kind are usually classified as "browser history manipulation tricks".

More info with code: http://www.stevenyork.com/tutorial/getting_browser_history_using_javascript
