Greasemonkey, XPath: find all links within a table row

Posted 2024-07-26 00:11:55

Given:

<tr>
  <td><a href="http://foo.com">Keyword 1</a></td>
  <td><a href="http://bar.com">Keyword 2</a></td>
  <td><a href="http://wombat.com">Keyword 3</a></td>
</tr>

<tr>
  <td><a href="http://blah.com">Keyword 4</a></td>
  <td><a href="http://woof.com">Keyword 5</a></td>
  <td><a href="http://miaow.com">Keyword 6</a></td>
</tr>

I need to match each URI within the table cells. The keyword is consistent throughout the document. I can match links for the entire document with no trouble:

var links_in_document = document.evaluate(
  "//a[starts-with(text(),'Keyword')]",
  document,
  null,
  XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
  null);

However, even though I have an easy way to reference the TR node, I can't seem to find the right XPath to obtain the links in the row. The snippet below seems to give me the first link in the first TD, but not the rest. Help?

var links_in_row = document.evaluate(
  ".//a[starts-with(text(),'Keyword')]",
  row,
  null,
  XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
  null);

(where 'row' is the context node).

Edit: perhaps I wasn't clear. I can find the links from the document level just fine; what I am trying to do is isolate the links in a single row by using the TR node as the context for the XPath.

Edit: solution, for interest. The broken markup I was working on had no id attributes, so I added some and was able to proceed. Snippet:

var exhibit_link;
for( var i = 0; i < all_exhibit_links.snapshotLength; i++ ) {
  exhibit_link = all_exhibit_links.snapshotItem( i );

  // The rows have no unique ID, so we need to give them one.
  // This will give the XPath something to 'latch onto'.
  exhibit_link.parentNode.parentNode.id = 'ex_link_row_' + i.toString();

  exhibit_link.addEventListener( "click", 
    function( event ) {
      var row_id = event.target.parentNode.parentNode.id;

      // Find only those links that are within rows with the corresponding id
      var row_links = document.evaluate(
        "id('" + row_id + "')/td/a[starts-with(text(),'Exhibit')]",
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null);

      // Open each link in a new tab
      for( var j = 0; j < row_links.snapshotLength; j++ ) {
        var row_link = row_links.snapshotItem( j );
        GM_openInTab( row_link.href );
      }

      // Suppress the original function of the link
      event.stopPropagation();
      event.preventDefault();
    }, 
    true );
}
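
As an alternative to tagging rows with ids, the row element itself can be passed to `evaluate` as the context node. What follows is only a minimal sketch under some assumptions: the clicked link sits somewhere inside a TR, the prefix contains no single quote, and `findAncestorRow` and `linksInSameRow` are hypothetical helper names that are not part of the original script. The numeric constant stands in for `XPathResult.ORDERED_NODE_SNAPSHOT_TYPE`.

```javascript
// Hypothetical helpers, not part of the original script.
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
var ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Walk up from a node to the nearest enclosing <tr>; return null if none.
function findAncestorRow(node) {
  while (node && !(node.nodeType === 1 && node.tagName.toUpperCase() === 'TR')) {
    node = node.parentNode;
  }
  return node || null;
}

// Return the links in the same row as `link` whose text starts with `prefix`.
// Assumption: `prefix` contains no single quote.
function linksInSameRow(link, prefix) {
  var row = findAncestorRow(link);
  if (!row) {
    return [];
  }
  var result = row.ownerDocument.evaluate(
    ".//a[starts-with(text(),'" + prefix + "')]",
    row,   // context node: the leading "." refers to this row
    null,
    ORDERED_NODE_SNAPSHOT_TYPE,
    null);
  var links = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    links.push(result.snapshotItem(i));
  }
  return links;
}
```

Inside the click handler this could be called as `linksInSameRow(event.target, 'Exhibit')`, which would avoid both the generated ids and the hard-coded `parentNode.parentNode` hops.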

Comments (3)

喵星人汪星人 2024-08-02 00:11:55

A quick test in the JavaScript Shell with your HTML example and the following code:

var links_in_row = document.evaluate( ".//a[starts-with(text(),'Keyword')]"
          , document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
var i = 0;
var link;
while( (link = links_in_row.snapshotItem(i) ) != null) {
   print(link.innerHTML);i++;
}

prints out:

Keyword 1
Keyword 2
Keyword 3

which suggests it is working correctly.
The only change I made was to start at the document rather than at the row level...

素年丶 2024-08-02 00:11:55

Expanding on what bert wrote, this works for me.

var rows = document.evaluate( "//tr"
          , document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
var i = 0;
var row, link;
while( (row = rows.snapshotItem(i) ) != null) {
    print( 'NEW ROW----');
    var links = document.evaluate(".//a[starts-with(text(),'Keyword')]",
                                  row, null, 
                                  XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
    var k = 0;
    while ((link = links.snapshotItem(k)) != null) {
       print( link.innerHTML );
       k++;
    }
    i++;
}

Prints out:

NEW ROW----
Keyword 1
Keyword 2
Keyword 3
NEW ROW----
Keyword 4
Keyword 5
Keyword 6

I suspect the problem lies in code outside of what was copied and pasted here.

bert should get the answer for this one IMHO.
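
Both answers walk the snapshot with the same `while` loop. As a small sketch, that loop could be factored into a helper; `snapshotToArray` is a hypothetical name, and it relies only on the `snapshotLength` and `snapshotItem` members of a snapshot-type XPathResult.

```javascript
// Hypothetical helper: copy a snapshot-type XPathResult into a plain array.
function snapshotToArray(snapshot) {
  var nodes = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Intended browser usage (not executed here):
//   var rows = snapshotToArray(document.evaluate("//tr", document, null,
//       XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null));
//   rows.forEach(function (row) { /* evaluate ".//a[...]" with row as context */ });
```

With an array in hand, the nested index variables `i` and `k` are no longer needed, which removes one place where a per-row loop can go wrong.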

蝶舞 2024-08-02 00:11:55

Try:

descendant::*[self::a[starts-with(text(), 'Keyword')]]
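
The expression above hard-codes the keyword. If the prefix comes from a variable, it has to be quoted carefully, because XPath 1.0 string literals have no escape character. Here is a hedged sketch; `xpathLiteral` and `rowLinkXPath` are hypothetical names, and the shorter `descendant::a[...]` form is used on the assumption that it selects the same nodes as `descendant::*[self::a[...]]`.

```javascript
// Hypothetical helper: turn a JavaScript string into an XPath 1.0 string literal.
function xpathLiteral(s) {
  if (s.indexOf("'") === -1) {
    return "'" + s + "'";
  }
  if (s.indexOf('"') === -1) {
    return '"' + s + '"';
  }
  // Both quote kinds present: split on single quotes and glue with concat().
  return "concat('" + s.split("'").join("', \"'\", '") + "')";
}

// Hypothetical helper: build a row-relative expression for links whose
// first text node starts with `prefix`.
function rowLinkXPath(prefix) {
  return "descendant::a[starts-with(text(), " + xpathLiteral(prefix) + ")]";
}
```

The resulting string is meant to be evaluated with the TR element as the context node, since the `descendant::` axis is relative to whatever node is passed to `evaluate`.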