When to use NodeIterator

Posted on 2024-12-12 11:18:08

Benchmark compares QSA & .forEach vs a NodeIterator

// Convert the static NodeList to an array so that .forEach is available
Array.prototype.slice.call(document.querySelectorAll("div > a.klass")).forEach(function (node) {
  // do something with node
});

var filter = {
    acceptNode: function (node) {
        var condition = node.parentNode.tagName === "DIV" &&
            node.classList.contains("klass") &&
            node.tagName === "A";

        // For a NodeIterator, FILTER_REJECT behaves like FILTER_SKIP
        return condition ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT;
    }
};
// FIREFOX Y U SUCK
var iter = document.createNodeIterator(document, NodeFilter.SHOW_ELEMENT, filter, false);
var node;
while ((node = iter.nextNode())) {
    // do thing with node
}

Now either NodeIterators suck or I'm doing it wrong.

Question: When should I use a NodeIterator?

In case you don't know, DOM4 specifies what NodeIterator is.


自此以后,行同陌路 2024-12-19 11:18:08

NodeIterator (and TreeWalker, for that matter) is almost never used, for a variety of reasons. This means that information on the topic is scarce, and answers like @gsnedders' appear that fail to mention any of the API's unique features. I know this question is almost a decade old, so excuse my necromancy.

1. Initiation & Performance

It is true that the initiation of a NodeIterator is way slower than a method like querySelectorAll, but that is not the performance you should be measuring.

The thing about NodeIterators is that they are live-ish in the way that, just like an HTMLCollection or live NodeList, you can keep using the object after initiating it once.
The NodeList returned by querySelectorAll is static and will have to be re-initiated every time you need to match newly added elements.

This version of the jsPerf puts the NodeIterator in the preparation code. The actual test only tries to loop over all newly added elements with iter.nextNode(). You can see that the iterator is now orders of magnitude faster.
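
As a rough sketch of what "initiate once, keep using" can look like (the drain helper and the a-only filter are hypothetical names for illustration, not taken from the jsPerf): the iterator is created once, drained, and drained again after new elements arrive. One assumption worth stating: the iterator only moves forward, so it picks up nodes that sit later in document order than the last node it returned.

```javascript
// Hypothetical helper: visit every node the iterator has not returned yet.
function drain(iter, visit) {
  let count = 0;
  let node;
  while ((node = iter.nextNode())) {
    visit(node);
    count += 1;
  }
  return count; // number of nodes visited in this pass
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== 'undefined' && document.body) {
  // Created once, e.g. in setup code.
  const iter = document.createNodeIterator(document.body, NodeFilter.SHOW_ELEMENT, {
    acceptNode: function (node) {
      return node.tagName === 'A' ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP;
    }
  });

  drain(iter, function (a) { console.log('first pass', a); });

  // Later: a link is appended after everything the iterator has seen so far.
  document.body.appendChild(document.createElement('a'));

  // The same iterator object continues from where it stopped.
  drain(iter, function (a) { console.log('second pass', a); });
}
```

With querySelectorAll, the second pass would need a fresh query and some way of telling old matches from new ones.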

2. Selector performance

Okay, cool, caching the iterator after initiation makes this example faster than querying. How you use the API still matters for iteration speed, though. In this version, you can observe another significant difference. I've added 10 classes (done[0-9]) that the selectors shouldn't be matching. The iterator loses about 10% of its speed, while the querySelectors lose 20%.

This version, on the other hand, shows what happens when you add another div > at the start of the selector. The iterator loses 33% of its speed, while the querySelectors got a speed INCREASE of 10%.

Removing the initial div > at the start of the selector, like in this version, shows that both methods become slower, because they match more than earlier versions. As expected, the iterator is relatively more performant than the querySelectors in this case.

This means that filtering on the basis of a node's own properties (its classes, attributes, etc.) is probably faster in a NodeIterator, while having a lot of combinators (>, +, ~, etc.) in your selector probably means querySelectorAll is faster.
This is especially true for the descendant (space) combinator. Selecting elements with querySelectorAll('article a') is way easier than manually looping over all parents of every a element, looking for one that has a tagName of 'ARTICLE'.

P.S. in §3.2, I give an example of how the exact opposite can be true if you want the opposite of what the space combinator does (exclude a tags with an article ancestor).

3. Impossible selectors

3.1 Simple hierarchical relationships

Of course, manually filtering elements gives you practically unlimited control. This means that you can select elements that CSS selectors could not match when this answer was written. For example, combinators can only "look back": selecting divs that are preceded by another div is possible with div + div, but selecting divs that are followed by another div was impossible at the time (the newer :has() pseudo-class, as in div:has(+ div), covers this case in browsers that support it).

However, inside a NodeFilter, you can achieve this by checking node.nextElementSibling && node.nextElementSibling.tagName === 'DIV' (the sibling is null for a last child, hence the guard). The same goes for every selection CSS selectors can't make.
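
Spelled out as a complete filter, that check might look like the sketch below. The function name divFollowedByDiv and the FILTER fallback object are assumptions added for illustration; the fallback only exists so the predicate can be exercised outside a browser, and its numeric values are the ones the DOM standard assigns to the NodeFilter constants.

```javascript
// Use the real NodeFilter in a browser, plain constants elsewhere.
const FILTER = typeof NodeFilter !== 'undefined'
  ? NodeFilter
  : { FILTER_ACCEPT: 1, FILTER_REJECT: 2, FILTER_SKIP: 3 };

// Hypothetical filter: accept a div only when its next element sibling is also a div.
function divFollowedByDiv(node) {
  const next = node.nextElementSibling; // null when node is the last element child
  const matches = node.tagName === 'DIV' && next && next.tagName === 'DIV';
  return matches ? FILTER.FILTER_ACCEPT : FILTER.FILTER_SKIP;
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== 'undefined' && document.body) {
  const iter = document.createNodeIterator(
    document.body, NodeFilter.SHOW_ELEMENT, { acceptNode: divFollowedByDiv });
  let node;
  while ((node = iter.nextNode())) {
    console.log(node); // every div that has a div right after it
  }
}
```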

3.2 More global hierarchical relationships

Another thing I personally love about the usage of NodeFilters, is that when passed to a TreeWalker, you can reject a node and its whole sub-tree by returning NodeFilter.FILTER_REJECT instead of NodeFilter.FILTER_SKIP.

Imagine you want to iterate over all a tags on the page, except for ones with an article ancestor.
With querySelectors, you'd type something like

let a = document.querySelectorAll('a')
a = Array.prototype.filter.call(a, function (node) {
  while (node = node.parentElement) if (node.tagName === 'ARTICLE') return false
  return true
})

While in a NodeFilter, you'd only have to type this

return node.tagName === 'ARTICLE' ? NodeFilter.FILTER_REJECT : // ✨ Magic happens here ✨
       node.tagName === 'A'       ? NodeFilter.FILTER_ACCEPT :
                                    NodeFilter.FILTER_SKIP
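
For context, here is one way that return statement could be wired into a TreeWalker. This is only a sketch: linkOutsideArticle, linksOutsideArticles and the FILTER fallback are hypothetical names introduced here, and the fallback constants are there purely so the verdict function can be checked outside a browser.

```javascript
// Use the real NodeFilter in a browser, plain constants elsewhere.
const FILTER = typeof NodeFilter !== 'undefined'
  ? NodeFilter
  : { FILTER_ACCEPT: 1, FILTER_REJECT: 2, FILTER_SKIP: 3 };

// The three-way verdict from the answer, wrapped in a named function.
function linkOutsideArticle(node) {
  return node.tagName === 'ARTICLE' ? FILTER.FILTER_REJECT : // drop the article and its whole subtree
         node.tagName === 'A'       ? FILTER.FILTER_ACCEPT : // a link outside any article
                                      FILTER.FILTER_SKIP;    // ignore this node, keep descending
}

// Hypothetical wrapper: collect every matching link below `root` (an element).
function linksOutsideArticles(root) {
  const walker = root.ownerDocument.createTreeWalker(
    root, NodeFilter.SHOW_ELEMENT, { acceptNode: linkOutsideArticle });
  const links = [];
  let node;
  while ((node = walker.nextNode())) {
    links.push(node);
  }
  return links;
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== 'undefined' && document.body) {
  console.log(linksOutsideArticles(document.body));
}
```

The subtree pruning only applies to a TreeWalker; a NodeIterator treats FILTER_REJECT the same as FILTER_SKIP.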

In conclusion

NodeIterators and TreeWalkers shouldn't be instantiated for loads of loops and definitely shouldn't replace one-off loops. For all intents and purposes, they are just alternative methods for keeping track of lists/trees of nodes, with the latter also having a handy bit of sugar added with FILTER_REJECT.

There's one main advantage both NodeIterators and TreeWalkers have to offer:

  • Live-ishness, as discussed in §1

However, don't use them when any of the following is true:

  • Its instance is only going to be used once/a few times
  • Complex hierarchical relationships are queried that are possible with CSS selectors
    (i.e. body.no-js article > div > div a[href^="/"])
┼── 2024-12-19 11:18:08

It's slow for a variety of reasons. The most obvious is that nobody uses it, so far less time has been spent optimizing it. The other problem is that it's massively re-entrant: every node has to call into JS and run the filter function.

If you look at revision three of the benchmark, you'll find I've added a reimplementation of what the iterator is doing using getElementsByTagName("*") and then running an identical filter on that. As the results show, it's massively quicker. Going JS -> C++ -> JS is slow.

Filtering the nodes entirely in JS (the getElementsByTagName case) or C++ (the querySelectorAll case) is far quicker than doing it by repeatedly crossing the boundary.
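
The benchmark revision itself is not reproduced here, so the following is only a guess at the shape of such a reimplementation: fetch every element with getElementsByTagName("*") and run the question's condition as an ordinary JS callback. The names isKlassLinkInDiv and filterAllElements are hypothetical.

```javascript
// The same condition the question's filter checks, as a plain predicate.
function isKlassLinkInDiv(node) {
  return node.tagName === 'A' &&
    node.classList.contains('klass') &&
    Boolean(node.parentNode) &&
    node.parentNode.tagName === 'DIV';
}

// Hypothetical reimplementation: grab all elements, then filter them in JS.
function filterAllElements(root, predicate) {
  return Array.prototype.filter.call(root.getElementsByTagName('*'), predicate);
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== 'undefined') {
  filterAllElements(document, isKlassLinkInDiv).forEach(function (node) {
    // do something with node
  });
}
```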

Note also that selector matching, as used by querySelectorAll, is comparatively smart: it matches right to left and relies on pre-computed caches (most browsers will iterate over a cached list of all elements with the class "klass", check whether each one is an a element, and then check whether its parent is a div), so they won't even bother iterating over the entire document.
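
To make the order of those checks concrete, here is a plain-JS imitation for the selector div > a.klass. It is an illustration only, written under the assumption that getElementsByClassName stands in for the engine's cached class list; it does not describe how any particular browser is implemented, and matchRightToLeft is a hypothetical name.

```javascript
// Imitates right-to-left matching of "div > a.klass" in plain JS.
function matchRightToLeft(root) {
  // Start from the rightmost, most specific part: elements with class "klass".
  const candidates = root.getElementsByClassName('klass');
  return Array.prototype.filter.call(candidates, function (node) {
    if (node.tagName !== 'A') return false;          // is it an a element?
    const parent = node.parentNode;
    return Boolean(parent) && parent.tagName === 'DIV'; // is its parent a div?
  });
}

// Browser-only usage, guarded so the snippet also loads outside a browser.
if (typeof document !== 'undefined') {
  console.log(matchRightToLeft(document));
}
```

Only the elements carrying the class are ever inspected, which is why this approach never has to walk the entire document.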

Given that, when to use NodeIterator? Basically never in JavaScript, at least. In languages such as Java (undoubtedly the primary reason why there's an interface called NodeIterator), it will likely be just as quick as anything else, as then your filter will be in the same language as the iterator. Apart from that, the only other time it makes sense is in languages where the memory usage of creating a Node object is far greater than the internal representation of the Node.
