Javascript getElementById 查找 - 哈希映射还是递归树遍历?

发布于 2024-08-30 09:31:26 字数 117 浏览 6 评论 0原文

DOM 是否有一个元素的哈希表,其键是元素的 id?
我想知道 document.getElementById 是否查找哈希表或遍历整个树。
这种行为在不同浏览器中是否有所不同?

Does the DOM have a hash-table of elements whose keys are the elements' ids?
I want to know if document.getElementById looks up a hash table or traverses the entire tree.
Is this behavior different across browsers?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

沧笙踏歌 2024-09-06 09:31:26

我了解 Firefox 和 WebKit DOM 实现,它们都使用哈希表来查找元素,深入研究它们的来源,您可以查看内部实现:

WebKit 实现,Document.cpp,如果 id 唯一,则使用哈希表,否则它遍历文档以获取第一个匹配项:

Element* Document::getElementById(const AtomicString& elementId) const
{
    if (elementId.isEmpty())
        return 0;

    m_elementsById.checkConsistency();

    Element* element = m_elementsById.get(elementId.impl());//<-- hastable lookup
    if (element)
        return element;

    if (m_duplicateIds.contains(elementId.impl())) {
        // We know there's at least one node with this id,
        // but we don't know what the first one is.
        for (Node *n = traverseNextNode(); n != 0; n = n->traverseNextNode()) {
            if (n->isElementNode()) {
                element = static_cast<Element*>(n);
                if (element->hasID() &&
                element->getAttribute(element->idAttributeName()) == elementId) {
                    m_duplicateIds.remove(elementId.impl());
                    m_elementsById.set(elementId.impl(), element);
                    return element;
                }
            }
        }
        ASSERT_NOT_REACHED();
    }
    return 0;
}

Firefox 实现, nsDocument.cpp

I know about the Firefox and WebKit DOM implementations, both of them use a hashtable to lookup the elements, digging into the source of them you can give a look to the internal implementations:

WebKit implementation, Document.cpp, uses the hashtable if the id is unique, otherwise it traverses the document to get the first match:

Element* Document::getElementById(const AtomicString& elementId) const
{
    if (elementId.isEmpty())
        return 0;

    m_elementsById.checkConsistency();

    Element* element = m_elementsById.get(elementId.impl());//<-- hastable lookup
    if (element)
        return element;

    if (m_duplicateIds.contains(elementId.impl())) {
        // We know there's at least one node with this id,
        // but we don't know what the first one is.
        for (Node *n = traverseNextNode(); n != 0; n = n->traverseNextNode()) {
            if (n->isElementNode()) {
                element = static_cast<Element*>(n);
                if (element->hasID() &&
                element->getAttribute(element->idAttributeName()) == elementId) {
                    m_duplicateIds.remove(elementId.impl());
                    m_elementsById.set(elementId.impl(), element);
                    return element;
                }
            }
        }
        ASSERT_NOT_REACHED();
    }
    return 0;
}

Firefox implementation, nsDocument.cpp

淡莣 2024-09-06 09:31:26

实现可以自由地做任何他们喜欢的事情,但由于 id 被定义为唯一值,因此使用哈希映射或类似的查找机制而不是遍历似乎是明智的。然而,当您开始构建一个包含许多(有时是冲突的)命令的复杂网络浏览器时,从外部看来合理的事情可能并不合理。

我做了一个简单但非常简单的测试(参见答案末尾的页面)。它非常简单化,尤其是因为我们不知道浏览器不会缓存以前的结果。

Chrome 4.1.249.1059 报告:

ID: 0.0082ms per lookup
Tag: 0.0249ms per lookup

因此,通过 ID 比通过标记名称更快。

IE7 报告:

ID: 2.4438ms per lookup
Tag: 0.0437ms per lookup

通过标签名称比通过 ID 快得多(但请记住 IE7 有一个 getElementById 的概念被破坏;这在 IE8 中已修复)。

IE8(在不同的机器上,不要比较绝对值,我们正在查看测试的浏览器内的差异)报告:

ID: 1.1335ms per lookup
Tag: 0.0287ms per lookup

所以与 IE7 大致相同。

Firefox 3.6.3报告:

ID: 0.0042ms per lookup
Tag: 0.006ms per lookup

所以它不太关心(对于重复的请求;再次,它可能正在缓存)。

Opera 10.5.1 报告:

ID: 0.006ms per lookup
Tag: 1.467ms per lookup

通过 ID 比通过标签名称更快。

随心所欲地利用这些结果。需要更复杂的测试用例才能真正推断出机制。当然,至少在其中两种情况下(Firefox 和 Chrome),我们可以查看源代码。 CMS 恳请向我们指出 WebKitFirefox 实现(看看它,我对缓存的怀疑可能是 )。

测试页面:

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<title>Test Page</title>
<style type='text/css'>
body {
    font-family: sans-serif;
}
#log p {
    margin:     0;
    padding:    0;
}
</style>
<script type='text/javascript'>
window.onload = pageInit;
function pageInit() {
    document.getElementById('btnGo').onclick = btnGoClick;
}
function btnGoClick() {

    log("Testing...");
    setTimeout(run, 0);
}

function run() {
    var count, time;

    try {
        // Warm up
        testid(10);
        testtag(10);

        // Test
        count = 10000
        time = testid(count);
        log("ID: " + (time / count) + "ms per lookup");
        time = testtag(count);
        log("Tag: " + (time / count) + "ms per lookup");
    }
    catch (e) {
        log("Error: " + (e.message ? e.message : String(e)));
    }
}

function testid(count) {
    var start;

    start = new Date().getTime();
    while (count-- > 0) {
        if (!document.getElementById('fred')) {
            throw "ID 'fred' not found";
        }
    }
    return new Date().getTime() - start;
}

function testtag(count) {
    var start;

    start = new Date().getTime();

    while (count-- > 0) {
        if (document.getElementsByTagName('em').length == 0) {
            throw "Tag 'em' not found";
        }
    }
    return new Date().getTime() - start;
}

function log(msg) {
    var log = document.getElementById('log');
    log.innerHTML += "<p>" + msg + "<\/p>";
}
</script>
</head>
<body><div>
<input type='button' id='btnGo' value='Go'>
<div id='log'></div>
<hr>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<!-- repeat the above a couple of thousand times; I had about 2,200 -->
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<em id='fred'>test</em></span></span></span></div>
</div></body>
</html>

Implementations are free to do whatever they like, but since id is defined as a unique value, it would seem sensible to use a hash map or similiar lookup mechanism rather than traversal. What seems sensible from the outside, though, may not be when you get into the plumbing of building a complex web browser with many (sometimes conflicting) imperatives.

I did an easy but very simplistic test (see page at end of answer). It's very simplistic not least because we don't know that browsers don't cache previous results.

Chrome 4.1.249.1059 reports:

ID: 0.0082ms per lookup
Tag: 0.0249ms per lookup

So, dramatically faster by ID than tag name.

IE7 reports:

ID: 2.4438ms per lookup
Tag: 0.0437ms per lookup

So dramatically faster by tag name than ID (but remember IE7 has a broken concept of getElementById; this is fixed in IE8).

IE8 (on a different machine, don't compare absolutes, we're looking at diffs within the browser tested) reports:

ID: 1.1335ms per lookup
Tag: 0.0287ms per lookup

So about the same as IE7.

Firefox 3.6.3 reports:

ID: 0.0042ms per lookup
Tag: 0.006ms per lookup

So it doesn't care that much (on repeated requests; again, it may be caching).

Opera 10.5.1 reports:

ID: 0.006ms per lookup
Tag: 1.467ms per lookup

Dramatically faster by ID than tag name.

Make of those results what you will. A more complex test case would be needed to really infer the mechanisms. Of course, in at least two of those cases (Firefox and Chrome), we can go look at the source. CMS kindly points us to the WebKit and Firefox implementations (and looking at it, my suspicion about caching may have been on the money).

Test page:

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<title>Test Page</title>
<style type='text/css'>
body {
    font-family: sans-serif;
}
#log p {
    margin:     0;
    padding:    0;
}
</style>
<script type='text/javascript'>
window.onload = pageInit;
function pageInit() {
    document.getElementById('btnGo').onclick = btnGoClick;
}
function btnGoClick() {

    log("Testing...");
    setTimeout(run, 0);
}

function run() {
    var count, time;

    try {
        // Warm up
        testid(10);
        testtag(10);

        // Test
        count = 10000
        time = testid(count);
        log("ID: " + (time / count) + "ms per lookup");
        time = testtag(count);
        log("Tag: " + (time / count) + "ms per lookup");
    }
    catch (e) {
        log("Error: " + (e.message ? e.message : String(e)));
    }
}

function testid(count) {
    var start;

    start = new Date().getTime();
    while (count-- > 0) {
        if (!document.getElementById('fred')) {
            throw "ID 'fred' not found";
        }
    }
    return new Date().getTime() - start;
}

function testtag(count) {
    var start;

    start = new Date().getTime();

    while (count-- > 0) {
        if (document.getElementsByTagName('em').length == 0) {
            throw "Tag 'em' not found";
        }
    }
    return new Date().getTime() - start;
}

function log(msg) {
    var log = document.getElementById('log');
    log.innerHTML += "<p>" + msg + "<\/p>";
}
</script>
</head>
<body><div>
<input type='button' id='btnGo' value='Go'>
<div id='log'></div>
<hr>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<!-- repeat the above a couple of thousand times; I had about 2,200 -->
<div>test test<span>test<span>test<span>test<span>test</span></span></span></span></div>
<div>test test<span>test<span>test<span>test<em id='fred'>test</em></span></span></span></div>
</div></body>
</html>
莫多说 2024-09-06 09:31:26

具体实现未在 HTML 中定义规范,因此它可以轻松地改变浏览器。例如 IE 文档 指出

返回对具有指定 ID 或 NAME 属性值的第一个对象的引用。

所以我很想说它会进行搜索(或者它只是在哈希冲突的情况下抛出元素)。

编辑 另请记住,还有其他数据结构(如树)允许访问时间介于常数和线性之间。

The specific implementation isn't defined in the HTML spec so it could easily vary browser to browser. For example IE documentation states

Returns a reference to the first object with the specified value of the ID or NAME attribute.

so I'd be tempted to say it does a search (or it just throws out elements in cases of hash collisions).

EDIT Also keep in mind there are other data structures (like trees) that allow for access time somewhere between constant and linear.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文