How are JavaScript host objects implemented?

Posted on 2024-12-11 03:10:02

I was thinking about this today and I realized I don't have a clear picture here.

Here are some statements I believe to be true (please correct me if I'm wrong):

  • the DOM is a collection of interfaces specified by W3C.
  • when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
  • the ECMAScript spec makes no reference to browser host objects (DOM, BOM, HTML5 APIs etc.).
  • how the DOM is actually implemented depends on browser internals and is probably different among most of them.
  • modern JS interpreters use JIT to improve the code performance and translate it to bytecode

I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?

I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)

Comments (3)

峩卟喜欢 2024-12-18 03:10:02

All of your bullet points are correct, except:

modern JS interpreters use JIT to improve the code performance and translate it to bytecode

should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.

On Mozilla's JS-to-DOM bridge:

The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.

The specific way the underlying C++ implementation gets called depends on the API and has also changed over time. Note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code:

  • At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
  • For the core ECMAScript functionality and in some tricky DOM cases, the JS engine or the browser uses these APIs directly to define host objects and their behaviors, but this requires a lot of common boilerplate code, e.g. for checking parameter types, converting them to the appropriate C++ types, and handling errors.
  • For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
  • The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out dynamically what they needed to do in every particular case. This post about quickstubs walks you through one example.
  • "Quick stubs", mentioned in the previous link, are a way to reduce the cost of JS->C++ calls by trading some code size for speed: instead of always using generic C++ functions that know how to make any kind of call, specialized code is generated automatically at Firefox build time for a predefined list of "hot" calls.
  • Later on, the JIT (TraceMonkey at that time) was taught to generate the code calling C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (JaegerMonkey) work in this regard.
  • With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?

I'm fuzzy on the details of the last three points in particular, so take them with a grain of salt.

The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
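The difference between the generic XPConnect-style dispatch and the "quick stubs" described in the list above can be sketched with a toy model. This is only an illustration written in plain JavaScript: `nativeDocument`, `interfaceInfo`, `genericCall` and `makeQuickStub` are hypothetical names invented for the sketch, not SpiderMonkey or XPConnect APIs, and the object standing in for the C++ side is an ordinary JS object.

```javascript
// Toy model only: none of these names are real SpiderMonkey or XPConnect APIs.

// Stand-in for the "native" (C++) side: a document with an id table.
const nativeDocument = {
  ids: new Map([["foo", { tagName: "DIV", id: "foo" }]]),
  getElementById(id) {
    const el = this.ids.get(id);
    return el === undefined ? null : el;
  },
};

// Interface description, playing the role of a compiled IDL entry.
const interfaceInfo = {
  getElementById: { params: ["string"] },
};

// Generic path: one function serves every method and works out the
// argument conversions from the interface description on each call.
function genericCall(target, methodName, args) {
  const info = interfaceInfo[methodName];
  if (!info) throw new TypeError(`unknown method: ${methodName}`);
  const converted = info.params.map((type, i) => {
    if (type === "string") return String(args[i]);
    throw new TypeError(`unsupported parameter type: ${type}`);
  });
  return target[methodName](...converted);
}

// Specialized path: a stub produced ahead of time for one "hot" method,
// with the conversion hard-coded instead of looked up on every call.
function makeQuickStub(target, methodName) {
  const info = interfaceInfo[methodName];
  if (info && info.params.length === 1 && info.params[0] === "string") {
    return (arg) => target[methodName](String(arg));
  }
  throw new TypeError(`no stub template for ${methodName}`);
}

const getElementByIdStub = makeQuickStub(nativeDocument, "getElementById");
```

Both `genericCall(nativeDocument, "getElementById", ["foo"])` and `getElementByIdStub("foo")` end up in the same "native" method. The stub simply skips the per-call lookup of the interface description, which is the trade of code size for call time that the quick stubs bullet describes.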

月朦胧 2024-12-18 03:10:02

JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).

As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
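To make the id-table idea concrete, here is a minimal sketch in JavaScript. It is not Gecko's code: `ToyElement`, `ToyDocument` and `getElementByIdSlow` are hypothetical names, and the sketch assumes elements are appended one at a time before they have children of their own.

```javascript
// Toy model: ToyElement and ToyDocument are hypothetical, not Gecko classes.
class ToyElement {
  constructor(tagName, id = null) {
    this.tagName = tagName;
    this.id = id;
    this.children = [];
  }
}

class ToyDocument {
  constructor() {
    this.root = new ToyElement("html");
    this.idTable = new Map(); // id -> first element registered with that id
  }

  // Insert a child and record its id so later lookups can skip the tree.
  // Assumes the child has no descendants yet (enough for this sketch).
  appendChild(parent, child) {
    parent.children.push(child);
    if (child.id !== null && !this.idTable.has(child.id)) {
      this.idTable.set(child.id, child);
    }
    return child;
  }

  // Lookup through the table: no tree traversal is needed.
  getElementById(id) {
    const el = this.idTable.get(id);
    return el === undefined ? null : el;
  }

  // The unoptimized alternative: depth-first walk over every node.
  getElementByIdSlow(id, node = this.root) {
    if (node.id === id) return node;
    for (const child of node.children) {
      const found = this.getElementByIdSlow(id, child);
      if (found !== null) return found;
    }
    return null;
  }
}

const doc = new ToyDocument();
const body = doc.appendChild(doc.root, new ToyElement("body"));
const foo = doc.appendChild(body, new ToyElement("div", "foo"));
```

`doc.getElementById("foo")` and `doc.getElementByIdSlow("foo")` return the same element, but only the second one has to visit nodes. A real implementation also has to keep the table in sync when ids change or nodes are removed, which this sketch leaves out.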

︶ ̄淡然 2024-12-18 03:10:02

In pretty much all major browsers the DOM is implemented as a language-independent library, which means it lives in a different library from the JavaScript engine. For example, in IE the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro (JS) and WebCore (DOM), Chrome has V8 (JS) and WebCore (DOM), and Firefox has SpiderMonkey/TraceMonkey (JS) and Gecko (DOM).

What this means is that any time your JS has to access the DOM, it has to reach over to the DOM library, which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is two pieces of land connected by a toll bridge: any time you touch the DOM, you must cross the bridge and cross back, paying a performance toll.
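The toll bridge analogy can be illustrated by counting crossings in a toy model. `BridgedElement` below is a hypothetical stand-in for a DOM node, not a browser API. Its accessors only increment a counter, so the numbers show how often the script touches the "DOM side", not any measured cost.

```javascript
// Toy model: BridgedElement is hypothetical and only counts "crossings".
class BridgedElement {
  constructor() {
    this.crossings = 0;
    this._text = "";
  }
  // Each accessor stands for one call from script into the DOM library.
  get textContent() {
    this.crossings += 1;
    return this._text;
  }
  set textContent(value) {
    this.crossings += 1;
    this._text = String(value);
  }
}

const items = ["a", "b", "c", "d"];

// Touches the element on every iteration: one read and one write each time.
function appendEachTime(el, parts) {
  for (const part of parts) {
    el.textContent = el.textContent + part;
  }
}

// Builds the string on the script side and crosses the bridge once.
function appendOnce(el, parts) {
  el.textContent = parts.join("");
}

const chatty = new BridgedElement();
appendEachTime(chatty, items);

const batched = new BridgedElement();
appendOnce(batched, items);

console.log(chatty.crossings, batched.crossings); // prints "8 1"
```

Both elements end up with the same text. The batched version just does its string work on the script side and touches the element once, which is the usual practical consequence drawn from the analogy.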
