Large substrings ~9000x faster in Firefox than Chrome: why?
The Benchmark: http://jsperf.com/substringing
So, I'm starting my very first HTML5 browser-based client-side project. It will have to parse very, very large text files into, essentially, an array or arrays of objects. I know how I'm going to code it; my primary concern right now is making the parser as fast as I can, and my primary testbed is Chrome. However, while looking at the differences between substring methods (I haven't touched JavaScript in a long, long time), I noticed that this benchmark was incredibly slow in Chrome compared to Firefox. Why?
My first assumption is that it has to do with the way Firefox's JS engine handles string objects: for Firefox this operation is simple pointer manipulation, while Chrome is actually doing hard copies. But I'm not sure why Chrome wouldn't do pointer manipulation or why Firefox would. Does anyone have some insight?
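One rough way to probe the pointer-versus-copy hypothesis is to time substring() for slices of different lengths taken from the same large string. The sketch below is only an illustration: the helper names buildText and timeSubstring, the string size, and the iteration count are all assumptions, not part of the original jsPerf test.

```javascript
// Sketch: does the cost of substring() grow with the length of the slice?
// buildText, timeSubstring and all sizes below are illustrative assumptions.
function buildText(length) {
  // Build a string of the requested length from a repeated pattern.
  const pattern = "0123456789abcdef";
  return pattern.repeat(Math.ceil(length / pattern.length)).slice(0, length);
}

function timeSubstring(text, sliceLength, iterations) {
  let checksum = 0;
  const maxStart = text.length - sliceLength;
  const started = Date.now();
  for (let i = 0; i < iterations; i++) {
    const start = i % (maxStart + 1);
    const piece = text.substring(start, start + sliceLength);
    // Touch the result so the call is less likely to be optimized away.
    checksum += piece.charCodeAt(0);
  }
  return { elapsedMs: Date.now() - started, checksum };
}

const text = buildText(1 << 20); // about one million characters
for (const sliceLength of [16, 1024, 65536]) {
  const { elapsedMs } = timeSubstring(text, sliceLength, 20000);
  console.log(`slice of ${sliceLength} chars: ${elapsedMs} ms`);
}
```

If the times stay roughly flat as the slice grows, the engine is probably handing back a view onto the original data; if they grow with the slice length, it is probably copying. Microbenchmarks like this are noisy, so treat the numbers as a hint rather than proof.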
JSPerf appears to be throwing out my Firefox results, not displaying them on the BrowserScope. For me, I'm getting 9,568,203 ±1.44% Ops/sec on .substr() in FF4.
Edit: So I see an FF3.5 performance result down there that is actually below Chrome, so I decided to test my pointer hypothesis. This brought me to a 2nd revision of my Substrings test, which does 1,092,718 ±1.62% Ops/sec in FF4 versus 1,195 ±3.81% Ops/sec in Chrome. That is down to only about 1000x faster, but still an inexplicable difference in performance.
A postscriptum: No, I'm not concerned one lick about Internet Explorer. I'm concerned about trying to improve my skills and getting to know this language on a deeper level.
Comments (2)
In the case of SpiderMonkey (the JS engine in Firefox), a substring() call just creates a new "dependent string": a string object that stores a pointer to the string it's a substring of, plus the start and end offsets. This is precisely to make substring() fast, and it is an obvious optimization given immutable strings.
As for why V8 does not do that... One possibility is that V8 is trying to save space: in the dependent-string setup, if you hold on to the substring but drop the original string, the original string can't be garbage-collected because the substring is still using part of its string data.
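To make the idea concrete, here is a toy model of a dependent string written in plain JavaScript. It is a sketch only: the class name DependentString and its methods are hypothetical, and real engines implement this in native code with different details.

```javascript
// Toy model of a "dependent string". The class and method names are
// hypothetical; this is not SpiderMonkey's actual data structure.
class DependentString {
  constructor(base, start, end) {
    this.base = base;   // reference to the original string
    this.start = start; // inclusive offset into base
    this.end = end;     // exclusive offset into base
  }

  get length() {
    return this.end - this.start;
  }

  charAt(index) {
    if (index < 0 || index >= this.length) return "";
    return this.base.charAt(this.start + index);
  }

  // Constant time: only the offsets change, no characters are copied.
  substring(from, to = this.length) {
    const clamp = (n) => Math.min(Math.max(n, 0), this.length);
    const a = clamp(from);
    const b = clamp(to);
    return new DependentString(
      this.base,
      this.start + Math.min(a, b),
      this.start + Math.max(a, b)
    );
  }

  // Characters are copied only when a standalone string is requested.
  flatten() {
    return this.base.substring(this.start, this.end);
  }
}

const original = "x".repeat(1000000) + "needle";
const whole = new DependentString(original, 0, original.length);
const tail = whole.substring(1000000);
console.log(tail.length);    // 6
console.log(tail.flatten()); // "needle"
// tail.base still references all 1,000,006 characters, which is the
// garbage-collection cost described above.
```

The model shows both sides of the trade-off: taking a substring costs the same regardless of its length, but a six-character result keeps the full original string reachable.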
In any case, I just looked at the V8 source, and it looks like they just don't do any sort of dependent strings at all; the comments don't explain why they don't, though.
[Update, 12/2013]: A few months after I gave the above answer V8 added support for dependent strings, as Paul Draper points out.
Have you eliminated the reading of .length from your benchmark results?
I believe V8 has a few representations of a string:
Number 4 is what makes string += efficient.
I'm just guessing, but if they're trying to pack two string pointers and a length into a small space, they may not be able to cache large lengths with the pointers, and so may end up walking the joined linked list in order to compute the length. This assumes, of course, that Array.prototype.join creates strings of form (4) from the array parts.
It does lead to a testable hypothesis which would explain the discrepancy even absent buffer copies.
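The guess above can be illustrated with a toy model of a concatenation node that holds two pointers and no cached length. This is purely a sketch of the hypothesis: ConsString, concat and lengthByWalking are hypothetical names, and nothing here is taken from V8's actual layout.

```javascript
// Toy model of a concatenation ("cons") node. Names are hypothetical and
// this is not V8's real layout; it only illustrates the hypothesis above.
class ConsString {
  constructor(left, right) {
    this.left = left;   // string or ConsString
    this.right = right; // string or ConsString
  }
}

// Appending is O(1): allocate one node, copy nothing.
function concat(left, right) {
  return new ConsString(left, right);
}

// Without a cached length, the length has to be recovered by visiting
// every node. An explicit stack avoids deep recursion on long += chains.
function lengthByWalking(node) {
  let total = 0;
  let visited = 0;
  const stack = [node];
  while (stack.length > 0) {
    const current = stack.pop();
    visited++;
    if (typeof current === "string") {
      total += current.length;
    } else {
      stack.push(current.right, current.left);
    }
  }
  return { total, visited };
}

let joined = "";
for (let i = 0; i < 1000; i++) {
  joined = concat(joined, "ab");
}
const result = lengthByWalking(joined);
console.log(result.total);   // 2000
console.log(result.visited); // 2001
```

In this model a single length read touches every node of the chain, so a benchmark that reads .length on such a string would be measuring the walk rather than the substring call itself.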
EDIT:
I looked through the V8 source code, and StringBuilderConcat is where I would start pulling, especially runtime.cc.