Why is GHC so large/big?
Is there a simple answer: Why is GHC so big?
- OCaml: 2MB
- Python: 15MB
- SBCL: 9MB
- OpenJRE: 26MB
- GHC: 113MB
Not interested in evangelism of "Why I shouldn't care about the size if Haskell is the right tool"; this is a technical question.
It's a bit silly really. Every library that comes with GHC is provided in no less than 4 flavours:
- static
- dynamic
- profiled
- GHCi
The GHCi version is just the static version linked together in a single .o file. The other three versions all have their own set of interface files (.hi files) too. The profiled versions seem to be about twice the size of the unprofiled versions (which is a bit suspicious, I should look into why that is).
Remember that GHC itself is a library, so you're getting 4 copies of GHC. Not only that, but the GHC binary itself is statically linked, so that's 5 copies of GHC.
We recently made it so that GHCi could use the static .a files. That will allow us to get rid of one of these flavours. Longer term, we should dynamically link GHC, but that's a bigger change because it would entail making dynamic linking the default. Unlike in C, with GHC you have to decide up front whether you're going to link dynamically or not. And we need more changes (e.g. to Cabal and the package system, amongst other things) before this is really practical.
Probably we should compare apples to apples and oranges to oranges. The JRE is a runtime, not a development kit. We may compare: the source size of the development kit, the size of the compiled development kit, and the compiled size of the minimal runtime.
The OpenJDK 7 source bundle is 82 MB (download.java.net/openjdk/jdk7) vs. the GHC 7 source bundle, which is 23 MB (haskell.org/ghc/download_ghc_7_0_1). GHC is not big here. Runtime size: openjdk-6-jre-headless on Ubuntu is 77 MB uncompressed vs. a Haskell helloworld, statically linked with its runtime, which is <1 MB. GHC is not big here.
Where GHC is big is the size of the compiled development kit:
GHC itself takes 270 MB, and with all the libraries and utilities that come with it, it takes over 500 MB. And yes, that's a lot, even counting the base libraries and a build tool/dependency manager. The Java development platform is smaller.
GHC:
against OpenJDK with dependencies:
But it is still more than 100 MB, not 26 MB as you write.
Heavyweight things in ghc6 and ghc6-prof are:
Please note how big libHSghc-6.12.1_p.a is. So the answer seems to be static linking and profiling versions for every library out there.
My guess: lots and lots of static linking. Each library needs to statically link its dependencies, which in turn need to statically link theirs, and so forth. All of this is often compiled both with and without profiling, and even without profiling the binaries aren't stripped and so hold lots of debugger information.
Because it bundles gcc and a bunch of libraries, all statically linked.
At least on Windows.
The short answer is that all executables are statically linked, may have debug info in them, and libraries are included in multiple copies. This has already been said by other commenters.
Dynamic linking is possible and will reduce the size dramatically. Take an example Hello.hs, built with GHC 7.4.2 on Windows:
- ghc --make -O2 gives a Hello.exe of 1105K.
- Running strip on it leaves 630K.
- ghc --make -O2 -dynamic gives 40K.
- Stripping it leaves just 13K.
Its dependencies are 5 DLLs with a total size of 9.2 MB unstripped and 5.7 MB stripped.
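The Hello.hs listing itself does not appear above. A minimal stand-in, assuming an ordinary hello world rather than the poster's exact file, might look like this; the name `greeting` is introduced here purely for illustration.

```haskell
-- Hello.hs: a hypothetical stand-in for the example file,
-- not the poster's original listing.
module Main (main) where

-- | The text to print (name chosen here for illustration only).
greeting :: String
greeting = "Hello, World!"

main :: IO ()
main = putStrLn greeting
```

Building it once with `ghc --make -O2 Hello.hs` and once with `-dynamic` added should let you compare the static and dynamic executables, though the exact sizes will depend on the GHC version and platform.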
Here's the directory size breakdown on my box:
https://spreadsheets.google.com/ccc?key=0AveoXImmNnZ6dDlQeHY2MmxPcEYzYkpweEtDSS1fUlE&hl=en
It looks like the largest directory (123 MB) is the binaries for compiling the compiler itself. The documents weigh in at an astounding 65 MB. Third place is Cabal at 41 MB.
The bin directory is 33 MB, and I think that only a subset of that is what's technically required to build Haskell applications.
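To produce a similar breakdown for your own installation, something along these lines may help. It is only a sketch, assuming the `directory` and `filepath` packages that ship with GHC; the helper names (`treeSize`, `largestFirst`, `toMB`) are made up for this example, and the installation path is passed in as an argument because it differs per system.

```haskell
module Main (main) where

import Control.Monad (forM, forM_)
import Data.List (sortOn)
import Data.Ord (Down (..))
import System.Directory
  ( doesDirectoryExist, getFileSize, listDirectory, pathIsSymbolicLink )
import System.Environment (getArgs)
import System.FilePath ((</>))
import Text.Printf (printf)

-- | Total size in bytes of everything below a path.
-- Symbolic links are skipped so nothing is counted twice.
treeSize :: FilePath -> IO Integer
treeSize path = do
  isLink <- pathIsSymbolicLink path
  isDir  <- doesDirectoryExist path
  if isLink
    then pure 0
    else if isDir
      then do
        entries <- listDirectory path
        sum <$> mapM (treeSize . (path </>)) entries
      else getFileSize path

-- | Sort (name, size) pairs so the largest entry comes first.
largestFirst :: [(FilePath, Integer)] -> [(FilePath, Integer)]
largestFirst = sortOn (Down . snd)

-- | Convert a byte count to mebibytes.
toMB :: Integer -> Double
toMB bytes = fromIntegral bytes / (1024 * 1024)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [root] -> do
      entries <- listDirectory root
      sizes <- forM entries $ \entry -> do
        size <- treeSize (root </> entry)
        pure (entry, size)
      -- One line per top-level entry, biggest first.
      forM_ (largestFirst sizes) $ \(entry, size) ->
        printf "%8.1f MB  %s\n" (toMB size) entry :: IO ()
    _ -> putStrLn "usage: DirSizes <path-to-ghc-installation>"
```

Run it against the root of the GHC installation; each top-level directory is listed with its total size, which should make it easy to see whether libraries, documentation, or binaries dominate on your machine.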