Legacy gcc compiler issue

Posted 2024-07-17 23:52:20

We are using a legacy compiler, based on gcc 2.6.0, to cross-compile for an old embedded processor that we still use (yes, it has been in use since 1994!). The engineer who did the gcc port for this chip has long since moved on. Although we might be able to recover the gcc 2.6.0 source from somewhere on the web, the change set for this chip has disappeared in the halls of corporate history. We muddled along until recently because the compiler still ran and produced workable executables, but as of Linux kernel 2.6.25 (and also 2.6.26) it fails with the message gcc: virtual memory exhausted... even when run with no parameters or with only -v. I have rebooted my development system (from 2.6.26) using the 2.6.24 kernel and the compiler works again (rebooting with 2.6.25 does not).

We have one system that we are keeping at 2.6.24 just for doing builds for this chip, but we feel a bit exposed in case the Linux world moves on to the point that we can no longer rebuild a system that will run the compiler (i.e. our 2.6.24 system dies and we cannot get 2.6.24 to install and run on a new system because some of the software parts are no longer available).

Does anyone have any ideas for what we might be able to do to a more modern installation to get this legacy compiler to run?

Edit:

To answer some of the comments...

Sadly, it is the source code changes specific to our chip that are lost. This loss occurred over two major company reorgs and several sysadmins (a couple of whom really left a mess). We now use configuration control, but that is closing the barn door too late for this problem.

The use of a VM is a good idea, and may be what we end up doing. Thank you for that idea.

Finally, I tried strace as ephemient suggested and found that the last system call was brk(), which returned an error on the new system (2.6.26 kernel) and returned success on the old system (2.6.24 kernel). This would indicate that I really am running out of virtual memory, except that tcsh "limit" returns the same values on the old and new systems, and /proc/meminfo shows the new system has slightly more memory and quite a bit more swap space. Maybe it is a problem of fragmentation or of where the program is being loaded?
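
To spot the failing call in a long trace, a small filter can help. The sketch below is Python and is not part of the original investigation. It assumes strace prints the raw system call in the usual `brk(ADDR) = RESULT` form, and it relies on the Linux behaviour that the raw brk call returns the new break on success and the unchanged current break on failure. The function names are made up for illustration.

```python
import re

# Matches strace lines such as "brk(0x80556d4) = 0x8056000" or "brk(NULL) = 0x8056000".
BRK_LINE = re.compile(r"brk\((NULL|0|0x[0-9a-fA-F]+)\)\s*=\s*(0x[0-9a-fA-F]+|-?\d+)")


def parse_brk(line):
    """Return (requested, returned) for a brk line, or None for any other line."""
    match = BRK_LINE.search(line)
    if match is None:
        return None
    arg, ret = match.groups()
    requested = 0 if arg == "NULL" else int(arg, 0)
    returned = int(ret, 0)
    return requested, returned


def brk_failed(requested, returned):
    """A brk(0) call only queries the break; any other request fails if the
    kernel hands back something other than the requested address."""
    return requested != 0 and returned != requested


if __name__ == "__main__":
    # Hypothetical trace excerpt; a real one would come from `strace legacy-gcc`.
    trace = [
        "brk(0) = 0x8056000",
        "brk(0x80556d4) = 0x8056000",
    ]
    for line in trace:
        parsed = parse_brk(line)
        if parsed and brk_failed(*parsed):
            print("failed brk:", hex(parsed[0]), "->", hex(parsed[1]))
```

Comparing the flagged lines from the 2.6.24 and 2.6.26 traces side by side shows whether the request or only the kernel's answer changed.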

I did some further research and "brk randomization" was added in kernel 2.6.25, however CONFIG_COMPAT_BRK is supposedly enabled by default (which disables brk randomization).

Edit:

OK, more info: it really looks like brk randomization is the culprit. The legacy gcc calls brk() to change the end of the data segment, and that call now fails, causing the legacy gcc to report "virtual memory exhausted". There are a few documented ways to disable brk randomization:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
sudo sysctl -w kernel.randomize_va_space=0
starting a new shell with setarch i386 -R tcsh (or "-R -L")

I have tried them and they do seem to have an effect: with any of them in place, the brk() return value differs from the value without them and is the same on every run (tried on both kernel 2.6.25 and 2.6.26). But the brk() still fails, so the legacy gcc still fails :-(.

In addition I have set vm.legacy_va_layout=1 and vm.overcommit_memory=2 with no change, and I have rebooted with the vm.legacy_va_layout=1 and kernel.randomize_va_space=0 settings saved in /etc/sysctl.conf. Still no change.

Edit:

Using kernel.randomize_va_space=0 on kernel 2.6.26 (and 2.6.25) results in the following brk() call being reported by strace legacy-gcc:

brk(0x80556d4) = 0x8056000

This indicates that the brk() failed, but it looks like it failed because the data segment already ends beyond what was requested. Using objdump, I can see the data segment should end at 0x805518c whereas the failed brk() indicates that the data segment currently ends at 0x8056000:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  080480d4  080480d4  000000d4  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .hash         000001a0  080480e8  080480e8  000000e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .dynsym       00000410  08048288  08048288  00000288  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .dynstr       0000020e  08048698  08048698  00000698  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rel.bss      00000038  080488a8  080488a8  000008a8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .rel.plt      00000158  080488e0  080488e0  000008e0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .init         00000008  08048a40  08048a40  00000a40  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  7 .plt          000002c0  08048a48  08048a48  00000a48  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  8 .text         000086cc  08048d10  08048d10  00000d10  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  9 .fini         00000008  080513e0  080513e0  000093e0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 10 .rodata       000027d0  080513e8  080513e8  000093e8  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 11 .data         000005d4  08054bb8  08054bb8  0000bbb8  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 12 .ctors        00000008  0805518c  0805518c  0000c18c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 13 .dtors        00000008  08055194  08055194  0000c194  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 14 .got          000000b8  0805519c  0805519c  0000c19c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 15 .dynamic      00000088  08055254  08055254  0000c254  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 16 .bss          000003b8  080552dc  080552dc  0000c2dc  2**3
                  ALLOC
 17 .note         00000064  00000000  00000000  0000c2dc  2**0
                  CONTENTS, READONLY
 18 .comment      00000062  00000000  00000000  0000c340  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
no symbols
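
The addresses in the listing can be cross-checked with a little arithmetic. The Python sketch below uses only values quoted in this post, plus one assumption: 4 KiB pages on i386. It shows that .data ends at 0x805518c (the figure quoted above), that .bss ends at 0x8055694, and that rounding the end of .bss up to a page boundary gives 0x8056000, which is the value the failed brk() returned. The requested address lies between those two, 64 bytes past the end of .bss. One possible reading, and it is only a guess, is that the newer kernel starts the break at the page-aligned address while the old libc5 malloc expects it at the unaligned end of .bss.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on i386


def section_end(vma, size):
    """First address past a section, from its VMA and Size columns."""
    return vma + size


def page_align_up(addr, page=PAGE_SIZE):
    """Round an address up to the next page boundary."""
    return (addr + page - 1) & ~(page - 1)


# Values copied from the objdump listing and the strace line in this post.
data_end = section_end(0x08054BB8, 0x5D4)   # .data
bss_end = section_end(0x080552DC, 0x3B8)    # .bss
requested = 0x80556D4                       # argument of the failing brk()
returned = 0x8056000                        # value the kernel handed back

if __name__ == "__main__":
    print("end of .data      ", hex(data_end))
    print("end of .bss       ", hex(bss_end))
    print("page-aligned .bss ", hex(page_align_up(bss_end)))
    print("request past .bss ", requested - bss_end, "bytes")
    print("request below returned break:", requested < returned)
```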

Edit:

To echo ephemient's comment below: "So strange to treat GCC as a binary without source"!

So, using strace, objdump, gdb and my limited understanding of 386 assembler and architecture, I have traced the problem to the first malloc call in the legacy code. The legacy gcc calls malloc, which returns NULL, which results in the "virtual memory exhausted" message on stderr. This malloc is in libc.so.5, and it calls getenv a bunch of times and ends up calling brk()... I guess to increase the heap... which fails.

From this I can only surmise that the problem is more than brk randomization, or I have not fully disabled brk randomization, despite the randomize_va_space=0 and legacy_va_layout=1 sysctl settings.

寄与心 2024-07-24 23:52:20

Install Linux plus the old gcc in a virtual machine.

夏九 2024-07-24 23:52:20

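Do you have the source code for this custom compiler? If you can recover the 2.6.0 baseline (which should be relatively easy), then diff and patch should recover your change set.

I would then recommend using that change set to build a new version against an up-to-date gcc. And then put it under configuration control.

Sorry, I didn't mean to shout. It's just that I have been saying the same thing for thirty years.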

笑，眼淚并存 2024-07-24 23:52:20

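Can you strace the gcc-2.6.0 executable? It may be doing something like reading /proc/$$/maps and getting confused when the output changes in insignificant ways. A similar problem was recently noticed between 2.6.28 and 2.6.29.

If so, you can hack /usr/src/linux/fs/proc/task_mmu.c or thereabouts to restore the old output, or set up some $LD_PRELOAD to fake gcc into reading another file.

Edit

Since you mentioned brk...

CONFIG_COMPAT_BRK makes the default kernel.randomize_va_space=1 instead of 2, but that still randomizes everything other than the heap (brk).

See if your problem goes away if you echo 0 > /proc/sys/kernel/randomize_va_space or sysctl kernel.randomize_va_space=0 (equivalent).

If so, add kernel.randomize_va_space = 0 to /etc/sysctl.conf or add norandmaps to the kernel command line (equivalent), and be happy again.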
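
To check which of the three settings is active on a given machine, something like the following Python sketch may be useful. The descriptions paraphrase the kernel's sysctl documentation; the function names are invented for this example, and the reader falls back to None on systems where the file does not exist or cannot be read.

```python
from pathlib import Path

SYSCTL_PATH = Path("/proc/sys/kernel/randomize_va_space")

# Paraphrased from the kernel sysctl documentation.
MEANINGS = {
    0: "address space randomization off",
    1: "stack, mmap base and VDSO randomized; heap (brk) not randomized",
    2: "as 1, plus heap (brk) randomization",
}


def describe(value):
    """Human-readable meaning of a randomize_va_space value."""
    return MEANINGS.get(value, "unknown value")


def heap_randomized(value):
    """Only the value 2 randomizes the start of the brk area."""
    return value == 2


def read_current(path=SYSCTL_PATH):
    """Return the current setting, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_current()
    if current is None:
        print("randomize_va_space is not readable on this system")
    else:
        print(current, "=", describe(current))
```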

弃爱 2024-07-24 23:52:20

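I came across this and thought about your problem. Maybe you can find a way to play with the binary to move it to ELF format? It may be irrelevant, but playing with objdump could give you more information.

Can you have a look at the process memory map?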
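
For looking at the memory map, a short Python sketch that picks the heap line out of /proc/PID/maps could look like this. The helper names are invented for illustration, and the sample line in the comment is made up, not taken from the legacy gcc process. It assumes the usual maps layout of address range, permissions, offset, device, inode and an optional name.

```python
from pathlib import Path


def parse_maps_line(line):
    """Split one /proc/PID/maps line into (start, end, perms, name)."""
    fields = line.split(None, 5)
    if len(fields) < 5:
        return None
    start_text, end_text = fields[0].split("-")
    name = fields[5].strip() if len(fields) > 5 else ""
    return int(start_text, 16), int(end_text, 16), fields[1], name


def find_heap(lines):
    """Return the parsed [heap] mapping, or None if the process has none yet."""
    for line in lines:
        parsed = parse_maps_line(line)
        if parsed and parsed[3] == "[heap]":
            return parsed
    return None


if __name__ == "__main__":
    # Made-up example of the format:
    #   08056000-08077000 rw-p 00000000 00:00 0          [heap]
    maps = Path("/proc/self/maps")
    if maps.exists():
        heap = find_heap(maps.read_text().splitlines())
        if heap:
            print("heap:", hex(heap[0]), "-", hex(heap[1]), heap[2])
        else:
            print("no [heap] mapping found")
```

Replacing "self" with the PID of a stopped legacy gcc process (for example one paused under gdb) would show where its heap starts relative to the end of .bss.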

鹿童谣 2024-07-24 23:52:20

So I have worked something out... it is not a complete solution, but it does get past the original problem I had with the legacy gcc.

Putting breakpoints on every libc call in the .plt (procedure linkage table) I see that malloc (in libc.so.5) calls getenv() to get:

    MALLOC_TRIM_THRESHOLD_
    MALLOC_TOP_PAD_
    MALLOC_MMAP_THRESHOLD_
    MALLOC_MMAP_MAX_
    MALLOC_CHECK_

So I web-searched these and found a page that advised

    setenv MALLOC_TOP_PAD_ 536870912

then the legacy gcc WORKS!!!!
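
If the variable should apply only to the compiler rather than to the whole shell session, a build script can set it per process. The Python sketch below is one way to do that. The value is the one quoted above (536870912 bytes, which is 512 MiB); the helper names are invented, and the demo runs a Python one-liner as a stand-in child because the legacy compiler is not available here.

```python
import os
import subprocess
import sys

TOP_PAD_BYTES = 536870912  # value quoted in the post, equal to 512 MiB


def env_with_top_pad(base_env, pad_bytes=TOP_PAD_BYTES):
    """Copy an environment and add MALLOC_TOP_PAD_ for the child only."""
    env = dict(base_env)
    env["MALLOC_TOP_PAD_"] = str(pad_bytes)
    return env


def run_with_top_pad(argv):
    """Run a command with MALLOC_TOP_PAD_ set, leaving our own environment alone."""
    return subprocess.run(
        argv,
        env=env_with_top_pad(os.environ),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in child process; in a real build argv would name the legacy compiler.
    child = [sys.executable, "-c", "import os; print(os.environ['MALLOC_TOP_PAD_'])"]
    print(run_with_top_pad(child).stdout.strip())
```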

But not home free, it got up to the link in the build before failing, so there is something further going on with the legacy nld we have :-( It is reporting:

    Virtual memory exceeded in `new'

In /etc/sysctl.conf I have:

    kernel.randomize_va_space=0
    vm.legacy_va_layout=1

It still works the same if

    kernel.randomize_va_space=1
    vm.legacy_va_layout=0

but not if

    kernel.randomize_va_space=2

There was a suggestion to use "ldd" to see the shared library dependencies: the legacy gcc only needs libc5, but the legacy nld also needs libg++.so.27, libstdc++.so.27 and libm.so.5, and apparently there is a libc5 version of libg++.so.27 (libg++27-altdev ??). And what about libc5-compat?

So, as I said, not yet home free... but getting closer. I'll probably post a new question about the nld problem.

Edit:

I was originally going to refrain from "Accepting" this answer since I still have a problem with the corresponding legacy linker, but in order to get some finality on this question at least, I am rethinking that position.

Thank-you's go out to:

an0nym0usc0ward for the suggestion of using a vm (which may ultimately become the Accepted Answer)
ephemient for suggesting using strace, and help with stackoverflow usage
shodanex for suggesting using objdump

Edit

Below is the last stuff that I learned, and now I will accept the VM solution since I could not fully solve it any other way (at least in the time allotted for this).

The newer kernels have a CONFIG_COMPAT_BRK build flag to allow libc5 to be used, so presumably building a new kernel with this flag will fix the problem (and looking through the kernel src, it looks like it will, but I can't be sure since I did not follow all of the paths). There is also another documented way to allow libc5 use at runtime (rather than at kernel build time): sudo sysctl -w kernel.randomize_va_space=0. This, however, does not do a complete job and some (most?) libc5 apps will still break, e.g. our legacy compiler and linker. This seems to be due to a difference in alignment assumptions between the newer and older kernels. I have patched the linker binary to make it think it has a bigger bss section, in order to bring the end of the bss up to a page boundary, and this works on the newer kernel when the sysctl var kernel.randomize_va_space=0. This is NOT a satisfactory solution to me since I am blindly patching a critical binary executable, and even though running the patched linker on the newer kernel produced a bit-identical output to the original linker run on the older kernel, that does not prove that some other linker input (i.e. we change the program being linked) will also produce identical results.
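
The size needed for that kind of bss patch is simple to compute. The Python sketch below is only an illustration: it assumes 4 KiB pages, the function name is invented, and the numbers it prints use the .bss entry of the legacy gcc listing from the question, because the linker's own section table is not shown here.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on i386


def padded_bss_size(bss_vma, bss_size, page=PAGE_SIZE):
    """Smallest .bss size, not below the current one, whose end is page aligned."""
    end = bss_vma + bss_size
    aligned_end = (end + page - 1) // page * page
    return aligned_end - bss_vma


if __name__ == "__main__":
    # .bss of the legacy gcc binary (VMA 0x080552dc, size 0x3b8), for illustration.
    old_size = 0x3B8
    new_size = padded_bss_size(0x080552DC, old_size)
    print("new .bss size:", hex(new_size))
    print("extra bytes:  ", new_size - old_size)
```

A section whose end is already page aligned keeps its size, so the calculation is safe to repeat on a binary that has already been patched.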
