XScale compilers for Linux? (also an XScale compiler flags question)

Posted on 2024-08-18 13:56:44

I am currently using a GCC 3.3.3 based cross compiler to compile for an XScale PXA270 development board. However, I was wondering if there are other XScale compilers out there that run on Linux (or Windows for that matter)? The cross compiler setup I am using has horrendous performance on the target device, with certain programs that do a decent amount of math operations performing 10 to 20 times worse on the XScale processor than on a similarly clocked Pentium 2. Are there any other compiler options out there, or specific compiler flags I should be setting with my GCC-based compiler, that may help with the performance?

Thanks,
Ben

Comments (3)

不喜欢何必死缠烂打 2024-08-25 13:56:44

Unlike the Pentium 2, the XScale architecture doesn't have native floating point instructions. That means floating point math has to be emulated using integer instructions - a 10 to 20 times slowdown sounds about right.

To improve performance, you can try a few things:

  • Where possible, minimise the use of floating point - in some places, you may be able to substitute plain integer or fixed point calculations;
  • Trade memory for speed, by precalculating tables of values where possible;
  • Use floats instead of doubles in calculations where you do not need the precision of the latter (including using the C99 float versions of math.h functions);
  • Minimise conversions between integers and floating point types.
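
As an illustration of the first point, here is a minimal fixed point sketch in C. It assumes a Q16.16 layout (16 integer bits, 16 fractional bits); the type and function names are made up for this example and do not come from any library. It also relies on GCC's documented behaviour of sign-extending when a negative signed value is shifted right, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical Q16.16 fixed point type: the value is stored scaled by 65536. */
typedef int32_t q16_16;

#define Q16_ONE 65536

/* Build a Q16.16 value from a literal such as 1.5. With a literal argument
 * the compiler can fold this at compile time, so no floating point code
 * should run on the target (an assumption worth checking in the disassembly). */
#define Q16_CONST(x) ((q16_16)((x) * 65536.0))

/* Integer to Q16.16. The caller must keep n within -32768..32767. */
static q16_16 q16_from_int(int32_t n)
{
    return (q16_16)(n * Q16_ONE);
}

/* Q16.16 to integer, truncating toward zero. */
static int32_t q16_to_int(q16_16 v)
{
    return v / Q16_ONE;
}

/* Multiply two Q16.16 values. The 64-bit intermediate keeps the full
 * product before it is scaled back down by 16 bits. */
static q16_16 q16_mul(q16_16 a, q16_16 b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (q16_16)(wide >> 16);
}
```

For example, `q16_mul(Q16_CONST(1.5), q16_from_int(4))` yields the Q16.16 encoding of 6 using only integer multiplies and shifts. Whether 16 fractional bits are enough depends on the range and precision your calculations need.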

你是年少的欢喜 2024-08-25 13:56:44

Yes, you don't have an FPU so floating point needs to be done in integer math. However, there are two mechanisms for doing this, and one is 11 times faster than the other.

GCC target arm-linux-gnu normally includes real floating point instructions in the code for ARM's first FPU, the "FPA", now so rare it is nonexistent. These cause illegal instruction traps which are then caught and emulated in the kernel. This is extremely slow due to the context switch.

-msoft-float instead inserts calls to library functions (in libgcc.a). This avoids the switch into kernel space and is 11 times faster than the emulated FPA instructions.

You don't say what floating point model you are using - it may be that you are already building the whole userland with -msoft-float - but it might be worth checking that your object files contain no FPA instructions. You can check with:


objdump -d file | grep '<space><tab>f' | less

where file is any object file, executable or library that your compiler outputs. All FPA instructions start with f, while no other ARM instructions do. Those are actual space and tab characters there, and you might need to say <control-V><tab> to get the tab character past your shell.

If it is using FPA instructions, you need to compile your entire userland using -msoft-float.
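
If you want to measure the difference on your own board rather than take the figure on trust, a rough timing sketch along these lines may help. The function names and the workload are hypothetical, chosen only to keep the compiler from optimising the floating point work away; it is not a rigorous benchmark.

```c
#include <time.h>

/* Hypothetical workload: a single precision multiply-accumulate loop.
 * The volatile accumulator forces every operation to actually happen. */
static float fp_workload(unsigned long iterations)
{
    volatile float acc = 0.0f;
    const float x = 1.0001f;
    unsigned long i;

    for (i = 0; i < iterations; i++) {
        acc = acc * 0.5f + x;   /* settles near 2 * x */
    }
    return acc;
}

/* Run the workload and return the CPU time it took, in seconds. */
static double time_workload(unsigned long iterations)
{
    clock_t start = clock();
    (void)fp_workload(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call `time_workload` with a large iteration count from a small test program and compare the result between your current toolchain and one set up for -msoft-float. Objects built with different floating point models generally cannot be linked together, which is why the comparison needs a matching C library and libgcc for each build.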

The most comprehensive further reading on these issues is http://wiki.debian.org/ArmEabiPort which is primarily concerned with a third alternative: using an arm-linux-gnueabi compiler, a newer alternative ABI that is available from gcc-4.1.1 onwards and which has different characteristics. See the document for further details.

地狱即天堂 2024-08-25 13:56:44

"Other xscale compilers"

Open source: LLVM and pcc, of which LLVM is the most Linux-friendly and functional, and also has a GCC front-end; pcc, a descendant of the venerable Portable C Compiler, seems more BSD-oriented.

Commercial: The Keil compiler (owned by ARM Ltd) seems to produce faster code than GCC, but it will not do much to make up for the lack of an FPU.
