How do I run ARM code from external memory?

Posted on 2024-08-28 23:39:22

I am using an LPC2132 ARM chip to develop a program. However, my program has grown larger than the space on the chip.

How can I connect my chip to some sort of external memory chip to hold additional executable code? Is this possible? If not, what do people normally do when they run out of chip space?

Comments (6)

心房敞 2024-09-04 23:39:22

Mark's answer is a good one. One question -- are you running short of RAM, or flash, or both? The solutions / answers might depend...

A couple of years ago, I found myself in a similar situation: running out of room (flash & RAM) on the LPC2148. Of the pin-compatible parts, this one had the largest flash & the largest RAM, so it was an unfortunate case of "make do with what you have". And as Mark said, the wrong chip was chosen (well, actually, the requirements & functionality grew beyond what the chip was originally supposed to do... I'm sure no one else has ever experienced that ;-) ).

Anyway, I found myself in a "battle of bytes". Here are the things I remember doing (mind you, a lot of this code I inherited from the customer...)

  • [+RAM, -ROM] make anything const that can be (constant data can then stay in flash instead of taking up RAM)
  • [+ROM] use Thumb where possible (see Mark's comments)
  • [+ROM] use look-up tables where possible
  • [+ROM] re-factor & combine common functionality (esp. convert heavily-used function-like macros into subroutines)
  • [+ROM] anything that's a function called from one place - put it directly in-line instead of in a function
  • [+ROM, +RAM] change all floating point usage to fixed-point
  • [+ROM, +RAM] eliminate unused variables + constants (use lint & linker map to find/eliminate/verify)
  • [+ROM] try replacing switch w/ if/else, and vice-versa
  • [+ROM] make sure your linker is configured to eliminate "dead" (unused) code
  • [+ROM] re-work strings + constants so that identical "things" are defined in only one place
  • [+ROM] (gasp, sigh) replace data hiding functions w/ macros (or inline if you can) -- beware of preemption, race conditions, mutual exclusion, etc...
  • [+ROM, +RAM] eliminate all debugging / temp code: usually there are I/O pin toggles, prints, etc. that aren't conditionally compiled out
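
Two of the items above (const look-up tables and fixed-point instead of floating point) can be sketched together. This is a minimal illustration, not code from the project described: the Q16.16 format, the names `q16_16`, `q16_mul`, `apply_gain` and the contents of `gain_table` are all assumptions made up for the example. On a typical ARM toolchain a `static const` table like this is placed in flash, and avoiding `float` keeps the soft-float library routines out of the image.

```c
#include <assert.h>
#include <stdint.h>

/* Q16.16 fixed point: 16 integer bits, 16 fractional bits (assumed format). */
typedef int32_t q16_16;

#define Q16_ONE ((q16_16)1 << 16)

/* Convert a small integer to Q16.16. Only valid for |x| < 32768. */
static inline q16_16 q16_from_int(int32_t x)
{
    return (q16_16)(x * Q16_ONE);
}

/* Multiply two Q16.16 values. The 64-bit intermediate avoids overflow,
 * and dividing (rather than shifting) keeps negative values well defined. */
static inline q16_16 q16_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * b) / Q16_ONE);
}

/* Hypothetical gain table: 0.5, 1.0, 1.5, 2.0. Being const, it can live
 * in flash rather than being copied into RAM at start-up. */
static const q16_16 gain_table[] = {
    Q16_ONE / 2, Q16_ONE, Q16_ONE + Q16_ONE / 2, 2 * Q16_ONE
};

#define GAIN_COUNT (sizeof gain_table / sizeof gain_table[0])

/* Scale a sample by a table entry; an out-of-range index leaves it unchanged. */
q16_16 apply_gain(q16_16 sample, unsigned idx)
{
    if (idx >= GAIN_COUNT) {
        return sample;
    }
    return q16_mul(sample, gain_table[idx]);
}

/* Usage example: 10 * 1.5 == 15, with no floating point involved. */
void fixed_point_demo(void)
{
    assert(apply_gain(q16_from_int(10), 2) == q16_from_int(15));
}
```

Whether this actually saves space depends on how much floating-point code the rest of the program still pulls in, so check the linker map before and after.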

Man, there are a bunch more, but I have to run to a meeting. All I remember is that it was incremental progress, tens & hundreds of bytes at a time, that ended up yielding some pretty significant savings. I ended up recovering about 20% of flash & RAM, and that was enough to complete the project. It took me maybe ~2 weeks to clean this stuff up, but the cost savings were well worth it.

I'll try to come back & post more tactics, I just can't right now. For the record, I've been in situations where I had to load/swap code in & out of RAM at run-time from serial flash as needed (algorithms, tables, etc..) and it was awful. First, try to tighten your current code as much as possible. It's also a somewhat intellectual exercise and it forces you to get under the hood & understand what the hell your compiler is really doing.

Last point: write good tight code throughout the project, but do this kind of optimization at the end, when it's necessary and a business case justifies it.

南七夏 2024-09-04 23:39:22

Looking at the datasheet for that part available here:

http://www.keil.com/dd/docs/datashts/philips/lpc2131_32_34_36_38.pdf

It doesn't appear to have an interface for memory-mapped external flash or SDRAM, nor does it have an MMU.

It does have SPI ports, which could be used to interface to SD cards, EEPROM, or serial flash for off-chip storage, but these would not be memory mapped. You would have to handle moving code segments in and out yourself, and given the very limited RAM on that chip, that would be difficult.

It may be "enough" to move data into the external storage and keep only code in the on-chip ROM. This would simplify your challenge at the expense of increased latency when accessing data. You can also look at using the Thumb instruction set, which reduces code size at the expense of some speed, and at having the compiler optimize for code density instead of speed.
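
As a rough sketch of the "data in external storage" idea, the code below reads bytes through a single-block RAM cache so that neighbouring accesses do not each pay the serial-flash latency. Everything here is an assumption for illustration: `ext_flash_read` is a hypothetical driver call (backed by a plain array so the sketch runs on a host), and the block and device sizes are arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXT_FLASH_SIZE 1024u  /* assumed device size for the sketch */
#define BLOCK_SIZE     64u    /* RAM spent on the cache */

/* Stand-in for the external serial flash so the sketch runs on a host. */
static uint8_t fake_flash[EXT_FLASH_SIZE];
static unsigned flash_reads;  /* counts slow block transfers */

/* Hypothetical driver call; a real one would clock the bytes in over SPI. */
static void ext_flash_read(uint32_t addr, uint8_t *dst, uint32_t len)
{
    memcpy(dst, &fake_flash[addr], len);
    flash_reads++;
}

static uint8_t  cache[BLOCK_SIZE];
static uint32_t cached_block = UINT32_MAX;  /* no block loaded yet */

/* Read one byte of externally stored data through the block cache. */
uint8_t ext_data_get(uint32_t addr)
{
    if (addr >= EXT_FLASH_SIZE) {
        return 0xFF;  /* arbitrary choice: mimic erased flash */
    }
    uint32_t block = addr / BLOCK_SIZE;
    if (block != cached_block) {
        ext_flash_read(block * BLOCK_SIZE, cache, BLOCK_SIZE);
        cached_block = block;
    }
    return cache[addr % BLOCK_SIZE];
}

/* Usage example: two reads from the same block cost one transfer. */
void ext_data_demo(void)
{
    unsigned before = flash_reads;
    (void)ext_data_get(0);
    (void)ext_data_get(1);
    assert(flash_reads - before <= 1);
}
```

The trade-off is the one described above: `BLOCK_SIZE` bytes of scarce RAM buy fewer slow transfers, and access patterns that jump between blocks will still be slow.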

"If not, what do people normally do when they run out of chip space?"

The unfortunate answer here is you chose the wrong chip for your application and/or need to rethink how your application is architected to make it fit in this chip.

EDIT:

It also looks like there are some almost pin-compatible parts with more resources. The LPC2138 has 512kB of flash and 32kB of RAM (compared with 64kB/16kB on your part). There are also a couple of sizes available in between the two.

From a quick glance at the pinouts, the only difference looked like a second on-board ADC that is multiplexed with some of the other pins. Obviously look into this fully, but it looks like you could just swap in one of the higher-end parts without modifying the rest of the board.

初见 2024-09-04 23:39:22

If you have to connect external memory (meaning hardware changes are necessary), why not use a chip with more memory instead? In fact, some chips will be fully pin compatible and have more flash, so you avoid a redesign (only the chip is replaced).

只怪假的太真实 2024-09-04 23:39:22

"If not, what do people normally do when they run out of chip space?"

The first thing they'd do is optimise their application. I am not talking about running the compiler optimiser (although that may be part of the solution), but about applying techniques such as those Dan has suggested. Look at the space efficiency of your data structures and algorithms. There is often a trade-off between space and execution speed; you may not need the fastest possible algorithm, but you do need to save space.
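
One common instance of that space/speed trade-off is a checksum. The sketch below computes the standard CRC-32 (reflected polynomial 0xEDB88320) one bit at a time, which is slower than the usual table-driven version but needs no 1 KB table in flash. It is only an illustration of the trade-off, under the assumption that CRC speed is not critical in the application; the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32: a few dozen bytes of code instead of a
 * 256-entry (1 KB) look-up table, at the cost of 8 iterations per byte. */
uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Usage example: the standard CRC-32 check value for "123456789". */
void crc32_demo(void)
{
    static const uint8_t msg[] = "123456789";
    assert(crc32_bitwise(msg, 9) == 0xCBF43926u);
}
```

Note that the trade-off can go either way: a table saves space when it replaces bulky computation code, and costs space when the computation is small, so measure with the linker map rather than guessing.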

You need to know your target and whether it is feasible in the first instance. By how much does your application exceed the available space, and how large is it currently? The linker map or build log should tell you this. If you have not addressed optimisation yet, I have seldom seen an application that could not have at least 5% knocked off relatively painlessly, and more with concerted effort even before using the optimiser.

The linker map will also tell you the amount of memory used by each function/module, so you can target your optimisation where it will have the greatest effect. You may also be surprised from the map file at what library code has become linked, and you could ask yourself why and whether it could be eliminated.

Using compiler optimisation limits the ability to use a debugger easily, but you do not need to optimise every module. So if you need to debug but also use compiler optimisation, optimise all modules except the ones you are debugging at any particular time.

Be aware however that code that appears to work but is flawed or uses undefined language behaviour may change its behaviour (i.e. fail) following compiler optimisation; leaving you with code that fails, but cannot be debugged. The best strategy to help avoid this situation is to build the code with the maximum warning level your compiler allows (and set warnings to errors), and eliminate all warnings. If possible use a static analysis tool such as Lint.
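
To make the undefined-behaviour point concrete, here is a small sketch using signed overflow as the example (my choice of example, not one named above). A check written as `a + b < a` relies on wrap-around that the C standard does not define for `int`, so an optimiser is allowed to remove it. The version below tests against the limits before adding, so it behaves the same at any optimisation level. The function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would overflow int. No overflowing arithmetic is
 * ever performed, so there is nothing for the optimiser to "assume away". */
bool add_would_overflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b;
    }
    if (b < 0) {
        return a < INT_MIN - b;
    }
    return false;
}

/* Adds a and b into *out; returns false (leaving *out untouched) on overflow. */
bool checked_add(int a, int b, int *out)
{
    if (add_would_overflow(a, b)) {
        return false;
    }
    *out = a + b;
    return true;
}

/* Usage example. */
void checked_add_demo(void)
{
    int sum = 0;
    assert(checked_add(2, 3, &sum) && sum == 5);
    assert(!checked_add(INT_MAX, 1, &sum));
}
```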

If you have not already done it, the quickest and most drastic saving in your case would likely be to compile to the Thumb rather than ARM instruction set.

Finally, when all else fails: your part is a member of the LPC2131/32/34/36/38 family of devices, the 'largest' of which has 512K Flash/32K RAM, so you could change to a different part in the same family and largely retain software compatibility. Check the datasheet if you also need pin compatibility.

习惯成性 2024-09-04 23:39:22

Go for a TI OMAP processor. All of these run code from DDR3 (or DDR2) memory, and some models can operate at 1GHz. The only drawback is that these types of processors are all BGA, and PCB layout for DDR2/3 memory is not simple or easy to get right the first time.

逆流 2024-09-04 23:39:22

You're going to have to develop some sort of hot-swappable module code and connect in some sort of memory chip for the external modules.
