Code execution in embedded systems

Posted 2024-08-03 10:12:36

I work in the embedded systems domain. I would like to know how code gets executed on a microcontroller (not any specific uC, just in general), starting from a C file. I would also like to know about things like startup code, object files, etc. I couldn't find any online documentation on these topics. If possible, please provide links that explain these things from scratch. Thanks in advance for your help.

Comments (7)

夏天碎花小短裙 2024-08-10 10:12:37

Generally, you're working at a much lower level than on general-purpose computers.

Each CPU will have certain behaviour on power up, such as clearing all registers and setting the program counter to 0xf000 (everything here is non-specific, as is your question).

The trick is to ensure your code is at the right place.

The compilation process is usually similar to general purpose computers in that you translate C into machine code (object files). From there, you need to link that code with:

  • your system start-up code, often in assembler.
  • any runtime libraries (including required bits of the C RTL).

System start-up code generally just initialises the hardware and sets up the environment so that your C code can work. Runtime libraries in embedded systems often make the big bulky stuff (like floating point support or printf) optional so as to keep down code bloat.

The linker in embedded systems is also usually a lot simpler, outputting fixed-location code rather than relocatable binaries. You use it to ensure the start-up code goes at (e.g.) 0xf000.
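To make "fixed-location" concrete, here is a toy sketch of what a locating linker does with its sections: it hands each one an absolute address, starting from a chosen base. The struct, the function name and the 64 KiB address space are all assumptions made up for this illustration; a real linker takes this information from its own script or command file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one output section; names are illustrative. */
struct section {
    const char *name;
    uint16_t size;     /* bytes occupied by the section */
    uint16_t address;  /* filled in by place_sections() */
};

/* Assign consecutive absolute addresses starting at `base`.
 * Returns the first free address after the last section, or -1 if the
 * sections would run past the end of a 64 KiB address space. */
long place_sections(struct section *s, size_t count, uint16_t base)
{
    assert(s != NULL || count == 0);
    unsigned long next = base;
    for (size_t i = 0; i < count; i++) {
        if (next + s[i].size > 0x10000UL)
            return -1;                    /* image does not fit */
        s[i].address = (uint16_t)next;    /* this section is now pinned here */
        next += s[i].size;
    }
    return (long)next;
}
```

Calling it with the start-up section first and a base of 0xF000 pins the start-up code at exactly the address the CPU begins executing from, which is the whole point of the exercise.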

In embedded systems, you generally want the executable code to be there from the start so you may burn it into EPROM (or EEPROM or Flash or other device that maintains contents on power-down).

Of course, keep in mind my last foray was with 8051 and 68302 processors. It may be that 'embedded' systems nowadays are full-blown Linux boxes with all sorts of wonderful hardware, in which case there'd be no real difference between general purpose and embedded.

But I doubt it. There's still a need for seriously low-spec hardware that needs custom operating systems and/or application code.

SPJ Embedded Technologies has a downloadable evaluation version of their 8051 development environment that looks to be what you want. You can create programs up to 2K in size, but it seems to go through the entire process (compiling, linking, generation of HEX or BIN files for dumping on to target hardware, even a simulator which gives access to the on-chip stuff and external devices).
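For a feel of what a HEX file contains, here is a minimal sketch that formats one record, assuming the Intel HEX format (a common choice for 8051 tools, though I can't say which format that particular product emits). Each record is a colon, a byte count, a 16-bit address, a record type, the data bytes and a checksum. The function name `ihex_record` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one Intel HEX record into `out` (NUL-terminated, no newline).
 * type 0x00 = data, 0x01 = end of file.
 * Returns the number of characters written, or -1 if `out` is too small
 * or `len` exceeds the 255-byte limit of a single record. */
int ihex_record(char *out, size_t out_size, uint16_t addr, uint8_t type,
                const uint8_t *data, size_t len)
{
    /* ':' + count(2) + address(4) + type(2) + data(2*len) + checksum(2) + NUL */
    size_t needed = 1 + 2 + 4 + 2 + 2 * len + 2 + 1;
    if (len > 255 || out_size < needed)
        return -1;
    assert(out != NULL && (data != NULL || len == 0));

    uint8_t sum = (uint8_t)(len + (addr >> 8) + (addr & 0xFF) + type);
    int pos = snprintf(out, out_size, ":%02X%04X%02X",
                       (unsigned)len, (unsigned)addr, (unsigned)type);
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                        (unsigned)data[i]);
    }
    /* Checksum: two's complement of the low byte of the sum of all bytes. */
    uint8_t checksum = (uint8_t)(0x100u - sum);
    pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                    (unsigned)checksum);
    return pos;
}
```

Passing the three bytes 02 F0 10 with address 0xF000 and type 0x00 produces the record `:03F0000002F0100B`; a file ends with the end-of-file record `:00000001FF`. A programmer or simulator reads these records and places each run of data bytes at the stated address.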

The non-evaluation product costs 200 Euro but, if all you want is a bit of a play, I'd just download the evaluation - other than the 2K limit, it's the full product.

时光是把杀猪刀 2024-08-10 10:12:37

I get the impression you're most interested in what sybreon calls "step 2." Lots can happen there, and it varies greatly by platform. Usually, this stuff is handled by some combination of bootloader, board-support package, C Runtime (CRT), and if you've got one, the OS.

Typically, after the reset vector, some sort of bootloader will execute from flash. This bootloader might just set up hardware and jump into your app's CRT, also in flash. In this case, the CRT would probably clear the .bss, copy the .data to RAM, etc. In other systems, the bootloader can scatter-load the app from a coded file, like an ELF, and the CRT just sets up other runtime stuff (heap, etc.). All of this happens before the CRT calls the app's main().
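The "clear the .bss, copy the .data" step can be sketched in plain C. This is a host-side simulation under stated assumptions: the arrays stand in for flash and RAM, and the names (`flash_data_image`, `ram_data`, `ram_bss`, `crt_start`, `app_main`) are invented here. On a real target the region boundaries would normally be symbols supplied by the linker, and this code usually lives in the toolchain's startup file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for memory regions (hypothetical names and sizes). */
static const uint8_t flash_data_image[4] = {0x11, 0x22, 0x33, 0x44}; /* initial values of .data, kept in flash */
static uint8_t ram_data[4];  /* run-time home of .data */
static uint8_t ram_bss[8];   /* run-time home of .bss  */

/* Hypothetical application entry point, playing the role of main(). */
int app_main(void)
{
    /* By the time this runs, .data holds its initial values and .bss is zero. */
    return ram_data[0] + ram_bss[0];
}

/* What a minimal C runtime does between the reset vector and main(). */
int crt_start(void)
{
    memcpy(ram_data, flash_data_image, sizeof ram_data); /* copy .data from flash to RAM */
    memset(ram_bss, 0, sizeof ram_bss);                  /* clear .bss */
    return app_main();                                   /* hand over to the application */
}
```

The reason this has to happen at all is that RAM contents are undefined at power-up, while C promises that initialised globals have their values and uninitialised ones are zero before main() runs.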

If your app is statically linked, linker directives will specify the addresses where .data/.bss and stack are initialized. These values are either linked into the CRT or coded into the ELF. In a dynamically-linked environment, app loading is usually handled by an OS which relocates the ELF to run in whatever memory the OS designates.

Also, some targets run apps from flash, but others will copy the executable .text from flash to RAM. (This is usually a speed/footprint tradeoff, since RAM is faster/wider than flash on most targets.)

天煞孤星 2024-08-10 10:12:37

Ok, I'll give this a shot...

First off, architectures: Von Neumann vs. Harvard. A Harvard architecture has separate memories for code and data; a Von Neumann architecture does not. Harvard is used in many microcontrollers and it is what I'm familiar with.

So, starting with your basic Harvard architecture, you have program memory. When the microcontroller first starts up, it executes the instruction at memory location zero. Usually this is a JUMP command to the address where the main code starts.

Now, when I say instructions I mean opcodes. Opcodes are instructions encoded into binary data - usually 8 or 16 bits. In some architectures each opcode is hardcoded to mean a specific thing; in others each bit can be significant (i.e., bit 1 means check carry, bit 2 means check the zero flag, etc.). So there are opcodes and then parameters for the opcodes. A JUMP instruction is an opcode plus an 8, 16 or 32 bit memory address which the code 'jumps' to, i.e., control is transferred to the instruction at that address. It accomplishes this by manipulating a special register (the program counter) that contains the address of the next instruction to be executed. So to JUMP to memory location 0x0050 it would replace the contents of that register with 0x0050. On the next cycle the processor would read the register, locate the memory address and execute the instruction there.
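Here is a small sketch of that fetch-and-execute cycle, written as a C simulation. The three opcodes and their encodings are invented for this example and match no real instruction set; the point is only to show that a JUMP is nothing more than overwriting the program counter.

```c
#include <assert.h>
#include <stdint.h>

/* Invented opcodes for illustration; they match no real instruction set. */
enum { OP_HALT = 0x00, OP_INC = 0x01, OP_JUMP = 0x02 };

struct cpu {
    uint8_t pc;   /* program counter: address of the next instruction */
    uint8_t acc;  /* accumulator */
};

/* Fetch and execute instructions until HALT (or an unknown opcode) is hit,
 * or until `max_steps` instructions have run.
 * Returns the number of instructions executed. */
int run(struct cpu *c, const uint8_t program[256], int max_steps)
{
    assert(c != 0 && program != 0);
    int steps = 0;
    while (steps < max_steps) {
        uint8_t opcode = program[c->pc++];  /* fetch, then advance the PC */
        steps++;
        switch (opcode) {
        case OP_INC:
            c->acc++;
            break;
        case OP_JUMP:
            c->pc = program[c->pc];         /* the operand replaces the PC contents */
            break;
        case OP_HALT:
        default:
            return steps;
        }
    }
    return steps;
}
```

If program memory holds `OP_JUMP, 0x50` at address 0 and the real code at 0x50, the simulated core starts at zero, loads 0x50 into the PC and carries on from there, just like the reset-time JUMP described above.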

Executing instructions causes changes in the state of the machine. There is a general status register that records information about what the last command did (i.e., if it was an addition that produced a carry out, there's a bit for that, etc.). There is an 'accumulator' register where the result of the instruction is placed. The parameters for instructions can either go in one of several general purpose registers, or the accumulator, or in memory addresses (data OR program). Different opcodes can only be executed on data in certain places. For instance, you might be able to ADD data from two general purpose registers and have the result show up in the accumulator, but you can't take data from two data memory locations and have the result show up in another data memory location. You'd have to move the data you want to the general purpose registers, do the addition, then move the result to the memory location you want. That's why assembly is considered difficult. There are as many status registers as the architecture is designed for. More complex architectures may have more to allow more complex commands. Simpler ones may not.
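The "move to registers, add, move the result back" dance looks roughly like this when simulated in C. The register count, flag bit positions and function names are assumptions picked for the sketch, not taken from any particular chip.

```c
#include <assert.h>
#include <stdint.h>

enum { FLAG_CARRY = 0x01, FLAG_ZERO = 0x02 };  /* invented bit positions */

struct machine {
    uint8_t reg[4];    /* general purpose registers */
    uint8_t acc;       /* accumulator */
    uint8_t status;    /* status register */
    uint8_t data[16];  /* data memory */
};

/* ADD two registers into the accumulator and update the status flags. */
void op_add(struct machine *m, unsigned ra, unsigned rb)
{
    assert(ra < 4 && rb < 4);
    unsigned sum = (unsigned)m->reg[ra] + m->reg[rb];
    m->acc = (uint8_t)sum;                      /* keep the low 8 bits */
    m->status = 0;
    if (sum > 0xFF)  m->status |= FLAG_CARRY;   /* result did not fit in 8 bits */
    if (m->acc == 0) m->status |= FLAG_ZERO;    /* result is zero */
}

/* data[dst] = data[a] + data[b], using only load, ADD and store steps. */
void add_memory(struct machine *m, unsigned dst, unsigned a, unsigned b)
{
    assert(dst < 16 && a < 16 && b < 16);
    m->reg[0] = m->data[a];   /* MOV R0, [a]      */
    m->reg[1] = m->data[b];   /* MOV R1, [b]      */
    op_add(m, 0, 1);          /* ADD R0, R1 -> ACC */
    m->data[dst] = m->acc;    /* MOV [dst], ACC   */
}
```

Adding 200 and 100 this way leaves 44 in the destination with the carry flag set, which is exactly the kind of side information the status register exists to record.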

There is also an area of memory known as the stack. On some microcontrollers (like the 8051) it's just an ordinary area of memory; on others it can have special protections. A register called the stack pointer records the memory location of the 'top' of the stack. When you 'push' something on to the stack from the accumulator, the 'top' address is incremented and the data from the accumulator is stored there. When retrieving, or 'popping', data from the stack, the reverse is done: the data at the top of the stack is put into the accumulator and the stack pointer is decremented. (Whether the pointer moves before or after the access, and whether the stack grows upwards or downwards, depends on the architecture.)
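A push/pop pair can be sketched in a few lines of C. This simulation follows the 8051 convention, where the stack grows upwards and the pointer is incremented before the store; other cores store first or grow downwards, so treat the ordering as one possible choice. The struct and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

struct core {
    uint8_t ram[256];  /* internal RAM; the stack lives in here */
    uint8_t sp;        /* stack pointer: address of the current top item */
    uint8_t acc;       /* accumulator */
};

/* PUSH: move the pointer up, then store the accumulator at the new top. */
void push_acc(struct core *c)
{
    assert(c->sp < 0xFF);      /* a push beyond the end of RAM would wrap around */
    c->sp++;
    c->ram[c->sp] = c->acc;
}

/* POP: load the accumulator from the top, then move the pointer back down. */
void pop_acc(struct core *c)
{
    assert(c->sp > 0x00);      /* popping an empty stack at address 0 would wrap */
    c->acc = c->ram[c->sp];
    c->sp--;
}
```

Values come back in the reverse of the order they were pushed, and after a matching number of pops the stack pointer is back where it started.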

Now, I have also sort of glossed over how instructions are 'executed'. Well, this is where you get down to digital logic - VHDL type of stuff. Multiplexers and decoders and truth tables and such. That's the real nitty-gritty of design - kind of. So if you want to 'move' the contents of a memory location into the accumulator, you have to figure out the addressing logic, clear the accumulator register, OR it with the data at the memory location, etc. It's daunting when placed all together, but if you've done separate parts (like addressing, a half-adder, etc.) in VHDL or in any digital logic fashion you might have an idea what's required.

How does this relate to C? Well, a compiler takes the C statements and turns them into a series of opcodes that perform the requested operations. All of that is basically binary data (usually shown as hex): ones and zeros that get placed at some location in program memory. Compiler/linker directives specify which memory locations are used for which code. The result is written to the flash memory on the chip; when the chip restarts, it goes to code memory location 0x0000, JUMPs to the start address of the code in program memory, and then starts plugging away at opcodes.

南风起 2024-08-10 10:12:37

You can refer to the link https://automotivetechis.wordpress.com/.

The following outlines the sequence of steps leading up to program execution:

1) Allocates primary memory for the program’s execution.

2) Copies address space from secondary to primary memory.

3) Copies the .text and .data sections from the executable into primary memory.

4) Copies program arguments (e.g., command line arguments) onto the stack.

5) Initializes registers: sets the esp (stack pointer) to point to top of stack, clears the rest.

6) Jumps to the start routine, which copies main()'s arguments off the stack and jumps to main().

淡紫姑娘! 2024-08-10 10:12:37

I have experience with AVR microcontrollers, but I think this will be pretty much the same for all of them:

The compilation goes along the same lines as with normal C code. It is compiled into object files and these are linked together, but what ends up on the chip is not some complex format like ELF or PE: the final output is simply placed at a fixed address in the uC's memory without any headers.

The startup code (if the compiler generates any) is added in the same way as startup code for "normal" computers -- there is some code added before your main() code (and maybe after it too).

Another difference is linking -- everything has to be linked statically, because microcontrollers don't have an OS to handle dynamic linking.

攒一口袋星星 2024-08-10 10:12:37

You could take a look at the very detailed GNU ARM Tutorial by Jim Lynch.

韵柒 2024-08-10 10:12:36

Being a microprocessor architect, I have had the opportunity to work with software at a very low level. Basically, low-level embedded programming differs greatly from general PC programming only at the hardware-specific level.

Low-level embedded software can be broken down into the following:

  1. Reset vector - this is usually written in assembly. It is the very first thing that runs at start-up and can be considered hardware-specific code. It will usually perform simple functions like setting up the processor into a pre-defined steady state by configuring registers and such. Then it will jump to the startup code. The most basic reset vector merely jumps directly to the start-up code.
  2. Startup code - this is the first software-specific code that runs. Its job is basically to set up the software environment so that C code can run on top. For example, C code assumes that there is a region of memory defined as stack and heap. These are usually software constructs instead of hardware. Therefore, this piece of start-up code will define the stack pointers and heap pointers and such. This is usually grouped under the 'c-runtime'. For C++ code, constructors are also called. At the end of the routine, it will execute main(). edit: Variables that need to be initialised and also certain parts of memory that need clearing are done here. Basically, everything that is needed to move things into a 'known state'.
  3. Application code - this is your actual C application starting from the main() function. As you can see, a lot of things are actually under the hood and happen even before your first main function is called. This code can usually be written as hardware-agnostic if there is a good hardware abstraction layer available. The application code will definitely make use of a lot of library functions. These libraries are usually statically linked in embedded systems.
  4. Libraries - these are your standard C libraries that provide primitive C functions. There are also processor specific libraries that implement things like software floating-point support. There can also be hardware-specific libraries to access the I/O devices and such for stdin/stdout. A couple of common C libraries are Newlib and uClibc.
  5. Interrupt/Exception handlers - these are routines that run at unpredictable times during normal code execution as a result of changes in hardware or processor state. These routines are also typically written in assembly, as they should run with minimal software overhead in order to service the hardware that triggered them.
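As an illustration of the last item, here is a rough C simulation of a vector table, i.e. a table of handler addresses indexed by interrupt number. The interrupt numbers and function names are invented, and `raise_interrupt()` merely stands in for what the hardware does by itself on a real chip. The handlers are written in C here for readability, even though real ones are often assembly or use compiler-specific keywords.

```c
#include <assert.h>

typedef void (*isr_t)(void);

enum { IRQ_TIMER = 0, IRQ_UART = 1, IRQ_COUNT = 2 };  /* invented interrupt numbers */

static volatile unsigned timer_ticks;  /* shared between handler and application */

static void timer_isr(void)   { timer_ticks++; }  /* keep handlers short */
static void default_isr(void) { /* unexpected interrupt: do nothing */ }

/* Vector table: one handler address per interrupt source. */
static const isr_t vector_table[IRQ_COUNT] = {
    [IRQ_TIMER] = timer_isr,
    [IRQ_UART]  = default_isr,
};

/* Stand-in for the hardware's dispatch: look up the handler and call it. */
void raise_interrupt(unsigned irq)
{
    assert(irq < IRQ_COUNT);
    vector_table[irq]();
}

/* Application-side view of the work done by the timer handler. */
unsigned ticks(void)
{
    return timer_ticks;
}
```

The application never calls `timer_isr()` directly; it only observes the effect through `ticks()`, which is why the shared counter is declared `volatile`.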

Hope this will provide a good start. Feel free to leave comments if you have other queries.
