When are the GAS ELF directives .type, .thumb, .size and .section needed?
I'm working on an assembly program for an ARM Cortex-M3 based microcontroller (Thumb-2 instruction set), using GNU as.

In some example code I find directives like `.size`, `.section` and `.type`, which I understand are ELF directives. As an example:

    .section .text.Reset_Handler
    .weak Reset_Handler
    .type Reset_Handler, %function
    Reset_Handler:
    bl main
    b Infinite_Loop
    .size Reset_Handler, .-Reset_Handler

The `.type` directive is said to set the type of a symbol, usually either to `%object` (meaning data?) or `%function`. I do not know what difference it makes. It is not always included, so I am unsure when it needs to be used.

Also related to this is the `.thumb_func` directive. From what I have read it seems like it might be equivalent to:

    .thumb
    .type Symbol_Name, %function

Or is it something completely different?

`.size` supposedly sets the size associated with a symbol. When this is needed, I have no idea. Is this calculated by default, but overridable with this directive? If so, when would you want to override it?

`.section` is easier to find docs on, and I think I have a fair idea of what it does, but I am still a little bit unsure about the usage. The way I understand it, it switches between different ELF sections (`text` for code, `data` for writable data, `bss` for uninitialized data, `rodata` for constants, and others), and defines new ones when desired. I guess you would switch between these depending on whether you define code, data, uninitialized data, etc. But why would you create a subsection for a function, as in the example above?

Any help with this is appreciated. If you can find links to tutorials or docs that explain this in greater detail, preferably understandable for a novice, I would be very grateful.

So far, the Using as manual has been of some help. Maybe you can get more out of it than I can, with more knowledge.
Comments (3)
I have been programming ARM/Thumb for many years, writing lots of assembler, and have needed very few of the many directives out there.
`.thumb_func` is quite important, as pointed out by another responder. For example:

`.arm`, which used to be something like `.code32` or `.code 32`, tells the assembler that what follows is ARM code, not Thumb code. For your Cortex-M3 you won't need to use it.

`.thumb` likewise used to be `.code 16` (maybe that still works). Same deal: it makes the following code Thumb, not ARM.

If the labels you are using are not global labels that you need to branch to from other files or indirectly, then you won't need `.thumb_func`. But in order for the address of one of these global labels to be computed properly (the lsbit is 1 for Thumb and 0 for ARM), you want to mark it as a Thumb or ARM label, and `.thumb_func` does that. Otherwise you have to set that bit yourself before branching, which adds more code, and the label is not callable from C.

Up to the `.thumb` directive the assembler emits ARM code, as desired. Both the two and three labels/functions are Thumb code, as desired, but the two label has an even-numbered address and three has the proper odd-numbered address.

The latest CodeSourcery tools were used to assemble, link, and dump the sample.
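The listing this answer refers to is not reproduced here. As a rough sketch of the kind of file being described, the following assumes a toolchain targeting a core that supports both ARM and Thumb states. The labels `one`, `two` and `three` follow the text; the instructions and the trailing `.word` entries are placeholders chosen for illustration, not the author's original sample.

```asm
@ Hypothetical sample: only the label names come from the text above.
    .globl _start
_start:
    b reset                 @ assembled as ARM code (default state assumed)
reset:

    .arm
    .globl one
one:                        @ ARM code
    add r0, r0, #1
    bx lr

    .thumb
    .globl two
two:                        @ Thumb code, but not marked as a Thumb function
    add r0, r0, #2
    bx lr

    .globl three
    .thumb_func             @ marks the next label as a Thumb function
three:                      @ Thumb code, marked as a Thumb function
    add r0, r0, #3
    bx lr

    .balign 4
    .word two               @ expected to resolve to an even address
    .word three             @ expected to resolve to an odd address (bit 0 set)
```

Assembling and linking something like this and then disassembling the result (for example with `objdump -D`) is one way to check how the two `.word` entries differ with your own tool versions.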
Now for the Cortex-M3, where everything is Thumb (/Thumb-2), `.thumb_func` may not be as important; it may just work with command line switches (it is very easy to do an experiment to find out). It is a good habit to have, though, in case you move away from a Thumb-only processor to a normal ARM/Thumb core.

Assemblers generally like to add all of these directives and other ways of making things look and feel more like a high-level language. I am just saying you don't have to use them. I have switched assemblers for ARM, use many different assemblers for many different processors, and prefer the less-is-more approach: focus on the assembly itself and use as few tool-specific items as possible. I am usually the exception rather than the rule, though, so you can probably figure out the more commonly used directives by looking at what directives the compiler generates in its output (and verifying with the documentation).
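For the Cortex-M3 case, the experiment suggested above could start from something like the sketch below. It assumes unified syntax and uses made-up names (`.isr_vector_sketch`, `reset_sketch`, and the stack address); none of them come from the original post. The point it illustrates is that even on a Thumb-only core, an address stored as data (such as a vector table entry) needs bit 0 set, which is where marking the label as a Thumb function can still matter.

```asm
    .syntax unified
    .cpu cortex-m3
    .thumb

    .section .isr_vector_sketch, "a", %progbits   @ hypothetical section name
    .word 0x20001000        @ hypothetical initial stack pointer value
    .word reset_sketch      @ address of the handler, stored as data

    .text
    .globl reset_sketch
    .thumb_func             @ try removing this and compare the stored address
reset_sketch:
    b reset_sketch          @ placeholder body: loop forever
```

Dumping the linked output with and without the `.thumb_func` line should show whether the stored handler address comes out odd or even with your toolchain.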
I do use `.align` when mixing ARM and Thumb assembler, or data in with assembler. You would expect the assembler for such a platform to know something as obvious as Thumb instructions being on halfword boundaries and ARM instructions being aligned on word boundaries. The tools are not always that smart. Sprinkling `.align`s about won't hurt.

`.text` is the default, so it is a bit redundant, but it won't hurt.

`.text` and `.data` are standard attributes (not specific to ARM). If you are compiling for a combination of ROM and RAM on your target you may care (it depends on what you do with your linker script); otherwise `.text` will work for everything.

`.size` is apparently the size of the function, from its start to that directive. The assembler cannot figure this out on its own, so if the size of this function is important for your code, linker script, debugger, loader, or whatever, then this needs to be right; otherwise you don't have to bother. A function is a high-level concept anyway; assembler doesn't really have functions, much less a need to declare their size. And the C compiler certainly doesn't care: it is only looking for a label to branch to and, in the case of the ARM family, whether it is Thumb code or ARM code that is being branched to.

You may find the `.pool` directive (there is a newer equivalent) useful if you are lazy with your immediates (`ldr rx,=0x12345678`) on long stretches of code. Here again the tools are not always smart enough to place this data after an unconditional branch; you sometimes have to tell them. I say lazy half seriously: it is painful to do the `label: .word` thing all the time, and I believe both the ARM and GCC tools allow that shortcut, so I use it as much as anyone else.

Also note that LLVM outputs an additional `.eabi_attribute` or two that is supported by CodeSourcery's version of (or mods to) binutils but not supported (perhaps not yet) by the GNU-released binutils. Two solutions that work: modify LLVM's asm print function to not write the `eabi_attribute`s, or at least to write them as comments (`@`), or get the binutils source/mods from CodeSourcery and build binutils that way. CodeSourcery tends to lead GNU (Thumb-2 support, for example), or perhaps backports new features, so I assume these LLVM attributes will be present in mainline binutils before long. I have suffered no ill effects from trimming the `eabi_attribute`s off the LLVM-compiled code.

Here is the LLVM output for the same function above; apparently this is from the llc that I modified to comment out the `eabi_attribute`s.

The ELF file format is well documented and very easy to parse if you want to really see what the ELF-specific directives (if any) are doing. Many of these directives are there to help the linker more than anything: `.thumb_func`, `.text`, `.data`, for example.
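To make the `.align`, `.size` and `.pool` remarks in this answer concrete, here is a small sketch. The function name `load_constant` is made up for illustration, and unified syntax is an assumption. In GNU as for ARM, `.pool` is a synonym for `.ltorg`, which is presumably the "newer equivalent" mentioned above.

```asm
    .syntax unified
    .thumb
    .text

    .align 2                    @ on ARM targets this means 2^2 = 4-byte alignment
    .globl load_constant        @ hypothetical function name
    .thumb_func
load_constant:
    ldr r0, =0x12345678         @ constant is fetched from a literal pool
    bx lr
    .size load_constant, . - load_constant   @ bytes from the label to this point

    .pool                       @ emit the pending literal pool here, after the return
```

Placing `.size` before `.pool` leaves the literal pool out of the recorded size; moving it after `.pool` would include it. Which one you want depends on what consumes the size.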
Sections of your program are tightly related to the ELF format in which most systems (Linux, BSD, ...) store their object and executable files. This article should give you a good insight about how ELF works, which will help you understand the why of sections.
Simply put, sections let you organize your program into different memory areas which have different properties, including address, permission to execute and write, etc. During the final link stage, the linker uses a particular linker script that usually groups all sections of the same name together (e.g. all code from all compilation units together, ...) and assigns them a final address in memory.
For embedded systems their use is particularly obvious. First, the boot code (usually contained in the `.text` section) must be loaded at a fixed address in order to be executed. Then, read-only data can be grouped into a dedicated read-only section that will be mapped into the ROM area of the device. A last example: operating systems have initialization functions that are called only once and never used afterwards, wasting precious memory space. If all these initialization functions are grouped together into a dedicated section called, say, `.initcode`, and if this section is set to be the last section of the program, then the operating system can easily reclaim this memory once initialization is finished by lowering the upper limit of its own memory. Linux, for instance, is known to use that trick, and GCC allows you to place a variable or function into a specific section by postfixing its declaration with `__attribute__ ((section ("MYSECTION")))`.
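In assembly, the same grouping is done with `.section`. The sketch below is only an illustration: `.initcode` is the example name used above, while `board_init` and `boot_magic` are hypothetical symbols, and the linker script that would actually place these sections is not shown.

```asm
    .syntax unified
    .thumb

    .section .initcode, "ax", %progbits   @ "a" = allocatable, "x" = executable
    .globl board_init                      @ hypothetical one-time init routine
    .thumb_func
board_init:
    bx lr

    .section .rodata                       @ read-only data, typically placed in ROM/flash
    .balign 4
boot_magic:                                @ hypothetical constant
    .word 0x12345678

    .text                                  @ switch back to the default code section
```

This also hints at why the question's example uses a per-function section such as `.text.Reset_Handler`: when each function lives in its own section, the GNU linker's `--gc-sections` option can discard sections that nothing references.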
`.type` and `.size` are actually still quite unclear to me too. I see them as helpers for the linker and never saw them outside of assembler-generated code.

`.thumb_func` seems to only be needed for the old OABI interface in order to allow interworking with ARM code. Unless you are using an old toolchain, you probably don't have to worry about it.
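For what it is worth, the effect of `.type` and `.size` is visible in the ELF symbol table: `.type` sets the symbol's type (`STT_FUNC` for `%function`, `STT_OBJECT` for `%object`) and `.size` sets its recorded size. A minimal sketch, with made-up symbol names `get_counter` and `counter`:

```asm
    .syntax unified
    .thumb

    .text
    .globl get_counter                 @ hypothetical function
    .type get_counter, %function       @ symbol type STT_FUNC
get_counter:
    ldr r0, =counter
    ldr r0, [r0]
    bx lr
    .pool
    .size get_counter, . - get_counter

    .data
    .balign 4
    .globl counter                     @ hypothetical variable
    .type counter, %object             @ symbol type STT_OBJECT
counter:
    .word 0
    .size counter, . - counter         @ one word, so 4 bytes
```

Running `readelf -s` on the resulting object file lists both symbols with their Type and Size columns, which is an easy way to see what the directives changed.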
I came across this when trying to figure out why ARM and Thumb interworking broke with recent binutils (verified with 2.21.53 (MacPorts), also 2.22 (Yagarto 4.7.1)).
From my experience, `.thumb_func` worked fine with earlier binutils to generate the correct interworking veneers. However, with the more recent releases, the `.type *name*, %function` directive is needed to ensure proper veneer generation.

binutils mailing list post

I'm too lazy to dig up an older version of binutils to check whether the `.type` directive is sufficient in place of `.thumb_func` for earlier binutils. I guess there is no harm in including both directives in your code.

Edited: updated the comment on using `.thumb_func` in the code. Apparently it works for ARM->Thumb interworking, flagging the Thumb routine so that veneers are generated, but Thumb->ARM interworking fails unless the `.type` directive is used to flag the ARM function.
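Following the "include both" suggestion, a sketch of how the two routines might be annotated is shown below. The names `arm_routine` and `thumb_routine` are hypothetical, and the file assumes a core that supports both ARM and Thumb states (it would not assemble for a Thumb-only part such as the Cortex-M3).

```asm
    .syntax unified

    .arm
    .globl arm_routine                 @ hypothetical ARM-state function
    .type arm_routine, %function       @ flags the ARM function for Thumb->ARM calls
arm_routine:
    bx lr
    .size arm_routine, . - arm_routine

    .thumb
    .globl thumb_routine               @ hypothetical Thumb-state function
    .type thumb_routine, %function     @ giving both directives should be harmless
    .thumb_func                        @ flags the next label as a Thumb function
thumb_routine:
    bx lr
    .size thumb_routine, . - thumb_routine
```

Whether the linker then inserts a veneer or rewrites the call depends on the architecture level and binutils version, so it is worth checking the disassembly of a caller in the linked image.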