Configuring a specific peripheral register in an ARM9-based chip

Posted 2024-08-25 10:18:10

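I have a Verilog-based verification environment for an ARM-based chip. I have to write new C++ tests to verify a peripheral. I have all the ARM GCC tools in place. What I don't know is how to make a particular peripheral register visible in a C++ test. I want to write to that register, wait for an interrupt from the peripheral, and then read back the status of another peripheral register.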

How can I do that, and which ARM documents should I refer to? Everything I have found is aimed at system developers; I need the basics. Regards, Manish


拥抱影子 2024-09-01 10:18:10


Eventually, if not immediately, you will want the ARM ARM (yes, ARM twice: once for ARM, once for Architecture Reference Manual; you can google it and find it as a free download). Second, you want the TRM, the Technical Reference Manual, for the specific core in your chip. ARM doesn't make chips; they make processor cores that other companies put in their chips, so the chip vendor may or may not include the TRM in its documentation. If you have an ARM core in Verilog then I assume you purchased it, which means you have the TRM for that specific core available, plus the manuals for any add-ons (a cache, for example).

Take this with a grain of salt, but I have done what you are doing for many years now (testing in simulation and later on the real chip), and my preference is to write my C code as if it were going to run embedded on the ARM. In your case perhaps it actually is running embedded on the ARM.

Instead of something like this:


#define SOMEREG (*(volatile unsigned int *)0X12345678)

and then in your code


SOMEREG = 0xabc;

or


somevariable = SOMEREG;
somevariable |= 0x10;
SOMEREG = somevariable;

My C code uses external functions.


extern unsigned int GET32 ( unsigned int address );
extern void PUT32 ( unsigned int address, unsigned int data);


somevariable = GET32(0x12345678);
somevariable|=0x10;
PUT32(0x12345678,somevariable);

When running on the chip, in or out of simulation, the functions are implemented in assembler:


.globl PUT32
PUT32:
   str r1,[r0]
   bx lr ;@ or mov pc,lr depending on architecture
.globl PUT16
PUT16:
   strh r1,[r0]
   bx lr
.globl GET32
GET32:
   ldr r0,[r0] ;@ I know what the ARM ARM says, this works
   bx lr
.globl GET16
GET16:
   ldrh r0,[r0]
   bx lr

Say you name the file putget.s:

arm-something-as putget.s -o putget.o

then link putget.o in with your C/C++ objects.
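For the sequence in the question (write a register, wait for the interrupt, read back a status register), a test body on top of PUT32/GET32 might look roughly like the sketch below. The register addresses, the bit names, `peripheral_irq_handler`, `run_peripheral_test` and the fake backend are all made up for illustration. The real values come from your peripheral's spec, and on the ARM the two access functions would come from putget.o rather than from the fake versions shown here.

```c
/* Hypothetical register map; take the real addresses from the peripheral spec. */
#define PERIPH_CTRL   0x12345678u
#define PERIPH_STATUS 0x1234567Cu
#define CTRL_START    0x00000001u
#define STATUS_DONE   0x00000010u

/* Set by the interrupt handler, polled by the test. */
static volatile int irq_seen;

/* On the ARM this would be reached from the IRQ exception vector. */
void peripheral_irq_handler(void)
{
    irq_seen = 1;
}

/* Fake backend so the sketch runs on a host.
   On the target, drop this section and link putget.o instead. */
static unsigned int fake_status;

void PUT32(unsigned int address, unsigned int data)
{
    if (address == PERIPH_CTRL && (data & CTRL_START)) {
        fake_status |= STATUS_DONE;   /* pretend the peripheral finished */
        peripheral_irq_handler();     /* pretend the interrupt fired */
    }
}

unsigned int GET32(unsigned int address)
{
    return (address == PERIPH_STATUS) ? fake_status : 0;
}

/* Returns 0 on pass, 1 on wrong status, 2 if the interrupt never arrived. */
int run_peripheral_test(void)
{
    unsigned int timeout = 100000;
    unsigned int status;

    irq_seen = 0;
    PUT32(PERIPH_CTRL, CTRL_START);      /* kick the peripheral */

    while (!irq_seen) {                  /* wait for the handler to run */
        if (--timeout == 0) return 2;
    }

    status = GET32(PERIPH_STATUS);       /* read back the other register */
    return (status & STATUS_DONE) ? 0 : 1;
}
```

The timeout matters more in simulation than on silicon: a test that spins forever costs you hours of sim time before anyone notices.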

I have had gcc and pretty much every other compiler fail to get the *volatile approach to work 100% of the time. It usually fails right after you release your code to the manufacturing folks, who take your tests and run them on the product; then you have to stop production and rewrite or re-tune a bunch of code until the compiler is no longer confused. The external function approach has worked 100% of the time on every compiler I have used. The only drawback is performance when running embedded, but the abstraction across all interfaces and operating systems pays you back for that.

I assume you are doing one of two things. Either you are running code on the simulated ARM, trying to talk to something tied to that simulated ARM. Eventually I assume the code will be doing that, so you will have to get into tools and linker issues; there are many examples out there, some of my own as well, and, just like building a gcc cross compiler, it is trivial once you have been shown the first time. Or this is a peripheral that will eventually be tied to an ARM but for now sits outside the core yet inside the design, meaning it is hopefully memory mapped and tied to the ARM's memory interface (AMBA, AXI, etc.).

For the first case you have to overcome the embedded hurdle. You will need to build bootable code, probably ROM/flash based (read only), as that is likely how the ARM/chip will boot, and deal with linker scripts to separate ROM from RAM. My advice: eventually, if not now, the hardware engineers will want to simulate the ROM timing, which is painfully slow in simulation. So compile your program to run completely from RAM (other than the exception table, which is a separate topic), and compile to a binary format you are willing to write an ad hoc reader for. ELF is easy, so are ihex and srec, but none is as easy as a plain old binary .bin.

What you ultimately want is some assembler that boots from the virtual PROM/flash, enables the instruction cache (if it is implemented and working in simulation; if not, skip that step for now), uses ldm and stm instructions in a loop to copy as many words at a time as it can to RAM, then branches to RAM. I have a host-based utility that takes the .bin file and generates an assembler program containing both the copy loop and the binary itself embedded as .word directives; that program is then assembled and linked into a format the simulation can use.

Do not let the hardware engineers convince you that you have to rebuild the Verilog every time. You can use $readmemh() or something similar in Verilog to read a file at run time, so changing the ARM binary does not require recompiling the Verilog. You will want to write an ad hoc host-based utility to convert your .bin or .elf or whatever into a file the Verilog can read; the readmemh format is trivial.

But I am getting off on a tangent. Use the put/get functions to talk to the registers. Use the TRM and the ARM ARM to work out where to place the interrupt handler code. You also have to enable the interrupt, most likely in more than one place in the ARM as well as in the peripheral.
The beauty of simulation is that you can watch your code execute, see the interrupt leave the peripheral, and debug based on what you see. With a real chip you don't know whether your code failed to trigger the interrupt, failed to enable it, or the interrupt worked but your handler has a mistake. With a Verilog simulator you can see all of this, and you should strive to learn to read the waveforms rather than rely on the hardware engineers to do it for you. ModelSim, Cadence, or whichever simulator you use can save the waveforms in .vcd format, and a free tool named gtkwave can view them. Don't let anyone convince you that there are no more licenses available for you to look at stuff.
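The .bin to $readmemh conversion mentioned above really is only a few lines. Here is a minimal sketch, assuming the Verilog memory is declared 32 bits wide and the image is little-endian; the function name `bin_to_readmemh` is my own invention, not part of any tool.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Convert a raw little-endian .bin image into text for $readmemh:
   one 32-bit word per line, 8 hex digits, no 0x prefix.
   Returns the number of words written. */
size_t bin_to_readmemh(FILE *in, FILE *out)
{
    uint8_t b[4];
    size_t n, i, words = 0;

    while ((n = fread(b, 1, sizeof b, in)) > 0) {
        for (i = n; i < sizeof b; i++)
            b[i] = 0;                       /* zero-pad a short final word */

        uint32_t w = (uint32_t)b[0]
                   | ((uint32_t)b[1] << 8)
                   | ((uint32_t)b[2] << 16)
                   | ((uint32_t)b[3] << 24);

        fprintf(out, "%08lx\n", (unsigned long)w);
        words++;
    }
    return words;
}
```

Open the input with `fopen(path, "rb")` and the output as plain text. On the Verilog side something like `$readmemh("program.hex", ram);` would then load it, where both the file name and the memory name are placeholders for whatever your test bench uses. If the memory model is 8 or 16 bits wide, the packing has to change to match.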

All of that is secondary if this is an off-core but on-chip peripheral; in that case you probably want to test the logic without the ARM core first. If you don't know Verilog, it's easy: just look at the code and you can figure it out. Software engineers already experienced in other languages, particularly C, can pick it up in a few days or a week. Either way, the hardware engineer likely has a test bench for the peripheral. Create, or have them create, a test bench with a register interface similar to what you will see once connected to the ARM, either directly on the ARM bus or on a test bench interface that simplifies the ARM bus.

Then use VPI, which is ugly but works (google foreign language interface as well as vpi), to connect the simulation to C code on the host machine running it. Do most of your work in C and Verilog, minimizing the VPI nightmare. Because the VPI code is, in a sense, compiled and linked into the simulation, you do not want to rebuild the sim every time you change your test program. So use sockets or some other IPC interface to keep the test program separate from the VPI code. Then write some host code that implements put32 and get32 (put8, put16, whatever functions you want). Now take your test program, which could run on the ARM if compiled that way, and instead compile it on the host, linking it against the put/get/whatever abstraction layer. You can now write programs that for the moment run on the host but interact with the peripheral in simulation as if it were real hardware and as if your host programs were embedded programs on the ARM. The interrupt is likely trivial in this environment: either look for it in the waveforms, or have the VPI code print something on the console when the signal changes state.
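As a rough idea of the host side of that socket bridge, here is a sketch of PUT32/GET32 that forward each access over an already-connected descriptor. The 9-byte message layout (one opcode byte, then address and data as big-endian 32-bit values) is purely an assumption of mine; the VPI code on the other end, which is not shown, would have to agree on whatever format you pick. `sim_fd` and the helper names are hypothetical too.

```c
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

/* Descriptor of an already-connected socket (or pipe) to the VPI side. */
int sim_fd = -1;

static void pack32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

static uint32_t unpack32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* write() and read() may transfer fewer bytes than asked, so loop. */
static int write_all(int fd, const uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0) return -1;
        buf += n; len -= (size_t)n;
    }
    return 0;
}

static int read_all(int fd, uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = read(fd, buf, len);
        if (n <= 0) return -1;
        buf += n; len -= (size_t)n;
    }
    return 0;
}

/* 'W' + address + data; no reply expected in this made-up protocol. */
void PUT32(unsigned int address, unsigned int data)
{
    uint8_t msg[9];
    msg[0] = 'W';
    pack32(msg + 1, address);
    pack32(msg + 5, data);
    (void)write_all(sim_fd, msg, sizeof msg);
}

/* 'R' + address + zero; the simulator answers with 4 bytes of data. */
unsigned int GET32(unsigned int address)
{
    uint8_t msg[9], reply[4];
    msg[0] = 'R';
    pack32(msg + 1, address);
    pack32(msg + 5, 0);
    if (write_all(sim_fd, msg, sizeof msg) != 0) return 0;
    if (read_all(sim_fd, reply, sizeof reply) != 0) return 0;
    return unpack32(reply);
}
```

Returning 0 on a broken connection is a shortcut to keep the GET32 signature unchanged; a real bridge would probably want to abort the test instead, since 0 is also a legitimate register value.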

Oh, the reason for copying from ROM to RAM and then running from RAM is that on average your sim times will be significantly shorter: fives and tens of minutes instead of hours. Simulating the peripheral by itself without the ARM, using a foreign language interface to bridge to and from the host, cuts your sim time from fives of minutes to seconds, depending on what you are doing. If you use some sort of abstraction like my put/get, you can write your peripheral code one time in one file and link it different ways. That one file/program/function can be used with the peripheral alone in simulation for quickly developing your code, then with the ARM in place in simulation, adding the complexity of the ARM exceptions and interrupt system, and later on the real chip exactly as it ran on the simulated chip. Later still, that code can hopefully be used as is in a driver or in application space using mmap, etc.
