Improving the performance of 32-bit math on a 16-bit processor

Posted 2024-12-13 18:46:52

I am working on firmware for an embedded device that uses a 16-bit PIC running at 40 MIPS, programmed in C. The system controls the position of two stepper motors and must track the step position of each motor at all times. The maximum position of each motor is around 125000 steps, so I cannot use a 16-bit integer to keep track of the position; I must use a 32-bit unsigned integer (DWORD). The motors move at 1000 steps per second, and I have designed the firmware so that steps are processed in a timer ISR. The timer ISR does the following:

1) Compare the current position of one motor to the target position. If they are the same, set the isMoving flag to false and return. If they are different, set the isMoving flag to true.

2) If the target position is larger than the current position, move one step forward, then increment the current position.

3) If the target position is smaller than the current position, move one step backward, then decrement the current position.

Here is the code:

void _ISR _NOPSV _T4Interrupt(void)
{
    static char StepperIndex1 = 'A';    

    if(Device1.statusStr.CurrentPosition == Device1.statusStr.TargetPosition)
    {
        Device1.statusStr.IsMoving = 0;
        // Do Nothing
    }   
    else if (Device1.statusStr.CurrentPosition > Device1.statusStr.TargetPosition)
    {
        switch (StepperIndex1)      // MOVE OUT
        {
            case 'A':
                SetMotor1PosB();
                StepperIndex1 = 'B';
                break;
            case 'B':
                SetMotor1PosC();
                StepperIndex1 = 'C';
                break;
            case 'C':
                SetMotor1PosD();
                StepperIndex1 = 'D';
                break;
            case 'D':
            default:
                SetMotor1PosA();
                StepperIndex1 = 'A';
                break;      
        }
        Device1.statusStr.CurrentPosition--;    
        Device1.statusStr.IsMoving = 1;
    }   
    else
    {
        switch (StepperIndex1)      // MOVE IN 
        {
            case 'A':
                SetMotor1PosD();
                StepperIndex1 = 'D';
                break;
            case 'B':
                SetMotor1PosA();
                StepperIndex1 = 'A';
                break;
            case 'C':
                SetMotor1PosB();
                StepperIndex1 = 'B';
                break;
            case 'D':
            default:
                SetMotor1PosC();
                StepperIndex1 = 'C';
                break;      
        }
        Device1.statusStr.CurrentPosition++;
        Device1.statusStr.IsMoving = 1;
    }   
    _T4IF = 0;          // Clear the Timer 4 Interrupt Flag.
}

The target position is set in the main program loop when move requests are received. The SetMotorPos lines are just macros to turn on/off specific port pins.

My question is: is there any way to improve the efficiency of this code? The code works fine as is when the positions are 16-bit integers, but with 32-bit integers there is too much processing. The device must communicate with a PC without hesitation, and as written there is a noticeable performance hit. I really only need 18-bit math, but I don't know of an easy way to do that. Any constructive input or suggestions would be most appreciated.

如梦初醒的夏天 2024-12-20 18:46:52

Warning: all numbers are made up...

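Suppose the ISR above compiles to about 200 instructions (probably fewer), including the instructions that save and restore the CPU registers on entry and exit, that each instruction takes 5 clock cycles (more likely 1 to 3), and that you call two such ISRs 1000 times per second each. We end up with 2*1000*200*5 = 2 million clock cycles per second, or about 2 MIPS.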
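The estimate above can be written out as a small host-side calculation. This is only a sketch: the function names are hypothetical, the inputs are the made-up figures from this answer rather than measurements, and the 40 MIPS rating is treated as 40 million cycles per second, as the answer does.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles per second spent in the ISRs, given rough per-call estimates. */
uint32_t isr_cycles_per_second(uint32_t isr_count,
                               uint32_t calls_per_second,
                               uint32_t instructions_per_call,
                               uint32_t cycles_per_instruction)
{
    return isr_count * calls_per_second *
           instructions_per_call * cycles_per_instruction;
}

/* Share of the CPU budget in whole percent (rounded down). */
uint32_t load_percent(uint32_t used_cycles, uint32_t available_cycles)
{
    assert(available_cycles != 0u);  /* avoid dividing by zero */
    return (used_cycles * 100u) / available_cycles;
}
```

With the answer's numbers, `isr_cycles_per_second(2, 1000, 200, 5)` gives 2000000, and `load_percent(2000000, 40000000)` gives 5, so even this pessimistic guess leaves most of the processor free.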

Are you actually consuming the remaining 38 MIPS elsewhere?

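The only thing that may matter here, and that I can't see, is what happens inside the SetMotor*Pos*() functions. Do they do any complex calculations? Do they perform some slow communication with the motors, such as waiting for a response to the commands sent to them?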

At any rate, it is doubtful that code this simple would be noticeably slower with 32-bit integers than with 16-bit ones.

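If your code is slow, find out where the time is spent and how much of it, and profile it. Generate a square pulse in the ISR (set a pin to 1 when the ISR starts and back to 0 just before it returns) and measure its duration with an oscilloscope, or do whatever is easier for you to find out. Measure the time spent in all parts of the program, then optimize where it is really necessary, not where you previously assumed it would be.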
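A minimal sketch of the pulse technique, written so it can run on a host. Everything here is an assumption for illustration: `debug_latch` stands in for a real port latch register, `DEBUG_PIN_MASK` for a spare output pin, and `do_step_work()` for the existing step logic. On the target you would set and clear the real pin at the top and bottom of `_T4Interrupt` instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a port latch register (hypothetical, not a real SFR). */
static volatile uint16_t debug_latch = 0;
#define DEBUG_PIN_MASK 0x0001u

static uint16_t isr_calls = 0;
static int pin_high_during_work = 0;

/* Placeholder for the stepping logic being timed. */
static void do_step_work(void)
{
    pin_high_during_work = (debug_latch & DEBUG_PIN_MASK) != 0u;
    isr_calls++;
}

void timer_isr_sketch(void)
{
    /* Host-side check: the pin should be low on entry (no nesting). */
    assert((debug_latch & DEBUG_PIN_MASK) == 0u);

    debug_latch |= DEBUG_PIN_MASK;             /* rising edge: ISR entered */
    do_step_work();
    debug_latch &= (uint16_t)~DEBUG_PIN_MASK;  /* falling edge: ISR done   */
}
```

The pulse width seen on the oscilloscope is then the ISR body time; it excludes the interrupt entry and exit overhead that runs before the first line and after the last.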

南城旧梦 2024-12-20 18:46:52

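I don't think the difference between 16-bit and 32-bit arithmetic should be that large, since you only use increment and compare. The problem may be that every 32-bit arithmetic operation implies a function call (if the compiler can't or won't inline the simpler operations).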

One suggestion is to do the arithmetic yourself: split Device1.statusStr.CurrentPosition into two halves, say Device1.statusStr.CurrentPositionH and Device1.statusStr.CurrentPositionL, and use macros for the operations, such as:

#define INC(xH, xL) do { if (++(xL) == 0) (xH)++; } while (0)  /* carry into xH when xL wraps */
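The split-counter idea can be extended to the decrement and comparison the ISR also needs. This is a sketch under assumptions: the `SplitPos` type and the `POS_INC`, `POS_DEC`, and `pos_compare` names are hypothetical, and whether this beats the compiler's own 32-bit code should be checked in the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit position held as two 16-bit halves. */
typedef struct {
    uint16_t hi;  /* upper 16 bits */
    uint16_t lo;  /* lower 16 bits */
} SplitPos;

/* Increment: carry into hi when lo wraps from 0xFFFF to 0. */
#define POS_INC(p) do { if (++(p).lo == 0u) (p).hi++; } while (0)

/* Decrement: borrow from hi when lo was 0 before the decrement. */
#define POS_DEC(p) do { if ((p).lo-- == 0u) (p).hi--; } while (0)

/* Returns -1, 0, or 1 when *a is below, equal to, or above *b. */
int pos_compare(const SplitPos *a, const SplitPos *b)
{
    assert(a != 0 && b != 0);
    if (a->hi != b->hi) return (a->hi > b->hi) ? 1 : -1;
    if (a->lo != b->lo) return (a->lo > b->lo) ? 1 : -1;
    return 0;
}
```

In the common case only the low halves change, so the high half is touched once every 65536 steps.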

初熏 2024-12-20 18:46:52

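I would get rid of the StepperIndex1 variable and instead use the two low-order bits of CurrentPosition to keep track of the current step index. Alternatively, track the current position in full rotations rather than individual steps, so that it fits in a 16-bit variable (taking a rotation to mean one four-step A-to-D cycle, 125000 steps is about 31250 counts). When moving, only increment or decrement the position when moving to phase 'A'. Of course, this means you can only target whole rotations, not individual steps.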
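A host-side sketch of the first suggestion, deriving the phase from the two low bits of the position. The names are hypothetical: `set_motor_phase()` stands in for the SetMotor1PosA() to SetMotor1PosD() macros, with phases 0 to 3 meaning A to D. It assumes the motor is at phase A whenever the position is a multiple of 4, and it negates the position so that decrementing advances A, B, C, D as in the question's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SetMotor1PosA()..SetMotor1PosD():
   records which phase (0 = A .. 3 = D) was driven last. */
static uint8_t last_phase = 0;
static void set_motor_phase(uint8_t phase)
{
    assert(phase < 4u);
    last_phase = phase;
}

static uint32_t current_position = 0;
static uint32_t target_position = 0;
static uint8_t  is_moving = 0;

/* One timer tick: move one step toward the target, if any. */
void step_once(void)
{
    if (current_position == target_position) {
        is_moving = 0;
        return;
    }
    if (current_position > target_position)
        current_position--;
    else
        current_position++;

    /* Phase from the low two bits; negated to match the question's
       direction (decrement goes A->B->C->D, increment A->D->C->B). */
    set_motor_phase((uint8_t)((0u - current_position) & 3u));
    is_moving = 1;
}
```

This removes both switch statements and the separate index variable; the cost is that the position and the coil phase can no longer drift apart, so the position must be initialized consistently with the physical phase.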

恬淡成诗 2024-12-20 18:46:52

Sorry, but you are using bad program design.

Let's check the difference between 16-bit and 32-bit PIC24 or PIC33 asm code...

16-bit increment

inc    PosInt16               ;one cycle

So a 16-bit increment takes one cycle.

32-bit increment

clr    Wd                     ;one cycle
inc    low PosInt32           ;one cycle
addc   high PosInt32, Wd      ;one cycle

and a 32-bit increment takes three cycles.
The total difference is 2 cycles, or 50 ns (nanoseconds) at 40 MIPS.

A simple calculation tells the whole story. You have 1000 steps per second and a 40 MIPS DSP, so you have 40000 instructions available per step. More than enough!
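The budget arithmetic can be checked with a short host-side calculation. This is a sketch with hypothetical function names, assuming one instruction per cycle so that 40 MIPS means 40000000 cycles per second.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction cycles available between two step interrupts. */
uint32_t cycles_per_step(uint32_t cycles_per_second, uint32_t steps_per_second)
{
    assert(steps_per_second != 0u);
    return cycles_per_second / steps_per_second;
}

/* Duration of a number of cycles in nanoseconds (rounded down). */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t cycles_per_second)
{
    assert(cycles_per_second != 0u);
    /* 64-bit intermediate so cycles * 1e9 cannot overflow. */
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / cycles_per_second);
}
```

Here `cycles_per_step(40000000, 1000)` gives 40000 and `cycles_to_ns(2, 40000000)` gives 50, matching the figures in this answer.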

自由如风 2024-12-20 18:46:52

When you changed from 16-bit to 32-bit, did you change any compile flags to tell the compiler to build a 32-bit application?

Have you tried compiling with the 32-bit extensions while still using only 16-bit integers? Do you still get the same performance drop?

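It is likely that simply changing from 16-bit to 32-bit makes some operations compile differently. Try diffing the two sets of compiled ASM to see what actually differs, and whether it is a lot or only a couple of lines.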

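One solution might be to use two 16-bit integers instead of one 32-bit integer. When valueA reaches int16.Max, set it to 0 and increment valueB by 1; otherwise just increment valueA by 1. When valueB is >= 3, check whether valueA >= 26696 (or something similar, depending on whether you use unsigned or signed int16); that gives you the motor limit check at 125000, since 3 * 32768 + 26696 = 125000.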
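A sketch of that two-counter scheme, assuming valueA rolls over after 32767 (the signed 16-bit maximum), so each count of valueB is worth 32768 steps. The `PairPos` type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_MAX   32767u   /* int16 maximum: rollover point for valueA */
#define HIGH_MAX  3u       /* 3 * 32768 = 98304                        */
#define LOW_LIMIT 26696u   /* 98304 + 26696 = 125000                   */

/* Hypothetical position held as two 16-bit counters. */
typedef struct {
    uint16_t valueA;  /* low part, 0..32767  */
    uint16_t valueB;  /* number of rollovers */
} PairPos;

void pair_increment(PairPos *p)
{
    assert(p != 0);
    if (p->valueA == LOW_MAX) {
        p->valueA = 0;
        p->valueB++;
    } else {
        p->valueA++;
    }
}

/* Nonzero once the position has reached 125000 steps. */
int pair_at_max(const PairPos *p)
{
    assert(p != 0);
    return p->valueB > HIGH_MAX ||
           (p->valueB == HIGH_MAX && p->valueA >= LOW_LIMIT);
}

/* Combined position, for checking the scheme on a host. */
uint32_t pair_value(const PairPos *p)
{
    assert(p != 0);
    return (uint32_t)p->valueB * 32768u + p->valueA;
}
```

The ISR's target comparison would need the same two-part treatment, which is where most of the extra code ends up.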
