GCC ARM Assembly Preprocessor Macro
I am trying to use an assembly (ARM) macro for fixed-point multiplication:
#define MULT(a,b) __asm__ __volatile__ ( \
"SMULL r2, r3, %0, %1\n\t" \
"ADD r2, r2, #0x8000\n\t" \
"ADC r3, r3, #0\n\t" \
"MOV %0, r2, ASR#16\n\t" \
"ORR %0, %0, r3, ASL#16" \
: "=r" (a) : "0"(a), "1"(b) : "r2", "r3" );
but when trying to compile I get the error: expected expression before 'asm'
(You can ignore everything below this if you value your time, but it would be nice if you took a look at it. The main question here is how to make the above work.)
I tried this:
static inline GLfixed MULT(GLfixed a, GLfixed b){
asm volatile(
"SMULL r2, r3, %[a], %[b]\n"
"ADD r2, r2, #0x8000\n"
"ADC r3, r3, #0\n"
"MOV %[a], r2, ASR#16\n"
"ORR %[a], %[a], r3, ASL#16\n"
: "=r" (a)
: [a] "r" (a), [b] "r" (b)
: "r2", "r3");
return a; }
This compiles, but there seems to be a problem: when I use constants, e.g. MULT(65536,65536), it works, but when I use variables the results are garbage:
GLfixed m[16];
m[0]=costab[player_ry];//1(65536 integer representation)
m[5]=costab[player_rx];//1(65536 integer representation)
m[6]=-sintab[player_rx];//0
m[8]=-sintab[player_ry];//0
LOG("%i,%i,%i",m[6],m[8],MULT(m[6],m[8]));
m[1]=MULT(m[6],m[8]);
m[2]=MULT(m[5],-m[8]);
m[9]=MULT(-m[6],m[0]);
m[10]=MULT(m[5],m[0]);
m[12]=MULT(m[0],0)+MULT(m[8],0);
m[13]=MULT(m[1],0)+MULT(m[5],0)+MULT(m[9],0);
m[14]=MULT(m[2],0)+MULT(m[6],0)+MULT(m[10],0);
m[15]=0x00010000;//1(65536 integer representation)
int i=0;
while(i<16)
{
LOG("%i,%i,%i,%i",m[i],m[i+1],m[i+2],m[i+3]);
i+=4;
}
The above code prints (LOG is like printf here):
0,0,-1411346156
65536,65536,65536,440
-2134820096,65536,0,-1345274311
0,65536,22,220
65536,196608,131072,65536
whereas the correct result would be (there is obviously a lot of junk in the above):
0,0,0
65536,0,0,0
0,65536,0,0
0,0,65536,0
0,0,0,65536
Comments (2)
The first part is easy enough: the problem is that an __asm__ block is a statement, not an expression.
You can use GCC's statement expressions extension to achieve what you want: wrap the asm statement in ({ ... }) and make the result variable the last expression inside the braces.
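As an illustration of the statement-expression approach, here is a minimal sketch rather than a definitive implementation. Several things in it are assumptions that are not in the original post: the GLfixed typedef (taken to be a 32-bit 16.16 value), the temporaries res_ and rhs_, the ARM-mode guard, and the plain C fallback that lets the snippet build on other targets. The asm also departs from the question's code in three places: ADDS instead of ADD (so that ADC sees the carry of the rounding add), LSR instead of ASR on the low word (so its sign bit is not smeared into the upper half), and an extra "cc" clobber.

```c
#include <stdint.h>

typedef int32_t GLfixed; /* assumption: 16.16 fixed point, as in the question */

#if defined(__arm__) && !defined(__thumb__)
/* ARM-mode build: the asm statement sits inside ({ ... }); the final
   expression, res_, becomes the value of the whole macro.
   Operand 0 is the output, operand 1 is the input tied to it ("0"),
   operand 2 is the second factor. */
#define MULT(a, b) \
    ({ \
        GLfixed res_ = (a); \
        GLfixed rhs_ = (b); \
        __asm__ __volatile__( \
            "SMULL r2, r3, %0, %2\n\t" \
            "ADDS  r2, r2, #0x8000\n\t" \
            "ADC   r3, r3, #0\n\t" \
            "MOV   %0, r2, LSR #16\n\t" \
            "ORR   %0, %0, r3, LSL #16" \
            : "=r"(res_) \
            : "0"(res_), "r"(rhs_) \
            : "r2", "r3", "cc"); \
        res_; \
    })
#else
/* Fallback for non-ARM builds (assumption): same arithmetic in plain C,
   still written as a statement expression so MULT() yields a value. */
#define MULT(a, b) \
    ({ \
        int64_t wide_ = (int64_t)(a) * (int64_t)(b) + 0x8000; \
        (GLfixed)(wide_ >> 16); \
    })
#endif
```

Because the macro now evaluates to a value, it can appear on the right-hand side of an assignment, for example m[1] = MULT(m[6], m[8]);, and it no longer writes back into its first argument.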
The second part is due to problems in the input and output operand specifications. You have two different versions here, and both are wrong. In the macro version, you've said:

: "=r" (a) : "0"(a), "1"(b) : "r2", "r3"

which constrains
- the output a to a register (this is operand 0);
- the input a to be the same as operand 0, i.e. the same register (this is operand 1);
- the input b to be the same as operand 1, i.e. the same again (this is operand 2).

You need "r"(b) here, and can refer to it as %2.

In the inline version, you've said:

: "=r" (a) : [a] "r" (a), [b] "r" (b) : "r2", "r3"

which constrains the output a and the inputs a and b to registers, but it declares no relationship between the output and the input a, and the asm code never refers to the output operand (it has no name, and nothing uses %0).

You should be able to fix the original version with:

: "=r" (a) : "0"(a), "r"(b) : "r2", "r3"

and refer to a as either %0 or %1, and to b as %2.

The inline version can be fixed like this:

: [a] "=r" (a) : "[a]" (a), [b] "r" (b) : "r2", "r3"

and refer to the operands as %[a] and %[b].

If you want to use names in the macro version, you'll need something along the lines of

: [arg_a] "=r" (a) : "[arg_a]" (a), [arg_b] "r" (b) : "r2", "r3"

(and refer to %[arg_a] and %[arg_b]), because otherwise the preprocessor will expand the a and b inside [a] and [b].

Note the subtlety in the named argument cases: when a name is being given to an argument (as in the output a) you write [a], with no quotes, but when you are referring to the name of another already-named operand (as in the input a) you need to put it inside quotes: "[a]".
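Put together, the named-operand rules described above could look like the following inline function. This is a sketch under stated assumptions, not the answerer's exact code: the GLfixed typedef, the ARM-mode guard and the C fallback branch are additions made so the snippet builds on any target, and the ADDS, LSR and "cc" changes are deliberate departures from the question's instruction sequence (ADC needs the carry from the rounding add, and a logical shift keeps the low word's sign bit out of the result's upper half).

```c
#include <stdint.h>

typedef int32_t GLfixed; /* assumption: 16.16 fixed point, as in the question */

static inline GLfixed MULT(GLfixed a, GLfixed b)
{
#if defined(__arm__) && !defined(__thumb__)
    __asm__ __volatile__(
        "SMULL r2, r3, %[a], %[b]\n\t"   /* r3:r2 = a * b (64-bit signed) */
        "ADDS  r2, r2, #0x8000\n\t"      /* round: add 0.5 ulp, set carry */
        "ADC   r3, r3, #0\n\t"           /* propagate carry into high word */
        "MOV   %[a], r2, LSR #16\n\t"    /* low 16 bits of the result */
        "ORR   %[a], %[a], r3, LSL #16"  /* high 16 bits of the result */
        : [a] "=r"(a)                    /* output operand, named a */
        : "[a]"(a), [b] "r"(b)           /* input a tied to the output by name */
        : "r2", "r3", "cc");
    return a;
#else
    /* Fallback for non-ARM builds (assumption): same arithmetic in C. */
    int64_t wide = (int64_t)a * (int64_t)b + 0x8000;
    return (GLfixed)(wide >> 16);
#endif
}
```

Since this is a real function, the names a and b are not touched by the preprocessor, which is why the [arg_a]/[arg_b] renaming is only needed in the macro form.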
Have you tried simple C code instead of assembly? On my system with GCC 4.5.3 the compiler generates code that is at least as good as your hand-written assembler:
compiles to the following asm-code:
(Function call epilog and prolog removed)
The code becomes even better if you have multiple multiplications in a single function, because the compiler will remove the redundant mov r3, #32768 instructions.
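A plain C version of the multiplication might look like the sketch below. The function name fixmul and the GLfixed typedef are assumptions for illustration. The arithmetic mirrors the asm in the question: a 64-bit signed product, a rounding constant of 0x8000 (the 32768 mentioned above), then a shift right by 16. It relies on GCC's arithmetic right shift for negative signed values.

```c
#include <stdint.h>

typedef int32_t GLfixed; /* assumption: 16.16 fixed point, as in the question */

/* Rounded 16.16 fixed-point multiply written in plain C. */
static inline GLfixed fixmul(GLfixed a, GLfixed b)
{
    int64_t product = (int64_t)a * (int64_t)b; /* full 64-bit product */
    return (GLfixed)((product + 0x8000) >> 16); /* round, drop 16 fraction bits */
}
```

For example, fixmul(65536, 65536) evaluates to 65536 (1.0 times 1.0), and fixmul(3 * 65536, 32768) evaluates to 98304 (3.0 times 0.5 is 1.5).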