Hi. I'm trying to write an assembler with JavaCC that translates 8051 microcontroller assembly code into machine code. I have read about JavaCC grammars and how they are structured, but I have a dilemma. For example, I have the ADD instruction:
`ADD A,Rn` or `ADD A,@Ri`
and each form has its own machine code (hex code), e.g. `ADD A,R0` translates to 28H.
I can also have the MOV instruction:
`MOV A,Rn` or `MOV A,@Ri`
but I also have `MOV data_addr,Rn` and `MOV R6,#data`, and so on.
Now my problem is how to tell these instruction forms apart. Suppose I define my tokens like this:
TOKEN : {
  <IN_MOV: "mov">
| <IN_ADD: "add">
}
I can't write a separate function for each token to give it its own behavior, because I have many instructions. Checking `token.image.equals("mov")` and then branching to the matching behavior for every single instruction seems a bit much, don't you think? So I'm pretty much stuck. I don't know which way to go.
Thanks for the help!
Comments (2)
It seems you expect too much from the lexer. The lexer is a finite state machine, while the parser is not.
So the lexer should produce tokens for the instructions (MOV, ADD, ...) and tokens for the operands. The lexer should not try to be too clever and expect specific operands for specific instructions.
Now the parser can expect specific combinations of instructions and operands. For example, you can accept only @ operands with the MOV instruction, so that any other operand will cause a parse exception.
If you need to further validate the combination of instructions and operands, you have to do it in the code of the productions. For example, you can treat two identical operands as an error for some instructions; this is very difficult to express in a production but trivial in code.
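One way to avoid a separate branch per mnemonic is to let the production's Java code classify the operands and look the opcode up in a table. The following is only a minimal sketch in plain Java, not JavaCC output: the class name `OpcodeTable`, the `Kind` enum and the helper methods are hypothetical, the table covers just the six forms mentioned in the question, and only the opcode byte is computed (operand bytes such as the direct address or immediate value are left out).

```java
import java.util.HashMap;
import java.util.Map;

public class OpcodeTable {
    // Operand kinds the parser would distinguish (hypothetical names).
    enum Kind { ACC, REG, INDIRECT, DIRECT, IMMEDIATE }

    // Base opcode keyed by "MNEMONIC DEST_KIND SRC_KIND".
    private static final Map<String, Integer> BASE = new HashMap<>();
    static {
        BASE.put("ADD ACC REG", 0x28);        // ADD A,Rn
        BASE.put("ADD ACC INDIRECT", 0x26);   // ADD A,@Ri
        BASE.put("MOV ACC REG", 0xE8);        // MOV A,Rn
        BASE.put("MOV ACC INDIRECT", 0xE6);   // MOV A,@Ri
        BASE.put("MOV DIRECT REG", 0x88);     // MOV data_addr,Rn
        BASE.put("MOV REG IMMEDIATE", 0x78);  // MOV Rn,#data
    }

    // Decide the operand kind from its text; anything unrecognized
    // is treated as a direct address in this sketch.
    static Kind classify(String operand) {
        String s = operand.trim().toUpperCase();
        if (s.equals("A")) return Kind.ACC;
        if (s.matches("R[0-7]")) return Kind.REG;
        if (s.matches("@R[01]")) return Kind.INDIRECT;
        if (s.startsWith("#")) return Kind.IMMEDIATE;
        return Kind.DIRECT;
    }

    // Register index is the trailing digit of Rn or @Ri.
    static int regNumber(String operand) {
        String s = operand.trim();
        return s.charAt(s.length() - 1) - '0';
    }

    // Returns the opcode byte, or throws for a combination not in the table.
    static int opcode(String mnemonic, String dest, String src) {
        Kind d = classify(dest);
        Kind s = classify(src);
        String key = mnemonic.toUpperCase() + " " + d + " " + s;
        Integer base = BASE.get(key);
        if (base == null) {
            throw new IllegalArgumentException("unsupported form: " + key);
        }
        int code = base;
        // The register number is encoded in the low bits of the opcode.
        if (d == Kind.REG || d == Kind.INDIRECT) code += regNumber(dest);
        if (s == Kind.REG || s == Kind.INDIRECT) code += regNumber(src);
        return code;
    }
}
```

A production action could then call something like `OpcodeTable.opcode("ADD", "A", "R0")`, so adding an instruction form means adding a table row instead of another `if` branch.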
If you need to validate even further, for example by detecting invalid sequences of instructions, then you will have to maintain a state across the productions, or even build an AST and process it after the parsing is complete.
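As a rough illustration of processing the result after parsing, the sketch below assumes the productions append one node per source line to a list, and a separate pass then checks a rule that needs the whole program. The `Line` class and the duplicate-label rule are hypothetical stand-ins; the real node type and the real sequence rules depend on your assembler.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ProgramCheck {
    // One parsed source line (hypothetical AST node).
    static final class Line {
        final String label;     // null when the line has no label
        final String mnemonic;
        Line(String label, String mnemonic) {
            this.label = label;
            this.mnemonic = mnemonic;
        }
    }

    // Runs after parsing: reports every label that was already defined earlier.
    static List<String> duplicateLabels(List<Line> program) {
        Set<String> seen = new HashSet<>();
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < program.size(); i++) {
            String label = program.get(i).label;
            // Set.add returns false when the label is already present.
            if (label != null && !seen.add(label)) {
                errors.add("line " + (i + 1) + ": duplicate label " + label);
            }
        }
        return errors;
    }
}
```

Keeping such checks in a separate pass leaves the grammar focused on syntax, at the cost of holding the whole program in memory.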
See this complete assembly language grammar for lots of examples of the kinds of things you need to write in your parser for assembler code.