语法树
我在大学里有一个实验室。但我不明白我该怎么做。 我有某种语言的语法(例如算术表达式语法)。我必须建立这个语法树(我不知道如何)。那么我必须确定,输入的句子是否是该语言的句子?我该怎么做呢? PS我读过《龙之书》的几章,但没有找到我需要的东西。 PPS 我不能使用 lex/flex 和 yacc/bison。
编辑抱歉!我应该更加细心。实际上,我有一个语法,如果可能的话,我必须使用这个语法构建树并输入句子(我知道如何做到这一点)(在另一种情况下,我必须显示有关它的消息)。有什么简单易懂的算法吗?我可以使用任何简单语言的语法,例如算术表达式。
I have a lab in the university. But I don't understand how I can do it.
I have a grammar of some language(for example, arithmetic expressions grammar). I must build tree of this grammar(I don't know how). Then I must determine, whether input sentence is sentence of this language? How can I do it?
P.S. I've read few chapters of Dragon Book, but I havn't found anything I need.
P.P.S. I can't use lex/flex and yacc/bison.
EDIT Sorry! I shuild be more attentive. Really I have a grammar and I must build tree using this grammar and input sentence(I understand how to do it) if it possible(in another case I must show message about it). Is there any simple for understanding algorithms for it? I can use grammar of any simple language like arithmetical expressions.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我假设这是理论上的,因为它是家庭作业,而不是您此时实际需要编码的东西。 (Kendrick 的答案涵盖了代码方法)
基本思想是从 BNF 起始变量开始,并尝试找出如何扩展它,一次应用一个规则,看看是否可以得出输入序列。
对于如下所示的规则集:
给定表达式
a + b - a
,您将执行如下操作:
开始
(给定)b.
表达式
(1)c.
表达式“-”项
(3)d.
表达式 '-' 'a'
(5)e.
表达式 '+' 项 '-' 'a'
(2)f.
术语“+”术语“-”“a”
(4)g。
'a' '+' 项 '-' 'a'
(5)h.
'a' '+' 'b' '-' 'a'
(6)这就是您一次一步执行的方法...现在的诀窍是,您需要追踪所有你的规则调用。所以你真正的树看起来像这样:
一开始有点复杂,但是一旦你真正看到它是如何完成的,就不难上手了。
注意:有些人发现逆向工作更容易,从您的输入开始,然后反向应用规则来尝试找到您的起始表达式。当您编写解析器时,您将不可避免地需要在某种程度上遵循此路线。
编辑:我在上面使用小写字母列出了所有表达式步骤,然后表明树中的每组分支实际上对应于一个规则应用程序。可能有帮助,也可能没有帮助。
I'm assuming this is theoretical since it's homework, and not something you actually need to code at this point. (Kendrick's answer covers the code approach)
The basic idea is to start with your BNF start variable and try to figure out how to expand it, applying rules one at a time, to see if you can come up with your input sequence.
For a ruleset like the following:
Given the expression
a + b - a
, you would go something like this:a.
start
(given)b.
expression
(1)c.
expression '-' term
(3)d.
expression '-' 'a'
(5)e.
expression '+' term '-' 'a'
(2)f.
term '+' term '-' 'a'
(4)g.
'a' '+' term '-' 'a'
(5)h.
'a' '+' 'b' '-' 'a'
(6)So that's how you do it one step at a time... Now the trick is, you need to trace out all your rule invocations. So your real tree would look something like this:
It's a little complicated at first, but once you actually see how it's done it's not too hard to pick up.
Note: Some people find it easier to work backwards, starting with your input and then applying rules in reverse to try to find your start expression. When you write a parser, you will inevitably need to follow this route on some level.
EDIT: I listed all the expression steps using lowercase letters above, and then showed that each set of branches from the tree actually corresponds with one of the rule applications. May or may not help.
我已经很长时间没有这样做了,所以我将给你一个简短的理论答案。我相信其他人会有更深入的解释。
由于您还没有说出您做了什么(并且您应该将其作为将来问题的一部分),因此第一步是对输入进行标记。
获得令牌后,您可以构建一个状态机来遍历令牌并将它们推送到所需的树结构中。这应该在你的书中涵盖(我还没有读过),但你可能想从在纸质(或电子)上构建模型开始,并考虑每个步骤的有效输入。相等语句的有效左手值是多少?如果你在树上有令牌 x,你可以从这里去哪里?
如果您无法从当前状态确定下一步该去哪里,那么您可能需要在状态机中实现前瞻(是否需要这取决于您的语言的复杂性)。
再说一次,我已经有十多年没有这样做了,所以我的记忆很模糊。希望我已经为您提供了足够的框架来完善您的答案,或者为其他对此主题更了解的人提供一个框架来批评我的错误或过时(从而得出您作为一个团队正在寻找的答案) )。
It's been a long time since I've done this, so I'm going to give you the short theoretical answer. I'm sure someone else will have a more in-depth explaination.
Since you haven't said what you've done (and you should as part of your questions in the future), the first step is to tokenize the input.
Once you have the tokens, you can build a state machine to walk the tokens and push them into the desired tree structure. This should be covered in your book (I haven't read it) but you probably want to start by building a model on paper (or electronic) and consider valid inputs from each step. What's a valid left hand value for an equality statement? If you have token x on the tree, where can you go from here?
If you can't determine from the current state where to go next, then you may need to implement look-ahead in your state machine (whether this is required depends on the complexity of your language).
Again, I haven't done this in over 10 years, so my recollection is fuzzy. Hopefully I've given enough of a framework for you to refine your answer or give others who are more knowledgeable on the subject to roast me for being wrong or out of date (and thus arriving at the answer you're looking for as a group).
我想回答这个问题,但实际上没有足够的材料可供使用 - 我留下了这些问题:
编辑:考虑到评论和更新的问题:我认为最直接的方法是递归下降解析器,它在输入匹配时构建树。查找 Niklaus Wirth 的“编译器构建”一书,该书已在网络上以 PDF 形式提供,其中描述了一种简单语言的 RDP。
递归下降解析器的基本思想是语法中的每个规则都对应一个函数,每个函数检查下一个标记并调用相应的下一个解析函数。例如,如果您的语法有规则
表达式:常量'+'常量
,则关联的解析方法将如下所示:
但另请参阅此问题的其他答案。
I'd like to answer this, but there isn't really enough material to work with - I am left with these questions:
Edit: Taking the comment and the updated question into account: I think the most straight-forward approach would be a recursive descent parser, which builds the tree as the input is matched. Look for Niklaus Wirth's "Compiler Construction" book, which has been made available as PDF on the web, and which describes a RDP for a simple language.
The basic idea of a recursive descent parser is that every rule in the grammar corresponds to a function, and each function checks the next token and calls the corresponding next-lower parse function. For example, if your grammar has a rule
expression : constant '+' constant
the associated parsing method would look like this:
But see also the other answers to this question.