How do I build a truth table generator?
I'm looking to write a Truth Table Generator as a personal project.
Several web-based ones already exist online.
(Example screenshot of an existing Truth Table Generator)
I have the following questions:
- How should I go about parsing expressions like: ((P => Q) & (Q => R)) => (P => R)
- Should I use a parser generator like ANTLR or YACC, or use straight regular expressions?
- Once I have the expression parsed, how should I go about generating the truth table? Each section of the expression needs to be divided up into its smallest components and re-built from the left side of the table to the right. How would I evaluate something like that?
Can anyone provide me with tips concerning the parsing of these arbitrary expressions and eventually evaluating the parsed expression?
Comments (5)
This sounds like a great personal project. You'll learn a lot about how the basic parts of a compiler work. I would skip trying to use a parser generator; if this is for your own edification, you'll learn more by doing it all from scratch.
The way such systems work is a formalization of how we understand natural languages. If I give you a sentence: "The dog, Rover, ate his food.", the first thing you do is break it up into words and punctuation. "The", "SPACE", "dog", "COMMA", "SPACE", "Rover", ... That's "tokenizing" or "lexing".
The next thing you do is analyze the token stream to see if the sentence is grammatical. The grammar of English is extremely complicated, but this sentence is pretty straightforward. SUBJECT-APPOSITIVE-VERB-OBJECT. This is "parsing".
Once you know that the sentence is grammatical, you can then analyze the sentence to actually get meaning out of it. For instance, you can see that there are three parts of this sentence -- the subject, the appositive, and the "his" in the object -- that all refer to the same entity, namely, the dog. You can figure out that the dog is the thing doing the eating, and the food is the thing being eaten. This is the semantic analysis phase.
Compilers then have a fourth phase that humans do not, which is that they generate code that represents the actions described in the language.
So, do all that. Start by defining what the tokens of your language are: define a base class Token and a derived class for each kind of token (IdentifierToken, OrToken, AndToken, ImpliesToken, RightParenToken...). Then write a method that takes a string and returns an IEnumerable<Token>. That's your lexer.
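As a rough illustration of the lexing step, here is a minimal Python sketch rather than the C# class hierarchy described above. It is only one way to do it: the operator spellings (`=>`, `&`, `|`, `~`), the `Token` tuple, and the `tokenize` name are assumptions made for the example, and a single `kind` field stands in for the derived token classes.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "IDENT", "IMPLIES", "AND", "OR", "NOT", "LPAREN" or "RPAREN"
    text: str  # the exact characters that were matched


# Assumed operator spellings; change the patterns to suit your own language.
TOKEN_SPEC = [
    ("IMPLIES", r"=>"),
    ("AND", r"&"),
    ("OR", r"\|"),
    ("NOT", r"~"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("SPACE", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens one at a time, skipping whitespace."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SPACE":
            yield Token(match.lastgroup, match.group())
        pos = match.end()


if __name__ == "__main__":
    for token in tokenize("((P => Q) & (Q => R)) => (P => R)"):
        print(token)
```

Returning a generator mirrors the idea of handing the parser a lazy sequence of tokens instead of a fully built list.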
Second, figure out what the grammar of your language is, and write a recursive descent parser that breaks up an IEnumerable<Token> into an abstract syntax tree that represents grammatical entities in your language.
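To make the parsing step concrete, here is a small Python sketch of a recursive descent parser. Everything in it is an assumption for illustration: the grammar in the comments (with `=>` right-associative and binding loosest), the nested-tuple AST shape such as `("and", left, right)`, and the one-line stand-in lexer, which silently ignores characters it does not recognise.

```python
import re


def tokenize(source):
    # Minimal stand-in lexer: returns operator, parenthesis and identifier strings.
    return re.findall(r"=>|[&|~()]|[A-Za-z_]\w*", source)


class Parser:
    # Assumed grammar:
    #   implication := disjunction [ "=>" implication ]
    #   disjunction := conjunction { "|" conjunction }
    #   conjunction := negation { "&" negation }
    #   negation    := "~" negation | atom
    #   atom        := IDENT | "(" implication ")"
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def parse(self):
        tree = self.implication()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def implication(self):
        left = self.disjunction()
        if self.peek() == "=>":
            self.pos += 1
            return ("implies", left, self.implication())
        return left

    def disjunction(self):
        left = self.conjunction()
        while self.peek() == "|":
            self.pos += 1
            left = ("or", left, self.conjunction())
        return left

    def conjunction(self):
        left = self.negation()
        while self.peek() == "&":
            self.pos += 1
            left = ("and", left, self.negation())
        return left

    def negation(self):
        if self.peek() == "~":
            self.pos += 1
            return ("not", self.negation())
        return self.atom()

    def atom(self):
        token = self.peek()
        if token == "(":
            self.pos += 1
            inner = self.implication()
            if self.peek() != ")":
                raise SyntaxError(f"expected ')', found {self.peek()!r}")
            self.pos += 1
            return inner
        if token is not None and re.fullmatch(r"[A-Za-z_]\w*", token):
            self.pos += 1
            return ("var", token)
        raise SyntaxError(f"unexpected token {token!r}")


if __name__ == "__main__":
    print(Parser(tokenize("((P => Q) &(Q=>R))=> (P => R)")).parse())
```

There is one method per grammar rule, which is what makes this style easy to write by hand.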
Then write an analyzer that looks at that tree and figures stuff out, like "how many distinct free variables do I have?"
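A sketch of such an analyzer in Python, assuming the tree is stored as nested tuples like `("implies", ("var", "P"), ("var", "Q"))`. That representation and the `free_variables` name are hypothetical choices for this example.

```python
def free_variables(tree):
    """Return the distinct variable names in the tree, in order of first appearance."""
    names = []

    def walk(node):
        if node[0] == "var":
            if node[1] not in names:
                names.append(node[1])
        else:
            # Every other node is (kind, child, ...); visit each child in turn.
            for child in node[1:]:
                walk(child)

    walk(tree)
    return names


if __name__ == "__main__":
    P, Q, R = ("var", "P"), ("var", "Q"), ("var", "R")
    tree = ("implies", ("and", ("implies", P, Q), ("implies", Q, R)), ("implies", P, R))
    names = free_variables(tree)
    print(names, "->", 2 ** len(names), "rows in the truth table")
```

The number of names determines the size of the table: N variables give 2^N rows.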
Then write a code generator that spits out the code necessary to evaluate the truth tables. Spitting IL seems like overkill, but if you wanted to be really buff, you could. It might be easier to let the expression tree library do that for you; you can transform your parse tree into an expression tree, and then turn the expression tree into a delegate, and evaluate the delegate.
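As a rough Python analogue of turning the tree into a delegate (this is not the .NET expression tree library itself), the sketch below converts a tuple AST into nested closures once, and the resulting function can then be called for every row. The tuple AST shape and the `compile_tree` name are assumptions made for the example.

```python
def compile_tree(tree):
    """Build a function env -> bool from a tuple AST; env maps variable names to bools."""
    kind = tree[0]
    if kind == "var":
        name = tree[1]
        return lambda env: env[name]
    if kind == "not":
        operand = compile_tree(tree[1])
        return lambda env: not operand(env)
    if kind not in ("and", "or", "implies"):
        raise ValueError(f"unknown node kind {kind!r}")
    left = compile_tree(tree[1])
    right = compile_tree(tree[2])
    if kind == "and":
        return lambda env: left(env) and right(env)
    if kind == "or":
        return lambda env: left(env) or right(env)
    # "implies": false only when the left side is true and the right side is false
    return lambda env: (not left(env)) or right(env)


if __name__ == "__main__":
    implication = compile_tree(("implies", ("var", "P"), ("var", "Q")))
    for p in (True, False):
        for q in (True, False):
            print(p, q, implication({"P": p, "Q": q}))
```

The tree is walked only once, at compile time, which is the same motivation as compiling an expression tree instead of re-interpreting it per row.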
Good luck!
I think a parser generator is overkill. You could solve this problem by converting the expression to postfix and evaluating the postfix expression (or by building an expression tree directly from the infix expression and using that to generate the truth table).
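One common way to do the postfix conversion is the shunting-yard algorithm. Here is a hedged Python sketch of it; the precedence table, the operator spellings, and the function names are assumptions for illustration, not something specified in the answer.

```python
import re

# Assumed precedence (higher binds tighter) and the right-associative operators.
PRECEDENCE = {"~": 4, "&": 3, "|": 2, "=>": 1}
RIGHT_ASSOC = {"~", "=>"}


def to_postfix(source):
    """Convert an infix expression string to a list of postfix tokens."""
    tokens = re.findall(r"=>|[&|~()]|[A-Za-z_]\w*", source)
    output, stack = [], []
    for token in tokens:
        if token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("unbalanced ')'")
            stack.pop()  # discard the matching "("
        elif token in PRECEDENCE:
            # Pop operators that bind tighter (or equally, for left-associative ones).
            while stack and stack[-1] != "(" and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[token]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[token] and token not in RIGHT_ASSOC)
            ):
                output.append(stack.pop())
            stack.append(token)
        else:
            output.append(token)  # a variable name goes straight to the output
    while stack:
        operator = stack.pop()
        if operator == "(":
            raise SyntaxError("unbalanced '('")
        output.append(operator)
    return output


def eval_postfix(postfix, env):
    """Evaluate postfix tokens with a value stack; env maps variable names to bools."""
    stack = []
    for token in postfix:
        if token == "~":
            stack.append(not stack.pop())
        elif token in PRECEDENCE:
            right, left = stack.pop(), stack.pop()
            if token == "&":
                stack.append(left and right)
            elif token == "|":
                stack.append(left or right)
            else:  # "=>"
                stack.append((not left) or right)
        else:
            stack.append(env[token])
    return stack.pop()


if __name__ == "__main__":
    postfix = to_postfix("((P => Q) & (Q => R)) => (P => R)")
    print(postfix)
    print(eval_postfix(postfix, {"P": True, "Q": False, "R": True}))
```

The conversion happens once; each row of the truth table then only needs a cheap pass over the postfix list.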
As Mehrdad mentions, you should be able to hand-roll the parser in about the time it would take to learn the syntax of a lexer/parser generator. The end result you want is an Abstract Syntax Tree (AST) of the expression you have been given.
You then need to build some input generator that creates the input combinations for the symbols defined in the expression.
Then iterate across the input set, generating the results for each input combo, given the rules (AST) you parsed in the first step.
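The input generator and the iteration can be sketched in a few lines of Python with `itertools.product`. The tuple AST shape and the function names below are assumptions made for the example.

```python
from itertools import product


def evaluate(tree, env):
    """Evaluate a tuple AST such as ("and", ("var", "P"), ("var", "Q")) under env."""
    kind = tree[0]
    if kind == "var":
        return env[tree[1]]
    if kind == "not":
        return not evaluate(tree[1], env)
    left = evaluate(tree[1], env)
    right = evaluate(tree[2], env)
    if kind == "and":
        return left and right
    if kind == "or":
        return left or right
    if kind == "implies":
        return (not left) or right
    raise ValueError(f"unknown node kind {kind!r}")


def input_combinations(names):
    """Yield one dict per row: every assignment of True/False to the given names."""
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))


def truth_table(tree, names):
    """Return a list of (assignment, result) pairs, one per input combination."""
    return [(env, evaluate(tree, env)) for env in input_combinations(names)]


if __name__ == "__main__":
    tree = ("implies", ("var", "P"), ("var", "Q"))
    for env, result in truth_table(tree, ["P", "Q"]):
        print(env, result)
```

`product([True, False], repeat=n)` enumerates all 2^n rows in a fixed order, starting with all variables true.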
How I would do it:
I could imagine using lambda functions to express the AST/rules as you parse the expression, building a symbol table along the way. You could then build the input set from the symbol table and pass each set of values to the lambda expression tree to calculate the results.
If your goal is processing boolean expressions, a parser generator and all the machinery that goes with it are a waste of time, unless you want to learn how they work (then any of them would be fine).
But it is easy to build a recursive-descent parser by hand for boolean expressions that computes and returns the result of "evaluating" the expression. Such a parser could be used on a first pass to determine the number of unique variables, where "evaluation" means "count 1 for each new variable name".
Writing a generator to produce all possible truth values for N variables is trivial; for each set of values, simply call the parser again and use it to evaluate the expression, where evaluate means "combine the values of the subexpressions according to the operator".
You need a grammar:
Yours can be more complicated, but for boolean expressions it can't be that much more complicated.
For each grammar rule, write 1 subroutine that uses a global "scan" index into the string being parsed:
Each of your parse routines will be about this complicated. Seriously.
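As a hedged sketch of this approach in Python: the grammar in the comments, the operator spellings, and all names here are assumptions rather than the original routines. The parser evaluates while it parses, using a global `scan` index. When `env` is `None` it only records variable names (the first pass); otherwise it looks the values up.

```python
from itertools import product

text = ""    # the string being parsed
scan = 0     # global scan index into text
env = None   # variable values, or None during the name-collecting pass
seen = []    # distinct variable names, in order of first appearance


def accept(symbol):
    """Skip spaces, then consume symbol if it is next in the input."""
    global scan
    while scan < len(text) and text[scan].isspace():
        scan += 1
    if text.startswith(symbol, scan):
        scan += len(symbol)
        return True
    return False


def expr():      # expr = disj [ "=>" expr ]
    left = disj()
    if accept("=>"):
        right = expr()
        return (not left) or right
    return left


def disj():      # disj = conj { "|" conj }
    value = conj()
    while accept("|"):
        right = conj()  # always parse the operand, even if value is already True
        value = value or right
    return value


def conj():      # conj = factor { "&" factor }
    value = factor()
    while accept("&"):
        right = factor()
        value = value and right
    return value


def factor():    # factor = "~" factor | "(" expr ")" | NAME
    global scan
    if accept("~"):
        return not factor()
    if accept("("):
        value = expr()
        if not accept(")"):
            raise SyntaxError(f"expected ')' at position {scan}")
        return value
    start = scan  # accept() has already skipped any leading spaces
    while scan < len(text) and text[scan].isalnum():
        scan += 1
    if start == scan:
        raise SyntaxError(f"expected a variable at position {scan}")
    name = text[start:scan]
    if env is None:  # first pass: record the name, the value is irrelevant
        if name not in seen:
            seen.append(name)
        return False
    return env[name]


def run(source, values=None):
    """Parse source from the start; evaluate it if values are given."""
    global text, scan, env
    text, scan, env = source, 0, values
    result = expr()
    if accept("") and scan != len(text):  # accept("") just skips trailing spaces
        raise SyntaxError(f"unexpected input at position {scan}")
    return result


def truth_table(source):
    seen.clear()
    run(source)  # pass 1: collect variable names
    names = list(seen)
    rows = [(combo, run(source, dict(zip(names, combo))))
            for combo in product([True, False], repeat=len(names))]
    return names, rows


if __name__ == "__main__":
    names, rows = truth_table("((P => Q) &(Q=>R))=> (P => R)")
    print(" ".join(names), "| result")
    for combo, result in rows:
        print(" ".join("T" if value else "F" for value in combo), "|", "T" if result else "F")
```

Re-parsing the string for every row is wasteful in principle, but for expressions of this size it keeps the whole program to a handful of short routines.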
You can get the source code of the pyttgen program at http://code.google.com/p/pyttgen/source/browse/#hg/src. It generates truth tables for logical expressions. The code is based on the ply library, so it's very simple :)