Building an inference engine in Python
I am seeking direction and attempting to label this problem:
I am attempting to build a simple inference engine (is there a better name?) in Python which will take a string and -
1 - create a list of tokens by splitting on whitespace
2 - categorise these tokens using regular expressions
3 - use a higher-level set of rules to make decisions based on the categorisations
Example:
"90001" - one token, matches the zipcode regex, a rule exists for a string containing just a zipcode causes a certain behaviour to occur
"30 + 14" - three tokens, regexs for numerical value and mathematical operators match, a rule exists for a numerical value followed by a mathematical operator followed by another numerical value causes a certain behaviour to occur
I'm struggling with how best to do step #3, the higher level set of rules. I'm sure that some framework must exist. Any ideas? Also, how would you characterise this problem? Rule based system, expert system, inference engine, something else?
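To make the three steps concrete, here is a minimal sketch of one way step #3 could work: treat each rule as a mapping from a *sequence of token categories* to an action. All of the category names, patterns, and actions below are invented for illustration; they are not from the question.

```python
import re

# Step #2 machinery: category name -> regex (order matters; first match wins).
# ZIPCODE is checked before NUMBER so "90001" is classified as a zipcode.
CATEGORIES = [
    ("ZIPCODE", re.compile(r"^\d{5}$")),
    ("NUMBER", re.compile(r"^\d+(\.\d+)?$")),
    ("OPERATOR", re.compile(r"^[+\-*/]$")),
]

# Step #3 machinery: a rule is keyed by a tuple of categories and maps to
# an action (here just a descriptive string, standing in for real behaviour).
RULES = {
    ("ZIPCODE",): lambda toks: f"lookup zipcode {toks[0]}",
    ("NUMBER", "OPERATOR", "NUMBER"): lambda toks: f"evaluate {' '.join(toks)}",
}

def categorise(token):
    """Step #2: return the first category whose regex matches, else None."""
    for name, pattern in CATEGORIES:
        if pattern.match(token):
            return name
    return None

def infer(text):
    """Steps #1-#3: tokenize, categorise, then dispatch on the category sequence."""
    tokens = text.split()  # Step #1: whitespace-separated values
    signature = tuple(categorise(t) for t in tokens)
    rule = RULES.get(signature)
    return rule(tokens) if rule else None
```

With this shape, adding a rule is just adding a dictionary entry, e.g. `infer("90001")` dispatches to the zipcode action and `infer("30 + 14")` to the arithmetic one. It only handles fixed-length category sequences; for anything recursive you would want a real grammar, as the first answer below suggests.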
Thanks!
I'm very surprised that step #3 is the one giving you trouble...
Assuming you can label/categorize each token properly (and that prior to categorization you can find the proper tokens, as there may be many ambiguous cases...), the "Step #3" problem seems like one that could easily be tackled with a context-free grammar, where each of the desired actions (such as ZIP code lookup or mathematical expression calculation...) would be a symbol whose production rule is itself made of the possible token categories. To illustrate this in BNF notation, we could have something like:
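A sketch of what such a grammar might look like, using the two examples from the question (the symbol names are invented for illustration):

```
<command>    ::= <zip-lookup> | <math-expr>
<zip-lookup> ::= ZIPCODE
<math-expr>  ::= NUMBER OPERATOR NUMBER
```

Here ZIPCODE, NUMBER, and OPERATOR are the terminal token categories produced by step #2, and each non-terminal on the left corresponds to one action the engine can take.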
Maybe your concern is that when things get complicated, it will become difficult to express the whole requirement in terms of non-conflicting grammar rules. Or maybe your concern is that rules could be added dynamically, forcing the grammar "compilation" logic to be integrated with the program? Whatever the concern, I think that this third step will be comparatively trivial.
On the other hand, unless the various categories (and the underlying input text) are such that they can be described with a regular language as well (as you seem to hint in the question), a text parser and classifier (steps #1 and #2...) is typically a less-than-trivial affair.
Some example Python libraries that simplify writing and evaluating grammars:
It sounds like you are searching for a "grammar inference" (grammar induction) library.