Pruning deduction in expert systems
In a rule system, or any reasoning system that deduces facts via forward-chaining inference rules, how would you prune "unnecessary" branches? I'm not sure what the formal terminology is, but I'm just trying to understand how people are able to limit their train-of-thought when reasoning over problems, whereas all semantic reasoners I've seen appear unable to do this.
For example, in John McCarthy's paper An Example for Natural Language Understanding and the AI Problems It Raises, he describes potential problems in getting a program to intelligently answer questions about a news article in the New York Times. In section 4, "The Need For Nonmonotonic Reasoning", he discusses the use of Occam's Razor to restrict the inclusion of facts when reasoning about the story. The sample story he uses is one about robbers who victimize a furniture store owner.
If a program were asked to form a "minimal completion" of the story in predicate calculus, it might need to include facts not directly mentioned in the original story. However, it would also need some way of knowing when to limit its chain of deduction, so as not to include irrelevant details. For example, it might want to include the exact number of police involved in the case, a detail the article omits, but it would not want to include the fact that each police officer has a mother.
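One way to make the question concrete is a sketch of forward chaining with an explicit cutoff: each derived fact records how many rule applications separate it from the facts stated in the story, and derivation stops past a fixed bound. This is only an illustration of the idea, not McCarthy's proposal; the rule and fact names below are invented stand-ins for the robbery story.

```python
def forward_chain(facts, rules, max_depth=2):
    """Forward-chain over Horn-style rules (premises, conclusion).
    Each derived fact is tagged with its distance (in rule applications)
    from the original facts; conclusions past max_depth are pruned."""
    depth = {f: 0 for f in facts}
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion in known:
                continue
            if all(p in known for p in premises):
                d = 1 + max(depth[p] for p in premises)
                if d > max_depth:
                    continue  # prune: too many steps from the source facts
                known.add(conclusion)
                depth[conclusion] = d
                changed = True
    return known

# Invented toy rules loosely based on the story in the question:
rules = [
    (("robbery_occurred",), "police_were_involved"),
    (("police_were_involved",), "some_number_of_officers"),
    (("some_number_of_officers",), "each_officer_has_a_mother"),
]

derived = forward_chain({"robbery_occurred"}, rules, max_depth=2)
```

With `max_depth=2`, the chain stops before deriving `each_officer_has_a_mother`, which sits three rule applications away from the story. A depth bound is of course a crude proxy for relevance, which is part of why McCarthy argues for nonmonotonic machinery instead.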
Good question.
From your question I think what you refer to as 'pruning' is a model-building step performed ex ante, i.e., a limit on the inputs available to the algorithm that builds the model. The term 'pruning' as used in machine learning refers to something different: an ex post step, performed after model construction, that operates on the model itself rather than on the available inputs. (There could be a second meaning of 'pruning' in the ML domain, but I'm not aware of one.) In other words, pruning is indeed literally a technique to "limit its chain of deduction", as you put it, but it does so ex post, by excising components of a complete (working) model, not by limiting the inputs used to create that model.
On the other hand, isolating or limiting the inputs available for model construction--which I think is what you might have had in mind--is indeed a key machine-learning theme; it's clearly a factor in the superior performance of many of the more recent ML algorithms. For instance, Support Vector Machines (the insight underlying SVMs is the construction of a maximum-margin hyperplane from only a small subset of the data, i.e., the 'support vectors'), and Multivariate Adaptive Regression Splines (a regression technique in which no attempt is made to fit the data by "drawing a single continuous curve through it"; instead, discrete sections of the data are fit, one by one, using a bounded linear equation for each section, i.e., the 'splines', so the prerequisite step of optimally partitioning the data is obviously the crux of this algorithm).
What problem does pruning solve?
At least with respect to the specific ML algorithms I have actually coded and used--Decision Trees, MARS, and Neural Networks--pruning is performed on an initially over-fit model (a model that fits the training data so closely that it is unable to generalize, i.e., to accurately predict new instances). In each case, pruning involves removing marginal nodes (DT, NN) or terms in the regression equation (MARS) one by one.
Second, why is pruning necessary/desirable?
Isn't it better to just set the convergence/splitting criteria accurately in the first place? That won't always help. Pruning works from "the bottom up"; the model is constructed from the top down, so tuning the model (to achieve the same benefit as pruning) eliminates not just one or more decision nodes but also the child nodes that would have descended from them (like trimming a tree closer to the trunk). So eliminating a marginal node during construction might also eliminate one or more strong nodes subordinate to that marginal node--but the modeler would never know that, because the tuning prevented further node creation at that marginal node. Pruning works from the other direction--from the most subordinate (lowest-level) child nodes upward in the direction of the root node.
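The bottom-up direction can be shown concretely with reduced-error pruning on a toy binary decision tree: visit nodes post-order, try collapsing each internal node into a leaf, and keep the collapse only if validation accuracy does not drop. The tree, data, and class layout below are all invented for illustration; real implementations typically use cost-complexity or confidence-based criteria rather than this bare version.

```python
class Node:
    """Binary decision tree node: internal nodes split on x[feature]
    (0 goes left, 1 goes right); leaves carry a class label."""
    def __init__(self, feature=None, left=None, right=None, label=None):
        self.feature, self.left, self.right, self.label = feature, left, right, label

    def predict(self, x):
        if self.label is not None:
            return self.label
        return (self.left if x[self.feature] == 0 else self.right).predict(x)

def accuracy(tree, data):
    return sum(tree.predict(x) == y for x, y in data) / len(data)

def majority_leaf_label(node):
    # Label a collapsed subtree by the most common label among its leaves.
    if node.label is not None:
        return node.label
    labels = [majority_leaf_label(node.left), majority_leaf_label(node.right)]
    return max(labels, key=labels.count)

def prune(node, root, val_data):
    """Post-order (bottom-up) reduced-error pruning: collapse a subtree
    into a leaf whenever validation accuracy does not decrease."""
    if node.label is not None:
        return
    prune(node.left, root, val_data)
    prune(node.right, root, val_data)
    before = accuracy(root, val_data)
    node.label = majority_leaf_label(node)   # tentatively collapse
    if accuracy(root, val_data) < before:
        node.label = None                    # revert: subtree earns its keep

# An over-fit tree: the right subtree's extra split fits one noisy point.
right = Node(feature=1, left=Node(label=1), right=Node(label=0))
root = Node(feature=0, left=Node(label=0), right=right)

val_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 1), ([1, 1], 1)]
prune(root, root, val_data)
```

After pruning, the noisy split under `right` is collapsed into a single leaf, while the root split survives because collapsing it would hurt validation accuracy--exactly the strong-node-under-a-marginal-node situation that top-down tuning cannot discover.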