What is causing this segmentation fault in CLIPS 6.3 while evaluating NeqFunction?

Posted on 2025-02-10 15:16:25

TL/DR:

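I encounter segmentation faults using CLIPS 6.3 within a large robotics framework, and I would appreciate any hints about what could be causing them (e.g., is this a known issue that is fixed in newer versions, or is there a known mistake that can cause this type of segfault), or about how I could trace the faulty value in the backtrace back to its origin.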

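Introduction

Hello,

I have been using CLIPS for many years within a robotics application in the Fawkes framework. Recently I started encountering segmentation faults in some of my feature branches during execution. After weeks of searching for the cause, I have run out of ideas on how to proceed and need help tracing it.

System and Software

Fedora 35 (although the segfault was also observed on Fedora 33 and Fedora 34)

CLIPS version: 6.3 (from the Fedora package sources), embedded via clipsmm within the Fawkes framework. Although the framework is multi-threaded, I do not think the segmentation fault is caused by threads other than the one running the CLIPS environment, mainly because the segfaults only occur in some feature branches that change only CLIPS code; without those changes, everything has been stable for multiple years.

Nevertheless, I think it is important to mention the slight possibility that a threading issue causes these crashes.

The backtrace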

#0  0x00007f3354394b1e in PropagateReturnAtom (value=0x25, type=0, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:784
#1  PropagateReturnValue (theEnv=theEnv@entry=0x7f339c6e0970, vPtr=vPtr@entry=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:750                                                                
#2  0x00007f335439aeb2 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=<optimized out>, returnValue=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:432
#3  0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
#4  0x00007f335439b3a3 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e3c80, returnValue=0x7f333e7fb0a0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:180
#5  0x00007f335436158a in EvaluateJoinExpression (theEnv=theEnv@entry=0x7f339c6e0970, joinExpr=0x7f33380e3c80, joinPtr=joinPtr@entry=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:629            
#6  0x00007f3354361aef in NetworkAssertRight (join=0x7f33380e3b20, rhsBinds=0x7f333996fe10, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:235
#7  NetworkAssertRight (theEnv=0x7f339c6e0970, rhsBinds=0x7f333996fe10, join=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:112                                                                    
#8  0x00007f335433b14e in ProcessFactAlphaMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, theMarks=<optimized out>, thePattern=thePattern@entry=0x7f33380e3ab0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:552                                    
#9  0x00007f3354340fd1 in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=<optimized out>, thePattern@entry=0x7f33380e3ab0, markers=<optimized out>,
    markers@entry=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10, offset=5) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:367
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0, offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243                                    
#11 0x00007f3354340d8c in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=thePattern@entry=0x7f33380e2c30, markers=<optimized out>, markers@entry=0x0, endMark=endMark@entry=0x0, offset=0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:420                                    
#12 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=theFact@entry=0x7f333a05b2c0, patternPtr=0x7f33380e2c30, offset=offset@entry=0, markers=markers@entry=0x0, endMark=endMark@entry=0x0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243                                    
#13 0x00007f3354343531 in EnvAssert (theEnv=0x7f339c6e0970, vTheFact=0x7f333a05b2c0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmngr.c:770
#14 0x00007f335431c08f in AssertCommand (theEnv=0x7f339c6e0970, rv=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factcom.c:235
#15 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd0d0, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349                                     
#16 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#17 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd090, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349                                     
#18 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380dd090, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x7f335433aa90 <UnboundDeffunctionErr>)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873                                    
#19 0x00007f3354340239 in CallDeffunction (theEnv=0x7f339c6e0970, dptr=0x7f33380dcff0, args=0x7f333e7fb6b0, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxexe.c:131                           
#20 0x00007f33543490ea in EvaluateDeffunctionCall (theEnv=0x7f339c6e0970, value=<optimized out>, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxfun.c:661
#21 0x00007f335439af67 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2800, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:422 
#22 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#23 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2760, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349
#24 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380e2760, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873                                    
#25 0x00007f33543915ad in EnvRun (theEnv=0x7f339c6e0970, runLimit=-1) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/engine.c:315
#26 0x00007f3354432e94 in CLIPS::Environment::run(long) (this=0x7f339c64bf00, runlimit=runlimit@entry=-1) at /usr/src/debug/clipsmm-0.3.5-11.fc35.x86_64/clipsmm/environment.cpp:134                                                          
#27 0x00007f33540091b7 in ClipsExecutiveThread::loop() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/utils/lockptr.h:301
#28 0x00007f33adaa6d6c in fawkes::Thread::run() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:947
#29 0x00007f33adaa791a in fawkes::Thread::entry(void*) (pthis=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:565
#30 0x00007f33ad6e1da2 in start_thread (arg=<optimized out>) at pthread_create.c:443                                   
#31 0x00007f33ad6819e0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81     

The relevant CLIPS rules (I think)

Here is the last entry in my debug log:

D 16:45:38.408820 CLIPS (executive): FIRE  131 central-run-parallel-goal-commit: f-39749,f-39728                      
D 16:45:38.408838 CLIPS (executive): <== f-39749 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode EXPANDED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))
D 16:45:38.408864 CLIPS (executive): ==> f-39778 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode COMMITTED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))        
D 16:45:38.408911 CLIPS (executive): FIRE  132 wm-sync-update-goals-on-mode-change: f-39778,f-16227,f-39775,f-39774
D 16:45:38.408924 CLIPS (executive): <== f-39775 (wm-fact (id "/template/fact/goal?id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode EXPANDED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
D 16:45:38.408936 CLIPS (executive): <== f-39774 (wm-fact (id "/template/fact/goal-meta?goal-id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal-meta args? goal-id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values assigned-to nil restricted-to nil order-id nil ring-nr nil root-for-order nil run-all-ordering 1))                                                             
D 16:45:38.409030 CLIPS (executive): ==> f-39779 (wm-fact (id "") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode COMMITTED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
                                                     

The code in question that leads to these crashes is as follows:

(deffunction assert-template-wm-fact (?fact-id ?id-slots ?other-slots)          
" Helper to create a wm-fact from a template fact"                                                                     
  (assert (wm-fact (key template fact (fact-relation ?fact-id)                                                         
                    args? (template-fact-slots-to-key-vals ?fact-id ?id-slots))                                        
                   (type SYMBOL)                                                                                       
                   (is-list TRUE)                                                                                      
                   (values (template-fact-slots-to-key-vals ?fact-id ?other-slots)))                                   
  )                                                                                                                    
)         
(defrule wm-sync-update-goals-on-mode-change                                                                           
  ?g <- (goal (id ?id) (mode ?mode))                                                                                   
  ?gm <- (goal-meta (goal-id ?id))                                                                                     
  ?wm <- (wm-fact (key template fact goal args? id ?id)                                                                
                  (values $? mode ?other-mode&:(neq ?mode ?other-mode) $?))                                            
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))                                                    
  =>                                                                                                                   
  (retract ?wm)                                                                 
  (retract ?wm2)                                                                                                       
  (assert-template-wm-fact ?g                                                                                          
                           ?*GOAL_ID_SLOTS*                                                                            
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)                         
                                               meta-fact))                                                             
  (assert-template-wm-fact ?gm                                                                                         
                           ?*GOAL_META_ID_SLOTS*                                                                       
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))                              
)
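To make the intent of the `values` pattern concrete: `$? mode ?other-mode&:(neq ?mode ?other-mode) $?` succeeds whenever the symbol `mode` is directly followed by a field that differs from the goal's `mode` slot. The Python sketch below models only that check. The function name `neq_after_key` is hypothetical, the `values` lists are abbreviated from the log above, and nothing here reflects how CLIPS implements the match internally.

```python
def neq_after_key(values, key, goal_value):
    """Return each field that directly follows `key` and differs from `goal_value`.

    Models `$? key ?x&:(neq ?goal ?x) $?`: the leading `$?` may absorb any
    number of fields, so every position of `key` is a candidate.
    """
    hits = []
    for i in range(len(values) - 1):
        if values[i] == key and values[i + 1] != goal_value:
            hits.append(values[i + 1])
    return hits


PARENT = "CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840"
# Abbreviated `values` slots of f-39775 (retracted) and f-39779 (new) from the log.
old_values = ["class", "PAYMENT-GOALS", "parent", PARENT, "mode", "EXPANDED"]
new_values = ["class", "PAYMENT-GOALS", "parent", PARENT, "mode", "COMMITTED"]

# The goal f-39778 has mode COMMITTED.
print(neq_after_key(old_values, "mode", "COMMITTED"))  # stale wm-fact differs
print(neq_after_key(new_values, "mode", "COMMITTED"))  # fresh wm-fact agrees
```

Under this simplified model, the stale wm-fact satisfies the test (consistent with the rule firing in the log), while the freshly asserted one does not, so the comparison on the new fact would be expected to return FALSE rather than crash.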

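My analysis of the backtrace

Frames 31 to 13 essentially describe the execution up to the point in the log where the rule above asserts the first new wm-fact. Then the CLIPS magic happens, which I tried to explain to myself as follows (from reading the Wikipedia article on the Rete algorithm and looking at the CLIPS 6.3 source code):

Essentially, the new fact now needs to be evaluated in the Rete network to see how the existing rule activations are affected by it. In frames 12 and 11, one of these checks is performed without success; in frame 10, a match with an existing pattern is found (I think frames 9 to 7 essentially mean that a match was established and now needs further evaluation). So I tried to investigate frame 10 based on my understanding:

Frame 10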

(gdb) frame 10                                                                  
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0 , offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243
243!           { ProcessMultifieldNode(theEnv,patternPtr,markers,endMark,0); }
// look at the content of patternPtr=0x7f33380e3ab0  
(gdb) p (factPatternNode) *0x7f33380e3ab0                                       
$62 = {header = {firstHash = 0x7f3339018780, lastHash = 0x7f33399c8ac0, entryJoin = 0x7f33380e3b20, rightHash = 0x7f3338099a30, singlefieldNode = 0, multifieldNode = 1, stopNode = 1, initialize = 0, marked = 0, beginSlot = 0,
    endSlot = 1, selector = 0}, bsaveID = 0, whichField = 4, whichSlot = 5, leaveFields = 0, networkTest = 0x0, nextLevel = 0x0, lastLevel = 0x7f33380e3a10, leftNode = 0x0, rightNode = 0x0}
// look at the content of entryJoin = 0x7f33380e3b20 and follow the join links
(gdb) p (joinNode) *0x7f33380e3b20                                              
$63 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 3, bsaveID = 0, memoryAdds = 4299, memoryDeletes = 4167, memoryCompares = 5006,
  leftMemory = 0x7f33380b3d60, rightMemory = 0x0, networkTest = 0x7f33380e3c40, secondaryNetworkTest = 0x0, leftHash = 0x7f339c7f3b40, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3ab0, nextLinks = 0x7f33380b4ba0,
  lastLevel = 0x7f33380e29d0, rightMatchNode = 0x0, ruleToActivate = 0x0}       
(gdb) p (joinLink) *0x7f33380b4ba0                                              
$64 = {enterDirection = 0 '\000', join = 0x7f33380e3cf0, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3cf0                                              
$65 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 4, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
  leftMemory = 0x7f33380b4bd0, rightMemory = 0x0, networkTest = 0x7f33380b92e0, secondaryNetworkTest = 0x0, leftHash = 0x7f33380b4990, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3210, nextLinks = 0x7f33380b47e0,
  lastLevel = 0x7f33380e3b20, rightMatchNode = 0x7f33380e32b0, ruleToActivate = 0x0}
(gdb) p (joinLink) *0x7f33380b47e0                                              
$66 = {enterDirection = 0 '\000', join = 0x7f33380e3e10, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3e10                                              
$67 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 0, depth = 5, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
  leftMemory = 0x7f33380b72a0, rightMemory = 0x0, networkTest = 0x0, secondaryNetworkTest = 0x0, leftHash = 0x0, rightHash = 0x0, rightSideEntryStructure = 0x0, nextLinks = 0x0, lastLevel = 0x7f33380e3cf0, rightMatchNode = 0x0,
  ruleToActivate = 0x7f33380e3ec0}
// We are at the end, this should be the rule which is checked for activation?!                                              
(gdb) print (defrule) *0x7f33380e3ec0                                           
$68 = {header = {name = 0x7f33380b6e50,                                         
    ppForm = 0x7f33380e3f30 "(defrule MAIN::wm-sync-update-goals-on-parent-change\n   ?g <- (goal (id ?id) (parent ?parent))\n   ?gm <- (goal-meta (goal-id ?id))\n   ?wm <- (wm-fact (key template fact goal args? id ?id) (values $? p"..., whichModule = 0x7f339c7d22d0, bsaveID = 0, next = 0x7f33380e4570, usrData = 0x0}, salience = 0, localVarCnt = 0, complexity = 19, afterBreakpoint = 0, watchActivation = 0, watchFiring = 1, autoFocus = 0, executing = 0,
  dynamicSalience = 0x0, actions = 0x7f33380e37a0, logicalJoin = 0x0, lastJoin = 0x7f33380e3e10, disjunct = 0x0}

So the rule in question is wm-sync-update-goals-on-parent-change, which is:

(defrule wm-sync-update-goals-on-parent-change                                  
  ?g <- (goal (id ?id) (parent ?parent))                                        
  ?gm <- (goal-meta (goal-id ?id))                                              
  ?wm <- (wm-fact (key template fact goal args? id ?id)                         
                  (values $? parent ?other-parent&:(neq ?parent ?other-parent) $?))
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))             
  =>                                                                            
  (retract ?wm)                                                                 
  (retract ?wm2)                                                                
  (assert-template-wm-fact ?g                                                   
                           ?*GOAL_ID_SLOTS*                                     
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)
                                               meta-fact))                      
  (assert-template-wm-fact ?gm                                                  
                           ?*GOAL_META_ID_SLOTS*                                
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))
)
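The pointer chase from the frame 10 session can be written down compactly. The sketch below copies the addresses printed by gdb into two plain Python dictionaries and follows `nextLinks` until a `ruleToActivate` appears. The dictionary layout and the name `rules_reachable_from` are assumptions made for illustration; they mirror only the fields shown in the gdb output, not the full CLIPS structs.

```python
# Addresses and field values copied from the gdb output of frame 10.
JOIN_NODES = {
    0x7F33380E3B20: {"depth": 3, "nextLinks": 0x7F33380B4BA0, "ruleToActivate": 0x0},
    0x7F33380E3CF0: {"depth": 4, "nextLinks": 0x7F33380B47E0, "ruleToActivate": 0x0},
    0x7F33380E3E10: {"depth": 5, "nextLinks": 0x0, "ruleToActivate": 0x7F33380E3EC0},
}
JOIN_LINKS = {
    0x7F33380B4BA0: {"join": 0x7F33380E3CF0, "next": 0x0},
    0x7F33380B47E0: {"join": 0x7F33380E3E10, "next": 0x0},
}


def rules_reachable_from(join_addr):
    """Collect every non-null ruleToActivate reachable from a join via nextLinks."""
    rules = []
    pending = [join_addr]
    while pending:
        node = JOIN_NODES[pending.pop()]
        if node["ruleToActivate"]:
            rules.append(node["ruleToActivate"])
        # Walk the singly linked list of joinLink entries hanging off this node.
        link_addr = node["nextLinks"]
        while link_addr:
            link = JOIN_LINKS[link_addr]
            pending.append(link["join"])
            link_addr = link["next"]
    return rules


# Start at entryJoin of the pattern node from frame 10.
print([hex(rule) for rule in rules_reachable_from(0x7F33380E3B20)])
```

With the values from this session, the only rule reached is the defrule at 0x7f33380e3ec0, which matches the manual walk.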

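Looking at frames 6 to 3, I think it is clear that the only NeqFunction call considered within that rule is (neq ?parent ?other-parent), which should be a fairly simple check.

However, this is also where I am stuck, because the rest of the backtrace is strange. In the source code of CLIPS 6.3 I could not find the functions PropagateReturnValue and PropagateReturnAtom; they only appear in CLIPS 6.24. So something seems odd about the debuginfo. I am fairly sure that the running CLIPS version is indeed 6.3, because we use foreach loops. It also cannot be a later version, because I verified that features from 6.31 are indeed missing (e.g., modifying a retracted fact does not yet cause an error).

Frame 3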

(gdb) frame 3                                                                   
#3  0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
167!   EvaluateExpression(theEnv,theExpression,&item);                          
(gdb) info args                                                                 
theEnv = 0x7f339c6e0970                                                         
(gdb) info locals                                                               
item = {supplementalInfo = 0x7f339c7d3760, type = 0, value = 0x25, begin = 139859644516720, end = 139857960488192, next = 0x7f333e7fb370}
nextItem = {supplementalInfo = 0x7f339c6e0970, type = 2, value = 0x7f33399bdd90, begin = 139858433126066, end = 19, next = 0x7f33380d0002}
numArgs = 2                                                                     
i = <optimized out>                                                             
theExpression = 0x7f33380e3ca0
// I want to find out what is evaluated in the Neq function.
// From the source code I guessed that I need to look at the evaluation data (#define EVALUATION_DATA 44).
(gdb)  p (((struct environmentData *) theEnv)->theData[44])                     
$93 = (void *) 0x7f339c6ab930                                                   
(gdb) p (struct evaluationData) *0x7f339c6ab930                                 
$95 = {CurrentExpression = 0x7f33380e3c80, EvaluationError = 0, HaltExecution = 0, CurrentEvaluationDepth = 2, numberOfAddressTypes = 1, PrimitivesArray = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6feba0, 0x7f339c7bc398,
    0x0 <repeats 23 times>, 0x7f339c70b530, 0x7f339c7213f0, 0x7f339c6fa058, 0x0 <repeats 16 times>, 0x7f339c7b6490, 0x7f339c7b63b0, 0x7f339c7b6420, 0x7f339c7b6570, 0x7f339c7b6260, 0x7f339c7b62d0, 0x7f339c7b6340, 0x7f339c7b6110,
    0x7f339c7b6180, 0x7f339c7b61f0, 0x7f339c7b65e0, 0x7f339c7b6650, 0x7f339c7b6500, 0x7f339c796480, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d16e0, 0x7f339c7d1750, 0x7f339c7d1600, 0x7f339c7d1670, 0x7f339c7d1830, 0x7f339c7d17c0,
    0x7f339c7d18a0, 0x7f339c7d19f0, 0x7f339c7d1910, 0x7f339c7d1a60, 0x7f339c7d1980, 0x7f339c7d1ad0, 0x7f339c7b9540, 0x7f339c700a30, 0x7f339c700aa0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6fa0c8, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d2678,
    0x7f339c7d26e8, 0x7f339c7d2758, 0x7f339c7d27c8, 0x0 <repeats 51 times>}, ExternalAddressTypes = {0x7f339c6aa790, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}
// The current expression (CurrentExpression = 0x7f33380e3c80) should be Neq
// (type 30 corresponds to #define FCALL 30)
(gdb) p (expr) *0x7f33380e3c80                                                  
$96 = {type = 30, value = 0x7f339c6aebb0, argList = 0x7f33380e3ca0, nextArg = 0x0}
// This seems to be true:
(gdb) p (struct FunctionDefinition) *0x7f339c6aebb0
$6 = {callFunctionName = 0x7f339c6aec10, actualFunctionName = 0x7f33543c38e3 "NeqFunction", returnValueType = 98 'b', functionPointer = 0x7f335436a7b0 <NeqFunction>, parser = 0x0, restrictions = 0x7f33543c38be "2*", overloadable = 1,
sequenceuseok = 1, environmentAware = 1, bsaveIndex = 0, next = 0x7f339c6aeac0, usrData = 0x0, context = 0x0}
// So the arglist should tell me something about what is compared I assume:
(gdb) p (expr) *0x7f33380e3ca0                                                  
$98 = {type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}
(gdb) p (expr) *0x7f33380e3ca0@2                                                
$99 = {{type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}, {type = 57, value = 0x7f33380e26d0, argList = 0x0, nextArg = 0x0}}
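One detail that is easy to miss in the `info locals` output of frame 3: gdb prints `begin` and `end` of `item` in decimal, which hides whether they look like pointers. The small Python sketch below converts them for comparison. The numbers are copied from the session; the variable names are illustrative.

```python
# Values copied from `info locals` in frame 3.
item_begin = 139859644516720
item_end = 139857960488192
the_env = 0x7F339C6E0970  # theEnv as shown in every frame of the backtrace

print(hex(item_begin))
print(hex(item_end))
print(item_begin == the_env)
```

Here `item.begin` comes out identical to `theEnv`, which might suggest that parts of `item` still hold leftover stack contents rather than a meaningful result. That is only an observation about the printed values, not a diagnosis.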
                      

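Now I lack the understanding to continue from type = 57 and type = 58 (#define FACT_JN_VAR1 57 and #define FACT_JN_VAR2 58). As far as I understand, this means that the actual data is stored in the PrimitivesArray of the evaluationData, but even if that is correct, I still cannot figure out what is going on.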

(gdb) frame 3                                                                   
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[57]           
$6 = (struct entityRecord *) 0x7f339c7b6110                                     
(gdb) p (struct entityRecord) *0x7f339c7b6110                                   
$7 = {name = 0x7f33543bfb9c "FACT_JN_VAR1", type = 57, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>, longPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>,
  deleteFunction = 0x0, evaluateFunction = 0x7f335435bd70 <FactJNGetVar1>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
(gdb) ptype entityRecord                                                        
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[58]           
$8 = (struct entityRecord *) 0x7f339c7b6180                                     
(gdb) p (struct entityRecord) *0x7f339c7b6180                                   
$9 = {name = 0x7f33543bfba9 "FACT_JN_VAR2", type = 58, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>, longPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>,
  deleteFunction = 0x0, evaluateFunction = 0x7f335435b3e0 <FactJNGetVar2>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
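For reference, the numeric type codes that appear in this session can be collected in one place. `FCALL`, `FACT_JN_VAR1`, and `FACT_JN_VAR2` are the values quoted above; `FLOAT = 0` and `SYMBOL = 2` are assumptions based on the CLIPS 6.x `constant.h` and should be checked against the packaged sources. The helper name `describe` is hypothetical.

```python
TYPE_NAMES = {
    0: "FLOAT",          # assumption: taken from constant.h, verify locally
    2: "SYMBOL",         # assumption: taken from constant.h, verify locally
    30: "FCALL",         # quoted in the post
    57: "FACT_JN_VAR1",  # quoted in the post
    58: "FACT_JN_VAR2",  # quoted in the post
}


def describe(label, type_code):
    """Format one object from the gdb session with its symbolic type name."""
    name = TYPE_NAMES.get(type_code, "UNKNOWN")
    return f"{label}: type {type_code} ({name})"


# (label, type) pairs as printed by gdb in frame 3.
SESSION = [
    ("item", 0),
    ("nextItem", 2),
    ("CurrentExpression", 30),
    ("argList[0]", 58),
    ("argList[1]", 57),
]

for label, code in SESSION:
    print(describe(label, code))
```

Read this way, frame 0 was handed an object tagged as FLOAT whose value pointer is 0x25, which does not look like a plausible heap address; that appears to be the value being dereferenced at the crash site.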

TL/DR:

I encounter segmentation faults using clips 6.3 within a large robotics framework and I would appreciate any hints of what could be causing them (e.g., is this something that is known and is fixed in newer versions or is there a known mistake that could cause those type of segfaults) or how I could trace back the faulty value within the backtrace to its origin.

Introduction

Hello,

I have been using clips for many years within a robotics application in the framework fawkes. Since recently I encounter segmentation faults in some of my feature branches during execution. After weeks of searching for the cause I lost any idea on how to proceed and need help trace the cause.

System and Software

Fedora 35 (although the segfault was also observed on Fedora 33 and Fedora 34)

Clips Version: 6.3 (from the fedora package sources)
Embedded via clipsmm within the fawkes framework. Although the framework is multi-threaded I do not think that the segmentation fault is caused through other threads than the one running the clips environment. Mainly, because the segfaults are only occuring in some feature branches, which only change clips code and without those, everything has been stable for multiple years. Nevertheless, I think it is important to mention the slight possibility that some threading issues cause these crashes...

The backtace

#0  0x00007f3354394b1e in PropagateReturnAtom (value=0x25, type=0, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:784
#1  PropagateReturnValue (theEnv=theEnv@entry=0x7f339c6e0970, vPtr=vPtr@entry=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:750                                                                
#2  0x00007f335439aeb2 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=<optimized out>, returnValue=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:432
#3  0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
#4  0x00007f335439b3a3 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e3c80, returnValue=0x7f333e7fb0a0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:180
#5  0x00007f335436158a in EvaluateJoinExpression (theEnv=theEnv@entry=0x7f339c6e0970, joinExpr=0x7f33380e3c80, joinPtr=joinPtr@entry=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:629            
#6  0x00007f3354361aef in NetworkAssertRight (join=0x7f33380e3b20, rhsBinds=0x7f333996fe10, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:235
#7  NetworkAssertRight (theEnv=0x7f339c6e0970, rhsBinds=0x7f333996fe10, join=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:112                                                                    
#8  0x00007f335433b14e in ProcessFactAlphaMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, theMarks=<optimized out>, thePattern=thePattern@entry=0x7f33380e3ab0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:552                                    
#9  0x00007f3354340fd1 in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=<optimized out>, thePattern@entry=0x7f33380e3ab0, markers=<optimized out>,
    markers@entry=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10, offset=5) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:367
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0, offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243                                    
#11 0x00007f3354340d8c in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=thePattern@entry=0x7f33380e2c30, markers=<optimized out>, markers@entry=0x0, endMark=endMark@entry=0x0, offset=0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:420                                    
#12 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=theFact@entry=0x7f333a05b2c0, patternPtr=0x7f33380e2c30, offset=offset@entry=0, markers=markers@entry=0x0, endMark=endMark@entry=0x0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243                                    
#13 0x00007f3354343531 in EnvAssert (theEnv=0x7f339c6e0970, vTheFact=0x7f333a05b2c0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmngr.c:770
#14 0x00007f335431c08f in AssertCommand (theEnv=0x7f339c6e0970, rv=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factcom.c:235
#15 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd0d0, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349                                     
#16 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#17 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd090, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349                                     
#18 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380dd090, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x7f335433aa90 <UnboundDeffunctionErr>)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873                                    
#19 0x00007f3354340239 in CallDeffunction (theEnv=0x7f339c6e0970, dptr=0x7f33380dcff0, args=0x7f333e7fb6b0, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxexe.c:131                           
#20 0x00007f33543490ea in EvaluateDeffunctionCall (theEnv=0x7f339c6e0970, value=<optimized out>, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxfun.c:661
#21 0x00007f335439af67 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2800, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:422 
#22 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#23 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2760, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349
#24 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380e2760, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x0)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873                                    
#25 0x00007f33543915ad in EnvRun (theEnv=0x7f339c6e0970, runLimit=-1) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/engine.c:315
#26 0x00007f3354432e94 in CLIPS::Environment::run(long) (this=0x7f339c64bf00, runlimit=runlimit@entry=-1) at /usr/src/debug/clipsmm-0.3.5-11.fc35.x86_64/clipsmm/environment.cpp:134                                                          
#27 0x00007f33540091b7 in ClipsExecutiveThread::loop() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/utils/lockptr.h:301
#28 0x00007f33adaa6d6c in fawkes::Thread::run() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:947
#29 0x00007f33adaa791a in fawkes::Thread::entry(void*) (pthis=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:565
#30 0x00007f33ad6e1da2 in start_thread (arg=<optimized out>) at pthread_create.c:443                                   
#31 0x00007f33ad6819e0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81     

The relevant CLIPS rules (I think)

Here is the last entry in my debug log:

D 16:45:38.408820 CLIPS (executive): FIRE  131 central-run-parallel-goal-commit: f-39749,f-39728                      
D 16:45:38.408838 CLIPS (executive): <== f-39749 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode EXPANDED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))
D 16:45:38.408864 CLIPS (executive): ==> f-39778 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode COMMITTED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))        
D 16:45:38.408911 CLIPS (executive): FIRE  132 wm-sync-update-goals-on-mode-change: f-39778,f-16227,f-39775,f-39774
D 16:45:38.408924 CLIPS (executive): <== f-39775 (wm-fact (id "/template/fact/goal?id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode EXPANDED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
D 16:45:38.408936 CLIPS (executive): <== f-39774 (wm-fact (id "/template/fact/goal-meta?goal-id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal-meta args? goal-id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values assigned-to nil restricted-to nil order-id nil ring-nr nil root-for-order nil run-all-ordering 1))                                                             
D 16:45:38.409030 CLIPS (executive): ==> f-39779 (wm-fact (id "") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode COMMITTED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
                                                     

The rule that fired last is the following (shown together with the helper deffunction it calls):

(deffunction assert-template-wm-fact (?fact-id ?id-slots ?other-slots)          
" Helper to create a wm-fact from a template fact"                                                                     
  (assert (wm-fact (key template fact (fact-relation ?fact-id)                                                         
                    args? (template-fact-slots-to-key-vals ?fact-id ?id-slots))                                        
                   (type SYMBOL)                                                                                       
                   (is-list TRUE)                                                                                      
                   (values (template-fact-slots-to-key-vals ?fact-id ?other-slots)))                                   
  )                                                                                                                    
)         
(defrule wm-sync-update-goals-on-mode-change                                                                           
  ?g <- (goal (id ?id) (mode ?mode))                                                                                   
  ?gm <- (goal-meta (goal-id ?id))                                                                                     
  ?wm <- (wm-fact (key template fact goal args? id ?id)                                                                
                  (values $? mode ?other-mode&:(neq ?mode ?other-mode) $?))                                            
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))                                                    
  =>                                                                                                                   
  (retract ?wm)                                                                 
  (retract ?wm2)                                                                                                       
  (assert-template-wm-fact ?g                                                                                          
                           ?*GOAL_ID_SLOTS*                                                                            
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)                         
                                               meta-fact))                                                             
  (assert-template-wm-fact ?gm                                                                                         
                           ?*GOAL_META_ID_SLOTS*                                                                       
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))                              
)

My progress with the backtrace

Frames 31 to 13 essentially describe the execution up to the point in the log where the rule above asserts the first new wm-fact. After that the magic of CLIPS happens, which I tried to explain to myself as follows (from reading the Wikipedia article on the Rete algorithm and looking a bit into the source code of CLIPS 6.3):

Essentially, the new fact now needs to be evaluated within the Rete network to see how all existing rule activations are influenced by it. In frames 12 and 11 one of these checks is done, but to no avail. In frame 10 a match with an existing pattern is found (I think frames 9 to 7 essentially mean that a match is established and that further evaluation is now required).
Hence I tried to look into frame 10:

frame 10

(gdb) frame 10                                                                  
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0 , offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
    at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243
243!           { ProcessMultifieldNode(theEnv,patternPtr,markers,endMark,0); }
// look at the content of patternPtr=0x7f33380e3ab0  
(gdb) p (factPatternNode) *0x7f33380e3ab0                                       
$62 = {header = {firstHash = 0x7f3339018780, lastHash = 0x7f33399c8ac0, entryJoin = 0x7f33380e3b20, rightHash = 0x7f3338099a30, singlefieldNode = 0, multifieldNode = 1, stopNode = 1, initialize = 0, marked = 0, beginSlot = 0,
    endSlot = 1, selector = 0}, bsaveID = 0, whichField = 4, whichSlot = 5, leaveFields = 0, networkTest = 0x0, nextLevel = 0x0, lastLevel = 0x7f33380e3a10, leftNode = 0x0, rightNode = 0x0}
// look at the content of entryJoin = 0x7f33380e3b20 and follow the join links
(gdb) p (joinNode) *0x7f33380e3b20                                              
$63 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 3, bsaveID = 0, memoryAdds = 4299, memoryDeletes = 4167, memoryCompares = 5006,
  leftMemory = 0x7f33380b3d60, rightMemory = 0x0, networkTest = 0x7f33380e3c40, secondaryNetworkTest = 0x0, leftHash = 0x7f339c7f3b40, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3ab0, nextLinks = 0x7f33380b4ba0,
  lastLevel = 0x7f33380e29d0, rightMatchNode = 0x0, ruleToActivate = 0x0}       
(gdb) p (joinLink) *0x7f33380b4ba0                                              
$64 = {enterDirection = 0 '\000', join = 0x7f33380e3cf0, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3cf0                                              
$65 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 4, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
  leftMemory = 0x7f33380b4bd0, rightMemory = 0x0, networkTest = 0x7f33380b92e0, secondaryNetworkTest = 0x0, leftHash = 0x7f33380b4990, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3210, nextLinks = 0x7f33380b47e0,
  lastLevel = 0x7f33380e3b20, rightMatchNode = 0x7f33380e32b0, ruleToActivate = 0x0}
(gdb) p (joinLink) *0x7f33380b47e0                                              
$66 = {enterDirection = 0 '\000', join = 0x7f33380e3e10, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3e10                                              
$67 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 0, depth = 5, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
  leftMemory = 0x7f33380b72a0, rightMemory = 0x0, networkTest = 0x0, secondaryNetworkTest = 0x0, leftHash = 0x0, rightHash = 0x0, rightSideEntryStructure = 0x0, nextLinks = 0x0, lastLevel = 0x7f33380e3cf0, rightMatchNode = 0x0,
  ruleToActivate = 0x7f33380e3ec0}
// We are at the end, this should be the rule which is checked for activation?!                                              
(gdb) print (defrule) *0x7f33380e3ec0                                           
$68 = {header = {name = 0x7f33380b6e50,                                         
    ppForm = 0x7f33380e3f30 "(defrule MAIN::wm-sync-update-goals-on-parent-change\n   ?g <- (goal (id ?id) (parent ?parent))\n   ?gm <- (goal-meta (goal-id ?id))\n   ?wm <- (wm-fact (key template fact goal args? id ?id) (values $? p"..., whichModule = 0x7f339c7d22d0, bsaveID = 0, next = 0x7f33380e4570, usrData = 0x0}, salience = 0, localVarCnt = 0, complexity = 19, afterBreakpoint = 0, watchActivation = 0, watchFiring = 1, autoFocus = 0, executing = 0,
  dynamicSalience = 0x0, actions = 0x7f33380e37a0, logicalJoin = 0x0, lastJoin = 0x7f33380e3e10, disjunct = 0x0}
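
The manual walk in the session above (entryJoin, then nextLinks, then join, repeated until ruleToActivate is non-null) can be mirrored in a small model. This is only a sketch in Python, not CLIPS code: the classes copy a few field names visible in the gdb output, the helper names (`rules_reachable`, `build_chain`) are made up for illustration, and the walk follows every link rather than only the first one, on the assumption that a join may have more than one successor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JoinLink:
    """Mirrors the joinLink fields seen in gdb: one edge to a successor join."""
    join: "JoinNode"
    next: Optional["JoinLink"] = None


@dataclass
class JoinNode:
    """Mirrors only the joinNode fields used in the walk."""
    depth: int
    next_links: Optional[JoinLink] = None    # nextLinks in the gdb output
    rule_to_activate: Optional[str] = None   # ruleToActivate (a defrule pointer in CLIPS)


def rules_reachable(join: JoinNode) -> List[str]:
    """Return the names of all rules whose terminal join is reachable from join."""
    found: List[str] = []
    stack = [join]
    while stack:
        node = stack.pop()
        if node.rule_to_activate is not None:
            found.append(node.rule_to_activate)
        link = node.next_links
        while link is not None:              # iterate over sibling links
            stack.append(link.join)
            link = link.next
    return found


def build_chain() -> JoinNode:
    """Rebuild the chain from the session: depth 3 -> depth 4 -> depth 5 (terminal)."""
    terminal = JoinNode(depth=5,
                        rule_to_activate="wm-sync-update-goals-on-parent-change")
    middle = JoinNode(depth=4, next_links=JoinLink(join=terminal))
    return JoinNode(depth=3, next_links=JoinLink(join=middle))


if __name__ == "__main__":
    print(rules_reachable(build_chain()))
```

If the `next` field of a joinLink were ever non-null in the real session, the same loop would report every rule sharing that join, which is worth checking before concluding that only one rule is involved.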

From my understanding, the rule in question is called wm-sync-update-goals-on-parent-change, which is the following:

(defrule wm-sync-update-goals-on-parent-change                                  
  ?g <- (goal (id ?id) (parent ?parent))                                        
  ?gm <- (goal-meta (goal-id ?id))                                              
  ?wm <- (wm-fact (key template fact goal args? id ?id)                         
                  (values $? parent ?other-parent&:(neq ?parent ?other-parent) $?))
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))             
  =>                                                                            
  (retract ?wm)                                                                 
  (retract ?wm2)                                                                
  (assert-template-wm-fact ?g                                                   
                           ?*GOAL_ID_SLOTS*                                     
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)
                                               meta-fact))                      
  (assert-template-wm-fact ?gm                                                  
                           ?*GOAL_META_ID_SLOTS*                                
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))
)

Looking at frames 3 to 6, I think it's clear that the only NeqFunction call that comes into consideration within that rule is (neq ?parent ?other-parent), which should be a rather simple check.

But here is also where I got stuck, as the rest of the backtrace is weird. I could not find the functions PropagateReturnValue and PropagateReturnAtom in the source code of CLIPS 6.3; they only occur in CLIPS 6.24. Hence something seems to be off with the debuginfo. I am pretty sure that the CLIPS version running is indeed 6.3 though, as we use foreach loops. It cannot be a higher version either, as I verified that features from 6.31 are indeed missing (e.g., modifying retracted facts does not cause an error yet).

frame 3

(gdb) frame 3                                                                   
#3  0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
167!   EvaluateExpression(theEnv,theExpression,&item);                          
(gdb) info args                                                                 
theEnv = 0x7f339c6e0970                                                         
(gdb) info locals                                                               
item = {supplementalInfo = 0x7f339c7d3760, type = 0, value = 0x25, begin = 139859644516720, end = 139857960488192, next = 0x7f333e7fb370}
nextItem = {supplementalInfo = 0x7f339c6e0970, type = 2, value = 0x7f33399bdd90, begin = 139858433126066, end = 19, next = 0x7f33380d0002}
numArgs = 2                                                                     
i = <optimized out>                                                             
theExpression = 0x7f33380e3ca0
// I want to find out what is evaluated in the Neq function
// from the source code I guessed I need to look at the evaluation data: #define EVALUATION_DATA 44
(gdb)  p (((struct environmentData *) theEnv)->theData[44])                     
$93 = (void *) 0x7f339c6ab930                                                   
(gdb) p (struct evaluationData) *0x7f339c6ab930                                 
$95 = {CurrentExpression = 0x7f33380e3c80, EvaluationError = 0, HaltExecution = 0, CurrentEvaluationDepth = 2, numberOfAddressTypes = 1, PrimitivesArray = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6feba0, 0x7f339c7bc398,
    0x0 <repeats 23 times>, 0x7f339c70b530, 0x7f339c7213f0, 0x7f339c6fa058, 0x0 <repeats 16 times>, 0x7f339c7b6490, 0x7f339c7b63b0, 0x7f339c7b6420, 0x7f339c7b6570, 0x7f339c7b6260, 0x7f339c7b62d0, 0x7f339c7b6340, 0x7f339c7b6110,
    0x7f339c7b6180, 0x7f339c7b61f0, 0x7f339c7b65e0, 0x7f339c7b6650, 0x7f339c7b6500, 0x7f339c796480, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d16e0, 0x7f339c7d1750, 0x7f339c7d1600, 0x7f339c7d1670, 0x7f339c7d1830, 0x7f339c7d17c0,
    0x7f339c7d18a0, 0x7f339c7d19f0, 0x7f339c7d1910, 0x7f339c7d1a60, 0x7f339c7d1980, 0x7f339c7d1ad0, 0x7f339c7b9540, 0x7f339c700a30, 0x7f339c700aa0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6fa0c8, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d2678,
    0x7f339c7d26e8, 0x7f339c7d2758, 0x7f339c7d27c8, 0x0 <repeats 51 times>}, ExternalAddressTypes = {0x7f339c6aa790, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}
// The current expression (CurrentExpression = 0x7f33380e3c80) should be Neq
// (type 30 is #define FCALL 30)
(gdb) p (expr) *0x7f33380e3c80                                                  
$96 = {type = 30, value = 0x7f339c6aebb0, argList = 0x7f33380e3ca0, nextArg = 0x0}
// This seems to be true:
(gdb) p (struct FunctionDefinition) *0x7f339c6aebb0
$6 = {callFunctionName = 0x7f339c6aec10, actualFunctionName = 0x7f33543c38e3 "NeqFunction", returnValueType = 98 'b', functionPointer = 0x7f335436a7b0 <NeqFunction>, parser = 0x0, restrictions = 0x7f33543c38be "2*", overloadable = 1,
sequenceuseok = 1, environmentAware = 1, bsaveIndex = 0, next = 0x7f339c6aeac0, usrData = 0x0, context = 0x0}
// So the arglist should tell me something about what is compared I assume:
(gdb) p (expr) *0x7f33380e3ca0                                                  
$98 = {type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}
(gdb) p (expr) *0x7f33380e3ca0@2                                                
$99 = {{type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}, {type = 57, value = 0x7f33380e26d0, argList = 0x0, nextArg = 0x0}}
                      

Now I lack the knowledge to continue, as type = 57 and type = 58 are #define FACT_JN_VAR1 57 and #define FACT_JN_VAR2 58. To my understanding, that means that the actual data is stored in the PrimitivesArray of the evaluationData, but even if that is true, I still could not figure out what is going on.

(gdb) frame 3                                                                   
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[57]           
$6 = (struct entityRecord *) 0x7f339c7b6110                                     
(gdb) p (struct entityRecord) *0x7f339c7b6110                                   
$7 = {name = 0x7f33543bfb9c "FACT_JN_VAR1", type = 57, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>, longPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>,
  deleteFunction = 0x0, evaluateFunction = 0x7f335435bd70 <FactJNGetVar1>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
(gdb) ptype entityRecord                                                        
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[58]           
$8 = (struct entityRecord *) 0x7f339c7b6180                                     
(gdb) p (struct entityRecord) *0x7f339c7b6180                                   
$9 = {name = 0x7f33543bfba9 "FACT_JN_VAR2", type = 58, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>, longPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>,
  deleteFunction = 0x0, evaluateFunction = 0x7f335435b3e0 <FactJNGetVar2>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
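
One possible reading of this output: PrimitivesArray appears to be a table of handlers indexed by expression type, not a store for the values being compared. The entityRecord for type 57 names an evaluateFunction (FactJNGetVar1) and has bitMap = 1, which suggests that the expr node's own value field (for example 0x7f33380e26d0) is a bitmap describing where in the partial match to look, and that this is what the handler receives. The sketch below models that reading in Python. Everything in it is an assumption made for illustration: the `(pattern, slot, field)` tuple standing in for the bitmap, the list-of-dicts partial match, and the shared `get_var` handler are not the CLIPS implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

FCALL = 30          # type codes as quoted in the question
FACT_JN_VAR1 = 57
FACT_JN_VAR2 = 58


@dataclass
class Expr:
    """Mirrors the expr fields printed in gdb: type, value, argList, nextArg."""
    type: int
    value: Any
    arg_list: Optional["Expr"] = None
    next_arg: Optional["Expr"] = None


def get_var(descriptor, partial_match):
    """Hypothetical stand-in for FactJNGetVar1/2: descriptor says WHERE to look."""
    pattern, slot, field = descriptor
    return partial_match[pattern][slot][field]


# Handler table indexed by expression type (the role PrimitivesArray seems to play).
PRIMITIVES: Dict[int, Callable[..., Any]] = {
    FACT_JN_VAR1: get_var,
    FACT_JN_VAR2: get_var,
}


def neq(first, *rest):
    """True if first differs from every other argument, as CLIPS neq does."""
    return all(first != other for other in rest)


def evaluate(expr: Expr, partial_match) -> Any:
    if expr.type == FCALL:
        args = []
        arg = expr.arg_list
        while arg is not None:                  # walk the nextArg chain
            args.append(evaluate(arg, partial_match))
            arg = arg.next_arg
        return expr.value(*args)
    handler = PRIMITIVES[expr.type]             # table lookup by type ...
    return handler(expr.value, partial_match)   # ... data comes from expr.value


def build_neq_expr() -> Expr:
    """Same shape as the dump: FCALL with a type-58 argument followed by a type-57 one."""
    second = Expr(type=FACT_JN_VAR1, value=(2, "values", 3))
    first = Expr(type=FACT_JN_VAR2, value=(0, "parent", 0), next_arg=second)
    return Expr(type=FCALL, value=neq, arg_list=first)


if __name__ == "__main__":
    match = [{"parent": ["P1"]}, {"goal-id": ["G"]},
             {"values": ["class", "X", "parent", "P2"]}]
    print(evaluate(build_neq_expr(), match))
```

Under this reading, the next thing to inspect in gdb would be the bitmaps behind the two value pointers, not PrimitivesArray itself, since a wrong slot or field index stored there would make the handler fetch something that is not a valid value.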


蘑菇王子 2025-02-17 15:16:25

If you're using CLIPS 6.3, but the stack trace is showing functions from 6.24, you should probably try to figure that out first.

Then, if you haven't done so already, download the existing patches for 6.3 from https://sourceforge.net/p/clipsrules/code/HEAD/tree/branches/63x/core/ and see if your issue is still present.

I doubt that it's an issue specific to the neq function. FACT_JN_VAR1 and FACT_JN_VAR2 are primitives that retrieve values from facts during pattern matching. A crash like this can sometimes be caused by an incorrectly computed index used to retrieve the value from a fact, and is usually associated with a rule having a complex set of conditions with nested conditional elements.

Since the fault is occurring when a wm-fact is asserted, potentially any rule with a pattern that matches that fact and has a neq call in its conditions could be the cause. Sometimes you can isolate the issue by running your system right up to the last rule fired, saving the facts, clearing and reloading the rules, loading the saved facts, and letting the last rule fire. If you still get a crash, then you can pare down the facts and rules so that you get a smaller set of code to debug.
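
The paring-down step can be partly automated. The sketch below is a simplified greedy reduction in the spirit of delta debugging (not the full ddmin algorithm): it repeatedly tries dropping chunks of facts and keeps any smaller set that still crashes. `still_crashes` is a hypothetical callback. In practice it would have to start a fresh CLIPS process, load the rules and the candidate facts, run, and report whether the process segfaulted.

```python
from typing import Callable, List, Sequence


def reduce_facts(facts: Sequence[str],
                 still_crashes: Callable[[List[str]], bool]) -> List[str]:
    """Greedily shrink facts while still_crashes(subset) stays True."""
    current = list(facts)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        shrunk = False
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if candidate and still_crashes(candidate):
                current = candidate      # keep the smaller failing set
                shrunk = True
            else:
                i += chunk               # this chunk is needed, move on
        if chunk > 1:
            chunk //= 2                  # retry with finer granularity
        elif not shrunk:
            break                        # single-fact pass removed nothing
    return current


if __name__ == "__main__":
    # Toy stand-in for the real check: the "crash" needs facts b and e together.
    def toy_check(subset: List[str]) -> bool:
        return "b" in subset and "e" in subset

    print(reduce_facts(["a", "b", "c", "d", "e", "f"], toy_check))
```

The same loop could be applied to the list of rules once the fact set is small, at the cost of one CLIPS run per candidate.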

The two other possibilities that come to mind are that either CLIPS is erroneously garbage collecting something prematurely or that memory is being corrupted some other way. That's harder to debug because the actual problem usually happens long before the crash. When I've had to debug these types of issues before, the first step was also to pare down the code so that I could get a better idea of what was causing the issue.
