TL;DR:
I am getting segmentation faults with CLIPS 6.3 inside a large robotics framework. I would appreciate any hints about what could cause them (for example, whether this is a known issue that is fixed in newer versions, or a known usage mistake that produces this type of segfault), or about how I could trace the faulty value in the backtrace back to its origin.
Introduction
Hello,
I have been using CLIPS for many years within a robotics application in the fawkes framework. Recently I started encountering segmentation faults in some of my feature branches during execution. After weeks of searching for the cause, I have run out of ideas on how to proceed and need help tracing it.
System and Software
Fedora 35 (although the segfault was also observed on Fedora 33 and Fedora 34)
Clips Version: 6.3 (from the fedora package sources)
Embedded via clipsmm within the fawkes framework. Although the framework is multi-threaded, I do not think the segmentation fault is caused by any thread other than the one running the CLIPS environment, mainly because the segfaults only occur in some feature branches that change nothing but CLIPS code; without those changes, everything has been stable for years. Nevertheless, I think it is worth mentioning the slight possibility that a threading issue causes these crashes.
The backtrace
#0 0x00007f3354394b1e in PropagateReturnAtom (value=0x25, type=0, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:784
#1 PropagateReturnValue (theEnv=theEnv@entry=0x7f339c6e0970, vPtr=vPtr@entry=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:750
#2 0x00007f335439aeb2 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=<optimized out>, returnValue=0x7f333e7fafa0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:432
#3 0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
#4 0x00007f335439b3a3 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e3c80, returnValue=0x7f333e7fb0a0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:180
#5 0x00007f335436158a in EvaluateJoinExpression (theEnv=theEnv@entry=0x7f339c6e0970, joinExpr=0x7f33380e3c80, joinPtr=joinPtr@entry=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:629
#6 0x00007f3354361aef in NetworkAssertRight (join=0x7f33380e3b20, rhsBinds=0x7f333996fe10, theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:235
#7 NetworkAssertRight (theEnv=0x7f339c6e0970, rhsBinds=0x7f333996fe10, join=0x7f33380e3b20) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/drive.c:112
#8 0x00007f335433b14e in ProcessFactAlphaMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, theMarks=<optimized out>, thePattern=thePattern@entry=0x7f33380e3ab0)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:552
#9 0x00007f3354340fd1 in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=<optimized out>, thePattern@entry=0x7f33380e3ab0, markers=<optimized out>,
markers@entry=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10, offset=5) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:367
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0, offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243
#11 0x00007f3354340d8c in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, thePattern=thePattern@entry=0x7f33380e2c30, markers=<optimized out>, markers@entry=0x0, endMark=endMark@entry=0x0, offset=0)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:420
#12 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=theFact@entry=0x7f333a05b2c0, patternPtr=0x7f33380e2c30, offset=offset@entry=0, markers=markers@entry=0x0, endMark=endMark@entry=0x0)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243
#13 0x00007f3354343531 in EnvAssert (theEnv=0x7f339c6e0970, vTheFact=0x7f333a05b2c0) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmngr.c:770
#14 0x00007f335431c08f in AssertCommand (theEnv=0x7f339c6e0970, rv=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factcom.c:235
#15 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd0d0, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349
#16 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#17 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380dd090, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349
#18 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380dd090, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x7f335433aa90 <UnboundDeffunctionErr>)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873
#19 0x00007f3354340239 in CallDeffunction (theEnv=0x7f339c6e0970, dptr=0x7f33380dcff0, args=0x7f333e7fb6b0, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxexe.c:131
#20 0x00007f33543490ea in EvaluateDeffunctionCall (theEnv=0x7f339c6e0970, value=<optimized out>, result=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/dffnxfun.c:661
#21 0x00007f335439af67 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2800, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:422
#22 0x00007f335435f993 in PrognFunction (theEnv=0x7f339c6e0970, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prcdrfun.c:570
#23 0x00007f335439b138 in EvaluateExpression (theEnv=0x7f339c6e0970, problem=0x7f33380e2760, returnValue=0x7f333e7fb920) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/evaluatn.c:349
#24 0x00007f33543589ce in EvaluateProcActions (theEnv=0x7f339c6e0970, theModule=<optimized out>, actions=0x7f33380e2760, lvarcnt=0, result=0x7f333e7fb920, crtproc=0x0)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prccode.c:873
#25 0x00007f33543915ad in EnvRun (theEnv=0x7f339c6e0970, runLimit=-1) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/engine.c:315
#26 0x00007f3354432e94 in CLIPS::Environment::run(long) (this=0x7f339c64bf00, runlimit=runlimit#entry=-1) at /usr/src/debug/clipsmm-0.3.5-11.fc35.x86_64/clipsmm/environment.cpp:134
#27 0x00007f33540091b7 in ClipsExecutiveThread::loop() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/utils/lockptr.h:301
#28 0x00007f33adaa6d6c in fawkes::Thread::run() (this=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:947
#29 0x00007f33adaa791a in fawkes::Thread::entry(void*) (pthis=0x7f339c71a5d0) at /home/tarikwork/fawkes-robotino/fawkes/src/libs/core/threading/thread.cpp:565
#30 0x00007f33ad6e1da2 in start_thread (arg=<optimized out>) at pthread_create.c:443
#31 0x00007f33ad6819e0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
The relevant CLIPS rules (I think)
Here is the last entry in my debug log:
D 16:45:38.408820 CLIPS (executive): FIRE 131 central-run-parallel-goal-commit: f-39749,f-39728
D 16:45:38.408838 CLIPS (executive): <== f-39749 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode EXPANDED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))
D 16:45:38.408864 CLIPS (executive): ==> f-39778 (goal (id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (class PAYMENT-GOALS) (type ACHIEVE) (sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL) (parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840) (mode COMMITTED) (outcome UNKNOWN) (warning) (error) (message "") (priority 50) (params) (meta) (meta-fact 0) (meta-template nil) (required-resources) (acquired-resources) (committed-to nil) (verbosity DEFAULT) (is-executable TRUE))
D 16:45:38.408911 CLIPS (executive): FIRE 132 wm-sync-update-goals-on-mode-change: f-39778,f-16227,f-39775,f-39774
D 16:45:38.408924 CLIPS (executive): <== f-39775 (wm-fact (id "/template/fact/goal?id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode EXPANDED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
D 16:45:38.408936 CLIPS (executive): <== f-39774 (wm-fact (id "/template/fact/goal-meta?goal-id=CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839") (key template fact goal-meta args? goal-id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values assigned-to nil restricted-to nil order-id nil ring-nr nil root-for-order nil run-all-ordering 1))
D 16:45:38.409030 CLIPS (executive): ==> f-39779 (wm-fact (id "") (key template fact goal args? id CENTRAL-RUN-PARALLEL-PAYMENT-GOALS-gen1839) (type SYMBOL) (is-list TRUE) (value nil) (values class PAYMENT-GOALS type ACHIEVE sub-type CENTRAL-RUN-SUBGOALS-IN-PARALLEL parent CENTRAL-RUN-PARALLEL-PRODUCE-ORDER-gen1840 mode COMMITTED outcome UNKNOWN warning [ ] error [ ] message "" priority 50 params [ ] meta [ ] meta-template nil required-resources [ ] acquired-resources [ ] committed-to nil verbosity DEFAULT is-executable TRUE))
The Rule that is fired last is the following:
(deffunction assert-template-wm-fact (?fact-id ?id-slots ?other-slots)
  " Helper to create a wm-fact from a template fact"
  (assert (wm-fact (key template fact (fact-relation ?fact-id)
                        args? (template-fact-slots-to-key-vals ?fact-id ?id-slots))
                   (type SYMBOL)
                   (is-list TRUE)
                   (values (template-fact-slots-to-key-vals ?fact-id ?other-slots)))
  )
)
(defrule wm-sync-update-goals-on-mode-change
  ?g <- (goal (id ?id) (mode ?mode))
  ?gm <- (goal-meta (goal-id ?id))
  ?wm <- (wm-fact (key template fact goal args? id ?id)
                  (values $? mode ?other-mode&:(neq ?mode ?other-mode) $?))
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))
  =>
  (retract ?wm)
  (retract ?wm2)
  (assert-template-wm-fact ?g
                           ?*GOAL_ID_SLOTS*
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)
                                           meta-fact))
  (assert-template-wm-fact ?gm
                           ?*GOAL_META_ID_SLOTS*
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))
)
My progress with the backtrace
Frames 31 to 13 essentially describe the execution up to the point in the log where the rule above asserts the first new wm-fact. Then the magic of CLIPS happens, which I tried to explain to myself as follows (from reading the Wikipedia article on the Rete algorithm and looking a bit into the source code of CLIPS 6.3):
The new fact now has to be propagated through the Rete network to see how it affects all existing rule activations. In frames 12 and 11 one of these checks is done, but to no avail; in frame 10 a match with an existing pattern is found (I take frames 9 to 7 to mean that a match has been established and that further evaluation is now required).
Hence I tried to look into frame 10:
frame 10
(gdb) frame 10
#10 0x00007f3354341204 in FactPatternMatch (theEnv=theEnv@entry=0x7f339c6e0970, theFact=0x7f333a05b2c0, patternPtr=0x7f33380e3ab0, offset=offset@entry=5, markers=0x7f3339cd4d10, endMark=endMark@entry=0x7f3339cd4d10)
at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/factmch.c:243
243! { ProcessMultifieldNode(theEnv,patternPtr,markers,endMark,0); }
// look at the content of patternPtr=0x7f33380e3ab0
(gdb) p (factPatternNode) *0x7f33380e3ab0
$62 = {header = {firstHash = 0x7f3339018780, lastHash = 0x7f33399c8ac0, entryJoin = 0x7f33380e3b20, rightHash = 0x7f3338099a30, singlefieldNode = 0, multifieldNode = 1, stopNode = 1, initialize = 0, marked = 0, beginSlot = 0,
endSlot = 1, selector = 0}, bsaveID = 0, whichField = 4, whichSlot = 5, leaveFields = 0, networkTest = 0x0, nextLevel = 0x0, lastLevel = 0x7f33380e3a10, leftNode = 0x0, rightNode = 0x0}
// look at the content of entryJoin = 0x7f33380e3b20 and follow the join links
(gdb) p (joinNode) *0x7f33380e3b20
$63 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 3, bsaveID = 0, memoryAdds = 4299, memoryDeletes = 4167, memoryCompares = 5006,
leftMemory = 0x7f33380b3d60, rightMemory = 0x0, networkTest = 0x7f33380e3c40, secondaryNetworkTest = 0x0, leftHash = 0x7f339c7f3b40, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3ab0, nextLinks = 0x7f33380b4ba0,
lastLevel = 0x7f33380e29d0, rightMatchNode = 0x0, ruleToActivate = 0x0}
(gdb) p (joinLink) *0x7f33380b4ba0
$64 = {enterDirection = 0 '\000', join = 0x7f33380e3cf0, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3cf0
$65 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 1, depth = 4, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
leftMemory = 0x7f33380b4bd0, rightMemory = 0x0, networkTest = 0x7f33380b92e0, secondaryNetworkTest = 0x0, leftHash = 0x7f33380b4990, rightHash = 0x0, rightSideEntryStructure = 0x7f33380e3210, nextLinks = 0x7f33380b47e0,
lastLevel = 0x7f33380e3b20, rightMatchNode = 0x7f33380e32b0, ruleToActivate = 0x0}
(gdb) p (joinLink) *0x7f33380b47e0
$66 = {enterDirection = 0 '\000', join = 0x7f33380e3e10, next = 0x0, bsaveID = 0}
(gdb) p (joinNode) *0x7f33380e3e10
$67 = {firstJoin = 0, logicalJoin = 0, joinFromTheRight = 0, patternIsNegated = 0, patternIsExists = 0, initialize = 0, marked = 0, rhsType = 0, depth = 5, bsaveID = 0, memoryAdds = 0, memoryDeletes = 0, memoryCompares = 0,
leftMemory = 0x7f33380b72a0, rightMemory = 0x0, networkTest = 0x0, secondaryNetworkTest = 0x0, leftHash = 0x0, rightHash = 0x0, rightSideEntryStructure = 0x0, nextLinks = 0x0, lastLevel = 0x7f33380e3cf0, rightMatchNode = 0x0,
ruleToActivate = 0x7f33380e3ec0}
// We are at the end, this should be the rule which is checked for activation?!
(gdb) print (defrule) *0x7f33380e3ec0
$68 = {header = {name = 0x7f33380b6e50,
ppForm = 0x7f33380e3f30 "(defrule MAIN::wm-sync-update-goals-on-parent-change\n ?g <- (goal (id ?id) (parent ?parent))\n ?gm <- (goal-meta (goal-id ?id))\n ?wm <- (wm-fact (key template fact goal args? id ?id) (values $? p"..., whichModule = 0x7f339c7d22d0, bsaveID = 0, next = 0x7f33380e4570, usrData = 0x0}, salience = 0, localVarCnt = 0, complexity = 19, afterBreakpoint = 0, watchActivation = 0, watchFiring = 1, autoFocus = 0, executing = 0,
dynamicSalience = 0x0, actions = 0x7f33380e37a0, logicalJoin = 0x0, lastJoin = 0x7f33380e3e10, disjunct = 0x0}
From my understanding, the rule in question is called wm-sync-update-goals-on-parent-change, which is the following:
(defrule wm-sync-update-goals-on-parent-change
  ?g <- (goal (id ?id) (parent ?parent))
  ?gm <- (goal-meta (goal-id ?id))
  ?wm <- (wm-fact (key template fact goal args? id ?id)
                  (values $? parent ?other-parent&:(neq ?parent ?other-parent) $?))
  ?wm2 <- (wm-fact (key template fact goal-meta args? goal-id ?id))
  =>
  (retract ?wm)
  (retract ?wm2)
  (assert-template-wm-fact ?g
                           ?*GOAL_ID_SLOTS*
                           (delete-member$ (deftemplate-remaining-slots goal ?*GOAL_ID_SLOTS*)
                                           meta-fact))
  (assert-template-wm-fact ?gm
                           ?*GOAL_META_ID_SLOTS*
                           (deftemplate-remaining-slots goal-meta ?*GOAL_META_ID_SLOTS*))
)
Looking at frames 3 to 6, I think it is clear that the only NeqFunction call that can be involved in that rule is (neq ?parent ?other-parent), which should be a rather simple check.
But this is also where I got stuck, as the rest of the backtrace is weird. I could not find the functions PropagateReturnValue and PropagateReturnAtom in the source code of CLIPS 6.3; they only occur in CLIPS 6.24. Hence something seems to be off with the debuginfo. I am pretty sure that the CLIPS version running is indeed 6.3, though, as we use foreach loops. It cannot be a newer version either, as I verified that features from 6.31 are missing (e.g., modifying retracted facts does not cause an error yet).
frame 3
(gdb) frame 3
#3 0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at /usr/src/debug/clips-6.30.0-0.25.20090722svn.fc35.x86_64/clips/prdctfun.c:167
167! EvaluateExpression(theEnv,theExpression,&item);
(gdb) info args
theEnv = 0x7f339c6e0970
(gdb) info locals
item = {supplementalInfo = 0x7f339c7d3760, type = 0, value = 0x25, begin = 139859644516720, end = 139857960488192, next = 0x7f333e7fb370}
nextItem = {supplementalInfo = 0x7f339c6e0970, type = 2, value = 0x7f33399bdd90, begin = 139858433126066, end = 19, next = 0x7f33380d0002}
numArgs = 2
i = <optimized out>
theExpression = 0x7f33380e3ca0
// I want to find out what is evaluated in the Neq function.
// From the source code I guessed that I need to look at the evaluation data: #define EVALUATION_DATA 44
(gdb) p (((struct environmentData *) theEnv)->theData[44])
$93 = (void *) 0x7f339c6ab930
(gdb) p (struct evaluationData) *0x7f339c6ab930
$95 = {CurrentExpression = 0x7f33380e3c80, EvaluationError = 0, HaltExecution = 0, CurrentEvaluationDepth = 2, numberOfAddressTypes = 1, PrimitivesArray = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6feba0, 0x7f339c7bc398,
0x0 <repeats 23 times>, 0x7f339c70b530, 0x7f339c7213f0, 0x7f339c6fa058, 0x0 <repeats 16 times>, 0x7f339c7b6490, 0x7f339c7b63b0, 0x7f339c7b6420, 0x7f339c7b6570, 0x7f339c7b6260, 0x7f339c7b62d0, 0x7f339c7b6340, 0x7f339c7b6110,
0x7f339c7b6180, 0x7f339c7b61f0, 0x7f339c7b65e0, 0x7f339c7b6650, 0x7f339c7b6500, 0x7f339c796480, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d16e0, 0x7f339c7d1750, 0x7f339c7d1600, 0x7f339c7d1670, 0x7f339c7d1830, 0x7f339c7d17c0,
0x7f339c7d18a0, 0x7f339c7d19f0, 0x7f339c7d1910, 0x7f339c7d1a60, 0x7f339c7d1980, 0x7f339c7d1ad0, 0x7f339c7b9540, 0x7f339c700a30, 0x7f339c700aa0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f339c6fa0c8, 0x0, 0x0, 0x0, 0x0, 0x7f339c7d2678,
0x7f339c7d26e8, 0x7f339c7d2758, 0x7f339c7d27c8, 0x0 <repeats 51 times>}, ExternalAddressTypes = {0x7f339c6aa790, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}
// The current expression (CurrentExpression = 0x7f33380e3c80) should be Neq
// (type 30 is FCALL: #define FCALL 30)
(gdb) p (expr) *0x7f33380e3c80
$96 = {type = 30, value = 0x7f339c6aebb0, argList = 0x7f33380e3ca0, nextArg = 0x0}
// This seems to be true:
(gdb) p (struct FunctionDefinition) *0x7f339c6aebb0
$6 = {callFunctionName = 0x7f339c6aec10, actualFunctionName = 0x7f33543c38e3 "NeqFunction", returnValueType = 98 'b', functionPointer = 0x7f335436a7b0 <NeqFunction>, parser = 0x0, restrictions = 0x7f33543c38be "2*", overloadable = 1,
sequenceuseok = 1, environmentAware = 1, bsaveIndex = 0, next = 0x7f339c6aeac0, usrData = 0x0, context = 0x0}
// So the arglist should tell me something about what is compared I assume:
(gdb) p (expr) *0x7f33380e3ca0
$98 = {type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}
(gdb) p (expr) *0x7f33380e3ca0@2
$99 = {{type = 58, value = 0x7f339c80f0e0, argList = 0x0, nextArg = 0x7f33380e3cc0}, {type = 57, value = 0x7f33380e26d0, argList = 0x0, nextArg = 0x0}}
Now I lack the knowledge to continue. The types 57 and 58 are #define FACT_JN_VAR1 57 and #define FACT_JN_VAR2 58. To my understanding, that means the actual data is retrieved via the PrimitivesArray of the evaluationData, but even if that is true, I still could not figure out what is going on.
(gdb) frame 3
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[57]
$6 = (struct entityRecord *) 0x7f339c7b6110
(gdb) p (struct entityRecord) *0x7f339c7b6110
$7 = {name = 0x7f33543bfb9c "FACT_JN_VAR1", type = 57, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>, longPrintFunction = 0x7f3354344e40 <PrintFactJNGetVar1>,
deleteFunction = 0x0, evaluateFunction = 0x7f335435bd70 <FactJNGetVar1>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
(gdb) ptype entityRecord
(gdb) p ((struct evaluationData) *0x7f339c6ab930).PrimitivesArray[58]
$8 = (struct entityRecord *) 0x7f339c7b6180
(gdb) p (struct entityRecord) *0x7f339c7b6180
$9 = {name = 0x7f33543bfba9 "FACT_JN_VAR2", type = 58, copyToEvaluate = 0, bitMap = 1, addsToRuleComplexity = 0, shortPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>, longPrintFunction = 0x7f3354344e50 <PrintFactJNGetVar2>,
deleteFunction = 0x0, evaluateFunction = 0x7f335435b3e0 <FactJNGetVar2>, getNextFunction = 0x0, decrementBusyCount = 0x0, incrementBusyCount = 0x0, propagateDepth = 0x0, markNeeded = 0x0, install = 0x0, deinstall = 0x0, usrData = 0x0}
If you're using CLIPS 6.3, but the stack trace is showing functions from 6.24, you should probably try to figure that out first.
Then, if you haven't done so already, download the existing patches for 6.3 from https://sourceforge.net/p/clipsrules/code/HEAD/tree/branches/63x/core/ and see if your issue is still present.
I doubt that it's an issue specific to the neq function. The FACT_JN_VAR1 and FACT_JN_VAR2 are primitives to retrieve values from facts during pattern matching. Sometimes this can be caused by an incorrectly computed index to retrieve the value from a fact and is usually associated with a rule having a complex set of conditions with nested conditional elements.
Since the fault is occurring when a wm-fact is asserted, potentially any rule with a pattern that matches that fact and has a neq call in its conditions could be the cause. Sometimes you can isolate the issue by running your system right up to the last rule fired, saving the facts, clearing and reloading the rules, loading the saved facts, and letting the last rule fire. If you still get a crash, then you can pare down the facts and rules until you have a smaller set of code to debug.
The two other possibilities that come to mind are that either CLIPS is erroneously garbage collecting something prematurely or that memory is being corrupted some other way. That's harder to debug because the actual problem usually happens long before the crash. When I've had to debug these types of issues before, the first step was also trying to pare down the code so that I could get a better idea of what was causing the issue.
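One detail visible in the trace itself: frame 0 receives value=0x25, and the item local in frame 3 holds the same value, which is far too small to look like the heap addresses everywhere else in the backtrace. A throwaway helper like the one below can scan a saved backtrace for such values. It is only a sketch of my own (not part of CLIPS or gdb); the function name and the one-page threshold are assumptions, and the sample lines are shortened copies of frames from the question.

```python
import re

# Heuristic threshold (an assumption): non-null "pointers" below one 4 KiB
# page are very unlikely to be valid user-space addresses on Linux.
PAGE_SIZE = 0x1000

FRAME_RE = re.compile(r"^#(\d+)\s")
# Matches name=0x... and name@entry=0x... arguments in gdb frame lines.
ARG_RE = re.compile(r"(\w+)(?:@entry)?=(0x[0-9a-fA-F]+)\b")


def suspicious_args(backtrace: str):
    """Return (frame, name, value) for small non-null hex arguments."""
    hits = []
    frame = None
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frame = int(match.group(1))
        for name, text in ARG_RE.findall(line):
            value = int(text, 16)
            if 0 < value < PAGE_SIZE:
                hits.append((frame, name, value))
    return hits


# Frames taken from the backtrace above, with source paths shortened.
sample = """\
#0 0x00007f3354394b1e in PropagateReturnAtom (value=0x25, type=0, theEnv=0x7f339c6e0970) at evaluatn.c:784
#3 0x00007f335436a839 in NeqFunction (theEnv=0x7f339c6e0970) at prdctfun.c:167
#11 0x00007f3354340d8c in ProcessMultifieldNode (theEnv=theEnv@entry=0x7f339c6e0970, markers=0x0, endMark=0x0, offset=0) at factmch.c:420
"""

print(suspicious_args(sample))
```

Null pointers are skipped on purpose, since 0x0 is a legitimate argument in several frames; only the 0x25 in frame 0 is reported for this sample.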
I am trying to debug my assembly code and check what values are in the Advanced SIMD vector registers. To this end, I run gdb, set a breakpoint inside my instructions, run layout asm, and step through the instructions using si. However, when I reach the instruction I am interested in, p v16, for example, does not print the value of the register and instead gives me the following error:
│0x4009d0 <Montmul512+80> umull2 v16.2d, v15.4s, v7.s[3] │
>│0x4009d4 <Montmul512+84> umull2 v17.2d, v13.4s, v7.s[3] │
│0x4009d8 <Montmul512+88> umull2 v18.2d, v14.4s, v7.s[3] │
│0x4009dc <Montmul512+92> umull2 v19.2d, v12.4s, v7.s[3] │
│0x4009e0 <Montmul512+96> umull v20.2d, v15.2s, v7.s[3] │
│0x4009e4 <Montmul512+100> umull v21.2d, v13.2s, v7.s[3] │
│0x4009e8 <Montmul512+104> umull v22.2d, v14.2s, v7.s[3] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print v16
No symbol "v16" in current context.
I don't have any experience with debugging assembly code, so this question may seem very simple to many folks.
In gdb, registers are referenced with a leading $, so use p $v16 rather than p v16. You can print vector registers like in the examples below.
(gdb) p $v0
$101 = {d = {f = {1.2672947890318689e-279, 7.7486181465248912e-304}, u = {434317018741670663, 72340181461566213}, s = {434317018741670663, 72340181461566213}}, s = {
f = {2.42644275e-35, 2.53914328e-35, 3.79131591e-37, 2.36942839e-38}, u = {100729607, 101122311, 50397957, 16843011}, s = {100729607, 101122311, 50397957,
16843011}}, h = {u = {775, 1537, 263, 1543, 773, 769, 259, 257}, s = {775, 1537, 263, 1543, 773, 769, 259, 257}}, b = {u = {7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3,
3, 1, 1, 1}, s = {7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 1, 1, 1}}, q = {u = {0x01010103030103050607010706010307}, s = {0x01010103030103050607010706010307}}}
Print different lanes/elements:
(gdb) p $v0.q
$102 = {u = {0x01010103030103050607010706010307}, s = {0x01010103030103050607010706010307}}
(gdb) p $v0.d
$103 = {f = {1.2672947890318689e-279, 7.7486181465248912e-304}, u = {434317018741670663, 72340181461566213}, s = {434317018741670663, 72340181461566213}}
(gdb) p $v0.s
$104 = {f = {2.42644275e-35, 2.53914328e-35, 3.79131591e-37, 2.36942839e-38}, u = {100729607, 101122311, 50397957, 16843011}, s = {100729607, 101122311, 50397957,
16843011}}
(gdb) p $v0.q.s
$105 = {0x01010103030103050607010706010307}
(gdb) p $v0.d.s
$106 = {434317018741670663, 72340181461566213}
(gdb) p $v0.d.s[1]
$107 = 72340181461566213
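To see how the b, h, s, d, and q views relate to each other, here is a small sketch that splits the 128-bit value printed for $v0.q.u above into little-endian lanes. The helper name lanes is mine, and it only reproduces the unsigned integer views, not the floating-point ones.

```python
# The 128-bit value gdb printed for $v0.q.u in the session above.
Q = 0x01010103030103050607010706010307


def lanes(value: int, lane_bytes: int, total_bytes: int = 16):
    """Split a register value into unsigned little-endian lanes."""
    raw = value.to_bytes(total_bytes, "little")
    return [
        int.from_bytes(raw[i:i + lane_bytes], "little")
        for i in range(0, total_bytes, lane_bytes)
    ]


print(lanes(Q, 1))  # byte lanes, compare with $v0.b.u
print(lanes(Q, 2))  # halfword lanes, compare with $v0.h.u
print(lanes(Q, 4))  # word lanes, compare with $v0.s.u
print(lanes(Q, 8))  # doubleword lanes, compare with $v0.d.u
```

Lane 0 is always the least significant part of the register, which is why $v0.d.s[1] above is the upper half of the q value.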
In my experience, using the -tui, layout asm, and layout reg options tends to be crowded if you don't have a very large monitor. So if you run the commands below in gdb, you'll have a hard time seeing all the SIMD registers. I tend to use abbreviations since I'm lazy; gdb will let you know when it doesn't understand which command you want.
(gdb) wh reg +1
(gdb) tu reg next
Try info vector for all Advanced SIMD registers (printed in various layouts), or info all-registers v16 for just the contents of v16.
ARMv7
And this is the ARMv7 behavior analogous to the ARMv8 mentioned at: https://stackoverflow.com/a/38538116/9160762 with QEMU v3.0.0 built from source user mode + GDB 8.2 Ubuntu 16.04.
After loading:
1.5, 2.5, 3.5, 4.5
into q0, we have:
(gdb) p $q0
$3 = {
u8 = {[0] = 0, [1] = 0, [2] = 192, [3] = 63, [4] = 0, [5] = 0, [6] = 32, [7] = 64, [8] = 0, [9] = 0, [10] = 96, [11] = 64, [12] = 0, [13] = 0, [14] = 144, [15] = 64},
u16 = {[0] = 0, [1] = 16320, [2] = 0, [3] = 16416, [4] = 0, [5] = 16480, [6] = 0, [7] = 16528},
u32 = {[0] = 1069547520, [1] = 1075838976, [2] = 1080033280, [3] = 1083179008},
u64 = {[0] = 4620693218751676416, [1] = 4652218416153755648},
f32 = {[0] = 1.5, [1] = 2.5, [2] = 3.5, [3] = 4.5},
f64 = {[0] = 8.0000018998980522, [1] = 1024.0002455711365}
}
and:
(gdb) p $q0.f32
$5 = {[0] = 1.5, [1] = 2.5, [2] = 3.5, [3] = 4.5}
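The different views are just reinterpretations of the same 16 bytes. As a rough cross-check outside gdb, the sketch below packs the four floats with Python's struct module, assuming little-endian layout as in this setup, and unpacks them again under each element size.

```python
import struct

# Pack the four single-precision values loaded into q0, little-endian.
raw = struct.pack("<4f", 1.5, 2.5, 3.5, 4.5)

# Reinterpret the same 16 bytes under each element type gdb shows.
views = {
    "u8": list(raw),
    "u16": list(struct.unpack("<8H", raw)),
    "u32": list(struct.unpack("<4I", raw)),
    "u64": list(struct.unpack("<2Q", raw)),
    "f32": list(struct.unpack("<4f", raw)),
    "f64": list(struct.unpack("<2d", raw)),
}

for name, values in views.items():
    print(name, values)
```

Only the f32 view is meaningful for this data; the f64 values are what you get when two adjacent floats are read as one double.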
Test setup.
Bug with info register
When I try to use info vector or info register in this setup (v7 or v8) as mentioned at https://stackoverflow.com/a/35552000/9160762 , there seems to be a bug where the floating point representation gets converted to integer, see: https://reverseengineering.stackexchange.com/questions/8992/floating-point-registers-on-arm/20623#20623
SVE
Not yet implemented on QEMU, see: How to assemble ARM SVE instructions with GNU GAS or LLVM and run it on QEMU?
It seems that there are six variations of the CBC-MAC algorithm. I've been trying to match the MAC algorithm on the PINPad 1000SE [which per the manual is ISO 9797-1 Algorithm 1].
I got an excellent start from here.
And I coded the algorithm as below:
public static byte[] CalculateMAC(this IPinPad pinpad, byte[] message, byte[] key)
{
    // Divide the key into key1 [first 64 bits] and key2 [last 64 bits]
    var key1 = new byte[8];
    Array.Copy(key, 0, key1, 0, 8);
    var key2 = new byte[8];
    Array.Copy(key, 8, key2, 0, 8); // 64 bits

    // Divide the message into 8-byte blocks.
    // Pad the last block with "80" and "00","00","00" until it reaches 8 bytes.
    // If the message length is already divisible by 8, then add
    // another block "80 00 00 00 00 00 00 00".
    Action<byte[], int> prepArray = (bArr, offset) =>
    {
        bArr[offset] = 0; //80
        for (var i = offset + 1; i < bArr.Length; i++)
            bArr[i] = 0;
    };
    var length = message.Length;
    var mod = length > 8 ? length % 8 : length - 8;
    var newLength = length + ((mod < 0) ? -mod : (mod > 0) ? 8 - mod : 0);
    //var newLength = length + ((mod < 0) ? -mod : (mod > 0) ? 8 - mod : 8);
    Debug.Assert(newLength % 8 == 0);
    var arr = new byte[newLength];
    Array.Copy(message, 0, arr, 0, length);
    //Encoding.ASCII.GetBytes(message, 0, length, arr, 0);
    prepArray(arr, length);

    // Use initial vector {0,0,0,0,0,0,0,0}
    var vector = new byte[] { 0, 0, 0, 0, 0, 0, 0, 0 };

    // Encrypt with the DES CBC algorithm using the first key KEY1
    var des = new DESCryptoServiceProvider { Mode = CipherMode.CBC };
    var cryptor = des.CreateEncryptor(key1, vector);
    var outputBuffer = new byte[arr.Length];
    cryptor.TransformBlock(arr, 0, arr.Length, outputBuffer, 0);

    // Decrypt the result by DES ECB with the second key KEY2 [original suggestion]
    // Now I'm encrypting
    var decOutputBuffer = new byte[outputBuffer.Length];
    des.Mode = CipherMode.ECB;
    var decryptor = des.CreateEncryptor(key2, vector);
    //var decryptor = des.CreateDecryptor(key2, vector);
    decryptor.TransformBlock(outputBuffer, 0, outputBuffer.Length, decOutputBuffer, 0);

    // Encrypt the result by DES ECB with the first key KEY1
    var finalOutputBuffer = new byte[decOutputBuffer.Length];
    var cryptor2 = des.CreateEncryptor(key1, vector);
    cryptor2.TransformBlock(decOutputBuffer, 0, decOutputBuffer.Length, finalOutputBuffer, 0);

    // Take the first 4 bytes as the MAC
    var rval = new byte[4];
    Array.Copy(finalOutputBuffer, 0, rval, 0, 4);
    return rval;
}
Then I discovered that there are three padding schemes, and the one I started from may not necessarily be the right one. The manual came to my rescue again: it seems the device only pads with 0s. An additional block is not mentioned anywhere either, so I made the changes below:
Action<byte[], int> prepArray = (bArr, offset) =>
{
bArr[offset] = 0; ... }
No additional block (if mod 0 [divisible by 8] do not change array length)
var newLength = length + ((mod < 0) ? -mod : (mod > 0) ? 8 - mod : 0);
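For comparison, the two padding rules discussed here can be written down compactly. The following is a language-neutral sketch in Python, not the device's implementation: the function names are mine, and I am assuming the usual ISO/IEC 9797-1 numbering (method 1 pads with zero bytes only, method 2 appends 0x80 and then zero bytes).

```python
BLOCK = 8  # DES block size in bytes


def pad_zero(message: bytes) -> bytes:
    """Method 1: append 0x00 bytes up to a multiple of BLOCK.

    A non-empty message that is already aligned is returned unchanged,
    so no additional block is added.
    """
    remainder = len(message) % BLOCK
    if remainder == 0 and message:
        return message
    return message + b"\x00" * (BLOCK - remainder)


def pad_iso7816(message: bytes) -> bytes:
    """Method 2: append 0x80, then 0x00 bytes up to a multiple of BLOCK.

    This always adds at least one byte, so an aligned message grows by
    one full block.
    """
    padded = message + b"\x80"
    return padded + b"\x00" * (-len(padded) % BLOCK)


# The message from the question: 0x1a followed by the ASCII text.
msg = b"\x1a" + b"MENTERODOMETER"

print(len(msg), pad_zero(msg).hex())
print(len(msg), pad_iso7816(msg).hex())
```

For this 15-byte message both rules add exactly one byte, so the only difference between them is whether the last byte is 0x00 or 0x80.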
The original suggestion wanted me to decrypt at the second step, but Valery here suggests that it is encryption all the way, so I changed Decrypt to Encrypt. Still, I am unable to get the required MAC.
The manual says that for the key "6AC292FAA1315B4D8234B3A3D7D5933A" the MAC should be 7B,40,BA,95 [4 bytes] if the message is [0x1a + byte array of MENTERODOMETER]. Since the key should be 16 bytes, I figured the key is given as a hex string, so I took the byte values 6A, C2, 92, FA, ... as
new byte[] { 106, 194, 146, ... }.
Can someone help? Please?
Since Pinpad requires that the first character in message is a 0x1a...
public static byte[] CalculateAugmentedMAC(this IPinPad pinpad, string message, byte[] key)
{
    var arr = new byte[message.Length + 1];
    var source = Encoding.ASCII.GetBytes(message);
    arr[0] = 0x1a; //ClearScreenIndicator
    Array.Copy(source, 0, arr, 1, source.Length);
    return CalculateMAC(pinpad, arr, key);
}
I'm calling the code above with this input:
var result = pad.CalculateAugmentedMAC("MENTERODOMETER", new byte[] { 106, 194, 146, 250, 161, 49, 91, 77, 130, 52, 179, 163, 215, 213, 147, 58 });
Most CBC MAC algorithms are implemented in BouncyCastle's JCE provider.
Look at: BouncyCastleProvider.java
You're probably looking for DESEDEISO9797ALG1MACWITHISO7816-4PADDING, which is an alias for DESEDEMAC64WITHISO7816-4PADDING, implemented here (well, it's a specific configuration of CBCBlockCipherMac using the DESedeEngine and ISO7816d4Padding, you'll have to jump between some classes to get the full picture):
JCEMac.java
Also, have a look at jPos:
JCESecurityModule.java
and their contributed retail MAC algorithm implementation:
retail-mac-contributed-by-vsalaman.zip
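To make the structure behind those classes easier to follow, here is a minimal sketch of plain CBC-MAC chaining: each block is XORed with the previous cipher output, encrypted, and the MAC is a truncation of the final output. This is not BouncyCastle's or jPos's code. The block function below is a deliberately trivial stand-in (an assumption for illustration only, neither DES nor secure), so the sketch will not reproduce the MAC from the manual; a real implementation would plug in DES or Triple DES and whichever padding the device uses.

```python
from typing import Callable

BLOCK = 8  # block size in bytes, as for DES


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a block cipher (NOT DES, NOT secure):
    XOR with the key, then rotate left by one byte."""
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]


def cbc_mac(
    key: bytes,
    message: bytes,
    encrypt: Callable[[bytes, bytes], bytes] = toy_block_encrypt,
    mac_len: int = 4,
) -> bytes:
    """CBC-MAC over an already padded message with an all-zero IV."""
    if len(message) % BLOCK != 0:
        raise ValueError("message must be padded to a multiple of BLOCK")
    state = bytes(BLOCK)  # all-zero initial vector
    for i in range(0, len(message), BLOCK):
        block = message[i:i + BLOCK]
        # Chain: XOR the block into the running state, then encrypt.
        state = encrypt(key, bytes(s ^ b for s, b in zip(state, block)))
    # Only the final state matters; truncate it to the MAC length.
    return state[:mac_len]


print(cbc_mac(bytes(8), b"ABCDEFGH").hex())
```

Note that only the last cipher output contributes to the MAC, which is the same reason the final block handling matters when this is done with a streaming encryptor.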
I am pretty sure (IIRC) that you need to call TransformFinalBlock at the end (per encryptor).
I can't speak to your specific terminal, but I use this to test MACs.
public static byte[] GenerateMAC(byte[] key, byte[] data)
{
    using (MACTripleDES mac = new MACTripleDES(key))
        return mac.ComputeHash(data);
}