So basically I am trying to use some Prolog code to simulate pointer-like behavior.
I asked a related question here, and after around one month, I finally have time to start.
Here is a simple example in C:
int a = 1;
int* p = &a;
int b = *p;
And I want to translate this code into Prolog like this (or other better strategies?):
A is 1,
assert(ref(p, a)), <- this can be dynamic fact generation
ref(p, TEMP), <- now I want to use a!!
to_lowercase(TEMP, TEMP1), <- I don't know how to implement to_lowercase
B is TEMP1. <- reflection?
So in the above code, I am confused about two things:
In my understanding, after ref(p, TEMP), TEMP will be equal to "a", which is just a string. How can I then reuse it as a variable name? It sounds like reflection...?
How can I implement the to_lowercase function?
Am I clear?
If you are really that determined to simulate a computer from within Prolog, you should take into account the answers to your previous questions before moving on. Either way, this answer makes a lot of assumptions about what your ultimate goal is. I am guessing that you are trying to simulate a machine, and write a simulator that takes source code written in a C-style language and executes it.
So let's say you have a very simple processor with flat memory space (some small embedded microcontrollers are like this). Your whole memory then would be just one chunk of 16-bit addresses, let's say 1000 of them:
functor(Memory, memory, 1000).
Taking your C code above, a compiler might come up with:
Pick an address Addr1 for a, which is an int, and write the value 1 at that address
Pick an address Addr2 for p, which is an int *, and write the value of the address of a at that address
Pick an address Addr3 for b, which is an int, and write to it the value which is at the memory to which the value in p is pointing to.
This could translate to machine code for the machine you are simulating (assuming the actual addresses have been already picked appropriately by the compiler):
arg(Addr1, Memory, 1), % int a = 1;
arg(Addr2, Memory, Addr1), % int *p = &a;
arg(Addr2, Memory, Tmp1), %% put the value at p in Tmp1; this is an address
arg(Tmp1, Memory, Tmp2), %% read the value at the address Tmp1 into Tmp2
arg(Addr3, Memory, Tmp2). % int b = *p;
So of course all addresses should be within your memory space. Some of the calls to arg/3 above are reads and some are writes; it should be clear which is which. Note that in this particular conjunction the three arguments of the memory/1000 term that are written to (at Addr1, Addr2 and Addr3) are still free variables at the point of the write. To modify the value at an address that has already been set, you would need to copy the term, leaving free the argument you need to reuse.
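For readers who find the arg/3 version hard to follow, here is a minimal sketch of the same read/write sequence in Python, where a list stands in for the memory term. The addresses 10, 11 and 12 and the name run_example are made up for illustration; unlike the Prolog term, the list is simply overwritten in place.

```python
# Hypothetical flat memory: a list of cells, addresses are list indices.
MEMORY_SIZE = 1000

def run_example():
    memory = [None] * MEMORY_SIZE        # None marks a cell never written
    addr_a, addr_p, addr_b = 10, 11, 12  # assumed to be chosen by the compiler

    memory[addr_a] = 1                   # int a = 1;
    memory[addr_p] = addr_a              # int *p = &a;
    tmp1 = memory[addr_p]                # read the value of p (an address)
    tmp2 = memory[tmp1]                  # read the value stored at that address
    memory[addr_b] = tmp2                # int b = *p;
    return memory[addr_a], memory[addr_p], memory[addr_b]
```

Calling run_example() gives back the contents of a, p and b, so you can check that b ends up with the value of a and that p holds the address of a.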
Please read carefully all the answers to your questions before pressing on.
You need to read a good book on Prolog. I'd suggest The Art of Prolog.
Prolog doesn't have anything like pointers, or addresses or variables. It's got terms. An unbound term is variable because it's not yet bound. Once bound (unified), it ceases to be variable and it becomes that with which it unified. It cannot be assigned a new value — unless the unification is undone via backtracking and an alternative path taken. Hence the term unification.
Trying to map the concept of pointers and memory addresses onto Prolog is somewhat akin to putting fish gills on a bicycle.
As far as implementing a predicate for converting a string to lower-case, you should realize that Prolog doesn't really have strings: the Prolog string "ABC" is exactly identical to the list [65,66,67], a list of integers representing ASCII/Unicode code points. It is what is called syntactic sugar. So...given that identity...
Something like
to_lower( [] , [] ).
to_lower( [C|Cs] , [L|Ls] ) :-
  C >= 0'A ,             % 0'A is the character code of 'A'
  C =< 0'Z ,
  ! ,
  Offset is C - 0'A ,
  L is 0'a + Offset ,
  to_lower(Cs,Ls).
to_lower( [C|Cs] , [C|Ls] ) :-
  to_lower(Cs,Ls).
should do.
Since you tagged the question SWI-Prolog, I assume you are aware that the string concept has undergone some important changes in recent versions, mainly for efficiency reasons.
Look at downcase_atom/2, or string_lower/2, depending on your intended usage (I linked to string processing page, because the string_lower one has a typo).
For storing 'pointer'-like objects, I suggest using global variables (nb_setval/2, nb_getval/2, nb_current/2) instead of assert/retract. First, they are much more efficient (some time ago I measured a factor of 3 in favour of the nb_ predicate family), and they make the intended usage clearer. assert/retract are better used to update a dynamic knowledge base.
Here is a section of Prolog code defining numeral in a recursive way:
numeral(0).
numeral(succ(X)) :- numeral(X).
When given query numeral(X). Prolog will return:
X = 0 ;
X = succ(0) ;
X = succ(succ(0)) ;
X = succ(succ(succ(0))) ;
X = succ(succ(succ(succ(0)))) ;
X = succ(succ(succ(succ(succ(0))))) ;
X = succ(succ(succ(succ(succ(succ(0)))))) ;
X = succ(succ(succ(succ(succ(succ(succ(0))))))) ;
X = succ(succ(succ(succ(succ(succ(succ(succ(0))))))))
Based on what I have learned, when running the query, Prolog will first turn X into a variable like _G42, then search the facts and rules to find a match.
In this case, it will find 0 (a fact) as a match. Then it will also try to match the rule, that is, consider that _G42 is not 0 but the succ of another number. Thus another variable is generated (like _G44); _G44 will match 0 and will also go further like _G42. Since _G44 matches 0, it will go back to _G42, giving _G42 = succ(_G44) = succ(0).
I am not sure if my understanding is right. I made a diagram to show my comprehension of this problem.
If the analysis is correct, I still find it difficult to design a recursive definition like this. Since I am new to Prolog, I want to know whether this kind of definition is commonly used in applications (say, building an expert system or verifying protocols) or whether it is just for beginners to better understand the basic search procedure. If it is often used, what is the key to designing this kind of recursive definition?
My personal opinion: Especially as a beginner, you have zero chance to "understand the recursive search in Prolog". Countless beginners are trying to understand Prolog in this way, and they very consistently fail.
The sad part is that this hits the hardest workers the hardest: You always think you can somehow understand it, but in the end, you cannot, because there are too many ways to invoke even the simplest predicates, with uninstantiated and (partly) instantiated arguments, and even with aliased variables.
Your graph nicely illustrates that such a procedural reading gets extremely unwieldy very quickly for even the simplest conceivable recursive definitions.
A much more tractable approach for understanding the predicate is to read it declaratively:
0 is a numeral
If X is a numeral (whatever X is!), then succ(X) is also a numeral.
Note that :- means ←, i.e., an implication from right to left.
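To make the two declarative statements concrete outside Prolog, here is a rough Python sketch. It assumes a made-up encoding in which succ(X) is the tuple ("succ", X); the names is_numeral and numerals are illustrative only. Note that Python needs two separate functions for checking and for enumerating, whereas the single Prolog definition serves both uses.

```python
from itertools import islice

def is_numeral(term):
    # Statement 1: 0 is a numeral.
    if term == 0:
        return True
    # Statement 2: if X is a numeral, then succ(X) is a numeral.
    if isinstance(term, tuple) and len(term) == 2 and term[0] == "succ":
        return is_numeral(term[1])
    return False

def numerals():
    # Enumerate numerals in the same order as the query numeral(X).
    term = 0
    while True:
        yield term
        term = ("succ", term)
```

For example, list(islice(numerals(), 3)) yields the first three numerals, and every term produced by numerals() satisfies is_numeral.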
My recommendation is to focus on a clear declarative description of what ought to hold. To overcome the initial barriers with Prolog, you must let go of the idea that you can trace the steps that the CPU performs in the extreme detail in which you are currently trying to follow it. Prolog is too high-level to be amenable to tracing in this low-level way. It is like trying to interpret between French and English by tracing only the neuronal activities of the speakers.
Write a clear definition and then leave the search to Prolog. There are many other and working ways to understand and break down declarative definitions without getting swamped in low-level details. See for example program-slicing and failure-slicing. They work as long as you stay in the so-called pure monotonic subset of Prolog. Focus on this area, and you will be able to make very fast progress.
I know not to use them, but there are techniques to swap two variables without using a third, such as
x ^= y;
y ^= x;
x ^= y;
and
x = x + y;
y = x - y;
x = x - y;
In class the prof mentioned that these were popular 20 years ago when memory was very limited and are still used in high-performance applications today. Is this true? My understanding as to why it's pointless to use such techniques is that:
It can never be the bottleneck using the third variable.
The optimizer does this anyway.
So is there ever a good time to not swap with a third variable? Is it ever faster?
Compared to each other, is the method that uses XOR vs the method that uses +/- faster? Most architectures have a unit for addition/subtraction and XOR so wouldn't that mean they are all the same speed? Or just because a CPU has a unit for the operation doesn't mean they're all the same speed?
These techniques are still important to know for the programmers who write the firmware of your average washing machine or so. Lots of that kind of hardware still runs on Z80 CPUs or similar, often with no more than 4K of memory or so. Outside of that scene, knowing these kinds of algorithmic "trickery" has, as you say, as good as no real practical use.
(I do want to remark though that nonetheless, the programmers who remember and know this kind of stuff often turn out to be better programmers even for "regular" applications than their "peers" who won't bother. Precisely because the latter often take that attitude of "memory is big enough anyway" too far.)
There's no point to it at all. It is an attempt to demonstrate cleverness. Considering that it doesn't work in many cases (floating point, pointers, structs), is unreadable, and uses three dependent operations which will be much slower than just exchanging the values, it's absolutely pointless and demonstrates a failure to actually be clever.
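As a quick way to experiment with both variants, here is a small Python sketch. The variables live in a list (cells) so that the two operands can be made to refer to the same storage; the function names are made up for this example. It also shows one more case where the trick breaks that is not mentioned above: swapping a location with itself zeroes it.

```python
def xor_swap(cells, i, j):
    # Three dependent XOR operations, no temporary.
    cells[i] ^= cells[j]
    cells[j] ^= cells[i]
    cells[i] ^= cells[j]

def add_sub_swap(cells, i, j):
    # Same idea with +/-; in C the addition could overflow,
    # Python integers are unbounded so that cannot happen here.
    cells[i] = cells[i] + cells[j]
    cells[j] = cells[i] - cells[j]
    cells[i] = cells[i] - cells[j]
```

With cells = [5, 9], either function exchanges the two entries. With i == j, xor_swap leaves 0 in the cell instead of the original value, which is one reason a plain temporary is the safer default.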
You are right, if it was faster, then optimising compilers would detect the pattern when two numbers are exchanged, and replace it. It's easy enough to do. But compilers do actually notice when you exchange two variables and may produce no code at all, but start using the different variables after that. For example if you exchange x and y, then write a += x; b += y; the compiler may just change this to a += y; b += x; . The xor or add/subtract pattern on the other hand will not be recognised because it is so rare and won't get improved.
Yes, there is, especially in assembly code.
Processors have only a limited number of registers. When the registers are pretty full, this trick can avoid spilling a register to another memory location (possibly in an unfetched cacheline).
I've actually used the 3-way xor to swap a register with a memory location in the critical path of high-performance hand-coded lock routines for x86 where the register pressure was high, and there was no (lock safe!) place to put the temp. (On the x86, it is useful to know that the XCHG instruction with a memory operand has a high cost associated with it, because it includes its own lock, whose effect I did not want. Given that the x86 has a LOCK prefix opcode, this was really unnecessary, but historical mistakes are just that.)
Moral: every solution, no matter how ugly looking when standing in isolation, likely has some uses. It's good to know them; you can always not use them if inappropriate. And where they are useful, they can be very effective.
Such a construct can be useful on many members of the PIC series of microcontrollers which require that almost all operations go through a single accumulator ("working register") [note that while this can sometimes be a hindrance, the fact that it's only necessary for each instruction to encode one register address and a destination bit, rather than two register addresses, makes it possible for the PIC to have a much larger working set than other microcontrollers].
If the working register holds a value and it's necessary to swap its contents with those of RAM, the alternative to:
xorwf other,w ; w=(w ^ other)
xorwf other,f ; other=(w ^ other)
xorwf other,w ; w=(w ^ other)
would be
movwf temp1 ; temp1 = w
movf other,w ; w = other
movwf temp2 ; temp2 = w
movf temp1,w ; w = temp1 [old w]
movwf other ; other = w
movf temp2,w ; w = temp2 [old other]
Three instructions and no extra storage, versus six instructions and two extra registers.
Incidentally, another trick which can be helpful in cases where one wishes to make another register hold the maximum of its present value or W, and the value of W will not be needed afterward is
subwf other,w ; w = other-w
btfss STATUS,C ; Skip next instruction if carry set (other >= W)
subwf other,f ; other = other-w [i.e. other-(other-oldW), i.e. old W]
I'm not sure how many other processors have a subtract instruction but no non-destructive compare, but on such processors that trick can be a good one to know.
These tricks are not very likely to be useful if you want to exchange two whole words in memory or two whole registers. Still, you could take advantage of them if you have no free registers (or only one free register for a memory-to-memory swap) and either there is no "exchange" instruction available (as when swapping two SSE registers on x86) or the "exchange" instruction is too expensive (like register-memory xchg on x86), and it is not possible to avoid the exchange or lower the register pressure.
But if your variables are two bitfields in single word, a modification of 3-XOR approach may be a good idea:
y = (x ^ (x >> d)) & mask
x = x ^ y ^ (y << d)
This snippet is from Knuth's "The art of computer programming" vol. 4a. sec. 7.1.3. Here y is just a temporary variable. Both bitfields to exchange are in x. mask is used to select a bitfield, d is distance between bitfields.
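Here is a small Python sketch of the two-line bitfield swap above, in case you want to try it on concrete values. The function name swap_bitfields is made up; mask is assumed to select the lower of the two fields, and the fields are assumed not to overlap.

```python
def swap_bitfields(x, mask, d):
    # y holds the XOR of the two fields, aligned with the lower field.
    y = (x ^ (x >> d)) & mask
    # XOR-ing y into both positions exchanges the fields;
    # all other bits of x are left unchanged.
    return x ^ y ^ (y << d)
```

For example, with two 4-bit fields (mask 0x0F, distance 4), swap_bitfields(0xA3, 0x0F, 4) exchanges the nibbles and returns 0x3A.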
Also you could use tricks like this in hardness proofs (to preserve planarity). See for example crossover gadget from this slide (page 7). This is from recent lectures in "Algorithmic Lower Bounds" by prof. Erik Demaine.
Of course it is still useful to know. What is the alternative?
c = a
a = b
b = c
three operations with three resources rather than three operations with two resources?
Sure the instruction set may have an exchange but that only comes into play if you are 1) writing assembly or 2) the optimizer figures this out as a swap and then encodes that instruction. Or you could do inline assembly but that is not portable and a pain to maintain, if you called an asm function then the compiler has to setup for the call burning a bunch more resources and instructions. Although it can be done you are not as likely to actually exploit the instruction sets feature unless the language has a swap operation.
Now the average programmer doesn't NEED to know this now any more than back in the day. Folks will bash this kind of premature optimization, and unless you know the trick and use it often, undocumented code using it is not obvious, so it is bad programming because it is unreadable and unmaintainable.
It is still a valuable programming education exercise, for example to have one invent a test to prove that it actually swaps for all combinations of bit patterns. And just like doing an xor reg,reg on an x86 to zero a register, it has a small but real performance boost for highly optimized code.
Sometimes the value of a variable accessed within the control-flow of a program cannot possibly have any effect on its output. For example:
global var_1
global var_2
start program hello(var_3, var_4)
    if (var_2 < 0) then
        save-log-to-disk (var_1, var_3, var_4)
    end if
    return ("Hello " + var_3 + ", my name is " + var_1)
end program
Here only var_1 and var_3 have any influence on the output, while var_2 and var_4 are only used for side effects.
Do variables such as var_1 and var_3 have a name in dataflow-theory/compiler-theory?
Which static dataflow analysis techniques can be used to discover them?
References to academic literature on the subject would be particularly appreciated.
The problem that you stated is undecidable in general,
even for the following very narrow special case:
Given a single routine P(x), where x is a parameter of type integer. Is the output of P(x) independent of the value of x, i.e., does
P(0) = P(1) = P(2) = ...?
We can reduce the following still undecidable version of the halting problem to the question above: Given a Turing machine M(), does M() never stop on the empty input?
I assume that we use a (Turing-complete) language in which we can build a "Turing machine simulator":
Given the program M(), construct this routine:
P(x):
    if x == 0:
        return 0
    Run M() for x steps
    if M() has terminated then:
        return 1
    else:
        return 0
Now:
P(0) = P(1) = P(2) = ... <=> M() does not terminate.
Proof sketch:
M() does terminate
=> P(x) = 1 for a sufficiently large x
=> P(x) != P(0) = 0
Conversely, if M() does not terminate, then P(x) = 0 for every x.
So, it is very difficult for a compiler to decide whether a variable actually does not influence the return value of a routine; in your example, the "side effect routine" might manipulate one of its values (or even loop infinitely, which would most definitely change the return value of the routine ;-)
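To see the construction in action, here is a rough Python sketch in which a "machine" is modelled as a zero-argument generator function and every yield counts as one step. That modelling, and the names make_p, halts_after_three and loops_forever, are assumptions made for this illustration only.

```python
def make_p(machine):
    """Build the routine P(x) for a machine given as a generator function."""
    def p(x):
        if x == 0:
            return 0
        run = machine()
        for _ in range(x):
            try:
                next(run)            # simulate one step of M()
            except StopIteration:
                return 1             # M() terminated within x steps
        return 0                     # still running after x steps
    return p

def halts_after_three():
    for _ in range(3):
        yield

def loops_forever():
    while True:
        yield
```

For halts_after_three, P(x) switches from 0 to 1 once x is large enough, so P depends on x. For loops_forever, P(x) is 0 for every x you try, but no finite amount of testing can confirm that it stays 0, which is exactly why the question is undecidable.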
Of course overapproximations are still possible. For example, one might conclude that a variable does not influence the return value if it does not appear in the routine body at all. You can also see some classical compiler analyses (like Expression Simplification, Constant propagation) having the side effect of eliminating appearances of such redundant variables.
Pachelbel has discussed the fact that you cannot do this perfectly. OK, I'm an engineer, I'm willing to accept some dirt in my answer.
The classic way to answer your question is to do dataflow tracing from program outputs back to program inputs. A dataflow is the connection of a program assignment (or side effect) to a variable value, to a place in the application that consumes that value.
If there is (transitive) dataflow from a program output that you care about (in your example, the printed text stream) to an input you supplied (var_2), then that input "affects" the output. A variable that does not flow from the input to your desired output is useless from your point of view.
If you focus your attention only on the computations involved in the dataflows, and display them, you get what is generally called a "program slice". There are (very few) commercial tools that can show this to you.
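As a very rough illustration of tracing backwards from an output, here is a Python sketch for straight-line code only. The statement encoding (a pair of the defined variable, or None for a pure side effect, and the set of variables read) and the name relevant_variables are assumptions for this example; it ignores control dependence, loops, aliasing and everything else a real slicer has to handle.

```python
def relevant_variables(statements, outputs):
    """Return the inputs that can flow into `outputs`.

    statements: list of (defined_var_or_None, used_vars) in program order.
    """
    live = set(outputs)
    # Walk backwards: whenever a live variable is defined,
    # replace it by the variables its definition reads.
    for target, used in reversed(statements):
        if target in live:
            live.discard(target)
            live |= set(used)
    return live

# The hello example from the question, flattened by hand.
hello = [
    (None, {"var_2"}),                      # the if-condition
    (None, {"var_1", "var_3", "var_4"}),    # save-log-to-disk(...)
    ("result", {"var_3", "var_1"}),         # the returned string
]
```

relevant_variables(hello, {"result"}) reports var_1 and var_3, matching the observation in the question that var_2 and var_4 only feed side effects.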
Grammatech has a good reputation here for C and C++.
There are standard compiler algorithms for constructing such dataflow graphs; see any competent compiler book.
They all suffer from some limitation due to Turing's impossibility proofs as pointed out by Pachelbel. When you implement such a dataflow algorithm, there will be places that it cannot know the right answer; simply pick one.
If your algorithm chooses to answer "there is no dataflow" in certain places where it is not sure, then it may miss a valid dataflow and it might incorrectly report that a variable does not affect the answer. (This is called a "false negative".) This occasional error may be satisfactory if the algorithm has some other nice properties, e.g., it runs really fast on millions of lines of code. (The trivial algorithm simply says "no dataflow" in all places, and it is really fast :)
If your algorithm chooses to answer "yes there is a dataflow", then it may claim that some variable affects the answer when it does not. (This is called a "false positive").
You get to decide which is more important; many people prefer false positives when looking for a problem, because then you have to at least look at possibilities detected by the tool. A false negative means it didn't report something you might care about. YMMV.
Here's a starting reference:
Any of the books on that page will be pretty good. I have Muchnick's book and like it a lot. See also this page:
You will discover that implementing this is a pretty big effort for any real language. You are probably better off finding a tool framework that does most or all of this for you already.
I use the following algorithm: a variable is used if it is a parameter or it occurs anywhere in an expression, excluding as the LHS of an assignment. First, count the number of uses of all variables. Delete unused variables and assignments to unused variables. Repeat until no variables are deleted.
This algorithm only implements a subset of the OP's requirement, and it is horribly inefficient because it requires multiple passes. A garbage collection may be faster but is harder to write: my algorithm only requires a list of variables with usage counts. Each pass is linear in the size of the program. The algorithm effectively does a limited kind of dataflow analysis by elimination of the tail of a flow ending in an assignment.
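The repeated-pass idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a program is a list of assignments (lhs, variables read on the rhs), other_uses stands for variables read in non-assignment expressions such as a return, and sets are used instead of usage counts. The name eliminate_unused is made up.

```python
def eliminate_unused(params, assignments, other_uses=()):
    """Repeatedly delete assignments to variables that are never used."""
    current = list(assignments)
    while True:
        # A variable is used if it is a parameter or is read anywhere.
        used = set(params) | set(other_uses)
        for _lhs, rhs_vars in current:
            used |= set(rhs_vars)
        kept = [(lhs, rhs) for lhs, rhs in current if lhs in used]
        if len(kept) == len(current):
            return kept              # fixed point: nothing was deleted
        current = kept               # another pass is needed
```

For a chain such as a = p; b = a; c = b; r = p with only r returned, the first pass removes c, the next removes b, then a, which shows why several passes are needed and why each pass only trims the tail of a flow.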
For my language the elimination of side effects in the RHS of an assignment to an unused variable is mandated by the language specification; it may not be suitable for other languages. Effectiveness is improved by running before inlining to reduce the cost of inlining unused function applications, then running it again afterwards, which eliminates parameters of inlined functions.
Just as an example of the utility of the language specification, the library constructs a thread pool and assigns a pointer to it to a global variable. If the thread pool is not used, the assignment is deleted, and hence the construction of the thread pool elided.
IMHO compiler optimisations are almost invariably heuristics whose performance matters more than effectiveness at achieving a theoretical goal (like removing unused variables). Simple reductions are useful not only because they're fast and easy to write, but because a programmer using a language who understands the basics of the compiler's operation can leverage this knowledge to help the compiler. The most well known example of this is probably the refactoring of recursive functions to place the recursion in tail position: a pointless exercise unless the programmer knows the compiler can do tail-recursion optimisation.
So basically I am trying to simulate asm code using Prolog.
With the help of @mbratch, I know it is straightforward and easy to use dynamic facts to simulate instructions like
add eax, 1
mov eax, 1
in this way:
:- dynamic(register/2). % Fill in as needed
register(eax, 0).
add(Reg, Value) :-
    (   retract(register(Reg, OldValue))
    ->  NewValue is OldValue + Value
    ;   NewValue = Value    % assumption: an unset register counts as 0
    ),
    assertz(register(Reg, NewValue)).
But the problem is how to simulate the stack in a similar way...?
Originally I wrote some quite FP style code like this:
nth_member(1, [M|_], M).
nth_member(N, [_|T], M) :- N>1, N1 is N - 1, nth_member(N1, T, M).
% push ebp
ESP is ESP - 1, nth_member(ESP, STACK, EBP),
But the problem is I don't know how to rewrite this code in a dynamic facts style...
Could anyone give me some help..? Thank you!
I'd like to reinforce the point @Boris made: do not use dynamic predicates.
By far the cleanest solution is to use state variables to carry around the current state of the simulated machine. Because of Prolog's single-assignment characteristic, you will always have pairs of these: state before and state after. For registers and memory, the state is best represented as a table which maps register names (or memory addresses) to values. A stack can simply be kept as a list. For example:
main :-
Stack0 = [],
Regs0 = [eax-0, ebx-0, ecx-0, edx-0],
Code = [movi(3,eax), add(eax,7), push(eax), pop(ecx)],
sim_code(Code, Regs0, RegsN, Stack0, StackN),
write(RegsN), nl, write(StackN), nl.
% simulate a sequence of instructions
sim_code([], Regs, Regs, Stack, Stack).
sim_code([Instr|Instrs], Regs0, RegsN, Stack0, StackN) :-
sim_instr(Instr, Regs0, Regs1, Stack0, Stack1),
sim_code(Instrs, Regs1, RegsN, Stack1, StackN).
% simulate one instruction
sim_instr(movi(Value,Reg), Regs0, RegsN, Stack, Stack) :-
update(Regs0, Reg, _Old, Value, RegsN).
sim_instr(add(Reg,Value), Regs0, RegsN, Stack, Stack) :-
update(Regs0, Reg, Old, New, RegsN),
New is Old+Value.
sim_instr(push(Reg), Regs, Regs, Stack, [Val|Stack]) :-
lookup(Regs, Reg, Val).
sim_instr(pop(Reg), Regs0, RegsN, [Val|Stack], Stack) :-
update(Regs0, Reg, _Old, Val, RegsN).
%sim_instr(etc, ...).
% simple key-value table (replace with more efficient library predicates)
lookup([K-V|KVs], Key, Val) :-
( Key==K -> Val=V ; lookup(KVs, Key, Val) ).
update([K-V|KVs], Key, Old, New, KVs1) :-
    ( Key==K ->
        Old = V, KVs1 = [K-New|KVs]
    ;
        KVs1 = [K-V|KVs2],
        update(KVs, Key, Old, New, KVs2)
    ).
In practice, you should replace my simple table implementation (lookup/3, update/5) with an efficient hash or tree-based version. These are not standardised, but you can usually find one among the libraries that come with your Prolog system.
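If you are coming from an imperative background, it may help to see the same "state before, state after" idea written as pure functions in Python. This is only a sketch mirroring the four instructions used in main above; the tuple encoding of instructions and the names apply_instruction and run are made up for illustration.

```python
def apply_instruction(regs, stack, instr):
    """Return the new (regs, stack) pair; the inputs are never modified."""
    op, *args = instr
    if op == "movi":
        value, reg = args
        return {**regs, reg: value}, stack
    if op == "add":
        reg, value = args
        return {**regs, reg: regs[reg] + value}, stack
    if op == "push":
        (reg,) = args
        return regs, (regs[reg],) + stack
    if op == "pop":
        (reg,) = args
        return {**regs, reg: stack[0]}, stack[1:]
    raise ValueError(f"unknown instruction: {op}")

def run(code, regs, stack=()):
    # Thread the state through every instruction, like sim_code/5.
    for instr in code:
        regs, stack = apply_instruction(regs, stack, instr)
    return regs, stack
```

Running the program movi(3,eax), add(eax,7), push(eax), pop(ecx) from an all-zero register table leaves 10 in both eax and ecx and an empty stack, and the initial table is left untouched because each step builds a new one.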
The Prolog canon says, don't use dynamic facts for state unless you have a really good reason. In other words, if you want to model a stack, maintain it as a term that you mutate and pass to the next step of a recursive predicate that takes as arguments the state. For example (very simplified),
step(Current_stack, Final_stack) :-
read_next_instruction(Instruction/*, whatever other arguments you need */),
apply_instruction(Current_stack, Instruction, New_stack),
step(New_stack, Final_stack).
The second argument, Final_stack, is there if you want to have the final stack after getting through all instructions in the code you are simulating. It will probably be a free variable at the beginning of the simulation, or, if you want to validate, the expected final state.
The stack itself would be either a list (if you only need the top of the stack), or a more complex, possibly nested term. You most probably would want to maintain all registers in this way too (as shown in my other answer).
There is another option, using proper mutable global variables. Depending on the Prolog implementation you use, this will involve different built-ins. For SWI-Prolog, look here; for GNU-Prolog, here. Other implementations will probably have predicates along the same lines as well.
The main point here is that using assert and retract for maintaining state that changes frequently makes your program very difficult to understand, and very inefficient. The "pure" Prolog solution is the first suggestion; using global variables can be more efficient in some cases.
As a full example of how to use a stack, see this answer to a question about a stack-based calculator (shameless self-promotion):
Postfix expression list evaluation
And to expand on the "don't use dynamic predicates", they definitely have their use. A good example of when this is a good solution is if you are implementing a relational database. Then, your tables are implemented as facts, with one predicate per column and one clause per row:
name_age('Bob', 20).
name_age('Jane', 23).
% etc
name_occupation('Bob', student).
name_occupation('Jane', teacher).
% etc
Here, you can use asserts to add new rows to your tables, or retracts to remove rows. The main point is that you will probably query your database much more often than you will alter it. You will also profit from Prolog's efficient lookup of facts, plus you can write queries in a more natural way.
I often end up writing code in Prolog which involves some arithmetic calculation (or state information important throughout the program), by means of first obtaining the value stored in a predicate, then recalculating the value and finally storing the value using retractall and assert because in Prolog we cannot assign values to variable twice using is (thus making almost every variable that needs modification, global). I have come to know that this is not a good practice in Prolog. In this regard I would like to ask:
Why is it a bad practice in Prolog (though I myself don't like to go through the above-mentioned steps just to have a kind of flexible (modifiable) variable)?
What are some general ways to avoid this practice? Small examples will be greatly appreciated.
P.S. I just started learning Prolog. I do have programming experience in languages like C.
Edited for further clarification
A bad example (in win-prolog) of what I want to say is given below:
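:- dynamic(value/1).
:- assert(value(0)).
adds :-
    value(X),
    NewX is X + 4,
    retractall(value(_)),
    assert(value(NewX)).
mults :-
    value(Y),
    NewY is Y * 2,
    retractall(value(_)),
    assert(value(NewY)).
start :-
    adds, mults, value(V), write(V), nl.  % assumed body; the original is missing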
Then we can query like:
?- start.
Here, it is very trivial, but in real programs and applications, the global-variable method shown above becomes unavoidable. Sometimes the list given above, like assert(value(0))..., grows very long, with many more assert predicates for defining more variables. This is done to make communication of values between different functions possible and to store the states of variables during the runtime of the program.
Finally, I'd like to know one more thing:
When does the practice mentioned above become unavoidable in spite of various solutions suggested by you to avoid it?
The general way to avoid this is to think in terms of relations between states of your computations: You use one argument to hold the state that is relevant to your program before a calculation, and a second argument that describes the state after some calculation. For example, to describe a sequence of arithmetic operations on a value V0, you can use:
state0_state(V0, V) :-
operation1_result(V0, V1),
operation2_result(V1, V2),
operation3_result(V2, V).
Notice how the state (in your case: the arithmetic value) is threaded through the predicates. The naming convention V0 -> V1 -> ... -> V scales easily to any number of operations and helps to keep in mind that V0 is the initial value, and V is the value after the various operations have been applied. Each predicate that needs to access or modify the state will have an argument that allows you to pass it the state.
A huge advantage of threading the state through like this is that you can easily reason about each operation in isolation: You can test it, debug it, analyze it with other tools etc., without having to set up any implicit global state. As another huge benefit, you can then use your programs in more directions provided you are using sufficiently general predicates. For example, you can ask: Which initial values lead to a given outcome?
?- state0_state(V0, given_outcome).
This is of course not readily possible when using the imperative style. You should therefore use constraints instead of is/2, because is/2 only works in one direction. Constraints are much easier to use and a more general modern alternative to low-level arithmetic.
The dynamic database is also slower than threading states through in variables, because it performs indexing etc. on each assertz/1.
1 - it's bad practice because it destroys the declarative model that (pure) Prolog programs exhibit.
Then the programmer must think in procedural terms, and the procedural model of Prolog is rather complicated and difficult to follow.
Specifically, we must be able to decide about the validity of asserted knowledge while the program backtracks, i.e. follows alternative paths to those already tried, that (maybe) caused the assertions.
2 - We need additional variables to keep the state. A practical, maybe not very intuitive way, is using grammar rules (a DCG) instead of plain predicates. Grammar rules are translated adding two list arguments, normally hidden, and we can use those arguments to pass around the state implicitly, and reference/change it only where needed.
A really interesting introduction is here: DCGs in Prolog by Markus Triska. Look for Implicitly passing states around: you'll find this enlightening small example:
num_leaves(nil), [N1] --> [N0], { N1 is N0 + 1 }.
num_leaves(node(_,Left,Right)) -->
    num_leaves(Left),
    num_leaves(Right).
More generally, and for further practical examples, see Thinking in States, from the same author.
edit: generally, assert/retract are required only if you need to change the database, or keep track of computation result along backtracking. A simple example from my (very) old Prolog interpreter:
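findall_p(X,G,_) :-   % clause bodies follow the textbook version cited below
    asserta(found('$mark')), call(G), asserta(found(X)), fail.
findall_p(_,_,N) :- collect_found([],N), !.
collect_found(S,L) :-
    getnext(X), !, collect_found([X|S],L).
collect_found(L,L).
getnext(X) :-
    retract(found(X)), !, X \= '$mark'.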
findall/3 can be seen as the basic all-solutions predicate. That code should be the very same as in the Clocksin-Mellish textbook, Programming in Prolog. I used it while testing the 'real' findall/3 I implemented. You can see that it's not 'reentrant', because the '$mark' is aliased.