Representing syntactically different terms in TPTP - syntax

I am having a look at first-order logic theorem provers such as Vampire and E-Prover, and the TPTP syntax seems to be the way to go. I am more familiar with logic programming syntaxes such as Answer Set Programming and Prolog, and although I keep referring to a detailed description of the TPTP syntax, I still don't seem to grasp how to properly distinguish between interpreted and uninterpreted functors (and I might be using the terminology wrong).
Essentially, I am trying to prove a theorem by showing that no model acts as a counter-example. My first difficulty was that I did not expect the following logic program to be satisfiable.
fof(all_foo, axiom, ![X] : (pred(X) => (X = foo))).
fof(exists_bar, axiom, pred(bar)).
It is indeed satisfiable because nothing prevents bar from being equal to foo. So a first solution would be to insist that these two terms are distinct and we obtain the following unsatisfiable program.
fof(all_foo, axiom, ![X] : (pred(X) => (X = foo))).
fof(exists_bar, axiom, pred(bar)).
fof(foo_not_bar, axiom, foo != bar).
The Technical Report clarifies that different double-quoted strings are indeed different objects, so another solution is to put quotes here and there, so as to obtain the following unsatisfiable program.
fof(all_foo, axiom, ![X] : (pred(X) => (X = "foo"))).
fof(exists_bar, axiom, pred("bar")).
I am happy not to have to manually specify the inequality, as that would obviously not scale to a more realistic scenario. Moving closer to my real situation, I actually have to handle composed terms, and the following program is unfortunately satisfiable.
fof(all_foo, axiom, ![X] : (pred(X) => (X = f("foo")))).
fof(exists_bar, axiom, pred(g("bar"))).
I guess f("foo") is not a term but the function f applied to the object "foo". So it could potentially coincide with function g. Although a manual specification that f and g never coincide does the trick, the following program is unsatisfiable, I feel like I'm doing it wrong. And it probably wouldn't scale to my real setting with plenty of terms all to be interpreted as distinct when they are syntactically distinct.
fof(all_foo, axiom, ![X] : (pred(X) => (X = f("foo")))).
fof(exists_bar, axiom, pred(g("bar"))).
fof(f_not_g, axiom, ![X, Y] : f(X) != g(Y)).
I have tried throwing single quotes around, but I didn't find the proper way to do it.
How do I make syntactically different (composed) terms and test for syntactical equality?
Subsidiary question: the following program is satisfiable, because the automated theorem prover understands f as a function rather than an uninterpreted functor.
fof(exists_f_g, axiom, (?[I] : ((f(foo) = f(I)) & pred(g(I))))).
fof(not_g_foo, axiom, ~pred(g(foo))).
To make it unsatisfiable, I need to manually specify that f is injective. What would be the natural way to obtain this behaviour without specifying injectivity of all functors that occur in my program?
fof(exists_f_g, axiom, (?[I] : ((f(foo) = f(I)) & pred(g(I))))).
fof(not_g_foo, axiom, ~pred(g(foo))).
fof(f_injective, axiom, ![X,Y] : (f(X) = f(Y) => (X = Y))).

First of all let me point you to the Syntax BNF of TPTP. In principle, you have Prolog terms with some predefined infix/prefix operators of appropriate precedences. This means variables are written in upper case and constants in lower case. Also like Prolog, escaping with single quotes allows us to write a constant starting with a capital letter, e.g. 'X'. I have never seen double-quoted atoms so far, so you might want to look up your prover's documentation on how to interpret them.
But even though the syntax is Prolog-ish, automated theorem proving is a different kind of beast. There is no closed-world assumption, nor are different constants assumed to be different - that's why you cannot find a proof for:
fof(c1, conjecture, a=b ).
and neither for:
fof(c1, conjecture, ~(a=b) ).
So if you want syntactic dis-equality, you need to axiomatize it. Now, assuming a is different from b makes showing that they are different trivial, so I claimed at least a bit more: "Suppose there are two different constants a and b; then there exists some X which is not b."
fof(a1, axiom, ~(a=b)).
fof(c1, conjecture, ?[X]: ~(X=b)).
Since functions in first-order logic are not necessarily injective, you also can't get around adding that assumption as an axiom.
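Putting the two kinds of axioms together, here is a sketch, reusing the symbols from the question, that makes the composed-term example unsatisfiable (how well this scales is another matter, as discussed above):
fof(all_foo, axiom, ![X] : (pred(X) => (X = f(foo)))).
fof(exists_bar, axiom, pred(g(bar))).
fof(foo_not_bar, axiom, foo != bar).
fof(f_not_g, axiom, ![X, Y] : f(X) != g(Y)).
fof(f_injective, axiom, ![X, Y] : (f(X) = f(Y) => (X = Y))).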
Please also note the different roles of input formulas: so far you only stated axioms and no conjectures, i.e. you ask the prover to show that your axiom set is inconsistent. Some provers might even give up because they use resolution refinements (e.g. set of support) that restrict resolution among axioms[1]. In any case, you need to be aware that the formula you are trying to prove is of the form A1 ∧ ... ∧ An → C1 ∨ ... ∨ Cm, where the Ai are axioms and the Cj are conjectures.[2]
I hope that at least the syntax is a bit clearer now - unfortunately the answer to the questions is more that automated theorem provers don't make the same assumptions as you expect, so you have to axiomatize them. These axiomatizations are also often inefficient, and you might get better performance from specialized tools.
[1] As you already noticed, advanced provers like Vampire or E-Prover tell you about (counter-)satisfiability instead.
[2] A resolution-based theorem prover will first negate that formula and perform a CNF transformation, but even though most TPTP-accepting provers are resolution based, that's not a requirement.

Related

Does Prolog backtracking/search always follow the same scheme?

The following Prolog code establishes a very simple grammar for sentences (sentence = subject + verb + object), and provides some small vocabulary.
% Example 05 - Generating Sentences
% subjects, verbs and objects
subject(john).
subject(jane).
verb(eats).
verb(washes).
object(apples).
object(spinach).
% sentence = subject + verb + object
sentence(X,Y,Z) :- subject(X), verb(Y), object(Z).
% sentence as a list
sentence(S) :- S=[X, Y, Z], subject(X), verb(Y), object(Z).
When asked to generate valid sentences, swi-prolog (specifically swish.swi-prolog.org) generates them in the following order:
?- sentence(S).
S = [john, eats, apples]
S = [john, eats, spinach]
S = [john, washes, apples]
S = [john, washes, spinach]
S = [jane, eats, apples]
S = [jane, eats, spinach]
S = [jane, washes, apples]
S = [jane, washes, spinach]
Question
The above suggests that Prolog always backtracks from the right to the left of conjunctive queries. Is this true for all Prologs? Is it part of the specification? If not, is it common enough to be relied upon?
Notes
For clarity, by backtracking from the right, I mean that Z is unbound and rebound to find all possibilities, given the first matches for X and Y. Then after these have been exhausted, Y is unbound and rebound, and for each Y, different Z are tested. Finally it is X that is unbound then rebound to new values, and for each X the combinations of Y and Z are generated again.
Short answer: yes. Prolog always uses this same well-defined scheme, also known as chronological backtracking, together with (one specific instance of) SLD-resolution.
But that needs some elaboration.
Prolog systems stick to that very strategy because it is quite efficient to implement and in many cases leads directly to the desired result. For those cases where Prolog works nicely, it is pretty much competitive with imperative programming languages for many tasks. Some systems even translate to machine code, the most prominent being the just-in-time compiler of SICStus Prolog.
As you have probably already encountered, there are, however, cases where that strategy leads to undesirable inefficiencies and even non-termination whereas another strategy would produce an answer. So what to do in such situations?
Firstly, the precise encoding of a problem may be reformulated. To take your case of grammars, we even have a specific formalism for this, called Definite Clause Grammars (DCG). It is very compact and leads to both efficient parsing and efficient generation in many cases. This insight (and the precise encoding) was not that evident for quite some time, and the precise moment of Prolog's birth (pretty exactly 50 years ago) was when this was understood. In the example you have just 3 tokens in a list, but most of the time that number can vary. That is where the DCG formalism shines, and it can still be used both to parse and to generate sentences. In your example, say you also want to include subjects of unrestricted length like [the,boy], [the,nice,boy], [the,nice,and,handsome,boy], ...
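For illustration, a minimal DCG sketch of the grammar above (the nonterminal names are my own; SWI-Prolog syntax assumed; extending adjectives to coordinations like [nice,and,handsome] is a small exercise):
sentence --> subject, verb, object.
subject --> [john].
subject --> [jane].
subject --> [the], adjectives, [boy].   % subjects of unrestricted length
adjectives --> [].
adjectives --> [nice], adjectives.
verb --> [eats].
verb --> [washes].
object --> [apples].
object --> [spinach].
The same grammar both parses and generates: phrase(sentence, S) enumerates sentences, and phrase(sentence, [the,nice,nice,boy,eats,apples]) checks a given one.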
There are many such encoding techniques to learn.
Another way Prolog's strategy is further improved is to offer more flexible selection strategies with built-ins like freeze/2, when/2 and similar coroutining methods. While such extensions have existed for quite some time, they are difficult to employ, particularly because understanding non-termination gets even more complex.
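As a small sketch of the idea, freeze/2 delays a goal until its first argument is instantiated (SWI-Prolog shown; details vary between systems):
?- freeze(X, X > 0), X = 2.
X = 2.
?- freeze(X, X > 0), X = -1.
false.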
A more successful extension is constraints (constraint programming), most prominently clpz/clpfd, which are used primarily for combinatorial problems. While chronological backtracking is still in place, it is only used as a last resort, either to ensure correctness of solutions with labeling/2 or when there is no better way to express the actual problem.
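A tiny sketch with SWI-Prolog's clpfd library: constraint propagation narrows the domain first, and chronological backtracking is only used to enumerate during labeling:
:- use_module(library(clpfd)).

?- X #> 0, X #< 3, label([X]).
X = 1 ;
X = 2.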
And finally, you may want to reconsider Prolog's strategy in a more fundamental way. This is all possible by means of meta-interpretation. In some sense this is a completely new implementation, but it can often reuse a lot of Prolog's infrastructure, thereby making such meta-interpreters quite compact compared to other programming languages. And it may not only be used to implement other strategies; it is even used to prototype and implement other programming languages, the most prominent example being Erlang, which first existed as a Prolog meta-interpreter, its syntax still being quite Prolog-ish.
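The usual starting point is the classic "vanilla" meta-interpreter, sketched here for pure conjunctive programs (built-ins and control constructs need extra clauses):
solve(true).
solve((A, B)) :- solve(A), solve(B).
solve(Head) :- clause(Head, Body), solve(Body).
Changing the search strategy then amounts to changing these clauses, e.g. collecting the pending goals into an explicit list and selecting them in a different order.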
Prolog as a programming language also contains many features that do not fit into this pure view, like side-effecting built-ins such as put_char/1, which are clearly a hindrance in meta-interpretation. But in many such situations this can be mitigated by restricting their use to specific modes and producing instantiation errors otherwise. Think of (non-constraint-based) arithmetic, which produces an error if the result cannot be determined immediately, but still produces correct results when used with sufficiently instantiated arguments, as in:
?- X > 0, X = -1.
error(instantiation_error,(is)/2).
?- X = -1, X > 0.
false.
?- X = 2, X > 0.
X = 2.
Finally, a word on non-termination. Often non-termination is seen as a fundamental weakness of Prolog, but there is another view on this. Other, much older systems and engines also suffer runaways from time to time, and they are still used. In the case of programming languages, runaways are a fundamental consequence of their generality. And a non-terminating query is still preferable to an incorrect but terminating one.

How is predicate logic represented in Prolog?

This may be a strange and broad question, and not 100% a programming question, but I hope that is OK. I recently had a discussion about the fact that a lot of Prolog programs don't follow strict predicate logic (in the sense of Frege) but are often "object oriented", which I am trying to grasp.
I know that Prolog is based on first order predicate logic especially Horn Clauses and that they are a special form of modus ponens. A fact and a rule if they occur solo are simply clauses, but as soon as I add more than one occurrence they become a predicate.
How are the quantifiers of first-order predicate logic represented and related to fact, rule, predicate, or the Prolog concepts in general? What does the functor express, and what do the arguments express, in relation to predicate logic? How is predicate logic, and first-order predicate logic in particular, reflected in Prolog, and where does Prolog depart from these concepts? E.g., how would I define a point, a line, and a vertical line in predicate logic and first-order predicate logic?
How do I formulate this in predicate logic and first-order predicate logic, and what is the semantic and logical difference between
vertical(line).
line(vertical).
Or a line and point in this example. Are point and line not predicate logic?
For me it is "point(X), the set of all points", and when I pick a concrete point, "there exists one point(110, 12)".
point(X,Y).
line(point(W,X), point(Y,Z)).
vertical(line(point(X,Y), point(X,Z))).
horizontal(line(point(X,Y), point(Z,Y))).
Any info helps! Many thanks, H
A chapter of Programming in Prolog by W. Clocksin and C. Mellish is devoted to explaining the relation of Prolog to logic. Citing from there:
If we wish to discuss how Prolog is related to logic, we must first establish what we mean by logic. Logic was originally devised as a way of representing the form of arguments, so that it would be possible to check in a formal way whether or not they are valid. Thus we can use logic to express propositions, the relations between propositions and how one can validly infer some propositions from others. The particular form of logic that we will be talking about here is called the Predicate Calculus. We will only be able to say a few words about it here. There are scores of good basic introductions to logic you can turn to for background reading.
If we wish to express propositions about the world, we must be able to describe the objects that are involved in them. In Predicate Calculus, we represent objects by terms. A term is of one of the following forms:
A constant symbol. This is a symbol that stands for a single individual or concept. We can think of this as a Prolog atom, and we will use the Prolog syntax. So greek, agatha, and peace are constant symbols.
A variable symbol. This is a symbol that we may want to stand for different individuals at different times. Variables are really only introduced in conjunction with quantifiers, which are discussed below. We can think of them as Prolog variables and will use the Prolog syntax. Thus X, Man, and Greek are variable symbols.
A compound term. A compound term consists of a function symbol, together with an ordered set of terms as its arguments. The idea is that the compound term represents some individual that depends on the individuals represented by the arguments. The function symbol represents how the first depends on the second. For instance, we could have a function symbol standing for the notion of "distance" and two arguments. In this case, the compound term stands for the distance between the objects represented by the arguments. We can think of a compound term as a Prolog structure with the function symbol as the functor. We will write Predicate Calculus compound terms using the Prolog syntax, so that, for instance, wife(henry) might mean Henry's wife, distance(point1, X) might mean the distance between some particular point and some other place to be specified, and classes(mary, dayafter(W)) might mean the classes that Mary teaches on the day after some day W to be specified.
Thus in Predicate Calculus the ways of representing objects are just like the ways available in Prolog.
It doesn't seem appropriate to put the entire chapter here... there is also a very explanatory program in Appendix B that performs an automatic translation of WFFs into clauses.
The book is very readable, just a pity it's not among the titles in Free Prolog Programming Books section.
I know that Prolog is based on first order predicate logic especially Horn Clauses and that they are a special form of modus ponens.
In a sense, inverse "modus ponens":
a :- b
You want to show "a true", and to do so, you have to show "b true".
A fact and a rule if they occur solo are simply clauses, but as soon as I add more than one occurrence they become a predicate.
No, they are all predicates. The "predicate" is an object/agent/program/platonic phenomenon which expresses that there (objectively) is some "relationship" between "things", and you can ask the Prolog processor about that relationship. There is no direct meaning associated with any of that, though; it's "strings related to strings via strings". We are working with syntactic machines, after all (i.e. computers).
Enter this logic program:
p(x,y). % Predicate p/2 states that there is a relationship p between x and y
And now, you can query the database about what the program is saying:
?- p(x,y).
true. % a p relationship exists (fact, but could also be rule)
?- p(x,A).
A = y. % the thing related to x via p is y
?- p(A,y).
A = x. % the thing related to y via p is x
?- p(A,B).
A = x, % things related via p are x and y
B = y.
?- p(c,d).
false. % not REALLY "false" but "as far as I can tell, there
% is no relationship p between c and d"
Note the interpretation of "false", which is not the "strong false" of classical logic. Even though it is traditionally stated that Prolog works in classical logic, this is not really the case:
From "Logic Programming with Strong Negation" (David Pearce, Gerd Wagner, FU Berlin, 1991), appears in Springer LNAI 475: Extensions of Logic Programming, International Workshop Tübingen, FRG, December 8–10, 1989 Proceedings):
According to the standard view, a logic program is a set of definite Horn clauses. Thus, logic programs are regarded as syntactically restricted first-order theories within the framework of classical logic. Correspondingly, the proof theory of logic programs is considered as the specialized version of classical resolution, known as SLD-resolution. This view, however, neglects the fact that a program clause, a_0 <- a_1, a_2, ..., a_n, is an expression of a fragment of positive logic (a subsystem of intuitionistic logic) rather than an implicational formula of classical logic. The classical interpretation of logic programs, therefore, seems to be a semantical overkill.
It should be clear that in order to explain the deduction mechanism of Prolog one does not have to refer to the indirect method of SLD-resolution which checks for the refutability of the contrary. It is certainly more natural to view Prolog's proof procedure as a kind of natural deduction, as, for example, in [Hallnäs & Schroeder-Heister 1987] and [Miller 1989]. This also is more in line with the intuitions of a Prolog programmer. Since Prolog is the paradigm, logic programming semantics should take it as a point of departure.
Now:
How are the quantors of first order predicate logic represented and related
to fact, rule, predicate or the Prolog concept in general?
That is a long story. Note that Prolog is primarily about "programming using logic", and also about "modeling using logic". The two aspects certainly overlap well for problems that can be solved using explicit enumeration, but Prolog is not made for specifying general FOL constraints describing a sought-for solution. In fact, certain FOL constraints cannot be represented, and others have to be transformed into nominally equivalent expressions that are agreeable to the machine. Look up "skolemization". For example: https://www.cs.toronto.edu/~sheila/384/w11/Lectures/csc384w11-KR-tutorial.pdf
On the flip side, Prolog provides "meta-predicates" which generate solutions by calling other predicates, so it's making forays into second-order logic. As it must - nobody can survive in the FOL desert for long.
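For instance, findall/3 is such a meta-predicate: it calls a goal and collects all of its solutions. With the p/2 fact from above:
?- findall(A-B, p(A, B), Pairs).
Pairs = [x-y].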
What does the functor express
Nothing. It just stands for itself. Pure syntax. Look up "Herbrand Universe".
How do I formulate this in predicate logic and first order predicate logic
what is the semantic and logic difference between
vertical(line).
line(vertical).
It's you who imbues vertical and line with meaning. So, feelings. You want a "vertical line", so you would say the "thing" is the "line" and "vertical" is an attribute of the "line". So vertical(line) sounds appropriate. Or maybe attribute(line, vertical). It depends.
Here:
point(X,Y).
line(point(W,X), point(Y,Z)).
You have two aspects:
Predicates express "relationships". "Function symbols" are used to construct "things with structure": you can form trees of stuff, with function symbols on the nodes and integers/strings/variables on the leaves. These are called "terms". But terms can appear as predicates or as things depending on the context; it's quite fluid. So you can, for example, construct a Prolog program with another Prolog program.
point(X,Y)
line(point(W,X), point(Y,Z))
These are terms!
If you type this into a file program.pl:
point_on_line(point(X,Y),line(point(W,X), point(Y,Z))).
The terms appear as "things" related by predicate point_on_line/2. The whole line is itself a term.
If you type this into a file program.pl:
point(X,Y).
line(point(W,X), point(Y,Z)).
The terms appear as "predicates", and point appears both as predicate point/2 and as "thing" about which predicate line/2 is talking.
This is actually a vast subject and it takes some time getting used to it, much more than functional programming. I had some Prolog and Logic courses at uni but 20 years later I found out that I had badly misunderstood a lot of aspects.

Where might I find a method to convert an arbitrary boolean expression into conjunctive or disjunctive normal form?

I've written a little app that parses expressions into abstract syntax trees. Right now, I use a bunch of heuristics against the expression in order to decide how to best evaluate the query. Unfortunately, there are examples which make the query plan extremely bad.
I've found a way to provably make better guesses as to how queries should be evaluated, but I need to put my expression into CNF or DNF first in order to get provably correct answers. I know this could result in potentially exponential time and space, but for typical queries my users run this is not a problem.
Now, converting to CNF or DNF is something I do by hand all the time in order to simplify complicated expressions. (Well, maybe not all the time, but I do know how it's done using, e.g., De Morgan's laws, distributive laws, etc.) However, I'm not sure how to begin translating that into a method that is implementable as an algorithm. I've looked at papers on query optimization, and several start with "well first we put things into CNF" or "first we put things into DNF", and they never seem to explain their method for accomplishing that.
Where should I start?
Look at https://github.com/bastikr/boolean.py
Example:
def test(self):
    expr = parse("a*(b+~c*d)")
    print(expr)

    dnf_expr = normalize(boolean.OR, expr)
    print(list(map(str, dnf_expr)))

    cnf_expr = normalize(boolean.AND, expr)
    print(list(map(str, cnf_expr)))
Output is:
a*(b+(~c*d))
['a*b', 'a*~c*d']
['a', 'b+~c', 'b+d']
UPDATE: Now I prefer this sympy logic package:
>>> from sympy.logic.boolalg import to_dnf
>>> from sympy.abc import A, B, C
>>> to_dnf(B & (A | C))
Or(And(A, B), And(B, C))
>>> to_dnf((A & B) | (A & ~B) | (B & C) | (~B & C), True)
Or(A, C)
The naive vanilla algorithm, for quantifier-free formulae, is (a Python sketch follows the list):
for CNF, convert to negation normal form with De Morgan laws then distribute OR over AND
for DNF, convert to negation normal form with De Morgan laws then distribute AND over OR
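Here is a compact sketch of that naive algorithm in Python, using nested tuples as a toy AST; all names are illustrative, and there is no simplification or sharing, so it exhibits exactly the exponential blow-up mentioned in the question:

# A tiny AST: a variable is a string; compound nodes are tuples
# ('not', e), ('and', e1, e2), ('or', e1, e2).

def nnf(e):
    # Push negations inward (negation normal form) via De Morgan.
    if isinstance(e, str):
        return e
    if e[0] == 'not':
        a = e[1]
        if isinstance(a, str):
            return e                     # literal ~x
        if a[0] == 'not':
            return nnf(a[1])             # ~~x  =>  x
        if a[0] == 'and':                # ~(x & y)  =>  ~x | ~y
            return ('or', nnf(('not', a[1])), nnf(('not', a[2])))
        if a[0] == 'or':                 # ~(x | y)  =>  ~x & ~y
            return ('and', nnf(('not', a[1])), nnf(('not', a[2])))
    return (e[0], nnf(e[1]), nnf(e[2]))  # 'and' / 'or'

def cnf(e):
    # Distribute OR over AND; the input must already be in NNF.
    if isinstance(e, str) or e[0] == 'not':
        return e
    a, b = cnf(e[1]), cnf(e[2])
    if e[0] == 'and':
        return ('and', a, b)
    if isinstance(a, tuple) and a[0] == 'and':   # (p & q) | r
        return ('and', cnf(('or', a[1], b)), cnf(('or', a[2], b)))
    if isinstance(b, tuple) and b[0] == 'and':   # p | (q & r)
        return ('and', cnf(('or', a, b[1])), cnf(('or', a, b[2])))
    return ('or', a, b)

print(cnf(nnf(('not', ('or', 'a', ('and', 'b', 'c'))))))
# -> ('and', ('not', 'a'), ('or', ('not', 'b'), ('not', 'c')))

The DNF version is symmetric: keep nnf as-is and distribute AND over OR instead.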
It's unclear to me whether your formulae are quantified. But even if they aren't, it seems the ends of the Wikipedia articles on conjunctive normal form and on its rough equivalent in the automated theorem proving world, clausal normal form, outline a usable algorithm (and point to references if you want to make this transformation a bit more clever). If you need more than that, please do tell us more about where you encounter a difficulty.
I came across this page: How to Convert a Formula to CNF. It shows the algorithm for converting a Boolean expression to CNF in pseudocode. It helped me get started on this topic.

Equation Threading: Why the default behavior?

I recently rediscovered a small package by Roman Maeder that tells Mathematica to automatically thread arithmetic and similar functions over expressions such as x == y. Link to Maeder's package.
First, to demonstrate, here's an example given by Maeder:
In[1]:= Needs["EqualThread`"]
Now proceed to use the threading behavior to solve the following equation for x 'by hand':
In[7]:= a == b Log[2 x]
In[8]:= %/b
Out[8]= a/b == Log[2 x]
Now exponentiate:
In[9]:= Exp[%]
Out[9]= E^(a/b) == 2 x
And divide through by 2:
In[10]:= %/2
Out[10]= (E^(a/b))/2 == x
Q: From a design perspective, can someone explain why Mathematica is set to behave this way by default? Automatically threading seems like the type of behavior a Mathematica beginner would expect---to me, at least---perhaps someone can offer an example or two that would cause problems with the system as a whole. (And feel free to point out any mathematica ignorance...)
Seems natural when thinking of arithmetic operations. But that is not always the case.
When I write
Boole[a==b]
I don't want
Boole[a] == Boole[b]
And that is what Maeder's package does.
Edit
Answering your comment below:
I noticed that Boole[] was added in v.5.2, whereas Maeder's package was authored for v.3. I guess the core of my question still revolves around the 'design' issue. I mean, how would one get around the issue you pointed out? To me, the clearest path would be declaring something about variables you're working with, no? -- What puzzles me is the way you can generally only do this with Assumptions (globally or as an option to Simplify, etc). Anyone else think it would be more natural to have a full set of numerical Attributes? (in this regard, the Constant Attribute is a tease)
My answer is by no means a critique of Maeder's package, which is nice, but a statement that it should not be the mainstream way to treat Equal[ ] in Mma.
Equal[ ] is a function, and not particularly easy to grasp at first:
returns True if lhs and rhs are identical
returns False if lhs and rhs are determined to be unequal by comparisons between numbers or other raw data, such as strings.
remains unevaluated when lhs or rhs contains objects such as Indeterminate and Overflow.
is used to represent a symbolic equation, to be manipulated using functions like Solve.
The intent of Maeder's package, which I understand is well aligned with yours, is to give to the expression lhs == rhs the same meaning and manipulation rules humans use when doing math.
In math, equality is an equivalence relation, partitioning a set into classes of equals, and an equation is an assertion that the expressions are related by this particular relation.
Compare these differences with other Mma "functions". Sin[x] is in Mma, and in usual math the same thing (well, almost), and the same can be said of most Mma beasts. There are a few Mma constructs, however, that do not hold that exact isomorphism to math concepts: Equal, SameQ, Equivalent, etc. They are the bridge from the math world to the programming world. They are not strict math concepts, but modified programming concepts to hold them.
Sorry if I got a little on the philosophical side.
HTH!
I guess it is partly because the behavior cannot be extended to inequalities, and also because the behavior should still make sense when the equalities themselves become evaluated:
Would be nice:
In[85]:= Thread[Power[a == b, 2], Equal]
Out[85]= a^2 == b^2
In[86]:= Thread[Power[a == b, c == d], Equal]
Out[86]= a^c == b^d
but:
In[87]:= Thread[Power[a == b, c == d] /. {c -> 2, d -> 2}, Equal]
Out[87]= a^True == b^True

Are there Mathematica packages for presenting proofs/derivations?

When I write out a proof or derivation on paper I frequently make sign errors or drop terms as I move from one step to the next. I'd like to use Mathematica to save myself from these silly mistakes. I don't want Mathematica to solve the expression, I just want to use it carry out and display a series of algebraic manipulations. For a (trivial) example
In[111]:= MultBothSides[Equal[a_, b_], c_] := Equal[c a, c b];
In[112]:= expression = 2 a == a b
Out[112]= 2 a == a b
In[113]:= MultBothSides[expression, 1/a]
Out[113]= 2 == b
Can anyone point me to a package that would support this kind of manipulation?
Edit
Thanks for the input, not quite what I'm looking for though. The symbol manipulation isn't really the problem. I'm really looking for something that will make explicit the algebraic or mathematical justification of each step of a derivation. My goal here is really pedagogical.
Mathematica also provides a number of high-level functions for manipulating algebraic expressions. Among these are Expand, Apart, Together, and Cancel, though there are quite a few more.
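For instance (a trivial illustration; the exact output formatting may vary between versions):
In[1]:= Together[1/a + 1/b]
Out[1]= (a + b)/(a b)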
Also, for your specific example of applying the same transformation to both sides of an equation (that is, an expression with the head Equal), you can use the Thread function, which works just like your MultBothSides function, but with a great deal more generality.
In[1]:= expression = 2 a == a b
Out[1]= 2 a == a b
In[2]:= Thread[expression/a, Equal]
Out[2]= 2 == b
In[3]:= Thread[expression - c, Equal]
Out[3]= 2 a - c == a b - c
In either of the presented solutions, it should be relatively easy to see what the step entailed. If you want something a little more explicit, you can write your own function like so:
In[4]:= ApplyToBothSides[f_, eq_Equal] := Map[f, eq]
In[5]:= ApplyToBothSides[4 * #&, expression]
Out[5]= 8 a == 4 a b
It's a generalization of your MultBothSides function that takes advantage of the fact that Map works on expressions with any head, not just head List. If you're trying to communicate with an audience that is unfamiliar with Mathematica, using these sorts of names can help you communicate more clearly. In a related vein, if you want to use replacement rules as suggested by Ira Baxter, it may be helpful to write out Replace or ReplaceAll instead of using the /. syntactic sugar.
In[6]:= ReplaceAll[expression, a -> (x + y)]
Out[6]= 2 (x + y) == b (x + y)
If you think it would be clearer to have the actual equation, instead of the variable name expression, in your input, and you're using the notebook interface, highlight the word expression with your mouse, call up the contextual menu, and select "Evaluate in Place".
The notebook interface is also a very pleasant environment for doing "literate programming", so you can also explain any steps that are not immediately obvious in words. I believe this is a good practice when writing mathematical proofs regardless of the medium.
I don't think you need a package. What you want to do is to manipulate each formula according to an inference rule. In MMa, you can model inference rules on a formula using transformations. So, if you have a formula f, you can apply an inference rule I, written as a replacement rule, by executing (the replacement operator is /.; my MMa syntax is 15 years rusty)
f /. I
to produce the next formula in your sequence.
MMa will of course try to simplify your formulas if they contain standard algebraic operators and terms, such as numeric constants and arithmetic operators. You can prevent MMa from applying its own "inference" rules by enclosing your formula in a Hold[...] form.
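A small illustration of that last point:
In[1]:= Hold[1 + 1 == 2]
Out[1]= Hold[1 + 1 == 2]
In[2]:= ReleaseHold[%]
Out[2]= True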