I would like to solve problems combining boolean and integer logic in linear arithmetic with a SAT/SMT solver. At first glance, Z3 seems promising.
First of all, is it at all possible to solve the following problem? This answer makes it seem like it works.
int x,y,z
boolean a,b,c
( (3x + y - 2z >= 10) OR (A AND (NOT B OR C)) OR ((A == C) AND (x + y >= 5)) )
If so, how does Z3 solve this kind of problem in theory and is there any documentation about it?
I could think of two ways to solve this problem. One would be to convert the Boolean operations into a linear integer expression. Another solution I read about is to use the Nelson-Oppen Combination Method described in [Kro 08].
I found a corresponding documentation in chapter 3.2.2. Solving Arithmetical Fragments, Table 1 a listing of the implemented algorithms for a certain logic.
Yes, SMT solvers are quite good at solving problems of this sort. Your problem can be expressed using z3's Python interface like this:
from z3 import *
x, y, z = Ints('x y z')
A, B, C = Bools('A B C')
solve (Or(3*x + y - 2*z >= 10
, And(A, Or(Not(B), C))
, And(A == C, x + y >= 5)))
This prints:
[A = True, z = 3, y = 0, B = True, C = True, x = 5]
giving you a (not necessarily "the") model that satisfies your constraints.
SMT solvers can deal with integers, machine words (i.e., bit-vectors), reals, along with many other data types, and there are efficient procedures for combinations of linear-integer-arithmetic, booleans, uninterpreted-functions, bit-vectors amongst many others.
See http://smtlib.cs.uiowa.edu for many resources on SMT solving, including references to other work. Any given solver (i.e., z3, yices, cvc etc.) will be a collection of various algorithms, heuristics and tactics. It's hard to compare them directly as each shine in their own way for certain sublogics, but for the base set of linear-integer arithmetic, booleans, and bit-vectors, they should all perform fairly well. Looks like you already found some good references, so you can do further reading as necessary; though for most end users it's neither necessary nor that important to know how an SMT solver internally works.
I would like to create a quotient type with quotient_type in Isabelle/HOL in which I would left "non-constructed" the non-empty set S and the equivalence relation ≡. The goal is for me to derive generic properties w.r.t. S and ≡ over the quotient-lifted set S/≡. In this way, it would be interesting that Isabelle/HOL accepts dependent types... But I was told that was not possible.
Hence, I tried this
(* 1. Defining an arbitrary set and its associated type *)
consts S :: "'a set"
typedef ('a) inst = "{ x :: 'a. ¬ S = ({} :: 'a set) ⟶ x ∈ S}" by(auto)
(* 2. Defining the equivalence relation *)
definition equiv :: "'a ⇒ 'a ⇒ bool" where
"equiv x y = undefined"
(* here needs a property of equivalence relationship... *)
(* 3. Defining the quotiented set *)
quotient_type ('a) quotiented_set = "('a inst × 'a inst)" / "equiv"
(* Hence, impossible end proof here... *)
Is this formalization, there appears to be two problems
I don't think this is the cleanest way to define an arbitrary set S as I can't specify it to be non-empty...
I can't define an arbitrary equivalence relation equiv with the definition nor the fun commands as they only allow me define "constructive-strongly normalizing-inductive" definitions only... And yet, I want to say that I just have some function equiv that satisfies properties of equivalence (reflexivity, symmetry, transitivity).
Do you have any idea ? Thanks.
HOL types cannot depend on values. So if you want to define a quotient type for an arbitrary non-empty set S and equivalence relation equiv using quotient_type, the arbitrary part must stay at the meta-level. Thus, S and equiv can either be axiomatized or defined such that you can convince yourself that you really have captured the desired notion of arbitrary.
If you axiomatize S and equiv, then you yourself are responsible that the axioms are consistent with the other axioms of HOL. You can do that with the command axiomatization as in
axiomatization S :: "'a set" where S_not_empty: "S ≠ {}"
For Isabelle/HOL, S is then a fixed constant of which you only know that it is not empty. You will never be able to instantiate S, because the arbitrariness only exists in the set-theoretic interpretation of Isabelle/HOL.
If you do not want to add new axioms, you can use specification instead:
consts S :: "'a set"
specification (S) S_not_empty: "S ≠ {}" by auto
With specification, you have to prove that your axioms are consistent, so there is no danger here. However, S no longer is absolutely arbitrary, because it is defined in terms of the choice operator Eps, as can be seen from the generated theorem S_def.
If you really want to study the theory of quotients within Isabelle/HOL, I recommend that you do not use types, but ordinary sets. There is the quotient operator op // and some theorems in the theory Equiv_Relations which is part of the library.
As I understand it, matrices in Isabelle are essentially functions and of abitrary dimension. In this setting, it is not easy to define a squared matrix (n x n matrix). Also, in a proof on paper the dimension "n" of a squared can be used in the proof. But how do I do that in Isabelle?
Leibniz Formula:
My proof on paper:
Here is a relevant excerpt of my Isabelle proof:
(* tested with Isabelle2013-2 (and also Isabelle2013-1) *)
theory Notepad
imports
Main
"~~/src/HOL/Library/Polynomial"
"~~/src/HOL/Multivariate_Analysis/Determinants"
begin
notepad
begin
fix C :: "('a::comm_ring_1 poly)^'n∷finite^'n∷finite"
(* Definition Determinant (from the HOL Library, shown for reference
see: "~~/src/HOL/Multivariate_Analysis/Determinants") *)
have "det C =
setsum (λp. of_int (sign p) *
setprod (λi. C$i$p i) (UNIV :: 'n set))
{p. p permutes (UNIV :: 'n set)}" unfolding det_def by simp
(* assumtions *)
have 1: "∀ i j. degree (C $ i $ j) ≤ 1" sorry (* from assumtions, not shown *)
have 2: "∀ i. degree (C $ i $ i) = 1" sorry (* from assumtions, not shown *)
(* don't have "n", that is the dimension of the squared matrix *)
have "∀p∈{p. p permutes (UNIV :: 'n set)}. degree (setprod (λi. C$i$p i) (UNIV :: 'n set)) ≤ n" sorry (* no n! *)
end
What can I do in this situation?
UPDATE:
Your type for C, a restricted version of ('a ^ 'n ^ 'n), appears to be a custom type of > yours, because I get an error when trying to use it, even after importing > Polynomial.thy. But maybe it's defined in some other HOL theory.
Unfortunately I did not write the includes in my code example, please see the updated example. But it is not a custom type, importing "Polynomial.thy" and "Determinants" should be sufficient. (I tested Isabelle version 2013-1 and 2013-2.)
If you're using a custom definition of a matrix, there's a good chance
you're on your own, for the most part.
I don't belive I am using a custom definition of a matrix.
The library Determinants (~~/src/HOL/Multivariate_Analysis/Determinants) has the following definition of a determinant:
definition det:: "'a::comm_ring_1^'n^'n ⇒ 'a" where .... So the library uses the notion of a matrix as a vector of vectors. If my ring is over polynomials it should not make a difference in my eyes.
Regardless, for a type such as ('a ^ 'n ^ 'n), it seems to me, you
should be able to write a function to return a value for the size of
the matrix. So if (p ^ n ^ n) is a matrix, where n is a set, then
maybe the cardinality of n is the n you want in your question.
This brought me on the right way. My current guess is that the following definition is helpful:
definition card_diagonal :: "('a::zero poly)^'n^'n ⇒ nat" where "card_diagonal A = card { (A $ i $ i) | i . True }"
card is definied in Finite_Set.
It seems to me that the essence of this question is how to obtain the integer n from a given n x n matrix, A. The difficulty here is that this integer is encoded in A's type. Nevertheless, it seems clear to me that n is actually a parameter of the problem. Although we can imagine representations of matrices that somehow store the dimension internally, from a mathematical point of view, it is natural to begin the entire development by stating "let n be a positive integer".
Update 140107_2040
It's hard to make a short answer here. I only work everything for vectors, since it all gets very involved. I try to give you the function for the length of a vector as fast as possible. I then go into a big explanation on what I did to get a decent understanding of the vector type, but not necessarily for you, if you don't need it.
Reflected by the name Finite_Cartesian_Product.thy, Amine Chaieb defines a generalized finite Cartesian product. So, of course, we also get a definition for vectors and n-tuples. That it's a generalized Cartesian product is what requires the huge explanation, and what took me a long time to recognize and work through. Having said that, I'll call it a vector, since he named the type vec.
Everything needs to be understood in reference to what a vector is, which is defined by this definition:
typedef ('a, 'b) vec = "UNIV :: (('b::finite) => 'a) set"
This tells us that a vector is a function f::('b::finite) => 'a. The domain of the function is UNIV::'b set, which is finite and is called the index set. For example, let the index set be defined with typedef as {1,2,3}.
The codomain of the function can be any type, but let it be a set of constants {a,b}, defined with typedef. Because HOL functions are total, each element of {1,2,3} must get mapped to an element of {a,b}.
Now, consider the set of all such functions that map elements from {1,2,3} to {a,b}. There will be 2^3 = 8 such functions. I now resort to ZFC function notation, along with n-tuple notation:
f_1: {1,2,3} --> {a,b} == {(1,a),(2,a),(3,a)} == (a,a,a)
f_2 == {(1,a),(2,a),(3,b)} == (a,a,b)
f_3 == {(1,a),(2,b),(3,a)} == (a,b,a)
f_4 == {(1,a),(2,b),(3,b)} == (a,b,b)
f_5 to f_8 == (b,a,a), (b,a,b), (b,b,a), (b,b,b)
Then for any vector f_i, which, again, is a function, the length of the vector will be the cardinality of the domain of f_i, which will be 3.
I'm pretty sure your function card_diagonal is the cardinality of the range of the function, and I tested out a vector version of it much further down, but it basically showed me how to get the cardinality of the domain.
Here is the function for the length of a vector:
definition vec_length :: "('a, 'b::finite) vec => nat" where
"vec_length v = card {i. ? c. c = (vec_nth v) i}"
declare
vec_length_def [simp add]
You might want to substitute v $ i for (vec_nth v) i. The ? is \<exists>.
In my example below, the simp method easily produced a goal CARD(t123) = (3::nat), where t123 is a type I defined with 3 elements in it. I couldn't get past that.
Anyone who wants to understand the details needs to understand the use of the Rep_t and Abs_t functions that are created when typedef is used to create a type t. In the case of vec, the functions would have been Rep_vec and Abs_vec, but they are renamed with morphisms to vec_nth and vec_lambda.
Will the Non-vector-specific Vector Length Please Step Forward
Update 140111
This should be my final update, because to completely work it out to my satisfaction, I need to know much more about instantiating type classes in general, and how to specifically instantiate type classes so that my concrete example, UNIV::t123 set, is finite.
I more than welcome being corrected where I may be wrong. I would much rather be reading about Multivariate_Analysis in a textbook than be learning how to use Isar and Isabelle/HOL like this.
By all appearances, the concept of the length of a vector of type ('a, 'b) vec is extraordinarily simple. It is the cardinality of the universal set of the type 'b::finite.
Intuitively, it makes sense, so I commit to the idea prematurely, but I don't permanently commit because I can't finish my example.
I added an update to the end of my "investigative" theory below.
What I hadn't done before is instantiate my example type, t123, a type defined with the set {c1,c2,c3}, as type class top.
The shorter story is that in pursuing top, value tipped me off that type class card_UNIV is involved, where card_UNIV is based on finite_UNIV. Again, the descriptive identifiers make it seem that if my type t123 is of type class finite_UNIV, then I can calculate the cardinality of it with card, which will be the length of any vector using type t123 as the index set.
I show some terms here which indicate what's involved, which, as usual, can be investigated by cntl-clicking on various identifiers, if you have my example theory loaded. A little more detail is in my investigative source below.
term "UNIV::t123 set"
term "top::t123 set"
term "card (UNIV::t123 set)" (*OUTPUT PANEL: CARD(t123)::nat.*)
term "card (top::t123 set)" (*OUTPUT PANEL: CARD(t123)::nat.*)
value "card (top::t123 set)" (*ERROR: Type t123 not of sort card_UNIV.*)
term "card_UNIV"
term "finite_UNIV"
(End of update.)
140112 Final update to the final update
It paid to not permanently commit, and though answering questions is a good way to learn, there is also downside under these circumstances.
For the vector type, the only type class that's part of the definition is finite, but then, above, what I'm doing involves type class finite_UNIV, which is in src/HOL/Library/Cardinality.thy.
Trying to use card, like with card (UNIV::t123 set), won't work for type vec because you can't assume that type class finite_UNIV has been instantiated for the index set type. If I'm wrong here with what seems to be obvious now, I'd like to know.
Well, even though the function I defined, vector_length, doesn't try to take the cardinality of UNIV::'b set directly, with my example, the simplifier produces the goal CARD(t123) = (3::nat).
I speculate on what that means for myself, but I haven't tracked down CARD, so I keep my speculations to myself.
(End of update.)
140117 Final final final
Trying to use value to learn about the use of card led me astray. The value command is based on the code generator, and value will have type class requirements that aren't needed in general.
There's no requirement that the index set be instantiated for type class finite_UNIV. It's just that the logic needed to be able to use card (UNIV::('b::finite set)) has to be in place.
It seems like the logic should already be there in Multivariate_Analysis for anything I've done. Anything I've said is subject to error.
(End of update.)
Conclusion About My Experience Here with vec in Multivariate_Analysis
Using generalized index sets seems overly complex, at least for me. Vectors as lists seems like what I would want, like with Matrix.thy, but maybe things need to be complex at times.
The biggest pain is using typedef to create a type which has a finite universal set. I don't know how to easily create finite sets. I saw a comment in the past that it's best to stay away from typedef. It sounds good at first, that it creates a type based on a set, but it ends up being a hassle to deal with.
[I comment further here about finite, generalized index sets being used in vec. I have to resort to a ZFC definition, because I have no idea where textbooks are that formalize general mathematics with type theory. This wiki article shows a generalized Cartesian product:
Wiki: Infinite product definition using a finite or infinite index set
Key to the definition is that an infinite set can be used as the index set, such as the real numbers.
As far as using a finite set as an index set, any finite set of cardinality n can be put one-to-one with the natural numbers 1...n, and a finite, natural number ordering is normally how we would use a vector.
It's not that I don't believe that someone, somewhere needs vectors with a finite index set that's not the natural numbers, but all the math I've seen for vectors and matrices is vectors of length n::nat, or n::nat x m::nat matrices.
For myself, I would think that the best vector and matrix would be based on list, since the component location of a list is based on natural numbers. There's a lot of computational magic that comes from using an Isabelle/HOL list.]
What I worked Through to Get the Above
It took me a lot of work to work through this. I know much less of how to use Isabelle than much more.
(*It's much faster to start jEdit with Multivariate_Analysis as the logic.*)
theory i140107a__Multvariate_Ana_vec_length
imports Complex_Main Multivariate_Analysis (*"../../../iHelp/i"*)
begin
declare[[show_sorts=true]] (*Set false if you don't want typing shown.*)
declare[[show_brackets=true]]
(*---FINITE UNIVERSAL SET, NOT FINITE SET
*)
(*
First, we need to understand what `x::('a::finite)` means. It means that
`x` is a type for which the universal set of it's type is finite, where
the universal set is `UNIV::('a set)`. It does not mean that terms of type
`'a::finite` are finite sets.
The use of `typedef` below will hopefully make this clear. The following are
related to all of this, cntl-click on them to investigate them.
*)
term "x::('a::finite)"
term "finite::('a set => bool)" (*the finite predicate*)
term "UNIV::('a set) == top" (*UNIV is designated universal set in Set.thy.*)
term "finite (UNIV :: 'a set)"
term "finite (top :: 'a set)"
(*
It happens to be that the `finite` predicate is used in the definition of
type class `finite`. Here are some pertinent snippets, after which I comment
on them:
class top =
fixes top :: 'a ("⊤")
abbreviation UNIV :: "'a set" where
"UNIV == top"
class finite =
assumes finite_UNIV: "finite (UNIV :: 'a set)"
The `assumes` in the `finite` type-class specifies that constant `top::'a set`
is finite, where `top` can be seen as defined in type-class `top`. Thus, any
type of type-class `top` must have a `top` constant.
The constant `top` is in Orderings.thy, and the Orderings theory comes next
after HOL.thy, which is fundamental. As to why this use of the constant `top`
by type-class `finite` can make the universe of a type finite, I don't know.
*)
(*---DISCOVERING LOWER LEVEL SYNTAX TO WORK WITH
*)
(*
From the output panel, I copied the type shown for `term "v::('a ^ 'b)"`. I
then cntl-clicked on `vec` to take me to the `vec` definition.
*)
term "v::('a ^ 'b)"
term "v::('a,'b::finite) vec"
(*
The `typedef` command defines the `('a, 'b) vec` type as an element of a
particular set, in particular, as an element in the set of all functions of
type `('b::finite) => 'a`. I rename `vec` to `vec2` so I can experiment with
`vec2`.
*)
typedef ('a, 'b) vec2 = "UNIV :: (('b::finite) => 'a) set"
by(auto)
notation
Rep_vec2 (infixl "$$" 90)
(*
The `morphisms` command renamed `Rep_vec` and `Abs_vec` to `vec_nth` and
`vec_lambda`, but I don't rename them for `vec2`. To create the `vec_length`
function, I'll be using the `Rep` function, which is `vec_nth` for `vec`.
However, the `Abs` function comes into play further down with the concrete
examples. It's used to coerce a function into a type that uses the type
construcor `vec`.
*)
term "Rep_vec2::(('a, 'b::finite) vec2 => ('b::finite => 'a))"
term "Abs_vec2::(('a::finite => 'b) => ('b, 'a::finite) vec2)"
(*---FIGURING OUT HOW THE REP FUNCTION WORKS WITH 0, 1, OR 2 ARGS
*)
(*
To figure it all out, I need to study these Rep_t function types. The type
of terms without explicit typing have the type shown below them, with the
appropriate `vec` or `vec2`.
*)
term "op $"
term "vec_nth"
term "op $$"
term "Rep_vec2::(('a, 'b::finite) vec2 => ('b::finite => 'a))"
term "op $ x"
term "vec_nth x"
term "op $$ x"
term "(Rep_vec2 x)::('b::finite => 'a)"
term "x $ i"
term "op $ x i"
term "vec_nth x i"
term "x $$ i"
term "op $$ x i"
term "(Rep_vec2 (x::('a, 'b::finite) vec2) (i::('b::finite))) :: 'a"
(*
No brackets shows more clearly that `x $$ i` is the curried function
`Rep_vec2` taking the arguments `x::(('a, 'b::finite) vec2)` and
`i::('b::finite)`.
*)
term "Rep_vec2::('a, 'b::finite) vec2 => 'b::finite => 'a"
(*---THE FUNCTION FOR THE LENGTH OF A VECTOR*)
(*
This is based on your `card_diagonal`, but it's `card` of the range of
`vec_nth v`. You want `card` of the domain.
*)
theorem "{ (v $ i) | i. True } = {c. ? i. c = (v $ i)}"
by(simp)
definition range_size :: "('a, 'b::finite) vec => nat" where
"range_size v = card {c. ? i. c = (v $ i)}"
declare
range_size_def [simp add]
(*
This is the card of the domain of `(vec_nth v)::('b::finite => 'a)`. I use
`vec_nth v` just to emphasize that what we want is `card` of the domain.
*)
theorem "(vec_nth v) i = (v $ i)"
by(simp)
definition vec_length :: "('a, 'b::finite) vec => nat" where
"vec_length v = card {i. ? c. c = (vec_nth v) i}"
declare
vec_length_def [simp add]
theorem
"∀x y. vec_length (x::('a, 'b) vec) = vec_length (y::('a, 'b::finite) vec)"
by(simp)
(*---EXAMPLES TO TEST THINGS OUT
*)
(*
Creating some constants.
*)
typedecl cT
consts
c1::cT
c2::cT
c3::cT
(*
Creating a type using the set {c1,c2,c3}.
*)
typedef t123 = "{c1,c2,c3}"
by(auto)
(*
The functions Abs_t123 and Rep_t123 are created. I have to use Abs_t123 below
to coerce the type of `cT` to `t123`. Here, I show the type of `Abs_t123`.
*)
term "Abs_t123 :: (cT => t123)"
term "Abs_t123 c1 :: t123"
(*
Use these `declare` commands to do automatic `Abs` coercion. I comment
them out to show how I do coercions explicitly.
*)
(*declare [[coercion_enabled]]*)
(*declare [[coercion Abs_t123]]*)
(*
I have to instantiate type `t123` as type-class `finite`. It seems it should
be simple to prove, but I can't prove it, so I use `sorry`.
*)
instantiation t123 :: finite
begin
instance sorry
end
term "UNIV::t123 set"
term "card (UNIV::t123 set)"
theorem "card (UNIV::t123 set) = 3"
try0
oops
(*
Generalized vectors use an index set, in this case `{c1,c2,c3}`. A vector is
an element from the set `(('b::finite) => 'a) set`. Concretely, my vectors are
going to be from the set `(t123 => nat) set`. I define a vector by defining a
function `t123_to_0`. Using normal vector notation, it is the vector
`<0,0,0>`. Using ZFC ordered pair function notation, it is the set
{(c1,0),(c2,0),(c3,0)}.
*)
definition t123_to_0 :: "t123 => nat" where
"t123_to_0 x = 0"
declare
t123_to_0_def [simp add]
(*
I'm going to have to use `vec_lambda`, `vec_nth`, and `Abs_t123`, so I create
some `term` variations to look at types in the output panel, to try to figure
out how to mix and match functions and arguments.
*)
term "vec_lambda (f::('a::finite => 'b)) :: ('b, 'a::finite) vec"
term "vec_lambda t123_to_0 :: (nat, t123) vec"
term "vec_nth (vec_lambda t123_to_0)"
term "vec_nth (vec_lambda t123_to_0) (Abs_t123 c1)"
(*
The function `vec_length` seems to work. You'd think that `CARD(t123) = 3`
would be true. I try to cntl-click on `CARD`, but it doesn't work.
*)
theorem "vec_length (vec_lambda t123_to_0) = (3::nat)"
apply(simp)
(*GOAL: (CARD(t123) = (3::nat))*)
oops
theorem "(vec_nth (vec_lambda t123_to_0) (Abs_t123 c1)) = (0::nat)"
by(auto)
theorem "range_size (vec_lambda t123_to_0) = (1::nat)"
by(auto)
definition t123_to_x :: "t123 => t123" where
"t123_to_x x = x"
declare
t123_to_x_def [simp add]
theorem "(vec_nth (vec_lambda t123_to_x) (Abs_t123 c1)) = (Abs_t123 c1)"
by(auto)
theorem "(vec_nth (vec_lambda t123_to_x) (Abs_t123 c2)) = (Abs_t123 c2)"
by(auto)
(*THE LENGTH BASED SOLELY ON THE TYPE, NOT ON A PARTICULAR VECTOR
*)
(*Update 140111: The length of a vector is going to be the cardinality of the
universal set of the type, `UNIV::('a::finite set)`. For `t123`, the following
terms are involved.
*)
term "UNIV::t123 set"
term "top::t123 set"
term "card (UNIV::t123 set)" (*OUTPUT PANEL: CARD(t123)::nat.*)
term "card (top::t123 set)" (*OUTPUT PANEL: CARD(t123)::nat.*)
(*
It can be seen that `card (top::t123 set)` is the same as the theorem above
with the goal `CARD(t123) = (3::nat)`. What I didn't do above is instantiate
type `t123` for type-class `top`. I try to define `top_t123`, but it gives me
an error.
*)
instantiation t123 :: top
begin
definition top_t123 :: "t123 set" where
"top_t123 = {Abs_t123 c1, Abs_t123 c2, Abs_t123 c3}"
(*ERROR
Clash of specifications
"i140107a__Multvariate_Ana_vec_length.top_set_inst.top_set_def" and
"Set.top_set_inst.top_set_def" for constant "Orderings.top_class.top"
*)
instance sorry
end
(*To define the cardinality of type `t123` appears to be an involved process,
but maybe there's one easy type-class that can be instantiated that gives me
everything I need. The use of `value` shows that type `t123` needs to be
type-class `card_UNIV`, but `card_UNIV` is based on class `finite_UNIV`.
Understanding it all is involved enough to give job security to a person who
does understand it.
*)
value "card (top::t123 set)" (*ERROR: Type t123 not of sort card_UNIV.*)
term "card_UNIV"
term "finite_UNIV"
(******************************************************************************)
end
The First Parts of My Answer
(Because the imports weren't shown for the source, it wasn't obvious where any of the operators were coming from. There's also the Matrix AFP entry to confuse things. Additionally, other than atomic constants and variables in HOL, most everything is a function, so classifying something as a function doesn't clarify anything without some context. Providing source that won't produce errors helps. The normal entry point is Complex_Main. That sums up most of what I had said here. )
Links to Related Questions
[13-05-27] Isabelle: how to work with matrices
[13-05-30] Isabelle: transpose a matrix that includes a constant factor
[13-06-25] Isabelle matrix arithmetic: det_linear_row_setsum in library with different notation
[13-08-12] Isabelle: maximum value in a vector
[13-09-12] Degree of polynomial smaller than a number
[13-11-21] Isabelle: degree of polynomial multiplied with constant
[13-11-26] Isabelle: Power of a matrix (A^n)?
[13-12-01] Isabelle: difference between A * 1 and A ** mat 1
[14-01-17] Isabelle: Issue with setprod
I'm carrying out some experiments in theorem proving with combinator logic, which is looking promising, but there's one stumbling block: it has been pointed out that in combinator logic it is true that e.g. I = SKK but this is not a theorem, it has to be added as an axiom. Does anyone know of a complete list of the axioms that need to be added?
Edit: You can of course prove by hand that I = SKK, but unless I'm missing something, it's not a theorem within the system of combinator logic with equality. That having been said, you can just macro expand I to SKK... but I'm still missing something important. Taking the set of clauses p(X) and ~p(X), which easily resolve to a contradiction in ordinary first-order logic, and converting them to SK, performing substitution and evaluating all calls of S and K, my program generates the following (where I am using ' for Unlambda's backtick):
''eq ''s ''s ''s 'k s ''s ''s 'k s ''s 'k k 'k eq ''s ''s 'k s 'k k 'k k ''s 'k k 'k false 'k true 'k true
It looks like maybe what I need is an appropriate set of rules for handling the partial calls 'k and ''s, I'm just not seeing what those rules should be, and all the literature I can find in this area was written for a target audience of mathematicians not programmers. I suspect the answer is probably quite simple once you understand it.
Some textbooks define I as mere alias for ((S K) K). In this case they are identical (as terms) per definitionem. To prove their equality (as functions), we need only to prove that equality is reflexive, which can be achieved by a reflexivity axiom scheme:
Proposition ``E = E'' is deducible (Reflexivity axiom scheme, instantiated for each possible terms denoted here by metavariable E)
Thus, I suppose in the followings, that Your questions investigates another approach: when combinator I is not defined as a mere alias for compound term ((S K) K), but introduced as a standalone basic combinator constant on its own, whose operational semantics is declared explicitly by axiom scheme
``(I E) = E'' is deducible (I-axiom scheme)
I suppose Your question asks
whether we can deduce formally (remaining inside the system), that such a standalone-defined I behaves exactly as ((S K) K), when used as functions in reductions?
I think we can, but we must resort to stronger tools. I conjecture that the usual axiom schemes are not enough, we have to declare also the extensionality property (equality of functions), that's the main point. If we want to formalize extensionality as an axiom, we have to augment our object language with free variables.
I think, we have to adopt such an approach for building combinatory logic, that we have to allow also the use of variables in the object langauge. Oof course, I mean "just" free valuables. Using bound variables would be cheating, we have to remain inside the realm of combinatory logic. Using free varaibles is not cheating, it's a honest tool. Thus, we can do the formal proof You required.
Besides the straightforward equality axioms and rules of inference (transitivity, reflexivity, symmetry, Leibniz rules), we must add an extensionality rule of inference for equality. Here is the point where free variables matter.
In Csörnyei 2007: 157-158, I have found the following approach. I think this way the proof can be done.
Some remarks:
Most of the axioms are in fact axiom schemes, consisting of infinitely many axiom instances. The instances must be instantiated for for every possible E, F, G terms. Here, I use italics for metavariables.
The superficial infinite nature of axiom schemes won't raise computability problems, because they can be tackled in a finite time: our axiom system is recursive. It means that a clever parser can decide in a finite time (moreover, very effectively), whether a given proposition is an instance of an axiom scheme, or not. Thus, the usage of axiom schemes does not raise neither theoretical nor practical problems.
Now let us seem our framework:
Language
ALPHABET
Constants: The following three are called constants: K, S, I.
I added the constant I only because Your question presupposes that we have not defined the combinator I as an mere alias/macro for compound term S K K, but it is a standalone constant on its own.
I shall denote constants by boldface roman capitals.
Sign of application: A sign # of ``application'' is enough (prefix notation with arity 2). As syntactic sugar, I use here parantheses instead of the explicit application sign: I shall use the explicit both opening ( and closing ) signs.
Variables: Although combinator logic does not make use of bound variables, scope etc, but we can introduce free variables. I suspect, they are not only syntactic sugar, they can strengthen the deduction system, too. I conjecture, that Your question will require their usage. Any enumerable infinite set (disjoint of the constants and parenthesis signs) will serve as the alphabet of variables, I will denote them here with unformatted roman lowercase letters x, y, z...
TERMS
Terms are defined inductively:
Any constant is a term
Any variable is a term
If E is a term, and F is a term too, then also (E F) is a term
I sometimes use practical conventions as syntactic sugar, e.g. write
E F G H
instead of
(((E F) G) H).
Deduction
Conversion axiom schemes:
``K E F = E'' is deducible (K-axiom scheme)
``S F G H = F H (G H)'' is deducible (S-axiom scheme)
``I E = E'' is deducible (I-axiom scheme)
I added the third conversion axiom (I rule) only because Your question presupposes that we have not defined the combinator I as an alias/macro for S K K.
Equality axiom schemes and rules of inference
``E = E'' is deducible (Reflexivity axiom)
If "E = F" is deducible, then "F = E" is also deducible (Symmetry rule of inference)
If "E = F" is deducible, and "F = G" is deducible too, then also "E = G" is reducible (Transitivity rule)
If "E = F" is deducible, then "E G = F G" is also deducible (Leibniz rule I)
If "E = F" is deducible, then "G E = G F" is also deducible (Leibniz rule II)
Question
Now let us investigate Your question. I conjecture that the deduction system defined so far is not strong enough to prove Your question.
Is proposition "I = S K K" deducible?
The problem is, that we have to prove the equivalence of functions. We regard two functions equivalent if they behave the same way. Functions act so that they are applied to arguments. We should prove that both functions act the same way if applied to each possible arguments. Again, the problem with infinity! I suspect, axioms schemes can't help us here. Something like
If E F = G F is deducible, then also E = G is deducible
would fail to do the job: we can see that this does not yield what we want. Using it, we can prove that
``I E = S K K E'' is deducible
for each E term instance, but these results are only separated instances of, and cannot be used as a whole for further deductions. We have only concrete results (infinitely many), not being able to summarize them:
it holds for E := K
holds for E := S
it holds for E := K K
.
.
.
...
we cannot summarize these fragmented result instances into a single great result, stating extensionality! We cannot pour these low-value fragment into the funnel a rule of inference that would melt them together into a single more valuable result.
We have to augment the power of our deduction system. We have to find a formal tool that can grasps the problem. Your questions leads to extensionality, and I think, declaring extensionality needs that we can pose propositions that hold for *****arbitrary***** instances. That's why I think we must allow free variables inside our object language. I conjecture that the following additional rule of inference will do the work:
If variable x is not part of terms neither E nor F, and statement (E x) = (F x) is deducible, then E = F is also deducible (Extensionality rule of inference)
The hard thing in this axiom, easily leading to confusion: x is an object variables, fully emancipated and respected parts of our object language, while E and G are metavariables, not parts of the object language, but used only for a concise notation of axiom schemes.
(Remark: More precisely, the extensionality rule of inference should be formalized in a more careful way, introducing a metavariable x over all possible object variables x, y, z..., and also another kind of metavariable E over all possible term instances. But this distinction among the two kinds of metavariables plus the object variables is not so didactic here, it does not affect Your question too much.)
Proof
Let us prove now the proposition that ``I = S K K''.
Steps for left-hand side:
proposition ``I x = x'' is an instance of I-axiom scheme with instatiation [E := x]
Steps for right-hand side:
Proposition "S K K x = K x (K x)" is an instance of S-axiom scheme with instantiations [E := K, F := K, G := x], thus it is deducible
Proposition "K x (K x) = x" is an instance of K-axiom scheme with instantiations [E := x, F := K x], thus it is deducible
Transitivity of equality:
Statement "S K K x = K x (K x)" matches the first premise of transitivity rule of inference, and statement "K x (K x) = x" matches the second premise of this rule of inference. The instantiations are [E := S K K x, F := K x (K x), G = x]. Thus the conclusion holds too: E = G. Rewriting the conclusion with the same instantiations, we get statement "S K K x = x", thus, this is deducible.
Symmetry of equality:
Using "S K K x = x", we can infer "x = S K K x"
Transitivity of equality:
Using "I x = x" and "x = S K K x", we can infer "I x = S K K x"
Now we have paved the way for the crucial point:
Proposition "I x = S K K x" matches with the first premise of Extension rule of inference: (E x) = (F x), with instantiations [E := I, F := S K K]. Thus the conclusion must also hold, that is, "E = F" with the same instantiations ([E := I, F := S K K]), yielding proposition "I = S K K", quod erat demonstrandum.
Csörnyei, Zoltán (2007): Lambda-kalkulus. A funkcionális programozás alapjai. Budapest: Typotex. ISBN-978-963-9664-46-3.
You don't need to define I as an axiom. Start with the following:
I.x = x
K.x y = x
S.x y z = x z (y z)
Since SKanything = anything, then SKanything is an identity function, just like I.
So, I = SKK and I = SKS. No need to define I as an axiom, you can define it as syntax sugar which aliases SKK.
The definitions of S and K are you only axioms.
The usual axioms are complete for beta equality, but do not give eta equality. Curry found a set of about thirty axioms to the usual ones to get completeness for beta-eta equality. They're listed in Hindley & Seldin's Introduction to combinators and lambda-calculus.
Roger Hindley, Curry's Last Problem, lists some additional desiderata we might want from mappings between the lambda calculus and notes that we don't have mappings that satisfy all of them. You likely won't care much about all of the criteria.