Algorithm For Intersection of Logical Expressions? - algorithm

Given a set U of n elements, and a set P of m properties, where each element of P defines a function from U to boolean.
Given two composite logical expressions of the form (recursively defined):
p1 : true iff p1(x) is true
e1 and e2 : means e1 and e2 are both true
e1 or e2 : means e1 and e2 are not both false
not e1 : true iff e1 is false
(e1) : true iff e1
These logical expressions are parsed into expression statements (parse trees).
Assume that for any p1, p2: All four sets (p1 and p2), (p1 and not p2), (not p1 and p2), (not p1 and not p2), are non-empty.
I want to determine if a logical expression L1 is a subset of L2. That is, for every element x in U, if L1(x) is true then L2(x) is true.
So for example:
is_subset(not not p1, p1) is true
is_subset(p1, p2) is false
is_subset(p1 and p2 and p3, (p1 and p2) or p3) is true
I think I need to "normalize" the parse trees somehow and then compare them. Can anyone outline an approach or sketch an architecture?

Since you don't do anything with the objects x, it seems you want propositional logic, where all combinations of truth values for p1 to pn are possible.
So essentially you want to do theorem proving in propositional logic.
Your is_subset(e1, e2) translates to the logical implication e1 implies e2, which is the same as not e1 or e2. To know whether this holds universally, you can check whether its negation, e1 and not e2, is unsatisfiable, using a satisfiability-checking algorithm such as DPLL.
This is just a starting point, there are many other options to prove theorems in propositional logic.
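To make this concrete, here is a minimal sketch in Python (the tuple representation and function names are our own, not from the question). It checks that e1 and not e2 is unsatisfiable by exhaustive truth-table enumeration, a simple stand-in for a real DPLL implementation:

```python
from itertools import product

# Expressions as nested tuples (our own representation), e.g.
# ('and', ('var', 'p1'), ('not', ('var', 'p2'))).
def evaluate(expr, assignment):
    op = expr[0]
    if op == 'var':
        return assignment[expr[1]]
    if op == 'not':
        return not evaluate(expr[1], assignment)
    if op == 'and':
        return evaluate(expr[1], assignment) and evaluate(expr[2], assignment)
    if op == 'or':
        return evaluate(expr[1], assignment) or evaluate(expr[2], assignment)
    raise ValueError(op)

def variables(expr):
    if expr[0] == 'var':
        return {expr[1]}
    return set().union(*(variables(sub) for sub in expr[1:]))

def is_subset(e1, e2):
    """True iff e1 implies e2 under every assignment,
    i.e. 'e1 and not e2' is unsatisfiable."""
    vs = sorted(variables(e1) | variables(e2))
    for values in product([False, True], repeat=len(vs)):
        assignment = dict(zip(vs, values))
        if evaluate(e1, assignment) and not evaluate(e2, assignment):
            return False  # found a countermodel
    return True
```

For example, is_subset(('not', ('not', ('var', 'p1'))), ('var', 'p1')) holds, matching the first example in the question.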

You can convert each formula to disjunctive normal form and check whether one contains a subset of the conjunctive clauses in the other. The complexity of this approach grows exponentially in the number of distinct propositions pn mentioned.
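A hedged sketch of this clause-containment test, assuming both formulas have already been converted to DNF (the representation is our own: a DNF formula is a list of conjunctive clauses, each a frozenset of (variable, polarity) literals). Note this particular check is sound but not complete in general; it does handle the examples from the question:

```python
def dnf_is_subset(dnf1, dnf2):
    # Sound check: every clause of dnf1 must extend (be a superset of) some
    # clause of dnf2 -- then any assignment satisfying that dnf1 clause also
    # satisfies the corresponding dnf2 clause, hence dnf2 as a whole.
    return all(any(c2 <= c1 for c2 in dnf2) for c1 in dnf1)

# is_subset(p1 and p2 and p3, (p1 and p2) or p3), in DNF:
dnf1 = [frozenset({('p1', True), ('p2', True), ('p3', True)})]
dnf2 = [frozenset({('p1', True), ('p2', True)}), frozenset({('p3', True)})]
```

Here dnf_is_subset(dnf1, dnf2) holds while the reverse direction does not, matching the question's third example.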

I think your instructor essentially wants you to implement the Quine-McCluskey algorithm. Note that, as the other answer implies, the execution time grows very quickly because the problem is NP-hard.

Related

Disjoint Sets of Strings - Minimization Problem

There are two sets, s1 and s2, each containing pairs of letters. A pair is only equivalent to another pair if their letters are in the same order, so they're essentially strings (of length 2). The sets s1 and s2 are disjoint, neither set is empty, and each pair of letters only appears once.
Here is an example of what the two sets might look like:
s1 = { ax, bx, cy, dy }
s2 = { ay, by, cx, dx }
The set of all letters in (s1 ∪ s2) is called sl. The set sr is a set of letters of your choice, but must be a subset of sl. Your goal is to define a mapping m from letters in sl to letters in sr, which, when applied to s1 and s2, will generate the sets s1' and s2', which also contain pairs of letters and must also be disjoint.
The most obvious m just maps each letter to itself. In this example (shown below), s1 is equivalent to s1', and s2 is equivalent to s2' (but given any other m, that would not be the case).
a -> a
b -> b
c -> c
d -> d
x -> x
y -> y
The goal is to construct m such that sr (the set of letters on the right-hand side of the mapping) has the fewest number of letters possible. To accomplish this, you can map multiple letters in sl to the same letter in sr. Note that depending on s1 and s2, and depending on m, you could potentially break the rule that s1' and s2' must be disjoint. For example, you would obviously break that rule by mapping every letter in sl to a single letter in sr.
So, given s1 and s2, how can someone construct an m that minimizes sr, while ensuring that s1' and s2' are disjoint?
This problem is NP-hard; to show this, we reduce graph coloring to it.
Proof:
Let G = (V, E) be the graph whose minimum coloring we want to compute. Formally, we want the chromatic number of the graph: the lowest k for which G is k-colourable.
To reduce the graph coloring problem to the problem described here, define
s1 = { zu : (u,v) ∈ E }
s2 = { zv : (u,v) ∈ E }
where z is a magic letter used only in constructing s1 and s2.
By construction of the sets above, for any mapping m and any edge (u,v) we must have m(u) != m(v), since otherwise the disjointness of s1' and s2' would be violated. Thus, any optimal sr is an optimal set of colors (apart from the image of z) for the graph G, and m is the mapping that assigns each node its color. QED.
The proof above may give the intuition that researching graph coloring approximations would be a good start, and indeed it probably would be, but there is a confounding factor. That factor is that for two elements ab ∈ s1 and cd ∈ s2, if m(a) = m(c) then m(b) != m(d). Logically, this is equivalent to the statement m(a) != m(c) or m(b) != m(d). These types of constraints, in isolation, do not map naturally to an analogous graph problem (because of the or).
There are ways to formulate this problem as a (binary) ILP and solve it as such. This would likely give you (slightly) inferior results compared to a custom-designed and tuned branch-and-bound implementation (assuming you want the optimal solution), but it would work with turn-key solvers.
If you are more interested in approximations (possibly with guaranteed optimality ratios), I would investigate an SDP relaxation of the problem and an appropriate rounding scheme. That level of work would be roughly what one would invest in a small-to-medium-sized research paper.
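For tiny instances like the example above, one can find an optimal m by brute force before reaching for ILP or SDP machinery. A hedged sketch (exponential in the number of letters, helper names are ours):

```python
from itertools import product

def apply_mapping(pairs, m):
    # Apply the letter mapping m to every two-letter pair.
    return {m[a] + m[b] for a, b in pairs}

def minimal_mapping(s1, s2):
    """Try target alphabets of increasing size k; return the first mapping m
    keeping the image sets disjoint (k**|sl| candidates per k)."""
    letters = sorted({ch for p in s1 | s2 for ch in p})
    for k in range(1, len(letters) + 1):
        targets = letters[:k]  # reuse the first k letters of sl as sr
        for choice in product(targets, repeat=len(letters)):
            m = dict(zip(letters, choice))
            if apply_mapping(s1, m).isdisjoint(apply_mapping(s2, m)):
                return m
    return None  # unreachable: the identity mapping always works

s1 = {'ax', 'bx', 'cy', 'dy'}
s2 = {'ay', 'by', 'cx', 'dx'}
```

On the example sets, this finds a mapping onto just two target letters, since mapping everything to one letter cannot keep s1' and s2' disjoint.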

Prove whether this language is decidable and recognizable

If L1 and L2 are languages we have a new language
INTERLACE(L1, L2) = { w1v1w2v2...wnvn | w1w2...wn ∈ L1, v1v2...vn ∈ L2 }.
For example, if abc ∈ L1 and 123 ∈ L2, then a1b2c3 ∈ INTERLACE(L1, L2)
How can I prove that the INTERLACE is:
decidable ?
recognizable ?
I know how to show this language is regular.
For decidable I am not so sure.
Here's what I think:
To show that the class of decidable languages is closed under INTERLACE, I need to show that if A and B are two decidable languages, then INTERLACE(A, B) is decidable. Suppose A and B are decidable languages and M1, M2 are two TMs that decide them, respectively.
After that, I think I have to describe how to construct the machine that decides the language?
L1 and L2 decidable ==> INTERLACE(L1, L2) decidable
Citation from Wikipedia:
There are two equivalent major definitions for the concept of a recursive (also decidable) language:
...
2. A recursive language is a formal language for which there exists a Turing machine that, when presented with any finite input string, halts and accepts if the string is in the language, and halts and rejects otherwise.
Using this definition:
If L1 and L2 are decidable, then algorithms (or Turing machines) M1 and M2 exist, so that:
M1 accepts all inputs w ∈ L1 and rejects all inputs w ∉ L1.
M2 accepts all inputs v ∈ L2 and rejects all inputs v ∉ L2.
Now let's construct algorithm M which accepts all inputs x ∈ INTERLACE(L1, L2) and rejects all inputs x ∉ INTERLACE(L1, L2), as follows:
Given an input x1 x2 .. xn.
If n is odd, reject the input, otherwise (n is even):
Run M1 for the input x1 x3 x5 .. xn-1. If M1 rejects this input, then M rejects its input and finishes, otherwise (M1 accepted its input):
Run M2 for the input x2 x4 x6 .. xn. If M2 rejects this input, then M rejects its input, otherwise M accepts its input.
One can easily prove that M is the decision algorithm for INTERLACE(L1, L2), thus, the language is decidable.
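The decision procedure above can be sketched directly (assuming, as in the a1b2c3 example, that the interleaving alternates one symbol at a time; decide_L1 and decide_L2 stand in for the total deciders M1 and M2):

```python
def interlace_decider(x, decide_L1, decide_L2):
    """Decider M for INTERLACE(L1, L2), one-symbol alternation assumed.
    decide_L1 / decide_L2 are stand-ins for total deciders of L1 and L2."""
    if len(x) % 2 != 0:
        return False          # odd length: reject immediately
    w = x[0::2]               # x1 x3 x5 .. xn-1, the candidate L1 part
    v = x[1::2]               # x2 x4 x6 .. xn, the candidate L2 part
    return decide_L1(w) and decide_L2(v)
```

With L1 = {abc} and L2 = {123}, the input a1b2c3 is accepted, matching the example in the question.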
L1 and L2 recognizable ==> INTERLACE(L1, L2) recognizable
Citation from Wikipedia:
There are three equivalent definitions of a recursively enumerable (also recognizable) language:
...
3. A recursively enumerable language is a formal language for which there exists a Turing machine (or other computable function) that will halt and accept when presented with any string in the language as input but may either halt and reject or loop forever when presented with a string not in the language. Contrast this to recursive languages, which require that the Turing machine halts in all cases.
The proof is very similar to the proof of the 'decidable' property.
If L1 and L2 are recognizable, then algorithms R1 and R2 exist, so that:
R1 accepts all inputs w ∈ L1 and rejects or loops forever for all inputs w ∉ L1.
R2 accepts all inputs v ∈ L2 and rejects or loops forever for all inputs v ∉ L2.
Let's construct algorithm R which accepts all inputs x ∈ INTERLACE(L1, L2) and rejects or loops forever for all inputs x ∉ INTERLACE(L1, L2):
Given an input x1 x2 .. xn.
If n is odd, reject the input, otherwise (n is even):
Run R1 for the input x1 x3 x5 .. xn-1. If R1 loops forever, then R loops forever as well ("automatically"). If R1 rejects this input, then R rejects its input and finishes, otherwise (if R1 accepts its input):
Run R2 for the input x2 x4 x6 .. xn. If R2 loops forever, then R loops forever as well. If R2 rejects this input, then R rejects its input, otherwise R accepts its input.
P.S. you were almost there, actually ;)

First sets of LL(1) parser

I have some problems understanding the following rules for computing the FIRST sets of an LL(1) parser:
b) Else X1 is a nonterminal, so add First(X1) - ε to First(u).  
a. If X1 is a nullable nonterminal, i.e., X1 =>* ε,  add First(X2) - ε to First(u). 
Furthermore, if X2 can also go to ε, then add First(X3) - ε and so on, through all Xn until the first non­nullable symbol is encountered. 
b. If X1X2...Xn =>* ε, add ε to the first set.
How come at b), if X1 is a nonterminal, ε can't be added to First(u)? So if I have
S-> A / a
A-> b / ε
F(A) = {b,ε}
F(S) = {b,ε,a}
is that not correct? Also, the sub-points a and b are confusing.
All it says is which terminals you can expect in a sentential form when you replace S by AB in a leftmost derivation. So, if A derives ε, then in a leftmost derivation you can replace A by ε. Now you depend on B, and so on. Consider this sample grammar:
S -> AB
A -> ε
B -> h
So, if there is a string with just one character/terminal "h" and you start verifying whether this string is valid by checking if there is any leftmost derivation deriving the string using the above grammar, then you can safely replace S by AB because A will derive ε and B will derive h.
Therefore, the language recognized by the above grammar cannot contain the empty string ε. For ε to be in the language, B would also have to derive ε; if both non-terminals A and B derive ε, then S derives ε.
That is, if there is some production S->ABCD, then only if all of the non-terminals A, B, C and D derive ε can S also derive ε, and only then will ε be in FIRST(S).
The FIRST sets given by you are correct. I think you are confused because the production S->A has only the single nonterminal A on its rhs, and this A derives ε. Applying b) naively would give FIRST(S) = (FIRST(A) - {ε}) ∪ {a} = {b, a}, which is incorrect. Since the rhs consists entirely of the nullable nonterminal A, there is the derivation S -> A -> ε, which means FIRST(S) contains ε, i.e. S can derive the null string ε.
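The fixpoint computation behind these rules can be sketched as follows (the grammar representation is our own: each nonterminal maps to a list of right-hand sides, with [] meaning ε; the symbol ε is represented literally):

```python
def first_sets(grammar):
    nonterminals = set(grammar)
    first = {A: set() for A in grammar}
    changed = True
    while changed:                        # iterate to a fixpoint
        changed = False
        for A, rhss in grammar.items():
            for rhs in rhss:
                for X in rhs:
                    if X in nonterminals:
                        add = first[X] - {'ε'}   # rule b): First(X) minus ε
                        nullable = 'ε' in first[X]
                    else:
                        add = {X}                # a terminal starts the rhs here
                        nullable = False
                    if not add <= first[A]:
                        first[A] |= add
                        changed = True
                    if not nullable:
                        break                    # first non-nullable symbol
                else:
                    # every symbol (possibly none) can vanish: X1..Xn =>* ε
                    if 'ε' not in first[A]:
                        first[A].add('ε')
                        changed = True
    return first
```

On the grammar from the question (S -> A | a, A -> b | ε) this yields FIRST(A) = {b, ε} and FIRST(S) = {b, ε, a}, as expected.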

Why are the set of variables in lambda calculus typically defined as countable infinite?

When reading formal descriptions of the lambda calculus, the set of variables seems to always be defined as countably infinite. Why this set cannot be finite seems clear; defining the set of variables as finite would restrict term constructions in unacceptable ways. However, why not allow the set to be uncountably infinite?
Currently, the most sensible answer to this question I have received is that choosing a countably infinite set of variables implies we may enumerate variables making the description of how to choose fresh variables, say for an alpha rewrite, natural.
I am looking for a definitive answer to this question.
Most definitions and constructs in maths and logic include only the minimal apparatus that is required to achieve the desired end. As you note, more than a finite number of variables may be required. But since no more than a countable infinity is required, why allow more?
The reason that this set is required to be countable is quite simple. Imagine that you had a bag full of the variables. There would be no way to count the number of variables in this bag unless the set was denumerable.
Note that bags are isomorphic to sacks.
Uncountable collections of things seem to usually have uncomputable elements. I'm not sure that all uncountable collections have this property, but I strongly suspect they do.
As a result, you could never even write out the names of those elements in any reasonable way. For example, unlike a number like π, you cannot have a program that writes out the digits of Chaitin's constant past a certain finite number of digits. The set of computable real numbers is countably infinite, so the "additional" reals you get are uncomputable.
I also don't believe you gain anything from the set being uncountably infinite. So you would introduce uncomputable names without benefit (as far as I can see).
Having a countable number of variables, and a computable bijection between them and ℕ, lets us create a bijection between Λ and ℕ:
#v = ⟨0, f(v)⟩, where f is the computable bijection between 𝕍 and ℕ (which exists because 𝕍 is countable) and ⟨m, n⟩ is a computable bijection between ℕ² and ℕ.
#(L M) = ⟨1, ⟨#L, #M⟩⟩
#(λv. L) = ⟨2, ⟨#v, #L⟩⟩
The notation ⌜L⌝ represents c_{#L}, the Church numeral representing the encoding of L. For any set S, #S represents the set {#L | L ∈ S}.
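The numbering above can be sketched concretely, using the Cantor pairing function for ⟨m, n⟩ and assuming variables are indexed by naturals (so f is just the identity on indices; the term representation is ours):

```python
def pair(m, n):
    # Cantor pairing: a computable bijection between ℕ² and ℕ.
    return (m + n) * (m + n + 1) // 2 + n

def godel(term):
    """Terms as tuples: ('var', i), ('app', L, M), ('lam', i, L)."""
    tag = term[0]
    if tag == 'var':                      # #v = ⟨0, f(v)⟩
        return pair(0, term[1])
    if tag == 'app':                      # #(L M) = ⟨1, ⟨#L, #M⟩⟩
        return pair(1, pair(godel(term[1]), godel(term[2])))
    if tag == 'lam':                      # #(λv. L) = ⟨2, ⟨#v, #L⟩⟩
        return pair(2, pair(pair(0, term[1]), godel(term[2])))
    raise ValueError(tag)
```

Distinct terms get distinct codes; for instance the identity λx. x and the self-application x x receive different numbers.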
This allows us to prove that lambda calculus is not decidable:
Let A be a non-trivial (not ∅ or Λ) set closed under α and β equality (if L ∈ A and L β= M, then M ∈ A). Assume that the set #A is recursive. Then the function f, with f(x) = 1 if x ∈ #A and f(x) = 0 otherwise, must be a μ-recursive function. All μ-recursive functions are λ-definable*, so there must be an F for which:
F⌜L⌝ = c_1 ⇔ L ∈ A
F⌜L⌝ = c_0 ⇔ L ∉ A
Let G ≡ λn. iszero (F ⟨1, ⟨n, #n⟩⟩) M_0 M_1, where M_0 is any λ-term in A and M_1 is any λ-term not in A (both exist because A is non-trivial). Note that the map n ↦ #n is computable and therefore λ-definable, and that ⟨1, ⟨#L, #⌜L⌝⟩⟩ is exactly #(L⌜L⌝); hence G⌜L⌝ reduces to M_1 if L⌜L⌝ ∈ A and to M_0 otherwise.
Now just ask the question "Is G⌜G⌝ in A?". If yes, then G⌜G⌝ = M_1 ∉ A, so by closure under β= it could not have been in A. If no, then G⌜G⌝ = M_0 ∈ A, so it must have been in A.
This is a contradiction, so #A could not have been recursive; therefore no non-trivial set closed under β= has a recursive encoding #A.
Note that {L | L β= true} is closed under β= and non-trivial, so it is therefore not recursive. This means lambda calculus is not decidable.
* The proof that all computable functions are λ-definable (we can have a λ-term F such that F c_{n1} c_{n2} ... = c_{f(n1, n2, ...)}), as well as the proof in this answer, can be found in "Lambda Calculi With Types" by Henk Barendregt (section 2.2).

Find a minimum spanning tree in different sets

Here I have two connected undirected graphs
G1 = (V, E1) and G2 = (V, E2) on the same set of vertices V. Assume the edges in E1 and E2 have different colours.
Let w(e) be the weight of edge e ∈ E1 ∪ E2.
I want to find a minimum weight spanning tree (MST) among those spanning trees which have at least one edge in each of E1 and E2. How can I find an algorithm for this? I've been stuck on it all night.
Consider two edges e1 ∈ E1, e2 ∈ E2. They connect between 2 and 4 different vertices in V. If they connect 3 or 4 vertices, suppose you first contract the vertices which e1 connects (same as each step in Kruskal's algorithm), then the ones which e2 connects, and then run any minimum spanning tree algorithm on the resulting graph. Then the result, together with e1 and e2, is the minimum spanning tree among those containing e1 and e2.
It follows that you can find the overall minimum by looping over all pairs e1 ∈ E1, e2 ∈ E2 (which don't connect exactly the same two vertices) and taking the lightest solution. The proof of correctness can easily be adapted from that of Kruskal's algorithm.
In fact, though, you can make this more efficient, since either the lightest edge in E1 or the lightest edge in E2 must be used in some optimal tree. Suppose that the lightest edge in E1, say e'1, is not used, and consider a cut which e'1 crosses. The tree must contain some e ≠ e'1 crossing the cut. Clearly, if e ∈ E1, then e'1 can be used instead of e. If e ∈ E2, though, and e can't be replaced, then e is lighter than e'1; in this case, repeating the argument for E2 yields that the lightest edge in E2 can be part of the optimal tree.
Consequently, only the lightest edge of E1 together with any edge in E2, or the lightest edge of E2 together with any edge in E1, must be considered for the first two contractions mentioned above.
The complexity is Θ((|E1| + |E2|) · f(|V|, |E1| + |E2|)), where f is the complexity of the MST algorithm used.
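A hedged sketch of the whole procedure (all names are ours): "contracting" the two forced edges is done here by seeding Kruskal's union-find with them, and only the lightest edge of each set is paired with every edge of the other, per the argument above. Edges are (weight, u, v, colour) tuples:

```python
class DSU:
    """Union-find with path halving, for Kruskal's algorithm."""
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True

def mst_with_forced(vertices, edges, forced):
    """Kruskal, seeded with the forced edges (the 'contraction' step).
    Returns the total weight, or None if no such spanning tree exists."""
    dsu, total = DSU(vertices), 0
    for w, u, v, _ in forced:
        if not dsu.union(u, v):
            return None                   # forced edges connect the same pair
        total += w
    for w, u, v, _ in sorted(edges):
        if dsu.union(u, v):
            total += w
    if len({dsu.find(v) for v in vertices}) != 1:
        return None                       # graph not connected
    return total

def min_bichromatic_spanning_tree(vertices, e1, e2):
    best = None
    all_edges = e1 + e2
    for a, others in ((min(e1), e2), (min(e2), e1)):
        for b in others:
            t = mst_with_forced(vertices, all_edges, [a, b])
            if t is not None and (best is None or t < best):
                best = t
    return best
```

On a small instance with V = {1, 2, 3}, E1 = {an edge 1-2 of weight 1} and E2 = {edges 2-3 of weight 5 and 1-3 of weight 2}, this picks the tree of weight 3.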
