Algorithm for regular expression intersection with CFG

I'm looking for an algorithm which outputs whether the intersection of a regular expression and a context-free grammar is empty or not. I know that this problem is decidable; however, I cannot find any example implementation (in pseudocode).
Can someone provide me with such an algorithm, in .NET if possible, but that is not a must? This problem is also called "regular intersection". Googling for it only gives me the geometrical algorithm or the theory about it.
Edit:
Anybody? I'm really stuck on this and cannot find anything yet.

Here is a sketch of an approach that occurs to me. I think this should work but it is probably not the best way to do it since it uses the terribly messy conversion from PDA to CFG.
Convert the regular expression into a nondeterministic finite automaton (NFA) and reduce it down to the minimal deterministic finite automaton (DFA). Convert the context-free grammar (CFG) into a pushdown automaton (PDA). These first steps are all well known and fairly simple algorithms.
Take the intersection of the DFA and PDA, which is also a PDA. We will say the DFA has states S1, start state s1, final states F1, and transitions delta1 of the form ((source,trigger),destination), and the PDA has states S2, start state s2, final states F2, and transitions delta2 of the form ((source,trigger,pop),(destination,push)). The new PDA has states S1 X S2, each state labeled by a pair. It has final states F1 X F2 and start state (s1,s2). Now for the transitions.
For each transition d in delta2 and each state s in S1, find the transition t in delta1 of the form ((s, d.trigger), ?). Make a new transition (((d.source, s), d.trigger, d.pop), ((d.destination, t.destination), d.push)).
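To make this concrete, here is a minimal Python sketch of the transition product (the data layout is my own assumption, not part of the answer): the DFA transition function is a dict mapping (state, symbol) to a state, and each PDA transition is a pair ((source, symbol, pop), (destination, push)). Epsilon moves of the PDA, represented here with None as the trigger, simply leave the DFA component unchanged, a detail the description above glosses over:
def product_transitions(dfa_delta, pda_delta, dfa_states):
    # dfa_delta: dict mapping (dfa_state, symbol) -> dfa_state
    # pda_delta: iterable of ((pda_state, symbol_or_None, pop), (pda_state, push))
    new_delta = []
    for (p_src, symbol, pop), (p_dst, push) in pda_delta:
        for d_src in dfa_states:
            if symbol is None:              # epsilon move of the PDA
                d_dst = d_src               # the DFA component does not advance
            elif (d_src, symbol) in dfa_delta:
                d_dst = dfa_delta[(d_src, symbol)]
            else:
                continue                    # the DFA has no move on this symbol
            new_delta.append((((p_src, d_src), symbol, pop),
                              ((p_dst, d_dst), push)))
    return new_delta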
This new PDA should accept the intersection of the languages produced by the RE and the CFG. To test if the language is empty you will need to convert it back to a CFG. The algorithm for that is messy and large, but it works. Once you have done that, mark each terminal symbol. Then mark each symbol which has a rule where there are only marked symbols on the right hand side, and repeat until you can mark no more symbols. If you can mark the start symbol, the language is not empty. Otherwise, the language is empty.
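For the final emptiness test, here is a rough sketch of the marking step in Python (the grammar representation is my own choice): productions are (head, body) pairs with body a list of symbols, and anything that never occurs as a head is treated as a terminal and is therefore implicitly marked:
def cfg_is_empty(productions, start_symbol):
    # productions: list of (head, body) pairs, body a list of symbols
    heads = {head for head, _ in productions}
    marked = set()      # nonterminals known to derive a string of terminals
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head in marked:
                continue
            # every symbol in the body is a terminal or an already-marked nonterminal
            if all(sym not in heads or sym in marked for sym in body):
                marked.add(head)
                changed = True
    return start_symbol not in marked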

In fact, there is a simpler algorithm for computing the intersection of a context-free grammar and a regular expression. It does not use pushdown automata, which are costly to obtain from a CFG and require several conversions.
This solution was presented in:
Y. Bar-Hillel, M. Perles, and E. Shamir. On formal properties of
simple phrase structure grammars. Z. Phonetik, Sprachwissen. Komm. 14
(1961), 143-172. Reprinted in Y. Bar-Hillel, Language and Information,
Addison-Wesley, Reading, Mass. (1965), 116-150.
but you can find a simpler version in:
Richard Beigel and William Gasarch. A proof that the intersection
of a context-free language and a regular language is context-free
which does not use pushdown automata.
http://www.cs.umd.edu/~gasarch/BLOGPAPERS/cfg.pdf
If it helps, this solution is implemented in Pyformlang (https://pyformlang.readthedocs.io/), and you can find the Python source on GitHub (https://github.com/Aunsiels/pyformlang/blob/master/pyformlang/cfg/cfg.py).
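For example, here is roughly how the Pyformlang route looks. This is hedged: the exact API (CFG.from_text, Regex, intersection, is_empty) is my reading of the library's documentation, so check the current docs if the names differ:
from pyformlang.cfg import CFG
from pyformlang.regular_expression import Regex

# the grammar for { a^n b^n | n >= 0 }
cfg = CFG.from_text("S -> a S b\nS -> epsilon")

# the regular language a* b*
regex = Regex("a* b*")

# Bar-Hillel style intersection, returned as a new CFG
product = cfg.intersection(regex)
print(product.is_empty())   # expected: False, since e.g. 'aabb' is in both languages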

Related

How to use Coq as calculator or as forward chaining rule engine/sequence application tool?

Is it possible, and how, to use Coq as a calculator or as a rule engine in forward chaining mode? A Coq script usually requires you to declare the goal for which a proof is to be found. But is it possible to go in the other direction, e.g. to compute the set of some consequences bounded by some rule, e.g. by some number of steps? I am especially interested in the sequent calculus of full first-order logic. I guess (but I don't know) that there is some implementation or package for some type of sequent calculus for first-order logic, but it is for theorem proving. I would like to use such a sequent calculus to derive consequences in some directed order. Is that possible in Coq, and how?
Coq can be used for forward reasoning as well, in particular with the assert tactic. When you write assert (H : P)., Coq generates a subgoal that asks you to prove P. Once this subgoal is proved, it resumes the original proof, extending its context with a hypothesis H : P.
The ltac language used to write Coq scripts has a match goal operator that allows you to inspect the shape of your goal. This allows you to progressively saturate your proof context with new facts derived from your current assumptions using the assert tactic, and to stop once certain conditions are met. Adam Chlipala's CPDT book has a nice chapter covering these features of tactic programming.

Algorithm that checks if a context-free grammar generates an infinite language that a DFA rejects

I have a DFA A and a CFG G, and I have to check whether G generates infinitely many words that A does not accept (i.e. rejected by A), with a reasonable time complexity.
I thought of constructing a graph from the CFG: if it contains a directed cycle, then the grammar produces an infinite language. The vertices are the variables, and for each production I draw some edges. The input is all the words rejected by the DFA, and when I find a cycle I can say that the CFG generates an infinite language rejected by DFA A.
I don't know how to turn this into an algorithm, or whether my proposal is incorrect and I have to create a new one.
Edit: Can I transform my CFG to CNF (with Chomsky) and then to a DFA? Afterwards, I try to find a cycle. But my transformed DFA can have fewer states than my DFA A... I think I need a way to get the words rejected by DFA A into my CFG.
Given CFG G, construct PDA B. Given DFA A and PDA B, construct PDA C such that C accepts L(C) = L(B) \ L(A), where \ is set difference. Now, L(C) is precisely the language of words accepted by the PDA B (hence generated by the CFG G) but not accepted, i.e. rejected, by the DFA A.
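One standard way to obtain C (not spelled out in the answer) is to complement the DFA A and then intersect B with the complement, using the same product construction as in the first answer above. Complementing a DFA just flips its accepting states, provided the DFA is complete; a minimal Python sketch, with names and data layout of my own choosing (it assumes no existing state is called "__sink__"):
def complement_dfa(states, alphabet, delta, start, accepting):
    # delta: dict mapping (state, symbol) -> state. The DFA must be complete,
    # so first route every missing transition to a fresh non-accepting sink.
    sink = "__sink__"
    new_states = set(states) | {sink}
    total_delta = dict(delta)
    for q in new_states:
        for a in alphabet:
            total_delta.setdefault((q, a), sink)
    new_accepting = new_states - set(accepting)   # flip the accepting states
    return new_states, alphabet, total_delta, start, new_accepting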
Now, the question is whether the language of C is infinite. We can decide this. One way is to convert the PDA back into a CFG and then put the CFG into CNF, removing unreachable and unproductive symbols. Then create a dependency graph among the nonterminal symbols. If any remaining (productive) nonterminal symbol depends upon itself, i.e. there is a cycle, then the language is infinite. Otherwise, the language is finite (empty, if there are no productive symbols remaining).
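As a sketch of that last step (the grammar representation here is my own assumption): suppose the grammar is already in CNF with only reachable, productive symbols left, given as (head, body) productions. Then the language is infinite exactly when the dependency graph over the nonterminals has a cycle, which a depth-first search can detect:
def generates_infinite_language(productions, nonterminals):
    # Build the dependency graph: an edge A -> B whenever B occurs on the
    # right-hand side of some production with head A.
    graph = {a: set() for a in nonterminals}
    for head, body in productions:
        for sym in body:
            if sym in graph:
                graph[head].add(sym)

    # Cycle detection by depth-first search with three-colour marking.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {a: WHITE for a in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:          # back edge: a cycle
                return True
            if colour[succ] == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(colour[a] == WHITE and visit(a) for a in nonterminals)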

Logic programming - Is a subset with only one function symbol Turing-complete?

If I have a subset of logic programming which contains only one function symbol, am I able to do everything?
I think that I cannot but I am not sure at all.
A programming language can do anything the user wants if it is a Turing-complete language. I was taught that this means it has to be able to execute if..then..else commands and recursion, and that the natural numbers should be definable.
Any help and opinions would be appreciated!
In classical predicate logic, there is a distinction between the formula level and the term level. Since an n-ary function can be represented as an (n+1)-ary predicate, restricting only the number of function symbols does not lessen the expressivity.
In Prolog, there is no difference between the formula and the term level. You might pick an n-ary symbol p and try to encode Turing machines or an equivalent notion (e.g. recursive functions) via nestings of p.
From my intuition I would assume this is not possible: you can basically describe n-ary trees with variables as leaves, but then you can always unify these trees. This means that every rule head will match during recursive derivations, and therefore you are unable to express any case distinction. Still, this is just an informal argument, not a proof.
P.S. you might also be interested in monadic logic, where only unary predicates are allowed. This fragment of first-order logic is decidable.

Prolog - what sort of sentences can't be expressed

I was wondering what sort of sentences you can't express in Prolog. I've been researching logic programming in general and have learned that first-order logic is more expressive than the definite clause logic (Horn clauses) that Prolog is based on. It's a tough subject for me to get my head around.
So, for instance, can the following sentence be expressed:
For all cars, there does not exist at least 1 car without an engine
If so, are there any other sentences that CAN'T be expressed? If not, why?
You can express your sentence straightforwardly in Prolog using negation (\+).
E.g.:
car(bmw).
car(honda).
...
car(toyota).
engine(bmw, dohv).
engine(toyota, wenkel).
% Note the space after \+ : "\+ ( ... )" negates the whole conjunction,
% whereas "\+(Goal1, Goal2)" would be read as a call to a nonexistent \+/2.
no_car_without_engine :-
    \+ (
        car(Car),
        \+ engine(Car, _)
    ).
Procedure no_car_without_engine/0 will succeed if every car has an engine, and fail otherwise.
The most problematic definitions in Prolog are those which are left-recursive.
Definitions like
g(X) :- g(A), r(A,X).
are likely to run forever rather than fail, due to Prolog's search strategy, which is plain depth-first search
and will recurse to infinity and beyond.
The general problem with Horn clauses, however, is that they are defined to have at most one positive literal. It is easy to write down a clause that violates this restriction,
for example:
A ∨ B
As a consequence, facts like ∀ X: cat(X) ∨ dog(X) can't be expressed directly.
There are ways to work around those and there are ways to allow such statements (see below).
Reading material:
These slides (p. 3) give an example of a sentence you can't build using Prolog.
This work (p. 10) also explains Horn Clauses and their implications and introduces a method to allow 'invalid' Horn Clauses.
Prolog is a programming language, not a natural language interface.
The sentence you show is expressed in such a convoluted way that I had a hard time trying to understand it. I must thank gusbro, who took the pains to express it in an understandable way. But he entirely glossed over the knowledge representation problems that any programming language poses when applied to natural language, or even simply to negation in first-order logic. These problems are so pressing that the particular language selected is often perceived as unimportant.
Relating to programming, Prolog lacks the ability to access any linear data structure (i.e. arrays) in O(1) (constant time). Hence QuickSort, for instance, which requires O(1) access to array elements, can't be implemented efficiently.
But it's nevertheless a Turing-complete language, for what it's worth. So, in that sense, there are no statements that can't be expressed in Prolog.
So you are looking for sentences that can't be expressed in clausal logic but can be expressed in first-order logic.
Strictly speaking, there are many, simply because clausal logic is a restriction of FOL. So that's true by definition.
What you can do, though, is rewrite any set of FOL sentences into a logic program that is not equivalent but has good properties. For example, if you want to know whether p is a consequence of your theory, you can equivalently use the transformed logic program.
A few notes on the other answers:
Negation in Prolog (\+) is negation as failure and not first order logic negation
Prolog is a programming language, as correctly pointed out, so we should be talking about clausal logic instead.
Left recursion is not a problem. You can easily use a different selection rule, or some other inference mechanism.

Finite-state transducer that computes the relation

From http://www.cse.ohio-state.edu/~gurari/theory-bk/theory-bk-twoli1.html#30007-23021r2.2.4:
Let M = <Q, Σ, Δ, δ, q0, F> be the deterministic finite-state transducer whose transition diagram is given in Figure 2.E.2.
For each of the following relations find a finite-state transducer that computes the relation.
a. { (x, y) | x is in L(M), and y is in Δ* }.
b. { (x, y) | x is in L(M), y is in Δ*, and (x, y) is not in R(M) }.
Yes, this is HW, but I have been struggling with these questions and could at least use pointers. If you want to create your own c. and/or d. examples just to show me HOW to do it rather than lead me to the answers for a. and b. then obviously I'm fine with that.
Thanks in advance!
Since you don't indicate what progress you've made so far, I'm going to assume that you've made no progress at all, and will give overall guidance for how you can approach this sort of problem.
First of all, examine the transition diagram. Do you understand what all the notations mean? Note that the transducer is described as deterministic. Do you understand what that means? Convince yourself that the transducer depicted in the transition diagram is, in fact, deterministic. Trace through it; try to get a sense for what inputs are accepted by the transducer, and what outputs it gives.
Next, figure out what L(M), Δ, and R(M) are for this transducer, since the questions refer to them. Do you know what those notations mean?
Do you know what it means for a transducer to compute a certain relation? Do you understand the { (x, y) | ... } notation for describing the relation?
Can you modify the transition diagram to eliminate the ε/0 transition and merge it into adjacent transitions (which then might output multiple symbols at a single transition)? (This can help, IMHO, with creating other transducers that accept the same input language. More so with part b, in this case, than part a.)
Describe for yourself the transducers you need to create, in a way that's independent of the original transducer. Will these transducers be deterministic?
Create the transition diagrams for these transducers.
