Z notation: How to write operation schema that may add one or more tuples to a relation - formal-languages

I'm writing an operation schema in Z. This operation AssignValue maps a property to one or more values.
One property may be linked to one or more values, and one value may be linked to one or more properties, forming a many-to-many relation, R ⊆ Property × Value.
I'm not sure how to write this operation to indicate that one property could be mapped to one or more values. I have two versions here. Version A seems to map one property to only one value.
Version A:
--AssignValue---
| p? : Property
| v? : Value
-------
|R′ = R ∪ { p? ↦ v? }
-------
In Version B, I have added a powerset in the declaration of v? to indicate that v? is a set of values (more than one value).
Version B:
--AssignValue---
| p? : Property
| v? : P Value
-------
|R′ = R ∪ { p? ↦ v? }
-------
Which version is correct? or there is a better way to represent this? I'm new to z-notation, any help would be greatly appreciated. Thanks!

You have not shown the whole schema. I assume that you have a state schema S with a relation R : Property<->Value (equivalent to R ⊆ Property × Value) and AssignValue includes ∆S.
Both styles could work, although your version B is probably not what you intended.
A relation is allowed to have many pairs with the same domain element, so starting with
R = {p0 ↦ v0, p0 ↦ v1, p3 ↦ v16}
You could call AssignValueA with p?=p0, v?=v16 to get a state with
R = {p0 ↦ v0, p0 ↦ v1, p0 ↦ v16, p3 ↦ v16}
that is, p0 is now mapped to three separate values.
In your version B you have exactly the same thing, but your values are now sets of values. What you probably intended was that R would be a total function of type Property → Value. Now, assuming only properties p0 through p3, you would have the initial R as
R = {p0 ↦ {v0, v1}, p1 ↦ ∅, p2 ↦ ∅, p3 ↦ {v16}}
You need to define
--AssignValueB----------------
| ∆S
| p? : Property, v? Value
------------
| R' = R ⊕ {p? ↦ R p? ∪ {v?}}
------------------------------
This has the same interface as AssignValueA, allowing addition of a single value to a single property per call.
In both models a property may have no, one or many associated values, but the operation only allows one property to be assigned one extra value per call.
Exercise: try defining an operation that allows multiple properties to be assigned multiple extra values per call.

For an example of a big Z specification, I suggest this recently uploaded project: https://github.com/vinahradau/finma
For a one-to-many, I would use a relation (as an opposite to a function).
Example form the project above:
userAccessRigths: USER ↔ ROLE
userAccessRigths′ = userAccessRigths ∪ {(user?, role?)}

Related

How to implement a union-find (disjoint set) data structure in Coq?

I am quite new to Coq, but for my project I have to use a union-find data structure in Coq. Are there any implementations of the union-find (disjoint set) data structure in Coq?
If not, can someone provide an implementation or some ideas? It doesn't have to be very efficient. (no need to do path compression or all the fancy optimizations) I just need a data structure that can hold an arbitrary data type (or nat if it's too hard) and perform: union and find.
Thanks in advance
If all you need is a mathematical model, with no concern for actual performance, I would go for the most straightforward one: a functional map (finite partial function) in which each element optionally links to another element with which it has been merged.
If an element links to nothing, then its canonical representative is itself.
If an element links to another element, then its canonical representative is the canonical representative of that other element.
Note: in the remaining of this answer, as is standard with union-find, I will assume that elements are simply natural numbers. If you want another type of elements, simply have another map that binds all elements to unique numbers.
Then you would define a function find : UnionFind → nat → nat that returns the canonical representative of a given element, by following links as long as you can. Notice that the function would use recursion, whose termination argument is not trivial. To make it happen, I think that the easiest way is to maintain the invariant that a number only links to a lesser number (i.e. if i links to j, then i > j). Then the recursion terminates because, when following links, the current element is a decreasing natural number.
Defining the function union : UnionFind → nat → nat → UnionFind is easier: union m i j simply returns an updated map with max i' j' linking to min i' j', where i' = find m i and j' = find m j.
[Side note on performance: maintaining the invariant means that you cannot adequately choose which of a pair of partitions to merge into the other, based on their ranks; however you can still implement path compression if you want!]
As for which data structure exactly to use for the map: there are several available.
The standard library (look under the title FSets) has several implementations (FMapList, FMapPositive and so on) satisfying the interface FMapInterface.
The stdpp libray has gmap.
Again if performance is not a concern, just pick the simplest encoding or, more importantly, the one that makes your proofs the simplest. I am thinking of just a list of natural numbers.
The positions of the list are the elements in reverse order.
The values of the list are offsets, i.e. the number of positions to skip forward in order to reach the target of the link.
For an element i linking to j (i > j), the offset is i − j.
For a canonical representative, the offset is zero.
With my best pseudo-ASCII-art skills, here is a map where the links are { 6↦2, 4↦2, 3↦0, 2↦1 } and the canonical representatives are { 5, 1, 0 }:
6 5 4 3 2 1 0 element
↓ ↓ ↓ ↓ ↓ ↓ ↓
/‾‾‾‾‾‾‾‾‾↘
[ 4 ; 0 ; 2 ; 3 ; 1 ; 0 ; 0 ] map
\ \____↗↗ \_↗
\___________/
The motivation is that the invariant discussed above is then enforced structurally. Hence, there is hope that find could actually be defined by structural induction (on the structure of the list), and have termination for free.
A related paper is: Sylvain Conchon and Jean-Christophe Filliâtre. A Persistent Union-Find Data Structure. In ACM SIGPLAN Workshop on ML.
It describes the implementation of an efficient union-find data structure in ML, that is persistent from the user perspective, but uses mutation internally. What may be more interesting for you, is that they prove it correct in Coq, which implies that they have a Coq model for union-find. However, this model reflects the memory store for the imperative program that they seek to prove correct. I’m not sure how applicable it is to your problem.
Maëlan has a good answer, but for an even simpler and more inefficient disjoint set data structure, you can just use functions to nat to represent them. This avoids any termination stickiness. In essence, the preimages of any total function form disjoint sets over the domain. Another way of looking at this is as representing any disjoint set G as the curried application find_root G : nat -> nat since find_root is the essential interface that disjoint sets provide.
This is also analogous to using functions to represent Maps in Coq like in Software Foundations. https://softwarefoundations.cis.upenn.edu/lf-current/Maps.html
Require Import Arith.
Search eq_nat_decide.
(* disjoint set *)
Definition ds := nat -> nat.
Definition init_ds : ds := fun x => x.
Definition find_root (g : ds) x := g x.
Definition in_same_set (g : ds) x y :=
eq_nat_decide (g x) (g y).
Definition union (g : ds) x y : ds :=
fun z =>
if in_same_set g x z
then find_root g y
else find_root g z.
You can also make it generic over the type held in the disjoint set like so
Definition ds (a : Type) := a -> nat.
Definition find_root {a} (g : ds a) x := g x.
Definition in_same_set {a} (g : ds a) x y :=
eq_nat_decide (g x) (g y).
Definition union {a} (g : ds a) x y : ds a :=
fun z =>
if in_same_set g x z
then find_root g y
else find_root g z.
To initialize the disjoint set for a particular a, you need an Enum instance for your type a basically.
Definition init_bool_ds : ds bool := fun x => if x then 0 else 1.
You may want to trade out eq_nat_decide for eqb or some other roughly equivalent thing depending on your proof style and needs.

Algorithm for allowing concurrent walk of a graph

In a directed acyclic graph describing a set of tasks to process, i need to find all tasks that can be processed concurrently. The graph has no loops and is quite small (~1000 nodes, ~2000 edges), performance is not a primary concern.
Examples with desired result:
[] is a group. All tasks in a group must be processed before continuing
[x & y] means x and y can be processed concurrently (x and y in parallel)
x -> y means x and y must be processed sequentially (x before y)
1
a -> [b & c] -> c
2
[a & e] -> b -> c -> [d & f]
3
[ [a -> b] & [e -> f] ] -> [ [c -> d] & g ]
I do not want to actually execute the graph, but rather build a data structure that is as parallel as possible, while maintaining the order. The nomenclature and names of algorithms is not that familiar to me, so i'm having a hard time trying to find similar problems/solutions online.
Mathematically, I would frame this problem as finding a minimally defined series-parallel partial order extending the given partial order.
I would start by transitively reducing the graph and repeatedly applying two heuristics.
If x has one dependent y, and y has one dependency x, merge them into a new node z = [x → y].
If x and y have the same dependencies and dependents, merge them into a new node z = [x & y].
Now, if the input is already series-parallel, the result will be one node. In general, however, this will leave a graph that embeds an N-shaped structure like b → c, b → g, f → g from the last example in the question. This structure must be addressed by adding one or more of b → f, c → f, c → g, f → b, f → c, g → c. But in a different instance, this act would in turn create new N-shaped structures. There's no obvious notion of a closure, which is why this problem feels hard to me.
Some of these choices seem worse than others. For example, c → f forces the sequence b → c → f → g, whereas f → c is the only choice that doesn't increase the length of the critical path.
I guess what I'd try is,
If heuristics 1 and 2 have no targets, form a graph with edges x--y if and only if x and y have either a common dependent or a common dependency, compute the connected components of this graph, and &-merge the smallest component that isn't a singleton, followed by another transitive reduction.
Here's a solution i came up with (pseudocode):
sequence = []
for each (node, depth) in depthFirstSearch(graph)
sequence[depth].push(node)
return sequence
The sequence defines the order to process the graph. If an item in it contains more than one node, they can be processed concurrently.
While this allows for some concurrency, it does not advance as fast as it could. For example, f in the 3rd example in the question would require a to be completed first (as it will be at depth 1, when a and e are depth 0). Ideally work on f could start when e is done.

Efficient Union Find with Existential Quantifier?

Is there a classical algorithm to solve the following problem.
Assume the union find algorithm without existential quantifiers
has the following input:
x1 = y1 /\ .. /\ xn = yn
It will then build some datastructure u, so that I can check
u.root(x)==u.root(y), to decide whether x and y are in the same
subgraph.
The input can be characterized by the following grammar:
Input :== Var = Var | Input /\ Input
Assume now we also allow existential quantifiers:
Input :== Var = Var | Input /\ Input | exists Var Input
What union find algorithm could deal with such an input.
I am still assuming that the algorithm builds some datastructure
u, where I can check via u.root(x)==u.root(y) whether x and
y are in the same subgraph.
Additionally u.root(x) should throw an exception when used
with a bound variable. These variables should all have been
eliminated and not anymore part of datastructure. Means
the subgraph should have been accordingly reduced, without
changing the validity of the result.
Bye
Here is a sketch of an algorithm. It will traverse the AST, and
feed a special union find algorithm. First the traversal:
traverse((X = Y)) :- add_conn(X, Y).
traverse(exists(X,I)) :- push_var(X), traverse(I), pop_var_remove_conn(X).
traverse((A /\ B)) :- traverse(A), traverse(B).
The special union find algorithm works with a list. This list defines
the weight of the nodes, the head of list has weight 0, the second element
weight 1, etc... add_conn(X,Y) first computes X'=root(X) and Y'=root(Y). The
less weighted of X' and Y' is connected to the more weighted one.
push_var(X) adds X to the front of the list. Making it the less weighted
node. pop_var_remove_conn(X) removes X again from the list, and removes a
possibly established connection from X to some other node as well.

How to find the intersection of two NFA

In DFA we can do the intersection of two automata by doing the cross product of the states of the two automata and accepting those states that are accepting in both the initial automata.
Union is performed similarly. How ever although i can do union in NFA easily using epsilon transition how do i do their intersection?
You can use the cross-product construction on NFAs just as you would DFAs. The only changes are how you'd handle ε-transitions. Specifically, for each state (qi, rj) in the cross-product automaton, you add an ε-transition from that state to each pair of states (qk, rj) where there's an ε-transition in the first machine from qi to qk and to each pair of states (qi, rk) where there's an ε-transition in the second machine from rj to rk.
Alternatively, you can always convert the NFAs into DFAs and then compute the cross product of those DFAs.
Hope this helps!
We can also use De Morgan's Laws: A intersection B = (A' U B')'
Taking the union of the compliments of the two NFA's is comparatively simpler, especially if you are used to the epsilon method of union.
There is a huge mistake in templatetypedef's answer.
The product automaton of L1 and L2 which are NFAs :
New states Q = product of the states of L1 and L2.
Now the transition function:
a is a symbol in the union of both automatons' alphabets
delta( (q1,q2) , a) = delta_L1(q1 , a) X delta_L2(q2 , a)
which means you should multiply the set that is the result of delta_L1(q1 , a) with the set that results from delta_L2(q1 , a).
The problem in the templatetypedef's answer is that the product result (qk ,rk) is not mentioned.
Probably a late answer, but since I had the similar problem today I felt like sharing it. Realise the meaning of intersection first. Here, it means that given the string e, e should be accepted by both automata.
Consider the folowing automata:
m1 accepting the language {w | w contains '11' as a substring}
m2 accepting the language {w | w contains '00' as a substring}
Intuitively, m = m1 ∩ m2 is the automaton accepting the strings containing both '11' and '00' as substrings. The idea is to simulate both automata simultaneously.
Let's now formally define the intersection.
m = (Q, Σ, Δ, q0, F)
Let's start by defining the states for m; this is, as mentioned above the Cartesian product of the states in m1 and m2. So, if we have a1, a2 as labels for the states in m1, and b1, b2 the states in m2, Q will consist of following states: a1b1, a2b1, a1b2, a2b2. The idea behind this product construction is to keep track of where we are in both m1 and m2.
Σ most likely remains the same, however in some cases they differ and we just take the union of alphabets in m1 and m2.
q0 is now the state in Q containing both the start state of m1 and the start state of m2. (a1b1, to give an example.)
F contains state s IF and only IF both states mentioned in s are accept states of m1, m2 respectively.
Last but not least, Δ; we define delta again in terms of the Cartesian product, as follows: Δ(a1b1, E) = Δ(m1)(a1, E) x Δ(m2)(b1, E), as also mentioned in one of the answers above (if I am not mistaken). The intuitive idea behind this construction for Δ is just to tear a1b1 apart and consider the states a1 and b1 in their original automaton. Now we 'iterate' each possible edge, let's pick E for example, and see where it brings us in the original automaton. After that, we glue these results together using the Cartesian product. If (a1, E) is present in m1 but not Δ(b1, E) in m2, then the edge will not exist in m; otherwise we'll have some kind of a union construction.
An alternative to constructing the product automaton is allowing more complicated acceptance criteria. Ordinarily, an NFA accepts an input string when it has reached any one of a set of accepting final states. That can be extended to boolean combinations of states. Specifically, you construct the automaton for the intersection like you do for the union, but consider the resulting automaton to accept an input string only when it is in (what corresponds to) accepting final states in both automata.

Call by value in the lambda calculus

I'm working my way through Types and Programming Languages, and Pierce, for the call by value reduction strategy, gives the example of the term id (id (λz. id z)). The inner redex id (λz. id z) is reduced to λz. id z first, giving id (λz. id z) as the result of the first reduction, before the outer redex is reduced to the normal form λz. id z.
But call by value order is defined as 'only outermost redexes are reduced', and 'a redex is reduced only when its right-hand side has already been reduced to a value'. In the example id (λz. id z) appears on the right-hand side of the outermost redex, and is reduced. How is this squared with the rule that only outermost redexes are reduced?
Is the answer that 'outermost' and 'innermost' only refers to lambda abstractions? So for a term t in λz. t, t can't be reduced, but in a redex s t, t is reduced to a value v if this is possible, and then s v is reduced?
Short answer: yes. You can never reduce inside a lambda-term you can only reduce term outside, starting by right.
The set of evaluation contexts in lambda-calculus by value is defined as follow:
E = [ ] | (λ.t)E | Et
E is what you can value..
For example in lambda calculus by name the evaluation context is :
E = [ ] | Et | fE
as you can reduce an application even if a term is not a value.
For example (λx.x)(z λx.x) is stuck in call by value but in call by name it reduce to (z λx.x), which is a normal form.
In the context grammar f is a normal form (in call by name) defined as:
f = λx.t | L
L = x | L f
You can see another definition of contexts at chapter 19.5.3 of the Pierce.
Is the answer that 'outermost' and 'innermost' only refers to lambda abstractions? So for a term t in λz. t, t can't be reduced, but in a redex s t, t is reduced to a value v if this is possible, and then s v is reduced?
Yes, that's exactly right.

Resources