Refactoring in Alloy

I have an Alloy model with this relation in it:
sig myint {nextX: (myint -> myint -> myint) -> myint, nextT: (myint -> myint -> myint) -> myint}
and I get the following error message:
Translation capacity exceeded.
In this scope, universe contains 84 atoms
and relations of arity 5 cannot be represented.
Visit http://alloy.mit.edu/ for advice on refactoring.
I am wondering how the number of atoms is related to the supported relation arity, and how I could resolve this problem.
I would really appreciate your help.
Thanks a lot.
Fathiyeh

This StackOverflow post answers why the number of atoms is related to max relation arity.
(in your case, 84^5 (4182119424) is greater than Integer.MAX_VALUE (2147483647))
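To spell out the arithmetic (assuming, as the message suggests, that the solver indexes the tuples of an arity-k relation over an n-atom universe with a single Java int, so it needs n^k distinct indices):

84^4 = 49,787,136      (fits below Integer.MAX_VALUE = 2,147,483,647)
84^5 = 4,182,119,424   (does not)

So with 84 atoms, arity 4 still fits but arity 5 does not; to keep nextX/nextT you would have to either shrink the scope or refactor the relations down to a lower arity.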
For some ideas on refactoring, see this.


Substitution system

I am currently developing an advanced text editor with some features.
One of the features is a substitution system. It will help the user replace strings quickly.
It has the following format:
((a x) | x) (d e)
We can divide the string into two parts: the left ((a x) | x) and the right (d e). If there is a letter (x, in the current case) after a |, then we can perform a substitution action: replace all x on the left side with the string on the right side.
So after the action we will receive (a (d e)).
Of course, these parenthesized expressions might be nested: ((a | a) (d x | d)) (e f | f) -> ((e f | f) x) -> (e x).
Unfortunately, the reductions may continue infinitely: (a a | a) (a a | a).
I need to show a warning if the user writes a string that can never be reduced to a form in which no further reductions apply.
Any suggestions on how to do it?
Congratulations, you have just invented the λ-calculus (pronounced "Lambda-calculus")
Why this problem is hard
Let's use the original notation for this, since it was already invented by Church in the 1930s:
((λx.f) u) is the rewriting of f where all x's have been replaced by u's.
Note that this is equivalent to your notation (f | x) u, where f is a string that can contain x's. It's a mathematical tool introduced to understand which functions are "computable", i.e. can be given to a computer with an adequate program, so that for every input the computer runs its program and outputs the correct answer. Spoiler: the computable functions are exactly the functions that can be written as λ-terms (i.e. rewritings in your setting).
Sometimes λ-terms can be simplified, as you have noted yourself. For instance
(λx.yx)((λu.uu)(λz.z)i) -> (λx.yx)((λz.z)(λz.z)i) -> (λx.yx)((λz.z)i) -> (λx.yx)i -> yi
Sometimes the sequence of simplifications (also called "evaluation", or "beta-reduction") is infinite. One very famous example is the Omega operator (λx.xx)(λx.xx) (rings a bell?).
Ideally, we would like to restrict the λ-calculus so that all terms are "simplifiable" to a final usable form in a finite number of steps. This property is called "normalization". What we actually want is one step further: we want all sequences of simplifications to end up in the final form after a finite number of steps, so that when faced with multiple choices you can pick either one and not get stuck in an infinite loop because of a bad choice. This is called "strong normalization".
Here's the issue: the λ-calculus is not strongly normalizing. This property simply does not hold. There are terms that never reach a final - "normal" - form.
You have found one yourself.
How this problem has been solved theoretically
The key to gaining the strong normalization property was to rule out λ-terms which did not satisfy this property (print a warning and spit an error in your case), so that we only consider strongly normalizing λ-terms. This "ruling out" was put in place via a typing system: λ-terms that have a valid type are strongly normalizing, and λ-terms that are not strongly normalizing cannot have a valid type. Awesome, now we know which ones should give errors! Let's check that with simple cases.
From the way you are able to phrase your problem in a very clear way, I'm assuming you already have experience with programming and static type systems, but I can modify this answer if you'd rather have a full explanation. I'll be using Caml-like notation for types so s is a string, s -> s is a function that to a string associates a string, and s -> (s -> s) is a function that to a string associates a function from string to string, etc. I denote x : t when variable x has type t.
λx.ax : s -> s provided a : s -> s
(λx.yx)((λu.uu)(λz.z)i) : s provided y : s -> s, i : s as we have seen by the reductions above
λx.x : s -> s. But watch out, λx.x : (s -> s) -> (s -> s) is also true. Deciding the type is hard
How you may solve this problem programmatically
Your problem is slightly easier, because you are only dealing with string replacements, so you know that the base type of everything is a string, and you can try to "type your way up" until you are either able to type the whole rewriting (i.e. no errors), or able to prove that the rewriting is not typable (see Omega) and spit an error. Watch out though: just because you are not able to type a rewriting does not mean that it cannot be typed!
In your example (λx.ax)(de), you know the actual values for a, d and e, so you may have, for instance, a : s -> s, d : s -> s, e : s, hence de : s and λx.ax : s -> s, so the whole thing has type s and you're good to go. You can probably write a compiler that tries to type the rewriting and figures out whether it's typable, based on a set of cleverly-crafted decision rules for the specific use that you want. You can even decide that if the compiler fails to type a rewriting, then it is rejected (even when it's valid), because cases intricate enough to fail despite being correct should never happen in a reasonable editor text-substitution scenario.
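As a concrete illustration, here is a minimal sketch of such a checker in Haskell (hypothetical, not production code), assuming binders carry explicit type annotations so no inference is needed; the base type s from the text becomes S:

-- A minimal type checker for the simply typed λ-calculus with one base
-- type S ("string"). Well-typed terms are strongly normalizing; terms the
-- checker rejects are where the editor would warn.
data Ty = S | Ty :-> Ty deriving (Eq, Show)

data Term
  = Var String
  | Lam String Ty Term   -- λx:t. body, i.e. (body | x) in the question's notation
  | App Term Term
  deriving Show

type Ctx = [(String, Ty)]

check :: Ctx -> Term -> Maybe Ty
check ctx (Var x)     = lookup x ctx
check ctx (Lam x t b) = (t :->) <$> check ((x, t) : ctx) b
check ctx (App f a)   = do
  tf <- check ctx f
  ta <- check ctx a
  case tf of
    t1 :-> t2 | t1 == ta -> Just t2
    _                    -> Nothing

-- check [] (Lam "x" S (Var "x"))  == Just (S :-> S)
-- Omega-style self-application is rejected: with x : S, App (Var "x") (Var "x")
-- returns Nothing because S is not a function type.

Rejecting everything this kind of check cannot type (as suggested above) is conservative but decidable, which is usually the right trade-off for an editor feature.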
Do you want to solve this programmatically?
No. I mean it. You really don't want to.
Remember, λ-terms describe all computable functions.
If you were to really implement a fully correct warning generator as you seem to intend, it would mean that one could encode any program as a string substitution in your editor, so your editor would essentially be a programming language of its own, which is typically not what you want your editor to be.
I mean, writing a full program that queries the webcam, analyzes the image, guesses who is typing, and loads an ad based on Google's suggestion when opening a file, all as a cleverly-written text substitution command, really? Is that what you're trying to achieve?
PS: If this is indeed what you're trying to achieve, have a look at Lisp and Haskell, their syntax should look somewhat... familiar to what you've seen here.
And good luck with your editor !

Haskell(GHC) specialization tour & efficient TypeFamilies

Haskell is all about abstraction. But abstraction costs us extra CPU cycles and extra memory usage, due to the common representation of all abstract (polymorphic) data: a pointer to a heap object. There are some ways to make abstract code play better with high performance demands. As far as I understand, one way this is done is specialization - basically extra code generation (manual or by the compiler), correct?
Let's assume that all the code below is Strict (which helps the compiler perform more optimizations?).
If we have a function sum:
sum :: (Num a) => a -> a -> a
We can generate a specialized version of it using the SPECIALIZE pragma:
{-# SPECIALIZE sum :: Float -> Float -> Float #-}
Now if the Haskell compiler can determine at compile time that we call sum on two Floats, it is going to use the specialized version of it. No heap allocations, right?
Functions - done. The same pragma can be applied to class instances. The logic does not change here, does it?
But what about data types ?
I suspect that TypeFamilies are in charge here?
Let's try to specialize a dependent length-indexed list.
-- UVec for unboxed vector
class UVec a where
  data Vec (n :: Nat) a :: *

instance UVec Float where
  data Vec n Float where
    VNilFloat  :: Vec 0 Float
    VConsFloat :: {-# UNPACK #-} Float
               -> Vec n Float
               -> Vec (n :+ 1) Float
But Vec has a problem. We can't pattern match on its constructors, since each instance of UVec does not have to provide Vec with identical constructors. This forces us to implement each function on Vec separately for each instance (as the lack of pattern matching implies that it can't be polymorphic in Vec). What is the best practice in such a case?
As you say, we can't pattern match on UVec a without knowing what a is.
One option is to use another typeclass that extends your vector class with a custom function.
class UVec a => UVecSum a where
  sum :: Vec n a -> a

instance UVecSum Float where
  sum = ... -- use a pattern match here
If, later on, we use sum v where v :: Vec n Float, the Float-specific code we defined in the instance will be called.
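For example, the "..." above could be filled in by pattern matching on the Float-specific constructors (a sketch, assuming the VNilFloat/VConsFloat definitions from the question):

instance UVecSum Float where
  sum VNilFloat         = 0
  sum (VConsFloat x xs) = x + sum xs   -- matching is fine here: we know a ~ Float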
Partial answer, but perhaps it might help.
As far as I understand, one way it is done is specialization - basically extra code generation(manual or by compiler), correct ?
Yes, this is similar to code instantiation in C++ templates.
Now if haskell compiler can determine at compile time that we call sum on two Floats, it is going to use specialized version of it. No heap allocations, right ?
Yes, the compiler calls the specialised version whenever possible. I'm not sure what you mean regarding the heap allocations.
Regarding the dependently typed vectors: usually (I know this from Idris) the length of the vector is eliminated by the compiler when possible. It is intended for stronger type checking. At runtime the length information is useless and can be dropped.

How to debug type-level programs

I'm trying to do some hoopy type-level programming, and it just doesn't work. I'm tearing my hair out trying to figure out why the heck GHC utterly fails to infer the type signatures I want.
Is there some way to make GHC tell me what it's doing?
I tried -ddump-tc, which just prints out the final type signatures. (Yes, they're wrong. Thanks, I already knew that.)
I also tried -ddump-tc-trace, which dumps out ~70KB of unintelligible gibberish. (In particular, I can't see any user-written identifiers mentioned anywhere.)
My code is so close to working, but somehow an extra type variable keeps appearing. For some reason, GHC can't see that this variable should be completely determined. Indeed, if I manually write the five-mile type signature, GHC happily accepts it. So I'm clearly just missing a constraint somewhere... but where?!? >_<
As has been mentioned in the comments, poking around with :kind and :kind! in GHCi is usually how I go about it. But it also, surprisingly, matters where you place the functions, and what looks like it should be the same isn't always.
For instance, I was trying to make a dependently typed functor equivalent, for a personal project, which looked like
class IFunctor f where
  ifmap :: (a -> b) -> f n a -> f n b
and I was writing the instance for
data IEither a n b where
  ILeft  :: a -> IEither a Z b
  IRight :: b -> IEither a (S n) b
It should be fairly simple, I thought, just ignore f for the left case, apply it in the right.
I tried
instance IFunctor (IEither a) where
  ifmap _ l@(ILeft _) = l
  ifmap f (IRight r)  = IRight $ f r
but for the specialization of ifmap needed here, namely ifmap :: (b -> c) -> IEither a Z b -> IEither a Z c, Haskell inferred the type of l on the LHS to be IEither a Z b, which makes sense, but it then refused to produce b ~ c.
So, I had to unwrap l, get the value of type a, then rewrap it to get the IEither a Z c.
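Concretely, the version that type-checks rebuilds the constructor instead of reusing l (a sketch, assuming the IFunctor/IEither definitions above):

instance IFunctor (IEither a) where
  ifmap _ (ILeft x)  = ILeft x        -- rebuilt, so the result can be IEither a Z c
  ifmap f (IRight r) = IRight $ f r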
This isn't just the case with dependent types, but also with rank-n types.
For instance, I was trying to convert isomorphisms of a proper form into natural transformations, which should be fairly easy, I thought.
Apparently, I had to put the deconstructors in a where clause of the function, because otherwise type inference didn't work properly.

What is the base of a refined element?

Consider the logical fragment:
Patient (Profile A)
identifier (sliced on system) 0..*
myclinicnbr (slice 1) 0..1
yourclinicnbr (slice 2) 0..*
And then:
Patient (Profile B, base is A)
identifier (sliced on system) 0..2
myclinicnbr (slice 1) (no diff)
yourclinicnbr (slice 2) 0..*
In B, the effective cardinalities are:
identifier 0..2 (explicit)
myclinicnbr 0..1 (constrained by A::myclinicnbr)
yourclinicnbr 0..2 (constrained by B::identifier)
Questions are:
Should B validate with B::yourclinicnbr having a cardinality incompatible with B::identifier?
Must B::yourclinicnbr override A::yourclinicnbr to bring it into compliance with B::identifier, or could it make no statement?
For each part in B, what is the correct snapshot cardinality?
I don't think we've said that cardinalities for slices must be proper subsets of the parent. Doing that would require a lot of awkward math each time a new slice got introduced - adding a minOccurs = 1 slice could easily force decreasing the maximum of a bunch of other 0..n slices. The expectation is that the instance must meet the constraints of both the base element and the slices, so if you have a 0..3 element with a bunch of 0..* slices, you can't have an instance that contains more than 3 repetitions, regardless of the fact that the slices might indicate 0..*. So the implementation behavior isn't confusing. It could however be confusing from a documentation perspective, and might in some cases cause confusion for software.
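To make that "meets both constraints" reading concrete, here is a small arithmetic sketch (plain Haskell, not FHIR tooling; all names are illustrative): the effective cardinality of a slice is the larger minimum and the smaller maximum of the slice and the sliced element.

-- Effective cardinality under the "instance must satisfy both" reading.
data Max  = Finite Int | Star deriving (Eq, Show)   -- Star = "*"
data Card = Card Int Max      deriving (Eq, Show)

effective :: Card -> Card -> Card
effective (Card loE hiE) (Card loS hiS) = Card (max loE loS) (minMax hiE hiS)
  where
    minMax Star m                = m
    minMax m Star                = m
    minMax (Finite a) (Finite b) = Finite (min a b)

-- effective (Card 0 (Finite 2)) (Card 0 Star) == Card 0 (Finite 2)
-- which matches the "yourclinicnbr 0..2" reading in the question.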
My leaning is to leave it up to the profile designer as to whether the slice cardinalities are strict mathematical subsets of the element cardinality. In some cases it'll be easy and worth it. In other cases, it might be more pain than the effort justifies. If you think we need to be tighter on this, a change proposal with rationale would be welcome.

lenses, fclabels, data-accessor - which library for structure access and mutation is better

There are at least three popular libraries for accessing and manipulating fields of records. The ones I know of are: data-accessor, fclabels and lenses.
Personally I started with data-accessor and I'm using it now. However, recently on haskell-cafe there was an opinion that fclabels is superior.
Therefore I'm interested in a comparison of those three (and maybe more) libraries.
There are at least 4 libraries that I am aware of providing lenses.
The notion of a lens is that it provides something isomorphic to
data Lens a b = Lens (a -> b) (b -> a -> a)
providing two functions: a getter, and a setter
get (Lens g _) = g
put (Lens _ s) = s
subject to three laws:
First, that if you put something, you can get it back out
get l (put l b a) = b
Second that getting and then setting doesn't change the answer
put l (get l a) a = a
And third, putting twice is the same as putting once, or rather, that the second put wins.
put l b1 (put l b2 a) = put l b1 a
Note that the type system isn't sufficient to check these laws for you, so you need to ensure them yourself no matter what lens implementation you use.
Many of these libraries also provide a bunch of extra combinators on top, and usually some form of template haskell machinery to automatically generate lenses for the fields of simple record types.
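As a small illustrative example in the representation above (not taken from any of the libraries below), here is a lens onto the first component of a pair; checking the three laws against it by hand is a good exercise:

-- A hand-rolled lens in the Lens a b form above.
fstLens :: Lens (a, b) a
fstLens = Lens fst (\x (_, b) -> (x, b))

-- get fstLens (1, "hi")   == 1
-- put fstLens 5 (1, "hi") == (5, "hi")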
With that in mind, we can turn to the different implementations:
Implementations
fclabels
fclabels is perhaps the most easily reasoned about of the lens libraries, because its a :-> b can be directly translated to the above type. It provides a Category instance for (:->) which is useful as it allows you to compose lenses. It also provides a lawless Point type which generalizes the notion of a lens used here, and some plumbing for dealing with isomorphisms.
One hindrance to the adoption of fclabels is that the main package includes the template-haskell plumbing, so the package is not Haskell 98, and it also requires the (fairly non-controversial) TypeOperators extension.
data-accessor
[Edit: data-accessor is no longer using this representation, but has moved to a form similar to that of data-lens. I'm keeping this commentary, though.]
data-accessor is somewhat more popular than fclabels, in part because it is Haskell 98. However, its choice of internal representation makes me throw up in my mouth a little bit.
The type T it uses to represent a lens is internally defined as
newtype T r a = Cons { decons :: a -> r -> (a, r) }
Consequently, in order to get the value of a lens, you must submit an undefined value for the 'a' argument! This strikes me as an incredibly ugly and ad hoc implementation.
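To spell that out (a sketch only; assuming, as the remark above implies, that the returned pair is (previous value, updated record)):

-- Getting a value out of data-accessor's representation: the 'a' you pass in
-- is only needed to build the updated record, which we then throw away, so
-- undefined is never forced.
getT :: T r a -> r -> a
getT l r = fst (decons l undefined r)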
That said, Henning has included the template-haskell plumbing to automatically generate the accessors for you in a separate 'data-accessor-template' package.
It has the benefit of a decently large set of packages that already employ it, being Haskell 98, and providing the all-important Category instance, so if you don't pay attention to how the sausage is made, this package is actually a pretty reasonable choice.
lenses
Next, there is the lenses package, which observes that a lens can provide a state monad homomorphism between two state monads, by defining lenses directly as such monad homomorphisms.
If it actually bothered to provide a type for its lenses, they would have a rank-2 type like:
newtype Lens s t = Lens (forall a. State t a -> State s a)
As a result, I rather don't like this approach, as it needlessly yanks you out of Haskell 98 (if you want a type to provide to your lenses in the abstract) and deprives you of the Category instance for lenses, which would let you compose them with (.). The implementation also requires multi-parameter type classes.
Note that all of the other lens libraries mentioned here provide some combinator that can be used to achieve this same state focalization effect, so nothing is gained by encoding your lens directly in this fashion.
Furthermore, the side conditions stated at the start don't really have a nice expression in this form. As with fclabels, this package does provide a template-haskell method for automatically generating lenses for a record type directly in the main package.
Because of the lack of Category instance, the baroque encoding, and the requirement of template-haskell in the main package, this is my least favorite implementation.
data-lens
[Edit: As of 1.8.0, these have moved from the comonad-transformers package to data-lens]
My data-lens package provides lenses in terms of the Store comonad.
newtype Lens a b = Lens (a -> Store b a)
where
data Store b a = Store (b -> a) b
Expanded this is equivalent to
newtype Lens a b = Lens (a -> (b, b -> a))
You can view this as factoring out the common argument from the getter and the setter to return a pair consisting of the result of retrieving the element, and a setter to put a new value back in. This offers the computational benefit that the 'setter' here can recycle some of the work used to get the value out, making for a more efficient 'modify' operation than in the fclabels definition, especially when accessors are chained.
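For example, here is a sketch of a modify combinator in this representation (modifyL is an illustrative name, not necessarily data-lens's actual API):

-- One traversal of 'a' yields both the current value and a setter,
-- so modify reuses the work done to reach the field.
modifyL :: Lens a b -> (b -> b) -> a -> a
modifyL (Lens f) g a = case f a of
  Store set b -> set (g b)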
There is also a nice theoretical justification for this representation, because the subset of 'Lens' values that satisfy the 3 laws stated in the beginning of this response are precisely those lenses for which the wrapped function is a 'comonad coalgebra' for the store comonad. This transforms 3 hairy laws for a lens l down to 2 nicely pointfree equivalents:
extract . l = id
duplicate . l = fmap l . l
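For reference, here is what extract, duplicate and fmap look like for the Store type above (a sketch with suffixed names; the real instances live in the comonad packages):

-- The store comonad operations used by the two laws above.
extractStore :: Store b a -> a
extractStore (Store f b) = f b

duplicateStore :: Store b a -> Store b (Store b a)
duplicateStore (Store f b) = Store (Store f) b

fmapStore :: (a -> c) -> Store b a -> Store b c
fmapStore g (Store f b) = Store (g . f) b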
This approach was first noted and described in Russell O'Connor's Functor is to Lens as Applicative is to Biplate: Introducing Multiplate and was blogged about based on a preprint by Jeremy Gibbons.
It also includes a number of combinators for working with lenses strictly and some stock lenses for containers, such as Data.Map.
So the lenses in data-lens form a Category (unlike the lenses package), are Haskell 98 (unlike fclabels/lenses), are sane (unlike the back end of data-accessor), and provide a slightly more efficient implementation. data-lens-fd provides the functionality for working with MonadState for those willing to step outside of Haskell 98, and the template-haskell machinery is now available via data-lens-template.
Update 6/28/2012: Other Lens Implementation Strategies
Isomorphism Lenses
There are two other lens encodings worth considering. The first gives a nice theoretical way to view a lens as a way to break a structure into the value of the field, and 'everything else'.
Given a type for isomorphisms
data Iso a b = Iso { hither :: a -> b, yon :: b -> a }
such that valid members satisfy hither . yon = id, and yon . hither = id
We can represent a lens with:
data Lens a b = forall c. Lens (Iso a (b,c))
These are primarily useful as a way to think about the meaning of lenses, and we can use them as a reasoning tool to explain other lenses.
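To connect this encoding back to the getter/setter view, a sketch (helper names are illustrative; 'c' plays the role of "everything else"):

-- Reading a get/put pair off the isomorphism encoding above.
isoGet :: Lens a b -> a -> b
isoGet (Lens (Iso h _)) a = fst (h a)

isoPut :: Lens a b -> b -> a -> a
isoPut (Lens (Iso h y)) b a = y (b, snd (h a))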
van Laarhoven Lenses
We can model lenses such that they can be composed with (.) and id, even without a Category instance by using
type Lens a b = forall f. Functor f => (b -> f b) -> a -> f a
as the type for our lenses.
Then defining a lens is as easy as:
_2 f (a,b) = (,) a <$> f b
and you can validate for yourself that function composition is lens composition.
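To see that this really is a lens, here is a sketch of how get and set fall out of the van Laarhoven form by picking concrete functors (the standard Const/Identity trick; the function names here are illustrative):

-- Requires RankNTypes for the Lens synonym above.
import Data.Functor.Const    (Const (..))
import Data.Functor.Identity (Identity (..))

viewL :: Lens a b -> a -> b
viewL l = getConst . l Const

setL :: Lens a b -> b -> a -> a
setL l b = runIdentity . l (\_ -> Identity b)

-- viewL _2 (1, "hi")      == "hi"
-- setL  _2 "yo" (1, "hi") == (1, "yo")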
I've recently written on how you can further generalize van Laarhoven lenses to get lens families that can change the types of fields, just by generalizing this signature to
type LensFamily a b c d = forall f. Functor f => (c -> f d) -> a -> f b
This does have the unfortunate consequence that the best way to talk about lenses is to use rank 2 polymorphism, but you don't need to use that signature directly when defining lenses.
The Lens I defined above for _2 is actually a LensFamily.
_2 :: Functor f => (a -> f b) -> (c,a) -> f (c, b)
I've written a library that includes lenses, lens families, and other generalizations including getters, setters, folds and traversals. It is available on hackage as the lens package.
Again, a big advantage of this approach is that library maintainers can actually create lenses in this style in their libraries without incurring any lens library dependency whatsoever, by just supplying functions of type Functor f => (b -> f b) -> a -> f a for their particular types 'a' and 'b'. This greatly lowers the cost of adoption.
Since you don't need to actually use the package to define new lenses, it takes a lot of pressure off my earlier concerns about keeping the library Haskell 98.
