Coq list sorting practices & sortBy?

What is the equivalent of Haskell's sortBy in Coq?
In general, I find the Coq standard library around sorting confusing.
I would have hoped for some "axiomatisation" of a sorted list, and the availability of different sorting functions to which I can provide an ordering function.
However, this does not seem to be the case.
There is a theory of "Sorted" lists, which uses a relation Variable R : A -> A -> Prop, but there are no restrictions on this R. I would have expected it to be required to be an ordering, but no such restriction exists.
There is another file with a "mergesort implementation", which requires a module to be passed to it.
There is no "higher level" version which provides helpers such as sortBy.
Is there some implementation of sortBy I can use, or do I need to create this manually?

The MergeSort module in the Coq standard library does what you need. It works like sortBy in Haskell except that instead of passing an ordering function and obtaining the specialized sort, you pass a module that encapsulates the ordering function along with the proof that this function is total. See the example at the bottom of the module documentation.
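That example looks roughly like this (a sketch following the documented example, assuming a recent standard library; NatOrder and its leb are the names used there):

Require Import List Orders Mergesort.
Import ListNotations.

Module NatOrder <: TotalLeBool.
  Definition t := nat.
  Fixpoint leb x y :=
    match x, y with
    | 0, _ => true
    | _, 0 => false
    | S x', S y' => leb x' y'
    end.
  (* The proof obligation: leb is total. *)
  Theorem leb_total : forall a1 a2, leb a1 a2 = true \/ leb a2 a1 = true.
  Proof. induction a1; destruct a2; simpl; auto. Qed.
End NatOrder.

(* Passing the module to the Sort functor yields the specialized sort. *)
Module Import NatSort := Sort NatOrder.

Compute sort [5; 3; 6; 1; 8; 6; 0].
(* = [0; 1; 3; 5; 6; 6; 8] : list nat *)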

Besides the Coq standard library, the Mathematical Components library also has an implementation of merge sort. It is called sort, and it lives in the module mathcomp.path. Its signature is forall T : Type, (T -> T -> bool) -> list T -> list T, which is closer to the original sortBy.
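A usage sketch (import paths vary across MathComp versions):

From mathcomp Require Import all_ssreflect.

Compute sort leq [:: 5; 3; 6; 1].
(* = [:: 1; 3; 5; 6] *)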

Reading non-binary expressions with custom functions

I want to read in mathematical functions and interpret them.
So far I have worked with binary expressions, using the infix-to-prefix method to read strings such as 4*3+1.
However, I now want to read in more complex expressions that cannot be translated into a binary tree.
Some Examples:
max(x_1,x_2,x_3,x_4,x_5) + max(y_1,y_2)
round(interpolate(x_1,x,y),2)
customfunction(x,y,z) + 4
I am having trouble finding a way to translate the given string into a non-binary tree. What would be a good way to do this? Are there known methods?
Since I need to support my own custom functions, I cannot use an existing library.
I don't expect any code; I'm interested in the theory behind this.
Normally you'll need to define a grammar (a very simple one in this case) that defines the rules for parsing your text, and then generate a parser based on this grammar through one of the various libraries that exist. For a grammar this simple, ANTLR is surely overkill. There is a series of articles here about writing a recursive descent parser, or if you search for PEG (parsing expression grammars) on NuGet you'll find plenty of implementations.
Note that this branch of computer science is quite vast; you can start from the "lexers vs parsers" question for the theoretical part.
Look at your custom non-binary functions the same way as you look at the binary expressions.
E.g. 2+3*4 translates to +(*(3,4),2) // + and * here are just function names
You can mix in a custom function:
E.g. a^3 + Max(a,b,c)*2 translates to: +(^(a,3), *(2, Max(a,b,c)))
In your interpreter, define what +(), ^(), Max(), your_custom_function() mean and what parameters (i.e. child nodes in the tree) to expect. The tree will not be binary, but that does not really change how you create it and traverse it.
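To make that concrete, here is a minimal sketch in Haskell (all names here are made up) of such an n-ary tree plus an interpreter over it; the tree itself would be produced by a recursive descent parser as described in the previous answer:

import qualified Data.Map as M

-- Every operator and function call is the same kind of n-ary node.
data Expr
  = Num Double
  | Var String
  | Call String [Expr]   -- "+", "^", "max", "customfunction", ... any arity
  deriving Show

type Env = M.Map String Double

eval :: Env -> Expr -> Double
eval _   (Num n)       = n
eval env (Var x)       = M.findWithDefault (error ("unbound: " ++ x)) x env
eval env (Call f args) = apply f (map (eval env) args)

-- The interpreter defines what each function name means and what arity it expects.
apply :: String -> [Double] -> Double
apply "+"   [a, b] = a + b
apply "*"   [a, b] = a * b
apply "^"   [a, b] = a ** b
apply "max" xs     = maximum xs          -- n-ary, not binary
apply f     _      = error ("unknown function: " ++ f)

-- max(1,5,2) + 4 parses to: Call "+" [Call "max" [Num 1, Num 5, Num 2], Num 4]
main :: IO ()
main = print (eval M.empty (Call "+" [Call "max" [Num 1, Num 5, Num 2], Num 4]))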

Consistent formulations of sets in Coq?

I'm quite new at Coq and trying to develop a framework based on my research. My work is quite definition-heavy and I'm having trouble encoding it because of how Coq seems to treat sets.
There are Type and Set, which they call 'sorts', and I can use them to define a new set:
Variable X: Type.
And then there's a library encoding (sub)sets as 'Ensembles', which are functions from some Type to a Prop. In other words, they are predicates on a Type:
Variable Y: Ensemble X.
Ensembles feel more like proper mathematical sets. Plus, they are built upon by many other libraries. I've tried focussing on them: defining one universal set U: Set, and then limiting myself to (sub)Ensembles on U. But no. Ensembles cannot be used as types for other variables, nor to define new subsets:
Variable y: Y. (* Error *)
Variable Z: Ensemble Y. (* Error *)
Now, I know there are several ways to get around that. The question "Subset parameter" offers two. Both use coercions. The first sticks to Sets. The second essentially uses Ensembles (though not by name). But both require quite some machinery to accomplish something so simple.
Question: What is the recommended way of consistently (and elegantly) handling sets?
Example: Here's an example of what I want to do: Assume a set DD. Define a pair dm = (D, <) where D is a finite subset of DD and < is a strict partial order on D.
I'm sure that with enough tinkering with coercions or other structures, I could accomplish it; but not in a particularly readable way; and without a good intuition of how to manipulate the structure further. For example, the following type-checks:
Record OrderedSet {DD : Set} : Type := {
  D : Ensemble DD;
  order : relation {d | In _ D d};
  is_finite : Finite _ D;
  is_strict_partial : is_strict_partial_order order
}.
But I'm not so sure it's what I want; and it certainly doesn't look very pretty. Note that I'm going backwards and forwards between Set and Ensemble in a seemingly arbitrary way.
There are plenty of libraries out there which use Ensembles, so there must be a nice way to treat them, but those libraries don't seem to be documented very well (or... at all).
Update: To complicate matters further, there appear to be a number of other set implementations too, like MSets. This one seems to be completely separate and incompatible with Ensemble. It also uses bool rather than Prop for some reason. There is also FSets, but it appears to be an outdated version of MSets.
It's been (literally) years since I used Coq, but let me try to help.
I think mathematically speaking U: Set is like saying U is a universe of elements, and Ensemble U would then mean a set of elements from that universe. So for generic notions and definitions you will almost certainly use Set, and Ensemble is one possible way of reasoning about subsets of elements.
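In code, that reading looks like this (a small sketch using the stdlib Ensembles library):

Require Import Ensembles.

Section Universe.
  Variable U : Set.
  Variables A B : Ensemble U.

  (* Membership and inclusion are just predicate application and implication. *)
  Check In U A.         (* : U -> Prop *)
  Check Included U A B. (* : Prop *)
End Universe.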
I'd suggest that you take a look at the great work by Matthieu Sozeau, who introduced type classes to Coq, a very useful feature based on Haskell's type classes. In particular, in the standard library you will find a class-based definition of the PartialOrder that you mention in your question.
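For flavor, here is a hedged sketch of the class-based style; this StrictPartialOrder class is made up for illustration and is not the exact stdlib definition:

Require Import PeanoNat.

Class StrictPartialOrder {A : Type} (R : A -> A -> Prop) : Prop := {
  spo_irrefl : forall x, ~ R x x;
  spo_trans  : forall x y z, R x y -> R y z -> R x z
}.

(* nat's < is an instance; the proofs come from the standard library. *)
Instance lt_spo : StrictPartialOrder lt := {
  spo_irrefl := Nat.lt_irrefl;
  spo_trans  := Nat.lt_trans
}.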
Another reference would be the CoLoR library, which formalizes notions needed to prove termination of term rewriting. It has a fairly large set of general-purpose definitions on orders and what-not.

Why is D missing container classes?

I'm used to C++ STL containers. D has arrays, associative arrays, and strings, but where is the rest? I know about std.container, but as far as I can tell it only has one container, the red-black tree, which I could use if I needed something similar to std::set. But, what if I need a list? Am I supposed to use an array instead?
std::vector -> array
std::deque -> ?
std::queue -> ?
std::stack -> ? maybe array and std.container functions ?
std::priority_queue -> BinaryHeap
std::list -> ?
std::set -> std.container RedBlackTree
std::multiset -> ?
std::unordered_set -> ?
std::map -> associative arrays
std::multimap -> ?
std::unordered_map -> ?
Are there any plans to support any of the missing?
I believe that the main holdup for getting more containers into std.container is that Andrei Alexandrescu has been sorting out how best to deal with custom allocators, and he wants to do that before implementing all of the sundry container types, because otherwise it's going to require a lot of code changes once he does.
In the interim, you have the built-in arrays and associative arrays, and std.container contains Array (which is essentially std::vector), SList (which is a singly-linked list), RedBlackTree (which can be used for any type of set or map which uses a tree - which is what the STL's various set and map types do), and BinaryHeap.
So, there's no question that the situation needs to be improved (and it will), but I don't know how soon. Eventually, std.container should have container types which correspond to all of the STL container types.
Containers are a to-do in terms of library development in D, but no one has gotten a comprehensive container library into Phobos, because no one agrees on what the design should be, and everyone who contributes to the standard library (which has been growing very rapidly) has found more interesting things to work on.
std::vector -> array as you say
std::deque, std::queue: We don't have these yet, unfortunately.
std::stack: This can be trivially implemented on top of SList or an array.
std::set: This can be trivially implemented on top of RedBlackTree.
std::multiset: I think RedBlackTree can be set to allow duplicates.
std::unordered_set: This can be trivially implemented on top of the built-in associative array: use byte[0][SomeType] (see the sketch after this list).
std::map: Can be trivially implemented on top of RedBlackTree.
std::multimap: You can probably use associative arrays of arrays for this.
std::unordered_map: Use the built-in associative arrays.
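To make the set cases concrete, a small sketch in D (assuming a reasonably recent Phobos):

import std.container;
import std.stdio;

void main()
{
    // std::set analogue: RedBlackTree keeps elements ordered and unique.
    auto s = redBlackTree(3, 1, 2);
    s.insert(5);
    writeln(s[]);            // prints the elements in sorted order

    // std::unordered_set analogue: a zero-byte-valued associative array.
    byte[0][string] seen;
    seen["foo"] = [];        // only key presence matters; the value takes no space
    assert("foo" in seen);
    assert("bar" !in seen);
}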

lenses, fclabels, data-accessor - which library for structure access and mutation is better

There are at least three popular libraries for accessing and manipulating fields of records. The ones I know of are: data-accessor, fclabels and lenses.
Personally I started with data-accessor and I'm using it now. However, recently on haskell-cafe there was an opinion that fclabels is superior.
Therefore I'm interested in a comparison of those three (and maybe more) libraries.
There are at least 4 libraries that I am aware of providing lenses.
The notion of a lens is that it provides something isomorphic to
data Lens a b = Lens (a -> b) (b -> a -> a)
providing two functions: a getter, and a setter
get (Lens g _) = g
put (Lens _ s) = s
subject to three laws:
First, that if you put something, you can get it back out
get l (put l b a) = b
Second that getting and then setting doesn't change the answer
put l (get l a) a = a
And third, putting twice is the same as putting once, or rather, that the second put wins.
put l b1 (put l b2 a) = put l b1 a
Note that the type system isn't sufficient to check these laws for you, so you need to ensure them yourself no matter what lens implementation you use.
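As a concrete check, here is a lens for the first component of a pair in the representation above (fstL is my name for it); all three laws can be verified by expanding the definitions:

fstL :: Lens (a, c) a
fstL = Lens fst (\b (_, y) -> (b, y))

-- get fstL (put fstL 9 (1, 2)) == 9                      (law 1)
-- put fstL (get fstL (1, 2)) (1, 2) == (1, 2)            (law 2)
-- put fstL 7 (put fstL 9 (1, 2)) == put fstL 7 (1, 2)    (law 3)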
Many of these libraries also provide a bunch of extra combinators on top, and usually some form of template haskell machinery to automatically generate lenses for the fields of simple record types.
With that in mind, we can turn to the different implementations:
Implementations
fclabels
fclabels is perhaps the most easily reasoned about of the lens libraries, because its a :-> b can be directly translated to the above type. It provides a Category instance for (:->) which is useful as it allows you to compose lenses. It also provides a lawless Point type which generalizes the notion of a lens used here, and some plumbing for dealing with isomorphisms.
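For intuition, here is what a Category instance looks like for the simple Lens type from the top of this answer; fclabels' actual instance for (:->) is analogous but not identical:

import Prelude hiding (id, (.))
import Control.Category

instance Category Lens where
  id = Lens (\a -> a) (\b _ -> b)
  Lens g2 s2 . Lens g1 s1 =
    Lens (\a -> g2 (g1 a))                -- get through both lenses
         (\c a -> s1 (s2 c (g1 a)) a)     -- set inner, then put back into outer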
One hindrance to the adoption of fclabels is that the main package includes the template-haskell plumbing, so the package is not Haskell 98, and it also requires the (fairly non-controversial) TypeOperators extension.
data-accessor
[Edit: data-accessor is no longer using this representation, but has moved to a form similar to that of data-lens. I'm keeping this commentary, though.]
data-accessor is somewhat more popular than fclabels, in part because it is Haskell 98. However, its choice of internal representation makes me throw up in my mouth a little bit.
The type T it uses to represent a lens is internally defined as
newtype T r a = Cons { decons :: a -> r -> (a, r) }
Consequently, in order to get the value of a lens, you must submit an undefined value for the 'a' argument! This strikes me as an incredibly ugly and ad hoc implementation.
That said, Henning has included the template-haskell plumbing to automatically generate the accessors for you in a separate 'data-accessor-template' package.
It has the benefit of a decently large set of packages that already employ it, being Haskell 98, and providing the all-important Category instance, so if you don't pay attention to how the sausage is made, this package is actually pretty reasonable choice.
lenses
Next, there is the lenses package, which observes that a lens can provide a state monad homomorphism between two state monads, by defining lenses directly as such monad homomorphisms.
If it actually bothered to provide a type for its lenses, they would have a rank-2 type like:
newtype Lens s t = Lens (forall a. State t a -> State s a)
As a result, I rather don't like this approach, as it needlessly yanks you out of Haskell 98 (if you want a type to provide to your lenses in the abstract) and deprives you of the Category instance for lenses, which would let you compose them with (.). The implementation also requires multi-parameter type classes.
Note, all of the other lens libraries mentioned here provide some combinator or can be used to provide this same state focalization effect, so nothing is gained by encoding your lens directly in this fashion.
Furthermore, the side-conditions stated at the start don't really have a nice expression in this form. As with 'fclabels', this does provide a template-haskell method for automatically generating lenses for a record type directly in the main package.
Because of the lack of Category instance, the baroque encoding, and the requirement of template-haskell in the main package, this is my least favorite implementation.
data-lens
[Edit: As of 1.8.0, these have moved from the comonad-transformers package to data-lens]
My data-lens package provides lenses in terms of the Store comonad.
newtype Lens a b = Lens (a -> Store b a)
where
data Store b a = Store (b -> a) b
Expanded, this is equivalent to
newtype Lens a b = Lens (a -> (b, b -> a))
You can view this as factoring out the common argument from the getter and the setter to return a pair consisting of the result of retrieving the element, and a setter to put a new value back in. This offers the computational benefit that the 'setter' here can recycle some of the work used to get the value out, making for a more efficient 'modify' operation than in the fclabels definition, especially when accessors are chained.
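A sketch of that recycling in the expanded form (StoreLens and modify are my names here, not data-lens' API):

newtype StoreLens a b = StoreLens (a -> (b, b -> a))

modify :: StoreLens a b -> (b -> b) -> a -> a
modify (StoreLens f) g a =
  let (b, setb) = f a    -- one traversal yields both the value and the setter
  in setb (g b)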
There is also a nice theoretical justification for this representation, because the subset of 'Lens' values that satisfy the 3 laws stated in the beginning of this response are precisely those lenses for which the wrapped function is a 'comonad coalgebra' for the store comonad. This transforms 3 hairy laws for a lens l down to 2 nicely pointfree equivalents:
extract . l = id
duplicate . l = fmap l . l
This approach was first noted and described in Russell O'Connor's Functor is to Lens as Applicative is to Biplate: Introducing Multiplate and was blogged about based on a preprint by Jeremy Gibbons.
It also includes a number of combinators for working with lenses strictly and some stock lenses for containers, such as Data.Map.
So the lenses in data-lens form a Category (unlike the lenses package), are Haskell 98 (unlike fclabels/lenses), are sane (unlike the back end of data-accessor), and provide a slightly more efficient implementation. data-lens-fd provides the functionality for working with MonadState for those willing to step outside of Haskell 98, and the template-haskell machinery is now available via data-lens-template.
Update 6/28/2012: Other Lens Implementation Strategies
Isomorphism Lenses
There are two other lens encodings worth considering. The first gives a nice theoretical way to view a lens as a way to break a structure into the value of the field, and 'everything else'.
Given a type for isomorphisms
data Iso a b = Iso { hither :: a -> b, yon :: b -> a }
such that valid members satisfy hither . yon = id, and yon . hither = id
We can represent a lens with:
data Lens a b = forall c. Lens (Iso a (b,c))
These are primarily useful as a way to think about the meaning of lenses, and we can use them as a reasoning tool to explain other lenses.
van Laarhoven Lenses
We can model lenses such that they can be composed with (.) and id, even without a Category instance by using
type Lens a b = forall f. Functor f => (b -> f b) -> a -> f a
as the type for our lenses.
Then defining a lens is as easy as:
_2 f (a,b) = (,) a <$> f b
and you can validate for yourself that function composition is lens composition.
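For instance, getting and setting fall out by instantiating the functor to Const and Identity (a sketch; view and set are my names here):

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

view :: ((b -> Const b b) -> a -> Const b a) -> a -> b
view l = getConst . l Const

set :: ((b -> Identity b) -> a -> Identity a) -> b -> a -> a
set l b = runIdentity . l (\_ -> Identity b)

-- view _2 (1, 'x') == 'x'
-- set _2 'y' (1, 'x') == (1, 'y')
-- view (_2 . _2) (1, (2, 3)) == 3   -- plain (.) really is lens composition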
I've recently written on how you can further generalize van Laarhoven lenses to get lens families that can change the types of fields, just by generalizing this signature to
type LensFamily a b c d = forall f. Functor f => (c -> f d) -> a -> f b
This does have the unfortunate consequence that the best way to talk about lenses is to use rank 2 polymorphism, but you don't need to use that signature directly when defining lenses.
The Lens I defined above for _2 is actually a LensFamily.
_2 :: Functor f => (a -> f b) -> (c,a) -> f (c, b)
I've written a library that includes lenses, lens families, and other generalizations including getters, setters, folds and traversals. It is available on hackage as the lens package.
Again, a big advantage of this approach is that library maintainers can actually create lenses in this style in their libraries without incurring any lens library dependency whatsoever, by just supplying functions with type Functor f => (b -> f b) -> a -> f a, for their particular types 'a' and 'b'. This greatly lowers the cost of adoption.
Since you don't need to actually use the package to define new lenses, it takes a lot of pressure off my earlier concerns about keeping the library Haskell 98.

Why does F# prefer lists over arrays?

I'm trying to learn F# and was watching a video when something odd (at least, to me) came up. The video in question is here and the relevant part starts at 2:30 for those interested. But basically, the guy says that F# makes it awkward to work with arrays and that the designers did so on purpose because lists are easier to "prepend and append".
The question that immediately sprang to mind: isn't easy prepending and appending something that should be frowned upon in an immutable language? Specifically, I'm thinking of C#'s Lists where you can do something like List.Add(obj); and mutate the list. With an array you'd have to create an entirely new array, but that's also what would need to happen in an immutable language.
So why do the designers of F# prefer lists? What is the fundamental difference in an immutable environment between a list and an array? What am I missing? Are lists in F# really linked lists?
I would disagree that "F# makes it awkward to work with arrays". In fact, F# makes working with arrays quite nice compared to most languages.
For example, F# has literal array construction: let arr = [|1;2;3;4;|]. And perhaps even cooler, pattern matching on arrays:
match arr with
| [|1;2;_;_|] -> printfn "Starts with 1;2"
| [|_;_;3;4|] -> printfn "Ends with 3;4"
| _ -> printfn "Array not recognized"
As to why immutable singly-linked lists are preferred in functional programming like F#, there's a lot to say, but the short answer is that prepending is O(1), and the implementation can share nodes, so it is easy on memory. For example,
let x = [2;3]
let y = 1::x
Here y is created by prepending 1 to x, but x is neither modified nor copied, so making y was very cheap. We can see a bit how this is possible, since x points to the head, 2, of the initially constructed list and can only move forward, and since the elements of the list it points to can't be mutated, it doesn't matter that y shares nodes with it.
In functional languages lists are usually singly-linked lists, i.e. it is not necessary to copy the complete list. Instead, prepending (often called cons) is an O(1) operation, and you can still use the old list, because lists are immutable.
First of all, arrays are a pretty low-level data structure and they are really only useful if you know the length of the array when creating it. That's not often the case, and that's a reason why C# programmers use System.Collections.Generic.List<T> and F# programmers use the F# list<T>.
The reason why F# prefers its own functional list rather than using .NET List<T> is that functional languages prefer immutable types. Instead of modifying the object by calling list.Add(x), you can create new list with items added to the front by writing let newList = x::list.
I also agree with Stephen that using arrays in F# is not awkward at all. If you know the number of elements you're working with or you're transforming some existing data source, then working with arrays is quite easy:
// You can create arrays using `init`
let a = Array.init 10 (fun i -> i * 3 (* calculate i-th element here *))
// You can transform arrays using `map` and `filter`
let b = a |> Array.map (fun n -> n + 10)
          |> Array.filter (fun n -> n > 12)
// You can use array comprehensions:
let a2 = [| for n in a do
              if n > 12 then yield n + 10 |]
This is essentially the same as processing lists - there you would use list comprehensions [ ... ] and list processing functions such as List.map etc. The difference really appears just when initializing the list/array.
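For comparison, a sketch of the same pipeline over lists:

// The list versions of the same operations
let l = List.init 10 (fun i -> i * 3)
let l2 =
    l
    |> List.map (fun n -> n + 10)
    |> List.filter (fun n -> n > 12)
// List comprehension instead of array comprehension
let l3 = [ for n in l do if n > 12 then yield n + 10 ]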
F# makes it awkward to work with arrays
F# provides many features that make it easier to work with arrays than in other languages, including array literals, array patterns and higher-order functions.
The question that immediately sprang to mind: isn't easy prepending and appending something that should be frowned upon in an immutable language?
I believe you have misunderstood what that statement means. When people talk about prepending and appending in the context of purely functional data structures, they are referring to the creation of a new collection that is derived from (and shares most of its internals with) an existing collection.
So why do the designers of F# prefer lists?
F# inherited some list-related capabilities from OCaml which inherited them from Standard ML and ML because singly-linked immutable lists are very useful in the context of their application domain (metaprogramming) but I would not say that the designers of F# prefer lists.
What is the fundamental difference in an immutable environment between a list and an array?
In F#, lists provide O(1) prepend, O(n) append, and O(n) random access, whereas arrays provide O(n) prepend and append but O(1) random access. Arrays can be mutated but lists cannot.
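A small illustration of that asymmetry:

let xs = [2; 3]
let ys = 1 :: xs                        // O(1): one new cons cell; xs is shared, untouched
let arr = [| 2; 3 |]
let arr2 = Array.append [| 1 |] arr     // O(n): allocates a new array and copies both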
What am I missing?
Basic knowledge of purely functional data structures. Read Okasaki.
Are lists in F# really linked lists?
Yes. Specifically, singly-linked immutable lists. In fact, in some MLs the list type can be defined as:
type 'a list =
  | ([])
  | (::) of 'a * 'a list
This is why the :: operator is a constructor and not a function, so you cannot write (::) as you can with, for example, (+).
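Because :: is a constructor, it also works in patterns, which is where much of its value lies (a small sketch):

let rec sum xs =
    match xs with
    | [] -> 0
    | head :: tail -> head + sum tail
// sum [1; 2; 3] evaluates to 6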
An F# list is more like the following data structure - a singly-linked list:
public class List<T> {
    public List(T item, List<T> prev) { /*...*/ }
    public T Item { get; }
    public List<T> Prev { get; }
}
So when a new list is created by prepending, it actually creates a single node with a reference to the head of the previous list, rather than copying the entire structure as an array would require.
