Forall quantifier and complex boolean propositions in Idris - logic

I'm new to dependent types and, having a Haskell experience, am slowly learning Idris. For an exercize, I want to write a Huffman encoding. Currently I'm trying to write a proof that the "flattening" of the code tree produces a prefix code, but has got stuck with the quantifiers.
I have a simple inductive proposition that one list is a prefix of an another:
using (xs : List a, ys : List a)
data Prefix : List a -> List a -> Type where
pEmpty : Prefix Nil ys
pNext : (x : a) -> Prefix xs ys -> Prefix (x :: xs) (x :: ys)
Is this a valid approach? Or something like "xs is prefix of ys if there is zs such that xs ++ zs = ys" would be better?
Was it the right way to introduce a "forall" quantifier (as far as I can understand, in Agda it would be something like pNext : ∀ {x xs ys} → Prefix xs ys → Prefix (x :: xs) (y :: ys))? Should the pNext first argument be implicit? What is the semantic differences between two variants?
Then, I want to build one for a vector where none of elements forms a prefix of another:
data PrefVect : Vect n (List a) -> Type where
Empty vector has no prefixes:
pvEmpty : PrefVect Nil
and given an element x, a vector xs, and proofs that none element of xs is a prefix of x (and vice versa), x :: xs will hold that property:
pvNext : (x : [a]) -> (xs : Vect n [a]) ->
All (\y => Not (Prefix x y)) xs ->
All (\y => Not (Prefix y x)) xs ->
PrefVect (x :: xs)
This is an unvalid type which I hope to fix after dealing with the first one, but there is similar question about quantifiers in pvNext: is this variant acceptable, or there is a better way to form a "negation on relation"?
Thank you.

I think the only problem here is that you've used [a] as the type of lists of a, in the Haskell style, whereas in Idris it needs to be List a.
Your Prefix type looks fine to me, although I'd write it as:
data Prefix : List a -> List a -> Type where
pEmpty : Prefix [] ys
pNext : Prefix xs ys -> Prefix (x :: xs) (x :: ys)
That is, x can be implicit, and you don't need the using because Idris can infer the types of xs and ys. Whether this is the right approach or not really depends on what you plan to use the Prefix type for. It's certainly easy enough to test whether a list is the prefix of another. Something like:
testPrefix : DecEq a => (xs : List a) -> (ys : List a) -> Maybe (Prefix xs ys)
testPrefix [] ys = Just pEmpty
testPrefix (x :: xs) [] = Nothing
testPrefix (x :: xs) (y :: ys) with (decEq x y)
testPrefix (x :: xs) (x :: ys) | (Yes Refl) = Just (pNext !(testPrefix xs ys
testPrefix (x :: xs) (y :: ys) | (No contra) = Nothing
If you want to prove the negation, which it seems you might, you'll need the type to be:
testPrefix : DecEq a => (xs : List a) -> (ys : List a) -> Dec (Prefix xs ys)
I'll leave that one as an exercise :).

Related

Error with using sort on type a in Haskell

I am pretty new to Haskell and I am trying to write a function neighbours :: Int -> Metric a -> Point a -> [Point a] -> [Point a] such that neighbours k d p xs returns a list of the k nearest neighbours, in distance order, according to metric d to point p in the list xs of points. My code is
type Point a = (a, a)
type Metric a = Point a -> Point a -> Double
type Tuple a = (Double, Point a)
create:: Metric a -> Point a -> [Point a] -> [Tuple a] -> [Tuple a]
create d p (x:xs) ys | length xs == 0 = sort(((d p x), x) : ys)
| otherwise = create d p xs (((d p x), x) : ys)
takeP:: Tuple a -> Point a
takeP (_,p) = p
pList:: [Tuple a] ->[Point a]-> [Point a]
pList (x:xs) ys | length xs == 0 = reverse (takeP x : ys)
| otherwise = pList xs (takeP x : ys)
neighbours :: Int -> Metric a -> Point a -> [Point a] -> [Point a]--
neighbours k d p xs = take k (pList (create d p xs []) [])
But I am getting an error on sort which is:
* No instance for (Ord a) arising from a use of `sort'
Possible fix:
add (Ord a) to the context of
the type signature for:
create :: forall a.
Metric a -> Point a -> [Point a] -> [Tuple a] -> [Tuple a]
* In the expression: sort (((d p x), x) : ys)
In an equation for `create':
create d p (x : xs) ys
| length xs == 0 = sort (((d p x), x) : ys)
| otherwise = create d p xs (((d p x), x) : ys)
I used type Point a = (Int, Int) at first and it was working fine but in the specification it is required that Point is type Point a = (a, a) which caused my error. The other problem is that I cannot change the function types so I can't just add (Ord a) as proposed.
Is there a way to sort the Tuples' list by the first variable without encountering errors?
In your create function, you make use of sort :: Ord a => [a] -> [a]:
… = sort (((d p x), x) : ys)
this thus means that type of objects that we are sorting, in this case Tuple a, needs to be an instance of the Ord typeclass. A 2-tuple is an instance of the Ord typeclass, if both the type of the items are instances of Ord as well, so in this case Double and Point a. Since Point a is also a 2-tuple, but of two as, this thus means that Tuple a is an instance of Ord, if a is an instance of Ord. You thus should add a type constraint:
create :: Ord a => Metric a -> Point a -> [Point a] -> [Tuple a] -> [Tuple a]
create d p (x:xs) ys | length xs == 0 = sort(((d p x), x) : ys)
| otherwise = create d p xs (((d p x), x) : ys)
The create function makes use of some anti-patterns like using length, which takes linear time. You can in fact rewrite this to sorting a mapping:
create :: Ord a => Metric a -> Point a -> [Point a] -> [Tuple a]
create d p = sort . map f
where f x = (d p x, x)
This removes the ys parameter, which here only seems to be used as a accumulator.
If you wish to only sort on the first item of the 2-tuple, you can make use of sortOn :: Ord b => (a -> b) -> [a] -> [a]:
create :: Metric a -> Point a -> [Point a] -> [Tuple a]
create d p = sortOn fst . map f
where f x = (d p x, x)

Agda: what does `.(` mean?

I was looking at the code in agda-stdlib/src/Data/Vec/Base.agda
and saw .( in
take : ∀ m {n} → Vec A (m + n) → Vec A m
take m xs with splitAt m xs
take m .(ys ++ zs) | (ys , zs , refl) = ys
I tried removing the . in front of it and got the following error:
Could not parse the left-hand side
take m (ys ++ zs) | (ys , zs , refl)
Operators used in the grammar:
++ (infixr operator, level 5) [_++_ (/Users/fss/Dropbox/Documents/projects/Coding/dev/agda/agda-stdlib/src/Data/Vec/Base.agda:99,1-5)]
, (infixr operator, level 4) [_,_ (/usr/local/Cellar/agda/2.6.1/share/x86_64-osx-ghc-8.10.1/Agda-2.6.1/lib/prim/Agda/Builtin/Sigma.agda:9,15-18)]
when scope checking the left-hand side
take m (ys ++ zs) | (ys , zs , refl) in the definition of take
So I'm guessing it is necessary. But I don't understand what it is for exactly. I tried to look in https://agda.readthedocs.io/en/v2.6.1.1 but couldn't find anything about it.
Thanks!
First off, in
take m (ys ++ zs)
the pattern ys ++ zs is not valid because it is not a constructor applied on other patterns. If you think about it, in general it doesn't make sense to pattern match as function application because you'd need to be able to invert every function.
However, in
take : ∀ m {n} → Vec A (m + n) → Vec A m
take m xs with splitAt m xs
take m .(ys ++ zs) | (ys , zs , refl) = ys
we pattern match on the result of splitAt as well. The type of the third argument is (ys ++ zs) == xs, and the type of the constructor refl is (ys ++ zs) == (ys ++ zs). By unification, that means xs ~ (ys ++ zs), so the second argument to take cannot be anything other than ys ++ zs in this clause.
And this is exactly what a dot-pattern means:
A dot pattern (also called inaccessible pattern) can be used when the only type-correct value of the argument is determined by the patterns given for the other arguments. The syntax for a dot pattern is .t.

Is this a correctly implemented mergesort in Haskell?

I could not find my code anywhere on the net, so can you please tell me why or why not the function myMergeSort is a mergesort? I know my function myMergeSort sorts, but am not sure if it really sorts using the mergesort algorithm or if it is a different algorithm. I just began with Haskell a few days ago.
merge xs [] = xs
merge [] ys = ys
merge (x : xs) (y : ys)
| x <= y = x : merge xs (y : ys)
| otherwise = y : merge (x : xs) ys
myMergeSort :: [Int] -> [Int]
myMergeSort [] = []
myMergeSort (x:[]) = [x]
myMergeSort (x:xs) = foldl merge [] (map (\x -> [x]) (x:xs))
I have no questions about the merge function.
The following function mergeSortOfficial was the solution presented to us, I understand it but am not sure if I am implementing the mergesort algorithm in my function myMergeSort correctly or not.
Official solution - implemenation:
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs = merge
(mergeSortOfficial (take ((length xs) ‘div‘ 2) xs))
(mergeSortOfficial (drop ((length xs) ‘div‘ 2) xs))
No, that's not mergeSort. That's insertionSort, which is essentially the same algorithm as bubbleSort, depending on how you stare at it. At each step, a singleton list is merged with the accumulated ordered-list-so-far, so, effectively, the element of that singleton is inserted.
As other commenters have already observed, to get mergeSort (and in particular, its efficiency), it's necessary to divide the problem repeatedly into roughly equal parts (rather than "one element" and "the rest"). The "official" solution gives a rather clunky way to do that. I quite like
foldr (\ x (ys, zs) -> (x : zs, ys)) ([], [])
as a way to split a list in two, not in the middle, but into elements in even and odd positions.
If, like me, you like to have structure up front where you can see it, you can make ordered lists a Monoid.
import Data.Monoid
import Data.Foldable
import Control.Newtype
newtype Merge x = Merge {merged :: [x]}
instance Newtype (Merge x) [x] where
pack = Merge
unpack = merged
instance Ord x => Monoid (Merge x) where
mempty = Merge []
mappend (Merge xs) (Merge ys) = Merge (merge xs ys) where
-- merge is as you defined it
And now you have insertion sort just by
ala' Merge foldMap (:[]) :: [x] -> [x]
One way to get the divide-and-conquer structure of mergeSort is to make it a data structure: binary trees.
data Tree x = None | One x | Node (Tree x) (Tree x) deriving Foldable
I haven't enforced a balancing invariant here, but I could. The point is that the same operation as before has another type
ala' Merge foldMap (:[]) :: Tree x -> [x]
which merges lists collected from a treelike arrangement of elements. To obtain said arrangements, think "what's cons for Tree?" and make sure you keep your balance, by the same kind of twistiness I used in the above "dividing" operation.
twistin :: x -> Tree x -> Tree x -- a very cons-like type
twistin x None = One x
twistin x (One y) = Node (One x) (One y)
twistin x (Node l r) = Node (twistin x r) l
Now you have mergeSort by building a binary tree, then merging it.
mergeSort :: Ord x => [x] -> [x]
mergeSort = ala' Merge foldMap (:[]) . foldr twistin None
Of course, introducing the intermediate data structure has curiosity value, but you can easily cut it out and get something like
mergeSort :: Ord x => [x] -> [x]
mergeSort [] = []
mergeSort [x] = [x]
mergeSort xs = merge (mergeSort ys) (mergeSort zs) where
(ys, zs) = foldr (\ x (ys, zs) -> (x : zs, ys)) ([], []) xs
where the tree has become the recursion structure of the program.
myMergeSort is not a correct merge sort. It is a correct insertion sort though. We start with an empty list, then insert the elements one-by-one into the correct position:
myMergeSort [2, 1, 4, 3] ==
foldl merge [] [[2], [1], [4], [3]] ==
((([] `merge` [2]) `merge` [1]) `merge` [4]) `merge` [3] ==
(([2] `merge` [1]) `merge` [4]) `merge` [3]
([1, 2] `merge` [4]) `merge` [3] ==
[1, 2, 4] `merge` [3] ==
[1, 2, 3, 4]
Since each insertion takes linear time, the whole sort is quadratic.
mergeSortOfficial is technically right, but it's inefficient. length takes linear time, and it's called at each level of recursion for the total length of the list. take and drop are also linear. The overall complexity remains the optimal n * log n, but we run a couple of unnecessary circles.
If we stick to top-down merging, we could do better with splitting the list to a list of elements with even indices and another with odd indices. Splitting is still linear, but it's only a single traversal instead of two (length and then take / drop in the official sort).
split :: [a] -> ([a], [a])
split = go [] [] where
go as bs [] = (as, bs)
go as bs (x:xs) = go (x:bs) as xs
mergeSortOfficial :: [Int] -> [Int]
mergeSortOfficial [] = []
mergeSortOfficial (x : []) = [x]
mergeSortOfficial xs =
let (as, bs) = split xs in
merge (mergeSortOfficial as) (mergeSortOfficial bs)
As WillNess noted in the comments, the above split yields an unstable sort. We can use a stable alternative:
import Control.Arrow
stableSplit :: [a] -> ([a], [a])
stableSplit xs = go xs xs where
go (x:xs) (_:_:ys) = first (x:) (go xs ys)
go xs ys = ([], xs)
The best way is probably doing a bottom-up merge. It's the approach the sort in Data.List takes. Here we merge consecutive pairs of lists until there is only a single list left:
mergeSort :: Ord a => [a] -> [a]
mergeSort [] = []
mergeSort xs = mergeAll (map (:[]) xs) where
mergePairs (x:y:ys) = merge x y : mergePairs ys
mergePairs xs = xs
mergeAll [xs] = xs
mergeAll xs = mergeAll (mergePairs xs)
Data.List.sort works largely the same as above, except it starts with finding descending and ascending runs in the input instead of just creating singleton lists from the elements.

Statement for checking only once?Haskell

I have two lists of unequal length. When I add both of them I want the final list to have the length of the longest list.
addtwolists [0,0,221,2121] [0,0,0,99,323,99,32,2332,23,23]
>[0,0,221,2220,323,99,32,2332,23,23]
addtwolists [945,45,4,45,22,34,2] [0,34,2,34,2]
>[945,79,6,79,24,34,2]
zerolist :: Int -> [Integer]
zerolist x = take x (repeat 0)
addtwolists :: [Integer] -> [Integer] -> [Integer]
addtwolists x y = zipWith (+) (x ++ (zerolist ((length y)-(length x)))) (y ++ (zerolist ((length x)-(length y))))
This code is inefficient. So I tried:
addtwolist :: [Integer] -> [Integer] -> [Integer]
addtwolist x y = zipWith (+) (x ++ [head (zerolist ((length y)-(length x))) | (length y) > (length x)]) (y ++ [head (zerolist ((length x)-(length y))) | (length x) > (length y)])
Any other way to increase the efficiency?Could you only check once to see which list is bigger?
Your implementation is slow because it looks like you call the length function on each list multiple times on each step of zipWith. Haskell computes list length by walking the entire list and counting the number of elements it traverses.
The first speedy method that came to my mind was explicit recursion.
addLists :: [Integer] -> [Integer] -> [Integer]
addLists xs [] = xs
addLists [] ys = ys
addLists (x:xs) (y:ys) = x + y : addLists xs ys
I'm not aware of any standard Prelude functions that would fill your exact need, but if you wanted to generalize this to a higher order function, you could do worse than this. The two new values passed to the zip function are filler used in computing the remaining portion of the long list after the short list has been exhausted.
zipWithExtend :: (a -> b -> c) -> [a] -> [b] -> a -> b -> [c]
zipWithExtend f [] [] a' b' = []
zipWithExtend f (a:as) [] a' b' = f a b' : zipWithExtend f as [] a' b'
zipWithExtend f [] (b:bs) a' b' = f a' b : zipWithExtend f [] bs a' b'
zipWithExtend f (a:as) (b:bs) a' b' = f a b : zipWithExtend f as bs a' b'
Usage:
> let as = [0,0,221,2121]
> let bs = [0,0,0,99,323,99,32,2332,23,23]
> zipWithExtend (+) as bs 0 0
[0,0,221,2220,323,99,32,2332,23,23]
This can be done in a single iteration, which should be a significant improvement for long lists. It's probably simplest with explicit recursion:
addTwoLists xs [] = xs
addTwoLists [] ys = ys
addTwoLists (x:xs) (y:ys) = x+y:addTwoLists xs ys
Just because I can't help bikeshedding, you might enjoy this function:
Prelude Data.Monoid Data.List> :t map mconcat . transpose
map mconcat . transpose :: Monoid b => [[b]] -> [b]
For example:
> map (getSum . mconcat) . transpose $ [map Sum [0..5], map Sum [10,20..100]]
[10,21,32,43,54,65,70,80,90,100]
Two suggestions:
addtwolists xs ys =
let common = zipWith (+) xs ys
len = length common
in common ++ drop len xs ++ drop len ys
addtwolists xs ys | length xs < length ys = zipWith (+) (xs ++ repeat 0) ys
| otherwise = zipWith (+) xs (ys ++ repeat 0)

Merge sorted inputs in Haskell?

I'm a newbie to Haskell, and I'm trying to write an elegant function to merge an arbitrary number of sorted lists into a single sorted list... Can anyone provide an elegant and efficient reference implementation?
Thanks!
Something like this should work:
merge2 pred xs [] = xs
merge2 pred [] ys = ys
merge2 pred (x:xs) (y:ys) =
case pred x y of
True -> x: merge2 pred xs (y:ys)
False -> y: merge2 pred (x:xs) ys
merge pred [] = []
merge pred (x:[]) = x
merge pred (x:xs) = merge2 pred x (merge pred xs)
Here, the function merge2 merges 2 lists. The function merge merges a list of lists. The pred is predicate you use for sorting.
Example:
merge (<) [[1, 3, 9], [2, 3, 4], [7, 11, 15, 22]]
should return
[1,2,3,3,4,7,9,11,15,22]
Since I like taking advantage of infix operators and higher-order functions where it makes sense to, I would write
infixr 5 ##
(##) :: (Ord a) => [a] -> [a] -> [a]
-- if one side is empty, the merges can only possibly go one way
[] ## ys = ys
xs ## [] = xs
-- otherwise, take the smaller of the two heads out, and continue with the rest
(x:xs) ## (y:ys) = case x `compare` y of
LT -> x : xs ## (y:ys)
EQ -> x : xs ## ys
GT -> y : (x:xs) ## ys
-- a n-way merge can be implemented by a repeated 2-way merge
merge :: (Ord a) => [[a]] -> [a]
merge = foldr1 (##)
Here, xs ## ys merges two lists by their natural ordering (and drops duplicates), while merge [xs, ys, zs..] merges any number of lists.
This leads to the very natural definition of the Hamming numbers:
hamming :: (Num a, Ord a) => [a]
hamming = 1 : map (2*) hamming ## map (3*) hamming ## map (5*) hamming
hamming = 1 : merge [map (n*) hamming | n <- [2, 3, 5]] -- alternative
-- this generates, in order, all numbers of the form 2^i * 3^j * 5^k
-- hamming = [1,2,3,4,5,6,8,9,10,12,15,16,18,20,24,25,27,30,32,36,40,45,48,50,..]
Stealing yairchu's unimplemented idea:
{-# LANGUAGE ViewPatterns #-}
import qualified Data.Map as M
import Data.List (foldl', unfoldr)
import Data.Maybe (mapMaybe)
-- merge any number of ordered lists, dropping duplicate elements
merge :: (Ord a) => [[a]] -> [a]
-- create a map of {n: [tails of lists starting with n]}; then
-- repeatedly take the least n and re-insert the tails
merge = unfoldr ((=<<) step . M.minViewWithKey) . foldl' add M.empty where
add m (x:xs) = M.insertWith' (++) x [xs] m; add m _ = m
step ((x, xss), m) = Just (x, foldl' add m xss)
-- merge any number of ordered lists, preserving duplicate elements
mergeDup :: (Ord a) => [[a]] -> [a]
-- create a map of {(n, i): tail of list number i (which starts with n)}; then
-- repeatedly take the least n and re-insert the tail
-- the index i <- [0..] is used to prevent map from losing duplicates
mergeDup = unfoldr step . M.fromList . mapMaybe swap . zip [0..] where
swap (n, (x:xs)) = Just ((x, n), xs); swap _ = Nothing
step (M.minViewWithKey -> Just (((x, n), xs), m)) =
Just (x, case xs of y:ys -> M.insert (y, n) ys m; _ -> m)
step _ = Nothing
where merge, like my original, eliminates duplicates, while mergeDup preserves them (like Igor's answer).
if efficiency wasn't a concern I'd go with
merge = sort . concat
otherwise:
merge :: Ord a => [[a]] -> [a]
merge [] = []
merge lists =
minVal : merge nextLists
where
heads = map head lists
(minVal, minIdx) = minimum $ zip heads [0..]
(pre, ((_:nextOfMin):post)) = splitAt minIdx lists
nextLists =
if null nextOfMin
then pre ++ post
else pre ++ nextOfMin : post
note however that this implementation always linearly searches for the minimum (while for a large number of list one may wish to maintain a heap etc.)
Unlike the other posts, I would have merge :: [a] -> [a] -> [a]
type SortedList a = [a]
merge :: (Ord a) => SortedList a -> SortedList a -> SortedList a
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
| x < y = x : merge xs (y : ys)
| otherwise = y : merge (x : xs) ys
mergeAll :: (Ord a) => [SortedList a] -> SortedList a
mergeAll = foldr merge []
Just a quick note: if you want to have the optimal log n behavior when merging several lists (such as you'd get with a priority queue), you can do it very easily with a tweak to Igor's beautiful solution above. (I would have put this as a comment on his answer above, but I don't have enough reputation.) In particular, you do:
merge2 pred xs [] = xs
merge2 pred [] ys = ys
merge2 pred (x:xs) (y:ys) =
case pred x y of
True -> x: merge2 pred xs (y:ys)
False -> y: merge2 pred (x:xs) ys
everyother [] = []
everyother e0:[] = e0:[]
everyother (e0:e1:es) = e0:everyother es
merge pred [] = []
merge pred (x:[]) = x
merge pred xs = merge2 pred (merge pred . everyother $ xs)
(merge pred . everyother . tail $ xs)
Note that a real priority queue would be a bit faster/more space efficient, but that this solution is asymptotically just as good and, as I say, has the advantage that it's a very minor tweak to Igor's beautifully clear solution above.
Comments?

Resources