I am working through Cormen et. al., Introduction to Algorithms, 3rd ed., but I also have an interest in Haskell. Section 8.2 (p. 194) covers counting sort. I was interested in how it and many algorithms might be implemented in haskell as they often use array access and destructive update. I took a look at the implementation on RosettaCode (copied below) and I find it very difficult to follow.
import Data.Array
countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concatMap (uncurry $ flip replicate) count
where count = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l
One of the things I like about haskell is how algorithms can be very clear (e.g. Haskell quicksort examples), at least as a specification that hasn't been optimized. This seems very unclear and I wonder if it's necessarily so or just overdone.
Can someone
clarify what's going on here,
perhaps provide a more instructive and clear implementation, and
tackle whether this is actually implementing counting sort or if non-strictness (lazyness) and immutability mean that this is actually some other sort disguised as counting sort?
It is indeed doing a counting sort. Here's a slightly rewritten version that I find easier to understand:
import Data.Array
countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concat [replicate times n | (n, times) <- counts]
where counts = assocs (accumArray (+) 0 (lo, hi) [(i, 1) | i <- l])
Let's break it down step by step. We'll use the list [5, 3, 1, 2, 3, 4, 5].
*Main> [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(5,1),(3,1),(1,1),(2,1),(3,1),(4,1),(5,1)]
We're just taking every element of the list and turning it into a tuple with 1. This is the basis for our counts. Now we need a way to sum up those counts per element. This is where accumArray comes into play.
*Main> accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
array (1,5) [(1,1),(2,1),(3,2),(4,1),(5,2)]
The first parameter to accumArray is the operation to apply during accumulation (just simple addition for us). The second parameter is the starting value, and the third parameter is the bounds. So we end up with an array mapping numbers to their counts in the input.
Next we use assocs to get key/value tuples from the map:
*Main> assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]
[(1,1),(2,1),(3,2),(4,1),(5,2)]
And then replicate to repeat each number based on its count:
*Main> [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[[1],[2],[3,3],[4],[5,5]]
Finally, we use concat to turn this list of lists into a single list:
*Main> concat [replicate times n | (n, times) <- assocs $ accumArray (+) 0 (1, 5) [(i, 1) | i <- [5, 3, 1, 2, 3, 4, 5]]]
[1,2,3,3,4,5,5]
In the actual function I wrote above, I used where to break up this one-liner.
I find list comprehension to be easier to deal with than map and concatMap, so those are the main changes I made in my version of the function.
(uncurry $ flip replicate) is a nice trick... flip replicate gives you a version of replicate that takes its arguments in the opposite order. uncurry takes that curried function and turns it into a function that takes a tuple as an argument instead. Together, those produce the same result as my list comprehension, which destructured the tuple and then passed its parameters in reverse order. I'm not familiar enough with Haskell to know if this is a common idiom, but for me, the list comprehension was easier to follow.
count through accumArray works as histogram builder. That is, for each number from lo to hi it returns how many times the number occurs in the argument list.
count :: (Ix i, Num e) => [i] -> i -> i -> [(i, e)]
count l lo hi = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l
count [6,2,1,6] 0 10 ==
[(0,0),(1,1),(2,1),(3,0),(4,0),(5,0),(6,2),(7,0),(8,0),(9,0),(10,0)]
Results of count are used to generate original elements back from this specification. That is done by replicateing each fst element of tuple snd number of times . This yields list of lists that are concatenated together.
f l lo hi = map (uncurry $ flip replicate ) $ count l lo hi
f [6,2,1,6] 0 10 == [[],[1],[2],[],[],[],[6,6],[],[],[],[]]
Full solution is equivalent to
countingSort l lo hi = concat $ f l lo hi
Related
I'm studying prolog language and i have an issue regarding this problem.
I've already created a program that, given a number N, returns a list with elements between 0 and N:
list2val(N,L):- list2val(0,N,L).
list2val(N,N,[N]).
list2val(C,N,[C|T]):-
C<N,
N1 is C+1,
list2val(N1,N,T).
?- list2val(5,X).
X = [0,1,2,3,4,5]
Now i'm trying to give an extension that, given a list, returns a list of lists in which every list is list2val only if the next number is greater than current number. In this case:
?- newFuction([1,5,2,3,9],L).
L = [[0,1],[0,1,2,],[0,1,2,3]]
My code is this, but somethings is wrong:
array(X):- array(X,_L).
array([],_L).
array([H|[T|Ts]],L1):-
H<T,
list2val(H,L2),
array([T|Ts],[L1|[L2]]).
array([T|Ts],L1).
Maybe could be too much difficult to understand but using a list L = [1,5,2,3,9] i do those steps:
check 1<5 so i create a 1 list2val until 1..in this case [0,1]
check 5<2 i dont create nothing.
check 2<3 i create list2val of 2 ...[0,1,2]
and so on...
I don't want use a standard predicates, by implement with standard terms.
A solution for your problem could be:
list2val(N,L):- list2val(0,N,L).
list2val(N,N,[N]):- !.
list2val(C,N,[C|T]):-
C<N,
N1 is C+1,
list2val(N1,N,T).
simulate([_],[]).
simulate([A,B|T],[H|T1]):-
( A < B ->
list2val(A,H),
simulate([B|T],T1);
simulate([B|T],[H|T1])
).
Using a predicate like simulate/2, you can solve your problem: it compares two numbers of the list and then create a new list in case the condition is satisfied.
?- simulate([1,5,2,3,9],LO).
LO = [[0, 1], [0, 1, 2], [0, 1, 2, 3]]
false
This question already has answers here:
Using a constrained variable with `length/2`
(4 answers)
Closed 5 years ago.
I'm using the clpfd library
?- use_module(library(clpfd)).
true.
Then I attempt to generate all 3 lists of length K with 1 <= K <= 3.
?- K in 1 .. 3, length(C, K).
K = 1,
C = [_1302] ;
K = 2,
C = [_1302, _1308] ;
K = 3,
C = [_1302, _1308, _1314] ;
ERROR: Out of global stack
I would expect the query to terminate after K = 3. For example, the following does terminate.
?- between(1, 3, K), length(X, K).
K = 1,
X = [_3618] ;
K = 2,
X = [_3618, _3624] ;
K = 3,
X = [_3618, _3624, _3630].
Why does one terminate and the other does not?
K in 1..3 simply asserts that K is somewhere between 1 and 3, without binding particular value. What you need is indomain(K) predicate, which backtracks over all values in K's domain:
K in 1..3, indomain(K), length(C, K).
Out of stack in your example happens for the following reason: length(C, K) without any of its arguments bound generates lists of different lengths, starting with 0, then 1, 2, 3, ...
Each time it generates a solution it tries bind a particular value to K, that is 0, 1, 2, ...
Now, because there are constraints applied to K, any attempts to bind a value greater than 3 will fail, meaning that length(C, K) will continue trying to find alternative solutions, that is, it will keep generating lists of length 4, 5, 6, ... and so on, all of which will be discarded. This process will continue until you exhaust your stack.
The question require me to write a predicate seqList(N, L), which is satisfied when L is the list [f0, . . . , fN].
Where the fN = fN-1 + fN-2 + fN-3
My code is to compare the head of a list given, and will return true or false when compared.
seqList(_,[]).
seqList(N,[H|T]) :-
N1 is N - 1,
seq(N,H),
seqList(N1,T).
However, it only valid when the value is reversed,
e.g. seqList(3,[1,1,0,0]) will return true, but the list should return me true for
seqList(3,[0,0,1,1]). Is there any way for me to reverse the list and verifies it correctly?
It seems that you want to generate N elements of a sequence f such that f(N) = f(N-1) + f(N-2) + f(N-3) where f(X) is the X-th element of the sequence list, 0-based. The three starting elements must be pre-set as part of the specification as well. You seem to be starting with [0,0,1, ...].
Using the approach from Lazy lists in Prolog?:
seqList(N,L):- N >= 3, !,
L=[0,0,1|X], N3 is N-3, take(N3, seq(0,0,1), X-[], _).
next( seq(A,B,C), D, seq(B,C,D) ):- D is A+B+C.
Now all these functions can be fused and inlined, to arrive at one recursive definition.
But you can do it directly. You just need to write down the question, to get the solution back.
question(N,L):-
Since you start with 0,0,1, ... write it down:
L = [0, 0, 1 | X],
since the three elements are given, we only need to find out N-3 more. Write it down:
N3 is N-3,
you've now reduced the problem somewhat. You now need to find N-3 elements and put them into the X list. Use a worker predicate for that. It also must know the three preceding numbers at each step:
worker( N3, 0, 0, 1, X).
So just write down what the worker must know:
worker(N, A, B, C, X):-
if N is 0, we must stop. X then is an empty list. Write it down.
N = 0, X = [] .
Add another clause, for when N is greater than 0.
worker(N, A, B, C, X):-
N > 0,
We know that the next element is the sum of the three preceding numbers. Write that down.
D is A + B + C,
the next element in the list is the top element of our argument list (the last parameter). Write it down:
X = [D | X2 ],
now there are one less elements to add. Write it down:
N2 is N - 1,
To find the rest of the list, the three last numbers are B, C, and D. Then the rest is found by worker in exactly the same way:
worker( N2, B, C, D, X2).
That's it. The question predicate is your solution. Rename it to your liking.
Folks,
I have N bounded sets:
S1 = {s11, s12, ... s1a }
S2 = {s21, s22, ... s2b }
...
sN= {sN1, sN2, ... sNx }
I have a function f() that takes one argument A from each set:
f( A1, A2, ... AN ) such that Ax belongs to Sx
I need to invoke f() for all possible combinations of arguments:
f( s11, s21, ... sN1 )
f( s11, s21, ... sN2 )
f( s11, s21, ... sN3 )
...
f( s11, s21, ... sNx )
...
f( s1a, s2b, ... sNx )
Can someone help me figure out a recursive (or iterative) algorithm that will hit all combinations?
Thanks in advance.
-Raj
So basically you want to generate the cartesian product s1 x s2 x ... x sN.
This is a classic application of backtracking / recursion. Here's how a pseudocode would look like:
function CartesianProduct(current, k)
if (k == N + 1)
current is one possibility, so call f(current[1], current[2], ..., current[N])
and return
for each element e in Sk
call CartesianProduct(current + {e}, k + 1)
Initial call is CartesianProduct({}, 1)
You should write it on paper and see how it works. For example, consider the sets:
s1 = {1, 2}
s2 = {3, 4}
s3 = {5, 6}
The first call will be CartesianProduct({}, 1), which will then start iterating over the elements in the first set. The first recursive call with thus be CartesianProduct({1}, 2). This will go on in the same manner, eventually reaching CartesianProduct({1, 3, 5}, 4), for which the termination condition will be true (current.Length == N + 1). Then it will backtrack and call CartesianProduct({1, 3, 6}, 4) and so on, until all possibilities are generated. Run it on paper all the way to see exactly how it works.
A
Extra credit: can you figure out how to get rid of the k parameter?
Let's say, we have a list/an array of positive integers x1, x2, ... , xn.
We can do a join operation on this sequence, that means that we can replace two elements that are next to each other with one element, which is sum of these elements. For example:
-> array/list: [1;2;3;4;5;6]
we can join 2 and 3, and replace them with 5;
we can join 5 and 6, and replace them with 11;
we cannot join 2 and 4;
we cannot join 1 and 3 etc.
Main problem is to find minimum join operations for given sequence, after which this sequence will be sorted in increasing order.
Note: empty and one-element sequences are sorted in increasing order.
Basic examples:
for [4; 6; 5; 3; 9] solution is 1 (we join 5 and 3)
for [1; 3; 6; 5] solution is also 1 (we join 6 and 5)
What I am looking for, is an algorithm that solve this problem. It could be in pseudocode, C, C++, PHP, OCaml or similar (I mean: I would understand solution, if You wrote solution in one of these languages).
This is an ideal problem to solve using Dynamic Programming, and the recurrence described by #lijie is exactly the right approach, with a few minor tweaks to ensure all possibilities are considered. There are two key observations: (a) Any sequence of join operations results in a set of non-overlapping summed subsequences of the original vector, and (b) For the optimal join-sequence, if we look to the right of any summed subsequence (m...n), that portion is an optimal solution to the problem: "find an optimal join-sequence for the sub-vector (n+1)...N such that the resulting final sequence is sorted, and all elements are >= sum(m...n).
Implementing the recurrence directly would of course result in an exponential time algorithm, but a simple tweak using Dynamic Programming makes it O(N^2), because essentially all (m,n) pairs are considered once. An easy way to implement the recurrence using Dynamic Programming is to have a data-structure indexed by (m,n) that stores the results of f(m,n) once they are computed, so that the next time we invoke f(m,n), we can lookup the previously saved results. The following code does this using the R programming language. I am using the formulation where we want to find the min-number of joins to get a non-decreasing sequence. For those new to R, to test this code, simply download R from any mirror (Google "R Project"), fire it up, and paste the two function definitions (f and solve) into the console, and then solve any vector using "solve(c(...))" as in the examples below.
f <- function(m,n) {
name <- paste(m,n)
nCalls <<- nCalls + 1
# use <<- for global assignment
if( !is.null( Saved[[ name ]] ) ) {
# the solution for (m,n) has been cached, look it up
nCached <<- nCached + 1
return( Saved[[ name ]] )
}
N <- length(vec) # vec is global to this function
sum.mn <- -Inf
if(m >= 1)
sum.mn <- sum( vec[m:n] )
if(n == N) { # boundary case: the (m,n) range includes the last number
result <- list( num = 0, joins = list(), seq = c())
} else
{
bestNum <- Inf
bestJoins <- list()
bestSeq <- c()
for( k in (n+1):N ) {
sum.nk <- sum( vec[ (n+1):k ] )
if( sum.nk < sum.mn ) next
joinRest <- f( n+1, k )
numJoins <- joinRest$num + k-n-1
if( numJoins < bestNum ) {
bestNum <- numJoins
if( k == n+1 )
bestJoins <- joinRest$joins else
bestJoins <- c( list(c(n+1,k)), joinRest$joins )
bestSeq <- c( sum.nk, joinRest$seq)
}
}
result <- list( num = bestNum, joins = bestJoins, seq = bestSeq )
}
Saved[[ name ]] <<- result
result
}
solve <- function(input) {
vec <<- input
nCalls <<- 0
nCached <<- 0
Saved <<- c()
result <- f(0,0)
cat( 'Num calls to f = ', nCalls, ', Cached = ', nCached, '\n')
cat( 'Min joins = ', result$num, '\n')
cat( 'Opt summed subsequences: ')
cat( do.call( paste,
lapply(result$joins,
function(pair) paste(pair[1], pair[2], sep=':' ))),
'\n')
cat( 'Final Sequence: ', result$seq, '\n' )
}
Here are some sample runs:
> solve(c(2,8,2,2,9,12))
Num calls to f = 22 , Cached = 4
Min joins = 2
Opt summed subsequences: 2:3 4:5
Final Sequence: 2 10 11 12
> solve(c(1,1,1,1,1))
Num calls to f = 19 , Cached = 3
Min joins = 0
Opt summed subsequences:
Final Sequence: 1 1 1 1 1
> solve(c(4,3,10,11))
Num calls to f = 10 , Cached = 0
Min joins = 1
Opt summed subsequences: 1:2
Final Sequence: 7 10 11
> solve(c (2, 8, 2, 2, 8, 3, 8, 9, 9, 2, 9, 8, 8, 7, 4, 2, 7, 5, 9, 4, 6, 7, 4, 7, 3, 4, 7, 9, 1, 2, 5, 1, 8, 7, 3, 3, 6, 3, 8, 5, 6, 5))
Num calls to f = 3982 , Cached = 3225
Min joins = 30
Opt summed subsequences: 2:3 4:5 6:7 8:9 10:12 13:16 17:19 20:23 24:27 28:33 34:42
Final Sequence: 2 10 10 11 18 19 21 21 21 21 26 46
Note that the min number of joins for the sequence considered by #kotlinski is 30, not 32 or 33.
Greedy algorithm!
import Data.List (inits)
joinSequence :: (Num a, Ord a) => [a] -> Int
joinSequence (x:xs) = joinWithMin 0 x xs
where joinWithMin k _ [] = k
joinWithMin k x xs =
case dropWhile ((< x) . snd) $ zip [0..] $ scanl1 (+) xs
of (l, y):_ -> joinWithMin (k + l) y $ drop (l+1) xs
_ -> k + length xs
joinSequence _ = 0
At each step, grab more elements until their sum is not less than the last. If you run out of elements, just join all the ones that remain to the prior group.
That was wrong.
Combinatorial explosion!
joinSequence :: (Num a, Ord a) => [a] -> Int
joinSequence = joinWithMin 0 0
where joinWithMin k _ [] = k
joinWithMin k m xs =
case dropWhile ((< m) . snd) $ zip [0..] $ scanl1 (+) xs
of [] -> k + length xs
ys -> minimum [ joinWithMin (k+l) y $ drop (l+1) xs
| (l, y) <- ys ]
Just try every possible joining and take the minimum. I couldn't think of a smart heuristic to limit backtracking, but this should be O(n²) with dynamic programming, and O(2n) as written.
A dynamic programming approach:
Let the original array be a[i], 0 <= i < N.
Define f(m, n) to be the minimum number of joins needed to make a[n..N-1] sorted, such that all elements in the sorted sublist are > (or >=, if another variant is desired) the sum of a[m..n-1] (let the sum of an empty list to be -inf).
The base case is f(m, N) = 0 (the sublist is empty).
The recursion is f(m, n) = min_{n < k <= N s.t. sum(a[n..k-1]) > sum(a[m..n-1])} f(n, k) + k-n-1. If no values of k are suitable, then let f(m, n) = inf (anything >= N will also work, because there are at most N-1 joins).
Calculate f(m,n) in decreasing order of m and n.
Then, the desired answer is f(0,0).
EDIT
Oops this is basically ephemient's second answer, I believe, although I am not familiar enough with Haskell to know exactly what it is doing.
Some Haskell code:
sortJoin (a:b:c:xs)
| a <= b = a : sortJoin (b:c:xs)
| a+b <= c = a+b : sortJoin (c:xs)
| otherwise = sortJoin (a:b+c:xs)
sortJoin (a:b:[]) = if a <= b then [a,b] else [a+b]
sortJoin a#_ = a
edits xs = length xs - length (sortJoin xs)
UPDATE: Made this work with test = [2, 8, 2, 2, 8, 3, 8, 9, 9, 2, 9, 8, 8, 7, 4, 2, 7, 5, 9, 4, 6, 7, 4, 7, 3, 4, 7, 9, 1, 2, 5, 1, 8, 7, 3, 3, 6, 3, 8, 5, 6, 5]
...now we get:
> sortJoin test
[2,8,12,20,20,23,27,28,31,55]
> edits test
32
Hopefully keeping it simple. Here's some pseudo-code that's exponential time.
Function "join" (list, max-join-count, join-count) ->
Fail if join-count is greater than max-join-count.
If the list looks sorted return join-count.
For Each number In List
Recur (list with current and next number joined, max-join-count, join-count + 1)
Function "best-join" (list) ->
max-join-count = 0
while not join (list, max-join-count++)
Here's an implementation on Clojure:
(defn join-ahead [f i v]
(concat (take i v)
[(f (nth v i) (nth v (inc i)))]
(drop (+ 2 i) v)))
(defn sort-by-joining
"Sort a list by joining neighboring elements with `+'"
([v max-join-count join-count]
(if (or (nil? max-join-count)
(<= join-count max-join-count))
(if (or (empty? v)
(= v (sort v)))
{:vector v :join-count join-count}
(loop [i 0]
(when (< (inc i) (count v))
(let [r (sort-by-joining (join-ahead + i v)
max-join-count
(inc join-count))]
(or r (recur (inc i)))))))))
([v max-join-count]
(sort-by-joining v max-join-count 0))
([v]
(sort-by-joining v nil 0)))
(defn fewest-joins [v]
(loop [i 0]
(if (sort-by-joining v i)
i
(recur (inc i)))))
(deftest test-fewest-joins
(is (= 0 (fewest-joins nil)))
(is (= 1 (fewest-joins [4 6 5 3 9])))
(is (= 6 (fewest-joins [1 9 22 90 1 1 1 32 78 13 1]))))
This is pchalasani code in F# with some modifications. The memoization is similar, I added a sumRange function generator for sums in O(1) time and moved the start position to f 1 0 to skip checking for n = 0 in minJoins.
let minJoins (input: int array) =
let length = input.Length
let sum = sumRange input
let rec f = memoize2 (fun m n ->
if n = length then
0
else
let sum_mn = sum m n
{n + 1 .. length}
|> Seq.filter (fun k -> sum (n + 1) k >= sum_mn)
|> Seq.map (fun k -> f (n + 1) k + k-n-1)
|> Seq.append {length .. length}
|> Seq.min
)
f 1 0
Full code.
open System.Collections.Generic
// standard memoization
let memoize2 f =
let cache = new Dictionary<_, _>()
(fun x1 x2 ->
match cache.TryGetValue((x1, x2)) with
| true, y -> y
| _ ->
let v = f x1 x2
cache.Add((x1, x2), v)
v)
// returns a function that takes two integers n,m and returns sum(array[n:m])
let sumRange (array : int array) =
let forward = Array.create (array.Length + 1) 0
let mutable total = 0
for i in 0 .. array.Length - 1 do
total <- total + array.[i]
forward.[i + 1] <- total
(fun i j -> forward.[j] - forward.[i - 1])
// min joins to sort an array ascending
let minJoins (input: int array) =
let length = input.Length
let sum = sumRange input
let rec f = memoize2 (fun m n ->
if n = length then
0
else
let sum_mn = sum m n
{n + 1 .. length}
|> Seq.filter (fun k -> sum (n + 1) k >= sum_mn)
|> Seq.map (fun k -> f (n + 1) k + k-n-1)
|> Seq.append {length .. length} // if nothing passed the filter return length as the min
|> Seq.min
)
f 1 0
let input = [|2;8;2;2;8;3;8;9;9;2;9;8;8;7;4;2;7;5;9;4;6;7;4;7;3;4;7;9;1;2;5;1;8;7;3;3;6;3;8;5;6;5|]
let output = minJoins input
printfn "%A" output
// outputs 30