Haskell: map length . group is way slower than explicit recursion? - performance

Consider this trivial algorithm of prime-decomposition of an integer n: Let d' be the divisor of n last found. Initially, set d'=1. Find the smallest divisor d>d' of n, and find the maximal value e such that d^e divides n. Append d^e to the answer and repeat the procedure on n/d^e. Finally, stop when n becomes 1. For simplicity, let's ignore mathematical optimizations, like stopping at sqrt n etc.
I have implemented it in two ways. The first one generates a list of division "attempts", and then groups the successful ones by divisors. For example, for n=20, we first generate [(2,20),(2,10),(2,5),(3,5),(4,5),(5,5),(5,1)], which we then transform to the desired [(2,2),(5,1)] using group and other library functions.
The second implementation is an explicit recursion which keeps track of the exponent e along the way, appends d^e to the answer once the maximal e is reached, proceeds to finding the "next" d, and so on.
Question 1: Why does the first implementation run way slower than the second, despite the following:
Both the implementations execute div, the core step of the algorithm, roughly the same number of times.
Lazy evaluation (and fusion?) should have the effect that the long list illustrated above never has to be materialized in the first place. As you can see in the code below, divTrials n, the list I am talking about, is transformed by a chain of higher-order functions. Within that chain, I would think that the part map (\xs-> (head xs,length xs)) ... group should tell the compiler that the list is just intermediate:
{-# OPTIONS_GHC -O2 #-}
module GroupCheck where
import Data.List
import Data.Maybe
implement1 :: Integral t => t -> [(t,Int)]  -- IMPLEMENTATION 1
implement1 = map (\xs -> (head xs, length xs)) . factorGroups where
  tryDiv (d,n)
    | n `mod` d == 0 = (d, n `div` d)
    | n == 1         = (1,1)  -- hack
    | otherwise      = (d+1, n)
  divTrials n = takeWhile (/= (1,1)) $ (2,n) : map tryDiv (divTrials n)
  factorGroups = filter (not . null) . map tail . group . map fst . divTrials
implement2 :: Show t => Integral t => t -> [(t,Int)]  -- IMPLEMENTATION 2
implement2 num = keep2 $ tail $ go (1,0,1,num) where
  range d n = [d+1..n]
  nextd d n = fromMaybe n $ find ((0 ==) . (n `mod`)) (range d n)
  update (d,e,de,n)
    | n `mod` d == 0 = update (d, e+1, de*d, n `div` d)
    | otherwise      = (d,e,de,n)
  go (d,e,de,1) = [(d,e,de,1)]
  go (d,e,de,n) = (d,e,de,n) : go (update (nextd d n, 0, 1, n))
  keep2 = map (\(d,e,_,_) -> (d,e))
main :: IO ()
main = do
  let n = 293872
  let ans1 = implement1 n
  let ans2 = implement2 n
  print ans1
  print ans2
Profiling tells us that tryDiv and divTrials together eat up >99% of the entire execution time:
> stack ghc -- -main-is GroupCheck.main -prof -fprof-auto -rtsopts GroupCheck
> ./GroupCheck +RTS -p >/dev/null && cat GroupCheck.prof
GroupCheck +RTS -p -RTS
        total time  =       18.34 secs   (18338 ticks @ 1000 us, 1 processor)
        total alloc = 17,561,404,568 bytes  (excludes profiling overheads)

COST CENTRE          MODULE     SRC                          %time %alloc
implement1.divTrials GroupCheck GroupCheck.hs:12:3-69         52.6   69.2
implement1.tryDiv    GroupCheck GroupCheck.hs:(8,3)-(11,25)   47.2   30.8
Question 1.5: So.. what's so bad about these functions? Also,
Question 2: In a more general case of having to aggregate contiguous blocks of identical elements from a nondecreasing sequence, should we go the bulky implement2 way if we want speed? (Again, ignoring domain-specific optimizations.)
Or did I totally miss something obvious? Thanks!

Just to establish a baseline, I ran your program on a slightly larger starting number (so that time didn't print out 0.00s). I chose n = 2938722345623 for no particular reason. Here are the timings before starting to tweak things:
ans1: indistinguishable from infinity (I finished writing this entire answer and it was still running, about 26 minutes in total)
ans2: 2.78s
The first thing to try is to tweak this line:
divTrials n = takeWhile (/=(1,1)) $ (2,n): map tryDiv (divTrials n)
This looks like a pretty natural definition, but it turns out that GHC never memoizes function calls. So if you want to make a list that's defined recursively in terms of itself, you must not make a function call in the recursion. Here's how:
divTrials n = xs where xs = takeWhile (/=(1,1)) $ (2,n): map tryDiv xs
Just that change brings the time down to 7.85s. Still off by a factor of about 3, but much much better.
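To see the difference in miniature, here is a hypothetical pair of definitions (not from the question) contrasting the two shapes; the first rebuilds the list at every recursive mention, the second names it once and shares it:

bad :: Integer -> [Integer]
bad n = 1 : map (* n) (bad n)              -- each `bad n` is a fresh, unshared list: O(k^2) for k elements

good :: Integer -> [Integer]
good n = xs where xs = 1 : map (* n) xs    -- the list refers to itself by name, built once: O(k)

Both denote the powers of n; only the second is shared, just like the fixed divTrials.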
The less obvious problem lies here:
factorGroups = filter (not.null).map tail.group.map fst.divTrials
Putting the group so early breaks fusion, causing that intermediate list to actually be materialized. This means allocating and deallocating a lot of cons cells and tuples. Here's an implementation that has the same spirit, but puts more work before the group:
tryDiv d n
  | n `mod` d == 0 = d : tryDiv d (n `div` d)
  | n == 1         = []
  | otherwise      = tryDiv (d+1) n
factorGroups = group . tryDiv 2
With that, we are down to 2.65s -- slightly faster than ans2, though I only did one test of each so it's pretty likely to just be measurement noise.
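For completeness, a sketch of how the question's implement1 looks with that change folded in (implement1' is my label, not from the original answer):

import Data.List (group)

implement1' :: Integral t => t -> [(t, Int)]
implement1' = map (\xs -> (head xs, length xs)) . factorGroups
  where
    tryDiv d n
      | n `mod` d == 0 = d : tryDiv d (n `div` d)
      | n == 1         = []
      | otherwise      = tryDiv (d+1) n
    factorGroups = group . tryDiv 2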

Related

How to optimise Haskell code to pass HackerRanks timed out test cases (Not for any ongoing competition, just me practicing)

I've been learning Haskell for around 4 months now and I have to say, the learning curve is definitely hard (scary also :p).
After solving about 15 easy questions, today I moved to my first medium difficulty problem on HackerRank https://www.hackerrank.com/challenges/climbing-the-leaderboard/problem.
It has 10 test cases and I am able to pass 6 of them, but the rest fail with a timeout. Now the interesting part is, I can already see a few parts that have potential for a performance increase; for example, I am using nub to remove duplicates from a [Int]. But still, I am not able to build a mental model for algorithmic performance, the main cause of that being unsure about how the Haskell compiler will change my code and how laziness plays a role here.
import Data.List (nub)

getInputs :: [String] -> [String]
getInputs (_:r:_:p:[]) = [r, p]

findRating :: Int -> Int -> [Int] -> Int
findRating step _ [] = step
findRating step point (x:xs) = if point >= x then step else findRating (step + 1) point xs

solution :: [[Int]] -> [Int]
solution [rankings, points] = map (\p -> findRating 1 p duplicateRemovedRatings) points
  where duplicateRemovedRatings = nub rankings

main :: IO ()
main = interact (unlines . map show . solution . map (map read . words) . getInputs . lines)
Test case in GHCi:
:l "solution"
let i = "7\n100 100 50 40 40 20 10\n4\n5 25 50 120"
(unlines . map show . solution . map (map read . words) . getInputs . lines) i  -- "6\n4\n2\n1\n"
Specific questions I have:
Will the duplicateRemovedRatings variable be calculated once, or on each iteration of the map function call?
Like in imperative programming languages, where I can verify the above using some sort of printing mechanism, is there some equivalent way of doing the same in Haskell?
According to my current understanding, the complexity of this algorithm would be:
- nub is O(n^2)
- findRating is O(n)
- getInputs is O(1)
- solution is O(n^2)
How can I reason about this and build a mental model for performance?
If this violates community guidelines, please comment and I will delete this. Thank you for the help :)
First, to answer your questions:
Yes, duplicateRemovedRatings is computed only once. No repeated computation.
To debug-trace, you can use trace and its friends (see the docs for examples and explanation). Yes, it can be used even in pure, non-IO code; there's a minimal sketch after this list. But obviously, don't use it for "normal" output.
Yes, your understanding of complexity is correct.
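For instance, a minimal sketch of point 2, wrapping the asker's findRating with a trace call (the message is purely illustrative):

import Debug.Trace (trace)

findRating :: Int -> Int -> [Int] -> Int
findRating step _ [] = step
findRating step point (x:xs) =
  trace ("comparing " ++ show point ++ " with " ++ show x) $
    if point >= x then step else findRating (step + 1) point xs

The messages go to stderr, so they don't interfere with the output that HackerRank checks.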
Now, how to pass HackerRank's tricky tests.
First, yes, you're right that nub is O(N^2). However, in this particular case you don't have to settle for that. You can use the fact that the rankings come pre-sorted to get a linear version of nub. All you have to do is skip elements while they're equal to the next one:
betterNub (x:y:rest)
  | x == y    = betterNub (y:rest)
  | otherwise = x : betterNub (y:rest)
betterNub xs = xs
This gives you O(N) for betterNub itself, but it's still not good enough for HackerRank, because the overall solution is still O(N*M) - for each game you are iterating over all rankings. No bueno.
But here you can get another improvement by observing that the rankings are sorted, and searching in a sorted list doesn't have to be linear. You can use a binary search instead!
To do this, you'll have to get yourself constant-time indexing, which can be achieved by using Array instead of list.
Here's my implementation (please don't judge harshly; I realize I probably got edge cases overengineered, but hey, it works!):
import Data.Array (listArray, bounds, (!))

findIndex arr p
  | arr!end' > p = end' + 1
  | otherwise    = go start' end'
  where
    (start', end') = bounds arr
    go start end =
      let mid = (start + end) `div` 2
          midValue = arr ! mid
      in
        if midValue == p then mid
        else if mid == start then (if midValue < p then start else end)
        else if midValue < p then go start mid
        else go mid end

solution :: [[Int]] -> [Int]
solution [rankings, points] = map (\p -> findIndex duplicateRemovedRatings p + 1) points
  where duplicateRemovedRatings = toArr $ betterNub rankings
        toArr l = listArray (0, length l - 1) l
With this, you get O(log N) for the search itself, making the overall solution O(M * log N). And this seems to be good enough for HackerRank.
(note that I'm adding 1 to the result of findIndex - this is because the exercise requires 1-based index)
I believe Fyodor's answer is excellent for your first two and a half questions. For the final half, "How can I build a mental model for performance?", I can say that SPJ is an absolute master of writing highly technical papers in a way accessible to the smart but ignorant reader. The implementation book Implementing lazy functional languages on stock hardware is excellent and can serve as the basis of a mental execution model. There is also Okasaki's thesis, Purely functional data structures, which discusses a complementary and significantly higher-level approach to doing asymptotic complexity analyses. (Actually, I read his book, which apparently includes some extra content, so bear that in mind when deciding for yourself about this recommendation.)
Please don't be daunted by their length. I personally found they were actually actively fun to read; and the topic they cover is a big one, not compressible to a short/quick answer.

Performance of Longest Substring Without Repeating Characters in Haskell

Upon reading this Python question and proposing a solution, I tried to solve the same challenge in Haskell.
I've come up with the code below, which seems to work. However, since I'm pretty new to this language, I'd like some help in understanding whether the code is good performance-wise.
import Data.Function (on)
import Data.List (foldl')

lswrc :: String -> String
lswrc s = reverse $ fst $ foldl' step ("","") s
  where
    step ("","") c = ([c],[c])
    step (maxSubstr,current) c
      | c `elem` current = step (maxSubstr, init current) c
      | otherwise = let candidate = (c:current)
                        longerThan = (>) `on` length
                        newMaxSubstr = if maxSubstr `longerThan` candidate
                                         then maxSubstr
                                         else candidate
                    in (newMaxSubstr, candidate)
Some points I think could be better than they are
I carry on a pair of strings (the longest tracked, and the current candidate) but I only need the former; thinking procedurally, there's no way to escape this, but maybe FP allows another approach?
I construct (c:current) but I use it only in the else; I could make a more complicated longerThan that adds 1 to the length of its second argument, so that I can apply it to maxSubstr and current, and construct (c:current) only in the else, without even giving it a name (see the sketch after this list).
I drop the last element of current when c is in the current string, because I'm piling up the strings with :; I could instead pattern match when checking c against the string (as in c `elem` current@(a:as)), but then when adding the new character I would have to do current ++ [c], which I know is not as performant as c:current.
I use foldl' (as I know foldl doesn't really make sense); foldr could be an alternative, but since I don't see how laziness enters this problem, I can't tell which one would be better.
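Regarding the second bullet, a sketch of that tweak might look like this (hypothetical names, same behavior as the original step):

longerThanSuccOf :: [a] -> [b] -> Bool
longerThanSuccOf xs ys = length xs > 1 + length ys

step' (maxSubstr, current) c
  | c `elem` current = step' (maxSubstr, init current) c
  | otherwise        = ( if maxSubstr `longerThanSuccOf` current
                           then maxSubstr
                           else c : current
                       , c : current )

It avoids naming candidate, though as the answer below explains, the repeated length and elem calls dominate the cost either way.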
Running elem on every iteration makes your algorithm Ω(n^2) (for strings with no repeats). Running length on, in the worst case, every iteration makes your algorithm Ω(n^2) (for strings with no repeats). Running init a lot makes your algorithm Ω(n*sqrt(n)) (for strings that are sqrt(n) repetitions of a sqrt(n)-long string, with every other one reversed, and assuming an O(1) elem replacement).
A better way is to pay one O(n) cost up front to copy into a data structure with constant-time indexing, and to keep a set (or similar data structure) of seen elements rather than a flat list. Like this:
import Data.Set (Set)
import Data.Vector (Vector)
import qualified Data.Set as S
import qualified Data.Vector as V

lswrc2 :: String -> String
lswrc2 "" = ""
lswrc2 s_ = go S.empty 0 0 0 0 where
  s = V.fromList s_
  n = V.length s
  at = V.unsafeIndex s
  go seen lo hi bestLo bestHi
    | hi == n = V.toList (V.slice bestLo (bestHi-bestLo+1) s)
    -- it is probably faster (possibly asymptotically so?) to use findIndex
    -- to immediately pick the correct next value of lo
    | at hi `S.member` seen = go (S.delete (at lo) seen) (lo+1) hi bestLo bestHi
    | otherwise = let rec = go (S.insert (at hi) seen) lo (hi+1) in
        if hi-lo > bestHi-bestLo then rec lo hi else rec bestLo bestHi
This should have O(n*log(n)) worst-case performance (achieving that worst case on strings with no repeats). There may be ways that are better still; I haven't thought super hard about it.
On my machine, lswrc2 consistently outperforms lswrc on random strings. On the string ['\0' .. '\100000'], lswrc takes about 40s and lswrc2 takes 0.03s. lswrc2 can handle [minBound .. maxBound] in about 0.4s; I gave up after more than 20 minutes of letting lswrc chew on that list.

algorithm to find products of a set of primes, in order, greater than x

Consider the finite set {2,3,5,...,n}. I am interested in primes but the question could apply to any set of numbers. I want to find all possible products of these numbers in ascending order, and in particular greater than or equal to some number x. Does anyone know a nice algorithm for this?
EDIT to clarify:
Each factor in the input set may be used any number of times. If the input were {2,3,5,7} the output would be {2,3,4,5,6,7,8,9,10,12,14,15,16,18,...}. The algorithm can stop as soon as it produces a result greater than or equal to some number x.
Haskell code, as seen in this answer:
hamm :: [Integer] -> [Integer]
hamm [] = []
hamm (p:ps) = xs                    -- e.g. hamm [2,3,5]
  where xs = merge (hamm ps)        -- H({p} ∪ ps) = S,
               (p : map (p*) xs)    -- S ⊇ {p} ∪ H(ps) ∪ { p*x | x ∊ S }
        merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b
                                | otherwise = y : merge a ys
        merge [] b = b
        merge a [] = a
merge here doesn't try to eliminate multiples, because there won't be any -- but only in case you're using only the primes in the input:
~> take 20 $ hamm [2,3,5,7]
[2,3,4,5,6,7,8,9,10,12,14,15,16,18,20,21,24,25,27,28]
If not, you need to use union instead,
union a@(x:xs) b@(y:ys) | x < y     = x : union xs b
                        | x > y     = y : union a ys
                        | otherwise = x : union xs ys
union [] b = b
union a [] = a
Starting from (above) a given value efficiently might be an interesting challenge. A directly slice-generating code at the bottom of this answer could be taken as a starting point.
In general it is easy to skip along the ordered sequence until a value is passed over. In Haskell, it is done with a built-in dropWhile (< n),
~> take 10 $ dropWhile (< 100) $ hamm [2,3,5,7]
[100,105,108,112,120,125,126,128,135,140]
(Edit: made it produce all products in ascending order; let users filter them as they wish. This is a generalised Hamming numbers problem)
genHamming :: Integral a => [a] -> [a]
genHamming zs = hmng where
  hmng = 1 : foldr (||) [] [map (z*) hmng | z <- zs]
  [] || ys = ys
  xs || [] = xs
  (x:xs) || (y:ys) | x == y = x : (xs || ys)
                   | x < y  = x : (xs || (y:ys))
                   | y < x  = y : (ys || (x:xs))
Example usage
Prelude Hamming> take 10 $ dropWhile (< 1000) $ genHamming [2,3,5]
[1000,1024,1080,1125,1152,1200,1215,1250,1280,1296]
You probably also want to include 2^0 * 3^0 * 5^0 * 7^0 = 1 in your output.
The way to do this is with a priority queue. If k is in the sequence, so are 2k, 3k, 5k and 7k. Start your output with 1, then add 2, 3, 5, and 7 to the priority queue. Pop 2 from the top of the queue and add 2*2=4, 2*3=6, 2*5=10 and 2*7=14 to the queue; the queue at that point will contain 3, 4, 5, 6, 7, 10 and 14. Pop 3 from the top of the queue and add 3*2=6, 3*3=9, 3*5=15 and 3*7=21 to the queue. And so on.
You will discover that many elements are duplicated; for instance, we added 6 to the priority queue twice in the example above. You can either add duplicates, and check each time you pop the queue if the element is the same as the prior member of the sequence, or you can keep a separate list of items in the queue and refrain from adding duplicates in the first place.
I discuss a priority queue that contains only distinct elements at my blog.
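In Haskell, a Data.Set can double as a priority queue that never stores duplicates, so the second option above is nearly free; here's a sketch (productsFrom is a made-up name):

import qualified Data.Set as Set

-- generate 1 and all products of the given primes, in ascending order,
-- popping the minimum and pushing its multiples, exactly as described above
productsFrom :: [Integer] -> [Integer]
productsFrom primes = 1 : go (Set.fromList primes)
  where
    go queue = case Set.minView queue of
      Nothing        -> []
      Just (k, rest) -> k : go (foldr (Set.insert . (k *)) rest primes)

-- e.g. takeWhile (<= 16) (productsFrom [2,3,5,7]) == [1,2,3,4,5,6,7,8,9,10,12,14,15,16]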
Every integer greater than 1 is the product of a 'set of primes' because it is the product of its prime factors. It might be easier to start at your desired minimum number and strike out all numbers that have a prime factor not in your initial set. Continue the process until your result set is large enough. In effect you are doing a modified Sieve of Eratosthenes, removing all multiples of the primes not in your initial set.
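A sketch of that sieve-like idea, with trial division standing in for the striking-out (names are mine):

-- keep only the numbers >= x whose prime factors all lie in ps
smoothFrom :: [Integer] -> Integer -> [Integer]
smoothFrom ps x = filter isSmooth [x ..]
  where
    isSmooth n = foldl divideOut n ps == 1
    divideOut m p
      | m `mod` p == 0 = divideOut (m `div` p) p
      | otherwise      = m

-- e.g. take 5 (smoothFrom [2,3,5,7] 100) == [100,105,108,112,120]

Note this spends time rejecting every non-smooth candidate, so it only pays off when the wanted numbers are dense near your starting point.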
Because our application is written in Python, I came up with the following implementation, which I wanted to share:
def powers(x):
    y = x
    while True:
        yield y
        y *= x

def products(factors):
    y0 = factors[0]
    if len(factors) == 1:
        yield from powers(y0)
    else:
        yield y0
        g1 = products(factors)
        y1 = y0 * next(g1)
        g2 = products(factors[1:])
        y2 = next(g2)
        while True:
            if y1 < y2:
                yield y1
                y1 = y0 * next(g1)
            else:
                yield y2
                y2 = next(g2)

if __name__ == "__main__":
    import itertools
    for n in itertools.islice(products([2, 3, 5, 7]), 10**6):
        print(n)
No doubt the use of recursion with generators could be improved upon, but in practice the performance is good enough for our application. Beyond that, I'm still interested in how to start at a given minimum value efficiently, as mentioned in Will Ness' answer. Thanks to all those who contributed.
There are two algorithms that come to mind for this. First, you could calculate all possible products between the numbers, and then sort these. Although this seems to be the naive approach, you could speed it up by 'remembering' the last product, dividing out one number, and multiplying a different number in its place. This would greatly reduce the number of operations necessary, and with the correct ordering of your permutations (look into Gray codes) you could minimize the total multiplications.
On the other hand, what you could do is compute the set of all the original numbers, the set of the products of pairs (of two) original numbers, the set of products of 3... and so on. Then you can sort each individual set (which shouldn't be hard to ensure they are nearly sorted anyway), and then mergesort the sets together into one sorted list of products. This would take more operations, but would result in a nearly sorted list at the end, which may take less time to construct overall.
The other algorithm would be to take the product of all the primes of interest, and call this P. Construct another list of all the original primes squared. Now, loop over all numbers up to P, and test if those are divisible by any of the values in the primes-squared array. If they are, toss them. If not, then add them to the output array. You can optimize this by only testing divisibility up to sqrt(i), where i is the iteration in your for loop. This is still likely slower than the above method though.
As every prime factor is allowed to appear many times, the sequence is infinite. So we can't generate all products and then sort them. We have to generate the sequence iteratively.
If a is a member of the sequence, then {2*a, 3*a, 5*a ... n*a} will also be members of the sequence, coming later.
So, the algorithm that comes to my mind is to have a (sorted, duplicate-free) buffer of the next sequence members. We extract and present the smallest value and insert all of its multiples into the buffer.
As it's difficult to predict the buffer content for your starting number x, this algorithm should start from the beginning and ignore the results until they reach the x threshold.

Lazily Tying the Knot for 1 Dimensional Dynamic Programming

Several years ago I took an algorithms course where we were given the following problem (or one like it):
There is a building of n floors with an elevator that can only go up 2 floors at a time and down 3 floors at a time. Using dynamic programming write a function that will compute the number of steps it takes the elevator to get from floor i to floor j.
This is obviously easy using a stateful approach: you create an array n elements long and fill it up with the values. You could even use a technically non-stateful approach that involves accumulating a result by recursively passing it around. My question is how to do this in a non-stateful manner by using lazy evaluation and tying the knot.
I think I've devised the correct mathematical formula:

f(i, j) = 0 if i == j, otherwise f(i, j) = 1 + min(f(i+2, j), f(i-3, j)),

where i+2 and i-3 are within the allowed values.
Unfortunately I can't get it to terminate. If I put the i+2 case first and then choose an even floor I can get it to evaluate the even floors below the target level but that's it. I suspect that it shoots straight to the highest even floor for everything else, drops 3 levels, then repeats, forever oscillating between the top few floors.
So it's probably exploring the infinite space (or finite but with loops) in a depth first manner. I can't think of how to explore the space in a breadth first fashion without using a whole lot of data structures in between that effectively mimic a stateful approach.
Although this simple problem is disappointingly difficult I suspect that having seen a solution in 1 dimension I might be able to make it work for a 2 dimensional variation of the problem.
EDIT: A lot of the answers tried to solve the problem in a different way. The problem itself isn't interesting to me, the question is about the method used. chaosmasttter's approach of creating a minimal function which can compare potentially infinite numbers is possibly a step in the right direction. Unfortunately, if I try to create a list representing a building with 100 floors, the result takes too long to compute, since the solutions to subproblems are not reused.
I made an attempt to use a self-referencing data structure but it doesn't terminate, there is some kind of infinite loop going on. I'll post my code so you can understand what it is I'm going for. I'll change the accepted answer if someone can actually solve the problem using dynamic programming on a self-referential data structure using laziness to avoid computing things more than once.
levels = go [0..10]
  where
    go [] = []
    go (x:xs) = minimum
      [ if i == 7
          then 0
          else 1 + levels !! i
      | i <- filter (\n -> n >= 0 && n <= 10) [x+2, x-3] ]
      : go xs
You can see how 1 + levels !! i tries to reference the previously calculated result and how filter (\n -> n >= 0 && n <= 10) [x+2,x-3] tries to limit the values of i to valid ones. As I said, this doesn't actually work, it simply demonstrates the method by which I want to see this problem solved. Other ways of solving it are not interesting to me.
Since you're trying to solve this in two dimensions, and for other problems than the one described, let's explore some more general solutions. We are trying to solve the shortest path problem on directed graphs.
Our representation of a graph is currently something like a -> [a], where the function returns the vertices reachable from the input. Any implementation will additionally require that we can compare to see if two vertices are the same, so we'll need Eq a.
The following graph is problematic, and introduces almost all of the difficulty in solving the problem in general:
problematic 1 = [2]
problematic 2 = [3]
problematic 3 = [2]
problematic 4 = []
When trying to reach 4 from 1, there is a cycle involving 2 and 3 that must be detected to determine that there is no path from 1 to 4.
Breadth-first search
The algorithm Will presented has, if applied to the general problem for finite graphs, worst case performance that is unbounded in both time and space. We can modify his solution to attack the general problem for graphs containing only finite paths and finite cycles by adding cycle detection. Both his original solution and this modification will find finite paths even in infinite graphs, but neither is able to reliably determine that there is no path between two vertices in an infinite graph.
import Data.Maybe (listToMaybe)

acyclicPaths :: (Eq a) => (a->[a]) -> a -> a -> [[a]]
acyclicPaths steps i j = map (tail . reverse) . filter ((== j) . head) $ queue
  where
    queue = [[i]] ++ gen 1 queue
    gen d _ | d <= 0 = []
    gen d (visited:t) = let r = filter (flip notElem visited) . steps . head $ visited
                        in map (:visited) r ++ gen (d + length r - 1) t

shortestPath :: (Eq a) => (a->[a]) -> a -> a -> Maybe [a]
shortestPath succs i j = listToMaybe (acyclicPaths succs i j)
Reusing the step function from Will's answer as the definition of your example problem, we could get the length of the shortest path from floor 4 to 5 of an 11 story building by fmap length $ shortestPath (step 11) 4 5. This returns Just 3.
Let's consider a finite graph with v vertices and e edges. A graph with v vertices and e edges can be described by an input of size n ~ O(v+e). The worst case graph for this algorithm is to have one unreachable vertex, j, and the remaining vertices and edges devoted to creating the largest number of acyclic paths starting at i. This is probably something like a clique containing all the vertices that aren't i or j, with edges from i to every other vertex that isn't j. The number of vertices in a clique with e edges is O(e^(1/2)), so this graph has e ~ O(n), v ~ O(n^(1/2)). This graph would have O((n^(1/2))!) paths to explore before determining that j is unreachable.
The memory required by this function for this case is O((n^(1/2))!), since it only requires a constant increase in the queue for each path.
The time required by this function for this case is O((n^(1/2))! * n^(1/2)). Each time it expands a path, it must check that the new node isn't already in the path, which takes O(v) ~ O(n^(1/2)) time. This could be improved to O(log (n^(1/2))) if we had Ord a and used a Set a or similar structure to store the visited vertices.
For non-finite graphs, this function should only fail to terminate exactly when there doesn't exist a finite path from i to j but there does exist a non-finite path from i to j.
Dynamic Programming
A dynamic programming solution doesn't generalize in the same way; let's explore why.
To start with, we'll adapt chaosmasttter's solution to have the same interface as our breadth-first search solution:
instance Show Natural where
  show = show . toNum

infinity = Next infinity

shortestPath' :: (Eq a) => (a->[a]) -> a -> a -> Natural
shortestPath' steps i j = go i
  where
    go i | i == j    = Zero
         | otherwise = Next . foldr minimal infinity . map go . steps $ i
This works nicely for the elevator problem, shortestPath' (step 11) 4 5 is 3. Unfortunately, for our problematic problem, shortestPath' problematic 1 4 overflows the stack. If we add a bit more code for Natural numbers:
fromInt :: Int -> Natural
fromInt x = (iterate Next Zero) !! x

instance Eq Natural where
  Zero == Zero = True
  (Next a) == (Next b) = a == b
  _ == _ = False

instance Ord Natural where
  compare Zero Zero = EQ
  compare Zero _    = LT
  compare _    Zero = GT
  compare (Next a) (Next b) = compare a b
we can ask if the shortest path is shorter than some upper bound. In my opinion, this really shows off what's happening with lazy evaluation: shortestPath' problematic 1 4 < fromInt 100 is False and shortestPath' problematic 1 4 > fromInt 100 is True.
Next, to explore dynamic programming, we'll need to introduce some dynamic programming. Since we will build a table of the solutions to all of the sub-problems, we will need to know the possible values that the vertices can take. This gives us a slightly different interface:
import Data.Array (Array, Ix, array, range, (!))

shortestPath'' :: (Ix a) => (a->[a]) -> (a, a) -> a -> a -> Natural
shortestPath'' steps bounds i j = go i
  where
    go i = lookupTable ! i
    lookupTable = buildTable bounds go2
    go2 i | i == j    = Zero
          | otherwise = Next . foldr minimal infinity . map go . steps $ i

-- A utility function that makes memoizing things easier
buildTable :: (Ix i) => (i, i) -> (i -> e) -> Array i e
buildTable bounds f = array bounds . map (\x -> (x, f x)) $ range bounds
We can use this like shortestPath'' (step 11) (1,11) 4 5 or shortestPath'' problematic (1,4) 1 4 < fromInt 100. This still can't detect cycles...
Dynamic programming and cycle detection
The cycle detection is problematic for dynamic programming, because the sub-problems aren't the same when they are approached from different paths. Consider a variant of our problematic problem.
problematic' 1 = [2, 3]
problematic' 2 = [3]
problematic' 3 = [2]
problematic' 4 = []
If we are trying to get from 1 to 4, we have two options:
go to 2 and take the shortest path from 2 to 4
go to 3 and take the shortest path from 3 to 4
If we choose to explore 2, we will be faced with the following option:
go to 3 and take the shortest path from 3 to 4
We want to combine the two explorations of the shortest path from 3 to 4 into the same entry in the table. If we want to avoid cycles, this is really something slightly more subtle. The problems we faced were really:
go to 2 and take the shortest path from 2 to 4 that doesn't visit 1
go to 3 and take the shortest path from 3 to 4 that doesn't visit 1
After choosing 2
go to 3 and take the shortest path from 3 to 4 that doesn't visit 1 or 2
These two questions about how to get from 3 to 4 have two slightly different answers. They are two different sub-problems which can't fit in the same spot in a table. Answering the first question eventually requires determining that you can't get to 4 from 2. Answering the second question is straightforward.
We could make a bunch of tables for each possible set of previously visited vertices, but that doesn't sound very efficient. I've almost convinced myself that we can't do reach-ability as a dynamic programming problem using only laziness.
Breadth-first search redux
While working on a dynamic programming solution with reach-ability or cycle detection, I realized that once we have seen a node in the options, no later path visiting that node can ever be optimal, whether or not we follow that node. If we reconsider problematic':
If we are trying to get from 1 to 4, we have two options:
go to 2 and take the shortest path from 2 to 4 without visiting 1, 2, or 3
go to 3 and take the shortest path from 3 to 4 without visiting 1, 2, or 3
This gives us an algorithm to find the length of the shortest path quite easily:
import Data.List (findIndex)
import qualified Data.Set as Set

-- Vertices first reachable in each generation
generations :: (Ord a) => (a->[a]) -> a -> [Set.Set a]
generations steps i = takeWhile (not . Set.null) $ Set.singleton i : go (Set.singleton i) (Set.singleton i)
  where go seen previouslyNovel = let reachable = Set.fromList (Set.toList previouslyNovel >>= steps)
                                      novel = reachable `Set.difference` seen
                                      nowSeen = reachable `Set.union` seen
                                  in novel : go nowSeen novel

lengthShortestPath :: (Ord a) => (a->[a]) -> a -> a -> Maybe Int
lengthShortestPath steps i j = findIndex (Set.member j) $ generations steps i
As expected, lengthShortestPath (step 11) 4 5 is Just 3 and lengthShortestPath problematic 1 4 is Nothing.
In the worst case, generations requires space that is O(v*log v), and time that is O(v*e*log v).
The problem is that min needs to fully evaluate both calls to f, so if one of them loops infinitely, min will never return. So you have to create a new type, encoding that the number returned by f is Zero or the Successor (Next) of another number.
data Natural = Next Natural
             | Zero

toNum :: Num n => Natural -> n
toNum Zero     = 0
toNum (Next n) = 1 + toNum n

minimal :: Natural -> Natural -> Natural
minimal Zero _ = Zero
minimal _ Zero = Zero
minimal (Next a) (Next b) = Next $ minimal a b

f i j | i == j    = Zero
      | otherwise = Next $ minimal (f l j) (f r j)
  where l = i + 2
        r = i - 3
This code actually works.
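For instance (ignoring floor bounds, as the code above does):

-- ghci> toNum (f 4 5) :: Int
-- 3    -- via 4 -> 6 -> 3 -> 5, or 4 -> 1 -> 3 -> 5

The laziness of minimal is what lets this terminate even though many branches of the search never reach j.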
Standing on floor i of an n-story building, find the minimal number of steps it takes to get to floor j, where
step n i = [i-3 | i-3 > 0] ++ [i+2 | i+2 <= n]
Thus we have a tree. We need to search it in breadth-first fashion until we get a node holding the value j; its depth is the number of steps. We build a queue, carrying the depth levels:
solution n i j = case dropWhile ((/= j) . snd) queue of
                   []        -> Nothing
                   ((k,_):_) -> Just k
  where
  queue = [(0,i)] ++ gen 1 queue
The function gen d p takes its input p from d notches back from its production point along the output queue:
gen d _ | d <= 0 = []
gen d ((k,i1):t) = let r = step n i1
                   in map (k+1,) r ++ gen (d + length r - 1) t
Uses TupleSections. There's no knot tying here, just corecursion, i.e. (optimistic) forward production and frugal exploration. Works fine without knot tying because we only look for the first solution. If we were searching for several of them, then we'd need to eliminate the cycles somehow.
see also: https://en.wikipedia.org/wiki/Corecursion#Discussion
With the cycle detection:
solutionCD1 n i j = case dropWhile ((/= j) . snd) queue of
                      []        -> Nothing
                      ((k,_):_) -> Just k
  where
  step n i visited = [i2 | let i2 = i-3, not $ elem i2 visited, i2 > 0]
                  ++ [i2 | let i2 = i+2, not $ elem i2 visited, i2 <= n]
  queue = [(0,i)] ++ gen 1 queue [i]
  gen d _ _ | d <= 0 = []
  gen d ((k,i1):t) visited = let r = step n i1 visited
                             in map (k+1,) r ++
                                gen (d + length r - 1) t (r ++ visited)
e.g. solutionCD1 100 100 7 runs instantly, producing Just 31. The visited list is pretty much a copy of the instantiated prefix of the queue itself. It could be maintained as a Map, to improve time complexity (as it is, solutionCD1 10000 10000 7 => Just 3331 takes 1.27 secs on Ideone).
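In the spirit of that suggestion, here is an untested sketch that keeps visited in a Data.Set instead of a list, so membership tests cost O(log n) rather than O(n) (solutionCD2 is a made-up name):

import qualified Data.Set as Set

solutionCD2 n i j = case dropWhile ((/= j) . snd) queue of
                      []        -> Nothing
                      ((k,_):_) -> Just k
  where
  step i1 visited = [i2 | let i2 = i1-3, i2 `Set.notMember` visited, i2 > 0]
                 ++ [i2 | let i2 = i1+2, i2 `Set.notMember` visited, i2 <= n]
  queue = [(0,i)] ++ gen 1 queue (Set.singleton i)
  gen d _ _ | d <= 0 = []
  gen d ((k,i1):t) visited = let r = step i1 visited
                             in map (\x -> (k+1, x)) r ++
                                gen (d + length r - 1) t (foldr Set.insert visited r)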
Some explanations seem to be in order.
First, there's nothing 2D about your problem, because the target floor j is fixed.
What you seem to want is memoization, as your latest edit indicates. Memoization is useful for recursive solutions; your function is indeed recursive - analyzing its argument into sub-cases, synthesizing its result from results of calling itself on sub-cases (here, i+2 and i-3) which are closer to the base case (here, i==j).
Because the arithmetic is strict, your formula diverges in the presence of any infinite path in the tree of steps (going from floor to floor). The answer by chaosmasttter, by using lazy arithmetic instead, turns it automagically into a breadth-first search algorithm which diverges only if there are no finite paths in the tree, exactly like my first solution above (save for the fact that it's not checking for out-of-bounds indices). But it is still recursive, so memoization is indeed called for.
The usual way to approach it first, is to introduce sharing by "going through a list" (inefficient, because of sequential access; for efficient memoization solutions see hackage):
f n i j = g i
  where
  gs = map g [0..n]   -- floors 1,...,n (0 is unused)
  g i | i == j    = Zero
      | r > n     = Next (gs !! l)   -- assuming there's enough floors in the building
      | l < 1     = Next (gs !! r)
      | otherwise = Next $ minimal (gs !! l) (gs !! r)
    where r = i + 2
          l = i - 3
not tested.
My solution is corecursive. It needs no memoization (just needs to be careful with the duplicates), because it is generative, like the dynamic programming is too. It proceeds away from its starting case, i.e. the starting floor. An external accessor chooses the appropriate generated result.
It does tie a knot - it defines queue by using it - queue is on both sides of the equation. I consider it the simpler case of knot tying, because it is just about accessing the previously generated values, in disguise.
The knot tying of the 2nd kind, the more complicated one, is usually about putting some yet-undefined value in some data structure and returning it to be defined by some later portion of the code (like e.g. a back-link pointer in a doubly-linked circular list); this is indeed not what my code¹ is doing. What it does do is generate a queue, adding at its end and "removing" from its front; in the end it's just the difference list technique of Prolog, the open-ended list with its end pointer maintained and updated, the top-down list building of tail recursion modulo cons - all the same things conceptually. First described (though not named) in 1974, AFAIK.
¹ based entirely on the code from Wikipedia.
Others have answered your direct question about dynamic programming. However, for this kind of problem I think the greedy approach works the best. Its implementation is very straightforward.
f :: Int -> Int -> Int
f i j = snd $ until (\(i,_) -> i == j)
              (\(i,x) -> (i + if i < j then 2 else (-3), x+1))
              (i,0)

F# Efficiently removing n items from the end of a Set

I know I can remove the last element from a set:
s.Remove(s.MaximumElement)
But if I want to remove the n maximum elements... do I just execute the above n times, or is there a faster way to do that?
To be clear, this is an obvious solution:
let rec removeLastN (s : Set<'a>, num : int) : Set<'a> =
    match num with
    | 0 -> s
    | _ -> removeLastN (s.Remove(s.MaximumElement), num-1)
But it involves creating a new set n times. Is there a way to do it and only create a new set once?
But it involves creating a new set n times. Is there a way to do it and only create a new set once?
To the best of my knowledge, no. I'd say what you have is a perfectly fine implementation; it runs in O(lg n) and it's concise too :) Most heap implementations give you O(lg n) for delete min anyway, so what you have is about as good as you can get it.
You might be able to get a little better speed by rolling your own balanced tree and implementing a function to drop a left or right branch for all values greater than a certain value. I don't think an AVL tree or RB tree is appropriate in this context, since you can't really maintain their invariants, but a randomized tree will give you the results you want.
A treap works awesome for this, because it uses randomization rather than tree invariants to keep itself relatively balanced. Unlike an AVL tree or a RB-tree, you can split a treap on a node without worrying about it being unbalanced. Here's a treap implementation I wrote a few months ago:
http://pastebin.com/j0aV3DJQ
I've added a split function, which allows you to take a tree and return two trees containing all values less than and all values greater than a given value. split runs in O(lg n) using a single pass through the tree, so you can prune entire branches of your tree in one shot -- provided that you know which value to split on.
But if I want to remove the n maximum elements... do I just execute the above n times, or is there a faster way to do that?
Using my Treap class:
open Treap

let nthLargest n t = Seq.nth n (Treap.toSeqBack t)

let removeTopN n t =
    let largest = nthLargest n t
    let smallerValues, wasFound, largerValues = t.Split(largest)
    smallerValues

let e = Treap.empty(fun (x : int) (y : int) -> x.CompareTo(y))
let t = [1 .. 100] |> Seq.fold (fun (acc : Treap<_>) x -> acc.Insert(x)) e
let t' = removeTopN 10 t
removeTopN runs in O(n + lg m) time, where n is the index into the tree sequence and m is the number of items in the tree.
I make no guarantees about the accuracy of my code, use at your own peril ;)
In F#, you can use Set.partition or Set.filter to create subsets:
let s = Set([1;4;6;9;100;77])
let a, b = Set.partition (fun x -> x <= 10) s
let smallThan10 = Set.filter (fun x -> x < 10) s
In your question, maybe you don't know the value of the ith number of your set, so here is a handy function for that:
let nth (n:int) (s:'a Set) =
    s |> Set.toSeq |> Seq.nth n
Now, we can write the remove-top-n function:
let removeTopN n (s:'a Set) =
    let size = s.Count
    let m = size - n
    let mvalue = nth m s
    Set.filter (fun x -> x < mvalue) s
and test it:
removeTopN 3 s
and we get:
val it : Set<int> = set [1; 4; 6]
Notice that the removeTopN does not work for a set containing multiple same values.
That is already a pretty good solution. OCaml has a split function that can split a Set, so you can find the right element and then split the Set to remove a bunch of elements at a time. Alternatively, you can use Set.difference to extract another Set of elements.
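For what it's worth, Haskell's Data.Set exposes similar tools (split and elemAt), so a one-split version of the same idea looks like this (a sketch; removeTopN is my name for it):

import qualified Data.Set as Set

-- drop the n largest elements with one split instead of n deletions
removeTopN :: Ord a => Int -> Set.Set a -> Set.Set a
removeTopN n s
  | n <= 0          = s
  | n >= Set.size s = Set.empty
  | otherwise       = fst (Set.split pivot s)
  where pivot = Set.elemAt (Set.size s - n) s   -- smallest of the n largest

Set.split pivot s returns the elements below and above pivot, so keeping the first component discards pivot together with everything larger in one O(lg n) pass.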
