Haskell groupBy depending on accumulator value - algorithm

I have a list of pairs of views which represents list of content labels and their widths which I want to group in lines (if the next content label doesn't fit in line then put it into another line). So we have: viewList = [(View1, 45), (View2, 223.5), (View3, 14) (View4, 42)].
I want to write a function groupViews :: [a] -> [[a]] to group this list into a list of sublists where each sublist will contain only views with sum of widths less than the maximum specified width (let's say 250).
So for a sorted viewList this function will return : [[(View3, 14), (View4, 42), (View1, 45)],[(View2, 223.5)]]
It looks similar to groupBy. However, groupBy doesn't maintain an accumulator. I tried to use scanl + takeWhile(<250) combination but in this case I was able to receive only first valid sublist. Maybe use iterate + scanl + takeWhile somehow? But this looks very cumbersome and not functional at all. Any help will be much appreciated.

I would start with a recursive definition like this:
groupViews :: Double -> (a -> Double) -> [a] -> [[a]]
groupViews maxWidth width = go (0, [[]])
where
go (current, acc : accs) (view : views)
| current + width view <= maxWidth
= go (current + width view, (view : acc) : accs) views
| otherwise = go (width view, [view] : acc : accs) views
go (_, accs) []
= reverse $ map reverse accs
Invoked like groupViews 250 snd (sortOn snd viewList). The first thing I notice is that it can be represented as a left fold:
groupViews' maxWidth width
= reverse . map reverse . snd . foldl' go (0, [[]])
where
go (current, acc : accs) view
| current + width view <= maxWidth
= (current + width view, (view : acc) : accs)
| otherwise
= (width view, [view] : acc : accs)
I think this is fine, though you could factor it further if you like, into one scan to accumulate the widths modulo the max width, and another pass to group the elements into ascending runs. For example, here’s a version that works on integer widths:
groupViews'' maxWidth width views
= map fst
$ groupBy ((<) `on` snd)
$ zip views
$ drop 1
$ scanl (\ current view -> (current + width view) `mod` maxWidth) 0 views
And of course you can include the sort in these definitions instead of passing the sorted list from outside.

I don't know a clever way to do this just by combining functions from the standard library, but I do think you can do better than just implementing it from scratch.
This problem fits into a class of problems that I've seen before: "batch up items from this list somehow, and combine its items into batches according to some combination rule and some rule for deciding when a batch is too big". Years ago, when I was writing Clojure, I built a function that abstracted out this idea of batched combinations, just asking you to specify the rules for batching, and was able to use it in a surprising number of places.
Here's how I think it might be reimagined in Haskell:
glue :: Monoid a => (a -> Bool) -> [a] -> [a]
glue tooBig = go mempty
where go current [] = [current]
go current (x:xs) | tooBig x' = current : go x xs
| otherwise = go x' xs
where x' = current `mappend` x
If you had such a glue function already, you could build a simple data type with the appropriate Monoid instance (a list of objects and their cumulative sum), and then let glue do the heavy lifting:
import Data.Monoid (Sum(..))
data ViewGroup contents size = ViewGroup {totalSize :: size,
elements :: [(contents, size)]}
instance Monoid b => Monoid (ViewGroup a b) where
mempty = ViewGroup mempty []
mappend (ViewGroup lSize lElts) (ViewGroup rSize rElts) =
ViewGroup (lSize `mappend` rSize)
(lElts ++ rElts)
viewGroups = let views = [("a", 14), ("b", 42), ("c", 45), ("d", 223.5)]
in glue ((> 250) . totalSize) [ViewGroup (Sum width) [(x, Sum width)]
| (x, width) <- views]
main = print (viewGroups :: [ViewGroup String (Sum Double)])
[ViewGroup {totalSize = Sum {getSum = 101.0},
elements = [("a",Sum {getSum = 14.0}),
("b",Sum {getSum = 42.0}),
("c",Sum {getSum = 45.0})]},
ViewGroup {totalSize = Sum {getSum = 223.5},
elements = [("d",Sum {getSum = 223.5})]}]
On the one hand this looks like quite a bit of work for a simple function, but on the other it's rather nice to have a type that describes the cumulative summing you're doing, and Monoid instances are nice to have anyway...and after defining the type and the Monoid instance there's almost no work left to do in the calling of glue itself.
Well, I don't know, maybe it's still too much work, especially if you don't believe you can reuse that type. But I do think it's useful to recognize that this is a specific case of a more general problem, and try to solve the more general problem as well.

Given that groupBy and span themselves are defined by manual recursive functions, our modified functions will use the same mechanism.
Let us first define a general function groupAcc which takes an initial value for the accumulator, and then a function which takes an element in the list, the current accumulator state and potentially produces a new accumulated value (Nothing means the element is not accepted):
{-# LANGUAGE LambdaCase #-}
import Data.List (sortOn)
import Control.Arrow (first, second)
spanAcc :: z -> (a -> z -> Maybe z) -> [a] -> ((z, [a]), [a])
spanAcc z0 p = \case
xs#[] -> ((z0, xs), xs)
xs#(x:xs') -> case p x z0 of
Nothing -> ((z0, []), xs)
Just z1 -> first (\(z2, xt) -> (if null xt then z1 else z2, x : xt)) $
spanAcc z1 p xs'
groupAcc :: z -> (a -> z -> Maybe z) -> [a] -> [(z, [a])]
groupAcc z p = \case
[] -> [] ;
xs -> uncurry (:) $ second (groupAcc z p) $ spanAcc z p xs
For our specific problem, we define:
threshold :: (Num a, Ord a) => a -> a -> a -> Maybe a
threshold max a z0 = let z1 = a + z0 in if z1 < max then Just z1 else Nothing
groupViews :: (Ord z, Num z) => [(lab, z)] -> [[(lab, z)]]
groupViews = fmap snd . groupAcc 0 (threshold 250 . snd)
Which finally gives us:
groupFinal :: (Num a, Ord a) => [(lab, a)] -> [[(lab, a)]]
groupFinal = groupViews . sortOn snd
And ghci gives us:
> groupFinal [("a", 45), ("b", 223.5), ("c", 14), ("d", 42)]
[[("c",14.0),("d",42.0),("a",45.0)],[("b",223.5)]]
If we want to, we can simplify groupAcc by assuming that z is a Monoid wherefore mempty may be used, such that:
groupAcc2 :: Monoid z => (a -> z -> Maybe z) -> [a] -> [(z, [a])]
groupAcc2 p = \case
[] -> [] ;
xs -> let z = mempty in
uncurry (:) $ second (groupAcc z p) $ spanAcc z p xs

Related

Haskell - Sort by first second element and then by first element

I have a list of tuples and I would like to sort it by second element (descending) and then by first element (ascending).
My code looks like this:
sortedOcc :: Eq a => [a] -> [(a, Int)]
sortedOcc = sortBy (flip compare `on` snd) . occurences
and this is the first sorting by the second element of list returned by occurences (function). How should I add the second sort (ascending) by the first element?
The Data.Ord module provides a Down newtype whose purpose is solely to reverse the ordering.
It also provides a comparing function:
comparing :: Ord a => (b -> a) -> b -> b -> Ordering
which must be fed some transformation function before it can be passed to sortBy.
Like this:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> sortBy (comparing (\(a,v) -> (Down v, a))) [(1,2),(1,3),(5,2),(5,3)]
[(1,3),(5,3),(1,2),(5,2)]
λ>
The values returned by the transformation function are then sorted using their own “natural” order. In our case, this is the lexicographic order on pairs of ordered types.
Overall, the code would require an Ord a constraint:
sortedOcc :: Ord a => [a] -> [(a, Int)]
sortedOcc = sortBy (comparing (\(a,v) -> (Down v, a))) . occurences
I'd probably write this using the Monoid instance on Ordering and on function types.
Sorting on the second value in the tuple looks like flip compare `on` snd, as you've already determined, while sorting on the first value looks like compare `on` fst.
These can be combined Monoidally with <>.
d :: [(String , Int)]
d = [("b", 1), ("a", 1), ("c",3), ("d",4)]
sortedD = sortBy ((flip compare `on` snd) <> (compare `on` fst)) d
I know that the rest of the answers are shorter, but I recommend you to implement these lazy functions yourself before using the already Haskell implemented ones, so you understand how it works.
-- Order a list of tuples by the first item
orderBy1stTupleItem :: Ord a => (a, b1) -> (a, b2) -> Ordering
orderBy1stTupleItem tup1 tup2
| item1 > item2 = GT
| item1 < item2 = LT
| otherwise = EQ
where
item1 = fst tup1
item2 = fst tup2
-- Order a list of tuples by the second item
orderBy2ndTupleItem :: Ord a1 => (a2, a1) -> (a3, a1) -> Ordering
orderBy2ndTupleItem tup1 tup2
| item1 > item2 = GT
| item1 < item2 = LT
| otherwise = EQ
where
item1 = snd tup1
item2 = snd tup2
-- Wrapper Function: Order a list of tuples by the first item and later by the second item
orderTuplesBy1stThenBy2ndItem :: (Ord a1, Ord a2) => [(a2, a1)] -> [(a2, a1)]
orderTuplesBy1stThenBy2ndItem listTuples =
sortBy orderBy2ndTupleItem (sortBy orderBy1stTupleItem listTuples)
Example
let exampleListTuples = [(1,2),(0,8),(6,1),(3,6),(9,1),(7,8),(0,9)]
Then let's get the 1st list, ordered by the first item of each tuple:
> listOrderedByTuple1stItem = sortBy orderBy1stTupleItem exampleListTuples
> listOrderedByTuple1stItem
[(0,8),(0,9),(1,2),(3,6),(6,1),(7,8),(9,1)]
Now we order this result list by the second item of each tuple
> sortBy orderBy2ndTupleItem listOrderedByTuple1stItem
[(6,1),(9,1),(1,2),(3,6),(0,8),(7,8),(0,9)]
Or, you can just run the wrapper function orderTuplesBy1stThenBy2ndItem as follows:
> sortBy orderTuplesBy1stThenBy2ndItem exampleListTuples
What is sortBy's signature?
sortBy :: (a -> a -> Ordering) -> [a] -> [a]
This means that its first argument must have the type a -> a -> Ordering:
sortedOcc :: Eq a => [a] -> [(a, Int)]
sortedOcc = sortBy g . occurences
g :: a -> a -> Ordering
g = (flip compare `on` snd)
but that means that
g :: a -> a -> Ordering
g x y = (flip compare `on` snd) x y
= flip compare (snd x) (snd y)
= compare (snd y) (snd x)
and so to add your requirement into the mix we simply have to write it down,
= let test1 = compare (snd y) (snd x)
test2 = compare (snd y) (snd x)
in ......
right?
The above intentionally contains errors, which should be straightforward for you to fix.
A word of advice, only use point-free code if it is easy and natural for you to read and write, and modify.

Is there such a thing as maximumWith?

Specifically I'm searching for a function 'maximumWith',
maximumWith :: (Foldable f, Ord b) => (a -> b) -> f a -> a
Which behaves in the following way:
maximumWith length [[1, 2], [0, 1, 3]] == [0, 1, 3]
maximumWith null [[(+), (*)], []] == []
maximumWith (const True) x == head x
My use case is picking the longest word in a list.
For this I'd like something akin to maximumWith length.
I'd thought such a thing existed, since sortWith etc. exist.
Let me collect all the notes in the comments together...
Let's look at sort. There are 4 functions in the family:
sortBy is the actual implementation.
sort = sortBy compare uses Ord overloading.
sortWith = sortBy . comparing is the analogue of your desired maximumWith. However, this function has an issue. The ranking of an element is given by applying the given mapping function to it. However, the ranking is not memoized, so if an element needs to compared multiple times, the ranking will be recomputed. You can only use it guilt-free if the ranking function is very cheap. Such functions include selectors (e.g. fst), and newtype constructors. YMMV on simple arithmetic and data constructors. Between this inefficiency, the simplicity of the definition, and its location in GHC.Exts, it's easy to deduce that it's not used that often.
sortOn fixes the inefficiency by decorating each element with its image under the ranking function in a pair, sorting by the ranks, and then erasing them.
The first two have analogues in maximum: maximumBy and maximum. sortWith has no analogy; you may as well write out maximumBy (comparing _) every time. There is also no maximumOn, even though such a thing would be more efficient. The easiest way to define a maximumOn is probably just to copy sortOn:
maximumOn :: (Functor f, Foldable f, Ord r) => (a -> r) -> f a -> a
maximumOn rank = snd . maximumBy (comparing fst) . fmap annotate
where annotate e = let r = rank e in r `seq` (r, e)
There's a bit of interesting code in maximumBy that keeps this from optimizing properly on lists. It also works to use
maximumOn :: (Foldable f, Ord r) => (a -> r) -> f a -> a
maximumOn rank = snd . fromJust . foldl' max' Nothing
where max' Nothing x = let r = rank x in r `seq` Just (r, x)
max' old#(Just (ro, xo)) xn = let rn = rank xn
in case ro `compare` rn of
LT -> Just (rn, xo)
_ -> old
These pragmas may be useful:
{-# SPECIALIZE maximumOn :: Ord r => (a -> r) -> [a] -> a #-}
{-# SPECIALIZE maximumOn :: (a -> Int) -> [a] -> a #-}
HTNW has explained how to do what you asked, but I figured I should mention that for the specific application you mentioned, there's a way that's more efficient in certain cases (assuming the words are represented by Strings). Suppose you want
longest :: [[a]] -> [a]
If you ask for maximumOn length [replicate (10^9) (), []], then you'll end up calculating the length of a very long list unnecessarily. There are several ways to work around this problem, but here's how I'd do it:
data MS a = MS
{ _longest :: [a]
, _longest_suffix :: [a]
, _longest_bound :: !Int }
We will ensure that longest is the first of the longest strings seen thus far, and that longest_bound + length longest_suffix = length longest.
step :: MS a -> [a] -> MS a
step (MS longest longest_suffix longest_bound) xs =
go longest_bound longest_suffix xs'
where
-- the new list is not longer
go n suffo [] = MS longest suffo n
-- the new list is longer
go n [] suffn = MS xs suffn n
-- don't know yet
go !n (_ : suffo) (_ : suffn) =
go (n + 1) suffo suffn
xs' = drop longest_bound xs
longest :: [[a]] -> [a]
longest = _longest . foldl' step (MS [] [] 0)
Now if the second to longest list has q elements, we'll walk at most q conses into each list. This is the best possible complexity. Of course, it's only significantly better than the maximumOn solution when the longest list is much longer than the second to longest.

Generalizing a combinatoric function?

I've been solving a few combinatoric problems on Haskell, so I wrote down those 2 functions:
permutations :: (Eq a) => [a] -> [[a]]
permutations [] = [[]]
permutations list = do
x <- list
xs <- permutations (filter (/= x) list)
return (x : xs)
combinations :: (Eq a, Ord a) => Int -> [a] -> [[a]]
combinations 0 _ = [[]]
combinations n list = do
x <- list
xs <- combinations (n-1) (filter (> x) list)
return (x : xs)
Which works as follows:
*Main> permutations [1,2,3]
[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
*Main> combinations 2 [1,2,3,4]
[[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]
Those were uncomfortably similar, so I had to abstract it. I wrote the following abstraction:
combinatoric next [] = [[]]
combinatoric next list = do
x <- list
xs <- combinatoric next (next x list)
return (x : xs)
Which receives a function that controls how to filter the elements of the list. It can be used to easily define permutations:
permutations :: (Eq a) => [a] -> [[a]]
permutations = combinatoric (\ x ls -> filter (/= x) ls)
But I couldn't define combinations this way since it carries an state (n). I could extend the combinatoric with an additional state argument, but that'd become too clunky and I remember such approach was not necessary in a somewhat similar situation. Thus, I wonder: is it possible to define combinations using combinatorics? If not, what is a better abstraction of combinatorics which successfully subsumes both functions?
This isn't a direct answer to your question (sorry), but I don't think your code is correct. The Eq and Ord constraints tipped me off - they shouldn't be necessary - so I wrote a couple of QuickCheck properties.
prop_numberOfPermutations xs = length (permutations xs) === factorial (length xs)
where _ = (xs :: [Int]) -- force xs to be instantiated to [Int]
prop_numberOfCombinations (Positive n) (NonEmpty xs) = n <= length xs ==>
length (combinations n xs) === choose (length xs) n
where _ = (xs :: [Int])
factorial :: Int -> Int
factorial x = foldr (*) 1 [1..x]
choose :: Int -> Int -> Int
choose n 0 = 1
choose 0 r = 0
choose n r = choose (n-1) (r-1) * n `div` r
The first property checks that the number of permutations of a list of length n is n!. The second checks that the number of r-combinations of a list of length n is C(n, r). Both of these properties fail when I run them against your definitions:
ghci> quickCheck prop_numberOfPermutations
*** Failed! Falsifiable (after 5 tests and 4 shrinks):
[0,0,0]
3 /= 6
ghci> quickCheck prop_numberOfCombinations
*** Failed! Falsifiable (after 4 tests and 1 shrink):
Positive {getPositive = 2}
NonEmpty {getNonEmpty = [3,3]}
0 /= 1
It looks like your functions fail when the input list contains duplicate elements. Writing an abstraction for an incorrect implementation isn't a good idea - don't try and run before you can walk! You might find it helpful to read the source code for the standard library's definition of permutations, which does not have an Eq constraint.
First let's improve the original functions. You assume that all elements are distinct wrt their equality for permutations, and that they're distinct and have an ordering for combinations. These constraints aren't necessary and as described in the other answer, the code can produce wrong results. Following the robustness principle, let's accept just unconstrained lists. For this we'll need a helper function that produces all possible splits of a list:
split :: [a] -> [([a], a, [a])]
split = loop []
where
loop _ [] = []
loop rs (x:xs) = (rs, x, xs) : loop (x:rs) xs
Note that the implementation causes prefixes returned by this function to be reversed, but it's nothing we require.
This allows us to write generic permutations and combinations.
permutations :: [a] -> [[a]]
permutations [] = [[]]
permutations list = do
(pre, x, post) <- split list
-- reversing 'pre' isn't really necessary, but makes the output
-- order natural
xs <- permutations (reverse pre ++ post)
return (x : xs)
combinations :: Int -> [a] -> [[a]]
combinations 0 _ = [[]]
combinations n list = do
(_, x, post) <- split list
xs <- combinations (n-1) post
return (x : xs)
Now what they have in common:
At each step they pick an element to output,
update the list of elements to pick from and
stop after some condition is met.
The last point is a bit problematic, as for permutations we end once the list to choose from is empty, while for combinations we have a counter. This is probably the reason why it was difficult to generalize. We can work around this by realizing that for permutations the number of steps is equal to the length of the input list, so we can express the condition in the number of repetitions.
For such problems it's often very convenient to express them using StateT s [] monad, where s is the state we're working with. In our case it'll be the list of elements to choose from. The core of our combinatorial functions can be then expressed with StateT [a] [] a: pick an element from the state and update the state for the next step. Since the stateful computations all happen in the [] monad, we automatically branch all possibilities. With that, we can define a generic function:
import Control.Monad.State
combinatoric :: Int -> StateT [a] [] b -> [a] -> [[b]]
combinatoric n k = evalStateT $ replicateM n k
And then define permutations and combinations by specifying the appropriate number of repetitions and what's the core StateT [a] [] a function:
permutations' :: [a] -> [[a]]
permutations' xs = combinatoric (length xs) f xs
where
f = StateT $ map (\(pre, x, post) -> (x, reverse pre ++ post)) . split
combinations' :: Int -> [a] -> [[a]]
combinations' n xs = combinatoric n f xs
where
f = StateT $ map (\(_, x, post) -> (x, post)) . split

Speeding up a stream like data type

I've made a type which is supposed to emulate a "stream". This is basically a list without memory.
data Stream a = forall s. Stream (s -> Maybe (a, s)) s
Basically a stream has two elements. A state s, and a function that takes the state, and returns an element of type a and the new state.
I want to be able to perform operations on streams, so I've imported Data.Foldable and defined streams on it as such:
import Data.Foldable
instance Foldable Stream where
foldr k z (Stream sf s) = go (sf s)
where
go Nothing = z
go (Just (e, ns)) = e `k` go (sf ns)
To test the speed of my stream, I've defined the following function:
mysum = foldl' (+) 0
And now we can compare the speed of ordinary lists and my stream type:
x1 = [1..n]
x2 = Stream (\s -> if (s == n + 1) then Nothing else Just (s, s + 1)) 1
--main = print $ mysum x1
--main = print $ mysum x2
My streams are about half the speed of lists (full code here).
Furthermore, here's a best case situation, without a list or a stream:
bestcase :: Int
bestcase = go 1 0 where
go i c = if i == n then c + i else go (i+1) (c+i)
This is a lot faster than both the list and stream versions.
So I've got two questions:
How to I get my stream version to be at least as fast as a list.
How to I get my stream version to be close to the speed of bestcase.
As it stands the foldl' you are getting from Foldable is defined in terms of the foldr you gave it. The default implementation is the brilliant and surprisingly good
foldl' :: (b -> a -> b) -> b -> t a -> b
foldl' f z0 xs = foldr f' id xs z0
where f' x k z = k $! f z x
But foldl' is the specialty of your type; fortunately the Foldable class includes foldl' as a method, so you can just add this to your instance.
foldl' op acc0 (Stream sf s0) = loop s0 acc0
where
loop !s !acc = case sf s of
Nothing -> acc
Just (a,s') -> loop s' (op acc a)
For me this seems to give about the same time as bestcase
Note that this is a standard case where we need a strictness annotation on the accumulator. You might look in the vector package's treatment of a similar type https://hackage.haskell.org/package/vector-0.10.12.2/docs/src/Data-Vector-Fusion-Stream.html for some ideas; or in the hidden 'fusion' modules of the text library https://github.com/bos/text/blob/master/Data/Text/Internal/Fusion .

Optimizing a sudoku solver on Haskell

I have written a sudoku solver in Haskell. It goes through a list and when it finds '0' (an empty cell) it will get the numbers that could fit and try them:
import Data.List (group, (\\), sort)
import Data.Maybe (fromMaybe)
row :: Int -> [Int] -> [Int]
row y grid = foldl (\acc x -> (grid !! x):acc) [] [y*9 .. y*9+8]
where y' = y*9
column :: Int -> [Int] -> [Int]
column x grid = foldl (\acc n -> (grid !! n):acc) [] [x,x+9..80]
box :: Int -> Int -> [Int] -> [Int]
box x y grid = foldl (\acc n -> (grid !! n):acc) [] [x+y*9*3+y' | y' <- [0,9,18], x <- [x'..x'+2]]
where x' = x*3
isValid :: [Int] -> Bool
isValid grid = and [isValidRow, isValidCol, isValidBox]
where isValidRow = isValidDiv row
isValidCol = isValidDiv column
isValidBox = and $ foldl (\acc (x,y) -> isValidList (box x y grid):acc) [] [(x,y) | x <- [0..2], y <- [0..2]]
isValidDiv f = and $ foldl (\acc x -> isValidList (f x grid):acc) [] [0..8]
isValidList = all (\x -> length x <= 1) . tail . group . sort -- tail removes entries that are '0'
isComplete :: [Int] -> Bool
isComplete grid = length (filter (== 0) grid) == 0
solve :: Maybe [Int] -> Maybe [Int]
solve grid' = foldl f Nothing [0..80]
where grid = fromMaybe [] grid'
f acc x
| isValid grid = if isComplete grid then grid' else f' acc x
| otherwise = acc
f' acc x
| (grid !! x) == 0 = case guess x grid of
Nothing -> acc
Just x -> Just x
| otherwise = acc
guess :: Int -> [Int] -> Maybe [Int]
guess x grid
| length valid /= 0 = foldl f Nothing valid
| otherwise = Nothing
where valid = [1..9] \\ (row rowN grid ++ column colN grid ++ box (fst boxN) (snd boxN) grid) -- remove numbers already used in row/collumn/box
rowN = x `div` 9 -- e.g. 0/9=0 75/9=8
colN = x - (rowN * 9) -- e.g. 0-0=0 75-72=3
boxN = (colN `div` 3, rowN `div` 3)
before x = take x grid
after x = drop (x+1) grid
f acc y = case solve $ Just $ before x ++ [y] ++ after x of
Nothing -> acc
Just x -> Just x
For some puzzles this works, for example this one:
sudoku :: [Int]
sudoku = [5,3,0,6,7,8,0,1,2,
6,7,0,0,0,0,3,4,8,
0,0,8,0,0,0,5,0,7,
8,0,0,0,0,1,0,0,3,
4,2,6,0,0,3,7,9,0,
7,0,0,9,0,0,0,5,0,
9,0,0,5,0,7,0,0,0,
2,8,7,4,1,9,6,0,5,
3,0,0,2,8,0,1,0,0]
Took under a second, however this one:
sudoku :: [Int]
sudoku = [5,3,0,0,7,0,0,1,2,
6,7,0,0,0,0,3,4,8,
0,0,0,0,0,0,5,0,7,
8,0,0,0,0,1,0,0,3,
4,2,6,0,0,3,7,9,0,
7,0,0,9,0,0,0,5,0,
9,0,0,5,0,7,0,0,0,
2,8,7,4,1,9,6,0,5,
3,0,0,2,8,0,1,0,0]
I have not seen finish. I don't think this is a problem with the method, as it does return correct results.
Profiling showed that most of the time was spent in the "isValid" function. Is there something obviously inefficient/slow about that function?
The implementation is of course improvable, but that's not the problem. The problem is that for the second grid, the simple guess-and-check algorithm needs a lot of backtracking. Even if you speed up each of your functions 1000-fold, there will be grids where it still needs several times the age of the universe to find the (first, if the grid is not unique) solution.
You need a better algorithm to avoid that. A fairly efficient method to avoid such cases is to guess the square with the least number of possibilities first. That doesn't avoid all bad cases, but reduces them much.
One thing that you should also do is replace the length thing == 0 check with null thing. With the relatively short lists occurring here, the effect is limited, but in general it can be dramatic (and in general you should also not use length list <= 1, use null $ drop 1 list instead).
isValidList = all (\x -> length x <= 1) . tail . group . sort -- tail removes entries that are '0'
If the original list does not contain any zeros, tail will remove something else, perhaps a list of two ones. I'd replace tail . group. sort with group . sort . filter (/= 0).
I don't understand why isValidBox and isValidDiv use foldl as map appears to be adequate. Have I missed something / are they doing something terribly clever?

Resources