How do I find a factorial? [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
How can I write a program to find the factorial of any natural number?

This will work for the factorial (although a very small subset) of positive integers:
unsigned long factorial(unsigned long f)
{
if ( f == 0 )
return 1;
return(f * factorial(f - 1));
}
printf("%i", factorial(5));
Due to the nature of your problem (and level that you have admitted), this solution is based more in the concept of solving this rather than a function that will be used in the next "Permutation Engine".

This calculates factorials of non-negative integers[*] up to ULONG_MAX, which will have so many digits that it's unlikely your machine can store a whole lot more, even if it has time to calculate them. Uses the GNU multiple precision library, which you need to link against.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>
void factorial(mpz_t result, unsigned long input) {
mpz_set_ui(result, 1);
while (input > 1) {
mpz_mul_ui(result, result, input--);
}
}
int main() {
mpz_t fact;
unsigned long input = 0;
char *buf;
mpz_init(fact);
scanf("%lu", &input);
factorial(fact, input);
buf = malloc(mpz_sizeinbase(fact, 10) + 1);
assert(buf);
mpz_get_str(buf, 10, fact);
printf("%s\n", buf);
free(buf);
mpz_clear(fact);
}
Example output:
$ make factorial CFLAGS="-L/bin/ -lcyggmp-3 -pedantic" -B && ./factorial
cc -L/bin/ -lcyggmp-3 -pedantic factorial.c -o factorial
100
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
[*] If you mean something else by "number" then you'll have to be more specific. I'm not aware of any other numbers for which the factorial is defined, despite valiant efforts by Pascal to extend the domain by use of the Gamma function.

Why do it in C when you can do it in Haskell:
Freshman Haskell programmer
fac n = if n == 0
then 1
else n * fac (n-1)
Sophomore Haskell programmer, at MIT (studied Scheme as a freshman)
fac = (\(n) ->
(if ((==) n 0)
then 1
else ((*) n (fac ((-) n 1)))))
Junior Haskell programmer (beginning Peano player)
fac 0 = 1
fac (n+1) = (n+1) * fac n
Another junior Haskell programmer (read that n+k patterns are “a
disgusting part of Haskell” 1 and joined the “Ban n+k
patterns”-movement [2])
fac 0 = 1
fac n = n * fac (n-1)
Senior Haskell programmer (voted for Nixon Buchanan Bush —
“leans right”)
fac n = foldr (*) 1 [1..n]
Another senior Haskell programmer (voted for McGovern Biafra
Nader — “leans left”)
fac n = foldl (*) 1 [1..n]
Yet another senior Haskell programmer (leaned so far right he came
back left again!)
-- using foldr to simulate foldl
fac n = foldr (\x g n -> g (x*n)) id [1..n] 1
Memoizing Haskell programmer (takes Ginkgo Biloba daily)
facs = scanl (*) 1 [1..]
fac n = facs !! n
Pointless (ahem) “Points-free” Haskell programmer (studied at
Oxford)
fac = foldr (*) 1 . enumFromTo 1
Iterative Haskell programmer (former Pascal programmer)
fac n = result (for init next done)
where init = (0,1)
next (i,m) = (i+1, m * (i+1))
done (i,_) = i==n
result (_,m) = m
for i n d = until d n i
Iterative one-liner Haskell programmer (former APL and C programmer)
fac n = snd (until ((>n) . fst) (\(i,m) -> (i+1, i*m)) (1,1))
Accumulating Haskell programmer (building up to a quick climax)
facAcc a 0 = a
facAcc a n = facAcc (n*a) (n-1)
fac = facAcc 1
Continuation-passing Haskell programmer (raised RABBITS in early
years, then moved to New Jersey)
facCps k 0 = k 1
facCps k n = facCps (k . (n *)) (n-1)
fac = facCps id
Boy Scout Haskell programmer (likes tying knots; always “reverent,”
he belongs to the Church of the Least Fixed-Point [8])
y f = f (y f)
fac = y (\f n -> if (n==0) then 1 else n * f (n-1))
Combinatory Haskell programmer (eschews variables, if not
obfuscation; all this currying’s just a phase, though it seldom
hinders)
s f g x = f x (g x)
k x y = x
b f g x = f (g x)
c f g x = f x g
y f = f (y f)
cond p f g x = if p x then f x else g x
fac = y (b (cond ((==) 0) (k 1)) (b (s (*)) (c b pred)))
List-encoding Haskell programmer (prefers to count in unary)
arb = () -- "undefined" is also a good RHS, as is "arb" :)
listenc n = replicate n arb
listprj f = length . f . listenc
listprod xs ys = [ i (x,y) | x<-xs, y<-ys ]
where i _ = arb
facl [] = listenc 1
facl n#(_:pred) = listprod n (facl pred)
fac = listprj facl
Interpretive Haskell programmer (never “met a language” he didn't
like)
-- a dynamically-typed term language
data Term = Occ Var
| Use Prim
| Lit Integer
| App Term Term
| Abs Var Term
| Rec Var Term
type Var = String
type Prim = String
-- a domain of values, including functions
data Value = Num Integer
| Bool Bool
| Fun (Value -> Value)
instance Show Value where
show (Num n) = show n
show (Bool b) = show b
show (Fun _) = ""
prjFun (Fun f) = f
prjFun _ = error "bad function value"
prjNum (Num n) = n
prjNum _ = error "bad numeric value"
prjBool (Bool b) = b
prjBool _ = error "bad boolean value"
binOp inj f = Fun (\i -> (Fun (\j -> inj (f (prjNum i) (prjNum j)))))
-- environments mapping variables to values
type Env = [(Var, Value)]
getval x env = case lookup x env of
Just v -> v
Nothing -> error ("no value for " ++ x)
-- an environment-based evaluation function
eval env (Occ x) = getval x env
eval env (Use c) = getval c prims
eval env (Lit k) = Num k
eval env (App m n) = prjFun (eval env m) (eval env n)
eval env (Abs x m) = Fun (\v -> eval ((x,v) : env) m)
eval env (Rec x m) = f where f = eval ((x,f) : env) m
-- a (fixed) "environment" of language primitives
times = binOp Num (*)
minus = binOp Num (-)
equal = binOp Bool (==)
cond = Fun (\b -> Fun (\x -> Fun (\y -> if (prjBool b) then x else y)))
prims = [ ("*", times), ("-", minus), ("==", equal), ("if", cond) ]
-- a term representing factorial and a "wrapper" for evaluation
facTerm = Rec "f" (Abs "n"
(App (App (App (Use "if")
(App (App (Use "==") (Occ "n")) (Lit 0))) (Lit 1))
(App (App (Use "*") (Occ "n"))
(App (Occ "f")
(App (App (Use "-") (Occ "n")) (Lit 1))))))
fac n = prjNum (eval [] (App facTerm (Lit n)))
Static Haskell programmer (he does it with class, he’s got that
fundep Jones! After Thomas Hallgren’s “Fun with Functional
Dependencies” [7])
-- static Peano constructors and numerals
data Zero
data Succ n
type One = Succ Zero
type Two = Succ One
type Three = Succ Two
type Four = Succ Three
-- dynamic representatives for static Peanos
zero = undefined :: Zero
one = undefined :: One
two = undefined :: Two
three = undefined :: Three
four = undefined :: Four
-- addition, a la Prolog
class Add a b c | a b -> c where
add :: a -> b -> c
instance Add Zero b b
instance Add a b c => Add (Succ a) b (Succ c)
-- multiplication, a la Prolog
class Mul a b c | a b -> c where
mul :: a -> b -> c
instance Mul Zero b Zero
instance (Mul a b c, Add b c d) => Mul (Succ a) b d
-- factorial, a la Prolog
class Fac a b | a -> b where
fac :: a -> b
instance Fac Zero One
instance (Fac n k, Mul (Succ n) k m) => Fac (Succ n) m
-- try, for "instance" (sorry):
--
-- :t fac four
Beginning graduate Haskell programmer (graduate education tends to
liberate one from petty concerns about, e.g., the efficiency of
hardware-based integers)
-- the natural numbers, a la Peano
data Nat = Zero | Succ Nat
-- iteration and some applications
iter z s Zero = z
iter z s (Succ n) = s (iter z s n)
plus n = iter n Succ
mult n = iter Zero (plus n)
-- primitive recursion
primrec z s Zero = z
primrec z s (Succ n) = s n (primrec z s n)
-- two versions of factorial
fac = snd . iter (one, one) (\(a,b) -> (Succ a, mult a b))
fac' = primrec one (mult . Succ)
-- for convenience and testing (try e.g. "fac five")
int = iter 0 (1+)
instance Show Nat where
show = show . int
(zero : one : two : three : four : five : _) = iterate Succ Zero
Origamist Haskell programmer
(always starts out with the “basic Bird fold”)
-- (curried, list) fold and an application
fold c n [] = n
fold c n (x:xs) = c x (fold c n xs)
prod = fold (*) 1
-- (curried, boolean-based, list) unfold and an application
unfold p f g x =
if p x
then []
else f x : unfold p f g (g x)
downfrom = unfold (==0) id pred
-- hylomorphisms, as-is or "unfolded" (ouch! sorry ...)
refold c n p f g = fold c n . unfold p f g
refold' c n p f g x =
if p x
then n
else c (f x) (refold' c n p f g (g x))
-- several versions of factorial, all (extensionally) equivalent
fac = prod . downfrom
fac' = refold (*) 1 (==0) id pred
fac'' = refold' (*) 1 (==0) id pred
Cartesianally-inclined Haskell programmer (prefers Greek food,
avoids the spicy Indian stuff; inspired by Lex Augusteijn’s “Sorting
Morphisms” [3])
-- (product-based, list) catamorphisms and an application
cata (n,c) [] = n
cata (n,c) (x:xs) = c (x, cata (n,c) xs)
mult = uncurry (*)
prod = cata (1, mult)
-- (co-product-based, list) anamorphisms and an application
ana f = either (const []) (cons . pair (id, ana f)) . f
cons = uncurry (:)
downfrom = ana uncount
uncount 0 = Left ()
uncount n = Right (n, n-1)
-- two variations on list hylomorphisms
hylo f g = cata g . ana f
hylo' f (n,c) = either (const n) (c . pair (id, hylo' f (c,n))) . f
pair (f,g) (x,y) = (f x, g y)
-- several versions of factorial, all (extensionally) equivalent
fac = prod . downfrom
fac' = hylo uncount (1, mult)
fac'' = hylo' uncount (1, mult)
Ph.D. Haskell programmer (ate so many bananas that his eyes bugged
out, now he needs new lenses!)
-- explicit type recursion based on functors
newtype Mu f = Mu (f (Mu f)) deriving Show
in x = Mu x
out (Mu x) = x
-- cata- and ana-morphisms, now for *arbitrary* (regular) base functors
cata phi = phi . fmap (cata phi) . out
ana psi = in . fmap (ana psi) . psi
-- base functor and data type for natural numbers,
-- using a curried elimination operator
data N b = Zero | Succ b deriving Show
instance Functor N where
fmap f = nelim Zero (Succ . f)
nelim z s Zero = z
nelim z s (Succ n) = s n
type Nat = Mu N
-- conversion to internal numbers, conveniences and applications
int = cata (nelim 0 (1+))
instance Show Nat where
show = show . int
zero = in Zero
suck = in . Succ -- pardon my "French" (Prelude conflict)
plus n = cata (nelim n suck )
mult n = cata (nelim zero (plus n))
-- base functor and data type for lists
data L a b = Nil | Cons a b deriving Show
instance Functor (L a) where
fmap f = lelim Nil (\a b -> Cons a (f b))
lelim n c Nil = n
lelim n c (Cons a b) = c a b
type List a = Mu (L a)
-- conversion to internal lists, conveniences and applications
list = cata (lelim [] (:))
instance Show a => Show (List a) where
show = show . list
prod = cata (lelim (suck zero) mult)
upto = ana (nelim Nil (diag (Cons . suck)) . out)
diag f x = f x x
fac = prod . upto
Post-doc Haskell programmer
(from Uustalu, Vene and Pardo’s “Recursion Schemes from Comonads” [4])
-- explicit type recursion with functors and catamorphisms
newtype Mu f = In (f (Mu f))
unIn (In x) = x
cata phi = phi . fmap (cata phi) . unIn
-- base functor and data type for natural numbers,
-- using locally-defined "eliminators"
data N c = Z | S c
instance Functor N where
fmap g Z = Z
fmap g (S x) = S (g x)
type Nat = Mu N
zero = In Z
suck n = In (S n)
add m = cata phi where
phi Z = m
phi (S f) = suck f
mult m = cata phi where
phi Z = zero
phi (S f) = add m f
-- explicit products and their functorial action
data Prod e c = Pair c e
outl (Pair x y) = x
outr (Pair x y) = y
fork f g x = Pair (f x) (g x)
instance Functor (Prod e) where
fmap g = fork (g . outl) outr
-- comonads, the categorical "opposite" of monads
class Functor n => Comonad n where
extr :: n a -> a
dupl :: n a -> n (n a)
instance Comonad (Prod e) where
extr = outl
dupl = fork id outr
-- generalized catamorphisms, zygomorphisms and paramorphisms
gcata :: (Functor f, Comonad n) =>
(forall a. f (n a) -> n (f a))
-> (f (n c) -> c) -> Mu f -> c
gcata dist phi = extr . cata (fmap phi . dist . fmap dupl)
zygo chi = gcata (fork (fmap outl) (chi . fmap outr))
para :: Functor f => (f (Prod (Mu f) c) -> c) -> Mu f -> c
para = zygo In
-- factorial, the *hard* way!
fac = para phi where
phi Z = suck zero
phi (S (Pair f n)) = mult f (suck n)
-- for convenience and testing
int = cata phi where
phi Z = 0
phi (S f) = 1 + f
instance Show (Mu N) where
show = show . int
Tenured professor (teaching Haskell to freshmen)
fac n = product [1..n]
Content from The Evolution of a Haskell Programmer by Fritz Ruehr, Willamette University - 11 July 01

Thanks to Christoph, a C99 solution that works for quite a few "numbers":
#include <math.h>
#include <stdio.h>
double fact(double x)
{
return tgamma(x+1.);
}
int main()
{
printf("%f %f\n", fact(3.0), fact(5.0));
return 0;
}
produces 6.000000 120.000000

For large n you may run into some issues and you may want to use Stirling's approximation:
Which is:

If your main objective is an interesting looking function:
int facorial(int a) {
int b = 1, c, d, e;
a--;
for (c = a; c > 0; c--)
for (d = b; d > 0; d--)
for (e = c; e > 0; e--)
b++;
return b;
}
(Not recommended as an algorithm for real use.)

a tail-recursive version:
long factorial(long n)
{
return tr_fact(n, 1);
}
static long tr_fact(long n, long result)
{
if(n==1)
return result;
else
return tr_fact(n-1, n*result);
}

In C99 (or Java) I would write the factorial function iteratively like this:
int factorial(int n)
{
int result = 1;
for (int i = 2; i <= n; i++)
{
result *= i;
}
return result;
}
C is not a functional language and you can't rely on tail-call optimization. So don't use recursion in C (or Java) unless you need to.
Just because factorial is often used as the first example for recursion it doesn't mean you need recursion to compute it.
This will overflow silently if n is too big, as is the custom in C (and Java).
If the numbers int can represent are too small for the factorials you want to compute then choose another number type. long long if it needs be just a little bit bigger, float or double if n isn't too big and you don't mind some imprecision, or big integers if you want the exact values of really big factorials.

Here's a C program that uses OPENSSL's BIGNUM implementation, and therefore is not particularly useful for students. (Of course accepting a BIGNUM as the input parameter is crazy, but helpful for demonstrating interaction between BIGNUMs).
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <openssl/crypto.h>
#include <openssl/bn.h>
BIGNUM *factorial(const BIGNUM *num)
{
BIGNUM *count = BN_new();
BIGNUM *fact = NULL;
BN_CTX *ctx = NULL;
BN_one(count);
if( BN_cmp(num, BN_value_one()) <= 0 )
{
return count;
}
ctx = BN_CTX_new();
fact = BN_dup(num);
BN_sub(count, fact, BN_value_one());
while( BN_cmp(count, BN_value_one()) > 0 )
{
BN_mul(fact, count, fact, ctx);
BN_sub(count, count, BN_value_one());
}
BN_CTX_free(ctx);
BN_free(count);
return fact;
}
This test program shows how to create a number for input and what to do with the return value:
int main(int argc, char *argv[])
{
const char *test_cases[] =
{
"0", "1",
"1", "1",
"4", "24",
"15", "1307674368000",
"30", "265252859812191058636308480000000",
"56", "710998587804863451854045647463724949736497978881168458687447040000000000000",
NULL, NULL
};
int index = 0;
BIGNUM *bn = NULL;
BIGNUM *fact = NULL;
char *result_str = NULL;
for( index = 0; test_cases[index] != NULL; index += 2 )
{
BN_dec2bn(&bn, test_cases[index]);
fact = factorial(bn);
result_str = BN_bn2dec(fact);
printf("%3s: %s\n", test_cases[index], result_str);
assert(strcmp(result_str, test_cases[index + 1]) == 0);
OPENSSL_free(result_str);
BN_free(fact);
BN_free(bn);
bn = NULL;
}
return 0;
}
Compiled with gcc:
gcc factorial.c -o factorial -g -lcrypto

int factorial(int n){
return n <= 1 ? 1 : n * factorial(n-1);
}

You use the following code to do it.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int x, number, fac;
fac = 1;
printf("Enter a number:\n");
scanf("%d",&number);
if(number<0)
{
printf("Factorial not defined for negative numbers.\n");
exit(0);
}
for(x = 1; x <= number; x++)
{
if (number >= 0)
fac = fac * x;
else
fac=1;
}
printf("%d! = %d\n", number, fac);
}

For large numbers you probably can get away with an approximate solution, which tgamma gives you (n! = Gamma(n+1)) from math.h. If you want even larger numbers, they won't fit in a double, so you should use lgamma (natural log of the gamma function) instead.
If you're working somewhere without a full C99 math.h, you can easily do this type of thing yourself:
double logfactorial(int n) {
double fac = 0.0;
for ( ; n>1 ; n--) fac += log(fac);
return fac;
}

I don't think I'd use this in most cases, but one well-known practice which is becoming less widely used is to have a look-up table. If we're only working with built-in types, the memory hit is tiny.
Just another approach, to make the poster aware of a different technique. Many recursive solutions also can be memoized whereby a lookup table is filled in when the algorithm runs, drastically reducing the cost on future calls (kind of like the principle behind .NET JIT compilation I guess).

We have to start from 1 to the limit specfied say n.Start from 1*2*3...*n.
In c, i am writing it as a function.
main()
{
int n;
scanf("%d",&n);
printf("%ld",fact(n));
}
long int fact(int n)
{
long int facto=1;
int i;
for(i=1;i<=n;i++)
{
facto=facto*i;
}
return facto;
}

Simple solution:
unsigned int factorial(unsigned int n)
{
return (n == 1 || n == 0) ? 1 : factorial(n - 1) * n;
}

Simplest and most efficient is to sum up logarithms. If you use Log10 you get power and exponent.
Pseudocode
r=0
for i from 1 to n
r= r + log(i)/log(10)
print "result is:", 10^(r-floor(r)) ,"*10^" , floor(r)
You might need to add the code so the integer part does not increase too much and thus decrease accuracy, but result should be ok for even very large factorials.

Example in C using recursion
unsigned long factorial(unsigned long f)
{
if (f) return(f * factorial(f - 1));
return 1;
}
printf("%lu", factorial(5));

I used this code for Factorial:
#include<stdio.h>
int main(){
int i=1,f=1,n;
printf("\n\nEnter a number: ");
scanf("%d",&n);
while(i<=n){
f=f*i;
i++;
}
printf("Factorial of is: %d",f);
getch();
}

I would do this with a pre-calculated lookup table as suggested by Mr. Boy. This would be faster to calculate than an iterative or recursive solution. It relies on how fast n! grows, because the largest n! you can calculate without overflowing an unsigned long long (max value of 18,446,744,073,709,551,615) is only 20!, so you only need an array with 21 elements. Here's how it would look in c:
long long factorial (int n) {
long long f[22] = {1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800, 87178291200, 1307674368000, 20922789888000, 355687428096000, 6402373705728000, 121645100408832000, 2432902008176640000, 51090942171709440000};
return f[n];
}
See for yourself!

Related

Moving an ST computation to a sub-computation

I have the following code running in ST using random numbers:
module Main
import Data.Vect
import Data.Fin
import Control.ST
import Control.ST.Random
%default total
choose : (Monad m) => (rnd : Var) -> (n : Nat) -> Vect (n + k) a -> ST m (Vect n a, Vect k a) [rnd ::: Random]
choose rnd n xs = pure $ splitAt n xs -- Dummy implementation
partitionLen : (a -> Bool) -> Vect n a -> DPair (Nat, Nat) (\(m, k) => (m + k = n, Vect m a, Vect k a))
partitionLen p [] = ((0, 0) ** (Refl, [], []))
partitionLen p (x :: xs) = case partitionLen p xs of
((m, k) ** (prf, lefts, rights)) =>
if p x then
((S m, k) ** (cong prf, x::lefts, rights))
else
((m, S k) ** (trans (plusS m k) (cong prf), lefts, x::rights))
where
plusS : (x : Nat) -> (y : Nat) -> x + (S y) = S (x + y)
plusS Z y = Refl
plusS (S x) y = cong (plusS x y)
generate : (Monad m) => (rnd : Var) -> ST m (Vect 25 (Fin 25)) [rnd ::: Random]
generate rnd = do
(shared, nonshared) <- choose rnd 4 indices
(agents1, nonagents1) <- choose rnd 6 nonshared
(agents2, nonagents2) <- choose rnd 6 nonagents1
(assassins1, others1) <- choose rnd 2 nonagents1
case partitionLen (`elem` assassins1) nonagents2 of
((n, k) ** (prf, xs, ys)) => do
(assassins2, others2') <- choose rnd 2 (agents1 ++ xs)
let prf' = trans (sym $ plusAssociative 4 n k) $ cong {f = (+) 4} prf
let others2 = the (Vect 13 (Fin 25)) $ replace {P = \n => Vect n (Fin 25)} prf' (others2' ++ ys)
pure $ shared ++ agents2 ++ assassins2 ++ others2
where
indices : Vect 25 (Fin 25)
indices = fromList [0..24]
I'd like to refactor generate so that instead of the whole tail of the computation is under a case, I would compute (assassins2, others2) in a sub-computation, i.e. I would like to rewrite it as such:
(assassins2, others2) <- case partitionLen (`elem` assassins1) nonagents2 of
((n, k) ** (prf, xs, ys)) => do
(assassins2, others2') <- choose rnd 2 (agents1 ++ xs)
let prf' = trans (sym $ plusAssociative 4 n k) $ cong {f = (+) 4} prf
let others2 = replace {P = \n => Vect n (Fin 25)} prf' (others2' ++ ys)
pure (assassins2, others2)
pure $ shared ++ agents2 ++ assassins2 ++ others2
I believe this should be an equivalent transformation. However, this second version fails type-checking with:
When checking right hand side of Main.case block in case block in case block in case block in case block in generate with expected type
STrans m
(Vect 25 (Fin 25))
(st2_fn (assassins2, others2))
(\result => [rnd ::: State Integer])
When checking argument ys to function Data.Vect.++:
Type mismatch between
B (Type of others2)
and
Vect n (Fin 25) (Expected type)

Competitive programming using Haskell

I am currently trying to refresh my Haskell knowledge by solving some Hackerrank problems.
For example:
https://www.hackerrank.com/challenges/maximum-palindromes/problem
I've already implemented an imperative solution in C++ which got accepted for all test cases. Now I am trying to come up with a pure functional solution in (reasonably idiomatic) Haskell.
My current code is
module Main where
import Control.Monad
import qualified Data.ByteString.Char8 as C
import Data.Bits
import Data.List
import qualified Data.Map.Strict as Map
import qualified Data.IntMap.Strict as IntMap
import Debug.Trace
-- precompute factorials
compFactorials :: Int -> Int -> IntMap.IntMap Int
compFactorials n m = go 0 1 IntMap.empty
where
go a acc map
| a < 0 = map
| a < n = go a' acc' map'
| otherwise = map'
where
map' = IntMap.insert a acc map
a' = a + 1
acc' = (acc * a') `mod` m
-- precompute invs
compInvs :: Int -> Int -> IntMap.IntMap Int -> IntMap.IntMap Int
compInvs n m facts = go 0 IntMap.empty
where
go a map
| a < 0 = map
| a < n = go a' map'
| otherwise = map'
where
map' = IntMap.insert a v map
a' = a + 1
v = (modExp b (m-2) m) `mod` m
b = (IntMap.!) facts a
modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
where
go b e r
| (.&.) e 1 == 1 = go b' e' r'
| e > 0 = go b' e' r
| otherwise = r
where
r' = (r * b) `mod` m
b' = (b * b) `mod` m
e' = shift e (-1)
-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (IntMap.IntMap Int)
initFreqMap inp = go 1 map1 map2 inp
where
map1 = Map.fromList $ zip ['a'..'z'] $ repeat 0
map2 = Map.fromList $ zip ['a'..'z'] $ repeat IntMap.empty
go idx m1 m2 inp
| C.null inp = m2
| otherwise = go (idx+1) m1' m2' $ C.tail inp
where
m1' = Map.update (\v -> Just $ v+1) (C.head inp) m1
m2' = foldl' (\m w -> Map.update (\v -> liftM (\c -> IntMap.insert idx c v) $ Map.lookup w m1') w m)
m2 ['a'..'z']
query :: Int -> Int -> Int -> Map.Map Char (IntMap.IntMap Int)
-> IntMap.IntMap Int -> IntMap.IntMap Int -> Int
query l r m freqMap facts invs
| x > 1 = (x * y) `mod` m
| otherwise = y
where
calcCnt cs = cr - cl
where
cl = IntMap.findWithDefault 0 (l-1) cs
cr = IntMap.findWithDefault 0 r cs
f1 acc cs
| even cnt = acc
| otherwise = acc + 1
where
cnt = calcCnt cs
f2 (acc1,acc2) cs
| cnt < 2 = (acc1 ,acc2)
| otherwise = (acc1',acc2')
where
cnt = calcCnt cs
n = cnt `div` 2
acc1' = acc1 + n
r = choose acc1' n
acc2' = (acc2 * r) `mod` m
-- calc binomial coefficient using Fermat's little theorem
choose n k
| n < k = 0
| otherwise = (f1 * t) `mod` m
where
f1 = (IntMap.!) facts n
i1 = (IntMap.!) invs k
i2 = (IntMap.!) invs (n-k)
t = (i1 * i2) `mod` m
x = Map.foldl' f1 0 freqMap
y = snd $ Map.foldl' f2 (0,1) freqMap
main :: IO()
main = do
inp <- C.getLine
q <- readLn :: IO Int
let modulo = 1000000007
let facts = compFactorials (C.length inp) modulo
let invs = compInvs (C.length inp) modulo facts
let freqMap = initFreqMap inp
forM_ [1..q] $ \_ -> do
line <- getLine
let [s1, s2] = words line
let l = (read s1) :: Int
let r = (read s2) :: Int
let result = query l r modulo freqMap facts invs
putStrLn $ show result
It passes all small and medium test cases but I am getting timeout with large test cases.
The key to solve this problem is to precompute some stuff once at the beginning and use them to answer the individual queries efficiently.
Now, my main problem where I need help is:
The initital profiling shows that the lookup operation of the IntMap seems to be the main bottleneck. Is there better alternative to IntMap for memoization? Or should I look at Vector or Array, which I believe will lead to more "ugly" code.
Even in current state, the code doesn't look nice (by functional standards) and as verbose as my C++ solution. Any tips to make it more idiomatic? Other than IntMap usage for memoization, do you spot any other obvious problems which can lead to performance problems?
And is there any good sources, where I can learn how to use Haskell more effectively for competitive programming?
A sample large testcase, where the current code gets timeout:
input.txt
output.txt
For comparison my C++ solution:
#include <vector>
#include <iostream>
#define MOD 1000000007L
long mod_exp(long b, long e) {
long r = 1;
while (e > 0) {
if ((e & 1) == 1) {
r = (r * b) % MOD;
}
b = (b * b) % MOD;
e >>= 1;
}
return r;
}
long n_choose_k(int n, int k, const std::vector<long> &fact_map, const std::vector<long> &inv_map) {
if (n < k) {
return 0;
}
long l1 = fact_map[n];
long l2 = (inv_map[k] * inv_map[n-k]) % MOD;
return (l1 * l2) % MOD;
}
int main() {
std::string s;
int q;
std::cin >> s >> q;
std::vector<std::vector<long>> freq_map;
std::vector<long> fact_map(s.size()+1);
std::vector<long> inv_map(s.size()+1);
for (int i = 0; i < 26; i++) {
freq_map.emplace_back(std::vector<long>(s.size(), 0));
}
std::vector<long> acc_map(26, 0);
for (int i = 0; i < s.size(); i++) {
acc_map[s[i]-'a']++;
for (int j = 0; j < 26; j++) {
freq_map[j][i] = acc_map[j];
}
}
fact_map[0] = 1;
inv_map[0] = 1;
for (int i = 1; i <= s.size(); i++) {
fact_map[i] = (i * fact_map[i-1]) % MOD;
inv_map[i] = mod_exp(fact_map[i], MOD-2) % MOD;
}
while (q--) {
int l, r;
std::cin >> l >> r;
std::vector<long> x(26, 0);
long t = 0;
long acc = 0;
long result = 1;
for (int i = 0; i < 26; i++) {
auto cnt = freq_map[i][r-1] - (l > 1 ? freq_map[i][l-2] : 0);
if (cnt % 2 != 0) {
t++;
}
long n = cnt / 2;
if (n > 0) {
acc += n;
result *= n_choose_k(acc, n, fact_map, inv_map);
result = result % MOD;
}
}
if (t > 0) {
result *= t;
result = result % MOD;
}
std::cout << result << std::endl;
}
}
UPDATE:
DanielWagner's answer has confirmed my suspicion that the main problem in my code was the usage of IntMap for memoization. Replacing IntMap with Array made my code perform similar to DanielWagner's solution.
module Main where
import Control.Monad
import Data.Array (Array)
import qualified Data.Array as A
import qualified Data.ByteString.Char8 as C
import Data.Bits
import Data.List
import Debug.Trace
-- precompute factorials
compFactorials :: Int -> Int -> Array Int Int
compFactorials n m = A.listArray (0,n) $ scanl' f 1 [1..n]
where
f acc a = (acc * a) `mod` m
-- precompute invs
compInvs :: Int -> Int -> Array Int Int -> Array Int Int
compInvs n m facts = A.listArray (0,n) $ map f [0..n]
where
f a = (modExp ((A.!) facts a) (m-2) m) `mod` m
modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
where
go b e r
| (.&.) e 1 == 1 = go b' e' r'
| e > 0 = go b' e' r
| otherwise = r
where
r' = (r * b) `mod` m
b' = (b * b) `mod` m
e' = shift e (-1)
-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (Array Int Int)
initFreqMap inp = Map.fromList $ map f ['a'..'z']
where
n = C.length inp
f c = (c, A.listArray (0,n) $ scanl' g 0 [0..n-1])
where
g x j
| C.index inp j == c = x+1
| otherwise = x
query :: Int -> Int -> Int -> Map.Map Char (Array Int Int)
-> Array Int Int -> Array Int Int -> Int
query l r m freqMap facts invs
| x > 1 = (x * y) `mod` m
| otherwise = y
where
calcCnt freqMap = cr - cl
where
cl = (A.!) freqMap (l-1)
cr = (A.!) freqMap r
f1 acc cs
| even cnt = acc
| otherwise = acc + 1
where
cnt = calcCnt cs
f2 (acc1,acc2) cs
| cnt < 2 = (acc1 ,acc2)
| otherwise = (acc1',acc2')
where
cnt = calcCnt cs
n = cnt `div` 2
acc1' = acc1 + n
r = choose acc1' n
acc2' = (acc2 * r) `mod` m
-- calc binomial coefficient using Fermat's little theorem
choose n k
| n < k = 0
| otherwise = (f1 * t) `mod` m
where
f1 = (A.!) facts n
i1 = (A.!) invs k
i2 = (A.!) invs (n-k)
t = (i1 * i2) `mod` m
x = Map.foldl' f1 0 freqMap
y = snd $ Map.foldl' f2 (0,1) freqMap
main :: IO()
main = do
inp <- C.getLine
q <- readLn :: IO Int
let modulo = 1000000007
let facts = compFactorials (C.length inp) modulo
let invs = compInvs (C.length inp) modulo facts
let freqMap = initFreqMap inp
replicateM_ q $ do
line <- getLine
let [s1, s2] = words line
let l = (read s1) :: Int
let r = (read s2) :: Int
let result = query l r modulo freqMap facts invs
putStrLn $ show result
I think you've shot yourself in the foot by trying to be too clever. Below I'll show a straightforward implementation of a slightly different algorithm that is about 5x faster than your Haskell code.
Here's the core combinatoric computation. Given a character frequency count for a substring, we can compute the number of maximum-length palindromes this way:
Divide all the frequencies by two, rounding down; call this the div2-frequencies. We'll also want the mod2-frequencies, which is the set of letters for which we had to round down.
Sum the div2-frequencies to get the total length of the palindrome prefix; its factorial gives an overcount of the number of possible prefixes for the palindrome.
Take the product of the factorials of the div2-frequencies. This tells the factor by which we overcounted above.
Take the size of the mod2-frequencies, or choose 1 if there are none. We can extend any of the palindrome prefixes by one of the values in this set, if there are any, so we have to multiply by this size.
For the overcounting step, it's not super obvious to me whether it would be faster to store precomputed inverses for factorials, and take their product, or whether it's faster to just take the product of all the factorials and do one inverse operation at the very end. I'll do the latter, because it just intuitively seems faster to do one inversion per query than one lookup per repeated letter, but what do I know? Should be easy to test if you want to try to adapt the code yourself.
There's only one other quick insight I had vs. your code, which is that we can cache the frequency counts for prefixes of the input; then computing the frequency count for a substring is just pointwise subtraction of two cached counts. Your precomputation on the input I find to be a bit excessive in comparison.
Without further ado, let's see some code. As usual there's some preamble.
module Main where
import Control.Monad
import Data.Array (Array)
import qualified Data.Array as A
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as M
import Data.Monoid
Like you, I want to do all my computations on cheap Ints and bake in the modular operations where possible. I'll make a newtype to make sure this happens for me.
newtype Mod1000000007 = Mod Int deriving (Eq, Ord)
instance Num Mod1000000007 where
fromInteger = Mod . (`mod` 1000000007) . fromInteger
Mod l + Mod r = Mod ((l+r) `rem` 1000000007)
Mod l * Mod r = Mod ((l*r) `rem` 1000000007)
negate (Mod v) = Mod ((1000000007 - v) `rem` 1000000007)
abs = id
signum = id
instance Integral Mod1000000007 where
toInteger (Mod n) = toInteger n
quotRem a b = (a * b^1000000005, 0)
I baked in the base of 1000000007 in several places, but it's easy to generalize by giving Mod a phantom parameter and making a HasBase class to pick the base. Ask a fresh question if you're not sure how and are interested; I'll be happy to do a more thorough writeup. There's a few more instances for Mod that are basically uninteresting and primarily needed because of Haskell's wacko numeric class hierarchy:
instance Show Mod1000000007 where show (Mod n) = show n
instance Real Mod1000000007 where toRational (Mod n) = toRational n
instance Enum Mod1000000007 where
toEnum = Mod . (`mod` 1000000007)
fromEnum (Mod n) = n
Here's the precomputation we want to do for factorials...
type FactMap = Array Int Mod1000000007
factMap :: Int -> FactMap
factMap n = A.listArray (0,n) (scanl (*) 1 [1..])
...and for precomputing frequency maps for each prefix, plus getting a frequency map given a start and end point.
type FreqMap = Map Char Int
freqMaps :: String -> Array Int FreqMap
freqMaps s = go where
go = A.listArray (0, length s)
(M.empty : [M.insertWith (+) c 1 (go A.! i) | (i, c) <- zip [0..] s])
substringFreqMap :: Array Int FreqMap -> Int -> Int -> FreqMap
substringFreqMap maps l r = M.unionWith (-) (maps A.! r) (maps A.! (l-1))
Implementing the core computation described above is just a few lines of code, now that we have suitable Num and Integral instances for Mod1000000007:
palindromeCount :: FactMap -> FreqMap -> Mod1000000007
palindromeCount facts freqs
= toEnum (max 1 mod2Freqs)
* (facts A.! sum div2Freqs)
`div` product (map (facts A.!) div2Freqs)
where
(div2Freqs, Sum mod2Freqs) = foldMap (\n -> ([n `quot` 2], Sum (n `rem` 2))) freqs
Now we just need a short driver to read stuff and pass it around to the appropriate functions.
main :: IO ()
main = do
inp <- getLine
q <- readLn
let freqs = freqMaps inp
facts = factMap (length inp)
replicateM_ q $ do
[l,r] <- map read . words <$> getLine
print . palindromeCount facts $ substringFreqMap freqs l r
That's it. Notably I made no attempt to be fancy about bitwise operations and didn't do anything fancy with accumulators; everything is in what I would consider idiomatic purely-functional style. The final count is about half as much code that runs about 5x faster.
P.S. Just for fun, I replaced the last line with print (l+r :: Int)... and discovered that about half the time is spent in read. Ouch! Seems there's still plenty of low-hanging fruit if this isn't fast enough yet.

Performance of Conway's Game of Life using the Store comonad

I've written a simple implementation of Conway's Game of Life using the Store comonad (see code below). My problem is that the grid generation is getting visibly slower from the fifth iteration onwards. Is my issue related to the fact that I'm using the Store comonad? Or am I making a glaring mistake? As far as I could tell, other implementations, which are based on the Zipper comonad, are efficient.
import Control.Comonad
data Store s a = Store (s -> a) s
instance Functor (Store s) where
fmap f (Store g s) = Store (f . g) s
instance Comonad (Store s) where
extract (Store f a) = f a
duplicate (Store f s) = Store (Store f) s
type Pos = (Int, Int)
seed :: Store Pos Bool
seed = Store g (0, 0)
where
g ( 0, 1) = True
g ( 1, 0) = True
g (-1, -1) = True
g (-1, 0) = True
g (-1, 1) = True
g _ = False
neighbours8 :: [Pos]
neighbours8 = [(x, y) | x <- [-1..1], y <- [-1..1], (x, y) /= (0, 0)]
move :: Store Pos a -> Pos -> Store Pos a
move (Store f (x, y)) (dx, dy) = Store f (x + dx, y + dy)
count :: [Bool] -> Int
count = length . filter id
getNrAliveNeighs :: Store Pos Bool -> Int
getNrAliveNeighs s = count $ fmap (extract . move s) neighbours8
rule :: Store Pos Bool -> Bool
rule s = let n = getNrAliveNeighs s
in case (extract s) of
True -> 2 <= n && n <= 3
False -> n == 3
blockToStr :: [[Bool]] -> String
blockToStr = unlines . fmap (fmap f)
where
f True = '*'
f False = '.'
getBlock :: Int -> Store Pos a -> [[a]]
getBlock n store#(Store _ (x, y)) =
[[extract (move store (dx, dy)) | dy <- yrange] | dx <- xrange]
where
yrange = [(x - n)..(y + n)]
xrange = reverse yrange
example :: IO ()
example = putStrLn
$ unlines
$ take 7
$ fmap (blockToStr . getBlock 5)
$ iterate (extend rule) seed
The store comonad per se doesn't really store anything (except in an abstract sense that a function is a “container”), but has to compute it from scratch. That clearly gets very inefficient over a couple of iterations.
You can alleviate this with no change to your code though, if you just back up the s -> a function with some memoisation:
import Data.MemoTrie
instance HasTrie s => Functor (Store s) where
fmap f (Store g s) = Store (memo $ f . g) s
instance HasTrie s => Comonad (Store s) where
extract (Store f a) = f a
duplicate (Store f s) = Store (Store f) s
Haven't tested whether this really gives acceptable performance.
Incidentally, Edward Kmett had an explicitly-memoised version in an old version of the comonad-extras package, but it's gone now. I've recently looked if that still works (seems like it does, after adjusting dependencies).

Haskell Program Low Performance Compared with Perl

I am doing a Facebook Hackercup 2015 problem with Haskell and got stuck on this problem.
Input: Begins with an integer T, the number of questions. For each question, there is one line containing 3 space-separated integers:A, B, and K.
Output: For the ith question, print a line containing "Case #i: " followed by the number of integers in the inclusive range [A, B] with a primacity of K.
Primacity of a number X is the number of its prime factors. For example, the primacity of 12 is 2 (as it's divisible by primes 2 and 3), the primacity of 550 is 3 (as it's divisible by primes 2, 5, and 11), and the primacity of 7 is 1 (as the only prime it's divisible by is 7).
1 ≤ T ≤ 100 
2 ≤ A ≤ B ≤ 10^7 
1 ≤ K ≤ 10^9 
Here is my Haskell solution:
import System.IO
import Data.List
import Control.Monad
incEvery :: Int -> [(Int -> Int)]
incEvery n = (cycle ((replicate (n-1) id) ++ [(+ 1)]))
primes2 :: [Int]
primes2 = sieve 2 (replicate (10^7) 0)
where
sieve _ [] = []
sieve n (a:xs) = (a + (if a == 0 then 1 else 0))
: if a == 0 then
sieve (n+1) (zipWith ($) (incEvery n) xs)
else
sieve (n+1) xs
process :: (Int, Int, Int) -> Int
process (lo, hi, k) =
length . filter (\(a, b) -> a >= lo && a <= hi && b == k) . zip [2,3..] $ primes2
readIn :: [String] -> (Int, Int, Int)
readIn =
(\[x, y, z] -> (x, y, z)) . fmap (read::String->Int) . take 3
lib :: String -> String
lib xs = unlines . fmap (\(i, x) -> "Case #" ++ (show i) ++ ": " ++ x) . zip [1,2..]
. fmap parse . tail . lines $ xs
where
parse = (show . process . readIn . words)
main :: IO ()
main = interact lib
Here is my Perl solution:
use strict;
use warnings;
my $max = 10000010;
my #f = (0) x $max;
for my $i (2 .. $max) {
if($f[$i] == 0) {
$f[$i] = 1;
# print $i . "\n";
for my $j (2 .. ($max / $i)) {
$f[$i * $j] ++;
}
}
}
my $k = <STDIN>;
for my $i (1 .. $k) {
my $line = <STDIN>;
if($line) {
chomp $line;
my ($a, $b, $t) = split(' ', $line);
my $ans = 0;
for my $j ($a .. $b) {
if($f[$j] == $t) {
$ans ++;
}
}
print "Case #$i: " . $ans . "\n";
}
}
Though I am using the same sieving algorithm for both languages, the Haskell version is significantly slower than Perl version on 10^7 scale of data.
Basically the following Haskell function is slower than its Perl counterpart:
incEvery :: Int -> [(Int -> Int)]
incEvery n = (cycle ((replicate (n-1) id) ++ [(+ 1)]))
primes2 :: [Int]
primes2 = sieve 2 (replicate (10^7) 0)
where
sieve _ [] = []
sieve n (a:xs) = (a + (if a == 0 then 1 else 0))
: if a == 0 then
sieve (n+1) (zipWith ($) (incEvery n) xs)
else
sieve (n+1) xs
I think both recursion and (zipWith ($) (incEvery n) xs) are causing the problem. Any ideas?
There is absolutely no reason why you need to resort to imperative programming to gain performance. The unique thing about Haskell is you have to learn to think differently if you want to program in a purely functional manner. Exploit laziness to speed things up a bit:
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Applicative ( pure, (<$>) )
import Data.List ( nub )
import Data.Monoid ( (<>) )
isPrime :: (Integral i) => i -> Bool
isPrime n = isPrime_ n primes
where isPrime_ n (p:ps)
| p * p > n = True
| n `mod` p == 0 = False
| otherwise = isPrime_ n ps
primes :: (Integral i) => [i]
primes = 2 : filter isPrime [3,5..]
primeFactors :: (Integral i) => i -> [i]
primeFactors n = factors n primes
where factors n (x:xs)
| x * x > n = [n]
| n `mod` x == 0 = x : factors (n `div` x) (x:xs)
| otherwise = factors n xs
primacity :: (Integral i) => i -> Int
primacity = length . nub . primeFactors
user :: IO Int
user = do
xs <- getLine
let a :: Int = read . takeWhile (/=' ') . dropN 0 $ xs
let b :: Int = read . takeWhile (/=' ') . dropN 1 $ xs
let k :: Int = read . takeWhile (/=' ') . dropN 2 $ xs
let n = length . filter (== k) . fmap primacity $ [a..b]
pure n
where
dropN 0 = id
dropN n = dropN (pred n) . drop 1 . dropWhile (/= ' ')
printNTimes :: Int -> Int -> IO ()
printNTimes 0 _ = pure ()
printNTimes n total = do
ans <- user
putStr $ "Case #" <> show (total - n + 1) <> ": "
putStrLn $ show ans
printNTimes (pred n) total
main :: IO ()
main = do
n :: Int <- read <$> getLine
printNTimes n n
This is basically mutual recursion mixed with laziness. Might take a while to understand it, but I can guarantee that it's fast.
Yes, of course. You're effectively using two different algorithms. Your Haskell zipWith ($) (incEvery n) xs has to process every entry of your list, while your Perl for my $j (2 .. ($max / $i)) { $f[$i * $j] ++; } only has to process the entries it actually increments, which is a factor of $i faster. This is a prototypical example of a problem where mutable arrays are helpful: in Haskell you can use STUArray, for example.

Haskell performance of memoization implement of dynamic programming is poor [duplicate]

I'm trying to memoize the following function:
gridwalk x y
| x == 0 = 1
| y == 0 = 1
| otherwise = (gridwalk (x - 1) y) + (gridwalk x (y - 1))
Looking at this I came up with the following solution:
gw :: (Int -> Int -> Int) -> Int -> Int -> Int
gw f x y
| x == 0 = 1
| y == 0 = 1
| otherwise = (f (x - 1) y) + (f x (y - 1))
gwlist :: [Int]
gwlist = map (\i -> gw fastgw (i `mod` 20) (i `div` 20)) [0..]
fastgw :: Int -> Int -> Int
fastgw x y = gwlist !! (x + y * 20)
Which I then can call like this:
gw fastgw 20 20
Is there an easier, more concise and general way (notice how I had to hardcode the max grid dimensions in the gwlist function in order to convert from 2D to 1D space so I can access the memoizing list) to memoize functions with multiple parameters in Haskell?
You can use a list of lists to memoize the function result for both parameters:
memo :: (Int -> Int -> a) -> [[a]]
memo f = map (\x -> map (f x) [0..]) [0..]
gw :: Int -> Int -> Int
gw 0 _ = 1
gw _ 0 = 1
gw x y = (fastgw (x - 1) y) + (fastgw x (y - 1))
gwstore :: [[Int]]
gwstore = memo gw
fastgw :: Int -> Int -> Int
fastgw x y = gwstore !! x !! y
Use the data-memocombinators package from hackage. It provides easy to use memorization techniques and provides an easy and breve way to use them:
import Data.MemoCombinators (memo2,integral)
gridwalk = memo2 integral integral gridwalk' where
gridwalk' x y
| x == 0 = 1
| y == 0 = 1
| otherwise = (gridwalk (x - 1) y) + (gridwalk x (y - 1))
Here is a version using Data.MemoTrie from the MemoTrie package to memoize the function:
import Data.MemoTrie(memo2)
gridwalk :: Int -> Int -> Int
gridwalk = memo2 gw
where
gw 0 _ = 1
gw _ 0 = 1
gw x y = gridwalk (x - 1) y + gridwalk x (y - 1)
If you want maximum generality, you can memoize a memoizing function.
memo :: (Num a, Enum a) => (a -> b) -> [b]
memo f = map f (enumFrom 0)
gwvals = fmap memo (memo gw)
fastgw :: Int -> Int -> Int
fastgw x y = gwvals !! x !! y
This technique will work with functions that have any number of arguments.
Edit: thanks to Philip K. for pointing out a bug in the original code. Originally memo had a "Bounded" constraint instead of "Num" and began the enumeration at minBound, which would only be valid for natural numbers.
Lists aren't a good data structure for memoizing, though, because they have linear lookup complexity. You might be better off with a Map or IntMap. Or look on Hackage.
Note that this particular code does rely on laziness, so if you wanted to switch to using a Map you would need to take a bounded amount of elements from the list, as in:
gwByMap :: Int -> Int -> Int -> Int -> Int
gwByMap maxX maxY x y = fromMaybe (gw x y) $ M.lookup (x,y) memomap
where
memomap = M.fromList $ concat [[((x',y'),z) | (y',z) <- zip [0..maxY] ys]
| (x',ys) <- zip [0..maxX] gwvals]
fastgw2 :: Int -> Int -> Int
fastgw2 = gwByMap 20 20
I think ghc may be stupid about sharing in this case, you may need to lift out the x and y parameters, like this:
gwByMap maxX maxY = \x y -> fromMaybe (gw x y) $ M.lookup (x,y) memomap

Resources