Is it possible to add fields into an RSA private Key? - format

The RSA private key structure is defined in RFC 3447, section A.1.2 (RSA private key syntax).
That section states that an RSA private key should be represented as:
RSAPrivateKey ::= SEQUENCE {
version Version,
modulus INTEGER, -- n
publicExponent INTEGER, -- e
privateExponent INTEGER, -- d
prime1 INTEGER, -- p
prime2 INTEGER, -- q
exponent1 INTEGER, -- d mod (p-1)
exponent2 INTEGER, -- d mod (q-1)
coefficient INTEGER, -- (inverse of q) mod p
otherPrimeInfos OtherPrimeInfos OPTIONAL
}
Is it possible to add additional fields other than the ones described above, for example a Serial Number?


Constant shape error in simple code

I get an error with this code and I don't understand why:
"The module or main program array 'u' at (1) must have constant shape."
Also, how can I parameterize this code? I mean something like [U]=vector(N), where I choose N and it returns U.
program vector
implicit none
integer :: n
integer, parameter :: N=10
real, dimension(N,1) :: U
do n=1,N
end do
end program vector
First up, Fortran is case-insensitive, so n and N are the same name, and you can't declare both a variable n and a parameter N.
Then you declare U to have shape (N, 1), but seem to use it in the form (1, N).
As for how to auto-generate something like U, you could use something like this:
function vector(n) result(v)
integer, intent(in) :: n
integer :: v(n)
integer :: i
v = [ (i, i=1, n) ]
end function vector
One more thing:
You declare U with dimension(N, 1), which creates a 2D array with one dimension having length 1. I'm wondering whether you wanted to create a 1D array with range from 1 to N, for which the declaration would need to be dimension(1:N) (or, since Fortran assumes indices start at 1, just dimension(N)).
Addressing the questions in your comment:
intent(in) tells the compiler that n is only read, never written to, in this function. Since you want to use n as the size of array v, that is what you want.
With result(v) I tell the compiler that I want to use the name v to refer to the result of the function, not the default (which is the function name). I do this to avoid confusion.
integer :: v(n) is the same as integer, dimension(n) :: v

Odd DFA for Language

Can somebody please help me draw an NFA that accepts this language:
{ w | the length of w is 6k + 1 for some k ≥ 0 }
I have been stuck on this problem for several hours now. I do not understand where the k comes into play and how it is used in the diagram...
{ w | the length of w is 6k + 1 for some k ≥ 0 }
We can use the Myhill-Nerode theorem to constructively produce a provably minimal DFA for this language. This is a useful exercise. First, a definition:
Two strings w and x are indistinguishable with respect to a language L iff: (1) for every string y such that wy is in L, xy is in L; (2) for every string z such that xz is in L, wz is in L.
The insight in Myhill-Nerode is that if two strings are indistinguishable w.r.t. a regular language, then a minimal DFA for that language will see to it that the machine ends up in the same state for either string. Indistinguishability is reflexive, symmetric and transitive so we can define equivalence classes on it. Those equivalence classes correspond directly to the set of states in the minimal DFA. Now, to find the equivalence classes for our language. We consider strings of increasing length and see for each one whether it's indistinguishable from any of the strings before it:
e, the empty string, has no strings before it. We need a state q0 to correspond to the equivalence class this string belongs to. The set of strings that can come after e to reach a string in L is L itself; also written c(c^6)*
c, any string of length one, has only e before it. These are not, however, indistinguishable; we can add e to c to get ce = c, a string in L, but we cannot add e to e to get a string in L, since e is not in L. We therefore need a new state q1 for the equivalence class to which c belongs. The set of strings that can come after c to reach a string in L is (c^6)*.
It turns out we need a new state q2 here; the set of strings that take cc to a string in L is ccccc(c^6)*. Show this.
It turns out we need a new state q3 here; the set of strings that take ccc to a string in L is cccc(c^6)*. Show this.
It turns out we need a new state q4 here; the set of strings that take cccc to a string in L is ccc(c^6)*. Show this.
It turns out we need a new state q5 here; the set of strings that take ccccc to a string in L is cc(c^6)*. Show this.
Consider the string cccccc. What strings take us to a string in L? Well, c does. So does c followed by any string whose length is a multiple of 6. Interestingly, this is the same as L itself. And we already have an equivalence class for that: e could also be followed by any string in L to get a string in L. So cccccc and e are indistinguishable. What's more, since all strings of length 6 are indistinguishable from shorter strings, we no longer need to keep checking longer strings. Our DFA is guaranteed to have only the states q0 - q5 we have already identified. The work we've done above also defines the transitions we need in our DFA, the initial state and the accepting states:
The DFA will have a transition on symbol c from state q to state q' if x is a string in the equivalence class corresponding to q and xc is a string in the equivalence class corresponding to q';
The initial state will be the state corresponding to the equivalence class to which e, the empty string, belongs;
A state q is accepting if any string (hence all strings) belonging to the equivalence class corresponding to q is in the language; alternatively, if the set of strings that take strings in the equivalence class to a string in L includes e, the empty string.
We may use the notes above to write the DFA in tabular form:
q x q'
-- -- --
q0 c q1 // e + c = c
q1 c q2 // c + c = cc
q2 c q3 // cc + c = ccc
q3 c q4 // ccc + c = cccc
q4 c q5 // cccc + c = ccccc
q5 c q0 // ccccc + c = cccccc ~ e
We have q0 as the initial state and the only accepting state is q1.
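To make the table concrete, here is a minimal Python sketch that runs it. It assumes, as the table does, that every input symbol behaves like c, so state q_i is just the integer i; the function name accepts is made up for this illustration.

```python
# Transition table from above: on any symbol, q_i moves to q_((i + 1) mod 6).
NUM_STATES = 6
START = 0          # q0
ACCEPTING = {1}    # q1

def accepts(word):
    """Run the six-state DFA over word and report whether it ends in q1."""
    state = START
    for _symbol in word:               # every symbol is treated like 'c'
        state = (state + 1) % NUM_STATES
    return state in ACCEPTING

for w in ["", "a", "ab", "abcdefg", "abcdefgh"]:
    print(repr(w), accepts(w))
```

The state after reading a word is simply its length mod 6, which is why exactly the words of length 6k + 1 end in q1.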
Here's an NFA which goes 6 states forward; then, if there is one more character, it stops on the final state. Otherwise it loops back non-deterministically to the start and past the final state.
(Start) S1 -> S2 -> S3 -> S5 -> S6 -> S7 (Final State) -> S8 - (loop forever)
^ |
^ v |_|
|________________________| (non deterministically)

Operations with big real numbers in Fortran

I wrote Fortran code that computes the i-th permutation of a given list {1,2,3,...,n} directly, without computing all n! permutations. I need this to find the i-th path of the TSP (travelling salesman problem).
When n! is big, the code fails: the i-th permutation it finds is not the correct one. For n=10 there are no problems at all, but for n=20 the code crashes or returns wrong values. I think this is due to errors Fortran makes when operating on big numbers (sums of big numbers).
I use Visual Fortran Ultimate 2013. The subroutines I use are below. WeightAdjMatRete is the distance matrix between each pair of nodes of the network.
! Factorial
RECURSIVE FUNCTION factorial(n) RESULT(n_factorial)
REAL :: n_factorial
IF(n>0) THEN
! i-th permutation of a list
SUBROUTINE ith_permutazione(lista_iniziale,n,i,ith_permutation)
INTEGER :: k,n
REAL :: j,f
INTEGER, DIMENSION(1:n), INTENT(IN) :: lista_iniziale
INTEGER, DIMENSION(1:n) :: lista_lavoro
INTEGER, DIMENSION(1:n), INTENT(OUT) :: ith_permutation
DO k=1,n
ENDSUBROUTINE ith_permutazione
! Modulo function, adapted
PURE FUNCTION mood(k,modulo) RESULT(ris)
INTEGER, INTENT(IN) :: k,modulo
INTEGER :: ris
IF(MOD(k,modulo)/=0) THEN
! Quotient function, adapted
PURE FUNCTION quoziente(a,p) RESULT(ris)
INTEGER :: ris
IF(MOD(a,p)/=0) THEN
! Vectors containing all the payoffs perceived by the agents at the current state vector and at each of its single permutations
SUBROUTINE tuttipayoff(n,m,nodi,nodi_rete,sigma,bvector,MatVecSomma,VecPos,lista_iniziale,ith_permutation,lunghezze_percorso,WeightAdjMatRete,array_perceived_payoff_old,array_perceived_payoff_neg)
INTEGER, INTENT(IN) :: n,m,nodi,nodi_rete
INTEGER, DIMENSION(1:nodi), INTENT(OUT) :: bvector
REAL, DIMENSION(1:m,1:n), INTENT(OUT) :: MatVecSomma
INTEGER, DIMENSION(1:nodi_rete), INTENT(IN) :: lista_iniziale
INTEGER, DIMENSION(1:nodi_rete), INTENT(OUT) :: ith_permutation
REAL, DIMENSION(1:nodi_rete), INTENT(OUT) :: lunghezze_percorso
REAL, DIMENSION(1:nodi_rete,1:nodi_rete), INTENT(IN) :: WeightAdjMatRete
REAL, DIMENSION(1:nodi), INTENT(OUT) :: array_perceived_payoff_old,array_perceived_payoff_neg
INTEGER :: i,j,k
DO k=1,nodi
IF(VecPos(mood(k,m))<=factorial(REAL(nodi_rete))) THEN
CALL ith_permutazione(lista_iniziale,nodi_rete,VecPos(mood(k,m))-1.,ith_permutation)
IF(VecPos(mood(k,m))-SIGN(1,sigma(m*(quoziente(k,m)-1)+mood(k,m)))*2**(n-quoziente(k,m))<=factorial(REAL(nodi_rete))) THEN
CALL ith_permutazione(lista_iniziale,nodi_rete,VecPos(mood(k,m))-SIGN(1,sigma(m*(quoziente(k,m)-1)+mood(k,m)))*2**(n-quoziente(k,m))-1.,ith_permutation)
Don't use floating-point numbers to represent factorials; factorials are products of integers and are therefore best represented as integers.
Factorials grow big fast, so it may be tempting to use reals, because reals can represent huge numbers like 1.0e+30. But floating-point precision is relative to magnitude: the mantissa has a fixed size, and the numbers can be huge only because the exponent can be huge.
A 32-bit real can represent exact integers up to about 16 million. After that, only every even integer can be represented up to 32 million and every fourth integer up to 64 million. 64-bit reals are better, because they can represent exact integers up to 9 quadrillion.
64-bit integers can go 1024 times further: They can represent 2^63 or about 9 quintillion (9e+18) integers. That is enough to represent 20!:
20! = 2,432,902,008,176,640,000
2^63 = 9,223,372,036,854,775,808
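These limits are easy to check. Here is a small Python sketch (Python's float is a 64-bit real; the helper to_float32 is made up for this illustration and uses the struct module to round through a 32-bit real):

```python
import math
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit real and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

# 2**24 = 16,777,216: beyond this, 32-bit reals skip integers.
print(to_float32(16777216.0) == 16777216.0)   # True, exactly representable
print(to_float32(16777217.0) == 16777217.0)   # False, rounded to a neighbour

# 2**53 (about 9 quadrillion): beyond this, 64-bit reals skip integers.
print(float(2**53) == 2**53)                  # True
print(float(2**53 + 1) == 2**53 + 1)          # False

# 20! fits below 2**63, 21! does not.
print(math.factorial(20) < 2**63 <= math.factorial(21))   # True
```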
Fortran allows you to select a kind of integer based on the decimal places it should be able to represent:
integer(kind=selected_int_kind(18))
Use this to do your calculations with 64-bit integers. This will give you factorials up to 20!. It won't go further than that, though: most compilers support integers only up to 64 bits, and on those selected_int_kind(19) returns -1, which is not a valid kind, so the declaration fails to compile.
Here's the permutation part of your program with 64-bit integers. Note how all the type conversions and floors and ceilings disappear.
program permute
implicit none
integer, parameter :: long = selected_int_kind(18)
integer, parameter :: n = 20
integer, dimension(1:n) :: orig
integer, dimension(1:n) :: perm
integer(kind=long) :: k
do k = 1, n
orig(k) = k
end do
do k = 0, 2000000000000000_long, 100000000000000_long
call ith_perm(perm, orig, n, k)
print *, k
print *, perm
print *
end do
end program
function fact(n)
implicit none
integer, parameter :: long = selected_int_kind(18)
integer(kind=long) :: fact
integer, intent(in) :: n
integer :: i
fact = 1
i = n
do while (i > 1)
fact = fact * i
i = i - 1
end do
end function fact
subroutine ith_perm(perm, orig, n, i)
implicit none
integer, parameter :: long = selected_int_kind(18)
integer, intent(in) :: n
integer(kind=long), intent(in) :: i
integer, dimension(1:n), intent(in) :: orig
integer, dimension(1:n), intent(out) :: perm
integer, dimension(1:n) :: work
integer :: k
integer(kind=long) :: f, j
integer(kind=long) :: fact
work = orig
j = i
do k = 1, n
f = fact(n - k)
perm(k) = work(j / f + 1)
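! Remove the chosen element; only the first n-k+1 entries of work are still in use,
! and slicing keeps both sides of the assignment the same size.
work(1:n-k) = pack(work(1:n-k+1), work(1:n-k+1) /= perm(k))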
j = mod(j, f)
end do
end subroutine ith_perm
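For cross-checking results, here is the same idea as a short Python sketch. Python integers are arbitrary precision, so the 20! ceiling does not apply. The function name ith_permutation is made up for this illustration, and unlike the Fortran above it removes the chosen element by position rather than by value.

```python
from math import factorial

def ith_permutation(items, i):
    """Return the i-th (0-based) permutation of items, in the order
    produced by repeatedly picking from the remaining elements."""
    work = list(items)
    n = len(work)
    if not 0 <= i < factorial(n):
        raise ValueError("index out of range")
    perm = []
    for k in range(1, n + 1):
        f = factorial(n - k)       # permutations of what remains after this pick
        q, i = divmod(i, f)        # q selects the element, i carries the remainder
        perm.append(work.pop(q))
    return perm

print(ith_permutation([1, 2, 3], 0))   # [1, 2, 3]
print(ith_permutation([1, 2, 3], 5))   # [3, 2, 1]
```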

Is there a fast algorithm to determine the godel number of a term of a context free language?

Suppose we have a simple grammar specification. There is a way to enumerate terms of that grammar that guarantees that any finite term will have a finite position, by iterating it diagonally. For example, for the following grammar:
S ::= add
add ::= mul | add + mul
mul ::= term | mul * term
term ::= number | ( S )
number ::= digit | digit number
digit ::= 0 | 1 | ... | 9
You can enumerate terms like that:
... etc
My question is: is there a way to do the opposite? That is, to take a valid term of that grammar, say, 0+0*0, and find its position on such enumeration - in that case, 9?
For this specific problem, we can cook up something fairly simple, if we allow ourselves to choose a different enumeration ordering. The idea is basically the one in Every Bit Counts, which I also mentioned in the comments. First, some preliminaries: some imports/extensions, a data type representing the grammar, and a pretty-printer. For the sake of simplicity, my digits only go up to 2 (big enough to not be binary any more, but small enough not to wear out my fingers and your eyes).
{-# LANGUAGE TypeSynonymInstances #-}
import Control.Applicative
import Data.Universe.Helpers
type S = Add
data Add = Mul Mul | Add :+ Mul deriving (Eq, Ord, Show, Read)
data Mul = Term Term | Mul :* Term deriving (Eq, Ord, Show, Read)
data Term = Number Number | Parentheses S deriving (Eq, Ord, Show, Read)
data Number = Digit Digit | Digit ::: Number deriving (Eq, Ord, Show, Read)
data Digit = D0 | D1 | D2 deriving (Eq, Ord, Show, Read, Bounded, Enum)
class PP a where pp :: a -> String
instance PP Add where
    pp (Mul m) = pp m
    pp (a :+ m) = pp a ++ "+" ++ pp m
instance PP Mul where
    pp (Term t) = pp t
    pp (m :* t) = pp m ++ "*" ++ pp t
instance PP Term where
    pp (Number n) = pp n
    pp (Parentheses s) = "(" ++ pp s ++ ")"
instance PP Number where
    pp (Digit d) = pp d
    pp (d ::: n) = pp d ++ pp n
instance PP Digit where pp = show . fromEnum
Now let's define the enumeration order. We'll use two basic combinators, +++ for interleaving two lists (mnemonic: the middle character is a sum, so we're taking elements from either the first argument or the second) and +*+ for the diagonalization (mnemonic: the middle character is a product, so we're taking elements from both the first and second arguments). More information on these in the universe documentation. One invariant we'll maintain is that our lists -- with the exception of digits -- are always infinite. This will be important later.
ss = adds
adds = (Mul <$> muls ) +++ (uncurry (:+) <$> adds +*+ muls)
muls = (Term <$> terms ) +++ (uncurry (:*) <$> muls +*+ terms)
terms = (Number <$> numbers) +++ (Parentheses <$> ss)
numbers = (Digit <$> digits) ++ interleave [[d ::: n | n <- numbers] | d <- digits]
digits = [D0, D1, D2]
Let's see a few terms:
*Main> mapM_ (putStrLn . pp) (take 15 ss)
Okay, now let's get to the good bit. Let's assume we have two infinite lists a and b. There's two things to notice. First, in a +++ b, all the even indices come from a, and all the odd indices come from b. So we can look at the last bit of an index to see which list to look in, and the remaining bits to pick an index in that list. Second, in a +*+ b, we can use the standard bijection between pairs of numbers and single numbers to translate between indices in the big list and pairs of indices in the a and b lists. Nice! Let's get to it. We'll define a class for Godel-able things that can be translated back and forth between numbers -- indices into the infinite list of inhabitants. Later we'll check that this translation matches the enumeration we defined above.
type Nat = Integer -- bear with me here
class Godel a where
    to :: a -> Nat
    from :: Nat -> a
instance Godel Nat where to = id; from = id
instance (Godel a, Godel b) => Godel (a, b) where
    to (m_, n_) = (m + n) * (m + n + 1) `quot` 2 + m where
        m = to m_
        n = to n_
    from p = (from m, from n) where
        isqrt = floor . sqrt . fromIntegral
        base = (isqrt (1 + 8 * p) - 1) `quot` 2
        triangle = base * (base + 1) `quot` 2
        m = p - triangle
        n = base - m
The instance for pairs here is the standard Cantor diagonal. It's just a bit of algebra: use the triangle numbers to figure out where you're going/coming from. Now building up instances for this class is a breeze. Numbers are just represented in base 3:
-- this instance is a lie! there aren't infinitely many Digits
-- but we'll be careful about how we use it
instance Godel Digit where
    to = fromIntegral . fromEnum
    from = toEnum . fromIntegral
instance Godel Number where
    to (Digit d) = to d
    to (d ::: n) = 3 + to d + 3 * to n
    from n
        | n < 3 = Digit (from n)
        | otherwise = let (q, r) = quotRem (n-3) 3 in from r ::: from q
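If you want to play with the pair encoding outside Haskell, here is a minimal Python sketch of the same Cantor pairing. The names pair and unpair are made up for this illustration. One deliberate difference: it uses math.isqrt, an exact integer square root, where the Haskell instance goes through floating point, which I would expect to stop being exact for very large indices.

```python
from math import isqrt

def pair(m, n):
    """Cantor pairing: index of (m, n) when walking the diagonals."""
    return (m + n) * (m + n + 1) // 2 + m

def unpair(p):
    """Inverse of pair: recover (m, n) from the index p."""
    base = (isqrt(1 + 8 * p) - 1) // 2      # which diagonal p lies on
    triangle = base * (base + 1) // 2       # index of that diagonal's first cell
    m = p - triangle
    return m, base - m

print([unpair(p) for p in range(6)])
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```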
For the remaining three types, we will, as suggested above, check the tag bit to decide which constructor to emit, and use the remaining bits as indices into a diagonalized list. All three instances necessarily look very similar.
instance Godel Term where
    to (Number n) = 2 * to n
    to (Parentheses s) = 1 + 2 * to s
    from n = case quotRem n 2 of
        (q, 0) -> Number (from q)
        (q, 1) -> Parentheses (from q)
instance Godel Mul where
    to (Term t) = 2 * to t
    to (m :* t) = 1 + 2 * to (m, t)
    from n = case quotRem n 2 of
        (q, 0) -> Term (from q)
        (q, 1) -> uncurry (:*) (from q)
instance Godel Add where
    to (Mul m) = 2 * to m
    to (m :+ t) = 1 + 2 * to (m, t)
    from n = case quotRem n 2 of
        (q, 0) -> Mul (from q)
        (q, 1) -> uncurry (:+) (from q)
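The tag-bit trick in these three instances can be seen in isolation with a small Python sketch. It assumes two infinite streams interleaved strictly alternately, first stream first; interleave, locate and merge_index are names made up for this illustration, not part of the universe package.

```python
from itertools import count, islice

def interleave(a, b):
    """Alternate elements of two infinite iterators: a0, b0, a1, b1, ..."""
    while True:
        yield next(a)
        yield next(b)

def locate(index):
    """Split an index in the merged stream into (which stream, index in it)."""
    q, r = divmod(index, 2)            # r is the tag bit, q the remaining bits
    return ("a" if r == 0 else "b", q)

def merge_index(which, i):
    """Inverse of locate: the position of element i of stream `which`."""
    return 2 * i + (0 if which == "a" else 1)

stream_a = (("a", i) for i in count())
stream_b = (("b", i) for i in count())
merged = list(islice(interleave(stream_a, stream_b), 8))
print(merged)
print([locate(k) for k in range(8)])   # same list, computed without enumerating
```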
And that's it! We can now "efficiently" translate back and forth between parse trees and their Godel numbering for this grammar. Moreover, this translation matches the above enumeration, as you can verify:
*Main> map from [0..29] == take 30 ss
We did abuse many nice properties of this particular grammar -- non-ambiguity, the fact that almost all the nonterminals had infinitely many derivations -- but variations on this technique can get you quite far, especially if you are not too strict on requiring every number to be associated with something unique.
Also, by the way, you might notice that, except for the instance for (Nat, Nat), these Godel numberings are particularly nice in that they look at/produce one bit (or trit) at a time. So you could imagine doing some streaming. But the (Nat, Nat) one is pretty nasty: you have to know the whole number ahead of time to compute the sqrt. You actually can turn this into a streaming guy, too, without losing the property of being dense (every Nat being associated with a unique (Nat, Nat)), but that's a topic for another answer...

Project Euler No. 14 Haskell

I'm trying to solve problem 14 of Project Euler and I hit a dead end using Haskell.
Now, I know that the numbers may be small enough and I could do a brute force, but that isn't the purpose of my exercise.
I am trying to memoize the intermediate results in a Map of type Map Integer (Bool, Integer), with the following meaning:
- the first Integer (the key) holds the number
- the tuple (Bool, Integer) holds either (True, Length) or (False, Number)
where Length = length of the chain
Number = the number before it in the chain
for 13: the chain is 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
My map should contain :
13 - (True, 10)
40 - (False, 13)
20 - (False, 40)
10 - (False, 20)
5 - (False, 10)
16 - (False, 5)
8 - (False, 16)
4 - (False, 8)
2 - (False, 4)
1 - (False, 2)
Now when I search for another number like 40, I know that its chain has length (10 - 1), and so on.
Now, if I search for 10, I want not only to learn that the chain of 10 has length (10 - 3) and update the map, but also to update 20 and 40 in case they are still (False, _).
My code:
import Data.Map as Map
solve :: [Integer] -> Map Integer (Bool, Integer)
solve xs = solve' xs Map.empty
solve' :: [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
solve' [] table = table
solve' (x:xs) table =
    case Map.lookup x table of
        Nothing -> countF x 1 (x:xs) table
        Just (b, _) ->
            case b of
                True -> solve' xs table
                False -> {-WRONG-} solve' xs table
f :: Integer -> Integer
f x
    | x `mod` 2 == 0 = x `quot` 2
    | otherwise = 3 * x + 1
countF :: Integer -> Integer -> [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
countF n cnt (x:xs) table
    | n == 1 = solve' xs (Map.insert x (True, cnt) table)
    | otherwise = countF (f n) (cnt + 1) (x:xs) $ checkMap (f n) n table
checkMap :: Integer -> Integer -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
checkMap n rez table =
    case Map.lookup n table of
        Nothing -> Map.insert n (False, rez) table
        Just _ -> table
At the {-WRONG-} part we should update all the values like in the following example:
--We are looking for 10:
10 - (False, 20)
V {-finally-} update 10 => (True, 10 - 1 - 1 - 1)
20 - (False, 40) ^
| |
V update 20 => 20 - (True, 10 - 1 - 1)
40 - (False, 13) ^
| |
V update 40 => 40 - (True, 10 - 1)
13 - (True, 10) ^
| |
The problem is that I don't know if it's possible to do two things in one function, like updating a number and continuing the recurrence. In a C-like language I might do something like this (pseudocode):
void f(int n, tuple(b,nr), int &length, table)
if(b == False) f (nr, (table lookup nr), 0, table);
// the bool is true so we got a length
length = nr;
// Since this is a recursion it works like a stack, producing the right output
table update(n, --cnt);
The last instruction would work since we are sending cnt by reference. Also we always know that it will finish at some point and cnt should not be < 1.
The easiest optimization (as you have identified) is memoization. You have attempted to create a memoization system yourself, but have run into issues with how to store the memoized values. There are solutions to doing this in a maintainable way, such as using a State monad or an STArray. However, there is a much simpler solution to your problem: use Haskell's existing memoization. Haskell by default remembers constant values, so if you create a value that stores the collatz values, it will be automatically memoized!
A simple example of this is the following fibonacci definition:
fib :: Int -> Integer
fib n = fibValues !! n where
    fibValues = 1 : 1 : zipWith (+) fibValues (tail fibValues)
fibValues is an [Integer], and as it is just a constant value, it is memoized. However, that doesn't mean it is all computed at once; since it is an infinite list, that would never finish. Instead, the values are only calculated when needed, as Haskell is lazy.
So if you do something similar with your problem, you will get memoization without a lot of the work. However, using a list like above won't work well in your solution. This is because the collatz algorithm uses many different values to get the result for a given number, so the container used will require random access to be efficient. The obvious choice is an array.
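-- Imports used by the snippets in this answer
import Data.Array
import Data.Function (on)
import Data.List (maximumBy)
collatzMemoized :: Array Integer Int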
Next, we need to fill up the array with the correct values. I'll write this function pretending a collatz function exists that calculates the collatz value for any n. Also, note that arrays are fixed size, so a value needs to be used to determine the maximum number to memoize. I'll use a million, but any value can be used (it is a memory/speed tradeoff).
collatzMemoized = listArray (1, maxNumberToMemoize) $ map collatz [1..maxNumberToMemoize] where
    maxNumberToMemoize = 1000000
That is pretty straightforward: listArray is given the bounds and a list of all the collatz values in that range. Remember that this won't calculate all the collatz values straight away, as the values are lazy.
Now, the collatz function can be written. The most important part is to only check the collatzMemoized array if the number being checked is within its bounds:
collatz :: Integer -> Int
collatz 1 = 1
collatz n
    | inRange (bounds collatzMemoized) nextValue = 1 + collatzMemoized ! nextValue
    | otherwise = 1 + collatz nextValue
    where
        nextValue = case n of
            1 -> 1
            n | even n -> n `div` 2
              | otherwise -> 3 * n + 1
In ghci, you can now see the effectiveness of the memoization. Try collatz 200000. It will take about 2 seconds to finish. However, if you run it again, it will complete instantly.
Finally, the solution can be found:
maxCollatzUpTo :: Integer -> (Integer, Int)
maxCollatzUpTo n = maximumBy (compare `on` snd) $ zip [1..n] (map collatz [1..n])
and then printed:
main = print $ maxCollatzUpTo 1000000
If you run main, the result will be printed in about 10 seconds.
Now, a small problem with this approach is that it uses a lot of stack space. It works fine in ghci (which seems to be more flexible with regard to stack space). However, if you compile it and run the executable, it will crash with a stack space overflow. So to run the program, you have to specify a larger stack when you compile it. This can be done by adding -with-rtsopts='-K64m' to the compile options, which increases the stack to 64 MB.
Now the program can be compiled and run:
> ghc -O3 --make -with-rtsopts='-K64m' problem.hs
Running ./problem will give the result in less than a second.
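For comparison, here is the same bounded-table idea as a Python sketch. It is not a translation of the lazy array above: it fills the table with an explicit loop, and, as an assumption of this sketch, it only stores lengths for start values up to the limit while walking larger intermediate values without storing them. The name longest_collatz_up_to is made up for this illustration.

```python
def longest_collatz_up_to(limit):
    """Return (start, chain length) for the longest chain with 1 <= start <= limit."""
    lengths = [0] * (limit + 1)    # lengths[k] == 0 means "not computed yet"
    lengths[1] = 1
    best_start, best_length = 1, 1
    for start in range(2, limit + 1):
        n, steps = start, 0
        # Walk the chain until we reach a value whose length is already tabled.
        while n > limit or lengths[n] == 0:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
        lengths[start] = steps + lengths[n]
        if lengths[start] > best_length:
            best_start, best_length = start, lengths[start]
    return best_start, best_length

print(longest_collatz_up_to(13))
```

Because the loop never recurses, there is no stack depth to tune here.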
You are going about memoization the hard way, trying to write an imperative program in Haskell. Borrowing from David Eisenstat's solution, we'll solve it as j_random_hacker suggested:
collatzLength :: Integer -> Integer
collatzLength n
    | n == 1 = 1
    | even n = 1 + collatzLength (n `div` 2)
    | otherwise = 1 + collatzLength (3*n + 1)
The dynamic programming solution for this is to replace the recursion with looking things up in a table. Let's make a function where we can replace the recursive call:
collatzLengthDef :: (Integer -> Integer) -> Integer -> Integer
collatzLengthDef r n
    | n == 1 = 1
    | even n = 1 + r (n `div` 2)
    | otherwise = 1 + r (3*n + 1)
Now we could define the recursive algorithm as
collatzLength :: Integer -> Integer
collatzLength = collatzLengthDef collatzLength
Now we could also make a tabled version of this (it takes a number for the table size, and returns a collatzLength function that is calculated using a table of that size):
-- A utility function that makes memoizing things easier
buildTable :: (Ix i) => (i, i) -> (i -> e) -> Array i e
buildTable bounds f = array bounds $ map (\x -> (x, f x)) $ range bounds
collatzLengthTabled :: Integer -> Integer -> Integer
collatzLengthTabled n = collatzLengthTableLookup
    where
        bounds = (1, n)
        table = buildTable bounds (collatzLengthDef collatzLengthTableLookup)
        collatzLengthTableLookup =
            \x -> case inRange bounds x of
                True -> table ! x
                _ -> (collatzLengthDef collatzLengthTableLookup) x
This works by defining the collatzLength to be a table lookup, with the table being the definition of the function, but with recursive calls replaced by table lookup. The table lookup function checks to see if the argument to the function is in the range that is tabled, and falls back on the definition of the function. We can even make this work for tabling any function like this:
tableRange :: (Ix a) => (a, a) -> ((a -> b) -> a -> b) -> a -> b
tableRange bounds definition = tableLookup
    where
        table = buildTable bounds (definition tableLookup)
        tableLookup =
            \x -> case inRange bounds x of
                True -> table ! x
                _ -> (definition tableLookup) x
collatzLengthTabled n = tableRange (1, n) collatzLengthDef
You just need to make sure that you bind the tabled function once and reuse that binding:
let memoized = collatzLengthTabled 10000000
... memoized ...
So that only one table is built in memory.
I remember finding memoisation of dynamic programming algorithms very counterintuitive in Haskell, and it's been a while since I've done it, but hopefully the following trick works for you.
But first, I don't quite understand your current DP scheme, though I suspect it may be quite inefficient as it seems like it will need to update many entries for each answer. (a) I don't know how to do this in Haskell, and (b) you don't need to do this to solve the problem efficiently ;-)
I suggest the following approach instead: first build an ordinary recursive function that computes the right answer for an input number. (Hint: it will have a signature like collatzLength :: Int -> Int.) When you have this function working, just replace its definition with the definition of an array whose elements are defined lazily with the array function using an association list, and replace all recursive calls to the function to array lookups (e.g. collatzLength 42 would become collatzLength ! 42). This will automagically populate the array in the necessary order! So your "top-level" collatzLength object will now actually be an array, rather than a function.
As I suggested above, I would use an array instead of a map datatype to hold the DP table, since you will need to store values for all integer indices from 1 up to 1,000,000.
I don't have a Haskell compiler handy, so I apologize for any broken code.
Without memoization, there's a function
collatzLength :: Integer -> Integer
collatzLength n
    | n == 1 = 1
    | even n = 1 + collatzLength (n `div` 2)
    | otherwise = 1 + collatzLength (3*n + 1)
With memoization, the type signature is
memoCL :: Map Integer Integer -> Integer -> (Map Integer Integer, Integer)
since memoCL receives a table as input and gives the updated table as output. What memoCL needs to do is intercept the return of the recursive call with a let form and insert the new result.
-- table must have an initial entry for 1
memoCL table n = case Map.lookup n table of
    Just m -> (table, m)
    Nothing -> let (table', m) = memoCL table (collatzStep n) in (Map.insert n (1 + m) table', 1 + m)
collatzStep :: Integer -> Integer
collatzStep n = if even n then n `div` 2 else 3*n + 1
At some point you'll get sick of the above idiom. Then it's time for monads.
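For readers more at home in an imperative language, here is the same table-passing idea as a Python sketch. It differs from memoCL in one stated way: it mutates a dict in place instead of returning an updated table, and it uses a loop where memoCL recurses. The names collatz_step and memo_cl mirror the Haskell ones but are otherwise made up for this illustration.

```python
def collatz_step(n):
    """One step of the Collatz map."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def memo_cl(table, n):
    """Return the chain length of n and record every length learned on the way.

    table must have an initial entry for 1."""
    path = []
    while n not in table:          # walk forward until a known value is hit
        path.append(n)
        n = collatz_step(n)
    length = table[n]
    for m in reversed(path):       # unwind, filling in each visited value
        length += 1
        table[m] = length
    return length

table = {1: 1}
print(memo_cl(table, 13))                  # 10
print(table[40], table[20], table[10])     # 9 8 7
```

After the single call for 13, the entries for 40, 20 and 10 are already filled in, which is the update the question was after.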
I eventually modified the {-WRONG-} part to do what it should, with a call to mark x (b, n) [] xs table, where
mark :: Integer -> (Bool, Integer) -> [Integer] -> [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
mark crtElem (b, n) list xs table
    | b == False = mark n (findElem n table) (crtElem:list) xs table
    | otherwise = continueWith n list xs table
continueWith :: Integer -> [Integer] -> [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
continueWith _ [] xs table = solve' xs table
continueWith cnt (y:ys) xs table = continueWith (cnt - 1) ys xs (Map.insert y (True, cnt - 1) table)
findElem :: Integer -> Map Integer (Bool, Integer) -> (Bool, Integer)
findElem n table =
    case Map.lookup n table of
        Nothing -> (False, 0)
        Just (b, nr) -> (b, nr)
But it seems that there are better (and far less verbose) answers than this one.
You might find it interesting to see how I solved the problem. It is pretty functional, though it might not be the most efficient thing on earth :)
You can find the code here:
P.S.: Disclaimer: I was doing Project Euler exercises in order to learn Haskell, so the quality of the solution could be debatable.
Since we are studying recursion schemes, here's one for you.
Let's consider functor N(A,B,X)=A+B*X, which is a stream of Bs with the last element being A.
{-# LANGUAGE DeriveFunctor
, TypeFamilies
, TupleSections #-}
import Data.Functor.Foldable
import qualified Data.Map as M
import Data.List
import Data.Function
import Data.Int
data N a b x = Z a | S b x deriving (Functor)
This stream is handy for several kinds of iterations. For one, we can use it to represent a chain of Ints in a Collatz sequence:
type instance Base Int64 = N Int Int64
instance Foldable Int64 where
    project 1 = Z 1
    project x | odd x = S x $ 3*x+1
    project x = S x $ x `div` 2
This is just an algebra, not an initial one, because the transformation is not an isomorphism (the same chain of Ints is part of a chain for 2*x and (x-1)/3), but this is sufficient to represent the fixpoint Base Int64 Int64.
With this definition, cata is going to feed the chain to the algebra given to it, and you can use it to construct a memo Map of integers to the chain length. Finally, anamorphism can use it to generate a stream of solutions to the problem of different sizes:
problems = ana (uncurry $ cata . phi) (M.empty, 1) where
    phi :: M.Map Int64 Int ->
           Base Int64 (Prim [(Int64, Int)] (M.Map Int64 Int, Int64)) ->
           Prim [(Int64, Int)] (M.Map Int64 Int, Int64)
    phi m (Z v) = found m 1 v
    phi m (S x ~(Cons (_, v') (m', _))) = maybe (notFound m' x v') (found m x) $
        M.lookup x m
The ~ before (Cons ...) means lazy pattern matching. We don't touch the pattern until the values are needed. If not for lazy pattern matching, it would always construct the whole chain, and using the map would be useless. With lazy pattern matching we only construct the values v' and m' if the chain length for x was not in the map.
Helper functions construct the stream of (Int, chain length) pairs:
found m x v = Cons (x, v) (m, x+1)
notFound m x v = Cons (x, 1+v) (M.insert x (1+v) m, x+1)
Now just take the first 999999 problems, and figure out the one that has the longest chain:
main = print $ maximumBy (compare `on` snd) $ take 999999 problems
This is slower than the array-based solution, because a Map lookup is logarithmic in the size of the map, but this solution is not limited to a fixed size. Still, it finishes in about 5 seconds.
