Slow tail recursion in F#

I have an F# function that returns a list of numbers starting from 0 in the pattern of skip n, choose n, skip n, choose n... up to a limit. For example, this function for input 2 will return [2, 3, 6, 7, 10, 11...].
Initially I implemented this as a non-tail-recursive function as below:
let rec indicesForStep start blockSize maxSize =
    match start with
    | i when i > maxSize -> []
    | _ -> [for j in start .. ((min (start + blockSize) maxSize) - 1) -> j] @ indicesForStep (start + 2 * blockSize) blockSize maxSize
Thinking that tail recursion is desirable, I reimplemented it using an accumulator list as follows:
let indicesForStepTail start blockSize maxSize =
    let rec indicesForStepInternal istart accumList =
        match istart with
        | i when i > maxSize -> accumList
        | _ -> indicesForStepInternal (istart + 2 * blockSize) (accumList @ [for j in istart .. ((min (istart + blockSize) maxSize) - 1) -> j])
    indicesForStepInternal start []
However, when I run this in fsi under Mono with the parameters 1, 1 and 20,000 (i.e. should return [1, 3, 5, 7...] up to 20,000), the tail-recursive version is significantly slower than the first version (12 seconds compared to sub-second).
Why is the tail-recursive version slower? Is it because of the list concatenation? Is it a compiler optimisation? Have I actually implemented it tail-recursively?
I also feel as if I should be using higher-order functions to do this, but I'm not sure exactly how to go about doing it.

As dave points out, the problem is that you're using the @ operator to append lists. This is a more significant performance issue than tail recursion. In fact, tail recursion doesn't really speed up the program much (but it makes it work on large inputs where the stack would otherwise overflow).
The reason why your second version is slower is that you're appending a shorter list (the one generated using [...]) to a longer list (accumList). This is slower than appending a longer list to a shorter one, because @ needs to copy its first argument.
You can fix it by collecting the elements in the accumulator in a reversed order and then reversing it before returning the result:
let indicesForStepTail start blockSize maxSize =
    let rec indicesForStepInternal istart accumList =
        match istart with
        | i when i > maxSize -> accumList |> List.rev
        | _ ->
            let acc =
                [for j in ((min (istart + blockSize) maxSize) - 1) .. -1 .. istart -> j]
                @ accumList
            indicesForStepInternal (istart + 2 * blockSize) acc
    indicesForStepInternal start []
As you can see, this has the shorter list (generated using [...]) as the first argument to @, and on my machine it has similar performance to the non-tail-recursive version. Note that the [ ... ] comprehension generates elements in reversed order, so that they can be reversed back at the end.
You can also write the whole thing more nicely using the F# seq { .. } syntax. You can avoid the @ operator completely, because it allows you to yield individual elements using yield and perform tail-recursive calls using yield!:
let rec indicesForStepSeq start blockSize maxSize = seq {
    match start with
    | i when i > maxSize -> ()
    | _ ->
        for j in start .. ((min (start + blockSize) maxSize) - 1) do
            yield j
        yield! indicesForStepSeq (start + 2 * blockSize) blockSize maxSize }
This is how I'd write it. When calling it, you just need to add Seq.toList to evaluate the whole lazy sequence. The performance of this version is similar to the first one.
EDIT With the correction from Daniel, the Seq version is actually slightly faster!

In F# the list type is implemented as a singly linked list. Because of this you get different performance for x @ y and y @ x if x and y are of different length. That's why you're seeing a difference in performance. x @ y has a running time proportional to the length of x.
// e.g.
let x = [1;2;3;4]
let y = [5]
If you did x @ y then x (4 elements) would be copied into a new list and its internal next pointer would be set to the existing y list. If you did y @ x then y (1 element) would be copied into a new list and its next pointer would be set to the existing list x.
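To make the asymmetry concrete, here is a minimal sketch in Python rather than F#. The `Cons` class and the `from_list` and `append` helpers are hypothetical stand-ins for an immutable singly linked list, not the actual F# implementation. The sketch only counts how many cells each append has to copy.

```python
class Cons:
    """One immutable cell of a singly linked list (hypothetical stand-in for an F# list node)."""
    def __init__(self, head, tail):
        self.head = head
        self.tail = tail

def from_list(items):
    """Build a linked list from a Python list; None plays the role of []."""
    node = None
    for item in reversed(items):
        node = Cons(item, node)
    return node

def append(xs, ys, counter):
    """Return xs followed by ys, copying every cell of xs and sharing ys as-is."""
    heads = []
    while xs is not None:
        heads.append(xs.head)
        xs = xs.tail
    result = ys
    for head in reversed(heads):
        result = Cons(head, result)
        counter[0] += 1  # one new cell per element of the first argument
    return result

x = from_list([1, 2, 3, 4])
y = from_list([5])

copies_xy = [0]
xy = append(x, y, copies_xy)  # copies the 4 cells of x, shares y
copies_yx = [0]
yx = append(y, x, copies_yx)  # copies the 1 cell of y, shares x
print(copies_xy[0], copies_yx[0])
```

In both cases the second argument is reused without copying, so only the length of the first argument matters.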
I wouldn't use a higher order function to do this. I'd use list comprehension instead.
let indicesForStepTail start blockSize maxSize =
    [
        for block in start .. (blockSize * 2) .. (maxSize - 1) do
            for i in block .. (block + blockSize - 1) do
                yield i
    ]

This looks like the list append is the problem. Append is basically an O(N) operation in the size of its first argument. Because the accumulator is the left operand and grows on every step, the whole computation takes O(N^2) time.
The way this is typically done in functional code seems to be to accumulate the list in reverse order (by accumulating on the right), then at the end, return the reverse of the list.
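The difference between the two accumulation strategies can be estimated with a little arithmetic. The following Python sketch only counts copied cells under the assumption that an append copies exactly its first argument; the function names are made up for illustration. For the question's parameters (1, 1 and 20,000) there are 10,000 blocks of one element each.

```python
def cells_copied_left_accumulation(blocks, block_size):
    """Cells copied when every step appends the new block to the end of the accumulator."""
    copied = 0
    acc_len = 0
    for _ in range(blocks):
        copied += acc_len      # the whole accumulator is copied on each step
        acc_len += block_size
    return copied

def cells_copied_prepend(blocks, block_size):
    """Cells copied when every step puts the new block in front of the accumulator."""
    return blocks * block_size  # only the short block is copied each time

slow = cells_copied_left_accumulation(10_000, 1)
fast = cells_copied_prepend(10_000, 1)
print(slow, fast)
```

The reversed-accumulator approach adds one final reversal, which is a single extra pass over the result.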
The first version you have avoids the append problem, but as you point out, is not tail recursive.
In F#, probably the easiest way to solve this problem is with sequences. It is not very functional looking, but you can easily create an infinite sequence following your pattern, and use Seq.take to get the items you are interested in.
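The infinite-sequence idea can be sketched as follows. This is Python, not F#: a generator plays the role of the lazy sequence, `itertools.islice` stands in for Seq.take, and `skip_take` is a hypothetical name chosen for this example.

```python
from itertools import islice, takewhile

def skip_take(n):
    """Yield indices forever in the pattern: skip n, take n, skip n, take n, ..."""
    start = n
    while True:
        for j in range(start, start + n):
            yield j
        start += 2 * n

# Take a fixed number of items from the infinite sequence.
first_six = list(islice(skip_take(2), 6))

# Or stop at an upper limit instead of a fixed count.
below_limit = list(takewhile(lambda j: j < 12, skip_take(2)))
print(first_six, below_limit)
```

Because nothing is produced until it is requested, no intermediate lists are appended at all.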

Related

Creating all subsets recursively without using array

We get non negative integer number n from user and we must print all subsets of set ({1,2,3,...,n}). (n<=20)
for example for n=3 we must print:
{1 , 2 , 3}
{1 , 2}
{1 , 3}
{1}
{2 , 3}
{2}
{3}
{}
,s are optional and the sequence can be printed without any comma. (like {1 2 3})
I must add that the sequence of subsets must be exactly like the example. Meaning first the subsets that have 1, then subsets that have 2 and .... The longest subset must be printed first. (lexicographical from the largest subset (the set itself) to null set)
I see a lot of codes in the Internet that solve this problem with arrays or using a bit array that indicate whether we use a number or not. The issue is that in this question, we are not allowed to use -any- type of array or other data structures like vector ,etc. Even using the array behaviour of something like string is completely prohibited. It must be solved only with recursion.
We are also not allowed to use any advanced functions. For example if we write it with C, we are allowed just to use stdio.h or for C++, only <iostream> is allowed and no other library.
I don't know how to do this without any arrays: how to decide which numbers must be printed and, at the same time, manage the { and }.
PS1.
The question is simply generation powerset with these conditions:
USING ARRAY, STRING AND EVEN LOOPS ARE COMPLETELY PROHIBITED. JUST RECURSION.
User Kosyr submitted a very good answer with bit operators. So if you want to submit another answer, submit an answer that even doesn't use bit operators.
PS2.
I wrote this code with George's help, but it doesn't work properly. It never prints something like 1 2 4, and it also repeats some cases.
#include <stdio.h>
/* forward declarations, needed because the functions are used before they are defined */
void printRows (int size, int start);
void printRow (int start, int limit);
void printAllSets (int size)
{
    printRows (size, 1);
}
void printRows (int size, int start)
{
    if (start <= size)
    {
        printf ("{ ");
        printRow (start, size);
        printf ("}");
        printf ("\n");
    }
    if (start <= size)
    {
        printRows (size - 1, start);
        printRows (size, (start + 1));
    }
}
void printRow (int start, int limit)
{
    if (start <= limit)
    {
        printf ("%d ", start);
        printRow (start + 1, limit);
    }
}
int main()
{
    printAllSets(5);
    printf("{ }");
    return 0;
}

Recursive algorithms are very memory intensive. Here is an algorithm for n <= 31:
#include <iostream>
void bin(unsigned bit, unsigned k, unsigned max_bits) {
    if (bit == max_bits) {
        std::cout << "{";
        bin(bit - 1, k, max_bits);
    }
    else {
        if ((k & (1u << bit)) != 0) {
            std::cout << (max_bits - bit) << " ";
        }
        if (bit != 0) {
            bin(bit - 1, k, max_bits);
        }
        else {
            std::cout << "}" << std::endl;
        }
    }
}
void print(unsigned k, unsigned n, unsigned max_bits) {
    bin(max_bits, k, max_bits);
    if (k != 0) {
        print(k - 1, n, max_bits);
    }
}
int main()
{
    unsigned n;
    std::cin >> n;
    print((1u << n) - 1u, 1u<<n, n);
    return 0;
}
The first recursion, print, enumerates k from 2^n-1 down to 0; the second recursion, bin, walks over all bits of k and prints the set bits. For example, with max_bits = 5 and k = 19 = 10011b = 16 + 2 + 1 = 2^4 + 2^1 + 2^0, bits 4, 1 and 0 are interpreted as the set {5-4, 5-1, 5-0} => {1, 4, 5}.
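The encoding itself can be checked with a short Python sketch. Note that it uses a loop and a list, so it only illustrates how a value of k maps to a subset and is not a solution under the question's constraints; `subset_from_mask` is a hypothetical helper name.

```python
def subset_from_mask(k, max_bits):
    """Return the elements selected by bitmask k, scanning from the highest bit down."""
    elements = []
    for bit in range(max_bits - 1, -1, -1):
        if k & (1 << bit):
            # bit number `bit` stands for the element max_bits - bit
            elements.append(max_bits - bit)
    return elements

example = subset_from_mask(19, 5)

# Counting k down from 2^n - 1 to 0 gives the order the question asks for.
all_subsets = [subset_from_mask(k, 3) for k in range(2**3 - 1, -1, -1)]
print(example, all_subsets)
```

Because the highest bit stands for the element 1, larger values of k correspond to subsets that come earlier in the required order.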

The alternative to loops is recursion.
To solve this problem (I think... I haven't tested it), I investigated the problem by tabulating the sample data and discerned three states, Size, Start, and Limit, with progression as follows:
Size  Start  Limit  Output
10    1      10     1..10
10    1      9      1..9
...   ...
10    1      1      1
10    2      10     2..10
10    2      9      2..9
...   ...
10    2      2      2
...   ...    ...
10    10     10     10
The following recursive algorithm in pseudo code may do the trick:
printAllSets size
    printRows size 1
printRows size start
    print "{"
    printRow start size
    print "}"
    print CRLF
    if start <= size
        printRows size (start + 1)
printRow start limit
    if start <= limit
        print start + SPACE
        printRow start (limit - 1)
Hope this at least helps point you in the right direction.

I think we can solve this iteratively, which we can assume could also be converted to recursion, although it seems unnecessary. Consider that we can unrank any of the combinations given its index, using common knowledge. So all we need to do is count how many earlier combinations we are skipping and how many we need to unrank at each stage of the iteration (I may have missed something in the following but I think the general idea is sound):
Skip 0, unrank from `3 choose 3`
`2 choose 2` combinations
{1 , 2 , 3}
Skip 0, unrank from `3 choose 2`
`2 choose 1` combinations
{1 , 2}
{1 , 3}
Skip 0, unrank from `3 choose 1`
`2 choose 0` combinations
{1}
Skip `3 choose 2 - 2 choose 2`,
unrank from `3 choose 2`
`1 choose 1` combinations
{2 , 3}
Skip `3 choose 1 - 2 choose 1`,
unrank from `3 choose 1`
`1 choose 0` combinations
{2}
Skip `3 choose 1 - 1 choose 1`,
unrank from `3 choose 1`
`0 choose 0` combinations
{3}
Empty set
{}

By definition, the powerset of a set k, powerset k, is the set of all subsets of k, including the empty set. Clearly, when k is the empty set, powerset [] is simply the set containing the empty set, [ [] ]. Now, given the powerset of k, the powerset of k plus an additional element E, powerset (k+E), consists of all the sets without E, which is powerset k itself, plus those same sets with E added to each.
pseudo code...
let powerset k =
    match k with
    | [] -> [[]]
    | x:xs -> x * powerset xs + powerset xs
or with tail call equivalent performance
let powerset k =
    let (*) e ess = map (es -> e::es) ess
    reduce (e ess -> (e * ess) ++ ess) [ [] ] (reverse k)
....(In F#)
let rec powerset k =
    match k with
    | [] -> [ [] ]
    | x::xs ->
        let withoutE = powerset xs
        let withE = List.map (fun es -> x::es) withoutE
        List.append withE withoutE
or more succinctly
let rec powerset = function
    | [] -> [ [] ]
    | x::xs -> List.append (List.map (fun es -> x::es) (powerset xs)) (powerset xs)
A better version would allow for tail call optimization...which we achieved using common functional patterns:
let rec powerset2 k =
    let inline (++) a b = List.concat [a;b]
    let inline (+*) a bs = List.map (fun b -> a::b) bs
    List.fold
        (fun ac a -> (a +* ac) ++ ac)
        [ [] ]
        (List.rev k)
-- this all took me a while to rediscover. It was a fun little puzzle. :)

Efficient partial permutation sort in Julia

I am dealing with a problem that requires a partial permutation sort by magnitude in Julia. If x is a vector of dimension p, then what I need are the first k indices corresponding to the k components of x that would appear first in a partial sort by absolute value of x.
Refer to Julia's sorting functions here. Basically, I want a cross between sortperm and select!. When Julia 0.4 is released, I will be able to obtain the same answer by applying sortperm! (this function) to the vector of indices and choosing the first k of them. However, using sortperm! is not ideal here because it will sort the remaining p-k indices of x, which I do not need.
What would be the most memory-efficient way to do the partial permutation sort? I hacked a solution by looking at the sortperm source code. However, since I am not versed in the ordering modules that Julia uses there, I am not sure if my approach is intelligent.
One important detail: I can ignore repeats or ambiguities here. In other words, I do not care about the ordering by abs() of indices for two components 2 and -2. My actual code uses floating point values, so exact equality never occurs for practical purposes.
# initialize a vector for testing
x = [-3,-2,4,1,0,-1]
x2 = copy(x)
k = 3 # num components desired in partial sort
p = 6 # num components in x, x2
# what are the indices that sort x by magnitude?
indices = sortperm(x, by = abs, rev = true)
# now perform partial sort on x2
select!(x2, k, by = abs, rev = true)
# check if first k components are sorted here
# should evaluate to "true"
isequal(x2[1:k], x[indices[1:k]])
# now try my partial permutation sort
# I only need indices2[1:k] at end of day!
indices2 = [1:p]
select!(indices2, 1:k, 1, p, Base.Perm(Base.ord(isless, abs, true, Base.Forward), x))
# same result? should evaluate to "true"
isequal(indices2[1:k], indices[1:k])
EDIT: With the suggested code, we can briefly compare performance on much larger vectors:
p = 10000; k = 100; # asking for largest 1% of components
x = randn(p); x2 = copy(x);
# run following code twice for proper timing results
@time {indices = sortperm(x, by = abs, rev = true); indices[1:k]};
@time {indices2 = [1:p]; select!(indices2, 1:k, 1, p, Base.Perm(Base.ord(isless, abs, true, Base.Forward), x))};
@time selectperm(x,k);
My output:
elapsed time: 0.048876901 seconds (19792096 bytes allocated)
elapsed time: 0.007016534 seconds (2203688 bytes allocated)
elapsed time: 0.004471847 seconds (1657808 bytes allocated)

The following version appears to be relatively space-efficient because it uses only an integer array of the same length as the input array:
function selectperm(x, k)
    # select! needs a plain index, not a range, when k = 1
    if k > 1
        kk = 1:k
    else
        kk = 1
    end
    z = collect(1:length(x))
    return select!(z, kk, by = (i) -> abs(x[i]), rev = true)
end
x = [-3,-2,4,1,0,-1]
k = 3 # num components desired in partial sort
print(selectperm(x,k))
The output is:
[3,1,2]
... as expected.
I'm not sure if it uses less memory than the originally-proposed solution (though I suspect the memory usage is similar) but the code may be clearer and it does produce only the first k indices whereas the original solution produced all p indices.
(Edit)
selectperm() has been edited to deal with the BoundsError that occurs if k=1 in the call to select!().
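For comparison only, the same idea (select the k indices with the largest magnitude without fully sorting all p of them) can be sketched in Python with the standard library's `heapq.nlargest`. This is not Julia and says nothing about Julia's memory behaviour; `select_perm` is a hypothetical name, and the indices are 0-based.

```python
import heapq

def select_perm(x, k):
    """Return the 0-based indices of the k largest-magnitude entries of x, largest first."""
    return heapq.nlargest(k, range(len(x)), key=lambda i: abs(x[i]))

x = [-3, -2, 4, 1, 0, -1]
top = select_perm(x, 3)
print(top)
```

Adding 1 to each index gives the 1-based result [3,1,2] shown above.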

finding primes very slow in F#

I have answered Project Euler Question 7 very easily using Sieve of Eratosthenes in C and I had no problem with it.
I am still quite new to F# so I tried implementing the same technique
let prime_at pos =
    let rec loop f l =
        match f with
        | x::xs -> loop xs (l |> List.filter(fun i -> i % x <> 0 || i = x))
        | _ -> l
    List.nth (loop [2..pos] [2..pos*pos]) (pos-1)
which works well when pos < 1000, but will crash at 10000 with out of memory exception
I then tried changing the algorithm to
let isPrime n = n > 1 && seq { for f in [2..n/2] do yield f } |> Seq.forall(fun i -> n % i <> 0)
seq {for i in 2..(10000 * 10000) do if isPrime i then yield i} |> Seq.nth 10000 |> Dump
which runs successfully but still takes a few minutes.
If I understand correctly the first algorithm is tail optimized so why does it crash? And how can I write an algorithm that runs under 1 minute (I have a fast computer)?

Looking at your first attempt
let prime_at pos =
    let rec loop f l =
        match f with
        | x::xs -> loop xs (l |> List.filter(fun i -> i % x <> 0 || i = x))
        | _ -> l
    List.nth (loop [2..pos] [2..pos*pos]) (pos-1)
At each loop iteration, you are iterating over the list and creating a new one. This is slow because list creation is expensive and you get little benefit from the cache. Several obvious optimisations, such as leaving the even numbers out of the factor list, are missing. When pos = 10 000 you are trying to create a list that will occupy 10 000 * 10 000 * 4 = 400MB for the integers alone and a further 800MB for pointers (F# lists are linked lists). Furthermore, as each list element takes up a very small amount of memory, there will probably be significant per-element overhead, for example from the GC. In the function you then create a new list of similar size. As a result, I am not surprised that this causes an OutOfMemoryException.
Looking at the second example,
let isPrime n =
    n > 1 &&
    seq { for f in [2..n/2] do yield f }
    |> Seq.forall(fun i -> n % i <> 0)
Here, the problem is pretty similar as you are generating giant lists for each element you are testing.
I have written a quite fast F# sieve here https://stackoverflow.com/a/12014908/124259 which shows how to do this faster.
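For readers who do not follow the link, the general shape of an array-based sieve can be sketched as below. This is Python and is not the linked F# code; `nth_prime` is a hypothetical name. It relies on the known bound that, for n >= 6, the n-th prime is smaller than n * (ln n + ln ln n), so a flat byte array of that size is enough and no lists are rebuilt.

```python
import math

def nth_prime(n):
    """Return the n-th prime (1-based) using a Sieve of Eratosthenes over a flat byte array."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n < 6:
        return [2, 3, 5, 7, 11][n - 1]
    # Upper bound for the n-th prime, valid for n >= 6.
    limit = int(n * (math.log(n) + math.log(math.log(n)))) + 1
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples of i in place, starting at i * i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    count = 0
    for value, flag in enumerate(is_prime):
        if flag:
            count += 1
            if count == n:
                return value
    raise ValueError("upper bound was too small")

print(nth_prime(10001))
```

The single mutable array is the main difference from the list-based attempt: crossing out multiples touches existing memory instead of allocating a new list per factor.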

As already mentioned by John, your implementation is slow because it generates some temporary data structures.
In the first case, you are building a list, which needs to be fully created in memory and that introduces significant overhead.
In the second case, you are building a lazy sequence, which does not consume memory (because it is built while it is being iterated), but it still introduces indirection that slows the algorithm down.
In most cases in F#, people tend to prefer readability and so using sequences is a nice way to write the code, but here you probably care more about performance, so I'd avoid sequences. If you want to keep the same structure of your code, you can rewrite isPrime like this:
let isPrime n =
    let rec nonDivisible by =
        if by = 1 then true // Return 'true' if we reached the end
        elif n%by = 0 then false // Return 'false' if there is a divisor
        else nonDivisible (by - 1) // Otherwise continue looping
    n > 1 && nonDivisible (n/2)
This just replaces the sequence and forall with a recursive function nonDivisible that returns true when the number n is not divisible by any number between 2 and n/2. The function first checks the two termination cases and otherwise performs a recursive call.
With the original implementation, I'm able to find 1000th prime in 1.5sec and with the new one, it takes 22ms. Finding 10000th prime with the new implementation takes 3.2sec on my machine.

Algorithm for combining different age groups together based on their values

Let's say we have an array of age groups and an array of the number of people in each age group
For example:
Ages = ("1-13", "14-20", "21-30", "31-40", "41-50", "51+")
People = (1, 10, 21, 3, 2, 1)
I want an algorithm that combines these age groups, using the following logic, whenever a group has fewer than 5 people. The algorithm that I have so far does the following:
Start from the last element (e.g., "51+"). Can you combine it with the next group (here "41-50")? If yes, add the numbers (1+2) and combine their labels. So we get the following:
Ages = ("1-13", "14-20", "21-30", "31-40", "41+")
People = (1, 10, 21, 3, 3)
Take the last one again (here is "41+"). Can you combine it with the next group (31-40)? the answer is yes so we get:
Ages = ("1-13", "14-20", "21-30", "31+")
People = (1, 10, 21, 6)
since the group 31+ now has 6 members we cannot collapse it into the next group.
we cannot collapse "21-30" into the next one "14-20" either
"14-20" also has 10 people (>5) so we don't do anything on this either
for the first one ("1-13") since we have only one person and it is the last group we combine it with the next group "14-20" and get the following
Ages = ("1-20", "21-30", "31+")
People = (11, 21, 6)
I have an implementation of this algorithm that uses many flags to keep track of whether or not any data is changed and it makes a number of passes on the two arrays to finish this task.
My question is if you know any efficient way of doing the same thing? any data structure that can help? any algorithm that can help me do the same thing without doing too much bookkeeping would be great.
Update:
A radical example would be (5,1,5)
in the first pass it becomes (5,6) [collapsing the one on the right into the one in the middle]
Then we have (5,6). We cannot touch 6 since it is larger than our threshold of 5, so we go to the next one (the leftmost element, 5). Since it is less than or equal to 5 and it is the last one on the left, we group it with the one on its right, so we finally get (11).

Here is an OCaml solution of a left-to-right merge algorithm:
let close_group acc cur_count cur_names =
  (List.rev cur_names, cur_count) :: acc
let merge_small_groups mini l =
  let acc, cur_count, cur_names =
    List.fold_left (
      fun (acc, cur_count, cur_names) (name, count) ->
        if cur_count <= mini || count <= mini then
          (acc, cur_count + count, name :: cur_names)
        else
          (close_group acc cur_count cur_names, count, [name])
    ) ([], 0, []) l
  in
  List.rev (close_group acc cur_count cur_names)
let input = [
  "1-13", 1;
  "14-20", 10;
  "21-30", 21;
  "31-40", 3;
  "41-50", 2;
  "51+", 1
]
let output = merge_small_groups 5 input
(* output = [(["1-13"; "14-20"], 11); (["21-30"; "31-40"; "41-50"; "51+"], 27)] *)
As you can see, the result of merging from left to right may not be what you want.
Depending on the goal, it may make more sense to merge the pair of consecutive elements whose sum is smallest and iterate until all counts are above the minimum of 5.
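The alternative mentioned in the last sentence could look roughly like this. It is a Python sketch rather than OCaml, `merge_smallest_pairs` is a hypothetical name, and it assumes that a group counts as too small when its count is below the minimum.

```python
def merge_smallest_pairs(groups, minimum):
    """Merge adjacent groups until every count reaches `minimum` or one group is left.

    `groups` is a list of (label, count) pairs; the result pairs a list of labels with a count.
    """
    merged = [([name], count) for name, count in groups]
    while len(merged) > 1 and any(count < minimum for _, count in merged):
        # Pick the pair of neighbours whose combined count is smallest.
        i = min(range(len(merged) - 1),
                key=lambda j: merged[j][1] + merged[j + 1][1])
        (names_a, count_a), (names_b, count_b) = merged[i], merged[i + 1]
        merged[i:i + 2] = [(names_a + names_b, count_a + count_b)]
    return merged

groups = [("1-13", 1), ("14-20", 10), ("21-30", 21),
          ("31-40", 3), ("41-50", 2), ("51+", 1)]
result = merge_smallest_pairs(groups, 5)
print(result)
```

On the sample data this happens to produce the grouping from the question: 1-20 with 11 people, 21-30 with 21, and 31+ with 6.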

Here is my scala approach.
We start with two lists:
val people = List (1, 10, 21, 3, 2, 1)
val ages = List ("1-13", "14-20", "21-30", "31-40", "41-50", "51+")
and combine them to a kind of mapping:
val agegroup = ages.zip (people)
define a method to merge two Strings, describing an (open ended) interval. The first parameter is, if any, the one with the + in "51+".
/**
  combine age-strings
  a+ b-c => b+
  a-b c-d => c-b
*/
def merge (xs: String, ys: String) = {
  val xab = xs.split ("[+-]")
  val yab = ys.split ("-")
  if (xs.contains ("+")) yab(0) + "+" else
    yab (0) + "-" + xab (1)
}
Here is the real work:
/**
  reverse the list, combine groups < threshold.
*/
def remap (map: List [(String, Int)], threshold : Int) = {
  def remap (mappings: List [(String, Int)]) : List [(String, Int)] = mappings match {
    case Nil => Nil
    case x :: Nil => x :: Nil
    case x :: y :: xs => if (x._2 > threshold) x :: remap (y :: xs) else
      remap ((merge (x._1, y._1), x._2 + y._2) :: xs) }
  val nearly = (remap (map.reverse)).reverse
  // check for first element
  if (! nearly.isEmpty && nearly.length > 1 && nearly (0)._2 < threshold) {
    val a = nearly (0)
    val b = nearly (1)
    val rest = nearly.tail.tail
    (merge (b._1, a._1), a._2 + b._2) :: rest
  } else nearly
}
and invocation
println (remap (agegroup, 5))
with result:
scala> println (remap (agegroup, 5))
List((1-20,11), (21-30,21), (31+,6))
The result is a list of pairs, age-group and membercount.
I guess the main part is easy to understand: there are 3 basic cases: an empty list, which can't be grouped; a list of one group, which is the solution itself; and a list of more than one element.
If the first element (I reverse the list in the beginning, to start with the end) is bigger than 5 (6, whatever), yield it and proceed with the rest. If not, combine it with the second, then call the function recursively on this combined element followed by the rest.
If 2 elements get combined, the merge method for the strings is called.
The map is remapped after reversing it, and the result is reversed again. Now the first element has to be inspected and possibly combined.
We're done.

I think a good data structure would be a linked list of pairs, where each pair contains the age span and the count. Using that, you can easily walk the list, and join two pairs in O(1).

Filter elements in a list by length - Ocaml

I have the following list:
["A";"AA";"ABC";"BCD";"B";"C"]
I am randomly extracting an element from the list, but the element I extract must be of length 3, not shorter.
I am trying to do this as follows:
let randomnum = (Random.int(List.length (list)));;
let rec code c =
  if (String.length c) = 3 then c
  else (code ((List.nth (list) (randomnum)))) ;;
print_string (code ( (List.nth (list) (randomnum)))) ;;
This works fine if randomly a string of length 3 is picked out from the list.
But the program does not terminate if a string of length < 3 is picked up.
I am trying to do a recursive call so that new code keeps getting picked up till we get one of length = 3.
I am unable to figure out why this does not terminate. Nothing gets output by the print statement.

What you probably want to write is
let rec code list =
  let n = Random.int (List.length list) in
  let s = List.nth list n in
  if String.length s < 3 then code list else s
Note that, depending on the size of the list and the number of strings of length at least 3, you might want to work directly on a list containing only those strings:
let code list =
  let list = List.filter (fun s -> String.length s >= 3) list in
  match list with
  | [] -> raise Not_found
  | _ -> List.nth list (Random.int (List.length list))
This second function is better, as it always terminates, even when there are no strings of length at least 3.

You only pick a random number once. Say you pick 5. You just keep recursing with 5 over and over and over. You need to get a new random number.
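A minimal sketch of that fix, in Python rather than OCaml: the index is drawn again inside the loop, so every attempt looks at a fresh element. `pick_with_length` is a hypothetical name, and the up-front check is an added assumption so that the loop cannot run forever when no element qualifies.

```python
import random

def pick_with_length(items, length, rng):
    """Draw random elements until one has the wanted length."""
    if not any(len(s) == length for s in items):
        raise ValueError("no element of the requested length")
    while True:
        candidate = items[rng.randrange(len(items))]  # fresh index on every attempt
        if len(candidate) == length:
            return candidate

words = ["A", "AA", "ABC", "BCD", "B", "C"]
picked = pick_with_length(words, 3, random.Random(0))
print(picked)
```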

For your code to terminate, it would be better to first filter the list for suitable elements, then take your random number:
let code list =
  let suitables = List.filter (fun x -> String.length x = 3) list in
  match List.length suitables with
  | 0 -> raise Not_found (* no suitable elements at all! *)
  | len -> List.nth suitables (Random.int len)
Otherwise your code would take very long to terminate on a large list of elements with size <> 3; or worse on a list with no element of size 3, it would not terminate at all!
