How to combine two generators in a non-trivial way - scheme

I have a generator which produces all positive integers that are powers of 2, and another which produces all integers that are powers of 3. I now need to use those to produce integers of the form 2^i*3^j, where i, j >= 0, in increasing order.
The point of using generators is to reduce memory consumption, I think. I have been trying to do this for a while now to no avail. Please help out.

Using a self-reading stream
You can solve this using a self-reading stream:
   -----------        -----------
   |  pow 2  |------->|         |
   -----------        |         |
                      |  merge  |-------+------------>
   -----------        |         |       |
.->|   x 3   |------->|         |       |
|  -----------        -----------       |
\_______________________________________/
The first stream produces the powers of two,
while the second one ensures all the generated numbers
are multiplied by 3 and reinjected into the output.
The merge operator ensures that the output is sorted.
Note that we must "seed" the output stream with 1,
or the first element will try to produce itself when evaluated.
Here is the code:
(require srfi/41)
(define (merge s1 s2)
  (stream-match s1
    ((x . xs)
     (stream-match s2
       ((y . ys)
        (if (< x y)
            (stream-cons x (merge xs s2))
            (stream-cons y (merge ys s1))))))))
(define (the-stream)
  (letrec ((s
            (stream-cons 1 (merge (stream-map (lambda (x) (* 3 x)) s)
                                  (stream-iterate (lambda (x) (* 2 x)) 2)))))
    s))
It's quite simple and fast compared to my other proposal,
because it uses arithmetic properties of the problem besides monotonicity.
Correction: I was wrong about that, it can be generalized just as well (upcoming).
$ mzscheme -f feedback.scm -e '(display (stream->list (stream-take 20 (the-stream))))'
(1 2 3 4 6 8 9 12 16 18 24 27 32 36 48 54 64 72 81 96)
$ time mzscheme -f feedback.scm -e '(display (stream-ref (the-stream) 10000))'
161968247347450370721577384417107686788864605658546176
real 0m1.746s
user 0m1.344s
sys 0m0.156s
Using generators and a queue
We can also implement this with python's generators,
but we need to use a queue to store the numbers waiting in the feedback loop:
# Merge the output of two generators
def merge(g1, g2):
    v1 = next(g1)
    v2 = next(g2)
    while True:
        if v1 < v2:
            yield v1
            v1 = next(g1)
        else:
            yield v2
            v2 = next(g2)
# Generates the powers of 2, starting with n
def pow2(n):
    while True:
        yield n
        n *= 2
# Generates values shifted from the given 'q' and multiplied by 3
def mul3(q):
    while True:
        yield q.pop(0) * 3
# The generator we want
def pow23():
    q = []
    v = 1
    g = merge(pow2(2), mul3(q))
    while True:
        yield v
        q.append(v)
        v = next(g)
g23 = pow23()
for i in range(10000):
    next(g23)
print(next(g23))
This is somewhat less elegant (IMHO),
but the generators are much more lightweight:
$ time python feedback.py
161968247347450370721577384417107686788864605658546176
real 0m0.150s
user 0m0.112s
sys 0m0.012s
For what it's worth, I have done a Scheme implementation
(using closures as generators)
which shows roughly the same performance.

I don't know much about generators,
however I can propose a solution based on streams (lazily constructed,
possibly infinite lists), which are somewhat similar.
My approach would be to create a stream
whose "state" itself would be a stream of streams.
The individual, inner streams of numbers,
let's call them the 3-streams,
would represent lists of the successive powers of 3, starting with 1,
multiplied by a given power of two.
We can then assemble an infinity of such 3-streams,
one for each successive power of 2, starting with 1.
Let's call this the 2-stream.
The initial state, in ascii-art, is this:
---------------------- --- -- -
| The 2-stream ...
--|----|----|----|---- --- -- -
V V V V
|1| | 2| | 4| | 8|
|3| | 6| |12| |24| ...
|9| |18| |36| |72| The 3-streams
: : : :
Now, we're going to manipulate this so that at any moment,
the 3-streams will be ordered within the 2-stream
with regards to their first elements.
As a consequence the next smallest generated number
will always be the first element of the first 3-stream.
So, to get the next number in the sequence you wish to obtain,
we're going to pull out the first 3-stream,
pull out its first element (which is the number we're interested in),
and then re-insert the 3-stream in the 2-stream
at a position determined by its new first element.
The new state after the first number (1) has been extracted would be:
---------------------- --- -- -
| The 2-stream ...
---|----|----|----|---- --- -- -
V V V V
| 2| | 3| | 4| | 8|
| 6| | 9| |12| |24| ...
|18| |27| |36| |72| The 3-streams
: : : :
Note that this method does not depend on 2^i, 3^j or multiplication specifically
(just on 2^i * 3^j being monotonically increasing with i and j).
I have posted another answer which does, and
is much simpler and faster as a result.
(Edit: don't trust me on that, it has nothing to do with the math.)
Below is an example implementation, using SRFI-41 streams:
(require srfi/41)
; Geometric sequence with initial value 'init', and ratio 'r'
(define (make-geoseq init r)
  (stream-cons
   init
   (make-geoseq (* r init) r)))
; Your power generators
(define pow2 (make-geoseq 1 2))
(define pow3 (make-geoseq 1 3))
; Construct a 3-stream from the pow3 sequence
(define (make-3stream mult)
  (stream-map (lambda (x) (* mult x)) pow3))
; Construct the (initial) 2-stream from the pow2 sequence
(define initial-2stream
  (stream-map make-3stream pow2))
; Insert a modified 3-stream into the given 2-stream, at the right position
(define (insert two-stream three-stream)
  (if (< (stream-car three-stream)
         (stream-car (stream-car two-stream)))
      ; we have the smallest 3-stream, put it at the front
      (stream-cons
       three-stream
       two-stream)
      ; otherwise, recurse
      (stream-cons
       (stream-car two-stream)
       (insert (stream-cdr two-stream) three-stream))))
; Construct a 2^n * 3^p stream with the given 2-stream as its "state"
(define (make-the-stream current-2stream)
  (let*
      ; pull out the first 3-stream
      ((first-3s (stream-car current-2stream))
       (other-3s (stream-cdr current-2stream))
       ; use its first element as our next value
       (next-val (stream-car first-3s))
       ; reinsert its tail into the 2-stream's tail
       (next-2s (insert other-3s (stream-cdr first-3s))))
    ; and use the resulting 2-stream to construct the (outer) stream's tail
    (stream-cons
     next-val
     (make-the-stream next-2s))))
; Now, we can construct the stream we want
(define the-stream (make-the-stream initial-2stream))
Using plt-scheme (on my rather crappy hardware):
$ mzscheme -f pow23.scm -e '(display (stream->list (stream-take 20 the-stream)))'
(1 2 3 4 6 8 9 12 16 18 24 27 32 36 48 54 64 72 81 96)
$ time mzscheme -f pow23.scm -e '(display (stream-ref the-stream 10000))'
161968247347450370721577384417107686788864605658546176
real 0m12.550s
user 0m11.005s
sys 0m0.340s
Implementing this with generators could be done I guess,
but the tricky part would be implementing (insert).
You could do so by composing generators,
but you would end up adding one "layer" every time a number is pulled,
whereas a stream created with (insert) shares its tail with the original one
(the "layers" eventually merge).
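For what it's worth, here is one way the same idea might look with a Python generator. This is only a rough sketch, not a port of the stream code: instead of re-inserting whole 3-streams, it keeps just the head of every opened 3-stream in a heap (heapq), and opens the next 3-stream only when the head of the newest one has been consumed. The name pow23_rows and the boolean flag are assumptions of this sketch.

```python
import heapq
from itertools import islice


def pow23_rows():
    """Yield the numbers 2^i * 3^j in increasing order.

    Each heap entry is (value, is_head_of_newest_row). A "row" plays the
    role of one 3-stream: 2^i, 2^i * 3, 2^i * 9, ...
    """
    heap = [(1, True)]
    while True:
        value, newest = heapq.heappop(heap)
        yield value
        # Advance this 3-stream: its next element is the current one times 3.
        heapq.heappush(heap, (value * 3, False))
        if newest:
            # We just consumed 2^i, so lazily open the row starting at 2^(i+1).
            heapq.heappush(heap, (value * 2, True))


print(list(islice(pow23_rows(), 20)))
```

Every number is produced by exactly one row, so no duplicate filtering is needed, and the heap only ever holds one entry per opened row.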

Just merge the two ordered lists a la
(define merge
  (lambda (pred ls1 ls2)
    (cond
      [(null? ls1) ls2]
      [(null? ls2) ls1]
      [(pred (car ls1) (car ls2))
       (cons (car ls1) (merge pred (cdr ls1) ls2))]
      [else (cons (car ls2) (merge pred ls1 (cdr ls2)))])))
lifted from here.

The simple solution w/o any examples is creating a new one.
for (i = 0; i < X; i++)
{
if (i%2 or i%3)
{
cout << i
}
}
edit: X is how long you want to run it say you want output 0-100 put 100.
int counter = 1000;
bool done = false;
while(!done)
{
if (i%2 or i%3)
{
cout << i;
counter--;
if(counter <= 1)
{
done = true;
}
}
i++;
}
It's a little messy but should work.
edit: The counter should end at 1 or it will give you 1001 items.

At least if I understand your question, you just need to merge the results from the two generators:
1. Generate an output from each generator.
2. Produce the smaller of the two as the next output.
3. Generate the next output from that generator.
4. Go back to Step 2.
If the two generators produce equal values, produce that as the output, and generate the next value from each generator.
Note that although it's typically used for sorting existing data instead of generating new data, this is similar to the merge used in a normal merge sort, with the exception that I've assumed you don't want duplicates, where a merge sort normally retains duplicates.
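The steps above can be sketched in Python roughly as follows. The helper powers is a stand-in for the question's two generators and merge_unique is a made-up name. Note that this only interleaves the two power sequences, so products such as 6 = 2*3 never appear.

```python
from itertools import islice


def powers(base):
    """Yield base**0, base**1, base**2, ... (stand-in for the given generators)."""
    value = 1
    while True:
        yield value
        value *= base


def merge_unique(g1, g2):
    """Merge two increasing infinite generators, emitting equal values once."""
    v1, v2 = next(g1), next(g2)
    while True:
        if v1 < v2:
            yield v1
            v1 = next(g1)
        elif v2 < v1:
            yield v2
            v2 = next(g2)
        else:
            # Equal heads: output once and advance both generators.
            yield v1
            v1, v2 = next(g1), next(g2)


print(list(islice(merge_unique(powers(2), powers(3)), 8)))
```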
Edit: Thanks to lpthnc, I've reread the question, and I think he's right -- I misread the original question. To get the correct output, you'd need to create a third generator that produces the multiples of (in this case) six, and use a three-way merge between that result set and those from the other two generators.
I haven't played with it much, but I believe the Lazy language level (or lazy module) in recent iterations of PLT Scheme would let you write your code to generate the entire infinite sequence, which would theoretically use infinite time and memory, but only evaluate a finite subset of that as needed.

This is pretty easy in Haskell:
merge as bs =
  case (as, bs) of
    ([], _) -> bs
    (_, []) -> as
    ((a:as'), (b:bs')) ->
      if a <= b
        then a : (merge as' bs)
        else b : (merge as bs')
rmDups as =
  case as of
    [] -> []
    [a] -> [a]
    (a:bs@(b:_)) ->
      if a == b
        then rmDups bs
        else a:(rmDups bs)
take 25 $ rmDups $ merge (map (2^) [1..]) (map (3^) [1..])
yields the following:
[2,3,4,8,9,16,27,32,64,81,128,243,256,512,729,1024,2048,2187,4096,6561,8192,16384,19683,32768,59049]
though I imagine there's a more elegant way to do it...

Redacted. The more I look at this, the more I think I've got it all wrong - and others appear to have better answers, already.
Sorry, none of this is in scheme, just pseudocode...
The following code matches the thought process I garner from your question:
EDIT: revised pseudocode now that I realize it's "2^i*3^j", not "2^i, 3^j"
// If i got close, this time,
// inputs min-i=0, max-i=2, min-j=0, max-j=2
// should get output like
// 2^0 * 3^0 = 1
// 2^0 * 3^1 = 3
// 2^0 * 3^2 = 9
// 2^1 * 3^0 = 2
// 2^1 * 3^1 = 6
// 2^1 * 3^2 = 18
// 2^2 * 3^0 = 4
// 2^2 * 3^1 = 12
// 2^2 * 3^2 = 36
LET min-i, max-i, min-j, max-j be input
LET pow2 = 2^min-i
FOR i = min-i to max-i DO
    LET current-value = pow2 * 3^min-j
    FOR j = min-j to max-j DO
        PRINT "2^" . i . " * 3^" . j . " = " . current-value
        current-value *= 3
    DONE // end j loop
    pow2 *= 2
DONE // end i loop

Related

SICP solution to Fibonacci, set `a + b = a`, why not `a + b = b`?

I am reading Tree Recursion of SICP, where fib was computed by a linear recursion.
We can also formulate an iterative process for computing the
Fibonacci numbers. The idea is to use a pair of integers a and b,
initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the
simultaneous transformations
a <- a + b
b <- a
It is not hard to show that, after applying this transformation n
times, a and b will be equal, respectively, to Fib(n + 1) and Fib(n).
Thus, we can compute Fibonacci numbers iteratively using the procedure
(rewritten in Emacs Lisp instead of Scheme)
#+begin_src emacs-lisp :session sicp
(defun fib-iter (a b count)
  (if (= count 0)
      b
    (fib-iter (+ a b) a (- count 1))))
(defun fib (n)
  (fib-iter 1 0 n))
(fib 4)
#+end_src
"Set a + b = a and b = a", it's hard to wrap my mind around it.
The general idea for finding a Fibonacci number is simple:
Suppose we have a completed table of Fibonacci numbers; we find Fib(X) by stepping through the table one entry at a time from 0 to X.
The authors' solution is barely intuitive to me.
It seems more reasonable to set a + b = b, a = b:
(defun fib-iter (a b count)
  (if (= count 0)
      a
    (fib-iter b (+ a b) (- count 1))))
(defun fib (n)
  (fib-iter 0 1 n))
So, the authors' setting seems no more than just anti-intuitively placing b in the head with no special purpose.
However, I surely acknowledge that SICP deserves digging deeper and deeper.
What key points am I missing? Why set a + b = a rather than a + b = b?
As far as I can see, your problem is that you don't like that the order of the arguments to fib-iter is not what you think it should be. The answer is that the order of arguments to functions is very often simply arbitrary and/or conventional: it's a choice made by the person writing the function. It does not matter to anyone but the person reading or writing the code: it's a stylistic choice. It doesn't particularly seem more intuitive to me to have fib defined as
(define (fib n)
  (fib-iter 1 0 n))
(define (fib-iter next current n)
  (if (zero? n)
      current
      (fib-iter (+ next current) next (- n 1))))
Rather than
(define (fib n)
  (fib-iter 0 1 n))
(define (fib-iter current next n)
  (if (zero? n)
      current
      (fib-iter next (+ next current) (- n 1))))
There are instances where this isn't true. For instance, Standard Lisp (warning, PDF link) defined mapcar so that the list being mapped over was the first argument, with the function being mapped the second. This means you can't extend it in the way it has been extended for more recent dialects, so that it takes any positive number of lists with the function being applied to the corresponding elements of all the lists.
Similarly, I think it would be extremely unintuitive to define the arguments of - or / the other way around.
But in many, many cases it's just a matter of making a choice and sticking to it.
The recurrence is given in an imperative form. For instance, in Common Lisp, we could use parallel assignment in the body of a loop:
(psetf a (+ a b)
b a)
To reduce confusion, we should think about this functionally and give the old and new variables different names:
a = a' + b'
b = a'
This is no longer an assignment but a pair of equalities; we are justified in using the ordinary "=" operator of mathematics instead of the assignment arrow.
The linear recursion does this implicitly, because it avoids assignment. The value of the expression (+ a b) is passed as the parameter a. But that's a fresh instance of a in new scope which uses the same name, not an assignment; the binding just induces the two to be equivalent.
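To make the two conventions concrete, here is a small Python sketch (the function names are mine, not SICP's). Python's tuple assignment evaluates the whole right-hand side first, so it behaves like the simultaneous transformation: both new values are computed from the old a and b.

```python
def fib_a_leading(n):
    """SICP's convention: a runs one step ahead of b, and b is the answer."""
    a, b = 1, 0  # a = Fib(1), b = Fib(0)
    for _ in range(n):
        a, b = a + b, a  # new a = old a + old b, new b = old a
    return b


def fib_b_leading(n):
    """The asker's convention: b runs one step ahead of a, and a is the answer."""
    a, b = 0, 1  # a = Fib(0), b = Fib(1)
    for _ in range(n):
        a, b = b, a + b  # new a = old b, new b = old a + old b
    return a


print([fib_a_leading(n) for n in range(8)])
print([fib_b_leading(n) for n in range(8)])
```

Both loops walk the same two-number window along the sequence; they differ only in which name is attached to the leading entry.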
We can see it also like this with the help of a "Fibonacci slide rule":
1  1  2  3  5  8  13
----------------------------- <-- sliding interface
         b' a'
            b  a
As we calculate the sequence, there is a two-number window whose entries we are calling a and b, which slides along the sequence. You can read the equalities at any position directly off the slide rule: look, b = a' = 5 and a = b' + a' = 8.
You may be confused by a referring to the higher position in the sequence. You might be thinking of this labeling:
1  1  2  3  5  8  13
------------------------
         a' b'
            a  b
Indeed, under this naming arrangement, now we have b = a' + b', as you expect, and a = b'.
It's just a matter of which variable is designated as the leading one farther along the sequence, and which is the trailing one.
The "a is leading" convention comes from the idea that a is before b in the alphabet, and so it receives the newer "updates" from the sequence first, which then pass off to b.
This may seem counterintuitive, but such a pattern appears elsewhere in mathematics, such as convolution of functions.

Iterative tree calculation in scheme

I'm trying to implement a function defined as such:
f(n) = n if n < 4
f(n) = f(n - 1) + 2f(n - 2) + 3f(n - 3) + 4f(n - 4) if n >= 4
The iterative way to do this would be to start at the bottom until I hit n, so if n = 6:
f(4) = (3) + 2(2) + 3(1) + 4(0) | 10
f(5) = f(4) + 2(3) + 3(2) + 4(1) | 10 + 16 = 26
f(6) = f(5) + 2f(4) + 3(3) + 4(2) | 26 + 2(10) + 17 = 63
Implementation attempt:
; m1...m4 | The results of the previous calculations (eg. f(n-1), f(n-2), etc.)
; result | The result thus far
; counter | The current iteration of the loop--starts at 4 and ends at n
(define (fourf-iter n)
  (cond [(< n 4) n]
        [else
         (define (helper m1 m2 m3 m4 result counter)
           (cond [(= counter n) result]
                 [(helper result m1 m2 m3 (+ result m1 (* 2 m2) (* 3 m3) (* 4 m4)) (+ counter 1))]))
         (helper 3 2 1 0 10 4)]))
Several problems:
The returned result is one iteration less than what it's supposed to be, because the actual calculations don't take place until the recursive call
Instead of using the defined algorithm to calculate f(4), I'm just putting it right in there that f(4) = 10
Ideally I want to start result at 0 and counter at 3 so that the calculations are applied to m1 through m4 (and so that f(4) will actually be calculated out instead of being preset), but then 0 gets used for m1 in the next iteration when it should be the result of f(4) instead (10)
tl;dr either the result calculation is delayed, or the result itself is delayed. How can I write this properly?
I think the appropriately "Scheme-ish" way to write a function that's defined recursively like that is to use memoization. If a function f is memoized, then when you call f(4) first it looks up 4 in a key-value table and if it finds it, returns the stored value. Otherwise, it simply calculates normally and then stores whatever it calculates in the table. Therefore, f will never evaluate the same computation twice. This is similar to the pattern of making an array of size n and filling in values starting from 0, building up a solution for n. That method is called dynamic programming, and memoization and dynamic programming are really different ways of looking at the same optimization strategy - avoiding computing the same thing twice. Here's a simple Racket function memo that takes a function and returns a memoized version of it:
(define (memo f)
  (let ([table (make-hash)])
    (lambda args
      (hash-ref! table
                 args
                 (thunk (apply f args))))))
Now, we can write your function f recursively without having to worry about the performance problems of ever calculating the same result twice, thus going from an exponential time algorithm down to a linear one while keeping the implementation straightforward:
(define f
  (memo
   (lambda (n)
     (if (< n 4)
         n
         (+ (f (- n 1))
            (* 2 (f (- n 2)))
            (* 3 (f (- n 3)))
            (* 4 (f (- n 4))))))))
Note that as long as the function f exists, it will keep in memory a table containing the result of every time it's ever been called.
If you want a properly tail-recursive solution, your best approach is probably to use the named let construct. If you do (let name ([id val] ...) body ...) then calling (name val ...) anywhere in body ... will jump back to the beginning of the let with the new values val ... for the bindings. An example:
(define (display-n string n)
  (let loop ([i 0])
    (when (< i n)
      (display string)
      (loop (add1 i)))))
Using this makes a tail-recursive solution for your problem much less wordy than defining a helper function and calling it:
(define (f n)
  (if (< n 4)
      n
      (let loop ([a 3] [b 2] [c 1] [d 0] [i 4])
        (if (<= i n)
            (loop (fn+1 a b c d) a b c (add1 i))
            a))))
(define (fn+1 a b c d)
  (+ a (* 2 b) (* 3 c) (* 4 d)))
This version of the function keeps track of four values for f, then uses them to compute the next value and ditches the oldest value. This builds up a solution while only keeping four values in memory, and it doesn't keep a huge table stored between calls. The fn+1 helper function is for combining the four previous results of the function into the next result, it's just there for readability. This might be a function to use if you want to optimize for memory usage. Using the memoized version has two advantages however:
The memoized version is much easier to understand, the recursive logic is preserved.
The memoized version stores results between calls, so if you call f(10) and then f(4), the second call will only be a table lookup in constant time because calling f(10) stored all the results for calling f with n from 0 to 10.
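For comparison, both strategies can be sketched in Python. This is only an illustration under my own naming (f_memo, f_iter); functools.lru_cache plays the role of the memo table and a four-variable window plays the role of the named let.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f_memo(n):
    """Recursive definition, with results cached between calls."""
    if n < 4:
        return n
    return f_memo(n - 1) + 2 * f_memo(n - 2) + 3 * f_memo(n - 3) + 4 * f_memo(n - 4)


def f_iter(n):
    """Iterative version that keeps only the last four values."""
    if n < 4:
        return n
    a, b, c, d = 3, 2, 1, 0  # f(3), f(2), f(1), f(0)
    for _ in range(4, n + 1):
        # Compute the next value and drop the oldest one.
        a, b, c, d = a + 2 * b + 3 * c + 4 * d, a, b, c
    return a


print(f_iter(4), f_iter(5), f_iter(6))  # 10 26 63, as in the question's table
```

The recursive form will hit Python's recursion limit for large n on a cold cache, which is one more practical argument for the windowed loop there.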

Implementing Radix Sort in SML

I am trying to implement radix sort in SML via a series of helper functions. The helper function I am having trouble with is called sort_nth_digit, it takes a digit place to be sorted by and a list to sort (n and L respectively). The way I am doing this is to find the first two elements of the list (for now we can assume there are at least 3), compare them by digit n, then concatenating them back onto the list in the proper order. The list should be sorted in ascending order. Now, the problem: The function compiles but I get the following:
HW4.sml:40.5-44.30 Warning: match nonexhaustive
(0,L) => ...
(n,nil) => ...
(n,a :: b :: L) => ...
val sort_nth_digit = fn : int -> int list -> int list
Additionally, when you pass arguments, you don't get an answer back which I believe indicates infinite recursion?
Q:How is the match nonexhaustive and why am I recursing infinitely:
fun sort_nth_digit 0 L = []
  | sort_nth_digit n [] = []
  | sort_nth_digit n (a::b::L) = if ((nth_digit a n) < (nth_digit b n)) then a::b::(sort_nth_digit n L)
                                 else
                                   b::a::(sort_nth_digit n L)
Thanks for the help in advance! (*My first post on stackoverflow ^.^ *)
Nonexhaustive match fix:
fun sort_nth_digit 0 L = []
  | sort_nth_digit n [] = []
  | sort_nth_digit n (a::[]) = a::[]
  | sort_nth_digit n (a::b::L) = if ((nth_digit a n) < (nth_digit b n)) then a::b::(sort_nth_digit n L)
                                 else
                                   b::a::(sort_nth_digit n L)
Input that results in no output, console just sits at this line:
- sort_nth_digit 1 [333,222,444,555,666,444,333,222,999];
Code for nth_digit & anonymous helper pow:
fun nth_digit x 0 = 0
  | nth_digit x n = if (num_digits x) < n then 0
                    else
                      let
                        fun pow x 1 = x
                          | pow x y = x * pow x (y-1)
                      in
                        (*Finding the nth digit of x: ((x - x div 10^n) * 10^n div 10^n-1))*)
                        (x - ((x div pow 10 n) * pow 10 n)) div (pow 10 (n-1)) (*Me*)
                      end
If anyone thinks it would be useful to have access to the rest of my code I can provide it via github as an eclipse project (you can just pull the .sml file if you don't have eclipse set up for sml)
The match is not exhaustive because it does not cover the case of a list with only one element (and inductively, any list with an odd number of elements).
I'm not sure what you mean by "not getting an answer". This function does not diverge (recurse infinitely), unless your nth_digit helper does. Instead, you should get a Match exception when you feed it a list with odd length, because of the above.
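One thing worth checking, judging only from the code posted in the question: nth_digit evaluates pow 10 (n-1), which for n = 1 is pow 10 0, and the posted pow has a clause for exponent 1 and a general clause but none for 0, so that call would keep recursing through negative exponents. That would be consistent with the console hanging on sort_nth_digit 1 [...]. Here is a Python sketch of the same arithmetic with the extra base case; int_pow is my name for the helper, and it is written recursively only to mirror the SML.

```python
def int_pow(x, y):
    """x**y for y >= 0, written recursively to mirror the SML helper."""
    if y == 0:  # the base case the posted helper does not have
        return 1
    return x * int_pow(x, y - 1)


def nth_digit(x, n):
    """n-th decimal digit of x, counting from 1 at the right; 0 if x is too short."""
    if n == 0:
        return 0
    # (x mod 10^n) div 10^(n-1), the same value as the formula in the question
    return (x % int_pow(10, n)) // int_pow(10, n - 1)


print([nth_digit(1234, n) for n in range(1, 6)])
```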

How to count number of digits?

(CountDigits n) takes a positive integer n, and returns the number of digits it contains. e.g.,
(CountDigits 1) → 1
(CountDigits 10) → 2
(CountDigits 100) → 3
(CountDigits 1000) → 4
(CountDigits 65536) → 5
I think I'm supposed to use the remainder of the number and something else, but other than that I'm really lost. What I tried first was dividing the number by 10, then seeing if the number was less than 1. If it was, then it has 1 digit. If it isn't, then divide by 100, and so on and so forth. But I'm not really sure how to extend that to any number, so I scrapped that idea.
(define (num-digits number digit)
(if (= number digit 0)
1
Stumbled across this and had to provide the log-based answer:
(define (length n)
(+ 1 (floor (/ (log n) (log 10))))
)
Edit for clarity: This is an O(1) solution that doesn't use recursion. For example, given
(define (fact n)
(cond
[(= n 1) 1]
[else (* n (fact (- n 1)))]
)
)
(define (length n)
(+ 1 (floor (/ (log n) (log 10))))
)
Running (time (length (fact 10000))) produces
cpu time: 78 real time: 79 gc time: 47
35660.0
Indicating that 10000! produces an answer consisting of 35660 digits.
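A word of caution, offered as a sketch rather than a verdict on the Racket code: the log-based count goes through floating point, so at or near exact powers of 10 the quotient of the two logarithms may land just below the true integer and the result may come out one too small. A count by repeated integer division avoids that. Both function names below are mine.

```python
import math


def digits_by_log(n):
    """Digit count via logarithms; subject to floating-point rounding."""
    return 1 + math.floor(math.log(n) / math.log(10))


def digits_by_division(n):
    """Digit count of a positive integer via repeated integer division."""
    count = 1
    while n >= 10:
        n //= 10
        count += 1
    return count


for n in (1, 10, 100, 1000, 65536):
    print(n, digits_by_log(n), digits_by_division(n))
```

Comparing the two columns for 1000 is a quick way to see whether rounding bites on a given platform.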
After some discussion in the comments, we figured out how to take a number n with x digits and to get a number with x-1 digits: divide by 10 (using integer division, i.e., we ignore the remainder). We can check whether a number only has one digit by checking whether it's less than 10. Now we just need a way to express the total number of digits in a number as a (recursive) function. There are two cases:
(base case) a number n less than 10 has 1 digit. So CountDigits(n) = 1.
(recursive case) a number n that is 10 or greater has CountDigits(n) = 1+CountDigits(n/10).
Now it's just a matter of coding this up. This sounds like homework, so I don't want to give everything away. You'll still need to figure out how to write the condition "n < 10" in Scheme, as well as "n/10" (just the quotient part), but the general structure is:
(define (CountDigits n) ; 1
(if [n is less than 10] ; 2
1 ; 3
(+ 1 (CountDigits [n divided by 10])))) ; 4
An explanation of those lines, one at a time:
1. (define (CountDigits n) begins the definition of a function called CountDigits that's called like (CountDigits n).
2. In Racket, if is used to evaluate one expression, called the test, or the condition, and then to evaluate and return the value of one of the two remaining expressions. (if test X Y) evaluates test, and if test produces true, then X is evaluated and the result is returned, but otherwise Y is evaluated and the result is returned.
3. 1 is the value that you want to return when n is less than 10 (the base case above).
4. 1+CountDigits(n/10) is the value that you want to return otherwise, and in Racket (and Scheme, and Lisp in general) it's written as (+ 1 (CountDigits [n divided by 10])).
It will be a good idea to familiarize yourself with the style of the Racket documentation, so I will point you to the appropriate chapter: 3.2.2 Generic Numerics. The functions you'll need should be in there, and the documentation should provide enough examples for you to figure out how to write the missing bits.
I know this is old, but for future reference for anyone who finds this: personally, I'd write it like this:
(define (count-digits n acc)
  (if (< n 10)
      (+ acc 1)
      (count-digits (/ n 10) (+ acc 1))))
The difference is that this one is tail-recursive and will essentially be equivalent to an iterative function (and internally Racket's iterative forms actually exploit this fact).
Using trace illustrates the difference:
(count-digits-taylor 5000000)
>(count-digits-taylor 5000000)
> (count-digits-taylor 500000)
> >(count-digits-taylor 50000)
> > (count-digits-taylor 5000)
> > >(count-digits-taylor 500)
> > > (count-digits-taylor 50)
> > > >(count-digits-taylor 5)
< < < <1
< < < 2
< < <3
< < 4
< <5
< 6
<7
7
(count-digits 5000000 0)
>(count-digits 5000000 0)
>(count-digits 500000 1)
>(count-digits 50000 2)
>(count-digits 5000 3)
>(count-digits 500 4)
>(count-digits 50 5)
>(count-digits 5 6)
<7
7
For this exercise this doesn't matter much, but it's a good style to learn. And of course since the original post asks for a function called CountDigits which only takes one argument (n) you'd just add:
(define (CountDigits n)
(count-digits n 0))

Generating integers in ascending order using a set of prime numbers

I have a set of prime numbers and I have to generate integers using only those prime factors in increasing order.
For example, if the set is p = {2, 5} then my integers should be 1, 2, 4, 5, 8, 10, 16, 20, 25, …
Is there any efficient algorithm to solve this problem?
Removing a number and reinserting all its multiples (by the primes in the set) into a priority queue is wrong (in the sense of the question) - i.e. it produces correct sequence but inefficiently so.
It is inefficient in two ways - first, it overproduces the sequence; second, each PriorityQueue operation incurs extra cost (the operations remove_top and insert are not usually both O(1), certainly not in any list- or tree-based PriorityQueue implementation).
The efficient O(n) algorithm maintains pointers back into the sequence itself as it is being produced, to find and append the next number in O(1) time. In pseudocode:
return array h where
    h[0]=1; n=0; ps=[2,3,5, ... ]; // base primes
    is=[0 for each p in ps]; // indices back into h
    xs=[p for each p in ps] // next multiples: xs[k]==ps[k]*h[is[k]]
    repeat:
        h[++n] := minimum xs
        for each ref (i,x,p) in (is,xs,ps):
            if( x==h[n] )
                { x := p*h[++i]; } // advance the minimal multiple/pointer
For each minimal multiple it advances its pointer, while at the same time calculating its next multiple value. This too effectively implements a PriorityQueue, but with crucial distinctions: it is before the end point, not after; it doesn't create any additional storage except for the sequence itself; and its size is constant (just k numbers, for k base primes), whereas the size of a past-the-end PriorityQueue grows as we progress along the sequence (in the case of the Hamming sequence, based on a set of 3 primes, as n^(2/3), for n numbers of the sequence).
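The pseudocode above might be rendered in Python roughly like this. It is a sketch under my own naming (smooth_numbers, idx, nxt), returning a finite prefix of the sequence rather than an endless array.

```python
def smooth_numbers(primes, count):
    """Return the first `count` numbers whose prime factors all lie in `primes`."""
    h = [1]                   # the sequence produced so far
    idx = [0] * len(primes)   # one back-pointer into h per base prime
    nxt = list(primes)        # invariant: nxt[k] == primes[k] * h[idx[k]]
    while len(h) < count:
        smallest = min(nxt)
        h.append(smallest)
        for k, p in enumerate(primes):
            # Advance every pointer whose multiple was just emitted, so a
            # value reachable in several ways (e.g. 10 = 2*5 = 5*2) appears once.
            if nxt[k] == smallest:
                idx[k] += 1
                nxt[k] = p * h[idx[k]]
    return h


print(smooth_numbers([2, 5], 9))
print(smooth_numbers([2, 3, 5], 20))
```

The only storage besides the output list is the two small lists of length k, which is the point the paragraph above makes about constant size.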
The classic Hamming sequence in Haskell is essentially the same algorithm:
h = 1 : map (2*) h `union` map (3*) h `union` map (5*) h
union a@(x:xs) b@(y:ys) = case compare x y of
    LT -> x : union xs b
    EQ -> x : union xs ys
    GT -> y : union a ys
We can generate the smooth numbers for arbitrary base primes using the foldi function (see Wikipedia) to fold lists in a tree-like fashion for efficiency, creating a fixed sized tree of comparisons:
smooth base_primes = h where -- strictly increasing base_primes NB!
    h = 1 : foldi g [] [map (p*) h | p <- base_primes]
    g (x:xs) ys = x : union xs ys
foldi f z [] = z
foldi f z (x:xs) = f x (foldi f z (pairs f xs))
pairs f (x:y:t) = f x y : pairs f t
pairs f t = t
It is also possible to directly calculate a slice of the Hamming sequence around its nth member in O(n^(2/3)) time, by direct enumeration of the triples and assessing their values through logarithms, logval(i,j,k) = i*log 2 + j*log 3 + k*log 5. This Ideone.com test entry calculates the 1 billionth Hamming number in 0.05 seconds, down from 1.12 (2016-08-18: main speedup due to usage of Int instead of the default Integer where possible, even on 32-bit; additional 20% thanks to the tweak suggested by @GordonBGood, bringing band size complexity down to O(n^(1/3))).
This is discussed some more in this answer where we also find its full attribution:
slice hi w = (c, sortBy (compare `on` fst) b) where  -- hi is a top log2 value
  lb5=logBase 2 5 ; lb3=logBase 2 3                  -- w<1 (NB!) is (log2 width)
  (Sum c, b) = fold                                  -- total count, the band
      [ ( Sum (i+1),                                 -- total triples w/this j,k
          [ (r,(i,j,k)) | frac < w ] )               -- store it, if inside the band
        | k <- [ 0 .. floor ( hi   /lb5) ],  let p = fromIntegral k*lb5,
          j <- [ 0 .. floor ((hi-p)/lb3) ],  let q = fromIntegral j*lb3 + p,
          let (i,frac) = pr (hi-q) ; r = hi - frac   -- r = i + q
      ]    -- (sum . map fst &&& concat . map snd)
  pr = properFraction
This can be generalized for k base primes as well, probably running in O(n^((k-1)/k)) time.
(see this SO entry for an important later development. also, this answer is interesting. and another related answer.)
The basic idea is that 1 is a member of the set, and for each member of the set n so also 2n and 5n are members of the set. Thus, you begin by outputting 1, and push 2 and 5 onto a priority queue. Then, you repeatedly pop the front item of the priority queue, output it if it is different from the previous output, and push 2 times and 5 times the number onto the priority queue.
Google for "Hamming number" or "regular number" or go to A003592 to learn more.
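Before the Scheme version, here is a compact Python sketch of the same priority-queue idea using the standard heapq module. The function name smooth_pq is mine, and unlike a hand-written pairing heap this leans on the library's binary heap.

```python
import heapq


def smooth_pq(primes, count):
    """First `count` numbers built only from `primes`, via a priority queue."""
    heap = [1]
    out = []
    prev = 0  # last value output; used to skip duplicates
    while len(out) < count:
        x = heapq.heappop(heap)
        if x == prev:
            continue  # already output (reached through another prime)
        out.append(x)
        prev = x
        for p in primes:
            heapq.heappush(heap, x * p)
    return out


print(smooth_pq([2, 5], 9))
```

Duplicates do sit in the heap for a while (10 is pushed both as 2*5 and as 5*2), which is the overproduction the other answer objects to; they come out next to each other, so comparing against the previous output is enough to drop them.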
----- ADDED LATER -----
I decided to spend a few minutes on my lunch hour to write a program to implement the algorithm described above, using the Scheme programming language. First, here is a library implementation of priority queues using the pairing heap algorithm:
(define pq-empty '())
(define pq-empty? null?)
(define (pq-first pq)
  (if (null? pq)
      (error 'pq-first "can't extract minimum from null queue")
      (car pq)))
(define (pq-merge lt? p1 p2)
  (cond ((null? p1) p2)
        ((null? p2) p1)
        ((lt? (car p2) (car p1))
         (cons (car p2) (cons p1 (cdr p2))))
        (else (cons (car p1) (cons p2 (cdr p1))))))
(define (pq-insert lt? x pq)
  (pq-merge lt? (list x) pq))
(define (pq-merge-pairs lt? ps)
  (cond ((null? ps) '())
        ((null? (cdr ps)) (car ps))
        (else (pq-merge lt? (pq-merge lt? (car ps) (cadr ps))
                        (pq-merge-pairs lt? (cddr ps))))))
(define (pq-rest lt? pq)
  (if (null? pq)
      (error 'pq-rest "can't delete minimum from null queue")
      (pq-merge-pairs lt? (cdr pq))))
Now for the algorithm. Function f takes two parameters, a list of the numbers in the set ps and the number n of items to output from the head of the output. The algorithm is slightly changed; the priority queue is initialized by pushing 1, then the extraction steps start. Variable p is the previous output value (initially 0), pq is the priority queue, and xs is the output list, which is accumulated in reverse order. Here's the code:
(define (f ps n)
  (let loop ((n n) (p 0) (pq (pq-insert < 1 pq-empty)) (xs (list)))
    (cond ((zero? n) (reverse xs))
          ((= (pq-first pq) p) (loop n p (pq-rest < pq) xs))
          (else (loop (- n 1) (pq-first pq) (update < pq ps)
                      (cons (pq-first pq) xs))))))
For those not familiar with Scheme, loop is a locally-defined function that is called recursively, and cond is the head of an if-else chain; in this case, there are three cond clauses, each clause with a predicate and consequent, with the consequent evaluated for the first clause for which the predicate is true. The predicate (zero? n) terminates the recursion and returns the output list in the proper order. The predicate (= (pq-first pq) p) indicates that the current head of the priority queue has been output previously, so it is skipped by recurring with the rest of the priority queue after the first item. Finally, the else predicate, which is always true, identifies a new number to be output, so it decrements the counter, saves the current head of the priority queue as the new previous value, updates the priority queue to add the new children of the current number, and inserts the current head of the priority queue into the accumulating output.
Since it is non-trivial to update the priority queue to add the new children of the current number, that operation is extracted to a separate function:
(define (update lt? pq ps)
  (let loop ((ps ps) (pq pq))
    (if (null? ps) (pq-rest lt? pq)
        (loop (cdr ps) (pq-insert lt? (* (pq-first pq) (car ps)) pq)))))
The function loops over the elements of the ps set, inserting each into the priority queue in turn; the if returns the updated priority queue, minus its old head, when the ps list is exhausted. The recursive step strips the head of the ps list with cdr and inserts the product of the head of the priority queue and the head of the ps list into the priority queue.
Here are two examples of the algorithm:
> (f '(2 5) 20)
(1 2 4 5 8 10 16 20 25 32 40 50 64 80 100 125 128 160 200 250)
> (f '(2 3 5) 20)
(1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36)
You can run the program at http://ideone.com/sA1nn.
This 2-dimensional exploring algorithm is not exact, but works for the first 25 integers, then mixes up 625 and 512.
n = 0
exp_before_5 = 2
while true
    i = 0
    do
        output 2^(n-exp_before_5*i) * 5^Max(0, n-exp_before_5*(i+1))
        i <- i + 1
    loop while n-exp_before_5*(i+1) >= 0
    n <- n + 1
end while
Based on user448810's answer, here's a solution that uses heaps and vectors from the STL.
Now, heaps normally output the largest value, so we store the negative of the numbers as a workaround (since a>b <==> -a<-b).
#include <vector>
#include <iostream>
#include <algorithm>
int main()
{
    std::vector<int> primes;
    primes.push_back(2);
    primes.push_back(5);//Our prime numbers that we get to use
    std::vector<int> heap;//the heap that is going to store our possible values
    heap.push_back(-1);
    std::vector<int> outputs;
    outputs.push_back(1);
    while(outputs.size() < 10)
    {
        std::pop_heap(heap.begin(), heap.end());
        int nValue = -*heap.rbegin();//Get current smallest number
        heap.pop_back();
        if(nValue != *outputs.rbegin())//Is it a repeat?
        {
            outputs.push_back(nValue);
        }
        for(unsigned int i = 0; i < primes.size(); i++)
        {
            heap.push_back(-nValue * primes[i]);//add new values
            std::push_heap(heap.begin(), heap.end());
        }
    }
    //output our answer
    for(unsigned int i = 0; i < outputs.size(); i++)
    {
        std::cout << outputs[i] << " ";
    }
    std::cout << std::endl;
}
Output:
1 2 4 5 8 10 16 20 25 32

Resources