Number of ways to change coins in constant time? - algorithm

Let's say I have three types of coins -- a penny (0.01), a nickel (0.05), and a dime (0.10) and I want to find the number of ways to make change of a certain amount. For example to change 27 cents:
change(amount=27, coins=[1,5,10])
One of the more common ways to approach this problem is recursively/dynamically: to find the number of ways to make that change without a particular coin, and then deduct that coin amount and find the ways to do it with that coin.
But, I'm wondering if there is a way to do it using a cached value and mod operator. For example:
10 cents can be changed 4 ways:
10 pennies
1 dime
2 nickels
1 nickel, 5 pennies
5 cents can be changed 2 ways:
1 nickel
5 pennies
1-4 cents can be changed 1 way:
1-4 pennies
For example, this is wrong, but my idea was along the lines of:
def change(amount, coins=[1,5,10]):
    cache = {10: 4, 5: 2, 1: 1}
    for coin in sorted(coins, reverse=True):
        # yes this will give zerodivision
        # and a penny shouldn't be multiplied
        # but this is just to demonstrate the basic idea
        ways = (amount % coin) * cache[coin]
        amount = amount % ways
    return ways
If so, how would that algorithm work? Any language (or pseudo-language) is fine.

Precomputing the number of ways to make change for 10 cents and 5 cents cannot be applied to bigger values in a straightforward way, but for special cases like the given example of pennies, nickels and dimes, a formula for the number of ways can be derived by looking in more detail at how the different ways of changing 5 and 10 cents can be combined.
Let's first look at multiples of 10. Having e.g. n=20 cents, the first 10 cents can be changed in 4 ways, and so can the second group of 10 cents. That would make 4x4 = 16 ways of change. But not all combinations are different: a dime for the first 10 cents and 10 pennies for the other 10 cents is the same as having 10 pennies for the first 10 cents and a dime for the second 10 cents. So we have to count the possibilities without regard to the order of the groups: that would give (n/10+3) choose 3 possibilities. But still not all possibilities in this counting are different: choosing a nickel and 5 pennies for both the first and the second group of 10 cents gives the same change as choosing two nickels for the first group and 10 pennies for the second group. Thinking about this a little more, one finds that the possibility of 1 nickel and 5 pennies should be chosen at most once. So we get (n/10+2) choose 2 ways of change without the nickel/pennies split (i.e. the total number of nickels will be even) and ((n-10)/10+2) choose 2 ways of change with one nickel/pennies split (i.e. the total number of nickels will be odd).
For an arbitrary number n of cents, let [n/10] denote the value n/10 rounded down, i.e. the maximal number of dimes that can be used in the change. The cents exceeding the largest multiple of 10 in n can be changed in at most two ways: either they are all pennies or, if at least 5 cents remain, one nickel and pennies for the rest. To avoid counting the same way of change several times, one can forbid using any more pennies (for the groups of 10 cents) if there is a nickel in the change of the 'excess' cents, so only dimes and nickels are used for the groups of 10 cents, giving [n/10]+1 ways.
Altogether one arrives at the following formula for N, the total number of ways of changing n cents:
N1 = (([n/10]+2) choose 2) + (([n/10]+1) choose 2) = ([n/10]+1)^2
N2 = [n/10]+1   if n mod 10 >= 5
     0          otherwise
N  = N1 + N2
Or as Python code:
def change_1_5_10_count(n):
    n_10 = n // 10
    N1 = (n_10+1)**2
    N2 = (n_10+1) if n % 10 >= 5 else 0
    return N1 + N2
btw, the computation can be further simplified: N = [([n/5]+2)^2/4], or in Python notation: (n // 5 + 2)**2 // 4.
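One way to gain confidence in the simplified closed form is to compare it with a direct enumeration. The sketch below assumes the denominations 1, 5 and 10 only; `change_1_5_10_brute` is a helper name introduced here for illustration and is not part of the answer.

```python
def change_1_5_10_count(n):
    """Simplified closed form quoted above, valid for coins 1, 5 and 10 only."""
    return (n // 5 + 2) ** 2 // 4


def change_1_5_10_brute(n):
    """Count by enumeration: choose dimes and nickels, pennies fill the rest."""
    return sum(1
               for dimes in range(n // 10 + 1)
               for nickels in range((n - 10 * dimes) // 5 + 1))


# Amounts (if any) where the closed form and the enumeration disagree.
mismatches = [n for n in range(300)
              if change_1_5_10_count(n) != change_1_5_10_brute(n)]
print(mismatches)
print(change_1_5_10_count(27))  # 12
```

Every choice of dimes and nickels that fits into n leaves exactly one way to fill the remainder with pennies, which is why the enumeration needs only two loops.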

Almost certainly not for the general case. That's why recursive and bottom-up dynamic programs are used. The modulus operator would provide us with a remainder when dividing the amount by the coin denomination -- meaning we would be using the maximum count of that coin that we can -- but for our solution, we need to count ways of making change when different counts of each coin denomination are used.
Identical intermediate amounts can be reached by using different combinations of coins, and that is what the classic method uses a cache for. O(amount * num_coins):
# Adapted from https://algorithmist.com/wiki/Coin_change#Dynamic_Programming
def coin_change_bottom_up(amount, coins):
    cache = [[None] * len(coins) for _ in range(amount + 1)]
    for m in range(amount+1):
        for i in range(len(coins)):
            # There is one way to return
            # zero change with the ith coin.
            if m == 0:
                cache[m][i] = 1
            # Base case: the first
            # coin (which would be last
            # in a top-down recursion).
            elif i == 0:
                # If this first/last coin
                # divides m, there's one
                # way to make change;
                if m % coins[i] == 0:
                    cache[m][i] = 1
                # otherwise, no way to make change.
                else:
                    cache[m][i] = 0
            else:
                # Add the number of ways to
                # make change for this amount
                # without this particular coin.
                cache[m][i] = cache[m][i - 1]
                # If this coin's denomination is less
                # than or equal to the amount we're
                # making change for, add the number
                # of ways we can make change for the
                # amount reduced by the coin's denomination
                # (thus using the coin), again considering
                # this and previously seen coins.
                if coins[i] <= m:
                    cache[m][i] += cache[m - coins[i]][i]
    return cache[amount][len(coins)-1]

With Python you can use the @cache decorator (or @lru_cache) from functools to turn a recursive solution into a cached one automatically. For example:
from functools import cache
@cache
def change(amount, coins=(1, 5, 10)):
    if coins == ():
        return amount == 0
    C = coins[-1]
    return sum(change(amount - C*x, coins[:-1]) for x in range(1 + (amount//C)))
print(change(27, (1, 5, 10))) # 12
print(change(27, (1, 5))) # 6
print(change(17, (1, 5))) # 4
print(change(7, (1, 5))) # 2
# ch(27, (1, 5, 10)) == ch(27, (1, 5)) + ch(17, (1, 5)) + ch(7, (1, 5))
This will invoke the recursion only for those parameter values whose result has not already been computed and stored. With @lru_cache, you can even specify the maximum number of elements you allow in the cache.

This is one DP approach to the problem:
def coin_ways(coins, amount):
    dp = [[] for _ in range(amount+1)]
    dp[0].append([]) # or dp[0] = [[]], if you prefer
    for coin in coins:
        for x in range(coin, amount+1):
            dp[x].extend(ans + [coin] for ans in dp[x-coin])
    #print(dp)
    return len(dp[amount])
if __name__ == '__main__':
    coins = [1, 5, 10] # 2, 5, 10, 25]
    print(coin_ways(coins, 27)) # 12
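If only the count is needed, storing every combination is unnecessary: the same loop order works with a single integer per amount. A minimal sketch of that variant (the name `coin_ways_count` is introduced here and is not part of the answer above):

```python
def coin_ways_count(coins, amount):
    """Number of coin combinations summing to amount, without listing them."""
    ways = [0] * (amount + 1)
    ways[0] = 1                      # one way to make 0: use no coins
    for coin in coins:               # outer loop over coins avoids counting reorderings
        for x in range(coin, amount + 1):
            ways[x] += ways[x - coin]
    return ways[amount]


print(coin_ways_count([1, 5, 10], 27))  # 12
```

This keeps memory at O(amount) and time at O(amount * len(coins)), whereas the list-building version grows with the number of combinations.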

Related

Advanced Algorithms Problems ("Nice Triangle"): Prime number Pyramid where every number depends on numbers above it

I'm currently studying for an advanced algorithms and data structures exam, and I simply can't seem to solve one of the practice problems, which is the following:
1.14) "Nice Triangle"
A "nice" triangle is defined in the following way:
There are three different numbers which the triangle consists of, namely the first three prime numbers (2, 3 and 5).
Every number depends on the two numbers below it in the following way.
Numbers are the same, resulting number is also the same. (2, 2 => 2)
Numbers are different, resulting number is the remaining number. (2, 3 => 5)
Given an integer N with length L, corresponding to the base of the triangle, determine the last element at the top
For example:
Given N = 25555 (and thus L = 5), the triangle looks like this:
    2
   3 5
  2 5 5
 3 5 5 5
2 5 5 5 5
=> 2 is the result of this example
What does the fact that every number is prime have to do with the problem?
By using a naive approach (simply calculating every single row), one obtains a time-complexity of O(L^2).
However, the professor said, it's possible with O(L), but I simply can't find any pattern!!!
I'm not sure why this problem would be used in an advanced algorithms course, but yes, you can do this in O(l) = O(log n) time.
There are a couple of ways you can do it, but they both rely on recognizing that:
For the problem statement, it doesn't matter what digits you use. Let's use 0, 1, and 2 instead of 2, 3, and 5.
Then, if a and b are the input numbers and c is the output, c = -(a+b) mod 3.
You can build the whole triangle using c = a+b mod 3 instead, and then just negate every second row.
Now the two ways you can do this in O(log n) time are:
For each digit d in the input, calculate the number of times (call it k) that it gets added into the final sum, add up all the kd mod 3, and then negate the result if you started with an even number of digits. That takes constant time per digit. Alternatively:
recognize that you can do arithmetic on n-sized values in constant time. Make a value that is a bit mask of all the digits in n. That takes 2 bits each. Then by using bitwise operations you can calculate each row from the previous one in constant time, for O(log n) time altogether.
Here's an implementation of the 2nd way in python:
def niceTriangle(n):
    # a vector of 3-bit integers mod 3
    rowvec = 0
    # a vector of 1 for each number in the row
    onevec = 0
    # number of rows remaining
    rows = 0
    # mapping for digits 0-9
    digitmap = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
    # first convert n into the first row
    while n > 0:
        digit = digitmap[n % 10]
        n = n//10
        rows += 1
        onevec = (onevec << 3) + 1
        rowvec = (rowvec << 3) + digit
    if rows%2 == 0:
        # we have an even number of rows -- negate everything
        rowvec = ((rowvec&onevec)<<1) | ((rowvec>>1)&onevec)
    while rows > 1:
        # add each number to its neighbor
        rowvec += (rowvec >> 3)
        # isolate the entries >= 3, by adding 1 to each number and
        # getting the 2^2 bit
        gt3 = ((rowvec + onevec) >> 2) & onevec
        # subtract 3 from all the greater entries
        rowvec -= gt3*3
        rows -= 1
    return [2,3,5][rowvec%4]
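The first way (weighting each digit by the number of times it is added into the top) can be sketched as follows. The weight of the digit at position i in a base of length L is the binomial coefficient C(L-1, i), which is only needed mod 3. This sketch assumes Lucas' theorem for that step, which costs O(log L) per digit rather than constant time; the function names and the string input are choices made here, not part of the answer.

```python
def binom_mod3(n, k):
    """C(n, k) mod 3 via Lucas' theorem: multiply binomials of the base-3 digits."""
    small = [[1, 0, 0], [1, 1, 0], [1, 2, 1]]   # small[a][b] = C(a, b) for a, b < 3
    result = 1
    while n or k:
        result = result * small[n % 3][k % 3] % 3
        n //= 3
        k //= 3
    return result


def nice_triangle_top(base):
    """Top of the triangle for a base given as a string over the digits 2, 3, 5."""
    code = {"2": 0, "3": 1, "5": 2}
    values = [code[ch] for ch in base]
    length = len(values)
    total = sum(binom_mod3(length - 1, i) * v for i, v in enumerate(values)) % 3
    if length % 2 == 0:          # an odd number of negations happened on the way up
        total = (-total) % 3
    return [2, 3, 5][total]


print(nice_triangle_top("25555"))  # 2
```

For the example base 25555 the weights C(4, i) mod 3 are 1, 1, 0, 1, 1, so the weighted sum is 0 mod 3, which maps back to 2.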

Divide n into x random parts

What I need to achieve is basically x dice rolls = n sum but backwards.
So let's create an example:
The dice has to be rolled 5 times (min. sum 5, max. sum 30) which means:
x = 5
Let's say in this case the sum that was rolled is 23 which means:
n = 23
So what I need is to get the any of the possible single dice roll combinations (e.g. 6, 4, 5, 3, 5)
What I could make up in my mind so far is:
Create 5 random numbers.
Add them up and get the sum.
Now divide every single random number by the sum and multiply by the wanted number 23.
The result is 5 random numbers that equal the wanted number 23.
The problem is that this one returns random values (decimals, values below 1 and above 6) depending on the random numbers. I can not find a way to edit the formula to only return integers >= 1 or <= 6.
If you don't need to scale it up, by far the easiest way is to re-roll until you get the right sum. It takes milliseconds on any modern CPU. Not pretty, though.
#!/usr/local/bin/lua
math.randomseed(os.time())
function divs(n,x)
  local a = {}
  repeat
    local s = 0
    for i=1,x do
      a[i] = math.random(6)
      s = s + a[i]
    end
  until s==n
  return a
end
a = divs(23,5)
for k,v in pairs(a) do print(k,v) end
This was an interesting problem. Here's my take:
EDIT: I missed the fact that you needed them to be dice rolls. Here's a new take. As a bonus, you can specify the number of sides of the dice in an optional parameter.
local function getDiceRolls(n, num_rolls, num_sides)
  num_sides = num_sides or 6
  assert(n >= num_rolls, "n must be greater than or equal to num_rolls")
  assert(n <= num_rolls * num_sides, "n is too big for the number of dice and sides")
  -- start with every die showing 1, then hand out the remaining points
  local rolls = {}
  for i=1, num_rolls do rolls[i] = 1 end
  for i=num_rolls+1, n do
    local index = math.random(1,num_rolls)
    -- skip dice that are already at their maximum
    while rolls[index] == num_sides do
      index = (index % num_rolls) + 1
    end
    rolls[index] = rolls[index] + 1
  end
  return rolls
end
-- tests:
print(unpack(getDiceRolls(21, 4))) -- 6 4 6 5
print(unpack(getDiceRolls(21, 4))) -- 5 5 6 5
print(unpack(getDiceRolls(13, 3))) -- 4 3 6
print(unpack(getDiceRolls(13, 3))) -- 5 5 3
print(unpack(getDiceRolls(30, 3, 20))) -- 9 10 11
print(unpack(getDiceRolls(7, 7))) -- 1 1 1 1 1 1 1
print(unpack(getDiceRolls(7, 8))) -- error
print(unpack(getDiceRolls(13, 2))) -- error
If the # of rolls does not change wildly, but the sum does, then it would be worth creating a lookup table for combinations of a given sum. You would generate every combination, and for each one compute the sum, then add the combination to a list associated to that sum. The lookup table would look like this:
T = {12 = {{1,2,3,4,2},{2,5,3,1,1},{2,2,2,3,3}, ...}, 13=....}
Then when you want to randomly select a combo for n=23, you look in table for key 23, the list has all combos with that sum, now just randomly pick one of them. Same for any other number.
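The lookup-table idea can be sketched in a few lines. This is an illustration under the assumption of five six-sided dice, written in Python rather than Lua; `build_table` is a name made up for this sketch.

```python
import random
from collections import defaultdict
from itertools import product


def build_table(num_rolls, num_sides=6):
    """Map each possible sum to the list of all roll sequences producing it."""
    table = defaultdict(list)
    for combo in product(range(1, num_sides + 1), repeat=num_rolls):
        table[sum(combo)].append(combo)
    return table


table = build_table(5)            # 6**5 = 7776 sequences, built once
roll = random.choice(table[23])   # uniform over all sequences that sum to 23
print(roll, sum(roll))
```

Because every ordered sequence is stored, picking from the list gives each sequence with the right sum the same probability, at the cost of memory that grows as num_sides ** num_rolls.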

Determine the combinations of making change for a given amount

My assignment is to write an algorithm using brute force to determine the number of distinct ways, and the related combinations, of making change for a given amount. The change will be produced using the following coins: penny (1 cent), nickel (5 cents), dime (10 cents), and quarter (25 cents).
e.g.
Input: 16 (it means a change of 16 cents)
Output: can be produced in 6 different ways and they are:
16 pennies.
11 pennies, 1 nickel
6 pennies, 1 dime
6 pennies, 2 nickels
1 penny, 3 nickels
1 penny, 1 nickel, 1 dime
My algorithm must produce all possible change combinations for a specified amount of change.
I am at a complete loss as to how to even begin starting an algorithm like this. Any input or insight to get me going would be awesome.
Ok. Let me explain one idea for a brute force algorithm. I will use recursion here.
Let's say you need change for c cents. Then consider c as
c = p * PENNY + n * NICKEL + d * DIME + q * QUARTER
or simply,
c = ( p * 1 ) + ( n * 5 ) + ( d * 10 ) + ( q * 25 )
Now you need to go through all the possible values for p, n, d and q that equals the value of c. Using recursion, for each p in [0, maximumPennies] go through each n in [0, maximumNickels]. For each n go through each d in [0, maximumDimes]. For each d go through each q in [0, maximumQuarters].
p in [0, maximumPennies] AND c >= p
  |
  +- n in [0, maximumNickels] AND c >= p + 5n
       |
       +- d in [0, maximumDimes] AND c >= p + 5n + 10d
            |
            +- q in [0, maximumQuarters] AND c >= p + 5n + 10d + 25q
Whenever equality holds in one of these steps, you have a solution.
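The nested search described above can be written out directly. The sketch below uses plain loops instead of recursion and checks the equality only at the innermost level; the function name and the tuple layout are choices made for this illustration.

```python
def brute_force_change(c):
    """Return every (pennies, nickels, dimes, quarters) tuple worth c cents."""
    solutions = []
    for p in range(c + 1):                    # 0 .. maximumPennies
        for n in range(c // 5 + 1):           # 0 .. maximumNickels
            for d in range(c // 10 + 1):      # 0 .. maximumDimes
                for q in range(c // 25 + 1):  # 0 .. maximumQuarters
                    if p + 5 * n + 10 * d + 25 * q == c:
                        solutions.append((p, n, d, q))
    return solutions


for p, n, d, q in brute_force_change(16):
    print(f"{p} pennies, {n} nickels, {d} dimes, {q} quarters")
```

For 16 cents this lists the six combinations from the question. Adding the `c >= ...` checks from the diagram as early exits would prune most of the loop iterations without changing the result.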
You could start thinking about this problem by dividing it into sub-problems: solve those first, then widen the problem and adjust your solution.
In your case you could first try to solve the problem using only pennies (With only one obvious solution of course), then look at nickels and pennies and look at all combinations there and so on. To improve this you can reuse solutions from earlier stages in your algorithm.
Well, if you want brute force solution, you can start with a very naive recursive approach. But to be efficient, you'll need a dynamic programming approach.
For the recursive approach:
1. find out the number of ways you can make using penny only.
2. do the same using penny and nickel only. (this includes step 1 also)
3. the same using penny, nickel and dime only (including step 2).
4. using all the coins (with all previous steps).
Step 1 is straightforward, only one way to do that.
For step 2, the recursion should be like this:
number of ways to make n cent using penny and nickel =
number of ways to make (n - [1 nickel]) using penny and nickel
+ number of ways to make n cent using penny only
Step 3:
number of ways to make n cent using penny, nickel and dime =
number of ways to make (n - [1 dime]) using penny, nickel and dime
+ number of ways to make n cent using penny and nickel only
Step 4 is similar.
And one thing to remember: you can make 0 cent in one way (i.e. using zero coins), it's the base case.
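The four steps can be expressed as one recurrence with a parameter for how many coin types are allowed. This is a small sketch; `COINS`, `ways` and the use of `lru_cache` for memoization are choices made here, not part of the answer.

```python
from functools import lru_cache

COINS = (1, 5, 10, 25)  # penny, nickel, dime, quarter


@lru_cache(maxsize=None)
def ways(n, k):
    """Number of ways to make n cents using only the first k coin types."""
    if n == 0:
        return 1                 # base case: zero coins make 0 cents
    if n < 0 or k == 0:
        return 0
    # either use one more coin of the largest allowed type, or drop that type
    return ways(n - COINS[k - 1], k) + ways(n, k - 1)


for k in range(1, 5):
    print(COINS[:k], ways(16, k))   # 1, 4, 6, 6 ways for steps 1 to 4
```

Step k of the description corresponds to `ways(n, k)`, and each step reuses the results of the previous one through the `ways(n, k - 1)` term.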
Try to use recursion on this one.
Your function should take two parameters - the maximum value you are allowed to use and the amount remaining to pay(you need the first to avoid repetition).
Write the function like this: if it is in a trivial case (e.g. the amount is 1, 5 or 10 and you are allowed to take a penny, nickel or dime respectively), print the trivial solution. Also, for each case, try to take one coin of each allowed type (i.e. not greater than the maximum allowed) and continue recursively.
Hope this helps.
public class PrintAllCoinCombinations {
    static int findChange(int arr[], int index, int value, String str) {
        if (value == 0) {
            System.out.println(str);
            return 1;
        }
        if (index < 0) {
            return 0;
        }
        if (value < 0) {
            return 0;
        }
        int excl = findChange(arr, index - 1, value, str);
        str += " " + arr[index];
        int incl = findChange(arr, index, value - arr[index], str);
        return incl + excl;
    }
    public static void main(String[] arg) {
        int arr[] = {1, 5, 10, 25};
        String s = "";
        int result = findChange(arr, 3, 16, s);
        System.out.println(result);
    }
}

Dynamic programming: can interval of even 1's and 0's be found in linear time?

Found the following interview question on the web:
You have an array of 0s and 1s and you want to output all the intervals (i, j) where the number of 0s and the number of 1s are equal. Example:
pos = 0 1 2 3 4 5 6 7 8
      0 1 0 0 1 1 1 1 0
One interval is (0, 1) because there the number of 0s and 1s are equal. There are many other intervals; find all of them in linear time.
I think there is no linear time algo, as there may be n^2 such intervals.
Am I right? How can I prove that there can be on the order of n^2 such intervals?
This is the fastest way I can think of to do this, and it is linear to the number of intervals there are.
Let L be your original list of numbers and A be a hash of empty arrays where initially A[0] = [-1]
sum = 0
for i in 0..n-1
    if L[i] == 0:
        sum--
        A[sum].push(i)
    elif L[i] == 1:
        sum++
        A[sum].push(i)
Now A is essentially an x-y graph of the running sum of the sequence (x is the index in the list, y is the sum). Every time there are two x values x1 and x2 for the same y value, you have an interval (x1, x2] where the number of 0s and 1s is equal.
There are m(m-1)/2 (arithmetic sum from 1 to m - 1) intervals where the sum is 0 in every array M in A where m = M.length
Using your example to calculate A by hand we use this chart
L          #   0   1   0   1   0   0   1   1   1   1   0
A keys     0  -1   0  -1   0  -1  -2  -1   0   1   2   1
L index   -1   0   1   2   3   4   5   6   7   8   9  10
(I've added a # to represent the start of the list, with an index of -1. I've also removed all the numbers that are not 0 or 1 since they're just distractions.) A will look like this:
[-2]->[5]
[-1]->[0, 2, 4, 6]
[0]->[-1, 1, 3, 7]
[1]->[8, 10]
[2]->[9]
For any M = [a1, a2, a3, ...], (ai + 1, aj) where j > i will be an interval with the same number of 0s as 1s. For example, in [-1]->[0, 2, 4, 6], the intervals are (1, 2), (1, 4), (1, 6), (3, 4), (3, 6), (5, 6).
Building the array A is O(n), but printing these intervals from A must be done in linear time to the number of intervals. In fact, that could be your proof that it is not quite possible to do this in linear time to n because it's possible to have more intervals than n and you need at least the number of interval iterations to print them all.
Unless of course you consider building A is enough to find all the intervals (since it's obvious from A what the intervals are), then it is linear to n :P
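The grouping by running sum can be sketched as below. Counting the intervals takes one pass plus one pass over the groups; listing them is what costs time proportional to their number. The function name and return shape are choices made for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def balanced_intervals(bits):
    """Return (count, intervals) of inclusive (i, j) with equally many 0s and 1s."""
    positions = defaultdict(list)
    positions[0].append(-1)             # running sum 0 "before" the list starts
    total = 0
    for i, bit in enumerate(bits):
        total += 1 if bit == 1 else -1
        positions[total].append(i)
    # m positions sharing a sum give m(m-1)/2 intervals
    count = sum(len(p) * (len(p) - 1) // 2 for p in positions.values())
    intervals = sorted((a + 1, b)
                       for p in positions.values()
                       for a, b in combinations(p, 2))
    return count, intervals


count, intervals = balanced_intervals([0, 1, 0, 0, 1, 1, 1, 1, 0])
print(count)       # 7
print(intervals)
```

On the array from the question this finds 7 intervals, including (0, 1).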
A linear solution is possible (sorry, earlier I argued that this had to be n^2) if you're careful to not actually print the results!
First, let's define a "score" for any set of zeros and ones as the number of ones minus the number of zeroes. So (0,1) has a score of 0, while (0) is -1 and (1,1) is 2.
Now, start from the right. If the right-most digit is a 0 then it can be combined with any group to the left that has a score of 1. So we need to know what groups are available to the left, indexed by score. This suggests a recursive procedure that accumulates groups with scores. The sweep process is O(n) and at each step the process has to check whether it has created a new group and extend the table of known groups. Checking for a new group is constant time (lookup in a hash table). Extending the table of known groups is also constant time (at first I thought it wasn't, but you can maintain a separate offset that avoids updating each entry in the table).
So we have a peculiar situation: each step of the process identifies a set of results of size O(n), but the calculation necessary to do this is constant time (within that step). So the process itself is still O(n) (proportional to the number of steps). Of course, actually printing the results (either during the step, or at the end) makes things O(n^2).
I'll write some Python code to test/demonstrate.
Here we go:
SCORE = [-1,1]
class Accumulator:
    def __init__(self):
        self.offset = 0
        self.groups_to_right = {} # map from score to start index
        self.even_groups = []
        self.index = 0
    def append(self, digit):
        score = SCORE[digit]
        # want existing groups at -score, to sum to zero
        # but there's an offset to correct for, so we really want
        # groups at -(score+offset)
        corrected = -(score + self.offset)
        if corrected in self.groups_to_right:
            # if this were a linked list we could save a reference
            # to the current value. it's not, so we need to filter
            # on printing (see below)
            self.even_groups.append(
                (self.index, self.groups_to_right[corrected]))
        # this updates all the known groups
        self.offset += score
        # this adds the new one, which should be at the index so that
        # index + offset = score (so index = score - offset)
        groups = self.groups_to_right.get(score-self.offset, [])
        groups.append(self.index)
        self.groups_to_right[score-self.offset] = groups
        # and move on
        self.index += 1
        # print(self.offset)
        # print(self.groups_to_right)
        # print(self.even_groups)
        # print(self.index)
    def dump(self):
        # printing the results does take longer, of course...
        for (end, starts) in self.even_groups:
            for start in starts:
                # this discards the extra points that were added
                # to the data after we added it to the results
                # (avoidable with linked lists)
                if start < end:
                    print((start, end))
    @staticmethod
    def run(input):
        accumulator = Accumulator()
        print(input)
        for digit in input:
            accumulator.append(digit)
        accumulator.dump()
        print()
Accumulator.run([0,1,0,0,1,1,1,1,0])
And the output:
dynamic: python dynamic.py
[0, 1, 0, 0, 1, 1, 1, 1, 0]
(0, 1)
(1, 2)
(1, 4)
(3, 4)
(0, 5)
(2, 5)
(7, 8)
You might be worried that some additional processing (the filtering for start < end) is done in the dump routine that displays the results. But that's because I am working around Python's lack of linked lists (I want to both extend a list and save the previous value in constant time).
It may seem surprising that the result is of size O(n^2) while the process of finding the results is O(n), but it's easy to see how that is possible: at one "step" the process identifies a number of groups (of size O(n)) by associating the current point (self.index in append, or end in dump()) with a list of start points (self.groups_to_right[...] or starts).
Update: One further point. The table of "groups to the right" will have a "typical width" of sqrt(n) entries (this follows from the central limit theorem - it's basically a random walk in 1D). Since an entry is added at each step, the average length is also sqrt(n) (the n values shared out over sqrt(n) bins). That means that the expected time for this algorithm (i.e. with random inputs), if you include printing the results, is O(n^(3/2)) even though the worst case is O(n^2).
Answering directly the question:
you have to construct an example where there are more than O(N) matches:
let N be of the form 2^k, with the following input:
0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 (here, N=16)
number of matches (where 0 is the starting character):
length   #
2        N/2
4        N/2 - 1
6        N/2 - 2
8        N/2 - 3
..
N        1
The total number of matches (starting with 0) is: (1+N/2) * (N/2) / 2 = N^2/8 + N/4
The matches starting with 1 are almost the same, except that there is one fewer for each length.
Total: (N^2/8 + N/4) * 2 - N/2 = N^2/4
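The N^2/4 total for the alternating input can be checked by brute force. The sketch below simply tests every interval, so it is quadratic by design; `count_balanced` is a name introduced for this check.

```python
def count_balanced(bits):
    """Count intervals with equally many 0s and 1s by testing every (i, j)."""
    count = 0
    for i in range(len(bits)):
        balance = 0
        for j in range(i, len(bits)):
            balance += 1 if bits[j] == 1 else -1
            if balance == 0:
                count += 1
    return count


for n in (4, 8, 16, 32):
    alternating = [i % 2 for i in range(n)]        # 0 1 0 1 ...
    print(n, count_balanced(alternating), n * n // 4)
```

For each tested N the brute-force count and N^2/4 should coincide, which confirms that any algorithm that prints every interval needs quadratic time on this input.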
Every interval will contain at least one sequence of either (0,1) or (1,0). Therefore, it's simply a matter of finding every occurrence of (0,1) or (1,0), then for each one seeing if it is adjacent to an existing solution or if the two bookend elements form another solution.
With a bit of storage trickery you will be able to find all solutions in linear time. Enumerating them will be O(N^2), but you should be able to encode them in O(N) space.

How to count each digit in a range of integers?

Imagine you sell those metallic digits used to number houses, locker doors, hotel rooms, etc. You need to find how many of each digit to ship when your customer needs to number doors/houses:
1 to 100
51 to 300
1 to 2,000 with zeros to the left
The obvious solution is to do a loop from the first to the last number, convert the counter to a string with or without zeros to the left, extract each digit and use it as an index to increment an array of 10 integers.
I wonder if there is a better way to solve this, without having to loop through the entire integers range.
Solutions in any language or pseudocode are welcome.
Edit:
Answers review
John at CashCommons and Wayne Conrad comment that my current approach is good and fast enough. Let me use a silly analogy: If you were given the task of counting the squares in a chess board in less than 1 minute, you could finish the task by counting the squares one by one, but a better solution is to count the sides and do a multiplication, because you later may be asked to count the tiles in a building.
Alex Reisner points to a very interesting mathematical law that, unfortunately, doesn’t seem to be relevant to this problem.
Andres suggests the same algorithm I’m using, but extracting digits with %10 operations instead of substrings.
John at CashCommons and phord propose pre-calculating the digits required and storing them in a lookup table or, for raw speed, an array. This could be a good solution if we had an absolute, unmovable, set in stone, maximum integer value. I’ve never seen one of those.
High-Performance Mark and strainer computed the needed digits for various ranges. The result for one million seems to indicate there is a proportion, but the results for other numbers show different proportions.
strainer found some formulas that may be used to count digits for numbers which are a power of ten.
Robert Harvey had a very interesting experience posting the question at MathOverflow. One of the math guys wrote a solution using mathematical notation.
Aaronaught developed and tested a solution using mathematics. After posting it he reviewed the formulas originated from Math Overflow and found a flaw in it (point to Stackoverflow :).
noahlavine developed an algorithm and presented it in pseudocode.
A new solution
After reading all the answers, and doing some experiments, I found that for a range of integers from 1 to 10^n - 1:
For digits 1 to 9, n*10^(n-1) pieces are needed
For digit 0, if not using leading zeros, n*10^(n-1) - ((10^n - 1) / 9) are needed
For digit 0, if using leading zeros, n*10^(n-1) - n are needed
The first formula was found by strainer (and probably by others), and I found the other two by trial and error (but they may be included in other answers).
For example, if n = 6, range is 1 to 999,999:
For digits 1 to 9 we need 6*10^5 = 600,000 of each one
For digit 0, without leading zeros, we need 6*10^5 - (10^6 - 1)/9 = 600,000 - 111,111 = 488,889
For digit 0, with leading zeros, we need 6*10^5 - 6 = 599,994
These numbers can be checked using High-Performance Mark results.
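The three formulas are also easy to check by brute force for small n. In this sketch the helper names are made up for the check, and "leading zeros" is taken to mean padding every number to n digits.

```python
def formula_counts(n):
    """Counts for the range 1 .. 10**n - 1 according to the three formulas."""
    per_digit = n * 10 ** (n - 1)                  # digits 1 to 9
    zeros_plain = per_digit - (10 ** n - 1) // 9   # digit 0, no leading zeros
    zeros_padded = per_digit - n                   # digit 0, padded to n digits
    return per_digit, zeros_plain, zeros_padded


def brute_counts(n):
    """Same three numbers obtained by looping over the whole range."""
    plain = [0] * 10
    padded_zeros = 0
    for number in range(1, 10 ** n):
        for ch in str(number):
            plain[int(ch)] += 1
        padded_zeros += str(number).zfill(n).count("0")
    assert len(set(plain[1:])) == 1                # digits 1 to 9 all occur equally often
    return plain[1], plain[0], padded_zeros


for n in range(1, 5):
    print(n, formula_counts(n), brute_counts(n))
print(formula_counts(6))  # (600000, 488889, 599994)
```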
Using these formulas, I improved the original algorithm. It still loops from the first to the last number in the range of integers, but, if it finds a number which is a power of ten, it uses the formulas to add to the digits count the quantity for a full range of 1 to 9 or 1 to 99 or 1 to 999 etc. Here's the algorithm in pseudocode:
integer First,Last   //First and last number in the range
integer Number       //Current number in the loop
integer Power        //Power is the n in 10^n in the formulas
integer Nines        //Nines is the result of 10^n - 1, e.g. 10^5 - 1 = 99999
integer Prefix       //First digits in a number. For 14,200, prefix is 142
array 0..9 Digits    //Will hold the count for all the digits
FOR Number = First TO Last
    CALL TallyDigitsForOneNumber WITH Number,1  //Tally the count of each digit
                                                //in the number, increment by 1
    //Start of optimization. Comments are for Number = 1,000 and Last = 8,000.
    Power = Zeros at the end of number          //For 1,000, Power = 3
    IF Power > 0                                //The number ends in 0 00 000 etc
        Nines = 10^Power-1                      //Nines = 10^3 - 1 = 1000 - 1 = 999
        IF Number+Nines <= Last                 //If 1,000+999 <= 8,000, add a full set
            Digits[0-9] += Power*10^(Power-1)   //Add 3*10^(3-1) = 300 to digits 0 to 9
            Digits[0] -= Power                  //Adjust digit 0 (leading zeros formula)
            Prefix = First digits of Number     //For 1000, prefix is 1
            CALL TallyDigitsForOneNumber WITH Prefix,Nines //Tally the count of each
                                                //digit in prefix,
                                                //increment by 999
            Number += Nines                     //Increment the loop counter 999 cycles
        ENDIF
    ENDIF
    //End of optimization
ENDFOR
SUBROUTINE TallyDigitsForOneNumber PARAMS Number,Count
    REPEAT
        Digits [ Number % 10 ] += Count
        Number = Number / 10
    UNTIL Number = 0
For example, for range 786 to 3,021, the counter will be incremented:
By 1 from 786 to 790 (5 cycles)
By 9 from 790 to 799 (1 cycle)
By 1 from 799 to 800
By 99 from 800 to 899
By 1 from 899 to 900
By 99 from 900 to 999
By 1 from 999 to 1000
By 999 from 1000 to 1999
By 1 from 1999 to 2000
By 999 from 2000 to 2999
By 1 from 2999 to 3000
By 1 from 3000 to 3010 (10 cycles)
By 9 from 3010 to 3019 (1 cycle)
By 1 from 3019 to 3021 (2 cycles)
Total: 28 cycles
Without optimization: 2,235 cycles
Note that this algorithm solves the problem without leading zeros. To use it with leading zeros, I used a hack:
If range 700 to 1,000 with leading zeros is needed, use the algorithm for 10,700 to 11,000 and then subtract 1,000 - 700 = 300 from the count of digit 1.
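For readers who prefer runnable code, here is a Python sketch of the pseudocode above (the original is in Clarion). It assumes first >= 1, uses a while loop because the counter is advanced inside the body, and subtracts Power from the zero count for each skipped block, since the trailing digits 00..01 to 99..99 of the skipped numbers contain Power fewer zeros than a full set. The function names are chosen for this sketch.

```python
def tally(digits, number, count):
    """Add count to the tally of every decimal digit of number."""
    while True:
        digits[number % 10] += count
        number //= 10
        if number == 0:
            break


def count_digits_range(first, last):
    """Digit counts for first..last (first >= 1), skipping full blocks of 9, 99, 999..."""
    digits = [0] * 10
    number = first
    while number <= last:
        tally(digits, number, 1)
        power, rest = 0, number
        while rest % 10 == 0:              # count trailing zeros of number
            rest //= 10
            power += 1
        if power > 0:
            nines = 10 ** power - 1
            if number + nines <= last:     # the whole block fits in the range
                for d in range(10):
                    digits[d] += power * 10 ** (power - 1)
                digits[0] -= power         # assumption: zero adjustment is a subtraction
                tally(digits, number // 10 ** power, nines)   # prefix repeats nines times
                number += nines
        number += 1
    return digits


print(count_digits_range(786, 3021))
```

Comparing the result with a plain loop over the same range is a quick way to validate the block-skipping logic.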
Benchmark and Source code
I tested the original approach, the same approach using %10 and the new solution for some large ranges, with these results:
Original 104.78 seconds
With %10 83.66
With Powers of Ten 0.07
[Screenshot of the benchmark application; source: clarion.sca.mx]
If you would like to see the full source code or run the benchmark, use these links:
Complete Source code (in Clarion): http://sca.mx/ftp/countdigits.txt
Compilable project and win32 exe: http://sca.mx/ftp/countdigits.zip
Accepted answer
noahlavine's solution may be correct, but I just couldn't follow the pseudocode; I think there are some details missing or not completely explained.
Aaronaught solution seems to be correct, but the code is just too complex for my taste.
I accepted strainer’s answer, because his line of thought guided me to develop this new solution.
There's a clear mathematical solution to a problem like this. Let's assume the value is zero-padded to the maximum number of digits (it's not, but we'll compensate for that later), and reason through it:
From 0-9, each digit occurs once
From 0-99, each digit occurs 20 times (10x in position 1 and 10x in position 2)
From 0-999, each digit occurs 300 times (100x in P1, 100x in P2, 100x in P3)
The obvious pattern for any given digit, if the range is from 0 to a power of 10, is N * 10^(N-1), where N is the power of 10.
What if the range is not a power of 10? Start with the lowest power of 10, then work up. The easiest case to deal with is a maximum like 399. We know that for each multiple of 100, each digit occurs at least 20 times, but we have to compensate for the number of times it appears in the most-significant-digit position, which is going to be exactly 100 for digits 0-3, and exactly zero for all other digits. Specifically, the extra amount to add is 10^N for the relevant digits.
Putting this into a formula, for upper bounds that are 1 less than some multiple of a power of 10 (i.e. 399, 6999, etc.) it becomes: M * N * 10^(N-1) + iif(d <= M, 10^N, 0)
Now you just have to deal with the remainder (which we'll call R). Take 445 as an example. This is whatever the result is for 399, plus the range 400-445. In this range, the MSD occurs R more times, and all digits (including the MSD) also occur at the same frequencies they would from range [0 - R].
Now we just have to compensate for the leading zeros. This pattern is easy - it's just:
10^N + 10^(N-1) + 10^(N-2) + ... + 10^0
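As a cross-check on this kind of reasoning, the count can also be computed one decimal position at a time. The sketch below is a standard per-position formulation, not the recursive method of this answer; it counts occurrences of a digit in 1..n without leading zeros, and the function names are introduced here.

```python
def count_digit(d, n):
    """Occurrences of digit d (0-9) in the decimal numbers 1..n, no leading zeros."""
    count = 0
    p = 1                                  # place value: 1, 10, 100, ...
    while p <= n:
        high = n // (p * 10)               # digits to the left of this position
        cur = (n // p) % 10                # digit of n at this position
        low = n % p                        # digits to the right of this position
        if d == 0:
            high -= 1                      # a zero here needs a nonzero prefix
        if cur > d:
            count += (high + 1) * p
        elif cur == d:
            count += high * p + low + 1
        else:
            count += high * p
        p *= 10
    return count


def count_digit_range(d, low, high):
    """Occurrences of d in low..high inclusive, for 1 <= low <= high."""
    return count_digit(d, high) - count_digit(d, low - 1)


print([count_digit(d, 999999) for d in range(10)])
```

Each position is handled in constant time, so the whole count takes O(log n) steps, the same order as the recursive approach.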
Update: This version correctly takes into account "padding zeros", i.e. the zeros in middle positions when dealing with the remainder ([400, 401, 402, ...]). Figuring out the padding zeros is a bit ugly, but the revised code (C-style pseudocode) handles it:
function countdigits(int d, int low, int high) {
    return countdigits(d, low, high, false);
}
function countdigits(int d, int low, int high, bool inner) {
    if (high == 0)
        return (d == 0) ? 1 : 0;
    if (low > 0)
        return countdigits(d, 0, high) - countdigits(d, 0, low);
    int n = floor(log10(high));
    int m = floor((high + 1) / pow(10, n));
    int r = high - m * pow(10, n);
    return
        (max(m, 1) * n * pow(10, n-1)) +                            // (1)
        ((d < m) ? pow(10, n) : 0) +                                // (2)
        (((r >= 0) && (n > 0)) ? countdigits(d, 0, r, true) : 0) +  // (3)
        (((r >= 0) && (d == m)) ? (r + 1) : 0) +                    // (4)
        (((r >= 0) && (d == 0)) ? countpaddingzeros(n, r) : 0) -    // (5)
        (((d == 0) && !inner) ? countleadingzeros(n) : 0);          // (6)
}
function countleadingzeros(int n) {
    int tmp = 0;
    do {
        tmp = pow(10, n) + tmp;
        --n;
    } while (n > 0);
    return tmp;
}
function countpaddingzeros(int n, int r) {
    return (r + 1) * max(0, n - max(0, floor(log10(r))) - 1);
}
As you can see, it's gotten a bit uglier but it still runs in O(log n) time, so if you need to handle numbers in the billions, this will still give you instant results. :-) And if you run it on the range [0 - 1000000], you get the exact same distribution as the one posted by High-Performance Mark, so I'm almost positive that it's correct.
FYI, the reason for the inner variable is that the leading-zero function is already recursive, so it can only be counted in the first execution of countdigits.
Update 2: In case the code is hard to read, here's a reference for what each line of the countdigits return statement means (I tried inline comments but they made the code even harder to read):
(1) Frequency of any digit up to highest power of 10 (0-99, etc.)
(2) Frequency of MSD above any multiple of highest power of 10 (100-399)
(3) Frequency of any digits in remainder (400-445, R = 45)
(4) Additional frequency of MSD in remainder
(5) Count zeros in middle position for remainder range (404, 405...)
(6) Subtract leading zeros only once (on outermost loop)
I'm assuming you want a solution where the numbers are in a range, and you have the starting and ending number. Imagine starting with the start number and counting up until you reach the end number - it would work, but it would be slow. I think the trick to a fast algorithm is to realize that in order to go up one digit in the 10^x place and keep everything else the same, you need to use all of the digits before it 10^x times plus all digits 0-9 10^(x-1) times. (Except that your counting may have involved a carry past the x-th digit - I correct for this below.)
Here's an example. Say you're counting from 523 to 1004.
First, you count from 523 to 524. This uses the digits 5, 2, and 4 once each.
Second, count from 524 to 604. The rightmost digit does 6 cycles through all of the digits, so you need 6 copies of each digit. The second digit goes through digits 2 through 0, 10 times each. The third digit is 6 five times and 5 (100 - 24) times.
Third, count from 604 to 1004. The rightmost digit does 40 cycles, so add 40 copies of each digit. The second-from-right digit does 4 cycles, so add 4 copies of each digit. The leftmost digit does 100 each of 7, 8, and 9, plus 5 of 0 and 100 - 5 of 6. The last digit is 1 five times.
To speed up the last bit, look at the part about the rightmost two places. It uses each digit 10 + 1 times. In general, 1 + 10 + ... + 10^n = (10^(n+1) - 1)/9, which we can use to speed up counting even more.
My algorithm is to count up from the start number to the end number (using base-10 counting), but use the fact above to do it quickly. You iterate through the digits of the starting number from least to most significant, and at each place you count up so that that digit is the same as the one in the ending number. At each point, n is the number of up-counts you need to do before you get to a carry, and m the number you need to do afterwards.
Now let's assume pseudocode counts as a language. Here, then, is what I would do:
convert start and end numbers to digit arrays start[] and end[]
create an array counts[] with 10 elements which stores the number of copies of
    each digit that you need
iterate through start number from right to left. at the i-th digit,
    let d be the number of digits you must count up to get from this digit
        to the i-th digit in the ending number. (i.e. subtract the equivalent
        digits mod 10)
    add d * (10^i - 1)/9 to each entry in count.
    let m be the numerical value of all the digits to the right of this digit,
        n be 10^i - m.
    for each digit e from the left of the starting number up to and including the
        i-th digit, add n to the count for that digit.
    for j in 1 to d
        increment the i-th digit by one, including doing any carries
        for each digit e from the left of the starting number up to and including
            the i-th digit, add 10^i to the count for that digit
    for each digit e from the left of the starting number up to and including the
        i-th digit, add m to the count for that digit.
    set the i-th digit of the starting number to be the i-th digit of the ending
        number.
Oh, and since the value of i increases by one each time, keep track of your old 10^i and just multiply it by 10 to get the new one, instead of exponentiating each time.
To reel off the digits of a number, we'd only ever need a costly string conversion if we couldn't do a mod. Digits can most quickly be pushed off a number like this:
feed = number;
do {
    digit = feed % 10;
    feed /= 10;
    // use digit, e.g. digitTally[digit]++;
} while (feed > 0);
That loop should be very fast, and the simplest way to tally the digits is to place it inside a loop running from the start number to the end number.
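For anyone who wants to try the simple version, here is a rough Python sketch of that nested loop. The name `tally_digits` is just for illustration, and it assumes non-negative integers and an inclusive range.

```python
def tally_digits(start, end):
    """Tally decimal digits of every integer in start..end (inclusive),
    peeling digits off with % and // instead of converting to strings."""
    tally = [0] * 10
    for number in range(start, end + 1):
        feed = number
        while True:                      # do-while: runs once even for 0
            feed, digit = divmod(feed, 10)
            tally[digit] += 1
            if feed == 0:
                break
    return tally


print(tally_digits(1, 49))
# [4, 15, 15, 15, 15, 5, 5, 5, 5, 5]
```

This is linear in the size of the range, so it is mainly useful as a reference for checking faster methods.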
To go faster for larger ranges of numbers, I'm looking for an optimised method of tallying all digits from 0 to number*10^significance
(a general start-to-end range still baffles me).
Here is a table showing the digit tallies for some upper bounds with a single significant digit.
The tallies include 0 but not the top value itself. That was an oversight,
but it may make the patterns a bit easier to see (the top value's digits are absent here).
These tallies don't include leading zeros.
digit     1    10   100  1000 10000     2    20    30    40    60    90   200   600  2000  6000
0         1     1    10   190  2890     1     2     3     4     6     9    30   110   490  1690
1         0     1    20   300  4000     1    12    13    14    16    19   140   220  1600  2800
2         0     1    20   300  4000     0     2    13    14    16    19    40   220   600  2800
3         0     1    20   300  4000     0     2     3    14    16    19    40   220   600  2800
4         0     1    20   300  4000     0     2     3     4    16    19    40   220   600  2800
5         0     1    20   300  4000     0     2     3     4    16    19    40   220   600  2800
6         0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
7         0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
8         0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
9         0     1    20   300  4000     0     2     3     4     6     9    40   120   600  1800
Edit: clearing up my original thoughts.
From the brute-force table showing tallies from 0 (included) to the top value (not included), it is visible that a major digit md at ten-power tp:
increments tally[0 to 9] by md*tp*10^(tp-1)
increments tally[1 to md-1] by 10^tp
decrements tally[0] by (10^tp - 10)
(to remove leading 0s if tp>leadingzeros)
can increment tally[moresignificantdigits] by self(md*10^tp)
(to complete an effect)
If these tally adjustments were applied for each significant digit,
the tally should be modified as though counted from 0 to end-1.
The adjustments can be inverted to remove the preceding range (the start number).
Thanks Aaronaught for your complete and tested answer.
Here's a very bad answer, I'm ashamed to post it. I asked Mathematica to tally the digits used in all numbers from 1 to 1,000,000, no leading 0s. Here's what I got:
0 488895
1 600001
2 600000
3 600000
4 600000
5 600000
6 600000
7 600000
8 600000
9 600000
Next time you're ordering sticky digits for selling in your hardware store, order in these proportions, you won't be far wrong.
I asked this question on Math Overflow, and got spanked for asking such a simple question. One of the users took pity on me and said if I posted it to The Art of Problem Solving, he would answer it; so I did.
Here is the answer he posted:
http://www.artofproblemsolving.com/Forum/viewtopic.php?p=1741600#1741600
Embarrassingly, my math-fu is inadequate to understand what he posted (the guy is 19 years old...that is so depressing). I really need to take some math classes.
On the bright side, the equation is recursive, so it should be a simple matter to turn it into a recursive function with a few lines of code, by someone who understands the math.
I know this question has an accepted answer but I was tasked with writing this code for a job interview and I think I came up with an alternative solution that is fast, requires no loops and can use or discard leading zeroes as required.
It is in fact quite simple but not easy to explain.
If you list out the first n numbers
1
2
3
.
.
.
9
10
11
It is usual to start counting the digits required from the start room number to the end room number in a left to right fashion, so for the above we have one 1, one 2, one 3 ... one 9, two 1's one zero, four 1's etc. Most solutions I have seen used this approach with some optimisation to speed it up.
What I did was to count vertically in columns, as in hundreds, tens, and units. You know the highest room number so we can calculate how many of each digit there are in the hundreds column via a single division, then recurse and calculate how many in the tens column etc. Then we can subtract the leading zeros if we like.
Easier to visualize if you use Excel to write out the numbers but use a separate column for each digit of the number
A B C
- - -
0 0 1 (assuming room numbers do not start at zero)
0 0 2
0 0 3
.
.
.
3 6 4
3 6 5
.
.
.
6 6 9
6 7 0
6 7 1
^
sum in columns not rows
So if the highest room number is 671, the hundreds column will have 100 zeroes vertically, followed by 100 ones, and so on up to 72 sixes (600 through 671). Ignore the 100 zeroes if required, as we know these are all leading.
Then recurse down to the tens and perform the same operation: we know there will be 10 zeroes followed by 10 ones and so on, repeated six times, then a final partial pass that ends with 2 sevens. Again, we can ignore the first 10 zeroes as we know they are leading. Finally, of course, do the units, ignoring the first zero as required.
So there are no loops; everything is calculated with division. I use recursion for travelling "up" the columns until the max one is reached (in this case hundreds) and then back down, totalling as it goes.
I wrote this in C# and can post code if anyone interested, haven't done any benchmark timings but it is essentially instant for values up to 10^18 rooms.
Could not find this approach mentioned here or elsewhere so thought it might be useful for someone.
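For readers who want to experiment, the column-by-column idea can be sketched roughly as follows. This is not the C# code mentioned above: it is a Python approximation with invented names, it walks the columns with a short loop rather than recursion, it always discards leading zeros, and it assumes room numbers start at 1.

```python
def count_digit(digit, n):
    """Occurrences of `digit` in 1..n written without leading zeros."""
    total = 0
    place = 1                            # 1 = units column, 10 = tens column, ...
    while place <= n:
        higher = n // (place * 10)       # part of n to the left of this column
        current = (n // place) % 10      # digit of n in this column
        lower = n % place                # part of n to the right of this column
        if digit == 0:
            if higher == 0:
                break                    # only leading zeros remain
            higher -= 1                  # drop the block where this zero would lead
        if current > digit:
            total += (higher + 1) * place
        elif current == digit:
            total += higher * place + lower + 1
        else:
            total += higher * place
        place *= 10
    return total


def count_digit_in_range(digit, low, high):
    """Occurrences of `digit` in low..high inclusive, assuming 1 <= low <= high."""
    return count_digit(digit, high) - count_digit(digit, low - 1)


print([count_digit(d, 49) for d in range(10)])
# [4, 15, 15, 15, 15, 5, 5, 5, 5, 5]
```

Each column costs a few divisions, so the work grows with the number of digits in the highest room number rather than with the number of rooms.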
Your approach is fine. I'm not sure why you would ever need anything faster than what you've described.
Or, this would give you an instantaneous solution: Before you actually need it, calculate what you would need from 1 to some maximum number. You can store the numbers needed at each step. If you have a range like your second example, it would be what's needed for 1 to 300, minus what's needed for 1 to 50.
Now you have a lookup table that can be called at will. Doing up to 10,000 would only take a few MB and, what, a few minutes to compute, once?
This doesn't answer your exact question, but it's interesting to note the distribution of first digits according to Benford's Law. For example, in many real-world collections of numbers that span several orders of magnitude, about 30% of the values start with "1", which is somewhat counter-intuitive.
I don't know of any distributions describing subsequent digits, but you might be able to determine this empirically and come up with a simple formula for computing an approximate number of digits required for any range of numbers.
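For reference, the first-digit share under Benford's Law is log10(1 + 1/d). A tiny sketch, with `benford_probability` as a made-up name, shows where the roughly 30% figure comes from. It describes leading digits only, so it is an aside rather than a way to count the digits needed for a contiguous range.

```python
import math


def benford_probability(first_digit):
    """Benford's Law: expected share of values whose leading digit is `first_digit`."""
    return math.log10(1 + 1 / first_digit)


for d in range(1, 10):
    print(d, round(benford_probability(d), 3))
# first line printed: 1 0.301
```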
If "better" means "clearer," then I doubt it. If it means "faster," then yes, but I wouldn't use a faster algorithm in place of a clearer one without a compelling need.
#!/usr/bin/ruby1.8
def digits_for_range(min, max, leading_zeros)
  bins = [0] * 10
  format = [
    '%',
    ('0' if leading_zeros),
    max.to_s.size,
    'd',
  ].compact.join
  (min..max).each do |i|
    s = format % i
    for digit in s.scan(/./)
      bins[digit.to_i] +=1 unless digit == ' '
    end
  end
  bins
end
p digits_for_range(1, 49, false)
# => [4, 15, 15, 15, 15, 5, 5, 5, 5, 5]
p digits_for_range(1, 49, true)
# => [13, 15, 15, 15, 15, 5, 5, 5, 5, 5]
p digits_for_range(1, 10000, false)
# => [2893, 4001, 4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000]
Ruby 1.8, a language known to be "dog slow," runs the above code in 0.135 seconds. That includes loading the interpreter. Don't give up an obvious algorithm unless you need more speed.
If you need raw speed over many iterations, try a lookup table:
Build an array with 2 dimensions: 10 x max-house-number
int nDigits[10000][10] ; // Don't try this on the stack, kids!
Fill each row with the count of digits required to get to that number from zero.
Hint: Use the previous row as a start:
n=0..9999:
    if (n>0) nDigits[n] = nDigits[n-1]
    d=0..9:
        nDigits[n][d] += countOccurrencesOf(n,d)
Number of digits "between" two numbers becomes simple subtraction.
For range=51 to 300, take the counts for 300 and subtract the counts for 50.
0's = nDigits[300][0] - nDigits[50][0]
1's = nDigits[300][1] - nDigits[50][1]
2's = nDigits[300][2] - nDigits[50][2]
3's = nDigits[300][3] - nDigits[50][3]
etc.
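A quick way to try the lookup-table idea is sketched below. The helper names are invented, and it assumes house numbers start at 1, so row 0 is all zeros and the counts for `low..high` come from subtracting row `low - 1`.

```python
def digit_histogram(n):
    """Digits of one non-negative integer as a 10-slot histogram."""
    hist = [0] * 10
    for ch in str(n):
        hist[int(ch)] += 1
    return hist


def build_table(max_number):
    """table[n][d] = occurrences of digit d in the numbers 1..n (row 0 is all zeros)."""
    table = [[0] * 10]
    for n in range(1, max_number + 1):
        row = table[-1][:]                       # start from the previous row
        for d, count in enumerate(digit_histogram(n)):
            row[d] += count
        table.append(row)
    return table


def digits_needed(table, low, high):
    """Digits for low..high inclusive: counts up to high minus counts up to low - 1."""
    return [hi - lo for hi, lo in zip(table[high], table[low - 1])]


table = build_table(9999)
print(digits_needed(table, 51, 300))
```

Building the table is a one-time cost; after that every range query is ten subtractions.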
You can separate each digit (look here for an example), create a histogram with entries from 0..9 (which will count how many times each digit appeared in a number) and multiply by the number of 'numbers' asked.
But if that isn't what you are looking for, can you give a better example?
Edited:
Now I think I got the problem. I think you can reckon this (pseudo C):
int histogram[10];
memset(histogram, 0, sizeof(histogram));
for (i = startNumber; i <= endNumber; ++i)
{
    array = separateDigits(i);
    for (j = 0; j < array.length; ++j)
    {
        histogram[array[j]]++;
    }
}
separateDigits implements the function from the link.
Each position of the histogram will have the amount of each digit. For example
histogram[0] == total of zeros
histogram[1] == total of ones
...
Regards

Resources