Efficiently count occurrences of each element from given ranges - algorithm

So i have some ranges like these:
2 4
1 9
4 5
4 7
For this the result should be
1 -> 1
2 -> 2
3 -> 2
4 -> 4
5 -> 3
6 -> 2
7 -> 2
8 -> 1
9 -> 1
The naive approach will be to loop through all the ranges but that would be very inefficient and the worst case would take O(n * n)
What would be the efficient approach probably in O(n) or O(log(n))

Here's the solution, in O(n):
The rationale is to add a range [a, b] as a +1 in a, and a -1 after b. Then, after adding all the ranges, then compute the accumulated sums for that array and display it.
If you need to perform queries while adding the values, a better choice would be to use a Binary Indexed Tree, but your question doesn't seem to require this, so I left it out.
#include <iostream>
#define MAX 1000
using namespace std;
int T[MAX];
int main() {
int a, b;
int min_index = 0x1f1f1f1f, max_index = 0;
while(cin >> a >> b) {
T[a] += 1;
T[b+1] -= 1;
min_index = min(min_index, a);
max_index = max(max_index, b);
}
for(int i=min_index; i<=max_index; i++) {
T[i] += T[i-1];
cout << i << " -> " << T[i] << endl;
}
}
UPDATE: Based on the "provocations" (in a good sense) by גלעד ברקן, you can also do this in O(n log n):
#include <iostream>
#include <map>
#define ull unsigned long long
#define miit map<ull, int>::iterator
using namespace std;
map<ull, int> T;
int main() {
ull a, b;
while(cin >> a >> b) {
T[a] += 1;
T[b+1] -= 1;
}
ull last;
int count = 0;
for(miit it = T.begin(); it != T.end(); it++) {
if (count > 0)
for(ull i=last; i<it->first; i++)
cout << i << " " << count << endl;
count += it->second;
last = it->first;
}
}
The advantage of this solution is being able to support ranges with much larger values (as long as the output isn't so large).

The solution would be pretty simple:
generate two lists with the indices of all starting and ending indices of the ranges and sort them.
Generate a counter for the number of ranges that cover the current index. Start at the first item that is at any range and iterate over all numbers to the last element that is in any range. Now if an index is either part of the list of starting-indices, we add 1 to the counter, if it's an element of the ending-indices, we substract 1 from the counter.
Implementation:
vector<int> count(int** ranges , int rangecount , int rangemin , int rangemax)
{
vector<int> res;
set<int> open, close;
for(int** r = ranges ; r < ranges + sizeof(int*) * rangecount ; r++)
{
open.add((*r)[0]);
close.add((*r)[1]);
}
int rc = 0;
for(int i = rangemin ; i < rangemax ; i++)
{
if(open.count(i))
++rc;
res.add(rc);
if(close.count(i))
--rc;
}
return res;
}

Paul's answer still counts from "the first item that is at any range and iterate[s] over all numbers to the last element that is in any range." But what is we could aggregate overlapping counts? For example, if we have three (or say a very large number of) overlapping ranges [(2,6),[1,6],[2,8] the section (2,6) could be dependent only on the number of ranges, if we were to label the overlaps with their counts [(1),3(2,6),(7,8)]).
Using binary search (once for the start and a second time for the end of each interval), we could split the intervals and aggregate the counts in O(n * log m * l) time, where n is our number of given ranges and m is the number of resulting groups in the total range and l varies as the number of disjoint updates required for a particular overlap (the number of groups already within that range). Notice that at any time, we simply have a sorted list grouped as intervals with labeled count.
2 4
1 9
4 5
4 7
=>
(2,4)
(1),2(2,4),(5,9)
(1),2(2,3),3(4),2(5),(6,9)
(1),2(2,3),4(4),3(5),2(6,7),(8,9)

So you want the output to be an array, where the value of each element is the number of input ranges that include it?
Yeah, the obvious solution would be to increment every element in the range by 1, for each range.
I think you can get more efficient if you sort the input ranges by start (primary), end (secondary). So for 32bit start and end, start:end can be a 64bit sort key. Actually, just sorting by start is fine, we need to sort the ends differently anyway.
Then you can see how many ranges you enter for an element, and (with a pqueue of range-ends) see how many you already left.
# pseudo-code with possible bugs.
# TODO: peek or put-back the element from ranges / ends
# that made the condition false.
pqueue ends; // priority queue
int depth = 0; // how many ranges contain this element
for i in output.len {
while (r = ranges.next && r.start <= i) {
ends.push(r.end);
depth++;
}
while (ends.pop < i) {
depth--;
}
output[i] = depth;
}
assert ends.empty();
Actually, we can just sort the starts and ends separately into two separate priority queues. There's no need to build the pqueue on the fly. (Sorting an array of integers is more efficient than sorting an array of structs by one struct member, because you don't have to copy around as much data.)

Related

Algorithm that will maintain top n items in the past k days?

I would like to implement a data structure maintaining a set S for a leaderboard that can answer the following queries efficiently, while also being memory-efficient:
add(x, t) Add a new item with score x to set S with an associated time t.
query(u) List the top n items (sorted by score) in the set S which has associated time t such that t + k >= u. Each subsequent query will have a u no smaller than previous queries.
In standard English, highscores can be added to this leaderboard individually, and I'd like an algorithm that can efficiently query the top n items on the leaderboard within the post k days (where k and n are fixed constants).
n can be assumed to be much less than the total number of items, and scores may be assumed to be random.
A naïve algorithm would be to store all elements as they are added into a balanced binary search tree sorted by score, and remove elements from the tree when they are more than k days old. Detecting elements that are more than k days old can be done with another balanced binary search tree sorted by time. This algorithm would yield a good time complexity of O(log(h)) where h is the total number of scores added in the past k days. However, the space complexity is O(h), and it is easy to see that most of the data saved will never be reported in a query even if no new scores are added for the next k days.
If n is 1, a simple double-ended queue is all that is necessary. Before adding a new item to the front of the queue, remove items from the front that have a smaller score than the new item, because they will never be reported in a query. Before querying, remove items from the back of the queue that are too old, then return the item that is left at the back of the queue. All operations would be amortized constant time complexity, and I wouldn't be storing items that would never be reported.
When n is more than 1, I can't seem to be able to formulate an algorithm which has a good time complexity and only stores items that could possibly be reported. An algorithm with time complexity O(log(h)) would be great, but n is small enough so that O(log(h) + n) is acceptable too.
Any ideas? Thanks!
This solution is based on the double-ended queue solution and I assume t is ascending.
The idea is that a record can be removed if there are n records with both larger t and larger x than it, which is implemented by Record.count in the sample code.
As each record would be moved from S to temp at most n times, we have average time complexity O(n).
The space complexity is hard to decide. However, it looks fine in the simulation. S.size() is about 400 when h = 10000 and n = 50.
#include <iostream>
#include <vector>
#include <queue>
#include <cstdlib>
using namespace std;
const int k = 10000, n = 50;
class Record {
public:
Record(int _x, int _t): x(_x), t(_t), count(n) {}
int x, t, count;
};
deque<Record> S;
void add(int x, int t)
{
Record record(x, t);
vector<Record> temp;
while (!S.empty() && record.x >= S.back().x) {
if (--S.back().count > 0) temp.push_back(S.back());
S.pop_back();
}
S.push_back(record);
while (!temp.empty()) {
S.push_back(temp.back());
temp.pop_back();
}
}
vector<int> query(int u)
{
while (S.front().t + k < u)
S.pop_front();
vector<int> xs;
for (int i = 0; i < S.size() && i < n; ++i)
xs.push_back(S[i].x);
return xs;
}
int main()
{
for (int t = 1; t <= 1000000; ++t) {
add(rand(), t);
vector<int> xs = query(t);
if (t % k == 0) {
cout << "t = " << t << endl;
cout << "S.size() = " << S.size() << endl;
for (auto x: xs) cout << x << " ";
cout << endl;
}
}
return 0;
}

Merge sort in c++ with divide 3

How can I write a merge sort but divide to 3?
int merge_sort(int input[], int p, int r)
{
if ( p >= r )
return 0;
int mid = floor((p + r) / 2);
merge_sort(input, p, mid);
**merge_sort(input, mid + 1, r);**
merge(input, p, r);
}
This is probably supposed to be a 3 way merge. You may want to consider using a bottom up merge sort. For either top down or bottom up merge, most of the complexity is going to be in the merge function. As mentioned in the answer linked to by zwergmaster, it's a 3 way merge of runs. Each run needs a current and ending index or pointer. A sequence of if / else statements end up doing two compares to determine which of 3 runs has the smallest element, and then that smallest element is moved to the destination array (or vector or ...) and the next element from that run is retrieved. When the end of one of the 3 runs is reached, the code switches into a 2 way merge. When the end of the next run is reached, the code copies the rest of the remaining run. Then the next set of 3 runs are merged, repeating the process until the end of the array is reached, which could happen within any of the 3 runs, so the last merge near the end of the array may be a merge of 3 or 2 runs, or just a copy of 1 run.
It would be more efficient to have an initial function that allocates a temp array the same size as the array to be sorted, then have it call the merge sort function passing the temp array as a parameter, rather than constantly allocating and freeing small temp arrays during the merge sort process.
So using top down merge sort partial code to help explain this:
merge_sort(int *a, int n)
{
int *b = new int[n];
top_down_merge_sort(a, b, 0, n);
/* ... */
delete[] b;
}
top_down_merge_sort(int *a, int *b, int beg, int end)
{
if(end - beg < 3){
/* sort in place */
return;
}
int run0 = beg;
int run1 = beg + (end-beg)/3;
int run2 = beg + 2*(end-beg)/3;
top_down_merge_sort(a, b, run0, run1);
top_down_merge_sort(a, b, run1, run2);
top_down_merge_sort(a, b, run2, end);
merge_runs(a, b, run0, run1, run2, end);
}

Explanation of iterative algorithm to print power set

I found this iterative algorithm that prints the power set for a given set:
void PrintSubsets()
{
int source[3] = {1,2,3};
int currentSubset = 7;
int tmp;
while(currentSubset)
{
printf("(");
tmp = currentSubset;
for(int i = 0; i<3; i++)
{
if (tmp & 1)
printf("%d ", source[i]);
tmp >>= 1;
}
printf(")\n");
currentSubset--;
}
}
However, I am not sure why it works. Is it similar to a solution where you use a set of n bits, and on each step, add 1 with carry, using the reuslting pattern of zeros and ones to determine which elements belong?
List all integers in the binary base, and light should shine:
{abc}
7 xxx
6 xx-
5 x-x
4 x--
3 -xx
2 -x-
1 --x
0 --- (omitted)
The order to enumerate the integers does not matter provided you list them all. Incrementing or decrementing are the most natural ways.

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of form 2^i * 5^j in increasing order.
For eg:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing this data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works!, but contains no further comments, sorry):
#include <math.h>
#include <map>
#include <iostream>
typedef __int64 Integer;
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;
Integer result(const MyPair &myPair)
{
return pow((double)2,(double)myPair.first) * pow((double)5,(double)myPair.second);
}
int main()
{
MyMap myMap;
MyPair firstValue(0,0);
myMap[result(firstValue)] = firstValue;
while (true)
{
auto it=myMap.begin();
if (it->first < 0) break; // overflow
MyPair myPair = it->second;
std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
myMap.erase(it);
MyPair pair1 = myPair;
++pair1.first;
myMap[result(pair1)] = pair1;
MyPair pair2 = myPair;
++pair2.second;
myMap[result(pair2)] = pair2;
}
}
This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current, next)=
member this.current = current
member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
Works well with Results then being a stream type with current value and a next method.
Walking through it:
I define min for completenes.
I define a stream type to have a current value and a method to return a new string, essentially head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then increments that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define squares to be a stream increasing in powers of 2.
AllPowers takes the start value and merges the stream resulting from all squares at this number of powers of 5. it with the stream resulting from multiplying it by 5, since these are your only two options. You effectively are left with a tree of results
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursio and compiler optimisations etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
if (s.current > 0) then
do System.Console.WriteLine(s.current)
PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in xrange(100):
mult_2 = numbers[next_2]*2
mult_5 = numbers[next_5]*5
if mult_2 < mult_5:
next = mult_2
next_2 += 1
else:
next = mult_5
next_5 += 1
# The comparison here is to avoid appending duplicates
if next > numbers[-1]:
numbers.append(next)
print numbers
So we have two loops, one incrementing i and second one incrementing j starting both from zero, right? (multiply symbol is confusing in the title of the question)
You can do something very straightforward:
Add all items in an array
Sort the array
Or you need an other solution with more math analysys?
EDIT: More smart solution by leveraging similarity with Merge Sort problem
If we imagine infinite set of numbers of 2^i and 5^j as two independent streams/lists this problem looks very the same as well known Merge Sort problem.
So solution steps are:
Get two numbers one from the each of streams (of 2 and of 5)
Compare
Return smallest
get next number from the stream of the previously returned smallest
and that's it! ;)
PS: Complexity of Merge Sort always is O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable states are. Image an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
while (/* some terminate condition or go on forever */) {
minNext = 0;
nextCell = [];
// look through all addable cells
for (int i=0; i<mu.length; ++i) {
if (i==0 or mu[i-1]>mu[i]) {
// check for new minimum value
if (minNext == 0 or 2^i * 5^(mu[i]+1) < minNext) {
nextCell = i;
minNext = 2^i * 5^(mu[i]+1)
}
}
}
// print next largest entry and update mu
print(minNext);
mu[i]++;
}
I wrote this in Maple stopping after 12 iterations:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and the outputted sequence of cells added and got this:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
First of all, (as others mentioned already) this question is very vague!!!
Nevertheless, I am going to give a shot based on your vague equation and the pattern as your expected result. So I am not sure the following will be true for what you are trying to do, however it may give you some idea about java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
private static List<Integer> findIncreasingNumbers(int maxIteration) {
SortedSet<Integer> numbers = new TreeSet<Integer>();
SortedSet<Integer> numbers2 = new TreeSet<Integer>();
for (int i=0;i < maxIteration;i++) {
int n1 = (int)Math.pow(2, i);
numbers.add(n1);
for (int j=0;j < maxIteration;j++) {
int n2 = (int)Math.pow(5, i);
numbers.add(n2);
for (Integer n: numbers) {
int n3 = n*n1;
numbers2.add(n3);
}
}
}
numbers.addAll(numbers2);
return new ArrayList<Integer>(numbers);
}
/**
* Based on the following fuzzy question # StackOverflow
* http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
*
*
* Result:
* 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
*/
public static void main(String[] args) {
List<Integer> numbers = findIncreasingNumbers(5);
for (Integer i: numbers) {
System.out.print(i + " ");
}
}
}
If you can do it in O(nlogn), here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean min-heap
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will possibly refer to as transforms of i,j or just transorms) gives us some clues of where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3 then increasing j by 1 while decreasing i by 2 will given an increase of just 0.3 . The problem then is at each stage finding the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thougths I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to screen and that the syntax/structure isn't valid for a real c# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this but I strongly suspect its better than some other solutions because at each stage all I need to do is check my set of transforms - working out how many of these there are compared to "n" is non-trivial. It is clearly related since the further you go the more transforms there are but the number of new transforms becomes vanishingly small at higher numbers so maybe its just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product allowing me to work out what the sequence would be without needing to calculate the actual number itself.
For what its worth after the first 230 nunmbers (when int runs out of space) I had 9 transforms to check each time. And given its only my total that overflowed I ran if for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone one's might have got the answer by now, but just wanted to give a direction to this solution..
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used. Starting from zero, determine whether each positive integer had values for i and j. This was facilitated by turning answer = (2^i)*(2^j) around and solving for i instead. That gave me i = log2 (answer/ (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^1) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}

A problem from a programming competition... Digit Sums

I need help solving problem N from this earlier competition:
Problem N: Digit Sums
Given 3 positive integers A, B and C,
find how many positive integers less
than or equal to A, when expressed in
base B, have digits which sum to C.
Input will consist of a series of
lines, each containing three integers,
A, B and C, 2 ≤ B ≤ 100, 1 ≤ A, C ≤
1,000,000,000. The numbers A, B and C
are given in base 10 and are separated
by one or more blanks. The input is
terminated by a line containing three
zeros.
Output will be the number of numbers,
for each input line (it must be given
in base 10).
Sample input
100 10 9
100 10 1
750000 2 2
1000000000 10 40
100000000 100 200
0 0 0
Sample output
10
3
189
45433800
666303
The relevant rules:
Read all input from the keyboard, i.e. use stdin, System.in, cin or equivalent. Input will be redirected from a file to form the input to your submission.
Write all output to the screen, i.e. use stdout, System.out, cout or equivalent. Do not write to stderr. Do NOT use, or even include, any module that allows direct manipulation of the screen, such as conio, Crt or anything similar. Output from your program is redirected to a file for later checking. Use of direct I/O means that such output is not redirected and hence cannot be checked. This could mean that a correct program is rejected!
Unless otherwise stated, all integers in the input will fit into a standard 32-bit computer word. Adjacent integers on a line will be separated by one or more spaces.
Of course, it's fair to say that I should learn more before trying to solve this, but i'd really appreciate it if someone here told me how it's done.
Thanks in advance, John.
Other people pointed out trivial solution: iterate over all numbers from 1 to A. But this problem, actually, can be solved in nearly constant time: O(length of A), which is O(log(A)).
Code provided is for base 10. Adapting it for arbitrary base is trivial.
To reach above estimate for time, you need to add memorization to recursion. Let me know if you have questions about that part.
Now, recursive function itself. Written in Java, but everything should work in C#/C++ without any changes. It's big, but mostly because of comments where I try to clarify algorithm.
// returns amount of numbers strictly less than 'num' with sum of digits 'sum'
// pay attention to word 'strictly'
int count(int num, int sum) {
// no numbers with negative sum of digits
if (sum < 0) {
return 0;
}
int result = 0;
// imagine, 'num' == 1234
// let's check numbers 1233, 1232, 1231, 1230 manually
while (num % 10 > 0) {
--num;
// check if current number is good
if (sumOfDigits(num) == sum) {
// one more result
++result;
}
}
if (num == 0) {
// zero reached, no more numbers to check
return result;
}
num /= 10;
// Using example above (1234), now we're left with numbers
// strictly less than 1230 to check (1..1229)
// It means, any number less than 123 with arbitrary digit appended to the right
// E.g., if this digit in the right (last digit) is 3,
// then sum of the other digits must be "sum - 3"
// and we need to add to result 'count(123, sum - 3)'
// let's iterate over all possible values of last digit
for (int digit = 0; digit < 10; ++digit) {
result += count(num, sum - digit);
}
return result;
}
Helper function
// returns sum of digits, plain and simple
int sumOfDigits(int x) {
int result = 0;
while (x > 0) {
result += x % 10;
x /= 10;
}
return result;
}
Now, let's write a little tester
int A = 12345;
int C = 13;
// recursive solution
System.out.println(count(A + 1, C));
// brute-force solution
int total = 0;
for (int i = 1; i <= A; ++i) {
if (sumOfDigits(i) == C) {
++total;
}
}
System.out.println(total);
You can write more comprehensive tester checking all values of A, but overall solution seems to be correct. (I tried several random A's and C's.)
Don't forget, you can't test solution for A == 1000000000 without memorization: it'll run too long. But with memorization, you can test it even for A == 10^1000.
edit
Just to prove a concept, poor man's memorization. (in Java, in other languages hashtables are declared differently) But if you want to learn something, it might be better to try to do it yourself.
// hold values here
private Map<String, Integer> mem;
int count(int num, int sum) {
// no numbers with negative sum of digits
if (sum < 0) {
return 0;
}
String key = num + " " + sum;
if (mem.containsKey(key)) {
return mem.get(key);
}
// ...
// continue as above...
// ...
mem.put(key, result);
return result;
}
Here's the same memoized recursive solution that Rybak posted, but with a simpler implementation, in my humble opinion:
HashMap<String, Integer> cache = new HashMap<String, Integer>();
int count(int bound, int base, int sum) {
// No negative digit sums.
if (sum < 0)
return 0;
// Handle one digit case.
if (bound < base)
return (sum <= bound) ? 1 : 0;
String key = bound + " " + sum;
if (cache.containsKey(key))
return cache.get(key);
int count = 0;
for (int digit = 0; digit < base; digit++)
count += count((bound - digit) / base, base, sum - digit);
cache.put(key, count);
return count;
}
This is not the complete solution (no input parsing). To get the number in base B, repeatedly take the modulo B, and then divide by B until the result is 0. This effectively computes the base-B digit from the right, and then shifts the number right.
int A,B,C; // from input
for (int x=1; x<A; x++)
{
int sumDigits = 0;
int v = x;
while (v!=0) {
sumDigits += (v % B);
v /= B;
}
if (sumDigits==C)
cout << x;
}
This is a brute force approach. It may be possible to compute this quicker by determining which sets of base B digits add up to C, arranging these in all permutations that are less than A, and then working backwards from that to create the original number.
Yum.
Try this:
int number, digitSum, resultCounter = 0;
for(int i=1; i<=A, i++)
{
number = i; //to avoid screwing up our counter
digitSum = 0;
while(number > 1)
{
//this is the next "digit" of the number as it would be in base B;
//works with any base including 10.
digitSum += (number % B);
//remove this digit from the number, square the base, rinse, repeat
number /= B;
}
digitSum += number;
//Does the sum match?
if(digitSum == C)
resultCounter++;
}
That's your basic algorithm for one line. Now you wrap this in another For loop for each input line you received, preceded by the input collection phase itself. This process can be simplified, but I don't feel like coding your entire answer to see if my algorithm works, and this looks right whereas the simpler tricks are harder to pass by inspection.
The way this works is by modulo dividing by powers of the base. Simple example, 1234 in base 10:
1234 % 10 = 4
1234 / 10 = 123 //integer division truncates any fraction
123 % 10 = 3 //sum is 7
123 / 10 = 12
12 % 10 = 2 //sum is 9
12 / 10 = 1 //end condition, add this and the sum is 10
A harder example to figure out by inspection would be the same number in base 12:
1234 % 12 = 10 //you can call it "A" like in hex, but we need a sum anyway
1234 / 12 = 102
102 % 12 = 6 // sum 16
102/12 = 8
8 % 12 = 8 //sum 24
8 / 12 = 0 //end condition, sum still 24.
So 1234 in base 12 would be written 86A. Check the math:
8*12^2 + 6*12 + 10 = 1152 + 72 + 10 = 1234
Have fun wrapping the rest of the code around this.

Resources