Minimum number of operations to make A and B equal simultaneously - algorithm

Given two non-negative integers A and B, find the minimum number of operations to make them equal simultaneously. In one operation, you can:
either change A to 2*A
or change B to 2*B
or change both A and B to A-1, B-1
For example: A = 7, B = 25
Sequence of operations would be:
6 24
12 24
24 24
We cannot make them equal in fewer than 3 operations.
I was asked this coding question in a test a week ago. I cannot think of a solution and it is stuck in my head. The input A and B were somewhat over 10^12, so it is clear that I cannot use a simple loop or it will exceed the time limit.

A slow but working solution:
1. If they are equal, stop.
2. If one of them is 0, stop with failure (there is no solution if negative numbers are not allowed).
3. While both are larger than 1, decrease both.
4. Now the smaller is 1, the other is larger.
5. While the smaller has a shorter binary representation, double the smaller.
6. Continue at step 1.
In step 3 the maximum decreases. In step 5 the absolute difference decreases. Thus the algorithm eventually terminates.
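A minimal Python sketch of those six steps (my own transcription, not the original poster's code); it finds a valid sequence and returns its length, which is not necessarily the minimum:

def slow_equalize(a, b):
    # Direct transcription of steps 1-6 above; returns the number of
    # operations of the sequence it finds, or None if one value is 0
    # and the other is not (no solution without negative numbers).
    ops = 0
    while True:
        if a == b:                              # step 1
            return ops
        if a == 0 or b == 0:                    # step 2
            return None
        while a > 1 and b > 1:                  # step 3: decrease both
            a, b, ops = a - 1, b - 1, ops + 1
        # step 4: the smaller one is now 1
        while a.bit_length() < b.bit_length():  # step 5: double the smaller
            a, ops = 2 * a, ops + 1
        while b.bit_length() < a.bit_length():
            b, ops = 2 * b, ops + 1
        # step 6: continue at step 1

print(slow_equalize(7, 25))   # 27 here, versus the optimal 3 shown above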

This should give the optimal solution. We have to compare a few different ways and take the best solution.
1. One working solution is to double the smaller number as long as the result stays below the larger number (possibly zero times), then calculate the difference between twice the (possibly repeatedly) doubled smaller number and the larger number, decrease both numbers that many times, and finally double the smaller number one more time. (If the numbers are equal from the beginning, the solution is trivial instead.) This gives an upper bound on the number of steps.
2. Now try out the following optimizations:
2a) Choose a number n between 0 and the number of steps of the best solution so far.
2b) Choose one number as A and one number as B (two possibilities).
2c) Now count the steps taken by the following procedure.
3. Double A n times.
4. Calculate the smallest exponent m for which B * 2^m >= A; m should be at least 1.
5. Calculate the difference between the product from step 4 and A in a mixed-radix (correct term?) system in which the digit at position k (counting from the least significant digit on the right) has the place value 2^(k+1)-1, i.e. 1, 3, 7, 15, 31, 63, ... Among all possible representations, the number must have the smallest digit sum, e.g. 100 for 7 is correct, 021 is not. (Side note: for the least digit sum there will mostly be digits 0 and 1 and at most one digit 2, no other digits, and there will never be a digit 1 to the right of a 2.)
6. Represent the number as m digits by filling the left positions with zeros. If the number does not fit, go back to step 2 for another selection.
7. Take the most significant not-yet-processed digit from step 6 and do that many decreasing steps.
8. Double B.
9. Repeat from step 7 with the next digit; if there are no more digits left, the numbers are equal.
10. If the number of steps is less than the best solution so far, choose this as the proposed solution.
11. Go back to step 2 for another selection.
After trying all selections from step 2 we should have the optimal solution with the minimum number of steps.
The following examples are from an earlier version of the answer, where A is always the larger number and n=0, so we test only one selection.
Example 17 and 65
Power of 2: 2^2=4; 4x17=68
Difference: 68-65=3
3 = 010=10 in base 7/3/1
Start => 17/65
Decrease. Double. => 32/64
Double. => 64/64
Example 18 and 67
Power of 2: 2^2=4; 4x18=72
Difference: 72-67=5
5 = 012=12 in base 7/3/1
Start => 18/67
Decrease. Double. => 34/66
Decrease. Decrease. Double. => 64/64
Example 10 and 137
Power of 2: 2^4=16; 16*10=160
Difference: 160-137=23
23 = 1101 in base 15/7/3/1
Start => 10/137
Decrease. Double. => 18/136
Decrease. Double. => 34/135
Double. => 68/135
Decrease. Double. => 134/134
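The n = 0 selection that these examples walk through can be sketched in a few lines of Python (my own sketch; the greedy digit choice below matches the examples, but whether greedy always yields the smallest digit sum is an assumption, not something proved here):

def n0_candidate_ops(x, y):
    # Sketch of the n = 0, A = larger selection: find m, express the
    # difference with place values 2^m-1, ..., 7, 3, 1, then simulate
    # "decrease d times, double the smaller" per digit.
    if x == y:
        return 0
    b, a = sorted((x, y))                # b = smaller, a = larger; both assumed > 0
    m = 1
    while b << m < a:                    # smallest m >= 1 with b * 2^m >= a
        m += 1
    diff = (b << m) - a
    digits = []
    for k in range(m, 0, -1):            # greedy digits, most significant first
        place = (1 << k) - 1
        digits.append(diff // place)
        diff %= place
    ops = 0
    for d in digits:
        b, a = (b - d) * 2, a - d        # d decreases, then double the smaller
        ops += d + 1
    assert a == b
    return ops

print(n0_candidate_ops(17, 65))    # 3
print(n0_candidate_ops(10, 137))   # 7

For 10 and 137 this reproduces the seven operations listed above; it is only one candidate, so the full procedure would still compare it against the other selections.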

Here's a breadth-first search that does return the correct answer, but it is probably not an efficient way of finding it. Maybe it can help others detect a pattern.
JavaScript code:
function f(a, b) {
    const q = [[a, b, [a, b]]];
    while (true) {
        const [x, y, path] = q.shift();
        if (x == y) {
            return path;
        }
        if (x > 0 && y > 0) {
            q.push([x - 1, y - 1, path.concat([x - 1, y - 1])]);
        }
        q.push([2 * x, y, path.concat([2 * x, y])]);
        q.push([x, 2 * y, path.concat([x, 2 * y])]);
    }
    return [];
}

function showPath(path) {
    let out1 = "";
    let out2 = "";
    for (let i = 0; i < path.length; i += 2) {
        const s1 = path[i].toString(2);
        const s2 = path[i + 1].toString(2);
        const len = Math.max(s1.length, s2.length);
        out1 += s1.padStart(len, "0");
        out2 += s2.padStart(len, "0");
        if (i < path.length - 2) {
            out1 += " --> ";
            out2 += " --> ";
        }
    }
    console.log(out1);
    console.log(out2);
}

showPath(f(89, 7));

Related

Minimum number of train station stops

I received this interview question and got stuck on it:
There are an infinite number of train stops starting from station number 0.
There are an infinite number of trains. The nth train stops at all of the k * 2^(n - 1) stops where k is between 0 and infinity.
When n = 1, the first train stops at stops 0, 1, 2, 3, 4, 5, 6, etc.
When n = 2, the second train stops at stops 0, 2, 4, 6, 8, etc.
When n = 3, the third train stops at stops 0, 4, 8, 12, etc.
Given a start station number and end station number, return the minimum number of stops between them. You can use any of the trains to get from one stop to another stop.
For example, the minimum number of stops between start = 1 and end = 4 is 3 because we can get from 1 to 2 to 4.
I'm thinking about a dynamic programming solution that would store in dp[start][end] the minimum number of steps between start and end. We'd build up the array using start...mid1, mid1...mid2, mid2...mid3, ..., midn...end. But I wasn't able to get it to work. How do you solve this?
Clarifications:
Trains can only move forward from a lower number stop to a higher number stop.
A train can be boarded at any station where it stops.
Trains can be boarded in any order. The n = 1 train can be boarded before or after boarding the n = 3 train.
Trains can be boarded multiple times. For example, it is permitted to board the n = 1 train, next board the n = 2 train, and finally board the n = 1 train again.
I don't think you need dynamic programming at all for this problem. It can basically be expressed by binary calculations.
If you convert the number of a station to binary it tells you right away how to get there from station 0, e.g.,
station 6 = 110
tells you that you need to take the n=3 train and the n=2 train each for one station. So the popcount of the binary representation tells you how many steps you need.
The next step is to figure out how to get from one station to another.
I'll show this again by example. Say you want to get from station 7 to station 23.
station 7 = 00111
station 23 = 10111
The first thing you want to do is to get to an intermediate stop. This stop is specified by
(highest bits that are equal in start and end station) + (first different bit) + (filled up with zeros)
In our example the intermediate stop is 16 (10000). The steps you need to make can be calculated by the difference of that number and the start station (7 = 00111). In our example this yields
10000 - 00111 = 1001
Now you know that you need 2 stops (the n=1 train and the n=4 train) to get from 7 to 16.
The remaining task is to get from 16 to 23, again this can be solved by the corresponding difference
10111 - 10000 = 00111
So, you need another 3 stops to go from 16 to 23 (n= 3, n= 2, n= 1). This gives you 5 stops in total, just using two binary differences and the popcount. The resulting path can be extracted from the bit representations 7 -> 8 -> 16 -> 20 -> 22 -> 23
Edit:
For further clarification of the intermediate stop let's assume we want to go from
station 5 = 101 to
station 7 = 111
the intermediate stop in this case will be 110, because
highest bits that are equal in start and end station = 1
first different bit = 1
filled up with zeros = 0
we need one step to go there (110 - 101 = 001) and one more to go from there to the end station (111 - 110 = 001).
About the intermediate stop
The concept of the intermediate stop is a bit clunky but I could not find a more elegant way in order to get the bit operations to work. The intermediate stop is the stop in between start and end where the highest level bit switches (that's why it is constructed the way it is). In this respect it is the stop at which the fastest train (between start and end) operates (actually all trains that you are able to catch stop there).
By subtracting the intermediate stop (bit representation) from the end station (bit representation) you reduce the problem to the simple case starting from station 0 (cf. first example of my answer).
By subtracting the start station from the intermediate stop you also reduce the problem to the simple case, but assume that you go from the intermediate stop to the start station which is equivalent to the other way round.
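A minimal Python sketch of this method (my own, not the answerer's code); it assumes start <= end and, like the walkthrough above, counts train rides, so it reports 5 for the 7 -> 23 example:

def min_stops(start, end):
    # popcount of the two differences around the intermediate stop
    if start == end:
        return 0
    top = 1 << ((start ^ end).bit_length() - 1)   # highest bit where they differ
    intermediate = end & ~(top - 1)               # common prefix + that bit + zeros
    return bin(intermediate - start).count("1") + bin(end - intermediate).count("1")

print(min_stops(7, 23))   # 5
print(min_stops(5, 7))    # 2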
First, ask if you can go backward. It sounds like you can't, but as presented here (which may not reflect the question as you received it), the problem never gives an explicit direction for any of these trains. (I see you've now edited your question to say you can't go backward.)
Assuming you can't go backward, the strategy is simple: always take the highest-numbered available train that doesn't overshoot your destination.
Suppose you're at stop s, and the highest-numbered train that stops at your current location and doesn't overshoot is train k. Traveling once on train k will take you to stop s + 2^(k-1). There is no faster way to get to that stop, and no way to skip that stop - no lower-numbered trains skip any of train k's stops, and no higher-numbered trains stop between train k's stops, so you can't get on a higher-numbered train before you get there. Thus, train k is your best immediate move.
With this strategy in mind, most of the remaining optimization is a matter of efficient bit twiddling tricks to compute the number of stops without explicitly figuring out every stop on the route.
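A small simulation sketch of this strategy (my own helper, assuming start <= end and counting rides): at each stop, keep doubling the candidate interval while the corresponding train still stops at the current station and would not overshoot.

def count_rides(start, end):
    rides, s = 0, start
    while s < end:
        step = 1
        # a train with interval 2*step stops at s only if 2*step divides s
        # (this also covers s == 0); never ride past the destination
        while s % (2 * step) == 0 and s + 2 * step <= end:
            step *= 2
        s += step
        rides += 1
    return rides

print(count_rides(7, 23))   # 5
print(count_rides(5, 39))   # 7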
I will attempt to prove my algorithm is optimal.
The algorithm is "take the fastest train that doesn't overshoot your destination".
Working out how many stops this takes is a bit trickier.
Encode both stops as binary numbers. I claim that an identical prefix can be neglected; the problem of going from a to b is the same as the problem of going from a+2^n to b+2^n if 2^n > b, as the stops between 2^n and 2^(n+1) are just the stops between 0 and 2^n shifted over.
From this, we can reduce a trip from a to b to guarantee that the high bit of b is set, and the same "high" bit of a is not set.
To solve going from 5 (101) to 7 (111), we merely have to solve going from 1 (01) to 3 (11), then shift our stop numbers up 4 (100).
To go from x to 2^n + y, where y < 2^n (and hence x < 2^n as well), we first want to go to 2^n, because there are no trains that skip over 2^n that do not also skip over 2^n+y < 2^{n+1}.
So any set of stops between x and y must stop at 2^n.
Thus the optimal number of stops from x to 2^n + y is the number of stops from x to 2^n, followed by the number of stops from 2^n to 2^n+y, inclusive (or from 0 to y, which is the same).
The algorithm I propose to get from 0 to y is to start with the high order bit set, and take the train that gets you there, then go on down the list.
Claim: In order to generate a number with k 1s, you must take at least k trains. As proof, if you take a train and it doesn't cause a carry in your stop number, it sets 1 bit. If you take a train and it does cause a carry, the resulting number has at most 1 more set bit than it started with.
To get from x to 2^n is a bit trickier, but can be made simple by tracking the trains you take backwards.
Mapping s_i to s_{2^n-i} and reversing the train steps, any solution for getting from x to 2^n describes a solution for getting from 0 to 2^n-x. And any solution that is optimal for the forward one is optimal for the backward one, and vice versa.
Using the result for getting from 0 to y, we then get that the optimal route from a to b, where b's highest set bit is 2^n and a does not have that bit set, takes #(b - 2^n) + #(2^n - a) stops, where # means "the number of bits set in the binary representation". And in general, if a and b have a common prefix, simply drop that common prefix.
A local rule that generates the above number of steps is "take the fastest train in your current location that doesn't overshoot your destination".
For the part going from 2^n to 2^n+y we did that explicitly in our proof above. For the part going from x to 2^n this is trickier to see.
First, if the low order bit of x is set, obviously we have to take the first and only train we can take.
Second, imagine x has some collection of unset low-order bits, say m of them. If we played the train game going from x/2^m to 2^(n-m), then scaled the stop numbers by multiplying by 2^m we'd get a solution to going from x to 2^n.
And #((2^n - x)/2^m) = #(2^n - x). So this "scaled" solution is optimal.
From this, we are always taking the train corresponding to our low-order set bit in this optimal solution. This is the longest range train available, and it doesn't overshoot 2^n.
QED
This problem doesn't require dynamic programming.
Here is a simple implementation of a solution using GCC:
#include <stdint.h>

uint32_t min_stops(uint32_t start, uint32_t end)
{
    uint32_t stops = 0;
    if (start != 0) {
        while (start <= end - (1U << __builtin_ctz(start))) {
            start += 1U << __builtin_ctz(start);
            ++stops;
        }
    }
    stops += __builtin_popcount(end ^ start);
    return stops;
}
The train schema is a map of powers of two. If you look at the bit representation of your current station, the lowest set bit tells you the train line with the longest distance between stops that you can board; you can also take any of the lines with shorter distances.
To minimize the number of rides, you want to take the line with the longest interval possible, until doing so would overshoot the end station (and thereby make it unreachable). That's what adding the lowest set bit does in the code. Once you do this, some number of the upper bits will agree with the upper bits of the end station, while the lower bits will be zero.
At that point, it's simply a matter of taking one train for each bit that is set in the end station but not in the current station. This is computed with __builtin_popcount in the code.
An example going from 5 to 39:
000101 5 // Start
000110 5+1=6
001000 6+2=8
010000 8+8=16
100000 16+16=32 // 32+32 > 39, so start reversing the process
100100 32+4=36 // Optimized with __builtin_popcount in code
100110 36+2=38 // Optimized with __builtin_popcount in code
100111 38+1=39 // Optimized with __builtin_popcount in code
As some have pointed out, since stops are all multiples of powers of 2, trains that stop more frequently also stop at the stations of the more-express trains. Any stop is on the first train's route, which stops at every station. Any stop is at most 1 unit away from a stop of the second train, which stops at every second station. Any stop is at most 3 units from a stop of the third train, which stops at every fourth station, and so on.
So start at the end and trace your route back in time: hop on the nearest power-of-2-multiple train, and switch to the fastest such train you can as soon as possible. (To find it, check the position of the least significant set bit: a multiple of a power of 2 can be divided by two, that is, bit-shifted right without leaving a remainder, as many times as it has trailing zeros in its bit representation.) Keep riding as long as the train's interval wouldn't miss the starting point after one stop. When it would, perform the reverse switch: hop on the next slower power-of-2 train and stay on it until its interval would miss the starting point after one stop, and so on.
We can figure this out doing nothing but a little counting and array manipulation. Like all the previous answers, we need to start by converting both numbers to binary and padding them to the same length. So 12 and 38 become 01100 and 10110.
Looking at station 12 and its least significant set bit (2^2 in this case), all trains with intervals larger than 2^2 won't stop at station 12, and all trains with intervals less than or equal to 2^2 will stop there, but the slower ones will require more rides to reach the same destination as the interval-4 train. So in every situation, up until we reach the largest set bit in the end value, we need to take the train with the interval of the least significant set bit of the current station.
If we are at station 0010110100, our sequence will be:
0010110100 2^2
0010111000 2^3
0011000000 2^6
0100000000 2^8
1000000000
Here we can eliminate all bits smaller than the least significant set bit and get the same count.
00101101 2^0
00101110 2^1
00110000 2^4
01000000 2^6
10000000
Trimming the ends at each stage, we get this:
00101101 2^0
0010111 2^0
0011 2^0
01 2^0
1
This could equally be described as the process of flipping all the 0 bits. Which brings us to the first half of the algorithm: Count the unset bits in the zero padded start number greater than the least significant set bit, or 1 if the start station is 0.
This will get us to the only intermediate station reachable by the train with the largest interval that is still smaller than the end station, so every train we take after this point has a smaller interval than the one before.
Now we need to get from that station (1000000000) to 1010110100. This part is easier and more obvious: take the train with an interval equal to the largest bit that is set in the destination but not set in the current station number.
1000000000 2^7
1010000000 2^5
1010100000 2^4
1010110000 2^2
1010110100
Similar to the first method, we can trim the most significant bit, which will always be set, and then count the remaining 1s in the end value. So the second part of the algorithm is: count all the set bits smaller than the most significant bit.
Then add the results from parts 1 and 2.
Adjusting the algorithm slightly to also report the train intervals, here is an example written in JavaScript so it can be run as a snippet.
function calculateStops(start, end) {
    var result = {
        start: start,
        end: end,
        count: 0,
        trains: [],
        reverse: false
    };

    // If equal there are 0 stops
    if (start === end) return result;

    // If start is greater than end, reverse the values and
    // add note to reverse the results
    if (start > end) {
        start = result.end;
        end = result.start;
        result.reverse = true;
    }

    // Convert start and end values to array of binary bits
    // with the exponent matched to the index of the array
    start = (start >>> 0).toString(2).split('').reverse();
    end = (end >>> 0).toString(2).split('').reverse();

    // We can trim off any matching significant digits
    // The stop pattern for 10 to 13 is the same as
    // the stop pattern for 2 to 5 offset by 8
    while (start[end.length - 1] === end[end.length - 1]) {
        start.pop();
        end.pop();
    }

    // Trim off the most significant bit of the end,
    // we don't need it
    end.pop();

    // Front fill zeros on the starting value
    // to make the counting easier
    while (start.length < end.length) {
        start.push('0');
    }

    // We can break the algorithm in half
    // getting from the start value to the form
    // 10...0 with only 1 bit set and then getting
    // from that point to the end.
    var index;
    var trains = [];
    var expected = '1';

    // Now we loop through the digits on the end
    // any 1 we find can be added to a temporary array
    for (index in end) {
        if (end[index] === expected) {
            result.count++;
            trains.push(Math.pow(2, index));
        }
    }

    // if the start value is 0, we can get to the
    // intermediate step in one trip, so we can
    // just set this to 1, checking both start and
    // end because they can be reversed
    if (result.start == 0 || result.end == 0) {
        index++;
        result.count++;
        result.trains.push(Math.pow(2, index));
        // We need to find the first '1' digit, then all
        // subsequent 0 digits, as these are the ones we
        // need to flip
    } else {
        for (index in start) {
            if (start[index] === expected) {
                result.count++;
                result.trains.push(Math.pow(2, index));
                expected = '0';
            }
        }
    }

    // add the second set to the first set, reversing
    // it to get them in the right order.
    result.trains = result.trains.concat(trains.reverse());

    // Reverse the stop list if the trip is reversed
    if (result.reverse) result.trains = result.trains.reverse();

    return result;
}

$(document).ready(function () {
    $("#submit").click(function () {
        var trains = calculateStops(
            parseInt($("#start").val()),
            parseInt($("#end").val())
        );
        $("#out").html(trains.count);
        var current = trains.start;
        var stopDetails = 'Starting at station ' + current + '<br/>';
        for (index in trains.trains) {
            current = trains.reverse ? current - trains.trains[index] : current + trains.trains[index];
            stopDetails = stopDetails + 'Take train with interval ' + trains.trains[index] + ' to station ' + current + '<br/>';
        }
        $("#stops").html(stopDetails);
    });
});

label {
    display: inline-block;
    width: 50px;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<label>Start</label> <input id="start" type="number" /> <br>
<label>End</label> <input id="end" type="number" /> <br>
<button id="submit">Submit</button>
<p>Shortest route contains <span id="out">0</span> stops</p>
<p id="stops"></p>
Simple Java solution
public static int minimumNumberOfStops(int start, final int end) {
    // I would initialize it with 0 but the example given in the question states:
    // the minimum number of stops between start = 1 and end = 4 is 3 because we can get from 1 to 2 to 4
    int stops = 1;
    while (start < end) {
        start += findClosestPowerOfTwoLessOrEqualThan(end - start);
        stops++;
    }
    return stops;
}

private static int findClosestPowerOfTwoLessOrEqualThan(final int i) {
    if (i > 1) {
        return 2 << (30 - Integer.numberOfLeadingZeros(i));
    }
    return 1;
}
NOTICE: The reason for the current comments under my answer is that I first wrote this algorithm completely wrong and user2357112 made me aware of my mistakes. So I completely removed that algorithm and wrote a new one according to what user2357112 answered for this question. I also added some comments to this algorithm to clarify what happens in each line.
This algorithm starts at procedure main(Origin, Dest) and simulates our movement toward the destination with updateOrigin(Origin, Dest).
procedure main(Origin, Dest){
    //at the end we have the minimum number of steps in this variable
    counter = 0;
    while(Origin != Dest){
        //we simulate our movement toward the destination with this
        Origin = updateOrigin(Origin, Dest);
        counter = counter + 1;
    }
}

procedure updateOrigin(Origin, Dest){
    if (Origin == 1) return 2;

    //we must find which train passes through our origin; what comes out of this IF clause is NOT the exact choice and we still have to do some calculation below
    if (Origin == 0){
        //all trains pass through stop 0, thus we can choose our train according to the destination
        n = Log2(Dest);
    }else{
        //it's a good starting point to check whether it passes through our origin
        n = Log2(Origin);
    }

    //now let's choose the exact train which passes through the origin and doesn't overshoot the destination
    counter = 0;
    do {
        temp = counter * 2 ^ (n - 1);
        //we have found a suitable train
        if (temp == Origin){
            //where we have moved to
            return Origin + 2 ^ (n - 1);
            //we still don't know if this train passes through our origin
        } elseif (temp < Origin){
            counter = counter + 1;
            //let's check another train
        } else {
            n = n - 1;
            counter = 0;
        }
    } while(temp < Origin)
}

How to turn integers into Fibonacci coding efficiently?

The Fibonacci sequence is obtained by starting with 0 and 1 and then adding the two last numbers to get the next one.
All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition. For example: 13 can be the sum of the sets {13}, {5,8} or {2,3,8}. But, as we have seen, some numbers have more than one set whose sum is the number. If we add the constraint that the sets cannot have two consecutive Fibonacci numbers, then we have a unique representation for each number.
We will use a binary sequence (just zeros and ones) to encode this. For example, 17 = 1 + 3 + 13, so 17 = 100101.
I want to turn some integers into this representation, but the integers may be very big. How do I do this efficiently?
The problem itself is simple. You always pick the largest Fibonacci number not exceeding the remainder. You can ignore the constraint on consecutive numbers (if you ever needed two consecutive Fibonacci numbers, their sum is the next Fibonacci number, so you should have picked that one instead of the initial two).
So the problem remains how to quickly find the largest Fibonacci number not exceeding some number X.
There's a known trick that starting with the matrix (call it M)
1 1
1 0
you can compute Fibonacci numbers by matrix multiplication (the entries of M^n are F(n+1), F(n) and F(n-1)). More details here: https://www.nayuki.io/page/fast-fibonacci-algorithms . The end result is that you can compute the number you're looking for in O(log N) matrix multiplications.
You'll need large-number arithmetic (multiplications and additions) if the values don't fit into existing types.
Also store the matrices corresponding to the powers of two that you compute the first time, since you'll need them again for later results.
Overall this should be O((log N)^2 * cost of a large-number multiplication/addition).
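A minimal Python sketch of the O(log N) search (mine, not the answerer's code); it uses the fast-doubling identities from the linked page, which are equivalent to squaring the matrix M, and then finds the index of the largest Fibonacci number not exceeding X by exponential plus binary search:

def fib_pair(n):
    # returns (F(n), F(n+1)) in O(log n) big-number multiplications
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n // 2)
    c = a * (2 * b - a)      # F(2k)
    d = a * a + b * b        # F(2k+1)
    return (c, d) if n % 2 == 0 else (d, c + d)

def largest_fib_index(x):
    # largest index i with F(i) <= x, for x >= 1
    hi = 1
    while fib_pair(hi)[0] <= x:
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fib_pair(mid)[0] <= x:
            lo = mid
        else:
            hi = mid
    return lo

print(largest_fib_index(17))   # 7, since F(7) = 13 <= 17 < F(8) = 21

Repeatedly subtracting that Fibonacci number from the remainder, as the greedy argument above describes, yields the Zeckendorf representation.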
First I want to tell you that I really liked this question. I didn't know that all positive integers can be represented as a sum of a set of non-consecutive Fibonacci numbers; I saw the proof by induction and it was awesome.
To answer your question, I think we have to figure out how the representation is created. I think the easy way is to repeatedly take the closest Fibonacci number below the remainder.
For example, if we want to represent 40:
We have Fib(9)=34 and Fib(10)=55, so the first element in the representation is Fib(9).
Since 40 - Fib(9) = 6 and (Fib(5)=5 and Fib(6)=8), the next element is Fib(5). So we have 40 = Fib(9) + Fib(5) + Fib(2).
Allow me to write this in C#
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<int> fibPresentation = new List<int>();
        int numberToPresent = Convert.ToInt32(Console.ReadLine());
        while (numberToPresent > 0)
        {
            int k = 1;
            while (CalculateFib(k) <= numberToPresent)
            {
                k++;
            }
            numberToPresent = numberToPresent - CalculateFib(k - 1);
            fibPresentation.Add(k - 1);
        }
    }

    static int CalculateFib(int n)
    {
        if (n == 1)
            return 1;
        int a = 0;
        int b = 1;
        // In N steps compute Fibonacci sequence iteratively.
        for (int i = 0; i < n; i++)
        {
            int temp = a;
            a = b;
            b = temp + b;
        }
        return a;
    }
}
Your result will be in fibPresentation
This encoding is more accurately called the "Zeckendorf representation": see https://en.wikipedia.org/wiki/Fibonacci_coding
A greedy approach works (see https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem) and here's some Python code that converts a number to this representation. It uses the first 100 Fibonacci numbers and works correctly for all inputs up to 927372692193078999175 (and incorrectly for any larger inputs).
fibs = [0, 1]
for _ in range(100):
    fibs.append(fibs[-2] + fibs[-1])

def zeck(n):
    i = len(fibs) - 1
    r = 0
    while n:
        if fibs[i] <= n:
            r |= 1 << (i - 2)
            n -= fibs[i]
        i -= 1
    return r

print(bin(zeck(17)))
The output is:
0b100101
As the greedy approach seems to work, it suffices to be able to invert the relation N = F(n).
By Binet's formula, F(n) = [φ^n/√5], where the brackets denote rounding to the nearest integer. Then with n = floor(log_φ(√5·N)) you are very close to the solution.
17 => n = floor(7.5599...) => F7 = 13
4 => n = floor(4.5531) => F4 = 3
1 => n = floor(1.6722) => F1 = 1
(I do not exclude that some n values can be off by one.)
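A quick Python sketch of this estimate (my own); since, as noted, the index can be off by one, it nudges n until F(n) <= N < F(n+1). Floating point is fine for moderate N, but truly huge inputs would need exact arithmetic:

import math

PHI = (1 + 5 ** 0.5) / 2

def fib_binet(n):
    return round(PHI ** n / 5 ** 0.5)

def largest_fib_not_above(N):
    n = int(math.log(5 ** 0.5 * N, PHI))   # the estimate from above
    while fib_binet(n + 1) <= N:           # correct a possible off-by-one
        n += 1
    while fib_binet(n) > N:
        n -= 1
    return n, fib_binet(n)

print(largest_fib_not_above(17))   # (7, 13)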
I'm not sure if this is efficient enough for you, but you could simply use backtracking to find a (the) valid representation.
I would start the backtracking by taking the biggest possible Fibonacci number and only switch to smaller ones if the no-consecutive or no-repetition constraint is violated.

Subtract a number's digits from the number until it reaches 0

Can anyone help me with some algorithm for this problem?
We have a big number (19 digits) and, in a loop, we subtract one of the digits of that number from the number itself.
We continue to do this until the number reaches zero. We want to calculate the minimum number of subtractions that makes a given number reach zero.
The algorithm must respond quickly for a 19-digit number (around 10^19), within two seconds. As an example, providing an input of 36 will give 7:
1. 36 - 6 = 30
2. 30 - 3 = 27
3. 27 - 7 = 20
4. 20 - 2 = 18
5. 18 - 8 = 10
6. 10 - 1 = 9
7. 9 - 9 = 0
Thank you.
The minimum number of subtractions to reach zero makes this, I suspect, a very thorny problem, one that will require a great deal of backtracking over potential solutions, making it possibly too expensive for your time limitations.
But the first thing you should do is a sanity check. Since the largest digit is a 9, a 19-digit number will require about 10^18 subtractions to reach zero. Code up a simple program to continuously subtract 9 from 10^19 until it becomes less than ten. If you can't do that within the two seconds, you're in trouble.
By way of example, the following program (a):
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[]) {
    unsigned long long x = strtoull(argv[1], NULL, 10);
    x /= 1000000000;
    while (x > 9)
        x -= 9;
    return x;
}
when run with the argument 10000000000000000000 (10^19), takes about a second and a half of clock time (and CPU time, since it's all calculation) even at gcc's insane optimisation level of -O3:
real 0m1.531s
user 0m1.528s
sys 0m0.000s
And that's with the one-billion divisor just before the while loop, meaning the full number of iterations would take about 48 years.
So a brute force method isn't going to help here; what you need is some serious mathematical analysis, which probably means you should post a similar question over at https://math.stackexchange.com/ and let the math geniuses have a shot.
(a) If you're wondering why I'm getting the value from the user rather than using a constant of 10000000000000000000ULL, it's to prevent gcc from calculating it at compile time and turning it into something like:
mov $1, %eax
Ditto for the return x, which prevents it from noticing that I never use the final value of x and hence optimising the loop out of existence altogether.
I don't have a solution that can solve 19 digit numbers in 2 seconds. Not even close. But I did implement a couple of algorithms (including a dynamic programming algorithm that solves for the optimum), and gained some insight that I believe is interesting.
Greedy Algorithm
As a baseline, I implemented a greedy algorithm that simply picks the largest digit in each step:
uint64_t countGreedy(uint64_t inputVal) {
    uint64_t remVal = inputVal;
    uint64_t nStep = 0;
    while (remVal > 0) {
        uint64_t digitVal = remVal;
        uint_fast8_t maxDigit = 0;
        while (digitVal > 0) {
            uint64_t nextDigitVal = digitVal / 10;
            uint_fast8_t digit = digitVal - nextDigitVal * 10;
            if (digit > maxDigit) {
                maxDigit = digit;
            }
            digitVal = nextDigitVal;
        }
        remVal -= maxDigit;
        ++nStep;
    }
    return nStep;
}
Dynamic Programming Algorithm
The idea for this is that we can calculate the optimum incrementally. For a given value, we pick a digit, which adds one step to the optimum number of steps for the value with the digit subtracted.
With the target function (optimum number of steps) for a given value named optSteps(val), and the digits of the value named d_i, the following relationship holds:
optSteps(val) = 1 + min(optSteps(val - d_i))
This can be implemented with a dynamic programming algorithm. Since d_i is at most 9, we only need the previous 9 values to build on. In my implementation, I keep a circular buffer of 10 values:
static uint64_t countDynamic(uint64_t inputVal) {
    uint64_t minSteps[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
    uint_fast8_t digit0 = 0;
    for (uint64_t val = 10; val <= inputVal; ++val) {
        digit0 = val % 10;
        uint64_t digitVal = val;
        uint64_t minPrevStep = 0;
        bool prevStepSet = false;
        while (digitVal > 0) {
            uint64_t nextDigitVal = digitVal / 10;
            uint_fast8_t digit = digitVal - nextDigitVal * 10;
            if (digit > 0) {
                uint64_t prevStep = 0;
                if (digit > digit0) {
                    prevStep = minSteps[10 + digit0 - digit];
                } else {
                    prevStep = minSteps[digit0 - digit];
                }
                if (!prevStepSet || prevStep < minPrevStep) {
                    minPrevStep = prevStep;
                    prevStepSet = true;
                }
            }
            digitVal = nextDigitVal;
        }
        minSteps[digit0] = minPrevStep + 1;
    }
    return minSteps[digit0];
}
Comparison of Results
This may be considered a surprise: I ran both algorithms on all values up to 1,000,000. The results are absolutely identical. This suggests that the greedy algorithm actually calculates the optimum.
I don't have a formal proof that this is indeed true for all possible values. It intuitively kind of makes sense to me. If in any given step, you choose a smaller digit than the maximum, you compromise the immediate progress with the goal of getting into a more favorable situation that allows you to catch up and pass the greedy approach. But in all the scenarios I thought about, the situation after taking a sub-optimal step just does not get significantly more favorable. It might make the next step bigger, but that is at most enough to get even again.
Complexity
While both algorithms look linear in the size of the value, they also loop over all digits in the value. Since the number of digits corresponds to log(n), I believe the complexity is O(n * log(n)).
I think it's possible to make it linear by keeping counts of the frequency of each digit, and modifying them incrementally. But I doubt it would actually be faster. It requires more logic, and turns a loop over all digits in the value (which is in the range of 2-19 for the values we are looking at) into a fixed loop over 10 possible digits.
Runtimes
Not surprisingly, the greedy algorithm is faster to calculate a single value. For example, for value 1,000,000,000, the runtimes on my MacBook Pro are:
greedy: 3 seconds
dynamic: 36 seconds
On the other hand, the dynamic programming approach is obviously much faster at calculating all the values, since its incremental approach needs them as intermediate results anyway. For calculating all values from 10 to 1,000,000:
greedy: 19 minutes
dynamic: 0.03 seconds
As already shown in the runtimes above, the greedy algorithm gets about as high as 9 digit input values within the targeted runtime of 2 seconds. The implementations aren't really tuned, and it's certainly possible to squeeze out some more time, but it would be fractional improvements.
Ideas
As already explored in another answer, there's no chance of getting the result for 19 digit numbers in 2 seconds by subtracting digits one by one. Since we subtract at most 9 in each step, completing this for a value of 10^19 needs more than 10^18 steps. We mostly use computers that perform in the rough range of 10^9 operations/second, which suggests that it would take about 10^9 seconds.
Therefore, we need something that can take shortcuts. I can think of scenarios where that's possible, but haven't been able to generalize it to a full strategy so far.
For example, if your current value is 9999, you know that you can subtract 9 until you reach 9000. So you can calculate that you will make 112 steps ((9999 - 9000) / 9 + 1) where you subtract 9, which can be done in a few operations.
As said in the comments already, and agreeing with @paxdiablo's other answer, I'm not sure there is an algorithm that finds the ideal solution without some backtracking; and the size of the number and the time constraint might be tough as well.
A general consideration though: You might want to find a way to decide between always subtracting the highest digit (which will decrease your current number by the largest possible amount, obviously), and by looking at your current digits and subtracting which of those will give you the largest “new” digit.
Say, your current number only consists of digits between 0 and 5 – then you might be tempted to subtract the 5 to decrease your number by the highest possible value, and continue with the next step. If the last digit of your current number is 3 however, then you might want to subtract 4 instead – since that will give you 9 as new digit at the end of the number, instead of “only” 8 you would be getting if you subtracted 5.
Whereas if your digits already include a 2 and two 9s, and the last digit is a 1, then you might want to subtract the 9 anyway: you will still be left with the second 9 in the result (at least in most cases; in some edge cases it might get obliterated from the result as well). So subtracting the 2 instead would not have the advantage of giving you a "high" 9 that you would otherwise not have in the next step, and it would have the disadvantage of not lowering your number by as much as subtracting the 9 would.
But every digit you subtract will not only affect the next step directly, but the following steps indirectly – so again, I doubt there is a way to always chose the ideal digit for the current step without any backtracking or similar measures.

Fast algorithm to optimize a sequence of arithmetic expression

EDIT: clarified description of problem
Is there a fast algorithm solving the following problem?
And is there also one for the extended version of this problem, where the natural numbers are replaced by Z/(2^n Z)? (This problem was too complex to put more than one question in one place, IMO.)
Problem:
For a given set of natural numbers like {7, 20, 17, 100}, the required algorithm returns the shortest sequence of additions, multiplications and powers that computes all of the given numbers.
Each item of the sequence is a (correct) equation that matches the following pattern:
<number> = <number> <op> <number>
where <number> is a natural number and <op> is one of {+, *, ^}.
In the sequence, each operand of <op> must be either
1, or
a number which has already appeared on the left-hand side of an earlier equation.
Example:
Input: {7, 20, 17, 100}
Output:
2 = 1 + 1
3 = 1 + 2
6 = 2 * 3
7 = 1 + 6
10 = 3 + 7
17 = 7 + 10
20 = 2 * 10
100 = 10 ^ 2
I wrote a backtracking algorithm in Haskell. It works for small inputs like the one above, but my real query is about 30 randomly distributed numbers in [0,255]. For the real query, the following code takes 2-10 minutes on my PC.
My current (pseudo)code:
-- generate the set of sets required to compute n.
-- the operator (+) on sets is set union.
requiredNumbers 0 = { {} }
requiredNumbers 1 = { {} }
requiredNumbers n =
    { {j, k} | j^k == n, j >= 2, k >= 2 }
  + { {j, k} | j*k == n, j >= 2, k >= 2 }
  + { {j, k} | j+k == n, j >= 1, k >= 1 }

-- remember the smallest set of "computed" numbers
bestSet := {i | 1 <= i <= largeNumber}

-- backtracking algorithm
-- from: input
-- to:   accumulator of "already computed" numbers
closure from to =
    if (from is empty)
        if (|bestSet| > |to|)
            bestSet := to
        return
    else if (|from| + |to| >= |bestSet|)
        -- cut branch
        return
    else
        m     := min(from)
        from' := deleteMin(from)
        foreach (req in (requiredNumbers m))
            closure (from' + (req - to)) (to + {m})

-- recoverEquation is a function that converts a set of numbers to a set of equations.
-- it can be done easily.
output = recoverEquation (closure input {})
Additional Note:
Answers like
There isn't a fast algorithm, because...
There is a heuristic algorithm, it is...
are also welcome. Now I am starting to feel that there is no fast and exact algorithm...
Answer #1 can be used as a heuristic, I think.
What if you worked backwards from the highest number in a sorted input, checking if/how to utilize the smaller numbers (and numbers that are being introduced) in its construction?
For example, although this may not guarantee the shortest sequence...
input: {7, 20, 17, 100}
(100) = (20) * 5 =>
(7) = 5 + 2 =>
(17) = 10 + (7) =>
(20) = 10 * 2 =>
10 = 5 * 2 =>
5 = 3 + 2 =>
3 = 2 + 1 =>
2 = 1 + 1
What I recommend is to transform it into some kind of graph shortest path algorithm.
For each number, you compute (and store) the shortest path of operations. Technically one step back is enough: for each number you can store the operation and the two operands (left and right, because the power operation is not commutative), and also the weight. (These are the "nodes".)
Initially you register 1 with the weight of zero
Every time you register a new number, you have to generate all calculations with that number (all additions, multiplications, powers) with all already-registered numbers. ("edges")
Filter the calculations: if the result of a calculation is already registered, you shouldn't store it, because there is an easier way to get to that number.
Store only 1 operation for the commutative ones (1+2=2+1)
Prefilter the power operation because that may even cause overflow
You have to process this list ordered by the shortest summed path weight (the weight of the edge). Weight = (weight of operand 1) + (weight of operand 2) + 1 (the weight of the operation itself).
You can exclude all resulting numbers which are greater than the maximum number that we still have to find (e.g. if we have already found 100, anything greater than 20 can be excluded) - this can be refined so that you can check the operands of the operations as well.
If you hit one of your target numbers, then you found the shortest way of calculating one of your target numbers, you have to restart the generations:
Recalculate the maximum of the target numbers
Go back on the paths of the currently found number, set their weight to 0 (they will be given from now on, because their cost is already paid)
Recalculate the weight for the operations in the generation list, because the source operand weight may have been changed (this results reordering at the end) - here you can exclude those where either operand is greater than the new maximum
If all the numbers are hit, then the search is over
You can build your expression using the "backlinks" (operation, left and right operands) for each of your target numbers.
The main point is that we always keep our eye on the target function, which is that the total number of operation must be the minimum possible. In order to get this, we always calculate the shortest path to a certain number, then considering that number (and all the other numbers on the way) as given numbers, then extending our search to the remaining targets.
Theoretically, this algorithm processes (registers) each number only once. Applying the proper filters cuts the unnecessary branches, so nothing is calculated twice (except the weights of the in-queue elements).
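A minimal Python sketch of the first phase of this idea (my own, and simplified): it settles numbers in order of increasing weight, using the additive weight rule described above, and ignores the multi-target re-weighting/restart refinement.

import heapq

def cheapest_single_builds(limit):
    # weight[n] = weight of one cheapest registration of n from 1 with +, *, ^,
    # where weight(result) = weight(operand1) + weight(operand2) + 1
    INF = float("inf")
    weight = [INF] * (limit + 1)
    weight[1] = 0
    heap, settled = [(0, 1)], []
    while heap:
        w, n = heapq.heappop(heap)
        if w > weight[n]:
            continue                         # stale queue entry
        settled.append(n)
        for m in settled:                    # generate calculations with registered numbers
            cost = w + weight[m] + 1
            candidates = [n + m, n * m]
            for base, exp in ((n, m), (m, n)):
                # prefilter the power operation so we never build huge values
                if base >= 2 and exp >= 2 and exp <= limit.bit_length() and base ** exp <= limit:
                    candidates.append(base ** exp)
            for res in candidates:
                if res <= limit and cost < weight[res]:
                    weight[res] = cost
                    heapq.heappush(heap, (cost, res))
    return weight

weights = cheapest_single_builds(255)
print(weights[7], weights[17], weights[20], weights[100])

Note that this weight rule double-counts shared sub-results, which is exactly why the answer re-weights already-paid-for numbers to 0 once a target is reached.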

How to implement Random(a,b) with only Random(0,1)? [duplicate]

Possible Duplicate:
how to get uniformed random between a, b by a known uniformed random function RANDOM(0,1)
In the book Introduction to Algorithms, there is an exercise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The result of Random(a,b) should be uniformly distributed, just as Random(0,1) is.
For the Random function, the results are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b
My solution is like this:
for i = 1 to b-a
    r = a + Random(0,1)
return r
The running time is T = b-a.
Is this correct? Are the results of my solution uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a    // including b-a
    r += Random(0,1)
return r
If it is not correct, why does r += Random(0,1) make r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r = 0
for i = 1 to p
    r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
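A small Python sketch of this rejection method (random01 below is a stand-in for the book's Random(0,1)):

import random

def random01():
    return random.randint(0, 1)          # fair coin, stands in for Random(0,1)

def random_ab(a, b):
    span = b - a
    p = span.bit_length()                # smallest p with 2^p > span (for span >= 1)
    while True:
        r = 0
        for _ in range(p):               # build p random bits
            r = 2 * r + random01()
        if r <= span:                    # otherwise retry, as in step 3
            return a + r

print([random_ab(1, 3) for _ in range(10)])

Each accepted value is equally likely, and the acceptance probability is above one half, so the expected number of retries is less than one and the expected running time is O(log(b - a)).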
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? It would make more sense if it produced floating-point values between 0 and 1. Then the solution would be an affine transformation, with running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct. This sum will have a binomial distribution.
However, you can generate a purely random sequence of 0s and 1s and treat it as a binary number:
repeat
    result = a
    steps = ceiling(log(b - a))
    for i = 0 to steps
        result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a+1 elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it came up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probabilities you can prove that your solution is not uniformly distributed. The chance that the resulting value r is (a+b)/2 is the greatest. For instance, if a is 0 and b is 7, the chance that you get the value 4 is C(7,4) (the number of ways to choose 4 flips out of 7) divided by 2 raised to the power 7. The reason for that is that no matter which 4 out of the 7 flips are 1, the result will still be 4.
The running time you estimate is correct.
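A quick empirical check of that point (a throwaway simulation, not part of the original answer), with a = 0, b = 7 and b - a = 7 coin flips as in the C(7,4)/2^7 computation above; a uniform Random(0,7) would give every value with probability 12.5%, but the sum of flips peaks in the middle at about 27%:

from collections import Counter
import random

counts = Counter(sum(random.randint(0, 1) for _ in range(7)) for _ in range(100000))
for value in range(8):
    print(value, counts[value] / 100000)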
Your solution's pseudocode should look like:
r = a
for i = 0 to b-a
    r += Random(0,1)
return r
As for the uniform distribution: assuming the underlying random number generator is perfectly uniform, the odds of getting 0 or 1 on each flip are 50%, so the number you end up with is the result of that choice made over and over again.
So for a=1, b=5, there are 5 choices made.
The odds of getting 1 involves 5 decisions, all 0, the odds of that are 0.5^5 = 3.125%
The odds of getting 5 involves 5 decisions, all 1, the odds of that are 0.5^5 = 3.125%
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
In the algorithm you created, the result is really not equally distributed.
The result "r" will always be either "a" or "a+1"; it will never go beyond that.
It should look something like this:
r = 0;
for i = 0 to b-a
    r = a + r + Random(0,1)
return r;
By including "r" into your computation, you are including the "randomness" of all the previous "for" loop runs.

Resources