Find the minimum number of steps to make a number from a pair of numbers - algorithm

Let's assume that we have a pair of numbers (a, b). We can get a new pair (a + b, b) or (a, a + b) from the given pair in a single step.
Let the initial pair of numbers be (1,1). Our task is to find number k, that is, the least number of steps needed to transform (1,1) into the pair where at least one number equals n.
I solved it by generating all possible pairs and returning the minimum number of steps in which the given number is formed, but this takes quite a long time to compute. I guess this must somehow be related to finding the gcd. Can someone please help or point me to a link that explains the concept?
Here is a program that solves the problem, but it is not clear to me...
#include <iostream>
using namespace std;
#define INF 1000000000
int n,r=INF;
int f(int a,int b){
if(b<=0)return INF;
if(a>1&&b==1)return a-1;
return f(b,a-a/b*b)+a/b;
}
int main(){
cin>>n;
for(int i=1;i<=n/2;i++){
r=min(r,f(n,i));
}
cout<<(n==1?0:r)<<endl;
}

My approach to such problems (one I picked up from projecteuler.net) is to calculate the first few terms of the sequence and then search OEIS for a sequence with the same terms. This can result in a solution that is orders of magnitude faster. In your case the sequence is probably http://oeis.org/A178031, but unfortunately it has no easy-to-use formula.
As the constraint for n is relatively small you can do a dp on the minimum number of steps required to get to the pair (a,b) from (1,1). You take a two dimensional array that stores the answer for a given pair and then you do a recursion with memoization:
#include <algorithm>
#include <cstring>
#include <iostream>
using namespace std;
const int INF = 1000000000;
int mem[5001][5001]; // about 100 MB of memory
// Minimum number of steps to reach the pair (a, b) from (1, 1)
int solve(int a, int b) {
    if (a > b) {
        swap(a, b); // the problem is symmetric, so keep a <= b
    }
    if (a == 0) {
        return INF; // a pair containing zero can never be reached
    }
    if (mem[a][b] != -1) {
        return mem[a][b];
    }
    if (a == 1) {
        return mem[a][b] = b - 1; // (1, b) is reached by adding 1 to the second number b - 1 times
    }
    int res = solve(a, b % a); // undo b / a additions of a at once
    if (res < INF) {
        res += b / a;
    }
    return mem[a][b] = res;
}
int main() {
    memset(mem, -1, sizeof(mem));
    int n;
    cin >> n;
    int best = INF;
    for (int i = 1; i <= n; ++i) {
        best = min(best, solve(n, i));
    }
    cout << best << endl;
}
In fact in this case there is not much difference between dp and BFS, but this is the general approach to such problems. Hope this helps.
EDIT: return a big enough value in the dp if a is zero

You can use the breadth-first search algorithm to do this. At each step you generate all possible next pairs that you haven't seen before. If the set of next pairs contains the target, you're done; if not, repeat. The number of times you repeat this is the minimum number of transformations.
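The search described above could be sketched roughly as follows. This is only an illustration: the names State and min_steps_bfs are made up here, and the pruning of pairs with a component above n is an added assumption (it is safe because the numbers only ever grow).

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <utility>

// Hypothetical search state: a pair (a, b) and the number of steps used to reach it.
struct State { int a, b, depth; };

// Fewest steps from (1, 1) to a pair containing n, found by breadth-first search.
int min_steps_bfs(int n) {
    assert(n >= 1);
    std::queue<State> frontier;
    std::set<std::pair<int, int>> seen;
    frontier.push({1, 1, 0});
    seen.insert(std::make_pair(1, 1));
    while (!frontier.empty()) {
        State s = frontier.front();
        frontier.pop();
        if (s.a == n || s.b == n) return s.depth;
        // The two possible moves from (a, b).
        const std::pair<int, int> next[2] = {{s.a + s.b, s.b}, {s.a, s.a + s.b}};
        for (const auto& q : next) {
            // Skip pairs that overshoot n or were already generated.
            if (q.first <= n && q.second <= n && seen.insert(q).second) {
                frontier.push({q.first, q.second, s.depth + 1});
            }
        }
    }
    return -1; // not reached for n >= 1, because (1, n) is always reachable
}
```

Because the first time a pair is dequeued is also the shortest way to reach it, the depth of the first pair containing n is the answer. The set of visited pairs can grow to roughly n^2 entries, so this is mainly useful for small n or for checking a faster method.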

First of all, the maximum number you can get after k-2 steps is the kth Fibonacci number (taking F1 = F2 = 1). Let t be the golden ratio.
Now, for n start with (n, upper(n/t)), where upper denotes rounding up.
If x>y:
NumSteps(x,y) = NumSteps(x-y,y)+1
Else:
NumSteps(x,y) = NumSteps(x,y-x)+1
with NumSteps(1,1) = 0.
Iteratively calculate NumSteps(n, upper(n/t)).
PS: Using upper(n/t) might not always give the optimal solution. You can do a local search around this value for better results. To ensure optimality you can try ALL the values from 0 to n-1, in which case the worst-case complexity is O(n^2). But if the optimal result comes from a value close to upper(n/t), the solution is O(n log n).
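As a rough sketch of the NumSteps recurrence, the version below batches runs of repeated subtractions with a division (the same trick the gcd algorithm uses) and then tries every possible partner for n. The names num_steps, min_steps and kUnreachable are illustrative assumptions, not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>

const int kUnreachable = 1000000000; // assumed sentinel for "cannot be reached"

// Steps needed to undo (x, y) back to (1, 1) by subtracting the smaller
// number from the larger one; x / y subtractions are counted in one go.
int num_steps(int x, int y) {
    assert(x >= 1 && y >= 1);
    int steps = 0;
    while (x > 1 && y > 1) {
        if (x > y) { steps += x / y; x %= y; }
        else       { steps += y / x; y %= x; }
    }
    if (x == 0 || y == 0) return kUnreachable; // gcd > 1, so (1, 1) is never reached
    return steps + std::max(x, y) - 1;          // (1, m) needs m - 1 more steps
}

// Tries every partner for n and keeps the cheapest one.
int min_steps(int n) {
    assert(n >= 1);
    int best = kUnreachable;
    for (int partner = 1; partner <= n; ++partner) {
        best = std::min(best, num_steps(n, partner));
    }
    return best;
}
```

With the division, each call to num_steps takes a logarithmic number of iterations, so trying all partners costs about O(n log n) in total rather than O(n^2).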

Related

How do I add memoization to this recursive approach?

I was attempting to solve the coin change problem (https://leetcode.com/problems/coin-change) and have come up with the following recursive approach:
class Solution {
public int coinChange(int[] coins, int amount) {
int[] min = new int[1];
min[0] = Integer.MAX_VALUE;
recur(coins,amount,min,0);
min[0] = min[0]==Integer.MAX_VALUE?-1:min[0];
return min[0];
}
private void recur(int[] coins,int amount,int[] min,int coinsUsed){
if(amount==0){
min[0] = Math.min(min[0],coinsUsed);
return;
}
if(amount<0){
return;
}
for(int i=0;i<coins.length;++i){
recur(coins,amount-coins[i],min,coinsUsed+1);
}
}
}
Currently the time complexity is O(coins.length^amount). How can I add memoization to improve this?
Usually top-down with memoization gets a bit complicated. A simple way is to use a 2-d array and solve the problem bottom up. Suppose you have n denominations (indexed from 1 to n) and the sum is S. Let T[n][S] be the minimal number of coins you need to add up to S using the first n denominations, and v[n] the value of coin n.
You have two choices: either you use coin n or you don't. If you don't, then T[n][S]=T[n-1][S]. If you do, then T[n][S]=1+T[n][S-v[n]]. Since you don't know which is better,
T[n][S]=min(T[n-1][S],1+T[n][S-v[n]]).
This is true for all n, so we get the recursive formula
T[i][s]=min(T[i-1][s],1+T[i][s-v[i]])
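Now for the boundary conditions:
T[i][0]=0 for 0<= i <= n
T[0][s]=INF for 1<= s <= S (with no coins the sum cannot be formed)
If coin 1 has value 1, this gives T[1][s]=s for all 0<= s <= S.
// Assumes <vector> and <algorithm>; v[1..n] holds the coin values.
const int INF = 1000000000; // marks sums that cannot be formed
int x, y;
vector<vector<int> > T(n + 1, vector<int>(S + 1, INF));
//initialization
for (x = 0; x <= n; x++)
    T[x][0] = 0;
//solution
for (x = 1; x <= n; x++) {
    for (y = 1; y <= S; y++) {
        if (v[x] <= y) // coin x fits, so using it is an option
            T[x][y] = min(T[x-1][y], 1 + T[x][y-v[x]]);
        else
            T[x][y] = T[x-1][y];
    }
}
// T[n][S] >= INF means that S cannot be formed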
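For comparison, here is a minimal sketch of what a top-down memoized version might look like, written in C++ rather than the question's Java. The key change from the recursion in the question is that the function returns the answer for a given amount instead of carrying coinsUsed along, so the result depends only on amount and can be cached. The names fewest_coins, coin_change and kImpossible are assumptions made for this example, and coin values are assumed to be positive.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

const int kImpossible = 1000000000; // assumed sentinel: the amount cannot be formed

// memo[a] == -1 means "not computed yet"; otherwise it caches the answer for amount a.
int fewest_coins(const std::vector<int>& coins, int amount, std::vector<int>& memo) {
    if (amount == 0) return 0;
    if (amount < 0) return kImpossible;
    if (memo[amount] != -1) return memo[amount];
    int best = kImpossible;
    for (int c : coins) {
        int rest = fewest_coins(coins, amount - c, memo);
        if (rest != kImpossible) best = std::min(best, rest + 1);
    }
    return memo[amount] = best;
}

// Same contract as the LeetCode problem: -1 if the amount cannot be formed.
int coin_change(const std::vector<int>& coins, int amount) {
    assert(amount >= 0);
    std::vector<int> memo(amount + 1, -1);
    int best = fewest_coins(coins, amount, memo);
    return best == kImpossible ? -1 : best;
}
```

Each amount is solved once and each solve loops over the coins, so the running time is O(amount * coins.size()) with O(amount) extra memory.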

Algorithm that will maintain top n items in the past k days?

I would like to implement a data structure maintaining a set S for a leaderboard that can answer the following queries efficiently, while also being memory-efficient:
add(x, t) Add a new item with score x to set S with an associated time t.
query(u) List the top n items (sorted by score) in the set S which has associated time t such that t + k >= u. Each subsequent query will have a u no smaller than previous queries.
In plain English, high scores can be added to this leaderboard individually, and I'd like an algorithm that can efficiently query the top n items on the leaderboard within the past k days (where k and n are fixed constants).
n can be assumed to be much less than the total number of items, and scores may be assumed to be random.
A naïve algorithm would be to store all elements as they are added into a balanced binary search tree sorted by score, and remove elements from the tree when they are more than k days old. Detecting elements that are more than k days old can be done with another balanced binary search tree sorted by time. This algorithm would yield a good time complexity of O(log(h)) where h is the total number of scores added in the past k days. However, the space complexity is O(h), and it is easy to see that most of the data saved will never be reported in a query even if no new scores are added for the next k days.
If n is 1, a simple double-ended queue is all that is necessary. Before adding a new item to the front of the queue, remove items from the front that have a smaller score than the new item, because they will never be reported in a query. Before querying, remove items from the back of the queue that are too old, then return the item that is left at the back of the queue. All operations would be amortized constant time complexity, and I wouldn't be storing items that would never be reported.
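A minimal sketch of the n = 1 case described above, assuming times arrive in non-decreasing order. The class name TopOneWindow and the bool-returning query signature are choices made for this example only.

```cpp
#include <cassert>
#include <deque>

struct Entry { int score; int time; };

// Hypothetical class; k is the window length from the question.
class TopOneWindow {
public:
    explicit TopOneWindow(int k) : k_(k) { assert(k >= 0); }

    // Times are assumed to arrive in non-decreasing order.
    void add(int score, int time) {
        // Older entries with a smaller score can never be the maximum again.
        while (!items_.empty() && items_.front().score < score) items_.pop_front();
        items_.push_front({score, time});
    }

    // Returns true and writes the best live score, or false if nothing is left.
    bool query(int u, int& best) {
        // Drop entries whose time t no longer satisfies t + k >= u.
        while (!items_.empty() && items_.back().time + k_ < u) items_.pop_back();
        if (items_.empty()) return false;
        best = items_.back().score;
        return true;
    }

private:
    int k_;
    std::deque<Entry> items_;
};
```

From back to front the deque holds strictly older entries with scores that never increase, which is why the back is always the current maximum and why every entry is pushed and popped at most once.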
When n is more than 1, I can't seem to be able to formulate an algorithm which has a good time complexity and only stores items that could possibly be reported. An algorithm with time complexity O(log(h)) would be great, but n is small enough so that O(log(h) + n) is acceptable too.
Any ideas? Thanks!
This solution is based on the double-ended queue solution, and I assume t is ascending.
The idea is that a record can be removed once there are n records with both larger t and larger x than it, which is tracked by Record.count in the sample code.
As each record is moved from S to temp at most n times, add has amortized time complexity O(n).
The space complexity is hard to determine. However, it looks fine in the simulation: S.size() is about 400 when h = 10000 and n = 50.
#include <iostream>
#include <vector>
#include <queue>
#include <cstdlib>
using namespace std;
const int k = 10000, n = 50;
class Record {
public:
Record(int _x, int _t): x(_x), t(_t), count(n) {}
int x, t, count;
};
deque<Record> S;
void add(int x, int t)
{
Record record(x, t);
vector<Record> temp;
while (!S.empty() && record.x >= S.back().x) {
if (--S.back().count > 0) temp.push_back(S.back());
S.pop_back();
}
S.push_back(record);
while (!temp.empty()) {
S.push_back(temp.back());
temp.pop_back();
}
}
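vector<int> query(int u)
{
    // S is ordered by score, so expired records can sit anywhere, not only at the front
    deque<Record> alive;
    for (const Record& r : S)
        if (r.t + k >= u) alive.push_back(r);
    S.swap(alive);
    vector<int> xs;
    for (int i = 0; i < (int)S.size() && i < n; ++i)
        xs.push_back(S[i].x);
    return xs;
}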
int main()
{
for (int t = 1; t <= 1000000; ++t) {
add(rand(), t);
vector<int> xs = query(t);
if (t % k == 0) {
cout << "t = " << t << endl;
cout << "S.size() = " << S.size() << endl;
for (auto x: xs) cout << x << " ";
cout << endl;
}
}
return 0;
}

Reconstructing the list of items from a space optimized 0/1 knapsack implementation

A space optimization for the 0/1 knapsack dynamic programming algorithm is to use a 1-d array (say, A) of size equal to the knapsack capacity, and simply overwrite A[w] (if required) at each iteration i, where A[w] denotes the total value if the first i items are considered and knapsack capacity is w.
If this optimization is used, is there a way to reconstruct the list of items picked, perhaps by saving some extra information at each iteration of the DP algorithm? For example, in the Bellman Ford Algorithm a similar space optimization can be implemented, and the shortest path can still be reconstructed as long as we keep a list of the predecessor pointers, ie the last hop (or first, depending on if a source/destination oriented algorithm is being written).
For reference, here is my C++ function for the 0/1 knapsack problem using dynamic programming where I construct a 2-d vector ans such that ans[i][j] denotes the total value considering the first i items and knapsack capacity j. I reconstruct the items picked by reverse traversing this vector ans:
void knapsack(vector<int> v,vector<int>w,int cap){
//v[i]=value of item i-1
//w[i]=weight of item i-1, cap=knapsack capacity
//ans[i][j]=total value if considering 1st i items and capacity j
vector <vector<int> > ans(v.size()+1,vector<int>(cap+1));
//value with 0 items is 0
ans[0]=vector<int>(cap+1,0);
//value with 0 capacity is 0
for (uint i=1;i<v.size()+1;i++){
ans[i][0]=0;
}
//dp
for (uint i=1;i<v.size()+1;i++) {
for (int x=1;x<cap+1;x++) {
if (x<w[i-1]||ans[i-1][x]>=ans[i-1][x-w[i-1]]+v[i-1]) //check the weight first so that x-w[i-1] is never negative
ans[i][x]=ans[i-1][x];
else {
ans[i][x]=ans[i-1][x-w[i-1]]+v[i-1];
}
}
}
cout<<"Total value: "<<ans[v.size()][cap]<<endl;
//reconstruction
cout<<"Items to carry: \n";
for (uint i=v.size();i>0;i--) {
for (int x=cap;x>0;x--) {
if (ans[i][x]==ans[i-1][x]) //item i not in knapsack
break;
else if (ans[i][x]==ans[i-1][x-w[i-1]]+v[i-1]) { //item i in knapsack
cap-=w[i-1];
cout<<i<<"("<<v[i-1]<<"), ";
break;
}
}
}
cout<<endl;
}
The following is a C++ implementation of yildizkabaran's answer. It adapts Hirschberg's clever divide & conquer idea to compute the answer to a knapsack instance with n items and capacity c in O(nc) time and just O(c) space:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
// Returns a vector of (cost, elem) pairs.
vector<pair<int, int>> optimal_cost(vector<int> const& v, vector<int> const& w, int cap) {
vector<pair<int, int>> dp(cap + 1, { 0, -1 });
for (int i = 0; i < size(v); ++i) {
for (int j = cap; j >= 0; --j) {
if (w[i] <= j && dp[j].first < dp[j - w[i]].first + v[i]) {
dp[j] = { dp[j - w[i]].first + v[i], i };
}
}
}
return dp;
}
// Returns a vector of item labels corresponding to an optimal solution, in increasing order.
vector<int> knapsack_hirschberg(vector<int> const& v, vector<int> const& w, int cap, int offset = 0) {
if (empty(v)) {
return {};
}
int mid = size(v) / 2;
auto subSol1 = optimal_cost(vector<int>(begin(v), begin(v) + mid), vector<int>(begin(w), begin(w) + mid), cap);
auto subSol2 = optimal_cost(vector<int>(begin(v) + mid, end(v)), vector<int>(begin(w) + mid, end(w)), cap);
pair<int, int> best = { -1, -1 };
for (int i = 0; i <= cap; ++i) {
best = max(best, { subSol1[i].first + subSol2[cap - i].first, i });
}
vector<int> solution;
if (subSol1[best.second].second != -1) {
int iChosen = subSol1[best.second].second;
solution = knapsack_hirschberg(vector<int>(begin(v), begin(v) + iChosen), vector<int>(begin(w), begin(w) + iChosen), best.second - w[iChosen], offset);
solution.push_back(subSol1[best.second].second + offset);
}
if (subSol2[cap - best.second].second != -1) {
int iChosen = mid + subSol2[cap - best.second].second;
auto subSolution = knapsack_hirschberg(vector<int>(begin(v) + mid, begin(v) + iChosen), vector<int>(begin(w) + mid, begin(w) + iChosen), cap - best.second - w[iChosen], offset + mid);
copy(begin(subSolution), end(subSolution), back_inserter(solution));
solution.push_back(iChosen + offset);
}
return solution;
}
Even though this is an old question, I recently ran into the same problem, so I figured I would write my solution here. What you need is Hirschberg's algorithm. Although this algorithm was written for reconstructing edit distances, the same principle applies here. The idea is that when searching for n items in capacity c, the knapsack state at the (n/2)th item corresponding to the final maximum value is determined in the first scan. Let's call this state weight_m and value_m. This can be done by keeping track of an additional 1-d array of size c, so the memory is still O(c). Then the problem is divided into two parts: items 0 to n/2 with a capacity of weight_m, and items n/2 to n with a capacity of c-weight_m. The reduced problems have a total size of nc/2. Continuing this approach, we can determine the knapsack state (occupied weight and current value) after each item, after which we can simply check which items were included. This algorithm takes about 2nc steps (nc + nc/2 + nc/4 + ...) while using O(c) memory, so in terms of big-O nothing changes even though the algorithm is roughly twice as slow. I hope this helps anyone who is facing a similar problem.
To my understanding, with the proposed solution it is effectively impossible to obtain the set of involved items for a certain objective value. The set of items can be obtained either by generating the discarded rows again or by maintaining a suitable auxiliary data structure. This could be done by associating each entry in A with the list of items from which it was generated. However, this would require more memory than the initially proposed solution. Approaches to backtracking for knapsack problems are also briefly discussed in this journal paper.
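One possible form of such an auxiliary structure, sketched here as an assumption rather than as what the answer above had in mind, is a table of decision flags: the values stay in the 1-d array A, and one boolean per (item, capacity) records whether that item improved the entry. The function name knapsack_items and the table name took are invented for this example, and weights are assumed to be non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the 0-based indices of one optimal item set, in increasing order.
std::vector<int> knapsack_items(const std::vector<int>& v, const std::vector<int>& w, int cap) {
    assert(v.size() == w.size() && cap >= 0);
    const int n = static_cast<int>(v.size());
    std::vector<int> A(cap + 1, 0); // the usual 1-d value array
    // took[i][j] is true when item i improved A[j]; this is the extra information.
    std::vector<std::vector<bool>> took(n, std::vector<bool>(cap + 1, false));
    for (int i = 0; i < n; ++i) {
        for (int j = cap; j >= w[i]; --j) {
            if (A[j - w[i]] + v[i] > A[j]) {
                A[j] = A[j - w[i]] + v[i];
                took[i][j] = true;
            }
        }
    }
    // Walk the decisions backwards from the full capacity.
    std::vector<int> items;
    int j = cap;
    for (int i = n - 1; i >= 0; --i) {
        if (took[i][j]) {
            items.push_back(i);
            j -= w[i];
        }
    }
    std::reverse(items.begin(), items.end());
    return items;
}
```

This still needs n * (cap + 1) flags, so it does not reach the O(c) space of the divide and conquer approach, but std::vector<bool> is typically packed to one bit per entry, which is far smaller than a full 2-d table of integers.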

Minimizing the distance of pairing points

My problem is as follows:
Given a number of 2n points, I can calculate the distance between all points
and get a symmetrical matrix.
Can you create n pairs of points, so that the sum of the distance of all pairs is
minimal?
EDIT: Every point has to be in one of the pairs. Which means that
every point is only allowed to be in one pair.
I have naively tried to use the Hungarian algorithm and hoped that it may give me an assignment, so that the assignments are symmetrical. But that obviously did not work, as I do not have a bipartite graph.
After a search, I found the Stable roommates problem, which seems to be similar to my problem, but the difference is, that it just tries to find a matching, but not to try to minimize some kind of distance.
Does anyone know a similar problem or even a solution? Did I miss something? The problem does actually not seem that difficult, but I just could not come up with an optimal solution.
There's a primal-dual algorithm due to Edmonds (the Blossom algorithm), which you really don't want to implement yourself if possible. Vladimir Kolmogorov has an implementation that may be suitable for your purposes.
Try network flow. The max flow is the number of pairs you want to create; then compute the minimum cost for that flow.
This isn't a guarantee, just a hunch.
You can find the shortest pair, match its two points, and remove them from the set,
then recurse until no points are left.
It is clearly sub-optimal, but I have a hunch that the ratio between this solution and the optimal one can be bounded. The hope is to use some submodularity argument and bound it by something like a (1 - 1/e) fraction of the global optimum, but I wasn't able to do it. Maybe someone could take a stab at it.
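The greedy idea above might be sketched like this. It is a heuristic only, not an optimal matching; the function name greedy_pairs is made up, and dist is assumed to be a symmetric matrix over an even number of points.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Greedy heuristic: repeatedly take the shortest remaining pair whose points are both free.
std::vector<std::pair<int, int>> greedy_pairs(const std::vector<std::vector<double>>& dist) {
    const int m = static_cast<int>(dist.size());
    assert(m % 2 == 0);
    // Collect every candidate pair once, keyed by its distance.
    std::vector<std::tuple<double, int, int>> edges;
    for (int i = 0; i < m; ++i)
        for (int j = i + 1; j < m; ++j)
            edges.emplace_back(dist[i][j], i, j);
    std::sort(edges.begin(), edges.end());
    std::vector<bool> used(m, false);
    std::vector<std::pair<int, int>> pairs;
    for (const auto& e : edges) {
        const int i = std::get<1>(e);
        const int j = std::get<2>(e);
        if (!used[i] && !used[j]) {
            used[i] = true;
            used[j] = true;
            pairs.emplace_back(i, j);
        }
    }
    return pairs;
}
```

For four points on a line at positions 0, 3, 4 and 7, this picks (3, 4) first and is then forced to pair 0 with 7, for a total of 8, while pairing (0, 3) and (4, 7) gives 6. That small case shows why the result is only an approximation.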
There is a C++ memoization implementation in Competitive Programming 3 as follows (note maximum of N was 8):
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <cstring>
using namespace std;
int N, target;
double dist[20][20], memo[1<<16];
double matching(int bitmask)
{
if (memo[bitmask] > -0.5) // Already computed? Then return the result if yes
return memo[bitmask];
if (bitmask == target) // If all students are already matched then cost is zero
return memo[bitmask] = 0;
double ans = 2000000000.0; // Infinity could also work
int p1, p2;
for (p1 = 0; p1 < 2*N; ++p1) // Find first non-matched point
if (!(bitmask & (1 << p1)))
break;
for (p2 = p1 + 1; p2 < 2*N; ++p2) // and pair it with another non-matched point
if (!(bitmask & (1 << p2)))
ans = min(ans, dist[p1][p2]+matching(bitmask| (1 << p1) | (1 << p2)));
return memo[bitmask] = ans;
}
and then the main method (driving code)
int main()
{
int i,j, caseNo = 1, x[20], y[20];
while(scanf("%d", &N), N){
for (i = 0; i < 2 * N; ++i)
scanf("%d %d", &x[i], &y[i]);
for (i = 0; i < 2*N - 1; ++i)
for (j = i + 1; j < 2*N; ++j)
dist[i][j] = dist[j][i] = hypot(x[i]-x[j], y[i]-y[j]);
// use DP to solve min weighted perfect matching on small general graph
for (i = 0; i < (1 << 16); ++i) memo[i] = -1;
target = (1 << (2 * N)) - 1;
printf("Case %d: %.2lf\n", caseNo++, matching(0));
}
return 0;
}

Find the largest subset of it which form a sequence

I came across this problem on an interview forum:
Given an int array which might contain duplicates, find the largest subset of it which forms a sequence.
E.g. {1,6,10,4,7,9,5}
then the answer is 4,5,6,7.
Sorting is an obvious solution. Can this be done in O(n) time?
My take on the problem is that this cannot be done in O(n) time, and the reason is that if we could do this in O(n) time, we could also sort in O(n) time (without knowing the upper bound),
as a random array can contain all the elements of a sequence but in random order.
Does this sound like a plausible explanation? Your thoughts?
I believe it can be solved in O(n) if you assume you have enough memory to allocate an uninitialized array of a size equal to the largest value, and that allocation can be done in constant time. The trick is to use a lazy array, which gives you the ability to create a set of items in linear time with a membership test in constant time.
Phase 1: Go through each item and add it to the lazy array.
Phase 2: Go through each undeleted item, and delete all contiguous items.
In phase 2, you determine the range and remember it if it is the largest so far. Items can be deleted in constant time using a doubly-linked list.
Here is some incredibly kludgy code that demonstrates the idea:
#include <stdio.h>
#include <stdlib.h>
int main(int argc,char **argv)
{
static const int n = 8;
int values[n] = {1,6,10,4,7,9,5,5};
int index[n];
int lists[n];
int prev[n];
int next_existing[n]; //
int prev_existing[n];
int index_size = 0;
int n_lists = 0;
// Find largest value
int max_value = 0;
for (int i=0; i!=n; ++i) {
int v=values[i];
if (v>max_value) max_value=v;
}
// Allocate a lazy array
int *lazy = (int *)malloc((max_value+1)*sizeof(int));
// Set items in the lazy array and build the lists of indices for
// items with a particular value.
for (int i=0; i!=n; ++i) {
next_existing[i] = i+1;
prev_existing[i] = i-1;
int v = values[i];
int l = lazy[v];
if (l>=0 && l<index_size && index[l]==v) {
// already there, add it to the list
prev[n_lists] = lists[l];
lists[l] = n_lists++;
}
else {
// not there -- create a new list
l = index_size;
lazy[v] = l;
index[l] = v;
++index_size;
prev[n_lists] = -1;
lists[l] = n_lists++;
}
}
// Go through each contiguous range of values and delete them, determining
// what the range is.
int max_count = 0;
int max_begin = -1;
int max_end = -1;
int i = 0;
while (i<n) {
// Start by searching backwards for a value that isn't in the lazy array
int dir = -1;
int v_mid = values[i];
int v = v_mid;
int begin = -1;
for (;;) {
int l = (v>=0 && v<=max_value) ? lazy[v] : -1; // values outside the allocated range are never present
if (l<0 || l>=index_size || index[l]!=v) {
// Value not in the lazy array
if (dir==1) {
// Hit the end
if (v-begin>max_count) {
max_count = v-begin;
max_begin = begin;
max_end = v;
}
break;
}
// Hit the beginning
begin = v+1;
dir = 1;
v = v_mid+1;
}
else {
// Remove all the items with value v
int k = lists[l];
while (k>=0) {
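if (k!=i) {
// Unlink item k (an item index, not the list slot l) from the list of remaining items
if (prev_existing[k]>=0) next_existing[prev_existing[k]] = next_existing[k];
if (next_existing[k]<n) prev_existing[next_existing[k]] = prev_existing[k];
}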
k = prev[k];
}
v += dir;
}
}
// Go to the next existing item
i = next_existing[i];
}
// Print the largest range
for (int i=max_begin; i!=max_end; ++i) {
if (i!=max_begin) fprintf(stderr,",");
fprintf(stderr,"%d",i);
}
fprintf(stderr,"\n");
free(lazy);
}
I would say there are ways to do it. The algorithm is the one you already describe, but with an O(n) sorting algorithm. Since such algorithms exist for certain inputs (bucket sort, radix sort), this works (which also fits your argument for why it should not work in general).
Vaughn Cato's suggested implementation works like this (it behaves like a bucket sort, with the lazy array serving as buckets on demand).
As shown by M. Ben-Or in Lower bounds for algebraic computation trees, Proc. 15th ACM Sympos. Theory Comput., pp. 80-86, 1983, cited by J. Erickson in Finding Longest Arithmetic Progressions, this problem requires Ω(n log n) time (even if the input is already sorted into order) when using an algebraic decision tree model of computation.
Earlier, I posted the following example in a comment to illustrate that sorting the numbers does not provide an easy answer to the question: Suppose the array is given already sorted into ascending order. For example, let it be (20 30 35 40 47 60 70 80 85 95 100). The longest sequence found in any subsequence of the input is 20,40,60,80,100 rather than 30,35,40 or 60,70,80.
Regarding whether an O(n) algebraic decision tree solution to this problem would provide an O(n) algebraic decision tree sorting method: As others have pointed out, a solution to this subsequence problem for a given multiset does not provide a solution to a sorting problem for that multiset. As an example, consider set {2,4,6,x,y,z}. The subsequence solver will give you the result (2,4,6) whenever x,y,z are large numbers not in arithmetic sequence, and it will tell you nothing about the order of x,y,z.
What about this? Populate a hash table so that each value stores the start of the range seen so far for that number, except for the head element, which stores the end of the range. O(n) time, O(n) space. A tentative Python implementation (you could do it in one traversal by keeping some state variables, but this way seems clearer):
def longest_subset(xs):
    table = {}
    for x in xs:
        if x in table:
            continue  # a duplicate would overwrite the stored range endpoints
        start = table.get(x-1, x)
        end = table.get(x+1, x)
        if x+1 in table:
            table[end] = start
        if x-1 in table:
            table[start] = end
        table[x] = (start if x-1 in table else end)
    start, end = max(table.items(), key=lambda pair: pair[1]-pair[0])
    return list(range(start, end+1))
print(longest_subset([1, 6, 10, 4, 7, 9, 5]))
# [4, 5, 6, 7]
Here is an unoptimized implementation; maybe you will find it useful. Note that the scan is linear in the value range max(A) - min(A) rather than in n:
hash_tb = {}
A = [1, 6, 10, 4, 7, 9, 5]
for a in A:
    if a not in hash_tb:
        hash_tb[a] = a
max_sq = []
cur_seq = []
# Scan one past max(A) so that the final run is compared as well
for i in range(min(A), max(A) + 2):
    if i in hash_tb:
        cur_seq.append(i)
    else:
        if len(cur_seq) > len(max_sq):
            max_sq = cur_seq
        cur_seq = []
print(max_sq)
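For reference, the hash-based idea can also be sketched with a hash set, reading "sequence" as a run of consecutive integers as in the question's example. The function name longest_run is an assumption, and the code assumes the values are far enough from the int limits that x - 1 and x + len do not overflow.

```cpp
#include <unordered_set>
#include <vector>

// Returns the longest run of consecutive integers present in xs (empty if xs is empty).
std::vector<int> longest_run(const std::vector<int>& xs) {
    std::unordered_set<int> present(xs.begin(), xs.end()); // duplicates collapse here
    int best_start = 0;
    int best_len = 0;
    for (int x : present) {
        if (present.count(x - 1)) continue; // x is not the first element of its run
        int len = 1;
        while (present.count(x + len)) ++len;
        if (len > best_len) {
            best_len = len;
            best_start = x;
        }
    }
    std::vector<int> run;
    for (int i = 0; i < best_len; ++i) run.push_back(best_start + i);
    return run;
}
```

Each run is walked only from its first element, so every value is visited a constant number of times and the expected running time is O(n), given constant-time hash lookups on average.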
