How to optimize my DP solution to this algorithmic task - algorithm

We have a computer that must be powered for n days. Each day the shop offers m batteries, each of which lasts only one day. Additionally, when you buy k items on a given day, you must pay a tax of k^2. Print the minimum cost of running the computer for n days.
For example for
5 5
100 1 1 1 1
100 2 2 2 2
100 3 3 3 3
100 4 4 4 4
100 5 5 5 5
Output will be 18, and for
10 1
1000000000
1000000000
1000000000
1000000000
1000000000
1000000000
1000000000
1000000000
1000000000
1000000000
Output will be 10000000010 (each of the 10 days you buy the single battery for 10^9 plus a tax of 1^2 = 1, so the total is 10 * (10^9 + 1)).
I cannot find an approach faster than checking all possibilities. Can you point me to a better solution? Limits: 1 <= n * m <= 10^6; each price is between 1 and 10^9.

You can simply sort each day's offers (you will want to pick the lowest prices first each day) and add each item to a priority queue as value + tax, where the tax of the j-th cheapest item offered that day is 2*j - 1. That works because j^2 - (j - 1)^2 = 2*j - 1, i.e., the j-th purchase in a day adds exactly 2*j - 1 of tax on top of the previous j - 1 purchases. Then each day you remove the first item from the queue (the best battery you can currently buy).
#include <iostream>
#include <algorithm>
#include <queue>
#include <vector>
using namespace std;

int n, m, x;
vector<long long> pom;        // this day's offers
priority_queue<long long> q;  // max-heap of negated (price + marginal tax)
long long score;

int main(){
    cin.tie(0);
    ios_base::sync_with_stdio(0);
    cin >> n >> m;
    for(int i = 0; i < n; i++){
        for(int j = 0; j < m; j++){
            cin >> x;
            pom.push_back(x);
        }
        sort(pom.begin(), pom.end());
        for(int j = 0; j < (int)pom.size(); j++){
            pom[j] += 1 + 2 * j;  // marginal tax of the (j+1)-th purchase that day
            q.push(-pom[j]);      // negate to turn the max-heap into a min-heap
        }
        pom.clear();
        score += q.top();         // buy the cheapest battery available today
        q.pop();
    }
    cout << -score;
}
The complexity of that solution is O(n*m*log(n*m)): sorting each day costs O(m log m), and the priority queue grows to up to n*m elements.
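For reference, the same greedy can be wrapped in a function (a sketch; the function name and signature are mine) so it is easy to check against the two examples above:

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <vector>
using namespace std;

// Same greedy as above: each day, push every offer with its marginal tax
// (2*j - 1 for the j-th cheapest offer of the day), then buy the overall
// cheapest battery seen so far. prices[day] holds the m offers of that day.
long long minCost(int n, int m, const vector<vector<long long>>& prices) {
    priority_queue<long long, vector<long long>, greater<long long>> q; // min-heap
    long long total = 0;
    for (int day = 0; day < n; day++) {
        vector<long long> offers = prices[day];
        sort(offers.begin(), offers.end());
        for (int j = 0; j < m; j++)
            q.push(offers[j] + 2LL * j + 1);  // price + marginal tax
        total += q.top();                     // one battery needed per day
        q.pop();
    }
    return total;
}
```

On the first example this yields 18 and on the second 10000000010, matching the expected outputs.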

Related

How to print (output) multiple lines from a loop?

I would like to know how to output multiple lines from a for loop.
For example, I want to input a number of subsequent lines N and then the numbers themselves. I am trying to output the inputs that I provided below, but the program keeps printing repeated copies of the numbers instead of echoing them back as I entered them.
This is what I know so far.
#include <iostream>
using namespace std;

int main()
{
    int N, M, num;
    cin >> N;
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            cin >> M;
            for (int k = 0; k < M; k++)
                cout << M << endl;
        }
    }
    return 0;
}
Input:
2 (This is for N)
1 2 3
4 5 6
or
3
10 20 30
50 100 500
1000 5 0
---------
Output:
1 2 3
4 5 6
10 20 30
50 100 500
1000 5 0
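A minimal sketch of a fix: read each value into its own variable and print it immediately, instead of re-printing the loop bound M. The stream-based helper is my own illustration (not from the question); passing streams in makes the echo easy to test.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// Echo N lines of three integers each, reading from `in` and writing to `out`.
void echo_lines(istream& in, ostream& out) {
    int N;
    in >> N;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < 3; j++) {
            int num;
            in >> num;                       // a fresh value each iteration
            out << num << (j < 2 ? ' ' : '\n');
        }
    }
}
```

Calling `echo_lines(cin, cout)` from `main` reproduces the desired output for the inputs shown above.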

SPOJ: DCOWS Why a Greedy algorithm does not work?

Question Link: https://www.spoj.com/problems/DCOWS/
I am trying to figure out why my greedy approach to solve the above problem does not work.
Given two lists B & C of sizes N & M respectively (M > N), consisting of the heights of bulls and cows, my approach to solving this problem is as follows:
Sort both lists B & C in non-decreasing order
Set k = 0
For each item Bi in list B
Using modified binary search on C[k..M-N+i] find an element Cj at position j, 0<=j<=M-N in list C which has the minimum absolute difference with Bi
Add abs(Bi - Cj) to the result
Update k = j + 1 for next iteration of the loop
Here is the code:
#include <iostream>
#include <algorithm>
#include <cmath>
using namespace std;

int my_bsearch(long *arr, int lo, int hi, long x)
{
    int mid = lo + (hi - lo)/2;
    if (lo == mid)
    {
        if (abs(x - arr[lo]) <= abs(x - arr[hi])) return lo;
        else return hi;
    }
    if ((mid-1 >= 0) && (abs(x - arr[mid-1]) <= abs(x - arr[mid])))
        return my_bsearch(arr, lo, mid, x);
    else
        return my_bsearch(arr, mid, hi, x);
}

int main() {
    int M, N;
    cin >> N >> M;
    long bulls[N], cows[M];
    for (int i = 0; i < N; i++) cin >> bulls[i];
    for (int i = 0; i < M; i++) cin >> cows[i];
    sort(bulls, bulls + N);
    sort(cows, cows + M);
    long long min_val = 0, lo = 0, hi = M - N;
    for (int i = 0; i < N; i++) {
        lo = my_bsearch(cows, lo, hi, bulls[i]);
        min_val += abs(bulls[i] - cows[lo]);
        lo++, hi++;
    }
    cout << min_val << endl;
    return 0;
}
As described in the similar question "Can we solve the printing neatly problem using a greedy algorithm", a greedy solution is often led astray. Consider this data:
Bulls: 5, 5
Cows: 1, 6, 15
Your algorithm outputs a minimum total distance of 11 (pairing 5 with 6, and then 5 with 15). But the optimal solution is clearly 5 (pairing 5 with 1, and 5 with 6).
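For completeness, here is a sketch of the standard dynamic-programming fix over the two sorted lists (my illustration, not a submitted solution): dp[i][j] is the cheapest way to pair the first i bulls using only the first j cows; each cow is either skipped or paired with the next unmatched bull, which is valid because an optimal matching of two sorted lists preserves order.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>
using namespace std;

// Minimum total |B[i] - C[j]| pairing every bull with a distinct cow.
// dp rolled to two rows: prev = row for i-1 bulls, cur = row for i bulls.
long long minTotalDistance(vector<long long> bulls, vector<long long> cows) {
    sort(bulls.begin(), bulls.end());
    sort(cows.begin(), cows.end());
    int n = bulls.size(), m = cows.size();
    const long long INF = 1e18;
    vector<long long> prev(m + 1, 0), cur(m + 1, INF);
    for (int i = 1; i <= n; i++) {
        fill(cur.begin(), cur.end(), INF);
        for (int j = i; j <= m; j++) {
            // either skip cow j, or pair bull i with cow j
            cur[j] = min(cur[j - 1],
                         prev[j - 1] + llabs(bulls[i - 1] - cows[j - 1]));
        }
        swap(prev, cur);
    }
    return prev[m];
}
```

On the counterexample above this returns 5, where the greedy returns 11; the running time is O(N*M) after sorting.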

Find the quotient of a number

Given a number N, I have to count the integers for which repetitive division of N gives quotient one.
For example, N = 8. The numbers are:
2, as 8/2 = 4, 4/2 = 2, 2/2 = 1
5, as 8/5 = 1
6, as 8/6 = 1
7 and 8
My approach:
All the numbers from N/2 + 1 to N give quotient 1, so:
Answer: N/2 + (check the numbers from 2 to sqrt(N))
Time complexity: O(sqrt(N))
Is there any better way to do this, since N can be up to 10^12 and there can be many queries?
Can it be O(1), or O(40) (because 2^40 exceeds 10^12)?
A test harness to verify functionality and assess the order of complexity.
Edit as needed - it's a wiki.
#include <math.h>
#include <stdio.h>
#include <stdlib.h> /* for exit() */
unsigned long long nn = 0;

unsigned repeat_div(unsigned n, unsigned d) {
    for (;;) {
        nn++;
        unsigned q = n / d;
        if (q <= 1) return q;
        n = q;
    }
}

unsigned num_repeat_div2(unsigned n) {
    unsigned count = 0;
    for (unsigned d = 2; d <= n; d++) {
        count += repeat_div(n, d);
    }
    return count;
}

unsigned num_repeat_div2_NM(unsigned n) {
    unsigned count = 0;
    if (n > 1) {
        count += (n + 1) / 2;
        unsigned hi = (unsigned) sqrt(n);
        for (unsigned d = 2; d <= hi; d++) {
            count += repeat_div(n, d);
        }
    }
    return count;
}

unsigned num_repeat_div2_test(unsigned n) {
    // number of integers whose repetitive division with n gives quotient one
    unsigned count = 0;
    // increment nn in the code's tightest loop
    ...
    return count;
}
///
unsigned (*method_rd[])(unsigned) = { num_repeat_div2, num_repeat_div2_NM,
        num_repeat_div2_test };
#define RD_N (sizeof method_rd / sizeof method_rd[0])

unsigned test_rd(unsigned n, unsigned long long *iteration) {
    unsigned count = 0;
    for (unsigned i = 0; i < RD_N; i++) {
        nn = 0;
        unsigned this_count = method_rd[i](n);
        iteration[i] += nn;
        if (i > 0 && this_count != count) {
            printf("Oops %u %u %u\n", i, count, this_count);
            exit(-1);
        }
        count = this_count;
        // printf("rd[%u](%u) = %u. Iterations:%llu\n", i, n, this_count, nn);
    }
    return count;
}

void tests_rd(unsigned lo, unsigned hi) {
    unsigned long long total_iterations[RD_N] = {0};
    unsigned long long total_count = 0;
    for (unsigned n = lo; n <= hi; n++) {
        total_count += test_rd(n, total_iterations);
    }
    for (unsigned i = 0; i < RD_N; i++) {
        printf("Sum rd(%u,%u) --> %llu. total Iterations %llu\n", lo, hi,
                total_count, total_iterations[i]);
    }
}

int main(void) {
    tests_rd(2, 10 * 1000);
    return 0;
}
If you'd like O(1) lookup per query: the hash table of naturals <= 10^12 that are powers of other naturals will not be much larger than 2,000,000 elements. Create it by iterating over the bases from 1 to 1,000,000, incrementing the value of keys already seen; roots 1,000,000...10,001 need only be squared; roots 10,000...1,001 need only be cubed; after that, as has been mentioned, there can be at most 40 multiplications for the smallest roots.
Each value in the table will represent the number of base/power configurations (e.g., 512 -> 2, corresponding to 2^9 and 8^3).
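As a sketch of how that table could be built (the function name and the choice of `unordered_map` are mine, not from the answer):

```cpp
#include <unordered_map>
using namespace std;

// Map each perfect power p <= limit to the number of ways it can be written
// as base^exp with exp >= 2; e.g. 512 -> 2, from 2^9 and 8^3.
unordered_map<long long, int> buildPowerTable(long long limit) {
    unordered_map<long long, int> table;
    for (long long base = 2; base * base <= limit; base++) {
        long long p = base * base;
        while (true) {
            table[p]++;                   // one more base/exponent pair for p
            if (p > limit / base) break;  // next multiplication would exceed limit
            p *= base;
        }
    }
    return table;
}
```

For limit = 10^12 this loops over the million bases once; each base's chain has at most ~40 entries, matching the size estimate above.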
First off, your algorithm is not O(sqrt(N)), as you are ignoring the number of times you divide by each of the checked numbers. If the number being checked is k, the number of divisions before the result is obtained (by the method described above) is O(log_k(N)). Hence the complexity becomes N/2 + (log(2) + log(3) + ... + log(sqrt(N))) = O(log(N) * sqrt(N)).
Now that we have got that out of the way, the algorithm can be improved. Observe that repeated division by a checked number k yields 1 exactly when k^t <= N < 2 * k^t, where t = floor(log_k(N)).
Note the strict < on the right-hand side.
Now, to figure out t, you can use the Newton-Raphson method or a Taylor series to compute logarithms very quickly; a complexity measure for this is mentioned here. Let us call that cost C(N). The total complexity is then C(2) + C(3) + ... + C(sqrt(N)). If you can ignore the cost of computing the logs, this comes down to O(sqrt(N)).
For example, in the above case for N=8:
2^3 <= 8 < 2 * 2^3 : 1
floor(log_3(8)) = 1 and 8 does not satisfy 3^1 <= 8 < 2 * 3^1: 0
floor(log_4(8)) = 1 and 8 does not satisfy 4^1 <= 8 < 2 * 4^1 : 0
4 extra come from the numbers 5, 6, 7 and 8, since t = 1 for each of them and k <= 8 < 2*k holds.
Note that we did not need to check 3 and 4, but I have done so to illustrate the point. You can verify that every number in (N/2 .. N] satisfies the above inequality and hence needs to be counted.
With this approach we eliminate the repeated divisions and get the complexity down to O(sqrt(N)), assuming the cost of computing logarithms is negligible.
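The counting described above (the (N/2, N] range plus the power check for each d up to sqrt(N)) can be sketched as follows; the function name is mine, and the largest power is found by multiplication rather than a logarithm, which keeps everything in integers:

```cpp
#include <cmath>
using namespace std;

// Count d >= 2 such that repeatedly dividing n by d eventually gives 1,
// using: d works iff d^t <= n < 2*d^t, where d^t is the largest power <= n.
long long countRepeatDivOnes(long long n) {
    if (n < 2) return 0;
    long long count = n - n / 2;              // every d in (n/2, n]: one division gives 1
    long long hi = (long long)sqrtl((long double)n);
    for (long long d = 2; d <= hi; d++) {
        long long p = d;
        while (p <= n / d) p *= d;            // p = d^t, largest power of d not above n
        if (n < 2 * p) count++;               // the inequality from the text
    }
    return count;
}
```

For N = 8 this counts d = 2, 5, 6, 7, 8 and returns 5, matching the worked example; the loop does O(sqrt(N)) iterations with O(log N) multiplications each.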
Since the number can be up to 10^12, another option: for each query, try every chain length len from 1 to 40 and check whether the number N fits some base i between 2 and 10^6, i.e., whether there is an i with i^len <= N < 2 * i^len (the base can be found by binary search).
So the time complexity is O(40 * Query).

finding all divisors of all the numbers from 1 to 10^6 efficiently

I need to find all divisors of all numbers between 1 and n (inclusive), where n = 10^6, and I want to store them in a vector.
vector< vector<int> > divisors(1000001);

void abc()
{
    for (long int n = 1; n <= 1000000; n++)
    {
        long int limit = sqrt(n);
        for (long int i = 1; i <= limit; i++)  // note: <=, or the divisor sqrt(n) is missed
        {
            if (n % i == 0)
            {
                divisors[n].push_back(i);
                if (i != n / i)                // avoid pushing a square root twice
                    divisors[n].push_back(n / i);
            }
        }
    }
}
This takes too much time as well. Can I optimize it in any way?
const int N = 1000000;
vector<vector<int>> divisors(N+1);
for (int i = 1; i <= N; i++) {
    for (int j = i; j <= N; j += i) {
        divisors[j].push_back(i);
    }
}
This runs in O(N*log(N)).
The intuition is that the upper N/2 numbers are visited only once; of the remaining numbers, the upper half are visited once more, and so on.
Put the other way around: if you increase N from, say, 10^6 to 10^7, you get the operations of the 10^6 case about ten times over (that part is linear), and what is extra are the numbers from 10^6 to 10^7, which are each visited at most 10 times.
The number of operations is
sum(N/n for n from 1 to N)
which becomes N * sum(1/n for n from 1 to N), and that is N*log(N), as can be shown by integrating 1/x from 1 to N.
We can also see that the algorithm is output-optimal: it performs exactly one operation per divisor it produces, so the size of the result (the total number of divisors) matches the complexity of the algorithm.
I think this might not be the best solution, but it is much better than the one presented, so here we go.
Go over all the numbers i from 1 to n, and for each number:
1. Add i to the divisor list of i itself.
2. Set multiplier to 2.
3. Add i to the divisor list of i * multiplier.
4. Increase multiplier.
5. Repeat steps 3 & 4 until i * multiplier is greater than n.
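The steps above can be sketched as follows (a minimal illustration of mine, not the answerer's code); note that the lists come out sorted ascending, because smaller divisors are appended in earlier iterations of i:

```cpp
#include <vector>
using namespace std;

// For every i in [1, n], append i to the divisor list of each multiple of i.
// This is the multiples-sieve described by the steps above.
vector<vector<int>> allDivisors(int n) {
    vector<vector<int>> divisors(n + 1);
    for (int i = 1; i <= n; i++) {
        divisors[i].push_back(i);            // steps 1: i divides itself
        for (long long k = 2LL * i; k <= n; k += i)
            divisors[k].push_back(i);        // steps 2-5: i divides every multiple of i
    }
    return divisors;
}
```

Each divisor of each number is produced exactly once, so the work is proportional to the output size, i.e., O(n*log(n)) in total.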
[Edit3] complete reedit
Your current approach is O(n^1.5), not O(n^2).
Originally I suggested to see Why are my nested for loops taking so long to compute?
But as Oliver Charlesworth suggested to me to read About Vectors growth That should not be much of an issue here (also the measurements confirmed it).
So there is no need to preallocate memory for the lists (it would just waste memory, and due to cache effects could even lower the overall performance, at least on my setup).
So how to optimize?
either lower the constant factor, so the runtime of your iteration improves (even with worse complexity),
or lower the complexity enough that the extra overhead still leaves a net speedup.
I would start with a SoF-like pass (SoF = Sieve of Eratosthenes).
But instead of marking numbers as divisible, I would add the currently iterated step to the number's divisor list. This is O(n*log(n)) with very low overhead (no divisions, and fully parallelisable) if coded right.
run the sieve step for all i = 1,2,3,4,5,...,n-1
for each number x you hit, do not update a SoF table (you do not need it); instead add the iterated step i to the divisor list of x. Something like:
C++ source:
const int n = 1000000;
List<int> divs[n];

void divisors()
{
    int i, x;
    for (i = 1; i < n; i++)         // sieve step i
        for (x = i; x < n; x += i)  // every multiple of i ...
            divs[x].add(i);         // ... has i as a divisor
}
This took 1.739 s and found 13969984 divisors in total, at most 240 divisors per number (including 1 and x). As you can see it uses no divisions, and the divisors come out sorted ascending.
List<int> is a dynamic list-of-integers template (something like your vector<>).
You can adapt this to your kind of iteration: check steps only up to nn = sqrt(n) and add two divisors per hit. That does fewer outer iterations, but each hit needs a division and a duplicate check (the list scan), so you have to measure whether it actually speeds things up; on my setup it is slower this way (~2.383 s).
const int n = 1000000;
List<int> divs[n];
int i, j, x, y, nn = sqrt(n);

for (i = 1; i <= nn; i++)
    for (x = i; x < n; x += i)
    {
        // add i unless already present
        for (y = divs[x].num - 1; y >= 0; y--)
            if (i == divs[x][y]) break;
        if (y < 0) divs[x].add(i);
        // add the complementary divisor x/i unless already present
        j = x / i;
        for (y = divs[x].num - 1; y >= 0; y--)
            if (j == divs[x][y]) break;
        if (y < 0) divs[x].add(j);
    }
The next thing is to use direct memory access (not sure you can do that with vector<>; my list is capable of it). Do not confuse this with hardware DMA; it is just avoidance of array range checking. This lowers the constant overhead of the duplicate check, and the resulting time is [1.793 s], still a little slower than the raw sieve version. So for bigger n this would be the way to go.
[Notes]
If you want to do prime decomposition, then iterate i only through primes (in that case you do need the SoF table) ...
If you have problems with the SoF or primes, look at Prime numbers by Eratosthenes quicker sequential than concurrently? for some additional ideas.
Another optimization is not to use vector or list, but one large flat array of divisors; see http://oeis.org/A027750
First step: Sieve of number of divisors
Second step: Sieve of divisors with the total number of divisors
Note: A maximum of 20-fold time increase for 10-fold range. --> O(N*log(N))
Dev-C++ 5.11 , in C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int SieveNbOfDiv(int NumberOfDivisors[], int IndexCount[], int Limit) {
    for (int i = 1; i*i <= Limit; i++) {
        NumberOfDivisors[i*i] += 1;
        for (int j = i*(i+1); j <= Limit; j += i)
            NumberOfDivisors[j] += 2;
    }
    int Count = 0;
    for (int i = 1; i <= Limit; i++) {
        Count += NumberOfDivisors[i];
        NumberOfDivisors[i] = Count;
        IndexCount[i] = Count;
    }
    return Count;
}

void SieveDivisors(int IndexCount[], int NumberOfDivisors[], int Divisors[], int Limit) {
    for (int i = 1; i <= Limit; i++) {
        Divisors[IndexCount[i-1]++] = 1;
        Divisors[IndexCount[i]-1] = i;
    }
    for (int i = 2; i*i <= Limit; i++) {
        Divisors[IndexCount[i*i-1]++] = i;
        for (int j = i*(i+1); j <= Limit; j += i) {
            Divisors[IndexCount[j-1]++] = i;
            Divisors[NumberOfDivisors[j-1] + NumberOfDivisors[j] - IndexCount[j-1]] = j/i;
        }
    }
}
int main(int argc, char *argv[]) {
    int N = 1000000;
    if (argc > 1) N = atoi(argv[1]);
    int ToPrint = 0;
    if (argc > 2) ToPrint = atoi(argv[2]);
    clock_t Start = clock();
    printf("Using sieve of divisors from 1 to %d\n\n", N);
    printf("Evaluating sieve of number of divisors ...\n");
    int *NumberOfDivisors = (int*) calloc(N+1, sizeof(int));
    int *IndexCount = (int*) calloc(N+1, sizeof(int));
    int size = SieveNbOfDiv(NumberOfDivisors, IndexCount, N);
    printf("Total number of divisors = %d\n", size);
    printf("%0.3f second(s)\n\n", (clock() - Start)/1000.0);
    printf("Evaluating sieve of divisors ...\n");
    int *Divisors = (int*) calloc(size+1, sizeof(int));
    SieveDivisors(IndexCount, NumberOfDivisors, Divisors, N);
    printf("%0.3f second(s)\n", (clock() - Start)/1000.0);
    if (ToPrint == 1)
        for (int i = 1; i <= N; i++) {
            printf("%d(%d) = ", i, NumberOfDivisors[i] - NumberOfDivisors[i-1]);
            for (int j = NumberOfDivisors[i-1]; j < NumberOfDivisors[i]; j++)
                printf("%d ", Divisors[j]);
            printf("\n");
        }
    return 0;
}
With some results:
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 100000
Using sieve of divisors from 1 to 100000
Evaluating sieve of number of divisors ...
Total number of divisors = 1166750
0.000 second(s)
Evaluating sieve of divisors ...
0.020 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 1000000
Using sieve of divisors from 1 to 1000000
Evaluating sieve of number of divisors ...
Total number of divisors = 13970034
0.060 second(s)
Evaluating sieve of divisors ...
0.610 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 10000000
Using sieve of divisors from 1 to 10000000
Evaluating sieve of number of divisors ...
Total number of divisors = 162725364
0.995 second(s)
Evaluating sieve of divisors ...
11.900 second(s)
c:\Users\Ab\Documents\gcc\sievedivisors>

Approach for a better solution - Sum of medians

Here is the question Spoj-WEIRDFN
Problem:
Let us define :
F[1] = 1
F[i] = (a*M[i] + b*i + c)%1000000007 for i > 1
where M[i] is the median of the array {F[1],F[2],..,F[i-1]}
Given a,b,c and n, calculate the sum F[1] + F[2] + .. + F[n].
Constraints:
0 <= a,b,c < 1000000007
1 <= n <= 200000
I came up with a solution which is not very efficient.
My solution:
#include <bits/stdc++.h>
using namespace std;
#define ll long long int
#define mod 1000000007

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        ll a, b, c, sum = 0;
        int n;
        scanf("%lld%lld%lld%d", &a, &b, &c, &n);
        ll f[n+1];
        f[0] = 0;
        f[1] = 1;
        for (int i = 2; i <= n; i++) {
            ll temp;
            sort(&f[1], &f[i]);     // sort f[1..i-1] to find the median
            temp = f[i/2];          // median of the first i-1 values
            f[i] = ((a*temp % mod) + (b*i % mod) + (c % mod)) % mod;
            sum += f[i];
        }
        printf("%lld\n", sum + f[1]);
    }
    return 0;
}
Can anybody give me a hint for a better algorithm or data structure for this task?
For each test case, you can maintain a balanced binary search tree (or an equivalent order-statistics structure); then you can find the median of the current elements in O(log n) time, and you only need O(log n) time to add a new element to the tree.
Thus we have an O(T * n * log n) algorithm, where T is the number of test cases and n the number of elements, which should be enough to pass.
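As an illustration of the idea (my sketch, not the answerer's code), the running median can also be maintained with two multisets balanced around the median, avoiding a hand-rolled BST. The lower multiset keeps the smaller half (including the median the problem uses, i.e. the lower middle element when the count is even):

```cpp
#include <iterator>
#include <set>
using namespace std;

// Maintain the median of a growing collection with two multisets:
// `low` holds the smaller half (its maximum is the median), `high` the rest.
// Invariant: high.size() <= low.size() <= high.size() + 1.
struct MedianKeeper {
    multiset<long long> low, high;
    void insert(long long x) {
        if (low.empty() || x <= *low.rbegin()) low.insert(x);
        else high.insert(x);
        if (low.size() > high.size() + 1) {          // low too big: shift max up
            high.insert(*low.rbegin());
            low.erase(prev(low.end()));
        } else if (high.size() > low.size()) {       // high too big: shift min down
            low.insert(*high.begin());
            high.erase(high.begin());
        }
    }
    long long median() const { return *low.rbegin(); }
};
```

Each insert costs O(log n), so computing F[2..n] with this structure gives the O(n log n) per test case mentioned above.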
