How to find all taxicab numbers less than N? - algorithm

A taxicab number is an integer that can be expressed as the sum of two cubes of integers in two different ways: a^3+b^3 = c^3+d^3. Design an algorithm to find all taxicab numbers with a, b, c, and d less than N.
Please give both the space and time complexity in terms of N.
I could do it in o(N^2.logN) time with O(N^2) space.
Best algorithm I've found so far:
Form all pairs: N^2
Sort the sum: N^2 logN
Find duplicates less than N
But this takes N^2 space. Can we do better?

But this takes N^2 space. Can we do better?
There exists an O(N) space solution based on a priority queue. Time complexity is O(N^2 logN). To sketch out the idea of the algorithm, here is the matrix M such that M[i][j] = i^3 + j^3 (of course, the matrix is never created in memory):
0 1 8 27 64 125
1 2 9 28 65 126
8 9 16 35 72 133
27 28 35 54 91 152
64 65 72 91 128 189
125 126 133 152 189 250
Observe that every line and every row is sorted in ascending order. Let PQ be the priority queue. First we put the biggest element in the priority queue. Then perform the following, as long as the PQ is not empty:
Pop the biggest element from PQ
add adjacent element above if the PQ doesn't have any element from that row
add adjacent element on the left if the PQ doesn't have any element from that column, and if it is not under the diagonal of the matrix (to avoid redundant elements)
Note that
You don't need to create the matrix in memory to implement the algorithm
The elements will be popped from the PQ in descending order, from the biggest element of the matrix to its smallest one (avoiding elements from the redundant half part of the matrix).
Everytime the PQ issues the same value twice then we have found a taxicab number.
As an illustration, here is an implementation in C++. The time complexity is O(N^2 logN) and space complexity O(N).
#include <iostream>
#include <cassert>
#include <queue>
using namespace std;
typedef unsigned int value_type;
struct Square
{
value_type i;
value_type j;
value_type sum_of_cubes;
Square(value_type i, value_type j) : i(i), j(j), sum_of_cubes(i*i*i+j*j*j) {}
friend class SquareCompare;
bool taxicab(const Square& sq) const
{
return sum_of_cubes == sq.sum_of_cubes && i != sq.i && i != sq.j;
}
friend ostream& operator<<(ostream& os, const Square& sq);
};
class SquareCompare
{
public:
bool operator()(const Square& a, const Square& b)
{
return a.sum_of_cubes < b.sum_of_cubes;
}
};
ostream& operator<<(ostream& os, const Square& sq)
{
return os << sq.i << "^3 + " << sq.j << "^3 = " << sq.sum_of_cubes;
}
int main()
{
const value_type N=2001;
value_type count = 0;
bool in_i [N];
bool in_j [N];
for (value_type i=0; i<N; i++) {
in_i[i] = false;
in_j[i] = false;
}
priority_queue<Square, vector<Square>, SquareCompare> p_queue;
p_queue.push(Square(N-1, N-1));
in_i[N-1] = true;
in_j[N-1] = true;
while(!p_queue.empty()) {
Square sq = p_queue.top();
p_queue.pop();
in_i[sq.i] = false;
in_j[sq.j] = false;
// cout << "pop " << sq.i << " " << sq.j << endl;
if (sq.i > 0 && !in_i[sq.i - 1] && sq.i-1 >= sq.j) {
p_queue.push(Square(sq.i-1, sq.j));
in_i[sq.i-1] = true;
in_j[sq.j] = true;
// cout << "push " << sq.i-1 << " " << sq.j << endl;
}
if (sq.j > 0 && !in_j[sq.j-1] && sq.i >= sq.j - 1) {
p_queue.push(Square(sq.i, sq.j-1));
in_i[sq.i] = true;
in_j[sq.j - 1] = true;
// cout << "push " << sq.i << " " << sq.j-1 << endl;
}
if (sq.taxicab(p_queue.top())) {
/* taxicab number */
cout << sq << " " << p_queue.top() << endl;
count++;
}
}
cout << endl;
cout << "there are " << count << " taxicab numbers with a, b, c, d < " << N << endl;
return 0;
}

The answers given by Novneet Nov and user3017842 are both correct ideas for finding the taxicab numbers with storage O(N) using minHeap.
Just a little bit more explanation why the minHeap of size N works.
First, if you had all the sums (O(N^2)) and could sort them (O(N^2lgN)) you would just pick the duplicates as you traverse the sorted array. Well, in our case using a minHeap we can traverse in-order all the sums: we just need to ensure that the minHeap always contains the minimum unprocessed sum.
Now, we have a huge number of sums (O(N^2)). But, notice that this number can be split into N groups each of which has an easily defined minimum!
(fix a, change b from 0 to N-1 => here are your N groups. The sum in one group with a smaller b is smaller than one with a bigger b in the same group - because a is the same).
The minimum of union of these groups is in the union of mins of these
groups. Therefore, if you keep all minimums of these groups in the
minHeap you are guaranteed to have the total minimum in the minHeap.
Now, when you extract Min from the heap, you just add next smallest element from the group of this extracted min (so if you extracted (a, b) you add (a, b+1)) and you are guaranteed that your minHeap still contains the next unprocessed min of all the sums.

I found the solution/code here : Time complexity O(N^2 logN), space complexity O(N)
The solution is implemented by help of priority queues.
Reverse thinking can be easily done by looking at the code. It can be done in an array of size N because the min sums are deleted from the array after comparing to the next minimum and then the array is made to size N by adding a new sum - (i^3 + (j+1)^3).
A intuitive proof is here :
Initially, we have added (1,1),(2,2),(3,3),...,(N,N) in the min-priority queue.
Suppose a^+b^3=c^3+d^3, and (a,b) is the minimum that will be taken out of the priority queue next. To be able to detect this taxicab number, (c,d) must also be in the priority queue which would be taken out after (a,b).
Note: We would be adding (a,b+1) after extracting (a,b) so there is no way that extraction of (a,b) would result in addition of (c,d) to the priority queue, so it must already exist in the priority queue.
Now lets assume that (c,d) is not in the priority queue, because we haven't gotten to it yet. Instead, there is some (c,d−k) in the priority queue where k>0.
Since (a,b) is being taken out,
a^3+b^3≤c^3+(d−k)^3
However, a^3+b^3=c^3+d^3
Therefore,
c^3+d^3≤c^3+(d−k)^3
d≤d−k
k≤0
Since k>0, this is impossible. Thus our assumption can never come to pass.
Thus for every (a,b) which is being removed from the min-PQ, (c,d) is already in the min-PQ (or was just removed) if a^3+b^3=c^3+d^3

The time complexity of the algorithm can't be less than O(N2) in any case, since you might print up to O(N2) taxicab numbers.
To reduce space usage you could, in theory, use the suggestion mentioned here: little link. Basically, the idea is that first you try all possible pairs a, b and find the solution to this:
a = 1 − (p − 3 * q)(p2 + 3 * q2)
b = −1 + (p + 3 * q)(p2 + 3q2)
Then you can find the appropriate c, d pair using:
c = (p + 3 * q) - (p2 + 3 * q2)
d = -(p - 3 * q) + (p2 + 3 * q2)
and check whether they are both less than N. The issue here is that solving that system of equations might get a bit messy (by 'a bit' I mean very tedious).
The O(N2) space solution is much simpler, and it'd probably be efficient enough since anything of quadratic time complexity that can run in reasonable time limits will probably be fine with quadratic space usage.
I hope that helped!

version1 uses List and sorting
O(n^2*logn) time and O(n^2) space
public static void Taxicab1(int n)
{
// O(n^2) time and O(n^2) space
var list = new List<int>();
for (int i = 1; i <= n; i++)
{
for (int j = i; j <= n; j++)
{
list.Add(i * i * i + j * j * j);
}
}
// O(n^2*log(n^2)) time
list.Sort();
// O(n^2) time
int prev = -1;
foreach (var next in list)
{
if (prev == next)
{
Console.WriteLine(prev);
}
prev = next;
}
}
version2 uses HashSet
O(n^2) time and O(n^2) space
public static void Taxicab2(int n)
{
// O(n^2) time and O(n^2) space
var set = new HashSet<int>();
for (int i = 1; i <= n; i++)
{
for (int j = i; j <= n; j++)
{
int x = i * i * i + j * j * j;
if (!set.Add(x))
{
Console.WriteLine(x);
}
}
}
}
version3 uses min oriented Priority Queue
O(n^2*logn) time and O(n) space
public static void Taxicab3(int n)
{
// O(n) time and O(n) space
var pq = new MinPQ<SumOfCubes>();
for (int i = 1; i <= n; i++)
{
pq.Push(new SumOfCubes(i, i));
}
// O(n^2*logn) time
var sentinel = new SumOfCubes(0, 0);
while (pq.Count > 0)
{
var current = pq.Pop();
if (current.Result == sentinel.Result)
Console.WriteLine($"{sentinel.A}^3+{sentinel.B}^3 = {current.A}^3+{current.B}^3 = {current.Result}");
if (current.B <= n)
pq.Push(new SumOfCubes(current.A, current.B + 1));
sentinel = current;
}
}
where SummOfCubes
public class SumOfCubes : IComparable<SumOfCubes>
{
public int A { get; private set; }
public int B { get; private set; }
public int Result { get; private set; }
public SumOfCubes(int a, int b)
{
A = a;
B = b;
Result = a * a * a + b * b * b;
}
public int CompareTo(SumOfCubes other)
{
return Result.CompareTo(other.Result);
}
}
github

create an array: 1^3, 2^3, 3^3, 4^3, ....... k^3. such that k^3 < N and (k+1)^3 > N. the array size would be ~ (N)^(1/3). the array is sorted order.
use 2sum technique (link) in lineal time proportional to the array size. if we find 2 pairs of numbers, that is a hit.
looping through step 2 by decreasing N by 1 each time.
This will use O(N^(1/3)) extra space and ~ O(N^(4/3)) time.

A easy way of understanding Time complexity O(N^2 logN), space complexity O(N) is to think it as a merge of N sorted arrays plus a bookkeeping of the previously merged element.

It seems like a simple brute-force algorithm with proper bounds solves it in time proportional to n^1.33 and space proportional to n. Or could anyone point me to the place where I'm mistaken?
Consider 4 nested loops, each running from 1 to cubic root of n. Using these loops we can go over all possible combinations of 4 values and find the pairs forming taxicab numbers. It means each loop takes time proportional to cubic root of n, or n^(1/3). Multiply this value 4 times and get:
(n^(1/3)^4 = n^(4/3) = n^1.33
I wrote a solution in JavaScript and benchmarked it, and it seems to be working. One caveat is that the result is only partially sorted.
Here is my JavaScript code (it's not optimal yet, could be optimized even more):
function taxicab(n) {
let a = 1, b = 1, c = 1, d = 1,
cubeA = a**3 + b**3,
cubeB = c**3 + d**3,
results = [];
while (cubeA < n) { // loop over a
while (cubeA < n) { // loop over b
// avoid running nested loops if this number is already in results
if (results.indexOf(cubeA) === -1) {
while (cubeB <= cubeA) { // loop over c
while (cubeB <= cubeA) { // loop over d
if (cubeB === cubeA && a!=c && a!=d) { // found a taxicab number!
results.push(cubeA);
}
d++;
cubeB = c**3 + d**3;
} // end loop over d
c++;
d = c;
cubeB = c**3 + d**3;
} // end loop over c
}
b++;
cubeA = a**3 + b**3;
c = d = 1;
cubeB = c**3 + d**3;
} // end loop over d
a++;
b = a;
cubeA = a**3 + b**3;
} // end loop over a
return results;
}
Running taxicab(1E8) takes around 30 seconds in a browser console and yields 485 numbers as a result. Ten times smaller value taxicab(1E7) (10 millions) takes almost 1.4 seconds and yields 150 numbers. 10^1.33 * 1.4 = 29.9, i.e. multiplying n by 10 leads to the running time increased by 10^1.33 times. The result array is unsorted, but after quickly sorting it we get correct result, as it seems:
[1729, 4104, 13832, 20683, 32832, 39312, 40033, 46683, 64232, 65728,
110656, 110808, 134379, 149389, 165464, 171288, 195841, 216027, 216125,
262656, 314496, 320264, 327763, 373464, 402597, 439101, 443889, 513000,
513856, 515375, 525824, 558441, 593047, 684019, 704977, 805688, 842751,
885248, 886464, 920673, 955016, 984067, 994688, 1009736, 1016496, 1061424,
1073375, 1075032, 1080891, 1092728, 1195112, 1260441, 1323712, 1331064,
1370304, 1407672, 1533357, 1566728, 1609272, 1728216, 1729000, 1734264,
1774656, 1845649, 2048391, 2101248, 2301299, 2418271, 2515968, 2562112,
2585375, 2622104, 2691451, 2864288, 2987712, 2991816, 3220776, 3242197,
3375001, 3375008, 3511872, 3512808, 3551112, 3587409, 3628233, 3798613,
3813992, 4033503, 4104000, 4110848, 4123000, 4174281, 4206592, 4342914,
4467528, 4505949, 4511808, 4607064, 4624776, 4673088, …]
Here is a code for benchmarking:
// run taxicab(n) for k trials and return the average running time
function benchmark(n, k) {
let t = 0;
k = k || 1; // how many times to repeat the trial to get an averaged result
for(let i = 0; i < k; i++) {
let t1 = new Date();
taxicab(n);
let t2 = new Date();
t += t2 - t1;
}
return Math.round(t/k);
}
Finally, I tested it:
let T = benchmark(1E7, 3); // 1376 - running time for n = 10 million
let T2 = benchmark(2E7, 3);// 4821 - running time for n = 20 million
let powerLaw = Math.log2(T2/T); // 1.3206693816701993
So it means time is proportional to n^1.32 in this test. Repeating this many times with different values always yields around the same result: from 1.3 to 1.4.

First of all, we will construct the taxicab numbers instead of searching for them. The range we will use to construct a taxicab number i.e Ta(2) will go up to n^1/3 not n. Because if you cube a number bigger than n^1/3 it will be bigger than n and also we can't cube negative numbers to prevent that case by definition. We will use a HashSet to remember the sums of two cubed numbers in the algorithm. This will help us to lookup previous cubed sums in O(1) time while we are iterating over every possible pair of numbers in the range I mentioned earlier.
Time complexity: O(n^2/3)
Space complexity: O(n^1/3)
def taxicab_numbers(n: int) -> list[int]:
taxicab_numbers = []
max_num = math.floor(n ** (1. / 3.))
seen_sums = set()
for i in range(1, max_num + 1):
for j in range(i, max_num + 1):
cube_sum = i ** 3 + j ** 3
if cube_sum in seen_sums:
taxicab_numbers.append(cube_sum)
else:
seen_sums.add(cube_sum)
return taxicab_numbers

import java.util.*;
public class A5Q24 {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
System.out.println("Enter number:");
int n = sc.nextInt();
// start checking every int less than the input
for (int a = 2;a <= n;a++) {
int count = 0;
// number of ways that number be expressed in sum of two number cubes
for (int i = 1; Math.pow(i, 3) < a; i++) {
// if the cube of number smaller is greater than the number than it goes out
for (int j = 1; j <= i; j++) {
if (Math.pow(i, 3) + Math.pow(j, 3) == a)
count++;
}
}
if (count == 2)
System.out.println(a);
}
sc.close();
}
}

I think we can also do better on time (O (N ^ 2)) with O(N ^ 2) memory, using a hashmap to check if a pair of cubes has already be seen. In Python:
def find_taxicab_numbers(n: int) -> List[Tuple[int, int, int, int, int]]:
"""
find all taxicab numbers smaller than n, i.e. integers that can be expressed as the sum of two cubes of positive
integers in two different ways so that a^3 + b^3 = c^3 + d^3.
Time: O(n ^ 2) (two loops, one dict lookup). Space: O(n ^ 2)) (all possible cubes)
:param n: upper bound for a, b, c, d
:return: list of tuples of int: a, b, c, d, and taxicab numbers
"""
cubes = [i ** 3 for i in range(n)]
seen_sum_cubes = dict() # mapping sum cubes -> a, b
taxicabs = list() # list of a, b, c, d, taxicab
# check all possible sums of cubes
for i in range(n):
for j in range(i):
sum_cubes = cubes[i] + cubes[j]
if sum_cubes in seen_sum_cubes:
prev_i, prev_j = seen_sum_cubes[sum_cubes]
taxicabs.append((i, j, prev_i, prev_j, sum_cubes))
else:
seen_sum_cubes[sum_cubes] = (i, j)
return taxicabs

Related

Finding kth element in the nth order of Farey Sequence

Farey sequence of order n is the sequence of completely reduced fractions, between 0 and 1 which when in lowest terms have denominators less than or equal to n, arranged in order of increasing size. Detailed explanation here.
Problem
The problem is, given n and k, where n = order of seq and k = element index, can we find the particular element from the sequence. For examples answer for (n=5, k =6) is 1/2.
Lead
There are many less than optimal solution available, but am looking for a near-optimal one. One such algorithm is discussed here, for which I am unable to understand the logic hence unable to apply the examples.
Question
Can some please explain the solution with more detail, preferably with an example.
Thank you.
I've read the method provided in your link, and the accepted C++ solution to it. Let me post them, for reference:
Editorial Explanation
Several less-than-optimal solutions exist. Using a priority queue, one
can iterate through the fractions (generating them one by one) in O(K
log N) time. Using a fancier math relation, this can be reduced to
O(K). However, neither of these solution obtains many points, because
the number of fractions (and thus K) is quadratic in N.
The “good” solution is based on meta-binary search. To construct this
solution, we need the following subroutine: given a fraction A/B
(which is not necessarily irreducible), find how many fractions from
the Farey sequence are less than this fraction. Suppose we had this
subroutine; then the algorithm works as follows:
Determine a number X such that the answer is between X/N and (X+1)/N; such a number can be determined by binary searching the range
1...N, thus calling the subroutine O(log N) times.
Make a list of all fractions A/B in the range X/N...(X+1)/N. For any given B, there is at most one A in this range, and it can be
determined trivially in O(1).
Determine the appropriate order statistic in this list (doing this in O(N log N) by sorting is good enough).
It remains to show how we can construct the desired subroutine. We
will show how it can be implemented in O(N log N), thus giving a O(N
log^2 N) algorithm overall. Let us denote by C[j] the number of
irreducible fractions i/j which are less than X/N. The algorithm is
based on the following observation: C[j] = floor(X*B/N) – Sum(C[D],
where D divides j). A direct implementation, which tests whether any D
is a divisor, yields a quadratic algorithm. A better approach,
inspired by Eratosthene’s sieve, is the following: at step j, we know
C[j], and we subtract it from all multiples of j. The running time of
the subroutine becomes O(N log N).
Relevant Code
#include <cassert>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
const int kMaxN = 2e5;
typedef int int32;
typedef long long int64_x;
// #define int __int128_t
// #define int64 __int128_t
typedef long long int64;
int64 count_less(int a, int n) {
vector<int> counter(n + 1, 0);
for (int i = 2; i <= n; i += 1) {
counter[i] = min(1LL * (i - 1), 1LL * i * a / n);
}
int64 result = 0;
for (int i = 2; i <= n; i += 1) {
for (int j = 2 * i; j <= n; j += i) {
counter[j] -= counter[i];
}
result += counter[i];
}
return result;
}
int32 main() {
// ifstream cin("farey.in");
// ofstream cout("farey.out");
int64_x n, k; cin >> n >> k;
assert(1 <= n);
assert(n <= kMaxN);
assert(1 <= k);
assert(k <= count_less(n, n));
int up = 0;
for (int p = 29; p >= 0; p -= 1) {
if ((1 << p) + up > n)
continue;
if (count_less((1 << p) + up, n) < k) {
up += (1 << p);
}
}
k -= count_less(up, n);
vector<pair<int, int>> elements;
for (int i = 1; i <= n; i += 1) {
int b = i;
// find a such that up/n < a / b and a / b <= (up+1) / n
int a = 1LL * (up + 1) * b / n;
if (1LL * up * b < 1LL * a * n) {
} else {
continue;
}
if (1LL * a * n <= 1LL * (up + 1) * b) {
} else {
continue;
}
if (__gcd(a, b) != 1) {
continue;
}
elements.push_back({a, b});
}
sort(elements.begin(), elements.end(),
[](const pair<int, int>& lhs, const pair<int, int>& rhs) -> bool {
return 1LL * lhs.first * rhs.second < 1LL * rhs.first * lhs.second;
});
cout << (int64_x)elements[k - 1].first << ' ' << (int64_x)elements[k - 1].second << '\n';
return 0;
}
Basic Methodology
The above editorial explanation results in the following simplified version. Let me start with an example.
Let's say, we want to find 7th element of Farey Sequence with N = 5.
We start with writing a subroutine, as said in the explanation, that gives us the "k" value (how many Farey Sequence reduced fractions there exist before a given fraction - the given number may or may not be reduced)
So, take your F5 sequence:
k = 0, 0/1
k = 1, 1/5
k = 2, 1/4
k = 3, 1/3
k = 4, 2/5
k = 5, 1/2
k = 6, 3/5
k = 7, 2/3
k = 8, 3/4
k = 9, 4/5
k = 10, 1/1
If we can find a function that finds the count of the previous reduced fractions in Farey Sequence, we can do the following:
int64 k_count_2 = count_less(2, 5); // result = 4
int64 k_count_3 = count_less(3, 5); // result = 6
int64 k_count_4 = count_less(4, 5); // result = 9
This function is written in the accepted solution. It uses the exact methodology explained in the last paragraph of the editorial.
As you can see, the count_less() function generates the same k values as in our hand written list.
We know the values of the reduced fractions for k = 4, 6, 9 using that function. What about k = 7? As explained in the editorial, we will list all the reduced fractions in range X/N and (X+1)/N, here X = 3 and N = 5.
Using the function in the accepted solution (its near bottom), we list and sort the reduced fractions.
After that we will rearrange our k values, as in to fit in our new array as such:
k = -, 0/1
k = -, 1/5
k = -, 1/4
k = -, 1/3
k = -, 2/5
k = -, 1/2
k = -, 3/5 <-|
k = 0, 2/3 | We list and sort the possible reduced fractions
k = 1, 3/4 | in between these numbers
k = -, 4/5 <-|
k = -, 1/1
(That's why there is this piece of code: k -= count_less(up, n);, it basically remaps the k values)
(And we also subtract one more during indexing, i.e.: cout << (int64_x)elements[k - 1].first << ' ' << (int64_x)elements[k - 1].second << '\n';. This is just to basically call the right position in the generated array.)
So, for our new re-mapped k values, for N = 5 and k = 7 (original k), our result is 2/3.
(We select the value k = 0, in our new map)
If you compile and run the accepted solution, it will give you this:
Input: 5 7 (Enter)
Output: 2 3
I believe this is the basic point of the editorial and accepted solution.

How to print values in memoization method-Dynamic pragraming

I know for a problem that can be solved using DP, can be solved by either tabulation(bottom-up) approach or memoization(top-down) approach. personally i find memoization is easy and even efficient approach(analysis required just to get recursive formula,once recursive formula is obtained, a brute-force recursive method can easily be converted to store sub-problem's result and reuse it.) The only problem that i am facing in this approach is, i am not able to construct actual result from the table which i filled on demand.
For example, in Matrix Product Parenthesization problem ( to decide in which order to perform the multiplications on Matrices so that cost of multiplication is minimum) i am able to calculate minimum cost not not able to generate order in algo.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
here i am able to get min-cost as 27000 but unable to get order which is A(BC).
I used this. Suppose F[i, j] represents least number of multiplication needed to multiply Ai.....Aj and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
0 if i=j
F[i,j]=
min(F[i,k] + F[k+1,j] +P_i-1 * P_k * P_j where k∈[i,j)
Below is the implementation that i have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 4
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know using tabulation approach printing the solution is much easy but i want to do it in memoization technique.
Thanks.
Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There's two possibilities:
When you compute the table, you can store the best index found in another memoization array.
You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
if (i == j) {
putchar('A' + i - 1);
return;
}
putchar('(');
print_mult_chain(i, lookup_splits[i][j]);
print_mult_chain(lookup_splits[i][j] + 1, j);
putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
int get_lookup_splits(int p[], int i, int j) {
int best = INT_MAX;
int k_best;
for (int k = i; k < j; k++) {
int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < best) {
best = count;
k_best = k;
}
}
return k;
}
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].

Find the top k sums of two sorted arrays

You are given two sorted arrays, of sizes n and m respectively. Your task (should you choose to accept it), is to output the largest k sums of the form a[i]+b[j].
A O(k log k) solution can be found here. There are rumors of a O(k) or O(n) solution. Does one exist?
I found the responses at your link mostly vague and poorly structured. Here's a start with a O(k * log(min(m, n))) O(k * log(m + n)) O(k * log(k)) algorithm.
Suppose they are sorted decreasing. Imagine you computed the m*n matrix of the sums as follows:
for i from 0 to m
for j from 0 to n
sums[i][j] = a[i] + b[j]
In this matrix, values monotonically decrease down and to the right. With that in mind, here is an algorithm which performs a graph search through this matrix in order of decreasing sums.
q : priority queue (decreasing) := empty priority queue
add (0, 0) to q with priority a[0] + b[0]
while k > 0:
k--
x := pop q
output x
(i, j) : tuple of int,int := position of x
if i < m:
add (i + 1, j) to q with priority a[i + 1] + b[j]
if j < n:
add (i, j + 1) to q with priority a[i] + b[j + 1]
Analysis:
The loop is executed k times.
There is one pop operation per iteration.
There are up to two insert operations per iteration.
The maximum size of the priority queue is O(min(m, n)) O(m + n) O(k).
The priority queue can be implemented with a binary heap giving log(size) pop and insert.
Therefore this algorithm is O(k * log(min(m, n))) O(k * log(m + n)) O(k * log(k)).
Note that the general priority queue abstract data type needs to be modified to ignore duplicate entries. Alternately, you could maintain a separate set structure that first checks for membership in the set before adding to the queue, and removes from the set after popping from the queue. Neither of these ideas would worsen the time or space complexity.
I could write this up in Java if there's any interest.
Edit: fixed complexity. There is an algorithm which has the complexity I described, but it is slightly different from this one. You would have to take care to avoid adding certain nodes. My simple solution adds many nodes to the queue prematurely.
private static class FrontierElem implements Comparable<FrontierElem> {
int value;
int aIdx;
int bIdx;
public FrontierElem(int value, int aIdx, int bIdx) {
this.value = value;
this.aIdx = aIdx;
this.bIdx = bIdx;
}
#Override
public int compareTo(FrontierElem o) {
return o.value - value;
}
}
public static void findMaxSum( int [] a, int [] b, int k ) {
Integer [] frontierA = new Integer[ a.length ];
Integer [] frontierB = new Integer[ b.length ];
PriorityQueue<FrontierElem> q = new PriorityQueue<MaxSum.FrontierElem>();
frontierA[0] = frontierB[0]=0;
q.add( new FrontierElem( a[0]+b[0], 0, 0));
while( k > 0 ) {
FrontierElem f = q.poll();
System.out.println( f.value+" "+q.size() );
k--;
frontierA[ f.aIdx ] = frontierB[ f.bIdx ] = null;
int fRight = f.aIdx+1;
int fDown = f.bIdx+1;
if( fRight < a.length && frontierA[ fRight ] == null ) {
q.add( new FrontierElem( a[fRight]+b[f.bIdx], fRight, f.bIdx));
frontierA[ fRight ] = f.bIdx;
frontierB[ f.bIdx ] = fRight;
}
if( fDown < b.length && frontierB[ fDown ] == null ) {
q.add( new FrontierElem( a[f.aIdx]+b[fDown], f.aIdx, fDown));
frontierA[ f.aIdx ] = fDown;
frontierB[ fDown ] = f.aIdx;
}
}
}
The idea is similar to the other solution, but with the observation that as you add to your result set from the matrix, at every step the next element in our set can only come from where the current set is concave. I called these elements frontier elements and I keep track of their position in two arrays and their values in a priority queue. This helps keep the queue size down, but by how much I've yet to figure out. It seems to be about sqrt( k ) but I'm not entirely sure about that.
(Of course the frontierA/B arrays could be simple boolean arrays, but this way they fully define my result set, This isn't used anywhere in this example but might be useful otherwise.)
As the pre-condition is the Array are sorted hence lets consider the following
for N= 5;
A[]={ 1,2,3,4,5}
B[]={ 496,497,498,499,500}
Now since we know Summation of N-1 of A&B would be highest hence just insert this in to heap along with the indexes of A & B element ( why, indexes? we'll come to know in a short while )
H.insert(A[N-1]+B[N-1],N-1,N-1);
now
while(!H.empty()) { // the time heap is not empty
H.pop(); // this will give you the sum you are looking for
The indexes which we got at the time of pop, we shall use them for selecting the next sum element.
Consider the following :
if we have i & j as the indexes in A & B , then the next element would be max ( A[i]+B[j-1], A[i-1]+B[j], A[i+1]+B[j+1] ) ,
So, insert the same if that has not been inserted in the heap
hence
(i,j)= max ( A[i]+B[j-1], A[i-1]+B[j], A[i+1]+B[j+1] ) ;
if(Hash[i,j]){ // not inserted
H.insert (i,j);
}else{
get the next max from max ( A[i]+B[j-1], A[i-1]+B[j], A[i+1]+B[j+1] ) ; and insert.
}
K pop-ing them will give you max elements required.
Hope this helps
Many thanks to #rlibby and #xuhdev with such an original idea to solve this kind of problem. I had a similar coding exercise interview require to find N largest sums formed by K elements in K descending sorted arrays - means we must pick 1 element from each sorted arrays to build the largest sum.
Example: List findHighestSums(int[][] lists, int n) {}
[5,4,3,2,1]
[4,1]
[5,0,0]
[6,4,2]
[1]
and a value of 5 for n, your procedure should return a List of size 5:
[21,20,19,19,18]
Below is my code, please take a look carefully for those block comments :D
private class Pair implements Comparable<Pair>{
String state;
int sum;
public Pair(String state, int sum) {
this.state = state;
this.sum = sum;
}
#Override
public int compareTo(Pair o) {
// Max heap
return o.sum - this.sum;
}
}
List<Integer> findHighestSums(int[][] lists, int n) {
int numOfLists = lists.length;
int totalCharacterInState = 0;
/*
* To represent State of combination of largest sum as String
* The number of characters for each list should be Math.ceil(log(list[i].length))
* For example:
* If list1 length contains from 11 to 100 elements
* Then the State represents for list1 will require 2 characters
*/
int[] positionStartingCharacterOfListState = new int[numOfLists + 1];
positionStartingCharacterOfListState[0] = 0;
// the reason to set less or equal here is to get the position starting character of the last list
for(int i = 1; i <= numOfLists; i++) {
int previousListNumOfCharacters = 1;
if(lists[i-1].length > 10) {
previousListNumOfCharacters = (int)Math.ceil(Math.log10(lists[i-1].length));
}
positionStartingCharacterOfListState[i] = positionStartingCharacterOfListState[i-1] + previousListNumOfCharacters;
totalCharacterInState += previousListNumOfCharacters;
}
// Check the state <---> make sure that combination of a sum is new
Set<String> states = new HashSet<>();
List<Integer> result = new ArrayList<>();
StringBuilder sb = new StringBuilder();
// This is a max heap contain <State, largestSum>
PriorityQueue<Pair> pq = new PriorityQueue<>();
char[] stateChars = new char[totalCharacterInState];
Arrays.fill(stateChars, '0');
sb.append(stateChars);
String firstState = sb.toString();
states.add(firstState);
int firstLargestSum = 0;
for(int i = 0; i < numOfLists; i++) firstLargestSum += lists[i][0];
// Imagine this is the initial state in a graph
pq.add(new Pair(firstState, firstLargestSum));
while(n > 0) {
// In case n is larger than the number of combinations of all list entries
if(pq.isEmpty()) break;
Pair top = pq.poll();
String currentState = top.state;
int currentSum = top.sum;
/*
* Loop for all lists and generate new states of which only 1 character is different from the former state
* For example: the initial state (Stage 0) 0 0 0 0 0
* So the next states (Stage 1) should be:
* 1 0 0 0 0
* 0 1 0 0 0 (choose element at index 2 from 2nd array)
* 0 0 1 0 0 (choose element at index 2 from 3rd array)
* 0 0 0 0 1
* But don't forget to check whether index in any lists have exceeded list's length
*/
for(int i = 0; i < numOfLists; i++) {
int indexInList = Integer.parseInt(
currentState.substring(positionStartingCharacterOfListState[i], positionStartingCharacterOfListState[i+1]));
if( indexInList < lists[i].length - 1) {
int numberOfCharacters = positionStartingCharacterOfListState[i+1] - positionStartingCharacterOfListState[i];
sb = new StringBuilder(currentState.substring(0, positionStartingCharacterOfListState[i]));
sb.append(String.format("%0" + numberOfCharacters + "d", indexInList + 1));
sb.append(currentState.substring(positionStartingCharacterOfListState[i+1]));
String newState = sb.toString();
if(!states.contains(newState)) {
// The newSum is always <= currentSum
int newSum = currentSum - lists[i][indexInList] + lists[i][indexInList+1];
states.add(newState);
// Using priority queue, we can immediately retrieve the largest Sum at Stage k and track all other unused states.
// From that Stage k largest Sum's state, then we can generate new states
// Those sums composed by recently generated states don't guarantee to be larger than those sums composed by old unused states.
pq.add(new Pair(newState, newSum));
}
}
}
result.add(currentSum);
n--;
}
return result;
}
Let me explain how I come up with the solution:
The while loop in my answer executes N times, consider the max heap
( priority queue).
Poll operation 1 time with complexity O(log(
sumOfListLength )) because the maximum element Pair in
heap is sumOfListLength.
Insertion operations might up to K times,
the complexity for each insertion is log(sumOfListLength).
Therefore, the complexity is O(N * log(sumOfListLength) ),

How to find pythagorean triplets in an array faster than O(N^2)?

Can someone suggest an algorithm that finds all Pythagorean triplets among numbers in a given array? If it's possible, please, suggest an algorithm faster than O(n2).
Pythagorean triplet is a set {a,b,c} such that a2 = b2 + c2. Example: for array [9, 2, 3, 4, 8, 5, 6, 10] the output of the algorithm should be {3, 4, 5} and {6, 8, 10}.
I understand this question as
Given an array, find all such triplets i,j and k, such that a[i]2 = a[j]2+a[k]2
The key idea of the solution is:
Square each element. (This takes O(n) time). This will reduce the original task to "find three numbers in array, one of which is the sum of other two".
Now it you know how to solve such task in less than O(n2) time, use such algorithm. Out of my mind comes only the following O(n2) solution:
Sort the array in ascending order. This takes O(n log n).
Now consider each element a[i]. If a[i]=a[j]+a[k], then, since numbers are positive and array is now sorted, k<i and j<i.
To find such indexes, run a loop that increases j from 1 to i, and decreases k from i to 0 at the same time, until they meet. Increase j if a[j]+a[k] < a[i], and decrease k if the sum is greater than a[i]. If the sum is equal, that's one of the answers, print it, and shift both indexes.
This takes O(i) operations.
Repeat step 2 for each index i. This way you'll need totally O(n2) operations, which will be the final estimate.
No one knows how to do significantly better than quadratic for the closely related 3SUM problem ( http://en.wikipedia.org/wiki/3SUM ). I'd rate the possibility of a fast solution to your problem as unlikely.
The 3SUM problem is finding a + b + c = 0. Let PYTHTRIP be the problem of finding a^2 + b^2 = c^2 when the inputs are real algebraic numbers. Here is the O(n log n)-time reduction from 3SUM to PYTHTRIP. As ShreevatsaR points out, this doesn't exclude the possibility of a number-theoretic trick (or a solution to 3SUM!).
First we reduce 3SUM to a problem I'll call 3SUM-ALT. In 3SUM-ALT, we want to find a + b = c where all array entries are nonnegative. The finishing reduction from 3SUM-ALT to PYTHTRIP is just taking square roots.
To solve 3SUM using 3SUM-ALT, first eliminate the possibility of triples where one of a, b, c is zero (O(n log n)). Now, any satisfying triple has two positive numbers and one negative, or two negative and one positive. Let w be a number greater than three times the absolute value of any input number. Solve two instances of 3SUM-ALT: one where all negative x are mapped to w - x and all positive x are mapped to 2w + x; one where all negative x are mapped to 2w - x and all positive x are mapped to w + x. The rest of the proof is straightforward.
I have one more solution,
//sort the array in ascending order
//find the square of each element in the array
//let 'a' be the array containing square of each element in ascending order
for(i->0 to (a.length-1))
for (j->i+1 to (a.length-1))
//search the a[i]+a[j] ahead in the array from j+1 to the end of array
//if found get the triplet according to sqrt(a[i]),sqrt(a[j]) & sqrt(a[i]+a[j])
endfor
endfor
Not sure if this is any better but you can compute them in time proportional to the maximum value in the list by just computing all possible triples less than or equal to it. The following Perl code does. The time complexity of the algorithm is proportional to the maximum value since the sum of inverse squares 1 + 1/2^2 + 1/3^3 .... is equal to Pi^2/6, a constant.
I just used the formula from the Wikipedia page for generating none unique triples.
my $list = [9, 2, 3, 4, 8, 5, 6, 10];
pythagoreanTriplets ($list);
sub pythagoreanTriplets
{
my $list = $_[0];
my %hash;
my $max = 0;
foreach my $value (#$list)
{
$hash{$value} = 1;
$max = $value if ($value > $max);
}
my $sqrtMax = 1 + int sqrt $max;
for (my $n = 1; $n <= $sqrtMax; $n++)
{
my $n2 = $n * $n;
for (my $m = $n + 1; $m <= $sqrtMax; $m++)
{
my $m2 = $m * $m;
my $maxK = 1 + int ($max / ($m2 + $n2));
for (my $k = 1; $k <= $maxK; $k++)
{
my $a = $k * ($m2 - $n2);
my $b = $k * (2 * $m * $n);
my $c = $k * ($m2 + $n2);
print "$a $b $c\n" if (exists ($hash{$a}) && exists ($hash{$b}) && exists ($hash{$c}));
}
}
}
}
Here's a solution which might scale better for large lists of small numbers. At least it's different ;v) .
According to http://en.wikipedia.org/wiki/Pythagorean_triple#Generating_a_triple,
a = m^2 - n^2, b = 2mn, c = m^2 + n^2
b looks nice, eh?
Sort the array in O(N log N) time.
For each element b, find the prime factorization. Naively using a table of primes up to the square root of the largest input value M would take O(sqrt M/log M) time and space* per element.
For each pair (m,n), m > n, b = 2mn (skip odd b), search for m^2-n^2 and m^2+n^2 in the sorted array. O(log N) per pair, O(2^(Ω(M))) = O(log M)** pairs per element, O(N (log N) (log M)) total.
Final analysis: O( N ( (sqrt M/log M) + (log N * log M) ) ), N = array size, M = magnitude of values.
(* To accept 64-bit input, there are about 203M 32-bit primes, but we can use a table of differences at one byte per prime, since the differences are all even, and perhaps also generate large primes in sequence on demand. To accept 32-bit input, a table of 16-bit primes is needed, which is small enough to fit in L1 cache. Time here is an overestimate assuming all prime factors are just less than the square root.)
(** Actual bound lower because of duplicate prime factors.)
Solution in O(N).
find out minimum element in array. min O(n).
find out maximum element in array. max O(n).
make a hastable of elements so that element can be searched in O(1).
m^2-1 = min .... put min from step 1. find out m in this equation.O(1)
2m = min .... put min from step 1. find out m in this equation.O(1)
m^2+1= max .... put max from step 2. find out m in this equation.O(1)
choose floor of min of (steps 4,5,6) let say minValue.O(1)
choose ceil of max of (steps 4,5,6) let say maxValue.O(1)
loop from j=minValue to maxValue. maxvalue-minvalue will be less than root of N.
9.a calculate three numbers j^2-1,2j,j^2+1.
9.b search these numbers in hashtable. if found return success.
return failure.
A few of my co-workers were asked this very same problem in a java cert course they were taking the solution we came up with was O(N^2). We shaved off as much of the problem space as we could but we could not find a way to drop the complexity to N Log N or better.
public static List<int[]> pythagoreanTripplets(int[] input) {
List<int[]> answers = new ArrayList<int[]>();
Map<Long, Integer> map = new HashMap<Long, Integer>();
for (int i = 0; i < input.length; i++) {
map.put((long)input[i] * (long)input[i], input[i]);
}
Long[] unique = (Long[]) map.keySet().toArray(new Long[0]);
Arrays.sort(unique);
long comps =0;
for(int i = 1 ; i < unique.length;i++)
{
Long halfC = unique[i]/2;
for(int j = i-1 ; j>= 0 ; j--)
{
if(unique[j] < halfC) break;
if(map.containsKey(unique[i] - unique[j]))
{
answers.add(new int[]{map.get(unique[i] - unique[j]),map.get(unique[j]),map.get(unique[i])});
}
}
}
return answers;
}
If (a, b, c) is a Pythagorean triple, then so is (ka, kb, kc) for any positive integer.
so simply find one value for a, b, and c and then you can calculate as many new ones as you want.
Pseudo code:
a = 3
b = 4
c = 5
for k in 1..N:
P[k] = (ka, kb, kc)
Let me know if this is not exactly what you're looking for.
It can be done in O(n) time. first hash the elements in map for existence check. after that apply the below algorithm
Scan the array and if element is even number, (n,n^2/2 +1, n^2/2 -1) is triplet to be found. just check for that's existence using hash map lookup. if all elements in triplet exists, print the triplet.
This is the one i had implemented ...
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
/**
*
* #author Pranali Choudhari (pranali_choudhari#persistent.co.in)
*/
public class PythagoreanTriple {
/
//I hope this is optimized
public static void main(String[] args) {
Map<Long,Set<Long>> triples = new HashMap<Long,Set<Long>>();
List<Long> l1 = new ArrayList<Long>();
addValuesToArrayList(l1);
long n =0;
for(long i : l1){
//if its side a.
n = (i-1L)/2L;
if (n!=0 && n > 0){
putInMap(triples,n,i);
n=0;
}
//if its side b
n = ((-1 + Math.round(Math.sqrt(2*i+1)))/2);
if (n != 0 && n > 0){
putInMap(triples,n,i);
n=0;
}
n= ((-1 - Math.round(Math.sqrt(2*i+1)))/2);
if (n != 0 && n > 0){
putInMap(triples,n,i);
n=0;
}
//if its side c
n = ((-1 + Math.round(Math.sqrt(2*i-1)))/2);
if (n != 0 && n > 0){
putInMap(triples,n,i);
n=0;
}
n= ((-1 - Math.round(Math.sqrt(2*i-1)))/2);
if (n != 0 && n > 0){
putInMap(triples,n,i);
n=0;
}
}
for(Map.Entry<Long, Set<Long>> e : triples.entrySet()){
if(e.getValue().size() == 3){
System.out.println("Tripples" + e.getValue());
}
//need to handle scenario when size() > 3
//even those are tripples but we need to filter the wrong ones
}
}
private static void putInMap( Map<Long,Set<Long>> triples, long n, Long i) {
Set<Long> set = triples.get(n);
if(set == null){
set = new HashSet<Long>();
triples.put(n, set);
}
set.add(i);
}
//add values here
private static void addValuesToArrayList(List<Long> l1) {
l1.add(1L);
l1.add(2L);
l1.add(3L);
l1.add(4L);
l1.add(5L);
l1.add(12L);
l1.add(13L);
}
}
Here's the implementation in Java:
/**
* Step1: Square each of the elements in the array [O(n)]
* Step2: Sort the array [O(n logn)]
* Step3: For each element in the array, find all the pairs in the array whose sum is equal to that element [O(n2)]
*
* Time Complexity: O(n2)
*/
public static Set<Set<Integer>> findAllPythogoreanTriplets(int [] unsortedData) {
// O(n) - Square all the elements in the array
for (int i = 0; i < unsortedData.length; i++)
unsortedData[i] *= unsortedData[i];
// O(n logn) - Sort
int [] sortedSquareData = QuickSort.sort(unsortedData);
// O(n2)
Set<Set<Integer>> triplets = new HashSet<Set<Integer>>();
for (int i = 0; i < sortedSquareData.length; i++) {
Set<Set<Integer>> pairs = findAllPairsThatSumToAConstant(sortedSquareData, sortedSquareData[i]);
for (Set<Integer> pair : pairs) {
Set<Integer> triplet = new HashSet<Integer>();
for (Integer n : pair) {
triplet.add((int)Math.sqrt(n));
}
triplet.add((int)Math.sqrt(sortedSquareData[i])); // adding the third element to the pair to make it a triplet
triplets.add(triplet);
}
}
return triplets;
}
public static Set<Set<Integer>> findAllPairsThatSumToAConstant(int [] sortedData, int constant) {
// O(n)
Set<Set<Integer>> pairs = new HashSet<Set<Integer>>();
int p1 = 0; // pointing to the first element
int p2 = sortedData.length - 1; // pointing to the last element
while (p1 < p2) {
int pointersSum = sortedData[p1] + sortedData[p2];
if (pointersSum > constant)
p2--;
else if (pointersSum < constant)
p1++;
else {
Set<Integer> set = new HashSet<Integer>();
set.add(sortedData[p1]);
set.add(sortedData[p2]);
pairs.add(set);
p1++;
p2--;
}
}
return pairs;
}
if the problem is the one "For an Array of integers find all triples such that a^2+b^2 = c^2
Sort the array into ascending order
Set three pointers p1,p2,p3 at entries 0,1,2
set pEnd to past the last entry in the array
while (p2 < pend-2)
{
sum = (*p1 * *p1 + *p2 * *p2)
while ((*p3 * *p3) < sum && p3 < pEnd -1)
p3++;
if ( *p3 == sum)
output_triple(*p1, *p2, *p3);
p1++;
p2++;
}
it's moving 3 pointers up the array so it O(sort(n) + n)
it's not n2 because the next pass starts at the next largest number and doesn't reset.
if the last number was too small for the triple, it's still to small when you go to the next bigger a and b
public class FindPythagorusCombination {
public static void main(String[] args) {
int[] no={1, 5, 3, 4, 8, 10, 6 };
int[] sortedno= sorno(no);
findPythaComb(sortedno);
}
private static void findPythaComb(int[] sortedno) {
for(int i=0; i<sortedno.length;i++){
int lSum=0, rSum=0;
lSum= sortedno[i]*sortedno[i];
for(int j=i+1; j<sortedno.length; j++){
for(int k=j+1; k<sortedno.length;k++){
rSum= (sortedno[j]*sortedno[j])+(sortedno[k]*sortedno[k]);
if(lSum==rSum){
System.out.println("Pythagorus combination found: " +sortedno[i] +" " +sortedno[j]+" "+sortedno[k]);
}else
rSum=0;
}
}
}
}
private static int[] sorno(int[] no) {
for(int i=0; i<no.length;i++){
for(int j=i+1; j<no.length;j++){
if(no[i]<no[j]){
int temp= no[i];
no[i]= no[j];
no[j]=temp;
}
}
}
return no;
}
}
import java.io.*;
import java.lang.*;
import java.util.*;
class PythagoreanTriplets
{
public static void main(String args[])throws IOException
{
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
int n = Integer.parseInt(br.readLine());
int arr[] = new int[n];
int i,j,k,sum;
System.out.println("Enter the numbers ");
for(i=0;i<n;i++)
{
arr[i]=Integer.parseInt(br.readLine());
arr[i]=arr[i]*arr[i];
}
Arrays.sort(arr);
for(i=n-1;i>=0;i--)
{
for(j=0,k=i-1;j<k;)
{
sum=arr[j]+arr[k];
if(sum==arr[i]){System.out.println((int)Math.sqrt(arr[i]) +","+(int)Math.sqrt(arr[j])+","+(int)Math.sqrt(arr[k]));break;}
else if(sum>arr[i])k--;
else j++;
}
}
}
}
Finding Pythagorean triplets in O(n)
Algorithm:
For each element in array, check it is prime or not
if it is prime, calculate other two number as ((n^2)+1)/2 and ((n^2)-1)/2 and check whether these two calculated number is in array
if it is not prime, calculate other two number as mentioned in else case in code given below
Repeat until end of array is reached
int arr[]={1,2,3,4,5,6,7,8,9,10,12,13,11,60,61};
int prim[]={3,5,7,11};//store all the prime numbers
int r,l;
List<Integer> prime=new ArrayList<Integer>();//storing in list,so that it is easy to search
for(int i=0;i<4;i++){
prime.add(prim[i]);
}
List<Integer> n=new ArrayList<Integer>();
for(int i=0;i<arr.length;i++)
{
n.add(arr[i]);
}
double v1,v2,v3;
int dummy[]=new int[arr.length];
for(int i=0;i<arr.length;i++)
dummy[i]=arr[i];
Integer x=0,y=0,z=0;
List<Integer> temp=new ArrayList<Integer>();
for(int i=0;i<arr.length;i++)
{
temp.add(arr[i]);
}
for(int j:n){
if(prime.contains(j)){//if it is prime
double a,b;
v1=(double)j;
v2=Math.ceil(((j*j)+1)/2);
v3=Math.ceil(((j*j)-1)/2);
if(n.contains((int)v2) && n.contains((int)v3)){
System.out.println((int)v1+" "+(int)v2+" "+(int)v3);
}
}
else//if it is not prime
{
if(j%3==0){
x=j;
y=4*(j/3);
z=5*(j/3);
if(temp.contains(y) && temp.contains(z)){
System.out.println(x+" "+y+" "+z);
//replacing those three elements with 0
dummy[temp.indexOf(x)-1]=0;
dummy[temp.indexOf(y)-1]=0;
dummy[temp.indexOf(z)-1]=0;
}
}
}//else end
}//for end
Complexity: O(n)
Take a look at the following code that I wrote.
#include <iostream>
#include <vector>
using namespace std;
typedef long long ll;
bool existTriplet(vector<ll> &vec)
{
for(auto i = 0; i < vec.size(); i++)
{
vec[i] = vec[i] * vec[i]; //Square all the array elements
}
sort(vec.begin(), vec.end()); //Sort it
for(auto i = vec.size() - 1; i >= 2; i--)
{
ll l = 0;
ll r = i - 1;
while(l < r)
{
if(vec[l] + vec[r] == vec[i])
return true;
vec[l] + vec[r] < vec[i] ? l++ : r--;
}
}
return false;
}
int main() {
int T;
cin >> T;
while(T--)
{
ll n;
cin >> n;
vector<ll> vec(n);
for(auto i = 0; i < n; i++)
{
cin >> vec[i];
}
if(existTriplet(vec))
cout << "Yes";
else
cout << "No";
cout << endl;
}
return 0;
}
Plato's formula for Pythagorean Triples:
Plato, a Greek Philosopher, came up with a great formula for finding Pythagorean triples.
(2m)^2 + (m^2 - 1)^2 = (m^2 + 1)^2
bool checkperfectSquare(int num){
int sq=(int)round(sqrt(num));
if(sq==num/sq){
return true;
}
else{
return false;
}
}
void solve(){
int i,j,k,n;
// lenth of array
cin>>n;
int ar[n];
// reading all the number in array
for(i=0;i<n;i++){
cin>>ar[i];
}
// sort the array
sort(ar,ar+n);
for(i=0;i<n;i++){
if(ar[i]<=2){
continue;
}
else{
int tmp1=ar[i]+1;
int m;
if(checkperfectSquare(tmp1)){
m=(ll)round(sqrt(tmp1));
int b=2*m,c=(m*m)+1;
if(binary_search(ar,ar+n,b)&&binary_search(ar,ar+n,c)){
cout<<ar[i]<<" "<<b<<" "<<c<<endl;
break;
}
}
if(ar[i]%2==0){
m=ar[i]/2;
int b=(m*m-1),c=(m*m+1);
if(binary_search(ar,ar+n,b)&&binary_search(ar,ar+n,c)){
cout<<ar[i]<<" "<<b<<" "<<c<<endl;
break;
}
}
}
}
}

How to find the kth largest element in an unsorted array of length n in O(n)?

I believe there's a way to find the kth largest element in an unsorted array of length n in O(n). Or perhaps it's "expected" O(n) or something. How can we do this?
This is called finding the k-th order statistic. There's a very simple randomized algorithm (called quickselect) taking O(n) average time, O(n^2) worst case time, and a pretty complicated non-randomized algorithm (called introselect) taking O(n) worst case time. There's some info on Wikipedia, but it's not very good.
Everything you need is in these powerpoint slides. Just to extract the basic algorithm of the O(n) worst-case algorithm (introselect):
Select(A,n,i):
Divide input into ⌈n/5⌉ groups of size 5.
/* Partition on median-of-medians */
medians = array of each group’s median.
pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
Left Array L and Right Array G = partition(A, pivot)
/* Find ith element in L, pivot, or G */
k = |L| + 1
If i = k, return pivot
If i < k, return Select(L, k-1, i)
If i > k, return Select(G, n-k, i-k)
It's also very nicely detailed in the Introduction to Algorithms book by Cormen et al.
If you want a true O(n) algorithm, as opposed to O(kn) or something like that, then you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). My prof has a great writeup, with the runtime analysis: (reference)
The QuickSelect algorithm quickly finds the k-th smallest element of an unsorted array of n elements. It is a RandomizedAlgorithm, so we compute the worst-case expected running time.
Here is the algorithm.
QuickSelect(A, k)
let r be chosen uniformly at random in the range 1 to length(A)
let pivot = A[r]
let A1, A2 be new arrays
# split into a pile A1 of small elements and A2 of big elements
for i = 1 to n
if A[i] < pivot then
append A[i] to A1
else if A[i] > pivot then
append A[i] to A2
else
# do nothing
end for
if k <= length(A1):
# it's in the pile of small elements
return QuickSelect(A1, k)
else if k > length(A) - length(A2)
# it's in the pile of big elements
return QuickSelect(A2, k - (length(A) - length(A2))
else
# it's equal to the pivot
return pivot
What is the running time of this algorithm? If the adversary flips coins for us, we may find that the pivot is always the largest element and k is always 1, giving a running time of
T(n) = Theta(n) + T(n-1) = Theta(n2)
But if the choices are indeed random, the expected running time is given by
T(n) <= Theta(n) + (1/n) ∑i=1 to nT(max(i, n-i-1))
where we are making the not entirely reasonable assumption that the recursion always lands in the larger of A1 or A2.
Let's guess that T(n) <= an for some a. Then we get
T(n)
<= cn + (1/n) ∑i=1 to nT(max(i-1, n-i))
= cn + (1/n) ∑i=1 to floor(n/2) T(n-i) + (1/n) ∑i=floor(n/2)+1 to n T(i)
<= cn + 2 (1/n) ∑i=floor(n/2) to n T(i)
<= cn + 2 (1/n) ∑i=floor(n/2) to n ai
and now somehow we have to get the horrendous sum on the right of the plus sign to absorb the cn on the left. If we just bound it as 2(1/n) ∑i=n/2 to n an, we get roughly 2(1/n)(n/2)an = an. But this is too big - there's no room to squeeze in an extra cn. So let's expand the sum using the arithmetic series formula:
∑i=floor(n/2) to n i
= ∑i=1 to n i - ∑i=1 to floor(n/2) i
= n(n+1)/2 - floor(n/2)(floor(n/2)+1)/2
<= n2/2 - (n/4)2/2
= (15/32)n2
where we take advantage of n being "sufficiently large" to replace the ugly floor(n/2) factors with the much cleaner (and smaller) n/4. Now we can continue with
cn + 2 (1/n) ∑i=floor(n/2) to n ai,
<= cn + (2a/n) (15/32) n2
= n (c + (15/16)a)
<= an
provided a > 16c.
This gives T(n) = O(n). It's clearly Omega(n), so we get T(n) = Theta(n).
A quick Google on that ('kth largest element array') returned this: http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17
"Make one pass through tracking the three largest values so far."
(it was specifically for 3d largest)
and this answer:
Build a heap/priority queue. O(n)
Pop top element. O(log n)
Pop top element. O(log n)
Pop top element. O(log n)
Total = O(n) + 3 O(log n) = O(n)
You do like quicksort. Pick an element at random and shove everything either higher or lower. At this point you'll know which element you actually picked, and if it is the kth element you're done, otherwise you repeat with the bin (higher or lower), that the kth element would fall in. Statistically speaking, the time it takes to find the kth element grows with n, O(n).
A Programmer's Companion to Algorithm Analysis gives a version that is O(n), although the author states that the constant factor is so high, you'd probably prefer the naive sort-the-list-then-select method.
I answered the letter of your question :)
The C++ standard library has almost exactly that function call nth_element, although it does modify your data. It has expected linear run-time, O(N), and it also does a partial sort.
const int N = ...;
double a[N];
// ...
const int m = ...; // m < N
nth_element (a, a + m, a + N);
// a[m] contains the mth element in a
You can do it in O(n + kn) = O(n) (for constant k) for time and O(k) for space, by keeping track of the k largest elements you've seen.
For each element in the array you can scan the list of k largest and replace the smallest element with the new one if it is bigger.
Warren's priority heap solution is neater though.
Although not very sure about O(n) complexity, but it will be sure to be between O(n) and nLog(n). Also sure to be closer to O(n) than nLog(n). Function is written in Java
public int quickSelect(ArrayList<Integer>list, int nthSmallest){
//Choose random number in range of 0 to array length
Random random = new Random();
//This will give random number which is not greater than length - 1
int pivotIndex = random.nextInt(list.size() - 1);
int pivot = list.get(pivotIndex);
ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();
//Split list into two.
//Value smaller than pivot should go to smallerNumberList
//Value greater than pivot should go to greaterNumberList
//Do nothing for value which is equal to pivot
for(int i=0; i<list.size(); i++){
if(list.get(i)<pivot){
smallerNumberList.add(list.get(i));
}
else if(list.get(i)>pivot){
greaterNumberList.add(list.get(i));
}
else{
//Do nothing
}
}
//If smallerNumberList size is greater than nthSmallest value, nthSmallest number must be in this list
if(nthSmallest < smallerNumberList.size()){
return quickSelect(smallerNumberList, nthSmallest);
}
//If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
//The step is bit tricky. If confusing, please see the above loop once again for clarification.
else if(nthSmallest > (list.size() - greaterNumberList.size())){
//nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in
//smallerNumberList
nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
return quickSelect(greaterNumberList,nthSmallest);
}
else{
return pivot;
}
}
I implemented finding kth minimimum in n unsorted elements using dynamic programming, specifically tournament method. The execution time is O(n + klog(n)). The mechanism used is listed as one of methods on Wikipedia page about Selection Algorithm (as indicated in one of the posting above). You can read about the algorithm and also find code (java) on my blog page Finding Kth Minimum. In addition the logic can do partial ordering of the list - return first K min (or max) in O(klog(n)) time.
Though the code provided result kth minimum, similar logic can be employed to find kth maximum in O(klog(n)), ignoring the pre-work done to create tournament tree.
Sexy quickselect in Python
def quickselect(arr, k):
'''
k = 1 returns first element in ascending order.
can be easily modified to return first element in descending order
'''
r = random.randrange(0, len(arr))
a1 = [i for i in arr if i < arr[r]] '''partition'''
a2 = [i for i in arr if i > arr[r]]
if k <= len(a1):
return quickselect(a1, k)
elif k > len(arr)-len(a2):
return quickselect(a2, k - (len(arr) - len(a2)))
else:
return arr[r]
As per this paper Finding the Kth largest item in a list of n items the following algorithm will take O(n) time in worst case.
Divide the array in to n/5 lists of 5 elements each.
Find the median in each sub array of 5 elements.
Recursively find the median of all the medians, lets call it M
Partition the array in to two sub array 1st sub-array contains the elements larger than M , lets say this sub-array is a1 , while other sub-array contains the elements smaller then M., lets call this sub-array a2.
If k <= |a1|, return selection (a1,k).
If k− 1 = |a1|, return M.
If k> |a1| + 1, return selection(a2,k −a1 − 1).
Analysis: As suggested in the original paper:
We use the median to partition the list into two halves(the first half,
if k <= n/2 , and the second half otherwise). This algorithm takes
time cn at the first level of recursion for some constant c, cn/2 at
the next level (since we recurse in a list of size n/2), cn/4 at the
third level, and so on. The total time taken is cn + cn/2 + cn/4 +
.... = 2cn = o(n).
Why partition size is taken 5 and not 3?
As mentioned in original paper:
Dividing the list by 5 assures a worst-case split of 70 − 30. Atleast
half of the medians greater than the median-of-medians, hence atleast
half of the n/5 blocks have atleast 3 elements and this gives a
3n/10 split, which means the other partition is 7n/10 in worst case.
That gives T(n) = T(n/5)+T(7n/10)+O(n). Since n/5+7n/10 < 1, the
worst-case running time isO(n).
Now I have tried to implement the above algorithm as:
public static int findKthLargestUsingMedian(Integer[] array, int k) {
// Step 1: Divide the list into n/5 lists of 5 element each.
int noOfRequiredLists = (int) Math.ceil(array.length / 5.0);
// Step 2: Find pivotal element aka median of medians.
int medianOfMedian = findMedianOfMedians(array, noOfRequiredLists);
//Now we need two lists split using medianOfMedian as pivot. All elements in list listOne will be grater than medianOfMedian and listTwo will have elements lesser than medianOfMedian.
List<Integer> listWithGreaterNumbers = new ArrayList<>(); // elements greater than medianOfMedian
List<Integer> listWithSmallerNumbers = new ArrayList<>(); // elements less than medianOfMedian
for (Integer element : array) {
if (element < medianOfMedian) {
listWithSmallerNumbers.add(element);
} else if (element > medianOfMedian) {
listWithGreaterNumbers.add(element);
}
}
// Next step.
if (k <= listWithGreaterNumbers.size()) return findKthLargestUsingMedian((Integer[]) listWithGreaterNumbers.toArray(new Integer[listWithGreaterNumbers.size()]), k);
else if ((k - 1) == listWithGreaterNumbers.size()) return medianOfMedian;
else if (k > (listWithGreaterNumbers.size() + 1)) return findKthLargestUsingMedian((Integer[]) listWithSmallerNumbers.toArray(new Integer[listWithSmallerNumbers.size()]), k-listWithGreaterNumbers.size()-1);
return -1;
}
public static int findMedianOfMedians(Integer[] mainList, int noOfRequiredLists) {
int[] medians = new int[noOfRequiredLists];
for (int count = 0; count < noOfRequiredLists; count++) {
int startOfPartialArray = 5 * count;
int endOfPartialArray = startOfPartialArray + 5;
Integer[] partialArray = Arrays.copyOfRange((Integer[]) mainList, startOfPartialArray, endOfPartialArray);
// Step 2: Find median of each of these sublists.
int medianIndex = partialArray.length/2;
medians[count] = partialArray[medianIndex];
}
// Step 3: Find median of the medians.
return medians[medians.length / 2];
}
Just for sake of completion, another algorithm makes use of Priority Queue and takes time O(nlogn).
public static int findKthLargestUsingPriorityQueue(Integer[] nums, int k) {
int p = 0;
int numElements = nums.length;
// create priority queue where all the elements of nums will be stored
PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
// place all the elements of the array to this priority queue
for (int n : nums) {
pq.add(n);
}
// extract the kth largest element
while (numElements - k + 1 > 0) {
p = pq.poll();
k++;
}
return p;
}
Both of these algorithms can be tested as:
public static void main(String[] args) throws IOException {
Integer[] numbers = new Integer[]{2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
System.out.println(findKthLargestUsingMedian(numbers, 8));
System.out.println(findKthLargestUsingPriorityQueue(numbers, 8));
}
As expected output is:
18
18
Find the median of the array in linear time, then use partition procedure exactly as in quicksort to divide the array in two parts, values to the left of the median lesser( < ) than than median and to the right greater than ( > ) median, that too can be done in lineat time, now, go to that part of the array where kth element lies,
Now recurrence becomes:
T(n) = T(n/2) + cn
which gives me O (n) overal.
Below is the link to full implementation with quite an extensive explanation how the algorithm for finding Kth element in an unsorted algorithm works. Basic idea is to partition the array like in QuickSort. But in order to avoid extreme cases (e.g. when smallest element is chosen as pivot in every step, so that algorithm degenerates into O(n^2) running time), special pivot selection is applied, called median-of-medians algorithm. The whole solution runs in O(n) time in worst and in average case.
Here is link to the full article (it is about finding Kth smallest element, but the principle is the same for finding Kth largest):
Finding Kth Smallest Element in an Unsorted Array
How about this kinda approach
Maintain a buffer of length k and a tmp_max, getting tmp_max is O(k) and is done n times so something like O(kn)
Is it right or am i missing something ?
Although it doesn't beat average case of quickselect and worst case of median statistics method but its pretty easy to understand and implement.
There is also one algorithm, that outperforms quickselect algorithm. It's called Floyd-Rivets (FR) algorithm.
Original article: https://doi.org/10.1145/360680.360694
Downloadable version: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.309.7108&rep=rep1&type=pdf
Wikipedia article https://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm
I tried to implement quickselect and FR algorithm in C++. Also I compared them to the standard C++ library implementations std::nth_element (which is basically introselect hybrid of quickselect and heapselect). The result was quickselect and nth_element ran comparably on average, but FR algorithm ran approx. twice as fast compared to them.
Sample code that I used for FR algorithm:
template <typename T>
T FRselect(std::vector<T>& data, const size_t& n)
{
if (n == 0)
return *(std::min_element(data.begin(), data.end()));
else if (n == data.size() - 1)
return *(std::max_element(data.begin(), data.end()));
else
return _FRselect(data, 0, data.size() - 1, n);
}
template <typename T>
T _FRselect(std::vector<T>& data, const size_t& left, const size_t& right, const size_t& n)
{
size_t leftIdx = left;
size_t rightIdx = right;
while (rightIdx > leftIdx)
{
if (rightIdx - leftIdx > 600)
{
size_t range = rightIdx - leftIdx + 1;
long long i = n - (long long)leftIdx + 1;
long long z = log(range);
long long s = 0.5 * exp(2 * z / 3);
long long sd = 0.5 * sqrt(z * s * (range - s) / range) * sgn(i - (long long)range / 2);
size_t newLeft = fmax(leftIdx, n - i * s / range + sd);
size_t newRight = fmin(rightIdx, n + (range - i) * s / range + sd);
_FRselect(data, newLeft, newRight, n);
}
T t = data[n];
size_t i = leftIdx;
size_t j = rightIdx;
// arrange pivot and right index
std::swap(data[leftIdx], data[n]);
if (data[rightIdx] > t)
std::swap(data[rightIdx], data[leftIdx]);
while (i < j)
{
std::swap(data[i], data[j]);
++i; --j;
while (data[i] < t) ++i;
while (data[j] > t) --j;
}
if (data[leftIdx] == t)
std::swap(data[leftIdx], data[j]);
else
{
++j;
std::swap(data[j], data[rightIdx]);
}
// adjust left and right towards the boundaries of the subset
// containing the (k - left + 1)th smallest element
if (j <= n)
leftIdx = j + 1;
if (n <= j)
rightIdx = j - 1;
}
return data[leftIdx];
}
template <typename T>
int sgn(T val) {
return (T(0) < val) - (val < T(0));
}
iterate through the list. if the current value is larger than the stored largest value, store it as the largest value and bump the 1-4 down and 5 drops off the list. If not,compare it to number 2 and do the same thing. Repeat, checking it against all 5 stored values. this should do it in O(n)
i would like to suggest one answer
if we take the first k elements and sort them into a linked list of k values
now for every other value even for the worst case if we do insertion sort for rest n-k values even in the worst case number of comparisons will be k*(n-k) and for prev k values to be sorted let it be k*(k-1) so it comes out to be (nk-k) which is o(n)
cheers
Explanation of the median - of - medians algorithm to find the k-th largest integer out of n can be found here:
http://cs.indstate.edu/~spitla/presentation.pdf
Implementation in c++ is below:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int findMedian(vector<int> vec){
// Find median of a vector
int median;
size_t size = vec.size();
median = vec[(size/2)];
return median;
}
int findMedianOfMedians(vector<vector<int> > values){
vector<int> medians;
for (int i = 0; i < values.size(); i++) {
int m = findMedian(values[i]);
medians.push_back(m);
}
return findMedian(medians);
}
void selectionByMedianOfMedians(const vector<int> values, int k){
// Divide the list into n/5 lists of 5 elements each
vector<vector<int> > vec2D;
int count = 0;
while (count != values.size()) {
int countRow = 0;
vector<int> row;
while ((countRow < 5) && (count < values.size())) {
row.push_back(values[count]);
count++;
countRow++;
}
vec2D.push_back(row);
}
cout<<endl<<endl<<"Printing 2D vector : "<<endl;
for (int i = 0; i < vec2D.size(); i++) {
for (int j = 0; j < vec2D[i].size(); j++) {
cout<<vec2D[i][j]<<" ";
}
cout<<endl;
}
cout<<endl;
// Calculating a new pivot for making splits
int m = findMedianOfMedians(vec2D);
cout<<"Median of medians is : "<<m<<endl;
// Partition the list into unique elements larger than 'm' (call this sublist L1) and
// those smaller them 'm' (call this sublist L2)
vector<int> L1, L2;
for (int i = 0; i < vec2D.size(); i++) {
for (int j = 0; j < vec2D[i].size(); j++) {
if (vec2D[i][j] > m) {
L1.push_back(vec2D[i][j]);
}else if (vec2D[i][j] < m){
L2.push_back(vec2D[i][j]);
}
}
}
// Checking the splits as per the new pivot 'm'
cout<<endl<<"Printing L1 : "<<endl;
for (int i = 0; i < L1.size(); i++) {
cout<<L1[i]<<" ";
}
cout<<endl<<endl<<"Printing L2 : "<<endl;
for (int i = 0; i < L2.size(); i++) {
cout<<L2[i]<<" ";
}
// Recursive calls
if ((k - 1) == L1.size()) {
cout<<endl<<endl<<"Answer :"<<m;
}else if (k <= L1.size()) {
return selectionByMedianOfMedians(L1, k);
}else if (k > (L1.size() + 1)){
return selectionByMedianOfMedians(L2, k-((int)L1.size())-1);
}
}
int main()
{
int values[] = {2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
vector<int> vec(values, values + 25);
cout<<"The given array is : "<<endl;
for (int i = 0; i < vec.size(); i++) {
cout<<vec[i]<<" ";
}
selectionByMedianOfMedians(vec, 8);
return 0;
}
There is also Wirth's selection algorithm, which has a simpler implementation than QuickSelect. Wirth's selection algorithm is slower than QuickSelect, but with some improvements it becomes faster.
In more detail. Using Vladimir Zabrodsky's MODIFIND optimization and the median-of-3 pivot selection and paying some attention to the final steps of the partitioning part of the algorithm, i've came up with the following algorithm (imaginably named "LefSelect"):
#define F_SWAP(a,b) { float temp=(a);(a)=(b);(b)=temp; }
# Note: The code needs more than 2 elements to work
float lefselect(float a[], const int n, const int k) {
int l=0, m = n-1, i=l, j=m;
float x;
while (l<m) {
if( a[k] < a[i] ) F_SWAP(a[i],a[k]);
if( a[j] < a[i] ) F_SWAP(a[i],a[j]);
if( a[j] < a[k] ) F_SWAP(a[k],a[j]);
x=a[k];
while (j>k & i<k) {
do i++; while (a[i]<x);
do j--; while (a[j]>x);
F_SWAP(a[i],a[j]);
}
i++; j--;
if (j<k) {
while (a[i]<x) i++;
l=i; j=m;
}
if (k<i) {
while (x<a[j]) j--;
m=j; i=l;
}
}
return a[k];
}
In benchmarks that i did here, LefSelect is 20-30% faster than QuickSelect.
Haskell Solution:
kthElem index list = sort list !! index
withShape ~[] [] = []
withShape ~(x:xs) (y:ys) = x : withShape xs ys
sort [] = []
sort (x:xs) = (sort ls `withShape` ls) ++ [x] ++ (sort rs `withShape` rs)
where
ls = filter (< x)
rs = filter (>= x)
This implements the median of median solutions by using the withShape method to discover the size of a partition without actually computing it.
Here is a C++ implementation of Randomized QuickSelect. The idea is to randomly pick a pivot element. To implement randomized partition, we use a random function, rand() to generate index between l and r, swap the element at randomly generated index with the last element, and finally call the standard partition process which uses last element as pivot.
#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;
int randomPartition(int arr[], int l, int r);
// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
// If k is smaller than number of elements in array
if (k > 0 && k <= r - l + 1)
{
// Partition the array around a random element and
// get position of pivot element in sorted array
int pos = randomPartition(arr, l, r);
// If position is same as k
if (pos-l == k-1)
return arr[pos];
if (pos-l > k-1) // If position is more, recur for left subarray
return kthSmallest(arr, l, pos-1, k);
// Else recur for right subarray
return kthSmallest(arr, pos+1, r, k-pos+l-1);
}
// If k is more than number of elements in array
return INT_MAX;
}
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
// Standard partition process of QuickSort(). It considers the last
// element as pivot and moves all smaller element to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
int x = arr[r], i = l;
for (int j = l; j <= r - 1; j++)
{
if (arr[j] <= x) //arr[i] is bigger than arr[j] so swap them
{
swap(&arr[i], &arr[j]);
i++;
}
}
swap(&arr[i], &arr[r]); // swap the pivot
return i;
}
// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}
The worst case time complexity of the above solution is still O(n2).In worst case, the randomized function may always pick a corner element. The expected time complexity of above randomized QuickSelect is Θ(n)
Have Priority queue created.
Insert all the elements into heap.
Call poll() k times.
public static int getKthLargestElements(int[] arr)
{
PriorityQueue<Integer> pq = new PriorityQueue<>((x , y) -> (y-x));
//insert all the elements into heap
for(int ele : arr)
pq.offer(ele);
// call poll() k times
int i=0;
while(i<k)
{
int result = pq.poll();
}
return result;
}
This is an implementation in Javascript.
If you release the constraint that you cannot modify the array, you can prevent the use of extra memory using two indexes to identify the "current partition" (in classic quicksort style - http://www.nczonline.net/blog/2012/11/27/computer-science-in-javascript-quicksort/).
function kthMax(a, k){
var size = a.length;
var pivot = a[ parseInt(Math.random()*size) ]; //Another choice could have been (size / 2)
//Create an array with all element lower than the pivot and an array with all element higher than the pivot
var i, lowerArray = [], upperArray = [];
for (i = 0; i < size; i++){
var current = a[i];
if (current < pivot) {
lowerArray.push(current);
} else if (current > pivot) {
upperArray.push(current);
}
}
//Which one should I continue with?
if(k <= upperArray.length) {
//Upper
return kthMax(upperArray, k);
} else {
var newK = k - (size - lowerArray.length);
if (newK > 0) {
///Lower
return kthMax(lowerArray, newK);
} else {
//None ... it's the current pivot!
return pivot;
}
}
}
If you want to test how it perform, you can use this variation:
function kthMax (a, k, logging) {
var comparisonCount = 0; //Number of comparison that the algorithm uses
var memoryCount = 0; //Number of integers in memory that the algorithm uses
var _log = logging;
if(k < 0 || k >= a.length) {
if (_log) console.log ("k is out of range");
return false;
}
function _kthmax(a, k){
var size = a.length;
var pivot = a[parseInt(Math.random()*size)];
if(_log) console.log("Inputs:", a, "size="+size, "k="+k, "pivot="+pivot);
// This should never happen. Just a nice check in this exercise
// if you are playing with the code to avoid never ending recursion
if(typeof pivot === "undefined") {
if (_log) console.log ("Ops...");
return false;
}
var i, lowerArray = [], upperArray = [];
for (i = 0; i < size; i++){
var current = a[i];
if (current < pivot) {
comparisonCount += 1;
memoryCount++;
lowerArray.push(current);
} else if (current > pivot) {
comparisonCount += 2;
memoryCount++;
upperArray.push(current);
}
}
if(_log) console.log("Pivoting:",lowerArray, "*"+pivot+"*", upperArray);
if(k <= upperArray.length) {
comparisonCount += 1;
return _kthmax(upperArray, k);
} else if (k > size - lowerArray.length) {
comparisonCount += 2;
return _kthmax(lowerArray, k - (size - lowerArray.length));
} else {
comparisonCount += 2;
return pivot;
}
/*
* BTW, this is the logic for kthMin if we want to implement that... ;-)
*
if(k <= lowerArray.length) {
return kthMin(lowerArray, k);
} else if (k > size - upperArray.length) {
return kthMin(upperArray, k - (size - upperArray.length));
} else
return pivot;
*/
}
var result = _kthmax(a, k);
return {result: result, iterations: comparisonCount, memory: memoryCount};
}
The rest of the code is just to create some playground:
function getRandomArray (n){
var ar = [];
for (var i = 0, l = n; i < l; i++) {
ar.push(Math.round(Math.random() * l))
}
return ar;
}
//Create a random array of 50 numbers
var ar = getRandomArray (50);
Now, run you tests a few time.
Because of the Math.random() it will produce every time different results:
kthMax(ar, 2, true);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 34, true);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
If you test it a few times you can see even empirically that the number of iterations is, on average, O(n) ~= constant * n and the value of k does not affect the algorithm.
I came up with this algorithm and seems to be O(n):
Let's say k=3 and we want to find the 3rd largest item in the array. I would create three variables and compare each item of the array with the minimum of these three variables. If array item is greater than our minimum, we would replace the min variable with the item value. We continue the same thing until end of the array. The minimum of our three variables is the 3rd largest item in the array.
define variables a=0, b=0, c=0
iterate through the array items
find minimum a,b,c
if item > min then replace the min variable with item value
continue until end of array
the minimum of a,b,c is our answer
And, to find Kth largest item we need K variables.
Example: (k=3)
[1,2,4,1,7,3,9,5,6,2,9,8]
Final variable values:
a=7 (answer)
b=8
c=9
Can someone please review this and let me know what I am missing?
Here is the implementation of the algorithm eladv suggested(I also put here the implementation with random pivot):
public class Median {
public static void main(String[] s) {
int[] test = {4,18,20,3,7,13,5,8,2,1,15,17,25,30,16};
System.out.println(selectK(test,8));
/*
int n = 100000000;
int[] test = new int[n];
for(int i=0; i<test.length; i++)
test[i] = (int)(Math.random()*test.length);
long start = System.currentTimeMillis();
random_selectK(test, test.length/2);
long end = System.currentTimeMillis();
System.out.println(end - start);
*/
}
public static int random_selectK(int[] a, int k) {
if(a.length <= 1)
return a[0];
int r = (int)(Math.random() * a.length);
int p = a[r];
int small = 0, equal = 0, big = 0;
for(int i=0; i<a.length; i++) {
if(a[i] < p) small++;
else if(a[i] == p) equal++;
else if(a[i] > p) big++;
}
if(k <= small) {
int[] temp = new int[small];
for(int i=0, j=0; i<a.length; i++)
if(a[i] < p)
temp[j++] = a[i];
return random_selectK(temp, k);
}
else if (k <= small+equal)
return p;
else {
int[] temp = new int[big];
for(int i=0, j=0; i<a.length; i++)
if(a[i] > p)
temp[j++] = a[i];
return random_selectK(temp,k-small-equal);
}
}
public static int selectK(int[] a, int k) {
if(a.length <= 5) {
Arrays.sort(a);
return a[k-1];
}
int p = median_of_medians(a);
int small = 0, equal = 0, big = 0;
for(int i=0; i<a.length; i++) {
if(a[i] < p) small++;
else if(a[i] == p) equal++;
else if(a[i] > p) big++;
}
if(k <= small) {
int[] temp = new int[small];
for(int i=0, j=0; i<a.length; i++)
if(a[i] < p)
temp[j++] = a[i];
return selectK(temp, k);
}
else if (k <= small+equal)
return p;
else {
int[] temp = new int[big];
for(int i=0, j=0; i<a.length; i++)
if(a[i] > p)
temp[j++] = a[i];
return selectK(temp,k-small-equal);
}
}
private static int median_of_medians(int[] a) {
int[] b = new int[a.length/5];
int[] temp = new int[5];
for(int i=0; i<b.length; i++) {
for(int j=0; j<5; j++)
temp[j] = a[5*i + j];
Arrays.sort(temp);
b[i] = temp[2];
}
return selectK(b, b.length/2 + 1);
}
}
it is similar to the quickSort strategy, where we pick an arbitrary pivot, and bring the smaller elements to its left, and the larger to the right
public static int kthElInUnsortedList(List<int> list, int k)
{
if (list.Count == 1)
return list[0];
List<int> left = new List<int>();
List<int> right = new List<int>();
int pivotIndex = list.Count / 2;
int pivot = list[pivotIndex]; //arbitrary
for (int i = 0; i < list.Count && i != pivotIndex; i++)
{
int currentEl = list[i];
if (currentEl < pivot)
left.Add(currentEl);
else
right.Add(currentEl);
}
if (k == left.Count + 1)
return pivot;
if (left.Count < k)
return kthElInUnsortedList(right, k - left.Count - 1);
else
return kthElInUnsortedList(left, k);
}
Go to the End of this link : ...........
http://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array-set-3-worst-case-linear-time/
You can find the kth smallest element in O(n) time and constant space. If we consider the array is only for integers.
The approach is to do a binary search on the range of Array values. If we have a min_value and a max_value both in integer range, we can do a binary search on that range.
We can write a comparator function which will tell us if any value is the kth-smallest or smaller than kth-smallest or bigger than kth-smallest.
Do the binary search until you reach the kth-smallest number
Here is the code for that
class Solution:
def _iskthsmallest(self, A, val, k):
less_count, equal_count = 0, 0
for i in range(len(A)):
if A[i] == val: equal_count += 1
if A[i] < val: less_count += 1
if less_count >= k: return 1
if less_count + equal_count < k: return -1
return 0
def kthsmallest_binary(self, A, min_val, max_val, k):
if min_val == max_val:
return min_val
mid = (min_val + max_val)/2
iskthsmallest = self._iskthsmallest(A, mid, k)
if iskthsmallest == 0: return mid
if iskthsmallest > 0: return self.kthsmallest_binary(A, min_val, mid, k)
return self.kthsmallest_binary(A, mid+1, max_val, k)
# #param A : tuple of integers
# #param B : integer
# #return an integer
def kthsmallest(self, A, k):
if not A: return 0
if k > len(A): return 0
min_val, max_val = min(A), max(A)
return self.kthsmallest_binary(A, min_val, max_val, k)
What I would do is this:
initialize empty doubly linked list l
for each element e in array
if e larger than head(l)
make e the new head of l
if size(l) > k
remove last element from l
the last element of l should now be the kth largest element
You can simply store pointers to the first and last element in the linked list. They only change when updates to the list are made.
Update:
initialize empty sorted tree l
for each element e in array
if e between head(l) and tail(l)
insert e into l // O(log k)
if size(l) > k
remove last element from l
the last element of l should now be the kth largest element
First we can build a BST from unsorted array which takes O(n) time and from the BST we can find the kth smallest element in O(log(n)) which over all counts to an order of O(n).

Resources