Using different threads in OMP for nested for loops - openmp

Before I start, I should mention that the loops in my original code are not perfectly nested, so I cannot use the "collapse" clause.
I am trying to thread both the outer and inner "for" loops in the following code. The sample below runs, but there is a problem: the inner loop uses the same thread IDs as the outer one, which degrades performance. My computer has 20 threads and I want the idle ones to be used in the inner loop. So far I have not found a solution. Any suggestions?
The output of the sample code looks like this:
iter (0, 0), tid outer loop: 0, tid inner loop: 0
iter (0, 1), tid outer loop: 0, tid inner loop: 0
iter (0, 6), tid outer loop: 0, tid inner loop: 0
iter (1, 0), tid outer loop: 1, tid inner loop: 0
iter (1, 1), tid outer loop: 1, tid inner loop: 0
iter (1, 6), tid outer loop: 1, tid inner loop: 0
iter (1, 4), tid outer loop: 1, tid inner loop: 2
iter (1, 5), tid outer loop: 1, tid inner loop: 2
iter (1, 2), tid outer loop: 1, tid inner loop: 1
iter (1, 3), tid outer loop: 1, tid inner loop: 1
iter (0, 4), tid outer loop: 0, tid inner loop: 2
iter (0, 5), tid outer loop: 0, tid inner loop: 2
iter (0, 2), tid outer loop: 0, tid inner loop: 1
iter (0, 3), tid outer loop: 0, tid inner loop: 1
iter (2, 0), tid outer loop: 2, tid inner loop: 0
iter (3, 4), tid outer loop: 0, tid inner loop: 2
iter (2, 1), tid outer loop: 2, tid inner loop: 0
iter (3, 5), tid outer loop: 0, tid inner loop: 2
iter (4, 2), tid outer loop: 1, tid inner loop: 1
iter (3, 0), tid outer loop: 0, tid inner loop: 0
iter (3, 2), tid outer loop: 0, tid inner loop: 1
iter (4, 4), tid outer loop: 1, tid inner loop: 2
iter (2, 6), tid outer loop: 2, tid inner loop: 0
iter (4, 5), tid outer loop: 1, tid inner loop: 2
iter (2, 2), tid outer loop: 2, tid inner loop: 1
iter (3, 3), tid outer loop: 0, tid inner loop: 1
and so on!
#include <stdio.h>
#include <omp.h>

int main (void)
{
    int i,j,thid;
    #pragma omp parallel for default(none) num_threads(3) private(i,j,thid) schedule(static,1)
    for (i = 0; i < 8; i++){
        thid = omp_get_thread_num();
        #pragma omp parallel for num_threads(3) schedule(static,2)
        for (j = 0; j < 7; j++)
            printf("iter (%0d, %0d), tid outer loop: %0d, tid inner loop: %0d\n", i, j, thid, omp_get_ancestor_thread_num(omp_get_level()) );
    }
    return 0;
}

Your code is fine, but you should set num_threads only once. If your computer can run 20 threads, set num_threads to 16:
#pragma omp parallel for default(none) num_threads(16) private(i,j,thid) schedule(static,1)
for (i = 0; i < 8; i++){
    thid = omp_get_thread_num();
    // remove num_threads here
    #pragma omp parallel for schedule(static,2)
    for (j = 0; j < 7; j++)
        printf("iter (%0d, %0d), tid outer loop: %0d, tid inner loop: %0d\n", i, j, thid, omp_get_ancestor_thread_num(omp_get_level()) );
}


parallelizing C++ code with MPI_send and MPI_recv

I have a parallel code, but I am not sure whether it actually runs in parallel.
I have two vectors A and B whose elements are matrices defined by a custom class.
Since the matrices in the vectors are not of a primitive type, I can't send these vectors to other ranks through MPI_Scatter, so I have to use MPI_Send and MPI_Recv. Also, rank 0 has only a coordination role: it sends the other ranks the blocks they should work on and collects the results at the end, but it does not participate in the computation.
The solution of the exercise is the following:
// rank 0 sends the blocks to the other ranks, which compute the local
// block products, then receives the partial results and prints the global
// vector
if (rank == 0)
{
  // send data
  for (unsigned j = 0; j < N_blocks; ++j)
  {
    int dest = j / local_N_blocks + 1;
    // send number of rows
    unsigned n = A[j].rows();
    MPI_Send(&n, 1, MPI_UNSIGNED, dest, 1, MPI_COMM_WORLD);
    // send blocks
    MPI_Send(A[j].data(), n*n, MPI_DOUBLE, dest, 2, MPI_COMM_WORLD);
    MPI_Send(B[j].data(), n*n, MPI_DOUBLE, dest, 3, MPI_COMM_WORLD);
  }
  // global vector
  std::vector<dense_matrix> C(N_blocks);
  for (unsigned j = 0; j < N_blocks; ++j)
  {
    int root = j / local_N_blocks + 1;
    // receive number of rows
    unsigned n;
    MPI_Recv(&n, 1, MPI_UNSIGNED, root, 4, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    // initialize blocks
    dense_matrix received(n,n);
    // receive blocks
    MPI_Recv(received.data(), n*n, MPI_DOUBLE, root, 5,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // store block in the vector
    C[j] = received;
  }
  // print result
}
// all the other ranks receive the blocks and compute the local block
// products, then send the results to rank 0
else
{
  // local vector
  std::vector<dense_matrix> local_C(local_N_blocks);
  // receive data and compute products
  for (unsigned j = 0; j < local_N_blocks; ++j)
  {
    // receive number of rows (tag 1, matching the send on rank 0)
    unsigned n;
    MPI_Recv(&n, 1, MPI_UNSIGNED, 0, 1, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    // initialize blocks
    dense_matrix local_A(n,n); dense_matrix local_B(n,n);
    // receive blocks (tags 2 and 3, matching the sends on rank 0)
    MPI_Recv(local_A.data(), n*n, MPI_DOUBLE, 0, 2, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    MPI_Recv(local_B.data(), n*n, MPI_DOUBLE, 0, 3, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    // compute product
    local_C[j] = local_A * local_B;
  }
  // send local results
  for (unsigned j = 0; j < local_N_blocks; ++j)
  {
    // send number of rows (tag 4, matching the receive on rank 0)
    unsigned n = local_C[j].rows();
    MPI_Send(&n, 1, MPI_UNSIGNED, 0, 4, MPI_COMM_WORLD);
    // send block
    MPI_Send(local_C[j].data(), n*n, MPI_DOUBLE, 0, 5, MPI_COMM_WORLD);
  }
}
In my opinion, if local_N_blocks = N_blocks / (size - 1) is different from 1, the variable dest does not change value at every loop iteration. So, after the first iteration of the "sending loop", the second time rank 0 reaches
MPI_Send(A[j].data(), n*n, MPI_DOUBLE, dest, 2, MPI_COMM_WORLD);
MPI_Send(B[j].data(), n*n, MPI_DOUBLE, dest, 3, MPI_COMM_WORLD);
it has to wait until the operation local_C[j] = local_A * local_B for the previous j has completed, so the code does not seem well parallelized to me.
What do you think?

How to find minimum number of swaps needed to sort an array? [duplicate]

I'm working on sorting an integer sequence with no identical numbers (without loss of generality, let's assume the sequence is a permutation of 1,2,...,n) into its natural increasing order (i.e. 1,2,...,n). I was thinking about directly swapping elements (any two elements may be swapped, regardless of their positions) with a minimal number of swaps. The following may be a feasible solution:
Swap two elements with the constraint that one or both of them must land in their correct position(s), and repeat until every element is in its correct position.
But I don't know how to prove mathematically that the above solution is optimal. Can anyone help?
I was able to prove this with graph-theory. Might want to add that tag in :)
Create a graph with n vertices. Create an edge from node n_i to n_j if the element in position i should be in position j in the correct ordering. You will now have a graph consisting of several disjoint cycles. I argue that the minimum number of swaps needed to order the sequence correctly is
M = sum over all cycles c of (size(c) - 1)
Take a second to convince yourself of that...if two items are in a cycle, one swap can just take care of them. If three items are in a cycle, you can swap a pair to put one in the right spot, and a two-cycle remains, etc. If n items are in a cycle, you need n-1 swaps. (This is always true even if you don't swap with immediate neighbors.)
Given that, you may now be able to see why your algorithm is optimal. If you do a swap and at least one item is in the right position, then it will always reduce the value of M by 1. For any cycle of length n, consider swapping an element into the correct spot, occupied by its neighbor. You now have a correctly ordered element, and a cycle of length n-1.
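No swap can reduce M by more than 1, because a single swap either splits one cycle into two or merges two cycles into one. Since M is therefore the minimum number of swaps, and your algorithm reduces M by exactly 1 with each swap, it must be optimal.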
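To make the cycle argument concrete, here is a minimal Python sketch. It assumes the input is a permutation of 1..n, and the function names cycle_lower_bound and greedy_swaps are made up for this illustration. The first function computes M from the cycle sizes; the second performs the greedy swaps from the question and counts them, so the two numbers can be compared.

```python
def cycle_lower_bound(perm):
    """Return M = sum over cycles of (size - 1) for a permutation of 1..n."""
    seen = [False] * len(perm)
    m = 0
    for start in range(len(perm)):
        size = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j] - 1  # the value v belongs at index v - 1
            size += 1
        if size:
            m += size - 1
    return m


def greedy_swaps(perm):
    """Sort a copy of perm, putting at least one element in place per swap."""
    a = list(perm)
    swaps = 0
    for i in range(len(a)):
        while a[i] != i + 1:
            target = a[i] - 1
            a[i], a[target] = a[target], a[i]  # a[target] is now correct
            swaps += 1
    return swaps


print(cycle_lower_bound([7, 1, 3, 2, 4, 5, 6]))  # 5
print(greedy_swaps([7, 1, 3, 2, 4, 5, 6]))       # 5
```

The greedy count matches M here because each swap puts one element in place and shortens its cycle by one.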
All the cycle counting is very difficult to keep in your head. There is a way that is much simpler to memorize.
First, let's go through a sample case manually.
Sequence: [7, 1, 3, 2, 4, 5, 6]
Enumerate it: [(0, 7), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (6, 6)]
Sort the enumeration by value: [(1, 1), (3, 2), (2, 3), (4, 4), (5, 5), (6, 6), (0, 7)]
Start from the beginning. While the index is different from the enumerated index keep on swapping the elements defined by index and enumerated index. Remember: swap(0,2);swap(0,3) is the same as swap(2,3);swap(0,2)
swap(0, 1) => [(3, 2), (1, 1), (2, 3), (4, 4), (5, 5), (6, 6), (0, 7)]
swap(0, 3) => [(4, 4), (1, 1), (2, 3), (3, 2), (5, 5), (6, 6), (0, 7)]
swap(0, 4) => [(5, 5), (1, 1), (2, 3), (3, 2), (4, 4), (6, 6), (0, 7)]
swap(0, 5) => [(6, 6), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (0, 7)]
swap(0, 6) => [(0, 7), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5), (6, 6)]
I.e. semantically you sort the elements and then figure out how to put them back into the initial state by swapping through the leftmost item that is out of place.
The Python algorithm is as simple as this:
def swap(arr, i, j):
    arr[i], arr[j] = arr[j], arr[i]

def minimum_swaps(arr):
    annotated = [*enumerate(arr)]
    annotated.sort(key = lambda it: it[1])
    count = 0
    i = 0
    while i < len(arr):
        if annotated[i][0] == i:
            i += 1
            continue
        swap(annotated, i, annotated[i][0])
        count += 1
    return count
Thus, you don't need to memorize visited nodes or compute some cycle length.
For your reference, here is an algorithm that I wrote to compute the minimum number of swaps needed to sort the array. It finds the cycles as described by @Andrew Mao.
/**
 * Finds the minimum number of swaps to sort given array in increasing order.
 * @param ar array of <strong>non-negative distinct</strong> integers.
 *           input array will be overwritten during the call!
 * @return min no of swaps
 */
public int findMinSwapsToSort(int[] ar) {
    int n = ar.length;
    Map<Integer, Integer> m = new HashMap<>();
    for (int i = 0; i < n; i++) {
        m.put(ar[i], i);
    }
    // Sort, then replace each value by its original index.
    Arrays.sort(ar);
    for (int i = 0; i < n; i++) {
        ar[i] = m.get(ar[i]);
    }
    m = null;
    int swaps = 0;
    for (int i = 0; i < n; i++) {
        int val = ar[i];
        if (val < 0) continue; // already visited
        while (val != i) {
            int new_val = ar[val];
            ar[val] = -1;
            val = new_val;
            swaps++;
        }
        ar[i] = -1;
    }
    return swaps;
}
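We do not need to swap the actual elements; we only need to find how many elements belong to each cycle of misplaced elements (Cycle).
The minimum number of swaps for a cycle is Cycle - 1.
Here is the code...
static int minimumSwaps(int[] arr) {
    int swap = 0;
    boolean[] visited = new boolean[arr.length];
    for (int i = 0; i < arr.length; i++) {
        int j = i, cycle = 0;
        // Walk the cycle that contains index i (values are assumed to be 1..n).
        while (!visited[j]) { visited[j] = true; j = arr[j] - 1; cycle++; }
        if (cycle > 0) swap += cycle - 1;
    }
    return swap;
}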
@Archibald, I like your solution, and my initial assumption was also that sorting the array would be the simplest approach, but I don't see the need for the reverse traversal, as I've dubbed it, i.e. enumerating, then sorting the array, and then computing the swaps for the enumerated pairs.
I find it simpler to subtract 1 from each element in the array and then compute the swaps required to sort that list.
Here is my tweak/solution:
def swap(arr, i, j):
    tmp = arr[i]
    arr[i] = arr[j]
    arr[j] = tmp

def minimum_swaps(arr):
    a = [x - 1 for x in arr]
    swaps = 0
    i = 0
    while i < len(a):
        if a[i] == i:
            i += 1
            continue
        swap(a, i, a[i])
        swaps += 1
    return swaps
As for proving optimality, I think @arax has a good point.
// Assuming the sequence is a permutation that starts at zero
function minimumSwaps(arr) {
    var len = arr.length
    var visitedarr = []
    var i, start, j, swap = 0
    for (i = 0; i < len; i++) {
        if (!visitedarr[i]) {
            start = j = i
            var cycleNode = 1
            while (arr[j] != start) {
                j = arr[j]
                visitedarr[j] = true
                cycleNode++
            }
            swap += cycleNode - 1
        }
    }
    return swap
}
I really liked the solution of @Ieuan Uys in Python.
What I improved on his solution:
The while loop runs one iteration fewer to increase speed: while i < len(a) - 1
The swap function is inlined, so everything is one single function.
Extensive code comments are added to increase readability.
My code in Python:
def minimumSwaps(arr):
    # make array values start from zero to match index values.
    a = [x - 1 for x in arr]
    # initialize number of swaps and iterator.
    swaps = 0
    i = 0
    while i < len(a)-1:
        if a[i] == i:
            i += 1
            continue
        tmp = a[i]      # create temp variable and assign a[i] to it
        a[i] = a[tmp]   # assign value of a[tmp] to a[i]
        a[tmp] = tmp    # assign tmp (the initial a[i]) to a[tmp]
        # count the swap.
        swaps += 1
    return swaps
Detailed explanation of what the code does on an array of size n:
We check every value except the last one, one by one. If a value does not match its array index, we send it to the place whose index is equal to the value. For instance, if a[0] = 3, then a[0] and a[3] are swapped, so the value 3 ends up at a[3] where it is supposed to be. The loop then checks a[0] again: if the value that arrived there is still not 0, it is sent to its own place in the same way, and i only advances once a[i] == i. The last element needs no check, because once all the others are in place it must be in place as well.
Real Example
a[4, 2, 1, 0, 3]
#i = 0, check a[0]. 4 should be located at a[4] where the value is 3. Swap them.
a[3, 2, 1, 0, 4] #we sent 4 to the right location now.
#i = 0 again, check a[0]. 3 should be located at a[3] where the value is 0. Swap them.
a[0, 2, 1, 3, 4] #we sent 3 to the right location now, and 0 as well.
#i = 1, check a[1]. 2 should be located at a[2] where the value is 1. Swap them.
a[0, 1, 2, 3, 4] #we sent 2 to the right location now, and 1 as well.
#i = 2 and i = 3: the values are already in place. Don't do anything, continue.
a[0, 1, 2, 3, 4]
# There is no need to check the final value of the array, since all swaps are done (3 in total).
Nicely done solution by @bekce. If using C#, the initial code of setting up the modified array ar can be succinctly expressed as:
var origIndexes = Enumerable.Range(0, n).ToArray();
Array.Sort(ar, origIndexes);
then use origIndexes instead of ar in the rest of the code.
Swift 4 version:
func minimumSwaps(arr: [Int]) -> Int {
    struct Pair {
        let index: Int
        let value: Int
    }

    var positions = arr.enumerated().map { Pair(index: $0, value: $1) }
    positions.sort { $0.value < $1.value }
    var indexes = positions.map { $0.index }

    var swaps = 0
    for i in 0 ..< indexes.count {
        var val = indexes[i]
        if val < 0 {
            continue // Already visited.
        }
        while val != i {
            let new_val = indexes[val]
            indexes[val] = -1
            val = new_val
            swaps += 1
        }
        indexes[i] = -1
    }
    return swaps
}
This is the sample code in C++ that finds the minimum number of swaps to sort a permutation of the sequence of (1,2,3,4,5,.......n-2,n-1,n)
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    int n,i,j,k,num = 0;
    cin >> n;
    vector<int> arr(n+1);
    for(i = 1;i <= n;++i)cin >> arr[i];
    for(i = 1;i <= n;++i)
    {
        if(i != arr[i])// condition to check if an element is in a cycle or not
        {
            j = arr[i];
            arr[i] = 0;
            while(j != 0)// Here I am traversing a cycle as mentioned in
            {            // the first answer
                k = arr[j];
                arr[j] = j;
                j = k;
                num++;// one node of the cycle is put in place each time
            }
            num--;// a cycle of k nodes needs only k-1 swaps
        }
    }
    for(i = 1;i <= n;++i)cout << arr[i] << " ";cout << endl;
    cout << num << endl;
    return 0;
}
Solution using Javascript.
First I set all the elements with their current index that need to be ordered, and then I iterate over the map to order only the elements that need to be swapped.
function minimumSwaps(arr) {
    const mapUnorderedPositions = new Map()
    for (let i = 0; i < arr.length; i++) {
        if (arr[i] !== i+1) {
            mapUnorderedPositions.set(arr[i], i)
        }
    }

    let minSwaps = 0
    while (mapUnorderedPositions.size > 1) {
        const currentElement = mapUnorderedPositions.entries().next().value
        const x = currentElement[0]
        const y = currentElement[1]

        // Skip element in map if it's already ordered
        if (x-1 !== y) {
            // Update unordered position index of swapped element
            mapUnorderedPositions.set(arr[x-1], y)

            // swap in array
            arr[y] = arr[x-1]
            arr[x-1] = x

            // Increment swaps
            minSwaps++
        }

        // x is in its final position now, so drop it from the map
        mapUnorderedPositions.delete(x)
    }

    return minSwaps
}
If you have an input like 7 2 4 3 5 6 1, this is how the debugging will go:
Map { 7 => 0, 4 => 2, 3 => 3, 1 => 6 }
currentElement [ 7, 0 ]
swapping 1 with 7
[ 1, 2, 4, 3, 5, 6, 7 ]
currentElement [ 4, 2 ]
swapping 3 with 4
[ 1, 2, 3, 4, 5, 6, 7 ]
currentElement [ 3, 2 ]
minSwaps = 2
Finding the minimum number of swaps required to put a permutation of 1..N in order.
We can use the fact that we know what the sorted result will be: 1..N. This means we don't actually have to perform the swaps, just count them.
The shuffling of 1..N is called a permutation, and is composed of disjoint cyclic permutations, for example, this permutation of 1..6:
1 2 3 4 5 6
6 4 2 3 5 1
Is composed of the cyclic permutations (1,6)(2,4,3)(5)
1->6(->1) cycle: 1 swap
2->4->3(->2) cycle: 2 swaps
5(->5) cycle: 0 swaps
So a cycle of k elements requires k-1 swaps to put in order.
Since we know where each element "belongs" (i.e. value k belongs at position k-1) we can easily traverse the cycle. Start at 0, we get 6, which belongs at 5, and there we find 1, which belongs at 0 and we're back where we started.
To avoid re-counting a cycle later, we track which elements were visited - alternatively you could perform the swaps so that the elements are in the right place when you visit them later.
The resulting code:
def minimumSwaps(arr):
    visited = [False] * len(arr)
    numswaps = 0
    for i in range(len(arr)):
        if not visited[i]:
            visited[i] = True
            j = arr[i]-1
            while not visited[j]:
                numswaps += 1
                visited[j] = True
                j = arr[j]-1
    return numswaps
An implementation on integers with primitive types in Java (and tests).
import java.util.Arrays;

public class MinSwaps {
    public static int computate(int[] unordered) {
        int size = unordered.length;
        int[] ordered = order(unordered);
        int[] realPositions = realPositions(ordered, unordered);
        boolean[] touchs = new boolean[size];
        Arrays.fill(touchs, false);
        int i;
        int landing;
        int swaps = 0;

        for(i = 0; i < size; i++) {
            if(!touchs[i]) {
                landing = realPositions[i];
                while(!touchs[landing]) {
                    touchs[landing] = true;
                    landing = realPositions[landing];
                    if(!touchs[landing]) { swaps++; }
                }
            }
        }
        return swaps;
    }

    private static int[] realPositions(int[] ordered, int[] unordered) {
        int i;
        int[] positions = new int[unordered.length];
        for(i = 0; i < unordered.length; i++) {
            positions[i] = position(ordered, unordered[i]);
        }
        return positions;
    }

    private static int position(int[] ordered, int value) {
        int i;
        for(i = 0; i < ordered.length; i++) {
            if(ordered[i] == value) {
                return i;
            }
        }
        return -1;
    }

    private static int[] order(int[] unordered) {
        int[] ordered = unordered.clone();
        Arrays.sort(ordered);
        return ordered;
    }
}

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class MinimumSwapsSpec {
    @Test
    public void example() {
        // setup
        int[] unordered = new int[] { 40, 23, 1, 7, 52, 31 };
        // run
        int minSwaps = MinSwaps.computate(unordered);
        // verify
        assertEquals(5, minSwaps);
    }

    @Test
    public void example2() {
        // setup
        int[] unordered = new int[] { 4, 3, 2, 1 };
        // run
        int minSwaps = MinSwaps.computate(unordered);
        // verify
        assertEquals(2, minSwaps);
    }

    @Test
    public void example3() {
        // setup
        int[] unordered = new int[] {1, 5, 4, 3, 2};
        // run
        int minSwaps = MinSwaps.computate(unordered);
        // verify
        assertEquals(2, minSwaps);
    }
}
Swift 4.2:
func minimumSwaps(arr: [Int]) -> Int {
    let sortedValueIdx = arr.sorted().enumerated()
        .reduce(into: [Int: Int](), { $0[$1.element] = $1.offset })

    var checked = Array(repeating: false, count: arr.count)
    var swaps = 0

    for idx in 0 ..< arr.count {
        if checked[idx] { continue }

        var edges = 1
        var cursorIdx = idx
        while true {
            let cursorEl = arr[cursorIdx]
            let targetIdx = sortedValueIdx[cursorEl]!
            if targetIdx == idx {
                break
            } else {
                cursorIdx = targetIdx
                edges += 1
            }
            checked[targetIdx] = true
        }
        swaps += edges - 1
    }

    return swaps
}
Python code
A = [4,3,2,1]
count = 0
for i in range(len(A)):
    # find the smallest remaining element (selection sort)
    min_idx = i
    for j in range(i+1, len(A)):
        if A[min_idx] > A[j]:
            min_idx = j
    if min_idx > i:
        A[i], A[min_idx] = A[min_idx], A[i]
        count = count + 1
print("Swap required : %d" % count)
In JavaScript
If the values in the array start at 1:
function minimumSwaps(arr) {
    var len = arr.length
    var visitedarr = []
    var i, start, j, swap = 0
    for (i = 0; i < len; i++) {
        if (!visitedarr[i]) {
            start = j = i
            var cycleNode = 1
            while (arr[j] != start + 1) {
                j = arr[j] - 1
                visitedarr[j] = true
                cycleNode++
            }
            swap += cycleNode - 1
        }
    }
    return swap
}
Otherwise, for input starting at 0:
function minimumSwaps(arr) {
    var len = arr.length
    var visitedarr = []
    var i, start, j, swap = 0
    for (i = 0; i < len; i++) {
        if (!visitedarr[i]) {
            start = j = i
            var cycleNode = 1
            while (arr[j] != start) {
                j = arr[j]
                visitedarr[j] = true
                cycleNode++
            }
            swap += cycleNode - 1
        }
    }
    return swap
}
This just extends Darshan Puttaswamy's code for the current HackerEarth inputs.
Here's a solution in Java for what @Archibald has already explained.
static int minimumSwaps(int[] arr){
    int swaps = 0;
    int[] arrCopy = arr.clone();
    HashMap<Integer, Integer> originalPositionMap
            = new HashMap<>();
    for(int i = 0 ; i < arr.length ; i++){
        originalPositionMap.put(arr[i], i);
    }
    // Sort, then count the swaps needed to get back to the original order.
    Arrays.sort(arr);
    for(int i = 0 ; i < arr.length ; i++){
        while(arr[i] != arrCopy[i]){
            int temp = arr[i];
            arr[i] = arr[originalPositionMap.get(temp)];
            arr[originalPositionMap.get(temp)] = temp;
            swaps += 1;
        }
    }
    return swaps;
}
def swap_sort(arr)
  changes = 0
  loop do
    # Find a number that is out-of-place
    _, i = arr.each_with_index.find { |val, index| val != (index + 1) }
    if i != nil
      # If such a number is found, then `j` is the position that the out-of-place number points to.
      j = arr[i] - 1
      # Swap the out-of-place number with number from position `j`.
      arr[i], arr[j] = arr[j], arr[i]
      # Increase swap counter.
      changes += 1
    else
      # If there are no out-of-place numbers, it means the array is sorted, and we're done.
      return changes
    end
  end
end
Apple Swift version 5.2.4
func minimumSwaps(arr: [Int]) -> Int {
    var swapCount = 0
    var arrayPositionValue = [(Int, Int)]()
    var visitedDictionary = [Int: Bool]()

    for (index, number) in arr.enumerated() {
        arrayPositionValue.append((index, number))
        visitedDictionary[index] = false
    }

    arrayPositionValue = arrayPositionValue.sorted{ $0.1 < $1.1 }

    for i in 0..<arr.count {
        var cycleSize = 0
        var visitedIndex = i

        while !visitedDictionary[visitedIndex]! {
            visitedDictionary[visitedIndex] = true
            visitedIndex = arrayPositionValue[visitedIndex].0
            cycleSize += 1
        }

        if cycleSize > 0 {
            swapCount += cycleSize - 1
        }
    }

    return swapCount
}
Go version 1.17:
func minimumSwaps(arr []int32) int32 {
    var swap int32
    for i := 0; i < len(arr); i++ {
        // Values are assumed to be a permutation of 1..n, so value v belongs at index v-1.
        for arr[i] != int32(i+1) {
            j := arr[i] - 1
            arr[i], arr[j] = arr[j], arr[i]
            swap++
        }
    }
    return swap
}

How do I account for duplicate values when solving the two sum problem using a hash table?

Say I have the classic Two Sum problem, but with a twist.
Given a list of integers and a target, I need to print all the pairs of values that add up to the target:
Without repeating symmetrical pairs
Without reusing a value
I am trying to avoid the brute-force approach for obvious reasons, so I implemented a hash map with each value as the key and the frequency of that value in the original array as the element. How do I get the algorithm to print each value pair only once?
function findPairs(arr, target){
    let hashMap = {};
    let results = [];
    for(let i = 0; i < arr.length; i++){
        hashMap[arr[i]] = 1;
    }
    for(let i = 0; i < arr.length; i++){
        let diff = target - arr[i];
        if(hashMap.hasOwnProperty(diff) && hashMap[diff] > 0){
            results.push([arr[i], diff]);
        }
    }
}
findPairs([1, 3, -1, 11, 7], 10);
findPairs([5, 5, 5, 5, 5], 10);
findPairs([1, 3, -1, 11, 7], 10)
(3, 7)
(-1, 11)
findPairs([5, 5, 5], 10)
(5, 5)
findPairs([5, 5, 5, 5], 10)
(5, 5)
(5, 5)
findPairs([5, 5, 5, 5, 5], 10)
(5, 5)
(5, 5)
findPairs([5, 5, 5, 5, 5, 5 ], 10)
(5, 5)
(5, 5)
(5, 5)
This is a summary of the question as far as I understood it:
Your array can have duplicate elements, e.g. [1, 2, 3, 2, 4]
You want to print duplicate pairs, e.g. for [4, 1, 2, 3, 2, 4] print (2, 4), (2, 4)
#include <map>
#include <utility>
#include <vector>
using namespace std;

vector<pair<int, int> > findPairs(vector<int> arr, int target){
    int size = arr.size();
    // C++ map uses a Red Black Tree and assigns 0 as default if the key is not present
    map<int, int> hashMap;
    for(int i = 0; i < size; i++){
        hashMap[arr[i]]++;   // frequency of each value
    }
    /** Used to store the result in (int, int) form
     * Vector is a dynamic array
     */
    vector<pair<int, int> > results;
    for(int i = 0; i < size; i++){
        if(hashMap[arr[i]] == 0)
            continue;        // this occurrence was already used in a pair
        int diff = target - arr[i];
        hashMap[arr[i]]--;   // reserve the current value
        if(hashMap[diff] >= 1){
            results.push_back(make_pair(arr[i], diff));
            hashMap[diff]--; // consume the partner
        }
    }
    return results;
}
This code is based on the examples you have provided in the question.
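For comparison, here is a minimal Python sketch of the same frequency-count idea. The name find_pairs and the choice of collections.Counter are assumptions made for this illustration, not part of the original question. Each occurrence is consumed at most once, which appears to reproduce the outputs listed in the question.

```python
from collections import Counter


def find_pairs(values, target):
    """Return pairs summing to target, using each occurrence at most once."""
    remaining = Counter(values)  # value -> number of unused occurrences
    pairs = []
    for x in values:
        if remaining[x] == 0:
            continue  # this occurrence was already consumed as a partner
        remaining[x] -= 1  # reserve x so it cannot pair with itself
        partner = target - x
        if remaining[partner] > 0:
            remaining[partner] -= 1
            pairs.append((x, partner))
    return pairs


print(find_pairs([1, 3, -1, 11, 7], 10))  # [(3, 7), (-1, 11)]
print(find_pairs([5, 5, 5, 5, 5], 10))    # [(5, 5), (5, 5)]
```

Reserving x before looking up its partner is what stops a single 5 from pairing with itself, and skipping values whose count has reached zero is what suppresses the symmetrical pair.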

Different ways to select ordered triplets from an array of N integers

Given an array A of n integers, I want to count the distinct ordered triplets that can be selected from it (keeping the elements in array order). For example:
A = [1, 2, 1, 1]
different ways are (1, 2, 1), (1, 1, 1) and (2, 1, 1)
so the answer will be 3.
for A = [2, 2, 1, 2, 2]
different ways are (1, 2, 2), (2, 1, 2), (2, 2, 1) and (2, 2, 2)
so the answer will be 4 in this case
If all the numbers are unique then I have come up with a recurrence
f(n) = f(n-1) + ((n-1) * (n-2))/2
where f(3) = 1 and f(2) = f(1) = 0
I am having trouble when numbers are repeated. This needs to be solved in O(n) time and O(n) space.
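As a quick sanity check of the recurrence for the all-unique case, the sketch below evaluates f(n) iteratively and compares it with the binomial coefficient C(n, 3), which is the number of ways to pick 3 of n distinct elements in order of position. The function name f follows the question; comparing against math.comb is an addition made here.

```python
from math import comb


def f(n):
    """Evaluate f(n) = f(n-1) + (n-1)(n-2)/2 with f(1) = f(2) = 0, f(3) = 1."""
    total = 0
    for m in range(3, n + 1):
        total += (m - 1) * (m - 2) // 2
    return total


print([f(n) for n in range(1, 8)])  # [0, 0, 1, 4, 10, 20, 35]
print(all(f(n) == comb(n, 3) for n in range(1, 100)))  # True
```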
The dynamic programming relation for the number of unique, ordered sets from an array of size idx is:
DP[size of set][idx] = DP[size of set][idx-1] + DP[size of set - 1][idx-1] - DP[size of set - 1][last_idx[A[idx]] - 1]
So, to calculate the number of ordered, unique sets of size LEN from an array of idx elements:
Take the number of ordered, unique sets of size LEN that can be created from an array of idx-1 elements
Add the number of ordered, unique sets that can be formed by adding element idx to the end of ordered, unique sets of size LEN-1
Don’t double count. Subtract the number of ordered, unique sets that can be formed by adding the PREVIOUS occurrence of element idx to the end of ordered, unique sets of size LEN-1.
This works because we are always counting unique sets as we go through the array. Counting the unique sets builds on the counts of unique sets for the previous elements.
So, start with sets of size 1, then do size 2, then size 3, etc.
For unique, ordered sets of constant size LEN, my function takes O(LEN * N) memory and O(LEN * N) time. You should be able to reuse the DP array to reduce the memory to a constant independent of LEN, O(constant * N).
Here is the function.
static int answer(int[] A) {
    // This example is for 0 <= A[i] <= 9. For an array of arbitrary integers, use a proper
    // HashMap instead of an array as a HashMap. Alternatively, one could compress the input array
    // down to distinct, consecutive numbers. Either way max memory of the last_idx array is O(n).
    // This is left as an exercise to the reader.
    final int MAX_INT_DIGIT = 10;
    final int SUBSEQUENCE_LENGTH = 3; // triplets
    int n = A.length;
    int[][] dp = new int[SUBSEQUENCE_LENGTH][n];
    int[] last_idx = new int[MAX_INT_DIGIT];
    Arrays.fill(last_idx, -1);

    // Init dp[0], which gives the number of distinct sets of length 1 in A[0..i]
    dp[0][0] = 1;
    last_idx[A[0]] = 0;
    for (int i = 1; i < n; i++) {
        if (last_idx[A[i]] == -1) {
            dp[0][i] = dp[0][i - 1] + 1;
        } else {
            dp[0][i] = dp[0][i - 1];
        }
        last_idx[A[i]] = i;
    }

    for (int ss_len = 1; ss_len < SUBSEQUENCE_LENGTH; ss_len++) {
        Arrays.fill(last_idx, -1);
        last_idx[A[0]] = 0;
        for (int i = 1; i < n; i++) {
            if (last_idx[A[i]] <= 0) {
                dp[ss_len][i] = dp[ss_len][i - 1] + dp[ss_len-1][i - 1];
            } else {
                dp[ss_len][i] = dp[ss_len][i - 1] + dp[ss_len-1][i - 1] - dp[ss_len-1][last_idx[A[i]] - 1];
            }
            last_idx[A[i]] = (i);
        }
    }
    return dp[SUBSEQUENCE_LENGTH-1][n - 1];
}
For [3 1 1 3 8 0 5 8 9 0] the answer I get is 62.
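The same recurrence can be sketched in Python with a dictionary for the last occurrence, so arbitrary integers work. This is a variation on the function above rather than a translation of it: the name count_distinct_subsequences is made up here, the sketch starts from the empty subsequence (size 0) so that size 1 needs no special case, and it keeps only two rows of the table.

```python
def count_distinct_subsequences(a, length=3):
    """Count distinct ordered subsequences of the given length in a."""
    n = len(a)
    # prev[i] = number of distinct subsequences of size k-1 in a[:i];
    # for size 0 that is always 1 (the empty subsequence).
    prev = [1] * (n + 1)
    for _ in range(length):
        cur = [0] * (n + 1)
        last = {}  # value -> 1-based position of its previous occurrence
        for i in range(1, n + 1):
            x = a[i - 1]
            cur[i] = cur[i - 1] + prev[i - 1]
            if x in last:
                cur[i] -= prev[last[x] - 1]  # do not double count
            last[x] = i
        prev = cur
    return prev[n]


print(count_distinct_subsequences([1, 2, 1, 1]))     # 3
print(count_distinct_subsequences([2, 2, 1, 2, 2]))  # 4
```

For small inputs the result can be cross-checked against a brute force such as len(set(itertools.combinations(a, 3))).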

Is the time complexity of this function O(N)?

What is the time complexity of this function? The function returns minimum value in the array. I think it's O(N) but I can't prove it. Any help would be appreciated!
int foo(int arr[], int N)
{
    int k = 1;
    while(k < N)
    {
        for(int i = 0; i+k<N; i+=2*k)
            if(arr[i] > arr[i+k])
                swap(arr[i], arr[i+k]); //swap values in arr[i] and arr[i+k]
        k = k*2;
    }
    return arr[0];
}
It's O(N).
It may seem like O(N log N) at first glance, but:
1st pass of the inner for loop (k = 1): i = 0, 2, 4, 6, 8, ... i.e. about N/2 operations
2nd pass of the inner for loop (k = 2): i = 0, 4, 8, 12, 16, ... i.e. about N/4 operations
3rd pass of the inner for loop (k = 4): i = 0, 8, 16, 24, 32, ... i.e. about N/8 operations
4th pass of the inner for loop (k = 8): i = 0, 16, 32, 48, 64, ... i.e. about N/16 operations
N/2 + N/4 + N/8 + N/16 + ... ≤ N(1/2 + 1/4 + 1/8 + 1/16 + ...) = N
The outer while loop itself runs only about log2(N) times, which is dominated by N.
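The geometric-series argument can be checked empirically with a short Python sketch that mirrors the loop structure and counts comparisons. The name tournament_min is made up for this illustration, and the sketch assumes the comparison is meant to be between arr[i] and arr[i+k].

```python
def tournament_min(values):
    """Mirror the loop from the question; return (minimum, comparisons)."""
    arr = list(values)
    n = len(arr)
    comparisons = 0
    k = 1
    while k < n:
        # i takes the values 0, 2k, 4k, ... while i + k < n
        for i in range(0, n - k, 2 * k):
            comparisons += 1
            if arr[i] > arr[i + k]:
                arr[i], arr[i + k] = arr[i + k], arr[i]
        k *= 2
    return arr[0], comparisons


print(tournament_min([5, 3, 8, 1, 9, 2, 7, 4]))  # (1, 7)
```

For an input of length N the count comes out to N - 1 comparisons, which is linear rather than N log N.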
