Merging arrays containing duplicate elements using a PRAM algorithm

I am learning PRAM algorithms and am stuck on one question: "There exists an algorithm which, given any two sorted m-element arrays of integers, where each integer belongs to the set {1, 2, 3, ..., m} and where duplicate elements are allowed, merges the two arrays in O(1) time using a PRAM with m common CRCW processors."
For example, with m = 4, it could merge the arrays <1,2,3,3> and <1,3,3,4> in O(1) time using 4 common CRCW processors.
Please reply,
Thanks

Since you have M processors and M-length arrays, and I don't see any mention that the final array should be sorted:
each processor can take one value from the first array and one value from the second array and put them into the final array, and this operation will be O(1).
Take a look at this pseudo-code:
int array1[M] = {1, 2, 3, 4};
int array2[M] = {1, 3, 3, 4};
int output[M * 2] = {};
parallel for (i=0; i<M; i++) // each iteration of this loop runs on its own processor, so all iterations run at the same time and will finish in O(1) time simultaneously
{
    output[i * 2] = array1[i];
    output[i * 2 + 1] = array2[i];
}
So the operation in the loop is obviously O(1). In ordinary sequential programming the final complexity would be O(M) because of the loop, but with M processors it is only O(1).

So we have M Concurrent Read, Concurrent Write processors and 2 arrays: a, b.
We can do this (a kind of sorting by counting the appearances of each number):
//index 0 1 2 3 4 5
int a[M] = {1, 1, 1, 1, 2, 6};
int b[M] = {1, 3, 3, 4, 4, 8};
int o[M * 2] = {};
int tem1[M * 2] = {};
int tem2[M * 2] = {};
parallel for (i=0; i<M; i++)
// each iteration of this loop runs on its own processor, so all iterations run at the same
// time and will finish in O(1) time simultaneously
// at reading/writing from/in the same location, processor i has higher priority than processor i+1, operations are queued
{
    // step 1 (depending on how often the values repeat, some processors will wait for
    // others with higher priority before performing their operations)
    tem1[a[i]]++;
    tem2[b[i]]++;
    // index: 0 1 2 3 4 5 6 7 8 9 10 11 12
    // -> tem1: 0 4 1 0 0 0 1 0 0 0 0 0 0
    // -> tem2: 0 1 0 2 2 0 0 0 1 0 0 0 0
    // step 2
    // again, some processors might wait until they perform their operations, because they access the same memory location
    o[tem1[a[i]]+tem2[a[i]]-1] = a[i];
    tem1[a[i]]--;
    o[tem1[b[i]]+tem2[b[i]]-1] = b[i];
    tem2[b[i]]--;
    // index: 0 1 2 3 4 5 6 7 8 9 10 11 12
    // -> o: 1 1 1 1 1 2 3 3 4 4 6 8
}
-> no loops, constant number of operations for each processor -> O(1)
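To make the counting idea concrete, here is a minimal sequential sketch in Python. It is an illustration only: it runs in O(m) on a single processor and does not model CRCW write conflicts or the O(1) parallel bound. The function name `counting_merge` and the `max_value` parameter are assumptions made for this example.

```python
def counting_merge(a, b, max_value):
    """Merge two sorted lists of integers in 1..max_value by counting occurrences."""
    # counts[v] = how many times v appears in a and b together
    counts = [0] * (max_value + 1)
    for value in a:
        counts[value] += 1
    for value in b:
        counts[value] += 1
    # Emit each value as many times as it was counted, in increasing order
    merged = []
    for value in range(1, max_value + 1):
        merged.extend([value] * counts[value])
    return merged


print(counting_merge([1, 2, 3, 3], [1, 3, 3, 4], 4))
```

Because every value lies in 1..m, the count array needs only m + 1 slots, which is what makes a counting approach attractive for this question.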

Related

How to solve M times prefix sum with better time complexity

The problem is to compute the prefix sums of an array of length N, repeating the process M times.
Example 1:
N=3
M=4
array = 1 2 3
output = 1 6 21
Explanation:
Step 1 prefix Sum = 1 3 6
Step 2 prefix sum = 1 4 10
Step 3 prefix sum = 1 5 15
Step 4(M) prefix sum = 1 6 21
Example 2:
N=5
M=3
array = 1 2 3 4 5
output = 1 5 15 35 70
I was not able to solve the problem and kept getting time limit exceeded. I used dynamic programming to solve it in O(NM) time. I looked around and found the following general mathematical solution, but I am still not able to solve it because my math isn't good enough to understand it. Can someone solve it with a better time complexity?
https://math.stackexchange.com/questions/234304/sum-of-the-sum-of-the-sum-of-the-first-n-natural-numbers
Hint: 3, 4, 5 and 6, 10, 15 are sections of diagonals on Pascal's Triangle.
JavaScript code:
function f(n, m) {
  const result = [1];
  for (let i = 1; i < n; i++)
    result.push(result[i-1] * (m + i + 1) / i);
  return result;
}
console.log(JSON.stringify(f(3, 4)));
console.log(JSON.stringify(f(5, 3)));
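The function above depends only on n and m, so it applies when the input is 1, 2, ..., N, as in both examples. For an arbitrary input array, the Pascal's Triangle observation generalizes: after m rounds, the coefficient of input element j in output element i is C(m - 1 + i - j, i - j). The Python sketch below checks that against the naive O(NM) method. The function names are illustrative assumptions, and the direct sum shown here costs O(N^2), so it only helps when M is large compared to N.

```python
from itertools import accumulate
from math import comb


def repeated_prefix_sum_naive(values, m):
    """Apply the prefix-sum operation m times: O(N * M)."""
    for _ in range(m):
        values = list(accumulate(values))
    return list(values)


def repeated_prefix_sum_binomial(values, m):
    """Same result via binomial coefficients: O(N^2), independent of M."""
    n = len(values)
    # coeffs[d] is the weight of an element d positions to the left
    coeffs = [comb(m - 1 + d, d) for d in range(n)]
    return [sum(coeffs[i - j] * values[j] for j in range(i + 1)) for i in range(n)]


print(repeated_prefix_sum_naive([1, 2, 3], 4))
print(repeated_prefix_sum_binomial([1, 2, 3, 4, 5], 3))
```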

Codility Peaks: what's wrong with my Go implementation in the performance tests?

Divide an array into the maximum number of same-sized blocks, each of which should contain an index P such that A[P - 1] < A[P] > A[P + 1].
My solution is written in Go.
However, some of the performance tests fail for no apparent reason. Can anyone offer suggestions?
func Solution(A []int) int {
    peaks := make([]int, 0)
    for i := 1; i < len(A)-1; i++ {
        if A[i] > A[i-1] && A[i] > A[i+1] {
            peaks = append(peaks, i)
        }
    }
    if len(peaks) <= 0 {
        return 0
    }
    maxBlocks := 0
    // we only loop through the possible block sizes which are less than
    // the size of peaks, in other words, we have to ensure at least one
    // peak inside each block
    for i := 1; i <= len(peaks); i++ {
        // if i is not the divisor of len(A), which means the A is not
        // able to be equally divided, we ignore them;
        if len(A)%i != 0 {
            continue
        }
        // we got the block size
        di := len(A) / i
        peakState := 0
        k := 0
        // this loop is for verifying whether each block has at least one
        // peak by checking the peak is inside A[k]~A[2k]-1
        // if current peak is not valid, we step down the next peak until
        // valid, then we move to the next block for finding valid peak;
        // once all the peaks are consumed, we can verify whether all the
        // blocks are valid with peak inside by checking the k value,
        // if k reaches the
        // final state, we can make sure that this solution is acceptable
        for {
            if peakState > len(peaks)-1 {
                break
            }
            if k >= i {
                break
            }
            if peaks[peakState] >= di*k && peaks[peakState] <= di*(k+1)-1 {
                peakState++
            } else {
                k++
            }
        }
        // if all peaks are checked truly inside the block, we can make
        // sure this divide solution is acceptable and record it in the
        // global max block size
        if k == i-1 && peakState == len(peaks) {
            maxBlocks = i
        }
    }
    return maxBlocks
}
Thanks for adding more comments to your code. The idea seems to make sense. If the judge is reporting a wrong answer, I would try it with random data and some edge cases and a brute-force control to see if you can catch a failing example that's reasonably sized, and analyse what is wrong.
My own thought about a possible approach so far was to record a prefix array so as to tell in O(1) if a block has a peak. Add 1 if the element is a peak, 0 otherwise. For input,
1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2
we would have:
1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2
0 0 0 1 1 2 2 2 2 2 3 3
now when we divide, we know if a block contains a peak if its relative sum is positive:
   1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2
0| 0  0  0  1| 1  2  2  2| 2  2  3  3
a           b           c           d
If the first block did not contain a peak, we would expect b - a to equal 0 but instead we get 1, meaning there's a peak. This method would guarantee O(num blocks) for each divisor test.
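A minimal Python sketch of the prefix-count idea follows. It assumes a prefix array with a leading 0 (so it has len(a) + 1 entries), and the function name `max_blocks` is made up for this example. It is one way to realize the check described above, not a tuned solution.

```python
def max_blocks(a):
    """Return the maximum number of equal-sized blocks that each contain a peak."""
    n = len(a)
    # prefix[i] = number of peaks among indices 0 .. i-1
    prefix = [0] * (n + 1)
    for i in range(n):
        is_peak = 0 < i < n - 1 and a[i - 1] < a[i] > a[i + 1]
        prefix[i + 1] = prefix[i] + (1 if is_peak else 0)
    # There cannot be more blocks than peaks, so start from that bound
    for blocks in range(prefix[n], 0, -1):
        if n % blocks != 0:
            continue
        size = n // blocks
        # A block holds a peak exactly when the prefix count grows across it
        if all(prefix[(k + 1) * size] - prefix[k * size] > 0 for k in range(blocks)):
            return blocks
    return 0


print(max_blocks([1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2]))
```

Each candidate block count is validated with one subtraction per block, which is the O(num blocks) cost mentioned above.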
The second thing I would try is to iterate from the smallest divisor (largest block size) to the largest divisor (smallest block size), but skip divisors that can be divided by a smaller divisor that failed validation. For example, if 2 succeeded but 3 failed, there's no way 6 can succeed, but 4 still could.
1 2 3 4 5 6 7 8 9 10 11 12
2 |
3 | |
6 | | | | |
4 x |x | x| x

Summing elements of a set of numbers to a given number

I have been struggling to come up with an algorithm to solve this problem.
Let's say I have a set of numbers {1, 2, 5}, each element of which is in unlimited supply, and I am given another number, 6, and asked to determine the number of ways the elements can be summed to get 6. For illustration:
1 + 1 + 1 + 1 + 1 + 1 = 6
1 + 1 + 2 + 2 = 6
2 + 2 + 2 = 6
1 + 5 = 6
1 + 1 + 1 + 1 + 2 = 6
So in this case the program will output 5 as the number of ways. Again, let's say you are to find the sums for 4:
1 + 1 + 1 + 1 = 4
2 + 2 = 4
1 + 1 + 2 = 4
In this case the algorithm will output 3 as the number of ways.
This is similar to the subset sum problem. I am sure you have to use the branch and bound method or the backtracking method.
1) Create a state space tree which consists of all possible cases.
        0
      / | \
     1  2  5
   / | \
  1  2  5 ........
2) Continue the process in a depth-first manner until the sum of the nodes on the current path is greater than or equal to your desired number.
3) Count the number of full branches that satisfy your condition.
The python implementation of similar problem can be found here.
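The steps above can be sketched as a depth-first count in Python. This is a hedged illustration, not the linked implementation: the function name `count_ways` and the `start` parameter are assumptions. Restricting each branch to numbers at or after the current position avoids counting 1 + 5 and 5 + 1 separately.

```python
def count_ways(numbers, target, start=0):
    """Count unordered ways to reach target using numbers with unlimited supply."""
    if target == 0:
        return 1  # a full branch that hits the target exactly
    total = 0
    for idx in range(start, len(numbers)):
        # Prune branches whose sum would exceed the target
        if numbers[idx] <= target:
            # Passing idx (not idx + 1) allows the same number to be reused
            total += count_ways(numbers, target - numbers[idx], idx)
    return total


print(count_ways([1, 2, 5], 6))
print(count_ways([1, 2, 5], 4))
```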
This is a good problem to use recursion and dynamic programming techniques. Here is an implementation in Python using the top-down approach (memoization) to avoid doing the same calculation multiple times:
# Remember answers for subsets
cache = {}
# Return the ways to get the desired sum from combinations of the given numbers
def possible_sums(numbers, desired_sum):
    # See if we have already calculated this possibility
    key = (tuple(set(numbers)), desired_sum)
    if key in cache:
        return cache[key]
    answers = {}
    for n in numbers:
        if desired_sum % n == 0:
            # The sum is a multiple of the number
            answers[tuple([n] * (desired_sum // n))] = True
        if n < desired_sum:
            for a in possible_sums(numbers, desired_sum - n):
                answers[tuple([n] + a)] = True
    cache[key] = [list(k) for k in answers]
    return cache[key]
# Return only distinct combinations of sums, ignoring order
def unique_possible_sums(numbers, desired_sum):
    answers = {}
    for s in possible_sums(numbers, desired_sum):
        answers[tuple(sorted(s))] = True
    return [list(k) for k in answers]
for s in unique_possible_sums([1, 2, 5], 6):
    print('6: ' + repr(s))
for s in unique_possible_sums([1, 2, 5], 4):
    print('4: ' + repr(s))
For a smaller target number (~1000000) and a supply of about 1000 numbers, try this:
The supply of numbers you have
supply {a,b,c....}
The target you need
steps[n]
1 way to get to 0 use nothing
steps[0]=1
Scan till target number
for i from 1 to n:
    for each supply x:
        if i - x >= 0:
            steps[i] += steps[i-x]
Steps at n will contain the number of ways
steps[n]
Visualization of the above:
supply {1, 2, 5} , target 6
i = 1, x=1 and steps required is 1
i = 2, x=1 and steps required is 1
i = 2, x=2 and steps required is 2
i = 3, x=1 and steps required is 2
i = 3, x=2 and steps required is 3
i = 4, x=1 and steps required is 3
i = 4, x=2 and steps required is 5
i = 5, x=1 and steps required is 5
i = 5, x=2 and steps required is 8
i = 5, x=5 and steps required is 9
i = 6, x=1 and steps required is 9
i = 6, x=2 and steps required is 14
i = 6, x=5 and steps required is 15
Some Java Code
private int test(int targetSize, int supply[]){
    int target[] = new int[targetSize+1];
    target[0]=1;
    for(int i=0;i<=targetSize;i++){
        for(int x:supply){
            if(i-x >= 0){
                target[i]+=target[i-x];
            }
        }
    }
    return target[targetSize];
}
// Test
public void test(){
    System.err.println(test(12, new int[]{1,2,3,4,5,6}));
}
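One caveat about the recurrence above: with the target in the outer loop it counts ordered sequences, so 1 + 5 and 5 + 1 are counted separately, which is why the trace ends at 15 for target 6. The question's examples count unordered sums (5 ways for 6). A hedged Python sketch of both loop orders follows; the function names are illustrative and not from the original answer.

```python
def count_ordered(supply, target):
    """Target in the outer loop: counts sequences where order matters."""
    ways = [1] + [0] * target
    for total in range(1, target + 1):
        for x in supply:
            if total - x >= 0:
                ways[total] += ways[total - x]
    return ways[target]


def count_unordered(supply, target):
    """Supply in the outer loop: each multiset of numbers is counted once."""
    ways = [1] + [0] * target
    for x in supply:
        for total in range(x, target + 1):
            ways[total] += ways[total - x]
    return ways[target]


print(count_ordered([1, 2, 5], 6), count_unordered([1, 2, 5], 6))
```

Swapping the loops is the only change. Processing one supply number at a time fixes the order in which numbers may appear, so permutations of the same sum are not counted again.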

Fenwick trees to determine which interval a point falls in

Let a0,...,an-1 be a sequence of lengths. We can construct the intervals [0,a0], (a0,a0+a1], (a0+a1,a0+a1+a2], ... I store the sequence a0,...,an-1 in a Fenwick tree.
I ask the question: given a number m, how can I efficiently (in O(log n) time) find which interval m falls into?
For example, given the a: 3, 5, 2, 7, 9, 4.
The Fenwick Tree stores 3, 8, 2, 17, 9, 13.
The intervals are [0,3],(3,8],(8,10],(10,17],(17,26],(26,30].
Given the number 9, the algorithm should return the 3rd index of the Fenwick Tree (2 if 0-based arrays are used, 3 if 1-based arrays are used). Given the number 26, the algorithm should return the 5th index of the Fenwick Tree (4 if 0-based arrays are used or 5 if 1-based arrays are used).
Possibly another data structure might be more suited to this operation. I am using Fenwick Trees because of their seeming simplicity and efficiency.
We can get an O(log n)-time search operation. The trick is to integrate the binary search with the prefix sum operation.
def get_total(tree, i):
    total = 0
    while i > 0:
        total += tree[i - 1]
        i -= i & (-i)
    return total
def search(tree, total):
    j = 1
    while j < len(tree):
        j <<= 1
    j >>= 1
    i = -1
    while j > 0:
        if i + j < len(tree) and total > tree[i + j]:
            total -= tree[i + j]
            i += j
        j >>= 1
    return i + 1
tree = [3, 8, 2, 17, 9, 13]
print('Intervals')
for i in range(len(tree)):
    print(get_total(tree, i), get_total(tree, i + 1))
print('Searches')
for total in range(31):
    print(total, search(tree, total))
Output is
Intervals
0 3
3 8
8 10
10 17
17 26
26 30
Searches
0 0
1 0
2 0
3 0
4 1
5 1
6 1
7 1
8 1
9 2
10 2
11 3
12 3
13 3
14 3
15 3
16 3
17 3
18 4
19 4
20 4
21 4
22 4
23 4
24 4
25 4
26 4
27 5
28 5
29 5
30 5
If the intervals don't change frequently, you can use a simple binary search in the accumulated array to do that. In Python you can use the bisect module to do that. Each query will be O(log n):
import bisect
A = [3, 5, 2, 7, 9, 4]
for i in range(1, len(A)):
    A[i] += A[i-1]
print(bisect.bisect_left(A, 9))
print(bisect.bisect_left(A, 26))
If the intervals change, you can use the same idea, but each array lookup will be O(log n), making the query operation O(log² n) overall.
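When the lengths do change, a Fenwick tree supports both the update and an O(log n) search by descending the tree directly, rather than running a binary search over O(log n) prefix queries. The Python sketch below is a minimal illustration under that assumption; the helper names `build`, `update` and `find_interval` are invented for this example.

```python
def build(lengths):
    """Build a Fenwick tree (stored 0-based) from a list of lengths in O(n)."""
    tree = list(lengths)
    for i in range(1, len(tree) + 1):
        parent = i + (i & -i)
        if parent <= len(tree):
            tree[parent - 1] += tree[i - 1]
    return tree


def update(tree, index, delta):
    """Add delta to the length at 0-based index in O(log n)."""
    i = index + 1
    while i <= len(tree):
        tree[i - 1] += delta
        i += i & -i


def find_interval(tree, m):
    """Return the 0-based index of the interval containing m in O(log n)."""
    pos = 0
    step = 1 << (len(tree).bit_length() - 1) if tree else 0
    while step:
        nxt = pos + step
        # Skip a whole subtree when its total still lies strictly below m
        if nxt <= len(tree) and tree[nxt - 1] < m:
            pos = nxt
            m -= tree[nxt - 1]
        step >>= 1
    return pos


tree = build([3, 5, 2, 7, 9, 4])
print(tree, find_interval(tree, 9), find_interval(tree, 26))
update(tree, 1, 2)  # the second length grows from 5 to 7
print(find_interval(tree, 9))
```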

Linked list algorithm to find pairs adding up to 10

Can you suggest an algorithm that finds all pairs of nodes in a linked list that add up to 10?
I came up with the following.
Algorithm: Compare each node, starting with the second node, with each node starting from the head node till the previous node (previous to the current node being compared) and report all such pairs.
I think this algorithm should work; however, it's certainly not the most efficient one, having a complexity of O(n^2).
Can anyone hint at a solution which is more efficient (perhaps taking linear time)? Additional or temporary nodes can be used by such a solution.
If their range is limited (say between -100 and 100), it's easy.
Create an array quant[-100..100] then just cycle through your linked list, executing:
quant[value] = quant[value] + 1
Then the following loop will do the trick.
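for i = -90 to 4:    // keeps j = 10 - i within 6..100 and reports each pair once
    j = 10 - i
    for k = 1 to quant[i] * quant[j]:
        output i, " ", j
// 5 + 5: a node cannot pair with itself, so output quant[5] * (quant[5] - 1) / 2 copies of 5, " ", 5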
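A hedged Python sketch of the same counting idea: a `collections.Counter` stands in for the quant array (an assumption that also removes the range limit), and the function name `count_pairs` is made up for this example. Each value is paired only with a strictly larger partner so that no pair is reported twice, and a value equal to half the target is handled separately.

```python
from collections import Counter


def count_pairs(values, target=10):
    """Map each unordered pair (v, w) with v + w == target to its number of occurrences."""
    counts = Counter(values)
    result = {}
    for v, c in counts.items():
        w = target - v
        if v < w and w in counts:
            result[(v, w)] = c * counts[w]
        elif v == w and c > 1:
            # c equal values form c * (c - 1) / 2 distinct pairs
            result[(v, w)] = c * (c - 1) // 2
    return result


print(count_pairs([2, 3, 1, 3, 5, 7, 10, -1, 11]))
```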
Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then just keeping counts rather than individual values (same as the above solution).
This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.
When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.
This relies on the sorted nature. Less than 10 means you need to make the sum higher (move start pointer up). Greater than 10 means you need to make the sum less (end pointer down). Since there are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.
Stop when the pointers pass each other.
There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).
In that case the number of pairs isn't the product of the counts. A value that occurs c times forms c * (c - 1) / 2 distinct pairs with itself, so a 5 with a count of 1 doesn't actually sum to 10 (since there's only one 5).
So, for the list:
2 3 1 3 5 7 10 -1 11
you get:
Index    a   b   c   d   e   f   g   h
Value   -1   1   2   3   5   7  10  11
Count    1   1   1   2   1   1   1   1
You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). That's one copy of (-1,11). Then you move p1 to b and p2 to g.
1 + 10 > 10 so leave p1 at b, move p2 down to f.
1 + 7 < 10 so move p1 to c, leave p2 at f.
2 + 7 < 10 so move p1 to d, leave p2 at f.
3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
5 + 5 = 10 but p1 = p2 and there is only one 5, so no pair can be formed. Output nothing, move p1 to f, p2 to d.
Loop ends since p1 > p2.
Hence the overall output was:
(-1,11)
( 3, 7)
( 3, 7)
which is correct.
Here's some test code. You'll notice that I've forced 7 (the midpoint) to a specific value for testing. Obviously, you wouldn't do this.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define SZSRC 30
#define SZSORTED 20
#define SUM 14
int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];
    // Make some random data.
    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }
    // Convert to value/size array.
    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;
    // Force 7+7 to specific count for testing.
    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);
    // Start and end pointers.
    s = 0;
    e = SZSORTED - 1;
    // Loop until they overlap.
    while (s <= e) {
        // Equal to desired value?
        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get number of pairs (special case at midpoint: c equal
            // values form c * (c - 1) / 2 distinct pairs).
            prod = (s == e)
                ? sortedCnt[s] * (sortedCnt[s] - 1) / 2
                : sortedCnt[s] * sortedCnt[e];
            // Output the right count.
            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);
            // Move both pointers and continue.
            s++;
            e--;
            continue;
        }
        // Less than desired, move start pointer.
        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }
        // Greater than desired, move end pointer.
        e--;
    }
    return 0;
}
You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.
If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).
And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it for updating counts, also O(n log n), but that's still better than the original O(n^2). The binary search totals O(n log n) because a single search is O(log n) but you have to do it for each value.
And here's the output from a test run, which shows you the various stages of calculation.
srcData[ 0] = 13
srcData[ 1] = 16
srcData[ 2] = 9
srcData[ 3] = 14
srcData[ 4] = 0
srcData[ 5] = 8
srcData[ 6] = 9
srcData[ 7] = 8
srcData[ 8] = 5
srcData[ 9] = 9
srcData[10] = 12
srcData[11] = 18
srcData[12] = 3
srcData[13] = 14
srcData[14] = 7
srcData[15] = 16
srcData[16] = 12
srcData[17] = 8
srcData[18] = 17
srcData[19] = 11
srcData[20] = 13
srcData[21] = 3
srcData[22] = 16
srcData[23] = 9
srcData[24] = 10
srcData[25] = 3
srcData[26] = 16
srcData[27] = 9
srcData[28] = 13
srcData[29] = 5
Sorted [ 0], count = 1
Sorted [ 3], count = 3
Sorted [ 5], count = 2
Sorted [ 7], count = 2
Sorted [ 8], count = 3
Sorted [ 9], count = 5
Sorted [ 10], count = 1
Sorted [ 11], count = 1
Sorted [ 12], count = 2
Sorted [ 13], count = 3
Sorted [ 14], count = 2
Sorted [ 16], count = 4
Sorted [ 17], count = 1
Sorted [ 18], count = 1
( 0, 14)
( 0, 14)
( 3, 11)
( 3, 11)
( 3, 11)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 7, 7)
Create a hash set (HashSet in Java) (could use a sparse array if your numbers are well-bounded, i.e. you know they fall into +/- 100)
For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.
So for example you have
1 - 6 - 3 - 4 - 9
1 - is 9 in the set? Nope
6 - 4? No.
3 - 7? No.
4 - 6? Yup! Print (6,4)
9 - 1? Yup! Print (9,1)
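The walk-through above can be sketched in Python as follows. The `Node` class, the `from_list` helper and the function name `pairs_with_sum` are assumptions made for this example. Note that with a plain set, a node is reported once per matching value seen earlier, so repeated values would need counts rather than a set to enumerate every pair.

```python
class Node:
    """Minimal singly linked list node (hypothetical, for illustration)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_list(values):
    """Build a linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def pairs_with_sum(head, target=10):
    """Single pass: check for the complement in the set, then record the value."""
    seen = set()
    pairs = []
    node = head
    while node is not None:
        if target - node.value in seen:
            pairs.append((target - node.value, node.value))
        seen.add(node.value)
        node = node.next
    return pairs


print(pairs_with_sum(from_list([1, 6, 3, 4, 9])))
```

Each set lookup and insertion is O(1) on average, so the whole pass is expected O(n).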
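This is a restricted case of the subset sum problem, which is NP-complete in general; with the subset size fixed at two, though, it can be solved in polynomial time.
If you were to first sort the set, it would reduce the number of pairs that need to be evaluated.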
