Sort array with range of values in O(n) - algorithm

There's an array A[1, ..., n], and it's known that for every 1 <= l <= n, A[l] is in {1,2,...,n^5}.
How can I find an algorithm which sorts this in O(n)?

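Imagine representing each value A[i] in base n. Since the values are at most n^5, each one (after subtracting 1, so that the range becomes 0 to n^5 - 1) is a number with at most five base-n digits. That means you can sort the entire array with radix sort using radix n: five passes of a stable O(n) counting sort, one per digit.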
Compute digit x (counting from 0 at the least significant end) of a number k as follows:
d_x = (k / n^x) % n
where / denotes integer division.
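As a rough illustration of the digit formula, here is a minimal Python sketch. The function name base_n_digits and the width parameter are made up for this example; it assumes n >= 2 and k >= 0.

```python
def base_n_digits(k, n, width=5):
    """Return the `width` least significant base-n digits of k, lowest digit first.

    Assumes n >= 2 and k >= 0; `//` is integer division.
    """
    return [(k // n**x) % n for x in range(width)]
```

For instance, base_n_digits(4071, 10) gives [1, 7, 0, 4, 0], i.e. the decimal digits of 04071 read from right to left.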

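# Sort a list of non-negative integers a using base-N radix sort
# (least significant digit first); a is a plain Python list and N >= 2.
def rsort(a, N):
    if a:
        bins = [[] for _ in range(N)]  # one bin per possible digit value
        m = max(a)
        r = 1
        while m >= r:  # one pass per base-N digit of the largest value
            for e in a:
                bins[(e // r) % N].append(e)
            r = r * N
            a = []
            for i in range(N):
                a.extend(bins[i])
                bins[i] = []
    return a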
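
For the specific question (n values in 1..n^5), the five-pass idea can be sketched with an explicit stable counting sort per digit. This is only a sketch under the stated assumptions: the name sort_up_to_n5 is invented here, n is taken to be the length of the list, and arithmetic on the values is treated as O(1).

```python
def sort_up_to_n5(a):
    """Sort a list whose values lie in 1..n**5, where n = len(a), using five passes."""
    n = len(a)
    if n < 2:
        return list(a)
    # Shift to 0..n**5 - 1 so that every value has at most five base-n digits.
    items = [v - 1 for v in a]
    power = 1
    for _ in range(5):
        # Stable counting sort on the current base-n digit.
        counts = [0] * n
        for v in items:
            counts[(v // power) % n] += 1
        # Turn the counts into the start position of each digit value.
        start = 0
        for d in range(n):
            counts[d], start = start, start + counts[d]
        out = [0] * n
        for v in items:
            d = (v // power) % n
            out[counts[d]] = v
            counts[d] += 1
        items = out
        power *= n
    # Undo the shift.
    return [v + 1 for v in items]
```

Each pass touches every element a constant number of times and uses n counters, so the five passes together are O(n).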

Related

Algorithm to find maximum of combinations of arrays

I have n finite arrays A[1], ..., A[n], and the elements in those arrays are ordered pairs (p,r) where 0 < p < 1 and r is a real number.
For instance, A[n][element][0] = p and A[n][element][1] = r.
Let C = {p[1]*p[2]*...*p[n] * (r[1] + ... + r[n]): where each (p[i],r[i]) is taken from a distinct array A[i]} and I would like to find the maximum/minimum m elements of it.
Is there any algorithm whose complexity depends only on some combination of m and n, but not on the actual sizes of the arrays A[i]?
Example: say n = 2, m = 2, A[1] = {(0.23,91.2),(0.45,-31.7),(0.32,60.5)} and A[2] = {(0.12,150.3),(0.26,13.3),(0.33,200.3),(0.29,-23.4)}.
The 2 maximum values are given by 0.32*0.33*(60.5+200.3)=27.54 and 0.45*0.33*(-31.7+200.3)=25.04.
The 2 minimum values are given by 0.45*0.29*(-31.7+-23.4)=-7.19 and 0.45*0.26*(-31.7+13.3)=-2.15.
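
To make the definition of C concrete, here is a brute-force baseline in Python that enumerates every combination. It is not the algorithm the question asks for (its running time is the product of all array sizes); it only serves to check small examples. The function name extreme_combinations is made up for this sketch.

```python
import heapq
from itertools import product

def extreme_combinations(arrays, m):
    """Return (m largest, m smallest) values of p[1]*...*p[n] * (r[1]+...+r[n]).

    `arrays` is a list of lists of (p, r) pairs; one pair is taken from each list.
    Brute force: enumerates every combination.
    """
    values = []
    for combo in product(*arrays):
        p_product = 1.0
        r_sum = 0.0
        for p, r in combo:
            p_product *= p
            r_sum += r
        values.append(p_product * r_sum)
    return heapq.nlargest(m, values), heapq.nsmallest(m, values)
```

Called with the two example arrays and m = 2, it returns values that round to 27.54 and 25.04 for the maxima and -7.19 and -2.15 for the minima.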

Number of pairs with a given sum and product

I have an array A along with 3 variables k, x and y.
I have to find the number of unordered pairs (i,j) such that the sum of the two elements mod k equals x and the product of the same two elements mod k equals y. The pairs of values need not be distinct. In other words, the number of (i,j) so that
(A[i]+A[j])%k == x and (A[i]*A[j])%k == y where 0 <= i < j < size of A.
For example, let A={1,2,3,2,1}, k=2, x=1, y=0. Then the answer is 6, because the pairs are: (1,2), (1,2), (2,3), (2,1), (3,2), and (2,1).
I used a brute force approach, but obviously this is not acceptable.
Modulo-arithmetic has the following two rules:
((a mod k) * (b mod k)) mod k = (a * b) mod k
((a mod k) + (b mod k)) mod k = (a + b) mod k
Thus we can sort all values into a hashtable with separate chaining and k buckets.
Addition
Find m < k, such that for a given n < k: (n + m) mod k = x.
There is exactly one solution to this problem:
if n < x: m < x must hold. Thus m = x - n
if n == x: m = 0
if n > x: we need to find m such that n + m = x + k. Thus m = x + k - n
This way, we can easily determine for each list of values the corresponding values such that for any pair (a, b) of the crossproduct of the two lists (a + b) mod k = x holds.
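The three cases above collapse into a single expression in a language whose modulo operator returns a non-negative result for a positive modulus, as Python's does. A minimal sketch, assuming 0 <= n < k and 0 <= x < k (the names matching_class and matching_class_by_cases are made up for illustration):

```python
def matching_class(n, x, k):
    """Residue m in 0..k-1 with (n + m) % k == x, assuming 0 <= n < k and 0 <= x < k."""
    return (x - n) % k

def matching_class_by_cases(n, x, k):
    """The same residue, written out as the three cases from the text."""
    if n < x:
        return x - n
    if n == x:
        return 0
    return x + k - n
```

Both functions agree on every valid input, so either form can be used when looking up the partner class.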
Multiplication
Multiplication is a bit trickier. Luckily, the addition constraint already fixes the matching congruence class (see above), and since both constraints need to hold, this must also be the matching class for the multiplication. To verify that it matches, we only need to check that (n * m) mod k = y (n and m defined as above). If this holds, we can build pairs; otherwise no matching elements exist.
Implementation
This would be the working python-code for the above example:
def modmuladd(ls, x, y, k):
    result = []
    # create tuples of values and indices
    indices = zip(ls, range(0, len(ls)))
    # split up into congruence classes
    congruence_cls = [[] for i in range(0, k)]
    for p in indices:
        congruence_cls[p[0] % k].append(p)
    for n in range(0, k):
        # congruence class to match addition
        if n < x:
            m = x - n
        elif n == x:
            m = 0
        else:
            m = x + k - n
        # check if congruence class matches for multiplication
        if (n * m) % k != y or len(congruence_cls[m]) == 0:
            continue  # no matching congruence class
        # add matching tuples to result (strict <, so no element is paired with itself)
        result += [(a, b) for a in congruence_cls[n] for b in congruence_cls[m] if a[1] < b[1]]
        result += [(a, b) for a in congruence_cls[m] for b in congruence_cls[n] if a[1] < b[1]]
    # sort result according to indices of first and second element, remove duplicates
    sorted_res = sorted(sorted(set(result), key=lambda p: p[1][1]), key=lambda p: p[0][1])
    # remove indices from result-set
    return [(p[0][0], p[1][0]) for p in sorted_res]
Note that sorting and elimination of duplicates are only required because this code concentrates on the usage of congruence classes rather than on perfect optimization. The example can easily be tweaked to produce the ordering without sorting.
Test run
print(modmuladd([1, 2, 3, 2, 1], 1, 0, 2))
Output:
[(1, 2), (1, 2), (2, 3), (2, 1), (3, 2), (2, 1)]
EDIT:
The worst-case complexity of this algorithm is still O(n^2), because building all possible pairs of a list of size n is O(n^2). With this algorithm, however, the search for matching congruence classes is cut down to O(k) after O(n) preprocessing, so counting the resulting pairs can be done in O(n + k). Assuming the numbers are distributed equally over the congruence classes, each pair of matching classes contributes O(n^2/k^2) pairs, so building the whole solution set takes O(n^2/k) in the worst case, when every class has a match.
EDIT 2:
An implementation that only counts would work like this:
def modmuladdct(ls, x, y, k):
    result = 0
    # split up into congruence classes
    congruence_class = {}
    for v in ls:
        if v % k not in congruence_class:
            congruence_class[v % k] = [v]
        else:
            congruence_class[v % k].append(v)
    for n in congruence_class.keys():
        # congruence class to match addition
        m = (x - n + k) % k
        # check if congruence class matches for multiplication
        if (n * m) % k != y or m not in congruence_class:
            continue  # no matching congruence class
        # total number of ordered pairs that will be built
        if n == m:
            # within a single class an element cannot be paired with itself
            result += len(congruence_class[n]) * (len(congruence_class[n]) - 1)
        else:
            result += len(congruence_class[n]) * len(congruence_class[m])
    # divide by two since each pair would otherwise be counted twice
    return result // 2
Each pair is counted exactly twice: once in order and once in reversed order. Dividing the result by two corrects this. Runtime is O(n + k) (assuming dictionary operations are O(1)).
The number of loop iterations is C(n, 2) = 5!/(2!(5-2)!) = 10 in your case, and there is nothing magic that would drastically reduce the number of loops.
In JS you can do:
const A = [1, 2, 3, 2, 1];
const k = 2;
const x = 1;
const y = 0;
for (let i = 0; i < A.length; i++) {
    for (let j = i + 1; j < A.length; j++) {
        if ((A[i] + A[j]) % k !== x) {
            continue;
        }
        if ((A[i] * A[j]) % k !== y) {
            continue;
        }
        console.log('(' + A[i] + ', ' + A[j] + ')');
    }
}
Ignoring A, we can find all solutions of n * (x - n) == y mod k for 0 <= n < k. That's a simple O(k) algorithm -- check each such n in turn.
We can count, for each n, how often A[i] % k == n, and then reconstruct the counts of pairs. If cs is an array of these counts, and n is a solution of n * (x - n) == y mod k, then with m = (x - n) % k there are cs[n] * cs[m] pairs of things in A that solve our equations corresponding to this n when n != m, and cs[n] * (cs[n] - 1) / 2 pairs when n == m. To avoid double counting we only count n such that n <= m.
def count_pairs(A, k, x, y):
    cs = [0] * k
    for a in A:
        cs[a % k] += 1
    pairs = ((i, (x - i) % k) for i in range(k) if i * (x - i) % k == y)
    total = 0
    for i, j in pairs:
        if i < j:
            total += cs[i] * cs[j]
        elif i == j:
            # both elements come from the same class: choose 2 of its members
            total += cs[i] * (cs[i] - 1) // 2
    return total
print(count_pairs([1, 2, 3, 2, 1], 2, 1, 0))
Overall, this constructs the counts in O(|A|) time, and the remaining code runs in O(k) time. It uses O(k) space.

What is the fastest algorithm for intersection of two sorted lists?

Say that there are two sorted lists: A and B.
The number of entries in A and B can vary. (They can be very small/huge. They can be similar to each other/significantly different).
What is known to be the fastest algorithm for this?
Can any one give me an idea or reference?
Assume that A has m elements and B has n elements, with m ≥ n. Information theoretically, the best we can do is
lg( (m + n)! / (m! n!) ) = n lg (m/n) + O(n)
comparisons, since in order to verify an empty intersection, we essentially have to perform a sorted merge. We can get within a constant factor of this bound by iterating through B and keeping a "cursor" in A indicating the position at which the most recent element of B should be inserted to maintain sorted order. We use exponential search to advance the cursor, for a total cost that is on the order of
lg x_1 + lg x_2 + ... + lg x_n,
where x_1 + x_2 + ... + x_n = m + n is some integer partition of m + n. This sum is at most n lg ((m + n)/n) = n lg (m/n) + O(n) by the concavity of lg.
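The cursor-plus-exponential-search idea might look as follows in Python. This is a sketch rather than a tuned implementation: the names gallop and intersect_sorted are invented here, both lists are assumed to be sorted without duplicates, and B is assumed to be the shorter list.

```python
from bisect import bisect_left

def gallop(A, target, lo):
    """Smallest index i >= lo with A[i] >= target (len(A) if there is none).

    Exponential search: probe at doubling distances from lo, then binary
    search inside the last bracket, for a cost logarithmic in the distance moved.
    """
    step = 1
    hi = lo
    while hi < len(A) and A[hi] < target:
        lo = hi + 1          # everything up to hi is smaller than target
        hi += step
        step *= 2
    return bisect_left(A, target, lo, min(hi, len(A)))

def intersect_sorted(A, B):
    """Intersection of two sorted, duplicate-free lists; B should be the shorter one."""
    result = []
    cursor = 0
    for b in B:
        cursor = gallop(A, b, cursor)
        if cursor == len(A):
            break            # every remaining element of B is larger than all of A
        if A[cursor] == b:
            result.append(b)
    return result
```

Because the cursor never moves backwards, the distances covered by the individual searches add up to at most the length of A, which is where the sum of logarithms in the bound comes from.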
I don't know if this is the fastest option but here's one that runs in O(n+m) where n and m are the sizes of your lists:
Loop over both lists until one of them is exhausted, in the following way:
Compare the current elements of the two lists.
If they are equal, the element belongs to the intersection: append it to the result list and advance on both lists.
Otherwise, advance on the list whose current element is smaller until you find a value that is equal to or greater than the current element of the other list.
As said, repeat this until one of the lists is exhausted.
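A minimal Python sketch of this linear merge, assuming both lists are sorted and contain no duplicates (the name intersect_merge is made up for the example):

```python
def intersect_merge(A, B):
    """Intersection of two sorted lists in O(len(A) + len(B)) comparisons."""
    result = []
    i = j = 0
    while i < len(A) and j < len(B):
        if A[i] == B[j]:
            # Common element: record it and advance on both lists.
            result.append(A[i])
            i += 1
            j += 1
        elif A[i] < B[j]:
            i += 1   # A is behind, advance on A
        else:
            j += 1   # B is behind, advance on B
    return result
```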
Here is a simple and tested Python implementation that uses bisect search to advance pointers of both lists.
It assumes both input lists are sorted and contain no duplicates.
import bisect

def compute_intersection_list(l1, l2):
    # A is the smaller list
    A, B = (l1, l2) if len(l1) < len(l2) else (l2, l1)
    i = 0
    j = 0
    intersection_list = []
    while i < len(A) and j < len(B):
        if A[i] == B[j]:
            intersection_list.append(A[i])
            i += 1
            j += 1
        elif A[i] < B[j]:
            i = bisect.bisect_left(A, B[j], lo=i+1)
        else:
            j = bisect.bisect_left(B, A[i], lo=j+1)
    return intersection_list

# test on many random cases
import random
MM = 100  # max value
for _ in range(10000):
    M1 = random.randint(0, MM)  # random max value
    N1 = random.randint(0, M1)  # random number of values
    M2 = random.randint(0, MM)  # random max value
    N2 = random.randint(0, M2)  # random number of values
    a = sorted(random.sample(range(M1), N1))  # sampling without replacement to have no duplicates
    b = sorted(random.sample(range(M2), N2))
    assert compute_intersection_list(a, b) == sorted(set(a).intersection(b))

Count number of subsequences with given k modulo sum

Given an array a of n integers, count how many subsequences (non-consecutive as well) have sum % k = 0:
1 <= k < 100
1 <= n <= 10^6
1 <= a[i] <= 1000
An O(n^2) solution is easily possible, however a faster way O(n log n) or O(n) is needed.
This is the subset sum problem.
A simple solution is this:
s = 0
dp[x] = how many subsequences we can build with sum x
dp[0] = 1, 0 elsewhere
for i = 1 to n:
    s += a[i]
    for j = s down to a[i]:
        dp[j] = dp[j] + dp[j - a[i]]
Then you can simply return the sum of all dp[x] such that x % k == 0. This has a high complexity though: about O(n*S), where S is the sum of all of your elements. The dp array must also have size S, which you probably can't even afford to declare for your constraints.
A better solution is to not iterate over sums larger than or equal to k in the first place. To do this, we will use 2 dp arrays:
dp1, dp2 = arrays of size k
dp1[0] = dp2[0] = 1, 0 elsewhere
for i = 1 to n:
    mod_elem = a[i] % k
    for j = 0 to k - 1:
        dp2[j] = dp2[j] + dp1[(j - mod_elem + k) % k]
    copy dp2 into dp1
return dp1[0]
The complexity of this is O(n*k) time and O(k) memory. Note that dp1[0] also counts the empty subsequence; subtract 1 if it should be excluded.
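The pseudocode with the two arrays of size k could be written in Python roughly like this. It is a sketch: the function name count_divisible_subsequences is invented, the two arrays are replaced by rebuilding a single list per element, and the result includes the empty subsequence.

```python
def count_divisible_subsequences(a, k):
    """Count subsequences of a (including the empty one) whose sum is divisible by k."""
    dp = [0] * k
    dp[0] = 1  # the empty subsequence has sum 0
    for v in a:
        shift = v % k
        # Every existing subsequence either skips v (remainder unchanged)
        # or takes it (remainder moves by `shift`).
        dp = [dp[j] + dp[(j - shift) % k] for j in range(k)]
    return dp[0]
```

For example, count_divisible_subsequences([1, 2, 3], 3) returns 4: the empty subsequence, {3}, {1,2} and {1,2,3}. The counts grow exponentially with n, which Python's arbitrary-precision integers handle, at some cost in speed.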
There's an O(n + k^2 lg n)-time algorithm. Compute a histogram c(0), c(1), ..., c(k-1) of the input array mod k (i.e., there are c(r) elements that are r mod k). Then compute
prod_{r=0}^{k-1} (1 + x^r)^c(r) mod (1 - x^k)
as follows, where the constant term of the reduced polynomial is the answer.
Rather than evaluate each factor with a fast exponentiation method and then multiply, we turn things inside out. If all c(r) are zero, then the answer is 1. Otherwise, recursively evaluate
P = prod_{r=0}^{k-1} (1 + x^r)^(floor(c(r)/2)) mod (1 - x^k)
and then compute
Q = prod_{r=0}^{k-1} (1 + x^r)^(c(r) - 2 floor(c(r)/2)) mod (1 - x^k),
in time O(k^2) for the latter computation by exploiting the sparsity of the factors. The result is P^2 Q mod (1 - x^k), computed in time O(k^2) via naive convolution.
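A possible Python rendering of this recursion is sketched below. The function and helper names are made up. Polynomials modulo (1 - x^k) are stored as lists of k coefficients, and the returned constant term includes the empty subsequence.

```python
def count_zero_mod_subsequences(a, k):
    """Constant term of prod_r (1 + x^r)^c(r) mod (1 - x^k), with c the histogram of a mod k."""
    counts = [0] * k
    for v in a:
        counts[v % k] += 1

    def multiply(p, q):
        # Naive cyclic convolution, i.e. the product modulo (1 - x^k); O(k^2).
        out = [0] * k
        for i, pi in enumerate(p):
            if pi:
                for j, qj in enumerate(q):
                    out[(i + j) % k] += pi * qj
        return out

    def power_product(c):
        # Base case: all exponents are zero, so the product is 1.
        if not any(c):
            one = [0] * k
            one[0] = 1
            return one
        # P uses the halved exponents; squaring it doubles them again.
        p = power_product([cr // 2 for cr in c])
        result = multiply(p, p)
        # Q: one sparse factor (1 + x^r) for every odd exponent, O(k) each.
        for r, cr in enumerate(c):
            if cr % 2:
                result = [result[j] + result[(j - r) % k] for j in range(k)]
        return result

    return power_product(counts)[0]
```

The recursion depth is about lg of the largest histogram entry, and each level costs O(k^2), which matches the O(n + k^2 lg n) bound once the O(n) histogram pass is included.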
Traverse a and count a[i] mod k; there ought to be k such counts.
Recurse and memoize over the distinct partitions of k, 2*k, 3*k...etc. with parts less than or equal to k, adding the products of the appropriate counts. A residue of 0 is represented by the part k, and a part r that is used t times contributes C(count[r], t), the number of ways to choose t of the elements with that residue.
For example, if k were 10, some of the partitions would be 1+2+7 and 1+2+3+4; but while memoizing, we would only need to calculate once how many pairs mod k in the array produce (1 + 2).
For example, k = 5, a = {1,4,2,3,5,6}:
counts of a[i] mod k (residues 0 to 4): {1,2,1,1,1}
products of distinct partitions of k:
5 => 1
4,1 => 2
3,2 => 1
3,1,1 => 1
products of distinct partitions of 2 * k with parts <= k:
5,4,1 => 2
5,3,2 => 1
4,1,3,2 => 2
5,3,1,1 => 1
products of distinct partitions of 3 * k with parts <= k:
5,4,1,3,2 => 2
answer = 13
{1,4} {4,6} {2,3} {5} {1,6,3}
{1,4,2,3} {1,4,5} {4,6,2,3} {4,6,5} {2,3,5} {1,6,3,5}
{1,4,2,3,5} {4,6,2,3,5}

Finding Numbers where modulo is k

I am given a number A where 1 <= A <= 10^6 and a number k. I have to find all the numbers i with 1 <= i <= A where A % i == k.
Simple solution:
int count = 0;
for (int i = 1; i <= A; i++)
    if (A % i == k) count++;
Is there any better solution than iterating over all the numbers from 1 to A?
The expression A % i == k is equivalent to A == n * i + k for some non-negative integer n, together with i > k (a remainder is always smaller than the divisor).
This can be rearranged as n * i = A - k, and can be solved by finding all the factors i of A - k that satisfy k < i <= A.
Here are a couple of examples:
A = 100, k = 10
F = factor_list(A-k) = factor_list(90) = [1,2,3,5,6,9,10,15,18,30,45,90]
(discard all factors less than or equal to k)
Result: [15,18,30,45,90]
A = 288, k = 32
F = factor_list(A-k) = factor_list(256) = [1,2,4,8,16,32,64,128,256]
Result: [64,128,256]
If A - k is prime, then there is either one solution (A-k) or none (if A-k <= k).
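
The factor-based approach can be sketched in Python as follows. The function name moduli_with_remainder is made up, and the divisors of A - k are found by simple trial division up to the square root, which is one of several ways to build the factor list.

```python
def moduli_with_remainder(A, k):
    """Return all i in 1..A with A % i == k, using the divisors of A - k."""
    d = A - k
    if d <= 0:
        return []  # A % i < i <= A, so the remainder can never reach k >= A
    result = set()
    i = 1
    while i * i <= d:
        if d % i == 0:
            # i and d // i are both divisors; keep those larger than k.
            for f in (i, d // i):
                if f > k:
                    result.add(f)
        i += 1
    return sorted(result)
```

This needs about sqrt(A - k) trial divisions instead of A modulo operations. For the two examples above it returns [15, 18, 30, 45, 90] and [64, 128, 256].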
