Any faster way to find the number of "lucky triples"?

I am working on a code challenge problem -- "find lucky triples". A "lucky triple" is defined as a combination (lst[i], lst[j], lst[k]) in a list lst with i < j < k, where lst[i] divides lst[j] and lst[j] divides lst[k].
My task is to find the number of lucky triples in a given list. The brute force way is to use three loops, but it takes too much time to solve the problem. I wrote that and the system responded "time exceeded". The problem looks silly and easy, but the array is unsorted, so general methods like binary search do not work. I have been stuck on the problem for a day and hope someone can give me a hint. I am seeking a way to solve the problem faster; at least the time complexity should be lower than O(N^3).

A simple dynamic-programming-like algorithm will do this in quadratic time and linear space. You just have to maintain a counter c[i] for each item in the list that represents the number of previous integers that divide L[i].
Then, as you go through the list and test each integer L[k] against all previous items L[j], if L[j] divides L[k], you just add c[j] (which could be 0) to your global counter of triples, because that also implies that there exist exactly c[j] items L[i] such that L[i] divides L[j] and i < j.
int c[] = {0}
int nbTriples = 0
for k = 0 to n-1
    for j = 0 to k-1
        if (L[k] % L[j] == 0)
            c[k]++
            nbTriples += c[j]
return nbTriples
There may be some better algorithm that uses fancy discrete maths to do it faster, but if O(n^2) is ok, this will do just fine.
In regard to your comment:
Why DP? We have something that can clearly be modeled as having a left-to-right order (a DP orange flag), and it feels like reusing previously computed values could be interesting, because the brute force algorithm does the exact same computations a lot of times.
How to get from that to a solution? Run a simple example (hint: it had better treat the input from left to right). At step i, compute what you can compute from this particular point (ignoring everything to the right of i), and try to pinpoint what you compute over and over again for different i's: this is what you want to cache. Here, when you see a potential triple at step k (L[k] % L[j] == 0), you have to consider what happens at L[j]: "does it have some divisors on its left too? Each of these would give us a new triple. Let's see... But wait! We already computed that at step j! Let's cache this value!" And this is when you jump on your seat.

Full working solution in Python:

def count_lucky_triples(l):
    c = [0] * len(l)
    count = 0
    for i in range(len(l)):
        for j in range(i):
            if l[i] % l[j] == 0:
                c[i] += 1
                count += c[j]
    return count

Read up on the Sieve of Eratosthenes, a common technique for finding prime numbers, which could be adapted to find your 'lucky triples'. Essentially, you would need to iterate your list in increasing value order, and for each value, multiply it by an increasing factor until it is larger than the largest list element, and each time one of these multiples equals another value in the list, the multiple is divisible by the base number. If the list is sorted when given to you, then the i < j < k requirement would also be satisfied.
e.g. Given the list [3, 4, 8, 15, 16, 20, 40]:
Start at 3, which has multiples [6, 9, 12, 15, 18 ... 39] within the range of the list. Of those multiples, only 15 is contained in the list, so record under 15 that it has a factor 3.
Proceed to 4, which has multiples [8, 12, 16, 20, 24, 28, 32, 36, 40]. Mark those as having a factor 4.
Continue through the list. When you reach an element that has an existing known factor, then if you find any multiples of that number in the list, you have a triple. In this case, 8 has the known factor 4, and its multiples 16 and 40 are in the list, so (4, 8, 16) and (4, 8, 40) are triples. Whereas 15 has no multiples in the list, so there is no value that can form a triple with 3 and 15.
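For what it's worth, here is a minimal Python sketch of this idea, assuming the list is sorted and holds distinct positive values (neither is guaranteed by the question; the helper names are illustrative):

def count_triples_sieve(lst):
    index_of = {v: i for i, v in enumerate(lst)}  # value -> position
    largest = lst[-1]
    divisors = [0] * len(lst)  # divisors[i] = count of list divisors of lst[i]
    triples = 0
    for i, v in enumerate(lst):
        m = 2 * v
        while m <= largest:
            k = index_of.get(m)
            if k is not None:
                triples += divisors[i]  # each divisor of v extends (d, v, m)
                divisors[k] += 1        # v is a divisor of lst[k]
            m += v
    return triples

# count_triples_sieve([3, 4, 8, 15, 16, 20, 40]) -> 3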

A precomputation step can help reduce the time complexity.
Precomputation Step:
For every element (i), iterate over the array to find the elements (j) such that lst[j] % lst[i] == 0:
for (i = 0; i < n; i++)
{
    for (j = i+1; j < n; j++)
    {
        if (a[j] % a[i] == 0)
            // mark those j's. You decide how to store this data
    }
}
This precomputation step takes O(n^2) time.
In the final step, use the data from the precomputation step to count the triples; one possible completion is sketched below.
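A Python sketch of the whole thing (assuming positive integers; a per-index count of multiples stands in for "mark those j's"):

def lucky_triples(lst):
    n = len(lst)
    # Precomputation: multiples_after[j] = number of k > j with lst[k] % lst[j] == 0
    multiples_after = [0] * n
    for j in range(n):
        for k in range(j + 1, n):
            if lst[k] % lst[j] == 0:
                multiples_after[j] += 1
    # Final step: every pair (i, j) with lst[j] % lst[i] == 0 contributes
    # one triple for each multiple of lst[j] to its right
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if lst[j] % lst[i] == 0:
                total += multiples_after[j]
    return total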

Form a graph: for each index, an array of the indices of multiples that appear ahead of it. Then count, for each edge, the collective number of multiples reachable from the second index, as read from the graph. It has a complexity of O(n^2).
For example, for the list {1,2,3,4,5,6} there will be an array of multiples for each index. The graph will look like
{ 0:[1,2,3,4,5], 1:[3,5], 2:[5], 3:[], 4:[], 5:[] }
So the total number of triples will be {0->1->3/5} and {0->2->5}, i.e. 3.
package com.welldyne.mx.dao.core;

import java.util.LinkedList;
import java.util.List;

public class LuckyTriplets {

    public static void main(String[] args) {
        int[] integers = new int[2000];
        for (int i = 1; i < 2001; i++) {
            integers[i - 1] = i;
        }
        long start = System.currentTimeMillis();
        int n = findLuckyTriplets(integers);
        long end = System.currentTimeMillis();
        System.out.println((end - start) + " ms");
        System.out.println(n);
    }

    private static int findLuckyTriplets(int[] integers) {
        List<Integer>[] indexMultiples = new LinkedList[integers.length];
        for (int i = 0; i < integers.length; i++) {
            indexMultiples[i] = getMultiples(integers, i);
        }
        int luckyTriplets = 0;
        for (int i = 0; i < integers.length - 1; i++) {
            luckyTriplets += getLuckyTripletsFromMultiplesMap(indexMultiples, i);
        }
        return luckyTriplets;
    }

    private static int getLuckyTripletsFromMultiplesMap(List<Integer>[] indexMultiples, int n) {
        int sum = 0;
        for (int i = 0; i < indexMultiples[n].size(); i++) {
            sum += indexMultiples[(indexMultiples[n].get(i))].size();
        }
        return sum;
    }

    private static List<Integer> getMultiples(int[] integers, int n) {
        List<Integer> multiples = new LinkedList<>();
        for (int i = n + 1; i < integers.length; i++) {
            if (isMultiple(integers[n], integers[i])) {
                multiples.add(i);
            }
        }
        return multiples;
    }

    /*
     * returns true if b is a multiple of a
     */
    private static boolean isMultiple(int a, int b) {
        return b % a == 0;
    }
}

I just wanted to share my solution, which passed. Basically, the problem can be condensed to a tree problem. You need to pay attention to the wording of the question: it only treats numbers as different on the basis of index, not value, so {1,1,1} will have only 1 triple, but {1,1,1,1} will have 4. The constraint is {l[i], l[j], l[k]} such that they divide each other and i < j < k.
def solution(l):
    count = 0
    data = l
    max_element = max(data)
    tree_list = []
    for p, element in enumerate(data):
        if element == 0:
            tree_list.append([])
        else:
            temp = []
            for el in data[p+1:]:
                if el % element == 0:
                    temp.append(el)
            tree_list.append(temp)
    for p, element_list in enumerate(tree_list):
        data[p] = 0
        temp = data[:]
        for element in element_list:
            pos_element = temp.index(element)
            count += len(tree_list[pos_element])
            temp[pos_element] = 0
    return count

Related

Finding subarray with target bitwise AND value

Given an array A of size N and an integer P, find the subarray B = A[i...j] such that i <= j, and compute the bitwise AND of the subarray's elements, say K = B[i] & B[i + 1] & ... & B[j].
Output the minimum value of |K - P| among all possible values of K.
Here is a quasilinear approach, assuming the elements of the array have a constant number of bits.
The rows of the matrix K[i,j] = A[i] & A[i + 1] & ... & A[j] are monotonically decreasing (ignore the lower triangle of the matrix). That means the absolute value of the difference between K[i,:] and the search parameter P is unimodal and a minimum (not necessarily the minimum as the same minimum may occur several times, but then they will do so in a row) can be found in O(log n) time with ternary search (assuming access to elements of K can be arranged in constant time). Repeat this for every row and output the position of the lowest minimum, bringing it up to O(n log n).
Performing the row-minimum search in a time less than the size of row requires implicit access to the elements of the matrix K, which could be accomplished by creating b prefix-sum arrays, one for each bit of the elements of A. A range-AND can then be found by calculating all b single-bit range-sums and comparing them with the length of the range, each comparison giving a single bit of the range-AND. This takes O(nb) preprocessing and gives O(b) (so constant, by the assumption I made at the beginning) access to arbitrary elements of K.
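A sketch of the per-bit prefix-sum construction in Python (the 32-bit width is an assumption):

def build_bit_prefix_sums(A, bits=32):
    # prefix[b][i] = number of elements among A[0..i-1] with bit b set
    n = len(A)
    prefix = [[0] * (n + 1) for _ in range(bits)]
    for b in range(bits):
        for i, x in enumerate(A):
            prefix[b][i + 1] = prefix[b][i] + ((x >> b) & 1)
    return prefix

def range_and(prefix, i, j, bits=32):
    # bit b of AND(A[i..j]) is set iff all j-i+1 elements have bit b set
    length = j - i + 1
    result = 0
    for b in range(bits):
        if prefix[b][j + 1] - prefix[b][i] == length:
            result |= 1 << b
    return result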
I had hoped that the matrix of absolute differences would be a Monge matrix, allowing the SMAWK algorithm to be used, but that does not seem to be the case and I could not find a way to push it towards that property.
Are you familiar with the Find subarray with given sum problem? The solution I'm proposing uses the same method as in the efficient solution in the link. It is highly recommended to read it before continuing.
First let's notice that the longer a subarray is, the smaller its K becomes, since the & operator between two numbers can only create a smaller number.
So if I have a subarray from i to j and I want to make its K smaller, I'll add more elements (now the subarray is from i to j + 1); if I want to make K larger, I'll remove elements (i + 1 to j).
If we review the solution to Find subarray with given sum, we see that we can easily transform it to our problem - the given sum is K and summing is like using the & operator, but more elements means a smaller K, so we can flip the comparison of the sums.
That problem only tells you whether a solution exists, but if you simply maintain the minimal difference you have found so far, you can solve your problem as well.
Edit
This solution is correct if all the numbers are positive; as mentioned in the comments, if not all the numbers are positive, the solution is slightly different.
Notice that if not all of the numbers are negative, K will be positive, so in order to find a negative P we can consider only the negatives in the algorithm, then use the algorithm as shown above.
Here is another quasi-linear algorithm, mixing yonlif's Find subarray with given sum solution with Harold's idea to compute K[i,j]; therefore I don't use pre-processing, which is memory-hungry. I use a counter to keep track of bits and compute at most 2N values of K, each costing at most O(log N). Since log N is generally smaller than the word size (B), it's faster than a linear O(NB) algorithm.
Counts of the bits of N numbers can be maintained with only ~log N words, so you can compute A[i] & A[i+1] & ... & A[i+N-1] with only log N operations.
Here is the way to manage the counter: if the counter is C0, C1, ..., Cp, and each Ck is Ck0, Ck1, ..., Ckm, then Cpq, ..., C1q, C0q is the binary representation of the number of bits equal to 1 among the q-th bits of {A[i], A[i+1], ..., A[j-1]}.
The bit-level implementation (in python); all bits are managed in parallel.
def add(counter, x):  # add x's set bits into the counter (ripple carry)
    k = 0
    while x:
        x, counter[k] = x & counter[k], x ^ counter[k]
        k += 1

def sub(counter, x):  # remove x's set bits from the counter (ripple borrow)
    k = 0
    while x:
        x, counter[k] = x & ~counter[k], x ^ counter[k]
        k += 1

def val(counter, count):  # return A[i] & ... & A[j-1] if count = j-i
    # a bit of the AND is 1 iff its per-position count equals count = j-i
    k = 0
    res = -1
    while count:
        if count % 2 > 0:
            res &= counter[k]
        else:
            res &= ~counter[k]
        count //= 2
        k += 1
    return res
And the algorithm:

import numpy as np

def solve(A, P):
    counter = np.zeros(32, np.int64)  # 32 bit positions, i.e. values up to ~4G
    n = A.size
    i = j = 0
    K = P  # start by filling the buffer
    mini = np.int64(2**63 - 1)
    while i < n:
        if K < P or j == n:  # dump the buffer
            sub(counter, A[i])
            i += 1
        else:  # fill the buffer
            add(counter, A[j])
            j += 1
        if j > i:
            K = val(counter, j - i)
            X = np.abs(K - P)
            if mini > X:
                mini = X
        else:
            K = P  # reset K
    return mini
val, sub and add are O(log N), so the whole process is O(N log N).
Test:

n = 10**5
A = np.random.randint(0, 10**8, n, dtype=np.int64)
P = np.random.randint(0, 10**8, dtype=np.int64)
%time solve(A, P)

Wall time: 0.8 s
Out: 452613036735
A numba-compiled version (decorate the 4 functions with @numba.jit) is 200x faster (5 ms).
Yonlif's answer is wrong.
In the Find subarray with given sum solution we have a loop where we do subtraction:

while (curr_sum > sum && start < i-1)
    curr_sum = curr_sum - arr[start++];

Since there is no inverse operator of a logical AND, we cannot rewrite this line and we cannot use this solution directly.
One could say that we can recalculate the sum every time we increase the lower bound of the sliding window (which would lead us to O(n^2) time complexity), but this solution would not work (I'll provide the code and a counterexample at the end).
Here is a brute force solution that works in O(n^3):

unsigned int getSum(const vector<int>& vec, int from, int to) {
    unsigned int sum = -1;
    for (auto k = from; k <= to; k++)
        sum &= (unsigned int)vec[k];
    return sum;
}

void updateMin(unsigned int& minDiff, int sum, int target) {
    minDiff = std::min(minDiff, (unsigned int)std::abs((int)sum - target));
}

// Brute force solution: O(n^3)
int maxSubArray(const std::vector<int>& vec, int target) {
    auto minDiff = UINT_MAX;
    for (auto i = 0; i < vec.size(); i++)
        for (auto j = i; j < vec.size(); j++)
            updateMin(minDiff, getSum(vec, i, j), target);
    return minDiff;
}
Here is an O(n^2) solution in C++ (thanks to B.M's answer). The idea is to update the current sum instead of calling getSum for every pair of indices. You should also look at B.M's answer, as it contains conditions for an early break. Here is the C++ version:

int maxSubArray(const std::vector<int>& vec, int target) {
    auto minDiff = UINT_MAX;
    for (auto i = 0; i < vec.size(); i++) {
        unsigned int sum = -1;
        for (auto j = i; j < vec.size(); j++) {
            sum &= (unsigned int)vec[j];
            updateMin(minDiff, sum, target);
        }
    }
    return minDiff;
}
Here is a NON-working solution with a sliding window: this is the idea from Yonlif's answer with the recomputation of the sum in O(n^2).

int maxSubArray(const std::vector<int>& vec, int target) {
    auto minDiff = UINT_MAX;
    unsigned int sum = -1;
    auto left = 0, right = 0;
    while (right < vec.size()) {
        if (sum > target)
            sum &= (unsigned int)vec[right++];
        else
            sum = getSum(vec, ++left, right);
        updateMin(minDiff, sum, target);
    }
    right--;
    while (left < vec.size()) {
        sum = getSum(vec, left++, right);
        updateMin(minDiff, sum, target);
    }
    return minDiff;
}

The problem with this solution is that we skip some sequences which can actually be the best ones.
Input: vector = [26,77,21,6], target = 5.
Output should be zero, as 77&21 = 5, but the sliding window approach is not capable of finding that one, as it will first consider window [0..3] and then increase the lower bound, with no possibility of considering window [1..2].
If someone has a linear or log-linear solution that works, it would be nice if they posted it.
If someone have a linear or log-linear solution which works it would be nice to post.
Here is a solution that I wrote that has a time complexity of O(n^2).
The below code snippet is written in Java.

class Solution {
    public int solve(int[] arr, int p) {
        int maxk = Integer.MIN_VALUE;
        int mink = Integer.MAX_VALUE;
        int size = arr.length;
        for (int i = 0; i < size; i++) {
            int temp = arr[i];
            for (int j = i; j < size; j++) {
                temp &= arr[j];
                if (temp <= p) {
                    if (temp > maxk)
                        maxk = temp;
                } else {
                    if (temp < mink)
                        mink = temp;
                }
            }
        }
        int min1 = Math.abs(mink - p);
        int min2 = Math.abs(maxk - p);
        return (min1 < min2) ? min1 : min2;
    }
}

It is a simple brute-force approach where two numbers, say x and y, such that x <= p and y >= p, are found, where x and y are values of K = arr[i] & arr[i+1] & ... & arr[j] for different i and j with i <= j.
The answer is then just the minimum of |x - p| and |y - p|.
This is a Python implementation of the O(n) solution based on the broad idea from Yonlif's answer. There were doubts about whether this solution could work since no implementation was provided, so here's an explicit writeup.
Some caveats:
The code technically runs in O(n*B), where n is the number of integers and B is the number of unique bit positions set in any of the integers. With constant-width integers that's linear, but otherwise it's not generally linear in actual input size. You can get a true linear solution for exponentially large inputs with more bookkeeping.
Negative numbers in the array aren't handled, since their bit representation isn't specified in the question. See the comments on Yonlif's answer for hints on how to handle fixed-width two's complement signed integers.
The contentious part of the sliding window solution seems to be how to 'undo' bitwise &. The trick is to store the counts of set-bits in each bit-position of elements in your sliding window, not just the bitwise &. This means adding or removing an element from the window turns into adding or removing 1 from the bit-counters for each set-bit in the element.
On top of testing this code for correctness, it isn't too hard to prove that a sliding window approach can solve this problem. The bitwise & function on subarrays is weakly-monotonic with respect to subarray inclusion. Therefore the basic approach of increasing the right pointer when the &-value is too large, and increasing the left pointer when the &-value is too small, will cause our sliding window to equal an optimal sliding window at some point.
Here's a small example run on Dejan's testcase from another answer:
A = [26, 77, 21, 6], Target = 5
Active sliding window surrounded by []
[26], 77, 21, 6
left = 0, right = 0, AND = 26
----------------------------------------
[26, 77], 21, 6
left = 0, right = 1, AND = 8
----------------------------------------
[26, 77, 21], 6
left = 0, right = 2, AND = 0
----------------------------------------
26, [77, 21], 6
left = 1, right = 2, AND = 5
----------------------------------------
26, 77, [21], 6
left = 2, right = 2, AND = 21
----------------------------------------
26, 77, [21, 6]
left = 2, right = 3, AND = 4
----------------------------------------
26, 77, 21, [6]
left = 3, right = 3, AND = 6
So the code will correctly output 0, as the value of 5 was found for [77, 21]
Python code:
from collections import Counter
from typing import Dict, List

def find_bitwise_and(nums: List[int], target: int) -> int:
    """Find the smallest difference between a subarray-& and target.

    Given a list of nonnegative integers and a nonnegative target,
    returns the minimum value of abs(target - BITWISE_AND(B))
    over all nonempty subarrays B.

    Runs in linear time on fixed-width integers.
    """

    def get_set_bits(x: int) -> List[int]:
        """Return indices of set bits in x"""
        return [i for i, c in enumerate(reversed(bin(x)[2:]))
                if c == '1']

    def counts_to_bitwise_and(window_length: int,
                              bit_counts: Dict[int, int]) -> int:
        """Given bit counts for a window of an array, return
        the bitwise AND of the window's elements."""
        return sum((1 << key) for key, count in bit_counts.items()
                   if count == window_length)

    current_AND_value = nums[0]
    best_diff = abs(current_AND_value - target)
    window_bit_counts = Counter(get_set_bits(nums[0]))
    left_idx = right_idx = 0

    while right_idx < len(nums):
        # Expand the window to decrease the & value
        if current_AND_value > target or left_idx > right_idx:
            right_idx += 1
            if right_idx >= len(nums):
                break
            window_bit_counts += Counter(get_set_bits(nums[right_idx]))
        # Shrink the window to increase the & value
        else:
            window_bit_counts -= Counter(get_set_bits(nums[left_idx]))
            left_idx += 1

        current_AND_value = counts_to_bitwise_and(right_idx - left_idx + 1,
                                                  window_bit_counts)

        # Only nonempty windows count
        if left_idx <= right_idx:
            best_diff = min(best_diff, abs(current_AND_value - target))

    return best_diff

More efficient algorithm to find OR of two sets

Given a matrix of n rows and m columns of 1's and 0's, it is required to find the number of pairs of rows that can be selected so that their bitwise OR is 111...1 (m ones).
Example:
1 0 1 0 1
0 1 0 0 1
1 1 1 1 0
Answer:
2 ---> OR of row number [1,3] and [2,3]
Given that n and m can be up to 3000, how efficiently can this problem be solved?
PS: I already tried a naive O(n*n*m) method. I was thinking of a better solution.
1. trivial solution
The trivial algorithm (which you already discovered but did not post) is to take all (n choose 2) combinations of the n rows, OR them, and see if it works. This is O(n^2 * m). Coding would look like:
for (i = 0; i < n; ++i)
    for (j = i+1; j < n; ++j) {
        try OR of row[i] with row[j] to see if it works, if so record (i,j)
    }
2. constant speedup
You can improve the running time by a factor of the word size by packing bits into the words. This still gives same asymptotic, but in practice a factor of 64-bit speedup on a 64-bit machine. This has already been noted in comments above.
3. heuristic speedup
We can use heuristics to further improve the time in practice, but with no asymptotic guarantee. Consider sorting your rows by Hamming weight, with the smallest Hamming weight in the front and the largest Hamming weight at the end (running time O(m*n) to compute the weights plus O(n log n) to sort). Then you only need to compare low-weight rows with high-weight rows: specifically, the two weights need to sum to >= m. The search would look something like this:
for (i = 0; i < n; ++i)
    for (j = n-1; j > i; --j) /* go backwards to take advantage of hmwt */
    {
        if ((hmwt(row[i]) + hmwt(row[j])) < m)
            break;
        try OR of row[i] with row[j] to see if it works, if so record (i,j)
    }
4. towards a better approach
Another approach that may offer better returns is to choose a column of low Hamming weight. Then combine the rows into two groups: those with a 1 in that column (group A) versus those with a 0 in that column (group B). Then you only need to consider combinations of rows where one comes from group A and the other comes from group B, or where both come from group A (thanks @ruakh for catching my oversight). Something along these lines should help a lot. But again this is still asymptotically the same worst case; however, it should be faster in practice (assuming that we're not in the case of all combinations being the answer).
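A sketch of that grouping idea, representing each row as a Python int bitmask (the representation and helper name are my own):

def count_full_or_pairs(rows, m):
    full = (1 << m) - 1
    # pick the column with the fewest 1s
    col = min(range(m), key=lambda c: sum((r >> c) & 1 for r in rows))
    group_a = [r for r in rows if (r >> col) & 1]      # 1 in that column
    group_b = [r for r in rows if not (r >> col) & 1]  # 0 in that column
    count = 0
    for i, a in enumerate(group_a):
        for b in group_b:            # A x B pairs
            if a | b == full:
                count += 1
        for a2 in group_a[i + 1:]:   # A x A pairs
            if a | a2 == full:
                count += 1
    # B x B pairs can never work: both rows have a 0 in the chosen column
    return count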
5. the limits of what can be done
It is easy to construct examples where the number of vector pairs that work is O(n^2), and therefore it feels very hard to beat O(m*n^2) worst case. What we should be seeking is a solution whose cost is somehow related to the number of pairs that work. The heuristics above go in this direction. If there is a column with small Hamming weight h, then point 4 above brings the running time down to O(h*n*m + h^2*m). If h is significantly smaller than n, then you get big improvements.
Here's an off-the-wall idea that might have even worse asymptotic (or even average) behavior -- but it generalizes in an interesting way, and at least offers a different approach. This problem can be viewed as an exact cover problem. Each of the n rows contains a set of values S from the set {1,2,...,m}, corresponding to the column indices for which the row has the value 1. The task of the problem is to find a collection of rows whose sets form a disjoint partition of {1,2,...m}. When there are only two such rows in an exact cover, these rows are binary opposites of the kind that you are looking for. However, there are more complicated exact covers possible, such as one involving three rows:
0 1 0 0 1
1 0 0 0 0
0 0 1 1 0
The exact cover problem looks for all such exact covers, and is an NP-complete problem. The canonical solution is Algorithm X, created by Donald Knuth.
If I'm not mistaken, the following should be O(n*m):
For each column, compute the set of indices of rows that have a "1" at this column, and store this as a mapping from the column index to the set of row indices
For each row, compute the set of row indices that could "complete" the row (by adding "1"s in the columns where the row has a "0"). This can be done by computing the intersection of the sets that have been computed in step 1 for the respective column
Count the completing row indices
For your example:
1 0 1 0 1
0 1 0 0 1
1 1 1 1 0
The indices of the rows that have a "1" at each column are
Column 0: [0, 2]
Column 1: [1, 2]
Column 2: [0, 2]
Column 3: [2]
Column 4: [0, 1]
The intersections of the sets of indices that can be used for "filling" each row (with already-counted rows removed) are
Row 0: [2]
Row 1: [2]
Row 2: []
Which is 2 in total.
The main reason why one could argue about the running time is that the computation of the intersections of m sets, each with a size of at most n, could be considered to be O(m*n). But I think that the sizes of these sets will be limited: the entries are either 1's or 0's, and when there are many 1's (and the sizes are large), then there are fewer sets to intersect, and vice versa - but I didn't do a rigorous proof here...
A Java-based implementation that I used for playing around with this (and for some basic "tests") :
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class SetOrCombinations
{
    public static void main(String[] args)
    {
        List<Integer> row0 = Arrays.asList(1, 0, 1, 0, 1);
        List<Integer> row1 = Arrays.asList(1, 1, 0, 0, 1);
        List<Integer> row2 = Arrays.asList(1, 1, 1, 1, 0);
        List<Integer> row3 = Arrays.asList(0, 0, 1, 1, 1);
        List<List<Integer>> rows = Arrays.asList(row0, row1, row2, row3);
        run(rows);

        for (int m = 2; m < 10; m++)
        {
            for (int n = 2; n < 10; n++)
            {
                run(generateRandomInput(m, n));
            }
        }
    }

    private static void run(List<List<Integer>> rows)
    {
        int m = rows.get(0).size();
        int n = rows.size();

        // For each column i:
        // Compute the set of rows that "fill" this column with a "1"
        Map<Integer, List<Integer>> fillers =
            new LinkedHashMap<Integer, List<Integer>>();
        for (int i = 0; i < m; i++)
        {
            for (int j = 0; j < n; j++)
            {
                List<Integer> row = rows.get(j);
                List<Integer> list =
                    fillers.computeIfAbsent(i, k -> new ArrayList<Integer>());
                if (row.get(i) == 1)
                {
                    list.add(j);
                }
            }
        }

        // For each row, compute the set of rows that could "complete"
        // the row (by adding "1"s in the columns where the row has
        // a "0").
        int count = 0;
        Set<Integer> processedRows = new LinkedHashSet<Integer>();
        for (int j = 0; j < n; j++)
        {
            processedRows.add(j);
            List<Integer> row = rows.get(j);
            Set<Integer> completers = new LinkedHashSet<Integer>();
            for (int i = 0; i < n; i++)
            {
                completers.add(i);
            }
            for (int i = 0; i < m; i++)
            {
                if (row.get(i) == 0)
                {
                    completers.retainAll(fillers.get(i));
                }
            }
            completers.removeAll(processedRows);
            count += completers.size();
        }
        System.out.println("Count " + count);
        System.out.println("Ref.  " + bruteForceCount(rows));
    }

    // Brute force
    private static int bruteForceCount(List<List<Integer>> lists)
    {
        int count = 0;
        int n = lists.size();
        for (int i = 0; i < n; i++)
        {
            for (int j = i + 1; j < n; j++)
            {
                List<Integer> list0 = lists.get(i);
                List<Integer> list1 = lists.get(j);
                if (areOne(list0, list1))
                {
                    count++;
                }
            }
        }
        return count;
    }

    private static boolean areOne(List<Integer> list0, List<Integer> list1)
    {
        int n = list0.size();
        for (int i = 0; i < n; i++)
        {
            int v0 = list0.get(i);
            int v1 = list1.get(i);
            if (v0 == 0 && v1 == 0)
            {
                return false;
            }
        }
        return true;
    }

    // For testing
    private static Random random = new Random(0);

    private static List<List<Integer>> generateRandomInput(int m, int n)
    {
        List<List<Integer>> rows = new ArrayList<List<Integer>>();
        for (int i = 0; i < n; i++)
        {
            List<Integer> row = new ArrayList<Integer>();
            for (int j = 0; j < m; j++)
            {
                row.add(random.nextInt(2));
            }
            rows.add(row);
        }
        return rows;
    }
}
Expanding on TheGreatContini's idea:
First try
Let's look at it as finding combinations that belong to AxB, with A and B sets of rows. These combinations must satisfy the OR condition, but we'll also assume the Hamming weight of a is at least as large as that of b (to avoid some duplicates).
Now split A into A0 (rows that start with 0) and A1 (rows that start with 1). Do the same for B. We have now reduced the problem into three smaller problems: A0xB1, A1xB1 and A1xB0. If A and B are the same, A0xB1 and A1xB0 are the same, so we only need to do one. Not only are these three subproblems combined smaller than the first, we've also fully checked the first column and can ignore it from now on.
To solve these subproblems, we can use the same approach, but now with column 2, 3, ... At some point, either we will have checked all the columns or #A and #B will be 1.
Depending on the implementation, stopping sooner might be more efficient. At that point we can do an exhaustive check of the remaining combinations. Remember though that if we have already checked k columns, this will only cost m-k per combination.
Better column selection
As TheGreatContini suggested, instead of selecting the first column, we can select the column that would lead to the smallest subproblems. The cost of finding this column at each step is rather high, but the weights could be calculated once in the beginning and then used as an estimate for the best column. We can then rearrange the columns, use the algorithm as normal, and rearrange them again once it is done.
The exact best column would be the column for which the number of zeros in A times the number of zeros in B is maximal.
Hamming weight pruning
We know that the sum of the Hamming weights of a and b must be at least m. And since we assumed a to have the higher Hamming weight, we can remove all the values of a that have a Hamming weight less than m/2 (the speedup this gives might be negligible, I'm not sure). Calculating all the Hamming weights costs O(m*n).
Efficient Splitting
If we sort the rows, the splitting into groups can be done much faster using bisection. This can also lead to an efficient representation of the sets in memory; we can just specify the minimum and maximum row. Sorting can be done in O(n*m*log(n)). Splitting can then be done in about O(log(n)).
Here's some code that won't compile, but should give the right idea.
private List<Row> rows;

public int findFirstOne(int column, int start, int end) {
    if (rows.get(start).get(column) == 1) return start;
    if (rows.get(end).get(column) == 0) return -1;
    while (start < end) {
        int mid = (start + end) / 2;
        if (rows.get(mid).get(column) == 0) {
            start = mid + 1;
        } else {
            end = mid;
        }
    }
    return start;
}
Complexity
In the following calculations the effect of the better column selection is ignored, as it will give little improvement on worst-case efficiency. However, in average cases it can give a massive improvement by reducing the search space as soon as possible and thereby making the other checks faster.
The algorithm's run time is bounded by n²m.
However, the worst examples I have found are all O(n*log(n)*m).
First, the sorting of the matrix will be O(n*log(n)*m) for the rows and, optionally, sorting the columns will be O(n*m + m*log(m)).
Then, the creation of the subproblems. Let's make an overestimate first. We need to subdivide at most m times, and the cost of a full level of subdivisions at depth i can be overestimated as log(n)*3^i (cost per subdivision times number of subdivisions). This leads to O(log(n)*3^m) in total.
It must also hold that 3^i <= n²/2, because this is the maximum number of combinations possible, so for large m it caps at O(n²*log(n)*m). I struggle to find an example that actually behaves like this though.
I think it is reasonable to assume many of the subproblems become trivial very early, leading to O(log(n)*m*n) (if someone wants to check this, I'm not really sure about it).
Here is an algorithm that takes advantage of the knowledge that two rows with a zero in the same column are automatically disqualified as partners. The fewer zeros we have in the present row, the fewer other rows we visit; but also the more zeros we have overall, the fewer other rows we visit.
create two sets, one with a list of indexes of all rows, and the other empty
assign a variable, total = 0
Iterate over each row from right to left, from the bottom row to the top (could be in another order, too; I just pictured it that way).
while row i is not the first row:
    call the non-empty set A and the empty set dont_match
    remove i, the index of the current row, from A
    traverse row i:
        if A is empty:
            stop the traversal
        if a zero is encountered:
            traverse up that column, visiting only rows listed in A:
                if a zero is encountered:
                    move that row index from A to dont_match
    the remaining indexes in A point to row partners of row i;
    add their count to total and move the elements from the
    shorter of A and dont_match to the other set
return total
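A simplified Python rendering of the idea (it keeps the column-by-column disqualification but skips the set-reuse optimization of the last step):

def count_pairs(rows, m):
    total = 0
    for i in range(len(rows) - 1, 0, -1):  # bottom to top
        A = set(range(i))                  # candidate partners above row i
        for col in range(m - 1, -1, -1):   # right to left
            if not A:
                break
            if rows[i][col] == 0:
                # any candidate that also has a zero here is disqualified
                A = {r for r in A if rows[r][col] == 1}
        total += len(A)  # the survivors OR with row i to all ones
    return total

# count_pairs([[1,0,1,0,1], [0,1,0,0,1], [1,1,1,1,0]], 5) -> 2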

Find all products of members of a set with multiplicity of factors

Given a set of numbers S, how would one find all possible products of those numbers (with multiplicity of any or all factors) below a given n?
For example, given S = {2, 3, 5, 7, 11}, n = 12, after sorting I would want to see:
{ 1, 2, 3, 2^2, 5, 2*3, 7, 2^3, 3^2, 2*5, 11, 3*2^2 }
If the size of S were a known small constant, I could use:
int p, sum = 0;
for(int i[0] = 0; p <= n; i[0]++){
    for(int i[1] = 0; p <= n; i[1]++){
        for(etc.){...
            p = pow(S[0],i[0])*pow(S[1],i[1])*pow(...
            // do what you wanted to with p here
            // in my case:
            for(int j = 0; j < SIZE_OF_I; j++){
                if(i[j] > 0){ p *= S[j] - 1; p /= S[j]; }
            }
            sum += p;
        }
    }
}
But I need it to work for S of arbitrary size. Intuitively, it feels like a recursion problem, but I'm not sure how to begin.
One thing I should point out is that this is for Project Euler, so I can't get help with the maths.
You can systematically build the series using a priority queue.
The idea is to always have the smallest element yet to be produced on the top of the queue, and when you process it, add to the queue this number multiplied by all factors in S.
Pseudo code:
Create a priority queue (min-heap) q
Add 1 to q
set last = 0
while q.top() <= n:
    current = q.popHead()
    if current == last:
        continue
    last = current
    yield current
    for each x in S:
        q.add(x * current)
The algorithm will yield all elements which can be factored into elements in S, and in ascending order.
Improvement suggestions:
Use a set to make sure no element is inserted twice into the queue, rather than the patch of using last that I presented here.
Note it can also be improved so that you won't need a priority queue but only |S| queues, if |S| is fairly small: have a regular queue for each factor, and when processing a number, add it to each of the queues multiplied by its factor.
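A runnable Python version of the pseudocode, using heapq as the min-heap:

import heapq

def products_up_to(S, n):
    q = [1]
    last = 0
    while q and q[0] <= n:
        current = heapq.heappop(q)
        if current == last:  # skip duplicates
            continue
        last = current
        yield current
        for x in S:
            if x * current <= n:
                heapq.heappush(q, x * current)

# list(products_up_to([2, 3, 5, 7, 11], 12))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]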
If you just need the final products, you can use a recursive function:
function prod(base[], n, limit) {
    if (base.length > 0 && n <= limit) {
        process(n)
        for (i = 0; i < base.length; i++) {
            prod(base[i:], n * base[i], limit)
        }
    }
}
This will process the products, but not in order. This is essentially your nested loop variant expressed as recursion. The recursion stops once the limit is exceeded. (The pseudocode notation base[i:] means the subarray from the ith element on to the end.)
If you need the information on the prime factors that were used and the respective exponents, you should pass an additional array of exponents, which you must adjust for each recursive call.
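A direct Python transcription of the sketch, collecting the products into a list instead of calling process:

def prod(base, n, limit, out):
    if base and n <= limit:
        out.append(n)
        for i in range(len(base)):
            # base[i:] keeps factor sequences non-decreasing, so no duplicates
            prod(base[i:], n * base[i], limit, out)

results = []
prod([2, 3, 5, 7, 11], 1, 12, results)
# sorted(results) -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]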

array- having some issues [duplicate]

An interesting interview question that a colleague of mine uses:
Suppose that you are given a very long, unsorted list of unsigned 64-bit integers. How would you find the smallest non-negative integer that does not occur in the list?
FOLLOW-UP: Now that the obvious solution by sorting has been proposed, can you do it faster than O(n log n)?
FOLLOW-UP: Your algorithm has to run on a computer with, say, 1GB of memory
CLARIFICATION: The list is in RAM, though it might consume a large amount of it. You are given the size of the list, say N, in advance.
If the data structure can be mutated in place and supports random access, then you can do it in O(N) time and O(1) additional space. Just go through the array sequentially, and for every index, write the value at that index to the index specified by the value, recursively placing any value at that location to its place and throwing away values > N. Then go through the array again looking for the spot where the value doesn't match the index - that's the smallest value not in the array. This results in at most 3N comparisons and only uses a few values' worth of temporary space.
def smallest_missing(array, N):
    # Pass 1, move every value to the position of its value
    for cursor in range(N):
        target = array[cursor]
        while target < N and target != array[target]:
            new_target = array[target]
            array[target] = target
            target = new_target
    # Pass 2, find the first location where the index doesn't match the value
    for cursor in range(N):
        if array[cursor] != cursor:
            return cursor
    return N
Here's a simple O(N) solution that uses O(N) space. I'm assuming that we are restricting the input list to non-negative numbers and that we want to find the first non-negative number that is not in the list.
1. Find the length of the list; let's say it is N.
2. Allocate an array of N booleans, initialized to all false.
3. For each number X in the list, if X is less than N, set the X'th element of the array to true.
4. Scan the array starting from index 0, looking for the first element that is false. If you find the first false at index I, then I is the answer. Otherwise (i.e. when all elements are true) the answer is N.
In practice, the "array of N booleans" would probably be encoded as a "bitmap" or "bitset" represented as a byte or int array. This typically uses less space (depending on the programming language) and allows the scan for the first false to be done more quickly.
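A minimal Python sketch of those four steps:

def smallest_missing(numbers):
    n = len(numbers)                    # step 1
    seen = [False] * n                  # step 2
    for x in numbers:                   # step 3
        if 0 <= x < n:
            seen[x] = True
    for i, present in enumerate(seen):  # step 4
        if not present:
            return i
    return n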
This is how / why the algorithm works.
Suppose that the N numbers in the list are not distinct, or that one or more of them is greater than or equal to N. This means that there must be at least one number in the range 0 .. N - 1 that is not in the list. So the problem of finding the smallest missing number therefore reduces to the problem of finding the smallest missing number less than N. This means that we don't need to keep track of numbers that are greater than or equal to N ... because they won't be the answer.
The alternative to the previous paragraph is that the list is a permutation of the numbers from 0 .. N - 1. In this case, step 3 sets all elements of the array to true, and step 4 tells us that the first "missing" number is N.
The computational complexity of the algorithm is O(N), with a relatively small constant of proportionality. It makes two linear passes through the list, or just one pass if the list length is known to start with. There is no need to hold the entire list in memory, so the algorithm's asymptotic memory usage is just what is needed to represent the array of booleans; i.e. O(N) bits.
(By contrast, algorithms that rely on in-memory sorting or partitioning assume that you can represent the entire list in memory. In the form the question was asked, this would require O(N) 64-bit words.)
@Jorn comments that steps 1 through 3 are a variation on counting sort. In a sense he is right, but the differences are significant:
A counting sort requires an array of (at least) Xmax - Xmin counters, where Xmax is the largest number in the list and Xmin is the smallest number in the list. Each counter has to be able to represent N states; i.e. assuming a binary representation it has to be an integer type of (at least) ceiling(log2(N)) bits.
To determine the array size, a counting sort needs to make an initial pass through the list to determine Xmax and Xmin.
The minimum worst-case space requirement is therefore ceiling(log2(N)) * (Xmax - Xmin) bits.
By contrast, the algorithm presented above simply requires N bits in the worst and best cases.
However, this analysis leads to the intuition that if the algorithm made an initial pass through the list looking for a zero (and counting the list elements if required), it would give a quicker answer using no space at all if it found the zero. It is definitely worth doing this if there is a high probability of finding at least one zero in the list. And this extra pass doesn't change the overall complexity.
EDIT: I've changed the description of the algorithm to use "array of booleans" since people apparently found my original description using bits and bitmaps to be confusing.
Since the OP has now specified that the original list is held in RAM and that the computer has only, say, 1GB of memory, I'm going to go out on a limb and predict that the answer is zero.
1GB of RAM means the list can have at most 134,217,728 numbers in it. But there are 2^64 = 18,446,744,073,709,551,616 possible numbers. So the probability that zero is in the list is 1 in 137,438,953,472.
In contrast, my odds of being struck by lightning this year are 1 in 700,000. And my odds of getting hit by a meteorite are about 1 in 10 trillion. So I'm about ten times more likely to be written up in a scientific journal due to my untimely death by a celestial object than the answer not being zero.
As pointed out in other answers you can do a sort, and then simply scan up until you find a gap.
You can improve the algorithmic complexity to O(N) and keep O(N) space by using a modified QuickSort where you eliminate partitions which are not potential candidates for containing the gap.
On the first partition phase, remove duplicates.
Once the partitioning is complete, look at the number of items in the lower partition.
Is this value equal to the value used for creating the partition?
If so then it implies that the gap is in the higher partition.
Continue with the quicksort, ignoring the lower partition
Otherwise the gap is in the lower partition
Continue with the quicksort, ignoring the higher partition
This saves a large number of computations.
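Here is a sketch of that pruning idea in Python (a binary-search flavor rather than a literal quicksort; the bookkeeping is my own):

def smallest_missing(nums):
    lo, hi = 0, len(nums) + 1  # the smallest missing value lies in [lo, hi)
    while hi - lo > 1:
        pivot = (lo + hi) // 2
        lower = {x for x in nums if lo <= x < pivot}  # dedupe while partitioning
        if len(lower) == pivot - lo:  # every value in [lo, pivot) is present,
            lo = pivot                # so the gap is in the higher partition
        else:
            hi = pivot                # the gap is in the lower partition
        nums = [x for x in nums if lo <= x < hi]  # ignore the other partition
    return lo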
To illustrate one of the pitfalls of O(N) thinking, here is an O(N) algorithm that uses O(1) space.
for i in [0..2^64):
    if i not in list: return i
print "no 64-bit integers are missing"
Since the numbers are all 64 bits long, we can use radix sort on them, which is O(n). Sort 'em, then scan 'em until you find what you're looking for.
If the smallest number is zero, scan forward until you find a gap. If the smallest number is not zero, the answer is zero.
For a space-efficient method, if all values are distinct, you can do it in O(k) space and O(k*log(N)*N) time. There is no data movement, and all operations are elementary (adding, subtracting).

1. set U = N; L = 0
2. Partition the number space into k regions: L + (0/k)*(U-L) .. L + (1/k)*(U-L), L + (1/k)*(U-L) .. L + (2/k)*(U-L), ..., up to U
3. Find how many numbers (count{i}) are in each region. (N*k steps)
4. Find the first region (h) that isn't full. That means count{h} < upper_limit{h}. (k steps)
5. if h - count{h-1} = 1 you've got your answer
6. set U = count{h}; L = count{h-1}
7. goto 2
This can be improved using hashing (thanks to Nic for this idea):

same as above, except:
2. Partition the number space into k regions: L + (i/k)*(U-L) .. L + ((i+1)/k)*(U-L)
3. inc count{j} using j = (number - L)/k (if L < number < U)
4. find the first region (h) that doesn't have k elements in it
5. if count{h} = 1, h is your answer
6. set U = maximum value in region h; L = minimum value in region h
This will run in O(log(N)*N).
I'd just sort them then run through the sequence until I find a gap (including the gap at the start between zero and the first number).
In terms of an algorithm, something like this would do it:
def smallest_not_in_list(list):
    sort(list)
    if list[0] != 0:
        return 0
    for i = 1 to list.last:
        if list[i] != list[i-1] + 1:
            return list[i-1] + 1
    if list[list.last] == 2^64 - 1:
        assert ("No gaps")
    return list[list.last] + 1
Of course, if you have a lot more memory than CPU grunt, you could create a bitmask of all possible 64-bit values and just set the bits for every number in the list. Then look for the first 0-bit in that bitmask. That turns it into an O(n) operation in terms of time but pretty damned expensive in terms of memory requirements :-)
I doubt you could improve on O(n) since I can't see a way of doing it that doesn't involve looking at each number at least once.
The algorithm for that one would be along the lines of:
def smallest_not_in_list(list):
    bitmask = mask_make(2^64) // might take a while :-)
    mask_clear_all(bitmask)
    for i = 1 to list.last:
        mask_set(bitmask, list[i])
    for i = 0 to 2^64 - 1:
        if mask_is_clear(bitmask, i):
            return i
    assert ("No gaps")
Sort the list, look at the first and second elements, and start going up until there is a gap.
We could use a hash table to hold the numbers. Once all numbers are done, run a counter from 0 till we find the lowest. A reasonably good hash will hash and store in constant time, and retrieves in constant time.
for every i in X          // one scan, O(n) total
    hashtable.put(i, i)   // O(1)
low = 0
while (hashtable.get(low) <> null)  // at most n+1 probes
    low++
print low
The worst case is if there are n elements in the array and they are {0, 1, ..., n-1}, in which case the answer will be obtained at n, still keeping it O(n).
You can do it in O(n) time and O(1) additional space, although the hidden factor is quite large. This isn't a practical way to solve the problem, but it might be interesting nonetheless.
For every unsigned 64-bit integer (in ascending order) iterate over the list until you find the target integer or you reach the end of the list. If you reach the end of the list, the target integer is the smallest integer not in the list. If you reach the end of the 64-bit integers, every 64-bit integer is in the list.
Here it is as a Python function:
def smallest_missing_uint64(source_list):
    the_answer = None
    target = 0
    while target < 2**64:
        target_found = False
        for item in source_list:
            if item == target:
                target_found = True
        if not target_found and the_answer is None:
            the_answer = target
        target += 1
    return the_answer
This function is deliberately inefficient to keep it O(n). Note especially that the function keeps checking target integers even after the answer has been found. If the function returned as soon as the answer was found, the number of times the outer loop ran would be bound by the size of the answer, which is bound by n. That change would make the run time O(n^2), even though it would be a lot faster.
Thanks to egon, swilden, and Stephen C for my inspiration. First, we know the bounds of the goal value because it cannot be greater than the size of the list. Also, a 1GB list could contain at most 134217728 (128 * 2^20) 64-bit integers.
Hashing part
I propose using hashing to dramatically reduce our search space. First, square root the size of the list. For a 1GB list, that's N = 11,586. Set up an integer array of size N. Iterate through the list, and take the square root* of each number you find as your hash. In your hash table, increment the counter for that hash. Next, iterate through your hash table. The first bucket you find that is not equal to its max size defines your new search space.
Bitmap part
Now set up a regular bit map equal to the size of your new search space, and again iterate through the source list, filling out the bitmap as you find each number in your search space. When you're done, the first unset bit in your bitmap will give you your answer.
This will be completed in O(n) time and O(sqrt(n)) space.
(*You could use something like bit shifting to do this a lot more efficiently, and just vary the number and size of buckets accordingly.)
Well if there is only one missing number in a list of numbers, the easiest way to find the missing number is to sum the series and subtract each value in the list. The final value is the missing number.
int i = 0;
while (i < Array.Length)
{
    if (Array[i] == i + 1)
    {
        i++;
    }
    if (i < Array.Length)
    {
        if (Array[i] <= Array.Length)
        {
            // Swap
            int temp = Array[i];
            int AnoTemp = Array[temp - 1];
            Array[temp - 1] = temp;
            Array[i] = AnoTemp;
        }
        else
            i++;
    }
}

for (int j = 0; j < Array.Length; j++)
{
    if (Array[j] > Array.Length)
    {
        Console.WriteLine(j + 1);
        j = Array.Length;
    }
    else if (j == Array.Length - 1)
        Console.WriteLine("Not Found !!");
}
Here's my answer written in Java:
Basic Idea:
1- Loop through the array, throwing away duplicates, zeros, and negative numbers, while summing up the rest, finding the maximum positive number as well, and keeping the unique positive numbers in a Map.
2- Compute the sum as max * (max+1)/2.
3- Find the difference between the sums calculated in steps 1 & 2.
4- Loop again from 1 to the minimum of [sum difference, max] and return the first number that is not in the map populated in step 1.
public static int solution(int[] A) {
    if (A == null || A.length == 0) {
        throw new IllegalArgumentException();
    }
    int sum = 0;
    Map<Integer, Boolean> uniqueNumbers = new HashMap<Integer, Boolean>();
    int max = A[0];
    for (int i = 0; i < A.length; i++) {
        if (A[i] < 0) {
            continue;
        }
        if (uniqueNumbers.get(A[i]) != null) {
            continue;
        }
        if (A[i] > max) {
            max = A[i];
        }
        uniqueNumbers.put(A[i], true);
        sum += A[i];
    }
    int completeSum = (max * (max + 1)) / 2;
    for (int j = 1; j <= Math.min((completeSum - sum), max); j++) {
        if (uniqueNumbers.get(j) == null) { // O(1)
            return j;
        }
    }
    // All negative case
    if (uniqueNumbers.isEmpty()) {
        return 1;
    }
    return 0;
}
As Stephen C smartly pointed out, the answer must be a number smaller than the length of the array. I would then find the answer by binary search. This optimizes the worst case (so the interviewer can't catch you in a 'what if' pathological scenario). In an interview, do point out you are doing this to optimize for the worst case.
The way to use binary search is to subtract the number you are looking for from each element of the array, and check for negative results.
I like the "guess zero" approach. If the numbers were random, zero is highly probable. If the "examiner" set a non-random list, then add one and guess again:
LowNum = 0
i = 0
do forever {
    if i == N then leave   /* Processed entire array */
    if array[i] == LowNum {
        LowNum++
        i = 0
    }
    else {
        i++
    }
}
display LowNum
The worst case is n*N with n = N, but in practice n is very likely to be a small number (e.g. 1).
I am not sure if I got the question. But if for the list 1,2,3,5,6 the missing number is 4, then the missing number can be found in O(n) by:
(n+2)(n+1)/2 - (n+1)n/2
EDIT: sorry, I guess I was thinking too fast last night. Anyway, the second part should actually be replaced by sum(list), which is where the O(n) comes from. The formula reveals the idea behind it: for n sequential integers, the sum should be (n+1)*n/2. If there is a missing number, the sum would be equal to the sum of (n+1) sequential integers minus the missing number.
Thanks for pointing out the fact that I was putting some middle pieces in my mind.
Well done Ants Aasma! I thought about the answer for about 15 minutes and independently came up with an answer in a similar vein of thinking to yours:
#define SWAP(x,y) { numerictype_t tmp = x; x = y; y = tmp; }

int minNonNegativeNotInArr(numerictype_t *a, size_t n) {
    int m = n;
    for (int i = 0; i < m;) {
        if (a[i] >= m || a[i] < i || a[i] == a[a[i]]) {
            m--;
            SWAP(a[i], a[m]);
            continue;
        }
        if (a[i] > i) {
            SWAP(a[i], a[a[i]]);
            continue;
        }
        i++;
    }
    return m;
}
m represents "the current maximum possible output given what I know about the first i inputs and assuming nothing else about the values until the entry at m-1".
This value of m will be returned only if (a[i], ..., a[m-1]) is a permutation of the values (i, ..., m-1). Thus if a[i] >= m or if a[i] < i or if a[i] == a[a[i]] we know that m is the wrong output and must be at least one element lower. So decrementing m and swapping a[i] with the a[m] we can recurse.
If this is not true but a[i] > i then knowing that a[i] != a[a[i]] we know that swapping a[i] with a[a[i]] will increase the number of elements in their own place.
Otherwise a[i] must be equal to i, in which case we can increment i, knowing that all the values up to and including this index are equal to their index.
The proof that this cannot enter an infinite loop is left as an exercise to the reader. :)
The Dafny fragment from Ants' answer shows why the in-place algorithm may fail. The requires pre-condition describes that the values of each item must not go beyond the bounds of the array.
method AntsAasma(A: array<int>) returns (M: int)
    requires A != null && forall N :: 0 <= N < A.Length ==> 0 <= A[N] < A.Length;
    modifies A;
{
    // Pass 1, move every value to the position of its value
    var N := A.Length;
    var cursor := 0;
    while (cursor < N)
    {
        var target := A[cursor];
        while (0 <= target < N && target != A[target])
        {
            var new_target := A[target];
            A[target] := target;
            target := new_target;
        }
        cursor := cursor + 1;
    }
    // Pass 2, find first location where the index doesn't match the value
    cursor := 0;
    while (cursor < N)
    {
        if (A[cursor] != cursor)
        {
            return cursor;
        }
        cursor := cursor + 1;
    }
    return N;
}
Paste the code into the validator with and without the forall ... clause to see the verification error. The second error is a result of the verifier not being able to establish a termination condition for the Pass 1 loop. Proving this is left to someone who understands the tool better.
Here's an answer in Java that does not modify the input and uses O(N) time and N bits plus a small constant overhead of memory (where N is the size of the list):
int smallestMissingValue(List<Integer> values) {
    BitSet bitset = new BitSet(values.size() + 1);
    for (int i : values) {
        if (i >= 0 && i <= values.size()) {
            bitset.set(i);
        }
    }
    return bitset.nextClearBit(0);
}
def solution(A):
    index = 0
    target = []
    A = [x for x in A if x >= 0]
    if len(A) == 0:
        return 1
    maxi = max(A)
    if maxi <= len(A):
        maxi = len(A)
    target = ['X' for x in range(maxi + 1)]
    for number in A:
        target[number] = number
    count = 1
    while count < maxi + 1:
        if target[count] == 'X':
            return count
        count += 1
    return target[count - 1] + 1
Got 100% for the above solution.
1) Filter out negatives and zero.
2) Sort/distinct.
3) Visit the array.
Complexity: O(N) or O(N * log(N)).
Using Java 8:
public int solution(int[] A) {
    int result = 1;
    boolean found = false;
    A = Arrays.stream(A).filter(x -> x > 0).sorted().distinct().toArray();
    //System.out.println(Arrays.toString(A));
    for (int i = 0; i < A.length; i++) {
        result = i + 1;
        if (result != A[i]) {
            found = true;
            break;
        }
    }
    if (!found && result == A.length) {
        // result is larger than the max element in the array
        result++;
    }
    return result;
}
An unordered_set can be used to store all the positive numbers; then we can iterate from 1 up to the size of the set, checking for the first number that does not occur.
int firstMissingPositive(vector<int>& nums) {
    unordered_set<int> fre;
    // storing each positive number in a hash
    for (int i = 0; i < nums.size(); i += 1) {
        if (nums[i] > 0)
            fre.insert(nums[i]);
    }
    int i = 1;
    // Iterating from 1 to the size of the set and checking
    // for the occurrence of 'i'
    for (auto it = fre.begin(); it != fre.end(); ++it) {
        if (fre.find(i) == fre.end())
            return i;
        i += 1;
    }
    return i;
}
Solution using basic JavaScript:

var a = [1, 3, 6, 4, 1, 2];

function findSmallest(a) {
    var m = 0;
    for (i = 1; i <= a.length; i++) {
        j = 0; m = 1;
        while (j < a.length) {
            if (i === a[j]) {
                m++;
            }
            j++;
        }
        if (m === 1) {
            return i;
        }
    }
}
console.log(findSmallest(a));

Hope this helps someone.
In Python; it is not the most efficient, but it is correct:
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
import datetime

# write your code in Python 3.6
def solution(A):
    MIN = 0
    MAX = 1000000
    possible_results = range(MIN, MAX)
    for i in possible_results:
        next_value = (i + 1)
        if next_value not in A:
            return next_value
    return 1

test_case_0 = [2, 2, 2]
test_case_1 = [1, 3, 44, 55, 6, 0, 3, 8]
test_case_2 = [-1, -22]
test_case_3 = [x for x in range(-10000, 10000)]
test_case_4 = [x for x in range(0, 100)] + [x for x in range(102, 200)]
test_case_5 = [4, 5, 6]

print("---")
a = datetime.datetime.now()
print(solution(test_case_0))
print(solution(test_case_1))
print(solution(test_case_2))
print(solution(test_case_3))
print(solution(test_case_4))
print(solution(test_case_5))
def solution(A):
    A.sort()
    j = 1
    for i, elem in enumerate(A):
        if j < elem:
            break
        elif j == elem:
            j += 1
            continue
        else:
            continue
    return j
this can help:
0- A is [5, 3, 2, 7];
1- Define B with length = A.length; (O(1))
2- Initialize B's cells with 1; (O(n))
3- For each item in A:
       if (item < B.length) then B[item] = -1 (O(n))
4- The answer is the smallest index in B such that B[index] != -1 (O(n))

Find the x smallest integers in a list of length n

You have a list of n integers and you want the x smallest. For example,
x_smallest([1, 2, 5, 4, 3], 3) should return [1, 2, 3].
I'll vote up unique runtimes within reason and will give the green check to the best runtime.
I'll start with O(n * x): Create an array of length x. Iterate through the list x times, each time pulling out the next smallest integer.
Edits
You have no idea how big or small these numbers are ahead of time.
You don't care about the final order, you just want the x smallest.
This is already being handled in some solutions, but let's say that while you aren't guaranteed a unique list, you aren't going to get a degenerate list either, such as [1, 1, 1, 1, 1].
You can find the k-th smallest element in O(n) time. This has been discussed on StackOverflow before. There are relatively simple randomized algorithms, such as QuickSelect, that run in O(n) expected time and more complicated algorithms that run in O(n) worst-case time.
Given the k-th smallest element you can make one pass over the list to find all elements less than the k-th smallest and you are done. (I assume that the result array does not need to be sorted.)
Overall run-time is O(n).
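A sketch with a simple randomized quickselect (expected O(n); the padding handles duplicates of the k-th value):

import random

def kth_smallest(items, k):
    # k-th smallest element, 0-based, O(n) expected time
    pivot = random.choice(items)
    lt = [v for v in items if v < pivot]
    eq = [v for v in items if v == pivot]
    gt = [v for v in items if v > pivot]
    if k < len(lt):
        return kth_smallest(lt, k)
    if k < len(lt) + len(eq):
        return pivot
    return kth_smallest(gt, k - len(lt) - len(eq))

def x_smallest(items, x):
    kth = kth_smallest(items, x - 1)
    result = [v for v in items if v < kth]
    result += [kth] * (x - len(result))  # pad with copies of the k-th value
    return result

# x_smallest([1, 2, 5, 4, 3], 3) -> [1, 2, 3]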
Maintain the list of the x smallest so far in sorted order in a skip-list. Iterate through the array. For each element, find where it would be inserted in the skip list (log x time). If it falls in the interior of the list, it is one of the smallest x so far, so insert it and remove the element at the end of the list. Otherwise do nothing.
Time O(n*log(x))
Alternative implementation: maintain the collection of the x smallest so far in a max-heap, compare each new element with the top element of the heap, and pop + insert the new element only if the new element is less than the top element. Since comparison with the top element is O(1) and pop/insert is O(log x), this is also O(n*log(x)).
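The heap variant is easy with Python's heapq (a min-heap, so values are negated to simulate a max-heap):

import heapq

def x_smallest(items, x):
    heap = []  # negated values: -heap[0] is the largest of the x smallest
    for v in items:
        if len(heap) < x:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            heapq.heapreplace(heap, -v)  # pop + insert in O(log x)
    return sorted(-v for v in heap)

# x_smallest([1, 2, 5, 4, 3], 3) -> [1, 2, 3]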
Add all n numbers to a heap and delete x of them. Complexity is O((n + x) log n). Since x is obviously less than n, it's O(n log n).
If the range of numbers (L) is known, you can do a modified counting sort.
given L, x, input[]

counts <- array[0..L]
for each number in input
    increment counts[number]
next

# populate the output
index <- 0
xIndex <- 0
while xIndex < x and index <= L
    if counts[index] > 0 then
        decrement counts[index]
        output[xIndex] = index
        increment xIndex
    else
        increment index
    end if
loop
This has a runtime of O(n + L) (with memory overhead of O(L)) which makes it pretty attractive if the range is small (L < n log n).
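A direct Python version of the pseudocode (assumes nonnegative integers bounded by L):

def x_smallest(input_list, x, L):
    counts = [0] * (L + 1)
    for number in input_list:
        counts[number] += 1
    output = []
    index = 0
    while len(output) < x and index <= L:
        if counts[index] > 0:
            counts[index] -= 1
            output.append(index)  # emit one copy of this value
        else:
            index += 1
    return output

# x_smallest([1, 2, 5, 4, 3], 3, 5) -> [1, 2, 3]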
def x_smallest(items, x):
    result = sorted(items[:x])
    for i in items[x:]:
        if i < result[-1]:
            result[-1] = i
            j = x - 1
            while j > 0 and result[j] < result[j-1]:
                result[j-1], result[j] = result[j], result[j-1]
                j -= 1
    return result
Worst case is O(x*n), but will typically be closer to O(n).
Pseudocode:

def x_smallest(array<int> arr, int limit)
    array<int> ret = new array[limit]
    ret = {INT_MAX}
    for i in arr
        for j in range(0..limit)
            if (i < ret[j])
                swap(i, ret[j])  // push the displaced value along so ret stays sorted
            endif
        endfor
    endfor
    return ret
enddef
In pseudo code:
y = length of list / 2
if (x > y)
    iterate and pop off the (length - x) largest
else
    iterate and pop off the x smallest
O(n/2 * x) ?
sort array
slice array 0 x
Choose the best sort algorithm and you're done: http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms
You can sort then take the first x values?
Java: with QuickSort O(n log n)
import java.util.Arrays;
import java.util.Random;

public class Main {

    public static void main(String[] args) {
        Random random = new Random(); // Random number generator
        int[] list = new int[1000];
        int length = 3;

        // Initialize array with positive random values
        for (int i = 0; i < list.length; i++) {
            list[i] = Math.abs(random.nextInt());
        }

        // Solution
        int[] output = findSmallest(list, length);

        // Display results
        for (int x : output)
            System.out.println(x);
    }

    private static int[] findSmallest(int[] list, int length) {
        // A tuned quicksort
        Arrays.sort(list);
        // Send back the correct length
        return Arrays.copyOf(list, length);
    }
}
It's pretty fast.
private static int[] x_smallest(int[] input, int x)
{
    int[] output = new int[x];
    for (int i = 0; i < x; i++) { // O(x)
        output[i] = input[i];
    }
    for (int i = x; i < input.Length; i++) { // + O(n-x)
        int current = input[i];
        int temp;
        for (int j = 0; j < output.Length; j++) { // * O(x)
            if (current < output[j]) {
                temp = output[j];
                output[j] = current;
                current = temp;
            }
        }
    }
    return output;
}
Looking at the complexity:
O(x + (n-x) * x) -- assuming x is some constant, O(n)
What about using a splay tree? Because of the splay tree's unique approach to adaptive balancing, it makes for a slick implementation of the algorithm, with the added benefit of being able to enumerate the x items in order afterwards. Here is some pseudocode.
public SplayTree GetSmallest(int[] array, int x)
{
    var tree = new SplayTree();
    for (int i = 0; i < array.Length; i++)
    {
        int max = tree.GetLargest();
        if (array[i] < max || tree.Count < x)
        {
            if (tree.Count >= x)
            {
                tree.Remove(max);
            }
            tree.Add(array[i]);
        }
    }
    return tree;
}
The GetLargest and Remove operations have an amortized complexity of O(log(n)), but because the last accessed item bubbles to the top, it would normally be O(1). So the space complexity is O(x) and the runtime complexity is O(n*log(x)). If the array happens to already be ordered, then this algorithm would achieve its best-case complexity of O(n) with either an ascending or descending ordered array. However, a very odd or peculiar ordering could result in O(n^2) complexity. Can you guess how the array would have to be ordered for that to happen?
In Scala, and probably other functional languages, a no-brainer:
scala> List (1, 3, 6, 4, 5, 1, 2, 9, 4) sortWith ( _<_ ) take 5
res18: List[Int] = List(1, 1, 2, 3, 4)
