I am trying to sort strings alphabetically in linear time and thought about using tries for this. My question is: what is the time complexity of running a pre-order traversal on a trie? Is it O(n)?
You have to be a little careful with the way you measure complexity in this case. A lot of times, people pretend that sorting N strings with a comparison-based sort takes O(N log N) time, but that is not really true in the worst case unless the length of the strings is bounded. It is the expected time if the strings are randomized, however, so it's not a bad approximation for many use cases.
If you want to account for possible long strings with long common prefixes, then you change the meaning of N to refer to the total size of the input, including all the strings. With this new definition, you can sort a list of strings in O(N) time.
Inserting the strings into a trie, or better a radix tree (https://en.wikipedia.org/wiki/Radix_tree) and then doing a preorder traversal is one way, and yes that works in O(N) time, where N is the total size of the input.
But it's faster and easier to do a radix sort: https://en.wikipedia.org/wiki/Radix_sort The Most-Significant-Digit-First variant works best with variable-length inputs.
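As a rough illustration of the trie approach described above, here is a minimal Python sketch. The function name `trie_sort` and the `[children, count]` node layout are assumptions made for this example, not something from the answer. It inserts every string into a plain trie and then does an iterative pre-order traversal, visiting children in character order. For a fixed alphabet, sorting the children of a node is a constant cost per node.

```python
def trie_sort(words):
    # Each node is a two-item list: [children dict, number of words ending here].
    root = [{}, 0]
    for w in words:
        node = root
        for ch in w:
            node = node[0].setdefault(ch, [{}, 0])
        node[1] += 1

    out = [""] * root[1]          # empty strings sort first
    path = []                     # characters on the path from the root
    stack = [iter(sorted(root[0].items(), key=lambda kv: kv[0]))]
    while stack:
        nxt = next(stack[-1], None)
        if nxt is None:
            # All children of the current node are done: go back up.
            stack.pop()
            if path:
                path.pop()
            continue
        ch, child = nxt
        path.append(ch)
        # Pre-order: emit the words ending at this node before its children.
        out.extend(["".join(path)] * child[1])
        stack.append(iter(sorted(child[0].items(), key=lambda kv: kv[0])))
    return out


print(trie_sort(["a", "aa", "aaa", "kinga", "bishoy", "computer", "az"]))
```

Duplicates are kept because each node stores a count rather than a flag.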
Radix sort can be applied in this case. It sorts the strings in O(n * m) time, where m is the length of the longest string, which is O(n) when m is bounded. Refer to the following code, implemented in C++:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
class RadixSort {
public:
    static char charAt(const string &s, int n) {
        return s[n];
    }
    // Stable counting sort on the character at position `index`.
    // Strings too short to have that position go into bucket 0, so they sort first.
    static void countingSort(string arr[], int n, int index, char lower, char upper) {
        vector<int> countArray((upper - lower) + 2, 0);
        vector<string> tempArray(n);
        // increase count for char at index
        for (int i = 0; i < n; i++) {
            int charIndex = ((int)arr[i].length() <= index) ? 0 : (charAt(arr[i], index) - lower + 1);
            countArray[charIndex]++;
        }
        // sum up countArray; countArray will hold the last index for each bucket
        for (size_t i = 1; i < countArray.size(); i++) {
            countArray[i] += countArray[i - 1];
        }
        // walk backwards so that equal keys keep their relative order
        for (int i = n - 1; i >= 0; i--) {
            int charIndex = ((int)arr[i].length() <= index) ? 0 : (charAt(arr[i], index) - lower + 1);
            tempArray[countArray[charIndex] - 1] = arr[i];
            countArray[charIndex]--;
        }
        for (int i = 0; i < n; i++) {
            arr[i] = tempArray[i];
        }
    }
    static void radixSort(string arr[], int n, char lower, char upper) {
        // index of the last character of the longest string
        int maxIndex = 0;
        for (int i = 0; i < n; i++) {
            int lastIndex = (int)arr[i].length() - 1;
            if (lastIndex > maxIndex) {
                maxIndex = lastIndex;
            }
        }
        // least significant position first
        for (int i = maxIndex; i >= 0; i--) {
            countingSort(arr, n, i, lower, upper);
        }
    }
};
int main() {
    string arr[] = {"a", "aa", "aaa", "kinga", "bishoy", "computer", "az"};
    int n = sizeof(arr) / sizeof(arr[0]);
    RadixSort::radixSort(arr, n, 'a', 'z');
    for (int i = 0; i < n; i++) {
        cout << arr[i] << " ";
    }
    return 0;
}
No, it is not O(n). It is Omega(k log(k) n).
Without any other restriction, which is the case as I understand your question, it is just a comparison-based sorting algorithm.
Sorting an array of length k is in Omega(k log(k)),
and doing it n times, without any connection between the runs, leads to
Omega(k log(k) n).
You can read more here:
https://www.geeksforgeeks.org/lower-bound-on-comparison-based-sorting-algorithms/
If you treat k as bounded, because there is no ENGLISH word longer than 10^1000000 (which is probably larger than the number of atoms on Earth), then sorting an array of bounded length is in O(1), and doing it n times leads to O(n).
You get a lot from dealing with infinity, but sometimes you have to pay back...
Given a sorted array A, which stores n integers, and a value key. Design
an efficient divide and conquer algorithm that returns the index of the value
key if it can be found in array A. Otherwise, the algorithm returns 0.
I think that, from your description, the best choice is binary search (dichotomous search). The algorithm repeatedly divides the array in half and checks whether the element in the middle is higher than, lower than, or equal to the item you are looking for. This halves the search space at each step, so the element is found, or shown to be absent from the vector, in logarithmic time. The array must be sorted.
This is an example in C++:
#include <vector>
using std::vector;
// Recursive binary search: returns true if x occurs in v[principio..fin] (both bounds inclusive).
bool busqueda_dicotomica(const vector<int> &v, int principio, int fin, const int &x){
    bool res;
    if(principio <= fin){
        int m = ((fin - principio)/2) + principio;  // midpoint, written to avoid overflow
        if(x < v[m]) res = busqueda_dicotomica(v, principio, m-1, x);
        else if(x > v[m]) res = busqueda_dicotomica(v, m+1, fin, x);
        else res = true;
    }else res = false;
    return res;
}
I think your algorithm should not return 0 when the key is not found: if you want to return the index of the element, 0 is the index of the first element, so a value such as -1 is a safer "not found" marker.
In this article is the detailed explanation of the binary search or in this other
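Since the question asks for the index rather than a yes/no answer, here is a minimal Python sketch of the same divide-and-conquer idea that returns the position. The name `find_index` and the choice of -1 as the "not found" value are assumptions for this example (the question itself asks for 0).

```python
def find_index(a, key, lo=0, hi=None):
    """Return an index i with a[i] == key in the sorted list a, or -1."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:
        return -1                      # empty range: key is not present
    mid = lo + (hi - lo) // 2
    if key < a[mid]:
        return find_index(a, key, lo, mid - 1)   # search the left half
    if key > a[mid]:
        return find_index(a, key, mid + 1, hi)   # search the right half
    return mid


print(find_index([1, 3, 5, 7, 9], 7))
```

Each call halves the range, so the recursion depth is logarithmic in the length of the list.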
I am working on finding the kth smallest element in a min-heap. I have code for this whose complexity is O(k log k). I tried to improve it to O(k).
Below is the code.
struct heap{
    int *array;
    int count;
    int capacity;
};
int kthsmallestelement(struct heap *h,int i,int k){
    if(i<0||i>=h->count)
        return INT_MIN;
    if(k==1)
        return h->array[i];
    k--;
    int j=2*i+1;
    int m=2*i+2;
    if(h->array[j] < h->array[m])
    {
        int x=kthsmallestelement(h,j,k);
        if(x==INT_MIN)
            return kthsmallestelement(h,m,k);
        return x;
    }
    else
    {
        int x=kthsmallestelement(h,m,k);
        if(x==INT_MIN)
            return kthsmallestelement(h,j,k);
        return x;
    }
}
My code is traversing k elements in heap and thus complexity is O(k).
Is it correct?
Your code, and in fact its entire approach, is completely wrong, IIUC.
In a classic min-heap, the only thing you know is that each path from the root down to a leaf is non-decreasing. There are no other constraints, in particular no constraints between the paths.
It follows that the k-th smallest element can be anywhere in the first 2^k elements. If you are just using the entire heap's array built & maintained using the classic heap algorithm, any solution will necessarily be Ω(min(n, 2^k)). Anything below that will require additional requirements on the array's structure, an additional data structure, or both.
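To illustrate the "additional data structure" option, here is a minimal Python sketch of the O(k log k) approach mentioned in the question: keep a second, small heap of candidate positions and expand it from the root. The function name `kth_smallest` and the use of `heapq` are choices made for this example, and the heap is assumed to be stored as a 0-based array with children at 2i+1 and 2i+2.

```python
import heapq


def kth_smallest(heap, k):
    """k-th smallest value (1-based) of an array-backed min-heap, or None."""
    if k < 1 or k > len(heap):
        return None
    # Candidate heap of (value, index); the root is the only candidate at first.
    candidates = [(heap[0], 0)]
    for _ in range(k - 1):
        # Remove the smallest candidate and make its children candidates.
        _, i = heapq.heappop(candidates)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(candidates, (heap[child], child))
    # After k - 1 removals the smallest remaining candidate is the answer.
    return candidates[0][0]


data = [5, 1, 9, 3, 7, 2, 8]
heapq.heapify(data)
print(kth_smallest(data, 3))
```

The candidate heap never holds more than k + 1 entries, which is where the k log k bound comes from. The original heap is not modified.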
Given an array we need to find out the count of number of subsets having sum exactly equal to a given integer k.
Please suggest an optimal algorithm for this problem. Here the actual subsets are not needed just the count will do.
The array consists of integers which can be negative as well as non negative.
Example:
Array -> {1,4,-1,10,5} abs sum->9
Answer should be 2 for{4,5} and {-1,10}
This is a variation of the subset sum problem, which is NP-Hard - so there is no known polynomial solution to it. (In fact, the subset sum problem says it is hard to find if there is even one subset that sums to the given sum).
Possible approaches to solve it are brute force (check all possible subsets), or if the set contains relatively small integers, you can use the pseudo-polynomial dynamic programming technique:
f(i,0) = 1 (i >= 0) //successful base clause
f(0,j) = 0 (j != 0) //unsuccessful base clause
f(i,j) = f(i-1,j) + f(i-1,j-arr[i]) //step
Applying dynamic programming to the above recursive formula gives you O(k*n) time and space solution.
Invoke with f(n,k) [assuming 1 based index for arrays].
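Here is a minimal Python sketch of the same counting recurrence, written bottom-up. Because the question allows negative numbers, this version keys the table by the sums that are actually reachable (a dictionary) instead of by 0..k; that substitution, and the name `count_subsets`, are assumptions of this example rather than part of the formula above. The empty subset is counted when k is 0.

```python
def count_subsets(arr, k):
    # ways[s] = number of subsets of the elements seen so far that sum to s.
    ways = {0: 1}                      # only the empty subset before any element
    for x in arr:
        nxt = dict(ways)               # subsets that skip x
        for s, c in ways.items():
            nxt[s + x] = nxt.get(s + x, 0) + c   # subsets that take x
        ways = nxt
    return ways.get(k, 0)


print(count_subsets([2, 3, 5, 6, 8, 10], 10))
```

The running time is proportional to n times the number of distinct reachable sums, which is still pseudo-polynomial. For the array in the question, {1,4,-1,10,5} with target 9, this count also includes {1,4,-1,5}, which sums to 9 as well.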
The following is memoized dynamic programming code that prints the number of subsets with a given sum. The repeated DP values are stored in the "tmp" array. To arrive at a DP solution, always start with a recursive solution to the problem and then store the repeated values in a tmp array to get a memoized solution.
#include <bits/stdc++.h>
using namespace std;
// tmp[n][sum] caches the answer for the first n elements and the given sum (-1 = not computed).
// Assumes non-negative values, n <= 1000 and sum <= 1000.
int tmp[1001][1001];
int subset_count(int* arr, int sum, int n)
{
    if(sum==0)
        return 1;
    if(n==0)
        return 0;
    if(tmp[n][sum]!=-1)
        return tmp[n][sum];
    else{
        if(arr[n-1]>sum)
            return tmp[n][sum]=subset_count(arr,sum, n-1);
        else{
            return tmp[n][sum]=subset_count(arr,sum, n-1)+subset_count(arr,sum-arr[n-1], n-1);
        }
    }
}
// Driver code
int main()
{
    memset(tmp,-1,sizeof(tmp));
    int arr[] = { 2, 3, 5, 6, 8, 10 };
    int n = sizeof(arr) / sizeof(int);
    int sum = 10;
    cout << subset_count(arr,sum, n);
    return 0;
}
This is a recursive solution. It has time complexity O(2^n).
Use dynamic programming to improve the time complexity to O(n * sum), which is pseudo-polynomial.
def count_of_subset(arr,sum,n,count):
    if sum==0:
        count+=1
        return count
    if n==0 and sum!=0:
        count+=0
        return count
    if arr[n-1]<=sum:
        count=count_of_subset(arr,sum-arr[n-1],n-1,count)
        count=count_of_subset(arr,sum,n-1,count)
        return count
    else:
        count=count_of_subset(arr,sum,n-1,count)
        return count
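#include <vector>
using namespace std;
// T[i][j] = number of subsets of the first i elements that sum to j.
int numSubseq(vector<int>& nums, int target) {
    int size = nums.size();
    // A vector replaces the variable-length array, which is not standard C++.
    vector<vector<int>> T(size + 1, vector<int>(target + 1, 0));
    // Base cases: sum 0 is always reachable with the empty subset;
    // T[0][j] for j != 0 stays 0.
    for(int i=0;i<=size;i++){
        T[i][0] = 1;
    }
    for(int i=1;i<=size;i++){
        for(int j=1;j<=target;j++){
            if(nums[i-1] <= j)
                T[i][j] = T[i-1][j] + T[i-1][j-nums[i-1]];
            else
                T[i][j] = T[i-1][j];
        }
    }
    return T[size][target];
}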
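The base case above works fine if the constraint is 1 <= v[i] <= 1000.
But consider the constraint 0 <= v[i] <= 1000.
The base case then gives a wrong answer. Consider the test case v = [0,0,1] and k = 1: the output will be "1" according to that base case.
But the correct answer is 4, because each zero can be included or left out independently: {1}, {0,1} with the first zero, {0,1} with the second zero, and {0,0,1}.
To avoid this, we can recurse all the way down to the first element instead of returning as soon as the target reaches 0, and fix the base case as follows.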
C++:
if(ind==0)
{
    if(v[0]==target and target==0)return 2;
    if(v[0]==target || target==0)return 1;
    return 0 ;
}
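To show where that base case sits in a full solution, here is a minimal memoized Python sketch. The surrounding pick / not-pick recursion, the name `count_subsets_with_zeros` and the use of `functools.lru_cache` are assumptions of this example; only the `ind == 0` branch mirrors the snippet above. Values are assumed to be non-negative.

```python
from functools import lru_cache


def count_subsets_with_zeros(v, target):
    if not v:
        return 1 if target == 0 else 0

    @lru_cache(maxsize=None)
    def f(ind, t):
        if ind == 0:
            if v[0] == 0 and t == 0:
                return 2               # take the zero or leave it: both work
            if v[0] == t or t == 0:
                return 1
            return 0
        not_pick = f(ind - 1, t)
        pick = f(ind - 1, t - v[ind]) if v[ind] <= t else 0
        return not_pick + pick

    return f(len(v) - 1, target)


print(count_subsets_with_zeros([0, 0, 1], 1))
```

Because the recursion no longer stops when the target reaches 0, every zero further down the array still gets its own pick / not-pick choice. For v = [0,0,1] and target 1 this counts {1}, {0,1} twice (once per zero) and {0,0,1}.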
One answer to this problem is to generate the power set of the array, which has 2^N elements, where N is the size of the array. For every number between 0 and 2^N-1, check its binary representation and include every array value whose corresponding bit is set, i.e. is one.
Check whether the values you included sum to the required value.
This might not be the most efficient solution, but as this is an NP-hard problem, no polynomial-time solution is known.
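The enumeration described above can be sketched in a few lines of Python. The function name `count_subsets_bruteforce` is made up for this example, and the empty subset (mask 0) is counted when the target is 0.

```python
def count_subsets_bruteforce(arr, k):
    n = len(arr)
    count = 0
    for mask in range(1 << n):         # every number from 0 to 2^n - 1
        # Sum the values whose bit is set in this mask.
        total = sum(arr[i] for i in range(n) if (mask >> i) & 1)
        if total == k:
            count += 1
    return count


print(count_subsets_bruteforce([2, 3, 5, 6, 8, 10], 10))
```

This does O(n * 2^n) work, so it is only practical for small arrays, but it handles negative values without any changes.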
Given a list of integers, how can I best find an integer that is not in the list?
The list can potentially be very large, and the integers might be large (i.e. BigIntegers, not just 32-bit ints).
If it makes any difference, the list is "probably" sorted, i.e. 99% of the time it will be sorted, but I cannot rely on always being sorted.
Edit -
To clarify, given the list {0, 1, 3, 4, 7}, examples of acceptable solutions would be -2, 2, 8 and 10012, but I would prefer to find the smallest, non-negative solution (i.e. 2) if there is an algorithm that can find it without needing to sort the entire list.
One easy way would be to iterate the list to get the highest value n, then you know that n+1 is not in the list.
Edit:
A method to find the smallest non-negative unused number would be to start from zero and scan the list for that number, starting over with the next number whenever you find it. To make it more efficient, and to make use of the high probability of the list being sorted, you can move numbers that are smaller than the current one to an unused part of the list.
This method uses the beginning of the list as storage space for lower numbers, the startIndex variable keeps track of where the relevant numbers start:
public static int GetSmallest(int[] items) {
    int startIndex = 0;
    int result = 0;
    int i = 0;
    while (i < items.Length) {
        if (items[i] == result) {
            result++;
            i = startIndex;
        } else {
            if (items[i] < result) {
                if (i != startIndex) {
                    int temp = items[startIndex];
                    items[startIndex] = items[i];
                    items[i] = temp;
                }
                startIndex++;
            }
            i++;
        }
    }
    return result;
}
I made a performance test where I created lists with 100000 random numbers from 0 to 19999, which makes the average lowest number around 150. On test runs (with 1000 test lists each), the method found the smallest number in unsorted lists in 8.2 ms on average, and in sorted lists in 0.32 ms on average.
(I haven't checked in what state the method leaves the list, as it may swap some items in it. It leaves the list containing the same items, at least, and as it moves smaller values down the list I think that it should actually become more sorted for each search.)
If the number doesn't have any restrictions, then you can do a linear search to find the maximum value in the list and return the number that is one larger.
If the number does have restrictions (e.g. max+1 and min-1 could overflow), then you can use a sorting algorithm that works well on partially sorted data. Then go through the list and find the first pair of numbers v_i and v_{i+1} that are not consecutive. Return v_i + 1.
To get the smallest non-negative integer (based on the edit in the question), you can either:
Sort the list using a partial sort as above. Binary search the list for 0. Iterate through the list from this value until you find a "gap" between two numbers. If you get to the end of the list, return the last value + 1.
Insert the values into a hash table. Then iterate from 0 upwards until you find an integer not in the list.
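The hash table option can be sketched as follows in Python, using the built-in `set` as the hash table. The function name `smallest_missing_non_negative` is an assumption for this example. Python integers are arbitrary precision, so large values need no special handling here.

```python
def smallest_missing_non_negative(items):
    seen = set(items)                  # one pass over the list, O(n) expected
    candidate = 0
    while candidate in seen:           # stops after at most len(items) + 1 probes
        candidate += 1
    return candidate


print(smallest_missing_non_negative([0, 1, 3, 4, 7]))
```

The list does not need to be sorted, at the cost of O(n) extra memory.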
Unless it is sorted you will have to do a linear search going item by item until you find a match or you reach the end of the list. If you can guarantee it is sorted you could always use the array method of BinarySearch or just roll your own binary search.
Or like Jason mentioned there is always the option of using a Hashtable.
"probably sorted" means you have to treat it as being completely unsorted. If of course you could guarantee it was sorted this is simple. Just look at the first or last element and add or subtract 1.
I got 100% in both correctness and performance.
You should use quicksort, which has O(N log N) complexity on average.
Here you go...
public int solution(int[] A) {
    // A null or empty array contains no positive integer, so 1 is the answer.
    if (A == null || A.length == 0) {
        return 1;
    }
    quickSort(A, 0, A.length - 1);
    int result = 1;
    if (A.length == 1 && A[0] < 0) {
        return result;
    }
    for (int i = 0; i < A.length; i++) {
        if (A[i] <= 0) {
            continue;
        }
        if (A[i] == result) {
            result++;
        } else if (A[i] < result) {
            continue;
        } else if (A[i] > result) {
            return result;
        }
    }
    return result;
}
private void quickSort(int[] numbers, int low, int high) {
    int i = low, j = high;
    int pivot = numbers[low + (high - low) / 2];
    while (i <= j) {
        while (numbers[i] < pivot) {
            i++;
        }
        while (numbers[j] > pivot) {
            j--;
        }
        if (i <= j) {
            exchange(numbers, i, j);
            i++;
            j--;
        }
    }
    // Recursion
    if (low < j)
        quickSort(numbers, low, j);
    if (i < high)
        quickSort(numbers, i, high);
}
private void exchange(int[] numbers, int i, int j) {
    int temp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = temp;
}
Theoretically, find the max and add 1. Assuming you're constrained by the max value of the BigInteger type, sort the list if unsorted, and look for gaps.
Are you looking for an on-line algorithm (since you say the input is arbitrarily large)? If so, take a look at Odds algorithm.
Otherwise, as already suggested, hash the input, search and turn on/off elements of boolean set (the hash indexes into the set).
There are several approaches:
Find the biggest int in the list and store it in x. x+1 will not be in the list. The same applies to using min() and x-1.
When N is the size of the list, allocate an int array with the size (N+31)/32. For each element in the list, set the bit v&31 (where v is the value of the element) of the integer at array index v/32. Ignore negative values and values where v/32 >= array.length. Now search for the first array item which is '!= 0xFFFFFFFF' (for 32bit integers); its lowest zero bit gives the smallest missing non-negative number. If every item is 0xFFFFFFFF, which can only happen when the list is exactly 0..N-1 and N is a multiple of 32, the answer is N.
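The bit-array idea can be sketched in Python as below. The function name `smallest_missing_bitset` is made up for this example, and one detail is an assumption that differs from the text: the sketch reserves N+1 bits rather than N, so that a zero bit always exists even when the list is exactly 0..N-1.

```python
def smallest_missing_bitset(values):
    n = len(values)
    # One bit per candidate 0..n, packed into 32-bit words.
    words = [0] * ((n + 1 + 31) // 32)
    for v in values:
        # Negative values and values beyond the bit array cannot be the answer.
        if v >= 0 and v // 32 < len(words):
            words[v // 32] |= 1 << (v & 31)
    for idx, w in enumerate(words):
        if w != 0xFFFFFFFF:            # this word has at least one zero bit
            bit = 0
            while (w >> bit) & 1:
                bit += 1
            return idx * 32 + bit
    return None                        # not reached: at most n of the bits can be set


print(smallest_missing_bitset([0, 1, 3, 4, 7]))
```

This uses about N/8 bytes of extra memory and two linear passes, and it does not require the list to be sorted.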
If you can't guarantee it is sorted, then you have a best possible time efficiency of O(N) as you have to look at every element to make sure your final choice is not there. So the question is then:
Can it be done in O(N)?
What is the best space efficiency?
Chris Doggett's solution of find the max and add 1 is both O(N) and space efficient (O(1) memory usage)
If an answer that is only probably correct is good enough, then it is a different question.
Unless you are 100% sure it is sorted, the quickest algorithm still has to look at each number in the list at least once to at least verify that a number is not in the list.
Assuming this is the problem I'm thinking of:
You have a set of all ints in the range 1 to n, but one of those ints is missing. Tell me which int is missing.
This is a pretty easy problem to solve with some simple math knowledge. It's known that the sum of the range 1 .. n is equal to n(n+1) / 2. So, let W = n(n+1) / 2 and let Y = the sum of the numbers in your set. The integer that is missing from your set, X, would then be X = W - Y.
Note: SO needs to support MathML
If this isn't that problem, or if it's more general, then one of the other solutions is probably right. I just can't really tell from the question since it's kind of vague.
Edit: Well, since the edit, I can see that my answer is absolutely wrong. Fun math, none-the-less.
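For the narrower problem this answer had in mind (exactly one value missing from 1..n, no duplicates), the formula X = W - Y can be sketched in Python as follows. The function name `missing_from_range` is an assumption for this example.

```python
def missing_from_range(nums, n):
    # W = n(n+1)/2 is the sum of 1..n; Y is the sum of the values present.
    expected = n * (n + 1) // 2
    return expected - sum(nums)


print(missing_from_range([1, 2, 4, 5], 5))
```

As the edit says, this does not answer the general question, since it relies on knowing n and on exactly one value being absent.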
I've solved this using Linq and a binary search. I got 100% across the board. Here's my code:
using System.Collections.Generic;
using System.Linq;
class Solution {
    public int solution(int[] A) {
        if (A == null) {
            return 1;
        } else {
            if (A.Length == 0) {
                return 1;
            }
        }
        List<int> list_test = new List<int>(A);
        list_test = list_test.Distinct().ToList();
        list_test = list_test.Where(i => i > 0).ToList();
        list_test.Sort();
        if (list_test.Count == 0) {
            return 1;
        }
        int lastValue = list_test[list_test.Count - 1];
        if (lastValue <= 0) {
            return 1;
        }
        int firstValue = list_test[0];
        if (firstValue > 1) {
            return 1;
        }
        return BinarySearchList(list_test);
    }
    int BinarySearchList(List<int> list) {
        int returnable = 0;
        int tempIndex;
        int[] boundaries = new int[2] { 0, list.Count - 1 };
        int testCounter = 0;
        while (returnable == 0 && testCounter < 2000) {
            tempIndex = (boundaries[0] + boundaries[1]) / 2;
            if (tempIndex != boundaries[0]) {
                if (list[tempIndex] > tempIndex + 1) {
                    boundaries[1] = tempIndex;
                } else {
                    boundaries[0] = tempIndex;
                }
            } else {
                if (list[tempIndex] > tempIndex + 1) {
                    returnable = tempIndex + 1;
                } else {
                    returnable = tempIndex + 2;
                }
            }
            testCounter++;
        }
        if (returnable == list[list.Count - 1]) {
            returnable++;
        }
        return returnable;
    }
}
The longest execution time was 0.08s on the Large_2 test.
You need the list to be sorted. That means either knowing it is sorted, or sorting it.
Sort the list. Skip this step if the list is known to be sorted. O(n lg n)
Remove any duplicate elements. Skip this step if elements are already guaranteed distinct. O(n)
Let B be the position of 1 in the list using a binary search. O(lg n)
If 1 isn't in the list, return 1. Note that if all elements from 1 to n are in the list, then the element at B+n must be n+1. O(1)
Now perform a sort of binary search starting with min = B, max = end of the list. Call the position of the pivot P. If the element at P is greater than (P-B+1), recurse on the range [min, pivot], otherwise recurse on the range (pivot, max]. Continue until min=pivot=max. O(lg n)
Your answer is (the element at pivot-1)+1, unless you are at the end of the list and (P-B+1) = B in which case it is the last element + 1. O(1)
This is very efficient if the list is already sorted and has distinct elements. You can do optimistic checks to make it faster when the list has only non-negative elements or when the list doesn't include the value 1.
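The binary-search part of these steps can be sketched in Python as below, assuming the list is already sorted and free of duplicates (steps 1 and 2). The function name is made up for this example, the standard `bisect_left` stands in for the search for 1, and the loop is written iteratively rather than recursively.

```python
from bisect import bisect_left


def smallest_missing_positive_sorted(a):
    """a must be sorted and contain no duplicates."""
    b = bisect_left(a, 1)              # position where 1 is, or would be
    if b == len(a) or a[b] != 1:
        return 1
    # Without a gap, position p holds the value p - b + 1.
    # Find the first position whose value is larger than that.
    lo, hi = b, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] > mid - b + 1:
            hi = mid                   # gap is at mid or to its left
        else:
            lo = mid + 1               # everything up to mid is gap-free
    return lo - b + 1


print(smallest_missing_positive_sorted([-3, 0, 1, 2, 3, 5, 6]))
```

The search works because, for distinct sorted integers, the difference between a value and its position never decreases, so "has a gap before here" is monotone along the list.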
I was just asked this question in an interview. The answer to this problem can be found using worst-case analysis. The upper bound for the smallest natural number not present in the list is length(list). This is because the worst case, given the length of the list, is the list 0,1,2,3,4,5....length(list)-1.
Therefore, for all lists, the smallest number not present in the list is less than or equal to the length of the list. So initialize a list t with n=length(list)+1 zeros. For every number i in the list that is less than or equal to the length of the list, assign the value 1 to t[i]. The index of the first zero in t is the smallest number not present in the list. And since, the lower bound on this list n-1, for at least one index j