CodeFight firstDuplicate interview challenge - algorithm

As per the problem statement:
Write a solution with O(n) time complexity and O(1) additional space
complexity. Given an array a that contains only numbers in the range
from 1 to a.length, find the first duplicate number for which the
second occurrence has the minimal index. In other words, if there is
more than one duplicated number, return the number for which the second
occurrence has a smaller index than the second occurrence of the other
number does. If there are no such elements, return -1.
I wrote my code according to the constraints, but I'm still getting a time-limit error. Here's my solution:
int firstDuplicate(std::vector<int> a)
{
    long long int n = a.size();
    int cnt = 0;
    for (long long int i = 0; i < n; i++)
    {
        //cout << a[i] << " " << cnt << endl;
        if (a[i] == n || a[i] == -n)
        {
            cnt++;
            if (cnt > 1)
                return n;
        }
        else if (a[abs(a[i])] < 0)
            return -a[i];
        else
            a[a[i]] = -a[a[i]];
    }
    return -1;
}
Can anyone suggest a better algorithm, or point out what's wrong with this one?

The algorithm for this problem is as follows:
For each number a[i] in the array, we negate a[abs(a[i]) - 1]. While iterating through a, if at some point we find that a[abs(a[i]) - 1] is already negative, we return abs(a[i]). If we reach the end of the array without hitting an already-negative slot, we return -1.
I feel like this is what you were trying to get at, but you may have overcomplicated things. In code this is:
int firstDuplicate(std::vector<int> a)
{
    for (int i = 0; i < a.size(); i += 1)
    {
        if (a[abs(a[i]) - 1] < 0)
            return abs(a[i]);
        else
            a[abs(a[i]) - 1] = -a[abs(a[i]) - 1];
    }
    return -1;
}
This runs in O(n) time, with O(1) space complexity.
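A quick sanity check of this function (reproduced here with an explicit helper variable so the snippet is self-contained):

```cpp
#include <cstdlib>
#include <vector>

// Same marking scheme as above: values are in [1, n], so each value can be
// used as an index; a negative value at index v-1 means v was seen before.
int firstDuplicate(std::vector<int> a)
{
    for (std::size_t i = 0; i < a.size(); i += 1)
    {
        int v = std::abs(a[i]);
        if (a[v - 1] < 0)
            return v;          // second occurrence of v
        a[v - 1] = -a[v - 1];  // mark v as seen
    }
    return -1;
}
```

For example, on {2, 1, 3, 5, 3, 2} it returns 3, because the second occurrence of 3 (index 4) comes before the second occurrence of 2 (index 5).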

You can use the indexes to mark whether an element has occurred before or not: if the value at idx is negative, then that element has already occurred before.
int firstDuplicate(std::vector<int> a)
{
    long long int n = a.size();
    for (long long int i = 0; i < n; i++)
    {
        int idx = abs(a[i]) - 1;  // abs: a[i] may already have been negated
        if (a[idx] < 0) {
            return abs(a[i]);
        }
        a[idx] *= -1;
    }
    return -1;
}

Related

Maximum subArray product using Divide and Conquer Anyone?

I am aware that this is one of the most common coding questions when it comes to integer arrays. I am looking for a solution to the problem of finding the maximum contiguous subarray product within the array, but using a Divide and Conquer approach.
I split my input array into two halves: the left and right halves are solved recursively, in case the solution falls entirely within one half. Where I have a problem is with the scenario where the subarray crosses the midpoint of the array. Here is a short snippet of my code for the function handling the crossing:
pair<int,pair<int, int>> maxMidCrossing(vector<int>& nums, int low, int mid, int high)
{
    int m = 1;
    int leftIndx = low;
    long long leftProduct = INT_MIN;
    for (int i = mid-1; i >= low; --i)
    {
        m *= nums[i];
        if (m > leftProduct) {
            leftProduct = m;
            leftIndx = i;
        }
    }
    int mleft = m;
    m = 1;
    int rightIndx = high;
    long long rightProduct = INT_MIN;
    for (int i = mid; i <= high; ++i)
    {
        m *= nums[i];
        if (m > rightProduct) {
            rightProduct = m;
            rightIndx = i;
        }
    }
    int mright = m;
    cout << "\nRight product " << rightProduct;
    pair<int, int> tmp;
    int maximum = 0;
    // Check the multiplication of both sides of the array to see if the combined subarray satisfies the maximum product condition.
    if (mleft*mright < leftProduct*rightProduct) {
        tmp = pair(leftIndx, rightIndx);
        maximum = leftProduct*rightProduct;
    }
    else {
        tmp = pair(low, high);
        maximum = mleft*mright;
    }
    return pair(maximum, tmp);
}
The function handling the entire search contains the following:
auto leftIndx = indexProduct(left);
auto rightIndx = indexProduct(right);
auto midResult = maxMidCrossing(nums, 0, mid, nums.size()-1); // middle crossing
//.....more code........
if (mLeft > midProduct && mLeft > mRight)
    tmp = leftIndx;
else if (mRight > midProduct && mRight > mLeft)
    tmp = pair(rightIndx.first + mid, rightIndx.second + mid);
else
    tmp = midIndx;
In the end, I just compute the maximum product across the 3 scenarios: left array, crossing array, right array.
I still have a few corner cases failing. My question is if this problem admits a recursive solution of the Divide and Conquer type, and if anyone can spot what I may be doing wrong in my code, I would appreciate any hints that could help me get unstuck.
Thanks,
Amine
Take a look at these from LeetCode:
C++ divide and conquer:
https://leetcode.com/problems/maximum-product-subarray/discuss/48289/c++-divide-and-conquer-solution-8ms
Java:
https://leetcode.com/problems/maximum-product-subarray/discuss/367839/java-divide-and-conquer-2ms
C#:
https://leetcode.com/problems/maximum-product-subarray/discuss/367839/java-divide-and-conquer-2ms
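For reference, here is one way the crossing step can be made sign-aware (a sketch; the function name and structure are mine, not taken from the linked solutions). Since two negatives multiply to a positive, a crossing candidate has to combine the extreme (maximum and minimum) suffix products of the left half with the extreme prefix products of the right half:

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Crossing step for divide-and-conquer maximum product subarray.
// A crossing subarray must contain both nums[mid-1] and nums[mid], so it is
// (a suffix of the left half) * (a prefix of the right half). Because two
// negatives multiply to a positive, we track both the maximum AND minimum
// running products on each side and try all four combinations.
long long maxCrossingProduct(const std::vector<int>& nums,
                             int low, int mid, int high)
{
    long long p = 1, maxSuf = LLONG_MIN, minSuf = LLONG_MAX;
    for (int i = mid - 1; i >= low; --i) {      // suffixes ending at mid-1
        p *= nums[i];
        maxSuf = std::max(maxSuf, p);
        minSuf = std::min(minSuf, p);
    }
    p = 1;
    long long maxPre = LLONG_MIN, minPre = LLONG_MAX;
    for (int i = mid; i <= high; ++i) {         // prefixes starting at mid
        p *= nums[i];
        maxPre = std::max(maxPre, p);
        minPre = std::min(minPre, p);
    }
    // Try every pairing of an extreme suffix with an extreme prefix.
    return std::max({maxSuf * maxPre, maxSuf * minPre,
                     minSuf * maxPre, minSuf * minPre});
}
```

On {2, -3, -4, 5} with mid = 2, the best crossing subarray is the whole array with product 120; tracking only the maximum products on each side would miss it.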

How to find the Kth smallest sum in a sorted MxN matrix

I've seen solutions on how to find the Kth smallest element in a sorted matrix, and I've also seen solutions on how to find the Kth smallest sum in two arrays.
But I found a question recently that asks to find the Kth smallest sum in a sorted MxN matrix. The sum must be made up of one element from each row. I'm really struggling to develop anything close to a working solution, let alone a brute-force one. Any help would be greatly appreciated!
I thought this would be some kind of a heap problem... But perhaps it is a graph problem? I'm not that great with graphs.
I assume by "sorted MxN matrix" you mean that each row of the matrix is sorted. If you already know how to merge 2 rows and keep only the first K elements, you can apply that same procedure to merge every row of the matrix. Ignoring the Java conversion between int[] and List, the following code should work.
import java.util.*;
import java.util.stream.*;

class Solution {
    /**
     * Runtime O(m * k * logk)
     */
    public int kthSmallest(int[][] mat, int k) {
        List<Integer> row = IntStream.of(mat[0]).boxed().collect(Collectors.toList());
        for (int i = 1; i < mat.length; i++) {
            row = kthSmallestPairs(row, mat[i], k);
        }
        return row.get(k - 1);
    }

    /**
     * A pair is formed from one num of n1 and one num of n2. Find the k smallest sums of these pairs.
     * Queue size is capped at k, hence this method runs in O(k log k).
     */
    List<Integer> kthSmallestPairs(List<Integer> n1, int[] n2, int k) {
        // 0 is n1's num, 1 is n2's num, 2 is n2's index
        Queue<int[]> que = new PriorityQueue<>((a, b) -> a[0] + a[1] - b[0] - b[1]);
        // First pair each num in n1 with the 0-th num of n2. No need for more than
        // k elements, because the greater ones never have a chance.
        for (int i = 0; i < n1.size() && i < k; i++) {
            que.add(new int[] {n1.get(i), n2[0], 0});
        }
        List<Integer> res = new ArrayList<>();
        while (!que.isEmpty() && k-- > 0) {
            int[] top = que.remove();
            res.add(top[0] + top[1]);
            // index of n2 is top[2]
            if (top[2] < n2.length - 1) {
                int nextN2Idx = top[2] + 1;
                que.add(new int[] {top[0], n2[nextN2Idx], nextN2Idx});
            }
        }
        return res;
    }
}
You can make a min-heap priority queue and store the sums together with the corresponding column index chosen in each row. Then, each time you pop the smallest sum so far, you can examine the next candidates for the smallest sum by incrementing the index of one row at a time.
Here are the data structures that you would need.
typedef pair<int,vector<int>> pi;
priority_queue<pi,vector<pi>,greater<pi>> pq;
You can try the question now; for reference, I have also added the code that I wrote for this problem.
typedef pair<int,vector<int>> pi;

int kthSmallest(vector<vector<int>>& mat, int k) {
    int m = mat.size();
    int n = mat[0].size();
    priority_queue<pi, vector<pi>, greater<pi>> pq;
    int sum = 0;
    for (int i = 0; i < m; i++)
        sum += mat[i][0];
    vector<int> v(m, 0);              // column chosen in each row
    pq.push({sum, v});
    int count = 1;
    int ans = sum;
    unordered_map<string,int> visited;
    string s;
    for (int i = 0; i < m; i++)
        s += "0,";                    // comma-separated so multi-digit indices don't collide
    visited[s] = 1;
    while (count <= k)
    {
        ans = pq.top().first;
        v = pq.top().second;
        pq.pop();
        for (int i = 0; i < m; i++)   // advance row i's column by one
        {
            vector<int> temp;
            sum = 0;
            int flag = 0;
            string key;
            for (int j = 0; j < m; j++)
            {
                if (i == j && v[j] < n - 1)
                {
                    sum += mat[j][v[j] + 1];
                    temp.push_back(v[j] + 1);
                    key += to_string(v[j] + 1) + ",";
                }
                else if (i == j && v[j] == n - 1)
                {
                    flag = 1;         // row i is exhausted
                    break;
                }
                else
                {
                    sum += mat[j][v[j]];
                    temp.push_back(v[j]);
                    key += to_string(v[j]) + ",";
                }
            }
            if (!flag)
            {
                if (visited[key] == 0)
                    pq.push({sum, temp});
                visited[key] = 1;
            }
        }
        count++;
    }
    return ans;
}
For every row we calculate all possible sums but keep only the k smallest. We can use quickselect (via nth_element) to do so in linear time.
The complexity below should be: O(n * m * k).
class Solution {
public:
    int kthSmallest(vector<vector<int>>& mat, int k) {
        vector<int> sums = { 0 }, cur = {};
        for (const auto& row : mat) {
            for (const int cel : row) {
                for (const int sum : sums) {
                    cur.push_back(cel + sum);
                }
            }
            int nth = min((int) cur.size(), k);
            nth_element(cur.begin(), cur.begin() + nth, cur.end());
            sums.clear();
            copy(cur.begin(), cur.begin() + nth, back_inserter(sums));
            cur.clear();
        }
        return *max_element(sums.begin(), sums.end());
    }
};
So the algorithm goes like this:
We know that the elements of each row are sorted, so the minimum sum is obtained by selecting the first element from each row.
We make a set storing {sum, vector of positions of the currently chosen elements}, ordered by sum.
To find the kth smallest sum, we repeat the following steps k-1 times: i) take the element at the beginning of the set and erase it; ii) insert all the next possible combinations derived from the erased combination.
After exiting the loop, return the sum of the combination at the beginning of the set.
The algorithm (using a set) is properly explained, with dry runs of a test case covering all the corner conditions, in this YouTube video by alGOds: https://youtu.be/ZYlVCy_vRp8
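The steps above can be sketched in C++ roughly as follows (a minimal sketch, assuming k never exceeds the number of valid combinations; a std::set of column vectors stands in for the dedup map):

```cpp
#include <set>
#include <utility>
#include <vector>

// Set-based approach: states are {sum, column chosen in each row}, ordered
// by sum. Pop the smallest state k-1 times, each time inserting every
// "advance one row's column by 1" neighbor; a seen-set avoids duplicates.
int kthSmallestSum(const std::vector<std::vector<int>>& mat, int k)
{
    int m = mat.size(), n = mat[0].size();
    std::vector<int> cols(m, 0);
    int sum = 0;
    for (int i = 0; i < m; ++i) sum += mat[i][0];   // minimal combination

    std::set<std::pair<int, std::vector<int>>> states;
    std::set<std::vector<int>> seen;
    states.insert({sum, cols});
    seen.insert(cols);
    for (int step = 1; step < k; ++step) {
        auto [s, c] = *states.begin();              // smallest remaining sum
        states.erase(states.begin());
        for (int i = 0; i < m; ++i) {               // advance row i's column
            if (c[i] + 1 < n) {
                auto next = c;
                ++next[i];
                if (seen.insert(next).second) {     // not generated before
                    int ns = s - mat[i][c[i]] + mat[i][next[i]];
                    states.insert({ns, next});
                }
            }
        }
    }
    return states.begin()->first;
}
```

For mat = {{1,3,11},{2,4,6}}, the sorted combination sums are 3, 5, 5, 7, 7, 9, 13, 15, 17, so k = 3 yields 5 and k = 4 yields 7.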

(with example) Why is KMP string matching O(n). Shouldn't it be O(n*m)?

Why is KMP O(n + m)?
I know this question has probably been asked a million times on here, but I haven't found a solution that convinced me or that I understood, or a question that matched my example.
/**
* KMP algorithm of pattern matching.
*/
public boolean KMP(char[] text, char[] pattern) {
    int[] lps = computeTemporaryArray(pattern);
    int i = 0;
    int j = 0;
    while (i < text.length && j < pattern.length) {
        if (text[i] == pattern[j]) {
            i++;
            j++;
        } else {
            if (j != 0) {
                j = lps[j - 1];
            } else {
                i++;
            }
        }
    }
    if (j == pattern.length) {
        return true;
    }
    return false;
}
n = size of text
m = size of pattern
I know why it's + m: that's the time it takes to create the lps array for the lookups. I'm not sure why the code I pasted above is O(n).
I see that above, "i" always progresses forward EXCEPT when there is a mismatch and j != 0. In that case, we can have iterations of the while loop where i doesn't move forward, so it's not exactly one step of i per iteration.
For example, if the lps array is incrementing like [1,2,3,4,5,6,0] and we fail to match at index 6, j gets updated to 5, then 4, then 3, and so on, and we effectively go through m extra iterations (assuming everything mismatches). This can occur at every step.
so it would look like
for (int i = 0; i < n; i++) {
    for (int j = i; j >= 0; j--) {
    }
}
and enumerating all the possible (i, j) combinations, i.e. states, would require an n*m array, so wouldn't the runtime be O(nm)?
So is my reading of the code wrong, or is the runtime analysis of the for loop wrong, or is my example impossible?
Actually, now that I think about it, it is O(n + m); I just had to visualize it as two windows shifting. Each iteration of the while loop either advances i (at most n times) or strictly decreases j, and since j only increases together with i, j can decrease at most n times in total, so the loop runs at most 2n times.
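This amortized bound can be checked empirically with an instrumented matcher (a sketch; the struct and the iteration counter are mine):

```cpp
#include <string>
#include <vector>

// Instrumented KMP matcher: counts while-loop iterations to check the
// amortized bound. j can only decrease as often as it has increased, and
// j increases only together with i, so iterations <= 2 * text.size().
struct KmpStats { bool found; int iterations; };

KmpStats kmpCount(const std::string& text, const std::string& pattern)
{
    // standard lps (longest proper prefix that is also a suffix) table
    std::vector<int> lps(pattern.size(), 0);
    for (int i = 1, len = 0; i < (int)pattern.size(); ) {
        if (pattern[i] == pattern[len]) lps[i++] = ++len;
        else if (len != 0)              len = lps[len - 1];
        else                            lps[i++] = 0;
    }
    int i = 0, j = 0, iterations = 0;
    while (i < (int)text.size() && j < (int)pattern.size()) {
        ++iterations;
        if (text[i] == pattern[j])      { ++i; ++j; }   // both advance
        else if (j != 0)                j = lps[j - 1]; // j strictly drops
        else                            ++i;
    }
    return { j == (int)pattern.size(), iterations };
}
```

Even on adversarial inputs full of near-matches, the iteration count stays below twice the text length.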

longest nondecreasing subsequence in O(nlgn)

I have the following algorithm which works well
I tried explaining it for myself at http://nemo.la/?p=943, and it is also explained at http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/ and on Stack Overflow.
I want to modify it to produce the longest non-decreasing subsequence.
For the sequence 30 20 20 10 10 10 10
the answer should be 4: "10 10 10 10".
But the O(n lg n) version of the algorithm isn't working. Initializing s to contain the first element, "30", and starting at the second element, 20, this is what happens:
The first step: 30 is not greater than or equal to 20. We find the smallest element greater than 20. The new s becomes "20"
The second step: 20 is greater than or equal to 20. We extend the sequence and s now contains "20 20"
The third step: 10 is not greater than or equal to 20. We find the smallest element greater than 10 which is "20". The new s becomes "10 20"
and s will never grow after that and the algorithm will return 2 instead of 4
int height[100];
int s[100];

int binary_search(int first, int last, int x) {
    int mid;
    while (first < last) {
        mid = (first + last) / 2;
        if (height[s[mid]] == x)
            return mid;
        else if (height[s[mid]] >= x)
            last = mid;
        else
            first = mid + 1;
    }
    return first; /* or last */
}
int longest_increasing_subsequence_nlgn(int n) {
    int i, k, index;
    memset(s, 0, sizeof(s));
    index = 1;
    s[1] = 0; /* s[i] = 0 is the index of the element that ends an increasing sequence of length i = 1 */
    for (i = 1; i < n; i++) {
        if (height[i] >= height[s[index]]) { /* larger element, extend the sequence */
            index++;       /* increase the length of my subsequence */
            s[index] = i;  /* the current doll ends my subsequence */
        }
        /* else find the smallest element in s >= a[i], basically insert a[i] in s such that s stays sorted */
        else {
            k = binary_search(1, index, height[i]);
            if (height[s[k]] >= height[i]) { /* if truly >= greater */
                s[k] = i;
            }
        }
    }
    return index;
}
To find the longest non-strictly increasing subsequence, change these conditions:
If A[i] is smallest among all end candidates of active lists, we will start new active list of length 1.
If A[i] is largest among all end candidates of active lists, we will clone the largest active list, and extend it by A[i].
If A[i] is in between, we will find a list with largest end element that is smaller than A[i]. Clone and extend this list by A[i]. We will discard all other lists of same length as that of this modified list.
to:
If A[i] is smaller than the smallest of all end candidates of active lists, we will start new active list of length 1.
If A[i] is largest among all end candidates of active lists, we will clone the largest active list, and extend it by A[i].
If A[i] is in between, we will find a list with largest end element that is smaller than or equal to A[i]. Clone and extend this list by A[i]. We will discard all other lists of same length as that of this modified list.
The fourth step for your example sequence should be:
10 is not less than 10 (the smallest element). We find the largest element that is smaller than or equal to 10 (that would be s[0]==10). Clone and extend this list by 10. Discard the existing list of length 2. The new s becomes {10 10}.
Your code nearly works, except for a problem in your binary_search() function: it should return the index of the first element that is greater than the target element x, since you want the longest non-decreasing sequence. Modify it as below and it will be OK.
If you use C++, std::lower_bound() and std::upper_bound() will help you get rid of this confusing detail. By the way, the if statement "if (height[s[k]] >= height[i])" is superfluous.
int binary_search(int first, int last, int x) {
    while (last > first)
    {
        int mid = first + (last - first) / 2;
        if (height[s[mid]] > x)
            last = mid;
        else
            first = mid + 1;
    }
    return first; /* or last */
}
Just apply the longest increasing subsequence algorithm to the ordered pairs (A[i], i), using a lexicographic compare.
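A minimal sketch of this pair trick (assuming the classic tails-array LIS; the function name is mine). Because the index strictly increases, two equal values become distinct, strictly increasing pairs, so a strict LIS over the pairs is exactly a non-decreasing LIS over the values:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Map A[i] to (A[i], i). Equal values get distinct, increasing pairs, so a
// strictly-increasing-subsequence routine on the pairs yields the longest
// NON-decreasing subsequence of A.
int longestNonDecreasing(const std::vector<int>& a)
{
    std::vector<std::pair<int, int>> tails;  // classic O(n log n) LIS tails
    for (int i = 0; i < (int)a.size(); ++i) {
        std::pair<int, int> p{a[i], i};
        auto it = std::lower_bound(tails.begin(), tails.end(), p);
        if (it == tails.end()) tails.push_back(p);
        else                   *it = p;      // keep the smallest tail per length
    }
    return tails.size();
}
```

On the example 30 20 20 10 10 10 10 this returns 4, as required.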
My Java version:
public static int longestNondecreasingSubsequenceLength(List<Integer> A) {
    int n = A.size();
    int[] dp = new int[n];
    int max = 0;
    for (int i = 0; i < n; i++) {
        int el = A.get(i);
        int idx = Arrays.binarySearch(dp, 0, max, el);
        if (idx < 0) {
            idx = -(idx + 1);
        }
        if (dp[idx] == el) { // duplicate found, let's find the last one
            idx = Arrays.binarySearch(dp, 0, max, el + 1);
            if (idx < 0) {
                idx = -(idx + 1);
            }
        }
        dp[idx] = el;
        if (idx == max) {
            max++;
        }
    }
    return max;
}
A completely different solution to this problem is the following. Make a copy of the array and sort it. Then, compute the minimum nonzero difference between any two elements of the array (this will be the minimum nonzero difference between two adjacent array elements) and call it δ. This step takes time O(n log n).
The key observation is that if you add 0 to element 0 of the original array, δ/n to the second element of the original array, 2δ/n to the third element of the array, etc., then any nondecreasing sequence in the original array becomes a strictly increasing sequence in the new array and vice-versa. Therefore, you can transform the array this way, then run a standard longest increasing subsequence solver, which runs in time O(n log n). The net result of this process is an O(n log n) algorithm for finding the longest nondecreasing subsequence.
For example, consider 30, 20, 20, 10, 10, 10, 10. In this case δ = 10 and n = 7, so δ / n ≈ 1.43. The new array is then
30, 21.43, 22.86, 14.29, 15.71, 17.14, 18.57
Here, the LIS is 14.29, 15.71, 17.14, 18.57, which maps back to 10, 10, 10, 10 in the original array.
Hope this helps!
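A sketch of this transformation (the function name is mine; the usual floating-point caveat applies if the values or n are very large):

```cpp
#include <algorithm>
#include <vector>

// Add i * delta / n to the i-th element, where delta is the smallest
// nonzero gap between sorted values. Ties become strictly increasing, so
// an ordinary strict LIS on the shifted array gives the longest
// nondecreasing subsequence of the original array.
int longestNonDecreasingViaShift(const std::vector<int>& a)
{
    int n = a.size();
    if (n == 0) return 0;
    std::vector<int> sorted = a;
    std::sort(sorted.begin(), sorted.end());
    double delta = 0;
    for (int i = 1; i < n; ++i)
        if (sorted[i] != sorted[i - 1]) {
            double gap = sorted[i] - sorted[i - 1];
            if (delta == 0 || gap < delta) delta = gap;
        }
    if (delta == 0) return n;               // all elements equal

    std::vector<double> b(n), tails;
    for (int i = 0; i < n; ++i) b[i] = a[i] + i * delta / n;
    for (double x : b) {                    // strict LIS, O(n log n)
        auto it = std::lower_bound(tails.begin(), tails.end(), x);
        if (it == tails.end()) tails.push_back(x);
        else *it = x;
    }
    return tails.size();
}
```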
I have a simple solution for the longest non-decreasing subsequence using the upper_bound function in C++.
Time complexity: O(n log n).
int longest(vector<long long> a) {
    vector<long long> s;
    s.push_back(a[0]);
    int n = a.size();
    for (int i = 1; i < n; i++) {
        int idx = upper_bound(s.begin(), s.end(), a[i]) - s.begin();
        if (idx < (int) s.size()) {
            s[idx] = a[i];
        } else {
            s.push_back(a[i]);
        }
    }
    return s.size();
}
If you know the algorithm for LIS, then changing the inequalities in the code gives the longest non-decreasing subsequence.
Code for LIS:
public int ceilIndex(int[] a, int n, int[] t, int ele) {
    int l = -1, r = n + 1;
    while (r - l > 1) {
        int mid = l + (r - l) / 2;
        if (a[t[mid]] < ele) l = mid;
        else r = mid;
    }
    return r;
}

public int lengthOfLIS(int[] a) {
    int n = a.length;
    int[] index = new int[n];
    int len = 0;
    index[len] = 0;
    int[] reversePath = new int[n];
    for (int i = 0; i < n; i++) reversePath[i] = -1;
    for (int i = 1; i < n; i++) {
        if (a[index[0]] >= a[i]) {
            index[0] = i;
            reversePath[i] = -1;
        } else if (a[index[len]] < a[i]) {
            reversePath[i] = index[len];
            len++;
            index[len] = i;
        } else {
            int idx = ceilIndex(a, len, index, a[i]);
            reversePath[i] = index[idx - 1];
            index[idx] = i;
        }
    }
    for (int i = 0; i < n; i++) System.out.print(reversePath[i] + " ");
    System.out.println();
    // printing the LIS in reverse fashion
    // we iterate the indexes in reverse
    int idx = index[len];
    while (idx != -1) {
        System.out.print(a[idx] + " ");
        idx = reversePath[idx];
    }
    return len + 1;
}
Code for Longest Non-Decreasing subsequence:
public int ceilIndex(int[] a, int n, int[] t, int ele) {
    int l = -1, r = n + 1;
    while (r - l > 1) {
        int mid = l + (r - l) / 2;
        if (a[t[mid]] <= ele) l = mid;
        else r = mid;
    }
    return r;
}

public int lengthOfLongestNonDecreasingSubsequence(int[] a) {
    int n = a.length;
    int[] index = new int[n];
    int len = 0;
    index[len] = 0;
    int[] reversePath = new int[n];
    for (int i = 0; i < n; i++) reversePath[i] = -1;
    for (int i = 1; i < n; i++) {
        if (a[index[0]] > a[i]) {
            index[0] = i;
            reversePath[i] = -1;
        } else if (a[index[len]] <= a[i]) {
            reversePath[i] = index[len];
            len++;
            index[len] = i;
        } else {
            int idx = ceilIndex(a, len, index, a[i]);
            reversePath[i] = index[idx - 1];
            index[idx] = i;
        }
    }
    for (int i = 0; i < n; i++) System.out.print(reversePath[i] + " ");
    System.out.println();
    // printing the subsequence in reverse fashion
    // we iterate the indexes in reverse
    int idx = index[len];
    while (idx != -1) {
        System.out.print(a[idx] + " ");
        idx = reversePath[idx];
    }
    return len + 1;
}

Old Top Coder riddle: Making a number by inserting +

I am thinking about this topcoder problem.
Given a string of digits, find the minimum number of additions required for the string to equal some target number. Each addition is the equivalent of inserting a plus sign somewhere into the string of digits. After all plus signs are inserted, evaluate the sum as usual.
For example, consider "303" and a target sum of 6. The best strategy is "3+03".
I would solve it with brute force as follows:
for each i in 0 to 9 // i -- number of plus signs to insert
    for each combination c of i positions (out of the at most 9 gaps between digits)
        for each pos in c // we can just split the string w/o actually inserting plus signs
            insert a plus sign at position pos
        evaluate the expression
        if the expression value == given sum
            return i
Does it make sense? Is it optimal from a performance point of view?
...
Well, now I see that a dynamic programming solution would be more efficient. Still, it is interesting whether the presented solution makes sense anyway.
It's certainly not optimal. If, for example, you are given the string "1234567890" and the target is a three-digit number, you know that you have to split the string into at least four parts, so you need not check 0, 1, or 2 inserts. Also, the target limits the range of admissible insertion positions. Both points have little impact for short strings but can make a huge difference for longer ones. However, I suspect there is a vastly better method; it smells a bit of DP.
I haven't given it much thought yet, but if you scroll down you can see a link to the contest it was from, and from there you can see the solvers' solutions. Here's one in C#.
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Collections;

public class QuickSums {
    public int minSums(string numbers, int sum) {
        int[] arr = new int[numbers.Length];
        for (int i = 0; i < arr.Length; i++)
            arr[i] = 0;
        int min = 15;
        while (arr[arr.Length - 1] != 2)
        {
            arr[0]++;
            for (int i = 0; i < arr.Length - 1; i++)
                if (arr[i] == 2)
                {
                    arr[i] = 0;
                    arr[i + 1]++;
                }
            String newString = "";
            for (int i = 0; i < numbers.Length; i++)
            {
                newString += numbers[i];
                if (arr[i] == 1)
                    newString += "+";
            }
            String[] nums = newString.Split('+');
            int sum1 = 0;
            for (int i = 0; i < nums.Length; i++)
                try
                {
                    sum1 += Int32.Parse(nums[i]);
                }
                catch
                {
                }
            if (sum == sum1 && nums.Length - 1 < min)
                min = nums.Length - 1;
        }
        if (min == 15)
            return -1;
        return min;
    }
}
Because the input length is small (10), the number of possible splits (which can be enumerated with a simple binary counter of length 10) is small (2^10 = 1024), so your algorithm is fast enough and returns a valid result; IMO there is no need to improve it.
In general, as long as your solution runs within the given time, memory, and other constraints, there is no need for micro-optimization. For example, this case, as akappa suggested, can be solved with DP (like the DP for the two-partition problem), but when your algorithm is already fast enough, doing so is unnecessary and may add a large constant factor or make the code unreadable.
I would just suggest parsing the digits of the string once (into an array of length 10) to avoid repeated string parsing, and evaluating each piece as a*10^k + ... (you can also precompute 10^k for k = 0..9 at startup and save the values).
I think the problem is similar to the Matrix Chain Multiplication problem, where we have to place parentheses for the least multiplication cost; here the parentheses represent '+'. So I think it could be solved by a similar DP approach. I will try to implement it.
Dynamic programming (written here as plain recursion; memoizing req() on (n, sum) turns it into a DP):
public class QuickSums {
    public static int req(int n, int[] digits, int sum) {
        if (n == 0) {
            if (sum == 0)
                return 0;
            else
                return -1;
        } else if (n == 1) {
            if (sum == digits[0]) {
                return 0;
            } else {
                return -1;
            }
        }
        int deg = 1;
        int red = 0;
        int opt = 100000;
        int split = -1;
        for (int i = 0; i < n; i++) {
            red += digits[n - i - 1] * deg;
            int t = req(n - i - 1, digits, sum - red);
            if (t != -1 && t <= opt) {
                opt = t;
                split = i;
            }
            deg = deg * 10;
        }
        if (opt == 100000)
            return -1;
        if (split == n - 1)
            return opt;
        else
            return opt + 1;
    }

    public static int solve(String digits, int sum) {
        int[] dig = new int[digits.length()];
        for (int i = 0; i < digits.length(); i++) {
            dig[i] = digits.charAt(i) - 48;
        }
        return req(digits.length(), dig, sum);
    }

    public static void doit() {
        String digits = "9230560001";
        int sum = 71;
        int result = solve(digits, sum);
        System.out.println(result);
    }
}
Seems to be too late, but I just read some comments and answers here saying no to the DP approach. However, it is a very straightforward DP, similar to the rod-cutting problem.
To get the essence:
int val[N][N];
int dp[N][T];
val[i][j]: numerical value of s[i..j], including both i and j.
val[i][j] can be easily computed with dynamic programming in O(N^2) time, since val[i][j] = 10 * val[i][j-1] + digit(j).
dp[i][j]: minimum number of '+' symbols to be inserted into s[0..i] to get the required sum j.
dp[i][j] = min( 1 + dp[k][j - val[k+1][i]] ) over all k such that 0 <= k < i and val[k+1][i] <= j; the base case dp[i][j] = 0 applies when val[0][i] == j.
In simple terms, to compute dp[i][j] you assume the position k of the last '+' symbol and then recur on s[0..k].
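This DP can be sketched as follows (a sketch; the function and table names are mine): precompute val with the O(N^2) recurrence, then fill the table bottom-up over prefixes and sums.

```cpp
#include <algorithm>
#include <climits>
#include <string>
#include <vector>

int minSums(const std::string& digits, int target)
{
    int n = digits.size();
    // val[i][j]: numeric value of digits[i..j] (long long: 10 digits overflow int)
    std::vector<std::vector<long long>> val(n, std::vector<long long>(n));
    for (int i = 0; i < n; ++i) {
        long long v = 0;
        for (int j = i; j < n; ++j) {
            v = v * 10 + (digits[j] - '0');
            val[i][j] = v;
        }
    }
    // best[i][j]: minimum '+' signs so that digits[0..i] evaluates to j
    const int INF = INT_MAX;
    std::vector<std::vector<int>> best(n, std::vector<int>(target + 1, INF));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j <= target; ++j) {
            if (val[0][i] == j) best[i][j] = 0;       // no '+' at all
            for (int k = 0; k < i; ++k) {             // last '+' after digit k
                long long last = val[k + 1][i];
                if (last <= j && best[k][j - (int)last] != INF)
                    best[i][j] = std::min(best[i][j],
                                          1 + best[k][j - (int)last]);
            }
        }
    return best[n - 1][target] == INF ? -1 : best[n - 1][target];
}
```

On the statement's example, minSums("303", 6) returns 1, corresponding to "3+03".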
