My friend was asked this question in an interview:
We have a vector of integers consisting only of 0s and 1s. A delete consists of selecting a run of consecutive equal numbers and removing it; the remaining parts are then joined together. For example, if the vector is [0,1,1,0], then after removing [1,1] we get [0,0]. Removing a single element that has no equal neighbor also costs one delete.
We need to write a function that returns the minimum number of deletes to make the vector empty.
Example 1:
Input: [0,1,1,0]
Output: 2
Explanation: [0,1,1,0] -> [0,0] -> []
Example 2:
Input: [1,0,1,0]
Output: 3
Explanation: [1,0,1,0] -> [0,1,0] -> [0,0] -> [].
Example 3:
Input: [1,1,1]
Output: 1
Explanation: [1,1,1] -> []
I am unsure of how to solve this question. I feel that we can use a greedy approach:
Remove all consecutive equal elements and increment the delete counter for each;
Remove elements of the form <a, b, c> where a==c and a!=b (b is a single element, because if we had multiple consecutive bs, they would have been deleted in step (1) above). Increment the delete counter once, as we delete one b.
Repeat steps (1) and (2) as long as we can.
Increment delete counter once for each of the remaining elements in the vector.
But I am not sure if this would work. Could someone please confirm if this is the right approach? If not, how do we solve this?
Hint
You can simplify this problem greatly by noticing the following fact: a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. For example, the following two vectors have the same solution:
[1, 0, 1]
[1, 0, 0, 0, 0, 0, 0, 1]
With that in mind, the solution becomes simpler. So I encourage you to pause and try to figure it out!
Solution
With the previous remark, we can reduce the problem to vectors of alternating zeros and ones. In fact, since zero and one have no special meaning here, it suffices to solve for all such vectors that start with, say, a one.
[] # number of steps: 0
[1] # number of steps: 1
[1, 0] # number of steps: 2
[1, 0, 1] # number of steps: 2
[1, 0, 1, 0] # number of steps: 3
[1, 0, 1, 0, 1] # number of steps: 3
[1, 0, 1, 0, 1, 0] # number of steps: 4
[1, 0, 1, 0, 1, 0, 1] # number of steps: 4
We notice a pattern: the solution seems to be floor(n / 2) + 1 for n > 1, where n is the length of the alternating sequence. But can we prove it?
Proof
We will proceed by induction. Suppose we know the solution for an alternating vector of length n - 2. In a vector of length n, any move (except deleting one of the two elements at the ends of the vector) has the following result.
[..., 0, 1, 0, 1, 0, ...]
            ^------------ delete this one
Result:
[..., 0, 1, 1, 0, ...]
But we already mentioned that a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. So the result of the deletion is in fact equivalent to now having to solve for:
[..., 0, 1, 0, ...]
What we did is one deletion in n elements, and we arrived at a case which is equivalent to having to solve for n - 2 elements. So the solution for a vector of size n is...
Solution(n) = Solution(n - 2) + 1
            = [floor((n - 2) / 2) + 1] + 1
            = floor(n / 2) + 1
Keeping in mind that the solutions for [1] and [1, 0] are respectively 1 and 2, this concludes our proof. Notice here, that [] turns out to be an edge case.
Interestingly enough, this proof also shows us that the optimal sequence of deletions for a given vector is highly non-unique. You can simply delete any block of ones or zeros, except for the first and last ones, and you will end up with an optimal solution.
Conclusion
In conclusion, given an arbitrary vector of ones and zeros, the smallest number of deletions you will need can be computed by counting the number n of groups of consecutive ones or zeros. The answer is then floor(n / 2) + 1 for n > 1.
Just for fun, here is a Python implementation to solve this problem.
from itertools import groupby

def solution(vector):
    # n counts the groups of consecutive equal elements
    n = 0
    for group in groupby(vector):
        n += 1
    return n // 2 + 1 if n > 1 else n
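As a sanity check on the formula, here is a minimal sketch that compares it against an exhaustive breadth-first search over all short vectors. The names `formula` and `brute_force` are illustrative and not part of the answer above, and the search assumes that a single delete may remove any contiguous block of equal elements (not only a maximal run).

```python
from collections import deque
from itertools import groupby, product


def formula(vector):
    # Closed form from the answer: n is the number of groups.
    n = sum(1 for _ in groupby(vector))
    return n // 2 + 1 if n > 1 else n


def brute_force(vector):
    # Breadth-first search over every vector reachable by deletes.
    start = tuple(vector)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, steps = queue.popleft()
        if not state:
            return steps
        for i in range(len(state)):
            j = i
            # Try every block state[i:j] made of equal elements.
            while j < len(state) and state[j] == state[i]:
                j += 1
                nxt = state[:i] + state[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))


if __name__ == "__main__":
    for length in range(8):
        for v in product((0, 1), repeat=length):
            assert brute_force(v) == formula(v), v
    print("formula matches brute force for all vectors up to length 7")
```

The search is exponential, so it is only meant for validating small inputs.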
Intuition: If we remove all the subsegments of one integer, then the remaining integers are all of one type, which takes only one more operation to remove.
Removing the subsegments of the integer that is not the starting one leads to optimal results.
Solution:
Take the integer other than the one that is starting as a flag.
Count the number of contiguous segments of the flag in a vector.
The answer will be the above count + 1 (one operation for removing the remaining segment of the starting integer).
So, the answer is:
answer = Count of contiguous segments of flag + 1
Example 1:
[0,1,1,0]
flag = 1
Count of subsegments with flag = 1
So, answer = 1 + 1 = 2
Example 2:
[1,0,1,0]
flag = 0
Count of subsegments with flag = 2
So, answer = 2 + 1 = 3
Example 3:
[1,1,1]
flag = 0
Count of subsegments with flag = 0
So, answer = 0 + 1 = 1
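The flag-counting rule above can be sketched in a few lines of Python. The function name `min_deletes_by_flag` is made up for illustration, and returning 0 for an empty vector is an assumption, since the answer does not cover that case.

```python
from itertools import groupby


def min_deletes_by_flag(vector):
    if not vector:
        return 0  # assumption: an empty vector needs no deletes
    flag = 1 - vector[0]  # the value the vector does NOT start with
    # Count the contiguous segments made of the flag value.
    flag_segments = sum(1 for value, _ in groupby(vector) if value == flag)
    # One extra operation removes the remaining run of the starting value.
    return flag_segments + 1


if __name__ == "__main__":
    print(min_deletes_by_flag([0, 1, 1, 0]))  # Example 1
    print(min_deletes_by_flag([1, 0, 1, 0]))  # Example 2
    print(min_deletes_by_flag([1, 1, 1]))     # Example 3
```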
How to formulate this problem in code?
Problem Statement:
UPDATED:
Find the number of ways to pick the elements of the array that are not yet visited.
We start with the elements 1, 2, ..., n, of which some number x (1 <= x <= n) have already been picked/visited; these are given in the input.
Now we need to find the number of ways to pick the remaining (n - x) elements of the array, where a pick is defined as follows:
On every turn, we can only pick an element that is adjacent (either left or right) to some visited element. For example, in the array
1,2,3,4,5,6 let's say we have visited 3 and 6. Then we can now pick
2 or 4 or 5, as they are unvisited and adjacent to visited elements. Say we pick 2; now we can pick 1 or 4 or 5, and so on.
example:
input: N = 6(number of elements: 1, 2, 3, 4, 5, 6)
M = 2(number of visited elements)
visited elements are = 1, 5
Output: 16(number of ways we can pick the unvisited elements)
ways: 4, 6, 2, 3
4, 6, 3, 2
4, 2, 3, 6
4, 2, 6, 3
4, 3, 2, 6
4, 3, 6, 2
6, 4, 2, 3
6, 4, 3, 2
6, 2, 3, 4
6, 2, 4, 3
2, 6, 4, 3
2, 6, 3, 4
2, 4, 6, 3
2, 4, 3, 6
2, 3, 4, 6
2, 3, 6, 4.
Some analysis of the problem:
The actual values in the input array are assumed to be 1...n, but these values do not really play a role. These values just represent indexes that are referenced by the other input array, which lists the visited indexes (1-based)
The list of visited indexes actually cuts the main array into subarrays with smaller sizes. So for example, when n=6 and visited=[1,5], then the original array [1,2,3,4,5,6] is cut into [2,3,4] and [6]. So it cuts it into sizes 3 and 1. At this point the index numbering loses its purpose, so the problem really is fully described by those two sizes, 3 and 1, together with whether each subarray lies at an outer end of the array. To illustrate, the solution for (n=6, visited=[1,5]) is necessarily the same as for (n=7, visited=[1,2,6]): the sizes into which the original array is cut are the same in both cases.
Algorithm, based on a list of sizes of subarrays (see above):
The number of ways that one such subarray can be visited, is not that difficult: if the subarray's size is 1, there is just one way. If it is greater, then at each pick, there are two possibilities: either you pick from the left side or from the right side. So you get like 2*2*..*2*1 possibilities to pick. This is 2^(size-1) possibilities.
The two outer subarrays are an exception to this, as you can only pick items from the inside-out, so for those the number of ways to visit such a subarray is just 1.
The number of ways that you can pick items from two subarrays can be determined as follows: count the number of ways to pick from just one of those subarrays, and the number of ways to pick from the other one. Then consider that you can alternate when to pick from one sub array or from the other. This comes down to interweaving the two sub arrays. Let's say the larger of the two sub arrays has j elements, and the smaller k, then consider there are j+1 positions where an element from the smaller sub array can be injected (merged) into the larger array. There are "j+1 multichoose k" ways, i.e. C(j+k, k), to inject all elements from the smaller sub array.
When you have counted the number of ways to merge two subarrays, you actually have an array with a size that is the sum of those two sizes. The above logic can then be applied with this array and the next subarray in the problem specification. The number of ways just multiplies as you merge more subarrays into this growing array. Of course, you don't really deal with the arrays, just with sizes.
Here is an implementation in JavaScript, which applies the above algorithm:
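function getSubArraySizes(n, visited) {
    // Translate the problem into a set of sizes (of subarrays).
    // Assumes visited is sorted in ascending order.
    let j = 0;
    let sizes = [];
    for (let i of visited) {
        let size = i - j - 1;
        // Always keep the first (outer) chunk, even when it is empty, so
        // that sizes[0] really is an outer chunk in getPickCount
        if (size > 0 || j == 0) sizes.push(size);
        j = i;
    }
    // Likewise, always keep the last (outer) chunk
    sizes.push(n - j);
    return sizes;
}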
function Combi(n, k) {
    // Count combinations: "from n, take k"
    // See Wikipedia on "Combination"
    let c = 1;
    let end = Math.min(k, n - k);
    for (let i = 0; i < end; i++) {
        c = c * (n-i) / (end-i); // This is floating point
    }
    return c; // ... but result is integer
}
function getPickCount(sizes) {
    // Main function, based on a list of sizes of subarrays
    let count = 0;
    let result = 1;
    for (let i = 0; i < sizes.length; i++) {
        let size = sizes[i];
        // Number of ways to take items from this chunk:
        // - when items can only be taken from one side: 1
        // - otherwise: every time we have a choice between 2, except for the last remaining item
        let pickCount = i == 0 || i == sizes.length-1 ? 1 : 2 ** (size-1);
        // Number of ways to merge/weave two arrays, where relative order of elements is not changed
        // = a "k multichoice from n". See
        // https://en.wikipedia.org/wiki/Combination#Number_of_combinations_with_repetition
        let weaveCount = count == 0 ? 1 // First time only
            : Combi(size+count, Math.min(count, size));
        // Number of possibilities:
        result *= pickCount * weaveCount;
        // Update the size to be the size of the merged/woven array
        count += size;
    }
    return result;
}
// Demo with the example input (n = 6, visited = 1 and 5)
let result = getPickCount(getSubArraySizes(6, [1, 5]));
console.log(result);
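To cross-check the counting argument, here is a rough Python sketch that computes the same product of per-chunk choices and weave counts, and compares it with a direct enumeration of all valid pick orders. The function names `count_pick_orders` and `brute_force_orders` are illustrative, and the sketch assumes at least one visited index, as in the problem statement.

```python
from math import comb


def count_pick_orders(n, visited):
    # Closed form: per-chunk choices times the number of interleavings.
    bounds = sorted(visited) + [n + 1]
    result, merged, prev = 1, 0, 0
    for idx, v in enumerate(bounds):
        size = v - prev - 1          # unvisited items between prev and v
        prev = v
        if size == 0:
            continue
        outer = idx == 0 or idx == len(bounds) - 1
        ways = 1 if outer else 2 ** (size - 1)
        # Weave this chunk into everything merged so far.
        result *= ways * comb(merged + size, size)
        merged += size
    return result


def brute_force_orders(n, visited):
    # Enumerate every valid order directly (exponential, small n only).
    def rec(done):
        if len(done) == n:
            return 1
        total = 0
        for x in range(1, n + 1):
            if x not in done and (x - 1 in done or x + 1 in done):
                total += rec(done | {x})
        return total
    return rec(frozenset(visited))


if __name__ == "__main__":
    print(count_pick_orders(6, [1, 5]), brute_force_orders(6, [1, 5]))
```

For the example input (n = 6, visited 1 and 5) both functions give 16, matching the list in the question.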
I'm looking for a solution to my problem. Say I have a number X. I want to generate 20 random numbers whose sum equals X, and I want those random numbers to have entropy in them. So for example, if X = 50, the algorithm should generate
3
11
0
6
19
7
etc. The sum of the given numbers should equal 50.
Is there any simple way to do that?
Thanks
Simple way:
Generate a random number between 1 and X; call it R1.
Subtract R1 from X, then generate a random number between 1 and (X - R1); call it R2. Repeat the process until all the Ri add up to X, i.e. until the remainder X - (R1 + ... + Rn) is zero. Note: each successive Ri tends to be smaller than the ones before it, because it is drawn from a shrinking range. If you want the final sequence to look more random, simply permute the resulting Ri numbers. For example, if you generate for X=50 an array like 22,11,9,5,2,1, permute it to get something like 9,22,2,11,1,5. You can also put a limit on how large any random number can be.
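A minimal Python sketch of this repeated-subtraction idea might look as follows. The function name `split_by_subtraction` is hypothetical. Note that this approach does not control how many numbers come out, so it will not in general produce exactly 20 values.

```python
import random


def split_by_subtraction(x):
    # Draw numbers from the shrinking remainder until nothing is left.
    parts = []
    remaining = x
    while remaining > 0:
        r = random.randint(1, remaining)  # inclusive on both ends
        parts.append(r)
        remaining -= r
    random.shuffle(parts)  # hide the "large numbers first" tendency
    return parts


if __name__ == "__main__":
    values = split_by_subtraction(50)
    print(values, sum(values))
```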
One fairly straightforward way to get k random values that sum to N is to create an array of size k+1, add values 0 and N, and fill the rest of the array with k-1 randomly generated values between 1 and N-1. Then sort the array and take the differences between successive pairs.
Here's an implementation in Ruby:
def sum_k_values_to_n(k = 20, n = 50)
  a = Array.new(k + 1) { 1 + rand(n - 1) }
  a[0] = 0
  a[-1] = n
  a.sort!
  (1..(a.length - 1)).collect { |i| a[i] - a[i-1] }
end
p sum_k_values_to_n(3, 10) # produces, e.g., [2, 3, 5]
p sum_k_values_to_n # produces, e.g., [5, 2, 3, 1, 6, 0, 4, 4, 5, 0, 2, 1, 0, 5, 7, 2, 1, 1, 0, 1]
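For comparison, the same sorted-cut-points idea can be sketched in Python. This is a loose translation of the Ruby code above, not a drop-in replacement, and it assumes n >= 2 so that the range 1..n-1 is not empty.

```python
import random


def sum_k_values_to_n(k=20, n=50):
    # k - 1 random cut points strictly between 0 and n (assumes n >= 2).
    cuts = sorted(random.randint(1, n - 1) for _ in range(k - 1))
    points = [0] + cuts + [n]
    # Differences between successive points sum to n by construction.
    return [b - a for a, b in zip(points, points[1:])]


if __name__ == "__main__":
    values = sum_k_values_to_n()
    print(values, sum(values))
```

Because cut points may coincide, some of the resulting values can be 0, just as in the Ruby sample output.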
I know that this question has been asked, and there is a very nice elegant solution using a min heap.
My question is: how would one do this using the merge function of merge sort?
You already have an array of sorted arrays. So you should be able to merge all of them into one array in O(n log k) time, correct?
I just can't figure out how to do this!
Say I have
[ [5,6], [3,4], [1,2], [0] ]
Step 1: [ [3,4,5,6], [0,1,2] ]
Step 2: [ [0,1,2,3,4,5,6] ]
Is there a simple way to do this? Is O(n log k) theoretically achievable with merge sort?
As others have said, using the min heap to hold the next items is the optimal way. It's called an N-way merge. Its complexity is O(n log k).
You can use a 2-way merge algorithm to sort k arrays. Perhaps the easiest way is to modify the standard merge sort so that it uses non-constant partition sizes. For example, imagine that you have 4 arrays with lengths 10, 8, 12, and 33. Each array is sorted. If you concatenated the arrays into one, you would have these partitions (the numbers are indexes into the array, not values):
[0-9][10-17][18-29][30-62]
The first merge of your merge sort would have starting indexes of 0 and 10. You would merge those two partitions into a new array, just as you would with the standard merge sort. The next merge would start at positions 18 and 30 and write into the same output array. When you're done with this first pass, your output array contains:
[0-17][18-62]
Now your partitions start at 0 and 18. You merge those two into a single array and you're done.
The only real difference is that rather than starting with a partition size of 2 and doubling, you have non-constant partition sizes. As you make each pass, the new partition size is the sum of the sizes of the two partitions you used in the previous pass. This really is just a slight modification of the standard merge sort.
It will take log(k) passes to do the sort, and at each pass you look at all n items. The algorithm is O(n log k), but with a much higher constant than the N-way merge.
For implementation, build an array of integers that contains the starting indexes of each of your sub arrays. So in the example above you would have:
int[] partitions = [0, 10, 18, 30];
int numPartitions = 4;
Now you do your standard merge sort. But you select your partitions from the partitions array. So your merge would start with:
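merge (inputArray, outputArray, part1Index, part2Index, outputStart)
{
    part1Start = partitions[part1Index];
    part2Start = partitions[part2Index];
    part1Length = part2Start - part1Start;
    // part2 ends where the next partition starts, or at the end of
    // the array if part2 is the last partition
    part2End = (part2Index + 1 < numPartitions)
        ? partitions[part2Index + 1]
        : inputArray.length;
    part2Length = part2End - part2Start;
    // now merge part1 and part2 into the output array,
    // starting at outputStart
}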
And your main loop would look something like:
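while (numPartitions > 1)
{
    for (int p = 0; p < numPartitions; p += 2)
    {
        outputStart = partitions[p];
        merge(inputArray, outputArray, p, p+1, outputStart);
        // update partitions table: the merged partition starts
        // where its first half started
        partitions[p/2] = partitions[p];
    }
    numPartitions /= 2;
    // the output of this pass becomes the input of the next pass,
    // so exchange inputArray and outputArray here
}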
That's the basic idea. You'll have to do some work to handle the dangling partition when the number is odd, but in general that's how it's done.
You can also do it by maintaining an array of arrays, and merging each two arrays into a new array, adding that to an output array of arrays. Lather, rinse, repeat.
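The array-of-arrays variant just described can be sketched in Python as follows. The names `merge_two` and `merge_k_pairwise` are illustrative. Each pass merges neighbouring pairs and carries a dangling last array over unchanged, so about log2(k) passes are made over all n items.

```python
def merge_two(left, right):
    # Standard 2-way merge of two sorted lists.
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_k_pairwise(arrays):
    current = [list(a) for a in arrays]
    while len(current) > 1:
        merged = []
        # Merge neighbours pairwise: (0, 1), (2, 3), ...
        for p in range(0, len(current) - 1, 2):
            merged.append(merge_two(current[p], current[p + 1]))
        if len(current) % 2 == 1:
            merged.append(current[-1])  # dangling array, carried over
        current = merged
    return current[0] if current else []


if __name__ == "__main__":
    print(merge_k_pairwise([[5, 6], [3, 4], [1, 2], [0]]))
```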
You should note that when we say complexity is O(n log k), we assume that n means TOTAL number of elements in ALL of k arrays, i.e. number of elements in a final merged array.
For example, if you want to merge k arrays that contain n elements each, total number of elements in final array will be nk. So complexity will be O(nk log k).
There are different ways to merge arrays. To accomplish that task in N*Log(K) time you can use a structure called a heap (it is a good structure for implementing a priority queue). I suppose that you already have one; if you don't, then pick up any available implementation: http://en.wikipedia.org/wiki/Heap_(data_structure)
Then you can do that like this:
1. We have A[1..K], the array of sorted arrays to merge; Head[1..K], the current position in each array; and Count[1..K], the number of items in each array.
2. We have a heap of pairs (Value: int; NumberOfArray: int), empty at the start.
3. We put the first item of every array into the heap (initialization phase).
4. Then we loop:
5. Get the pair (Value, NumberOfArray) with the smallest Value from the heap.
6. Value is the next value to output.
7. NumberOfArray is the number of the array from which we take the next item (if any) and place it into the heap.
8. If the heap is not empty, repeat from step 5.
So for every item we operate only on a heap of at most K items. This means we get N*Log(K) complexity, as you asked.
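The steps above can be sketched with Python's standard `heapq` module. The function name `merge_k_heap` is made up for illustration; each heap entry stores the value, the index of its source array, and the position inside that array, which plays the role of the Head pointers.

```python
import heapq


def merge_k_heap(arrays):
    # Initialization: the first item of every non-empty array.
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx, pos = heapq.heappop(heap)   # smallest remaining value
        out.append(value)
        if pos + 1 < len(arrays[idx]):
            # Refill the heap from the array the value came from.
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return out


if __name__ == "__main__":
    print(merge_k_heap([[5, 6], [3, 4], [1, 2], [0]]))
```

The heap never holds more than K entries, so each push or pop costs O(log K).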
I implemented it in Python. The main idea is similar to merge sort. There are k arrays in lists. The function mainMergeK just divides lists (k arrays) into left (k/2) and right (k/2) halves, so the recursion depth is log(k). The function merge clearly runs in O(n) time. In total we get O(n log k).
By the way, it can also be implemented with a min heap; see: Merging K-Sorted Lists using Priority Queue
def mainMergeK(*lists):
    # implemented by k-way partition
    k = len(lists)
    if k > 1:
        mid = k // 2
        B = mainMergeK(*lists[0: mid])
        C = mainMergeK(*lists[mid:])
        A = merge(B, C)
        print(B, '+', C, '=', A)
        return A
    return lists[0]

def merge(B, C):
    # standard 2-way merge of two sorted lists
    A = []
    p = len(B)
    q = len(C)
    i = 0
    j = 0
    while i < p and j < q:
        if B[i] <= C[j]:
            A.append(B[i])
            i += 1
        else:
            A.append(C[j])
            j += 1
    # one of the lists is exhausted; copy the rest of the other
    if i == p:
        for c in C[j:]:
            A.append(c)
    else:
        for b in B[i:]:
            A.append(b)
    return A

if __name__ == '__main__':
    x = mainMergeK([1, 3, 5], [2, 4, 6], [7, 8, 10], [9])
    print(x)
The output looks like this:
[1, 3, 5] + [2, 4, 6] = [1, 2, 3, 4, 5, 6]
[7, 8, 10] + [9] = [7, 8, 9, 10]
[1, 2, 3, 4, 5, 6] + [7, 8, 9, 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Just do it like a 2-way merge, except with K items. This results in O(NK). If you want O(N log K), you will need a min-heap to keep track of the K pointers (with the source array as metadata) in the algorithm below:
Keep an array of K elements, i.e. K pointers showing the current position in each array.
Mark all K elements as valid.
loop:
Compare the values at the K pointers that are valid. Take the minimum value (on a tie, the pointer with the least index) and advance that pointer to the next value in its array. If the advanced pointer has moved past the end of its array, mark it invalid.
Add the least value to the result.
Repeat until all K elements are invalid.
For example:
Positions  Arrays
p1:0       Array 1: 0 5 10
p2:3       Array 2: 3 6 9
p3:2       Array 3: 2 4 6
Output (min of 0,3,2) => 0. So output is {0}
           Array
p1:5       0 5 10
p2:3       3 6 9
p3:2       2 4 6
Output (min of 5,3,2) => 2. So {0,2}
           Array
p1:5       0 5 10
p2:3       3 6 9
p3:4       2 4 6
Output (min of 5,3,4) => 3. So {0,2,3}
...and so on, until you come to a state where the output is {0,2,3,4,5,6}
           Array
p1:10      0 5 10
p2:9       3 6 9
p3:6       2 4 6
Output (min of 10,9,6) => 6. So {0,2,3,4,5,6}+{6}, and you mark p3 as "invalid" since you have exhausted that array. (Or, if you are using a min-heap, you simply remove the min item, get its source array metadata, in this case array 3, and see that it is done, so you do not add anything new to the min-heap.)
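The pointer-scanning loop traced above can be sketched in Python like this. The name `merge_k_scan` is illustrative; a pointer counts as "invalid" once it has reached the end of its array, and ties go to the array with the lowest index. Every output value costs a scan over all K pointers, which is where the O(NK) bound comes from.

```python
def merge_k_scan(arrays):
    pos = [0] * len(arrays)  # one pointer per array
    out = []
    while True:
        best = None
        for idx, arr in enumerate(arrays):
            if pos[idx] >= len(arr):
                continue  # this pointer is invalid (array exhausted)
            # Strict "<" keeps the lowest index on ties.
            if best is None or arr[pos[idx]] < arrays[best][pos[best]]:
                best = idx
        if best is None:
            break  # all pointers are invalid
        out.append(arrays[best][pos[best]])
        pos[best] += 1
    return out


if __name__ == "__main__":
    print(merge_k_scan([[0, 5, 10], [3, 6, 9], [2, 4, 6]]))
```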
I'm new to pseudocode, and I'm having trouble putting all the pieces together:
Here is the definition of a function named foo whose inputs are two integers and an array of integers a[1] ... a[n].
1 Foo(k,m, a[1],...,a[n])
2 if (k < 1 or m > n or k > m) return 0
3 else return a[k] + Foo(k+1,m,a[1],...,a[n])
Suppose that the input integers are k=2 and m=5 and the input array contains [5, 6, 2, 3, 4, 8, 2]. What value does Foo return? Using summation notation, give a general formula for what Foo computes.
This one is making my head hurt. Here's what I did so far:
Line 2 has three conditional statements:
If k<1 // if 2<1..this is false
If m>n // if 5 is greater than the amount of values in the array, which is 7, so this is false
If k>m // if 2>5, this is false
So this function will execute line 3. Line 3 says:
return a[k] which is a[2] which is the second value of the array, which is 6. So take 6 and add it to (2+1, 5, a[1].....,a[n])
Is what I have done correct up there? If so, how would I know what a[n] is? Am I supposed to be finding that? What would be the final result of all this?
Simple answer: that function returns the sum of all the numbers a[k], a[k+1], ... a[m].
What you're doing is correct so far. The "n" is just a placeholder for the number of elements in the array. So if your input array is {5,6,2,3,4,8,2}, n = 7 (because you have seven elements), and a[n] = 2.
But why it returns the sum of all numbers a[k], a[k+1], ... a[m], you should find out for yourself. Just continue with your analysis. :)
So take 6 and add it to (2+1, 5, a[1].....,a[n])
Take 6 and add it to Foo(2+1, 5, a[1].....,a[n]). It's a recursive function. You have to evaluate the function again with k=3 and m=5.
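To make the recursion concrete, here is a small Python transcription of Foo. It is only a sketch: Python lists are 0-based, so a[k] in the pseudocode becomes a[k - 1], and the array is passed as one list instead of a[1],...,a[n]. For valid inputs (1 <= k <= m <= n) the function computes the sum of a[i] for i from k to m; otherwise it returns 0.

```python
def foo(k, m, a):
    n = len(a)
    # Line 2 of the pseudocode: invalid range, or recursion finished.
    if k < 1 or m > n or k > m:
        return 0
    # Line 3: a[k] (1-based) plus the result for the rest of the range.
    return a[k - 1] + foo(k + 1, m, a)


if __name__ == "__main__":
    # k=2, m=5: a[2] + a[3] + a[4] + a[5] = 6 + 2 + 3 + 4
    print(foo(2, 5, [5, 6, 2, 3, 4, 8, 2]))  # prints 15
```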
I think you are confused because your pseudocode looks like real code to me. I may be wrong, but we are taught to write pseudocode differently, using plain English phrases.