Suppose we have a list of numbers like [6,5,4,7,3]. How can we tell that the array contains consecutive numbers? One way, of course, is to sort them, or we can find the minimum and maximum. But can we determine it based on the sum of the elements? E.g. in the example above, the sum is 25. Could anyone help me with this?
The sum of elements by itself is not enough.
Instead you could check for:
All elements being unique.
and either:
Difference between max and min being exactly n - 1 (for n elements)
or
Sum of all elements being exactly n * (min + max) / 2.
Approach 1
Sort the list and check the first element and last element.
In general this is O(n log n), but if you have a limited data set you can sort in O(n) time using counting sort or radix sort.
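Here's a minimal Java sketch of this approach (class and method names are my own). After sorting, checking that every adjacent pair differs by exactly 1 covers both the uniqueness requirement and the endpoint check in one pass.

import java.util.Arrays;

public class ConsecutiveCheck {
    // Sort a copy, then every adjacent pair must differ by exactly 1;
    // this folds the uniqueness check into the same pass.
    static boolean isConsecutive(int[] a) {
        int[] copy = a.clone();
        Arrays.sort(copy);
        for (int i = 1; i < copy.length; i++)
            if (copy[i] != copy[i - 1] + 1) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isConsecutive(new int[]{6, 5, 4, 7, 3})); // true
        System.out.println(isConsecutive(new int[]{6, 5, 4, 7, 7})); // false
    }
}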
Approach 2
Pass over the data to get the highest and lowest elements.
As you pass through, add each element into a hash table and check whether that element has now been added twice. This is O(n) on average.
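A minimal Java sketch of this single-pass approach (names are mine): track min and max while a HashSet catches duplicates, then apply the max - min == n - 1 test.

import java.util.HashSet;
import java.util.Set;

public class ConsecutiveCheckHash {
    // One pass: track min/max and use a hash set to detect duplicates.
    // The array is consecutive iff all values are unique and
    // max - min == n - 1.
    static boolean isConsecutive(int[] a) {
        if (a.length == 0) return false;
        Set<Integer> seen = new HashSet<>();
        int min = a[0], max = a[0];
        for (int v : a) {
            if (!seen.add(v)) return false; // added twice: duplicate
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return max - min == a.length - 1;
    }

    public static void main(String[] args) {
        System.out.println(isConsecutive(new int[]{6, 5, 4, 7, 3})); // true
    }
}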
Approach 3
To save storage space (hash table), use an approximate approach.
Pass over the data to get the highest and lowest elements.
As you do, run an algorithm which will, with high (read: user-defined) probability, determine whether each element is distinct. Many such algorithms exist and are in use in data mining. Here's a link to a paper describing different approaches.
The numbers in the array are consecutive if the difference between the maximum and minimum of the array is equal to n - 1, provided the numbers are unique (where n is the size of the array). And of course the minimum and maximum can be found in O(n).
I've been trying to find a solution for this question:
Given an array of integers, count the distinct permutations that are palindromes ("mirrors"); that is, find the number of distinct ways that the array's elements can be rearranged so that they read the same way backward as forward. For example:
If the array is [1,1,2], then there is only one distinct palindromic permutation (namely [1,2,1]), so the desired result is 1.
If the array is [1,1,2,2], then there are two distinct palindromic permutations (namely [1,2,2,1] and [2,1,1,2]), so the desired result is 2.
If the array is [2,2,2,3,3], then there are two distinct palindromic permutations (namely [3,2,2,2,3] and [2,3,2,3,2]), so the desired result is 2.
I've been trying to solve this and have been stuck for quite a while, and I can't find any solution online. Any help will be appreciated (just starting out on algo & ds stuff).
My idea is to find the index of the median of that array (e.g., in example #1, the median is at index 1) and move all numbers after it to before it (so, [1,2,1]), and check using two pointers (one at end, one at start) if all numbers are equal.
However, this won't work if, say, example #1 were arr = [1,2,2], since doing the above would give [1,2,2]. What I should have done in that case is move the 1 in between the 2s (sort of a median from the end, if that makes sense). Sort of like the above method, but in reverse (?)
Here is the general idea:
Count the frequency of each unique value.
If the array's length is odd, then exactly one frequency should be odd. If not, there are no mirrors. If so, that value will have to be placed in the center, and the number of mirrors is then equal to what you would get for the array with one occurrence of that value removed.
Now the array length is even. No frequencies should be odd, or else there are no mirrors. Now halve all those frequencies.
Determine how many permutations can be formed with those values and their (halved) frequencies. The formula is:
n! / (n1! * n2! * ... * nk!)
where n is the sum of all (halved) frequencies (i.e. half the size of the array), and the ni are the individual (halved) frequencies.
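To make the recipe concrete, here is a small Java sketch (class and helper names are mine) applying the steps above: count frequencies, reject impossible parity, then evaluate the multinomial over the halved counts. BigInteger guards against overflow on larger inputs.

import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

public class PalindromicPermutations {
    // Count distinct palindromic permutations via the frequency-halving
    // argument above.
    static BigInteger countMirrors(int[] arr) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int v : arr) freq.merge(v, 1, Integer::sum);

        // At most one odd frequency is allowed, and only for odd-length
        // arrays (that value occupies the center).
        int oddCount = 0;
        for (int f : freq.values()) if (f % 2 != 0) oddCount++;
        if (oddCount > arr.length % 2) return BigInteger.ZERO;

        // Multinomial n! / (n1! * n2! * ... * nk!) over the halved counts.
        BigInteger result = factorial(arr.length / 2);
        for (int f : freq.values()) result = result.divide(factorial(f / 2));
        return result;
    }

    static BigInteger factorial(int n) {
        BigInteger r = BigInteger.ONE;
        for (int i = 2; i <= n; i++) r = r.multiply(BigInteger.valueOf(i));
        return r;
    }

    public static void main(String[] args) {
        System.out.println(countMirrors(new int[]{1, 1, 2}));       // 1
        System.out.println(countMirrors(new int[]{1, 1, 2, 2}));    // 2
        System.out.println(countMirrors(new int[]{2, 2, 2, 3, 3})); // 2
    }
}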
I have a list which contains random numbers such that Number >= 0. Now I have to divide the list into 2 equal parts (assume the list contains an even number of elements) such that all the numbers in the first list are less than or equal to the numbers in the second list. This can easily be done by any sorting mechanism in O(n log n). But I don't need the data within the two equal-length lists to be sorted; the only condition is that all elements in the first list <= all elements in the second list.
So is there a way or hack to reduce the complexity, since we don't require sorted data here?
If the problem is actually solvable (the data is right), you can find the median using the selection algorithm. When you have that, you just create 2 equally sized arrays and iterate over the original list element by element, putting each element into either of the new lists depending on whether it's bigger or smaller than the median. Should run in linear time.
Edit: as gen-y-s pointed out, if you write the selection algorithm yourself or use a proper library, it may already partition the input list, so there is no need for the second pass.
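Java has no built-in equivalent of C++'s std::nth_element, so here is a rough quickselect sketch (all names are mine) illustrating the idea: after selecting index n/2 - 1, everything to its left is <= everything to its right, which is exactly the required split.

import java.util.Arrays;
import java.util.Random;

public class MedianSplit {
    private static final Random RND = new Random();

    // Quickselect: afterwards a[0..k] <= a[k+1..end], so selecting
    // k = n/2 - 1 yields the two required halves in expected linear time.
    static void quickselect(int[] a, int lo, int hi, int k) {
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k) return;
            if (p < k) lo = p + 1; else hi = p - 1;
        }
    }

    private static int partition(int[] a, int lo, int hi) {
        swap(a, lo + RND.nextInt(hi - lo + 1), hi); // random pivot
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) if (a[j] < pivot) swap(a, i++, j);
        swap(a, i, hi);
        return i;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {9, 1, 8, 2, 7, 3};
        quickselect(a, 0, a.length - 1, a.length / 2 - 1);
        System.out.println(Arrays.toString(Arrays.copyOfRange(a, 0, a.length / 2)) + " "
                + Arrays.toString(Arrays.copyOfRange(a, a.length / 2, a.length)));
    }
}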
In a recent campus Facebook interview I was asked to divide an array into 3 equal parts such that the sum in each part is roughly equal to sum/3.
My approach:
1. Sort the array.
2. Fill array[k] (k = 0) up until array[k] <= sum/3.
3. After that, increment k and repeat the above step for array[k].
Is there any better algorithm for this, or is it an NP-hard problem?
This is a variant of the partition problem (see http://en.wikipedia.org/wiki/Partition_problem for details). In fact a solution to this can solve that one (take an array, pad with 0s, and then solve this problem), so this problem is NP-hard.
There is a dynamic programming approach that is pseudo-polynomial. For each i from 0 to the size of the array, you keep track of all possible combinations of current sizes for the sub arrays, and their current sums. As long as there is a limited number of possible sums of subsets of the array, this runs acceptably fast.
The solution that I would have suggested is to just go for "good enough" closeness. First let's consider the simpler problem with all values positive. Then sort by value descending. Take that array in threes. Build up the three subsets by always adding the largest of the triple to the one with the smallest sum, the smallest to the one with the largest, and the middle to the middle. You will end up dividing the array evenly, and the difference will be no more than the value of the third smallest element.
For the general case you can divide into positive and negative, use the above approach on each, and then brute force all combinations of a group of positives, a group of negatives, and the few leftover values in the middle that did not divide evenly.
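A quick Java sketch of the all-positive greedy just described (names are mine). It tracks only the three running sums; carrying the actual elements along is a straightforward extension.

import java.util.Arrays;
import java.util.Comparator;

public class GreedyThreeWaySplit {
    // Sort descending, take elements in triples; within each triple give
    // the largest value to the subset with the smallest running sum, the
    // smallest value to the subset with the largest sum, and the middle
    // value to the remaining subset.
    static long[] split(int[] values) {
        Integer[] sorted = Arrays.stream(values).boxed()
                .sorted(Comparator.reverseOrder()).toArray(Integer[]::new);
        long[] sums = new long[3];
        for (int i = 0; i < sorted.length; i += 3) {
            Integer[] order = {0, 1, 2}; // subset indices by ascending sum
            Arrays.sort(order, (x, y) -> Long.compare(sums[x], sums[y]));
            for (int j = 0; j < 3 && i + j < sorted.length; j++)
                sums[order[j]] += sorted[i + j]; // largest -> smallest sum
        }
        return sums;
    }

    public static void main(String[] args) {
        // Total is 45, so a perfect split would be 15/15/15.
        System.out.println(Arrays.toString(split(new int[]{9, 8, 7, 6, 5, 4, 3, 2, 1})));
    }
}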
Here are details on a dynamic programming solution if you are interested. The running time and memory usage are O(n*(sum)^2), where n is the size of your array and sum is the sum of the absolute values of your array's values.
For each array index j from 1 to n, store all the possible values you can get for your 3 subset sums when you split the array from index 1 to j into 3 subsets. Also, for each possibility, store one possible way to split the array to get the 3 sums. Then, to extend this information from 1 to (j+1) given the information from 1 to j, simply take each possible combination of 3 sums for splitting 1 to j and form the 3 combinations of 3 sums you get when you choose to add the (j+1)th array element to any one of the 3 subsets.
Finally, when you are done and reach j = n, go through the set of all combinations of 3 subset sums you can get when you split array positions 1 to n into 3 sets, and choose the one whose maximum deviation from sum/3 is minimized. At first this may seem like O(n*(sum)^3) complexity, but for each j and each combination of the first 2 subset sums, the 3rd subset sum is uniquely determined (because you are not allowed to omit any elements of the array). Thus the complexity really is O(n*(sum)^2).
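For the curious, here is a compact Java sketch of that DP (names are mine; I assume nonnegative values to keep it short, and I track only the reachable sum pairs rather than reconstructing an actual split). Since the third sum is total - s1 - s2, a boolean table over (s1, s2) suffices, matching the O(n*(sum)^2) bound.

import java.util.Arrays;

public class ThreeWayPartitionDP {
    // reachable[s1][s2] == true means some assignment of the elements seen
    // so far puts sum s1 in subset 1 and s2 in subset 2 (subset 3 holds
    // the rest). Assumes nonnegative values.
    static int[] bestSums(int[] values) {
        int total = Arrays.stream(values).sum();
        boolean[][] reachable = new boolean[total + 1][total + 1];
        reachable[0][0] = true;
        for (int v : values) {
            // Sweep downward so each element is assigned exactly once.
            for (int s1 = total; s1 >= 0; s1--)
                for (int s2 = total - s1; s2 >= 0; s2--)
                    if (reachable[s1][s2]) {
                        if (s1 + v <= total) reachable[s1 + v][s2] = true;
                        if (s2 + v <= total) reachable[s1][s2 + v] = true;
                        // v going to subset 3 leaves (s1, s2) as is.
                    }
        }
        int[] best = null;
        int bestDev = Integer.MAX_VALUE;
        for (int s1 = 0; s1 <= total; s1++)
            for (int s2 = 0; s1 + s2 <= total; s2++)
                if (reachable[s1][s2]) {
                    int s3 = total - s1 - s2;
                    // Compare 3*|si - total/3| to stay in integer arithmetic.
                    int dev = Math.max(Math.abs(3 * s1 - total),
                            Math.max(Math.abs(3 * s2 - total), Math.abs(3 * s3 - total)));
                    if (dev < bestDev) { bestDev = dev; best = new int[]{s1, s2, s3}; }
                }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(bestSums(new int[]{5, 5, 4, 3, 3})));
    }
}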
I'm working on a program that takes in a bunch (y) of integers and then needs to return the x highest integers in order. This code needs to be as fast as possible, but at the moment I don't think I have the best algorithm.
My approach/algorithm so far is to maintain a sorted list of integers (high to low) that have already been input, handling each item as it comes in. For the first x items, I keep a sorted array of integers, and when each new item comes in, I figure out where it should be placed using a binary search. (I'm also considering just taking in the first x items and then quicksorting them, but I don't know if this is faster.) After the first x items have been sorted, I consider each remaining item by first seeing if it qualifies to enter the already sorted list of highest integers (by seeing if the new integer is greater than the integer at the end of the list); if it does, I add it to the sorted list via a binary search and remove the integer at the end of the list.
I was wondering if anyone had any advice as to how I can make this faster, or perhaps an entire new approach that is faster than this. Thanks.
This is a partial sort:
The fastest implementation is Quicksort where you only recurse on ranges containing the bottom/top k elements.
In C++ you can just use std::partial_sort
If you use a heap-ordered tree data structure to store the integers, inserting a new integer takes no more than lg N comparisons and removing the maximum takes no more than 2 lg N comparisons. Thus, to insert y items would require no more than y lg N comparisons, and to remove the top x items would require no more than 2x lg N comparisons. The Wikipedia entry has references to a range of implementations.
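A common variant of this heap idea, sketched below in Java with my own names: instead of heaping all y items as described, keep a min-heap of only the x largest seen so far. The root is then the smallest of the current top x, so each incoming value either replaces it or is discarded, at O(log x) per operation.

import java.util.Arrays;
import java.util.PriorityQueue;

public class TopX {
    // Min-heap of size x: the root is the smallest of the current top x,
    // so any larger incoming value replaces it. O(y log x) overall.
    static int[] topX(int[] values, int x) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int v : values) {
            if (heap.size() < x) heap.add(v);
            else if (v > heap.peek()) { heap.poll(); heap.add(v); }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) result[i] = heap.poll();
        return result; // highest to lowest
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topX(new int[]{5, 1, 9, 3, 7, 8}, 3))); // [9, 8, 7]
    }
}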
This is called a top-N sort. Here is a very simple and efficient scheme. No fancy data structures needed.
1. Keep a list of the highest x elements (it starts out empty)
2. Split your input into chunks of x * 10 items
3. For each chunk, add the remembered list of the x highest items so far to it and sort it (e.g. quicksort)
4. Keep the x highest items. They form the new remembered list
5. Goto 3 until all chunks are processed
6. The remembered list is now your final result
This is O(N) in the number of items and only requires a normal quick sort as a primitive.
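Here is a short Java sketch of the scheme (names are mine); the library sort stands in for the quicksort step.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ChunkedTopN {
    // Carry the best x items into each chunk, sort the combined buffer
    // descending, and keep only the top x again.
    static List<Integer> topN(List<Integer> input, int x) {
        List<Integer> best = new ArrayList<>(); // remembered top-x list
        int chunk = x * 10;
        for (int start = 0; start < input.size(); start += chunk) {
            List<Integer> buffer = new ArrayList<>(best);
            buffer.addAll(input.subList(start, Math.min(start + chunk, input.size())));
            buffer.sort(Collections.reverseOrder()); // stand-in for quicksort
            best = new ArrayList<>(buffer.subList(0, Math.min(x, buffer.size())));
        }
        return best; // the x highest items, in descending order
    }

    public static void main(String[] args) {
        System.out.println(topN(Arrays.asList(4, 9, 2, 8, 6, 1, 7, 3, 5), 3)); // [9, 8, 7]
    }
}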
You don't seem to need the top N items in sorted order. Because of this, you can solve this in linear time.
Find the Nth largest array element using linear-time selection. Return it and all array elements larger than it.
Find the Nth most frequent number in an array.
(There is no limit on the range of the numbers)
I think we can
(i) store the occurrence count of every element using a map in C++,
(ii) build a max-heap of the occurrence counts (frequencies) in linear time and then extract up to the N-th element,
where each extraction takes O(log n) time to re-heapify;
(iii) this gives us the frequency of the N-th most frequent number,
(iv) and then we can linearly search through the hash to find the element having this frequency.
Time - O(N log N)
Space - O(N)
Is there any better method ?
It can be done in linear time and space. Let T be the total number of elements in the input array from which we have to find the Nth most frequent number:
Count and store the frequency of every number in the input array in a map. Let M be the total number of distinct elements in the array, so the size of the map is M. -- O(T)
Find Nth largest frequency in map using Selection algorithm. -- O(M)
Total time = O(T) + O(M) = O(T)
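A Java sketch of these two steps (names are mine). Arrays.sort stands in for the selection step; substituting a true linear-time selection (like the quickselect sketched earlier) keeps the whole procedure O(T).

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class NthMostFrequent {
    // Step 1: count frequencies in a map. Step 2: find the Nth largest
    // frequency (a sort stands in for O(M) selection here), then scan the
    // map for a key with that frequency.
    static int nthMostFrequent(int[] values, int n) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int v : values) freq.merge(v, 1, Integer::sum);

        int[] counts = freq.values().stream().mapToInt(Integer::intValue).toArray();
        Arrays.sort(counts); // replace with linear-time selection for O(T)
        int target = counts[counts.length - n]; // Nth largest frequency

        for (Map.Entry<Integer, Integer> e : freq.entrySet())
            if (e.getValue() == target) return e.getKey();
        throw new IllegalStateException("unreachable: target came from the map");
    }

    public static void main(String[] args) {
        // 3 appears three times, 1 twice, 2 once -> 2nd most frequent is 1.
        System.out.println(nthMostFrequent(new int[]{3, 3, 3, 1, 1, 2}, 2));
    }
}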
Your method is basically right. You would avoid the final hash search if you marked each vertex of the constructed heap with the number it represents. Moreover, it is possible to keep watch on the N-th element of the heap as you are building it, because at some point you can reach a situation where the outcome cannot change anymore and the rest of the computation can be dropped. But this would probably not make the algorithm faster in the general case, and maybe not even in special cases. So you answered your own question correctly.
It depends on whether you want the most effective method or the easiest-to-write one.
1) If you know that all numbers will be from 0 to 1000, you just make an array of 1001 zeros (occurrence counts), loop through your array and increment the right occurrence position. Then you sort these occurrences and select the Nth value.
2) You keep a "bag" of unique items: you loop through your numbers and check if each number is in the bag; if not, you add it, and if it is there, you just increment its occurrence count. Then you pick the Nth smallest number from it.
The bag can be a linear array, a BST, or a dictionary (hash table).
The question is "N-th most frequent", so I think you cannot avoid sorting (or a clever data structure), so the best complexity cannot be better than O(n log n).
I just wrote a method in Java 8. This is not an efficient solution:
Create a frequency map for each element.
Sort the map's entries by value in reverse order.
Skip the first (N-1) entries, then take the first remaining element.
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
// n is the 1-based rank (1 = most frequent); ties are broken arbitrarily.
private static Integer findNthMostFrequentElement(int[] inputs, int n) {
    return Arrays.stream(inputs).boxed()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet().stream().sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .skip(n - 1).findFirst().get().getKey();
}