Huffman Tree with Max-Height, Nice Questions? - algorithm

I ran into a nice question in a homework solution for a data structures course.
Which of the following sequences (for large n) produces the tallest Huffman tree? The elements of each sequence are the frequencies of the characters in the input text; the characters themselves are not shown.
1) sequence of n equal numbers
2) sequence of n consecutive Fibonacci numbers.
3) sequence <1,2,3,...,n>
4) sequence <1^2,2^2,3^2,...,n^2>
Could anyone explain why the solution selects (2)? Thanks.

Let's analyze the various options here.
A sequence of N equal numbers means a balanced tree will be created with the actual symbols at the bottom leaf nodes.
The sequence 1..N has the property that, as you start grouping the two lowest elements, their sum quickly rises above other elements. Here's an example:
As you can see, the groups from 4+5 and 7+8 did not by themselves contribute to the height of the tree.
After grouping the two 3-nodes into a 6, nodes 4 and 5 are the next in line, which means that not every new group formed contributes to the height. Most will, but not all, and that's the important fact.
A sequence of squares (note: squares as in the fourth option in the question, 1^2, 2^2, 3^2, 4^2, ..., N^2, not square diagram elements) behaves much like the sequence 1..N. Some of the time, elements other than the group that was just formed are merged next, which cuts down on the height:
As you can see here, the same happened to 36+49, it did not contribute to the height of the tree.
However, the Fibonacci sequence is different. As you group the two lowest nodes, their sum can exceed the next item but never the one after it, which means that each new group is used again in the very next merge, so every new group contributes to the height of the tree. This is different from the other 3 examples.
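To check this reasoning numerically, here is a minimal Python sketch that builds a Huffman tree for each of the four options and reports only its height. The function names (`huffman_height`, `fibonacci`) are made up for this illustration, and ties between equal weights are broken in favor of the shallower subtree; Huffman trees are not unique, so a different tie-break could give different heights for inputs with many equal weights.

```python
import heapq

def huffman_height(freqs):
    """Height (in edges) of a Huffman tree built from the given frequencies."""
    # Each heap entry is (weight, height of the subtree with that weight).
    heap = [(f, 0) for f in freqs]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, h1 = heapq.heappop(heap)  # two lightest subtrees
        w2, h2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, max(h1, h2) + 1))
    return heap[0][1] if heap else 0

def fibonacci(n):
    """First n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq, a, b = [], 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

if __name__ == "__main__":
    n = 10
    options = {
        "equal": [1] * n,
        "fibonacci": fibonacci(n),
        "1..n": list(range(1, n + 1)),
        "squares": [i * i for i in range(1, n + 1)],
    }
    for name, freqs in options.items():
        print(name, huffman_height(freqs))
```

For n = 10 the Fibonacci input gives height 9, that is n - 1: every merge adds one level, so the tree degenerates into a chain.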

Related

Find every possible permutation of bits in a 2D array that has a single group of contiguous 1s

Below I have represented 2 permutations of bits in a 2D bit array (1s are red). The matrix on the left has a single group of contiguous 1s but the right matrix has 2.
I would like to loop through every possible permutation of binary values in such an array that has a single group of contiguous 1s. I am aware that for a 10×7 grid like above there are 2^(10 × 7) permutations when you include non-contiguous permutations, but my hope is that by excluding non-contiguous permutations I will be able to go through them all in reasonable CPU time.
Speaking of reasonableness, I am also interested in an algorithm to determine how many permutations are contiguous.
My question is similar to, but different from, these:
2D Bit matrix with every possible combination
Finding Contiguous Areas of Bits in 2D Bit Array
Any help is appreciated. I'm a bit stuck.
So, I found that the OEIS (Online Encyclopedia of Integer Sequences) has a sequence from n = 0..7 for the "number of nonzero n X n binary arrays with all 1's connected" (A059525). They provide no formula though except for grids fixed at 1 cell wide (triangular numbers), or 2 or 3 cells wide. There's a sequence for 4 x n too but no formula.
I can think of two approaches. One is to iterate through all possible arrays, test each one for non-contiguous groups, and devise some method for skipping over large regions guaranteed to be non-contiguous.
A second approach is to attempt to build all sets of contiguous groups so you don't need to test. This is the approach I would take:
Let n = width * height
Enumerate blocks left to right, top to bottom from 0 to n - 1
Fix a block at position 0.
Generate all contiguous solutions between 1 and n blocks extending from position 0
Omit position 0 and fix a block at position 1
Find all contiguous solutions between 1 and n - 1 blocks extending from position 1
Continue until you've reached position n - 1
You can place your pieces according to the following rules, backtracking for the next placement at each depth:
To left of most recently placed piece if placed in row above prior piece provided that no other neighbors exist for that vacancy.
Above left-most available piece in row of most recently placed piece if no other neighbors exist for that vacancy.
To right of most recently placed piece (if adjacent piece exists)
On the next row, farthest left vacancy such that upper row has a piece above any contiguous right remaining neighbors
Next move for any backtracked position is first available move to the right of, or in the row below, the backtracked position (obeying prior 4 rules)
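For small grids, the first approach (enumerate every array and test it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: cells are connected through their four edge neighbors, the grid is packed into an integer bitmask row by row, and the names `is_connected` and `count_connected` are made up here. It visits all 2^(width × height) masks, so it is hopeless for a 10×7 grid, but it is useful for checking a smarter generator against known counts.

```python
def is_connected(mask, width, height):
    """True if the 1-bits of a nonzero mask form one 4-connected group."""
    start = mask & -mask                 # lowest set bit
    seen = start
    frontier = [start.bit_length() - 1]  # cell index of that bit
    while frontier:
        pos = frontier.pop()
        r, c = divmod(pos, width)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < height and 0 <= nc < width:
                bit = 1 << (nr * width + nc)
                if mask & bit and not seen & bit:
                    seen |= bit
                    frontier.append(nr * width + nc)
    return seen == mask

def count_connected(width, height):
    """Count nonzero width x height binary arrays whose 1s are all connected."""
    total = 1 << (width * height)
    return sum(is_connected(m, width, height) for m in range(1, total))

if __name__ == "__main__":
    print(count_connected(2, 2))
    print(count_connected(4, 1))
```

A 1-cell-wide grid of length 4 gives 10, the triangular number mentioned above, and a 2×2 grid gives 13 (all 15 nonzero arrays except the two diagonals).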

Algorithm challenge to merge sets

Given n sets of numbers, each containing some numbers from 1 to 100, how do we select sets to merge into the largest possible set under a special rule: only non-overlapping sets can be merged? For example, [1,2,3] can merge with [4,5] but not with [3,4]. What is an efficient algorithm for this?
My first attempt is to form an n by n matrix. Each row/column represents a set. Entry(i,j) equals 1 if sets i and j overlap, and entry(i,i) stores the length of set i. Then the question becomes: can we perform row and column operations at the same time to create a diagonal sub-matrix in the top left corner whose trace is as large as possible?
However, I got stuck on how to efficiently perform row and column operations to form such a diagonal sub-matrix in the top left corner.
As already pointed out in the comments (maximum coverage problem), you have an NP-hard problem. Luckily, MATLAB offers solvers for integer linear programming.
So we try to reduce the problem to the form:
min f*x subject to Ax<=b , 0<=x
There are n sets, we can encode a set as a vector of 0s and 1s. For example (1,1,1,0,0,...) would represent {1,2,3} and (0,0,1,1,0,0...) - {3,4}.
Every column of A represents a set. A(i,j)=1 means that the i-th element is in the j-th set, A(i,j)=0 means that the i-th element is not in the j-th set.
Now, x represents the sets we select: if x_j=1 then set j is selected, and if x_j=0 it is not selected.
As every element must be in at most one selected set, we choose b=(1, 1, 1, ..., 1): if we take two sets which contain the i-th element, then the i-th element of (Ax) would be at least 2.
The only question left is what f is. We try to maximize the number of elements in the union, so we choose f_j=-|set_j| (minus owing to the min<->max conversion), where |set_j| is the number of elements in the j-th set.
Putting it all in MATLAB we get:
f=-sum(A)
xopt=intlinprog(f.',1:n,A,ones(m,1),[],[],zeros(n,1),ones(n,1))
f.' - cost function as column
1:n - all n elements of x are integers
A - encodes n sets
ones(m,1) - b=(1,1,1...), there are m=100 elements
[],[] - there are no constraints of the form Aeq*x=beq
zeros(n,1) - 0<=x must hold
ones(n,1) - x<=1 already follows from the other constraints, but maybe it will help the solver a little bit
You can represent sets as bit fields. A bitwise AND operation yielding zero indicates non-overlapping sets. Depending on the width of the underlying data type, you may need to perform multiple AND operations. For example, with a 64-bit machine word size, I'd need two words to cover 1 to 100 as a bit field.
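A minimal Python sketch of the bit-field idea, combined with a plain exhaustive search over pairwise-disjoint selections. Python integers are arbitrary precision, so one integer covers 1 to 100 without splitting into two words. The helper names (`to_mask`, `best_disjoint_union`) are hypothetical, and the search is exponential in the number of sets, which is consistent with the problem being NP-hard; it is meant for small inputs only.

```python
def to_mask(numbers):
    """Encode a collection of positive integers as a bit field."""
    mask = 0
    for x in numbers:
        mask |= 1 << x
    return mask

def best_disjoint_union(sets):
    """Return (size, indices) of the largest union of pairwise-disjoint sets."""
    masks = [to_mask(s) for s in sets]
    best = (0, [])

    def search(i, used, chosen):
        nonlocal best
        size = bin(used).count("1")   # number of elements merged so far
        if size > best[0]:
            best = (size, chosen[:])
        for j in range(i, len(masks)):
            if masks[j] & used == 0:  # AND of zero means no overlap
                chosen.append(j)
                search(j + 1, used | masks[j], chosen)
                chosen.pop()

    search(0, 0, [])
    return best

if __name__ == "__main__":
    print(best_disjoint_union([[1, 2, 3], [4, 5], [3, 4]]))
```

With the example from the question, [1,2,3] and [4,5] are selected (indices 0 and 1) for a merged set of 5 elements, while [3,4] is rejected because it overlaps both.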

stack of piles visible from top view

This is an interview question.
The notes are arranged as depicted in the image.
We are given the starting and ending point of each note,
e.g. [2-5], [3-9], [7-100], on a scale with limits 0-10^9.
In this example all three notes will be visible.
We need to find out how many notes are visible when viewed from the top.
I tried to solve it in O(n*n), where n is the number of notes, by checking every note's visibility against every other note. But with this approach, how do we determine whether two notes are in different stacks?
Ultimately I did not reach a solution.
An O(n) solution is preferred, as that is what the interviewer demanded.
If the order of the notes in the input is "the former is on top", then it's easy:
Keep min_x and max_x, initialized to the first note's endpoints. Iterate over the notes from top to bottom: a note that extends beyond max_x or below min_x is visible and updates the respective value; otherwise it is hidden. Finish iterating and return the list of visible notes. Collect the cash. Note that this treats the covered region as the single interval [min_x, max_x], so it can miss a note that is visible only through a gap between two disjoint notes above it.
If O(n log n) is sufficient: first, remap all numbers in the input to between 0..(2*n+1) (that is, if a number x_i is the j-th smallest number among all numbers in the input, replace all x_i with j). You can then use Painter's algorithm on a segment tree.
Details:
Consider an array of size (2 * n + 1). Initialize all these cells with -1.
Painter's algorithm: Iterate over the bank notes from the last one given (at the bottom) to the topmost one. For each note covering a_i to b_i, replace the values of all cells in the array whose index is between a_i and b_i with i. At the end of this algorithm, the note indexes that still appear in the array are exactly the visible notes. However, done naively this takes O(N^2).
Segment tree: So, instead of using an array, we use a segment tree. The operations above can then be done in O(log N).
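A short Python sketch of the naive (array-based, O(N^2)) version of this painter's algorithm, to make the steps concrete. Two assumptions are made here that the answer does not spell out: the first note in the input is the topmost one, and each cell of the array stands for the stretch between two consecutive compressed coordinates, so notes that merely touch at an endpoint do not hide each other. The name `visible_notes` is made up. Replacing the inner loop with a range-assign on a segment tree gives the O(N log N) variant.

```python
def visible_notes(notes):
    """Indices of notes visible from the top; notes[0] is the topmost."""
    # Coordinate compression: map each endpoint to its rank.
    coords = sorted({x for a, b in notes for x in (a, b)})
    index = {x: i for i, x in enumerate(coords)}
    # cells[c] covers the stretch from coords[c] to coords[c + 1].
    cells = [-1] * max(len(coords) - 1, 0)
    # Paint from the bottom note up, so upper notes overwrite lower ones.
    for i in range(len(notes) - 1, -1, -1):
        a, b = notes[i]
        for c in range(index[a], index[b]):
            cells[c] = i
    return {i for i in cells if i != -1}

if __name__ == "__main__":
    print(len(visible_notes([(2, 5), (3, 9), (7, 100)])))
```

For the example in the question all three notes survive in the array, so the count is 3.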

Number of binary search trees over n distinct elements

How many binary search trees can be constructed from n distinct elements? And how can we find a mathematically proved formula for it?
Example:
If we have 3 distinct elements, say 1, 2, 3, there are 5 binary search trees.
Given n elements, the number of binary search trees that can be made from those elements is given by the nth Catalan number (denoted Cn). This is equal to Cn = (2n)! / ((n + 1)! n!).
Intuitively, the Catalan numbers represent the number of ways that you can create a structure out of n elements that is made in the following way:
Order the elements as 1, 2, 3, ..., n.
Pick one of those elements to use as a pivot element. This splits the remaining elements into two groups - those that come before the element and those that come after.
Recursively build structures out of those two groups.
Combine those two structures together with the one element you removed to get the final structure.
This pattern perfectly matches the ways in which you can build a BST from a set of n elements. Pick one element to use as the root of the tree. All smaller elements must go to the left, and all larger elements must go to the right. From there, you can then build smaller BSTs out of the elements to the left and the right, then fuse them together with the root node to form an overall BST. The number of ways you can do this with n elements is given by Cn, and therefore the number of possible BSTs is given by the nth Catalan number.
Hope this helps!
I am sure this question is not just about counting with a mathematical formula. I took some time and wrote a program, along with an explanation of the idea behind the calculation.
I tried solving it with both recursion and dynamic programming. Hope this helps.
The closed-form formula is already given in the previous answer.
If you are interested in the solution and the approach behind it, you can check my article "Count Binary Search Trees created from N unique elements".
Let T(n) be the number of BSTs of n elements.
Given n distinct ordered elements, numbered 1 to n, we select i as the root.
This leaves (1..i-1) in the left subtree for T(i-1) combinations and (i+1..n) in the right subtree for T(n-i) combinations.
Therefore:
T(n) = sum_{i=1}^{n} T(i-1) * T(n-i)
with the base case T(0) = 1 (the empty tree), which also gives T(1) = 1
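The recurrence translates directly into a small dynamic program. The sketch below (function names are made up for illustration) fills a table bottom-up, using T(0) = 1 for the empty tree, and compares the result with the closed form (2n)! / ((n + 1)! n!) of the Catalan numbers.

```python
from math import comb

def count_bsts(n):
    """Number of BSTs on n distinct keys, via T(n) = sum T(i-1) * T(n-i)."""
    t = [0] * (n + 1)
    t[0] = 1  # the empty tree
    for size in range(1, n + 1):
        # Choose key i as the root: i - 1 keys go left, size - i go right.
        t[size] = sum(t[i - 1] * t[size - i] for i in range(1, size + 1))
    return t[n]

def catalan(n):
    """Closed form: C(2n, n) / (n + 1), always an exact integer."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    for n in range(6):
        print(n, count_bsts(n), catalan(n))
```

For n = 3 both functions return 5, matching the example in the question.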

IOI Qualifier INOI task 2

I can't figure out how to solve question 2 in the following link in an efficient manner:
http://www.iarcs.org.in/inoi/2012/inoi2012/inoi2012-qpaper.pdf
You can do this in O(n log n) time. (Or linear if you really care to.) First, pad the input array out to the next power of two using some really big negative number. Now, build an interval tree-like data structure; recursively partition your array by dividing it in half. Each node in the tree represents a subarray whose length is a power of two and which begins at a position that is a multiple of its length, and each nonleaf node has a "left half" child and a "right half" child.
Compute, for each node in your tree, what happens when you add 0,1,2,3,... to that subarray and take the maximum element. Notice that this is trivial for the leaves, which represent subarrays of length 1. For internal nodes, this is simply the maximum of the left child with length/2 + right child. So you can build this tree in linear time.
Now we want to run a sequence of n queries on this tree and print out the answers. The queries are of the form "what happens if I add k,k+1,k+2,...n,1,...,k-1 to the array and report the maximum?"
Notice that, when we add that sequence to the whole array, the break between n and 1 either occurs at the beginning/end, or smack in the middle, or somewhere in the left half, or somewhere in the right half. So, partition the array into the k,k+1,k+2,...,n part and the 1,2,...,k-1 part. If you identify all of the nodes in the tree that represent subarrays lying completely inside one of the two sequences but whose parents either don't exist or straddle the break-point, you will have O(log n) nodes. You need to look at their values, add various constants, and take the maximum. So each query takes O(log n) time.
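As a hedged illustration of the queries described above, here is a Python sketch of the linear-time alternative the answer alludes to. It does not build the tree: it uses prefix and suffix maxima of a[j] + j instead, which works because each query splits the array into just two pieces with a constant shift each. It assumes the query is exactly "add k, k+1, ..., n, 1, ..., k-1 element-wise and report the maximum" for k = 1..n; the function names are made up, and the brute-force version is included only to cross-check the fast one.

```python
from itertools import accumulate

def rotation_maxima(a):
    """For k = 1..n, max of a[j] + s[j] with s = k, k+1, ..., n, 1, ..., k-1."""
    n = len(a)
    b = [a[j] + j for j in range(n)]
    pre = list(accumulate(b, max))                  # pre[m] = max(b[0..m])
    suf = list(accumulate(reversed(b), max))[::-1]  # suf[m] = max(b[m..n-1])
    result = []
    for k in range(1, n + 1):
        split = n - k                # last index that receives k + j
        best = pre[split] + k
        if split + 1 < n:            # the wrapped part gets j - split
            best = max(best, suf[split + 1] - split)
        result.append(best)
    return result

def rotation_maxima_brute(a):
    """Direct O(n^2) evaluation of the same queries, for checking."""
    n = len(a)
    out = []
    for k in range(1, n + 1):
        seq = list(range(k, n + 1)) + list(range(1, k))
        out.append(max(x + s for x, s in zip(a, seq)))
    return out

if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(rotation_maxima(data) == rotation_maxima_brute(data))
```

The tree-based version answers each query from O(log n) nodes; here every query costs O(1) after two linear passes.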
