Closure Number Method for Generate Parenthesis Problem - algorithm

The standard Generate Parenthesis question on LeetCode is as follows:
Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.
For example, given n = 3, a solution set is:
[
"((()))",
"(()())",
"(())()",
"()(())",
"()()()"
]
In the Solution tab they explain the Closure Number method, which I am finding difficult to understand.
I did a dry run of the code and even got the correct answer, but I can't seem to understand why it works. What is the intuition behind this method?
Any help would be greatly appreciated!

The basic idea of this algorithm is dynamic programming: you divide the problem into smaller problems that are easy to solve. In this example the sub-problems become so small that the solution is either an empty string (for size 0) or "()" (for size 1).
You start from the knowledge that if you want a valid string of a given length, the first character must be "(" and somewhere later in the string there must be the matching ")". Otherwise the output is not valid.
You don't know the position of that closing parenthesis, so you just try every position (the first for loop).
The second thing you know is that whatever sits between the opening and the closing parenthesis, and whatever comes after the closing parenthesis, can take many forms, but each part must itself be a valid parenthesis string (possibly empty).
That is exactly the problem you are already solving, so you plug in every valid parenthesis string of the smaller size. Because this is just what your algorithm already does, you can use a recursive call for it.
To summarize: you know one part of the problem, and the rest is the same problem at a smaller size. You solve the small part you know, recursively call the same method on the rest, and then put the pieces together to get your solution.
Dynamic programming is usually not that easy to understand, but it is very powerful. So don't worry if you don't understand it right away. Solving puzzles like these is the best way to learn dynamic programming.

The closure number of a sequence is the size of the smallest prefix of the sequence that is a valid sequence on its own.
If that prefix has size k, then you know that index 0 holds '(' and index k - 1 holds the matching ')'.
The method solves the problem by trying all possible sizes of such a prefix. For each one it splits the sequence into the inside of the prefix (the prefix without its first and last characters) and the rest of the sequence, and solves the two sub-problems recursively.
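A minimal Python sketch of this idea may help. The function name `generate_parenthesis` and the loop variable `c` (the number of pairs enclosed by the first "(" and its matching ")") are illustrative choices, not taken from the LeetCode solution itself. No results are cached here, so this is plain recursion rather than a tuned implementation.

```python
def generate_parenthesis(n):
    """Return all well-formed strings made of n pairs of parentheses."""
    if n == 0:
        # Only one way to arrange zero pairs: the empty string.
        return [""]
    result = []
    # c pairs go inside the first "(...)"; the remaining n - 1 - c pairs follow it.
    for c in range(n):
        for inside in generate_parenthesis(c):
            for rest in generate_parenthesis(n - 1 - c):
                result.append("(" + inside + ")" + rest)
    return result


if __name__ == "__main__":
    print(generate_parenthesis(3))
```

Every valid string has exactly one shortest valid prefix, so each string is produced by exactly one value of `c`. That is why the output contains no duplicates and misses nothing.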

Related

How to find loop/ repetition in a data stream?

I came across an interesting question in an interview, but I couldn't answer it, nor could I find it on Google.
The question is as follows:
You are given a data stream. With the help of variable declarations, how can you find whether there is any repetition or loop in the data?
Examples of data streams are:
100100100100
0001000100010001
100100010001
10...0010....010....01(where 0....0 is 0^10^10^10)
How can this problem be solved? Is there any algorithm for such kind of problem?
I think there are two approaches to this problem:
1. Longest repeated substring problem
This is a well-known problem that has a linear-time solution. You construct a suffix tree for your string and then analyze it.
Please check this article for details
2. Repeated substring problem (any)
You can modify Longest repeated substring to find any repeated substring.
The brute force solution would be to use a map or a dictionary for that, i.e. for stream 100100100100 it will be:
dict["1"]++
dict["10"]++
dict["100"]++
dict["1001"]++
etc till the max length of the repetition to find. Then we drop the first symbol and repeat, i.e. 1 is dropped and 00100100100 is left to analyze:
dict["0"]++
dict["00"]++
dict["001"]++
dict["0010"]++
etc.
At the end we iterate over the map and print all keys whose count is greater than one.
There are more efficient algorithms, but this is the easiest I guess.
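The brute-force counting described above can be sketched in Python roughly as follows. The function name `repeated_substrings` and the `max_len` parameter (the longest repetition to look for) are assumptions made for illustration. The stream is treated as a string that is already fully available.

```python
from collections import Counter


def repeated_substrings(stream, max_len):
    """Count every substring of length 1..max_len and keep those seen more than once."""
    counts = Counter()
    for start in range(len(stream)):
        for length in range(1, max_len + 1):
            if start + length > len(stream):
                break  # substring would run past the end of the stream
            counts[stream[start:start + length]] += 1
    # Keys with a count above one are the repeated substrings.
    return {sub: n for sub, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(repeated_substrings("100100100100", 4))
```

This does O(len(stream) * max_len) dictionary updates, each on a substring of up to `max_len` characters, so it is only practical for short streams or small `max_len`.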

Minimum-size Trie Construction by Dynamic Programming (War Story: What’s Past is Prolog from The Algorithm Design Manual by Steven S Skiena)

I'm reading The Algorithm Design Manual by Steven S Skiena, and trying to understand the solution to the problem in War Story: What’s Past is Prolog.
The problem is also well described here.
Basically, the problem is: given an ordered list of strings, find an optimal way to construct a trie of minimum size (with string characters as the nodes), under the constraint that the order of the strings must be preserved, while the character indices can be reordered.
Maybe this is not an appropriate question for Stack Overflow, but I'm still wondering if anyone could give me a hint about the solution, especially what the arguments of this recurrence mean:
the recurrence for the Dynamic Programming algorithm
You can think about it this way:
Let's assume that we fix the index of the first character. All strings get split into r bins based on the value of the character in this position (bins are essentially subtrees).
We can work with each bin independently. It won't change the order across different bins because two strings in different bins are different in the first character.
Thus, we can solve the problem for each bin independently. After that, we need exactly one edge to connect the root to each bin (that is, subtree). That's where the term C[i_k, j_k] + 1 comes from.
As we want to minimize the total number of edges and we're free to pick the first position, we just try all possible options among m positions.
Note: this algorithm is correct under the assumption that we can reorder the rest of the characters in each subtree independently. If that is not the case, the dynamic programming solution is incorrect.
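Written out, the recurrence described in this answer might look as follows. This is a reconstruction from the explanation above, not a copy of the original figure. The symbol p for the chosen character position is an assumption, and [i_k, j_k] denotes the range of strings that fall into the k-th of the r bins when strings i..j are split on position p.

```latex
C[i, j] = \min_{1 \le p \le m} \sum_{k=1}^{r} \left( C[i_k, j_k] + 1 \right)
```

The inner sum adds the cost of each subtree plus one edge from the root to it. The outer minimum tries each of the m candidate positions for the first character.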

Generate a valid expression that computes to given N

I was asked this in an interview:
Given a list of integers, a list of symbols [+,-,*,/] and a target number N, provide an expression which evaluates to N, or return False if that is not possible.
For example, if the list of numbers is [1,5,5] and the target number is 9, one possible solution is 5+5-1.
Now, my solution was a brute-force recursive solution that runs through all possible numbers and all possible operations and the recursion terminated either when the number exceeded N or was equal to N.
This got me wondering if there was a better, more refined solution. Any thoughts on this? I was thinking some kind of reverse construction of an expression tree.
I'm going to go ahead and say that this interview question cannot be about anything more than trying to narrow the problem down by asking questions. There is an extremely large list of questions you haven't covered that could be important to the solution, for example:
Do the numbers stay integers when you divide them? Is 1/5 a float, 0, or a big decimal?
Can the numbers and operators repeat if only one of each is in the input? If so, there seems to be no way to terminate when no solution exists.
Can you use parentheses, or can the input contain parentheses?
Can the numbers be negative?
Can you just print true or false, or do you have to find a valid solution?
One thing I notice from those questions is that if division works by rounding and you have + and / in the operator list, you can always divide until the result rounds to 1 and then just add. Also, if numbers can repeat, multiplication is essentially irrelevant because it can be replaced by repeated addition.
The reason I am sure that your interviewer wanted you to ask more clarifying questions is that even the small set of questions I thought of changes the problem in a big way.
One last thing to consider: this problem is a generalization of the knapsack problem, which is already known to be NP-complete, so no polynomial-time solution is known.
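For reference, a brute-force search under one particular set of answers to those questions could look like the Python sketch below. The assumptions are mine, not the interviewer's: every number is used exactly once, parentheses are allowed, and division is exact (handled with `fractions.Fraction`). The names `find_expression` and `_search` are hypothetical.

```python
from fractions import Fraction


def find_expression(numbers, target, ops="+-*/"):
    """Return an expression string that evaluates to target, or False."""
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target), ops)


def _search(items, target, ops):
    # items is a list of (exact value, expression text) pairs.
    if len(items) == 1:
        value, text = items[0]
        return text if value == target else False
    # Pick an ordered pair, combine it with one operator, and recurse on the shorter list.
    for i in range(len(items)):
        for j in range(len(items)):
            if i == j:
                continue
            (a, text_a), (b, text_b) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            for op in ops:
                if op == "+":
                    value = a + b
                elif op == "-":
                    value = a - b
                elif op == "*":
                    value = a * b
                elif op == "/" and b != 0:
                    value = a / b
                else:
                    continue  # unknown operator or division by zero
                found = _search(rest + [(value, f"({text_a}{op}{text_b})")], target, ops)
                if found:
                    return found
    return False


if __name__ == "__main__":
    print(find_expression([1, 5, 5], 9))
```

The running time grows exponentially with the number of inputs, which is consistent with the hardness remark above. Changing any of the assumptions (repetition, rounding division, no parentheses) changes the search space and the code.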

Shift rules in Boyer-Moore algorithm

There is something I could not figure out about the two shift rules (bad character and good suffix) in this algorithm. Do they work together, and what exactly decides which one to apply for each shift? This comprehensive explanation ended with an example of SSIMPLE EXAMPLE, which confused me. My question: if the algorithm moves backward, why would it need the good-suffix shift to move to the right? I am sure I am missing something here. Could you help explain the aforementioned example?
The missing point is that the algorithm moves backward over the pattern, not the string, so the comparison starts from the character at index n (where n is the pattern length), not from index 1. The following visual example is very helpful in clarifying that.
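To make the direction of movement concrete, here is a simplified Python sketch. It implements only the bad-character rule and omits the good-suffix rule entirely, so it is not the full Boyer-Moore algorithm, which at each mismatch shifts by the larger of the two rules' suggestions. The function name `find_all` and the sample text are illustrative.

```python
def find_all(text, pattern):
    """Return the start indices of pattern in text (bad-character rule only)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Rightmost position of each character in the pattern.
    last = {ch: i for i, ch in enumerate(pattern)}
    matches = []
    shift = 0  # alignment of the pattern against the text; it only ever moves right
    while shift <= n - m:
        j = m - 1
        # Compare right to left within the pattern.
        while j >= 0 and pattern[j] == text[shift + j]:
            j -= 1
        if j < 0:
            matches.append(shift)
            shift += 1  # conservative shift after a full match
        else:
            # Align the mismatched text character with its rightmost
            # occurrence in the pattern, moving at least one step.
            shift += max(1, j - last.get(text[shift + j], -1))
    return matches


if __name__ == "__main__":
    print(find_all("HERE IS A SIMPLE EXAMPLE", "EXAMPLE"))
```

The inner loop runs backward over the pattern while `shift` only increases, which is the distinction the answer above points out.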

Finding a sequence of operations

This should eventually be written in JavaScript. But I feel that I should not type any code until my algorithm is clear, which it is not!
Problem Given: Starting at 1, write a function that given a number returns a sequence of operations that consist only of either "+5" or "*3" that produce the number in question.
My basic algorithm:
1. Get the number.
2. If the number is 1, return 1.
3. Else if we surpass the number, return -1.
4. Else keep trying to "+5" or "*3" until the number is reached, assuming it can be reached.
My problem is with step # 4: I see that there are two paths to take which will bring me to the number in question(target), either "+5" OR "*3", but what about the number 13 which can be found by a MIXTURE of BOTH paths?? I can only do one thing or the other!
How would I know which path to take and how many times I should take that path? How would I bounce back and forth between paths?
I agree with the concept of breadth first search in a binary tree. However, I suggest turning the problem around, and looking at the problem of using "-5" or "/3" to get from the target back to 1. That allows pruning based on the target.
For example, 13 is not divisible by 3, so the first step in the backwards problem for target 13 must be "-5", not "/3".
It does not change the complexity, but may make the algorithm faster in practice for small problems.
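A sketch of this backward search in Python, using breadth-first search from the target down to 1, might look like this. The function name `find_sequence_backward` and the choice to return a list of operation strings (or `None` when the target is unreachable) are assumptions, since the question does not fix a return format.

```python
from collections import deque


def find_sequence_backward(target):
    """Return the forward list of "+5"/"*3" operations from 1 to target, or None."""
    # parent[n] = (number one step closer to target, forward operation linking them)
    parent = {target: None}
    queue = deque([target])
    while queue:
        n = queue.popleft()
        if n == 1:
            ops = []
            while parent[n] is not None:
                n, op = parent[n]
                ops.append(op)
            return ops
        candidates = []
        if n % 3 == 0:
            candidates.append((n // 3, "*3"))  # undo a multiplication
        if n - 5 >= 1:
            candidates.append((n - 5, "+5"))   # undo an addition
        for prev, op in candidates:
            if prev not in parent:
                parent[prev] = (n, op)
                queue.append(prev)
    return None


if __name__ == "__main__":
    print(find_sequence_backward(13))
```

For the target 13, the division branch is never available on the first step, so the search is pruned exactly as described above.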
You essentially want to do a breadth-first search of a binary tree. You could use recursion, or just some while loops. At each step you take the current number and add 5 or multiply by 3. Do your tests, and if you find the input value, then return 0 or something (you did not specify).
The key here is to think about the data structure and how to search it. Do you understand why it should be breadth first? Do you understand why it is a binary tree?
In response to comments:
First off, I admire your efforts. Working out the algorithm independent of language is a great way to approach a problem. It is not about stupid tricks in JavaScript (or any other language).
So the first concept to get down is that you are "searching" for a solution; if you don't find one, return -1.
Second, you should do some research on binary trees. They are a very important concept!
Third, you should then look into breadth-first search. However, that is the least important. It just makes the solution a bit more efficient.
what about the number 13 which can be found by a MIXTURE of BOTH paths?? I can only do one thing or the other!
Well, actually you can do both. As in the example in chapter 3 of the book you mention, you'll see that the function find is called twice inside itself -- the function is trying both paths at any choice point and the first correct solution is returned (you could also experiment with altering the overall function so it will return all correct paths).
How would I know which path to take and how many times I should take that path? How would I bounce back and forth between paths?
Basically, bouncing back and forth between paths is achieved by traveling both of them. You know it's the right path when the function hits the target number.
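A recursive function that tries both paths at every choice point can be sketched in Python as follows. This is an illustration of the idea described above, not the code from the book; the names `find_sequence` and `find` and the string format of the result are assumptions. It recurses once per step, so it is meant for small targets.

```python
def find_sequence(target):
    """Return a string showing how to reach target from 1, or None if impossible."""
    def find(current, history):
        if current == target:
            return history
        if current > target:
            return None  # overshot: this path is a dead end
        # Try "+5" first; if that path fails (returns None), fall back to "*3".
        return (find(current + 5, f"({history} + 5)")
                or find(current * 3, f"({history} * 3)"))

    return find(1, "1")


if __name__ == "__main__":
    print(find_sequence(13))
```

The `or` is what lets the search mix both operations: each call explores the "+5" branch completely and only then the "*3" branch, so a path such as multiply, add, add is found without any special handling.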

Resources