Maximize evaluation of expression with one parenthesis insertion - algorithm

I encountered this problem in a programming contest:
Given an expression x1 op x2 op x3 op . . . op xn, where each op is either addition '+' or multiplication '*' and each xi is a digit from 1 to 9, the goal is to insert exactly one pair of parentheses into the expression so that the value of the expression is maximized.
n is at most 2500.
Eg.:
Input:
3+5*7+8*4
Output:
303
Explanation:
3+5*(7+8)*4
There was another constraint given in the problem: at most 15 '*' signs will be present. This simplified the problem, as we will have just 17 options for bracket insertion and brute force would work in O(17*n).
I have been wondering: if this constraint were not present, could I theoretically solve the problem in O(n^2)? It seemed to me to be a DP problem. I say theoretically because the answers will be quite big (up to 9^2500). So if I ignore the time complexity of working with big numbers, is O(n^2) possible?

If there is no multiplication, you are finished.
If there is no addition, you are finished.
The leading and trailing operations of the subterms that have to be evaluated are always additions, because parentheses around a pure multiplication do not alter the outcome.
If you have subterms with only additions, you do not need to evaluate subparts of them. Multiplying the full subterm will always give a bigger result. (Since we only have positive numbers/digits.)
Traverse the term once, trying to place the opening parenthesis after (worst case) each * that is followed by a +, and within that loop traverse a second time, trying to place the closing parenthesis before (worst case) each subsequent * that comes after a +.
You can solve the problem in O(ma/2), with m: number of multiplications and a: number of additions. This is smaller than n^2.
Possible places for parenthesis shown with ^:
1*2*^3+4+5^*6*^7+8^
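A minimal sketch of the brute force described above, assuming single-digit operands and no whitespace in the input. The helper names `evaluate` and `best_with_one_parenthesis` are made up for illustration. The opening parenthesis is only tried at the start or right after a '*', and the closing one only at the end or right before a '*'; each candidate is re-evaluated from scratch, so this is not the tightest possible bound.

```python
def evaluate(tokens):
    """Evaluate a flat list like [3, '+', 5, '*', 7] with * binding tighter than +."""
    total = 0
    product = tokens[0]
    for k in range(1, len(tokens), 2):
        op, value = tokens[k], tokens[k + 1]
        if op == '*':
            product *= value
        else:
            total += product
            product = value
    return total + product


def best_with_one_parenthesis(expr):
    """Try every candidate pair of parenthesis positions and keep the best value."""
    tokens = [int(c) if c.isdigit() else c for c in expr]
    n = len(tokens)
    # Operands sit at even indices. Candidate openings: start, or right after '*'.
    starts = [i for i in range(0, n, 2) if i == 0 or tokens[i - 1] == '*']
    # Candidate closings: end, or right before '*'.
    ends = [j for j in range(0, n, 2) if j == n - 1 or tokens[j + 1] == '*']
    best = evaluate(tokens)
    for i in starts:
        for j in ends:
            if j <= i:
                continue
            inner = evaluate(tokens[i:j + 1])
            best = max(best, evaluate(tokens[:i] + [inner] + tokens[j + 1:]))
    return best


print(best_with_one_parenthesis("3+5*7+8*4"))  # 303, from 3+5*(7+8)*4
```

Python integers are arbitrary precision, so the large results mentioned in the question do not overflow, although arithmetic on them is no longer constant time.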

Related

number of valid parenthesis using dynamic programming [Uber phone interview]

For a given n, find the number of valid combinations of parentheses. I told him the direct formula, the Catalan number (because I had encountered this problem earlier), but he specifically wanted a solution using dynamic programming, and wanted a working solution with an explanation and test cases. I didn't manage to get to a working solution.
Eg:
n=1 valid par=0
n=2 valid par=1
Now I just want to understand this problem.
I found one explanation but I am not getting it. Can you help me understand it, or provide a more verbose explanation of the logic below (which seems to be correct)?
You need an even number of parentheses. If C(n) denotes the number of valid sequences with 2 * n parentheses, then
C(0)=1 and for any n>0
C(n)=C(0) * C(n-1)+C(1) * C(n-2)+...+C(n-1) * C(0)=sum(i=0,n-1,C(i) * C(n-1-i))
because the sequence has to start with a '(' and you look at where the matching ')' is: if there are 2 * i parentheses between them, then the number of such cases is C(i) * C(n-1-i).
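The recurrence above can be tabulated bottom-up. This is a small sketch; the function name `count_valid` is an assumption, and n here counts pairs, i.e. the sequence has 2 * n parentheses.

```python
def count_valid(n):
    """Number of valid sequences made of n pairs of parentheses."""
    c = [0] * (n + 1)
    c[0] = 1  # the empty sequence
    for m in range(1, n + 1):
        # The first '(' encloses i pairs and is followed by m-1-i more pairs.
        c[m] = sum(c[i] * c[m - 1 - i] for i in range(m))
    return c[n]


print([count_valid(n) for n in range(6)])  # [1, 1, 2, 5, 14, 42]
```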
Recursion is the key here.
Divide N into N/2 and N/2 (the counts of open and close parentheses still to be placed).
Select an open parenthesis, add it to the result string, reduce its count and make a recursive call.
Select a close parenthesis, add it to the result string, reduce its count and make a recursive call.
To print only valid sequences, make sure that at any given point the remaining count of close parentheses is not less than the remaining count of open parentheses; otherwise a close parenthesis would have been printed without a matching open parenthesis before it.
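A sketch of the recursion just described, assuming the input is given as a number of pairs; the names `generate` and `build` are illustrative only. A close parenthesis is only added while more closes than opens remain to be placed.

```python
def generate(pairs):
    """Return every valid sequence with the given number of pairs."""
    results = []

    def build(prefix, open_left, close_left):
        if open_left == 0 and close_left == 0:
            results.append(prefix)
            return
        if open_left > 0:
            build(prefix + "(", open_left - 1, close_left)
        # Closing is allowed only if some '(' is still unmatched.
        if close_left > open_left:
            build(prefix + ")", open_left, close_left - 1)

    build("", pairs, pairs)
    return results


print(generate(2))  # ['(())', '()()']
```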
Take a look at this link.
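Let M[i][j] be the DP state: the number of valid sequences on positions i..j.
The length of the span from i to j is always an even number.
The problem is then similar to matrix chain multiplication.
M[i][j] = M[i][i+1]*M[i+2][j] + M[i][i+3]*M[i+4][j] + ..... + M[i][j-2]*M[j-1][j]
Also add the case where the i'th parenthesis is '(' and the j'th parenthesis is ')':
M[i][j] += M[i+1][j-1]
Note that, as written, the sum counts a sequence once for every place where it can be split into two valid parts, so "()()()" is counted twice and a span of length 6 comes out as 6 rather than 5. Restricting the first factor to sequences whose '(' at position i is closed exactly at the split point, i.e. using M[i+1][k-1]*M[k+1][j] with empty spans counting as 1, removes the double counting and gives the Catalan recurrence above.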

Guidance on Algorithmic Thinking (4 fours equation)

I recently saw a logic/math problem called 4 Fours where you need to use 4 fours and a range of operators to create equations that equal to all the integers 0 to N.
How would you go about writing an elegant algorithm to come up with say the first 100...
I started by creating base calculations like 4-4, 4+4, 4x4, 4/4, 4!, Sqrt 4 and made these values integers.
However, I realized that this was going to be a brute force method testing the combinations to see if they equal, 0 then 1, then 2, then 3 etc...
I then thought of finding all possible combinations of the above values, checking that the result was less than 100 and filling an array and then sorting it...again inefficient because it may find 1000s of numbers over 100
Any help on how to approach a problem like this would be helpful...not actual code...but how to think through this problem
Thanks!!
This is an interesting problem. There are a couple of different things going on here. One issue is how to describe the sequence of operations and operands that go into an arithmetic expression. Using parentheses to establish order of operations is quite messy, so instead I suggest thinking of an expression as a stack of operations and operands, like - 4 4 for 4-4, + 4 * 4 4 for (4*4)+4, * 4 + 4 4 for (4+4)*4, etc. This is prefix (Polish) notation, the mirror image of the Reverse Polish Notation used on an HP calculator. Then you don't have to worry about parentheses, and having the data structure for expressions will help below when we build up larger and larger expressions.
Now we turn to the algorithm for building expressions. Dynamic programming doesn't work in this situation, in my opinion, because (for example) to construct some numbers in the range from 0 to 100 you might have to go outside of that range temporarily.
A better way to conceptualize the problem, I think, is as breadth first search (BFS) on a graph. Technically, the graph would be infinite (all positive integers, or all integers, or all rational numbers, depending on how elaborate you want to get) but at any time you'd only have a finite portion of the graph. A sparse graph data structure would be appropriate.
Each node (number) on the graph would have a weight associated with it, the minimum number of 4's needed to reach that node, and also the expression which achieves that result. Initially, you would start with just the node (4), with the number 1 associated with it (it takes one 4 to make 4) and the simple expression "4". You can also throw in (44) with weight 2, (444) with weight 3, and (4444) with weight 4.
To build up larger expressions, apply all the different operations you have to those initial nodes. For example, unary negation, factorial, square root; binary operations like * 4 at the bottom of your stack for multiply by 4, + 4, - 4, / 4, ^ 4 for exponentiation, and also + 44, etc. The weight of an operation is the number of 4s required for that operation; unary operations would have weight 0, + 4 would have weight 1, * 44 would have weight 2, etc. You would add the weight of the operation to the weight of the node on which it operates to get a new weight, so for example + 4 acting on node (44) with weight 2 and expression "44" would result in a new node (48) with weight 3 and expression "+ 4 44". If the result for 48 has a better weight than the existing result for 48, substitute that new node for (48).
You will have to use some sense when applying functions. factorial(4444) would be a very large number; it would be wise to set a domain for your factorial function which would prevent the result from getting too big or going out of bounds. The same with functions like / 4; if you don't want to deal with fractions, say that non-multiples of 4 are outside of the domain of / 4 and don't apply the operator in that case.
The resulting algorithm is very much like Dijkstra's algorithm for calculating distance in a graph, though not exactly the same.
I think that the brute force solution here is the only way to go.
The reasoning behind this is that each number has a different way to get to it, and getting to a certain x might have nothing to do with getting to x+1.
Having said that, you might be able to make the brute force solution a bit quicker by using obvious moves where possible.
For instance, if I got to 20 using "4" three times (4*4+4), it is obvious to get to 16, 24 and 80. Holding an array of 100 bits and marking the numbers reached
Similar to subset sum problem, it can be solved using Dynamic Programming (DP) by following the recursive formulas:
D(0,0) = true
D(x,0) = false x!=0
D(x,i) = D(x-4,i-1) OR D(x+4,i-1) OR D(x*4,i-1) OR D(x/4,i-1)
By computing the above using DP technique, it is easy to find out which numbers can be produced using these 4's, and by walking back the solution, you can find out how each number was built.
The advantage of this method (when implemented with DP) is you do not recalculate multiple values more than once. I am not sure it will actually be effective for 4 4's, but I believe theoretically it could be a significant improvement for a less restricted generalization of this problem.
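The recurrence can also be read forwards: layer i holds every value reachable from 0 by applying i operations with a 4. The sketch below assumes exact rational arithmetic via `fractions.Fraction` and uses a hypothetical function name `reachable`. It only covers left-to-right chains of operations, as the recurrence does, so something like (4+4)*(4+4) is outside its scope.

```python
from fractions import Fraction


def reachable(steps):
    """layers[i] is the set of x with D(x, i) true, built from D(0, 0)."""
    layers = [{Fraction(0)}]
    for _ in range(steps):
        previous = layers[-1]
        layers.append({y for x in previous
                       for y in (x + 4, x - 4, x * 4, x / 4)})
    return layers


layers = reachable(4)
print(sorted(v for v in layers[4] if v.denominator == 1 and 0 <= v <= 100))
```

Storing, for each value, the predecessor and operation that produced it would allow walking back to recover how the number was built.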
This answer is just an extension of Amit's.
Essentially, your operations are:
Apply a unary operator to an existing expression to get a new expression (this does not use any additional 4s)
Apply a binary operator to two existing expressions to get a new expression (the new expression has number of 4s equal to the sum of the two input expressions)
For each n from 1..4, calculate Expressions(n) - a List of (Expression, Value) pairs as follows:
(For a fixed n, only store 1 expression in the list that evaluates to any given value)
1. Initialise the list with the concatenation of n 4s (i.e. 4, 44, 444, 4444)
2. For i from 1 to n-1, and each permitted binary operator op, add an expression (and value) e1 op e2 where e1 is in Expressions(i) and e2 is in Expressions(n-i)
3. Repeatedly apply unary operators to the expressions/values calculated so far in steps 1-3. When to stop (applying 3 recursively) is a little vague, certainly stop if an iteration produces no new values. Potentially limit the magnitude of the values you allow, or the size of the expressions.
Example unary operators are !, Sqrt, -, etc. Example binary operators are +-*/^ etc. You can easily extend this approach to operators with more arguments if permitted.
You could do something a bit cleverer in terms of step 3 never ending for any given n. The simple way (described above) does not start calculating Expressions(i) until Expressions(j) is complete for all j < i. This requires that we know when to stop. The alternative is to build Expressions of a certain maximum length for each n, then if you need to (because you haven't found certain values), extend the maximum length in an outer loop.
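A sketch of the first two steps of this scheme, under stated simplifications: only the binary operators + - * / are used, the unary step is left out entirely (which also sidesteps the question of when to stop), and values are exact `Fraction`s. The function name `four_fours` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def four_fours(max_fours=4):
    """expressions[n] maps each reachable value to one expression using n fours."""
    expressions = {}
    for n in range(1, max_fours + 1):
        # Step 1: the concatenation of n fours (4, 44, 444, 4444).
        found = {Fraction(int("4" * n)): "4" * n}
        # Step 2: combine smaller expressions with every binary operator.
        for i in range(1, n):
            pairs = product(expressions[i].items(), expressions[n - i].items())
            for (a, ea), (b, eb) in pairs:
                candidates = [(a + b, "+"), (a - b, "-"), (a * b, "*")]
                if b != 0:
                    candidates.append((a / b, "/"))
                for value, op in candidates:
                    # Keep only the first expression found for each value.
                    found.setdefault(value, f"({ea}{op}{eb})")
        expressions[n] = found
    return expressions


table = four_fours()
for target in range(11):
    print(target, table[4].get(Fraction(target)))
```

Targets that print `None` are the ones that need the unary operators from step 3.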

Balanced Parenthesis Order number

Consider the case of length-six strings. The order would be: “()()()”, “()(())”, “(())()”, “(()())”, “((()))”.
In the above example, we see that the strings in which the first opening parenthesis is closed the earliest come first, and if that is the same for two strings, the rule is recursively applied to the next opening parenthesis.
If a particular balanced parenthesis sequence is given, how do you find its order number? For example, for ()(()) the output is 2. It should run in O(n), where n is the number of pairs of parentheses, i.e. 3 in the above case. The input can be around 100000 balanced parentheses.
First let g(n,k) be the number of length 2n + k strings there are with n pairs of balanced parentheses, which close k more parentheses. Can we calculate g(n,k)?
Let's try recursion. For that we first need a base case. It is clear that if there are no balanced parentheses, then we can only have one possibility - only closing parentheses. So g(0,k) = 1. There is our base case.
Next the recursive case. The first character is either an opening parenthesis, or a closing parenthesis. If it is an opening parenthesis, then there are g(n-1,k+1) ways to finish. If it is a closing parenthesis, then there are g(n,k-1) ways to finish. But we can't have a negative number of open parentheses, so g(n,-1) = 0. That gives:
g(0,k) = 1
g(n,-1) = 0
g(n,k) = g(n-1, k+1) + g(n, k-1)
This lets us calculate g but is not efficient - we are effectively going to list every possible string in the recursive calls. However there is a trick, memoize the results. Meaning that every time you call g(n, k) see if you've ever called it before, and if you have just return that answer. Otherwise you calculate the answer, cache it, and then return it. (For more on this trick, and alternate strategies, look up dynamic programming.)
OK, so now we can generate counts of something related, but how can we use this to get your answer?
Well note the following. Suppose that partway through your string you find an open parenthesis where there logically could be a close parenthesis instead. Suppose that at that point there are n pairs of parentheses needed and k open parentheses. Then there were g(n, k-1) possible strings that are the same as yours until then, then have a close parenthesis there (so they come before yours) and do whatever afterwards. So summing g(n, k-1) over all of the open parentheses that could have been close parentheses gives you the number of strings before yours. Adding one to that gives you your position.
I got the answer from the Ruskey thesis, which describes an algorithm for ranking and unranking binary trees.
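A sketch of this ranking procedure with a memoized g, assuming the input is a valid balanced string; the name `rank` is illustrative. At each '(' the count n still includes that parenthesis, and g(n, -1) = 0 covers the positions where a close was not possible. The memo table is quadratic in the number of pairs, so this does not meet the O(n) goal from the question for inputs of size 100000.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def g(n, k):
    """Strings with n pairs still to open and k parentheses currently open."""
    if k < 0:
        return 0
    if n == 0:
        return 1  # only closing parentheses remain
    return g(n - 1, k + 1) + g(n, k - 1)


def rank(s):
    """1-based position of the balanced string s in the ordering above."""
    remaining = len(s) // 2   # pairs not yet opened
    open_count = 0            # parentheses opened but not yet closed
    before = 0
    for ch in s:
        if ch == "(":
            # Strings that put ')' here instead come earlier in the order.
            before += g(remaining, open_count - 1)
            remaining -= 1
            open_count += 1
        else:
            open_count -= 1
    return before + 1


print(rank("()(())"))  # 2
```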
http://webhome.cs.uvic.ca/~ruskey/Publications/Thesis/ThesisPage16.png

An unparenthesized arithmetic expression

An arithmetic expression can have many possible values
Can someone help me?
There is a dynamic programming solution.
For an expression, you can define its "outermost split point" to be the first operator that is not within any parentheses. After this split, if it is on a +, then you need to maximize the left subexpression and the right subexpression; if it is a -, then maximize the left side and minimize the right side.
You can use either dynamic programming or memoization to implement this algorithm. Memoization is straightforward: search over each split point, and save the answer in another data structure (two 2D matrices, with M[x][y] storing the max/min value of the expression beginning at x and ending at y); when the data is already in the matrices, use it instead of recomputing.
Using dynamic programming is a bit trickier, but you can think of it this way:
1. First, loop through the expression, finding the max/min for each 2 consecutive values with the operator between them (well, this is the fancy way of saying just compute it);
2. Loop through the expression, finding the max/min for each 3 consecutive values with the operators between them (for a ? b ? c, this is computed by assuming the split point is between a and b, then assuming the split point is between b and c, and storing the max/min values of these two);
3. Once you know the max/min for all k-length sequences, compute the k + 1-length ones using the same method as in step 2, until k is the length of the array, and return the max value for length k.
This is almost the same as Matrix Chain Multiplication algorithm, which has O(N^3) complexity. The complexity can be proved crudely by reasoning: you need to do the loop N - 1 times, each time at most N - 1 subsequences, and you need to try at most N - 1 split points. So, N ^ 3 time complexity.
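A sketch of the bottom-up version, assuming the expression contains only + and - and has already been split into a list of numbers and a list of operators. The function name `max_value` and the two tables `hi` and `lo` are illustrative names for the max and min matrices described above.

```python
def max_value(nums, ops):
    """Largest value obtainable by fully parenthesizing nums joined by ops."""
    n = len(nums)
    hi = [[0] * n for _ in range(n)]  # hi[i][j]: max value of nums[i..j]
    lo = [[0] * n for _ in range(n)]  # lo[i][j]: min value of nums[i..j]
    for i in range(n):
        hi[i][i] = lo[i][i] = nums[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best_hi, best_lo = float("-inf"), float("inf")
            for k in range(i, j):  # ops[k] sits between nums[k] and nums[k+1]
                if ops[k] == "+":
                    cand_hi = hi[i][k] + hi[k + 1][j]
                    cand_lo = lo[i][k] + lo[k + 1][j]
                else:
                    # For '-', pair the max on the left with the min on the right.
                    cand_hi = hi[i][k] - lo[k + 1][j]
                    cand_lo = lo[i][k] - hi[k + 1][j]
                best_hi = max(best_hi, cand_hi)
                best_lo = min(best_lo, cand_lo)
            hi[i][j], lo[i][j] = best_hi, best_lo
    return hi[0][n - 1]


print(max_value([1, 2, 3], ["-", "-"]))  # 2, from 1-(2-3)
```

The three nested loops match the N^3 estimate given above.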

Algorithm to find length of longest sequence of blanks in a given string

I am looking for an algorithm that finds the length of the longest sequence of blanks in a given string while examining as few characters as possible.
Hint: Your program should become faster as the length of the sequence of blanks increases.
I know the solution which is O(n), but I am looking for a more optimal solution.
You won't be able to find a solution which is a smaller complexity than O(n) because you need to pass through every character in the worst case with an input string that has at most 0 or 1 consecutive whitespace, or is completely whitespace.
You can do some optimizations though, but it'll still be considered O(n).
For example:
Let M be the current longest match so far as you go through your list. Also assume you can access input elements in O(1), for example you have an array as input.
When you see a non-whitespace character you can skip M elements if the element at current + M is non-whitespace. Surely no whitespace run longer than M can be contained in between.
And when you see a whitespace character, if the element at current + M-1 is not whitespace, you know this run cannot beat the longest one so far, so you can skip in that case as well.
But in the worst case (when all characters are blank) you have to examine every character. So it can't be better than O(n) in complexity.
Rationale: assume the whole string is blank, you have examined fewer than N characters and your algorithm outputs N. Then if any non-examined character is not blank, your answer would be wrong. So for this particular input you have to examine the whole string.
There's no way to make it faster than O(N) in the worst case. However, here are a few optimizations, assuming 0-based indexing.
If you already have a complete sequence of L blanks (by complete I mean a sequence that is not a subsequence of a larger sequence), and L is at least as large as half the size of your string, you can stop.
If you have a complete sequence of L blanks, once you hit a space at position i check if the character at position i + L is also a space. If it is, continue scanning from position i forwards as you might find a larger sequence - however, if you encounter a non-space until position i + L, then you can skip directly to i + L + 1. If it isn't a space, there's no way you can build a larger sequence starting at i, so scan forwards starting from i + L + 1.
If you have a complete sequence of blanks of length L, and you are at position i and you have k positions left to examine, and k <= L, you can stop your search, as obviously there's no way you'll be able to find anything better anymore.
To prove that you can't make it faster than O(N), consider a string that contains no spaces. You will have to access each character once, so it's O(N). Same with a string that contains nothing but spaces.
The obvious idea: you can jump by K+1 places (where K is the current longest space sequence) and scan back if you found a space.
This way you check roughly (n + n/M)/2 = n(M+1)/2M positions.
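A sketch of the jump-and-scan-back idea, assuming a blank is exactly the space character and that the string supports O(1) indexing; the function name `longest_blank_run` is made up. It probes the position `best` characters ahead: a non-blank there rules out any longer run in the window, while a blank triggers a scan in both directions to measure the run around it.

```python
def longest_blank_run(s):
    """Length of the longest run of spaces, skipping ahead where possible."""
    n = len(s)
    best = 0
    i = 0  # invariant: s[i-1] is not a blank (or i == 0)
    while i + best < n:
        probe = i + best
        if s[probe] != " ":
            # Any run inside s[i:probe] is at most `best` long; skip past it.
            i = probe + 1
            continue
        left = probe
        while left > i and s[left - 1] == " ":
            left -= 1   # scan back to the start of this run
        right = probe
        while right + 1 < n and s[right + 1] == " ":
            right += 1  # scan forward to the end of this run
        best = max(best, right - left + 1)
        i = right + 2   # s[right + 1] is a non-blank or past the end
    return best


print(longest_blank_run("a  b    c d"))  # 4
```

On a string that is entirely blank the forward scan still touches every character, in line with the O(n) worst case argued above.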
Edit:
Another idea would be to apply a kind of binary search, as follows: for a given k, you write a procedure that checks whether there is a sequence of spaces with length >= k. This can be achieved in O(n/k) steps. Then, you try to find the maximal k with binary search.
Edit:
During the subsequent searches, you can use the knowledge that a sequence of some length k already exists, and start skipping by k from the very beginning.
Whatever you do, the worst case will always be O(n) - if those blanks are in the last part of the string... (or the last "checked" part of the string).
