How to convert infix expression to prefix expression without using stack, array, programming language or implementation - data-structures

Can someone tell me an algorithm, or the steps to take, for converting an infix expression to a prefix expression without using a stack, an array, or any programming language or implementation? Just a simple human algorithm for non-CS students.
If anyone has a better algorithm or steps, please specify, and also try to solve it for me please... :)
(5+15/3)^2-(8*3/3*4/5*32/5+42)*(3*3/3*5/4)

The "simple human algorithm" uses a stack. Consider the Shunting Yard, for example. You can do that with paper and pencil. The "output queue" is simply the solution that you output. The "stack" is just a holding place. So when it says, "push onto the stack", imagine putting that value on the top of a stack of other values. When it says, "pop from the stack," imagine removing the thing that was on top.
When doing it with pencil and paper, dedicate a couple of lines at the bottom of the page for your output queue. Create a column on the right side of the page as your stack. Wherever it says, "write it to the output queue", write that value as the next value on your answer line.
The first time it says, "push onto the stack", write that value in the stack column, at the bottom. If you have to push something else, write it above that value. When it says "pop from the stack," erase the top value from your stack column, freeing up a space.
That really is the simplest reliable way to do things by hand.
I'll use the first bit of your example for a demonstration. Let's say you want to convert (5+15/3)^2 to postfix. Using the instructions in the Shunting Yard article:
Your output queue is empty and so is your stack. The first token is (. The instructions say to push it onto the stack. So we have:
output queue:
stack: (
The next token is 5. It goes to the output queue:
output queue: 5
stack: (
Next is +. The top of the stack is an open parenthesis rather than an operator, so we just push it:
output queue: 5
stack: ( +
Next is 15. It goes to the output queue:
output queue: 5 15
stack: ( +
Next is /. It's an operator and there's an operator on the stack, but / has higher precedence than +. So according to the rules, we push / onto the stack:
output queue: 5 15
stack: ( + /
Next is 3. It goes to the output queue:
output queue: 5 15 3
stack: ( + /
Next is ). The rules say to pop operators from the stack to the output queue until we reach the open parenthesis, and then to pop and discard the parenthesis itself. If we empty the stack without finding an open paren, then we have mismatched parentheses. Anyway, popping the stack and adding to the output queue:
output queue: 5 15 3 / +
stack: <empty>
Next token is ^. There are no operators on the stack, so we push it.
output queue: 5 15 3 / +
stack: ^
Finally, we have 2. It goes to the output queue:
output queue: 5 15 3 / + 2
stack: ^
And we're at the end of the string, so we pop all the operators and put them on the output queue:
output queue: 5 15 3 / + 2 ^
And that's the postfix representation of (5 + 15/3)^2.
The only tricky part is getting the operator precedence right. Basically, exponentiation is highest. Multiplication and division come next, at equal precedence, then addition and subtraction at equal precedence. If those are the only operators, it's easy to remember. Otherwise you'll probably want a table of operator precedence handy so you can refer to it when you're working the algorithm. And the unary minus (e.g. 5 + -1) requires a special case. But really, that's all there is to it.
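For readers who later want to check their pencil-and-paper work, here is a minimal Python sketch of the same procedure. It is only an illustration under a few assumptions: numbers are unsigned, the only operators are + - * / ^, ^ is treated as right-associative, and unary minus is not handled. The function names (tokenize, to_postfix, postfix_to_prefix) are made up for this example. Since the question asks for prefix, the last helper rebuilds prefix notation from the postfix result by keeping a stack of sub-expressions.

```python
# Precedence and associativity for the operators used in the example.
PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}
RIGHT_ASSOCIATIVE = {"^"}


def tokenize(expression):
    """Split an infix string into numbers, operators and parentheses."""
    tokens, number = [], ""
    for ch in expression:
        if ch.isdigit() or ch == ".":
            number += ch
            continue
        if number:
            tokens.append(number)
            number = ""
        if ch in PRECEDENCE or ch in "()":
            tokens.append(ch)
        elif not ch.isspace():
            raise ValueError(f"unexpected character {ch!r}")
    if number:
        tokens.append(number)
    return tokens


def to_postfix(expression):
    """Shunting Yard: return the postfix form as a list of tokens."""
    output, stack = [], []
    for token in tokenize(expression):
        if token == "(":
            stack.append(token)
        elif token == ")":
            # Pop operators until the matching open parenthesis.
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("mismatched parentheses")
            stack.pop()  # discard the "("
        elif token in PRECEDENCE:
            while stack and stack[-1] != "(":
                top = stack[-1]
                higher = PRECEDENCE[top] > PRECEDENCE[token]
                same_left = (PRECEDENCE[top] == PRECEDENCE[token]
                             and token not in RIGHT_ASSOCIATIVE)
                if not (higher or same_left):
                    break
                output.append(stack.pop())
            stack.append(token)
        else:
            output.append(token)  # a number goes straight to the output
    while stack:
        if stack[-1] == "(":
            raise ValueError("mismatched parentheses")
        output.append(stack.pop())
    return output


def postfix_to_prefix(postfix):
    """Rebuild prefix notation from postfix with a stack of sub-expressions."""
    stack = []
    for token in postfix:
        if token in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"{token} {left} {right}")
        else:
            stack.append(token)
    return stack.pop()


postfix = to_postfix("(5+15/3)^2")
print(" ".join(postfix))           # 5 15 3 / + 2 ^
print(postfix_to_prefix(postfix))  # ^ + 5 / 15 3 2
```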

Related

Parse expression with functions

This is my situation: the input is a string that contains a normal mathematical operation like 5+3*4. Functions are also possible, e.g. min(5,A*2). This string is already tokenized, and now I want to parse it using stacks (so no AST). I first used the Shunting Yard Algorithm, but here my main problem arises:
Suppose you have this (tokenized) string: min(1,2,3,+) which is obviously invalid syntax. However, SYA turns this into the output stack 1 2 3 + min(, and hopefully you see the problem coming. When parsing from left to right, it sees the + first, calculating 2+3=5, and then calculating min(1,5), which results in 1. Thus, my algorithm says this expression is completely fine, while it should throw a syntax error (or something similar).
What is the best way to prevent things like this? Add a special delimiter (such as the comma), use a different algorithm, or what?
In order to prevent this issue, you might have to keep track of the stack depth. The way I would do this (and I'm not sure it is the "best" way) is with another stack.
The new stack follows these rules:
When an open parenthesis, (, or a function is parsed, push a 0.
Do this in case of nested functions
When a closing parenthesis, ), is parsed, pop the last item off and add it to the new last value on the stack.
The number that just got popped off is how many values were returned by the function. You probably want this to always be 1.
When a comma or similar delimiter is parsed, pop from the stack, add that number to the new last element, then push a 0.
Reset so that we can begin verifying the next argument of a function
The value that just got popped off is how many values were returned by the statement. You probably want this to always be 1.
When a number is pushed to the output, increment the top element of this stack.
This is how many values are available in the output. Numbers increase the number of values. Binary operators need to have at least 2.
When a binary operator is pushed to the output, decrement the top element
A binary operator takes 2 values and outputs 1, thus reducing the overall number of values left on the output by 1.
In general, an n-ary operator that takes n values and returns m values should add (m-n) to the top element.
If this value ever becomes negative, throw an error!
This will find that the last argument in your example, which just contains a +, will decrement the top of the stack to -1, automatically throwing an error.
But then you might notice that a final argument in your example of, say, 3+ would return a zero, which is not negative. In this case, you would throw an error in one of the steps where "you probably want this to always be 1."
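The rules above can be sketched in Python roughly as follows. This is an illustration, not a definitive implementation: the name validate, the FUNCTIONS set and the error messages are invented here, the output queue itself is omitted because only the counts matter for the check, and a few details the rules leave open are filled in by assumption (a function's own counter is dropped and replaced by a single result when its closing parenthesis is reached; zero-argument calls and unary minus are not supported).

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
FUNCTIONS = {"min", "max"}  # assumed set of known function names


def validate(tokens):
    """Raise SyntaxError if the token list cannot be a valid expression."""
    ops = []      # ordinary Shunting Yard operator stack
    counts = [0]  # the extra stack: values available at each nesting level

    def emit_operator():
        # A binary operator moves to the output: two values in, one out.
        ops.pop()
        counts[-1] -= 1
        if counts[-1] < 0:
            raise SyntaxError("operator is missing its operands")

    def close_level():
        # An argument or parenthesised group must leave exactly one value.
        n = counts.pop()
        if n != 1:
            raise SyntaxError(f"expected one value, found {n}")
        counts[-1] += n

    def unwind_to_paren():
        while ops and ops[-1] != "(":
            emit_operator()
        if not ops:
            raise SyntaxError("mismatched parentheses or stray comma")

    for token in tokens:
        if token in FUNCTIONS or token == "(":
            ops.append(token)
            counts.append(0)
        elif token == ",":
            unwind_to_paren()
            close_level()
            counts.append(0)  # start counting the next argument
        elif token == ")":
            unwind_to_paren()
            ops.pop()  # discard the "("
            close_level()
            if ops and ops[-1] in FUNCTIONS:
                ops.pop()
                counts.pop()     # n arguments consumed ...
                counts[-1] += 1  # ... and one result produced
        elif token in PRECEDENCE:
            while (ops and ops[-1] in PRECEDENCE
                   and PRECEDENCE[ops[-1]] >= PRECEDENCE[token]):
                emit_operator()
            ops.append(token)
        else:
            counts[-1] += 1  # a number or variable adds one value

    while ops:
        if ops[-1] == "(":
            raise SyntaxError("mismatched parentheses")
        emit_operator()
    if counts != [1]:
        raise SyntaxError("expression does not reduce to a single value")


validate(["min", "(", "5", ",", "A", "*", "2", ")"])  # accepted silently
for bad in (["min", "(", "1", ",", "2", ",", "3", ",", "+", ")"],
            ["min", "(", "1", ",", "3", "+", ")"]):
    try:
        validate(bad)
    except SyntaxError as error:
        print("rejected:", error)
```

In this sketch the first bad input fails on the negative counter and the second fails on the "exactly one value" check, matching the two cases described above.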

Why postfix (rpn) notation is more used than prefix?

By use I mean its use in many calculators like the HP-35.
My guesses (and confusions) are -
postfix is actually more memory efficient ( SO post comments here ). (confusion - the evaluation algorithms of both are similar, each using a stack)
the keyboard input style of calculators back then (confusion - this shouldn't have mattered much, as it only depends on whether operators are given first or last)
Another way to ask this question is: what advantages does postfix notation have over prefix?
Can anyone enlighten me?
For one, it is easier to implement evaluation.
With prefix, if you push an operator, then its operands, you need to have forward knowledge of when the operator has all its operands. Basically you need to keep track of when operators you've pushed have all their operands so that you can unwind the stack and evaluate.
Since a complex expression will likely end up with many operators on the stack you need to have a data structure that can handle this.
For instance, this expression: - + 10 20 + 30 40 will have one - and one + on the stack at the same time, and for each you need to know if you have the operands available.
With postfix, when you push an operator, the operands should already be on the stack, so you simply pop the operands and evaluate. You only need a stack that can handle operands, and no other data structure is necessary.
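To make the bookkeeping for prefix concrete, here is a small Python sketch of left-to-right prefix evaluation. It assumes binary operators only and space-separated tokens, and the names (eval_prefix_left_to_right, pending) are invented for the example. Each pending operator has to carry a list of the operands it has collected so far, which is exactly the extra tracking described above.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def eval_prefix_left_to_right(expression):
    pending = []   # each entry: (operator, operands collected so far)
    result = None
    for token in expression.split():
        if result is not None:
            raise ValueError("extra tokens after a complete expression")
        if token in OPERATIONS:
            pending.append((token, []))
            continue
        value = float(token)
        # A value may complete one operator, whose result may in turn
        # complete the operator waiting beneath it, and so on.
        while pending:
            op, operands = pending[-1]
            operands.append(value)
            if len(operands) < 2:
                break            # this operator still waits for an operand
            pending.pop()
            value = OPERATIONS[op](*operands)
        else:
            result = value       # nothing is waiting: the expression is done
    if pending or result is None:
        raise ValueError("incomplete expression")
    return result


print(eval_prefix_left_to_right("- + 10 20 + 30 40"))  # -40.0
```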
Prefix notation is probably used more commonly ... in mathematics, in expressions like F(x,y). It's a very old convention, but like many old systems (feet and inches, letter paper) it has drawbacks compared to what we could do if we used a more thoughtfully designed system.
Just about every first year university math textbook has to waste a page at least explaining that f(g(x)) means we apply g first then f. Doing it in reading order makes so much more sense: x.f.g means we apply f first. Then if we want to apply h "after" we just say x.f.g.h.
As an example, consider an issue in 3d rotations that I recently had to deal with. We want to rotate a vector according to XYZ convention. In postfix, the operation is vec.rotx(phi).roty(theta).rotz(psi). With prefix, we have to overload * or () and then reverse the order of the operations, e.g., rotz*roty*rotx*vec. That is error prone, and it is irritating to have to think about it all the time when you want to be thinking about bigger issues.
For example, I saw something like rotx*roty*rotz*vec in someone else's code and I didn't know whether it was a mistake or an unusual ZYX rotation convention. I still don't know. The program worked, so it was internally self-consistent, but in this case prefix notation made it hard to maintain.
Another issue with prefix notation is that when we (or a computer) parse the expression f(g(h(x))) we have to hold f in our memory (or on the stack), then g, then h, then ok we can apply h to x, then we can apply g to the result, then f to the result. Too much stuff in memory compared to x.f.g.h. At some point (for humans much sooner than computers) we will run out of memory. Failure in that way is not common, but why even open the door to that when x.f.g.h requires no short term memory. It's like the difference between recursion and looping.
And another thing: f(g(h(x))) has so many parentheses that it's starting to look like Lisp. Postfix notation is unambiguous when it comes to operator precedence.
Some mathematicians (in particular Nathan Jacobson) have tried changing the convention, because postfix is so much easier to work with in noncommutative algebra where order really matters, to little avail. But since we have a chance to do things over, better, in computing, we should take the opportunity.
Basically, because if you write the expression in postfix, you can evaluate that expression using just a Stack:
1. Read the next element of the expression.
2. If it is an operand, push it onto the Stack.
3. Otherwise, pop from the Stack the operands required by the Operation, and push the result onto the Stack.
4. If not at the end of the expression, go to 1.
Example
expression = 1 2 + 3 4 + *
stack = [ ]
Read 1, 1 is Operand, Push 1
[ 1 ]
Read 2, 2 is Operand, Push 2
[ 1 2 ]
Read +, + is Operation, Pop two Operands 1 2
Evaluate 1 + 2 = 3, Push 3
[ 3 ]
Read 3, 3 is Operand, Push 3
[ 3 3 ]
Read 4, 4 is Operand, Push 4
[ 3 3 4 ]
Read +, + is Operation, Pop two Operands 3 4
Evaluate 3 + 4 = 7, Push 7
[ 3 7 ]
Read *, * is Operation, Pop two Operands 3 7
Evaluate 3 * 7 = 21, Push 21
[ 21 ]
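The same loop can be written as a short Python sketch. It assumes space-separated tokens and binary operators only; the name eval_postfix is made up for the example.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def eval_postfix(expression):
    stack = []
    for token in expression.split():
        if token in OPERATIONS:
            # The operands are already on the stack; the right one is on top.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_postfix("1 2 + 3 4 + *"))  # 21.0
```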
If you like your human reading order to match the machine's stack-based evaluation order then postfix is a good choice.
That is, assuming you read left-to-right, which not everyone does (e.g. Hebrew, Arabic, ...). And assuming your machine evaluates with a stack, which not all do (e.g. term rewriting - see Joy).
On the other hand, there's nothing wrong with the human preferring prefix while the machine evaluates "back to front/bottom-to-top". Serialization could be reversed too if the concern is evaluation as tokens arrive. Tool assistance may work better in prefix notation (knowing functions/words first may help scope valid arguments), but you could always type right-to-left.
It's merely a convention I believe...
Offline evaluation of both notations is the same on a theoretical machine
(Eager evaluation strategy) Evaluating with only one stack (without putting operators on the stack)
It can be done by evaluating Prefix-notation right-to-left.
- 7 + 2 3
# evaluate + 2 3
- 7 5
# evaluate - 7 5
2
It is the same as evaluating Postfix-notation left-to-right.
7 2 3 + -
# put 7 on stack
7 2 3 + -
# evaluate 2 3 +
7 5 -
# evaluate 7 5 -
2
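A rough Python sketch of this symmetry, assuming binary operators and space-separated tokens (the function names are invented for the example). The two evaluators differ only in scan direction and in which popped value is treated as the left operand.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def eval_prefix_right_to_left(expression):
    stack = []
    for token in reversed(expression.split()):
        if token in OPERATIONS:
            left = stack.pop()    # scanning backwards, the left operand is on top
            right = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


def eval_postfix_left_to_right(expression):
    stack = []
    for token in expression.split():
        if token in OPERATIONS:
            right = stack.pop()   # scanning forwards, the right operand is on top
            left = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


print(eval_prefix_right_to_left("- 7 + 2 3"))   # 2.0
print(eval_postfix_left_to_right("7 2 3 + -"))  # 2.0
```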
(Optimized short-circuit strategy) Evaluating with two stacks (one for operators and one for operands)
It can be done by evaluating Prefix-notation left-to-right.
|| 1 < 2 3
# put || in instruction stack, 1 in operand stack or keep the pair in stack
instruction-stack: or
operand-stack: 1
< 2 3
# push < 2 3 in stack
instruction-stack: or, less_than
operand-stack: 1, 2, 3
# evaluate < 2 3 as 1
instruction-stack: or
operand-stack: 1, 1
# evaluate || 1 1 as 1
operand-stack:1
Notice that we can easily do short-circuit optimization for the boolean expression here (compared to the previous evaluation sequence).
|| 1 < 2 3
# put || in instruction stack, 1 in operand stack or keep the pair in stack
instruction-stack: or
operand-stack: 1
< 2 3
# Is it possible to evaluate `|| 1` without evaluating the rest ? Yes !!
# skip < 2 3 and put place-holder 0
instruction-stack: or
operand-stack: 1 0
# evaluate || 1 0 as 1
operand-stack: 1
It is the same as evaluating Postfix-notation right-to-left.
(Optimized short-circuit strategy) Evaluating with one stack that takes a tuple (same as above)
It can be done by evaluating Prefix-notation left-to-right.
|| 1 < 2 3
# put || 1 in tuple-stack
stack tuple[or,1,unknown]
< 2 3
# We do not need to compute < 2 3
stack tuple[or,1,unknown]
# evaluate || 1 unknown as 1
1
It is the same as evaluating Postfix-notation right-to-left.
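The short-circuit idea can be sketched in Python as below. Note the swap: this sketch uses recursion (the call stack) in place of the explicit instruction and operand stacks shown above, and it only knows the two operators from the example, || and <. The names evaluate, skip and eval_prefix are invented here. The point is that once the left operand of || is true, the right sub-expression is stepped over without being evaluated.

```python
BINARY = {"||", "<"}


def skip(tokens, pos):
    """Step over one complete sub-expression without evaluating it."""
    needed = 1                       # sub-expressions still to be consumed
    while needed:
        # An operator fills one slot but opens two; an operand fills one.
        needed += 1 if tokens[pos] in BINARY else -1
        pos += 1
    return pos


def evaluate(tokens, pos=0):
    """Evaluate the prefix expression at tokens[pos]; return (value, next_pos)."""
    token = tokens[pos]
    if token == "||":
        left, pos = evaluate(tokens, pos + 1)
        if left:                     # short circuit: the result is already known
            return 1, skip(tokens, pos)
        right, pos = evaluate(tokens, pos)
        return int(bool(right)), pos
    if token == "<":
        left, pos = evaluate(tokens, pos + 1)
        right, pos = evaluate(tokens, pos)
        return int(left < right), pos
    return int(token), pos + 1


def eval_prefix(expression):
    value, _ = evaluate(expression.split())
    return value


print(eval_prefix("|| 1 < 2 3"))  # 1
```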
Online evaluation in a calculator while a human enters data left-to-right
When putting numbers in a calculator, the Postfix-notation 2 3 + can be evaluated instantly, without any knowledge of the symbols the human is going to enter next. Prefix notation is the opposite: when we have - 7 + there is nothing we can do until we get something like - 7 + 2 3.
Online evaluation in a calculator while a human enters data right-to-left
Now the Prefix-notation can evaluate + 2 3 instantly, while the Postfix-notation waits for further input when it has 3 + - .
Please refer to #AshleyF's note that Arabic is written right-to-left, in contrast to English, which is written left-to-right!
I guess little-endian and big-endian are somehow related to this prefix/postfix notation.
One final comment: Reverse Polish notation was strongly supported by Dijkstra (he was a strong opponent of short-circuit optimization and is regarded as an inventor of Reverse Polish notation). It is your choice whether to support his opinion or not (I do not).

dc: how do I pop (and discard) the top number of the stack?

In dc, how do I pop and discard a number from the top of the stack? A stack with three items (1 2 3) should become a stack with two items (2 3). Currently I'm shoving the number onto another stack (Sz) but that seems rather lame.
There are numerous ways to delete the top of the stack, but most of them have side effects. Removing an element cleanly means choosing commands that change nothing else.
To remove the top of the stack without a side effect, ensure that the top is a number and then run d!=z (the register named after != is arbitrary, because the test never succeeds and so the register is never executed). If the stack had [5], this does the following:
Start with item to remove. Stack: [5]
Duplicate top of stack. Stack: [5,5]
Pop top 2 and test if they are not equal: 5 != 5 Stack: []
If test passed (which it can't), run z Stack: []
To ensure that the top of stack is a number, I use Z which will calculate the length of a string or the number of digits in a number and push that back. There are other options such as X. Anything that makes a number out of anything will work so that it will be compatible with !=.
So the full answer for copy pasting in all situations is the following:
Zd!=r
I usually stick this in register D (for Drop):
[Zd!=r]sD
and then I can run
lDx

Stack question: pop out in a pattern

At run time I want to input two datatypes, double and string.
One of the conditions is that strings should pop in the order I input them, while doubles pop with the usual stack behaviour, LIFO. Another condition is that the stack is limited to a max size of 10.
E.g. one runtime example
Input Hello 1 World 2 blah blah 3 4 5
Output Hello 5 World 4 blah blah 3 2 1
My first question is: how many ways are there to solve this problem?
I have solved this problem using 3 stacks: one which stores doubles, one which stores strings, and one which is used to reverse the string order.
I need to save the pattern so the program knows in which order the doubles come, thus I save the pattern to the string stack. Since the stack is limited to size 10, I will need to save the pattern in another way.
So this is how my string stack will look after the pushes:
Hello*
World*
blah
blah***
So at the first read I need to make a specific read at that stack position and just extract Hello out of it. The asterisk * is left for later use, when I tell the program that the next pop is a double.
My second question is whether there is some other, more elegant solution to this problem, since my solution involves some string manipulation. And as of now I'm not actually using the pop function in the string case as it is supposed to be used. I made the solution in C++ btw.
What you're doing is fine, except that if you must use a stack, then you aren't allowed to access random locations in a stack -- you can only push/pop -- and also it's not so nice to modify the input strings and store asterisks in them.
You can solve this using only push/pop operations with 5 stacks (technically, only 4 will be used at any one time, but since they are of different types, you need to declare all 5 in your program):
stack 1: push doubles in input order
stack 2: push strings in input order
stack 3: push data types (double or string) in input order
stack 4: reverse the order of strings in stack 2
stack 5: reverse the order of data types in stack 3
Now pop one data type at a time from stack 5: if it is a double, pop from stack 1, otherwise pop from stack 4, and print the popped value.
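A minimal sketch of the five-stack idea, written in Python for brevity although the question mentions C++. Python lists stand in for the stacks and only append (push) and pop are used on them. The size limit of 10 is ignored, doubles are recognised by being Python floats, and the function name pop_in_pattern is made up for the example. Strings are read back from stack 4, the reversed string stack.

```python
def pop_in_pattern(items):
    doubles, strings, kinds = [], [], []   # stacks 1, 2 and 3
    for item in items:
        if isinstance(item, float):
            doubles.append(item)
            kinds.append("double")
        else:
            strings.append(item)
            kinds.append("string")

    strings_reversed, kinds_reversed = [], []   # stacks 4 and 5
    while strings:
        strings_reversed.append(strings.pop())
    while kinds:
        kinds_reversed.append(kinds.pop())

    output = []
    while kinds_reversed:
        if kinds_reversed.pop() == "double":
            output.append(doubles.pop())           # LIFO for doubles
        else:
            output.append(strings_reversed.pop())  # input order for strings
    return output


print(pop_in_pattern(["Hello", 1.0, "World", 2.0, "blah", "blah", 3.0, 4.0, 5.0]))
# ['Hello', 5.0, 'World', 4.0, 'blah', 'blah', 3.0, 2.0, 1.0]
```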
Edit: #jleedev makes a good point that there isn't a general solution when the stack size is limited. What I've described above assumes that you're allowed to use multiple stacks and each stack can hold as many items as present in the input.
I'll ignore the stack size constraint since I think its meaning is unclear. In addition, if you can use multiple stacks all limited to size 10, then you can simulate larger stacks by using multiple actual stacks.
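So, this can be done with 2 stacks using only push/pop, plus one variable (held) that stores a single double.
1. Push everything onto stack A.
2. Pop everything from A onto B.
3. If B is empty, return.
4. If B.top is a double, go to 7.
5. Output B.top and pop it off of B.
6. Go to 3.
7. Pop B.top into held.
8. While B is not empty, pop B.top: if it is a double, push held onto A and store the popped double in held; otherwise push the popped string onto A.
9. Output held, which is now the last double of the remaining input.
10. Go to 2.
Step 8 shifts every remaining double one double-slot later, so the slot that was just filled disappears while the order of the doubles still to be output is preserved.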

Calculator stack

My understanding of calculators is that they are stack-based. When you use most calculators, if you type 1 + 2 [enter] [enter] you get 5. 1 is pushed on the stack, + is the operator, then 2 is pushed on the stack. The 1st [enter] should pop 1 and 2 off the stack, add them to get 3 then push 3 back on the stack. The 2nd [enter] shouldn't have access to the 2 because it effectively doesn't exist anywhere.
How is the 2 retained so that the 2nd [enter] can use it?
Is 2 pushed back onto the stack before the 3 or is it retained somewhere else for later use? If it is pushed back on the stack, can you conceivably cause a stack overflow by repeatedly doing [operator] [number] [enter] [enter]?
Conceptually, in hardware these values are put into registers. In a simple ALU (Arithmetic Logic Unit, i.e. a simple CPU), one of the registers would be considered an accumulator. The values you're discussing could be put on a stack to process, but once the stack is empty, the values (including the last operation) may still be cached in these registers. When told to perform the operation again, the calculator then uses the accumulator together with the last argument.
For example,
           Reg1   Reg2 (Accumulator)   Operator
Input 1           1
Input +           1                    +
Input 2    2      1                    +
Enter      2      3                    +
Enter      2      5                    +
Enter      2      7                    +
So it may be a function of the hardware being used.
The only true stack based calculators are calculators which have Reverse Polish Notation as the input method, as that notation directly operates on stacks.
All you would need to do is retain the last operator and operand, and just apply them if the stack is empty.
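As a rough illustration of retaining the last operator and operand, here is a small Python sketch. The Calculator class and its method names are invented for this example and do not describe any particular calculator's firmware; it also does not handle starting a fresh calculation after [enter].

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


class Calculator:
    def __init__(self):
        self.accumulator = 0
        self.operator = None   # last operator typed
        self.operand = None    # last right-hand operand typed

    def number(self, value):
        if self.operator is None:
            self.accumulator = value   # first number of the calculation
        else:
            self.operand = value       # remembered for repeated [enter]

    def op(self, symbol):
        self.operator = symbol
        self.operand = None

    def enter(self):
        # Re-apply the remembered operator and operand to the accumulator.
        if self.operator is not None and self.operand is not None:
            self.accumulator = OPERATIONS[self.operator](self.accumulator,
                                                         self.operand)
        return self.accumulator


calc = Calculator()
calc.number(1)
calc.op("+")
calc.number(2)
print([calc.enter() for _ in range(3)])  # [3, 5, 7]
```

Because the operand is kept in a field rather than pushed again, repeating [enter] in this sketch does not grow any stack.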
There is an excellent description and tutorial of the Shunting-yard algorithm (infix -> RPN conversion) on Wikipedia.

Resources