Calculator stack - data-structures

My understanding of calculators is that they are stack-based. When you use most calculators, if you type 1 + 2 [enter] [enter] you get 5. 1 is pushed on the stack, + is the operator, then 2 is pushed on the stack. The 1st [enter] should pop 1 and 2 off the stack, add them to get 3 then push 3 back on the stack. The 2nd [enter] shouldn't have access to the 2 because it effectively doesn't exist anywhere.
How is the 2 retained so that the 2nd [enter] can use it?
Is 2 pushed back onto the stack before the 3 or is it retained somewhere else for later use? If it is pushed back on the stack, can you conceivably cause a stack overflow by repeatedly doing [operator] [number] [enter] [enter]?

Conceptually, in hardware these values are put into registers. In a simple ALU (arithmetic logic unit, the core of a simple CPU), one of the registers would be considered an accumulator. The values you're discussing could be put on a stack to process, but once the stack is empty, the last values (including the last operation) may still be cached in these registers. When told to perform the operation again, the hardware uses the accumulator together with the last argument.
For example,
Key       Reg1   Reg2 (Accumulator)   Operator
Input 1          1
Input +          1                    +
Input 2   2      1                    +
Enter     2      3                    +
Enter     2      5                    +
Enter     2      7                    +
So it may be a function of the hardware being used.
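The register picture above can be sketched in a few lines of Python. This is only a toy model, not how any particular calculator is built: the class name RepeatCalculator and its press method are made up for illustration, and it assumes a single pending operator with no precedence handling.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


class RepeatCalculator:
    """Hypothetical model: an accumulator plus a retained operand and operator."""

    def __init__(self):
        self.accumulator = 0   # Reg2 in the table above
        self.operand = None    # Reg1: last number typed after an operator
        self.op = None         # last operator key pressed

    def press(self, key):
        if key in OPS:
            self.op = key
        elif key == "enter":
            # Reuse the retained operand and operator; nothing is popped.
            if self.op is not None and self.operand is not None:
                self.accumulator = OPS[self.op](self.accumulator, self.operand)
        elif self.op is None:
            self.accumulator = key   # first number goes straight to the accumulator
        else:
            self.operand = key       # later numbers are kept in Reg1
        return self.accumulator


calc = RepeatCalculator()
results = [calc.press(k) for k in (1, "+", 2, "enter", "enter", "enter")]
print(results)  # [1, 1, 1, 3, 5, 7]
```

Because the operand and operator live in their own fields rather than on a stack, repeating [enter] cannot grow any data structure, so there is nothing to overflow in this model.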

The only true stack based calculators are calculators which have Reverse Polish Notation as the input method, as that notation directly operates on stacks.

All you would need to do is retain the last operator and operand, and just apply them if the stack is empty.

There is an excellent description and tutorial of the Shunting-yard algorithm (infix -> RPN conversion) on Wikipedia.

Related

How are Ethereum bytecode JUMPs and JUMPDESTs resolved?

I've been looking around for info on how Ethereum deals with jumps and jump destinations. From various blogs and the yellow paper what I found is as follows:
The operand taken by JUMP and the first of the two operands taken by JUMPI are the value the PC is set to (assuming the second stack value != 0 in the case of JUMPI).
However, looking at this contract's creation code (as opcodes) the first few opcodes/values are:
PUSH1 0x60
PUSH1 0x40
MSTORE
CALLDATASIZE
ISZERO
PUSH2 0x00f8
JUMPI
As I understand it this means that if the value pushed to the stack by ISZERO != 0 then PC will change to 0x00f8 as JUMPI takes two from the stack, checks if the second is 0 and if not sets PC to the value of its first operand.
The problem I am having is that 0x00f8 in decimal is 248. The 248th position in the contract appears to be MSTORE and not a JUMPDEST, which would cause the contract to fail in its execution as JUMP* can only point to a valid JUMPDEST.
Presumably contracts don't jump to invalid destinations on purpose?
If anyone could explain how jumps and jump destinations are resolved I would be very grateful.
In case it helps others:
The confusion arose from the EVM reading byte by byte and NOT word by word.
From the example in the question, 0x00f8 would be the 248th byte, not the 248th word.
As each opcode is 1 byte long PC is normally incremented by 1 when reading an opcode.
However in the case of a PUSH instruction, information on how many of the following bytes are to be taken as its operand is also included.
For example, PUSH2 takes the 2 bytes that follow it, PUSH6 takes the 6 bytes that follow it, and so on. Here PC is incremented by 1 for the PUSH opcode itself and then by a further 2 or 6, respectively, to skip over the data bytes used by the PUSH.
Just want to point out that there is a difference between JUMP and JUMPI.
JUMP takes just 1 element from the stack, i.e. the destination, which is generally an offset pushed to the stack as a hex value.
JUMPI is a conditional jump that takes the top 2 elements from the stack, i.e. the destination and the condition.
In the example you gave, the condition is the result of ISZERO (which checks whether the topmost element of the stack is 0).
So if that returns true, it will jump to the destination, which is the offset 0x00f8 (248 in decimal).
If the condition is false, it will just increase the program counter by 1.
In the contract you mentioned, there is a JUMPDEST opcode at program counter 248.
The program counter of an instruction depends on the opcodes before it, that is, on how many bytes of data each PUSH carries. For example:
PUSH1 0x60   - PC[0]
PUSH1 0x40   - PC[2]
MSTORE       - PC[4]
CALLDATASIZE - PC[5]
ISZERO       - PC[6]
PUSH2 0x00f8 - PC[7]
JUMPI        - PC[10]
Maybe this website will give you a better understanding on opcodes https://ethervm.io/
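To make the byte-by-byte counting concrete, here is a small Python sketch that walks a bytecode string and reports the PC of each instruction. The helper name disassemble is made up for this example, the opcode table only covers the opcodes mentioned in this thread, and the byte string is just the seven instructions quoted in the question, not the whole contract.

```python
# Opcode values are the standard EVM ones; only those discussed here are named.
NAMES = {0x15: "ISZERO", 0x36: "CALLDATASIZE", 0x52: "MSTORE",
         0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST"}


def disassemble(bytecode: bytes):
    """Return (pc, mnemonic, immediate_hex) for each instruction (hypothetical helper)."""
    listing = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are opcodes 0x60..0x7f and carry 1..32 data bytes.
            width = op - 0x5F
            name = f"PUSH{width}"
        else:
            width = 0
            name = NAMES.get(op, f"0x{op:02x}")
        immediate = bytecode[pc + 1 : pc + 1 + width]
        listing.append((pc, name, immediate.hex()))
        pc += 1 + width  # one byte for the opcode plus its data bytes
    return listing


code = bytes.fromhex("606060405236156100f857")
for pc, name, immediate in disassemble(code):
    print(f"PC[{pc}] {name} {immediate}".rstrip())
```

Running the same walk over the full contract bytecode and looking at the entry whose pc is 248 is one way to check whether the jump target really is a JUMPDEST.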

How to convert infix expression to prefix expression without using stack, array, programming language or implementation

Can someone tell me an algorithm or the steps for converting an infix expression to a prefix expression without using a stack, an array, or any programming language or implementation? Just simple human algorithms for non-CS students.
If anyone has a better algorithm or steps, please specify, and please also try to solve this one for me... :)
(5+15/3)^2-(8*3/3*4/5*32/5+42)*(3*3/3*5/4)
The "simple human algorithm" uses a stack. Consider the Shunting Yard, for example. You can do that with paper and pencil. The "output queue" is simply the solution that you output. The "stack" is just a holding place. So when it says, "push onto the stack", imagine putting that value on the top of a stack of other values. When it says, "pop from the stack," imagine removing the thing that was on top.
When doing it with pencil and paper, dedicate a couple of lines at the bottom of the page for your output queue. Create a column on the right side of the page as your stack. Wherever it says, "write it to the output queue", write that value as the next value on your answer line.
The first time it says, "push onto the stack", write that value in the stack column, at the bottom. If you have to push something else, write it above that value. When it says "pop from the stack," erase the top value from your stack column, freeing up a space.
That really is the simplest reliable way to do things by hand.
I'll use the first bit of your example for a demonstration. Let's say you want to convert (5+15/3)^2 to postfix. Using the instructions in the Shunting Yard article:
Your output queue is empty and so is your stack. The first token is (. The instructions say to push it onto the stack. So we have:
output queue:
stack: (
The next token is 5. It goes to the output queue:
output queue: 5
stack: (
Next is +. Since there is no operator on the top of the stack (only the open parenthesis), we just push it:
output queue: 5
stack: ( +
Next is 15. It goes to the output queue:
output queue: 5 15
stack: ( +
Next is /. It's an operator and there's an operator on the stack, but / has higher precedence than +. So according to the rules, we push / onto the stack:
output queue: 5 15
stack: ( + /
Next is 3. It goes to the output queue:
output queue: 5 15 3
stack: ( + /
Next is ). The rules say to start popping operators from the stack until we get to the open parenthesis. Or, if we empty the stack and there's no open paren, then we have mismatched parentheses. Anyway, popping the operators, adding them to the output queue, and discarding the open parenthesis:
output queue: 5 15 3 / +
stack: <empty>
Next token is ^. There are no operators on the stack, so we push it.
output queue: 5 15 3 / +
stack: ^
Finally, we have 2. It goes to the output queue:
output queue: 5 15 3 / + 2
stack: ^
And we're at the end of the string, so we pop all the operators and put them on the output queue:
output queue: 5 15 3 / + 2 ^
And that's the postfix representation of (5 + 15/3)^2.
The only tricky part is getting the operator precedence right. Basically, exponentiation is highest. Multiplication and division come next, at equal precedence, then addition and subtraction at equal precedence. If those are the only operators, it's easy to remember. Otherwise you'll probably want a table of operator precedence handy so you can refer to it when you're working the algorithm. And the unary minus (e.g. 5 + -1) requires a special case. But really, that's all there is to it.
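For anyone who wants to check their pencil-and-paper result, the walkthrough above can be sketched in Python. This is a minimal version under stated assumptions: the function name to_postfix is made up, tokens are non-negative integers, parentheses, and the five operators listed above, the input is assumed to have matching parentheses, ^ is treated as right-associative (the usual convention, which the walkthrough does not spell out), and unary minus is not handled.

```python
import re

PRECEDENCE = {"^": 3, "*": 2, "/": 2, "+": 1, "-": 1}
RIGHT_ASSOC = {"^"}  # assumption: exponentiation groups right to left


def to_postfix(expression: str) -> str:
    tokens = re.findall(r"\d+|[-+*/^()]", expression)
    output, stack = [], []
    for tok in tokens:
        if tok.isdigit():
            output.append(tok)               # numbers go to the output queue
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":          # pop operators back to the "("
                output.append(stack.pop())
            stack.pop()                      # discard the "(" itself
        else:
            # Pop operators that bind at least as tightly as the new one.
            while stack and stack[-1] != "(" and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[tok]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[tok] and tok not in RIGHT_ASSOC)
            ):
                output.append(stack.pop())
            stack.append(tok)
    while stack:                             # end of input: flush the stack
        output.append(stack.pop())
    return " ".join(output)


print(to_postfix("(5+15/3)^2"))  # 5 15 3 / + 2 ^
```

The output and stack lists play exactly the roles of the answer line and the right-hand column on the page.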

Why is postfix (RPN) notation used more than prefix?

By use I mean its use in many calculators like the HP-35.
My guesses (and confusions) are:
Postfix is actually more memory efficient (SO post comments here). (Confusion: the evaluation algorithms of both are similar and both use a stack.)
Keyboard input type in calculators back then. (Confusion: this shouldn't have mattered much, as it only depends on whether operators are given first or last.)
Another way to ask this question: what advantages does postfix notation have over prefix?
Can anyone enlighten me?
For one it is easier to implement evaluation.
With prefix, if you push an operator, then its operands, you need to have forward knowledge of when the operator has all its operands. Basically you need to keep track of when operators you've pushed have all their operands so that you can unwind the stack and evaluate.
Since a complex expression will likely end up with many operators on the stack you need to have a data structure that can handle this.
For instance, this expression: - + 10 20 + 30 40 will have one - and one + on the stack at the same time, and for each you need to know if you have the operands available.
With postfix, by the time you reach an operator, its operands are (or should be) already on the stack, so you simply pop the operands and evaluate. You only need a stack that can handle operands, and no other data structure is necessary.
Prefix notation is probably used more commonly ... in mathematics, in expressions like F(x,y). It's a very old convention, but like many old systems (feet and inches, letter paper) it has drawbacks compared to what we could do if we used a more thoughtfully designed system.
Just about every first year university math textbook has to waste a page at least explaining that f(g(x)) means we apply g first then f. Doing it in reading order makes so much more sense: x.f.g means we apply f first. Then if we want to apply h "after" we just say x.f.g.h.
As an example, consider an issue in 3d rotations that I recently had to deal with. We want to rotate a vector according to XYZ convention. In postfix, the operation is vec.rotx(phi).roty(theta).rotz(psi). With prefix, we have to overload * or () and then reverse the order of the operations, e.g., rotz*roty*rotx*vec. That is error prone and irritating to have to think about that all the time when you want to be thinking about bigger issues.
For example, I saw something like rotx*roty*rotz*vec in someone else's code and I didn't know whether it was a mistake or an unusual ZYX rotation convention. I still don't know. The program worked, so it was internally self-consistent, but in this case prefix notation made it hard to maintain.
Another issue with prefix notation is that when we (or a computer) parses the expression f(g(h(x))) we have to hold f in our memory (or on the stack), then g, then h, then ok we can apply h to x, then we can apply g to the result, then f to the result. Too much stuff in memory compared to x.f.g.h. At some point (for humans much sooner than computers) we will run out of memory. Failure in that way is not common, but why even open the door to that when x.f.g.h requires no short term memory. It's like the difference between recursion and looping.
And another thing: f(g(h(x))) has so many parentheses that it's starting to look like Lisp. Postfix notation is unambiguous when it comes to operator precedence.
Some mathematicians (in particular Nathan Jacobson) have tried changing the convention, because postfix is so much easier to work with in noncommutative algebra where order really matters, to little avail. But since we have a chance to do things over, better, in computing, we should take the opportunity.
Basically, because if you write the expression in postfix, you can evaluate that expression using just a stack:
1. Read the next element of the expression.
2. If it is an operand, push it onto the stack.
3. Otherwise, pop from the stack the operands required by the operation, and push the result onto the stack.
4. If not at the end of the expression, go to 1.
Example
expression = 1 2 + 3 4 + *
stack = [ ]
Read 1, 1 is Operand, Push 1
[ 1 ]
Read 2, 2 is Operand, Push 2
[ 1 2 ]
Read +, + is Operation, Pop two Operands 1 2
Evaluate 1 + 2 = 3, Push 3
[ 3 ]
Read 3, 3 is Operand, Push 3
[ 3 3 ]
Read 4, 4 is Operand, Push 4
[ 3 3 4 ]
Read +, + is Operation, Pop two Operands 3 4
Evaluate 3 + 4 = 7, Push 7
[ 3 7 ]
Read *, * is Operation, Pop two Operands 3 7
Evaluate 3 * 7 = 21, Push 21
[ 21 ]
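The trace above maps almost line for line onto code. A minimal Python sketch, assuming space-separated tokens and only the four binary operators (the name eval_postfix is made up for this example):

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def eval_postfix(expression: str) -> float:
    stack = []
    for tok in expression.split():
        if tok in OPS:
            right = stack.pop()          # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))     # operands are simply pushed
    return stack.pop()


print(eval_postfix("1 2 + 3 4 + *"))  # 21.0
```

Note that the stack only ever holds operands; operators are applied the moment they are read.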
If you like your human reading order to match the machine's stack-based evaluation order then postfix is a good choice.
That is, assuming you read left-to-right, which not everyone does (e.g. Hebrew, Arabic, ...). And assuming your machine evaluates with a stack, which not all do (e.g. term rewriting - see Joy).
On the other hand, there's nothing wrong with the human preferring prefix while the machine evaluates "back to front/bottom-to-top". Serialization could be reversed too if the concern is evaluation as tokens arrive. Tool assistance may work better in prefix notation (knowing functions/words first may help scope valid arguments), but you could always type right-to-left.
It's merely a convention I believe...
Offline evaluation of both notations is the same on a theoretical machine
(Eager evaluation strategy) Evaluating with only one stack (without putting operators on the stack)
It can be done by evaluating Prefix-notation right-to-left.
- 7 + 2 3
# evaluate + 2 3
- 7 5
# evaluate - 7 5
2
It is the same as evaluating Postfix-notation left-to-right.
7 2 3 + -
# put 7 on stack
7 2 3 + -
# evaluate 2 3 +
7 5 -
# evaluate 7 5 -
2
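As a sketch of the eager strategy, scanning a prefix expression from right to left needs only an operand stack, just like postfix scanned from left to right. The function name eval_prefix is made up, and tokens are assumed to be space-separated numbers and the four binary operators.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def eval_prefix(expression: str) -> float:
    stack = []
    for tok in reversed(expression.split()):
        if tok in OPS:
            left = stack.pop()           # scanning right to left, the left operand is on top
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()


print(eval_prefix("- 7 + 2 3"))  # 2.0
```

The only difference from a postfix evaluator is the direction of the scan and the order in which the two operands come off the stack.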
(Optimized short-circuit strategy) Evaluating with two stacks (one for operators and one for operands)
It can be done by evaluating Prefix-notation left-to-right.
|| 1 < 2 3
# put || in instruction stack, 1 in operand stack or keep the pair in stack
instruction-stack: or
operand-stack: 1
< 2 3
# push < 2 3 in stack
instruction-stack: or, less_than
operand-stack: 1, 2, 3
# evaluate < 2 3 as 1
instruction-stack: or
operand-stack: 1, 1
# evaluate || 1 1 as 1
operand-stack:1
Notice that we can easily do short-circuit optimization for the boolean expression here (compared to the previous evaluation sequence).
|| 1 < 2 3
# put || in instruction stack, 1 in operand stack or keep the pair in stack
instruction-stack: or
operand-stack: 1
< 2 3
# Is it possible to evaluate `|| 1` without evaluating the rest ? Yes !!
# skip < 2 3 and put place-holder 0
instruction-stack: or
operand-stack: 1 0
# evaluate || 1 0 as 1
operand-stack: 1
It is the same as evaluating Postfix-notation right-to-left.
(Optimized short-circuit strategy) Evaluating with one stack that takes a tuple (same as above)
It can be done by evaluating Prefix-notation left-to-right.
|| 1 < 2 3
# put || 1 in tuple-stack
stack tuple[or,1,unknown]
< 2 3
# We do not need to compute < 2 3
stack tuple[or,1,unknown]
# evaluate || 1 unknown as 1
1
It is the same as evaluating Postfix-notation right-to-left.
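A rough Python sketch of the short-circuit idea follows. It swaps the explicit instruction stack for recursion (the call stack remembers the pending operator), so it is not a literal transcription of the traces above. The names evaluate, skip, and compared are made up, and only the two operators from the example are supported.

```python
ARITY = {"||": 2, "<": 2}   # only the two operators used in the example
compared = []               # records every "<" that actually gets evaluated


def skip(tokens, i):
    """Return the index just past the prefix expression at i, without evaluating it."""
    if tokens[i] in ARITY:
        j = i + 1
        for _ in range(ARITY[tokens[i]]):
            j = skip(tokens, j)
        return j
    return i + 1


def evaluate(tokens, i=0):
    """Evaluate the prefix expression starting at i; return (value, next_index)."""
    tok = tokens[i]
    if tok == "||":
        left, j = evaluate(tokens, i + 1)
        if left:                          # short circuit: skip the right operand
            return 1, skip(tokens, j)
        right, j = evaluate(tokens, j)
        return int(bool(right)), j
    if tok == "<":
        left, j = evaluate(tokens, i + 1)
        right, j = evaluate(tokens, j)
        compared.append((left, right))
        return int(left < right), j
    return int(tok), i + 1


value, _ = evaluate("|| 1 < 2 3".split())
print(value, compared)  # 1 []
```

The empty compared list shows that < 2 3 was skipped, which is possible only because the operator was seen before its operands.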
Online evaluation in a calculator while a human enters data left-to-right
When putting numbers in a calculator, the Postfix-notation 2 3 + can be evaluated instantly, without any knowledge of the symbols the human is going to enter next. Prefix notation is the opposite: when we have - 7 + there is nothing we can do until we get something like - 7 + 2 3.
Online evaluation in a calculator while a human enters data right-to-left
Now the Prefix-notation can evaluate + 2 3 instantly, while the Postfix-notation waits for further input when it has 3 + - .
Please refer to @AshleyF's note that Arabic is written right-to-left, in contrast to English, which is written left-to-right!
I guess little-endian and big-endian are somehow related to this prefix/postfix notation.
One final comment: Reverse Polish notation was strongly supported by Dijkstra (he was a strong opponent of short-circuit optimization and is regarded as one of the inventors of Reverse Polish notation). It is your choice whether to support his opinion or not (I do not).

dc: how do I pop (and discard) the top number of the stack?

In dc, how do I pop and discard a number from the top of the stack? A stack with three items (1 2 3) should become a stack with two items (2 3). Currently I'm shoving the number onto another stack (Sz) but that seems rather lame.
There are numerous ways to delete the top of the stack, but most of them have side effects. Removing an element cleanly requires a command sequence that has no side effects of its own.
To remove the top of the stack without a side effect, ensure that the top is a number and then run d!=z. If the stack had [5], this does the following
Start with item to remove. Stack: [5]
Duplicate top of stack. Stack: [5,5]
Pop top 2 and test if they are not equal: 5 != 5 Stack: []
If the test passed (which it can't), execute the macro in register z. Stack: []
To ensure that the top of stack is a number, I use Z which will calculate the length of a string or the number of digits in a number and push that back. There are other options such as X. Anything that makes a number out of anything will work so that it will be compatible with !=.
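So the full answer, for copy-pasting in all situations, is the following (the register named after != is never executed because the test cannot pass, so its name does not matter):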
Zd!=r
I usually stick this in register D (for Drop):
[Zd!=r]sD
and then I can run
lDx

Stack question: pop out in a pattern

At run time I want to input two data types, double and string.
One condition is that strings should pop in the order I input them, while doubles pop with the usual stack behaviour, LIFO. Another condition is that the stack is limited to a maximum size of 10.
E.g. one runtime example
Input Hello 1 World 2 blah blah 3 4 5
Output Hello 5 World 4 blah blah 3 2 1
My first question is: how many ways are there to solve this problem?
I have solved this problem using 3 stacks: one which stores doubles, one which stores strings, and one which is used to reverse the string order.
I need to save the pattern so the program knows in which order the doubles come, thus I save the pattern in the string stack. Since the stack is limited to size 10, I will need to save the pattern in another way.
So this is how my string stack will look after the push:
Hello*
World*
blah
blah***
So at the first read I need to make a specific read at that stack position and just extract Hello out of it. The asterisk * is left for later use, to tell the program that the next pop is a double.
My second question is whether there is some other, more elegant solution to this problem, since my solution involves some string manipulation. And as of now I'm not actually using the pop function in the string case as it is supposed to be used. I made the solution in C++, btw.
What you're doing is fine, except that if you must use a stack, then you aren't allowed to access random locations in a stack -- you can only push/pop -- and also it's not so nice to modify the input strings and store asterisks in them.
You can solve this using only push/pop operations with 5 stacks (technically, only 4 will be used at any one time, but since they are of different types, you need to declare all 5 in your program):
stack 1: push doubles in input order
stack 2: push strings in input order
stack 3: push data types (double or string) in input order
stack 4: reverse the order of strings in stack 2
stack 5: reverse the order of data types in stack 3
Now pop one data type at a time from stack 5. If it is a double, pop from stack 1, otherwise pop from stack 4, and print the popped value.
Edit: @jleedev makes a good point that there isn't a general solution when the stack size is limited. What I've described above assumes that you're allowed to use multiple stacks and each stack can hold as many items as are present in the input.
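Here is a rough sketch of the five-stack approach, written in Python rather than the asker's C++ to keep it short. Plain lists stand in for stacks and only append (push) and pop are used. The function name pattern_pop is made up, doubles are assumed to arrive as float values and everything else as a string, and the size limit of 10 is ignored, as in the assumption stated above.

```python
def pattern_pop(items):
    doubles, strings, kinds = [], [], []      # stacks 1, 2 and 3
    for item in items:
        if isinstance(item, float):
            doubles.append(item)
            kinds.append("double")
        else:
            strings.append(item)
            kinds.append("string")

    strings_rev, kinds_rev = [], []           # stacks 4 and 5
    while strings:
        strings_rev.append(strings.pop())     # reversing puts the first string on top
    while kinds:
        kinds_rev.append(kinds.pop())         # reversing puts the first data type on top

    out = []
    while kinds_rev:
        if kinds_rev.pop() == "double":
            out.append(doubles.pop())         # doubles come out LIFO
        else:
            out.append(strings_rev.pop())     # strings come out in input order
    return out


data = ["Hello", 1.0, "World", 2.0, "blah", "blah", 3.0, 4.0, 5.0]
print(pattern_pop(data))
```

For the example input this gives Hello 5.0 World 4.0 blah blah 3.0 2.0 1.0, matching the expected output in the question.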
I'll ignore the stack size constraint since I think its meaning is unclear. In addition, if you can use multiple stacks all limited to size 10, then you can simulate larger stacks by using multiple actual stacks.
So, this can be done with 2 stacks using only push/pop.
1. push everything onto stack A.
2. pop everything from A onto B.
3. if B.empty return
4. if B.top is a double goto 7
5. output B.top and pop it off of B
6. goto 3
7. pop all of B onto A
8. while A.top is not a double pop A onto B
9. output A.top and pop it off of A
10. goto 2.

Resources