How to recognize variables that don't affect the output of a program? - compiler-theory

Sometimes the value of a variable accessed within the control-flow of a program cannot possibly have any effect on a its output. For example:
global var_1
global var_2
start program hello(var_3, var_4)
if (var_2 < 0) then
save-log-to-disk (var_1, var_3, var_4)
end-if
return ("Hello " + var_3 + ", my name is " + var_1)
end program
Here only var_1 and var_3 have any influence on the output, while var_2 and var_4 are only used for side effects.
Do variables such as var_1 and var_3 have a name in dataflow-theory/compiler-theory?
Which static dataflow analysis techniques can be used to discover them?
References to academic literature on the subject would be particularly appreciated.

The problem that you stated is undecidable in general,
even for the following very narrow special case:
Given a single routine P(x), where x is a parameter of type integer. Is the output of P(x) independent of the value of x, i.e., does
P(0) = P(1) = P(2) = ...?
We can reduce the following still undecidable version of the halting problem to the question above: Given a Turing machine M(), does the program
never stop on the empty input?
I assume that we use a (Turing-complete) language in which we can build a "Turing machine simulator":
Given the program M(), construct this routine:
P(x):
if x == 0:
return 0
Run M() for x steps
if M() has terminated then:
return 1
else:
return 0
Now:
P(0) = P(1) = P(2) = ...
=>
M() does not terminate.
M() does terminate
=> P(x) = 1 for a sufficiently large x
=> P(x) != P(0) = 0
So, it is very difficult for a compiler to decide whether a variable actually does not influence the return value of a routine; in your example, the "side effect routine" might manipulate one of its values (or even loop infinitely, which would most definitely change the return value of the routine ;-)
Of course overapproximations are still possible. For example, one might conclude that a variable does not influence the return value if it does not appear in the routine body at all. You can also see some classical compiler analyses (like Expression Simplification, Constant propagation) having the side effect of eliminating appearances of such redundant variables.

Pachelbel has discussed the fact that you cannot do this perfectly. OK, I'm an engineer, I'm willing to accept some dirt in my answer.
The classic way to answer you question is to do dataflow tracing from program outputs back to program inputs. A dataflow is the connection of a program assignment (or sideeffect) to a variable value, to a place in the application that consumes that value.
If there is (transitive) dataflow from a program output that you care about (in your example, the printed text stream) to an input you supplied (var2), then that input "affects" the output. A variable that does not flow from the input to your desired output is useless from your point of view.
If you focus your attention only the computations involved in the dataflows, and display them, you get what is generally called a "program slice" . There are (very few) commercial tools that can show this to you.
Grammatech has a good reputation here for C and C++.
There are standard compiler algorithms for constructing such dataflow graphs; see any competent compiler book.
They all suffer from some limitation due to Turing's impossibility proofs as pointed out by Pachelbel. When you implement such a dataflow algorithm, there will be places that it cannot know the right answer; simply pick one.
If your algorithm chooses to answer "there is no dataflow" in certain places where it is not sure, then it may miss a valid dataflow and it might report that a variable does not affect the answer incorrectly. (This is called a "false negative"). This occasional error may be satisfactory if
the algorithm has some other nice properties, e.g, it runs really fast on a millions of code. (The trivial algorithm simply says "no dataflow" in all places, and it is really fast :)
If your algorithm chooses to answer "yes there is a dataflow", then it may claim that some variable affects the answer when it does not. (This is called a "false positive").
You get to decide which is more important; many people prefer false positives when looking for a problem, because then you have to at least look at possibilities detected by the tool. A false negative means it didn't report something you might care about. YMMV.
Here's a starting reference: http://en.wikipedia.org/wiki/Data-flow_analysis
Any of the books on that page will be pretty good. I have Muchnick's book and like it lot. See also this page: (http://en.wikipedia.org/wiki/Program_slicing)
You will discover that implementing this is pretty big effort, for any real langauge. You are probably better off finding a tool framework that does most or all this for you already.

I use the following algorithm: a variable is used if it is a parameter or it occurs anywhere in an expression, excluding as the LHS of an assignment. First, count the number of uses of all variables. Delete unused variables and assignments to unused variables. Repeat until no variables are deleted.
This algorithm only implements a subset of the OP's requirement, it is horribly inefficient because it requires multiple passes. A garbage collection may be faster but is harder to write: my algorithm only requires a list of variables with usage counts. Each pass is linear in the size of the program. The algorithm effectively does a limited kind of dataflow analysis by elimination of the tail of a flow ending in an assignment.
For my language the elimination of side effects in the RHS of an assignment to an unused variable is mandated by the language specification, it may not be suitable for other languages. Effectiveness is improved by running before inlining to reduce the cost of inlining unused function applications, then running it again afterwards which eliminates parameters of inlined functions.
Just as an example of the utility of the language specification, the library constructs a thread pool and assigns a pointer to it to a global variable. If the thread pool is not used, the assignment is deleted, and hence the construction of the thread pool elided.
IMHO compiler optimisations are almost invariably heuristics whose performance matters more than effectiveness achieving a theoretical goal (like removing unused variables). Simple reductions are useful not only because they're fast and easy to write, but because a programmer using a language who understand basics of the compiler operation can leverage this knowledge to help the compiler. The most well known example of this is probably the refactoring of recursive functions to place the recursion in tail position: a pointless exercise unless the programmer knows the compiler can do tail-recursion optimisation.

Related

An algorithm for compiler designing?

Recently I am thinking about an algorithm constructed by myself. I call it Replacment Compiling.
It works as follows:
Define a language as well as its operators' precedence, such as
(1) store <value> as <id>, replace with: var <id> = <value>, precedence: 1
(2) add <num> to <num>, replace with: <num> + <num>, precedence: 2
Accept a line of input, such as store add 1 to 2 as a;
Tokenize it: <kw,store><kw,add><num,1><kw,to><2><kw,as><id,a><EOF>;
Then scan through all the tokens until reach the end-of-file, find the operation with highest precedence, and "pack" the operation:
<kw,store>(<kw,add><num,1><kw,to><2>)<kw,as><id,a><EOF>
Replace the "sub-statement", the expression in parenthesis, with the defined replacement:
<kw,store>(1 + 2)<kw,as><id,a><EOF>
Repeat until no more statements left:
(<kw,store>(1 + 2)<kw,as><id,a>)<EOF>
(var a = (1 + 2))
Then evaluate the code with the built-in function, eval().
eval("var a = (1 + 2)")
Then my question is: would this algorithm work, and what are the limitations? Is this algorithm works better on simple languages?
This won't work as-is, because there's no way of deciding the precedence of operations and keywords, but you have essentially defined parsing (and thrown in an interpretation step at the end). This looks pretty close to operator-precedence parsing, but I could be wrong in the details of your vision. The real keys to what makes a parsing algorithm are the direction/precedence it reads the code, whether the decisions are made top-down (figure out what kind of statement and apply the rules) or bottom-up (assemble small pieces into larger components until the types of statements are apparent), and whether the grammar is encoded as code or data for a generic parser. (I'm probably overlooking something, but this should give you a starting point to make sense out of further reading.)
More typically, code is generally parsed using an LR technique (LL if it's top-down) that's driven from a state machine with look-ahead and next-step information, but you'll also find the occasional recursive descent. Since they're all doing very similar things (only implemented differently), your rough algorithm could probably be refined to look a lot like any of them.
For most people learning about parsing, recursive-descent is the way to go, since everything is in the code instead of building what amounts to an interpreter for the state machine definition. But most parser generators build an LL or LR compiler.
And I'm obviously over-simplifying the field, since you can see at the bottom of the Wikipedia pages that there's a smattering of related systems that partly revolve around the kind of grammar you have available. But for most languages, those are the big-three algorithms.
What you've defined is a rewriting system: https://en.wikipedia.org/wiki/Rewriting
You can make a compiler like that, but it's hard work and runs slowly, and if you do a really good job of optimizing it then you'll get conventional table-driven parser. It would be better in the end to learn about those first and just start there.
If you really don't want to use a parser generating tool, then the easiest way to write a parser for a simple language by hand is usually recursive descent: https://en.wikipedia.org/wiki/Recursive_descent_parser

Precision in Program analysis

According to David Brumley's Control Flow Integrity & Software Fault Isolation (PPT slide),
in the below statements, x is always 8 due to the path to the x=7 is unrealizable even with the path sensitive analysis.
Why is that?
Is it because the analysis cannot determine the values of n, a, b, and c in advance during the analysis? Or is it because there's no solution that can be calculated by a computer?
if(a^n + b^n = c^n && n>2 && a>0 && b>0 && c>0)
x = 7; /unrealizable path/
else
x = 8;
In general, the task to determine which path in the program is taken, and which — not, is undecidable. It is quite possible that a particular expression, as in your example, can be proved to have a specific value. However, the words "in general" and "undecidable" say that you cannot write an algorithm that would be able to compute the value every time.
At this point the analysis algorithm can be optimistic or pessimistic. The optimistic one could pick 8 and be fine — it considers possible that at run-time x would get this value. It could also pick 7 — "who knows, maybe, x would be 7". But if the analysis is required to be sound, and it cannot determine the value of the condition, it should assume that the first branch could be taken during one execution, and the second branch could be taken during another execution, so x could be either 7 or 8.
In other words, there is a trade-off between soundness and precision. Or, actually, between soundness, precision, and decidability. The latter property tells if the analysis always terminates. Now, you have to pick what is needed:
Decidability — this is a common choice for compilers and code analyzers, because you would like to get an answer about your program in finite time. However, proof assistants could start some processes that could run up to the specified time limit, and if the limit is not set, forever: it's up to the user to stop it and to try something else.
Soundness — this is a common choice for compilers, because you would like to get the answer that matches the language specification. Code analyzers are more flexible. Many of them are unsound, but because of that they can find more potential issues in finite time, leaving the interpretation to the developer. I believe the example you mention talks about sound analysis.
Precision — this is a rare property. Compilers and code analyzer should be pessimistic, because otherwise some incorrect code could sneak in. But this might be parameterizable. E.g., if the compiler/analyzer supports constant propagation and folding, and all of the variables in the example are set to some known constants before the condition, it can figure out the exact value of x after it, and be completely precise.

Is it worth it to rewrite an if statement to avoid branching?

Recently I realized I have been doing too much branching without caring the negative impact on performance it had, therefore I have made up my mind to attempt to learn all about not branching. And here is a more extreme case, in attempt to make the code to have as little branch as possible.
Hence for the code
if(expression)
A = C; //A and C have to be the same type here obviously
expression can be A == B, or Q<=B, it could be anything that resolve to true or false, or i would like to think of it in term of the result being 1 or 0 here
I have come up with this non branching version
A += (expression)*(C-A); //Edited with thanks
So my question would be, is this a good solution that maximize efficiency?
If yes why and if not why?
Depends on the compiler, instruction set, optimizer, etc. When you use a boolean expression as an int value, e.g., (A == B) * C, the compiler has to do the compare, and the set some register to 0 or 1 based on the result. Some instruction sets might not have any way to do that other than branching. Generally speaking, it's better to write simple, straightforward code and let the optimizer figure it out, or find a different algorithm that branches less.
Jeez, no, don't do that!
Anyone who "penalize[s] [you] a lot for branching" would hopefully send you packing for using something that awful.
How is it awful, let me count the ways:
There's no guarantee you can multiply a quantity (e.g., C) by a boolean value (e.g., (A==B) yields true or false). Some languages will, some won't.
Anyone casually reading it is going observe a calculation, not an assignment statement.
You're replacing a comparison, and a conditional branch with two comparisons, two multiplications, a subtraction, and an addition. Seriously non-optimal.
It only works for integral numeric quantities. Try this with a wide variety of floating point numbers, or with an object, and if you're really lucky it will be rejected by the compiler/interpreter/whatever.
You should only ever consider doing this if you had analyzed the runtime properties of the program and determined that there is a frequent branch misprediction here, and that this is causing an actual performance problem. It makes the code much less clear, and its not obvious that it would be any faster in general (this is something you would also have to measure, under the circumstances you are interested in).
After doing research, I came to the conclusion that when there are bottleneck, it would be good to include timed profiler, as these kind of codes are usually not portable and are mainly used for optimization.
An exact example I had after reading the following question below
Why is it faster to process a sorted array than an unsorted array?
I tested my code on C++ using that, that my implementation was actually slower due to the extra arithmetics.
HOWEVER!
For this case below
if(expression) //branched version
A += C;
//OR
A += (expression)*(C); //non-branching version
The timing was as of such.
Branched Sorted list was approximately 2seconds.
Branched unsorted list was aproximately 10 seconds.
My implementation (whether sorted or unsorted) are both 3seconds.
This goes to show that in an unsorted area of bottleneck, when we have a trivial branching that can be simply replaced by a single multiplication.
It is probably more worthwhile to consider the implementation that I have suggested.
** Once again it is mainly for the areas that is deemed as the bottleneck **

Writing a program that writes a program

Its well known in theoretical computer science that the "Hello world tester" program is an undecidable problem.(Here is a link what i mean by hello world tester
My question is since given a program as input we can't say what the program will do,can we solve the reverse problem:
Given set of input and output,is there an algorithm for writing a program that writes a program to achieve a one to one mapping between the given input and output.
I know about metaprogramming but my question is more of theoretical interest. Something which can apply for a generic case.
With these kind of musings one has to be very careful. A lot of confusion arises from not clearly distinguishing about a program x for which proposition P(x) holds or any program x for which proposition P(x) hold. As long as the set of programs for which P(x) holds is finite there always is a proof, of their correctness (although this proof may not be known).
At this point you also have to distinguish between programs, which are and can be known and programs which can only be shown to exist by full enumeration of all posibilities. Let's make an example:
Take 10 Programs, which take no input and may or may not terminate and produce "hello World". Then there is a program which decides exactly which of these programs are correct, and which aren't. Lets call these programs (x_1,...,x_10). Then take the programs (X_0,...,X_{2^10}) where X_i output true for program x_j if the j-th bit in the binary representation of i is set. One of these programs has to be the one which decides correctly for all ten x_i, there just might not be any way to ever figure out which one of these 100 X_js is the correct one (a meta-problem at this point).
This goes to show that considering finite sets of programs and input/output pairs one can always resolve to full enumeration and all halting-problem type of paradoxies instantly disappear. In your case the set of generated programs for each input is of size one and the set of input/output pairs is of finite size (because you have to supply it to the meta-program). Hence full enumeration solves your problem very simple and you can also easily proof both the correctness of the corrected program as well as the correctness of the meta-program.
Note: Since the set of generated programs is infinite, this is one of the few cases where you can proof P(x) for a infinite set of programs (actually you can proof P(x,input,output) for this set). This shows that the set being infinite is only a necessary, not a sufficient condition for this type of paradoxes to appear.
Your question is ambiguously phrased.
How would you specify "what a program should do"?
Any precise, complete, and machine-readable specification of a program's functionality is already a program.
Thus, the answer to your question is, a compiler.
Now, you're asking how to find a function based on a sample of its input and output.
That is a question about statistical analysis that I cannot answer.
Sounds like you want to generate a state machine that learns by being given an input sequence and then updates itself to produce the appropriate output sequence. Assuming your output sequences are always the same for the same input sequence it should be simple enough to write. If your output is not deterministic, such as changing the output depending on the time of day, then you cannot automatically generate a state machine.
Depends on what you mean by "one to one mapping". (And also, I suppose, "input" and "output".)
My guess is that you're asking whether, given an example of inputs and outputs for a given arbitrary program, can one devise an algorithm to write an equivalent program? If so, the answer is no. Eg, you could have a program with the inputs/outputs of 1/1, 2/2, 3/3, 4/4, and yet, if the next input value was 5, the output would be 3782. There's no way to know, from a given set of results, what the next result might be.
The question is underspecified since you did not say how the input and output are presented. For finite lists, the answer is "yes", as in this Python code:
def f(input,output):
print "def g():"
print " x = {" + ",".join(repr(x) + ":" + repr(y) for x,y in zip(input,output)) + "}"
print " print x[raw_input()]"
>>> f(['2','3','4'],['a','b','x'])
def g():
x = {'2':'a','3':'b','4':'x'}
print x[raw_input()]
>>> def g():
... x = {'2':'a','3':'b','4':'x'}
... print x[raw_input()]
...
>>> g()
3
b
for infinite sets how are you going to present them? If you show only a small sample of input this does not specify the whole algorithm. Guessing the best fit is undecidable. If you have a "magic blackbox" then there are continuum many mappings but only a countable number of programs, so it's impossible.
I think I agree with SLaks, but taking things from a different angle, what does a compiler do?
(EDIT: I see SLaks edited his original answer, which was essentially 'you're describing the identity function').
It takes a program in one source language that describes the intended behaviour of a program, and "writes" another program in a target language that exhibits that behaviour.
We could also think of this in terms of things like process refinement --- given an abstract specification, we can construct a refinement mapping to some "more concrete" (read: less non-deterministic, usually) implementation.
But based on your question, it's really very difficult to tell which of these you meant, if any.

Why does Pascal forbid modification of the counter inside the for block?

Is it because Pascal was designed to be so, or are there any tradeoffs?
Or what are the pros and cons to forbid or not forbid modification of the counter inside a for-block? IMHO, there is little use to modify the counter inside a for-block.
EDIT:
Could you provide one example where we need to modify the counter inside the for-block?
It is hard to choose between wallyk's answer and cartoonfox's answer,since both answer are so nice.Cartoonfox analysis the problem from language aspect,while wallyk analysis the problem from the history and the real-world aspect.Anyway,thanks for all of your answers and I'd like to give my special thanks to wallyk.
In programming language theory (and in computability theory) WHILE and FOR loops have different theoretical properties:
a WHILE loop may never terminate (the expression could just be TRUE)
the finite number of times a FOR loop is to execute is supposed to be known before it starts executing. You're supposed to know that FOR loops always terminate.
The FOR loop present in C doesn't technically count as a FOR loop because you don't necessarily know how many times the loop will iterate before executing it. (i.e. you can hack the loop counter to run forever)
The class of problems you can solve with WHILE loops is strictly more powerful than those you could have solved with the strict FOR loop found in Pascal.
Pascal is designed this way so that students have two different loop constructs with different computational properties. (If you implemented FOR the C-way, the FOR loop would just be an alternative syntax for while...)
In strictly theoretical terms, you shouldn't ever need to modify the counter within a for loop. If you could get away with it, you'd just have an alternative syntax for a WHILE loop.
You can find out more about "while loop computability" and "for loop computability" in these CS lecture notes: http://www-compsci.swan.ac.uk/~csjvt/JVTTeaching/TPL.html
Another such property btw is that the loopvariable is undefined after the for loop. This also makes optimization easier
Pascal was first implemented for the CDC Cyber—a 1960s and 1970s mainframe—which like many CPUs today, had excellent sequential instruction execution performance, but also a significant performance penalty for branches. This and other characteristics of the Cyber architecture probably heavily influenced Pascal's design of for loops.
The Short Answer is that allowing assignment of a loop variable would require extra guard code and messed up optimization for loop variables which could ordinarily be handled well in 18-bit index registers. In those days, software performance was highly valued due to the expense of the hardware and inability to speed it up any other way.
Long Answer
The Control Data Corporation 6600 family, which includes the Cyber, is a RISC architecture using 60-bit central memory words referenced by 18-bit addresses. Some models had an (expensive, therefore uncommon) option, the Compare-Move Unit (CMU), for directly addressing 6-bit character fields, but otherwise there was no support for "bytes" of any sort. Since the CMU could not be counted on in general, most Cyber code was generated for its absence. Ten characters per word was the usual data format until support for lowercase characters gave way to a tentative 12-bit character representation.
Instructions are 15 bits or 30 bits long, except for the CMU instructions being effectively 60 bits long. So up to 4 instructions packed into each word, or two 30 bit, or a pair of 15 bit and one 30 bit. 30 bit instructions cannot span words. Since branch destinations may only reference words, jump targets are word-aligned.
The architecture has no stack. In fact, the procedure call instruction RJ is intrinsically non-re-entrant. RJ modifies the first word of the called procedure by writing a jump to the next instruction after where the RJ instruction is. Called procedures return to the caller by jumping to their beginning, which is reserved for return linkage. Procedures begin at the second word. To implement recursion, most compilers made use of a helper function.
The register file has eight instances each of three kinds of register, A0..A7 for address manipulation, B0..B7 for indexing, and X0..X7 for general arithmetic. A and B registers are 18 bits; X registers are 60 bits. Setting A1 through A5 has the side effect of loading the corresponding X1 through X5 register with the contents of the loaded address. Setting A6 or A7 writes the corresponding X6 or X7 contents to the address loaded into the A register. A0 and X0 are not connected. The B registers can be used in virtually every instruction as a value to add or subtract from any other A, B, or X register. Hence they are great for small counters.
For efficient code, a B register is used for loop variables since direct comparison instructions can be used on them (B2 < 100, etc.); comparisons with X registers are limited to relations to zero, so comparing an X register to 100, say, requires subtracting 100 and testing the result for less than zero, etc. If an assignment to the loop variable were allowed, a 60-bit value would have to be range-checked before assignment to the B register. This is a real hassle. Herr Wirth probably figured that both the hassle and the inefficiency wasn't worth the utility--the programmer can always use a while or repeat...until loop in that situation.
Additional weirdness
Several unique-to-Pascal language features relate directly to aspects of the Cyber:
the pack keyword: either a single "character" consumes a 60-bit word, or it is packed ten characters per word.
the (unusual) alfa type: packed array [1..10] of char
intrinsic procedures pack() and unpack() to deal with packed characters. These perform no transformation on modern architectures, only type conversion.
the weirdness of text files vs. file of char
no explicit newline character. Record management was explicitly invoked with writeln
While set of char was very useful on CDCs, it was unsupported on many subsequent 8 bit machines due to its excess memory use (32-byte variables/constants for 8-bit ASCII). In contrast, a single Cyber word could manage the native 62-character set by omitting newline and something else.
full expression evaluation (versus shortcuts). These were implemented not by jumping and setting one or zero (as most code generators do today), but by using CPU instructions implementing Boolean arithmetic.
Pascal was originally designed as a teaching language to encourage block-structured programming. Kernighan (the K of K&R) wrote an (understandably biased) essay on Pascal's limitations, Why Pascal is Not My Favorite Programming Language.
The prohibition on modifying what Pascal calls the control variable of a for loop, combined with the lack of a break statement means that it is possible to know how many times the loop body is executed without studying its contents.
Without a break statement, and not being able to use the control variable after the loop terminates is more of a restriction than not being able to modify the control variable inside the loop as it prevents some string and array processing algorithms from being written in the "obvious" way.
These and other difference between Pascal and C reflect the different philosophies with which they were first designed: Pascal to enforce a concept of "correct" design, C to permit more or less anything, no matter how dangerous.
(Note: Delphi does have a Break statement however, as well as Continue, and Exit which is like return in C.)
Clearly we never need to be able to modify the control variable in a for loop, because we can always rewrite using a while loop. An example in C where such behaviour is used can be found in K&R section 7.3, where a simple version of printf() is introduced. The code that handles '%' sequences within a format string fmt is:
for (p = fmt; *p; p++) {
if (*p != '%') {
putchar(*p);
continue;
}
switch (*++p) {
case 'd':
/* handle integers */
break;
case 'f':
/* handle floats */
break;
case 's':
/* handle strings */
break;
default:
putchar(*p);
break;
}
}
Although this uses a pointer as the loop variable, it could equally have been written with an integer index into the string:
for (i = 0; i < strlen(fmt); i++) {
if (fmt[i] != '%') {
putchar(fmt[i]);
continue;
}
switch (fmt[++i]) {
case 'd':
/* handle integers */
break;
case 'f':
/* handle floats */
break;
case 's':
/* handle strings */
break;
default:
putchar(fmt[i]);
break;
}
}
It can make some optimizations (loop unrolling for instance) easier: no need for complicated static analysis to determine if the loop behavior is predictable or not.
From For loop
In some languages (not C or C++) the
loop variable is immutable within the
scope of the loop body, with any
attempt to modify its value being
regarded as a semantic error. Such
modifications are sometimes a
consequence of a programmer error,
which can be very difficult to
identify once made. However only overt
changes are likely to be detected by
the compiler. Situations where the
address of the loop variable is passed
as an argument to a subroutine make it
very difficult to check, because the
routine's behaviour is in general
unknowable to the compiler.
So this seems to be to help you not burn your hand later on.
Disclaimer: It has been decades since I last did PASCAL, so my syntax may not be exactly correct.
You have to remember that PASCAL is Nicklaus Wirth's child, and Wirth cared very strongly about reliability and understandability when he designed PASCAL (and all of its successors).
Consider the following code fragment:
FOR I := 1 TO 42 (* THE UNIVERSAL ANSWER *) DO FOO(I);
Without looking at procedure FOO, answer these questions: Does this loop ever end? How do you know? How many times is procedure FOO called in the loop? How do you know?
PASCAL forbids modifying the index variable in the loop body so that it is POSSIBLE to know the answers to those questions, and know that the answers won't change when and if procedure FOO changes.
It's probably safe to conclude that Pascal was designed to prevent modification of a for loop index inside the loop. It's worth noting that Pascal is by no means the only language which prevents programmers doing this, Fortran is another example.
There are two compelling reasons for designing a language that way:
Programs, specifically the for loops in them, are easier to understand and therefore easier to write and to modify and to verify.
Loops are easier to optimise if the compiler knows that the trip count through a loop is established before entry to the loop and invariant thereafter.
For many algorithms this behaviour is the required behaviour; updating all the elements in an array for example. If memory serves Pascal also provides do-while loops and repeat-until loops. Most, I guess, algorithms which are implemented in C-style languages with modifications to the loop index variable or breaks out of the loop could just as easily be implemented with these alternative forms of loop.
I've scratched my head and failed to find a compelling reason for allowing the modification of a loop index variable inside the loop, but then I've always regarded doing so as bad design, and the selection of the right loop construct as an element of good design.
Regards
Mark

Resources