Prove that we can decide whether a Turing machine takes at least 100 steps on some input - computation-theory

We know that the problem “Does this Turing machine take at least this finite number of steps on that input?” is decidable, because it will always answer yes or no, where it will say yes if the machine reaches the given number of steps and no if it halts before that.
Now here is my doubt: if it halts before reaching those many steps — i.e. the input either (1) got accepted or (2) got rejected or maybe (3)if it doesn’t halt but rather goes into an infinite loop — then, when we are in case (3), how can we be sure that it will always be in that loop?
What I mean to say is that if it doesn't run forever but comes out of the loop at some point of time then it might cross the asked number of steps and the decision can be made now which was earlier not possible. If so, then how can we conclude that it's decidable when we know that being stuck in a loop we won’t be able to say anything about the outcome?

(I already more or less answered your question when I edited it.)
The thing is, the decision system (a Turing machine, an algorithm or any other equivalent formalism) that takes as inputs a Turing machine M, a number N and a value X, and returns yes or no, has total control over how it executes M on X. It simulates it step by step. So it can run one step of M(X), increment an instruction counter, compare it to N and, as soon as the given number of steps is reached, it stops and returns yes. At that point, there is no need that the simulated machine M be in a final state, and actually the full computation M(X) could very well diverge. We don’t care, because we only run the first N steps.

Most likely the "conditional structures where not being debuged/developed enough so that multiple conditions often conflicted each other..the error reporting where not as definitive, so it where used semi abstract notions as "decidable" and "undecidable"
as a semi example i writen years ago in vbs a "64 bit rom memory" simulator, as i tried to manage the memory cells, where i/o read/write locations where atributed , using manny formulas and conditions to set conversions from decimal to binary and all the operations, indexing, etc.
I had allso run into bugs becouse that the conditons where not perfect.Why? becouse the program had some unresolved somewhat arbitrary results that could had ended up in :
print.debug "decidable"
On Error Resume h
h:
print.debug "undecidable"
this was a example with a clear scope and with a debatable result.
to resume to your question : > "so how do we conclude that it's decidable??"
wikipedia :
The Turing machine was invented in 1936 by Alan Turing, who called it an "a-machine" (automatic machine). With this model, Turing was able to answer two questions in the negative:
Does a machine exist that can determine whether any arbitrary machine on its tape is "circular" (e.g., freezes, or fails to continue its computational task)?
Does a machine exist that can determine whether any arbitrary machine on its tape ever prints a given symbol?
Thus by providing a mathematical description of a very simple device capable of arbitrary computations, he was able to prove properties of computation in general—and in particular, the uncomputability of the ('decision problem').

Related

Precision in Program analysis

According to David Brumley's Control Flow Integrity & Software Fault Isolation (PPT slide),
in the below statements, x is always 8 due to the path to the x=7 is unrealizable even with the path sensitive analysis.
Why is that?
Is it because the analysis cannot determine the values of n, a, b, and c in advance during the analysis? Or is it because there's no solution that can be calculated by a computer?
if(a^n + b^n = c^n && n>2 && a>0 && b>0 && c>0)
x = 7; /unrealizable path/
else
x = 8;
In general, the task to determine which path in the program is taken, and which — not, is undecidable. It is quite possible that a particular expression, as in your example, can be proved to have a specific value. However, the words "in general" and "undecidable" say that you cannot write an algorithm that would be able to compute the value every time.
At this point the analysis algorithm can be optimistic or pessimistic. The optimistic one could pick 8 and be fine — it considers possible that at run-time x would get this value. It could also pick 7 — "who knows, maybe, x would be 7". But if the analysis is required to be sound, and it cannot determine the value of the condition, it should assume that the first branch could be taken during one execution, and the second branch could be taken during another execution, so x could be either 7 or 8.
In other words, there is a trade-off between soundness and precision. Or, actually, between soundness, precision, and decidability. The latter property tells if the analysis always terminates. Now, you have to pick what is needed:
Decidability — this is a common choice for compilers and code analyzers, because you would like to get an answer about your program in finite time. However, proof assistants could start some processes that could run up to the specified time limit, and if the limit is not set, forever: it's up to the user to stop it and to try something else.
Soundness — this is a common choice for compilers, because you would like to get the answer that matches the language specification. Code analyzers are more flexible. Many of them are unsound, but because of that they can find more potential issues in finite time, leaving the interpretation to the developer. I believe the example you mention talks about sound analysis.
Precision — this is a rare property. Compilers and code analyzer should be pessimistic, because otherwise some incorrect code could sneak in. But this might be parameterizable. E.g., if the compiler/analyzer supports constant propagation and folding, and all of the variables in the example are set to some known constants before the condition, it can figure out the exact value of x after it, and be completely precise.

Prover9 "Some, but not all, of the requested proofs were found"

I'm running some lattice proofs through Prover9/Mace4. Prover9's saying Exit: Time limit. plus the message in the Title.
I've doubled the time limit from 60 to 120 seconds. Same message (in twice the time). The weird thing is:
there's only one statement to prove. That is, only one label(goal) in the report (what's with the but not all?)
it does seem to have completed the proof, in that it shows last line $F.
Mace4 can't find any counter-examples (I upped its time to 120 seconds).
I've found some GHits for that message, but they seem to be all in Chinese(?)
It's possible the axioms I've given are (mutually) recursive -- I'm trying to introduce a function and a nominated 'absorbing element' [**]; and that solving will need infinitary unification. Does Prover9 do that?
I'm happy to add the axioms and goal to this message. (I'm using a non-standard way to define the meet and join.) But first, are there any sanity checks I should go through?
[**] the absorbing element is neither lattice top nor lattice bottom; more like lattice left-corner. (The element will be lattice bottom just in case the lattice degenerates to two elements.) The function is a partial ordering 'at right angles' to top/bottom. The lattice I expect to be neither complemented nor distributive (again except when 2 elements).
I've reproduced this after much trying, but only by setting some strange option that I'm sure I wouldn't have touched. (The only option I usually change is the Time limit, and I Reset to defaults quite often, so that would have blatted any evidence.)
Here's my guess for what happened.
what's with the but not all?
You can enter multiple goals (providing they're all positive). [**]
With strange option settings, if Prover9 can prove the first but not the second, it'll keep trying until exhausted; but then only report the successful one -- with a $F. result OK.
If you double the Time limit, it'll still prove the first and still keep on trying for the second -- taking twice the time for the same outcome.
Mace4 will come across the first goal, and use up its time trying for a counter-example. There isn't one because it's provable. Again, doubling its Time limit will get the same outcome after twice as long.
[Note **] It's never that I intend to set multiple goals; but when I'm hacking/experimenting with axioms, I keep all the goals in the Goals: box so I can easily toggle un/comment. I guess I didn't comment-out one when I was uncommenting another.
The behaviour usually, as described in the manual, is Prover9 reports success at the first goal it proves; doesn't go on to other goals. If there's multiple provable goals, it seems to choose the easiest/quickest(?) irrespective of position in the file.
But with max_proofs set to more than default 1, Prover9 will keep trying. (There's also a auto_denials flag that has something to do with it I don't understand.)
I've no idea how I set max_proofs -- I didn't recognise the Options/Limits sub-screen when I eventually found it. Weird.

Writing a program that writes a program

Its well known in theoretical computer science that the "Hello world tester" program is an undecidable problem.(Here is a link what i mean by hello world tester
My question is since given a program as input we can't say what the program will do,can we solve the reverse problem:
Given set of input and output,is there an algorithm for writing a program that writes a program to achieve a one to one mapping between the given input and output.
I know about metaprogramming but my question is more of theoretical interest. Something which can apply for a generic case.
With these kind of musings one has to be very careful. A lot of confusion arises from not clearly distinguishing about a program x for which proposition P(x) holds or any program x for which proposition P(x) hold. As long as the set of programs for which P(x) holds is finite there always is a proof, of their correctness (although this proof may not be known).
At this point you also have to distinguish between programs, which are and can be known and programs which can only be shown to exist by full enumeration of all posibilities. Let's make an example:
Take 10 Programs, which take no input and may or may not terminate and produce "hello World". Then there is a program which decides exactly which of these programs are correct, and which aren't. Lets call these programs (x_1,...,x_10). Then take the programs (X_0,...,X_{2^10}) where X_i output true for program x_j if the j-th bit in the binary representation of i is set. One of these programs has to be the one which decides correctly for all ten x_i, there just might not be any way to ever figure out which one of these 100 X_js is the correct one (a meta-problem at this point).
This goes to show that considering finite sets of programs and input/output pairs one can always resolve to full enumeration and all halting-problem type of paradoxies instantly disappear. In your case the set of generated programs for each input is of size one and the set of input/output pairs is of finite size (because you have to supply it to the meta-program). Hence full enumeration solves your problem very simple and you can also easily proof both the correctness of the corrected program as well as the correctness of the meta-program.
Note: Since the set of generated programs is infinite, this is one of the few cases where you can proof P(x) for a infinite set of programs (actually you can proof P(x,input,output) for this set). This shows that the set being infinite is only a necessary, not a sufficient condition for this type of paradoxes to appear.
Your question is ambiguously phrased.
How would you specify "what a program should do"?
Any precise, complete, and machine-readable specification of a program's functionality is already a program.
Thus, the answer to your question is, a compiler.
Now, you're asking how to find a function based on a sample of its input and output.
That is a question about statistical analysis that I cannot answer.
Sounds like you want to generate a state machine that learns by being given an input sequence and then updates itself to produce the appropriate output sequence. Assuming your output sequences are always the same for the same input sequence it should be simple enough to write. If your output is not deterministic, such as changing the output depending on the time of day, then you cannot automatically generate a state machine.
Depends on what you mean by "one to one mapping". (And also, I suppose, "input" and "output".)
My guess is that you're asking whether, given an example of inputs and outputs for a given arbitrary program, can one devise an algorithm to write an equivalent program? If so, the answer is no. Eg, you could have a program with the inputs/outputs of 1/1, 2/2, 3/3, 4/4, and yet, if the next input value was 5, the output would be 3782. There's no way to know, from a given set of results, what the next result might be.
The question is underspecified since you did not say how the input and output are presented. For finite lists, the answer is "yes", as in this Python code:
def f(input,output):
print "def g():"
print " x = {" + ",".join(repr(x) + ":" + repr(y) for x,y in zip(input,output)) + "}"
print " print x[raw_input()]"
>>> f(['2','3','4'],['a','b','x'])
def g():
x = {'2':'a','3':'b','4':'x'}
print x[raw_input()]
>>> def g():
... x = {'2':'a','3':'b','4':'x'}
... print x[raw_input()]
...
>>> g()
3
b
for infinite sets how are you going to present them? If you show only a small sample of input this does not specify the whole algorithm. Guessing the best fit is undecidable. If you have a "magic blackbox" then there are continuum many mappings but only a countable number of programs, so it's impossible.
I think I agree with SLaks, but taking things from a different angle, what does a compiler do?
(EDIT: I see SLaks edited his original answer, which was essentially 'you're describing the identity function').
It takes a program in one source language that describes the intended behaviour of a program, and "writes" another program in a target language that exhibits that behaviour.
We could also think of this in terms of things like process refinement --- given an abstract specification, we can construct a refinement mapping to some "more concrete" (read: less non-deterministic, usually) implementation.
But based on your question, it's really very difficult to tell which of these you meant, if any.

Print 1 followed by googolplex number of zeros

Assuming we are not concerned about running time of the program (which is practically infinite for human mortals) and using limited amount of memory (2^64 bytes), we want to print out in base 10, the exact value of 10^(googolplex), one digit at a time on screen (mostly zeros).
Describe an algorithm (which can be coded on current day computers), or write a program to do this.
Since we cannot practically check the output, so we will rely on collective opinion on the correctness of the program.
NOTE : I do not know the solution, or whether a solution exists or not. The problem is my own invention. To those readers who are quick to mark this offtopic... kindly reconsider. This is difficult and bit theoretical but definitely CS.
This is impossible. There are more states (10^(10^100)) in the program than there are electrons in the universe (~10^80). Therefore, in our universe, there can be no such realization of a machine capable of executing the task.
First of all, we note that 10^(10^100) is equivalent to ((((10^10)^10)^...)^10), 100 times.
Or 10↑↑↑↑↑↑↑↑↑↑10.
This gives rise to the following solution:
print 1
for i in A(10, 100)
print 0
in bash:
printf 1
while true; do
printf 0
done
... close enough.
Here's an algorithm that solves this:
print 1
for 1 to 10^(10^100)
print 0
One can trivially prove correctness using Hoare logic:
There are no pre-conditions
The post condition is that a one followed by 10^(10^100) zeros are printed
The cycle's invariant is that the number of zeros printed so far is equal to i
EDIT: A machine to solve the problem needs the ability to distinguish between one googolplex of distinct states: each state is the result of printing one more zero than the previous. The amount of memory needed to do this is the same needed to store the number one googolplex. If there isn't that much memory available, this problem cannot be solved.
This does not mean it isn't a computable problem: it can be solved by a Turing machine because a Turing machine has a limitless amount of memory.
There definitely is a solution to this problem in theory, assuming of course you have a machine that is capable of producing that sort of output. I'm pretty sure that a googolplex is larger than the number of atoms in the universe, at least according to what the physicists tell us, so I don't think that any physically realizable model of computation could print it out. However, mathematically speaking, you could define a Turing machine capable of printing out the value by just giving it a googolplex-ish number of states and having each write a zero and then move to the next lower state.
Consider the following:
The console window to which you are printing the output will have a maximum buffer size.
When this buffer size is exceeded, anything printed earlier is discarded, and the user will not be able to scroll back to see it.
The maximum buffer size will be minuscule compared to a googolplex.
Therefore, if you want to mimic the user experience of your program running to completion, find the maximum buffer size of the console you will print to and print that many zeroes.
Hurray laziness!

Detecting infinite loop in brainfuck program

I have written a simple brainfuck interpreter in MATLAB script language. It is fed random bf programs to execute (as part of a genetic algorithm project). The problem I face is, the program turns out to have an infinite loop in a sizeable number of cases, and hence the GA gets stuck at the point.
So, I need a mechanism to detect infinite loops and avoid executing that code in bf.
One obvious (trivial) case is when I have
[]
I can detect this and refuse to run that program.
For the non-trivial cases, I figured out that the basic idea is: to determine how one iteration of the loop changes the current cell. If the change is negative, we're eventually going to reach 0, so it's a finite loop. Otherwise, if the change is non-negative, it's an infinite loop.
Implementing this is easy for the case of a single loop, but with nested loops it becomes very complicated. For example, (in what follows (1) refers to contents of cell 1, etc. )
++++ Put 4 in 1st cell (1)
>+++ Put 3 in (2)
<[ While( (1) is non zero)
-- Decrease (1) by 2
>[ While( (2) is non zero)
- Decrement (2)
<+ Increment (1)
>]
(2) would be 0 at this point
+++ Increase (2) by 3 making (2) = 3
<] (1) was decreased by 2 and then increased by 3, so net effect is increment
and hence the code runs on and on. A naive check of the number of +'s and -'s done on cell 1, however, would say the number of -'s is more, so would not detect the infinite loop.
Can anyone think of a good algorithm to detect infinite loops, given arbitrary nesting of arbitrary number of loops in bf?
EDIT: I do know that the halting problem is unsolvable in general, but I was not sure whether there did not exist special case exceptions. Like, maybe Matlab might function as a Super Turing machine able to determine the halting of the bf program. I might be horribly wrong, but if so, I would like to know exactly how and why.
SECOND EDIT: I have written what I purport to be infinite loop detector. It probably misses some edge cases (or less probably, somehow escapes Mr. Turing's clutches), but seems to work for me as of now.
In pseudocode form, here it goes:
subroutine bfexec(bfprogram)
begin
Looping through the bfprogram,
If(current character is '[')
Find the corresponding ']'
Store the code between the two brackets in, say, 'subprog'
Save the value of the current cell in oldval
Call bfexec recursively with subprog
Save the value of the current cell in newval
If(newval >= oldval)
Raise an 'infinite loop' error and exit
EndIf
/* Do other character's processings */
EndIf
EndLoop
end
Alan Turing would like to have a word with you.
http://en.wikipedia.org/wiki/Halting_problem
When I used linear genetic programming, I just used an upper bound for the number of instructions a single program was allowed to do in its lifetime. I think that this is sensible in two ways: I cannot really solve the halting problem anyway, and programs that take too long to compute are not worthy of getting more time anyway.
Let's say you did write a program that could detect whether this program would run in an infinite loop. Let's say for the sake of simplicity that this program was written in brainfuck to analyze brainfuck programs (though this is not a precondition of the following proof, because any language can emulate brainfuck and brainfuck can emulate any language).
Now let's say you extend the checker program to make a new program. This new program exits immediately when its input loops indefinitely, and loops forever when its input exits at some point.
If you input this new program into itself, what will the results be?
If this program loops forever when run, then by its own definition it should exit immediately when run with itself as input. And vice versa. The checker program cannot possibly exist, because its very existence implies a contradiction.
As has been mentioned before, you are essentially restating the famous halting problem:
http://en.wikipedia.org/wiki/Halting_problem
Ed. I want to make clear that the above disproof is not my own, but is essentially the famous disproof Alan Turing gave back in 1936.
State in bf is a single array of chars.
If I were you, I'd take a hash of the bf interpreter state on every "]" (or once in rand(1, 100) "]"s*) and assert that the set of hashes is unique.
The second (or more) time I see a certain hash, I save the whole state aside.
The third (or more) time I see a certain hash, I compare the whole state to the saved one(s) and if there's a match, I quit.
On every input command ('.', IIRC) I reset my saved states and list of hashes.
An optimization is to only hash the part of state that was touched.
I haven't solved the halting problem - I'm detecting infinite loops while running the program.
*The rand is to make the check independent of loop period
Infinite loop cannot be detected, but you can detect if the program is taking too much time.
Implement a timeout by incrementing a counter every time you run a command (e.g. <, >, +, -). When the counter reaches some large number, which you set by observation, you can say that it takes very long time to execute your program. For your purpose, "very long" and infinite is a good-enough approximation.
As already mentioned this is the Halting Problem.
But in your case there might be a solution: The Halting Problem is considering is about the Turing machine, which has unlimited memory.
In case you know that you have a upper limit of memory (e.g. you know you dont use more than 10 memory cells), you can execute your programm and stop it. The idea is that the computation space bounds computation time (as you cant write more than one cell at one step). After you executed as much steps as you can have different memory configurations, you can break. E.g. if you have 3 cells, with 256 conditions, you can have at most 3^256 different states, and so you can stop after executing that many steps. But be careful, there are implicit cells, like the instruction pointer and the registers. You do it even shorter, if you save every state configuration and as soon as you detect one, which you already had, you have an infite loop. This approach is definitly much better in the run time, but therefor needs much more space (here it might be suitable to hash the configurations).
This is not the halting problem, however, it is still not reasonable to try to detect halting even in such a limited machine as a 1000 cell BF machine.
Consider this program:
+[->[>]+<[-<]+]
This program will not repeat until it has filled up the entire of memory which for just 1000 cells will take about 10^300 years.
If I remember correctly, the halting problem proof was only true for some extreme case that involved self reference. However it's still trivial to show a practical example of why you can't make an infinite loop detector.
Consider Fermat's Last Theorem. It's easy to create a program that iterates through every number (or in this case 3 numbers), and detects if it's a counterexample to the theorem. If so it halts, otherwise it continues.
So if you have an infinite loop detector, it should be able to prove this theorem, and many many others (perhaps all others, if they can be reduced to searching for counterexamples.)
In general, any program that involves iterating through numbers and only stopping under some condition, would require a general theorem prover to prove if that condition can ever be met. And that's the simplest case of looping there is.
Off the top of my head (and I could be wrong), I would think it would be a little bit difficult to detect whether or not a program has an infinite loop without actually executing the program itself.
As the conditional execution of portions of the program depends on the execution state of the program, it will be difficult to know the particular state of the program without actually executing the program.
If you don't require that a program with an infinite loop be executed, you could try having an "instructions executed" counter, and only execute a finite number of instructions. This way, if a program does have an infinite loop, the interpreter can terminate the program which is stuck in an infinite loop.

Resources