Subset of programs for which we can determine if the program will halt - computation-theory

We know from the halting problem that there is, in general, no way to determine whether a program will halt. However, for certain programs we can certainly say that they will halt: a simple bounded for loop, for example.
So my question is: is there a general classification/taxonomy of programs for which we can determine whether they halt or not?
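To make the contrast concrete, here is a sketch in Python (the function names are just for illustration). The first loop's trip count is fixed before it starts, so it provably halts; whether the second always halts depends on the open Collatz conjecture:

```python
# A loop whose trip count is fixed in advance provably halts:
def bounded_sum(n):
    """Halts for every non-negative integer n: the loop runs exactly n times."""
    total = 0
    for i in range(n):  # range(n) is finite, so the loop must terminate
        total += i
    return total

# By contrast, no known argument settles termination for code like this,
# because it depends on the open Collatz conjecture:
def collatz_steps(n):
    steps = 0
    while n != 1:  # not known to terminate for every positive n
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(bounded_sum(5))    # 0+1+2+3+4 = 10
print(collatz_steps(6))  # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1: 8 steps
```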

Related

What is Undecidability and its practical application?

I have a vague understanding of undecidability. I get that a problem is undecidable when there exists no algorithm for it, or when the Turing machine will never halt, but I can't quite visualize it. Can someone explain it in a better way? Also, I don't get why we are learning about it or what its applications are. Can someone explain this topic?
The fundamental concept being explored here is decision problems. A decision problem is any problem that can be formulated as a yes/no (or true/false) question and expressed in a formal language, such as lambda calculus or the input to a Turing machine.
A problem is undecidable if there exists no algorithm that can give the right answer on every possible input. It might help to think about the converse of this: If we have some undecidable problem and Bob comes along and says he's got an algorithm that solves it (decides the problem), well then there's going to be at least one input that causes Bob's algorithm to give the wrong answer or loop infinitely.
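For contrast with the undecidable case, here is a sketch of a *decidable* decision problem in Python: a primality tester that answers correctly and halts on every input, which is exactly what decidability demands (the code is illustrative, not taken from the discussion above):

```python
# A decidable decision problem: "is n prime?" This decider gives the
# right yes/no answer and halts on every possible input.
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:   # trial division; at most sqrt(n) iterations
        if n % d == 0:
            return False
        d += 1
    return True

print([k for k in range(10) if is_prime(k)])  # [2, 3, 5, 7]
```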
The halting problem is a decision problem - an excellent example of one that is undecidable. In fact, it was the first decision problem to be shown to be undecidable.
For the sake of keeping things concrete, let's say Alice would like to have a program for her Turing machine that can analyze any program and predict with perfect accuracy whether or not it will halt. Bob comes along with his program, which he calls P for "the Perfect Program!" He claims it'll do the job.
Skeptical, Alice takes P and starts writing her own program, which she calls Q (for Quixotic). Like P, Q takes any program as its input. When it gets its input, Q immediately calls P and asks it what the input program is going to do.
If P says the input program is going to run forever, Q prints "Done!" and immediately halts.
If P says the input program is going to halt, Q prints "Looping!" and then enters an infinite loop from which it will never return.
What happens if we run Q and give it its own source code as input (i.e., we run Q(Q))?
If P says Q will run forever, Q immediately prints "Done!" and halts, so it can't do that. It would be giving the wrong answer.
If P says Q will halt, Q prints "Looping!" and then enters an infinite loop. So P would be giving the wrong answer here too.
So no matter what P does, it can't give the correct answer. Bob claimed it solved the halting problem (it always gives the right answer, on every input)! But doing so would produce a contradiction, as we see above. So, Bob can't be telling the truth about this program P of his. And in fact, it doesn't even matter what's actually in Bob's program. It could be simple, or it could be ridiculously complex. No matter how sophisticated or clever or elegant it is, it can't solve the problem. Not with 100% accuracy, as decision problems require. No program can. That's what undecidability means.
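The P-and-Q construction above can be sketched in Python. P here is only a stub: as the argument shows, no correct implementation of it can exist.

```python
# Bob's hypothetical halting decider. The whole point of the argument is
# that no correct implementation can exist, so this is only a stub.
def P(program_source):
    """Supposedly returns True if the program halts, False if it loops forever."""
    raise NotImplementedError("no such decider can exist")

# Alice's adversarial program Q, built exactly as described above.
def Q(program_source):
    if P(program_source):   # P predicts the input halts...
        print("Looping!")
        while True:         # ...so Q loops forever, making P wrong
            pass
    else:                   # P predicts the input loops forever...
        print("Done!")      # ...so Q halts immediately, again making P wrong
```

Feeding Q its own source (running Q(Q)) forces P into a contradiction either way, which is the heart of the proof.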
Addressing the "practical application" part of this question, compiler development is a good example of how undecidability comes up in real life.
Some compilers make every effort to identify bad behavior that will happen at runtime. For example, it would be pretty cool if your C compiler could always, 100% of the time, tell you whether the program you have written is going to loop infinitely or cause a segmentation fault. But it can't! If it could, the C compiler would be deciding an undecidable problem (the halting problem, in the case of infinite loops).
I'm not an expert in compilers, but I'd bet that having a pretty solid understanding of decidability is pretty important for researchers / engineers working on cutting edge compilers. Will you and I, as average Joes, have a practical application for decidability? Probably not. But it's still pretty cool!

Is Starlark Turing Complete?

The Starlark configuration language does not support infinite loops, recursion, or user-defined data types, but it does support functions. The docs indicate that this means the language is not Turing complete. I have forgotten a lot from my computer science classes on languages and automata theory.
Questions:
Is the lack of user-defined data types, infinite loops, and recursion enough to make a language Turing incomplete?
Is there a proof that Starlark is not Turing complete?
If a language is not Turing complete, does that mean that every program in it is guaranteed to halt eventually?
A Turing machine program (or any program in a Turing complete language) may never halt by falling into what is effectively an infinite loop. Ruling out nonterminating Turing machine programs is impossible (see Halting problem). Thus any language that seeks to ensure all programs terminate (such as Starlark) must sacrifice Turing completeness. See also total functional programming.
See above.
Not necessarily. There are other ways a language can be Turing incomplete besides lacking infinite loops. For example, a language where the only allowed program is `while True: pass` is not Turing complete, but its one program never terminates.
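Since Starlark's syntax is a subset of Python's, the terminating-by-construction style can be illustrated in plain Python; this is a sketch of the idea, not actual Starlark semantics:

```python
# Starlark-style iteration: a `for` loop may only walk a concrete, finite
# sequence, and there is no `while` and no recursion, so every program
# performs a bounded amount of work and must terminate.
def product(values):
    result = 1
    for v in values:   # `values` is a finite list, so this loop is bounded
        result *= v
    return result

print(product([2, 3, 4]))  # 24
```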

Computer programs from the point of view of CPU

This might sound a bit naive, but, I'm unable to find an appropriate answer to the question in my mind.
Let's say there is an algorithm X, which is implemented in 10 different programming languages. After the bootstrap stage of every program, each program executes the algorithm over and over again. My question is: would there be any difference at the hardware level when they all execute on the same CPU?
What I understand is that the set of hardware resources (registers, etc.) on each CPU is limited. Hence, executing the core algorithm should follow a similar (if not identical) pattern through the fetch–decode–execute cycle.

Is it possible to predict how long a program will take to run?

I was wondering, is it possible to do that with an arbitrary program? I heard that, with some mathematics, you can estimate how long a simple algorithm, like a sorting algorithm, will take to run; but what about more complex programs?
Once I visited a large cluster at a university, which runs programs from scientists all over the world. When I asked one of the engineers how they managed to schedule when each program would be run, he said that the researchers sent, along with their programs, an estimate of how long they would take to run, based on an analysis made beforehand by some program built for this purpose.
So, does this kind of program really exist? If not, how can I make a good estimate of the run time of my programs?
In general you cannot do this, because in general you cannot prove that a program will finish at all. This is known as the halting problem.
You're actually asking a few related questions at the same time, not just one single question.
Is there a Program A that, when given another arbitrary Program B as input, provides an estimate of how long it will take Program B to run? No. Absolutely not. You can't even devise a Program A that will tell you whether an arbitrary Program B will ever stop.
That second version (will Program B ever halt?) is called the Halting Problem, cleverly enough, and it has been proven to be undecidable. Wikipedia has a nice page on it, and the book Gödel, Escher, Bach is a very long but very conversational and readable exposition of the ideas involved in Gödel's Incompleteness Theorem, which is very closely related.
http://en.wikipedia.org/wiki/Halting_problem
http://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach
So if that's true, then how are the scientists coming up with those estimates? Well, they don't have arbitrary programs; they have particular programs that they have written. Halting is undecidable for programs in general, but that doesn't mean it is undecidable for every *particular* program. So, unless someone makes a serious error, the scientists aren't going to try to run a program that they haven't proven will stop.
Once they have proven that some program will stop, one of the major mathematical tools is Big O notation. At an intuitive level, Big O notation helps develop scaling laws for how the run time of a program varies with the size of its input. As a trivial example, suppose your program is a loop, and the loop takes one arbitrary unit of time to run. If you run the loop N times, it takes N units of time. But if that loop is itself inside another loop that runs N times, the whole thing takes N*N units of time. Those two programs scale very differently. (That's a trivial example, but real examples can get quite complicated.)
http://en.wikipedia.org/wiki/Big_oh
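The single-loop versus nested-loop example above can be made concrete by counting operations instead of timing them (a small illustrative sketch):

```python
def single_loop(n):
    count = 0
    for _ in range(n):      # body runs n times: O(n)
        count += 1
    return count

def nested_loop(n):
    count = 0
    for _ in range(n):
        for _ in range(n):  # inner body runs n times per outer pass: O(n^2)
            count += 1
    return count

# Doubling n roughly doubles single_loop's work but quadruples nested_loop's.
print(single_loop(1000))   # 1000
print(nested_loop(1000))   # 1000000
```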
That's a rather abstract, mathematical tool. Big O analyses are often so abstract that they simply assume all sufficiently low-level operations take "about" the same amount of time, and Big O doesn't give answers in terms of seconds, hours, or days anyway. In practice, real computers are also affected by hardware details, such as how long it takes to perform some very low-level operation, or worse, how long it takes to move information from one part of the machine to another, which is extremely important on multi-processor computers. So in practice, the insights from the Big O analysis are combined with detailed knowledge of the machine it will run on in order to come up with an estimate.
You should research Big O notation. While it does not give a fixed number, it tells you how performance will change for different input sizes. There are a few simple rules (if your code is in a loop of n iterations, then the loop takes roughly n times the cost of one iteration).
The problems are:
With complex programs, there are multiple variables affecting the run time.
It does not take user interaction into account.
The same goes for network latency, etc.
So this method works well for scientific programs, where the program is heavy on computation, using a well-studied algorithm (and often it is just running the same algorithm over different data sets).
You can't really.
For a simple algorithm you know if something is O(n) or O(n^2). Which one it is, you can guesstimate.
However, if you've got a program running tons of different algorithms, it becomes quite hard to guesstimate. What you can do, however, is predict the run time based on a previous run.
Assume you first estimate your program will run for one hour, but it runs for half an hour. If you change very little between builds/releases, then you'll know next time that it will run in somewhere around half an hour.
If you've made radical changes then it becomes harder to find the ETA :-]
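The "predict from a previous run" idea can be sketched as a naive extrapolation. The function and its complexity labels are purely illustrative, not from any real scheduler:

```python
import math

def estimate_runtime(prev_seconds, prev_n, new_n, complexity="n"):
    """Naively extrapolates one measured run to a new input size,
    assuming the asymptotic growth rate is known."""
    ratio = new_n / prev_n
    if complexity == "n":
        return prev_seconds * ratio
    if complexity == "n^2":
        return prev_seconds * ratio ** 2
    if complexity == "n log n":
        return prev_seconds * ratio * math.log(new_n) / math.log(prev_n)
    raise ValueError("unknown complexity class: " + complexity)

# A previous run on 1,000,000 items took 30 s; on 4,000,000 items an
# O(n^2) algorithm should take roughly 16x as long.
print(estimate_runtime(30, 1_000_000, 4_000_000, "n^2"))  # 480.0
```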

Could a program determine if another program plays chess?

I'm wondering about the following issue. I obviously don't expect any practical solutions but I would appreciate any developer's thoughts on this:
Would it be theoretically possible to have a program that opens other programs (for the sake of argument, let's say it opens .exe files) and determines whether or not a particular executable, when executed (with fixed input and machine state), plays a game of chess (amongst whatever other tasks it may perform)?
With 'play chess' I mean having some representation of a chess board and pieces, applying subsequent moves for black and white originating from a built-in chess AI engine.
Such a theoretical 'chess detection program' may contain a virtual machine or PC emulator or whatever else it needs to actually simulate the scanned executable. We can assume it runs on an arbitrarily fast computer with an arbitrarily large amount of RAM.
(Edit) Regarding the halting problem, I can solve that like this:
Load the program into a virtual machine, which has N bits of state (hard disk, memory, and CPU registers altogether). This virtual machine can be in at most 2^N different states.
Execute the program in the VM step by step. After each step, check if it halted.
If yes: problem solved (result: yes, it halts).
If no: take the current state of the virtual machine, and see if this state exists in a list of states we've already encountered before. If yes: problem solved (result: no it will run forever). If no: add this state to the list and continue.
Since there are at most 2^N different states that can occur, this algorithm will determine whether the program halts or not with certainty in finite time.
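The seen-states algorithm above can be sketched for a toy machine whose entire state is a single integer; `step` stands in for one instruction cycle of the VM, and the example machines are hypothetical:

```python
def halts_on_finite_machine(step, initial_state):
    """Decides halting for a machine with finitely many states.

    `step` maps a state to the next state, or to None when the machine
    halts. Because the state space is finite, the run either halts or
    revisits a state, at which point it is provably stuck in a cycle.
    """
    seen = set()
    state = initial_state
    while state is not None:
        if state in seen:   # same full machine state as before: infinite loop
            return False
        seen.add(state)
        state = step(state)
    return True             # reached the halt condition

# A 4-bit counter that halts when it would wrap around to 0:
print(halts_on_finite_machine(lambda s: None if (s + 1) % 16 == 0 else s + 1, 1))  # True
# A machine that flips between two states forever:
print(halts_on_finite_machine(lambda s: 1 - s, 0))  # False
```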
(Edit2) There seems to be some ambiguity about the (in)finiteness of the scanned executable or the (virtual) machine it runs on. Let's say the executables to be scanned may be at most 1 GB (which should be enough, since most chess programs are considerably smaller) and they're supposed to run on a PC (or VM) with 10 GB of RAM.
Our theoretical chess detector program can use an arbitrary amount of ram.
No, there is no such algorithm that can detect whether an executable plays chess.
The proof of this rests in the fact that the following problem (called the halting problem) cannot be solved by any algorithm:
Given a computer program, does that program eventually terminate?
We can show that if there was a computer program that could determine whether or not another program plays a game of chess, we could solve the halting problem. To do so, we would write a computer program that does the following:
Take as input some other computer program P.
Run program P.
If program P terminates, play a game of chess.
This program has the following interesting behavior: it plays a game of chess if and only if the program P terminates. In other words, if we could detect whether this program plays chess, we would be detecting whether or not the program P terminates. However, we know that this is provably impossible to do, so there must not be a program that detects whether some computer program plays chess.
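The reduction in the three steps above can be sketched in Python; the `plays_chess` detector stays hypothetical (it cannot exist), and `play_a_game_of_chess` is a trivial stand-in for real chess-playing code:

```python
def play_a_game_of_chess():
    """Trivial stand-in for any real chess-playing routine."""
    return "game of chess played"

def make_reduction_program(P):
    """Returns a program that plays chess if and only if P terminates."""
    def reduction():
        P()                          # if P never returns, chess is never reached
        return play_a_game_of_chess()
    return reduction

# A hypothetical `plays_chess(program)` detector applied to
# make_reduction_program(P) would decide whether P halts, which is
# impossible; hence no such detector can exist.
print(make_reduction_program(lambda: None)())  # game of chess played
```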
This general approach is called a reduction from the halting problem and can be used to show that a huge number of different problems are provably unsolvable.
Hope this helps, and sorry for the "no" answer!
In regards to your edited question: yes, if we limit the size of the memory so we only have finitely-many possible programs, we could theoretically enumerate every possible program and manually divide them into "chess-playing" and "non-chess playing" by whatever set of criteria you wanted.
In this case, we'd no longer have a Turing machine, so the Halting Problem doesn't apply. Instead, we'd have a finite state machine (and yes, this means in the real world, all computers are actually finite-state approximations of a Turing machine).
However, you added this limitation because you wanted to be "practical, not theoretical," so here's another bit of practicality for you: to enumerate all of the 256-bit programs (with a billion PCs, each of which enumerates a billion programs a second) would take significantly longer than the age of the universe. You can hardly imagine, then, how long it would take to enumerate all programs up to 1 GB (~8,000,000,000 bits).
Because of this, it is actually more practical to model real computers as Turing machines than as finite-state machines; and under this model, as @templatetypedef proved, it is impossible.
No. This is equivalent to the halting problem.
What does it mean for a program to play chess? I don't believe there exists a precise mathematical definition of the problem that couldn't be gamed and wouldn't be trivially equivalent to an undecidable problem.
For example, if you ask "Does there exist an encoding of moves under which this program plays chess?" then a bare Python interpreter plays chess - under the encoding that stipulates that you need to input:
a chess-playing Python program plus opponent's first move if you want it to play black
a chess-playing Python program if you want it to play white
If you fix the encoding, then the problem becomes boring. Chess games are finite (by the 50-move rule), so the only hard question is "does this program hang between moves on any of the finite set of chess games?". If it doesn't, always respects the encoding, and always makes valid moves (all of which is trivial to check), then it plays chess. Of course, checking whether it hangs is undecidable. Enumerating all chess games is computable in principle, but also totally impossible given practical considerations.
