I understand that it's possible to represent some algorithms as FSMs, but can FSMs describe every possible algorithm?
No. Intuitively, an algorithm can be represented as an FSM only if it uses a finite amount of state. For instance, you couldn't sort an arbitrary-length list with an FSM.
Now, add an unbounded amount of state to an FSM -- like an infinite one-dimensional array of values... and add a little bit of "glue" state between the FSM and the array -- a notion of "current position" in that array... and you've got a Turing machine. Which, yes, can do it all.
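To make that recipe concrete, here is a minimal sketch in Python (the states, symbols, and the bit-flipping rule are made up purely for illustration): a finite transition table, an unbounded tape, and the one piece of "glue" state, a head position.

    from collections import defaultdict

    def run_tm(tape_input):
        tape = defaultdict(lambda: "_")   # unbounded tape; "_" is the blank symbol
        for i, ch in enumerate(tape_input):
            tape[i] = ch
        head, state = 0, "scan"           # the "glue": a current position on the tape

        # Finite control: (state, symbol) -> (new state, symbol to write, head move)
        delta = {
            ("scan", "0"): ("scan", "1", +1),
            ("scan", "1"): ("scan", "0", +1),
            ("scan", "_"): ("halt", "_", 0),
        }
        while state != "halt":
            state, tape[head], move = delta[(state, tape[head])]
            head += move
        return "".join(tape[i] for i in range(len(tape_input)))

    print(run_tm("0110"))   # -> "1001"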
No.
For every regular language, there is a finite state machine that accepts it.
For non-regular languages, finite state machines are not enough.
The set of all languages that programs can recognize is called the "recursively enumerable" languages, and these can be accepted by a Turing machine.
This is often referred to as the Chomsky hierarchy:
Regular Languages <= Context-Free Languages <= Context-Sensitive Languages <= Recursively Enumerable Languages
Which are accepted by:
Regular languages: Finite State Machine
Context-Free Languages: Push-down automata
Context-Sensitive Languages: Linear bounded automata (Turing machines whose tape is bounded by the input length)
Recursively enumerable Languages: Turing Machines
It is important to note that a machine that can accept all "higher tier" languages can also accept all the lower tiers (for example, you can create a Turing machine to accept each regular language).
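To make the bottom tier concrete, here is a minimal sketch in Python of a finite state machine for one particular regular language: binary strings with an even number of 1s (the language choice is just for illustration).

    # DFA for the regular language "binary strings with an even number of 1s".
    # Two states suffice because the machine needs only one bit of memory.
    def accepts(s):
        state = "even"                       # start state, also the accepting state
        delta = {
            ("even", "0"): "even", ("even", "1"): "odd",
            ("odd",  "0"): "odd",  ("odd",  "1"): "even",
        }
        for ch in s:
            state = delta[(state, ch)]
        return state == "even"

    print(accepts("1011"))   # False (three 1s)
    print(accepts("1001"))   # True  (two 1s)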
I am currently enrolled in the undergraduate version of Theory of Computation at my university and we are discussing Pushdown Automata, Turing Machines, and the Church-Turing Thesis.
In our homework problem sets, my professor asked the following question:
Construct a PDA for the language {w w^R | w ∈ {0,1}*}.
However, in our class lecture slides, he wrote the following
Not every nondeterministic PDA has an equivalent deterministic PDA. For example, there does exist a nondeterministic PDA recognizing the following language, but no deterministic PDA can recognize this language:
So, my question is whether it is actually possible to write this deterministic PDA or not. I have tried researching it online for the past two hours, but have found no posts which discuss this problem specifically.
There is no deterministic PDA for this language, which is the same as saying there is no deterministic context-free grammar for it. The non-deterministic grammar in ABNF meta-syntax is:
a = ["0" a "0" | "1" a "1"]
The inability to create a deterministic PDA comes from the fact that the decision to accept or reject the input depends on the length of the input (a palindrome in this case). A PDA has no machinery that enables it to make decisions based on the length: it cannot know that "half of the input has been read by now".
Imagine that for every input character "0"/"1" you push "0"/"1" onto the PDA's stack and remain in the same state. Then, when an input character is the same as the previous one, you face a decision: should you push it onto the stack and remain in the same state, or should you start recognizing the reverse of the previously read input characters? There is no way to know which decision to make based only on the input character itself, the last character pushed onto the PDA's stack (which is the same as the current input character), and the current state of the PDA. You would need to know the length of the input: if you are at its middle, you can start to accept the reverse; otherwise you remain in the same state.
There is nothing you can rearrange to make a deterministic decision possible. The only option that remains is to explore both decision paths whenever the choice arises, which is exactly what non-determinism provides.
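To see that guess in action, here is a sketch in Python (not from the thread, just illustrative) that simulates the nondeterministic PDA by brute-forcing its single guess, namely where the middle of the input is; everything else (push on the way in, pop-and-match on the way out) is deterministic.

    # Simulate the nondeterministic PDA for { w w^R | w in {0,1}* } by trying
    # every possible "middle" position -- the one guess the PDA must make.
    def accepts(s):
        for mid in range(len(s) + 1):
            stack = []
            ok = True
            for ch in s[:mid]:          # phase 1: push the first part
                stack.append(ch)
            for ch in s[mid:]:          # phase 2: pop and match the reverse
                if not stack or stack.pop() != ch:
                    ok = False
                    break
            if ok and not stack:
                return True
        return False

    print(accepts("0110"))   # True:  w = "01"
    print(accepts("0101"))   # False: no split point works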
It seems to be a matter of computer science lore that data and computation (or data and process, whatever you want to call it) are in some vague sense duals of each other: data is generated by computation but also guides future computation, and so the two are, vaguely, two sides of the same coin. This duality is more apparent in programming languages like Lisp which purposefully blur the line between the two.
I'm wondering whether this notion of duality has been studied in complexity theory in a rigorous setting. For instance, are there any computational models in which this duality arises naturally out of some deeper duality intrinsic to the model? For instance--and this is wishful thinking bordering on the nonsensical--if, say, we equated data with the states of a DFA and process with the DFA's transition function, and then the graph-dual of the DFA would yield another DFA related to the original in some meaningful way, then the data/computation duality would emerge naturally from the underlying model.
That sort of thing. Any pointers to research in the area (or even just keywords) are appreciated.
I never took a formal GA course, so this question might be vague: I'm trying to see whether I'm approaching this problem well.
Usually a genome is represented as a sequence of homogeneous elements, such as binary numbers, logic gates, elementary functions, etc., which can then be assembled into a homogeneous structure like a syntax-tree for a computer program or a 3D object or whatever.
My problem involves evolving a graph of components, let's say X, Y and Z: the graph can have N nodes and each node is an instance of either X, Y or Z. Encoding such a graph structure in a genome is rather straightforward; however, I also need to attach additional information for what X, Y and Z do themselves, which is actually the main object of the GA.
So it seems like my genome should code for a heterogeneous entity: an entity which is composed both of a structure graph and a functionality specification. It is not impossible to subsume the elements (genes) which code for the structure and those that code for functionality under a single parent "gene", and then simply separate them when the entity is being assembled, but this doesn't feel like the right approach.
Is this a common problem in GA? Am I supposed to find a "lower-level" representation / genome encoding in this situation? What are the relevant considerations?
Yes, you can do that with a GA, but strictly speaking you will be using Genetic Programming (GP) as opposed to Genetic Algorithms. GP is considered a special case of GA where the genome representation is heterogeneous. This means your individual is a "computer program" instead of just "raw data", and it means you can really get creative about what this "computer program" means and how to represent and handle it.
Regarding the additional information, it should be fine as long as all your genetic operators take this representation into account. For instance, your crossover could be prepared to exchange half of the tree and half of the additional information of the parents. If for some reason the additional information cannot be divided, your crossover may instead clone it from one of the parents.
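A minimal sketch of such a crossover in Python (all names and the genome layout here are hypothetical): the structure part gets one-point crossover, and the functionality part is cloned from one parent, as if it could not be divided.

    import random

    # Hypothetical heterogeneous genome: a structure part (list of node types)
    # and a functionality part (parameters defining what X/Y/Z actually do).
    def crossover(parent_a, parent_b):
        struct_a, func_a = parent_a
        struct_b, func_b = parent_b

        # Structure: one-point crossover on the node lists.
        cut = random.randint(1, min(len(struct_a), len(struct_b)) - 1)
        child_struct = struct_a[:cut] + struct_b[cut:]

        # Functionality: assume it cannot be divided, so clone one parent's.
        child_func = dict(random.choice([func_a, func_b]))
        return (child_struct, child_func)

    a = (["X", "Y", "Z", "X"], {"X_gain": 0.5, "Y_bias": 1.2})
    b = (["Y", "Y", "X", "Z"], {"X_gain": 0.9, "Y_bias": 0.3})
    print(crossover(a, b))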
The main disadvantage of this highly tuned approach is that you probably can't use high level GA/GP frameworks out there (I'm just assuming, I don't know much about them).
I've read a couple of introductory sections of books as well as a few papers on both topics, and it looks to me that these two methods are pretty much exactly the same. That said, I haven't had the time to actually deeply research the topics yet, so I might be wrong.
What are the distinctions between genetic algorithms and evolution strategies? What makes them different, and where are they similar?
In evolution strategies, the individuals are coded as vectors of real numbers. On reproduction, parents are selected randomly, and the fittest offspring are selected and inserted into the next generation. ES individuals are self-adapting: the step size or "mutation strength" is encoded in the individual, so good parameters get into the next generation by selecting good individuals.
In genetic algorithms, the individuals are coded as integers. Selection is done by choosing parents with probability proportional to their fitness, so individuals must be evaluated before the first selection is done. Genetic operators work on the bit level (e.g. cutting a bit string into multiple pieces and interchanging them with the pieces of the other parent, or flipping single bits).
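A minimal sketch of those two bit-level operators in Python (the bit strings and the mutation rate are made up for illustration):

    import random

    # Classic GA operators on bit strings.
    def one_point_crossover(a, b):
        # Cut both parents at the same point and swap the tails.
        cut = random.randint(1, len(a) - 1)
        return a[:cut] + b[cut:], b[:cut] + a[cut:]

    def bit_flip_mutation(bits, rate=0.01):
        # Flip each bit independently with a small probability.
        return [bit ^ 1 if random.random() < rate else bit for bit in bits]

    a = [0, 1, 1, 0, 1, 0, 0, 1]
    b = [1, 1, 0, 0, 0, 1, 1, 0]
    c1, c2 = one_point_crossover(a, b)
    print(bit_flip_mutation(c1, rate=0.1))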
That's the theory. In practice, it is sometimes hard to distinguish between the two kinds of evolutionary algorithms, and you may need to create hybrid algorithms (e.g. integer (bit-string) individuals that encode the parameters of the genetic operators).
Just stumbled on this thread when researching Evolution Strategies (ES).
As Paul noticed before, the encoding is not really the difference here, as it is an implementation detail of specific algorithms, although real-valued coding does seem more common in ES.
To answer the question, we first need to take a small step back and look at the internals of an ES algorithm.
In ES there is a concept of endogenous and exogenous parameters of the evolution. Endogenous parameters are associated with individuals and are therefore evolved together with them; exogenous parameters are provided from "outside" (e.g. set constant by the developer, or set by a function/policy depending on the iteration number).
The individual k therefore consists of two parts:
y(k) - a set of object parameters (e.g. a vector of real/int values) which denotes the individual's genotype
s(k) - a set of strategy parameters (e.g. again a vector of real/int values) which can, for example, control statistical properties of mutation
Those two vectors are being selected, mutated, recombined together.
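A minimal sketch in Python of how the two parts travel together under mutation (the log-normal self-adaptation of the step size is the standard ES scheme; the constant tau and the sample individual are made up for illustration):

    import math, random

    # ES individual: object parameters y plus an endogenous strategy
    # parameter s (the mutation step size), which evolves along with y.
    def mutate(individual, tau=0.3):
        y, s = individual
        # Self-adaptation: perturb the step size log-normally first...
        s_new = s * math.exp(tau * random.gauss(0, 1))
        # ...then use the *new* step size to perturb the object parameters.
        y_new = [yi + s_new * random.gauss(0, 1) for yi in y]
        return (y_new, s_new)

    ind = ([0.5, -1.2, 3.0], 1.0)   # (y, s)
    print(mutate(ind))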
The main difference between GA and ES is that in classic GA there is no distinction between types of algorithm parameters. In fact, all the parameters are set from "outside", so in ES terms they are all exogenous.
There are also other minor differences: e.g. in ES the selection policy is usually one and the same, while in GA there are multiple different approaches which can be interchanged.
You can find a more detailed explanation here (see Chapter 3): Evolution strategies. A comprehensive introduction
In most newer textbooks on GA, real-valued coding is introduced as an alternative to the integer one, i.e. individuals can be coded as vectors of real numbers. This is called continuous parameter GA (see e.g. Haupt & Haupt, "Practical Genetic Algorithms", J. Wiley & Sons, 1998). So this is practically identical to ES real-number coding.
With respect to parent selection, there are many different strategies published for GAs. I don't know them all, but I assume selection among all (not only the best) has been used for some applications.
The main difference seems to be that a genetic algorithm represents a solution using a sequence of integers, whereas an evolution strategy uses a sequence of real numbers -- reference: http://en.wikipedia.org/wiki/Evolutionary_algorithm
As the wikipedia source (http://en.wikipedia.org/wiki/Genetic_algorithm) and @Vaughn Cato said, the difference between the two techniques lies in the implementation: ES use real numbers and GA use integers.
However, in practice I think you could use integers or real numbers in the formulation of your problem and in your program; it depends on you. For instance, for protein folding you can say the set of dihedral angles forms a vector. This is a vector of real numbers, but the entries are labeled by integers, so I think you can formulate your problem and write your program based on integer arithmetic. It is just an idea.
My copy of The Design and Analysis of Computer Algorithms has arrived today. In the first chapter, the author introduced Turing Machines. I have two other algorithms textbooks, Introduction to Algorithms and The Algorithm Design Manual, but none of them talks about Turing machines, even though they are famous on the subject of algorithms and data structures.
I would like to understand the relation between Turing machines and algorithms/data structures. Is it really important to understand Turing machines to become an expert in algorithms?
Turing machines are just theoretical tools to analyze computation, i.e. we can specify an algorithm by creating a Turing machine which computes it. They are very useful in the study of computability, that is, whether it is possible at all to compute a function. Turing machines and several other formal language constructs are discussed in the classic book by Hopcroft and Ullman. Turing machines also appear when discussing NP-completeness, for instance in the book by Garey and Johnson.
Both books, and Turing machines in general, are pretty theoretical. If you are interested in algorithms in an academic way, I'd recommend them. However, if you want a practical understanding of algorithms implemented on real computers and run on real data, then I'd say it's only important to have a cursory understanding of Turing machines.
The reason that Turing machines are of importance when describing data structures and algorithms is that they provide a mathematical model in which we can describe what an algorithm is. Most of the time, algorithms are described using high-level language or pseudocode. For example, I might describe an algorithm to find the maximum value in an array like this:
Set max = -infinity
For each element in the array:
    If that element is greater than max:
        Set max equal to that element.
From this description it's easy to see how the algorithm works, and it would be easy to translate it into source code. However, suppose that I had written out this description:
Guess the index at which the maximum element occurs.
Output the element at that position.
Is this a valid algorithm? That is, can we say "guess the index" and rigorously define what it means? If we do allow this, how long will it take to do this? If we don't allow it, why not? What's different about the first description from the second?
In order to have a mathematically rigorous definition of an algorithm, we need to have some formal model of how a computer works and what it can and cannot do. The Turing machine is one common way to formally define computation, though there are others that can be used as well (register machines, string rewriting systems, Church's lambda calculus, etc.) Once we have defined a mathematical model of computation, we can start talking about what sorts of algorithmic descriptions are valid - namely, those that could be implemented using our model of computation.
Many modern algorithms depend critically on the properties of the underlying model of computation. For example, cache-oblivious algorithms assume that the model of computation has some memory buffer of an unknown size and a two-tiered memory. Some algorithms require that the underlying machine be transdichotomous, meaning that the size of a machine word must be at least large enough to hold the size of any problem. Randomized algorithms require a formal definition of randomness and how the machine can use random values. Nondeterministic algorithms require a means of specifying a nondeterministic computation. Algorithms based on circuits must know what circuit primitives are and are not allowed. Quantum computers need a formal definition of what operations are and are not allowed, along with what the definition of an algorithm is given that the output is probabilistic. Distributed algorithms need a formal definition of communication across machines.
In short, it's important to be explicit about what is and is not allowed when describing an algorithm. However, to be a good programmer or to have a solid grasp of algorithms, you don't necessarily need to know Turing machines inside and out, nor do you need to know about the specific details of how you'd encode particular problems using them. What you should know, though, is what the model of computation can and cannot do, and what the cost is per operation. This way, you can reason about how efficient algorithms are and how much of various resources (time, space, memory, communication, randomness, nondeterminism, etc.) they use. But that said, don't panic if you don't understand the underlying model.
There is one other reason to think about the underlying model of computation: discussing its limitations. Every model of computation has its limits, and in some cases you can prove that certain algorithms cannot possibly exist for certain problems, or that any algorithm that solves some problem necessarily must use some amount of a given resource. The most common example where this comes up in algorithm design is the notion of NP-hardness. Some problems are conjectured to be extremely "difficult" to solve, but the formal definition of this difficulty relies on knowledge of Turing machines and nondeterministic Turing machines. Understanding the model is useful in this case because it allows you to reason about the computational feasibility of certain problems.
Hope this helps!