Rigorous Formalization of the Data/Computation Duality? - complexity-theory

It seems to be a matter of computer science lore that data and computation (or data and process, whatever you want to call it) are in some vague sense duals of each other: data is generated by computation but also guides future computation, and so the two are, vaguely, two sides of the same coin. This duality is more apparent in programming languages like Lisp, which purposefully blur the line between the two.
I'm wondering whether this notion of duality has been studied rigorously in complexity theory. For instance, are there any computational models in which this duality arises naturally out of some deeper duality intrinsic to the model? For example--and this is wishful thinking bordering on the nonsensical--if, say, we equated data with the states of a DFA and process with the DFA's transition function, and the graph-dual of the DFA yielded another DFA related to the original in some meaningful way, then the data/computation duality would emerge naturally from the underlying model.
That sort of thing. Any pointers to research in the area (or even just keywords) are appreciated.

Related

Can genomes be heterogeneous and express entities with heterogeneous elements?

I never took a formal GA course, so this question might be vague: I'm trying to see whether I'm approaching this problem well.
Usually a genome is represented as a sequence of homogeneous elements, such as binary numbers, logic gates, elementary functions, etc., which can then be assembled into a homogeneous structure like a syntax-tree for a computer program or a 3D object or whatever.
My problem involves evolving a graph of components, let's say X, Y, and Z: the graph can have N nodes, and each node is an instance of either X, Y, or Z. Encoding such a graph structure in a genome is rather straightforward; however, I also need to attach additional information about what X, Y, and Z do themselves--which is actually the main objective of the GA.
So it seems like my genome should code for a heterogeneous entity: an entity which is composed both of a structure graph and a functionality specification. It is not impossible to subsume the elements (genes) which code for the structure and those that code for functionality under a single parent "gene", and then simply separate them when the entity is being assembled, but this doesn't feel like the right approach.
Is this a common problem in GA? Am I supposed to find a "lower-level" representation / genome encoding in this situation? What are the relevant considerations?
Yes, you can do that with a GA, but strictly speaking you will be using Genetic Programming (GP) as opposed to Genetic Algorithms. GP is considered a special case of GA where the genome representation is heterogeneous. This means your individual is a "computer program" instead of just "raw data" (see here and here), and that you can get really creative about what this "computer program" means, how to represent it, and how to handle it.
Regarding the additional information, it should be fine as long as all your genetic operators respect this representation. Take your crossover, for instance: it could be written to exchange half of the tree and half of the additional information between the parents. If for some reason the additional information cannot be divided, your crossover may instead clone it from one of the parents.
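As a rough illustration, here is a minimal C# sketch of that idea: the structure graph and the functionality specification live in separate sub-genomes that are crossed over independently, and the part that can't be divided cleanly (the edge list here) is cloned from one parent, as suggested above. All type and member names are made up for illustration, and the usual System, System.Collections.Generic, and System.Linq imports are assumed.

enum NodeKind { X, Y, Z }

class Genome
{
    public List<NodeKind> Nodes = new List<NodeKind>();   // structure: which component each graph node is
    public List<(int From, int To)> Edges = new List<(int From, int To)>(); // structure: connections
    public List<double> Behavior = new List<double>();    // functionality spec: what X/Y/Z actually do

    public Genome Crossover(Genome other, Random rnd)
    {
        var child = new Genome();

        // Structure sub-genome: one-point crossover on the node list.
        int cut = rnd.Next(Math.Min(Nodes.Count, other.Nodes.Count) + 1);
        child.Nodes.AddRange(Nodes.Take(cut));
        child.Nodes.AddRange(other.Nodes.Skip(cut));

        // The edge list can't be split without repair, so clone it from
        // one parent and drop edges that point past the child's last node.
        var source = rnd.Next(2) == 0 ? this : other;
        child.Edges.AddRange(source.Edges.Where(
            e => e.From < child.Nodes.Count && e.To < child.Nodes.Count));

        // Functionality sub-genome: uniform crossover, gene by gene.
        int len = Math.Min(Behavior.Count, other.Behavior.Count);
        for (int i = 0; i < len; i++)
            child.Behavior.Add(rnd.Next(2) == 0 ? Behavior[i] : other.Behavior[i]);

        return child;
    }
}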
The main disadvantage of this highly tuned approach is that you probably won't be able to use the high-level GA/GP frameworks out there (I'm just assuming; I don't know much about them).

Human-interpretable supervised machine learning algorithm

I'm looking for a supervised machine learning algorithm that would produce transparent rules or definitions that can be easily interpreted by a human.
Most algorithms that I work with (SVMs, random forests, PLS-DA) are not very transparent. That is, you can hardly summarize the models in a table in a publication aimed at a non-computer scientist audience. What authors usually do is, for example, publish a list of variables that are important based on some criterion (for example, Gini index or mean decrease of accuracy in the case of RF), and sometimes improve this list by indicating how these variables differ between the classes in question.
What I am looking for is a relatively simple output in the style of "if (any of the variables V1-V10 > median or any of the variables V11-V20 < 1st quartile) and any of the variables V21-V30 > 3rd quartile, then class A".
Is there any such thing around?
Just to constrain my question a bit: I am working with highly multidimensional data sets (tens of thousands to hundreds of thousands of often collinear variables). So, for example, regression trees would not be a good idea (I think).
You sound like you are describing decision trees. Why would regression trees not be a good choice? Maybe not optimal, but they work, and they are the most directly interpretable models. Anything that works on continuous values works on ordinal values.
There's a tension between wanting an accurate classifier, and wanting a simple and explainable model. You could build a random decision forest model, and constrain it in several ways to make it more interpretable:
Small max depth
High minimum information gain
Prune the tree
Only train on "understandable" features
Quantize/round decision thresholds
The model won't be as good, necessarily.
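As a minimal sketch of what such interpretable output can look like, here is a hypothetical 1R-style learner in C#: it picks the single best threshold split over all variables and prints it as a human-readable rule. It will be far weaker than an SVM or a random forest, and with tens of thousands of collinear variables you would want feature screening first; all names here are illustrative, and using System and System.Linq are assumed.

// X[i] is the i-th sample's variable vector; isClassA[i] is its label.
static void LearnOneRule(double[][] X, bool[] isClassA, string[] varNames)
{
    int bestVar = 0; double bestThresh = 0; int bestCorrect = -1; bool bestAbove = true;
    for (int v = 0; v < X[0].Length; v++)
    {
        // Try every observed value of variable v as a candidate threshold.
        foreach (double t in X.Select(row => row[v]).Distinct())
        {
            // Count correct predictions of the rule "x[v] > t => class A".
            int above = 0;
            for (int i = 0; i < X.Length; i++)
                if ((X[i][v] > t) == isClassA[i]) above++;
            // The inverted rule "x[v] <= t => class A" gets the rest right.
            int correct = Math.Max(above, X.Length - above);
            if (correct > bestCorrect)
            {
                bestCorrect = correct; bestVar = v; bestThresh = t;
                bestAbove = above >= X.Length - above;
            }
        }
    }
    string op = bestAbove ? ">" : "<=";
    Console.WriteLine($"if {varNames[bestVar]} {op} {bestThresh} then class A " +
                      $"(training accuracy {(double)bestCorrect / X.Length:P1})");
}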
You can find interesting research on making AI methods understandable in the work of Been Kim at Google Brain.

Methods for crossover in genetic algorithms

When reading about the crossover part of genetic algorithms, books and papers usually refer to methods of simply swapping out bits in the data of two selected candidates which are to reproduce.
I have yet to see actual code of a genetic algorithm implemented for real industry applications, but I find it hard to imagine that it's enough to operate on simple data types.
I always imagined that the various stages of genetic algorithms would be performed on complex objects involving complex mathematical operations, as opposed to just swapping out some bits in single integers.
Even Wikipedia just lists these kinds of operations for crossover.
Am I missing something important or are these kinds of crossover methods really the only thing used?
There are several techniques in use... although the need for parallelism and many generations (and sometimes a big population) pushes toward techniques that perform well...
Another point to keep in mind is that "swapping out some bits", when modeled correctly, is a simple and rather accurate version of what happens naturally (recombination of genes, mutations)...
For a very simple and nicely written walkthrough see http://www.electricmonk.nl/log/2011/09/28/evolutionary-algorithm-evolving-hello-world/
For some more info see
http://www.codeproject.com/KB/recipes/btl_ga.aspx
http://www.codeproject.com/KB/recipes/genetics_dot_net.aspx
http://www.codeproject.com/KB/recipes/GeneticandAntAlgorithms.aspx
http://www.c-sharpcorner.com/UploadFile/mgold/GeneticAlgorithm12032005044205AM/GeneticAlgorithm.aspx
I always imagined that the various stages of genetic algorithms would be performed on complex objects involving complex mathematical operations, as opposed to just swapping out some bits in single integers.
You probably think complex mathematical operations are used because you think the Genetic Algorithm has to modify a complex object. That's usually not how a Genetic Algorithm works.
So what does happen? Well, usually, the programmer (or scientist) will identify various parameters in a configuration, and then map those parameters to integers/floats. This does limit the directions in which the algorithm can explore, but that's the only realistic method of getting any results.
Let's look at evolving an antenna. You could perform a complex simulation with a genetic algorithm rearranging copper molecules, but that would be very complex and take forever. Instead, you'd identify antenna "parameters". Most antennas are built out of certain lengths of wire, bent at certain places in order to maximize their coverage area. So you could identify a couple of parameters: number of starting wires, section lengths, angles of the bends. All of those are easily represented as integer numbers, and are therefore easy for the Genetic Algorithm to manipulate. The resulting manipulations can be fed into an "antenna simulator" to see how well the result receives signals.
In short, when you say:
I find it hard to imagine that it's enough to operate on simple data types.
you must realize that simple data types can be mapped to much more intricate structures. The Genetic Algorithm doesn't have to know anything about these intricate structures. All it needs to know is how to manipulate the parameters that build them up. That is, after all, the way DNA works.
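To make the antenna example concrete, here is a hypothetical sketch of such a parameter mapping in C#. SimulateAntenna is an imagined external fitness evaluator, not a real API; the GA itself only ever touches the integers.

class AntennaGenome
{
    public int WireCount;          // number of starting wires
    public int[] SegmentLengthsMm; // section lengths, in millimetres
    public int[] BendAnglesDeg;    // bend angles, in degrees

    // The GA only nudges integers; the "complex object" (the antenna)
    // is rebuilt from them inside the simulator, never inside the GA.
    public void MutateOneParameter(Random rnd)
    {
        int i = rnd.Next(BendAnglesDeg.Length);
        BendAnglesDeg[i] = Math.Clamp(BendAnglesDeg[i] + rnd.Next(-10, 11), 0, 180);
    }
}

// Fitness evaluation, conceptually: double fitness = SimulateAntenna(genome);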
In genetic algorithms, bit swapping of some variety is usually used.
As you have said:
I always imagined that the various stages of genetic algorithms would be performed on complex objects involving complex mathematical operations
What I think you are looking for is Genetic Programming, where the chromosome describes a program; in that case you would be able to do more with the operators when applying crossover.
Also make sure you understand the difference between the fitness function in genetic algorithms and the operators within a chromosome in genetic programming.
Different applications require different encodings. The goal certainly is to find the most effective encoding, and often enough the simple encodings are the better suited ones. For example, a job shop scheduling problem might be represented as a list of permutations which represent the execution order of the jobs on the different machines (a so-called job sequence matrix). It can, however, also be represented as a list of priority rules that construct the schedule. A traveling salesman problem or quadratic assignment problem is typically represented by a single permutation that denotes a tour in one case or an assignment in the other. Optimizing the parameters of a simulation model or finding the root of a complex mathematical function is typically represented by a vector of real values.
For all of those still-simple types, crossover and mutation operators exist. For permutations these are e.g. OX, ERX, CX, PMX, UBX, OBX, and many more. If you can combine a number of simple representations to represent a solution to your complex problem, you might reuse these operators and apply them to each component individually.
For crossover to work effectively, a few properties should be fulfilled:
The crossover should conserve those parts that are similar in both parents
For those parts that are not similar, the crossover should not introduce an element that is not already part of one of the parents
The crossover of two solutions should, if possible, produce a feasible solution
You want to avoid so-called unwanted mutations in your crossovers. In that light, you also want to avoid having to repair a large part of your chromosomes after crossover, since repair also introduces unwanted mutations.
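As a concrete example of an operator that satisfies these properties, here is a minimal C# sketch of order crossover (OX) for permutation genomes such as TSP tours: the child keeps a contiguous slice from one parent verbatim and fills the rest in the other parent's order, so no element is invented and no repair is needed. It assumes both parents are permutations of 0..n-1.

static int[] OrderCrossover(int[] p1, int[] p2, Random rnd)
{
    int n = p1.Length;
    int a = rnd.Next(n), b = rnd.Next(n);
    if (a > b) (a, b) = (b, a);

    var child = new int[n];
    var used = new bool[n];

    // Conserve a contiguous slice of parent 1 verbatim.
    for (int i = a; i <= b; i++) { child[i] = p1[i]; used[p1[i]] = true; }

    // Fill the remaining positions in parent 2's order, starting after the
    // slice and skipping elements already taken: nothing new is introduced,
    // so the child is always a feasible permutation.
    int pos = (b + 1) % n;
    for (int k = 0; k < n; k++)
    {
        int gene = p2[(b + 1 + k) % n];
        if (used[gene]) continue;
        child[pos] = gene;
        used[gene] = true;
        pos = (pos + 1) % n;
    }
    return child;
}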
If you want to experiment with different operators and problems, we have a nice GUI-driven piece of software: HeuristicLab.
Simple bit swapping is usually the way to go. The key thing to note is the encoding used in each candidate solution. Solutions should be encoded such that there is minimal or no error introduced into the new offspring; any error would require the algorithm to provide a fix, which leads to increased processing time.
As an example, I have developed a university timetable generator in C# that uses an integer encoding to represent the timeslots available in each day. This representation allows a very efficient single-point or multi-point crossover operator, which uses the LINQ Intersect function to combine parents.
Typical multipoint crossover with hill-climbing
public List<TimeTable> CrossOver(List<TimeTable> parents) // Multipoint crossover
{
    var baby1 = new TimeTable { Schedule = new List<string>(), Fitness = 0 };
    var baby2 = new TimeTable { Schedule = new List<string>(), Fitness = 0 };
    for (var gen = 0; gen < parents[0].Schedule.Count; gen++)
    {
        // At each gene, swap the parents' contributions with probability CrossOverProb.
        if (rnd.NextDouble() < (double) CrossOverProb)
        {
            baby2.Schedule.Add(parents[0].Schedule[gen]);
            baby1.Schedule.Add(parents[1].Schedule[gen]);
        }
        else
        {
            baby1.Schedule.Add(parents[0].Schedule[gen]);
            baby2.Schedule.Add(parents[1].Schedule[gen]);
        }
    }
    CalculateFitness(ref baby1);
    CalculateFitness(ref baby2);
    // Allow hill-climbing: offspring compete with their parents and only
    // the two fittest individuals survive.
    parents.Add(baby1);
    parents.Add(baby2);
    return parents.OrderByDescending(i => i.Fitness).Take(2).ToList();
}

Algorithms for realtime strategy wargame AI

I'm designing a realtime strategy wargame where the AI will be responsible for controlling a large number of units (possibly 1000+) on a large hexagonal map.
A unit has a number of action points which can be expended on movement, attacking enemy units or various special actions (e.g. building new units). For example, a tank with 5 action points could spend 3 on movement then 2 in firing on an enemy within range. Different units have different costs for different actions etc.
Some additional notes:
The output of the AI is a "command" to any given unit
Action points are allocated at the beginning of a time period, but may be spent at any point within the time period (this is to allow for realtime multiplayer games). Hence "do nothing and save action points for later" is a potentially valid tactic (e.g. a gun turret that cannot move waiting for an enemy to come within firing range)
The game is updating in realtime, but the AI can get a consistent snapshot of the game state at any time (thanks to the game state being one of Clojure's persistent data structures)
I'm not expecting "optimal" behaviour, just something that is not obviously stupid and provides reasonable fun/challenge to play against
What can you recommend in terms of specific algorithms/approaches that would allow for the right balance between efficiency and reasonably intelligent behaviour?
If you read Russell and Norvig, you'll find a wealth of algorithms for every purpose, updated to pretty much today's state of the art. That said, I was amazed at how many different problem classes can be successfully approached with Bayesian algorithms.
However, in your case I think it would be a bad idea for each unit to have its own Petri net or inference engine... there's only so much CPU and memory and time available. Hence, a different approach:
While in some ways perhaps a crackpot, Stephen Wolfram has shown that it's possible to program remarkably complex behavior on a basis of very simple rules. He bravely extrapolates from the Game of Life to quantum physics and the entire universe.
Similarly, a lot of research on small robots is focusing on emergent behavior or swarm intelligence. While classic military strategy and practice are strongly based on hierarchies, I think that an army of completely selfless, fearless fighters (as can be found marching in your computer) could be remarkably effective if operating as self-organizing clusters.
This approach would probably fit a little better with Erlang's or Scala's actor-based concurrency model than with Clojure's STM: I think self-organization and actors would go together extremely well. Still, I could envision running through a list of units at each turn, and having each unit evaluating just a small handful of very simple rules to determine its next action. I'd be very interested to hear if you've tried this approach, and how it went!
EDIT
Something else that was in the back of my mind but slipped away while I was writing: I think you can get remarkable results from this approach if you combine it with genetic or evolutionary programming; that is, let your virtual toy soldiers wage war on each other as you sleep, let them encode their strategies and mix, match and mutate their code for those strategies, and let a refereeing program select the more successful warriors.
I've read about some startling successes achieved with these techniques, with units operating in ways we'd never think of. I have heard of AIs built on these principles that had to be intentionally dumbed down so as not to frustrate human opponents.
First, you should aim to make your game turn-based at some level for the AI (i.e. you can somehow model it as turn-based even if it isn't entirely; in an RTS you may be able to break time into discrete intervals that act as turns). Second, you should determine how much information the AI should work with: is it allowed to cheat and know every move of its opponent (thereby making it stronger), or should it know less? Third, you should define a cost function over states, the idea being that a higher cost means a worse state for the computer to be in. Fourth, you need a move generator that produces all valid states the AI can transition to from a given state (this may be homogeneous [state-independent] or heterogeneous [state-dependent]).
The thing is, the cost function will be greatly influenced by what exactly you define the state to be. The more information you encode in the state the better balanced your AI will be but the more difficult it will be for it to perform, as it will have to search exponentially more for every additional state variable you include (in an exhaustive search.)
If you provide a definition of a state and a cost function your problem transforms to a general problem in AI that can be tackled with any algorithm of your choice.
Here is a summary of what I think would work well:
1. Evolutionary algorithms may work well if you put enough effort into them, but they add a layer of complexity that creates room for bugs, among other things that can go wrong. They also require extreme amounts of tweaking of the fitness function, etc. I don't have much experience working with them, but if they are anything like neural networks (which I believe they are, since both are heuristics inspired by biological models) you will quickly find they are fickle and far from consistent. Most importantly, I doubt they add any benefit over the option I describe in 3.
2. With the cost function and state defined, it would technically be possible to apply gradient descent (assuming the state function is differentiable and the domains of the state variables are continuous); however, this would probably yield inferior results, since the biggest weakness of gradient descent is getting stuck in local minima. To give an example, this method would be prone to something like always attacking the enemy as soon as possible, because there is a non-zero chance of annihilating them. Clearly this may not be desirable behaviour for a game; however, gradient descent is a greedy method and doesn't know better.
3. This option would be my most highly recommended one: simulated annealing. Simulated annealing would (IMHO) have all the benefits of 1 without the added complexity, while being much more robust than 2. In essence, SA is just a random walk amongst the states, so in addition to the cost and states you will have to define a way to randomly transition between states. SA is also not prone to getting stuck in local minima, while producing very good results quite consistently. The only tweaking required with SA is the cooling schedule, which decides how fast SA converges. The greatest advantage of SA, I find, is that it is conceptually simple and empirically produces results superior to most other methods I have tried. Information on SA can be found here, with a long list of generic implementations at the bottom.
3b. (Edit: added much later) SA and the techniques listed above are general AI techniques, not really specialized to AI for games. In general, the more specialized the algorithm, the better its chance of performing well; see the No Free Lunch Theorem [2]. Another extension of 3 is something called parallel tempering, which dramatically improves the performance of SA by helping it avoid local optima. Some of the original papers on parallel tempering are quite dated [3], but others have been updated [4].
Regardless of which method you choose in the end, it's going to be very important to break your problem down into states and a cost function, as I said earlier. As a rule of thumb, I would start with 20-50 state variables, as your state search space is exponential in the number of these variables.
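For what it's worth, here is a minimal, generic C# sketch of simulated annealing under exactly those assumptions: you supply the cost function and a random-neighbor move generator, and the cooling schedule is a simple geometric one. All names and default parameters are illustrative.

static TState Anneal<TState>(TState start,
                             Func<TState, double> cost,
                             Func<TState, Random, TState> randomNeighbor,
                             double startTemp = 1.0,
                             double cooling = 0.995,
                             int steps = 10000)
{
    var rnd = new Random();
    TState current = start, best = start;
    double currentCost = cost(current), bestCost = currentCost;
    double temp = startTemp;

    for (int i = 0; i < steps; i++)
    {
        TState candidate = randomNeighbor(current, rnd);
        double delta = cost(candidate) - currentCost;

        // Always accept improvements; accept worse states with probability
        // exp(-delta / temp), which is what lets SA escape local minima.
        if (delta <= 0 || rnd.NextDouble() < Math.Exp(-delta / temp))
        {
            current = candidate;
            currentCost += delta;
            if (currentCost < bestCost) { best = current; bestCost = currentCost; }
        }
        temp *= cooling; // geometric cooling schedule
    }
    return best;
}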
This question is huge in scope. You are basically asking how to write a strategy game.
There are tons of books and online articles on this stuff. I strongly recommend the Game Programming Gems series and the AI Game Programming Wisdom series. In particular, Section 6 of the first volume of AI Game Programming Wisdom covers general architecture, Section 7 covers decision-making architectures, and Section 8 covers architectures for specific genres (8.2 does the RTS genre).
It's a huge question, and the other answers have pointed out amazing resources to look into.
I've dealt with this problem in the past and found the simple-behavior-manifests-complexly/emergent behavior approach a bit too unwieldy for human design unless approached genetically/evolutionarily.
I ended up instead using abstracted layers of AI, similar to the way armies work in real life. Units would be grouped with nearby units of the same type into squads, which are grouped with nearby squads to create a mini battalion of sorts. More layers could be used here (group battalions in a region, etc.), but ultimately at the top there is the high-level strategic AI.
Each layer can only issue commands to the layer directly below it. That layer will then attempt to execute the command with the resources at hand (i.e., the layers below it).
Examples of commands issued to a single unit are "go here" and "shoot at this target". Higher-level commands issued to higher levels would be things like "secure this location", which that level would process before issuing the appropriate commands to the lower levels.
The highest-level master AI is responsible for very broad strategic decisions, such as "we need more ____ units" or "we should aim to move towards this location".
The army analogy works here; commanders and lieutenants and chain of command.
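A minimal C# sketch of what this layered command flow might look like, with all names invented for illustration: each layer implements the same command interface and translates an order into orders for the layer below.

record Command(string Verb, object Target); // e.g. ("SecureLocation", someHex)

interface ICommandable { void Execute(Command order); }

class Unit : ICommandable
{
    public void Execute(Command order)
    {
        // Lowest level: spend action points on "go here" / "shoot at this target".
    }
}

class Squad : ICommandable
{
    public List<Unit> Units = new List<Unit>();

    public void Execute(Command order)
    {
        // Translate a squad-level order into unit-level orders; a real
        // implementation would assign roles, formations, and targets.
        foreach (var u in Units)
            u.Execute(new Command("MoveTo", order.Target));
    }
}
// Battalions would wrap squads the same way, with the strategic AI on top.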

Probability transition matrix

I'm working on Markov Chains and I would like to know of efficient algorithms for constructing probabilistic transition matrices (of order n), given a text file as input.
I am not after one algorithm; I'd rather like to build a list of such algorithms. Papers on such algorithms are also more than welcome, as are any tips on terminology, etc. Notice that this topic bears a strong resemblance to n-gram identification algorithms.
Any help would be much appreciated.
It sounds like there are two possible questions; you should clarify which one:
1. The 'text file' contains probability values and "n", and you build the matrix directly, but how do you code it? This question is trivial, so let's disregard it.
2. The 'text file' contains something like signal data, and you want to model it as a Markov chain.
'Markov chain' generally refers to a first-order stochastic process, so I'm not sure what you mean by "order"; probably the size of the matrix, but that is not typical terminology. Anyway, for a first-order, n x n matrix, discrete-time random process, you should look at the Viterbi algorithm: http://en.wikipedia.org/wiki/Viterbi_algorithm
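If interpretation 2 is what's meant, here is a minimal C# sketch that estimates a first-order transition matrix from a character sequence by row-normalizing bigram counts (for order n in the n-gram sense, key on the previous n characters instead of a single one). It assumes the usual System.Collections.Generic and System.Linq imports.

static Dictionary<char, Dictionary<char, double>> EstimateTransitions(string text)
{
    // Count bigram occurrences: counts[prev][next] = #times next follows prev.
    var counts = new Dictionary<char, Dictionary<char, int>>();
    for (int i = 0; i + 1 < text.Length; i++)
    {
        char prev = text[i], next = text[i + 1];
        if (!counts.TryGetValue(prev, out var row))
            counts[prev] = row = new Dictionary<char, int>();
        row[next] = row.TryGetValue(next, out var c) ? c + 1 : 1;
    }

    // Row-normalize so each state's outgoing probabilities sum to 1
    // (the maximum-likelihood estimate of the transition matrix).
    var probs = new Dictionary<char, Dictionary<char, double>>();
    foreach (var kv in counts)
    {
        double total = kv.Value.Values.Sum();
        probs[kv.Key] = kv.Value.ToDictionary(e => e.Key, e => e.Value / total);
    }
    return probs;
}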
Whenever I'm dealing with Markov models, I tend to end up looking at the crm114 Discriminator. For one thing, its author goes into great detail about what different models there actually are (Markov isn't always the best, depending on the application) and provides general links and lots of background information on how probabilistic models work. While crm114 is generally used as a kind of spam identification tool, it is actually a more generic probability engine that I have used in other applications.
