Chord detection algorithms? - algorithm

I am developing software that depends on musical chords detection. I know some algorithms for pitch detection, with techniques based on cepstral analysis or autocorrelation, but they are mainly focused on monophonic material recognition. But I need to work with some polyphonic recognition, that is, multiple pitches at the same time, like in a chord; does anyone know some good studies or solutions on that matter?
I am currently developing some algorithms based on the FFT, but if anyone has an idea on some algorithms or techniques that I can use, it would be of great help.

This is quite a good Open Source Project:
https://patterns.enm.bris.ac.uk/hpa-software-package
It detects chords based on a chromagram - a good solution, breaks down a window of the whole spectrum onto an array of pitch classes (size: 12) with float values. Then, chords can be detected by a Hidden Markov Model.
.. should provide you with everything you need. :)

The author of Capo, a transcription program for the Mac, has a pretty in-depth blog. The entry "A Note on Auto Tabbing" has some good jumping off points:
I started researching different methods of automatic transcription in mid-2009, because I was curious about how far along this technology was, and if it could be integrated into a future version of Capo.
Each of these automatic transcription algorithms start out with some kind of intermediate represenation of the audio data, and then they transfer that into a symbolic form (i.e. note onsets, and durations).
This is where I encountered some computationally expensive spectral representations (The Continuous Wavelet Transform (CWT), Constant Q Transform (CQT), and others.) I implemented all of these spectral transforms so that I could also implement the algorithms presented by the papers I was reading. This would give me an idea of whether they would work in practice.
Capo has some impressive technology. The standout feature is that its main view is not a frequency spectrogram like most other audio programs. It presents the audio like a piano roll, with the notes visible to the naked eye.
(source: supermegaultragroovy.com)
(Note: The hard note bars were drawn by a user. The fuzzy spots underneath are what Capo displays.)

There's significant overlap between chord detection and key detection, and so you may find some of my previous answer to that question useful, as it has a few links to papers and theses. Getting a good polyphonic recogniser is incredibly difficult.
My own viewpoint on this is that applying polyphonic recognition to extract the notes and then trying to detect chords from the notes is the wrong way to go about it. The reason is that it's an ambiguous problem. If you have two complex tones exactly an octave apart then it's impossible to detect whether there are one or two notes playing (unless you have extra context such as knowing the harmonic profile). Every harmonic of C5 is also a harmonic of C4 (and of C3, C2, etc). So if you try a major chord in a polyphonic recogniser then you are likely to get out a whole sequence of notes that are harmonically related to your chord, but not necessarily the notes you played. If you use an autocorrelation-based pitch detection method then you'll see this effect quite clearly.
Instead, I think it's better to look for the patterns that are made by certain chord shapes (Major, Minor, 7th, etc).

See my answer to this question:
How can I do real-time pitch detection in .Net?
The reference to this IEEE paper is mainly what you're looking for: http://ieeexplore.ieee.org/Xplore/login.jsp?reload=true&url=/iel5/89/18967/00876309.pdf?arnumber=876309
The harmonics are throwing you off. Plus, humans can find fundamentals in sound even when the fundamental isn't present! Think of reading, but by covering half of the letters. The brain fills in the gaps.
The context of other sounds in the mix, and what came before, is very important to how we perceive notes.

This is a very difficult pattern matching problem, probably suitable for an AI technique such as training neural nets or genetic algorithms.
Basically, at every point in time, you guess the number of notes being play, the notes, the instruments that played the notes, the amplitudes, and the duration of the note. Then you sum the magnitudes of all the harmonics and overtones that all those instruments would generate when played at that volume at that point in thier envelope (attack, decay, etc.). Subtract the sum of all those harmonics from the spectrum of you signal, then minimize the difference over all possibilities. Pattern recognition of the thump/squeak/pluck transient noise/etc. at the very onset of the note might also be important. Then do some decision analysis to make sure your choices make sense (e.g. a clarinet didn't suddenly change into a trumpet playing another note and back again 80 mS later), to minimize the error probability.
If you can constrain your choices (e.g. only 2 flutes playing only quarter notes, etc.), especially to instruments with very limited overtone energy, it makes the problem a lot easier.

Also http://www.schmittmachine.com/dywapitchtrack.html
The dywapitchtrack library computes the pitch of an audio stream in real time. The pitch is the main frequency of the waveform (the 'note' being played or sung). It is expressed as a float in Hz.
And http://clam-project.org/ may help a little.

This post is a bit old, but I thought I'd add the following paper to the discussion:
Klapuri,Anssi; Multipitch Analysis of Polyphonic Music and Speech Signals Using an Auditory Model; IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 2, FEBRUARY 2008 255
The paper acts somewhat like a literature review of multipitch analysis and discusses a method based on an auditory model:
(The image is from the paper. I don't know if I have to get permission to post it.)

Related

How to programmatically distinguish the professional photo from the amateur photo?

What are the different options and solutions (software) that will help distinguish professional (good) from amateur (bad) photo?
The criteria can be the contrast, sharpness, noise, presence of compression artifacts, etc. The question is, what are the tools that allow all this to determine it (the machine, not the man). For all of these criteria can be represented as mathematical models, you think?
Or in other words - to "feed" tool 1000 high-quality photos and 1000 substandard. And machine itself has identified the factors that distinguish the good from the bad image.
This is a quite vague definition of a problem. The only thing you have is 1000 high-quality photos and 1000 substandard photos. Your application however, is quite concrete, and I doubt (but I'm not sure) that you will find such a software.
Without looking to your images and have some tests is also difficult to say if contrast/gamma would be enough to classify them properly.
What you can do, if you know a bit of coding in matlab/python/C, is to use some existing libraries to try to solve your problem. I can't help you with that, as this itself is a quite tedious work, but, I can give you some insights.
To define your problem you will need:
Input: 1000 pro images, 1000 std images
You can represent this as 2000 images and a 2000 binary vector (1 for pro, 0 for std)
Features
Images itself might not give you enough information. What you can do is extract features from images. This step is called feature extraction and is an open research field in Computer Vision. There several feature extractors out there, you can try a couple of the most used ones, such as HoG or SIFT (have a look here for examples).
This feature extractors will give you a 1xM numerical vector for each image. With N images, you have a NxM matrix composed of N images and their descriptor.
Classification:
Once you managed to extract features from the image, having X = NxM data and y = label binary vector, you can use any machine learning algorithm, such as Deep Neural Networks, Random Forests, Supported Vector Machines, or any other one, to train your data, and classify it later.
By putting everything together, you might be able to get decent results.
Is this professional vs amateur photographer or equipment classification? I mean like distinguishing between a DLSR photo or a cell phone photo. Or is this distinguishing between an amateur with a DLSR and a professional with the same equipment? Or, are we talking about photoshop editing at the end.
In the case of equipment, I think features to look at would be noise, contrast, color gamut. In the case of skill of the photographer, you will probably have to look at features based on edge representations, natural scene metrics etc.
But, you will need to create a data matrix and then run a machine learning classification algorithm on it and hopefully find some features.
Are you making or looking for art for humans or for robots?
The definition of a great photo is subjective by definition. What to one person may be junk to another is genius. Trying to take that and turn it into an equation is asking for AI to take over humankind.
I don't mean to be harsh in my assessment. My degree is in Fine Art, Painting. I put a lot of time and effort into thinking about images. It's a wonder to me how a child might make an image that seemed to be thoughtless but was a breakthrough in my perception. Conversely you can work for hours, day, months or even years and then feel like the result just doesn't measure up.
Photography is an ingenious invention, therefore all photography is a source for amazement.
I do agree with you however. Some photos are truly impressive. I think that if we approach this as coders than what we are seeking is 'likes' or 'page views' or some other method of getting counts from many people. I know that is not the answer you were looking for but I don't think you can find a better one. I wish you well on your quest.
If you want to judge a photo on technicalities then the edge of the physics should be your target. Currently that would be mirrorless cameras and 3d imaging.

Drum sound recognition algorithms

I am thinking of trying to make program that will automatically generate drum tabs using an audio file containing only the drumming.
I have thought of using FFT to get an average spectrum peaks during a xxxx ms interval and then compare that to a table containing all the drum parts(snare, tombs, base drum and so no) of that specific drum kit and sound gear.
But i have a feeling that it won't be that easy. Have you guys any suggestions on which methods i could use to solve my issue?
// Eric
It isn't easy for anything except a trivial signal. Almost all western 'classical' and commercial music features coincident drum sounds.
1: Superposition: The original sources add together in a similar manner in the frequency domain as they do in the time domain. Each FFT bin contains contributions from all instruments currently being played (and those which are undamped and still decaying, or resonating sympathetically). Unpicking the various sources is hard - and certainly not a comparison with a library of spectra.
2: The FFT by its definition windows audio in the time domain and yields magnitude and phase of the basis function in each bin over that window period. The best you could say is that content appeared in the bin corresponding to a drum sound within the window period. If you were to compute a 1024 point FFT, the window duration would be 23ms at 44.1kHz. To put this into a musical perspective, 16th notes at 120bpm are 31.3ms apart. You might get away with smaller FFTs.
3: Percussion instrument signals tend to look a lot like noise - at least at the point where the instrument is hit. That is to say, there will be energy spread across the spectrum and no obviously dominant frequencies. After impact, tuned percussion starts to look more 'tonal'.
You probably need to look at a time-domain approach to accurately detect the onset point (onset detection). From there you could look at time or frequency domain characteristics of the signal to try and deduce the instrument in question. There's probably also a lot you could do with a priori knowledge of the genre of music being played, allowing you to predict the patterns that are likely to be present.
This is a particular case of the more generalised audio source separation problem. There's been lots of academic activity in this area, and consequently a lot of published papers describing approaches. Look for source separation, music information retrieval, audio feature detection

Generate musical instrument sounds algorithmically

Is it possible to generate a musical instrument's sounds using only algorithms? or can it only be done via pre-recorded sound samples?
Wavetable synthesis (PDF) is the most realistic method of real-instrument synthesis, as it takes samples and alters them slightly (for example adding vibrato, expression etc).
The waveforms generated by most musical instruments (especially wind and brass instruments) are so complex that pure algorithmic synthesis is not yet optimised enough to run on current hardware - even if it were, the technical complexities of writing such an algorithm are huge.
Interesting site here.
It is completely possible - that is one of the things synthesizers do.
It being possible doesn't mean it is simple. Synthesizers are usually expensive, and the amount of algorithms used are complex - the wikipedia page I linked before has links to some of them.
Pre-recorded sounds are simpler and cheaper to use, but they also have their limitations - they sound more "repetitive" for example.
Several years back, Sound on Sound magazine ran an excellent series called "Synth Secrets" which can now be viewed online for free. They give a good introduction to the types of techniques used in hardware synthesizers (both analogue and digital), and includes some articles discussing the difficulties of replicating certain real-world instrument sounds such as plucked and bowed strings, brass, snare drums, acoustic pianos etc.
After several days of hunting, this the best resource I have found: https://ccrma.stanford.edu/~jos/
This is a treasure trove for the subject of synthesising sounds.
STK
For example, this page links to a C example of synthesising a string, also a sound toolkit STK written in C++ for assisting this work.
This is going to keep me quiet for a few weeks while I dig through it.
It certainly is, and there are many approaches. Wolfram recently released WolframTones, which (unsurprisingly, if you know Wolfram) uses cellular automata. A detailed description of how it functions is here.
Karplus Strong Algorithm gives a very good synthesis of a plucked string. It can also be coded in a few lines of C. You create a circular buffer of floats (length proportional to the wavelength ie 1/f), and fill it full of random noise between -1 and 1.
Then you cycle through: each cycle, you replace the value at your current index with the average of the previous two values, and emit this new value.
index = (index+1) % bufSize;
outVal = buf[index] = decay * 0.5 * ( buf[index-1] + buf[index-2] );
The resultant byte stream gives you your sound. Of course, this can be heavily optimised.
To make your soundwave damp to 0.15 of its original strength after one second, you could set decay thus:
#define DECAY_1S =.15
Float32 decay = pow(DECAY_1S, 1.0f / freq);
Note: you need to size the original buffer so that it contains one complete waveform. so if you wish to generate a 441Hz sound, and your sampling rate is 44.1KHz, then you will need to allocate 100 elements in your buffer.
You can think of this as a resonance chamber, whose fundamental frequency is 441Hz, initially energised, with energy dissipating outwards from every point in the ring simultaneously. Magically it seems to organise itself into overtones of a fundamental frequency.
Could anyone post more algorithms? How about an algorithm for a continuous tone?
In addition to the answers provided here, there are also analysis synthesis frameworks that construct mathematical models (often based on capturing the trajectories of sinusoidal or noise components) of an input sound, allowing transformation and resynthesis. A few well-known frameworks are: SMS (available through the CLAM C++ project,) and Loris.
Physical models of instruments also are an option - they model the physical properties of an instrument such as reed stiffness, blowhole aperture, key clicking, and often produce realistic effects by incorporating non-linear effects such as overblowing. STK is one of these frameworks in C++.
These frameworks are generally more heavy then the wavetable synthesis option, but can provide more parameters for manipulation.
Chuck
This PDF details how SMule created Ocarina for the iPhone (I'm sure everyone has seen the advert). they did it by porting Chuck - Strongly-timed, Concurrent, and On-the-fly
Audio Programming Language
This is available for MacOS X, Windows, and Linux.
I've had a lot of success and fun with FM synthesis. It is a computationally light method, and is the source of a huge number of pop synth sounds from the 1980's (being the basis of the ubiquitous Yamaha DX-7).
I do use it for generating sound on the fly rather than using recordings. A simple example of a string synth sound generated on the fly can be had (free download, Win64) from http://philfrei.itch.io/referencenotekeyboard. The organ synth on that same program is a simple additive synthesis algo.
I would have to dig around for some good tutorials on this. I know we discussed it a bunch at java-gaming.org, when I was trying to figure it out, and then when helping nsigma work out a kink in his FM algo. The main thing is to use phase rather than frequency modulation if you want to chain more than a single modulator to a carrier.
Which reminds me! A kind of amazing java-based sound generator to check out, allows the generation of a number of different forms of synthesis, with real time control:
PraxisLIVE

How to detect the BPM of a song in php [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
How can the tempo/BPM of a song be determined programmatically? What algorithms are commonly used, and what considerations must be made?
This is challenging to explain in a single StackOverflow post. In general, the simplest beat-detection algorithms work by locating peaks in sound energy, which is easy to detect. More sophisticated methods use comb filters and other statistical/waveform methods. For a detailed explication including code samples, check this GameDev article out.
The keywords to search for are "Beat Detection", "Beat Tracking" and "Music Information Retrieval". There is lots of information here: http://www.music-ir.org/
There is a (maybe) annual contest called MIREX where different algorithms are tested on their beat detection performance.
http://nema.lis.illinois.edu/nema_out/mirex2010/results/abt/mck/
That should give you a list of algorithms to test.
A classic algorithm is Beatroot (google it), which is nice and easy to understand. It works like this:
Short-time FFT the music to get a sonogram.
Sum the increases in magnitude over all frequencies for each time step (ignore the decreases). This gives you a 1D time-varying function called the "spectral flux".
Find the peaks using any old peak detection algorithm. These are called "onsets" and correspond to the start of sounds in the music (starts of notes, drum hits, etc).
Construct a histogram of inter-onset-intervals (IOIs). This can be used to find likely tempos.
Initialise a set of "agents" or "hypotheses" for the beat-tracking result. Feed these agents the onsets one at a time in order. Each agent tracks the list of onsets that are also beats, and the current tempo estimate. The agents can either accept the onsets, if they fit closely with their last tracked beat and tempo, ignore them if they are wildly different, or spawn a new agent if they are in-between. Not every beat requires an onset - agents can interpolate.
Each agent is given a score according to how neat its hypothesis is - if all its beat onsets are loud it gets a higher score. If they are all regular it gets a higher score.
The highest scoring agent is the answer.
Downsides to this algorithm in my experience:
The peak-detection is rather ad-hoc and sensitive to threshold parameters and whatnot.
Some music doesn't have obvious onsets on the beats. Obviously it won't work with those.
Difficult to know how to resolve the 60bpm-vs-120bpm issue, especially with live tracking!
Throws away a lot of information by only using a 1D spectral flux. I reckon you can do much better by having a few band-limited spectral fluxes (and maybe one broadband one for drums).
Here is a demo of a live version of this algorithm, showing the spectral flux (black line at the bottom) and onsets (green circles). It's worth considering the fact that the beat is extracted from only the green circles. I've played back the onsets just as clicks, and to be honest I don't think I could hear the beat from them, so in some ways this algorithm is better than people at beat detection. I think the reduction to such a low-dimensional signal is its weak step though.
Annoyingly I did find a very good site with many algorithms and code for beat detection a few years ago. I've totally failed to refind it though.
Edit: Found it!
Here are some great links that should get you started:
http://marsyasweb.appspot.com/
http://www.vamp-plugins.org/download.html
Beat extraction involves the identification of cognitive metric structures in music. Very often these do not correspond to physical sound energy - for example, in most music there is a level of syncopation, which means that the "foot-tapping" beat that we perceive does not correspond to the presence of a physical sound. This means that this is a quite different field to onset detection, which is the detection of the physical sounds, and is performed in a different way.
You could try the Aubio library, which is a plain C library offering both onset and beat extraction tools.
There is also the online Echonest API, although this involves uploading an MP3 to a website and retrieving XML, so might not be so suitable..
EDIT: I came across this last night - a very promising looking C/C++ library, although I haven't used it myself. Vamp Plugins
The general area of research you are interested in is called MUSIC INFORMATION RETRIEVAL
There are many different algorithms that do this but they all are fundamentally centered around ONSET DETECTION.
Onset detection measures the start of an event, the event in this case is a note being played. You can look for changes in the weighted fourier transform (High Frequency Content) you can look for large changes in spectrial content. (Spectrial Difference). (there are a couple of papers that I recommend you look into further down) Once you apply an onset detection algorithm you pick off where the beats are via thresholding.
There are various algorithms that you can use once you've gotten that time localization of the beat. You can turn it into a pulse train (create a signal that is zero for all time and 1 only when your beat happens) then apply a FFT to that and BAM now you have a Frequency of Onsets at the largest peak.
Here are some papers to lead you in the right direction:
https://web.archive.org/web/20120310151026/http://www.elec.qmul.ac.uk/people/juan/Documents/Bello-TSAP-2005.pdf
https://adamhess.github.io/Onset_Detection_Nov302011.pdf
Here is an extension to what some people are discussing:
Someone mentioned looking into applying a machine learning algorithm: Basically collect a bunch of features from the onset detection functions (mentioned above) and combine them with the raw signal in a neural network/logistic regression and learn what makes a beat a beat.
look into Dr Andrew Ng, he has free machine learning lectures from Stanford University online (not the long winded video lectures, there is actually an online distance course)
If you can manage to interface with python code in your project, Echo Nest Remix API is a pretty slick API for python:
There's a method analysis.tempo which will give you the BPM. It can do a whole lot more than simple BPM, as you can see from the API docs or this tutorial
Perform a Fourier transform, and find peaks in the power spectrum. You're looking for peaks below the 20 Hz cutoff for human hearing. I'd guess typically in the 0.1-5ish Hz range to be generous.
SO question that might help: Bpm audio detection Library
Also, here is one of several "peak finding" questions on SO: Peak detection of measured signal
Edit: Not that I do audio processing. It's just a guess based on the fact that you're looking for a frequency domain property of the file...
another edit: It is worth noting that lossy compression formats like mp3, store Fourier domain data rather than time domain data in the first place. With a little cleverness, you can save yourself some heavy computation...but see the thoughtful comment by cobbal.
To repost my answer: The easy way to do it is to have the user tap a button in rhythm with the beat, and count the number of taps divided by the time.
Others have already described some beat-detection methods. I want to add that there are some libraries available that provide techniques and algorithms for this sort of task.
Aubio is one of them, it has a good reputation and it's written in C with a C++ wrapper so you can integrate it easily with a cocoa application (all the audio stuff in Apple's frameworks is also written in C/C++).
There are several methods to get the BPM but the one I find the most effective is the "beat spectrum" (described here).
This algorithm computes a similarity matrix by comparing each short sample of the music with every others. Once the similarity matrix is computed it is possible to get average similarity between every samples pairs {S(T);S(T+1)} for each time interval T: this is the beat spectrum. The first high peak in the beat spectrum is most of the time the beat duration. The best part is you can also do things like music structure or rythm analyses.
I'd imagine this will be easiest in 4-4 dance music, as there should be a single low frequency thud about twice a second.

What are good examples of genetic algorithms/genetic programming solutions? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Genetic algorithms (GA) and genetic programming (GP) are interesting areas of research.
I'd like to know about specific problems you have solved using GA/GP and what libraries/frameworks you used if you didn't roll your own.
Questions:
What problems have you used GA/GP to solve?
What libraries/frameworks did you use?
I'm looking for first-hand experiences, so please do not answer unless you have that.
Not homework.
My first job as a professional programmer (1995) was writing a genetic-algorithm based automated trading system for S&P500 futures. The application was written in Visual Basic 3 [!] and I have no idea how I did anything back then, since VB3 didn't even have classes.
The application started with a population of randomly-generated fixed-length strings (the "gene" part), each of which corresponded to a specific shape in the minute-by-minute price data of the S&P500 futures, as well as a specific order (buy or sell) and stop-loss and stop-profit amounts. Each string (or "gene") had its profit performance evaluated by a run through 3 years of historical data; whenever the specified "shape" matched the historical data, I assumed the corresponding buy or sell order and evaluated the trade's result. I added the caveat that each gene started with a fixed amount of money and could thus potentially go broke and be removed from the gene pool entirely.
After each evaluation of a population, the survivors were cross-bred randomly (by just mixing bits from two parents), with the likelihood of a gene being selected as a parent being proportional to the profit it produced. I also added the possibility of point mutations to spice things up a bit. After a few hundred generations of this, I ended up with a population of genes that could turn $5000 into an average of about $10000 with no chance of death/brokeness (on the historical data, of course).
Unfortunately, I never got the chance to use this system live, since my boss lost close to $100,000 in less than 3 months trading the traditional way, and he lost his willingness to continue with the project. In retrospect, I think the system would have made huge profits - not because I was necessarily doing anything right, but because the population of genes that I produced happened to be biased towards buy orders (as opposed to sell orders) by about a 5:1 ratio. And as we know with our 20/20 hindsight, the market went up a bit after 1995.
I made a little critters that lived in this little world. They had a neural network brain which received some inputs from the world and the output was a vector for movement among other actions. Their brains were the "genes".
The program started with a random population of critters with random brains. The inputs and output neurons were static but what was in between was not.
The environment contained food and dangers. Food increased energy and when you have enough energy, you can mate. The dangers would reduce energy and if energy was 0, they died.
Eventually the creatures evolved to move around the world and find food and avoid the dangers.
I then decided to do a little experiment. I gave the creature brains an output neuron called "mouth" and an input neuron called "ear". Started over and was surprised to find that they evolved to maximize the space and each respective creature would stay in its respective part (food was placed randomly). They learned to cooperate with each other and not get in each others way. There were always the exceptions.
Then i tried something interesting. I dead creatures would become food. Try to guess what happened! Two types of creatures evolved, ones that attacked like in swarms, and ones that were high avoidance.
So what is the lesson here? Communication means cooperation. As soon as you introduce an element where hurting another means you gain something, then cooperation is destroyed.
I wonder how this reflects on the system of free markets and capitalism. I mean, if businesses can hurt their competition and get away with it, then its clear they will do everything in their power to hurt the competition.
Edit:
I wrote it in C++ using no frameworks. Wrote my own neural net and GA code. Eric, thank you for saying it is plausible. People usually don't believe in the powers of GA (although the limitations are obvious) until they played with it. GA is simple but not simplistic.
For the doubters, neural nets have been proven to be able to simulate any function if they have more than one layer. GA is a pretty simple way to navigate a solution space finding local and potentially global minimum. Combine GA with neural nets and you have a pretty good way to find functions that find approximate solutions for generic problems. Because we are using neural nets, then we are optimizing the function for some inputs, not some inputs to a function as others are using GA
Here is the demo code for the survival example: http://www.mempko.com/darcs/neural/demos/eaters/
Build instructions:
Install darcs, libboost, liballegro, gcc, cmake, make
darcs clone --lazy http://www.mempko.com/darcs/neural/
cd neural
cmake .
make
cd demos/eaters
./eaters
In January 2004, I was contacted by Philips New Display Technologies who were creating the electronics for the first ever commercial e-ink, the Sony Librie, who had only been released in Japan, years before Amazon Kindle and the others hit the market in US an Europe.
The Philips engineers had a major problem. A few months before the product was supposed to hit the market, they were still getting ghosting on the screen when changing pages. The problem was the 200 drivers that were creating the electrostatic field. Each of these drivers had a certain voltage that had to be set right between zero and 1000 mV or something like this. But if you changed one of them, it would change everything.
So optimizing each driver's voltage individually was out of the question. The number of possible combination of values was in billions,and it took about 1 minute for a special camera to evaluate a single combination. The engineers had tried many standard optimization techniques, but nothing would come close.
The head engineer contacted me because I had previously released a Genetic Programming library to the open-source community. He asked if GP/GA's would help and if I could get involved. I did, and for about a month we worked together, me writing and tuning the GA library, on synthetic data, and him integrating it into their system. Then, one weekend they let it run live with the real thing.
The following Monday I got these glowing emails from him and their hardware designer, about how nobody could believe the amazing results the GA found. This was it. Later that year the product hit the market.
I didn't get paid one cent for it, but I got 'bragging' rights. They said from the beginning they were already over budget, so I knew what the deal was before I started working on it. And it's a great story for applications of GAs. :)
I used a GA to optimize seating assignments at my wedding reception. 80 guests over 10 tables. Evaluation function was based on keeping people with their dates, putting people with something in common together, and keeping people with extreme opposite views at separate tables.
I ran it several times. Each time, I got nine good tables, and one with all the odd balls. In the end, my wife did the seating assignments.
My traveling salesman optimizer used a novel mapping of chromosome to itinerary, which made it trivial to breed and mutate the chromosomes without any risk of generating invalid tours.
Update: Because a couple people have asked how ...
Start with an array of guests (or cities) in some arbitrary but consistent ordering, e.g., alphabetized. Call this the reference solution. Think of a guest's index as his/her seat number.
Instead of trying to encode this ordering directly in the chromosome, we encode instructions for transforming the reference solution into a new solution. Specifically, we treat the chromosomes as lists of indexes in the array to swap. To get decode a chromosome, we start with the reference solution and apply all the swaps indicated by the chromosome. Swapping two entries in the array always results in a valid solution: every guest (or city) still appears exactly once.
Thus chromosomes can be randomly generated, mutated, and crossed with others and will always produce a valid solution.
I used genetic algorithms (as well as some related techniques) to determine the best settings for a risk management system that tried to keep gold farmers from using stolen credit cards to pay for MMOs. The system would take in several thousand transactions with "known" values (fraud or not) and figure out what the best combination of settings was to properly identify the fraudulent transactions without having too many false positives.
We had data on several dozen (boolean) characteristics of a transaction, each of which was given a value and totalled up. If the total was higher than a threshold, the transaction was fraud. The GA would create a large number of random sets of values, evaluate them against a corpus of known data, select the ones that scored the best (on both fraud detection and limiting the number of false positives), then cross breed the best few from each generation to produce a new generation of candidates. After a certain number of generations the best scoring set of values was deemed the winner.
Creating the corpus of known data to test against was the Achilles' heel of the system. If you waited for chargebacks, you were several months behind when trying to respond to the fraudsters, so someone would have to manually review large numbers of transactions to build up that corpus of data without having to wait too long.
This ended up identifying the vast majority of the fraud that came in, but couldn't quite get it below 1% on the most fraud-prone items (given that 90% of incoming transactions could be fraud, that was doing pretty well).
I did all this using perl. One run of the software on a fairly old linux box would take 1-2 hours to run (20 minutes to load data over a WAN link, the rest of the time spent crunching). The size of any given generation was limited by available RAM. I'd run it over and over with slight changes to the parameters, looking for an especially good result set.
All in all it avoided some of the gaffes that came with manually trying to tweak the relative values of dozens of fraud indicators, and consistently came up with better solutions than I could create by hand. AFAIK, it's still in use (about 3 years after I wrote it).
Football Tipping. I built a GA system to predict the week to week outcome of games in the AFL (Aussie Rules Football).
A few years ago I got bored of the standard work football pool, everybody was just going online and taking the picks from some pundit in the press. So, I figured it couldn't be too hard to beat a bunch of broadcast journalism majors, right? My first thought was to take the results from Massey Ratings and then reveal at the end of the season my strategy after winning fame and glory. However, for reasons I've never discovered Massey does not track AFL. The cynic in me believes it is because the outcome of each AFL game has basically become random chance, but my complaints of recent rule changes belong in a different forum.
The system basically considered offensive strength, defensive strength, home field advantage, week to week improvement (or lack thereof) and velocity of changes to each of these. This created a set of polynomial equations for each team over the season. The winner and score for each match for a given date could be computed. The goal was to find the set of coefficients that most closely matched the outcome of all past games and use that set to predict the upcoming weeks game.
In practice, the system would find solutions that accurately predicted over 90% of past game outcomes. It would then successfully pick about 60-80% of games for the upcoming week (that is the week not in the training set).
The result: just above middle of the pack. No major cash prize nor a system that I could use to beat Vegas. It was fun though.
I built everything from scratch, no framework used.
As well as some of the common problems, like the Travelling Salesman and a variation on Roger Alsing's Mona Lisa program, I've also written an evolutionary Sudoku solver (which required a bit more original thought on my part, rather than just re-implementing somebody else's idea). There are more reliable algorithms for solving Sudokus but the evolutionary approach works fairly well.
In the last few days I've been playing around with an evolutionary program to find "cold decks" for poker after seeing this article on Reddit. It's not quite satisfactory at the moment but I think I can improve it.
I have my own framework that I use for evolutionary algorithms.
I developed a home brew GA for a 3D laser surface profile system my company developed for the freight industry back in 1992.
The system relied upon 3 dimensional triangulation and used a custom laser line scanner, a 512x512 camera (with custom capture hw). The distance between the camera and laser was never going to be precise and the focal point of the cameras were not to be found in the 256,256 position that you expected it to be!
It was a nightmare to try and work out the calibration parameters using standard geometry and simulated annealing style equation solving.
The Genetic algorithm was whipped up in an evening and I created a calibration cube to test it on. I knew the cube dimensions to high accuracy and thus the idea was that my GA could evolve a set of custom triangulation parameters for each scanning unit that would overcome production variations.
The trick worked a treat. I was flabbergasted to say the least! Within around 10 generations my 'virtual' cube (generated from the raw scan and recreated from the calibration parameters) actually looked like a cube! After around 50 generations I had the calibration I needed.
Its often difficult to get an exact color combination when you are planning to paint your house. Often, you have some color in mind, but it is not one of the colors, the vendor shows you.
Yesterday, my Prof. who is a GA researcher mentioned about a true story in Germany (sorry, I have no further references, yes, I can find it out if any one requests to). This guy (let's call him the color guy) used to go from door-door to help people to find the exact color code (in RGB) that would be the closet to what the customer had in mind. Here is how he would do it:
The color guy used to carry with him a software program which used GA. He used to start with 4 different colors- each coded as a coded Chromosome (whose decoded value would be a RGB value). The consumer picks 1 of the 4 colors (Which is the closest to which he/she has in mind). The program would then assign the maximum fitness to that individual and move onto the next generation using mutation/crossover. The above steps would be repeated till the consumer had found the exact color and then color guy used to tell him the RGB combination!
By assigning maximum fitness to the color closes to what the consumer have in mind, the color guy's program is increasing the chances to converge to the color, the consumer has in mind exactly. I found it pretty fun!
Now that I have got a -1, if you are planning for more -1's, pls. elucidate the reason for doing so!
A couple of weeks ago, I suggested a solution on SO using genetic algorithms to solve a problem of graph layout. It is an example of a constrained optimization problem.
Also in the area of machine learning, I implemented a GA-based classification rules framework in c/c++ from scratch.
I've also used GA in a sample project for training artificial neural networks (ANN) as opposed to using the famous backpropagation algorithm.
In addition, and as part of my graduate research, I've used GA in training Hidden Markov Models as an additional approach to the EM-based Baum-Welch algorithm (in c/c++ again).
As part of my undergraduate CompSci degree, we were assigned the problem of finding optimal jvm flags for the Jikes research virtual machine. This was evaluated using the Dicappo benchmark suite which returns a time to the console. I wrote a distributed gentic alogirthm that switched these flags to improve the runtime of the benchmark suite, although it took days to run to compensate for hardware jitter affecting the results. The only problem was I didn't properly learn about the compiler theory (which was the intent of the assignment).
I could have seeded the initial population with the exisiting default flags, but what was interesting was that the algorithm found a very similar configuration to the O3 optimisation level (but was actually faster in many tests).
Edit: Also I wrote my own genetic algorithm framework in Python for the assignment, and just used the popen commands to run the various benchmarks, although if it wasn't an assessed assignment I would have looked at pyEvolve.
First off, "Genetic Programming" by Jonathan Koza (on amazon) is pretty much THE book on genetic and evolutionary algorithm/programming techniques, with many examples. I highly suggest checking it out.
As for my own use of a genetic algorithm, I used a (home grown) genetic algorithm to evolve a swarm algorithm for an object collection/destruction scenario (practical purpose could have been clearing a minefield). Here is a link to the paper. The most interesting part of what I did was the multi-staged fitness function, which was a necessity since the simple fitness functions did not provide enough information for the genetic algorithm to sufficiently differentiate between members of the population.
I am part of a team investigating the use of Evolutionary Computation (EC) to automatically fix bugs in existing programs. We have successfully repaired a number of real bugs in real world software projects (see this project's homepage).
We have two applications of this EC repair technique.
The first (code and reproduction information available through the project page) evolves the abstract syntax trees parsed from existing C programs and is implemented in Ocaml using our own custom EC engine.
The second (code and reproduction information available through the project page), my personal contribution to the project, evolves the x86 assembly or Java byte code compiled from programs written in a number of programming languages. This application is implemented in Clojure and also uses its own custom built EC engine.
One nice aspect of Evolutionary Computation is the simplicity of the technique makes it possible to write your own custom implementations without too much difficulty. For a good freely available introductory text on Genetic Programming see the Field Guide to Genetic Programming.
A coworker and I are working on a solution for loading freight onto trucks using the various criteria our company requires. I've been working on a Genetic Algorithm solution while he is using a Branch And Bound with aggressive pruning. We are still in the process of implementing this solution but so far, we have been getting good results.
Several years ago I used ga's to optimize asr (automatic speech recognition) grammars for better recognition rates. I started with fairly simple lists of choices (where the ga was testing combinations of possible terms for each slot) and worked my way up to more open and complex grammars. Fitness was determined by measuring separation between terms/sequences under a kind of phonetic distance function. I also experimented with making weakly equivalent variations on a grammar to find one that compiled to a more compact representation (in the end I went with a direct algorithm, and it drastically increased the size of the "language" that we could use in applications).
More recently I have used them as a default hypothesis against which to test the quality of solutions generated from various algorithms. This has largely involved categorization and different kinds of fitting problems (i.e. create a "rule" that explains a set of choices made by reviewers over a dataset(s)).
I made a complete GA framework named "GALAB", to solve many problems:
locating GSM ANTs (BTS) to decrease overlap & blank locations.
Resource constraint project scheduling.
Evolutionary picture creation. (Evopic)
Travelling salesman problem.
N-Queen & N-Color problems.
Knight's tour & Knapsack problems.
Magic square & Sudoku puzzles.
string compression, based on Superstring problem.
2D Packaging problem.
Tiny artificial life APP.
Rubik puzzle.
I once used a GA to optimize a hash function for memory addresses. The addresses were 4K or 8K page sizes, so they showed some predictability in the bit pattern of the address (least significant bits all zero; middle bits incrementing regularly, etc.) The original hash function was "chunky" - it tended to cluster hits on every third hash bucket. The improved algorithm had a nearly perfect distribution.
I built a simple GA for extracting useful patterns out of the frequency spectrum of music as it was being played. The output was used to drive graphical effects in a winamp plugin.
Input: a few FFT frames (imagine a 2D array of floats)
Output: single float value (weighted sum of inputs), thresholded to 0.0 or 1.0
Genes: input weights
Fitness function: combination of duty cycle, pulse width and BPM within sensible range.
I had a few GAs tuned to different parts of the spectrum as well as different BPM limits, so they didn't tend to converge towards the same pattern. The outputs from the top 4 from each population were sent to the rendering engine.
An interesting side effect was that the average fitness across the population was a good indicator for changes in the music, although it generally took 4-5 seconds to figure it out.
I don't know if homework counts...
During my studies we rolled our own program to solve the Traveling Salesman problem.
The idea was to make a comparison on several criteria (difficulty to map the problem, performance, etc) and we also used other techniques such as Simulated annealing.
It worked pretty well, but it took us a while to understand how to do the 'reproduction' phase correctly: modeling the problem at hand into something suitable for Genetic programming really struck me as the hardest part...
It was an interesting course since we also dabbled with neural networks and the like.
I'd like to know if anyone used this kind of programming in 'production' code.
I used a simple genetic algorithm to optimize the signal to noise ratio of a wave that was represented as a binary string. By flipping the the bits certain ways over several million generations I was able to produce a transform that resulted in a higher signal to noise ratio of that wave. The algorithm could have also been "Simulated Annealing" but was not used in this case. At their core, genetic algorithms are simple, and this was about as simple of a use case that I have seen, so I didn't use a framework for generation creation and selection - only a random seed and the Signal-to-Noise Ratio function at hand.
As part of my thesis I wrote a generic java framework for the multi-objective optimisation algorithm mPOEMS (Multiobjective prototype optimization with evolved improvement steps), which is a GA using evolutionary concepts. It is generic in a way that all problem-independent parts have been separated from the problem-dependent parts, and an interface is povided to use the framework with only adding the problem-dependent parts. Thus one who wants to use the algorithm does not have to begin from zero, and it facilitates work a lot.
You can find the code here.
The solutions which you can find with this algorithm have been compared in a scientific work with state-of-the-art algorithms SPEA-2 and NSGA, and it has been proven that
the algorithm performes comparable or even better, depending on the metrics you take to measure the performance, and especially depending on the optimization-problem you are looking on.
You can find it here.
Also as part of my thesis and proof of work I applied this framework to the project selection problem found in portfolio management. It is about selecting the projects which add the most value to the company, support most the strategy of the company or support any other arbitrary goal. E.g. selection of a certain number of projects from a specific category, or maximization of project synergies, ...
My thesis which applies this framework to the project selection problem:
http://www.ub.tuwien.ac.at/dipl/2008/AC05038968.pdf
After that I worked in a portfolio management department in one of the fortune 500, where they used a commercial software which also applied a GA to the project selection problem / portfolio optimization.
Further resources:
The documentation of the framework:
http://thomaskremmel.com/mpoems/mpoems_in_java_documentation.pdf
mPOEMS presentation paper:
http://portal.acm.org/citation.cfm?id=1792634.1792653
Actually with a bit of enthusiasm everybody could easily adapt the code of the generic framework to an arbitrary multi-objective optimisation problem.
At work I had the following problem: given M tasks and N DSPs, what was the best way to assign tasks to DSPs? "Best" was defined as "minimizing the load of the most loaded DSP". There were different types of tasks, and various task types had various performance ramifications depending on where they were assigned, so I encoded the set of job-to-DSP assignments as a "DNA string" and then used a genetic algorithm to "breed" the best assignment string I could.
It worked fairly well (much better than my previous method, which was to evaluate every possible combination... on non-trivial problem sizes, it would have taken years to complete!), the only problem was that there was no way to tell if the optimal solution had been reached or not. You could only decide if the current "best effort" was good enough, or let it run longer to see if it could do better.
There was an competition on codechef.com (great site by the way, monthly programming competitions) where one was supposed to solve an unsolveable sudoku (one should come as close as possible with as few wrong collumns/rows/etc as possible).What I would do, was to first generate a perfect sudoku and then override the fields, that have been given. From this pretty good basis on I used genetic programming to improve my solution.I couldn't think of a deterministic approach in this case, because the sudoku was 300x300 and search would've taken too long.
In a seminar in the school, we develop an application to generate music based in the musical mode. The program was build in Java and the output was a midi file with the song. We using distincts aproachs of GA to generate the music. I think this program can be useful to explore new compositions.
in undergrad, we used NERO (a combination of neural network and genetic algorithm) to teach in-game robots to make intelligent decisions. It was pretty cool.
I developed a multithreaded swing based simulation of robot navigation through a set of randomized grid terrain of food sources and mines and developed a genetic algorithm based strategy of exploring the optimization of robotic behavior and survival of fittest genes for a robotic chromosome. This was done using charting and mapping of each iteration cycle.
Since, then I have developed even more game behavior. An example application I built recently for myself was a genetic algorithm for solving the traveling sales man problem in route finding in UK taking into account start and goal states as well as one/multiple connection points, delays, cancellations, construction works, rush hour, public strikes, consideration between fastest vs cheapest routes. Then providing a balanced recommendation for the route to take on a given day.
Generally, my strategy is to use POJO based representaton of genes then I apply specific interface implementations for selection, mutation, crossover strategies, and the criteria point. My fitness function then basically becomes a quite complex based on the strategy and criteria I need to apply as a heuristic measure.
I have also looked into applying genetic algorithm into automated testing within code using systematic mutation cycles where the algorithm understands the logic and tries to ascertain a bug report with recommendations for code fixes. Basically, a way to optimize my code and provide recommendations for improvement as well as a way of automating the discovery of new programmatic code. I have also tried to apply genetic algorithms to music production amongst other applications.
Generally, I find evolutionary strategies like most metaheuristic/global optimization strategies, they are slow to learn at first but start to pick up as the solutions become closer and closer to goal state and as long as your fitness function and heuristics are well aligned to produce that convergence within your search space.
I once tried to make a computer player for the game of Go, exclusively based on genetic programming. Each program would be treated as an evaluation function for a sequence of moves. The programs produced weren't very good though, even on a rather diminuitive 3x4 board.
I used Perl, and coded everything myself. I would do things differently today.
After reading The Blind Watchmaker, I was interested in the pascal program Dawkins said he had developed to create models of organisms that could evolve over time. I was interested enough to write my own using Swarm. I didn't make all the fancy critter graphics he did, but my 'chromosomes' controlled traits which affected organisms ability to survive. They lived in a simple world and could slug it out against each other and their environment.
Organisms lived or died partly due to chance, but also based on how effectively they adapted to their local environments, how well they consumed nutrients & how successfully they reproduced. It was fun, but also more proof to my wife that I am a geek.
It was a while ago, but I rolled a GA to evolve what were in effect image processing kernels to remove cosmic ray traces from Hubble Space Telescope (HST) images. The standard approach is to take multiple exposures with the Hubble and keep only the stuff that is the same in all the images. Since HST time is so valuable, I'm an astronomy buff, and had recently attended the Congress on Evolutionary Computation, I thought about using a GA to clean up single exposures.
The individuals were in the form of trees that took a 3x3 pixel area as input, performed some calculations, and produced a decision about whether and how to modify the center pixel. Fitness was judged by comparing the output with an image cleaned up in the traditional way (i.e. stacking exposures).
It actually sort of worked, but not well enough to warrant foregoing the original approach. If I hadn't been time-constrained by my thesis, I might have expanded the genetic parts bin available to the algorithm. I'm pretty sure I could have improved it significantly.
Libraries used: If I recall correctly, IRAF and cfitsio for astronomical image data processing and I/O.
I experimented with GA in my youth. I wrote a simulator in Python that worked as follows.
The genes encoded the weights of a neural network.
The neural network's inputs were "antennae" that detected touches. Higher values meant very close and 0 meant not touching.
The outputs were to two "wheels". If both wheels went forward, the guy went forward. If the wheels were in opposite directions, the guy turned. The strength of the output determined the speed of the wheel turning.
A simple maze was generated. It was really simple--stupid even. There was the start at the bottom of the screen and a goal at the top, with four walls in between. Each wall had a space taken out randomly, so there was always a path.
I started random guys (I thought of them as bugs) at the start. As soon as one guy reached the goal, or a time limit was reached, the fitness was calculated. It was inversely proportional to the distance to the goal at that time.
I then paired them off and "bred" them to create the next generation. The probability of being chosen to be bred was proportional to its fitness. Sometimes this meant that one was bred with itself repeatedly if it had a very high relative fitness.
I thought they would develop a "left wall hugging" behavior, but they always seemed to follow something less optimal. In every experiment, the bugs converged to a spiral pattern. They would spiral outward until they touched a wall to the right. They'd follow that, then when they got to the gap, they'd spiral down (away from the gap) and around. They would make a 270 degree turn to the left, then usually enter the gap. This would get them through a majority of the walls, and often to the goal.
One feature I added was to put in a color vector into the genes to track relatedness between individuals. After a few generations, they'd all be the same color, which tell me I should have a better breeding strategy.
I tried to get them to develop a better strategy. I complicated the neural net--adding a memory and everything. It didn't help. I always saw the same strategy.
I tried various things like having separate gene pools that only recombined after 100 generations. But nothing would push them to a better strategy. Maybe it was impossible.
Another interesting thing is graphing the fitness over time. There were definite patterns, like the maximum fitness going down before it would go up. I have never seen an evolution book talk about that possibility.

Resources