For my research project in biology for my final year I need to present a project in the field of Biotechnology. Being passionate about programming I immediately thought of Evolutionary Algorithms! However I am not sure if Evolutionary Algorithms would fall into the category of Biotechnology, hence I would rather confirm with the best and most passionate programming experts on the world.
Unfortunately no, a genetic algorithm (ga) is just an optimization technique that is inspired from various evolutionary processes like mutation or crossover. They belong to the area of evolutionary computing and artificial intelligence and not biotechnology.
Please follow this link for a brief introduction to genetic algorithms.
Biotechnology from the other hand has to do with actual organisms that are used in some way to make a product or an application. It sounds kind of broad but that is only because the particular field is in itself very broad. We use forms of biotechnology for thousands of years now in many common and not so common ways. This is not bad though as it gives you a lot of freedom regarding your project. Choose anything from food production to medicine and you will still be relevant to the subject.
Maybe the links provided will give you some inspiration.
Link one
Link two
Until you're implementing your evolutionary algorithms with organic material, no.
They are, of course, inspired from the way modern organisms have come to exist. But there's no biology in what you're doing.
No. It's just an example of a biological algorithm adapted for computational purposes.
Other examples include Ant-Colony Optimization, Flocking behavior, etc.
IIRC, Biotechnology requires the use of actual biology (i.e., living things or parts of them) adapted for technological purposes, not just an algorithmic emulation or modelling of their processes.
Related
"Designing the right algorithm for a given application is a difficult job. It requires a major creative act, taking a problem and pulling a solution out of the ether. This is much more difficult than taking someone else's idea and modifying it or tweaking it to make it a little better. The space of choices you can make in algorithm design is enormous, enough to leave you plenty of freedom to hang yourself".
I have studied several basic design techniques of algorithms like Divide and Conquer, Dynamic Programming, greedy, backtracking etc.
But i always fail to recognize what principles to apply when i come across certain programming problems. I want to master the designing of algorithms.
So can any one suggest a best place to understand the principles of algorithm design in depth.....
I suggest Programming Pearls, 2nd edition, by Jon Bentley. He talks a lot about algorithm design techniques and provides examples of real world problems, how they were solved, and how different algorithms affected the runtime.
Throughout the book, you learn algorithm design techniques, program verification methods to ensure your algorithms are correct, and you also learn a little bit about data structures. It's a very good book and I recommend it to anyone who wants to master algorithms. Go read the reviews in amazon: http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880
You can have a look at some of the book's contents here: http://netlib.bell-labs.com/cm/cs/pearls/
Enjoy!
You can't learn algorithm design just from reading books. Certainly, books can help. Books like Programming Pearls as suggested in another answer are great because they give you problems to work. Each problem forces you to think about how to solve a particular type of problem.
The idea is that you expose yourself to many different types of problems and their solutions. In doing so, you learn how to examine a problem and see if it shares anything in common with problems you've already seen. In that regard, it's not a whole lot different than the way you learned how to solve "word problems" in math class. Granted, most algorithm problems are more complex than having to figure out where on the tracks the two trains will collide, but the way you learn how to solve the problems is the same. You learn common techniques used to solve simple problems, then combine those techniques to solve more complex problems, etc.
Read, practice, lather, rinse, repeat.
In addition to books like Programming Pearls, there are sites online that post different programming challenges that you can test yourself on. It helps if you have friends or co-workers who also are interested in algorithms, because you can bounce ideas off each other and pose interesting challenges, or work together to come up with solutions to problems.
Did I mention that it takes practice?
"Mastering" anything takes time. A long time. A popular theory is that it takes 10,000 hours of practice to become an expert at anything. There's some dispute about that for particular endeavors, but in general it's true. You don't master anything overnight. You have to study. And practice. And read what others have done. Study some more and practice some more.
A good book about algorithm design is Kleinbeg Tardos. Every design technique depends on the problem that you are going to tackle. It is very important to do the exercises in the algorithm books and have feedback from teachers about that.
If there exist a locally optimal choice taht brings the globally optimal solution you can use a greedy algorithm.
If the problem has optimal substructure, you can use dynamic programming.
I'm about to take a course in AI and I want to practice before. I'm using a book to learn the theory, but resources and concrete examples in any language to help with the practice would be amazing. Can anyone recommend me good sites or books with plenty of examples and tutorials ?
Thanks !
Edit: My course will deal with Perceptrons, Neural networks and Bayesian AI.
Really depends on what area you want to specialize on. There is the startup - resource : is
here. I learned about neural nets from their example. Can you elaborate what kind of AI it should be?
Ah and i forgot: this link is a very nice forum where you can look at problems other people have and learn from that.
Cheers.
My advice would be to learn it by trying to implement the various types of learners yourself. See if you can find yourself a dataset related to some interest you have (sports, games, health, etc.) and then try and create a learner to do some kind of classification (predicting a winner in a sports game, learning how to classify backgammon positions, detecting cancer based on patient data, etc.) using different methods. Start with Decision Trees if that's part of your future course work since they're relatively simple, then move on to neural networks.
Here is a set of sources, each one of which i recommend highly--for the quality of the explanation, for the quality of the code, and for the 'completeness' of the algorithm demo.
Least-Squares Regression
(Python)
k-means clustering (Python)
Multi-Layer Perceptron (Python)
Hopfield Network (Python)
Decision Tree (ID3 & C4.5)
In addition, the excellent textbook Elements of Statistical Learning by Hastie, et al. is actually free to download. The authors have an R package that accompanies this textbook which contains all of the code. This book includes detailed discussion of most (if not all) of the major classes of ML algorithms, with specific examples that refer to working code and 'real-world' data.
Personally I would recommend this M.Tim.Jones book on AI.
Has many many topics on AI and almost every type of AI discussion is followed by C example code. Very pragmatic book on AI indeed !!
Russel & Norvig have a good survey of the broad field.
I'm wondering if anyone knows of any software techniques taking advantage of biology? For example, in the robotics world, there are tons, but what about software?
Many concepts originally observed in biology have been used in software. For example Genetic Algorithm (GA).
Artificial life (AL) exposes/uses several principles of biology such as resilience to imperfect code snippets, addressing by content, imperfect reproduction (in some implementations, also sexual, i.e. multi-orginanisms-driven, reproduction) and a non-goal-driven utility function. An interesting result of AL, is the spontaneous production of macro phenomenons observed in domains such as ecology or epidemiology (domains largely influenced by biology), such as the emergence of parasites and even that of organisms which take advantage of parasites, or subtle predator-prey relationships.
Maybe software can be said to have gone "full circle" with some experiments in computing which involve real (carbon-based) DNA (or RNA) molecules! The original experiment in this area (PDF link) by Prof. Alderman (of RSA fame), who coded the various elements of a graph-related problem (an hamiltonian graph) with different DNA molecules and let the massive parallel computing power of bio-chemistry do the rest and solve the problem !
Back in the digital world, but with a strong inspiration from biology and indeed from anatomy of the cerebral cortex, and from many theoretical and clinical observations in the neuroscience field, we have Neural Networks (NN). In the area of NN, maybe worthy of a special notice, is Numenta's Hierarchical Temporal Memory model which, although it reproduces the [understanding we have of] the neo-cortex only very loosely, introduces the idea that the very same algorithm is applied in all areas and at all levels of the cognitive process powered by the brains, an idea largely supported by biological, anatomical and other forms of evidence.
If your question means "have biological ideas been used to optimize software?" then
Genetic programming (http://en.wikipedia.org/wiki/Genetic_programming) is one example. From the Wikipedia article:
In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. Therefore it is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task.
If your question means "what software techniques have been inspired by biology?" then
see more generally http://en.wikipedia.org/wiki/Bio-inspired_computing. I would expect that several other methods such as ant-swarms (http://en.wikipedia.org/wiki/Ant_colony_optimization) and Neural Networks (http://en.wikipedia.org/wiki/Neural_network_software) could also be used.
Artificial Neural Networks are another classic example. The software application tends to be pattern recognition and prediction of behaviour of complex systems.
Ant colony optimization, a search / optimization method, and Artificial Life like Conway's Game of Life
Most of the answers yet talk about AI. The title of your question hints towards software that hides itself in order not to be detected.
We got viruses.
We got virus-hunters...
Me myself, I even hid some bugs in my own programs ... :(
Alan Kay (the object technology pioneer) spoke at length about the influence of biology in the OOP paradigm. He's got a series of ideas about how objects are like "cells" and that OOP scales in a similar way to the way that cells can scale to produce massive architectures...
You can follow quite a bit of this in his Turing Award Speech:
http://video.google.com/videoplay?docid=-2950949730059754521# -- Skip to about the 30:55 mark
Does anybody have any good source of software/tutorial about Genetic Engineering Simulation?
Maybe open source software about gene splicing / cloning simulation ?
Thanks
This might be up your alley:
Genetic Programming - Evolution of Mona Lisa by Roger Alsing.
Mona Lisa Source Code and binaries
Mona Lisa FAQ
All very interesting and impressive stuff.
This may not be exactly what you're after, but have you looked at Spore?
It's quite possible that there is no such thing as "serious" genetic engineering simulation software. I think it would be a very difficult concept to implement.
Excellent question!
Clearly genetic engineering can't be simulated at an analytic level - ie. modelling the mechanism involved - because some of the processes, such as protein folding, DNA transcription, etc., are too complex and haven't been modelled yet.
They are still working on a map of the human genome, and I'm sure they haven't gotten around to anything else because anything of reasonable complexity isn't that much less complex (in terms of number of genes) from a human. For example, you can save 2% in complexity, but then you have a chimp not a human.
However, there might be possible approaches based on empirical methods - i.e. "best guess" methods, or experimentation. Thus, you might look at recent work in machine learning or data mining for some possible approaches. There are a lot of data mining and machine learning methods out there, ranging from genetic algorithms, neural networks, decision trees, SVM, etc.
http://en.wikipedia.org/wiki/Data_mining
http://en.wikipedia.org/wiki/Machine_learning
I was wondering how common it is to find genetic algorithm approaches in commercial code.
It always seemed to me that some kinds of schedulers could benefit from a GA engine, as a supplement to the main algorithm.
Genetic Algorithms have been widely used commercially. Optimizing train routing was an early application. More recently fighter planes have used GAs to optimize wing designs. I have used GAs extensively at work to generate solutions to problems that have an extremely large search space.
Many problems are unlikely to benefit from GAs. I disagree with Thomas that they are too hard to understand. A GA is actually very simple. We found that there is a huge amount of knowledge to be gained from optimizing the GA to a particular problem that might be difficult and as always managing large amounts of parallel computation continue to be a problem for many programmers.
A problem that would benefit from a GA is going to have the following characteristics:
A good way to encode potential solutions
A way to compute an a numerical score to evaluate the quality of the solution
A large multi-dimensional search space where the answer is non-obvious
A good solution is good enough and a perfect solution is not required
There are many problems that could probably benefit from GAs and in the future they will probably be more widely deployed. I believe that GAs are used in cutting edge engineering more than people think however most people (like my company does) guards those secrets extremely closely. It is only long after the fact that it is revealed that GAs were used.
Most people that deal with "normal" applications probably don't have much use for them though.
If you want to find an example, look at Postgres's Query Planner. It uses many techniques, and one just so happens to be genetic.
http://developer.postgresql.org/pgdocs/postgres/geqo-pg-intro.html
I used GA in my Master's thesis, but after that I haven't found anything in my daily work a GA could solve that I couldn't solve faster with some other Algorithm.
I don't think it is particularly common to find genetic algorithms in everyday-commercial code. They are more commonly found in academic/research code where the need to find the "best algorithm" is less important than the need to just find a good solution to a problem.
Nonetheless, I have consulted on a couple of commercial projects that do use GAs (chiefly as a result of my involvement with GAUL). I think the most interesting example was at a Biotech company. They used the GA to optimise scoring functions that were used for virtual screening, as part of their drug discovery application.
Earlier this year, with my current company, I added a new feature to one of our products that uses another GA. I think we might be marketing this from next month. Basically, the GA is used to explore molecules that have the potential for binding to a protein, and could therefore be further investigated as drugs targeting that protein. A competing product that also uses a GA is EA inventor.
As part of my thesis I wrote a generic java framework for the multi-objective optimisation algorithm mPOEMS (Multiobjective prototype optimization with evolved improvement steps), which is a GA using evolutionary concepts. It is generic in a way that all problem-independent parts have been separated from the problem-dependent parts, and an interface is povided to use the framework with only adding the problem-dependent parts. Thus one who wants to use the algorithm does not have to begin from zero, and it facilitates work a lot.
You can find the code here.
The solutions which you can find with this algorithm have been compared in a scientific work with state-of-the-art algorithms SPEA-2 and NSGA, and it has been proven that
the algorithm performes comparable or even better, depending on the metrics you take to measure the performance, and especially depending on the optimization-problem you are looking on.
You can find it here.
Also as part of my thesis and proof of work I applied this framework to the project selection problem found in portfolio management. It is about selecting the projects which add the most value to the company, support most the strategy of the company or support any other arbitrary goal. E.g. selection of a certain number of projects from a specific category, or maximization of project synergies, ...
My thesis which applies this framework to the project selection problem:
http://www.ub.tuwien.ac.at/dipl/2008/AC05038968.pdf
After that I worked in a portfolio management department in one of the fortune 500, where they used a commercial software which also applied a GA to the project selection problem / portfolio optimization.
Further resources:
The documentation of the framework:
http://thomaskremmel.com/mpoems/mpoems_in_java_documentation.pdf
mPOEMS presentation paper:
http://portal.acm.org/citation.cfm?id=1792634.1792653
Actually with a bit of enthusiasm everybody could easily adapt the code of the generic framework to an arbitrary multi-objective optimisation problem.
I haven't but I've heard of this company (can't remember their name) which uses mutating, genetic algos to calculate placements and lengths of antennas (or something) from a friend of mine. And they're supposed to (according to my friend) have huge success with this. I guess GA is just too complex for "average Joe developer" to become mainstream. Kind of like Map Reduce - spectacularly cool, but WAY too advanced to hit the "mainstream"...
LibreOffice Calc uses it in its Solver module.