Genetic Engineering simulation - genetics

Does anybody know a good source of software or tutorials on genetic engineering simulation?
Maybe open source software for gene splicing / cloning simulation?
Thanks

This might be up your alley:
Genetic Programming - Evolution of Mona Lisa by Roger Alsing.
Mona Lisa Source Code and binaries
Mona Lisa FAQ
All very interesting and impressive stuff.

This may not be exactly what you're after, but have you looked at Spore?
It's quite possible that there is no such thing as "serious" genetic engineering simulation software. I think it would be a very difficult concept to implement.

Excellent question!
Clearly genetic engineering can't be simulated at an analytic level - i.e. by modelling the mechanisms involved - because some of the processes, such as protein folding and DNA transcription, are too complex and haven't been fully modelled yet.
They are still working on a map of the human genome, and I'm sure they haven't gotten around to anything else, because anything of reasonable complexity isn't that much less complex (in terms of number of genes) than a human. For example, you can save 2% in complexity, but then you have a chimp, not a human.
However, there might be possible approaches based on empirical methods - i.e. "best guess" methods, or experimentation. Thus, you might look at recent work in machine learning or data mining for some possible approaches. There are a lot of data mining and machine learning methods out there, including genetic algorithms, neural networks, decision trees, and SVMs.
http://en.wikipedia.org/wiki/Data_mining
http://en.wikipedia.org/wiki/Machine_learning

Related

Practical use cases for machine learning algorithms

I am just starting out studying machine learning and am currently doing Andrew Ng's course on Coursera. I am going through the course but am a bit lost. Studying all those algorithms and the theory would be a lot more rewarding if I could see some use cases for them.
For example, the first topic I read about was gradient descent, and then linear regression and logistic regression. Are these used directly in practice, or are other algorithms like k-means and kernel density estimation used? I guess I am trying to get real world (software engineering, data mining) examples of these topics. Can someone suggest a post that explains how any machine learning algorithm(s) are used? It would be greatly helpful.
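To make the first two topics concrete, here is a minimal sketch (not taken from the course) of gradient descent fitting a linear regression by minimising mean squared error. The function name fit_line, the learning rate, the step count and the toy data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of (1/n) * sum((w*x + b - y)^2)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical example values)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)  # should come out very close to 2 and 1
```

In practice you would normally reach for a library rather than hand-rolled loops, but the loop above is the same idea those libraries build on.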
The No Free Lunch theorem states that if algorithm A outperforms algorithm B on some problems, then, loosely speaking, there must exist exactly as many other problems where B outperforms A.
So it is difficult to link an algorithm to a particular use case.
If you are looking only for use cases where you can use machine learning algorithms, visit https://www.kaggle.com/wiki/DataScienceUseCases
Update: I just came across http://pkghosh.wordpress.com (use cases with algorithms). Check it out.

Are Evolutionary algorithms biotechnology?

For my final-year research project in biology I need to present a project in the field of biotechnology. Being passionate about programming, I immediately thought of evolutionary algorithms! However, I am not sure whether evolutionary algorithms fall into the category of biotechnology, so I would rather confirm with the best and most passionate programming experts in the world.
Unfortunately no. A genetic algorithm (GA) is just an optimization technique inspired by evolutionary processes such as mutation and crossover. GAs belong to the area of evolutionary computing and artificial intelligence, not biotechnology.
Please follow this link for a brief introduction to genetic algorithms.
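To show what "mutation" and "crossover" mean in code, here is a minimal sketch of a genetic algorithm. It assumes a toy objective (OneMax: maximise the number of 1-bits in a bit string) and truncation selection; all function names and parameter values are hypothetical, and real GAs vary a lot in how they select and recombine.

```python
import random


def fitness(bits):
    # Toy objective (an assumption for this sketch): count of 1-bits
    return sum(bits)


def crossover(a, b, rng):
    # Single-point crossover: head of one parent, tail of the other
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rate, rng):
    # Flip each bit independently with probability `rate`
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def evolve(length=20, pop_size=30, generations=60, rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        # Refill the population with mutated offspring of random parent pairs
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)


best = evolve()
print(fitness(best), best)
```

Nothing here touches a living organism: it is list slicing and random bit flips, which is why it counts as computing rather than biotechnology.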
Biotechnology, on the other hand, has to do with actual organisms that are used in some way to make a product or an application. That sounds broad, but only because the field itself is very broad. We have used forms of biotechnology for thousands of years in many common and not so common ways. This is not a bad thing, though, as it gives you a lot of freedom regarding your project. Choose anything from food production to medicine and you will still be relevant to the subject.
Maybe the links provided will give you some inspiration.
Link one
Link two
Until you're implementing your evolutionary algorithms with organic material, no.
They are, of course, inspired by the way modern organisms have come to exist. But there's no biology in what you're doing.
No. It's just an example of a biological algorithm adapted for computational purposes.
Other examples include Ant-Colony Optimization, Flocking behavior, etc.
IIRC, Biotechnology requires the use of actual biology (i.e., living things or parts of them) adapted for technological purposes, not just an algorithmic emulation or modelling of their processes.

Understanding algorithm design techniques in depth

"Designing the right algorithm for a given application is a difficult job. It requires a major creative act, taking a problem and pulling a solution out of the ether. This is much more difficult than taking someone else's idea and modifying it or tweaking it to make it a little better. The space of choices you can make in algorithm design is enormous, enough to leave you plenty of freedom to hang yourself".
I have studied several basic algorithm design techniques, like divide and conquer, dynamic programming, greedy, backtracking, etc.
But I always fail to recognize which principles to apply when I come across certain programming problems. I want to master the design of algorithms.
So can anyone suggest a good place to learn the principles of algorithm design in depth?
I suggest Programming Pearls, 2nd edition, by Jon Bentley. He talks a lot about algorithm design techniques and provides examples of real world problems, how they were solved, and how different algorithms affected the runtime.
Throughout the book, you learn algorithm design techniques and program verification methods to ensure your algorithms are correct, and you also learn a little bit about data structures. It's a very good book and I recommend it to anyone who wants to master algorithms. Go read the reviews on Amazon: http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880
You can have a look at some of the book's contents here: http://netlib.bell-labs.com/cm/cs/pearls/
Enjoy!
You can't learn algorithm design just from reading books. Certainly, books can help. Books like Programming Pearls as suggested in another answer are great because they give you problems to work. Each problem forces you to think about how to solve a particular type of problem.
The idea is that you expose yourself to many different types of problems and their solutions. In doing so, you learn how to examine a problem and see if it shares anything in common with problems you've already seen. In that regard, it's not a whole lot different than the way you learned how to solve "word problems" in math class. Granted, most algorithm problems are more complex than having to figure out where on the tracks the two trains will collide, but the way you learn how to solve the problems is the same. You learn common techniques used to solve simple problems, then combine those techniques to solve more complex problems, etc.
Read, practice, lather, rinse, repeat.
In addition to books like Programming Pearls, there are sites online that post different programming challenges that you can test yourself on. It helps if you have friends or co-workers who also are interested in algorithms, because you can bounce ideas off each other and pose interesting challenges, or work together to come up with solutions to problems.
Did I mention that it takes practice?
"Mastering" anything takes time. A long time. A popular theory is that it takes 10,000 hours of practice to become an expert at anything. There's some dispute about that for particular endeavors, but in general it's true. You don't master anything overnight. You have to study. And practice. And read what others have done. Study some more and practice some more.
A good book about algorithm design is Algorithm Design by Kleinberg and Tardos. Which design technique applies depends on the problem you are going to tackle. It is very important to do the exercises in the algorithm books and get feedback from teachers on them.
If there exists a locally optimal choice that leads to the globally optimal solution, you can use a greedy algorithm.
If the problem has optimal substructure and overlapping subproblems, you can use dynamic programming.
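A small sketch of the difference, using the classic coin change problem (fewest coins that make a given amount). The function names and coin sets are assumptions picked for illustration. With some coin systems the locally optimal choice happens to be globally optimal; with others it is not, and dynamic programming is needed.

```python
def greedy_coins(coins, amount):
    """Always take the largest coin that still fits (locally optimal choice)."""
    count = 0
    for coin in sorted(coins, reverse=True):
        take, amount = divmod(amount, coin)
        count += take
    return count if amount == 0 else None


def dp_coins(coins, amount):
    """Build the best answer for each total from best answers to smaller totals."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for total in range(1, amount + 1):
        for coin in coins:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
    return best[amount] if best[amount] != inf else None


# Greedy happens to be optimal for this coin system
print(greedy_coins([1, 5, 10, 25], 63), dp_coins([1, 5, 10, 25], 63))  # 6 6
# Greedy picks 4+1+1, but 3+3 is better
print(greedy_coins([1, 3, 4], 6), dp_coins([1, 3, 4], 6))  # 3 2
```

Recognising which case you are in is exactly the part that takes practice: the code for both is short, but only one of them is correct for a given problem.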

state-of-the-art of classification algorithms

We know there are something like a thousand classifiers. Recently I was told that some people consider AdaBoost the off-the-shelf one.
Are there better algorithms (with that voting idea)?
What is the state of the art in classifiers? Do you have an example?
First, adaboost is a meta-algorithm which is used in conjunction with (on top of) your favorite classifier. Second, classifiers which work well in one problem domain often don't work well in another. See the No Free Lunch wikipedia page. So, there is not going to be AN answer to your question. Still, it might be interesting to know what people are using in practice.
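To illustrate the "on top of" point, here is a minimal sketch of AdaBoost wrapped around a very weak base classifier: a one-dimensional threshold rule (a "decision stump"). The function names and the toy data are assumptions for illustration, and real implementations handle more features, ties and edge cases.

```python
import math


def stump_predict(x, threshold, sign):
    # Base classifier: predict `sign` at or above the threshold, else -sign
    return sign if x >= threshold else -sign


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, sign) of the best threshold rule."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights)
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, threshold, sign))
        # Increase the weight of misclassified points, decrease the rest
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    # Weighted vote of all stumps
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1


# No single threshold separates these labels, but three boosted stumps do
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])  # [1, 1, -1, -1, 1, 1]
```

The base classifier is swappable: AdaBoost only needs something that can be trained on weighted examples, which is what makes it a meta-algorithm.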
Weka and Mahout aren't algorithms... they're machine learning libraries. They include implementations of a wide range of algorithms. So, your best bet is to pick a library and try a few different algorithms to see which one works best for your particular problem (where "works best" is going to be a function of training cost, classification cost, and classification accuracy).
If it were me, I'd start with naive Bayes, k-nearest neighbors, and support vector machines. They represent well-established, well-understood methods with very different tradeoffs. Naive Bayes is cheap, but not especially accurate. K-NN is cheap during training but (can be) expensive during classification, and while it's usually very accurate it can be susceptible to overtraining. SVMs are expensive to train and have lots of meta-parameters to tweak, but they are cheap to apply and generally at least as accurate as k-NN.
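The k-NN cost tradeoff is easy to see in a minimal sketch: "training" is just keeping the labelled points, while every prediction has to measure the distance to all of them. The function name and the toy data below are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # The whole training set is scanned for every single prediction
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# "Training" is nothing more than storing labelled points
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.0), "b"), ((4.2, 3.9), "b"), ((3.8, 4.1), "b"),
]
print(knn_predict(train, (1.1, 1.0)))  # a
print(knn_predict(train, (4.0, 3.8)))  # b
```

Libraries reduce the per-query cost with spatial indexes, but the basic shape (no real training step, work deferred to query time) stays the same.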
If you tell us more about the problem you're trying to solve, we may be able to give more focused advice. But if you're just looking for the One True Algorithm, there isn't one -- the No Free Lunch theorem guarantees that.
Apache Mahout (open source, Java) seems to be picking up a lot of steam.
Weka is a very popular and stable machine learning library. It has been around for quite a while and is written in Java.
Hastie et al. (2013, The Elements of Statistical Learning) conclude that the gradient boosting machine is the best "off-the-shelf" method, independent of the problem you have.
Definition (see page 352):
An “off-the-shelf” method is one that can be directly applied to the data without requiring a great deal of time-consuming data preprocessing or careful tuning of the learning procedure.
And a somewhat older assessment:
In fact, Breiman (NIPS Workshop, 1996) referred to AdaBoost with trees as the “best off-the-shelf classifier in the world” (see also Breiman (1998)).

Biologically inspired software

I'm wondering if anyone knows of any software techniques taking advantage of biology? For example, in the robotics world, there are tons, but what about software?
Many concepts originally observed in biology have been used in software, for example the genetic algorithm (GA).
Artificial life (AL) exposes/uses several principles of biology, such as resilience to imperfect code snippets, addressing by content, imperfect reproduction (in some implementations also sexual, i.e. multi-organism, reproduction) and a non-goal-driven utility function. An interesting result of AL is the spontaneous production of macro phenomena observed in domains such as ecology or epidemiology (domains largely influenced by biology), such as the emergence of parasites, and even of organisms which take advantage of parasites, or subtle predator-prey relationships.
Maybe software can be said to have gone "full circle" with some experiments in computing which involve real (carbon-based) DNA (or RNA) molecules! The original experiment in this area (PDF link) was by Prof. Adleman (of RSA fame), who encoded the elements of a graph problem (finding a Hamiltonian path) in different DNA molecules and let the massively parallel computing power of biochemistry do the rest and solve the problem!
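For comparison, here is a purely digital, brute-force sketch of the same kind of problem. It is not a model of the DNA protocol, only an analogue of the "generate candidate paths, then filter out the invalid ones" idea; the function name and the small example graph are assumptions for illustration.

```python
from itertools import permutations


def hamiltonian_paths(edges, n, start, end):
    """Yield every ordering of vertices 0..n-1 that is a valid directed path
    from `start` to `end` visiting each vertex exactly once."""
    edge_set = set(edges)
    for perm in permutations(range(n)):
        # Filter 1: the path must begin and end at the required vertices
        if perm[0] != start or perm[-1] != end:
            continue
        # Filter 2: every consecutive pair must be an actual edge
        if all((a, b) in edge_set for a, b in zip(perm, perm[1:])):
            yield perm


# Hypothetical directed graph on four vertices
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (2, 1)]
print(list(hamiltonian_paths(edges, 4, start=0, end=3)))
# [(0, 1, 2, 3), (0, 2, 1, 3)]
```

The loop tries candidates one after another, and the number of orderings grows factorially with the number of vertices, which is why trying vast numbers of candidates at once in a test tube was an appealing idea.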
Back in the digital world, but with strong inspiration from biology, from the anatomy of the cerebral cortex, and from many theoretical and clinical observations in neuroscience, we have Neural Networks (NN). In the area of NN, maybe worthy of special notice is Numenta's Hierarchical Temporal Memory model which, although it reproduces the [understanding we have of] the neo-cortex only very loosely, introduces the idea that the very same algorithm is applied in all areas and at all levels of the cognitive process powered by the brain, an idea largely supported by biological, anatomical and other forms of evidence.
If your question means "have biological ideas been used to optimize software?" then
Genetic programming (http://en.wikipedia.org/wiki/Genetic_programming) is one example. From the Wikipedia article:
In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. Therefore it is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task.
If your question means "what software techniques have been inspired by biology?" then
see more generally http://en.wikipedia.org/wiki/Bio-inspired_computing. I would expect that several other methods such as ant-swarms (http://en.wikipedia.org/wiki/Ant_colony_optimization) and Neural Networks (http://en.wikipedia.org/wiki/Neural_network_software) could also be used.
Artificial Neural Networks are another classic example. The software application tends to be pattern recognition and prediction of behaviour of complex systems.
Ant colony optimization, a search / optimization method, and Artificial Life like Conway's Game of Life
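Conway's Game of Life is small enough to sketch in a few lines. This is one possible implementation, assuming an unbounded grid stored as a set of live (x, y) cells; the function name is hypothetical.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count live neighbours for every cell adjacent to at least one live cell
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours, survival on 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


blinker = {(0, 1), (1, 1), (2, 1)}        # horizontal bar of three cells
print(sorted(step(blinker)))              # [(1, 0), (1, 1), (1, 2)]
print(step(step(blinker)) == blinker)     # True
```

The two local rules are all there is; the oscillating "blinker" above is about the simplest pattern that shows behaviour emerging from them.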
Most of the answers so far talk about AI. The title of your question hints at software that hides itself in order not to be detected.
We got viruses.
We got virus-hunters...
Me myself, I even hid some bugs in my own programs ... :(
Alan Kay (the object technology pioneer) spoke at length about the influence of biology in the OOP paradigm. He's got a series of ideas about how objects are like "cells" and that OOP scales in a similar way to the way that cells can scale to produce massive architectures...
You can follow quite a bit of this in his Turing Award Speech:
http://video.google.com/videoplay?docid=-2950949730059754521# -- Skip to about the 30:55 mark

Resources