Are the classification algorithms in GIS/Remote Sensing software also machine learning algorithms? If not, what is the difference?
As far as I understand, classic GIS/RS algorithms do not fall under the term machine learning, because the classification criteria are still user-picked. Machine learning algorithms not only set the classification thresholds themselves but also look for useful classification criteria on their own.
Keep in mind that I might be wrong as this is just my understanding of things :)
Related
I am just starting out studying machine learning and am currently doing Andrew Ng's course on Coursera. I am going through the course but am a bit lost. It will make studying all those algorithms and the theory a lot more rewarding if I can see some use cases for them.
For example, the first topic I read about was gradient descent, then linear regression and logistic regression. Are these used directly in practice, or are other algorithms like k-means and kernel density estimation used instead? I guess I am trying to get real-world (software engineering, data mining) examples of these topics. Can someone suggest a post that explains the usage of some machine learning algorithm(s)? It would be greatly helpful.
The No Free Lunch theorem states that if algorithm A outperforms algorithm B on some problems, then, loosely speaking, there must exist exactly as many other problems where B outperforms A.
So it is difficult to link an algorithm to a particular use case.
If you are just looking for use cases for machine learning algorithms, visit https://www.kaggle.com/wiki/DataScienceUseCases
Update: I just came across http://pkghosh.wordpress.com. Check it out (use cases with algorithms).
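To make the topics in the original question concrete: gradient descent and linear/logistic regression really are used directly in practice (logistic regression, for instance, is a very common baseline for spam filtering and click prediction). Below is a minimal sketch, in plain NumPy, of batch gradient descent fitting a simple linear regression; the toy data, learning rate and iteration count are assumptions for illustration only, not from any of the linked resources.

```python
# Minimal sketch: batch gradient descent fitting y ~ w*x + b on made-up data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1, size=100)   # "true" slope 3, intercept 2, plus noise

w, b = 0.0, 0.0
learning_rate = 0.01
for _ in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error cost with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")   # should land near 3 and 2
```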
I am trying to get a grasp on the efficiency of neural networks compared with other artificial intelligence algorithms for use in intrusion detection systems (IDS). Most of the literature I'm reading doesn't give a good comparison of neural networks against other IDS approaches.
Do they work better (detect more true attacks with fewer false positives)? Are they more or less efficient?
Another question: how new are NNs in IDS environments? Are they widely used, or is this old news?
It seems like you're asking the question:
Will this algorithm help me reliably detect when an "intrusion" has happened?
Looking at some of the criticism of neural networks, it seems that NNs can be over-trained (which is possible with any AI algorithm); this can be mitigated by using k-fold cross-validation. NNs are also hard to interpret: it is difficult to explain why a NN gave the result that it did.
Is this a research problem that you are working on?
Initially, I'd look at Naive Bayes to solve this problem because 1) it is easy to implement and 2) it serves as a good baseline. Also, look at decision trees as a solution to your problem.
After implementing NB and DT, implement the NN and see if the NN does better.
You could also try an ensemble technique and see if that gives you better results.
There is a Java-based package called Weka which implements many of the algorithms I've discussed and could be valuable to you.
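If you end up prototyping in Python instead of Weka, here is a rough sketch of the workflow suggested above (Naive Bayes and a decision tree as baselines, then a neural network, all scored with k-fold cross-validation). scikit-learn and the synthetic, imbalanced dataset are my own assumptions for illustration, not something mentioned in this thread.

```python
# Baselines first, then a neural network, all evaluated with 5-fold cross-validation.
# Synthetic data stands in for real intrusion-detection features.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)  # imbalanced, like attack vs. normal traffic

models = {
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(max_depth=8, random_state=0),
    "Neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                                    random_state=0),
}

for name, model in models.items():
    # Cross-validation guards against judging an over-trained model by its training fit
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```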
I am also new to NNs. I think you can use the Encog Neural Network Library to implement NN algorithms. It is available in both Java and C#.
We know there are something like a thousand classifiers; recently I was told that some people say AdaBoost is like the off-the-shelf one.
Are there better algorithms (with that voting idea)?
What is the state of the art in classifiers? Do you have an example?
First, AdaBoost is a meta-algorithm which is used in conjunction with (on top of) your favorite classifier. Second, classifiers which work well in one problem domain often don't work well in another. See the No Free Lunch Wikipedia page. So, there is not going to be AN answer to your question. Still, it might be interesting to know what people are using in practice.
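To illustrate the "meta-algorithm on top of a base classifier" point, here is a minimal sketch using scikit-learn (an assumption on my part; any library with a boosting implementation would do):

```python
# AdaBoost repeatedly re-weights the training data, fits copies of a weak base
# classifier, and combines them by weighted vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

stump = DecisionTreeClassifier(max_depth=1, random_state=0)   # weak base learner
print("single stump:", cross_val_score(stump, X, y, cv=5).mean())

# 200 boosted stumps, combined by weighted majority vote.
# (Depth-1 trees are also AdaBoostClassifier's default base learner.)
boosted = AdaBoostClassifier(n_estimators=200, random_state=0)
print("boosted stumps:", cross_val_score(boosted, X, y, cv=5).mean())
```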
Weka and Mahout aren't algorithms... they're machine learning libraries. They include implementations of a wide range of algorithms. So, your best bet is to pick a library and try a few different algorithms to see which one works best for your particular problem (where "works best" is going to be a function of training cost, classification cost, and classification accuracy).
If it were me, I'd start with naive Bayes, k-nearest neighbors, and support vector machines. They represent well-established, well-understood methods with very different tradeoffs. Naive Bayes is cheap, but not especially accurate. K-NN is cheap during training but (can be) expensive during classification, and while it's usually very accurate it can be susceptible to overtraining. SVMs are expensive to train and have lots of meta-parameters to tweak, but they are cheap to apply and generally at least as accurate as k-NN.
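A quick way to see those tradeoffs yourself is to time training and prediction for each method and compare accuracy; the sketch below uses scikit-learn and synthetic data purely as an illustrative assumption, and the exact numbers will vary.

```python
# Compare training cost, prediction cost, and accuracy for the three suggested
# starting points. The usual shape of the tradeoff (NB cheap, k-NN slow to
# predict, SVM slow to train) tends to show up even on toy data.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=5000, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("Naive Bayes", GaussianNB()),
                    ("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                    ("SVM (RBF)", SVC())]:
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    accuracy = model.score(X_test, y_test)
    predict_time = time.perf_counter() - t0

    print(f"{name}: train {train_time:.3f}s, predict {predict_time:.3f}s, "
          f"accuracy {accuracy:.3f}")
```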
If you tell us more about the problem you're trying to solve, we may be able to give more focused advice. But if you're just looking for the One True Algorithm, there isn't one -- the No Free Lunch theorem guarantees that.
Apache Mahout (open source, Java) seems to be picking up a lot of steam.
Weka is a very popular and stable machine learning library. It has been around for quite a while and is written in Java.
Hastie et al. (2013, The Elements of Statistical Learning) conclude that the gradient boosting machine is the best "off-the-shelf" method, independent of the problem you have.
Definition (see page 352):
An "off-the-shelf" method is one that can be directly applied to the data without requiring a great deal of time-consuming data preprocessing or careful tuning of the learning procedure.
And an older assessment:
In fact, Breiman (NIPS Workshop, 1996) referred to AdaBoost with trees as the “best off-the-shelf classifier in the world” (see also Breiman (1998)).
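For completeness, a gradient boosting machine is only a few lines in most modern libraries; the sketch below uses scikit-learn's GradientBoostingClassifier with near-default settings as one assumed implementation, not something the book itself prescribes.

```python
# Gradient boosting: fit many shallow trees in sequence, each one correcting the
# errors of the ensemble built so far. Near-default settings already work
# reasonably well, which is what "off-the-shelf" refers to.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
print("5-fold CV accuracy:", cross_val_score(gbm, X, y, cv=5).mean())
```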
I'm wondering if anyone knows of any software techniques that take advantage of biology. For example, in the robotics world there are tons, but what about software?
Many concepts originally observed in biology have been used in software, for example genetic algorithms (GA).
Artificial life (AL) exposes/uses several principles of biology such as resilience to imperfect code snippets, addressing by content, imperfect reproduction (in some implementations also sexual, i.e. multi-organism-driven, reproduction) and a non-goal-driven utility function. An interesting result of AL is the spontaneous emergence of macro-level phenomena observed in domains such as ecology or epidemiology (domains largely influenced by biology), such as the appearance of parasites, and even of organisms which take advantage of parasites, or subtle predator-prey relationships.
Maybe software can be said to have gone "full circle" with some experiments in computing which involve real (carbon-based) DNA (or RNA) molecules! The original experiment in this area (PDF link) by Prof. Adleman (of RSA fame) coded the various elements of a graph problem (a Hamiltonian path problem) with different DNA molecules and let the massively parallel computing power of bio-chemistry do the rest and solve the problem!
Back in the digital world, but with a strong inspiration from biology, and indeed from the anatomy of the cerebral cortex and from many theoretical and clinical observations in the neuroscience field, we have neural networks (NN). In the area of NN, perhaps worthy of special notice is Numenta's Hierarchical Temporal Memory model which, although it reproduces [our understanding of] the neo-cortex only very loosely, introduces the idea that the very same algorithm is applied in all areas and at all levels of the cognitive process powered by the brain, an idea largely supported by biological, anatomical and other forms of evidence.
If your question means "have biological ideas been used to optimize software?" then
Genetic programming (http://en.wikipedia.org/wiki/Genetic_programming) is one example. From the Wikipedia article:
In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. Therefore it is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task.
If your question means "what software techniques have been inspired by biology?" then
see more generally http://en.wikipedia.org/wiki/Bio-inspired_computing. I would expect that several other methods such as ant-swarms (http://en.wikipedia.org/wiki/Ant_colony_optimization) and Neural Networks (http://en.wikipedia.org/wiki/Neural_network_software) could also be used.
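To make the genetic programming idea above concrete, here is a minimal sketch in Python: it evolves small arithmetic expression trees to fit a hidden target function. The representation, the target x² + x, and the use of mutation only (real GP also uses subtree crossover) are all simplifying assumptions for illustration.

```python
# Minimal genetic-programming sketch: evolve expression trees built from
# +, -, *, the variable x and constants so that they approximate x**2 + x.
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def random_tree(depth=3):
    """Grow a random expression: a leaf ('x' or a constant) or (op, left, right)."""
    if depth <= 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.uniform(-2, 2)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return tree  # numeric constant

def fitness(tree):
    """Lower is better: squared error against the hidden target on a grid."""
    xs = [i / 10 for i in range(-20, 21)]
    return sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in xs)

def mutate(tree, depth=3):
    """Replace a randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(depth)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left, depth - 1), right)
    return (op, left, mutate(right, depth - 1))

random.seed(0)
population = [random_tree() for _ in range(200)]
for generation in range(40):
    population.sort(key=fitness)
    survivors = population[:50]   # truncation selection: keep the fittest programs
    population = survivors + [mutate(random.choice(survivors)) for _ in range(150)]

best = min(population, key=fitness)
print("best program:", best, "squared error:", fitness(best))
```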
Artificial Neural Networks are another classic example. The typical software applications are pattern recognition and prediction of the behaviour of complex systems.
Ant colony optimization (a search/optimization method), and artificial life such as Conway's Game of Life.
Most of the answers so far talk about AI. The title of your question hints at software that hides itself in order not to be detected.
We've got viruses.
We've got virus hunters...
I myself have even hidden some bugs in my own programs... :(
Alan Kay (the object technology pioneer) spoke at length about the influence of biology on the OOP paradigm. He has a series of ideas about how objects are like "cells" and how OOP scales similarly to the way cells can scale to produce massive architectures...
You can follow quite a bit of this in his Turing Award Speech:
http://video.google.com/videoplay?docid=-2950949730059754521# -- Skip to about the 30:55 mark
Does anybody have a good source of software or tutorials about genetic engineering simulation?
Maybe open-source software for gene splicing / cloning simulation?
Thanks
This might be up your alley:
Genetic Programming - Evolution of Mona Lisa by Roger Alsing.
Mona Lisa Source Code and binaries
Mona Lisa FAQ
All very interesting and impressive stuff.
This may not be exactly what you're after, but have you looked at Spore?
It's quite possible that there is no such thing as "serious" genetic engineering simulation software. I think it would be a very difficult concept to implement.
Excellent question!
Clearly genetic engineering can't be simulated at an analytic level - i.e. by modelling the mechanisms involved - because some of the processes, such as protein folding, DNA transcription, etc., are too complex and haven't been modelled yet.
They are still working on a map of the human genome, and I'm sure they haven't gotten around to anything else, because anything of reasonable complexity isn't that much less complex (in terms of number of genes) than a human. For example, you can save 2% in complexity, but then you have a chimp, not a human.
However, there might be possible approaches based on empirical methods - i.e. "best guess" methods, or experimentation. Thus, you might look at recent work in machine learning or data mining for possible approaches. There are a lot of data mining and machine learning methods out there, ranging from genetic algorithms and neural networks to decision trees, SVMs, etc.
http://en.wikipedia.org/wiki/Data_mining
http://en.wikipedia.org/wiki/Machine_learning