Which algorithms can be used for multiclass classification, where each row is classified into one of multiple classes?
For multiclass classification you can use algorithms such as:
KNN
Decision Tree
Naive Bayes
Random Forest
And if you are working with time-series data, an LSTM would be a good choice.
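The algorithms listed above can be compared directly in scikit-learn. A minimal sketch (the synthetic dataset and default hyperparameters are illustrative assumptions, not a recommendation):

```python
# Compare the listed multiclass classifiers on a synthetic 3-class dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "kNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Each classifier handles multiple classes natively; score is accuracy.
    print(f"{name}: {model.score(X_test, y_test):.2f}")
```

All four estimators support more than two classes out of the box, so the same fit/score loop works for each of them.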
I am new to heuristic optimisation methods and I am learning about the different algorithms in this space, such as Genetic Algorithms (GA), PSO, DE, CMA-ES, etc. The general flow of these algorithms seems to be: initialise a population; select, crossover, and mutate to update it; evaluate; and repeat the cycle. The initial population-creation step in a genetic algorithm seems to be that each member of the population is encoded by a chromosome, a bitstring of 0s and 1s, and all the other operations are then performed on it. GA has simple population-update methods such as mutation and crossover, but the update methods differ in the other algorithms.
My question is: do all the other heuristic algorithms also initialise the population as bitstrings of 0s and 1s, or do they use ordinary (e.g. real or natural) numbers?
The representation of individuals in evolutionary algorithms (EAs) depends on the representation of a candidate solution. If you are solving a combinatorial problem, e.g. the knapsack problem, the final solution is a (0,1) string, so it makes sense to use a binary representation in the EA. However, if you are solving a continuous black-box optimisation problem, then it makes sense to use a representation with continuous decision variables.
In the old days, GA and other algorithms used only a binary representation, even for solving continuous problems. But nowadays, all the algorithms you mentioned have their own binary, continuous, and other variants. For example, PSO is known as a continuous problem solver, but to update the individuals (particles) in binary form there are mapping strategies, such as S-shaped or V-shaped transfer functions, that produce the binary individuals for the next iteration.
My two cents: the choice of algorithm depends on the type of problem, and I personally would not recommend a binary PSO as a first attempt at solving a problem. Maybe there are hidden benefits, but they need investigation.
Please feel free to extend your question.
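The S-shaped transfer function mentioned above can be sketched in a few lines. This is only an illustration of the mapping idea, assuming the standard binary-PSO recipe: the sigmoid turns a continuous velocity into a probability, and each bit is resampled from that probability.

```python
# Sketch of the S-shaped (sigmoid) transfer used by binary PSO variants.
import numpy as np

rng = np.random.default_rng(0)

def s_shape(v):
    # Sigmoid maps a real-valued velocity to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-v))

def update_binary_position(velocity):
    # Each bit becomes 1 with probability s_shape(velocity[i]).
    return (rng.random(velocity.shape) < s_shape(velocity)).astype(int)

velocity = np.array([-2.0, 0.0, 2.0])
position = update_binary_position(velocity)
print(position)  # a vector of 0s and 1s
```

The velocities themselves are still updated with the usual continuous PSO rule; only the position update is mapped back into {0, 1}.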
Which clustering machine learning algorithm is best for clustering one-dimensional numerical features (scalar values)?
Is it Birch, Spectral clustering, k-means, DBSCAN...or something else?
All of these methods are better suited to multivariate data. Except for k-means, which historically was also used on one-dimensional data, they were all designed with the multivariate problem in mind, and none of them is well optimised for the particular case of 1-dimensional data.
For one-dimensional data, use kernel density estimation. KDE is a nice technique in 1d, has a strong statistical support, and becomes hard to use for clustering in multiple dimensions.
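A minimal sketch of this idea (the two-mode synthetic data and the bandwidth are illustrative assumptions): estimate the density with scikit-learn's `KernelDensity`, then split clusters at the local minima of the estimated density.

```python
# KDE-based 1-d clustering: cut the data at local minima of the density.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 0.3, 100), rng.normal(5.0, 0.3, 100)])
grid = np.linspace(data.min() - 1, data.max() + 1, 500)

kde = KernelDensity(bandwidth=0.4).fit(data[:, None])
density = np.exp(kde.score_samples(grid[:, None]))  # score_samples is log-density

# Interior local minima of the density act as cluster boundaries.
interior = (density[1:-1] < density[:-2]) & (density[1:-1] < density[2:])
minima = grid[1:-1][interior]
labels = np.searchsorted(minima, data)  # cluster index for each point
print("cut points:", minima)
```

The bandwidth plays the role of a smoothing parameter: too small and spurious minima split real clusters, too large and neighbouring clusters merge.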
Take a look at the k-means clustering algorithm. It works really well for clustering one-dimensional feature vectors. However, k-means does not cope well with outliers in your training dataset, in which case you can use some more advanced algorithms.
I'd suggest that before implementing a machine learning algorithm (classification, clustering, etc.) for your dataset and problem statement, you use the Weka Toolkit to check which algorithm best fits your problem. The Weka toolkit is a collection of a large number of machine learning and data mining algorithms that can easily be applied to a given question. Once you have identified which algorithm works best for your problem, you can modify or write your own implementation of it, and by tweaking it you may achieve even better accuracy. You can download Weka from the project's website.
I'm currently working on a machine learning project for my Artificial Intelligence exam. The goal is to choose two classification algorithms to compare using WEKA, bearing in mind that these two algorithms must be different enough to make the comparison meaningful. Besides, the algorithms must handle both nominal and numeric data (I suppose this is mandatory for the comparison to be made).
My professor suggested choosing a statistical classifier and a decision tree classifier, for example, or delving into a comparison between a bottom-up classifier and a top-down one.
Since I have very little experience in the machine learning field, I am doing some research on the various algorithms WEKA offers, and I stumbled upon kNN, that is, the k-nearest neighbors algorithm.
Is it statistical? And could it be compared with a Decision Stump algorithm, for example?
Or else, can you suggest a couple of algorithms that match these requirements I have pointed out above?
P. S.: Handled data must be both numerical and nominal. On WEKA there are numerical/nominal features and numerical/nominal classes. Do I have to choose algorithms with both numerical/nominal features AND classes or just one of them?
I would really appreciate any help guys, thanks for your patience!
Based on your professor's description, I would not consider k-Nearest Neighbors (kNN) a statistical classifier. In most contexts, a statistical classifier is one that generalizes via statistics of the training data (either by using statistics directly or by transforming them). An example of this is the Naïve Bayes Classifier.
By contrast, kNN is an example of Instance-Based Learning. It doesn't use statistics of the training data; rather, it compares new observations directly to the training instances to perform classification.
With regard to the comparison: yes, you can compare the performance of kNN with a Decision Stump (or any other classifier). Since any two supervised classifiers yield classification accuracies with respect to your training/testing data, you can compare their performance.
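WEKA's Experimenter does this kind of comparison for you, but the same idea can be sketched in scikit-learn (an assumption for illustration; a depth-1 decision tree stands in for WEKA's DecisionStump, and the Iris dataset is just a convenient example):

```python
# Compare an instance-based learner (kNN) against a one-split stump.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3)    # instance-based learner
stump = DecisionTreeClassifier(max_depth=1)  # one-split "decision stump"

for name, clf in [("kNN", knn), ("stump", stump)]:
    # 10-fold cross-validation gives comparable accuracy estimates.
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

On a three-class problem a single split cannot separate all classes, so the stump's ceiling is well below the kNN accuracy, which is exactly the kind of contrast the assignment asks for.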
I have a dataset of student profiles (age, sex, address, etc.) with a performance score (1 the worst, 5 the best).
I would like to know which data mining algorithm would be best for determining the profile of the students with a performance greater than 4.
So far, I have thought about clustering algorithms (k-means, ...), but these are unsupervised, so it is difficult to pin down a cluster that contains only students with the desired performance. Do you have any suggestions? Is there a better algorithm for achieving this objective? Thanks!!
This does not sound like a clustering problem to me.
Instead, you are looking for a decision tree, on the target variable "grade > 4".
Decision trees, neural networks, and SVMs can be applied to characterise high-performing students. There is no guarantee of perfect classification; you can judge the quality of the model from its accuracy measures.
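A hedged sketch of the decision-tree approach suggested above: binarise the target as "performance > 4" and inspect the learned rules. The column names and the tiny dataset are invented for illustration only.

```python
# Learn an interpretable profile of students with performance > 4.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame({
    "age":   [17, 18, 17, 19, 18, 20, 17, 19],
    "hours": [2, 10, 3, 12, 1, 11, 2, 9],  # weekly study hours (assumed feature)
    "performance": [2, 5, 3, 5, 1, 5, 2, 5],
})
X = df[["age", "hours"]]
y = (df["performance"] > 4).astype(int)  # the target "grade > 4"

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# The printed rules directly describe the profile of high performers.
print(export_text(tree, feature_names=["age", "hours"]))
```

The advantage over clustering is that the tree is supervised by the very label you care about, and its printed rules are the "profile" you asked for.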
Does the number of training documents have any impact on classification time? I know that for k-NN all of the computation happens at classification time, while little or no work is done during training. Is the same true for SVM, Naive Bayes, decision trees, etc.?
Only lazy classifiers have this characteristic; k-NN is one of them.
SVM - classification time depends on the number of support vectors, which may, but need not, depend on the number of training documents (the training-set size is an upper bound on the number of SVs).
Naive Bayes - there is no impact, unless the new documents bring in many new words: NB classification time is O(number of features), so as long as you do not enlarge the vocabulary (in the case of a BOW model), you can safely use more training data.
Decision tree - the same as for NB: classification time depends only on the number of features (and the complexity of the problem, which does not change with the number of instances).
Neural network - here classification time depends only on the number of neurons (the network architecture), not on the training-set size.
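The lazy-versus-eager contrast above can be observed empirically. A rough sketch (the dataset sizes and timing setup are illustrative assumptions, not a rigorous benchmark): kNN's prediction time grows with the training set, while Naive Bayes' stays roughly flat.

```python
# Measure prediction time vs. training-set size for a lazy learner
# (kNN) and an eager one (Gaussian Naive Bayes).
import time
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_test = rng.normal(size=(200, 20))

for n_train in (1_000, 10_000):
    X = rng.normal(size=(n_train, 20))
    y = rng.integers(0, 2, size=n_train)
    for clf in (KNeighborsClassifier(), GaussianNB()):
        clf.fit(X, y)
        start = time.perf_counter()
        clf.predict(X_test)  # only prediction is timed, not fitting
        elapsed = time.perf_counter() - start
        print(f"{type(clf).__name__} n_train={n_train}: {elapsed:.4f}s")
```

Exact timings depend on the machine and on kNN's index structure (e.g. a k-d tree), but the qualitative trend matches the answer: only the lazy learner pays at prediction time for a bigger training set.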