Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
What I found are:
1. Naive Bayes classifier
2. K nearest neighbors classifier
3. Decision tree algorithms (C4.5, Random Forest)
4. Kernel Discriminant Analysis
5. Support vector machines
If there are others, can someone please help me with the remaining algorithms in this category? I need a complete list of supervised ML classification algorithms for academic purposes. Thank you.
Although this is an active area of research, I wouldn't say new algorithms are invented every day, not good ones anyway. The invention of a new ML algorithm that is better than the rest in even some semi-important particular cases would be pretty big news.
Usually, known algorithms are adapted to a given problem. Adapting one properly can itself be an area of research: spam classification and digit recognition, for example, are done with classical ML algorithms, but they are not trivial to perfect.
Regardless, it's hard to find a source that lists all the known, classical algorithms. There are a lot, and it's unlikely that an author somewhere lists them all. They usually list the ones they work with, or the ones they consider the most important.
That said, I'm going to try to give you a longer list, and I'm making this community wiki to encourage other people to add more.
Naive Bayes classifier
K nearest neighbors classifier
Decision tree algorithms (C4.5, Random Forest)
Kernel Discriminant Analysis
Support vector machines
Logistic Regression
Passive Aggressive Classifiers
Gaussian Processes
Neural networks
The Winnow algorithm
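To make one entry of the list concrete, here is a minimal sketch of a k nearest neighbors classifier in plain Python. The function name `knn_predict`, the choice of Euclidean distance, and the toy data are all assumptions made for illustration; real work would normally use a library implementation.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Made-up toy data: two well-separated clusters labelled "a" and "b".
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.3, 0.3)))
print(knn_predict(points, labels, (4.5, 4.7)))
```

The training step is trivial here (the data is simply stored), which is typical of this family of classifiers: all the work happens at query time.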
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
I'm new to Stack Overflow, but I'm here because I've searched everywhere and can't seem to find much info on the time complexity of A*, apart from the wiki. I would also like to compare it to Dijkstra's algorithm and see how adding a heuristic in A* improves its performance.
I know it's a very advanced topic, but I just can't fully understand it from the info given on wiki (Even the analysis of Dijkstra's algorithm on wiki seems quite advanced).
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
https://en.wikipedia.org/wiki/A*_search_algorithm
I would greatly appreciate it if anyone could explain the time complexity in more detail, or suggest any reading / learning material on the topic. I do have a good understanding of the A* algorithm, but I've just started learning the analysis thereof now.
The answer is simply: it depends. A* by itself is not a complete algorithm. A* is Dijkstra's algorithm with a heuristic that fulfills certain properties (such as the triangle inequality). You can choose different heuristic functions, which lead to different time complexities. The simplest heuristic is the straight-line distance, but there are also more advanced ones, for example the landmarks heuristic.
In the worst case you still need to explore the whole graph, so from a general analysis point of view you won't do better than Dijkstra.
However in most practical applications you can achieve much better bounds.
This only holds when you know some properties of your graph and of your heuristic function. You can then make assumptions that lead to a better complexity, but only for those instances.
For example, if you know that the straight-line distance is always the exact distance in your graph and you use a straight-line distance heuristic, then A* is guided directly to the goal and only expands nodes along a shortest path, which is the best you can hope for. However, this is much too strong an assumption for most applications. But you can see where this is going.
The bottom line is: it depends heavily on the structure of your graph and on your heuristic function.
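To see the effect of the heuristic, here is a small sketch that runs the same search twice on an empty grid, once with a zero heuristic (which is Dijkstra) and once with the Manhattan distance (A*), and counts how many nodes each one expands. The grid, the function names `grid_search`, `zero` and `manhattan`, and the unit edge costs are assumptions chosen for illustration.

```python
import heapq

def grid_search(width, height, walls, start, goal, heuristic):
    """Best-first search on a 4-connected grid with unit edge costs.

    With a heuristic that always returns 0 this is Dijkstra; with an
    admissible, consistent heuristic it is A*.
    Returns (shortest path length or None, number of expanded nodes).
    """
    best = {start: 0}
    frontier = [(heuristic(start, goal), 0, start)]  # entries are (f, g, node)
    expanded = set()
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in expanded:
            continue  # stale queue entry for an already expanded node
        expanded.add(node)
        if node == goal:
            return g, len(expanded)
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= nx < width and 0 <= ny < height) or nxt in walls:
                continue
            if g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + heuristic(nxt, goal), g + 1, nxt))
    return None, len(expanded)

def zero(node, goal):
    return 0

def manhattan(node, goal):
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

# Empty 20 x 21 grid, start and goal on the same row.
start, goal = (0, 10), (19, 10)
dijkstra_len, dijkstra_expanded = grid_search(20, 21, set(), start, goal, zero)
astar_len, astar_expanded = grid_search(20, 21, set(), start, goal, manhattan)
print("Dijkstra:", dijkstra_len, "steps,", dijkstra_expanded, "nodes expanded")
print("A*:      ", astar_len, "steps,", astar_expanded, "nodes expanded")
```

Both runs return the same path length, but here the heuristic matches the true distance along the row, so A* expands far fewer nodes. Adding walls that block the direct route makes the heuristic less informative and brings the two counts closer together.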
Here's a lecture on A star as you ask for learning material: Efficient Route Planning (A*, Landmarks, Set Dijkstra) - University of Freiburg
There is also plenty of material on the internet. The algorithm is quite popular because it is very easy to implement and, in most cases, already fast enough (for non-complex games, for example).
Closed 6 years ago.
I'm a mostly self-taught programmer in my freshman year of college, working towards a BS in CompSci. Last year I would do some of the homework for the AP CompSci kids, and when they got to sorting algorithms, I understood what they did, but my question was: what is a case where one is used? I know this may seem like a horrible or ridiculous question, but other than a few cases I can think of, I don't understand when one would use a sorting algorithm. I understand that they are essential to know, and that they are foundational algorithms. But in the day to day, when are they used?
A sorting algorithm is an algorithm that arranges a list of elements in a certain order. You use such algorithms whenever you want the elements in some order.
For example:
Sorting strings in lexicographical order. This makes several operations easier (like searching, insertion and deletion, provided an appropriate data structure is used).
Sorting integers as a preprocessing step for other algorithms. Suppose you have a lot of queries against a database to find an integer; you will want to apply binary search, and for it to be applicable the input must be sorted.
In many computational geometry algorithms (like convex hull), sorting the coordinates is the first step.
So, basically, if you want some ordering, you resort to sorting algorithms!
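The binary search case can be sketched in a few lines with Python's standard `bisect` module. The helper name `contains` and the sample values are made up for illustration.

```python
import bisect

def contains(sorted_values, target):
    """Binary search: only valid if `sorted_values` is already sorted."""
    i = bisect.bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target

values = [42, 7, 19, 3, 88, 61]  # made-up data, initially unordered
values.sort()                    # pay the sorting cost once ...

# ... then answer each query in logarithmic time instead of scanning the list.
print(contains(values, 19))
print(contains(values, 20))
```

Sorting once costs O(n log n), after which each lookup is O(log n), so the preprocessing pays off as soon as there are many queries.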
Closed 6 years ago.
This is a variant question from the Elements of Programming Interviews and doesn't come with a solution.
How can you compute the smallest number of queens that can be placed to attack each uncovered square?
The problem is about finding a minimum dominating set in a graph (in your case the queen graph, http://mathworld.wolfram.com/QueenGraph.html). The general problem is NP-hard. Even if the restriction to this specific kind of graph is unlikely to be NP-hard, you should not expect to find an efficient (polynomial) algorithm, and indeed to date nobody has found one.
As an interview question, I think an acceptable answer would be a backtracking algorithm. You can add small improvements, such as always stopping the search once you have already put n-2 queens on the board.
For more information, pseudo-code of the algorithm and more sophisticated algorithms, I suggest reading:
Fernau, H. (2010). Minimum dominating set of queens: A trivial programming exercise? Discrete Applied Mathematics, 158(4), 308-318.
http://www.sciencedirect.com/science/article/pii/S0166218X09003722
The simplest way is probably exhaustive searching with 1,2,3... queens until you find a solution. If you take the symmetries of the board into account you will only need ~10^6 searches to confirm that 4 queens is not enough (at that point you could use the same search until you find a solution for 5 queens or alternately, use a greedy algorithm for 5 queens to find a solution faster).
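The exhaustive search described above can be sketched as follows. This version tries 1, 2, 3, ... queens and returns the first placement that covers every square; it does not exploit the symmetries of the board, and the function names `attack_masks` and `min_dominating_queens` are assumptions for illustration. A square counts as covered if a queen stands on it or attacks it.

```python
from itertools import combinations

def attack_masks(n):
    """For each square, a bitmask of the squares a queen there covers (itself included)."""
    masks = []
    for r in range(n):
        for c in range(n):
            mask = 0
            for rr in range(n):
                for cc in range(n):
                    # Same row, same column, or same diagonal.
                    if rr == r or cc == c or abs(rr - r) == abs(cc - c):
                        mask |= 1 << (rr * n + cc)
            masks.append(mask)
    return masks

def min_dominating_queens(n):
    """Smallest k, with one placement, such that k queens cover the n x n board."""
    masks = attack_masks(n)
    full = (1 << (n * n)) - 1
    for k in range(1, n * n + 1):
        for squares in combinations(range(n * n), k):
            covered = 0
            for s in squares:
                covered |= masks[s]
            if covered == full:
                # Convert flat indices back to (row, column) pairs.
                return k, [divmod(s, n) for s in squares]

for size in (3, 4, 5):
    count, placement = min_dominating_queens(size)
    print(size, count, placement)
```

Small boards finish quickly. For the 8 x 8 board the same code works in principle, but it has to rule out every placement of up to 4 queens before trying 5, so pruning by symmetry or backtracking is worth adding there.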
Closed 8 years ago.
In usual circumstances, sorting arrays of ~1000s of simple items like integers or floats is fast enough that the small differences between implementations just don't matter.
But what if you need to sort N modest-sized arrays that have been generated by some similar process or are simply related to each other in some way?
I leave the specifics of the mysterious array generator and of the relationships between the generated arrays intentionally vague. It is up to any applicable algorithm to specify as large a domain as possible in which it works and where it is most useful.
EDIT: Let's narrow this by letting the arrays be independent samples. There exists an unchanging probability distribution of arrays that will be generated. Implicitly, then, there is a stable probability distribution of elements in the arrays, but it is conditional: the elements within an array might not be independent. It seems like it would be extremely hard to make use of relationships between elements within the arrays, but I could be wrong. We can narrow further if needed by letting the elements in the arrays be independent. In that case the problem is to effectively learn and use the probability distribution of elements in the arrays.
Here is a paper on a self improving sorting algorithm. I am pretty strong with algorithms and machine learning, but this paper is definitely not an easy read for me.
The abstract says:
We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an arbitrary, unknown input distribution. We give such self-improving algorithms for sorting and clustering. The highlights of this work:
a sorting algorithm with optimal expected limiting running time ...
In all cases, the algorithm begins with a learning phase during which it adjusts itself to the input distribution (typically in a logarithmic number of rounds), followed by a stationary regime in which the algorithm settles to its optimized incarnation.
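To give a feel for the "learning phase, then stationary regime" idea, here is a deliberately simplified sketch. It is not the algorithm from the paper and does not achieve its bounds: it only learns bucket boundaries from a few sample arrays and then uses them to pre-partition later arrays before sorting each bucket. The class name `LearnedBucketSorter` and all parameters are assumptions for illustration.

```python
import bisect
import random

class LearnedBucketSorter:
    """Toy two-phase sorter: learn bucket boundaries, then reuse them."""

    def __init__(self, num_buckets=16):
        self.num_buckets = num_buckets
        self.boundaries = []  # sorted split points learned from samples

    def learn(self, training_arrays):
        # Learning phase: pool the sample elements and take evenly spaced
        # quantiles as bucket boundaries.
        pooled = sorted(x for arr in training_arrays for x in arr)
        if not pooled:
            self.boundaries = []
            return
        step = len(pooled) / self.num_buckets
        self.boundaries = [pooled[int(i * step)] for i in range(1, self.num_buckets)]

    def sort(self, arr):
        # Stationary phase: route each element to its bucket, sort the
        # (hopefully small) buckets, and concatenate them in order.
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for x in arr:
            buckets[bisect.bisect_right(self.boundaries, x)].append(x)
        result = []
        for bucket in buckets:
            bucket.sort()
            result.extend(bucket)
        return result

random.seed(0)
# Made-up generator: every array is drawn from the same distribution.
draw = lambda: [random.gauss(100.0, 15.0) for _ in range(1000)]

sorter = LearnedBucketSorter()
sorter.learn([draw() for _ in range(5)])
sample = draw()
print(sorter.sort(sample) == sorted(sample))
```

The result is correct for any input because the boundaries only decide how elements are partitioned; a poorly learned distribution costs speed, not correctness.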
Closed 9 years ago.
I am a little confused about the nature of heuristics.
We know that heuristics need not give correct outputs for all input instances.
But then, why are heuristics proposed at all?
Heuristics are used to trade off performance (usually execution speed, but also memory consumption) against potential accuracy or generality. For example, your antivirus software uses heuristics to characterize what a virus might look like, and can use that information to decide which files it should spend more time analyzing. A good heuristic has the property that it can save substantial time at minimal cost.
In graph traversal theory, a heuristic for an A* search algorithm need not be perfect. Its predicted cost function h(x) just needs to be less than or equal to the true cost to the goal state in order to guarantee an optimal solution. The closer h(x) is to the true cost, the quicker an optimal solution will be found.
Let me give you an example which might help you understand the importance of heuristics.
In Artificial Intelligence, search problems are mainly classified as blind search and directed search. Blind search is where you use algorithms such as BFS and DFS, and there is a reason they are called blind: they have no knowledge about the direction you should go, so you just have to explore and explore until you reach the goal node. Imagine the time and space complexity of those algorithms.
Now look at a directed search algorithm such as A*. There you have some kind of heuristic function, or in simple terms an assumption about the direction in which you should take the next step.
Heuristics do not guarantee the best result, but they will try to give you a better solution, and sometimes even the best one. There are many classes of problems (e.g. the games you play) where a better solution does the job, rather than wasting so much time and space on finding the best solution.
I hope it helps.