Ordered Sequence of learning algorithm design techniques - algorithm

So, I am taking an algorithms course this semester.
I have a basic understanding of design techniques and know that divide and conquer should be the first technique to learn.
But when it comes to backtracking, dynamic programming and greedy techniques, I am not sure how to choose an appropriate order, even though my course is structured in the order I described above.
Any suggestions?

I doubt you will lose anything by simply following the order that your course suggests. I notice that TutorialsPoint presents these topics in a different order, and adds several other techniques. I think you will gain a lot by simply working your way through such material.
It's likely that your learning style is different from mine, but I find it very beneficial to do a breadth-first survey of a topic, happily accepting that I won't understand every detail the first time round, then drill down as my understanding increases. So I don't feel the need for a detailed order of study.


Fast Downward planner

I am currently studying the Fast Downward AI planner, and I would like some help in this area. I know that the planner receives a domain.pddl file and a problem.pddl file; in addition, it receives a search algorithm and a heuristic function.
Many planners (not just Fast Downward - e.g. the pyperplan planner) give us the opportunity to modify or create new search algorithms to reach a solution. But as I have seen, there are already so many search algorithms.
My question is: what is the point of implementing our own search algorithm? Or am I missing something?
I'm not sure what your question is, so I'm going to give two different answers.
Why do planning systems like Fast Downward have the option to write your own search algorithms, heuristics, etc.?
Automated domain-independent planning is an active research area where new ideas are constantly being developed (for example at ICAPS). Implementing a new idea to evaluate it is much easier if you can base the implementation on an existing framework than if you have to start from scratch every time. It also helps with comparability. For example, if you develop a new search algorithm but leave the heuristic the same, your implementation is much easier to compare to a baseline if the baseline uses the same heuristic implementation. That is why a lot of work is based on Fast Downward and similar frameworks.
How do I come up with an idea for a new search algorithm?
This is much harder to answer. As a general approach I would say: Try to find cases where existing search algorithms "don't get it", for example, a problem that you can solve by looking at it but the search algorithm fails to solve it. Then try to figure out what you did to solve it, generalize that idea so it works on other cases as well and write it down as an algorithm.

What is the 'predictive' element of machine learning

I'm hoping someone with a lot more knowledge of machine learning can help me out here. I've been reading examples of regression and classification and I always seem to come back to the question 'what is really the difference between what this algorithm is doing and what standard statistical analysis would do'.
Specifically, none of the examples I read seem to discuss the predictive element. For example, when looking at linear regression the articles commonly explain the concept of finding a 'best fit' - choosing a linear equation and then iterating a cost function until it reaches a minimum. Of course, throughout a lot of emphasis is put on a 'training data set'. No problem... but this is usually where it ends. At this point I can't see the difference between the above and the standard way in which one would carry out statistical analysis on a data set that was assumed to have a linear relationship. Presumably, future values are 'predicted' from the equation that was produced when the cost function converged on a minimum - again, there doesn't seem to be much 'learning' here, as this is exactly what would be done in the usual case.
After a long-winded intro... what I'm trying to ask is: how has the algorithm learned from the original training data, and how does this training set help with future data sets? (Again, this is where I get a bit lost - to me it seems that you would give it a new data set and carry out the same task of minimising the cost function; this time you have a better 'starting' point, but all of your knowledge really comes from what you already 'knew' about the dataset, i.e. that one assumed a linear relationship.)
I hope this makes sense - it's clearly a lack of understanding, but I'm hoping someone can shove me in the right direction.
Thanks!
You are right, there is no difference. Linear regression is purely a statistical method, and "fitting" would probably be more accurate than "learning" in this case. But again, this is usually just the first lecture on the subject. There are many approaches where the differences are much clearer, for example SVMs. There are also approaches where the "learning" aspect is much clearer, e.g. using reinforcement learning in games, where you can actually see your system improve its performance with experience.
Anyway, the main subject of machine learning is learning from examples. You are given a list of 100 patients, along with blood pressure, age, cholesterol level, etc., and for each of them you are told whether they have heart disease or not. Then you are given a patient that you have not seen before. Do they have heart disease? Most people call this prediction. You might prefer to call it fitting, or anything else. But the fact is, it usually works quite well.
Still, the subject remains closely tied to statistics, and indeed, you need to make some assumptions (to a larger or smaller extent, depending on the algorithm) about the underlying function. It is not perfect, but in many cases it's the best thing we have, so I would say it is worth studying. If you are starting now, there is a great online course, Stanford's "Statistical Learning", which deals with the subject from your point of view.
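To make the fit-versus-predict split concrete, here is a minimal sketch with made-up data. It uses a closed-form least-squares fit rather than iterating a cost function, but the point is the same: the parameters are estimated from the training set once, and new inputs are then pushed through those already-learned parameters without any refitting.

```python
import numpy as np

# Made-up training data for illustration: a noisy linear relationship.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=50)
y_train = 3.0 * x_train + 2.0 + rng.normal(0, 1, size=50)

# "Learning"/fitting: minimise squared error over the training set
# (closed-form least squares here instead of an iterative cost function).
A = np.column_stack([x_train, np.ones_like(x_train)])
(slope, intercept), *_ = np.linalg.lstsq(A, y_train, rcond=None)

# "Prediction": new inputs go through the *already learned* parameters.
# Nothing is minimised again for the new data.
x_new = np.array([2.5, 7.0])
y_pred = slope * x_new + intercept
print(slope, intercept, y_pred)
```

That reuse of the fitted parameters on data the model has never seen is all that 'prediction' means in this simple case.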

Algorithm for clustering people with similar interests

I want to cluster people into groups based on their interests. For example, people who like machine learning and graphs may be placed in one group, and people who are interested in mathematics and economics may be placed in a different group.
The algorithm should be able to decide which people have the most closely matching interests, based on the interests of the people, and create the clusters. It should also be able to output the other people in the group in which a particular person is placed.
This does not sound like a particularly difficult clustering problem, and any of the off-the-shelf clustering algorithms will probably work well. If you know how many clusters you want, then try k-means or k-medoid clustering. If you don't know how many clusters, then try agglomerative clustering.
The difficult part of the problem will be the features. You mentioned that 'interests' could be used as the features upon which to cluster, but feature engineering and selection will always involve some trial and error.
Without more context of your problem, I can't really give a definite answer. Most clustering algorithms will work though, the problem is how "good" are your results. I'm quoting the word "good" because you'll need some sort of metric to measure that (generally inter-cluster and intra-cluster distance).
Here's the advice I was given when I was taught how to decide on an algorithm for data mining: try the simplest algorithms first - quite often these are overlooked but perform quite well (Naive Bayes for supervised learning is a classic example).
To start you off, try something like k-means, which is a simple and popular method; you can find more info here http://en.wikipedia.org/wiki/K-means_clustering (if you look at the Software section you can also find a list of implementations that you could try).
The second part of the criteria is to be able to output the other people in the group based on a target person. This is doable with all clustering algorithms: since you'll have X subsets of people, you simply need to find the subset that the target person is in, then iterate over that subset and print out all the people in it.
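To illustrate both parts (clustering on interest features, then listing the rest of a target person's group), here is a rough sketch using scikit-learn's k-means. The people, interests and one-hot feature encoding below are all made up for the example; they are just one possible feature choice, not a recommendation.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up people and interests, purely for illustration.
interests = ["machine learning", "graphs", "mathematics", "economics"]
people = {
    "alice": {"machine learning", "graphs"},
    "bob":   {"machine learning"},
    "carol": {"mathematics", "economics"},
    "dave":  {"economics"},
}

# One binary feature per interest (the feature-engineering step).
names = list(people)
X = np.array([[1 if i in people[n] else 0 for i in interests] for n in names])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Output the other members of the cluster a target person belongs to.
target = "alice"
label = kmeans.labels_[names.index(target)]
group = [n for n, l in zip(names, kmeans.labels_) if l == label and n != target]
print(group)
```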
I think the right approach will be k-means clustering. The most important part of your problem is feature selection.
Try some features that you think are most important and simply apply k-means in a statistical programming language like R, inspect the result, and improve it by modifying the features or selecting more appropriate ones.
Trial and error can give you insight if you are not sure about feature selection.
If you can provide some sample data, it will help to give some specific solutions to your problem.
It's coming a bit late, but there's actually an app in the Windows Store that does exactly that: finding profiles with similar characteristics.
It's called k-modo.

Cut optimisation algorithm

Some of my friends at college and I were assigned the practical task of developing a web application for optimizing the cutting of rectangular parts from some kind of material. Something like the apps in this list, but more simplistic. Basically, I'm interested in whether there is any source code for this kind of optimization algorithm available on the internet. I'm planning to develop the app using the Adobe Flex framework. The programming part will be done in ActionScript 3, of course. However, I doubt that there are any optimization samples for this language. There may be some for Java, C++, C#, Ruby or Python and other more popular languages, though (then I'd just have to rewrite it in AS). So, if anyone knows any free libs or algorithm code samples that would suit me, I'd like to hear your suggestions. :)
This sounds just like the stock cutting problem, which is extremely hard! The best solutions use linear programming (typically based on the simplex method) with column generation (which, even after years on a constraint-solving research project, I feel unequipped to explain even half decently). In short, you won't want to try this approach in ActionScript; consequently, with whatever you do implement, you shouldn't expect great results on anything other than small problems.
The best advice I can offer, then, is to see if you can cut the source rectangle into strips (each of the width of the largest rectangles you need), then subdivide the remainder of each strip after the "head" rectangle has been removed.
I'd recommend using branch-and-bound as your optimisation strategy. BnB works by doing an exhaustive tree search that keeps track of the best solution seen so far. When you find a solution, update the bound, and backtrack looking for the next solution. Whenever the search takes you down a branch that you know cannot lead to a better solution than the best found so far, you can backtrack early at that point.
Since these search trees will be very large, you will probably want to place a time limit on the search and just return your best effort.
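As a very rough illustration of the strategy (on a toy 0/1 knapsack instance rather than the cutting problem itself), here is a branch-and-bound skeleton. The values, weights and the deliberately loose optimistic bound are invented for the example; a cutting-stock bound would look quite different.

```python
# Toy data, invented for the example.
values  = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50

best = 0  # best total value seen so far

def upper_bound(i, value):
    # Optimistic bound: pretend every remaining item still fits.
    return value + sum(values[i:])

def search(i, value, remaining):
    global best
    if value > best:
        best = value                      # new best solution found
    if i == len(values):
        return
    if upper_bound(i, value) <= best:
        return                            # prune: cannot beat the best so far
    if weights[i] <= remaining:           # branch 1: take item i
        search(i + 1, value + values[i], remaining - weights[i])
    search(i + 1, value, remaining)       # branch 2: skip item i

search(0, 0, capacity)
print(best)  # 220 for this toy instance
```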
Hope this helps.
I had trouble finding examples when I wanted to do the same for the woodworking company I work for. The problem itself is NP-hard, so you need to use an approximation algorithm like a first-fit or best-fit algorithm.
Do a search for 2d bin-packing algorithms. In the one I found, you sort the panels biggest to smallest, then add them to the sheets in order, placing each in the first bin it will fit. Sorry, I don't have the code with me, and it's in vb.net anyway.
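For a feel of that approach, here is a rough first-fit-decreasing sketch with a very naive "shelf" placement inside each sheet. The panel and sheet sizes are invented, and a real implementation would also handle rotation, kerf width and guillotine constraints.

```python
# Invented sheet and panel sizes, purely for illustration.
SHEET_W, SHEET_H = 100, 100
panels = [(60, 40), (50, 50), (30, 70), (40, 40), (20, 20)]  # (width, height)

# Sort biggest to smallest (by area), as suggested above.
panels.sort(key=lambda p: p[0] * p[1], reverse=True)

sheets = []  # each sheet: list of shelves, each shelf: [y, height, used_width]

def try_place(sheet, w, h):
    for shelf in sheet:                       # first shelf the panel fits on
        y, shelf_h, used = shelf
        if h <= shelf_h and used + w <= SHEET_W:
            shelf[2] += w
            return True
    top = sum(s[1] for s in sheet)            # open a new shelf if room is left
    if top + h <= SHEET_H:
        sheet.append([top, h, w])
        return True
    return False

for w, h in panels:
    for sheet in sheets:                      # first sheet the panel fits in
        if try_place(sheet, w, h):
            break
    else:
        sheets.append([[0, h, w]])            # start a new sheet

print(f"{len(panels)} panels packed into {len(sheets)} sheet(s)")
```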

Search/sort algorithms - is there a GoF-like listing for them?

I'm a self-taught developer and, quite frankly, am not all that great at figuring out which search or sort algorithm to use in any particular situation. I was just wondering if there was a Design Patterns-esque listing of the common algorithms available out there in the ether for me to bookmark. Something like:
Name of algorithm (with aliases, if any)
Problem it addresses
Big-O cost
Algorithm itself
Examples
Other algorithms it may be used with/substituted for
I'm just looking for a simple, concise listing of the algorithms I probably should know in one location. Is there anything like this available?
The web site http://www.sorting-algorithms.com/ shows many popular sorting algorithms, and describes their complexity and implementation. It goes the extra step to show, via animations, how those algorithms perform on different types of data (e.g. pre-sorted, sparse, reverse-sorted, etc.).
This site has some examples of sorting algorithms, including visual aids to help you get the hang of it. I personally like the various best/worst/average/few-unique cases they show.
Wikipedia has a nice table that lists most of the common sorting algorithms along with classification of them and basic analysis of their complexity characteristics.
The more common sorting algorithms have pseudocode and more in-depth analysis. For less common sorting algorithms, you'll probably have better luck finding details in academic papers or real implementations.
You should read CLRS.
In terms of problem variety, there are millions, and they all come from puzzles and math.
Skiena's book has nice problems of different varieties.
There is a great article on Wikipedia:
http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms
But I would suggest reading a book as well; almost every algorithms book has a chapter about sorting.
