Q-learning with linear function approximation - algorithm

I would like some guidance on how to use the Q-learning algorithm with function approximation. I have found examples of the basic Q-learning algorithm and I think I understand it, but when using function approximation I get into trouble. Can somebody explain, with a short example, how it works?
What I know:
Instead of using a matrix for the Q-values, we use features and parameters.
Approximate Q with a linear combination of the features and parameters.
Update the parameters.
I have checked this paper: Q-learning with function approximation
But I can't find any useful tutorial on how to use it.
Thanks for your help!

In my view, this is one of the best references to start with. It is well written, with several pseudo-code examples. In your case, you can simplify the algorithms by ignoring eligibility traces.
Also, in my experience and depending on your use case, Q-learning might not work very well (it sometimes needs huge amounts of experience data). You could try fitted Q-iteration, for example, which is a batch algorithm.
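As a concrete starting point, here is a minimal sketch of the scheme you describe: Q(s, a) is approximated as the dot product of a parameter vector theta and a feature vector phi(s, a), and theta is updated by gradient descent on the TD error. The `env` and `phi` objects are hypothetical placeholders you would supply for your own problem, not a real API:

```python
import numpy as np

def q_learning_linear(env, phi, n_features, n_actions,
                      episodes=500, alpha=0.01, gamma=0.99, epsilon=0.1):
    """Q-learning with a linear approximator Q(s, a) = theta . phi(s, a).

    `env` (with reset() -> state and step(action) -> (state, reward, done))
    and the feature function `phi(state, action) -> np.ndarray` are
    placeholders, not a real library interface.
    """
    theta = np.zeros(n_features)

    def q(s, a):
        return theta @ phi(s, a)

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration over a small discrete action set
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)
            else:
                a = max(range(n_actions), key=lambda a_: q(s, a_))
            s_next, r, done = env.step(a)
            # TD target r + gamma * max_a' Q(s', a'); no future value at terminals
            target = r if done else r + gamma * max(q(s_next, a_) for a_ in range(n_actions))
            # for a linear model, the gradient of Q w.r.t. theta is just phi(s, a)
            theta = theta + alpha * (target - q(s, a)) * phi(s, a)
            s = s_next
    return theta
```

The update line is the whole trick: instead of writing the TD target into a table cell, you move the parameters a small step in the direction of the feature vector, scaled by the TD error.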

Related

Ordered sequence for learning algorithm design techniques

So, I am taking an algorithms course this semester.
I have a basic understanding of design techniques and know that divide and conquer should be the first technique to learn.
But when it comes to backtracking, dynamic programming, and greedy techniques, I am confused about which order is appropriate.
My course is structured in the order I described in the paragraph above.
Any suggestions?
I doubt you will lose anything by simply following the order that your course suggests. I notice that TutorialsPoint presents them in a different order and adds several other techniques. I think you will gain a lot by simply working your way through such material.
It's likely that your learning style is different from mine, but I find it very beneficial to do a breadth-first survey of a topic, happily accepting that I won't understand every detail the first time round, then drill down as my understanding increases. So I don't feel the need for a detailed order of study.

Difference Between OPTICS and HDBSCAN clustering techniques

As part of my assignment, I have to work with both the HDBSCAN and OPTICS clustering techniques. I have researched many sites to identify the difference between these algorithms, and all I found was that OPTICS is a slight variation of HDBSCAN. I would like to know more. Can someone help me understand the difference between these algorithms, and the specific use cases where each should be used?
Also, please post reference links for further reading. Thanks.
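For hands-on comparison, both algorithms are available off the shelf, which makes it easy to inspect their behavior on the same data. A minimal side-by-side sketch, assuming scikit-learn and the separate hdbscan package are installed (the dataset and parameter values are arbitrary):

```python
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.datasets import make_blobs
import hdbscan

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

optics = OPTICS(min_samples=10).fit(X)              # orders points by reachability
hdb = hdbscan.HDBSCAN(min_cluster_size=10).fit(X)   # condenses a cluster hierarchy

print("OPTICS labels:", np.unique(optics.labels_))  # -1 marks noise points
print("HDBSCAN labels:", np.unique(hdb.labels_))
```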

Algorithm, find local/global minima, function of 2 variables

Let us have a function of 2 variables:
z=f(x,y) = ....
Can you advise me of any suitable method (easy to implement, with fast convergence) to calculate the local extrema on some intervals, or the global extremum?
Thanks for your help.
Gradient Descent is a wise choice for finding local minima for functions, assuming you can calculate the gradient.
Depending on the specific domain, sometimes there are other solutions as well.
For example, for linear least squares (which is used for regression in machine learning), you can find the local minimum (which is also the global one, since the function in this case is convex) using the normal equations.
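As a tiny illustration of the normal equations (theta = (X^T X)^(-1) X^T y minimizes the squared error), with made-up data:

```python
import numpy as np

X = np.column_stack([np.ones(5), np.arange(5.0)])   # design matrix with intercept
y = np.array([1.0, 2.9, 5.1, 7.0, 9.2])             # roughly y = 1 + 2x plus noise
theta = np.linalg.solve(X.T @ X, X.T @ y)           # solve, rather than invert X^T X
print(theta)                                        # approx [0.94, 2.05]
```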
EDIT: As suggested in the comments: if you don't have any information about the function, you might be able to use a hill-climbing algorithm in which you sample candidate directions in which to advance (you need to sample because there are infinitely many directions when the function is over the reals) and choose the most promising one.
You can also extract the derivatives numerically using numerical differentiation and then use gradient descent, as sketched below.
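A minimal sketch of that combination, assuming f maps a NumPy point (x, y) to a scalar; the test function, step size, and tolerance are arbitrary choices:

```python
import numpy as np

def numerical_gradient(f, p, h=1e-5):
    """Central-difference approximation of the gradient of f at point p."""
    grad = np.zeros_like(p, dtype=float)
    for i in range(len(p)):
        step = np.zeros_like(p, dtype=float)
        step[i] = h
        grad[i] = (f(p + step) - f(p - step)) / (2 * h)
    return grad

def gradient_descent(f, start, lr=0.1, steps=1000, tol=1e-8):
    """Descend f from `start`; returns an approximate local minimum."""
    p = np.asarray(start, dtype=float)
    for _ in range(steps):
        g = numerical_gradient(f, p)
        if np.linalg.norm(g) < tol:   # gradient vanished: (near-)stationary point
            break
        p -= lr * g
    return p

# Example: f(x, y) = (x - 1)^2 + (y + 2)^2 has its minimum at (1, -2)
print(gradient_descent(lambda p: (p[0] - 1)**2 + (p[1] + 2)**2, start=[0.0, 0.0]))
```

Note that this only finds a local minimum near the starting point; for a global extremum you would need to restart from several points or use one of the global methods mentioned below.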
You might also look into simulated annealing if you like the idea of algorithms driven by ideas from thermodynamics and metallurgy.
Or perhaps you'd rather look at genetic algorithms, because you like the current explosion of knowledge in biology.

algorithm to combine data for linear fit?

I'm not sure if this is the best place to ask this, but you guys have been helpful with plenty of my CS homework in the past so I figure I'll give it a shot.
I'm looking for an algorithm to blindly combine several dependent variables into an index that produces the best linear fit with an external variable. Basically, it would combine the dependent variables using different mathematical operators, include or not include each one, etc. until an index is developed that best correlates with my external variable.
Has anyone seen/heard of something like this before? Even if you could point me in the right direction or to the right place to ask, I would appreciate it. Thanks.
Sounds like you're trying to do multivariate linear regression, or multiple regression. The simplest (read: less accurate) method is to individually compute the linear regression line of each component variable and then take a weighted average of the lines. Beyond that, I'm afraid I'll be of little help.
This appears to be linear regression with multiple explanatory variables. Since the implication is that you are taking a computational approach, you could do something as simple as fitting a linear model to your data for every possible combination of explanatory variables (whether to include interaction effects is your choice), choosing a goodness-of-fit measure (R² being just one example), and using it to rank each model you fit. Note that model quality is also somewhat subjective in many fields: you might reject a model containing 15 variables if it only moderately improves the fit over a far simpler model containing just 3. If you have not read it already, I don't doubt you will find many useful suggestions in the following text:
Draper, N. R. and Smith, H. (1998). Applied Regression Analysis. Wiley Series in Probability and Statistics.
You might also try searching for the LASSO method of model selection.
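A minimal sketch of the brute-force ranking described above, assuming NumPy and scikit-learn, with made-up data; note that training R² always improves as variables are added, which is exactly why the simpler-model caveat above matters:

```python
from itertools import combinations
import numpy as np
from sklearn.linear_model import LinearRegression

def rank_subsets(X, y, names):
    """Fit a linear model on every non-empty column subset; rank by R^2."""
    results = []
    for k in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), k):
            idx = list(cols)
            model = LinearRegression().fit(X[:, idx], y)
            r2 = model.score(X[:, idx], y)   # R^2 on the training data
            results.append((r2, [names[c] for c in idx]))
    return sorted(results, reverse=True)     # best fit first

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=100)  # ignores column "b"
for r2, cols in rank_subsets(X, y, ["a", "b", "c"])[:3]:
    print(f"{r2:.3f}  {cols}")
```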
The thing you're asking for is essentially the entirety of regression analysis.
This is what linear regression does, and it is a good portion of what "machine learning" does (machine learning is basically just a name for more complicated regression and classification algorithms). There are hundreds or thousands of different approaches with various trade-offs, but the basic ones frequently work quite well.
If you want to learn more, the Coursera course on machine learning is a great place to get a deeper understanding of this.

Simple Suggestion / Recommendation Algorithm

I am looking for a simple suggestion algorithm to implement in my web app. Much like Netflix, Amazon, etc., but simpler. I don't need teams of PhDs working to get a better suggestion metric.
So say I have:
User1 likes Object1.
User2 likes Object1 and Object2.
I want to suggest to User1 they might also like Object2.
I can obviously come up with something naive. I'm looking for something vetted and easily implemented.
There are many simple and not-so-simple examples of suggestion algorithms in the excellent Programming Collective Intelligence.
The Pearson correlation coefficient (a slightly dry Wikipedia article) can give pretty good results. Here's an implementation in Python and another in T-SQL, along with an interesting explanation of the algorithm.
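A minimal self-contained version of that idea, computing the Pearson correlation between two users over their commonly rated items (the user names, items, and ratings are made up):

```python
def pearson(ratings, u1, u2):
    """Pearson correlation of two users; ratings maps user -> {item: score}."""
    common = list(set(ratings[u1]) & set(ratings[u2]))
    n = len(common)
    if n == 0:
        return 0.0
    x = [ratings[u1][i] for i in common]
    y = [ratings[u2][i] for i in common]
    sx, sy = sum(x), sum(y)
    sxx, syy = sum(v * v for v in x), sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    num = sxy - sx * sy / n
    den = ((sxx - sx**2 / n) * (syy - sy**2 / n)) ** 0.5
    return num / den if den else 0.0

ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5},
}
print(pearson(ratings, "alice", "bob"))   # approx 0.65: fairly similar tastes
```

Users with high correlation become the neighbors whose liked items you suggest.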
Try a Slope One algorithm; it's one of the most widely used for this kind of problem.
Here's a sample implementation in T-SQL.
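A bare-bones (unweighted) Slope One sketch in Python, applied to the User1/User2 example from the question with made-up scores:

```python
from collections import defaultdict

def slope_one_predict(ratings, user, target):
    """Predict `user`'s score for `target`; ratings maps user -> {item: score}."""
    diffs, counts = defaultdict(float), defaultdict(int)
    # average difference between the target item and every co-rated item
    for other in ratings.values():
        if target in other:
            for item, score in other.items():
                if item != target:
                    diffs[item] += other[target] - score
                    counts[item] += 1
    total, n = 0.0, 0
    for item, score in ratings[user].items():
        if counts[item]:
            total += score + diffs[item] / counts[item]
            n += 1
    return total / n if n else None

ratings = {"user1": {"obj1": 1.0},
           "user2": {"obj1": 1.0, "obj2": 1.0}}
print(slope_one_predict(ratings, "user1", "obj2"))   # 1.0 -> suggest obj2
```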
I would go with k-nearest neighbors. The Wikipedia entry explains it well and has links to reference implementations.
You may want to look at association rule learning and the Apriori algorithm. The basic idea behind it is that you create rules like "if a user likes Object1, then the user likes Object2" and check how well they describe (your) reality. In your concrete example, this rule would have a support of 2 (as two users like Object1) and a confidence of 50% (as the rule is true in 1 of 2 cases). I've just implemented a basic proof of concept myself (actually my first steps on Hadoop) and it's not too difficult to do.
Alternatively, you may want to look at Apache Mahout - Taste. I haven't used it myself, though.
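A tiny sketch computing the support and confidence figures from the example above, with `likes` as a made-up user-to-items mapping:

```python
def rule_stats(likes, a, b):
    """Support and confidence of the rule "likes a -> likes b".

    `likes` maps user -> set of liked objects (toy data, not a real API).
    Support here counts users matching the antecedent, as in the answer above.
    """
    has_a = [u for u, items in likes.items() if a in items]
    has_both = [u for u in has_a if b in likes[u]]
    support = len(has_a)
    confidence = len(has_both) / len(has_a) if has_a else 0.0
    return support, confidence

likes = {"user1": {"obj1"}, "user2": {"obj1", "obj2"}}
print(rule_stats(likes, "obj1", "obj2"))   # (2, 0.5), as in the answer
```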
k-nearest neighbor algorithm
I created a suggested-articles algorithm that used keywords (as opposed to "product purchases") to determine correlation. It takes a keyword, runs through all other articles where that keyword occurs, and produces results based on which articles have the most matching keywords.
Besides the obvious need for caching such information, is there something wrong with using a similar method?
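A rough sketch of that keyword-overlap idea; the article titles and keyword sets are invented placeholders:

```python
def suggest(articles, current, top_n=3):
    """`articles` maps title -> set of keywords; rank other titles by overlap."""
    target = articles[current]
    scored = [(len(target & kws), title)
              for title, kws in articles.items() if title != current]
    scored.sort(reverse=True)                      # most shared keywords first
    return [title for score, title in scored[:top_n] if score > 0]

articles = {
    "intro-to-knn":   {"knn", "classification", "distance"},
    "knn-vs-svm":     {"knn", "svm", "classification"},
    "cooking-basics": {"recipes", "kitchen"},
}
print(suggest(articles, "intro-to-knn"))           # ['knn-vs-svm']
```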
