Algorithms for Contextual Bandits (a.k.a. where is UCB?) - vowpalwabbit

Hello and let me start by saying I haven't really worked with contextual bandits before; I've worked a lot with multi-armed bandits, as well as with Monte-Carlo Tree Search. Anyway, I'm used to using UCB as my go-to for MABs, so I was very surprised when I couldn't find UCB in VowpalWabbit's documentation at all. As I understand, UCB isn't directly applicable to the contextual formulation of the problem, but there are adaptations such as LinearUCB that are.
My question is - what am I missing? Is UCB there, but under a different name? Has it been deliberately omitted as it is simply worse than another algorithm that has been implemented? If so, which is this(/these) algorithm and how is it better?

VW has an RND explorer (see end of page here) which is equivalent to a randomized approximation to the LinearUCB bound, so it might be what you are looking for.

Related

What is the core difference between algorithm and pseudocode?

As the question describe itself "What is the core difference between algorithm and pseudocode?".
algorithm
An algorithm is a procedure for solving a problem in terms of the actions to be executed and the order in which those actions are to be executed. An algorithm is merely the sequence of steps taken to solve a problem. The steps are normally "sequence," "selection, " "iteration," and a case-type statement.
Pseudocode
Pseudocode is an artificial and informal language that helps programmers develop algorithms. Pseudocode is a "text-based" detail (algorithmic) design tool.
The rules of Pseudocode are reasonably straightforward. All statements showing "dependency" are to be indented. These include while, do, for, if, switch. Examples below will illustrate this notion.
I think all the other answers give useful explanations and definitions, but I'm going to give mine.
An algorithm is the idea of how to obtain some result from some input. It is an abstract concept; an algorithm is not something material by itself, but more something like an imagination or a deduction, a thing that only exists in the mind. In the broad sense, any sequence of steps that give you some thing(s) from other thing(s) could be called an algorithm. For example, if the screen of your computer is dirty, "spraying some glass cleaner on it and wipe it with a cloth" could be said to be an algorithm to solve the problem of how to obtain a clean screen form a dirty screen. It is important to note the difference between the problem itself (getting a clean screen) and the algorithm (wiping it with a cloth and cleaner); generally, several different algorithms are possible to solve the same problem. The idea of complexity is inherent to the algorithms itself, not the problem or the particular implementation or execution of the algorithm.
Pseudocode is a language to express algorithms. Since, as said before, algorithms are only concepts, we need to use something to express them and explain them to other people. Pseudocode is a convenient way for many computer science algorithms, because it is usually unambiguous, easy to read and somewhat similar to many programming languages. However, a specific programming language like C or Java can also be used to express and algorithm (it's just less convenient to those not familiarized with that language). In other cases, pseudocode may not be the best way to express an algorithm; for example, many graph and tree algorithms can be explained more easily with drawings or diagrams. In the previous example, the algorithm to get your screen cleaned is probably better expressed in a natural language like English, because it is simple and specific enough for that case.
Obviously, terms are frequently used loosely and exchanged depending on the context, and there's no need to be nitpicky about it, but I think it is important to have the difference clear. An algorithm doesn't stop being an algorithm just because it is written in Python instead of pseudocode. Pseudocode is just a convenient and widespread communication tool to express them.
An algorithm is something (a sequence of steps) you can do. Pseudocode is a notation to describe an algorithm.
Algorithm is something which is represented in mathematical terms. It includes, analysis, complexity considerations(best, average and worstcase analysis etc.).Pseudo code is a human readable representation of a program.
From Wikipedia :
Starting from an initial state and initial input (perhaps empty), the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, eventually producing "output" and terminating at a final ending state. 
With a pseudo language one can implement an algorithm without using a programming language such as C.
An example of pseudo language is Flow Charts.

Recommendations for Fast Multipole Method implementation?

I'm interested in implementing the Fast Multipole Method to efficiently simulate a system of repulsive particles.
I've found a large collection of references discussing FMM, but none seem very approachable for non-mathematicians who want to fully understand the algorithm.
Can you recommend a ground-up reference that clearly explains the mathematics behind the process, and includes pseudocode exemplifying a proper implementation?
I am by no means an expert in FMM, but this java implementation and introduction is the best source I've found so far for explaining it carefully and slowly. The paper is good at defining terms before using them, and the code at least is useful as a reference point. The math still gets hairy very quickly, but it is what it is :)
A pedestrian introduction to fast multipole methods is a close second. It doesn't explain the actual details of a working FMM implementation, but it's a good introduction to the basic ideas.
I like the short course on FMM. In begins with FMM in 1D, than it uses theory of complex variable to do FMM in 2D. And than there is the crazy 3D version which uses theory of spherical harmonics functions, which I guess can be very difficult for non-mathematician. But If you need FMM only in 2D you should be fine.
Unfortunately no pseudo codes are given there.
But do you really need the accuracy of FMM?. You might be fine with Barnes-Hut's algorithm
After running into a similar issue to you, I ended up writing a fully-documented Python fast multipole method implementation, pybbfmm. I've also written a short, mathematics-free tutorial on how the method works. Together, I think they're substantially more accessible than any of the other presentations I could find.
(meta: Although this is effectively a linkpost, the OP is explicitly asking for a link. I've added what I think was missing from the last one - the name fo the library - but I'm not sure how else to offer this answer except as a name and a link. Certainly it doesn't feel any more linkpost-y than the accepted answer. If this one gets deleted as well, I'll give up)

Formally verifying the correctness of an algorithm

First of all, is this only possible on algorithms which have no side effects?
Secondly, where could I learn about this process, any good books, articles, etc?
COQ is a proof assistant that produces correct ocaml output. It's pretty complicated though. I never got around to looking at it, but my coworker started and then stopped using it after two months. It was mostly because he wanted to get things done quicker, but if you need to verify an algorithm this might be a good idea.
Here is a course that uses COQ and talks about proving algorithms.
And here is a tutorial about writing academic papers in COQ.
It's generally a lot easier to verify/prove correctness when no side effects are involved, but it's not an absolute requirement.
You might want to look at some of the documentation for a formal specification language like Z. A formal specification isn't a proof itself, but is often the basis for one.
I think that verifying the correctness of an algorithm would be validating its conformance with a specification. There is a branch of theoretical Computer Science called Formal Methods which may be what you are looking for if you need to get as close to proof as you can. From wikipedia,
Formal Methods are a particular kind
of mathematically-based techniques for
the specification, development and
verification of software and hardware
systems
You will be able to find many learning resources and tools from the multitude of links on the linked Wikipedia page and from the Formal Methods wiki.
Usually proofs of correctness are very specific to the algorithm at hand.
However, there are several well known tricks that are used and re-used again. For example, with recursive algorithms you can use loop invariants.
Another common trick is reducing the original problem to a problem for which your algorithm's proof of correctness is easier to show, then either generalizing the easier problem or showing that the easier problem can be translated to a solution to the original problem. Here is a description.
If you have a particular algorithm in mind, you may do better in asking how to construct a proof for that algorithm rather than a general answer.
Buy these books: http://www.amazon.com/Science-Programming-Monographs-Computer/dp/0387964800
The Gries book, Scientific Programming is great stuff. Patient, thorough, complete.
Logic in Computer Science, by Huth and Ryan, gives a reasonably readable overview of modern systems for verifying systems. Once upon a time people talked about proving programs correct - with programming languages which may or may not have side effects. The impression I get from this book and elsewhere is that real applications are different - for instance proving that a protocol is correct, or that a chip's floating point unit can divide correctly, or that a lock-free routine for manipulating linked lists is correct.
ACM Computing Surveys Vol 41 Issue 4 (October 2009) is a special issue on software verification. It looks like you can get to at least one of the papers without an ACM account by searching for "Formal Methods: Practice and Experience".
The tool Frama-C, for which Elazar suggests a demo video in the comments, gives you a specification language, ACSL, for writing function contracts and various analyzers for verifying that a C function satisfies its contract and safety properties such as the absence of run-time errors.
An extended tutorial, ACSL by example, shows examples of actual C algorithms being specified and verified, and separates the side-effect-free functions from the effectful ones (the side-effect-free ones are considered easier and come first in the tutorial). This document is also interesting in that it was not written by the designers of the tools it describe, so it gives a fresher and more didactic look at these techniques.
If you are familiar with LISP then you should definitely check out ACL2: http://www.cs.utexas.edu/~moore/acl2/acl2-doc.html
Dijkstra's Discipline of Programming and his EWDs lay the foundation for formal verification as a science in programming. A simpler work is Wirth's Systematic Programming, which begins with the simple approach to using verification. Wirth uses pre-ISO Pascal for the language; Dijkstra uses an Algol-68-like formalism called Guarded (GCL). Formal verification has matured since Dijkstra and Hoare, but these older texts may still be a good starting point.
PVS tool developed by Stanford guys is a specification and verification system. I worked on it and found it very useful for Theoram Proving.
WRT (1), you will probably have to create a model of the algorithm in a way that "captures" the side-effects of the algorithm in a program variable intended to model such state-based side-effects.

How to cultivate algorithm intuition?

When faced with a problem in software I usually see a solution right away. Of course, what I see is usually somewhat off, and I always need to sit down and design (admittedly, I usually don't design enough), but I get a certain intuition right away.
My problem is I don't get that same intuition when it comes to advanced algorithms. I feel much more up to the task of building another Facebook then building another Google search, or a Music Genom project. It's probably because I've been building software for quite some time, but I have little experience with composing algorithms.
I would like the community's advice on what to read and what projects to undertake to be better at composing algorithms.
(This question has nothing to do with Algorithmic composition. Well, almost nothing)
+1 To whoever said experience is the best teacher.
There are several online portals which have a lot of programming problems, that you can submit your own solutions to, and get an automated pass/fail indication.
http://www.spoj.pl/
http://uva.onlinejudge.org/
http://www.topcoder.com/tc
http://code.google.com/codejam/contests.html
http://projecteuler.net/
https://codeforces.com
https://leetcode.com
The USACO training site is the training program that all USA computing olympiad participants go through. It goes step by step, introducing more and more complex algorithms as you go.
You might find it helpful to perform algorithms physically. For example, when you're studying sorting algorithms, practice doing each one with a deck of cards. That will activate different parts of your brain than reading or programming alone will.
Steve Yegge referred to "The Algorithm Design Manual" in one of his rants. I haven't seen it myself, but it sounds like it's just the ticket from his description.
My absolute favorite for this kind of interview preparation is Steven Skiena's The Algorithm Design Manual. More than any other book it helped me understand just how astonishingly commonplace (and important) graph problems are – they should be part of every working programmer's toolkit. The book also covers basic data structures and sorting algorithms, which is a nice bonus. But the gold mine is the second half of the book, which is a sort of encyclopedia of 1-pagers on zillions of useful problems and various ways to solve them, without too much detail. Almost every 1-pager has a simple picture, making it easy to remember. This is a great way to learn how to identify hundreds of problem types.
problem domain
First you must understand the problem domain. An elegant solution to the wrong problem is no good, nor is an inefficient solution to the right problem in most cases. Solution quality, in other words, is often relative. A simple scheduling problem that has a deterministic solution that takes ten minutes to run may be fine if schedules are realculated once per week, but if schedules change several times a day then a genetic algorithm solution that converges in a few seconds may be required.
decomposition and mapping
Second, decompose the problem into sub-problems and known/unknown elements that correspond to elements of the solution. Sometimes this is obvious, e.g. to count widgets you need a way of identifying widgets, an incrementable counter, and a way of storing the count. Sometimes it is not so obvious. Sometimes you have to decompose the problem, the domain, and possible solutions at the same time and try several different mappings between them to find one that leads to the correct results [this is the general method].
model
Model the solution, in your head at least, and walk through it to see if it works correctly. Adjust as necessary (See decomposition and mapping, above).
composition/interfaces
Many times you can find elements of the problem and elements of the solution that map to each other and produce partial results that are useful. This composition and interface construction provides the kernal of the solution, and also serves to reduce the scope of the problem remaining. So then you just loop back to the top with a smaller initial problem, and go through it again.
experience
Experience is the best teacher, of course, but reading about different kinds of problems and solutions will also be helpful. Studying some of the well-known algorithms and their applications is likewise very helpful, e.g. Dijkstra, Bresenham, Unification, and of course, graph theory.
I am not sure intuition can be cultivated, but I think I know what you are asking. The more problems you solve, the more information and experience you have at your disposal for future problems. So, I say just practice. Practice programming real world applications and you run into plenty of problems. Sometimes, solving puzzles can be very educational as well.
I try to find physical analogues when I'm looking at a complex problem.

Some pointers on implementing algorithms in code

The other day I thought I'd attempt creating the Fibonacci algorithm in my code, but I've never been good at maths.
I ended up writing my own method with a loop but it seemed inefficient or not 'the proper way'.
Does anyone have any recommendations/reading material on implementing algorithms in code?
I find Project Euler useful for this kind of thing. It forces you to think about an algorithm and then implement it. Many of the questions then have extensive discussions on how to solve the problem (from the naive solutions to some pretty ingenious ones) that you can use to see what you did right and wrong.
In the discussion threads you'll find various implementations from other people in many different languages too. Coming up with a solution yourself and then comparing it to that from other people is (imho) a good way to learn.
Both of these introductory books have good information about this sort of thing:
How To Design Programs and moreso Structure and Interpretation of Computer Programs
Both are somewhat funcitonal (and scheme) oriented, but that's a natural fit for these sorts of problems.
On top of that, you might get quite a bit out of Project Euler
Derive your algorithm test-driven. I've been able to write much more complex algorithms correctly by using TDD than I was before.
Go on youtube and look at some of the lectures on Introduction to Algorithms. There are some really, really good lectures that break down some of the most common algorithms such as the Fibonacci series and how to optimize them.
Start reading about O notation so you can understand how your algorithm grows with variable size input and how to classifiy the run-time of the algorithm you have.
Start with this video series which I found excellent material on the subject:
Algorithms Lecture
If you can't translate pseudo code for a fibonacci function to your language, then you should go and find a basic tutorial for your language, since it seems that you have not yet grasped its basic idioms.
If you have a working solution, but feel insecure about it, show it to others for review.

Resources