Markov Decision Process: value iteration, how does it work?

I've been reading a lot about Markov Decision Processes (using value iteration) lately but I simply can't get my head around them. I've found a lot of resources on the Internet / books, but they all use mathematical formulas that are way too complex for my competencies.
Since this is my first year at college, I've found that the explanations and formulas provided on the web use notions / terms that are way too complicated for me and they assume that the reader knows certain things that I've simply never heard of.
I want to use it on a 2D grid filled with walls (unreachable), coins (desirable) and moving enemies (which must be avoided at all costs). The whole goal is to collect all the coins without touching the enemies, and I want to create an AI for the main player using a Markov Decision Process (MDP). Here is part of what it looks like (note that the game-related aspect is not much of a concern here; I just really want to understand MDPs in general):
From what I understand, a crude simplification of MDPs is that they can create a grid which holds the direction we need to go in (a kind of grid of "arrows" pointing where we need to go, starting at a certain position on the grid) to reach certain goals and avoid certain obstacles. In my situation, that would mean the player knows in which direction to go to collect the coins and avoid the enemies.
Now, in MDP terms, it would mean that it creates a collection of states (the grid) which holds a policy (the action to take: up, down, right or left) for each state (a position on the grid). The policies are determined by the "utility" values of each state, which are themselves calculated by evaluating how beneficial getting there would be in the short and long term.
Is this correct? Or am I completely on the wrong track?
I'd at least like to know what the variables from the following equation represent in my situation:
(taken from the book "Artificial Intelligence - A Modern Approach" from Russell & Norvig)
I know that s would be a list of all the squares from the grid, a would be a specific action (up / down / right / left), but what about the rest?
How would the reward and utility functions be implemented?
It would be really great if someone knew of a simple link showing pseudo-code that implements a basic version similar to my situation, step by step and very slowly, because I don't even know where to start here.
Thank you for your precious time.
(Note: feel free to add / remove tags or tell me in the comments if I should give more details about something or anything like that.)

Yes, the mathematical notation can make it seem much more complicated than it is. Really, it is a very simple idea. I have implemented a value iteration demo applet that you can play with to get a better idea.
Basically, let's say you have a 2D grid with a robot in it. The robot can try to move North, South, East or West (those are the actions a) but, because its left wheel is slippery, when it tries to move North there is only a .9 probability that it will end up at the square North of it, while there is a .1 probability that it will end up at the square West of it (similarly for the other 3 actions). These probabilities are captured by the T() function. Namely, T(s,A,s') will look like:
s     A      s'      T     // x=0,y=0 is at the top-left of the screen, so North decreases y
x,y   North  x,y-1   .9    // we do move North
x,y   North  x-1,y   .1    // wheels slipped, so we move West
x,y   East   x+1,y   .9
x,y   East   x,y-1   .1
x,y   South  x,y+1   .9
x,y   South  x-1,y   .1
x,y   West   x-1,y   .9
x,y   West   x,y+1   .1
You then set the Reward to be 0 for all states, but 100 for the goal state, that is, the location you want the robot to get to.
What value iteration does is start by giving a Utility of 100 to the goal state and 0 to all the other states. Then on the first iteration this 100 of utility gets distributed back 1 step from the goal, so all states that can get to the goal state in 1 step (all 4 squares right next to it) will get some utility. Namely, each of them gets the discounted expected utility of its best action: gamma times the probability that this action reaches the goal state, times 100. We then continue iterating; at each step we move the utility back 1 more step away from the goal.
In the example above, say you start with R(5,5)= 100 and R(.) = 0 for all other states. So the goal is to get to 5,5.
On the first iteration we set
U(5,6) = gamma * max(.9 * 100, .1 * 100) = gamma * 90
because on 5,6 if you go North there is a .9 probability of ending up at 5,5, while if you go East there is only a .1 probability of slipping into 5,5. The update keeps the value of the best action; it does not add the actions together.
Similarly for (5,4), (4,5), (6,5).
All other states remain with U = 0 after the first iteration of value iteration.
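As a minimal sketch of the procedure described above (not the demo applet's code), the following Python runs value iteration on a hypothetical 4x4 grid. The grid size, the goal position (3, 0), gamma = 0.9 and the rule that bumping into the border leaves the robot in place are all assumptions made for illustration. The slip directions are copied from the table above, with North taken to decrease y because (0, 0) is the top-left corner.

```python
# Hypothetical 4x4 grid; (0, 0) is the top-left corner, so North decreases y.
WIDTH, HEIGHT = 4, 4
GOAL = (3, 0)          # assumed goal position
GAMMA = 0.9            # assumed discount factor
GOAL_UTILITY = 100.0

# Intended displacement for each action.
MOVES = {"North": (0, -1), "East": (1, 0), "South": (0, 1), "West": (-1, 0)}
# Direction actually taken when the wheel slips (copied from the T table).
SLIP = {"North": "West", "East": "North", "South": "West", "West": "South"}


def step(state, direction):
    """Move one cell; bumping into the border leaves the robot in place (assumption)."""
    x, y = state
    dx, dy = MOVES[direction]
    nx, ny = x + dx, y + dy
    return (nx, ny) if 0 <= nx < WIDTH and 0 <= ny < HEIGHT else state


def transitions(state, action):
    """T(s, a, s') as a list of (probability, s') pairs."""
    return [(0.9, step(state, action)), (0.1, step(state, SLIP[action]))]


def expected_utility(utility, state, action):
    return sum(p * utility[s2] for p, s2 in transitions(state, action))


def value_iteration(sweeps):
    utility = {(x, y): 0.0 for x in range(WIDTH) for y in range(HEIGHT)}
    utility[GOAL] = GOAL_UTILITY      # the goal starts at 100, everything else at 0
    for _ in range(sweeps):
        updated = dict(utility)
        for s in utility:
            if s == GOAL:
                continue              # the goal is treated as terminal: its value stays fixed
            # R(s) is 0 away from the goal, so only the discounted term remains.
            updated[s] = GAMMA * max(expected_utility(utility, s, a) for a in MOVES)
        utility = updated
    return utility


def best_action(utility, state):
    """The 'arrow' for a square: the action with the highest expected utility."""
    return max(MOVES, key=lambda a: expected_utility(utility, state, a))


after_one = value_iteration(1)
converged = value_iteration(50)
print(after_one[(2, 0)], best_action(converged, (2, 0)))
```

After one sweep only the squares right next to the goal have a non-zero utility; after enough sweeps, best_action gives the arrow for every square.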

Not a complete answer, but a clarifying remark.
The state is not a single cell. The state contains the information about what is in each cell, for all relevant cells at once. This means a single state records which cells are solid and which are empty, which ones contain monsters, where the coins are, and where the player is.
Maybe you could use a map from each cell to its content as state. This does ignore the movement of monsters and player, which are probably very important, too.
The details depend on how you want to model your problem (deciding what belongs to the state and in which form).
Then a policy maps each state to an action like left, right, jump, etc.
First you must understand the problem that is expressed by an MDP before thinking about how algorithms like value iteration work.
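One possible way to write such a state down, as a sketch only: the field names, the reward numbers and the decision to keep the (fixed) walls outside the state are all assumptions for illustration, not something the original question prescribes.

```python
from typing import FrozenSet, NamedTuple, Tuple

Cell = Tuple[int, int]  # (x, y) position on the grid


class GameState(NamedTuple):
    """One MDP state: everything that can change during play.

    Assumption: walls never move, so they are kept outside the state.
    """
    player: Cell
    enemies: Tuple[Cell, ...]
    coins: FrozenSet[Cell]


def reward(state: GameState) -> float:
    """Hypothetical reward: big penalty on an enemy, small bonus on a coin."""
    if state.player in state.enemies:
        return -100.0
    if state.player in state.coins:
        return 10.0
    return 0.0


start = GameState(player=(0, 0), enemies=((3, 2),), coins=frozenset({(1, 0), (4, 4)}))

# A policy maps each whole state (not each cell) to an action.
policy = {start: "right"}
print(reward(start), policy[start])
```

Because the state is immutable and hashable, it can be used directly as a dictionary key for utilities or for the policy.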

I would recommend using Q-learning for your implementation.
Maybe you can use this post I wrote as an inspiration. This is a Q-learning demo with Java source code. This demo is a map with 6 fields and the AI learns where it should go from every state to get to the reward.
Q-learning is a technique for letting the AI learn by itself by giving it reward or punishment.
This example shows the Q-learning used for path finding. A robot learns where it should go from any state.
The robot starts at a random place and keeps a memory of the scores while it explores the area. Whenever it reaches the goal, we repeat with a new random start. After enough repetitions the score values become stationary (convergence).
In this example the action outcome is deterministic (transition probability is 1) and the action selection is random. The score values are calculated by the Q-learning algorithm Q(s,a).
The image shows the states (A,B,C,D,E,F), possible actions from the states and the reward given.
Result Q*(s,a)
Policy Π*(s)
Qlearning.java
import java.text.DecimalFormat;
import java.util.Random;
/**
* @author Kunuk Nykjaer
*/
public class Qlearning {
final DecimalFormat df = new DecimalFormat("#.##");
// path finding
final double alpha = 0.1;
final double gamma = 0.9;
// states A,B,C,D,E,F
// e.g. from A we can go to B or D
// from C we can only go to C
// C is goal state, reward 100 when B->C or F->C
//
// _______
// |A|B|C|
// |_____|
// |D|E|F|
// |_____|
//
final int stateA = 0;
final int stateB = 1;
final int stateC = 2;
final int stateD = 3;
final int stateE = 4;
final int stateF = 5;
final int statesCount = 6;
final int[] states = new int[]{stateA,stateB,stateC,stateD,stateE,stateF};
// http://en.wikipedia.org/wiki/Q-learning
// http://people.revoledu.com/kardi/tutorial/ReinforcementLearning/Q-Learning.htm
// Q(s,a)= Q(s,a) + alpha * (R(s,a) + gamma * Max(next state, all actions) - Q(s,a))
int[][] R = new int[statesCount][statesCount]; // reward lookup
double[][] Q = new double[statesCount][statesCount]; // Q learning
int[] actionsFromA = new int[] { stateB, stateD };
int[] actionsFromB = new int[] { stateA, stateC, stateE };
int[] actionsFromC = new int[] { stateC };
int[] actionsFromD = new int[] { stateA, stateE };
int[] actionsFromE = new int[] { stateB, stateD, stateF };
int[] actionsFromF = new int[] { stateC, stateE };
int[][] actions = new int[][] { actionsFromA, actionsFromB, actionsFromC,
actionsFromD, actionsFromE, actionsFromF };
String[] stateNames = new String[] { "A", "B", "C", "D", "E", "F" };
public Qlearning() {
init();
}
public void init() {
R[stateB][stateC] = 100; // from b to c
R[stateF][stateC] = 100; // from f to c
}
public static void main(String[] args) {
long BEGIN = System.currentTimeMillis();
Qlearning obj = new Qlearning();
obj.run();
obj.printResult();
obj.showPolicy();
long END = System.currentTimeMillis();
System.out.println("Time: " + (END - BEGIN) / 1000.0 + " sec.");
}
void run() {
/*
1. Set parameters, and environment reward matrix R
2. Initialize matrix Q as zero matrix
3. For each episode: Select random initial state
   Do while not reach goal state
     o Select one among all possible actions for the current state
     o Using this possible action, consider to go to the next state
     o Get maximum Q value of this next state based on all possible actions
     o Compute
     o Set the next state as the current state
*/
// For each episode
Random rand = new Random();
for (int i = 0; i < 1000; i++) { // train episodes
// Select random initial state
int state = rand.nextInt(statesCount);
while (state != stateC) // goal state
{
// Select one among all possible actions for the current state
int[] actionsFromState = actions[state];
// Selection strategy is random in this example
int index = rand.nextInt(actionsFromState.length);
int action = actionsFromState[index];
// Action outcome is set to deterministic in this example
// Transition probability is 1
int nextState = action; // data structure
// Using this possible action, consider to go to the next state
double q = Q(state, action);
double maxQ = maxQ(nextState);
int r = R(state, action);
double value = q + alpha * (r + gamma * maxQ - q);
setQ(state, action, value);
// Set the next state as the current state
state = nextState;
}
}
}
double maxQ(int s) {
int[] actionsFromState = actions[s];
double maxValue = Double.NEGATIVE_INFINITY; // Double.MIN_VALUE is the smallest positive double, not the most negative one
for (int i = 0; i < actionsFromState.length; i++) {
int nextState = actionsFromState[i];
double value = Q[s][nextState];
if (value > maxValue)
maxValue = value;
}
return maxValue;
}
// get policy from state
int policy(int state) {
int[] actionsFromState = actions[state];
double maxValue = Double.NEGATIVE_INFINITY; // Double.MIN_VALUE is the smallest positive double, not the most negative one
int policyGotoState = state; // default goto self if not found
for (int i = 0; i < actionsFromState.length; i++) {
int nextState = actionsFromState[i];
double value = Q[state][nextState];
if (value > maxValue) {
maxValue = value;
policyGotoState = nextState;
}
}
return policyGotoState;
}
double Q(int s, int a) {
return Q[s][a];
}
void setQ(int s, int a, double value) {
Q[s][a] = value;
}
int R(int s, int a) {
return R[s][a];
}
void printResult() {
System.out.println("Print result");
for (int i = 0; i < Q.length; i++) {
System.out.print("out from " + stateNames[i] + ": ");
for (int j = 0; j < Q[i].length; j++) {
System.out.print(df.format(Q[i][j]) + " ");
}
System.out.println();
}
}
// policy is maxQ(states)
void showPolicy() {
System.out.println("\nshowPolicy");
for (int i = 0; i < states.length; i++) {
int from = states[i];
int to = policy(from);
System.out.println("from "+stateNames[from]+" goto "+stateNames[to]);
}
}
}
Print result
out from A: 0 90 0 72,9 0 0
out from B: 81 0 100 0 81 0
out from C: 0 0 0 0 0 0
out from D: 81 0 0 0 81 0
out from E: 0 90 0 72,9 0 90
out from F: 0 0 100 0 81 0
showPolicy
from a goto B
from b goto C
from c goto C
from d goto A
from e goto B
from f goto C
Time: 0.025 sec.

I know this is a fairly old post, but I came across it when looking for MDP-related questions, and I wanted to add (for folks coming in here) a few more comments about what you stated "s" and "a" were.
I think for a you are absolutely correct: it's your list of [up, down, left, right].
However, s is really the current location in the grid, and s' is a location you can go to.
What that means is that you pick a state, then pick a particular s' and go through all the actions that can take you to that s', which you use to figure out those values (picking the max out of those). Then you move on to the next s' and do the same thing; when you've exhausted all the s' values, you take the max of everything you just searched.
Suppose you picked a grid cell in the corner: you'd only have 2 states you could possibly move to (assuming the bottom left corner). Depending on how you choose to "name" your states, we could in this case assume a state is an x,y coordinate, so your current state s is 1,1 and your s' (or s prime) list is x+1,y and x,y+1 (no diagonals in this example). This is the summation part that goes over all s'.
Also, you don't have it listed in your equation, but the max is over a, the action that gives you the max; so first you pick the s' that gives you the max and then within that you pick the action (at least this is my understanding of the algorithm).
So if you had
x,y+1 left = 10
x,y+1 right = 5
x+1,y left = 3
x+1,y right = 2
You'll pick x,y+1 as your s', but then you'll need to pick the action that is maximized, which in this case is left for x,y+1. I'm not sure whether there is a subtle difference between just finding the maximum number and finding the state and then the maximum number, though, so maybe someone someday can clarify that for me.
If your movements are deterministic (meaning that if you say go forward, you go forward with 100% certainty), then it's pretty easy: you have one action. However, if they are non-deterministic, with, say, 80% certainty, then you should also consider the other actions which could get you there. This is the context of the slippery wheel that Jose mentioned above.
I don't want to detract from what others have said, just to give some additional information.
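On the open question above about the order of the sum and the max: in the Bellman update as given in Russell & Norvig, the sum comes first. For each action a, the utilities of every reachable s' are weighted by T(s,a,s') and added up, and only then is the maximum taken over the actions. A minimal sketch with made-up numbers (the probabilities, utilities, gamma and action names below are all hypothetical):

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical T(s, a, s') for one fixed state s, and current estimates U(s').
T = {
    "up":    {"x,y+1": 0.8, "x+1,y": 0.2},
    "right": {"x,y+1": 0.2, "x+1,y": 0.8},
}
U = {"x,y+1": 10.0, "x+1,y": 3.0}
R_s = 0.0  # reward of the current state s


def expected_utility(action):
    """Inner sum: every s' reachable with this action, weighted by its probability."""
    return sum(p * U[s2] for s2, p in T[action].items())


# Outer max: compare the actions only after each one has been summed.
best = max(T, key=expected_utility)
new_U = R_s + GAMMA * expected_utility(best)
print(best, new_U)
```

So there is a real difference: picking the single best s' first would ignore the probability of actually ending up there.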

Related

HMM Localization in 2D maze, trouble applying smoothing (backward algorithm)

We use HMM (Hidden Markov Model) to localize a robot in a windy maze with damaged sensors. If he attempts to move in a direction, he will do so with a high probability, and a low chance to accidentally go to either side. If his movement would make him go over an obstacle, he will bounce back to the original tile.
From any given position, he can sense in all four directions. He will notice an obstacle if it is there with high certainty, and see an obstacle when there is none with low certainty.
We have a probability map for all possible places the robot might be in the maze, since he knows what the maze looks like. Initially it all starts evenly distributed.
I have completed the motion and sensing aspect of this and am getting the proper answers, but I am stuck on smoothing (backward algorithm).
Assume that the robot performs the following sequence of actions: senses, moves, senses, moves, senses. This gives us 3 states in our HMM model. Assume that the results I have at each step of the way so far are correct.
I am having a lot of trouble performing smoothing (backward algorithm), given that there are four conditional probabilities (one for each direction).
Assume SP is for smoothing probability, BP is for backward probability
Assume Sk is for a state, and Zk is for an observation at that state. The problem for me is figuring out how to construct my backwards equation given that each Zk is only for a single direction.
I know the algorithm for smoothing is: SP(k) is proportional to BP(k+1) * P(Sk | Z1:k)
Where BP(k+1) is defined recursively as:
if (k == n) return 1 else return Sum(s) of BP(k+2) * P(Zk+1|Sk+1=s) * P(Sk+1=s | Sk)
This is where I am having my trouble. Mainly in the Conditional Probability portion of this equation. Because each spot has four different directions that it observed! In other words, each state has four different evidence variables as opposed to just one! Do I average these values? Do I do a separate summation for them? How do I account for multiple observations at a given state and properly condense it into this equation which only has room for one conditional probability?
Here is the code I have performing the smoothing:
public static void Smoothing(List<int[]> observations) {
int n = observations.Count; //n is Total length of evidence sequence
int k = n - 1; //k is the state we are trying to smooth. start with n-1
for (; k >= 1; k--) { //Smooth all the way back to the first state
for (int dir = 0; dir < 4; dir++) {
//We must smooth each direction separately
SmoothDirection(dir, observations, k, n);
}
Console.WriteLine($"Smoothing for k = {k}\n");
UpdateMapMotion(mapHistory[k]);
PrintMap();
}
}
public static void SmoothDirection(int dir, List<int[]> observations, int k, int n) {
var alphas = new double[ROWS, COLS];
var normalizer = 0.0;
int row, col;
foreach (var t in map) {
if (t.isObstacle) continue;
row = t.pos.y;
col = t.pos.x;
alphas[row, col] = mapHistory[k][row, col]
* Backwards(k, n, t, dir, observations, moves[^(n - k)]);
normalizer += alphas[row, col];
}
UpdateHistory(k, alphas, normalizer);
}
public static void UpdateHistory(int index, double[,] alphas, double normalizer) {
for (int r = 0; r < ROWS; r++) {
for (int c = 0; c < COLS; c++) {
mapHistory[index][r, c] = alphas[r, c] / normalizer;
}
}
}
public static double Backwards(int k, int n, Tile t, int dir, List<int[]> observations, int moveDir) {
if (k == n) return 1;
double p = 0;
var nextStates = GetPossibleNextStates(t, moveDir);
foreach (var s in nextStates) {
p += Cond_Prob(s.hasObstacle[dir], observations[^(n - k)][dir] == 1) * Trans_Prob(t, s, moveDir)
* Backwards(k+1, n, s, dir, observations, moves[^(n - k)]);
}
return p;
}
public static List<Tile> GetPossibleNextStates(Tile t, int direction) {
var tiles = new List<Tile>(); //Next States
var perpDirs = GetPerpendicularDir(direction); //Perpendicular Directions
//If obstacle in front of Tile t or on the sides, Tile t is a possible next state.
if (t.hasObstacle[direction] || t.hasObstacle[perpDirs[0]] || t.hasObstacle[perpDirs[1]])
tiles.Add(t);
//If there is no obstacle in front of Tile t, then that tile is a possible next state.
if (!t.hasObstacle[direction])
tiles.Add(GetTileAtPos(t.pos + directions[direction]));
//If there are no obstacles on the sides of Tile t, then those are possible next states.
foreach (var dir in perpDirs) {
if (!t.hasObstacle[dir])
tiles.Add(GetTileAtPos(t.pos + directions[dir]));
}
return tiles;
}
TL;DR : How do I perform smoothing (backward algorithm) in a Hidden Markov Model when there are 4 evidences at each state as opposed to just 1?

SOLVED!
It was actually much simpler than I imagined.
I don't actually need to run each iteration separately in each direction.
I just need to replace the Cond_Prob() function with Joint_Cond_Prob(), which finds the joint probability of all directional observations at a given state.
So P(Zk|Sk) is actually P(Zk1,...,Zk4|Sk), which is just P(Zk1|Sk)P(Zk2|Sk)P(Zk3|Sk)P(Zk4|Sk), assuming the four directional readings are conditionally independent given the state.
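A minimal Python sketch of that product, assuming the four readings are conditionally independent given the state. The function names mirror Cond_Prob() and Joint_Cond_Prob() from the post, but their bodies and the two sensor probabilities are assumptions, since the original implementations are not shown:

```python
from math import prod

P_DETECT = 0.9        # assumed P(sensor reports obstacle | obstacle present)
P_FALSE_ALARM = 0.05  # assumed P(sensor reports obstacle | no obstacle)


def cond_prob(has_obstacle, sensed_obstacle):
    """P(Zk_d | Sk) for a single direction d."""
    p_report = P_DETECT if has_obstacle else P_FALSE_ALARM
    return p_report if sensed_obstacle else 1.0 - p_report


def joint_cond_prob(tile_obstacles, observation):
    """P(Zk1, ..., Zk4 | Sk): the product of the four per-direction terms."""
    return prod(cond_prob(h, z == 1) for h, z in zip(tile_obstacles, observation))


# Tile with obstacles in directions 0 and 3; the sensor reported one only in direction 0.
print(joint_cond_prob([True, False, False, True], [1, 0, 0, 0]))
```

The backward recursion then uses this single joint value per state, so there is no need to smooth each direction separately.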

how to sort objects for Guendelman shock propagation?

This is a question regarding forming a topological tree in 3D. A bit of context: I have a physics engine where I have bodies and collision points and some constraints. This is not homework, but an experiment in multi-threading.
I need to sort the bodies in a bottom-to-top fashion with groups of objects belonging to layers like in this document: See section on "shock-propagation"
http://www2.imm.dtu.dk/visiondag/VD05/graphical/slides/kenny.pdf
the pseudocode he uses to describe how to iterate over the tree makes perfect sense:
shock-propagation(algorithm A)
    compute contact graph
    for each stack layer in bottom up order
        fixate bottom-most objects of layer
        apply algorithm A to layer
        un-fixate bottom-most objects of layer
    next layer
I already have algorithm A figured out (my impulse code). What would the pseudocode look like for tree/layer sorting (topo sort?) with a list of 3D points?
I.e., I don't know where to stop/begin the next "rung" or "branch". I guess I could just chunk it up by y position, but that seems clunky and error-prone. Do I look into topological sorting? I don't really know how to go about this in 3D. How would I get "edges" for a topo sort, if that's the way to do it?
Am I over thinking this and I just "connect the dots" by finding point p1 then the least distant next point p2 where p2.y > p1.y ? I see a problem here where p1 distance from p0 could be greater than p2 using pure distances, which would lead to a bad sort.

I just tackled this myself.
I found an example of how to accomplish this in the downloadable source code linked to this paper:
http://www-cs-students.stanford.edu/~eparker/files/PhysicsEngine/
Specifically the WorldState.cs file starting at line 221.
The idea is that you assign all static objects a level of -1 and every other object a different default level, say -2. Then, for each collision with a body at level -1, you add the body it collided with to a list and set its level to 0.
After that, using a while loop, while(list.Count > 0), you take a body from the list, check for the bodies that collide with it and have no level yet, set their levels to body.level + 1, and add them to the list.
After that, each body in the simulation that still has the default level (I said -2 earlier) gets the highest level.
There are a few more fine details, but looking at the code in the example will explain it way better than i ever could.
Hope it helps!
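The level assignment described above can be sketched in Python as a breadth-first search over the contact graph. This is an illustration under assumptions, not Evan Parker's code: bodies are plain hashable names, contacts is a list of pairs, and the function name is made up.

```python
from collections import deque


def assign_levels(bodies, static_bodies, contacts):
    """Return {body: level}: static bodies are -1, the layer resting on them is 0, and so on."""
    touching = {b: set() for b in bodies}
    for a, b in contacts:
        touching[a].add(b)
        touching[b].add(a)

    level = {b: -1 for b in static_bodies}
    queue = deque()
    # Seed: every non-static body touching a static body forms layer 0.
    for s in static_bodies:
        for b in touching[s]:
            if b not in level:
                level[b] = 0
                queue.append(b)
    # Breadth-first search outward from layer 0.
    while queue:
        a = queue.popleft()
        for b in touching[a]:
            if b not in level:
                level[b] = level[a] + 1
                queue.append(b)
    # Bodies that never touched the stack get the highest level found.
    top = max([0] + list(level.values()))
    for b in bodies:
        level.setdefault(b, top)
    return level


levels = assign_levels(
    bodies=["ground", "A", "B", "C", "D"],
    static_bodies=["ground"],
    contacts=[("ground", "A"), ("A", "B"), ("B", "C")],
)
print(levels)
```

Each level then plays the role of one "stack layer" in the shock-propagation loop, processed from level 0 upward.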
Relevant Code from Evan Parker's code. [Stanford]
{{{
// topological sort (bfs)
// TODO check this
int max_level = -1;
while (queue.Count > 0)
{
RigidBody a = queue.Dequeue() as RigidBody;
//Console.Out.WriteLine("considering collisions with '{0}'", a.Name);
if (a.level > max_level) max_level = a.level;
foreach (CollisionPair cp in a.collisions)
{
RigidBody b = (cp.body[0] == a ? cp.body[1] : cp.body[0]);
//Console.Out.WriteLine("considering collision between '{0}' and '{1}'", a.Name, b.Name);
if (!b.levelSet)
{
b.level = a.level + 1;
b.levelSet = true;
queue.Enqueue(b);
//Console.Out.WriteLine("found body '{0}' in level {1}", b.Name, b.level);
}
}
}
int num_levels = max_level + 1;
//Console.WriteLine("num_levels = {0}", num_levels);
ArrayList[] bodiesAtLevel = new ArrayList[num_levels];
ArrayList[] collisionsAtLevel = new ArrayList[num_levels];
for (int i = 0; i < num_levels; i++)
{
bodiesAtLevel[i] = new ArrayList();
collisionsAtLevel[i] = new ArrayList();
}
for (int i = 0; i < bodies.GetNumBodies(); i++)
{
RigidBody a = bodies.GetBody(i);
if (!a.levelSet || a.level < 0) continue; // either a static body or no contacts
// add a to a's level
bodiesAtLevel[a.level].Add(a);
// add collisions involving a to a's level
foreach (CollisionPair cp in a.collisions)
{
RigidBody b = (cp.body[0] == a ? cp.body[1] : cp.body[0]);
if (b.level <= a.level) // contact with object at or below the same level as a
{
// make sure not to add duplicate collisions
bool found = false;
foreach (CollisionPair cp2 in collisionsAtLevel[a.level])
if (cp == cp2) found = true;
if (!found) collisionsAtLevel[a.level].Add(cp);
}
}
}
for (int step = 0; step < num_contact_steps; step++)
{
for (int level = 0; level < num_levels; level++)
{
// process all contacts
foreach (CollisionPair cp in collisionsAtLevel[level])
{
cp.ResolveContact(dt, (num_contact_steps - step - 1) * -1.0f/num_contact_steps);
}
}
}
}}}

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of the form 2^i * 5^j in increasing order?
For example:
1, 2, 4, 5, 8, 10, 16, 20

This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing these data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works!):
#include <cstdint>
#include <iostream>
#include <limits>
#include <map>
#include <utility>
typedef std::int64_t Integer;
typedef std::pair<int,int> MyPair;          // the exponents (i,j)
typedef std::map<Integer,MyPair> MyMap;     // value 2^i*5^j -> (i,j), ordered by value
int main()
{
    // Beyond this value, multiplying by 5 would overflow a 64-bit integer
    const Integer limit = std::numeric_limits<Integer>::max() / 5;
    MyMap myMap;
    myMap[1] = MyPair(0,0);                 // 2^0 * 5^0 = 1
    while (!myMap.empty())
    {
        auto it = myMap.begin();            // smallest value not printed yet
        const Integer value = it->first;
        const MyPair myPair = it->second;
        std::cout << value << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
        myMap.erase(it);
        if (value > limit) break;           // stop before overflow
        // Integer arithmetic instead of pow() keeps the keys exact;
        // a value reached twice (e.g. 10) simply overwrites its own map entry
        myMap[value * 2] = MyPair(myPair.first + 1, myPair.second);
        myMap[value * 5] = MyPair(myPair.first, myPair.second + 1);
    }
}

This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current: int, next: unit -> stream)=
    member this.current = current
    member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
    if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
    else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
Works well with Results then being a stream type with current value and a next method.
Walking through it:
I define min for completeness.
I define a stream type to have a current value and a method to return a new stream, essentially the head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then advances that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define Squares to be a stream of the start value multiplied by increasing powers of 2.
AllPowers takes the start value and merges the stream of that value times every power of 2 with the AllPowers stream for the value multiplied by 5, since these are your only two options. You are effectively left with a tree of results.
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursion and compiler optimisations etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
    if (s.current > 0) then
        do System.Console.WriteLine(s.current)
        PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).

For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in range(100):
    mult_2 = numbers[next_2] * 2
    mult_5 = numbers[next_5] * 5
    if mult_2 < mult_5:
        candidate = mult_2
        next_2 += 1
    else:
        candidate = mult_5
        next_5 += 1
    # The comparison here is to avoid appending duplicates
    if candidate > numbers[-1]:
        numbers.append(candidate)
print(numbers)

So we have two loops, one incrementing i and second one incrementing j starting both from zero, right? (multiply symbol is confusing in the title of the question)
You can do something very straightforward:
Add all items in an array
Sort the array
Or do you need another solution with more mathematical analysis?
EDIT: A smarter solution that leverages the similarity with the Merge Sort problem
If we imagine the infinite sets of numbers 2^i and 5^j as two independent streams/lists, this problem looks very much like the well-known Merge Sort problem.
So solution steps are:
Get two numbers one from the each of streams (of 2 and of 5)
Compare
Return smallest
get next number from the stream of the previously returned smallest
and that's it! ;)
PS: The complexity of Merge Sort is always O(n*log(n))

I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable states are. Imagine an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];   // the entry 1 (row 0, column 0) counts as already visited
print(1);
while (/* some terminate condition or go on forever */) {
    minNext = 0;
    nextCell = -1;
    // look through all addable cells
    for (int i=0; i<mu.length; ++i) {
        if (i==0 or mu[i-1]>mu[i]) {
            // row i holds 5^i * 2^j, so its next unvisited entry is at column mu[i]
            candidate = 5^i * 2^mu[i];
            // check for new minimum value
            if (minNext == 0 or candidate < minNext) {
                nextCell = i;
                minNext = candidate;
            }
        }
    }
    // print next smallest entry and update mu
    print(minNext);
    mu[nextCell]++;
}
I wrote this in Maple, stopping after 12 iterations, and got:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
The order in which the cells were added was:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
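A minimal Python sketch of the partition idea, written to match the matrix shown above (row r holds 5^r times the powers of 2). The function name and the choice to grow mu as a list are assumptions made for illustration; this is not the Maple program.

```python
def first_terms(count):
    """Return the first `count` numbers of the form 2^i * 5^j in increasing order."""
    mu = []    # mu[r] = number of entries already emitted from row r
    out = []
    while len(out) < count:
        best_row, best_value = None, None
        # Addable cells: the next cell of each started row, plus the first cell of a new row.
        for r in range(len(mu) + 1):
            filled = mu[r] if r < len(mu) else 0
            if r > 0 and mu[r - 1] <= filled:
                continue  # the cell directly above has not been visited yet
            value = 5 ** r * 2 ** filled
            if best_value is None or value < best_value:
                best_row, best_value = r, value
        if best_row == len(mu):
            mu.append(0)  # start a new row
        mu[best_row] += 1
        out.append(best_value)
    return out


print(first_terms(12))
```

Only one candidate per addable cell is compared in each round, which is the point of tracking the partition instead of the whole matrix.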

First of all, (as others mentioned already) this question is very vague!!!
Nevertheless, I am going to give a shot based on your vague equation and the pattern as your expected result. So I am not sure the following will be true for what you are trying to do, however it may give you some idea about java collections!
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
    private static List<Integer> findIncreasingNumbers(int maxIteration) {
        // A TreeSet keeps its elements sorted and unique
        SortedSet<Integer> numbers = new TreeSet<Integer>();
        int powerOfTwo = 1; // 2^i
        for (int i = 0; i < maxIteration; i++) {
            int product = powerOfTwo; // 2^i * 5^j, starting at j = 0
            for (int j = 0; j < maxIteration; j++) {
                numbers.add(product);
                product *= 5;
            }
            powerOfTwo *= 2;
        }
        return new ArrayList<Integer>(numbers);
    }
    /**
     * Based on the following fuzzy question @ StackOverflow
     * http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
     *
     * Result for maxIteration = 5 (exponents 0..4, so the list is only gap-free below 2^5 = 32):
     * 1 2 4 5 8 10 16 20 25 40 50 80 100 125 200 250 400 500 625 1000 1250 2000 2500 5000 10000
     */
    public static void main(String[] args) {
        List<Integer> numbers = findIncreasingNumbers(5);
        for (Integer i : numbers) {
            System.out.print(i + " ");
        }
    }
}

If you can do it in O(nlogn), here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean min-heap
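The heap idea above can be sketched in Python as follows. This is only an illustration, not the answerer's code: the function name `first_numbers` and the `seen` set are assumptions added here. The set is needed because a number such as 10 would otherwise be pushed twice (once as 2*5 and once as 5*2) and then printed twice.

```python
import heapq

def first_numbers(count):
    """Return the first `count` numbers of the form 2^i * 5^j, in increasing order."""
    heap = [1]
    seen = {1}  # hypothetical helper: stops duplicates such as 10 = 2*5 = 5*2
    result = []
    while len(result) < count:
        num = heapq.heappop(heap)  # smallest number not yet output
        result.append(num)
        for nxt in (num * 2, num * 5):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, nxt)
    return result

print(first_numbers(12))  # [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```

Each number is pushed and popped once, so producing n numbers costs O(n log n) heap operations, at the price of keeping the pending candidates in memory.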

As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
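As a quick sanity check of the claim that ordering by i + j.(log(5)/log(2)) is the same as ordering by 2^i.5^j, here is a small Python sketch. The ranges for i and j are arbitrary choices for illustration, and floating point is only trusted here because the values are small.

```python
import math

C = math.log(5) / math.log(2)  # about 2.32

# All (i, j) pairs in a small, arbitrarily chosen window.
pairs = [(i, j) for i in range(8) for j in range(4)]

# Order once by the actual value and once by the transformed key.
by_value = sorted(pairs, key=lambda p: 2 ** p[0] * 5 ** p[1])
by_log = sorted(pairs, key=lambda p: p[0] + p[1] * C)

print(by_value == by_log)  # True
```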
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will possibly refer to as transforms of i,j or just transforms) gives us some clues of where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3, increasing j by 1 while decreasing i by 2 will give an increase of just 0.3. The problem then is finding, at each stage, the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thoughts I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to screen and that the syntax/structure isn't valid for a real C# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this, but I strongly suspect it's better than some other solutions because at each stage all I need to do is check my set of transforms. Working out how many of these there are compared to "n" is non-trivial. It is clearly related, since the further you go the more transforms there are, but the number of new transforms becomes vanishingly small at higher numbers, so maybe it's just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product allowing me to work out what the sequence would be without needing to calculate the actual number itself.
For what it's worth, after the first 230 numbers (when int runs out of space) I had 9 transforms to check each time. And given it's only my total that overflowed, I ran it for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)

I'm sure everyone has the answer by now, but I just wanted to point to this solution.
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}

The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used. Starting from zero, determine whether each non-negative integer has values for i and j. This was facilitated by turning answer = (2^i)*(5^j) around and solving for i instead. That gave me i = log2 (answer/ (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^i) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}
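The membership test above relies on a floating-point logarithm coming out as an exact integer. A variant that stays in integer arithmetic is sketched below in Python. It swaps in a different technique (dividing out the factors 2 and 5) rather than reproducing the code above, and the name `is_power_product` is made up for this illustration.

```python
def is_power_product(n):
    """True if n == 2**i * 5**j for some non-negative integers i and j."""
    if n < 1:
        return False
    for p in (2, 5):
        # Strip every factor p; anything left over is some other prime factor.
        while n % p == 0:
            n //= p
    return n == 1

print([n for n in range(1, 60) if is_power_product(n)])
# [1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50]
```

Like the sieve, this still tests every integer in turn, so it only changes how each candidate is checked, not how many candidates are examined.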

Interesting sorting problem

There are ones, zeroes and ‘U’s in a particular order. (E.g. “1001UU0011”) The number of ones and zeroes are the same, and there’s always two ‘U’s next to each other. You can swap the pair of ‘U’s with any pair of adjacent digits. Here’s a sample move:
__
/ \
1100UU0011 --> 11001100UU
The task is to put all the zeroes before the ones.
Here's a sample solution:
First step:
__
/ \
1100UU0011
Second step:
____
/ \
UU00110011
000011UU11 --> DONE
It’s pretty easy to create a brute-force algorithm. But with that it takes hundreds or even thousands of moves to solve a simple one like my example. So I’m looking for a more “clever” algorithm.
It's not homework; it was a task in a competition. The contest is over but I can’t find the solution for this.
Edit: The task here is to create an algorithm that can sort those 0s and 1s - not just output N 0s and N 1s and 2 Us. You have to show the steps somehow, like in my example.
Edit 2: The task didn't ask for the result with the least moves or anything like that. But personally I would love to see an algorithm that provides that :)

I think this should work:
Iterate once to find the position of the U's. If they don't occupy the last two spots, move them there by swapping with the last two.
Create a variable to track the currently sorted elements, initially set to array.length - 1, meaning anything after it is sorted.
Iterate backwards. Every time you encounter a 1:
swap the 1 and the element before it with the U's.
swap the U's back to the currently sorted elements tracker - 1, update variable
Continue until the beginning of the array.

This is quite an interesting problem - so let's try to solve it. I will start with a precise analysis of the problem and see what one can find out. I will add piece by piece to this answer over the next days. Any help is welcome.
A problem of size n is a problem with exactly n zeros, n ones, and two Us, hence it consists of 2n+2 symbols.
There are
$$\frac{(2n)!}{(n!)^2}$$
different sequences of exactly n zeros and n ones. Then there are 2n+1 possible positions to insert the two Us, hence there are
$$\frac{(2n)!}{(n!)^2}(2n+1) = \frac{(2n+1)!}{(n!)^2}$$
problem instances of size n.
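The count of problem instances can be checked by brute force for small n. The Python sketch below is only an illustration; the function names `count_formula` and `count_brute_force` are assumptions introduced here.

```python
from itertools import permutations
from math import factorial

def count_formula(n):
    """(2n+1)! / (n!)^2, the closed form for the number of instances of size n."""
    return factorial(2 * n + 1) // factorial(n) ** 2

def count_brute_force(n):
    """Enumerate every distinct string with n zeros, n ones and one adjacent UU pair."""
    digit_strings = {"".join(p) for p in permutations("0" * n + "1" * n)}
    instances = set()
    for s in digit_strings:
        for pos in range(2 * n + 1):  # 2n+1 places to insert the pair
            instances.add(s[:pos] + "UU" + s[pos:])
    return len(instances)

for n in range(1, 5):
    print(n, count_formula(n), count_brute_force(n))
```

For n = 1 both functions give 6, which matches the six size-one instances listed in this answer.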
Next I am looking for a way to assign a score to each problem instance and how this score changes under all possible moves hoping to find out what the minimal number of required moves is.
Instances of size one are either already sorted
--01 0--1 01--
(I think I will use hyphens instead of Us because they are easier to recognize) or cannot be sorted.
--10 ==only valid move==> 10--
-10- no valid move
10-- ==only valid move==> --10
In consequence I will assume n >= 2.
I am thinking about the inverse problem - what unordered sequences can be reached starting from an ordered sequence. The ordered sequences are determined up to the location of the two hyphens - so the next question is whether it is possible to reach every ordered sequence from every other ordered sequence. Because a sequence of moves can be performed forward and backward, it is sufficient to show that one specific ordered sequence is reachable from all others. I choose (0|n)(1|n)--. ((0|x) represents exactly x zeros. If x is not of the form n-m, zero or more is assumed. There may be additional constraints like a+b+2=n not explicitly stated. ^^ indicates the swap position. The 0/1 border is obviously between the last zero and first one.)
// n >= 2, at least two zeros between -- and the 0/1 border
(0|a)--(0|b)00(1|n) => (0|n)--(1|n-2)11 => (0|n)(1|n)--
^^ ^^
// n >= 3, one zero between -- and 0/1 border
(0|n-1)--01(1|n-1) => (0|n)1--(1|n-3)11 => (0|n)(1|n)--
^^ ^^
// n >= 2, -- after last zero but at least two ones after --
(0|n)(1|a)--(1|b)11 => (0|n)(1|n)--
^^
// n >= 3, exactly one one after --
(0|n)(1|n-3)11--1 => (0|n)(1|n-3)--111 => (0|n)(1|n)--
^^ ^^
// n >= 0, nothing to move
(0|n)(1|n)--
For the remaining two problems of size two - 0--011 and 001--1 - it seems not to be possible to reach 0011--. So for n >= 3 it is possible to reach every ordered sequence from every other ordered sequence in at most four moves (Probably less in all cases because I think it would have been better to choose (0|n)--(1|n) but I leave this for tomorrow.). The preliminary goal is to find out at what rate and under what conditions one can create (and in consequence remove) 010 and 101 because they seem to be the hard cases as already mentioned by others.

If you use a breadth-first brute force, it's still brute force, but at least you are guaranteed to come up with the shortest sequence of moves, if there is an answer at all. Here's a quick Python solution using a breadth-first search.
from time import time
def generate(c):
    # Yield every string reachable from c by swapping UU with one adjacent pair of digits.
    sep = "UU"
    c1, c2 = c.split(sep)
    for a in range(len(c1)-1):
        yield c1[0:a]+sep+c1[(a+2):]+c1[a:(a+2)]+c2
    for a in range(len(c2)-1):
        yield c1+c2[a:(a+2)]+c2[0:a]+sep+c2[(a+2):]
def solve(moves, used):
    # moves holds every path found at the current depth; used holds all strings already generated.
    if not moves: return None  # nothing left to expand, so there is no solution
    solved = [cl for cl in moves if cl[-1].rindex('0') < cl[-1].index('1')]
    if len(solved) > 0: return solved[0]
    return solve([cl+[d] for cl in moves for d in generate(cl[-1]) if d not in used and not used.add(d)], used)
code = input('enter the code:')
a = time()
print(solve([[code]], set()))
print("elapsed time:", (time()-a), "seconds")

Well, the first thing that comes to my mind is a top-down dynamic programming approach. It's kind of easy to understand but could eat a lot of memory. While I'm trying to apply a bottom-up approach you can try this one:
Idea is simple - cache all of the results for the brute-force search. It will become something like this:
function findBestStep(currentArray, cache) {
if (!cache.contains(currentArray)) {
for (all possible moves) {
find best move recursively
}
cache.set(currentArray, bestMove);
}
return cache.get(currentArray);
}
This method's complexity would be... O(2^n), which is creepy. However, I see no logical way it can be smaller, as any move is allowed.
If I find a way to apply a bottom-up algorithm it could be a little faster (it does not need a cache), but it will still have O(2^n) complexity.
Added:
Ok, I've implemented this thing in Java. The code is long, as it always is in Java, so don't get scared of its size. The main algorithm is pretty simple and can be found at the bottom. I don't think there can be any way faster than this (whether it can be faster is more of a mathematical question). It eats tons of memory but still computes it all pretty fast.
This 0,1,0,1,0,1,0,1,0,1,0,1,0,1,2,2 computes in 1 second, eating ~60mb memory resulting in 7 step sorting.
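import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
public class Main {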
public static final int UU_CODE = 2;
public static void main(String[] args) {
new Main();
}
private static class NumberSet {
private final int uuPosition;
private final int[] numberSet;
private final NumberSet parent;
public NumberSet(int[] numberSet) {
this(numberSet, null, findUUPosition(numberSet));
}
public NumberSet(int[] numberSet, NumberSet parent, int uuPosition) {
this.numberSet = numberSet;
this.parent = parent;
this.uuPosition = uuPosition;
}
public static int findUUPosition(int[] numberSet) {
for (int i=0;i<numberSet.length;i++) {
if (numberSet[i] == UU_CODE) {
return i;
}
}
return -1;
}
protected NumberSet getNextNumberSet(int uuMovePos) {
final int[] nextNumberSet = new int[numberSet.length];
System.arraycopy(numberSet, 0, nextNumberSet, 0, numberSet.length);
System.arraycopy(this.getNumberSet(), uuMovePos, nextNumberSet, uuPosition, 2);
System.arraycopy(this.getNumberSet(), uuPosition, nextNumberSet, uuMovePos, 2);
return new NumberSet(nextNumberSet, this, uuMovePos);
}
public Collection<NumberSet> getNextPositionalSteps() {
final Collection<NumberSet> result = new LinkedList<NumberSet>();
for (int i=0;i<=numberSet.length;i++) {
final int[] nextNumberSet = new int[numberSet.length+2];
System.arraycopy(numberSet, 0, nextNumberSet, 0, i);
Arrays.fill(nextNumberSet, i, i+2, UU_CODE);
System.arraycopy(numberSet, i, nextNumberSet, i+2, numberSet.length-i);
result.add(new NumberSet(nextNumberSet, this, i));
}
return result;
}
public Collection<NumberSet> getNextSteps() {
final Collection<NumberSet> result = new LinkedList<NumberSet>();
for (int i=0;i<=uuPosition-2;i++) {
result.add(getNextNumberSet(i));
}
for (int i=uuPosition+2;i<numberSet.length-1;i++) {
result.add(getNextNumberSet(i));
}
return result;
}
public boolean isFinished() {
boolean ones = false;
for (int i=0;i<numberSet.length;i++) {
if (numberSet[i] == 1)
ones = true;
else if (numberSet[i] == 0 && ones)
return false;
}
return true;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final NumberSet other = (NumberSet) obj;
if (!Arrays.equals(this.numberSet, other.numberSet)) {
return false;
}
return true;
}
@Override
public int hashCode() {
int hash = 7;
hash = 83 * hash + Arrays.hashCode(this.numberSet);
return hash;
}
public int[] getNumberSet() {
return this.numberSet;
}
public NumberSet getParent() {
return parent;
}
public int getUUPosition() {
return uuPosition;
}
}
void precacheNumberMap(Map<NumberSet, NumberSet> setMap, int length, NumberSet endSet) {
int[] startArray = new int[length*2];
for (int i=0;i<length;i++) startArray[i]=0;
for (int i=length;i<length*2;i++) startArray[i]=1;
NumberSet currentSet = new NumberSet(startArray);
Collection<NumberSet> nextSteps = currentSet.getNextPositionalSteps();
List<NumberSet> nextNextSteps = new LinkedList<NumberSet>();
int depth = 1;
while (nextSteps.size() > 0) {
for (NumberSet nextSet : nextSteps) {
if (!setMap.containsKey(nextSet)) {
setMap.put(nextSet, nextSet);
nextNextSteps.addAll(nextSet.getNextSteps());
if (nextSet.equals(endSet)) {
return;
}
}
}
nextSteps = nextNextSteps;
nextNextSteps = new LinkedList<NumberSet>();
depth++;
}
}
public Main() {
final Map<NumberSet, NumberSet> cache = new HashMap<NumberSet, NumberSet>();
final NumberSet startSet = new NumberSet(new int[] {0,1,0,1,0,1,0,1,0,1,0,1,0,1,2,2});
precacheNumberMap(cache, (startSet.getNumberSet().length-2)/2, startSet);
if (cache.containsKey(startSet) == false) {
System.out.println("No solutions");
} else {
NumberSet cachedSet = cache.get(startSet).getParent();
while (cachedSet != null && cachedSet.parent != null) {
System.out.println(cachedSet.getUUPosition());
cachedSet = cachedSet.getParent();
}
}
}
}

Here's a try:
Start:
let c1 = the total number of 1s
let c0 = the total number of 0s
if the UU is at the right end of the string, goto StartFromLeft
StartFromRight
starting at the right end of the string, move left, counting 1s,
until you reach a 0 or the UU.
If you've reached the UU, goto StartFromLeft.
If the count of 1s equals c1, you are done.
Else, swap UU with the 0 and its left neighbor if possible.
If not, goto StartFromLeft.
StartFromLeft
starting at the left end of the string, move right, counting 0s,
until you reach a 1 or the UU.
If you've reached the UU, goto StartFromRight.
If the count of 0s equals c0, you are done.
Else, swap UU with the 1 and its right neighbor, if possible.
If not, goto StartFromRight
Then goto StartFromRight.
So, for the original 1100UU0011:
1100UU0011 - original
110000UU11 - start from right, swap UU with 00
UU00001111 - start from left, swap UU with 11
For the trickier 0101UU01
0101UU01 - original
0UU11001 - start from right, can't swap UU with U0, so start from left and swap UU with 10
00011UU1 - start from right, swap UU with 00
However, this won't solve something like 01UU0...but that could be fixed by a flag - if you've gone through the whole algorithm once, made no swaps and it isn't solved...do something.

About the question... It never asked for the optimal solution and these types of questions do not want that. You need to write a general purpose algorithm to handle this problem and a brute-force search to find the best solution is not feasible for strings that may be megabytes in length. Also I noticed late that there are guaranteed to be the same number of 0s and 1s, but I think it's more interesting to work with the general case where there may be different numbers of 0s and 1s. There actually isn't guaranteed to be a solution in every case if the length of the input string is less than 7, even in the case where you have 2 0s and 2 1s.
Size 3: Only one digit so it is sorted by definition (UU0 UU1 0UU 1UU)
Size 4: No way to alter the order. There are no moves if UU is in the middle, and only swap with both digits if it is at an end (1UU0 no moves, UU10->10UU->UU10, etc)
Size 5: UU in the middle can only move to the far end and not change the order of the 0s and 1s (1UU10->110UU). UU at an end can move to middle and not change order, but only move back to the same end so there is no use for it (UU110->11UU0->UU110). The only way to change digits is if the UU is at an end and to swap with the opposite end. (UUABC->BCAUU or ABCUU->UUCAB). This means that if UU is at positions 0 or 2 it can solve if 0 is in the middle (UU101->011UU or UU100->001UU) and if UU is at positions 1 or 3 it can solve if 1 is in the middle (010UU->UU001 or 110UU->UU011). Anything else is already solved or is unsolvable. If we need to handle this case, I would say hard-code it. If sorted, return result (no moves). If UU is in the middle somewhere, move it to the end. Swap from the end to the other end and that is the only possible swap whether it is now sorted or not.
Size 6: Now we get to a position where we can have a string specified according to the rules where we can make moves but where there can be no solution. This is the problem point with any algorithm, because I would think a condition of any solution should be that it will let you know if it cannot be solved. For instance 0010, 0100, 1000, 1011, 1100, 1101, and 1110 can be solved no matter where the UU is placed and the worst cases take 4 moves to solve. 0101 and 1010 can only be solved if UU is in an odd position. 0110 and 1001 can only be solved if UU is in an even position (either end or middle).
I think the best way will be something like the following, but I haven't written it yet. First, make sure you place a '1' at the end of the list. If the end is currently 0, move UU to the end then move it to the last '1' position - 1. After that you continually move UU to the first '1', then to the first '0' after the new UU. This will move all the 0s to the start of the list. I've seen a similar answer the other way, but it didn't take into account the final character on either end. This can run into issues with small values still (i.e. 001UU01, cannot move to first 1, move to end 00101UU lets us move to start but leaves 0 at end 00UU110).
My guess is that you can hard-code special cases like that. I'm thinking there may be a better algorithm though. For instance you could use the first two characters as a 'temporary swap variable'. You would put UU there and then do combinations of operations on others to leave UU back at the start. For instance, UUABCDE can swap AB with CD or DE or BC with DE (BCAUUDE->BCADEUU->UUADEBC).
Another possible thing would be to treat the characters as two blocks of two base-3 bits
0101UU0101 will show up as 11C11 or 3593. Maybe also something like a combination of hard-coded swaps. For instance if you ever see 11UU, move UU left 2. If you ever see UU00, move UU right two. If you see UU100, or UU101, move UU right 2 to get 001UU or 011UU.
Maybe another possibility would be some algorithm to move 0s left of center and 1s right of center (if it is given that there are the same number of 0s and 1s).
Maybe it would be better to work on a structure that contained only 0s and 1s with a position for UU.
Maybe look at the resulting condition better, allowing for UU to be anywhere in the string, these conditions MUST be satisfied:
No 0s after Length/2
No 1s before (Length/2-1)
Maybe there are more general rules, like it's really good to swap UU with 10 in this case '10111UU0' because a '0' is after UU now and that would let you move the new 00 back to where the 10 was (10111UU0->UU111100->001111UU).
Anyway, here's the brute force code in C#. The input is a string and an empty Dictionary. It fills the dictionary with every possible resulting string as the keys and the list of shortest steps to get there as the value:
Call:
m_Steps = new Dictionary<string, List<string>>();
DoStep("UU1010011101", new List<string>());
It includes DoTests() which calls DoStep for every possible string with the given number of digits (not including UU):
Dictionary<string, List<string>> m_Steps = new Dictionary<string, List<string>>();
public void DoStep(string state, List<string> moves) {
if (m_Steps.ContainsKey(state) && m_Steps[state].Count <= moves.Count + 1) // have better already
return;
// we have a better (or new) solution to get to this state, so set it to the moves we used to get here
List<string> newMoves = new List<string>(moves);
newMoves.Add(state);
m_Steps[state] = newMoves;
// if the state is a valid solution, stop here
if (state.IndexOf('1') > state.LastIndexOf('0'))
return;
// try all moves
int upos = state.IndexOf('U');
for (int i = 0; i < state.Length - 1; i++) {
// need to be at least 2 before or 2 after the UU position (00UU11 upos is 2, so can only move to 0 or 4)
if (i > upos - 2 && i < upos + 2)
continue;
char[] chars = state.ToCharArray();
chars[upos] = chars[i];
chars[upos + 1] = chars[i + 1];
chars[i] = chars[i + 1] = 'U';
DoStep(new String(chars), newMoves);
}
}
public void DoTests(int digits) { // try all combinations
char[] chars = new char[digits + 2];
for (int value = 0; value < (2 << digits); value++) {
for (int uupos = 0; uupos < chars.Length - 1; uupos++) {
for (int i = 0; i < chars.Length; i++) {
if (i < uupos)
chars[i] = ((value >> i) & 0x01) > 0 ? '1' : '0';
else if (i > uupos + 1)
chars[i] = ((value >> (i - 2)) & 0x01) > 0 ? '1' : '0';
else
chars[i] = 'U';
}
m_Steps = new Dictionary<string, List<string>>();
DoStep(new string(chars), new List<string>());
foreach (string key in m_Steps.Keys)
if (key.IndexOf('1') > key.LastIndexOf('0')) { // winner
foreach (string step in m_Steps[key])
Console.Write("{0}\t", step);
Console.WriteLine();
}
}
}
}

Counting sort.
If A is the number of 0s, A is also the number of 1s, and U is the number of Us:
for(int i=0; i<A; i++)
data[i] = '0';
for(int i=0; i<A; i++)
data[A+i] = '1';
for(int i=0; i<U; i++)
data[A+A+i] = 'U';

There are only 2 Us?
Why not just count the number of 0s and store the positions of the Us:
numberOfZeros = 0
uPosition = []
for i, value in enumerate(sample):
    if value == '0':
        numberOfZeros += 1
    if value == 'U':
        uPosition.append(i)
result = []
for i in range(len(sample)):
    # keep each U where it was found
    if uPosition and i == uPosition[0]:
        result.append('U')
        uPosition.pop(0)
        continue
    if numberOfZeros > 0:
        result.append('0')
        numberOfZeros -= 1
        continue
    result.append('1')
Would result in a runtime of O(n)
Or even better:
result = []
numberOfZeros = (len(sample)-2)//2
for i, value in enumerate(sample):
    if value == 'U':
        result.append('U')
        continue
    if numberOfZeros > 0:
        result.append('0')
        numberOfZeros -= 1
        continue
    result.append('1')

ACM Problem: Coin-Flipping, help me identify the type of problem this is

I'm practicing for the upcoming ACM programming competition in a week and I've gotten stumped on this programming problem.
The problem is as follows:
You have a puzzle consisting of a square grid of size 4. Each grid square holds a single coin; each coin is showing either heads (H) or tails (T). One such puzzle is shown here:
H H H H
T T T T
H T H T
T T H T
Any coin that is currently showing Tails (T) can be flipped to Heads (H). However, any time we flip a coin, we must also flip the adjacent coins directly above and below it and to its left and right in the same row. Thus if we flip the second coin in the second row we must also flip 4 other coins, giving us this arrangement (coins that changed are shown in bold).
H T H H
H H H T
H H H T
T T H T
If a coin is at the edge of the puzzle, so there is no coin on one side or the other, then we flip fewer coins. We do not "wrap around" to the other side. For example, if we flipped the bottom right coin of the arrangement above we would get:
H T H H
H H H T
H H H H
T T T H
Note: Only coins showing (T) tails can be selected for flipping. However, anytime we flip such a coin, adjacent coins are also flipped, regardless of their state.
The goal of the puzzle is to have all coins show heads. While it is possible for some arrangements to not have solutions, all the problems given will have solutions. The answer we are looking for is, for any given 4x4 grid of coins, the least number of flips needed to make the grid entirely heads.
For Example the grid:
H T H H
T T T H
H T H T
H H T T
The answer to this grid is: 2 flips.
What I have done so far:
I'm storing our grids as two-dimensional array of booleans. Heads = true, tails = false.
I have a flip(int row, int col) method that will flip the adjacent coins according the rules above and I have a isSolved() method that will determine if the puzzle is in a solved state (all heads). So we have our "mechanics" in place.
The part we are having problems with is how we should loop through the possible flips while going the fewest levels deep.

Your puzzle is a classic Breadth-First Search candidate. This is because you're looking for a solution with the fewest possible 'moves'.
If you knew the number of moves to the goal, then that would be ideal for a Depth-First Search.
Those Wikipedia articles contain plenty of information about the way the searches work, they even contain code samples in several languages.
Either search can be recursive, if you're sure you won't run out of stack space.

EDIT: I hadn't noticed that you can't use a coin as the primary move unless it's showing tails. That does indeed make order important. I'll leave this answer here, but look into writing another one as well.
No pseudo-code here, but think about this: can you ever imagine yourself flipping a coin twice? What would be the effect?
Alternatively, write down some arbitrary board (literally, write it down). Set up some real world coins, and pick two arbitrary ones, X and Y. Do an "X flip", then a "Y flip" then another "X flip". Write down the result. Now reset the board to the starting version, and just do a "Y flip". Compare the results, and think about what's happened. Try it a few times, sometimes with X and Y close together, sometimes not. Become confident in your conclusion.
That line of thought should lead you to a way of determining a finite set of possible solutions. You can test all of them fairly easily.
Hope this hint wasn't too blatant - I'll keep an eye on this question to see if you need more help. It's a nice puzzle.
As for recursion: you could use recursion. Personally, I wouldn't in this case.
EDIT: Actually, on second thoughts I probably would use recursion. It could make life a lot simpler.
Okay, perhaps that wasn't obvious enough. Let's label the coins A-P, like this:
ABCD
EFGH
IJKL
MNOP
Flipping F will always involve the following coins changing state: BEFGJ.
Flipping J will always involve the following coins changing state: FIJKN.
What happens if you flip a coin twice? The two flips cancel each other out, no matter what other flips occur.
In other words, flipping F and then J is the same as flipping J and then F. Flipping F and then J and then F again is the same as just flipping J to start with.
So any solution isn't really a path of "flip A then F then J" - it's "flip <these coins>; don't flip <these coins>". (It's unfortunate that the word "flip" is used for both the primary coin to flip and the secondary coins which change state for a particular move, but never mind - hopefully it's clear what I mean.)
Each coin will either be used as a primary move or not, 0 or 1. There are 16 coins, so 2^16 possibilities. So 0 might represent "don't do anything"; 1 might represent "just A"; 2 might represent "just B"; 3 "A and B" etc.
Test each combination. If (somehow) there's more than one solution, count the number of bits in each solution to find the least number.
Implementation hint: the "current state" can be represented as a 16 bit number as well. Using a particular coin as a primary move will always XOR the current state with a fixed number (for that coin). This makes it really easy to work out the effect of any particular combination of moves.
Okay, here's the solution in C#. It shows how many moves were required for each solution it finds, but it doesn't keep track of which moves those were, or what the least number of moves is. That's a SMOP :)
The input is a list of which coins are showing tails to start with - so for the example in the question, you'd start the program with an argument of "BEFGJLOP". Code:
using System;
public class CoinFlip
{
    // All ints could really be ushorts, but ints are easier
    // to work with
    static readonly int[] MoveTransitions = CalculateMoveTransitions();
    static int[] CalculateMoveTransitions()
    {
        int[] ret = new int[16];
        for (int i=0; i < 16; i++)
        {
            int row = i / 4;
            int col = i % 4;
            ret[i] = PositionToBit(row, col) +
                PositionToBit(row-1, col) +
                PositionToBit(row+1, col) +
                PositionToBit(row, col-1) +
                PositionToBit(row, col+1);
        }
        return ret;
    }
    static int PositionToBit(int row, int col)
    {
        if (row < 0 || row > 3 || col < 0 || col > 3)
        {
            // Makes edge detection easier
            return 0;
        }
        return 1 << (row * 4 + col);
    }
    static void Main(string[] args)
    {
        int initial = 0;
        foreach (char c in args[0])
        {
            initial += 1 << (c-'A');
        }
        Console.WriteLine("Initial = {0}", initial);
        ChangeState(initial, 0, 0);
    }
    static void ChangeState(int current, int nextCoin, int currentFlips)
    {
        // Reached the end. Success?
        if (nextCoin == 16)
        {
            if (current == 0)
            {
                // More work required if we want to display the solution :)
                Console.WriteLine("Found solution with {0} flips", currentFlips);
            }
        }
        else
        {
            // Don't flip this coin
            ChangeState(current, nextCoin+1, currentFlips);
            // Or do...
            ChangeState(current ^ MoveTransitions[nextCoin], nextCoin+1, currentFlips+1);
        }
    }
}
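For what it's worth, here is a rough Java sketch of the bookkeeping the C# version leaves out: it tries all 2^16 combinations, keeps the one with the fewest bits set, and turns it back into coin letters. The names (CoinFlipMinimal, moveMask, parse, solve) are made up for this example, and like the code above it assumes that the order of moves doesn't matter and that a set bit means "tails".

```java
// Sketch: exhaustive search over all 2^16 move sets, keeping the smallest one.
// Assumption: bit i of a state is 1 when coin ('A' + i) shows tails.
class CoinFlipMinimal {
    // Bit mask of the coins toggled when coin i is the primary move.
    static int moveMask(int i) {
        int row = i / 4, col = i % 4;
        int mask = 1 << i;
        if (row > 0) mask |= 1 << (i - 4);
        if (row < 3) mask |= 1 << (i + 4);
        if (col > 0) mask |= 1 << (i - 1);
        if (col < 3) mask |= 1 << (i + 1);
        return mask;
    }

    // Parses a string such as "BEFGJLOP" (coins showing tails) into a state.
    static int parse(String tails) {
        int state = 0;
        for (char c : tails.toCharArray()) state |= 1 << (c - 'A');
        return state;
    }

    // Returns the primary coins of a smallest solution, or null if none exists.
    static String solve(String tails) {
        int initial = parse(tails);
        int best = -1; // -1 means "no solution found yet"
        for (int combo = 0; combo < (1 << 16); combo++) {
            int state = initial;
            for (int i = 0; i < 16; i++) {
                if ((combo & (1 << i)) != 0) state ^= moveMask(i);
            }
            if (state == 0
                    && (best < 0 || Integer.bitCount(combo) < Integer.bitCount(best))) {
                best = combo;
            }
        }
        if (best < 0) return null;
        StringBuilder moves = new StringBuilder();
        for (int i = 0; i < 16; i++) {
            if ((best & (1 << i)) != 0) moves.append((char) ('A' + i));
        }
        return moves.toString();
    }

    public static void main(String[] args) {
        System.out.println(solve(args.length > 0 ? args[0] : "BEFGJLOP"));
    }
}
```

For the example input "BEFGJLOP" this should report the two primary coins F and P, since flipping F clears BEFGJ and flipping P clears the remaining LOP.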

I would suggest a breadth first search, as someone else already mentioned.
The big secret here is to have multiple copies of the game board. Don't think of "the board."
I suggest creating a data structure that contains a representation of a board, and an ordered list of moves that got to that board from the starting position. A move is the coordinates of the center coin in a set of flips. I'll call an instance of this data structure a "state" below.
My basic algorithm would look something like this:
Create a queue.
Create a state that contains the start position and an empty list of moves.
Put this state into the queue.
Loop forever:
    Pull first state off of queue.
    For each coin showing tails on the board:
        Create a new state by flipping that coin and the appropriate others around it.
        Add the coordinates of that coin to the list of moves in the new state.
        If the new state shows all heads:
            Rejoice, you are done.
        Push the new state into the end of the queue.
If you like, you could add a limit to the length of the queue or the length of move lists, to pick a place to give up. You could also keep track of boards that you have already seen in order to detect loops. If the queue empties and you haven't found any solutions, then none exist.
Also, a few of the comments already made seem to ignore the fact that the problem only allows coins that show tails to be in the middle of a move. This means that order very much does matter. If the first move flips a coin from heads to tails, then that coin can be the center of the second move, but it could not have been the center of the first move. Similarly, if the first move flips a coin from tails to heads, then that coin cannot be the center of the second move, even though it could have been the center of the first move.
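Here is a minimal Java sketch of the loop described above. It is only one way to do it: it assumes the board is packed into a 16-bit integer (bit set means tails), it records moves as coin letters A-P instead of coordinates, and it uses the "boards already seen" idea so that the search always terminates. All names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of the breadth-first search with a visited table.
// Assumptions: bit i of a board is 1 when coin ('A' + i) shows tails, and
// only a coin currently showing tails may be the centre of a move.
class CoinFlipBfs {
    // Bit mask of the coins toggled when coin i is the centre of a move.
    static int moveMask(int i) {
        int row = i / 4, col = i % 4;
        int mask = 1 << i;
        if (row > 0) mask |= 1 << (i - 4);
        if (row < 3) mask |= 1 << (i + 4);
        if (col > 0) mask |= 1 << (i - 1);
        if (col < 3) mask |= 1 << (i + 1);
        return mask;
    }

    // Parses a string such as "BEFGJLOP" (coins showing tails) into a board.
    static int parse(String tails) {
        int board = 0;
        for (char c : tails.toCharArray()) board |= 1 << (c - 'A');
        return board;
    }

    // Returns a shortest legal move sequence as letters, or null if none exists.
    static String shortestMoves(int start) {
        String[] movesTo = new String[1 << 16]; // null = board not seen yet
        Queue<Integer> queue = new ArrayDeque<>();
        movesTo[start] = "";
        queue.add(start);
        while (!queue.isEmpty()) {
            int board = queue.remove();
            if (board == 0) return movesTo[board]; // all heads
            for (int i = 0; i < 16; i++) {
                if ((board & (1 << i)) == 0) continue; // heads: not a legal centre
                int next = board ^ moveMask(i);
                if (movesTo[next] == null) { // first visit is via a shortest path
                    movesTo[next] = movesTo[board] + (char) ('A' + i);
                    queue.add(next);
                }
            }
        }
        return null; // queue emptied without reaching all heads
    }

    public static void main(String[] args) {
        System.out.println(shortestMoves(parse("BEFGJLOP")));
    }
}
```

Because there are only 65536 possible boards, the visited table also puts a hard bound on the queue, so no separate depth limit is needed in this version.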

The grid, read in row-major order, is nothing more than a 16-bit integer. Both the grid given by the problem and the 16 possible moves (or "generators") can be stored as 16-bit integers, so the problem amounts to finding the smallest number of generators which, combined by bitwise XOR, give the grid itself as the result. I wonder if there's a smarter alternative than trying all the 65536 possibilities.
EDIT: Indeed there is a convenient way to do the brute-forcing. You can try all the 1-move patterns, then all the 2-move patterns, and so on. When an n-move pattern matches the grid, you can stop, exhibit the winning pattern and say that n is the minimum number of moves, since every smaller count has already been ruled out. Enumerating all the n-move patterns is a recursive problem.
EDIT2: You can bruteforce with something along the lines of the following (probably buggy) recursive pseudocode:
// Tries all the n-bit patterns with exactly k bits set to 1
tryAllPatterns(int n, int k, unsigned short commonAddend=0)
{
    if(k < 0 || k > n)
        return; // k ones cannot fit into n bits
    if(n == 0)
        tryPattern(commonAddend); // k is 0 here, so the pattern is complete
    else
    {
        // All the patterns that have the n-th bit set to 1 and k-1 bits
        // set to 1 in the remaining
        tryAllPatterns(n-1, k-1, ((1 << (n-1)) xor commonAddend) );
        // All the patterns that have the n-th bit set to 0 and k bits
        // set to 1 in the remaining
        tryAllPatterns(n-1, k, commonAddend );
    }
}
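To tie the recursion back to the grid, here is a hedged Java sketch of the whole "try 0 moves, then 1, then 2, ..." loop. It assumes a set bit means tails and that move order is irrelevant; the names (IncreasingDepthSearch, generator, apply, search, fewestMoves) are invented for the example, and the recursion stops at the first match instead of calling a separate tryPattern.

```java
// Sketch: enumerate move patterns by increasing number of set bits and stop
// at the first one whose generators XOR to the given grid.
// Assumption: bit i of the grid is 1 when coin ('A' + i) shows tails.
class IncreasingDepthSearch {
    // Bit mask of the coins toggled by move i (the "generator" for coin i).
    static int generator(int i) {
        int row = i / 4, col = i % 4;
        int mask = 1 << i;
        if (row > 0) mask |= 1 << (i - 4);
        if (row < 3) mask |= 1 << (i + 4);
        if (col > 0) mask |= 1 << (i - 1);
        if (col < 3) mask |= 1 << (i + 1);
        return mask;
    }

    // XOR of all generators selected by the bits of 'pattern'.
    static int apply(int pattern) {
        int sum = 0;
        for (int i = 0; i < 16; i++) {
            if ((pattern & (1 << i)) != 0) sum ^= generator(i);
        }
        return sum;
    }

    // Looks for a way to set exactly k of the n lowest bits (on top of the
    // bits already in 'chosen') so that the pattern produces 'grid'.
    // Returns the matching pattern, or -1 if there is none.
    static int search(int n, int k, int chosen, int grid) {
        if (k < 0 || k > n) return -1; // k ones cannot fit into n bits
        if (n == 0) return apply(chosen) == grid ? chosen : -1;
        int found = search(n - 1, k - 1, chosen | (1 << (n - 1)), grid);
        return found >= 0 ? found : search(n - 1, k, chosen, grid);
    }

    // Tries 0 moves, then 1, then 2, ... Returns the first matching pattern
    // (which therefore uses the fewest moves), or -1 if the grid is unsolvable.
    static int fewestMoves(int grid) {
        for (int k = 0; k <= 16; k++) {
            int found = search(16, k, 0, grid);
            if (found >= 0) return found;
        }
        return -1;
    }

    public static void main(String[] args) {
        int grid = 0;
        for (char c : "BEFGJLOP".toCharArray()) grid |= 1 << (c - 'A');
        int pattern = fewestMoves(grid);
        System.out.println(pattern + " uses " + Integer.bitCount(pattern) + " moves");
    }
}
```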

To elaborate on Federico's suggestion, the problem is about finding a subset of the 16 generators that, XORed together, gives the starting position.
But if we consider each generator as a vector of integers modulo 2, this becomes finding a linear combination of vectors that equals the starting position.
Solving this should just be a matter of Gaussian elimination (mod 2).
EDIT:
After thinking a bit more, I think this would work:
Build a binary matrix G of all the generators, and let s be the starting state. We are looking for vectors x satisfying Gx=s (mod 2). After doing gaussian elimination, we either end up with such a vector x or we find that there are no solutions.
The problem is then to find the vector y such that Gy = 0 and x^y has as few bits set as possible, and I think the easiest way to find this would be to try all such y. Since they only depend on G, they can be precomputed.
I admit that a brute-force search would be a lot easier to implement, though. =)
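In case anyone wants to see the shape of it, below is a rough Java sketch of the elimination idea for the 4x4 board. It is an illustration under assumptions, not a tuned implementation: each row of G is stored as a bit mask with the right-hand side in bit 16, a set bit in s means tails, and the names (CoinFlipGauss, generator, bit, solve) are invented here. After the elimination it builds a basis of the vectors y with Gy = 0 and tries every combination, as described above.

```java
import java.util.Arrays;

// Sketch: solve G x = s over GF(2) by Gauss-Jordan elimination, then search
// the null space of G for the solution with the fewest bits set.
// Assumption: bit i of s is 1 when coin ('A' + i) shows tails.
class CoinFlipGauss {
    // Coins toggled by move i. G is symmetric, so this is also row i of G.
    static int generator(int i) {
        int row = i / 4, col = i % 4;
        int mask = 1 << i;
        if (row > 0) mask |= 1 << (i - 4);
        if (row < 3) mask |= 1 << (i + 4);
        if (col > 0) mask |= 1 << (i - 1);
        if (col < 3) mask |= 1 << (i + 1);
        return mask;
    }

    static boolean bit(int value, int index) {
        return ((value >> index) & 1) != 0;
    }

    // Returns a solution with the fewest moves as a bit mask, or -1 if none.
    static int solve(int s) {
        final int n = 16;
        int[] rows = new int[n]; // bits 0..15: coefficients, bit 16: right-hand side
        for (int i = 0; i < n; i++) {
            rows[i] = generator(i) | (bit(s, i) ? 1 << n : 0);
        }
        int[] pivotRow = new int[n]; // pivotRow[col] = row holding that pivot, or -1
        Arrays.fill(pivotRow, -1);
        int rank = 0;
        for (int col = 0; col < n; col++) {
            int p = rank;
            while (p < n && !bit(rows[p], col)) p++;
            if (p == n) continue; // no pivot: x[col] is a free variable
            int tmp = rows[p]; rows[p] = rows[rank]; rows[rank] = tmp;
            for (int r = 0; r < n; r++) {
                if (r != rank && bit(rows[r], col)) rows[r] ^= rows[rank];
            }
            pivotRow[col] = rank++;
        }
        for (int r = rank; r < n; r++) {
            if (rows[r] != 0) return -1; // row reads 0 = 1, so no solution
        }
        int x = 0;                        // particular solution, free variables = 0
        int[] kernel = new int[n - rank]; // basis of { y : G y = 0 }
        int free = 0;
        for (int col = 0; col < n; col++) {
            if (pivotRow[col] >= 0) {
                if (bit(rows[pivotRow[col]], n)) x |= 1 << col;
            } else {
                int y = 1 << col;
                for (int c = 0; c < n; c++) {
                    if (pivotRow[c] >= 0 && bit(rows[pivotRow[c]], col)) y |= 1 << c;
                }
                kernel[free++] = y;
            }
        }
        int best = x;
        for (int combo = 1; combo < (1 << free); combo++) {
            int candidate = x;
            for (int b = 0; b < free; b++) {
                if (bit(combo, b)) candidate ^= kernel[b];
            }
            if (Integer.bitCount(candidate) < Integer.bitCount(best)) best = candidate;
        }
        return best;
    }

    public static void main(String[] args) {
        int s = 0;
        for (char c : "BEFGJLOP".toCharArray()) s |= 1 << (c - 'A');
        System.out.println(Integer.toBinaryString(solve(s)));
    }
}
```

The kernel basis depends only on G, so in a real program it could be computed once and reused for every starting position.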

Okay, here's an answer now that I've read the rules properly :)
It's a breadth-first search using a queue of states and the moves taken to get there. It doesn't make any attempt to prevent cycles, but you have to specify a maximum number of iterations to try, so it can't go on forever.
This implementation creates a lot of strings - an immutable linked list of moves would be neater on this front, but I don't have time for that right now.
using System;
using System.Collections.Generic;
public class CoinFlip
{
    struct Position
    {
        readonly string moves;
        readonly int state;
        public Position(string moves, int state)
        {
            this.moves = moves;
            this.state = state;
        }
        public string Moves { get { return moves; } }
        public int State { get { return state; } }
        public IEnumerable<Position> GetNextPositions()
        {
            for (int move = 0; move < 16; move++)
            {
                if ((state & (1 << move)) == 0)
                {
                    continue; // Not allowed - it's already heads
                }
                int newState = state ^ MoveTransitions[move];
                yield return new Position(moves + (char)(move+'A'), newState);
            }
        }
    }
    // All ints could really be ushorts, but ints are easier
    // to work with
    static readonly int[] MoveTransitions = CalculateMoveTransitions();
    static int[] CalculateMoveTransitions()
    {
        int[] ret = new int[16];
        for (int i=0; i < 16; i++)
        {
            int row = i / 4;
            int col = i % 4;
            ret[i] = PositionToBit(row, col) +
                PositionToBit(row-1, col) +
                PositionToBit(row+1, col) +
                PositionToBit(row, col-1) +
                PositionToBit(row, col+1);
        }
        return ret;
    }
    static int PositionToBit(int row, int col)
    {
        if (row < 0 || row > 3 || col < 0 || col > 3)
        {
            return 0;
        }
        return 1 << (row * 4 + col);
    }
    static void Main(string[] args)
    {
        int initial = 0;
        foreach (char c in args[0])
        {
            initial += 1 << (c-'A');
        }
        int maxDepth = int.Parse(args[1]);
        Queue<Position> queue = new Queue<Position>();
        queue.Enqueue(new Position("", initial));
        while (queue.Count != 0)
        {
            Position current = queue.Dequeue();
            if (current.State == 0)
            {
                Console.WriteLine("Found solution in {0} moves: {1}",
                    current.Moves.Length, current.Moves);
                return;
            }
            if (current.Moves.Length == maxDepth)
            {
                continue;
            }
            // Shame Queue<T> doesn't have EnqueueRange :(
            foreach (Position nextPosition in current.GetNextPositions())
            {
                queue.Enqueue(nextPosition);
            }
        }
        Console.WriteLine("No solutions");
    }
}

If you are practicing for the ACM, I would consider this puzzle also for non-trivial boards, say 1000x1000. Brute force / greedy may still work, but be careful to avoid exponential blow-up.

This is the classic "Lights Out" problem. There is actually an easy O(2^N) brute force solution, where N is either the width or the height, whichever is smaller.
Let's assume the following works on the width, since you can transpose it.
One observation is that you don't need to press the same button twice - it just cancels out.
The key concept is just that you only need to determine if you want to press the button for each item on the first row. Every other button press is uniquely determined by one thing - whether the light above the considered button is on. If you're looking at cell (x,y), and cell (x,y-1) is on, there's only one way to turn it off, by pressing (x,y). Iterate through the rows from top to bottom and if there are no lights left on at the end, you have a solution there. You can then take the min of all the tries.
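A small Java sketch of that row-by-row idea, under a few assumptions of mine: the board is a boolean[][] indexed as lit[y][x] with true meaning "on" (tails in the coin version), any cell may be pressed regardless of its current state (the plain Lights Out rule), and the names (LightChase, fewestPresses, pressAt, toggle) are invented. It always enumerates the first row as given and does not transpose the board to pick the smaller dimension.

```java
// Sketch: enumerate every choice of presses for the first row; all later
// presses are then forced by whether the light directly above is still on.
class LightChase {
    // Returns the fewest presses that switch every light off, or -1 if impossible.
    static int fewestPresses(boolean[][] lit) {
        int height = lit.length, width = lit[0].length;
        int best = -1;
        for (int firstRow = 0; firstRow < (1 << width); firstRow++) {
            boolean[][] grid = new boolean[height][];
            for (int y = 0; y < height; y++) grid[y] = lit[y].clone();
            int presses = 0;
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    // Row 0 follows the enumerated bits; every other row
                    // presses (x,y) exactly when (x,y-1) is still on.
                    boolean press = (y == 0)
                            ? ((firstRow >> x) & 1) != 0
                            : grid[y - 1][x];
                    if (press) {
                        pressAt(grid, x, y);
                        presses++;
                    }
                }
            }
            // Rows above the last one are off by construction; check the last row.
            boolean clear = true;
            for (int x = 0; x < width; x++) clear &= !grid[height - 1][x];
            if (clear && (best < 0 || presses < best)) best = presses;
        }
        return best;
    }

    // Toggles (x,y) and its four orthogonal neighbours.
    static void pressAt(boolean[][] grid, int x, int y) {
        toggle(grid, x, y);
        toggle(grid, x - 1, y);
        toggle(grid, x + 1, y);
        toggle(grid, x, y - 1);
        toggle(grid, x, y + 1);
    }

    // Toggles one cell, ignoring coordinates that fall off the board.
    static void toggle(boolean[][] grid, int x, int y) {
        if (y >= 0 && y < grid.length && x >= 0 && x < grid[y].length) {
            grid[y][x] = !grid[y][x];
        }
    }

    public static void main(String[] args) {
        boolean[][] lit = new boolean[4][4];
        for (char c : "BEFGJLOP".toCharArray()) lit[(c - 'A') / 4][(c - 'A') % 4] = true;
        System.out.println(fewestPresses(lit));
    }
}
```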

It's a finite state machine, where each "state" is the 16-bit integer corresponding to the value of each coin.
Each state has 16 outbound transitions, corresponding to the state after you flip each coin.
Once you've mapped out all the states and transitions, you have to find the shortest path in the graph from your beginning state to state 1111 1111 1111 1111.

I sat down and attempted my own solution to this problem (based on the help I received in this thread). I'm using a 2D array of booleans, so it isn't as neat as the solutions that use 16-bit integers with bit manipulation.
In any case, here is my solution in Java:
import java.util.*;
class Node
{
    public boolean[][] Value;
    public Node Parent;
    public Node (boolean[][] value, Node parent)
    {
        this.Value = value;
        this.Parent = parent;
    }
}
public class CoinFlip
{
    public static void main(String[] args)
    {
        // true = heads, false = tails
        boolean[][] startState = {{true, false, true, true},
                                  {false, false, false, true},
                                  {true, false, true, false},
                                  {true, true, false, false}};
        List<boolean[][]> solutionPath = search(startState);
        System.out.println("Solution Depth: " + solutionPath.size());
        for(int i = 0; i < solutionPath.size(); i++)
        {
            System.out.println("Transition " + (i+1) + ":");
            print2DArray(solutionPath.get(i));
        }
    }
    public static List<boolean[][]> search(boolean[][] startState)
    {
        Queue<Node> Open = new LinkedList<Node>();
        // Boards that have already been queued, keyed by their contents.
        // Node does not override equals(), so Open.contains(child) could
        // never match and the same board would be queued again and again.
        Set<String> Seen = new HashSet<String>();
        Node StartNode = new Node(startState, null);
        Open.add(StartNode);
        Seen.add(Arrays.deepToString(startState));
        while(!Open.isEmpty())
        {
            Node nextState = Open.remove();
            System.out.println("Considering: ");
            print2DArray(nextState.Value);
            if (isComplete(nextState.Value))
            {
                System.out.println("Solution Found!");
                return constructPath(nextState);
            }
            else
            {
                List<Node> children = generateChildren(nextState);
                for(Node child : children)
                {
                    // add() returns false if this board was seen before
                    if (Seen.add(Arrays.deepToString(child.Value)))
                        Open.add(child);
                }
            }
        }
        return new ArrayList<boolean[][]>();
    }
    public static List<boolean[][]> constructPath(Node node)
    {
        // Walk back to the start node (which is not included in the path)
        List<boolean[][]> solutionPath = new ArrayList<boolean[][]>();
        while(node.Parent != null)
        {
            solutionPath.add(node.Value);
            node = node.Parent;
        }
        Collections.reverse(solutionPath);
        return solutionPath;
    }
    public static List<Node> generateChildren(Node parent)
    {
        System.out.println("Generating Children...");
        List<Node> children = new ArrayList<Node>();
        boolean[][] coinState = parent.Value;
        for(int i = 0; i < coinState.length; i++)
        {
            for(int j = 0; j < coinState[i].length; j++)
            {
                // only a coin showing tails may be the centre of a move
                if (!coinState[i][j])
                {
                    boolean[][] child = arrayDeepCopy(coinState);
                    flip(child, i, j);
                    children.add(new Node(child, parent));
                }
            }
        }
        return children;
    }
    public static boolean[][] arrayDeepCopy(boolean[][] original)
    {
        boolean[][] r = new boolean[original.length][original[0].length];
        for(int i=0; i < original.length; i++)
            for (int j=0; j < original[0].length; j++)
                r[i][j] = original[i][j];
        return r;
    }
    public static void flip(boolean[][] grid, int i, int j)
    {
        //System.out.println("Flip("+i+","+j+")");
        // if (i,j) is on the grid (the caller only passes coins showing tails)
        if ((i >= 0 && i < grid.length) && (j >= 0 && j < grid[i].length))
        {
            // flip (i,j)
            grid[i][j] = !grid[i][j];
            // flip 1 down (i is the row index)
            if (i+1 >= 0 && i+1 < grid.length) grid[i+1][j] = !grid[i+1][j];
            // flip 1 to the right (j is the column index)
            if (j+1 >= 0 && j+1 < grid[i].length) grid[i][j+1] = !grid[i][j+1];
            // flip 1 up
            if (i-1 >= 0 && i-1 < grid.length) grid[i-1][j] = !grid[i-1][j];
            // flip 1 to the left
            if (j-1 >= 0 && j-1 < grid[i].length) grid[i][j-1] = !grid[i][j-1];
        }
    }
    public static boolean isComplete(boolean[][] coins)
    {
        // complete when every coin shows heads (true)
        boolean complete = true;
        for(int i = 0; i < coins.length; i++)
        {
            for(int j = 0; j < coins[i].length; j++)
            {
                if (coins[i][j] == false) complete = false;
            }
        }
        return complete;
    }
    public static void print2DArray(boolean[][] array)
    {
        for (int row=0; row < array.length; row++)
        {
            for (int col=0; col < array[row].length; col++)
            {
                System.out.print((array[row][col] ? "H" : "T") + " ");
            }
            System.out.println();
        }
    }
}
