Solving a system of equations to find expected residence time of a Markov Chain

I have been told that in order to calculate the expected residence time for a set of states I can use the following approach:
1. Construct a Markov chain matrix in which entry (i, j) is the probability of a transition from state i to state j.
2. Transpose the matrix, so that each row contains the inbound probabilities for that state.
3. Invert the diagonal, so that a diagonal value p becomes (1-p).
4. Add a row of 1's at the bottom.
5. Construct a coefficient vector of 0's with the last element set to 1.
6. Solve the system. The resulting vector should contain the expected residence time for each state.
Let me give an example:
I have the initial Markov Chain:
0.25 ; 0.25 ; 0.25 ; 0.25
0.00 ; 0.50 ; 0.50 ; 0.00
0.33 ; 0.33 ; 0.33 ; 0.00
0.00 ; 0.00 ; 0.50 ; 0.50
After steps 1-3 it looks like this:
0.75 ; 0.00 ; 0.33 ; 0.00
0.25 ; 0.50 ; 0.33 ; 0.00
0.25 ; 0.50 ; 0.67 ; 0.50
0.25 ; 0.00 ; 0.00 ; 0.50
I add the last line:
0.75 ; 0.00 ; 0.33 ; 0.00
0.25 ; 0.50 ; 0.33 ; 0.00
0.25 ; 0.50 ; 0.67 ; 0.50
0.25 ; 0.00 ; 0.00 ; 0.50
1.00 ; 1.00 ; 1.00 ; 1.00
The coefficient will be the following vector:
0 ; 0 ; 0 ; 0 ; 1
The added line of 1's should enforce, that the solution sums to 1. However, my solution is the set:
{0.42; 0.84; -0.79; 0.32}
Which sums to 0.79, so clearly something is wrong.
I also note, that the expected residence time of state 3 is negative, which in my mind should not be possible.
I have it implemented in Java and I use Commons.Math to handle the matrix calculations. I have tried the various algorithms described in the documentation, but I get the same result.
I have also tried to substitute one of the rows with the line of 1's in order to make the matrix square. When I do that, I get the following set of solutions:
{0.79; 0.79; -1.79; 1.2}
Even though the probabilities sum to 1 they must still be wrong as they should be in the range 0..1 AND sum to 1.
Is this an entirely wrong approach to the problem? Where am I off?
Unfortunately I am not very mathematical, but I hope I have given enough information for you to see the problem.

I found the answer:
In step 3, besides turning each diagonal value p into (1-p), every off-diagonal probability p must be negated so that it becomes -p:
0.75 ; -0.00 ; -0.33 ; -0.00
-0.25 ; 0.50 ; -0.33 ; -0.00
-0.25 ; -0.50 ; 0.67 ; -0.50
-0.25 ; -0.00 ; -0.00 ; 0.50
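As a cross-check, the corrected system can be solved directly. The following is a minimal Python 3 sketch, not the Commons Math code from the question. It assumes that "expected residence time" means the stationary distribution (the long-run fraction of time spent in each state), that 0.33 stands for 1/3, and that the chain is irreducible so that one of the balance equations is redundant and can be replaced by the row of 1's. The function name stationary_distribution and the hand-written elimination are illustrative choices.

```python
from fractions import Fraction as F

def stationary_distribution(P):
    """Solve (I - P^T) x = 0 together with sum(x) = 1 using exact fractions."""
    n = len(P)
    # Row i of (I - P^T): diagonal entries are 1 - p, off-diagonal entries are -p.
    A = [[(F(1) if i == j else F(0)) - P[j][i] for j in range(n)] for i in range(n)]
    b = [F(0)] * n
    # Assumption: the chain is irreducible, so exactly one balance equation is
    # redundant. Replace the last one with the normalisation row of 1's.
    A[-1] = [F(1)] * n
    b[-1] = F(1)
    # Gauss-Jordan elimination on the augmented matrix [A | b].
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# The transition matrix from the question, with 0.33 read as 1/3.
P = [
    [F(1, 4), F(1, 4), F(1, 4), F(1, 4)],
    [F(0), F(1, 2), F(1, 2), F(0)],
    [F(1, 3), F(1, 3), F(1, 3), F(0)],
    [F(0), F(0), F(1, 2), F(1, 2)],
]
solution = stationary_distribution(P)
print([str(x) for x in solution])
```

With the off-diagonal signs corrected, every component is non-negative and the components sum to 1, which is the sanity check the original attempt failed.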

Related

Julia pmap speed - parallel processing - dynamic programming

I am trying to speed up filling in a matrix for a dynamic programming problem in Julia (v0.6.0), and I can't seem to get much extra speed from using pmap. This is related to this question I posted almost a year ago: Filling a matrix using parallel processing in Julia. I was able to speed up serial processing with some great help then, and I'm now trying to get extra speed from parallel processing tools in Julia.
For the serial processing case, I was using a 3-dimensional matrix (essentially a set of equally-sized matrices, indexed by the 1st-dimension) and iterating over the 1st-dimension. I wanted to give pmap a try, though, to more efficiently iterate over the set of matrices.
Here is the code setup. To use pmap with the v_iter function below, I converted the three dimensional matrix into a dictionary object, with the dictionary keys equal to the index values in the 1st dimension (v_dict in the code below, with gcc equal to the 1st-dimension size). The v_iter function takes other dictionary objects (E_opt_dict and gridpoint_m_dict below) as additional inputs:
function v_iter(a,b,c)
    diff_v = 1
    while diff_v>convcrit
        diff_v = -Inf
        #These lines efficiently multiply the value function by the Markov transition matrix, using the A_mul_B function
        exp_v = zeros(Float64,gkpc,1)
        A_mul_B!(exp_v,a[1:gkpc,:],Zprob[1,:])
        for j=2:gz
            temp=Array{Float64}(gkpc,1)
            A_mul_B!(temp,a[(j-1)*gkpc+1:(j-1)*gkpc+gkpc,:],Zprob[j,:])
            exp_v=hcat(exp_v,temp)
        end
        #This tries to find the optimal value of v
        for h=1:gm
            for j=1:gz
                oldv = a[h,j]
                newv = (1-tau)*b[h,j]+beta*exp_v[c[h,j],j]
                a[h,j] = newv
                diff_v = max(diff_v, oldv-newv, newv-oldv)
            end
        end
    end
end
gz = 9
gp = 13
gk = 17
gcc = 5
gm = gk * gp * gcc * gz
gkpc = gk * gp * gcc
gkp = gk*gp
beta = ((1+0.015)^(-1))
tau = 0.35
Zprob = [0.43 0.38 0.15 0.03 0.00 0.00 0.00 0.00 0.00; 0.05 0.47 0.35 0.11 0.02 0.00 0.00 0.00 0.00; 0.01 0.10 0.50 0.30 0.08 0.01 0.00 0.00 0.00; 0.00 0.02 0.15 0.51 0.26 0.06 0.01 0.00 0.00; 0.00 0.00 0.03 0.21 0.52 0.21 0.03 0.00 0.00 ; 0.00 0.00 0.01 0.06 0.26 0.51 0.15 0.02 0.00 ; 0.00 0.00 0.00 0.01 0.08 0.30 0.50 0.10 0.01 ; 0.00 0.00 0.00 0.00 0.02 0.11 0.35 0.47 0.05; 0.00 0.00 0.00 0.00 0.00 0.03 0.15 0.38 0.43]
convcrit = 0.001 # chosen convergence criterion
E_opt = Array{Float64}(gcc,gm,gz)
fill!(E_opt,10.0)
gridpoint_m = Array{Int64}(gcc,gm,gz)
fill!(gridpoint_m,fld(gkp,2))
v_dict=Dict(i => zeros(Float64,gm,gz) for i=1:gcc)
E_opt_dict=Dict(i => E_opt[i,:,:] for i=1:gcc)
gridpoint_m_dict=Dict(i => gridpoint_m[i,:,:] for i=1:gcc)
For parallel processing, I executed the following two commands:
wp = CachingPool(workers())
addprocs(3)
pmap(wp,v_iter,values(v_dict),values(E_opt_dict),values(gridpoint_m_dict))
...which produced this performance:
135.626417 seconds (3.29 G allocations: 57.152 GiB, 3.74% gc time)
I then tried to serial process instead:
for i=1:gcc
v_iter(v_dict[i],E_opt_dict[i],gridpoint_m_dict[i])
end
...and received better performance.
128.263852 seconds (3.29 G allocations: 57.101 GiB, 4.53% gc time)
This also gives me about the same performance as running v_iter on the original 3-dimensional objects:
v=zeros(Float64,gcc,gm,gz)
for i=1:gcc
v_iter(v[i,:,:],E_opt[i,:,:],gridpoint_m[i,:,:])
end
I know that parallel processing involves setup time, but when I increase the value of gcc, I still get about equal processing time for serial and parallel. This seems like a good candidate for parallel processing, since there is no need for messaging between the workers! But I can't seem to make it work efficiently.

You create the CachingPool before adding the worker processes. Hence the caching pool passed to pmap tells it to use just a single worker.
You can check this by inspecting wp.workers; you will see something like Set([1]).
Hence it should be:
addprocs(3)
wp = CachingPool(workers())
You could also start Julia with the -p command line parameter, e.g. julia -p 3, and then skip the addprocs(3) command.
On top of that, your for and pmap loops are not equivalent. The Julia Dict object is a hash map and, as in other languages, does not guarantee any element order. Hence in your for loop you are guaranteed to get the matching i-th elements, while with values the ordering does not need to match the original ordering (and you can get a different order for each of the three variables in the pmap call).
Since the keys of your Dicts are just the numbers from 1 to gcc, you should simply use arrays instead. You can use generators, much like in Python. For example, instead of
v_dict=Dict(i => zeros(Float64,gm,gz) for i=1:gcc)
use
v_dict_a = [zeros(Float64,gm,gz) for i=1:gcc]
Hope that helps.

Based on @Przemyslaw Szufeul's helpful advice, I've placed below the code that properly executes parallel processing. After running it once, I achieved a substantial improvement in running time:
77.728264 seconds (181.20 k allocations: 12.548 MiB)
In addition to reordering the wp command and using the generator Przemyslaw recommended, I also recast v_iter as an anonymous function, in order to avoid having to sprinkle @everywhere around the code to feed functions and data to the workers.
I also added return a to the v_iter function, and set v_a below equal to the output of pmap, since you cannot pass by reference to a remote object.
addprocs(3)
v_iter = function(a,b,c)
    diff_v = 1
    while diff_v>convcrit
        diff_v = -Inf
        #These lines efficiently multiply the value function by the Markov transition matrix, using the A_mul_B function
        exp_v = zeros(Float64,gkpc,1)
        A_mul_B!(exp_v,a[1:gkpc,:],Zprob[1,:])
        for j=2:gz
            temp=Array{Float64}(gkpc,1)
            A_mul_B!(temp,a[(j-1)*gkpc+1:(j-1)*gkpc+gkpc,:],Zprob[j,:])
            exp_v=hcat(exp_v,temp)
        end
        #This tries to find the optimal value of v
        for h=1:gm
            for j=1:gz
                oldv = a[h,j]
                newv = (1-tau)*b[h,j]+beta*exp_v[c[h,j],j]
                a[h,j] = newv
                diff_v = max(diff_v, oldv-newv, newv-oldv)
            end
        end
    end
    return a
end
gz = 9
gp = 13
gk = 17
gcc = 5
gm = gk * gp * gcc * gz
gkpc = gk * gp * gcc
gkp =gk*gp
beta = ((1+0.015)^(-1))
tau = 0.35
Zprob = [0.43 0.38 0.15 0.03 0.00 0.00 0.00 0.00 0.00; 0.05 0.47 0.35 0.11 0.02 0.00 0.00 0.00 0.00; 0.01 0.10 0.50 0.30 0.08 0.01 0.00 0.00 0.00; 0.00 0.02 0.15 0.51 0.26 0.06 0.01 0.00 0.00; 0.00 0.00 0.03 0.21 0.52 0.21 0.03 0.00 0.00 ; 0.00 0.00 0.01 0.06 0.26 0.51 0.15 0.02 0.00 ; 0.00 0.00 0.00 0.01 0.08 0.30 0.50 0.10 0.01 ; 0.00 0.00 0.00 0.00 0.02 0.11 0.35 0.47 0.05; 0.00 0.00 0.00 0.00 0.00 0.03 0.15 0.38 0.43]
convcrit = 0.001 # chosen convergence criterion
E_opt = Array{Float64}(gcc,gm,gz)
fill!(E_opt,10.0)
gridpoint_m = Array{Int64}(gcc,gm,gz)
fill!(gridpoint_m,fld(gkp,2))
v_a=[zeros(Float64,gm,gz) for i=1:gcc]
E_opt_a=[E_opt[i,:,:] for i=1:gcc]
gridpoint_m_a=[gridpoint_m[i,:,:] for i=1:gcc]
wp = CachingPool(workers())
v_a = pmap(wp,v_iter,v_a,E_opt_a,gridpoint_m_a)

Algorithm for optimal expected amount in a profit/loss game

I came upon the following question recently,
"You have a box which has G green and B blue coins. Pick a random coin, G gives a profit of +1 and blue a loss of -1. If you play optimally what is the expected profit."
I was thinking of using a brute force algorithm where I consider all possibilities of combinations of green and blue coins but I'm sure there must be a better solution for this (range of B and G was from 0 to 5000). Also what does playing optimally mean? Does it mean that if i pick all blue coins then I would continue playing till all green coins are also picked? If so then this means I shouldn't consider all possibilities of green and blue coins?

The "obvious" answer is to play whenever there are more green coins than blue coins. In fact, this is wrong. For example, if there are 999 green coins and 1000 blue coins, here's a strategy that has a positive expected profit:
Take 2 coins
If GG -- stop with a profit of 2
if BG or GB -- stop with a profit of 0
if BB -- take all the remaining coins for a profit of -1
Since the first and last possibilities both occur with near 25% probability, your overall expectation is approximately 0.25*2 - 0.25*1 = 0.25
This is just a simple strategy in one extreme example that shows that the problem is not as simple as it first seems.
In general, the expectations with g green coins and b blue coins is given by a recurrence relation:
E(g, 0) = g
E(0, b) = 0
E(g, b) = max(0, g(E(g-1, b) + 1)/(b+g) + b(E(g, b-1) - 1)/(b+g))
The max in the final row is there because if playing on has negative expected value, you're better off stopping.
These recurrence relations can be solved using dynamic programming in O(gb) time.
from fractions import Fraction as F
def gb(G, B):
    E = [[F(0, 1)] * (B+1) for _ in xrange(G+1)]
    for g in xrange(G+1):
        E[g][0] = F(g, 1)
    for b in xrange(1, B+1):
        for g in xrange(1, G+1):
            E[g][b] = max(0, (g * (E[g-1][b]+1) + b * (E[g][b-1]-1)) * F(1, (b+g)))
    for row in E:
        for v in row:
            print '%5.2f' % v,
        print
    print
    return E[G][B]
print gb(8, 10)
Output:
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1.00 0.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2.00 1.33 0.67 0.20 0.00 0.00 0.00 0.00 0.00 0.00 0.00
3.00 2.25 1.50 0.85 0.34 0.00 0.00 0.00 0.00 0.00 0.00
4.00 3.20 2.40 1.66 1.00 0.44 0.07 0.00 0.00 0.00 0.00
5.00 4.17 3.33 2.54 1.79 1.12 0.55 0.15 0.00 0.00 0.00
6.00 5.14 4.29 3.45 2.66 1.91 1.23 0.66 0.23 0.00 0.00
7.00 6.12 5.25 4.39 3.56 2.76 2.01 1.34 0.75 0.30 0.00
8.00 7.11 6.22 5.35 4.49 3.66 2.86 2.11 1.43 0.84 0.36
7793/21879
From this you can see that the expectation is positive to play with 8 green and 10 blue coins (EV=7793/21879 ~= 0.36), and you even have positive expectation with 2 green and 3 blue coins (EV=0.2)
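For the sizes mentioned in the question (G and B up to 5000), a full table of exact fractions may be heavier than necessary. The sketch below is one possible Python 3 variant of the same recurrence that keeps only a single row and uses floats. Both choices are assumptions made for illustration rather than part of the answer above, and the name expected_profit is hypothetical.

```python
def expected_profit(G, B):
    """E(G, B) from the recurrence above, keeping one row of the table at a time."""
    # row[g] holds E(g, b) for the current b; start at b = 0, where E(g, 0) = g.
    row = [float(g) for g in range(G + 1)]
    for b in range(1, B + 1):
        row[0] = 0.0  # E(0, b) = 0: with no green coins left, never draw
        for g in range(1, G + 1):
            # row[g - 1] has already been updated to E(g - 1, b);
            # row[g] still holds E(g, b - 1) from the previous pass.
            play = (g * (row[g - 1] + 1) + b * (row[g] - 1)) / (g + b)
            row[g] = max(0.0, play)  # stop when drawing has negative expectation
    return row[G]

print(round(expected_profit(2, 3), 2))
print(round(expected_profit(8, 10), 2))
```

The time is still O(gb), but memory drops to O(g). Floating-point rounding means the results are approximate, unlike the Fraction version.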

Simple and intuitive answer:
You should start off with an estimate of the proportion of blue and green coins. After each pick you update this estimate. If at any point you estimate that there are more blue coins than green coins, you should stop.
Example:
You start and pick a coin. It's green, so you estimate that 100% of the coins are green. You pick a blue, so you estimate that 50% of the coins are green. You pick another blue coin, so you estimate that 33% of the coins are green. At this point it isn't worth playing anymore, according to your estimate, so you stop.

This answer is wrong; see Paul Hankin's answer for counterexamples and a proper analysis. I leave this answer here as a learning example for all of us.
Assuming that your choice is only when to stop picking coins, you continue as long as G > B. That part is simple. If you start with G < B, then you never start drawing, and your gain is 0. For G = B, no strategy will get you a mathematical advantage; the gain there is also 0.
For the expected reward, take this in two steps:
(1) Expected value on any draw sequence. Do this recursively, figuring the chance of getting green or blue on the first draw, and then the expected values for the new state (G-1, B) or (G, B-1). You will quickly see that the expected value of any given draw number (such as all possibilities for the 3rd draw) is the same as the original.
Therefore, your expected value on any draw is e = (G-B) / (G+B). Your overall expected value is e * d, where d is the number of draws you choose.
(2) What is the expected number of draws? How many times do you expect to draw before G = B? I'll leave this as an exercise for the student, but note the previous idea of doing this recursively. You might find it easier to describe the state of the game as (extra, total), where extra = G-B and total = G+B.
Illustrative exercise: given G=4, B=2, what is the chance that you'll draw GG on the first two draws (and then stop the game)? What is the gain from that? How does that compare with the (4-2)/(4+2) advantage on each draw?

Chance a player has a card given set of possible cards per player

In a trick-taking game, it is often easy to keep track of which cards each player can possibly have left. For instance if following suit is mandatory and a player does not follow suit, it is obvious that player does not have any more cards of that particular suit.
This means, during the game you can build up knowledge about which cards each player can possibly have.
Is there a way to efficiently calculate (a reasonably accurate) chance that a specific player actually has a certain card?
A naive way would be to just generate all permutations of all cards left and check which of these permutations are possible given the constraints mentioned earlier. But this is not a really efficient way.
Another approach would be to just check how many others could have a particular card. For instance, if 3 players might have a particular card you could use 1/3 as the chance a particular player has a certain card. But this is often inaccurate.
For instance:
Each player has 2 cards left
Player A can have the AS, KS.
Player B can have the AS, KS, AH, and KH.
Algorithm 1 would correctly find that the chance Player B has the AS is 0.
Algorithm 2 would incorrectly find that the chance Player B has the AS is 0.5.
Is there a better algorithm that would be both reasonably accurate and reasonably fast?
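For reference, the naive approach (Algorithm 1) can be sketched in a few lines of Python 3. This is only an illustration of exact enumeration for small end-game positions, not an efficient answer to the question. The function name card_probabilities and the dictionary-based input format are assumptions made for this example, and every consistent deal is assumed to be equally likely.

```python
from fractions import Fraction
from itertools import combinations

def card_probabilities(possible, hand_size):
    """Enumerate every deal consistent with the constraints.

    possible:  dict mapping player -> set of cards that player may still hold
    hand_size: dict mapping player -> number of cards left in that hand
    Returns a dict player -> card -> exact probability of holding that card.
    """
    players = list(possible)
    all_cards = set().union(*possible.values())
    counts = {p: {c: 0 for c in all_cards} for p in players}
    total = 0

    def deal(index, remaining, hands):
        nonlocal total
        if index == len(players):
            if not remaining:  # every card must end up in some hand
                total += 1
                for player, hand in hands.items():
                    for card in hand:
                        counts[player][card] += 1
            return
        player = players[index]
        candidates = sorted(remaining & possible[player])
        for hand in combinations(candidates, hand_size[player]):
            deal(index + 1, remaining - set(hand), {**hands, player: hand})

    deal(0, all_cards, {})
    if total == 0:
        raise ValueError("no deal satisfies the constraints")
    return {p: {c: Fraction(counts[p][c], total) for c in all_cards}
            for p in players}

# The mini-example from the question.
possible = {"A": {"AS", "KS"}, "B": {"AS", "KS", "AH", "KH"}}
hand_size = {"A": 2, "B": 2}
probabilities = card_probabilities(possible, hand_size)
print(probabilities["B"]["AS"])
```

The number of deals grows combinatorially with the number of unknown cards, which is why this only serves as a correctness baseline for faster approximations.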

Take a page from quantum mechanics. Consider every card to be in a mix of states with probabilities, e.g. x|AS>+y|KS>+z|AH>+w|KH>. For 36 cards, you get a 36 x 36 matrix in which all values are initially equal to 1/36. The constraints are that the values in each row sum to 1 (every card is somewhere) and the values in each column sum to 1 (every card slot holds some card). For your mini-example, the initial matrix would be
0.25 0.25 0.25 0.25 (AS)
0.25 0.25 0.25 0.25 (KS)
0.25 0.25 0.25 0.25 (AH)
0.25 0.25 0.25 0.25 (KH)
(0) (1) (2) (3)
Let A cards be 0, 1 and B cards be 2, 3. Chance of B having AS is 0.5.
Now suppose you observe that P(0 = AH) = 0. You set the corresponding element to 0 and proportionally adjust the other values in its column and row, then all remaining values, so that the sums remain 1:
0.33 0.22 0.22 0.22 (AS)
0.33 0.22 0.22 0.22 (KS)
0.00 0.33 0.33 0.33 (AH)
0.33 0.22 0.22 0.22 (KH)
(0) (1) (2) (3)
Adding observations P(0 = KH) = 0, P(1 = AH) = 0, P(1 = KH) = 0 gets you this matrix:
0.50 0.50 0.00 0.00 (AS)
0.50 0.50 0.00 0.00 (KS)
0.00 0.00 0.50 0.50 (AH)
0.00 0.00 0.50 0.50 (KH)
(0) (1) (2) (3)
As you can see, P(2 = AS or 3 = AS) = 0, as it should be.
Note that most games allow the player to shuffle the cards in his or her hand (i.e. when B plays a card, you don't know if it's (2) or (3)). Suppose A and B exchange cards (1) and (2) - this leaves matrix the same - and then when B shuffles his cards, the matrix becomes
0.50 0.25 0.00 0.25 (AS)
0.50 0.25 0.00 0.25 (KS)
0.00 0.25 0.50 0.25 (AH)
0.00 0.25 0.50 0.25 (KH)
(0) (1) (2) (3)
Also note that the model isn't perfect: it cannot record observations like "B has either (AS, KH) or (AH, KS)". But under some definitions of "reasonably accurate", it is probably good enough.

How to transform a correlation matrix into a single row?

I have a 200x200 correlation matrix text file that I would like to turn into a single row.
e.g.
a b c d e
a 1.00 0.33 0.34 0.26 0.20
b 0.33 1.00 0.40 0.48 0.41
c 0.34 0.40 1.00 0.59 0.35
d 0.26 0.48 0.59 1.00 0.43
e 0.20 0.41 0.35 0.43 1.00
I want to turn it into:
a_b a_c a_d a_e b_c b_d b_e c_d c_e d_e
0.33 0.34 0.26 0.20 0.40 0.48 0.41 0.59 0.35 0.43
I need a code that can:
1. Join the variable names to make a single row of headers (e.g. turn "a" and "b" into "a_b") and
2. Turn only one half of the correlation matrix (bottom or top triangle) into a single row
A bit of extra information: I have around 500 participants in a study and each of them has a correlation matrix file. I want to consolidate these separate data files into one file where each row is one participant's correlation matrix.
Does anyone know how to do this?
Thanks!!
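One possible approach, sketched in Python 3 with only the standard library. It assumes each file is whitespace-delimited, with a header row of variable names followed by one labelled row per variable, as in the example above. The function name flatten_upper_triangle and the EXAMPLE string are illustrative and not taken from any real data file.

```python
EXAMPLE = """
a b c d e
a 1.00 0.33 0.34 0.26 0.20
b 0.33 1.00 0.40 0.48 0.41
c 0.34 0.40 1.00 0.59 0.35
d 0.26 0.48 0.59 1.00 0.43
e 0.20 0.41 0.35 0.43 1.00
"""

def flatten_upper_triangle(text):
    """Return (headers, values) for the upper triangle, read row by row."""
    lines = [line.split() for line in text.strip().splitlines()]
    names = lines[0]
    # Map each row label to its list of correlation strings.
    rows = {parts[0]: parts[1:] for parts in lines[1:]}
    headers, values = [], []
    for i, first in enumerate(names):
        for j in range(i + 1, len(names)):  # strictly above the diagonal
            headers.append(first + "_" + names[j])
            values.append(rows[first][j])   # kept as text to preserve formatting
    return headers, values

headers, values = flatten_upper_triangle(EXAMPLE)
print(" ".join(headers))
print(" ".join(values))
```

To consolidate roughly 500 participants, the same function could be applied to the contents of each file, writing the headers once and then one line of values per participant, provided all files list the variables in the same order.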

Why python implementation of miller-rabin faster than ruby by a lot?

For one of my classes I recently came across both a Ruby and a Python implementation of the Miller-Rabin algorithm, used to count the primes between 20 and 29000. I am curious why, even though they are seemingly the same implementation, the Python code runs so much faster. I have read that Python is typically faster than Ruby, but is this much of a speed difference to be expected?
miller_rabin.rb:
def miller_rabin(m,k)
  t = (m-1)/2;
  s = 1;
  while(t%2==0)
    t/=2
    s+=1
  end
  for r in (0...k)
    b = 0
    b = rand(m) while b==0
    prime = false
    y = (b**t) % m
    if(y ==1)
      prime = true
    end
    for i in (0...s)
      if y == (m-1)
        prime = true
        break
      else
        y = (y*y) % m
      end
    end
    if not prime
      return false
    end
  end
  return true
end
count = 0
for j in (20..29000)
  if(j%2==1 and miller_rabin(j,2))
    count+=1
  end
end
puts count
miller_rabin.py:
import math
import random
def miller_rabin(m, k):
    s=1
    t = (m-1)/2
    while t%2 == 0:
        t /= 2
        s += 1
    for r in range(0,k):
        rand_num = random.randint(1,m-1)
        y = pow(rand_num, t, m)
        prime = False
        if (y == 1):
            prime = True
        for i in range(0,s):
            if (y == m-1):
                prime = True
                break
            else:
                y = (y*y)%m
        if not prime:
            return False
    return True
count = 0
for j in range(20,29001):
    if j%2==1 and miller_rabin(j,2):
        count+=1
print count
When I measure the execution time of each using Measure-Command in Windows Powershell, I get the following:
Python 2.7:
Ticks: 4874403
Total Milliseconds: 487.4403
Ruby 1.9.3:
Ticks: 682232430
Total Milliseconds: 68223.243
I would appreciate any insight anyone can give me into why there is such a huge difference.

In Ruby you are using (a ** b) % c to calculate the modulo of an exponentiation. In Python, you are using the much more efficient three-argument pow call, whose docstring explicitly states:
With three arguments, equivalent to (x**y) % z, but may be more
efficient (e.g. for longs).
Whether you want to count the lack of such a built-in operator against Ruby is a matter of opinion. On the one hand, if Ruby doesn't provide one, you might say that it's that much slower. On the other hand, you're not really testing the same thing algorithmically, so some would say that the comparison is not fair.
A quick search reveals that there are implementations of modular exponentiation for Ruby.
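To illustrate why the three-argument form is so much cheaper, here is a minimal Python 3 sketch of modular exponentiation by repeated squaring. It is not the code behind Python's built-in pow, and the name mod_pow is hypothetical. The point is that every intermediate value is reduced modulo m, so no number ever grows beyond roughly m squared, whereas (b ** t) % m first builds the full power as a huge integer.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus without forming the full power."""
    result = 1 % modulus          # handles modulus == 1 correctly
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current lowest bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus  # square for the next bit
        exponent >>= 1
    return result

# Agrees with the built-in three-argument pow on a range of inputs.
checks = [(b, t, m) for m in range(1, 40) for b in range(0, 40) for t in range(0, 20)]
print(all(mod_pow(b, t, m) == pow(b, t, m) for b, t, m in checks))
```

The loop runs once per bit of the exponent, so the cost is logarithmic in t, and the same structure ports directly to Ruby in place of (b**t) % m.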

I think these profile results should answer your question:
 %self   total    self    wait   child   calls  name
 96.81   43.05   43.05    0.00    0.00   17651  Fixnum#**
  1.98    0.88    0.88    0.00    0.00   17584  Bignum#%
  0.22   44.43    0.10    0.00   44.33   14490  Object#miller_rabin
  0.11    0.05    0.05    0.00    0.00   32142  <Class::Range>#allocate
  0.11    0.06    0.05    0.00    0.02   17658  Kernel#rand
  0.08   44.47    0.04    0.00   44.43   32142  *Range#each
  0.04    0.02    0.02    0.00    0.00   17658  Kernel#respond_to_missing?
  0.00   44.47    0.00    0.00   44.47       1  Kernel#load
  0.00   44.47    0.00    0.00   44.47       2  Global#[No method]
  0.00    0.00    0.00    0.00    0.00       2  IO#write
  0.00    0.00    0.00    0.00    0.00       1  Kernel#puts
  0.00    0.00    0.00    0.00    0.00       1  IO#puts
  0.00    0.00    0.00    0.00    0.00       2  IO#set_encoding
  0.00    0.00    0.00    0.00    0.00       1  Fixnum#to_s
  0.00    0.00    0.00    0.00    0.00       1  Module#method_added
It looks like Ruby's ** operator is slow compared to Python's.
It also looks like (b**t) is often too big to fit in a Fixnum, so you end up using Bignum (arbitrary-precision) arithmetic, which is much slower.
