Running a function multiple times and tracking results of the fight simulation - ruby

I've made a function to run a fight simulation. It's got a random element, so I'd like to run it 100 times to check the results.
I've learnt that Ruby can't have functions inside functions.
$p1_skill = 10
$p1_health = 10
$p2_skill = 10
$p2_health = 10

def hp_check
  if $p2_health >= 1 && $p1_health == 0
    return "p2_wins"
  elsif $p1_health >= 1 && $p2_health == 0
    return "p1_wins"
  else
    battle
  end
end

def battle
  p1_fight = $p1_skill + rand(2..12)
  p2_fight = $p2_skill + rand(2..12)
  if p1_fight > p2_fight
    $p2_health -= 2
    hp_check
  elsif p2_fight > p1_fight
    $p1_health -= 2
    hp_check
  else
    battle
  end
end

battle
Right now this accurately produces a winner. It rolls two dice and adds them to a player's skill. If it's higher than the other player's, the other player loses 2 health.
The skills and HP of the players will change throughout the game; this is for a project assignment.
I'd like this to produce win-chance odds, for balancing purposes.

I have several suggestions regarding your implementation. Note that since this is homework I'm providing the answer in pieces rather than just giving you an entire program. In no particular order...
Don't use global variables. I suspect this is the major hurdle you're running into with trying to achieve multiple runs of your model. The model state should be contained within the model methods, and initial state can be passed to it as arguments. Example:
def battle(p1_skill, p1_health, p2_skill, p2_health)
Unless your instructor has mandated that you use recursion, a simple loop structure will serve you much better. There's no need to check who won until one player or the other drops down to zero (or lower). There's also no need for an else to recursively call battle; the loop will iterate to the next round of the fight if both are still in the running, even if neither player took a hit.
while p1_health > 0 && p2_health > 0
  # roll the dice and update health
end
# check who won and return that answer
hp_check really isn't needed; once you lose the recursive calls it becomes a one-liner, as long as you perform the check after breaking out of the loop. Also, it would be more useful to return just the winner, so whoever gets that return value can decide whether they want to print it, use it to update a tally, both, or something else entirely. After you break out of the loop outlined above:
# determine which player won, since somebody's health dropped to 0 or less
p1_health > 0 ? 1 : 2
When you're incrementing or decrementing a quantity, don't do equality testing. p1_health <= 0 is much safer than p1_health == 0, because some day you or somebody else is going to start from an odd number while decrementing by 2's, or decrement by some other (random?) amount.
Generating a number uniformly between 2 and 12 is not the same as summing two 6-sided dice. There are 36 possible outcomes for the two dice. Only one of the 36 yields a 2, only one yields a 12, and at the other extreme, there are six ways to get a sum of 7. I created a little die-roll method which takes the number of dice as an argument:
def roll_dice(n)
  n.times.inject(0) { |total, _| total + rand(1..6) }
end
so, for example, determining player 1's fight score becomes p1_fight = p1_skill + roll_dice(2).
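If you want to see the difference for yourself, a quick throwaway tally (just a sketch for checking the two distributions, not part of the model) makes it obvious:
tally_sum = Hash.new(0)
tally_flat = Hash.new(0)
100_000.times do
  tally_sum[roll_dice(2)] += 1
  tally_flat[rand(2..12)] += 1
end
p tally_sum.sort.to_h   # peaks at 7, roughly 1 roll in 6
p tally_flat.sort.to_h  # roughly flat, about 1 roll in 11 per value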
After making these sorts of changes, tallying up the statistics is pretty straightforward:
n = 10000
number_of_p1_wins = 0
n.times { number_of_p1_wins += 1 if battle(10, 10, 10, 10) == 1 }
proportion = number_of_p1_wins.to_f / n
puts "p1 won #{"%5.2f" % (100.0 * proportion)}% of the time"
If you replace the constant 10's in the call to battle by getting user input or iterating over ranges, you can explore a rich set of other scenarios.


Why does n-of show a discontinuity when the size of the reported agentset goes from 2 to 3?

The n-of reporter is one of those reporters that make random choices, so we know that if we use the same random-seed we will always get the same agentset out of n-of.
n-of takes two arguments: size and agentset (it can also take lists, but more on this later). I would expect it to work by drawing a pseudo-random number, using this number to choose an agent from agentset, and repeating this process size times.
If this is true we would expect that, if we test n-of on the same agentset using the same random-seed, but each time increase size by 1, every resulting agentset will be the same as the previous extraction plus one further agent. After all, the sequence of pseudo-random numbers used to pick the first (size - 1) agents is the same as before.
This generally seems to be confirmed. The code below highlights the same patches plus one further patch every time size is increased, as shown by the pictures:
to highlight-patches [n]
  clear-all
  random-seed 123
  resize-world -6 6 -6 6
  ask n-of n patches [
    set pcolor yellow
  ]
  ask patch 0 0 [
    set plabel word "n = " n
  ]
end
But there is an exception: the same does not happen when size goes from 2 to 3. As shown by the pictures below, n-of seems to follow the usual behaviour when starting from a size of 1, but the agentset suddenly changes when size reaches 3 (becoming the agentset of the figures above - which, as far as I can tell, does not change anymore):
What is going on behind the scenes of n-of that causes this change at this seemingly inexplicable threshold?
In particular, this seems to be the case only for n-of. In fact, using a combination of repeat and one-of doesn't show this discontinuity (or at least as far as I've seen):
to highlight-patches-with-repeat [n]
  clear-all
  random-seed 123
  resize-world -6 6 -6 6
  repeat n [
    ask one-of patches [
      set pcolor yellow
    ]
  ]
  ask patch 0 0 [
    set plabel word "n = " n
  ]
end
Note that this comparison is not influenced by the fact that n-of guarantees the absence of repetitions while repeat + one-of may have repetitions (in my example above the first repetition happens when size reaches 13). The relevant aspect is simply that the reported agentset of size x is consistent with the reported agentset of size x + 1.
On using n-of on lists instead of agentsets
Doing the same on a list always results in different numbers being extracted, i.e. the additional extraction does not equal the previous extraction plus one further number. This looks counter-intuitive to me, since I would expect the same items to always be extracted from a list if the extraction is based on the same sequence of pseudo-random numbers; but at least it happens consistently, so it does not look as ambiguous to me as the agentset case.
So let's find out together how this works. Let's start by checking the primitive implementation itself. It lives here. Here is the relevant bit, with error handling and comments chopped out for brevity:
if (obj instanceof LogoList) {
  LogoList list = (LogoList) obj;
  if (n == list.size()) {
    return list;
  }
  return list.randomSubset(n, context.job.random);
} else if (obj instanceof AgentSet) {
  AgentSet agents = (AgentSet) obj;
  int count = agents.count();
  return agents.randomSubset(n, count, context.job.random);
}
So we need to investigate the implementations of randomSubset() for lists and agentsets. I'll start with agentsets.
The implementation lives here. And the relevant bits:
val array: Array[Agent] =
  resultSize match {
    case 0 =>
      Array()
    case 1 =>
      Array(randomOne(precomputedCount, rng.nextInt(precomputedCount)))
    case 2 =>
      val (smallRan, bigRan) = {
        val r1 = rng.nextInt(precomputedCount)
        val r2 = rng.nextInt(precomputedCount - 1)
        if (r2 >= r1) (r1, r2 + 1) else (r2, r1)
      }
      randomTwo(precomputedCount, smallRan, bigRan)
    case _ =>
      randomSubsetGeneral(resultSize, precomputedCount, rng)
  }
So there you go. We can see that there is a special case when the resultSize is 2. It generates 2 random numbers, and swaps them as needed to make sure they won't "overflow" the possible choices. The comment on the randomTwo() implementation clarifies that this is done as an optimization. There is similarly a special case for 1, but that's just one-of.
Okay, so now let's check lists. It looks like its implementation of randomSubset() lives over here. Here is the snippet:
def randomSubset(n: Int, rng: Random): LogoList = {
  val builder = new VectorBuilder[AnyRef]
  var i = 0
  var j = 0
  while (j < n && i < size) {
    if (rng.nextInt(size - i) < n - j) {
      builder += this(i)
      j += 1
    }
    i += 1
  }
  LogoList.fromVector(builder.result)
}
The code is a little obtuse, but for each element in the list it randomly decides whether or not to add it to the resulting subset. If early items aren't added, the odds for later items go up (to 100% if need be). So changing the overall size of the list changes the numbers that will be generated in the sequence: rng.nextInt(size - i). That explains why you don't see the same items selected in order when using the same seed but a larger list.
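If it helps to see that logic outside of Scala, here is the same selection-sampling scheme as a small Ruby sketch (purely illustrative, not NetLogo's actual code); the argument handed to the RNG at step i is size - i, which is why lists of different sizes consume the random sequence differently even under the same seed:
def random_subset(list, n, rng)
  result = []
  list.each_with_index do |item, i|
    break if result.size == n
    # keep this item with probability (picks still needed) / (items remaining)
    result << item if rng.rand(list.size - i) < n - result.size
  end
  result
end

rng = Random.new(123)
p random_subset((1..10).to_a, 3, rng)   # the draws depend on the list's size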
Elaboration
Okay, so let's elaborate on the n = 2 optimization for agentsets. There are a few things we have to know to explain this:
What does the non-optimized code do?
The non-optimized agentset code looks a lot like the list code I already discussed - it iterates each item in the agentset and randomly decides to add it to the result or not:
val iter = iterator
var i, j = 0
while (j < resultSize) {
  val next = iter.next()
  if (random.nextInt(precomputedCount - i) < resultSize - j) {
    result(j) = next
    j += 1
  }
  i += 1
}
Note that for each item in the agentset this code will perform a couple of arithmetic operations, precomputedCount - i and resultSize - j, as well as the final < comparison, the increments of j and i, and the j < resultSize check for the while loop. It also generates a random number for each checked element (an expensive operation) and calls next() to move our agent iterator forward. If it fills the result set before processing all elements of the agentset it will terminate "early" and save some of the work, but in the worst case it may perform all those operations for every element in the agentset, when it winds up needing the last agent to completely "fill" the results.
What does the optimized code do and why is it better?
So now let's check the optimized n = 2 code:
if (!kind.mortal)
  Array(
    array(smallRandom),
    array(bigRandom))
else {
  val it = iterator
  var i = 0
  // skip to the first random place
  while (i < smallRandom) {
    it.next()
    i += 1
  }
  val first = it.next()
  i += 1
  while (i < bigRandom) {
    it.next()
    i += 1
  }
  val second = it.next()
  Array(first, second)
}
First, the check for kind.mortal at the start is basically checking if this is a patch agentset or not. Patches never die, so it's safe to assume all agents in the agentset are alive and you can just return the agents found in the backing array at the two provided random numbers as the result.
So on to the second bit. Here we have to use the iterator to get the agents from the set, because some of them might be dead (turtles or links). The iterator skips over those for us as we call next() to get the next agent. The work here is just incrementing the index i up through the two desired random numbers, the checks for the while loops, and the next() calls to move the iterator forward. This works because we know smallRandom is smaller than bigRandom - we're just skipping through the agents and plucking out the ones we want.
Compared to the non-optimized version, we've avoided generating many of the random numbers, we avoid having an extra variable to track the result set count, and we avoid the math and less-than check to determine membership in the result set. That's not bad (especially skipping the RNG operations).
What would the impact be? Well, if you have a large agentset, say 1000 agents, and you are picking 2 of them, the odds of accepting any particular agent as the loop scans are small (just 2 in 1000 for the first agent checked, in fact). That means you will run all that code for a long time before getting your 2 resulting agents.
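To put a rough number on "a long time": the general loop only stops once it has accepted its second agent, and a quick back-of-the-envelope simulation (illustrative Ruby, not NetLogo code) suggests that on average this happens about two thirds of the way through a 1000-agent set:
n = 1000
trials = 10_000
# each scanned agent costs one RNG draw plus the bookkeeping described above
average_scanned = trials.times.sum { (0...n).to_a.sample(2).max + 1 } / trials.to_f
puts average_scanned   # roughly 667 agents examined per call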
So why not optimize for n-of 3, or 4, or 5, etc? Well, let's look back at the code to run the optimized version:
case 2 =>
  val (smallRan, bigRan) = {
    val r1 = rng.nextInt(precomputedCount)
    val r2 = rng.nextInt(precomputedCount - 1)
    if (r2 >= r1) (r1, r2 + 1) else (r2, r1)
  }
  randomTwo(precomputedCount, smallRan, bigRan)
That little logic at the end if (r2 >= r1) (r1, r2 + 1) else (r2, r1) makes sure that smallRan < bigRan; that is strictly less than, not equal. That logic gets much more complex when you need to generate 3, 4, or 5+ random numbers. None of them can be the same, and they all have to be in order. There are ways to quickly sort lists of numbers which might work, but generating random numbers without repetition is much harder.
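To see why that two-draw trick works, here it is transcribed into Ruby (a sketch for illustration, not the actual NetLogo source): r1 is uniform over all count slots, r2 is uniform over the remaining count - 1, and bumping r2 up by one whenever r2 >= r1 guarantees two distinct indices in ascending order.
def two_distinct_indices(count, rng)
  r1 = rng.rand(count)       # any of the count positions
  r2 = rng.rand(count - 1)   # one fewer option, leaving room to dodge r1
  r2 >= r1 ? [r1, r2 + 1] : [r2, r1]
end

rng = Random.new(123)
p two_distinct_indices(169, rng)   # e.g. the 13 x 13 patch world above has 169 patches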

How to compute blot exposure in backgammon efficiently

I am trying to implement an algorithm for backgammon similar to td-gammon as described here.
As described in the paper, the initial version of td-gammon used only the raw board encoding in the feature space which created a good playing agent, but to get a world-class agent you need to add some pre-computed features associated with good play. One of the most important features turns out to be the blot exposure.
Blot exposure is defined here as:
For a given blot, the number of rolls out of 36 which would allow the opponent to hit the blot. The total blot exposure is the number of rolls out of 36 which would allow the opponent to hit any blot. Blot exposure depends on: (a) the locations of all enemy men in front of the blot; (b) the number and location of blocking points between the blot and the enemy men and (c) the number of enemy men on the bar, and the rolls which allow them to re-enter the board, since men on the bar must re-enter before blots can be hit.
I have tried various approaches to compute this feature efficiently but my computation is still too slow and I am not sure how to speed it up.
Keep in mind that the td-gammon approach evaluates every possible board position for a given dice roll, so each turn, for every player's dice roll, you would need to calculate this feature for every possible board position.
Some rough numbers: assuming there are approximately 30 board positions per turn and an average game lasts 50 turns, running 1,000,000 game simulations takes (x * 30 * 50 * 1,000,000) / (1000 * 60 * 60 * 24) days, where x is the number of milliseconds to compute the feature. Putting x = 0.7 gives about 1.05 * 10^9 ms, i.e. approximately 12 days to simulate 1,000,000 games.
I don't really know if that's reasonable timing but I feel there must be a significantly faster approach.
So here's what I've tried:
Approach 1 (By dice roll)
For every one of the 21 possible dice rolls, recursively check to see if a hit occurs. Here's the main workhorse for this procedure:
private bool HitBlot(int[] dieValues, Checker.Color checkerColor, ref int depth)
{
    Moves legalMovesOfDie = new Moves();
    if (depth < dieValues.Length)
    {
        legalMovesOfDie = LegalMovesOfDie(dieValues[depth], checkerColor);
    }
    if (depth == dieValues.Length || legalMovesOfDie.Count == 0)
    {
        return false;
    }
    bool hitBlot = false;
    foreach (Move m in legalMovesOfDie.List)
    {
        if (m.HitChecker == true)
        {
            return true;
        }
        board.ApplyMove(m);
        depth++;
        hitBlot = HitBlot(dieValues, checkerColor, ref depth);
        board.UnapplyMove(m);
        depth--;
        if (hitBlot == true)
        {
            break;
        }
    }
    return hitBlot;
}
What this function does is take as input an array of dice values (e.g. if the player rolls 1,1 the array would be [1,1,1,1]). The function then recursively checks to see if there is a hit and, if so, exits with true. The function LegalMovesOfDie computes the legal moves for that particular die value.
Approach 2 (By blot)
With this approach I first find all the blots, and then for each blot I loop through every possible dice value and see if a hit occurs. The function is optimized so that once a dice value registers a hit I don't use it again for the next blot. It is also optimized to only consider moves that are in front of the blot. My code:
public int BlotExposure2(Checker.Color checkerColor)
{
    if (DegreeOfContact() == 0 || CountBlots(checkerColor) == 0)
    {
        return 0;
    }
    List<Dice> unusedDice = Dice.GetAllDice();
    List<int> blotPositions = BlotPositions(checkerColor);
    int count = 0;
    for (int i = 0; i < blotPositions.Count; i++)
    {
        int blotPosition = blotPositions[i];
        for (int j = unusedDice.Count - 1; j >= 0; j--)
        {
            Dice dice = unusedDice[j];
            Transitions transitions = new Transitions(this, dice);
            bool hitBlot = transitions.HitBlot2(checkerColor, blotPosition);
            if (hitBlot == true)
            {
                unusedDice.Remove(dice);
                if (dice.ValuesEqual())
                {
                    count = count + 1;
                }
                else
                {
                    count = count + 2;
                }
            }
        }
    }
    return count;
}
The method transitions.HitBlot2 takes a blotPosition parameter which ensures that the only moves considered are those in front of the blot.
Both of these implementations were very slow and when I used a profiler I discovered that the recursion was the cause, so I then tried refactoring these as follows:
To use for loops instead of recursion (ugly code but it's much faster)
To use parallel.foreach so that instead of checking 1 dice value at a time I check these in parallel.
Here are the average timing results of my runs for 50,000 computations of the feature (note the timings for each approach were done on the same data):
Approach 1 using recursion: 2.28 ms per computation
Approach 2 using recursion: 1.1 ms per computation
Approach 1 using for loops: 1.02 ms per computation
Approach 2 using for loops: 0.57 ms per computation
Approach 1 using parallel.foreach: 0.75 ms per computation
Approach 2 using parallel.foreach: 0.75 ms per computation
I've found the timings to be quite volatile (maybe dependent on the random initialization of the neural network weights), but around 0.7 ms seems achievable, which, if you recall, leads to 12 days of training for 1,000,000 games.
My questions are: Does anyone know if this is reasonable? Is there a faster algorithm I am not aware of that can reduce training time?
One last piece of info: I'm running on a fairly new machine. Intel Core(TM) i7-5500U CPU @ 2.40 GHz.
Any more info required please let me know and I will provide.
Thanks,
Ofir
Yes, calculating these features makes for really hairy code. Look at the GNU Backgammon code: find eval.c and look at lines 1008 to 1267. Yes, it's 260 lines of code. That code calculates the number of rolls that hit at least one checker, and also the number of rolls that hit at least 2 checkers. As you can see, the code is hairy.
If you find a better way to calculate this, please post your results. To improve I think you have to look at the board representation. Can you represent the board in a different way that makes this calculation faster?

Trying to understand binary search

I'm trying to understand this code my pairing partner wrote. I don't understand why she used an until loop that runs until (finish - start) == 1. What exactly is she looping until?
def binary_search(object, array)
  array.sort!
  start = -1
  finish = array.length
  until (finish - start) == 1 do
    median = start + ((finish - start) / 2)
    # p start
    # p finish
    return median if object == array[median]
    if object > array[median]
      start = median
    elsif object < array[median]
      finish = median
    end
  end
  -1
end
finish - start is the length of the window left to search (+ 1, for easier arithmetic); it starts off as the entire array and gets halved on every iteration, by setting either the start or the finish to the median.
When it reaches 1, there is nothing left to search, and the input object was not found.
Think about how kids play the "guess a number between 1 and 100" game. "Is it bigger than 50?" "No." You now know it's a number between 1 and 50. "Is it bigger than 25?" "Yes." You now know it's between 26 and 50. And so on...
It's the same with binary search. You check to see if the target is above or below the midrange. Whichever way the answer turns out, you've eliminated half of the possibilities and can focus on the remaining subset. Every time you repeat the process, you cut the range that's still under consideration in half. When the range gets down to size one, you've either found the target value or established it wasn't in the set.
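A concrete trace may help (using a made-up, already-sorted array; note that the returned index refers to the sorted array, since the method sorts in place):
# target 12 in [2, 5, 8, 12, 16]:
#   start = -1, finish = 5  ->  median = 2, array[2] = 8  < 12  ->  start = 2
#   start =  2, finish = 5  ->  median = 3, array[3] = 12 == 12  ->  return 3
p binary_search(12, [2, 5, 8, 12, 16])   # => 3
# a missing target keeps halving the window until finish - start == 1:
p binary_search(6, [2, 5, 8, 12, 16])    # => -1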

Getting the least common denominator of two decimals

I am currently working on a text-based web game, wherein I simulate the battle sequences automatically, like MyBrute and Pockie Ninja.
So this is the situation.
We have 2 players with different attack speeds.
Attack speed determines the number of seconds needed for a player to start attacking.
(Easy example) Let's assume Player 1 has 6s and Player 2 has 3s.
This means Player 2 will attack twice before Player 1 does.
(That's because if two players tie on an attack turn, the one with the better attack speed goes first.)
(But if they have the same attack speed, the player who attacked least recently goes first.)
Now my problem is in the loop.
I'd like to determine whose turn it is with the minimum number of loop iterations.
For our easy example we could just create an infinite loop with a counter that increments by 3 to determine whose turn it's going to be, and check on every iteration whether we have a winner and exit the loop. (This is my algorithm; you can suggest a better one.)
The big problem for me is when I have decimal values for attack speed.
Realistic example (assume that I only use 1 decimal digit):
Player 1 attack speed = 5.7
Player 2 attack speed = 6.6
At worst we could use 0.1 as the LCD and as the subtrahend per loop, but I want to determine the best subtrahend (LCD) value.
Hope it makes sense.
Thank you. I appreciate you sharing your great minds.
UPDATE
// THIS IS NOT THE ACTUAL CODE, BUT THIS IS THE LOGIC
decimal Player1Turn = Player1.attackspeed;
decimal Player2Turn = Player2.attackspeed;
decimal LCD = GetLCD(Player1.attackspeed, Player2.attackspeed); // THIS IS WHAT I WANT TO DETERMINE
while (Player1.HP > 0 && Player2.HP > 0)
{
    Player1Turn -= LCD;
    Player2Turn -= LCD;
    if (Player1Turn <= 0)
    {
        // DO STUFF
        Player1Turn = Player1.attackspeed;
    }
    if (Player2Turn <= 0)
    {
        // DO STUFF
        Player2Turn = Player2.attackspeed;
    }
}
We can use a function like:
public decimal GetLCD(decimal num1, decimal num2)
{
    // returns the LCD
}
The following code processes the battle sequence without using the lowest common denominator. It will also run about a million times faster than any attempt based on the lowest common denominator when the players' attack speeds are, e.g., 1000 and 1000.001 respectively.
decimal time = 0;
while (player1.HP > 0 && player2.HP > 0) {
    decimal player1remainingtime = player1.attackspeed - (time % player1.attackspeed);
    decimal player2remainingtime = player2.attackspeed - (time % player2.attackspeed);
    time += Math.Min(player1remainingtime, player2remainingtime);
    if (player1remainingtime < player2remainingtime) {
        //it is player 1 turn; do stuff;
    } else if (player1remainingtime > player2remainingtime) {
        //it is player 2 turn; do stuff;
    } else {
        //both player turns now
        if (player1.attackspeed < player2.attackspeed) {
            //player 1 is faster, its player 1 turn; do stuff
            //now do stuff for player 2
        } else {
            //player 2 is faster, its player 2 turn; do stuff
            //now do stuff for player 1
        }
    }
}
If you are using an object-oriented language, then you can do this:
Players will be objects of type Player and there will be a Timer object.
The Timer will use the Observer design pattern.
Players register themselves with the Timer along with their response time.
When their time is due, they are notified so they can take action.
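As an illustration only (sketched in Ruby rather than your language, with made-up class and method names), the timer-as-subject idea could look something like this; each tick jumps straight to whichever registered player is due next, breaking ties in favour of the faster attack speed:
Player = Struct.new(:name, :attackspeed) do
  def take_action
    puts "#{name} attacks"
  end
end

class BattleTimer
  def initialize
    @now = 0.0
    @entries = []   # [next_due_time, attackspeed, player]
  end

  def register(player)
    @entries << [@now + player.attackspeed, player.attackspeed, player]
  end

  def tick
    @entries.sort_by! { |due, speed, _| [due, speed] }   # earliest due first, faster speed wins ties
    due, speed, player = @entries.shift
    @now = due
    player.take_action   # notify the observer that its turn has come
    @entries << [@now + speed, speed, player]
  end
end

timer = BattleTimer.new
timer.register(Player.new("Player 1", 5.7))
timer.register(Player.new("Player 2", 6.6))
6.times { timer.tick }
Notice there is no LCD anywhere: like the loop above, the timer advances directly to the next due moment instead of stepping by a common subtrahend.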

Ruby - Picking an element in an array with 50% chance for a[0], 25% chance for a[1]

Nothing too complicated; basically I just want to pick an element from the array as if I were making coin tosses for each index and choosing the index where I first get a head. Also, no heads means I choose the last bin.
I came up with the following and was wondering if there was a better/more efficient way of doing this.
def coin_toss(size)
  random_number = rand(2**size)
  if random_number == 0
    return size - 1
  else
    return (0..size-1).detect { |n| random_number[n] == 1 }
  end
end
First guess... pick a random number between 1 and 2**size - 1, find the log base 2 of that, and pick the element that many places from the end.
Forgive my horrible ruby skillz.
return a[-((Math.log(rand(2**size-1)+1) / Math.log(2)).floor) - 1]
if rand returns 0, the last element should be chosen. 1 or 2, the next to last. 3, 4, 5, or 6, third from the end. Etc. Assuming an even distribution of random numbers, each element has twice as much chance of being picked as the one after it.
Edit: Actually, it seems there's a log2 function, so we don't have to do the log/log(2) thing.
return a[-(Math.log2(rand(2**size - 1)+1).floor) - 1]
You may be able to get rid of those log calls altogether like
return a[-((rand(2**size-1)+1).to_s(2).length)]
But you're creating an extra String. Not sure whether that's better than complicated math. :)
Edit: Actually, if you're going to go the string route, you can get rid of the +1 and -1 altogether. It'd make the probabilities more accurate, as the last two elements should have an equal chance of being chosen. (If the next-to-last value isn't chosen, the last value would always be.)
Edit: We could also turn the ** into a bit shift, which should be a little faster (unless Ruby was smart enough to do that already).
return a[-(rand(1<<size).to_s(2).length)]
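As a quick sanity check (just a sketch, assuming an array a of four elements), tallying a large number of draws shows the intended 1/2, 1/4, 1/8, 1/8 split, with the last two elements sharing the same odds:
a = %w[first second third last]
size = a.size
tally = Hash.new(0)
100_000.times { tally[a[-(rand(1 << size).to_s(2).length)]] += 1 }
p tally   # roughly 50% / 25% / 12.5% / 12.5%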
A non-smart, simple way is:
def coin_toss(arr)
  arr.detect { rand(2) == 0 } || arr.last
end
