Weighted random number generation with seed and modifiers - algorithm

I try to implement a generation of a complex object based on a weighted elements list.
ListEnum.kt
enum class Elements(weighting:Int){
ELEM1(15),
ELEM2(20),
ELEM3(7),
ELEM4(18)
// function to get weighted random element
companion object{
fun getRandomElement(seed:Long): Elements{
var totalSum = 0
values().forEach {
totalSum += it.weighting
}
val index = Random(seed).nextInt(totalSum)
var sum = 0
var i = 0
while (sum < index) {
sum += values()[i++].weighting
}
return values()[max(0, i - 1)]
}
}
}
MyClass.kt
class MyClass{
fun getRandomElement():RandomElement{
val seed = Random.nextLong()
val element = Elements.getRandomElement(seed)
return RandomElement(element, seed)
}
}
I can persist the seed and recreate the same object with the same seed.
Now I want to modify the weighting in the Elements enum at runtime.
Elements.kt
enum class Elements(weighting:Int){
ELEM1(15),
ELEM2(20),
ELEM3(7),
ELEM4(18)
// function to get weighted random element
companion object{
fun getRandomElement(seed:Long, modifiers:Mods): Elements{
var totalSum = 0
values().forEach {
var modifiedWeighting =it.weighting
if(modifiers.modifier[it] != null){
modifiedWeighting= modifiers.modifier[it].times(modifiedWeighting).toInt()
}
totalSum += modifiedWeighting
}
val index = Random(seed).nextInt(totalSum)
var sum = 0
var i = 0
while (sum < index) {
var newSum = values()[i].weighting
if(modifiers.modifier[values()[i]] != null){
newSum = newSum.times(modifiers.modifier[values()[i]]).toInt()
}
sum += newSum
i++
}
return values()[max(0, i - 1)]
}
}
}
That works for generating random elements with modified weightings, but now I can't recreate the object because the seed does not consider the modifiers.
How can I generate such an object based on modified weightings that I can recreate just from the seed?
UPDATE:
To clarify by what I mean with modifiers. You can see them as a temporary modification of the weight of a specific element.
Modifiers are just float numbers like 1.0f -> keep weight as it is, 2.0f -> double the weight.
So after applying the modifiers, the element enum would look different. But like you can see, the modification is just temporary inside the calculation method.
One possible solution may be to create a static elements enum for every modification I need (they are limited, three or four), e.g.
ElementsModifiedByX, ElementsModifiedByY.
But that seems quiet a dirty solution to me.

I won't try to write Kotlin, but here is a simple strategy.
Instead of keeping your weights in the enum, move them into a structure like this:
{
start_weight: 0,
end_weight: 60,
choices: [(ELEM1, 15, 15), (ELEM2, 20, 35), (ELEM3, 7, 42), (ELEM4, 18, 60))],
}
And now to add weight, you just add onto the list and adjust the end_weight. To remove weight you move start_weight forward far enough then add in new entries for the weights that you accidentally erased. Your seeds are random numbers in the range from start_weight to end_weight. Making the choice can be done with a binary search. And, because you never erase entries, you can reliably reconstruct the same result every time from the same seed.

Related

Is There Some Stream-Only Way To Determine The Index Of The Max Stream Element?

I have a Stream<Set<Integer>> intSetStream.
I can do this on it...
Set<Integer> theSetWithTheMax = intSetStream.max( (x,y)->{ return Integer.compare( x.size(), y.size() ); } ).get( );
...and I get a hold of the Set<Integer> that has the highest number of Integer elements in it.
That's great. But what I really need to know is, is it the 1st Set in that Stream that's the max? Or is it the 10th Set in the Stream? Or the ith Set? Which one of them has the most elements in it?
So my question is: Is there some way — using the Stream API — that I can determine "It was the ith Set in the Stream of Sets that returned the largest value of them all, for the Set.size( ) call"?
The best solution I can think of, is to iterate over the Stream<Set<Integer>> (using intSetStream.iterator()) and do a hand-rolled max( ) calculation. But I'm hoping to learn a more Stream-y way to go about it; if there is such a thing.
You can do this with a custom collector:
int posOfMax = stream.mapToInt(Set::size)
.collect(() -> new int[] { 0, -1, -1 },
(a,i) -> { int pos = a[0]++; if(i>a[2]) { a[1] = pos; a[2] = i; } },
(a1,a2) -> {
if(a2[2] > a1[2]) { a1[1] = a1[0]+a2[1]; a1[2] = a2[2]; }
a1[0] += a2[0];
})[1];
This is the most lightweight solution. Its logic becomes clearer when we use a dedicated class instead of an array:
int posOfMax = stream.mapToInt(Set::size)
.collect(() -> new Object() { int size = 0, pos = -1, max = -1; },
(o,i) -> { int pos = o.size++; if(i>o.max) { o.pos = pos; o.max = i; } },
(a,b) -> {
if(b.max > a.max) { a.pos = a.size+b.pos; a.max = b.max; }
a.size += b.size;
}).pos;
The state object holds the size, which is simply the number of elements encountered so far, the last encountered max value and its position which we update to the previous value of the size if the current element is bigger than the max value. That’s what the accumulator function (the second argument to collect) does.
In order to support arbitrary evaluation orders, i.e. parallel stream, we have to provide a combiner function (the last argument to collect). It merges the state of two partial evaluation into the first state. If the second state’s max value is bigger, we update the first’s max value and the position, whereas we have to add the first state’s size to the second’s position to reflect the fact that both are partial results. Further, we have to update the size to the sum of both sizes.
One way to do it is to firstly map Stream<Set<Integer>> to a Collection<Integer> where each element is the size of each Set<Integer> and then you can extract what is the largest number of elements given Stream<Set<Integer>> and then get the "index" of this set by finding an index of the largest number in the collection of sizes.
Consider following example:
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class IntSetStreamExample {
public static void main(String[] args) {
final Stream<Set<Integer>> stream = Stream.of(
new HashSet<>(Arrays.asList(1,2,3)),
new HashSet<>(Arrays.asList(1,2)),
new HashSet<>(Arrays.asList(1,2,3,4,5)),
new HashSet<>(Arrays.asList(0)),
new HashSet<>(Arrays.asList(0,1,2,3,4,5)),
new HashSet<>()
);
final List<Integer> result = stream.map(Set::size).collect(Collectors.toList());
System.out.println("List of number of elements in Stream<Set<Integer>>: " + result);
final int max = Collections.max(result);
System.out.println("Largest set contains " + max + " elements");
final int index = result.indexOf(max);
System.out.println("Index of the largest set: " + index);
}
}
The exemplary output may look like this:
List of number of elements in Stream<Set<Integer>>: [3, 2, 5, 1, 6, 0]
Largest set contains 6 elements
Index of the largest set: 4
Streams methods are not designed to be aware of the current element iterated.
So I think that you actual way : find the Set with the max of elements and then iterate on the Sets to find this Set is not bad.
As alternative you could first collect the Stream<Set<Integer>> into a List (to have a way to retrieve the index) and use a SimpleImmutableEntry but it seems really overkill :
Stream<Set<Integer>> intSetStream = ...;
List<Set<Integer>> list = intSetStream.collect(Collectors.toList());
SimpleImmutableEntry<Integer, Set<Integer>> entry =
IntStream.range(0, list.size())
.mapToObj(i -> new SimpleImmutableEntry<>(i, list.get(i)))
.max((x, y) -> {
return Integer.compare(x.getValue()
.size(),
y.getValue()
.size());
})
.get();
Integer index = entry.getKey();
Set<Integer> setWithMaxNbElements = entry.getValue();
Insight provided in #Holzer's custom Collector-based solution (on top of my downright shameless plagiarizing of the source code of IntSummaryStatistics.java), inspired a custom Collector-based solution of my own; that might, in turn, inspire others...
public class IndexOfMaxCollector implements IntConsumer {
private int max = Integer.MIN_VALUE;
private int maxIdx = -1;
private int currIdx = 0;
public void accept( int value ){
if( value > max )
maxIdx = currIdx;
max = Math.max( max, value );
currIdx++;
}
public void combine( IndexOfMaxCollector other ){
if( other.max > max ){
maxIdx = other.maxIdx + currIdx;
max = other.max;
}
currIdx += other.currIdx;
}
public int getMax( ){ return this.max; }
public int getIndexOfMax( ){ return this.maxIdx; }
}
...Using that custom Collector, I could take the intSetStream of my OQ and determine the index of the Set<Integer> that contains the highest number of elements, like this...
int indexOfMax = intSetStream.map( Set::size )
.collect( IndexOfMaxCollector::new,
IndexOfMaxCollector::accept,
IndexOfMaxCollector::combine )
.getIndexOfMax( );
This solution — admittedly not the most "beautiful" — possibly has a teensie bit of an edge over others in both the reusability and understandability stakes.

Number flower pots in an arrangement

It's a Google interview question. There's a list of "T" and "F" only. All denotes a position such that T means position is occupied by a flower pot and F means pot is not there, so you can put another pot at this position. Find the number of pots that can be placed in a given arrangement such that no two pots are adjacent to each other(they can be adjacent in the given arrangement). If a position at the beginning is unoccupied then a pot can be placed if second position is also unoccupied and if the last position is unoccupied than a pot can be placed if second last position is also unoccupied. For ex.
TFFFTFFTFFFFT - returns 2
FFTTFFFFFTTFF - returns 4
I tried solving it by looking at adjacent values for every position with value F. Increased the counter if both adjacent positions were F and set this position as T. I need a better solution or any other solution(if any).
Let's analyse what has to be done.
So first we probably need to visit and examine each place. That suggests loop of some sort. E.g.:
for (int i = 0; i < myPlaces.Length; ++i)
When we are at a spot we have to check if it's occupied
if (place[i] == 'F')
but that's not enough to place the flower pot there. We have to check if the next and previous place is free
place[i-1]
place[i+1]
If all tree contain F you can put the flower pot there and move to next field
Now, we also have some exceptions from the rule. Beginning and end of the list. So you have to deal with them separately. E.g
if (i == 0)
{
// only check current position and next position
}
if (i == myPlaces.Length - 1) // minus 1 because indexing usually starts from 0
{
// only check current position and previous position
}
After that you can perform the checks mentioned previously.
Now let's think of the input data. Generally, it's a good habit not to modify the input data but make a copy and work on the copy. Also some data structures work better than the others for different tasks. Here you can use simple string to keep entry values. But I would say an array of chars would be a better option because then, when you find a place where you can put a flower pot you can actually replace the F with the T in an array. Then when you move to new spot your data structers knows that there is already a pot in the previous position so your algorithm won't put an adjacent one.
You would not be able to do that with string as strings are immutable and you would need to generate a new string each time.
Note that it's only a naive algorithm with a lot of scope for improvement and optimization. But my goal was rather to give some idea how to approach this kind of problems in general. I'll leave implementing of the details to you as an afternoon exercise before targeting a job at Google.
You may be able to do this with a modified Mergesort. Consider the flowerpots that can be placed in the singletons, then the flowerpots that can be placed in the doubleton merges of those singletons, up the tree to the full arrangement. It would complete in O(n lg n) for a list of n flowerpots.
There is certainly a way to do this with a modified Rod Cutting algorithm with complexity O(n^2). The subproblem is whether or not an open "false set" exists in the substring being considered. The "closed false sets" already have some maximum value computed for them. So, when a new character is added, it either increases the amount of flowerpots that can be inserted, or "locks in" the maximum quantity of available flowerpots for the substring.
Also, you know that the maximum flowerpots that can be placed in a set of n open positions bound by closed positions is n - 2 (else n-1 if only bracketed on one side, i.e. the string begins or ends with a "false set". The base condition (the first position is open, or the first position is closed) can calculated upon reaching the second flowerpot.
So, we can build up to the total number of flowerpots that can be inserted into the whole arrangement in terms of the maximum number of flowerpots that can be inserted into smaller subarrangements that have been previously calculated. By storing our previous calculations in an array, we reduce the amount of time necessary to calculate the maximum for the next subarrangement to a single array lookup and some constant-time calculations. This is the essence of dynamic programming!
EDIT: I updated the answer to provide a description of the Dynamic Programming approach. Please consider working through the interactive textbook I mentioned in the comments! http://interactivepython.org/runestone/static/pythonds/index.html
I would approach the problem like this. You need FFF to have one more pot, FFFFF for two pots, etc. To handle the end cases, add an F at each end.
Because this is very similar to a 16-bit integer, the algorithm should use tricks like binary arithmetic operations.
Here is an implementation in Python that uses bit masking (value & 1), bit shifting (value >>= 1) and math ((zeros - 1) / 2) to count empty slots and calculate how many flower pots could fit.
#value = 0b1000100100001
value = 0b0011000001100
width = 13
print bin(value)
pots = 0 # number of flower pots possible
zeros = 1 # number of zero bits in a row, start with one leading zero
for i in range(width):
if value & 1: # bit is one, count the number of zeros
if zeros > 0:
pots += (zeros - 1) / 2
zeros = 0
else: # bit is zero, increment the number found
zeros += 1
value >>= 1 # shift the bits to the right
zeros += 1 # add one trailing zero
pots += (zeros - 1) / 2
print pots, "flower pots"
The solution is really simple, check the previous and current value of the position and mark the position as plantable (or puttable) and increment the count. Read the next value, if it is already is planted, (backtrack and) change the previous value and decrement the count. The complexity is O(n). What we really want to check is the occurrence of 1001. Following is the implementation of the algorithm in Java.
public boolean canPlaceFlowers(List<Boolean> flowerbed, int numberToPlace) {
Boolean previous = false;
boolean puttable = false;
boolean prevChanged = false;
int planted = 0;
for (Boolean current : flowerbed) {
if (previous == false && current == false) {
puttable = true;
}
if (prevChanged == true && current == true) {
planted--;
}
if (puttable) {
previous = true;
prevChanged = true;
planted++;
puttable = false;
} else {
previous = current;
prevChanged = false;
}
}
if (planted >= numberToPlace) {
return true;
}
return false;
}
private static void canPlaceOneFlower(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 1);
System.out.println("Can place 1 flower");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceTwoFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 2);
System.out.println("Can place 2 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceThreeFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 3);
System.out.println("Can place 3 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
private static void canPlaceFourFlowers(List<Boolean> flowerbed, FlowerBed fb) {
boolean result;
result = fb.canPlaceFlowers(flowerbed, 4);
System.out.println("Can place 4 flowers");
if (result) {
System.out.println("-->Yes");
} else {
System.out.println("-->No");
}
}
public static void main(String[] args) {
List<Boolean> flowerbed = makeBed(new int[] { 0, 0, 0, 0, 0, 0, 0 });
FlowerBed fb = new FlowerBed();
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
flowerbed = makeBed(new int[] { 0, 0, 0, 1, 0, 0, 0 });
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
canPlaceTwoFlowers(flowerbed, fb);
flowerbed = makeBed(new int[] { 1, 0, 0, 1, 0, 0, 0, 1 });
canPlaceFourFlowers(flowerbed, fb);
canPlaceThreeFlowers(flowerbed, fb);
canPlaceTwoFlowers(flowerbed, fb);
canPlaceOneFlower(flowerbed, fb);
}
My solution using dynamic programming.
ar is array in the form of ['F','T','F'].
import numpy as np
def pot(ar):
s = len(ar)
rt = np.zeros((s,s))
for k in range(0,s):
for i in range(s-k):
for j in range(i,i+k+1):
left = 0
right = 0
if ar[j] != 'F':
continue
if j-1 >= i and ar[j-1] == 'T':
continue
else:
left = 0
if j+1 <= i+k and ar[j+1] == 'T':
continue
else:
right = 0
if j-2 >= i:
left = rt[i][j-2]
if j+2 <= i+k:
right = rt[j+2][i+k]
rt[i][i+k] = max(rt[i][i+k], left+right+1)
return rt[0][len(ar)-1]
My solution written in C#
private static int CheckAvailableSlots(string str)
{
int counter = 0;
char[] chrs = str.ToCharArray();
if (chrs.FirstOrDefault().Equals('F'))
if (chrs.Length == 1)
counter++;
else if (chrs.Skip(1).FirstOrDefault().Equals('F'))
counter++;
if (chrs.LastOrDefault().Equals('F') && chrs.Reverse().Skip(1).FirstOrDefault().Equals('F'))
counter++;
for (int i = 1; i < chrs.Length - 2; i++)
{
if (chrs[i - 1].Equals('T'))
continue;
else if (chrs[i].Equals('F') && chrs[i + 1].Equals('F'))
{
chrs[i] = 'T';
counter++;
i++;
}
else
i++;
}
return counter;
}
// 1='T'
// 0='F'
int[] flowerbed = new int[] {1,0,0,0,0,1};
public boolean canPlaceFlowers(int[] flowerbed, int n) {
int tg = 0;
for (int i = 0, g = 1; i < flowerbed.length && tg < n; i++) {
g += flowerbed[i] == 0 ? flowerbed.length - 1 == i ? 2 : 1 : 0;
if (flowerbed[i] == 1 || i == flowerbed.length - 1) {
tg += g / 2 - (g % 2 == 0 ? 1 : 0);
g = 0;
}
}
return tg >= n;
}
Most of these answers (unless they alter the array or traverse and a copy) dont consider the situation where the first 3 (or last 3) pots are empty. These solutions will incorrectly determine that FFFT will contain 2 spaces, rather than just one. We therefore need to start at the third element (rather than then second) and end at index length - 3 (rather than length - 2). Also, while looping through the array, if an eligible index is found, the index just be incremented by 2, otherwise TTFFFFT would give 2 available plots instead of one. This is true unless you alter the array while looping or use a copy of the array and alter it.
Edit: this holds true unless the question is how many spaces are available for planting, rather than how many total plants can be added

Printing numbers of the form 2^i * 5^j in increasing order

How do you print numbers of form 2^i * 5^j in increasing order.
For eg:
1, 2, 4, 5, 8, 10, 16, 20
This is actually a very interesting question, especially if you don't want this to be N^2 or NlogN complexity.
What I would do is the following:
Define a data structure containing 2 values (i and j) and the result of the formula.
Define a collection (e.g. std::vector) containing this data structures
Initialize the collection with the value (0,0) (the result is 1 in this case)
Now in a loop do the following:
Look in the collection and take the instance with the smallest value
Remove it from the collection
Print this out
Create 2 new instances based on the instance you just processed
In the first instance increment i
In the second instance increment j
Add both instances to the collection (if they aren't in the collection yet)
Loop until you had enough of it
The performance can be easily tweaked by choosing the right data structure and collection.
E.g. in C++, you could use an std::map, where the key is the result of the formula, and the value is the pair (i,j). Taking the smallest value is then just taking the first instance in the map (*map.begin()).
I quickly wrote the following application to illustrate it (it works!, but contains no further comments, sorry):
#include <math.h>
#include <map>
#include <iostream>
typedef __int64 Integer;
typedef std::pair<Integer,Integer> MyPair;
typedef std::map<Integer,MyPair> MyMap;
Integer result(const MyPair &myPair)
{
return pow((double)2,(double)myPair.first) * pow((double)5,(double)myPair.second);
}
int main()
{
MyMap myMap;
MyPair firstValue(0,0);
myMap[result(firstValue)] = firstValue;
while (true)
{
auto it=myMap.begin();
if (it->first < 0) break; // overflow
MyPair myPair = it->second;
std::cout << it->first << "= 2^" << myPair.first << "*5^" << myPair.second << std::endl;
myMap.erase(it);
MyPair pair1 = myPair;
++pair1.first;
myMap[result(pair1)] = pair1;
MyPair pair2 = myPair;
++pair2.second;
myMap[result(pair2)] = pair2;
}
}
This is well suited to a functional programming style. In F#:
let min (a,b)= if(a<b)then a else b;;
type stream (current, next)=
member this.current = current
member this.next():stream = next();;
let rec merge(a:stream,b:stream)=
if(a.current<b.current) then new stream(a.current, fun()->merge(a.next(),b))
else new stream(b.current, fun()->merge(a,b.next()));;
let rec Squares(start) = new stream(start,fun()->Squares(start*2));;
let rec AllPowers(start) = new stream(start,fun()->merge(Squares(start*2),AllPowers(start*5)));;
let Results = AllPowers(1);;
Works well with Results then being a stream type with current value and a next method.
Walking through it:
I define min for completenes.
I define a stream type to have a current value and a method to return a new string, essentially head and tail of a stream of numbers.
I define the function merge, which takes the smaller of the current values of two streams and then increments that stream. It then recurses to provide the rest of the stream. Essentially, given two streams which are in order, it will produce a new stream which is in order.
I define squares to be a stream increasing in powers of 2.
AllPowers takes the start value and merges the stream resulting from all squares at this number of powers of 5. it with the stream resulting from multiplying it by 5, since these are your only two options. You effectively are left with a tree of results
The result is merging more and more streams, so you merge the following streams
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
.
.
.
Merging all of these turns out to be fairly efficient with tail recursio and compiler optimisations etc.
These could be printed to the console like this:
let rec PrintAll(s:stream)=
if (s.current > 0) then
do System.Console.WriteLine(s.current)
PrintAll(s.next());;
PrintAll(Results);
let v = System.Console.ReadLine();
Similar things could be done in any language which allows for recursion and passing functions as values (it's only a little more complex if you can't pass functions as variables).
For an O(N) solution, you can use a list of numbers found so far and two indexes: one representing the next number to be multiplied by 2, and the other the next number to be multiplied by 5. Then in each iteration you have two candidate values to choose the smaller one from.
In Python:
numbers = [1]
next_2 = 0
next_5 = 0
for i in xrange(100):
mult_2 = numbers[next_2]*2
mult_5 = numbers[next_5]*5
if mult_2 < mult_5:
next = mult_2
next_2 += 1
else:
next = mult_5
next_5 += 1
# The comparison here is to avoid appending duplicates
if next > numbers[-1]:
numbers.append(next)
print numbers
So we have two loops, one incrementing i and second one incrementing j starting both from zero, right? (multiply symbol is confusing in the title of the question)
You can do something very straightforward:
Add all items in an array
Sort the array
Or you need an other solution with more math analysys?
EDIT: More smart solution by leveraging similarity with Merge Sort problem
If we imagine infinite set of numbers of 2^i and 5^j as two independent streams/lists this problem looks very the same as well known Merge Sort problem.
So solution steps are:
Get two numbers one from the each of streams (of 2 and of 5)
Compare
Return smallest
get next number from the stream of the previously returned smallest
and that's it! ;)
PS: Complexity of Merge Sort always is O(n*log(n))
I visualize this problem as a matrix M where M(i,j) = 2^i * 5^j. This means that both the rows and columns are increasing.
Think about drawing a line through the entries in increasing order, clearly beginning at entry (1,1). As you visit entries, the row and column increasing conditions ensure that the shape formed by those cells will always be an integer partition (in English notation). Keep track of this partition (mu = (m1, m2, m3, ...) where mi is the number of smaller entries in row i -- hence m1 >= m2 >= ...). Then the only entries that you need to compare are those entries which can be added to the partition.
Here's a crude example. Suppose you've visited all the xs (mu = (5,3,3,1)), then you need only check the #s:
x x x x x #
x x x #
x x x
x #
#
Therefore the number of checks is the number of addable cells (equivalently the number of ways to go up in Bruhat order if you're of a mind to think in terms of posets).
Given a partition mu, it's easy to determine what the addable states are. Image an infinite string of 0s following the last positive entry. Then you can increase mi by 1 if and only if m(i-1) > mi.
Back to the example, for mu = (5,3,3,1) we can increase m1 (6,3,3,1) or m2 (5,4,3,1) or m4 (5,3,3,2) or m5 (5,3,3,1,1).
The solution to the problem then finds the correct sequence of partitions (saturated chain). In pseudocode:
mu = [1,0,0,...,0];
while (/* some terminate condition or go on forever */) {
minNext = 0;
nextCell = [];
// look through all addable cells
for (int i=0; i<mu.length; ++i) {
if (i==0 or mu[i-1]>mu[i]) {
// check for new minimum value
if (minNext == 0 or 2^i * 5^(mu[i]+1) < minNext) {
nextCell = i;
minNext = 2^i * 5^(mu[i]+1)
}
}
}
// print next largest entry and update mu
print(minNext);
mu[i]++;
}
I wrote this in Maple stopping after 12 iterations:
1, 2, 4, 5, 8, 10, 16, 20, 25, 32, 40, 50
and the outputted sequence of cells added and got this:
1 2 3 5 7 10
4 6 8 11
9 12
corresponding to this matrix representation:
1, 2, 4, 8, 16, 32...
5, 10, 20, 40, 80, 160...
25, 50, 100, 200, 400...
First of all, (as others mentioned already) this question is very vague!!!
Nevertheless, I am going to give a shot based on your vague equation and the pattern as your expected result. So I am not sure the following will be true for what you are trying to do, however it may give you some idea about java collections!
import java.util.List;
import java.util.ArrayList;
import java.util.SortedSet;
import java.util.TreeSet;
public class IncreasingNumbers {
private static List<Integer> findIncreasingNumbers(int maxIteration) {
SortedSet<Integer> numbers = new TreeSet<Integer>();
SortedSet<Integer> numbers2 = new TreeSet<Integer>();
for (int i=0;i < maxIteration;i++) {
int n1 = (int)Math.pow(2, i);
numbers.add(n1);
for (int j=0;j < maxIteration;j++) {
int n2 = (int)Math.pow(5, i);
numbers.add(n2);
for (Integer n: numbers) {
int n3 = n*n1;
numbers2.add(n3);
}
}
}
numbers.addAll(numbers2);
return new ArrayList<Integer>(numbers);
}
/**
* Based on the following fuzzy question # StackOverflow
* http://stackoverflow.com/questions/7571934/printing-numbers-of-the-form-2i-5j-in-increasing-order
*
*
* Result:
* 1 2 4 5 8 10 16 20 25 32 40 64 80 100 125 128 200 256 400 625 1000 2000 10000
*/
public static void main(String[] args) {
List<Integer> numbers = findIncreasingNumbers(5);
for (Integer i: numbers) {
System.out.print(i + " ");
}
}
}
If you can do it in O(nlogn), here's a simple solution:
Get an empty min-heap
Put 1 in the heap
while (you want to continue)
Get num from heap
print num
put num*2 and num*5 in the heap
There you have it. By min-heap, I mean min-heap
As a mathematician the first thing I always think about when looking at something like this is "will logarithms help?".
In this case it might.
If our series A is increasing then the series log(A) is also increasing. Since all terms of A are of the form 2^i.5^j then all members of the series log(A) are of the form i.log(2) + j.log(5)
We can then look at the series log(A)/log(2) which is also increasing and its elements are of the form i+j.(log(5)/log(2))
If we work out the i and j that generates the full ordered list for this last series (call it B) then that i and j will also generate the series A correctly.
This is just changing the nature of the problem but hopefully to one where it becomes easier to solve. At each step you can either increase i and decrease j or vice versa.
Looking at a few of the early changes you can make (which I will possibly refer to as transforms of i,j or just transorms) gives us some clues of where we are going.
Clearly increasing i by 1 will increase B by 1. However, given that log(5)/log(2) is approx 2.3 then increasing j by 1 while decreasing i by 2 will given an increase of just 0.3 . The problem then is at each stage finding the minimum possible increase in B for changes of i and j.
To do this I just kept a record as I increased of the most efficient transforms of i and j (ie what to add and subtract from each) to get the smallest possible increase in the series. Then applied whichever one was valid (ie making sure i and j don't go negative).
Since at each stage you can either decrease i or decrease j there are effectively two classes of transforms that can be checked individually. A new transform doesn't have to have the best overall score to be included in our future checks, just better than any other in its class.
To test my thougths I wrote a sort of program in LinqPad. Key things to note are that the Dump() method just outputs the object to screen and that the syntax/structure isn't valid for a real c# file. Converting it if you want to run it should be easy though.
Hopefully anything not explicitly explained will be understandable from the code.
void Main()
{
double C = Math.Log(5)/Math.Log(2);
int i = 0;
int j = 0;
int maxi = i;
int maxj = j;
List<int> outputList = new List<int>();
List<Transform> transforms = new List<Transform>();
outputList.Add(1);
while (outputList.Count<500)
{
Transform tr;
if (i==maxi)
{
//We haven't considered i this big before. Lets see if we can find an efficient transform by getting this many i and taking away some j.
maxi++;
tr = new Transform(maxi, (int)(-(maxi-maxi%C)/C), maxi%C);
AddIfWorthwhile(transforms, tr);
}
if (j==maxj)
{
//We haven't considered j this big before. Lets see if we can find an efficient transform by getting this many j and taking away some i.
maxj++;
tr = new Transform((int)(-(maxj*C)), maxj, (maxj*C)%1);
AddIfWorthwhile(transforms, tr);
}
//We have a set of transforms. We first find ones that are valid then order them by score and take the first (smallest) one.
Transform bestTransform = transforms.Where(x=>x.I>=-i && x.J >=-j).OrderBy(x=>x.Score).First();
//Apply transform
i+=bestTransform.I;
j+=bestTransform.J;
//output the next number in out list.
int value = GetValue(i,j);
//This line just gets it to stop when it overflows. I would have expected an exception but maybe LinqPad does magic with them?
if (value<0) break;
outputList.Add(value);
}
outputList.Dump();
}
public int GetValue(int i, int j)
{
return (int)(Math.Pow(2,i)*Math.Pow(5,j));
}
public void AddIfWorthwhile(List<Transform> list, Transform tr)
{
if (list.Where(x=>(x.Score<tr.Score && x.IncreaseI == tr.IncreaseI)).Count()==0)
{
list.Add(tr);
}
}
// Define other methods and classes here
public class Transform
{
public int I;
public int J;
public double Score;
public bool IncreaseI
{
get {return I>0;}
}
public Transform(int i, int j, double score)
{
I=i;
J=j;
Score=score;
}
}
I've not bothered looking at the efficiency of this but I strongly suspect its better than some other solutions because at each stage all I need to do is check my set of transforms - working out how many of these there are compared to "n" is non-trivial. It is clearly related since the further you go the more transforms there are but the number of new transforms becomes vanishingly small at higher numbers so maybe its just O(1). This O stuff always confused me though. ;-)
One advantage over other solutions is that it allows you to calculate i,j without needing to calculate the product allowing me to work out what the sequence would be without needing to calculate the actual number itself.
For what its worth after the first 230 nunmbers (when int runs out of space) I had 9 transforms to check each time. And given its only my total that overflowed I ran if for the first million results and got to i=5191 and j=354. The number of transforms was 23. The size of this number in the list is approximately 10^1810. Runtime to get to this level was approx 5 seconds.
P.S. If you like this answer please feel free to tell your friends since I spent ages on this and a few +1s would be nice compensation. Or in fact just comment to tell me what you think. :)
I'm sure everyone one's might have got the answer by now, but just wanted to give a direction to this solution..
It's a Ctrl C + Ctrl V from
http://www.careercup.com/question?id=16378662
void print(int N)
{
int arr[N];
arr[0] = 1;
int i = 0, j = 0, k = 1;
int numJ, numI;
int num;
for(int count = 1; count < N; )
{
numI = arr[i] * 2;
numJ = arr[j] * 5;
if(numI < numJ)
{
num = numI;
i++;
}
else
{
num = numJ;
j++;
}
if(num > arr[k-1])
{
arr[k] = num;
k++;
count++;
}
}
for(int counter = 0; counter < N; counter++)
{
printf("%d ", arr[counter]);
}
}
The question as put to me was to return an infinite set of solutions. I pondered the use of trees, but felt there was a problem with figuring out when to harvest and prune the tree, given an infinite number of values for i & j. I realized that a sieve algorithm could be used. Starting from zero, determine whether each positive integer had values for i and j. This was facilitated by turning answer = (2^i)*(2^j) around and solving for i instead. That gave me i = log2 (answer/ (5^j)). Here is the code:
class Program
{
static void Main(string[] args)
{
var startTime = DateTime.Now;
int potential = 0;
do
{
if (ExistsIandJ(potential))
Console.WriteLine("{0}", potential);
potential++;
} while (potential < 100000);
Console.WriteLine("Took {0} seconds", DateTime.Now.Subtract(startTime).TotalSeconds);
}
private static bool ExistsIandJ(int potential)
{
// potential = (2^i)*(5^j)
// 1 = (2^i)*(5^j)/potential
// 1/(2^1) = (5^j)/potential or (2^i) = potential / (5^j)
// i = log2 (potential / (5^j))
for (var j = 0; Math.Pow(5,j) <= potential; j++)
{
var i = Math.Log(potential / Math.Pow(5, j), 2);
if (i == Math.Truncate(i))
return true;
}
return false;
}
}

Find out which combinations of numbers in a set add up to a given total

I've been tasked with helping some accountants solve a common problem they have - given a list of transactions and a total deposit, which transactions are part of the deposit? For example, say I have this list of numbers:
1.00
2.50
3.75
8.00
And I know that my total deposit is 10.50, I can easily see that it's made up of the 8.00 and 2.50 transaction. However, given a hundred transactions and a deposit in the millions, it quickly becomes much more difficult.
In testing a brute force solution (which takes way too long to be practical), I had two questions:
With a list of about 60 numbers, it seems to find a dozen or more combinations for any total that's reasonable. I was expecting a single combination to satisfy my total, or maybe a few possibilities, but there always seem to be a ton of combinations. Is there a math principle that describes why this is? It seems that given a collection of random numbers of even a medium size, you can find a multiple combination that adds up to just about any total you want.
I built a brute force solution for the problem, but it's clearly O(n!), and quickly grows out of control. Aside from the obvious shortcuts (exclude numbers larger than the total themselves), is there a way to shorten the time to calculate this?
Details on my current (super-slow) solution:
The list of detail amounts is sorted largest to smallest, and then the following process runs recursively:
Take the next item in the list and see if adding it to your running total makes your total match the target. If it does, set aside the current chain as a match. If it falls short of your target, add it to your running total, remove it from the list of detail amounts, and then call this process again
This way it excludes the larger numbers quickly, cutting the list down to only the numbers it needs to consider. However, it's still n! and larger lists never seem to finish, so I'm interested in any shortcuts I might be able to take to speed this up - I suspect that even cutting 1 number out of the list would cut the calculation time in half.
Thanks for your help!
This special case of the Knapsack problem is called Subset Sum.
C# version
setup test:
using System;
using System.Collections.Generic;
public class Program
{
public static void Main(string[] args)
{
// subtotal list
List<double> totals = new List<double>(new double[] { 1, -1, 18, 23, 3.50, 8, 70, 99.50, 87, 22, 4, 4, 100.50, 120, 27, 101.50, 100.50 });
// get matches
List<double[]> results = Knapsack.MatchTotal(100.50, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
Console.ReadKey();
}
}
code:
using System.Collections.Generic;
using System.Linq;
public class Knapsack
{
internal static List<double[]> MatchTotal(double theTotal, List<double> subTotals)
{
List<double[]> results = new List<double[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new double[1] { theTotal });
subTotals.Remove(theTotal);
}
// if no subtotals were passed
// or all matched the Total
// return
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
// if there aren't any negative values
// we can remove any values bigger than the total
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
// if there aren't any negative values
// and sum is less than the total no need to look further
if (mostNegativeNumber == 0 && subTotals.Sum() < theTotal)
return results;
// get the combinations for the remaining subTotals
// skip 1 since we already removed subTotals that match
for (int choose = 2; choose <= subTotals.Count; choose++)
{
// get combinations for each length
IEnumerable<IEnumerable<double>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
// add combinations where the sum mathces the total to the result list
results.AddRange(from combo in combos
where combo.Sum() == theTotal
select combo.ToArray());
}
return results;
}
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ? // if choose = 0
new[] { new T[0] } : // return empty Type array
elements.SelectMany((element, i) => // else recursively iterate over array to create combinations
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
results:
100.5
100.5
-1,101.5
1,99.5
3.5,27,70
3.5,4,23,70
3.5,4,23,70
-1,1,3.5,27,70
1,3.5,4,22,70
1,3.5,4,22,70
1,3.5,8,18,70
-1,1,3.5,4,23,70
-1,1,3.5,4,23,70
1,3.5,4,4,18,70
-1,3.5,8,18,22,23,27
-1,3.5,4,4,18,22,23,27
Done.
If subTotals are repeated, there will appear to be duplicate results (the desired effect). In reality, you will probably want to use the subTotal Tupled with some ID, so you can relate it back to your data.
If I understand your problem correctly, you have a set of transactions, and you merely wish to know which of them could have been included in a given total. So if there are 4 possible transactions, then there are 2^4 = 16 possible sets to inspect. This problem is, for 100 possible transactions, the search space has 2^100 = 1267650600228229401496703205376 possible combinations to search over. For 1000 potential transactions in the mix, it grows to a total of
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376
sets that you must test. Brute force will hardly be a viable solution on these problems.
Instead, use a solver that can handle knapsack problems. But even then, I'm not sure that you can generate a complete enumeration of all possible solutions without some variation of brute force.
There is a cheap Excel Add-in that solves this problem: SumMatch
The Excel Solver Addin as posted over on superuser.com has a great solution (if you have Excel) https://superuser.com/questions/204925/excel-find-a-subset-of-numbers-that-add-to-a-given-total
Its kind of like 0-1 Knapsack problem which is NP-complete and can be solved through dynamic programming in polynomial time.
http://en.wikipedia.org/wiki/Knapsack_problem
But at the end of the algorithm you also need to check that the sum is what you wanted.
Depending on your data you could first look at the cents portion of each transaction. Like in your initial example you know that 2.50 has to be part of the total because it is the only set of non-zero cent transactions which add to 50.
Not a super efficient solution but heres an implementation in coffeescript
combinations returns all possible combinations of the elements in list
combinations = (list) ->
permuations = Math.pow(2, list.length) - 1
out = []
combinations = []
while permuations
out = []
for i in [0..list.length]
y = ( 1 << i )
if( y & permuations and (y isnt permuations))
out.push(list[i])
if out.length <= list.length and out.length > 0
combinations.push(out)
permuations--
return combinations
and then find_components makes use of it to determine which numbers add up to total
find_components = (total, list) ->
# given a list that is assumed to have only unique elements
list_combinations = combinations(list)
for combination in list_combinations
sum = 0
for number in combination
sum += number
if sum is total
return combination
return []
Heres an example
list = [7.2, 3.3, 4.5, 6.0, 2, 4.1]
total = 7.2 + 2 + 4.1
console.log(find_components(total, list))
which returns [ 7.2, 2, 4.1 ]
#include <stdio.h>
#include <stdlib.h>
/* Takes at least 3 numbers as arguments.
* First number is desired sum.
* Find the subset of the rest that comes closest
* to the desired sum without going over.
*/
static long *elements;
static int nelements;
/* A linked list of some elements, not necessarily all */
/* The list represents the optimal subset for elements in the range [index..nelements-1] */
struct status {
long sum; /* sum of all the elements in the list */
struct status *next; /* points to next element in the list */
int index; /* index into elements array of this element */
};
/* find the subset of elements[startingat .. nelements-1] whose sum is closest to but does not exceed desiredsum */
struct status *reportoptimalsubset(long desiredsum, int startingat) {
struct status *sumcdr = NULL;
struct status *sumlist = NULL;
/* sum of zero elements or summing to zero */
if (startingat == nelements || desiredsum == 0) {
return NULL;
}
/* optimal sum using the current element */
/* if current elements[startingat] too big, it won't fit, don't try it */
if (elements[startingat] <= desiredsum) {
sumlist = malloc(sizeof(struct status));
sumlist->index = startingat;
sumlist->next = reportoptimalsubset(desiredsum - elements[startingat], startingat + 1);
sumlist->sum = elements[startingat] + (sumlist->next ? sumlist->next->sum : 0);
if (sumlist->sum == desiredsum)
return sumlist;
}
/* optimal sum not using current element */
sumcdr = reportoptimalsubset(desiredsum, startingat + 1);
if (!sumcdr) return sumlist;
if (!sumlist) return sumcdr;
return (sumcdr->sum < sumlist->sum) ? sumlist : sumcdr;
}
int main(int argc, char **argv) {
struct status *result = NULL;
long desiredsum = strtol(argv[1], NULL, 10);
nelements = argc - 2;
elements = malloc(sizeof(long) * nelements);
for (int i = 0; i < nelements; i++) {
elements[i] = strtol(argv[i + 2], NULL , 10);
}
result = reportoptimalsubset(desiredsum, 0);
if (result)
printf("optimal subset = %ld\n", result->sum);
while (result) {
printf("%ld + ", elements[result->index]);
result = result->next;
}
printf("\n");
}
Best to avoid use of floats and doubles when doing arithmetic and equality comparisons btw.

Roulette wheel selection algorithm [duplicate]

This question already has answers here:
Roulette Selection in Genetic Algorithms
(14 answers)
Closed 7 years ago.
Can anyone provide some pseudo code for a roulette selection function? How would I implement this: I don't really understand how to read this math notation.I want General algorithm to this.
The other answers seem to be assuming that you are trying to implement a roulette game. I think that you are asking about roulette wheel selection in evolutionary algorithms.
Here is some Java code that implements roulette wheel selection.
Assume you have 10 items to choose from and you choose by generating a random number between 0 and 1. You divide the range 0 to 1 up into ten non-overlapping segments, each proportional to the fitness of one of the ten items. For example, this might look like this:
0 - 0.3 is item 1
0.3 - 0.4 is item 2
0.4 - 0.5 is item 3
0.5 - 0.57 is item 4
0.57 - 0.63 is item 5
0.63 - 0.68 is item 6
0.68 - 0.8 is item 7
0.8 - 0.85 is item 8
0.85 - 0.98 is item 9
0.98 - 1 is item 10
This is your roulette wheel. Your random number between 0 and 1 is your spin. If the random number is 0.46, then the chosen item is item 3. If it's 0.92, then it's item 9.
Here is a bit of python code:
def roulette_select(population, fitnesses, num):
""" Roulette selection, implemented according to:
<http://stackoverflow.com/questions/177271/roulette
-selection-in-genetic-algorithms/177278#177278>
"""
total_fitness = float(sum(fitnesses))
rel_fitness = [f/total_fitness for f in fitnesses]
# Generate probability intervals for each individual
probs = [sum(rel_fitness[:i+1]) for i in range(len(rel_fitness))]
# Draw new population
new_population = []
for n in xrange(num):
r = rand()
for (i, individual) in enumerate(population):
if r <= probs[i]:
new_population.append(individual)
break
return new_population
First, generate an array of the percentages you assigned, let's say p[1..n]
and assume the total is the sum of all the percentages.
Then get a random number between 1 to total, let's say r
Now, the algorithm in lua:
local c = 0
for i = 1,n do
c = c + p[i]
if r <= c then
return i
end
end
There are 2 steps to this: First create an array with all the values on the wheel. This can be a 2 dimensional array with colour as well as number, or you can choose to add 100 to red numbers.
Then simply generate a random number between 0 or 1 (depending on whether your language starts numbering array indexes from 0 or 1) and the last element in your array.
Most languages have built-in random number functions. In VB and VBScript the function is RND(). In Javascript it is Math.random()
Fetch the value from that position in the array and you have your random roulette number.
Final note: don't forget to seed your random number generator or you will get the same sequence of draws every time you run the program.
Here is a really quick way to do it using stream selection in Java. It selects the indices of an array using the values as weights. No cumulative weights needed due to the mathematical properties.
static int selectRandomWeighted(double[] wts, Random rnd) {
int selected = 0;
double total = wts[0];
for( int i = 1; i < wts.length; i++ ) {
total += wts[i];
if( rnd.nextDouble() <= (wts[i] / total)) selected = i;
}
return selected;
}
This could be further improved using Kahan summation or reading through the doubles as an iterable if the array was too big to initialize at once.
I wanted the same and so created this self-contained Roulette class. You give it a series of weights (in the form of a double array), and it will simply return an index from that array according to a weighted random pick.
I created a class because you can get a big speed up by only doing the cumulative additions once via the constructor. It's C# code, but enjoy the C like speed and simplicity!
class Roulette
{
double[] c;
double total;
Random random;
public Roulette(double[] n) {
random = new Random();
total = 0;
c = new double[n.Length+1];
c[0] = 0;
// Create cumulative values for later:
for (int i = 0; i < n.Length; i++) {
c[i+1] = c[i] + n[i];
total += n[i];
}
}
public int spin() {
double r = random.NextDouble() * total; // Create a random number between 0 and 1 and times by the total we calculated earlier.
//int j; for (j = 0; j < c.Length; j++) if (c[j] > r) break; return j-1; // Don't use this - it's slower than the binary search below.
//// Binary search for efficiency. Objective is to find index of the number just above r:
int a = 0;
int b = c.Length - 1;
while (b - a > 1) {
int mid = (a + b) / 2;
if (c[mid] > r) b = mid;
else a = mid;
}
return a;
}
}
The initial weights are up to you. Maybe it could be the fitness of each member, or a value inversely proportional to the member's position in the "top 50". E.g.: 1st place = 1.0 weighting, 2nd place = 0.5, 3rd place = 0.333, 4th place = 0.25 weighting etc. etc.
Well, for an American Roulette wheel, you're going to need to generate a random integer between 1 and 38. There are 36 numbers, a 0, and a 00.
One of the big things to consider, though, is that in American roulette, their are many different bets that can be made. A single bet can cover 1, 2, 3, 4, 5, 6, two different 12s, or 18. You may wish to create a list of lists where each number has additional flages to simplify that, or do it all in the programming.
If I were implementing it in Python, I would just create a Tuple of 0, 00, and 1 through 36 and use random.choice() for each spin.
This assumes some class "Classifier" which just has a String condition, String message, and double strength. Just follow the logic.
-- Paul
public static List<Classifier> rouletteSelection(int classifiers) {
List<Classifier> classifierList = new LinkedList<Classifier>();
double strengthSum = 0.0;
double probabilitySum = 0.0;
// add up the strengths of the map
Set<String> keySet = ClassifierMap.CLASSIFIER_MAP.keySet();
for (String key : keySet) {
/* used for debug to make sure wheel is working.
if (strengthSum == 0.0) {
ClassifierMap.CLASSIFIER_MAP.get(key).setStrength(8000.0);
}
*/
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double strength = classifier.getStrength();
strengthSum = strengthSum + strength;
}
System.out.println("strengthSum: " + strengthSum);
// compute the total probability. this will be 1.00 or close to it.
for (String key : keySet) {
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double probability = (classifier.getStrength() / strengthSum);
probabilitySum = probabilitySum + probability;
}
System.out.println("probabilitySum: " + probabilitySum);
while (classifierList.size() < classifiers) {
boolean winnerFound = false;
double rouletteRandom = random.nextDouble();
double rouletteSum = 0.0;
for (String key : keySet) {
Classifier classifier = ClassifierMap.CLASSIFIER_MAP.get(key);
double probability = (classifier.getStrength() / strengthSum);
rouletteSum = rouletteSum + probability;
if (rouletteSum > rouletteRandom && (winnerFound == false)) {
System.out.println("Winner found: " + probability);
classifierList.add(classifier);
winnerFound = true;
}
}
}
return classifierList;
}
You can use a data structure like this:
Map<A, B> roulette_wheel_schema = new LinkedHashMap<A, B>()
where A is an integer that represents a pocket of the roulette wheel, and B is an index that identifies a chromosome in the population. The number of pockets is proportional to the fitness proportionate of each chromosome:
number of pockets = (fitness proportionate) · (scale factor)
Then we generate a random between 0 and the size of the selection schema and with this random number we get the index of the chromosome from the roulette.
We calculate the relative error between the fitness proportionate of each chromosome and the probability of being selected by the selection scheme.
The method getRouletteWheel returns the selection scheme based on previous data structure.
private Map<Integer, Integer> getRouletteWheel(
ArrayList<Chromosome_fitnessProportionate> chromosomes,
int precision) {
/*
* The number of pockets on the wheel
*
* number of pockets in roulette_wheel_schema = probability ·
* (10^precision)
*/
Map<Integer, Integer> roulette_wheel_schema = new LinkedHashMap<Integer, Integer>();
double fitness_proportionate = 0.0D;
double pockets = 0.0D;
int key_counter = -1;
double scale_factor = Math
.pow(new Double(10.0D), new Double(precision));
for (int index_cromosome = 0; index_cromosome < chromosomes.size(); index_cromosome++){
Chromosome_fitnessProportionate chromosome = chromosomes
.get(index_cromosome);
fitness_proportionate = chromosome.getFitness_proportionate();
fitness_proportionate *= scale_factor;
pockets = Math.rint(fitness_proportionate);
System.out.println("... " + index_cromosome + " : " + pockets);
for (int j = 0; j < pockets; j++) {
roulette_wheel_schema.put(Integer.valueOf(++key_counter),
Integer.valueOf(index_cromosome));
}
}
return roulette_wheel_schema;
}
I have worked out a Java code similar to that of Dan Dyer (referenced earlier). My roulette-wheel, however, selects a single element based on a probability vector (input) and returns the index of the selected element.
Having said that, the following code is more appropriate if the selection size is unitary and if you do not assume how the probabilities are calculated and zero probability value is allowed. The code is self-contained and includes a test with 20 wheel spins (to run).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;
/**
* Roulette-wheel Test version.
* Features a probability vector input with possibly null probability values.
* Appropriate for adaptive operator selection such as Probability Matching
* or Adaptive Pursuit, (Dynamic) Multi-armed Bandit.
* #version October 2015.
* #author Hakim Mitiche
*/
public class RouletteWheel {
/**
* Selects an element probabilistically.
* #param wheelProbabilities elements probability vector.
* #param rng random generator object
* #return selected element index
* #throws java.lang.Exception
*/
public int select(List<Double> wheelProbabilities, Random rng)
throws Exception{
double[] cumulativeProba = new double[wheelProbabilities.size()];
cumulativeProba[0] = wheelProbabilities.get(0);
for (int i = 1; i < wheelProbabilities.size(); i++)
{
double proba = wheelProbabilities.get(i);
cumulativeProba[i] = cumulativeProba[i - 1] + proba;
}
int last = wheelProbabilities.size()-1;
if (cumulativeProba[last] != 1.0)
{
throw new Exception("The probabilities does not sum up to one ("
+ "sum="+cumulativeProba[last]);
}
double r = rng.nextDouble();
int selected = Arrays.binarySearch(cumulativeProba, r);
if (selected < 0)
{
/* Convert negative insertion point to array index.
to find the correct cumulative proba range index.
*/
selected = Math.abs(selected + 1);
}
/* skip indexes of elements with Zero probability,
go backward to matching index*/
int i = selected;
while (wheelProbabilities.get(i) == 0.0){
System.out.print(i+" selected, correction");
i--;
if (i<0) i=last;
}
selected = i;
return selected;
}
public static void main(String[] args){
RouletteWheel rw = new RouletteWheel();
int rept = 20;
List<Double> P = new ArrayList<>(4);
P.add(0.2);
P.add(0.1);
P.add(0.6);
P.add(0.1);
Random rng = new Random();
for (int i = 0 ; i < rept; i++){
try {
int s = rw.select(P, rng);
System.out.println("Element selected "+s+ ", P(s)="+P.get(s));
} catch (Exception ex) {
Logger.getLogger(RouletteWheel.class.getName()).log(Level.SEVERE, null, ex);
}
}
P.clear();
P.add(0.2);
P.add(0.0);
P.add(0.5);
P.add(0.0);
P.add(0.1);
P.add(0.2);
//rng = new Random();
for (int i = 0 ; i < rept; i++){
try {
int s = rw.select(P, rng);
System.out.println("Element selected "+s+ ", P(s)="+P.get(s));
} catch (Exception ex) {
Logger.getLogger(RouletteWheel.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
/**
* {#inheritDoc}
* #return
*/
#Override
public String toString()
{
return "Roulette Wheel Selection";
}
}
Below an execution sample for a proba vector P=[0.2,0.1,0.6,0.1],
WheelElements = [0,1,2,3]:
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 1, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 3, P(s)=0.1
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 0, P(s)=0.2
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
Element selected 2, P(s)=0.6
The code also tests a roulette wheel with zero probability.
I am afraid that anybody using the in built random number generator in all programming languages must be aware that the number generated is not 100% random.So should be used with caution.
Random Number Generator pseudo code
add one to a sequential counter
get the current value of the sequential counter
add the counter value by the computer tick count or some other small interval timer value
optionally add addition numbers, like a number from an external piece of hardware like a plasma generator or some other type of somewhat random phenomena
divide the result by a very big prime number
359334085968622831041960188598043661065388726959079837 for example
get some digits from the far right of the decimal point of the result
use these digits as a random number
Use the random number digits to create random numbers between 1 and 38 (or 37 European) for roulette.

Resources