Let me start with the version of genetic algorithm I am implementing. I apologize in advance for any terminology errors that I make here. Please feel free to correct me.
The chromosome for my problem is two dimensional. Three rows and thirty two columns. Essentially the alleles (values) are indexes that are contained by this chromosome.
How an Index is formulated
Each row and column (together) of the chromosome refer to a single gene. Each gene contains an integer value (0 - 30). A single column (I believe referred to as a genome) therefore refers to an index of a four dimensional array containing user data on which the fitness function operates.
This is what a chromosome would look like:
11 22 33 14 27 15 16 ...
3 29 1 7 18 24 22 ...
29 9 16 10 14 21 3 ...
e.g. column 0 ==> data[11][3][29]
where
11 -> (0, 0); 0th row, 0th column
3 -> (1, 0); 1st row, 0th column
29 -> (2, 0); 2nd row, 0th column
For completeness, the fitness function works as follows (for a single chromosome):
for the first 10 iterations (users 0 to 9):
    for each column (genome):
        consider gene value for the first row as the first index of data array
        consider gene value for the second row as the second index of data array
        consider gene value for the third row as the third index of data array
        so if the first column contains [11][3][29] and user = 0,
        it refers to data[0][11][3][29]
    SUM the data array values for all columns and save it
    Do the same for all iterations (users)
for the second 10 iterations (users 10 to 19):
    for each column (genome):
        consider gene value for the SECOND row as the FIRST index of data array
        consider gene value for the THIRD row as the SECOND index of data array
        consider gene value for the FIRST row as the THIRD index of data array
    SUM the data array values for all columns and save it
    Do the same for all iterations (users)
for the third 10 iterations (users 20 to 29):
    for each column (genome):
        consider gene value for the THIRD row as the FIRST index of data array
        consider gene value for the FIRST row as the SECOND index of data array
        consider gene value for the SECOND row as the THIRD index of data array
    SUM the data array values for all columns and save it
    Do the same for all iterations (users)
Out of the 30 (sum) values calculated so far, assign the minimum value as fitness value
to this chromosome.
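The description above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the names `fitness` and `ROTATIONS` are made up here, `data` is assumed to be a nested list indexed as `data[user][a][b][c]`, and the users are assumed to split into three equal blocks.

```python
# Row order used as (first, second, third) data index for each block of users.
ROTATIONS = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

def fitness(chromosome, data, users=30):
    """Return the minimum per-user sum for a 3-row chromosome (hypothetical helper)."""
    block = users // 3  # ten users per rotation in the setup described above
    sums = []
    for user in range(users):
        r0, r1, r2 = ROTATIONS[user // block]
        total = 0
        for col in range(len(chromosome[0])):
            a = chromosome[r0][col]
            b = chromosome[r1][col]
            c = chromosome[r2][col]
            total += data[user][a][b][c]
        sums.append(total)
    # "Maximize the minimum": the fitness is the worst user's sum.
    return min(sums)
```

The GA then prefers chromosomes with a larger return value, which is the max-min objective described in the question.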
I explain the fitness function here to make clear which optimization problem I am dealing with. I am sorry I could not formulate it in mathematical notation; anyone who can is more than welcome to comment. Essentially it is maximizing the minimum X, where X refers to the per-user sums of values contained in the data array. (The maximization happens over generations, as the chromosomes with the highest fitness are selected for the next generation.)
Q1) I am using a single random number generator for both the crossover and mutation probabilities. Generally speaking, is it correct to implement this with a single generator? I ask because I chose a crossover rate of 0.7 and a mutation rate of 0.01. My random number generator produces uniformly distributed integers between 0 and (2^31 - 1). If a generated number falls below the threshold for mutation, the same number also satisfies the threshold for crossover. Does this affect the evolution process?
NOTE: the highest number that the generator produces is 2147483647, and 1% of this value is 21474836. So whenever a number is less than 21474836, it indicates that this gene can be mutated, but it also indicates that crossover must be done. Shouldn't there be different generators?
Q2) I see that there is a relation between the genes in a column when calculating fitness. But when performing mutation, should all the genes be considered independent of each other, or should all the rows of a genome (column) be affected by a mutation together?
Explanation
As I learned it, in a binary string of e.g. 1000 bits where each bit corresponds to a gene, a mutation rate of 1% means that roughly 1 out of 100 bits gets flipped. In my case, however, the chromosome is 2D (3 rows, 32 columns). Should I consider all 96 genes independent of each other, or simply consider 32 genes and, whenever I need a flip, flip the whole column together? How does mutation work in a 2D chromosome?
Q3) Do I really have a correlation between rows here? I am a bit confused.
Explanation
I have a 2D chromosome whose column values together point to the data I have to use to calculate the fitness of this chromosome. The genetic algorithm manipulates chromosomes, whereas fitness is assigned by the data associated with the chromosome. My question is how a genetic algorithm should treat a 2D chromosome. Should there be a relation between the genes in a column? Can I get a reference to some paper/code where a 2D chromosome is manipulated?
I'm not sure I understood the chromosome structure, but it doesn't matter; the concepts are the same:
1 - You have a chromosome object whose individual genes you can access
2 - You have a fitness function, which takes a chromosome and outputs a value
3 - You have a selection function, which selects chromosomes to mate
4 - You have a crossover function, which generally takes 2 chromosomes, exchanges genes between them and outputs two new chromosomes
5 - You have a mutation operator, which acts randomly on the genes of a chromosome
So
Q1) You can use a single random generator; there's no problem at all. But why are you using integer numbers? It's much easier to generate a random number in [0, 1).
Q2) This is up to you, but generally the genes are mutated randomly, independently of each other (mutation happens after crossover, but I think you already know that).
EDIT: Yes, you should consider all 96 genes independent of each other. For each gene, identified by its 'row' and 'column', you modify (mutate) it with some probability p, so:
for each row in chromosome
    for each col in row
        val = random_between_0_and_1
        if val < p
            chromosome[row][col] = noise
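A runnable version of this per-gene mutation might look like the sketch below. The function name `mutate` and the parameter `allele_count` (31, matching gene values 0 to 30 from the question) are assumptions made for illustration, and "noise" is interpreted here as a fresh uniformly random allele.

```python
import random

def mutate(chromosome, p, rng, allele_count=31):
    """Mutate each of the rows x columns genes independently with probability p."""
    for row in range(len(chromosome)):
        for col in range(len(chromosome[row])):
            # A new random draw for every gene keeps the decisions independent.
            if rng.random() < p:
                chromosome[row][col] = rng.randrange(allele_count)
    return chromosome

rng = random.Random(42)
chromosome = [[rng.randrange(31) for _ in range(32)] for _ in range(3)]
mutate(chromosome, 0.01, rng)
```

With 96 genes and p = 0.01, about one gene per chromosome changes on average.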
Q4) It's up to you to decide what the fitness function will do. If this chromosome is 'good' or 'bad' at solving your problem, then you should return a value that reflects that.
All the random numbers you use would typically be generated independently, so whether you use one RNG or many doesn't matter. You should draw a new number for each gene in the crossover and mutation steps; if you reuse the same single random number for multiple purposes, you will limit the explorable solution space.
To make your algorithm easier to understand, generate uniformly distributed floats in [0, 1) as r() = rand()/2^31 (your rand() returns integers from 0 to 2^31 - 1); then you can express things simply as, for example,
if r() < 0.3
    mutate()
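As a small illustration of using one generator with a separate draw per decision, consider the sketch below. `RAND_MAX`, `rand_int` and `r` are hypothetical names; `rand_int` stands in for the asker's integer generator with range 0 to 2^31 - 1.

```python
import random

RAND_MAX = 2**31 - 1  # largest integer the asker's generator returns

def r(rand_int):
    """Map one integer draw in [0, RAND_MAX] to a float in [0, 1)."""
    return rand_int() / (RAND_MAX + 1)

rng = random.Random(7)
rand_int = lambda: rng.randint(0, RAND_MAX)

# One generator, but a separate draw for each decision.
do_crossover = r(rand_int) < 0.7
do_mutation = r(rand_int) < 0.01
```

Because the two comparisons use different draws, a number that triggers mutation says nothing about whether crossover happens.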
I don't understand your other questions. Please rewrite them.
One improvement you can make regarding the mutation and crossover probabilities is to build a GA that chooses these probabilities by itself. Because using fixed probabilities (or probabilities that follow some function of the number of runs) is always arbitrary, encode your operators inside the chromosomes.
For example, suppose you have two operators. Add a bit to the end of the chromosome, where 1 encodes mutation and 0 encodes crossover. When you apply operators to the parents, you obtain children that carry the code for the operator to apply. In this way, the GA performs a double search: in the space of solutions and in the space of operators. The choice of operators is determined by the nature of your problem and by the concrete conditions of the run. During the calculation, the probabilities of both operators change automatically to maximize your objective function.
The same applies to an arbitrary number of operators; you simply need more bits to encode them. I generally use three operators (three for crossover and one for mutation) and this mechanism works fine.
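One possible reading of this scheme is sketched below. Everything here is an assumption made for illustration rather than the answerer's actual implementation: genes are kept as a flat list, the first parent's operator bit decides which operator runs, and the child inherits that bit with a small chance of flipping (`flip_rate`) so that both operators stay reachable.

```python
import random

MUTATION, CROSSOVER = 1, 0

def breed(parent_a, parent_b, rng, allele_count=31, flip_rate=0.05):
    """Create one child; each parent is a (genes, operator_bit) pair."""
    genes_a, op_a = parent_a
    genes_b, _ = parent_b
    if op_a == MUTATION:
        # Copy the first parent and replace one random gene.
        child = list(genes_a)
        child[rng.randrange(len(child))] = rng.randrange(allele_count)
    else:
        # One-point crossover between the two parents.
        cut = rng.randrange(1, len(genes_a))
        child = list(genes_a[:cut]) + list(genes_b[cut:])
    # The operator bit travels with the child, so selection acts on it too.
    child_op = op_a ^ 1 if rng.random() < flip_rate else op_a
    return child, child_op
```

If children produced by one operator tend to be fitter, their operator bit becomes more common in the population, which is the "double search" described above.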
Related
I'm looking for an efficient algorithm to generate or iteratively approximate a solution to the problem described below.
You are given an array of length N and a finite set of numbers Si for each index i of the array. Now, suppose we place a number from Si at each index i to fill the entire array, while ensuring that each number is unique across the entire array. Given all the possible arrays, what is the probability distribution over each number at each index?
Here I give an example:
Assume we have the following array of length 3, with each column representing Si at the index of that column (a dash marks a number that is not a candidate at that index):
4 4 4
- 2 2
1 1 1
We will have the following possible arrays:
421
412
124
142
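And the following probability distribution (one row per value, for 4, 2 and 1 from top to bottom, and one column per index; a dash marks a value that is not a candidate at that index):
0.5 0.25 0.25
-   0.5  0.5
0.5 0.25 0.25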
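For reference, the brute-force baseline for this example can be sketched as follows. The function name `marginals` is made up for illustration; it enumerates every combination of candidates and keeps those with pairwise distinct entries.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def marginals(candidate_sets):
    """Per-index distribution over values, taken over all arrays with distinct entries."""
    counts = [Counter() for _ in candidate_sets]
    total = 0
    for arr in product(*candidate_sets):
        if len(set(arr)) == len(arr):  # every number used at most once
            total += 1
            for idx, value in enumerate(arr):
                counts[idx][value] += 1
    return [{v: Fraction(c, total) for v, c in sorted(cnt.items())} for cnt in counts]

# The example above: index 0 allows {1, 4}; indexes 1 and 2 allow {1, 2, 4}.
print(marginals([{1, 4}, {1, 2, 4}, {1, 2, 4}]))
```

The running time grows with the product of the set sizes, which is exactly the cost the question hopes to avoid.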
Brute forcing this problem is obviously doable but I have a gut feeling that there must be some more efficient algorithms for this.
The reason I think so is that one can derive the probability distribution from the set of all possibilities but not the other way around, so the distribution itself must contain less information than the set of all possibilities. Therefore, I believe that we do not need to generate all possibilities just to obtain the probability distribution.
Hence, I am wondering if there is any smart matrix operation we could use for this problem, or even fixed-point iteration/density evolution to approximate the final probability distribution. Other potentially more efficient approaches to this problem are also appreciated.
(P.S. I am interested in this problem because I want to generate a probability distribution over candidate numbers for the empty cells in a sudoku, and in other sudoku-like games without a unique answer, by applying only the standard rules.)
Sudoku is a combinatorial problem. It is easy to show that the probability of any independent cell is uniform (because you can relabel a configuration to put any number at a given position). The joint probabilities are more complicated.
If the game is partially filled you have constraints that will affect this distribution.
You must devise an algorithm that calculates the number of solutions from a given initial configuration. Then you compute the fraction of all solutions that have a specific value at the position of interest.
counts = {}
for i in range(1, 10):
    board[cell] = i  # tentatively place digit i in the cell of interest
    counts[i] = countSolutions(board)  # number of solutions consistent with that placement
total = sum(counts.values())
prob = {i: counts[i] / total for i in counts}
The same approach works for joint probabilities but in some cases the number of possibilities may be too high.
Background:
This is extra credit in a logic and algorithms class. We are currently covering propositional logic (P implies Q, that kind of thing), so I think the professor wanted to give us an assignment out of our depth.
I will implement this in C++, but right now I just want to understand what's going on in the example... which I don't.
Example
Enclosed is a walkthrough for the Lefty algorithm which computes the number
of nxn 0-1 matrices with t ones in each row and column, but none on the main
diagonal.
The algorithm used to verify the equations presented counts all the possible
matrices, but does not construct them.
It is called "Lefty", it is reasonably simple, and is best described with an
example.
Suppose we wanted to compute the number of 6x6 0-1 matrices with 2 ones
in each row and column, but no ones on the main diagonal. We first create a
state vector of length 6, filled with 2s:
(2 2 2 2 2 2)
This state vector symbolizes the number of ones we must yet place in each
column. We accompany it with an integer which we call the "puck", which is
initialized to 1. This puck will increase by one each time we perform a ones
placement in a row of the matrix (a "round"), and we will think of the puck as
"covering up" the column that we won't be able to place ones in for that round.
Since we are starting with the first row (and hence the first round), we place
two ones in any column, but since the puck is 1, we cannot place ones in the
first column. This corresponds to the forced zero that we must place in the first
column, since the 1,1 entry is part of the matrix's main diagonal.
The algorithm will iterate over all possible choices, but to show each round,
we shall make a choice, say the 2nd and 6th columns. We then drop the state
vector by subtracting 1 from the 2nd and 6th values, and advance the puck:
(2 1 2 2 2 1); 2
For the second round, the puck is 2, so we cannot place a one in that column.
We choose to place ones in the 4th and 6th columns instead and advance the
puck:
(2 1 2 1 2 0); 3
Now at this point, we can place two ones anywhere but the 3rd and 6th
columns. At this stage the algorithm treats the possibilities differently: We
can place some ones before the puck (in the column indexes less than the puck
value), and/or some ones after the puck (in the column indexes greater than
the puck value). Before the puck, we can place a one where there is a 1, or
where there is a 2; after the puck, we can place a one in the 4th or 5th columns.
Suppose we place ones in the 4th and 5th columns. We drop the state vector
and advance the puck once more:
(2 1 2 0 1 0); 4
For the 4th round, we once again notice we can place some ones before the
puck, and/or some ones after.
Before the puck, we can place:
(a) two ones in columns of value 2 (1 choice)
(b) one one in the column of value 2 (2 choices)
(c) one one in the column of value 1 (1 choice)
(d) one one in a column of value 2 and one one in a column of value 1 (2
choices).
After we choose one of the options (a)-(d), we must multiply the listed
number of choices by one for each way to place any remaining ones to the right
of the puck.
So, for option (a), there is only one way to place the ones.
For option (b), there are two possible ways for each possible placement of
the remaining one to the right of the puck. Since there is only one nonzero value
remaining to the right of the puck, there are two ways total.
For option (c), there is one possible way for each possible placement of the
remaining one to the right of the puck. Again, since there is only one nonzero
value remaining, there is one way total.
For option (d), there are two possible ways to place the ones.
We choose option (a). We drop the state vector and advance the puck:
(1 1 1 0 1 0); 5
Since the puck is "covering" the 1 in the 5th column, we can only place
ones before the puck. There are (3 take 2) ways to place two ones in the three
columns of value 1, so we multiply 3 by the number of ways to get remaining
possibilities. After choosing the 1st and 3rd columns (though it doesn't matter
since we're left of the puck; any two of the three will do), we drop the state
vector and advance the puck one final time:
(0 1 0 0 1 0); 6
There is only one way to place the ones in this situation, so we terminate
with a count of 1. But we must take into account all the multiplications along
the way: 1*1*1*1*3*1 = 3.
Another way of thinking of the varying row is to start with the first matrix,
focus on the lower-left 2x3 submatrix, and note how many ways there were to
permute the columns of that submatrix. Since there are only 3 such ways, we
get 3 matrices.
What I think I understand
This algorithm counts all possible 6x6 arrays with two 1's in each row and column and none on the descending diagonal.
Instead of constructing the matrices, it uses a "state_vector" filled with six 2's, representing how many ones must still be placed in each column, and a "puck" that represents the index of the diagonal and the current row as the algorithm iterates.
What I don't understand
The algorithm comes up with a value of 1 for each row except row 5, which is assigned a 3; at the end these values are multiplied to get the final result. These values are supposed to be the numbers of possible placements for each row, but there are many possibilities for row 1. Why was it given a 1, and why did the algorithm wait until row 5 to account for all the possible permutations?
Any help will be much appreciated!
I think what is going on is a tradeoff between doing combinatorics and doing recursion.
The algorithm is using recursion to add up all the counts for each choice of placing the 1's. The example considers a single choice at each stage, but to get the full count it needs to add the results for all possible choices.
Now it is quite possible to get the final answer simply using recursion all the way down. Every time we reach the bottom we just add 1 to the total count.
The normal next step is to cache the result of calling the recursive function as this greatly improves the speed. However, the memory use for such a dynamic programming approach depends on the number of states that need to be expanded.
The combinatorics in the later stages is making use of the fact that once the puck has passed a column, the exact arrangement of counts in the columns doesn't matter so you only need to evaluate one representative of each type and then add up the resulting counts multiplied by the number of equivalent ways.
This both reduces the memory use and improves the speed of the algorithm.
Note that you cannot use combinatorics for counts to the right of the puck, as for these the order of the counts is still important due to the restriction about the diagonal.
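To make the recursion-plus-caching baseline concrete, here is a minimal sketch. It is not the Lefty algorithm itself: it memoizes on the full state vector and leaves out the symmetry shortcut for columns left of the puck. The name `count_matrices` is made up for illustration.

```python
from functools import lru_cache
from itertools import combinations

def count_matrices(n, t):
    """Count n x n 0-1 matrices with t ones per row and column and a zero diagonal."""
    @lru_cache(maxsize=None)
    def rec(row, state):
        # state[c] = number of ones still to be placed in column c
        if row == n:
            return 1 if all(c == 0 for c in state) else 0
        # The "puck" covers column `row`; columns that are already full are skipped too.
        allowed = [c for c in range(n) if c != row and state[c] > 0]
        total = 0
        for cols in combinations(allowed, t):
            nxt = list(state)
            for c in cols:
                nxt[c] -= 1
            total += rec(row + 1, tuple(nxt))
        return total
    return rec(0, (t,) * n)

print(count_matrices(6, 2))
```

Every choice in a row is summed, which is the step the walkthrough skips when it follows a single choice per round. For t = 1 the function counts derangements, which gives an easy sanity check.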
P.S. The number of n*n matrices with two 1's in each row and column (and no diagonal entries) can also be computed with pure combinatorics as:
a(n) = Sum_{k=0..n} Sum_{s=0..k} Sum_{j=0..n-k} (-1)^(k+j-s)*n!*(n-k)!*(2n-k-2j-s)!/(s!*(k-s)!*(n-k-j)!^2*j!*2^(2n-2k-j))
According to OEIS.
I have an array of N elements (representing the N letters of a given alphabet), and each cell of the array holds an integer value, which is the number of occurrences of that letter in a given text. Now I want to randomly choose a letter from all of the letters in the alphabet, based on its number of appearances, with the following constraints:
If the letter has a positive (nonzero) value, then it can be always chosen by the algorithm (with a bigger or smaller probability, of course).
If a letter A has a higher value than a letter B, then it has to be more likely to be chosen by the algorithm.
Now, taking that into account, I've come up with a simple algorithm that might do the job, but I was wondering if there is a better way. This seems to be quite fundamental, and I think there might be cleverer ways to accomplish this more efficiently. This is the algorithm I thought of:
Add up all the frequencies in the array. Store the result in SUM
Choose a random value from 1 to SUM. Store it in RAN
While RAN > 0: starting from the first cell, visit each cell in the array (in order) and subtract the value of that cell from RAN
The last visited cell is the chosen one
So, is there a better thing to do than this? Am I missing something?
I'm aware most modern computers can compute this so fast I won't even notice if my algorithm is inefficient, so this is more of a theoretical question rather than a practical one.
I prefer an explained algorithm rather than just code for an answer, but If you're more comfortable providing your answer in code, I have no problem with that.
The idea:
Iterate through all the elements and set the value of each element as the cumulative frequency thus far.
Generate a random number between 1 and the sum of all frequencies
Do a binary search on the values for this number (finding the first value greater than or equal to the number).
Example:
Element A B C D
Frequency 1 4 3 2
Cumulative 1 5 8 10
Generate a random number in the range 1-10 (1+4+3+2 = 10, the same as the last value in the cumulative list), do a binary search, which will return values as follows:
Number Element returned
1 A
2 B
3 B
4 B
5 B
6 C
7 C
8 C
9 D
10 D
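The table above can be reproduced with Python's standard `itertools.accumulate` and `bisect.bisect_left`. The helper names below are made up for illustration.

```python
import bisect
import itertools
import random

def build_cumulative(frequencies):
    """Running totals, e.g. [1, 4, 3, 2] -> [1, 5, 8, 10]."""
    return list(itertools.accumulate(frequencies))

def pick_index(cumulative, number):
    """Index of the first cumulative value greater than or equal to number."""
    return bisect.bisect_left(cumulative, number)

def weighted_choice(elements, frequencies, rng):
    cumulative = build_cumulative(frequencies)
    number = rng.randint(1, cumulative[-1])  # inclusive on both ends
    return elements[pick_index(cumulative, number)]

print(weighted_choice("ABCD", [1, 4, 3, 2], random.Random()))
```

Building the cumulative list costs O(N) once; each later draw costs O(log N) if the list is kept around instead of being rebuilt as in this short sketch.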
The Alias Method has amortized O(1) time per value generated, but requires two uniforms per lookup. Basically, you create a table where each column contains one of the values to be generated, a second value called an alias, and a conditional probability of choosing between the value and its alias. Use your first uniform to pick any of the columns with equal likelihood. Then choose between the primary value and the alias based on your second uniform. It takes O(n log n) work to initially set up a valid table for n values, but after the table is built, generating values is constant time. You can download this Ruby gem to see an actual implementation.
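A compact sketch of the table construction and lookup is shown below. Note that it uses Vose's linear-time construction, which is a swapped-in variant rather than the O(n log n) setup mentioned above; the function names are made up for illustration.

```python
import random

def build_alias_table(weights):
    """Return (prob, alias) lists for the alias method (Vose's construction)."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]  # average column height is 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]  # chance of keeping s when column s is picked
        alias[s] = l         # otherwise return l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:  # leftovers are full columns
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def alias_sample(prob, alias, rng):
    col = rng.randrange(len(prob))  # first uniform: pick a column
    return col if rng.random() < prob[col] else alias[col]  # second uniform
```

For the frequencies 1, 4, 3, 2 used earlier, `build_alias_table([1, 4, 3, 2])` builds a four-column table, and each call to `alias_sample` then costs two random draws and one comparison.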
Two other very fast methods by Marsaglia et al. are described here. They have provided C implementations.
Given a bit array of fixed length and the number of 0s and 1s it contains, how can I arrange all possible combinations such that returning the i-th combinations takes the least possible time?
It is not important the order in which they are returned.
Here is an example:
array length = 6
number of 0s = 4
number of 1s = 2
possible combinations (6! / 4! / 2!)
000011 000101 000110 001001 001010
001100 010001 010010 010100 011000
100001 100010 100100 101000 110000
problem
1st combination = 000011
5th combination = 001010
9th combination = 010100
With a different arrangement such as
100001 100010 100100 101000 110000
001100 010001 010010 010100 011000
000011 000101 000110 001001 001010
it shall return
1st combination = 100001
5th combination = 110000
9th combination = 010100
Currently I am using an O(n) algorithm which tests for each bit whether it is a 1 or 0. The problem is that I need to handle lots of very long arrays (on the order of 10000 bits), so it is still very slow (and caching is out of the question). I would like to know if you think a faster algorithm may exist.
Thank you
I'm not sure I understand the problem, but if you only want the i-th combination without generating the others, here is a possible algorithm:
There are C(M,N) = M!/(N!(M-N)!) combinations of N bits set to 1 whose highest set bit is at position M or lower.
You want the i-th: you iteratively increment M until C(M,N)>=i
while( C(M,N) < i ) M = M + 1
That will tell you the highest bit that is set.
Of course, you compute the combination iteratively with
C(M+1,N) = C(M,N)*(M+1)/(M+1-N)
Once found, you have a problem of finding (i-C(M-1,N))th combination of N-1 bits, so you can apply a recursion in N...
Here is a possible variant with D=C(M+1,N)-C(M,N), and i=i-1 to make it start at zero
SOL=0
i=i-1
while(N>0)
    M=N
    C=1
    D=1
    while(i>=D)
        i=i-D
        M=M+1
        D=N*C/(M-N)
        C=C+D
    SOL=SOL+(1<<(M-1))
    N=N-1
RETURN SOL
This will require large integer arithmetic if you have that many bits...
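In Python, whose integers are arbitrary precision, the same idea can be sketched with `math.comb` (Python 3.8+). The function name `nth_combination` is made up, and this version recomputes binomials instead of updating them incrementally as the pseudocode does, so it is a readability sketch rather than a tuned implementation.

```python
from math import comb

def nth_combination(i, length, ones):
    """Return the i-th (1-based) bit pattern with `ones` set bits, as an integer."""
    if not 1 <= i <= comb(length, ones):
        raise ValueError("i is out of range")
    i -= 1  # work with a zero-based rank
    result = 0
    for k in range(ones, 0, -1):
        # Find the highest position m with comb(m, k) <= i.
        m = k - 1
        while comb(m + 1, k) <= i:
            m += 1
        result |= 1 << m
        i -= comb(m, k)  # rank among the combinations of the remaining k-1 bits
    return result

print(format(nth_combination(5, 6, 2), "06b"))
```

The resulting order matches the first arrangement listed in the question (000011, 000101, 000110, ...).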
If the ordering doesn't matter (it just needs to remain consistent), I think the fastest thing to do would be to have combination(i) return anything you want that has the desired density the first time combination() is called with argument i. Then store that value in a member variable (say, a hashmap that has the value i as key and the combination you returned as its value). The second time combination(i) is called, you just look up i in the hashmap, figure out what you returned before and return it again.
Of course, when you're returning the combination for argument(i), you'll need to make sure it's not something you have returned before for some other argument.
If the number you will ever be asked to return is significantly smaller than the total number of combinations, an easy implementation for the first call to combination(i) would be to make a value of the right length with all 0s, randomly set num_ones of the bits to 1, and then make sure it's not one you've already returned for a different value of i.
Your problem appears to be constrained by the binomial coefficient. In the example you give, the problem can be translated as follows:
There are 6 items that can be chosen 2 at a time. Using the binomial coefficient, the total number of unique combinations can be calculated as N! / (K! (N - K)!), which for the case of K = 2 simplifies to N(N-1)/2. Plugging 6 in for N, we get 15, which is the same number of combinations that you calculated with 6! / 4! / 2! - which appears to be another way to calculate the binomial coefficient that I have never seen before. I have tried other combinations as well and both formulas generate the same number of combinations. So, it looks like your problem can be translated to a binomial coefficient problem.
Given this, it looks like you might be able to take advantage of a class that I wrote to handle common functions for working with the binomial coefficient:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
It should not be hard to convert this class to the language of your choice.
There may be some limitations since you are using a very large N that could end up creating larger numbers than the program can handle. This is especially true if K can be large as well. Right now, the class is limited to the size of an int. But, it should not be hard to update it to use longs.
I need to randomly generate an NxN matrix of integers in the range 1 to K inclusive such that all rows and columns individually have the property that their elements are pairwise distinct.
For example for N=2 and K=3
This is ok:
1 2
2 1
This is not:
1 3
1 2
(Notice that if K < N this is impossible)
When K is sufficiently larger than N an efficient enough algorithm is just to generate a random matrix of 1..K integers, check that each row and each column is pairwise distinct, and if it isn't try again.
But what about the case where K is not much larger than N?
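The generate-and-retry approach described above could be sketched as follows. The function names and the `max_tries` cap are assumptions added for illustration.

```python
import random

def rows_and_cols_distinct(matrix):
    """True if no row and no column contains a repeated value."""
    n = len(matrix)
    return (all(len(set(row)) == n for row in matrix)
            and all(len(set(col)) == n for col in zip(*matrix)))

def random_matrix_by_rejection(n, k, rng, max_tries=100000):
    """Draw uniform matrices over 1..k until one passes the check."""
    for _ in range(max_tries):
        matrix = [[rng.randint(1, k) for _ in range(n)] for _ in range(n)]
        if rows_and_cols_distinct(matrix):
            return matrix
    raise RuntimeError("no valid matrix found; k is probably too close to n")

print(random_matrix_by_rejection(3, 20, random.Random()))
```

Every valid matrix is equally likely to be returned, but the expected number of retries grows quickly as K approaches N, which is the case the question asks about.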
This is not a full answer, but a warning about an intuitive solution that does not work.
I am assuming that by "randomly generate" you mean with uniform probability on all existing such matrices.
For N=2 and K=3, here are the possible matrices, up to permutations of the set [1..K]:
1 2    1 2    1 2
2 1    2 3    3 1
(since we are ignoring permutations of the set [1..K], we can assume wlog that the first line is 1 2).
Now, an intuitive (but incorrect) strategy would be to draw the matrix entries one by one, ensuring for each entry that it is distinct from the other entries on the same line or column.
To see why it's incorrect, consider that we have drawn this:
1 2
x .
and we are now drawing x. x can be 2 or 3, but if we gave each possibility the probability 1/2, then the matrix
1 2
3 1
would get probability 1/2 of being drawn at the end, while it should have only probability 1/3.
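The bias can be checked exactly for this small case. The sketch below (function name made up for illustration) fixes the first row and computes the probability of each second row when every cell is drawn uniformly from the values still allowed.

```python
from fractions import Fraction

def sequential_fill_probs(first_row=(1, 2), k=3):
    """Probability of each second row (x, y) under cell-by-cell filling."""
    top_left, top_right = first_row
    probs = {}
    x_choices = [v for v in range(1, k + 1) if v != top_left]
    for x in x_choices:
        # y must differ from the value above it and from x on its left.
        y_choices = [v for v in range(1, k + 1) if v != top_right and v != x]
        for y in y_choices:
            probs[(x, y)] = Fraction(1, len(x_choices)) * Fraction(1, len(y_choices))
    return probs

print(sequential_fill_probs())
```

The second row (3, 1) gets probability 1/2 while (2, 1) and (2, 3) get 1/4 each, whereas a uniform draw would give each of the three matrices 1/3.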
Here is a (textual) solution. I don't think it provides good randomness, but nevertheless it could be OK for your application.
Let's generate a matrix in the range [0;K-1] (you will do +1 for all elements if you want to) with the following algorithm:
Generate the first line with any random method you want.
Each number will be the first element of a random sequence calculated in such a manner that you are guaranteed to have no duplicates in subsequent rows; that is, for any two distinct columns x and y, you will have x[i]!=y[i] for all i in [0;N-1].
Compute each row from the previous one.
All the algorithm is based on the random generator with the property I mentioned. With a quick search, I found that the Inversive congruential generator meets this requirement. It seems to be easy to implement. It works if K is prime; if K is not prime, see on the same page 'Compound Inversive Generators'. Maybe it will be a little tricky to handle with perfect squares or cubic numbers (your problem sound like sudoku :-) ), but I think it is possible by creating compound generators with prime factors of K and different parametrization. For all generators, the first element of each column is the seed.
Whatever the value of K, the complexity is only depending on N and is O(N^2).
Deterministically generate a matrix having the desired property for rows and columns. Provided K >= N, this can easily be done by starting the ith row with i, and filling in the rest of the row with i+1, i+2, etc., wrapping back to 1 after K. Other algorithms are possible.
Randomly permute columns, then randomly permute rows.
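These two steps could be sketched as follows. The function name is made up for illustration, and, as discussed for this answer, nothing here establishes that the result is uniformly distributed over all valid matrices.

```python
import random

def cyclic_then_shuffle(n, k, rng):
    """Build a cyclic matrix over 1..k, then permute its columns and rows."""
    if k < n:
        raise ValueError("impossible when k < n")
    # Row i starts at i + 1 and counts upward, wrapping back to 1 after k.
    matrix = [[(i + j) % k + 1 for j in range(n)] for i in range(n)]
    col_order = list(range(n))
    rng.shuffle(col_order)  # random column permutation
    matrix = [[row[c] for c in col_order] for row in matrix]
    rng.shuffle(matrix)     # random row permutation
    return matrix

print(cyclic_then_shuffle(4, 5, random.Random()))
```

The cost is O(N^2) regardless of K, and the result is valid even when K equals N.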
Let's show that permuting rows (i.e. picking up entire rows and assembling a new matrix from them in some order, with each row possibly in a different vertical position) leaves the desired properties intact for both rows and columns, assuming they were true before. The same reasoning then holds for column permutations, and for any sequence of permutations of either kind.
Trivially, permuting rows cannot change the property that, within each row, no element appears more than once.
The effect of permuting rows on a particular column is to reorder the elements within that column. This holds for any column, and since reordering elements cannot produce duplicate elements where there were none before, permuting rows cannot change the property that, within each column, no element appears more than once.
I'm not certain whether this algorithm is capable of generating all possible satisfying matrices, or if it does, whether it will generate all possible satisfying matrices with equal probability. Another interesting question that I don't have an answer for is: How many rounds of row-permutation-then-column-permutation are needed? More precisely, is any finite sequence of row-perm-then-column-perm rounds equivalent to a bounded number of (or in particular, one) row-perm-then-column-perm round? If so then nothing is gained by further permutations after the first row and column permutations. Perhaps someone with a stronger mathematics background can comment. But it may be good enough in any case.