How do you reach the parent population size in the new generation with a given crossover probability, mutation probability and an elitism size of one? - genetic-algorithm

For example, with a population size of 300, a crossover probability of 0.75 results in 224 chromosomes being selected as parents, and a mutation probability of 0.005 results in 2 chromosomes being mutated. With an elitism size of 1, we will have 227 offspring solutions. How will the remaining 73 chromosomes be generated to complete the population size of 300?

The common approach is to copy 73 chromosomes from the old population to complete the new one.
You could:
simply take the 73 highest-fitness chromosomes (this can cause premature convergence);
perform 73 tournament selections;
always copy the chromosomes not involved in crossover;
randomly select 73 chromosomes among the 300 of the old population;
...
The effectiveness of these strategies (meta-heuristics) greatly depends on the specific problem, and you have to experiment.
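For example, a minimal sketch of the tournament-selection option, assuming the old population is a list of chromosomes and fitness is a callable that scores a single chromosome (both names are placeholders):

import random

def fill_remaining(old_population, fitness, n_missing, tournament_size=2):
    """Copy n_missing chromosomes from the old population via tournament selection."""
    survivors = []
    for _ in range(n_missing):
        contenders = random.sample(old_population, tournament_size)
        survivors.append(max(contenders, key=fitness))
    return survivors

# e.g. new_population = elites + offspring + fill_remaining(old_population, fitness, 73)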

Related

Genetic Algorithm on modified knapsack problem

Let’s say, you are going to spend a month in the wilderness. The only thing you are carrying is a
backpack that can hold a maximum weight of 40 kg. Now you have different survival items, each
having its own “Survival Points” (which are given for each item in the table). Some of the items are so
essential that if you do not take them, you incur some additional penalty.
Here is the table giving details about each item.
Item            Weight  Survival Points  Penalty if not taken
Sleeping Bag      30        20              0
Rope              10        10              0
Bottle             5        20              0
Torch+Battery     15        25            -20
Glucose            5        30              0
Pocket Knife      10        15            -10
Umbrella          20        10              0
Formulate this as a genetic algorithm problem where your objective is to maximize the survival points.
Write how you would represent the chromosomes, fitness function, crossover, mutation, etc.
I am not sure what the fitness function should be. A simple fitness function I thought of is to add the survival points of the items we take and subtract the penalties of the items we don't take. But by doing this, the overall fitness of a particular chromosome can be negative as well.
Please tell me how I should proceed and what an appropriate fitness function would be in this case.
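For what it's worth, here is a minimal sketch of the fitness function you describe, with a 7-bit chromosome (one bit per item) and the 40 kg limit handled by an extra penalty; the overweight penalty factor of 10 is an arbitrary assumption:

ITEMS = [
    # (item, weight, survival points, penalty if not taken)
    ("Sleeping Bag",  30, 20,   0),
    ("Rope",          10, 10,   0),
    ("Bottle",         5, 20,   0),
    ("Torch+Battery", 15, 25, -20),
    ("Glucose",        5, 30,   0),
    ("Pocket Knife",  10, 15, -10),
    ("Umbrella",      20, 10,   0),
]
MAX_WEIGHT = 40

def fitness(chromosome):
    """chromosome: list of 7 bits, 1 = item taken, 0 = item left behind."""
    points = weight = 0
    for gene, (_, w, value, penalty) in zip(chromosome, ITEMS):
        if gene:
            points += value
            weight += w
        else:
            points += penalty                    # penalties in the table are <= 0
    if weight > MAX_WEIGHT:
        points -= 10 * (weight - MAX_WEIGHT)     # assumed penalty for overweight packs
    return points

A negative fitness is not a problem if you use tournament or rank-based selection, since those only compare fitness values; for roulette-wheel selection you can shift all fitness values by a constant so they become non-negative.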

A sudoku problem: Efficiently find or approximate probability distribution over chosen numbers at each index of an array with no repeats

I'm looking for an efficient algorithm to generate or iteratively approximate a solution to the problem described below.
You are given an array of length N and a finite set of numbers Si for each index i of the array. Now, if we are to place a number from Si at each index i to fill the entire array, while ensuring that each number is unique across the entire array, then, given all the possible arrays, what is the probability distribution over the numbers at each index?
Here I give an example:
Assuming we have the following array of length 3 with each column representing Si at the index of the column
4  4  4
   2  2
1  1  1
We will have the following possible arrays:
421
412
124
142
And the following probability distribution (rows correspond to the numbers 1, 2, 4; columns to the indices 1, 2, 3):
1:  0.5   0.25  0.25
2:  0     0.5   0.5
4:  0.5   0.25  0.25
Brute forcing this problem is obviously doable but I have a gut feeling that there must be some more efficient algorithms for this.
The reason why I think so is that one can derive the probability distribution from the set of all possibilities but not the other way around, so the distribution itself must contain less information than the set of all possibilities does. Therefore, I believe that we do not need to generate all possibilities just to obtain the probability distribution.
Hence, I am wondering if there is any smart matrix operation we could use for this problem or even fixed-point iteration/density evolution to approximate the end probability distribution? Some other potentially more efficient approaches to this problem are also appreciated.
(p.s. The reason why I am interested in this problem is that I wanted to generate a probability distribution over the candidate numbers for the empty cells in a sudoku and other sudoku-like games without a unique answer, by applying only the standard rules)
Sudoku is a combinatorial problem. It is easy to show that the probability of any independent cell is uniform (because you can relabel a configuration to put any number at a given position). The joint probabilities are more complicated.
If the game is partially filled you have constraints that will affect this distribution.
You must devise an algorithm to calculate the number of solutions from a given initial configuration. Then you compute the fraction of the total solutions that have a specific value at the position of interest.
counts = {}
for i in range(1, 10):
    board[cell] = i                      # try each candidate value in the cell
    counts[i] = countSolutions(board)    # the solution-counting routine described above

total = sum(counts.values())
prob = {i: counts[i] / total for i in range(1, 10)}
The same approach works for joint probabilities but in some cases the number of possibilities may be too high.
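Applied to the array example from the question, the same count-and-divide idea looks like this brute-force sketch (it enumerates every valid assignment, so it is not the efficient method the question is hoping for):

from itertools import product
from collections import Counter

def index_distributions(candidate_sets):
    """Tally, per index, how often each value occurs over all repeat-free assignments."""
    n = len(candidate_sets)
    counts = [Counter() for _ in range(n)]
    total = 0
    for assignment in product(*candidate_sets):
        if len(set(assignment)) == n:                  # reject assignments with repeats
            total += 1
            for i, value in enumerate(assignment):
                counts[i][value] += 1
    return [{v: c / total for v, c in counts[i].items()} for i in range(n)]

# The example from the question: S1 = {1, 4}, S2 = S3 = {1, 2, 4}
print(index_distributions([{1, 4}, {1, 2, 4}, {1, 2, 4}]))
# index 1 -> {1: 0.5, 4: 0.5}; indices 2 and 3 -> {1: 0.25, 2: 0.5, 4: 0.25}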

Given a list of 2D points and a square grid size, return the coordinate closest to the most points

Here's a summarized problem statement from an interview I had:
There is an n x n grid representing a city, along with a list of k
3-tuples (x, y, w), where (x, y) is the coordinate of an event,
and w is the "worth" of the event. You're also given a radius
r, which represents how far you can see. You derive happiness h from seeing an event, and h=w/d, where d is (1 + Euclidean distance to the event) (to account for 0 distance). If d is greater than r, then the happiness is 0. Output a coordinate (x,y) that has the highest cumulative happiness.
I didn't really know how to approach this problem other than brute forcing through each possible coordinate and calculating the happiness at each point, recording the max. I also thought about calculating the center of mass of the points and finding the closest integer coordinates to the center of mass, but that doesn't properly take into account the "worth" of the event.
What's the best way to approach this problem?
(I can't see an obvious best algorithm or data structure for this; it could be one of those questions where they wanted to hear your thought process more than your solution.)
Of the two obvious approaches:
Iterating over all locations and measuring the distance to all events to calculate the location's worth
Iterating over all events and adding to the worth of the locations in the circle around them
the latter seems to be the more efficient one. You're never looking at worthless locations, and to distribute the worth, you only need to calculate one octant of the circle and then mirror it for the rest of the circle.
You obviously need the memory space to store a rectangular grid of the locations' worth, so that's a consideration. And if you don't know the city size beforehand, you'd have to iterate over the input once just to choose the grid size. (In contrast, the first method would require almost no memory space).
Time complexity-wise, you'd iterate over k events, and for each of these you'd have to calculate the worth of a number of locations related to r². You can keep a running maximum while you're iterating over the events, so finding the maximum value doesn't add to the time complexity. (In the first method, you'd obviously have to calculate all the same w/(1 + distance) values, without the advantage of mirroring one octant of a circle, plus at least the distance to all the additional worthless locations.)
If the number of events and the affected regions around them are small compared to the city size, the advantage of the second method is obvious. If there are a large number of events and/or r is large, the difference may not be significant.
There may be some mathematical tricks to decide which events to check first, or which to ignore, or when to stop, but you'd have to know some more details for that, e.g. whether two events can happen at the same location. There could e.g. be an advantage in sorting the events by worth and looking at the events with the most worth first, because at some point it may become obvious that events outside of a "hot spot" around the current maximum can be ignored. But much would depend on the specifics of the data.
UPDATE
When distributing the worth of an event over the locations around it, you obviously don't have to calculate the distances more than once; e.g. if r = 3 you'd make this 7×7 grid with 1/(1 + d) weights (d being the Euclidean distance from the center):
0 0 0 0.250 0 0 0
0 0.261 0.309 0.333 0.309 0.261 0
0 0.309 0.414 0.500 0.414 0.309 0
0.250 0.333 0.500 1.000 0.500 0.333 0.250
0 0.309 0.414 0.500 0.414 0.309 0
0 0.261 0.309 0.333 0.309 0.261 0
0 0 0 0.250 0 0 0
Which contains only eight different values. Then you'd use this as a template to overlay on top of the grid at the location of an event, and multiply the event's worth with the weights and add them to each location's worth.
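As a rough sketch, that template can be generated directly from the 1/(1 + distance) rule (this version simply computes every cell rather than mirroring one octant):

import math

def weight_template(r):
    """(2r + 1) x (2r + 1) grid of 1/(1 + distance) weights, zero outside radius r."""
    size = 2 * r + 1
    grid = [[0.0] * size for _ in range(size)]
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            d = math.hypot(dx, dy)
            if d <= r:
                grid[dy + r][dx + r] = 1.0 / (1.0 + d)
    return grid

for row in weight_template(3):
    print(" ".join(f"{w:5.3f}" for w in row))   # reproduces the 7×7 grid above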
UPDATE
I considered the possibility that only locations with an event could be the location with the highest worth, and without the limit r that would be true. That would make the problem quite different. However, it's easy to create a counter-example; consider e.g. these events:
- - 60 - -
- - - - -
60 - - - 60
- - - - -
- - 60 - -
With a limit r greater than 4, they would create this worth in the locations around them:
61.92 73.28 103.3 73.28 61.92
73.28 78.54 82.08 78.54 73.28
103.3 82.08 80.00 82.08 103.3
73.28 78.54 82.08 78.54 73.28
61.92 73.28 103.3 73.28 61.92
And the locations with the highest worth 103.3 are the locations of the events. However, if we set the limit r = 2, we get:
40 30 60 30 40
30 49.7 30 49.7 30
60 30 80 30 60
30 49.7 30 49.7 30
40 30 60 30 40
And the location in the middle, which doesn't have an event, is now the location of maximum worth 80.
This means that locations without events, at least those within the convex hull around a cluster of events, have to be considered. Of course, if two clusters of events are found to be more than 2 × r away from each other, they can be treated as separate zones. In that case, you wouldn't have to create a grid for the whole city, but separate smaller grids around every cluster.
So the overall approach would be:
Create the square grid of size (2 × r + 1) with the weights.
Separate the events into clusters with a distance of more than 2 × r between them.
For each cluster of events, create the smallest rectangular grid that fits around the events.
For each event, use the weight grid to distribute worth over the rectangular grid.
While adding worth to locations, keep track of the maximum worth.
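Putting those steps together, here is a rough sketch that reuses the weight_template helper from the earlier sketch; for brevity it skips the clustering step, uses a dictionary instead of per-cluster rectangular grids, ignores the city boundaries, and takes the maximum at the end rather than while adding:

def best_location(events, r):
    """events: list of (x, y, w) tuples; returns ((x, y), cumulative_happiness)."""
    template = weight_template(r)
    worth = {}                                # sparse map from location to accumulated worth
    for ex, ey, w in events:
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                weight = template[dy + r][dx + r]
                if weight > 0.0:
                    loc = (ex + dx, ey + dy)
                    worth[loc] = worth.get(loc, 0.0) + w * weight
    return max(worth.items(), key=lambda item: item[1])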

Quick Sort Algorithms - Many different ways of doing the same thing?

Am I correct in saying that there would be many ways to perform a Quick Sort?
For argument's sake, let's use the first textbook's numbers:
20 47 12 53 32 84 85 96 45 18
This book says to swap the 18 and 20 (in the book the 20 is red and the 18 is blue, so I've bolded the 20).
Basically it keeps moving the blue pointer until the numbers are:
18 12 20 53 32 84 85 96 45 47
Now it says (and this is obvious to me) that all the numbers to the left of the 20 are less than it and all the numbers to the right are greater than it, but it never calls the 20 a "pivot", which is how most other resources describe it. Then, as all the other methods state, it does a quick sort on the two sides, and we end up with (it only covers sorting the right half of the list):
47 32 45 53 96 85 84 and the book ends. Now I know from the other resources that once all of the lists are in order they are put back together. I guess I understand this, but I am constantly confused by the one "Cambridge approved" textbook that differs from the second one. The second one talks about finding a pivot by picking the median.
What's the best way to find a "pivot" for a list?
What is given in your textbook is essentially the pivot-based concept, except that it doesn't use that terminology. But anyway, the concepts are the same.
What's the best way to find a "pivot" for a list?
There's no fixed way of selecting the pivot element. You can select any element of the array: first, second, last, etc. It can also be selected randomly for a given array.
But scientists and mathematicians generally talk about the median element, i.e. the middle element of the sorted list, for symmetry reasons, since it reduces the number of recursive calls.
It is almost obvious that when you select the first or the last element of the array, there will be more recursive calls, moving you closer to the worst-case scenario. The additional recursive calls come from separately performing quick-sort on the two (unbalanced) partitions.
Theoretically, choosing the median element as the pivot guarantees the smallest number of recursive calls and a Theta(n log n) running time.
However, finding this median is done with a selection algorithm, and if you want to guarantee that selection takes linear time, it needs the median-of-medians algorithm, which has poor constants.
If you choose the first (or last) element as the pivot, you are guaranteed to get poor performance for a sorted or almost-sorted array, which is a pretty likely input in many applications, so that's not a good choice either. Choosing the first/last element of the array is actually a bad idea.
A good solid solution is to choose the pivot at random: draw a random index r uniformly from [0, length(array)) and choose the r-th element as your pivot.
While there is a theoretical possibility of hitting the worst case here, it is:
Very unlikely
Hard for a malicious user to exploit, since predicting the worst-case input is difficult, especially if the random function and/or seed is unknown to them.
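For illustration, a minimal (not in-place) quicksort with a random pivot, run on the first textbook's numbers from the question:

import random

def quicksort(a):
    """Quicksort with a uniformly random pivot (returns a new sorted list)."""
    if len(a) <= 1:
        return a
    pivot = a[random.randrange(len(a))]
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([20, 47, 12, 53, 32, 84, 85, 96, 45, 18]))
# [12, 18, 20, 32, 45, 47, 53, 84, 85, 96]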

Crossover operator for permutations

I'm trying to solve the problem of crossover in a genetic algorithm on my permutations.
Let's say I have two permutations of 20 integers. I want to crossover them to get two children. Parents have the same integers inside, but the order is different.
Example:
Parent1:
5 12 60 50 42 21 530 999 112 234 15 152 601 750 442 221 30 969 113 134
Parent2:
12 750 42 113 530 112 5 234 15 60 152 601 999 442 221 50 30 969 134 21
Let it be that way - how can I get children of these two?
What you are looking for is ordered crossover. There is an explanation for the Travelling Salesman Problem here.
Here is some Java code that implements the partially mapped crossover (PMX) variant.
The choice of crossover depends on whether the order or the absolute position of the integers is important to the fitness. In HeuristicLab (C#) we have implemented several popular ones found in the literature which include: OrderCrossover (2 variants), OrderBasedCrossover, PartiallyMatchedCrossover, CyclicCrossover (2 variants), EdgeRecombinationCrossover (ERX), MaximalPreservativeCrossover, PositionBasedCrossover and UniformLikeCrossover. Their implementation can be found together with reference to a scientific source in the HeuristicLab.Encodings.PermutationEncoding plugin. The ERX makes sense only for the TSP or TSP-like problems. The CX is position-based, the PMX is partly position partly order based, but more towards position. The OX is solely order based.
Beware that our implementations assume a permutation numbered contiguously with integers from 0 to N-1. You have to map them to this range first.
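That mapping can be as simple as replacing each value by its rank in sorted order; a small sketch of one possible mapping:

def to_index_permutation(values):
    """Map distinct integers to a permutation of 0..N-1 by their rank in sorted order."""
    rank = {v: i for i, v in enumerate(sorted(values))}
    return [rank[v] for v in values]

def from_index_permutation(indices, values):
    """Invert the mapping: turn a 0..N-1 permutation back into the original values."""
    ordered = sorted(values)
    return [ordered[i] for i in indices]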
Based on my research and implementations of genetic operators, many types of crossover operators exist for the order coding (i.e. repetition of genes is not allowed, as in the TSP). In general, I like to think that there are two main families:
The ERX-family
A neighborhood list is used to store the neighbors of each node in both parents. Then, the child is generated using only this list. ERX is known to be more respectful and allele-transmitting, which basically means that the links between genes are not likely to be broken.
Examples of ERX-like operators include: Edge Recombination (ERX), Edge-2, Edge-3, Edge-4, and Generalized Partition Crossover (GPX).
OX-like crossovers
Two crossover points are chosen. Then, the genes between the points are swapped between the two parents. Since repetitions are not allowed, each crossover proposes a technique to avoid/eliminate repetitions. These crossover operators are more disruptive than ERX.
Examples of OX-like crossovers: Order Crossover (OX), Maximal Preservative Crossover (MPX), and Partially Mapped Crossover (PMX).
The first family (ERX) performs better in plain genetic algorithms, while the second family is better suited to a hybrid genetic algorithm or memetic algorithm (one that uses local search). This paper explains it in detail.
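As an illustration of the OX family, here is a minimal sketch of order crossover (OX) producing one child (swap the parent arguments for the second child); the two cut points are chosen at random:

import random

def order_crossover(parent1, parent2):
    """OX: copy a slice from parent1, fill the remaining positions in parent2's order."""
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = parent1[a:b + 1]            # keep parent1's segment in place
    taken = set(child[a:b + 1])
    # genes from parent2, starting after the second cut point, skipping copied ones
    fill = [g for g in parent2[b + 1:] + parent2[:b + 1] if g not in taken]
    positions = list(range(b + 1, n)) + list(range(a))
    for pos, gene in zip(positions, fill):
        child[pos] = gene
    return child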
In the Traveling Salesman Problem (TSP), you want to find the order in which to visit a list of cities, and you want to visit each city exactly once. If you encode the cities directly in the genome, then a naive crossover or mutation will often generate an invalid itinerary.
I once came up with a novel approach to solving this problem: Instead of encoding the solution directly in the genome, I instead encoded a transformation that would re-order a canonical list of values.
Given the genome [1, 2, 4, 3, 2, 4, 1, 3], you'd start with the list of cities in some arbitrary order, say alphabetical:
Atlanta
Boston
Chicago
Denver
You'd then take each pair of values from the genome and swap the cities in those positions. So, for the genome above, you'd swap the cities in positions 1 and 2, then those in positions 4 and 3, then those in positions 2 and 4, and finally those in positions 1 and 3. You'd end up with:
Denver
Chicago
Boston
Atlanta
With this technique, you can use any type of crossover or mutation operation and still always get a valid tour. If the genome is long enough, the entire solution space can be explored.
I've used this for TSP and other optimization problems with lots of success.
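For concreteness, a minimal sketch of the decoding step described above, using the same genome and city list as the example; crossover and mutation then operate on the genome itself:

def decode(genome, canonical):
    """Apply the genome's swap pairs (1-based positions) to a copy of the canonical list."""
    tour = list(canonical)
    for i, j in zip(genome[0::2], genome[1::2]):
        tour[i - 1], tour[j - 1] = tour[j - 1], tour[i - 1]
    return tour

cities = ["Atlanta", "Boston", "Chicago", "Denver"]
print(decode([1, 2, 4, 3, 2, 4, 1, 3], cities))
# ['Denver', 'Chicago', 'Boston', 'Atlanta']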
