Genetic algorithm, cross over without duplicate data - algorithm

I'm creating a genetic algorithm and I have run into a problem. Let's take an example: I have a list of numbers, [2, 3, 6, 8, 9, 1, 4], which represents my data.
The best solution to my problem depends on the order of the numbers in the list. So I have two solutions: S1 [2, 3, 9, 8, 1, 6, 4] and S2 [1, 6, 4, 3, 9, 2, 8].
If I do a basic crossover with S1 and S2, I may obtain a child like [2, 3, 9, 8, 9, 2, 8], which is invalid because it duplicates some values (and drops others).
The question is: how can I perform evolution (that is, crossover) without duplicating the data?
Thanks.

You will need a crossover operator such as Ordered Crossover (OX1), which performs crossover without duplicating genes:
OX1:
A randomly selected portion of one parent is mapped to a portion
of the other parent. From the replaced portion on, the rest is filled
up by the remaining genes, where already present genes are omitted and
the order is preserved.
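A minimal Python sketch of OX1 as described above, assuming the genes are hashable and both parents are permutations of the same values. The function name `ordered_crossover` and the `rng` parameter are illustrative choices, not part of any library.

```python
import random

def ordered_crossover(p1, p2, rng=random):
    """Return one OX1 child built from parents p1 and p2."""
    n = len(p1)
    # Pick the slice [i, j) that is copied unchanged from the first parent.
    i, j = sorted(rng.sample(range(n + 1), 2))
    child = [None] * n
    child[i:j] = p1[i:j]
    used = set(p1[i:j])
    # Read the second parent starting just after the slice (wrapping around),
    # skipping genes that the child already contains.
    fill = [g for g in p2[j:] + p2[:j] if g not in used]
    # Fill the empty positions in the same wrap-around order.
    positions = list(range(j, n)) + list(range(0, i))
    for pos, gene in zip(positions, fill):
        child[pos] = gene
    return child

s1 = [2, 3, 9, 8, 1, 6, 4]
s2 = [1, 6, 4, 3, 9, 2, 8]
child = ordered_crossover(s1, s2)
```

Whatever slice is drawn, the child contains every value exactly once, because each gene comes either from the copied slice or from the filtered remainder of the second parent.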
You should take care with mutation too: an operator that overwrites a gene with an arbitrary value can introduce duplicates as well, so use one that only changes the gene order, such as Reverse Sequence Mutation (RSM).
In the reverse sequence mutation operator, we take a sequence S
limited by two positions i and j randomly chosen, such that i<j.
The gene order in this sequence will be reversed by the same way as
what has been covered in the previous operation.
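As a rough illustration of RSM, here is a short Python sketch. It assumes the chromosome is a list with at least two genes; the name `reverse_sequence_mutation` and the `rng` parameter are made up for this example.

```python
import random

def reverse_sequence_mutation(perm, rng=random):
    """Return a copy of perm with a random segment [i, j] reversed."""
    n = len(perm)
    # Two distinct positions, ordered so that i < j.
    i, j = sorted(rng.sample(range(n), 2))
    mutated = list(perm)
    mutated[i:j + 1] = mutated[i:j + 1][::-1]
    return mutated

mutant = reverse_sequence_mutation([2, 3, 9, 8, 1, 6, 4])
```

Since the operator only reorders existing genes, the result is always a valid permutation of the input.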

Your problem uses permutation encoding; see this explanation: http://www.obitko.com/tutorials/genetic-algorithms/crossover-mutation.php
In general, you copy the elements of the first parent up to a crossover point, in the order in which they appear there, and then you take the remaining elements in the order in which they appear in the second parent.
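A small Python sketch of this single-point variant, assuming hashable genes and an explicit crossover point. The function name and the `cut` parameter are illustrative only.

```python
def single_point_permutation_crossover(p1, p2, cut):
    """Copy p1[:cut], then append the missing genes in p2's order."""
    head = list(p1[:cut])
    taken = set(head)
    tail = [g for g in p2 if g not in taken]
    return head + tail

s1 = [2, 3, 9, 8, 1, 6, 4]
s2 = [1, 6, 4, 3, 9, 2, 8]
print(single_point_permutation_crossover(s1, s2, 3))
# prints [2, 3, 9, 1, 6, 4, 8]
```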

Related

Histogram based search in SOLR (relevancy by position in multivalued field)

I'm trying to add histogram-based search to SOLR. For instance, we need to find the distribution closest to (or exactly matching) [1, 2, 3, 4]. So, most likely, we can use a multivalued field of ints.
The question is: how can we make relevance depend on each value's position in the multivalued field?
For example,
[1, 2, 3, 5]
is more relevant to [1, 2, 3, 4] (only the last element is different)
than
[1, 4, 3, 2], despite the fact that the numbers in this example are exactly the same; the positions of 2 elements are different.
On the other hand, we don't need an exact match on the elements, because we just need to find the closest one. All elements have the same weight.
Any thoughts?

Return an index of the most common element in a list of integers with equal probability using O(1) space

I came across this coding problem and am having a hard time coming up with a solution.
Given an array of integers, find the most common element
and return one of its indexes at random, each with equal probability. The solution must run in O(N) time and use O(1) space.
Example:
List contains: [-1, 4, 9, 7, 7, 2, 7, 3, 0, 9, 6, 5, 7, 8, 9]
7 is the most common element, so the output should be one of: 3, 4, 6, 12
Now this problem would be fairly trivial if not for the constant space constraint. I know reservoir sampling can be used to solve the problem with these constraints if we know the most common element ahead of time. But if we don't know the most common element, how could this problem be solved?

How to find the largest number in an array made by a ascendingly sorted array and a descendingly sorted array

For an array like [1, 2, 4, 6, 8, 7, 5], how do we efficiently find the largest number in it?
We know that the first part of the array, 1, 2, 4, 6, is sorted in ascending order and the second part, 8, 7, 5, is sorted in descending order.
The simple solution would be to iterate through the array, but given that the array is made of two sorted arrays, I would imagine the search can be done with some variation of binary search to achieve O(log n) runtime complexity. However, I cannot seem to come up with a solution.
What you are asking for is equivalent to finding the "peak" of an array. It can be solved in logarithmic time with a binary search.
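One way to sketch the peak search in Python is a binary search on the slope. This assumes a non-empty array that strictly increases and then strictly decreases (no equal neighbours); the function name `find_peak` is an illustrative choice.

```python
def find_peak(values):
    """Return the largest value of an ascending-then-descending array."""
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < values[mid + 1]:
            # Still on the ascending part: the peak is to the right of mid.
            lo = mid + 1
        else:
            # On the descending part (or at the peak): the peak is at mid or left of it.
            hi = mid
    return values[lo]

print(find_peak([1, 2, 4, 6, 8, 7, 5]))
# prints 8
```

Each iteration halves the search interval, so the loop runs O(log n) times. The sketch also handles arrays that are purely ascending or purely descending.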

Finding permutations for balanced distribution of values in lists

Sorry for the bad title, but I don't know what to call this.
I have K lists, N elements in each, for example:
[8, 5, 6]
[4, 3, 2]
[6, 5, 0]
and I want to find a permutation of each list's elements such that the sums of the elements in the first column, second column, etc. are as close to each other as possible (so the distribution is "fair").
In my example that would be (probably):
[8, 5, 6]
[4, 2, 3] -- the lists contain the same values
[0, 6, 5] just in different order
sums: 12, 13, 14
Is there some more elegant way than finding all the permutations for each list, and brute-force finding the "ideal" combination of them?
I'm not asking for code, just give me a hint how to do it, if you know.
Thanks!
P.S. The lists can be quite large, and there can be more of them: think ~20x~20 max.
If you can accept an approximation, I would do it iteratively:
Sort the matrix lines by descending weight (sum of the line's elements).
Edit: Sorting first by the max element in each line could be better.
Each time you add a new line to your result matrix, put its smaller elements into the columns whose running sums are higher.
Restore the lines of your result matrix to their initial order (if you have to).
It works with your example, but it will obviously not always be perfect.
Here is an example (javascript)
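The same greedy idea as a minimal Python sketch (not the answerer's JavaScript). It assumes rows of equal length, sorts rows by descending sum, and reads "smaller elements into higher columns" as "smaller elements into the columns with the larger running sums". The name `balance_columns` is illustrative.

```python
def balance_columns(rows):
    """Greedily permute each row so that column sums stay close together."""
    n_cols = len(rows[0])
    col_sums = [0] * n_cols
    result = [None] * len(rows)
    # Process the heaviest rows first, remembering each row's original index.
    order = sorted(range(len(rows)), key=lambda r: sum(rows[r]), reverse=True)
    for r in order:
        # Columns with the largest running sum receive the smallest values.
        cols_by_sum = sorted(range(n_cols), key=lambda c: col_sums[c], reverse=True)
        new_row = [0] * n_cols
        for c, value in zip(cols_by_sum, sorted(rows[r])):
            new_row[c] = value
            col_sums[c] += value
        # Writing to the original index restores the initial row order.
        result[r] = new_row
    return result, col_sums

balanced, sums = balance_columns([[8, 5, 6], [4, 3, 2], [6, 5, 0]])
print(sorted(sums))
# prints [12, 13, 14]
```

This is a heuristic, so it gives an approximation rather than a guaranteed optimum; on the example from the question it reaches the same column sums as the hand-made arrangement.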

How do I add a random offset to values in a Pseq?

Given a Pseq similar to the following:
Pseq([1, 2, 3, 4, 5, 6, 7, 8], inf)
How would I randomise the values slightly each time? That is, not just randomly alter the 8 values once at initialisation time, but have a random offset added each time a value is sent to the stream?
Here's a neat way:
(Pseq([1, 2, 3, 4, 5, 6, 7, 8], inf) + Pgauss(0, 0.1))
First you need to know that Pgauss is just a pattern that generates gaussian random numbers. You can use any other kind of pattern such as Pwhite.
Then you need to know the really pleasant bit: performing basic math operations on Patterns (as above) composes the patterns (by wrapping them in Pbinop).

Resources