Mutation for pipeline network optimization - genetic-algorithm

I'm working on pipeline network optimization, and I represent each chromosome as a string of numbers, as follows:
chromosome [1] = 3 4 7 2 8 9 6 5
Each number refers to a well, and the distances between wells are defined. A well cannot appear more than once in a chromosome. For example:
chromosome [1]' = 3 4 7 2 7 9 6 5 (not acceptable)
What is the best mutation operator for a representation like this? Thanks in advance.

I can't say "best", but one model that I've used for graph-like problems is this: for each node (well number), calculate the set of adjacent nodes / wells across the entire population, e.g.:
population = [[1,2,3,4], [1,2,3,5], [1,2,3,6], [1,2,6,5], [1,2,6,7]]
adjacencies = {
1 : [2] , #In the entire population, 1 is always only near 2
2 : [1, 3, 6] , #2 is adjacent to 1, 3, and 6 in various individuals
3 : [2, 4, 5, 6], #...etc...
4 : [3] ,
5 : [3, 6] ,
6 : [3, 2, 5, 7],
7 : [6]
}
choose_from_subset = [1,2,3,4,5,6,7] #At first, entire population
Then create a new individual / network by:
choose_next_individual(adjacencies, choose_from_subset):
    Sort adjacencies by the size of their associated sets
    From the choices in choose_from_subset, choose the well with the highest number of adjacent possibilities (e.g., either 3 or 6, both of which have 4 possibilities)
    If there is a tie (as there is with 3 and 6), choose among them randomly (let's say "3")
    Place the chosen well as the next element of the individual / network ([3])
    fewerAdjacencies = Remove the chosen well from the set of adjacencies (see below)
    new_choose_from_subset = adjacencies to your just-chosen well (i.e., 3 : [2,4,5,6])
    Recurse -- choose_next_individual(fewerAdjacencies, new_choose_from_subset)
The idea is that nodes with high numbers of adjacencies are ripe for recombination (since the population hasn't converged on, e.g., 1->2), a lower "adjacency count" (but non-zero) implies convergence, and a zero adjacency count is (basically) a mutation.
Just to show a sample run ..
#Recurse: After removing "3" from the population
new_graph = [3]
new_choose_from_subset = [2,4,5,6] #from 3 : [2,4,5,6]
adjacencies = {
1: [2] ,
2: [1, 6] ,
4: [] ,
5: [6] ,
6: [2, 5, 7] ,
7: [6]
}
#Recurse: "6" has most adjacencies in new_choose_from_subset, so choose and remove
new_graph = [3, 6]
new_choose_from_subset = [2, 5,7]
adjacencies = {
1: [2] ,
2: [1] ,
4: [] ,
5: [] ,
7: []
}
#Recurse: Amongst [2,5,7], 2 has the most adjacencies
new_graph = [3, 6, 2]
new_choose_from_subset = [1]
adjacencies = {
1: [] ,
4: [] ,
5: [] ,
7: []
}
#new_choose_from_subset contains only 1, so that's your next...
new_graph = [3,6,2,1]
new_choose_from_subset = []
adjacencies = {
4: [] ,
5: [] ,
7: []
}
#From here on out, you'd be choosing randomly between the rest, so you might end up with:
new_graph = [3, 6, 2, 1, 5, 7, 4]
Sanity-check? 3->6 occurs 1x in original, 6->2 appears 2x, 2->1 appears 5x, 1->5 appears 0, 5->7 appears 0, 7->4 appears 0. So you've preserved the most-common adjacency (2->1) and two other "perhaps significant" adjacencies. Otherwise, you're trying out new adjacencies in the solution space.
UPDATE: Originally I'd forgotten the critical point that when recursing, you choose the most-connected to the just-chosen node. That's critical to preserving high-fitness chains! I've updated the description.
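The procedure described above can be sketched in Python roughly as follows. This is only a minimal sketch: the function names `build_adjacencies` and `build_individual` are made up for illustration, and the fallback to a uniformly random unused well once the candidate set is empty is an assumption based on the sample run.

```python
import random
from collections import defaultdict


def build_adjacencies(population):
    """Map each well to the set of wells found next to it in any individual."""
    adjacencies = defaultdict(set)
    for individual in population:
        for a, b in zip(individual, individual[1:]):
            adjacencies[a].add(b)
            adjacencies[b].add(a)
    return dict(adjacencies)


def build_individual(population, rng=random):
    """Build one new permutation of wells, following well-connected nodes first."""
    adjacencies = build_adjacencies(population)
    candidates = set(adjacencies)  # at first, every well is a candidate
    result = []
    while adjacencies:
        pool = sorted(w for w in candidates if w in adjacencies)
        if pool:
            # Pick the candidate with the most remaining adjacencies; ties are random.
            best = max(len(adjacencies[w]) for w in pool)
            chosen = rng.choice([w for w in pool if len(adjacencies[w]) == best])
        else:
            # Assumption: with no candidates left, pick any unused well at random.
            chosen = rng.choice(sorted(adjacencies))
        result.append(chosen)
        # The next candidates are the remaining neighbours of the chosen well.
        candidates = adjacencies.pop(chosen)
        for neighbours in adjacencies.values():
            neighbours.discard(chosen)
    return result


population = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 6, 5], [1, 2, 6, 7]]
print(build_individual(population))  # one possible result: [3, 6, 2, 1, 5, 7, 4]
```

Because every well is removed from the adjacency table once it is placed, the result is always a permutation, so no well is duplicated.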

Related

minimum number of comparisons to find maximum and kth maximum of an arbitrary array of numbers

This question is an extension of the known problem below:
Find the minimum number of comparisons to find the maximum and minimum
of an arbitrary array of N numbers.
It turns out we can answer it using tournament tree construction or by using pairs construction (I got these methods from here).
I wondered what the answer to the following problem would be:
Find the minimum number of comparisons to find the maximum and the kth maximum of an arbitrary array of N numbers.
I don't know how to approach this problem. My idea is to look for some property of the winner/loser arrays at each level of the tournament method, like so:
If the array is 8 6 4 3 2 1 5 7, then the winner and loser arrays at each level would be:
winner_1 = [8, 4, 2, 7]
loser_1 = [6, 3, 1, 5]
winner_winner_2 = [8, 7]
loser_winner_2 = [4, 2]
winner_loser_2 = [6, 5]
loser_loser_2 = [3, 1]
winner_winner_winner_3 = [8]
winner_loser_winner_3 = [4]
winner_winner_loser_3 = [6]
winner_loser_loser_3 = [3]
loser_winner_winner_3 = [7]
loser_loser_winner_3 = [2]
loser_winner_loser_3 = [5]
loser_loser_loser_3 = [1]
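The winner and loser arrays above can be reproduced with a small helper. This is just a sketch; the name `play_round` is hypothetical, and it assumes the input has an even number of elements.

```python
def play_round(values):
    """Compare neighbouring pairs once; return (winners, losers, comparisons used)."""
    winners, losers = [], []
    for a, b in zip(values[0::2], values[1::2]):
        if a >= b:
            winners.append(a)
            losers.append(b)
        else:
            winners.append(b)
            losers.append(a)
    return winners, losers, len(winners)


winner_1, loser_1, used = play_round([8, 6, 4, 3, 2, 1, 5, 7])
print(winner_1, loser_1, used)  # [8, 4, 2, 7] [6, 3, 1, 5] 4
```

Calling `play_round` again on `winner_1` and on `loser_1` gives the level-2 arrays, and so on.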

How many times a number appears as a leaf node?

Suppose you have an array of n elements
A = {1,2,3,4,5}
A total of 5! binary search trees are possible (not necessarily distinct). My question is: in how many of these trees does 1 appear as a leaf node, in how many does 2 appear as a leaf node, and so on?
What I have tried:
I've seen for A = {1,2,3}
2 appears 6/3 = 2 times
1 appears 2+1 = 3 times
3 appears 2+1 = 3 times
Can I generalise that and say that,
if A= {1,2,3,4}
2 = 24/4 = 6 times
3 = 24/4 = 6 times
1 = 6+1 = 7 times
4 = 6+1 = 7 times
We can generalize, but not in that way.
You can permute the array and build the BST for each permutation. A brute-force approach that returns the answer in a map/dictionary shouldn't be hard. First write a function that, given one of the permuted arrays, finds all leaves: it takes the first element as the root, sends all elements less than the root to the left and all greater ones to the right, and calls itself recursively on both parts. It then returns the combined leaf sets.
In the end, combine values for all possible permutations.
A possible approach in python:
from itertools import permutations

def func(arr):
    # Returns the set of leaves of the BST built by inserting arr in order
    if not arr: return set()
    if len(arr)==1: return {arr[0]}
    ans = set()
    left = func([v for v in arr[1:] if v<arr[0]])
    right = func([v for v in arr[1:] if v>=arr[0]])
    ans.update(left)
    ans.update(right)
    return ans

arr = [1,2,3,4]
ans = {i:0 for i in arr}
for a in permutations(arr):
    dic = func(a)
    print(a,":",dic)
    for k in dic:
        ans[k]+=1
print(ans)
for [1,2,3] it outputs:
(1, 2, 3) : {3}
(1, 3, 2) : {2}
(2, 1, 3) : {1, 3}
(2, 3, 1) : {1, 3}
(3, 1, 2) : {2}
(3, 2, 1) : {1}
{1: 3, 2: 2, 3: 3}
For [1,2,3,4], showing only the last line (i.e. the answer):
{1: 12, 2: 8, 3: 8, 4: 12}
for [1,2,3,4,5], it is :
{1: 60, 2: 40, 3: 40, 4: 40, 5: 60}
Can you see the pattern? One last example, for [1,2,3,4,5,6]:
{1: 360, 2: 240, 3: 240, 4: 240, 5: 240, 6: 360}
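One way to read the pattern in these outputs: the smallest and largest values are leaves in n!/2 of the trees, and every other value in n!/3 of them. A value ends up as a leaf exactly when its immediate neighbours in sorted order were both inserted before it, which happens in one third of the permutations for an interior value and one half for an end value. The sketch below checks that reading against brute force; the function names are made up for illustration and it assumes the array is 1..n with n >= 2.

```python
from itertools import permutations
from math import factorial


def leaves(seq):
    """Leaves of the BST obtained by inserting seq in order."""
    if not seq:
        return set()
    if len(seq) == 1:
        return {seq[0]}
    root, rest = seq[0], seq[1:]
    return leaves([v for v in rest if v < root]) | leaves([v for v in rest if v > root])


def brute_force_counts(n):
    """Count, over all n! insertion orders, how often each value is a leaf."""
    counts = {k: 0 for k in range(1, n + 1)}
    for perm in permutations(range(1, n + 1)):
        for leaf in leaves(perm):
            counts[leaf] += 1
    return counts


def conjectured_counts(n):
    """n!/2 for the two end values, n!/3 for interior values (assumes n >= 2)."""
    total = factorial(n)
    return {k: total // 2 if k in (1, n) else total // 3 for k in range(1, n + 1)}


for n in range(2, 7):
    print(n, brute_force_counts(n) == conjectured_counts(n))
```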

Find local min based on the length of occurrences of successive means without falling in wrong min

1. Problem description
I have the following list of values: [10, 10, 10, 10, 5, 5, 5, 5, 7, 7, 7, 2, 4, 3, 3, 3, 10]. It is shown in the following picture.
What I want to do is find the minimum based on both the value of the element and its duration. From the previous list we can construct the following (key:val) pairs: [10:4, 5:4, 7:3, 2:1, 4:1, 3:3, 10:1], meaning we have 4 successive 10s followed by 4 successive 5s, 3 successive 7s, a single 2, a single 4, 3 successive 3s and a single 10.
Based on what I said, the local min is 5. But I don't want that: the local min should be 3. We don't select 2 because it occurs only once.
Do you have an idea how to solve this problem? Is there an existing method that can be used to solve it?
Of course we can sort the dictionary by values [10:4, 5:4, 7:3, 3:3, 10:1] and select the lowest key whose value is different from 1. Is that a good solution?
2. Selection criteria
must be a local min (find_local_min(prices))
must have the highest numbers of succession
the min succession must be > 1
And I am stuck, because now I want 3 as the local minimum, but it is repeated only 3 times, while 5 is repeated 4 times. I was testing whether my idea is correct, tried to find a counterexample, and shot myself in the foot.
3. source code
the following code extracts the minimums with the dictionary:
#!/usr/bin/env python
from collections import defaultdict

def find_local_min(prices):
    # Collect every run of equal values that starts with a drop below the previous value
    i = 1
    minPrices = []
    while i < len(prices):
        if prices[i] < prices[i-1]:
            minPrices.append(prices[i])
            j = i + 1
            while j < len(prices) and prices[j] == prices[j-1]:
                minPrices.append(prices[j])
                j += 1
            i = j
        else:
            i += 1
    return minPrices

if __name__ == "__main__":
    l = [10, 10, 10, 10, 5, 5, 5, 5, 7, 7, 7, 2, 4, 3, 3, 3, 10]
    minPrices = find_local_min(l)
    minPriceDict = defaultdict(int)
    for future in minPrices:
        minPriceDict[future] += 1
    print(minPriceDict)
As output it gives the following: defaultdict(<type 'int'>, {2: 1, 3: 3, 5: 4}). Based on this output the algorithm will select 5 as the min because it is repeated 4 successive times. But that's wrong! It should be 3. I really want to know how to solve this problem.
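The idea floated in the question (take the lowest value among the local minima that last longer than one step) could be sketched with a run-length encoding. The helper names `runs` and `lowest_repeated_local_min` and the `min_run` parameter are hypothetical. As in `find_local_min`, a run counts as a local minimum here when it drops below the run before it, which is an assumption about the intended definition.

```python
from itertools import groupby


def runs(values):
    """Run-length encode the list as (value, run_length) pairs, in order."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def lowest_repeated_local_min(values, min_run=2):
    """Lowest value among runs that drop below their predecessor and last >= min_run."""
    encoded = runs(values)
    candidates = [
        value
        for (previous, _), (value, length) in zip(encoded, encoded[1:])
        if value < previous and length >= min_run
    ]
    return min(candidates) if candidates else None


prices = [10, 10, 10, 10, 5, 5, 5, 5, 7, 7, 7, 2, 4, 3, 3, 3, 10]
print(runs(prices))
print(lowest_repeated_local_min(prices))  # 3
```

With this rule the single 2 is skipped because its run length is 1, and 3 is preferred over 5 because it is lower, regardless of which run is longer.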

Combine lists to the least possible amount of 2-dimensional lists

Sorry for the bad description in the title.
Consider a 2-dimensional list such as this:
list = [
[1, 2],
[2, 3],
[3, 4]
]
If I were to extract all possible "vertical" combinations of this list, for a total of 2*2*2=8 combinations, they would be the following sequences:
1, 2, 3
2, 2, 3
1, 3, 3
2, 3, 3
1, 2, 4
2, 2, 4
1, 3, 4
2, 3, 4
Now, let's say I remove some of these sequences. Let's say I only want to keep sequences which have either the number 2 in position #1 OR number 4 in position #3. Then I would be left with these sequences:
2, 2, 3
2, 3, 3
1, 2, 4
2, 2, 4
1, 3, 4
2, 3, 4
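For reference, the enumeration and filtering steps above can be reproduced with `itertools.product`. This is a minimal sketch; the variable names are illustrative, and `columns` is used instead of `list` to avoid shadowing the built-in.

```python
from itertools import product

columns = [
    [1, 2],
    [2, 3],
    [3, 4],
]

# All "vertical" combinations: one value picked from each row.
all_sequences = list(product(*columns))

# Keep sequences with 2 in position #1 OR 4 in position #3.
kept = [s for s in all_sequences if s[0] == 2 or s[2] == 4]

print(len(all_sequences), len(kept))  # 8 6
```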
The problem
I would like to re-combine these remaining sequences into the smallest possible number of 2-dimensional lists that together contain all of the sequences, no fewer and no more.
By doing so, the resulting 2-dimensional lists in this particular example would be:
list_1 = [
[2],
[2, 3],
[3, 4]
]
list_2 = [
[1],
[2, 3],
[4]
]
In this particular case, the resulting lists can be worked out by hand. But how would I go about it if there were thousands of sequences yielding hundreds of 2-dimensional lists? I have been trying to come up with a good algorithm for two weeks now, but I am getting nowhere near a satisfying result.
Divide et impera, or divide and conquer. If we have a logical expression, stating that the value at position x should be a or the value at position y should be b, then we have 3 cases:
a is the value at position x and b is the value at position y
a is the value at position x and b is not the value at position y
a is not the value at position x and b is the value at position y
So, first you generate all your scenarios; here you know that you have 3 of them.
Then you separate your cases and handle each of them in a sub-routine as if it were your main task. The philosophy behind divide et impera is to reduce your complex problem into several similar but less complex problems, until you reach triviality.
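A rough sketch of the three-way split for a single condition of the form "value a at position x OR value b at position y" follows. The function `split_cases` and its parameters are hypothetical, positions are 0-based, and it assumes `a` occurs in row `x` and `b` in row `y`.

```python
from itertools import product


def split_cases(columns, x, a, y, b):
    """Split 'a at x OR b at y' into three disjoint 2-dimensional lists."""
    def restricted(x_values, y_values):
        cols = [list(c) for c in columns]
        cols[x] = x_values
        cols[y] = y_values
        return cols

    not_a = [v for v in columns[x] if v != a]
    not_b = [v for v in columns[y] if v != b]
    cases = [
        restricted([a], [b]),      # a at x and b at y
        restricted([a], not_b),    # a at x and b not at y
        restricted(not_a, [b]),    # a not at x and b at y
    ]
    # Drop any case whose restriction left a row empty.
    return [c for c in cases if all(c)]


cases = split_cases([[1, 2], [2, 3], [3, 4]], 0, 2, 2, 4)
for case in cases:
    print(case)
```

For the example in the question this yields three lists rather than two. The first two cases differ only in their last row, so they can be merged into `[[2], [2, 3], [3, 4]]`, which gives the two lists shown in the question. Whether such merging always reaches the minimum is not something this sketch establishes.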

Loop through different sets of unique permutations

I'm having a hard time getting started laying out code for this problem.
I have a fixed amount of random numbers, in this case 8 numbers.
R[] = { 1, 2, 3, 4, 5, 6, 7, 8 };
These are going to be placed in 3 sets of numbers, with the only constraints being that each set contains at least one value and that each value can be used only once. Edit: all 8 numbers should be used.
For example:
R1[] = { 1, 4 }
R2[] = { 2, 8, 5, 6 }
R3[] = { 7, 3 }
I need to loop through all possible combinations of a set R1, R2, R3. Order is not important, so if the above example happened, I don't need
R1[] = { 4, 1 }
R2[] = { 2, 8, 5, 6 }
R3[] = { 7, 3 }
NOR
R1[] = { 2, 8, 5, 6 }
R2[] = { 7, 3 }
R3[] = { 1, 4 }
What is a good method?
I have in front of me Knuth Volume 4, Fascicle 3, Generating all Combinations and Partitions, section 7.2.1.5 Generating all set partitions (page 61 in fascicle).
First he details Algorithm H, Restricted growth strings in lexicographic order due to George Hutchinson. It looks simple, but I'm not going to dive into it just now.
On the next page under an elaboration Gray codes for set partitions he ponders:
Suppose, however, that we aren't interested in all of the partitions; we might want only the ones that have m blocks. Can we run this through the smaller collection of restricted growth strings, still changing one digit at a time?
Then he details a solution due to Frank Ruskey.
The simple solution (and certain to be correct) is to code Algorithm H filtering on partitions where m==3 and none of the partitions are the empty set (according to your stated constraints). I suspect Algorithm H runs blazingly fast, so the filtering cost will not be large.
If you're implementing this on an 8051, you might start with the Ruskey algorithm and then only filter on partitions containing the empty set.
If you're implementing this on something smaller than an 8051 and milliseconds matter, you can seed each of the three partitions with a unique element (a simple nested loop of three levels), and then augment by partitioning on the remaining five elements for m==3 using the Ruskey algorithm. You won't have to filter anything, but you do have to keep track of which five elements remain to partition.
The nice thing about filtering down from the general algorithm is that you don't have to verify any cleverness of your own, and you can change your mind later about your constraints without having to revise your cleverness.
I might even work a solution later, but that's all for now.
P.S. for the Java guppies: I discovered searching on "George Hutchison restricted growth strings" a certain package ca.ubc.cs.kisynski.bell with documentation for method growthStrings() which implements the Hutchison algorithm.
Appears to be available at http://www.cs.ubc.ca/~kisynski/code/bell/
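To illustrate the restricted-growth-string idea, here is a minimal Python sketch. It is not Knuth's Algorithm H or the Ruskey algorithm, just a simple recursive generator that only produces strings using exactly m block labels. The names `restricted_growth_strings` and `to_blocks` are made up for this example.

```python
def restricted_growth_strings(n, m):
    """Yield tuples a with a[0] == 0 and a[i] <= 1 + max(a[:i]) that use exactly m labels."""
    def extend(prefix, used):
        remaining = n - len(prefix)
        if remaining == 0:
            if used == m:
                yield tuple(prefix)
            return
        if used + remaining < m:
            return  # too few positions left to open the missing blocks
        for label in range(min(used + 1, m)):
            prefix.append(label)
            for result in extend(prefix, max(used, label + 1)):
                yield result
            prefix.pop()

    for result in extend([], 0):
        yield result


def to_blocks(items, rgs, m):
    """Turn a restricted growth string into m lists of items."""
    blocks = [[] for _ in range(m)]
    for item, label in zip(items, rgs):
        blocks[label].append(item)
    return blocks


numbers = [1, 2, 3, 4, 5, 6, 7, 8]
partitions = [to_blocks(numbers, s, 3) for s in restricted_growth_strings(8, 3)]
print(len(partitions))  # 966
```

Each unordered partition appears exactly once, because the first element always goes into block 0 and a new block can only be opened with the next unused label. That is why no reordering of R1, R2, R3 shows up as a separate result.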
Probably not the best approach but it should work.
Determine number of combinations of three numbers which sum to 8:
1,1,6
1,2,5
1,3,4
2,2,4
2,3,3
To find the above I started with:
6,1,1 then subtracted 1 from six and added it to the next column...
5,2,1 then subtracted 1 from second column and added to next column...
5,1,2 then started again at first column...
4,2,2 carry again from second to third
4,1,3 again from first...
3,2,3 second -> third
3,1,4
knowing that less than half is 2 all combinations must have been found... but since the list isn't long we might as well go to the end.
Now sort each list of 3 from greatest to least(or vice versa)
Now sort each list of 3 relative to each other.
Copy each unique list into a list of unique lists.
We now have all the combinations which add to 8 (five lists I think).
Now consider a list in the above set
6,1,1 all the possible combinations are found by:
8 pick 6, (since we picked six there is only 2 left to pick from) 2 pick 1, 1 pick 1
which works out to 28*2*1 = 56, it is worth knowing how many possibilities there are so you can test.
n choose r (pick r elements from n total options)
n C r = n! / [(n-r)! r!]
So now you have the total number of iterations for each component of the list for the first one it is 28...
Well picking 6 items from 8 is the same as creating a list of 8 minus 2 elements, but which two elements?
Well, if we remove 1,2 that leaves us with 3,4,5,6,7,8. Let's consider all groups of 2, starting with 1,2; the next would be 1,3, and so on. The following table is read column by column.
12
13 23
14 24 34
15 25 35 45
16 26 36 46 56
17 27 37 47 57 67
18 28 38 48 58 68 78
Summing the sizes of the above columns gives us 28. This only covered the first digit in the list (6,1,1); repeat the procedure for the second digit (a one), which is "2 choose 1". So of the two digits left over from the above list we pick one of two, and then for the last we pick the remaining one.
I know this is not a detailed algorithm but I hope you'll be able to get started.
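The two building blocks of this approach, listing the size triples that sum to 8 and counting selections with "n choose r", can be sketched as follows. The function names are hypothetical, and the generator produces each triple in non-decreasing order so that no sorting or de-duplication is needed.

```python
from math import factorial


def partitions_into_parts(total, parts, smallest=1):
    """Yield non-decreasing tuples of `parts` positive integers summing to `total`."""
    if parts == 1:
        if total >= smallest:
            yield (total,)
        return
    for first in range(smallest, total // parts + 1):
        for rest in partitions_into_parts(total - first, parts - 1, first):
            yield (first,) + rest


def choose(n, r):
    """n C r = n! / [(n-r)! r!]"""
    return factorial(n) // (factorial(n - r) * factorial(r))


print(list(partitions_into_parts(8, 3)))
# [(1, 1, 6), (1, 2, 5), (1, 3, 4), (2, 2, 4), (2, 3, 3)]
print(choose(8, 6) * choose(2, 1) * choose(1, 1))  # 56
```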
Turn the problem on its head and you'll find a straightforward solution. You've got 8 numbers that each need to be assigned to exactly one group; the "solution" is only a solution if at least one number got assigned to each group.
The trivial implementation would involve 8 for loops and a few IF's (pseudocode):
for num1 in [1,2,3]
    for num2 in [1,2,3]
        for num3 in [1,2,3]
            ...
                if ((num1==1) or (num2==1) or (num3 == 1) ... (num8 == 1)) and ((num1 == 2) or ... or (num8 == 2)) and ((num1 == 3) or ... or (num8 == 3))
                    Print Solution!
It may also be implemented recursively, using two arrays and a couple of functions. Much nicer and easier to debug/follow (pseudocode):
numbers = [1, 2, 3, 4, 5, 6, 7, 8]
positions = [0, 0, 0, 0, 0, 0, 0, 0]
function HandleNumber(i) {
    for position in [1,2,3] {
        positions[i] = position;
        if (i == LastPosition) {
            // Check if valid solution (it's valid if we got numbers in all groups)
            // and print solution!
        }
        else HandleNumber(i+1)
    }
}
The third implementation would use no recursion and a little bit of backtracking. Pseudocode, again:
numbers = [1,2,3,4,5,6,7,8]
groups = [0,0,0,0,0,0,0,0]
c_pos = 0 // Current position in Numbers array; We're done when we reach -1
while (c_pos != -1) {
    if (groups[c_pos] == 3) {
        // Back-track
        groups[c_pos]=0;
        c_pos=c_pos-1
    }
    else {
        // Try the next group
        groups[c_pos] = groups[c_pos] + 1
        // Advance to next position OR print solution
        if (c_pos == LastPosition) {
            // Check for valid solution (all groups are used) and print solution!
        }
        else
            c_pos = c_pos + 1
    }
}
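In Python, the "8 nested loops" idea can be written compactly with `itertools.product`. This is a sketch of the same brute-force assignment, not a translation of the pseudocode above line by line; the function name `assignments` and its parameters are made up for illustration.

```python
from itertools import product


def assignments(numbers, group_count=3):
    """Yield every way to assign each number to a group so that no group is empty."""
    groups = range(1, group_count + 1)
    for labels in product(groups, repeat=len(numbers)):
        if set(labels) == set(groups):  # every group received at least one number
            yield [[n for n, g in zip(numbers, labels) if g == wanted] for wanted in groups]


numbers = [1, 2, 3, 4, 5, 6, 7, 8]
print(sum(1 for _ in assignments(numbers)))  # 5796
```

Note that this treats the three groups as labelled, so the same split with R1 and R2 swapped is produced as a separate solution. If the order of the groups does not matter, those duplicates still need to be filtered out.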
Generate all combinations of subsets recursively in the classic way. When you reach the point where the number of remaining elements equals the number of empty subsets, then restrict yourself to the empty subsets only.
Here's a Python implementation:
def combinations(source, n):
    def combinations_helper(source, subsets, p=0, nonempty=0):
        if p == len(source):
            # Copy each subset so yielded results stay valid after backtracking
            yield [subset[:] for subset in subsets]
        elif len(source) - p == len(subsets) - nonempty:
            # Only enough items left to fill the empty subsets
            empty = [subset for subset in subsets if not subset]
            for subset in empty:
                subset.append(source[p])
                for combination in combinations_helper(source, subsets, p+1, nonempty+1):
                    yield combination
                subset.pop()
        else:
            for subset in subsets:
                newfilled = not subset
                subset.append(source[p])
                for combination in combinations_helper(source, subsets, p+1, nonempty+newfilled):
                    yield combination
                subset.pop()

    assert len(source) >= n, "Not enough items"
    subsets = [[] for _ in range(n)]
    for combination in combinations_helper(source, subsets):
        yield combination
And a test:
>>> for combination in combinations(range(1, 5), 2):
...     print(', '.join(map(str, combination)))
...
[1, 2, 3], [4]
[1, 2, 4], [3]
[1, 2], [3, 4]
[1, 3, 4], [2]
[1, 3], [2, 4]
[1, 4], [2, 3]
[1], [2, 3, 4]
[2, 3, 4], [1]
[2, 3], [1, 4]
[2, 4], [1, 3]
[2], [1, 3, 4]
[3, 4], [1, 2]
[3], [1, 2, 4]
[4], [1, 2, 3]
>>> len(list(combinations(range(1, 9), 3)))
5796
