A feature ranking algorithm

If I have the following partitions or subsets with the corresponding scores:
{X1,X2} with score C1
{X2,X3} with score C2
{X3,X4} with score C3
{X4,X1} with score C4
I want to write an algorithm that ranks the Xs based on the scores of the subsets they appear in.
One way, for example, would be to do the following:
X1 = (C1 + C4)/2
X2 = (C1 + C2)/2
X3 = (C2 + C3)/2
X4 = (C3 + C4)/2
and then sort the results.
Is there a more efficient or better way to do the ranking?
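The averaging scheme described in the question can be sketched as follows. This is only an illustration: the function name `rank_features` and the numeric scores are made up, and ties are broken by name, which is an arbitrary choice.

```python
from collections import defaultdict


def rank_features(scored_subsets):
    """Rank features by the mean score of the subsets they appear in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for features, score in scored_subsets:
        for feature in features:
            totals[feature] += score
            counts[feature] += 1
    means = {f: totals[f] / counts[f] for f in totals}
    # Highest mean first; ties are broken by feature name for a stable order.
    return sorted(means.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores standing in for C1..C4.
subsets = [
    ({"X1", "X2"}, 9),  # C1
    ({"X2", "X3"}, 5),  # C2
    ({"X3", "X4"}, 1),  # C3
    ({"X4", "X1"}, 7),  # C4
]
ranking = rank_features(subsets)
```

Each subset is visited once, so the work is linear in the total size of the subsets plus one final sort.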

If you think that the score of a set is the sum of the scores of each object, you can write your equations in matrix form as:
C = M * X
where C is a vector of length 4 with components C1, C2, C3, C4, and M is the matrix (in your case; as I understand it, this may vary)
1 1 0 0
0 1 1 0
0 0 1 1
1 0 0 1
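and X is the unknown. You can then use Gaussian elimination to determine X and then get the ranking as you suggested. Note that for this particular M the rows are linearly dependent (row 1 + row 3 = row 2 + row 4), so the system has no unique solution; in that case an extra constraint or a least-squares/minimum-norm solution is needed.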
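Whether elimination yields a unique X depends on the rank of M. A minimal sketch of a rank check using exact fractions is shown here; the helper name `matrix_rank` is an assumption for illustration, not part of the answer.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column entry from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


M = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]
has_unique_solution = matrix_rank(M) == len(M[0])
```

For the matrix in this question the rank is smaller than the number of unknowns, so X1..X4 are not uniquely determined by C1..C4 alone.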

Related

2D Matrix Problem - how many people can get a color that they want?

Given a bitarray such as the following:
     C0 C1 C2 C3 C4 C5
   *********************
P0 *  0  0  1  0  1  0 *
P1 *  0  1  0  0  1  0 *
P2 *  0  0  0  1  1  0 *
P3 *  1  0  0  0  0  1 *
P4 *  0  0  0  0  0  0 *
P5 *  0  0  0  0  0  0 *
P6 *  1  0  0  0  0  0 *
   *********************
Each row represents a different person P_i, while each column represents a different color C_j. If a given cell A[i][j] is 1, it means that person i would like color j. A person can only get one color, and a color can only be given to one person.
In general, the number of people P > 0, and the number of colors C >= 0.
How can I, time-efficiently, compute the maximum number of people who can get a color that they want?
The correct answer to the example above would be 5.
Person 6 (P6) only has one wish, so he gets color 0 (C0)
Since C0 is now taken, P3 only has one wish left, so he gets C5.
P0 gets C2, P1 gets C1 and P2 gets C3.
My first idea was a greedy algorithm that simply favored the person (i.e. row) with the fewest wanted colors. This works for the most part, but is too slow for my liking, as it runs in O(P*(P*C)) time, which is O(n^3) when n = P = C. Any ideas for an algorithm (or another data structure) that can solve the problem more quickly?
This might be a duplicate of another similar question, but I had trouble finding the correct name for this type of problem, so bear with me if this is the case.
This is a classical problem known as maximum cardinality bipartite matching. Here, you have a bipartite graph with the vertices corresponding to the people on one side and the vertices corresponding to the colors on the other. An edge between a person and a color exists if there is a one in the corresponding entry of the matrix.
In the general case, the best known algorithms have worst-case performance O(E*sqrt(V)), where E is the number of edges in the graph and V is the number of vertices. One such algorithm is Hopcroft-Karp. I would suggest reading the Wikipedia explanation that I linked.
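As a rough illustration of bipartite matching on the wish matrix, here is a sketch of the simpler augmenting-path method (often called Kuhn's algorithm). It is not Hopcroft-Karp: it runs in O(V*E) rather than O(E*sqrt(V)), but it shows the same underlying idea of re-assigning colors along augmenting paths. The function name `max_matching` is an assumption.

```python
def max_matching(wishes):
    """Maximum number of people that can each receive a distinct wanted color."""
    n_colors = len(wishes[0]) if wishes else 0
    owner = [-1] * n_colors  # owner[j] = person currently holding color j, or -1

    def try_assign(person, seen):
        # Try to give `person` a color, re-homing current owners if needed.
        for color in range(n_colors):
            if wishes[person][color] and not seen[color]:
                seen[color] = True
                if owner[color] == -1 or try_assign(owner[color], seen):
                    owner[color] = person
                    return True
        return False

    matched = 0
    for person in range(len(wishes)):
        if try_assign(person, [False] * n_colors):
            matched += 1
    return matched


# The example matrix from the question (rows P0..P6, columns C0..C5).
wishes = [
    [0, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
result = max_matching(wishes)
```

Unlike the greedy approach, an earlier assignment is revised when a later person needs the same color, which is what happens here when P6 asks for C0 after P3 has taken it.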

Discussion about how to retrieve an i-th element in the j-th level of a binary tree algorithm

I am solving some problems from a site called codefights, and the last one I solved was about a binary tree, described as follows:
Consider a special family of Engineers and Doctors. This family has
the following rules:
Everybody has two children. The first child of an Engineer is an
Engineer and the second child is a Doctor. The first child of a Doctor
is a Doctor and the second child is an Engineer. All generations of
Doctors and Engineers start with an Engineer.
We can represent the situation using this diagram:
               E
           /       \
         E           D
       /   \       /   \
      E     D     D     E
     / \   / \   / \   / \
    E   D D   E D   E E   D
Given the level and position of a person in the ancestor tree above,
find the profession of the person. Note: in this tree first child is
considered as left child, second - as right.
As there are some space and time restrictions, the solution cannot be based on actually constructing the tree down to the required level and checking which element is at the requested position. So far so good. My proposed solution, written in Python, was:
def findProfession(level, pos):
    size = 2**(level-1)
    shift = False
    while size > 2:
        # integer division keeps size and pos as ints under Python 3
        if pos <= size // 2:
            size //= 2
        else:
            size //= 2
            pos -= size
            shift = not shift
    if pos == 1 and shift == False:
        return 'Engineer'
    if pos == 1 and shift == True:
        return 'Doctor'
    if pos == 2 and shift == False:
        return 'Doctor'
    if pos == 2 and shift == True:
        return 'Engineer'
As it solved the problem, I got access to the solutions of other users, and I was astonished by this one:
def findProfession(level, pos):
    return ['Engineer', 'Doctor'][bin(pos-1).count("1")%2]
What is more, I did not understand the logic behind it, which is why I am asking this question. Could someone explain this algorithm to me?
Let's number the nodes of the tree in the following way:
1) the root has number 1
2) the first child of node x has number 2*x
3) the second child of node x has number 2*x+1
Now, notice that each time you go to the first child, the profession stays the same, and you add a 0 to the binary representation of the node.
And each time you go to the second child, the profession flips and you add a 1 to the binary representation.
Example: Let's find the profession of the 4th node in the 4th level (last level in the diagram you have in the question). First we start at the root with number 1, then we go to the first child with number 2 (10 binary). After that we go to the second child of 2 which is 5 (101 binary). Finally, we go to the second child of 5 which is 11 (1011 binary).
Notice that we started with only one bit equal to 1, then every 1 bit we added to the binary representation flipped the profession. So the number of times we flip a profession is equal to the (number of bits equal to 1) - 1. The parity of this amount decides the profession.
This leads us to the following solution:
X = number of bits equal to 1 in [ 2^(level-1) + pos - 1 ]
Y = (X-1) mod 2
if Y is 0 then the answer is "Engineer"
Otherwise the answer is "Doctor"
Since 2^(level-1) is a power of 2, it has exactly one bit equal to 1, and because pos - 1 < 2^(level-1), adding pos - 1 never carries into that bit. Therefore you can write:
X = number of bits equal to 1 in [ pos-1 ]
Y = X mod 2
Which is equal to the solution you mentioned in the question.
This type of sequence is known as the Thue-Morse sequence. Using the same tree, here is a demonstration of why it gives the correct answer:
p is the 0-indexed position
b is the binary representation of p
c is the number of 1's in b
Level 1
p: 0
   E
b: 0
c: 0

Level 2
p: 0    1
   E    D
b: 0    1
c: 0    1

Level 3
p: 0    1    2    3
   E    D    D    E
b: 0    1    10   11
c: 0    1    1    2

Level 4
p: 0    1    2    3    4    5    6    7
   E    D    D    E    D    E    E    D
b: 0    1    10   11   100  101  110  111
c: 0    1    1    2    1    2    2    3
c is always even for Engineer and odd for Doctor. Therefore:
index = bin(pos-1).count('1') % 2
return ['Engineer', 'Doctor'][index]
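To gain some confidence in the bit-count rule, one can compare it against a level built directly from the family rules. The sketch below does that for small levels only; `build_level` is a hypothetical helper and is exactly the brute-force construction that the real solution avoids.

```python
def find_profession(level, pos):
    # Parity of the number of 1 bits in the 0-indexed position.
    return ['Engineer', 'Doctor'][bin(pos - 1).count('1') % 2]


def build_level(level):
    """Brute force: expand the tree level by level from the family rules."""
    row = ['Engineer']
    for _ in range(level - 1):
        next_row = []
        for person in row:
            other = 'Doctor' if person == 'Engineer' else 'Engineer'
            # First child keeps the parent's profession, second child flips it.
            next_row.extend([person, other])
        row = next_row
    return row


def agrees_up_to(max_level):
    """True if both approaches give the same answer for every position."""
    for level in range(1, max_level + 1):
        row = build_level(level)
        for pos in range(1, len(row) + 1):
            if find_profession(level, pos) != row[pos - 1]:
                return False
    return True
```

Note that `level` is not needed by the bit-count rule at all: the profession depends only on the position within the level.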

Genetic algorithm to solve a quadratic equation

I have a problem understanding the process for genetic algorithms. I found examples of maximizing a function over an interval, and I think I understand them, but how can a genetic algorithm be used to solve, for example, a quadratic equation?
Assuming that we want to find a solution up to 4 digits, what is a proper representation to encode the numbers? What can be used as the fitness function to evaluate each number?
Any help is appreciated
If you want to solve a quadratic equation
a * x^2 + b * x + c = 0
then you need only one variable x as the representation. You can use
f(x) = abs(a * x^2 + b * x + c)
as the fitness function. It measures how far x is from being a root, so it needs to be minimized.
With only one variable it is hard to do crossover. You can use 10 numbers per individual and take their average to get x, or simply take the average of the two parents when doing crossover. For mutation, instead of completely overwriting x, you could multiply it by a random number between 0.5 and 2, for example.
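The suggestions in this answer (one real-valued x per individual, averaging as crossover, multiplicative mutation, abs(f(x)) as the quantity to minimize) can be put together in a minimal sketch. The population size, generation count, mutation rate, initial range [-10, 10], and the use of elitism are all assumptions chosen for illustration, not part of the answer.

```python
import random


def fitness(x, a, b, c):
    # Distance of the quadratic from zero; smaller is better.
    return abs(a * x * x + b * x + c)


def solve_quadratic_ga(a, b, c, pop_size=50, generations=200, seed=0):
    rng = random.Random(seed)
    # Assumed search range for the initial population.
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda x: fitness(x, a, b, c))
        history.append(fitness(population[0], a, b, c))
        parents = population[:pop_size // 2]  # keep the better half as parents
        children = [population[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = (p + q) / 2.0  # crossover: average of two parents
            if rng.random() < 0.3:
                child *= rng.uniform(0.5, 2.0)  # mutation: rescale x
            children.append(child)
        population = children
    best = min(population, key=lambda x: fitness(x, a, b, c))
    return best, history


best_x, best_history = solve_quadratic_ga(1, -1, -30)
```

Because the best individual is always carried over, the recorded best fitness can never get worse from one generation to the next. A single run finds at most one root; which one depends on the random start.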
The first step is to choose a representation of solutions. The most widely used is binary encoding. For example, your x may look like this:
1 0 0 1 1 1 1 0 | 0 0 0 0 0 0 0 0 0 0 1 1 1
The first 8 bits encode the integer part of the number, and the remaining 13 bits encode the digits after the decimal point as an integer. In this example the binary string encodes the number 158.0007. (Note that 13 bits only cover 0 to 8191; covering every four-digit fraction up to 9999 would take 14 bits.)
Crossover may look like this. Take two parents:
1 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 - 158.0007
1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 - 225.0008
The simplest crossover operator is one-point crossover. You generate a random number from 1 to (length of string - 1). Up to this point you take the bits from the first string, and from this point on you take the bits from the second string. In this example we choose position 4 as the crossover point. The offspring will look like this:
1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 - 145.0008
Mutation flips some bits with a chosen probability.
The fitness function may be the value of the quadratic function at x (if you are trying to find a maximum), where x is obtained by decoding the bit string.
And some theory at the end. You have two sets. One set is the search space (the space of binary strings) and the second set is the solution space. An individual from the search space is decoded into a solution in the solution space (in our case, the value of x encoded by the binary string). The search space represents the genotype and the decoded solution is the phenotype. Genetic operators work on the search-space individual (the binary string in this case), while the fitness function uses the decoded solution.
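The encoding and operators from this answer can be reproduced in a few lines. This is a sketch under the answer's layout (8 integer bits followed by 13 bits holding the four decimal digits as an integer); the function names and the use of `random.Random` for mutation are assumptions.

```python
import random


def decode(bits, int_bits=8, frac_digits=4):
    """Decode a genotype (list of 0/1) into the phenotype x."""
    int_part = int("".join(str(b) for b in bits[:int_bits]), 2)
    frac_part = int("".join(str(b) for b in bits[int_bits:]), 2)
    return int_part + frac_part / 10 ** frac_digits


def one_point_crossover(first, second, point):
    """Take bits before `point` from the first parent, the rest from the second."""
    return first[:point] + second[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


parent_a = [int(ch) for ch in "100111100000000000111"]  # 158.0007
parent_b = [int(ch) for ch in "111000010000000001000"]  # 225.0008
child = one_point_crossover(parent_a, parent_b, 4)      # 145.0008
mutant = mutate(child, 0.05, random.Random(1))
```

The operators only ever see the bit lists; `decode` is the single place where a genotype is turned into a value that a fitness function can evaluate.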
I've got one that solves the equation:
a(x1*x1+x2*x2)+b(x1+x2)+2*c = 0
which is the addition of:
ax1x1+bx1+c=0 and ax2x2+bx2+c=0
Since x1 and x2 are both solutions of the equation, the two equations can be added. For aa=1, bb=-1 and cc=-30 the code gives the following output:
best solutions at generation 0 :: fitness = 1
chromosome 13 : x1 = -5 , x2 = 6
chromosome 269 : x1 = 6 , x2 = 6
chromosome 340 : x1 = 6 , x2 = -5
chromosome 440 : x1 = -5 , x2 = 6
chromosome 452 : x1 = 6 , x2 = -5
chromosome 549 : x1 = -5 , x2 = -5
chromosome 550 : x1 = 6 , x2 = -5
chromosome 603 : x1 = -5 , x2 = -5
chromosome 826 : x1 = 6 , x2 = -5
chromosome 827 : x1 = -5 , x2 = 6
chromosome 842 : x1 = -5 , x2 = -5
chromosome 952 : x1 = 6 , x2 = 6
chromosome 986 : x1 = 6 , x2 = -5
which is, I believe, a good start. I just don't know yet how to filter the good solutions from the less good ones.
This is part of the code:
void objective(Chromosome* c){
    // The problem here is that when one root is found the fitness
    // will be 1, so the second value may be a non-root or the same
    // value as the first root.
    // So I probably need to rewrite the fitness function.
    c->result = aa * ((c->gene[0].geneticcode * c->gene[0].geneticcode) + (c->gene[1].geneticcode * c->gene[1].geneticcode))
              + bb * (c->gene[0].geneticcode + c->gene[1].geneticcode)
              + 2 * cc;
}
void fitness(Chromosome* c){
    // rewrite of the fitness function for this example
    c->fitness = 1.0 / (1.0 + fabs(c->result));
}
If anyone can improve this, and I'm sure someone can, please share.
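One possible way to separate pairs such as (6, -5) from pairs such as (6, 6) is to score a pair by Vieta's formulas (x1 + x2 = -b/a and x1 * x2 = c/a) instead of by the summed equation. This is not the poster's method, only a hedged sketch of an alternative fitness; `pair_fitness` is a made-up name and a != 0 is assumed.

```python
def pair_fitness(x1, x2, a, b, c):
    """Fitness in (0, 1]; equals 1 only when (x1, x2) are the two roots."""
    sum_error = abs((x1 + x2) + b / a)    # Vieta: x1 + x2 = -b/a
    product_error = abs(x1 * x2 - c / a)  # Vieta: x1 * x2 = c/a
    return 1.0 / (1.0 + sum_error + product_error)


# Same coefficients as in the output above: aa=1, bb=-1, cc=-30.
both_roots = pair_fitness(6, -5, 1, -1, -30)
same_root_twice = pair_fitness(6, 6, 1, -1, -30)
```

With this scoring, a chromosome that repeats one root no longer reaches fitness 1, while a genuine double root (discriminant zero) still does.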

Counting number of quadruples of integers

I saw this question today where we need to count the number of quadruples of integers (X1, X2, X3, X4), such that Li ≤ Xi ≤ Ri for i = 1, 2, 3, 4 and X1 ≠ X2, X2 ≠ X3, X3 ≠ X4, X4 ≠ X1.
input:
Li Ri
1 4
1 3
1 2
4 4
output:
8
1 2 1 4
1 3 1 4
1 3 2 4
2 1 2 4
2 3 1 4
2 3 2 4
3 1 2 4
3 2 1 4
My initial thought was to use the Principle of Inclusion-Exclusion. I was able to find the number of unrestricted quadruples, but I am not able to figure out how to handle the remaining conditions to reach the final solution. I also came to know that this question can be solved using DFS.
How can we solve this question with Inclusion-Exclusion or DFS?
Inclusion/Exclusion will give you the number of quadruples, but won't give you the quadruples themselves.
Let Ai be the set of quadruples satisfying Lj<=Xj<=Rj for all j, with Xi=X(i+1) (where the indices are cyclic, so X5 means X1). In the example you provided,
A1 = { (1114), (1124), (2214), (2224), (3314), (3324) }
A2 = { (1114), (2114), (3114), (4114), (1224), (2224), (3224), (4224) }
A3 = { } (empty set)
A4 = { (4114), (4214), (4314), (4124), (4224), (4324) }
We also need the intersections of pairs of sets:
A1 cap A2 = { (1114), (2224) } (note first three numbers identical)
A1 cap A3 = { }
A1 cap A4 = { } (can't have X4=X1=X2)
A2 cap A3 = { }
A2 cap A4 = { (4114), (4224) }
A3 cap A4 = { }
Intersections of triples of sets:
A1 cap A2 cap A3 = { }
A1 cap A2 cap A4 = { }
A1 cap A3 cap A4 = { }
A2 cap A3 cap A4 = { }
And the intersection of all the sets:
A1 cap A2 cap A3 cap A4 = { }
Inclusion/exclusion in its complementary form tells us that
|intersection of complements of Ai| = |unrestricted quadruples|
- sum of |Ai| + sum of |Ai cap Aj| - sum of |Ai cap Aj cap Ak|
+ sum of |Ai cap Aj cap Ak cap Al|
where none of the indices i,j,k,l are equal. In your example,
|intersection of complements of Ai| = 4x3x2x1 - (6+8+0+6) + (2+0+0+0+2+0) - (0+0+0+0) + 0
= 24 - 20 + 4 - 0 + 0 = 8
In order to find the |Ai| and their intersections, you have to find intersections of intervals [Li,Ri] and multiply the lengths of intersections by the lengths of unrestricted intervals. For example,
|A1| = |[1234] cap [123]| x |[12]| x |[4]| = 3 x 2 x 1 = 6
|A2 cap A4| = |[123] cap [12]| x |[4] cap [1234]| = |[12]| x |[4]| = 2 x 1 = 2
I don't see what depth first search has to do with it in this approach.
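The inclusion/exclusion computation above can be sketched in code: for every subset of the four forbidden equalities, merge the positions that are forced equal, intersect their intervals, multiply the sizes, and add the product with sign (-1)^(subset size). The function name `count_quadruples` is an assumption, and the cyclic constraints are taken to be exactly those in the question.

```python
from itertools import combinations


def count_quadruples(bounds):
    """Count tuples with L_i <= X_i <= R_i and X_i != X_(i+1), indices cyclic."""
    n = len(bounds)  # 4 in the question
    equalities = [(i, (i + 1) % n) for i in range(n)]  # the events X_i = X_(i+1)
    total = 0
    for size in range(n + 1):
        for chosen in combinations(equalities, size):
            # group[i] labels the set of positions forced to share one value.
            group = list(range(n))
            for u, v in chosen:
                old, new = group[u], group[v]
                group = [new if g == old else g for g in group]
            ways = 1
            for label in set(group):
                members = [i for i in range(n) if group[i] == label]
                low = max(bounds[i][0] for i in members)
                high = min(bounds[i][1] for i in members)
                ways *= max(0, high - low + 1)  # size of the interval intersection
            total += (-1) ** size * ways
    return total


example = count_quadruples([(1, 4), (1, 3), (1, 2), (4, 4)])
```

Only 2^4 = 16 subsets are examined, so the running time does not depend on how wide the intervals are.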
It depends on whether the sets are disjoint or share elements. For n = 4 (quadruples, as you asked about), I think I got it down to between 1 and 4 iterations if we split the end values into four cases according to whether x_1 is a member of X2 and whether x_4 is a member of X3.
Example with three iterations:
input = {1,2,3}{1,2}{1,2,3}{3,4}
2 * (1)(12)(123)(3) = (1)(2)(1)(3) = 2 * 1 // x_1 ∈ X2, x_4 ∈ X3
2 * (1)(12)(123)(4) = (1)(2)(13)(4) = 2 * 2 // x_1 ∈ X2, x_4 ∉ X3
1 * (3)(12)(123)(4) = (3)(12)(12,3)(4) = 1 * (2 + 2) // x_1 ∉ X2, x_4 ∉ X3
Total = 10
Example with one iteration:
input = {1,2,3,4}{1,2,3,4}{1,2,3,4}{1,2,3,4} // x_1 ∈ X2, x_4 ∈ X3
12 * (1)(1234)(1234)(2) = (1)(2,34)(134)(2) = 12 * (3 + 4)
Total = 84

Looking for a generic, fast, low-memory algorithm to output N-out-of-M combinations of an array without repetitions

I have an array with players
$players = array('A','B','C','D','E','F');
and I want to get every possible top-three finishing order.
1st 2nd 3rd
A B C
A B D
...
C A B
C B A
...
F D E
F E D
I have a permutation algorithm, but this must be something else, since a full permutation gives 6 * 5 * 4 * 3 * 2 * 1 results and here there are only 6 * 5 * 4.
Here's some pseudo-code to print your 3 out of 6 combinations without repetition:
for i = 1 to 6
    for j = 1 to 6
        if (j != i)
            for k = 1 to 6
                if (k != i && k != j)
                    print(A[i], A[j], A[k])
                end if
            next k
        end if
    next j
next i
For the general k-of-n case see: Algorithm to return all combinations of k elements from n
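The nested loops above generalize to any k with a small recursive generator that yields one result at a time, so memory use stays proportional to k rather than to the number of results. This is a sketch; `ordered_selections` is a made-up name, and Python's built-in `itertools.permutations(items, k)` already provides the same behaviour.

```python
from itertools import permutations


def ordered_selections(items, k):
    """Yield every ordered selection of k distinct items, one at a time."""
    def extend(prefix, used):
        if len(prefix) == k:
            yield tuple(prefix)
            return
        for index, item in enumerate(items):
            if index not in used:
                used.add(index)
                prefix.append(item)
                yield from extend(prefix, used)
                # Undo the choice before trying the next candidate.
                prefix.pop()
                used.remove(index)
    yield from extend([], set())


players = ['A', 'B', 'C', 'D', 'E', 'F']
finishes = list(ordered_selections(players, 3))
same_as_builtin = finishes == list(permutations(players, 3))
```

Collecting the results into a list here is only for demonstration; iterating over the generator directly avoids storing all of them.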
Given your permutation algorithm, you can use it in two steps to get the desired permutations.
First, let's consider the following mapping. Given input as A1 A2 A3 A4 A5 ... An, a value b1 b2 b3 b4 b5 ... bn means select Ai if bi is 1 and not if it is 0.
With your input, for example:
0 0 1 1 0 1 -> C D F
0 1 0 0 1 1 -> B E F
Now your algorithm can go as follows:
Take n as the number of elements (in your case 6) and m as the number you want to choose.
Construct the following sequence:
0 0 0 ... 0   1 1 1 ... 1
\____ ____/   \____ ____/
     V             V
   n - m           m
Get all distinct permutations of the above sequence and for each:
    Find the m elements that are marked in the sequence
    Get all permutations of those m elements and for each:
        do whatever you want!
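The two-step scheme (first choose which m elements take part, then order them) can be sketched with the standard library. Here `itertools.combinations` stands in for enumerating the distinct permutations of the 0/1 mask, which is a substitution on my part; the function name `finishes_two_step` is also an assumption.

```python
from itertools import combinations, permutations


def finishes_two_step(items, m):
    """Step 1: pick m items (order ignored). Step 2: order the picked items."""
    for chosen in combinations(items, m):  # equivalent to one 0/1 mask with m ones
        for order in permutations(chosen):
            yield order


players = ['A', 'B', 'C', 'D', 'E', 'F']
two_step = list(finishes_two_step(players, 3))
```

The results come out grouped by the chosen trio rather than in lexicographic order, but the set of results is the same as with a direct 3-out-of-6 enumeration.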
Your problem is not finding all permutations of 6 elements.
Your problem is to choose 3 elements, and then go through their permutations.
The number of results = C(6,3)*3! = 6! / 3! = 6*5*4.
C(6,3) - for choosing 3 elements out of 6 (regardless of order).
3! - for ordering the 3 chosen elements.
This is exactly the number of results you should get (and you do).
However, you can use your permutation algorithm to get all permutations of the 6 elements.
Then, just ignore the last 3 elements and remove duplicates from the result.
I may be wrong, but I think you have the correct number of possible permutations here. You choose only 3 players from the array of 6 players. So for the first player you have 6 possibilities, for the second player you have 5 possibilities, and for the third player you have 4 possibilities.
If you decide to have 4 players at the end instead of 3, the number of possible permutations would be 6*5*4*3, and so on.
I hope my math is not too old!

Resources