equal value = equal rank - wolfram-mathematica

I would like to rank the elements of a list such that elements that have the same value also get the same rank:
list = {1, 2, 3, 4, 4, 5}
desired output:
ranks = {5, 4, 3, 2, 2, 1}
Ordering[] does almost what I want but assigns different ranks to the two instances of 4 in the list.

I am not sure this covers everything you have in mind, but the following code gives the desired output. It presupposes that the smallest value gets the highest rank number, and it should work with numerical values, or with anything else as long as you are OK with Mathematica's standard sorting order. The local variable dv is short for "distinct values".
FromListToRanks[k_List]:= Module[ {dv=Reverse[Union[k]]},
k /. Thread[dv -> Range[Length[dv]]] ]
FromListToRanks[list]
{5,4,3,2,2,1}
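For comparison, the same idea can be sketched in Python. This is only an illustration of the technique, not part of the Mathematica answer; the function name `ranks_from_list` is hypothetical, and the sketch assumes the values are hashable and mutually comparable.

```python
def ranks_from_list(values):
    # Distinct values, largest first, so the largest value gets rank 1.
    distinct = sorted(set(values), reverse=True)
    # Map each distinct value to its 1-based position in that ordering.
    rank_of = {value: rank for rank, value in enumerate(distinct, start=1)}
    # Equal values look up the same entry, so they share a rank.
    return [rank_of[value] for value in values]


print(ranks_from_list([1, 2, 3, 4, 4, 5]))  # [5, 4, 3, 2, 2, 1]
```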

Related

Algorithm for obtaining from a set all sequences of size N that include a given subset

To be more specific, the sequences that we want to output can have the elements of the input subset in a different order than the one in the input subset.
EXAMPLE: Let's say we have a set {1, 9, 1, 5, 6, 9, 0 , 9, 9, 1, 10} and an input subset {9, 1}.
We want to find all subsets of size 3 that include all elements of the input subset.
This would return the sets {1, 9, 1} , {9, 1, 5}, {9, 9, 1}, {9, 1, 10}.
Of course, the algorithm with the lowest complexity possible is preferred.
EDIT: Edited with better terminology. Also, here's what I considered, in pseudocode:
1. For each sequence s in the set of size n, do the following:
2. Create a list l that is a copy of the input subset i.
3. j = 0
4. For each element e in s, do the following:
5. Check if it's in i.
If it is, remove that element from l.
6. If l is empty, add s to the sequences to return,
skip to the next sequence (go to step 1)
7. Increment j.
8. if j == n, go to the next sequence (go to step 1)
This is what I came up with, but it takes a really awful amount of time
due to the fact that we consider EVERY sequence with no memory of
previously scanned ones whatsoever. I really don't have an idea on how to implement such memory, this is all very new to me.
You could simply find all occurrences of that subset and then produce all lists that contain that combination.
In python for example:
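def get_Allsubsets_withSubset(original_list, N, subset):
    subset_len = len(subset)
    list_len = len(original_list)
    index_of_occurrence = []
    # Number of window slots that are not taken up by the subset itself.
    overflow = N - subset_len
    # The +1 makes sure an occurrence at the very end of the list is found too.
    for index in range(list_len - subset_len + 1):
        if original_list[index: index + subset_len] == subset:
            index_of_occurrence.append(index)
    final_result = []
    for index in index_of_occurrence:
        for value in range(overflow + 1):
            i = index - value
            # Skip windows that would start before the list or run past its end.
            if i < 0 or i + N > list_len:
                continue
            final_result.append(original_list[i:i + N])
    return final_result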
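It's not beautiful, but I would say it's a start. Note that it only matches the subset when its elements appear next to each other and in the given order.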
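To also catch windows in which the subset's elements appear in a different order, one option is a sliding window that keeps element counts. The sketch below is a different technique from the answer's code, under the assumption that "sequence of size N" means N consecutive elements and that elements are hashable; the name `windows_containing` is hypothetical.

```python
from collections import Counter


def windows_containing(seq, n, subset):
    need = Counter(subset)          # how many copies of each element are required
    have = Counter()                # counts inside the current window
    missing = sum(need.values())    # required copies not yet present in the window
    result = []
    for right, x in enumerate(seq):
        have[x] += 1
        if have[x] <= need[x]:      # this copy satisfies a requirement
            missing -= 1
        left = right - n            # index of the element that leaves the window
        if left >= 0:
            y = seq[left]
            if have[y] <= need[y]:  # the leaving copy was satisfying a requirement
                missing += 1
            have[y] -= 1
        if right >= n - 1 and missing == 0:
            result.append(list(seq[right - n + 1:right + 1]))
    return result


data = [1, 9, 1, 5, 6, 9, 0, 9, 9, 1, 10]
print(windows_containing(data, 3, [9, 1]))
# [[1, 9, 1], [9, 1, 5], [9, 9, 1], [9, 1, 10]]
```

Each element enters and leaves the window once, so the scan is linear in the length of the list (plus the cost of copying the matching windows).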

Subset sum problem, break ties by initial ordering

Given a set of numbers, find the subset s.t. the sum of all numbers in the subset is equal to N. Break ties between all feasible subsets by the initial order of their elements. Assume that the numbers are integers, and there exists such subset that perfectly sums up to N.
For example, given an array [2, 5, 1, 3, 4, 5] and N = 6, the output needs to be {2, 1, 3}. Although {5, 1}, {2, 4} and {1, 5} are also subsets whose total sums up to 6, we need to return {2, 1, 3} according to the ordering of the numbers in the array.
For the classic problem I know how to do it with dynamic programming but to break ties I can't think of a better approach besides finding ALL possible subsets first and then choosing the one with the best ordering. Any ideas?
I will try to provide an interesting way of breaking ties.
Let's assign another value to the items; let the value of the i-th item be called v[i] and its weight w[i].
Let there be T items; the i-th item (1 <= i <= T) will have value v[i] = 2^(T - i) and weight w[i] = the number itself.
Among all the subsets whose total weight is N, we will look for the one that has the maximum accumulated value of the items. I have set the values as powers of two so as to guarantee that an item that comes first in the array is prioritized over all its successors: 2^(T - i) is larger than the sum of the values of all later items.
Here is a practical example:
Consider N = 8, as the limit weight that we have.
Items {8, 4, 2, 2}
Values {8, 4, 2, 1}
We will have two distinct subsets that have weight sum = 8, {8} and {4, 2, 2}.
But the first has accumulated value = 8, and the other one has accumulated_value = 7, so we will choose {8} over {4, 2, 2}.
The idea now is to formulate a dynamic programming that takes into account the total value.
DP[i][W] = maximum accumulated value among all subsets of the first i items that have total weight = W.
I will give a pseudocode for the solution
Set all DP[i][j] = -infinity
DP[0][0] = 0 // no items chosen, total weight 0
for(int weight = 0; weight <= N; ++weight )
{
    for(int item = 1; item <= T; ++item ) // items are numbered 1..T; row 0 means "no items"
    {
        DP[item][weight] = DP[item - 1][weight]; // not using the current item
        if( w[item] > weight ) continue; // the current item does not fit
        DP[item][weight] = max( DP[item][weight], DP[item - 1][weight - w[item]] + v[item] ); // using the current item
    }
}
Recovering the items after running the algorithm is straightforward.
DP[T][N] will be a sum of distinct powers of two (or -infinity if it is not possible to select items that sum up to N), and the i-th item belongs to the answer if and only if (DP[T][N] & v[i]) == v[i].
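A minimal Python sketch of this tie-breaking idea, assuming all numbers are positive integers. The function name `first_subset_with_sum` is hypothetical, and the sketch swaps the two-dimensional table for a one-dimensional one indexed by weight only, filled from high weights to low so that each item is used at most once. Python integers are unbounded, so the powers of two do not overflow here; in a language with fixed-width integers this would limit the number of items.

```python
def first_subset_with_sum(nums, target):
    count = len(nums)
    INFEASIBLE = -1                      # stands in for -infinity; real values are >= 0
    best = [INFEASIBLE] * (target + 1)   # best[s] = max accumulated value with total weight s
    best[0] = 0
    for i, weight in enumerate(nums):
        value = 1 << (count - 1 - i)     # earlier items get larger powers of two
        # Go downwards so the current item is not counted twice.
        for s in range(target, weight - 1, -1):
            if best[s - weight] != INFEASIBLE:
                best[s] = max(best[s], best[s - weight] + value)
    if best[target] == INFEASIBLE:
        return None
    mask = best[target]
    # Item i was chosen exactly when its bit is set in the accumulated value.
    return [x for i, x in enumerate(nums) if mask & (1 << (count - 1 - i))]


print(first_subset_with_sum([2, 5, 1, 3, 4, 5], 6))  # [2, 1, 3]
print(first_subset_with_sum([8, 4, 2, 2], 8))        # [8]
```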

algorithm to get all combinations of splitting up N items into K bins

Assuming I have a list of elements [1,2,3,4] and a number of bins (let's assume 2 bins), I want to come up with a list of all combinations of splitting up items 1-4 into the 2 bins. The solution should look something like this:
[{{1}, {2,3,4}}, {{2}, {1,3,4}}, {{3}, {1,2,4}}, {{4}, {1,2,3}},
{{1,2}, {3,4}}, {{1,3}, {2,4}}, {{1,4}, {2,3}}, {{}, {1, 2, 3, 4}}, {{1, 2, 3, 4}, {}}]
Also, order does matter -- I didn't write out all the return values but {{1, 2, 3}, {4}} is a different solution from {{3, 2, 1}, {4}}
A common approach is as follows.
If you have, say, K bins, then add K-1 special values to your initial array. I will use the -1 value assuming that it never occurs in the initial array; you can choose a different value.
So for your example the initial array becomes arr=[1,2,3,4,-1]; if K were, say, 4, the array would be arr=[1,2,3,4,-1,-1,-1].
Now list all the permutations of the array arr. For each permutation, treat all -1s as bin separators, so all the elements before the first -1 go to the first bin (in that particular order), all the elements between the first and second -1 go to the second bin, and so on.
For example:
[-1,1,2,3,4] -> {{}, {1,2,3,4}}
[2,1,3,-1,4] -> {{2,1,3}, {4}}
[3,1,2,-1,4] -> {{3,1,2}, {4}}
[1,3,-1,2,4] -> {{1,3}, {2,4}}
and so on.
Generating all permutations is a standard task (see, for example, Permutations in JavaScript?), and splitting an array by -1s should be easy.
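The separator approach can be sketched in Python as follows. This is an illustration under stated assumptions: the name `split_into_bins` is hypothetical, the items are assumed to be distinct and hashable, and a private sentinel object is used as the separator instead of -1 so that no item value can collide with it. When K is greater than 2, the identical separators make `itertools.permutations` repeat the same split, so repeats are filtered out.

```python
from itertools import permutations


def split_into_bins(items, k):
    separator = object()   # private marker; cannot be equal to any item
    seen = set()
    splits = []
    for perm in permutations(list(items) + [separator] * (k - 1)):
        bins = [[]]
        for x in perm:
            if x is separator:
                bins.append([])      # a separator closes the current bin
            else:
                bins[-1].append(x)   # order inside a bin follows the permutation
        key = tuple(tuple(b) for b in bins)
        if key not in seen:          # identical separators produce repeated splits
            seen.add(key)
            splits.append(bins)
    return splits


result = split_into_bins([1, 2, 3, 4], 2)
print(len(result))  # 120
print(result[0])    # [[1, 2, 3, 4], []]
```

The number of permutations grows factorially, so this is only practical for small inputs.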

Finding a group of subsets that does not overlap

I am reviewing for an upcoming programming contest and was working on the following problem:
Given a list of integers, an integer t, an integer r, and an integer p, determine if the list contains t sets of 3, r runs of 3, and p pairs of numbers. For each of these subsets, the numbers must be adjacent and any given number can only exist in one subset, if any at all.
Currently, I am solving the problem by simply finding all sets of 3, runs of 3, and pairs and then checking all permutations until finding one which has no overlapping subsets. This seems inefficient, however, and I was wondering if there was a better solution to the problem.
Here are two examples of the problem:
{1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0}, t = 1, r = 1, p = 2.
This works because we have the triple {4 4 4}, the run {1 2 3}, and the pairs {1 1} and {5 5}
{1, 1, 1, 2, 3, 3}, t = 1, r = 1, p = 1
This does not work because the only triple is {1 1 1} and the only run is {1 2 3} and the two overlap (They share a 1).
I am looking for a more efficient approach to this problem.
There is probably a faster way, but you can solve this with dynamic programming. Compute a recursive function F(t,r,p,n) which decides whether it is possible to have t triples, r runs, and p pairs in the sequence starting at position 1 and ending at position n, and which stores the last subset of the solution ending at position n if it is possible. If a triple, run, or pair ends at position n, you have the recursive cases F(t-1,r,p,n-3), F(t,r-1,p,n-3), or F(t,r,p-1,n-2) respectively, with that last subset stored. In addition, position n can always be left unused, which gives the recursive case F(t,r,p,n-1). This looks like fourth-power complexity but it really isn't, because the value of n is always decreasing, so the complexity is actually O(n + TRP), where T is the total desired number of triples, R is the total desired number of runs, and P is the total desired number of pairs. So O(n^3) in the worst case.
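A memoized sketch of this recursion in Python, under a few assumptions that the problem statement does not spell out: a "set of 3" is three equal adjacent numbers, a "run of 3" is three adjacent numbers each one larger than the previous, and a pair is two equal adjacent numbers. The name `can_form` is hypothetical, and the sketch only answers yes or no without storing the chosen subsets. It caches on all four arguments, so it visits at most (n+1)(t+1)(r+1)(p+1) states.

```python
from functools import lru_cache


def can_form(nums, t, r, p):
    @lru_cache(maxsize=None)
    def f(t, r, p, n):
        # n is the length of the prefix nums[0:n] that is still available.
        if t == 0 and r == 0 and p == 0:
            return True
        if n < 2:
            return False
        # Option 1: leave the last element of the prefix unused.
        if f(t, r, p, n - 1):
            return True
        # Option 2: a pair ends at the last element.
        if p > 0 and nums[n - 2] == nums[n - 1] and f(t, r, p - 1, n - 2):
            return True
        if n >= 3:
            a, b, c = nums[n - 3], nums[n - 2], nums[n - 1]
            # Option 3: a triple of equal numbers ends at the last element.
            if t > 0 and a == b == c and f(t - 1, r, p, n - 3):
                return True
            # Option 4: an ascending run ends at the last element.
            if r > 0 and b == a + 1 and c == b + 1 and f(t, r - 1, p, n - 3):
                return True
        return False

    return f(t, r, p, len(nums))


print(can_form([1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0], 1, 1, 2))  # True
print(can_form([1, 1, 1, 2, 3, 3], 1, 1, 1))                    # False
```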

Find the middle element in merged arrays in O(logn)

We have two sorted arrays of the same size n. Let's call the array a and b.
How can we find the middle element of the sorted array obtained by merging a and b?
Example:
n = 4
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
merged = [1, 2, 3, 3, 4, 4, 5, 6]
mid_element = merged[(0 + merged.length - 1) / 2] = merged[3] = 3
More complicated cases:
Case 1:
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]
Case 2:
a = [1, 2, 3, 4, 8]
b = [3, 4, 5, 6, 7]
Case 3:
a = [1, 2, 3, 4, 8]
b = [0, 4, 5, 6, 7]
Case 4:
a = [1, 3, 5, 7]
b = [2, 4, 6, 8]
Time required: O(log n). Any ideas?
Look at the middle of both the arrays. Let's say one value is smaller and the other is bigger.
Discard the lower half of the array with the smaller value. Discard the upper half of the array with the higher value. Now we are left with half of what we started with.
Rinse and repeat until only one element is left in each array. Return the smaller of those two.
If the two middle values are the same, then pick arbitrarily.
Credits: Bill Li's blog
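The halving idea can be sketched in Python as follows. This is one possible reading of the steps above, not the blog's code: the name `lower_median_of_merged` is hypothetical, both arrays are assumed to be sorted, non-empty, and of equal length, and the loop stops when two elements are left in each array (rather than one) so that the last step can simply sort the four survivors.

```python
def lower_median_of_merged(a, b):
    # Each array is represented by a window: start index plus shared length n.
    lo_a, lo_b, n = 0, 0, len(a)
    while n > 2:
        h = (n - 1) // 2                 # how many elements to drop from each array
        if a[lo_a + h] <= b[lo_b + h]:
            lo_a += h                    # drop the h smallest of a ...
        else:
            lo_b += h                    # ... or the h smallest of b
        n -= h                           # shrinking n drops the h largest of the other array
    survivors = sorted(a[lo_a:lo_a + n] + b[lo_b:lo_b + n])
    return survivors[n - 1]              # lower middle of the remaining elements


print(lower_median_of_merged([1, 2, 3, 4], [3, 4, 5, 6]))        # 3
print(lower_median_of_merged([1, 2, 3, 4, 8], [3, 4, 5, 6, 7]))  # 4
print(lower_median_of_merged([1, 2, 3, 4, 8], [0, 4, 5, 6, 7]))  # 4
print(lower_median_of_merged([1, 3, 5, 7], [2, 4, 6, 8]))        # 4
```

Every round removes the same number of elements below and above the wanted element, so the lower middle of what remains is still the lower middle of the full merged array, and the window roughly halves each time.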
Quite an interesting task. I'm not sure about O(log n), but an O((log n)^2) solution is obvious to me.
If you know the position of some element in the first array, then you can find how many elements in both arrays are smaller than this value: you already know how many smaller elements are in the first array, and you can find the count of smaller elements in the second array using binary search, so just sum up these two numbers. If the number of smaller elements in both arrays is less than N, you should look into the upper half of the first array; otherwise you should move to the lower half. So you get a general binary search with an internal binary search. The overall complexity will be O((log n)^2).
Note: if you do not find the median in the first array, then start the same search in the second array. This has no impact on the complexity.
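A sketch of this nested binary search in Python, using the standard `bisect` module for the inner searches. The name `kth_by_nested_search` is hypothetical, the elements are assumed to be numbers, and k is 1-based (k = n gives the middle element asked for in the question). To cope with duplicates, the sketch counts both the elements smaller than the candidate and the elements not larger than it.

```python
from bisect import bisect_left, bisect_right


def kth_by_nested_search(a, b, k):
    def search(first, second):
        lo, hi = 0, len(first) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            x = first[mid]
            # Inner binary searches: count elements < x and <= x in both arrays.
            smaller = bisect_left(first, x) + bisect_left(second, x)
            not_larger = bisect_right(first, x) + bisect_right(second, x)
            if smaller >= k:
                hi = mid - 1        # x is too large, look in the lower half
            elif not_larger < k:
                lo = mid + 1        # x is too small, look in the upper half
            else:
                return x            # x occupies position k in the merged order
        return None                 # the k-th element is not in this array

    found = search(a, b)
    return found if found is not None else search(b, a)


print(kth_by_nested_search([1, 2, 3, 4], [3, 4, 5, 6], 4))  # 3
print(kth_by_nested_search([1, 3, 5, 7], [2, 4, 6, 8], 4))  # 4
```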
So, given
n = 4 and a = [1, 2, 3, 4] and b = [3, 4, 5, 6]
you know in advance the position k of the wanted element in the merged array: it depends only on n and is equal to n.
The n-th element could be in the first array or in the second.
Let's first assume that the element is in the first array. Then
do a binary search, taking the middle element of [l, r]; at the beginning l = 0, r = 3.
Taking the middle element, you know how many elements of the same array are smaller, which is middle - 1.
Knowing that middle - 1 elements are smaller and that you need the n-th element, compare the middle element with the [n - (middle - 1)]-th element of the second array, which may be smaller or greater. If it is greater and its previous element is smaller, then the middle element is what you need; if it is greater and the previous one is also greater, set l = middle; if it is smaller, set r = middle.
Then do the same for the second array in case you did not find a solution in the first.
In total this takes log(n) + log(n) steps.
