algorithm to maximize sum of unique set with multiple contributors

I'm looking for an approach to maximize the value of a common set composed of contributions from multiple sources, with a fixed number of contributions from each.
Example problem: 3 people each have a hand of cards. Each hand contains a unique set, but the 3 sets may overlap. Each player can pick three cards to contribute to the middle. How can I maximize the sum of the 9 contributed cards where
each player contributes exactly 3 cards
all 9 cards are unique (when possible)
the solution can scale to roughly 200 possible "cards", 40 contributors, and 6 contributions each.

Integer programming sounds like a viable approach. Without guaranteeing it, this problem also feels NP-hard, meaning there is probably no general algorithm that beats brute force on all inputs (IP solvers, on the other hand, do make many assumptions and are tuned for real-world instances).
(Alternative off-the-shelf approaches: constraint programming and SAT solvers. CP is easy to formulate and fast at combinatorial search, but weaker at branch-and-bound style maximization; SAT is harder to formulate since counters need to be built, very fast at combinatorial search, but again has no native concept of maximization and needs a decision-problem-like transformation.)
Here is a complete Python-based example solving this problem (in the hard-constraint version: each player must contribute exactly N_PLAY cards and all contributed cards must be unique). As I'm using cvxpy, the code is written in a math-like style and should be easy to read even without knowing Python or the library!
Before presenting the code, some remarks:
General remarks:
The IP-approach is heavily dependent on the underlying solver!
Commercial solvers (Gurobi and co.) are the best
Good open-source solvers: CBC, GLPK, lpsolve
The default solver in cvxpy is not ready for this (once the problem size grows)!
In my experiment, with my data, commercial solvers scale very well!
A popular commercial-solver needs a few seconds for:
N_PLAYERS = 40 , CARD_RANGE = (0, 400) , N_CARDS = 200 , N_PLAY = 6
Using cvxpy is not best practice, as it was created for very different use-cases, and this induces some penalty in model-creation time
I'm using it because I'm familiar with it and I love it
Improvements: Problem
We are solving the each-player-plays-exactly-N_PLAY-cards version here
Sometimes there is no solution
Your model-description does not formally describe how to handle this
General idea to improve the code:
bigM-style penalty-based objective: e.g. Maximize(n_unique * bigM + classic_score)
(where bigM is a very big number)
Improvements: Performance
We build all those pairwise conflicts and use a classic not-both constraint
The number of conflicts, depending on the task, can grow a lot
Improvement idea (too lazy to add):
Calculate the set of maximal cliques and add these as constraints
Will be much more powerful, but:
For general conflict-graphs this problem should be NP-hard too, so an approximation algorithm may be needed
(as opposed to other applications like time-intervals, where this set can be calculated in polynomial time, as those graphs are chordal)
Code:
import numpy as np
import cvxpy as cvx
np.random.seed(1)
""" Random problem """
N_PLAYERS = 5
CARD_RANGE = (0, 20)
N_CARDS = 10
N_PLAY = 3
card_set = np.arange(*CARD_RANGE)
p = np.empty(shape=(N_PLAYERS, N_CARDS), dtype=int)
for player in range(N_PLAYERS):
    p[player] = np.random.choice(card_set, size=N_CARDS, replace=False)
print('Players and their cards')
print(p)
""" Preprocessing:
Conflict-constraints
-> if p[i, j] == p[x, y] => don't allow both
Could be made more efficient
"""
conflicts = []
for p_a in range(N_PLAYERS):
    for c_a in range(N_CARDS):
        for p_b in range(p_a + 1, N_PLAYERS):  # sym-reduction
            for c_b in range(N_CARDS):
                if p[p_a, c_a] == p[p_b, c_b]:
                    conflicts.append(((p_a, c_a), (p_b, c_b)))
# print(conflicts) # debug
""" Solve """
# Decision-vars
x = cvx.Bool(N_PLAYERS, N_CARDS)
# Constraints
constraints = []
# -> Conflicts
for (p_a, c_a), (p_b, c_b) in conflicts:
    # don't allow both -> linearized
    constraints.append(x[p_a, c_a] + x[p_b, c_b] <= 1)
# -> N to play
constraints.append(cvx.sum_entries(x, axis=1) == N_PLAY)
# Objective
objective = cvx.sum_entries(cvx.mul_elemwise(p.flatten(order='F'), cvx.vec(x))) # 2d -> 1d flattening
# ouch -> C vs. Fortran storage
# print(objective) # debug
# Problem
problem = cvx.Problem(cvx.Maximize(objective), constraints)
problem.solve(verbose=False)
print('MIP solution')
print(problem.status)
print(problem.value)
print(np.round(x.T.value))
sol = x.value
nnz = np.where(abs(sol - 1) <= 0.01) # being careful with fp-math
sol_p = p[nnz]
assert sol_p.shape[0] == N_PLAYERS * N_PLAY
""" Output solution """
for player in range(N_PLAYERS):
    print('player: ', player, 'with cards: ', p[player, :])
    print(' plays: ', sol_p[player*N_PLAY:player*N_PLAY+N_PLAY])
Output:
Players and their cards
[[ 3 16 6 10 2 14 4 17 7 1]
[15 8 16 3 19 17 5 6 0 12]
[ 4 2 18 12 11 19 5 6 14 7]
[10 14 5 6 18 1 8 7 19 15]
[15 17 1 16 14 13 18 3 12 9]]
MIP solution
optimal
180.00000005500087
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 0. 1. 0.]
[ 1. 0. 0. -0. -0.]
[ 1. -0. 1. 0. 1.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 0. -0. 1.]
[ 0. -0. 1. 0. 0.]
[ 0. 0. 0. 0. -0.]
[ 1. -0. 0. 0. 0.]
[ 0. 0. 0. 1. 1.]]
player: 0 with cards: [ 3 16 6 10 2 14 4 17 7 1]
plays: [ 6 10 7]
player: 1 with cards: [15 8 16 3 19 17 5 6 0 12]
plays: [ 8 19 17]
player: 2 with cards: [ 4 2 18 12 11 19 5 6 14 7]
plays: [12 11 5]
player: 3 with cards: [10 14 5 6 18 1 8 7 19 15]
plays: [14 18 15]
player: 4 with cards: [15 17 1 16 14 13 18 3 12 9]
plays: [16 13 9]

Looks like a packing problem, where you want to pack 3 disjoint subsets of your original sets, each of size 3, and maximize the sum. You can formulate it as an ILP. Without loss of generality, we can assume the cards represent natural numbers ranging from 1 to N.
Let a_i in {0,1} indicate if player A plays card with value i, where i is in {1,...,N}. Notice that if player A doesn't have card i in his hand, a_i is set to 0 in the beginning.
Similarly, define b_i and c_i variables for players B and C.
Also, similarly, let m_i in {0,1} indicate if card i will appear in the middle, i.e., one of the players will play a card with value i.
Now you can say:
Maximize Sum(m_i * i), subject to:
For each i in {1,...,N}:
a_i, b_i, c_i, m_i are in {0, 1}
m_i = a_i + b_i + c_i
Sum(a_i) = 3, Sum(b_i) = 3, Sum(c_i) = 3
Discussion
Notice that constraints 1 and 2 force the uniqueness of each card in the middle.
I'm not sure how big of a problem can be handled by commercial or non-commercial solvers with this program, but notice that this is really a binary linear program, which might be simpler to solve than the general ILP, so it might be worth trying for the size you are looking for.
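For illustration, here is a minimal sketch of this binary program using cvxpy; the instance data, variable names, and solver remarks are my own assumptions, not part of the answer, and a MIP-capable solver (e.g. GLPK_MI, CBC, or SCIP) is assumed to be available:
import numpy as np
import cvxpy as cvx

N = 20                                       # card values 1..N
rng = np.random.default_rng(0)
holds = np.zeros((3, N), dtype=int)          # holds[k, i-1] = 1 if player k has card value i
for k in range(3):
    holds[k, rng.choice(N, size=10, replace=False)] = 1

plays = cvx.Variable((3, N), boolean=True)   # rows are the a_i, b_i, c_i vectors
m = cvx.Variable(N, boolean=True)            # m_i: card value i+1 ends up in the middle
constraints = [
    plays <= holds,                          # a_i = 0 if the player does not hold card i
    m == cvx.sum(plays, axis=0),             # m_i = a_i + b_i + c_i; m_i binary forces uniqueness
    cvx.sum(plays, axis=1) == 3,             # Sum(a_i) = Sum(b_i) = Sum(c_i) = 3
]
values = np.arange(1, N + 1)
prob = cvx.Problem(cvx.Maximize(values @ m), constraints)
prob.solve()                                 # pass solver=... explicitly if the default cannot handle booleans
print(prob.status, prob.value)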

Sort each hand, dropping duplicate values. Delete anything past the 10th-highest card of any hand (3 hands * 3 cards/hand, plus 1): nobody can contribute a card that low.
For accounting purposes, make a directory by card value, showing which hands hold each value. For instance, given players A, B, C and these hands
A [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2]
B [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9]
C [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8]
We would sort and de-dup the hands. 2 is the 10th-highest card of hand C, so we drop all values 2 and below. Then build the directory
A [13, 12, 11, 9, 7, 6]
B [13, 11, 10, 9, 8, 5, 3]
C [13, 12, 11, 10, 8, 7, 6, 4, 3]
Directory:
13 A B C
12 A C
11 A B C
10 B C
9 A B
8 B C
7 A B
6 A C
5 B
4 C
3 B C
Now, you need to implement a backtracking algorithm to choose cards in some order, get the sum of that order, and compare with the best so far. I suggest that you iterate through the directory, choosing a hand from which to obtain the highest remaining card, backtracking when you run out of contributors entirely, or when you get 9 cards.
I recommend that you maintain a few parameters to allow you to prune the investigation, especially when you get into the lower values.
Make a maximum possible value, the sum of the top 9 values in the directory. If you hit this value, stop immediately, as you've found an optimum solution.
Make a high starting target: cycle through the hands in sequence, taking the highest usable card remaining in the hand. In this case, cycling A-B-C, we would have
13, 11, 12, 9, 10, 8, 7, 5, 6 => 81
// Note: because of the values I picked
// this happens to provide an optimum solution.
// It will do so for a lot of the bridge-hand problem space.
Keep count of how many cards have been contributed by each hand; when one has given its 3 cards, disqualify it in some way: have a check in the choice code, or delete it from the local copy of the directory.
As you walk down the choice list, prune the search any time the remaining cards are insufficient to reach the best-so-far total. For instance, if you have a total of 71 after 7 cards, and the highest remaining card is 5, stop: you can't get to 81 with 5+4.
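For concreteness, here is a rough, untuned sketch of such a backtracking search over the directory with the sum-based pruning described above; all names (best_contribution, quota, etc.) are illustrative, not from the answer:
def best_contribution(hands, per_hand=3):
    need = len(hands) * per_hand
    directory = {}                              # card value -> hands that hold it
    for name, cards in hands.items():
        for v in set(cards):
            directory.setdefault(v, []).append(name)
    values = sorted(directory, reverse=True)    # walk values from highest to lowest
    quota = {name: per_hand for name in hands}
    best = 0

    def search(idx, taken, total):
        nonlocal best
        if taken == need:
            best = max(best, total)
            return
        if idx == len(values):
            return
        if total + sum(values[idx:idx + (need - taken)]) <= best:
            return                              # prune: even the best remaining cards cannot win
        v = values[idx]
        for holder in directory[v]:             # try each hand that can still give v
            if quota[holder] > 0:
                quota[holder] -= 1
                search(idx + 1, taken + 1, total + v)
                quota[holder] += 1
        search(idx + 1, taken, total)           # or skip this value entirely

    search(0, 0, 0)
    return best

hands = {'A': [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2],
         'B': [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9],
         'C': [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8]}
print(best_contribution(hands))   # 81 for the hands above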
Does that get you moving?


Finding the best combination of elements with the max total parameter value

I have 100 elements. Each element has 4 features A,B,C,D. Each feature is an integer.
I want to select 2 elements for each feature, so that I have selected a total of 8 distinct elements. I want to maximize the sum of the 8 selected features A,A,B,B,C,C,D,D.
A greedy algorithm would be to select the 2 elements with highest A, then the two elements with highest B among the remaining elements, etc. However, this might not be optimal, because the elements that have highest A could also have a much higher B.
Do we have an algorithm to solve such a problem optimally?
This can be solved as a minimum-cost flow problem. In particular, this is an assignment problem.
First of all, note that we only need the 8 best elements of each feature, meaning 32 elements at most. It should even be possible to cut the search space further (if the 2 best elements of A are not among the 6 best elements of any other feature, we can already assign those 2 elements to A, and each other feature then only needs to look at its 6 best elements; if it's not clear why, I'll try to explain further).
Then we make the vertices S, T and Fa, Fb, Fc, Fd and E1, E2, ..., E32, with the following edges:
for each vertex Fx, an edge from S to Fx with maximum flow 2 and a weight of 0 (as we want 2 elements for each feature)
for each vertex Ei, an edge from Fx to Ei if Ei is one of the top elements of feature x, with maximum flow 1 and weight equal to the negative value of feature x of Ei (negative because the algorithm will find the minimum cost)
for each vertex Ei, an edge from Ei to T, with maximum flow 1 and weight 0 (as each element can only be selected once)
I'm not sure if this is the best way, but it should work.
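For illustration, a hedged sketch of that flow network using networkx; the node labels, random data, and use of max_flow_min_cost are my own choices, not part of the answer:
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 21, size=(100, 4))        # 100 elements, features A..D

G = nx.DiGraph()
elements = set()
for f in range(4):
    G.add_edge('S', ('F', f), capacity=2, weight=0)       # we want 2 elements per feature
    for e in np.argsort(data[:, f])[-8:]:                  # only the 8 best per feature matter
        G.add_edge(('F', f), ('E', int(e)), capacity=1, weight=-int(data[e, f]))
        elements.add(int(e))
for e in elements:
    G.add_edge(('E', e), 'T', capacity=1, weight=0)        # each element can be used at most once

flow = nx.max_flow_min_cost(G, 'S', 'T')                   # min cost == negative of the max score
score = sum(data[e, f]
            for f in range(4)
            for (_, e), used in flow[('F', f)].items() if used)
print('sum of selected features:', score)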
As suggested by #AloisChristen, this can be written as an assignment problem:
On the one side, we select the 8 best elements for each feature; that's 32 elements or less, since one element might be in the best 8 for more than one feature;
On the other side, we put 8 seats A,A,B,B,C,C,D,D
Solve the resulting assignment problem.
Here the problem is solved using scipy's linear_sum_assignment optimization function:
from numpy.random import randint
from numpy import argpartition, unique, concatenate
from scipy.optimize import linear_sum_assignment
# PARAMETERS
n_elements = 100
n_features = 4
n_per_feature = 2
# RANDOM DATA
data = randint(0, 21, (n_elements, n_features)) # random data with integer features between 0 and 20 included
# SELECT BEST 8 CANDIDATES FOR EACH FEATURE
n_selected = n_features * n_per_feature
n_candidates = n_selected * n_features
idx = argpartition(data, range(-n_candidates, 0), axis=0)
idx = unique(idx[-n_selected:].ravel())
candidates = data[idx]
n_candidates = candidates.shape[0]
# SOLVE ASSIGNMENT PROBLEM
cost_matrix = -concatenate((candidates,candidates), axis=1) # 8 columns in order ABCDABCD
element_idx, seat_idx = linear_sum_assignment(cost_matrix)
score = -cost_matrix[element_idx, seat_idx].sum()
# DISPLAY RESULTS
print('SUM OF SELECTED FEATURES: {}'.format(score))
for e, s in zip(element_idx, seat_idx):
    print('{:2d}'.format(idx[e]),
          'ABCDABCD'[s],
          -cost_matrix[e, s],
          data[idx[e]])
Output:
SUM OF SELECTED FEATURES: 160
3 B 20 [ 5 20 14 11]
4 A 20 [20 9 3 12]
6 C 20 [ 3 3 20 8]
10 A 20 [20 10 9 9]
13 C 20 [16 12 20 18]
23 D 20 [ 6 10 4 20]
24 B 20 [ 5 20 6 8]
27 D 20 [20 13 19 20]

Convert the permutation sequence A to B by selecting a set in A then reversing that set and inserting that set at the beginning of A

Given sequences A and B, each consisting of N numbers and each a permutation of 1,2,3,...,N. At each step, you choose a set S of elements of A, taken in order from left to right (the chosen numbers are removed from A), then reverse S and insert all of its elements at the beginning of A. Find a way to transform A into B in log2(N) steps.
Input: N <= 10^4 (the number of elements of sequences A and B) and the two permutations A and B.
Output: K (the number of steps used to convert A to B). The next K lines are the sets S selected at each step.
Example:
Input:
5 // N
5 4 3 2 1 // A sequence
2 5 1 3 4 // B sequence
Output:
2
4 3 1
5 2
Step 0: S = {}, A = {5, 4, 3, 2, 1}
Step 1: S = {4, 3, 1}, A = {5, 2}. Then reverse S => S = {1, 3, 4}. Insert S to beginning of A => A = {1, 3, 4, 5, 2}
Step 2: S = {5, 2}, A = {1, 3, 4}. Then reverse S => S = {2, 5}. Insert S to beginning of A => A = {2, 5, 1, 3, 4}
My solution is to use backtracking to consider all possible choices of S in log2(n) steps. However, N is too large so is there a better approach? Thank you.
For each operation of combined selecting/removing/prepending, you're effectively sorting the elements relative to a "pivot", and preserving order. With this in mind, you can repeatedly "sort" the items in backwards order (by that I mean, you sort on the most significant bit last), to achieve a true sort.
For an explicit example, lets take an example sequence 7 3 1 8. Rewrite the terms with their respective positions in the final sorted list (which would be 1 3 7 8), to get 2 1 0 3.
7 -> 2 // 7 is at index 2 in the sorted array
3 -> 1 // 3 is at index 1 in the sorted array
1 -> 0 // so on
8 -> 3
This new array is equivalent to the original- we are just using indices to refer to the values indirectly (if you squint hard enough, we're kinda rewriting the unsorted list as pointers to the sorted list, rather than values).
Now, lets write these new values in binary:
2 10
1 01
0 00
3 11
If we were to sort this list, we'd first sort by the MSB (most significant bit) and then tiebreak only where necessary on the subsequent bit(s) until we're at the LSB (least significant bit). Equivalently, we can sort by the LSB first, and then sort all values on the next most significant bit, and continuing in this fashion until we're at the MSB. This will work, and correctly sort the list, as long as the sort is stable, that is- it doesn't change the order of elements that are considered equal.
Let's work this out by example: if we sorted these by the LSB, we'd get
2 10
0 00
1 01
3 11
-and then following that up with a sort on the MSB (but no tie-breaking logic this time), we'd get:
0 00
1 01
2 10
3 11
-which is the correct, sorted result.
Remember the "pivot" sorting note at the beginning? This is where we use that insight. We're going to take this transformed list 2 1 0 3, and sort it bit by bit, from the LSB to the MSB, with no tie-breaking. And to do so, we're going to pivot on the criteria <= 0.
This is effectively what we just did in our last example, so in the name of space I won't write it out again, but have a look again at what we did in each step. We took the elements with the bits we were checking that were equal to 0, and moved them to the beginning. First, we moved 2 (10) and 0 (00) to the beginning, and then the next iteration we moved 0 (00) and 1 (01) to the beginning. This is exactly what operation your challenge permits you to do.
Additionally, because our numbers are reduced to their indices, the max value is len(array)-1, and the number of bits is log2() of that, so overall we'll only need to do log2(n) steps, just as your problem statement asks.
Now, what does this look like in actual code?
from itertools import product
from math import log2, ceil
nums = [5, 9, 1, 3, 2, 7]
size = ceil(log2(len(nums)))  # enough bits to index every position in the sorted list
bit_table = list(product([0, 1], repeat=size))
idx_table = {x: i for i, x in enumerate(sorted(nums))}
for bit_idx in range(size)[::-1]:   # iterate bits from LSB to MSB
    subset_vals = [x for x in nums if bit_table[idx_table[x]][bit_idx] == 0]
    nums.sort(key=lambda x: bit_table[idx_table[x]][bit_idx])   # stable sort on this bit only
    print(" ".join(map(str, subset_vals)))
You can of course use bitwise operators to accomplish the bit magic ((thing >> bit_idx) & 1) if you want, and you could del slices of the list + prepend instead of .sort()ing; this is just a proof-of-concept to show that it actually works. The actual output being:
1 3 7
1 7 9 2
1 2 3 5

Neighbors in the matrix

I have a problem with coming up with an algorithm for the "graph" :(
Maybe one of you would be so kind as to point me in the right direction <3
The task is as follows:
We have a board of at least 3x3 (it doesn't have to be square; it can be 4x5, for example). The user specifies a sequence of moves (as in an Android lock pattern). The task is to check how many consecutive pairs of points in the given sequence are adjacent to each other horizontally or vertically.
Here is an example:
Matrix:
1 2 3 4
5 6 7 8
9 10 11 12
The user entered the code: 10,6,7,3
The algorithm should return the number 3 because:
10 is a neighbor of 6
6 is a neighbor of 7
7 is a neighbor of 3
Eventually return 3
Second example:
Matrix:
1 2 3
4 5 6
7 8 9
The user entered the code: 7,8,6,3
The algorithm should return 2 because:
7 is a neighbor of 8
8 is not a neighbor of 6
6 is a neighbor of 3
Eventually return 2
Of course, the number of checks equals the length of the array minus 1.
Sorry for the Polish words "ile" ("how many") and "tutaj" ("here"); I'm Polish.
If all the codes are unique, use them as keys to a dictionary (with (row, col) pairs as values). Loop from the 2nd item of the user input to the end, checking whether math.Abs(cur.row - prev.row) + math.Abs(cur.col - prev.col) == 1. This is not space efficient, but it handles the user input in linear time.
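A small Python sketch of this dictionary idea, assuming the board is numbered row by row starting at 1 (the function name and layout are illustrative):
def count_adjacent(width, height, codes):
    # map each code value to its (row, col) position on the board
    pos = {r * width + c + 1: (r, c) for r in range(height) for c in range(width)}
    count = 0
    for prev, cur in zip(codes, codes[1:]):
        (r1, c1), (r2, c2) = pos[prev], pos[cur]
        if abs(r1 - r2) + abs(c1 - c2) == 1:    # horizontal or vertical neighbour
            count += 1
    return count

print(count_adjacent(4, 3, [10, 6, 7, 3]))   # 3
print(count_adjacent(3, 3, [7, 8, 6, 3]))    # 2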
The idea is that you have 4 conditions, one for each direction. Given any matrix of shape (n, m) filled with a sequence of integers, and given any element:
The element left or right will always be + or - 1 to the given element.
The element up or down will always be + or - m to the given element.
So, if abs(x-y) is 1 or m, then x and y are neighbors.
I demonstrate this in python.
import numpy as np

def get_neighbors(seq, matrix):
    # Condition: horizontal neighbours differ by 1, vertical neighbours by m (the row width)
    check = lambda x, y, m: np.abs(x - y) == 1 or np.abs(x - y) == m
    # Consecutive pairs from the sequence, each paired with m
    params = zip(seq, seq[1:], [matrix.shape[1]] * (len(seq) - 1))
    neighbours = [check(*i) for i in params]
    count = sum(neighbours)
    return neighbours, count
seq = [10, 6, 7, 3]
matrix = np.arange(1, 13).reshape((3, 4))
neighbours, count = get_neighbors(seq, matrix)
print('Matrix:')
print(matrix)
print('')
print('Sequence:', seq)
print('')
print('Count of neighbors:',count)
Matrix:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
Sequence: [10, 6, 7, 3]
Count of neighbors: 3
Another example -
seq = [7,8,6,3]
matrix = np.arange(1,10).reshape((3,3))
neighbours, count = get_neighbors(seq, matrix)
Matrix:
[[1 2 3]
[4 5 6]
[7 8 9]]
Sequence: [7, 8, 6, 3]
Count of neighbors: 2
So your input is the width of a table, the height of a table, and a list of numbers.
W = 4, H = 3, list = [10,6,7,3]
There are two steps:
Convert the list of numbers into a list of row/column coordinates (1 to [1,1], 5 to [2,1], 12 to [3,4]).
In the new list of coordinates, find consecutive pairs that have one coordinate identical and the other differing by 1.
Both steps are quite simple ("for" loops). Do you have problems with 1 or 2?

Practical algorithms for permuting external memory

On a spinning disk, I have N records that I want to permute. In RAM, I have an array of N indices that contain the desired permutation. I also have enough RAM to hold n records at a time. What algorithm can I use to execute the permutation on disk as quickly as possible, taking into account the fact that sequential disk access is a lot faster?
I have plenty of excess disk to use for intermediate files, if desired.
This is a known problem. Find the cycles in your permutation order. For instance, given five records to permute [1, 0, 3, 4, 2], you have cycles (0, 1) and (2, 3, 4). You do this by picking an unused starting position; follow the index pointers until you return to your starting point. The sequence of pointers describes a cycle.
You then permute the records with an internal temporary variable, one record long.
temp = disk[0]
disk[0] = disk[1]
disk[1] = temp
temp = disk[2]
disk[2] = disk[3]
disk[3] = disk[4]
disk[4] = temp
Note that you can also perform the permutation as you traverse the pointers. You will also need some method to recall which positions have already been permuted, such as clearing the permutation index (set it to -1).
Can you see how to generalize that?
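A generalization might look like the following sketch, which uses a plain Python list in place of the disk and marks handled positions with -1 as suggested above (all names are illustrative; perm[i] is the index whose record should end up at position i):
def permute_in_place(disk, perm):
    # note: perm is consumed (overwritten with -1 markers) as we go
    for start in range(len(perm)):
        if perm[start] in (-1, start):
            continue                        # already handled, or a fixed point
        temp = disk[start]                  # the one-record temporary
        i = start
        while perm[i] != start:             # walk the cycle
            disk[i] = disk[perm[i]]
            nxt = perm[i]
            perm[i] = -1                    # mark position i as done
            i = nxt
        disk[i] = temp                      # close the cycle
        perm[i] = -1

records = ['r0', 'r1', 'r2', 'r3', 'r4']
permute_in_place(records, [1, 0, 3, 4, 2])
print(records)   # ['r1', 'r0', 'r3', 'r4', 'r2']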
This is a problem of interval coordination. I'll simplify the notation slightly by calling the available memory M records -- having upper- and lower-case N is a little confusing.
First, we re-cast the permutation as a series of intervals: the rotational span during which a record needs to reside in RAM. If a record needs to be written to a lower-numbered position, we increase the endpoint by the list size to indicate the wraparound -- we have to wait for the next disk rotation. For instance, using my earlier example, we expand the list:
[1, 0, 3, 4, 2]
0 -> 1
1 -> 0+5
2 -> 3
3 -> 4
4 -> 2+5
Now, we apply standard greedy scheduling resolution. First, sort by endpoint:
[0, 1]
[2, 3]
[3, 4]
[1, 5]
[4, 7]
Now, apply the algorithm for M-1 "lanes"; the extra one is needed for swap space. We fill each lane by appending the interval with the earliest endpoint whose start-point doesn't overlap:
[0, 1] [2, 3] [3, 4] [4, 7]
[1, 5]
We can do this in a total of 7 "ticks" if M >= 3. If M=2, we defer the second lane by 2 rotations to [11, 15].
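A short sketch of just the first two steps (building the intervals and sorting by endpoint); perm[i] is read here as the destination of the record currently at position i, and the lane-filling itself is not shown:
def rotation_intervals(perm):
    n = len(perm)
    intervals = []
    for src, dst in enumerate(perm):
        end = dst if dst >= src else dst + n        # wrap: wait for the next rotation
        intervals.append((src, end))
    return sorted(intervals, key=lambda iv: iv[1])  # greedy scheduling picks earliest endpoints first

print(rotation_intervals([1, 0, 3, 4, 2]))
# [(0, 1), (2, 3), (3, 4), (1, 5), (4, 7)]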
Sneftal's nice example gives us more troubles, with deeper overlap:
[0, 4]
[1, 5]
[2, 6]
[3, 7]
[4, 0+8]
[5, 1+8]
[6, 2+8]
[7, 3+8]
This requires 4 "lanes" if available, deferring lanes as needed if M < 5.
The pathological case is where every record in the permutation needs to be copied back one position, such as [3, 0, 1, 2], with M=2.
[0, 3]
[1, 4]
[2, 5]
[3, 6]
In this case, we walk through the deferral cycle multiple times. At the end of every rotation, we have to defer all remaining intervals by one rotation, resulting in
[0, 3] [3, 6] [2+4, 5+4] [1+4+4, 4+4+4]
Does that get you moving, or do you need more detail?
I have an idea, which might need further improvement. But here it goes:
suppose the hdd has the following structure:
5 4 1 2 3
And we want to write out this permutation:
2 3 5 1 4
Since hdd is a circular buffer, and assuming it can only rotate in one direction, we can write the above permutation using shifts as such:
5 >> 2
4 >> 3
1 >> 1
2 >> 2
3 >> 2
So let's put that in an array, and since we know it is a circular array, lets put its mirrors side by side:
| 2 3 1 2 2 | 2 3 1 2 2| 2 3 1 2 2 | 2 3 1 2 2 |... Inf
Since we want to favor sequential reads (or writes), we can attach a cost function to the above series. Let the cost function be linear, i.e.:
0 1 2 3 4 5 6 7 8 9 10 ... Inf
Now, let us add the cost function to the above series, but how to select the starting point?
The idea is to select the starting point such that you get the maximum congruent monotonically increasing sequence.
For example, if you select the 0 point to be on "3", you'll get
(1) | - 3 2 4 5 | 6 8 7 9 10 | ...
If you select the 0 point to be on "2", the one just right of "1", you'll get:
(2) | - - - 2 3 | 4 6 5 7 8 | ...
Since we are trying to favor consecutive reads, lets define our read-write function to work as such:
f():
At the currently pointed hdd location, the function reads the currently pointed hdd record into available RAM (namely, total space - 1, because we want to save 1 slot for swap).
If no RAM space is left for the read, the function asserts and the program halts.
At the current hdd location, if RAM holds the value that we want written at that location, the function reads the current record into swap space, writes the wanted value from RAM to the hdd, and discards the value from RAM.
Whenever a value is placed onto the hdd, the function checks whether the sequence is complete. If it is, the program returns with success.
Now, we should note that if the following holds:
shift amount <= n - 1 (n : available memory we can hold)
We can traverse the hard disk in once pass using the above function. For example:
current: 4 5 6 7 0 1 2 3
we want: 0 1 2 3 4 5 6 7
n : 5
We can start anywhere we want, say from the initial "4". We read 4 items sequentially (RAM now holds 4 items), and we start placing 0 1 2 3 (we can, because n = 5 in total, 4 slots are used, and 1 is reserved for swap). So the total is 4 consecutive reads and then 8 read-write operations.
Using that analogy, it becomes clear that if we subtract "n-1" from equations (1) and (2), the positions with value "<= 0" are better suited as the initial position, because the ones above zero will definitely require another pass.
So we select eq. (2) and subtract; for, let's say, n = 3, we subtract 2 from eq. (2):
(2) | - - - 0 1 | 2 4 3 5 6 | ...
Now it is clear that, using f(), and starting from 0, assuming n = 3, we will have a starting operation as such: r, r, r-w, r-w, ...
So, how do we do the rest and find minimum cost? We will place an array with initial minimum cost, just below equation (2). The positions in that array will signify where we want f() to be executed.
| - - - 0 1 | 2 4 3 5 6 | ...
| - - - 1 1 | 1 1 1 1 1 | ...
The second array, the one with 1's and 0's, tells the program where to execute f(). Note that if we guessed those locations wrong, f() will assert.
Before we actually start placing files onto the hdd, we of course want to see whether the f() positions are correct. We check for assertions, and we will try to minimize cost while removing all assertions. So, e.g.:
(1) 1111000000000000001111
(2) 1111111000000000000000
(1) obviously has a higher cost than (2). So the question reduces to finding the 1-0 array.
Some ideas on finding the best array:
The simplest solution is to write out all 1's and turn assertions into 0's (essentially a skip). This method is guaranteed to work.
Brute force: write an array as shown in (2) and start shifting 1's to the right, in an order that tries every available permutation:
1111111100000000
1111111010000000
1111110110000000
...
Fully random approach: plug in MT19937 and start permuting. Whenever you see a sharp drop in cost, stop and perform the hdd copy-paste. You won't find the global minimum, but you'll get a nice trade-off.
Genetic algorithms: For permutations where "shift count is much lower than n - 1", the methodology provided in this answer should (?) provide a global minimum and smooth gradients. This allows one to use genetic algorithms without relying on mutations too much.
One advantage I find in this approach is that, since the OP mentioned that this is a real-life problem, the method provides an easy(ier?) way to change cost functions. It is easier to detect the effect of, say, having lots of contiguous small files to be copied vs. having a single huge file. Or perhaps rrwwrrww is better than rrrrwwww?
Does any of this even make sense? We will have to try it out ...

How to check if a given number is of the form x^y?

I'm preparing for my interviews and came across this question:
Write a program to check if a number n is of x^y form. It is known that n, x and y are integers and that x and y are greater than 2.
I thought of taking logs and such, but couldn't quite figure out how to check whether the number is of that form. Could any of you please help? :)
"Taking the log and stuff" is the way to go. Note that N > 1 is never a^b for integer a and b > log_2(N). So you can check floor(N^(1/b))^b = N for each integer b between 2 and log_2(N). You have to do about log(N) many exponentiations, each of which produces a number at most the size of N.
This is far faster than #dasblinkenlight's solution, which requires you to factor N first. (No algorithm polynomial in the number of bits of N is known for integer factorisation; however, integer exponentiation with a small exponent can be done in polynomial time.)
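A hedged Python sketch of this exponent-testing idea; the function name and the small rounding correction are my own, and for very large N a pure-integer root routine would be preferable to the floating-point guess:
def is_perfect_power(n):
    if n < 4:
        return False
    for b in range(2, n.bit_length()):      # b <= log2(n)
        a = round(n ** (1.0 / b))           # floating-point guess at the b-th root
        for cand in (a - 1, a, a + 1):      # correct for rounding error
            if cand >= 2 and cand ** b == n:
                return True
    return False

print(is_perfect_power(729))   # True  (3^6)
print(is_perfect_power(730))   # False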
One way to solve this would be to factorize n, count the multiplicity of each prime factor, and find the greatest common divisor of the counts. If the GCD is 1, the answer is "no". Otherwise, the answer is "yes".
Here are some examples:
7, prime factor 7 (one time). We have one factor repeated once. Answer "no", because the GCD is 1.
8, prime factors 2 (3 times). We have one factor with the count of three. Answer "yes", because GCD is 3.
144, prime factors 2 (4 times) 3 (2 times). GCD of 4 and 2 is 2, so the answer is "yes".
72, prime factors 2 (3 times) 3 (2 times). GCD of 3 and 2 is 1, so the answer is "no".
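For illustration, a small sketch of this factor-multiplicity test using naive trial division (fine for small n; a real implementation would need a proper factorization routine, and the function name is illustrative):
from math import gcd

def is_perfect_power_by_factoring(n):
    counts = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            c = 0
            while n % d == 0:
                n //= d
                c += 1
            counts.append(c)                # multiplicity of prime factor d
        d += 1
    if n > 1:
        counts.append(1)                    # leftover prime factor with multiplicity 1
    g = 0
    for c in counts:
        g = gcd(g, c)
    return g >= 2

print(is_perfect_power_by_factoring(144))   # True  (2^4 * 3^2, gcd(4, 2) = 2)
print(is_perfect_power_by_factoring(72))    # False (2^3 * 3^2, gcd(3, 2) = 1)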
There are a lot of good answers, but I see modular arithmetic is still missing.
Depending on the magnitude of the numbers to check, it might be useful to classify them by their last bits. We can easily create a table with possible candidates.
To show how it works, let us create such a table for the last 4 bits. In that case we have 16 cases to consider:
0^2, 0^3, ... : 0 mod 16
1^2, 1^3, ... : 1 mod 16
2^2, 2^3, ... : 0, 4, 8 mod 16
3^2, 3^3, ... : 9, 11, 1, 3 mod 16
4^2, 4^3, ... : 0 mod 16
5^2, 5^3, ... : 9, 13, 1, 5 mod 16
6^2, 6^3, ... : 4, 8, 0 mod 16
7^2, 7^3, ... : 1, 7 mod 16
8^2, 8^3, ... : 0 mod 16
9^2, 9^3, ... : 9, 1 mod 16
10^2,10^3, ... : 4, 8, 0 mod 16
11^2,11^3, ... : 9, 3, 1, 11 mod 16
12^2,12^3, ... : 0 mod 16
13^2,13^3, ... : 9, 5, 1, 13 mod 16
14^2,14^3, ... : 4, 8, 0 mod 16
15^2,15^3, ... : 1, 15 mod 16
The table is more useful the other way round: which bases x are possible for a given number n = x^y?
0: 0, 2, 4, 6, 8, 10, 12, 14 mod 16
1: 1, 3, 5, 7, 9, 11, 13, 15
2: -
3: 3, 11
4: 2, 6, 10, 14
5: 5, 13
6: -
7: 7
8: 2, 6, 10, 14
9: 3, 5, 9, 11, 13
10: -
11: 3, 11
12: -
13: 5, 13
14: -
15: 15
So, just by looking at the last four bits, over one quarter of numbers can be discarded immediately.
If we take the number 13726423, its remainder modulo 16 is 7, and thus if it is of the form we are interested in, it must be (16k + 7)^y.
For most numbers the number of candidate bases to try is quite limited. In practice, the table could be much larger, e.g., 16 bits.
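A short sketch of how such a residue table can be generated, here for modulus 16 (the representation and names are illustrative):
def residue_table(mod):
    # possible[r] = set of base residues x (mod `mod`) with x^y = r (mod `mod`) for some y >= 2
    possible = {r: set() for r in range(mod)}
    for x in range(mod):
        seen = set()
        p = (x * x) % mod                   # start at x^2
        while p not in seen:                # the power sequence mod `mod` eventually cycles
            seen.add(p)
            possible[p].add(x)
            p = (p * x) % mod
    return possible

table = residue_table(16)
print(sorted(table[7]))   # [7]               -> a number that is 7 mod 16 can only be (16k + 7)^y
print(sorted(table[9]))   # [3, 5, 9, 11, 13] -> matches the row for 9 above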
A simple optimization with binary numbers is to remove the trailing zeros. This makes it unnecessary to worry about even numbers, and y must then be a divisor of the number of zeros removed.
If we still have too much work, we can create another modulo table. The other could be, e.g. modulo 15. The equivalent table looks like this:
0: 0
1: 1, 2, 4, 7, 8, 11, 13, 14
2: 2, 8
3: 3, 12
4: 2, 4, 7, 8, 13
5: 5
6: 3, 6, 9, 12
7: 7, 13
8: 2, 8
9: 3, 9, 12
10: 5, 10
11: 11
12: 3, 12
13: 7, 13
14: 14
As our number from the previous example (13726423) is 13 modulo 15, x must be of the form 15m + 7 or 15m + 13. As 15 and 16 have no common factors, the valid candidates are 240p + 7 and 240p + 103. With two integer divisions and two table lookups we have limited the possible values of x to 1/120 of all numbers.
If the tables are largish, the number of possible x values is easy to limit to a very low number. For example, with tables of 65536 and 65535 elements the cycle is 4294901760, so for any number below approximately 1.6 x 10^19 the two tables give a short unique list of possible values of x.
If you can factor n, then it is easy to find an answer by examining the multiplicities of the factors. But the usual use for determining if a number is a perfect power is as a preliminary test for some factoring algorithms, in which case it is not realistic to find the factors of n.
The trick to determining whether a number is a perfect power is to know that, if the number is a perfect power, the exponent e must be at most log2(n), because if e were larger then 2^e would exceed n. Further, it is only necessary to test prime exponents e, because if a number is a perfect power with a composite exponent it is also a perfect power with each prime factor of that exponent; for instance, 2^15 = 32768 = 32^3 = 8^5 is both a perfect cube and a perfect fifth power. Here is pseudocode for a function that returns b if there is some exponent e such that b^e = n, or 0 if there is not; the function root(e, n) returns the integer e-th root of n:
function perfectPower(n)
    for p in primes(log2(n))
        b = floor(root(p, n))
        if b**p == n return b
    return 0
I discuss this function at my blog.
Alternatively, if factorization is too hard, you can exploit your maths library and try many values of x or y until you find one that works.
Trying y will be less work if you have an operation "y-th root of n" available (it could be masquerading under the name "x to the power of 1/y"). Just try all integer values of y larger than 2 until either you find one that gives an integer answer, or the result drops below 2. If n is a standard 32-bit integer, then it will take no more than 32 attempts (and, more generally, if n is an m-bit integer, it will take no more than m attempts).
If you do not have "y-th root of n" available, you can try all x's with the operation "log base x of n", until you get an integer answer or the result drops below 2. This will take more work, since you need to check all values up to the square root of n. I think it should be possible to optimize this somehow and "home in" on potential integer results.
The exponent y is easily bounded: 2 ≤ y ≤ log_2(n). Test each y in that range; if a solution exists, x will be the integer y-th root of n.
The point is that while x determines y and vice versa, the search space for y is much smaller, so you should search over y rather than x (which could be as large as sqrt(n)).
