finding maximal subsets

finding maximal subsets - algorithm

For given n, find the subset S of {1,2,...,n} such that
all elements of S are coprime
the sum of the elements of S is as large as possible
Doing a brute force search takes too long and I can't find a pattern. I know that I can just take all the primes from 1 to n, but that's probably not the right answer. Thanks.

I would tackle this as a dynamic programming problem. Let me walk through it for 20. First take the primes in reverse order.
19, 17, 13, 11, 7, 5, 3, 2
Now we're going to walk up the best solutions which have used subsets of those primes of increasing size. We're going to do a variation of breadth first search, but with the trick that we always use the largest currently unused prime (plus possibly more). I will represent all of the data structures in the form size: {set} = (total, next_number). (I'm doing this by hand, so all mistakes are mine.) Here is how we build up the data structure. (In each step I consider all ways of growing all sets of one smaller size from the previous step, and take the best totals.)
Try to reproduce this listing and, modulo any mistakes I made, you should have an algorithm.
Step 0
0: {} => (1, 1)
Step 1
1: {19} => (20, 19)
Step 2
2: {19, 17} => (37, 17)
Step 3
3: {19, 17, 13} => (50, 13)
Step 4
4: {19, 17, 13, 11} => (61, 11)
Step 5
5: {19, 17, 13, 11, 7} => (68, 7)
6: {19, 17, 13, 11, 7, 2} => (75, 14)
Step 6
6: {19, 17, 13, 11, 7, 5} => (73, 5)
{19, 17, 13, 11, 7, 2} => (75, 14)
7: {19, 17, 13, 11, 7, 5, 2} => (88, 20)
{19, 17, 13, 11, 7, 5, 3} => (83, 15)
Step 7
7: {19, 17, 13, 11, 7, 5, 2} => (88, 20)
{19, 17, 13, 11, 7, 5, 3} => (83, 15)
8: {19, 17, 13, 11, 7, 5, 3, 2} => (91, 18)
Step 8
8: {19, 17, 13, 11, 7, 5, 3, 2} => (99, 16)
And now we just trace the data structures backwards to read off 16, 15, 7, 11, 13, 17, 19, 1 which we can sort to get 1, 7, 11, 13, 15, 16, 17, 19.
(Note there are a lot of details to get right to turn this into a solution. Good luck!)

You can do a little better by taking powers of primes, up to the bound you have. For example, suppose that n=30. Then you want to start with
1, 16, 27, 25, 7, 11, 13, 17, 19, 23, 29
Now look at where there are places to improve. Certainly you cannot increase any of the primes that are already at least n/2: 17, 19, 23, 29 (why?). Also, 3^3 and 5^2 are pretty close to 30, so they're also probably best left alone (why?).
But what about 2^4, 7, 11 and 13? We can take the 2's and combine them with 7, 11, or 13. This would give:
2 * 13 = 26 replaces 16 + 13 = 29 BAD
2 * 11 = 22 replaces 16 + 11 = 27 BAD
2^2 * 7 = 28 replaces 16 + 7 = 23 GOOD
So it looks like we should get the following list (now sorted):
1, 11, 13, 17, 19, 23, 25, 27, 28, 29
Try to prove that this cannot be improved, and that should give you some insight into the general case.
Good luck!
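As a rough illustration of the starting point described in this answer, the sketch below builds the list consisting of 1 plus the largest power of each prime that does not exceed n. The function name prime_power_start is made up for this example, and the result is only the initial candidate list, not the optimum.

```python
def prime_power_start(n):
    """Return [1] plus the largest power of each prime that is <= n."""
    is_prime = [True] * (n + 1)
    result = [1]
    for p in range(2, n + 1):
        if not is_prime[p]:
            continue
        # Cross out the multiples of this prime.
        for multiple in range(p * p, n + 1, p):
            is_prime[multiple] = False
        # Raise p to the highest power that still fits under the bound.
        power = p
        while power * p <= n:
            power *= p
        result.append(power)
    return result


start = prime_power_start(30)
print(sorted(start), sum(start))
```

For n = 30 this gives the starting list from the answer, with sum 188; replacing 16 and 7 by 28 as described raises the sum to 193.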

The following is quite practical.
Let N = {1, 2, 3, ..., n}.
Let p1 < p2 < p3 < ... < pk be the primes in N.
Let Ti be the natural numbers in N divisible by pi but not by any prime less than pi.
We can pick at most one number from each subset Ti.
Now recurse.
S = {1}.
Check if pi is a divisor of any of the numbers already in S. If it is, skip Ti.
Otherwise, pick a number xi from Ti coprime to the elements already in S, and add it to S.
Go to next i.
When we reach k + 1, calculate the sum of the elements in S. If new maximum, save S away.
Continue.
Take n = 30.
The primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29.
T1 = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30}
T2 = {3, 9, 15, 21, 27}
T3 = {5, 25}
T4 = {7}
T5 = {11}
T6 = {13}
T7 = {17}
T8 = {19}
T9 = {23}
T10 = {29}
So fewer than 15 * 5 * 2 = 150 possibilities.
Here is my original wrong result for n = 100.
1 17 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 88 89 91 95 97 99
Sum = 1374
It should be
1 17 23 29 31 37 41 43 47 53 59 61 67 71 73 79 81 83 88 89 91 95 97
Sum = 1356
Less than 2 seconds for n = 150. About 9 seconds for n = 200.
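The recursion over the classes Ti might look roughly like the following sketch. It is not the poster's program: the names max_coprime_subset, classes and search are invented here, and it assumes (as the steps above do) that whenever pi does not yet divide an element of S, some element of Ti is picked.

```python
def max_coprime_subset(n):
    """Exhaustive search: pick at most one number from each class T_i."""
    spf = list(range(n + 1))  # spf[x] = smallest prime factor of x
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:
            for q in range(p * p, n + 1, p):
                if spf[q] == q:
                    spf[q] = p
    primes = [p for p in range(2, n + 1) if spf[p] == p]
    # T_i: numbers whose smallest prime factor is p_i.
    classes = {p: [x for x in range(2, n + 1) if spf[x] == p] for p in primes}

    def prime_factors(x):
        factors = set()
        while x > 1:
            factors.add(spf[x])
            x //= spf[x]
        return frozenset(factors)

    best_sum, best_set = 0, []

    def search(i, used, chosen, total):
        nonlocal best_sum, best_set
        if i == len(primes):
            if total > best_sum:
                best_sum, best_set = total, sorted(chosen)
            return
        p = primes[i]
        if p in used:  # p_i already divides an element of S: skip T_i
            search(i + 1, used, chosen, total)
            return
        for x in classes[p]:
            factors = prime_factors(x)
            if factors.isdisjoint(used):  # x is coprime to everything in S
                chosen.append(x)
                search(i + 1, used | factors, chosen, total + x)
                chosen.pop()

    search(0, frozenset(), [1], 1)
    return best_sum, best_set


print(max_coprime_subset(20))
```

For n = 20 the best sum found is 99, which agrees with the hand-worked dynamic programming answer. The sketch makes no attempt at pruning, so it slows down quickly as n grows.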

I think that this is similar to the subset problem, which is NP-Complete.
First, break each number into its prime factors (or use a list of primes to generate the full list from 1 to n, same thing).
Solve the subset problem by recursive descent, finding a subset whose elements share no common primes.
Run through all solutions and find the largest one.

I implemented a recursive solution in Prolog, based on taking the list of integers in descending order. On my fairly ancient Toshiba laptop, SWI-Prolog produces answers without hesitation for N < 90. Here are some timings for N = 100 to 150 by tens:
N Sum Time(s)
----- --------- -------
100 1356 1.9
110 1778 2.4
120 1962 4.2
130 2273 11.8
140 2692 16.3
150 2841 30.5
The timings reflect an implementation that starts from scratch for each value of N. A lot of the computation for N+1 can be skipped if the result for N is previously known, so if a range of values N are to be computed, it would make sense to take advantage of that.
Prolog source code follows.
/*
   Check if two positive integers are coprime
   recursively via Euclid's division algorithm
*/
coprime(0,Z) :- !, Z = 1.
coprime(A,B) :-
    C is B mod A,
    coprime(C,A).

/*
   Find the sublist of first argument that are
   integers coprime to the second argument
*/
listCoprime([ ],_,[ ]).
listCoprime([H|T],X,L) :-
    (   coprime(H,X)
    ->  L = [H|M]
    ;   L = M
    ),
    listCoprime(T,X,M).

/*
   Find the sublist of first argument of coprime
   integers having the maximum possible sum
*/
sublistCoprimeMaxSum([ ],S,[ ],S).
sublistCoprimeMaxSum([H|T],A,L,S) :-
    listCoprime(T,H,R),
    B is A+H,
    sublistCoprimeMaxSum(R,B,U,Z),
    (   T = R
    ->  ( L = [H|U], S = Z )
    ;   ( sublistCoprimeMaxSum(T,A,V,W),
          (   W < Z
          ->  ( L = [H|U], S = Z )
          ;   ( L = V, S = W )
          )
        )
    ).

/* Test utility to generate list N,..,1 */
list1toN(1,[1]).
list1toN(N,[N|L]) :-
    N > 1,
    M is N-1,
    list1toN(M,L).

/* Test calling sublistCoprimeMaxSum/4 */
testCoprimeMaxSum(N,CoList,Sum) :-
    list1toN(N,L),
    sublistCoprimeMaxSum(L,0,CoList,Sum).

Related

"super ugly number" clarification

Write a program to find the nth super ugly number.
Super ugly numbers are positive numbers all of whose prime factors are in the given list primes of size k. For example, [1, 2, 4, 7, 8, 13, 14, 16, 19, 26, 28, 32] is the sequence of the first 12 super ugly numbers given primes = [2, 7, 13, 19] of size 4.
I don't understand the question. That is what I need help/clarification on:
In the above statement, why are [1, 2, 4, 7, 8, 13, 14, 16, 19, 26, 28, 32] the first 12 super ugly numbers? How is that related to the given input primes = [2, 7, 13, 19]

2 -> 2
4 -> 2 * 2
7 -> 7
8 -> 2 * 2 * 2
13 -> 13
14 -> 2 * 7
16 -> 2 * 2 * 2 * 2
19 -> 19
26 -> 2 * 13
28 -> 2 * 2 * 7
32 -> 2 * 2 * 2 * 2 * 2
Not sure why 1 is on the list. ;)
Edit: The question statement says that 1 should always be a super ugly number.

You are given a list containing a selection of prime numbers : for example, [2,7,13,19].
What you must do is take each natural integer (1, 2, ...), starting from 1, and calculate its prime factors. If all those prime factors belong to the list of "authorized" prime numbers given above, then the number is declared "super ugly".
For example, the prime factors of 14 are [2, 7], which are all in the reference list ([2,7,13,19]). So, 14 is super ugly.
Your job is to find the Nth super ugly number with that method.
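The method described in this answer can be sketched directly, if naively: walk through the natural numbers, strip out the authorized primes, and count the numbers that reduce to 1. The function names below are made up for illustration, and this brute-force check is only meant to clarify the definition, not to be an efficient solution.

```python
def is_super_ugly(x, primes):
    """True if every prime factor of x belongs to the list primes."""
    for p in primes:
        while x % p == 0:
            x //= p
    # Anything left over is a product of primes outside the list.
    return x == 1


def nth_super_ugly(n, primes):
    """Return the n-th super ugly number (1 counts as the first)."""
    count, candidate = 0, 0
    while count < n:
        candidate += 1
        if is_super_ugly(candidate, primes):
            count += 1
    return candidate


print([nth_super_ugly(i, [2, 7, 13, 19]) for i in range(1, 13)])
# [1, 2, 4, 7, 8, 13, 14, 16, 19, 26, 28, 32]
```

Note that 1 passes the check because it has no prime factors at all, which is why it heads the list.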

Retrieving elements from array regarding to an accumulating parameter

Assume that there are 2 arrays of elements and that a function call returns elements from them. Each time a retrieval is performed, 8 elements are retrieved from array 1 and 2 from array 2. The elements to be retrieved are indicated by a number provided. Assume that list 1 has 36 elements (0 through 35) and list 2 has 7; the situation will be like this:
Assume the 2 arrays are:
array 1: 0, 1, 2, 3, 4, ..., 35
array 2: 0, 1, 2, 3, 4, 5, 6
number provided   elements from array 1            elements from array 2
1                 0, 1, 2, 3, 4, 5, 6, 7           0, 1
11                8, 9, 10, 11, 12, 13, 14, 15     2, 3
21                16, 17, 18, 19, 20, 21, 22, 23   4, 5
31                24, 25, 26, 27, 28, 29, 30, 31   6
40                32, 33, 34, 35                   0, 1
46                0, 1, 2, 3, 4, 5, 6, 7           2, 3
56                8, 9, 10, 11, 12, 13, 14, 15     4, 5
66                16, 17, 18, 19, 20, 21, 22, 23   6
75                24, 25, 26, 27, 28, 29, 30, 31   0, 1
85                32, 33, 34, 35                   2, 3
...
Each time a retrieval is done, the count of elements returned is added to the last provided number to become the next provided number. If one of the lists is nearly exhausted (fewer than 8 elements remaining in list 1, or fewer than 2 in list 2), then only the remaining elements are retrieved from that list, and the next retrieval starts from index 0 again, as in the cases where 31 and 40 are passed.
The question is, is there any way to determine what position to start at in both arrays when a number is provided? E.g. when the number 40 is given, I should start at 32 in list 1 and 0 in list 2. In the above situation, list 1 is exhausted at every 5th retrieval, while list 2 is exhausted at every 4th retrieval, but since the provided number is based on the accumulated count of elements retrieved, how can I determine where to start when a number is given?
I have been thinking this for days and really feel frustrated about it. Thanks for any help!

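There is a cycle. One cycle contains total_num numbers, which we can get from the code below:
from math import gcd

def get_one_cycle_numbers(a, b):
    # Retrievals needed to exhaust each array once (the last may be short).
    n = -(-len(a) // 8)
    m = -(-len(b) // 2)
    g = gcd(n, m)
    # Both arrays restart together after lcm(n, m) = n * m / g retrievals,
    # in which a is exhausted m / g times and b is exhausted n / g times.
    total_num = len(a) * (m // g) + len(b) * (n // g)
    return total_num
When we get the provided number num, we reduce it with num = (num - 1) % total_num (the provided numbers in the question start at 1) and simulate the cycle from the start.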
PS: Hope I got the right understanding of the question.
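A possible end-to-end sketch of this idea follows. It assumes the provided numbers start at 1 as in the question's table (hence num - 1), that array 1 has 36 elements as the table implies, and that num is one of the numbers the retrieval process can actually produce. The names cycle_length and start_positions are invented for this example.

```python
from math import gcd


def cycle_length(len_a, len_b, take_a=8, take_b=2):
    """Total elements returned before both arrays are back at index 0."""
    n = -(-len_a // take_a)  # retrievals to exhaust array 1 (ceiling division)
    m = -(-len_b // take_b)  # retrievals to exhaust array 2
    g = gcd(n, m)
    # After lcm(n, m) retrievals, array 1 was read m/g times and array 2 n/g times.
    return len_a * (m // g) + len_b * (n // g)


def start_positions(num, len_a, len_b, take_a=8, take_b=2):
    """Return the start indices in both arrays for the provided number."""
    remaining = (num - 1) % cycle_length(len_a, len_b, take_a, take_b)
    pos_a = pos_b = 0
    # Replay retrievals from the start of the cycle until num is reached.
    while remaining > 0:
        got_a = min(take_a, len_a - pos_a)
        got_b = min(take_b, len_b - pos_b)
        remaining -= got_a + got_b
        pos_a = (pos_a + got_a) % len_a
        pos_b = (pos_b + got_b) % len_b
    return pos_a, pos_b


print(start_positions(40, 36, 7))  # (32, 0)
```

With 36 and 7 elements the cycle is 179 elements long (20 retrievals), so the simulation loop never runs more than 20 times for these sizes.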

Equality Between Base 10 and Base 16

From my textbook:
What does it mean when it says 37 subscript(16) = 55 subscript(10)?

It means 37 base 16 (Hexadecimal), and 55 base 10 (Decimal). The 0x preceding a number denotes that it is base 16 hexadecimal.
To see how they are equal, let's first look at the place values of 55:
5, 5 (digits)
10, 1 (place values)
Each place value is 10 raised to the number of places counted from the right, so 10^0 = 1 for the ones and 10^1 = 10 for the tens.
You have a 5 in the ones place, giving you 5, and a 5 in the tens place, giving you 50; add them together and you get 55.
5 * 10 = 50
5 * 1 = 5
5 + 50 = 55
The 37 is in hexadecimal, which means its base is 16, so the place values are 16 raised to the number of places counted from the right, which gives you
3, 7 (digits)
16, 1 (place values)
3 * 16 = 48
7 * 1 = 7
48 + 7 = 55
Because the hexadecimal system requires 16 unique numerals, it uses the letters a-f as well:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
You might also see these prefixes: 0b denotes base 2 (binary), and 0o denotes base 8 (octal).
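The place-value arithmetic above can be checked with a few lines of Python. The helper to_decimal is a made-up name for this sketch and does no validation of its input; the built-ins int, hex, bin and oct are standard.

```python
def to_decimal(digits, base):
    """Sum digit * base**position, working left to right."""
    value = 0
    for ch in digits.lower():
        # Each step shifts the running value one place and adds the new digit.
        value = value * base + "0123456789abcdef".index(ch)
    return value


print(to_decimal("37", 16))  # 3 * 16 + 7 * 1 = 55
print(to_decimal("55", 10))  # 5 * 10 + 5 * 1 = 55
print(int("37", 16), hex(55), bin(55), oct(55))
```

The last line shows the built-in conversions, including the 0x, 0b and 0o prefixes mentioned above.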

Spread objects evenly over multiple collections

The scenario is that there are n objects, of different sizes, unevenly spread over m buckets. The size of a bucket is the sum of all of the object sizes that it contains. It now happens that the sizes of the buckets are varying wildly.
What would be a good algorithm if I want to spread those objects evenly over those buckets so that the total size of each bucket would be about the same? It would be nice if the algorithm leaned towards less move size over a perfectly even spread.
I have this naïve, ineffective, and buggy solution in Ruby.
buckets = [ [10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2] ]
avg_size = buckets.flatten.reduce(:+) / buckets.count + 1
large_buckets = buckets.take_while {|arr| arr.reduce(:+) >= avg_size}.to_a
large_buckets.each do |large|
  smallest = buckets.last
  until ((small_sum = smallest.reduce(:+)) >= avg_size)
    break if small_sum + large.last >= avg_size
    smallest << large.pop
  end
  buckets.insert(0, buckets.pop)
end
=> [[3, 1, 1, 1, 2, 3], [2, 1, 2, 3, 3], [10, 4], [5, 5]]

I believe this is a variant of the bin packing problem, and as such it is NP-hard. Your answer is essentially a variant of the first fit decreasing heuristic, which is a pretty good heuristic. That said, I believe that the following will give better results.
Sort each individual bucket in descending size order, using a balanced binary tree.
Calculate average size.
Sort the buckets with size less than average (the "too-small buckets") in descending size order, using a balanced binary tree.
Sort the buckets with size greater than average (the "too-large buckets") in order of the size of their greatest elements, using a balanced binary tree (so the bucket with {9, 1} would come first and the bucket with {8, 5} would come second).
Pass1: Remove the largest element from the bucket with the largest element; if this reduces its size below the average, then replace the removed element and remove the bucket from the balanced binary tree of "too-large buckets"; else place the element in the smallest bucket, and re-index the two modified buckets to reflect the new smallest bucket and the new "too-large bucket" with the largest element. Continue iterating until you've removed all of the "too-large buckets."
Pass2: Iterate through the "too-small buckets" from smallest to largest, and select the best-fitting elements from the largest "too-large bucket" without causing it to become a "too-small bucket;" iterate through the remaining "too-large buckets" from largest to smallest, removing the best-fitting elements from them without causing them to become "too-small buckets." Do the same for the remaining "too-small buckets." The results of this variant won't be as good as they are for the more complex variant because it won't shift buckets from the "too-large" to the "too-small" category or vice versa (hence the search space will be smaller), but this also means that it has much simpler halting conditions (simply iterate through all of the "too-small" buckets and then halt), whereas the complex variant might cause an infinite loop if you're not careful.
The idea is that by moving the largest elements in Pass1 you make it easier to more precisely match up the buckets' sizes in Pass2. You use balanced binary trees so that you can quickly re-index the buckets or the trees of buckets after removing or adding an element, but you could use linked lists instead (the balanced binary trees would have better worst-case performance but the linked lists might have better average-case performance). By performing a best-fit instead of a first-fit in Pass2 you're less likely to perform useless moves (e.g. moving a size-10 object from a bucket that's 5 greater than average into a bucket that's 5 less than average - first fit would blindly perform the move, best-fit would either query the next "too-large bucket" for a better-sized object or else would remove the "too-small bucket" from the bucket tree).

I ended up with something like this.
Sort the buckets in descending size order.
Sort each individual bucket in descending size order.
Calculate average size.
Iterate over each bucket with a size larger than average size.
Move objects in size order from those buckets to the smallest bucket until either the large bucket is smaller than average size or the target bucket reaches average size.
Ruby code example
require 'pp'

def average_size(buckets)
  (buckets.flatten.reduce(:+).to_f / buckets.count + 0.5).to_i
end

def spread_evenly(buckets)
  average = average_size(buckets)
  large_buckets = buckets.take_while {|arr| arr.reduce(:+) >= average}.to_a
  large_buckets.each do |large_bucket|
    smallest_bucket = buckets.last
    smallest_size = smallest_bucket.reduce(:+)
    large_size = large_bucket.reduce(:+)
    until (smallest_size >= average)
      break if large_size <= average
      if smallest_size + large_bucket.last > average and large_size > average
        buckets.unshift buckets.pop
        smallest_bucket = buckets.last
        smallest_size = smallest_bucket.reduce(:+)
      end
      smallest_size += smallest_object = large_bucket.pop
      large_size -= smallest_object
      smallest_bucket << smallest_object
    end
    buckets.unshift buckets.pop if smallest_size >= average
  end
  buckets
end

test_buckets = [
  [ [10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2] ],
  [ [4, 3, 3, 2, 2, 2, 2, 1, 1], [10, 5, 3, 2, 1], [3, 3, 3], [6] ],
  [ [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1] ],
  [ [10, 9, 8, 7], [6, 5, 4], [3, 2], [1] ],
]

test_buckets.each do |buckets|
  puts "Before spread with average of #{average_size(buckets)}:"
  pp buckets
  result = spread_evenly(buckets)
  puts "Result and sum of each bucket:"
  pp result
  sizes = result.map {|bucket| bucket.reduce :+}
  pp sizes
  puts
end
Output:
Before spread with average of 12:
[[10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2]]
Result and sum of each bucket:
[[3, 1, 1, 4, 1, 2], [2, 1, 2, 3, 3], [10], [5, 5, 3]]
[12, 11, 10, 13]
Before spread with average of 14:
[[4, 3, 3, 2, 2, 2, 2, 1, 1], [10, 5, 3, 2, 1], [3, 3, 3], [6]]
Result and sum of each bucket:
[[3, 3, 3, 2, 3], [6, 1, 1, 2, 2, 1], [4, 3, 3, 2, 2], [10, 5]]
[14, 13, 14, 15]
Before spread with average of 4:
[[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]]
Result and sum of each bucket:
[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
[4, 4, 4, 4, 4]
Before spread with average of 14:
[[10, 9, 8, 7], [6, 5, 4], [3, 2], [1]]
Result and sum of each bucket:
[[1, 7, 9], [10], [6, 5, 4], [3, 2, 8]]
[17, 10, 15, 13]

This isn't bin packing as others have suggested. There the size of bins is fixed and you are trying to minimize the number. Here you are trying to minimize the variance among a fixed number of bins.
It turns out this is equivalent to Multiprocessor Scheduling, and - according to the reference - the algorithm below (known as "Longest Job First" or "Longest Processing Time First") is certain to produce a largest sum no more than 4/3 - 1/(3m) times optimal, where m is the number of buckets. In the test cases shown, we'd have 4/3 - 1/12 = 5/4, or no more than 25% above optimal.
We just start with all bins empty, and put each item, in decreasing order of size, into the currently least full bin. We can track the least full bin efficiently with a min heap. With a heap of m entries having O(log m) insert and deletemin, the placement loop takes O(n log m) time after the initial O(n log n) sort (n and m defined as @Jonas Elfström says). Ruby is very expressive here: only 9 sloc for the algorithm itself.
Here is code. I am not a Ruby expert, so please feel free to suggest better ways. I am using @Jonas Elfström's test cases.
require 'algorithms'
require 'pp'

test_buckets = [
  [ [10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2] ],
  [ [4, 3, 3, 2, 2, 2, 2, 1, 1], [10, 5, 3, 2, 1], [3, 3, 3], [6] ],
  [ [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1] ],
  [ [10, 9, 8, 7], [6, 5, 4], [3, 2], [1] ],
]

def relevel(buckets)
  q = Containers::PriorityQueue.new { |x, y| x < y }
  # Initially all buckets to be returned are empty and so have zero sums.
  rtn = Array.new(buckets.length) { [] }
  buckets.each_index {|i| q.push(i, 0) }
  sums = Array.new(buckets.length, 0)
  # Add to emptiest bucket in descending order.
  # Bang! ops would generate less garbage.
  buckets.flatten.sort.reverse.each do |val|
    i = q.pop                  # Get index of emptiest bucket
    rtn[i] << val              # Append current value to it
    q.push(i, sums[i] += val)  # Update sums and min heap
  end
  rtn
end

test_buckets.each {|b| pp relevel(b).map {|a| a.inject(:+) }}
Results:
[12, 11, 11, 12]
[14, 14, 14, 14]
[4, 4, 4, 4, 4]
[13, 13, 15, 14]
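For readers more comfortable with Python, the same Longest Processing Time First idea can be sketched with the standard heapq module. This is a loose translation rather than the Ruby code above: ties between equally full buckets are broken by bucket index here, so the order of the bucket sums may differ from the Ruby results even though the sums themselves agree on these test cases.

```python
import heapq


def relevel(buckets):
    """Redistribute all items, largest first, into the least full bucket."""
    items = sorted((x for bucket in buckets for x in bucket), reverse=True)
    # Heap entries are (current sum, bucket index); all buckets start empty.
    heap = [(0, i) for i in range(len(buckets))]
    heapq.heapify(heap)
    result = [[] for _ in buckets]
    for item in items:
        total, i = heapq.heappop(heap)  # least full bucket
        result[i].append(item)
        heapq.heappush(heap, (total + item, i))
    return result


test = [[10, 4, 3, 3, 2, 1], [5, 5, 3, 2, 1], [3, 1, 1], [2]]
print([sum(bucket) for bucket in relevel(test)])
```

Like the Ruby version, this ignores where the objects started, so it does not try to minimize the amount of data moved.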

You could use my answer to fitting n variable height images into 3 (similar length) column layout.
Mentally map:
Object size to picture height, and
bucket count to bincount
Then the rest of that solution should apply...
The following uses the first_fit algorithm mentioned by Robin Green earlier but then improves on this by greedy swapping.
The swapping routine finds the column that is furthest away from the average column height then systematically looks for a swap between one of its pictures and the first picture in another column that minimizes the maximum deviation from the average.
I used a random sample of 30 pictures with heights in the range five to 50 'units'. The convergence was swift in my case and improved significantly on the first_fit algorithm.
The code (Python 3.2):
def first_fit(items, bincount=3):
    items = sorted(items, reverse=1) # New - improves first fit.
    bins = [[] for c in range(bincount)]
    binsizes = [0] * bincount
    for item in items:
        minbinindex = binsizes.index(min(binsizes))
        bins[minbinindex].append(item)
        binsizes[minbinindex] += item
    average = sum(binsizes) / float(bincount)
    maxdeviation = max(abs(average - bs) for bs in binsizes)
    return bins, binsizes, average, maxdeviation

def swap1(columns, colsize, average, margin=0):
    'See if you can do a swap to smooth the heights'
    colcount = len(columns)
    maxdeviation, i_a = max((abs(average - cs), i)
                            for i,cs in enumerate(colsize))
    col_a = columns[i_a]
    for pic_a in set(col_a): # use set as if same height then only do once
        for i_b, col_b in enumerate(columns):
            if i_a != i_b: # Not same column
                for pic_b in set(col_b):
                    if (abs(pic_a - pic_b) > margin): # Not same heights
                        # new heights if swapped
                        new_a = colsize[i_a] - pic_a + pic_b
                        new_b = colsize[i_b] - pic_b + pic_a
                        if all(abs(average - new) < maxdeviation
                               for new in (new_a, new_b)):
                            # Better to swap (in-place)
                            colsize[i_a] = new_a
                            colsize[i_b] = new_b
                            columns[i_a].remove(pic_a)
                            columns[i_a].append(pic_b)
                            columns[i_b].remove(pic_b)
                            columns[i_b].append(pic_a)
                            maxdeviation = max(abs(average - cs)
                                               for cs in colsize)
                            return True, maxdeviation
    return False, maxdeviation

def printit(columns, colsize, average, maxdeviation):
    print('columns')
    pp(columns)
    print('colsize:', colsize)
    print('average, maxdeviation:', average, maxdeviation)
    print('deviations:', [abs(average - cs) for cs in colsize])
    print()

if __name__ == '__main__':
    ## Some data
    #import random
    #heights = [random.randint(5, 50) for i in range(30)]
    ## Here's some from the above, but 'fixed'.
    from pprint import pprint as pp
    heights = [45, 7, 46, 34, 12, 12, 34, 19, 17, 41,
               28, 9, 37, 32, 30, 44, 17, 16, 44, 7,
               23, 30, 36, 5, 40, 20, 28, 42, 8, 38]
    columns, colsize, average, maxdeviation = first_fit(heights)
    printit(columns, colsize, average, maxdeviation)
    while 1:
        swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
        printit(columns, colsize, average, maxdeviation)
        if not swapped:
            break
        #input('Paused: ')
The output:
columns
[[45, 12, 17, 28, 32, 17, 44, 5, 40, 8, 38],
[7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
[46, 34, 9, 37, 44, 30, 20, 28]]
colsize: [286, 267, 248]
average, maxdeviation: 267.0 19.0
deviations: [19.0, 0.0, 19.0]
columns
[[45, 12, 17, 28, 17, 44, 5, 40, 8, 38, 9],
[7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
[46, 34, 37, 44, 30, 20, 28, 32]]
colsize: [263, 267, 271]
average, maxdeviation: 267.0 4.0
deviations: [4.0, 0.0, 4.0]
columns
[[45, 12, 17, 17, 44, 5, 40, 8, 38, 9, 34],
[7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
[46, 37, 44, 30, 20, 28, 32, 28]]
colsize: [269, 267, 265]
average, maxdeviation: 267.0 2.0
deviations: [2.0, 0.0, 2.0]
columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
[7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
[46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]
columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
[7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
[46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]
Nice problem.
Here's the info on reverse-sorting mentioned in my separate comment below.
>>> h = sorted(heights, reverse=1)
>>> h
[46, 45, 44, 44, 42, 41, 40, 38, 37, 36, 34, 34, 32, 30, 30, 28, 28, 23, 20, 19, 17, 17, 16, 12, 12, 9, 8, 7, 7, 5]
>>> columns, colsize, average, maxdeviation = first_fit(h)
>>> printit(columns, colsize, average, maxdeviation)
columns
[[46, 41, 40, 34, 30, 28, 19, 12, 12, 5],
[45, 42, 38, 36, 30, 28, 17, 16, 8, 7],
[44, 44, 37, 34, 32, 23, 20, 17, 9, 7]]
colsize: [267, 267, 267]
average, maxdeviation: 267.0 0.0
deviations: [0.0, 0.0, 0.0]
If you have the reverse-sorting, this extra code appended to the bottom of the above code (inside the if __name__ == '__main__': block, with the import random line uncommented) will do extra trials on random data:
    for trial in range(2,11):
        print('\n## Trial %i' % trial)
        heights = [random.randint(5, 50) for i in range(random.randint(5, 50))]
        print('Pictures:',len(heights))
        columns, colsize, average, maxdeviation = first_fit(heights)
        print('average %7.3f' % average, '\nmaxdeviation:')
        print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
        swapcount = 0
        while maxdeviation:
            swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
            if not swapped:
                break
            print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
            swapcount += 1
        print('swaps:', swapcount)
The extra output shows the effect of the swaps:
## Trial 2
Pictures: 11
average 72.000
maxdeviation:
9.72% = 7.000
swaps: 0
## Trial 3
Pictures: 14
average 118.667
maxdeviation:
6.46% = 7.667
4.78% = 5.667
3.09% = 3.667
0.56% = 0.667
swaps: 3
## Trial 4
Pictures: 46
average 470.333
maxdeviation:
0.57% = 2.667
0.35% = 1.667
0.14% = 0.667
swaps: 2
## Trial 5
Pictures: 40
average 388.667
maxdeviation:
0.43% = 1.667
0.17% = 0.667
swaps: 1
## Trial 6
Pictures: 5
average 44.000
maxdeviation:
4.55% = 2.000
swaps: 0
## Trial 7
Pictures: 30
average 295.000
maxdeviation:
0.34% = 1.000
swaps: 0
## Trial 8
Pictures: 43
average 413.000
maxdeviation:
0.97% = 4.000
0.73% = 3.000
0.48% = 2.000
swaps: 2
## Trial 9
Pictures: 33
average 342.000
maxdeviation:
0.29% = 1.000
swaps: 0
## Trial 10
Pictures: 26
average 233.333
maxdeviation:
2.29% = 5.333
1.86% = 4.333
1.43% = 3.333
1.00% = 2.333
0.57% = 1.333
swaps: 4

Adapt one of the Knapsack Problem solving algorithms by, for example, specifying the "weight" of every bucket to be roughly equal to the mean of the n objects' sizes (try a Gaussian distribution around the mean value).
http://en.wikipedia.org/wiki/Knapsack_problem#Solving

Sort buckets in size order.
Move an object from the largest bucket into the smallest bucket, re-sorting the array (which is almost-sorted, so we can use "limited insertion sort" in both directions; you can also speed things up by noting where you placed the last two buckets to be sorted. If you have 6-6-6-6-6-6-5... and get one object from the first bucket, you will move it to the sixth position. Then on the next iteration you can start comparing from the fifth. The same goes, right-to-left, for the smallest buckets).
When the difference of the two buckets is one, you can stop.
This moves the minimum number of buckets, but is of order n^2 log n for comparisons (the simplest version is n^3 log n). If object moving is expensive while bucket size checking is not, for reasonable n it might still do:
12 7 5 2
11 7 5 3
10 7 5 4
9 7 5 5
8 7 6 5
7 7 6 6
12 7 3 1
11 7 3 2
10 7 3 3
9 7 4 3
8 7 4 4
7 7 5 4
7 6 5 5
6 6 6 5
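The two traces above can be reproduced with a small sketch. It assumes, as the traces do, that every object has size 1, so a bucket is represented only by its object count. The function name balance_by_unit_moves is invented here, and a full re-sort stands in for the limited insertion sort described in the text.

```python
def balance_by_unit_moves(sizes):
    """Move one unit at a time from the largest to the smallest bucket."""
    sizes = sorted(sizes, reverse=True)
    moves = 0
    # Stop once the largest and smallest buckets differ by at most one.
    while sizes[0] - sizes[-1] > 1:
        sizes[0] -= 1
        sizes[-1] += 1
        sizes.sort(reverse=True)  # the array is almost sorted already
        moves += 1
    return sizes, moves


print(balance_by_unit_moves([12, 7, 5, 2]))  # ([7, 7, 6, 6], 5)
print(balance_by_unit_moves([12, 7, 3, 1]))  # ([6, 6, 6, 5], 7)
```

The move counts match the number of steps in each trace; objects of unequal size would need a different stopping rule.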
Another possibility would be to calculate the expected average size for every bucket, and "move along" a bag (or a further bucket) with the excess from the larger buckets to the smaller ones.
Otherwise, strange things may happen:
12 7 3 1, the average is a bit less than 6, so we take 5 as the average.
5 7 3 1 bag = 7 from 1st bucket
5 5 3 1 bag = 9
5 5 5 1 bag = 7
5 5 5 8 which is a bit unbalanced.
By taking 6 (i.e. rounding) it goes better, but again sometimes it won't work:
12 5 3 1
6 5 3 1 bag = 6 from 1st bucket
6 6 3 1 bag = 5
6 6 6 1 bag = 2
6 6 6 3 which again is unbalanced.
You can run two passes, the first with the rounded mean left-to-right, the other with the truncated mean right-to-left:
12 5 3 1 we want to get no more than 6 in each bucket
6 11 3 1
6 6 8 1
6 6 6 3
6 6 6 3 and now we want to get at least 5 in each bucket
6 6 4 5 (we have taken 2 from bucket #3 into bucket #4)
6 5 5 5 (when the difference is 1 we stop).
This will require "n log n" size checks, and no more than 2n object moves.
Another possibility which is interesting is to reason thus: you have m objects into n buckets. So you need to do an integer mapping of m onto n, and this is Bresenham's linearization algorithm. Run a (n,m) Bresenham on the sorted array, and at step i (i.e. against bucket i-th) the algorithm will tell you whether to use round(m/n) or floor(m/n) size. Then move objects from or to the "moving bag" according to bucket i-th size.
This requires n log n comparisons.
You can further reduce the number of object moves by initially removing all buckets that are either round(m/n) or floor(m/n) in size to two pools of buckets sized R or F. When, running the algorithm, you need the i-th bucket to hold R objects, if the pool of R objects is not empty, swap the i-th bucket with one of the R-sized ones. This way, only buckets that are hopelessly under- or over-sized get balanced; (most of) the others are simply ignored, except for their references being shuffled.
If object access time is huge in proportion to computation time (e.g. some kind of automatic loader magazine), this will yield a magazine that is as balanced as possible, with the absolute minimum of overall object moves.

You could use an Integer Programming Package if it's fast enough.
It may be tricky getting your constraints right. Something like the following may do the trick:
let variable Oij denote Object i being in Bucket j. Let Wi represent the weight or size of Oi
Constraints:
sum(Oij for all j) == 1 #each object is in only one bucket
Oij = 1 or 0. #object is either in bucket j or not in bucket j
sum(Oij * Wi for all i) <= X + R #restrict weight on buckets.
Objective:
minimize X
Note R is the relaxation constant that you can play with depending on how much movement is required and how much performance is needed.
Now the maximum bucket size is X + R
The next step is to figure out the minimum amount movement possible whilst keeping the bucket size less than X + R
Define a Stay variable Si that controls if Oi stays in bucket j
If Si is 0 it indicates that Oi stays where it was.
Constraints:
Si = 1 or 0.
Oij = 1 or 0.
Oij <= Si where j != original bucket of Object i
Oij != Si where j == original bucket of Object i
Sum(Oij for all j) == 1
Sum(Oij * Wi for all i) <= X + R
Objective:
minimize Sum(Si for all i)
Here Sum(Si for all i) represents the number of objects that have moved.

Computing a moving maximum [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Find the min number in all contiguous subarrays of size l of a array of size n
I have a (large) array of numeric data (size N) and would like to compute an array of running maximums with a fixed window size w.
More directly, I can define a new array out[k-w+1] = max{data[k-w+1,...,k]} for k >= w-1 (this assumes 0-based arrays, as in C++).
Is there a better way to do this than N log(w)?
[I'm hoping there should be a linear one in N without dependence on w, like for moving average, but cannot find it. For N log(w) I think there is a way to manage with a sorted data structure which will do insert(), delete() and extract_max() altogether in log(w) or less on a structure of size w -- like a sorted binary tree, for example].
Thank you very much.

There is indeed an algorithm that can do this in O(N) time with no dependence on the window size w. The idea is to use a clever data structure that supports the following operations:
Enqueue, which adds a new element to the structure,
Dequeue, which removes the oldest element from the structure, and
Find-max, which returns (but does not remove) the maximum element from the structure.
This is essentially a queue data structure that supports access (but not removal) of the maximum element. Amazingly, as seen in this earlier question, it is possible to implement this data structure such that each of these operations runs in amortized O(1) time. As a result, if you use this structure to enqueue w elements, then continuously dequeue and enqueue another element into the structure while calling find-max as needed, it will take only O(n + Q) time, where Q is the number of queries you make. If you only care about the maximum of each window once, this ends up being O(n), with no dependence on the window size.
Hope this helps!
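One common way to get this behaviour is a deque of indices whose values are kept in decreasing order, so that the front is always the current window's maximum. The answer does not say which structure the linked question uses, so treat the sketch below as one possible realization; sliding_max is a made-up name.

```python
from collections import deque


def sliding_max(data, w):
    """Return the maximum of every length-w window of data in O(N) total."""
    window = deque()  # indices into data; their values are decreasing
    out = []
    for k, value in enumerate(data):
        # Enqueue: smaller or equal older values can never be a maximum again.
        while window and data[window[-1]] <= value:
            window.pop()
        window.append(k)
        # Dequeue: drop the front index once it falls out of the window.
        if window[0] <= k - w:
            window.popleft()
        # Find-max: the front of the deque is the window maximum.
        if k >= w - 1:
            out.append(data[window[0]])
    return out


print(sliding_max([21, 17, 16, 7, 3, 9, 11, 18], 4))  # [21, 17, 16, 11, 18]
```

Each index is appended and removed at most once, which is where the amortized O(1) cost per element comes from.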

I'll demonstrate how to do it with the list:
L = [21, 17, 16, 7, 3, 9, 11, 18, 19, 5, 10, 23, 20, 15, 4, 14, 1, 2, 22, 13, 8, 12, 6]
with length N=23 and W = 4.
Make two new copies of your list:
L1 = [21, 17, 16, 7, 3, 9, 11, 18, 19, 5, 10, 23, 20, 15, 4, 14, 1, 2, 22, 13, 8, 12, 6]
L2 = [21, 17, 16, 7, 3, 9, 11, 18, 19, 5, 10, 23, 20, 15, 4, 14, 1, 2, 22, 13, 8, 12, 6]
Loop from i=0 to N-1. If i is not divisible by W, then replace L1[i] with max(L1[i],L1[i-1]).
L1 = [21, 21, 21, 21, | 3, 9, 11, 18, | 19, 19, 19, 23 | 20, 20, 20, 20 | 1, 2, 22, 22 | 8, 12, 12]
Loop from i=N-2 down to 0. If i+1 is not divisible by W, then replace L2[i] with max(L2[i], L2[i+1]).
L2 = [21, 17, 16, 7 | 18, 18, 18, 18 | 23, 23, 23, 23 | 20, 15, 14, 14 | 22, 22, 22, 13 | 12, 12, 6]
Make a list L3 of length N + 1 - W, so that L3[i] = max(L2[i], L1[i + W - 1])
L3 = [21, 17, 16, 11 | 18, 19, 19, 19 | 23, 23, 23, 23 | 20, 15, 14, 22 | 22, 22, 22, 13]
Then this list L3 is the moving maxima you seek: L2[i] is the maximum of the range between i and the next vertical line, while L1[i + W - 1] is the maximum of the range between the vertical line and i + W - 1.
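The three passes above translate almost line for line into Python. The function name moving_max_blocks and the variable names left and right (standing in for L1 and L2) are chosen for this sketch only.

```python
def moving_max_blocks(data, w):
    """Moving maximum via per-block prefix maxima and suffix maxima."""
    n = len(data)
    # left[i]: maximum from the start of i's block up to i (L1 above).
    left = list(data)
    for i in range(1, n):
        if i % w != 0:
            left[i] = max(left[i], left[i - 1])
    # right[i]: maximum from i up to the end of i's block (L2 above).
    right = list(data)
    for i in range(n - 2, -1, -1):
        if (i + 1) % w != 0:
            right[i] = max(right[i], right[i + 1])
    # A window [i, i + w - 1] spans at most two blocks.
    return [max(right[i], left[i + w - 1]) for i in range(n - w + 1)]


L = [21, 17, 16, 7, 3, 9, 11, 18, 19, 5, 10, 23,
     20, 15, 4, 14, 1, 2, 22, 13, 8, 12, 6]
print(moving_max_blocks(L, 4))
```

Run on the example list with W = 4, this prints the same L3 as the worked example.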
