I have N object pairs (a master copy and a slave copy), all with the same size. I wish to distribute the copies among M bins, each with a different capacity, so that no bin contains both the master and the slave copy of the same pair.
What's the most efficient algorithm? And more importantly what's the most efficient algorithm to find out if there is a possible solution for a given input (without actually generating the solution)?
Hard to imagine anything better than brute force: track the M bins in a priority queue by descending remaining capacity, add each object pair to the first two bins in the queue, then rebalance the queue and repeat. A solution exists if the total capacity of the M bins is at least 2*N and the total capacity excluding the largest bin is at least N (otherwise some pair would have to put both copies in the largest bin; see the notes below).
That would seem to be complexity O(N * log M)
Note: For exactly three bins, no solution exists for N > M1 + M2, where Mn is the capacity of the n-th bin when the bins are sorted by descending capacity (n in 0..2), regardless of the capacity M0.
Likewise for exactly 2 bins, solutions exist only for N <= M1.
A simple solution is:
Sort the M buckets in descending order of capacity: x1, x2, ..., xm.
Pick the two topmost buckets, assign the two copies of an object pair to those two buckets, decrement their available capacities, and rearrange the buckets. If you use a heap to keep track of the buckets, the overall complexity is O(N log M).
Keep repeating until all the objects are allocated; a sketch in code follows below.
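A minimal sketch of that greedy strategy in Python, assuming each copy occupies one unit of a bin's capacity (the function and variable names are mine):

```python
import heapq

def assign_pairs(num_pairs, capacities):
    """Greedily place each (master, slave) pair into the two bins with the most
    remaining capacity. Returns a list of (master_bin, slave_bin) index pairs,
    or None if the pairs cannot all be placed."""
    # Max-heap via negated remaining capacities; skip bins with zero capacity.
    heap = [(-c, i) for i, c in enumerate(capacities) if c > 0]
    heapq.heapify(heap)
    assignment = []
    for _ in range(num_pairs):
        if len(heap) < 2:
            return None                      # fewer than two non-full bins left
        neg_a, a = heapq.heappop(heap)       # bin with the largest remaining capacity
        neg_b, b = heapq.heappop(heap)       # second largest
        assignment.append((a, b))
        if -neg_a > 1:                       # push back with one unit consumed
            heapq.heappush(heap, (neg_a + 1, a))
        if -neg_b > 1:
            heapq.heappush(heap, (neg_b + 1, b))
    return assignment

print(assign_pairs(2, [4, 1, 1]))   # [(0, 1), (0, 2)]: feasible
print(assign_pairs(3, [4, 1, 1]))   # None: the two small bins cannot hold 3 slave copies
```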
You work in an electronic exchange. Throughout the day, you receive ticks (trading data), each consisting of a product name and its traded volume. E.g. {name: vodafone, volume: 20}
What data structure will you maintain if:
You have to tell top k products traded by volume at end of day.
You have to tell top k products traded by volume throughout the day.
What's the most efficient solution that you can think of?
The most efficient solution I could think of was to use a heap and a map for both situations:
a heap to store stocks by decreasing volume (updating: O(log n), getting the top k: O(k))
a map to track each stock's volume (updating: O(1))
What you're looking for is a kind of map or dictionary which supports the following queries:
Add(key, x): add x to the total for that key, creating a new entry if it doesn't already exist.
GetKLargest(k): return the keys/totals for the k largest entries.
Let's say Q is the number of queries, and n is the number of distinct keys. We should assume that Q is much larger than n; choosing the NYSE as an example, there are a few thousand stocks traded, and a few million trades per day.
In the first scenario we assume that there are a large number of Add queries followed by one GetKLargest query. Since the cost of the Add query dominates, we can use a hashtable so that Add takes O(1) time, and then at the end of the day we can do GetKLargest in O(n log k) time using a priority queue of size k; note that we don't need to sort the whole key-set in O(n log n) time just to find the k largest elements. The total cost of answering Q queries is O(Q + n log k).
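A quick sketch of this first scenario (the tick data is made up; `heapq.nlargest` plays the role of the size-k priority queue, giving the O(n log k) GetKLargest at the end of the day):

```python
import heapq
from collections import defaultdict

# Hypothetical tick stream: (name, volume) pairs.
ticks = [("vodafone", 20), ("bp", 5), ("vodafone", 15), ("shell", 40)]

totals = defaultdict(int)
for name, volume in ticks:          # Add(key, x): O(1) amortized per tick
    totals[name] += volume

k = 2
# GetKLargest(k): O(n log k) over the n distinct keys, using a bounded heap.
top_k = heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
print(top_k)   # [('shell', 40), ('vodafone', 35)]
```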
In the second scenario, we assume there could be a large number of both kinds of query. The cost of either query could dominate. A good option is to use an order statistic tree, which supports Add in O(log n) time, and GetKLargest in O(k log n) time. To look up a company by name in the tree requires a separate index, which can be maintained as a hashtable. The total cost is O(Qk log n) in the worst case.
If k is fixed or has a fixed limit, we can do better: keep the totals in a hashtable, but also maintain a priority queue of the current top k elements alongside. The cost of the Add query is now O(log k) because of maintaining the priority queue; to do this efficiently we need the map to also store the current index of each company in the priority queue, if it's there, otherwise searching the priority queue for the right company is O(k). The cost of GetKLargest is O(k) since we just output the contents of the priority queue. (The problem doesn't say we need to output them in order. If we do, then we can use a sorted array instead of a heap for the priority queue, and Add takes O(k) time.)
In this case, the total cost of answering Q queries is O(Qk). Note that this only works if we know in advance the maximum value of k that could be queried, before the query arrives; otherwise we don't know how big to make the priority queue.
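Here is a minimal sketch of the fixed-k variant, using the sorted-array flavour mentioned at the end so that Add is O(k) and GetKLargest comes out already ordered. The class and method names are illustrative, not from the question, and it assumes traded volumes are non-negative so running totals only grow:

```python
class TopKTracker:
    """Illustrative fixed-k tracker: a hashtable of totals plus a sorted array
    of at most k (volume, name) entries maintained alongside it."""
    def __init__(self, k):
        self.k = k
        self.totals = {}     # name -> running volume
        self.top = []        # up to k (volume, name) pairs, descending

    def add(self, name, volume):
        total = self.totals.get(name, 0) + volume
        self.totals[name] = total
        # Drop any stale entry for this name, then re-insert it at its new rank: O(k).
        self.top = [(v, n) for v, n in self.top if n != name]
        self.top.append((total, name))
        self.top.sort(reverse=True)
        del self.top[self.k:]            # keep only the k largest

    def get_k_largest(self):
        return [(n, v) for v, n in self.top]    # already in descending order: O(k)

tracker = TopKTracker(k=2)
for name, volume in [("vodafone", 20), ("bp", 5), ("vodafone", 15), ("shell", 40)]:
    tracker.add(name, volume)
print(tracker.get_k_largest())    # [('shell', 40), ('vodafone', 35)]
```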
I have a stack of objects with integer values (weights). I want to buy a truck with capacity N (N is unknown) so that I can be sure I can transport all the objects in at most X trips.
X is known.
In other words, I have to partition the stack (the order of objects must be maintained) into at most X contiguous groups, each with sum not exceeding N, and find the minimum such N.
Can you help me with an algorithm or an idea, please?
Thanks.
If, as you state, "the order of objects must be maintained," then we can solve this with binary-search on N, in O(|objects| * log m), where m is the total sum.
The paper #hlt linked to in a comment, about multiway number partitioning, would apply if the objects' order could be rearranged. In this case, where the order is fixed, we can just try different values of N, packing each partition as full as we can. If the N we pick is too small, we'll end up needing more than X partitions. This makes feasibility monotone in N, so N is ordered and therefore searchable.
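A sketch of that binary search; the greedy helper packs each trip as full as possible, which is what makes the number of trips needed monotone in N (names are mine):

```python
def min_capacity(weights, max_trips):
    """Smallest N such that the weights, kept in order, can be split into at
    most max_trips contiguous groups whose sums do not exceed N."""
    def trips_needed(capacity):
        trips, load = 1, 0
        for w in weights:
            if load + w > capacity:      # current trip is full: start a new one
                trips += 1
                load = 0
            load += w
        return trips

    lo, hi = max(weights), sum(weights)  # N is at least the heaviest item, at most the total
    while lo < hi:
        mid = (lo + hi) // 2
        if trips_needed(mid) <= max_trips:
            hi = mid                     # mid works, try something smaller
        else:
            lo = mid + 1                 # too small a truck
    return lo

print(min_capacity([7, 2, 5, 10, 8], max_trips=2))   # 18: [7, 2, 5] and [10, 8]
```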
Let's say we have two arrays of ints of equal length, a_1, ..., a_n and b_1, ..., b_n. For any given index pair (i, j) with 1 <= i < j <= n, we need to find the maximum, over all sequences of the form a_k, ..., a_{l-1}, b_l, ..., b_{j-i+k} with 0 <= k <= n-j+i, of the minimum element of the sequence. Here l may equal j-i+k+1, i.e. the sequence is purely from array a; when k = 0, the sequence is purely from array b.
We want to do this for all pairs of i and j very efficiently.
Example, given
`a=[3,2,4,1]` and `b=[4,6,1,3]`
when `i=1, j=3`, the sequence can be
`[3,2,4]`, min is 2
`[3,2,1]`, min is 1
`[3,6,1]`, min is 1
`[2,4,1]`, min is 1
`[2,4,3]`, min is 2
`[2,1,3]`, min is 1
`[4,6,1]`, min is 1
`[6,1,3]`, min is 1
So the max is 2 for this input.
Is there a good way to run this efficiently?
It seems possible to make the brute force approach run fairly quickly.
If you preprocess each sequence into a balanced tree where each node is augmented with the min of that subtree, then you can find the min of any subrange of that sequence in O(log n) time by splitting the tree at the appropriate points. See, for example, this paper for more information. Note that this preprocessing takes O(n) time.
Let's call the range (i,j) a window. The complexity of this problem doesn't depend on the specific (i,j), but rather on the size of the window (that is, j-i+1). For a window size of m (=j-i+1), there are n-m+1 windows of that size. For each window, there are m+1 places where you can "cut" the window so that some prefix of the elements comes from sequence a and the suffix comes from sequence b. You pay O(log n) for each cut (to split the binary trees as I mentioned above). That's a total cost of O((n-m+1) * (m+1) * log(n)).
There is probably a faster way to do this, by reusing splits, or by noticing that nearby windows share a lot of elements. But regardless, I think the binary tree splitting trick I mentioned above might be helpful!
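To make the cost analysis concrete, here is a brute-force sketch over windows and cut points. It substitutes a sparse table for the balanced-tree splitting described above, so each range-min query is O(1) after O(n log n) preprocessing, but the window/cut loop structure is the same:

```python
def build_sparse_table(arr):
    """Sparse-table preprocessing for O(1) range-minimum queries (a stand-in
    for the balanced-tree splitting mentioned above)."""
    n = len(arr)
    table = [arr[:]]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        table.append([min(prev[i], prev[i + (1 << (j - 1))])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def range_min(table, lo, hi):
    """Minimum of arr[lo..hi] inclusive (0-indexed)."""
    j = (hi - lo + 1).bit_length() - 1
    return min(table[j][lo], table[j][hi - (1 << j) + 1])

def max_of_mins(a, b, i, j):
    """Loop over every window position and every cut point for windows of
    size j - i + 1 (i and j are 1-indexed as in the question)."""
    n, m = len(a), j - i + 1
    ta, tb = build_sparse_table(a), build_sparse_table(b)
    best = None
    for start in range(n - m + 1):             # window position
        for cut in range(m + 1):               # first `cut` elements from a, the rest from b
            mins = []
            if cut > 0:
                mins.append(range_min(ta, start, start + cut - 1))
            if cut < m:
                mins.append(range_min(tb, start + cut, start + m - 1))
            best = min(mins) if best is None else max(best, min(mins))
    return best

print(max_of_mins([3, 2, 4, 1], [4, 6, 1, 3], i=1, j=3))   # 2, matching the example above
```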
If I have an unsorted large set of n integers (say 2^20 of them) and would like to generate subsets with k elements each (where k is small, say 5) in increasing order of their sums, what is the most efficient way to do so?
Why I need to generate these subsets in this fashion is that I would like to find the k-element subset with the smallest sum satisfying a certain condition, and I thus would apply the condition on each of the k-element subsets generated.
Also, what would be the complexity of the algorithm?
There is a similar question here: Algorithm to get every possible subset of a list, in order of their product, without building and sorting the entire list (i.e Generators) about generating subsets in order of their product, but it wouldn't fit my needs due to the extremely large size of the set n
I intend to implement the algorithm in Mathematica, but could do it in C++ or Python too.
If your desired property of the small subsets (call it P) is fairly common, a probabilistic approach may work well:
Sort the n integers (for millions of integers i.e. 10s to 100s of MB of ram, this should not be a problem), and sum the k-1 smallest. Call this total offset.
Generate a random k-subset (say, by sampling k random numbers, mod n) and check it for P-ness.
On a match, note the sum-total of the subset. Subtract offset from this to find an upper bound on the largest element of any k-subset of equivalent sum-total.
Restrict your set of n integers to those less than or equal to this bound.
Repeat (goto 2) until no matches are found within some fixed number of iterations.
Note the initial sort is O(n log n). The binary search implicit in step 4 is O(log n).
Obviously, if P is so rare that random pot-shots are unlikely to get a match, this does you no good.
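A rough sketch of that loop; `has_property` stands in for the unspecified condition P, and `max_misses` is the "fixed number of iterations" cutoff:

```python
import bisect
import random

def probabilistic_search(values, k, has_property, max_misses=10000):
    """Sketch of the shrinking strategy above. `has_property` is a placeholder
    for the unspecified condition P on a k-element subset."""
    values = sorted(values)                 # O(n log n)
    offset = sum(values[:k - 1])            # sum of the k-1 smallest elements
    best = None
    misses = 0
    while misses < max_misses and len(values) >= k:
        subset = random.sample(values, k)   # a random k-subset
        if has_property(subset):
            total = sum(subset)
            if best is None or total < sum(best):
                best = subset
            bound = total - offset          # upper bound on the largest useful element
            values = values[:bisect.bisect_right(values, bound)]   # O(log n) cut
            misses = 0
        else:
            misses += 1
    return best

# Toy condition just for illustration: the subset's sum must be divisible by 7.
print(probabilistic_search(range(1, 101), 5, lambda s: sum(s) % 7 == 0))
```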
Even if only 1 in 1000 of the k-sized sets meets your condition, that's still far too many combinations to test. I believe the runtime scales with nCk (n choose k), where n is the size of your unsorted list. The answer by Andrew Mao has a link to this value. 10^28/1000 is still 10^25. Even at 1000 tests per second, that's still 10^22 seconds, or about 10^14 years.
If you are allowed to, I think you need to eliminate duplicate numbers from your large set. Each duplicate you remove will drastically reduce the number of evaluations you need to perform. Sort the list, then kill the dupes.
Also, are you looking for the single best answer here? Who will verify the answer, and how long would that take? I suggest implementing a Genetic Algorithm and running a bunch of instances overnight (for as long as you have the time). This will yield a very good answer, in much less time than the duration of the universe.
Do you mean 20 integers, or 2^20? If it's really 2^20, then you may need to go through a significant fraction of the (2^20 choose 5) subsets before you find one that satisfies your condition. On a modern 100k MIPS CPU, assuming just 1 instruction can compute a set and evaluate that condition, going through the entire set would still take on the order of 10^17 seconds, i.e. billions of years. So if you even need to go through a fraction of that, it's not going to finish in your lifetime.
Even if the number of integers is smaller, this seems to be a rather brute force way to solve this problem. I conjecture that you may be able to express your condition as a constraint in a mixed integer program, in which case solving the following could be a much faster way to obtain the solution than brute force enumeration. Assuming your integers are w_i, i from 1 to N:
min sum(i) w_i*x_i
subject to:
    sum(i) x_i = k
    (some constraints on w_i*x_i)
    x_i binary
If it turns out that the linear programming relaxation of your MIP is tight, then you would be in luck and have a very efficient way to solve the problem, even for 2^20 integers (Example: max-flow/min-cut problem.) Also, you can use the approach of column generation to find a solution since you may have a very large number of values that cannot be solved for at the same time.
If you post a bit more about the constraint you are interested in, I or someone else may be able to propose a more concrete solution for you that doesn't involve brute force enumeration.
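For concreteness, a minimal sketch of that formulation with the PuLP modelling library; the weights, k, and especially the commented-out constraint are placeholders, since the actual condition isn't specified:

```python
import pulp

w = [12, 7, 30, 4, 19]      # hypothetical integers w_i
k = 3

prob = pulp.LpProblem("min_sum_k_subset", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(w))]

prob += pulp.lpSum(w[i] * x[i] for i in range(len(w)))   # objective: min sum_i w_i * x_i
prob += pulp.lpSum(x) == k                               # choose exactly k elements
# prob += ...   # plus whatever constraints encode the condition on the chosen elements

prob.solve()
print([w[i] for i in range(len(w)) if x[i].value() == 1])
```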
Here's an approximate way to do what you're saying.
First, sort the list. Then, consider some length-5 index vector v, corresponding to the positions in the sorted list, where the maximum index is some number m, and some other index vector v', with some max index m' > m. The smallest sum for all such vectors v' is always greater than the smallest sum for all vectors v.
So, here's how you can loop through the elements with approximately increasing sum:
from itertools import combinations

arr.sort()
best = None
for i in range(5, len(arr) + 1):                 # grow the allowed prefix arr[0..i-1]
    found = False
    for v in combinations(range(i), 5):          # 5-element index subsets of the prefix
        subset = [arr[idx] for idx in v]
        if condition(subset):                    # condition(...) is your predicate
            found = True
            if best is None or sum(subset) < sum(best):
                best = subset
    if found:
        break                                    # larger max indices can only give larger sums
Basically, this means that you no longer need to check for 5-element combinations of (1, ..., n+1) if you find a satisfying assignment in (1, ..., n), since any satisfying assignment with max index n+1 will have a greater sum, and you can stop after that set. However, there is no easy way to loop through the 5-combinations of (1, ..., n) while guaranteeing that the sum is always increasing, but at least you can stop checking after you find a satisfying set at some n.
This looks to be a perfect candidate for map-reduce (http://en.wikipedia.org/wiki/MapReduce). If you know of any way of partitioning them smartly so that passing candidates are equally present in each node then you can probably get a great throughput.
Complete sort may not really be needed as the map stage can take care of it. Each node can then verify the condition against the k-tuples and output results into a file that can be aggregated / reduced later.
If you know of the probability of occurrence and don't need all of the results try looking at probabilistic algorithms to converge to an answer.
This is an interview question. I have K machines, each of which is connected to 1 central machine. Each of the K machines has an array of 4-byte numbers in a file. You can use any data structure to load those numbers into memory on those machines, and they fit. Numbers are not unique across the K machines. Find the K largest numbers in the union of the numbers across all K machines. What is the fastest I can do this?
(This is an interesting problem because it involves parallelism. As I haven't encountered parallel algorithm optimization before, it's quite amusing: you can get away with some ridiculously high-complexity steps, because you can make up for it later. Anyway, onto the answer...)
> "What is the fastest I can do this?"
The best you can do is O(K). Below I illustrate both a simple O(K log(K)) algorithm, and the more complex O(K) algorithm.
First step:
Each computer needs enough time to read every element. This means that unless the elements are already in memory, one of the two bounds on the time is O(largest array size). If for example your largest array size varies as O(K log(K)) or O(K^2) or something, no amount of algorithmic trickery will let you go faster than that. Thus the actual best running time is O(max(K, largestArraySize)) technically.
Let us say the arrays have a max length of N. With the above caveat, we're allowed to bound N <= K: since each computer has to look at each of its elements at least once anyway (O(N) preprocessing per computer), each computer can pick out its largest K elements for free (this is known as finding the kth order statistic, for which linear-time selection algorithms exist), since that too is O(N).
Bounds and reasonable expectations:
Let's begin by thinking of some worst-case scenarios, and estimates for the minimum amount of work necessary.
One minimum-work-necessary estimate is O(K*N/K) = O(N), because we need to look at every element at the very least. But, if we're smart, we can distribute the work evenly across all K computers (hence the division by K).
Another minimum-work-necessary estimate is O(N): if every element of one array is larger than all the elements on all the other computers, we return (the top K of) that array, but we still had to read all N of its elements.
We must output all K elements; this is at least O(K) to print them out. We can avoid this if we are content merely knowing where the elements are, in which case the O(K) bound does not necessarily apply.
Can this bound of O(N) be achieved? Let's see...
Simple approach - O(NlogN + K) = O(KlogK):
For now let's come up with a simple approach, which achieves O(NlogN + K).
Consider the data arranged like so, where each column is a computer, and each row is a number in the array:
computer: A B C D E F G
10 (o) (o)
9 o (o) (o)
8 o (o)
7 x x (x)
6 x x (x)
5 x ..........
4 x x ..
3 x x x . .
2 x x . .
1 x x .
0 x x .
You can also imagine this as a sweep-line algorithm from computational geometry, or an efficient variant of the 'merge' step from mergesort. The elements with parentheses represent the elements with which we'll initialize our potential "candidate solution" (in some central server). The algorithm will converge on the correct o responses by dumping the (x) answers for the two unselected o's.
Algorithm:
All computers start as 'active'.
Each computer sorts its elements. (parallel O(N logN))
Repeat until all computers are inactive:
Each active computer finds the next-highest element (O(1) since sorted) and gives it to the central server.
The server smartly combines the new elements with the old K elements, and removes an equal number of the lowest elements from the combined set. To perform this step efficiently, we have a global priority queue of fixed size K. We insert the new potentially-better elements, and bad elements fall out of the set. Whenever an element falls out of the set, we tell the computer which sent that element to never send another one. (Justification: This always raises the smallest element of the candidate set.)
(sidenote: Adding a callback hook to falling out of a priority queue is an O(1) operation.)
We can see graphically that this will perform at most 2K*(findNextHighest_time + queueInsert_time) operations, and as we do so, elements will naturally fall out of the priority queue. findNextHighest_time is O(1) since we sorted the arrays, so to minimize 2K*queueInsert_time, we choose a priority queue with an O(1) insertion time (e.g. a Fibonacci-heap based priority queue). This gives us an O(log(queue_size)) extraction time (we cannot have O(1) insertion and extraction); however, we never need to use the extract operation! Once we are done, we merely dump the priority queue as an unordered set, which takes O(queue_size)=O(K) time.
We'd thus have O(N log(N) + K) total running time (parallel sorting, followed by O(K)*O(1) priority queue insertions). In the worst case of N=K, this is O(K log(K)).
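A single-process simulation of this simple approach may make the candidate-set bookkeeping clearer. It collapses the per-computer parallelism into one loop and uses a plain binary heap instead of a Fibonacci heap, so it only illustrates the logic, not the claimed constants:

```python
import heapq

def simple_topk(machine_arrays, K):
    """Each machine's array is sorted descending; machines take turns offering
    their next-highest element to a central candidate set of size K, and a
    machine is deactivated as soon as one of its elements falls out of it."""
    arrays = [sorted(a, reverse=True) for a in machine_arrays]  # the "parallel O(N log N)" step
    pos = [0] * len(arrays)
    active = set(range(len(arrays)))
    candidates = []                         # min-heap of (value, machine), at most K entries

    while active:
        for m in list(active):
            if m not in active:             # deactivated earlier in this round
                continue
            if pos[m] == len(arrays[m]):
                active.discard(m)           # nothing left to send
                continue
            value = arrays[m][pos[m]]
            pos[m] += 1
            if len(candidates) < K:
                heapq.heappush(candidates, (value, m))
            else:
                _, loser = heapq.heappushpop(candidates, (value, m))
                active.discard(loser)       # the machine whose element fell out never sends again
    return sorted((v for v, _ in candidates), reverse=True)

print(simple_topk([[10, 3, 7], [9, 1, 8], [2, 6, 5]], K=3))   # [10, 9, 8]
```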
The better approach - O(N+K) = O(K):
However I have come up with a better approach, which achieves O(K). It is based on the median-of-median selection algorithm, but parallelized. It goes like this:
We can eliminate a set of numbers if we know for sure that there are at least K (not strictly) larger numbers somewhere among all the computers.
Algorithm:
Each computer finds the sqrt(N)th highest element of its set, and splits the set into elements < and > it. This takes O(N) time in parallel.
The computers collaborate to combine those statistics into a new set, and find the K/sqrt(N)th highest element of that set (let's call it the 'superstatistic'), and note which computers have statistics < and > the superstatistic. This takes O(K) time.
Now consider all elements less than their computer's statistics, on computers whose statistic is less than the superstatistic. Those elements can be eliminated. This is because the elements greater than their computer's statistic, on computers whose statistic is larger than the superstatistic, are a set of K elements which are larger. (See the visual here).
Now, the computers with the uneliminated elements evenly redistribute their data to the computers who lost data.
Recurse: you still have K computers, but the value of N has decreased. Once N is less than a predetermined constant, use the previous algorithm I mentioned in "simple approach - O(NlogN + K)"; except in this case, it is now O(K). =)
It turns out that the reductions are O(N) total (amazingly not order K), except perhaps the final step which might be O(K). Thus this algorithm is O(N+K) = O(K) total.
Analysis and simulation of O(K) running time below. The statistics allow us to divide the world into four unordered sets, represented here as a rectangle divided into four subboxes:
------N-----
N^.5
________________
| | s | <- computer
| | #=K s REDIST. | <- computer
| | s | <- computer
| K/N^.5|-----S----------| <- computer
| | s | <- computer
K | s | <- computer
| | s ELIMIN. | <- computer
| | s | <- computer
| | s | <- computer
| |_____s__________| <- computer
LEGEND:
s=statistic, S=superstatistic
#=K -- set of K largest elements
(I'd draw the relation between the unordered sets of rows and the s-column here, but it would clutter things up; see the addendum.)
For this analysis, we will consider N as it decreases.
At a given step, we are able to eliminate the elements labelled ELIMIN; this has removed area from the rectangle representation above, reducing the problem size from K*N to K*N - (K - K/√N)*(N - √N), which hilariously simplifies to K*(2√N - 1).
Now, the computers with the uneliminated elements redistribute their data (REDIST rectangle above) to the computers with eliminated elements (ELIMIN). This is done in parallel, where the bandwidth bottleneck corresponds to the length of the short side of REDIST (because those computers are outnumbered by the ELIMIN computers which are waiting for their data). Therefore the data will take as long to transfer as the long side of the REDIST rectangle (another way of thinking about it: K/√N * (N-√N) is the area, divided by K/√N data-per-time, resulting in O(N-√N) time).
Thus at each step of size N, we are able to reduce the problem size to K(2√N-1), at the cost of performing N + 3K + (N-√N) work. We now recurse. The recurrence relation which will tell us our performance is:
T(N) = 2N+3K-√N + T(2√N-1)
The decimation of the subproblem size is much faster than the normal geometric series (being √N rather than something like N/2 which you'd normally get from common divide-and-conquers). Unfortunately neither the Master Theorem nor the powerful Akra-Bazzi theorem work, but we can at least convince ourselves it is linear via a simulation:
>>> from math import sqrt
>>> def T(n,k=None):
... return 1 if n<10 else sqrt(n)*(2*sqrt(n)-1)+3*k+T(2*sqrt(n)-1, k=k)
>>> f = (lambda x: x)
>>> (lambda n: T((10**5)*n,k=(10**5)*n)/f((10**5)*n) - T(n,k=n)/f(n))(10**30)
-3.552713678800501e-15
The function T(N) is, at large scales, a multiple of the linear function x, hence linear (doubling the input doubles the output). This method, therefore, almost certainly achieves the bound of O(N) we conjecture. Though see the addendum for an interesting possibility.
...
Addendum
One pitfall is accidentally sorting. If we do anything which accidentally sorts our elements, we will incur a log(N) penalty at the least. Thus it is better to think of the arrays as sets, to avoid the pitfall of thinking that they are sorted.
Also, we might initially think that, with the constant 3K amount of work at each step, we would have to do about 3K*log(log(N)) work in total. But the -1 has a powerful role to play in the decimation of the problem size. It is very slightly possible that the running time is actually something above linear, but definitely much smaller than even N*log(log(log(log(N)))). For example, it might be something like O(N*InverseAckermann(N)), but I hit the recursion limit when testing.
The O(K) is probably only due to the fact that we have to print them out; if we are content merely knowing where the data is, we might even be able to pull off an O(N) (e.g. if the arrays are of length O(log(K)) we might be able to achieve O(log(K)))... but that's another story.
The relation between the unordered sets is as follows (it would have cluttered things up in the main explanation):
.
_
/ \
(.....) > s > (.....)
s
(.....) > s > (.....)
s
(.....) > s > (.....)
\_/
v
S
v
/ \
(.....) > s > (.....)
s
(.....) > s > (.....)
s
(.....) > s > (.....)
\_/
Find the k largest numbers on each machine. O(n*log(k))
Combine the results (on a centralized server, if k is not huge; otherwise you can merge them in a tree hierarchy across the server cluster).
Update: to make it clear, the combine step is not a sort. You just pick the top k numbers from the results. There are many ways to do this efficiently. You can use a heap for example, pushing the head of each list. Then you can remove the head from the heap and push the head from the list the element belonged to. Doing this k times gives you the result. All this is O(k*log(k)).
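A sketch of that combine step with a heap of list heads, assuming each machine has already sent its own k largest numbers in descending order:

```python
import heapq

def top_k(machine_results, k):
    """Pick the global top k from several descending-sorted lists using a
    max-heap of (value, machine, position) list heads."""
    heap = [(-lst[0], m, 0) for m, lst in enumerate(machine_results) if lst]
    heapq.heapify(heap)
    result = []
    while heap and len(result) < k:
        neg_val, m, pos = heapq.heappop(heap)        # current global maximum
        result.append(-neg_val)
        if pos + 1 < len(machine_results[m]):
            # Push the next head from the same machine's list.
            heapq.heappush(heap, (-machine_results[m][pos + 1], m, pos + 1))
    return result

print(top_k([[9, 7, 3], [8, 8, 1], [6, 5, 4]], k=3))   # [9, 8, 8]
```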
Maintain a min heap of size 'k' in the centralized server.
Initially insert first k elements into the min heap.
For each of the remaining elements:
Peek at the min element in the heap (O(1)).
If the min element is less than the current element, remove the min element from the heap and insert the current element.
Finally, the min heap will hold the 'k' largest elements.
This requires O(n log k) time.
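A sketch of this size-k min-heap approach, treating the incoming numbers as one stream at the central server:

```python
import heapq

def k_largest(stream, k):
    """The heap always holds the k largest elements seen so far; its root is
    the smallest of them, so each comparison against the root is O(1)."""
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)        # fill the heap with the first k elements
        elif heap[0] < x:
            heapq.heapreplace(heap, x)     # evict the current minimum: O(log k)
    return heap                            # the k largest, in heap (not sorted) order

print(k_largest([5, 1, 9, 3, 7, 8, 2], k=3))   # e.g. [7, 9, 8]
```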
I would suggest something like this:
take the k largest numbers on each machine, in sorted order: O(Nk), where N is the number of elements on each machine
sort each of these arrays of k elements by largest element (you will get k arrays of k elements sorted by largest element: a square k x k matrix)
take the "upper triangle" of the matrix made of these k arrays of k elements (the k largest elements will be in this upper triangle)
the central machine can now find the k largest elements among these k(k+1)/2 elements
1. Let each machine find its k largest elements, copy them into a data structure (a stack), sort it, and pass it on to the central machine.
2. At the central machine, receive the stacks from all the machines. Find the greatest of the elements at the tops of the stacks.
3. Pop the greatest element off its stack and copy it to the 'Top K' list. Leave the other stacks intact.
4. Repeat step 3 k times to get the top K numbers.
1) Sort the items on every machine.
2) Use a binary heap of size k on the central machine:
a) populate the heap with the first (maximum) element from each machine
b) extract the first element, and put back into the heap the next element from the machine whose element you just extracted (of course, re-heapify after the element is added).
The sort is O(N log N), where N is the size of the largest array on a machine.
O(k) to build the heap.
O(k log k) to extract from and repopulate the heap k times.
The overall complexity is max(O(k log k), O(N log N)).
I would think the MapReduce paradigm would be well suited to a task like this.
Every machine runs its own independent map task to find the maximum value in its array (the details depend on the language used), and this will probably be O(N) for N numbers on each machine.
The reduce task compares the result from the individual machines' outputs to give you the largest k numbers.