linear solution of grid constraints - algorithm

You have a grid n x n with n rows and n columns. For every column j you are given a number Cj and for every row i you are given a number Ri.
You need to mark some points on the grid, in this way:
the number of marked points in every row is at most Ri;
the number of marked points in every column is at most Cj;
you mark the maximum number of points that satisfy the two constraints above, and return that number.
The input is: n (dimension of the grid); the sequence of Ri and the sequence of Cj.
For example, in this grid the result is 34:
[figure: example grid]
Find an algorithm that runs in O(n) or O(n log(n)) time, with a proof of correctness.
I have found a solution using a max-flow algorithm, but its complexity is too high.

Hints
I suggest iterating over the rows in order from greatest Ri to smallest.
Keep track of how many spaces we have for each column. The number of spaces starts at the given Cj values.
For each row, mark as many points in the grid as are allowed based on the current number of spaces in the columns. Make sure to place points in the columns with the greatest number of spaces first.
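A minimal sketch of the greedy described in these hints, assuming the row and column limits arrive as plain lists of non-negative integers. The function name `max_marks` is made up for illustration. It re-sorts the column capacities for every row, so it is roughly quadratic and does not reach the O(n log n) bound the question asks for; it only shows the order in which points are placed.

```python
def max_marks(rows, cols):
    """Greedy count of marked points for row limits `rows` and column limits `cols`."""
    spaces = list(cols)              # remaining capacity of every column
    total = 0
    # Serve the rows from the greatest limit to the smallest.
    for limit in sorted(rows, reverse=True):
        # Columns with the most remaining space come first.
        spaces.sort(reverse=True)
        for j in range(min(limit, len(spaces))):
            if spaces[j] == 0:
                break                # every later column is empty as well
            spaces[j] -= 1           # one mark in this row and this column
            total += 1
    return total
```

For rows 3, 3, 2, 2 and columns 3, 3, 2, 2 the column capacities go 3322, 2212, 1111, 0011, 0000, and every row limit is met.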

@NiklasB. With an augmented self-balancing binary search tree you can decrement an interval by 1 in O(log n), but how can you find the number of non-zero elements in less than O(n) time? Take for example the input n = 4, rows 3322, columns 3322; the column counts evolve as 3322 -> 2212 -> 1111 -> 0011 -> 0000. You take 3 (the first row value) and decrement the column interval [0,2] by 1, which costs O(log n), and all the column counts are still > 0. But when you arrive at 0011 you need to know that there are 2 zeros, because if the row value were 3 you could subtract from only 2 columns without making a count negative. If you know the number of zeros, you can restrict the interval to n minus the number of zeros. But how can you manage this in less than O(n) time?

Related

Preprocess-Query to find number of pairs containing a number X

Formally, we are given N pairs of rational numbers. We want to preprocess this data so as to answer queries like "Find the number of pairs which contain a given rational number X".
By "a pair contains X" I mean that, for example, [2,5] contains 3.
At worst, the expected time for each query should be O(log N) or O(sqrt(N)) (or anything similar that is better than O(N)), and preprocessing should be at worst O(N^2).
My approach:
I tried sorting the pairs by their first number, breaking ties by the second number (the first number in each pair is less than the second). Applying a lower_bound-style binary search then reduces the search space, but I can't apply another binary search within that space because the pairs are sorted primarily by their first number. So after reducing the search space I have to check it linearly, which is again O(N) per query in the worst case.
First you should try to make the ranges disjoint. For example, ranges [1 5],[2 6],[3 7] will result in disjoint ranges of [1 2],[2 3],[3 5],[5 6],[6 7], and for each disjoint range you should save how many of the original ranges contain it. Like this:
1-------5                 // original ranges
  2------6
    3------7
1-2, 2-3, 3-5, 5-6, 6-7   // disjoint ranges
 1    2    3    2    1    // number of original ranges containing each disjoint range
You can do this with a sweep line algorithm in O(N log N). After that you can use the method you described: sort the disjoint ranges by their start, and for each query find the lower_bound of X and print the presence count of that range. For example, if the query is 4 you can find the range 3-5 by a binary search, and the result is 3 because the presence count of range 3-5 is 3.
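As a hedged sketch, here is a closely related variant rather than the disjoint-range table itself: sort the start points and the end points separately, then read off the sweep counter at X with two binary searches. The count is the number of starts that are <= X minus the number of ends that are < X, which treats both endpoints as inclusive. The function names are made up for illustration.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

def preprocess(pairs):
    """Sort the start points and the end points separately: O(N log N)."""
    starts = sorted(a for a, _ in pairs)
    ends = sorted(b for _, b in pairs)
    return starts, ends

def count_containing(starts, ends, x):
    """Number of closed pairs [a, b] with a <= x <= b, in O(log N)."""
    opened = bisect_right(starts, x)   # pairs whose start is <= x
    closed = bisect_left(ends, x)      # pairs whose end is < x
    return opened - closed

starts, ends = preprocess([(1, 5), (2, 6), (3, 7)])
print(count_containing(starts, ends, 4))
print(count_containing(starts, ends, Fraction(13, 2)))
```

Because only comparisons are used, `Fraction` values work as well as integers for rational inputs.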

minimum changes to N lines to make a given line a subsegment of all of them

There is a line segment LS of length l and N other line segments whose endpoints are given as (a1, b1), (a2, b2), ..., (an, bn), both endpoints inclusive. All the values in the ranges are less than or equal to a given value k. Find the minimum number of units of change you need to make to those N line segments so that LS is a subsegment of all N of them. (Assume all the line segments are horizontal.) For example:
N=4
l=2
k=8
ranges: (1,2), (2, 5) (1,8), (2,4)
1|2|3|4|5|6|7|8
---
-----
---------------
-----
ans: 2 (extend the 1st line by 1 unit towards 3 and the second line by 1 unit towards 2), or
(extend line 1 by 2 units up to 4)
Right now I create an array of size k initialized to 0, increment the values covered by each range by 1, then find the window of size l with the maximum sum and subtract that sum from N*l. The problem is when all N line segments are of full length, i.e. (1, k): the update time then becomes O(N*k), or O(N^2) if N = k. Is there a way to do it in less than O(N*k)?
Any help is appreciated. Thanks.
Here is an O(N log N + k) approach.
First, create lists of all the start and end points of the segments. Sort each list.
Imagine a vertical line sweeping right to left, whose position is given by i. Maintain a value s[i], the number of segments whose left endpoints are greater than i. Since you have a sorted list of left endpoints, s[1:k] can be computed in O(N+k) time. Using this, you can populate an array L[i] whose values are the number of units of leftward movement needed to have all segments start at position i or lower: L[k+1] = 0 and L[i] = L[i+1] + s[i].
You can do a similar thing from left to right to compute R[j], the number of units of rightward movement needed to have all segments end at position j or higher.
Now the total units of movement to have all line segments overlap the interval from i to j is L[i] + R[j].
The total time for this approach is O(N log N) to sort the endpoints, plus O(N+k) for the rest of the algorithm, for a total bound of O(N log N + k).
With a little fussing around, you could probably improve this to O(N log N) by only computing R[i] and L[j] near the interval endpoints.
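A rough sketch of the two sweeps, under the assumption that all endpoints are integers in 1..k. It uses counting buckets in place of the sorted endpoint lists, which gives the same running sums in O(N + k). The names `movement_tables` and `cover_cost` are hypothetical, and how a segment LS of length l maps onto a pair (i, j) is left to the caller.

```python
def movement_tables(segments, k):
    """Return (L, R) for integer segments (a, b) with 1 <= a <= b <= k.

    L[i]: total leftward movement so that every segment starts at i or lower.
    R[j]: total rightward movement so that every segment ends at j or higher.
    """
    left_count = [0] * (k + 2)    # left_count[p]: segments starting at p
    right_count = [0] * (k + 2)   # right_count[p]: segments ending at p
    for a, b in segments:
        left_count[a] += 1
        right_count[b] += 1

    L = [0] * (k + 2)             # L[k+1] = 0
    s = 0                         # segments whose left endpoint is > i
    for i in range(k, 0, -1):     # sweep right to left
        s += left_count[i + 1]
        L[i] = L[i + 1] + s

    R = [0] * (k + 2)             # R[0] = 0
    t = 0                         # segments whose right endpoint is < j
    for j in range(1, k + 1):     # sweep left to right
        t += right_count[j - 1]
        R[j] = R[j - 1] + t
    return L, R

def cover_cost(L, R, i, j):
    """Total movement needed so that every segment covers [i, j]."""
    return L[i] + R[j]
```

With the tables built once, the cost of any candidate window is a single addition, so scanning all windows of a fixed length takes O(k).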

minimum number of steps required to set all flags of array elements to 1 which were initially 0 by default [duplicate]

Two integers N <= 10^5 and K <= N are given, where N is the size of array A[] and K is the length of the contiguous subsequence we can choose in each step. Each element A[i] <= 10^9. Initially all the elements of the array are unmarked. In each step we choose any contiguous subsequence of length K, and if it has unmarked elements we mark all of its unmarked elements that are minimum in the subsequence. How do we calculate the minimum number of steps needed to mark all the elements?
For a better understanding of the problem, see this example:
N=5 K=3
A[]=40 30 40 30 40
Step 1: select interval [1,3] and mark A[1] and A[3].
Step 2: select interval [0,2] and mark A[0] and A[2].
Step 3: select interval [2,4] and mark A[4].
Hence the minimum number of steps here is 3.
My approach (which is not fast enough to pass):
I start from the first element of the array, mark all the unmarked elements equal to it at distance <= K, and increment the step count by 1.
First consider how you'd answer the question for K == N (i.e. without any effective restriction on the length of subsequences). Your answer should be that the minimum number of steps is the number of distinct values in the array.
Then consider how this changes as K decreases; all that matters is how many copies of a K-length interval you need to cover the selection set {i: A[i] == n} for each value n present in A. The naive algorithm of walking a K-length interval along A, halting at each position A[i] not yet covered for that value of n is perfectly adequate.
As we can see, the minimum number of steps is N/K or N/K+1 and the maximum number of steps is (N+K-1).
We have to minimize the total number of steps, which depends on the history of choices we have made; this points to a dynamic programming solution.
For a dynamic programming tutorial, see http://www.quora.com/Dynamic-Programming/How-do-I-get-better-at-DP-Are-there-some-good-resources-or-tutorials-on-it-like-the-TopCoder-tutorial-on-DP/answer/Michal-Danil%C3%A1k
Can be solved in O(n) as follows:
Trace each element a[i]. If a[i] has not been seen before, map the value to its index and increase the counter. If the value was seen previously, check whether (current index - last stored index) >= K; if so, update the stored index and increase the counter. Print the counter.
Map STL will be beneficial.
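A minimal sketch of that single pass, using a Python dict where the answer suggests an STL map. It assumes, as the answers above argue, that the step count is the number of K-length windows needed to cover the positions of each distinct value. The function name `min_steps` is made up for illustration.

```python
def min_steps(a, k):
    """Count the K-length windows needed to cover each value's positions."""
    window_start = {}   # value -> index where its current window starts
    steps = 0
    for i, value in enumerate(a):
        start = window_start.get(value)
        # Open a new window if the value is new, or if index i falls
        # outside the window [start, start + k - 1] opened earlier.
        if start is None or i - start >= k:
            window_start[value] = i
            steps += 1
    return steps

print(min_steps([40, 30, 40, 30, 40], 3))  # prints 3, as in the question's example
```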

Finding the minimum distance in a table

I have a table of dimension m * n as given below
2 6 9 13
1 4 12 21
10 14 16 -1
A few constraints on this table:
Elements in each row are sorted in increasing order (natural ordering).
A -1 means the cell is of no significance for the purpose of calculation, i.e. no element exists there.
No element can appear in a row after a -1.
Each cell holds either a positive number between 0 and N or a -1.
No two cells have the same positive number, i.e. a -1 can appear multiple times but no other number can.
Question: I would like to find a set S of n numbers from the table where the set must contain exactly one number from each row and max(S) - min(S) is as small as possible.
For example, the above table gives S = 12, 13, 14.
I would really appreciate it if this can be solved. My solution is complicated and takes O(m^n), which is too much. I want an optimal solution.
Here is a brute force O((m*n)^2 * nlog(m)) algorithm that I can prove works:
min <- INFINITY
For each 2 numbers in different rows, let them be a,b
    for each other row:
        check if there is a number between a and b
    if there is a matching number in every other row:
        min <- min{min,|a-b|}
Explanation:
Checking if there is a number between a and b can be done using binary search, and is O(logm)
There are O((n*m)^2) different possibilities for a,b.
The idea is to exhaustively check the pair which creates the maximal difference, and check if it gives a "feasible" solution(all other elements in this solution are in range [a,b]), and get the pair that minimizes the difference between all "feasible" solutions .
EDIT: removed the 2nd solution I proposed, which was greedy and wrong.
1. Put the positions of the first elements of each row into a priority queue (min-heap).
2. Remove the smallest element from the queue and replace it with the next element from the same row.
3. Repeat step 2 until some row has no more elements different from "-1". Calculate max(S) - min(S) at each iteration and, if it is smaller than any previous value, update the best-so-far set S.
Time complexity is O(m*n*log(m)).
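The heap procedure above can be sketched as follows, assuming the table is a list of rows padded with -1 as in the question. The function name `smallest_range` is made up for illustration, and it returns only the pair (min(S), max(S)) of the best set, not the set itself.

```python
import heapq

def smallest_range(table):
    """Return (lo, hi) with hi - lo minimal such that every row
    has at least one element in [lo, hi]."""
    heap = []                        # entries: (value, row index, column index)
    current_max = float("-inf")
    for r, row in enumerate(table):
        if not row or row[0] == -1:
            raise ValueError("every row needs at least one element")
        heapq.heappush(heap, (row[0], r, 0))
        current_max = max(current_max, row[0])

    best = None
    while True:
        value, r, c = heap[0]        # smallest of the current picks
        if best is None or current_max - value < best[1] - best[0]:
            best = (value, current_max)
        c += 1
        row = table[r]
        if c >= len(row) or row[c] == -1:
            return best              # this row is exhausted, so stop
        # Replace the smallest pick with the next element of its row.
        heapq.heapreplace(heap, (row[c], r, c))
        current_max = max(current_max, row[c])

table = [[2, 6, 9, 13], [1, 4, 12, 21], [10, 14, 16, -1]]
print(smallest_range(table))  # prints (12, 14)
```

The heap holds one entry per row, so each replacement costs a logarithm of the number of rows, and every table element is pushed at most once.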

Data structure that supports range based most frequently occurring element query

I'm looking for a data structure with which I can find the most frequently occurring number (among an array of numbers) in a given, variable range.
Let's consider the following 1 based array:
1 2 3 1 1 3 3 3 3 1 1 1 1
If I query the range (1,4), the data structure must return 1, which occurs twice.
Several other examples:
(1,13) = 1
(4,9) = 3
(2,2) = 2
(1,3) = 1 (all of 1,2,3 occur once, so return the first/smallest one. not so important at the moment)
I have searched, but could not find anything similar. I'm looking (ideally) for a data structure with minimal space requirements and fast preprocessing and/or query complexities.
Thanks in advance!
Let N be the size of the array and M the number of different values in that array.
I'm considering two complexities, pre-processing and querying an interval of size n, and for each of them both the space (spacial) and the time (temporal) cost.
Solution 1 :
Spacial : O(1) and O(M)
Temporal : O(1) and O(n + M)
No pre-processing, we look at all values of the interval and find the most frequent one.
Solution 2 :
Spacial : O(M*N) and O(1)
Temporal : O(M*N) and O(min(n,M))
For each position of the array, we have an accumulative array that gives us for each value x, how many times x is in the array before that position.
Given an interval we just need for each x to subtract 2 values to find the number of x in that interval. We iterate over each x and find the maximum value. If n < M we iterate over each value of the interval, otherwise we iterate over all possible values for x.
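A small sketch of Solution 2, assuming a 1-based query range as in the question. The function names are made up for illustration, and for brevity the query always iterates over all M values rather than switching to the interval's own values when n < M.

```python
def build_prefix_counts(arr):
    """prefix[x][i] = number of occurrences of x among the first i elements."""
    values = sorted(set(arr))
    prefix = {x: [0] for x in values}
    for v in arr:
        for x in values:
            prefix[x].append(prefix[x][-1] + (1 if x == v else 0))
    return prefix

def most_frequent(prefix, lo, hi):
    """Most frequent value in arr[lo..hi] (1-based, inclusive).
    Ties go to the smallest value."""
    best_value, best_count = None, 0
    for x, counts in prefix.items():     # keys were inserted in sorted order
        count = counts[hi] - counts[lo - 1]
        if count > best_count:
            best_value, best_count = x, count
    return best_value

arr = [1, 2, 3, 1, 1, 3, 3, 3, 3, 1, 1, 1, 1]
prefix = build_prefix_counts(arr)
print(most_frequent(prefix, 4, 9))  # prints 3
```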
Solution 3 :
Spacial : O(N) and O(1)
Temporal : O(N) and O(min(n,M)*log(n))
For each value x, build a sorted array (or balanced search tree) of all the positions in the array where x is present. The key is the position, and with each position you also store the total number of occurrences of x between that position and the beginning of the array.
Given an interval, for each x we just need to subtract 2 values to find the number of x in that interval: in O(log(N)) we can search x's structure for the two positions just before the start and end of the interval and subtract their stored counts. Basically it needs less space than a histogram, but the query is now in O(log(N)).
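A sketch of Solution 3 that keeps, for each value, a sorted list of its positions and uses binary search for the two lookups. In a sorted list the index of a position already equals the number of earlier occurrences, so no separate count is stored. The function names are made up for illustration, and the values are assumed to be numbers so that ties can go to the smallest one.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def build_positions(arr):
    """Map each value to the sorted list of its 1-based positions."""
    positions = defaultdict(list)
    for i, v in enumerate(arr, start=1):
        positions[v].append(i)       # appended in increasing order
    return positions

def count_in_range(positions, x, lo, hi):
    """Occurrences of x in arr[lo..hi], found with two binary searches."""
    p = positions.get(x, [])
    return bisect_right(p, hi) - bisect_left(p, lo)

def most_frequent_sorted(arr, positions, lo, hi):
    """Most frequent value in arr[lo..hi]; ties go to the smallest value."""
    if hi - lo + 1 < len(positions):
        candidates = set(arr[lo - 1:hi])   # fewer elements than distinct values
    else:
        candidates = positions.keys()
    return max(candidates,
               key=lambda x: (count_in_range(positions, x, lo, hi), -x))

arr = [1, 2, 3, 1, 1, 3, 3, 3, 3, 1, 1, 1, 1]
positions = build_positions(arr)
print(most_frequent_sorted(arr, positions, 4, 9))  # prints 3
```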
You could create a binary partition tree where each node represents a histogram map of {value -> frequency} for a given range, and has two child nodes which represent the upper half and lower half of the range.
Querying is then just a case of recursively adding together a small number of these histograms to cover the range required, and scanning the resulting histogram once to find the highest occurrence count.
Useful optimizations include:
Using a histogram with mutable frequency counts as an "accumulator" while you add histograms together
Stop using precomputed histograms once you get down to a certain size (maybe a range smaller than the total number of possible values M) and just count the numbers directly. It's a time/space trade-off that I think will pay off a lot of the time.
If you have a fixed small number of possible values, use an array rather than a map to store the frequency counts at each node
UPDATE: my thinking on algorithmic complexity assuming a bounded small number of possible values M and a total of N values in the complete range:
Preprocessing is O(N log N) - basically you need to traverse the complete list and build a binary tree, building one node for every M elements in order to amortise the overhead of each node
Querying is O(M log N) - basically adding up O(log N) histograms each of size M, plus counting O(M) values on either side of the range
Space requirement is O(N) - approx. 2N/M histograms each of size M. The 2 factor is the sum from having N/M histograms at the bottom level, 0.5N/M histograms at the next level, 0.25N/M at the third level etc...
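A simplified sketch of the histogram tree, assuming numeric values and 1-based inclusive queries. The class name `HistogramTree` is hypothetical. Unlike the layout described above, each leaf here holds a single element rather than a block of M elements, so the space estimate above does not carry over to this version; it only illustrates merging a small number of precomputed histograms into one accumulator.

```python
from collections import Counter

class HistogramTree:
    """Bottom-up segment tree whose nodes are {value: frequency} histograms."""

    def __init__(self, arr):
        self.n = len(arr)
        self.tree = [Counter() for _ in range(2 * self.n)]
        for i, v in enumerate(arr):
            self.tree[self.n + i][v] = 1                 # leaves
        for i in range(self.n - 1, 0, -1):               # internal nodes
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def query(self, lo, hi):
        """Most frequent value in arr[lo..hi] (1-based, inclusive).
        Ties go to the smallest value."""
        acc = Counter()                                  # mutable accumulator
        left = lo - 1 + self.n
        right = hi + self.n                              # half-open upper bound
        while left < right:
            if left & 1:
                acc.update(self.tree[left])
                left += 1
            if right & 1:
                right -= 1
                acc.update(self.tree[right])
            left >>= 1
            right >>= 1
        value, _ = max(acc.items(), key=lambda kv: (kv[1], -kv[0]))
        return value

tree = HistogramTree([1, 2, 3, 1, 1, 3, 3, 3, 3, 1, 1, 1, 1])
print(tree.query(4, 9))  # prints 3
```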

Resources