The title is slightly misleading: What I am trying to do is a little bit more complicated, but I don't know how to describe it in a title.
I have two sequences (vectors, arrays, lists) p and q of numbers. I sum all of the numbers in p except the first element, then perform a test. If the test fails, I replace the first element of p with the first element of q, and start over with the second element: Sum all elements of p except the second one (including the new element copied from q). If the test fails the second time, I replace the second element in p with the second element in q, and do it again with the third element. The test always succeeds on some element. Pseudocode:
i = 0
if test(sum_except p i) then exit loop
else p[i] := q[i], and run again with i+1
This means that if I have 1000 elements, I sum 999 of them, and then the next time through I sum 998 of those plus a new element, and so on. As the process goes on, I am also summing more and more elements from q rather than from the original p, but I will have summed most of them, previously, as well. So at each step, I'm summing a bunch of numbers that I had already added together. This seems inefficient, but I have not figured out a more sensible way to write this.
In case it matters, what the test does is check whether the sum of the values in p except the element at i, plus the value of q at i, is greater than or equal to 1 (or is less than or equal to 1; there are two variants), i.e.:
(sum_except p i) + q[i] > 1
This all happens inside an inner loop, and the sequence length will probably be between 500 and 5000, although larger numbers are possible.
(My code actually is in OCaml, fwiw.)
Calculate the sum of all numbers at the start (henceforth referred to as sum). sum_except p i is then equivalent to sum - p[i].
When you update an item in the list, update the sum as well so that you do not have to recalculate it by looping through each element in p every time.
// Remove old value
sum := sum - p[i]
p[i] := q[i]
// Add new value
sum := sum + p[i]
Since you will be modifying at most one value in the list at a time, you can simply update the sum manually to reflect the change instead of summing all elements in the list again, which makes the loop body run in constant time.
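The running-sum idea above can be sketched in Python as follows. This is only an illustration: the function name first_passing_index and the two-argument test callback are assumptions, not part of the original question, and the real code would be OCaml.

```python
def first_passing_index(p, q, test):
    """Return (i, p) for the first index i where the test passes.

    `test` is a hypothetical callback taking (sum_except, q_i).
    The total is maintained incrementally, so each step is O(1).
    """
    p = list(p)          # work on a copy so the caller's list is untouched
    total = sum(p)       # computed once, O(n)
    for i in range(len(p)):
        sum_except = total - p[i]
        if test(sum_except, q[i]):
            return i, p
        # Test failed: swap in q[i] and keep the total in sync.
        total += q[i] - p[i]
        p[i] = q[i]
    return None, p       # the question says this should not happen


# Variant from the question: (sum_except p i) + q[i] >= threshold
index, final_p = first_passing_index([1, 2, 3], [0, 0, 7], lambda s, x: s + x >= 6)
```

With floating-point values the running total can drift slightly from a freshly computed sum, so it may be worth recomputing it occasionally if the comparison against 1 is sensitive.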
I am currently learning loop invariants and am wondering whether I have generated them correctly here. The algorithm pseudocode is:
**EvenSumming(A)**
outcome=0
for i=1 to n
    if A[i] is even
        outcome=outcome+A[i]
return outcome
My loop invariant (LI) and proof so far:
Loop Invariant (LI): At every iteration, the variable outcome is always even and at most can only be composed of i-1 elements originally in A[i....n].
Initialization: Let i = 1, thus outcome starts at 0, which makes it an even number and is composed of 0 elements from A[1...n]
Maintenance: By math, we have that adding two even numbers always results in an even number, and since every iteration in the array increments i by 1 then trivially we know that there are i-1 elements that have been iterated through before i. Thus, at iteration i, outcome will be an even value composed of at most i-1 elements from A[i...n]
Termination: At termination, i > n, and thus we can say that i = n+1. Then by the LI, outcome is still an even value composed of at most n elements originally existing in A[n]
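One way to sanity-check a candidate invariant is to assert it at the top of every iteration. Below is a minimal Python sketch; it uses 0-based indexing, and the function name and the exact form of the asserted invariant are assumptions for illustration. It checks a slightly more specific statement than the LI above: before processing index i, outcome equals the sum of the even elements among the first i elements (and is therefore even).

```python
def even_summing(a):
    """Sum the even elements of a, asserting a loop invariant on the way."""
    outcome = 0
    for i, x in enumerate(a):
        # Invariant (assumed form): outcome is the sum of the even
        # elements of a[0:i], hence outcome itself is even.
        assert outcome == sum(v for v in a[:i] if v % 2 == 0)
        assert outcome % 2 == 0
        if x % 2 == 0:
            outcome += x
    # At termination the invariant holds for the whole array.
    assert outcome == sum(v for v in a if v % 2 == 0)
    return outcome
```

The asserts recompute the sum each time, so this is for checking the reasoning on small inputs, not for production use.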
Question:
Given an array A of integers and a score S = 0. For each place in the array, you can do one of the following:
Place a "(". The score would be S += Ai
Place a ")". The score would be S -= Ai
Skip it
What is the highest score you can get so that the brackets are balanced?
Limits:
|Ai| <= 10^9
Size of array A: <= 10^5
P/S:
I have tried many approaches, but my best attempt is a brute force that takes O(3^n). Is there a way to solve this problem in O(n log n) or less?
You can do this in O(n log n) time with a max-heap.
First, remove the asymmetry in the operations. Rather than having open and closed brackets, assume we start off with a running sum of -sum(A), i.e. all closed brackets. Now, for every element in A, we can add it to our running sum either zero, one or two times, corresponding to leaving a closed bracket, removing the closed bracket, or adding an open bracket, respectively. The balance constraint now says that after processing the first k elements, we have:
Made at least k additions, for all integers k,
We make length(A) total additions.
We have added the final element to our sum either zero or one times.
Suppose that after processing the first k elements, we have made k additions, and that we have the maximum score possible of all such configurations. We can extend this to a maximum score configuration of the first k+1 elements with k+1 additions, greedily. We have a new choice going forward of adding the k+1-th element to our sum up to two times, but can only add it at most once now. Simply choose the largest seen element that has not yet been added to our sum two times, and add it to our sum: this must also be a maximum-score configuration, or we can show the old configuration wasn't maximum either.
Python code (all values are negated because Python's heapq only provides a min-heap):
import heapq
from typing import List


def solve(nums: List[int]) -> int:
    """Given an array of integers, return the maximum sum achievable.
    We must add k elements from nums and subtract k elements from nums,
    left to right and all distinct, so that at no point have we subtracted
    more elements than we have added.
    """
    max_heap = []
    running_sum = 0
    # Balance will be 0 after all loop iterations.
    for value in nums:
        running_sum -= value  # Assume value is subtracted
        heapq.heappush(max_heap, -value)  # Option to not subtract value
        heapq.heappush(max_heap, -value)  # Option to add value
        # Either un-subtract or add the largest previous free element
        # (the heap stores negated values, so subtracting the pop adds it)
        running_sum -= heapq.heappop(max_heap)
    return running_sum
You can do this in O(n^2) time by using a two-dimensional array highest_score, where highest_score[i][b] is the highest score achievable after position i with b open brackets yet to be closed. Each element highest_score[i][b] depends only on highest_score[i−1][b−1], highest_score[i−1][b], and highest_score[i−1][b+1] (and of course A[i]), so each row highest_score[i] can be computed in O(n) time from the previous row highest_score[i−1], and the final answer is highest_score[n][0].
(Note: that uses O(n^2) space, but since each row of highest_score depends only on the previous row, you can actually do it in O(n) space by reusing rows. But the asymptotic runtime complexity will be the same either way.)
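A minimal Python sketch of this dynamic program, keeping only the previous row as described in the note. The function name highest_score_dp and the use of None for unreachable states are assumptions for illustration; with n up to 10^5 this quadratic version is mainly useful for cross-checking a faster solution on small inputs.

```python
def highest_score_dp(nums):
    """O(n^2) time, O(n) space DP over (position, open brackets)."""
    n = len(nums)
    # prev[b]: best score so far with b unclosed "(", None if unreachable.
    prev = [None] * (n + 2)
    prev[0] = 0
    for i, a in enumerate(nums):
        cur = [None] * (n + 2)
        for b in range(i + 2):  # at most i + 1 brackets can be open
            candidates = []
            if prev[b] is not None:            # skip this position
                candidates.append(prev[b])
            if b > 0 and prev[b - 1] is not None:  # place "(": score += a
                candidates.append(prev[b - 1] + a)
            if prev[b + 1] is not None:        # place ")": score -= a
                candidates.append(prev[b + 1] - a)
            if candidates:
                cur[b] = max(candidates)
        prev = cur
    return prev[0]  # balanced means zero open brackets at the end
```

For example, highest_score_dp([5, 1]) opens at 5 and closes at 1 for a score of 4, while highest_score_dp([1, 5]) is better off skipping everything for a score of 0.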
I have a number n and a set of numbers S ∈ [1..n]* with size s (which is substantially smaller than n). I want to sample a number k ∈ [1..n] with equal probability, but the number is not allowed to be in the set S.
I am trying to solve the problem in at worst O(log n + s). I am not sure whether it's possible.
A naive approach is creating an array of numbers from 1 to n excluding all numbers in S and then pick one array element. This will run in O(n) and is not an option.
Another approach may be just generating random numbers ∈[1..n] and rejecting them if they are contained in S. This has no theoretical bound as any number could be sampled multiple times even if it is in the set. But on average this might be a practical solution if s is substantially smaller than n.
Say s is sorted. Generate a random number between 1 and n - |s|, call it k. We've chosen the k'th element of {1,...,n} - s. Now we need to find it.
Use binary search on s to find the count of the elements of s <= k. This takes O(log |s|). Add this to k. In doing so, we may have passed or arrived at additional elements of s. We can adjust for this by incrementing our answer for each such element that we pass, which we find by checking the next larger element of s from the point we found in our binary search.
E.g., n = 100, s = {1,4,5,22}, and our random number is 3. So our approach should return the third element of [2,3,6,7,...,21,23,24,...,100] which is 6. Binary search finds that 1 element is at most 3, so we increment to 4. Now we compare to the next larger element of s which is 4 so increment to 5. Repeating this finds 5 in s so we increment to 6. We check s once more, see that 6 isn't in it, so we stop.
E.g., n = 100, s = {1,4,5,22}, and our random number is 4. So our approach should return the fourth element of [2,3,6,7,...,21,23,24,...,100] which is 7. Binary search finds that 2 elements are at most 4, so we increment to 6. Now we compare to the next larger element of s which is 5 so increment to 7. We check s once more, see that the next number is > 7, so we stop.
If we assume that "s is substantially smaller than n" means |s| <= log(n), then we will increment at most log(n) times, and in any case at most s times.
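The sorted case can be sketched in Python with the standard bisect module. The function names kth_allowed and sample_excluding_sorted are made up for this illustration; forbidden_sorted is assumed to be a sorted list of distinct values in 1..n.

```python
import bisect
import random


def kth_allowed(k, forbidden_sorted):
    """Return the k'th (1-based) positive integer not in forbidden_sorted."""
    # Count forbidden values <= k, then shift k past them.
    idx = bisect.bisect_right(forbidden_sorted, k)
    answer = k + idx
    # Shifting may have passed or landed on further forbidden values.
    while idx < len(forbidden_sorted) and forbidden_sorted[idx] <= answer:
        answer += 1
        idx += 1
    return answer


def sample_excluding_sorted(n, forbidden_sorted, rng=random):
    """Uniform sample from {1..n} minus the forbidden values."""
    k = rng.randint(1, n - len(forbidden_sorted))
    return kth_allowed(k, forbidden_sorted)
```

With the examples above, kth_allowed(3, [1, 4, 5, 22]) gives 6 and kth_allowed(4, [1, 4, 5, 22]) gives 7.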
If s is not sorted then we can do the following. Create an array of bits of size |s|. Generate k. Parse s and do two things: 1) count the number of elements < k, call this r, and 2) at the same time, set the i'th bit to 1 if k+i is in s (0-indexed, so if k is in s then the first bit is set).
Now, increment k a number of times equal to r plus the number of set bits in the array with an index <= the number of times incremented so far.
E.g., n = 100, s = {1,4,5,22}, and our random number is 4. So our approach should return the fourth element of [2,3,6,7,...,21,23,24,...,100] which is 7. We parse s and 1) note that 1 element is below 4 (r=1), and 2) set our array to [1, 1, 0, 0]. We increment once for r=1 and an additional two times for the two set bits, ending up at 7.
This is O(s) time, O(s) space.
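Here is one way the unsorted variant might look in Python. The name kth_allowed_unsorted is an assumption, and a list of booleans stands in for the bit array; forbidden is assumed to contain distinct values.

```python
def kth_allowed_unsorted(k, forbidden):
    """k'th (1-based) positive integer not in forbidden; no sorting needed."""
    s = len(forbidden)
    bits = [False] * s   # bits[i] is True when k + i is forbidden
    r = 0                # number of forbidden values strictly below k
    for x in forbidden:
        if x < k:
            r += 1
        elif x - k < s:
            bits[x - k] = True
    # Start with r increments, then add one more for every set bit whose
    # index is within the number of increments made so far.
    increments = r
    i = 0
    while i < s and i <= increments:
        if bits[i]:
            increments += 1
        i += 1
    return k + increments
```

For the example above, kth_allowed_unsorted(4, [22, 5, 1, 4]) ends with three increments and returns 7.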
This is an O(1) solution with O(s) initial setup that works by mapping each non-allowed number > s to an allowed number <= s.
Let S be the set of non-allowed values, S(i), where i = [1 .. s] and s = |S|.
Here's a two part algorithm. The first part constructs a hash table based only on S in O(s) time, the second part finds the random value k ∈ {1..n}, k ∉ S in O(1) time, assuming we can generate a uniform random number in a contiguous range in constant time. The hash table can be reused for new random values and also for new n (assuming S ⊂ { 1 .. n } still holds of course).
To construct the hash table H, first set j = 1. Then iterate over S(i), the elements of S; they do not need to be sorted. If S(i) > s, first increment j until j ∉ S, then add the key-value pair (S(i), j) to the hash table, and finally increment j once more.
To find a random value k, first generate a uniform random value k in the range s + 1 to n, inclusive. If k is a key in H, replace k with H(k). I.e., we do at most one hash lookup to ensure k is not in S.
Python code to generate the hash:
def substitute(S):
    H = dict()
    j = 1
    for s in S:
        if s > len(S):
            # Advance j to the next allowed value <= len(S)
            while j in S: j += 1
            H[s] = j
            j += 1
    return H
For the actual implementation to be O(s), one might need to convert S into something like a frozenset to ensure the test for membership is O(1) and also move the len(S) loop invariant out of the loop. Assuming the j in S test and the insertion into the hash (H[s] = j) are constant time, this should have complexity O(s).
The generation of a random value is simply:
import random

def myrand(n, s, H):
    # s is the size of S; values s + 1 .. n that are in S are remapped by H
    k = random.randint(s + 1, n)
    return (H[k] if k in H else k)
If one is only interested in a single random value per S, then the algorithm can be optimized to improve the common case, while the worst case remains the same. This still requires S be in a hash table that allows for a constant time "element of" test.
import random

def rand_not_in(n, S):
    k = random.randint(len(S) + 1, n)
    if k not in S: return k
    # k is forbidden: rebuild the mapping only until k's substitute is found
    j = 1
    for s in S:
        if s > len(S):
            while j in S: j += 1
            if s == k: return j
            j += 1
Optimizations are: Only generate the mapping if the random value is in S. Don't save the mapping to a hash table. Short-circuit the mapping generation when the random value is found.
Actually, the rejection method seems like the practical approach.
Generate a number in 1...n and check whether it is forbidden; regenerate until the generated number is not forbidden.
The probability of a single rejection is p = s/n.
Thus the expected number of random number generations is 1 + p + p^2 + p^3 + ... which is 1/(1-p), which in turn is equal to n/(n-s).
Now, if s is much less than n, or even as large as s = n/2, this expected number is at most 2.
It would take s almost equal to n to make it infeasible in practice.
Multiply the expected time by log s if you use a tree-set to check whether the number is in the set, or by just 1 (expected value again) if it is a hash-set. So the average time is O(1) or O(log s) depending on the set implementation. There is also O(s) memory for storing the set, but unless the set is given in some special way, implicitly and concisely, I don't see how it can be avoided.
(Edit: As per comments, you do this only once for a given set.
If, additionally, we are out of luck, and the set is given as a plain array or list, not some fancier data structure, we get O(s) expected time with this approach, which still fits into the O(log n + s) requirement.)
If attacks against the unbounded algorithm are a concern (and only if they truly are), the method can include a fall-back algorithm for the cases when a certain fixed number of iterations didn't provide the answer.
Similarly to how IntroSort is QuickSort but falls back to HeapSort if the recursion depth gets too high (which is almost certainly a result of an attack resulting in quadratic QuickSort behavior).
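For reference, a minimal sketch of the rejection method in Python. The function name sample_by_rejection is made up here, forbidden is assumed to be a set (or frozenset) so that membership tests are O(1) on average, and no fall-back is included.

```python
import random


def sample_by_rejection(n, forbidden, rng=random):
    """Draw uniformly from {1..n} minus forbidden by retrying on rejects.

    Expected number of draws is n / (n - len(forbidden)); the loop has
    no hard upper bound, as discussed above.
    """
    while True:
        k = rng.randint(1, n)
        if k not in forbidden:
            return k


value = sample_by_rejection(100, {1, 4, 5, 22})
```

Because every allowed number is equally likely to be the first accepted draw, the result is uniform over the allowed values.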
Find all numbers that are in the forbidden set and less than or equal to n-s. Call this array A.
Find all numbers that are not in the forbidden set and greater than n-s. Call this array B. This can be done in O(s) if the set is sorted.
Note that the lengths of A and B are equal, and create the mapping map[A[i]] = B[i].
Generate a number t from 1 to n-s. If map[t] exists, return it; otherwise return t.
This takes O(s) insertions into a map plus 1 lookup, which is either O(s) on average (hash map) or O(s log s) (tree map).
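A rough Python sketch of this A/B mapping, assuming a hash map (dict) and a set for the forbidden values. The names build_map and sample_with_map are hypothetical; the forbidden values are assumed to be distinct integers in 1..n.

```python
import random


def build_map(n, forbidden):
    """Map each forbidden value <= n - s to an allowed value > n - s."""
    fset = set(forbidden)
    s = len(fset)
    low_forbidden = [x for x in fset if x <= n - s]                 # array A
    high_allowed = [x for x in range(n - s + 1, n + 1) if x not in fset]  # array B
    # A and B have equal length, so zip pairs them up exactly.
    return dict(zip(low_forbidden, high_allowed))


def sample_with_map(n, s, mapping, rng=random):
    """Draw t in 1..n-s and redirect it if it is forbidden."""
    t = rng.randint(1, n - s)
    return mapping.get(t, t)


mapping = build_map(10, [1, 4, 5, 9])
sample = sample_with_map(10, 4, mapping)
```

With n = 10 and forbidden {1, 4, 5, 9}, the values 1..6 are mapped one-to-one onto the allowed numbers {2, 3, 6, 7, 8, 10}, so a uniform t gives a uniform result.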
A string contains N characters of two types, A and B (the same number of each). What is the minimal number of swaps needed to ensure that no two adjacent characters are the same, if we can only swap two adjacent characters?
For example, input is:
AAAABBBB
The minimal number of swaps is 6 to make the array ABABABAB. But how would you solve it for any kind of input ? I can only think of O(N^2) solution. Maybe some kind of sort ?
If we need just to count swaps, then we can do it with O(N).
Let's assume for simplicity that array X of N elements should become ABAB... .
GetCount()
    swaps = 0, i = -1, j = -1
    for(k = 0; k < N; k++)
        if(k % 2 == 0)
            i = FindIndexOf(A, max(k, i))
            X[k] <-> X[i]
            swaps += i - k
        else
            j = FindIndexOf(B, max(k, j))
            X[k] <-> X[j]
            swaps += j - k
    return swaps

FindIndexOf(element, index)
    while(index < N)
        if(X[index] == element) return index
        index++
    return -1; // should never happen if count of As == count of Bs
Basically, we run from left to right, and if a misplaced element is found, it gets exchanged with the correct element (e.g. abBbbbA** --> abAbbbB**) in O(1). At the same time swaps are counted as if the sequence of adjacent elements would be swapped instead. Variables i and j are used to cache indices of next A and B respectively, to make sure that all calls together of FindIndexOf are done in O(N).
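A runnable Python rendering of the pseudocode above, for the ABAB... target only. The function name count_swaps_to_abab and the dictionary of cached indices are choices made for this sketch; the input is assumed to contain equally many A's and B's.

```python
def count_swaps_to_abab(text):
    """Count adjacent swaps needed to turn text into ABAB... in O(N)."""
    x = list(text)
    n = len(x)

    def find_index_of(element, index):
        while index < n:
            if x[index] == element:
                return index
            index += 1
        raise ValueError("counts of A and B differ")

    swaps = 0
    cached = {"A": 0, "B": 0}   # plays the role of i and j
    for k in range(n):
        want = "A" if k % 2 == 0 else "B"
        idx = find_index_of(want, max(k, cached[want]))
        x[k], x[idx] = x[idx], x[k]   # O(1) exchange instead of shifting
        swaps += idx - k              # cost as if shifted by adjacent swaps
        cached[want] = idx
    return swaps
```

For AAAABBBB this returns 6, matching the example in the question. To cover the BABA... target as well, one could run it on the string with A and B exchanged and take the minimum.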
If we need to sort by swaps then we cannot do better than O(N^2).
The rough idea is the following. Let's consider your sample: AAAABBBB. One of Bs needs O(N) swaps to get to the A B ... position, another B needs O(N) to get to A B A B ... position, etc. So we get O(N^2) at the end.
Observe that if any solution would swap two instances of the same letter, then we can find a better solution by dropping that swap, which necessarily has no effect. An optimal solution therefore only swaps differing letters.
Let's view the string of letters as an array of indices of one kind of letter (arbitrarily chosen, say A) into the string. So AAAABBBB would be represented as [0, 1, 2, 3] while ABABABAB would be [0, 2, 4, 6].
We know two instances of the same letter will never swap in an optimal solution. This lets us always safely identify the first (left-most) instance of A with the first element of our index array, the second instance with the second element, etc. It also tells us our array is always in sorted order at each step of an optimal solution.
Since each step of an optimal solution swaps differing letters, we know our index array evolves at each step only by incrementing or decrementing a single element at a time.
An initial string of length n = 2k will have an array representation A of length k. An optimal solution will transform this array to either
ODDS = [1, 3, 5, ... 2k - 1]
or
EVENS = [0, 2, 4, ... 2k - 2]
Since we know in an optimal solution instances of a letter do not pass each other, we can conclude an optimal solution must spend min(abs(ODDS[0] - A[0]), abs(EVENS[0] - A[0])) swaps to put the first instance in correct position.
By realizing the EVENS or ODDS choice is made only once (not once per letter instance), and summing across the array, we can count the minimum number of needed swaps as
define count_swaps(length, initial, goal)
    total = 0
    for i from 0 to length - 1
        total += abs(goal[i] - initial[i])
    end
    return total
end

define count_minimum_needed_swaps(k, A)
    return min(count_swaps(k, A, EVENS), count_swaps(k, A, ODDS))
end
Notice the number of loop iterations implied by count_minimum_needed_swaps is 2 * k = n; it runs in O(n) time.
By noting which term is smaller in count_minimum_needed_swaps, we can also tell which of the two goal states is optimal.
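The index-array argument translates into a few lines of Python. This is a sketch under the assumptions of the answer (equal numbers of A and B); the function name and the letter parameter are illustrative.

```python
def count_minimum_needed_swaps(text, letter="A"):
    """Minimum adjacent swaps so that no two neighbours are equal."""
    # Index array: positions of the chosen letter, already in sorted order.
    positions = [i for i, c in enumerate(text) if c == letter]
    k = len(positions)
    evens = range(0, 2 * k, 2)   # goal [0, 2, ..., 2k - 2]
    odds = range(1, 2 * k, 2)    # goal [1, 3, ..., 2k - 1]

    def count_swaps(goal):
        return sum(abs(g - p) for g, p in zip(goal, positions))

    return min(count_swaps(evens), count_swaps(odds))
```

For AAAABBBB the index array is [0, 1, 2, 3]; the distance to EVENS is 0 + 1 + 2 + 3 = 6 and to ODDS is 1 + 2 + 3 + 4 = 10, so the result is 6.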
Since you know N, you can simply write a loop that generates the values with no swaps needed.
#define N 4
char array[N + N];
for (size_t z = 0; z < N + N; z++)
{
    array[z] = 'B' - ((z & 1) == 0);
}
return 0; // The number of swaps
@Nemo and @AlexD are right. The algorithm is order n^2. @Nemo misunderstood that we are looking for a reordering in which no two adjacent characters are the same, so we cannot use the rule that an A after a B is out of order.
Let's look at the minimum number of swaps.
We don't care whether our first character is A or B, because we can apply the same algorithm with A and B exchanged everywhere. So let's assume that the word WORD_N has length 2N, with N As and N Bs, and starts with an A. (I am using length 2N to simplify the calculations.)
What we will do is move the next B right next to this A, without taking care of the positions of the other characters, because then we will have reduced the problem to reordering a new word WORD_{N-1}. Let's also assume that the next B is not immediately after the A if the word has more than 2 characters, because otherwise the first step is already done and we reduce the problem to the next set of characters, WORD_{N-1}.
In the worst case the next B is as far away as possible, so it is after half of the word, and we need N-1 swaps to put this B after the A (maybe fewer). Then our word can be written as WORD_N = [A B WORD_{N-1}].
We see that we have to perform this algorithm at most N-1 times, because the last word (WORD_1) will already be ordered. Performing the algorithm N-1 times, we have to make
N_swaps = (N-1)*N/2.
where N is half of the length of the initial word.
Let's see why we can apply the same algorithm to WORD_{N-1}, also assuming that its first character is A. In this case it matters that the first character is the same as in the already ordered pair. We can be sure that the first character in WORD_{N-1} is A because it was the character right after the first character in our initial word. If it was B, the first step performs at most one swap between these two characters, or none, and we already have WORD_{N-1} starting with the same character as WORD_N, while the first two characters of WORD_N are different, at the cost of at most 1 swap.
I think this answer is similar to the answer by phs, just in Haskell. The idea is that the resultant-indices for A's (or B's) are known so all we need to do is calculate how far each starting index has to move and sum the total.
Haskell code:
Prelude Data.List> let is = elemIndices 'B' "AAAABBBB"
in minimum
$ map (sum . zipWith ((abs .) . (-)) is) [[1,3..],[0,2..]]
6 --output
Generate all lists of size n, such that each element is between 0 and m (inclusive).
There are (m+1)^n such lists.
There are two easy ways of writing the general case. One is described in the existing answer from @didierc. The alternative is recursion.
For example, think about a method that takes a String as an argument:
if(input string is long enough)
    print or store it
else
    iterate over digit range
        recursive call with the digit appended to the string
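In Python the recursive idea might look like the sketch below. It uses a tuple prefix instead of a string so that values above 9 work too; the function name all_lists_recursive is just for illustration.

```python
def all_lists_recursive(n, m, prefix=()):
    """Yield every list of length n with elements in 0..m (inclusive)."""
    if len(prefix) == n:
        # The prefix is long enough: emit it.
        yield list(prefix)
    else:
        # Iterate over the digit range and recurse with the digit appended.
        for digit in range(m + 1):
            yield from all_lists_recursive(n, m, prefix + (digit,))


pairs = list(all_lists_recursive(2, 1))
```

Here pairs is [[0, 0], [0, 1], [1, 0], [1, 1]], i.e. (m+1)^n = 4 lists.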
This is just like enumerating all the numbers in base (m+1) of n digits.
start with a list of n zeros
do the following loop
    yield the list as a new answer
    increment the first element, counting in base (m+1), and propagate the carry recursively to its next element
    if there is a carry left, exit the loop
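The counting loop above could be written in Python roughly as follows. The carry is propagated with an inner loop rather than recursion, and the function name all_lists is an assumption for this sketch.

```python
def all_lists(n, m):
    """Yield all lists of n digits in base (m + 1), least significant first."""
    digits = [0] * n
    while True:
        yield list(digits)          # copy, so later increments don't alter it
        pos = 0
        while pos < n:
            if digits[pos] < m:
                digits[pos] += 1    # no carry needed, done incrementing
                break
            digits[pos] = 0         # overflow: reset and carry to the next digit
            pos += 1
        else:
            return                  # carry left over after the last digit


counted = list(all_lists(2, 1))
```

Because the first element is incremented first, counted comes out as [[0, 0], [1, 0], [0, 1], [1, 1]].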
Update:
Just for fun, what would the solution be if we add the restriction that all digits must remain different (like a lottery number, as it was initially stated; of course we suppose that m >= n)?
We proceed by enumerating all the numbers with the restriction stated above, and also requiring that any element be greater than its successor in the list (i.e. the digit of rank k < n is larger than the digit of rank k+1).
This is implemented by simply checking, when computing the carry, that the current digit will not become equal to its predecessor, and if so propagating the carry further.
Then, for each list yielded by the enumeration, compute all the possible permutations. There are known algorithms to perform that computation, see for instance the Johnson-Trotter algorithm, but one can build a simpler recursive algorithm:
function select l r:
    if the list r is empty, yield l
    else
        for each element x of the list r
            let l' be the list of x and l
            and r' the remaining elements of r
            call select l' r'
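A Python sketch of select, plus one possible way to drive it. Note the assumption: itertools.combinations is used here as a stand-in for the carry-based enumeration of strictly ordered digit lists described above, and the name distinct_lists is hypothetical.

```python
import itertools


def select(l, r):
    """Yield every permutation of r, each prepended (in reverse) onto l."""
    if not r:
        yield l
    else:
        for idx, x in enumerate(r):
            # l' is x followed by l; r' is r without this occurrence of x.
            yield from select([x] + l, r[:idx] + r[idx + 1:])


def distinct_lists(n, m):
    """Yield all lists of n pairwise distinct elements taken from 0..m."""
    for combo in itertools.combinations(range(m + 1), n):
        yield from select([], list(combo))


lottery = list(distinct_lists(2, 2))
```

For n = 2 and m = 2 this produces the 3 * 2 = 6 ordered pairs of distinct values from {0, 1, 2}.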