Is it possible to verify that a number can be decomposed into a sum of powers of 2 where the exponents are sequential?
Is there an algorithm to check this?
Example: where and
The binary representation would have a single, consecutive group of 1 bits.
To check this, you could first identify the value of the least significant bit, add that bit to the original value, and then check whether the result is a power of 2.
This leads to the following formula for a given x:
(x & (x + (x & -x))) == 0
This expression is also true when x is zero. If that case needs to be rejected as a solution, you need an extra condition for that.
In Python:
def f(x):
    return x > 0 and (x & (x + (x & -x))) == 0
This can be done in an elegant way using bitwise operations to check whether the binary representation of the number is a single block of consecutive 1 bits, followed by perhaps some 0s.
The expression x & (x - 1) replaces the lowest 1 in the binary representation of x with a 0. If we call that number y, then y | (y >> 1) sets each bit to be a 1 if it had a 1 to its immediate left. If the original number x was a single block of consecutive 1 bits, then the result is the same as the number x that we started with, because the 1 which was removed will be replaced by the shift. On the other hand, if x is not a single block of consecutive 1 bits, then the shift will add at least one other 1 bit that wasn't there in the original x, and they won't be equal.
That works if x has more than one 1 bit, so the shift can put back the one that was removed. If x has only a single 1 bit, then removing it will result in y being zero. So we can check for that, too.
In Python:
def is_sum_of_consecutive_powers_of_two(x):
    y = x & (x - 1)
    z = y | (y >> 1)
    return x == z or y == 0
Note that this returns True when x is zero, and that's the correct result if "a sum of consecutive powers of two" is allowed to be the empty sum. Otherwise, you will have to write a special case to reject zero.
A number can be represented as the sum of powers of 2 with sequential exponents iff its binary representation has all 1s adjacent.
E.g. the set of numbers that can be represented as 2^n + 2^(n-1), n >= 1, is exactly those with two adjacent ones in the binary representation.
just like this:
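bool check(int x) { /* the number you want to check; assumed non-negative */
    int flag = 0; /* 0: no 1 seen yet, 1: inside the run of 1s, 2: run has ended */
    for (; x; x >>= 1) {
        if (x & 1) {
            if (flag == 2) return false; /* a second run of 1s has started */
            flag = 1;
        } else if (flag == 1) {
            flag = 2;
        }
    }
    return true;
}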
O(log n).
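As a cross-check of the characterization above (all 1 bits adjacent), here is a minimal Python sketch that compares a string-based reference test with the lowest-set-bit formula quoted earlier in this thread. The function names `has_single_run` and `bit_trick` are made up for this illustration, and rejecting zero in both is an assumption.

```python
def has_single_run(x: int) -> bool:
    """Reference check: the binary digits of x form exactly one block of 1s."""
    if x <= 0:
        return False
    # bin() never emits leading zeros, so only trailing zeros need stripping.
    bits = bin(x)[2:].rstrip("0")
    return "0" not in bits


def bit_trick(x: int) -> bool:
    """Same test: adding the lowest set bit must carry through the whole run."""
    return x > 0 and (x & (x + (x & -x))) == 0


# Both checks should agree on every small input.
assert all(has_single_run(x) == bit_trick(x) for x in range(1 << 12))
```

The string version is slower but easy to read, which makes it a convenient oracle when experimenting with other bit tricks.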
I have the following problem: I have a list of items
[O_0,..., O_n]
, where each item is represented by a binary power (O_0 represented by 2^0, ..., O_n by 2^n). I have constructed a list of combinations of these elements (each combination is represented by the sum of the binary representations of the items). For example I have
combs = [3, 9, 15, ......].
Given a new combination of these items, say C_1, I would like to test whether any of the elements of combs is included in C_1. An efficient way I thought of is to test, for each element c_i of combs, whether c_i & C_1 == c_i, which means that c_i is included in C_1. It is fast since it is a single bitwise AND.
My problem is that instead of having one element C_1, I have a very large number of them, C_1, ..., C_k, and I have to test the above condition for each one. So I was wondering whether there is a faster way than the one I mentioned to test the condition for all of the elements (this is actually the same problem as testing whether a set is a subset of another, which is why I chose the binary representation from the beginning).
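For reference, the baseline test described in the question can be sketched in Python as follows. The helper names `matching_combs` and `contains_any` and the sample candidates are illustrative assumptions, not part of the original question.

```python
def matching_combs(combs, candidate):
    """Return every combination whose bits are all set in `candidate`."""
    return [c for c in combs if (c & candidate) == c]


def contains_any(combs, candidate):
    """True if at least one combination is a subset of `candidate`."""
    return any((c & candidate) == c for c in combs)


combs = [3, 9, 15]             # 0b0011, 0b1001, 0b1111
candidates = [0b1011, 0b0110]  # hypothetical C_1, C_2

# One pass over combs per candidate: O(k * m) bitwise tests in total.
flags = [contains_any(combs, c) for c in candidates]
```

Because `any` stops at the first match, candidates that contain an early element of `combs` are cheap; the worst case is a candidate that contains none of them.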
My understanding of the problem: given a collection of k sets Y and a collection of m sets X, we would like to find a subset S of Y such that for all y in S, there exists x in X s.t. x is a subset of y.
I will assume that sets are represented by n-vectors of zeros and ones denoting inclusion. Here is the setup:
import pandas as pd # for drop_duplicates & benchmarks
import numpy as np
np.random.seed(0)
n = 100 # 100 "atomic" elements
m = 1000 # small sets
k = 1000 # large sets
X = pd.DataFrame(np.random.randint(0, 2, size=(m, n))).drop_duplicates().values
Y = pd.DataFrame(np.random.randint(0, 2, size=(k, n))).drop_duplicates().values
# For each row y in Y, we would like to check if there exists a row x in X
# s.t. x represents a subset of y
def naive(Y, X):
    # O(k * m) pairwise comparisons, each costing O(n)
    for i, y in enumerate(Y):
        for x in X:
            if (x <= y).all():
                yield i
                break  # one subset is enough for this y

def naive_as_array(Y, X):
    return np.array(list(naive(Y, X)))
The naive function iterates over all pairs of sets that may satisfy the inclusion relation and short-circuits whenever appropriate. The worst-case runtime is O(k * m * n), where m = len(combs), k is the number of candidate sets, and each comparison costs O(n).
As an alternative, we can consider the following recursive algorithm processing each element (from 1 to n) at a time:
def contains(Y, X):
    """
    Y : k x n indicator array specifying sets
    X : m x n indicator array specifying sets
    output: subset Z of [0..k-1] s.t. i in Z iff there exists x in X s.t.
        x is a subset of Y[i]. Z is represented by a 1D numpy array.
    """
    k, n = Y.shape
    assert Y.shape[1] == X.shape[1]
    # builtin bool/int: the np.bool/np.int aliases were deprecated in NumPy 1.20
    detected = np.zeros(k, dtype=bool)
    inds = np.arange(k)

    # utility to account for sets that already have a subset detected
    def account_for_detected(Y, inds):
        mask = ~detected[inds]
        Y = Y[mask]
        inds = inds[mask]
        return Y, inds

    # inductively reduce Y.shape[1] (==X.shape[1])
    def f(Y, X, inds):
        if Y.shape[0] == 0 or X.shape[0] == 0:
            # collection Y or X is empty; inclusions are impossible
            return
        # avoid redundant comparisons by dropping sets y in Y
        # if it is already known that y contains some element of X
        Y, inds = account_for_detected(Y, inds)
        if Y.shape[1] == 1:
            # Y and X are collections of singletons
            Y = np.ravel(Y)
            X = np.ravel(X)
            X_vals = np.zeros(2, dtype=int)
            X_vals[X] = 1
            if X_vals[0] > 0:
                detected[inds] = True
            elif X_vals[1] > 0:
                detected[inds[Y==1]] = True
            return
        else:
            # make a recursive call
            Ymask = Y[:,0] == 0
            Xmask = X[:,0] == 0
            # if x in X is a subset of y in Y, x[0] <= y[0]
            f(Y[Ymask,1:], X[Xmask,1:], inds[Ymask])
            # by now, detected is updated in the outer scope
            # process the remaining Y's
            f(Y[~Ymask,1:], X[:,1:], inds[~Ymask])
            # done

    # make call at root:
    f(Y, X, inds)
    # return indices
    return np.where(detected)[0]
At step d between 1 and n, we split sets Y into Y0 and Y1 where Y0 contains sets in Y that do not contain element d, and Y1 contains sets in Y that do contain element d. Similarly, we define X0 and X1. A key observation is that sets in X1 cannot occur as subsets of sets in Y0. Therefore, we can reduce the number of comparisons in the recursive call.
Timings:
%timeit contains(Y, X)
%timeit naive_as_array(Y, X)
10 loops, best of 3: 185 ms per loop
1 loop, best of 3: 2.39 s per loop
Given a number n of x digits, how do you remove y digits so that the remaining digits form the greatest possible number?
Examples:
1)x=7 y=3
n=7816295
-8-6-95
=8695
2)x=4 y=2
n=4213
4--3
=43
3)x=3 y=1
n=888
=88
Just to state: x > y > 0.
For each digit to remove: iterate through the digits left to right; if you find a digit that's less than the one to its right, remove it and stop, otherwise remove the last digit.
If the number of digits x is greater than the actual length of the number, it means there are leading zeros. Since those will be the first to go, you can simply reduce the count y by a corresponding amount.
Here's a working version in Python:
def remove_digits(n, x, y):
    s = str(n)
    if len(s) > x:
        raise ValueError
    elif len(s) < x:
        y -= x - len(s)
    if y <= 0:
        return n
    for r in range(y):
        for i in range(len(s)):
            if s[i] < s[i+1:i+2]:
                break
        s = s[:i] + s[i+1:]
    return int(s)
>>> remove_digits(7816295, 7, 3)
8695
>>> remove_digits(4213, 4, 2)
43
>>> remove_digits(888, 3, 1)
88
I hesitated to submit this, because it seems too simple. But I wasn't able to think of a case where it wouldn't work.
If x = y, we have to remove all the digits.
Otherwise, find the maximum digit among the first y + 1 digits. Remove the y0 digits that come before this maximum, append the maximum to the answer, and then repeat the task on the remaining digits, where you now need to remove y - y0 digits.
A straightforward implementation runs in O(x^2) time in the worst case.
But finding the maximum in a given range can be done efficiently using a segment tree. The time complexity is then O(x * log(x)) in the worst case.
P.S. I just realized that it is also possible to solve this in O(x), using the fact that there are only 10 digits (though the algorithm may be a little more complicated). We need to find the maximum in a given range [L, R], but the ranges in this task only move from left to right (L and R always increase). So we just need to store 10 pointers (one per digit value), each pointing to the first position >= L in the number that holds that digit. Then, to find the maximum, we only need to check 10 pointers. To update the pointers, we move them to the right.
So the time complexity will be O(10 * x) = O(x)
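A minimal Python sketch of the straightforward O(x^2) variant described above (pick the leftmost maximum in a window of y + 1 digits, drop everything before it, repeat). The function name `max_after_removal` is an assumption made for this example, and it returns the digits as a string.

```python
def max_after_removal(n: int, y: int) -> str:
    """Remove y digits from n so that the remaining digits are maximal."""
    digits = str(n)
    keep = len(digits) - y  # how many digits survive
    result = []
    start = 0
    for _ in range(keep):
        # The next kept digit must lie within the first y + 1 remaining digits,
        # otherwise more than y digits would have to be removed.
        window = range(start, start + y + 1)
        # Largest digit wins; on ties prefer the leftmost position.
        best = max(window, key=lambda i: (digits[i], -i))
        y -= best - start  # digits skipped before `best` count as removed
        result.append(digits[best])
        start = best + 1
    return "".join(result)
```

Replacing the `max` over the window with a range-maximum structure, or with the ten-pointer idea, changes only how `best` is found; the surrounding loop stays the same.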
Here's an O(x) solution. It builds an index that maps (i, d) to j, the smallest number > i such that the j'th digit of n is d. With this index, one can easily find the largest possible next digit in the solution in O(1) time.
def index(digits):
    next = [len(digits)+1] * 10
    for i in range(len(digits), 0, -1):
        next[ord(digits[i-1])-ord('0')] = i-1
        yield next[::-1]

def minseq(n, y):
    n = str(n)
    idx = list(index(n))[::-1]
    i, r = 0, []
    for ry in range(len(n)-y):
        i = next(j for j in idx[i] if j <= y+ry) + 1
        r.append(n[i - 1])
    return ''.join(r)

print(minseq(7816295, 3))
print(minseq(4213, 2))
Pseudocode:
Number.toDigits().filter (sortedSet (Number.toDigits()). take (y))
Imho you don't need to know x.
For efficiency, Number.toDigits () could be precalculated
digits = Number.toDigits()
digits.filter (sortedSet (digits).take (y))
Depending on language and context, you either output the digits and are done or have to convert the result into a number again.
Working Scala-Code for example:
def toDigits (l: Long) : List [Long] = if (l < 10) l :: Nil else (toDigits (l /10)) :+ (l % 10)
val num = 734529L
val dig = toDigits (num)
dig.filter (_ > ((dig.sorted).take(2).last))
A sorted set is a set which is sorted, which means, every element is only contained once and then the resulting collection is sorted by some criteria, for example numerical ascending. => 234579.
We take two of them (23) and from that subset the last (3) and filter the number by the criteria, that the digits have to be greater than that value (3).
Your question does not explicitly say, that each digit is only contained once in the original number, but since you didn't give a criterion, which one to remove in doubt, I took it as an implicit assumption.
Other languages may of course have other expressions (x.sorted, x.toSortedSet, new SortedSet (num), ...) or lack certain classes, functions, which you would have to build on your own.
You might need to write your own filter method, which takes a predicate P and a collection C, and returns a new collection of all elements which satisfy P, P being a method which takes one T and returns a Boolean. Very useful stuff.
Given the list of numbers
1 15 2 5 10
I need to obtain
1 2 5 10 15
The only operation I can do is "move the number X at position Y".
In the above example I only need to do "move the number 15 at position 5".
I would like to minimize the number of operations but I can't find/remember a classical algorithm for that, given the operation available.
Some background :
I'm interacting with an API for a kanban-like service.
I have about 600 cards and some actions on our bug-tracker can imply a reordering of these 600 cards in the kanban (multiple cards can move at the same time if the priority of a project is changed)
I can do it in 600 calls to the API but I'm trying to reduce that number as much as possible.
Lemma: The minimum number of (delete element, insert element) pairs you can perform to sort a list L (in increasing order) is:
Smin(L) = |L| - |LIC(L)|
Where LIC(L) is the Longest Increasing Subsequence.
Thus, you have to:
Establish the LIC of your list.
Remove the elements not in it and insert them back at the appropriate position (using binary search).
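The count given by the lemma can be computed in O(n log n) with the standard patience-sorting technique. This is a minimal sketch; the function name `min_moves` is an assumption, and it uses `bisect_right` so that equal elements are allowed to stay in place (a non-decreasing subsequence).

```python
from bisect import bisect_right


def min_moves(values):
    """Return len(values) minus the length of a longest non-decreasing subsequence."""
    # tails[k] is the smallest possible last element of a non-decreasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for v in values:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)   # v extends the longest subsequence found so far
        else:
            tails[pos] = v    # v gives a smaller tail for length pos + 1
    return len(values) - len(tails)
```

For the list in the question, `min_moves([1, 15, 2, 5, 10])` gives 1, matching the single move of 15 to the end.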
Proof:
By induction.
For a list of size 1, the longest increasing subsequence is of length... 1! The list is already sorted so the number of (del,ins) pairs required is
|L| - |LIC(L)| = 1 - 1 = 0
Now let L_n be a list of length n, n ≥ 1. Let L_{n+1} be the list obtained by adding an element e_{n+1} to the left of L_n.
This element may or may not influence the Longest Increasing Subsequence. Let's try to see how...
Let i_{n,1} and i_{n,2} be the first two elements of LIC(L_n) (*):
If e_{n+1} > i_{n,2}, then LIC(L_{n+1}) = LIC(L_n)
If e_{n+1} ≤ i_{n,1}, then LIC(L_{n+1}) = e_{n+1} || LIC(L_n)
Else, LIC(L_{n+1}) = LIC(L_n) - i_{n,1} + e_{n+1}. We keep the LIC with the highest first element. This is done by removing i_{n,1} from the LIC and replacing it with e_{n+1}.
In the first case, we delete e_{n+1}, so we are left with sorting L_n. By the induction hypothesis, this requires n - |LIC(L_n)| (deletion, insertion) pairs. We then have to insert e_{n+1} at the appropriate position. Thus:
Smin(L_{n+1}) = 1 + Smin(L_n)
Smin(L_{n+1}) = 1 + n - |LIC(L_n)|
Smin(L_{n+1}) = |L_{n+1}| - |LIC(L_{n+1})|
In the second case, we ignore e_{n+1}. We begin by deleting the elements not in LIC(L_n). These elements have to be inserted again! There are
Smin(L_n) = |L_n| - |LIC(L_n)|
such elements.
Now, we just have to take care to insert them in the right order (relative to e_{n+1}). In the end, it requires:
Smin(L_{n+1}) = |L_n| - |LIC(L_n)|
Smin(L_{n+1}) = |L_n| + 1 - (|LIC(L_n)| + 1)
Since we have |LIC(L_{n+1})| = |LIC(L_n)| + 1 and |L_{n+1}| = |L_n| + 1, we have in the end:
Smin(L_{n+1}) = |L_{n+1}| - |LIC(L_{n+1})|
The last case can be proved by considering the list L'_n obtained by removing i_{n,1} from L_{n+1}. In that case LIC(L'_n) = LIC(L_{n+1}) and thus:
|LIC(L'_n)| = |LIC(L_n)| (1)
From there, we can sort L'_n (which takes |L'_n| - |LIC(L'_n)| pairs by the induction hypothesis). The previous equality (1) leads to the result.
(*): If |LIC(L_n)| < 2, then i_{n,2} doesn't exist. Just ignore the comparisons with it. In that case, only case 2 and case 3 apply... The result is still valid.
One possible solution is to find the longest increasing subsequence and move only elements that aren't inside it.
I can't prove it's optimal, but it is easy to prove it is correct and better than N swaps.
Here is a proof of concept in Python 3. I implemented it as an O(n^2) algorithm, but I'm pretty sure it can be reduced to O(n log n).
from operator import itemgetter

def LIS(V):
    T = [1]*(len(V))
    P = [-1]*(len(V))
    for i, v in enumerate(V):
        for j in range(i-1, -1, -1):
            if T[j]+1 > T[i] and V[j] <= V[i]:
                T[i] = T[j] + 1
                P[i] = j
    i, _ = max(enumerate(T), key=itemgetter(1))
    while i != -1:
        yield i
        i = P[i]

def complement(L, n):
    # indices not in L, including those before its first element
    for a, b in zip([-1]+L, L+[n]):
        for i in range(a+1, b):
            yield i

def find_moves(V):
    n = len(V)
    L = list(LIS(V))[::-1]
    SV = sorted(range(n), key=lambda i:V[i])
    moves = [(x, SV.index(x)) for x in complement(L, n)]
    while len(moves):
        a, b = moves.pop()
        yield a, b
        moves = [(x-(x>a)+(x>b), y) for x, y in moves]

def make_and_print_moves(V):
    print('Initial array:', V)
    for a, b in find_moves(V):
        x = V.pop(a)
        V.insert(b, x)
        print('Move {} to {}. Result: {}'.format(a, b, V))
    print('***')

make_and_print_moves([1, 15, 2, 5, 10])
make_and_print_moves([4, 3, 2, 1])
make_and_print_moves([1, 2, 4, 3])
It outputs something like:
Initial array: [1, 15, 2, 5, 10]
Move 1 to 4. Result: [1, 2, 5, 10, 15]
***
Initial array: [4, 3, 2, 1]
Move 3 to 0. Result: [1, 4, 3, 2]
Move 3 to 1. Result: [1, 2, 4, 3]
Move 3 to 2. Result: [1, 2, 3, 4]
***
Initial array: [1, 2, 4, 3]
Move 3 to 2. Result: [1, 2, 3, 4]
***
I have an array of non-negative values. I want to build an array of values whose sum is 20 so that they are proportional to the first array.
This would be an easy problem, except that I want the proportional array to sum to exactly 20, compensating for any rounding error.
For example, the array
input = [400, 400, 0, 0, 100, 50, 50]
would yield
output = [8, 8, 0, 0, 2, 1, 1]
sum(output) = 20
However, most cases are going to have a lot of rounding errors, like
input = [3, 3, 3, 3, 3, 3, 18]
naively yields
output = [1, 1, 1, 1, 1, 1, 10]
sum(output) = 16 (ouch)
Is there a good way to apportion the output array so that it adds up to 20 every time?
There's a very simple answer to this question: I've done it many times. After each assignment into the new array, you reduce the values you're working with as follows:
Call the first array A, and the new, proportional array B (which starts out empty).
Call the sum of A elements T
Call the desired sum S.
For each element of the array (i) do the following:
a. B[i] = round(A[i] / T * S). (rounding to nearest integer, penny or whatever is required)
b. T = T - A[i]
c. S = S - B[i]
That's it! Easy to implement in any programming language or in a spreadsheet.
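The steps above can be sketched in Python as follows. The function name `apportion` is an assumption, as is the choice to use integer inputs with exact round-half-up arithmetic (which avoids floating-point surprises and Python's round-half-to-even behaviour).

```python
def apportion(values, target):
    """Scale non-negative integers so the rounded results sum to `target`."""
    remaining_total = sum(values)   # T in the steps above
    remaining_target = target       # S in the steps above
    result = []
    for v in values:
        if remaining_total == 0:
            share = 0  # only zeros are left; nothing more to hand out
        else:
            # round(v / T * S) with halves rounded up, in integer arithmetic
            share = (2 * v * remaining_target + remaining_total) // (2 * remaining_total)
        result.append(share)
        remaining_total -= v
        remaining_target -= share
    return result
```

When the last non-zero element is reached, it equals the remaining total, so it receives exactly the remaining target; that is why the sum always comes out right as long as at least one input is positive.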
The solution is optimal in that the resulting array's elements will never be more than 1 away from their ideal, non-rounded values. Let's demonstrate with your example:
T = 36, S = 20. B[1] = round(A[1] / T * S) = 2. (ideally, 1.666....)
T = 33, S = 18. B[2] = round(A[2] / T * S) = 2. (ideally, 1.666....)
T = 30, S = 16. B[3] = round(A[3] / T * S) = 2. (ideally, 1.666....)
T = 27, S = 14. B[4] = round(A[4] / T * S) = 2. (ideally, 1.666....)
T = 24, S = 12. B[5] = round(A[5] / T * S) = 2. (ideally, 1.666....)
T = 21, S = 10. B[6] = round(A[6] / T * S) = 1. (ideally, 1.666....)
T = 18, S = 9. B[7] = round(A[7] / T * S) = 9. (ideally, 10)
Notice that when you compare every value in B with its ideal value in parentheses, the difference is never more than 1.
It's also interesting to note that rearranging the elements in the array can result in different corresponding values in the resulting array. I've found that arranging the elements in ascending order is best, because it results in the smallest average percentage difference between actual and ideal.
Your problem is similar to proportional representation, where you want to share N seats (in your case 20) among parties proportionally to the votes they obtain, in your case [3, 3, 3, 3, 3, 3, 18].
There are several methods used in different countries to handle the rounding problem. My code below uses the Hagenbach-Bischoff quota method used in Switzerland, which basically allocates the seats remaining after an integer division by (N+1) to parties which have the highest remainder:
def proportional(nseats,votes):
    """assign n seats proportionally to votes using Hagenbach-Bischoff quota
    :param nseats: int number of seats to assign
    :param votes: iterable of int or float weighting each party
    :result: list of ints seats allocated to each party
    """
    quota=sum(votes)/(1.+nseats) #force float
    frac=[vote/quota for vote in votes]
    res=[int(f) for f in frac]
    n=nseats-sum(res) #number of seats remaining to allocate
    if n==0: return res #done
    if n<0: return [min(x,nseats) for x in res] # see siamii's comment
    #give the remaining seats to the n parties with the largest remainder
    remainders=[ai-bi for ai,bi in zip(frac,res)]
    limit=sorted(remainders,reverse=True)[n-1]
    #n parties with remainder larger than limit get an extra seat
    for i,r in enumerate(remainders):
        if r>=limit:
            res[i]+=1
            n-=1 # attempt to handle perfect equality
            if n==0: return res #done
    raise RuntimeError("could not allocate all seats") #should never happen
However this method doesn't always give the same number of seats to parties with perfect equality as in your case:
proportional(20,[3, 3, 3, 3, 3, 3, 18])
[2,2,2,2,1,1,10]
You have set 3 incompatible requirements. An integer-valued array proportional to [1,1,1] cannot be made to sum to exactly 20. You must choose to break one of the "sum to exactly 20", "proportional to input", and "integer values" requirements.
If you choose to break the requirement for integer values, then use floating point or rational numbers. If you choose to break the exact sum requirement, then you've already solved the problem. Choosing to break proportionality is a little trickier. One approach you might take is to figure out how far off your sum is, and then distribute corrections randomly through the output array. For example, if your input is:
[1, 1, 1]
then you could first make it sum as well as possible while still being proportional:
[7, 7, 7]
and since 20 - (7+7+7) = -1, choose one element to decrement at random:
[7, 6, 7]
If the error was 4, you would choose four elements to increment.
A naïve solution that doesn't perform well, but will provide the right result...
Write an iterator that given an array with eight integers (candidate) and the input array, output the index of the element that is farthest away from being proportional to the others (pseudocode):
function next_index(candidate, input)
    // Calculate weights
    for i in 1 .. 8
        w[i] = candidate[i] / input[i]
    end for
    // find the smallest weight
    min = infinity
    min_index = 0
    for i in 1 .. 8
        if w[i] < min then
            min = w[i]
            min_index = i
        end if
    end for
    return min_index
end function
Then just do this
result = [0, 0, 0, 0, 0, 0, 0, 0]
result[next_index(result, input)]++ for 1 .. 20
If there is no optimal solution, it'll skew towards the beginning of the array.
Using the approach above, you can reduce the number of iterations by rounding down (as you did in your example) and then just use the approach above to add what has been left out due to rounding errors:
result = <<approach using rounding down>>
while sum(result) < 20
result[next_index(result, input)]++
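A runnable Python sketch of the incremental approach above. Two details are assumptions not spelled out in the pseudocode: zero entries of the input are skipped (to avoid dividing by zero), and ratios are compared exactly with `Fraction`, so ties go to the earliest index. The name `distribute` is made up for this example, and the weights are assumed to be integers with at least one positive entry.

```python
from fractions import Fraction


def next_index(candidate, weights):
    """Index whose candidate/weight ratio is smallest (first one on ties)."""
    best_index, best_ratio = None, None
    for i, (c, w) in enumerate(zip(candidate, weights)):
        if w == 0:
            continue  # assumption: zero-weight entries never receive anything
        ratio = Fraction(c, w)
        if best_ratio is None or ratio < best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


def distribute(weights, total):
    """Hand out `total` units one at a time to the most under-served entry."""
    result = [0] * len(weights)
    for _ in range(total):
        result[next_index(result, weights)] += 1
    return result
```

Each unit costs one pass over the array, so the whole thing is O(total * len(weights)); that is fine for 20 units, and the round-down shortcut mentioned above reduces the number of passes further.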
So the answers and comments above were helpful... particularly the decreasing sum comment from @Frederik.
The solution I came up with takes advantage of the fact that for an input array v, sum(v_i * 20) is divisible by sum(v). So for each value in v, I multiply by 20 and divide by the sum. I keep the quotient, and accumulate the remainder. Whenever the accumulator reaches sum(v), I add one to the value. That way I'm guaranteed that all the remainders get rolled into the results.
Is that legible? Here's the implementation in Python:
def proportion(values, total):
    # set up by getting the sum of the values and starting
    # with an empty result list and accumulator
    sum_values = sum(values)
    new_values = []
    acc = 0
    for v in values:
        # for each value, find quotient and remainder
        q, r = divmod(v * total, sum_values)
        if acc + r < sum_values:
            # if the accumulator plus remainder is too small, just add and move on
            acc += r
        else:
            # we've accumulated enough to go over sum(values), so add 1 to result
            if acc > r:
                # add to previous
                new_values[-1] += 1
            else:
                # add to current
                q += 1
            acc -= sum_values - r
        # save the new value
        new_values.append(q)
    # accumulator is guaranteed to be zero at the end
    print(new_values, sum_values, acc)
    return new_values
(I added an enhancement that if the accumulator > remainder, I increment the previous value instead of the current value)