Count subsets of array which qualify min(subset)+max(subset) < k - algorithm

Was asked this question in an interview, didn't have a better answer than generating all possible subsets.
Example:
a = [4,2,5,7] k = 8
output = 4
[2],[4,2],[2,5],[4,2,5]
The interviewer hinted that sorting the array should help, but I still couldn't figure out a better-than-brute-force solution. I will appreciate your input.

The interviewer implied that sorting the array would help and it does help. I'll try to explain.
Taking the array and k values you stated:
a = [4,2,5,7]
k = 8
Sorting the array will yield:
a_sort = [2,4,5,7]
Now we can consider the following procedure:
1. Set ii = 0, jj = 1.
2. Choose a_sort[ii] as a part of your subset.
2.1. If 2 * a_sort[ii] >= k, you are done. Else, the subset [a_sort[ii]] satisfies the condition and is a part of the solution.
3. Add a_sort[ii+jj] to your subset.
3.1. If a_sort[ii] + a_sort[ii+jj] < k:
3.1.1. The subset [a_sort[ii], a_sort[ii+jj]] satisfies the condition and is part of the solution, as is any subset formed by also adding any elements a_sort[kk] with ii < kk < ii+jj.
3.1.2. Set jj += 1 and go back to step 3.
3.2. Else, set ii += 1, jj = 1, and go back to step 2.
With your input this procedure should return:
[[2], [2,4],[2,5],[2,4,5]]
# [2,7] results in 9 > 8, and therefore you move on to [4]
# Note that for the [4] subset you get 4 + 4 = 8, which is not smaller than 8, so we are done
Explanation
If a subset [a_sort[ii]] does not satisfy 2 * a_sort[ii] < k, adding more numbers to the subset will only yield min(subset) + max(subset) >= 2 * a_sort[ii] >= k, and therefore there will not be any additional subsets which satisfy the wanted condition. Moreover, taking the subset [a_sort[ii+1]] results in 2 * a_sort[ii+1] >= 2 * a_sort[ii] >= k since a_sort is sorted. Therefore you will not find any additional subsets.
For jj >= 1, if a_sort[ii] + a_sort[ii+jj] < k then you can push any number of members of a_sort into the subset, as long as their index kk is bigger than ii and lower than ii+jj: since a_sort is sorted, adding these members to the subset will not change the value of min(subset) + max(subset), which remains a_sort[ii] + a_sort[ii+jj], and we already know that this value is smaller than k.
Getting the count
In case you simply want to count the possible subsets, this can be done more easily than generating the subsets themselves.
Assume that for indices ii < jj the condition holds, i.e. a_sort[ii] + a_sort[jj] < k. If jj = ii + 1 this adds 1 possible subset. If jj > ii + 1 there are jj - ii - 1 additional elements between them, each of which can be either present or not without changing the value of a_sort[ii] + a_sort[jj]. Therefore there are a total of 2**(jj-ii-1) additional subsets available to add to the solution group (jj-ii-1 elements, each independently present or not). This also holds for jj = ii + 1, since in this case 2**(jj-ii-1) = 2**0 = 1.
Looking at the example above:
[2] adds 1 count
[2,4] adds 1 count (1 = 0 + 1)
[2,5] adds 2 counts (2 = 0 + 2 --> 2 **(2 - 0 - 1) = 2**1 = 2)
A total count of 4
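Here is a short Python sketch of this counting procedure (my own transcription of the steps above, not the interviewer's reference solution):

def count_subsets(a, k):
    a = sorted(a)
    count = 0
    for ii in range(len(a)):
        if 2 * a[ii] >= k:             # step 2.1: no further minima can work
            break
        count += 1                     # the singleton [a[ii]]
        for jj in range(ii + 1, len(a)):
            if a[ii] + a[jj] < k:      # adds 2**(jj-ii-1) subsets, as derived above
                count += 2 ** (jj - ii - 1)
            else:
                break
    return count

print(count_subsets([4, 2, 5, 7], 8))  # 4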

Sort the array.
For an element x at index l, do a binary search on the array to get the index of the maximum integer in the array which is < k-x. Let this index be r.
For all subsets where min(subset) = x, we can include any element with index in the range (l,r]. The number of subsets with min(subset) = x is therefore the total number of possible subsets of (r-l) elements, so count = 2^(r-l) (or 0 if r < l).
(Note: in all such subsets, we are fixing x. That's why the range (l,r] isn't inclusive of l.)
You have to iterate over the array, using the above process for each element/index to get the count of subsets where the current element is the minimum and the subset satisfies the given constraint. If you find an element with count = 0, break the iteration.
This should work with O(N*log(N)) complexity, good enough for an interview question imo.
For the given example, sorted array = [2,4,5,7].
For element 2, l=0 and r=2. Count = 2^(2-0) = 4 (covers [2],[4,2],[2,5],[4,2,5]).
For element 4, l=1 and r=0. Count = 0, and we break the iteration.
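A quick Python sketch of this approach, using the standard bisect module (the function name is mine):

from bisect import bisect_left

def count_subsets_bisect(a, k):
    a = sorted(a)
    total = 0
    for l, x in enumerate(a):
        # r = index of the last element strictly smaller than k - x
        r = bisect_left(a, k - x) - 1
        if r < l:          # count would be 0: break the iteration
            break
        total += 2 ** (r - l)
    return total

print(count_subsets_bisect([4, 2, 5, 7], 8))  # 4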

Related

Arranging the number 1 in a 2d matrix

Given the number of rows and columns of a 2d matrix.
Initially all elements of the matrix are 0.
Given the number of 1's that should be present in each row.
Given the number of 1's that should be present in each column.
Determine if it is possible to form such a matrix.
Example:
Input: r=3 c=2 (no. of rows and columns)
2 1 0 (number of 1's that should be present in each row respectively)
1 2 (number of 1's that should be present in each column respectively)
Output: Possible
Explanation:
1 1
0 1
0 0
I tried solving this problem for like 12 hours by checking if the summation of Ri equals the summation of Ci.
But that check alone is not sufficient for cases like
3 3
1 3 0
0 2 2
(the sums match, yet the row needing three 1's must put a 1 in every column, while the first column must stay all zeros).
r and c can be up to 10^5.
Any ideas how I should move further?
Edit: Constraints added, and the output should only be "possible" or "impossible". The possible matrix need not be displayed.
Can anyone help me now?
Hint: one possible solution utilizes the Maximum Flow Problem, by creating a special graph and running a standard maximum flow algorithm on it.
If you're not familiar with the above problem, you may start reading about it e.g. here https://en.wikipedia.org/wiki/Maximum_flow_problem
If you're interested in the full solution please comment and I'll update the answer. But it requires understanding the above algorithm.
Solution as requested:
Create a graph of r+c+2 nodes.
Node 0 is the source, node r+c+1 is the sink. Nodes 1..r represent the rows, while r+1..r+c the columns.
Create the following edges:
from source to nodes i=1..r of capacity r_i
from nodes i=r+1..r+c to sink of capacity c_i
between all the nodes i=1..r and j=r+1..r+c of capacity 1
Run a maximum flow algorithm; the saturated edges between row nodes and column nodes define where you should put the 1's.
If it's not possible, then the maximum flow value is less than the number of expected ones in the matrix.
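If you want to try this quickly, here is a sketch of the construction using the networkx library (assuming it is available; the helper name matrix_possible is mine):

import networkx as nx

def matrix_possible(row_sums, col_sums):
    r, c = len(row_sums), len(col_sums)
    source, sink = 0, r + c + 1
    G = nx.DiGraph()
    for i, ri in enumerate(row_sums, start=1):      # source -> row nodes
        G.add_edge(source, i, capacity=ri)
    for j, cj in enumerate(col_sums, start=r + 1):  # column nodes -> sink
        G.add_edge(j, sink, capacity=cj)
    for i in range(1, r + 1):                       # row -> column, capacity 1
        for j in range(r + 1, r + c + 1):
            G.add_edge(i, j, capacity=1)
    flow_value, _ = nx.maximum_flow(G, source, sink)
    return flow_value == sum(row_sums) == sum(col_sums)

print(matrix_possible([2, 1, 0], [1, 2]))  # True for the question's example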
I will illustrate the algorithm with an example.
Assume we have m rows and n columns. Let rows[i] be the number of 1s in row i, for 0 <= i < m,
and cols[j] be the number of 1s in column j, for 0 <= j < n.
For example, for m = 3, and n = 4, we could have: rows = {4 2 3}, cols = {1 3 2 3}, and
the solution array would be:
    1 3 2 3
  +--------
4 | 1 1 1 1
2 | 0 1 0 1
3 | 0 1 1 1
Because we only want to know whether a solution exists, the values in rows and cols may be permuted in any order. The solution of each permutation is just a permutation of the rows and columns of the above solution.
So, given rows and cols, sort cols in decreasing order, and rows in increasing order. For our example, we have cols = {3 3 2 1} and rows = {2 3 4}, and the equivalent problem.
    3 3 2 1
  +--------
2 | 1 1 0 0
3 | 1 1 1 0
4 | 1 1 1 1
We transform cols into a form that is better suited for the algorithm. What cols tells us is that we have two series of 1s of length 3, one series of 1s of length 2, and one series of 1s of length 1, that are to be distributed among the rows of the array. We rewrite cols to capture just that, that is COLS = {2/3 1/2 1/1}, 2 series of length 3, 1 series of length 2, and 1 series of length 1.
Because we have 2 series of length 3, a solution exists only if we can put two 1s in the first row. This is possible because rows[0] = 2. We do not actually put any 1 in the first row, but record the fact that 1s have been placed there by decrementing the length of the series of length 3. So COLS becomes:
COLS = {2/2 1/2 1/1}
and we combine our two counts for series of length 2, yielding:
COLS = {3/2 1/1}
We now have the reduced problem:
3 | 1 1 1 0
4 | 1 1 1 1
Again we need to place 1s from our series of length 2 to have a solution. Fortunately, rows[1] = 3 and we can do this. We decrement the length of 3/2 and get:
COLS = {3/1 1/1} = {4/1}
We have the reduced problem:
4 | 1 1 1 1
Which is solved by 4 series of length 1, just what we have left. If at any step, the series in COLS cannot be used to satisfy a row count, then no solution is possible.
The general processing for each row may be stated as follows. For each row r, starting from the first element in COLS, decrement the lengths of as many elements count[k]/length[k] of COLS as needed, so that the sum of the count[k]'s equals rows[r]. Eliminate series of length 0 in COLS and combine series of same length.
Note that because elements of COLS are in decreasing order of lengths, the length of the last element decremented is always less than or equal to the next element in COLS (if there is a next element).
EXAMPLE 2 : Solution exists.
rows = {1 3 3}, cols = {2 2 2 1} => COLS = {3/2 1/1}
1 series of length 2 is decremented to satisfy rows[0] = 1, and the 2 other series of length 2 remain at length 2.
rows[0] = 1
COLS = {2/2 1/1 1/1} = {2/2 2/1}
The 2 series of length 2 are decremented, and 1 of the series of length 1.
The series whose length has become 0 is deleted, and the series of length 1 are combined.
rows[1] = 3
COLS = {2/1 1/0 1/1} = {2/1 1/1} = {3/1}
A solution exists because rows[2] can be satisfied.
rows[2] = 3
COLS = {3/0} = {}
EXAMPLE 3: Solution does not exist.
rows = {0 2 3}, cols = {3 2 0 0} => COLS = {1/3 1/2}
rows[0] = 0
COLS = {1/3 1/2}
rows[1] = 2
COLS = {1/2 1/1}
rows[2] = 3 => impossible to satisfy; no solution.
SPACE COMPLEXITY
It is easy to see that it is O(m + n).
TIME COMPLEXITY
We iterate over each row only once. For each row i, we need to iterate over at most
rows[i] <= n elements of COLS. Time complexity is O(m x n).
After finding this algorithm, I found the following theorem:
The Havel-Hakimi theorem (Havel 1955, Hakimi 1962) states that there exists a matrix X_{n,m} of 0's and 1's with row totals a_0 = (a_1, a_2, ..., a_n) and column totals b_0 = (b_1, b_2, ..., b_m) such that b_i >= b_{i+1} for every 0 < i < m, if and only if another matrix X_{n-1,m} of 0's and 1's with row totals a_1 = (a_2, a_3, ..., a_n) and column totals b_1 = (b_1 - 1, b_2 - 1, ..., b_{a_1} - 1, b_{a_1+1}, ..., b_m) also exists.
from the post Finding if binary matrix exists given the row and column sums.
This is basically what my algorithm does, while trying to optimize the decrementing part, i.e., all the -1's in the above theorem. Now that I see the above theorem, I know my algorithm is correct. Nevertheless, I checked the correctness of my algorithm by comparing it with a brute-force algorithm for arrays of up to 50 cells.
Here is the C# implementation.
public class Pair
{
    public int Count;   // how many columns (series) share this remaining length
    public int Length;  // how many 1s each of these columns still needs
}

public class PairsList
{
    public LinkedList<Pair> Pairs;  // COLS: series in decreasing order of Length
    public int TotalCount;          // total number of columns still needing 1s
}

class Program
{
    static void Main(string[] args)
    {
        // rows sorted in increasing order, cols in decreasing order, as described above
        int[] rows = new int[] { 0, 0, 1, 1, 2, 2 };
        int[] cols = new int[] { 2, 2, 0 };
        bool success = Solve(cols, rows);
    }

    static bool Solve(int[] cols, int[] rows)
    {
        PairsList pairs = new PairsList() { Pairs = new LinkedList<Pair>(), TotalCount = 0 };

        FillAllPairs(pairs, cols);

        for (int r = 0; r < rows.Length; r++)
        {
            if (rows[r] > 0)
            {
                if (pairs.TotalCount < rows[r])
                    return false;
                // the longest series needs more 1s than there are rows left
                if (pairs.Pairs.First != null && pairs.Pairs.First.Value.Length > rows.Length - r)
                    return false;
                DecrementPairs(pairs, rows[r]);
            }
        }

        return pairs.Pairs.Count == 0 || pairs.Pairs.Count == 1 && pairs.Pairs.First.Value.Length == 0;
    }

    // Decrement the lengths of the `count` longest remaining series, merging
    // series of equal length and dropping series whose length reaches 0.
    static void DecrementPairs(PairsList pairs, int count)
    {
        LinkedListNode<Pair> pair = pairs.Pairs.First;

        while (count > 0 && pair != null)
        {
            LinkedListNode<Pair> next = pair.Next;

            if (pair.Value.Count == count)
            {
                pair.Value.Length--;
                if (pair.Value.Length == 0)
                {
                    pairs.Pairs.Remove(pair);
                    pairs.TotalCount -= count;
                }
                else if (pair.Next != null && pair.Next.Value.Length == pair.Value.Length)
                {
                    pair.Value.Count += pair.Next.Value.Count;
                    pairs.Pairs.Remove(pair.Next);
                    next = pair;
                }
                count = 0;
            }
            else if (pair.Value.Count < count)
            {
                count -= pair.Value.Count;
                pair.Value.Length--;
                if (pair.Value.Length == 0)
                {
                    pairs.Pairs.Remove(pair);
                    pairs.TotalCount -= pair.Value.Count;
                }
                else if (pair.Next != null && pair.Next.Value.Length == pair.Value.Length)
                {
                    pair.Value.Count += pair.Next.Value.Count;
                    pairs.Pairs.Remove(pair.Next);
                    next = pair;
                }
            }
            else // pair.Value.Count > count
            {
                Pair p = new Pair() { Count = count, Length = pair.Value.Length - 1 };
                pair.Value.Count -= count;
                if (p.Length > 0)
                {
                    if (pair.Next != null && pair.Next.Value.Length == p.Length)
                        pair.Next.Value.Count += p.Count;
                    else
                        pairs.Pairs.AddAfter(pair, p);
                }
                else
                    pairs.TotalCount -= count;
                count = 0;
            }

            pair = next;
        }
    }

    // Build the COLS list of Count/Length pairs from cols (which must be
    // sorted in decreasing order), merging into any existing pairs.
    static int FillAllPairs(PairsList pairs, int[] cols)
    {
        List<Pair> newPairs = new List<Pair>();
        int c = 0;

        while (c < cols.Length && cols[c] > 0)
        {
            int k = c++;
            if (cols[k] > 0)
                pairs.TotalCount++;
            while (c < cols.Length && cols[c] == cols[k])
            {
                if (cols[k] > 0) pairs.TotalCount++;
                c++;
            }
            newPairs.Add(new Pair() { Count = c - k, Length = cols[k] });
        }

        LinkedListNode<Pair> pair = pairs.Pairs.First;
        foreach (Pair p in newPairs)
        {
            while (pair != null && p.Length < pair.Value.Length)
                pair = pair.Next;

            if (pair == null)
            {
                pairs.Pairs.AddLast(p);
            }
            else if (p.Length == pair.Value.Length)
            {
                pair.Value.Count += p.Count;
                pair = pair.Next;
            }
            else // p.Length > pair.Value.Length
            {
                pairs.Pairs.AddBefore(pair, p);
            }
        }

        return c;
    }
}
(Note: to avoid confusion between when I'm talking about the actual numbers in the problem vs. when I'm talking about the zeros and ones in the matrix, I'm going to instead fill the matrix with spaces and X's. This obviously doesn't change the problem.)
Some observations:
If you're filling in a row, and there's (for example) one column needing 10 more X's and another column needing 5 more X's, then you're sometimes better off putting the X in the "10" column and saving the "5" column for later (because you might later run into 5 rows that each need 2 X's), but you're never better off putting the X in the "5" column and saving the "10" column for later (because even if you later run into 10 rows that all need an X, they won't mind if they don't all go in the same column). So we can use a somewhat "greedy" algorithm: always put an X in the column still needing the most X's. (Of course, we'll need to make sure that we don't greedily put an X in the same column multiple times for the same row!)
Since you don't need to actually output a possible matrix, the rows are all interchangeable and the columns are all interchangeable; all that matter is how many rows still need 1 X, how many still need 2 X's, etc., and likewise for columns.
With that in mind, here's one fairly simple approach:
(Optimization.) Add up the counts for all the rows, add up the counts for all the columns, and return "impossible" if the sums don't match.
Create an array of length r+1 and populate it with how many columns need 1 X, how many need 2 X's, etc. (You can ignore any columns needing 0 X's.)
(Optimization.) To help access the array efficiently, build a stack/linked-list/etc. of the indices of nonzero array elements, in decreasing order (e.g., starting at index r if it's nonzero, then index r−1 if it's nonzero, etc.), so that you can easily find the elements representing columns to put X's in.
(Optimization.) To help determine when a row can't be satisfied, also make note of the total number of columns needing any X's, and make note of the largest number of X's needed by any row. If the former is less than the latter, return "impossible".
(Optimization.) Sort the rows by the number of X's they need.
Iterate over the rows, starting with the one needing the fewest X's and ending with the one needing the most X's, and for each one:
Update the array accordingly. For example, if a row needs 12 X's, and the array looks like [..., 3, 8, 5], then you'll update the array to look like [..., 3+7 = 10, 8+5−7 = 6, 5−5 = 0]. If it's not possible to update the array because you run out of columns to put X's in, return "impossible". (Note: this part should never actually return "impossible", because we're keeping count of the number of columns left and the max number of columns we'll need, so we should have already returned "impossible" if this was going to happen. I mention this check only for clarity.)
Update the stack/linked-list of indices of nonzero array elements.
Update the total number of columns needing any X's. If it's now less than the greatest number of X's needed by any row, return "impossible".
(Optimization.) If the first nonzero array element has an index greater than the number of rows left, return "impossible".
If we complete our iteration without having returned "impossible", return "possible".
(Note: the reason I say to start with the row needing the fewest X's, and work your way to the row with the most X's, is that a row needing more X's may involve examining updating more elements of the array and of the stack, so the rows needing fewer X's are cheaper. This isn't just a matter of postponing the work: the rows needing fewer X's can help "consolidate" the array, so that there will be fewer distinct column-counts, making the later rows cheaper than they would otherwise be. In a very-bad-case scenario, such as the case of a square matrix where every single row needs a distinct positive number of X's and every single column needs a distinct positive number of X's, the fewest-to-most order means you can handle each row in O(1) time, for linear time overall, whereas the most-to-fewest order would mean that each row would take time proportional to the number of X's it needs, for quadratic time overall.)
Overall, this takes no worse than O(r+c+n) time (where n is the number of X's); I think that the optimizations I've listed are enough to ensure that it's closer to O(r+c) time, but it's hard to be 100% sure. I recommend trying it to see if it's fast enough for your purposes.
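For illustration, here is a simpler (and less optimized) Python rendering of the same greedy idea, using a max-heap over the remaining column counts instead of the count array described above. It does more work per row, but follows the same "column needing the most X's first" rule; the function name is mine:

import heapq

def possible(row_counts, col_counts):
    if sum(row_counts) != sum(col_counts):
        return False
    # max-heap of remaining column demands (negated, since heapq is a min-heap)
    heap = [-c for c in col_counts if c > 0]
    heapq.heapify(heap)
    for need in sorted(row_counts):      # fewest X's first, as suggested above
        if need == 0:
            continue
        if need > len(heap):             # not enough columns still needing an X
            return False
        taken = [heapq.heappop(heap) for _ in range(need)]
        for t in taken:
            if t + 1 < 0:                # column still needs X's afterwards
                heapq.heappush(heap, t + 1)
    return not heap

print(possible([2, 1, 0], [1, 2]))  # True for the question's example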
You can use brute force (iterating through all 2^(r * c) possibilities) to solve it, but that will take a long time. If r * c is under 64, you can accelerate it to a certain extent using bit-wise operations on 64-bit integers; however, even then, iterating through all 64-bit possibilities would take, at 1 try per ms, over 500M years.
A wiser choice is to add bits one by one, and only continue placing bits if no constraints are broken. This will eliminate the vast majority of possibilities, greatly speeding up the process. Look up backtracking for the general idea. It is not unlike solving sudokus through guesswork: once it becomes obvious that your guess was wrong, you erase it and try guessing a different digit.
As with sudokus, there are certain strategies that can be written into code and will result in speedups when they apply. For example, if the sum of 1s in rows is different from the sum of 1s in columns, then there are no solutions.
If over 50% of the bits will be on, you can instead work on the complementary problem (transform all ones to zeroes and vice-versa, while updating row and column counts). Both problems are equivalent, because any answer for one is also valid for the complementary.
This problem can be solved in O(n log n) using Gale-Ryser Theorem. (where n is the maximum of lengths of the two degree sequences).
First, make both sequences of equal length by adding 0's to the smaller sequence, and let this length be n.
Let the sequences be A and B. Sort A in non-increasing order, and sort B in non-increasing order as well. Create a prefix sum array P for B such that the ith element of P is equal to the sum of the first i elements of B.
Now, iterate over k from 1 to n, and check the Gale-Ryser condition:
A_1 + A_2 + ... + A_k <= min(B_1, k) + min(B_2, k) + ... + min(B_n, k)
The second sum can be calculated in O(log n): binary search for the position of the last number in B that is >= k (those entries contribute k each), and take the sum of the remaining, smaller entries from the precalculated P.
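A Python sketch of this check (assuming the padding and sorting conventions above; the helper name gale_ryser is mine):

from bisect import bisect_left

def gale_ryser(a, b):
    # a, b: lists of row and column sums
    n = max(len(a), len(b))
    a = sorted(a + [0] * (n - len(a)), reverse=True)  # non-increasing
    b = sorted(b + [0] * (n - len(b)))                # non-decreasing, for bisect
    if sum(a) != sum(b):
        return False
    P = [0]
    for x in b:
        P.append(P[-1] + x)                           # prefix sums of b
    left = 0
    for k in range(1, n + 1):
        left += a[k - 1]
        # sum of min(b_i, k): entries < k contribute themselves,
        # entries >= k contribute k each
        idx = bisect_left(b, k)
        right = P[idx] + k * (n - idx)
        if left > right:
            return False
    return True

print(gale_ryser([2, 1, 0], [1, 2]))  # True for the question's example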
Inspired by the solution given by RobertBaron, I have tried to build a new algorithm.
rows = [int(x) for x in input().split()]
cols = [int(ss) for ss in input().split()]
rows.sort()
cols.sort(reverse=True)
for i in range(len(rows)):
    for j in range(len(cols)):
        if rows[i] != 0 and cols[j] != 0:
            rows[i] = rows[i] - 1
            cols[j] = cols[j] - 1
print("rows: ", rows)
print("cols: ", cols)
# if there is any non-zero value, print NO else print YES
flag = True
for i in range(len(rows)):
    if rows[i] != 0:
        flag = False
        break
for j in range(len(cols)):
    if cols[j] != 0:
        flag = False
if flag:
    print("YES")
else:
    print("NO")
Here, I have sorted the rows in ascending order and the cols in descending order, later decrementing a particular row and column whenever a 1 needs to be placed.
It is working for all the test cases posted here! The rest, GOD knows.

Find number of continuous subarray having sum zero

You are given an array and you have to find the number of contiguous subarrays whose sum is zero.
example:
1) 0, 1, -1, 0 => 6 {{0}, {1,-1}, {0,1,-1}, {1,-1,0}, {0,1,-1,0}, {0}}
2) 5, 2, -2, 5, -5, 9 => 3.
It can be done in O(n^2). I am trying to find a solution below this complexity.
Consider S[0..N] - prefix sums of your array, i.e. S[k] = A[0] + A[1] + ... + A[k-1] for k from 0 to N.
Now sum of elements from L to R-1 is zero if and only if S[R] = S[L]. It means that you have to find number of indices 0 <= L < R <= N such that S[L] = S[R].
This problem can be solved with a hash table. Iterate over the elements of S[] while maintaining, for each value X, the number of times it has been met in the already processed part of S[]. These counts are stored in a hash map, where the number X is the key and the count H[X] is the value. When you meet a new element S[i], add H[S[i]] to your answer (this accounts for the subarrays ending at the (i-1)-st element), then increment H[S[i]] by one.
Note that if sum of absolute values of array elements is small, you can use a simple array instead of hash table. The complexity is linear on average.
Here is the code:
#include <vector>
#include <unordered_map>
using namespace std;

long long CountZeroSubstrings(vector<int> A) {
    int n = A.size();
    vector<long long> S(n + 1, 0);
    for (int i = 0; i < n; i++)
        S[i + 1] = S[i] + A[i];
    long long answer = 0;
    unordered_map<long long, int> H;
    for (int i = 0; i <= n; i++) {
        if (H.count(S[i]))
            answer += H[S[i]];
        H[S[i]]++;
    }
    return answer;
}
This can be solved in linear time by keeping a hash table of sums reached during the array traversal. The number of subarrays can then be directly calculated from the counts of revisited sums.
Haskell version:
import qualified Data.Map as M
import Data.List (foldl')

f = foldl' (\b a -> b + div (a * (a + 1)) 2) 0 . M.elems . snd
  . foldl' (\(s, m) x -> let s' = s + x in case M.lookup s' m of
      Nothing   -> (s', M.insert s' 0 m)
      otherwise -> (s', M.adjust (+1) s' m)) (0, M.fromList [(0, 0)])
Output:
*Main> f [0,1,-1,0]
6
*Main> f [5,2,-2,5,-5,9]
3
*Main> f [0,0,0,0]
10
*Main> f [0,1,0,0]
4
*Main> f [0,1,0,0,2,3,-3]
5
*Main> f [0,1,-1,0,0,2,3,-3]
11
C# version of stgatilov's answer https://stackoverflow.com/a/31489960/3087417 with readable variables:
int[] sums = new int[arr.Count() + 1];
for (int i = 0; i < arr.Count(); i++)
    sums[i + 1] = sums[i] + arr[i];

int numberOfFragments = 0;
Dictionary<int, int> sumToNumberOfRepetitions = new Dictionary<int, int>();
foreach (int item in sums)
{
    if (sumToNumberOfRepetitions.ContainsKey(item))
        numberOfFragments += sumToNumberOfRepetitions[item];
    else
        sumToNumberOfRepetitions.Add(item, 0);
    sumToNumberOfRepetitions[item]++;
}
return numberOfFragments;
If you want the sum to be not only zero but any number k, here is the hint:

int numToFind = currentSum - k;
if (sumToNumberOfRepetitions.ContainsKey(numToFind))
    numberOfFragments += sumToNumberOfRepetitions[numToFind];
I feel it can be solved using DP:
Let the state be:
DP[i][j] = the number of subarrays ending at i whose sum is j.
Transitions:
For every element, first account for the subarray of length 1 starting and ending at i, i.e.
DP[i][Element[i]]++;
then, for every j in the range [-M, +M], where M bounds the absolute value of any subarray sum (the sum of the magnitudes of all elements suffices):
DP[i][j] += DP[i-1][j - Element[i]];
Then your answer is the sum of all DP[i][0] (the number of ways to form 0 using subarrays ending at i), where i varies from 1 to the number of elements.
The complexity is O(M * number of elements).
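For concreteness, a small Python sketch of this DP, bounding j by the sum of the absolute values of the elements (which any subarray sum must stay within):

def count_zero_subarrays_dp(a):
    S = sum(abs(x) for x in a) or 1  # bound on |any subarray sum|
    width = 2 * S + 1
    prev = [0] * width               # prev[j + S] = #subarrays ending at i-1 with sum j
    total = 0
    for x in a:
        cur = [0] * width
        cur[x + S] = 1               # the length-1 subarray [x]
        for j in range(-S, S + 1):
            if -S <= j - x <= S:
                cur[j + S] += prev[j - x + S]
        total += cur[S]              # subarrays ending here with sum 0
        prev = cur
    return total

print(count_zero_subarrays_dp([0, 1, -1, 0]))         # 6
print(count_zero_subarrays_dp([5, 2, -2, 5, -5, 9]))  # 3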
https://www.techiedelight.com/find-sub-array-with-0-sum/
This would be an exact solution.
# Utility function to insert <key, value> into the dict
def insert(dict, key, value):
    # if the key is seen for the first time, initialize the list
    dict.setdefault(key, []).append(value)

# Function to print all sub-lists with 0 sum present in the given list
def printallSublists(A):
    # create an empty dict to store the ending index of all
    # sub-lists having the same sum
    dict = {}
    # insert (0, -1) pair into the dict to handle the case when
    # a sub-list with 0 sum starts from index 0
    insert(dict, 0, -1)
    result = 0
    sum = 0
    # traverse the given list
    for i in range(len(A)):
        # sum of elements so far
        sum += A[i]
        # if sum is seen before, there exists at least one
        # sub-list with 0 sum
        if sum in dict:
            list = dict.get(sum)
            result += len(list)
            # find all sub-lists with the same sum
            for value in list:
                print("Sublist is", (value + 1, i))
        # insert (sum so far, current index) pair into the dict
        insert(dict, sum, i)
    print("length :", result)

if __name__ == '__main__':
    A = [0, 1, 2, -3, 0, 2, -2]
    printallSublists(A)
I don't know what the complexity of my suggestion would be, but I have an idea :)
What you can do is try to remove from the main array the elements that cannot contribute to your solution.
Suppose the elements are -10, 5, 2, -2, 5, 7, -5, 9, 11, 19.
You can see that -10, 9, 11 and 19 are elements
that are never going to be useful for making a sum of 0 in your case,
so try to remove -10, 9, 11, and 19 from your main array.
To do this, what you can do is:
1) create two sub-arrays from your main array:
positive {5,7,2,9,11,19} and negative {-10,-2,-5}
2) remove the elements from the positive array which do not satisfy the condition:
condition -> the value can be constructed from the negative array's elements
or a sum of its elements
i.e.
5 = -5 // so keep it // don't consider the sign
7 = (-5 + -2) // keep
2 = -2 // keep
9 // cannot be constructed using -10, -2, -5
same for 11 and 19
3) remove the elements from the negative array which do not satisfy the condition:
condition -> the value can be constructed from the positive array's elements
or a sum of its elements
i.e. -10 // cannot be constructed, so discard
-2 = 2 // keep
-5 = 5 // keep
So finally you get an array which contains -2, -5, 5, 7, 2. Create all possible sub-arrays from it and check for sum = 0.
(Note: if your input array contains 0's, add all the 0's to the final array.)

Number of Paths in a Triangle

I recently encountered a much more difficult variation of this problem, but realized I couldn't generate a solution for this very simple case. I searched Stack Overflow but couldn't find a resource that previously answered this.
You are given a triangle ABC, and you must compute the number of paths of a certain length that start at and end at 'A'. Say our function f(3) is called; it must return the number of paths of length 3 that start and end at A: 2 (ABA, ACA).
I'm having trouble formulating an elegant solution. Right now, I've written a solution that generates all possible paths, but for larger lengths, the program is just too slow. I know there must be a nice dynamic programming solution that reuses sequences that we've previously computed but I can't quite figure it out. All help greatly appreciated.
My dumb code:
def paths(n, sequence):
    t = ['A', 'B', 'C']
    if len(sequence) < n:
        for node in set(t) - set(sequence[-1]):
            paths(n, sequence + node)
    else:
        if sequence[0] == 'A' and sequence[-1] == 'A':
            print sequence
Let PA(n) be the number of paths of length n (counting letters, as in the question) that start at A and end at A.
Let P!A(n) be the number of such paths that start at B (or C) and end at A.
Then:
PA(1) = 1
PA(n) = 2 * P!A(n - 1)
P!A(1) = 0
P!A(2) = 1
P!A(n) = P!A(n - 1) + PA(n - 1)
= P!A(n - 1) + 2 * P!A(n - 2) (for n > 2) (substituting for PA(n-1))
We can solve the difference equations for P!A analytically, as we do for Fibonacci, by noting that (-1)^n and 2^n are both solutions of the difference equation, and then finding coefficients a, b such that P!A(n) = a*2^n + b*(-1)^n.
We end up with the equation P!A(n) = 2^n/6 + (-1)^n/3, and PA(n) being 2^(n-1)/3 - 2(-1)^n/3.
This gives us code:
def PA(n):
    return (pow(2, n-1) + 2*pow(-1, n-1)) / 3

for n in xrange(1, 30):
    print n, PA(n)
Which gives output:
1 1
2 0
3 2
4 2
5 6
6 10
7 22
8 42
9 86
10 170
11 342
12 682
13 1366
14 2730
15 5462
16 10922
17 21846
18 43690
19 87382
20 174762
21 349526
22 699050
23 1398102
24 2796202
25 5592406
26 11184810
27 22369622
28 44739242
29 89478486
The trick is not to try to generate all possible sequences. The number of them increases exponentially so the memory required would be too great.
Instead, let f(n) be the number of sequences of length n beginning and ending with A, and let g(n) be the number of sequences of length n beginning with A but ending with B. To get things started, clearly f(1) = 1 and g(1) = 0. For n > 1 we have f(n) = 2g(n - 1), because the penultimate letter will be B or C and there are equal numbers of each. We also have g(n) = f(n - 1) + g(n - 1), because if a sequence begins with A and ends with B, the penultimate letter is either A or C.
These rules allow you to compute the numbers really quickly using memoization, as in the sketch below.
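A minimal memoized Python sketch of these two mutually recursive rules (my own transcription):

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # sequences of length n beginning and ending with A
    return 1 if n == 1 else 2 * g(n - 1)

@lru_cache(maxsize=None)
def g(n):
    # sequences of length n beginning with A and ending with B
    return 0 if n == 1 else f(n - 1) + g(n - 1)

print(f(3))  # 2 -> ABA, ACA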
My method is like this:
Define DP(l, end) = # of paths that end at node `end` and have length l.
Then DP(l, 'A') = DP(l-1, 'B') + DP(l-1, 'C'), and similarly for DP(l, 'B') and DP(l, 'C').
For the base case, l = 1: I check if the end is not 'A', then I return 0, otherwise 1, so that all bigger states only count paths that start at 'A'.
The answer is simply DP(n, 'A'), where n is the length.
Below is a sample code in C++, you can call it with 3 which gives you 2 as answer; call it with 5 which gives you 6 as answer:
ABCBA, ACBCA, ABABA, ACACA, ABACA, ACABA
#include <bits/stdc++.h>
using namespace std;

int dp[500][500], n;

int DP(int l, int end){
    if(l <= 0) return 0;
    if(l == 1){
        if(end != 'A') return 0;
        return 1;
    }
    if(dp[l][end] != -1) return dp[l][end];
    if(end == 'A') return dp[l][end] = DP(l-1, 'B') + DP(l-1, 'C');
    else if(end == 'B') return dp[l][end] = DP(l-1, 'A') + DP(l-1, 'C');
    else return dp[l][end] = DP(l-1, 'A') + DP(l-1, 'B');
}

int main() {
    memset(dp, -1, sizeof(dp));
    scanf("%d", &n);
    printf("%d\n", DP(n, 'A'));
    return 0;
}
EDITED
To answer OP's comment below:
Firstly, DP (dynamic programming) is always about state.
Remember that here our state is DP(l, end), representing the # of paths having length l and ending at end. To implement states in a program, we usually use an array, so DP[500][500] is nothing special but the space to store the states DP(l, end) for all possible l and end (that's why I said: if you need a bigger length, change the size of the array).
But then you may ask: I understand the first dimension, which is for l, and 500 means l can be as large as 500, but how about the second dimension? I only need 'A', 'B', 'C'; why use 500 then?
Here is another trick (of C/C++): the char type can indeed be used as an int type by default, with a value equal to its ASCII number. I do not remember the whole ASCII table of course, but I know that around 300 will be enough to represent all the ASCII characters, including A (65), B (66), C (67).
So I just declare any size large enough to represent 'A', 'B', 'C' in the second dimension (that means 100 is actually more than enough, but I just do not think about it that much and declare 500, as they are almost the same in terms of order).
So you asked what DP[3][1] means: it means nothing, as I do not need / calculate the second dimension when it is 1. (Or one can think that the state dp(3,1) does not have any physical meaning in our problem.)
In fact, I am always using 65, 66, 67,
so DP[3][65] means the # of paths of length 3 that end at char(65) = 'A'.
You can do better than the dynamic programming/recursion solution others have posted, for the given triangle and more general graphs. Whenever you are trying to compute the number of walks in a (possibly directed) graph, you can express this in terms of the entries of powers of a transfer matrix. Let M be a matrix whose entry m[i][j] is the number of paths of length 1 from vertex i to vertex j. For a triangle, the transfer matrix is
0 1 1
1 0 1
1 1 0
Then M^n is a matrix whose i,j entry is the number of paths of length n from vertex i to vertex j. If A corresponds to vertex 1, you want the 1,1 entry of M^n.
Dynamic programming and recursion for the counts of paths of length n in terms of the paths of length n-1 are equivalent to computing M^n with n multiplications, M * M * M * ... * M, which can be fast enough. However, if you want to compute M^100, instead of doing 100 multiplies, you can use repeated squaring: Compute M, M^2, M^4, M^8, M^16, M^32, M^64, and then M^64 * M^32 * M^4. For larger exponents, the number of multiplies is about c log_2(exponent).
Instead of using that a path of length n is made up of a path of length n-1 and then a step of length 1, this uses that a path of length n is made up of a path of length k and then a path of length n-k.
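A small Python sketch of the repeated-squaring idea (note that if "length" counts letters, as in the question, then f(n) is the [0][0] entry of M^(n-1), since n letters make n-1 steps):

def mat_mult(X, Y):
    # 3x3 matrix product
    return [[sum(X[i][t] * Y[t][j] for t in range(3)) for j in range(3)]
            for i in range(3)]

def mat_pow(M, e):
    # repeated squaring: about log2(e) squarings instead of e multiplications
    R = [[int(i == j) for j in range(3)] for i in range(3)]  # identity
    while e:
        if e & 1:
            R = mat_mult(R, M)
        M = mat_mult(M, M)
        e >>= 1
    return R

M = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
print(mat_pow(M, 2)[0][0])  # paths of 3 letters from A back to A: 2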
We can solve this with a for loop, although Anonymous described a closed form for it.
function f(n){
    var as = 0, abcs = 1;
    for (n = n - 3; n > 0; n--){
        as = abcs - as;
        abcs *= 2;
    }
    return 2 * (abcs - as);
}
Here's why:
Look at one strand of the decision tree (the other one is symmetrical):
A
B C...
A C
B C A B
A C A B B C A C
B C A B B C A C A C A B B C A B
Num A's Num ABC's (starting with first B on the left)
0 1
1 (1-0) 2
1 (2-1) 4
3 (4-1) 8
5 (8-3) 16
11 (16-5) 32
Clearly, we can't use the strands that end with the A's...
You can write a recursive brute force solution and then memoize it (aka top-down dynamic programming). Recursive solutions are more intuitive and easier to come up with. Here is my version:

# search space (we have a triangle with nodes A, B, C)
from functools import lru_cache

n = 3  # desired path length, counting letters

@lru_cache(maxsize=None)  # memoize!
def recurse(length, steps):
    # `steps` is the net number of clockwise moves; we are at node 'A'
    # exactly when steps is a multiple of 3. If the path has n letters,
    # it's a valid path and we can count it iff it ends back at 'A'.
    if length == n:
        return 1 if steps % 3 == 0 else 0
    # from each position, we have two possibilities: go to the next
    # node or the previous node. Total paths will be the sum of both
    # possibilities. We do this recursively.
    return recurse(length + 1, steps + 1) + recurse(length + 1, steps - 1)

print(recurse(1, 0))  # start at 'A' with one letter written; 2 for n = 3

Random choosing number in array without repeated

I have an algorithm to randomly select an element t from an array without repetition. Here is more detail of the algorithm.
It can be explained as follows:
1. Initialize an index array u that stores the indices 1 to k (lines 1 to 3).
2. Set gamma to k initially and reduce it by one on each iteration. The purpose of gamma is to avoid repetition (lines 4, 9, 10).
3. Randomly choose a number t from 1 to N (at j = 1, choose from 1 to k; within each round the drawn numbers do not repeat), and then move the chosen number to the end of the array.
4. Repeat steps 2 to 3.
5. If gamma = 0, reset gamma = k.
This function will return t.
For example, I have an array A = [1,2,3,4,5,6,7,8,9], k = 9 = size(A), N = 12 (from 1 to 9, each number is selected only once). Now I want to use this algorithm to randomly select a number t from array A. This is my code; however, it does not match line 6 of the algorithm. Is it right? Please look at my code and help me.
function nonRepeat
    k = 9;
    u = 1:k;   % initial value of index
    N = 12;
    gamma = k;
    for j = 1:N
        index = randi(gamma,1); % use other choosing
        t = u(index)
        %% swapping
        temp = u(t);
        u(t) = u(gamma);
        u(gamma) = temp;
        gamma = gamma - 1;
        if gamma == 0
            gamma = k;
        end
    end
end
I think index = randi(gamma,1); is not right, because the algorithm says to select the number t randomly, but you select an index randomly, assign t = u(index), and then use t (a value, not an index) in the swap.
See if this works:
k = 9;
u = 1 : k;
N = 12;
gamma = k;
for j = 1 : N
    t = randi(gamma,1);
    temp = u(t);
    u(t) = u(gamma);
    u(gamma) = temp;
    gamma = gamma - 1;
    if gamma == 0
        gamma = k;
    end
end
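For comparison, here is the same swap-based, no-repeat drawing written in Python (my own transcription; random.randrange plays the role of randi):

import random

def sample_without_repeat(a, N):
    # draw N values from a; no value repeats until all k values have been drawn
    u = list(a)
    k = len(u)
    gamma = k
    out = []
    for _ in range(N):
        t = random.randrange(gamma)              # random index into the unused prefix
        out.append(u[t])
        u[t], u[gamma - 1] = u[gamma - 1], u[t]  # move the drawn value to the end
        gamma -= 1
        if gamma == 0:                           # all k drawn: start a new round
            gamma = k
    return out

print(sample_without_repeat(range(1, 10), 12))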

Find the minimum number of operations required to compute a number using a specified range of numbers

Let me start with an example:
I have a range of numbers from 1 to 9, and let's say the target number that I want is 29.
In this case the minimum number of operations required would be 2: (9*3)+2. Similarly, for 18 the minimum number of operations is 1 (9*2=18).
I can use any of the 4 arithmetic operators - +, -, / and *.
How can I programmatically find out the minimum number of operations required?
Thanks in advance for any help provided.
clarification: integers only, no decimals allowed mid-calculation. i.e. the following is not valid (from comments below): ((9/2) + 1) * 4 == 22
I must admit I didn't think about this thoroughly, but for my purpose it doesn't matter if decimal numbers appear mid-calculation. ((9/2) + 1) * 4 == 22 is valid. Sorry for the confusion.
For the special case where set Y = [1..9] and n > 0:
n <= 9 : 0 operations
n <=18 : 1 operation (+)
otherwise : Remove any divisor found in Y. If this is not enough, do a recursion on the remainder for all offsets -9 .. +9. Offset 0 can be skipped as it has already been tried.
Notice how division is not needed in this case. For other Y this does not hold.
This algorithm is exponential in log(n). The exact analysis is a job for somebody with more knowledge about algebra than I.
For more speed, add pruning to eliminate some of the search for larger numbers.
Sample code:
def findop(n, maxlen=9999):
    # Return a short postfix list of numbers and operations

    # Simple solution to small numbers
    if n <= 9: return [n]
    if n <= 18: return [9, n-9, '+']

    # Find direct multiply
    x = divlist(n)
    if len(x) > 1:
        mults = len(x) - 1
        x[-1:] = findop(x[-1], maxlen - 2*mults)
        x.extend(['*'] * mults)
        return x

    shortest = 0
    for o in range(1, 10) + range(-1, -10, -1):
        x = divlist(n - o)
        if len(x) == 1: continue
        mults = len(x) - 1
        # We spent len(divlist) + mults + 2 fields for offset.
        # The last number is expanded by the recursion, so it doesn't count.
        recursion_maxlen = maxlen - len(x) - mults - 2 + 1
        if recursion_maxlen < 1: continue
        x[-1:] = findop(x[-1], recursion_maxlen)
        x.extend(['*'] * mults)
        if o > 0:
            x.extend([o, '+'])
        else:
            x.extend([-o, '-'])
        if shortest == 0 or len(x) < shortest:
            shortest = len(x)
            maxlen = shortest - 1
            solution = x[:]

    if shortest == 0:
        # Fake solution, it will be discarded
        return '#' * (maxlen + 1)
    return solution

def divlist(n):
    l = []
    for d in range(9, 1, -1):
        while n % d == 0:
            l.append(d)
            n = n / d
    if n > 1: l.append(n)
    return l
The basic idea is to test all possibilities with k operations, for k starting from 0. Imagine you create a tree of height k that branches for every possible new operation and operand (4*9 branches per level). You need to traverse and evaluate the leaves of the tree for each k before moving to the next k.
I didn't test this pseudo-code:
for every k from 0 to infinity:
    for every n from 1 to 9:
        if compute(n, 0, k):
            return k

boolean compute(n, j, k):
    if (j == k):
        return (n == target)
    else:
        for each operator in {+, -, *, /}:
            for every i from 1 to 9:
                if compute((n operator i), j+1, k):
                    return true
        return false
It doesn't take into account arithmetic operator precedence and braces; that would require some rework.
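The pseudo-code above, transcribed into runnable Python as a sketch (I use Fraction for exact arithmetic; like the pseudo-code, it applies operations left to right and ignores precedence):

from fractions import Fraction

def min_ops_forward(target, digits=range(1, 10)):
    # iterative deepening on k = the number of operations
    def compute(n, j, k):
        if j == k:
            return n == target
        for i in digits:
            for cand in (n + i, n - i, n * i, n / i):
                if compute(cand, j + 1, k):
                    return True
        return False

    k = 0
    while True:
        if any(compute(Fraction(n), 0, k) for n in digits):
            return k
        k += 1

print(min_ops_forward(29))  # 2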
Really cool question :)
Notice that you can start from the end! From your example, (9*3)+2 = 29 is equivalent to saying (29-2)/3 = 9. That way we can avoid the double loop in cyborg's answer. This suggests the following algorithm for digit set Y and result r:

nextleaves = {r}
nops = 0
while (true):
    nops = nops + 1
    leaves = nextleaves
    nextleaves = {}
    for leaf in leaves:
        for y in Y:
            if (leaf+y) or (leaf-y) or (leaf*y) or (leaf/y) is in Y:
                return nops
            else:
                add (leaf+y) and (leaf-y) and (leaf*y) and (leaf/y) to nextleaves
This is the basic idea; performance can certainly be improved, for instance by avoiding "backtracks" such as r+a-a or r*a*b/a.
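A Python sketch of this backward search (without the backtrack pruning just mentioned; Fractions are used since the question's edit allows non-integers mid-calculation):

from fractions import Fraction

def min_ops(target, digits=range(1, 10)):
    if target in digits:
        return 0                       # already a digit: no operation needed
    frontier = {Fraction(target)}
    ops = 0
    while True:
        ops += 1
        nxt = set()
        for leaf in frontier:
            for y in digits:
                for cand in (leaf + y, leaf - y, leaf * y, leaf / y):
                    if cand.denominator == 1 and 1 <= cand <= 9:
                        return ops     # reached a single digit
                    nxt.add(cand)
        frontier = nxt

print(min_ops(29))  # 2: (29 - 2) / 3 = 9, i.e. 29 = 9 * 3 + 2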
I guess my idea is similar to Peer Sommerlund's:
For big numbers, you advance fast by multiplying by big digits.
Is the target, say 29, prime? If not, divide it by its largest divisor from 2 to 9.
Otherwise, subtract a number to reach a divisible number. 27 is fine, since it is divisible by 9, so
(29-2)/9 = 3 =>
3*9+2 = 29
So maybe (I didn't think this through to the end): search for the next number below the target that is divisible by 9. If you don't reach a single digit, repeat.
The formula is the steps reversed.
(I'll try it for some numbers. :) )
I tried it with 2551, which is
echo $((((3*9+4)*9+4)*9+4))
But I didn't test every intermediate result for primality.
However,
echo $((8*8*8*5-9))
is 2 operations fewer. Maybe I can investigate this later.
