Split a number into ones or halves - Ruby

I'd like to split up a score into an array of n positions.
Let's say my score is 11 and the array is of size 12.
Then I'd like to have an array that is filled with, for example, 11 ones, or 10 ones and 2 halves (0.5). In the end it should sum to 11.
Then the possible scores are:
size = 12
possible_scores = (0..size).step(0.5).to_a
I can create an array of 12 positions:
scores = Array.new(size) {0}
I could pick a random value from the following possible values:
[0, 0.5, 1].sample
I'm looking for an efficient way to retrieve a random array without having lots of state variables, if possible. I already tried to do this in a while loop:
while score > 0
reducing score by the random value and keeping track of the array positions that have been set. But it became quite a messy piece of code.
Any ideas how to solve this? Thanks!
Edit:
For this example I want an array that sums up to 11. So any one of
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5, 0.5]
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
Or whatever combination that sums up to 11.

Parameters and variables
Given:
tot, the desired total, an integer or an odd multiple of 0.5; and
size, the total number of 0's, 0.5's and 1's that total tot, with the requirement that size >= tot.
We define three variables:
n0 equals the number of zeroes;
n0pt5_pairs equals the number of pairs of 0.5's; and
n1 equals the number of ones.
Case 1: tot is an integer
We require:
0 <= n0pt5_pairs <= [tot, size-tot].min
Note that because n1 = tot - n0pt5_pairs, 2 * n0pt5_pairs + n1 = n0pt5_pairs + tot > size if n0pt5_pairs > size-tot. That is, the total number of 0.5's and ones exceeds size if the number of 0.5 pairs exceeds size-tot.
Given a value for n0pt5_pairs that satisfies the above requirement, n0 and n1 are determined:
n1 = tot - n0pt5_pairs
n0 = size - 2*n0pt5_pairs - n1
= size - tot - n0pt5_pairs
We can therefore select a random triple [n0, 2*n0pt5_pairs, n1] as follows:
def random_combo(size, tot)
  n0pt5_pairs = rand(1 + [tot, size-tot].min)
  [size-tot-n0pt5_pairs, 2*n0pt5_pairs, tot-n0pt5_pairs]
end
For example:
arr = random_combo(17, 11)
#=> [3, 6, 8]
This is used to generate the array
arr1 = [*[0]*arr[0], *[0.5]*arr[1], *[1]*arr[2]]
#=> [0, 0, 0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 1, 1, 1, 1, 1, 1, 1]
which we shuffle:
arr1.shuffle
#=> [1, 0, 0.5, 1, 0.5, 0, 1, 1, 0, 1, 1, 1, 0.5, 0.5, 1, 0.5, 0.5]
Note arr1.size #=> 17 and arr1.sum #=> 11.0.
Case 2: tot is an odd multiple of 0.5
If
tot = n + 0.5
where n is an integer, every combination of 0's, 0.5's and 1's will have at least one 0.5. We can therefore compute the number of 0's and 1's, together with the number of 0.5's in excess of one. To do that we simply reduce tot by 0.5 (making it an integer) and size by one, use the Case 1 approach to solve that problem, and then increase the number of 0.5's in the result by one.
def generate(size, tot)
  return nil if size.zero?
  is_int = (tot == tot.floor)
  tot = tot.floor
  size -= 1 unless is_int
  n0pt5_pairs = rand(1 + [tot, size-tot].min)
  [*[0]*(size-tot-n0pt5_pairs), *[0.5]*(2*n0pt5_pairs + (is_int ? 0 : 1)),
   *[1]*(tot-n0pt5_pairs)].shuffle
end
ge = generate(17, 10)
#=> [0, 1, 0, 1, 0.5, 0.5, 0, 0.5, 0.5, 1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5]
ge.size #=> 17
ge.sum #=> 10.0
go = generate(17, 10.5)
#=> [0.5, 0.5, 0.5, 1, 0, 0.5, 0.5, 1, 1, 0.5, 1, 1, 0.5, 1, 0.5, 0.5, 0]
go.size #=> 17
go.sum #=> 10.5

Ruby provides all you require here; no need to write any algorithmic code. Array#repeated_combination is your friend here:
[0, 0.5, 1].
  repeated_combination(12).   # 91 unique variants
  to_a.                       # unfortunately it cannot be lazy
  shuffle.                    # to randomize the array outcome
  detect { |a| a.sum == 11 }.
  shuffle                     # to randomize numbers inside the array
#⇒ [0.5, 0.5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Sidenote: one might avoid the necessity to shuffle twice (both the array of generated arrays and the resulting array) by using Array#repeated_permutation, but this would drastically increase memory load and execution time.

I like Cary Swoveland's answer, but in fact this can be done without generating an array of solutions.
Let's consider a few examples.
Given size = 6 and score = 3, without shuffling, these are the possible outputs (numbered on the left for reasons that will become apparent):
i                   ones halves zeroes
0│ 1 1 1 0 0 0        3     0      3
1│ 1 1 ½ ½ 0 0        2     2      2
2│ 1 ½ ½ ½ ½ 0        1     4      1
3│ ½ ½ ½ ½ ½ ½        0     6      0
Given size = 6 and score = 3.5:
i                   ones halves zeroes
0│ 1 1 1 ½ 0 0        3     1      2
1│ 1 1 ½ ½ ½ 0        2     3      1
2│ 1 ½ ½ ½ ½ ½        1     5      0
Given size = 11 and score = 4.5:
i                             ones halves zeroes
0│ 1 1 1 1 ½ 0 0 0 0 0 0        4     1      6
1│ 1 1 1 ½ ½ ½ 0 0 0 0 0        3     3      5
2│ 1 1 ½ ½ ½ ½ ½ 0 0 0 0        2     5      4
3│ 1 ½ ½ ½ ½ ½ ½ ½ 0 0 0        1     7      3
4│ ½ ½ ½ ½ ½ ½ ½ ½ ½ 0 0        0     9      2
Given size = 12 and score = 11:
i                               ones halves zeroes
0│ 1 1 1 1 1 1 1 1 1 1 1 0       11     0      1
1│ 1 1 1 1 1 1 1 1 1 1 ½ ½       10     2      0
Can you see the patterns? After a bit of chin-scratching we discover the following facts:
The number of possible outputs 𝑛 for a given size and score is given by:
𝑛 = min(⌊score⌋, size − ⌈score⌉) + 1
As 𝑖 increases, the number of ones decreases. The number of ones is given by:
count(1) = ⌊score⌋ − 𝑖
As 𝑖 increases, the number of halves (½) increases. The number of halves is given by:
count(½) = 2(𝑖 + mod(score, 1))
In other words, it's 2𝑖 + 1 if score has a fractional part, or 2𝑖 otherwise.
As 𝑖 increases, the number of zeroes decreases, given by:
count(0) = size − ⌈score⌉ − 𝑖
With these four facts in mind we can generate any of the 𝑛 possible outputs at random by picking a random 𝑖 where 0 ≤ 𝑖 < 𝑛:
𝑖 = random( [0..𝑛) )
These facts are easy to translate into Ruby code:
n = [score.floor, size - score.ceil].min + 1
i = rand(n)
num_ones = score.floor - i
num_halves = 2 * (i + score % 1)
num_zeroes = (size - score.ceil) - i
Now we just need to clean it up a bit and put it in a function that takes size and score as arguments, turns num_ones, num_halves, and num_zeroes into an array of 0s, 0.5s, and 1s, and shuffles the result:
def generate(size, score)
  init_ones = score.floor
  init_zeroes = size - score.ceil
  i = rand([init_ones, init_zeroes].min + 1)
  num_ones = init_ones - i
  num_halves = 2 * (i + score % 1)
  num_zeroes = init_zeroes - i
  [ *[1]*num_ones, *[0.5]*num_halves, *[0]*num_zeroes ].shuffle
end
generate(6, 3.5)
# => [0.5, 1, 0, 0.5, 0.5, 1]
You can see the result in action on repl.it: https://repl.it/#jrunning/UnpleasantDimpledLegacysystem (Note that when you run it on repl.it the output appears very slowly. This is only because repl.it executes Ruby code on the server and streams the result back to the browser.)

If I get the point, one possible (brute force) option could be:
size = 12
sum = 11
tmp = Array.new(12) { 1 }
loop do
  raise 'not possible' if tmp.sum < sum
  tmp[tmp.index(1)] = 0.5 if tmp.index(1)
  unless tmp.index(1)
    tmp[tmp.index(0.5)] = 0
  end
  break if tmp.sum == sum
end
tmp #=> [0.5, 0.5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
tmp.sum #=> 11.0

Related

Algorithm to minimize vertical sum of binned rows

I have a problem that I am trying to find an algorithm for. I'd appreciate any suggestions.
I have n rows of length m. The rows have binary (0,1) values that have some expected sum. The 1's can be placed anywhere in their row as long as the row sum is as expected. The goal is to minimize the vertical sum of each of the m columns.
For example, if we have 4 rows and 10 columns, where the expected sums are:
Row 1: 6
Row 2: 7
Row 3: 4
Row 4: 5
A potential solution could be:
1 1 1 1 1 1 0 0 0 0
1 1 1 0 0 0 1 1 1 1
1 0 0 1 1 1 0 0 0 0
0 1 0 0 0 0 1 1 1 1
Getting vertical sums of:
3 3 2 2 2 2 2 2 2 2
As opposed to the larger sums we would get if we just placed all the ones at the beginning:
1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 0 0 0
1 1 1 1 0 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0
With sums of:
4 4 4 4 3 2 1 0 0 0
So, I'm trying to spread out the load. My number of rows will get into the millions/billions, so I'm hoping for a linear algebra approach rather than iterative. Thanks!
def create_min_vert_matrix(num_r, num_c, arr_sum_rows):
    res = []
    curr = 0
    for r in range(num_r):
        row = [0]*num_c
        while arr_sum_rows[r] > 0:
            arr_sum_rows[r] -= 1
            row[curr] = 1
            curr = (curr+1)%num_c
        res.append(row)
    return res
A quick test:
create_min_vert_matrix(4, 10, [6,7,4,5])
[[1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 1, 1, 1, 1],
[0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 1, 1]]
The function takes the number of rows, num_r, the number of columns, num_c, and an array that tells what the sum of the ones in each row has to be (arr_sum_rows).
The idea is that if we distribute the ones column-wise as evenly as possible, we minimize the column sums. To make this happen, we keep track of the last column where we inserted a one, curr, and increment it for each inserted one, wrapping around to the first column with curr = (curr+1)%num_c.
The algorithm runs in O(n*m), where n and m are the number of rows and columns of the matrix, and uses O(1) extra space if we don't count the auxiliary space needed for the result (otherwise also O(n*m) extra space, of course).
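As a quick sanity check (a small sketch I added, not part of the original answer), transposing the returned matrix with zip shows how balanced the column sums come out for the example above:

res = create_min_vert_matrix(4, 10, [6, 7, 4, 5])
col_sums = [sum(col) for col in zip(*res)]   # vertical sum of each column
print(col_sums)                              # [3, 3, 2, 2, 2, 2, 2, 2, 2, 2]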

Image Quantization with quantums Algorithm question

I came across a question and was unable to find a feasible solution.
Image Quantization
Given a grayscale image, where each pixel's color ranges from 0 to 255, compress the range of values to a given number of quantum values.
The goal is to do that with the minimum sum of costs needed, the cost of a pixel is defined as the absolute difference between its color and the closest quantum value for it.
Example
There are 3 rows and 3 columns, image = [[7, 2, 8], [8, 2, 3], [9, 8, 255]], quantums = 3 (the number of quantum values). The optimal quantum values are (2, 8, 255), leading to the minimum sum of costs |7-8| + |2-2| + |8-8| + |8-8| + |2-2| + |3-2| + |9-8| + |8-8| + |255-255| = 1+0+0+0+0+1+1+0+0 = 3.
Function description
Complete the solve function provided in the editor. This function takes the following 4 parameters and returns the minimum sum of costs.
n Represents the number of rows in the image
m Represents the number of columns in the image
image Represents the image
quantums Represents the number of quantum values.
Output:
Print a single integer, the minimum sum of costs.
Constraints:
1 <= n, m <= 100
0 <= image[i][j] <= 255
1 <= quantums <= 256
Sample Input 1
3
3
7 2 8
8 2 3
9 8 255
10
Sample output 1
0
Explanation
The optimum quantum values are {0, 1, 2, 3, 4, 5, 7, 8, 9, 255}, leading to the minimum sum of costs |7-7| + |2-2| + |8-8| + |8-8| + |2-2| + |3-3| + |9-9| + |8-8| + |255-255| = 0+0+0+0+0+0+0+0+0 = 0.
Can anyone help me reach the solution?
Clearly, if we have at least as many quantums available as distinct pixel values, we can return 0, since we can assign a quantum to each distinct pixel value. Now consider placing a single quantum at the lowest value of the sorted, grouped pixel list.
M = [
    [7, 2, 8],
    [8, 2, 3],
    [9, 8, 255]
]
The sorted, grouped list of (pixel value, count) pairs is:
[(2, 2), (3, 1), (7, 1), (8, 3), (9, 1), (255, 1)]
With the single quantum at 2, we record the required sum of differences (one term per pixel):
0 + 0 + 1 + 5 + 6 + 6 + 6 + 7 + 253 = 284
Now, to update when incrementing the quantum by 1, we observe that each affected element moves by exactly 1, so all we need is the count of affected elements. Increment the quantum from 2 to 3:
1 + 1 + 0 + 4 + 5 + 5 + 5 + 6 + 252 = 279
or
284 + 2 * 1 - 7 * 1
  = 284 + 2 - 7
  = 279
since the 2 pixels at or below the old quantum each move 1 farther away, and the 7 pixels at or above the new quantum each move 1 closer.
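To make the grouped list and the incremental update concrete, here is a small Python sketch (my own illustration, using collections.Counter; it is not part of the solution code further below):

from collections import Counter

M = [[7, 2, 8], [8, 2, 3], [9, 8, 255]]

# Sorted, grouped (pixel value, count) list.
grouped = sorted(Counter(v for row in M for v in row).items())
# -> [(2, 2), (3, 1), (7, 1), (8, 3), (9, 1), (255, 1)]

def cost_single_quantum(grouped, q):
    # Sum of |pixel - q| over all pixels, using the grouped counts.
    return sum(abs(v - q) * c for v, c in grouped)

cost_at_2 = cost_single_quantum(grouped, 2)            # 284
moved_away = sum(c for v, c in grouped if v <= 2)      # 2 pixels get farther
moved_closer = sum(c for v, c in grouped if v >= 3)    # 7 pixels get closer
assert cost_single_quantum(grouped, 3) == cost_at_2 + moved_away - moved_closer  # 279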
Consider traversing from the left with a single quantum, calculating only the effect on pixels in the sorted, grouped list that are on the left side of the quantum value.
To only update the left side when adding a quantum, we have:
left[k][q] = min over p < q of ( left[k-1][p] + effect(A, p, q) )
where effect is the effect on the elements in A (the sorted, grouped list) as we reduce p incrementally, updating the effect on the pixels in the range [p, q] according to whether they are closer to p or to q. As we increase q for each round of k, we can keep our place in the sorted, grouped pixel list with a pointer that moves incrementally.
If we have a solution for
left[k][q]
where it is the best for pixels on the left side of q when including k quantums with the rightmost quantum set as the number q, then the complete candidate solution would be given by:
left[k][q] + effect(A, q, list_end)
where there is no quantum between q and list_end
Time complexity would be O(n + k * q * q) = O(n + quantums ^ 3), where n is the number of elements in the input matrix.
Python code:
def f(M, quantums):
    pixel_freq = [0] * 256
    for row in M:
        for colour in row:
            pixel_freq[colour] += 1

    # dp[k][q] stores the best solution up to the qth quantum
    # value, considering the effect, to the left of q, of
    # k quantums with the rightmost placed at q.
    dp = [[0] * 256 for _ in range(quantums + 1)]

    pixel_count = pixel_freq[0]
    for q in range(1, 256):
        dp[1][q] = dp[1][q-1] + pixel_count
        pixel_count += pixel_freq[q]

    predecessor = [[None] * 256 for _ in range(quantums + 1)]

    # Main iteration, where the full candidate includes both
    # right and left effects while incrementing the number
    # of quantums.
    for k in range(2, quantums + 1):
        for q in range(k - 1, 256):
            # Adding a quantum to the right of the rightmost
            # doesn't change the left cost already calculated
            # for the rightmost.
            best_left = dp[k-1][q-1]
            predecessor[k][q] = q - 1

            q_effect = 0
            p_effect = 0
            p_count = 0

            for p in range(q - 2, k - 3, -1):
                r_idx = p + (q - p) // 2
                # When the distance between p and q is even,
                # we reassign one pixel frequency to q.
                if (q - p - 1) % 2 == 0:
                    r_freq = pixel_freq[r_idx + 1]
                    q_effect += (q - r_idx - 1) * r_freq
                    p_count -= r_freq
                    p_effect -= r_freq * (r_idx - p)

                # Either way, we add one pixel frequency
                # to p_count and recalculate.
                p_count += pixel_freq[p + 1]
                p_effect += p_count

                effect = dp[k-1][p] + p_effect + q_effect
                if effect < best_left:
                    best_left = effect
                    predecessor[k][q] = p

            dp[k][q] = best_left

    # Records the cost only on the right of the rightmost
    # quantum for candidate solutions.
    right_side_effect = 0
    pixel_count = pixel_freq[255]
    best = dp[quantums][255]
    best_quantum = 255

    for q in range(254, quantums-1, -1):
        right_side_effect += pixel_count
        pixel_count += pixel_freq[q]
        candidate = dp[quantums][q] + right_side_effect
        if candidate < best:
            best = candidate
            best_quantum = q

    # Reconstruct the chosen quantum values from the predecessors.
    quantum_list = [best_quantum]
    prev_quantum = best_quantum
    for i in range(quantums, 1, -1):
        prev_quantum = predecessor[i][prev_quantum]
        quantum_list.append(prev_quantum)

    return best, list(reversed(quantum_list))
Output:
M = [
    [7, 2, 8],
    [8, 2, 3],
    [9, 8, 255]
]
k = 3
print(f(M, k))   # (3, [2, 8, 255])

M = [
    [7, 2, 8],
    [8, 2, 3],
    [9, 8, 255]
]
k = 10
print(f(M, k))   # (0, [2, 3, 7, 8, 9, 251, 252, 253, 254, 255])
I would propose the following:
step 0
Input is:
image = 7   2   8
        8   2   3
        9   8 255
quantums = 3
step 1
Then you can calculate a histogram of the input image. Since your image is grayscale, it can contain only values from 0 to 255.
That means your histogram array has length 256.
hist = int[256]                  // init the histogram array
for each pixel color in image    // iterate over image
    hist[color]++                // and increment histogram values
hist:
value   0  0  2  1  0  0  0  1  3  1  0 . . .   1
       ----------------------------------------------
color   0  1  2  3  4  5  6  7  8  9  10 . . . 255
How to read the histogram:
color 3 has 1 occurrence
color 8 has 3 occurrences
With this approach, we have reduced our problem from N (the number of pixels) to 256 (the histogram size).
Time complexity of this step is O(N).
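For reference, a direct Python version of this step (my sketch, assuming the image is given as a list of rows):

def build_histogram(image):
    hist = [0] * 256              # one bucket per possible gray value
    for row in image:
        for color in row:
            hist[color] += 1      # count occurrences of each color
    return hist

image = [[7, 2, 8], [8, 2, 3], [9, 8, 255]]
hist = build_histogram(image)     # hist[2] == 2, hist[8] == 3, hist[255] == 1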
step 2
Once we have the histogram in place, we can calculate its quantums largest local maximums. In our case, we want 3 local maximums.
For the sake of simplicity, I will not write the pseudo code; there are numerous examples on the internet. Just google 'find local maximum/extrema in array'.
It is important that you end up with the 3 biggest local maximums. In our case they are:
hist:
value   0  0  2  1  0  0  0  1  3  1  0 . . .   1
       ----------------------------------------------
color   0  1  2  3  4  5  6  7  8  9  10 . . . 255
              ^                 ^               ^
These values (2, 8, 255) are your tops of the mountains.
Time complexity of this step is O(quantums)
I could explain why it is not O(1) or O(256), since you can find local maximums in a single pass. If needed I will add a comment.
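One possible way to code this step (a sketch of my own; plateaus and ties would need more care in a real implementation):

def top_local_maxima(hist, k):
    # A local maximum is a non-empty bucket that is not smaller than its neighbours.
    maxima = []
    for c in range(256):
        left = hist[c - 1] if c > 0 else -1
        right = hist[c + 1] if c < 255 else -1
        if hist[c] > 0 and hist[c] >= left and hist[c] >= right:
            maxima.append((hist[c], c))
    maxima.sort(reverse=True)                 # biggest peaks first
    return sorted(c for _, c in maxima[:k])   # their colors, ascending

# top_local_maxima(hist, 3) -> [2, 8, 255] for the example histogram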
step 3
Once you have your tops of the mountains, you want to isolate each mountain in a way that it has the maximum possible surface.
So, you will do that by finding the minimum value between two adjacent tops.
In our case it is:
value   0  0  2  1  0  0  0  1  3  1  0 . . .   1
       ----------------------------------------------
color   0  1  2  3  4  5  6  7  8  9  10 . . . 255
(the valleys we are looking for are the zero buckets between the peaks at 2, 8 and 255)
So our goal is to find the minimums between these index values:
from 0 to 2 (not needed, the first mountain starts from the beginning)
from 2 to 8 (to see where the first mountain ends, and the second one starts)
from 8 to 255 (to see where the second one ends, and the third starts)
from 255 to the end (also not needed, the last mountain always reaches the end)
There are multiple candidates (multiple zeros), and it is not important which one you choose as the minimum. The final surface of each mountain is always the same.
Let's say that our algorithm returns two minimums. We will use them in the next step.
min_1_2 = 6
min_2_3 = 254
Time complexity of this step is O(256). You need just a single pass over the histogram to calculate all minimums (actually you will do multiple smaller iterations, but in total you visit each element only once).
Someone could consider this as O(1).
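A sketch of this step in Python (my own illustration; any zero bucket inside a gap is an equally valid split point):

def valleys_between_peaks(hist, peaks):
    # For each pair of adjacent peaks, pick a color with the smallest
    # histogram value strictly between them.
    valleys = []
    for left, right in zip(peaks, peaks[1:]):
        valleys.append(min(range(left + 1, right), key=lambda c: hist[c]))
    return valleys

# valleys_between_peaks(hist, [2, 8, 255]) returns the first minimum in
# each gap (e.g. [4, 10] here); 6 and 254 from the text are just as valid.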
step 4
Calculate the median of each mountain.
This can be the tricky one. Why? Because we want to calculate the median using the original values (colors) and not the counters (occurrences).
There is also a formula that can give us a good estimate, and it can be computed quite fast (looking only at the histogram values) (https://medium.com/analytics-vidhya/descriptive-statistics-iii-c36ecb06a9ae).
If that is not precise enough, the only option is to "unwrap" the counted values back into raw pixels, sort them, and find the median directly.
In our case, those medians are 2, 8, 255.
Time complexity of this step is O(n log n) if we have to sort the whole original image. If the approximation works fine, then the time complexity of this step is practically constant.
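The weighted median can in fact be read straight off the histogram, without unwrapping the pixels. A sketch of my own (start and end are the mountain boundaries found in step 3):

def median_from_hist(hist, start, end):
    # Weighted median of the colors in hist[start..end].
    total = sum(hist[start:end + 1])
    running = 0
    for color in range(start, end + 1):
        running += hist[color]
        if running * 2 >= total:
            return color

# median_from_hist(hist, 0, 6)     -> 2
# median_from_hist(hist, 6, 254)   -> 8
# median_from_hist(hist, 254, 255) -> 255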
step 5
This is final step.
You now know the start and end of the "mountain".
You also know the median that belongs to that "mountain"
Again, you can iterate over each mountain and calculate the DIFF.
diff = 0
median_1 = 2
median_2 = 8
median_3 = 255

for each hist value (color, count) between START and END
    // for the first mountain  -> START = 0,   END = 6
    // for the second mountain -> START = 6,   END = 254
    // for the third mountain  -> START = 254, END = 255
    diff = diff + |color - median_X| * count
Time complexity of this step is again O(256), and it can be considered as constant time O(1)
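Putting the last step into Python (again my own sketch; the boundaries and medians are the ones from the example, and the shared boundary colors are empty buckets here, so nothing is double-counted):

def total_cost(hist, boundaries, medians):
    diff = 0
    for (start, end), median in zip(boundaries, medians):
        for color in range(start, end + 1):
            diff += abs(color - median) * hist[color]
    return diff

# total_cost(hist, [(0, 6), (6, 254), (254, 255)], [2, 8, 255]) -> 3,
# matching the expected answer for quantums = 3.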

Kth element in transformed array

I came across this question in a recent interview:
Given an array A of length N, we are supposed to answer Q queries. The query form is as follows:
Given x and k, we need to make another array B of the same length such that B[i] = A[i] ^ x, where ^ is the XOR operator. Sort B in descending order and return B[k].
Input format:
First line contains integer N
Second line contains N integers denoting array A
Third line contains Q i.e. number of queries
Next Q lines contains space-separated integers x and k
Output format:
Print respective B[k] value each on new line for Q queries.
e.g.
for input :
5
1 2 3 4 5
2
2 3
0 1
output will be :
3
5
For the first query,
A = [1, 2, 3, 4, 5]
For query x = 2 and k = 3, B = [1^2, 2^2, 3^2, 4^2, 5^2] = [3, 0, 1, 6, 7]. Sorting in descending order B = [7, 6, 3, 1, 0]. So, B[3] = 3.
For the second query,
A and B will be the same since x = 0. So, B[1] = 5.
I have no idea how to solve such problems. Thanks in advance.
This is solvable in O(N + Q). For simplicity I assume you are dealing with positive or unsigned values only, but you can probably adjust this algorithm also for negative numbers.
First you build a binary tree. The left edge stands for a bit that is 0, the right edge for a bit that is 1. In each node you store how many numbers are in this bucket. This can be done in O(N), because the number of bits is constant.
Because this is a little bit hard to explain, I'm going to show what the tree looks like for the 3-bit numbers [0, 1, 4, 5, 7], i.e. [000, 001, 100, 101, 111]:
               *
             /   \
           2       3        2 numbers have first bit 0 and 3 numbers have first bit 1
          / \     / \
         2   0   2   1      of the 2 numbers with first bit 0, 2 have their 2nd bit 0, ...
        / \     / \ / \
       1   1   1 1  0 1     of the 2 numbers with 1st and 2nd bit 0, 1 has its 3rd bit 0, ...
To answer a single query you go down the tree by using the bits of x. At each node you have 4 possibilities, looking at bit b of x and building answer a, which is initially 0:
b = 0 and k < the value stored in the left child of the current node (the 0-bit branch): current node becomes left child, a = 2 * a (shifting left by 1)
b = 0 and k >= the value stored in the left child: current node becomes right child, k = k - value of left child, a = 2 * a + 1
b = 1 and k < the value stored in the right child (the 1-bit branch, because of the xor operation everything is flipped): current node becomes right child, a = 2 * a
b = 1 and k >= the value stored in the right child: current node becomes left child, k = k - value of right child, a = 2 * a + 1
This is O(1), again because the number of bits is constant. Therefore the overall complexity is O(N + Q).
Example: [0, 1, 4, 5, 7] i.e. [000, 001, 100, 101, 111], k = 3, x = 3 i.e. 011
First bit is 0 and k >= 2, therefore we go right, k = k - 2 = 3 - 2 = 1 and a = 2 * a + 1 = 2 * 0 + 1 = 1.
Second bit is 1 and k >= 1, therefore we go left (inverted because the bit is 1), k = k - 1 = 0, a = 2 * a + 1 = 3
Third bit is 1 and k < 1, so the solution is a = 2 * a + 0 = 6
Control: [000, 001, 100, 101, 111] xor 011 = [011, 010, 111, 110, 100], i.e. [3, 2, 7, 6, 4], and in ascending order [2, 3, 4, 6, 7], so indeed the number at index 3 is 6, which is the solution (always talking about 0-based indexing here).
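To make the traversal concrete, here is a Python sketch of the whole idea (the function names and the per-level dictionary representation of the trie are my own; it returns the k-th smallest element of the xored array with 0-based k, matching the worked example above):

def build_counts(A, bits):
    # counts[level][prefix] = how many numbers in A start with that
    # bit prefix of the given length (a flattened binary trie).
    counts = [dict() for _ in range(bits + 1)]
    for a in A:
        prefix = 0
        for level in range(bits + 1):
            counts[level][prefix] = counts[level].get(prefix, 0) + 1
            if level < bits:
                prefix = (prefix << 1) | ((a >> (bits - 1 - level)) & 1)
    return counts

def kth_of_xored(counts, bits, x, k):
    # k-th smallest (0-based) of [a ^ x for a in A], without building it.
    prefix, answer = 0, 0
    for level in range(bits):
        b = (x >> (bits - 1 - level)) & 1
        # The child holding the smaller xored values is the one whose
        # original bit equals b (because bit ^ b == 0 there).
        smaller = counts[level + 1].get((prefix << 1) | b, 0)
        if k < smaller:
            prefix = (prefix << 1) | b
            answer = answer << 1
        else:
            k -= smaller
            prefix = (prefix << 1) | (1 - b)
            answer = (answer << 1) | 1
    return answer

counts = build_counts([0, 1, 4, 5, 7], 3)   # built once, reused per query
print(kth_of_xored(counts, 3, 3, 3))        # -> 6, as in the example above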

Algorithmic puzzle for calculating the number of combinations of numbers summing to a fixed result

This is a puzzle I have been thinking about since last night. I have come up with a solution, but it's not efficient, so I want to see if there is a better idea.
The puzzle is this:
given positive integers N and T, you will need to have:
for i in [1, T], A[i] from { -1, 0, 1 }, such that SUM(A) == N
additionally, the prefix sums of A shall stay in [0, N], and once the prefix sum PSUM[A, t] == N, it's necessary that A[i] == 0 for all i in [t + 1, T]
here the prefix sum PSUM is defined to be: PSUM[A, t] = SUM(A[i] for i in [1, t])
the puzzle asks how many such A's exist given fixed N and T
for example, when N = 2, T = 4, following As work:
1 1 0 0
1 -1 1 1
0 1 1 0
but following don't:
-1 1 1 1 # prefix sum -1
1 1 -1 1 # non-0 following a prefix sum == N
1 1 1 -1 # prefix sum > N
The following Python code can verify this rule, given N as expect and an instance of A as seq (some people may find it easier to read code than a literal description):
def verify(expect, seq):
    s = 0
    for j, i in enumerate(seq):
        s += i
        if s < 0:
            return False
        if s == expect:
            break
    else:
        return s == expect
    for k in range(j + 1, len(seq)):
        if seq[k] != 0:
            return False
    return True
I have coded up my solution, but it's too slow. Here is mine:
I decompose the problem into two parts: a part without -1 in it (only {0, 1}) and a part with -1.
So if SOLVE(N, T) is the correct answer, I define a function SOLVE'(N, T, B), where a positive B allows me to extend the prefix sum range to [-B, N] instead of [0, N];
so in fact SOLVE(N, T) == SOLVE'(N, T, 0).
I soon realized the solution is actually:
have the prefix of A be some valid {0, 1} combination with positive length l, and with o 1s in it
at position l + 1, start to add 1 or more -1s and use B to track that number; the maximum will be B + o or the number of slots remaining in A, whichever is less
recursively call SOLVE'(N, T, B)
In the previous N = 2, T = 4 example, in one of the search cases I will do:
let the prefix of A be [1], so we have A = [1, -, -, -]
start adding -1s; here I will add only one: A = [1, -1, -, -]
recursively call SOLVE'; here I will call SOLVE'(2, 2, 0) to solve the last two spots, which returns only [1, 1], so this combination yields [1, -1, 1, 1]
But this algorithm is too slow.
I am wondering how I can optimize it, or whether there is a different way to look at this problem that can boost the performance. (I just need the idea, not an implementation.)
EDIT:
some samples will be:
T   N   SOLVE(N, T)
3 2 3
4 2 7
5 2 15
6 2 31
7 2 63
8 2 127
9 2 255
10 2 511
11 2 1023
12 2 2047
13 2 4095
3 3 1
4 3 4
5 3 12
6 3 32
7 3 81
8 3 200
9 3 488
10 3 1184
11 3 2865
12 3 6924
13 3 16724
4 4 1
5 4 5
6 4 18
An exponential-time solution in general will be the following (in Python):
import itertools
choices = [-1, 0, 1]
print(len([l for l in itertools.product(*([choices] * t)) if verify(n, l)]))
An observation: assuming that n is at least 1, every solution to your stated problem ends in something of the form [1, 0, ..., 0]: i.e., a single 1 followed by zero or more 0s. The portion of the solution prior to that point is a walk that lies entirely in [0, n-1], starts at 0, ends at n-1, and takes fewer than t steps.
Therefore you can reduce your original problem to a slightly simpler one, namely that of determining how many t-step walks there are in [0, n] that start at 0 and end at n (where each step can be 0, +1 or -1, as before).
The following code solves the simpler problem. It uses the lru_cache decorator to cache intermediate results; this is in the standard library in Python 3, or there's a recipe you can download for Python 2.
from functools import lru_cache

@lru_cache()
def walks(k, n, t):
    """
    Return the number of length-t walks in [0, n]
    that start at 0 and end at k. Each step
    in the walk adds -1, 0 or 1 to the current total.
    Inputs should satisfy 0 <= k <= n and 0 <= t.
    """
    if t == 0:
        # If no steps are allowed, we can only get to 0,
        # and then only in one way.
        return k == 0
    else:
        # Count the walks whose last step is 0.
        total = walks(k, n, t-1)
        if 0 < k:
            # ... plus the walks whose last step is +1.
            total += walks(k-1, n, t-1)
        if k < n:
            # ... plus the walks whose last step is -1.
            total += walks(k+1, n, t-1)
        return total
Now we can use this function to solve your problem.
def solve(n, t):
    """
    Find the number of solutions to the original problem.
    """
    # All solutions stick at n once they get there.
    # Therefore it's enough to find all walks
    # that lie in [0, n-1] and take us to n-1 in
    # fewer than t steps.
    return sum(walks(n-1, n-1, i) for i in range(t))
Result and timings on my machine for solve(10, 100):
In [1]: solve(10, 100)
Out[1]: 250639233987229485923025924628548154758061157
In [2]: %timeit solve(10, 100)
1000 loops, best of 3: 964 µs per loop

How to get this matrix

I have this matrix:
S.No.         A      B
1       5268020   1756
2      15106230   5241
3      24298744   9591
4      23197375   9129
I want to get a matrix which will have two columns [X, Y]. X will take values from S.No. and Y can be either 1 or 0. For example, for 1 5268020 1756 there should be a total of 5268020 (1,0) pairs, i.e. (X,Y) pairs, and 1756 (1,1) pairs.
How can I get this matrix in Octave?
If I understand your question correctly, you want to fill a matrix with repeated entries (x,0) and (x,1), where x = 1...4, and where the repetition counts are determined by the values found in columns A and B. Given the values you supplied, that's going to be a huge matrix (67,896,086 rows). So, you could try something like this (replace m below, which has fewer elements for illustrative purposes):
m = [1, 2, 1;
     2, 3, 2;
     3, 2, 1;
     4, 2, 2];

res = [];
for k = 1:4
  res = [res ; [k*ones(m(k, 2), 1), zeros(m(k, 2), 1);
                k*ones(m(k, 3), 1), ones(m(k, 3), 1)]];
endfor
which yields
res =
1 0
1 0
1 1
2 0
2 0
2 0
2 1
2 1
3 0
3 0
3 1
4 0
4 0
4 1
4 1
Out of curiosity, is there any reason not to consider a matrix like
1 0 n
1 1 m
2 0 p
2 1 q
...
where n, m, p, q are values found in columns A and B. This would probably be easier to handle, no?
