Falling segment reunites with other segments with a probability, determine the expected medium length of the segment - algorithm

This is a really tough problem, just a heads-up.
We have N segments, numbered from 1 to N and defined by their left and right points, {Left[i],Right[i]}.
The i-th segment is at height N-i. The first segment (the highest one) starts falling while the others remain fixed. If during the fall a segment i intersects another segment j in at least one point, then the two will reunite with the probability P[j]/Q[j], and the obtained segment will keep falling. From the reunion of two segments, {A,B} and {C,D}, the obtained segment will be {min(A,C),max(B,D)}.
You are asked to determine the expected (mean) length of the first segment once it has finished falling (i.e., after it has reached a height smaller than the height of any of the other segments). If this answer is a rational number U/V, you are asked to determine X such that X*V ≡ U (mod 10^9+7).
Restrictions :
0 < P < Q < 1 000
0 < Left < Right < 1 000 000
N ≤ 100 000
time : 2.5 sec
memory : 32768 kbytes
The input contains N on the first line, then on the following N lines there are 4 integers : Left, Right, P, Q, representing the i-th segment [Left, Right] with a probability P/Q to reunite with the falling segment.
Example:
input:
5
35 64 58 873
41 70 407 729
18 90 165 628
10 57 33 104
60 69 152 466
output:
779316733
The answer is approximately 49.813963.

Idea 1
The length of the final segment is R-L where R is the location of the right end, and L is the location of the left end.
Expectation is a linear operation so
E(length) = E(R) - E(L)
We can compute E(R) and E(L) separately, then combine the results.
Idea 2
We can iteratively compute the PDF for the position of the left end.
It starts off being at the left end of the first segment (Left[1]) with probability 1.
When it falls past segment i, there will be an interesting collision if the left end is between Left[i] and Right[i]. We define an interesting collision to be one that affects the position of the left end.
The key point here is that if we need to know the current position of the right end to determine if there is a collision, then it is not an interesting collision! This is because if we need to know the right end, then the segment i must be completely to the right of the start point, and therefore it does not affect the position of the left edge.
So to update the PDF we collect up all the probability mass between Left[i] and Right[i], multiply it by the probability of collision P[i]/Q[i], and add the result to Left[i]. (The existing mass in those locations is scaled down by the probability of no reunion, 1 - P[i]/Q[i].)
Idea 3
At the moment we have an O(n^2) algorithm made of n iterations of O(n) to count and modify the mass in each range.
However, we can use a data structure such as a segment tree to perform each iteration in O(log n) time, for a total time complexity of O(n log n).
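A minimal sketch of Ideas 1 and 2 (the quadratic version, before the segment-tree speedup of Idea 3) may make the update rule more concrete. The function names `absorb`, `expected_length`, and `to_residue` are made up for this illustration, exact fractions are used instead of modular arithmetic during the sweep, and `pow(x, -1, MOD)` assumes Python 3.8 or later.

```python
from fractions import Fraction

MOD = 10**9 + 7

def absorb(mass, left, right, prob, target):
    """Move a share `prob` of the mass lying in [left, right] to `target`."""
    updated = {}
    for x, m in mass.items():
        if left <= x <= right:
            # Reunion happens with probability prob: the end jumps to target.
            updated[target] = updated.get(target, 0) + prob * m
            # Otherwise the end stays where it is.
            updated[x] = updated.get(x, 0) + (1 - prob) * m
        else:
            updated[x] = updated.get(x, 0) + m
    return updated

def expected_length(segments):
    """segments: list of (left, right, p, q); the first one is the falling segment."""
    left0, right0, _, _ = segments[0]
    left_mass = {left0: Fraction(1)}    # distribution of the left end
    right_mass = {right0: Fraction(1)}  # distribution of the right end
    for left, right, p, q in segments[1:]:
        prob = Fraction(p, q)
        left_mass = absorb(left_mass, left, right, prob, target=left)
        right_mass = absorb(right_mass, left, right, prob, target=right)
    e_left = sum(x * m for x, m in left_mass.items())
    e_right = sum(x * m for x, m in right_mass.items())
    return e_right - e_left             # E(length) = E(R) - E(L)

def to_residue(frac):
    """Return X with X * V = U (mod MOD) for frac = U/V."""
    return frac.numerator * pow(frac.denominator, -1, MOD) % MOD

sample = [(35, 64, 58, 873), (41, 70, 407, 729), (18, 90, 165, 628),
          (10, 57, 33, 104), (60, 69, 152, 466)]
print(float(expected_length(sample)))
```

On the sample input this gives roughly 49.81, in line with the approximate answer quoted in the question. Each segment touches every stored position, so this is the O(n^2) variant; the segment tree of Idea 3 would replace the dictionary scan.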

Related

Find the number of all possible path in a grid, from (0, 0) to (n, n)

I don't know how to find the number of all possible paths in a grid, from a point A to a point B.
The point A is on (0,0) and the point B is on (n,n).
A can move up, down, right, and left, and can't move on visited points.
While moving, A must stay inside the grid: A = (x, y) with 0 <= x <= n and 0 <= y <= n.
You can solve this problem with recursive backtracking, but there's another approach which I think is more interesting.
If we work out the first few cases by hand we find that:
A 1x1 square has 1 path
A 2x2 square has 2 paths
A 3x3 square has 12 paths
If we then go to OEIS (the Online Encyclopedia of Integer Sequences) and put in the search phrase "1,2,12 paths", the very first result is A007764 which is entitled "Number of nonintersecting (or self-avoiding) rook paths joining opposite corners of an n X n grid".
Knowing what integer sequence you're looking for unlocks significant mathematical resources, including source code to generate the sequence, related sequences, and best-known values.
The known values of the sequence are:
1 1
2 2
3 12
4 184
5 8512
6 1262816
7 575780564
8 789360053252
9 3266598486981642
10 41044208702632496804
11 1568758030464750013214100
12 182413291514248049241470885236
13 64528039343270018963357185158482118
14 69450664761521361664274701548907358996488
15 227449714676812739631826459327989863387613323440
16 2266745568862672746374567396713098934866324885408319028
17 68745445609149931587631563132489232824587945968099457285419306
18 6344814611237963971310297540795524400449443986866480693646369387855336
19 1782112840842065129893384946652325275167838065704767655931452474605826692782532
20 1523344971704879993080742810319229690899454255323294555776029866737355060592877569255844
21 3962892199823037560207299517133362502106339705739463771515237113377010682364035706704472064940398
22 31374751050137102720420538137382214513103312193698723653061351991346433379389385793965576992246021316463868
23 755970286667345339661519123315222619353103732072409481167391410479517925792743631234987038883317634987271171404439792
24 55435429355237477009914318489061437930690379970964331332556958646484008407334885544566386924020875711242060085408513482933945720
25 12371712231207064758338744862673570832373041989012943539678727080484951695515930485641394550792153037191858028212512280926600304581386791094
26 8402974857881133471007083745436809127296054293775383549824742623937028497898215256929178577083970960121625602506027316549718402106494049978375604247408
27 17369931586279272931175440421236498900372229588288140604663703720910342413276134762789218193498006107082296223143380491348290026721931129627708738890853908108906396
You can generate the first few terms yourself on paper or via recursive backtracking, per the other answer.
I would suggest solving this with naive recursion.
Keep a set visited of places that you have visited. In pseudo-code that is deliberately not any particular language:
function recursive_call(i, j, visited=none)
    if visited is none then
        visited = set()
    end if
    if i = n and j = n then
        return 1
    else if (i, j) in visited or (i, j) not in grid then
        return 0
    else
        total = 0
        add (i, j) to visited
        for direction in directions:
            (new_i, new_j) = move(i, j, direction)
            total += recursive_call(new_i, new_j, visited)
        end for
        remove (i, j) from visited
        return total
    end if
end function
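For readers who want something runnable, the pseudo-code above could be written in Python roughly as follows. This is only a sketch: the name `count_paths` and the convention that the grid points run from (0, 0) to (n, n) inclusive are assumptions taken from the question, and the running time grows very quickly with n.

```python
def count_paths(n):
    """Count self-avoiding paths from (0, 0) to (n, n) on the grid 0..n x 0..n."""
    visited = set()
    directions = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def walk(i, j):
        # Leaving the grid or revisiting a point ends this branch.
        if not (0 <= i <= n and 0 <= j <= n) or (i, j) in visited:
            return 0
        if (i, j) == (n, n):
            return 1
        visited.add((i, j))
        total = 0
        for di, dj in directions:
            total += walk(i + di, j + dj)
        visited.remove((i, j))  # backtrack
        return total

    return walk(0, 0)

print([count_paths(n) for n in range(4)])
```

With this convention, n = 0, 1, 2, 3 correspond to the 1x1, 2x2, 3x3 and 4x4 entries of the OEIS table.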

Number of occurrences of 2 as a digit in numbers from 0 to n , Not getting the O(n) solution?

This is the GFG link.
In this link, I am not able to get any intuition for how we are calculating the number of 2s as a digit.
My doubt is: if we are counting 6000 digits in the range, as explained in the description below, then why do we simply divide the number by 10 and return it? If anyone can help me, please post your answer with examples.
Case digits < 2
Consider the value x = 61523 and digit at index d = 3 (here indexes are considered from right and rightmost index is 0). We observe that x[d] = 1. There are 2s at the 3rd digit in the ranges 2000 – 2999, 12000 – 12999, 22000 – 22999, 32000 – 32999, 42000 – 42999, and 52000 – 52999. So there are 6000 2’s total in the 3rd digit. This is the same amount as if we were just counting all the 2s in the 3rd digit between 1 and 60000.
In other words, we can round down to the nearest 10^(d+1), and then divide by 10, to compute the number of 2s in the d-th digit.
if x[d] < 2: count2sinRangeAtDigit(x, d) =
    Compute y = round down to nearest 10^(d+1)
    return y/10
Case digit > 2
Now, let’s look at the case where d-th digit (from right) of x is greater than 2 (x[d] > 2). We can apply almost the exact same logic to see that there are the same number of 2s in the 3rd digit in the range 0 – 63525 as there are in the range 0 – 70000. So, rather than rounding down, we round up.
if x[d] > 2: count2sinRangeAtDigit(x, d) =
    Compute y = round up to nearest 10^(d+1)
    return y / 10
Case digit = 2
The final case may be the trickiest, but it follows from the earlier logic. Consider x = 62523 and d = 3. We know that there are the same ranges of 2s from before (that is, the ranges 2000 – 2999, 12000 – 12999, … , 52000 – 52999). How many appear in the 3rd digit in the final, partial range from 62000 – 62523? Well, that should be pretty easy. It’s just 524 (62000, 62001, … , 62523).
if x[d] = 2: count2sinRangeAtDigit(x, d) =
    Compute y = round down to nearest 10^(d+1)
    Compute z = right side of x (i.e., x % 10^d)
    return y/10 + z + 1 // why are we doing this here? What is the logic behind this approach?
The explanation given above is not completely clear, which is why I am asking here. Thank you.
For me that explanation is strange too. Also note that the true complexity is O(log(n)), because it depends on the number's length (digit count).
Consider the next example: we have number 6125.
At the first round we need to calculate how many 2's are met as the rightmost digit in all numbers from 0 to 6125. We round number down to 6120 and up to 6130. Last digit is 5>2, so we have 613 intervals, every interval contains one digit 2 as the last digit - here we count last 2's in numbers like 2,12,22,..1352,..,6122.
At the second round we need to calculate how many 2's are met as the second (from right) digit in all numbers from 0 to 6125. We round number down to 6100 and up to 6200. Also we have right=5. Digit is 2, so we have 61 intervals, every interval contains ten digits 2 at the second place (20..29, 120..129... 6020..6029). We add 61*10. Also we have to add 5+1 2's for values 6120..6125
At the third round we need to calculate how many 2's are met as the third (from right) digit in all numbers from 0 to 6125. We round number down to 6000 and up to 7000. Digit is 1, so we have 6 intervals, every interval contains one hundred of digit 2 at the third place (200..299, ..., 5200..5299). So add 6*100.
I think it is clear now that we add 1 interval with thousand of 2's (2000..2999) as the leftmost digit (6>2)
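The three cases can be put together in a short Python sketch. The function names are invented here, and the brute-force counter is included only as a cross-check, not as part of the method.

```python
def count_twos_at_digit(x, d):
    """Number of 2s appearing at digit index d (0 = rightmost) in 0..x."""
    power = 10 ** d
    next_power = power * 10
    digit = (x // power) % 10
    round_down = x - x % next_power       # round down to a multiple of 10^(d+1)
    round_up = round_down + next_power    # round up to a multiple of 10^(d+1)
    right = x % power                     # the part of x to the right of digit d
    if digit < 2:
        return round_down // 10
    if digit > 2:
        return round_up // 10
    # digit == 2: the full blocks, plus the partial block ...2000 up to x
    return round_down // 10 + right + 1

def count_twos(n):
    """Total number of 2 digits written in all numbers 0..n."""
    return sum(count_twos_at_digit(n, d) for d in range(len(str(n))))

def count_twos_brute(n):
    return sum(str(i).count("2") for i in range(n + 1))

print(count_twos(6125))  # 613 + 616 + 600 + 1000
```

For 6125 the four rounds contribute 613, 610 + 6, 600 and 1000, matching the walk-through above. The "+ 1" in the last case is there because the partial block starts at right side 0, so it contains z + 1 numbers rather than z.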

Maximum path cost in matrix

Can anyone suggest an algorithm for finding the maximum path cost in an NxM matrix, starting from the top-left corner and ending at the bottom-right corner? Left, right, and down moves are allowed, and the matrix contains negative costs. A cell can be visited any number of times, and after a cell is visited its cost is replaced with 0.
Constraints
1 <= nxm <= 4x10^6
INPUT
4 5
1 2 3 -1 -2
-5 -8 -1 2 -150
1 2 3 -250 100
1 1 1 1 20
OUTPUT
37
Explanation is given in the image
Explanation of Output
Since you also have negative costs, use Bellman-Ford. Change the sign of all the costs (convert negative signs to positive and positive to negative), then find the shortest path; this path will be the longest because you have changed the signs.
If the costs never become negative, then use Dijkstra's shortest-path algorithm, but before that make all values negative; this will return the longest path with its cost.
Your matrix is a directed graph. In your image you are trying to find a path (max or min) from index (0,0) to (n-1,n-1).
You need these things to represent it as a graph.
You need a linkedlist and in each node you have a first_Node, second_Node,Cost to move from first node to second.
An array of linkedlist. In each array index you save a linkedlist.If for example there is a path from 0 to 5 and 0 to 1(it's an undirected graph) then your graph will look like this.
If you want a direct-graph then simply add in adj[0] = 5 and do not add in adj[5] = 0 , this means that there is path from 0 to 5 but not from 5 to zero.
Here the linked list represents only which nodes are connected, not their cost. You have to add an extra variable there which keeps the cost for each pair of nodes, and it will look like this.
Now instead of first linkedlist put this linkedlist in your array and you have a graph now to run shortest or longest path algorithm.
If you want an intelligent algorithm, then you can use A* with a heuristic; I guess Manhattan distance will be best.
If the cost of your edges is not negative, then use Dijkstra.
If the cost is negative, then use the Bellman-Ford algorithm.
You can always find the longest path by converting the minus signs to plus and plus to minus and then running a shortest-path algorithm. The path found will be the longest.
I answered this question, and as you said in the comments to look at point two: if that's a task, then the main idea of this assignment is to ensure monotonicity.
h stands for heuristic cost.
A stands for accumulated cost.
This says that for each node, h(A) <= h(B) + A(A,B). That is, if you want to move from A to B, then the cost should not be decreasing but increasing (can you do something with your values such that this property will hold?). Once you satisfy this condition, every node which A* chooses will be part of your path from source to goal, because this is the path with the shortest/longest value.
pathMax: you can enforce monotonicity. If there is a path from A to B such that f(S...AB) < f(S...A), then set f(S...AB) = Max(f(S...AB), f(S...A)), where S means source.
Since moving up is not allowed, paths always look like a set of horizontal intervals that share at least 1 position (for the down move). Answers can be characterized as, say
struct Answer {
    int layer[N][2]; // layer[i][0] and layer[i][1] represent interval start & end,
                     // with 0 <= layer[i][0] <= layer[i][1] < M,
                     // layer[0][0] = 0, layer[N-1][1] = M-1,
                     // and non-empty intersection of layers i and i+1
};
An alternative encoding is to note only layer widths and offsets to each other; but you would still have to make sure that the last layer includes the exit cell.
Assuming that you have a maxLayer routine that finds the highest-scoring interval in each layer (cost O(M) per layer), and that all such layers overlap, this would yield an optimal answer in O(N*M). However, it may be necessary to expand intervals to ensure that overlap occurs; and there may be multiple highest-scoring intervals in a given layer. At this point I would model the problem as a directed graph:
each layer has one node per score-maximizing horizontal continuous interval.
nodes from one layer are connected to nodes in the next layer according to the cost of expanding both intervals to achieve at least 1 overlap. If they already overlap, the cost is 0. Edge costs will always be zero or negative (otherwise, either source or target intervals could have scored higher by growing bigger). Add the (expanded) source-node interval value to the connection cost to get an "edge weight".
You can then run Dijkstra on this graph (negate edge weights so that the "longest path" is returned) to find the optimal path. Even better, since all paths pass once and only once through each layer, you only need to keep track of the best route to each node, and only need to build nodes and edges for the layer you are working on.
Implementation details ahead
to calculate maxLayer in O(M), use Kadane's Algorithm, modified to return all maximal intervals instead of only the first. Where the linked algorithm discards an interval and starts anew, you would instead keep a copy of that contender to use later.
given the sample input, the maximal intervals would look like this:
                         [0]
 1  2  3   -1   -2       [1 2 3]
-5 -8 -1    2 -150   =>  [2]
 1  2  3 -250  100       [1 2 3] [100]
 1  1  1    1   20       [1 1 1 1 20]
                         [0]
given those intervals, they would yield the following graph:
          (0)
           |   =>0
         (+6)
            \   -1=>5
             \
            (+2)
      =>7  /    \  -150=>-143
          /      \
       (+7)    (+100)
    =>12  \      /  =>-43
           \    /
           (+24)
             |   =>37
            (0)
when two edges converge on a single node (as for row 1 1 1 1 20), carry forward only the highest incoming value.
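As a starting point for maxLayer, here is a sketch of the standard Kadane scan that returns one highest-scoring interval per row. It is the plain algorithm, not the modified variant described above that keeps every contender; the name `max_interval` and the tuple layout of the result are assumptions made for this illustration.

```python
def max_interval(row):
    """Return (best_sum, start, end) of a maximum-sum contiguous interval of row."""
    best_sum, best_start, best_end = row[0], 0, 0
    cur_sum, cur_start = row[0], 0
    for j in range(1, len(row)):
        if cur_sum < 0:
            # A negative running sum can only hurt: restart the interval at j.
            cur_sum, cur_start = row[j], j
        else:
            cur_sum += row[j]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, j
    return best_sum, best_start, best_end

for row in ([1, 2, 3, -1, -2], [-5, -8, -1, 2, -150],
            [1, 2, 3, -250, 100], [1, 1, 1, 1, 20]):
    print(max_interval(row))
```

To obtain all contenders as the answer suggests, one would record the interval that is being abandoned at each restart instead of discarding it.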
For each element in a row, find the maximum cost that can be obtained if we move horizontally across the row, given that we go through that element.
Eg. For the row
1 2 3 -1 -2
The maximum cost for each element obtained if we move horizontally given that we pass through that element will be
6 6 6 5 3
Explanation:
for element 3: we can move backwards horizontally touching 1 and 2. we will not move horizontally forward as the values -1 and -2, reduces the cost value.
So the maximum cost for 3 = 1 + 2 + 3 = 6
The maximum cost matrix for each of elements in a row if we move horizontally, for the input you have given in the description will be
6 6 6 5 3
-5 -7 1 2 -148
6 6 6 -144 100
24 24 24 24 24
Since we can move vertically from one row to the below row, update the maximum cost for each element as follows:
cost[i][j] = cost[i][j] + cost[i-1][j]
So the final cost matrix will be :
6 6 6 5 3
1 -1 7 7 -145
7 5 13 -137 -45
31 29 37 -113 -21
The maximum value in the last row of the above matrix gives the required output, i.e. 37.
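The steps of this answer can be reproduced with a few lines of Python. This is a sketch that mirrors the description literally and has been checked only against the sample; the function names are invented, and it makes no claim that the method is correct for every input.

```python
def best_through_each(row):
    """For each j, the best sum of a horizontal run of the row that includes j."""
    m = len(row)
    ending = [0] * m    # best run that ends exactly at j
    starting = [0] * m  # best run that starts exactly at j
    for j in range(m):
        ending[j] = row[j] + (max(0, ending[j - 1]) if j > 0 else 0)
    for j in range(m - 1, -1, -1):
        starting[j] = row[j] + (max(0, starting[j + 1]) if j < m - 1 else 0)
    # row[j] is counted in both runs, so subtract it once.
    return [ending[j] + starting[j] - row[j] for j in range(m)]

def max_path_cost(matrix):
    """Apply cost[i][j] += cost[i-1][j] row by row and take the last row's maximum."""
    cost = [best_through_each(row) for row in matrix]
    for i in range(1, len(cost)):
        cost[i] = [cost[i][j] + cost[i - 1][j] for j in range(len(cost[i]))]
    return max(cost[-1]), cost

sample = [[1, 2, 3, -1, -2],
          [-5, -8, -1, 2, -150],
          [1, 2, 3, -250, 100],
          [1, 1, 1, 1, 20]]
best, table = max_path_cost(sample)
print(best)
```

On the sample, `table` matches the final cost matrix shown above.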

The 1000th element which is product of 2, 3, 5

There is a sequence S.
All the elements in S are products of 2, 3, 5.
S = {2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24 ...}
How to get the 1000th element in this sequence efficiently?
I checked each number starting from 1, but this method is too slow.
A geometric approach:
Let s = 2^i . 3^j . 5^k, where the triple (i, j, k) belongs to the first octant of a 3D state space.
Taking the logarithm,
ln(s) = i.ln(2) + j.ln(3) + k.ln(5)
so that in the state space the iso-s surfaces are planes, which intersect the first octant along a triangle. On the other hand, the feasible solutions are the nodes of a square grid.
If one wants to produce the s-values in increasing order, one can keep a list of the grid nodes closest to the current s-plane*, on its "greater than" side.
If I am right, to move from one s-value to the next, it suffices to discard the current (i, j, k) and replace it by the three triples (i+1, j, k), (i, j+1, k) and (i, j, k+1), unless they are already there, and pick the next smallest s.
An efficient implementation will be by storing the list as a binary tree with the log(s)-value as the key.
If you are asking for the first N values, you will explore a pyramidal volume of state-space of height O(³√N), and base area O(³√N²), which is the number of tree nodes, hence the spatial complexity. Every query in the tree will take O(log(N)) comparisons (and O(1) operations to fetch the minimum), for a total of O(N.log(N)).
*More precisely, the list will contain all triples on the "greater than" side and such that no index can be decreased without getting on the other side of the plane.
Here is Python code that implements these ideas.
You will notice that the logarithms are converted to fixed point (7 decimals) to avoid floating-point inaccuracies that could result in the log(s)-values not being found equal. This causes the s values being inexact in the last digits, but this does not matter as long as the ordering of the values is preserved. Recomputing the s-values from the indexes yields exact values.
import math
import bintrees  # third-party package providing balanced binary trees

# Constants: logarithms in fixed point (7 decimals)
ln2 = round(10000000 * math.log(2))
ln3 = round(10000000 * math.log(3))
ln5 = round(10000000 * math.log(5))

# Initial list: key = scaled log(s), value = exponents (i, j, k)
t = bintrees.FastAVLTree()
t.insert(0, (0, 0, 0))

# Find the N first products
N = 100
for _ in range(N):
    # Current s: the smallest key in the tree
    key, (i, j, k) = t.pop_min()
    # Recompute the exact value from the exponents
    print(2 ** i * 3 ** j * 5 ** k)

    # Update the list with the three successors, unless already there
    if key + ln2 not in t:
        t.insert(key + ln2, (i + 1, j, k))
    if key + ln3 not in t:
        t.insert(key + ln3, (i, j + 1, k))
    if key + ln5 not in t:
        t.insert(key + ln5, (i, j, k + 1))
The 100 first values are
1 2 3 4 5 6 8 9 10 12
15 16 18 20 24 25 27 30 32 36
40 45 48 50 54 60 64 72 75 80
81 90 96 100 108 120 125 128 135 144
150 160 162 180 192 200 216 225 240 243
250 256 270 288 300 320 324 360 375 384
400 405 432 450 480 486 500 512 540 576
600 625 640 648 675 720 729 750 768 800
810 864 900 960 972 1000 1024 1080 1125 1152
1200 1215 1250 1280 1296 1350 1440 1458 1500 1536
The plot of the number of tree nodes confirms the O(³√N²) spatial behavior.
Update:
When there is no risk of overflow, a much simpler version (not using logarithms) is possible:
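import bintrees  # third-party package providing balanced binary trees

# Initial list: the keys are the products themselves
t = bintrees.FastAVLTree()
t[1] = None

# Find the N first products
N = 100
for _ in range(N):
    # Current s: the smallest key in the tree
    s, _value = t.pop_min()
    print(s)

    # Update the list (re-assigning an existing key is harmless)
    t[2 * s] = None
    t[3 * s] = None
    t[5 * s] = None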
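If installing bintrees is not an option, the same idea can be sketched with the standard library only, using a heap for the ordered list and a set to avoid duplicates. The name `first_products` is made up here; like the listing above, the output starts at 1, so the question's sequence S is this list without its first entry.

```python
import heapq

def first_products(n):
    """Return the n smallest numbers of the form 2^i * 3^j * 5^k, starting at 1."""
    heap = [1]
    seen = {1}
    result = []
    while len(result) < n:
        s = heapq.heappop(heap)       # current smallest candidate
        result.append(s)
        for factor in (2, 3, 5):
            candidate = s * factor
            if candidate not in seen:  # the heap alone would allow duplicates
                seen.add(candidate)
                heapq.heappush(heap, candidate)
    return result

print(first_products(20))
```

Python integers do not overflow, so no logarithms are needed in this variant.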
Simply put, you just have to generate each ith number consecutively. Let's call the set {2, 3, 5} Z. At the ith iteration, assume you have all (i-1) of the values generated in the previous iterations. To generate the next one, you try all the elements in Z and, for each of them, generate the least element it can form that is larger than the element generated at the (i-1)th iteration. Then you simply take the smallest one among them as the ith value. A simple and not so efficient implementation is given below.
def generate_simple(N, Z):
    generated = [1]
    for i in range(1, N+1):
        minFound = -1
        minElem = -1
        for j in range(0, len(Z)):
            for k in range(0, len(generated)):
                candidateVal = Z[j] * generated[k]
                if candidateVal > generated[-1]:
                    if minFound == -1 or minFound > candidateVal:
                        minFound = candidateVal
                        minElem = j
                    break
        generated.append(minFound)
    return generated[-1]
As you may observe, this approach has a time complexity of O(N^2 * |Z|). An improvement in terms of efficiency would be to store, for each element of Z, where we left off scanning the array of generated values, in a second array, indicesToStart. Then, for each element, we would scan the N values of the array generated only once over the whole run of the algorithm, which means the time complexity after such an improvement would be O(N * |Z|).
A simple implementation of the improvement based on the simple version provided above, is given below.
def generate_improved(N, Z):
    generated = [1]
    indicesToStart = [0] * len(Z)
    for i in range(1, N+1):
        minFound = -1
        minElem = -1
        for j in range(0, len(Z)):
            for k in range(indicesToStart[j], len(generated)):
                candidateVal = Z[j] * generated[k]
                if candidateVal > generated[-1]:
                    if minFound == -1 or minFound > candidateVal:
                        minFound = candidateVal
                        minElem = j
                    break
                indicesToStart[j] += 1
        generated.append(minFound)
        indicesToStart[minElem] += 1
    return generated[-1]
If you have a hard time understanding how complexity decreases with this algorithm, try looking into the difference in time complexity of any graph traversal algorithm when an adjacency list is used, and when an adjacency matrix is used. The improvement adjacency lists help achieve is almost exactly the same kind of improvement we get here. In a nutshell, you have an index for each element and instead of starting to scan from the beginning you continue from wherever you left the last time you scanned the generated array for that element. Consequently, even though there are N iterations in the algorithm(i.e. the outermost loop) the overall number of operations you make is O(N * |Z|).
Important Note: All the code above is a simple implementation for demonstration purposes, and you should consider it just as a pseudocode you can test. While implementing this in real life, based on the programming language you choose to use, you will have to consider issues like integer overflow when computing candidateVal.

Test cases for algorithm puzzle

The following is a problem from Interviewstreet. Can someone please give me a few test cases along with the output? My solution is within the time limit for all test cases but is giving Wrong Answer.
Circle Summation (30 Points)
There are N children sitting along a circle, numbered 1,2,...,N clockwise. The ith child has a piece of paper with number ai written on it. They play the following game:
In the first round, the child numbered x adds to his number the sum of the numbers of his neighbors.
In the second round, the child next in clockwise order adds to his number the sum of the numbers of his neighbors, and so on.
The game ends after M rounds have been played.
Input:
The first line contains T, the number of test cases. T cases follow. The first line for a test case contains two space separated integers N and M. The next line contains N integers, the ith number being ai.
Output:
For each test case, output N lines each having N integers. The jth integer on the ith line contains the number that the jth child ends up with if the game starts with child i playing the first round. Output a blank line after each test case except the last one. Since the numbers can be really huge, output them modulo 1000000007.
Constraints:
1 <= T <= 15
3 <= N <= 50
1 <= M <= 10^9
1 <= ai <= 10^9
Sample Input:
2
5 1
10 20 30 40 50
3 4
1 2 1
Sample Output:
80 20 30 40 50
10 60 30 40 50
10 20 90 40 50
10 20 30 120 50
10 20 30 40 100
23 7 12
11 21 6
7 13 24
If it seems to do ok for small test-cases, but not all, I would guess you have an overflow problem.
Make sure you either...
Do the modulus after each addition, not just after adding all three numbers.
Use 64-bit numbers. This would still require modulus, but not as often.
1000000007 is pretty close to the limit of signed 32-bit numbers (2147483647). You can add two reduced numbers without overflow, but not three.
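Since the question asks for test cases, a brute-force simulator can generate them for small M. The sketch below is an assumption about how one might write it (the name `play` and the 0-based `start` index are invented); it reduces after each addition as suggested above. Python integers do not overflow, so the 32-bit limit is only illustrated by the two comparisons at the end.

```python
MOD = 1_000_000_007
INT32_MAX = 2**31 - 1

def play(values, start, rounds):
    """Simulate the game naively; start is the 0-based index of the first child."""
    a = [v % MOD for v in values]
    n = len(a)
    for r in range(rounds):
        i = (start + r) % n
        # Reduce after every addition so no intermediate exceeds 2 * (MOD - 1).
        total = (a[i] + a[i - 1]) % MOD
        a[i] = (total + a[(i + 1) % n]) % MOD
    return a

for start in range(3):
    print(*play([1, 2, 1], start, 4))

# Two reduced values still fit in a signed 32-bit integer, three may not.
print(2 * (MOD - 1) <= INT32_MAX, 3 * (MOD - 1) <= INT32_MAX)
```

This is only practical for small M; it is meant for producing expected outputs to compare against a faster solution, not for the full constraints.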
