Thank you for reading my question.
We use simulation software to simulate particle dynamics.
The image below represents the output of the simulation software (for only one particle). The output is a set of points p0(x,y,z), p1(x,y,z),...,p19(x,y,z). We connect the points to get the curve.
Each x, y, z coordinate must always be > 0 and < max length.
However, in our case, which uses periodic boundary conditions, some points have a value < 0 (red arrow) and some points have a value > max length (green arrow).
If we add the max length to the x, y, z values (left of Box 1), then we move those points into Box 1,
and if we subtract the max length from the x, y, z values in Box 2, then we move those points into Box 1.
Actually, this does not solve the problem.
How to solve this issue, please?
I am trying to find an optimal way to place a set of ranges in a larger range. I like to think of it as flat boxes that can move left or right. Some things to consider:
There are N boxes, each of them with a center point Ci.
There are N attractor points (one per box), we can call them Pi. Each box is attracted to one attractor point with a force proportional to the distance.
The order of the boxes is fixed. The order of the attractor points and of the boxes is the same. So C1 is attracted to P1, C2 to P2, etc.
The boxes cannot overlap.
I made a diagram that may make it easier to understand:
The question is, what algorithm can I use to move the boxes around so that each Ci is the closest possible to its respective Pi. In other words, how do I find the locations for the Ci points that minimizes the distance (Li) between all Ci-Pi pairs?
It'd also be helpful if you could point me to some material to read; I'm not very familiar with this type of problem. My guess is that some sort of force-directed algorithm would work, but I'm not sure how to implement those.
Since "each box is attracted to one attractor point with a force proportional to the distance", you are describing a system where the boxes are attached to the attractor points by springs (see Hooke's law), and you want to determine the state of the system at rest (the state of minimum potential energy).
Because the forces are proportional to the distances, what you want is to minimize the sum of the squared distances, that is, the sum of Li^2 from i=1 to i=n. Here is an algorithm to do that.
The idea is to group boxes that need to touch by the end and figure out their position as a group based on their corresponding attractor points.
The first step is not to find these groups, because we can actually start with one big group and cut it later if necessary. For simplicity, let's treat all Li as signed distances. So Li = Ci-Pi. Let's also name the sizes of the boxes, though it will be easier to handle half-sizes. So let Si be half the size of the i-th box. Finally, let's write the sum of Xi from i=a to i=b like sum[a,b](Xi).
Here is how to compute the position of a group of boxes, assuming each one touches the next. Li is a function of the position of the group: if x is that position, Li(x) = Ci(x) - Pi (where Ci(x) is just x plus some constant). x can be any fixed reference point of the group of boxes, for example the left edge of the first box.
We also know that sum[a,b](Li(x)^2) must be minimal. This means the derivative of that sum must be zero: sum[a,b](2*Li(x)) = 0. So:
sum[a,b](2*Li) = 0
sum[a,b](Li) = 0
sum[a,b](Ci - Pi) = 0
sum[a,b](Ci) = sum[a,b](Pi)
Computing sum[a,b](Pi) is trivial, and sum[a,b](Ci) can be expressed in terms of Ca (center of the first box), since C[i+1] = Ci + Si + S[i+1].
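As a minimal sketch of that computation (the function name and the list-based inputs are my own, not from the answer): for a group of touching boxes, each center is the first center plus a fixed offset, so the condition sum(Ci) = sum(Pi) can be solved directly for the first center.

```python
def group_centers(P, S):
    """Centers of a group of touching boxes that minimize sum((Ci - Pi)^2).

    P: attractor points of the boxes in the group, in order.
    S: half-sizes of the same boxes.
    """
    # offsets[i] is the distance from the first center to center i,
    # using C[i+1] = C[i] + S[i] + S[i+1] for touching boxes.
    offsets = [0.0]
    for i in range(1, len(P)):
        offsets.append(offsets[-1] + S[i - 1] + S[i])
    # sum(first + offsets[i]) = sum(P)  =>  solve for the first center.
    first = (sum(P) - sum(offsets)) / len(P)
    return [first + off for off in offsets]


print(group_centers([0, 1, 2], [1, 1, 1]))  # [-1.0, 1.0, 3.0]
```

In the example the centers sum to 3, the same as the attractor points, which is the condition derived above.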
Now that you can compute the position of a group of boxes, do it first with a group made of all boxes, and then remove boxes from that group as follows.
Starting from the left, consider all boxes with Li > 0 and compute Q = sum(Li) for all corresponding i. Similarly, starting from the right, consider all boxes with Li < 0 and compute R = -sum(Li) for all corresponding i (note that negative sign, because we want the absolute value). Now, if Q > R, remove the boxes on the left and make a new group with them, otherwise remove the boxes on the right and make a new group with them.
You cannot make these two new groups at the same time, because removing boxes from one end can change the position of the original group, where boxes you would have removed from the other end should not be removed.
If you made a new group, repeat: compute the position of each separate group of boxes (they will never overlap at this point), and remove boxes if necessary. Otherwise, you have your solution.
It seems the objective is a quadratic function and all the constraints are linear. So I think you can solve it by standard quadratic programming solvers.
If we let S_i be the half-size of the i-th box, and the P_i's are given, then:
Minimize y
with respect to C_1, C_2, ...C_n
subject to
y = sum_i (P_i - C_i)^2
C_i + S_i + S_{i+1} <= C_{i+1} for each i = 1, ... n-1
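A general QP solver is not strictly needed for this particular program. The sketch below swaps in a different technique, which is my own suggestion and not part of the answer above: substituting D_i = C_i - offset_i, where offset_i is the distance from the first center to center i when all boxes touch, turns the constraints into D_i <= D_{i+1}, and the problem becomes an isotonic regression that the pool-adjacent-violators method solves. Like the program above, it ignores the limits of the larger range. The function name is hypothetical.

```python
def place_boxes(P, S):
    """Minimize sum((P_i - C_i)^2) subject to C_i + S_i + S_{i+1} <= C_{i+1}."""
    # Distance from center 0 to center i when the boxes touch.
    offsets = [0.0]
    for i in range(1, len(P)):
        offsets.append(offsets[-1] + S[i - 1] + S[i])
    # After the substitution, each D_i wants to sit at P_i - offset_i.
    targets = [p - off for p, off in zip(P, offsets)]

    # Pool adjacent violators: each block stores [sum, count] of its targets.
    blocks = []
    for t in targets:
        blocks.append([t, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c

    D = []
    for s, c in blocks:
        D.extend([s / c] * c)
    return [d + off for d, off in zip(D, offsets)]


print(place_boxes([0, 1, 2], [1, 1, 1]))  # [-1.0, 1.0, 3.0]
print(place_boxes([0, 10], [1, 1]))       # [0.0, 10.0]
```

Each merged block corresponds to a run of boxes that end up touching, which is the same "group" idea used in the spring-based answer.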
Edit: this is a crude solution to minimize the sum of all Li, which is no longer the question.
Let's name the boxes B, so Bi has center Ci. Let n be the number of boxes and points.
Assuming all the boxes can fit into the larger range, here is how I would do it:
Let Q(a, b) be the average of Pi from i=a to i=b.
Place all the boxes next to each other (in order) to form a superbox, so that the center of this superbox is at Q(1, n).
If it goes over one end of the larger range, move it so that it sits at the limit.
Then, for each Bi, move it as close to Pi as possible without moving other boxes (and while still being inside the larger range). Repeat until you can't move any more box.
Now, the only way to minimize the sum of all Li is as follows.
Let G be a group of boxes that touch. Let F(G) be the predicate: if the center boxes of a series are Bi and Bj (if there are an odd number of boxes in the series, i=j), then Ci != Pi and Cj != Pj.
Find a G such that F(G) is true, and move the corresponding boxes so that F(G) becomes false. If the group of boxes hit another box while moving, add that box to the group and repeat. Of course, don't move any box outside the larger range.
Once there is no G for which F(G) is true or for which you would need to move outside the larger range, you have your solution (one of potentially an infinite number).
Just for completeness, I found a (probably suboptimal) solution that works pretty well and is very easy to implement.
1) Place all boxes with their Ci's at their Pi's.
2) Go over all boxes, from left to right, and do the following:
   - Check if box i overlaps with the box to its left. If it is the first box, check if it overlaps with the range minimum.
   - If there is overlap, move the box to the right so that there is no left overlap.
3) Repeat step 2 but from right to left, checking right overlaps (or the range maximum for the last box).
4) Repeat steps 2-3 until no more overlaps remain or a maximum number of repetitions is reached.
It's quite efficient for my relatively small dataset and I get good results with 10 repetitions of steps 2-3 (5 left to right checks, 5 right to left checks).
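A minimal sketch of this heuristic, assuming the boxes are given as attractor points P, half-sizes S, and a range [lo, hi] (these names and the function signature are mine):

```python
def spread_boxes(P, S, lo, hi, max_rounds=5):
    """Push overlapping boxes apart with alternating left/right sweeps."""
    C = [float(p) for p in P]          # start with every center on its attractor
    n = len(C)
    for _ in range(max_rounds):        # one round = one sweep in each direction
        moved = False
        # Left to right: resolve overlaps with the left neighbour (or range minimum).
        for i in range(n):
            left_limit = lo + S[i] if i == 0 else C[i - 1] + S[i - 1] + S[i]
            if C[i] < left_limit:
                C[i] = left_limit
                moved = True
        # Right to left: resolve overlaps with the right neighbour (or range maximum).
        for i in range(n - 1, -1, -1):
            right_limit = hi - S[i] if i == n - 1 else C[i + 1] - S[i + 1] - S[i]
            if C[i] > right_limit:
                C[i] = right_limit
                moved = True
        if not moved:                  # no overlaps left
            break
    return C


print(spread_boxes([1, 1.5, 9], [1, 1, 1], lo=0, hi=10))  # [1.0, 3.0, 9.0]
```

In the example the second box absorbs the whole correction while the first stays on its attractor, which illustrates why the result is not the least-squares optimum.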
https://leetcode.com/problems/trapping-rain-water-ii/
Given an m x n matrix of positive integers representing the height of
each unit cell in a 2D elevation map, compute the volume of water it
is able to trap after raining.
A slight addition: what if there's a hole in it and the whole platform is in the air? How much can it actually store?
While I can look for a bounding region around the hole and calculate how much water is wasted there, I can only define a rectangular bounding region (Case 1). For the second case, how can you locate and calculate the water in this region:
If I just look for a rectangular region that contains the bounding region defined by the grey lines, calculate the water stored in it, and then subtract that from the total, the water stored in the green region will be removed, which it shouldn't be. And the bigger problem: what if such a region doesn't exist at all?
Or is there any approach I'm missing? Any and all suggestions are welcome.
Here’s the approach that worked for me.
I was looking at separate cells, not regions.
Let a[i][j] be the total height of the combined stone (or whatever material it is) and the water above it.
Then we have:
a[i][j] = max(height[i][j], min(a[i+1][j], a[i][j+1], a[i-1][j], a[i][j-1]))
The “max” part is to prevent the value from being less than the stone part. And the “min” part is to make sure that water is held by the adjacent cells.
For boundaries the water level is zero so a[i][j] = height[i][j]. For other cells we can start with a very big number.
To illustrate this a little bit: suppose you know for sure that the water level for an adjacent cell can't be more than 7 (for example). Then the water level for your current cell also can't be more than 7: there's literally nothing to hold the water from flowing in direction of that adjacent cell.
By the way, if you have a "hole" in a cell then a[i][j] = 0 since no water can be accumulated there.
We can repeatedly apply that formula as kind of “relaxation” until it’s no longer possible. When it’s no longer possible we have our final configuration and we just need to calculate the water volume.
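For the procedure to be efficient we can go from top to bottom applying:
a[i][j] = max(height[i][j], min(a[i][j], a[i-1][j], a[i][j-1]))
and then from bottom to top applying:
a[i][j] = max(height[i][j], min(a[i][j], a[i+1][j], a[i][j+1]))
repeating it again and again while at least one cell value changes. The current a[i][j] is kept inside the min so that a pass never raises a level that an earlier pass has already lowered.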
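The relaxation can be sketched as follows. The function name and the `holes` parameter (a set of (row, column) cells that are open at the bottom) are assumptions made for illustration. The sketch keeps each cell's current value inside the min so that a level never rises once it has been lowered.

```python
def trapped_water(height, holes=()):
    """Volume of trapped water, found by repeated relaxation sweeps."""
    m, n = len(height), len(height[0])
    holes = set(holes)
    INF = float("inf")

    def is_fixed(i, j):
        # Boundary cells and holes never change during relaxation.
        return i in (0, m - 1) or j in (0, n - 1) or (i, j) in holes

    # a[i][j] = stone plus water level; interior cells start "very big".
    a = [[INF] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if (i, j) in holes:
                a[i][j] = 0
            elif is_fixed(i, j):
                a[i][j] = height[i][j]

    changed = True
    while changed:
        changed = False
        # Sweep top to bottom using the upper and left neighbours,
        # then bottom to top using the lower and right neighbours.
        for rows, cols, d in ((range(m), range(n), -1),
                              (range(m - 1, -1, -1), range(n - 1, -1, -1), 1)):
            for i in rows:
                for j in cols:
                    if is_fixed(i, j):
                        continue
                    new = max(height[i][j],
                              min(a[i][j], a[i + d][j], a[i][j + d]))
                    if new != a[i][j]:
                        a[i][j] = new
                        changed = True

    return sum(a[i][j] - height[i][j]
               for i in range(m) for j in range(n) if (i, j) not in holes)


grid = [[1, 4, 3, 1, 3, 2],
        [3, 2, 1, 3, 2, 4],
        [2, 3, 3, 2, 3, 1]]
print(trapped_water(grid))                  # 4
print(trapped_water(grid, holes={(1, 2)}))  # 1
```

The first call is the example from the linked LeetCode problem; the second puts a hole in the deepest cell, so only the pocket at row 1, column 4 still holds water.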
I'm looking for an algorithm that computes the following: I have an image with a predefined area (the green one in the attached image). The user draws the red rectangle, and the algorithm should compute whether the red rectangle approximately matches the green one. For example, the position of the red rectangle in the attached picture would be OK.
What is a good way to compute this? Is there any best practice algorithm?
My idea is to compute the middle of the red rectangle and then to determine whether the middle is inside the green rectangle. In addition, I would calculate if the length and height match approximately the length and height of the green one (25% more or less).
Is this a good idea? Any other suggestion?
Compute the area of the intersection and divide by the average of the areas of the two rectangles (arithmetic or geometric). You will get a fraction. The closer to 1, the better the match.
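A minimal sketch of this score for axis-aligned rectangles, assuming each rectangle is given as an (x, y, width, height) tuple (that representation and the function name are my assumptions):

```python
import math


def overlap_score(a, b, mean="arithmetic"):
    """Intersection area divided by the mean of the two areas (1.0 = identical)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection, clamped at zero when disjoint.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    area_a, area_b = aw * ah, bw * bh
    if mean == "arithmetic":
        denom = (area_a + area_b) / 2
    else:  # geometric mean
        denom = math.sqrt(area_a * area_b)
    return iw * ih / denom


print(overlap_score((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
print(overlap_score((0, 0, 2, 2), (1, 0, 2, 2)))  # 0.5
```

A threshold on the score (for example 0.7) would then decide whether the match is accepted; the right value depends on how strict the check should be.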
Take the average distance between corresponding vertices as the criterion for mismatch.
Let's assume the first rectangle's vertices are [x1,y1], [x2,y2], [x3,y3], [x4,y4] and the second one's are [a1,b1], [a2,b2], [a3,b3], [a4,b4].
Compute the Euclidean distance between each pair of corresponding points, e.g. sqrt((x1-a1)^2 + (y1-b1)^2), and average the four distances.
A lower distance means a better match: an exact overlap gives 0, while a change in shape or an offset of either rectangle increases the average distance between vertices.
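As a small sketch, assuming both rectangles list their four vertices in the same order (for example clockwise from the top-left corner), and with a function name of my own choosing:

```python
import math


def mean_vertex_distance(rect_a, rect_b):
    """Average Euclidean distance between corresponding vertices."""
    distances = [math.dist(p, q) for p, q in zip(rect_a, rect_b)]
    return sum(distances) / len(distances)


green = [(0, 0), (4, 0), (4, 2), (0, 2)]
red = [(3, 4), (7, 4), (7, 6), (3, 6)]    # green shifted by (3, 4)
print(mean_vertex_distance(green, green))  # 0.0
print(mean_vertex_distance(green, red))    # 5.0
```

Because the result is in pixels, it may be worth dividing it by a reference length such as the green rectangle's diagonal before comparing it to a threshold.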
In investigating the problem, I tend to think about the conditions that should make the comparison of the green and red rectangles fail, and to reason about each failing condition separately.
What I mean above, practically, is that I would like the following responses from the algorithm, making clear what aspect of the comparison fails:
Your rectangle's width is way off.
Your rectangle's height is way off.
Your rectangle's horizontal placement is way off.
Your rectangle's vertical placement is way off.
Let us call the conditions above "failing conditions". These failing conditions suggest my view of the comparison, which unavoidably directs my approach. One could view it differently ("Your rectangle's area is way off."). The user, of course, could get more generic responses like the following:
Your rectangle's dimensions are way off.
Your rectangle's placement is way off.
Your rectangle is way off. Try again.
Dude, are you drunk?
In the following I use green to refer to the green rectangle as an object and red to refer to the red rectangle as an object. All conditions are based on relative errors, that is absolute errors normalized with respect to the actual values, i.e. the values of the green rectangle.
One thing that needs to be specified is what "way off" means for horizontal and vertical placement. It means that there is a divergence between the location of a key point of the green rectangle and the location of the corresponding key point of the red rectangle. Let us choose the center of a rectangle as the key point for comparisons (one could choose the top-left corner of the rectangle).
Another thing that needs to be specified is how you may compare two points in a relative way, separately for each axis. You need a reference value. What you can do is calculate the absolute offset between the two points in each axis. Then you can calculate the relative offset with respect to the green rectangle's corresponding dimension. For instance, you can calculate the relative horizontal offset as the absolute offset between the centers in the x-axis divided by the width of the green rectangle. All in all, for a comparison to succeed, I would like the rectangles to have almost the same dimensions and almost the same center, where "almost" should be quantified as a percentage.
Concerning failing condition (1), assuming that the maximum allowed relative error for the rectangle's width is 25%, the boolean value that we have to calculate is:
| green.width - red.width | / green.width > 0.25
If the value above is true, then failing condition (1) goes off. The Dude may be drunk. We may exit and notify.
Concerning failing condition (2), assuming that the maximum allowed relative error for the rectangle's height is 30%, the boolean value that we have to calculate is:
| green.height - red.height | / green.height > 0.30
If the value above is true, then failing condition (2) goes off. We may exit and notify.
Concerning failing condition (3), assuming that the maximum allowed relative error for the rectangle's horizontal offset is 15%, the boolean value that we have to calculate is:
| green.center.x - red.center.x | / green.width > 0.15
If the value above is true, then failing condition (3) goes off. We may exit and notify.
Concerning failing condition (4), assuming that the maximum allowed relative error for the rectangle's vertical offset is 20%, the boolean value that we have to calculate is:
| green.center.y - red.center.y | / green.height > 0.20
If the value above is true, then failing condition (4) goes off. We may exit and notify.
If at least one failing condition goes off, then the comparison fails. If no failing condition is true, then the comparison is successful, the green and the red rectangles are almost the same.
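The four checks can be put together in a short sketch. Rectangles are assumed to be (x, y, width, height) tuples with (x, y) the top-left corner; that representation and the function and parameter names are mine, while the default thresholds are the ones used above.

```python
def failing_conditions(green, red, max_width_err=0.25, max_height_err=0.30,
                       max_x_offset=0.15, max_y_offset=0.20):
    """Return the list of failing conditions (empty list = rectangles match)."""
    gx, gy, gw, gh = green
    rx, ry, rw, rh = red
    failures = []
    # Conditions (1) and (2): relative error of the dimensions.
    if abs(gw - rw) / gw > max_width_err:
        failures.append("Your rectangle's width is way off.")
    if abs(gh - rh) / gh > max_height_err:
        failures.append("Your rectangle's height is way off.")
    # Conditions (3) and (4): center offset relative to the green dimensions.
    g_cx, g_cy = gx + gw / 2, gy + gh / 2
    r_cx, r_cy = rx + rw / 2, ry + rh / 2
    if abs(g_cx - r_cx) / gw > max_x_offset:
        failures.append("Your rectangle's horizontal placement is way off.")
    if abs(g_cy - r_cy) / gh > max_y_offset:
        failures.append("Your rectangle's vertical placement is way off.")
    return failures


green = (0, 0, 100, 50)
print(failing_conditions(green, (5, 2, 110, 55)))   # []
print(failing_conditions(green, (0, 0, 200, 50)))
```

The second call reports both the width and the horizontal placement, since doubling the width from the same corner also moves the center. Returning every failure rather than exiting at the first one is a design choice of this sketch; it makes it easy to build the more generic messages as well.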
I believe that the approach above has a lot of advantages, such as reasoning for separate aspects of the comparison, as well as defining different thresholds for the failing conditions. You can also tune the thresholds according to your taste. In extreme cases more parameters may need to be taken into account, though.
I have distributions of point distances to a parallel line. Each distribution has a more populated area, which represents the point channel. I would like to extract the minimum and maximum, represented by the red lines in the graphs. The eye can do it easily, but how can it be done robustly with an algorithm?
The x axis represents the perpendicular distance of the points to the line from 0 to 100m.
The y axis represents the number of points that have their distance in a certain bin.
Example 1
Example 2
Since the distribution comes from a set of distances from points to a line, and the values are in order, you may try to compute the normal distribution that models your samples. From there, get as margins (your red bars) the mean +/- x*sigma, where x can be the value you want (maybe 1 or 2).
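A minimal sketch of the mean +/- x*sigma idea, assuming the histogram is available as bin centers and counts (the function name and inputs are assumptions; with the raw distances one would compute the mean and standard deviation directly):

```python
import math


def channel_bounds(bin_centers, counts, k=2.0):
    """Return (mean - k*sigma, mean + k*sigma) of a histogram."""
    total = sum(counts)
    # Count-weighted mean and standard deviation of the binned distances.
    mean = sum(c * x for x, c in zip(bin_centers, counts)) / total
    variance = sum(c * (x - mean) ** 2 for x, c in zip(bin_centers, counts)) / total
    sigma = math.sqrt(variance)
    return mean - k * sigma, mean + k * sigma


low, high = channel_bounds([40, 50, 60], [1, 2, 1], k=1.0)
print(round(low, 2), round(high, 2))  # 42.93 57.07
```

This only behaves well when the channel dominates the histogram; a heavy background of scattered points would widen sigma and push the bounds outward.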
If the points were not in order, you may get some percentile (0.25, for example) of the full list of values as a threshold, and assume your populated part of the distribution starts there for values higher than that percentile.
Source: Facebook Hacker Cup Qualification Round 2011
At the arcade, you can play a simple game where a ball is dropped into the top of the game, from a position of your choosing. There are a number of pegs that the ball will bounce off of as it drops through the game. Whenever the ball hits a peg, it will bounce to the left with probability 0.5 and to the right with probability 0.5. The one exception to this is when it hits a peg on the far left or right side, in which case it always bounces towards the middle.
When the game was first made, the pegs were arranged in a regular grid. However, it's an old game, and now some of the pegs are missing. Your goal in the game is to get the ball to fall out of the bottom of the game in a specific location. Given the arrangement of the game, how can we determine the optimal place to drop the ball, such that the probability of getting it to this specific location is maximized?
The image below shows an example of a game with five rows of five columns. Notice that the top row has five pegs, the next row has four pegs, the next five, and so on. With five columns, there are four choices to drop the ball into (indexed from 0). Note that in this example, there are three pegs missing. The top row is row 0, and the leftmost peg is column 0, so the coordinates of the missing pegs are (1,1), (2,1) and (3,2). In this example, the best place to drop the ball is on the far left, in column 0, which gives a 50% chance that it will end in the goal.
x.x.x.x.x
 x...x.x
x...x.x.x
 x.x...x
x.x.x.x.x
G
x indicates a peg, . indicates empty space.
Start at the bottom and assign a probability of 1 to the goal and 0 to other slots. Then for the next row up, assign probabilities as follows:
1) if there is no peg, use the probability directly below.
2) for a peg, use the average of the probabilities in the adjacent columns one row down.
This will simply propagate the probabilities to the top where each slot will be assigned the probability of reaching the goal from that slot. No tree, no recursion.
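Here is a sketch of that bottom-up propagation on the example board. Several things are assumptions of mine rather than part of the answer: the board is passed as the text rows shown in the question, with the shorter rows padded by one character on each side so that all rows share the same positions; drop slot k and goal slot k both sit at text position 2k+1; the two edge positions of the shorter rows are treated as walls; and the goal is taken to be slot 0, since the position of the G in the original picture is not recoverable from the text.

```python
def best_drop(rows, goal_slot):
    """Return (best drop slot, probability of reaching goal_slot)."""
    width = len(rows[0])
    # Pad the shorter (odd) rows so every row indexes positions 0..width-1.
    grid = [row if len(row) == width else " " + row + " " for row in rows]

    def is_open(r, q):
        # The ball cannot leave the board; on odd rows the edges are walls.
        if q < 0 or q >= width:
            return False
        return not (r % 2 == 1 and q in (0, width - 1))

    # Below the last row: probability 1 at the goal, 0 everywhere else.
    below = [0.0] * width
    below[2 * goal_slot + 1] = 1.0
    for r in range(len(grid) - 1, -1, -1):
        current = []
        for p in range(width):
            if grid[r][p] == "x":
                # Peg: average over the sides the ball can bounce to
                # (a single side when the other one is a wall).
                targets = [q for q in (p - 1, p + 1) if is_open(r, q)]
                current.append(sum(below[q] for q in targets) / len(targets))
            else:
                # No peg: the ball falls straight through.
                current.append(below[p])
        below = current

    probs = [below[2 * k + 1] for k in range(width // 2)]
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best, probs[best]


board = ["x.x.x.x.x",
         "x...x.x",
         "x...x.x.x",
         "x.x...x",
         "x.x.x.x.x"]
print(best_drop(board, goal_slot=0))  # (0, 0.5)
```

Under these assumptions the result agrees with the question's statement that dropping in column 0 gives a 50% chance. Each row is visited once, so the work is proportional to the number of rows times the row width.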
We can solve this problem using probability theory. We drop the ball in a position and recursively split the ball's path in its one (at the sidewall) or two possible directions. At the first step, we know with probability 1 the position of the ball (we are dropping it after all!). At each subsequent split into two directions, the probability halves. If we end up at the bottom row in the target location, we add the probability of path taken to our total. Repeat this process for all starting positions and take the highest probability of reaching the target.
We can improve this algorithm by removing the recursion and processing row-by-row using dynamic programming. Start with the first row set to all 0, except for the starting location which we set to 1. Then calculate the probabilities of reaching each cell in the next row by starting with an array of 0's. For each cell in our current row, add half its probability to the cell to its left in the next row and half to its right, unless it's against the sidewall, in which case we add the full probability to the single cell. Continue doing this for each row until reaching the final row.
So far we've neglected the missing pegs. We can take them into account by having three probabilities for each cell: one for each direction the ball is currently travelling. In the end, we sum up all three, as direction doesn't matter.
This question was in Facebook Hacker Cup 2011.
marcog's solution seems correct, but I solved it a bit differently. I solved it like this:
Set up the board: read the input, set up an NxM board, read the missing pegs and insert holes on the board.
For each possible initial drop hole, do a BFS as follows:
Drop hole has 1.0 initial probability.
From the current state you can go down, left, right, or both left and right.
If you can only go down, left, or right, add the current state's probability to the next state and add that state to the queue if it is not already on the queue. For example: if you are at (1, 2) with probability 0.5 and can only go down, add 0.5 to state (2, 2) and add it to the queue if it is not on the queue already.
If you can go left and right, add half the current state's probability to each possible next state and add them to the queue if they are not already there. For example: if you are at (3, 3) with probability 0.5 and can go both left and right, add 0.25 to (4, 2) and 0.25 to (4, 4) and add them to the queue if they are not already there.
Update current best
Print global best.
My solution (not the cleanest code) in cpp can be downloaded from: https://github.com/piva/Programming-Challenges/blob/master/peggame.cpp
Hope that helped...
Observations:
1) For a given starting position, on each row there is a distribution of probabilities.
2) From one full row to the next, the distribution will simply be blurred except for the edges.
3) Where there are holes, we will see predictable deviation from the blurring in (2).
4) We could separate these deviations out, since the balls are dropped one at a time, so the probabilities obey the superposition principle (quantum computers would be ideal here).
5) Separating out the deviations, we can see that really there is a set of holes overlaid on a grid of pegs, so we can calculate the distribution from the complete set of pegs first (easy) and then go through the holes individually to see their effect. This assumes that there are more pegs than holes!
6) An edge is really a mirror: we can calculate for an infinite array of these mirrored virtual boards rather than using if conditions for the boundaries.
So I would start at the bottom, in the desired position, and spread the probability. The missing pegs effectively just skip a row, so you keep a register of vertically falling balls.
Ideally, I would start with a complete (fibonacci) tree, and for each of the missing pegs on a row add in the effect of it being missing.
O(R*C) solution
dp[i][j] gives the probability of the ball reaching the goal slot if it is currently at row i and in slot j.
The base case has dp[R-1][goal] = 1.0 and all other slots in row R-1 set to 0.0.
The recurrence then is
dp[i][j] = dp[i + 2][j] if the peg below is missing
dp[i][j] = dp[i + 1][left] if the peg is on the right wall
dp[i][j] = dp[i + 1][right] if the peg is on the left wall
dp[i][j] = (dp[i + 1][left] + dp[i + 1][right]) / 2 otherwise