fully connection algorithm - algorithm

I have encoutered an algorithm question:
Fully Connection
Given n cities which spreads along a line, let Xi be the position of city i and Pi be its population.
Now we begin to lay cables between every two of the cities based on their distance and population. Given two cities i and j, the cost to lay cable between them is |Xi-Xj|*max(Pi,Pj). How much does it cost to lay all the cables?
For example, given:
i Xi Pi
- -- --
1 1 4
2 2 5
3 3 6
Then the total cost can be calculated as:
i j |Xi-Xj| max(Pi, Pj) Segment Cost
- - ------ ----------- ------------
1 2 1 5 5
2 3 1 6 6
1 3 2 6 12
So that the total cost is 5+6+12 = 23.
While this can clearly be done in O(n2) time, can it be done in asymptotically less time?

I can think of faster solution. If I am not wrong it goes to O(n*logn). Now let's first sort all the cities according to Pi. This is O(n* log n). Then we start processing the cities in increasing order of Pi. the reason being - you always know you have max (Pi, Pj) = Pi in this case. We only want to add all the segments that come from relations with Pi. Those that will connect with larger indexes will be counted when they will be processed.
Now the thing I was able to think of was to use several index trees in order to reduce the complexity of the algorithm. First index tree is counting the number of nodes and can process queries of the kind: how many nodes are to the right of xi in logarithmic time. Lets call this number NR. The second index tree can process queries of the kind: what is the sum of distances from all the points to the right of a given x. The distances are counted towards a fixed point that is guaranteed to be to the right of the rightmost point, lets call its x XR.Lets call this number SUMD. Then the sum of the distances to all points to the right of our point can be found that way: NR * dist(Xi, XR) - SUMD. Then all these contribute (NR * dist(Xi, XR) - SUMD) *Pi to the result. The same for the left points and you get the answer. After you process the ith point you add it to the index trees and can go on.
Edit: Here is one article about Biary index trees: http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees

This is the direct connections problem from codesprint 2.
They will be posting worked solutions to all problems within a week on their website.
(They have said "Now that the contest is over, we're totally cool with everyone discussing their solutions to the problems.")

Related

Touching segments

Can anyone please suggest me algorithm for this.
You are given starting and the ending points of N segments over the x-axis.
How many of these segments can be touched, even on their edges, by exactly two lines perpendicular to them?
Sample Input :
3
5
2 3
1 3
1 5
3 4
4 5
5
1 2
1 3
2 3
1 4
1 5
3
1 2
3 4
5 6
Sample Output :
Case 1: 5
Case 2: 5
Case 3: 2
Explanation :
Case 1: We will draw two lines (parallel to Y-axis) crossing X-axis at point 2 and 4. These two lines will touch all the five segments.
Case 2: We can touch all the points even with one line crossing X-axis at 2.
Case 3: It is not possible to touch more than two points in this case.
Constraints:
1 ≤ N ≤ 10^5
0 ≤ a < b ≤ 10^9
Let assume that we have a data structure that supports the following operations efficiently:
Add a segment.
Delete a segment.
Return the maximum number of segments that cover one point(that is, the "best" point).
If have such a structure, we can get use the initial problem efficiently in the following manner:
Let's create an array of events(one event for the start of each segment and one for the end) and sort by the x-coordinate.
Add all segments to the magical data structure.
Iterate over all events and do the following: when a segment start, add one to the number of currently covered segments and remove it from that data structure. When a segment ends, subtract one from the number of currently covered segment and add this segment to the magical data structure. After each event, update the answer with the value of the number of currently covered segments(it shows how many segments are covered by the point which corresponds to the current event) plus the maximum returned by the data structure described above(it shows how we can choose another point in the best possible way).
If this data structure can perform all given operations in O(log n), then we have an O(n log n) solution(we sort the events and make one pass over the sorted array making a constant number of queries to this data structure for each event).
So how can we implement this data structure? Well, a segment tree works fine here. Adding a segment is adding one to a specific range. Removing a segment is subtracting one from all elements in a specific range. Get ting the maximum is just a standard maximum operation on a segment tree. So we need a segment tree that supports two operations: add a constant to a range and get maximum for the entire tree. It can be done in O(log n) time per query.
One more note: a standard segment tree requires coordinates to be small. We may assume that they never exceed 2 * n(if it is not the case, we can compress them).
An O(N*max(logN, M)) solution, where M is the medium segment size, implemented in Common Lisp: touching-segments.lisp.
The idea is to first calculate from left to right at every interesting point the number of segments that would be touched by a line there (open-left-to-right on the lisp code). Cost: O(NlogN)
Then, from right to left it calculates, again at every interesting point P, the best location for a line considering segments fully to the right of P (open-right-to-left on the lisp code). Cost O(N*max(logN, M))
Then it is just a matter of looking for the point where the sum of both values tops. Cost O(N).
The code is barely tested and may contain bugs. Also, I have not bothered to handle edge cases as when the number of segments is zero.
The problem can be solved in O(Nlog(N)) time per test case.
Observe that there is an optimal placement of two vertical lines each of which go through some segment endpoints
Compress segments' coordinates. More info at What is coordinate compression?
Build a sorted set of segment endpoints X
Sort segments [a_i,b_i] by a_i
Let Q be a priority queue which stores right endpoints of segments processed so far
Let T be a max interval tree built over x-coordinates. Some useful reading atWhat are some sources (books, etc.) from where I can learn about Interval, Segment, Range trees?
For each segment make [a_i,b_i]-range increment-by-1 query to T. It allows to find maximum number of segments covering some x in [a,b]
Iterate over elements x of X. For each x process segments (not already processed) with x >= a_i. The processing includes pushing b_i to Q and making [a_i,b_i]-range increment-by-(-1) query to T. After removing from Q all elements < x, A= Q.size is equal to number of segments covering x. B = T.rmq(x + 1, M) returns maximum number of segments that do not cover x and cover some fixed y > x. A + B is a candidate for an answer.
Source:
http://www.quora.com/What-are-the-intended-solutions-for-the-Touching-segments-and-the-Smallest-String-and-Regex-problems-from-the-Cisco-Software-Challenge-held-on-Hackerrank

Stuck on graph algorithm task

I'm currently trying to solve an algorithm problem from last year's Polish Collegiate Championships which reads as follows:
The Lord Mayor of Bytetown plans to locate a number of radar speed
cameras in the city. There are n intersections in Bytetown numbered
from 1 to n, and n-1 two way street segments. Each of these street
segments stretches between two intersections. The street network
allows getting from each intersection to any other.
The speed cameras are to be located at the intersections (maximum one
per intersection), wherein The Lord Mayor wants to maximise the number
of speed cameras. However, in order not to aggravate Byteland
motorists too much, he decided that on every route running across
Bytetown roads that does not pass through any intersection twice there
can be maximum k speed cameras (including those on endpoints of the
route). Your task is to write a program which will determine where the
speed cameras should be located.
Input
The first line of input contains two integers n and k (1 <= n, k <=
1000000): the number of intersections in Bytetown and maximum number
of speed cameras which can be set up on an individual route. The lines
that follow describe Bytetown street network: the i-th line contains
two integers a_i and b_i (1 <= a_i, b_i <= n), meaning that there is a
two-way street segment which joins two intersections numbered a_i and
b_i.
Output
The first output line should produce m: the number describing the
maximum number of speed cameras, that can be set up in Byteland. The
second line should produce a sequence of m numbers describing the
intersections where the speed cameras should be constructed. Should
there be many solutions, your program may output any one of them.
Example
For the following input data:
5 2
1 3
2 3
3 4
4 5
one of the correct results is:
3 1 2 4
So judging by how many teams solved it, I'm guessing it can't be too hard but still, I got stuck almost immediately with no idea as to how to move on. Since we know that "on every route running across Bytetown roads that does not pass through any intersection twice there can be maximum k speed camera", I guess we first have to somehow dissect the graphs into components being possible routes around the town. This alone seems like a really hard thing to do cause supposing there's an intersection with four motorways coming out of it, it already creates three possible directions for every enter point, thus making 12 routes. Not to mention how the situation complicates when there's more such four-handed intersections.
Maybe I'm approaching the task from the wrong angle? Could you please help?
It seems greedy works here
while k >= 2
mark all leaves of the tree and remove them
k = k - 2;
if ( k == 1 )
mark any 1 of remaining vertices

Is there better way than a Dijkstra algorithm for finding fastest path that do not exceed specified cost

I'm having a problem with finding fastest path that do not exceed specified cost.
There's similar question to this one, however there's a big difference between them.
Here, the only records that can appear in the data are the ones, that lead from a lower point to a higher point (eg. 1 -> 3 might appear but 3 -> 1 might not) (see below). Without knowing that, I'd use Dijkstra. That additional information, might let it do in a time faster than Dijkstras algorithm.
What do you think about it?
Let's say I've got specified maximum cost, and 4 records.
// specified cost
10
// end point
5
//(start point) (finish point) (time) (cost)
2 5 50 5
3 5 20 9
1 2 30 5
1 3 30 7
// all the records lead from a lower to a higher point no.
I have to decide, whether It's possible to get from point (1) to (5) (its impossible when theres no path that costs <= than we've got or when theres no connection between 1-5) and if so, what would be the fastest way to get in there.
The output for such data would be:
80 // fastest time
3 1 // number of points that (1 -> 2) -> (2 -> 5)
Keep in mind, that if there's a record saying you can move 1->2
1 2 30 5
It doesnt allow you to move 2<-1.
For each node at depth n, the minimum cost of path to it is n/2 * (minimal first edge at the path + minimal edge connecting to the node) - sum of arithmetic series. If this computation exceeds the required maximum, no need to check the nodes that follow. Cut these nodes off and apply Dijkstra on the rest.

Eliminating symmetry from graphs

I have an algorithmic problem in which I have derived a transfer matrix between a lot of states. The next step is to exponentiate it, but it is very large, so I need to do some reductions on it. Specifically it contains a lot of symmetry. Below are some examples on how many nodes can be eliminated by simple observations.
My question is whether there is an algorithm to efficiently eliminate symmetry in digraphs, similarly to the way I've done it manually below.
In all cases the initial vector has the same value for all nodes.
In the first example we see that b, c, d and e all receive values from a and one of each other. Hence they will always contain an identical value, and we can merge them.
In this example we quickly spot, that the graph is identical from the point of view of a, b, c and d. Also for their respective sidenodes, it doesn't matter to which inner node it is attached. Hence we can reduce the graph down to only two states.
Update: Some people were reasonable enough not quite sure what was meant by "State transfer matrix". The idea here is, that you can split a combinatorial problem up into a number of state types for each n in your recurrence. The matrix then tell you how to get from n-1 to n.
Usually you are only interested about the value of one of your states, but you need to calculate the others as well, so you can always get to the next level. In some cases however, multiple states are symmetrical, meaning they will always have the same value. Obviously it's quite a waste to calculate all of these, so we want to reduce the graph until all nodes are "unique".
Below is an example of the transfer matrix for the reduced graph in example 1.
[S_a(n)] [1 1 1] [S_a(n-1)]
[S_f(n)] = [1 0 0]*[S_f(n-1)]
[S_B(n)] [4 0 1] [S_B(n-1)]
Any suggestions or references to papers are appreciated.
Brendan McKay's nauty ( http://cs.anu.edu.au/~bdm/nauty/) is the best tool I know of for computing automorphisms of graphs. It may be too expensive to compute the whole automorphism group of your graph, but you might be able to reuse some of the algorithms described in McKay's paper "Practical Graph Isomorphism" (linked from the nauty page).
I'll just add an extra answer building on what userOVER9000 suggested, if anybody else are interested.
The below is an example of using nauty on Example 2, through the dreadnaut tool.
$ ./dreadnaut
Dreadnaut version 2.4 (64 bits).
> n=8 d g -- Starting a new 8-node digraph
0 : 1 3 4; -- Entering edge data
1 : 0 2 5;
2 : 3 1 6;
3 : 0 2 7;
4 : 0;
5 : 1;
6 : 2;
7 : 3;
> cx -- Calling nauty
(1 3)(5 7)
level 2: 6 orbits; 5 fixed; index 2
(0 1)(2 3)(4 5)(6 7)
level 1: 2 orbits; 4 fixed; index 4
2 orbits; grpsize=8; 2 gens; 6 nodes; maxlev=3
tctotal=8; canupdates=1; cpu time = 0.00 seconds
> o -- Output "orbits"
0:3; 4:7;
Notice it suggests joining nodes 0:3 which are a:d in Example 2 and 4:7 which are e:h.
The nauty algorithm is not well documented, but the authors describe it as exponential worst case, n^2 average.
Computing symmetries seems to be a bit of a second order problem. Taking just a,b,c and d in your second graph, the symmetry would have to be expressed
a(b,c,d) = b(a,d,c)
and all its permutations, or some such. Consider a second subgraph a', b', c', d' added to it. Again, we have the symmetries, but parameterised differently.
For computing people (rather than math people), could we express the problem like so?
Each graph node contains a set of letters. At each iteration, all of the letters in each node are copied to its neighbours by the arrows (some arrows take more than one iteration and can be treated as a pipe of anonymous nodes).
We are trying to find efficient ways of determining things such as
* what letters each set/node contains after N iterations.
* for each node the N after which its set no longer changes.
* what sets of nodes wind up containing the same sets of letters (equivalence class)
?

tower of boxes (stacking cubes)

i got this task last week but can't find a good algorithm to solve the problem. So here is the description:
You can build a stable tower with building cubes by not putting bigger cubes to smaller ones and if you don't put harder cubes into lighter ones. Make a programme which gives you the highest possible tower with n number of cubes.
Input:
In the first row of in.txt there is the number of cubes n (1 =< n =<1000). the following n line consisting two positive integer, a cube's sidelength and weight (both of them are not higher than 2000) there are no similar cubes which sidelength and wieght is the same
Output:
you have to write the highest possible stable tower's m number into out.txt. into the second row you have to write in the ordinal number m of the tower from bottom to top. by the height of the tower we mean the amount of all of the cubes's sidelength (not the number of cubes). if there are more than one solution, you can give whatever you want
example for input and output:
input:
5
10 3
20 5
15 6
15 1
10 2
and the output:
3
2 1 5
here are limits on the program. Time limit: 0.2 sec. Memory limit: 16 Mb
I hope you can help me to solve this. thx for all help
The relationship "block A can be placed on top of block B" defines a partial order on the blocks. You can use Kahn's algorithm (aka "topological sort") to turn this into a total order, which you can then traverse in depth order to find the longest path.

Resources