What are strongly connected components used for?

What are strongly connected components used for? - algorithm

I have found several algorithms that explain how to find strongly connected components in a directed graph, but none explain why you would want to do this. What are some applications of strongly connected components?

You should check out Tim Roughgarden's Introduction to Algorithms course on Coursera. For every algorithm he goes over, he explains some applications of it. Very useful, and makes one see the value of studying algorithms!
The use of strongly connected components that I remember him saying is that one could use it to find groups of people who are more closely related in a huge set of data. Think of facebook and how they recommend people that might be your friends...
This could also be used to see chunks of a population. Say, "Wow, this huge component all has the hobby of walking backwards and likes eating moldy pizza!," it could show correlation. Advertisers for moldy pizza would use this data to target people who like walking backwards. Who knows!

One example is in model checking:
Finding strongly connected component is done in explicit model checking in formal verification.
In model checking - we have a state machine, which represents the models of our software/hardware, and we try to prove temporal logic1 formulas on it.
For example: The formula EG(p) means: there is a path in the graph, where for each state - the logic formula p yields true.
The algorithm for proving if EG(p) is true on a graph (model) is finding the maximal strongly connected components (SCC), and then checking paths leading to it in the graph.
Note that model checking is applied widely in the industry - especially for proving correctness of hardware components.
(1) The importance of temporal logic to computer science is great, and its inventor Amir Pnueli recieved a turing award for it!

Another application can be found in vehicle routing applications. A road network can be modeled as a directed graph, with vertices being intersections, and arcs being directed road segments or individual lanes. If the graph isn't Strongly Connected, then vehicles can get trapped in a certain part of the graph (i.e they can get in, but not get out).
In many of these vehicle routing applications, you want to generate routes for a specific area (for instance a routing problem within a city). Prior to generate routes, you would have to extract street data, for instance from Google Maps, Here maps, or Open Street maps. These maps do not just cover the area you are interested in, but they cover the entire world. Consequently you end up taking a snapshot of the area of interest, for instance by computing the induced subgraph of all intersections who's geographical coordinates lay within the area of interest. The resulting subgraph is not necessarily strongly connected (for instance, you can have a road that enters and leaves the area, but is not connected to any other road inside the area). One would then preprocess the subgraph by enumerating all strongly connected components, and throwing away all but the largest component.

Related

What are the applications of Widest path problem?

I want to know some more applications of the widest path problem. CLICK!
It seems like something that can be used in a multitude of places, but I couldn't get anything constructive from searching on the internet.
Can someone please share as to where else this might be used?
thanks in advance.
(what I searched for included uses in p2p networks and CDN, but I couldn't find exactly how it is used / the papers were too long for me to scout.)

The widest path problem has a variety of applications in areas such as network routing problems, digital compositing and voting theory. Some specific applications include:
Finding the route with maximum transmission speed between two nodes.
This comes almost directly from the widest-path problem definition. We want to find the path between two nodes which maximizes the minimum-weight edge in the path.
Computing the strongest path strengths in Schulze’s method.
Schulze's method is a system in voting theory for finding a single winner among multiple candidates. Each voter provides an ordered preference list. We then construct a weighted graph where vertices represents candidates and the weight of an edge (u, v) represents the number of voters who prefer candidate u over candidate v. Next, we want to find the strength of the strongest path between each pair of candidates. This is the part of Schulze's method that can be solved using the widest-path problem. We simply run an algorithm to solve the widest-path problem for each pair of vertices.
Mosaicking of digital photographic maps. This is a technique for merging two maps into a single bigger map. The challenge is the the two original photos might have different light intensity, colors, etc. One way to do mosaicking is to produce seams where each pixel in the resulting picture is represented entirely by one photo or the other. We want the seam to appear invisible in the final product. The problem of finding the optimal seam can be modeled as a widest-path problem. Details for the modeling are found in the original paper
Metabolic path analysis for living organisms. The objective of this type of analysis is identify critical reactions in living organisms. A network is constructed based on the stoichiometry of the reactions. We wish to find the path which is energetically favored in the production of a particular metabolite, ie the path where the bottleneck between two vertices is the smallest. This corresponds to the widest-path problem.

Graph algorithm/approach for finding two nodes that disrupts the graph the most

I'm trying to develop a way of retrieving a measure of disruptiveness when removing two nodes from a graph. So far I'm performing a collection of algorithms like multiple measures of centralities, degrees, pagerank etc.
Its obvious that it can be done by actually removing two nodes and then analyzing the resulting graph (or collection of graphs), but this is also time-consuming when there is O(N^2) combinations of two nodes.
Any help to steer me in the right direction would be appreciated.

I think what you are looking for is the KPP-Neg problem (Key Players Problem).
It is defined in terms of the extent to which the network depends on its key players to maintain its cohesiveness. It is a “negative” problem because it measures the amount of reduction in the cohesiveness of the network that would occur if the nodes were not present.
(As opposed to the KPP-Pos problem where you are looking for a set of network nodes that are optimally positioned to quickly diffuse information, attitudes, behaviors or goods).
Both KPP problems were defined in The key player problem [Borgatti, 2003] and Identifying sets of key players in a network [Borgatti, 2006]. See also "Key players" - a discussion by Yves Zenou here.
Many more approaches were suggested since these papers were presented. Just google Key Players Social Networks.

How to identify loosely-connected components of a graph

Imagine that a graph has two relatively densely connected components that are only connected to each other by relatively few edges. How can I identify the components? I don't know the proper terminology, but the intuition is that a relatively densely connected subgraph is hanging onto another subgraph by a few threads. I want to identify these clumps that are only loosely connected to the rest of the graph.

If your graph represents a real-world system, this task is called community detection. You'll find many articles about that, starting with Fortunato's review (2010). He describes, amongst others, the min-cut based methods mentioned in the earlier answers.
There're also many posts on SO, such as :
Are there implementations of algorithms for community detection in graphs?
What are the differences between community detection algorithms in igraph?
Community detection in Networkx
People in Cross Validated also talk about community detection :
How to do community detection in a weighted social network/graph?
How to find communities?
Finally, there's a proposal in Area 51 for a new Network Science site, which would be more directly related to this problem.

You probably want sparsest cut instead of min cut -- unless you can identify several nodes in a component, min cuts have a tendency to be very unbalanced when the degrees are small. One of the common approaches to sparsest cut is to compute an eigenvector of the graph's Laplacian and threshold it.

The answer might be somewhat general, but you could could try to model your problem as a flow problem and generate a minimum cut; see here. The edges could be bidirectional with capacity 1, and a resulting cut would maybe yield the desired partition?

Help on algorithm to place rooms on a limited space

I'm working on a small house design project and one of its most important parts is a section where the user can give some info about how he wants his rooms (for example, a house with 10 x 10 meters, having a 3x3 living room, a 3x3 kitchen, two 4 x 5 bedrooms, and a 4x2 bathroom), and then the program generates a map of the house according to the requeriments made.
For now, I'm not worried about drawing the map, just arranging the rooms in a way they don't overlap (yes, the output can be pretty ugly). I've already made some searches and found that what I want is very similar to the packing problem, which has some algorithms that handle this problem pretty well (although it's a NP-complete problem).
But then I had one more restriction: the user can specify "links" between rooms, for example, he may wish that a room must have a "door" to a bathroom, the living room to have a direct to the kitchen, etc (that is, the rooms must be placed side by side), and this is where the things get complicated.
I'm pretty sure that what I want configures a NP-problem, so I'm asking for tips to construct a good, but not necessarily optimal implementation. The idea I have is to use graphs to represent the relationship between rooms, but I can't find out how I can adapt the existent packing algorithms to fit this new restriction. Can anyone help me?

I don't have a full answer for you, but I do have a hint: Your connectivity constraints will form what is known as a planar graph (if they don't, the solution is impossible with a single-story house). Rooms in the final solution will correspond to areas enclosed by edges in the dual of the constraint graph, so all you need to do then is take said dual, and adjust the shape of its edges, without introducing intersections, to fit sizing constraints. Note that you will need to introduce a vertex to represent 'outside' in the constraint graph, and ensure it is not surrounded in the dual. You may also need to introduce additional edges in the constraint graph to ensure all the rooms are connected (and add rooms for hallways, etc).

You might find this interesting.
It's a grammar for constructing Palladian villas.
To apply something like that to your problem, I would have a way to construct one at random, and then be able to make random changes to it, and use a simulated annealing algorithm.

data structure to support google/bing maps [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
I was wondering what the data structure is in an application like google/bing maps. How is it that the results are returned so quickly when searching for directions?
what kind of algorithms are being used to determine this information?
thanks

There are two parts to your question:
What kind of data structure is used to store the map information.
What kind of algorithm is used to "navigate" from source to destination.
To this, I would add another question:
How is Google/Bing able to "stream in" the data. So for example, you are able to zoom in from miles up to the ground level seamlessly, all the while maintaining the coordinate system.
I will attempt to address each question in order. Do note, that I do not work for the Google Maps or the Bing team, so quite obviously, this information might not be completely accurate. I am basing this off of the knowledge gained from a good CS course about data structures and algorithms.
Ans 1) The map is stored in an Edge Weighted Directed Graph. Locations on the map are Vertices and the path from one location to another (from one vertex to another) are the Edges.
Quite obviously, since there can be millions of vertices and an order of magnitude more edges, the really interesting thing would be the representation of this Edge Weighted Digraph.
I would say that this would be represented by some kind of Adjacency List and the reason I say so is because, if you imagine a map, it is essentially a sparse graph. There are only a few ways to get from one location to another. Think about your house! How many roads (edges in our case) lead to it? Adjacency Lists are good for representing sparse graphs, and adjacency matrix is good for representing dense graphs.
Of course, even though we are able to efficiently represent sparse graphs in memory, given the sheer number of Vertices and Edges, it would be impossible to store everything in memory at once. Hence, I would imagine some kind of a streaming library underneath.
To create an analogy for this, if you have ever played an open-world game like World of Warcraft / Syrim / GTA, you will observe that to a large part, there is no loading screen. But quite obviously, it is impossible to fit everything into memory at once. Thus using a combination of quad-trees and frustum culling algorithms, these games are able to dynamically load resources (terrain, sprites, meshes etc).
I would imagine something similar, but for Graphs. I have not put a lot of thought into this particular aspect, but to cook up a very basic system, one can imagine an in memory database, which they query and add/remove vertices and edges from the graph at run-time as needed. This brings us to another interesting point. Since vertices and edges need to be removed and added at run-time, the classic implementation of Adjacency List will not cut it.
In a classic implementation, we simply store a List (a Vector in Java) in each element of an array: Adj[]. I would imagine, a linked list in place of the Adj[] array and a binary search tree in place of List[Edge]. The binary search tree would facilitate O(log N) insertion and removal of nodes. This is extremely desirable since in the List implementation, while addition is O(1), removal is O(N) and when you are dealing with millions of edges, this is prohibitive.
A final point to note here is that until you actually start the navigation, there is "no" graph. Since there can be million of users, it doesn't make sense to maintain one giant graph for everybody (this would be impossible due to memory space requirement alone). I would imagine that as you stat the navigation process, a graph is created for you. Quite obviously, since you start from location A and go to location B (and possibly other locations after that), the graph created just for you should not take up a very large amount of memory (provided the streaming architecture is in place).
Ans 2) This is a very interesting question. The most basic algorithm for solving this problem would be Dijkstra Path Finding algorithm. Faster variations such as A* exist. I would imagine Dijkstra to be fast enough, if it could work properly with the streaming architecture discussed above. Dijkstra uses space proportional to V and time proportional to E lg V, which are very good figures, especially for sparse graphs. Do keep in mind, if the streaming architecture has not been nailed down, V and E will explode and the space and run-time requirements of Dijkstra will make it prohibitive.
Ans 1) Streaming question: Do not confuse this question with the streaming architecture discussed above. This is basically asking how the seamless zoom is achieved.
A good algorithm for achieving this is the Quad Tree algorithm (you can generalize this to n-tree). You store coarser images higher up in the tree and higher resolution images as you traverse down the tree. This is actually what KML (Keyhole) did with its mapping algorithm. Keyhole was a company that partnered with NVIDIA many years back to produce one of the first "Google Earth" like softwares.
The inspiration for Quad Tree culling, comes from modern 3D games, where it is used to quickly cull away parts of the scene which is not in the view frustum.
To further clarify this, imagine that you are looking at the map of USA from really high up. At this level, you basically split the map into 4 sections and make each section a child of the Quad Tree.
Now, as you zoom in, you zoom in on one of the sections (quite obviously you can zoom right in the center, so that your zoom actually touches all 4 sections, but for simplicity's sake, lets say you zoom in on one of the sections). So when you zoom in to one section, you traverse the 4 children of that section. These 4 children contain higher resolution data of its parent. You can then continue to zoom down till you hit a set of leaves, which contain the highest resolution data. To make the jump from one resolution to the next "seamless" a combination of blur and fading effects can be utilized.
As a follow-up to this post, I will try to add links to many of the concepts I put in here.

For this sort of application, you would want some sort of database to represent map features and the connections between them, and would then need:
spatial indexing of the map feature database, so that it can be efficiently queried by 2D coordinates; and
a good way to search the connections to find a least-cost route, for some measure of cost (e.g. distance).
For 1, an example would be the R-tree data structure.
For 2, you need a graph search algorithm, such as A*.

Look up a paper about Highway Dimension from google authors. The idea is to precompute the shortest path between important nodes and then route everything through those. You are not going to use residential streets to go from LA to Chicago save for getting on and off the freeway at both ends.

I'm not sure of the internal data structure, but it may be some kind of 2D coordinate based tree structure that only displays a certain number of levels. The levels would correspond to zoom factors, so you could ignore as insignificant things below, say, 5 levels below the current level, and things above the current level.
Regardless of how it's structured, here's how you can use it:
http://code.google.com/apis/maps/documentation/reference.html

I would think of it as a computational geometry problem. When you click on a particular coordinate in the map and using that information, can get the latitude and longitude of that location. Based on the latitude and longitude and the level of zoom, the place can be identified.
Once you have identified the two places, the only problem for you is to identify the nearest route. Now this problem is finding the shortest path between two points, having polygon blocks between them(which correspond to the places which contains no roads) and the only possible connections are roads. This is a known problem and efficient algorithms exist to solve this.
I am not sure if this is what google is doing, but I hope they do something on these lines.
I am taking computational geometry this semester. Here is the course link: http://www.ams.sunysb.edu/~jsbm/courses/545/ams545.html. Check them if you are interested.

I was wondering what the data
structure is in an application like
google/bing maps.
To the user: XHTML/CSS/Javascript. Like any website.
On the server: who knows? Any Google devs around here? It certainly isn't PHP or ASP.net...
How is it that the results are
returned so quickly when searching for
directions?
Because Google spent years, manpower and millions of dollars on building up the architecture to get the fastest server reaction time possible?
What kind of algorithms are being used
to determine this information?
A journey planner algorithm.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio