Print all possible course schedules algorithm

This is the famous course schedule question, but I want to print out all possible course schedules.
Q: There are ‘N’ courses, labeled from ‘0’ to ‘N-1’.
Each course can have some prerequisite courses which need to be completed before it can be scheduled.
Given the number of courses and a list of prerequisite pairs, write a method to print all possible orderings of courses that meet all prerequisites.
Assume that there is no cycle.
For the example in the main method, it needs to print
[3, 2, 0, 1]
[3, 2, 1, 0]
but my code prints only one of them:
[3, 2, 1, 0]
Backtracking is needed to make this work, but at some point my backtracking goes wrong, and I'm not sure how to fix it, since it keeps choosing the same order after backtracking. Once it has chosen 1, 0 and then backtracks, it should choose 0, 1, but it keeps choosing the same order 1, 0.
Can someone help me make it work?
class AllCourseOrders {
    static Map<Integer, List<Integer>> map = null;
    static int[] visited = null;
    static int n = 0;

    public static void printOrders(int courses, int[][] prerequisites) {
        List<Integer> sortedOrder = new ArrayList<>();
        // init
        n = courses;
        visited = new int[courses];
        map = new HashMap<>();
        for (int i = 0; i < courses; i++)
            map.put(i, new ArrayList<>());
        // 1. build graph
        for (int[] pre : prerequisites) {
            int from = pre[0], to = pre[1];
            List<Integer> list = map.get(from);
            list.add(to);
        }
        // 2. dfs
        List<List<Integer>> results = new ArrayList<List<Integer>>();
        List<Integer> result = new ArrayList<>();
        for (Integer u : map.keySet()) {
            if (visited[u] == 0) {
                dfs(u, result, results);
                if (result.size() == n) {
                    results.add(new ArrayList<>(result));
                    result.remove(result.size() - 1);
                    visited[u] = 0;
                }
            }
        }
        results.forEach(res -> System.out.println(res));
    }

    static void dfs(Integer u, List<Integer> result, List<List<Integer>> results) {
        visited[u] = 1;
        for (Integer v : map.get(u)) {
            if (visited[v] == 0) {
                dfs(v, result, results);
            }
        }
        visited[u] = 2;
        result.add(0, u);
    }

    public static void main(String[] args) {
        printOrders(4, new int[][] { new int[] { 3, 2 }, new int[] { 3, 0 }, new int[] { 2, 0 }, new int[] { 2, 1 } });
    }
}

Your algorithm finds the first solution it can, not every single one. Every time you are presented with multiple vertices you could take next (you can take different starting nodes, or certain classes can be taken in either order), each choice leads to a different result.
The course problem is simply trying to topologically sort a directed, acyclic graph, where the vertices are the courses and the edges are the prereqs. GeeksForGeeks provides the algorithm on their site in Java.
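A minimal sketch of that backtracking approach (not the GeeksForGeeks code; class and variable names here are illustrative): at every step, place each vertex whose in-degree is currently zero, recurse, and then undo the placement so the remaining candidates are explored as well. The undo is exactly the backtracking step that is missing from the code above.
import java.util.*;

class AllTopologicalOrders {
    // Edge u -> v means u must come before v, matching how the graph is
    // built from the prerequisite pairs in the question.
    static void allOrders(List<List<Integer>> adj, int[] indegree, boolean[] placed,
                          Deque<Integer> order, List<List<Integer>> results) {
        if (order.size() == indegree.length) {
            results.add(new ArrayList<>(order));
            return;
        }
        for (int u = 0; u < indegree.length; u++) {
            if (!placed[u] && indegree[u] == 0) {
                placed[u] = true;                       // choose u
                order.addLast(u);
                for (int v : adj.get(u)) indegree[v]--;
                allOrders(adj, indegree, placed, order, results);
                placed[u] = false;                      // un-choose u: the
                order.removeLast();                     // backtracking step
                for (int v : adj.get(u)) indegree[v]++;
            }
        }
    }

    public static void main(String[] args) {
        int n = 4;
        int[][] prerequisites = { { 3, 2 }, { 3, 0 }, { 2, 0 }, { 2, 1 } };
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        int[] indegree = new int[n];
        for (int[] pre : prerequisites) {
            adj.get(pre[0]).add(pre[1]);
            indegree[pre[1]]++;
        }
        List<List<Integer>> results = new ArrayList<>();
        allOrders(adj, indegree, new boolean[n], new ArrayDeque<>(), results);
        results.forEach(System.out::println);
    }
}
On the example input this prints [3, 2, 0, 1] and [3, 2, 1, 0], as expected.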

Related

Algorithm to remove duplicated locations from a list

I have a service that finds journeys and removes duplicated visited cities.
public static void main(String[] args) {
    List<List<String>> allPaths = new ArrayList<>();
    allPaths.add(List.of("Newyork", "Washington", "Los Angeles", "Chicago"));
    allPaths.add(List.of("Newyork", "Washington", "Houston"));
    allPaths.add(List.of("Newyork", "Dallas"));
    allPaths.add(List.of("Newyork", "Columbus", "Chicago"));
    Set<String> orphanageLocations = new HashSet<>();
    removeDuplicatedLocation(allPaths, orphanageLocations);
    //expected allPaths:
    //"Newyork","Washington","Los Angeles","Chicago"
    //"Newyork","Dallas"
    //"Newyork","Columbus"
    //expected orphanageLocations
    //"Houston"
}

private static void removeDuplicatedLocation(List<List<String>> allPaths, Set<String> orphanageLocations) {
    //do something here
}
In allPaths I store all the paths from one origin to other cities,
but some paths may contain the same city; for example, Washington appears in both the first and the second path.
Now I need a service that removes such duplicated cities: when two paths share a city, we keep the path which visits more cities.
The service should also return the cities that can no longer be visited. For example, the 2nd path's "Washington" is duplicated with the 1st path's, so we cut the 2nd path (it has fewer cities than the first one); there is then no path to "Houston" available -> it becomes an orphanage location.
Other test cases:
public static void main(String[] args) {
    List<List<String>> allPaths = new ArrayList<>();
    allPaths.add(List.of("Newyork", "Washington", "Los Angeles", "Chicago", "Dallas"));
    allPaths.add(List.of("Newyork", "Los Angeles", "Houston", "Philadenphia"));
    allPaths.add(List.of("Newyork", "Dallas"));
    allPaths.add(List.of("Newyork", "Columbus", "San Francisco"));
    Set<String> orphanageLocations = new HashSet<>();
    removeDuplicatedLocation(allPaths, orphanageLocations);
    //expected allPaths:
    //"Newyork","Washington","Los Angeles","Chicago", "Dallas"
    //"Newyork","Columbus", "San Francisco"
    //expected orphanageLocations
    //"Houston","Philadenphia"
}
Would somebody suggest an algorithm to solve it?
---Edit 1: I updated my dirty solution here; still waiting for a better one
private static void removeDuplicatedLocation(List<List<String>> allPaths, Set<String> orphanageLocations) {
    //sort to make sure the longest path is on top
    List<List<String>> sortedPaths = allPaths.stream().sorted((a, b) -> Integer.compare(b.size(), a.size()))
            .collect(Collectors.toList());
    for (int i = 0; i < sortedPaths.size() - 1; i++) {
        List<String> path = sortedPaths.get(i);
        orphanageLocations.removeIf(path::contains);
        for (int j = i + 1; j < sortedPaths.size(); j++) {
            for (int k = 1; k < path.size(); k++) {
                Iterator<String> iterator = sortedPaths.get(j).iterator();
                boolean isRemove = false;
                while (iterator.hasNext()) {
                    String city = iterator.next();
                    if (isRemove && !path.contains(city)) {
                        orphanageLocations.add(city);
                    }
                    if (StringUtils.equals(city, path.get(k))) { // StringUtils from org.apache.commons.lang3
                        isRemove = true;
                    }
                    if (isRemove) {
                        iterator.remove();
                    }
                }
            }
        }
    }
    //remove path if it's only origin
    sortedPaths.removeIf(item -> item.size() == 1);
    allPaths.clear();
    allPaths.addAll(sortedPaths);
}
---Edit 2: Thanks for the solution from #devReddit; I made a small test with a huge number of routes.
The more cities in each path, the slower your solution is:
public static void main(String[] args) {
    List<List<String>> allPaths = new ArrayList<>();
    List<List<String>> allPaths2 = new ArrayList<>();
    List<String> locations = Stream.of("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N",
            "O", "P", "Q", "R", "S", "T", "U", "V", "X", "Y", "Z").collect(Collectors.toList());
    Random rand = new Random();
    int numberOfRoute = 10000;
    String origin = "NY";
    for (int i = 0; i < numberOfRoute; i++) {
        List<String> route = new ArrayList<>();
        List<String> route2 = new ArrayList<>();
        route.add(origin);
        route2.add(origin);
        //int routeLength = rand.nextInt(locations.size());
        int routeLength = 10;
        while (route.size() < routeLength) {
            int randomIndex = rand.nextInt(locations.size() - 1);
            if (!route.contains(locations.get(randomIndex))) {
                route.add(locations.get(randomIndex));
                route2.add(locations.get(randomIndex));
            }
        }
        allPaths.add(route);
        allPaths2.add(route2);
    }
    System.out.println("Process for " + allPaths2.size() + " routes");
    Set<String> orphanageLocations2 = new HashSet<>();
    long startTime2 = System.currentTimeMillis();
    removeDuplicatedLocation3(allPaths2, orphanageLocations2); //uncle bob solution
    long endTime2 = System.currentTimeMillis();
    System.out.println(allPaths2);
    System.out.println(orphanageLocations2);
    System.out.println("Total time uncleBob solution(ms):" + (endTime2 - startTime2));
    System.out.println("Process for " + allPaths.size() + " routes");
    Set<String> orphanageLocations = new HashSet<>();
    long startTime = System.currentTimeMillis();
    removeDuplicatedLocation(allPaths, orphanageLocations); //devReddit solution
    long endTime = System.currentTimeMillis();
    System.out.println(allPaths);
    System.out.println(orphanageLocations);
    System.out.println("Total time devReddit solution(ms):" + (endTime - startTime));
}
//devReddit solution: removeDuplicatedLocation, removeDuplicates and generateHashMap
//are the methods from #devReddit's answer quoted in full further below (apart from a
//trivial forEach variation).
//Bob solution: removeDuplicatedLocation3 is the Edit 1 method above, just renamed.
Here is one of the results:
Test with route length 6
Process for 10000 routes
[[NY, Q, Y, T, S, X], [NY, E], [NY, V, A, H, N], [NY, J, L, I], [NY, D], [NY, O], [NY, C], [NY, P, M], [NY, F], [NY, K], [NY, U], [NY, G], [NY, R], [NY, B]]
[]
Total time uncleBob solution(ms):326
Process for 10000 routes
[[NY, Q, Y, T, S, X], [NY, E], [NY, V], [NY, J, L], [NY, D], [NY, O]]
[A, B, C, F, G, H, I, K, M, N, P, R, U]
Total time devReddit solution(ms):206
With route length 10
Process for 10000 routes
[[NY, J, V, G, A, I, B, R, U, S], [NY, L, X, Q, M, E], [NY, K], [NY, Y], [NY, F, P], [NY, N], [NY, H, D], [NY, T, O], [NY, C]]
[]
Total time uncleBob solution(ms):292
Process for 10000 routes
[[NY, J, V, G, A, I, B, R, U, S]]
[C, D, E, F, H, K, L, M, N, O, P, Q, T, X, Y]
Total time devReddit solution(ms):471
Also the results are not the same: from the same input, mine returns more valid routes.
Actually this is not what I expected, because the solution from #devReddit looks better & faster.
Thanks
Your provided solution is O(m^2 x n^2). I've figured out a solution which has O(n^2) time complexity. The necessary comments have been added as explanation:
The core method removeDuplicatedLocation:
private static void removeDuplicatedLocation(List<List<String>> allPaths, Set<String> orphanageLocations) {
    List<List<String>> sortedFixedPaths = allPaths // List.of produces immutable lists,
            .stream() // so you can't directly remove string from the list
            .sorted((a, b) -> Integer.compare(b.size(), a.size())) // this fixed list will be needed later
            .collect(Collectors.toList());
    List<List<String>> sortedPaths = sortedFixedPaths // The list is regenerated through manual deep copy
            .stream() // generated a single string from the streams of
            .map(path -> // each List<String> and created new list, this is now mutable
                    new ArrayList<>(Arrays.asList(String.join(",", path).split(","))))
            .collect(Collectors.toList());
    Set<List<String>> valuesToBeRemoved = new HashSet<>();
    String source = sortedPaths.get(0).get(0);
    Map<String, List<Integer>> cityMapOfIndex = generateHashMap(sortedPaths, source); // This hashmap keeps track of the existence of cities in different lists
    removeDuplicates(cityMapOfIndex, sortedPaths); // this method removes the duplicates from the smaller paths
    cityMapOfIndex.entrySet().stream().forEach(city -> { // this block checks whether any mid element in the path is gone
        String cityName = city.getKey(); // adds the remaining cities to orphanList
        int index = city.getValue().get(0); // removes the path from result list
        List<String> list = sortedPaths.get(index);
        int indexInPath = list.indexOf(cityName);
        if (indexInPath != sortedFixedPaths.get(index).indexOf(cityName)) {
            orphanageLocations.add(cityName);
            sortedPaths.get(index).remove(indexInPath);
        }
    });
    valuesToBeRemoved.add(new ArrayList<>(Collections.singleton(source))); // To handle the case where only source remains in the path
    sortedPaths.removeAll(valuesToBeRemoved); // after removing other duplicates
    allPaths.clear();
    allPaths.addAll(sortedPaths);
}
The removeDuplicates and generateHashMap methods used in the aforementioned stub are given below:
private static void removeDuplicates(Map<String, List<Integer>> cityMapOfIndex, List<List<String>> sortedPaths) {
    for (Map.Entry<String, List<Integer>> entry : cityMapOfIndex.entrySet()) {
        List<Integer> indexList = entry.getValue();
        while (indexList.size() > 1) {
            int index = indexList.get(indexList.size() - 1); // get the last index i.e. the smallest list of city where this entry exists
            sortedPaths.get(index).remove(entry.getKey()); // remove the city from the list
            indexList.remove((Integer) index); // update the index list of occurrence
        }
        cityMapOfIndex.put(entry.getKey(), indexList);
    }
}

private static Map<String, List<Integer>> generateHashMap(List<List<String>> sortedPaths, String source) {
    Map<String, List<Integer>> cityMapOfIndex = new HashMap<>();
    for (int x = 0; x < sortedPaths.size(); x++) {
        int finalX = x;
        sortedPaths.get(x)
                .stream()
                .forEach(city -> {
                    if (!city.equalsIgnoreCase(source)) { // add entries for all except the source
                        List<Integer> indexList = cityMapOfIndex.containsKey(city) ? // checks whether there's already an entry
                                cityMapOfIndex.get(city) : new ArrayList<>(); // to avoid data loss due to overwriting
                        indexList.add(finalX); // adds the index of the List of string
                        cityMapOfIndex.put(city, indexList); // add or update the map with current indexList
                    }
                });
    }
    return cityMapOfIndex;
}
Please let me know if you have any queries.

Algorithm that discovers all the fields on a map in as few turns as possible

Let's say I have such a map:
#####
..###
W.###
. is a discovered cell.
# is an undiscovered cell.
W is a worker. There can be many workers. Each of them can move once per turn, by one cell in 4 directions (up, right, down or left). He discovers all 8 cells around him (turns # into .). At any time, there can be at most one worker on the same cell.
Maps are not always rectangular. In the beginning all cells are undiscovered, except the neighbours of W.
The goal is to have all cells discovered, in as few turns as possible.
First approach
Find the nearest # and go towards it. Repeat.
To find the nearest # I start a BFS from W and finish it when the first # is found.
On the example map this can give the following solution:
##### ##### ##### ##### ##... #.... .....
..### ...## ....# ..... ...W. ..W.. .W...
W.### .W.## ..W.# ...W. ..... ..... .....
6 turns. Pretty far from optimal:
##### ..### ...## ....# .....
..### W.### .W.## ..W.# ...W.
W.### ..### ...## ....# .....
4 turns.
Question
What is the algorithm that discovers all the cells in as few turns as possible?
Here is a basic idea that uses A*. It is probably quite time- and memory-consuming, but it is guaranteed to return an optimal solution and is definitely better than brute force.
The nodes for A* will be the various states, i.e. where the workers are positioned and the discovery state of all cells. Each unique state represents a different node.
Edges will be all possible transitions. One worker has four possible transitions. For more workers, you will need every possible combination (about 4^n edges). This is the part where you can constrain the workers to remain within the grid and not to overlap.
The cost will be the number of turns. The heuristic to approximate the distance to the goal (all cells discovered) can be developed as follows:
A single worker can discover at most three cells per turn. Thus, n workers can discover at most 3*n cells. The minimum number of remaining turns is therefore "number of undiscovered cells / (3 * worker count)". This is the heuristic to use. It could even be improved by determining the maximum number of cells that each worker can actually discover in the next turn (at most 3 per worker). The overall heuristic would then be "(undiscovered cells - discoverable cells) / (3 * workers) + 1".
In each step you examine the node with the least overall cost (turns so far + heuristic). For the examined node, you calculate the costs for each surrounding node (possible movements of all workers) and go on.
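As a small sketch of that heuristic (the method and parameter names are illustrative, not taken from any posted code):
class TurnsHeuristic {
    // n workers uncover at most 3*n new cells per turn, so a ceiling division
    // never overestimates the remaining turns, which keeps A* optimal. The
    // refined variant described above would subtract the cells discoverable
    // in the very next turn and add 1.
    static int estimatedRemainingTurns(int undiscoveredCells, int workerCount) {
        int perTurn = 3 * workerCount;
        return (undiscoveredCells + perTurn - 1) / perTurn;
    }
}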
Strictly speaking, the main part of this answer may be considered as "Not An Answer". So to first cover the actual question:
What is the algorithm that discovers all the cells in as few turns as possible?
Answer: In each step, you can compute all possible successors of the current state. Then the successors of these successors. This can be repeated recursively, until one of the successors contains no more #-fields. The sequence of states through which this successor was reached is optimal regarding the number of moves that have been necessary to reach this state.
So far, this is trivial. But of course, this is not feasible for a "large" map and/or a "large" number of workers.
As mentioned in the comments: I think that finding the optimal solution may be an NP-complete problem. In any case, it's most likely at least a tremendously complicated optimization problem where you may employ some rather sophisticated techniques to find the optimal solution in optimal time.
So, IMHO, the only feasible approach for tackling this are heuristics.
Several approaches can be imagined here. However, I wanted to give it a try, with a very simple approach. The following MCVE accepts the definition of the map as a rectangular string (empty spaces represent "invalid" regions, so it's possible to represent non-rectangular maps with that). The workers are simply enumerated, from 0 to 9 (limited to this number, at the moment). The string is converted into a MapState that consists of the actual map, as well as the paths that the workers have gone through until then.
The actual search here is a "greedy" version of the exhaustive search that I described in the first paragraph: Given an initial state, it computes all successor states. These are the states where each worker has moved in either direction (e.g. 64 states for 3 workers - of course these are "filtered" to make sure that workers don't leave the map or move to the same field).
These successor states are stored in a list. Then it searches the list for the "best" state, and again computes all successors of this "best" state and stores them in the list. Sooner or later, the list contains a state where no fields are missing.
The definition of the "best" state is where the heuristics come into play: A state is "better" than another when there are fewer fields missing (unvisited). When two states have an equal number of missing fields, then the average distance of the workers to the next unvisited fields serves as the criterion to decide which one is "better".
This finds a solution for the example that is contained in the code below rather quickly, and prints it as the list of positions that each worker has to visit in each turn.
Of course, this will also not be applicable to "really large" maps or "many" workers, because the list of states will grow rather quickly (one could consider dropping the "worst" solutions to speed this up a little, but this may have caveats, like being stuck in local optima). Additionally, one can easily think of cases where the "greedy" strategy does not give optimal results. But until someone posts an MCVE that always computes the optimal solution in polynomial time, maybe someone finds this interesting or helpful.
import java.awt.Point;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MapExplorerTest
{
    public static void main(String[] args)
    {
        String mapString =
            " ###     ######"+"\n"+
            " ###     ###1##"+"\n"+
            "###############"+"\n"+
            "#0#############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "#####   #######"+"\n"+
            "#####   #######"+"\n"+
            "#####   #######"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###############"+"\n"+
            "###   ######2##"+"\n"+
            "###   #########"+"\n";
        MapExplorer m = new MapExplorer(mapString);
        MapState solution = m.computeSolutionGreedy();
        System.out.println(solution.createString());
    }
}
class MapState
{
    private int rows;
    private int cols;
    private char map[][];
    List<List<Point>> workerPaths;
    private int missingFields = -1;

    MapState(String mapString)
    {
        workerPaths = new ArrayList<List<Point>>();
        rows = countLines(mapString);
        cols = mapString.indexOf("\n");
        map = new char[rows][cols];
        String s = mapString.replaceAll("\\n", "");
        for (int r=0; r<rows; r++)
        {
            for (int c=0; c<cols; c++)
            {
                int i = c+r*cols;
                char ch = s.charAt(i);
                map[r][c] = ch;
                if (Character.isDigit(ch))
                {
                    int workerIndex = ch - '0';
                    while (workerPaths.size() <= workerIndex)
                    {
                        workerPaths.add(new ArrayList<Point>());
                    }
                    Point p = new Point(r, c);
                    workerPaths.get(workerIndex).add(p);
                }
            }
        }
    }

    MapState(MapState other)
    {
        this.rows = other.rows;
        this.cols = other.cols;
        this.map = new char[other.map.length][];
        for (int i=0; i<other.map.length; i++)
        {
            this.map[i] = other.map[i].clone();
        }
        this.workerPaths = new ArrayList<List<Point>>();
        for (List<Point> otherWorkerPath : other.workerPaths)
        {
            this.workerPaths.add(MapExplorer.copy(otherWorkerPath));
        }
    }

    int distanceToMissing(Point p0)
    {
        if (getMissingFields() == 0)
        {
            return -1;
        }
        List<Point> points = new ArrayList<Point>();
        Map<Point, Integer> distances = new HashMap<Point, Integer>();
        distances.put(p0, 0);
        points.add(p0);
        while (!points.isEmpty())
        {
            Point p = points.remove(0);
            List<Point> successors = MapExplorer.computeSuccessors(p);
            for (Point s : successors)
            {
                // check the neighbour s (not p): stop as soon as an
                // undiscovered cell is reached
                if (!isValid(s))
                {
                    continue;
                }
                if (map[s.x][s.y] == '#')
                {
                    return distances.get(p)+1;
                }
                if (!distances.containsKey(s))
                {
                    distances.put(s, distances.get(p)+1);
                    points.add(s);
                }
            }
        }
        return -1;
    }

    double averageDistanceToMissing()
    {
        double d = 0;
        for (List<Point> workerPath : workerPaths)
        {
            Point p = workerPath.get(workerPath.size()-1);
            d += distanceToMissing(p);
        }
        return d / workerPaths.size();
    }

    int getMissingFields()
    {
        if (missingFields == -1)
        {
            missingFields = countMissingFields();
        }
        return missingFields;
    }

    private int countMissingFields()
    {
        int count = 0;
        for (int r=0; r<rows; r++)
        {
            for (int c=0; c<cols; c++)
            {
                if (map[r][c] == '#')
                {
                    count++;
                }
            }
        }
        return count;
    }

    void update()
    {
        for (List<Point> workerPath : workerPaths)
        {
            Point p = workerPath.get(workerPath.size()-1);
            for (int dr=-1; dr<=1; dr++)
            {
                for (int dc=-1; dc<=1; dc++)
                {
                    if (dr == 0 && dc == 0)
                    {
                        continue;
                    }
                    int nr = p.x + dr;
                    int nc = p.y + dc;
                    if (!isValid(nr, nc))
                    {
                        continue;
                    }
                    if (map[nr][nc] != '#')
                    {
                        continue;
                    }
                    map[nr][nc] = '.';
                }
            }
        }
    }

    public void updateWorkerPosition(int w, Point p)
    {
        List<Point> workerPath = workerPaths.get(w);
        Point old = workerPath.get(workerPath.size()-1);
        char oc = map[old.x][old.y];
        char nc = map[p.x][p.y];
        map[old.x][old.y] = nc;
        map[p.x][p.y] = oc;
    }

    boolean isValid(int r, int c)
    {
        if (r < 0) return false;
        if (r >= rows) return false;
        if (c < 0) return false;
        if (c >= cols) return false;
        if (map[r][c] == ' ')
        {
            return false;
        }
        return true;
    }

    boolean isValid(Point p)
    {
        return isValid(p.x, p.y);
    }

    private static int countLines(String s)
    {
        int count = 0;
        while (s.contains("\n"))
        {
            s = s.replaceFirst("\\\n", "");
            count++;
        }
        return count;
    }

    public String createMapString()
    {
        StringBuilder sb = new StringBuilder();
        for (int r=0; r<rows; r++)
        {
            for (int c=0; c<cols; c++)
            {
                sb.append(map[r][c]);
            }
            sb.append("\n");
        }
        return sb.toString();
    }

    public String createString()
    {
        StringBuilder sb = new StringBuilder();
        for (List<Point> workerPath : workerPaths)
        {
            Point p = workerPath.get(workerPath.size()-1);
            int d = distanceToMissing(p);
            sb.append(workerPath).append(", distance: "+d+"\n");
        }
        sb.append(createMapString());
        sb.append("Missing "+getMissingFields());
        return sb.toString();
    }
}
class MapExplorer
{
    MapState mapState;

    public MapExplorer(String mapString)
    {
        mapState = new MapState(mapString);
        mapState.update();
        computeSuccessors(mapState);
    }

    static List<Point> copy(List<Point> list)
    {
        List<Point> result = new ArrayList<Point>();
        for (Point p : list)
        {
            result.add(new Point(p));
        }
        return result;
    }

    public MapState computeSolutionGreedy()
    {
        Comparator<MapState> comparator = new Comparator<MapState>()
        {
            @Override
            public int compare(MapState ms0, MapState ms1)
            {
                int m0 = ms0.getMissingFields();
                int m1 = ms1.getMissingFields();
                if (m0 != m1)
                {
                    return m0-m1;
                }
                double d0 = ms0.averageDistanceToMissing();
                double d1 = ms1.averageDistanceToMissing();
                return Double.compare(d0, d1);
            }
        };
        Set<MapState> handled = new HashSet<MapState>();
        List<MapState> list = new ArrayList<MapState>();
        list.add(mapState);
        while (true)
        {
            MapState best = list.get(0);
            for (MapState mapState : list)
            {
                if (!handled.contains(mapState))
                {
                    if (comparator.compare(mapState, best) < 0)
                    {
                        best = mapState;
                    }
                }
            }
            if (best.getMissingFields() == 0)
            {
                return best;
            }
            handled.add(best);
            list.addAll(computeSuccessors(best));
            System.out.println("List size "+list.size()+", handled "+handled.size()+", best\n"+best.createString());
        }
    }

    List<MapState> computeSuccessors(MapState mapState)
    {
        int numWorkers = mapState.workerPaths.size();
        List<Point> oldWorkerPositions = new ArrayList<Point>();
        for (int i=0; i<numWorkers; i++)
        {
            List<Point> workerPath = mapState.workerPaths.get(i);
            Point p = workerPath.get(workerPath.size()-1);
            oldWorkerPositions.add(p);
        }
        List<List<Point>> successorPositionsForWorkers = new ArrayList<List<Point>>();
        for (int w=0; w<oldWorkerPositions.size(); w++)
        {
            Point p = oldWorkerPositions.get(w);
            List<Point> ps = computeSuccessors(p);
            successorPositionsForWorkers.add(ps);
        }
        List<List<Point>> newWorkerPositionsList = new ArrayList<List<Point>>();
        int numSuccessors = (int)Math.pow(4, numWorkers);
        for (int i=0; i<numSuccessors; i++)
        {
            String s = Integer.toString(i, 4);
            while (s.length() < numWorkers)
            {
                s = "0"+s;
            }
            List<Point> newWorkerPositions = copy(oldWorkerPositions);
            for (int w=0; w<numWorkers; w++)
            {
                int index = s.charAt(w) - '0';
                Point newPosition = successorPositionsForWorkers.get(w).get(index);
                newWorkerPositions.set(w, newPosition);
            }
            newWorkerPositionsList.add(newWorkerPositions);
        }
        List<MapState> successors = new ArrayList<MapState>();
        for (int i=0; i<newWorkerPositionsList.size(); i++)
        {
            List<Point> newWorkerPositions = newWorkerPositionsList.get(i);
            if (workerPositionsValid(newWorkerPositions))
            {
                MapState successor = new MapState(mapState);
                for (int w=0; w<numWorkers; w++)
                {
                    Point p = newWorkerPositions.get(w);
                    successor.updateWorkerPosition(w, p);
                    successor.workerPaths.get(w).add(p);
                }
                successor.update();
                successors.add(successor);
            }
        }
        return successors;
    }

    private boolean workerPositionsValid(List<Point> workerPositions)
    {
        Set<Point> set = new HashSet<Point>();
        for (Point p : workerPositions)
        {
            if (!mapState.isValid(p.x, p.y))
            {
                return false;
            }
            set.add(p);
        }
        return set.size() == workerPositions.size();
    }

    static List<Point> computeSuccessors(Point p)
    {
        List<Point> result = new ArrayList<Point>();
        result.add(new Point(p.x+0, p.y+1));
        result.add(new Point(p.x+0, p.y-1));
        result.add(new Point(p.x+1, p.y+0));
        result.add(new Point(p.x-1, p.y+0));
        return result;
    }
}

Sorting two Dictionaries and finding the differences in index position of items in the sorted list

I have a challenge in front of me. Let me present the challenge which is perplexing me -
There are two dictionaries, say D1 and D2.
These dictionaries have the same keys most of the time, but there is no guarantee of that.
The two Dictionaries could be represented as follows -
D1 = {["R1", 0.7], ["R2",0.73], ["R3", 1.5], ["R4", 2.5], ["R5", 0.12], ["R6", 1.9], ["R7", 9.8], ["R8", 6.5], ["R9", 7.2], ["R10", 5.6]};
D2 = {["R1", 0.7], ["R2",0.8], ["R3", 1.5], ["R4", 3.1], ["R5", 0.10], ["R6", 2.0], ["R7", 8.0], ["R8", 1.0], ["R9", 0.0], ["R10", 5.6], ["R11", 6.23]};
Here in these dictionaries, the keys are of string data type and the values are of float data type.
Physically they are snapshots of a system at two different times, D1 being older than D2.
I need to sort these dictionaries independently based on the values, in ascending order. Once that is done, the dictionaries become -
D1 = {["R5", 0.12], ["R1", 0.7], ["R2",0.73], ["R3", 1.5], ["R6", 1.9], ["R4", 2.5], ["R10", 5.6], ["R8", 6.5], ["R9", 7.2], ["R7", 9.8]};
and
D2 = {["R9", 0.0], ["R5", 0.10], ["R1", 0.7], ["R2",0.8], ["R8", 1.0], ["R3", 1.5], ["R6", 2.0], ["R4", 3.1], ["R10", 5.6], ["R11", 6.23], ["R7", 8.0]};
Here the sorted order of the elements in D1 is taken as the reference: each element of D1 is connected with the one immediately after it. The task is to identify the elements in D2 which have broken the sequence as it appears in the sorted reference dictionary D1. While determining this, additions (a key not present in D1 but present in D2) and removals (a key present in D1 but not in D2) are ignored, i.e. they should not be highlighted in the result.
For example, continuing with the example listed above, the elements which break the sequence in D2 with reference to D1 (ignoring additions and removals) are -
Breakers = {["R9", 0.0], ["R8", 1.0]}, since R9 has jumped from index 8 in the sorted D1 to index 0 in the sorted D2, and similarly R8 has jumped from index 7 in the sorted D1 to index 4 in the sorted D2 (all indexes start from 0).
Note - ["R11", 6.23] is not expected to be in the list of Breakers, since it is an addition to D2.
Please suggest an algorithm to achieve this optimally, since this operation needs to be performed on data fetched from a database with 3,256,190 records.
Programming language is not a worry, if guided with logic I could take up the task of implementing it in any language.
I came up with this algorithm in C#. It works perfectly for your example data. I also did a test with 3,000,000 totally random values (so a lot of breakers are detected) and it completes in 3.2 seconds on my notebook (Intel Core i3 2.1GHz, 64-bit).
I first put your data into temporary dictionaries, so I could copy-paste your values, before putting them into the Lists. Of course your application will put them directly into the lists.
class Program
{
    struct SingleValue
    {
        public string Key;
        public float Value;

        public SingleValue(string key, float value)
        {
            Key = key;
            Value = value;
        }

        public override string ToString()
        {
            return string.Format("{0}={1}", Key, Value);
        }
    }

    static void Main(string[] args)
    {
        List<SingleValue> D1 = new List<SingleValue>();
        HashSet<string> D1keys = new HashSet<string>();
        List<SingleValue> D2 = new List<SingleValue>();
#if !LARGETEST
        Dictionary<string, double> D1input = new Dictionary<string, double>() { { "R1", 0.7 }, { "R2", 0.73 }, { "R3", 1.5 }, { "R4", 2.5 }, { "R5", 0.12 }, { "R6", 1.9 }, { "R7", 9.8 }, { "R8", 6.5 }, { "R9", 7.2 }, { "R10", 5.6 } };
        Dictionary<string, double> D2input = new Dictionary<string, double>() { { "R1", 0.7 }, { "R2", 0.8 }, { "R3", 1.5 }, { "R4", 3.1 }, { "R5", 0.10 }, { "R6", 2.0 }, { "R7", 8.0 }, { "R8", 1.0 }, { "R9", 0.0 }, { "R10", 5.6 }, { "R11", 6.23 } };
        // You should directly put your values into this list... I converted them from a Dictionary so I didn't have to type over your input values :)
        foreach (KeyValuePair<string, double> kvp in D1input)
        {
            D1.Add(new SingleValue(kvp.Key, (float)kvp.Value));
            D1keys.Add(kvp.Key);
        }
        foreach (KeyValuePair<string, double> kvp in D2input)
            D2.Add(new SingleValue(kvp.Key, (float)kvp.Value));
#else
        Random ran = new Random();
        for (int i = 0; i < 3000000; i++)
        {
            D1.Add(new SingleValue(i.ToString(), (float)ran.NextDouble()));
            D1keys.Add(i.ToString());
            D2.Add(new SingleValue(i.ToString(), (float)ran.NextDouble()));
        }
#endif
        // Sort the lists
        D1.Sort(delegate(SingleValue x, SingleValue y)
        {
            if (y.Value > x.Value)
                return -1;
            else if (y.Value < x.Value)
                return 1;
            return 0;
        });
        D2.Sort(delegate(SingleValue x, SingleValue y)
        {
            if (y.Value > x.Value)
                return -1;
            else if (y.Value < x.Value)
                return 1;
            return 0;
        });
        int start = Environment.TickCount;
        Dictionary<string, float> breakers = new Dictionary<string, float>();
        List<SingleValue> additions = new List<SingleValue>();
        // Walk through D1
        IEnumerator<SingleValue> i1 = D1.GetEnumerator();
        IEnumerator<SingleValue> i2 = D2.GetEnumerator();
        while (i1.MoveNext() && i2.MoveNext())
        {
            while (breakers.ContainsKey(i1.Current.Key))
            {
                if (!i1.MoveNext())
                    break;
            }
            while (i1.Current.Key != i2.Current.Key)
            {
                if (D1keys.Contains(i2.Current.Key))
                    breakers.Add(i2.Current.Key, i2.Current.Value);
                else
                    additions.Add(i2.Current);
                if (!i2.MoveNext())
                    break;
            }
        }
        int duration = Environment.TickCount - start;
        Console.WriteLine("Lookup took {0}ms", duration);
        Console.ReadKey();
    }
}
If you could delete stuff from D2 before sorting it would be easy, right? You say you can't delete the data. However, you can make an additional data structure that simulates such deletions (e.g., add a "deleted" bit to the item, or if you can't change its type then make a set of "deleted" items). Then run the simple algorithm, but make sure to ignore items that are "deleted".
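A minimal sketch of that bookkeeping, in Java since the question says the programming language is not a worry (names are illustrative):
import java.util.*;

class SimulatedDeletion {
    // Keys that exist in only one snapshot are recorded as "deleted"; the
    // comparison pass then skips a key k whenever deleted.contains(k) is
    // true, without ever mutating D1 or D2.
    static Set<String> deletedKeys(Map<String, Double> d1, Map<String, Double> d2) {
        Set<String> deleted = new HashSet<>();
        for (String key : d2.keySet()) if (!d1.containsKey(key)) deleted.add(key); // additions to D2
        for (String key : d1.keySet()) if (!d2.containsKey(key)) deleted.add(key); // removals from D1
        return deleted;
    }
}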
I have been thinking about this. As you mentioned Levenshtein distance, I am assuming you want the elements which, when moved from their position in D2 to some other position in D2, turn D2 into D1 in the least number of moves (ignoring elements which don't exist in both sequences).
I wrote a greedy algorithm which may be sufficient for your needs, but it may not necessarily give the optimal result in all cases. I am honestly not sure and may come back to it later (weekend at the earliest) to check the correctness. However, if you really need to do this on sequences of 3 million elements, I believe that no algorithm which does any kind of good job at this will be fast enough, because I can't see an O(n) algorithm which does a good job and can't fail even on some trivial inputs.
This algorithm tries moving each element to its intended position and calculates the sum of errors (how far each element is from its original position) after the move. The element which results in the lowest sum of errors is proclaimed a breaker and moved. This is repeated until the sequence reverts back to D1.
I think it has O(n^3) complexity, although elements sometimes need to be moved multiple times so it could possibly be O(n^4) worst case, I'm not sure, but in 1 million random examples with 50 elements the max number of outer loop runs was 51 (n^4 would mean it can be 2500 and somehow I was lucky in all my million tests). There are just keys, no values. This is because the values are irrelevant in this step so there is no point in storing them.
edit: I wrote a counterexample generator for this and indeed it is not always optimal. The more breakers there are, the less of a chance of an optimal solution. For example, in 1000 elements with 50 randomly moved ones it will usually find a set of 55-60 breakers, when the optimal solution is at most 50.
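For comparison, here is a sketch of an approach that is optimal in the number of single-element moves, based on the longest increasing subsequence (a different technique from the greedy search below; in Java, with illustrative names). Map D2's common keys to their indices in the sorted D1; the elements outside a longest increasing subsequence of that index sequence form a minimal set of breakers.
import java.util.*;

class BreakersViaLis {
    static Set<String> findBreakers(List<String> d1Sorted, List<String> d2Sorted) {
        Map<String, Integer> indexInD1 = new HashMap<>();
        for (int i = 0; i < d1Sorted.size(); i++) indexInD1.put(d1Sorted.get(i), i);

        // keep only the keys common to both snapshots (ignore additions/removals)
        List<String> common = new ArrayList<>();
        for (String key : d2Sorted) if (indexInD1.containsKey(key)) common.add(key);

        int n = common.size();
        if (n == 0) return new LinkedHashSet<>();
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = indexInD1.get(common.get(i));

        // longest increasing subsequence via patience sorting, O(n log n),
        // with parent links so the subsequence itself can be reconstructed
        int[] tails = new int[n], tailIdx = new int[n], parent = new int[n];
        int len = 0;
        for (int i = 0; i < n; i++) {
            int lo = 0, hi = len;
            while (lo < hi) {
                int mid = (lo + hi) / 2;
                if (tails[mid] < a[i]) lo = mid + 1; else hi = mid;
            }
            tails[lo] = a[i];
            tailIdx[lo] = i;
            parent[i] = (lo > 0) ? tailIdx[lo - 1] : -1;
            if (lo == len) len++;
        }
        boolean[] inLis = new boolean[n];
        for (int i = tailIdx[len - 1]; i >= 0; i = parent[i]) inLis[i] = true;

        // everything outside the LIS has to move: these are the breakers
        Set<String> breakers = new LinkedHashSet<>();
        for (int i = 0; i < n; i++) if (!inLis[i]) breakers.add(common.get(i));
        return breakers;
    }

    public static void main(String[] args) {
        List<String> d1 = List.of("R5", "R1", "R2", "R3", "R6", "R4", "R10", "R8", "R9", "R7");
        List<String> d2 = List.of("R9", "R5", "R1", "R2", "R8", "R3", "R6", "R4", "R10", "R11", "R7");
        System.out.println(findBreakers(d1, d2)); // prints [R9, R8]
    }
}
On the question's example this yields exactly {R9, R8}. The greedy implementation from the answer follows.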
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Breakers
{
    class Program
    {
        static void Main(string[] args)
        {
            //test case 1
            //List<string> L1 = new List<string> { "R5", "R1", "R2", "R3", "R6", "R4", "R10", "R8", "R9", "R7" };
            //List<string> L2 = new List<string> { "R9", "R5", "R1", "R2", "R8", "R3", "R6", "R4", "R10", "R11", "R7" };
            //GetBreakers<string>(L1, L2);
            //test case 2
            //List<string> L1 = new List<string> { "R5", "R1", "R2", "R3", "R6", "R4", "R10", "R8", "R9", "R7" };
            //List<string> L2 = new List<string> { "R5", "R9", "R1", "R6", "R2", "R3", "R4", "R10", "R8", "R7" };
            //GetBreakers<string>(L1, L2);
            //test case 3
            List<int> L1 = new List<int>();
            List<int> L2 = new List<int>();
            Random r = new Random();
            int n = 100;
            for (int i = 0; i < n; i++)
            {
                L1.Add(i);
                L2.Add(i);
            }
            for (int i = 0; i < 5; i++) // number of random moves, this is the upper bound of the optimal solution
            {
                int a = r.Next() % n;
                int b = r.Next() % n;
                if (a == b)
                {
                    i--;
                    continue;
                }
                int x = L2[a];
                Console.WriteLine(x);
                L2.RemoveAt(a);
                L2.Insert(b, x);
            }
            for (int i = 0; i < L2.Count; i++) Console.Write(L2[i]);
            Console.WriteLine();
            GetBreakers<int>(L1, L2);
        }

        static void GetBreakers<T>(List<T> L1, List<T> L2)
        {
            Dictionary<T, int> Appearances = new Dictionary<T, int>();
            for (int i = 0; i < L1.Count; i++) Appearances[L1[i]] = 1;
            for (int i = 0; i < L2.Count; i++) if (Appearances.ContainsKey(L2[i])) Appearances[L2[i]] = 2;
            for (int i = L1.Count - 1; i >= 0; i--) if (!(Appearances.ContainsKey(L1[i]) && Appearances[L1[i]] == 2)) L1.RemoveAt(i);
            for (int i = L2.Count - 1; i >= 0; i--) if (!(Appearances.ContainsKey(L2[i]) && Appearances[L2[i]] == 2)) L2.RemoveAt(i);
            Dictionary<T, int> IndInL1 = new Dictionary<T, int>();
            for (int i = 0; i < L1.Count; i++) IndInL1[L1[i]] = i;
            Dictionary<T, int> Breakers = new Dictionary<T, int>();
            int steps = 0;
            int me = 0;
            while (true)
            {
                steps++;
                int minError = int.MaxValue;
                int minErrorIndex = -1;
                for (int from = 0; from < L2.Count; from++)
                {
                    T x = L2[from];
                    int to = IndInL1[x];
                    if (from == to) continue;
                    L2.RemoveAt(from);
                    L2.Insert(to, x);
                    int error = 0;
                    for (int i = 0; i < L2.Count; i++)
                        error += Math.Abs((i - IndInL1[L2[i]]));
                    L2.RemoveAt(to);
                    L2.Insert(from, x);
                    if (error < minError)
                    {
                        minError = error;
                        minErrorIndex = from;
                    }
                }
                if (minErrorIndex == -1) break;
                T breaker = L2[minErrorIndex];
                int breakerOriginalPosition = IndInL1[breaker];
                L2.RemoveAt(minErrorIndex);
                L2.Insert(breakerOriginalPosition, breaker);
                Breakers[breaker] = 1;
                me = minError;
            }
            Console.WriteLine("Breakers: " + Breakers.Count + " Steps: " + steps);
            foreach (KeyValuePair<T, int> p in Breakers)
                Console.WriteLine(p.Key);
            Console.ReadLine();
        }
    }
}

Using LINQ to get the previous and next element

One of my colleagues was looking for something like picking the previous and next values from a list for a given value. I wrote a little function with some help from Google, which works, but I wanted to know:
1. Is this an efficient way to do this?
2. Is there any other way in LINQ to do this?
private static List<double> GetHighLow(double value)
{
    List<double> tenorList = new List<double> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 20, 30 };
    double previous = tenorList.OrderByDescending(s => s).Where(s => s.CompareTo(value) < 0).FirstOrDefault();
    double next = tenorList.OrderBy(s => s).Where(s => s.CompareTo(value) > 0).FirstOrDefault();
    List<double> values = new List<double> { previous, next };
    return values;
}
thanks
Pak
Ordering just to find a single item would make me suspicious.
You can do it in linear time this way:
double prev = double.MinValue;
double nx = double.MaxValue;
foreach (var item in tenorList) {
    if (item < value && item > prev) { prev = item; }
    if (item > value && item < nx) { nx = item; }
}
List<double> values = new List<double> { prev, nx };

How to find a word from arrays of characters?

What is the best way to solve this:
I have a group of arrays with 3-4 characters inside each like so:
{p, q, r, s}
{a, b, c}
{t, u, v}
{m, n, o}
I also have an array of dictionary words.
What is the best/fastest way to find out if the arrays of characters can combine to form one of the dictionary words? For example, the above arrays could make the words
"pat", "rat", "at", "to", "bum" (lol) but not "nub" or "mat".
Should I loop through the dictionary to see if words can be made, or get all the combinations from the letters and then compare those to the dictionary?
I had some Scrabble code lying around, so I was able to throw this together. The dictionary I used is sowpods (267,751 words). The code below reads the dictionary as a text file with one uppercase word on each line.
The code is C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Diagnostics;

namespace SO_6022848
{
    public struct Letter
    {
        public const string Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

        public static implicit operator Letter(char c)
        {
            return new Letter() { Index = Chars.IndexOf(c) };
        }

        public int Index;

        public char ToChar()
        {
            return Chars[Index];
        }

        public override string ToString()
        {
            return Chars[Index].ToString();
        }
    }

    public class Trie
    {
        public class Node
        {
            public string Word;
            public bool IsTerminal { get { return Word != null; } }
            public Dictionary<Letter, Node> Edges = new Dictionary<Letter, Node>();
        }

        public Node Root = new Node();

        public Trie(string[] words)
        {
            for (int w = 0; w < words.Length; w++)
            {
                var word = words[w];
                var node = Root;
                for (int len = 1; len <= word.Length; len++)
                {
                    var letter = word[len - 1];
                    Node next;
                    if (!node.Edges.TryGetValue(letter, out next))
                    {
                        next = new Node();
                        node.Edges.Add(letter, next);
                    }
                    if (len == word.Length)
                    {
                        next.Word = word; // mark the terminal node even when it already existed,
                    }                     // so words that are prefixes of earlier insertions are found
                    node = next;
                }
            }
        }
    }

    class Program
    {
        static void GenWords(Trie.Node n, HashSet<Letter>[] sets, int currentArrayIndex, List<string> wordsFound)
        {
            if (currentArrayIndex < sets.Length)
            {
                foreach (var edge in n.Edges)
                {
                    if (sets[currentArrayIndex].Contains(edge.Key))
                    {
                        if (edge.Value.IsTerminal)
                        {
                            wordsFound.Add(edge.Value.Word);
                        }
                        GenWords(edge.Value, sets, currentArrayIndex + 1, wordsFound);
                    }
                }
            }
        }

        static void Main(string[] args)
        {
            const int minArraySize = 3;
            const int maxArraySize = 4;
            const int setCount = 10;
            const bool generateRandomInput = true;
            var trie = new Trie(File.ReadAllLines("sowpods.txt"));
            var watch = new Stopwatch();
            var trials = 10000;
            var wordCountSum = 0;
            var rand = new Random(37);
            for (int t = 0; t < trials; t++)
            {
                HashSet<Letter>[] sets;
                if (generateRandomInput)
                {
                    sets = new HashSet<Letter>[setCount];
                    for (int i = 0; i < setCount; i++)
                    {
                        sets[i] = new HashSet<Letter>();
                        var size = minArraySize + rand.Next(maxArraySize - minArraySize + 1);
                        while (sets[i].Count < size)
                        {
                            sets[i].Add(Letter.Chars[rand.Next(Letter.Chars.Length)]);
                        }
                    }
                }
                else
                {
                    sets = new HashSet<Letter>[] {
                        new HashSet<Letter>(new Letter[] { 'P', 'Q', 'R', 'S' }),
                        new HashSet<Letter>(new Letter[] { 'A', 'B', 'C' }),
                        new HashSet<Letter>(new Letter[] { 'T', 'U', 'V' }),
                        new HashSet<Letter>(new Letter[] { 'M', 'N', 'O' }) };
                }
                watch.Start();
                var wordsFound = new List<string>();
                for (int i = 0; i < sets.Length - 1; i++)
                {
                    GenWords(trie.Root, sets, i, wordsFound);
                }
                watch.Stop();
                wordCountSum += wordsFound.Count;
                if (!generateRandomInput && t == 0)
                {
                    foreach (var word in wordsFound)
                    {
                        Console.WriteLine(word);
                    }
                }
            }
            Console.WriteLine("Elapsed per trial = {0}", new TimeSpan(watch.Elapsed.Ticks / trials));
            Console.WriteLine("Average word count per trial = {0:0.0}", (float)wordCountSum / trials);
        }
    }
}
Here is the output when using your test data:
PA
PAT
PAV
QAT
RAT
RATO
RAUN
SAT
SAU
SAV
SCUM
AT
AVO
BUM
BUN
CUM
TO
UM
UN
Elapsed per trial = 00:00:00.0000725
Average word count per trial = 19.0
And the output when using random data (does not print each word):
Elapsed per trial = 00:00:00.0002910
Average word count per trial = 62.2
EDIT: I made it much faster with two changes: storing the word at each terminal node of the trie, so that it doesn't have to be rebuilt, and storing the input letters as an array of hash sets instead of an array of arrays, so that the Contains() call is fast.
There are probably many ways of solving this.
What you are interested in is the number of each character you have available to form a word, and how many of each character is required for each dictionary word. The trick is how to efficiently look up this information in the dictionary.
Perhaps you can use a prefix tree (a trie), some kind of smart hash table, or similar.
Anyway, you will probably have to try out all your possibilities and check them against the dictionary. I.e., if you have three arrays of three values each, there will be 3^3+3^2+3^1=39 combinations to check out. If this process is too slow, then perhaps you could stick a Bloom filter in front of the dictionary, to quickly check if a word is definitely not in the dictionary.
EDIT: Anyway, isn't this essentially the same as Scrabble? Perhaps try Googling for "scrabble algorithm" will give you some good clues.
The reformulated question can be answered just by generating and testing. Since you have 4 letters and 10 arrays, you've only got about 1 million possible combinations (10 million if you allow a blank character). You'll need an efficient way to look them up; use a BDB or some sort of disk-based hash.
The trie solution previously posted should work as well; you are just restricted more in what characters you can choose at each step of the search. It should be faster, too.
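A minimal sketch of the generate-and-test idea (in Java; an in-memory HashSet stands in for the disk-based lookup suggested above, and every starting array is tried so that words like "at", which skip the first array, are found):
import java.util.*;

class GenerateAndTest {
    static List<String> findWords(char[][] arrays, Set<String> dictionary) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < arrays.length; start++) {
            extend(arrays, start, new StringBuilder(), dictionary, found);
        }
        return found;
    }

    private static void extend(char[][] arrays, int i, StringBuilder prefix,
                               Set<String> dictionary, List<String> found) {
        if (i == arrays.length) return;
        for (char c : arrays[i]) {
            prefix.append(c);
            if (prefix.length() > 1 && dictionary.contains(prefix.toString())) {
                found.add(prefix.toString());                 // test the current combination
            }
            extend(arrays, i + 1, prefix, dictionary, found); // generate longer ones
            prefix.deleteCharAt(prefix.length() - 1);         // backtrack
        }
    }
}
For a single starting array, with three arrays of three letters each, this generates exactly the 3^1 + 3^2 + 3^3 = 39 combinations mentioned in the previous answer.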
I just made a very large nested for loop like this:
for (NSString *s1 in [letterList objectAtIndex:0]) {
    for (NSString *s2 in [letterList objectAtIndex:1]) {
        // 8 more times...
    }
}
Then I do a binary search on the combination to see if it is in the dictionary, and add it to an array if it is.
