is this the best complexity I can get?

is this the best complexity I can get? - algorithm

The problem goes as follows:
you have n domino pieces and the two numbers on the every domino piece (n pieces), also there is an extra set of m domino pieces but you can use only one piece from this set (at most) to help you do the following:
calculate the minimum number of domino pieces that you can use to go from a given starting point S to an ending point D.
meaning that the starting piece should have the number (S) and the ending piece should have the number (D).
Input:
n and the domino pieces' numbers (n pairs).
m and the extra domino pieces' numbers (m pairs).
starting point S and a destination D.
Output:
the minimum number of domino pieces.
I am thinking of using BFS for this problem where I can start from S and find the minimum path to D with constantly removing node m(i) from the graph and adding node m(i+1)
but doing this the time complexity will be O(n * m).
but not only this, there could be multiple starting points S so the complexity would be O(|S| * n * m).
can it be solved in a better way?
The teacher said it could be solved in a Linear Time but I am just very confused.

I initially missed that your question allows multiple sources, and wrote a somewhat long answer explaining how to approach that problem. Let me post it here anyway, because it might still be helpful. Scroll further for the solution to the original question.
Finding shortest paths from single S to D in linear time
Let's build the idea incrementally. First, let's try to solve a simpler version of the problem, where we just need to find out whether you can get from a single S to a single D at all by using at most one domino from the set of extra M dominoes.
I suggest to approach it this way: do some preprocessing on the N dominoes that will let you, for each of the M additional dominoes, quickly (in constant time) answer whether there exists a path from S to D that goes through this domino. (And of course we need to remember the edge case when we don't need an extra domino at all, but it's easy to cover in linear time.)
What kind of information lets you answer this question? Let's say you are looking at a domino with numbers A and B on its ends. If you knew that you can get from S to A, and from B to D, you use this domino to get from S to D, right? Alternatively, if there was a path from S to B and from A to D, it would also work. If neither is true, then there is no way this domino can help you to get from S to D.
That's great, but if we run BFS from every possible B, we won't achieve linear time complexity. However, note that you can reverse the second problem (detecting if a path from B's to D exists) and pose it as "can I get from D to every possible B"? That is easily answered with a single BFS.
Can you see how this approach can be adapted to find length of the shortest path through each domino, as opposed to just detecting if a path exists?
Finding shortest paths from multiple S to D in linear time
Let's reverse the problem and say we want to find the shortest paths from D to multiple S. You could create a directed graph with twice as many nodes as there were unique numbers written on dominoes. That is, for each number there are nodes V and V', and if you are in V, it means you haven't used an extra domino yet, but if you are in V', it means you already used one. Each core (that is, one of the original N) domino (A, B) corresponds to 4 edges in this graph: (A -> B), (B -> A), (A' -> B'), (B' -> A'). Each extra domino corresponds to 2 edges: (A -> B'), (B -> A'). Note that once we get into a node with ', we can never get out of it, so we will only use at most one extra domino this way. A single BFS from D in this graph will answer the problem.

Related

Dynamic Programming / Subproblems + Transition

I am kind of stuck, I decided to try this problem https://icpcarchive.ecs.baylor.edu/external/71/7113.pdf
to prevent it 404'ing here's the basic assignment
a hopper only visits arrays with integer entries,
• a hopper always explores a sequence of array elements using the following rules:
– a hopper cannot jump too far, that is, the next element is always at most D indices away
(how far a hopper can jump depends on the length of its legs),
– a hopper doesn't like big changes in values — the next element differs from the current
element by at most M, more precisely the absolute value of the difference is at most M (how
big a change in values a hopper can handle depends on the length of its arms), and
– a hopper never visits the same element twice.
• a hopper will explore the array with the longest exploration sequence.
n is the length of the array (as described above, D is the maximum length of a jump the
hopper can make, and M is the maximum difference in values a hopper can handle). The next line
contains n integers — the entries of the array. We have 1 ≤ D ≤ 7, 1 ≤ M ≤ 10, 000, 1 ≤ n ≤ 10, 000
and the integers in the array are between -1,000,000 and 1,000,000.
EDIT: I am doing this out of pure curiosity this is not a assignment I need to do for any particular reason other than challenging myself
basically its building a sparse graph out of an array,
the graph is undirected and due to the symmetry of the -d ... d jumps, its also either a complete graph (all edges are included) or mutually disjoint graph components
As first step I tried to simply exhaustive DFS search the graph, which works but has the infamous O(n!) runtime, the first iteration of this was written in F# which was horrible slow the second in C which still plateaus pretty fast too
so I know the longest path problem is NP hard but I thought I would give it a try with dynamic programming
The next approach was to simply use the common DP solution (bitmasked path) to DFS on the graph but at this at this point I already traversed the array and built the entire graph which may contain up to 1000 nodes so its not feasible
My next approach was to build a DFS Tree (tree of all the paths) which is a bit faster but then needs to store all entire path in memory for each iteration already which isn't what I really want, I am thinking I can reduce it to substates while already traversing the array
next I tried to memoize all paths I've already walked by simply using a bitmask and a simple memoization functions as seen here:
let xf = memoizedEdges (fun r i' p mask ->
let mask' = (addBit i' mask)
let nbs = [-d .. -1] # [ 1 .. d]
|> Seq.map (fun f -> match f with
| x when (i' + x) < 0 -> None
| x when (i' + x) >= a.Length -> None
| x when (diff a.[i'+x] a.[i']) > m -> None
| x when (i' + x) = i -> None
| x when (isSet (i'+x) mask') -> None
| x -> Some (i' + x )
)
let ec = nbs
|> Seq.choose id
|> Seq.toList
|> List.map (fun f ->
r f i' mask'
)
max (bitcount mask) (ec |> mxOrZero)
)
So memoized edges works by 3 int parameters the current index (i'), the previous (p) and the path as bitmask, the momizedEdges function itself will check on each recursive call it if has seen i' and p and the mask ... or p and i' and the mask with the i' and p bits flipped to mask the path in the other way (basically if we have seen this path coming from the other side already)
this works as I would expect, but the assignment states its up to 1000 indices which would cause the int32 mask to be too short
so I've been thinking for days now and there must be a way to encode each of the -d ... d steps into a start and end vertice and calculate the path for each step in that window based on the previous steps
I've come up with basically this
0.) Create a container to hold starting and endvertex as key with the current pathlength as value
1.) Check neighbors of i
2.) Have I seen either this combination either as (from -> to) or (to -> from) then I do not add or increase
3.) Check whatever any other predecessors to this node exist and increase the path of those by 1
but this would lead to having all paths stored and I would basically result in tuples and then I am back at my graph with DFS in another form
I am very thankful for any pointers (I just need some new ideas I am really stuck rn) how I could encode each subproblem from -d..d that I can use just intermediate results for calculating the next step (if this is even possible)

Partial answer
This is a difficult problem. Indeed, on competitive programming problem compendium Kattis it is (at the time of writing) in the top 5 of most difficult problems.
Only you know if this sort of problem is possible for you to solve, but there is a fair chance no one on this site can help you completely, hence this partial answer.
Longest path
What we're asked to do here is solve the longest path problem for a particular graph. This problem is known to be NP-complete in general, even for undirected unweighted graphs as ours is. Because the graph can have 1000 vertices, a (sub-)exponential algorithm in N will not work, and we're likely not asked to prove that P=NP, so the only option we have left is to somehow exploit the structure of the graph.
The most promising avenue is through D. D is at most 7, because of which the maximum degree of the graph is at most 14, and all edges are—in a sense—local.
Now, according to Wikipedia, the longest path problem can be solved polynomially on various classes of graphs, such as noncyclic ones. Our graph is of course not noncyclic, but unfortunately this is largely where my knowledge ends. I am not sufficiently familiar with graph theory to see whether the implied graph of the problem is in any of the classes Wikipedia mentions.
Of particular note is that the longest path problem can be solved in polynomial time given bounded-by-a-constant clique-width (or tree-width, which implies the former). I am unable to confirm or prove that our graph has bounded clique-width because of the bound on D, but perhaps you yourself know more about this, or you could try asking on the math or CS stackexchange, as at this point we're pretty far from any actual programming.
Regardless, if you're able to confirm that the graph is clique-width-bounded, this paper may help you further.
I hope this answer is of some use despite not being entirely fulfilling, and good luck!
Citation for the paper in case of link decay
Fomin, F. V., Golovach, P. A., Lokshtanov, D., & Saurabh, S. (2009, January). Clique-width: on the price of generality. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms (pp. 825-834). Society for Industrial and Applied Mathematics.

Pathfinding task - how can I find next vertex on the shortest path from A to B faster that O ( n )?

I have a quite tricky task to solve:
You are given a N * M board (1 <= N, M <= 256). You can move from each field to it's neighbouring field (moving diagonally is not allowed). At the beginning, there are two types of fields: active and blocked. You can pass through active field, but you can't go on the blocked one. You have Q queries (1 <= Q <= 200). There are two types of queries:
1) find the next field (neighbouring to A) that lies on the shortest path from field A to B
2) change field A from active to blocked or conversly.
The first type query can be easily solved with simple BFS in O(N * M) time. We can represent active and blocked fields as 0 or 1, so the second query could be done in constant time.
The total time of that algorithm would be O(Q (number of queries) * N * M).
So what's the problem? I have a 1/60 second to solve all the queries. If we consider 1 second as 10^8 calculations, we are left with about 1,5 * 10^6 calculations. One BFS may take up to N * M * 4 time, which is about 2,5 * 10^5. So if Q is 200, the needed calculations may be up to 5 * 10^7, which is way too slow.
As far as I know, there is no better pathfinding algorithms than BFS in this case (well, I could go for an A*, but I'm not sure if it's much quicker than BFS, it's still worst-case O(|E|) - according to Wikipedia ). So there's not much to optimize in this area. However, I could change my graph in some way to reduce the amount of edges that the algorithm would have to process (I don't need to know the full shortest path, only the next move I should make, so the rest of the shortest path can be very simplified). I was thinking about some preprocessing - grouping vertices in a groups and making a graph of graphs, but I'm not sure how to handle the blocked fields in that way.
How can I optimize it better? Or is it even possible?
EDIT: The actual problem: I have some units on the board. I want to start moving them to the selected destination. Units can't share the same field, so one can block others' paths or open a new, better paths for them. There can be a lot of units, that's why I need a better optimization.

If I understand the problem correctly, you want to find the shortest path on a grid from A to B, with the added ability that your path-finder can remove walls for an additional movement cost?
You can treat this as a directed graph problem, where you can move into any wall-node for a cost of 2, and into any normal node for a cost of 1. Then just use any directed-graph pathfinding algorithm such as Dijkstra's or A* (the usual heuristic, manhatten distance, will still work)

Select the most elements that do not overlap so that the sum of their size is maximized

I'm trying to find an algorithm to the following problem.
Say I have a number of objects A, B, C,...
I have a list of valid combinations of these objects. Each combination is of length 2 or 4.
For eg. AF, CE, CEGH, ADFG,... and so on.
For combinations of two objects, eg. AF, the length of the combination is 2. For combination of four objects, eg CEGH, the length of the combination.
I can only pick non-overlapping combinations, i.e. I cannot pick AF and ADFG because both require objects 'A' and 'F'. I can pick combinations AF and CEGH because they do not require common objects.
If my solution consists of only the two combinations AF and CEGH, then my objective is the sum of the length of the combinations, which is 2 + 4 = 6.
Given a list of objects and their valid combinations, how do I pick the most valid combinations that don't overlap with each other so that I maximize the sum of the lengths of the combinations? I do not want to formulate it as an IP as I am working with a problem instance with 180 objects and 10 million valid combinations and solving an IP using CPLEX is prohibitively slow. Looking for some other elegant way to solve it. Can I perhaps convert this to a network? And solve it using a max-flow algorithm? Or a Dynamic program? Stuck as to how to go about solving this problem.

My first attempt at showing this problem to be NP-hard was wrong, as it did not take into account the fact that only combinations of size 2 or 4 were allowed. However, using Jim D.'s suggestion to reduce from 3-dimensional matching (3DM), we can show that the problem is nevertheless NP-hard.
I'll show that the natural decision problem form of your problem ("Given a set O of objects, and a set C of combinations of either 2 or 4 objects from O, and an integer m, does there exist a subset D of C such that all sets in D are pairwise disjoint, and the union of all sets in D has size at least m?") is NP-hard. Clearly the optimisation problem (i.e., your original problem, where we seek an actual subset of combinations that maximises m above) is at least as hard as this problem. (To see that the optimisation problem is not "much" harder than the decision problem, notice that you could first find the maximum m value for which a solution exists using a binary search on m in which you solve a decision problem at each step, and then, once this maximal m value has been found, solving a series of decision problems in which each combination in turn is removed: if the solution after removing some particular combination is still "YES", then it may also be left out of all future problem instances, while if the solution becomes "NO", then it is necessary to keep this combination in the solution.)
Given an instance (X, Y, Z, T, k) of 3DM, where X, Y and Z are sets that are pairwise disjoint from each other, T is a subset of X*Y*Z (i.e., a set of ordered triples with first, second and third components from X, Y and Z, respectively) and k is an integer, our task is to determine whether there is any subset U of T such that |U| >= k and all triples in U are pairwise disjoint (i.e., to answer the question, "Are there at least k non-overlapping triples in T?"). To turn any such instance of 3DM into an instance of your problem, all we need to do is create a fresh 4-combination from each triple in T, by adding a distinct dummy value to each. The set of objects in the constructed instance of your problem will consist of the union of X, Y, Z, and the |T| dummy values we created. Finally, set m to k.
Suppose that the answer to the original 3DM instance is "YES", i.e., there are at least k non-overlapping triples in T. Then each of the k triples in such a solution corresponds to a 4-combination in the input C to your problem, and no two of these 4-combinations overlap, since by construction, their 4th elements are all distinct, and by assumption of the. Thus there are at least m = k non-overlapping 4-combinations in the instance of your problem, so the solution for that problem must also be "YES".
In the other direction, suppose that the solution to the constructed instance of your problem is "YES", i.e., there are at least m non-overlapping 4-combinations in C. We can simply take the first 3 elements of each of the 4-combinations (throwing away the fourth) to produce a set of k = m non-overlapping triples in T, so the answer to the original 3DM instance must also be "YES".
We have shown that a YES-answer to one problem implies a YES-answer to the other, thus a NO-answer to one problem implies a NO-answer to the other. Thus the problems are equivalent. The instance of your problem can clearly be constructed in polynomial time and space. It follows that your problem is NP-hard.

You can reduce this problem to the maximum weighted clique problem, which is, unfortunately, NP-hard.
Build a graph such that every combination is a vertex with weight equal to the length of the combination, and connect vertices if the corresponding combinations do not share any object (i.e. if you can pick both them at the same time). Then, a solution is valid if and only if it is a clique in that graph.
A simple search on google brings up a lot of approximation algorithms for this problem, such as this one.

Minimum sum of multiple source-destination path

Given an undirected and unweighted graph, I want to find out the minimum sum of k pair of source-destination path, under the constraint that each vertex is used at most once.
For example, the following graph with 2 pair source-destination (A, E) and (B, F) has minimum sum of 7 steps. That is, (A > G > H > I > E) = 4 steps and (B > C > D > F) = 3 steps.
http://i.imgur.com/W4uNiac.png
It is obviously that greedy method of finding all these k path sequentially will result in suboptimal solution. For example, there exists a solution of 8 steps with (A > C > D > E) = 3 steps and (B > J > K > L > M > F) = 5 steps.
I have consider to model this problem into minimum cost maximum flow problem. However, such multiple source-destination pair cannot always be distinguished. For example, if k = 2 and the two pairs are (A, B) and (C, D), the solution applying MCMF may result in (A > ... > D) and (B > ... > C) under some situation, which is apparently not the solution I want.
Is there any idea about this problem?
Thanks in advance! :)

This looks like a modified multi-agent path planning problem. Normally agents are just restricted not to collide or cross paths, but in your version they cannot share vertices anywhere. It wouldn't be hard to modify the CBS algorithm to solve this problem.
I'm going to call your k paths agents to help describe a solution.
The basic idea is that you use a binary search tree to find the solution. Start by finding the shortest path for each of the k agents independently. Then, you check to see if you have any collisions between agents in the solved paths. When you find a node shared by two paths, you branch in the tree. On the right side of the tree the first agent is forbidden from using that node. On the left side of the tree the second agent is forbidden from using the node. Then, the agent that has been given a new constraint is required to re-search without using the given node.
Nodes in the tree are expanded in a best-first order according to the total path cost of all agents. When you find a solution, it is guaranteed to be optimal.
There are other algorithms you could use, but this one is pretty simple as it keeps the planning decoupled. The efficiency depends on the exact nature of the problems you are solving.
If I had to guess, the problem is NP-Hard or NP-Complete - easy to verify solutions, but the given algorithm runs in exponential time.

Understanding the AStar Algorithm

From this link: Link
If an adjacent square is already on the open list, check to see if
this path to that square is a better one. In other words, check to see
if the G score for that square is lower if we use the current square
to get there. If not, don’t do anything.
Example:
parent (already traversed): O
branches: A, B, C
parent (working on): A
Branches: B, D
The open list contains, A, B, C and now D.
Now, the bold statements in the above quote are comparing which path with, path A to B?
i.e.
Comparison of A to B && O to B
OR
Comparison of A to B && A to D
Please clarify.

Basic A* is:
While we're not close enough to the / a destination
Take the point that we have now, with the lowest expected total cost (so known cost up to this point plus estimated remaining cost).
Calculate cost for all surrounding locations & add them back to the priority queue with their expected total cost.
return route that we followed to the closest point
In the official A* terminology, G-score is the cost to get there. H-score is the estimate to get from there to where you want to go.
Extremes are if your H-score always overestimates; the new score (estimate + new cost) will then always be lower than the previous square's estimate + cost, so you'll beeline to the target. If your H-score always underestimates (or is 0, or whatever) you'll always prefer squares closer to your point of departure, as they'll have a lower cost so far, so you'll basically floodfill from that position.
You can use A* from a theoretical point of view or from a practical point of view. Theoretically, you may never overestimate the cost of any link. That means that you will always find the most efficient route, but will take longer finding it as you'll expand a lot more nodes.
Using it from the practical point of view allows slightly-inadmissible heuristics (ones that can overestimate slightly). The route you find will most likely be slightly unoptimal but shouldn't be too bad for game use. The calculations get way faster though, since you don't expand everything anymore. One (slow) implementation I had made took 6 minutes across a 1k*1k map with a regular trig distance heuristic but only a few seconds with that augmented times 3. The routes weren't too different. Making the heuristic times 5 made the routes basically a beeline & much faster still, but unusably so.
WRT your direct question:
It's the second square you evaluate. You have the roads from O to A,B,C planned (with a given cost G for each of those routes). Now, suppose there's a highway from O through A to B, instead of a dirt road from O to B. That means that going through A is faster. When expanding A it looks at all surrounding squares and adds the cost + heuristic to the open list if it wasn't in there already. If it was in there, it sees if the new route is faster (lower G to get to that square) and only if so, does it replace it.
So in your example, it adds O->A->D to the list and then checks if O->A->B is faster than O->B.
-- addendum
Open list: 2 / A (through O), 5 / B (through O), 7 / C (through O)
Closed list: 0 / O (origin / through nothing)
Take A as it's the lowest cost so far. Then, for each neighbor of A calculate the cost to get there.
If there's an entry for it already and the cost is lower than we know so far, replace the entry.
If there's no current entry, add it.
Working on A, with roads of cost 2 to B and 3 to D. Going to B through A has a cost of 4, where the current entry is 5. So we replace it. D isn't in there, so we add it.
Open list: 4 / B (through A), 5 / D (through A), 7 / C (through O)
Closed list: 0 / O (origin / through nothing), 2 / A (through O)
So we compared A to B versus O to B.

Well, if we are working on the node A, we are considering its neighbours. Say we are considering the B node now (which is in the openList).
The definition given tells us to compare the G value of B (previously computed when it was first added to open list when the working node was O) and the sum of the cost of reaching A from the begining (O) and the cost of reaching B from A.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio