How to solve the below Dynamic Programming question - data-structures

I recently took a coding challenge on DP and data structures. Below is a question where I got stuck (I am reconstructing it from memory).
There are n towers, each of a different height, and adjacent towers are separated by unit distance. Find the maximum length of a rope that can be tied between two towers, where the length of the rope is the absolute difference between the two tower heights plus the distance between the towers.
As a beginner in programming, I used two nested for loops, computed the rope length for every pair of towers, and kept track of the maximum. But when I ran the program I got a time-out error (obviously the time complexity is O(n^2)). This might be easy for many of you, but as a beginner I am still looking for an approach that reduces the time complexity. Please share the logic or code in Python.
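For reference, a minimal sketch of what I did (the function name and example data are just for illustration):

```python
def max_rope_length_bruteforce(heights):
    """O(n^2) brute force: check every pair of towers."""
    n = len(heights)
    best = 0
    for i in range(n):
        for j in range(i + 1, n):
            # rope length = height difference + distance between the towers
            length = abs(heights[i] - heights[j]) + (j - i)
            best = max(best, length)
    return best

print(max_rope_length_bruteforce([3, 1, 6, 2]))  # 6
```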
Thank you very much in advance. Cheers!!!

Related

Finding a partition of a graph with no edges crossing partition

I have a graph with a guarantee that it can be divided into two equal-sized partitions (one side may be 1 larger than the other), with no edges across this partition. I initially thought this was NP-hard, but I suspect it might not be. Is there any (efficient) way of solving this?
It's possible to solve your problem in time O(n^2) by combining two well-known algorithms.
When I first saw your problem, I initially thought it was going to relate to something like finding a maximum or minimum cut in a graph. However, since you're specifically looking for a way of splitting the nodes into two groups where there are no edges at all running between those groups, I think what you're looking for is much closer to finding connected components of a graph. After all, if you break the graph apart into connected components, there will be no edges running between those components. Therefore, the question boils down to the following:
Find all the connected components of the graph, making a note of how many nodes are in each component.
Partition the connected components into two groups of roughly equal size - specifically, the size of the group on one side should be at most one more than the size of the group on the other.
Step (1) is something that you can do using a breadth-first or depth-first search to identify all the connected components. That will take you time O(m + n), where m is the number of edges and n is the number of nodes.
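A minimal sketch of step (1), assuming the graph is given as an adjacency list mapping each node to its neighbors (that representation is my assumption):

```python
from collections import deque

def component_sizes(adj):
    """Return the size of each connected component, found by BFS in O(m + n)."""
    seen = set()
    sizes = []
    for start in adj:
        if start in seen:
            continue
        # BFS over one component
        queue = deque([start])
        seen.add(start)
        count = 0
        while queue:
            u = queue.popleft()
            count += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(count)
    return sizes
```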
Step (2), initially, seems like it might be pretty hard. It's reminiscent of the partition problem, which is known to be NP-hard. The partition problem works like this: you're given as input a list of numbers, and you want to determine whether there's a split of those numbers into two groups whose totals are equal to one another. (It's possible to adapt this problem so that you can tolerate a split that's off by plus or minus one without changing the complexity). That problem happens to be NP-complete, which suggests that your problem might be hard.
However, there's a small nuance that actually makes the apparent NP-hardness of the partition problem not an issue. The partition problem is NP-hard in the case where the numbers you're given are written out in binary. On the other hand, if the numbers are written out in unary, then the partition problem has a polynomial-time solution. More specifically, there's an algorithm for the partition problem that runs in time O(kU), where k is the number of numbers and U is the sum of all those numbers. In the case of the problem you're describing, you know that the sum of the sizes of the connected components in your graph must be n, the number of nodes in the graph, and you know that the number of connected components is also upper-bounded by n. This means that the runtime of O(kU), plugging in k = O(n) and U = O(n), works out to O(n^2), firmly something that can be done in polynomial time.
(Another way to see this - there's a pseudopolynomial time algorithm for the partition problem, but since in your case the maximum possible sum is bounded by an actual polynomial in the size of the input, the overall runtime is a polynomial.)
The algorithm I'm alluding to above is a standard dynamic programming exercise. You pick some ordering of the numbers - not necessarily in sorted order - and then fill in a 2D table where each entry corresponds to an answer to the question "is there a subset of the first i numbers that adds up to exactly j?" If you're not familiar with this algorithm, I'll leave it up to you to work out the details, as it's a really beautiful problem to solve that has a fairly simple and elegant solution.
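A hedged sketch of that DP, specialized to this setting: given the component sizes, decide whether some subset of them sums to floor(n/2) or ceil(n/2). (Names are mine, and I use a 1D rolling array rather than the full 2D table, since each row of the table only depends on the previous row.)

```python
def balanced_split_exists(sizes):
    """Subset-sum DP: reachable[j] is True if some subset of the sizes processed
    so far sums to exactly j. Runs in O(kU) = O(n^2) time for this problem."""
    n = sum(sizes)
    reachable = [False] * (n + 1)
    reachable[0] = True
    for s in sizes:
        # iterate j downwards so each component is used at most once
        for j in range(n, s - 1, -1):
            if reachable[j - s]:
                reachable[j] = True
    # a split that's off by at most one exists iff floor(n/2) or ceil(n/2) is reachable
    return reachable[n // 2] or reachable[(n + 1) // 2]
```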
Overall, this algorithm will run in time O(n^2), since you'll do O(m + n) = O(n^2) work to find connected components, followed by time O(n^2) to run the partition problem DP to determine whether the split exists.
Hope this helps!

How to calculate the maximum number of iterations taken by perceptron learning algorithm?

I believe the perceptron learning algorithm has an upper bound on the number of iterations it takes to converge on linearly separable data. I looked for the exact formula that would tell me how many iterations this takes, but it was not on Wikipedia.
I read online that this quantity depends on the number of data samples, so if I have n (say 5000) samples, how many iterations will it take for the perceptron to converge (assuming the data is linearly separable)? Or is it not that straightforward, and does it depend on the data itself?
P.S. Very new to machine learning, hence a simple question.
Wikipedia refers to a proof by Novikoff, A. B. (1962), one of the early results in statistical learning.
Here is a slightly different form of the result, stated for the online-learning setting:
When ||x_i|| ≤ D for all i (intuitively, D is the radius of the training instances) and the data is separable with margin γ > 0,
the number of mistaken updates is bounded above by (D/γ)^2.
Interesting consequences:
the convergence result is independent of the number of samples!
the convergence result is independent of the input dimension!
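To make this concrete, a minimal sketch (my own, not part of the original answer) that runs the classic perceptron on a toy separable dataset and counts the updates; for separable data that count never exceeds (D/γ)^2:

```python
import numpy as np

def perceptron_updates(X, y):
    """Run the perceptron until a full pass makes no mistakes; return the update count."""
    w = np.zeros(X.shape[1])
    updates = 0
    while True:
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                w += yi * xi
                updates += 1
                mistakes += 1
        if mistakes == 0:
            return updates

# Toy linearly separable data with labels in {-1, +1}.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print(perceptron_updates(X, y))  # small, and always <= (D / gamma)**2
```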

Bin packing algorithm

I have a kitchen heating meals from frozen, and it needs to produce meals to match a head-count order. The meals come in frozen portion sizes such as 4s, 6s, etc., and the larger sizes have a lower cost per unit. Allowing for waste, how do I calculate which sizes to use to complete an order at the lowest cost?
This problem sounds like the knapsack problem to me. I'm assuming that a greedy algorithm will not work here, and the overlapping subproblems suggest dynamic programming: determine the minimum cost for a given head count by considering all possible combinations of meal portions that satisfy that head count.
I'm only pointing you in the right direction because this sounds like it might be homework. Either way, this problem sounds like it can be reduced to one with a well-known solution.
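A hedged sketch of the kind of DP described above, treating it as a minimum-cost covering problem where dp[c] is the cheapest way to produce at least c meals (the portion sizes, prices, and names below are illustrative assumptions):

```python
import math

def min_cost_order(head_count, portions):
    """portions: list of (size, cost) pairs. dp[c] = minimum cost to cover at
    least c meals, allowing waste (overshooting the head count is fine)."""
    dp = [math.inf] * (head_count + 1)
    dp[0] = 0.0
    for c in range(1, head_count + 1):
        for size, cost in portions:
            prev = max(0, c - size)          # adding this portion covers at least c
            dp[c] = min(dp[c], dp[prev] + cost)
    return dp[head_count]

# e.g. packs of 4 at 5.00 and packs of 6 at 6.60 (cheaper per unit), order for 10 heads
print(min_cost_order(10, [(4, 5.00), (6, 6.60)]))  # 11.6 -> one 4-pack plus one 6-pack
```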

Dynamic Programming algorithms and real world usage

I have studied in the past the classical DP problems and algorithms (coins, longest increasing subsequence, longest common subsequence, etc).
I know that these algorithms have practical applications (e.g. genetic algorithms, just to name one). What I question, though, is whether these algorithms have practical applications in modern computer science, where the input size is very large and problems are not solvable on just one machine.
My point is that these algorithms are quite hard to parallelize (see, e.g., Parallel Dynamic Programming), and memory usage is quadratic in most formulations, which makes it hard to process reasonably large inputs.
Does anyone have real-world use cases for this?
Practical application: diff. This is an essential Linux utility which finds the differences between two files by solving the longest common subsequence problem using the DP algorithm.
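As a small, hedged illustration in Python's standard library: difflib exposes this kind of sequence alignment (it uses a related matching heuristic rather than the textbook LCS DP, so treat this as a pointer, not the diff algorithm itself):

```python
import difflib

a = "dynamic programming"
b = "dynamite programmer"

# SequenceMatcher finds matching blocks between the two strings, which is the
# information a diff tool needs in order to report what changed.
matcher = difflib.SequenceMatcher(None, a, b)
for block in matcher.get_matching_blocks():
    print(block, repr(a[block.a:block.a + block.size]))
```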
DP algorithms are used because in many cases they are the only practical solution. And besides, there is nothing wrong with them.
Memory usage: Often, a sliding window can be used to reduce the memory usage dramatically. Fibonacci, when solved using a naive bottom-up DP, requires O(n) memory. A sliding window improves this to O(1) memory (I know of the magical constant time solution, but that's beside the point).
Parallelization: Top-down DPs are often easy to parallelize. Bottom-up DPs may or may not be. @amit's example (parallelizing longest common subsequence) is a good one, where any given diagonal's tiles can be solved independently as long as the previous diagonals are known.
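To illustrate the sliding-window point above, a minimal sketch: the naive bottom-up table keeps all n values, while the rolling version keeps only the last two:

```python
def fib_table(n):
    """Naive bottom-up DP: O(n) memory for the whole table."""
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

def fib_window(n):
    """Sliding window: only the last two values are kept, O(1) memory."""
    prev, cur = 0, 1
    for _ in range(n):
        prev, cur = cur, prev + cur
    return prev

assert fib_table(10) == fib_window(10) == 55
```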
The longest common subsequence and longest common substring problems are sometimes important for analyzing strings [analyzing gene sequences, for example], and they can be solved efficiently using dynamic programming.
Note that you can parallelize this algorithm: you process the table in iterations over its anti-diagonals [from bottom-left to top-right], for a total of 2n-1 iterations. Within a diagonal, no cell depends on any other cell of that diagonal, so the work can be parallelized, with each thread handling a block of cells in the diagonal.
Note that data synchronization with this method is also minimal: each thread only needs to transfer data to its "neighboring threads", so it can work even when memory is not shared.
Also, as @larsmans mentioned, both problems can use linear space: at each point you only need to "remember" the current and the two previous diagonals, not the entire matrix.
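A minimal sketch of the standard LCS DP these answers refer to (filled row by row here; the diagonal-order variant described above fills the same table in a different order, and keeping only two rows echoes the linear-space remark):

```python
def lcs_length(a, b):
    """Classic O(len(a) * len(b)) DP for the longest common subsequence length."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1      # extend a common subsequence
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[len(b)]

print(lcs_length("AGGTAB", "GXTXAYB"))  # 4 ("GTAB")
```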
Another common problem that is solved using dynamic programming is polynomial interpolation. The interpolation can be done efficiently using Newton interpolation, which first needs to calculate the divided differences - a table that is built up using dynamic programming.
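A small sketch of that divided-difference table, built bottom-up from the sample points (function and variable names are mine):

```python
def divided_differences(xs, ys):
    """Build the Newton divided-difference coefficients f[x0], f[x0,x1], ... via DP."""
    n = len(xs)
    coef = list(ys)
    # column by column: each entry depends on two entries from the previous column
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef  # coef[k] is the coefficient of the k-th Newton basis term

print(divided_differences([1.0, 2.0, 4.0], [1.0, 4.0, 16.0]))  # [1.0, 3.0, 1.0] for f(x) = x^2
```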

Find a bijection that best preserves distances

I have two spaces (not necessarily of equal dimension), each containing N points.
I am trying to find a bijection (pairing) of the points, such that the distances are preserved as well as possible.
I can't seem to find a discussion of possible solutions or algorithms to this question online. Can anyone suggest keywords that I could search for? Does this problem have a name, or does it come up in any domain?
I believe you are looking for a Multidimensional Scaling algorithm where you are minimizing the total change in distance. Unfortunately, I have very little experience in this area and can't be of much more help.
I haven't heard of the exact same problem. There are two similar types of problems:
Non-linear dimensionality reduction: you're given N high-dimensional points and you want to find N low-dimensional points that preserve distances as well as possible. MDS, mentioned by Michael Koval, is one such method.
This might be more promising: algorithms for the assignment problem. For example, in Kuhn-Munkres (the Hungarian algorithm) you're given an NxN matrix that encodes the cost of matching the i-th point of one space with the j-th point of the other, and you want to find the minimum-cost bijection. There are many generalizations of this problem, for example b-matching (Kuhn-Munkres solves 1-matching).
Depending on how you define "preserves distances as well as possible", I think you want either (2) or a generalization of (2) in which the cost depends not only on the two points being matched but also on the assignment of all the other points.
Finally, Kuhn-Munkres comes up everywhere in operations research.
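As a hedged sketch of suggestion (2) using SciPy's assignment-problem solver: the per-pair cost below (how differently each candidate pair sits relative to its own centroid) is just one illustrative choice, since a cost that looks at all pairwise distances jointly goes beyond plain 1-matching, as noted above:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy example: N points in 2D and N points in 3D (the dimensions need not match).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))
B = rng.normal(size=(5, 3))

# Illustrative per-pair cost: difference in distance-to-centroid within each space.
cost = np.abs(
    np.linalg.norm(A - A.mean(axis=0), axis=1)[:, None]
    - np.linalg.norm(B - B.mean(axis=0), axis=1)[None, :]
)

rows, cols = linear_sum_assignment(cost)   # minimum-cost bijection (Kuhn-Munkres style)
print(list(zip(rows, cols)), cost[rows, cols].sum())
```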
