Minimize the difference of the distance between points on the line - algorithm

My problem is as follows:
Given n points on a line segment and a threshold k, pick a subset of the points so as to minimize the average difference between the distance separating each pair of consecutive chosen points and the threshold.
For example:
If we were given an array of points n = [0, 2, 5, 6, 8, 9], k = 3
Output: [0, 2, 6, 9]
Explanation: when we choose this path, the distances between consecutive points are [2, 4, 3], so the differences from the threshold are [1, 1, 0], which average about 0.67.
If I chose [0, 2, 5, 8, 9], the differences would be [1, 0, 0, 2], which averages to .75.
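The arithmetic in the two examples can be checked with a small helper. This is only a sketch of the scoring function, not a solution to the selection problem; the name `average_deviation` is made up for illustration.

```python
def average_deviation(path, k):
    """Average |gap - k| over consecutive points of a chosen path."""
    # Distances between each pair of consecutive chosen points.
    gaps = [b - a for a, b in zip(path, path[1:])]
    if not gaps:
        raise ValueError("a path needs at least two points")
    return sum(abs(gap - k) for gap in gaps) / len(gaps)


print(average_deviation([0, 2, 6, 9], 3))     # about 0.667
print(average_deviation([0, 2, 5, 8, 9], 3))  # 0.75
```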
I understand enough dynamic programming to consider several solutions, including memoization and depth-first search, but I was hoping someone could offer a specific algorithm with the best efficiency.

Related

The best solution (considering time complexity) for the function implementation

A function performs the following task.
Given, for example, L = [[1, 2, 3], [1, 2], [1, 2, 3, 5, 6, 8], [1, 8, 6, 10, 21], [1, 4, 6, 9], [22]] (an array of arrays),
find the index of a sub-array of L whose numbers all appear in no other sub-array. In this example, the function would return 5 (the index of [22]) because 22 appears only in this sub-array.
What would be the optimal solution in terms of time complexity?
The algorithm is to keep track of all the numbers you've seen so far (for example in a hashset), and process the sub-arrays one by one until you find one which matches your condition. In the worst case it's O(n) basic set operations, where n is the sum of the lengths of the subarrays of L. This is O(n) comparisons on average if you use a hashset.
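One way to make this concrete is a two-pass variant: first count how many sub-arrays contain each number, then return the first sub-array whose numbers all have a count of one. This is a sketch under that assumption, not necessarily the exact procedure the answer had in mind; the function name and the choice to return None when nothing qualifies are illustrative.

```python
from collections import Counter


def index_of_unique_subarray(arrays):
    """Return the index of the first sub-array whose numbers occur in no other sub-array."""
    # For each number, count how many distinct sub-arrays contain it.
    containing = Counter(x for sub in arrays for x in set(sub))
    for index, sub in enumerate(arrays):
        # Empty sub-arrays are skipped (an assumption; the question does not cover them).
        if sub and all(containing[x] == 1 for x in sub):
            return index
    return None


L = [[1, 2, 3], [1, 2], [1, 2, 3, 5, 6, 8], [1, 8, 6, 10, 21], [1, 4, 6, 9], [22]]
print(index_of_unique_subarray(L))  # 5
```

Both passes touch each number a constant number of times, so the expected cost stays linear in the total number of elements when hashing behaves well.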

What is the most efficient way to split an array of numbers, such that sum of each subset is as close to a target as possible, without exceeding it?

I am faced with this optimization challenge:
Take for example the array, [1, 2, 4, 3, 3, 6, 2, 1, 6, 7, 4, 2]
I want to split this into multiple sub-arrays, such that their sums are as close as possible to a target sum, say 7.
The only condition I have is that the sums cannot be more than the target sum.
Using a greedy approach, I can split them as
[1, 2, 4], [3, 3, 1], [6], [2, 4], [6], [7], [2]
The subset sums are 7, 7, 6, 6, 6, 7 and 2.
Another approach I tried is as follows:
1. Sort the array in reverse (descending) order.
2. Set up a running total initialized to 0, and an empty subset.
3. If the list is empty, proceed to Step 6.
4. Going down the list, pick the first number which, when added to the running total, does not exceed the target sum. If no such element is found, proceed to Step 6; otherwise proceed to Step 5.
5. Remove this element from the list, add it to the subset, and update the running total. Repeat from Step 3.
6. Print the current subset, then clear the running total and subset. If the list isn't empty, repeat from Step 3; otherwise proceed to Step 7.
7. You're done!
This approach produced the following split:
[7], [6, 1], [6, 1], [4, 3], [4, 3], [2, 2, 2]
The subset sum was much more even: 7, 7, 7, 7, 7 and 6.
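The second approach can be sketched in Python as follows. The function name `split_by_target` is made up for illustration, and the sketch assumes every number is at most the target (otherwise it raises an error rather than looping forever). Because the list is sorted in descending order and the running total only grows, continuing the scan from the current position is equivalent to restarting from the top of the list.

```python
def split_by_target(numbers, target):
    """Greedy split: sort descending, fill each subset with the first numbers that fit."""
    remaining = sorted(numbers, reverse=True)
    subsets = []
    while remaining:
        subset, total = [], 0
        i = 0
        while i < len(remaining):
            if total + remaining[i] <= target:
                # The number fits: move it from the list into the current subset.
                total += remaining[i]
                subset.append(remaining.pop(i))
            else:
                i += 1
        if not subset:
            raise ValueError("an element exceeds the target sum")
        subsets.append(subset)
    return subsets


print(split_by_target([1, 2, 4, 3, 3, 6, 2, 1, 6, 7, 4, 2], 7))
# [[7], [6, 1], [6, 1], [4, 3], [4, 3], [2, 2, 2]]
```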
Is this the best strategy?
Any help is greatly appreciated!
I think you should use the terms "subset" and "sub-array" carefully. What you are looking for is "subset".
The best strategy here would be to write the recursive solution that tries each possibility of forming a subset so that the sum remains <= maximum allowed sum.
If you look carefully at what the recursion does, you'll see that some sub-problems are solved again and again. So you can store (memoize) the solutions to those sub-problems and reuse them. Reading about dynamic programming will help you here.

Partition an array into subsets of fixed size with minimum sum difference

I have found versions of this problem for either 2 subsets of half the size of the original array or with any number of subsets of any length. Does anybody have any pointers to any good solution for this problem? (can be greedy)
Given an array of positive numbers of length N (can have repetitions)
Partition the N numbers into subsets of length M with their sum difference minimized.
Simple example:
N=9, M=2
[5, 2, 3, 7, 5, 3, 7, 8, 1]
into
[[8, 1], [7, 2], [7, 3], [5, 5]] + [3] (leftover)
Sums: 9, 9, 10, 10
My real world use case is to group files of different sizes into batches of a given length but having the total size of each batch be as close as possible to that of the other batches.
Thanks!

kth largest element in range interval

Given a list of overlapping intervals of integers, I need to find the kth largest element.
Example:
List { (3,4), (2,8), (4,8), (1,3), (7,9) }
These intervals represent the numbers
[3, 4], [2, 3, 4, 5, 6, 7, 8], [4, 5, 6, 7, 8], [1, 2, 3], and [7, 8, 9].
If we merge and sort it in decreasing order, we get
9, 8, 8, 8, 7, 7, 7, 6, 6, 5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 1
Now the 4th largest number in the list is 8.
Can anyone please explain an efficient algorithm (one that doesn't have to generate the list) to find the kth element given only a list of intervals?
Find the largest number by going through the intervals and examining their upper ends. In your case it is 9. Set L = 9 and a counter c = 0.
Check every interval (a, b) for whether it contains L, that is, a <= L && L <= b. Each time it does, mark the interval as visited and increment c. In your case only (7,9) contains 9, so c = 1.
If c is still below k, decrement the current largest number (L -= 1), clear the history of visited intervals, and repeat the check on all intervals.
As soon as c reaches k, the current largest number L is your answer.
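A minimal Python sketch of this counting approach, assuming intervals are given as inclusive (a, b) pairs with a <= b. The function name `kth_largest` is made up for illustration. It never materializes the merged list, but it does rescan all intervals for every candidate value, so it is a direct rendering of the steps rather than the fastest possible method.

```python
def kth_largest(intervals, k):
    """Return the kth largest value of the multiset covered by inclusive intervals."""
    if k < 1 or not intervals:
        raise ValueError("need k >= 1 and at least one interval")
    current = max(b for _, b in intervals)  # largest value present
    lowest = min(a for a, _ in intervals)   # smallest value present
    seen = 0
    while current >= lowest:
        # Each interval containing `current` contributes one copy of it.
        seen += sum(1 for a, b in intervals if a <= current <= b)
        if seen >= k:
            return current
        current -= 1
    raise ValueError("k exceeds the total number of elements")


print(kth_largest([(3, 4), (2, 8), (4, 8), (1, 3), (7, 9)], 4))  # 8
```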

Is this equivalent to insertion sort?

Say we have a 0-indexed sequence S. Take S[0], remove it from the list, and reinsert it at a place in S where the next value is higher and the previous value is lower. Formally, the element should end up at a position i where S[i-1] < S[i] < S[i+1]. Continue in order through the list, doing the same with every item. After one pass over the list, the list should be ordered.
I recently had an exam and I forgot insertion sort (don't laugh), so I did it like this. However, my professor marked it wrong. The algorithm, as far as I know, does produce a sorted list.
Works like this on a list:
Sorting [2, 8, 5, 4, 7, 0, 6, 1, 10, 3, 9]
[2, 8, 5, 4, 7, 0, 6, 1, 10, 3, 9]
[2, 8, 5, 4, 7, 0, 6, 1, 10, 3, 9]
[2, 5, 4, 7, 0, 6, 1, 8, 10, 3, 9]
[2, 4, 5, 7, 0, 6, 1, 8, 10, 3, 9]
[2, 4, 5, 7, 0, 6, 1, 8, 10, 3, 9]
[2, 4, 5, 0, 6, 1, 7, 8, 10, 3, 9]
[0, 2, 4, 5, 6, 1, 7, 8, 10, 3, 9]
[0, 2, 4, 5, 1, 6, 7, 8, 10, 3, 9]
[0, 1, 2, 4, 5, 6, 7, 8, 10, 3, 9]
[0, 1, 2, 4, 5, 6, 7, 8, 3, 9, 10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Got [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Since every time an element is inserted into the list up to (n-1) numbers may be moved, and we must do this n times, the algorithm should run in O(n^2) time.
I had a Python implementation but I misplaced it somehow. I'll try to write it again in a bit, but it's kinda tricky to implement. Any ideas?
The Python implementation is here: http://dpaste.com/hold/522232/. It was written by busy_beaver from reddit.com when it was discussed here http://www.reddit.com/r/compsci/comments/ejaaz/is_this_equivalent_to_insertion_sort/
It's a while since this was asked, but none of the other answers contains a proof that this bizarre algorithm does in fact sort the list. So here goes.
Suppose that the original list is $v_1, v_2, \ldots, v_n$. Then after $i$ steps of the algorithm, I claim that the list looks like this:
$w_{1,1}, w_{1,2}, \ldots, w_{1,r(1)}, v_{\sigma(1)}, w_{2,1}, \ldots, w_{2,r(2)}, v_{\sigma(2)}, w_{3,1}, \ldots, w_{i,r(i)}, v_{\sigma(i)}, \ldots$
where $\sigma$ is the permutation that sorts $v_1$ to $v_i$ and the $w$ are elements $v_j$ with $j > i$. In other words, $v_1$ to $v_i$ are found in sorted order, possibly interleaved with other elements. Moreover, $w_{j,k} \le v_{\sigma(j)}$ for every $j$ and $k$. So each of the correctly sorted elements is preceded by a (possibly empty) block of elements less than or equal to it.
Here's a run of the algorithm, with the sorted elements in bold, and the preceding blocks of elements in italics (where non-empty). You can see that each block of italicised elements is less than the bold element that follows it.
[4, 8, 6, 1, 2, 7, 5, 0, 3, 9]
[4, 8, 6, 1, 2, 7, 5, 0, 3, 9]
[4, 6, 1, 2, 7, 5, 0, 3, 8, 9]
[4, 1, 2, 6, 7, 5, 0, 3, 8, 9]
[1, 4, 2, 6, 7, 5, 0, 3, 8, 9]
[1, 2, 4, 6, 7, 5, 0, 3, 8, 9]
[1, 2, 4, 6, 5, 0, 3, 7, 8, 9]
[1, 2, 4, 5, 6, 0, 3, 7, 8, 9]
[0, 1, 2, 4, 5, 6, 3, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
If my claim is true, then the algorithm sorts, because after $n$ steps all the $v_i$ are in order, and there are no remaining elements to be interleaved. But is the claim really true?
Well, let's prove it by induction. It's certainly true when $i = 0$. Suppose it's true for $i$. Then when we run the $(i + 1)$st step, we pick $v_{i+1}$ and move it into the first position where it fits. It certainly passes over all $v_j$ with $j \le i$ and $v_j < v_{i+1}$ (since these are sorted by hypothesis, and each is preceded only by smaller-or-equal elements). It cannot pass over any $v_j$ with $j \le i$ and $v_j \ge v_{i+1}$, because there's some position in the block before $v_j$ where it will fit. So $v_{i+1}$ ends up sorted with respect to all $v_j$ with $j \le i$. So it ends up somewhere in the block of elements before the next $v_j$, and since it ends up in the first such position, the condition on the blocks is preserved. QED.
However, I don't blame your professor for marking it wrong. If you're going to invent an algorithm that no-one's seen before, it's up to you to prove it correct!
(The algorithm needs a name, so I propose fitsort, because we put each element in the first place where it fits.)
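For readers who want to experiment, here is a minimal Python sketch of the procedure as this answer reads it: take each element in its original order, remove it, and reinsert it at the first position where it fits between its neighbours. It is not the linked dpaste implementation. Treating equal neighbours as a fit (using <=) and tagging each value with its original index to locate it later are assumptions of this sketch.

```python
def fitsort(values):
    """Move each original element, in original order, to the first position where it fits."""
    # Tag each value with its original index so it can be found after earlier moves.
    items = [(v, i) for i, v in enumerate(values)]
    for tagged in list(items):
        items.remove(tagged)  # tags are unique, so this removes exactly that element
        value = tagged[0]
        for pos in range(len(items) + 1):
            left_ok = pos == 0 or items[pos - 1][0] <= value
            right_ok = pos == len(items) or value <= items[pos][0]
            if left_ok and right_ok:
                items.insert(pos, tagged)
                break
    return [v for v, _ in items]


print(fitsort([4, 8, 6, 1, 2, 7, 5, 0, 3, 9]))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A fitting position always exists: the slot just before the first remaining element that is greater than or equal to the moved value (or the end of the list if there is none) satisfies both conditions.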
Your algorithm seems to me very different from insertion sort. In particular, it's very easy to prove that insertion sort works correctly (at each stage, the first however-many elements in the array are correctly sorted; proof by induction; done), whereas for your algorithm it seems much more difficult to prove this and it's not obvious exactly what partially-sorted-ness property it guarantees at any given point in its processing.
Similarly, it's very easy to prove that insertion sort always does at most n steps (where by a "step" I mean putting one element in the right place), whereas if I've understood your algorithm correctly it doesn't advance the which-element-to-process-next pointer if it's just moved an element to the right (or, to put it differently, it may sometimes have to process an element more than once) so it's not so clear that your algorithm really does take O(n^2) time in the worst case.
Insertion sort maintains the invariant that elements to the left of the current pointer are sorted. Progress is made by moving the element at the pointer to the left into its correct place and advancing the pointer.
Your algorithm does this, but sometimes it also does an additional step of moving the element at the pointer to the right without advancing the pointer. This makes the algorithm as a whole not an insertion sort, though you could call it a modified insertion sort due to the resemblance.
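For comparison, the insertion sort invariant described above (everything to the left of the pointer is sorted, and the element at the pointer only ever moves left) can be sketched like this. It is the usual textbook form, written to return a new list rather than sort in place.

```python
def insertion_sort(values):
    """Textbook insertion sort; returns a new sorted list."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        # Drop the current element into the gap; result[0..i] is now sorted.
        result[j + 1] = current
    return result


print(insertion_sort([2, 8, 5, 4, 7, 0, 6, 1, 10, 3, 9]))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```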
This algorithm runs in O(n²) on average, like insertion sort (and like bubble sort). The best case for insertion sort is O(n), on an already sorted list; for this algorithm the best case is also O(n), but on a reverse-sorted list, since you find the correct position for every element in a single comparison (but only if you leave the first, largest element in place at the beginning, when you can't find a good position for it).
A lot of professors are notorious for having the "that's not the answer I'm looking for" bug. Even if it's correct, they'll say it doesn't meet their criteria.
What you're doing seems like insertion sort, although using removes and inserts seems like it would only add unnecessary complexity.
What he might be saying is you're essentially "pulling out" the value and "dropping it back in" the correct spot. Your prof was probably looking for "swapping the value up (or down) until you find its correct location."
They have the same result but they're different in implementation. Swapping would be faster, but not significantly so.
I have a hard time seeing that this is insertion sort. With insertion sort, at each iteration one more element is placed correctly within the sorted part of the array. In your solution I do not see an element being "fully sorted" upon each iteration.
The insertion sort algorithm goes:
1. let pos = 1
2. if pos == arraysize then return
3. take the element at position pos and move it left past every larger element, so that the elements from 0 to pos are sorted
4. pos++
5. goto 2
