Find overlapping appointments in O(n) time? - algorithm

I was recently asked this question in an interview. Even though I was able to come up with the O(n²) solution, the interviewer was fixated on an O(n) solution. I also checked a few other O(n log n) solutions, which I understood, but the O(n) solution, which assumes the appointments are already sorted by start time, is still not my cup of tea.
Can anyone explain this?
Problem Statement: You are given n appointments. Each appointment contains a start time and an end time. You have to return all conflicting appointments efficiently.
Person:    1  2   3   4   5
App Start: 2  4  29  10  22
App End:   5  7  34  11  36
Answer: appointment 2 conflicts with 1, and appointment 5 conflicts with 3.
O(n log n) algorithm: separate the start and end points like this:
2s, 4s, 29s, 10s, 22s, 5e, 7e, 34e, 11e, 36e
then sort all of these points (for simplicity let's assume each point is unique):
2s, 4s, 5e, 7e, 10s, 11e, 22s, 29s, 34e, 36e
if we see consecutive starts without an end in between, then there is an overlap:
2s and 4s are adjacent, so an overlap exists
We keep a count of open appointments: each time we encounter an "s" we increment it by 1, and each time we encounter an "e" we decrease it by 1. Whenever the count exceeds 1, at least two appointments overlap.
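A minimal sketch of this counting sweep (the container layout and the hard-coded example data are mine, for illustration):
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    // Appointments from the example above, as {start, end} pairs.
    std::vector<std::pair<int,int>> appts = {{2,5},{4,7},{29,34},{10,11},{22,36}};

    // Split into tagged events: +1 for a start ("s"), -1 for an end ("e").
    std::vector<std::pair<int,int>> events;
    for (const auto& a : appts) {
        events.push_back({a.first, +1});
        events.push_back({a.second, -1});
    }
    // O(n log n); at equal times an end sorts before a start, so touching
    // intervals do not count as overlapping.
    std::sort(events.begin(), events.end());

    int open = 0;  // how many appointments are currently open
    for (const auto& e : events) {
        open += e.second;
        if (open > 1)  // a start followed another start with no end between
            std::cout << "overlap at time " << e.first << "\n";
    }
}
Running it on the example prints an overlap at time 4 (appointments 1 and 2) and at time 29 (appointments 5 and 3).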

The general solution to this problem is not possible in O(n).
At a minimum you need to sort by appointment start time, which requires O(n log n).
There is an O(n) solution if the list is already sorted. The algorithm basically involves checking whether the next appointment is overlapped by any previous ones. There is a bit of subtlety to this one as you actually need two pointers into the list as you run through it:
The current appointment being checked
The appointment with the latest end time encountered so far (which might not be the previous appointment)
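Here is a rough sketch of that O(n) scan, assuming the input is already sorted by start time (the Appt struct and the conflict-reporting format are mine):
#include <iostream>
#include <vector>

struct Appt { int start, end, id; };

// Walk the sorted list once, remembering the appointment with the latest
// end time seen so far -- the second "pointer" described above.
void reportConflicts(const std::vector<Appt>& appts) {
    if (appts.empty()) return;
    size_t latest = 0;  // index of the latest-ending appointment so far
    for (size_t i = 1; i < appts.size(); ++i) {
        if (appts[i].start < appts[latest].end)  // overlaps an earlier one
            std::cout << appts[i].id << " conflicts with "
                      << appts[latest].id << "\n";
        if (appts[i].end > appts[latest].end)
            latest = i;  // note: not necessarily the previous appointment
    }
}
On the example above (sorted by start), this reports 2 conflicting with 1 and 3 conflicting with 5, with each appointment checked against a single candidate, hence O(n).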
O(n) solutions for the unsorted case could only exist if you have other constraints, e.g. a fixed number of appointment timeslots. If this was the case, then you can use HashSets to determine which appointment(s) cover each timeslot, algorithm roughly as follows:
Create a HashSet for each timeslot - O(1) since timeslot number is a fixed constant
For each appointment, store its ID number in HashSets of slot(s) that it covers - O(n) since updating a constant number of timeslots is O(1) for each appointment
Run through the slots, checking for overlaps - O(1) (or O(n) if you want to iterate over the overlapping appointments to return them as results)
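A sketch of that idea, assuming (hypothetically) 48 half-hour timeslots per day and appointments given as inclusive slot ranges; the sample data is made up:
#include <array>
#include <iostream>
#include <unordered_set>
#include <vector>

constexpr int kSlots = 48;  // assumed fixed constant of timeslots

int main() {
    // Hypothetical appointments as (firstSlot, lastSlot) pairs.
    std::vector<std::pair<int,int>> appts = {{4,9},{8,13},{20,21}};

    // One HashSet of appointment IDs per timeslot -- O(1) to create,
    // since the number of slots is a fixed constant.
    std::array<std::unordered_set<int>, kSlots> slots;
    for (int id = 0; id < (int)appts.size(); ++id)
        for (int s = appts[id].first; s <= appts[id].second; ++s)
            slots[s].insert(id);  // constant slots per appointment => O(n) total

    // Any slot holding more than one ID is an overlap.
    for (int s = 0; s < kSlots; ++s)
        if (slots[s].size() > 1) {
            std::cout << "slot " << s << " overlaps:";
            for (int id : slots[s]) std::cout << " appt " << id;
            std::cout << "\n";
        }
}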

Assuming you have some constraint on the start and end times, and on the resolution at which you do the scheduling, it seems like it would be fairly easy to turn each appointment into a bitmap of the times it does/doesn't use, then do a counting sort (a.k.a. bucket sort) on the in-use slots. Since both of those are linear, the result should be linear (though, if I'm thinking correctly, it should be linear in the number of time slots rather than the number of appointments).
At least if I asked this as an interview question, the main thing I'd be hoping for is the candidate to ask about those constraints (i.e., whether those constraints are allowed). Given the degree to which it's unrealistic to schedule appointments for 1000 years from now, or schedule to a precision of even a minute (not to mention something like a nanosecond), they strike me as reasonable constraints, but you should ask before assuming them.
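For what it's worth, a tiny sketch of the slot-count idea under such constraints (one-minute resolution over a single day; the sample data is made up):
#include <iostream>
#include <vector>

int main() {
    const int kMinutes = 24 * 60;  // assumed scheduling range and resolution
    // Hypothetical appointments as [start, end) minute offsets.
    std::vector<std::pair<int,int>> appts = {{30,90},{60,120},{600,660}};

    // Bucket/counting pass: how many appointments occupy each minute.
    std::vector<int> inUse(kMinutes, 0);
    for (const auto& a : appts)
        for (int t = a.first; t < a.second; ++t)
            ++inUse[t];  // marking each appointment's "bitmap" of minutes

    // One linear pass over the slots (not the appointments) finds conflicts.
    for (int t = 0; t < kMinutes; ++t)
        if (inUse[t] > 1)
            std::cout << "minute " << t << " is double-booked\n";
}
As noted above, the work is linear in the number of time slots (plus the total slot usage of the appointments), not in the number of appointments alone.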

A naive approach might be to build two parallel trees, one ordered by the beginning point, and one ordered by the ending point of each interval. This allows discarding half of each tree in O(log n) time, but the results must be merged, requiring O(n) time. This gives us queries in O(n + log n) = O(n).

This is the best I can think of, in horrible pseudocode. I attempted to reduce the problem as much as possible. This is only slightly better than O(n²) (I think).
Note that the output at the end will not show every appointment that a given appointment conflicts with on that appointment's specific output line... but every conflict is displayed at some point.
Also note that I renamed the appointments numerically in order of starting time.
output would be something like the following:
Appointment 1 conflicts with 2
Appointment 2 conflicts with
Appointment 3 conflicts with
Appointment 4 conflicts with 5
Appointment 5 conflicts with
appt:   {1}  {2}  {3}  {4}  {5}
start:   2    4   10   22   29
end:     5    7   11   36   34
pseudocode:
list = (1,2,3,4,5)
for (i=1; i<=5; i++)
    list.shift()                      // removes the first element
    appt{i}.conflictswith() = list    // copy of the remaining appointments
for (i=1; i<=n; i++)
    number = n
    done = false
    while (done == false)
        if (number > i)
            if (appt{i}.endtime() < appt{number}.starttime())
                appt{i}.conflictswith().pop()   // trailing appt starts too late to conflict
            else
                done = true
            number--
        else
            done = true
for (i=1; i<=n; i++)
    print "Appointment ", i, " conflicts with: ", appt{i}.conflictswith()

I came across a data structure called an interval tree, with the help of which we can find overlapping intervals in less than O(n log n) time, depending on the data provided.

Related

Big O runtime analysis for 3-way recursion with memoization

I'm doing some practice interview questions and came across this one:
Given a list of integers which represent hedge heights, determine the minimum number of moves to make the hedges pretty - that is, compute the minimum number of changes needed to make the array alternate between increasing and decreasing. For example, [1,6,6,4,4] should return 2 as you need to change the second 6 to something >6 and the last 4 to something <4. Assume the min height is 1 and the max height is 9. You can change to any number that is between 1 and 9, and that counts as 1 move regardless of the diff to the current number.
My solution is here: https://repl.it/#plusfuture/GrowlingOtherEquipment
I'm trying to figure out the big-O runtime for this solution, which is memoized recursion. I think it's O(n^3) because for each index I need to check against 3 possible states for the rest of the array: changeUp, noChange, and changeDown. My friend maintains that it's O(n), since I'm memoizing most of the solutions and exiting branches where the array is not "pretty" immediately.
Can someone help me understand how to analyze the runtime for this solution? Thanks.

Given a set of intervals and a time t, find one of the intervals containing that time

Intervals have start and end time.
Intervals can overlap. There might be several intervals containing a time t. It is ok to just return one of them.
This was an interview question. I was able to solve it by sorting the intervals once by end time and a second time by start time, then taking the intersection of the intervals that match on both the start and the end. Apparently there are more optimized solutions.
Here is an example: [1, 5] [2, 10] [3, 6] [2, 9] and target is 8. In this case either of [2, 10] and [2, 9] are correct answers.
I guess the point of the question is to precompute a data structure on the intervals, so that searches can be run with better-than-linear complexity.
I found the answer here; the question maps exactly to an interval tree.
The solution can be found in many resources but I found the pdf mentioned above to be very concise and to the point.
Here is the relevant part to the answer:
Iterate through the intervals until you find one where the given time is between the start and end of that interval. You can stop at the first one you find. The worst-case performance is O(n).
Edit, added because the question has been edited to suggest pre-computing the answers:
Pre-computing the answers forces the assumption that the times for each interval do not change, and the collection of intervals do not change. This is not stated in the question, but the following is based on that assumption.
If you were to pre-compute the results, you would need to examine the collection for each possible hour. You could do this for all 24 hours, or (for a small number of intervals) only for the hours from the minimum start hour to the maximum end hour (since all other hours will have no intervals), costing O(n) to calculate both the min and the max. Which approach to choose depends on how many intervals you have; if n is sufficiently large, you would skip calculating the min and max and just do all 24 hours.
For each possible hour, you would need to examine the collection for the first interval to match (worst case O(n) per hour), and store it for future lookup.
They could be stored as an array of references to the intervals, using the hour as the index, yielding worst case O(1) lookup cost.
So if you were doing enough lookups, and assuming that the intervals never change, this would be faster than calculating it each time.
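A small sketch of that precomputation, assuming whole-hour intervals that never change (the Interval struct and the buildLookup name are mine):
#include <array>
#include <vector>

struct Interval { int start, end; };  // inclusive whole hours

// For each of the 24 possible hours, remember the first interval containing
// it (or nullptr). O(24n) to build; each later lookup is O(1).
std::array<const Interval*, 24> buildLookup(const std::vector<Interval>& xs) {
    std::array<const Interval*, 24> table{};  // value-initialized to nullptr
    for (int h = 0; h < 24; ++h)
        for (const auto& iv : xs)          // first match wins, O(n) worst case
            if (iv.start <= h && h <= iv.end) { table[h] = &iv; break; }
    return table;  // holds pointers into xs, so xs must outlive the table
}
For the example above, with xs = {[1,5],[2,10],[3,6],[2,9]}, table[8] ends up pointing at [2, 10], a valid answer for target 8.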

Interview Scheduling Algorithm

I am trying to think of an algorithm that always produces the optimum solution in the best possible time to this problem:
There are n candidates for a job, and k rooms in which they have scheduled interviews at various times of the day. Interviews have a specific schedule in each room, with each interview having a specified start time (si), finish time (fi), and interview room (ri). All time units are always integers. In addition, we need to schedule pictures with the people currently being interviewed throughout the day. The pictures effectively take no time, but at some point in the day each interviewee must be in a picture. If we schedule a picture at time t, all people currently being interviewed will be in that picture. Taking a picture has no effect on each interview's start and end time. So the problem is this: with an unordered list of interviews, each with variables (si, fi, ri), how do you make sure every interview candidate is in a picture, while taking as few pictures as possible?
So ideally we would take pictures when as many people as possible are present, to minimize the number of pictures taken. My original idea for this was sort of a brute force, but it would have a really bad big-O runtime. It is very important to minimize the runtime of this algorithm while still returning the fewest possible photographs. That being said, if you can think of a fast greedy algorithm that doesn't perfectly solve the problem, I would like to hear that too.
I'm sure my description here was far from flawless, so if you would like me to clarify anything, feel free to leave a comment and I'll get back to you.
Start with the following observations:
At least one picture must be taken during each interview, since we cannot photograph that interviewee before they arrive or after they leave.
The set of people available to photograph changes only at the times si and fi.
After an arrival event si, if the next event sj is also an arrival, there is no need to take a picture between si and sj, since everyone available at si is still available at sj.
Therefore, you can let the set of available interviewees "build up" through arrival events (up to k of them) and wait to take a picture until someone is about to leave.
Thus I think the following algorithm should work:
Put the arrival and departure times into a list and sort it (times should remain tagged with "arrival" or "departure" and the interviewee's index).
Create a boolean array A of size n to keep track of whether each interviewee is available (interview is in progress).
Create a boolean array P of size n to keep track of whether each interviewee has been photographed.
Loop over the sorted time list (index variable i):
a. If an arrival of interviewee j is encountered, set A[j] to true.
b. If a departure of interviewee j is encountered, check P[j] to see if the person leaving has been photographed already. If not, take a picture now and record its effects (for every k with A[k] = true, set P[k] = true). Finally set A[j] to false.
The sort is O(n log n), the loop has 2n iterations, and checking the arrays is O(1). But since each picture-taking event may loop over all of A, the overall runtime is O(n²) in the worst case (which happens when no interviews overlap in time).
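A sketch of this event scan (the Event struct and function shape are mine; ties between simultaneous arrivals and departures are ignored for simplicity):
#include <algorithm>
#include <vector>

struct Event { int time; bool arrival; int who; };

// Sort the tagged events, track who is available, and take a picture only
// when someone not yet photographed is about to leave. The inner marking
// loop makes this O(n^2) in the worst case.
std::vector<int> photoTimes(std::vector<Event> events, int n) {
    std::sort(events.begin(), events.end(),
              [](const Event& a, const Event& b) { return a.time < b.time; });
    std::vector<bool> A(n, false), P(n, false);  // available / photographed
    std::vector<int> photos;
    for (const Event& e : events) {
        if (e.arrival) {
            A[e.who] = true;           // let availability "build up"
        } else {
            if (!P[e.who]) {           // last chance to photograph e.who
                photos.push_back(e.time);
                for (int k = 0; k < n; ++k)
                    if (A[k]) P[k] = true;  // everyone present is captured
            }
            A[e.who] = false;
        }
    }
    return photos;
}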
Here's an O(n log n) solution:
Step 1: Separately sort the starting and finishing times of all interviews, but at the same time keep track of the places they are sorted to (i.e. the original indices and the indices after the sort). This results in the 4 arrays below:
sst[] (sst = sorted starting time)
sft[] (sft = sorted finishing time)
sst2orig[] (sst index to original index)
sft2orig[] (sft index to original index)
Note: by the definitions of the above 4 arrays,
"sst2orig[j] = i & sft2orig[k] = i" means that
interview [i] has starting time sst[j] and finishing time sft[k]
Step 2: Define a boolean array p_taken[] to represent whether the candidate of an interview has already been photographed. All elements in the array are set to false initially.
Step 3: The loop
std::vector<int> photo_time;
int last_p_not_taken_sst_index = 0;
for (int i = 0; i < (int)sft.size(); i++) {
    // ignore the candidate already photographed
    if (p_taken[sft2orig[i]]) continue;
    // Now we found the first leaving candidate not photographed; we
    // must take a photo now.
    photo_time.push_back(sft[i]);
    // So we can now mark all candidates having a prior sst[] time as
    // already photographed. We search for the first element in
    // sst[] that is greater than sft[i], and return its index.
    // If all elements in sst[] are smaller than sft[i], return sst.size().
    // This can be done via a binary search (e.g. std::upper_bound).
    int k = upper_inequal_bound_index(sst, sft[i]);
    // Now we can mark all candidates with a starting time prior to sst[k]
    // as "photographed". This will include the one corresponding to
    // sft[i].
    for (int j = last_p_not_taken_sst_index; j < k; j++)
        p_taken[sst2orig[j]] = true;
    last_p_not_taken_sst_index = k;
}
The final answer is saved in photo_time, and the number of photos is photo_time.size().
Time Complexity:
Step 1: Sorts: O(n log n)
Step 2: initialize p_taken[]: O(n)
Step 3: We loop n times, and in each loop
3-1 check p_taken: O(1)
3-2 binary search: O(log n)
3-3 mark candidates: aggregated O(n), since we mark each candidate only once.
So, overall for step 3: O(n x ( 1 + log n) + n) = O(n log n)
Step 1 ~ 3, total: O(n log n)
Note that step 3 can be further optimized: we can shrink the binary-search range to exclude the part covered by previous searches. But the worst case is still O(log n) per loop, so the total is still O(n log n).

Are there any worse sorting algorithms than Bogosort (a.k.a Monkey Sort)? [closed]

My co-workers took me back in time to my University days with a discussion of sorting algorithms this morning. We reminisced about our favorites like StupidSort, and one of us was sure we had seen a sort algorithm that was O(n!). That got me started looking around for the "worst" sorting algorithms I could find.
We postulated that a completely random sort would be pretty bad (i.e. randomize the elements - is it in order? no? randomize again), and I looked around and found out that it's apparently called BogoSort, or Monkey Sort, or sometimes just Random Sort.
Monkey Sort appears to have a worst case performance of O(∞), a best case performance of O(n), and an average performance of O(n·n!).
What is the currently officially accepted sorting algorithm with the worst average sorting performance (and therefore worse than O(n·n!))?
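For reference, the bogosort being discussed is only a few lines (a sketch using the standard library's shuffle):
#include <algorithm>
#include <random>
#include <vector>

// Randomize until sorted: best case O(n) (one sortedness check),
// average O(n * n!), worst case unbounded.
void bogosort(std::vector<int>& v) {
    static std::mt19937 rng{std::random_device{}()};
    while (!std::is_sorted(v.begin(), v.end()))
        std::shuffle(v.begin(), v.end(), rng);
}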
From David Morgan-Mar's Esoteric Algorithms page: Intelligent Design Sort
Introduction
Intelligent design sort is a sorting algorithm based on the theory of
intelligent design.
Algorithm Description
The probability of the original input list being in the exact order
it's in is 1/(n!). There is such a small likelihood of this that it's
clearly absurd to say that this happened by chance, so it must have
been consciously put in that order by an intelligent Sorter. Therefore
it's safe to assume that it's already optimally Sorted in some way
that transcends our naïve mortal understanding of "ascending order".
Any attempt to change that order to conform to our own preconceptions
would actually make it less sorted.
Analysis
This algorithm is constant in time, and sorts the list in-place,
requiring no additional memory at all. In fact, it doesn't even
require any of that suspicious technological computer stuff. Praise
the Sorter!
Feedback
Gary Rogers writes:
Making the sort constant in time
denies the power of The Sorter. The
Sorter exists outside of time, thus
the sort is timeless. To require time
to validate the sort diminishes the role
of the Sorter. Thus... this particular
sort is flawed, and can not be
attributed to 'The Sorter'.
Heresy!
Many years ago, I invented (but never actually implemented) MiracleSort.
Start with an array in memory.
loop:
Check to see whether it's sorted.
Yes? We're done.
No? Wait a while and check again.
end loop
Eventually, alpha particles flipping bits in the memory chips should result in a successful sort.
For greater reliability, copy the array to a shielded location, and check potentially sorted arrays against the original.
So how do you check the potentially sorted array against the original? You just sort each array and check whether they match. MiracleSort is the obvious algorithm to use for this step.
EDIT: Strictly speaking, this is not an algorithm, since it's not guaranteed to terminate. Does "not an algorithm" qualify as "a worse algorithm"?
Quantum Bogosort
A sorting algorithm that assumes that the many-worlds interpretation of quantum mechanics is correct:
Check that the list is sorted. If not, destroy the universe.
At the conclusion of the algorithm, the list will be sorted in the only universe left standing.
This algorithm takes worst-case Θ(N) and average-case Θ(1) time. In fact, the average number of comparisons performed is 2: there's a 50% chance that the universe will be destroyed on the second element, a 25% chance that it'll be destroyed on the third, and so on.
Jingle Sort, as described here.
You give each value in your list to a different child on Christmas. Children, being awful human beings, will compare the value of their gifts and sort themselves accordingly.
I'm surprised no one has mentioned sleepsort yet... Or haven't I noticed it? Anyway:
#!/bin/bash
function f() {
    sleep "$1"
    echo "$1"
}
while [ -n "$1" ]
do
    f "$1" &
    shift
done
wait
example usage:
./sleepsort.sh 5 3 6 3 6 3 1 4 7
./sleepsort.sh 8864569 7
In terms of performance it is terrible (especially the second example). Waiting almost 3.5 months to sort 2 numbers is kinda bad.
I had a lecturer who once suggested generating a random array, checking if it was sorted and then checking if the data was the same as the array to be sorted.
Best case O(N) (first time baby!)
Worst case O(Never)
There is a sort that's called bogobogosort. First, it checks the first 2 elements, and bogosorts them. Next it checks the first 3, bogosorts them, and so on.
Should the list be out of order at any time, it restarts by bogosorting the first 2 again. Regular bogosort has an average complexity of O(N!); this algorithm has an average complexity of O(1!·2!·3!·…·N!).
Edit: To give you an idea of how large this number is: for 20 elements, this algorithm takes an average of 3.930093×10^158 years, well above the proposed heat death of the universe (if it happens) at 10^100 years,
whereas merge sort takes around .0000004 seconds,
bubble sort .0000016 seconds,
and bogosort takes 308 years, 139 days, 19 hours, 35 minutes, 22.306 seconds, assuming a year is 365.242 days and a computer does 250,000,000 32 bit integer operations per second.
Edit2: This algorithm is not as slow as the "algorithm" miracle sort, which probably, like this sort, will get the computer sucked into a black hole before it successfully sorts 20 elements; but if it did, I would estimate an average complexity of 2^(32·N) × (a number ≤ 10^40) years (32 being the number of bits in a 32-bit integer and N the number of elements),
since gravity speeds up the chips' alpha-particle bit flips, and there are 2^N states; which is 2^640 × 10^40, or about 5.783×10^216.762162762 years, though if the list started out sorted, its complexity would only be O(N), faster than merge sort, which is only N log N even in the worst case.
Edit3: This algorithm is actually slower than miracle sort as the size gets very big, say 1000, since my algorithm would have a run time of 2.83*10^1175546 years, while the miracle sort algorithm would have a run time of 1.156*10^9657 years.
If you keep the algorithm meaningful in any way, O(n!) is the worst upper bound you can achieve.
Since checking each permutation of a set for sortedness will take n! steps, you can't get any worse than that.
If you're doing more steps than that, then the algorithm has no real useful purpose. Not to mention the following simple sorting algorithm with O(infinity):
list = someList
while (list not sorted):
    doNothing
Bogobogosort. Yes, it's a thing. To Bogobogosort, you Bogosort the first element. Check to see if that one element is sorted. Being one element, it will be. Then you add the second element, and Bogosort those two until they're sorted. Then you add one more element, then Bogosort. Continue adding elements and Bogosorting until you have finally done every element. This was designed never to succeed with any sizable list before the heat death of the universe.
You should do some research into the exciting field of Pessimal Algorithms and Simplexity Analysis. These authors work on the problem of developing a sort with a pessimal best-case (your bogosort's best case is Omega(n), while slowsort (see paper) has a non-polynomial best-case time complexity).
Here are 2 sorts my roommate and I came up with in college:
1) Check the order
2) Maybe a miracle happened, go to 1
and
1) check if it is in order, if not
2) put each element into a packet and bounce it off a distant server back to yourself. Some of those packets will return in a different order, so go to 1
There's always the Bogobogosort (Bogoception!). It performs Bogosort on increasingly large subsets of the list, and then starts all over again if the list is ever not sorted.
for (int n=1; n<sizeof(list); ++n) {
    while (!isInOrder(list, 0, n)) {
        shuffle(list, 0, n);
    }
    if (!isInOrder(list, 0, n+1)) { n=0; }
}
1 Put your items to be sorted on index cards
2 Throw them into the air on a windy day, a mile from your house.
2 Throw them into a bonfire and confirm they are completely destroyed.
3 Check your kitchen floor for the correct ordering.
4 Repeat if it's not the correct order.
Best case scenario is O(∞)
Edit above based on astute observation by KennyTM.
The "what would you like it to be?" sort
Note the system time.
Sort using Quicksort (or anything else reasonably sensible), omitting the very last swap.
Note the system time.
Calculate the required time. Extended precision arithmetic is a requirement.
Wait the required time.
Perform the last swap.
Not only can it implement any conceivable O(x) value short of infinity, the time taken is provably correct (if you can wait that long).
Nothing can be worse than infinity.
Segments of π
Assume π contains all possible finite number combinations.
See math.stackexchange question
Determine the number of digits needed from the size of the array.
Use segments of π places as indexes to determine how to re-order the array. If a segment exceeds the size boundaries for this array, adjust the π decimal offset and start over.
Check if the re-ordered array is sorted. If it is, woot; else adjust the offset and start over.
Bozo sort is a related algorithm that checks if the list is sorted and, if not, swaps two items at random. It has the same best- and worst-case performances, but I would intuitively expect the average case to be longer than bogosort's. It's hard to find (or produce) any data on the performance of this algorithm.
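A sketch of bozo sort for comparison (same shape as bogosort, but one random swap per attempt):
#include <algorithm>
#include <random>
#include <vector>

// If the list isn't sorted, swap two randomly chosen items and re-check.
void bozosort(std::vector<int>& v) {
    if (v.size() < 2) return;
    static std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<size_t> pick(0, v.size() - 1);
    while (!std::is_sorted(v.begin(), v.end()))
        std::swap(v[pick(rng)], v[pick(rng)]);
}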
A worst case performance of O(∞) might not even make it an algorithm according to some.
An algorithm is just a series of steps, and you can always do worse by tweaking it a little bit to get the desired output in more steps than it was previously taking. One could purposely put knowledge of the number of steps taken into the algorithm and make it terminate and produce the correct output only after X number of steps have been done. That X could very well be of the order of O(n²) or O(n^(n!)) or whatever the algorithm desired to do. That would effectively increase its best-case as well as average-case bounds.
But your worst-case scenario cannot be topped :)
My favorite slow sorting algorithm is the stooge sort:
void stooges(long *begin, long *end) {
    if( (end-begin) <= 1 ) return;
    // swap the first and last elements if they are out of order
    if( begin[0] > end[-1] ) swap(begin, end-1);
    if( (end-begin) > 2 ) {
        int one_third = (end-begin)/3;
        stooges(begin, end-one_third);   // first two thirds
        stooges(begin+one_third, end);   // last two thirds
        stooges(begin, end-one_third);   // first two thirds again
    }
}
The worst case complexity is O(n^(log(3) / log(1.5))) = O(n^2.7095...).
Another slow sorting algorithm is actually named slowsort!
void slow(long *start, long *end) {
    if( (end-start) <= 1 ) return;
    long *middle = start + (end-start)/2;
    slow(start, middle);
    slow(middle, end);
    if( middle[-1] > end[-1] ) swap(middle-1, end-1);
    slow(start, end-1);
}
This one takes O(n^(log n)) in the best case... even slower than stooge sort.
Recursive Bogosort (probably still O(n!)) {
    if (list not sorted)
        list1 = first half of list
        list2 = second half of list
        Recursive Bogosort(list1);
        Recursive Bogosort(list2);
        list = list1 + list2
    while (list not sorted)
        shuffle(list);
}
Double bogosort
Bogosort twice and compare results (just to be sure it is sorted) if not do it again
This page is an interesting read on the topic: http://home.tiac.net/~cri_d/cri/2001/badsort.html
My personal favorite is Tom Duff's sillysort:
/*
* The time complexity of this thing is O(n^(a log n))
* for some constant a. This is a multiply and surrender
* algorithm: one that continues multiplying subproblems
* as long as possible until their solution can no longer
* be postponed.
*/
void sillysort(int a[], int i, int j){
    int t, m;
    for(;i!=j;--j){
        m=(i+j)/2;
        sillysort(a, i, m);
        sillysort(a, m+1, j);
        if(a[m]>a[j]){ t=a[m]; a[m]=a[j]; a[j]=t; }
    }
}
You could make any sort algorithm slower by running your "is it sorted" step randomly. Something like:
Create an array of booleans the same size as the array you're sorting. Set them all to false.
Run an iteration of bogosort
Pick two random elements.
If the two elements are sorted in relation to each other (i < j && array[i] < array[j]), mark the indexes of both in the boolean array as true. Otherwise, start over.
Check if all of the booleans in the array are true. If not, go back to 3.
Done.
Yes, SimpleSort: in theory it runs in O(-1); however, this is equivalent to O(...9999), which is in turn equivalent to O(∞ - 1), which, as it happens, is also equivalent to O(∞). Here is my sample implementation:
/* element sizes are unneeded, they are assumed */
void
simplesort (const void* begin, const void* end)
{
    for (;;);
}
One I was just working on involves picking two random points and, if they are in the wrong order, reversing the entire subrange between them. I found the algorithm on http://richardhartersworld.com/cri_d/cri/2001/badsort.html, which says that the average case is probably somewhere around O(n^3) or O(n^2 log n) (the author is not really sure).
I think it might be possible to do it more efficiently, because I think it might be possible to do the reversal operation in O(1) time.
Actually, I just realized that that claim is now a "maybe", because the data structure I had in mind would put accessing the random elements at O(log n) and determining if the subrange needs reversing at O(n).
Randomsubsetsort.
Given an array of n elements, choose each element with probability 1/n, randomize these elements, and check if the array is sorted. Repeat until sorted.
Expected time is left as an exercise for the reader.

Sort numbers by sum algorithm

I have a language-agnostic question about an algorithm.
This comes from a (probably simple) programming challenge I read. The problem is, I'm too stupid to figure it out, and curious enough that it is bugging me.
The goal is to sort a list of integers to ascending order by swapping the positions of numbers in the list. Each time you swap two numbers, you have to add their sum to a running total. The challenge is to produce the sorted list with the smallest possible running total.
Examples:
3 2 1 - 4
1 8 9 7 6 - 41
8 4 5 3 2 7 - 34
Though you are free to just give the answer if you want, if you'd rather offer a "hint" in the right direction (if such a thing is possible), I would prefer that.
Only read the first two paragraphs if you just want a hint. There is an efficient solution to this (unless I made a mistake, of course). First sort the list. Now we can write the original list as a product of disjoint cycles.
For example, 5,3,4,2,1 has two cycles, (5,1) and (3,4,2). The cycle (3,4,2) can be thought of as starting at 3: 3 is in 2's spot, 4 is in 3's spot, and 2 is in 4's spot. The end goal is 1,2,3,4,5, or (1)(2)(3)(4)(5): five disjoint cycles.
If we switch two elements from different cycles, say 1 and 3, then we get 5,1,4,2,3, and in cycle notation (1,5,3,4,2). The two cycles are joined into one cycle; this is the opposite of what we want to do.
If we switch two elements from the same cycle, say 3 and 4, then we get 5,4,3,2,1, in cycle notation (5,1)(2,4)(3). The one cycle is split into two smaller cycles. This gets us closer to the goal of all cycles having length 1. Notice that any switch of two elements in the same cycle splits the cycle into two cycles.
If we can figure out the optimal algorithm for resolving one cycle, we can apply it to all cycles and get an optimal algorithm for the entire sort. One algorithm is to take the minimum element in the cycle and switch it with the element whose position it occupies. So for (3,4,2) we would switch 2 with 4. This leaves us with a cycle of length 1 (the element just switched into the correct position) and a cycle one element smaller than before. We can then apply the rule again. This algorithm switches the smallest element (cycle length - 1) times and every other element once.
To transform a cycle of length n into cycles of length 1 takes n - 1 operations. Each element must be operated on at least once (think about each element to be sorted: it has to be moved to its correct position). The algorithm I proposed operates on each element once, which all algorithms must do; every extra operation is done on the minimal element. No algorithm can do better.
This algorithm takes O(n log n) to sort, then O(n) to handle the cycles. Solving one cycle takes O(cycle length), and the total length of all cycles is n, so the cost of the cycle operations is O(n). The final run time is O(n log n).
I'm assuming memory is free and you can simulate the sort before performing it on the real objects.
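A sketch of the cycle method described above (assumes distinct values; all names are mine). It uses the closed form per cycle implied by the argument: the minimum is swapped (length - 1) times and every other element once, so a cycle costs (sum of its values) + (length - 2) × (its minimum):
#include <algorithm>
#include <iostream>
#include <vector>

long long cycleSwapCost(const std::vector<int>& a) {
    int n = a.size();
    std::vector<int> sorted = a;
    std::sort(sorted.begin(), sorted.end());
    // perm[i] = position where the value currently at index i belongs
    std::vector<int> perm(n);
    for (int i = 0; i < n; ++i)
        perm[i] = std::lower_bound(sorted.begin(), sorted.end(), a[i])
                  - sorted.begin();

    long long cost = 0;
    std::vector<bool> seen(n, false);
    for (int i = 0; i < n; ++i) {
        if (seen[i] || perm[i] == i) { seen[i] = true; continue; }
        long long sum = 0, len = 0, mn = a[i];
        for (int j = i; !seen[j]; j = perm[j]) {  // walk one cycle
            seen[j] = true;
            sum += a[j];
            mn = std::min<long long>(mn, a[j]);
            ++len;
        }
        cost += sum + (len - 2) * mn;  // min swapped len-1 times, rest once
    }
    return cost;
}

int main() {
    std::cout << cycleSwapCost({3, 2, 1}) << "\n";           // 4
    std::cout << cycleSwapCost({8, 4, 5, 3, 2, 7}) << "\n";  // 34
}
As a sanity check, this reproduces the first and third examples from the question.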
One approach (that is likely not the fastest) is to maintain a priority queue. Each node in the queue is keyed by the swap cost to get there and it contains the current item ordering and the sequence of steps to achieve that ordering. For example, initially it would contain a 0-cost node with the original data ordering and no steps.
Run a loop that dequeues the lowest-cost queue item, and enqueues all possible single-swap steps starting at that point. Keep running the loop until the head of the queue has a sorted list.
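A compact sketch of that search (effectively Dijkstra's algorithm over orderings; only viable for tiny lists, since the state space is n!; all names are mine):
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <queue>
#include <vector>

int cheapestSortCost(const std::vector<int>& start) {
    using State = std::pair<int, std::vector<int>>;  // (cost so far, ordering)
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;
    std::map<std::vector<int>, int> best;  // cheapest known cost per ordering
    pq.push({0, start});
    while (!pq.empty()) {
        auto [cost, v] = pq.top();
        pq.pop();
        if (std::is_sorted(v.begin(), v.end())) return cost;  // head is sorted
        auto stale = best.find(v);
        if (stale != best.end() && cost > stale->second) continue;  // outdated entry
        // Enqueue every possible single-swap successor of this ordering.
        for (size_t i = 0; i < v.size(); ++i)
            for (size_t j = i + 1; j < v.size(); ++j) {
                std::vector<int> w = v;
                std::swap(w[i], w[j]);
                int c = cost + v[i] + v[j];  // swap cost = sum of the pair
                auto it = best.find(w);
                if (it == best.end() || c < it->second) {
                    best[w] = c;
                    pq.push({c, w});
                }
            }
    }
    return 0;  // not reached: the sorted ordering is always dequeued eventually
}

int main() {
    std::cout << cheapestSortCost({1, 8, 9, 7, 6}) << "\n";  // 41, per the example
}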
I did a few attempts at solving one of the examples by hand:
1 8 9 7 6
6 8 9 7 1 (+6+1=7)
6 8 1 7 9 (7+1+9=17)
6 8 7 1 9 (17+1+7=25)
6 1 7 8 9 (25+1+8=34)
1 6 7 8 9 (34+1+6=41)
Since you needed to displace the 1, it seems that you may have to do an exhaustive search to solve the problem; the details were already posted by another user. Note that you will encounter problems with this method if the dataset is large.
If the problem allows for "close" answers, you can simply make a greedy algorithm that puts the largest item into position - either doing so directly, or by swapping the smallest element into that slot first.
Comparisons and traversals apparently come for free, so you can pre-calculate the "distance" a number must travel (and, effectively, the final sort order). The puzzle is the swap algorithm.
Minimizing overall swaps is obviously important.
Minimizing swaps of larger numbers is also important.
I'm pretty sure an optimal swap process cannot be guaranteed by evaluating each ordering in a stateless fashion, although you might frequently come close (not the challenge).
I think there is no trivial solution to this problem, and my approach is likely no better than the priority queue approach.
Find the smallest number, N.
Any pairs of numbers that occupy each others' desired locations should be swapped, except for N.
Assemble (by brute force) a collection of every set of numbers that can be mutually swapped into their desired locations, such that the cost of sorting the set amongst itself is less than the cost of swapping every element of the set with N.
These sets will comprise a number of cycles. Swap within those cycles in such a way that the smallest number is swapped twice.
Swap all remaining numbers, which comprise a cycle including N, using N as a placeholder.
As a hint, this reeks of dynamic programming; that might not be precise enough a hint to help, but I'd rather start with too little!
You are charged by the number of swaps, not by the number of comparisons. Nor did you mention being charged for keeping other records.
