I'm reading a book on Data Structures and Algorithms and am currently on the Jump Search algorithm. I think there is an error in the pseudocode in the book (please check the code printed below). The jumps are not performed during Step 3, and because of that the run-time of the algorithm below is of order O(n) (while the run-time of a correctly implemented Jump Search is O(sqrt(n))).
All in all, I think there is an error in the Jump Search algorithm; however, I might be wrong, and I would appreciate any help/comments. Thank you!
**JUMP_SEARCH (A, lower_bound, upper_bound, VAL, N)**
Step 1: [INITIALIZE] SET STEP = sqrt(N), I = 0, LOW = lower_bound, HIGH = upper_bound, POS = –1
Step 2: Repeat Step 3 while I < STEP
Step 3:
IF VAL < A[STEP]
SET HIGH = STEP – 1
ELSE
SET LOW = STEP + 1
[END OF IF]
SET I = I + 1
[END OF LOOP]
Step 4: SET I = LOW
Step 5: Repeat Step 6 while I <= HIGH
Step 6:
IF A[I] = Val
POS = I
PRINT POS
Go to Step 8
[END OF IF]
SET I = I + 1
[END OF LOOP]
Step 7: IF POS = –1
PRINT "VALUE IS NOT PRESENT IN THE ARRAY"
[END OF IF]
Step 8: EXIT
You are absolutely right. The pseudocode has quite a few issues:
Step 3 always makes the same comparison, as STEP is not modified in the loop. This means that in this loop either HIGH is set repeatedly, or LOW is set repeatedly, but never both. If LOW is set, the search still takes O(n), as you rightly indicate. The index into A should change in that loop and make the "jumps".
When HIGH is set in that loop, the loop should exit immediately.
When LOW is set, the + 1 is also wrong, as it does not consider that the preceding element might be the value being looked for.
Even though there is a parameter for a specific lower_bound, this variable is only used to initialise LOW at the start; in the actual accesses to A it is never used.
It is strange that N is a parameter, since logically N = upper_bound - lower_bound + 1. This can only lead to further inconsistency.
Concluding: there are too many errors in this pseudocode for it to be helpful.
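For reference, a version that actually performs the jumps can be sketched as follows. This is a Python sketch, not the book's notation: it uses 0-based indexing, drops the redundant bounds parameters, and picks `isqrt(n)` as the block size.

```python
import math

def jump_search(a, val):
    # a must be sorted ascending; returns the index of val, or -1.
    n = len(a)
    step = max(1, math.isqrt(n))   # block size of roughly sqrt(n)
    low = 0
    # Jump ahead one block at a time while the last element of the
    # current block is still smaller than val; this is the part the
    # book's Step 3 never actually does.
    while low + step < n and a[low + step - 1] < val:
        low += step
    # Linear scan inside the single candidate block, so the total
    # work is O(sqrt(n)) jumps + O(sqrt(n)) scan.
    for i in range(low, min(low + step, n)):
        if a[i] == val:
            return i
    return -1
```

The two phases (block jumps, then a scan bounded by one block) are what give the O(sqrt(n)) bound that the book's version loses.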
Related
Problem: Show that RANDOMIZED-SELECT never makes a recursive call to a 0-length array.
Hint: Don't assume that the input array is empty, i.e., p>r. Rather, show that if an empty
(sub-)array is ever generated by RANDOMIZED-PARTITION, then a recursive call will not
be made on such an empty (sub-)array
This is an exercise from Cormen's Introduction to Algorithms, Chapter 9 (Medians and Order Statistics), Exercise 9.2-1.
The answer should be:
Calling a 0-length array would mean that the second and third arguments are equal. So, if the call is made on line 8, we would need that p=q−1, which means that q - p + 1 = 0.
However, i is assumed to be a nonnegative number, and to be executing line 8, we would need that i < k = q - p + 1 = 0, a contradiction. The other possibility is that the bad recursive call occurs on line 9. This would mean that q + 1 = r. To be executing line 9, we need that i > k = q - p + 1 = r - p. This would be a nonsensical original call to the array though because we are asking for the ith element from an array of strictly less size.
This solution can be found at this link.
The algorithm it refers to can be found in Cormen's Introduction to Algorithms, Chapter 9 (Medians and Order Statistics), Section 9.2, "Selection in expected linear time".
Line 8 of the algorithm says return RANDOMIZED-SELECT(A, p, q - 1, i).
The solution says the 2nd and 3rd arguments should be equal, so p = q - 1, which means p - q + 1 = 0; but in the solution it was given as q - p + 1 = 0. How could they get that?
Then again, for line 9 they calculated q - p + 1 = r - p. As I cannot figure out how they got q - p + 1 = 0, the equation q - p + 1 = r - p is also meaningless to me.
Can anyone please clarify my doubts?
Thank you.
Algorithm 1: RANDOMIZED-SELECT
RANDOMIZED-SELECT(A, p, r, i)
1 if p == r
2 return A[p]
3 q = RANDOMIZED-PARTITION (A,p,r)
4 k = q - p + 1
5 if i == k // the pivot value is the answer
6 return A[q]
7 elseif i < k
8 return RANDOMIZED-SELECT(A,p,q - 1,i)
9 else return RANDOMIZED-SELECT(A, q + 1, r, i - k)
Algorithm 2: RANDOMIZED_PARTITION
RANDOMIZED-PARTITION(A,p,r)
1 i = RANDOM(p,r)
2 exchange A[r] with A[i]
3 return PARTITION (A,p, r)
Yes, I think you are right that the proposed solution is incorrect.
The solutions you are looking at are not part of the textbook, nor were they written by any of the textbook's authors, nor were they reviewed by the textbook's authors. In short, they are, like this site, the unverified opinions of uncertified contributors of uncertain value. It hardly seems necessary to observe that the internet is full of inexact, imprecise and plainly incorrect statements, some of them broadcast maliciously with intent to deceive, but the vast majority simple errors with no greater fault than sloppiness or ignorance. The result is the same: you have the responsibility to carefully evaluate the veracity of anything you read.
One aid in this particular repository of proposed solutions is the bug list, which is also not authored by infallible and reliable reviewers, but still allows some kind of triangulation, since it largely consists of peer reviews. So it should be your first port of call when you suspect that a solution is buggy. And, indeed, there you will find this issue, which seems quite similar to your complaint. I'll quote the second comment in that issue (from "Alice-182"), because I don't think I can say it better; lightly edited, it reads:
Calling a 0-length array would mean that the second argument is larger than the third argument by 1. So, if the call is made on line 8, we would need that p = q - 1 + 1 = q.
However, i is assumed to be a positive number, and to be executing line 8, we would need that i < k = q - p + 1 = 1, which means that i ≤ 0, a contradiction. The other possibility is that the bad recursive call occurs on line 9. This would mean that q + 1 = r + 1. But if line 9 runs, it must be that i > k = q - p + 1 = r - p + 1. This would be a nonsensical original call to the array though for i should be in [1, r - p + 1].
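To see the claim concretely, here is a Python transcription of the two procedures (a sketch following CLRS, with 0-based array indices but a 1-based rank i; the function names are mine). The `assert` at the top would fire if a recursive call were ever made on an empty subarray, and, as the corrected argument predicts, it never does:

```python
import random

def randomized_select(a, p, r, i):
    # Returns the i-th smallest (1-based) element of a[p..r]
    # (inclusive bounds, as in CLRS).
    # Exercise 9.2-1 claims this can never happen on a recursive call:
    assert p <= r, "called on an empty subarray"
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1              # rank of the pivot within a[p..r]
    if i == k:                 # the pivot value is the answer
        return a[q]
    elif i < k:
        return randomized_select(a, p, q - 1, i)
    else:
        return randomized_select(a, q + 1, r, i - k)

def randomized_partition(a, p, r):
    j = random.randint(p, r)
    a[r], a[j] = a[j], a[r]
    # Lomuto partition around the pivot a[r]
    x = a[r]
    i = p - 1
    for m in range(p, r):
        if a[m] <= x:
            i += 1
            a[i], a[m] = a[m], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1
```

The reasoning in code terms: the `i < k` branch implies k >= 2, so q - 1 >= p; the `i > k` branch, together with a sensible original call (i <= r - p + 1), implies q < r, so q + 1 <= r.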
You can find the explanation of Algorithm 4.3.1D, as it appears in The Art of Computer Programming, Vol. 2 (pages 272-273) by D. Knuth, in the appendix of this question.
It appears that, in step D.6, qhat is expected to be off by at most one.
Let's assume the base is 2^32 (i.e., we are working with unsigned 32-bit digits). Let u = [238157824, 2354839552, 2143027200, 0] and v = [3321757696, 2254962688]. The expected output of this division is 4081766756 (Link).
Both u and v are already normalized as described in D.1 (v[1] > b / 2, and u is zero-padded).
The first iteration of the loop D.3 through D.7 is a no-op, because qhat = floor((0 * b + 2143027200) / 2254962688) = 0.
In the second iteration of the loop, qhat = floor((2143027200 * b + 2354839552) / 2254962688) = 4081766758 (Link).
We don't need to calculate steps D.4 and D.5 to see why this is a problem. Since qhat will be decreased by one in D.6, the result of the algorithm will come out as 4081766758 - 1 = 4081766757; however, the result should be 4081766756 (Link).
Am I right to think that there is a bug in the algorithm, or is there a fallacy in my reasoning?
Appendix
There is no bug; you're ignoring the loop inside Step D3:
In your example, as a result of this test, the value of q̂, which was initially set to 4081766758, is decreased two times, first to 4081766757 and then to 4081766756, before proceeding to Step D4.
(Sorry I did not have the time to make a more detailed / “proper” answer.)
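The step D.3 correction can be replayed numerically. This Python sketch reproduces the second iteration of the example above with b = 2^32 (the variable names u2, u1, u0, v1, v0 are mine, denoting the leading digits of u and v):

```python
# Replay the second loop iteration of the example, base b = 2**32.
b = 2**32
u2, u1, u0 = 2143027200, 2354839552, 238157824   # leading digits of u
v1, v0 = 2254962688, 3321757696                  # v1 is v's high digit

# Initial estimate from the top two digits, as in step D.3:
qhat = (u2 * b + u1) // v1
rhat = (u2 * b + u1) - qhat * v1                 # remainder of that division

# Step D.3's inner test: decrease qhat while it overshoots.
decrements = 0
while qhat >= b or qhat * v0 > rhat * b + u0:
    qhat -= 1
    rhat += v1
    decrements += 1
    if rhat >= b:      # Knuth repeats the test only while rhat < b
        break
```

Running this shows qhat starting at 4081766758 and being decreased twice inside D.3, landing on 4081766756 before D.4 is ever reached, exactly as the answer describes.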
https://leetcode.com/problems/find-all-numbers-disappeared-in-an-array/discuss/93007/simple-java-in-place-sort-solution
Could you please check the above link? I can't understand this line of code:
while (nums[i] != i + 1 && nums[i] != nums[nums[i] - 1])
What is the difference between these two conditions?
1) nums[i] != i+1
2) nums[i] != nums[nums[i]-1]
for example
index 0 : 1
index 1 : 2
index 2 : 3
Then the first condition simply checks, using the index, whether the value equals index + 1.
and Second one,
nums[0] = nums[nums[i]-1]
nums[0] = nums[nums[0]-1]
nums[0] = nums[1-1]
nums[0] = nums[0]
It is also ultimately the same thing, just establishing that the value at the index equals index + 1.
But why does the while loop have to have both conditions?
Or can we just use one of them?
I agree the second condition is unnecessary. In fact, I think it needlessly clutters the code.
In English, the code essentially says "if [something] and (x != y), then swap x and y". All the "x != y" check does is prevent swapping x with (something equal to) itself. But that is a no-op, so that check can be removed without changing the behavior or O(n) performance.
Removing that check makes it easier to read the algorithm: "For each slot i, while the item at slot i is wrong, swap it to where it belongs."
[Update]
Whoops! I just realized the point of the check... It prevents a potential infinite loop where you keep swapping the same value back and forth. (Because the condition is actually a "while", not an "if".)
So the algorithm as presented is correct.
nums[i] != i+1
Is the value at its place? If not, maybe swap it to its place...
This is needed because you have to test every position.
nums[i] != nums[nums[i]-1]
Does the value need to be swapped to its place?
This is needed because the algorithm places every element of a chain at its place.
Take this example:
[3,1,2,4,6,5,8,7]
it should be clear that you need to rearrange 1,2,3 and 5,6 and 7,8.
Let's look at how the sorting takes place:
i:0 [3,1,2,4,6,5,8,7] 3<->2
i:0 [2,1,3,4,6,5,8,7] 2<->1
i:0 [1,2,3,4,6,5,8,7] now 1 is at its place, go to the right and find another chain
i:1 [1,2,3,4,6,5,8,7] 2 at its place
i:2 [1,2,3,4,6,5,8,7] 3 at its place
i:3 [1,2,3,4,6,5,8,7] 4 at its place
i:4 [1,2,3,4,6,5,8,7] 6<->5
i:4 [1,2,3,4,5,6,8,7] now 5 is at its place, go to the right and find another chain
i:5 [1,2,3,4,5,6,8,7] 6 at its place
i:6 [1,2,3,4,5,6,8,7] 8<->7
i:6 [1,2,3,4,5,6,7,8] now 7 is at its place, go to the right and find another chain
i:7 [1,2,3,4,5,6,7,8] 8 at its place
END
Beware that the algorithm can't sort the array given in the link! What the algorithm guarantees is that if an element e is present in the initial array, then e will be at its place at the end. In the given example, 3 is present twice: one copy ends up at the right place, but the other does not! At the end, the algorithm keeps the values that sit at their right places and ignores the others. So it is really a "sort and remove duplicates" algorithm, or a "longest strictly increasing subsequence" algorithm.
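The swap loop under discussion, transcribed to Python (a sketch; the linked original is Java, and the function name is mine):

```python
def place_in_chains(nums):
    # Values are expected to lie in 1..len(nums). Follow each chain
    # of misplaced values, swapping nums[i] toward its home slot.
    for i in range(len(nums)):
        # First test: is slot i already correct?
        # Second test: does the home slot already hold a duplicate?
        # (Without it, two equal values would swap forever.)
        while nums[i] != i + 1 and nums[i] != nums[nums[i] - 1]:
            j = nums[i] - 1
            nums[i], nums[j] = nums[j], nums[i]
    return nums
```

On [3,1,2,4,6,5,8,7] this produces the fully sorted array, following the trace above. With duplicates, e.g. [4,3,2,7,8,2,3,1], every value that occurs lands in its slot once, and the slots that remain wrong (here slots 5 and 6) are exactly the missing numbers, which is what the LeetCode problem exploits.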
I'm trying to understand this code my pairing partner wrote. I don't understand why she used an until loop that runs until (finish - start) == 1. What exactly is she looping until?
def binary_search(object, array)
array.sort!
start = -1
finish = array.length
until (finish - start) == 1 do
median = start + ((finish - start) / 2)
# p start
# p finish
return median if object == array[median]
if object > array[median]
start = median
elsif object < array[median]
finish = median
end
end
-1
end
finish - start is the length of the window left to search (plus 1, for easier arithmetic); it starts out as the entire array and is halved on every iteration, by setting either start or finish to the median.
When it reaches 1, there is nothing left to search, and the input object was not found.
Think about how kids play the "guess a number between 1 and 100" game. "Is it bigger than 50?" "No." You now know it's a number between 1 and 50. "Is it bigger than 25?" "Yes." You now know it's between 26 and 50. And so on...
It's the same with binary search. You check to see if the target is above or below the midrange. Whichever way the answer turns out, you've eliminated half of the possibilities and can focus on the remaining subset. Every time you repeat the process, you cut the range that's still under consideration in half. When the range gets down to size one, you've either found the target value or established it wasn't in the set.
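For comparison, here is the same exclusive-bounds search transcribed to Python (a sketch, not the partner's code), with the invariant spelled out in comments:

```python
def binary_search(target, array):
    # start and finish are *exclusive* bounds (hence -1 and len):
    # the invariant is that if target is present, its index lies
    # strictly between start and finish.
    start, finish = -1, len(array)
    while finish - start > 1:            # same as: until the gap is 1
        median = start + (finish - start) // 2
        if array[median] == target:
            return median
        if target > array[median]:
            start = median               # target is to the right
        else:
            finish = median              # target is to the left
    return -1                            # gap closed: not present
```

When finish - start reaches 1, no index lies strictly between the two bounds, which is exactly the "nothing left to search" condition the until loop tests.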
I want an algorithm to simulate this loaded die:
the probabilities are:
1: 1/18
2: 5/18
3: 1/18
4: 5/18
5: 1/18
6: 5/18
It favors even numbers.
My idea is to simulate these probabilities in MATLAB.
I can do it for 1/6 (a normal die), but I am having difficulty applying it to a loaded die.
One way: generate two random numbers. The first is from 0 to 5 (0: odd, 1-5: even) and determines even or odd; the second is between 0 and 2 and determines the exact number within its category. For example, if the first number is 3 (which says even) and the second is 2 (which says the third chunk: 1-2 is a chunk, 3-4 is another, and 5-6 is the last), then the result is 6.
Another way: generate a single random number between 0 and 17, then use / 6 and % 6 to decide. For example, if / 6 gives you 0, the choice is between 1 and 2; then, if % 6 == 0, the choice lands on 1, otherwise it lands on 2.
In matlab:
ceil(rand*3)*2-(rand>(5/6))
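The second approach (one number from 0 to 17) can also be sketched in Python; since the mapping from the 18 equally likely outcomes to faces is deterministic, the weights can be checked exhaustively (the function name is mine):

```python
from collections import Counter

def face_from(r):
    # r is a uniform draw from 0..17.
    pair = r // 6            # 0 -> faces 1/2, 1 -> 3/4, 2 -> 5/6
    if r % 6 == 0:
        return 2 * pair + 1  # the odd face: 1 outcome of 6, i.e. 1/18
    return 2 * pair + 2      # the even face: 5 outcomes of 6, i.e. 5/18

# Exhaustive check of all 18 equally likely outcomes:
counts = Counter(face_from(r) for r in range(18))
```

An actual roll is then `face_from(random.randrange(18))`, and the exhaustive count shows each odd face claims 1 of the 18 outcomes and each even face claims 5, matching the target distribution.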
The generic solution:
Use roulette wheel selection
n = generate number between 0 and sum( probabilities )
s = 0;
i = 0;
while s <= n do
i = i + 1;
s = s + probability of element i;
done
After the loop is done, i will be the number of the chosen element. This works for any kind of skewed probability distribution, even when you have weights instead of probabilities and want to skip normalizing.
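The roulette-wheel loop above, sketched in Python. Passing the uniform draw u in as a parameter (my choice, not part of the pseudocode) keeps the selection step deterministic and testable:

```python
import random

def roulette(weights, u):
    # u must be a uniform draw from [0, sum(weights)).
    # Returns the 1-based index of the chosen element.
    s = 0.0
    for i, w in enumerate(weights):
        s += w
        if u < s:
            return i + 1
    return len(weights)          # floating-point safety net

def roll_loaded_die():
    weights = [1, 5, 1, 5, 1, 5]   # the numerators over 18
    return roulette(weights, random.uniform(0, sum(weights)))
```

With weights [1, 5, 1, 5, 1, 5] a draw of u = 0.5 lands on face 1, u = 1.5 on face 2, and so on: each face owns a slice of the [0, 18) interval proportional to its weight, which is exactly the roulette-wheel idea.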
In the concise language of J,
>:3(<([++:#])|)?18