I'm tearing my hair out trying to understand why the following insertion sort program in TI-BASIC works sometimes but gives a dimension error other times.
0→dim(L₁ // clear the list L1
While J>0 and L₁(J)>K
End // end while
End // end for
Disp L₁
As far as I can tell the code is a faithful implementation of Insertion Sort based on this pseudocode:
for i ← 2 to n
    key ← A[i]
    j ← i − 1
    while j > 0 and A[j] > key do
        A[j + 1] ← A[j]
        j ← j − 1
    A[j + 1] ← key
I've tried stepping through the code manually and it looks like the BASIC version does the same as the pseudocode. What am I missing please?
OK I see the problem. TI-BASIC doesn't appear to do short circuit Boolean evaluation, so it sometimes tries to access the list at index 0, and fails. Refactoring in TI-BASIC is a real pain as there isn't even a break statement.
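For anyone hitting the same thing, here is a rough sketch of one way to restructure the inner loop so that it never depends on short-circuit evaluation and never needs a break. It is written in Python (0-indexed) rather than TI-BASIC, and the flag name `shifting` is just something I made up for illustration; the same shape should translate to a While loop with a flag variable.

```python
def insertion_sort(items):
    """Insertion sort whose inner loop tests the index and the element separately."""
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        shifting = True  # hypothetical flag standing in for the compound condition
        while shifting:
            if j < 0:
                # Index ran off the front: stop before touching a[j].
                shifting = False
            elif a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                shifting = False
        a[j + 1] = key
    return a
```

The index test always runs before the element access, so the out-of-range read cannot happen regardless of how the language evaluates `and`.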
I am currently reading chapter 2 of the CLRS Introduction to Algorithms (3rd edition) textbook, specifically the authors' interpretation of the loop invariant of this algorithm. I understand the authors' logic for both the initialization and the maintenance. However, the termination is what I am kind of bogged down on. The authors claim that at termination, j = n + 1. However, in the pseudocode of the algorithm, j loops from 2 to n. So shouldn't j = n - 1?
EDIT: The book's pseudo-code for insertion sort is:
for j = 2 to A.length
    key = A[j]
    // Insert A[j] into sorted sequence A[1...j - 1]
    i = j - 1
    while i > 0 and A[i] > key
        A[i + 1] = A[i]
        i = i - 1
    A[i + 1] = key
EDIT: After reading it carefully, I have finally understood why j = n + 1 during termination. It's because the for loop goes from 2 to n (inclusively), so after j exceeds n, the loop terminates, hence why j = n + 1 at termination. I appreciate the help.
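To convince myself, I wrote the for loop out as an explicit while loop. This is only a sketch in Python, and `for_loop_exit_value` is a name I made up; it assumes the book's "for j = 2 to n" means "start at 2, test j <= n, increment after each pass". (A native Python `for j in range(...)` would not show this, because it leaves j at the last value it actually took.)

```python
def for_loop_exit_value(n):
    """Emulate 'for j = 2 to n' and return j once the loop test fails."""
    j = 2
    while j <= n:
        # The loop body (the insertion step) would go here.
        j += 1
    return j


print(for_loop_exit_value(5))  # j after the loop, for n = 5
```

The loop only stops once the test j <= n fails, and the first value for which it fails is n + 1.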
Disclaimer: this can be totally incorrect... It is just a brain spit.
Side note: since j is incremented during this loop, the starting point is irrelevant for the end condition.
for j = 2 to A.length //A.length = n in your question
There is a bit of ambiguity in this pseudo code.
First of all, we assume j is defined outside this for loop and will have an end value when the loop has terminated. See @Dukeling's comment.
Second, your code is targeting an array, using the j as indexer: A[j]
The ambiguity lies in the word "to" in for j = 2 to A.length: does it include or exclude A.length? And then there is the indexer A[j].
In common cases, for the indexer in A[j], the valid range for j is [0...A.length -1]
Some languages use another range, namely [1...A.length]. I think this is what the author intended, because A[0] is never touched at all.
If that's the case.... and the for condition increments j before it breaks the loop (to test the condition and see that it is false), then... you'll get j = A.length + 1.
As a side note:
In common C-like languages, arrays have a valid index range of [0...A.length - 1].
And in this C example, c has the value of A.length after termination:
int c = 0;
for (c = 3; c < A.length; c++) { /* loop body */ }
// c == A.length once the loop has completed (provided A.length >= 3).
I'm having trouble understanding the notation used by a book in describing a prng testing algorithm. Here are some of the snippets in question:
My confusion is: what is the significance of j? It's not well defined at all. Is it supposed to be the index of the vectors? How does it not start from 0 then?
Continuing on:
I get that the left arrow is assignment. But again the algorithm is referring to j and only referring to j0 and j1. Again, it seems like it would be the indices of j. But then I'm especially confused by the line, "Then for j <- j, j1 - 1 ..., j0 " because it seems like it's referring to decrementing the index of j, but it's subtracting from j itself and not the subscript.
Any help much appreciated, thanks.
It appears to me that there are multiple independent uses of j here. This is typical of CS texts written by mathematicians, who -- you might be surprised to learn -- are often very sloppy indeed about their notation, expecting the reader to glark all sorts of things from implicature and context.
In the first paragraph, the author uses V[j] to refer to an arbitrary element of an array of 20-element vectors (and j does start from zero). They are defining how to fill this array of fixed-width vectors (you might find it more comfortable to think of this as a 2-dimensional array) from a 1-dimensional stream of random numbers. Maybe it helps if I write out the first two rows of the array explicitly?
V0 = (Y0, Y1, Y2, ... Y19)
V1 = (Y20, Y21, Y22, ... Y39)
Vj = (Y[j*20], Y[j*20+1], ... Y[j*20+19])
In the second paragraph, A[j] is again an arbitrary element of an array, but it's a different, unrelated, array of floating-point numbers.
In the third and fourth paragraphs, j, j0, and j1 appear to be three separate index variables, and the situation is made more confusing by cramming an algorithm into prose. The author ought to have used pseudocode and chosen better variable names. Here is an attempt to produce a pseudocode version - I deliberately kept the bad variable names, though, so you can see the correspondence.
Algorithm_S (m, n):
    # S1: Initialize.
    var A: float[n+1]
    var j, j0, j1: int
    for j in 0, 1, ..., n:
        A[j] ← 0
    A[1] ← 1
    # S2: Update probabilities.
    j0 ← 1
    j1 ← 1
    repeat n-1 times:
        j1 ← j1 + 1
        for j in j1, j1 - 1, ..., j0:
            A[j] ← (j/m)*A[j] + (1 + 1/m - j/m)*A[j-1]
            if A[j] < 1e-20:
                A[j] ← 0
                if j = j1: j1 ← j1 - 1
                if j = j0: j0 ← j0 + 1
    # S3: ...
I'm not sure this is correct, because it doesn't make a whole lot of sense to me even after unpacking it like that. The problem is probably that you only quoted the first two steps of the algorithm, so I don't know what this "auxiliary array of probabilities" will be used for and I can't tell if the code should do what it's doing. Don't worry about it.
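If it helps to poke at it, here is a direct Python transcription of steps S1 and S2 as I unpacked them above. Treat it as a sketch of my reading, not of what the book definitely intends: the function name `algorithm_s` is mine, I assume n >= 1, and I have nested the j0/j1 adjustments under the "became zero" test because that is how I read the prose.

```python
def algorithm_s(m, n):
    """Steps S1 and S2 only; returns the auxiliary array A[0..n]."""
    # S1: Initialize.
    a = [0.0] * (n + 1)
    a[1] = 1.0
    # S2: Update probabilities.
    j0 = 1
    j1 = 1
    for _ in range(n - 1):
        j1 += 1
        # Walk j downwards so that a[j - 1] still holds the previous round's value.
        for j in range(j1, j0 - 1, -1):
            a[j] = (j / m) * a[j] + (1 + 1 / m - j / m) * a[j - 1]
            if a[j] < 1e-20:
                a[j] = 0.0
                if j == j1:
                    j1 -= 1
                if j == j0:
                    j0 += 1
    return a


print(algorithm_s(2, 2))
```

For m = 2, n = 2 the entries come out as 0, 0.5 and 0.5, which at least sum to 1, as you would hope for an array of probabilities.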
In summary: you are confused because this book is confusing. It is not your fault. I would recommend you find a book that is less confusing, and maybe come back to this one when you have had a good deal more practice reading math journal articles.
j is a dummy variable. When you write A[j] for j0 <= j <= j1 it serves as a placeholder to denote all the elements of A at indexes between j0 and j1.
When you program the algorithm, you will probably declare one or more loop variables to reflect this.
I am attempting to answer some questions that I've come across. I have been watching some videos regarding running time of algorithms. From what I understood, you have to count each operation in an iteration to get the running time.
I have the following question which I don't quite understand. The constants in the choices are throwing me off. Can anyone try to explain?
What is the running time of the following code? (A is an array of size N; "B", "C", "D" are constants in the choices)
1. for j ← 2 to length(A)
2. key ← A[ j ]
3. //Insert A[ j ] is added in the sorted sequence A[1,..j-1]
4. i ← j - 1
5. while i >= 0 and A [ i ] > key
6. A[ i +1 ] ← A[ i ]
7. i ← i -1
8. A [i +1] ← key
The answer options are:
A) B * n + C + D
B) B * n * log n + C * log n + D
C) B * log n + C + D
D) B * n^2 + C * n + D
What matters to me is not really the answer, but how to actually get to it. Thanks.
This looks to me as though it's O(n^2) in the worst case, which is option (D).
Your outer loop runs across the length of the array, so your outer loop goes through n times. (It's actually n-1, but you can ignore that in this type of calculation.)
Then (step 5) you have an inner loop. (I'm assuming this loop ends after step 7 or 8, though it's not quite clear.) This inner loop also runs for up to n iterations.
In total, then, you've got an O(n) loop running inside another O(n) loop; or roughly n iterations multiplied by n iterations. That gives you O(n^2). The constant factors aren't important: it'll take some number times n^2 operations, plus some other smaller stuff we don't care about. That is what option (D) says.
It's not clear what step 3 is doing, by the way. This might be commented out, or it might be an indication that an insertion into the array is happening here. It doesn't matter, though: however the insertion is happening, it can't take more than O(n) steps. It's outside the inner loop, so the outer loop still has O(n) iterations, and the stuff inside it still runs in O(n) time. The total is still O(n^2).
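If you want to see the n^2 rather than take it on faith, you can count how often the body of the inner loop runs. Below is a small Python sketch (0-indexed, so the guard is i >= 0 on valid indices; the function name `count_shifts` is made up for this example).

```python
def count_shifts(values):
    """Run insertion sort on a copy and count executions of the inner loop body."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


n = 10
print(count_shifts(range(n, 0, -1)))  # reverse-sorted input: the worst case
print(count_shifts(range(n)))         # already sorted input: the best case
```

On reverse-sorted input the inner body runs 1 + 2 + ... + (n - 1) = n(n - 1)/2 times, which is where the B * n^2 term comes from; the remaining per-iteration and one-off costs account for the C * n and D terms.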
I think this question has been asked many times, but there still isn't a clear solution!
Anyway, this is what I found as a good answer in O(k) (possibly O(log m + log n) too). But I don't understand the part where, if M_B > M_A (or the other way round), we should be throwing away the elements after M_B. Here it's the reverse: it throws away the elements which are before M_B. Can anyone please explain why?
My other question is about doing K/2 ... we should be doing it, but it isn't obvious to me why.
[EDIT 1]
A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
k= 6
Answer is 9 (A[1])
Here is what I think: if I want to solve it in O(log k), I need to throw away k/2 elements each time.
Base solution: if K < 2: return the 2nd smallest element from A[0], A[1], B[0], B[1]
Compare A[k/2] and B[k/2]: if A[k/2] < B[k/2], then the kth smallest element will be in A[1 ... n] and B[1 ... K/2] ... okay, here I threw away k/2 elements (I can do the same for A[k/2] > B[k/2]). So now the question is: next time, is the k index K or k/2?
Is what I'm doing right?
That algorithm isn't bad -- it's better than the one which is usually referenced here on SO, in my opinion, because it's a lot simpler -- but it has one huge flaw: it requires that both vectors have at least k elements. (The problem says that they both have the same number of elements, n, but never specifies that n ≥ k; the function doesn't even let you tell it how big the vectors are. However, that's easily solved. I'll leave it as an exercise for now. In general, we'd need an algorithm like this to work on differently-sized arrays, and it does; we just need to be clear on the preconditions.)
The use of floor and ceil is nice and specific, but maybe confusing. Let's just look at this in the most general way. Also, the solution quoted seems to assume that arrays are 1-indexed (i.e. A[1] is the first element, not A[0]). The description I'm about to write, however, uses a more C-like pseudocode, so it assumes that A[0] is the first element. Consequently, I'm going to write it to find element k in the combined set, which is the (k+1)th element. And finally, the solution I'm about to describe differs subtly from the solution presented, which will be apparent in the end condition. IMHO, it's slightly better.
OK, if x is element k in a sequence, there are exactly k elements in the sequence smaller than x. (We won't deal with the case where there are repeated elements, but it's not much different. See note 3.)
Suppose that we know that A and B each have an element k. (Remember, this means they each have at least k + 1 elements.) Select any non-negative integer less than k; we'll call it i. And let j be k - i - 1 (so that i + j == k - 1). [See note 1, below.] Now, look at elements A[i] and B[j]. Let's say A[i] is smaller, since we just have to change all the names in the other case. Remember that we're assuming all the elements are different. So here's what we know at this point:
1) There are i elements in A which are < A[i]
2) There are j elements in B which are < B[j]
3) A[i] < B[j]
4) From (2) and (3), we know that:
5) There are at most j elements in B which are < A[i]
6) From (1) and (5), we know that:
7) There are at most i + j elements in A and B together which are < A[i]
8) But i + j is k - 1, so actually we know:
9) Element k of the merged array must be greater than A[i] (because A[i] is at most element i + j).
Since we know that the answer must be greater than A[i], we can discard A[0] through A[i] (actually, we just increment an array pointer, but effectively we'll discard them). However, we've now discarded i + 1 elements from the original problem. So out of the new set of elements (in the shortened A and the original B), we need element k - (i + 1), instead of the element k.
Now, let's check the precondition. We said that both A and B had an element k to start with, so they both have at least k + 1 elements. In the new problem we want to know whether the shortened A and the original B each have at least k - i elements. Clearly B does, because k - i is no greater than k. Also, we removed i + 1 elements from A. Originally it had at least k + 1 elements, so now it has at least k - i elements. So we're OK there.
Finally, let's check the termination condition. At the beginning I said that we choose non-negative integers i and j so that i + j == k - 1. That's not possible if k == 0, but it can be done for k == 1. So we only need to do something special once k reaches 0, in which case what we need to do is return min(A[0], B[0]). [This is a much simpler termination condition than in the algorithm you looked at, see Note 2.]
So what's a good strategy for picking i? We'll end up removing either i + 1 or k - i elements from the problem, and we'd like that to be as close to half of the elements as possible. So we should choose i = floor((k - 1) / 2). Although it might not be immediately obvious, that will make j = floor(k / 2).
I'm leaving out the bit where I solve the case where A and B have fewer elements. It's not complicated; I'd encourage you to think about it yourself.
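Here is a short sketch of the procedure described above, in Python rather than C-like pseudocode. The function name `kth_of_two_sorted` and the offsets `lo_a` / `lo_b` (which play the role of the "array pointers") are names I've invented for the example, and it assumes the same precondition as the text: both inputs are sorted and each has at least k + 1 elements.

```python
def kth_of_two_sorted(a, b, k):
    """Return element k (0-indexed) of the merge of sorted sequences a and b.

    Assumes len(a) > k and len(b) > k; the shorter-array case is not handled.
    """
    lo_a = 0  # number of elements of a discarded so far
    lo_b = 0  # number of elements of b discarded so far
    while k > 0:
        i = (k - 1) // 2   # floor((k - 1) / 2)
        j = k - 1 - i      # so that i + j == k - 1
        if a[lo_a + i] <= b[lo_b + j]:
            # Element k must come after a[lo_a + i]: drop i + 1 elements of a.
            lo_a += i + 1
            k -= i + 1
        else:
            # Symmetric case: drop j + 1 elements of b.
            lo_b += j + 1
            k -= j + 1
    return min(a[lo_a], b[lo_b])


A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
print(kth_of_two_sorted(A, B, 5))  # the 6th smallest, i.e. element 5 when counting from 0
```

With the arrays from the question, asking for element 5 walks through k = 5, 2, 1, 0 and returns 9, which matches the expected answer.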
[1] The algorithm you were looking at selects i + j == k (if k is even), and drops either i or j elements. Mine selects i + j == k - 1 (always) which might make one of them smaller, but then it drops i + 1 or j + 1 elements. So it should converge slightly more rapidly.
[2] The difference between selecting i + j == k (theirs) and i + j == k - 1 (mine) is apparent in the end condition. In their formulation, both i and j must be positive, because if one of them were 0, there is a risk of dropping 0 elements, which would be an infinite recursive loop. So in their formulation, the minimum possible value of k is 2, not 1, and so their termination case has to handle k == 1, which involves comparing between four elements, rather than two. For what it's worth, I believe the best solution of "find the second smallest element out of two sorted vectors" is: min(max(A[0], B[0]), min(A[1], B[1])), which requires three comparisons. This doesn't make their algorithm slower; just more complicated.
[3] Suppose elements could repeat. Actually this doesn't change anything. The algorithm still works. Why? Well, we could pretend that every element in A was actually a pair with its actual value and its actual index, and similarly for every element in B, and that we use the index as a tie breaker when comparing values within a vector. Between vectors, we give preference to all the elements in A if A[i] ≤ B[j]; otherwise to all the elements in B. This doesn't actually change the actual code at all, because we never actually have to do any comparison differently, but it makes all the inequalities in the proof valid.
I've never had much need for writing large quantities of formal pseudo-code but the need has arisen, so I thought I'd pick some standards in order to stay consistent across code.
To that effect I picked up some "iTunes U" courseware videos, amongst other things the 6.046J / 18.410J Introduction to Algorithms (SMA 5503).
In the very first lecture video, the lecturer writes Insertion Sort on the blackboard, and he writes this:
Insertion-Sort(A, N) // Sorts A[1..n]
    for j ← 2 to n
        do key ← A[j]
           i ← j-1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i-1
           A[i+1] ← key
So, my questions:
Why i ← j-1 when A[i+1] = key? That is, why ← in some cases, and = in another? Note that in the above code, the ← is used for the latter too, but in the handouts, available on the web, = is used, is this simply a typo? (I assume so)
More important, why do key ← A[j] when i ← j-1? What is so special that it requires a do command like that, and an indentation?
In other words, why isn't the above pseudo-code written like this (with my highlights):
Insertion-Sort(A, N) // Sorts A[1..n]
    for j ← 2 to n
        key ← A[j]                 <-- lost the do here
        i ← j-1                    <-- no indentation
        while i > 0 and A[i] > key
            A[i+1] ← A[i]          <-- lost the do here
            i ← i-1                <-- no indentation
        A[i+1] ← key
Final question: Does anyone have a code standard for pseudo-code handy somewhere? My main goal is consistency, so that I only have to "teach" the recipients once.
Structured English is a 'standardised' pseudo-code language.
The arrow serves as = in normal code.
The equals sign in pseudo-code serves as == in normal code.
So j <- 1 means j = 1,
and j = 1 means if (j == 1).
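As a tiny illustration of that mapping, here are the same two operations in Python (just one example of "normal code"; the variable `is_one` is made up for the sketch):

```python
j = 1               # pseudo-code: j ← 1   (assignment)
is_one = (j == 1)   # pseudo-code: j = 1   (equality test, yields a Boolean)

if is_one:
    print("j equals 1")
```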