I'm having trouble understanding the notation used by a book in describing a prng testing algorithm. Here are some of the snippets in question:
My confusion is: what is the significance of j? It's not well defined at all. Is it supposed to be the index of the vectors? How does it not start from 0 then?
Continuing on:
I get that the left arrow is assignment. But again the algorithm is referring to j and only referring to j0 and j1. Again, it seems like it would be the indices of j. But then I'm especially confused by the line, "Then for j <- j, j1 - 1 ..., j0 " because it seems like it's referring to decrementing the index of j, but it's subtracting from j itself and not the subscript.
Any help much appreciated, thanks.
It appears to me that there are multiple independent uses of j here. This is typical of CS texts written by mathematicians, who -- you might be surprised to learn -- are often very sloppy indeed about their notation, expecting the reader to glark all sorts of things from implicature and context.
In the first paragraph, the author uses V[j] to refer to an arbitrary element of an array of 20-element vectors (and j does start from zero). They are defining how to fill this array of fixed-width vectors (you might find it more comfortable to think of this as a 2-dimensional array) from a 1-dimensional stream of random numbers. Maybe it helps if I write out the first two rows of the array explicitly?
V0 = (Y0, Y1, Y2, ... Y19)
V1 = (Y20, Y21, Y22, ... Y39)
⋮
Vj = (Y[j*20], Y[j*20+1], ... Y[j*20+19])
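To make the row layout above concrete, here is a minimal Python sketch. The function name `chunk_into_vectors` and the choice to drop an incomplete final row are my own assumptions; the book excerpt doesn't say what happens to leftover values.

```python
def chunk_into_vectors(ys, width=20):
    """Split a flat sequence into consecutive fixed-width tuples.

    Row j holds ys[j*width] .. ys[j*width + width - 1]. Any incomplete
    tail is dropped (an assumption, not something the book states).
    """
    count = len(ys) // width
    return [tuple(ys[j * width:(j + 1) * width]) for j in range(count)]


if __name__ == "__main__":
    stream = list(range(45))      # stand-in for Y0, Y1, ..., Y44
    vectors = chunk_into_vectors(stream)
    print(len(vectors))           # number of complete 20-element rows
    print(vectors[1][0], vectors[1][19])
```

Here `vectors[j]` plays the role of Vj, so j is just the row index and starts at zero.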
In the second paragraph, A[j] is again an arbitrary element of an array, but it's a different, unrelated, array of floating-point numbers.
In the third and fourth paragraphs, j, j0, and j1 appear to be three separate index variables, and the situation is made more confusing by cramming an algorithm into prose. The author ought to have used pseudocode and chosen better variable names. Here is an attempt to produce a pseudocode version - I deliberately kept the bad variable names, though, so you can see the correspondence.
Algorithm_S (m, n):
    # S1: Initialize.
    var A: float[n+1]
    var j, j0, j1: int
    for j in 0, 1, ..., n:
        A[j] ← 0
    A[1] ← 1
    # S2: Update probabilities.
    j0 ← 1
    j1 ← 1
    repeat n-1 times:
        j1 ← j1 + 1
        for j in j1, j1 - 1, ..., j0:
            A[j] ← (j/m)*A[j] + (1 + 1/m - j/m)*A[j-1]
            if A[j] < 1e-20:
                A[j] ← 0
                if j = j1: j1 ← j1 - 1
                if j = j0: j0 ← j0 + 1
    # S3: ...
I'm not sure this is correct, because it doesn't make a whole lot of sense to me even after unpacking it like that. The problem is probably that you only quoted the first two steps of the algorithm, so I don't know what this "auxiliary array of probabilities" will be used for and I can't tell if the code should do what it's doing. Don't worry about it.
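For anyone who wants to experiment, here is a direct Python transcription of steps S1 and S2 as unpacked above. It is only a sketch: the function name `algorithm_s_steps_1_2` is made up, and I am assuming the two `j = j1` / `j = j0` adjustments happen only when A[j] was just zeroed. Step S3 is not quoted, so it is left out.

```python
def algorithm_s_steps_1_2(m, n):
    """Run steps S1 and S2 of the pseudocode and return the array A."""
    # S1: Initialize.
    A = [0.0] * (n + 1)
    A[1] = 1.0
    j0 = 1
    j1 = 1
    # S2: Update probabilities, n - 1 times.
    for _ in range(n - 1):
        j1 += 1
        # range() fixes the bounds up front, so changing j0 or j1
        # inside the loop does not disturb the current sweep.
        for j in range(j1, j0 - 1, -1):
            A[j] = (j / m) * A[j] + (1 + 1 / m - j / m) * A[j - 1]
            if A[j] < 1e-20:
                A[j] = 0.0
                if j == j1:
                    j1 -= 1
                if j == j0:
                    j0 += 1
    return A


if __name__ == "__main__":
    print(algorithm_s_steps_1_2(4, 3))
```

For small inputs the entries of A add up to 1, which at least fits the description of A as an array of probabilities.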
In summary: you are confused because this book is confusing. It is not your fault. I would recommend you find a book that is less confusing, and maybe come back to this one when you have had a good deal more practice reading math journal articles.
j is a dummy variable. When you write A[j] for j0 <= j <= j1 it serves as a placeholder to denote all the elements of A at indexes between j0 and j1.
When you program the algorithm, you will probably declare one or more loop variables to reflect this.
I have two arrays (a and b) of size n, (positive whole numbers)
a= [a1…..an] b= [b1….bn]
I want to store them in array c, also an array of size n
c=[c1…..cn]
where each element of c is one element from a plus one element from b (each used exactly once). Let's say the first element in c combines a1+b3.
Quick example:
n=4 a=[a1,a2,a3,a4] b=[b1,b2,b3,b4]
one way could be:
c=[a1+b2,b3+a4,a2+b1,a3+b4]
The problem is that I want to add them in a way so that the elements in c become as evenly distributed as possible,
One ideal case would be that c came out as:
c=[5,5,5,5]
but the numbers in a and b might not match up so they become even, so I want it to come as close to even as possible.
I am trying to find a way to make the difference between the biggest number in c and the smallest number in c (after combining as evenly as I can) as small as possible. In my optimal example above that would be 5-5=0, which is the best possible since 0 is the smallest difference there can be. Some other case with other numbers might come out as 6-5=1, which might be the smallest I could get in that situation.
My approach would be to sort array a in ascending order and array b in descending order, and then combine the elements at the same index. I'm not sure if this is the best or the fastest way to do this; I want my code (I'm doing it in Python) to be fast. I can't come up with a better way to distribute them more evenly. Any clue whether there are better ways to solve this problem? I really appreciate all the advice I can get! Thank you
When trying to solve it in a way where one of the arrays is ascending, and the other one being descending, there might already exist an algorithm that solves it better that I have not thought of. Thank you for reading!
Your algorithm is both correct and fast. It is just proving that it is optimal which is tricky.
We can do this by proving the following two results.
Any other matching of a and b will lead to a maximum at least as big as yours.
Any other matching of a and b will lead to a minimum at least as small as yours.
And the conclusion is that any other matching must have a maximum-minimum at least as big as yours. From which yours must be optimal.
Now let's look at part 1. Sort a ascending, and b descending. Find the i such that c[i] = a[i] + b[i] is a maximum. Suppose that m is any other matching where we're matching up a[j] + b[m[j]]. Note that m[1], ..., m[n] is a permutation of 1, ..., n.
If a[i] + b[m[i]] >= a[i] + b[i] then part 1 is true.
If a[i] + b[m[i]] < a[i] + b[i] then b[m[i]] < b[i] and so we must have i < m[i]. Now there are n-i numbers in the range i+1, ..., n. m maps something out of that range into that range. Because m is a permutation, by the pigeonhole principle, m must map something in that range, out of that range.
In other words there must be a j > i such that m[j] <= i. But now a[i] <= a[j] and b[i] <= b[m[j]] and therefore a[i] + b[i] <= a[j] + b[m[j]]. And so part 1 is true again.
That concludes the proof of part 1.
The proof of part 2 is similar. Except now a[i] + b[i] is at a minimum, m[i] < i, there is a j < i with i <= m[j], a[j] <= a[i], b[m[j]] <= b[i], and a[j] + b[m[j]] <= a[i] + b[i].
And as noted, part 1 and part 2 together imply that you've minimized the difference between the minimum and maximum.
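A small Python sketch of the ascending/descending pairing, with a brute-force check over all matchings for tiny inputs. The function names are made up for illustration, and the brute force is only there to sanity-check the claim; it is far too slow for real data.

```python
from itertools import permutations


def pair_evenly(a, b):
    """Pair the smallest of a with the largest of b, and so on."""
    return [x + y for x, y in zip(sorted(a), sorted(b, reverse=True))]


def spread(c):
    """Difference between the largest and smallest element of c."""
    return max(c) - min(c)


def brute_force_spread(a, b):
    """Smallest spread over every possible matching (tiny inputs only)."""
    return min(spread([x + y for x, y in zip(a, p)]) for p in permutations(b))


if __name__ == "__main__":
    a = [1, 4, 2, 7]
    b = [3, 8, 5, 1]
    c = pair_evenly(a, b)
    print(c, spread(c), brute_force_spread(a, b))
```

The pairing itself costs two sorts, so O(n log n), which should be fast enough in Python for most sizes.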
I'm tearing my hair out trying to understand why the following insertion sort program in TI-BASIC works sometimes but gives a dimension error other times.
0→dim(L₁ // clear the list L1
randIntNoRep(1,5,5)→L₁
For(I,2,5)
  L₁(I)→K
  I-1→J
  While J>0 and L₁(J)>K
    L₁(J)→L₁(J+1)
    J-1→J
  End // end while
  K→L₁(J+1)
End // end for
Disp L₁
As far as I can tell the code is a faithful implementation of Insertion Sort based on this pseudocode:
for i ← 2 to n
    key ← A[i]
    j ← i − 1
    while j > 0 and A[j] > key do
        A[j + 1] ← A[j]
        j ← j − 1
    A[j + 1] ← key
I've tried stepping through the code manually and it looks like the BASIC version does the same as the pseudocode. What am I missing please?
OK I see the problem. TI-BASIC doesn't appear to do short circuit Boolean evaluation, so it sometimes tries to access the list at index 0, and fails. Refactoring in TI-BASIC is a real pain as there isn't even a break statement.
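One way to restructure the loop so it never depends on short-circuit evaluation is to test the index first and only then look at the element. Here is that idea sketched in Python rather than TI-BASIC (so it is 0-indexed, and the function name is made up); the same shape should carry over using a flag variable and nested If blocks.

```python
def insertion_sort_no_short_circuit(values):
    """Insertion sort whose loop test never reads an invalid index."""
    a = list(values)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # The flag replaces the combined "j is valid AND a[j] > key" test,
        # so a[j] is only read after j has been checked.
        shifting = True
        while shifting:
            if j < 0:
                shifting = False
            elif a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                shifting = False
        a[j + 1] = key
    return a


if __name__ == "__main__":
    print(insertion_sort_no_short_circuit([3, 1, 5, 2, 4]))
```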
This seems to be a very common book (Cormen, Leiserson, Rivest, Stein) so hopefully someone will be able to assist. In chapter 8, the algorithm for counting sort is given. It makes sense where you have the input array A and you find the range from 0 to k for the size that array C will be. C[i] is then made to contain the number of elements in A equal to i. For example:
A: [2,5,3,0,2,3,0,3]
C: [2,0,2,3,0,1]
But after this they make it so that C[i] contains the number of elements less than or equal to i. For example:
C: [2,2,4,7,7,8]
Why is this necessary? Couldn't you just iterate through the original C and have a sorted array from that? You know the exact count of each number so you could just go in order putting the right amount of each number in B and have a sorted array. Does transforming C from the first form to the second form somehow make it stable?
I suppose you are proposing to do the following with the intermediate C (using 1-indexed arrays):
i = 1
for k = 1 to len(C)
    for j = 1 to C[k]
        B[i] = k
        i = i + 1
This seems to be reasonable, and has the same asymptotic running time. However, consider the case where the items whose keys you are sorting on are not just single integers, but have some other data attached to them. The counting algorithm makes the sort stable; relative orders of items with same keys are preserved (see the Wikipedia article). You lose the ability to sort general data if you just assign the output from the indices of C. Hence why the sort assigns elements via B[C[A[j]]] <- A[j].
For others who are curious, this is the completion of the original algorithm:
# C[i] now contains the number of elements equal to i.
for i = 1 to k
    C[i] <- C[i] + C[i-1]
# C[i] now contains the number of elements less than or equal to i.
for j = length[A] downto 1
    B[C[A[j]]] <- A[j]
    C[A[j]] <- C[A[j]] - 1
To explain the decrement in the last part, I cite the book, which also explains the stability of the sort:
Because the elements might not be distinct, we decrement C[A[j]] each time we place a value A[j] into the B array. Decrementing C[A[j]] causes the next input element with a value equal to A[j], if one exists, to go to the position immediately before A[j] in the output array.
Also, if we did that, I guess we wouldn't be able to call it COUNTING-SORT anymore because it wouldn't be counting the number of items less than any particular item in the input array (as they define it). :)
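To see the stability point with actual attached data, here is a Python sketch of the prefix-sum version. It is 0-indexed, so the decrement happens before the write rather than after, and the `key` parameter is my own addition for sorting records rather than bare integers.

```python
def counting_sort(items, k, key=lambda item: item):
    """Stable counting sort; key(item) must be an int in 0..k."""
    counts = [0] * (k + 1)
    for item in items:
        counts[key(item)] += 1
    # counts[i] now holds the number of items with key equal to i.
    for i in range(1, k + 1):
        counts[i] += counts[i - 1]
    # counts[i] now holds the number of items with key <= i.
    out = [None] * len(items)
    # Walking the input backwards keeps equal keys in their original order.
    for item in reversed(items):
        counts[key(item)] -= 1
        out[counts[key(item)]] = item
    return out


if __name__ == "__main__":
    print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))
    records = [(2, "a"), (0, "b"), (2, "c"), (0, "d")]
    print(counting_sort(records, 2, key=lambda r: r[0]))
```

In the second call the two records with key 0 come out as "b" then "d", and the two with key 2 as "a" then "c", which is exactly what writing B[i] = k from the plain counts cannot give you.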
I think this question has been asked many times, but there still isn't any clear solution!
Anyway, this is what I found as a good answer in O(k) (possibly O(log m + log n) too). But I don't understand the part where, if M_B > M_A (or the other way round), I would expect us to throw away the elements after M_B. But here it's the reverse: it throws away the elements before M_B. Can anyone please explain why?
http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s01/recitations/rec03/rec03.ps
And other question is doing K/2 ... we should be doing it, but it isn't obvious to me.
[EDIT 1]
Example
A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
k= 6
Answer is 9 (A[1])
Here is what I think, if I want to solve in O(Log k) ... need to throw k/2 elements each time.
Base solution: if K < 2: return 2nd smallest element from - A[0], A[1], B[0], B[1]
else:
compare A[k/2] and B[k/2]: if A[k/2] < B[k/2], then the kth smallest element will be in A[1 ... n] and B[1 ... k/2] ... okay, here I threw away k/2 elements (I can do similarly for A[k/2] > B[k/2]). So now the question is: next time, is the index still k, or k/2?
Is what I'm doing right?
That algorithm isn't bad -- it's better than the one which is usually referenced here on SO, in my opinion, because it's a lot simpler -- but it has one huge flaw: it requires that both vectors have at least k elements. (The problem says that they both have the same number of elements, n, but never specifies that n ≥ k; the function doesn't even let you tell it how big the vectors are. However, that's easily solved. I'll leave it as an exercise for now. In general, we'd need an algorithm like this to work on differently-sized arrays, and it does; we just need to be clear on the preconditions.)
The use of floor and ceil is nice and specific, but maybe confusing. Let's just look at this in the most general way. Also, the solution quoted seems to assume that arrays are 1-indexed (i.e. A[1] is the first element, not A[0]). The description I'm about to write, however, uses a more C-like pseudocode, so it assumes that A[0] is the first element. Consequently, I'm going to write it to find element k in the combined set, which is the (k+1)th element. And finally, the solution I'm about to describe differs subtly from the solution presented, which will be apparent in the end condition. IMHO, it's slightly better.
OK, if x is element k in a sequence, there are exactly k elements in the sequence smaller than x. (We won't deal with the case where there are repeated elements, but it's not much different. See note 3.)
Suppose that we know that A and B each have an element k. (Remember, this means they each have at least k + 1 elements.) Select any non-negative integer less than k; we'll call it i. And let j be k - i - 1 (so that i + j == k - 1). [See note 1, below.] Now, look at elements A[i] and B[j]. Let's say A[i] is smaller, since we just have to change all the names in the other case. Remember that we're assuming all the elements are different. So here's what we know at this point:
1) There are i elements in A which are < A[i]
2) There are j elements in B which are < B[j]
3) A[i] < B[j]
4) From (2) and (3), we know that:
5) There are at most j elements in B which are < A[i]
6) From (1) and (5), we know that:
7) There are at most i + j elements in A and B together which are < A[i]
8) But i + j is k - 1, so actually we know:
9) Element k of the merged array must be greater than A[i] (because A[i] is at most element i + j).
Since we know that the answer must be greater than A[i], we can discard A[0] through A[i] (actually, we just increment an array pointer, but effectively we'll discard them). However, we've now discarded i + 1 elements from the original problem. So out of the new set of elements (in the shortened A and the original B), we need element k - (i + 1), instead of the element k.
Now, let's check the precondition. We said that both A and B had an element k to start with, so they both have at least k + 1 elements. In the new problem we want to know whether the shortened A and the original B each have at least k - i elements. Clearly B does, because k - i is no greater than k. Also, we removed i + 1 elements from A. Originally it had at least k + 1 elements, so now it has at least k - i elements. So we're OK there.
Finally, let's check the termination condition. At the beginning I said that we choose non-negative integers i and j so that i + j == k - 1. That's not possible if k == 0, but it can be done for k == 1. So we only need to do something special once k reaches 0, in which case what we need to do is return min(A[0], B[0]). [This is a much simpler termination condition than in the algorithm you looked at, see Note 2.]
So what's a good strategy for picking i? We'll end up removing either i + 1 or k - i elements from the problem, and we'd like that to be as close to half of the elements as possible. So we should choose i = floor((k - 1) / 2). Although it might not be immediately obvious, that will make j = floor(k / 2).
I'm leaving out the bit where I solve the case where A and B have fewer elements. It's not complicated; I'd encourage you to think about it yourself.
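The steps above can be sketched in Python roughly as follows. This is a sketch under the stated precondition (both lists have at least k + 1 elements), not a general solution; the function name is made up, and "discarding" is done by advancing two offsets instead of slicing.

```python
def kth_of_two_sorted(a, b, k):
    """Return element k (0-based) of the merged order of sorted a and b.

    Precondition from the text: len(a) > k and len(b) > k.
    """
    if len(a) <= k or len(b) <= k:
        raise ValueError("both lists need at least k + 1 elements")
    lo_a = lo_b = 0  # how many elements have been discarded from each front
    while k > 0:
        i = (k - 1) // 2
        j = k - i - 1            # so that i + j == k - 1
        if a[lo_a + i] <= b[lo_b + j]:
            # The answer lies beyond a[i]: drop i + 1 elements of a.
            lo_a += i + 1
            k -= i + 1
        else:
            # Symmetric case: drop j + 1 elements of b.
            lo_b += j + 1
            k -= j + 1
    return min(a[lo_a], b[lo_b])


if __name__ == "__main__":
    A = [2, 9, 15, 22, 24, 25, 26, 30]
    B = [1, 4, 5, 7, 18, 22, 27, 33]
    print(kth_of_two_sorted(A, B, 5))  # the question's k = 6, counted from 0
```

Each pass removes about half of the remaining k, so the loop runs O(log k) times. Ties go to a, matching the tie-breaking rule in note 3.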
[1] The algorithm you were looking at selects i + j == k (if k is even), and drops either i or j elements. Mine selects i + j == k - 1 (always) which might make one of them smaller, but then it drops i + 1 or j + 1 elements. So it should converge slightly more rapidly.
[2] The difference between selecting i + j == k (theirs) and i + j == k - 1 (mine) is apparent in the end condition. In their formulation, both i and j must be positive, because if one of them were 0, there is a risk of dropping 0 elements, which would be an infinite recursive loop. So in their formulation, the minimum possible value of k is 2, not 1, and so their termination case has to handle k == 1, which involves comparing between four elements, rather than two. For what it's worth, I believe the best solution of "find the second smallest element out of two sorted vectors" is: min(max(A[0], B[0]), min(A[1], B[1])), which requires three comparisons. This doesn't make their algorithm slower; just more complicated.
[3] Suppose elements could repeat. Actually this doesn't change anything. The algorithm still works. Why? Well, we could pretend that every element in A was actually a pair with its actual value and its actual index, and similarly for every element in B, and that we use the index as a tie breaker when comparing values within a vector. Between vectors, we give preference to all the elements in A if A[i] ≤ B[j]; otherwise to all the elements in B. This doesn't actually change the actual code at all, because we never actually have to do any comparison differently, but it makes all the inequalities in the proof valid.
I've never had much need for writing large quantities of formal pseudo-code but the need has arisen, so I thought I'd pick some standards in order to stay consistent across code.
To that effect I picked up some "iTunes U" courseware videos, amongst other things the 6.046J / 18.410J Introduction to Algorithms (SMA 5503).
In the very first lecture video, the lecturer writes Insertion Sort on the blackboard, and he writes this:
Insertion-Sort(A, N) // Sorts A[1..n]
    for j ← 2 to n
        do key ← A[j]
           i ← j-1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i-1
           A[i+1] ← key
So, my questions:
Why i ← j-1 when A[i+1] = key? That is, why ← in some cases, and = in another? Note that in the above code, the ← is used for the latter too, but in the handouts, available on the web, = is used, is this simply a typo? (I assume so)
More important, why do key ← A[j] when i ← j-1? What is so special that it requires a do command like that, and an indentation?
In other words, why isn't the above pseudo-code written like this (with my highlights):
Insertion-Sort(A, N) // Sorts A[1..n]
    for j ← 2 to n
        key ← A[j]                   <-- lost the do here
        i ← j-1                      <-- no indentation
        while i > 0 and A[i] > key
            A[i+1] ← A[i]            <-- lost the do here
            i ← i-1                  <-- no indentation
        A[i+1] ← key
Final question: Does anyone have a code standard for pseudo-code handy somewhere? My main goal is consistency, so that I only have to "teach" the recipients once.
Structured English is a 'standardised' pseudo-code language.
The arrow serves as = in normal code.
The equals sign in pseudo-code serves as == in normal code,
so j <- 1 means j = 1
and j = 1 means if (j == 1)