Knuth, The Art of Computer Programming, ex. 1.1.8 - algorithm

I can't figure out what Knuth meant in his instructions for exercise 8 of Chapter 1.1.
The task is to make an efficient gcd algorithm for two positive integers m and n using his notation theta[j], phi[j], a[j] and b[j], where theta and phi are strings and a and b are positive integers that represent computational steps in this case.
Let the input be a string of the form a^mb^n.
An excellent explanation of Knuth's algorithm is given by schnaader here.
My question is how this may be connected with the direction given in the exercise to use his Algorithm E from the book, with the original r (remainder) replaced by |m - n| and n replaced by min(m, n).

When Knuth says "Let the input be represented by the string a^mb^n", what he means is that the input consists of m copies of a followed by n copies of b. So the input f((m,n)) with m = 3 and n = 2 would be represented by the string aaabb.
Take a moment to look back at his equation (3) in that chapter, which defines a Markov algorithm (below).
f((σ,j)) = (σ,a_j) if θ_j does not occur in σ
f((σ,j)) = (αφ_jω,b_j) if α is the shortest string for which σ = αθ_jω
f((σ,N)) = (σ,N)
So the idea is to define the sequences θ_j, φ_j, a_j and b_j for each j. This way, using the Markov algorithm above, you can specify what happens to your input string depending on the value of j.
Now, to get onto your main question:
My question is how this may be connected with the direction given in the exercise to use his Algorithm E from the book, with the original r (remainder) replaced by |m - n| and n replaced by min(m, n).
Essentially what Knuth is saying here is that you can't do division with the Markov algorithm above. So what's the closest thing to division? Well, we can subtract the smaller number from the larger number until we're left with a remainder. For example:
10 % 4 = 2 is the same as doing the following:
10 - 4 = 6 Can we remove another 4? Yes. Do it again.
6 - 4 = 2 Can we remove another 4? No. We have our remainder.
This is essentially what he wants you to do with your input string, e.g. aaabb.
If you read through Knuth's suggested answer in the back of the book and work through a couple of examples, you will see that this is essentially what he is doing: removing the pairs ab and then adding a c until no more pairs ab exist. Taking the example I used at the top, the string is manipulated as follows: aaabb, aab, caab, ca, cca, ccb, aab (then start again).
This is the same as r = m % n, m = n, n = r (then start again). The difference, of course, is that this description uses the modulus operator (division), whereas the string manipulation above uses only subtraction.
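To make the connection concrete, here is a minimal Python sketch (my own illustration, not Knuth's θ/φ notation) of Algorithm E with the substitution the exercise suggests: the remainder r becomes |m - n| and n becomes min(m, n), so only subtraction is needed:

def gcd_by_subtraction(m, n):
    # Euclid's algorithm with r = |m - n| and n = min(m, n) instead of division
    while m != 0 and n != 0:
        m, n = abs(m - n), min(m, n)
    return m + n  # one of the two is now 0; the other is the gcd

print(gcd_by_subtraction(3, 2))  # 1, matching the aaabb example above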
I hope this helps. I actually wrote a more in-depth analysis of Knuth's variation on a Markov algorithm on my blog, so if you're still feeling a little stuck, have a read through the series.

Related

Diff algorithm with fuzzy difference metric

I'm looking for an algorithm similar to longest common subsequence algorithms, but with a letter-similarity metric on the alphabet. What I mean is that known algorithms treat all letters of the alphabet as completely distinct; in my use case some letters are easier to edit into one another, and those should be treated as similar by the diffing algorithm.
As a usage example, think of a diffing algorithm working on lines of text where some lines are more similar to certain other lines.
The paper An O(ND) Difference Algorithm and Its Variations states on page 4: "Consider adding a weight or cost to every edge. Give diagonal edges weight 0 and non-diagonal edges weight 1." I'd like to have the option of assigning any weight from the interval [0, 1].
The Longest Common Subsequence (LCS) problem is usually solved with dynamic programming, and you can tweak existing methods to apply them to your use case.
In this example explaining how LCS works (from Wikipedia) https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Example, you should think about tweaking the algorithm so that:
instead of scoring
score_j = score_i + 1, for j = i + 1 (that is to say, you add 1 whenever a new common item is added to the LCS),
you should score
score_j = F(score_i, p(letter_i, letter_j)) no matter what.
p(letter_i, letter_j) is the probability of changing letter_i into letter_j (that is the weight in [0, 1] you were talking about).
F is an aggregation function that takes you from score_i to score_j knowing that probability.
For instance, F can be defined as the + operator. It would then yield:
score_j = score_i + p(letter_i, letter_j)
or, more precisely:
score_j = score_i + p(letter_i, letter_j) * 1 (read the * 1 as the contribution of one character),
and that would give you the maximum similarity (in characters) of the two subsequences, which you can find by backtracking at the end of the algorithm.
You can find your own function F to yield better results.
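For illustration only (the weighting function p, the choice F = +, and all names below are my own assumptions, not part of the references above), here is a small dynamic-programming sketch of this weighted LCS variant in Python:

def weighted_lcs_score(a, b, p):
    # p(x, y) in [0, 1] is the similarity of letters x and y.
    # With p(x, y) = 1 when x == y and 0 otherwise, this reduces to classic LCS.
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]  # dp[i][j]: best score for a[:i], b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j],                              # skip a[i-1]
                dp[i][j - 1],                              # skip b[j-1]
                dp[i - 1][j - 1] + p(a[i - 1], b[j - 1]),  # pair them up with weight p
            )
    return dp[n][m]

# Example: exact matches weigh 1.0, while 'l'/'1' and 'o'/'0' are considered close (0.5)
close = {("l", "1"), ("o", "0")}
p = lambda x, y: 1.0 if x == y else (0.5 if (x, y) in close or (y, x) in close else 0.0)
print(weighted_lcs_score("hello", "he110", p))  # 3.5

Backtracking through dp in the usual way then recovers which characters were paired.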

Algorithm to figure out whether a given set of characters covers a given set of words

I came across the following algorithms problem (src: http://www.careercup.com/question?id=15029889)
Problem:
c = ‘a’
w = “apple”
c covers w, iff w contains c.
c_set = {‘a’, ‘b’, ‘c’, ‘g’}
w_set = {‘apple’, ‘ibm’, ‘cisco’, ‘google’}
c_set covers w_set, iff every w in w_set is covered by some c in c_set.
Given c_set and w_set, determine whether w_set is covered by c_set.
Follow-up: if w_set is fixed, say a book, how do we determine whether a c_set covers this w_set?
One possible solution is:
If we use 26bits(1 bit for each character) to represent each word in w_set,
and then also form a similar bitmask for c_set, a solution to check coverage
would be to do "c_set_bitmask & word_i_bitmask". If this is non-zero for each
word, then we have covered every word.
My question is, can we do anything better than this, given that
the w_set is static e.g. a book.
One possible solution is O(n log k) ~ O(n), where n = total number of letters in w_set and k = number of letters in c_set. Since max(k) = 26, we can treat k as a constant.
The algorithm seems naïve, but I can't see any solution better than O(n), because any algorithm must at least scan all the letters in w_set to check whether they are present in c_set.
Sort c_set lexicographically. (O(k log k)) ~ constant
Scan each word in w_set and perform a binary search in c_set for each of its letters. (O(n log k)) ~ O(n)
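As a concrete illustration of the bitmask idea quoted above (the function names are mine), here is a small Python sketch: the per-word masks are computed once for the static w_set, and each c_set query is then just one AND per word:

def letter_mask(s):
    # 26-bit mask with bit (ord(ch) - ord('a')) set for every lowercase letter in s
    m = 0
    for ch in s:
        m |= 1 << (ord(ch) - ord('a'))
    return m

def covers(c_set, word_masks):
    # c_set covers w_set iff every word shares at least one letter with c_set
    c_mask = letter_mask(c_set)
    return all(c_mask & wm for wm in word_masks)

# Precompute once for the static w_set ("the book")...
w_set = ['apple', 'ibm', 'cisco', 'google']
word_masks = [letter_mask(w) for w in w_set]
# ...then each query costs one AND per word (or per distinct mask, if you deduplicate).
print(covers('abcg', word_masks))   # True
print(covers('xyz', word_masks))    # False

For a truly static book you can go further and keep only the set of distinct word masks, since two words with the same letter set behave identically.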

Efficient way to take determinant of an n! x n! matrix in Maple

I have a large matrix, n! x n!, for which I need to take the determinant. For each permutation of n, I associate
a vector of length 2n (this is easy computationally)
a polynomial in 2n variables (a product of linear factors computed recursively on n)
The matrix is the evaluation matrix for the polynomials at the vectors (thought of as points). So the sigma,tau entry of the matrix (indexed by permutations) is the polynomial for sigma evaluated at the vector for tau.
Example: For n=3, if the ith polynomial is (x1 - 4)(x3 - 5)(x4 - 4)(x6 - 1) and the jth point is (2,2,1,3,5,2), then the (i,j)th entry of the matrix will be (2 - 4)(1 - 5)(3 - 4)(2 - 1) = -8. Here n=3, so the points are in R^(3!) = R^6 and the polynomials have 3!=6 variables.
My goal is to determine whether or not the matrix is nonsingular.
My approach right now is this:
the function point takes a permutation and outputs a vector
the function poly takes a permutation and outputs a polynomial
the function nextPerm gives the next permutation in lexicographic order
The abridged pseudocode version of my code is this:
B := [];
P := [];
w := [1,2,...,n];
while w <> NULL do
    B := B append poly(w);
    P := P append point(w);
    w := nextPerm(w);
od;
// BUILD A MATRIX IN MAPLE
M := Matrix(n!, (i,j) -> eval(B[i],P[j]));
// COMPUTE DETERMINANT IN MAPLE
det := LinearAlgebra[Determinant]( M );
// TELL ME IF IT'S NONSINGULAR
if det = 0 then return false;
else return true; fi;
I'm working in Maple using the built in function LinearAlgebra[Determinant], but everything else is a custom built function that uses low level Maple functions (e.g. seq, convert and cat).
My problem is that this takes too long: with patience I can go up to n=7, but getting to n=8 takes days. Ideally, I want to be able to get to n=10.
Does anyone have an idea for how I could improve the time? I'm open to working in a different language, e.g. Matlab or C, but would prefer to find a way to speed this up within Maple.
I realize this might be hard to answer without all the gory details, but the code for each function, e.g. point and poly, is already optimized, so the real question here is if there is a faster way to take a determinant by building the matrix on the fly, or something like that.
UPDATE: Here are two ideas that I've toyed with that don't work:
1. I can store the polynomials into a vector of length n! (since they take a while to compute, I don't want to redo that if I can help it), compute the points on the fly, and plug these values into the permutation formula for the determinant.
The problem here is that the permutation formula is O(N!) in the size N of the matrix, so for my case this will be O((n!)!). When n=10, (n!)! = 3,628,800!, which is way too big to even consider doing.
2. Compute the determinant using an LU decomposition. Luckily, the main diagonal of my matrix is nonzero, so this is feasible. Since this is O(N^3) in the size of the matrix, that becomes O((n!)^3), which is much closer to doable. The problem, though, is that it requires me to store the whole matrix, which puts serious strain on memory, never mind the run time. So this doesn't work either, at least not without a bit more cleverness. Any ideas?
It isn't clear to me whether your problem is space or time; obviously the two trade off against each other. If you only wish to know whether the determinant is zero or not, then you should definitely go with LU decomposition. The reason is that if A = LU with L lower triangular and U upper triangular, then
det(A) = det(L) det(U) = l_11 * ... * l_nn * u_11 * ... * u_nn
so you only need to determine if any of the main diagonal entries of L or U is 0.
To simplify further, use Doolittle's algorithm, where l_ii = 1. If at any point the algorithm breaks down, the matrix is singular so you can stop. Here's the gist:
for k := 1, 2, ..., n do {
    for j := k, k+1, ..., n do {
        u_kj := a_kj - sum_{s=1..k-1} l_ks * u_sj;
    }
    for i := k+1, k+2, ..., n do {
        l_ik := (a_ik - sum_{s=1..k-1} l_is * u_sk) / u_kk;
    }
}
The key is that you can compute the kth row of U and the kth column of L at the same time, needing only the rows and columns already computed to move forward. Since you can compute the entries a_ij as needed, you never have to materialize A itself: at each step you generate one new row of U and one new column of L and keep only the triangular factors built so far. The full decomposition takes O(n^3) time in the dimension of the matrix. You might be able to find a few more tricks, but that depends on your space/time trade-off.
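Here is a minimal Python sketch of that idea (my own illustration, not the poster's Maple code): Doolittle's method with l_ii = 1, with the matrix entries supplied lazily by a callback, stopping as soon as a zero pivot signals a singular matrix:

def is_nonsingular(n, entry, eps=1e-12):
    # Doolittle LU without pivoting; entry(i, j) returns a_ij on demand (0-based).
    # Returns False as soon as a (near-)zero pivot appears.
    U = [[0.0] * n for _ in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for k in range(n):
        L[k][k] = 1.0
        for j in range(k, n):   # row k of U
            U[k][j] = entry(k, j) - sum(L[k][s] * U[s][j] for s in range(k))
        if abs(U[k][k]) < eps:
            return False        # zero pivot: singular (or at least needs pivoting)
        for i in range(k + 1, n):   # column k of L
            L[i][k] = (entry(i, k) - sum(L[i][s] * U[s][k] for s in range(k))) / U[k][k]
    return True

# Tiny usage example with a matrix supplied lazily:
A = [[2.0, 1.0], [4.0, 2.0]]   # second row is twice the first, so singular
print(is_nonsingular(2, lambda i, j: A[i][j]))   # False

Note the caveat in the comment: without pivoting, a zero pivot can also occur for a nonsingular matrix, so for a definitive answer you would want to add row exchanges before concluding.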
Not sure if I've followed your problem; is it (or does it reduce to) the following?
You have two vectors of n numbers, call them x and c; the matrix element is then the product over k of (x_k + c_k), with each row/column corresponding to a distinct ordering of x and c?
If so, then I believe the matrix will be singular whenever there are repeated values in either x or c, since the matrix will then have repeated rows/columns. Try a bunch of Monte Carlo runs at a smaller n with distinct values of x and c to see whether that case is nonsingular in general - if it's true for 6, it's quite likely to be true for 10.
As far as brute force goes, of the two methods in your update:
1. the permutation formula is a non-starter;
2. the LU decomposition will work much more quickly (it should be a few seconds for n=7), though instead of LU you might want to try an SVD, which will do a much better job of telling you how well behaved your matrix is.
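If it helps, here is a small numpy sketch (based on my guess about the structure above, so the entry definition is an assumption) for running that kind of Monte Carlo check at small n:

import itertools
import numpy as np

def mc_singularity_check(n, trials=20, seed=0):
    # Assumed structure: rows/columns indexed by permutations sigma, tau of {0..n-1},
    # entry(sigma, tau) = prod_k (x[tau[k]] + c[sigma[k]]).
    rng = np.random.default_rng(seed)
    perms = list(itertools.permutations(range(n)))
    singular = 0
    for _ in range(trials):
        x = rng.uniform(1.0, 10.0, size=n)   # distinct with probability 1
        c = rng.uniform(1.0, 10.0, size=n)
        M = np.array([[np.prod(x[list(tau)] + c[list(sigma)])
                       for tau in perms] for sigma in perms])
        sign, _ = np.linalg.slogdet(M)       # sign == 0 means numerically singular
        singular += int(sign == 0)
    return singular, trials

print(mc_singularity_check(3))   # (number of singular trials, total trials) for n = 3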

Finding even numbers in an array without using feedback

I saw this post: Finding even numbers in an array and I was thinking about how you could do it without feedback. Here's what I mean.
Given an array of length n containing at most e even numbers and a
function isEven that returns true if the input is even and false
otherwise, write a function that prints all the even numbers in the
array using the fewest number of calls to isEven.
The answer on the post was to use a binary search, which is neat since the array doesn't have to be in order. The number of times you have to check if a number is even is e log n instead of n, because you do a binary search (log n) to find one even number each time (e times).
But that idea means that you divide the array in half, test for evenness, then decide which half to keep based on the result.
My question is whether or not you can beat n calls on a fixed testing scheme where you check all the numbers you want for evenness without knowing the outcome, and then figure out where the even numbers are after you've done all the tests based on the results. So I guess it's no-feedback or blind or some term like that.
I was thinking about this for a while and couldn't come up with anything. The binary search idea doesn't work at all with this constraint, but maybe something else does? Even getting down to n/2 calls instead of n (yes, I know they are the same big-O) would be good.
The technical term for "no-feedback or blind" is "non-adaptive". O(e log n) calls still suffice, but the algorithm is rather more involved.
Instead of testing the evenness of products, we're going to test the evenness of sums. Let E ≠ F be distinct subsets of {1, …, n}. If we have one array x_1, …, x_n with even numbers at positions E and another array y_1, …, y_n with even numbers at positions F, how many subsets J of {1, …, n} satisfy
(∑_{i in J} x_i) mod 2 ≠ (∑_{i in J} y_i) mod 2?
The answer is 2^(n-1). Let i be an index such that x_i mod 2 ≠ y_i mod 2, and let S be a subset of {1, …, i - 1, i + 1, …, n}. Either J = S is a solution or J = S ∪ {i} is a solution, but not both.
For every possible outcome E, we need to make calls that eliminate every other possible outcome F. Suppose we make 2e log n calls at random. For each pair E ≠ F, the probability that we still cannot distinguish E from F is (2^(n-1)/2^n)^(2e log n) = n^(-2e), because there are 2^n possible calls and only 2^(n-1) of them fail to distinguish. There are at most n^e + 1 choices of E and thus at most (n^e + 1) n^e / 2 pairs. By a union bound, the probability that there exists some indistinguishable pair is at most n^(-2e) (n^e + 1) n^e / 2 < 1 (assuming we're looking at an interesting case where e ≥ 1 and n ≥ 2), so there exists a sequence of 2e log n calls that does the job.
Note that, while I've used randomness to show that a good sequence of calls exists, the resulting algorithm is deterministic (and, of course, non-adaptive, because we chose that sequence without knowledge of the outcomes).
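As an illustration only (the decoding step and all names are my own, and random query sets are used rather than an explicit good sequence), here is a small Python sketch of this non-adaptive sum-parity scheme:

import itertools
import random

def non_adaptive_even_positions(xs, e, num_calls=None, seed=0):
    # Non-adaptive: all query subsets are fixed up front, before any parity is observed.
    n = len(xs)
    if num_calls is None:
        num_calls = 2 * e * max(1, n.bit_length())   # roughly 2e log n random calls
    rng = random.Random(seed)
    queries = [frozenset(i for i in range(n) if rng.random() < 0.5)
               for _ in range(num_calls)]
    # the only parity tests performed: is each queried subset sum even?
    observed = [sum(xs[i] for i in J) % 2 == 0 for J in queries]

    def consistent(evens):
        evens = set(evens)
        # a subset sum is even iff it contains an even number of odd entries
        return all((sum(1 for i in J if i not in evens) % 2 == 0) == obs
                   for J, obs in zip(queries, observed))

    return [set(c) for k in range(e + 1)
            for c in itertools.combinations(range(n), k) if consistent(c)]

xs = [3, 7, 12, 5, 9, 4, 1, 11]   # even entries at 0-based positions 2 and 5
print(non_adaptive_even_positions(xs, e=2))   # typically [{2, 5}]

With random queries the answer is only unique with high probability; the argument above shows that a fixed sequence of 2e log n queries exists for which it is always unique.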
You can use the Chinese Remainder Theorem to do this. I'm going to change your notation a bit.
Suppose you have N numbers of which at most E are even. Choose a sequence of distinct prime powers q1,q2,...,qk such that their product is at least N^E, i.e.
qi = pi^ei
where pi is prime and ei > 0 is an integer and
q1 * q2 * ... * qk >= N^E
Now make a bunch of 0-1 matrices. Let Mi be the qi x N matrix where the entry in row r and column c has a 1 if c = r mod qi and a 0 otherwise. For example, if qi = 3^2, then row 2 has ones in columns 2, 11, 20, ... 2 + 9j and 0 elsewhere.
Now stack these matrices vertically to get a Q x N matrix M, where Q = q1 + q2 + ... + qk. The rows of M tell you which numbers to multiply together (the nonzero positions). This gives a total of Q products that you need to test for evenness. Call each row a "trial", and say that a trial "involves j" if the jth column of that row is nonzero. The theorem you need is the following:
THEOREM: The number in position j is even if and only if all trials involving j are even.
So you do a total of Q trials and then look at the results. If you choose the prime powers intelligently, then Q should be significantly smaller than N. There are asymptotic results that show you can always get Q on the order of
(2E log N)^2 / 2log(2E log N)
This theorem is actually a corollary of the Chinese Remainder Theorem. The only place that I've seen this used is in Combinatorial Group Testing. Apparently the problem originally arose when testing soldiers coming back from WWII for syphilis.
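A small Python sketch of that construction (my own illustration; the prime powers here are chosen just to make the example work, with 3 * 4 = 12 >= 10^1 for N = 10 and E = 1):

def find_even_positions(nums, prime_powers):
    # One trial per residue class r mod q: the parity of the product of the numbers
    # at positions j (1-based) with j % q == r.  Assuming prod(prime_powers) >= N**E,
    # the theorem above says position j is even iff every trial involving j is even.
    N = len(nums)
    even_trial = {(r, q): any(nums[j - 1] % 2 == 0 for j in range(1, N + 1) if j % q == r)
                  for q in prime_powers for r in range(q)}   # Q = sum(prime_powers) trials
    return [j for j in range(1, N + 1)
            if all(even_trial[(j % q, q)] for q in prime_powers)]

# N = 10 numbers, at most E = 1 even entry; Q = 3 + 4 = 7 trials instead of 10 direct tests
nums = [1, 3, 5, 7, 8, 11, 13, 15, 17, 19]   # the single even number is at position 5
print(find_even_positions(nums, [3, 4]))      # [5]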
The problem you are facing is a form of group testing, a type of problem whose objective is to reduce the cost of identifying certain elements of a set (up to d elements out of a set of N elements).
As you've already stated, there are two basic principles via which the testing may be carried out:
Non-adaptive Group Testing, where all the tests to be performed are decided a priori.
Adaptive Group Testing, where we perform several tests, basing each test on the outcomes of previous tests. Obviously, adaptive testing has the potential to reduce the cost compared to non-adaptive testing.
Theoretical bounds for both principles have been studied, and are available in this Wiki article, or this paper.
For adaptive testing, the upper bound is O(d*log(N)) (as already described in this answer).
For non-adaptive testing, it can be shown that the upper bound is O(d*d/log(d)*log(N)), which is obviously larger than the upper bound for adaptive testing by a factor of d/log(d).
This upper bound for non-adaptive testing comes from an algorithm which uses disjunct matrices: matrices of dimension T x N ("number of tests" x "number of elements"), where each entry is either true (the element was included in that test) or false (it wasn't), with the property that for any set of d columns and any other column, there is at least one row (test) that includes the other column but none of the d. This allows decoding in linear time (there are also "d-separable" matrices, which need fewer tests, but the time complexity of decoding them is exponential and not computationally feasible).
Conclusion:
My question is whether or not you can beat n calls on a fixed testing scheme [...]
For such a scheme and a sufficiently large value of N, a disjunct matrix can be constructed which would have less than K * [d*d/log(d)*log(N)] rows. So, for large values of N, yes, you can beat it.
The underlying question (challenge) is kind of silly. If the binary search answer is acceptable (where it sums sub-arrays and sends the sums to IsEven), then I can think of a way to do it with E or fewer calls to IsEven (assuming the numbers are integers, of course).
JavaScript to demonstrate
// sort the array by only the first bit of the number
A.sort(function(x,y) { return (x & 1) - (y & 1); });
// all of the evens will be at the beginning
for(var i = 0; i < E && i < A.length; i++) {
    if(IsEven(A[i]))
        Print(A[i]);
    else
        break;
}
Not exactly a solution, but just few thoughts.
It is easy to see that if a solution exists for array length n that takes fewer than n tests, then for any array length m > n there is also a solution with fewer than m tests. So, if you have a solution for n = 2 or 3 or 4, the problem is solved.
You can split the array into pairs of numbers and, for each pair: if the sum is odd, then exactly one of them is even; otherwise, if one of the numbers is even, then both of them are even. This way each pair takes either one or two tests. Best case: n/2 tests; worst case: n tests; if even and odd numbers are chosen with equal probability: 3n/4 tests.
My hunch is that there is no solution with fewer than n tests. I'm not sure how to prove it.
UPDATE: The second solution can be extended in the following way.
Check if the sum of two numbers is even. If it is odd, then exactly one of them is even. Otherwise, label the pair as a "homogeneous set of size 2". Take two homogeneous sets of the same size n, pick one number from each, and check if their sum is even. If it is, combine the two sets into a "homogeneous set of size 2n". Otherwise, one of the sets consists purely of even numbers and the other purely of odd numbers.
Best case: n/2 tests. Average case: 3n/4. The worst case is still n, and it occurs only when all the numbers are even or all of them are odd.
If we can add and multiply array elements, then we can compute every Boolean function (up to complementation) on the low-order bits. Simulate a circuit that encodes the positions of the even numbers as a number from 0 to nC0 + nC1 + ... + nCe - 1 represented in binary and use calls to isEven to read off the bits.
Number of calls used: within 1 of the information-theoretic optimum.
See also fully homomorphic encryption.

Calculate discrete logarithm

Given positive integers b, c, m where (b < m) is True, the task is to find a positive integer e such that
(b**e % m == c) is True,
where ** is exponentiation (e.g. in Ruby or Python; ^ in some other languages) and % is the modulo operation. What is the most efficient algorithm (with the lowest big-O complexity) to solve this?
Example:
Given b=5; c=8; m=13 this algorithm must find e=7 because 5**7%13 = 8
From the % operator I'm assuming that you are working with integers.
You are trying to solve the Discrete Logarithm problem. A reasonable algorithm is Baby step, giant step, although there are many others, none of which are particularly fast.
The difficulty of finding a fast solution to the discrete logarithm problem is a fundamental part of some popular cryptographic algorithms, so if you find a better solution than any of those on Wikipedia please let me know!
This isn't a simple problem at all. It is called computing the discrete logarithm, and it is the inverse operation to modular exponentiation.
There is no efficient algorithm known. That is, if N denotes the number of bits in m, all known algorithms run in O(2^(N^C)) where C > 0.
Python 3 Solution:
Thankfully, SymPy has implemented this for you!
SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python.
This is the documentation on the discrete_log function. Use this to import it:
from sympy.ntheory import discrete_log
Their example computes log_7(15) (mod 41):
>>> discrete_log(41, 15, 7)
3
Because of the (state-of-the-art, mind you) algorithms it employs, you'll get roughly O(sqrt(n)) behaviour on most inputs you try. It's considerably faster when your prime modulus has the property that p - 1 factors into a lot of small primes.
Consider a prime on the order of 100 bits (~2^100). With sqrt(n) complexity, that's still about 2^50 iterations. That being said, don't reinvent the wheel. This does a pretty good job. I might also add that it was almost 4x more memory efficient than Mathematica's MultiplicativeOrder function when I ran it with large-ish inputs (44 MiB vs. 173 MiB).
Since a duplicate of this question was asked under the Python tag, here is a Python implementation of baby step, giant step, which, as @MarkBeyers points out, is a reasonable approach (as long as the modulus isn't too large):
import math

def baby_steps_giant_steps(a, b, p, N=None):
    if not N:
        N = 1 + int(math.sqrt(p))
    # initialize the baby_steps table
    baby_steps = {}
    baby_step = 1
    for r in range(N + 1):
        baby_steps[baby_step] = r
        baby_step = baby_step * a % p
    # now take the giant steps
    giant_stride = pow(a, (p - 2) * N, p)   # a^(-N) mod p, valid when p is prime
    giant_step = b
    for q in range(N + 1):
        if giant_step in baby_steps:
            return q * N + baby_steps[giant_step]
        else:
            giant_step = giant_step * giant_stride % p
    return "No Match"
In the above implementation, an explicit N can be passed to fish for a small exponent even if p is cryptographically large. It will find the exponent as long as the exponent is smaller than N**2. When N is omitted, the exponent will always be found, but not necessarily in your lifetime or with your machine's memory if p is too large.
For example, if
p = 70606432933607
a = 100001
b = 54696545758787
then 'pow(a,b,p)' evaluates to 67385023448517
and
>>> baby_steps_giant_steps(a,67385023448517,p)
54696545758787
This took about 5 seconds on my machine. For the exponent and the modulus of those sizes, I estimate (based on timing experiments) that brute force would have taken several months.
Discrete logarithm is a hard problem
Computing discrete logarithms is believed to be difficult. No
efficient general method for computing discrete logarithms on
conventional computers is known.
I will add here a simple brute-force algorithm which tries every possible value from 1 to m and outputs a solution if one is found. Note that there may be more than one solution to the problem, or none at all. This algorithm returns the smallest possible value, or -1 if no solution exists.
def bruteLog(b, c, m):
    s = 1
    for i in range(m):
        s = (s * b) % m
        if s == c:
            return i + 1
    return -1

print(bruteLog(5, 8, 13))
and here you can see that 3 is in fact a solution:
print(5**3 % 13)
There is a better algorithm, but because it is often asked to be implemented in programming competitions, I will just give you a link to an explanation.
As said, the general problem is hard. However, a practical way to find e, if you know e is going to be small (as in your example), is simply to try each e starting from 1.
By the way, e == 3 is the first solution to your example, and you can obviously find it in 3 steps. Compare that to solving the non-discrete version and naively looking for integer solutions, i.e.
e = log(c + n*m) / log(b), where n is a non-negative integer,
which finds e == 3 in 9 steps.
