Computing the square root of a 1000+ bit word in C - algorithm

Imagine that we have e.g. a 1000-bit word in memory. I'm wondering if there is any way to calculate its square root (not necessarily exact; let's say without the fractional part). Or suppose we've only got a memory location, with the size specified separately and varying.
I assume that our large number is one array (most significant bits at the beginning?). The square root has roughly half as many bits as the original number. When trying to use the digit-by-digit algorithm, there is a point where unsigned long long is not enough to hold the partial result (the subtraction with the 01-extended number). How do I solve that? And how do I get at a single digit of the large number, other than by bitmask?
While sketching pseudocode I got stuck on these questions. Any ideas?

How would you do it by hand? How would you divide a 1000-digit number by a 500-digit one by hand? (Just think about the method; obviously it would be quite time consuming.) A square root is computed much like such a division: you "guess" the first digit, then the second digit, and so on, subtracting as you go. For a square root you just subtract slightly different things (not that different, though: calculating a square root can be done in a way very similar to a division, except that the divisor changes with each digit you add).
I wouldn't want to tell you exactly how to do it, because that spoils the whole fun of discovering it yourself.
The trick is: Instead of thinking about the square root of x, think about finding a number y such that y*y = x. And as you improve y, recalculate x - y*y with the minimum effort.

Calculating square roots is very easily done with a binary search algorithm.
A pseudo-code algorithm:
Take a guess c: the 1000-bit value divided by 2 (a simple bitshift).
Square it:
If the square (almost) equals your 1000-bit number, you've got your answer.
If the square is smaller than your number, the root must lie between c and your upper bound.
If the square is larger than your number, the root must lie between your lower bound and c.
Repeat until you have found your root, keeping track of your upper and lower bounds.
This kind of algorithm runs in O(log N) iterations, where N is the 1000-bit value.
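A minimal C sketch of this bisection, using 64-bit words for brevity (an assumption; for a 1000-bit number the compare, divide, and midpoint steps become bignum calls, but the loop is the same):

#include <stdint.h>

uint64_t isqrt_bisect(uint64_t n) {
    uint64_t lo = 0;
    uint64_t hi = n / 2 + 1;          /* sqrt(n) <= n/2 + 1 for all n */
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;
        /* mid <= n/mid is equivalent to mid*mid <= n for mid > 0,
           and avoids the overflow of squaring mid directly. */
        if (mid <= n / mid)
            lo = mid;                 /* root is in [mid, hi] */
        else
            hi = mid - 1;             /* root is in [lo, mid-1] */
    }
    return lo;                        /* floor(sqrt(n)) */
}

Comparing mid against n/mid instead of squaring is a design choice to keep the intermediate values the same width as the input; a bignum version could equally well square mid and compare.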

Kind of depends on how accurate you want it. Consider that the square root of 2^32 is 2^16. So one thing you could do is shift the 1000-bit number 500 bits to the right, and you have an answer that would be in the ballpark.
How well does this work? Let's see. The number 36 in binary is 100100. If I shift that to the right 3 bits, then I get 4. Hmmm ... should be 6. Pretty big error of 33%. The square root of 1,000,000 is 1,000. In binary, 1,000,000 is 1111 0100 0010 0100 0000. That's 20 bits. Shifted right 10 bits, it's 1111 0100 00, or 976. The error is 24/1000, or 2.4%.
When you get to a 1,000-bit number the absolute error may be large, and the relative error can still approach the ~33% seen above, depending on where the leading 1 bit falls. But the estimate is always within a factor of sqrt(2) of the true root, which can be plenty for a ballpark figure or a starting guess.
Depending on how you're storing the numbers, shifting a 1,000 bit number 500 bits to the right shouldn't be terribly difficult.
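If the number is stored as an array of machine words (most significant word first, as the question guesses), the 500-bit shift is a word-sized memmove plus a small bit shift. A sketch, with big_shr as a hypothetical helper and the limb layout as an assumption:

#include <stdint.h>
#include <string.h>

/* Shift a big number stored as 64-bit limbs (most significant limb
   first) right by `bits`. `n` is the limb count. */
void big_shr(uint64_t *limb, int n, int bits) {
    int limb_shift = bits / 64;       /* whole-limb part: a memmove */
    int bit_shift  = bits % 64;
    if (limb_shift >= n) { memset(limb, 0, n * sizeof *limb); return; }
    if (limb_shift > 0) {
        memmove(limb + limb_shift, limb, (n - limb_shift) * sizeof *limb);
        memset(limb, 0, limb_shift * sizeof *limb);
    }
    if (bit_shift > 0) {              /* then shift the remaining bits */
        for (int i = n - 1; i > 0; i--)
            limb[i] = (limb[i] >> bit_shift) | (limb[i-1] << (64 - bit_shift));
        limb[0] >>= bit_shift;
    }
}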

Newton's method is probably the way to go. At some point with Newton's method you're going to have to perform a division (in particular, when finding the next point to test), but it might be okay to approximate this to the nearest power of two and just do a bitshift instead.
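For the curious, a 64-bit sketch of the integer Newton iteration (an illustration, not a bignum implementation; the division inside the loop is the expensive step mentioned above):

#include <stdint.h>

uint64_t isqrt_newton(uint64_t n) {
    if (n < 2) return n;
    uint64_t x = n / 2;               /* n/2 >= floor(sqrt(n)) for n >= 2 */
    uint64_t y = (x + n / x) / 2;     /* one Newton step: x' = (x + n/x)/2 */
    while (y < x) {                   /* iterates decrease until they settle */
        x = y;
        y = (x + n / x) / 2;
    }
    return x;                         /* floor(sqrt(n)) */
}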

Related

What division algorithm should be used for dividing small integers in hardware?

I need to multiply an integer ranging from 0-1023 by 1023 and divide the result by a number ranging from 1-1023 in hardware (Verilog/FPGA implementation). The multiplication is straightforward since I can probably get away with just shifting 10 bits (and if needed I'll subtract an extra 1023). The division is a little more interesting though. Area/power aren't really critical to me (I'm in an FPGA so the resources are already there). Latency (within reason) isn't a big deal so long as I can pipeline the design. There are obviously several choices with different trade-offs, but I'm wondering if there's an "obvious" or "no-brainer" algorithm for a situation like this. Given the limited range of operands and the abundance of resources that I have (BRAM etc.) I'm wondering if there isn't something obvious to do.
If you can work with fixed-point precision rather than integers, it may be possible to change:
divide the result by a number ranging from 1-1023
into multiplication by a number ranging from 1 down to 1/1023, i.e. pre-compute the divide and store that as the coefficient for the multiply.
If you can pre-compute everything, and you've got a spare 20x20 multiplier, and some way to store your pre-computed number, then go for Morgan's suggestion. You need to precompute a 20-bit multiplicand (10b quotient, 10b remainder), and multiply by your first 10b number, and take the bottom 30b of the 40b result.
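As a software model of that precomputed-reciprocal idea (the 2^30 scale and table layout here are illustrative choices, not the only ones): store ceil(2^30/d) for each divisor, then replace each divide with one multiply and a shift.

#include <stdint.h>

/* ceil(2^30/d) for d in 1..1023: a small ROM/BRAM in hardware.
   The 30-bit scale is wide enough that the floor is exact for
   20-bit numerators; narrower scales trade table width for an
   occasional off-by-one in the last place. */
static uint32_t recip[1024];

void init_recip(void) {
    for (uint64_t d = 1; d <= 1023; d++)
        recip[d] = (uint32_t)(((1ull << 30) + d - 1) / d);
}

uint32_t div_by_recip(uint32_t x, uint32_t d) {  /* x < 2^20, d in 1..1023 */
    return (uint32_t)(((uint64_t)x * recip[d]) >> 30);   /* floor(x/d) */
}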
Otherwise, the no-brainer is non-restoring division, since you say that latency isn't important (there's lots of material on the web, most of it incomprehensible). You have a 20-bit numerator (the result of your multiplication by 1023) and a 10-bit denominator. This gives a 20b quotient and a 10b remainder (i.e. 20 bits for the integer part of the answer and 10 bits for the fractional part, giving a 30b answer).
The actual hardware is pretty trivial: an 11b adder/subtractor, a 31b shift register, and a 10b or 11b register to store the divisor. You also need a small FSM to control it (2b). You have to do a compare, add or subtract, and shift in every clock cycle, and you get the answer out in 21 cycles. I think. :)
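For reference, here is a C model of the bit-serial loop that FSM would run. Note it implements plain restoring division (compare, subtract, shift), the simpler cousin of what the answer names; non-restoring folds the restore step into an add on the following cycle. It produces the 30-bit (20 integer + 10 fraction) fixed-point quotient described above.

#include <stdint.h>

uint32_t divide_fixed(uint32_t num, uint32_t den) {  /* den in 1..1023 */
    uint64_t rem = (uint64_t)num << 10;   /* numerator scaled by 2^10 */
    uint64_t d   = (uint64_t)den << 29;   /* divisor aligned to bit 29 */
    uint32_t q   = 0;
    for (int i = 0; i < 30; i++) {        /* one quotient bit per "cycle" */
        q <<= 1;
        if (rem >= d) { rem -= d; q |= 1; }
        d >>= 1;
    }
    return q;                             /* num/den in 20.10 fixed point */
}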

Distinct digit count

Is it possible to count the distinct digits in a number in constant time, O(1)?
Suppose n = 1519; the output should be 3, as there are 3 distinct digits (1, 5, 9).
I have done it in O(N) time, but does anyone know how to find it in O(1) time?
I assume N is the number of digits of n. If the size of n is unlimited, it can't be done in general in O(1) time.
Consider the number n=11111...111, with 2 trillion digits. If I switch one of the digits from a 1 to a 2, there is no way to discover this without in some way looking at every single digit. Thus processing a number with 2 trillion digits must take (of the order of) 2 trillion operations at least, and in general, a number with N digits must take (of the order of) N operations at least.
However, for almost all numbers, the simple O(N) algorithm finishes very quickly because you can just stop as soon as you get to 10 distinct digits. Almost all numbers of sufficient length will have all 10 digits: e.g. the probability of not terminating with the answer '10' after looking at the first 100 digits is about 0.00027, and after the first 1000 digits it's about 1.7e-45. But unfortunately, there are some oddities which make the worst case O(N).
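The early-exit O(N) loop, for reference (a sketch using a 64-bit argument; the same idea applies to a big number scanned digit by digit):

int distinct_digits(unsigned long long n) {
    int seen[10] = {0}, count = 0;
    do {
        int d = n % 10;
        if (!seen[d]) {
            seen[d] = 1;
            if (++count == 10) break;   /* can't do better than 10 */
        }
        n /= 10;
    } while (n > 0);
    return count;
}
/* distinct_digits(1519) == 3 */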
After seeing that someone really posted a serious answer to this question, I'd rather repeat my own cheat here, which is a special case of the answer described by @SimonNickerson:
O(1) is not possible unless you are in radix 2, because that way every number other than 0 has both a 1 and a 0, and thus my "solution" works not only for integers...
EDIT
How about 2^k - 1? Isn't that all 1s?
Drat! True... I should have known that when something seems so easy, it is flawed somehow... If I got the all 0 case covered, I should have covered the all 1 case too.
Luckily this case can be tested quite quickly (if addition and bitwise AND are considered O(1) operations): if x is the number to be tested, compute y = (x+1) AND x. If y = 0, then x = 2^k - 1, because that is the only case in which the addition flips every bit. Of course, this is still somewhat flawed: with bit lengths exceeding the bus width, the bitwise operators are no longer O(1) but O(N).
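For what it's worth, that test is a one-liner in C (a sketch; the x != 0 guard excludes 0, which is 2^k - 1 only if you allow k = 0):

#include <stdbool.h>
#include <stdint.h>

bool is_all_ones(uint64_t x) {
    /* Adding 1 to all-ones carries through every bit, leaving no
       bit in common with the original value. */
    return x != 0 && ((x + 1) & x) == 0;
}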
At the same time, I think it can be brought down to O(logN), by breaking the number into bus width size chunks, and AND-ing together the neighboring ones, repeating until only one is left: if there were no 0s in the number tested, the last one will be full 1s too...
EDIT2: I was wrong... This is still O(N).

Algorithm to find an arbitrarily large number

Here's something I've been thinking about: suppose you have a number, x, that can be infinitely large, and you have to find out what it is. All you know is if another number, y, is larger or smaller than x. What would be the fastest/best way to find x?
An evil adversary chooses a really large number somehow ... say:
int x = 9^9^9^9^9^9^9^9^9^9^9^9^9^9^9
and provides isX, isBiggerThanX, and isSmallerThanX functions. Example code might look something like this:
int c = 2
int y = 2
while (true)
    if isX(y) return true
    if (isBiggerThanX(y)) fn()
    else y = y^c
where fn() is a function that, once a number y has been found (that's bigger than x) does something to determine x (like divide the number in half and compare that, then repeat). The thing is, since x is arbitrarily large, it seems like a bad idea to me to use a constant to increase y.
This is just something that I've been wondering about for a while now, I'd like to hear what other people think
Use a binary search as in the usual "try to guess my number" game. But since there is no finite upper end point, we do a first phase to find a suitable one:
Initially set the upper end point arbitrarily (e.g. 1000000, though 1 or 10^100 would also work -- given the infinite space to work in, all finite values are equally disproportionate).
Compare the mystery number X with the upper end point.
If it's not big enough, double it, and try again.
Once the upper end point is bigger than the mystery number, proceed with a normal binary search.
The first phase is itself similar to a binary search. The difference is that instead of halving the search space with each step, it's doubling it! The cost of each phase is O(log X). A small improvement would be to raise the lower end point at each doubling step: we know X is at least as large as the previous upper end point, so we can reuse that as the lower end point. The size of the search space still doubles at each step, but in the end it will be half as large as it would otherwise have been. The cost of the binary search is reduced by only one step, so its overall complexity remains the same.
Some notes
A couple of notes in response to other comments:
It's an interesting question, and computer science is not just about what can be done on physical machines. As long as the question can be defined properly, it's worth asking and thinking about.
The range of numbers is infinite, but any possible mystery number is finite. So the above method will eventually find it. "Eventually" here means that, for any possible finite input, the algorithm terminates within a finite number of steps. However, since the input is unbounded, the number of steps is also unbounded (it is just that, in every particular case, it will "eventually" terminate).
If I understand your question correctly (advise if I do not), you're asking about how to solve "pick a number from 1 to 10", except that instead of 10, the upper bound is infinity.
If your number space is truly infinite, the following are true:
The value will never be held in an int (or any other data type) on any physical hardware
You will NEVER find your number
If the space is immensely large but bound, I think the best you can do is a binary search. Start at the middle of the number range. If the desired number turns out to be higher or lower, divide that half of the number space, and repeat until the desired number is found.
In your suggested implementation you compute y^c. However, no matter how large c is chosen to be, it will not even move the needle in infinite space.
Infinity isn't a number. Thus you can't find it, even with a computer.
That's funny. I've wondered the same thing for years, though I've never heard anyone else ask the question.
As simple as your scenario is, it still seems to provide insufficient information to allow the choice of an optimal strategy. All one can choose is a suitable heuristic. My heuristic had been to double y, but I think that I like yours better. Yours doubles log(y).
The beauty of your heuristic is that, so long as the integer fits in the computer's memory, it finds a suitable y in logarithmic time.
Counter-question. Once you find y, how do you proceed?
I agree with using binary search, though I believe that a ONE-SIDED binary search would be more suitable, since here the complexity would NOT be O( log n ) [ Where n is the range of allowable numbers ], but O( log k ) - where k is the number selected by your adversary.
This would work as follows : ( Pseudocode )
k = 1;
while( isSmallerThanX( k ) )
{
    k = k*2;
}
// At this point, once the loop is exited, k is bigger than (or equal to) x
// Now do a normal binary search on the range [ k/2, k ] to find your number :)
So even if the allowable range is infinity, as long as your number is finite, you should be able to find it :)
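A self-contained C version of this one-sided search, with cmp_x standing in for the adversary's isSmallerThanX/isBiggerThanX oracle and SECRET_X a placeholder so the sketch actually runs:

#include <stdint.h>
#include <stdio.h>

/* Stand-in oracle: real code would only see the comparison result. */
static const uint64_t SECRET_X = 123456789u;
static int cmp_x(uint64_t y) {        /* <0: y < x, 0: y == x, >0: y > x */
    return (y < SECRET_X) ? -1 : (y > SECRET_X) ? 1 : 0;
}

uint64_t find_x(void) {
    uint64_t k = 1;
    while (cmp_x(k) < 0)              /* doubling phase: O(log x) probes */
        k *= 2;                       /* (overflow ignored in this sketch) */
    uint64_t lo = k / 2, hi = k;      /* x now lies in [k/2, k] */
    while (lo < hi) {                 /* ordinary binary search */
        uint64_t mid = lo + (hi - lo) / 2;
        if (cmp_x(mid) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

int main(void) {
    printf("found: %llu\n", (unsigned long long)find_x());
    return 0;
}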
Your method of tetration is guaranteed to take longer than the age of the universe to find an answer, if the opponent merely uses a paradigm which is better (for example, pentation). This is how you should do it:
You can only do this with symbolic representations of numbers, because it is trivial to name a number your computer cannot store in floating-point representation, even if it used arbitrary-precision arithmetic and all its memory.
Required reading: http://www.scottaaronson.com/writings/bignumbers.html - that pretty much sums it up
How do you represent a number then? You represent it by a program which will, if run to completion, print out that number. Even then, your computer is incapable of computing BusyBeaver(10^100): if you dictated a program one terabyte in size, this is well over the maximum number of finite clock cycles it could run without looping forever. You can see that we could easily have the computer print out 1, then 0, 0, ... each clock cycle, so the maximum number it could name (if we waited nearly an eternity) would be about 10^BusyBeaver(10^100). If you allowed it to say more complicated expressions like eval(someprogram), power towers, Ackermann's function, whatever -- then I believe that would be no better than increasing the original 10^100 by some constant proportional to the complexity of what you described (plus some logarithmic interpreter factor, see Kolmogorov complexity).
So let's frame this another way:
Your opponent picks a finite computable number, and gives you a function that tells you if the number is smaller/larger/equal by computing it. He also gives you a representation for the output (in a sane world this would be "you can only print numbers like 99999", but he can make it more complicated; it actually doesn't matter). Proceed to measure the size of this function in bits.
Now, answer with your own function, which is twice the size of his function (in bits), and prints out the largest number it can while keeping the code to less than 2N bits in length. (You use the same representation he chose: In a world where you can only print out numbers like "99999", that's what you do. If you can define functions, it gets slightly more complicated.)
I do not understand the purpose here, but this is what I thought of:
Reading your comments, I suppose you aren't looking for an infinitely large number, but for a "super large number" instead. And whatever the number is, it will have a large number of digits. How you got them isn't the concern. Keeping this in mind:
No complex computation is required. Just type random keys on your numeric keypad to get a super large number, then have a program randomly add/remove/modify digits of that number. You get a list of very large numbers; select any one of them.
e.g: 3672036025039629036790672927305060260103610831569252706723680972067397267209
and keep modifying/adding digits to get more numbers
PS: If you state the purpose in your question clearly, we might be able to give better answers.

Algorithm for ProjectEuler Problem 99

From http://projecteuler.net/index.php?section=problems&id=99
Comparing two numbers written in index form like 2^11 and 3^7 is not difficult, as any calculator would confirm that 2^11 = 2048 < 3^7 = 2187.
However, confirming that 632382^518061 > 519432^525806 would be much more difficult, as both numbers contain over three million digits.
Using base_exp.txt (right click and 'Save Link/Target As...'), a 22K text file containing one thousand lines with a base/exponent pair on each line, determine which line number has the greatest numerical value.
How might I approach this?
Not a full solution, but some ideas. You can use the following formula:
log(a^x) = x * log(a)
The log10 of a number can easily be estimated from its number of digits. The log2 can easily be estimated by counting right shifts.
Based on the above you could narrow the list significantly. For the remaining numbers you would have to do full calculations. Are math functions allowed in Project Euler? If yes, it would be better to use logarithms.
Since the logarithm is a monotonic function, instead of a^x you can compare x * log a to find the maximum. You might need to take numerical precision into account, though.
One possible approach would be to use logarithm identities (that is, a^b is identical to e^(b * ln a)). As a matter of fact, a^b equals base^(b * log_base a) for any base other than 0 and 1.
Comparing exponent * log(base) instead of base^exponent for each line in the file works for this problem without taking precision into account. This is certainly the best solution from a mathematical perspective, but I just wanted to point out that this isn't really necessary.
Another possible solution is to divide every exponent in the file by some constant (say 100,000) before performing the exponentiation, so that the powers become small enough to compute directly. Since x -> x^(1/k) is monotonic, scaling all of the exponents down by the same factor doesn't change which value is the largest.
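For concreteness, a sketch of the logarithm comparison in C, assuming each line of base_exp.txt is a "base,exponent" pair as in the problem's download:

#include <stdio.h>
#include <math.h>

int main(void) {
    FILE *f = fopen("base_exp.txt", "r");
    if (!f) return 1;
    double base, exponent, best = -1.0;
    int line = 0, best_line = 0;
    while (fscanf(f, "%lf,%lf", &base, &exponent) == 2) {
        line++;
        double v = exponent * log(base);  /* compare x*log(a), not a^x */
        if (v > best) { best = v; best_line = line; }
    }
    fclose(f);
    printf("line %d has the greatest value\n", best_line);
    return 0;
}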

Best algorithm for avoiding loss of precision?

A recent homework assignment I have received asks us to take expressions which could create a loss of precision when performed in the computer, and alter them so that this loss is avoided.
Unfortunately, the directions for doing this haven't been made very clear. From watching various examples being performed, I know that there are certain methods of doing this: using Taylor series, using conjugates if square roots are involved, or finding a common denominator when two fractions are being subtracted.
However, I'm having some trouble noticing exactly when loss of precision is going to occur. So far the only thing I know for certain is that when you subtract two numbers that are close to being the same, loss of precision occurs: the high-order digits cancel, leaving only the low-order digits, which are contaminated by round-off.
My question is what are some other common situations I should be looking for, and what are considered 'good' methods of approaching them?
For example, here is one problem:
f(x) = tan(x) − sin(x) when x ~ 0
What is the best and worst algorithm for evaluating this out of these three choices:
(a) (1/ cos(x) − 1) sin(x),
(b) (x^3)/2
(c) tan(x)*(sin(x)^2)/(cos(x) + 1).
I understand that when x is close to zero, tan(x) and sin(x) are nearly the same. I don't understand how or why any of these algorithms are better or worse for solving the problem.
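A quick numerical experiment makes the difference visible (a sketch; x = 1e-5 is an arbitrary small test point, and the true value is approximately x^3/2 = 5.0000000000625e-16):

#include <stdio.h>
#include <math.h>

int main(void) {
    double x = 1e-5;
    double naive = tan(x) - sin(x);                       /* cancellation */
    double a = (1.0 / cos(x) - 1.0) * sin(x);             /* hidden cancellation */
    double b = x * x * x / 2.0;                           /* truncated series */
    double c = tan(x) * sin(x) * sin(x) / (cos(x) + 1.0); /* no cancellation */
    printf("naive: %.17e\n", naive);
    printf("(a):   %.17e\n", a);
    printf("(b):   %.17e\n", b);
    printf("(c):   %.17e\n", c);
    return 0;
}

Form (c) is algebraically identical to tan(x) - sin(x) (multiply by (1 + cos x)/(1 + cos x) and use 1 - cos^2 = sin^2) but involves no subtraction of nearly equal quantities, so it keeps nearly full precision where the naive form and (a) lose many digits.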
Another rule of thumb usually used is this: When adding a long series of numbers, start adding from numbers closest to zero and end with the biggest numbers.
Explaining why this is good is a bit tricky. When you add small numbers to a large number, there is a chance they will be completely discarded, because they are smaller than the lowest digit in the current mantissa of the large number. Take for instance this situation:
a = 1,000,000;
do 100,000,000 times:
a += 0.01;
if 0.01 is smaller than the lowest mantissa digit of a, then the loop does nothing and the end result is a == 1,000,000
but if you do this like this:
a = 0;
do 100,000,000 times:
a += 0.01;
a += 1,000,000;
Then the small numbers slowly accumulate, and you're more likely to end up with something close to a == 2,000,000, which is the right answer.
This is of course an extreme example, but I hope you get the idea.
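The same effect in runnable C, scaled to magnitudes where double's 53-bit mantissa saturates (1e16 + 1.0 rounds back to 1e16), since 0.01 versus 1,000,000 only shows the problem in single precision:

#include <stdio.h>

int main(void) {
    /* 1e16 + 1.0 rounds back to 1e16: the 1 falls below the last
       mantissa bit, so adding it ten million times changes nothing. */
    double big_first = 1e16;
    for (int i = 0; i < 10000000; i++)
        big_first += 1.0;

    /* Summing the small values first lets them accumulate exactly. */
    double small_first = 0.0;
    for (int i = 0; i < 10000000; i++)
        small_first += 1.0;
    small_first += 1e16;

    printf("big first:   %.1f\n", big_first);    /* 10000000000000000.0 */
    printf("small first: %.1f\n", small_first);  /* 10000000010000000.0 */
    return 0;
}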
I had to take a numerics class back when I was an undergrad, and it was thoroughly painful. Anyhow, IEEE 754 is the floating point standard typically implemented by modern CPUs. It's useful to understand the basics of it, as this gives you a lot of intuition about what not to do. The simplified explanation of it is that computers store floating point numbers in something like base-2 scientific notation with a fixed number of digits (bits) for the exponent and for the mantissa. This means that the larger the absolute value of a number, the less precisely it can be represented. For 32-bit floats in IEEE 754, half of the possible bit patterns represent between -1 and 1, even though numbers up to about 10^38 are representable with a 32-bit float. For values larger than 2^24 (approximately 16.7 million) a 32-bit float cannot represent all integers exactly.
What this means for you is that you generally want to avoid the following:
Having intermediate values be large when the final answer is expected to be small.
Adding/subtracting small numbers to/from large numbers. For example, if you wrote something like:
for(float index = 17000000; index < 17000001; index++) {}
This loop would never terminate because 17,000,000 + 1 is rounded back down to 17,000,000 (in a 32-bit float, the next representable value above 17,000,000 is 17,000,002).
If you had something like:
float foo = 10000000.0f - 10000000.0001f;
The value of foo would be 0, not -0.0001, due to rounding error (the f suffixes force single precision; with double literals the subtraction would actually survive).
"My question is what are some other common situations I should be looking for, and what are considered 'good' methods of approaching them?"
There are several ways you can have severe or even catastrophic loss of precision.
The most important reason is that floating-point numbers have a limited number of digits; e.g. doubles have 53 bits. That means if you have "useless" digits which are not part of the solution but must be stored, you lose precision.
For example (We are using decimal types for demonstration):
2.598765000000000000000000000100 -
2.598765000000000000000000000099
The interesting part is the 100 - 99 = 1 answer. As 2.598765 is equal in both cases, it does not change the result, but it wastes 8 digits. Much worse, because the computer doesn't know that these digits are useless, it is forced to store them, cramming 21 zeroes after them and wasting 29 digits in all. Unfortunately there is no way to circumvent this for differences, but there are other cases, e.g. exp(x) - 1, a function which occurs very often in physics.
The exp function near 0 is almost linear, but it forces a 1 as the leading digit. So with 12 significant digits:
exp(0.001) - 1 = 1.00100050017 - 1 = 1.00050017e-3
If we instead use a function expm1() built from the Taylor series:
exp(x) - 1 = x + x^2/2 + x^3/6 + ... =: expm1(x)
we get
expm1(0.001) = 1.00050016667e-3
Much better.
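C99's math library in fact provides expm1() (and its sibling log1p()) for exactly this reason; a quick check:

#include <stdio.h>
#include <math.h>

int main(void) {
    double x = 0.001;
    /* The subtraction throws away the leading digits that exp() spent
       its precision on; expm1() computes exp(x)-1 directly.  At this x
       the two agree to only ~13 of 16 digits, and the smaller x gets,
       the worse the explicit subtraction becomes. */
    printf("exp(x)-1 : %.17e\n", exp(x) - 1.0);
    printf("expm1(x) : %.17e\n", expm1(x));
    return 0;
}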
The second problem is functions with a very steep slope, like the tangent of x near pi/2. tan(11) has a slope of about 50,000, which means that any small deviation caused by earlier rounding errors will be amplified by a factor of 50,000! Or you have singularities, e.g. if the result approaches 0/0, which can take any value.
In both cases you create a substitute function, simplifying the original function. It is of no use to highlight the different solution approaches, because without training you will simply not "see" the problem in the first place.
A very good book to learn from and train with: Forman S. Acton, Real Computing Made Real.
Another thing to avoid is subtracting numbers that are nearly equal, as this can also lead to increased sensitivity to roundoff error. For values near 0, cos(x) will be close to 1, so 1/cos(x) - 1 is one of those subtractions that you'd like to avoid if possible, so I would say that (a) should be avoided.
