Why does Math.Log crash only inside my for loop? - ruby

I have the below code
A = 1.0
B = 0.20
N = 8.0
for i in 1..Total
t = Maxt * rand
x = A * Math.cos(t) / (Math.log(B*Math.tan(t/(2*N))))
y = A * Math.sin(t) / (Math.log(B*Math.tan(t/(2*N))))
end
If I comment out the for loop, it executes fine and produces one of the results I want. If I leave the for loop in, it generates the error below. I am a newbie with Ruby and am mainly curious why it only breaks when the for loop is present.
rubyfile.rb:22:in `log': Numerical argument out of domain - log (Errno::EDOM)
from rubyfile.rb:22
from rubyfile.rb:20:in `each'
from rubyfile.rb:20

Math.log represents the logarithm function, which is undefined for negative numbers. Math.tan, however, represents the tangent function, which can return negative numbers. So, if Math.tan comes out to a negative number, the Math.log will tell you that its argument is "out of domain", meaning that there is no logarithm for that number.
I'm betting the fact that your input is random means that, when you loop, you are far more likely to get that error than if you just run the script once. If you were to remove the loop and then run the script multiple times, I bet you'd get that error eventually.
Find out why your math involves negative numbers when it shouldn't, and you're good to go :)

B*Math.tan(t/(2*N)) will take negative values, and log is undefined for x < 0. As the error states, you're out of domain.
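To make the domain problem concrete, here is a minimal sketch in Python rather than Ruby. The constants A, B and N are taken from the question; the function name curve_point and the decision to return None for unusable samples are assumptions made for illustration, not part of the original script. Python's math.log raises ValueError for a non-positive argument, which plays the same role as Ruby's EDOM error here.

```python
import math

# Constants copied from the question.
A, B, N = 1.0, 0.20, 8.0

def curve_point(t):
    """Return (x, y) for parameter t, or None when the log argument is unusable.

    Hypothetical helper: the original script has no such function.
    """
    arg = B * math.tan(t / (2 * N))
    # log is only defined for arg > 0, and log(1) == 0 would divide by zero.
    if arg <= 0.0 or arg == 1.0:
        return None
    denom = math.log(arg)
    return A * math.cos(t) / denom, A * math.sin(t) / denom
```

With N = 8 the tangent turns negative once t / 16 passes pi / 2, so a sample such as t = 30.0 is rejected while t = 1.0 is fine. A loop over many random t values will hit the bad range sooner or later, which is why a single run often appears to work.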

Related

Reliable multiplication and modulo with numbers larger than maxint

Situation
After working on a coding kata I finally got the algorithm to work on my small test cases.
However, it did not work on a large scale: time is not an issue, but the size of the numbers is.
In one of my calculations in one of the test cases I need to perform the following calculation.
var numberOfColumns = 34359738368;
var numberOfRows = 28827050410;
var valueOverflow = 13719506;
var totalOfSingleRow = (numberOfColumns * (numberOfColumns - 1))/2;
var totalGridValue = totalOfSingleRow * numberOfRows;
var result = (totalOfSingleRow * totalGridValue) % valueOverflow;
Because the top row is sequential, I can calculate the sum of the first row by doing (numberOfColumns * (numberOfColumns - 1))/2;.
Then I need to multiply that answer by the number of rows and apply the modulo to get my resulting value.
The problem
The problem is that JavaScript can only calculate reliably with integers up to 9007199254740991 (Number.MAX_SAFE_INTEGER).
However, the calculation above results in a totalGridValue of 17016487081526963049249353236480.
You can imagine that my calculation does not result in the desired value of 10552574, because the value gets rounded to 1.7016487081526963e+31.
This results in the wrong value of 8479672.
Question
How can I alter my calculation so the result becomes the desired 10552574?
I've tried applying the modulo operator sooner on numberOfColumns, without the desired result.
I've also looked at adding two large values as strings, but this process would become too slow, as I have to add two strings too many times.
Note
Because I need to submit this on codewars, I cannot use any external libraries!
I can use other languages though, but I know it is possible in javascript.
I think you were on the right track with moving the modulus operation further up the chain, but I wouldn't use the modulo operator for that. Instead, use regular floating point division, and carry that value through to the last step, when all other calculations are done, then convert that float's decimal part into an integer. Basically, with pure division, you're just doing a transformation on the value, rather than changing the value. Once you go to the modulus, you've changed the value. (I'd also change the totalSingleRow formula to divide the big values then multiply the results, rather than multiply then divide.)
I agree that moving the modulus operation further up the chain is the correct basic idea. The one subtlety is handling the calculation of totalOfSingleRow, because it involves a division by two. I think what you need to do is perform that stage modulo (2 * valueOverflow), and only compute the result modulo valueOverflow at the end. Perhaps as follows:
var doubleOverflow = 2 * valueOverflow;
var totalOfSingleRow = (((numberOfColumns % doubleOverflow) * ((numberOfColumns + doubleOverflow - 1) % doubleOverflow)) / 2) % valueOverflow;
var totalGridValue = (totalOfSingleRow * (numberOfRows % valueOverflow)) % valueOverflow;
var result = (totalOfSingleRow * totalGridValue) % valueOverflow;
You'll need to check that (4 * valueOverflow*valueOverflow) is less than the maximum integer value that can be represented correctly.
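The idea of reducing modulo 2 * valueOverflow before halving can be sketched as follows. This is Python, not JavaScript, and it follows the prose description (sum of one row, times the number of rows, modulo valueOverflow); the function name grid_total_mod and the explicit bound check are assumptions for illustration. Python integers never overflow, so the check only mirrors the limit a JavaScript port would need to respect.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a JavaScript Number holds exactly

def grid_total_mod(columns, rows, modulus):
    """Compute (columns * (columns - 1) / 2 * rows) % modulus with small intermediates.

    Hypothetical helper; every product stays below 4 * modulus**2.
    """
    double_mod = 2 * modulus
    if 4 * modulus * modulus > MAX_SAFE_INTEGER:
        raise ValueError("modulus too large for exact double arithmetic")
    # Reduce both factors modulo 2 * modulus so the product keeps its parity
    # and halving it is still exact.
    a = columns % double_mod
    b = (columns - 1) % double_mod
    row_sum = (a * b // 2) % modulus
    return (row_sum * (rows % modulus)) % modulus
```

Because a * b is congruent to columns * (columns - 1) modulo an even number, it is even as well, and halving it gives the row sum modulo modulus. The result can be compared against Python's exact big-integer arithmetic to confirm the reduction loses nothing.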

Complex Numbers Seemingly Arising from Non-Complex Logarithms

I have a simple program written in TI-BASIC that converts from base 10 to base 2
0->B
1->E
Input "DEC:",D
Repeat D=0
int(round(log(D)/log(2),1))->E
round(E)->E
B+10^E->B
D-2^E->D
End
Disp B
This will sometimes return the error 'ERR: DATA TYPE'. I checked, and this is because the variable D will sometimes become a complex number. I am not sure how this happens.
This happens with seemingly random numbers, like 5891570. It happens with this number, but not with something close to it like 5891590, which is strange. It also happens with 1e30, but not 1e25. Another example is 1111111111111111, but not 1111111111111120.
I haven't tested this thoroughly, and don't see any pattern in these numbers. Any help would be appreciated.
The error happens because you round the logarithm to one decimal place before taking the integer part; therefore, if log(D)/log(2) is something like 8.99, you will round E up rather than down, and 2^9 will be subtracted from D instead of 2^8, causing, in the next iteration, D to become negative and its logarithm to be complex. Let's walk through your code when D is 511, which has base-2 logarithm 8.9971:
Repeat D=0 ;Executes first iteration without checking whether D=0
log(D)/log(2 ;8.9971
round(Ans,1 ;9.0
int(Ans ;9.0
round(Ans)->E ;E = 9.0
B+10^E->B ;B = 1 000 000 000
D-2^E->D ;D = 511-512 = -1
End ;loops again, since D≠0
---next iteration:----
log(D ;log(-1) = 1.364i; throws ERR:NONREAL ANS in Real mode
Rounding the logarithm any more severely than nine decimal places (nine digits is the default for round( without a "digits" argument) is completely unnecessary, as on my TI-84+ rounding errors do not accumulate: round(int(log(2^X-1)/log(2)) returns X-1 and round(int(log(2^X)/log(2)) returns X for all integer X≤28, which is high enough that precision would be lost anyway in other parts of the calculation.
To fix your code, simply round only once, and only to nine places. I've also removed the unnecessary double-initialization of E, removed your close-parens (it's still legal code!), and changed the Repeat (which always executes one loop before checking the condition D=0) to a While loop to prevent ERR:DOMAIN when the input is 0.
0->B
Input "DEC:",D
While D
int(round(log(D)/log(2->E
B+10^E->B
D-2^E->D
End
B ;on the last line, so it prints implicitly
Don't expect either your code or my fix to work correctly for D > 2^13 or so, because your calculator can only store 14 digits in its internal representation of any number. You'll lose the digits while you store the result into B!
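For readers without a calculator at hand, the rounding problem and the fix can be reproduced with a rough Python translation. This is a sketch only: the function names are made up for illustration, Python floats are not the calculator's decimal floats, and Python integers do not share the 14-digit limit, so it demonstrates the logic rather than the calculator's exact behaviour.

```python
import math

def first_exponent_buggy(d):
    """Exponent chosen by the original code: round to ONE decimal, then truncate."""
    return int(round(math.log10(d) / math.log10(2), 1))

def to_binary(d):
    """Fixed loop: round to nine decimals only, then truncate.

    Returns an integer whose decimal digits are the binary digits of d.
    """
    b = 0
    while d:
        e = int(round(math.log10(d) / math.log10(2), 9))
        b += 10 ** e   # set the digit for 2**e
        d -= 2 ** e    # remove that power of two from d
    return b
```

For D = 511 the buggy exponent is 9, so the next step would compute 511 - 512 = -1 and then take the logarithm of a negative number. The fixed loop picks 8 instead and returns 111111111.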
Now for a trickier, optimized way of computing the binary representation (still only works for D < 2^13):
Input D
int(2fPart(D/2^cumSum(binomcdf(13,0
.1sum(Ans10^(cumSum(1 or Ans

Scope of variables and the digits function

My question is twofold:
1) As far as I understand, constructs like for loops introduce scope blocks; however, I'm having some trouble with a variable that is defined outside of said construct. The following code depicts an attempt to extract digits from a number and place them in an array.
n = 654068
l = length(n)
a = Int64[]
for i in 1:(l-1)
temp = n/10^(l-i)
if temp < 1 # ith digit is 0
a = push!(a,0)
else # ith digit is != 0
push!(a,floor(temp))
# update n
n = n - a[i]*10^(l-i)
end
end
# last digit
push!(a,n)
The code executes fine, but when I look at the a array I get this result
julia> a
0-element Array{Int64,1}
I thought that anything that goes on inside the for loop is invisible to the outside, unless I'm operating on variables defined outside the for loop. Moreover, I thought that by using the ! syntax I would operate directly on a, this does not seem to be the case. Would be grateful if anyone can explain to me how this works :)
2) The second question is about the syntax used when explaining functions. There is apparently a function called digits that extracts digits from a number and puts them in an array. Using the help function I get
julia> help(digits)
Base.digits(n[, base][, pad])
Returns an array of the digits of "n" in the given base,
optionally padded with zeros to a specified size. More significant
digits are at higher indexes, such that "n ==
sum([digits[k]*base^(k-1) for k=1:length(digits)])".
Can anyone explain to me how to interpret the information given about functions in Julia? How am I to interpret digits(n[, base][, pad])? How does one correctly call the digits function? It can't be like this: digits(40125[, 10])?
I'm unable to reproduce your result; running your code gives me
julia> a
1-element Array{Int64,1}:
654068
There are a few mistakes and inefficiencies in the code:
length(n) doesn't give the number of digits in n, but always returns 1 (currently, numbers are iterable and return a sequence that contains only one number: itself). So the for loop never runs.
/ between integers does floating point division. For extracting digits, you're better off with div(x,y), which does integer division.
There's no reason to write a = push!(a,x), since push! modifies a in place. It is equivalent to writing push!(a,x); a = a.
There's no reason to handle digits that are zero specially; they are handled just fine by the general case.
Your description of scoping in Julia seems to be correct, I think that it is the above which is giving you trouble.
You could use something like
n = 654068
a = Int64[]
while n != 0
push!(a, n % 10)
n = div(n, 10)
end
reverse!(a)
This loop extracts the digits in opposite order to avoid having to figure out the number of digits in advance, and uses the modulus operator % to extract the least significant digit. It then uses reverse! to get them in the order you wanted, which should be pretty efficient.
About the documentation for digits, [, base] just means that base is an optional parameter. The description should probably be digits(n[, base[, pad]]), since it's not possible to specify pad unless you specify base. Also note that digits will return the least significant digit first, what we get if we remove the reverse! from the code above.
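The same remainder-and-divide loop can be sketched in Python for comparison. The function name digits_of and its optional base argument are assumptions for illustration and are not Julia's digits API; unlike the Julia loop above, this version returns [0] for an input of 0 and expects a non-negative integer.

```python
def digits_of(n, base=10):
    """Return the digits of a non-negative integer, most significant first."""
    if n == 0:
        return [0]
    out = []
    while n:
        n, r = divmod(n, base)  # integer quotient and least significant digit
        out.append(r)
    out.reverse()               # digits were collected least significant first
    return out
```

Calling digits_of(654068) gives [6, 5, 4, 0, 6, 8]. Dropping the reverse() call would give the least-significant-first order described for digits above.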
Is this cheating?:
n = 654068
nstr = string(n)
a = map((x) -> x |> string |> int , collect(nstr))
outputs:
6-element Array{Int64,1}:
6
5
4
0
6
8

Cubic root of large number

I'm trying to identify the cubic root of a large number. I found a solution which works for smaller numbers, but not in this case:
require 'openssl'
q = OpenSSL::BN::generate_prime(2048)
ti = q.to_i #=> 3202718747...
ti3 = ti ** 3 #=> 328515909...
m = ti3 ** (1/3.0) #=> Infinity
I was hoping to see m = the original output of ti. Yes, this is a part of a Matasano challenge. I've put a lot of effort into not seeking help thus far, but I've reached a point where it's just a "how do I do something otherwise simple, in Ruby". Any assistance appreciated.
In ruby operations on integers automatically get promoted to bignums (arbitrary precision integers), so you never get an overflow.
The same is not true of floating point operations: you end up with infinity because raising to the power 1/3 is a floating point operation and the first thing it does is try to convert your number to a float. The biggest number a float in ruby can represent is about 10^308 whereas your number is probably around the 10^1800 mark, so it bails out and returns Infinity
Ruby has a BigDecimal class for this. You might therefore be tempted to do
BigDecimal.new(ti3) ** (1/3.0)
This gives a wildly wrong answer for me, I suspect because (1/3.0) is a float and so only approximately 1/3. On the other hand,
BigDecimal.new(ti3) ** Rational(1,3)
produces the correct result for me (with negligible error). Rational is Ruby's class for representing fractions in an exact manner. In ruby 2.1 you can shorten this to
BigDecimal.new(ti3) ** (1r/3)
The docs do say that only integer exponents are supported but this seems to be a hangover from the ruby 1.8 days
The following code was put forward based on the two pieces of advice given.
def nthroot(n, a, precision = 1e-1024)
x = a
begin
prev = x
x = ((n - 1) * prev + a / (prev ** (n - 1))) / n
end while (prev - x).abs > precision
x
end
It was based on an implementation of Newton's method which dealt with floats, but also just returned infinity. This version deals with integers only, but works for large integers.
Of course, nthroot may be called with n = 3 to get a cube root.
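For comparison, an integer-only Newton iteration can be sketched in Python, whose integers are also arbitrary precision. This is not the code above: the name integer_nth_root, the power-of-two starting guess and the stopping rule (stop as soon as the estimate no longer decreases) are assumptions for illustration. It returns the floor of the root, so it also terminates for inputs that are not perfect powers.

```python
def integer_nth_root(a, n):
    """Return floor(a ** (1 / n)) for integers a >= 0 and n >= 1, using only integers."""
    if a < 0 or n < 1:
        raise ValueError("expected a >= 0 and n >= 1")
    if a < 2:
        return a
    # Start from a power of two that is at least as large as the true root.
    x = 1 << -(-a.bit_length() // n)
    while True:
        # Newton step for f(x) = x**n - a, done with floor division.
        y = ((n - 1) * x + a // x ** (n - 1)) // n
        if y >= x:      # estimates decrease until the floor root is reached
            return x
        x = y
```

For example, integer_nth_root(27, 3) returns 3 and integer_nth_root(26, 3) returns 2. Because no float is ever created, the size of the input is limited only by memory.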
I don't know what the Matasano challenge is, but what comes to mind is Newton's Method
The wikipedia page on Cube Roots also suggests using Newton's Method

a faster way of implementing the nested loop with gamma function

I am trying to evaluate the following integral:
I can find the area for the following polynomial as follows:
pn =
-0.0250 0.0667 0.2500 -0.6000 0
First using the integration by Simpson's rule
fn=@(x) exp(polyval(pn,x));
area=quad(fn,-10,10);
fprintf('area evaluated by Simpsons rule : %f \n',area)
and the result is area evaluated by Simpsons rule : 11.483072
Then with the following code that evaluates the summation in the above formula with gamma function
a=pn(1);b=pn(2);c=pn(3);d=pn(4);f=pn(5);
area=0;
result=0;
for n=0:40;
for m=0:40;
for p=0:40;
if(rem(n+p,2)==0)
result=result+ (b^n * c^m * d^p) / ( factorial(n)*factorial(m)*factorial(p) ) *...
gamma( (3*n+2*m+p+1)/4 ) / (-a)^( (3*n+2*m+p+1)/4 );
end
end
end
end
result=result*1/2*exp(f)
and this returns 11.4831. More or less the same result with the quad function. Now my question is whether or not it is possible for me to get rid of this nested loop as I will construct the cumulative distribution function so that I can get samples from this distribution using the inverse CDF transform. (for constructing the cdf I will use gammainc i.e. the incomplete gamma function instead of gamma)
I will need to sample from such densities that may have different polynomial coefficients and speed is of concern to me. I can already sample from such densities using Monte Carlo methods but I would like to see whether or not it is possible for me to use exact sampling from the density in order to speed up.
Thank you very much in advance.
There are several things one might do. The simplest is to avoid calling factorial. Instead one can use the relation that
factorial(n) = gamma(n+1)
Since gamma seems to be actually faster than a call to factorial, you can save a bit there. Here are some timings, with gammaln included for comparison:
>> timeit(@() factorial(40))
ans =
4.28681157826087e-05
>> timeit(@() gamma(41))
ans =
2.06671024634146e-05
>> timeit(@() gammaln(41))
ans =
2.17632543333333e-05
Even better, one can do all 4 calls in a single call to gammaln. For example, think about what this does:
gammaln([(3*n+2*m+p+1)/4,n+1,m+1,p+1])*[1 -1 -1 -1]'
Note that this call has no problem with overflows either, in case your numbers get large enough. And since gammaln is vectorized, that one call is fast. It costs little more time to compute 4 values than it does to compute one.
>> timeit(@() gammaln([15 20 40 30]))
ans =
2.73937416896552e-05
>> timeit(@() gammaln(40))
ans =
ans =
2.46521943333333e-05
Admittedly, if you use gammaln, you will need a call to exp at the end to recover the final result. You could do it with a single call to gamma however too. Perhaps like this:
g = gamma([(3*n+2*m+p+1)/4,n+1,m+1,p+1]);
g = g(1)/(g(2)*g(3)*g(4));
Next, you can be more creative in the inner loop on p. Rather than a full loop, coupled with a test to ignore the combinations you don't need, why not just do this?
for p=mod(n,2):2:40
That statement will select only those values of p that would have been used anyway, so now you can drop the if statement completely.
All of the above will give you what I'll guess is about a 5x speed increase in your loops. But it still has a set of nested loops. With some effort, you might be able to improve that too.
For example, rather than computing all of those factorials (or gamma functions) many times, do it ONCE. This should work:
a=pn(1);b=pn(2);c=pn(3);d=pn(4);f=pn(5);
area=0;
result=0;
nlim = 40;
facts = factorial(0:nlim);
gammas = gamma((0:(6*nlim+1))/4);
for n=0:nlim
for m=0:nlim
for p=mod(n,2):2:nlim
result = result + (b.^n * c.^m * d.^p) ...
.*gammas(3*n+2*m+p+1 + 1) ...
./ (facts(n+1).*facts(m+1).*facts(p+1)) ...
./ (-a)^( (3*n+2*m+p+1)/4 );
end
end
end
result=result*1/2*exp(f)
In my test on my machine, I find that your triply nested loops required 4.3 seconds to run. My version above produces the same result, yet required only 0.028418 seconds, a speedup of roughly 150 to 1, despite the triply nested loops.
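The precompute-once idea is not specific to MATLAB. Here is a rough Python transcription of the loop above, offered as a sketch only: the function name series_area is hypothetical, and the only deliberate difference is that the table entry for gamma(0) is left unused, because Python's math.gamma raises an error at the pole where MATLAB returns Inf.

```python
import math

def series_area(pn, nlim=40):
    """Evaluate the triple sum with factorial and gamma values tabulated once."""
    a, b, c, d, f = pn
    if a >= 0:
        raise ValueError("leading coefficient must be negative")
    facts = [math.factorial(k) for k in range(nlim + 1)]
    # gammas[k] == gamma(k / 4); index 0 is a placeholder and is never read.
    gammas = [math.nan] + [math.gamma(k / 4) for k in range(1, 6 * nlim + 2)]
    total = 0.0
    for n in range(nlim + 1):
        for m in range(nlim + 1):
            # Only p with the same parity as n contributes.
            for p in range(n % 2, nlim + 1, 2):
                k = 3 * n + 2 * m + p + 1
                total += (b**n * c**m * d**p) * gammas[k] \
                    / (facts[n] * facts[m] * facts[p]) \
                    / (-a) ** (k / 4)
    return 0.5 * math.exp(f) * total
```

Called with the coefficients from the question, series_area([-0.0250, 0.0667, 0.2500, -0.6000, 0.0]) should agree with the quad result quoted there to about three decimal places.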
Well, without even making changes to your code you could install an excellent package from Tom Minka at Microsoft called lightspeed which replaces some built-in matlab functions with much faster versions. I know there's a replacement for gammaln().
You'll get nontrivial speed improvements, though I'm not sure how much, and it's straight-forward to install.
