Really weird interaction with numbers - Pascal

Today I was doing the exercises for my university programming class and came across this weird thing. I would like to know if anyone can explain to me what's going on here.
This is the code I wrote to show it:
program problema;
var
  a : real;
  b : real;
begin
  a := 1 - 0.8 - 0.2;
  b := 1 - 0.2 - 0.8;
  write(a);
  writeln(b);
end.
While I expected it to print 0 in both cases, it actually prints -1.3... for the first one and 0 for the second. How is that possible?

You are seeing round-off errors that arise when decimal fractions are expressed in binary. The other commenters implied that but didn't say it explicitly.
To deal with this, it is important to test equality within a tolerance rather than exactly. For example, to decide if a real number x is equal to 0.2, test it not as x = 0.2, but rather as |x - 0.2| < epsilon, where epsilon is the tolerance you want. Perhaps abs(x - 0.2) < 0.000001 is good enough.
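To make the tolerance test concrete, here is a rough sketch in Python (whose floats are the same IEEE 754 doubles, so it shows the same effect; the epsilon here is an illustrative choice, not a universal constant):

a = 1 - 0.8 - 0.2        # leaves a tiny nonzero residue (about -5.6e-17 in doubles)
b = 1 - 0.2 - 0.8        # happens to come out exactly 0.0

print(a == 0)            # False: exact comparison is unreliable
print(b == 0)            # True, but only by luck of the rounding

EPSILON = 1e-6           # tolerance chosen to suit the problem
print(abs(a) < EPSILON)  # True: "zero" within tolerance
print(abs(b) < EPSILON)  # True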

Related

How does one do Algebra in Lua?

I've looked and tried but I can't find anything really helpful, so thank you in advance.
My problem is that I have a changing variable, "balance"; for the moment I have it represented as 200. I need to use this equation to find how much money I should withdraw in a game, but I don't know how to write a Lua script that solves the algebra.
The equation is: 200/(x+x^2+x^3+x^4+x^5) = 0.00001001. How would I set about solving for x?
I have tried adding 0.0000001 to x whenever 200/(x+x^2+x^3+x^4+x^5) doesn't equal 0.00001001, but it is very impractical and I haven't gotten it to work. It's the only way I can come up with at the moment. Any help would be appreciated.
This solution finds a zero of any continuous function (not only algebraic, and not only differentiable ones) and requires knowing a range that contains the root.
local function find_zero(f, x_left, x_right, eps)
    eps = eps or 0.0000000001 -- precision
    local f_left, f_right = f(x_left), f(x_right)
    assert(x_left <= x_right and f_left * f_right <= 0, "Wrong range")
    -- Bisection: repeatedly halve the interval that brackets the root
    while x_right - x_left > eps do
        local x_middle = (x_left + x_right) / 2
        local f_middle = f(x_middle)
        if f_middle * f_left > 0 then
            x_left, f_left = x_middle, f_middle
        else
            x_right, f_right = x_middle, f_middle
        end
    end
    return (x_left + x_right) / 2
end
local function my_func(x)
    return 200 / (x + x^2 + x^3 + x^4 + x^5) - 0.00001001
end

-- Assuming that the root is between 1 and 1000
local x = find_zero(my_func, 1.0, 1000.0)
print(x) --> 28.643931367544
200/(x+x^2+x^3+x^4+x^5)=0.00001001 is equivalent to 200 = 0.00001001 * (x+x^2+x^3+x^4+x^5), so you have a polynomial equation to solve, and traditionally it is this form of the equation that people like to deal with.
If you want to stay in Lua, then if the form of the equation is predictable enough that you can find a place where the right side is always less than the left (e.g. x = 0) and a place where the right side is always greater than the left (e.g. very large values of x), then you can use binary search - not terribly efficient, but certain and easy to code.
For general polynomial equations, one well-known method is https://en.wikipedia.org/wiki/Newton's_method. Given f(x) = 0 and a guess for x, a better guess might be x - f(x) / f'(x), where f'(x) is the derivative of f(x). There are a few pathological cases where this fails for various reasons, though, so again you probably want to know that your equation is reliably tractable.
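For instance, applied to the polynomial form of this equation, the iteration looks like this (a rough Python sketch; the starting guess and tolerance are arbitrary choices):

def newton(f, df, x, tol=1e-12, max_iter=100):
    for _ in range(max_iter):
        step = f(x) / df(x)  # Newton update: x <- x - f(x)/f'(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge")

c = 0.00001001
g  = lambda x: c * (x + x**2 + x**3 + x**4 + x**5) - 200  # polynomial form
dg = lambda x: c * (1 + 2*x + 3*x**2 + 4*x**3 + 5*x**4)   # its derivative
print(newton(g, dg, x=10.0))  # ~28.6439, agreeing with the bisection result

This particular g is increasing and convex for x > 0, so Newton converges from any positive starting guess; that is exactly the sort of tractability you want to verify beforehand.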
Since you have Lua, you may be able to bring in C code that calls out to a maths library such as http://commons.apache.org/proper/commons-math/. They have a routine called LaguerreSolver() which will reasonably reliably solve polynomial equations for you, defending itself against all of the pathological cases. Most math libraries contain a lot more work than any single person is likely to put in for an individual problem, and are of correspondingly higher quality than do-it-yourself approaches such as the one I describe above.

Mathematica variable defined outside Do loop

I want Mathematica to display the result in decimal form, say 3 decimal places together with 10 to some power. What function shall I use?
r = 0;
Do[r += (i/100), {i, 1, 100}];
Print[r];
I tried ScientificForm[r,3] and NumberForm[r,3] and neither works.
Thanks in advance!
The problem you have, though you don't quite state this, is that Mathematica can compute r exactly. Your code sets the value of r to the rational number 101/2, and Mathematica's default behaviour is to display exact numbers exactly. That's what you (or whoever bought your licence) pay for.
The expression
N[r]
will produce a decimal representation of r, i.e. 50.5, and
ScientificForm[N[r]]
gives the result
5.05*10^(1)
(though formatted rather more nicely in the Mathematica front end).

Pascal double value is not exact, how to fix this?

I have been solving a programming challenge on UVa and ran into this problem, which is REALLY strange. Here is the flawed code:
program WTF;
begin
  WriteLn(Trunc(2.01 * 100));
  ReadLn();
end.
Obviously, I need to get 201 as an Integer, but I get 200. This happens because Double somehow doesn't store the exact value... it's as if 2.01 = 2.00(9), for reasons unbeknownst to me. Can someone explain this and provide a solution?
Edit: I figured out that using Round() instead of Trunc() fixes this... but still, why doesn't Trunc() work?
Double stores numbers of the form s*2^p, where s and p are integers. The number 2.01 is not of the form s*2^p for any integers s and p, so it cannot be stored exactly in a Double.
The solution here is to round 2.01 * 100 to the nearest integer instead of truncating it. Although 2.01 is not exactly 2.01, it is only a little bit below. Rounding to the nearest integer would result in 201.
Note that if by 2.00(9) you mean 2.0099999999… repeating indefinitely, then 2.00(9) is not the Double you get when you write 2.01. The nearest Double to the real 2.01, and the number you got, is 2.0099999999999997868371792719699442386627197265625. It is of the form s*2^p: 9052235251014696 * 2^-52.
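Python floats are the same IEEE 754 doubles, so the effect is easy to reproduce there (a quick sketch; the format precision is only for display):

x = 2.01 * 100
print(repr(x))         # 200.99999999999997: just below 201
print(int(x))          # 200 -- truncation, like Pascal's Trunc()
print(round(x))        # 201 -- round to nearest, like Round()
print(f"{2.01:.49f}")  # the nearest double to 2.01, slightly below it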

Is it possible to overflow an Oracle NUMBER type?

I'm working with process scheduling software called Appworx. In it, each process and subprocess can have an arbitrary number of "conditions", which if true, some conditional action is taken.
One of the possible conditional actions is a goto statement, where a plain integer is the label (each condition being numbered starting at 1). I'd like to use this feature to evaluate and run a few tasks in a loop, but you can only goto higher-numbered conditions (Don't ask me why... this seems to ruin most of the utility).
I have reason to believe that all of this is evaluated by Oracle on the backend. And having looked at the schema for Appworx, it appears that the goto labels are all NUMBER(12,0). I suspect that the logic that checks whether a label is lower than the current condition is something like:
where label > current_condition
So, if I were to supply a goto with a high enough value, I think it would cheat the checking and allow me to do simple loops. At least if Oracle used normal integers. Is it possible to overflow them, and what value would I use to overflow the value back to 1?
I suppose the Oracle version matters quite a bit, if so, it's 11g.
PS Also, if anyone would care to re-tag this for me, please add "appworx"
Oracle numbers are in fact decimal floating-point numbers with a 40-digit significand.
So, they can't be overflowed in the wrap-around sense.
(10^40 - 1) is the largest integer that can be increased by 1 exactly.
NUMBER(12,0) is a subtype of type NUMBER; that is, it consists of the NUMBER type plus a restriction check.
Well, that depends on your definition of 'overflow'. If you define 'overflow' as 'find a value n where n + 1 < n', then no, there's no such value. If you define 'overflow' as 'raises an exception', then yes, it's quite possible to perform an operation on a NUMBER(12,0) where an exception is raised.
Run the following:
DECLARE
  n NUMBER(12, 0);
BEGIN
  n := 999999999999; -- Twelve 9's
  DBMS_OUTPUT.PUT_LINE('1 : n=' || n);
  n := n + 1;
  DBMS_OUTPUT.PUT_LINE('2 : n=' || n);
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('Exception: ' || SQLCODE || ' ' || SQLERRM);
END;
As you can see, the following exception is thrown when attempting to execute "n := n + 1":
ORA-06502: PL/SQL: numeric or value error: number precision too large
So it's quite possible to overflow a subtype of NUMBER. However, given that you're hoping to find a value n where n + 1 < n, I think you're out of luck.
And if you really want to provoke this behavior using the base NUMBER type, just execute
n := POWER(10, 126);
Of course, for truly nasty behavior in NUMBER you need to get it to produce a NaN (Not a Number):
n := 9999999999999999999999999999999999999999 * POWER(10, 125);
DBMS_OUTPUT.PUT_LINE('n=' || n);
produces
n=~
WTF?!? '~'? What the heck is '~'? Well, it appears that this is Oracle's way of printing a NaN. And the really fun part? Once you've got a NaN in a variable, any operation you perform on that variable will produce another NaN. Quietly. Silently. Without warning. Without recourse. Try:
DBMS_OUTPUT.PUT_LINE('n * 1234=' || n * 1234); -- produces n * 1234=~
DBMS_OUTPUT.PUT_LINE('n / 5678=' || n / 5678); -- produces n / 5678=~
Hey - have fun sweating your financials! :-)
In actual practice you're very unlikely to encounter this behavior, but it's the kind of thing you really need to be aware of - not only because encountering it can really ruin your month, but because (and you can count on this) next week the clueless guy in the cube by the bathroom is going to be asking about this - and you will now know all about it. (And you will now be able to rest easy, comfortable in the knowledge that this guy really is clueless and thus deserves to be parked in The Cube From Hell. I mean, you picked up on this on StackOverflow, right? So how hard can it be? :-)
Share and enjoy.

What is a "good" R value when comparing 2 signals using cross correlation?

I apologize in advance for being a bit verbose: if you want to skip all the background mumbo jumbo, you can see my question down below.
This is pretty much a follow-up to a question I previously posted on how to compare two 1D (time-dependent) signals. One of the answers I got was to use the cross-correlation function (xcorr in MATLAB), which I did.
Background information
Perhaps a little background information will be useful: I'm trying to implement an Independent Component Analysis algorithm. One of my informal tests is to (1) create the test case by (a) generating 2 random vectors (1x1000), (b) combining them into a 2x1000 matrix (called "S"), and (c) multiplying this by a 2x2 mixing matrix (called "A") to give me a new matrix (let's call it "T").
In summary: T = A * S
(2) I then run the ICA algorithm to generate the inverse of the mixing matrix (called "W"), (3) multiply "T" by "W" to (hopefully) give me a reconstruction of the original signal matrix (called "X")
In summary: X = W * T
(4) I now want to compare "S" and "X". Although "S" and "X" are 2x1000, I simply compare S(1,:) to X(1,:) and S(2,:) to X(2,:), each of which is 1x1000, making them 1D signals. (I have another step which makes sure that these vectors are the proper ones to compare to each other, and I also normalize the signals.)
So my current quandary is how to "grade" how closely S(1,:) matches X(1,:), and likewise S(2,:) to X(2,:).
So far I have used something like: r1 = max(abs(xcorr(S(1,:), X(1,:))))
My question
Assuming that using the cross correlation function is a valid way to go about comparing the similarity of two signals, what would be considered a good R value to grade the similarity of the signals? Wikipedia states that this is a very subjective area, and so I defer to the better judgment of those who might have experience in this field.
As you might realize, I'm not coming from an EE/DSP/statistical background at all (I'm a medical student), so I'm going through a sort of "baptism by fire" right now, and I appreciate all the help I can get. Thanks!
(edit: as far as directly answering your question about R values, see below)
One way to approach this would be to use cross-correlation. Bear in mind that you have to normalize amplitudes and correct for delays: if you have signal S1, and signal S2 is identical in shape, but half the amplitude and delayed by 3 samples, they're still perfectly correlated.
For example:
>> t = 0:0.001:1;
>> y = @(t) sin(10*t).*exp(-10*t).*(t > 0);
>> S1 = y(t);
>> S2 = 0.4*y(t-0.1);
>> plot(t,S1,t,S2);
These should have a perfect correlation coefficient. A way to compute this is to use maximum cross-correlation:
>> f = @(S1,S2) max(xcorr(S1,S2));
f =
@(S1,S2) max(xcorr(S1,S2))
>> disp(f(S1,S1)); disp(f(S2,S2)); disp(f(S1,S2));
12.5000
2.0000
5.0000
The maximum value of xcorr() takes care of the time-delay between signals. As far as correcting for amplitude goes, you can normalize the signals so that their self-cross-correlation is 1.0, or you can fold that equivalent step into the following:
ρ² = f(S1,S2)² / (f(S1,S1) * f(S2,S2))
In this case ρ² = 5 * 5 / (12.5 * 2) = 1.0
You can solve for ρ itself, i.e. ρ = f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2)); just bear in mind that both 1.0 and -1.0 are perfectly correlated (-1.0 means the signals have opposite sign).
Try it on your signals!
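For reference, here is the same computation sketched in Python/NumPy (mirroring the example above, in case you want to check it outside MATLAB):

import numpy as np

t = np.linspace(0, 1, 1001)  # 1 kHz sampling over one second
y = lambda t: np.sin(10 * t) * np.exp(-10 * t) * (t > 0)
S1 = y(t)
S2 = 0.4 * y(t - 0.1)        # same shape, scaled and delayed

f = lambda a, b: np.max(np.correlate(a, b, mode="full"))  # max cross-correlation
rho = f(S1, S2) / np.sqrt(f(S1, S1) * f(S2, S2))
print(rho)                   # ~1.0: perfectly correlated up to rounding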
With respect to what threshold to use for acceptance/rejection: that really depends on what kind of signals you have. A value of 0.9 and above is fairly good, but can be misleading. I would consider looking at the residual signal you get after you subtract out the correlated version. You could do this by looking at the time index of the maximum value of xcorr():
>> t = 0:0.001:1;
>> y = @(a,t) sin(a*t).*exp(-a*t).*(t > 0);
>> S1=y(10,t);
>> S2=0.4*y(9,t-0.1);
>> f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2))
ans =
0.9959
This looks pretty darn good for a correlation. But let's try fitting S2 with a scaled/shifted multiple of S1:
>> [A,i]=max(xcorr(S1,S2)); tshift = i-length(S1);
>> S2fit = zeros(size(S2)); S2fit(1-tshift:end) = A/f(S1,S1)*S1(1:end+tshift);
>> plot(t,[S2; S2fit]); % fit S2 using S1 as a basis
>> plot(t,[S2-S2fit]); % residual
The residual has some energy in it; to get a feel for how much, you can use this:
>> S2res=S2-S2fit;
>> dot(S2res,S2res)/dot(S2,S2)
ans =
0.0081
>> sqrt(dot(S2res,S2res)/dot(S2,S2))
ans =
0.0900
This says that the residual has about 0.81% of the energy (9% of the root-mean-square amplitude) of the original signal S2. (The dot product of a 1D signal with itself will always be equal to the maximum value of the cross-correlation of that signal with itself.)
I don't think there's a silver bullet for answering how similar two signals are with each other, but hopefully I've given you some ideas that might be applicable to your circumstances.
A good starting point is to get a sense of what a perfect match will look like by calculating the auto-correlations for each signal (i.e. do the "cross-correlation" of each signal with itself).
THIS IS A COMPLETE GUESS - but I'm guessing max(abs(xcorr(S(1,:),X(1,:)))) > 0.8 implies success. Just out of curiosity, what kind of values do you get for max(abs(xcorr(S(1,:),X(2,:))))?
Another approach to validate your algorithm might be to compare A and W. If W is calculated correctly, it should be A^-1, so can you calculate a measure like |A*W - I|? Maybe you have to normalize by the trace of A*W.
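A rough Python/NumPy sketch of that check (the matrices are made up for illustration; in your test, W would come from the ICA algorithm, after undoing the scaling/permutation ambiguity you mentioned):

import numpy as np

def mixing_error(A, W):
    P = A @ W                                      # ~identity if W truly inverts A
    P = P * P.shape[0] / np.trace(P)               # normalize by the trace, as suggested
    return np.linalg.norm(P - np.eye(P.shape[0]))  # Frobenius norm of A*W - I

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
W = np.linalg.inv(A)       # a perfect unmixing matrix, for illustration
print(mixing_error(A, W))  # ~0.0; larger values mean worse unmixing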
Getting back to your original question, I come from a DSP background, so I get to deal with fairly noise-free signals. I understand that's not a luxury you get in biology :) so my 0.8 guess might be very optimistic. Perhaps looking at some literature in your field, even if they aren't using cross-correlation exactly, might be useful.
Usually in such cases people talk about "false acceptance rate" and "false rejection rate".
The first one describes how often the algorithm says "similar" for non-similar signals; the second one is the opposite.
Selecting a threshold thus becomes a trade-off between these criteria. To make FAR = 0, the threshold should be 1; to make FRR = 0, the threshold should be -1.
So probably, you will need to decide which trade-off between FAR and FRR is acceptable in your situation and this will give the right value for threshold.
Mathematically this can be expressed in different ways. Just a couple of examples:
1. Fix one of the rates at an acceptable value and minimize the other.
2. Minimize max(FRR, FAR).
3. Minimize a*FRR + b*FAR.
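A toy sketch of that trade-off in Python (the score distributions are invented for illustration; in practice they would be correlation values measured on known-similar and known-dissimilar pairs):

import numpy as np

rng = np.random.default_rng(0)
similar    = rng.normal(0.9, 0.05, 1000)  # scores for truly similar pairs
dissimilar = rng.normal(0.3, 0.20, 1000)  # scores for dissimilar pairs

for threshold in (0.5, 0.7, 0.8, 0.9):
    far = np.mean(dissimilar >= threshold)  # dissimilar pairs accepted: FAR
    frr = np.mean(similar < threshold)      # similar pairs rejected: FRR
    print(f"threshold={threshold:.1f}  FAR={far:.3f}  FRR={frr:.3f}")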
Since they should be equal, the correlation coefficient should be high, between .99 and 1. I would take the max and abs functions out of your calculation, too.
EDIT:
I spoke too soon. I confused cross-correlation with correlation coefficient, which is completely different. My answer might not be worth much.
I would agree that the result would be subjective. Something that would involve the sum of the squares of the differences, element by element, would have some value. Two identical arrays would give a value of 0 in that form. You would have to decide what value then becomes "bad". Make up 2 different vectors that "aren't too bad" and find their cross-correlation coefficient to be used as a guide.
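A minimal sketch of that sum-of-squared-differences measure in Python/NumPy (the vectors are placeholders; deciding what value counts as "bad" is still your calibration step):

import numpy as np

def ssd(a, b):
    d = np.asarray(a) - np.asarray(b)
    return float(np.dot(d, d))  # 0.0 for identical arrays, grows with mismatch

x = np.linspace(0, 1, 1000)
print(ssd(x, x))         # 0.0: identical
print(ssd(x, x + 0.01))  # ~0.1: small but nonzero for a near-match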
(parenthetically: if you were doing a correlation coefficient where 1 or -1 would be great and 0 would be awful, I've been told by bio-statisticians that a real-life value of 0.7 is extremely good. I understand that this is not exactly what you are doing but the comment on correlation coefficient came up earlier.)
