Why does Ruby's Fixnum#/ method round down when it is called on another Fixnum? - ruby

Okay, so what's up with this?
irb(main):001:0> 4/3
=> 1
irb(main):002:0> 7/8
=> 0
irb(main):003:0> 5/2
=> 2
I realize Ruby is doing integer division here, but why? With a langauge as flexible as Ruby, why couldn't 5/2 return the actual, mathematical result of 5/2? Is there some common use for integer division that I'm missing? It seems to me that making 7/8 return 0 would cause more confusion than any good that might come from it is worth. Is there any real reason why Ruby does this?

Because most languages (even advanced/high-level ones) in creation do it? You will have the same behaviour on integer in C, C++, Java, Perl, Python... This is Euclidian Division (hence the corresponding modulo % operator).
The integer division operation is even implemented at hardware level on many architecture. Others have asked this question, and one reason is symetry: In static typed languages such as see, this allows all integer operations to return integers, without loss of precision. It also allow easy access to the corresponding low-level assembler operation, since C was designed as a sort of extension layer over it.
Moreover, as explained in one comment to the linked article, floating point operations were costly (or not supported on all architectures) for many years, and not required for processes such as splitting a dataset in fixed lots.

Related

Is it worth it to rewrite an if statement to avoid branching?

Recently I realized I have been doing too much branching without caring the negative impact on performance it had, therefore I have made up my mind to attempt to learn all about not branching. And here is a more extreme case, in attempt to make the code to have as little branch as possible.
Hence for the code
if(expression)
A = C; //A and C have to be the same type here obviously
expression can be A == B, or Q<=B, it could be anything that resolve to true or false, or i would like to think of it in term of the result being 1 or 0 here
I have come up with this non branching version
A += (expression)*(C-A); //Edited with thanks
So my question would be, is this a good solution that maximize efficiency?
If yes why and if not why?
Depends on the compiler, instruction set, optimizer, etc. When you use a boolean expression as an int value, e.g., (A == B) * C, the compiler has to do the compare, and the set some register to 0 or 1 based on the result. Some instruction sets might not have any way to do that other than branching. Generally speaking, it's better to write simple, straightforward code and let the optimizer figure it out, or find a different algorithm that branches less.
Jeez, no, don't do that!
Anyone who "penalize[s] [you] a lot for branching" would hopefully send you packing for using something that awful.
How is it awful, let me count the ways:
There's no guarantee you can multiply a quantity (e.g., C) by a boolean value (e.g., (A==B) yields true or false). Some languages will, some won't.
Anyone casually reading it is going observe a calculation, not an assignment statement.
You're replacing a comparison, and a conditional branch with two comparisons, two multiplications, a subtraction, and an addition. Seriously non-optimal.
It only works for integral numeric quantities. Try this with a wide variety of floating point numbers, or with an object, and if you're really lucky it will be rejected by the compiler/interpreter/whatever.
You should only ever consider doing this if you had analyzed the runtime properties of the program and determined that there is a frequent branch misprediction here, and that this is causing an actual performance problem. It makes the code much less clear, and its not obvious that it would be any faster in general (this is something you would also have to measure, under the circumstances you are interested in).
After doing research, I came to the conclusion that when there are bottleneck, it would be good to include timed profiler, as these kind of codes are usually not portable and are mainly used for optimization.
An exact example I had after reading the following question below
Why is it faster to process a sorted array than an unsorted array?
I tested my code on C++ using that, that my implementation was actually slower due to the extra arithmetics.
HOWEVER!
For this case below
if(expression) //branched version
A += C;
//OR
A += (expression)*(C); //non-branching version
The timing was as of such.
Branched Sorted list was approximately 2seconds.
Branched unsorted list was aproximately 10 seconds.
My implementation (whether sorted or unsorted) are both 3seconds.
This goes to show that in an unsorted area of bottleneck, when we have a trivial branching that can be simply replaced by a single multiplication.
It is probably more worthwhile to consider the implementation that I have suggested.
** Once again it is mainly for the areas that is deemed as the bottleneck **

How does addition work in racket/scheme?

For example, if you try (+ 3 4), how is it broken down and calculated in the source, specifically? Does it use recursion with add1?
The implementation of + is actually much more complicated than you might expect, because arithmetic is generic in Racket: it works on integers, rational numbers, complex numbers, and so on. You can even mix and match these kinds of numbers and it'll do the right thing. In the end, it's ultimately going to use arithmetic in C, which is what the runtime system is written in.
If you're curious, you can find more of the guts of the numeric tower here: https://github.com/plt/racket/blob/master/src/racket/src/numarith.c
Other pointers: Bignum arithmetic, the Scheme numeric tower, the Racket reference on numbers.
The + operator is a primitive operation, part of the core language. For efficiency reasons, it wouldn't make much sense to implement it as a recursive procedure.

Behaviour of ruby with regards to numerical exactness (Scheme comparison)

In Ruby (1.9), are numbers always considered exact? I know Ruby does some fairly complex stuff with numbers under the hood, since you can do crazy things like ask for 1 trillion to the power of 1 trillion and you will get an answer (after waiting many moons for it to compute).
In Scheme, part of the specification dictates that implementations should indicate whether or not the implementation's internal representation of the number is "exact" or not. For example, 1/2 is always exact, 1/3 is exact, 0.3333 is exact etc etc. But the result of an imprecise mathematical operation on an exact number may produce a number that the Scheme implementation knows to be inexact (due to floating point precision).
(exact? (/ 0.33333 2))
=> #f
That's false, so it's not exact.
Is there a way to deduce the same information in Ruby? If I always use the Complex (or Rational) representation of a number during mathematical operations, will it always be exact in Ruby, or just extremely close to exact?
Complex("0.33333") / 2
Is that exact, or not?
Ruby's class Float is not exact. Class Integer on the other hand more or less "is" since it shifts seamlessly from Fixnum to Bignum
>> 1.4534346345235236346363574564356435
=> 1.45343463452352
>> 10**400
=> 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
>> 10.0**400
=> Infinity
It looks like others would love to see the test you are asking about, namely Numeric#exact? -- see this feature request.

Arbitrary precision arithmetic with Ruby

How the heck does Ruby do this? Does Jörg or anyone else know what's happening behind the scenes?
Unfortunately I don't know C very well so bignum.c is of little help to me. I was just kind of curious it someone could explain (in plain English) the theory behind whatever miracle algorithm its using.
irb(main):001:0> 999**999
368063488259223267894700840060521865838338232037353204655959621437025609300472231530103873614505175218691345257589896391130393189447969771645832382192366076536631132001776175977932178658703660778465765811830827876982014124022948671975678131724958064427949902810498973271030787716781467419524180040734398996952930832508934116945966120176735120823151959779536852290090377452502236990839453416790640456116471139751546750048602189291028640970574762600185950226138244530187489211615864021135312077912018844630780307462205252807737757672094320692373101032517459518497524015120165166724189816766397247824175394802028228160027100623998873667435799073054618906855460488351426611310634023489044291860510352301912426608488807462312126590206830413782664554260411266378866626653755763627796569082931785645600816236891168141774993267488171702172191072731069216881668294625679492696148976999868715671440874206427212056717373099639711168901197440416590226524192782842896415414611688187391232048327738965820265934093108172054875188246591760877131657895633586576611857277011782497943522945011248430439201297015119468730712364007639373910811953430309476832453230123996750235710787086641070310288725389595138936784715274150426495416196669832679980253436807864187160054589045664027158817958549374490512399055448819148487049363674611664609890030088549591992466360050042566270348330911795487647045949301286614658650071299695652245266080672989921799342509291635330827874264789587306974472327718704306352445925996155619153783913237212716010410294999877569745287353422903443387562746452522860420416689019732913798073773281533570910205207767157128174184873357050830752777900041943256738499067821488421053870869022738698816059810579221002560882999884763252161747566893835178558961142349304466506402373556318707175710866983035313122068321102457824112014969387225476259342872866363550383840720010832906695360553556647545295849966279980830561242960013654529514995113584909050813015198928283202189194615501403435553060147713139766323195743324848047347575473228198492343231496580885057330510949058490527738662697480293583612233134502078182014347192522391449087738579081585795613547198599661273567662441490401862839817822686573112998663038868314974259766039340894024308383451039874674061160538242392803580758232755749310843694194787991556647907091849600704712003371103926967137408125713631396699343733288014254084819379380555174777020843568689927348949484201042595271932630685747613835385434424807024615161848223715989797178155169951121052285149157137697718850449708843330475301440373094611119631361702936342263219382793996895988331701890693689862459020775599439506870005130750427949747071390095256759203426671803377068109744629909769176319526837824364926844730545524646494321826241925107158040561607706364484910978348669388142016838792902926158979355432483611517588605967745393958061959024834251565197963477521095821435651996730128376734574843289089682710350244222290017891280419782767803785277960834729869249991658417000499998999
Simple: it does it the same way you do, ever since first grade. Except it doesn't compute in base 10, it computes in base 4 billion (and change).
Think about it: with our number system, we can only represent numbers from 0 to 9. So, how can we compute 6+7 without overflowing? Easy: we do actually overflow! We cannot represent the result of 6+7 as a number between 0 and 9, but we can overflow to the next place and represent it as two numbers between 0 and 9: 3×100 + 1×101. If you want to add two numbers, you add them digit-wise from the right and overflow ("carry") to the left. If you want to multiply two numbers, you have to multiply every digit of one number individually with the other number, then add up the intermediate results.
BigNum arithmetic (this is what this kind of arithmetic where the numbers are bigger than the native machine numbers is usually called) works basically the same way. Except that the base is not 10, and its not 2, either – it's the size of a native machine integer. So, on a 32 bit machine, it would be base 232 or 4 294 967 296.
Specifically, in Ruby Integer is actually an abstract class that is never instianted. Instead, it has two subclasses, Fixnum and Bignum, and numbers automagically migrate between them, depending on their size. In MRI and YARV, Fixnum can hold a 31 or 63 bit signed integer (one bit is used for tagging) depending on the native word size of the machine. In JRuby, a Fixnum can hold a full 64 bit signed integer, even on an 32 bit machine.
The simplest operation is adding two numbers. And if you look at the implementation of + or rather bigadd_core in YARV's bignum.c, it's not too bad to follow. I can't read C either, but you can cleary see how it loops over the individual digits.
You could read the source for bignum.c...
At a very high level, without going into any implementation details, bignums are calculated "by hand" like you used to do in grade school. Now, there are certainly many optimizations that can be applied, but that's the gist of it.
I don't know of the implementation details so I'll cover how a basic Big Number implementation would work.
Basically instead of relying on CPU "integers" it will create it's own using multiple CPU integers. To store arbritrary precision, well lets say you have 2 bits. So the current integer is 11. You want to add one. In normal CPU integers, this would roll over to 00
But, for big number, instead of rolling over and keeping a "fixed" integer width, it would allocate another bit and simulate an addition so that the number becomes the correct 100.
Try looking up how binary math can be done on paper. It's very simple and is trivial to convert to an algorithm.
Beaconaut APICalc 2 just released on Jan.18, 2011, which is an arbitrary-precision integer calculator for bignum arithmetic, cryptography analysis and number theory research......
http://www.beaconaut.com/forums/default.aspx?g=posts&t=13
It uses the Bignum class
irb(main):001:0> (999**999).class
=> Bignum
Rdoc is available of course

Should I use NSDecimalNumber to deal with money?

As I started coding my first app I used NSNumber for money values without thinking twice. Then I thought that maybe c types were enough to deal with my values. Yet, I was advised in the iPhone SDK forum to use NSDecimalNumber, because of its excellent rounding capabilities.
Not being a mathematician by temperament, I thought that the mantissa/exponent paradigm might be overkill; still, googlin' around, I realised that most talks about money/currency in cocoa were referred to NSDecimalNumber.
Notice that the app I am working on is going to be internationalised, so the option of counting the amount in cents is not really viable, for the monetary structure depends greatly on the locale used.
I am 90% sure that I need to go with NSDecimalNumber, but since I found no unambiguous answer on the web (something like: "if you deal with money, use NSDecimalNumber!") I thought I'd ask here. Maybe the answer is obvious to most, but I want to be sure before starting a massive re-factoring of my app.
Convince me :)
Marcus Zarra has a pretty clear stance on this: "If you are dealing with currency at all, then you should be using NSDecimalNumber." His article inspired me to look into NSDecimalNumber, and I've been very impressed with it. IEEE floating point errors when dealing with base-10 math have been irritating me for a while (1 * (0.5 - 0.4 - 0.1) = -0.00000000000000002776) and NSDecimalNumber does away with them.
NSDecimalNumber doesn't just add another few digits of binary floating point precision, it actually does base-10 math. This gets rid of the errors like the one shown in the example above.
Now, I'm writing a symbolic math application, so my desire for 30+ decimal digit precision and no weird floating point errors might be an exception, but I think it's worth looking at. The operations are a little more awkward than simple var = 1 + 2 style math, but they're still manageable. If you're worried about allocating all sorts of instances during your math operations, NSDecimal is the C struct equivalent of NSDecimalNumber and there are C functions for doing the exact same math operations with it. In my experience, these are plenty fast for all but the most demanding applications (3,344,593 additions/s, 254,017 divisions/s on a MacBook Air, 281,555 additions/s, 12,027 divisions/s on an iPhone).
As an added bonus, NSDecimalNumber's descriptionWithLocale: method provides a string with a localized version of the number, including the correct decimal separator. The same goes in reverse for its initWithString:locale: method.
Yes. You have to use
NSDecimalNumber and
not double or float when you deal with currency on iOS.
Why is that??
Because we don't want to get things like $9.9999999998 instead of $10
How that happens??
Floats and doubles are approximations. They always comes with a rounding error. The format computers use to store decimals cause this rouding error.
If you need more details read
http://floating-point-gui.de/
According to apple docs,
NSDecimalNumber is an immutable subclass of NSNumber, provides an object-oriented wrapper for doing base-10 arithmetic. An instance can represent any number that can be expressed as mantissa x 10^exponent where mantissa is a decimal integer up to 38 digits long, and exponent is an integer from –128 through 127.wrapper for doing base-10 arithmetic.
So NSDecimalNumber is recommonded for deal with currency.
(Adapted from my comment on the other answer.)
Yes, you should. An integral number of pennies works only as long as you don't need to represent, say, half a cent. If that happens, you could change it to count half-cents, but what if you then need to represent a quarter-cent, or an eighth of a cent?
The only proper solution is NSDecimalNumber (or something like it), which puts off the problem to 10^-128¢ (i.e.,
0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000001¢).
(Another way would be arbitrary-precision arithmetic, but that requires a separate library, such as the GNU MP Bignum library. GMP is under the LGPL. I've never used that library and don't know exactly how it works, so I couldn't say how well it would work for you.)
[Edit: Apparently, at least one person—Brad Larson—thinks I'm talking about binary floating-point somewhere in this answer. I'm not.]
I've found it convenient to use an integer to represent the number of cents and then divide by 100 for presentation. Avoids the whole issue.
A better question is, when should you not use NSDecimalNumber to deal with money. The short answer to that question is, when you can't tolerate the performance overhead of NSDecimalNumber and you don't care about small rounding errors because you're never dealing with more than a few digits of precision. The even shorter answer is, you should always use NSDecimalNumber when dealing with money.
VISA, MasterCards and others are using integer values while passing amounts. It's up to sender and reciever to parse amouts correctly according to currency exponent (divide or multiply by 10^num, where num - is an exponent of the currency). Note that different currencies have different exponents. Usually it's 2 (hence we divide and multiply by 100), but some currencies have exponent = 0 (VND,etc), or = 3.

Resources