Conversion of decimal fraction to floating-point binary representation - algorithm

Let's assume that we have normalized floating point numbers with an exponent range of [-3,3] and a precision of 4 bits. Below you see 4 decimal numbers and the corresponding binary representation. How can I convert these decimal numbers to binary? I know how to go from binary to decimal, but not vice versa.
0.11 (decimal) = 1.000 * 2^-3 (binary)
3.1416 (decimal) = 1.101 * 2^1 (binary)
2.718 (decimal) = 1.011 * 2^1 (binary)
7 (decimal) = 1.110 * 2^2 (binary)

Just start from the definitions of the mantissa and the exponent. The exponent is the easiest part. The mantissa is nothing more than a sum of negative powers of two: 1 + ½ + ¼ + ⅛ …, some of which are multiplied by one and some by zero.
To determine the exponent's value, find the power of two that, when the number is divided by it (or multiplied by it, for numbers in [0,1)), gives a value in the range [1, 2).
For 0.11, it is -4 (not -3 as you state), as 0.11 * 2⁴ = 1.76.
For 3.1416, it is +1 because 3.1416/2¹ = 1.5708
Then you'll have a number m in the range [1,2) left to convert to a binary fraction. Start with r = "1." as the result, then subtract 1 from m and multiply it by two.
If the result is at least one, append "1" to r and subtract 1 from m; otherwise append "0" to r. Keep multiplying by two, subtracting 1 from m whenever it reaches one or more, and appending "1" or "0" to r depending on whether you had to subtract 1 or not. Stop when you have enough digits in the mantissa.
I guess you can figure out how to do desired rounding mode yourself.
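The steps above can be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the name to_binary_float and the parameter frac_bits are made up here, only positive inputs are handled, the exponent range limit is ignored, and extra digits are truncated rather than rounded.

```python
def to_binary_float(x, frac_bits=3):
    """Return (mantissa_string, exponent) for x > 0, truncating extra bits.

    Hypothetical helper for illustration; no rounding, no exponent limits.
    """
    if x <= 0:
        raise ValueError("only positive numbers are handled in this sketch")
    exponent = 0
    m = x
    # Scale m into [1, 2), counting the power of two used.
    while m >= 2:
        m /= 2
        exponent += 1
    while m < 1:
        m *= 2
        exponent -= 1
    r = "1."
    m -= 1  # drop the leading 1
    for _ in range(frac_bits):
        m *= 2
        if m >= 1:
            r += "1"
            m -= 1
        else:
            r += "0"
    return r, exponent


print(to_binary_float(7))       # ('1.110', 2)
print(to_binary_float(3.1416))  # ('1.100', 1) when truncated
```

For 7 this matches the question's 1.110 * 2^2. For 3.1416 truncation gives 1.100, while the question's 1.101 comes from rounding to nearest, which this sketch leaves out.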

Related

How can I subtract 253 from 176 (176-253) through the 2's complement method?

Here is the subtraction process as a computer would do it in 2's complement.
176-253=176+(-253)
176=10110000
253=11111101
253(inverse)=00000010
253(complement)=00000010+1=00000011
-253=253(complement)=00000011
176+(-253)=10110000+00000011=10110011=179?
but in fact 176-253=-77
Can anybody tell me what's wrong here?
With 8 bits you can only represent numbers from -128 to 127 inclusive in 2's complement. Both your numbers lie outside that range. You would need at least nine bits to do the calculation you want to do.
In 2's complement the most significant bit (MSB, the first bit from the left), indicates the sign, 1 for negative numbers and 0 for non-negative numbers. The value:
00000011
is not -253, but is 3.
Doing your calculation in 9 bits yields:
176 = 010110000
253 = 011111101
253(inverse) = 100000010
253(complement) = 100000010+1=100000011
-253 = 253(complement) = 100000011
176+(-253) = 010110000 + 100000011 = 110110011 = -77
Note that all the negative numbers have MSB=1 and all the non-negative numbers have MSB=0.
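The 9-bit calculation above can be reproduced with a short Python sketch. The function name twos_complement_sub and its bits parameter are assumptions made for this example; it inverts the subtrahend, adds one, adds the operands, and then reads the result as a signed value of the chosen width.

```python
def twos_complement_sub(a, b, bits):
    """Compute a - b the way fixed-width hardware would (illustrative sketch)."""
    mask = (1 << bits) - 1
    inverse = ~b & mask              # flip every bit of b
    neg_b = (inverse + 1) & mask     # two's complement of b
    raw = (a + neg_b) & mask         # add, discarding any carry out
    # The most significant bit is the sign bit.
    if raw & (1 << (bits - 1)):
        value = raw - (1 << bits)
    else:
        value = raw
    return format(raw, "0{}b".format(bits)), value


print(twos_complement_sub(176, 253, 9))  # ('110110011', -77)
```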

How to convert a fraction to binary?

I don't know how to convert from a fraction to binary. When I searched for it, I found a solution that shows this:
 1                1
-- (dec)   =    ---- (bin)
10              1010

       0.000110011...
       -------------
1010 | 1.0000000000
         1010
       ------
         01100
          1010
         -----
          0010000
             1010
            -----
             01100
              1010
             -----
              0010
I don't know how and why to do it.
Let's take a look at converting the decimal value of 0.625 to binary.
Step 1: Begin with the decimal fraction and multiply by 2. The whole number part of the result is the first binary digit to the right of the point.
Because .625 x 2 = 1.25, the first binary digit to the right of the point is a 1.
So far, we have .625 = .1??? . . . (base 2) .
Step 2: Next we disregard the whole number part of the previous result (the 1 in this case) and multiply by 2 once again. The whole number part of this new result is the second binary digit to the right of the point. We will continue this process until we get a zero as our decimal part or until we recognize an infinite repeating pattern.
Because .25 x 2 = 0.50, the second binary digit to the right of the point is a 0.
So far, we have .625 = .10?? . . . (base 2) .
Step 3: Disregarding the whole number part of the previous result (this result was .50 so there actually is no whole number part to disregard in this case), we multiply by 2 once again. The whole number part of the result is now the next binary digit to the right of the point.
Because .50 x 2 = 1.00, the third binary digit to the right of the point is a 1.
So now we have .625 = .101?? . . . (base 2) .
Step 4: In fact, we do not need a Step 4. We are finished in Step 3, because we had 0 as the fractional part of our result there.
Hence the representation of .625 = .101 (base 2) .
Decimal 1/10 converts to an infinite binary fraction.
In your question you said that 1/10 in decimal equals 1/1010 in binary. .1 (1/10) in decimal actually equals 0.00011001100110011... in binary.
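The multiply-by-2 steps can be sketched in Python as below. The name fraction_to_binary and the max_digits cutoff are assumptions for this example; Fraction is used so that inputs like 1/10 are handled exactly, and the cutoff stops the loop for fractions that never terminate.

```python
from fractions import Fraction


def fraction_to_binary(value, max_digits=20):
    """Binary digits of a fraction 0 < value < 1 (illustrative sketch)."""
    frac = Fraction(value)
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= 2
        bit = int(frac)        # whole number part is the next binary digit
        digits.append(str(bit))
        frac -= bit            # disregard the whole number part
    return "0." + "".join(digits)


print(fraction_to_binary("0.625"))             # 0.101
print(fraction_to_binary(Fraction(1, 10), 12)) # 0.000110011001
```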
Fractional value to binary number conversion
Multiply the fraction value by 2; the result has an integer digit (1 or 0) and a fraction value.
Take the fraction value and repeat the first step with it.
Repeat the process until the fraction value reaches 0.
Collect the integer digits from top to bottom.
fraction = 0.125
0.125 x 2 = 0.250 -> 0
0.250 x 2 = 0.50 -> 0
0.50 x 2 = 1.0 -> 1
Results
given fraction value (base 10) = 0.125
in binary bits (base 2) = 0.001
Real number to binary conversion
In a binary weighted fraction each digit to the right of the binary point has a weight that is a negative power of 2, half the weight of the digit to its left. The first rightward digit has a weight of 1/2, the second 1/4, the third 1/8, and so on.
So a 0.111 (base-2) is:
1*(0.5) + 1*(0.25) + 1*(0.125) = 0.875
And a 0.0101 (base-2) is:
0*(0.5) + 1*(0.25) + 0*(0.125) + 1*(0.0625) = 0.3125
It's no different from binary integers, except we're just extending it to negative powers of 2 as we move right of the decimal point.
I hope that addresses at least part of your question.
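As a small check of the weights described above, here is a Python sketch that sums them for a binary fraction string. The function name binary_fraction_to_decimal is made up for this example, and it assumes input of the form "0.xxx".

```python
def binary_fraction_to_decimal(bits):
    """Sum the weights 1/2, 1/4, 1/8, ... of the set digits (sketch)."""
    _, frac = bits.split(".")
    total = 0.0
    for position, digit in enumerate(frac, start=1):
        total += int(digit) * 2.0 ** -position
    return total


print(binary_fraction_to_decimal("0.111"))   # 0.875
print(binary_fraction_to_decimal("0.0101"))  # 0.3125
```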

An efficient algorithm to compute the number of '1' bits in a long decimal integer that is represented as a string?

I came across this interesting question today. (Note that this is not for my homework or interview, etc.)
Given a decimal number that is represented in string, we want to compute the number of '1' bits for the large number in binary format. Here the string can have thousands of characters, and cannot be represented with one int or long long variable.
For example, countBits("10") = 2 as '10' in decimal can be represented as '1010' in binary format. Similarly, we have countBits("12") = 2, countBits("7") = 3
What is an efficient algorithm for this? One possible solution is to convert the decimal string to another string in the binary format, and count the '1's. Can we do better?
When converting from a decimal representation to an integer, the nth digit from the end of the string represents the number of 10₁₀ⁿ (one zero base ten to the power of n) that is added to total the integer value. If you then want to represent that integer in binary, you have to raise 10₁₀, which is 1010₂, to that power and multiply that value by the digit's value.
Because one of the factors of the base you are translating from, 5, is relatively prime to 2, the powers of 10₁₀ have increasingly long representations in base 2: 1₂, 1010₂, 1100100₂, 1111101000₂.
Note that these powers have trailing zeros (10₁₀ = 2 × 5, and 2 is not relatively prime to the base we are translating into), so they will only affect 1, 3, 5, and 7 bits of the answer instead of all 1, 4, 7, 10 bits. But the number of bits they affect will still vary with O(N), where N is the length of the input, so calculating the affected bits will take O(N²) operations.
If the base you were translating from did not have factors which were relatively prime to the base you are translating to (say translating base 16 to base 2, or base 9 to base 3 and counting non-zero digits), then there would be an O(N) algorithm, as the sum of non-zero digits in the target base would equal the sum for each digit in the input translated individually. Since that is not the case here, you are stuck with an O(N²) algorithm where you translate the decimal representation into binary and then count the bits in the binary representation.
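A minimal Python sketch of that quadratic approach follows. It halves the decimal digit string repeatedly and counts the remainders, which are the binary digits from least to most significant. The name count_bits comes from the question; the digit-list implementation is an assumption for illustration, not code from the answer.

```python
def count_bits(decimal_string):
    """Count the 1 bits of a non-negative decimal string (O(N^2) sketch)."""
    digits = [int(c) for c in decimal_string]
    ones = 0
    while any(digits):
        remainder = 0
        # Long division of the whole digit list by 2.
        for i, d in enumerate(digits):
            current = remainder * 10 + d
            digits[i] = current // 2
            remainder = current % 2
        ones += remainder  # the remainder is the next binary digit
    return ones


print(count_bits("10"), count_bits("12"), count_bits("7"))  # 2 2 3
```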
You convert it to binary and use the Hamming weight algorithm.
How does it work? Suppose you have the number 8, which is 00001000.
The algorithm takes chunks of 2 bits, so it'll have 00 00 10 00.
Now it'll sum the two bits in each chunk (by masking with 01010101 and shifting), which will result in: 00 00 01 00.
Now it does the same for each 4 bits (with a mask 00110011..), so it'll have 0000 0100. After adding the two halves of each chunk, you'll have 0000 0001.
The last stage is adding the two 4-bit numbers, 0 + 1, which is 1, and that's the final result.
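For a single byte, the three stages described above might look like this in Python. This is a sketch of the usual mask-and-shift formulation; the function name popcount8 is made up for this example, and wider words need more stages with wider masks.

```python
def popcount8(x):
    """Hamming weight of an 8-bit value using masks and shifts (sketch)."""
    x = (x & 0x55) + ((x >> 1) & 0x55)  # sums of adjacent bits, 2-bit fields
    x = (x & 0x33) + ((x >> 2) & 0x33)  # sums of adjacent 2-bit fields
    x = (x & 0x0F) + ((x >> 4) & 0x0F)  # sum of the two 4-bit fields
    return x


print(popcount8(0b00001000))  # 1
```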

Algorithm in hardware to find out if number is divisible by five

I am trying to think of an algorithm to implement this for a given n bit binary number. I tried out many examples, but am unable to find out any pattern. So how shall I proceed?
How about this:
Convert the number to base 4 (this is trivial by simply combining pairs of bits). 5 in base 4 is 11. The values base 4 that are divisible by 11 are somewhat familiar: 11, 22, 33, 110, 121, 132, 203, ...
The rule for divisibility by 11 is that you add all the odd digits and all the even digits and subtract one from the other. If the result is divisible by 11 (which, remember, is 5), then the original number is divisible by 11 (that is, by 5).
For example:
123456d = 1 1110 0010 0100 0000b = 132021000_4
The even digits are 1 2 2 0 0 : sum = 5d
The odd digits are 3 0 1 0 : sum = 4d
Difference is 1, which is not divisible by 5
Or another one:
123455d = 1 1110 0010 0011 1111b = 132020333_4
The even digits are 1 2 2 3 3 : sum = 11d
The odd digits are 3 0 0 3 : sum = 6d
Difference is 5, which is divisible by 5
This should have a fairly efficient HW implementation because it's mostly bit-slicing, followed by N/2 adders, where N is the number of bits in the number you're interested in.
Note that after adding the digits and subtracting, the maximum value is 3/4 * N, so if you have 16-bit numbers max, you can get at most 12 as a result, so you only need to check for 0, ±5 and ±10 explicitly. If you're using 32-bit numbers then you can get at most 24 as a result, so you need to also check if the result is ±15 or ±20.
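In software the same rule can be sketched as below. The function name divisible_by_5 is an assumption for this example; it walks the number two bits (one base-4 digit) at a time and keeps an alternating sum.

```python
def divisible_by_5(n):
    """Base-4 alternating digit sum test for divisibility by 5 (sketch)."""
    difference = 0
    sign = 1
    while n:
        difference += sign * (n & 3)  # current base-4 digit
        sign = -sign                  # alternate between odd and even digits
        n >>= 2
    return difference % 5 == 0


print(divisible_by_5(123456), divisible_by_5(123455))  # False True
```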
Make a Deterministic Finite Automaton (DFA) to implement the divisibility check and implement the DFA in hardware.
Creating a DFA for divisibility by 5 is easy. You just need to notice the remainders and check what 2r (mod 5) and 2r + 1(mod 5) map to. There are many websites that discuss this. For example this one.
There are well-known examples to convert DFA to a hardware representation as well.
Well, I just figured it out ...
number mod 5 = a0 * 2^0 mod 5 + a1 * 2^1 mod 5 + a2 * 2^2 mod 5 + a3 * 2^3 mod 5 + a4 * 2^4 mod 5 + ...
= a0 (1) + a1 (2) + a2 (-1) + a3 (-2) + a4 (1) + ... (the pattern repeats)
Hence the number is divisible by 5 when the difference of odd digits + 2 times the difference of even digits is divisible by 5 (digits counted from the right, starting at 1).
For example, consider 110010:
odd digits difference = 0-0+1 = 1 or 01
even digits difference = 1-0+1 = 2 or 10
difference of odd digits + 2 times difference of even digits = 01 + 2*(10) = 01 + 100 = 101, which is divisible by 5.
The contribution of each bit toward being divisible by five is a four bit pattern 3421.
You could shift through any binary number 4 bits at a time adding the corresponding value for positive bits.
Example:
100011
take 0011
apply the pattern 0021
sum 3
next four bits 0010
apply the pattern 0020
sum = 5
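The 3421 pattern is just 2^3, 2^2, 2^1 and 2^0 reduced mod 5, so the idea can be sketched in Python as follows. The names WEIGHTS and weighted_sum are made up for this example, and the final test is simply whether the sum is a multiple of 5.

```python
WEIGHTS = (1, 2, 4, 3)  # 2^0, 2^1, 2^2, 2^3 mod 5; repeats every four bits


def weighted_sum(n):
    """Add the mod-5 weight of every set bit (illustrative sketch)."""
    total = 0
    position = 0
    while n:
        if n & 1:
            total += WEIGHTS[position % 4]
        n >>= 1
        position += 1
    return total


print(weighted_sum(0b100011))  # 5
```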
We can design a Deterministic Finite Automaton (DFA) for the same. The DFA, then can be implemented in Hardware. This is similar to this answer.
We will simulate a Deterministic Finite Automaton (DFA) that accepts Binary Representation of Integers which are divisible by 5
Now, by accept, we mean that when we are done with scanning string, we should be in one of the multiple possible Final States.
Approach to Design DFA : Essentially, we need to divide the Binary Representation of Integer by 5, and track the remainder. If after consuming/scanning [From Left to Right] the entire string, remainder is Zero, then we should end up in Final State, and if remainder isn't zero we should be in Non-Final States.
Now, DFA is defined by Quintuple/5-Tuple (Q,q₀,F,Σ,δ). We will obtain these five components step-by-step.
Q : Finite Set of States
We need to track remainder. On dividing any integer by 5, we can get remainder as 0,1, 2, 3 or 4. Hence, we will have Five States Z, O, T, Th and F for each possible remainder.
Q={Z, O, T, Th, F}
If after scanning certain part of Binary String, we are in state Z, this means that integer defined from Left to this part will give remainder Zero when divided by 5. Similarly, O for remainder One, and so on.
Now, we can write these five states by Euclidean Division Algorithm as
Z : 5m
O : 5m+1
T : 5m+2
Th : 5m+3
F : 5m+4
where m is Integer.
q₀ : an initial/start state from set Q
Now, start state can be thought in terms of empty string (ɛ). An ɛ directly gets into q₀.
What remainder does ɛ gives when divided by 5?
We can append as many 0s in left hand side of a Binary Number. In the similar fashion, we can append ɛ in left hand side of a Binary String. Thus, ɛ in left can be thought of as 0. And 0 when divided by 5 gives remainder 0. Hence, ɛ should end in State Z. But ɛ ends up in q₀.
Thus, q₀=Z
F : a set of accept states
Now we want all strings which are divisible by 5, or which gives remainder 0 when divided by 5, or which after complete scanning should end up in state Z, and gets accepted.
Hence,
F={Z}
Σ : Alphabet (a finite set of input symbols)
Since we are scanning/reading a Binary String. Hence,
Σ={0,1}
δ : Transition Function (δ : Q × Σ → Q)
Now this δ tells us that if we are in state x (in Q) and next input to be scanned is y (in Σ), then at which state z (in Q) should we go.
If the string upto this point gives remainder 3/Th when divided by 5, and if we append 1 to string, then what remainder will resultant string give.
Now, this can be analyzed by observing how magnitude of a binary string changes on appending 0 and 1.
a.
In Decimal (Base-10), if we add/append 0, then magnitude gets multiplied by 10. Example: 53, on appending 0, becomes 530.
Also, if we append 8 to a decimal number, then the magnitude gets multiplied by 10, and then we add 8 to the multiplied magnitude.
b.
In Binary (Base-2), if we add/append 0, then magnitude gets multiplied by 2 (the positional weight of each bit gets multiplied by 2).
Example: (1010)₂ [which is (10)₁₀], on appending 0, becomes (10100)₂ [which is (20)₁₀].
Similarly, in Binary, if we append 1, then the magnitude gets multiplied by 2, and then we add 1.
Example: (10)₂ [which is (2)₁₀], on appending 1, becomes (101)₂ [which is (5)₁₀].
Thus, we can say that for Binary String x,
x0=2|x|
x1=2|x|+1
We will use these relation to analyze Five States
Any string in Z can be written as 5m
- On 0, it becomes 2(5m), which is 5(2m), nothing but state Z.
- On 1, it becomes 2(5m)+1, which is 5(2m)+1, that is O. [This can be read as if a Binary String is presently divisible by 5, and we append 1, then resultant string will give remainder as 1]
Any string in O can be written as 5m+1
- On 0, it becomes 2(5m+1) = 10m+2, which is 5(2m)+2, state T.
- On 1, it becomes 2(5m+1)+1 = 10m+3, which is 5(2m)+3, that is state Th.
Any string in T can be written as 5m+2
- On 0, it becomes 2(5m+2) = 10m+4, which is 5(2m)+4, state F.
- On 1, it becomes 2(5m+2)+1 = 10m+5, which is 5(2m+1), state Z. [If m is integer, so is (2m+1)]
Any string in Th can be written as 5m+3
- On 0, it becomes 2(5m+3) = 10m+6, which is 5(2m+1)+1, state O.
- On 1, it becomes 2(5m+3)+1 = 10m+7, which is 5(2m+1)+2, that is state T.
Any string in F can be written as 5m+4
- On 0, it becomes 2(5m+4) = 10m+8, which is 5(2m+1)+3, state Th.
- On 1, it becomes 2(5m+4)+1 = 10m+9, which is 5(2m+1)+4, that is state F.
Hence, the final DFA combines everything above. [Figure: state diagram of the DFA, created using a tool]
We can even write code [in High Level Language] for the same. But it would go beyond main aim of this question. If readers wish to see the same, they can check here.
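As a rough idea of what such code could look like, here is a Python sketch that encodes the transitions derived above as a table. The names TRANSITIONS and accepts are made up for this example and are not taken from the linked code.

```python
# For each state: (next state on input 0, next state on input 1).
TRANSITIONS = {
    "Z": ("Z", "O"),
    "O": ("T", "Th"),
    "T": ("F", "Z"),
    "Th": ("O", "T"),
    "F": ("Th", "F"),
}


def accepts(binary_string):
    """Run the DFA from start state Z; accept if it ends in Z (sketch)."""
    state = "Z"
    for symbol in binary_string:
        state = TRANSITIONS[state][int(symbol)]
    return state == "Z"


print(accepts("1010"), accepts("1011"))  # True False
```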
As any assignment this would have been an answer for is bound to be way overdue a year later:
in the binary representation of a natural divisible by five the parities of bits 4n and 4n+2 equal, as well as those for bits 4n+1 and 4n+3.
(This is entirely equivalent to the answers of JoshG79, notsogeek, or james: 4≡-1(mod 5), 3≡-2(mod 5) (with reduced hand-waving about recursion in argumentation, and no dispensable handling of carries in circuitry))

How do you calculate floating point in a radix other than 10?

Given Wikipedia's article on Radix Point, how would one calculate the binary equivalent of 10.1 or the hex equivalent of 17.17? For the former, what is the binary equivalent of a tenth? For the latter, the hex representation of 17/100?
I'm looking more for an algorithm than for solutions to just those two examples.
To convert decimal 10.1 to binary, separate the integer and fractional parts and convert each separately.
To convert the integer part, use repeated integer division by 2, and then write the remainders in reverse order:
10/2 = 5 remainder 0
5/2 = 2 remainder 1
2/2 = 1 remainder 0
1/2 = 0 remainder 1
Answer: 1010
To convert the fractional part, use repeated multiplication by 2, subtracting off the integer part at each step. The integer parts, in order of generation, represent your binary number:
0.1 * 2 = 0.2
0.2 * 2 = 0.4
0.4 * 2 = 0.8
0.8 * 2 = 1.6
0.6 * 2 = 1.2
0.2 * 2 = 0.4
0.4 * 2 = 0.8
... (cycle repeats forever)
So decimal 0.1 is binary 0.000110011001100...
(For a more detailed explanation see routines dec2bin_i() and dec2bin_f() in my article http://www.exploringbinary.com/base-conversion-in-php-using-bcmath/ .)
For hexadecimal, use the same procedure, except with a divisor/multiplier of 16 instead of 2. Remainders and integer parts greater than 9 must be converted to hex digits directly: 10 becomes A, 11 becomes B, ... , 15 becomes F.
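Both halves of the procedure can be put together in a short Python sketch. This is not the dec2bin_i() or dec2bin_f() code from the article: the name to_base, the DIGITS table and the max_frac_digits cutoff are assumptions for this example, and Fraction keeps the decimal input exact.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"


def to_base(value, base, max_frac_digits=12):
    """Convert a non-negative decimal value to the given base (sketch)."""
    integer, frac = divmod(Fraction(value), 1)
    integer = int(integer)
    # Integer part: repeated division, remainders read in reverse order.
    int_digits = []
    while integer > 0:
        integer, remainder = divmod(integer, base)
        int_digits.append(DIGITS[remainder])
    result = "".join(reversed(int_digits)) or "0"
    # Fractional part: repeated multiplication, integer parts read in order.
    frac_digits = []
    while frac != 0 and len(frac_digits) < max_frac_digits:
        frac *= base
        digit = int(frac)
        frac_digits.append(DIGITS[digit])
        frac -= digit
    if frac_digits:
        result += "." + "".join(frac_digits)
    return result


print(to_base("10.1", 2))       # 1010.000110011001
print(to_base("17.17", 16, 6))  # 11.2B851E
```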
The algorithm is quite simple, but in practice you can do a lot of tweaks both with lookup tables and logs to speed it up.
But for the basic algorithm, you may try something like this:
shift = 0
while v >= base: v = v / base, shift = shift + 1
Next digit:
    D = floor(v)
    output D
    if shift == 0, output the radix point
    v = v - D
    v = v * base
    shift = shift - 1
    if (v == 0 and shift < 0) exit
    goto Next digit
You may also put a test in there to stop printing after N digits for longer repeating decimals.
A terminating number (a number which can be represented by a finite number of digits) n1 in base b1, may end up being a non-terminating number in a different base b2. Conversely, a non-terminating number in one base b1 may turn out to be a terminating number in base b2.
The number 0.1₁₀ when converted to binary is a non-terminating number, as is 0.17₁₀ when converted to a hexadecimal number. But the terminating number 0.1₃ in base 3, when converted to base 10, is the non-terminating, repeating number 0.(3)₁₀ (signifying that the number 3 repeats). Similarly, converting 0.1₁₀ to binary and 0.17₁₀ to hexadecimal, one ends up with the non-terminating, repeating numbers 0.0(0011)₂ and 0.2(B851E)₁₆.
Because of this, when converting such a number from one base to another, you may find yourself having to approximate the number instead of having a representation which is completely accurate.
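Whether a given fraction terminates in a base can be checked up front: a fraction in lowest terms terminates in base b exactly when every prime factor of its denominator also divides b. A minimal Python sketch, where the function name terminates is made up for this example:

```python
from fractions import Fraction
from math import gcd


def terminates(value, base):
    """True if value has a finite expansion in the given base (sketch)."""
    q = Fraction(value).denominator  # Fraction is already in lowest terms
    g = gcd(q, base)
    # Strip from the denominator every factor it shares with the base.
    while g > 1:
        while q % g == 0:
            q //= g
        g = gcd(q, base)
    return q == 1


print(terminates("0.1", 2), terminates(Fraction(1, 3), 3))  # False True
```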
The binary counterpart of the tenths place is the halves place, i.e. instead of 1/10^1, the first digit after the radix point is worth 1/2^1.
Each digit represents a power of two. The digits behind the radix point are the same, it's just that they represent 1 over the power of two:
8 4 2 1 . 1/2 1/4 1/8 1/16
So for 10.1, you obviously need an '8' and a '2' to make the 10 portion. 1/2 (0.5) is too much, 1/4 ( 0.25 ) is too much, 1/8 (0.125) is too much. We need 1/16 (0.0625), which will leave us with 0.0375. 1/32 is 0.03125, so we can take that too. So far we have:
8  4  2  1  .  1/2  1/4  1/8  1/16  1/32
1  0  1  0  .  0    0    0    1     1
With an error of 0.00625. 1/64 (0.015625) and 1/128 (0.0078125) are both too much, 1/256 (0.00390625) will work:
8  4  2  1  .  1/2  1/4  1/8  1/16  1/32  1/64  1/128  1/256
1  0  1  0  .  0    0    0    1     1     0     0      1
With an error of 0.00234375.
The .1 cannot be expressed exactly in binary ( just as 1/3 can't be expressed exactly in decimal ). Depending on where you put your radix, you eventually have to stop, probably round, and accept the error.
Before I twiddle with this in the light of my GMP library, here's where I got to trying to make Rick Regan's PHP code generic for any base from 2 up to 36.
Function dec2base_f(ByVal ddecimal As Double, ByVal nBase As Long, ByVal dscale As Long) As String
    ' Convert a fraction 0 <= ddecimal < 1 to a digit string in base nBase (2 to 36).
    Const BASES = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" 'up to base 36
    Dim digitCount As Long
    Dim digit As String * 1
    Dim baseary_f As String
    digitCount = 0
    ' Use at least as many digits as the decimal input has
    ' (written out with If, assuming VBA/VB6, which has no built-in Max function).
    If Len(CStr(ddecimal)) - Len("0.") > dscale Then dscale = Len(CStr(ddecimal)) - Len("0.")
    baseary_f = "0."
    Do While ddecimal > 0 And digitCount < dscale
        ddecimal = ddecimal * nBase
        digit = Mid$(BASES, Fix(ddecimal) + 1, 1) ' the integer part is the next digit
        baseary_f = baseary_f & digit
        ddecimal = ddecimal - Fix(ddecimal) ' keep only the fractional part
        digitCount = digitCount + 1
    Loop
    dec2base_f = baseary_f
End Function
Function base2dec_f(ByVal baseary_f As String, nBase As Double) As Double
    ' Convert a "0.xxx" digit string in base nBase back to a Double, last digit first.
    Const BASES As String = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    Dim decimal_f As Double
    Dim i As Long
    Dim c As Long
    For i = Len(baseary_f) To Len("0.") + 1 Step -1
        c = InStr(BASES, Mid$(baseary_f, i, 1)) - 1 ' value of the digit
        decimal_f = decimal_f + c
        decimal_f = decimal_f / nBase
    Next
    base2dec_f = decimal_f
End Function
Debug.Print base2dec_f(dec2base_f(0.09, 2, 200), 2) ' --> 0.09
Debug.Print base2dec_f(dec2base_f(0.09, 8, 200), 8) ' --> 0.09
Debug.Print base2dec_f(dec2base_f(0.09, 16, 200), 16) ' --> 0.09
