I use de2bi(x), but it only works for integers. I want to convert a number with a fractional part, such as 9.2553, into binary format in MATLAB. If possible, I'd like code that does this without a separate function file.
MATLAB, of course, already stores double values in binary using IEEE-754 binary64 format. All we have to do is somehow get MATLAB to show us the bits.
One way is to use typecast which makes MATLAB interpret a set of memory locations as a different type. In this case, we'll make MATLAB think a double is a uint64 and then send the "integer" through dec2bin. We'll have to do some decomposition on the string after that to get the actual value.
Note: This currently only works with positive values. If you need negative values too, I'll have to make some adjustments.
function binstr = double2bin(d)
    d = double(d); % make sure the input is a double-precision float
    ieee754_d = dec2bin(typecast(d, 'uint64'), 64); % read the double's bits as a uint64
    % IEEE-754 64-bit double:
    %   bit 1 (msb) = sign bit (we'll ignore this for now)
    %   bits 2-12   = exponent with bias of 1023
    %   bits 13-64  = significand with leading 1 removed (implicit)
    exponent = bin2dec(ieee754_d(2:12)) - 1022; % unbiased exponent + 1 = digits left of the binary point
    significand = ['1' ieee754_d(13:64)];
    if (exponent < 1) % d < 1, so pad with -exponent zeros after the binary point
        binstr = ['0.' repmat('0', 1, -exponent) significand];
    else              % d >= 1; place the binary point after `exponent` digits
        binstr = [significand(1:exponent) '.' significand(exponent+1:end)];
    end
end
Test run:
>> double2bin(9.2532)
ans = 1001.0100000011010001101101110001011101011000111000100
Ad hoc solution:
Expand by 2^44 (to get an integer value).
Convert integer result to binary.
Reduce by 2^44 by placing "decimal" point.
(2^44 is the smallest power of 2 expansion that gives an integer result).
Code sample:
expandedRes = dec2bin(9.2553*2^44);
res = [expandedRes(1:end-44), '.', expandedRes(end-43:end)];
Result:
res =
1001.01000001010110110101011100111110101010110011
I have two images to fuse, A and B. I have fused them and got the fused image F. Now I want the difference between the fused image F and the image B.
I have executed the code, but I am not getting the desired results: I get the normalized difference image shown, but I want the required result. Values of the difference image are normalized to the range of 0 to 1.
The code used is
difference=F-B;
figure,imshow(difference);
normImage = mat2gray(difference);
figure,imshow(normImage);
Can anyone help? Thank you.
Using:
R = mat2gray(im2double(F)-im2double(B));
I get the expected result.
To see why conversion to double is important, look at an area of the image where B(y,x) > F(y,x), such as (343, 280) in your sample images.
>> F(343,280)
ans = 32
>> B(343,280)
ans = 107
Mathematically, we'd expect 32-107 to equal -75, but:
>> F(343,280) - B(343,280)
ans = 0
This is because both F and B are arrays of uint8:
>> class(F)
ans = uint8
>> class(B)
ans = uint8
As an unsigned integer, uint8 can't take a negative value, so any attempt to assign a negative value to a uint8 variable results in 0. Since both operands are uint8, the result is uint8. Casting that value to a double after it has already been clamped to the range 0-255 simply yields a double with a value of 0. (The same thing happens at the upper end of the range. Try uint8(444).)
Casting F and B to a signed type (one big enough to hold the range -max to +max, or -255 to 255 in this case) will take care of the math problem:
>> int16(F(343,280)) - int16(B(343,280))
ans = -75
For images, though, casting to double feels more natural and gives you more precision than integers when you're doing calculations and rescaling. Plus, there's this handy im2double function we can use that not only casts the array to doubles, but rescales everything to be between 0 and 1:
>> Fd = im2double(F);
>> Fd(343,280)
ans = 0.1255 % 32.0/255.0
>> Bd = im2double(B);
>> Bd(343,280)
ans = 0.4196 % 107.0/255.0
But now when we try to subtract the two, we actually get a negative value as expected:
>> Fd(343,280) - Bd(343,280)
ans = -0.2941 % -75.0/255.0
So, im2double(F)-im2double(B) gives us double values between -1.0 and 1.0. mat2gray takes care of scaling those values back to a range of 0.0 to 1.0 for display.
Note: I chose the coordinates (343,280) very carefully because that's where F-B is most negative. If you're curious about how the conversion happens and what values get scaled to what, you can also have a look at (53,266).
I am writing a program where I need to delete duplicate points stored in a matrix. The problem is that when it comes to check whether those points are in the matrix, MATLAB can't recognize them in the matrix although they exist.
In the following code, intersections function gets the intersection points:
[points(:,1), points(:,2)] = intersections(...
obj.modifiedVGVertices(1,:), obj.modifiedVGVertices(2,:), ...
[vertex1(1) vertex2(1)], [vertex1(2) vertex2(2)]);
The result:
>> points
points =
12.0000 15.0000
33.0000 24.0000
33.0000 24.0000
>> vertex1
vertex1 =
12
15
>> vertex2
vertex2 =
33
24
Two points (vertex1 and vertex2) should be eliminated from the result, which should be accomplished by the commands below:
points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);
After doing that, we have this unexpected outcome:
>> points
points =
33.0000 24.0000
The outcome should be an empty matrix. As you can see, the first (or second?) pair of [33.0000 24.0000] has been eliminated, but not the second one.
Then I checked these two expressions:
>> points(1) ~= vertex2(1)
ans =
0
>> points(2) ~= vertex2(2)
ans =
1 % <-- It means 24.0000 is not equal to 24.0000?
What is the problem?
More surprisingly, I made a new script that has only these commands:
points = [12.0000 15.0000
33.0000 24.0000
33.0000 24.0000];
vertex1 = [12 ; 15];
vertex2 = [33 ; 24];
points = points((points(:,1) ~= vertex1(1)) | (points(:,2) ~= vertex1(2)), :);
points = points((points(:,1) ~= vertex2(1)) | (points(:,2) ~= vertex2(2)), :);
The result as expected:
>> points
points =
Empty matrix: 0-by-2
The problem you're having relates to how floating-point numbers are represented on a computer. A more detailed discussion of floating-point representations appears towards the end of my answer (The "Floating-point representation" section). The TL;DR version: because computers have finite amounts of memory, numbers can only be represented with finite precision. Thus, the accuracy of floating-point numbers is limited to a certain number of decimal places (about 16 significant digits for double-precision values, the default used in MATLAB).
Actual vs. displayed precision
Now to address the specific example in the question... while 24.0000 and 24.0000 are displayed in the same manner, it turns out that they actually differ by very small decimal amounts in this case. You don't see it because MATLAB only displays 4 significant digits by default, keeping the overall display neat and tidy. If you want to see the full precision, you should either issue the format long command or view a hexadecimal representation of the number:
>> pi
ans =
3.1416
>> format long
>> pi
ans =
3.141592653589793
>> num2hex(pi)
ans =
400921fb54442d18
Initialized values vs. computed values
Since there are only a finite number of values that can be represented for a floating-point number, it's possible for a computation to result in a value that falls between two of these representations. In such a case, the result has to be rounded off to one of them. This introduces a small machine-precision error. This also means that initializing a value directly or by some computation can give slightly different results. For example, the value 0.1 doesn't have an exact floating-point representation (i.e. it gets slightly rounded off), and so you end up with counter-intuitive results like this due to the way round-off errors accumulate:
>> a=sum([0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]); % Sum 10 0.1s
>> b=1; % Initialize to 1
>> a == b
ans =
logical
0 % They are unequal!
>> num2hex(a) % Let's check their hex representation to confirm
ans =
3fefffffffffffff
>> num2hex(b)
ans =
3ff0000000000000
How to correctly handle floating-point comparisons
Since floating-point values can differ by very small amounts, any comparisons should be done by checking that the values are within some range (i.e. tolerance) of one another, as opposed to exactly equal to each other. For example:
a = 24;
b = 24.000001;
tolerance = 0.001;
if abs(a-b) < tolerance, disp('Equal!'); end
will display "Equal!".
You could then change your code to something like:
points = points((abs(points(:,1)-vertex1(1)) > tolerance) | ...
(abs(points(:,2)-vertex1(2)) > tolerance),:)
Floating-point representation
A good overview of floating-point numbers (and specifically the IEEE 754 standard for floating-point arithmetic) is What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg.
A binary floating-point number is actually represented by three integers: a sign bit s, a significand (or coefficient/fraction) b, and an exponent e. For double-precision floating-point format, each number is represented by 64 bits laid out in memory as follows:

bit 1 (msb): sign s
bits 2-12: exponent e (11 bits, stored with a bias of 1023)
bits 13-64: significand b (52 bits, with the implicit leading 1 removed)

The real value can then be found with the following formula:

value = (-1)^s * (1.b)_2 * 2^(e-1023)

where (1.b)_2 denotes the significand bits read as a binary fraction with the implicit leading 1 restored.
This format allows for number representations in the range 10^-308 to 10^308. For MATLAB you can get these limits from realmin and realmax:
>> realmin
ans =
2.225073858507201e-308
>> realmax
ans =
1.797693134862316e+308
Since there are a finite number of bits used to represent a floating-point number, there are only so many finite numbers that can be represented within the above given range. Computations will often result in a value that doesn't exactly match one of these finite representations, so the values must be rounded off. These machine-precision errors make themselves evident in different ways, as discussed in the above examples.
In order to better understand these round-off errors it's useful to look at the relative floating-point accuracy provided by the function eps, which quantifies the distance from a given number to the next largest floating-point representation:
>> eps(1)
ans =
2.220446049250313e-16
>> eps(1000)
ans =
1.136868377216160e-13
Notice that the precision is relative to the size of a given number being represented; larger numbers will have larger distances between floating-point representations, and will thus have fewer digits of precision following the decimal point. This can be an important consideration with some calculations. Consider the following example:
>> format long % Display full precision
>> x = rand(1, 10); % Get 10 random values between 0 and 1
>> a = mean(x) % Take the mean
a =
0.587307428244141
>> b = mean(x+10000)-10000 % Take the mean at a different scale, then shift back
b =
0.587307428244458
Note that when we shift the values of x from the range [0 1] to the range [10000 10001], compute a mean, then subtract the mean offset for comparison, we get a value that differs for the last 3 significant digits. This illustrates how an offset or scaling of data can change the accuracy of calculations performed on it, which is something that has to be accounted for with certain problems.
Look at this article: The Perils of Floating Point. Though its examples are in FORTRAN, it applies to virtually any modern programming language, including MATLAB. Your problem (and its solution) is described in the "Safe Comparisons" section.
Type
format long g
This command will show the FULL value of the number. It's likely to be something like 24.00000021321 != 24.00000123124.
Try writing
0.1 + 0.1 + 0.1 == 0.3.
Warning: you might be surprised by the result!
Maybe the two numbers are really 24.0 and 24.000000001, but you're not seeing all the decimal places.
Check out the MATLAB eps function.
MATLAB uses floating-point math with up to 16 digits of precision (only 5 are displayed by default).
I'm working on a harmonic ratio program. Part of what I want a user to be able to do is plug in various ratios and have the decimal-valued frequency that's playing reveal more ratio-locked frequencies that are higher or lower.
Anyway, on this webpage there is a javascript algorithm to show fractional values (ratios) from given decimals.
http://www.mindspring.com/~alanh/fracs.html
How does it work? I am interested in implementing it myself, but I don't really understand how it functions. If you try out some fractions, it gives you many options (some with extra decimals), so it's not just a GCD reduction.
It is calculating the continued fraction and displaying that. Each term in the continued fraction gives you another fraction that is an order of magnitude better.
See Algorithm for simplifying decimal to fractions for more detailed explanations and alternative algorithms that you could choose to use.
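For illustration, here is a minimal C sketch of that expansion (the function name and the 1e-12 cutoff are my own choices, not taken from the linked page): it peels off one continued-fraction term per iteration and builds the convergents p/q with the standard recurrence p_k = a_k*p_{k-1} + p_{k-2}.

#include <math.h>
#include <stdio.h>

/* Print successive continued-fraction convergents p/q of x.
   Note: for many terms, p and q can overflow long; this is a sketch. */
void convergents(double x, int maxTerms)
{
    long p0 = 1, q0 = 0;                  /* convergent k-2 */
    long p1 = (long)floor(x), q1 = 1;     /* convergent k-1 */
    double frac = x - floor(x);

    printf("%ld/%ld = %.10f\n", p1, q1, (double)p1 / q1);
    for (int k = 1; k < maxTerms && frac > 1e-12; k++) {
        double inv = 1.0 / frac;          /* invert the remainder ...          */
        long a = (long)floor(inv);        /* ... its integer part is next term */
        frac = inv - (double)a;

        long p2 = a * p1 + p0, q2 = a * q1 + q0;  /* standard recurrence */
        printf("%ld/%ld = %.10f\n", p2, q2, (double)p2 / q2);
        p0 = p1; q0 = q1; p1 = p2; q1 = q2;
    }
}

For x = 3.14159265358979 this prints 3/1, 22/7, 333/106, 355/113, ..., each convergent markedly closer than the last, which matches the behavior of the linked page.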
You can exploit IEEE 754, since your decimal value is most likely stored in that format, and it uses an integral binary representation: the mantissa is an integer and the exponent can be converted to an integer division too, so you can extract an a/b form from it directly. For a 32-bit float we've got:
1 bit sign
8 bit exponent (with bias 127)
23+1 bit mantissa (the highest bit is not stored, but it is implicitly 1).
Now, for example, take the float 3.14159265358979. If I read this float's content as an integer type, then it is stored as:
0x40490FDB hex
0100 0000 0100 1001 0000 1111 1101 1011 bin
0 10000000 10010010000111111011011 bin
s exponent mantissa
so:
3.14159265358979 = +1.10010010000111111011011b*2^(10000000b-01111111b)
3.14159265358979 = +110010010000111111011011b/2^(23-(10000000b-01111111b))
3.14159265358979 = +110010010000111111011011b/2^22
3.14159265358979 = 13176795 / 4194304 = 3.1415927410125732421875
If I write it as an "algebraic" equation, I get:
float = (sign) (mantissa+2^23) / 2^(23-(exp-127))
Now you can apply GCD or whatever you want ... Here is simple C++ code for this:
void fraction(int &a,int &b,float c) // a/b ~= c
{
union // convert between float and integer representation
{
float f32;
unsigned int u32;
} x;
x.f32=c;
int s,e;
s =x.u32&0x80000000; // sign bit
a =x.u32&0x007FFFFF; // mantissa
a|= 0x00800000; // add MSB to mantissa (not stored in the float representation)
e =(x.u32>>23)&0xFF; // exponent
e-= 0x7F; // exponent bias to make exponent signed again
// (optional) divide by 2 while you can (lazy shortcut instead of GCD, as b will always be a power of 2); doing it on e instead of b avoids possible overflows
while ((a>=2)&&((a&1)==0)) { a>>=1; e++; }
b=1<<(23-e); // b= 2^(23-exp)
if (s) a=-a; // sign
}
As we've got a binary exponent, b will always be a power of 2. That means that instead of GCD it is enough to divide a by 2 while we can, and either increase the exponent e or divide b directly, applying GCD only afterwards on much smaller numbers. It is better to do this on e to avoid overflows, as the final exponent is e=<-104,151> and the resulting b is just an integer, so it needs far fewer bits. If b does not fit into an integer, do the opposite (multiply a by 2 and decrement e, or multiply b by 2, until it fits, and/or cut some low bits of the mantissa ...).
Here examples from the page you linked:
a b a / b c
13176795 / 4194304 = 3.141593 ~= 3.141593
11863283 / 8388608 = 1.414214 ~= 1.414214
13573053 / 8388608 = 1.618034 ~= 1.618034
46751 / 128 = 365.242188 ~= 365.242188
Unless you are computing this on strings or with arbitrary precision, you cannot get any better than this due to floating-point rounding. So just choose the floating precision you want (32-bit float, 64-bit double, 80-bit extended, ...), extract the mantissa and exponent, and convert to a/b.
Hope it is clear enough now. In case you want to know how we get the IEEE 754 form from a (string/value), it boils down to conversion to binary. We need just the fractional part, and that is done by successive multiplication by the target base (2) in the source base (10 or 2^8, 2^16, 2^32, ...). So in each iteration multiply the value by 2; the integer part of the result is the new digit, and the fractional part is used for the next iteration ... repeat until the value becomes zero or the maximum number of digits is reached. Here is the trace for 0.123 (a code sketch follows it):
0.123 0b
0.246 -> 0.0b
0.492 -> 0.00b
0.984 -> 0.000b
1.968 -> 0.0001b
1.936 -> 0.00011b
1.872 -> 0.000111b
1.744 -> 0.0001111b
1.488 -> 0.00011111b
0.976 -> 0.000111110b
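The same loop as a minimal C sketch (assuming a non-negative input and that double precision is enough for the digits you need; the function name is mine):

#include <stdio.h>

/* Print up to maxDigits binary digits of the fractional part of x
   by repeated multiplication by 2, exactly as in the trace above. */
void frac_to_bin(double x, int maxDigits)
{
    double f = x - (long)x;          /* keep only the fractional part */
    printf("0.");
    for (int i = 0; i < maxDigits && f != 0.0; i++) {
        f *= 2.0;                    /* the integer part is the next digit */
        int digit = (int)f;
        printf("%d", digit);
        f -= digit;                  /* carry the fraction to the next round */
    }
    printf("\n");
}

frac_to_bin(0.123, 9) prints 0.000111110, matching the trace above.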
I have a 16 bit number which I want to divide by 100. Let's say it's 50000. The goal is to obtain 500. However, I am trying to avoid inferred dividers on my FPGA because they break timing requirements. The result does not have to be accurate; an approximation will do.
I have tried hardware multiplication by 0.01 but real numbers are not supported. I'm looking at pipelined dividers now but I hope it does not come to that.
Conceptually: multiply by 655 (approximately 65536/100) and then shift right by 16 bits. Of course, in hardware, the shift right is free.
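A C sketch of that multiply-and-shift (the function name is mine; note that 655 = floor(65536/100) is slightly low, so the result can read one short, e.g. 50000 -> 499, while 656 errs slightly high instead):

#include <stdint.h>

/* Approximate x/100 as (x * 655) >> 16, i.e. multiply by 655/65536. */
uint16_t div100_approx(uint16_t x)
{
    uint32_t t = (uint32_t)x * 655u;   /* 16x16 -> 32-bit product */
    return (uint16_t)(t >> 16);        /* in hardware the shift is just wiring */
}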
If you need it to be even faster, you can hardwire the divide as a sum of divisions by powers of two (shifts). E.g.,
1/100 ~= 1/128 = 0.0078125
1/100 ~= 1/128 + 1/256 = 0.01171875
1/100 ~= 1/128 + 1/512 = 0.009765625
1/100 ~= 1/128 + 1/512 + 1/2048 = 0.01025390625
1/100 ~= 1/128 + 1/512 + 1/4096 = 0.010009765625
etc.
In C code the last example above would be:
uint16_t divideBy100 (uint16_t input)
{
return (input >> 7) + (input >> 9) + (input >> 12);
}
Assuming that
- the integer division is intended to truncate, not round (e.g. 599 / 100 = 5), and
- it's OK to have a 16x16 multiplier in the FPGA (with a fixed value on one input),
then you can get exact values by implementing a 16x16 unsigned multiplier where one input is 0xA3D7 and the other input is your 16-bit number. Add 0x8000 to the 32-bit product, and your result is in the upper 10 bits.
In C code, the algorithm looks like this
uint16_t divideBy100( uint16_t input )
{
uint32_t temp;
temp = input;
temp *= 0xA3D7; // compute the 32-bit product of two 16-bit unsigned numbers
temp += 0x8000; // adjust the 32-bit product since 0xA3D7 is actually a little low
temp >>= 22; // the upper 10-bits are the answer
return( (uint16_t)temp );
}
Generally, you can multiply by the inverse and shift. Compilers do this all the time, even for software.
Here is a page that does that for you: http://www.hackersdelight.org/magic.htm
In your case that works out to multiplying by 0x51EB851F, taking the high 32 bits of the 64-bit product, and shifting right by 5 (a total shift of 37).
And here is an explanation: Computing the Multiplicative Inverse for Optimizing Integer Division
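Sketched in C (the function name is mine; 0x51EB851F = ceil(2^37/100) is the standard magic constant for unsigned 32-bit division by 100, the same one compilers emit):

#include <stdint.h>

/* Exact unsigned 32-bit division by 100 without a divide:
   multiply by ceil(2^37/100) and shift the 64-bit product right
   by 37 (i.e. take the high 32 bits, then shift right by 5). */
uint32_t div100(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0x51EB851Fu;
    return (uint32_t)(product >> 37);
}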
Multiplying by the reciprocal is often a good approach but, as you have noted, real numbers are not supported. You need to work with fixed point rather than floating-point reals.
Verilog does not have a definition of fixed point; it just uses a word length, and you decide how many bits are integer and how many fractional.
0.01 (0.0098876953125) in binary would be 0_0000001010001. The bigger the word length, the greater the precision.
// 1 integer bit, 13 fractional bits
wire [13:0] ONE_HUNDREDTH = 14'b0_0000001010001;
input [15:0] a; // integer (no fractional bits)
output reg [15+14:0] result; // 13 fractional bits inherited from ONE_HUNDREDTH
output reg [15:0] result_int; // integer result
always @* begin
result = ONE_HUNDREDTH * a;
result_int = result >>> 13;
end
The real-to-binary conversion was done using the Ruby gem fixed_point.
A ruby irb session (with fixed_point installed via gem install fixed_point):
require 'fixed_point'
#Unsigned, 1 Integer bit, 13 fractional bits
format = FixedPoint::Format.new(0, 1, 13)
fix_num = FixedPoint::Number.new(0.01, format )
=> 0.0098876953125
fix_num.to_b
=> "0.0000001010001"
The main question: how many digits?
Let me explain. I have a number in the binary system, 11000000, which is 192 in decimal.
After converting to decimal, how many digits will it have? In my example, it's 3 digits. That part isn't the problem, though. I've searched the internet and found one algorithm for the integral part and one for the fractional part. I don't quite understand them, but (I think) they work.
When converting from binary to octal, it's easier: each 3 bits give you 1 digit in octal. Same for hex: each 4 bits = 1 hex digit.
But I'm very curious: what do I do if I have a number in the P numeral system and want to convert it to the Q numeral system? I know how to do the conversion (I think I know :)), but, first of all, I want to know how many digits it will take in the Q system (you know, I must preallocate space).
Writing n in base b takes floor(log base b (n)) + 1 digits (equivalently, ceiling(log base b (n+1))).
The ratio you noticed (binary/octal) is log base 2 (n) / log base 8 (n) = 3.
(From memory, will it stick?)
There was an error in my previous answer: look at the comment by Ben Schwehn.
Sorry for the confusion; I found the error I made in my previous answer and explain it below.
Please use the answer provided by Paul Tomblin. (rewritten to use P, Q and n)
Y = ln(P^n) / ln(Q)
Y = n * ln(P) / ln(Q)
So Y (rounded up) is the number of characters you need in system Q to express the highest number you can encode in n characters in system P.
I have no answer (that wouldn't convert the number already and take up that much space in a temporary variable) for getting the bare minimum for a given number: 1000 (bin) = 8 (dec), while you would reserve 2 decimal positions using this formula.
If a temporary memory usage isn't a problem, you might cheat and use (Python):
len(str(int(otherBaseStr,P)))
This will give you the number of decimals needed to convert a number in base P, cast as a string (otherBaseStr), into decimals.
Old WRONG answer:
If you have a number in P numeral system of length n
Then you can calculate the highest number that is possible in n characters:
P^(n-1)
To express this highest number in number system Q you need to use logarithms (because they are the inverse to exponentiation):
log(P^(n-1)) / log(Q)
(n-1)*log(P) / log(Q)
For example
11000000 in binary is 8 characters.
To get it in Decimal you would need:
(8-1)*log(2) / log(10) = 2.1 digits (round up to 3)
Reason it was wrong:
The highest number that is possible in n characters is
(P^n) - 1
not
P^(n-1)
If you have a number that's X digits long in base B, then the maximum value that can be represented is B^X - 1. So if you want to know how many digits it might take in base C, you have to find the number Y such that C^Y - 1 is at least as big as B^X - 1. The way to do that is to take the logarithm in base C of B^X - 1. And since the logarithm (log) of a number in base C is the same as the natural log (ln) of that number divided by the natural log of C, that becomes:
Y = ln((B^X)-1) / ln(C) + 1
and since ln(B^X) is X * ln(B), and that's probably faster to calculate than ln(B^X-1) and close enough to the right answer, rewrite that as
Y = X * ln(B) / ln(C) + 1
Convert that to your favourite language. Because we dropped the "-1", we might end up with one digit more than needed in some cases. But even better, you can pre-calculate ln(B)/ln(C) and just multiply it by each new "X" as the length of the number you are trying to convert changes.
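A quick C sketch of that formula (the helper name is mine; as noted, it can overshoot by one digit):

#include <math.h>

/* Digits needed (possibly one extra) to write an X-digit base-B
   number in base C: Y = X * ln(B)/ln(C) + 1. */
int digitsNeeded(int X, int B, int C)
{
    return (int)(X * log((double)B) / log((double)C)) + 1;
}

For example, digitsNeeded(8, 2, 10) returns 3, which covers 11000000 (binary) = 192 (decimal).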
Calculating the number of digit can be done using the formulas given by the other answers, however, it might actually be faster to allocate a buffer of maximum size first and then return the relevant part of that buffer instead of calculating a logarithm.
Note that the worst case for the buffer size happens when you convert to binary, which gives you a buffer size of 32 characters for 32-bit integers.
Converting a number to an arbitrary base could be done using the C# function below (The code would look very similar in other languages like C or Java):
public static string IntToString(int value, char[] baseChars)
{
// 32 is the worst case buffer size for base 2 and int.MaxValue
int i = 32;
char[] buffer = new char[i];
int targetBase= baseChars.Length;
do
{
buffer[--i] = baseChars[value % targetBase];
value = value / targetBase;
}
while (value > 0);
char[] result = new char[32 - i];
Array.Copy(buffer, i, result, 0, 32 - i);
return new string(result);
}
The keyword here is "logarithm", here are some suggestive links:
http://www.adug.org.au/MathsCorner/MathsCornerLogs2.htm
http://staff.spd.dcu.ie/johnbcos/download/Fermat%20material/Fermat_Record_Number/HOW_MANY.html
Look at the logarithm in base Q: a number x takes floor(log_Q(x)) + 1 digits in base Q.
The logarithm in base Q can be computed using your favorite base (10 or e): log_Q(x) = log_10(x) / log_10(Q)
You need to compute the length of the fractional part separately.
For binary to decimal, there are as many decimal digits as there are bits. For example, binary 0.11001101001001 is decimal 0.80133056640625, both 14 digits after the radix point.
For decimal to binary, there are two cases. If the decimal fraction is dyadic, then there are as many bits as decimal digits (same as for binary to decimal above). If the fraction is not dyadic, then the number of bits is infinite.
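A small C sketch of that rule (assuming the fraction is written as n / 10^d with trailing decimal zeros stripped; the helper name is mine). A decimal fraction is dyadic exactly when 5^d divides n, and in that case it needs exactly d bits:

/* Binary digits after the radix point for the decimal fraction n / 10^d
   (trailing decimal zeros stripped): d if 5^d divides n (the dyadic
   case), otherwise infinite (returned here as -1). */
int fracBits(long long n, int d)
{
    long long p5 = 1;
    for (int i = 0; i < d; i++) p5 *= 5;   /* compute 5^d */
    return (n % p5 == 0) ? d : -1;
}

For example, fracBits(80133056640625, 14) returns 14 (the example above), while fracBits(1, 1), i.e. 0.1, returns -1: infinitely many bits.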
(You can use my decimal/binary converter to experiment with this.)