Fortran format 1P10E11.3

Does anyone know what this format line means in fortran:
FORMAT(1x,F7.0,2x,1P10E11.3)
I know the first part is one repetition of float number but I don't understand how many exponential data points are read in the second part and what that P is for.

In 1P10E11.3, the 1P is a scale factor and 10E11.3 is ten repetitions of E11.3, so the line holds one F7.0 value followed by ten values in exponential form. The P descriptor shifts the decimal point, and it behaves differently on input and output. On output, applied to an E format, it moves the decimal point of the mantissa and adjusts the exponent so that the value of the number is unchanged: where plain E would output 0.123E+03, 1PE outputs 1.230E+02. On input it changes the value read when the field contains no exponent, so use it with great caution or not at all. Another "gotcha" is that P stays in effect for the rest of the format until another P specifier appears, e.g. 0P to reset it. One of the newer G, ES or EN formats is generally a better choice than the combination of P and E.
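To make the output behaviour concrete, here is a rough sketch in Python (not Fortran) that imitates what kP,Ew.d prints. The function name fortran_e is made up for this illustration. It only handles scale factors 0 <= k <= d + 1 and two-digit exponents, and the rounding is Python's, which a given Fortran compiler need not match exactly.

```python
def fortran_e(value: float, width: int, d: int, k: int = 0) -> str:
    """Imitate Fortran output for kP,Ew.d (illustrative sketch only)."""
    if d < 1 or not 0 <= k <= d + 1:
        raise ValueError("this sketch only handles d >= 1 and 0 <= k <= d + 1")

    # With k = 0 the field shows d significant digits after "0.";
    # with k > 0 it shows k digits before the point and d - k + 1 after it.
    sig = d if k == 0 else d + 1
    mantissa, exp = f"{abs(value):.{sig - 1}E}".split("E")
    digits = mantissa.replace(".", "")

    # Python's exponent assumes one leading digit; shift it for k leading digits.
    exp10 = 0 if value == 0 else int(exp) + 1 - k

    body = (digits[:k] or "0") + "." + digits[k:]
    sign = "-" if value < 0 else ""
    field = f"{sign}{body}E{exp10:+03d}"

    # Fortran fills the field with asterisks when the value does not fit.
    return field.rjust(width) if len(field) <= width else "*" * width


print(repr(fortran_e(123.0, 11, 3, k=0)))  # plain E11.3
print(repr(fortran_e(123.0, 11, 3, k=1)))  # 1PE11.3
```

The value printed is the same in both cases; only the position of the decimal point and the exponent differ.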

Related

Base91, how is it calculated?

I've been looking online to find out how basE91 is calculated. I have found resources such as this one which specifies the characters used for a specific value but nowhere have I found how I get that value.
I have tried changing the input values into binary and taking chunks of both 6 and 7 bits, but these do not work and I get incorrect output. I do not want code that will do this for me as I wish to write that myself; I only want to know the process needed to encode a string into basE91.
First, you need to see the input as a bit stream.
Then read 13 bits from the stream and form an integer from them. If this integer is less than or equal to 88, read one additional bit and place it in the 14th bit (counting the lowest bit as the 1st). Call the resulting integer v; its maximum value is 8192 + 88 = 8280.
Then split v into two indices: i0 = v % 91 and i1 = v / 91 (integer division). Using a 91-element character table, output two characters: table[i0], table[i1].
(Now you can see the reason for 88: for the maximal value, 8280, both i0 and i1 become 90.)
So this process is more complicated than base64, but more space-efficient. Furthermore, unlike base64, the size of the output depends slightly on the input bytes: an N-length sequence of 0x00 encodes shorter than an N-length sequence of 0xff (for sufficiently large N).
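For readers who do want to see the steps written out, the following is a minimal Python sketch of the procedure described above. The character table, the least-significant-bit-first bit order and the handling of the leftover bits at the end of the input are assumptions taken from my recollection of the basE91 reference implementation, so check them against that implementation before relying on the output.

```python
# Assumed 91-character table (as recalled from the reference implementation).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)
assert len(ALPHABET) == 91


def base91_encode(data: bytes) -> str:
    queue = 0   # bit stream buffer, lowest bit is the oldest
    nbits = 0   # number of valid bits currently in the buffer
    out = []
    for byte in data:
        queue |= byte << nbits
        nbits += 8
        if nbits > 13:
            v = queue & 0x1FFF          # take 13 bits
            if v > 88:
                queue >>= 13
                nbits -= 13
            else:
                v = queue & 0x3FFF      # small value: take a 14th bit too
                queue >>= 14
                nbits -= 14
            out.append(ALPHABET[v % 91])
            out.append(ALPHABET[v // 91])
    if nbits:
        # Assumed tail handling: emit one or two characters for leftover bits.
        out.append(ALPHABET[queue % 91])
        if nbits > 7 or queue > 90:
            out.append(ALPHABET[queue // 91])
    return "".join(out)


print(base91_encode(b"test"))
print(len(base91_encode(b"\x00" * 100)), len(base91_encode(b"\xff" * 100)))
```

The second print illustrates the size remark: all-zero input always takes the 14-bit branch, while all-0xff input always takes the 13-bit branch and therefore needs more characters.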

Comparing vector of double

I am trying to compare two vectors.
v1 = {0.520974 , 0.438171 , 0.559061}
v2 = {0.520974 , 0.438171 , 0.559061}
I write v1 to a file and read it back; that is v2. For some reason, when I compare the two vectors, I get false!
When I do: v1[0]-v2[0] I get 4.3123e-8
Thanks,
Double values, unlike integers, do not always survive being written out as text and read back. The string that represents them does not necessarily carry all of their information.
The main reason is rounding. If you had 1/7 and wrote it on paper in the same format as in your question, you would get:
0.142857
That's exact to 6 decimal places, but no more than that, and the difference shows up. The only difference in the computer is that it counts in binary (and rounds in binary, too), and is further complicated by the fact that at output (or input) you coerce that into decimal (or back, respectively) and round again at each step. All of those are sources of little errors.
If you want to be able to save and reload your doubles exactly (on the same machine), store them in their native binary representation using a binary write and read. If you want them human-readable with only a few digits, as in your question, you give up exact reconstruction and then need to compare them up to a small allowed deviation.
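The trade-off can be sketched in Python (the question does not say which language it uses, and the values below are made up so that the first one has more digits than the six that get printed). Writing six significant digits loses information, a tolerance comparison still succeeds, and a binary round trip is exact. As a side note, Python's repr() emits enough digits to reproduce a double exactly, at the cost of longer text.

```python
import math
import struct

# Hypothetical data: the first value has digits beyond what is printed.
v1 = [0.520974123456789, 0.438171, 0.559061]

# Text round trip with 6 significant digits, as in the question's output.
text = " ".join(f"{x:.6g}" for x in v1)
v2 = [float(tok) for tok in text.split()]
print(v1 == v2)  # exact comparison fails because digits were dropped


def close_enough(a, b, tol=1e-6):
    """Compare element-wise within an absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=tol) for x, y in zip(a, b)
    )


print(close_enough(v1, v2))

# Binary round trip: pack the doubles as raw 8-byte values and unpack them.
blob = struct.pack(f"<{len(v1)}d", *v1)
v3 = list(struct.unpack(f"<{len(v1)}d", blob))
print(v1 == v3)

# Text round trip that keeps every digit needed to rebuild the double.
v4 = [float(repr(x)) for x in v1]
print(v1 == v4)
```

The tolerance of 1e-6 is chosen to match six printed decimals; pick it from the precision you actually write out.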

Convert a long character field to numeric, NOT scientific notation (SAS)

I need to join two tables - one table has householdid which is CHAR30, which appears to have center alignment and the other householdid as numeric 20. I need to convert to the numeric 20 but when I do that it appears truncated, perhaps because of the strange alignment (not all of the 30 positions are actually needed).
When I try to keep the full 30 positions as a numeric I instead get a conversion to scientific notation so of course this will not work as a key id for later operations.
As long as the number is converted properly, it doesn't matter what format it has. A format just tells SAS how to show you the number. Behind the scenes, it is just a DOUBLE.
1.0 = 1 = 1e0
Now if you have converted to a number and cannot get a join, then look at the informat you used to read it in.
try
num_id = input(strip(char_id),best32.);
Strip removes leading and trailing blanks. The BEST32. INFORMAT tries its "best" to read the number up to 32 characters in length.
You cannot store a 20 digit number as a numeric in SAS. SAS stores all numbers as 8 byte floating point and so does not have enough bits to represent that many digits uniquely. You can ask SAS what is the largest integer it can represent exactly by using the CONSTANT() function.
data _null_;
  x=constant('EXACTINT',8);
  put x = comma32. ;
run;
x=9,007,199,254,740,992
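The same limit can be shown outside SAS, since a Python float is also an 8-byte IEEE double. This is only an illustration of the floating-point behaviour, not SAS code, and the helper name survives_double and the sample IDs are made up.

```python
# Largest integer below which every integer is exactly representable
# in an 8-byte double: 2**53 = 9,007,199,254,740,992.
EXACT_INT_LIMIT = 2 ** 53


def survives_double(id_text: str) -> bool:
    """Return True if the integer ID comes back unchanged after a float round trip."""
    n = int(id_text)
    return int(float(n)) == n


print(f"{EXACT_INT_LIMIT:,}")
print(survives_double("9007199254740992"))      # exactly 2**53
print(survives_double("9007199254740993"))      # 2**53 + 1
print(survives_double("12345678901234567891"))  # a 20-digit ID
```

Any ID that fails this check would be silently changed if stored as a numeric, which is why such keys belong in character variables.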
Read and store your 20 and 30 digit strings as character variables.
Use the bestd32. format. Tends to work out pretty well for long key variables. Depending on the length of the variable, you can change 32 to whichever length you need.
Based on the comments under the original question, the only thing you can do is convert all ID fields to strings, and use the strings to do the joins. @Reeza suggested this in one of the comments but it should have been posted as an answer.
I assume you are pulling this information out of another database/system that allows for greater numeric precision than SAS does. If you don't convert the values to strings when they are read into SAS, then you run the risk of losing precision.
If you lose precision, the ID in SAS is likely to become very slightly different to the ID in the original system, which can cause problems when searching the original system for an ID obtained from SAS.
Be sure you don't read the numbers into SAS as numeric, then convert to string. If you do it this way you are still losing precision as soon as the numbers are stored in SAS as numeric variables.

How to convert fixed-point VHDL type back to float?

I am using IEEE fixed point package in VHDL.
It works well, but I am now facing a problem concerning their string representation in a test bench: I would like to dump them in a text file.
I have found that it is indeed possible to directly write ufixed or sfixed using :
write(buf, to_string(x)); --where x is either sfixed or ufixed (and buf : line)
But then I get values like 11110001.10101 (for sfixed q8.5 representation).
So my question : how to convert back these fixed point numbers to reals (and then to string) ?
Split the value into its integer part and its fractional part (two std_logic_vector slices). The integer part can be converted to a string using the standard conversions, but the fractional part needs different handling. For the integer part, use a loop: divide by 10 and turn the remainder into an ASCII character, building the string from the lowest digit to the highest. The fractional part also needs a loop, but here you multiply by 10, take the floor to isolate the next digit and get its character, subtract that integer from the running value, and repeat. That is the concept; I worked it out in MATLAB as a test and am making a VHDL version that I will share soon. I was surprised not to find such a useful function anywhere. Of course, a fixed-point format Q(N,M) can have all sorts of values for N and M, whereas floating point is standardized.
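Until a VHDL version is available, the two loops can be sketched in Python using integer arithmetic only. This is an illustration of the concept, not the author's MATLAB or VHDL code. The function name and parameters are made up: raw is the fixed-point word read as a (signed) integer and frac_bits is the number of bits to the right of the binary point. The fractional digits are truncated, not rounded.

```python
def fixed_to_decimal(raw: int, frac_bits: int, digits: int = 6) -> str:
    """Render raw / 2**frac_bits as a decimal string, digit by digit."""
    sign = "-" if raw < 0 else ""
    raw = abs(raw)
    int_part = raw >> frac_bits
    frac = raw & ((1 << frac_bits) - 1)  # fractional bits as an integer

    # Integer part: divide by 10 repeatedly, collecting remainders
    # from the lowest digit to the highest, then reverse.
    int_chars = []
    while True:
        int_part, rem = divmod(int_part, 10)
        int_chars.append(chr(ord("0") + rem))
        if int_part == 0:
            break
    int_chars.reverse()

    # Fractional part: multiply by 10, the part that overflows past the
    # binary point is the next digit, subtract it and repeat.
    frac_chars = []
    for _ in range(digits):
        frac *= 10
        digit = frac >> frac_bits
        frac_chars.append(chr(ord("0") + digit))
        frac -= digit << frac_bits

    return sign + "".join(int_chars) + "." + "".join(frac_chars)


# The question's bit pattern 11110001.10101 with 5 fractional bits:
bits = 0b1111000110101
print(fixed_to_decimal(bits, 5, digits=5))          # read as unsigned
print(fixed_to_decimal(bits - (1 << 13), 5, 5))     # read as 13-bit two's complement
```

Handling the sign first and converting the magnitude keeps both loops identical for ufixed and sfixed values.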

How to do high precision float point arithmetics in mathematica

In Mma, for example, I want to calculate
1.0492843824838929890231*0.2323432432432432^3
But it does not show the full precision. I tried N or various other functions but none seemed to work. How to achieve this? Many thanks.
When you specify numbers using a decimal point, Mathematica takes them to have MachinePrecision, roughly 16 digits, hence the results typically have fewer than 16 meaningful digits. You can do infinite precision by using rational/algebraic numbers. If you want finite precision that's better than the default, specify your numbers like this
123.23`100
This makes Mathematica interpret the number as having 100 digits of precision. So you can do
ans=1.0492843824838929890231`100*0.2323432432432432`100^3
Check precision of the final answer using Precision
Precision[ans]
Check tutorial/ArbitraryPrecisionNumbers for more details
You may do:
r[x_]:=Rationalize[x,0];
n = r@1.0492843824838929890231 (r@0.2323432432432432)^3
Out:
228598965838025665886943284771018147212124/17369643723462006556253010609136949809542531
And now, for example
N[n,100]
0.01316083216659453615093767083090600540780118249299143245357391544869\
928014026433963352910151464006549
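For comparison, the same two ideas (compute with exact rationals, then print as many digits as you like) can be sketched in Python with the standard fractions and decimal modules. This is not Mathematica code. Fraction parses each literal as an exact decimal, which need not be the same rational that Rationalize[x,0] picks, so the fraction and the trailing digits can differ from the output above.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Parse the literals exactly, digit for digit, instead of as machine floats.
a = Fraction("1.0492843824838929890231")
b = Fraction("0.2323432432432432")

# Exact rational result, no rounding anywhere.
exact = a * b ** 3

# Convert to a decimal expansion with 100 significant digits of working precision.
with localcontext() as ctx:
    ctx.prec = 100
    approx = Decimal(exact.numerator) / Decimal(exact.denominator)

print(exact)
print(approx)
```

Because both inputs are finite decimals, the product is a finite decimal too, so 100 digits of precision is enough to print it without any rounding.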
Sometimes you just want to see more of the machine precision result. These are a few methods.
(1) Put the cursor at the end of the output line, and press Enter (not on the numeric keypad) to copy the output to a new input line, showing all digits.
(2) Use InputForm as in InputForm[1.0/7]
(3) Change the setting of PrintPrecision using the Options Inspector.
