Get Wtime function returning "***" - parallel-processing

I'm currently converting some Fortran code to run in parallel with OpenMP. I'm trying to use omp_get_wtime() to measure how much wall-clock time passes, but it's returning ******. Other OpenMP functions work, yet for some reason this one doesn't. I've removed all the code from between the two timer calls just to try to get something different. Removing the finish and just displaying the start gives the same result. Any ideas of what I'm doing wrong would be much appreciated.
C$ USE OMP_LIB
DOUBLE PRECISION START,FINISH
START = OMP_GET_WTIME()
FINISH=OMP_GET_WTIME()
WRITE(OUT,850) FINISH-START
850 FORMAT(25X,'ELAPSED TIME',I6)

Your problem has nothing to do with the OMP_GET_WTIME function. Rather it stems from the fact that the I edit descriptor is used to display integers and you are feeding it a double precision number instead. You should use one of the floating-point edit descriptors, e.g. F10.6:
$ cat wtime.f
USE OMP_LIB
IMPLICIT NONE
DOUBLE PRECISION START,FINISH
START = OMP_GET_WTIME()
CALL SLEEP(1)
FINISH=OMP_GET_WTIME()
WRITE(*,850) FINISH-START
850 FORMAT(25X,'ELAPSED TIME',F10.6)
END
$ ifort -openmp -o wtime.exe wtime.f
$ ./wtime.exe
ELAPSED TIME 1.000277

Related

How do I declare the precision of a number to be an adjustable parameter?

In 2013 there was a question on converting a big working code from double to quadruple precision: "Converting a working code from double-precision to quadruple-precision: How to read quadruple-precision numbers in FORTRAN from an input file". The consensus was to declare variables with an adjustable parameter "WP" that specifies the "working precision", instead of keeping one version of the program with constants written with D exponents and another written with Q exponents. This way we can easily switch back and forth by defining WP=real128 or WP=real64 at the top, and the rest of the code doesn't need to change.
But how do we do this?
I tried the suggestion in the answer to that question by writing a simple test code, TEST.F90:
PROGRAM TEST
use ISO_FORTRAN_ENV
WP= real128
IMPLICIT NONE
real (WP) :: X
X= 5.4857990945E-4_WP
END PROGRAM TEST
compiled with:
~/gcc-4.6/bin/gfortran -o tst.x TEST.F90
But it gives:
IMPLICIT NONE
1
Error: Unexpected IMPLICIT NONE statement at (1)
TEST.F90:5.12:
real (WP) :: X
1
Error: Parameter 'wp' at (1) has not been declared or is a variable, which does not reduce to a constant expression
TEST.F90:6.29:
X= 5.4857990945E-4_WP
1
Error: Missing kind-parameter at (1)
The kind specifier must be an integer constant (a parameter), and you have not declared WP as one. Furthermore, IMPLICIT NONE must come before any declarations.
Here is a working version addressing both issues:
PROGRAM TEST
use ISO_FORTRAN_ENV
IMPLICIT NONE
integer, parameter :: WP= real128
real (WP) :: X
X= 5.4857990945E-4_WP
END PROGRAM TEST
Actually, many codes use this WP approach, often with the selected_*_kind intrinsic functions. But I think there is an 'easier' way: use the default precision without specifying any kind at all, and use a compiler flag to choose what the default precision is.
The pro is that this method is simpler if you don't need precise control of the precision of each variable. The con is that it depends heavily on compiler flags, which vary between compilers or may not even be available.
For gfortran, there are additional flags such as -freal-4-real-8 and -freal-4-real-16 that promote every explicitly declared lower-precision real to a higher precision.
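As a rough sketch of that flag-based approach (the file name kinds.f90 and the output formatting are just examples), the program below uses only default reals, and the compiler flag alone decides their precision:
PROGRAM KINDS
IMPLICIT NONE
REAL :: X              ! default real; its kind is chosen by the compiler
X = 5.4857990945E-4
WRITE(*,*) 'KIND(X) =', KIND(X), X
END PROGRAM KINDS
$ gfortran -o kinds.x kinds.f90
$ gfortran -freal-4-real-8 -o kinds.x kinds.f90
With gfortran the first build normally reports KIND(X) = 4 (single precision), while the -freal-4-real-8 build promotes every default real to kind 8, with no change to the source.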

Fortran Bus Error on allocating a large matrix (gfortran)

When I compile the following Fortran code with gfortran and run it, it gives me 'signal SIGBUS: Access to undefined portion of a memory object' whenever n >= 180. I'm running this on OS X Mavericks.
PROGRAM almatrix
IMPLICIT NONE
INTEGER :: i,j,n
REAL,ALLOCATABLE :: a(:,:)
READ(*,*)n
ALLOCATE(a(n+1,n+1))
DO i=0,n
DO j=0,n
a(i,j)=0.0
END DO
END DO
DEALLOCATE(a)
END PROGRAM almatrix
I understood that instead of
ALLOCATE(a(n+1,n+1))
this
ALLOCATE(a(n+1,n+1),STAT=err)
IF(err /= 0) STOP
would prevent crashing. It didn't, however. Why?
I tried to look at similar problems, but so far they haven't helped.
I tried to compile with -Wall, -g, -fcheck=all, as suggested in another answer, but those didn't give me warnings.
I've also noticed before that, unlike with C, Fortran usually does not give bus errors when using small dynamic arrays and not deallocating them.
The problem isn't directly with the allocate statement, but with accessing the resulting array. [Note also that an array of size 181x181 is not "large".] As there is nothing wrong with the allocation, err will indeed be zero.
From that allocate one is left with an array a which has elements a(1,1), a(2,1), ..., a(n+1,1), ..., a(n+1,n+1). So, a(0,0) (the first access in the loop) is not valid.
There are two options: request that the array elements be a(0,0) to a(n,n) as the loop wants, or change the loop:
allocate(a(0:n,0:n))
or
do i=1,n+1
do j=1,n+1
a(j,i) = 0 ! Note I've swapped the loop order to the Fortran-friendly (column-major) one
end do
end do
Finally, those loops aren't even necessary:
allocate(a(0:n,0:n))
a = 0.
or even
allocate(a(0:n,0:n), source=0.)
if your compiler supports sourced allocation (Fortran 2003 or later).
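Putting both fixes together, a minimal corrected version of the original program (a sketch keeping the 0-based indexing and the STAT check) could look like:
PROGRAM almatrix
IMPLICIT NONE
INTEGER :: n, err
REAL, ALLOCATABLE :: a(:,:)
READ(*,*) n
! Ask for lower bounds of 0 so that a(0,0) .. a(n,n) all exist
ALLOCATE(a(0:n,0:n), STAT=err)
IF (err /= 0) STOP
a = 0.0
DEALLOCATE(a)
END PROGRAM almatrix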

Does Fortran modify input arguments across multiple subroutine calls?

I have a subroutine that is called several times in an if-then-goto loop. The subroutine has two input arguments:
1. a constant
2. an array of fixed size (i.e. 200x1) whose elements change in a do loop right before the subroutine is called.
The problem is that the subroutine doesn't see that change and returns the same results every time it is called (i.e. the results of the first call). It seems as if the values of all variables calculated inside the subroutine are somehow saved and do not change, although input argument no. 2 changes.
Is something wrong with my code? Is there a Fortran bug I'm not aware of?
My code looks like this:
PROGRAM calcul.f
REAL aa(100000),dd(100000),mm(100000),yy(100000),hh(100000),mn(100000),ss(100000),ml(100000),m0(100000)
INTEGER N,snv
DOUBLE PRECISION m0(100000),excerpt(1000),k1(100000),sqsum,s2,vk1,mk1,sdk1,v
filelength=610
W=200
OPEN (1,file='filename.dat')
DO i=1,filelength
READ (1,*) aa(i),dd(i),mm(i),yy(i),hh(i),mn(i),ss(i),ml(i),m0(i)
END D0
CLOSE (1)
10 FORMAT(g12.6)
11 FORMAT(I5,1x,g12.6)
c1=1
c2=W
snv=0
14 IF ((c2.LT.filelength).AND.(c1.LT.(filelength-(W-1)))) THEN
DO i=c1,c2
excerpt(i)=m0(i)
END DO
CALL calk1(W,excerpt)
OPEN (3,file='meank1.dat')
READ (3,*) N,mk1
OPEN (2,file='resultsk1.dat')
DO i=1,N
READ (2,*) k1(i)
END DO
sqsum=0.0d0
DO i=1,N
sqsum=dble(sqsum+((k1(i)-mk1)**2))
s2=sqsum
END DO
vk1=(s2)/N
sdk1=dsqrt(vk1)
OPEN (4,file='resultsv.dat')
v=dble(sdk1/mk1)
snv=snv+1
WRITE (4,11) snv,v
CLOSE (2)
CLOSE (3)
mk1=0.0d0
vk1=0.0d0
sdk1=0.0d0
v=0.0d0
c1=c1+1
c2=c2+1
GOTO 14
END IF
CLOSE (4)
END
My subroutine is:
SUBROUTINE calk1(winlength,sm)
DOUBLE PRECISION sm(100000),sumk1,sum,s,x,x2,k1,sk1,mk1
INTEGER snk1
OPEN (2,file='resultsk1.dat')
10 FORMAT (g12.6)
start=1
w=winlength
c3=6
snk1=0
sumk1=0.0d0
sk1=0.0d0
13 IF (c3.LE.w) THEN
l1=1
l2=c3
c=c3
12 IF ((l1.LE.(w-5)).AND.(l2.LE.w)) THEN
sum=0.0d0
s=0.0d0
DO k=l1,l2
sum=sum+sm(k)
s=sum
END DO
av=0.0d0
av2=0.0d0
x=0.0d0
x2=0.0d0
DO k=l1,l2
av=av+dble(((k)/(c))*(sm(k)/s))
av2=av2+dble((((k)/(c))**2)*(sm(k)/s))
x=av
x2=av2
END DO
k1=x2-((x)**2)
sumk1=sumk1+k1
snk1=snk1+1
WRITE (2,10) k1
l1=l1+1
l2=l2+1
k1=0.0d0
GOTO 12
ELSE
c3=c3+1
GOTO 13
END IF
END IF
CLOSE (2)
N=snk1
sk1=sumk1
mk1=dble((sk1)/N)
OPEN (3,file='meank1.dat')
WRITE (3,10) N,mk1
CLOSE (3)
RETURN
END
I haven't attempted to compile the code (I see there are still some errors that would upset a compiler), but I can suggest a problem.
However, the first thing to say is: if this is your code you'll make things much easier for yourself if you use much more modern Fortran features.
You say that excerpt changes each time before the subroutine is entered. This is true, but not in a meaningful way. Let's look at what is happening to the array.
This is all looped:
c1=1
c2=W
DO i=c1,c2
excerpt(i)=m0(i)
END DO
CALL calk1(W,excerpt)
c1=c1+1
c2=c2+1
Well, W isn't changed within an iteration. In the first iteration you are (using array syntax) setting excerpt(1:W)=m0(1:W); in the second, setting excerpt(2:W+1)=m0(2:W+1), and so on. That is: each time you call calk1, excerpt(1:W) is still exactly m0(1:W), which hasn't changed. The only change to excerpt is after the W-th element, which you suggest won't be used in the subroutine anyway.
As to what you should do instead with that excerpt setting loop, I can't say: it depends on what you want to happen. Perhaps
DO i=c1,c2
excerpt(i-c1+1) = m0(i)
END DO
?
But use modern Fortran instead.
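For instance (a sketch, assuming calk1 only ever reads the first W elements of excerpt), the copy loop and the call collapse to an array-section assignment:
excerpt(1:W) = m0(c1:c2)
CALL calk1(W, excerpt)
Since c2 = c1 + W - 1 throughout the surrounding loop, the section m0(c1:c2) always has exactly W elements, and each call now receives the current window.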
When I try to compile this I get numerous compiler warnings and errors. I suggest enabling your compiler's maximum compile-time warning options and cleaning up those problems. That will clear out some issues with minimal effort. With gfortran, try -O2 -ffixed-form -ffixed-line-length-none -W -Wall -pedantic -fimplicit-none -Wsurprising -Waliasing -Wimplicit-interface -Wunused-parameter -fcheck=all -fbacktrace.
Indenting your code will help you understand it. And why use FORTRAN 77 in 2014? Fortran 95/2003 is much easier to program in.

try catch or type conversion performance in julia - (Julia 73 seconds, Python 0.5 seconds)

I have been playing with Julia because it seems syntactically similar to Python (which I like) but claims to be faster. However, I tried writing a script similar to something I have in Python for testing which values in a text file are numerical, using this function:
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
For some reason, this takes a great deal of time for a text file with a reasonable number of rows of text (~500000).
Why would this be? Is there a better way to do this? What general feature of the language can I understand from this to apply to other languages?
Here are the two exact scripts I ran, with the times for reference:
python: ~0.5 seconds
import time
import numpy as np

def is_number(s):
try:
np.float64(s)
return True
except ValueError:
return False
start = time.time()
file_data = open('SMW100.asc').readlines()
file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)
bools = [(all(map(is_number, x)), x) for x in file_data]
print time.time() - start
julia: ~73.5 seconds
start = time()
function isFloat(s)
try:
float64(s)
return true
catch:
return false
end
end
x = map(x-> split(replace(x, ",", " ")), open(readlines, "SMW100.asc"))
u = [(all(map(isFloat, i)), i) for i in x]
print(start - time())
Note also that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.
Note also that the colons (:) after try and catch in your isFloat code are wrong in Julia (this is a Pythonism).
A much faster version of your code should be:
const isFloat2_out = [1.0]
isFloat2(s::String) = float64_isvalid(s, isFloat2_out)
function foo(L)
x = split(L, ",")
(all(isFloat2, x), x)
end
u = map(foo, open(readlines, "SMW100.asc"))
On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
This is an interesting performance problem that might be worth submitting to julia-users to get more focused feedback than SO will probably provide. At first glance, I think you're hitting problems because (1) try/catch is just slightly slow to begin with and then (2) you're using try/catch in a context where there's a very considerable amount of type uncertainty because of lots of function calls that don't return stable types. As a result, the Julia interpreter spends its time trying to figure out the types of objects rather than doing your computation. It's a bit hard to tell exactly where the big bottlenecks are because you're doing a lot of things that are not very idiomatic in Julia. Also, you seem to be doing your computations in the global scope, where Julia's compiler can't perform many meaningful optimizations due to additional type uncertainty.
Python is oddly ambiguous on the subject of whether using exceptions for control flow is good or bad. See Python using exceptions for control flow considered bad?. But even in Python, the consensus is that user code shouldn't use exceptions for control flow (although for some reason generators are allowed to do this). So basically, the simple answer is that you should not be doing that – exceptions are for exceptional situations, not for control flow. That is why almost zero effort has been put into making Julia's try/catch construct faster – you shouldn't be using it like that in the first place. Of course, we will probably get around to making it faster at some point.
That said, the onus is on us as the designers of Julia's standard library to make sure that we provide APIs that never force you to use exceptions for control flow. In this case, you need a function that allows you to try to parse something as a floating-point value and indicate whether that was possible or not – not by throwing an exception, but rather by returning normal values. We don't provide such an API, so this is ultimately a shortcoming of Julia's standard library – as it exists right now. I've opened an issue to discuss this API design question: https://github.com/JuliaLang/julia/issues/5704. We'll see how it pans out.

How to implement the behaviour of -time-passes in my own Jitter?

I am working on a Jitter (a JIT) based on LLVM. I have a real issue with performance. I have been reading a lot about this and I know it is a known problem in LLVM. However, I am wondering if there are other bottlenecks. Hence, I want to use in my Jitter the same mechanism offered by -time-passes, but save the result to a specific file. That way, I can do some simple math like:
real_execution_time = total_time - time_passes
I added the option to the command line, but it does not work:
// Disable branch fold for accurate line numbers.
llvm_argv[arrayIndex++] = "-disable-branch-fold";
llvm_argv[arrayIndex++] = "-stats";
llvm_argv[arrayIndex++] = "-time-passes";
llvm_argv[arrayIndex++] = "-info-output-file";
llvm_argv[arrayIndex++] = "pepe.txt";
cl::ParseCommandLineOptions(arrayIndex, const_cast<char**>(llvm_argv));
Any solution?
OK, I found the solution. I am publishing it because it may be useful for someone else.
Before any exit(code) in your program you must include a call to
llvm::llvm_shutdown();
This call flushes the information to the file.
My problem was:
1 - Other threads called exit without making the mentioned call.
2 - There is a handy struct, llvm::llvm_shutdown_obj, whose destructor calls the mentioned method. I had declared a variable in the main function as follows:
llvm::llvm_shutdown_obj X();
Everybody knows that the compiler should call the destructor, but in this case it was not happening. The reason is that this line does not actually define an object: C++ parses it as the declaration of a function X returning an llvm_shutdown_obj (the "most vexing parse"), so no object is ever constructed and no destructor runs. Declaring it without the parentheses, as llvm::llvm_shutdown_obj X;, fixes it.
No object => No destructor => No flush to the file
