Why do I keep getting a process abort signal even though I deallocate my allocatable array? - memory-management

I changed threads_list_all from a fixed-length array to an allocatable array. Since then, the program aborts with an error.
Here is the subroutine I modified:
subroutine management_16_tasks(tasklist_GRAD,ww,pas,cpt ,nb_element,cpt1,dt,dx,p_element,u_prime,u_prime_moins,u_prime_plus,&
&taux,grad_x_u,grad_t_u,grad_x_f,grad_t_f,ax_plus,ax_moins,ux_plus,ux_moins,sm,flux,tab0,tab)
INTEGER ::ff,pas
INTEGER,intent(inout)::cpt,cpt1,nb_element,ww
real(kind=REAL64) :: dt,dx
integer ,allocatable, dimension(:),intent(inout) ::p_element
REAL(KIND=REAL64) ,allocatable, dimension(:),intent(inout) :: u_prime,u_prime_moins, u_prime_plus,taux,grad_x_u,&
&grad_t_u,grad_t_f,grad_x_f,flux,sm
real(kind=REAL64),allocatable,dimension(:),intent(inout) :: ax_plus,ax_moins,ux_moins,ux_plus
REAL(KIND=REAL64) ,allocatable, dimension(:,:),intent(inout) ::tab0,tab
type(tcb)::self
!OpenMP variables
integer::num_thread,nthreads
integer, external :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
type(tcb),dimension(20)::tasklist_GRAD,tasks_ready_master
!Variables pour gestion des threads
integer,allocatable,dimension(:)::threads_list !liste contenant les nums des threads workers
integer,allocatable,dimension(:)::threads_list_all !liste contenant les nums des threads workers dans l'ordre selon les tâches
integer,dimension(16)::threads_list_part3 ! le reste des tâches
!=======================================================================================================================================================
!$OMP PARALLEL PRIVATE(num_thread,threads_list,threads_list_all) &
!$OMP SHARED(tasklist_GRAD,tasks_ready_master) &
!$OMP SHARED(threads_list_part3,nthreads)
num_thread=OMP_GET_THREAD_NUM() ! le rang du thread
nthreads=OMP_GET_NUM_THREADS() ! le nombre de threads
!Thread Application Master (numero 1)
if (num_thread==1) then
do ff=5,20 ! 16 tâches
if (associated(tasklist_GRAD(ff)%f_ptr) .eqv. .true.) then
tasks_ready_master(ff) = tasklist_GRAD(ff)
tasks_ready_master(ff)%state=STATE_READY
end if
end do
end if
!$OMP BARRIER
!Thread Master (numero 0)
if (num_thread==0) then
allocate(threads_list(nthreads-2)) ! liste des threads workers
do ff=1,nthreads-2
threads_list(ff)=ff+1
end do
do ff=5,20,nthreads-2
if (tasks_ready_master(ff)%state==STATE_READY) then
if (.not. allocated(threads_list_all)) allocate(threads_list_all(nthreads-2))
threads_list_all(ff-4:ff+nthreads-7)=threads_list(:)
end if
end do
threads_list_part3=threads_list_all(1:16) ! 16 tâches
deallocate(threads_list_all)
deallocate(threads_list)
end if
!$OMP BARRIER
!Threads workers
do ff=5,20
if (num_thread==threads_list_part3(ff-4)) then
call tasks_ready_master(ff)%f_ptr(self,ww,pas,cpt ,nb_element,cpt1,dt,dx,p_element,u_prime,u_prime_moins,u_prime_plus,&
&taux,grad_x_u,grad_t_u,grad_x_f,grad_t_f,ax_plus,ax_moins,ux_plus,ux_moins,sm,flux,tab0,tab)
tasks_ready_master(ff)%state=STATE_RUNNING
end if
!$OMP BARRIER
end do
!Thread Master (numero 0)
if (num_thread==0) then
do ff=5,20
if (tasks_ready_master(ff)%state==STATE_RUNNING) then
tasklist_GRAD(ff)%state=STATE_RUNNING
end if
end do
end if
!$OMP END PARALLEL
end subroutine management_16_tasks
I allocated it using allocate(threads_list_all(nthreads-2)) and deallocated it using deallocate(threads_list_all).
You can find the allocation line in the part under the comment !Thread Master (numero 0).
The error I get is:
malloc(): corrupted top size
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7ff01ae1e700 in ???
#1 0x7ff01ae1d8a5 in ???
#2 0x7ff01aabd20f in ???
#3 0x7ff01aabd18b in ???
#4 0x7ff01aa9c858 in ???
#5 0x7ff01ab073ed in ???
#6 0x7ff01ab0f47b in ???
#7 0x7ff01ab12839 in ???
#8 0x7ff01ab15d14 in ???
#9 0x7ff01ae1dbf6 in ???
#10 0x7ff01b058547 in ???
#11 0x7ff01b05879f in ???
#12 0x7ff01b04fa38 in ???
#13 0x55df8b75a313 in hecese
at /home/hakim/stage_hecese_HPC/OpenMP/OpenMP_hecese_OpenMP.f90:1244
#14 0x55df8b7521fe in main
at /home/hakim/stage_hecese_HPC/OpenMP/OpenMP_hecese_OpenMP.f90:804
Abandon (core dumped)
The full code is on https://gitlab-sds.insa-cvl.fr/houeslat/stage_hecese/-/blob/master/OpenMP/OpenMP_hecese_OpenMP.f90.

Related

OpenMPI IPC performance is worse than reading/writing to file

I am trying out various ways of IPC to do the following:
Master starts.
Master starts a slave.
Master passes an array to slave.
Slave processes the array.
Slave sends the array back to master.
I have tried using OpenMPI to solve this by having the parent process spawn a child which in turn does the aforementioned processing. However, I have also tried - what I thought would be the worst possible way to do this - letting master write the data to a file and have slave read and write back to that file. The result is stunning.
Below are the two ways I achieve this. The first is the "file" way, the second uses OpenMPI.
Master.f90
program master
implicit none
integer*4, dimension (10000) :: matrix
integer :: length, i, exitstatus, cmdstatus
logical :: waistatus
! put integers in matrix and output data into a file
open(1, file='matrixdata.dat', status='new')
length = 10000
do i=1,length
matrix(i) = i
write(1,*) matrix(i)
end do
close(1)
call execute_command_line("./slave.out", wait = .true., exitstat=exitstatus)
if(exitstatus .eq. 0) then
! open and read the file changed by subroutine slave
open(1, file= 'matrixdata.dat', status='old')
do i = 1, length
read(1,*) matrix(i)
end do
close(1)
endif
end program master
Slave.f90
program slave
implicit none
integer*4, dimension (10000) :: matrix
integer :: length, i
! Open and read the file made by master into a matrix
open (1, file= 'matrixdata.dat', status = 'old')
length = 10000
do i = 1, length
read(1,*) matrix(i)
end do
close(1)
! Square all numbers and write over the file with new data
open(1, file= 'matrixdata.dat', status = 'old')
do i=1,length
matrix(i) = matrix(i)**2
write(1,*) matrix(i)
end do
close(1)
end program slave
* OpenMPI *
Master.f90
program master
use mpi
implicit none
integer :: ierr, num_procs, my_id, intercomm, i, siz, array(10000000), s_tag, s_dest, siffra
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_id, ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
siz = 10000
!print *, "S.Rank =", my_id
!print *, "S.Size =", num_procs
if (.not. (ierr .eq. 0)) then
print*, "S.Unable to initilaize bös!"
stop
endif
do i=1,size(array)
array(i) = 2
enddo
if (my_id .eq. 0) then
call MPI_Comm_spawn("./slave.out", MPI_ARGV_NULL, 1, MPI_INFO_NULL, my_id, &
& MPI_COMM_WORLD, intercomm, MPI_ERRCODES_IGNORE, ierr)
s_dest = 0 !rank of destination (integer)
s_tag = 1 !message tag (integer)
call MPI_Send(array(1), siz, MPI_INTEGER, s_dest, s_tag, intercomm, ierr)
call MPI_Recv(array(1), siz, MPI_INTEGER, s_dest, s_tag, intercomm, MPI_STATUS_IGNORE, ierr)
!do i=1,10
! print *, "S.Array(",i,"): ", array(i)
!enddo
endif
call MPI_Finalize(ierr)
end program master
Slave.f90
program name
use mpi
implicit none
! type declaration statements
integer :: ierr, parent, my_id, n_procs, i, siz, array(10000000), ctag, csource, intercomm, siffra
logical :: flag
siz = 10000
! executable statements
call MPI_Init(ierr)
call MPI_Initialized(flag, ierr)
call MPI_Comm_get_parent(parent, ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, my_id, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, n_procs, ierr)
csource = 0 !rank of source
ctag = 1 !message tag
call MPI_Recv(array(1), siz, MPI_INTEGER, csource, ctag, parent, MPI_STATUS_IGNORE, ierr)
!do i=1,10
! print *, "C.Array(",i,"): ", array(i)
!enddo
do i=1,size(array)
array(i) = array(i)**2
enddo
!do i=1,10
! print *, "C.Array(",i,"): ", array(i)
!enddo
call MPI_Send(array(1), siz, MPI_INTEGER, csource, ctag, parent, ierr)
call MPI_Finalize(ierr)
end program name
Now, the interesting part: measured with the time program, the "file" version takes 19.8 ms to execute, while the OpenMPI version takes 60 ms. Why? Is there really so much overhead in OpenMPI that it is faster to read/write to a file if you're working with <400 KiB?
I tried increasing the array to 10^5 integers: the file version executes in 114 ms, OpenMPI in 53 ms. With 10^6 integers the file version takes 1103 ms and OpenMPI 77 ms.
Is the overhead really that much?
Fundamentally, it doesn't make sense to use distributed processing for problem sizes that fit in cache (except in some trivially parallel cases). The typical usage scenario is data transfer much larger than the last-level cache (LLC). Even your biggest case (10^6 integers) fits in modern caches.
First, for the method of writing to disk, you have to be aware of the influence of the page cache in your operating system. If your MPI processes are on the same chip, the operating system just hears 'do a write' then 'do a read'. If nothing pollutes the page cache in the interim, it will just fetch the data from RAM as opposed to the disk. A better experiment would be to flush the page cache between the write and the read (this is possible, at least on Linux, via a shell command). In effect you are performing shared-memory processing if you're grabbing the data from the page cache.
Also, you are using time on the command line, so you're including the time it takes for MPI to initialize and establish communication interfaces with a few function calls. This is not a good benchmark, because the interface for the disk I/O method has already been initialized by the operating system. For such a small problem size, the initialization of MPI is nontrivial compared to the runtime of the body of the program. The proper way to do this is to do the timing in the code.
For both methods, you should expect linear scaling biased by the overhead of the method. In fact, you should see a few regimes as the data size surpasses the LLC and the page cache. The best way to do this is to repeat your runs with ARRAY_SIZE=2^n for n=12,13,...,24 and check out the curve.
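As a rough sketch of timing inside the code rather than with time, the hypothetical ping-pong program below measures only the send/receive round trip with MPI_Wtime and repeats it for array sizes 2^n, n = 12..24. It is not the program from the question: it assumes two ranks started together (for example with mpirun -np 2) instead of MPI_Comm_spawn, and the names pingpong, nelem and reps, as well as the add-one stand-in for the slave's work, are made up for illustration.

```fortran
! Hypothetical benchmark, assumed to be launched with exactly two ranks.
program pingpong
  use mpi
  implicit none
  integer, parameter :: n_min = 12, n_max = 24, reps = 20
  integer :: ierr, my_id, num_procs, n, r, nelem
  integer, allocatable :: buf(:)
  double precision :: t0, t1

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, my_id, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, num_procs, ierr)
  if (num_procs /= 2) then
    if (my_id == 0) print *, "run with exactly 2 ranks"
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if

  do n = n_min, n_max
    nelem = 2**n
    allocate(buf(nelem))
    buf = 2

    ! Start both ranks together so MPI start-up cost is not measured.
    call MPI_Barrier(MPI_COMM_WORLD, ierr)
    t0 = MPI_Wtime()
    do r = 1, reps
      if (my_id == 0) then
        call MPI_Send(buf, nelem, MPI_INTEGER, 1, 1, MPI_COMM_WORLD, ierr)
        call MPI_Recv(buf, nelem, MPI_INTEGER, 1, 1, MPI_COMM_WORLD, &
                      MPI_STATUS_IGNORE, ierr)
      else
        call MPI_Recv(buf, nelem, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, &
                      MPI_STATUS_IGNORE, ierr)
        buf = buf + 1   ! stand-in for the slave's processing
        call MPI_Send(buf, nelem, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, ierr)
      end if
    end do
    t1 = MPI_Wtime()

    ! Average round-trip time in seconds for this array size.
    if (my_id == 0) print '(i10, es14.5)', nelem, (t1 - t0) / reps
    deallocate(buf)
  end do

  call MPI_Finalize(ierr)
end program pingpong
```

Plotting the printed times against the array size should make the regimes mentioned above visible; the file-based variant could be timed the same way, for instance with system_clock around the write and read loops.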

Parallelization of an OpenMP nested do loop

I have a nested do loop in an OpenMP Fortran 77 code that I am unable to parallelize (the code gives a segmentation fault when it is run). I have a very similar nested do loop in a different subroutine of the same code that runs in parallel with no issues.
Here is the nested do loop that I am having problems with:
do n=1,num_p
C$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(l,i1,i2,j1,j2,k1,k2
C$OMP& ,i,j,k,i_t,j_t,i_ddf,j_ddf,ddf_dum)
do l=1,n_l(n)
call del_fn(l,n)
i1=p_iw(l,n)
i2=p_ie(l,n)
j1=p_js(l,n)
j2=p_jn(l,n)
k1=p_kb(l,n)
k2=p_kt(l,n)
do i=i1,i2
i_ddf=i-i1+1
if(i .lt. 1) then
i_t=nx+i
elseif (i .gt. nx) then
i_t=i-nx
else
i_t=i
endif
do j=j1,j2
j_ddf=j-j1+1
if(j .lt.1) then
j_t=ny+j
elseif(j .gt. ny) then
j_t=j-ny
else
j_t=j
endif
do k=k1,k2
ddf(l,n,i_ddf,j_ddf,k-k1+1) = ddf_dum(i_t,j_t,k)
enddo
enddo
enddo
enddo
C$OMP END PARALLEL DO
enddo
I have narrowed the problem down to ddf_dum(i_t,j_t,k). When this term is turned off (say I replace it by 0.d0), the code runs fine.
On the other hand, the very similar nested do loop below runs in parallel with no issues. Can anyone please identify what I am missing here?
do n=1,1
C$OMP PARALLEL DO DEFAULT(SHARED), PRIVATE(l,i1,i2,j1,j2,k1,k2
C$OMP& ,i,j,k,i_f,j_f,i_ddf,j_ddf)
do l=1,n_l(n)
i1=p_iw(l,n)
i2=p_ie(l,n)
j1=p_js(l,n)
j2=p_jn(l,n)
k1=p_kb(l,n)
k2=p_kt(l,n)
u_forcing(l,n)= (u_p(l,n)-up_tilde(l,n))/dt
v_forcing(l,n)= (v_p(l,n)-vp_tilde(l,n))/dt
w_forcing(l,n)= (w_p(l,n)-wp_tilde(l,n))/dt
do i=i1,i2
i_ddf=i-i1+1
if(i .lt. 1) then
i_f=nx+i
elseif (i .gt. nx) then
i_f=i-nx
else
i_f=i
endif
do j=j1,j2
j_ddf=j-j1+1
if(j .lt.1) then
j_f=ny+j
elseif(j .gt. ny) then
j_f=j-ny
else
j_f=j
endif
do k=k1,k2
forcing_x(i_f,j_f,k)=forcing_x(i_f,j_f,k)+u_forcing(l,n)
& *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
forcing_y(i_f,j_f,k)=forcing_y(i_f,j_f,k)+v_forcing(l,n)
& *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
forcing_z(i_f,j_f,k)=forcing_z(i_f,j_f,k)+w_forcing(l,n)
& *ddf_n(l,n,i_ddf,j_ddf,k-k1+1)*dv_l(l,n)
enddo
enddo
enddo
enddo
C$OMP END PARALLEL DO
enddo
As you noted, your problem is ddf_dum. It should be a shared variable, not private, because inside the loop it is only read from and never written to. A PRIVATE variable gets a fresh, uninitialized copy in every thread (the master included), so the segfault most likely comes from the threads working on those uninitialized copies instead of the array that was filled in beforehand.
A good rule of thumb that you could have used to find this mistake yourself: variables that appear only on the right-hand side of assignments within your parallel region should generally be shared.
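To illustrate the rule of thumb, here is a minimal sketch in free-form Fortran. The names shared_lookup, src, dst and i_t are made up for illustration and are not the arrays from the question (which is fixed-form Fortran 77 and uses the C$OMP sentinel). The array that is only read stays shared; only the loop index and the scratch index are private.

```fortran
program shared_lookup
  implicit none
  integer, parameter :: n = 1000
  double precision :: src(n), dst(n)
  integer :: l, i_t

  ! Filled once, before the parallel region, by a single thread.
  do l = 1, n
    src(l) = dble(l)
  end do

  ! src is only read inside the loop, so it is left SHARED (through
  ! DEFAULT(SHARED)). Listing it in PRIVATE would hand every thread
  ! an uninitialized copy instead of the values set above.
  !$omp parallel do default(shared) private(l, i_t)
  do l = 1, n
    i_t = n - l + 1        ! per-thread scratch index, hence private
    dst(l) = src(i_t)      ! each iteration writes a distinct element
  end do
  !$omp end parallel do

  ! dst(1) now holds src(n) and dst(n) holds src(1).
  print *, dst(1), dst(n)
end program shared_lookup
```

Built with OpenMP enabled (for example gfortran -fopenmp) the loop runs in parallel; without it the directives are plain comments and the program gives the same result serially, which is a handy way to check a data-sharing clause.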

MPI_TEST: invalid mpi_request

I want to test whether my MPI_ISEND and MPI_IRECV calls have completed.
I have two request vectors: one for all the MPI_ISEND calls, the other for all the MPI_IRECV calls.
The program runs fine until it reaches the loop that calls MPI_TEST. I have even tried with only two numbers (do i=1,2), and I still get the same error:
Fatal error in PMPI_Test: Invalid MPI_Request, error stack:
PMPI_Test(166): MPI_Test(request=0x7fff93fd2220, flag=0x7fff93fd1ffc, status=0x7fff93fd2890) failed
PMPI_Test(121): Invalid MPI_Request
INTEGER :: ierr, myid, istatus(MPI_STATUS_SIZE), num, i, n
INTEGER,parameter :: seed = 86456, numbers=200
INTEGER :: req1(numbers), req2(numbers)
LOGICAL :: flag
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
IF (myid==0) THEN
DO n=1, numbers
req1(n)=0
req2(n)=0
num=IRAND()
CALL MPI_ISEND(num,1,MPI_INTEGER,1,1,MPI_COMM_WORLD,req1(n),ierr)
CALL MPI_IRECV(best_prime,1,MPI_INTEGER,1,0,MPI_COMM_WORLD,req2(n),ierr)
END DO
ELSE IF (myid==1) THEN
DO i=1, numbers
CALL MPI_TEST(req2(i),flag,istatus,ierr)
IF (flag .eqv. .false.) THEN
WRITE(*,*)'RECV',i,'non-blocking FAIL'
ELSE IF (flag .eqv. .true.) THEN
WRITE(*,*)'RECV',i,'non-blocking SUCCESS'
END IF
END DO
END IF

Fortran issue with undefined statements in subroutine

I have a bunch of errors saying that my variables are not variables and that my statements are undefined. This subroutine is needed to compile my larger program, so I just copy-pasted it at the end of my program. It didn't work either when I kept them in separate files and compiled them together.
How would I go about fixing these "undefined" issues? Is it because of the way I copy-pasted my subroutine at the end of my program?
(I compiled with g77 and with gfortran, same thing happens)
geomalb.f:1088.6:
TEMP(J)= TLINAL(J)
1
Error: Unclassifiable statement at (1)
geomalb.f:1089.6:
DEN(J)= DLINAL(J)
1
Error: Unclassifiable statement at (1)
geomalb.f:1090.6:
PRESS(J)=PLINAL(J)
1
Error: Unclassifiable statement at (1)
geomalb.f:1110.6:
XMU(J)=28.0134*XN2(J)+2.158*H2(J)+16.0426*CH4(J)+39.948*AR(J)
1
Error: Unclassifiable statement at (1)
geomalb.f:1132.6:
1001 IF (IPRINT .LT. 0) RETURN
1
Error: Bad continuation line at (1)
geomalb.f:1152.72:
ENDDO
1
Error: Statement label in ENDDO at (1) doesn't match DO label
geomalb.f:1130.72:
IF (DT .LT. 0.001) GO TO 1001
1
Error: Label 1001 referenced at (1) is never defined
geomalb.f:72.27:
CALL ATMSETUP(NLEVEL,Z,RHCH4,FH2,FARGON,TEMP,PRESS,DEN,XMU,
1
Warning: Rank mismatch in argument 'z' at (1) (scalar and rank-1)
Here is the subroutine part of the program:
SUBROUTINE ATMSETUP(NLEVEL,Z,RHCH4,FH2,FARGON,TEMP,PRESS,DEN,XMU,
& CH4,H2,XN2,AR,IPRINT)
PARAMETER (NMAX=201)
DIMENSION CH4(1),H2(1),XN2(1),AR(1)
DIMENSION TLINAL(NMAX),DLINAL(NMAX),PLINAL(NMAX)
CALL LINDAL(NLEVEL,Z,TLINAL,DLINAL,PLINAL)
DO J=1,NLEVEL
TEMP(J)= TLINAL(J)
DEN(J)= DLINAL(J)
PRESS(J)=PLINAL(J)
ENDDO
DO 1000 ITS =1,20
CH4(NLEVEL)=PCH4(TEMP(NLEVEL))*RHCH4/PRESS(NLEVEL)
DO 134 J=NLEVEL-1,1,-1
CH4SAT=PCH4(TEMP(J))/PRESS(J)
CH4(J)=AMIN1(CH4SAT,CH4(NLEVEL),CH4(J+1))
134 CONTINUE
DO 20 J=1,NLEVEL
H2(J)=FH2
IF (FARGON .LT. 0.) THEN
AR(J)=(-FARGON-28.0134+25.8554*H2(J)+11.9708*CH4(J))/11.9346
ELSE
IF (FARGON .EQ. 0.) THEN
AR(J)=0.0
ELSE
AR(J)=FARGON
ENDIF
ENDIF
XN2(J)=1.0 - H2(J) - CH4(J) -AR(J)
XMU(J)=28.0134*XN2(J)+2.158*H2(J)+16.0426*CH4(J)+39.948*AR(J)
20 CONTINUE
SUMT=PLINAL(1)*6.02E23/10.
SUMB=SUMT
TLAST=TEMP(NLEVEL)
DO J=2,NLEVEL
DENF=294.1/(XN2(J)*294.1 + CH4(J)*410. + H2(J)*136. + AR(J)*277.8)
DEN(J) = DLINAL(J)*DENF
ADEN=(DEN(J)-DEN(J-1))/ALOG(DEN(J)/DEN(J-1))
SUMT=SUMT+(EFFG(Z(J))*ADEN)*( Z(J-1)-Z(J))*XMU(J)
ADEN=(DLINAL(J)-DLINAL(J-1))/ALOG(DLINAL(J)/DLINAL(J-1))
SUMB=SUMB+(EFFG(Z(J))*ADEN)*( Z(J-1)-Z(J))*28.01340
PRESS(J)=PLINAL(J)*SUMT/SUMB
TEMP(J) =TLINAL(J)*(SUMT/SUMB)*(1./DENF)
ENDDO
30 CONTINUE
DT= ABS(TEMP(NLEVEL)-TLAST)
IF (DT .LT. 0.001) GO TO 1001
1000 CONTINUE
1001 IF (IPRINT .LT. 0) RETURN
WRITE (6,139)RHCH4,FH2,FARGON,DT
DO 135 J=1,NLEVEL-1
WRITE(6,140)J,Z(J),PRESS(J),DEN(J),TEMP(J),
& CH4(J)*PRESS(J)/PCH4(TEMP(J))
& ,CH4(J)*100.,XN2(J)*100.,H2(J)*100.,AR(J)*100.,XMU(J)
& ,(TEMP(J+1)-TEMP(J))/(Z(J+1)-Z(J))
135 CONTINUE
J=NLEVEL
WRITE(6,140)J,Z(J),PRESS(J),DEN(J),TEMP(J),
& CH4(J)*PRESS(J)/PCH4(TEMP(J))
& ,CH4(J)*100.,XN2(J)*100.,H2(J)*100.,AR(J)*100.,XMU(J)
139 FORMAT(///' BACKGROUNG ATMOSPHERE AT LEVELS'/
& ' SURFACE HUMIDITY OF CH4:',F5.3,' H2 MIXING RATIO:',F6.4,
& ' ARGON SETTING:',F8.4/' FINAL CONVERGENCE ON TEMP:',F10.5
& ' LINDAL ET AL SCALING'/
&' LVL ALTITUDE P(BARS) DEN(CM-3) TEMP RH-CH4'
& , ' %CH4 %N2 %H2 %AR MU DT/DZ' )
140 FORMAT(1X,I3,F8.3,1P2E10.3,0PF7.2,F5.2,2F6.2,2F5.2,4F6.2)
RETURN
ENDDO
END
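You will need a
DIMENSION DEN(1), PRESS(1), TEMP(1)
statement in the subroutine. Otherwise the compiler does not know that these dummy arguments are arrays and cannot classify assignments such as TEMP(J)= TLINAL(J). XMU and Z are indexed in the subroutine too, so they need the same declaration; that accounts for the error on the XMU(J)= line and the rank-mismatch warning for Z. The ENDDO just before END has no matching DO and should also be removed.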
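A minimal sketch of the idea, with made-up names (DEMO, COPYT, T, TLIN) rather than the routines from the question: the dummy arguments are dimensioned inside the subroutine, here with the assumed-size form (*) instead of the older (1) idiom. Statements start in column 7 and there are no continuation lines, so the snippet should be accepted as either fixed-form or free-form source.

```fortran
      PROGRAM DEMO
      PARAMETER (NMAX=201)
      DIMENSION T(NMAX), TLIN(NMAX)
      DO J = 1, NMAX
         TLIN(J) = 100.0 + J
      ENDDO
      CALL COPYT(NMAX, T, TLIN)
      PRINT *, T(1), T(NMAX)
      END

      SUBROUTINE COPYT(NLEVEL, TEMP, TLINAL)
!     Without this line TEMP and TLINAL would be implicit REAL
!     scalars, and TEMP(J) = ... could not be read as an array
!     assignment.
      DIMENSION TEMP(*), TLINAL(*)
      DO J = 1, NLEVEL
         TEMP(J) = TLINAL(J)
      ENDDO
      RETURN
      END
```

Deleting the DIMENSION line in COPYT should bring back the same kind of "Unclassifiable statement" error on the TEMP(J) assignment.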

Ruby program halted in gdb fails to resume after writing to any IO using rb_eval_string

I'm using gdb to try to track down a memory leak in a ruby program.
I'm trying to print some debug data (or write it to a file), and it appears that any time anything is printed to any IO, the program fails to resume after I detach.
The simple test case is a program as follows:
#!/usr/bin/env ruby
counter = 0
loop do
  puts "#{counter}\n"
  counter += 1
  sleep 1
end
I then attach gdb to the process via gdb -p PID and run p rb_eval_string("$stderr.puts(\"hi\\n\")") on the gdb console. This causes the failure to resume after detaching. If I run p rb_eval_string("a = 1") and detach, the ruby process resumes as normal.
The same problem happens if I attempt to write to a file or $stdout instead.
The backtrace of the program when it fails to resume looks like this:
(gdb) bt
#0 0x00007fcfb6617414 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x0000000000543773 in native_cond_wait (mutex=0x15fcf30, cond=<optimized out>) at thread_pthread.c:309
#2 gvl_acquire_common (vm=0x15fcf20) at thread_pthread.c:64
#3 gvl_acquire (th=0x15fd520, vm=0x15fcf20) at thread_pthread.c:82
#4 native_sleep (timeout_tv=<synthetic pointer>, th=0x15fd520) at thread_pthread.c:918
I'm using gdb version 7.7 on Ubuntu Trusty with ruby 1.9.3.
Can anyone suggest how to get the program to resume?
Thanks!
I don't understand yet what's going on, but it might be because of Ruby's sleep and the way rb_thread_sleep uses native_sleep (I can reproduce your issue locally).
Instead of using sleep, I wanted a method expensive enough to take almost a second per call, so I decided to just use an unoptimized Fibonacci :-)
Here's the code:
# fibo.rb
def fibonacci(n)
  return n if (0..1).include?(n)
  fibonacci(n-1) + fibonacci(n-2)
end
i = 0
loop do
  fibonacci(30)
  i += 1
  puts "#### i = #{i}"
end
Now you should be able to evaluate Ruby code from the debugger and resume (note that I'm using lldb instead of gdb, but it's almost the same thing :-)):
➜ ~ lldb ruby
Current executable set to 'ruby' (x86_64).
(lldb) r foo.rb
Process 41134 launched: '/Users/xxx/.rubies/ruby-2.1.0/bin/ruby' (x86_64)
#### i = 1
Process 41134 stopped
* thread #1: tid = 0x72502, 0x00000001000c4820 ruby`range_include + 256, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00000001000c4820 ruby`range_include + 256
ruby`range_include + 256:
-> 0x1000c4820: movq 0x16a691(%rip), %rsi ; id_cmp
0x1000c4827: movq %rbx, %rdi
0x1000c482a: movl $0x1, %edx
0x1000c482f: movq %r12, %rcx
(lldb) call (void)rb_eval_string("puts 42")
42
(lldb) c
Process 41134 resuming
#### i = 2
#### i = 3
...
