Related
I am trying to implement qsort algorithm in Fortran.
The implemented qsort is intended to operate over an array of a derived type which contains also another derived type.
The derived types are defined in a separate module as:
MODULE DATA_MODEL
! -------------------
! CONSTANTS
! -------------------
integer,parameter :: max_records = 100000000
type :: timestamp
integer :: year
integer :: month
integer :: day
integer :: hour
integer :: minute
integer :: second
end type
type :: tape
type(timestamp) :: ts
integer :: value1
integer :: value2
end type
END MODULE
This is what I have tried to implement the quicksort algorithm.
! DESCRIPTION:
! THIS MODULE IMPLEMENTS QSORT ALGORITH USING LOMUTO PARTITION SCHEME
! PSEUDOCODE:
! ALGORITHM QUICKSORT(A, LO, HI) IS
! IF LO < HI THEN
! P := PARTITION(A, LO, HI)
! QUICKSORT(A, LO, P - 1)
! QUICKSORT(A, P + 1, HI)
!
! ALGORITHM PARTITION(A, LO, HI) IS
! PIVOT := A[HI]
! I := LO
! FOR J := LO TO HI DO
! IF A[J] < PIVOT THEN
! SWAP A[I] WITH A[J]
! I := I + 1
! SWAP A[I] WITH A[HI]
! RETURN I
!
! SORTING THE ENTIRE ARRAY IS ACCOMPLOMISHED BY QUICKSORT(A, 0, LENGTH(A) - 1).
module qsort
use data_model
contains
subroutine quicksort(a, lo, hi)
implicit none
! SUBROUTINE PARAMETERS
type(tape),allocatable,intent(in out) :: a
integer,intent(in) :: lo, hi
! ALGORITHM INTERNAL VARIABLES
integer :: p
if (lo < hi) then
call partition(a, lo, hi, p)
call quicksort(a, lo, p - 1)
call quicksort(a, p + 1, hi)
end if
end subroutine
subroutine partition(a, lo, hi, p)
implicit none
! SUBROUTINE PARAMETERS
type(tape),allocatable,intent(inout) :: a
integer,intent(in) :: lo
integer,intent(in) :: hi
integer,intent(out) :: p
! ALGORITHM INTERNAL VARIABLES
type(tape) :: pivot
type(tape) :: swap
integer :: i,j
pivot = a(hi)
i = lo
do j = lo, hi
if (compare(a(j), pivot)) then
swap = a(i)
a(i) = a(j)
a(j) = swap
i = i + 1
endif
end do
swap = a(i)
a(i) = a(hi)
a(hi) = swap
p = i
end subroutine
function compare(a,b)
implicit none
! FUNCTION PARAMETERS
type(tape) :: a
type(tape) :: b
logical :: compare
if (a%ts%year < b%ts%year) then
compare = .true.
else if (a%ts%year > a%ts%year) then
compare = .false.
else if (a%ts%month < b%ts%month) then
compare = .true.
else if (a%ts%month > b%ts%month) then
compare = .false.
else if (a%ts%day < b%ts%day) then
compare = .true.
else if (a%ts%day > b%ts%day) then
compare = .false.
else if (a%ts%hour < b%ts%hour) then
compare = .true.
else if (a%ts%hour > b%ts%hour) then
compare = .false.
else if (a%ts%minute < b%ts%minute) then
compare = .true.
else if (a%ts%minute > b%ts%minute) then
compare = .false.
else if (a%ts%second < b%ts%second) then
compare = .true.
else if (a%ts%second > b%ts%second) then
compare = .false.
else
compare = .false.
end if
end function
end module
This is the errors I get while trying to compile it:
$ flang -c data_model.f95
$ flang -c qsort.f95
F90-S-0072-Assignment operation illegal to external procedure a (qsort.f95: 79)
F90-S-0076-Subscripts specified for non-array variable a (qsort.f95: 80)
F90-S-0076-Subscripts specified for non-array variable a (qsort.f95: 84)
F90-S-0076-Subscripts specified for non-array variable a (qsort.f95: 85)
F90-S-0076-Subscripts specified for non-array variable a (qsort.f95: 85)
F90-S-0076-Subscripts specified for non-array variable a (qsort.f95: 86)
0 inform, 0 warnings, 6 severes, 0 fatal for partition
$
Edit 1: I have modified the source code with the subroutine based code, which makes more sense as we want to modify the arguments.
Edit 2: modifying the definition of a to type(tape),intent(in out) :: a(:) in both quicksort and partition subroutines make the module to compile without errors – see comments.
I saw that you got unblocked with your problem with the help of the comments, but let me give you some suggestions to make your implementation more modular, easy to use and modern.
Disclaimer: Some of my suggestions might need a more recent Fortran version than 95.
You can improve your timestamp type definition by providing overloads for the relational operators.
type :: timestamp
integer :: year, month, day, hour = 0, minute = 0, second = 0
contains
procedure, private :: eq, ne, gt, ge, lt, le
generic :: operator(==) => eq
generic :: operator(/=) => ne
generic :: operator(>) => gt
generic :: operator(>=) => ge
generic :: operator(<) => lt
generic :: operator(<=) => le
end type
(A subtle change there is that I have default values for hour, minute and second. So you can instantiate like this: timestamp(2021,5,22))
To get this working, you just need to provide implementations for the functions eq, ne, gt, ge, lt, le available in the module you define your type. Note that, when writing a generic type bound procedure, you must declare your bound parameter as class(timestamp) instead of type(timestamp).
elemental function lt(left, right) result(result)
class(timestamp), intent(in) :: left, right
logical :: result
result = compare(left, right) < 0
end function
elemental function compare(this, other) result(result)
class(timestamp), intent(in) :: this, other
integer :: result
if (this%year /= other%year) then
result = sign(1, this%year - other%year)
else if (this%month /= other%month) then
result = sign(1, this%month - other%month)
else if (this%day /= other%day) then
result = sign(1, this%day - other%day)
else if (this%hour /= other%hour) then
result = sign(1, this%hour - other%hour)
else if (this%minute /= other%minute) then
result = sign(1, this%minute - other%minute)
else if (this%second /= other%second) then
result = sign(1, this%second - other%second)
else
result = 0
end if
end function
Another good practice you can implement is to control access of your module elements by using public and private.
module data_model
implicit none
public :: timestamp, tape
private
type :: timestamp
! (...)
end type
type :: tape
type(timestamp) :: ts
integer :: value1, value2
end type
contains
! (...) implementations of eq, ne, gt, ge, lt, le
end
Then, when you use this module from another program unit, only the public names will be available. You can also use only specific name with the use only clause:
module qsort
use data_model, only: tape
implicit none
public :: quicksort
private
contains
! (...) your quicksort implementation
end
Finally, let me suggest some tweaks on your quicksort implementation.
First, I suggest that you don't need to pass around the boundaries lo and hi everywhere together with your array. One of the most distinctive features of Fortran is how easy it is to operate on array segments. You can call the quicksort procedure on a contiguous portion of your array, and the procedure can work on it in a boundaries-agnostic way, if you use assumed-shape arrays, like this: type(tape) :: a(:). Inside the procedure, the array segment is rebounded to start on index 1, no matter what are the bounds on the call site.
Besides that, as I mentioned in the comments, you don't need to declare the array argument as allocatable in this case. Even if the original array you are passing is originally allocatable, you can pass an allocatable array to a procedure without declaring the argument as allocatable in the procedure, it will be handled as a normal array. It only makes sense to declare the argument as allocatable if you plan to allocate/deallocate inside the procedure.
pure recursive subroutine quicksort(a)
type(tape), intent(inout) :: a(:)
integer :: p
if (size(a) == 0) return
call partition(a, p)
call quicksort(a(:p-1))
call quicksort(a(p+1:))
end
I declared this procedure as pure in this case, but that would depend on your specific use case. Making it pure helps me to remind declaring intents correctly and have well-though procedures (and there is a performance gain in some cases), but this brings many restrictions (like not being able to print inside the procedure). You can search for pure procedures to learn more.
Both quicksort and partition are implemented as subroutines here. I like to do this way always that the procedure performs important side-effects, like updates on the passed argument. If I need a returned value, I can have an intent(out) argument, like the argument out in partition, that returns the pivot position.
pure subroutine partition(a, out)
type(tape), intent(inout) :: a(:)
integer, intent(out) :: out
integer :: i, j
i = 1
do j = 1, size(a)
if (a(j)%ts < a(size(a))%ts) then
call swap(a(i), a(j))
i = i + 1
end if
end do
call swap(a(i), a(size(a)))
out = i
end
elemental subroutine swap(a, b)
type(tape), intent(inout) :: a, b
type(tape) :: temp
temp = a
a = b
b = temp
end
You may note at a(j)%ts < a(size(a))%ts that I am making use of the overloaded operator < to compare two timestamp. This way, the comparison logic belongs to the same module as the type definition.
Finally, you can use the modules and make some tests on your quicksort implementation!
program main
use data_model, only: tape, timestamp
use qsort, only: quicksort
implicit none
type(tape) :: a(8) = [ &
tape(timestamp(2020, 01, 08), 0, 0), &
tape(timestamp(2021, 01, 30), 0, 0), &
tape(timestamp(2020, 01, 06), 0, 0), &
tape(timestamp(2019, 12, 14), 0, 0), &
tape(timestamp(2020, 01, 08), 0, 0), &
tape(timestamp(2020, 05, 05), 0, 0), &
tape(timestamp(2021, 04, 30), 0, 0), &
tape(timestamp(2020, 10, 22), 0, 0) &
]
call quicksort(a(3:7)) ! will sort in place, only from index 3 to 7
call quicksort(a) ! will sort whole array
end
Works like a charm!
This is not an answer directly related to the quicksort algorithm but rather on how to implement type-bound operators.
You can move the compare function inside the data_model module.
This decouples the modules further s.t. the quicksort module only contains the quicksort algorithm.
The compare function can be implemented by a type-bound operator operator(<).
The following shows a quick implementation (only for year/month/day) and it should help you to edit your own code accordingly.
module timestamp_m
implicit none
private
public timestamp
type timestamp
integer :: y, m, d
contains
generic :: operator(<) => timestamp_lt
procedure, private :: timestamp_lt
end type
contains
logical function timestamp_lt(this, rhs) result(tf)
!! result of: this < rhs
class(timestamp), intent(in) :: this
type(timestamp), intent(in) :: rhs
! compare year
if (this%y < rhs%y) then
tf = .true.
else if (this%y > rhs%y) then
tf = .false.
else
! compare month
if (this%m < rhs%m) then
tf = .true.
else if (this%m > rhs%m) then
tf = .false.
else
! compare day
if (this%d < rhs%d) then
tf = .true.
else
tf = .false.
end if
end if
end if
end function
end module
You will need to adjust one line in your quicksort module:
module qsort
..
subroutine quicksort(a, lo, hi)
..
! if (compare(a(j), pivot)) then ! OLD. replace by:
if (a(j)%ts < pivot%ts) then
..
I'm new to parallel programming and attempting to produce a sparse matrix-vector calculation in Fortran 95. I'm working on a subprogram that only gathers the components of the vector that the sparse matrix will touch (instead of MPI_AllGather), but I keep getting SIGSESV errors. I know this means I've asked the process to touch something it can't/doesn't exist, but I can't for the life of me figure out what it could be.
!Gather the vector matrix in matrix vector multiplication for sparse matrices
subroutine sparsegather(u,BW,myid,nprocs)
use header
include "mpif.h"
type(Vector), intent(inout) :: u
integer,intent(in) :: BW !Bandwidth
integer,intent(in) :: myid !process id
integer,intent(in) :: nprocs !number of processes
integer :: n, i
integer,dimension(BW) :: rlr, rrr, slr, srr !Range of receive left/right, send left/right
real(kind=rk),dimension(BW) :: rl, rr, sl, sr !Arrays of actual values
integer :: ierr
n = u%n !Length of whole vector - used in periodic condition
!Define ranges
do i = 1,BW
rlr(i) = u%ibeg - BW - 1 + i
rrr(i) = u%iend + i
srr(i) = u%iend - i + 1
slr(i) = u%ibeg + i - 1
end do
!Periodic conditions
do i = 1,BW
if (rlr(i) < 1) then
rlr(i) = rlr(i) + n
end if
if ((srr(i) < 1) then
srr(i) = srr(i) + n
end if
if (rrr(i) > n ) then
rrr(i) = rrr(i) - n
end if
if (slr(i) > n ) then
slr(i) = slr(i) - n
end if
end do
!Store the matrix values being sent over
sl = u%xx(slr)
sr = u%xx(srr)
!Pass the value parcels around
if (myid == 0) then
call MPI_Recv(rl,BW,MPI_DOUBLE_PRECISION,nprocs-1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sr,BW,MPI_DOUBLE_PRECISION,myid+1,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rr,BW,MPI_DOUBLE_PRECISION,myid+1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sl,BW,MPI_DOUBLE_PRECISION,nprocs-1,0,MPI_COMM_WORLD,ierr)
elseif (myid == nprocs-1) then
call MPI_Send(sr,BW,MPI_DOUBLE_PRECISION,0,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rl,BW,MPI_DOUBLE_PRECISION,myid-1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sl,BW,MPI_DOUBLE_PRECISION,myid-1,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rr,BW,MPI_DOUBLE_PRECISION,0,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
elseif (mod(myid,2) == 0) then
call MPI_Recv(rl,BW,MPI_DOUBLE_PRECISION,myid-1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sr,BW,MPI_DOUBLE_PRECISION,myid+1,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rr,BW,MPI_DOUBLE_PRECISION,myid+1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sl,BW,MPI_DOUBLE_PRECISION,myid-1,0,MPI_COMM_WORLD,ierr)
else
call MPI_Send(sr,BW,MPI_DOUBLE_PRECISION,myid+1,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rl,BW,MPI_DOUBLE_PRECISION,myid-1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
call MPI_Send(sl,BW,MPI_DOUBLE_PRECISION,myid-1,0,MPI_COMM_WORLD,ierr)
call MPI_Recv(rr,BW,MPI_DOUBLE_PRECISION,myid+1,MPI_ANY_TAG,MPI_COMM_WORLD,ierr)
end if
u%xx(rrr) = rr
u%xx(rlr) = rl
end subroutine sparsegather
u is an object with the vector values stored in %xx and its size in %n. The relevant starting point and end points for each processor are in %ibeg and %iend.
BW is bandwith of the sparse banded matrix. This equation has periodic conditions, so values to the left of the start of the vector wrap around to the right side (and vice versa), which is done in the periodic conditions section.
I have been tried to do the Morris Pratt table and the code is basically this one in C:
void preMp(char *x, int m, int mpNext[]) {
int i, j;
i = 0;
j = mpNext[0] = -1;
while (i < m) {
while (j > -1 && x[i] != x[j])
j = mpNext[j];
mpNext[++i] = ++j;
}
}
and here is where i get so far in Fortran
program MP_ALGORITHM
implicit none
integer, parameter :: m=4
character(LEN=m) :: x='abac'
integer, dimension(4) :: T
integer :: i, j
i=0
T(1)=-1
j=-1
do while(i < m)
do while((j > -1) .AND. (x(i+1:i+1) /= (x(j+i+1:j+i+1))))
j=T(j)
end do
i=i+1
j=j+1
T(i)=j
end do
print *, T(1:)
end program MP_ALGORITHM
and the problem is i think i am having the wrong output.
for x=abac it should be (?):
a b a c
-1 0 1 0
and my code is returning 0 1 1 1
so, what i've done wrong?
The problem here is that C indices start from zero, but Fortran indices start from one. You can try to adjust the index for every array acces by one, but this will get unwieldy.
The Morris-Pratt table itself is an array of indices, so it should look different in C and Fortran: The Fortran array should have one-based indices and it should use zero as invalid index.
Together with the error that chw21 pointed out, your function might look like this:
subroutine kmp_table(x, t)
implicit none
character(*), intent(in) :: x
integer, dimension(:), intent(out) :: t
integer m
integer :: i, j
m = len(x)
i = 1
t(1) = 0
j = 0
do while (i < m)
do while(j > 0 .and. x(i:i) /= x(j:j))
j = t(j)
end do
i = i + 1
j = j + 1
t(i) = j
end do
end subroutine
You can then use it in the Morris-Pratt algorithm as taken straight from the Wikipedia page with adjustment for Fortran indices:
function kmp_index(S, W) result(res)
implicit none
integer :: res
character(*), intent(in) :: S ! text to search
character(*), intent(in) :: W ! word to find
integer :: m ! zero-based offset in S
integer :: i ! one-based offset in W and T
integer, dimension(len(W)) :: T ! KMP table
call kmp_table(W, T)
i = 1
m = 0
do while (m + i <= len(S))
if (W(i:i) == S(m + i:m + i)) then
if (i == len(W)) then
res = m + 1
return
end if
i = i + 1
else
if (T(i) > 0) then
m = m + i - T(i)
i = T(i)
else
i = 1
m = m + 1
end if
end if
end do
res = 0
end function
(The index m is zero-based here, because t is only ever used in conjunction with i in S(m + i:m + i). Adding two one-based indices will yield an offset of one, whereas keeping m zero-based makes this a neutral addition. m is a local variable that isn't exposed to code from the outside.)
Alternatively, you could make your Fortran arrays zero-based by specifying a lower bound of zero for your string and array. That will clash with the useful character(*) notation, though, which always uses one-based indexing. In my opinion, it is better to think about the whole algorithm in the typical one-based indexing scheme of Fortran.
this site isn't really a debugging site. Normally I would suggest you have a look at how to debug code. It didn't take me very long to go through your code with a pen and paper and verify that that is indeed the table it produces.
Still, here are a few pointers:
The C code compares x[i] and x[j], but you compare x[i] and x[i+j] in your Fortran code, more or less.
Integer arrays usually also start at index 1 in Fortran. So just like adding one to the index in the x String, you also need to add 1 every time you access T anywhere.
I have a matrix A and B. I want to take the sum of squares errors between them ss = sum(sum( (A-B).^2 )), but I only want to do so if NEITHER matrix elements are identically zero. For now, I am going through each matrix as follows:
for i = 1:N
for j = 1:M
if( A(i,j) == 0 )
B(i,j) = 0;
elseif( B(i,j) == 0 )
A(i,j) = 0;
end
end
end
and then taking the sum of squares after that. Is there a way to vectorize the comparison and reassigning of values?
If you were just trying to achieve what the listed code is doing, but in a vectorized fashion, you can use this approach -
%// Create mask to set elements in both A and B to zeros
mask = A==0 | B==0
%// Set A and B to zeros at places where mask has TRUE values
A(mask) = 0
B(mask) = 0
If the bigger context of finding the sum of squares errors after the listed code could be considered, you can do so with this -
df = A - B;
df(A==0 | B==0) = 0;
ss_vectorized = sum(df(:).^2);
Or as #carandraug commented, you can use the built-in sumsq for the sum of squares calculation at the last step -
ss_vectorized = sumsq(df(:));
I need help debugging QuickSort. As I was debugging, it actually sorts the array properly up to a point, but in the last couple of steps, it ends up doing unnecessary swaps and ends up returning an unsorted array. I've spent quite some time trying to figure out what's causing it, but I've made no progress.
I've chosen the partition as the first element (I know that's not optimal, but I'm just trying to understand QS).
Script:
A = [3 6 2 5 1 7 4];
rightIndex = length(A);
E = QuickSort(A,1,rightIndex);
QuickSort:
function [pvt, B] = QuickSort(A,left,right)
if left < right
[B, pvt] = PartnPivot1(A, left, right); %chosen pivot
QuickSort(B, left, pvt-1);
QuickSort(B, pvt+1, right);
end
Partition:
function [sortedSubArray, pivot] = PartnPivot1(subArray,leftIndex,rightIndex)
%% Initializations
S = subArray;
left = leftIndex;
right = rightIndex;
P = S(left); %pivot
i = left+1;
%% Partition
for j = i:right
if S(j) < P
temp1 = S(j); %
temp2 = S(i); % swap S(i) with S(j)
S(j) = temp2; %
S(i) = temp1; %
i = i+1; %increment i only when swap occurs
end
end
swap1 = S(left); %
swap2 = S(i-1); % final swap
S(left) = swap2; %
S(i-1) = swap1; %
sortedSubArray = S;
pivot = P;
The recursive calls to QuickSort need to assign the output to some variables, otherwise the sorted array never gets passed back. I also think you don't need to return the pivot.
I'm typing in a browser instead of testing in Matlab, but I think this will do it...
function A = QuickSort(A,left,right)
if left < right
[A, pvt] = PartnPivot1(A, left, right); %chosen pivot
A = QuickSort(A, left, pvt-1);
A = QuickSort(A, pvt+1, right);
end