fortran - MPI_SEND doesn't wait for MPI_RECV to complete - Stack Overflow


I am trying to implement the MPI_BARRIER function myself using the MPI_SEND and MPI_RECV functions in Fortran. I apologize if this has been asked before; any help is appreciated.

Program - I:

! Barrier Function Example Program 

program barrier 
    implicit none 
    include 'mpif.h'

    integer :: rank, nproc, ierr, tag, msg, root 
    integer :: i 

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    
    call sleep(rank)

    tag = 0 ; msg = 10 ; root = 0
    ! Barrier Function [RHS]
    if(rank == 0) then
        do i = 2 , nproc
            call mpi_recv(msg, 1, mpi_int, i - 1, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        enddo
        do i = 2, nproc 
            call mpi_send(msg, 1, mpi_int, i - 1, tag, mpi_comm_world, ierr)
        enddo   
    else 
        call mpi_send(msg, 1, mpi_int, root, tag, mpi_comm_world, ierr)
        call mpi_recv(msg, 1, mpi_int, root, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
    endif 

    write(*,*) "ID: ",rank 

    call mpi_finalize(ierr)
end program barrier

Program - II:

! Barrier Function Example Program 

program barrier 
    implicit none 
    include 'mpif.h'

    integer :: rank, nproc, ierr, tag, msg, root 
    integer :: i 

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    
    call sleep(rank)

    tag = 0 ; msg = 10 ; root = 0
    ! Barrier Function [RHS]
    if(rank == 0) then
        do i = 2, nproc 
            call mpi_send(msg, 1, mpi_int, i - 1, tag, mpi_comm_world, ierr)
        enddo
        do i = 2 , nproc
            call mpi_recv(msg, 1, mpi_int, i - 1, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        enddo
           
    else 
        call mpi_recv(msg, 1, mpi_int, root, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        call mpi_send(msg, 1, mpi_int, root, tag, mpi_comm_world, ierr)
    endif 

    write(*,*) "ID: ",rank 

    call mpi_finalize(ierr)
end program barrier

Notice that only the order of the MPI_SEND and MPI_RECV calls has changed. However, when executing the programs, Program-I implements the barrier while Program-II does not.

To my understanding, MPI_SEND waits until the message has been received. What might be the issue with the program here?

It seems that MPI_SEND doesn't wait for MPI_RECV to complete.

asked Jan 30 at 7:11 by Tanish Jain
  • I strongly suggest moving to use mpi instead of the obsolete include 'mpif.h'. The even newer mpi_f08 interface is better still, but requires more changes. It will help avoid several kinds of bugs resulting from incorrect calls. – Vladimir F Героям слава, Jan 30 at 8:35
  • In particular, note that MPI_INT is not standard here: it is part of the C interface, not the Fortran one. Use MPI_INTEGER. – Ian Bush, Jan 30 at 10:57
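A minimal sketch of what these two comments suggest (assuming an MPI installation that ships the Fortran mpi module, e.g. compiled with mpifort); only the declarations and the datatype handle change, while the send/recv logic stays as in the question:

! Sketch only: mpi module instead of mpif.h, MPI_INTEGER instead of MPI_INT
program barrier
    use mpi            ! replaces include 'mpif.h' and adds argument checking
    implicit none

    integer :: rank, nproc, ierr, tag, msg, root

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)

    tag = 0 ; msg = 10 ; root = 0
    ! ... same barrier logic as above, but with the Fortran datatype handle:
    ! call mpi_send(msg, 1, mpi_integer, root, tag, mpi_comm_world, ierr)

    call mpi_finalize(ierr)
end program barrier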

2 Answers

MPI_Send() returns when the application can safely overwrite the send buffer. This does not imply that the message has been received by the destination process. Depending on the specifics of your MPI implementation and other factors, you might observe that MPI_Send() returns immediately for short messages, but blocks until a corresponding receive is initiated for longer messages. However, you should not rely on this behavior: according to the MPI standard, any program where a blocking send could lead to a deadlock is considered incorrect.

MPI_Ssend() (note the double "s"), in contrast, does not return until the matching receive operation has been started by the destination MPI process.
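A minimal two-rank sketch (not from the question; it reuses the non-standard sleep extension the question already relies on) makes this observable: rank 1 delays its receive by three seconds while rank 0 times the send call. With mpi_send the call typically returns almost immediately for such a small message; swapping it for mpi_ssend makes it block for roughly the full delay.

! Timing sketch: run with two processes, e.g. mpirun -np 2 ./a.out
program send_vs_ssend
    use mpi
    implicit none

    integer :: rank, ierr, msg
    double precision :: t0, t1

    call mpi_init(ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)

    msg = 10
    if (rank == 0) then
        t0 = mpi_wtime()
        ! mpi_send: usually returns at once (small message gets buffered)
        ! mpi_ssend: would block until rank 1 starts its receive
        call mpi_send(msg, 1, mpi_integer, 1, 0, mpi_comm_world, ierr)
        t1 = mpi_wtime()
        write(*,*) "send returned after ", t1 - t0, " seconds"
    else if (rank == 1) then
        call sleep(3)
        call mpi_recv(msg, 1, mpi_integer, 0, 0, mpi_comm_world, mpi_status_ignore, ierr)
    endif

    call mpi_finalize(ierr)
end program send_vs_ssend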

MPI has no send function that waits until the receive is complete. MPI has a synchronous send (MPI_Ssend) which waits until the matching receive has started.

The usual semantics of a barrier are: every participant waits in the barrier until all other participants have arrived. Therefore, when a participant leaves the barrier, every other participant has already reached it.

A basic barrier implementation therefore consists of a reduction followed by a broadcast. Only your first example implements a reduction to rank 0 followed by a broadcast from rank 0, and therefore provides barrier semantics.

The second program implements a broadcast followed by a reduction, which synchronizes some processes, but not all: a fast rank can receive rank 0's message and leave while slower ranks have not yet arrived.
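A minimal sketch of how that reduction-followed-by-broadcast structure can be packaged as a reusable routine (my_barrier and the two tags are illustrative names, not from the answer; the mpi module and mpi_integer follow the advice in the comments above):

! Hand-rolled barrier sketch: gather a token at rank 0, then release everyone
program barrier_sketch
    use mpi
    implicit none

    integer :: rank, ierr

    call mpi_init(ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)

    call sleep(rank)                  ! stagger arrival, as in the question
    call my_barrier(mpi_comm_world)
    write(*,*) "ID: ", rank           ! printed only after every rank has arrived

    call mpi_finalize(ierr)

contains

    subroutine my_barrier(comm)
        integer, intent(in) :: comm
        integer :: me, nproc, i, token, ierr2

        call mpi_comm_size(comm, nproc, ierr2)
        call mpi_comm_rank(comm, me, ierr2)

        token = 0
        if (me == 0) then
            ! reduction phase: wait until every other rank has announced arrival
            do i = 1, nproc - 1
                call mpi_recv(token, 1, mpi_integer, i, 0, comm, mpi_status_ignore, ierr2)
            enddo
            ! broadcast phase: release everyone
            do i = 1, nproc - 1
                call mpi_send(token, 1, mpi_integer, i, 1, comm, ierr2)
            enddo
        else
            ! announce arrival, then block until rank 0 sends the release message
            call mpi_send(token, 1, mpi_integer, 0, 0, comm, ierr2)
            call mpi_recv(token, 1, mpi_integer, 0, 1, comm, mpi_status_ignore, ierr2)
        endif
    end subroutine my_barrier

end program barrier_sketch

In practice mpi_barrier(comm, ierr) does the same job and should be preferred; the sketch only illustrates why the reduce-then-broadcast order matters.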
