ARMCI - Programming Interfaces

The ARMCI programming interfaces are explained below. This is an up-to-date document that covers all supported ARMCI APIs. A document describing the ARMCI design (with the API corresponding to release 1.0 only) is also available in PDF format.

Header file
The interfaces are prototyped in the "armci.h" header file.
 

1 Initialization and Termination

int ARMCI_Init()
PURPOSE:     Initializes ARMCI. This function must be called before any other ARMCI function.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_Finalize()
PURPOSE:     Finalizes ARMCI. This function must be called after the last use of ARMCI functions.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

void ARMCI_Cleanup()
PURPOSE: Releases any system resources (such as System V shmem ids) that
ARMCI may be holding. It is intended to be used before terminating a program
(e.g., by calling MPI_Abort) in case of an error.

void ARMCI_Error(char *message, int code)
PURPOSE: Combines the functionality of ARMCI_Cleanup and MPI_Abort, and prints (to stdout and stderr) a user-specified message followed by an integer code.
ARGUMENTS:
       message - Message to print out.
       code    - Error code.

2 Copy operations using the generalized I/O vector format

int ARMCI_PutV(armci_giov_t *dsrc_arr, int arr_len, int proc)
PURPOSE: Blocking generalized I/O vector operation that transfers data from the local memory of the calling process (source) to the memory of a remote process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       dsrc_arr - Array of data (type of armci_giov_t) to be put to remote process.
       arr_len  - Number of elements in the dsrc_arr.
       proc     - Remote process ID (destination).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbPutV(armci_giov_t *dsrc_arr,int arr_len,int proc,armci_hdl_t* handle)
PURPOSE: Non-blocking generalized I/O vector operation that transfers data from the local memory of the calling process (source) to the memory of a remote process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       dsrc_arr - Array of data (type of armci_giov_t) to be put to remote process.
       arr_len  - Number of elements in the dsrc_arr.
       proc     - Remote process ID (destination).
       handle   - Pointer to a descriptor associated with a particular non-blocking transfer.
                  Passing NULL for this argument makes the function perform a
                  non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_GetV(armci_giov_t *dsrc_arr,int arr_len,int proc)
PURPOSE: Blocking generalized I/O vector operation that transfers data from the remote process memory (source) to the local memory of the calling process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       dsrc_arr - Array of data (type of armci_giov_t) to get from remote process.
       arr_len  - Number of elements in the dsrc_arr.
       proc     - Remote process ID (source).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbGetV(armci_giov_t *dsrc_arr,int arr_len,int proc,armci_hdl_t* handle)
PURPOSE: Non-blocking generalized I/O vector operation that transfers data from the remote process memory (source) to the local memory of the calling process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       dsrc_arr - Array of data (type of armci_giov_t) to get from remote process.
       arr_len  - Number of elements in the dsrc_arr.
       proc     - Remote process ID (source).
       handle   - Pointer to a descriptor associated with a particular non-blocking transfer.
                  Passing NULL for this argument makes the function perform a
                  non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).
_____________________________________________________________________________

3 Copy operations using the strided format


int ARMCI_PutS(void* src_ptr, int src_stride_ar[], void* dst_ptr, int dst_stride_ar[],
               int count[], int stride_levels, int proc)
PURPOSE: Blocking strided operation that transfers data from the local memory of the calling process (source) to the memory of a remote process (destination).
ARGUMENTS:
       src_ptr        - Source starting address of the data block to put.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to put data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (destination).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbPutS(void* src_ptr, int src_stride_ar[], void* dst_ptr, int dst_stride_ar[],
                 int count[], int stride_levels, int proc,armci_hdl_t* handle)
PURPOSE: Non-blocking strided operation that transfers data from the local memory of the calling process (source) to the memory of a remote process (destination).
ARGUMENTS:
       src_ptr        - Source starting address of the data block to put.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to put data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (destination).
       handle         - Pointer to a descriptor associated with a particular non-blocking transfer.
                        Passing NULL for this argument makes the function perform a
                        non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_GetS(void* src_ptr, int src_stride_ar[], void* dst_ptr,
               int dst_stride_ar[], int count[], int stride_levels, int proc)
PURPOSE: Blocking strided operation that transfers data from the remote process memory (source) to the local memory of the calling process (destination).
ARGUMENTS:
       src_ptr        - Source starting address of the data block to get.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to get data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (source).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbGetS(void* src_ptr, int src_stride_ar[], void* dst_ptr, int dst_stride_ar[],
                 int count[], int stride_levels, int proc, armci_hdl_t* handle)
PURPOSE: Non-blocking strided operation that transfers data from the remote process memory (source) to the local memory of the calling process (destination).
ARGUMENTS:
       src_ptr        - Source starting address of the data block to get.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to get data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (source).
       handle         - Pointer to a descriptor associated with a particular non-blocking transfer.
                        Passing NULL for this argument makes the function perform a
                        non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).
_____________________________________________________________________________

4 Copy operations for contiguous data

int ARMCI_Put(void* src, void* dst, int bytes, int proc)
PURPOSE: Blocking transfer of contiguous data from the local process memory (source) to the remote process memory (destination).
ARGUMENTS:
       src     - Source starting address of the data block to put.
       dst     - Destination starting address to put data.
       bytes   - amount of data to transfer in bytes.
       proc    - Remote process ID (destination).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbPut(void* src, void* dst, int bytes, int proc, armci_hdl_t* handle)
PURPOSE: Non-blocking transfer of contiguous data from the local process memory (source) to the remote process memory (destination).
ARGUMENTS:
       src     - Source starting address of the data block to put.
       dst     - Destination starting address to put the data.
       bytes   - Amount of data to transfer in bytes.
       proc    - Remote process ID (destination).
       handle  - Pointer to a descriptor associated with a particular non-blocking transfer.
                 Passing NULL for this argument makes the function perform a
                 non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_Get(void* src, void* dst, int bytes, int proc)
PURPOSE: Blocking transfer of contiguous data from the remote process memory (source) to the calling process memory (destination).
ARGUMENTS:
       src     - Source starting address of the data block to get.
       dst     - Destination starting address to get the data.
       bytes   - Amount of data to transfer in bytes.
       proc    - Remote process ID (source).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbGet(void* src, void* dst, int bytes, int proc, armci_hdl_t* handle)
PURPOSE: Non-blocking transfer of contiguous data from the remote process memory (source) to the calling process memory (destination).
ARGUMENTS:
       src     - Source starting address of the data block to get.
       dst     - Destination starting address to get the data.
       bytes   - Amount of data to transfer in bytes.
       proc    - Remote process ID (source).
       handle  - Pointer to a descriptor associated with a particular non-blocking transfer.
                 Passing NULL for this argument makes the function perform a
                 non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).
__________________________________________________________________________

5 Accumulate operation using the generalized I/O vector format

The accumulate operation performs an atomic scaled reduction, i.e., remote += scale*local.

int ARMCI_AccV(int datatype, void *scale, armci_giov_t *dsrc_arr, int arr_len, int proc)
PURPOSE: Blocking generalized I/O vector operation that atomically updates the memory of
         a remote process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       datatype  - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL -> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale     - Scale for the data (dest = dest + scale * src).
       dsrc_arr  - Array of data (type of armci_giov_t) to be accumulated to the remote process.
       arr_len   - Number of elements in the dsrc_arr.
       proc      - Remote process ID.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbAccV(int datatype, void *scale, armci_giov_t *dsrc_arr, int arr_len, int proc,
                 armci_hdl_t* handle)
PURPOSE: Non-blocking generalized I/O vector operation that atomically updates the memory of a remote process (destination).
DATA TYPE:
       typedef struct {
         void **src_ptr_ar;  - Source starting addresses of each data segment.
         void **dst_ptr_ar;  - Destination starting addresses of each data segment.
         int bytes;         - The length of each segment in bytes.
         int ptr_ar_len;    - Number of data segments.
       }armci_giov_t;
ARGUMENTS:
       datatype  - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL-> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale     - Scale for the data (dest = dest + scale * src).
       dsrc_arr  - Array of data (type of armci_giov_t) to be accumulated to the remote process.
       arr_len   - Number of elements in the dsrc_arr.
       proc      - Remote process ID.
       handle    - Pointer to a descriptor associated with a particular non-blocking transfer.
                   Passing NULL for this argument makes the function perform a
                   non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).
_____________________________________________________________________________

6 Accumulate operation using the strided format

int ARMCI_AccS(int datatype, void *scale, void* src_ptr,int src_stride_ar[], void* dst_ptr, 
                 int dst_stride_ar[], int count[],  int stride_levels, int proc)
PURPOSE: Blocking strided operation that atomically updates the memory of a remote process (destination).
ARGUMENTS:
       datatype       - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL -> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale          - Scale for the data (dest = dest + scale * src).
       src_ptr        - Source starting address of the data block to put.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to put data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (destination).
RETURN VALUE:
       zero           - Successful.
       other value    - Error code (described in the release notes).

int ARMCI_NbAccS(int datatype, void *scale, void* src_ptr,int src_stride_ar[],
                  void* dst_ptr, int dst_stride_ar[], int count[],  int stride_levels,
                  int proc, armci_hdl_t* handle)
PURPOSE: Non-blocking strided operation that atomically updates the memory of a remote process (destination).
ARGUMENTS:
       datatype       - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL -> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale          - Scale for the data (dest = dest + scale * src).
       src_ptr        - Source starting address of the data block to put.
       src_stride_ar  - Source array of stride distances in bytes.
       dst_ptr        - Destination starting address to put data.
       dst_stride_ar  - Destination array of stride distances in bytes.
       count          - Block size in each dimension. count[0] should be the
                        number of bytes of contiguous data in the leading dimension.
       stride_levels  - Number of stride levels.
       proc           - Remote process ID (destination).
       handle         - Pointer to a descriptor associated with a particular non-blocking transfer.
                        Passing NULL for this argument makes the function perform a
                        non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero           - Successful.
       other value    - Error code (described in the release notes).

7 Accumulate operation using contiguous format

int ARMCI_Acc(int datatype, void *scale, void* src, void* dst, int bytes, int proc)
PURPOSE: Blocking operation that atomically updates the memory of a remote process (destination).
ARGUMENTS:
       datatype - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL-> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale   - Scale for the data (dest = dest + scale * src).
       src     - Source starting address of the data to transfer.
       dst     - Destination starting address to add incoming data.
       bytes   - amount of data to transfer in bytes.
       proc    - Remote process ID (destination).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbAcc(int datatype, void *scale, void* src, void* dst, int bytes, int proc,
                armci_hdl_t* handle)
PURPOSE: Non-blocking operation that atomically updates the memory of a remote process (destination).
ARGUMENTS:
       datatype - Supported data types are:
                        ARMCI_ACC_INT -> int, ARMCI_ACC_LNG -> long,
                        ARMCI_ACC_FLT -> float, ARMCI_ACC_DBL-> double,
                        ARMCI_ACC_CPL -> complex, ARMCI_ACC_DCPL -> double complex.
       scale   - Scale for the data (dest = dest + scale * src).
       src     - Source starting address of the data to transfer.
       dst     - Destination starting address to add incoming data.
       bytes   - amount of data to transfer in bytes.
       proc    - Remote process ID (destination).
       handle  - Pointer to a descriptor associated with a particular non-blocking transfer.
                 Passing NULL for this argument makes the function perform a
                 non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).
 _____________________________________________________________________________

8 Register Originated Data Transfer Operations

int ARMCI_PutValueXXX(DATATYPE src, void* dst, int proc)
PURPOSE: Transfer of a value stored in a register of the local process to remote process memory (destination). XXX can be "Long", "Int", "Double", or "Float".
ARGUMENTS:
       DATATYPE - long, int, double, or float, according to XXX in the function name.
       src      - Value in a register to put.
       dst      - Destination starting address to put data.
       proc     - Remote process ID (destination).
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_NbPutValueXXX(DATATYPE src, void* dst, int proc, armci_hdl_t* handle)
PURPOSE: Non-blocking transfer of a value stored in a register of the local process to remote process memory (destination). XXX can be "Long", "Int", "Double", or "Float".
ARGUMENTS:
       DATATYPE - long, int, double, or float, according to XXX in the function name.
       src      - Value in a register to put.
       dst      - Destination starting address to put data.
       proc     - Remote process ID (destination).
       handle   - Pointer to a descriptor associated with a particular non-blocking transfer.
                  Passing NULL for this argument makes the function perform a
                  non-blocking transfer with an implicit handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

DATATYPE ARMCI_GetValueXXX(void *src, int proc)
PURPOSE: Transfer of a value from the memory of a remote process (source) to the calling process, where it is returned as the function result. XXX can be "Long", "Int", "Double", or "Float".
ARGUMENTS:
       DATATYPE - long, int, double, or float, according to XXX in the function name.
       src      - Source starting address.
       proc     - Remote process ID (source).
RETURN VALUE:
       the value (of type DATATYPE) is returned.
_____________________________________________________________________________

9 Completion of Outstanding Operations

void ARMCI_Fence(int proc)
PURPOSE: Blocks the calling process until all put or accumulate operations
issued to the specified remote process complete at the destination.
ARGUMENTS:
    proc    - Remote process ID.

void ARMCI_AllFence()
PURPOSE: Blocks the calling process until all the outstanding put or accumulate
operations complete remotely regardless of the destination processor.

ARGUMENTS: none


int ARMCI_Wait(armci_hdl_t* handle)
PURPOSE: Waits for the completion of a non-blocking ARMCI operation that was issued with an explicit handle.
ARGUMENTS:
       handle - Pointer to a descriptor associated with a particular non-blocking transfer.
                A NULL pointer is erroneous. A non-NULL value makes this routine
                wait for the explicit non-blocking request identified by handle.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_Test(armci_hdl_t* handle)
PURPOSE: Checks the completion status of a non-blocking ARMCI operation that was issued with an explicit handle.
ARGUMENTS:
       handle - Pointer to a descriptor associated with a particular non-blocking transfer.
                A NULL pointer is erroneous. A non-NULL value makes this routine
                test the explicit non-blocking request identified by handle.
RETURN VALUE:
       zero        - Completed
       1           - In progress
       other value - Error code (described in the release notes).

int ARMCI_WaitProc(int proc)
PURPOSE: Wait for all outstanding non-blocking operations with implicit handles to a particular process to finish.
ARGUMENTS:
       proc - Remote process ID for which all outstanding non-blocking operations must complete.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_WaitAll()
PURPOSE: Wait for all outstanding non-blocking operations with implicit handles to finish.
ARGUMENTS: none
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

void ARMCI_Barrier()
PURPOSE: Synchronizes processes and memory. This operation combines the functionality of
        MPI_Barrier and ARMCI_AllFence.
ARGUMENTS: none
RETURN VALUE: none
_____________________________________________________________________________

10 Aggregated Data Transfer Operations

ARMCI_SET_AGGREGATE_HANDLE(armci_hdl_t* handle)
PURPOSE: Marks a handle as aggregate. This allows ARMCI to combine non-blocking operations that use that handle and process them as a single operation. In the initial implementation, only contiguous puts or gets can use an aggregate handle. Specifying the same handle for a mix of put and get calls is not allowed, i.e., only multiple put calls or only multiple get calls can share a handle.
ARGUMENTS:
       handle - Pointer to a descriptor associated with a particular non-blocking transfer.

ARMCI_UNSET_AGGREGATE_HANDLE(armci_hdl_t* handle)
PURPOSE: Clears a handle that has been marked as aggregate.
ARGUMENTS:
       handle - Pointer to a descriptor associated with a particular non-blocking transfer.

11 Atomic and Synchronization Operations

int ARMCI_Rmw(int op, void *ploc, void *prem, int value, int proc)
PURPOSE: Atomically combines the specified integer value with the corresponding integer value (int or long) at the remote memory location and returns the original value found at that location.
ARGUMENTS:
       op     - Available operations are:
                ARMCI_FETCH_AND_ADD      -> int
                ARMCI_FETCH_AND_ADD_LONG -> long
                ARMCI_SWAP               -> int
                ARMCI_SWAP_LONG          -> long
       ploc   - Pointer to the local memory.
       prem   - Pointer to the remote memory.
       value  - Value to be added to the remote memory.
       proc   - Remote process ID.

int ARMCI_Create_mutexes(int count)
PURPOSE: Collective operation to create sets of mutexes on individual processes.
Each process specifies the number of mutexes associated with that
process. The total number of mutexes allocated will be the sum of the
values specified on each process.
ARGUMENTS:
       count - Number of mutexes allocated on the calling process.
               count=0 means that no mutexes will be associated with that process.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_Destroy_mutexes(void)
PURPOSE: Collective operation to destroy mutex sets allocated by ARMCI_Create_mutexes.
ARGUMENTS:  none
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

void ARMCI_Lock(int mutex, int proc)
PURPOSE: Acquire the specified mutex on the specified process on behalf of the calling process.
ARGUMENTS:
 mutex    - Mutex number (0..count-1)
  proc    - Remote process ID

void ARMCI_Unlock(int mutex, int proc)
PURPOSE: Releases the specified mutex on the specified process on behalf of the calling process. The mutex must have been acquired with ARMCI_Lock.
ARGUMENTS:
    mutex    - Mutex number (0..count-1)
    proc     - Remote process ID

12 Memory Management

int ARMCI_Malloc(void* ptr[], armci_size_t bytes)
PURPOSE: Collective operation to allocate memory that can be used in the context
of ARMCI copy operations.
ARGUMENTS:
       ptr   - Pointer array. Each pointer points to the allocated memory of one process.
       bytes - The size of allocated memory in bytes.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

void* ARMCI_Malloc_local(armci_size_t bytes)
PURPOSE: Non-collective operation to allocate local memory. This memory can only be accessed locally. However, using this memory in ARMCI operations can improve performance on some systems. For example, on Myrinet or Infiniband, the memory is registered and therefore suitable for native RDMA communication.
ARGUMENTS:
       bytes - The size of allocated memory in bytes.
RETURN VALUE:
       NULL pointer - Failure.
       other value - address of newly allocated memory.

int ARMCI_Free(void *address)
PURPOSE: Collective operation to free memory which was allocated by ARMCI_Malloc.
ARGUMENTS:
  address    - pointer to the allocated memory.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

int ARMCI_Free_local(void *address)
PURPOSE: Non-collective operation to free memory which was allocated by ARMCI_Malloc_local.
ARGUMENTS:
  address    - pointer to the allocated memory.
RETURN VALUE:
       zero        - Successful.
       other value - Error code (described in the release notes).

13 Collective Operations

These operations can be used as an alternative to the collective operations in a message-passing library ARMCI is running with. The programmer can use either.

void armci_msg_brdcst(void* buffer, int len, int root)
PURPOSE: Broadcasts data from process "root" to all other processes.
ARGUMENTS:
buffer - data to broadcast/receive
len - size of the data
root - rank of the sending process
RETURN VALUE: none

void armci_msg_XXXgop(void *x, int n, int type, char* op)
PURPOSE: Allreduce operation for int, long, float, or double data, where XXX is "i", "l", "f", or "d", respectively.
ARGUMENTS:
x - data
n - number of elements
type - data type, one of: ARMCI_INT/ARMCI_LONG/ARMCI_FLOAT/ARMCI_DOUBLE
op - operator, one of: "+","*","min","max","abs"
RETURN VALUE: none

void armci_msg_barrier(void)
PURPOSE: synchronize all processors
ARGUMENTS: none
RETURN VALUE: none

void armci_msg_reduce(void *x, int n, char* op, int type, int root)
PURPOSE: Reduce operation.
ARGUMENTS:
x - data
n - number of elements
op - operator, one of: "+","*","min","max","abs"
type - data type, one of: ARMCI_INT/ARMCI_LONG/ARMCI_FLOAT/ARMCI_DOUBLE
root - rank of the root process
RETURN VALUE: none

14 System Configuration

These operations can be used to determine the configuration of the system the application is running on.
The system configuration is described in terms of locality domains. For example, on clusters with SMP nodes, the SMP node is one of two locality domains for a process. The ARMCI header file predefines ARMCI_DOMAIN_SMP for querying configuration information on clusters composed of computer nodes with shared memory.

int armci_domain_nprocs(armci_domain_t domain, int id)
PURPOSE: Returns the number of processes/tasks in the locality domain node represented by id.
ARGUMENTS:
domain - domain name
id - identifier of a node within the locality domain (0, ..., armci_domain_count(domain)-1), value < 0 means my node
RETURN VALUE:
< 0 - error
other value - number of processes/tasks

int armci_domain_count(armci_domain_t domain)
PURPOSE: Returns the number of nodes in the specified locality domain.
ARGUMENTS:
domain - domain name
RETURN VALUE:
< 0 - error
other value - number of nodes

int armci_domain_id(armci_domain_t domain, int glob_proc_id)
PURPOSE: Returns the id of the locality domain node to which the specified process belongs.
ARGUMENTS:
domain - domain name
glob_proc_id - global process/task id
RETURN VALUE:
< 0 - error
other value - id of the node within the locality domain

int armci_domain_glob_proc_id(armci_domain_t domain, int id, int loc_proc_id)
PURPOSE: Returns the global process/task id based on its id within a given locality domain node.
ARGUMENTS:
domain - domain name
id - identifier of a node within the locality domain, value < 0 means my node
loc_proc_id - process/task id within the node
RETURN VALUE:
< 0 - error
other value - process/task id

int armci_domain_my_id(armci_domain_t domain)
PURPOSE: Returns the id of the node in the specified domain to which the calling process/task belongs.
ARGUMENTS:
domain - domain name
RETURN VALUE: id of the node within the domain