Disk Resident Arrays

Disk Resident Arrays (DRA) extend the Global Arrays (GA) programming model to disk. The library encapsulates the details of data layout, addressing and I/O transfer in disk array objects. Disk resident arrays resemble global arrays except that they reside on disk instead of in main memory. A main feature of this model is that DRA can take advantage of a shared file system or of a collection of independent file systems accessible from individual computing nodes.

dra_init

status = dra_init(max_arrays, max_array_size, total_disk_space, max_memory)
         integer max_arrays [input]
         double precision max_array_size [input]
         double precision total_disk_space [input]
         double precision max_memory [input]
Initializes the Disk Resident Array I/O subsystem.

max_array_size, total_disk_space and max_memory are given in bytes.

max_memory specifies how much local memory per processor the application is willing to provide to the DRA I/O subsystem for buffering.

A value of -1 for any of the input arguments means "don't care", "don't know", or "use defaults".
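
As an illustration of the byte-based arguments, the following sketch initializes the subsystem for arrays of up to 10000 x 10000 double precision elements. The subroutine name, the sizes, the 8-bytes-per-element figure and the convention that a nonzero status signals failure are assumptions made for this example. It also assumes that dra.fh declares the DRA functions as integer functions and that the source is run through the C preprocessor.

```fortran
      subroutine init_dra_example()
      implicit none
#include "dra.fh"
      integer status
      double precision max_size, tot_space, max_mem

!     All three size arguments are given in bytes.
!     Largest single array: 10000 x 10000 elements of 8 bytes.
      max_size  = 8.0d0 * 10000.0d0 * 10000.0d0
!     Disk space for four arrays of that size.
      tot_space = 4.0d0 * max_size
!     -1 asks the library to use its default buffer size.
      max_mem   = -1.0d0

      status = dra_init(10, max_size, tot_space, max_mem)
!     Assumption: a nonzero status signals failure.
      if (status .ne. 0) then
         print *, 'dra_init failed, status = ', status
         stop
      endif
      end
```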


dra_terminate

  status=dra_terminate()
Closes all open disk resident arrays and shuts down the DRA I/O subsystem.


dra_create

     status = dra_create(type, dim1, dim2, name, filename, mode,rdim1,rdim2,d_a)
              integer type                      [input]   ! MA type identifier
              integer dim1                      [input]
              integer dim2                      [input]
              character*(*) name                [input]
              character*(*) filename            [input]
              integer mode                      [input]
              integer rdim1                     [input] 
              integer rdim2                     [input]
              integer d_a                       [output]  ! DRA handle
Creates a new disk resident array with specified dimensions and data type (represented by MA type handle).

The string filename specifies the name of an abstract meta-file that will store the data on the disk. The meta-file might be implemented as multiple files, each containing part of the disk resident array. The component files will have names derived from the string filename according to some established scheme(s). Only one DRA object can be stored in the DRA meta-file identified by filename.

DRA objects persist on the disk after calling dra_close. dra_delete should be used instead of dra_close to delete the disk array and the associated meta-file on the disk.

The string name can be used for a more informative (longer) name.

The data in a disk resident array is implicitly initialized to 0.

Access permissions (read, write, read and write) are set in mode using the preprocessor constants DRA_R, DRA_W and DRA_RW, which are defined in the header files dra.fh (Fortran) and dra.h (C).

The pair [rdim1, rdim2] specifies the dimensions of a typical request. A value of -1 for either of them means "unspecified". The layout of the data on the disk(s) is determined from the values of these arguments. Performance of the DRA operations depends on the dimensions (section shape) of the requests. If the data layout is optimized for column-like sections, performance of DRA operations for row-like sections might be seriously degraded. This is analogous to the effect of wrong loop ordering yielding frequent cache misses, as in the following example:

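              double precision a(1000, 1000)
              ! poor ordering: the inner loop strides across a row
              do i = 1, 1000
                 do j = 1, 1000
                    a(i,j) = dble(i+j)
                 enddo
              enddo

instead of
              ! good ordering: the inner loop runs down a column
              do j = 1, 1000
                 do i = 1, 1000
                    a(i,j) = dble(i+j)
                 enddo
              enddo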
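
To connect the typical-request arguments to an actual call, the sketch below creates a 1000 x 1000 double precision DRA and tells the library that requests will usually cover 10 full columns. The array name 'A', the meta-file name 'A.dra' and the subroutine name are hypothetical. MT_DBL is assumed to come from the MA header mafdecls.fh, and a nonzero status is assumed to signal failure.

```fortran
      subroutine create_column_dra(d_a)
      implicit none
#include "mafdecls.fh"
#include "dra.fh"
      integer d_a
      integer status, n

      n = 1000
!     Typical request: n rows by 10 columns (column-like).
      status = dra_create(MT_DBL, n, n, 'A', 'A.dra', DRA_RW,
     &                    n, 10, d_a)
!     Assumption: a nonzero status signals failure.
      if (status .ne. 0) then
         print *, 'dra_create failed, status = ', status
         stop
      endif
      end
```

If the access pattern is not known in advance, passing -1 for rdim1 and rdim2 leaves the choice of layout to the library.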





ndra_create

    status = ndra_create(type, ndim, dims, name, filename, mode, reqdims, d_a)
             integer type                       [input]  ! MA type identifier
             integer ndim                       [input]  ! Dimension of DRA
             integer dims(ndim)                 [input]  ! Dimensions of DRA
             character*(*) name                 [input]  ! Name of DRA
             character*(*) filename             [input]  ! Name of file containing DRA
             integer mode                       [input]  ! READ; WRITE; READ/WRITE
             integer reqdims(ndim)              [input]  ! Typical request size
             integer d_a                        [output] ! DRA handle
Creates an N-dimensional DRA with the specified dimensions and data type (represented by an MA type handle). The number of dimensions of the DRA is specified by the variable ndim, and the extent of each dimension is specified in the array dims. The variable name is an internal name that can be used to identify the DRA, and the variable filename is the name of an abstract meta-file that will be used to store the data on disk. The variable mode can be used to restrict access to the DRA and is set using the predefined values DRA_R (read), DRA_W (write), and DRA_RW (read/write). The array reqdims contains the dimensions of a typical request to the DRA; if any entry is set to -1, the ndra_create routine will attempt to choose a default value for it. The variable d_a is an integer handle that is assigned to the DRA when it is created and can be used to access the DRA later in the program.

See documentation for  dra_create for additional information.
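
A minimal sketch of an N-dimensional creation call follows. It creates a 100 x 200 x 300 double precision DRA whose typical request is one full 100 x 200 plane. The names 'B' and 'B.dra' and the subroutine name are hypothetical. MT_DBL is assumed to come from mafdecls.fh, ndra_create is assumed to be declared in dra.fh, and a nonzero status is assumed to signal failure.

```fortran
      subroutine create_3d_dra(d_a)
      implicit none
#include "mafdecls.fh"
#include "dra.fh"
      integer d_a
      integer ndim
      parameter (ndim = 3)
      integer dims(ndim), reqdims(ndim), status

      dims(1) = 100
      dims(2) = 200
      dims(3) = 300
!     Typical request: one full plane in the first two dimensions.
      reqdims(1) = 100
      reqdims(2) = 200
      reqdims(3) = 1

      status = ndra_create(MT_DBL, ndim, dims, 'B', 'B.dra',
     &                     DRA_RW, reqdims, d_a)
!     Assumption: a nonzero status signals failure.
      if (status .ne. 0) then
         print *, 'ndra_create failed, status = ', status
         stop
      endif
      end
```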



dra_set_default_config

    subroutine dra_set_default_config(numfiles, numioprocs)
               integer numfiles                  [input]
               integer numioprocs                [input]

This subroutine allows users to control the number of files that a DRA is broken up into and the number of processors doing I/O, provided the DRA is being created on an open filesystem. If the DRA is being created on local disk, this subroutine has no effect. By default, both the number of files and the number of I/O processors equal the number of SMP nodes used by the calculation. Other settings can be chosen to create DRAs composed of larger or smaller numbers of files and I/O processors; these settings may provide better I/O bandwidth on some platforms. The dra_set_default_config subroutine can be called multiple times throughout the program. Each DRA is created with whatever default configuration is in effect at the time of its creation.
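
Because the configuration is captured when a DRA is created, the call has to precede the creation call it is meant to affect. The sketch below shows that ordering only. The value 4 for both arguments, the names 'C' and 'C.dra' and the subroutine name are arbitrary choices for illustration, and whether a particular combination is accepted on a given platform is not something this example establishes.

```fortran
      subroutine create_with_config(d_a)
      implicit none
#include "mafdecls.fh"
#include "dra.fh"
      integer d_a
      integer status

!     Applies to DRAs created after this call, not to existing ones.
      call dra_set_default_config(4, 4)

      status = dra_create(MT_DBL, 1000, 1000, 'C', 'C.dra',
     &                    DRA_RW, -1, -1, d_a)
!     Assumption: a nonzero status signals failure.
      if (status .ne. 0) stop 'dra_create failed'
      end
```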


dra_open

    status = dra_open(filename, mode, d_a)
              character*(*) filename            [input]
              integer mode                      [input]
              integer d_a                       [output]  ! DRA handle
Opens the disk resident array stored in the DRA meta-file filename and assigns a DRA handle to it. Disk resident arrays that are created with dra_create and saved by calling dra_close can later be opened and accessed by the same or a different application.

Attributes of the disk resident array can be found by calling dra_inquire.
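
The following sketch reopens a previously saved array read-only and queries its attributes. The meta-file name 'A.dra', the subroutine name and the 80-character buffers are hypothetical, and a nonzero status is assumed to signal failure.

```fortran
      subroutine open_and_inquire()
      implicit none
#include "dra.fh"
      integer d_a, status, dtype, dim1, dim2
      character*80 name, filename

      status = dra_open('A.dra', DRA_R, d_a)
      if (status .ne. 0) stop 'dra_open failed'

      status = dra_inquire(d_a, dtype, dim1, dim2, name, filename)
      if (status .ne. 0) stop 'dra_inquire failed'
      print *, 'dimensions: ', dim1, dim2, '  MA type: ', dtype

!     dra_close keeps the array on disk for later use.
      status = dra_close(d_a)
      end
```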


dra_write

     status = dra_write(g_a, d_a, request)
              integer g_a                       [input]  ! GA handle
              integer d_a                       [input]  ! DRA handle
              integer request                   [output] ! request id
Asynchronously writes the specified global array to the specified disk resident array.

The dimensions and type of the arrays represented by handles g_a and d_a must match. If the dimensions don't match, dra_write_section should be used instead.

The operation is by definition asynchronous, but it might be implemented as synchronous, i.e., it would return only when the I/O is completed.

request can be passed to dra_probe or dra_wait to test or wait for completion of the associated operation.
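
A typical use pairs the write with a later wait, as in this sketch. The subroutine name is hypothetical, both handles are assumed to have been created elsewhere with matching dimensions and type, and a nonzero status is assumed to signal failure. Leaving g_a unmodified until the wait returns is a precaution adopted here, not a rule stated in this reference.

```fortran
      subroutine save_ga(g_a, d_a)
      implicit none
#include "dra.fh"
      integer g_a, d_a
      integer request, status

      status = dra_write(g_a, d_a, request)
      if (status .ne. 0) stop 'dra_write failed'

!     Work that does not touch g_a could overlap with the I/O here.

!     Block until the write identified by request has completed.
      status = dra_wait(request)
      if (status .ne. 0) stop 'dra_wait failed'
      end
```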



ndra_write

    status = ndra_write(g_a, d_a, request)
             integer g_a                        [input]  ! GA handle
             integer d_a                        [input]  ! DRA handle
             integer request                    [output] ! request id

N-dimensional asynchronous write from the specified global array to the specified disk resident array.

The number of dimensions, the extent of each dimension, and the type of the arrays represented by handles g_a and d_a must match. If the extents don't match, ndra_write_section should be used instead.

The operation is by definition asynchronous, but it might be implemented as synchronous, i.e., it would return only when the I/O is completed.

request can be passed to dra_probe or dra_wait to test or wait for completion of the associated operation.


dra_write_section

     status = dra_write_section(transp, g_a, gilo, gihi, gjlo, gjhi, 
                                        d_a, dilo, dihi, djlo, djhi, request)
              logical transp                    [input] ! transpose operator 
              integer g_a                       [input] ! GA handle 
              integer d_a                       [input] ! DRA handle 
              integer gilo                      [input] 
              integer gihi                      [input] 
              integer gjlo                      [input] 
              integer gjhi                      [input] 
              integer dilo                      [input] 
              integer dihi                      [input] 
              integer djlo                      [input] 
              integer djhi                      [input] 
              integer request                   [output] ! request id
Asynchronously writes the specified global array section to the specified disk resident array section:
                OP(g_a[ gilo:gihi, gjlo:gjhi]) -->  d_a[ dilo:dihi, djlo:djhi]
where OP is the transpose operator, applied when transp is .true. and skipped when it is .false. An error is returned if the types or sizes of the two sections do not match. See the dra_write specs for a discussion of request.
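
The sketch below writes a 100 x 50 section of a global array, transposed, into a 50 x 100 section of a disk resident array. The index ranges and the subroutine name are arbitrary, both handles are assumed to refer to arrays large enough to contain these sections, and a nonzero status is assumed to signal failure. That the DRA section takes the transposed shape when transp is .true. is this example's reading of the formula above.

```fortran
      subroutine save_transposed(g_a, d_a)
      implicit none
#include "dra.fh"
      integer g_a, d_a
      integer request, status

!     g_a[1:100, 1:50] transposed is 50 x 100, so the target
!     section is d_a[1:50, 1:100].
      status = dra_write_section(.true., g_a, 1, 100, 1, 50,
     &                           d_a, 1, 50, 1, 100, request)
      if (status .ne. 0) stop 'dra_write_section failed'

      status = dra_wait(request)
      if (status .ne. 0) stop 'dra_wait failed'
      end
```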



ndra_write_section

     status = ndra_write_section(transp, g_a, glo, ghi, d_a, dlo, dhi, request)
              logical transp                    [input]  ! transpose operator (ignored)
              integer g_a                       [input]  ! GA handle
              integer glo(ndim)                 [input]  ! array of lower indices on GA
              integer ghi(ndim)                 [input]  ! array of upper indices on GA
              integer d_a                       [input]  ! DRA handle
              integer dlo(ndim)                 [input]  ! array of lower indices on DRA
              integer dhi(ndim)                 [input]  ! array of upper indices on DRA
              integer request                   [output] ! request id
Asynchronously writes the specified global array section to the specified disk resident array section:
              g_a[glo:ghi] --> d_a[dlo:dhi]
The transpose operator is currently disabled in this function and has no effect. The function returns an error if the sizes of the two sections do not match. See the dra_write specs for a discussion of request.


dra_read

     status = dra_read(g_a, d_a, request)
              integer g_a                       [input]  ! GA handle
              integer d_a                       [input]  ! DRA handle
              integer request                   [output] ! request id
Asynchronous read to the specified global array from the specified disk resident array.

The dimensions and type of the arrays referred to by handles g_a and d_a must match. If the dimensions don't match, dra_read_section should be used instead.

See the dra_write specs for a discussion of request.




 

ndra_read

     status = ndra_read(g_a, d_a, request)
              integer g_a                       [input]  ! GA handle
              integer d_a                       [input]  ! DRA handle
              integer request                   [output] ! request id
N-dimensional asynchronous read to the specified global array from the specified disk resident array.

The number of dimensions, the extent of each dimension, and the type of the arrays referred to by handles g_a and d_a must match. If the extents don't match, ndra_read_section should be used instead.

See the dra_write specs for a discussion of request.


dra_read_section

     status = dra_read_section(transp, g_a, gilo, gihi, gjlo, gjhi,
                                       d_a, dilo, dihi, djlo, djhi, request)
              logical transp                    [input] ! transpose operator
              integer g_a                       [input] ! GA handle
              integer d_a                       [input] ! DRA handle
              integer gilo                      [input]
              integer gihi                      [input]
              integer gjlo                      [input]
              integer gjhi                      [input]
              integer dilo                      [input]
              integer dihi                      [input]
              integer djlo                      [input]
              integer djhi                      [input]
              integer request                   [output] ! request id
Asynchronously reads the specified global array section from the specified disk resident array section:
                OP(d_a[ dilo:dihi, djlo:djhi]) -->  g_a[ gilo:gihi, gjlo:gjhi]
where OP is the transpose operator, applied when transp is .true. and skipped when it is .false.

See the dra_write specs for a discussion of request.



ndra_read_section

     status = ndra_read_section(transp, g_a, glo, ghi, d_a, dlo, dhi, request)
              logical transp                    [input]  ! transpose operator (ignored)
              integer g_a                       [input]  ! GA handle
              integer glo(ndim)                 [input]  ! array of lower indices on GA
              integer ghi(ndim)                 [input]  ! array of upper indices on GA
              integer d_a                       [input]  ! DRA handle
              integer dlo(ndim)                 [input]  ! array of lower indices on DRA
              integer dhi(ndim)                 [input]  ! array of upper indices on DRA
              integer request                   [output] ! request id
N-dimensional asynchronous read to the specified global array section from the specified disk resident array section:
                d_a[dlo:dhi] -->  g_a[glo:ghi]
The transpose operator is currently disabled in this function and has no effect. The function returns an error if the sizes of the two sections do not match. See the dra_write specs for a discussion of request.


dra_probe

     status = dra_probe(request, compl_status)
              integer request                   [input]  ! request id
              integer compl_status              [output] ! completion status
Tests for completion of the dra_write, dra_read, dra_write_section or dra_read_section operation that set the value passed in the request argument.

compl_status .eq. 0 means the operation has been completed.

compl_status .ne. 0 means "not done yet".
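
One way to use the completion status is a polling loop that interleaves other work with the pending transfer, sketched below. The subroutine name is hypothetical and a nonzero function status is assumed to signal failure. The loop also calls dra_flick, documented later in this reference, to give the library a chance to advance pending I/O between probes.

```fortran
      subroutine poll_request(request)
      implicit none
#include "dra.fh"
      integer request
      integer status, done

!     Start as "not done" so the loop probes at least once.
      done = 1
      do while (done .ne. 0)
         status = dra_probe(request, done)
         if (status .ne. 0) stop 'dra_probe failed'
!        Still pending: a slice of other work could go here,
!        followed by a brief hand-off to the DRA library.
         if (done .ne. 0) call dra_flick()
      enddo
      end
```

When there is no other work to overlap, calling dra_wait is simpler than polling.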


dra_wait

     status = dra_wait(request)
              integer request                   [input]  ! request id
Blocks until completion of the dra_write, dra_read, dra_write_section or dra_read_section operation that set the value passed in the request argument.


dra_inquire

     status = dra_inquire(d_a, type, dim1, dim2, name, filename)
              integer d_a                       [input]  ! DRA handle
              integer type                      [output]
              integer dim1                      [output]
              integer dim2                      [output]
              character*(*) name                [output]
              character*(*) filename            [output]
Returns the dimensions, type and name of the disk resident array associated with the d_a handle, together with the filename of its DRA meta-file.


ndra_inquire

     status = ndra_inquire(d_a, type, ndim, dims, name, filename)
              integer d_a                       [input]  ! DRA handle
              integer type                      [output] ! DRA data type
              integer ndim                      [output] ! Dimension of DRA
              integer dims(ndim)                [output] ! Array of dimensions of DRA
              character*(*) name                [output] ! DRA name
              character*(*) filename            [output] ! DRA filename
Returns the type, number of dimensions, extent of each dimension and name of the disk resident array associated with the d_a handle, together with the filename of its DRA meta-file.


dra_delete

     status = dra_delete(d_a)
              integer d_a                       [input]  ! DRA handle
Delete a disk resident array associated with d_a handle. Invalidate handle. The corresponding DRA meta-file is destroyed.


dra_close

     status = dra_close(d_a)
              integer d_a                       [input]  ! DRA handle
Close DRA meta-file associated with d_a handle and deallocate data structures corresponding to this disk array. Invalidate d_a handle. The array on the disk is persistent.
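
The choice between dra_close and dra_delete decides whether the data survives on disk. A small helper along the following lines makes that choice explicit. The subroutine name and the keep flag are hypothetical, and a nonzero status is assumed to signal failure.

```fortran
      subroutine release_dra(d_a, keep)
      implicit none
#include "dra.fh"
      integer d_a
      logical keep
      integer status

      if (keep) then
!        The array stays on disk and can be reopened with dra_open.
         status = dra_close(d_a)
      else
!        The array and its meta-file are removed from disk.
         status = dra_delete(d_a)
      endif
!     Either way the handle d_a is invalid after this point.
      if (status .ne. 0) stop 'releasing the DRA failed'
      end
```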


dra_flick

     subroutine dra_flick()
Returns control to DRA for a VERY short time to improve progress of pending asynchronous operations.



dra_print_internals

     subroutine dra_print_internals(d_a)
                integer d_a                      [input]  ! DRA handle
A call to this subroutine causes the program to dump all the internal information about the disk resident array to standard output. Only the information on processor 0 is written out.