DIO
Distant I/O Library for One-Sided Access to Remote Storage
The Distant I/O (DIO) model applies the concept of one-sided communication to parallel I/O on clustered systems with locally attached disks.
- DIO facilitates data transfers between files on remote nodes and local memory
- DIO does not require installation of special daemons or server processes to service remote I/O requests
- avoids significant CPU and memory overheads
- compute nodes become part-time servers and part-time clients
- Relies on efficient lightweight communication protocols such as Active Messages
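The one-sided model described above can be pictured with a small single-process simulation. This is only a sketch under stated assumptions, not the DIO API: the names `Node`, `handle_read`, `send_active_message`, and `remote_read` are hypothetical, and the "network" is an ordinary function call standing in for an Active Message whose handler runs on the target node.

```python
import os
import tempfile


class Node:
    """A compute node that owns local files and also services remote requests."""

    def __init__(self, rank, directory):
        self.rank = rank
        self.directory = directory

    def local_path(self, name):
        # Each simulated node keeps its files under its own prefix.
        return os.path.join(self.directory, f"node{self.rank}_{name}")

    def write_local(self, name, data):
        with open(self.local_path(name), "wb") as f:
            f.write(data)

    def handle_read(self, name, offset, nbytes):
        # Hypothetical handler: runs on the node that owns the file when a
        # request message arrives; no dedicated server process is involved.
        with open(self.local_path(name), "rb") as f:
            f.seek(offset)
            return f.read(nbytes)


def send_active_message(target, handler_name, *args):
    """Stand-in for the network: invoke the named handler on the target node."""
    return getattr(target, handler_name)(*args)


def remote_read(target, name, offset, nbytes):
    """One-sided read: only the caller issues an explicit I/O call."""
    return send_active_message(target, "handle_read", name, offset, nbytes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        owner = Node(1, scratch)
        owner.write_local("data.bin", b"abcdefghij")
        # The requesting side reads 4 bytes at offset 2 of the owner's file.
        print(remote_read(owner, "data.bin", 2, 4))
```

The application code on the owning node never posts a matching receive or read; the handler does the work on its behalf, which is what makes each node a part-time server.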
DIO has been implemented on the IBM SP using LAPI Active Messages and Posix asynchronous I/O.
Performance of DIO operations on the IBM SP is very close to that of local operations for request sizes larger than 16 KB.
Applications
- Shared Files: one of the parallel I/O libraries developed by the ChemIO project uses DIO to provide a shared file view, with independent file pointers, on a collection of local disks attached to the compute nodes of the IBM SP.
- An out-of-core version of COLUMBUS, a computational chemistry application, used DIO to solve the largest ever multireference configuration problem. A paper describing this implementation received the Best Overall Paper Award at the SuperComputing 98 (SC'98) conference in Orlando, FL.
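As a rough illustration of the Shared Files item above, the sketch below maps positions in one logical file onto several local disks and keeps a separate file pointer per process. The page does not say how the data is actually laid out, so the round-robin striping, the `STRIPE` size, and the names `locate` and `FilePointer` are all assumptions made for this example.

```python
from dataclasses import dataclass

STRIPE = 4  # hypothetical stripe size in bytes, kept tiny for illustration


def locate(offset, num_nodes, stripe=STRIPE):
    """Map a logical offset in the shared file to (node, local offset).

    Assumes round-robin striping across nodes; the real layout may differ.
    """
    stripe_index, within = divmod(offset, stripe)
    node = stripe_index % num_nodes
    local_offset = (stripe_index // num_nodes) * stripe + within
    return node, local_offset


@dataclass
class FilePointer:
    """Per-process position in the shared file; never shared between processes."""

    num_nodes: int
    position: int = 0

    def next_pieces(self, nbytes):
        """Advance this pointer and return (node, local offset, length) pieces."""
        pieces = []
        remaining = nbytes
        while remaining > 0:
            node, local = locate(self.position, self.num_nodes)
            # A piece never crosses a stripe boundary.
            length = min(remaining, STRIPE - self.position % STRIPE)
            pieces.append((node, local, length))
            self.position += length
            remaining -= length
        return pieces


if __name__ == "__main__":
    reader_a = FilePointer(num_nodes=2)
    reader_b = FilePointer(num_nodes=2)
    print(reader_a.next_pieces(6))  # pieces touched by reader_a
    print(reader_b.position)        # reader_b has not moved
```

Each piece returned by `next_pieces` would correspond to one local or remote request against the node that holds that stripe.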
More Information
- A copy of the paper presented at the High Performance Distributed Computing Conference HPDC-7 in July, 1998 is available in the HPDC-7 Proceedings on pages 148-154 and here.
- An extended version of the HPDC-7 paper titled "Implementing noncollective parallel I/O in cluster environments using Active Message communication" appeared in Cluster Computing no. 2 in 1999 and is also available here.
- The library can be obtained by sending email to hpctools@googlegroups.com.