EuroPVM/MPI 2006 » Parallel I/O

Tutorial on High-Performance Parallel I/O

Full-Day Tutorial with Exercises
Level: TBA

Rob Ross, Argonne National Lab, USA
Joachim Worringen, C&C Research Labs (NEC Europe Ltd.), Germany

Abstract. Effectively using I/O resources on HPC machines is a black art.  The purpose of this tutorial is to shed light on the state-of-the-art in parallel I/O and to provide the knowledge necessary for attendees to best leverage the I/O resources available to them.

In the first half of the tutorial we discuss the software involved in parallel I/O.  We cover the entire I/O software stack, from parallel file systems at the lowest layer, through intermediate layers (such as MPI-IO), to high-level I/O libraries (such as HDF5). The emphasis is not just on how to use these layers, but on how to use them in ways that result in high performance.  As part of this discussion we will present benchmark results from current systems.

The second half of the tutorial will be hands-on, with the participants solving typical problems of parallel I/O using different approaches. The performance of these approaches will be evaluated on different machines at remote sites, using various types of file systems. The results will then be compared to get a full picture of the performance differences and characteristics of the chosen approaches on the different platforms.

Basic knowledge of parallel (MPI) programming in C and/or Fortran is assumed. For the second half, each participant should bring their own notebook computer, running either Windows XP or Linux (x86). A limited number of loan notebook computers are available on request.

About the speakers.

Robert Ross received his Ph.D. in Computer Engineering from Clemson University in 2000.  Following this he joined the Mathematics and Computer Science Division at Argonne National Laboratory.  Rob’s research interests are in system software for high-performance computing systems, in particular parallel file systems and I/O and message-passing libraries.

Rob has been involved in Linux cluster computing since 1995, when he first tested PVFS on early Beowulf systems at NASA Goddard Space Flight Center.  He is the primary author of the PVFS parallel file system and lead architect of the PVFS2 parallel file system.  He is an active maintainer of the ROMIO MPI-IO library and oversees the Parallel netCDF high-level I/O library development.  Rob is a member of the MPICH2 development team, which received an R&D 100 Award in 2005, and was a recipient of the 2004 Presidential Early Career Award for Scientists and Engineers.

Joachim Worringen studied computer engineering at the RWTH Aachen, where he also completed his dissertation in the same field. His research at the RWTH from 1997 to 2002 led to an MPI implementation suitable for Meta-Computing (MetaMPICH) across wide-area connections, and a high-performance MPI implementation based on the SCI interconnect (SCI-MPICH) with numerous innovative features for efficient communication. Both projects are now part of MP-MPICH, maintained at the RWTH.

After this, Joachim joined NEC Computer & Communications Research Labs in St. Augustin, Germany, where he now acts as Principal Researcher. His research interests are parallel I/O and high-performance storage systems, as well as performance evaluation and analysis. He maintains NEC’s cross-platform MPI-IO library for clusters and vector machines, designs future I/O solutions, and is the creator of the ‘perfbase’ toolkit for experiment management and performance analysis.