IPPD Target Workflows
IPPD’s research will focus on the use of, and impact on, two major workflow testbeds. The first originates from high energy physics (Belle II) and the second from atmospheric modeling.
The Community Earth System Model (CESM) is a widely used climate model with a centrally maintained code base at the National Center for Atmospheric Research (NCAR). The model comprises many components (atmosphere, ocean, ice, and land, among others) that can be run alone or in combination. An atmospheric science simulation involves the following principal steps: (1) obtain a case template (model configuration files spread over several directories) from either a local file system or the NCAR repository; (2) clone the case template using the NCAR-supplied application and adjust parameters; (3) pull source code from the source code control system and compile it on the target HPC system; (4) pull input files from an external repository such as ESGF, NCAR, or a local resource; and (5) start the model run. Production runs often last many weeks or months, requiring frequent restarts and resubmissions to conform to local job queue limits. The output files are then moved to a secondary system where the analysis process is started. The analysis code pulls in observational data files suitable for evaluating the modeling results, e.g., the Climate Best Estimate data set from the Atmospheric Radiation Measurement (ARM) program in combination with Tropical Rainfall Measuring Mission (TRMM) satellite datasets from NASA for evaluating a Community Atmosphere Model (CAM) run.
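The case preparation and submission steps above can be sketched as a simple driver script. The CESM/CIME commands invoked (create_clone, xmlchange, case.setup, case.build, case.submit) and the XML variables STOP_N and RESUBMIT are real CESM tooling, but all paths and parameter values here are hypothetical placeholders; this is an illustrative sketch, not a site-ready script.

```python
import subprocess
from pathlib import Path

# Hypothetical locations; real installations and templates differ per site.
CIME_SCRIPTS = Path("/opt/cesm/cime/scripts")
CASE_TEMPLATE = Path("/projects/templates/cam_template_case")
CASE_DIR = Path("/scratch/cases/cam_run01")

def sh(cmd, cwd=None, dry_run=True):
    """Run one workflow step; in dry-run mode just return the command line."""
    cmd = [str(c) for c in cmd]
    if dry_run:
        return " ".join(cmd)
    subprocess.run(cmd, cwd=cwd, check=True)
    return " ".join(cmd)

def cesm_workflow(dry_run=True):
    steps = []
    # (2) Clone a case template with the NCAR-supplied tool and adjust
    #     parameters; here the run is chunked to fit queue wall-clock limits
    #     and resubmitted automatically (STOP_N / RESUBMIT).
    steps.append(sh([CIME_SCRIPTS / "create_clone",
                     "--clone", CASE_TEMPLATE, "--case", CASE_DIR],
                    dry_run=dry_run))
    steps.append(sh(["./xmlchange", "STOP_OPTION=nmonths,STOP_N=3,RESUBMIT=39"],
                    cwd=CASE_DIR, dry_run=dry_run))
    # (3) Set up and compile on the target HPC system.
    steps.append(sh(["./case.setup"], cwd=CASE_DIR, dry_run=dry_run))
    steps.append(sh(["./case.build"], cwd=CASE_DIR, dry_run=dry_run))
    # (4)-(5) Submit; input files are staged from ESGF/NCAR/local input
    #     repositories as configured for the case.
    steps.append(sh(["./case.submit"], cwd=CASE_DIR, dry_run=dry_run))
    return steps
```

With dry_run=True the function only records the command lines, which makes the step ordering easy to inspect without a CESM installation.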
Belle II is an international experiment based in Japan with around 500 international partners. It is planned to provide ~50x more data than its predecessor Belle, allowing scientists to perform stringent tests of the Standard Model of particle physics. PNNL holds a partial copy of the Belle data and will hold a full copy of the Belle II raw data as well as processed data and Monte Carlo results, ~250 PB by 2022. The experiment will produce trillions of events that need to be processed before further analysis by the science community. To this end, each event is reconstructed and analyzed with the help of both analytical algorithms and Monte Carlo simulations. A recent approach, initially developed by the LHC ATLAS experiment, has moved from file-based batch processing to event-based, workflow-driven analysis. The event service processes data as a continuous stream at the event level rather than the file level, dispatching single- or few-event workloads to processing nodes, which then stage fine-grained outputs in near real time to a data service that can in turn serve as a data source for downstream processing. This variable-grained partitioning of processing and outputs allows workloads to be tailored dynamically to the resources currently available and minimizes losses when opportunistic processing slots are abruptly revoked.
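The event-service dispatch pattern described above can be sketched with Python's standard queue and threading modules. The event IDs, the reconstruct stub, and the revocation flag are illustrative assumptions, not ATLAS or Belle II code; the point of the sketch is that outputs are staged per event, so an abruptly revoked worker loses at most the event in flight.

```python
import queue
import threading

def reconstruct(event_id):
    """Placeholder for per-event reconstruction (the real workflow applies
    analytical algorithms and Monte Carlo simulations to each event)."""
    return {"event_id": event_id, "result": event_id * 2}

def worker(events, outputs, revoked):
    """Pull single-event workloads until the queue drains or the (possibly
    opportunistic) processing slot is revoked."""
    while not revoked.is_set():
        try:
            event_id = events.get_nowait()
        except queue.Empty:
            return
        # Stage the fine-grained output in near real time to the data service
        # (modeled here as a second queue).
        outputs.put(reconstruct(event_id))
        events.task_done()

def event_service(event_ids, n_workers=4):
    events, outputs = queue.Queue(), queue.Queue()
    revoked = threading.Event()  # call revoked.set() to model slot revocation
    for e in event_ids:
        events.put(e)
    threads = [threading.Thread(target=worker, args=(events, outputs, revoked))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The staged outputs become the data source for downstream processing.
    return [outputs.get() for _ in range(outputs.qsize())]
```

Dispatching at the event level rather than the file level is what lets the service resize workloads to whatever resources are currently available: the unit of loss and of scheduling is a single event, not a multi-gigabyte file.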