WOLFHPC 2015

Fifth International Workshop on
Domain-Specific Languages and High-Level Frameworks for High Performance Computing

November 16, 2015

Full-day workshop in conjunction with

SC15: The International Conference for High Performance Computing, Networking, Storage, and Analysis

Austin, Texas, USA

Held in cooperation with ACM SIGHPC

Workshop location: Hilton 408 (next to the Austin Convention Center)

Advance Program

Session 1

9:00 Opening remarks
9:00-10:00 Keynote Talk: Prof. Alex Aiken [webpage]
Stanford University
"Legion: Programming Heterogeneous, Distributed Parallel Machines" [Talk]

Break: 10:00-10:30

Session 2: Applications and Optimizations

10:30-11:00 Masaki Iwasawa, Ataru Tanikawa, Natsuki Hosono, Keigo Nitadori, Takayuki Muranushi and Junichiro Makino
"FDPS: A Novel Framework for Developing High-Performance Particle Simulation Codes for Distributed-Memory Systems" [Talk]
11:00-11:30 Christopher Earl
"Puffin: An Embedded Domain-Specific Language for Existing Unstructured Hydrodynamics Codes" [Talk]
11:30-12:00 Chunhua Liao, Pei-Hung Lin, Daniel Quinlan, Yue Zhao and Xipeng Shen
"Enhancing Domain Specific Language Implementations Through Ontology" [Talk]
12:00-12:30 Bradley Peterson, Harish Kumar Dasari, Alan Humphrey, James Sutherland, Tony Saad and Martin Berzins
"Reducing Overhead in the Uintah Framework to Support Short-Lived Tasks on GPU-Heterogeneous Architectures" [Talk]

Lunch Break: 12:30-2:00

Session 3

2:00-3:00 Keynote Talk: Prof. Franz Franchetti [webpage]
Carnegie Mellon University
"HPC Libraries as DSL" [Talk]

Break: 3:00-3:30

Session 4: Stencils

3:30-4:00 Chenyang Liu and Milind Kulkarni
"Optimizing the LULESH Stencil Code using Concurrent Collections" [Talk]
4:00-4:30 Prashant Rawat, Martin Kong, Tom Henretty, Justin Holewinski, Kevin Stock, Louis-Noel Pouchet, J. Ramanujam, Atanas Rountev and P. Sadayappan
"SDSLc: a Multi-Target Domain-Specific Compiler for Stencil Computations" [Talk]
4:30-5:00 Julien Bigot, Helene Coullon and Christian Perez
"From DSL to HPC Component-Based Runtime: A Multi-Stencil DSL Case Study" [Talk]
5:00 Closing Remarks

Keynote Talks

Prof. Alex Aiken, Stanford University

Title: "Legion: Programming Heterogeneous, Distributed Parallel Machines"

Abstract: Programmers tend to think of parallel programming as a problem of dividing up computation, but often the most difficult part is the placement and movement of data, especially in heterogeneous, distributed machines with deep memory hierarchies. Legion is a programming model and runtime system for describing hierarchical organizations of both data and computation at an abstract level. A separate mapping interface allows programmers to control how data and computation are placed onto the actual memories and processors of a specific machine.

This talk will present the design of Legion and in the process highlight design choices that any programming model for future high performance computing systems is likely to have to address. As time permits, the talk will also discuss the Legion implementation, and experience with applications, including S3D, a turbulent combustion simulation.

Bio: Alex Aiken is the Alcatel-Lucent Professor and current chair of the Computer Science department at Stanford University. Alex received his Bachelor's degree in Computer Science and Music from Bowling Green State University in 1983 and his Ph.D. from Cornell University in 1988. Alex was a Research Staff Member at the IBM Almaden Research Center (1988-1993) and a Professor in the EECS department at UC Berkeley (1993-2003) before joining the Stanford faculty in 2003. His research interests lie in areas related to programming languages, software engineering and high-performance computing.



Prof. Franz Franchetti, Carnegie Mellon University

Title: "HPC Libraries as DSL"

Abstract: The usual approach to implementing portable high-performance kernels is to use vendor-optimized libraries that implement standard APIs like the MPI, BLAS, LAPACK or FFTW interfaces, and to use standardized annotations like OpenMP or OpenACC to convey parallelization opportunities. This leaves performance optimization to the experts who know the platform and allows application programmers to focus on functionality. The application programmers are free to use any language for which the libraries have bindings to implement functionality not supported by libraries, and to mix and match libraries.

We present a new take on this tried-and-true concept: we interpret a subset of BLAS1/2/3, FFTW, and OpenMP, together with a subset of C, as a domain-specific language targeted at implementing HPC kernels. We then use a domain-specific compiler to extract a high-level specification in SPIRAL's operator language from a program implemented in the domain-specific language, perform high-level cross-library-call optimizations, and convert loops over library calls into aggregate/batch calls. Finally, we translate the resulting representation into native code for an Intel Haswell CPU, an Intel Xeon Phi GPU, and a near-memory accelerator that is part of a 3D logic/DRAM stack that we developed in the context of the DARPA PERFECT program. For the PERFECT benchmark suite's STAP kernel (space-time adaptive processing), the library-based source code then runs unmodified across the three test platforms and fully leverages their performance capabilities.

Bio: Franz Franchetti is an Associate Research Professor with the Department of Electrical and Computer Engineering at Carnegie Mellon University. He received the Dipl.-Ing. (M.Sc.) degree in Technical Mathematics and the Dr. techn. (Ph.D.) degree in Computational Mathematics from the Vienna University of Technology in 2000 and 2003, respectively. In 2006 he was a member of the team winning the Gordon Bell Prize (Peak Performance Award) and in 2010 he was a member of the team winning the HPC Challenge Class II Award (most productive system).

Dr. Franchetti's research focuses on automatic performance tuning and program generation for emerging parallel platforms and algorithm/hardware co-synthesis. He targets multicore CPUs, clusters and high-performance computing (HPC) systems, graphics processors (GPUs), field programmable gate arrays (FPGAs), FPGA acceleration for CPUs, and logic-in-memory and 3DIC chip design. Within the Spiral effort, his research goal is to enable automatic generation of highly optimized software libraries for important kernel functionality. In other collaborative research threads, Dr. Franchetti is investigating the applicability of domain-specific transformations within standard compilers. He leads two DARPA projects in the HACMS and PERFECT programs and is PI/Co-PI on a number of federal and industry grants.